public class KDTree extends NearestNeighbourSearch
@article{Friedman1977, author = {Jerome H. Friedman and Jon Luis Bentley and Raphael Ari Finkel}, journal = {ACM Transactions on Mathematics Software}, month = {September}, number = {3}, pages = {209-226}, title = {An Algorithm for Finding Best Matches in Logarithmic Expected Time}, volume = {3}, year = {1977} } @techreport{Moore1991, author = {Andrew Moore}, booktitle = {University of Cambridge Computer Laboratory Technical Report No. 209}, howpublished = {Extract from PhD Thesis}, title = {A tutorial on kd-trees}, year = {1991}, HTTP = {Available from http://www.autonlab.org/autonweb/14665.html} }Valid options are:
-S <classname and options> Node splitting method to use. (default: weka.core.neighboursearch.kdtrees.SlidingMidPointOfWidestSide)
-W <value> Set minimal width of a box (default: 1.0E-2).
-L Maximal number of instances in a leaf (default: 40).
-N Normalizing will be done (Select dimension for split, with normalising to universe).
NearestNeighbourSearch.MyHeap, NearestNeighbourSearch.MyHeapElement, NearestNeighbourSearch.NeighborList, NearestNeighbourSearch.NeighborNode
Modifier and Type | Field and Description |
---|---|
protected double[] |
m_DistanceList
Array holding the distances of the nearest neighbours.
|
protected EuclideanDistance |
m_EuclideanDistance
The euclidean distance function to use.
|
protected int[] |
m_InstList
Indexlist of the instances of this kdtree.
|
protected int |
m_MaxDepth
Tree stats.
|
protected int |
m_MaxInstInLeaf
maximal number of instances in a leaf.
|
protected double |
m_MinBoxRelWidth
minimal relative width of a KDTree rectangle.
|
protected int |
m_NumLeaves
Tree stats.
|
protected int |
m_NumNodes
Tree stats.
|
protected KDTreeNode |
m_Root
The root node of the tree.
|
protected KDTreeNodeSplitter |
m_Splitter
The node splitter.
|
static int |
MAX
The index of MAX value in attributes' range array.
|
static int |
MIN
The index of MIN value in attributes' range array.
|
static int |
WIDTH
The index of WIDTH (MAX-MIN) value in attributes' range array.
|
m_DistanceFunction, m_Instances, m_kNN, m_MeasurePerformance
Constructor and Description |
---|
KDTree()
Creates a new instance of KDTree.
|
KDTree(weka.core.Instances insts)
Creates a new instance of KDTree.
|
Modifier and Type | Method and Description |
---|---|
void |
addInstanceInfo(weka.core.Instance instance)
Adds one instance to KDTree loosly.
|
protected void |
addInstanceToTree(weka.core.Instance inst,
KDTreeNode node)
Recursively adds an instance to the tree starting from
the supplied KDTreeNode.
|
protected void |
afterAddInstance(KDTreeNode node)
Corrects the start and end indices of a
KDTreeNode after an instance is added to
the tree.
|
void |
assignSubToCenters(KDTreeNode node,
weka.core.Instances centers,
int[] centList,
int[] assignments)
Assigns instances of this node to center.
|
protected void |
buildKDTree(weka.core.Instances instances)
Builds the KDTree on the supplied set of instances/points.
|
protected boolean |
candidateIsFullOwner(KDTreeNode node,
weka.core.Instance candidate,
weka.core.Instance competitor)
Returns true if candidate is a full owner in respect to a competitor.
|
void |
centerInstances(weka.core.Instances centers,
int[] assignments,
double pc)
Assigns instances to centers using KDTree.
|
protected void |
checkMissing(weka.core.Instance ins)
Checks if there is any missing value in the given
instance.
|
protected void |
checkMissing(weka.core.Instances instances)
Checks if there is any instance with missing values.
|
protected boolean |
clipToInsideHrect(KDTreeNode node,
weka.core.Instance x)
Finds the closest point in the hyper rectangle to a given point.
|
protected void |
determineAssignments(KDTreeNode node,
weka.core.Instances centers,
int[] candidates,
int[] assignments,
double pc)
Assigns instances to the current centers called candidates.
|
protected double |
distanceToHrect(KDTreeNode node,
weka.core.Instance x)
Returns the distance between a point and an hyperrectangle.
|
Enumeration |
enumerateMeasures()
Returns an enumeration of the additional measure names.
|
protected void |
findNearestNeighbours(weka.core.Instance target,
KDTreeNode node,
int k,
NearestNeighbourSearch.MyHeap heap,
double distanceToParents)
Returns (in the supplied heap object) the k nearest
neighbours of the given instance starting from the give
tree node.
|
DistanceFunction |
getDistanceFunction()
returns the distance function currently in use.
|
double[] |
getDistances()
Returns the distances to the kNearest or 1 nearest neighbour currently
found with either the kNearestNeighbours or the nearestNeighbour method.
|
int |
getMaxInstInLeaf()
Get the maximum number of instances in a leaf.
|
protected double |
getMaxRelativeNodeWidth(double[][] nodeRanges,
double[][] universe)
Returns the maximum attribute width of instances/points
in a KDTreeNode relative to the whole dataset.
|
double |
getMeasure(String additionalMeasureName)
Returns the value of the named measure.
|
double |
getMinBoxRelWidth()
Gets the minimum relative box width.
|
KDTreeNodeSplitter |
getNodeSplitter()
Returns the splitting method currently in use to split the nodes of the
KDTree.
|
boolean |
getNormalizeNodeWidth()
Gets the normalize flag.
|
String |
globalInfo()
Returns a string describing this nearest neighbour search algorithm.
|
weka.core.Instances |
kNearestNeighbours(weka.core.Instance target,
int k)
Returns the k nearest neighbours of the supplied instance.
|
String |
maxInstInLeafTipText()
Tip text for this property.
|
double |
measureMaxDepth()
Returns the depth of the tree.
|
double |
measureNumLeaves()
Returns the number of leaves.
|
double |
measureTreeSize()
Returns the size of the tree.
|
String |
minBoxRelWidthTipText()
Tip text for this property.
|
weka.core.Instance |
nearestNeighbour(weka.core.Instance target)
Returns the nearest neighbour of the supplied target
instance.
|
String |
nodeSplitterTipText()
Returns the tip text for this property.
|
String |
normalizeNodeWidthTipText()
Tip text for this property.
|
protected int[] |
refineOwners(KDTreeNode node,
weka.core.Instances centers,
int[] candidates)
Refines the ownerlist.
|
void |
setDistanceFunction(DistanceFunction df)
sets the distance function to use for nearest neighbour search.
|
void |
setInstances(weka.core.Instances instances)
Builds the KDTree on the given set of instances.
|
void |
setMaxInstInLeaf(int i)
Sets the maximum number of instances in a leaf.
|
void |
setMeasurePerformance(boolean measurePerformance)
Sets whether to calculate the performance statistics or not.
|
void |
setMinBoxRelWidth(double i)
Sets the minimum relative box width.
|
void |
setNodeSplitter(KDTreeNodeSplitter splitter)
Sets the splitting method to use to split the nodes of the KDTree.
|
void |
setNormalizeNodeWidth(boolean n)
Sets the flag for normalizing the widths of a KDTree Node by the width of
the dimension in the universe.
|
protected void |
splitNodes(KDTreeNode node,
double[][] universe,
int depth)
Recursively splits nodes of a tree starting from the supplied node.
|
void |
update(weka.core.Instance instance)
Adds one instance to the KDTree.
|
protected int |
widestDim(double[][] nodeRanges,
double[][] universe)
Returns the widest dimension/attribute in a
KDTreeNode (widest after normalizing).
|
combSort11, distanceFunctionTipText, getInstances, getMeasurePerformance, measurePerformanceTipText, partition, quickSort
protected double[] m_DistanceList
protected int[] m_InstList
protected KDTreeNode m_Root
protected KDTreeNodeSplitter m_Splitter
protected int m_NumNodes
protected int m_NumLeaves
protected int m_MaxDepth
public static final int MIN
public static final int MAX
public static final int WIDTH
protected EuclideanDistance m_EuclideanDistance
protected double m_MinBoxRelWidth
protected int m_MaxInstInLeaf
public KDTree()
public KDTree(weka.core.Instances insts)
insts
- The instances/points on which the BallTree
should be built on.protected void buildKDTree(weka.core.Instances instances) throws Exception
instances
- The instances to build the tree onException
- if something goes wrongprotected void splitNodes(KDTreeNode node, double[][] universe, int depth) throws Exception
node
- The node to start splitting from.universe
- The attribute ranges of the whole dataset.depth
- The depth of the supplied node.Exception
- If there is some problem
splitting.protected void findNearestNeighbours(weka.core.Instance target, KDTreeNode node, int k, NearestNeighbourSearch.MyHeap heap, double distanceToParents) throws Exception
target
- The instance to find the nearest neighbours for.node
- The KDTreeNode to start the search from.k
- The number of neighbours to find.heap
- The MyHeap object to store/update the kNNs found
during the search.distanceToParents
- The distance of the supplied target
to the parents of the supplied tree node.Exception
- if the nearest neighbour could not be found.public weka.core.Instances kNearestNeighbours(weka.core.Instance target, int k) throws Exception
kNearestNeighbours
in class NearestNeighbourSearch
target
- The instance to find the nearest neighbours for.k
- The number of neighbours to find.Exception
- if the nearest neighbour could not be found.public weka.core.Instance nearestNeighbour(weka.core.Instance target) throws Exception
nearestNeighbour
in class NearestNeighbourSearch
target
- The instance to find the nearest neighbour for.Exception
- if the neighbours could not be found.public double[] getDistances() throws Exception
getDistances
in class NearestNeighbourSearch
Exception
- if called before calling kNearestNeighbours or
nearestNeighbours.public void setInstances(weka.core.Instances instances) throws Exception
setInstances
in class NearestNeighbourSearch
instances
- The insts on which the KDTree is to be
built.Exception
- If some error occurs while
building the KDTreepublic void update(weka.core.Instance instance) throws Exception
update
in class NearestNeighbourSearch
instance
- the instance to be added. Usually the newly added instance in the
training set.Exception
- If the instance cannot be added.protected void addInstanceToTree(weka.core.Instance inst, KDTreeNode node) throws Exception
inst
- The instance to add to the treenode
- The node to start the recursive search
from, for the leaf node where the supplied instance
would go.Exception
- If some error occurs while adding
the instance.protected void afterAddInstance(KDTreeNode node)
node
- KDTreeNode whose start and end indices
need to be updated.public void addInstanceInfo(weka.core.Instance instance)
addInstanceInfo
in class NearestNeighbourSearch
instance
- the new instance. Usually this is the test instance
supplied to update the range of attributes in the distance function.protected void checkMissing(weka.core.Instances instances) throws Exception
instances
- the instances to checkException
- if missing values are encounteredprotected void checkMissing(weka.core.Instance ins) throws Exception
ins
- The instance to check missing values in.Exception
- If there is a missing value in the
instance.protected double getMaxRelativeNodeWidth(double[][] nodeRanges, double[][] universe)
nodeRanges
- The attribute ranges of the
KDTreeNode whose maximum relative width is to be
determined.universe
- The attribute ranges of the whole
dataset (training instances + test instances so
far encountered).protected int widestDim(double[][] nodeRanges, double[][] universe)
nodeRanges
- The attribute ranges of
the KDTreeNode.universe
- The attribute ranges of the
whole dataset (training instances + test
instances so far encountered).public double measureTreeSize()
public double measureNumLeaves()
public double measureMaxDepth()
public Enumeration enumerateMeasures()
public double getMeasure(String additionalMeasureName)
additionalMeasureName
- the name of
the measure to query for its value.IllegalArgumentException
- If the named measure
is not supported.public void setMeasurePerformance(boolean measurePerformance)
measurePerformance
- Should be true if performance
statistics are to be measured.public void centerInstances(weka.core.Instances centers, int[] assignments, double pc) throws Exception
centers
- the current centersassignments
- the centerindex for each instancepc
- the threshold value for pruning.Exception
- If there is some problem
assigning instances to centers.protected void determineAssignments(KDTreeNode node, weka.core.Instances centers, int[] candidates, int[] assignments, double pc) throws Exception
node
- The node to start assigning the instances from.centers
- all the current centers.candidates
- the current centers the method works on.assignments
- the center index for each instance.pc
- the threshold value for pruning.Exception
- If there is some problem assigning
instances to centers.protected int[] refineOwners(KDTreeNode node, weka.core.Instances centers, int[] candidates) throws Exception
node
- The current tree node.centers
- all centerscandidates
- the indexes of those centers that are candidates.Exception
- If some problem occurs in refining.protected double distanceToHrect(KDTreeNode node, weka.core.Instance x) throws Exception
node
- The current node from whose hyperrectangle
the distance is to be measured.x
- the pointException
- If some problem occurs in determining
the distance to the hyperrectangle.protected boolean clipToInsideHrect(KDTreeNode node, weka.core.Instance x)
node
- The current KDTreeNode in whose hyperrectangle the closest
point is to be found.x
- a pointprotected boolean candidateIsFullOwner(KDTreeNode node, weka.core.Instance candidate, weka.core.Instance competitor) throws Exception
The candidate has been the closer point to the current rectangle or even has been a point within the rectangle. The competitor is competing with the candidate for a few points out of the rectangle although it is a point further away from the rectangle then the candidate. The extrem point is the corner of the rectangle that is furthest away from the candidate towards the direction of the competitor. If the distance candidate to this extreme point is smaller then the distance competitor to this extreme point, then it is proven that none of the points in the rectangle can be owned be the competitor and the candidate is full owner of the rectangle in respect to this competitor. See also D. Pelleg and A. Moore's paper 'Accelerating exact k-means Algorithms with Geometric Reasoning'.
node
- The current KDTreeNode / hyperrectangle.candidate
- instance that is candidate to be ownercompetitor
- instance that competes against the candidateException
- If some problem occurs.public void assignSubToCenters(KDTreeNode node, weka.core.Instances centers, int[] centList, int[] assignments) throws Exception
node
- The KDTreeNode whose instances are to be assigned.centers
- all the input centerscentList
- the list of centers to work withassignments
- index list of last assignmentsException
- If there is error assigning the instances.public String minBoxRelWidthTipText()
public void setMinBoxRelWidth(double i)
i
- the minimum relative box widthpublic double getMinBoxRelWidth()
public String maxInstInLeafTipText()
public void setMaxInstInLeaf(int i)
i
- the maximum number of instances in a leafpublic int getMaxInstInLeaf()
public String normalizeNodeWidthTipText()
public void setNormalizeNodeWidth(boolean n)
n
- true to use normalizing.public boolean getNormalizeNodeWidth()
public DistanceFunction getDistanceFunction()
getDistanceFunction
in class NearestNeighbourSearch
public void setDistanceFunction(DistanceFunction df) throws Exception
setDistanceFunction
in class NearestNeighbourSearch
df
- the distance function to useException
- if not EuclideanDistancepublic String nodeSplitterTipText()
public KDTreeNodeSplitter getNodeSplitter()
public void setNodeSplitter(KDTreeNodeSplitter splitter)
splitter
- The KDTreeNodeSplitter to use.public String globalInfo()
globalInfo
in class NearestNeighbourSearch
Copyright © 2014 University of Waikato, Hamilton, NZ. All Rights Reserved.