Hi!

Today I'll be writing about the implementation of kd-trees. A kd-tree (short for k-dimensional tree) is a type of spacial partitioning data structure, and a special case of BSP trees. What's the point of it? Well, it can be used to organize a vast number of points in space in such a way that searching for the nearest point to a certain coordinate, for example, is far quicker than more traditional methods.

It has many applications, but the one of interest to me (and maybe you, if you're following the Ray-Tracer series), is in photon mapping. Photon mapping is a special procedure carried out before the main ray tracing in which the light sources in the scene emit a large number of photons. The end-position of the photons emitted are stored in a kd-tree, which can be used to quickly determine just how many photons hit close to a point in our scene, which in turn tells us a bit about the lighting at the point. A kd-tree is absolutely necessary for this, as trying to do a nearest-point search using more traditional methods on millions of photons, numerous times for each pixel on the screen, would take far too long. This photon mapping procedure allows the rendering of caustics and diffuse inter-reflection, which can greatly increase the realism of a scene.

So how does it work? We start with the coordinates of a large number of points in any dimensional space (2d in the case of this example), stored in a big list or array. From there, we first choose a coordinate to use to split the list - let's start with the X coordinate. We find the median x coordinate from all the points, and then split the list up into two smaller lists, one of the points with an x coordinate bigger than the median, one of the points with an x coordinate smaller than or equal to the median. We then recursively use this procedure, starting with the two new lists, but swap the coordinate each time (so we'd use the y coordinate for the second lists, then return to using the x, and so on). When we get to a state where there's only one point left in our lists, we stop the recursion, and the tree is built! Let's check out some pseudo code that can build a list:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
//Store out points in an array, where id is the point id, and coordinates 0 is x, and 1 is y. So points[0,0] is the x coordinate of our first point. points[id,coordinates] = <big list of points> //Call the split tree function to start the tree building process tree = split_tree(points, 1); function tuple split_tree(input_points,depth){ if length(input_points)<=1 exit //exit the function if there's only 1 point in the list //Choose the coordinate to split the list by //This will alternate between 0 and 1 split_coordinate = depth % 2; //Get the median coordinate median = get_median(input_points, split_coordinate); //Loop through every point for(i = 0; i<length(input_points); i++){ //If the coordinate of the current point is less than medium if input_points[i,split_coordinate]<=median{ //Add that point to a new array lower_list[length(lower_list),0) = input_points[i,0] lower_list[length(lower_list),1) = input_points[i,1] } else //otherwise add it to a higher array { higher_list[length(higher_list),0) = input_points[i,0] higher_list[length(higher_list),1) = input_points[i,1] } } node.median = median; node.left = split_tree(higher_list, depth + 1) node.right = split_tree(lower_list, depth + 1) return node } |

This recursive function continues to split the list time and time again, until it is down to a bunch of lists with just a single point in them. At this point, the tree is complete. It's important to note there must be a way of linking each node/list in the tree to the previous nodes, and the following nodes, as this is what makes it possible to traverse/search the tree. In my C# implementation I achieved this by having a Kd-Tree node class which was created at the end of the split_tree function, which held some variables such as a pointer to the instance of the class that instantiated it, as well as pointers to the two instances of the class it then went on to create, and it then called the split_tree function from inside itself. There are however other (and probably better!) ways of doing this.

So no that we have our tree, it's time to use it so search for the nearest point to an arbitrary coordinate. To do this, we simply start at the top of our tree, and work our way through it by comparing the coordinate we're looking for the nearest point to with the median at each stage. Here's some pseudo code:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
//Where point[0] is x, points[1] is y function search_tree(depth, node, point[]){ if length(tree)<=1 return tree; //Found the point, as we're at the end of the tree //Compare our coordinate with the currents node median, to choose next node if point[depth % 2] < node.median next_node = node.lower_list; else next_node = node.higher_list; //Continue the search on the next node search_tree(depth+1, next_node, point[]) } |

As you can see, it's another recursive function, which calls itself until it reaches the bottom of the tree. By comparing the coordinates point we're searching for with the median at each node, we make our way down the tree and will eventually wind up at the point closest to the point of interest.

And that's all there is to it! Kd-Trees are extremely useful as they are lightning fast compared to, say, comparing the distances between each point in the list at the point of interest. The larger the number points get, the bigger the speed boost will be, so for photon mapping it will provide a massive speed increase.