Googling "point cloud registration" will return hundreds of results on different algorithms which can be used to take two 'clouds' of points/coordinates (assumed to be representing the same thing) in 2D or 3D space and transform one cloud such that that the points it contains optimally overlap the points in the second cloud, or in other words, the clouds are optimally aligned with one another. An example application in 2D space is 'image registration', which is regularly used in a biomedical context to align/overlap and hence compare scans from different devices, for example an MRI and PET scan of the same patient's head.
Unfortunately, there are two very large assumptions made in the vast majority of these algorithms, which are not necessarily always true:
1) There are 'markers' in the point clouds, i.e. there are at least three points within the two clouds which are known to be the same point in both clouds, in the context of the problem. For example, in the biomedical image registration context from above, it is possible to visually identify features within the two images to be aligned, such as the edges of the skull, which are common to both point clouds and can be used to align the images.
2) The transformation is permitted a certain number of degrees of freedom, often the maximum possible in the context (translation, rotation, scaling and shearing).
The context within which I required point cloud registration was during the development of a novel approach to 3D strain field calculations from CT scans. An object (a bone sample, in this instance) was placed into the CT machine and a 3D scan was taken. A force was then applied to the object in order to generate strain (i.e. the object was compressed), and during this a second scan was taken. The object in question was made from a material which contained certain features (homogeneously distributed internal cavities), the positions of which could be found using image processing techniques on the scan data, resulting in a point cloud of the positions of these cavities. The movement of these individual cavities relative to each other between the two point clouds (one from the scan before the compression, and one from the scan during) could then be used to estimate the internal strains in the object.
Unfortunately, this posed some problems with regards to the two assumptions posed above. Firstly, the material in question contained many "markers" which were used to generate the point clouds, but it was impossible to guarantee which of the points were the SAME point between the two data sets, and as such there were no markers which could be used for the alignment/registration. Secondly, the shear and scaling of the second point cloud relative to the first contained the strain information that I wished to extract - so using an unlimited degree-of-freedom algorithm which allowed the point clouds to be scaled or sheared during the alignment would be of no use, as it'd alter the results. As such, a rigid-body, marker-less/"blind" algorithm was required.
After this was established and I began researching suitable solutions, it also became apparent that there was a third problem. A lot (read: all) of the marker-less solutions to the problem I came across were only able to find the locally optimal solution. Most of the algorithms worked by iteratively attempting to minimize an objective function, usually the mean distance between the closest points of each set, using gradient descent methods. This always resulted in the algorithm finding a local minimum, which was not necessarily the global minimum (i.e. what we want). In short, these algorithms only worked if the two point clouds were already reasonably close to being aligned at the start - if one cloud was, for example, rotated 90 degrees about one axis, the registration would fail miserably. As such, a "global" solution was also required.
There are presumably many exciting and complicated methods of ensuring that the global minimum is found when minimizing an objective function, but I settled on a simpler solution whereby I split the registration into two stages - a coarse, estimated registration to attempt to get a "ballpark" alignment, followed by the use of an algorithm which would converge to the local minimum (which would then hopefully be the global minimum, thanks to the initial coarse registration).
Formulating the problem
The outline of an average registration problem can be generalized as such:

$$Y = AX + T$$

Where $Y$ is the original "target" point cloud, $X$ is the point cloud to be mapped on to $Y$, $A$ is a transformation matrix and $T$ is a translational matrix, defined as:

$$X = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \\ z_1 & z_2 & \cdots & z_n \end{bmatrix} \qquad A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \qquad T = \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}$$

Where $x$, $y$ and $z$ are the coordinates of each point in the cloud, so $X$ and $Y$ are effectively $3 \times n$ matrices in which the columns are comprised of the coordinates of every point in each cloud (with $T$ implicitly added to every column).

So, in this scenario, the solution of the registration problem is found through the calculation of the $A$ and $T$ matrices which define the transformation that aligns the two point clouds.
This calculation appears reasonably trivial. The $A$ and $T$ matrices can be combined into a single matrix $M$ (with a change to our $X$ matrix, which gains a row of ones to become $\tilde{X}$) like so:

$$M = \begin{bmatrix} A & T \end{bmatrix} \qquad \tilde{X} = \begin{bmatrix} x_1 & \cdots & x_n \\ y_1 & \cdots & y_n \\ z_1 & \cdots & z_n \\ 1 & \cdots & 1 \end{bmatrix}$$

This gives us:

$$Y = M\tilde{X}$$

Which is trivial to solve for $M$, right? Well, if $\tilde{X}$ happens to be square and invertible, then it is:

$$M = Y\tilde{X}^{-1}$$

It is solved easily by multiplying both sides by the inverse of $\tilde{X}$. However, it is only possible to determine the inverse of a square matrix (i.e. a matrix with the same number of columns and rows), so if we want to use this technique with an arbitrary number of points it will not work out. To do this we need to extend the technique to ensure the matrix we invert is definitely square.

This is achievable by multiplying our $\tilde{X}$ by its transpose, which is guaranteed to produce a square matrix. Thus, we can multiply both sides of our equation by $\tilde{X}^T$ to get a square matrix, of which we can then find the inverse and ultimately find $M$, like so:

$$Y\tilde{X}^T = M\tilde{X}\tilde{X}^T$$

$$M = Y\tilde{X}^T\left(\tilde{X}\tilde{X}^T\right)^{-1}$$
This leads to an interesting complication which must be kept in mind. When using more points than the minimum, there is no longer a guarantee that a transformation matrix exists which perfectly maps the points from $X$ on to those in $Y$. Instead, the solution will provide the least-mean-squares fit, meaning the mean squared error (Euclidean distance) between the points occupying the same columns in $X$ and $Y$ is minimized.
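As a quick sanity check of that least-squares solution, here's a minimal sketch in Python/NumPy (the rest of this post uses MATLAB; the function and variable names here are purely illustrative):

```python
import numpy as np

def fit_affine(X, Y):
    """Least-squares solve Y = M @ X_aug, where M = [A | T]."""
    n = X.shape[1]
    X_aug = np.vstack([X, np.ones(n)])              # append the row of ones (4 x n)
    # Normal equations: M = Y X~^T (X~ X~^T)^-1
    M = Y @ X_aug.T @ np.linalg.inv(X_aug @ X_aug.T)
    return M[:, :3], M[:, 3:]                       # A (3x3) and T (3x1)

# Recover a known affine transformation from 10 random points
rng = np.random.default_rng(0)
X = rng.random((3, 10))
A_true = np.array([[1.1, 0.2, 0.0],
                   [0.0, 0.9, 0.1],
                   [0.0, 0.0, 1.0]])
T_true = np.array([[1.0], [2.0], [3.0]])
A_fit, T_fit = fit_affine(X, A_true @ X + T_true)
```

With noiseless, exactly-corresponding points the fit recovers the original $A$ and $T$; with noise or mismatched pairs it returns the least-mean-squares compromise instead.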
That was easy! But wait..
This solution to the general registration problem places no restrictions on the degrees of freedom, so there will be scaling factors and shear factors included in the transformation matrix. This is not acceptable for us, as we are looking for a rigid-body solution only. Luckily, there is a different method, proposed in "A Method for Registration of 3-D Shapes" by Paul J. Besl and Neil D. McKay, which enforces our rigid-body condition using singular value decomposition. The detailed mathematics can be found in the paper, so I'll give a quick rundown of the process instead:
1. Formulate the problem as such:

$$Y = RX + T$$

2. Find the "center of mass" of our two point clouds (i.e. the mean of the x, y and z coords of all points):

$$\mu_X = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad \mu_Y = \frac{1}{n}\sum_{i=1}^{n} Y_i$$

3. Find the 3x3 covariance matrix, $H$:

$$H = \sum_{i=1}^{n} \left(X_i - \mu_X\right)\left(Y_i - \mu_Y\right)^T$$

4. Calculate the SVD of $H$ and from this calculate our rotational transformation matrix, $R$ (the rigid-body equivalent of our transformation matrix $A$ from earlier):

$$H = U\Sigma V^T \qquad R = VU^T$$

5. Calculate our translation matrix, $T$ (equivalent to our $T$ from earlier):

$$T = \mu_Y - R\mu_X$$
And that's it! Our full rigid-body transformation to map $X$ on to $Y$ is fully described by $R$ and $T$, as $Y = RX + T$. As with our previous method, this will give us the transformation which results in the minimal mean squared error. Simple!
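The five steps above translate almost line for line into code. Here's a short Python/NumPy sketch (again illustrative, with my own function names rather than anything from the paper):

```python
import numpy as np

def rigid_fit(X, Y):
    """Find R (rotation) and T (translation) minimizing ||Y - (R X + T)||
    for two paired 3 x n point clouds."""
    mu_x = X.mean(axis=1, keepdims=True)   # step 2: centers of mass
    mu_y = Y.mean(axis=1, keepdims=True)
    H = (X - mu_x) @ (Y - mu_y).T          # step 3: 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)            # step 4: SVD
    R = Vt.T @ U.T                         # rotation from the SVD
    if np.linalg.det(R) < 0:               # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    T = mu_y - R @ mu_x                    # step 5: translation between the centers
    return R, T
```

The determinant check handles a degenerate case where the SVD yields a reflection rather than a rotation, a caveat noted in the literature on this method.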
Our current solution assumes we know exactly which point in $Y$ each of our points in $X$ corresponds to. This is not the case in our problem, as we are looking for a marker-less solution. If we attempt to use the current solution with the order of the columns in our $X$ matrix randomized (in other words, the corresponding pairs of points are incorrect), the registration completely fails, as seen on the right here:
Not knowing the pairs of corresponding points makes the problem significantly more complicated, although there are numerous algorithms of varying complexity which can tackle this problem. Probably the simplest (yet fairly effective) algorithm is called Iterative Closest Point (ICP), and is what I elected to use. The algorithm works as follows:
1. For each point in our $X$ cloud, pair it with the nearest point in our $Y$ cloud (i.e. assume the closest points between the two sets correspond to one another). The distance measurements should be done using a Kd-tree structure to increase the speed (see my post on Kd-trees!).
2. Determine the least-mean-squares transformation with our SVD technique, using the points we have matched up in step 1 (i.e. the point in any column of $X$ has its nearest neighbour in $Y$ at the same column index).
3. Transform $X$ using the determined transformation matrix.
4. Iterate from step 1 until the solution converges (i.e. re-associate the points and find a new transformation until there is no further transformation required).
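Those four steps can be sketched as follows, again in Python/NumPy for illustration. I've used a brute-force nearest-neighbour search here rather than a Kd-tree purely to keep the sketch dependency-free; for real data the Kd-tree matters:

```python
import numpy as np

def icp(Y, X, iters=50, tol=1e-9):
    """Iteratively align the 3 x n cloud X to Y without known correspondences."""

    def rigid_fit(P, Q):
        # SVD-based least-mean-squares rigid fit for paired clouds (as above)
        mu_p = P.mean(axis=1, keepdims=True)
        mu_q = Q.mean(axis=1, keepdims=True)
        U, _, Vt = np.linalg.svd((P - mu_p) @ (Q - mu_q).T)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:          # reflection guard
            Vt[-1, :] *= -1
            R = Vt.T @ U.T
        return R, mu_q - R @ mu_p

    X_cur = X.copy()
    for _ in range(iters):
        # Step 1: pair each point in X with its nearest neighbour in Y
        d2 = ((Y.T[None, :, :] - X_cur.T[:, None, :]) ** 2).sum(axis=2)
        idx = d2.argmin(axis=1)
        # Step 2: least-mean-squares rigid transformation for those pairs
        R, T = rigid_fit(X_cur, Y[:, idx])
        # Step 3: apply the transformation
        X_new = R @ X_cur + T
        shift = np.linalg.norm(X_new - X_cur)
        X_cur = X_new
        # Step 4: stop once no further transformation is required
        if shift < tol:
            break
    return X_cur
```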
This technique will converge to a local minimum; that is, it will only successfully align our point clouds if the initial offsets/rotations are not too large. This brings us on to our last point.
We now have a solution to our problem which can give us a least-mean-squares optimized transformation to align our two point clouds. The only issue we have now is that it converges to a local minimum, which will not necessarily be the global minimum (the solution we want). However, as luck would have it, we can use the handy tool known as Principal Component Analysis (PCA) to give us a coarse initial registration, which we can then fine-tune using our main technique (as we then know the local minimum will be the global minimum).
The first step is to determine a rough translation. This is very straightforward - we just calculate the position of the center of mass for our two point clouds and then subtract it from the position of every point in the clouds so that they are both centered about (0,0,0).
Following this, we want to yet again calculate a covariance matrix, this time for each of $X$ and $Y$ individually. From now on the steps are the same for both, so I will just show a single example. As we have already removed the center of mass from our data, it is slightly simpler to calculate:

$$H_X = XX^T$$
Now we need to calculate the eigendecomposition of $H_X$, which will give us the eigenvalues and eigenvectors - which just so happen to have a very useful spatial meaning. The eigenvectors of $H_X$ give us three orthogonal axes, which point along the three directions in which the point cloud has the highest variance. As our two point clouds should reasonably resemble each other, it's perfectly reasonable to assume the eigenvectors from our two clouds would coincide if the clouds were aligned. As such, we can perform our coarse registration by determining the rotational transformation which is required to map the eigenvectors of $H_X$ on to those of $H_Y$. The eigenvalues correspond to the variance of the dataset along the relevant axes, which can be a good way to check the results (the more similar the eigenvalues between the two clouds, the smaller the shears/scales, i.e. strains, between the data sets, and thus the better the rigid-body registration will function). The transformation is simple to calculate:

$$V_X^T A = V_Y^T$$
Where $V_X = \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix}$ contains the eigenvectors of $H_X$ as columns, with $v_1$ the first (largest-variance) eigenvector and so on, and $V_Y$ likewise for $H_Y$. These matrices are 3x3 and as such $A$ is easy to calculate by simply inverting the matrix: $A = \left(V_X^T\right)^{-1} V_Y^T$.
Applying this transformation to $X$ will give us a good first guess at the registration, regardless of the original translation or orientation, which will ensure our local minimum is our global minimum and our registration is successful!
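The whole coarse stage can be sketched compactly in Python/NumPy (illustrative names again; note `eigh` returns eigenvalues in ascending order, hence the reordering):

```python
import numpy as np

def coarse_pca_align(Y, X):
    """Rotate the centered X cloud so its principal axes line up with Y's."""
    Xc = X - X.mean(axis=1, keepdims=True)   # center both clouds about the origin
    Yc = Y - Y.mean(axis=1, keepdims=True)
    # Eigendecomposition of the two covariance matrices
    wx, Vx = np.linalg.eigh(Xc @ Xc.T)
    wy, Vy = np.linalg.eigh(Yc @ Yc.T)
    Vx = Vx[:, np.argsort(wx)[::-1]]         # reorder: largest variance first
    Vy = Vy[:, np.argsort(wy)[::-1]]
    A = Vy @ Vx.T                            # maps X's principal axes onto Y's
    return A @ Xc
```

One caveat worth knowing: eigenvectors are only defined up to sign, so the coarse alignment can occasionally come out flipped 180 degrees about an axis; checking the few sign combinations (or simply the resulting ICP error) resolves this.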
The full MATLAB code to perform all the steps of this algorithm is as follows (NOTE: I'm using the KD tree implementation from the Statistics and Machine Learning Toolbox, but this can be replaced with another distance-checking method pretty easily):
function [X_registered] = register(Y,X)
%Returns X aligned with Y, assuming Y = R*X+T
%% Coarse registration using PCA
%Remove means from X and Y to center clouds about (0,0,0)
Y_centered = Y - repmat(mean(Y,2),1,size(Y,2));
X_centered = X - repmat(mean(X,2),1,size(X,2));
%Covariance matrices (the 1/n factor is omitted as it does not affect the eigenvectors)
H_x = X_centered*X_centered';
H_y = Y_centered*Y_centered';
%Get eigenvectors/eigenvalues from covariance matrices
[Vec_y,D_y] = eig(H_y);
D_y = diag(D_y);
[Vec_x,D_x] = eig(H_x);
D_x = diag(D_x);
%Order by largest eigenvalue (eig returns them in ascending order)
D_y = D_y(end:-1:1);
Vec_y = Vec_y(:,end:-1:1);
D_x = D_x(end:-1:1);
Vec_x = Vec_x(:,end:-1:1);
%Determine coarse transformation matrix assuming X_eig*A = Y_eig
A = inv(Vec_x')*Vec_y';
%Transform every point using this to get the coarse registration estimate,
%keeping the 3xn format
X_coarse = (X_centered'*A)';
%% Fine registration using ICP
%Remove mean from X_coarse (rotations about the origin preserve it, so this
%only needs doing once)
X_coarse = X_coarse - repmat(mean(X_coarse,2),1,size(X_coarse,2));
%Build the KD tree for the closest-neighbour searches
kdtree = KDTreeSearcher(Y_centered');
for iter = 1:100
    %Find closest neighbours and match them up using the KD tree
    y_idx = knnsearch(kdtree,X_coarse')';
    %Covariance matrix of X and Y in the matched-pairs order
    H_xy = X_coarse*Y_centered(:,y_idx)';
    %SVD of this
    [U,~,V] = svd(H_xy);
    %Calculate the incremental rotation matrix
    R = V*U';
    %Transform X_coarse using it
    X_coarse = R*X_coarse;
    %Stop iterating once no further rotation is required
    if norm(R - eye(3),'fro') < 1e-9
        break
    end
end
%Both clouds are centered about the origin at this point, so the translation
%is simply Y's original mean
X_registered = X_coarse + repmat(mean(Y,2),1,size(X_coarse,2));