First of all, thanks a lot for sharing your code! I have some questions about the training process:
Do you also use 10 iterations of the IC-LK algorithm during training?
In line 222 of trainer.py (quoted below), you seem to define two different variants of the feature loss. If I understand correctly, the variant used when pr is None is the one described in the paper, whereas the second variant seems to compare the feature difference between the last and the previous iteration of the LK algorithm. Can you explain the second variant and give some intuition about when to use which?

    pr = ptnetlk.prev_r
    if pr is not None:
        loss_r = model.AnalyticalPointNetLK.rsq(r - pr)
    else:
        loss_r = model.AnalyticalPointNetLK.rsq(r)

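To make sure I am reading the two branches correctly, here is a minimal sketch of how I understand them. The `rsq` below is only my assumed stand-in (a sum of squared entries), not necessarily what `model.AnalyticalPointNetLK.rsq` actually computes, and `feature_loss` is a hypothetical helper name.

```python
import numpy as np

def rsq(r):
    # Assumed stand-in for model.AnalyticalPointNetLK.rsq: sum of squared entries.
    return float(np.sum(np.square(r)))

def feature_loss(r, prev_r=None):
    # Variant 1 (prev_r is None): penalize the feature residual itself.
    # Variant 2: penalize the change of the residual between two LK iterations.
    if prev_r is None:
        return rsq(r)
    return rsq(r - prev_r)

r_prev = np.array([0.5, -0.5, 1.0])   # toy residual from the previous iteration
r_last = np.array([0.4, -0.5, 0.9])   # toy residual from the last iteration

loss_abs = feature_loss(r_last)            # variant 1
loss_delta = feature_loss(r_last, r_prev)  # variant 2
```

In this toy case the second variant is small even though the residual itself is still large, which is why I am unsure what it is meant to encourage.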
I'm a bit confused about how feature aggregation/random features are implemented. The code snippet below seems to implement the splitting strategy for feature computation described in the supplementary material. However, the computed features are overwritten for each new split of the point cloud, so I take it this corresponds to the random feature selection approach? Furthermore, f1 is never used in the code that follows, and f0 is overwritten when the Jacobian is computed, so I'm wondering whether this computation actually serves any purpose (other than initializing the batch norm layers) or whether it is just there for reference.

    # create a data sampler
    if mode != 'test':
        data_sampler = np.random.choice(num_points, (num_points//num_random_points, num_random_points), replace=False)
    # input through entire pointnet
    if training:
        # first, update BatchNorm modules
        f0 = self.ptnet(p0[:, data_sampler, :], 0)
        f1 = self.ptnet(p1[:, data_sampler, :], 0)
    self.ptnet.eval()
    if mode != 'test':
        for i in range(1, num_points//num_random_points-1):
            f0 = self.ptnet(p0[:, data_sampler[i], :], i)
            f1 = self.ptnet(p1[:, data_sampler[i], :], i)

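To illustrate what I mean by "overwritten", here is a small self-contained sketch of the sampler and the loop. `toy_ptnet` is a hypothetical stand-in for `self.ptnet` (a plain max-pool over points), and the sizes are made up; only the `choice` call shape and the loop bounds are taken from the snippet above.

```python
import numpy as np

num_points, num_random_points = 1000, 100
num_splits = num_points // num_random_points

rng = np.random.RandomState(0)
# Same call shape as in the quoted snippet: disjoint index splits, one per row.
data_sampler = rng.choice(num_points, (num_splits, num_random_points), replace=False)

p0 = rng.rand(2, num_points, 3)  # toy batch: 2 clouds with 1000 points each

def toy_ptnet(points):
    # Hypothetical stand-in for self.ptnet: max-pool over the point axis.
    return points.max(axis=1)

visited = []
f0 = None
for i in range(1, num_splits - 1):
    f0 = toy_ptnet(p0[:, data_sampler[i], :])  # replaces the previous split's feature
    visited.append(i)

# After the loop, f0 only reflects the last visited split.
only_last_survives = np.array_equal(f0, toy_ptnet(p0[:, data_sampler[visited[-1]], :]))
```

With these numbers the loop visits splits 1 to 8 and nothing is aggregated across them, which is what makes me think this is the random feature variant rather than feature aggregation.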
Random point selection for computing the Jacobian: I was wondering whether it is important to compute the feature vector on the same subset of the point cloud that was used to compute the Jacobian, or whether it would also be possible to, e.g., compute the Jacobian on a random subset but the feature vector on the full point cloud?
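The reason I ask is that a max-pooled global feature generally differs between a subset and the full cloud. Below is a toy sketch of that gap; the linear layer `W` and `toy_global_feature` are hypothetical and much simpler than the actual network.

```python
import numpy as np

rng = np.random.RandomState(0)
points = rng.rand(1000, 3)   # one toy point cloud
W = rng.randn(3, 8)          # hypothetical per-point linear layer

def toy_global_feature(pts):
    # Per-point features followed by max pooling, PointNet style.
    return np.max(pts @ W, axis=0)

subset = points[rng.choice(len(points), 100, replace=False)]
f_full = toy_global_feature(points)
f_sub = toy_global_feature(subset)

# Elementwise gap between the two global features (never negative for max pooling).
gap = f_full - f_sub
```

If the Jacobian is computed from the subset while the residual uses the full-cloud feature, the two would be linearized around slightly different features, and I do not know whether that matters in practice.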
In general, what would be the recommended setup for training? From the supplementary material, it seems that random features + random Jacobian gives the best results (and this also seems to be what is implemented), but my initial tests, loading the pretrained model and using this setup, give relatively poor results (even if I sample more than 100 points) unless I turn on voxelization (which is not practical during training, since the number of voxels with points in them is not constant). Any guidance on this?
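To make the voxelization problem concrete, here is a minimal sketch that counts occupied voxels for two toy clouds on the same grid. `occupied_voxels` and the voxel size are my own assumptions and not the voxelization used in the repository.

```python
import numpy as np

def occupied_voxels(points, voxel_size):
    # Integer voxel index per point, then keep each occupied voxel once.
    idx = np.floor(points / voxel_size).astype(np.int64)
    return np.unique(idx, axis=0)

rng = np.random.RandomState(0)
cloud_a = rng.rand(500, 3)                               # fills the unit cube
cloud_b = rng.rand(500, 3) * np.array([1.0, 1.0, 0.3])   # flatter cloud

n_a = len(occupied_voxels(cloud_a, 0.25))
n_b = len(occupied_voxels(cloud_b, 0.25))
```

The two counts differ depending on the shape of the cloud, so the per-sample tensors cannot simply be stacked into a batch without padding or masking.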
Thanks again for sharing your code and sorry about the wall of text. I would be super grateful for your help!