Enable RANSAC in triangulation #292

Merged · 34 commits · Jan 25, 2022

Conversation

@travisdriver (Collaborator) commented Sep 13, 2021

Enabled RANSAC in the triangulation module.

@travisdriver (Collaborator Author)

The results on Skydio-32 look promising (will post other results once the benchmarking finishes):

Before

| Metric name | Value |
| --- | --- |
| number_cameras | 32 |
| number_tracks_unfiltered | 5724 |
| mean_3d_track_lengths_unfiltered | 2.67313 |
| mean_reprojection_errors_unfiltered_px | 10.9595 |
| mean_rotation_error_angle_deg | 7.326 |
| mean_translation_error_distance | 0.651919 |
| mean_translation_angle_error_deg | 11.4042 |
| number_tracks_filtered | 1256 |
| mean_3d_track_lengths_filtered | 2.08758 |
| mean_reprojection_errors_filtered_px | 1.39266 |

After

| Metric name | Value |
| --- | --- |
| number_cameras | 32 |
| number_tracks_unfiltered | 3482 |
| mean_3d_track_lengths_unfiltered | 2.04451 |
| mean_reprojection_errors_unfiltered_px | 0.669101 |
| mean_rotation_error_angle_deg | 6.7561 |
| mean_translation_error_distance | 0.920434 |
| mean_translation_angle_error_deg | 12.0292 |
| number_tracks_filtered | 3445 |
| mean_3d_track_lengths_filtered | 2.04325 |
| mean_reprojection_errors_filtered_px | 0.641061 |

@johnwlambert (Collaborator)

Thanks for the PR, Travis. Cool to see.

I'm curious about why the average camera distance from GT goes from 0.65 to 0.9. Could you try sharing what the visualizations look like before and after the change?

@johnwlambert (Collaborator)

Travis, is there any way you could share some examples of how RANSAC is doing? Like for some visualizations of tracks (using our DA track viz script that dumps vertically stacked patches), could we see which ones RANSAC is labeling as outliers? Do the outliers look correct in practice? I think having GT metrics here on 3d GT will help too : - )

@travisdriver (Collaborator Author)

> Travis, is there any way you could share some examples of how RANSAC is doing? Like for some visualizations of tracks (using our DA track viz script that dumps vertically stacked patches), could we see which ones RANSAC is labeling as outliers? Do the outliers look correct in practice? I think having GT metrics here on 3d GT will help too : - )

Which GT metrics are you referring to?

@johnwlambert (Collaborator)

> > Travis, is there any way you could share some examples of how RANSAC is doing? Like for some visualizations of tracks (using our DA track viz script that dumps vertically stacked patches), could we see which ones RANSAC is labeling as outliers? Do the outliers look correct in practice? I think having GT metrics here on 3d GT will help too : - )
>
> Which GT metrics are you referring to?

Oh I meant the track precision/recall from the 3d model -- I think we might need a bit more DA metrics before merging this DA PR

@travisdriver (Collaborator Author) commented Oct 8, 2021

The performance with RANSAC is relatively positive so far. Will continue to update as I run tests.

Skydio-32 (deep_front_end, skydio-32, 32, 760, true)

| Metric name | w/o RANSAC | w/ RANSAC |
| --- | --- | --- |
| number_cameras | 30 | 30 |
| number_tracks_unfiltered | 8532 | 8577 |
| mean_3d_track_lengths_unfiltered | 2.97703 | 2.69255 |
| mean_reprojection_errors_unfiltered_px | 3.37457 | 1.18834 |
| mean_rotation_error_angle_deg | 0.175664 | 0.227616 |
| mean_translation_error_distance | 0.0192357 | 0.0256501 |
| mean_translation_angle_error_deg | 0.396858 | 0.477618 |
| Filtered tracks triangulated with GT cams: SUCCESS | 6403 | 7201 |
| Filtered tracks triangulated with GT cams: EXCEEDS_REPROJ_THRESH | 106 | 109 |
| Filtered tracks triangulated with GT cams: CHEIRALITY_FAILURE | 12 | 14 |
| number_tracks_filtered | 6521 | 7324 |
| mean_3d_track_lengths_filtered | 2.50437 | 2.45508 |
| mean_reprojection_errors_filtered_px | 0.809112 | 0.797346 |

RC3 (deep_front_end, 2011205_rc3, 65, 1024, true)

| Metric name | w/o RANSAC | w/ RANSAC |
| --- | --- | --- |
| number_cameras | 22 | 13 |
| number_tracks_unfiltered | 2664 | 2972 |
| mean_3d_track_lengths_unfiltered | 2.00188 | 2 |
| mean_reprojection_errors_unfiltered_px | 0.422914 | 0.37011 |
| mean_rotation_error_angle_deg | 8.10497 | 4.86258 |
| mean_translation_error_distance | 2553.99 | 1217.21 |
| mean_translation_angle_error_deg | 36.4078 | 27.7624 |
| Filtered tracks triangulated with GT cams: SUCCESS | 2663 | 2972 |
| number_tracks_filtered | 2664 | 2972 |
| mean_3d_track_lengths_filtered | 2.00188 | 2 |
| mean_reprojection_errors_filtered_px | 0.422914 | 0.37011 |

@johnwlambert (Collaborator)

How does the dashboard look for this PR, Travis? Could you share the file if you have created it?

@travisdriver (Collaborator Author) commented Nov 28, 2021

Here are the results for 100 RANSAC iterations with a 5 pixel threshold.

I think door-12 may be so "converged" that small perturbations in the results make it look like it's doing really badly in the dashboard. For example, the ~1000% increase in translation error comes from it going from 0.002 to 0.02 (keep in mind the "ground truth" here is the COLMAP solution, not actual ground truth). Nevertheless, the results are still not great.

I still think there's something peculiar going on. I would have at least expected the number of unfiltered tracks to increase across the board (since we're no longer rejecting entire tracks), but this is not the case. Also, the mean track length seems to have decreased substantially in most cases, suggesting that RANSAC is rejecting a significant number of track correspondences (maybe we need to increase the pixel threshold).

I will investigate and debug.

[dashboard screenshot]

@travisdriver (Collaborator Author)

Actually I guess I would expect the number of unfiltered tracks to decrease since the default reprojection error threshold is 100 pixels on master...

@travisdriver travisdriver marked this pull request as draft November 29, 2021 15:35
@travisdriver travisdriver changed the title [WIP] Enabled RANSAC in triangulation Enable RANSAC in triangulation Nov 29, 2021
@travisdriver (Collaborator Author)

Here are the results with a RANSAC threshold of 100 pixels.

My hypothesis is that the poses from the averaging module are not accurate enough to perform RANSAC at more reasonable thresholds (this is supported by the huge performance increase afforded by #225).

[benchmark plot]

@johnwlambert (Collaborator)

Thanks for the PR, Travis. I'm taking a look at the report now:

visual_comparison_dashboard 3.html.zip

Just trying to figure out which changes are random vs. deterministic. I think you have a better sense than me, having run the CI a few times on this branch.

Maybe if we fix the determinism bug first it will be a bit clearer. What are your thoughts?

@johnwlambert (Collaborator)

Hi @travisdriver the PR looks good. Do you have a sense for the new runtime? I'm curious if it's 1% slower, 10% slower, or 100%+ slower with the new RANSAC loop.

@travisdriver (Collaborator Author)

> Hi @travisdriver the PR looks good. Do you have a sense for the new runtime? I'm curious if it's 1% slower, 10% slower, or 100%+ slower with the new RANSAC loop.

I'm not sure what you're referring to; there's no increase in runtime according to the CI.

@johnwlambert (Collaborator)

> > Hi @travisdriver the PR looks good. Do you have a sense for the new runtime? I'm curious if it's 1% slower, 10% slower, or 100%+ slower with the new RANSAC loop.
>
> I'm not sure what you're referring to; there's no increase in runtime according to the CI.

I was thinking that since we're adding a new inner Python for-loop for our 2-view geometry estimation, it could slow things down considerably. But I guess since we limit it to just 100 hypotheses, the effect on runtime is negligible?

@travisdriver (Collaborator Author)

> > > Hi @travisdriver the PR looks good. Do you have a sense for the new runtime? I'm curious if it's 1% slower, 10% slower, or 100%+ slower with the new RANSAC loop.
> >
> > I'm not sure what you're referring to; there's no increase in runtime according to the CI.
>
> I was thinking that since we're adding a new inner Python for-loop for our 2-view geometry estimation, it could slow things down considerably. But I guess since we limit it to just 100 hypotheses, the effect on runtime is negligible?

Yes, triangulation is a relatively inexpensive operation.
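To make the loop discussed above concrete, here is a minimal, self-contained sketch of a pair-sampling RANSAC triangulation loop. The function names, the toy orthographic cameras, and the `triangulate_pair` helper are hypothetical illustrations for this thread, not GTSFM's actual API:

```python
import itertools
import numpy as np

def ransac_triangulate(project_fns, keypoints_px, triangulate_pair, reproj_thresh_px=5.0):
    """Exhaustive 2-view RANSAC over camera pairs for one track.

    project_fns: per-camera functions mapping a 3D point to 2D pixel coordinates.
    keypoints_px: (N, 2) measured keypoints, one per camera in the track.
    triangulate_pair: helper that triangulates a candidate point from cameras (i, j).
    Returns the candidate with the most inliers, plus the per-camera inlier mask.
    """
    num_cams = len(project_fns)
    best_point, best_inliers = None, np.zeros(num_cams, dtype=bool)
    for i, j in itertools.combinations(range(num_cams), 2):
        point = triangulate_pair(i, j)
        if point is None:  # e.g. cheirality failure for this pair
            continue
        # Score the hypothesis by reprojection error in every camera of the track.
        errors = np.array([
            np.linalg.norm(project_fns[k](point) - keypoints_px[k])
            for k in range(num_cams)
        ])
        inliers = errors < reproj_thresh_px
        if inliers.sum() > best_inliers.sum():
            best_point, best_inliers = point, inliers
    return best_point, best_inliers

# Toy usage: three orthographic cameras, with an outlier measurement on camera 2.
true_point = np.array([1.0, 2.0, 3.0])
project_fns = [lambda p: p[:2]] * 3  # orthographic projection: drop z
keypoints_px = np.array([[1.0, 2.0], [1.0, 2.0], [10.0, 10.0]])

def triangulate_pair(i, j):
    # Pretend triangulation recovers the true point only from the two inlier views.
    return true_point if {i, j} <= {0, 1} else np.array([10.0, 10.0, 0.0])

point, inliers = ransac_triangulate(project_fns, keypoints_px, triangulate_pair)
```

Note this sketch enumerates all pairs for clarity; a real implementation would instead sample a bounded number of hypotheses, in line with the 100-hypothesis cap mentioned above.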

@travisdriver (Collaborator Author)

@johnwlambert What else needs to be done in order to merge this PR?

@johnwlambert (Collaborator) commented Jan 18, 2022

Hi Travis, thanks for your patience here. I'm a bit confused about the performance on skydio-8 and skydio-32. I triggered the CI one more time to see if those regressions were random. @akshay-krishnan @ayushbaid could you all take a look? I'm fine with landing the new config just for astronet right away, though.
visual_comparison_dashboard.html.zip

[dashboard screenshot]

I saw that the runtime went from 58 min to 2 hours 20 min just for one benchmark (deep front end RC3). I'm running again to see if that's repeatable.
[CI runtime screenshot]

@johnwlambert (Collaborator)

I checked the dashboard for the second run, and still seeing some confusing things. Hopefully if we land the repeatability stuff first, it will be more clear for us
[dashboard screenshot]

@travisdriver (Collaborator Author)

> I saw that the runtime went from 58 min to 2 hours 20 min just for one benchmark (deep front end RC3). I'm running again to see if that's repeatable.

This is from the most recent commit on master; the benchmark runtime fluctuates a lot for some reason, but there are plenty of instances other than this branch where it takes >2 hours.

[CI runtime screenshot]

@johnwlambert (Collaborator)

> > I saw that the runtime went from 58 min to 2 hours 20 min just for one benchmark (deep front end RC3). I'm running again to see if that's repeatable.
>
> This is from the most recent commit on master; the benchmark runtime fluctuates a lot for some reason, but there are plenty of instances other than this branch where it takes >2 hours.
>
> [CI runtime screenshot]

I see, cool. Didn't realize runtime was fluctuating that much too. I wonder if the hardware specs are different between runs on the CI

@travisdriver (Collaborator Author)

> I checked the dashboard for the second run, and still seeing some confusing things. Hopefully if we land the repeatability stuff first, it will be more clear for us

I don't think an increase in angular error from 0.3 to 0.8 should completely hold up the PR, especially when there are significant improvements across the board. The repeatability issues affect all PRs, but they seem to only be holding up this one...

> I checked the dashboard for the second run, and still seeing some confusing things. Hopefully if we land the repeatability stuff first, it will be more clear for us
>
> [dashboard screenshot]

Thanks for looking at the benchmarks. What's confusing from the output? A .5 degree increase in the mean rotation angle error seems insignificant compared to the other improvements.

```python
dyn_num_hypotheses = int(
    np.log(1 - self.confidence)
    / np.log(1 - self.min_inlier_ratio ** NUM_SAMPLES_PER_RANSAC_HYPOTHESIS)
    * self.dyn_num_hypotheses_multiplier
)
```
Review comment (Collaborator):

Hi Travis, could you add a reference for this dynamic multiplier? If COLMAP is using it, do you mind adding an inline comment w/ URL to the COLMAP source where they do it?

Reply (Collaborator Author):

COLMAP uses it, but all it does is multiply the number of dynamic hypotheses by a constant, so I'm indifferent about whether it warrants a reference.
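For context, the dynamic hypothesis count above can be sketched as a standalone function. The default values here mirror the numbers quoted later in this thread (confidence 0.9999, min inlier ratio 0.1, 2 samples per hypothesis, multiplier 3), but they are assumptions, not a copy of the merged source:

```python
import numpy as np

# Each RANSAC hypothesis is triangulated from a sampled camera pair.
NUM_SAMPLES_PER_RANSAC_HYPOTHESIS = 2

def dyn_num_hypotheses(confidence: float = 0.9999,
                       min_inlier_ratio: float = 0.1,
                       multiplier: float = 3.0) -> int:
    """Standard RANSAC iteration count: enough hypotheses that, with
    probability `confidence`, at least one is drawn from inliers only,
    scaled by a safety multiplier (the COLMAP-style constant discussed above)."""
    return int(
        np.log(1 - confidence)
        / np.log(1 - min_inlier_ratio ** NUM_SAMPLES_PER_RANSAC_HYPOTHESIS)
        * multiplier
    )

print(dyn_num_hypotheses())                       # 2749 with the defaults above
print(dyn_num_hypotheses(min_inlier_ratio=0.5))   # 96: far fewer when inliers are assumed common
```

This also answers the compute question raised below: raising the assumed minimum inlier ratio shrinks the hypothesis count dramatically.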

gtsfm/data_association/point3d_initializer.py (outdated review thread, resolved)
```python
mode: TriangulationParam

# RANSAC parameters
min_inlier_ratio: float = 0.1
```
Review comment (Collaborator):

Isn't that a really small ratio? Do we have a min_num_inliers too? Just to be sure this does not select tracks with one or two inliers.

Reply (Collaborator Author):

Yes, a smaller ratio will result in a larger number of dynamic hypotheses.

Keep in mind that the measurements are the cameras (with the corresponding keypoint), so a track with one inlier would result in an underdetermined problem (which we check for, and triangulation would fail anyway). Two inliers are allowed, but tracks are discarded according to min_track_length.

Review comment (Collaborator):

Okay, I wonder if a higher min inlier ratio produces better hypotheses? It would definitely save some compute.

gtsfm/data_association/point3d_initializer.py (outdated review thread, resolved)
@ayushbaid (Contributor) left a comment:

LGTM except a few minor comments.

Do you want to wait for the repeatability issues in 1dsfm to land before using it in astronet, or do you want to start using it regardless?

New file (+68 lines):

```yaml
# Relaxes triangulation RANSAC threshold to 100 (10x threshold of deep_front_end.yaml)
```
Review comment (Contributor):

Can we name it something which mentions this attribute value being high instead of tagging it with a database?

Reply (Collaborator):

I agree, I think this would improve clarity

Reply (Collaborator Author):

AstroNet is the only dataset that uses the config by default, so it seems fitting to name it accordingly.

Comment on lines 47 to 53:

```yaml
triangulation_options:
  _target_: gtsfm.data_association.point3d_initializer.TriangulationOptions
  reproj_error_threshold: 10
  mode:
    _target_: gtsfm.data_association.point3d_initializer.TriangulationSamplingMode
    value: 0 # 0 corresponds to NO_RANSAC
  max_num_hypotheses: 20
```
Review comment (Contributor):

Why change to RANSAC in the unit test?

Reply (Collaborator):

I didn't quite follow here Ayush -- I believe we were using NO_RANSAC before, and the test is still using NO_RANSAC now.

```python
mode: triangulation mode, which dictates whether or not to use robust estimation.
min_inlier_ratio: a priori assumed minimum probability that a point is an inlier.
confidence: desired confidence that at least one hypothesis is outlier free.
dyn_num_hypotheses_multiplier: multiplication factor for dynamically computed hypotheses based on confidence.
```
Review comment (Contributor):

Can you elaborate on what this is in the PR comments?

It would also be good to have an example computation of the number of hypotheses in the docstring.

Reply (Collaborator):

@travisdriver we could add the following math:

The RANSAC module defaults to 2749 iterations, computed as `np.log(1 - 0.9999) / np.log(1 - 0.1 ** 2) * 3 ≈ 2749.3`.

Reply (Collaborator):

I added this to the docstring

```python
mode: TriangulationParam
reproj_error_thresh: float
num_ransac_hypotheses: Optional[int] = None
options: TriangulationOptions
```
Review comment (Contributor):

Can we rename it to triangulation_options to prevent ambiguity?

Reply (Collaborator):

I feel like maybe that would be too wordy/long. Just my two cents.

Reply (Collaborator Author):

I think just options is sufficient since there aren't any similar attributes.

@johnwlambert (Collaborator) left a comment:

LGTM

@johnwlambert johnwlambert merged commit 185cb97 into master Jan 25, 2022
@johnwlambert johnwlambert deleted the use-ransac-in-triangulation branch January 25, 2022 18:18
4 participants