Positional encoding in RadianceNet #2
Comments
Hi @shrubb. The raw 3D coordinates may still be fed directly, as in IDR, in keeping with the official implementation's choice; but of course feeding embedded input may lead to better results. As for the training speed, in my test with volsdf.yaml:
```
....
(radiance_net): RadianceNet(
  (embed_fn): Identity()
  (embed_fn_view): Embedder()
  (layers): ModuleList(
    (0): DenseLayer(
      in_features=289, out_features=256, bias=True
      (activation): ReLU(inplace=True)
    )
    ...
0%| | 97/100000 [00:25<6:06:50, 4.54it/s, loss_img=0.135, loss_total=0.137, lr=0.000499]
```
```
(radiance_net): RadianceNet(
  (embed_fn): Embedder()
  (embed_fn_view): Embedder()
  (layers): ModuleList(
    (0): DenseLayer(
      in_features=325, out_features=256, bias=True
      (activation): ReLU(inplace=True)
    )
    ...
0%| | 107/100000 [00:28<6:07:23, 4.53it/s, loss_img=0.16, loss_total=0.164, lr=0.000499]
```
volsdf_nerfpp_blended.yaml (you may notice that the number of training iterations rises from 100000 to 200000):
```
0%| | 131/200000 [00:34<14:43:28, 3.77it/s, loss_img=0.182, loss_total=0.185, lr=0.0005]
0%| | 209/200000 [00:52<13:54:47, 3.99it/s, loss_img=0.215, loss_total=0.22, lr=0.0005]
0%| | 121/200000 [00:32<14:50:09, 3.74it/s, loss_img=0.162, loss_total=0.163, lr=0.0005]
```
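For context, `Identity()` vs `Embedder()` in the printouts above corresponds to the NeRF-style positional encoding being disabled (multires = -1) or enabled. Below is a minimal sketch of such an embedder and a selection helper; the names `Embedder` / `get_embedder` and the exact details (frequency spacing, `include_input`) are assumptions, the repo's actual implementation may differ:

```python
import torch
import torch.nn as nn

class Embedder(nn.Module):
    """NeRF-style positional encoding:
    x -> [x, sin(2^0 x), cos(2^0 x), ..., sin(2^(L-1) x), cos(2^(L-1) x)].
    Minimal sketch; details may differ from the repo's Embedder."""
    def __init__(self, input_dim: int = 3, multires: int = 6, include_input: bool = True):
        super().__init__()
        self.include_input = include_input
        # Frequency bands 2^0 ... 2^(multires-1), as in the original NeRF embedding.
        self.register_buffer("freq_bands", 2.0 ** torch.arange(multires).float())
        self.out_dim = (input_dim if include_input else 0) + input_dim * 2 * multires

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        parts = [x] if self.include_input else []
        for freq in self.freq_bands:
            parts += [torch.sin(x * freq), torch.cos(x * freq)]
        return torch.cat(parts, dim=-1)

def get_embedder(multires: int, input_dim: int = 3):
    """multires < 0 -> no encoding (Identity), otherwise a positional Embedder (hypothetical helper)."""
    if multires < 0:
        return nn.Identity(), input_dim
    emb = Embedder(input_dim=input_dim, multires=multires)
    return emb, emb.out_dim
```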
As for whether training collapses or not, I'm running training tests on BlendedMVS, to be continued...
Hi @shrubb. At the early training stage, the dominating representing branch needs to be the surface (geometry) network. If the embedded 3D coordinates are fed to the radiance network instead of the raw 3D coordinates, the radiance network can start explaining the image colors directly from position, without going through the surface network. That is to say, the dominating representing branch needs to be the first one among the following three at early stages: (1) x → surface network → geometry feature → radiance network, (2) the (embedded) 3D location input of the radiance network, (3) the (embedded) view-direction input of the radiance network.
Embedding the location and view-direction inputs of the radiance network introduces larger gradients and "preempts" more of the gradient flow for the radiance net, leaving relatively less gradient for the surface network, as shown below.
Practical comparison of normals validation during training: [normals validation images for the three cases] (you can see that no meaningful shapes are learned in the latter two cases)
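The "larger gradients" point can be made concrete with the chain rule: each sin(2^k·x)/cos(2^k·x) component multiplies the gradient with respect to x by 2^k, so with multires = 6 the highest-frequency band scales input gradients by 2^5 = 32. A toy check below, illustrative only and not the repo's code:

```python
import torch

# Compare the gradient magnitude flowing back to x when the downstream quantity
# depends on raw x vs. on a NeRF-style embedding of x.
def embed(x, L=6):
    freqs = 2.0 ** torch.arange(L).float()
    return torch.cat([x] + [f(x * fr) for fr in freqs for f in (torch.sin, torch.cos)], dim=-1)

x = torch.rand(1000, 3, requires_grad=True)

grad_raw = torch.autograd.grad(x.sum(), x)[0]         # d(sum x)/dx = 1 everywhere
grad_emb = torch.autograd.grad(embed(x).sum(), x)[0]  # includes 2^k * cos / -2^k * sin terms

print(grad_raw.abs().mean().item())  # 1.0
print(grad_emb.abs().mean().item())  # much larger, dominated by the highest-frequency band
```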
But still, in [...], the [...]
Makes perfect sense, thank you for the insight and the experiments! 🎉
Glad to know it helps 😄
Hi, and thanks a lot for the implementation!
neurecon/configs/volsdf_nerfpp_blended.yaml, lines 41 to 42 at 972e810:
I was wondering why we are not using positional encoding here and are instead feeding raw 3D coordinates and view directions? Especially since IDR does not do this, and the defaults are 6 and 4... 🤔
I tried changing these from -1 to 6 and/or 4, and training collapses or at least goes much slower... To me, this seems extremely weird!
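For reference, here is the input-width arithmetic behind the two `in_features` values printed in the comments above, and what the -1 → 6 / 4 change does to it. This assumes the IDR-style radiance-net input [x, view direction, normal, geometry feature] with a 256-d feature and `include_input=True` encoding, which is what the 289/325 figures imply:

```python
def pe_dim(multires: int, input_dim: int = 3) -> int:
    """Output width of a NeRF-style positional encoding with include_input=True.
    multires < 0 means the input is passed through unchanged (Identity)."""
    return input_dim if multires < 0 else input_dim * (1 + 2 * multires)

# raw x (multires = -1), embedded view (multires_view = 4):
print(pe_dim(-1) + pe_dim(4) + 3 + 256)  # 3 + 27 + 3 + 256 = 289
# embedded x (multires = 6), embedded view (multires_view = 4):
print(pe_dim(6) + pe_dim(4) + 3 + 256)   # 39 + 27 + 3 + 256 = 325
```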