Failed to train the APNet #3

Open
lubovbyc opened this issue May 10, 2022 · 16 comments

Comments

@lubovbyc

Following the default procedure and parameter settings, I cannot train the APNet successfully. For different latent inputs, the network always produces the same result, e.g. the mean face.
00001253

I tried to print the output of each layer (as shown below). It seems the network has already collapsed.
image
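
One way to run the per-layer check described above is to measure how much each layer's output varies across different latent inputs. The following is a minimal, dependency-free sketch, not code from the repository: `per_layer_spread`, `first_collapsed_layer`, and the layer names are hypothetical, and in practice the activations would be captured from the network (for example with forward hooks) rather than written by hand.

```python
from statistics import pvariance


def per_layer_spread(activations_by_layer):
    """Mean per-unit variance across the batch, for each layer.

    activations_by_layer maps a layer name to a list of samples,
    where each sample is a list of floats (one value per unit).
    """
    spread = {}
    for name, batch in activations_by_layer.items():
        # zip(*batch) regroups the values so each tuple holds one unit across all samples.
        per_unit = [pvariance(values) for values in zip(*batch)]
        spread[name] = sum(per_unit) / len(per_unit)
    return spread


def first_collapsed_layer(activations_by_layer, tol=1e-8):
    """Name of the first layer whose output is (almost) identical for every input."""
    for name, value in per_layer_spread(activations_by_layer).items():
        if value < tol:
            return name
    return None


# Hypothetical activations recorded for three different latent inputs.
acts = {
    "fc1": [[0.1, 0.9], [0.7, 0.2], [0.4, 0.5]],
    "fc2": [[0.3, 0.3], [0.3, 0.3], [0.3, 0.3]],
}
print(first_collapsed_layer(acts))
```

A layer whose spread is close to zero produces the same output regardless of the latent, so the first such layer is a reasonable place to start looking for the collapse.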

@cassiePython
Owner

@lubovbyc It seems the optimization collapsed, though I'm not sure. Have you tried other losses, such as the PDC loss or the VDC loss? You could combine all the losses or try another optimizer. If it is still bugging you, please share your generated data with me and I will try it on my server.
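
For readers unfamiliar with the terms, here is a rough sketch of what combining a PDC (parameter distance cost) term with a VDC (vertex distance cost) term could look like. It assumes a simple linear shape model (`mean_shape` plus `basis` times the parameters); the function names, the weights `w_pdc` and `w_vdc`, and the toy model are illustrative assumptions, not the repository's implementation.

```python
def pdc_loss(pred, target):
    """Mean squared error measured directly on the predicted parameters."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def reconstruct(params, mean_shape, basis):
    """Linear shape model: each vertex coordinate = mean + basis row . params."""
    return [
        m + sum(b * p for b, p in zip(row, params))
        for m, row in zip(mean_shape, basis)
    ]


def vdc_loss(pred, target, mean_shape, basis):
    """Mean squared error measured on the vertices reconstructed from the parameters."""
    v_pred = reconstruct(pred, mean_shape, basis)
    v_tgt = reconstruct(target, mean_shape, basis)
    return sum((a - b) ** 2 for a, b in zip(v_pred, v_tgt)) / len(v_pred)


def combined_loss(pred, target, mean_shape, basis, w_pdc=1.0, w_vdc=1.0):
    """Weighted sum of the two terms; the weights are tuning knobs (assumed)."""
    return (w_pdc * pdc_loss(pred, target)
            + w_vdc * vdc_loss(pred, target, mean_shape, basis))


# Toy model: 2 parameters controlling 3 vertex coordinates.
mean_shape = [0.0, 0.0, 0.0]
basis = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(combined_loss([1.0, 0.0], [0.0, 0.0], mean_shape, basis, w_vdc=0.5))
```

The PDC term treats every parameter equally, while the VDC term penalizes errors according to their effect on the reconstructed geometry, which is why the two are often tried together.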

@lubovbyc
Author

@cassiePython Thanks for your reply!

> Have you tried other losses, such as the PDC loss or the VDC loss? You could combine all the losses or try another optimizer.

Yes. I have tried different losses several times with different weights. I also tried decreasing the initial learning rate. However, all these attempts only seem to slow down the collapse.

I'm not sure whether I missed some important details. Is the training of APNet normally stable? I have uploaded the generated data to Google Drive. Please check it when you have time. Thanks a lot!

@cassiePython
Owner

@lubovbyc Thanks for sharing. I will check it right after the SIGGRAPH Asia submission. Thanks for your patience.

@cassiePython
Owner

@lubovbyc Hi. I have tried to alleviate this. You can add more MLP layers to the APNet and use the PDC loss to improve robustness.

@lubovbyc
Author

@cassiePython Thanks for your reply! I will give it a shot and check whether it works.

@roundchuan

> @lubovbyc Hi. I have tried to alleviate this. You can add more MLP layers to the APNet and use the PDC loss to improve robustness.

Can you provide the details of the changes needed to avoid the collapse? I also get the mean face with the default code.

@roundchuan

> @cassiePython Thanks for your reply! I will give it a shot and check whether it works.

Did you get correct results from APNet by following the author's advice? I am running into the same problem as you.

@hxngiee

hxngiee commented Jun 6, 2022

I am also struggling to get good results from APNet. Should I use the rendering loss and the landmark loss as well?

@cassiePython
Owner

cassiePython commented Jun 6, 2022

@lubovbyc I am still confused by the problem. I did not see it on my dataset. I have attached a dataset with 4K images and the corresponding checkpoint. Please check whether it works on your device: https://portland-my.sharepoint.com/:u:/g/personal/cwang355-c_my_cityu_edu_hk/EWMWjP8DHtpEqvhLtZTyfr0BerKDVixlbx8zApUS3QTngA?e=GXsNKw.

Also, as a first step, please try adding the PDC loss.

@cassiePython
Owner

> I am also struggling to get good results from APNet. Should I use the rendering loss and the landmark loss as well?

  1. Before using the pseudo ground truth to train the APNet, I tried combining the rendering loss and the landmark loss. Unfortunately, I failed to train the APNet that way.
  2. However, I have not tried the WPDC loss together with the rendering loss and the landmark loss.
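
As background on the WPDC (weighted parameter distance cost) mentioned here: the idea, as formulated in the 3DDFA line of work, is to weight each parameter's error by how much that parameter alone moves the reconstructed vertices. The sketch below illustrates this under the assumption of a toy linear shape model; the function names and the model are hypothetical and not taken from the repository.

```python
import math


def reconstruct(params, mean_shape, basis):
    """Toy linear shape model: each vertex coordinate = mean + basis row . params."""
    return [
        m + sum(b * p for b, p in zip(row, params))
        for m, row in zip(mean_shape, basis)
    ]


def wpdc_weights(pred, target, mean_shape, basis):
    """Per-parameter weights proportional to the vertex error each one causes.

    For parameter i, the target parameters are kept except for entry i,
    which is swapped for the prediction; the resulting vertex displacement
    is that parameter's raw weight. Weights are normalized to sum to 1.
    """
    v_target = reconstruct(target, mean_shape, basis)
    raw = []
    for i in range(len(target)):
        mixed = list(target)
        mixed[i] = pred[i]
        v_mixed = reconstruct(mixed, mean_shape, basis)
        raw.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(v_mixed, v_target))))
    total = sum(raw)
    if total == 0.0:
        # Prediction already matches the target: fall back to uniform weights.
        return [1.0 / len(raw)] * len(raw)
    return [w / total for w in raw]


def wpdc_loss(pred, target, mean_shape, basis):
    """Weighted squared parameter error; weights are treated as constants."""
    weights = wpdc_weights(pred, target, mean_shape, basis)
    return sum(w * (p - t) ** 2 for w, p, t in zip(weights, pred, target))


# Parameter 0 moves the geometry twice as much as parameter 1.
print(wpdc_weights([1.0, 1.0], [0.0, 0.0], [0.0, 0.0], [[2.0, 0.0], [0.0, 1.0]]))
```

In an actual training loop the weights would be computed without gradient tracking, so that they only rescale the parameter error rather than becoming something the optimizer can shrink.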

@roundchuan

> @lubovbyc I am still confused by the problem. I did not see it on my dataset. I have attached a dataset with 4K images and the corresponding checkpoint. Please check whether it works on your device: https://portland-my.sharepoint.com/:u:/g/personal/cwang355-c_my_cityu_edu_hk/EWMWjP8DHtpEqvhLtZTyfr0BerKDVixlbx8zApUS3QTngA?e=GXsNKw.
>
> Also, as a first step, please try adding the PDC loss.

Using the model and data you provided, I still get the same face for different latent codes. One difference, however, is that the rendered face is no longer the mean face when using your model.

@cassiePython
Owner

@roundchuan
Can you get results like this:

image

@roundchuan

> @roundchuan Can you get results like this:

No, all the rendered faces are the same. I also printed "param_lst", and the outputs of APNet are all the same.

@roundchuan

> > I am also struggling to get good results from APNet. Should I use the rendering loss and the landmark loss as well?
>
> 1. Before using the pseudo ground truth to train the APNet, I tried combining the rendering loss and the landmark loss. Unfortunately, I failed to train the APNet that way.
> 2. However, I have not tried the WPDC loss together with the rendering loss and the landmark loss.

tt

This is what the results look like when I use your data and checkpoint.

@hxngiee

hxngiee commented Jun 7, 2022

> > @lubovbyc I am still confused by the problem. I did not see it on my dataset. I have attached a dataset with 4K images and the corresponding checkpoint. Please check whether it works on your device: https://portland-my.sharepoint.com/:u:/g/personal/cwang355-c_my_cityu_edu_hk/EWMWjP8DHtpEqvhLtZTyfr0BerKDVixlbx8zApUS3QTngA?e=GXsNKw.
> > Also, as a first step, please try adding the PDC loss.
>
> Using the model and data you provided, I still get the same face for different latent codes. One difference, however, is that the rendered face is no longer the mean face when using your model.

@cassiePython Thanks for your kind reply. Could you attach constants.pkl as well? It seems the file was omitted.

@cassiePython
Owner

cassiePython commented Jun 16, 2022

TO ALL: I recently generated more datasets of different sizes (e.g. 4K, 6K, 8K, 10K, 15K, 20K) and found that, on some of them, training the APNet always ends up in a local minimum (e.g. the mean face). Until this issue is fixed, you can use my pre-trained model: https://drive.google.com/drive/folders/1qNvRu8vLPD278FW7GS-I9p6-yxYhKZY9

I am trying to:

  1. Add more data to make the latent space compact.
  2. Add the rendering loss and landmark loss, following StyleRig and traditional face reconstruction methods.
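
To make the second item more concrete, a landmark loss is typically a distance between the 2D landmarks projected from the predicted face and the landmarks detected on the image. The sketch below is only an illustration of that idea and of how such a term might be added to the others; `landmark_loss`, `total_loss`, and the default weights are assumptions, not the planned implementation.

```python
def landmark_loss(pred_landmarks, target_landmarks):
    """Mean squared 2D distance between matching landmark pairs.

    Both arguments are equal-length lists of (x, y) tuples.
    """
    if len(pred_landmarks) != len(target_landmarks):
        raise ValueError("landmark lists must have the same length")
    total = 0.0
    for (px, py), (tx, ty) in zip(pred_landmarks, target_landmarks):
        total += (px - tx) ** 2 + (py - ty) ** 2
    return total / len(pred_landmarks)


def total_loss(param_term, render_term, landmark_term,
               w_param=1.0, w_render=1.0, w_landmark=1.0):
    """Weighted sum of already-computed loss terms (weights are assumed knobs)."""
    return (w_param * param_term
            + w_render * render_term
            + w_landmark * landmark_term)


# Two landmarks: the first matches exactly, the second is off by (3, 4).
lmk = landmark_loss([(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (4.0, 5.0)])
print(total_loss(param_term=0.2, render_term=0.1, landmark_term=lmk, w_landmark=0.01))
```

Because a landmark term depends on where the face ends up in the image rather than on the parameter values themselves, it gives a supervision signal that differs per sample even when the parameter loss alone settles on the mean face.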
