Failed to train the APNet #3

lubovbyc · 2022-05-10T08:05:16Z

Following the default procedure and parameter settings, I cannot train the APNet successfully. For different input latents, the network always produces the same result, e.g. the mean face.

I tried to print the output of each layer (as shown below). It seems the network has already collapsed.

cassiePython · 2022-05-12T09:03:59Z

@lubovbyc It seems the optimization collapsed, though I'm not sure. Have you tried other losses, such as the PDC loss or the VDC loss? You could combine all the losses or try another optimizer. If the problem persists, please share your generated data with me and I will try it on my server.

lubovbyc · 2022-05-12T14:28:49Z

@cassiePython Thanks for your reply!

> Have you tried other losses, such as the PDC loss, VDC loss? You may combine all losses or try another optimizer.

Yes. I have tried different losses several times with different weights. I also tried decreasing the initial learning rate. But all of these attempts only seem to slow down the collapse.

I'm not sure whether I missed some important details. Is the training of APNet normally stable? I have uploaded the generated data to Google Drive. Please check it when you are available. Thanks a lot!

cassiePython · 2022-05-17T05:00:48Z

@lubovbyc Thanks for sharing. I will check it right after the SIGGRAPH Asia submission. Thanks for your patience.

cassiePython · 2022-05-23T00:40:05Z

@lubovbyc Hi. I have tried to alleviate this. You can add more MLP layers to the APNet and use the PDC loss to improve robustness.
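
For readers unfamiliar with the term: assuming PDC here refers to the parameter distance cost used in the 3DDFA line of work, it is a squared L2 distance between the predicted 3DMM parameters and the (pseudo) ground-truth parameters. The following is only a plain-Python sketch under that assumption; `pdc_loss` and its optional `weights` argument are illustrative names, not the repository's implementation.

```python
def pdc_loss(pred, target, weights=None):
    """Mean (optionally weighted) squared distance between two parameter vectors.

    pred, target: equal-length lists of floats (e.g. 3DMM parameters).
    weights: optional per-parameter weights; None means all ones (plain PDC).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if weights is None:
        weights = [1.0] * len(pred)
    if len(weights) != len(pred):
        raise ValueError("weights must match the parameter length")
    total = sum(w * (p - t) ** 2 for w, p, t in zip(weights, pred, target))
    return total / len(pred)


# Hypothetical parameter vectors, for illustration only.
predicted = [1.0, 2.0]
pseudo_gt = [1.0, 4.0]
print(pdc_loss(predicted, pseudo_gt))
print(pdc_loss(predicted, pseudo_gt, weights=[1.0, 0.5]))
```

Passing non-uniform `weights` gives a weighted variant in the spirit of WPDC; how those weights should be chosen is left open here.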

lubovbyc · 2022-05-23T06:13:16Z

@cassiePython Thanks for your reply! I will give it a try and check whether it works.

roundchuan · 2022-06-06T09:57:26Z

> @lubovbyc Hi. I have tried to alleviate this. You can add more MLP layers for the APNet and use the PDC loss to improve robustness.

Can you provide the details of the changes that avoid the collapse? I also get the mean face with the default code.

roundchuan · 2022-06-06T09:58:47Z

> @cassiePython Thanks for your reply! I will take a shot and check if working.

Did you get correct results from APNet by following the author's advice? I have run into the same problem as you.

hxngiee · 2022-06-06T14:36:39Z

I am also struggling to get good results from APNet. Should I use the renderer and the landmark loss as well?

cassiePython · 2022-06-06T16:39:21Z

@lubovbyc I am still confused by the problem. I did not see it on my dataset. I have attached the dataset with 4K images and the corresponding checkpoint. Please check whether it works on your device: https://portland-my.sharepoint.com/:u:/g/personal/cwang355-c_my_cityu_edu_hk/EWMWjP8DHtpEqvhLtZTyfr0BerKDVixlbx8zApUS3QTngA?e=GXsNKw

Also, please try adding the PDC loss first.

cassiePython · 2022-06-06T16:49:35Z

> I also struggle in finding the good results of APNet. Should I use the renderer, landmark loss as well?

Before using the pseudo ground truth to train the APNet, I tried combining the rendered loss and the landmark loss. Unfortunately, I failed to train the APNet that way.
However, I have not tried the WPDC loss together with the rendered loss and the landmark loss.

roundchuan · 2022-06-07T03:40:21Z

> @lubovbyc I am still confused with the problem. I did not find this problem on my dataset. I attached the dataset with 4K images and the corresponding checkpoint. Please check whether it works on your device: https://portland-my.sharepoint.com/:u:/g/personal/cwang355-c_my_cityu_edu_hk/EWMWjP8DHtpEqvhLtZTyfr0BerKDVixlbx8zApUS3QTngA?e=GXsNKw.
>
> Besides, first, please try adding the PDC loss.

Using the model and data you provided, I still get the same face for different latent codes. One thing has changed, however: with your model, the rendered face is no longer the mean face.

cassiePython · 2022-06-07T03:48:39Z

@roundchuan
Can you get results like this:

roundchuan · 2022-06-07T03:55:50Z

> @roundchuan Can you get results like this:

No, all the rendered faces are the same. I also printed "param_lst", and the outputs of APNet are all the same.
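
A check like the one described above can be made quantitative by measuring how much each output parameter varies across different input latents. This is a minimal plain-Python sketch, not code from the repository: `per_dim_std` and `looks_collapsed` are hypothetical helpers, the tolerance is an arbitrary assumption, and the printed parameters are assumed to have been converted to a list of per-sample lists of floats.

```python
import math


def per_dim_std(outputs):
    """Standard deviation of each output dimension across samples.

    outputs: list of equal-length lists of floats, one row per input latent.
    """
    n = len(outputs)
    dims = len(outputs[0])
    stds = []
    for d in range(dims):
        column = [row[d] for row in outputs]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        stds.append(math.sqrt(var))
    return stds


def looks_collapsed(outputs, tol=1e-6):
    """True if every output dimension is (nearly) constant across samples."""
    return max(per_dim_std(outputs)) < tol


# Toy values for illustration only; real rows would come from the network.
collapsed = [[0.5, -1.0, 2.0]] * 4
healthy = [[0.5, -1.0, 2.0], [0.1, 0.3, 1.5], [0.9, -0.2, 2.4]]
print(looks_collapsed(collapsed))
print(looks_collapsed(healthy))
```

Tracking the largest per-dimension spread over training steps would also show when the outputs start to converge to a single vector, rather than only confirming it afterwards.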

roundchuan · 2022-06-07T04:04:03Z

> I also struggle in finding the good results of APNet. Should I use the renderer, landmark loss as well?
>
> Before using the pseudo gt to train the APNet, I have tried the combination of the rendered loss and the landmark loss. Unfortunately, I fail to train the APNet.
>
> But I have not tried the WPDC loss plus the rendered loss and the landmark loss.

With your data and checkpoints, the results look just like this.

hxngiee · 2022-06-07T05:05:14Z

> @lubovbyc I am still confused with the problem. I did not find this problem on my dataset. I attached the dataset with 4K images and the corresponding checkpoint. Please check whether it works on your device: https://portland-my.sharepoint.com/:u:/g/personal/cwang355-c_my_cityu_edu_hk/EWMWjP8DHtpEqvhLtZTyfr0BerKDVixlbx8zApUS3QTngA?e=GXsNKw.
> Besides, first, please try adding the PDC loss.
>
> Using the model and data provided by you, I still get the same face for different latent code. However, there is a change that the render face is not the mean face using your model.

@cassiePython Thanks for your kind reply. Could you attach constants.pkl as well? It seems that the file was omitted.

cassiePython · 2022-06-16T00:55:26Z

TO ALL: I recently generated more datasets of different sizes (e.g. 4K, 6K, 8K, 10K, 15K, 20K images) and found that, on some datasets, training the APNet always ends up in a local minimum (e.g. the mean face). Until this issue is fixed, you can use my pre-trained model: https://drive.google.com/drive/folders/1qNvRu8vLPD278FW7GS-I9p6-yxYhKZY9

I am trying to:

- Add more data to make the latent space compact.
- Add the rendering loss and landmark loss, following StyleRig and traditional face reconstruction methods.

cassiePython mentioned this issue Jun 17, 2022

Error when run evaluate.py #5

Closed
