
Enhanced video is blurry #14

Open
lishubing17 opened this issue Jan 10, 2025 · 14 comments

@lishubing17

I was applying motion enhancement to a video from UBFC-rPPG, and the enhanced video came out very blurry. Could this be a problem with my driving-video selection? I downloaded the driving video from the link provided in your project; it is one of the videos with more intense AUs. Screenshots are below:

[screenshots: QQ20250110-144541, QQ20250110-144655, QQ20250110-144722]

@yahskapar self-assigned this Jan 10, 2025
@yahskapar
Owner

@lishubing17,

That result seems very strange to me, especially if it's one of the high AU videos provided as a subset of the TalkingHead-1KH dataset. A few things I suggest looking into:

  1. While you were setting up the repo and getting things to run, did you encounter any subtle errors (e.g., during installation of packages)? I do wonder if there is an issue with the face detection and subsequent crop for the UBFC-rPPG video you used, especially when considered alongside the chosen driving video. Sometimes a significant mismatch in the detected key points can cause artifacts. Regarding cropping, you can also adjust the larger box coefficient here (currently it's set to 2.0; maybe try a smaller or larger value, such as 1.7 or 2.3 respectively).

  2. To help debug this, I would also suggest trying lower AU videos (perhaps a low AU one, a moderate AU one, and a different high AU one than the one you used). It's also worth adding some extra code to save the input after pre-processing (e.g., after face detection and the face crop) so you can inspect it - a rough sketch of both ideas is below. It's possible a different high AU video would yield a better result than the one you chose for the particular UBFC-rPPG source video. I never exhaustively tested all possible pairs of videos, and it's entirely plausible that the underlying motion transfer algorithms (e.g., FaceVid2Vid) produce poor results in conditions that aren't obvious.
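Something along these lines (the detector, helper names, and paths below are placeholders for illustration - the repo's actual pre-processing code differs, so treat this as a debugging aid rather than a drop-in):

```python
import os
import cv2

LARGER_BOX_COEF = 2.0  # try e.g. 1.7 or 2.3 and compare the resulting crops

# Placeholder detector -- the repo's actual pre-processing differs.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_face(frame, coef=LARGER_BOX_COEF):
    """Detect the largest face and return an enlarged square crop around it."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    cx, cy = x + w / 2.0, y + h / 2.0
    half = coef * max(w, h) / 2.0  # enlarge the box around its center
    x0, y0 = int(max(cx - half, 0)), int(max(cy - half, 0))
    x1 = int(min(cx + half, frame.shape[1]))
    y1 = int(min(cy + half, frame.shape[0]))
    return frame[y0:y1, x0:x1]

def dump_crops(video_path, out_dir="debug_crops"):
    """Save every cropped frame to disk so the pre-processed input can be inspected."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        crop = crop_face(frame)
        if crop is not None and crop.size > 0:
            cv2.imwrite(os.path.join(out_dir, f"{idx:05d}.png"), crop)
        idx += 1
    cap.release()
```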

All the best,

Akshay

@lishubing17
Author

Based on your description, I think I know the crux of the problem: I didn't enlarge the face box, and instead used the detected face region directly. Wouldn't it be better to enlarge the face box so that the face is about the same relative size as the face in the driving video?

@yahskapar
Owner

yahskapar commented Jan 12, 2025

It might be better - it's worth a try, and it really depends on the particular source and driving video combination. It's very rare that a source video's face and a driving video's face are perfectly aligned (which would make it easier to detect and transfer key points in most videos of reasonable quality), so I would recommend empirically finding the face box settings for your source and driving videos.
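As a rough example of what "empirically" could look like (the helper and paths below are made up for illustration, not something from the repo), you could compare what fraction of the frame the detected face occupies in the source versus the driving video, and adjust the crop until the two are roughly comparable:

```python
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def face_fraction(video_path):
    """Fraction of the frame width occupied by the largest detected face (first frame)."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return None
    faces = detector.detectMultiScale(
        cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), scaleFactor=1.1, minNeighbors=4)
    if len(faces) == 0:
        return None
    w = max(faces, key=lambda f: f[2] * f[3])[2]
    return w / frame.shape[1]

# Hypothetical paths -- substitute your own files.
src = face_fraction("ubfc_source_video.avi")
drv = face_fraction("talkinghead_driving_video.mp4")
if src and drv:
    print(f"source face/frame: {src:.2f}, driving face/frame: {drv:.2f}")
    # If these differ substantially, adjust the source crop (e.g., the box
    # coefficient) so the face occupies a similar fraction of the frame in both.
```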

Also, keep in mind that it's not a big deal if one or two videos out of, for example, 34 motion-augmented videos end up having blurring or other artifacts. You could always manually drop those few videos from your training set, or try motion augmentation again on just those few videos with different driving videos. It's also possible that different, newer neural motion transfer approaches would do better than the default usage of face-vid2vid in this repo.

@lishubing17
Author

I've gotten slightly better results than before by changing larger_box_coef, but there are still some serious artifacts. It's probably because of the head movement in the video I chose - I'll verify this by choosing a video with less head pose movement. Thank you very much for your reply!

@lishubing17
Author

I have another question, are you using OpenFace for the standard deviation calculations for head rotation and AU?

@yahskapar
Owner

> I have another question, are you using OpenFace for the standard deviation calculations for head rotation and AU?

Yes, OpenFace was used for those metrics to help categorize data. If you end up also using the TalkingHead-1KH dataset, I recommend initially filtering out videos using those metrics and carefully visually inspecting the dataset for any videos that are useless as driving videos (e.g., videos that are effectively a slide show of multiple faces, very low frame-rate, or a bizarre viewpoint).

I believe NVIDIA scraped those videos from YouTube and my guess is there were quite a number of videos that got through whatever filtering they might've done.
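For reference, computing those statistics from OpenFace's FeatureExtraction output looks roughly like the sketch below. OpenFace writes per-frame head pose (pose_Rx/Ry/Rz) and AU intensity (AU*_r) columns; the way the per-column standard deviations are aggregated here is just one reasonable choice, not necessarily the exact procedure used for the paper:

```python
import pandas as pd

def motion_stats(openface_csv):
    """Std of head rotation (pose_Rx/Ry/Rz) and AU intensities from an OpenFace CSV."""
    df = pd.read_csv(openface_csv)
    df.columns = df.columns.str.strip()  # OpenFace CSVs often pad column names
    head_rot_std = df[["pose_Rx", "pose_Ry", "pose_Rz"]].std().mean()
    au_cols = [c for c in df.columns if c.startswith("AU") and c.endswith("_r")]
    au_std = df[au_cols].std().mean()
    return head_rot_std, au_std

# Hypothetical path to an OpenFace output file for a candidate driving video.
hp_std, au_std = motion_stats("driving_video_openface.csv")
print(f"head rotation std: {hp_std:.3f}, AU intensity std: {au_std:.3f}")
```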

@lishubing17
Author

My composite video got better, but there is still some blurriness. May I ask what could be causing this? The videos I chose for compositing did not have such strong motion.
[screenshot: 联想截图_20250114101727]

@yahskapar
Owner

yahskapar commented Jan 14, 2025

That does seem quite strange - out of curiosity, did you directly fork this repo and use it as is or did you make any other changes (aside from the face box one, which maybe you can tweak further) that you can note in this thread for reference? Furthermore, in the example you gave above, what driving video was used?

For what it's worth, you can try another branch of this repo that uses a different motion transfer backbone (e.g., FOMM) here. Also, I generally recommend visualizing the augmented results using the rPPG-Toolbox, which supports motion augmented data.

@lishubing17
Author

Yes, I previously used dlib for face detection and reused the last box whenever no face was detected; now I will switch to the face detection used in this project. I had run the project's code before (with nothing changed) and the composite video was worse - that might have something to do with the video I chose, so next I will try a video with less motion.
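Roughly, the detection I had been using looked like this (a simplified sketch with illustrative names, not my exact code):

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
last_box = None  # reused whenever detection fails on a frame

def detect_or_reuse(frame):
    """Return (x0, y0, x1, y1); fall back to the previous box if no face is found."""
    global last_box
    rects = detector(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 1)
    if rects:
        r = max(rects, key=lambda r: r.width() * r.height())
        last_box = (r.left(), r.top(), r.right(), r.bottom())
    return last_box
```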

@yahskapar
Owner

Yeah, I would definitely check using the base repo again without any changes, aside from maybe the face box change - it's likely there is some other discrepancy here. I tried this code again recently, after a year or so, and it seemed fine, and I also know it's been used in a few other papers, including this larger scale study.

Also, I'm not sure if you're already using these, but the curated driving videos from my paper can be found here.

@lishubing17
Author

Yes, I used the corresponding driving videos and picked one of the videos in AU_moderate. Thanks for the recommendation - I will read that article.

@lishubing17
Author

I've only run one video so far and the results are worse. I saved the output as a video by copying the source code you shared in another issue (something along the lines of the sketch below).
[screenshot: 联想截图_20250114111952]
My driving video is 1C7do4tIrfo_0000_S0_E1051_L348_T44_R844_B540.mp4 in AU_moderate
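For reference, the saving step is something along these lines (a simplified sketch, not the exact snippet from the other issue):

```python
import cv2
import numpy as np

def save_video(frames, out_path, fps=30):
    """Write a (T, H, W, 3) uint8 RGB frame array out as an .mp4 with OpenCV."""
    frames = np.asarray(frames, dtype=np.uint8)
    h, w = frames.shape[1:3]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for frame in frames:
        writer.write(cv2.cvtColor(frame, cv2.COLOR_RGB2BGR))  # OpenCV expects BGR
    writer.release()
```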

@yahskapar
Owner

As a sanity check, can you try using the exact same source video as a driving video for said source video? What happens in that case?

Also, instead of using that same driving video you used before, try v_bumG-g0Dc_0000_S0_E1094_L244_T29_R852_B637.mp4 in the HP_Mild folder - it's totally possible that some combination of the driving video's background in the example you mentioned and the specific identities is causing issues.

@lishubing17
Author

lishubing17 commented Jan 14, 2025 via email
