[Feature Request]Handle Youtube .vtt files <c> </c>, extra spaces and duplicates #9272
Comments
Please attach one or more sample subtitle files, so we can try it out...
Glad .zip files are allowed. I was unable to upload .vtt files as they aren't allowed by GitHub.
Is it just me, or does this .vtt file downloaded from YouTube have elements of .ass files? (I'm not a subtitle expert.)
I think the links below may serve as excellent starting points and save you time:
https://gist.github.com/glasslion/b2fcad16bc8a9630dbd7a945ab5ebf5e
https://stackoverflow.com/questions/56927772/convert-webvtt-file-from-youtube-to-plain-text
Wow!! That was a brilliant answer... Thank you. Is there a way to do the same in Batch Convert? (Calling Subtitle Edit from the Command Prompt would be the best outcome.) Below is a PowerShell script I use
And I need to mention that Subtitle Edit is an excellent piece of software. I can't believe the features I needed were already present. I just needed to find them... Thank you for your work!! Subtitle Edit is a powerful tool. Thank you!!
Hey, I used Subtitle Edit to convert .vtt files downloaded from YouTube to plain text format.
It removes most of the .vtt syntax but leaves behind <c> </c> tags and extra spaces, for example:
<c> some text here that needs to be retained </c>
It also has duplicate lines.
.vtt file opened in Subtitle Edit
Converting to plain text leaves the data below
Plain text version of the screenshot above
Also note the duplicate lines:
this<00:00:00.399> was<00:00:00.919> pretty<00:00:01.280> good<00:00:02.040> I<00:00:02.120> will<00:00:02.320> admit<00:00:02.600> I<00:00:02.800> had
this was pretty good I will admit I had
this was pretty good I will admit I had
my<00:00:03.159> doubts<00:00:03.480> but<00:00:03.639> after<00:00:03.840> seeing<00:00:04.200> the<00:00:04.319> first
my doubts but after seeing the first
my doubts but after seeing the first
episode<00:00:05.160> The<00:00:05.319> pacing<00:00:05.640> and<00:00:05.799> the<00:00:05.879> fluid
episode The pacing and the fluid
episode The pacing and the fluid
animation<00:00:07.040> I<00:00:07.160> think<00:00:07.319> solo<00:00:07.680>
Steps to download the .vtt file from YouTube:
yt-dlp --write-auto-sub --skip-download --sub-lang en [YouTube URL here without the square brackets]
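For anyone who needs a stopgap outside Subtitle Edit, here is a minimal Python sketch of the cleanup requested above: strip the inline tags (both the `<00:00:00.399>` timestamps and `<c>`/`</c>`), collapse extra spaces, and drop consecutive duplicate lines. It is not how Subtitle Edit does it. The function name `clean_youtube_vtt` and the list of header prefixes are assumptions for illustration, and it assumes caption text never contains a literal `<...>` that should be kept.

```python
import re

# Any inline tag: cue timestamps like <00:00:00.399> and markup like <c>, </c>.
# Assumption: real caption text never contains "<...>" that must be preserved.
INLINE_TAG = re.compile(r"<[^>]*>")

# Cue timing lines such as "00:00:00.160 --> 00:00:02.790 align:start position:0%".
CUE_TIMING = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3}\s+-->")

# Header lines seen at the top of YouTube auto-caption files (assumed list).
HEADER_PREFIXES = ("WEBVTT", "Kind:", "Language:")


def clean_youtube_vtt(text: str) -> list[str]:
    """Return caption lines with tags, extra spaces and repeated lines removed."""
    cleaned: list[str] = []
    for raw in text.splitlines():
        # Skip file headers and cue timing lines entirely.
        if raw.startswith(HEADER_PREFIXES) or CUE_TIMING.match(raw):
            continue
        # Remove inline tags, then collapse runs of whitespace to one space.
        line = " ".join(INLINE_TAG.sub("", raw).split())
        if not line:
            continue
        # YouTube repeats each line across rolling cues; keep the first copy only.
        if cleaned and cleaned[-1] == line:
            continue
        cleaned.append(line)
    return cleaned
```

To try it on a downloaded file, read the file as UTF-8 text, pass the contents to `clean_youtube_vtt`, and join the result with newlines. Only consecutive duplicates are removed, so a phrase that is genuinely spoken twice at different points in the video is kept.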