[Feature Request] Handle YouTube .vtt files: <c> </c> tags, extra spaces and duplicates #9272

Open
MonsterSe7en opened this issue Jan 27, 2025 · 6 comments

Comments

@MonsterSe7en

Hey, I used Subtitle Edit to convert .vtt files downloaded from YouTube to Plaintext format.

It removes most of the .vtt syntax but leaves behind tags such as
<c> some text here that needs to be retained </c>

The output also contains duplicate lines.


.vtt file opened in Subtitle Edit

Image

Converting to plain text leaves the below data

Image

Plain-text version of the screenshot above

Also note the duplicated lines

this<00:00:00.399> was<00:00:00.919> pretty<00:00:01.280> good<00:00:02.040> I<00:00:02.120> will<00:00:02.320> admit<00:00:02.600> I<00:00:02.800> had
this was pretty good I will admit I had
this was pretty good I will admit I had
my<00:00:03.159> doubts<00:00:03.480> but<00:00:03.639> after<00:00:03.840> seeing<00:00:04.200> the<00:00:04.319> first
my doubts but after seeing the first
my doubts but after seeing the first
episode<00:00:05.160> The<00:00:05.319> pacing<00:00:05.640> and<00:00:05.799> the<00:00:05.879> fluid
episode The pacing and the fluid
episode The pacing and the fluid
animation<00:00:07.040> I<00:00:07.160> think<00:00:07.319> solo<00:00:07.680>
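
To make the request concrete, here is a minimal Python sketch of the cleanup being asked for, applied to the plain-text lines above. It is not how Subtitle Edit works internally; the function name `clean_lines` and the regular expression are assumptions made for illustration. It strips anything in angle brackets (inline timestamps and `<c>` tags), collapses extra spaces, and drops a line when it repeats the previous kept line.

```python
import re

# Matches inline WebVTT tags such as <00:00:00.399>, <c> and </c>.
TAG_RE = re.compile(r"<[^>]*>")


def clean_lines(lines):
    """Strip inline tags, collapse spaces, and drop consecutive repeats."""
    cleaned = []
    for raw in lines:
        text = TAG_RE.sub("", raw)      # remove timestamp and <c> tags
        text = " ".join(text.split())   # collapse runs of whitespace
        if not text:
            continue                    # skip lines that end up empty
        if cleaned and cleaned[-1] == text:
            continue                    # skip a repeat of the previous line
        cleaned.append(text)
    return cleaned


sample = [
    "this<00:00:00.399> was<00:00:00.919> pretty<00:00:01.280> good"
    "<00:00:02.040> I<00:00:02.120> will<00:00:02.320> admit"
    "<00:00:02.600> I<00:00:02.800> had",
    "this was pretty good I will admit I had",
    "this was pretty good I will admit I had",
    "my<00:00:03.159> doubts<00:00:03.480> but<00:00:03.639> after"
    "<00:00:03.840> seeing<00:00:04.200> the<00:00:04.319> first",
    "my doubts but after seeing the first",
]

print("\n".join(clean_lines(sample)))
```

Only consecutive repeats are removed, so a sentence that is genuinely spoken twice at different points in the video is kept.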


Steps to download a .vtt file from YouTube

  1. Install yt-dlp from https://github.com/yt-dlp/yt-dlp
  2. Run the command below with the URL of any YouTube video

yt-dlp --write-auto-sub --skip-download --sub-lang en [YouTube URL here without the square brackets]

@niksedk
Member

niksedk commented Jan 27, 2025

Please attach one or more sample subtitle files, so we can try it out...

@MonsterSe7en
Author

MonsterSe7en commented Jan 27, 2025

Compressed vtt files.zip

Glad .zip files are allowed.

I was unable to upload the .vtt files directly because GitHub doesn't allow that file type.

@MonsterSe7en
Author

Is it just me, or does this .vtt file downloaded from YouTube have elements of .ass files?

(Not a subtitle expert)

@niksedk
Member

niksedk commented Jan 28, 2025

  1. Change the format to SubRip in the toolbar
  2. Select all lines in the list view (Ctrl+A), right-click, and choose "Remove formatting" - "All"
  3. Tools - Merge lines with same text
  4. File - Export - Plain text...
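
For readers who want to see what the steps above do conceptually, here is a rough Python model of them. It is only an illustration and not Subtitle Edit's actual code: the `Cue` class, the function names, and the cue timings in the example are all made up. Formatting is removed from each cue first, then neighbouring cues with identical text are merged into one cue that spans both, and finally the text is exported.

```python
import re
from dataclasses import dataclass

# Matches inline tags such as <00:00:00.399>, <c> and </c>.
TAG_RE = re.compile(r"<[^>]*>")


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def remove_formatting(cues):
    """Return new cues with inline tags removed and whitespace collapsed."""
    return [
        Cue(c.start, c.end, " ".join(TAG_RE.sub("", c.text).split()))
        for c in cues
    ]


def merge_same_text(cues):
    """Merge each run of consecutive cues that share the same text."""
    merged = []
    for cue in cues:
        if not cue.text:
            continue  # nothing left after removing formatting
        if merged and merged[-1].text == cue.text:
            # Same text as the previous cue: extend it instead of repeating.
            merged[-1].end = max(merged[-1].end, cue.end)
        else:
            merged.append(Cue(cue.start, cue.end, cue.text))
    return merged


def to_plain_text(cues):
    """Export only the cue text, one cue per line."""
    return "\n".join(c.text for c in cues)


# Hypothetical cues; the timings are invented for the example.
cues = [
    Cue(0.16, 2.79, "this<00:00:00.399><c> was</c>"),
    Cue(2.79, 2.80, "this was"),
    Cue(2.80, 5.00, "my<00:00:03.159><c> doubts</c>"),
]
result = merge_same_text(remove_formatting(cues))
print(to_plain_text(result))
```

The order matters: the duplicates only become identical once the inline tags are gone, which is why formatting is removed before merging.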

@MonsterSe7en
Author

MonsterSe7en commented Jan 28, 2025

Wow!! That was a brilliant answer... Thank you

Is there a way to do the same in a Batch Convert? (Calling Subtitle Edit from the Command Prompt would be the best outcome.)

Below is the PowerShell script I use:

# Define input and output folders
$inputFolder = "C:\C temp\New folder"
$outputFolder = "F:\Multimedia Subtitles\Anime\Solo Leveling\Aninews Solo Leveling"

# Convert to Plaintext Batch
& "C:\Program Files\Subtitle Edit\SubtitleEdit.exe" /convert *.* Plaintext /inputfolder:"$inputFolder" /outputfolder:"$outputFolder"
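
As a possible stopgap outside Subtitle Edit, the same cleanup can be batched with a short Python script. This is a sketch under assumptions, not a Subtitle Edit feature: `vtt_to_text` and `convert_folder` are hypothetical names, the header is assumed to end at the first blank line, and cue identifiers or NOTE blocks (which YouTube's auto-generated files do not normally contain) are not handled.

```python
import re
from pathlib import Path

# Matches inline tags such as <00:00:00.399>, <c> and </c>.
TAG_RE = re.compile(r"<[^>]*>")


def vtt_to_text(vtt):
    """Reduce WebVTT content to de-duplicated plain-text lines."""
    lines = vtt.splitlines()
    # Assumption: the header block runs up to the first blank line.
    body_start = 0
    for i, line in enumerate(lines):
        if not line.strip():
            body_start = i + 1
            break
    out = []
    for raw in lines[body_start:]:
        if "-->" in raw:
            continue  # cue timing line
        text = " ".join(TAG_RE.sub("", raw).split())
        if text and (not out or out[-1] != text):
            out.append(text)  # keep only non-empty, non-repeated lines
    return "\n".join(out)


def convert_folder(input_folder, output_folder):
    """Write a cleaned .txt next to each .vtt name; return written paths."""
    src = Path(input_folder)
    dst = Path(output_folder)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for vtt_path in sorted(src.glob("*.vtt")):
        # utf-8-sig drops a leading byte order mark if one is present.
        text = vtt_to_text(vtt_path.read_text(encoding="utf-8-sig"))
        target = dst / (vtt_path.stem + ".txt")
        target.write_text(text + "\n", encoding="utf-8")
        written.append(target)
    return written


# Example call, using the folders from the PowerShell script above:
# convert_folder(r"C:\C temp\New folder",
#                r"F:\Multimedia Subtitles\Anime\Solo Leveling\Aninews Solo Leveling")
```

The input files are left untouched, and each output keeps the original file name with a .txt extension.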

And I need to mention that Subtitle Edit is an excellent piece of software. Can't believe the features I needed were already present. I just needed to find them...

Thank you for your work!! SubtitleEdit is a powerful tool. Thank You!!
