Package the subtitle-generation executable
AuYang261 committed Apr 9, 2024
1 parent 493016e commit cce3581
Showing 15 changed files with 133 additions and 75 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -7,3 +7,4 @@ build/
dist/
*.spec
whisper_models/
release-downloader/
56 changes: 35 additions & 21 deletions README.md
@@ -16,11 +16,11 @@

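Double-click `main.exe` (from the Release) or `run.bat`, then enter the ID of the course you want to download (for example, 40524). The program prints the course's video list: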

![image-20230926124749421](md/README/image-20230926124749421.png)
![image-20240409103306945](md/README/image-20240409103306945.png)

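Enter the numbers of the videos you want to download, separated by ASCII commas (,), and press Enter. Then choose whether to download the `video` recording (the camera at the back of the classroom) or the `screen` signal (the classroom computer's screen); the default is the video recording. Press Enter to start downloading:
Enter the numbers of the videos you want to download, separated by ASCII commas (,), and press Enter. Then enter a number to choose whether to download the `video` recording (the camera at the back of the classroom) or the `screen` signal (the classroom computer's screen); the default is the video recording. Press Enter to start downloading: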

![image-20230926124841432](md/README/image-20230926124841432.png)
![image-20240409103338980](md/README/image-20240409103338980.png)

Downloaded files are saved under `output/`, in a folder named `<course name>-video` or `<course name>-screen`.

@@ -30,27 +30,19 @@

This project can generate subtitles automatically. It uses OpenAI's [whisper](https://github.com/openai/whisper) project and its models to run speech-to-text locally.

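A GPU is strongly recommended; without one it is slow.
A GPU is strongly recommended, since it is slow otherwise. For dependencies, see [below](#依赖).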

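Because many libraries are involved, the packaged executable is large, so no packaged executable is published for now and a Python environment is required. To set up the Python environment, see the Dependencies (依赖) section below.
Download the [subtitle-generation executable]() and save it in the directory you extracted above, as shown here: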

Run `gen_caption.py` to generate subtitles for a given video:
![image-20240409105228362](md/README/image-20240409105228362.png)

```bash
python gen_caption.py video_path
```

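Or enter a number to select a video:
After the videos have downloaded, double-click `gen_caption.exe` (the file is large, so startup takes a while), enter a number to select a video, and press Enter: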

```bash
python gen_caption.py
```
![image-20240409103224309](md/README/image-20240409103224309.png)

![2024-04-08_17-42](md/README/2024-04-08_17-42.png)
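Wait for the program to finish. The generated subtitle file is in `.srt` format and sits in the same directory as the video file. Open the video in a player that supports subtitles (such as PotPlayer) to watch it with subtitles.

Wait for the program to finish. The generated subtitle file is in `.srt` format and sits in the same directory as the video file. Open the video in a player that supports subtitles to watch it with subtitles.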

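Tip: Generating subtitles takes a while, so you can start watching the video and reopen it once the subtitles are ready. With a GPU it takes roughly a few minutes. Without a GPU, using this project's subtitle feature is not recommended; consider another subtitle tool instead.
*Tip: Speech-to-text takes a while, so you can start watching the video and reopen it once the subtitles are ready. With a GPU it takes roughly a few minutes. Without a GPU, using this project's subtitle feature is not recommended; consider another subtitle tool instead.*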
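
For readers unfamiliar with the `.srt` format mentioned above, here is a minimal sketch of how one subtitle entry is laid out: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line. The helper names `srt_timestamp` and `srt_entry` are illustrative and are not part of this repository; the commit's own `seconds_to_hmsm` produces the same timestamp shape with manual zero-padding.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    # Format specs do the zero-padding: two digits each, three for milliseconds.
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_entry(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT entry: index, time range, text, then a blank line."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n\n"


if __name__ == "__main__":
    print(srt_entry(1, 0.0, 2.5, "hello"), end="")
```

Using integer milliseconds and `divmod` avoids the string-length checks needed when each field is padded by hand.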

## 依赖

@@ -61,6 +53,8 @@ sudo apt update
sudo apt install ffmpeg
```

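* **To run automatic subtitle generation on a GPU, install CUDA first; see [CUDA installation](https://blog.csdn.net/chen565884393/article/details/127905428).**

*To run from a Python environment, install the following dependencies:*

* Python: [download](https://www.python.org/ftp/python/3.9.4/python-3.9.4-amd64.exe) and install it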
@@ -71,9 +65,8 @@ sudo apt install ffmpeg
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```

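**To run automatic subtitle generation on a GPU, first install the CUDA build of PyTorch; see the [PyTorch website](https://pytorch.org/get-started/locally/) for instructions.**
* Install the speech-to-text dependencies. (They depend on PyTorch; if PyTorch is not installed, it is installed automatically, but as the CPU build. To install the CUDA build of PyTorch, see the [PyTorch website](https://pytorch.org/get-started/locally/).)

Install whisper. (It depends on PyTorch; if PyTorch is not installed, it is installed automatically, but as the CPU build.)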
```bash
pip install -r requirements_whisper.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```
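
Since pip may silently install the CPU build of PyTorch, it can help to check which device is actually usable before starting a long transcription. The sketch below is illustrative and not part of this commit: `pick_device` is a hypothetical helper, and it falls back to `"cpu"` when PyTorch is missing or reports no CUDA device.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    device = pick_device()
    print("whisper would run on:", device)
    # The result could then be passed on explicitly, for example:
    # model = whisper.load_model("base", device=device, download_root="whisper_models/")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the CPU build of PyTorch is probably the one installed.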
@@ -83,7 +76,7 @@ pip install -r requirements_whisper.txt -i https://pypi.tuna.tsinghua.edu.cn/sim
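* Disable any proxy on your machine; otherwise you will see an error such as `check_hostname requires server_hostname`.
* Courses you do not have access to can still be downloaded, as long as you know the course link (specifically, the course ID in it).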

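## Packaging
## Packaging (developers only)

To run without a Python environment, you can package the Python programs as executables. The Release already contains packaged builds.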

@@ -96,3 +89,24 @@ pip install pyinstaller
pyinstaller -F main.py
pyinstaller -F gen_caption.py
```
Packaging `gen_caption.py` may fail with a "recursion too deep" error:

<img src="md/README/image-20240409095211597.png" alt="image-20240409095211597" style="zoom:50%;" />

For the fix, see [this article](https://zhuanlan.zhihu.com/p/661325305): edit the `gen_caption.spec` file in the project root and add the following code at the top of the file:

```python
import sys ; sys.setrecursionlimit(sys.getrecursionlimit() * 5)
```

Then package with the following command:

```bash
pyinstaller --clean .\gen_caption.spec
```

If the packaged program reports at run time that files under the Temp directory cannot be found:

![image-20240409095831766](md/README/image-20240409095831766.png)

Copy the `hook-whisper.py` and `hook-zhconv.py` files from the project's `hooks` directory into PyInstaller's hooks directory (usually `<Python root>\Lib\site-packages\PyInstaller\hooks`); see [this post](https://blog.csdn.net/qq_42324086/article/details/118280341).
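
As an alternative to copying hook files into the PyInstaller installation, PyInstaller accepts an `--additional-hooks-dir` option that points at a project-local hooks folder. This is a swapped-in technique, not what the commit does. The sketch below only builds the argument list; `pyinstaller_args` is a hypothetical helper, and the defaults assume the script and the `hooks` directory sit in the project root.

```python
def pyinstaller_args(script: str = "gen_caption.py", hooks_dir: str = "hooks") -> list:
    """Build PyInstaller arguments that pick up hooks from a local directory."""
    return [
        "--onefile",                # same as the -F flag used in the README
        "--additional-hooks-dir",   # search this folder for hook-*.py files
        hooks_dir,
        script,
    ]


if __name__ == "__main__":
    print(pyinstaller_args())
    # With PyInstaller installed, the list can be passed to its Python entry point:
    #   import PyInstaller.__main__
    #   PyInstaller.__main__.run(pyinstaller_args())
```

Note that this route builds from the script rather than from the edited `gen_caption.spec`, so the recursion-limit workaround described above would not be applied. When building from a spec file, the hooks folder can instead be listed in the `hookspath` argument of `Analysis`.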
72 changes: 41 additions & 31 deletions gen_caption.py
@@ -18,54 +18,64 @@ def seconds_to_hmsm(seconds):
seconds = str(int(seconds))
# zero-pad
if len(hours) < 2:
hours = '0' + hours
hours = "0" + hours
if len(minutes) < 2:
minutes = '0' + minutes
minutes = "0" + minutes
if len(seconds) < 2:
seconds = '0' + seconds
seconds = "0" + seconds
if len(milliseconds) < 3:
milliseconds = '0'*(3-len(milliseconds)) + milliseconds
milliseconds = "0" * (3 - len(milliseconds)) + milliseconds
return f"{hours}:{minutes}:{seconds},{milliseconds}"


def main():
# video file paths
video_paths = []
if len(sys.argv) >= 2:
video_path = sys.argv[1]
video_paths.append(sys.argv[1])
else:
files = []
for dirpath, dirnames, filenames in os.walk('output/'):
for dirpath, dirnames, filenames in os.walk("output/"):
for filename in filenames:
if filename.endswith('.mp4'):
files.append(os.path.join(dirpath, filename))
if filename.endswith(".mp4"):
files.append(os.path.join(dirpath, filename).replace("\\", "/"))
for i, f in enumerate(files):
print(i, ":", f)
video_path = files[eval(input('select a video file by input a num: '))]
print(f"[{i}]: ", f)
input_list = eval("[" + input("select a video file by input a num: ") + "]")
for i in input_list:
video_paths.append(files[i])

audio_path = video_path.replace("mp4", "m4a")
cmd = f"ffmpeg -i '{video_path}' -vn -ar {whisper.audio.SAMPLE_RATE} '{audio_path}'"
os.system(cmd)
for video_path in video_paths:
audio_path = video_path.replace("mp4", "m4a")
cmd = f'ffmpeg -i "{video_path}" -vn -ar {whisper.audio.SAMPLE_RATE} "{audio_path}"'
os.system(cmd)

model = whisper.load_model("base", download_root="whisper_models/")
model = whisper.load_model("base", download_root="whisper_models/")

start = time.time()
result = model.transcribe(audio_path, verbose=False, language="zh")
print("Time cost: ", time.time() - start)
start = time.time()
result = model.transcribe(audio_path, verbose=False, language="zh")
print("Time cost: ", time.time() - start)

# write the subtitle file
with open(video_path.replace("mp4", "srt"), 'w', encoding='utf-8') as f:
i = 1
for r in result['segments']:
f.write(str(i)+'\n')
f.write(seconds_to_hmsm(float(r['start'])) +
' --> '+seconds_to_hmsm(float(r['end']))+'\n')
i += 1
f.write(convert(r['text'], 'zh-cn')+'\n') # the result may be in Traditional Chinese; convert to Simplified (zh-cn)
f.write('\n')

# delete the audio file
os.remove(audio_path)
# write the subtitle file
with open(video_path.replace("mp4", "srt"), "w", encoding="utf-8") as f:
i = 1
for r in result["segments"]:
f.write(str(i) + "\n")
f.write(
seconds_to_hmsm(float(r["start"]))
+ " --> "
+ seconds_to_hmsm(float(r["end"]))
+ "\n"
)
i += 1
f.write(
convert(r["text"], "zh-cn") + "\n"
) # the result may be in Traditional Chinese; convert to Simplified (zh-cn)
f.write("\n")

# delete the audio file
os.remove(audio_path)

if __name__ == '__main__':

if __name__ == "__main__":
main()
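
Two details in the `gen_caption.py` changes above may be worth hardening. First, `video_path.replace("mp4", "m4a")` rewrites every occurrence of `mp4` in the path, including inside folder names. Second, the ffmpeg command is assembled as a quoted shell string. The sketch below shows one possible alternative and is not the commit's code: the helper names are hypothetical, and the sample rate 16000 is hard-coded on the assumption that it matches `whisper.audio.SAMPLE_RATE`.

```python
from pathlib import Path

SAMPLE_RATE = 16000  # assumed equal to whisper.audio.SAMPLE_RATE


def sibling_with_suffix(video_path: str, suffix: str) -> str:
    """Swap only the final file extension, leaving the rest of the path alone."""
    return Path(video_path).with_suffix(suffix).as_posix()


def build_ffmpeg_cmd(video_path: str, audio_path: str, sample_rate: int = SAMPLE_RATE) -> list:
    """Build the ffmpeg call as an argument list, so no shell quoting is needed."""
    return ["ffmpeg", "-i", video_path, "-vn", "-ar", str(sample_rate), audio_path]


if __name__ == "__main__":
    video = "output/mp4-course-video/lecture 1.mp4"
    audio = sibling_with_suffix(video, ".m4a")
    print(audio)
    print(build_ffmpeg_cmd(video, audio))
    # To actually run it (requires ffmpeg on PATH):
    #   import subprocess
    #   subprocess.run(build_ffmpeg_cmd(video, audio), check=True)
```

Passing a list to `subprocess.run` hands each argument to ffmpeg unchanged, so spaces or quotes in course names need no escaping.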
3 changes: 3 additions & 0 deletions hooks/hook-whisper.py
@@ -0,0 +1,3 @@
from PyInstaller.utils.hooks import collect_data_files

datas = collect_data_files("whisper")
3 changes: 3 additions & 0 deletions hooks/hook-zhconv.py
@@ -0,0 +1,3 @@
from PyInstaller.utils.hooks import collect_data_files

datas = collect_data_files("whisper")
73 changes: 50 additions & 23 deletions main.py
@@ -6,55 +6,82 @@
import json
import os
import cProfile
headers={
'Origin': 'https://www.yanhekt.cn',

headers = {
"Origin": "https://www.yanhekt.cn",
"xdomain-client": "web_user",
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26'
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26",
}


# courseID = 31425
def main():
if len(sys.argv) == 1:
courseID = eval(input('Please input course ID: '))
courseID = eval(input("Please input course ID: "))
else:
courseID = sys.argv[1]

course = requests.get(f'https://cbiz.yanhekt.cn/v1/course?id={courseID}&with_professor_badges=true', headers=headers)
req = requests.get(f'https://cbiz.yanhekt.cn/v2/course/session/list?course_id={courseID}', headers=headers)
if course.json()['code'] != '0' and course.json()['code'] != 0:
print(course.json()['code'])
print(course.json()['message'])
raise Exception("Please Check your course ID, note that it should be started with yanhekt.cn/course/***, not yanhekt.cn/session/***")
print(course.json()['data']['name_zh'])
videoList = req.json()['data']
course = requests.get(
f"https://cbiz.yanhekt.cn/v1/course?id={courseID}&with_professor_badges=true",
headers=headers,
)
req = requests.get(
f"https://cbiz.yanhekt.cn/v2/course/session/list?course_id={courseID}",
headers=headers,
)
if course.json()["code"] != "0" and course.json()["code"] != 0:
print(course.json()["code"])
print(course.json()["message"])
raise Exception(
"Please Check your course ID, note that it should be started with yanhekt.cn/course/***, not yanhekt.cn/session/***"
)
print(course.json()["data"]["name_zh"])
videoList = req.json()["data"]
# print(json.dumps(videoList, indent=2))
for i, c in enumerate(videoList):
print(i, ":", c['title'])
print(f"[{i}]: ", c["title"])

index = eval('[' + input('select(split by \',\', such as: 0,2,4):') + ']')
vga = input('video(1) or screen(2)?(input 1 or 2, default video):')
if not os.path.exists('output/'):
os.mkdir('output/')
index = eval("[" + input("select(split by ',', such as: 0,2,4): ") + "]")
vga = input("video(1) or screen(2)?(input 1 or 2, default video):")
if not os.path.exists("output/"):
os.mkdir("output/")
for i in index:
c = videoList[i]
name = course.json()['data']['name_zh'].strip() + '-' + course.json()['data']['professors'][0]['name'] + '-' + c['title']
name = (
course.json()["data"]["name_zh"].strip()
+ "-"
+ course.json()["data"]["professors"][0]["name"]
+ "-"
+ c["title"]
)
print(name)
if vga == "2":
print("Downloading screen...")
m3u8dl.M3u8Download(c['videos'][0]['vga'], 'output/' + course.json()['data']['name_zh'].strip() + '-screen', name)
m3u8dl.M3u8Download(
c["videos"][0]["vga"],
"output/" + course.json()["data"]["name_zh"].strip() + "-screen",
name,
)
else:
print("Downloading video...")
m3u8dl.M3u8Download(c['videos'][0]['main'], 'output/'+ course.json()['data']['name_zh'].strip() + '-video', name)
m3u8dl.M3u8Download(
c["videos"][0]["main"],
"output/" + course.json()["data"]["name_zh"].strip() + "-video",
name,
)


if __name__ == '__main__':
if __name__ == "__main__":
try:
main()
# cProfile.run('main()', 'output/profile.txt')
except Exception as e:
print(e)
print("If the problem is still not solved, you can report an issue in https://github.com/AuYang261/BIT_yanhe_download/issues.")
print(
"If the problem is still not solved, you can report an issue in https://github.com/AuYang261/BIT_yanhe_download/issues."
)
print("Or contact with the author [email protected]. Thanks for your report!")
print("如果问题仍未解决,您可以在https://github.com/AuYang261/BIT_yanhe_download/issues 中报告问题。")
print(
"如果问题仍未解决,您可以在https://github.com/AuYang261/BIT_yanhe_download/issues 中报告问题。"
)
print("或者联系作者[email protected]。感谢您的报告!")
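
Both `main.py` and `gen_caption.py` read the user's selection with `eval("[" + input(...) + "]")`, which executes whatever the user types. A stricter way to parse a comma-separated list of indices is sketched below. It is an illustration rather than the commit's code: `parse_indices` is a hypothetical helper, and accepting the full-width comma `,` is an added convenience the original does not have.

```python
def parse_indices(text: str, count: int) -> list:
    """Parse "0,2,4" into [0, 2, 4], checking each index against `count` items."""
    indices = []
    # Treat the full-width comma as a separator too (an assumption, see above).
    for part in text.replace(",", ",").split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas and empty fields
        value = int(part)  # raises ValueError on anything that is not an integer
        if not 0 <= value < count:
            raise ValueError(f"index {value} is out of range (0..{count - 1})")
        indices.append(value)
    return indices


if __name__ == "__main__":
    print(parse_indices("0, 2,4", 5))
```

A caller could wrap this in a loop that prompts again on `ValueError` instead of ending the program.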
Binary file removed md/README/2024-04-08_17-42.png
Binary file not shown.
Binary file removed md/README/image-20230926124749421.png
Binary file not shown.
Binary file removed md/README/image-20230926124841432.png
Binary file not shown.
Binary file added md/README/image-20240409095211597.png
Binary file added md/README/image-20240409095831766.png
Binary file added md/README/image-20240409103224309.png
Binary file added md/README/image-20240409103306945.png
Binary file added md/README/image-20240409103338980.png
Binary file added md/README/image-20240409105228362.png
