前情提要
In 2024, SAG-AFTRA announced a video game strike against the abuse of AI technology. The strike has been going on for a long time, and Hoyoverse Games which were not related to the reason of the strike has also been widely affected. As a plotline enjoyer and an English learner, I can not bear playing games without voice overs. Therefore, Here comes the all-in-one AI dubbing solution for the current situation. – README.md of AIDub
简单来说,SAG-AFTRA 发布了一项针对 AI 技术滥用行为的游戏打击计划,打击目标是游戏中出现的 AI 声音。游戏打击计划已经持续了很长时间,而与打击原因无关的 Hoyoverse 也受到了严重影响。为了解决这个问题,于是有了这个命令行工具。
介绍
AIDub 通过 beautifulsoup4 和对 GPT-SoViTs 的集成,实现了抓取已有配音,并使用 GPT-SoViTs 进行声音合成,最终通过 Detect-and-Play 功能实时检测屏幕字幕并播放合成的声音。
使用
options:
-h, --help show this help message and exit
--voice Fetch voices for dataset
--subtitle Fetch subtitles for dubbing
--inference-server Start GPT-SoVITs inference server for dubbing
--dub-all Dub all the subtitles in the manifest file
--emotion-classification
Pre-process the audio files and generate emotion analsysis configuration for dubbing
--dataset-overview Check the dataset overview
--missing-voices Find potentially missing voices for the subtitles
- 安装 miniconda 环境
点此参照安装指南安装 miniconda 环境。
完成后,打开命令行
conda create -n aidub python=3.10
conda activate aidub
- 克隆仓库
git clone https://github.com/XtherDevTeam/AIDub.git
git clone https://github.com/RVC-Boss/GPT-SoVITS thirdparty/GPTSoViTs
# @title Download pretrained models 下载预训练模型
mkdir -p thirdparty/GPTSoViTs/GPT_SoVITS/pretrained_models
mkdir -p thirdparty/GPTSoViTs/tools/damo_asr/models
mkdir -p thirdparty/GPTSoViTs/tools/uvr5
git clone https://huggingface.co/lj1995/GPT-SoVITS thirdparty/GPTSoViTs/GPT_SoVITS/pretrained_models/
git clone https://www.modelscope.cn/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch.git thirdparty/GPTSoViTs/tools/damo_asr/models/
git clone https://www.modelscope.cn/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch.git thirdparty/GPTSoViTs/tools/damo_asr/models/
git clone https://www.modelscope.cn/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch.git thirdparty/GPTSoViTs/tools/damo_asr/models/
- 安装依赖
conda install -c conda-forge gcc
conda install -c conda-forge gxx
conda install ffmpeg cmake
conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r thirdparty/GPTSoViTs/requirements.txt
pip install -r requirements.txt
- 配置 config.py
根据自身需求,增加或修改以下配置项:
- muted_characters: 需要配音的角色
- sources_to_fetch_voice: 抓取语音的来源
- source_text_to_dub: 需要配音的字幕
- 抓取 & 训练
# 可使用以下命令检查是否存在缺失的语音
python app.py --missing-voices
# 进行语音抓取和情感分析
python app.py --voice
python app.py --subtitlte
python app.py --emotion-classification
# 进行训练,根据 GPT-SoVITs 文档进行操作
cd thirdparty/GPTSoViTs && python webui.py
# 完成如上后,开启 GPT-SoVITs 服务器
python app.py --inference-server
# 新建终端,运行如下
python app.py --dub-all
- 游玩
在 Windows 环境下,打开终端,运行如下命令
python dnp.py
待 DnP 启动完成后,打开游戏即可。
存在问题
- pydub 库在解析aac格式的音频时存在问题,导致不时出现段错误,重启 DnP 即可