Whisper AI Tutorial

I tried out voice-to-text AI and spent an afternoon putting together the notes below.

Two options are covered: Whisper Desktop and the official open-source Whisper.

Computer specs:

Pros and Cons

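	Pros	Cons
Whisper Desktop	Convenient; easy to download and set up	Not updated in a long time; Spanish transcription accuracy is very poor
Official open-source Whisper	High accuracy	Harder to install; no UI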

Download the latest Whisper Desktop release zip from GitHub and extract it.

Then download a model from Hugging Face (ggml-medium.bin was the most stable; the q5_0 and q8_0 variants both failed to work in my tests).

Launch the application, set Model Path to the .bin model file you just downloaded, and click OK.

Under Transcribe File, choose the audio file you want a transcript of (mp3 and m4a both work).

Output Format lets you choose the output file format and whether to include timestamps.

Once everything is set, press Transcribe.

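Set up the base environment for the official Whisper (for details, see the tutorials linked at the end of this post, which include detailed illustrated walkthroughs):
Python 3.12.7; git version 2.48.1.windows.1; PyTorch 2.6.0+cu118; CUDA 11.8
Download and set up ffmpeg
Install Whisper: open cmd and run the following commands in order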
```
pip install git+https://github.com/openai/whisper.git
pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
```
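That completes the installation. If anything goes wrong along the way, paste the warning into ChatGPT; a few extra steps will usually resolve it.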
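To confirm that the pieces listed above are actually in place, a quick check along these lines may help. This is a minimal sketch, not part of the original setup: the function names `check_environment` and `cuda_available` are made up for illustration, and the script only reports whether ffmpeg and git are on PATH and whether the `torch` and `whisper` packages can be found.

```python
import importlib.util
import shutil
import sys


def check_environment():
    """Return a dict mapping each prerequisite to True/False."""
    return {
        # Command-line tools that must be reachable from cmd.
        "ffmpeg on PATH": shutil.which("ffmpeg") is not None,
        "git on PATH": shutil.which("git") is not None,
        # Python packages; find_spec looks them up without importing them.
        "torch installed": importlib.util.find_spec("torch") is not None,
        "whisper installed": importlib.util.find_spec("whisper") is not None,
    }


def cuda_available():
    """Report whether PyTorch can see a CUDA device (False if torch is missing)."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check still runs without torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    for name, ok in check_environment().items():
        print(f"{'OK' if ok else 'MISSING':8}{name}")
    print("CUDA available:", cuda_available())
```

If ffmpeg shows up as MISSING, the usual cause is that its folder has not been added to the PATH environment variable. If CUDA is reported as unavailable, Whisper should still run on the CPU, only more slowly.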

The official version works well and supports the latest models.
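Since the official version has no UI, here is a rough sketch of how a transcription with timestamps might be produced from Python. It uses `whisper.load_model` and `model.transcribe` from the openai-whisper package. The helper names `format_timestamp`, `segments_to_srt`, and `transcribe_to_srt`, and the default model name `"medium"`, are assumptions chosen for illustration rather than anything from the original notes.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render segments (dicts with 'start', 'end', 'text') as SRT subtitle text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(audio_path, model_name="medium", language=None):
    """Transcribe an audio file with Whisper and return the result as SRT text."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)  # downloads the model on first use
    # language=None lets Whisper detect the language; pass e.g. "es" to force Spanish.
    result = model.transcribe(audio_path, language=language)
    return segments_to_srt(result["segments"])
```

For example, `print(transcribe_to_srt("interview.mp3", language="es"))` would print subtitles for a hypothetical file named `interview.mp3`. The package also installs a `whisper` command, so something like `whisper interview.mp3 --model medium --output_format srt` in cmd should give a similar result without writing any code.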

"YouTube AI Subtitle Tutorial | How to Use the Free Automatic Subtitle (Transcript) Generator WhisperDesktop | OpenAI Whisper Tutorial", posted 2025-01-23 ( https://notesstartup.com/youtube-ai-subtitle-tutorial/ )

"OpenAI's Free Open-Source Speech Recognition System, Whisper: Installation Overview and How It Works", posted by M.H., 2023-04-25 ( https://ithelp.ithome.com.tw/articles/10311957 )

Author

David Chang