I tried out voice-to-text AI and spent an afternoon compiling the information below.
I compared two options: Whisper Desktop and the official open-source Whisper.
Computer specs:
Pros and Cons#
Option | Pros | Cons |
---|---|---|
Whisper Desktop | Convenient, easy to download and set up | Not updated for a long time, poor Spanish transcription accuracy |
Official Whisper | High accuracy | Hard to download, no UI |
Whisper Desktop#
This option is easier to download and comes with a UI. I followed the tutorial 【YouTube AI 上字幕教學|如何使用免費自動字幕 (逐字稿) 生成軟體 WhisperDesktop|OpenAI Whisper 教學】 (see Reference).
Download the latest zip from GitHub and extract it.


Then go to Hugging Face to download a model (the ggml-medium.bin model was the most stable for me; the q5_0 and q8_0 quantized variants did not work).
Open the application, choose Model Path, select the downloaded .bin model file, and click OK.
Use Transcribe File to pick an audio file (mp3 or m4a).
In Output Format, choose the output file type and whether to include timestamps.
After setting everything, click Transcribe.
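To illustrate what the timestamp option changes in the output, here is a minimal Python sketch of the SRT subtitle layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text). This is not Whisper Desktop's own code; the helper names `srt_timestamp` and `segments_to_srt` and the segment dictionaries are assumptions made up for the example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn a list of {"start", "end", "text"} dicts into SRT text.

    The segment layout is an assumption for this example.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Hypothetical segments, only to show the layout.
example = [
    {"start": 0.0, "end": 2.5, "text": " Hello"},
    {"start": 2.5, "end": 5.0, "text": " world"},
]
print(segments_to_srt(example))
```

Without timestamps, the output is simply the text lines one after another.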
Official Whisper#
Refer to OpenAI 免費開源語音辨識系統– Whisper 安裝簡介及原理.
Set up the basic environment (see the above site for a full illustrated guide):
Python 3.12.7; git version 2.48.1.windows.1; PyTorch 2.6.0+cu118; CUDA 11.8
Download and configure ffmpeg
Install Whisper by opening cmd and entering:
pip install git+https://github.com/openai/whisper.git
To update an existing install to the latest commit later, run:
pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
Once done, the installation is complete. If any problems arise, paste the warning into ChatGPT for further troubleshooting.
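Before transcribing anything, it can help to confirm that the pieces listed above are visible to Python. The following is a minimal sketch using only the standard library; `check_environment` is a name invented for this example, and it only reports what it finds rather than fixing anything.

```python
import importlib.util
import platform
import shutil


def check_environment() -> dict:
    """Report which parts of the Whisper setup can be found."""
    report = {
        "python": platform.python_version(),
        "git_on_path": shutil.which("git") is not None,
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "whisper_installed": importlib.util.find_spec("whisper") is not None,
        "cuda_available": None,  # stays None when PyTorch is missing
    }
    if report["torch_installed"]:
        import torch  # imported only when it is actually installed

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


for name, value in check_environment().items():
    print(f"{name}: {value}")
```

If `ffmpeg_on_path` is False, ffmpeg is probably not on the PATH yet; if `cuda_available` is False, Whisper can still run on the CPU, only more slowly.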
Usage#
In cmd, cd to the folder with your audio (e.g., cd desktop\example).
Run:
whisper filename.mp4 --device cuda
The above command uses CUDA and auto-detects the transcription language; the process looks like this:
- Other command options, such as --model, --language, --task, --output_format, and --output_dir, can be listed with whisper --help.


- By default, the final output includes transcripts in all formats (.txt, .vtt, .srt, .tsv, and .json).
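When transcribing several files, the same command can be assembled from Python. This is a minimal sketch, assuming the `whisper` command is on the PATH; `build_whisper_command` and `run_whisper` are hypothetical helper names, while the flags themselves (`--model`, `--language`, `--task`, `--device`, `--output_format`, `--output_dir`) are options of the Whisper command line.

```python
import subprocess


def build_whisper_command(audio_path, model=None, language=None, task=None,
                          device=None, output_format=None, output_dir=None):
    """Build the argument list for the whisper CLI, skipping unset options."""
    command = ["whisper", audio_path]
    options = {
        "--model": model,                  # e.g. "medium"
        "--language": language,            # e.g. "es"; omit to auto-detect
        "--task": task,                    # "transcribe" or "translate"
        "--device": device,                # e.g. "cuda"
        "--output_format": output_format,  # e.g. "srt"; omit for all formats
        "--output_dir": output_dir,
    }
    for flag, value in options.items():
        if value is not None:
            command += [flag, str(value)]
    return command


def run_whisper(audio_path, **options):
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_whisper_command(audio_path, **options), check=True)


# Same command as the one shown above:
print(build_whisper_command("filename.mp4", device="cuda"))
```

Calling `run_whisper("filename.mp4", device="cuda", output_format="srt")` would then write only the .srt file instead of every format.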
Summary#
The official version works well and lets you use the latest models.
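The official package can also be called from Python, which makes it easy to switch models. Below is a minimal sketch, assuming the package is installed as described above; `choose_device` and `transcribe_file` are names invented for this example, and the default model name "medium" is only a placeholder.

```python
import importlib.util


def choose_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def transcribe_file(audio_path: str, model_name: str = "medium", language=None) -> dict:
    """Transcribe one file; language=None lets Whisper detect the language."""
    import whisper  # imported lazily so the helper above works without it

    model = whisper.load_model(model_name, device=choose_device())
    return model.transcribe(audio_path, language=language)


print("Device that would be used:", choose_device())
```

A call such as `result = transcribe_file("filename.mp4")` returns a dictionary whose `"text"` and `"language"` entries hold the transcript and the detected language. `whisper.available_models()` lists the model names that the installed version supports.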
Reference#
【YouTube AI 上字幕教學|如何使用免費自動字幕 (逐字稿) 生成軟體 WhisperDesktop|OpenAI Whisper 教學】 posted 2025,1,23 (https://notesstartup.com/youtube-ai-subtitle-tutorial/)
【OpenAI 免費開源語音辨識系統– Whisper 安裝簡介及原理】 posted by M.H. 2023,4,25 (https://ithelp.ithome.com.tw/articles/10311957)