Whisper AI Tutorial


I tried out voice-to-text AI and spent an afternoon compiling the information below.

I compared two options: Whisper Desktop and the official open-source Whisper.

Computer specs:

Pros and Cons

|                  | Pros                                    | Cons                                                             |
| Whisper Desktop  | Convenient, easy to download and set up | Not updated for a long time, poor Spanish transcription accuracy |
| Official Whisper | High accuracy                           | Hard to download, no UI                                          |

Whisper Desktop

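This option is easier to download and comes with a UI. Follow the tutorial "YouTube AI 上字幕教學|如何使用免費自動字幕 (逐字稿) 生成軟體 WhisperDesktop|OpenAI Whisper 教學" (YouTube AI subtitle tutorial: how to use the free automatic subtitle/transcript generator WhisperDesktop), listed under Reference.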

Download the latest zip from GitHub and extract it.

Then go to Hugging Face to download a model. In my tests ggml-medium.bin was the most stable; the quantized q5_0 and q8_0 variants did not work.

Open the application, choose Model Path, select the downloaded .bin model file, and click OK.

Use Transcribe File to pick an audio file (mp3 or m4a).

In Output Format, choose the output file type and whether to include timestamps.

After setting everything, click Transcribe.
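
To make the timestamp option more concrete, here is a minimal sketch of what a timestamped subtitle file looks like, assuming the SRT format is chosen. The function names and the sample segment are my own illustration and are not taken from Whisper Desktop.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Turn (start, end, text) tuples into SRT subtitle text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical segment, only for illustration.
    print(to_srt([(0.0, 2.5, " Hola, buenos días. ")]))
```

Without timestamps, the output is just the text lines; with them, each line carries its start and end time so a video player can show it as a subtitle.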

Whisper Official

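Refer to "OpenAI 免費開源語音辨識系統– Whisper 安裝簡介及原理" (OpenAI's free open-source speech recognition system: Whisper installation overview and principles), listed under Reference.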

  1. Set up the basic environment (see the above site for a full illustrated guide):

    Python 3.12.7; git version 2.48.1.windows.1; PyTorch 2.6.0+cu118; CUDA 11.8

  2. Download and configure ffmpeg

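  3. Install Whisper by opening cmd and running the first command below. The second command is only needed later, to update an existing installation to the latest version of the repository: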

    pip install git+https://github.com/openai/whisper.git
    pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
    

    Once done, the installation is complete. If any problems arise, paste the warning into ChatGPT for further troubleshooting.
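
Before moving on, it can help to confirm that the pieces from the steps above are actually in place. The following is a minimal sketch, not part of the referenced guide: the function name and the report keys are my own, and it only checks that git and ffmpeg are on PATH, that modules named torch and whisper are importable, and whether PyTorch can see a CUDA device.

```python
import importlib.util
import shutil
import sys


def check_environment():
    """Collect a quick report of the tools installed in the steps above."""
    report = {
        "python": sys.version.split()[0],
        "git_on_path": shutil.which("git") is not None,
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "whisper_installed": importlib.util.find_spec("whisper") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        # Imported lazily so the check still runs when PyTorch is missing.
        import torch
        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

If cuda_available prints False, the --device cuda option used in the next section will most likely fail, which usually points to a CPU-only PyTorch build or a driver problem.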

Usage

  1. In cmd, cd to the folder with your audio (e.g., cd desktop\example)

    cd desktop
    
  2. Run:

    whisper filename.mp4 --device cuda
    
  3. The above command runs on the GPU via CUDA and auto-detects the spoken language.

  4. Other command options can be listed with whisper --help.

  5. By default, the final output includes the transcript in all formats (txt, vtt, srt, tsv, json).
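
When there are many files to process, the same command can be scripted. Below is a rough sketch that builds the whisper command line and runs it for every media file in a folder. The helper names, the default choices (the medium model, srt output), and the list of extensions are my own assumptions; the flags themselves (--model, --device, --language, --output_format, --output_dir) are options of the whisper command-line tool.

```python
import subprocess
from pathlib import Path


def build_whisper_command(audio, model="medium", language=None,
                          output_format="srt", output_dir=".", device="cuda"):
    """Return the argument list for one call to the whisper CLI."""
    cmd = [
        "whisper", str(audio),
        "--model", model,
        "--device", device,
        "--output_format", output_format,
        "--output_dir", str(output_dir),
    ]
    if language is not None:
        # Skip auto-detection when the spoken language is already known.
        cmd += ["--language", language]
    return cmd


def transcribe_folder(folder, extensions=(".mp3", ".m4a", ".mp4"), **options):
    """Run whisper once per matching file; stops at the first failure."""
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in extensions:
            subprocess.run(build_whisper_command(path, **options), check=True)


if __name__ == "__main__":
    # Only prints the command; call transcribe_folder(...) to actually run it.
    print(" ".join(build_whisper_command("filename.mp4", language="es")))
```

Passing --language explicitly is worth trying when auto-detection picks the wrong language on short or noisy clips.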

Summary

The official version works well and lets you use the latest models.


Reference

【 YouTube AI 上字幕教學|如何使用免費自動字幕 (逐字稿) 生成軟體 WhisperDesktop|OpenAI Whisper 教學 】 posted 2025-01-23 ( https://notesstartup.com/youtube-ai-subtitle-tutorial/ )

【 OpenAI 免費開源語音辨識系統– Whisper 安裝簡介及原理 】 posted by M.H., 2023-04-25 ( https://ithelp.ithome.com.tw/articles/10311957 )

Author: David Chang