Whisper is an open-source automatic speech recognition (ASR) system from OpenAI that converts speech to text. It’s multilingual and multitask, supporting:
- Speech-to-text transcription in many languages
- Translation of non‑English speech to English text
- Language identification
Whisper uses a Transformer encoder–decoder architecture trained on a large and diverse audio–text corpus, making it robust across accents, noise conditions, and technical domains. You can read all about it here.
There are a number of different model sizes that trade accuracy for speed and resource use. Common variants include tiny (87MB), small (497MB), medium (1.83GB), and large (3.39GB). Larger models are generally more accurate but significantly slower and more resource-intensive. In our testing, the “small” model provided a strong balance between accuracy and performance for most deployments.
Whisper is distributed as a single self-contained executable in the “llamafile” format, which runs on Linux, Windows, and macOS without a separate install. One file does it all. You will need to download Whisper from here -> Mozilla whisperfile on Hugging Face
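For example, on Linux or macOS you can fetch the small model directly from the command line. The exact file name below is an assumption; check the Hugging Face repository page for the current names:
Example:
curl -L -o whisper-small.llamafile https://huggingface.co/Mozilla/whisperfile/resolve/main/whisper-small.llamafile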
Note:
- Windows: after download, add the .exe extension to the file (e.g., whisper-small.llamafile.exe) so it can run.
- Linux/macOS: make the file executable with chmod +x whisper-small.llamafile, then run it from the terminal.
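As a quick sanity check, you can transcribe an audio file directly from the command line before setting up the HTTP service. This is a minimal sketch assuming the whisper.cpp-style -f flag and a 16kHz WAV file named recording.wav; confirm the flag against your build’s --help output:
Example:
./whisper-small.llamafile -f recording.wav (Linux/macOS)
whisper-small.llamafile.exe -f recording.wav (Windows)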
You can deploy Whisper on your Domino server or on a separate machine to leverage optimised hardware (e.g., a macOS system with Apple Silicon).
Preemptive AI connects to Whisper via HTTP. By default the server accepts connections only from the local machine, so you will need to change this if you are running Whisper on a different server from your Domino server. To change the port or listen on all network interfaces, specify the host and port at launch (0.0.0.0 listens on every interface, making the server reachable from the network):
Example:
./whisper-small.llamafile --host 0.0.0.0 --port 8080 (Linux/macOS)
whisper-small.llamafile.exe --host 0.0.0.0 --port 8080 (Windows)
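Once the server is running, you can verify it with curl from the machine that will connect to it. The sketch below assumes the whisper.cpp server’s /inference endpoint, which accepts a multipart form upload; your-whisper-host and recording.wav are placeholders for your own values:
Example:
curl http://your-whisper-host:8080/inference -F file=@recording.wav -F response_format=json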
Note: Binding to 0.0.0.0 exposes the service on your network. Place it behind your firewall or restrict access (e.g., reverse proxy, allowlist, or VPN).
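As one way to restrict access, on a Linux host running ufw (with its default deny-incoming policy) you could allow only your Domino server to reach the port. Here 10.0.0.5 is a placeholder for the Domino server’s IP address:
Example:
sudo ufw allow from 10.0.0.5 to any port 8080 proto tcp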
There are many additional command-line arguments you can pass to Whisper; some of them are documented here -> ggml-org/whisper.cpp examples/cli
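You can also list the options your particular build supports; most llamafile builds print them with --help:
Example:
./whisper-small.llamafile --help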
Performance tips:
- On Apple Silicon, use the appropriate build that leverages Metal and include the command-line argument --gpu metal.
- Adjust the number of threads Whisper can use to match your server’s hardware (see the sketch below).
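Putting these tips together, a launch command on an Apple Silicon machine might look like the sketch below. The --gpu metal flag comes from the tip above; --threads follows the whisper.cpp convention and is an assumption, so confirm both against your build’s --help output:
Example:
./whisper-small.llamafile --host 0.0.0.0 --port 8080 --gpu metal --threads 8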