The LATE toolkit consists of three components:
- Front-end - a single-page web application built with vanilla JavaScript and several third-party libraries (Silero VAD, FFmpeg, ProseMirror, WaveSurfer).
- Back-end - a statically compiled and linked binary that includes a web server, the SQLite database engine, the whisper.cpp engine, and, optionally, the CUDA library.
- Speech models - a Whisper-based ASR (automatic speech recognition) model in the GGML format, compatible with the whisper.cpp engine, and a VAD (voice activity detection) model.
Currently, a precompiled LATE binary is available for macOS on Apple Silicon (M1, M2, M3, or M4).
A binary release for Linux/x86_64 systems is in progress.
First, download the latest binary release (late-<date>-darwin-universal.tgz) from the releases page, then unpack it.
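The unpacking step can be sketched as a small shell helper. This is a minimal sketch, not part of the LATE release: the function name `unpack_release` and the optional destination directory are assumptions for illustration, and the archive name must be replaced with the actual file you downloaded.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with LATE): extract a gzip-compressed
# tar archive into a destination directory (current directory by default).
unpack_release() {
  local archive="$1"
  local dest="${2:-.}"
  mkdir -p "$dest"
  tar -xzf "$archive" -C "$dest"
}

# Usage (substitute the real release date for <date>):
#   unpack_release "late-<date>-darwin-universal.tgz" "$HOME/late"
```

Running `tar -xzf` directly on the downloaded file in the current directory has the same effect; the helper only adds the destination directory.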
In a terminal, change to the directory where you unpacked LATE. Then download the ASR and VAD models by running the following command:
bash download_models.sh <lang>
Replace <lang> with lv or ltg to download the fine-tuned Latvian or Latgalian model (Q8-quantized), respectively. If the <lang> parameter is omitted, the original Whisper large-v3 model (Q5-quantized) will be downloaded.
Note that downloading the ASR model may take some time.
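To make the effect of the `<lang>` parameter concrete, the following sketch prints which ASR model each argument selects, based only on the description above. The function `describe_model` is hypothetical and is not part of `download_models.sh`; in particular, rejecting unknown language codes is an assumption, since the real script's behavior in that case is not documented here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LATE): report which ASR model a given
# <lang> argument corresponds to, following the README description.
describe_model() {
  case "${1:-}" in
    lv)  echo "fine-tuned Latvian model (Q8-quantized)" ;;
    ltg) echo "fine-tuned Latgalian model (Q8-quantized)" ;;
    "")  echo "original Whisper large-v3 model (Q5-quantized)" ;;
    *)   echo "unknown language code: $1" >&2; return 1 ;;  # assumed behavior
  esac
}

describe_model lv    # corresponds to: bash download_models.sh lv
describe_model ltg   # corresponds to: bash download_models.sh ltg
describe_model       # corresponds to: bash download_models.sh
```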
Finally, run the LATE front-end and back-end by executing:
bash run_late.sh
The front-end will open in your default web browser. Please wait a few seconds while the back-end loads; the front-end reloads automatically once it is ready.
This work was funded by the EU Recovery and Resilience Facility's project Language Technology Initiative (2.3.1.1.i.0/1/22/I/CFLA/002) in synergy with the State Research Programme's project Research on Modern Latvian Language and Development of Language Technology (VPP-LETONIKA-2021/1-0006).