LATE: Toolkit for Private Speech Transcription

The LATE toolkit consists of three components:

Front-end - a single-page web application built with vanilla JavaScript and several third-party libraries (Silero VAD, FFmpeg, ProseMirror, WaveSurfer).
Back-end - a statically compiled and linked binary that includes a web server, the SQLite database engine, the whisper.cpp engine, and optionally, the CUDA library.
Speech models - a Whisper-based ASR (automatic speech recognition) model in the GGML format, compatible with the whisper.cpp engine, and a VAD (voice activity detection) model.

Running LATE locally

Currently, a precompiled LATE binary is available for macOS on Apple Silicon (M1, M2, M3, or M4).
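Before downloading, it may help to confirm that the machine matches that platform. The following is a minimal sketch, not part of the LATE release: is_supported_platform is a hypothetical helper that compares the output of uname with the values macOS reports on Apple Silicon (Darwin and arm64).

```shell
# Hypothetical helper (not shipped with LATE): succeeds only for
# macOS (kernel name "Darwin") on Apple Silicon (machine "arm64").
is_supported_platform() {
    os="$1"
    arch="$2"
    [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]
}

# Check the current machine and report the result.
if is_supported_platform "$(uname -s)" "$(uname -m)"; then
    echo "This machine matches the precompiled LATE binary."
else
    echo "No precompiled LATE binary for $(uname -s)/$(uname -m) yet."
fi
```

A shell started under Rosetta 2 reports x86_64 as the machine type even on Apple Silicon, so run the check from a native terminal session.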

A binary release for Linux/x86_64 systems is in progress.

First, download the latest binary release (late-<date>-darwin-universal.tgz) from the releases page, then unpack it.
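Unpacking can be done from the command line with tar. The sketch below assumes the archive has already been downloaded. unpack_release is a hypothetical helper, the target directory is your choice, and the layout inside the archive is not described here.

```shell
# Hypothetical helper (not shipped with LATE): extract a .tgz archive
# into a target directory, creating the directory if needed.
# Usage: unpack_release <archive.tgz> <target-dir>
unpack_release() {
    archive="$1"
    target="$2"
    # Fail early with a clear message if the archive is missing.
    [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
    mkdir -p "$target" && tar -xzf "$archive" -C "$target"
}

# Example call (substitute the actual release date in the file name):
# unpack_release "late-<date>-darwin-universal.tgz" "$HOME/late"
```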

In a terminal, change to the directory where you unpacked LATE. Then download the ASR and VAD models by running the following command:

bash download_models.sh <lang>

Replace <lang> with lv or ltg to download the fine-tuned Latvian or Latgalian model (Q8-quantized), respectively. If the <lang> parameter is omitted, the original Whisper large-v3 model (Q5-quantized) will be downloaded.
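The three choices described above can be summarized in a small lookup. This is only an illustrative sketch: model_for_lang is a hypothetical helper, and how download_models.sh itself reacts to an unrecognized code is not documented here.

```shell
# Hypothetical helper (not shipped with LATE): print which model the
# given <lang> value selects, according to the description above.
model_for_lang() {
    case "$1" in
        lv)  echo "fine-tuned Latvian model (Q8-quantized)" ;;
        ltg) echo "fine-tuned Latgalian model (Q8-quantized)" ;;
        "")  echo "original Whisper large-v3 model (Q5-quantized)" ;;
        *)   echo "unknown language code: $1" >&2; return 1 ;;
    esac
}

# Example: report the selection, then run the real script.
# $lang is left unquoted on purpose so that an empty value passes no argument.
# lang="lv"
# model_for_lang "$lang" && bash download_models.sh $lang
```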

Note that downloading the ASR model may take some time.

Finally, run the LATE front-end and back-end by executing:

bash run_late.sh

The front-end will open in your default web browser. Please wait a few seconds while the back-end loads; the front-end reloads automatically once it is ready.

Acknowledgements

This work was funded by the EU Recovery and Resilience Facility's project Language Technology Initiative (2.3.1.1.i.0/1/22/I/CFLA/002) in synergy with the State Research Programme's project Research on Modern Latvian Language and Development of Language Technology (VPP-LETONIKA-2021/1-0006).
