Feature: Add FunASR as self-hosted STT alternative to Deepgram #7563

@LauraGPT

Motivation

Omi currently uses Deepgram for speech-to-text, which requires API keys and incurs costs. FunASR (16K+ GitHub stars) is a fully self-hosted, open-source alternative that exposes an OpenAI-compatible API, so the integration effort should be minimal.

Why FunASR

  • Free & self-hosted: No API keys, no per-minute billing, data stays on your infrastructure
  • OpenAI-compatible API: /v1/audio/transcriptions endpoint, a drop-in replacement for any client that already speaks the OpenAI transcription API
  • 50+ languages including English, Chinese, Japanese, Korean, etc.
  • Industrial-grade accuracy: Paraformer (non-autoregressive, 170x realtime on GPU), SenseVoice (50+ languages, emotion detection)
  • Built-in VAD + punctuation + speaker diarization (cam++)
  • Runs on consumer GPUs: SenseVoice-Small (234M params) works on 4GB VRAM

Quick Start

pip install funasr vllm
funasr-server --device cuda  # starts OpenAI-compatible server at :8000

# Test
curl http://localhost:8000/v1/audio/transcriptions \
  -F file=@audio.wav -F model=sensevoice
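
For reference, the curl test above could be reproduced from Python roughly as follows. This is a minimal sketch using only the standard library, not FunASR's own client. It assumes the server from the Quick Start is listening on localhost:8000 and that the response follows the OpenAI convention of a JSON object with a "text" field. The helper names build_multipart and transcribe are made up for illustration.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields, file_field, file_name, file_bytes):
    """Encode plain form fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{file_name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, base_url="http://localhost:8000", model="sensevoice"):
    """POST an audio file to the OpenAI-style transcription endpoint.

    Assumption: the server replies with JSON shaped like {"text": "..."}.
    """
    audio = Path(path)
    body, content_type = build_multipart(
        {"model": model}, "file", audio.name, audio.read_bytes()
    )
    request = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["text"]
```

With the server running, transcribe("audio.wav") would send the same two form fields as the curl command (file and model) and return the transcript string, provided the response shape matches the assumption above.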

Since Omi already has a self-hosted Deepgram deployment (Helm charts in backend/charts/), FunASR could serve as a lighter-weight, truly open-source alternative that is easier to deploy.
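
One way the backend could choose between the two engines is a small configuration switch. The sketch below is purely illustrative: the environment variable names STT_PROVIDER and FUNASR_BASE_URL, the SttConfig type, and the load_stt_config helper are all hypothetical and not taken from the Omi codebase. DEEPGRAM_API_KEY is likewise assumed here rather than confirmed.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SttConfig:
    provider: str             # "deepgram" or "funasr"
    base_url: Optional[str]   # only used by the self-hosted FunASR server
    api_key: Optional[str]    # only used by Deepgram


def load_stt_config(env: Mapping[str, str] = os.environ) -> SttConfig:
    """Pick the speech-to-text backend from (hypothetical) env variables."""
    provider = env.get("STT_PROVIDER", "deepgram").lower()
    if provider == "funasr":
        # Self-hosted: no API key, just the address of the local server.
        base_url = env.get("FUNASR_BASE_URL", "http://localhost:8000")
        return SttConfig("funasr", base_url, None)
    if provider == "deepgram":
        api_key = env.get("DEEPGRAM_API_KEY")
        if not api_key:
            raise ValueError("DEEPGRAM_API_KEY is required for Deepgram")
        return SttConfig("deepgram", None, api_key)
    raise ValueError(f"Unknown STT provider: {provider!r}")
```

Keeping Deepgram as the default would leave existing deployments unchanged, while a self-hosted setup would only need to set the provider and, optionally, the server address.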
