A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Updated Jan 28, 2026 - Python
PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)
A curated list of awesome papers on contextualizing E2E ASR outputs
An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.
An implementation of RNN-Transducer loss in TF-2.0.
I'm building an end-to-end Vietnamese Speech Recognition System. I'll deploy it into production with the help of Flask, uWSGI, Nginx, and AWS ...
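Real-time speech recognition edition of FunASR: recognizes sound from the microphone and audio played on the computer; voice typing software for the PC.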
Pure PyTorch implementation of the loss described in "Online Segment to Segment Neural Transduction" https://arxiv.org/abs/1609.08194
🔊 Enhance speech recognition with GLM-ASR-Nano-2512, a high-performance model excelling in dialect support and low-volume audio accuracy.
Deep learning-based subtitle generation model that processes audio datasets to generate accurate text transcriptions. Includes audio feature extraction, encoder-decoder architecture, training pipelines, and evaluation metrics for subtitle alignment.
🚀 Create and manage SPL tokens on the Solana blockchain with ease, using our Next.js-based launchpad for streamlined token and liquidity management.
🎤 Deploy a simplified voice synthesis service with Fun-CosyVoice3-0.5B-2512, featuring real-time audio output and advanced performance optimizations.