TTSFM is a reverse-engineered API server mirroring OpenAI's TTS service for text-to-speech conversion.
Demo app for Groq plugins in LiveKit Agents.
A simple voice generation tool that converts text to natural speech using the CosyVoice2 model.
Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.
A multilingual large voice generation model with full-stack support for inference, training, and deployment.
A video translation and dubbing tool powered by LLMs for professional-grade translations and one-click deployment.
The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more!
A tool that converts video and audio into various document styles, such as notes and mind maps.
An AI-powered interactive avatar engine using Live2D, LLM, ASR, TTS, and RVC for VTubing and virtual assistant applications.
Extremely fast live recording, automatic slicing, rendering, and uploading for Bilibili, compatible with low-spec machines.
Targeted adversarial examples on speech-to-text systems.