Transforms research papers into engaging three-person podcast discussions for a fresh listening experience.
A third-party music player providing local services, desktop lyrics, music downloads, and high sound quality.
Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.
A multilingual large voice generation model with full-stack support for inference, training, and deployment.
A state-of-the-art open-source TTS model for high-quality, multilingual speech synthesis.
A one-stop solution for creating digital avatars from WeChat chat records using fine-tuned large language models.
A video translation and dubbing tool powered by LLMs, offering professional-grade translations and one-click deployment.
Speech-AI-Forge is a project centered on TTS generation, offering an API Server and a Gradio-based WebUI.
AbletonMCP connects Ableton Live to Claude AI, enabling AI-assisted music production and session manipulation.
An animation test project based on Bert-VITS2 that generates facial expressions and body animations from audio input.
A tool that converts video and audio into document formats such as notes and mind maps.