
An AI chatbot for WeChat that integrates the Dify and Coze platforms, with UI-based configuration and memory capabilities.

Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.

A multilingual large voice generation model with full-stack support for inference, training, and deployment.

Documentation on setting up an LLM server on Debian from scratch, using Ollama/vLLM, Open WebUI, OpenedAI Speech/Kokoro FastAPI, and ComfyUI.

A one-stop solution for creating digital avatars from WeChat chat history using fine-tuned large language models.

Speech-AI-Forge is a project centered on TTS generation, offering an API Server and a Gradio-based WebUI.

An animation test project based on Bert-VITS2 that generates facial expressions and body animations from audio input.

EdgePersona is an intelligent digital human that runs fully locally and offline, with low computational requirements.

An AI-powered interactive avatar engine using Live2D, LLM, ASR, TTS, and RVC for VTubing and virtual assistant applications.

OpenUtau is a free and open-source singing synthesis platform, serving as a successor to UTAU.

An AI companion that enhances paper reading with interactive features and a quirky AI professor persona.

A TTS and STS library built on Apple's MLX framework for efficient speech synthesis on Apple Silicon.