
Instant voice cloning by MIT and MyShell. Audio foundation model.

Foundational Models for State-of-the-Art Speech and Text Translation.

TripoSG is a high-fidelity image-to-3D generation model built on rectified flow transformers.

A comprehensive collection of papers focused on evaluating large language models (LLMs).

FlagEval is an evaluation toolkit for large AI foundation models.

A collection of high-quality pretrained models and resources for Chinese natural language processing.

A course for getting started with large language models (LLMs), with roadmaps and Colab notebooks.