UniTok

A Unified Tokenizer for Visual Generation and Understanding.


Introduction


Key Features:

Unified visual tokenizer compatible with autoregressive generative and multimodal understanding models.
Serves as the basis for a unified MLLM built on the Liquid framework, with state-of-the-art performance reported on both generation and understanding tasks.
Repository provides installation instructions, model weights, and inference code.

Benefits:

Boosts performance across unified MLLMs by integrating advanced features such as improved attention mechanisms.
Open to community contributions, feedback, and continuous improvements.
Published research results support its effectiveness on visual comprehension benchmarks.

Highlights:

Gradio demo available on Hugging Face.
Comprehensive training setups for various tasks, including data preparation for evaluation.
MIT licensed for open-source collaboration.

Information

Publisher: AISecKit
Website: github.com
Published date: 2025/04/28
