This repository contains the code for generating the ToxiGen dataset for hate speech detection.

Nano Bananary is an AI batch image and video generator with 142 effects.

AI Podcast Generator for bilingual episodes; it supports multiple languages and serves as an alternative to NotebookLM.

Zero-Config Code Flow for Claude Code and Codex, enabling seamless integration and configuration for AI development.

ToxiGen is a large-scale, machine-generated dataset for adversarial and implicit hate speech detection, published at ACL 2022. This repository includes the code and tools needed to generate the dataset, which contains implicitly toxic and benign sentences mentioning 13 minority groups. The dataset is intended for training classifiers to detect subtle hate speech that contains no slurs or profanity.