Newsletter
Join the Community
Subscribe to our newsletter for the latest news and updates
Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"

A bilingual Chinese-English translation of 'Agentic Design Patterns' by Antonio Gulli, focusing on intelligent systems design.

基于多智能体LLM的中文金融交易框架,支持A股/港股/美股分析。
SecAlign is a defensive framework designed to enhance the robustness of large language models (LLMs) against prompt injection attacks. The framework leverages preference optimization techniques, creating a preference dataset that includes both prompt-injected (insecure) inputs and secure outputs. By performing preference optimization on this dataset, SecAlign teaches the LLM to prefer secure outputs, significantly reducing the success rates of prompt injections.