A bilingual Chinese-English translation of Antonio Gulli's 'Agentic Design Patterns', focused on the design of intelligent systems.

A Chinese financial trading framework based on multi-agent LLMs, supporting analysis of A-share, Hong Kong, and US stock markets.
This paper introduces a novel method for creating universal and transferable adversarial attacks against aligned large language models (LLMs). The authors propose an approach that automatically generates suffixes to be appended to various prompts. By employing a combination of greedy and gradient-based optimization techniques, these adversarial suffixes increase the likelihood that aligned LLMs produce objectionable responses.
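The greedy, gradient-guided suffix search can be illustrated with a toy sketch. Note this is not the paper's implementation: the real method (GCG) computes gradients with respect to one-hot token embeddings of an actual LLM to shortlist candidate swaps, whereas this stand-in uses a hypothetical `toy_loss` and exhaustively scores a tiny vocabulary to show the greedy coordinate-swap loop.

```python
import random

def toy_loss(suffix, target):
    # Toy stand-in for the model's negative log-likelihood of the
    # target response given prompt + suffix: count mismatched tokens.
    return sum(1 for s, t in zip(suffix, target) if s != t)

def greedy_coordinate_search(vocab_size, suffix_len, target, steps=50, seed=0):
    """Greedy coordinate search over suffix token ids: at each step,
    try swapping one position to each candidate token and keep the
    single swap that most reduces the loss. (GCG additionally uses
    token-embedding gradients to shortlist candidates instead of
    scoring the whole vocabulary.)"""
    rng = random.Random(seed)
    suffix = [rng.randrange(vocab_size) for _ in range(suffix_len)]
    for _ in range(steps):
        best_loss, best_pos, best_tok = toy_loss(suffix, target), None, None
        for pos in range(suffix_len):
            for tok in range(vocab_size):
                cand = suffix[:pos] + [tok] + suffix[pos + 1:]
                loss = toy_loss(cand, target)
                if loss < best_loss:
                    best_loss, best_pos, best_tok = loss, pos, tok
        if best_pos is None:  # no improving swap found; converged
            break
        suffix[best_pos] = best_tok
    return suffix

suffix = greedy_coordinate_search(vocab_size=8, suffix_len=5, target=[1, 2, 3, 4, 5])
print(suffix)  # → [1, 2, 3, 4, 5]
```

In the toy setting the loss is separable per position, so greedy swaps converge exactly; with a real model the loss couples all suffix positions, which is why GCG needs gradient information to keep the candidate set tractable.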