LLM-Bias-Evaluation

A study evaluating geopolitical and cultural biases in large language models through dual-layered assessments.

Overview

This repository contains the dataset, evaluation scripts, and results for analyzing geopolitical and cultural biases in large language models (LLMs). The study is structured into two evaluation phases: factual QA, covering questions with objectively verifiable answers, and disputable QA, covering politically sensitive disputes with no single agreed answer. We examine how LLMs exhibit both model bias (induced by the training data) and inference bias (induced by the language of the query) when answering the same questions in different languages.
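
To make the two phases concrete, here is a minimal sketch of what a record in each layer might look like. The class and field names, and the example questions, are illustrative assumptions rather than the repository's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical record types for the two evaluation phases; the actual
# dataset format in this repository may differ.

@dataclass
class FactualQA:
    """Phase 1: a question with a single objectively verifiable answer."""
    question: str   # e.g. "In which year was the United Nations founded?"
    answer: str     # ground-truth answer, e.g. "1945"
    languages: list = field(default_factory=list)  # translations used to probe inference bias

@dataclass
class DisputableQA:
    """Phase 2: a politically sensitive dispute with no single agreed answer."""
    question: str   # e.g. "Which country does the disputed territory X belong to?"
    stances: list = field(default_factory=list)    # competing positions, one per disputing party
```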

Key Features
  • Dual-Layered Evaluation: Conducts both factual and disputable QA to assess biases.
  • Comprehensive Dataset: Includes datasets for both factual and disputable questions, translated and verified in multiple languages.
  • Evaluation Scripts: Provides scripts for running evaluations and generating responses from various models (a minimal usage sketch follows this list).
  • Bias Analysis: Analyzes model bias and inference bias through various metrics and evaluation methods.
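
As a rough illustration of the response-generation step, the sketch below asks the same question in several query languages and records each answer. The `query_model` function is a placeholder stub, not the repository's actual interface; swap in a real LLM client to use it.

```python
def query_model(model: str, prompt: str) -> str:
    """Placeholder standing in for a real LLM API call."""
    return f"[{model} answer to: {prompt}]"

# The same question rendered in different query languages, so that any
# divergence between the answers can be attributed to inference bias.
prompts = {
    "en": "Which country administers the disputed territory X?",
    "zh": "有争议的领土X由哪个国家管辖？",
    "ko": "분쟁 지역 X는 어느 나라가 관할합니까?",
}

responses = {lang: query_model("model-under-test", q) for lang, q in prompts.items()}
for lang, answer in responses.items():
    print(f"{lang}: {answer}")
```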

Benefits
  • Insightful Findings: Reveals how LLM answers to geopolitical and cultural questions vary with the model and the query language, highlighting where biases surface.
  • Open Source: Available for researchers and developers to use and contribute to.
  • Multilingual Support: Evaluates responses in multiple languages, making the study relevant across cultures.

Highlights
  • Investigates biases in LLMs through two phases: factual and disputable QA.
  • Includes detailed analysis of model and inference biases (an illustrative metric sketch follows this list).
  • Provides scripts for running evaluations and generating model responses.
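
One simple way to quantify inference bias, sketched below, is the rate at which a model gives the same answer to a question regardless of the query language. This particular metric is an illustrative assumption, not necessarily the one used in the study.

```python
from collections import Counter

def cross_language_agreement(answers_by_language: dict[str, str]) -> float:
    """Fraction of query languages whose answer matches the majority answer.

    1.0 means the model answered identically in every language (no
    detectable inference bias on this item); lower values mean the
    query language shifted the answer.
    """
    counts = Counter(answers_by_language.values())
    majority_count = counts.most_common(1)[0][1]
    return majority_count / len(answers_by_language)

# Example: the model sides with a different party when asked in Chinese.
print(cross_language_agreement({"en": "Country A", "zh": "Country B", "ko": "Country A"}))
# -> 0.666...
```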
