Opik

Tracing: Track all LLM calls and traces during development and production.
Annotations: Log feedback scores using the Python SDK or UI.
Playground: Experiment with different prompts and models.
Automated Evaluation: Store test cases and run experiments to evaluate LLM applications.
CI/CD Integration: Run evaluations as part of your CI/CD pipeline using the pytest integration.
Production Monitoring: Log high volumes of traces and monitor production applications with dashboards.

Opik is an open-source platform for debugging, evaluating, and monitoring LLM applications, with tracing and automated evaluation.
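
To make the tracing and annotation features listed above more concrete, here is a minimal plain-Python sketch of the underlying idea: wrap a function so each call is recorded as a trace, then attach a feedback score to that trace. This is not Opik's SDK. The names `traced`, `TRACES`, `add_feedback_score`, and `fake_llm_call` are hypothetical and exist only for illustration, and a real platform would send traces to a backend instead of an in-memory list.

```python
import functools
import time

# Hypothetical in-memory trace store (illustration only, not Opik's API).
TRACES = []


def traced(func):
    """Record the input, output, and duration of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        output = func(*args, **kwargs)
        TRACES.append({
            "name": func.__name__,
            "input": {"args": args, "kwargs": kwargs},
            "output": output,
            "duration_s": time.perf_counter() - start,
            "feedback_scores": [],
        })
        return output
    return wrapper


def add_feedback_score(trace, name, value):
    """Attach a named numeric score (an annotation) to a recorded trace."""
    trace["feedback_scores"].append({"name": name, "value": value})


@traced
def fake_llm_call(prompt):
    # Stand-in for a real model call, so the sketch runs offline.
    return prompt.upper()


answer = fake_llm_call("hello")
add_feedback_score(TRACES[-1], "relevance", 0.9)
```

After running this, `TRACES` holds one record for `fake_llm_call` with its input, output, timing, and the attached `relevance` score, which is roughly the kind of data a tracing dashboard would display.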