Introduction
This GitHub repository provides tools for generating training data with large language models (LLMs) for logistics applications. The goal is to help a logistics company build an interactive Q&A bot that can classify user queries as logistics-related or not.
Key Features
- Data Generation: Uses DeepSeek-V3 to generate diverse datasets of logistics and non-logistics queries.
- Data Review: Uses DeepSeek-R1 to check the generated datasets for quality and classification accuracy.
- Python-Based: The tools are implemented in Python, making them accessible and modifiable for developers.
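The generate-then-review flow described in the features above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `Example`, `build_dataset`, `fake_generate`, and `fake_review` are hypothetical, and the two stub functions stand in for the DeepSeek-V3 generation call and the DeepSeek-R1 review call so the sketch runs without network access.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# The two classes the Q&A bot is meant to distinguish (label names are assumed).
LABELS = ("logistics", "non_logistics")


@dataclass(frozen=True)
class Example:
    query: str
    label: str


def build_dataset(
    generate: Callable[[str, int], Iterable[str]],
    review: Callable[[str], str],
    per_label: int,
) -> List[Example]:
    """Generate queries for each label and keep those the reviewer agrees with."""
    kept: List[Example] = []
    for label in LABELS:
        for query in generate(label, per_label):
            query = query.strip()
            if not query:
                continue  # skip empty generations
            # Keep the example only if the reviewer's label matches the intended one.
            if review(query) == label:
                kept.append(Example(query, label))
    return kept


# Hypothetical stand-in for the generation model (DeepSeek-V3 in this project).
def fake_generate(label: str, n: int) -> List[str]:
    pool = {
        "logistics": [
            "Where is my parcel?",
            "How much does freight to Berlin cost?",
        ],
        "non_logistics": [
            "What is the weather today?",
            "Can you reroute my shipment?",  # deliberately mislabeled
        ],
    }
    return pool[label][:n]


# Hypothetical stand-in for the review model (DeepSeek-R1 in this project).
def fake_review(query: str) -> str:
    keywords = ("parcel", "freight", "shipment")
    is_logistics = any(word in query.lower() for word in keywords)
    return "logistics" if is_logistics else "non_logistics"


dataset = build_dataset(fake_generate, fake_review, per_label=2)
for example in dataset:
    print(f"{example.label}\t{example.query}")
```

Passing the generator and reviewer in as plain callables keeps the filtering logic independent of any particular API client, so the stubs can be replaced with real model calls without changing `build_dataset`.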
Benefits
- Automation: Replaces manual data collection and labeling with automated data generation.
- Quality Assurance: A model-based review step checks that the generated datasets are relevant and accurately labeled.
Highlights
- Specifically designed for logistics-related query classification.
- Simplifies the data preparation phase for building AI applications in logistics.
- Open-source project encouraging contributions and community support.