# Fine-Tune Pipeline
Welcome to the Fine-Tune Pipeline documentation! The pipeline provides end-to-end fine-tuning for language models, covering training, inference, and evaluation.
## 🚀 Features
- Easy Configuration: TOML-based configuration system for all components
- Modern Architecture: Built with Unsloth, Transformers, and TRL for efficient fine-tuning
- Comprehensive Evaluation: Multiple evaluation metrics including BLEU, ROUGE, and semantic similarity
- CI/CD Ready: Designed to work seamlessly with GitHub Actions and Jenkins
- Flexible Inference: Built-in inference capabilities with customizable parameters
- Weights & Biases Integration: Automatic logging and experiment tracking
## 🏗️ Architecture
The pipeline consists of three main components:
### 1. Fine-Tuner (`app/finetuner.py`)
- Handles model fine-tuning using LoRA (Low-Rank Adaptation)
- Supports 4-bit and 8-bit quantization for memory efficiency
- Integrates with Weights & Biases for experiment tracking
- Automatic model publishing to Hugging Face Hub
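To see why LoRA is so memory-efficient, here is a back-of-the-envelope sketch. LoRA freezes the base weight matrix (d × k) and trains only two low-rank factors B (d × r) and A (r × k). The layer shape below is illustrative, not taken from any particular model.

```python
def lora_trainable_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA adapter on a d x k layer:
    B contributes d*r and A contributes r*k, i.e. r * (d + k) total."""
    return r * (d + k)

# Illustrative layer shape and rank.
d, k, r = 4096, 4096, 16
full = d * k                            # parameters in the frozen layer
lora = lora_trainable_params(d, k, r)   # parameters LoRA actually trains
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

With these numbers, LoRA trains roughly 0.8% of the layer's parameters; combining that with 4-bit or 8-bit quantization of the frozen weights is what lets fine-tuning fit on modest GPUs.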
### 2. Inferencer (`app/inferencer.py`)
- Performs inference on test datasets
- Configurable generation parameters
- Supports both local and Hub models
- Outputs results in JSONL format
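JSONL stores one JSON object per line, so inference results can be streamed and appended without re-parsing the whole file. A minimal sketch (the field names here are hypothetical, not the pipeline's actual schema):

```python
import io
import json

# Hypothetical record shape -- the pipeline's actual fields may differ.
results = [
    {"prompt": "What is LoRA?", "completion": "A low-rank adaptation method."},
    {"prompt": "What is TRL?", "completion": "A library for RL-based training."},
]

# Write: one JSON object per line (an in-memory buffer stands in for a file).
buffer = io.StringIO()
for record in results:
    buffer.write(json.dumps(record) + "\n")

# Read back: parse each line independently.
loaded = [json.loads(line) for line in buffer.getvalue().splitlines()]
print(len(loaded))  # 2
```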
### 3. Evaluator (`app/evaluator.py`)
- Comprehensive evaluation suite with multiple metrics
- Support for both traditional (BLEU, ROUGE) and LLM-based evaluation
- Detailed reporting with Excel and JSON outputs
- Semantic similarity and factual correctness evaluation
## 🛠️ Technology Stack
- Unsloth: Efficient fine-tuning framework
- Transformers: Hugging Face Transformers library
- TRL: Hugging Face's Transformer Reinforcement Learning library
- Datasets: Hugging Face Datasets library
- Weights & Biases: Experiment tracking and logging
- RAGAS: Retrieval Augmented Generation Assessment
## 📊 Supported Models
The pipeline supports various model architectures including:
- Qwen 2.5 series
- Llama models
- Mistral models
- And any model compatible with Unsloth
## 📈 Evaluation Metrics
Built-in support for:
- BLEU Score: Translation quality assessment
- ROUGE Score: Summarization evaluation
- Factual Correctness: LLM-based factual evaluation
- Semantic Similarity: Embedding-based similarity
- Answer Accuracy: Custom accuracy metrics
- Answer Relevancy: Relevance assessment
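As a toy illustration of the embedding-based metric, semantic similarity is typically the cosine of the angle between two embedding vectors. In practice the vectors come from a sentence-embedding model; the three-dimensional vectors below are made up for the example.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between vectors a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Tiny made-up embeddings of a reference answer and a model answer.
ref = [0.2, 0.8, 0.1]
hyp = [0.25, 0.75, 0.05]
print(round(cosine_similarity(ref, hyp), 3))
```

A score of 1.0 means the embeddings point in the same direction (semantically identical under the embedding model); scores near 0 mean unrelated content.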
## 🔧 Quick Start
Ready to get started? Check out our Quick Start Guide to begin fine-tuning your first model!
## 📚 What's Next?
- Installation Guide - Set up your environment
- Configuration Overview - Understand the configuration system
- Basic Fine-Tuning Tutorial - Your first fine-tuning project
- API Reference - Detailed API documentation
This documentation is automatically generated and maintained. For issues or contributions, please visit our GitHub repository.