A Modular, Tagger-Driven LLM Fine-Tuning Pipeline: From Data Curation to Emotionally-Aligned AI

Abstract

This whitepaper presents a modular pipeline for LLM fine-tuning that blends automated persona-based data generation, multi-stage tagging, hallucination/confidence filtering, and explainable export, all leading to emotionally-aligned, production-grade conversational models. We highlight innovations in prompt engineering, tagger-driven QA curation, and the orchestration of data flows with full auditability.


Table of Contents

  1. Introduction
  2. Pipeline Overview
  3. Persona-Based Question Generation
  4. Persona-Driven Answer Synthesis
  5. Multi-Tagger Post-Hoc Classification
  6. Hallucination & Emotional Confidence Filtering
  7. Curated Data Export for Fine-Tuning
  8. Model Training and Evaluation
  9. Explainability & Auditability
  10. Future Directions
  11. References

1. Introduction

Modern LLM fine-tuning workflows often struggle to balance data quality, alignment, and explainability. We describe a deployed pipeline that addresses these needs by integrating modular stages for question generation, answer synthesis, post-hoc tagging, and confidence-based filtering. This approach enables safe, emotionally-aligned, and audit-ready conversational AI development.


2. Pipeline Overview

The pipeline consists of:

  • Automated persona/author-based question generation
  • Persona-driven answer generation using dynamic prompt templates
  • Multi-tagger system for emotional, topic, temporal, belief, and hallucination analysis
  • Confidence and hallucination filtering for high-quality QA selection
  • Curated export with full provenance and explainability
  • LoRA-based fine-tuning using PEFT

Example Workflow

#!/bin/bash
set -e

CUSTOM_MODEL_DIR="models/trained-mistral"
TRAIN_DATA="training_data/trained_qa.jsonl"

echo "[0] Resetting testing..."
python 00_reset_testing.py

echo "🌱 [1] Generating questions..."
python 01_generate_questions.py \
  --generate 5 \
  --target trained

echo "🧠 [2] Generating answers..."
python 02_generate_answers.py

echo "🏷️  [3] Tagging data with taggers..."
python 03_tag_data.py --retries 20

echo "📤 [4] Exporting curated QA data to JSONL..."
python 04_export_curated_data.py \
  --out "$TRAIN_DATA" \
  --max-hallucination 0.1 \
  --min-confidence 1.5 \
  -v

echo "🔥 [5] Training the model..."
python 05_train_model.py \
  --task dialogue \
  --data "$TRAIN_DATA" \
  --out "$CUSTOM_MODEL_DIR" \
  --model models/mistral-7b-instruct-v0.2 \
  --epochs 5

echo "🧪 [6] Evaluating model performance..."
python 06_test_model.py --model "$CUSTOM_MODEL_DIR"

python 06_test_model.py --mode chat --model "$CUSTOM_MODEL_DIR"

# echo "📊 [7] Pushing stats (optional)..."
# python statpush.py || echo "statpush failed or is optional – skipping."

echo "✅ All steps completed successfully!"

3. Persona-Based Question Generation

One of the foundational innovations in this pipeline is the use of automated, persona-driven question generation. Rather than relying on ad-hoc or crowdsourced data, the system uses Jinja2 templates for both system and author profiles. These templates are designed to capture a wide spectrum of conversational tones and inquiry modes—such as Socratic, existential, supportive, pragmatic, and playful—ensuring both breadth and depth in the resulting dataset.

Each question is generated via an LLM (through Ollama), using the combined persona and author profile prompts. An embedding model (e.g., MiniLM) computes a vector for each generated question, which is then checked against existing questions to ensure semantic diversity and prevent duplication. This process enables large, richly-varied QA datasets that are precisely aligned with intended conversational roles.
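
A minimal sketch of this similarity gate, assuming a sentence-transformers MiniLM model; the MongoDB database, collection, and field names are illustrative:

from pymongo import MongoClient
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
questions = MongoClient()["pipeline"]["questions"]  # hypothetical database/collection names

def cosine(a, b) -> float:
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_novel(candidate: str, threshold: float = 0.85) -> bool:
    """Reject questions that are near-duplicates of anything already stored."""
    cand_vec = model.encode(candidate)
    for doc in questions.find({}, {"embedding": 1}):
        if cosine(cand_vec, doc["embedding"]) >= threshold:
            return False
    return True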

Key Features:

  • Automated model-availability checks for seamless pipeline execution.
  • Embedding-based similarity filtering to avoid near-duplicate questions.
  • Pluggable author and persona profiles, supporting dynamic extension.
  • All questions are written to MongoDB with provenance, embeddings, and metadata for later analysis.

4. Persona-Driven Answer Synthesis

For every generated question, the pipeline synthesizes an answer using a separate Jinja2 persona template. Each persona has a tailored default prompt file, ensuring that answers are consistently on-brand and emotionally coherent. The answer generation phase can incorporate personality, beliefs, and meta-attributes such as empathy, supportiveness, or humor directly into the output.

The answer generator takes into account the conversational context, target persona, and sometimes prior emotional state or session context. Generated answers are cleaned, post-processed, and stored in MongoDB alongside their corresponding questions. This allows downstream modules to review and analyze answer quality, and to trace every answer back to its generative prompt.
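
A condensed sketch of this synthesis step, assuming a Jinja2 persona template on disk and the local Ollama generate endpoint; the template path, persona fields, and collection names are illustrative:

from datetime import datetime, timezone
import requests
from jinja2 import Environment, FileSystemLoader
from pymongo import MongoClient

env = Environment(loader=FileSystemLoader("prompts"))
qa = MongoClient()["pipeline"]["qa_pairs"]  # hypothetical collection

def synthesize_answer(question: str, persona: dict, model: str = "mistral") -> dict:
    # Render the persona's default prompt template with the question and meta-attributes.
    prompt = env.get_template(persona["template"]).render(question=question, persona=persona)

    # Call the local Ollama generate endpoint (non-streaming).
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    answer = resp.json()["response"].strip()

    # Store the QA pair with provenance for downstream tagging and auditing.
    doc = {
        "question": question,
        "answer": answer,
        "persona": persona["name"],
        "prompt_profile": persona["template"],
        "model": model,
        "created_at": datetime.now(timezone.utc),
    }
    qa.insert_one(doc)
    return doc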

Key Features:

  • Template-driven, persona-consistent answer generation.
  • Separation of question and answer profiles enables flexible conversational pairings.
  • Rich answer metadata (prompt profile, timestamp, model used, etc.) for full traceability.
  • Built-in normalization, duplicate avoidance, and personality alignment in answer synthesis.

5. Multi-Tagger Post-Hoc Classification

Once QA pairs are generated, the pipeline employs a modular, multi-tagger system to classify each exchange along multiple dimensions. Taggers include:

  • Emotional Tagger: Identifies emotional tones (e.g., supportive, playful, anxious) and computes a confidence score.
  • Topic Tagger: Extracts high-level topics and subtopics.
  • Temporal Tagger: Classifies temporal focus (past, present, future, timeless).
  • Belief Tagger: Detects explicit or implied beliefs, values, or philosophical stances.
  • Hallucination Tagger: Scores the likelihood of unsupported claims or hallucinations.

Each tagger is invoked via its own prompt template (YAML/Jinja2), and outputs structured JSON with the relevant metadata. Results are attached directly to each QA document in MongoDB for later filtering and analytics. Tagging is robust to failures—failed attempts are retried, and errors are logged for auditability.
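
As a simplified sketch, a single tagger invocation with retries and structured JSON parsing might look like the following; the prompt file layout, output fields, and use of Ollama's JSON output mode are assumptions:

import json
import requests
from jinja2 import Template

def run_tagger(qa_doc: dict, prompt_path: str, model: str = "mistral", retries: int = 3) -> dict:
    """Render a tagger prompt, call the LLM, and parse its structured JSON output."""
    with open(prompt_path, encoding="utf-8") as fh:
        prompt = Template(fh.read()).render(question=qa_doc["question"], answer=qa_doc["answer"])

    last_error = None
    for _ in range(retries):
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": False, "format": "json"},
            timeout=300,
        )
        resp.raise_for_status()
        try:
            # e.g. {"tones": ["supportive"], "confidence": 1.7}
            return json.loads(resp.json()["response"])
        except (json.JSONDecodeError, KeyError) as exc:
            last_error = exc  # malformed output: retry and log for auditability
    raise RuntimeError(f"Tagger failed after {retries} attempts: {last_error}")

The parsed result can then be merged into the stored QA document (for example, under a tags.emotional field) so that later filtering and analytics can query it directly.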

Key Features:

  • Each tagger is modular and replayable (tags can be updated as taggers improve).
  • Tagger prompts are versioned and extensible.
  • Confidence scoring for emotional tags enables advanced filtering.
  • Tagging failures are tracked separately for transparency and pipeline health monitoring.

6. Hallucination & Emotional Confidence Filtering

Not all QA pairs are fit for training—particularly those with high hallucination risk or low emotional alignment. The pipeline enforces strict quality gates:

  • Hallucination Filtering: QA pairs with a hallucination score above a configurable threshold are excluded.
  • Emotional Confidence Filtering: Only QA pairs with an emotional confidence score above a minimum threshold are considered for export.
  • Multi-Tagger Completeness: Entries must pass all required taggers (e.g., must have emotional, topic, and temporal tags present).
  • Cosine Similarity Pruning: Embedding similarity checks prevent near-duplicate QA pairs.

This ensures that only the most reliable, diverse, and emotionally-consistent pairs make it to the final dataset. Filtering stats and confidence histograms are tracked for ongoing quality assurance.
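
A minimal sketch of this quality gate, assuming tagger outputs are stored under a tags field and using thresholds that mirror the export flags shown earlier; the field names are illustrative:

REQUIRED_TAGGERS = ("emotional", "topic", "temporal")

def passes_quality_gate(doc: dict, max_hallucination: float = 0.1, min_confidence: float = 1.5) -> bool:
    """Apply the hallucination, confidence, and completeness gates to one QA document."""
    tags = doc.get("tags", {})

    # All required taggers must have produced output.
    if any(name not in tags for name in REQUIRED_TAGGERS):
        return False

    # Hallucination score must not exceed the configured ceiling.
    if tags.get("hallucination", {}).get("score", 1.0) > max_hallucination:
        return False

    # Emotional confidence must meet the configured floor.
    if tags["emotional"].get("confidence", 0.0) < min_confidence:
        return False

    return True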

Key Features:

  • Configurable thresholds for hallucination and emotional confidence.
  • Statistical reporting (histograms, counts) on retained vs. filtered data.
  • Quality gates are fully transparent and explainable.

7. Curated Data Export for Fine-Tuning

The final, filtered QA pairs are exported to JSONL for use in LoRA/PEFT fine-tuning. Each entry includes not only the prompt/response, but also:

  • Complete tagger outputs (emotional tones, topics, beliefs, hallucination scores, etc.)
  • Persona, author, mode, and all prompt provenance
  • Additional metadata (MongoDB ID, timestamps, etc.)

This curated, explainable dataset is ready for direct use in HuggingFace-style training scripts and enables reproducible research, continual evaluation, and auditability. Quality metrics from export (such as confidence and hallucination histograms) provide a clear picture of dataset composition.
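
An illustrative writer and record shape for this export; the exact field names are assumptions based on the metadata described above:

import json

def export_jsonl(docs, out_path: str) -> int:
    """Write curated QA pairs to JSONL with their tags and provenance attached."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as fh:
        for doc in docs:
            record = {
                "prompt": doc["question"],
                "response": doc["answer"],
                "persona": doc.get("persona"),
                "author": doc.get("author"),
                "tags": doc.get("tags", {}),   # emotional, topic, temporal, belief, hallucination
                "source_id": str(doc["_id"]),  # MongoDB provenance
                "created_at": doc["created_at"].isoformat(),
            }
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count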

Key Features:

  • JSONL export format, compatible with HuggingFace and other frameworks.
  • Every example traceable to its source, tags, and generation context.
  • Dataset is ready for continual re-export as new taggers or filters are introduced.

8. Model Training and Evaluation

The exported, high-quality QA dataset serves as the foundation for fine-tuning large language models using modern frameworks such as HuggingFace Transformers with PEFT (Parameter-Efficient Fine-Tuning) and LoRA (Low-Rank Adaptation). The training phase leverages:

  • Efficient 4-bit quantization to enable training on consumer-grade hardware with minimal impact on output quality.
  • Persona/context-aware prompts included in the training data, so the model internalizes both conversational style and metadata.
  • Trainer scripts that provide robust batching, logging, and periodic evaluation on held-out validation sets.

After fine-tuning, the resulting adapter(s) and tokenizer are saved for production use or further research. Evaluation includes both automated scoring (e.g., perplexity, BLEU, etc.) and qualitative review, with the ability to test model responses using “hardcoded” conversational prompts for alignment and emotional fidelity.
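
A compressed sketch of the 4-bit LoRA setup with HuggingFace Transformers and PEFT; the hyperparameters and target modules below are illustrative defaults, not the pipeline's exact configuration:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

BASE = "models/mistral-7b-instruct-v0.2"

# Load the base model in 4-bit NF4 quantization to fit on consumer GPUs.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb_config, device_map="auto")

# Attach low-rank adapters to the attention projections only.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically a small fraction of total weights

After training, the adapter and tokenizer are saved (e.g., via save_pretrained) so they can be loaded on top of the base model for evaluation or production use.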

Key Features:

  • LoRA/PEFT support for efficient, modular fine-tuning.
  • Full compatibility with open-source frameworks and evaluation tools.
  • Output adapters and tokenizers are portable and reusable.
  • Direct support for interactive, persona-driven evaluation and free chat.

9. Explainability & Auditability

A hallmark of the pipeline is radical transparency and explainability:

  • Every QA pair is fully traceable: Each example includes not only the prompt and response, but all tagger outputs, scores, and provenance.
  • Audit logs for failures: All failed tagger attempts, hallucination outliers, and filtered entries are logged in MongoDB for review and debugging.
  • Configurable, replayable tagging: Taggers and filters can be improved or re-run at any time, enabling continuous pipeline evolution.
  • Statistical dashboards: Confidence, hallucination, and tag frequency histograms give both researchers and practitioners actionable insight into dataset quality.
  • Explainable filtering decisions: Quality gates are based on explicit, documented thresholds, with all filtering steps exportable and inspectable.

This level of explainability is especially critical for aligning LLMs to human values, enabling responsible deployment, and supporting reproducible research.
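
As one example of the dashboards described above, a tag-frequency histogram can be produced with a single MongoDB aggregation; the collection and field names are illustrative:

from pymongo import MongoClient

qa = MongoClient()["pipeline"]["qa_pairs"]  # hypothetical collection

# Count how often each emotional tone appears across curated QA pairs.
pipeline = [
    {"$unwind": "$tags.emotional.tones"},
    {"$group": {"_id": "$tags.emotional.tones", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]
for row in qa.aggregate(pipeline):
    print(f'{row["_id"]:<20} {row["count"]}')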


10. Future Directions

To be written...


11. References

To be written...