Greenflash Docs

Greenflash helps you build training datasets from real production conversations, filtered and organized by comprehensive quality analysis.

Intelligent data selection

Greenflash’s analysis engine identifies which conversations make good training examples and which demonstrate patterns to avoid.

Quality-based curation

Export conversations filtered by:

Quality scores (high for positive training, low for negative examples)
User ratings and satisfaction metrics
Edge cases and unusual patterns
Specific topics or use cases

Every export includes:

Full conversation transcript
Quality metrics and analysis scores
User ratings and feedback
Topic classification and keywords
Sentiment progression
Safety issue flags

Custom filtering

Build datasets that match your exact training needs:

Filter by sentiment trajectory (improving vs. declining)
Select specific error types (hallucinations, refusals, confusion)
Choose conversation lengths and complexity levels
Include only rated conversations with explicit feedback
Combine multiple criteria for precise curation

Training data that matters

Positive examples

Greenflash identifies what makes conversations successful:

High user satisfaction scores
Positive sentiment throughout
Successful task completion
Efficient problem resolution
No safety or quality issues

Negative examples

Learn from what goes wrong:

Hallucination occurrences with context
Sentiment drops with triggering messages
Failed task attempts with breakdown points
Safety violations with specific patterns
User frustration signals

Balanced datasets

Greenflash helps create representative training sets:

Topic coverage across categories
Varied conversation complexity
Balance of positive and negative examples
Representative user interaction patterns

Export capabilities

Each exported conversation includes:

Complete message transcripts with metadata
Quality scores and analysis results
Sentiment, topic, and safety annotations
Tool calls and system prompts
User ratings and feedback

Available formats:

CSV for analysis and filtering
JSON for model training pipelines
Custom schemas for your requirements

Workflow

Deploy your AI with Greenflash monitoring
Accumulate and analyze real conversations
Export curated datasets based on quality metrics
Fine-tune models with production data
Measure improvement and iterate

Privacy and compliance

Data ownership: Your conversations remain yours
Filtering controls: Exclude sensitive information
Anonymization: Remove PII before export
Audit trails: Track what data was exported when
Access controls: Limit who can export data

Common use cases

Fine-tuning: Select high-quality conversations for model training
Evaluation: Build test sets from real edge cases
Error analysis: Export and study failure patterns
RLHF: Curate examples with clear quality signals

Getting Started

Features

Data & Integrations

Analyses

Developers

Data Curation

Intelligent data selection

Quality-based curation

Custom filtering

Training data that matters

Positive examples

Negative examples

Balanced datasets

Export capabilities

Workflow

Privacy and compliance

Common use cases

Getting Started

Features

Data & Integrations

Analyses

Developers

​Intelligent data selection

​Quality-based curation

​Custom filtering

​Training data that matters

​Positive examples

​Negative examples

​Balanced datasets

​Export capabilities

​Workflow

​Privacy and compliance

​Common use cases

Intelligent data selection

Quality-based curation

Custom filtering

Training data that matters

Positive examples

Negative examples

Balanced datasets

Export capabilities

Workflow

Privacy and compliance

Common use cases