Greenflash Docs

For high-volume applications, you can control the percentage of data that is ingested using sampling. This helps manage costs while still capturing representative data and ensuring critical information always gets through.

How Sampling Works

Both the /messages and /events endpoints support sampling via two request fields:

| Field | Type | Description |
| --- | --- | --- |
| `sampleRate` | number (0-1) | The probability that this request will be ingested. `0.1` means 10% of requests are stored. Defaults to `1.0` (all requests). |
| `forceSample` | boolean | When `true`, bypasses sampling and ensures the request is always ingested. |

Free plan customers are not affected by sampling. All data is ingested regardless of the sampleRate value.
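The interaction between the two fields can be modeled in a few lines. This is a minimal sketch of the behavior described above, not Greenflash's actual server code: the function name `shouldIngest` and the injectable `random` parameter are assumptions made for illustration, and the free-plan exception is left out.

```javascript
// Hypothetical model of the sampling decision (not Greenflash's implementation).
// `random` is injectable so the behavior can be checked deterministically.
function shouldIngest({ sampleRate = 1.0, forceSample = false } = {}, random = Math.random) {
  // forceSample bypasses sampling entirely
  if (forceSample) return true;
  // Math.random() returns a value in [0, 1), so sampleRate = 1.0 always keeps
  // the request and sampleRate = 0 never does
  return random() < sampleRate;
}

console.log(shouldIngest({ sampleRate: 0.1 }, () => 0.05));                    // true
console.log(shouldIngest({ sampleRate: 0.1 }, () => 0.5));                     // false
console.log(shouldIngest({ sampleRate: 0.1, forceSample: true }, () => 0.5));  // true
```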

Messages vs. Events

Sampling works differently for messages and events to optimize for their distinct use cases:

Messages: Conversation-Level Sampling

For /messages, sampling is deterministic per conversation. Either all messages in a conversation are ingested or none are. This ensures:

Complete conversation context is preserved
No fragmented conversations with missing messages
Consistent behavior across retries and multiple message batches

# All messages for this conversation will be kept or dropped together
curl --request POST \
  --url https://www.greenflash.ai/api/v1/messages \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "externalConversationId": "conv-123",
    "productId": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "messages": [{"role": "user", "content": "Hello"}],
    "sampleRate": 0.1
  }'
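One common way to get this all-or-nothing behavior is to derive the decision from a hash of the conversation ID instead of a fresh random number. The sketch below illustrates that idea only. Greenflash does not document its hashing scheme here, so the FNV-1a hash and the `keepConversation` helper are assumptions, not the service's actual method.

```javascript
// 32-bit FNV-1a over the string's UTF-16 code units (adequate for ASCII IDs).
// Chosen only to keep this sketch dependency-free.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Hypothetical helper: maps a conversation ID to a fixed point in [0, 1)
// and keeps the conversation when that point falls below sampleRate.
function keepConversation(externalConversationId, sampleRate = 1.0, forceSample = false) {
  if (forceSample) return true;
  const bucket = fnv1a(externalConversationId) / 2 ** 32;
  return bucket < sampleRate;
}

// Every batch or retry for the same conversation gets the same answer.
const first = keepConversation('conv-123', 0.1);
const retry = keepConversation('conv-123', 0.1);
console.log(first === retry); // true
```

A side effect of this design is that the decision is monotonic in the rate: a conversation kept at 0.1 is still kept at 0.2, because its bucket value does not change.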

Events: Per-Event Sampling

For /events, sampling is non-deterministic per event. Each event has an independent probability of being ingested. This ensures:

Even distribution across all organizations and event types
No (org, eventType) bucket is permanently included or excluded
Representative sampling across your entire event stream

# Each event is independently sampled
curl --request POST \
  --url https://www.greenflash.ai/api/v1/events \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "eventType": "feature_used",
    "productId": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "value": "search",
    "sampleRate": 0.1
  }'

When to Use Sampling

High-Frequency Data

If you’re tracking high-volume data like general usage events or routine conversations, sample at 10-20% to capture trends without storing every occurrence.

# Sample 10% of routine page views
{
  "eventType": "page_view",
  "sampleRate": 0.1
}

Critical Data

Always use forceSample: true for high-value data that should never be dropped:

Events: purchase_completed, subscription_started, churn_detected
Messages: Support escalations, error conversations, VIP customer interactions

# Always capture purchase events
{
  "eventType": "purchase_completed",
  "influence": "positive",
  "value": "299.00",
  "valueType": "currency",
  "forceSample": true
}
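One way to apply both rules consistently is to set the sampling fields in a single place before sending. The sketch below is a hypothetical client-side helper, not part of any Greenflash SDK: the `CRITICAL_EVENT_TYPES` set, the default routine rate of 0.1, and the function name `withSampling` are all illustrative assumptions.

```javascript
// Event types that must never be dropped (taken from the examples above).
const CRITICAL_EVENT_TYPES = new Set([
  'purchase_completed',
  'subscription_started',
  'churn_detected',
]);

// Hypothetical helper: returns a copy of the payload with sampling fields set.
// Critical events get forceSample; everything else gets the routine rate.
function withSampling(payload, routineRate = 0.1) {
  if (CRITICAL_EVENT_TYPES.has(payload.eventType)) {
    return { ...payload, forceSample: true };
  }
  return { ...payload, sampleRate: routineRate };
}

console.log(withSampling({ eventType: 'page_view' }));
// { eventType: 'page_view', sampleRate: 0.1 }
console.log(withSampling({ eventType: 'purchase_completed', value: '299.00' }));
// { eventType: 'purchase_completed', value: '299.00', forceSample: true }
```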

Response for Dropped Requests

When a request is dropped due to sampling, the API returns a 204 No Content response. This indicates the request was valid but was intentionally not ingested. A 204 response has no body, so check the status before parsing JSON:

const response = await fetch('https://www.greenflash.ai/api/v1/events', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    eventType: 'feature_used',
    productId: productId,
    sampleRate: 0.1
  })
});

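// Check for 204 first: response.ok is also true for 204, but there is no body to parse
if (response.status === 204) {
  // Request was valid but dropped due to sampling - this is expected
  console.log('Event sampled out');
} else if (response.ok) {
  // Event was ingested
  const data = await response.json();
  console.log('Event created:', data.eventId);
} else {
  // Any other status is a real failure, not a sampling drop
  console.error('Event request failed:', response.status);
}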

Best Practices

Start with 100% sampling during development and initial rollout to ensure your integration is working correctly.
Reduce sampling gradually as volume increases. Monitor your analytics to ensure you’re still capturing representative data.
Never sample critical business events. Use forceSample: true for events that directly impact revenue or customer success metrics.
Consider conversation importance when sampling messages. High-value customer conversations or error scenarios should use forceSample: true.
Monitor the 204 response rate to verify your sampling is working as expected.
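
Monitoring the 204 rate can be as simple as counting statuses on the client. The sketch below is one possible approach, with a hypothetical `SamplingMonitor` class and a default tolerance of 0.05 chosen arbitrarily. It assumes every recorded request used the same `sampleRate` and none set `forceSample`. For /messages the drop decision is made per conversation, so the per-request rate may deviate from `1 - sampleRate`.

```javascript
// Hypothetical counter for tracking how many successful requests return 204.
class SamplingMonitor {
  constructor() {
    this.total = 0;
    this.dropped = 0;
  }

  // Call once per response with its HTTP status code.
  // Non-2xx statuses are ignored: they are failures, not sampling outcomes.
  record(status) {
    if (status < 200 || status >= 300) return;
    this.total += 1;
    if (status === 204) this.dropped += 1;
  }

  // Fraction of successful requests that were sampled out.
  dropRate() {
    return this.total === 0 ? 0 : this.dropped / this.total;
  }

  // True when the observed drop rate is within `tolerance` of 1 - sampleRate.
  isWithinExpected(sampleRate, tolerance = 0.05) {
    return Math.abs(this.dropRate() - (1 - sampleRate)) <= tolerance;
  }
}

const monitor = new SamplingMonitor();
for (let i = 0; i < 90; i++) monitor.record(204);
for (let i = 0; i < 10; i++) monitor.record(200);
console.log(monitor.dropRate());            // 0.9
console.log(monitor.isWithinExpected(0.1)); // true
```

On the free plan, where all data is ingested regardless of `sampleRate`, the expected drop rate is 0.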
