LLM Output Parsing & Structured Data: JSON, Function Calling, Validation
Extract structured data reliably from LLMs. JSON parsing, function calling, Pydantic validation, and error handling for production applications.
Introduction: The Structured Data Challenge
Large Language Models generate impressive natural language responses, but production applications rarely need prose—they need structured data. Your customer service chatbot doesn’t just want a friendly paragraph about order status; it needs an order ID, status code, and estimated delivery date in a format your database understands. Your document processor doesn’t want a text summary; it needs extracted fields validated and ready for insertion into your CRM.
The gap between LLM text generation and structured data requirements represents one of the biggest friction points in production AI systems. Models are non-deterministic, prone to adding explanation around requested JSON, inconsistent in field naming, and creative in inventing schemas. A prompt asking for JSON might return markdown-wrapped JSON, or JSON with comments, or prose followed by JSON, or malformed JSON that crashes your parser.
Yet structured data extraction is critical for AI applications. Every form processor, data pipeline, API integration, workflow automation, and analytics system requires reliable parsing of LLM outputs into typed, validated data structures. This guide reveals production-grade techniques for extracting structured data from LLMs with reliability approaching traditional APIs.
We’ll cover JSON extraction, OpenAI function calling, Pydantic validation, error handling, retry strategies, and schema enforcement—everything needed to transform unreliable text into production-ready structured data.
The Structured Output Spectrum
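Different approaches offer varying levels of structure and reliability. The percentages below are rough, illustrative estimates rather than benchmark results; actual rates depend on the model, the prompt, and the complexity of the schema.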
Reliability Hierarchy
Level 1: Prompt-Based JSON (60-80% reliability)
- Ask model to return JSON
- Parse with try/except
- Handle failures manually
Level 2: Constrained Decoding (85-95% reliability)
- JSON mode (OpenAI)
- Structured Outputs with a JSON Schema (OpenAI)
- Grammar-based generation
Level 3: Function Calling (90-98% reliability)
- Native function call interfaces
- Typed parameters
- Built-in validation
Level 4: Typed Extraction with Validation (95-99% reliability)
- Function calling + Pydantic
- Multi-level validation
- Automatic retry on failure
Approach Comparison
| Method | Reliability | Flexibility | Setup Complexity | Best For |
|---|---|---|---|---|
| Prompt JSON | 70% | High | Low | Prototypes |
| JSON Mode | 90% | Medium | Low | Simple structures |
| Function Calling | 95% | Medium | Medium | API integrations |
| Typed + Validation | 98% | Lower | High | Production systems |
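Because the figures above are estimates, it can help to measure the rate on your own data. The following is a minimal sketch, using only the standard library, that computes how many saved model outputs parse as a JSON object. The function name `parse_success_rate` and the sample strings are made up for illustration.

```python
import json


def parse_success_rate(outputs: list[str]) -> float:
    """Return the fraction of outputs that parse as a JSON object."""
    if not outputs:
        return 0.0
    ok = 0
    for text in outputs:
        try:
            if isinstance(json.loads(text), dict):
                ok += 1
        except json.JSONDecodeError:
            pass  # prose around the JSON or malformed JSON counts as a failure
    return ok / len(outputs)


# Illustrative samples, not real model output
samples = [
    '{"status": "shipped"}',
    'Sure! Here is the JSON: {"status": "shipped"}',
    '{"status": "shipped",}',
    '{"status": "pending"}',
]
rate = parse_success_rate(samples)
print(f"{rate:.0%}")  # 50%
```

Running the same measurement before and after a prompt or API change gives a concrete basis for choosing a level in the hierarchy.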
Strategy 1: Prompt-Based JSON Extraction
The simplest approach: ask nicely and parse carefully.
Basic JSON Extraction
import json
import re
from typing import Any, Optional


class JSONExtractor:
    def extract_json(self, text: str) -> Optional[Any]:
        """Extract a JSON object or array from an LLM response."""
        # Remove markdown code fences
        text = re.sub(r'```(?:json)?\s*', '', text)
        # Try to find a JSON object (greedy: first '{' to last '}')
        json_match = re.search(r'\{.*\}', text, re.DOTALL)
        if not json_match:
            # Fall back to an array
            json_match = re.search(r'\[.*\]', text, re.DOTALL)
        if not json_match:
            return None
        try:
            return json.loads(json_match.group())
        except json.JSONDecodeError:
            return self.repair_json(json_match.group())

    def repair_json(self, malformed_json: str) -> Optional[Any]:
        """Attempt to repair common JSON issues (best effort)."""
        # Remove comments first; only whole-line // comments, so URLs survive
        repaired = re.sub(r'^\s*//.*$', '', malformed_json, flags=re.MULTILINE)
        repaired = re.sub(r'/\*.*?\*/', '', repaired, flags=re.DOTALL)
        # Remove trailing commas
        repaired = re.sub(r',(\s*[}\]])', r'\1', repaired)
        # Fix single quotes (naive: also rewrites apostrophes inside strings)
        repaired = repaired.replace("'", '"')
        try:
            return json.loads(repaired)
        except json.JSONDecodeError:
            return None
# Usage
extractor = JSONExtractor()
llm_response = """Here's the data you requested:
```json
{
"name": "John Doe",
"age": 30,
"email": "john@example.com",
}
```"""
data = extractor.extract_json(llm_response)
print(data) # {'name': 'John Doe', 'age': 30, 'email': 'john@example.com'}
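The greedy regex above spans from the first `{` to the last `}`, so it can fail when the reply contains braces in the surrounding prose. As an alternative, here is a minimal sketch that scans for the first position where the standard library's `json.JSONDecoder.raw_decode` succeeds. The helper name `first_json_value` is an assumption for illustration, and the sketch does not attempt any repair.

```python
import json
from typing import Any, Optional


def first_json_value(text: str) -> Optional[Any]:
    """Return the first JSON object or array embedded in text, or None."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON here, try the next candidate
        return value
    return None


reply = 'Result: {"id": 7, "tags": ["a", "b"]} Let me know if {this} helps.'
print(first_json_value(reply))  # {'id': 7, 'tags': ['a', 'b']}
```

This trades the repair step for stricter parsing; the two approaches can be combined by falling back to repair only when no valid value is found.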
Robust Prompting for JSON
class StructuredPromptBuilder:
    def build_json_prompt(
        self,
        task: str,
        schema: dict,
        examples: Optional[list[dict]] = None
    ) -> str:
        """Build a prompt optimized for JSON output."""
        prompt_parts = [
            f"Task: {task}",
            "",
            "CRITICAL: Return ONLY valid JSON. No explanations, no markdown, no preamble.",
            "",
            "Expected schema:",
            json.dumps(schema, indent=2)
        ]
        if examples:
            prompt_parts.append("\nExamples:")
            for i, example in enumerate(examples, 1):
                prompt_parts.append(f"\nExample {i}:")
                prompt_parts.append(json.dumps(example, indent=2))
        prompt_parts.extend([
            "",
            "Return your response as a JSON object matching the schema above.",
            "Do not wrap in markdown code blocks.",
            "Do not include any text before or after the JSON."
        ])
        return "\n".join(prompt_parts)
# Usage
builder = StructuredPromptBuilder()
schema = {
"product_name": "string",
"price": "number",
"in_stock": "boolean",
"categories": ["string"]
}
prompt = builder.build_json_prompt(
task="Extract product information from the description",
schema=schema,
examples=[
{
"product_name": "Wireless Mouse",
"price": 29.99,
"in_stock": True,
"categories": ["Electronics", "Accessories"]
}
]
)
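# `llm` is a placeholder for any client with a generate(prompt) -> str method
response = llm.generate(prompt)
# `extractor` is the JSONExtractor instance created earlier
data = extractor.extract_json(response)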
JSON Mode (OpenAI)
import json

from openai import OpenAI


class OpenAIJSONExtractor:
    def __init__(self, api_key: str):
        self.client = OpenAI(api_key=api_key)

    def extract_structured_data(
        self,
        prompt: str,
        model: str = "gpt-4-turbo-preview"
    ) -> dict:
        """Use JSON mode so the reply is syntactically valid JSON."""
        response = self.client.chat.completions.create(
            model=model,
            messages=[
                {
                    "role": "system",
                    "content": "You are a data extraction assistant. Always respond with valid JSON."
                },
                {
                    "role": "user",
                    "content": prompt
                }
            ],
            # JSON mode enforces valid JSON syntax, not a particular schema
            response_format={"type": "json_object"}
        )
        choice = response.choices[0]
        if choice.finish_reason == "length":
            # The output hit the token limit, so the JSON may be cut off
            raise ValueError("Response truncated; JSON may be incomplete")
        return json.loads(choice.message.content)
# Usage
extractor = OpenAIJSONExtractor(api_key="your-key")
result = extractor.extract_structured_data(
"""Extract information from: "Premium wireless mouse, $29.99, in stock"
Return JSON with: product_name, price, in_stock"""
)
print(result)
# Syntactically valid JSON, but field names and types are not checked against any schema
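JSON mode constrains syntax only, so the parsed result can still miss fields or carry the wrong types. Below is a minimal standard-library sketch of a shape check that could run right after parsing. The helper `check_fields` and the sample payload are assumptions for illustration; the Pydantic models later in this guide do the same job more thoroughly.

```python
import json


def check_fields(data: dict, expected: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    for key, expected_type in expected.items():
        if key not in data:
            problems.append(f"missing field: {key}")
        elif not isinstance(data[key], expected_type):
            actual = type(data[key]).__name__
            problems.append(f"{key}: expected {expected_type.__name__}, got {actual}")
    return problems


# Illustrative payload: price came back as a string and in_stock is absent
raw = '{"product_name": "Premium wireless mouse", "price": "29.99"}'
problems = check_fields(
    json.loads(raw),
    {"product_name": str, "price": float, "in_stock": bool},
)
print(problems)
# ['price: expected float, got str', 'missing field: in_stock']
```

Note that this check is strict about types: a JSON number without a decimal point parses as `int` and would not pass a `float` check, which is one reason to prefer a validation library that handles coercion.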
Strategy 2: Function Calling
Function calling is more reliable than prompt-based extraction because the model fills in a declared parameter schema instead of producing free-form text.
OpenAI Function Calling
import json

from openai import OpenAI


class FunctionCallingExtractor:
    def __init__(self, api_key: str):
        self.client = OpenAI(api_key=api_key)

    def extract_with_function(
        self,
        prompt: str,
        function_name: str,
        function_description: str,
        parameters: dict
    ) -> dict:
        """Extract data using function calling."""
        response = self.client.chat.completions.create(
            model="gpt-4-turbo-preview",
            messages=[{"role": "user", "content": prompt}],
            tools=[
                {
                    "type": "function",
                    "function": {
                        "name": function_name,
                        "description": function_description,
                        "parameters": parameters
                    }
                }
            ],
            # Force the model to call this specific function
            tool_choice={"type": "function", "function": {"name": function_name}}
        )
        message = response.choices[0].message
        if not message.tool_calls:
            raise ValueError("Model did not return a tool call")
        # Arguments arrive as a JSON string and still need parsing
        return json.loads(message.tool_calls[0].function.arguments)
# Usage - Product Extraction
extractor = FunctionCallingExtractor(api_key="your-key")
product_data = extractor.extract_with_function(
prompt="Extract: 'Premium Wireless Mouse - $29.99, in stock, Electronics category'",
function_name="extract_product",
function_description="Extract product information",
parameters={
"type": "object",
"properties": {
"product_name": {
"type": "string",
"description": "Product name"
},
"price": {
"type": "number",
"description": "Product price in USD"
},
"in_stock": {
"type": "boolean",
"description": "Whether product is in stock"
},
"category": {
"type": "string",
"description": "Product category"
}
},
"required": ["product_name", "price", "in_stock"]
}
)
print(product_data)
# {'product_name': 'Premium Wireless Mouse', 'price': 29.99, 'in_stock': True, 'category': 'Electronics'}
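Even with a forced tool choice, it is worth parsing the tool call defensively, since the arguments are a JSON string that can be truncated or malformed. The sketch below shows one way to do that without a network call. `parse_tool_arguments` is a hypothetical helper, and `fake_message` is a stand-in built with `SimpleNamespace` that mimics only the attributes used above (`tool_calls`, `function.name`, `function.arguments`).

```python
import json
from types import SimpleNamespace


def parse_tool_arguments(message, expected_name: str) -> dict:
    """Return the parsed arguments of the expected tool call, or raise ValueError."""
    tool_calls = getattr(message, "tool_calls", None) or []
    for call in tool_calls:
        if call.function.name == expected_name:
            try:
                return json.loads(call.function.arguments)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"Arguments for {expected_name} are not valid JSON: {exc}"
                ) from exc
    raise ValueError(f"Model did not call {expected_name}")


# Stand-in for response.choices[0].message (hypothetical test data)
fake_message = SimpleNamespace(
    tool_calls=[
        SimpleNamespace(
            function=SimpleNamespace(
                name="extract_product",
                arguments='{"product_name": "Mouse", "price": 29.99}',
            )
        )
    ]
)
print(parse_tool_arguments(fake_message, "extract_product"))
# {'product_name': 'Mouse', 'price': 29.99}
```

Raising `ValueError` in both failure cases lets a retry loop treat a missing tool call and malformed arguments the same way.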
Complex Schema with Nested Objects
def extract_invoice_data(invoice_text: str) -> dict:
    """Extract structured invoice data."""
    extractor = FunctionCallingExtractor(api_key="your-key")
    return extractor.extract_with_function(
        prompt=f"Extract all information from this invoice:\n\n{invoice_text}",
        function_name="extract_invoice",
        function_description="Extract structured invoice information",
        parameters={
            "type": "object",
            "properties": {
                "invoice_number": {"type": "string"},
                "date": {"type": "string", "description": "ISO format date"},
                "vendor": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "address": {"type": "string"},
                        "tax_id": {"type": "string"}
                    },
                    "required": ["name"]
                },
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "quantity": {"type": "number"},
                            "unit_price": {"type": "number"},
                            "total": {"type": "number"}
                        },
                        "required": ["description", "quantity", "unit_price", "total"]
                    }
                },
                "subtotal": {"type": "number"},
                "tax": {"type": "number"},
                "total": {"type": "number"}
            },
            "required": ["invoice_number", "date", "vendor", "line_items", "total"]
        }
    )
# Usage
invoice_text = """
INVOICE #INV-2024-001
Date: 2024-01-15
Vendor: Acme Corp
Address: 123 Main St, City, State 12345
Tax ID: 12-3456789
Line Items:
1. Premium Widgets (10 × $5.00) = $50.00
2. Standard Gadgets (5 × $10.00) = $50.00
Subtotal: $100.00
Tax (8%): $8.00
Total: $108.00
"""
data = extract_invoice_data(invoice_text)
print(json.dumps(data, indent=2))
Anthropic Tool Use
from anthropic import Anthropic


class ClaudeToolExtractor:
    def __init__(self, api_key: str):
        self.client = Anthropic(api_key=api_key)

    def extract_with_tool(
        self,
        prompt: str,
        tool_name: str,
        tool_description: str,
        input_schema: dict
    ) -> dict:
        """Extract data using Claude's tool use."""
        message = self.client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=4096,
            tools=[
                {
                    "name": tool_name,
                    "description": tool_description,
                    "input_schema": input_schema
                }
            ],
            # Force this tool; otherwise the model may answer in plain text
            tool_choice={"type": "tool", "name": tool_name},
            messages=[{"role": "user", "content": prompt}]
        )
        # The tool input arrives as an already-parsed dict
        for content in message.content:
            if content.type == "tool_use":
                return content.input
        raise ValueError("No tool use found in response")
# Usage
claude_extractor = ClaudeToolExtractor(api_key="your-key")
contact_data = claude_extractor.extract_with_tool(
prompt="Extract contact info: 'John Doe, john@example.com, +1-555-0123, San Francisco, CA'",
tool_name="extract_contact",
tool_description="Extract contact information",
input_schema={
"type": "object",
"properties": {
"name": {"type": "string"},
"email": {"type": "string"},
"phone": {"type": "string"},
"city": {"type": "string"},
"state": {"type": "string"}
},
"required": ["name", "email"]
}
)
print(contact_data)
Strategy 3: Pydantic Validation
Combine function calling with Pydantic for type-safe extraction.
Basic Pydantic Model
# Pydantic v2 API (field_validator, model_validate, model_dump)
from datetime import date
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError, field_validator


class Product(BaseModel):
    product_name: str = Field(..., min_length=1, max_length=200)
    price: float = Field(..., gt=0)
    in_stock: bool
    categories: List[str] = Field(default_factory=list)
    sku: Optional[str] = None

    @field_validator('price')
    @classmethod
    def validate_price(cls, v: float) -> float:
        if v > 100000:
            raise ValueError('Price seems unreasonably high')
        return round(v, 2)

    @field_validator('product_name')
    @classmethod
    def clean_name(cls, v: str) -> str:
        return v.strip().title()


class PydanticExtractor:
    def __init__(self, llm_extractor):
        # llm_extractor must expose extract_with_function(),
        # e.g. the FunctionCallingExtractor defined earlier
        self.llm = llm_extractor

    def extract_and_validate(
        self,
        prompt: str,
        model_class: type[BaseModel]
    ) -> BaseModel:
        """Extract data and validate with Pydantic."""
        # Convert the Pydantic model to a JSON Schema for function calling
        schema = model_class.model_json_schema()
        # Extract using function calling
        raw_data = self.llm.extract_with_function(
            prompt=prompt,
            function_name=f"extract_{model_class.__name__.lower()}",
            function_description=f"Extract {model_class.__name__} data",
            parameters=schema
        )
        # Validate and parse with Pydantic
        try:
            return model_class.model_validate(raw_data)
        except ValidationError as e:
            raise ValueError(f"Validation failed: {e}") from e


# Usage
llm_extractor = FunctionCallingExtractor(api_key="your-key")
pydantic_extractor = PydanticExtractor(llm_extractor)
product = pydantic_extractor.extract_and_validate(
    prompt="Extract: 'premium WIRELESS mouse - $29.99, in stock, electronics'",
    model_class=Product
)
print(product.product_name)  # "Premium Wireless Mouse" (cleaned and formatted)
print(product.price)  # 29.99 (validated and rounded)
print(product.in_stock)  # True
print(product.model_dump())  # Get as dictionary
Complex Nested Models
from datetime import date
from decimal import Decimal
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationInfo, field_validator


class Currency(str, Enum):
    USD = "USD"
    EUR = "EUR"
    GBP = "GBP"


class Address(BaseModel):
    street: str
    city: str
    state: Optional[str] = None
    postal_code: str
    country: str = "USA"


class Vendor(BaseModel):
    name: str
    address: Address
    tax_id: Optional[str] = None
    email: Optional[str] = None

    @field_validator('email')
    @classmethod
    def validate_email(cls, v: Optional[str]) -> Optional[str]:
        if v and '@' not in v:
            raise ValueError('Invalid email format')
        return v


class LineItem(BaseModel):
    description: str
    quantity: int = Field(..., gt=0)
    unit_price: Decimal = Field(..., gt=0)
    total: Decimal = Field(..., gt=0)

    @field_validator('total')
    @classmethod
    def validate_total(cls, v: Decimal, info: ValidationInfo) -> Decimal:
        # info.data holds the fields declared (and validated) before 'total'
        values = info.data
        if 'quantity' in values and 'unit_price' in values:
            expected = values['quantity'] * values['unit_price']
            if abs(v - expected) > Decimal('0.01'):
                raise ValueError(f'Total {v} does not match quantity × price')
        return v


class Invoice(BaseModel):
    model_config = ConfigDict(use_enum_values=True)

    invoice_number: str
    date: date
    vendor: Vendor
    line_items: List[LineItem] = Field(..., min_length=1)
    subtotal: Decimal
    tax: Decimal = Field(..., ge=0)
    total: Decimal
    currency: Currency = Currency.USD
    notes: Optional[str] = None

    @field_validator('total')
    @classmethod
    def validate_total(cls, v: Decimal, info: ValidationInfo) -> Decimal:
        values = info.data
        if 'subtotal' in values and 'tax' in values:
            expected = values['subtotal'] + values['tax']
            if abs(v - expected) > Decimal('0.01'):
                raise ValueError('Total does not match subtotal + tax')
        return v


# Usage - Extract and validate complex invoice
invoice = pydantic_extractor.extract_and_validate(
    prompt=invoice_text,
    model_class=Invoice
)
# All fields are now typed and validated
print(f"Invoice: {invoice.invoice_number}")
print(f"Total: {invoice.currency} {invoice.total}")
print(f"Vendor: {invoice.vendor.name}")
for item in invoice.line_items:
    print(f"  {item.description}: {item.quantity} × ${item.unit_price} = ${item.total}")
Auto-Retry on Validation Failure
class RobustPydanticExtractor:
    def __init__(self, llm_extractor, max_retries: int = 3):
        self.llm = llm_extractor
        self.max_retries = max_retries

    def extract_with_retry(
        self,
        prompt: str,
        model_class: type[BaseModel]
    ) -> BaseModel:
        """Extract with automatic retry on validation failure."""
        last_error = None
        schema = model_class.model_json_schema()
        for attempt in range(self.max_retries):
            # Add validation error context to the prompt when retrying
            if last_error:
                enhanced_prompt = (
                    f"{prompt}\n\n"
                    f"PREVIOUS ATTEMPT FAILED with error:\n{last_error}\n\n"
                    "Please correct the issue and try again."
                )
            else:
                enhanced_prompt = prompt
            try:
                # Extract
                raw_data = self.llm.extract_with_function(
                    prompt=enhanced_prompt,
                    function_name=f"extract_{model_class.__name__.lower()}",
                    function_description=f"Extract {model_class.__name__} data",
                    parameters=schema
                )
                # Validate
                return model_class(**raw_data)
            except ValueError as e:
                # pydantic.ValidationError and json.JSONDecodeError both
                # subclass ValueError; other errors (network, auth) propagate
                last_error = str(e)
        raise ValueError(f"Failed after {self.max_retries} attempts: {last_error}")
# Usage
robust_extractor = RobustPydanticExtractor(llm_extractor, max_retries=3)
# Will automatically retry if validation fails
invoice = robust_extractor.extract_with_retry(
prompt=invoice_text,
model_class=Invoice
)
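The retry-with-feedback idea does not depend on any particular client. Here is a minimal sketch of the same loop written against two plain callables so that it runs without an API key. `fake_model` and `require_positive_price` are hypothetical stand-ins for the model call and the validator.

```python
from typing import Callable


def extract_with_feedback(
    call_model: Callable[[str], dict],
    validate: Callable[[dict], dict],
    prompt: str,
    max_retries: int = 3,
) -> tuple[dict, int]:
    """Retry extraction, appending the previous error to the prompt each time."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        if last_error is None:
            current = prompt
        else:
            current = f"{prompt}\n\nPREVIOUS ATTEMPT FAILED with error:\n{last_error}"
        try:
            return validate(call_model(current)), attempt
        except ValueError as exc:
            last_error = str(exc)
    raise ValueError(f"Failed after {max_retries} attempts: {last_error}")


# Hypothetical stand-ins for the model call and the validator
def fake_model(prompt: str) -> dict:
    # Pretends the model only fixes the price once it sees the error message
    return {"price": 29.99} if "PREVIOUS ATTEMPT FAILED" in prompt else {"price": -1}


def require_positive_price(data: dict) -> dict:
    if data["price"] <= 0:
        raise ValueError("price must be greater than 0")
    return data


result, attempts = extract_with_feedback(fake_model, require_positive_price, "Extract the price")
print(result, attempts)  # {'price': 29.99} 2
```

Keeping the loop independent of the client also makes it straightforward to unit test the retry behavior itself.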
Strategy 4: Schema Evolution and Versioning
Handle schema changes gracefully.
Version-Aware Models
from typing import ClassVar, List


class ProductV1(BaseModel):
    # ClassVar keeps the version out of the model's fields
    schema_version: ClassVar[int] = 1
    name: str
    price: float


class ProductV2(BaseModel):
    schema_version: ClassVar[int] = 2
    name: str
    price: float
    in_stock: bool
    categories: List[str] = []


class ProductV3(BaseModel):
    schema_version: ClassVar[int] = 3
    name: str
    price: float
    in_stock: bool
    categories: List[str] = []
    vendor: str


class VersionedExtractor:
    def __init__(self, llm_extractor):
        # llm_extractor must expose extract_and_validate(), e.g. PydanticExtractor
        self.llm = llm_extractor
        self.versions = {
            1: ProductV1,
            2: ProductV2,
            3: ProductV3
        }

    def extract_latest(self, prompt: str) -> BaseModel:
        """Extract using latest schema version."""
        latest_version = max(self.versions.keys())
        model_class = self.versions[latest_version]
        return self.llm.extract_and_validate(prompt, model_class)

    def extract_with_fallback(self, prompt: str) -> BaseModel:
        """Try latest version, fall back to older on failure."""
        oldest = min(self.versions)
        for version in sorted(self.versions.keys(), reverse=True):
            try:
                model_class = self.versions[version]
                return self.llm.extract_and_validate(prompt, model_class)
            except Exception:
                if version == oldest:
                    raise

    def migrate_version(
        self,
        data: BaseModel,
        target_version: int
    ) -> BaseModel:
        """Migrate data between schema versions."""
        current_version = getattr(type(data), 'schema_version', 1)
        target_class = self.versions[target_version]
        # Convert to dict
        data_dict = data.model_dump()
        # Apply migrations
        if current_version < target_version:
            data_dict = self.migrate_up(data_dict, current_version, target_version)
        elif current_version > target_version:
            data_dict = self.migrate_down(data_dict, current_version, target_version)
        return target_class(**data_dict)

    def migrate_up(self, data: dict, current: int, target: int) -> dict:
        # Placeholder: supply defaults for fields added in newer versions
        raise NotImplementedError

    def migrate_down(self, data: dict, current: int, target: int) -> dict:
        # Placeholder: drop fields that older versions do not define
        raise NotImplementedError
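The upward migration step is not spelled out above. One possible shape, sketched here on plain dicts, is a table of single-step upgrades applied in order. The default values (`True`, `[]`, `"unknown"`) are placeholders chosen for illustration; a real migration would need defaults that make sense for the data.

```python
from typing import Callable, Dict

# Each entry upgrades a plain dict from version N to N + 1.
# The defaults are hypothetical placeholders.
UPGRADES: Dict[int, Callable[[dict], dict]] = {
    1: lambda d: {**d, "in_stock": True, "categories": []},  # V1 -> V2
    2: lambda d: {**d, "vendor": "unknown"},                  # V2 -> V3
}


def migrate_up(data: dict, current_version: int, target_version: int) -> dict:
    """Apply upgrade steps one version at a time."""
    for version in range(current_version, target_version):
        data = UPGRADES[version](data)
    return data


v1 = {"name": "Wireless Mouse", "price": 29.99}
print(migrate_up(v1, 1, 3))
# {'name': 'Wireless Mouse', 'price': 29.99, 'in_stock': True, 'categories': [], 'vendor': 'unknown'}
```

Chaining single-step upgrades means each new schema version only needs one new entry, rather than a migration from every earlier version.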
Optional Fields for Flexibility
from typing import List, Optional

from pydantic import BaseModel, ConfigDict, Field


class FlexibleProduct(BaseModel):
    # Allow extra fields (use 'ignore' to silently drop extras)
    model_config = ConfigDict(extra='allow')

    # Core required fields
    name: str
    price: float
    # Optional fields with defaults
    description: Optional[str] = None
    in_stock: bool = True
    categories: List[str] = Field(default_factory=list)
    sku: Optional[str] = None
    vendor: Optional[str] = None
    dimensions: Optional[dict] = None

    @classmethod
    def from_partial(cls, data: dict) -> 'FlexibleProduct':
        """Create from potentially incomplete data."""
        # Copy so the caller's dict is not mutated
        data = dict(data)
        # Set defaults for missing required fields
        data.setdefault('name', 'Unknown Product')
        data.setdefault('price', 0.0)
        return cls(**data)
Strategy 5: Batch Extraction
Process multiple items efficiently.
Batch Processing
import logging
from typing import List

logger = logging.getLogger(__name__)


class BatchExtractor:
    def __init__(self, llm_extractor):
        self.llm = llm_extractor

    def extract_batch(
        self,
        items: List[str],
        model_class: type[BaseModel],
        batch_size: int = 10
    ) -> List[BaseModel]:
        """Extract multiple items in batches."""
        results = []
        for i in range(0, len(items), batch_size):
            batch = items[i:i + batch_size]
            # Create batch prompt
            batch_prompt = self.create_batch_prompt(batch, model_class)
            # Extract as array (wrapped in an object, as tool parameters must be)
            schema = {
                "type": "object",
                "properties": {
                    "items": {
                        "type": "array",
                        "items": model_class.model_json_schema()
                    }
                },
                "required": ["items"]
            }
            raw_data = self.llm.extract_with_function(
                prompt=batch_prompt,
                function_name="extract_batch",
                function_description=f"Extract batch of {model_class.__name__}",
                parameters=schema
            )
            # Validate each item
            for item_data in raw_data['items']:
                try:
                    results.append(model_class(**item_data))
                except Exception as e:
                    # Log error but continue with the remaining items
                    logger.error(f"Validation failed for item: {e}")
        return results

    def create_batch_prompt(self, items: List[str], model_class: type[BaseModel]) -> str:
        """Create prompt for batch extraction."""
        items_text = "\n".join(f"{i+1}. {item}" for i, item in enumerate(items))
        return (
            "Extract information from these items:\n"
            f"{items_text}\n"
            f"Return as an array of {model_class.__name__} objects."
        )


# Usage
extractor = BatchExtractor(llm_extractor)
product_texts = [
    "Premium Wireless Mouse - $29.99, in stock",
    "Mechanical Keyboard - $89.99, out of stock",
    "USB-C Hub - $49.99, in stock"
]
products = extractor.extract_batch(product_texts, Product, batch_size=10)
for product in products:
    print(f"{product.product_name}: ${product.price}")
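One caveat with batch extraction is that the returned list loses its link to the inputs when an item is dropped or fails validation. A possible mitigation, sketched below, is to ask the model to echo each item's number and then align results by that number. The `index` field is an assumption: it would have to be added to the schema and the prompt, and the helper names are made up for illustration.

```python
from typing import Optional


def chunk(items: list[str], batch_size: int) -> list[list[str]]:
    """Split items into consecutive batches of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def align_results(batch: list[str], extracted: list[dict]) -> list[Optional[dict]]:
    """Place each extracted dict at the slot named by its 1-based 'index' field."""
    aligned: list[Optional[dict]] = [None] * len(batch)
    for item in extracted:
        position = item.get("index")
        if isinstance(position, int) and 1 <= position <= len(batch):
            aligned[position - 1] = item
    return aligned


batch = ["Mouse - $29.99", "Keyboard - $89.99", "Hub - $49.99"]
# Illustrative model output: item 2 is missing and the order is shuffled
extracted = [{"index": 3, "name": "Hub"}, {"index": 1, "name": "Mouse"}]
aligned = align_results(batch, extracted)
print(aligned)
# [{'index': 1, 'name': 'Mouse'}, None, {'index': 3, 'name': 'Hub'}]
```

Slots left as `None` identify exactly which inputs need a retry, instead of re-running the whole batch.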
Parallel Processing
import asyncio
import logging
from typing import List, Optional

logger = logging.getLogger(__name__)


class AsyncBatchExtractor:
    def __init__(self, llm_extractor, max_concurrent: int = 5):
        self.llm = llm_extractor
        # Caps the number of requests in flight at once
        self.semaphore = asyncio.Semaphore(max_concurrent)

    async def extract_one_async(
        self,
        item: str,
        model_class: type[BaseModel]
    ) -> Optional[BaseModel]:
        """Extract single item asynchronously."""
        async with self.semaphore:
            try:
                # extract_with_function_async is a placeholder: the extractors
                # above are synchronous, so substitute an actual async LLM call
                raw_data = await self.llm.extract_with_function_async(
                    prompt=f"Extract: {item}",
                    function_name=f"extract_{model_class.__name__.lower()}",
                    function_description=f"Extract {model_class.__name__}",
                    parameters=model_class.model_json_schema()
                )
                return model_class(**raw_data)
            except Exception as e:
                logger.error(f"Extraction failed: {e}")
                return None

    async def extract_batch_async(
        self,
        items: List[str],
        model_class: type[BaseModel]
    ) -> List[BaseModel]:
        """Extract multiple items in parallel."""
        tasks = [
            self.extract_one_async(item, model_class)
            for item in items
        ]
        results = await asyncio.gather(*tasks)
        # Filter out None results
        return [r for r in results if r is not None]


# Usage
async def main():
    async_extractor = AsyncBatchExtractor(llm_extractor, max_concurrent=5)
    products = await async_extractor.extract_batch_async(
        product_texts,
        Product
    )
    print(f"Extracted {len(products)} products")

# Run
asyncio.run(main())
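When only a synchronous extractor is available, one option is to run it in worker threads with `asyncio.to_thread` (Python 3.9+) behind the same semaphore pattern. The sketch below uses `slow_extract`, a hypothetical stand-in for a blocking LLM call, so that it runs without network access.

```python
import asyncio
import time


def slow_extract(item: str) -> dict:
    """Stand-in for a blocking LLM call (hypothetical)."""
    time.sleep(0.05)
    return {"text": item, "length": len(item)}


async def extract_all(items: list[str], max_concurrent: int = 2) -> list[dict]:
    semaphore = asyncio.Semaphore(max_concurrent)

    async def one(item: str) -> dict:
        async with semaphore:
            # Run the blocking call in a worker thread so the event loop stays free
            return await asyncio.to_thread(slow_extract, item)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(one(item) for item in items))


results = asyncio.run(extract_all(["mouse", "keyboard", "hub"]))
print([r["length"] for r in results])  # [5, 8, 3]
```

Threads suit this case because the work is I/O-bound; a client library with native async support would avoid the thread overhead.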
Production Error Handling
Comprehensive Error Handling
import json
from typing import Optional, Tuple

from pydantic import BaseModel


class ExtractionError(Exception):
    """Base exception for extraction errors."""


class DataValidationError(ExtractionError):
    """Data failed validation (named to avoid shadowing pydantic.ValidationError)."""


class ParseError(ExtractionError):
    """Failed to parse LLM output."""


class ExtractionResult(BaseModel):
    success: bool
    data: Optional[BaseModel] = None
    error: Optional[str] = None
    attempts: int = 1


class ProductionExtractor:
    def __init__(self, llm_extractor, max_retries: int = 3):
        # llm_extractor must expose extract_and_validate(), e.g. PydanticExtractor
        self.llm = llm_extractor
        self.max_retries = max_retries

    def extract_safe(
        self,
        prompt: str,
        model_class: type[BaseModel]
    ) -> ExtractionResult:
        """Extract with comprehensive error handling."""
        error = "Max retries exceeded"
        for attempt in range(self.max_retries):
            try:
                # Extract
                result = self.llm.extract_and_validate(prompt, model_class)
                return ExtractionResult(
                    success=True,
                    data=result,
                    attempts=attempt + 1
                )
            except json.JSONDecodeError as e:
                # Retryable: the model returned malformed JSON
                error = f"Parse error: {e}"
            except ValueError as e:
                # Retryable: pydantic.ValidationError subclasses ValueError, and
                # PydanticExtractor re-raises validation failures as ValueError
                error = f"Validation error: {e}"
            except Exception as e:
                # Not retryable (network, auth, etc.): give up immediately
                return ExtractionResult(
                    success=False,
                    error=f"Unexpected error: {e}",
                    attempts=attempt + 1
                )
        return ExtractionResult(
            success=False,
            error=error,
            attempts=self.max_retries
        )


# Usage
extractor = ProductionExtractor(pydantic_extractor, max_retries=3)
result = extractor.extract_safe(
    prompt="Extract: 'Wireless Mouse - $29.99'",
    model_class=Product
)
if result.success:
    print(f"Success after {result.attempts} attempts")
    print(result.data)
else:
    print(f"Failed: {result.error}")
    # Log error, use fallback, alert team, etc.
Fallback Strategies
import logging

logger = logging.getLogger(__name__)


class FallbackExtractor:
    def __init__(self, primary_llm, fallback_llm):
        # Both must expose extract_and_validate(), e.g. PydanticExtractor
        self.primary = primary_llm
        self.fallback = fallback_llm

    def extract_with_fallback(
        self,
        prompt: str,
        model_class: type[BaseModel]
    ) -> Tuple[BaseModel, str]:
        """Try primary LLM, fall back to secondary on failure."""
        # Try primary
        try:
            result = self.primary.extract_and_validate(prompt, model_class)
            return result, "primary"
        except Exception as primary_error:
            logger.warning(f"Primary extraction failed: {primary_error}")
        # Try fallback
        try:
            result = self.fallback.extract_and_validate(prompt, model_class)
            return result, "fallback"
        except Exception as fallback_error:
            logger.error(f"Fallback extraction failed: {fallback_error}")
            raise ExtractionError("Both primary and fallback failed") from fallback_error

    def extract_with_template_fallback(
        self,
        prompt: str,
        model_class: type[BaseModel],
        template_data: Optional[dict] = None
    ) -> BaseModel:
        """Fall back to template if extraction fails."""
        try:
            return self.primary.extract_and_validate(prompt, model_class)
        except Exception:
            if template_data:
                # Use template with partial data
                return model_class(**template_data)
            else:
                # Use empty/default template; model_construct() skips validation,
                # so required fields may be missing on the returned object
                return model_class.model_construct()
Monitoring and Quality Metrics
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ExtractionMetrics:
    total_attempts: int = 0
    successful: int = 0
    failed: int = 0
    retries: int = 0
    validation_errors: int = 0
    parse_errors: int = 0
    avg_attempts: float = 0.0

    def success_rate(self) -> float:
        return self.successful / self.total_attempts if self.total_attempts > 0 else 0.0


class MonitoredExtractor:
    def __init__(self, llm_extractor):
        # llm_extractor must expose extract_safe(), e.g. ProductionExtractor
        self.llm = llm_extractor
        self.metrics = ExtractionMetrics()
        self.extraction_history = []

    def extract_monitored(
        self,
        prompt: str,
        model_class: type[BaseModel]
    ) -> ExtractionResult:
        """Extract with metrics tracking."""
        start_time = datetime.now()
        self.metrics.total_attempts += 1
        result = self.llm.extract_safe(prompt, model_class)
        # Update metrics
        if result.success:
            self.metrics.successful += 1
        else:
            self.metrics.failed += 1
            error_text = (result.error or "").lower()
            if "validation" in error_text:
                self.metrics.validation_errors += 1
            elif "parse" in error_text:
                self.metrics.parse_errors += 1
        if result.attempts > 1:
            self.metrics.retries += result.attempts - 1
        # Track history
        self.extraction_history.append({
            'timestamp': start_time,
            'duration': (datetime.now() - start_time).total_seconds(),
            'success': result.success,
            'attempts': result.attempts,
            'model': model_class.__name__
        })
        return result

    def get_report(self) -> dict:
        """Generate metrics report."""
        history = self.extraction_history
        avg_attempts = sum(h['attempts'] for h in history) / len(history) if history else 0
        return {
            'success_rate': f"{self.metrics.success_rate():.1%}",
            'total_attempts': self.metrics.total_attempts,
            'successful': self.metrics.successful,
            'failed': self.metrics.failed,
            'retries': self.metrics.retries,
            'validation_errors': self.metrics.validation_errors,
            'parse_errors': self.metrics.parse_errors,
            'avg_attempts': avg_attempts
        }
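The history entries recorded above carry enough information to break results down further. As a minimal sketch, the function below computes a success rate per model class from entries shaped like those in `extraction_history` (using only the `model` and `success` keys). The sample history is invented for illustration.

```python
from collections import defaultdict


def success_rate_by_model(history: list[dict]) -> dict[str, float]:
    """Return the success rate for each model class name in the history."""
    totals: dict[str, int] = defaultdict(int)
    wins: dict[str, int] = defaultdict(int)
    for entry in history:
        totals[entry['model']] += 1
        if entry['success']:
            wins[entry['model']] += 1
    return {model: wins[model] / totals[model] for model in totals}


# Illustrative history entries, not real measurements
history = [
    {'model': 'Product', 'success': True},
    {'model': 'Product', 'success': True},
    {'model': 'Invoice', 'success': True},
    {'model': 'Invoice', 'success': False},
]
print(success_rate_by_model(history))  # {'Product': 1.0, 'Invoice': 0.5}
```

A per-schema breakdown like this tends to point at the schemas that need simpler structure or better field descriptions.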
Conclusion: Reliable Structured Extraction
Structured data extraction from LLMs requires systematic approaches beyond simple prompting. Function calling, Pydantic validation, retry logic, and comprehensive error handling transform unreliable text generation into production-grade data extraction.
Key implementation principles:
- Use function calling: Native APIs provide highest reliability
- Validate with Pydantic: Type safety and validation catch errors
- Implement retry logic: Models often correct their output on a second attempt when given the validation error as context
- Handle failures gracefully: Not every extraction will succeed
- Monitor quality: Track success rates and common failure modes
The techniques in this guide enable building production systems that reliably extract structured data from LLM outputs—transforming creative text generators into trustworthy data processors.
Last Updated: December 2024


