When Basic Extraction Is Not Enough
Simple field extraction — name, email, amount — is the entry point. Production systems need more: streaming partial results to reduce perceived latency, classification with confidence scores, step-by-step reasoning before final answers, and pipelines that chain multiple extraction steps.
Instructor supports all of these patterns through standard Pydantic features plus a few Instructor-specific utilities.
Partial Streaming
Instructor's create_partial returns an iterable of partially-filled model instances as tokens stream in. This lets you start rendering UI before the full response is complete:
import instructor
from anthropic import Anthropic
from pydantic import BaseModel
from typing import Optional, List
client = instructor.from_anthropic(Anthropic())
class ProductAnalysis(BaseModel):
    product_name: str
    strengths: List[str]
    weaknesses: List[str]
    verdict: str
    score: Optional[float] = None

# create_partial streams partial objects as tokens arrive
partial_analysis = client.messages.create_partial(
    model="claude-haiku-4-5-20251001",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Analyse the iPhone 16 Pro for a developer audience"}],
    response_model=ProductAnalysis,
)

for partial in partial_analysis:
    # Each iteration: a partially-filled ProductAnalysis object
    if partial.product_name:
        print(f"\rAnalysing: {partial.product_name}", end="")
    if partial.strengths:
        print(f"\nStrengths so far: {len(partial.strengths)}")

# Last iteration is the complete, validated object
print(f"\nFinal score: {partial.score}")
create_partial is ideal for long-form structured responses where early fields can be shown to the user while later fields are still generating; for example, you can display a summary while the detailed breakdown is still streaming.
Multi-Label Classification with Confidence
Real support tickets rarely fit a single label. The models below return every applicable category, each with a confidence score, plus a single primary category for routing:
import instructor
from anthropic import Anthropic
from pydantic import BaseModel, Field
from typing import List, Literal
client = instructor.from_anthropic(Anthropic())
class CategoryScore(BaseModel):
    category: Literal["billing", "technical", "account", "feature-request", "complaint"]
    confidence: float = Field(ge=0.0, le=1.0, description="Confidence score 0-1")

class TicketClassification(BaseModel):
    categories: List[CategoryScore] = Field(
        description="All applicable categories with confidence scores, highest confidence first"
    )
    primary_category: Literal["billing", "technical", "account", "feature-request", "complaint"]
    urgency: Literal["low", "medium", "high", "critical"]
    needs_human: bool

result = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": "I've been charged twice this month and now I can't log in. Very frustrated."
    }],
    response_model=TicketClassification,
)

print(result.primary_category)  # billing
print(result.needs_human)       # True
for cat in result.categories:
    print(f"{cat.category}: {cat.confidence:.0%}")
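Acting on the scores downstream is then plain Python. A minimal sketch of a hypothetical `route` helper that thresholds confidences (the helper and the hand-constructed result are illustrative, not part of Instructor; no API call is made):

```python
from typing import List, Literal
from pydantic import BaseModel, Field

Category = Literal["billing", "technical", "account", "feature-request", "complaint"]

class CategoryScore(BaseModel):
    category: Category
    confidence: float = Field(ge=0.0, le=1.0)

class TicketClassification(BaseModel):
    categories: List[CategoryScore]
    primary_category: Category
    urgency: Literal["low", "medium", "high", "critical"]
    needs_human: bool

def route(result: TicketClassification, threshold: float = 0.5) -> list[str]:
    # One queue per category that clears the confidence threshold
    queues = [c.category for c in result.categories if c.confidence >= threshold]
    if result.needs_human or result.urgency in ("high", "critical"):
        queues.append("human-review")
    return queues

# Hand-constructed result standing in for an LLM response
example = TicketClassification(
    categories=[
        CategoryScore(category="billing", confidence=0.9),
        CategoryScore(category="technical", confidence=0.6),
        CategoryScore(category="complaint", confidence=0.3),
    ],
    primary_category="billing",
    urgency="high",
    needs_human=True,
)
print(route(example))  # ['billing', 'technical', 'human-review']
```

Because the classification is a validated Pydantic object, this routing logic needs no defensive parsing: the categories and confidences are guaranteed to be well-typed.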
Chain-of-Thought Before Structured Output
For complex reasoning tasks, add a reasoning field before the structured answer. The LLM works through the problem in the reasoning field, which improves the quality of the structured fields that follow:
from pydantic import BaseModel, Field
from typing import Literal
import instructor
from anthropic import Anthropic
client = instructor.from_anthropic(Anthropic())
class DiagnosisResult(BaseModel):
    reasoning: str = Field(
        description="Step-by-step analysis of the symptoms before reaching a conclusion"
    )
    likely_cause: str
    confidence: Literal["low", "medium", "high"]
    recommended_action: str
    escalate_to_human: bool

result = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Server CPU is at 100%, response times spiked 10 minutes ago, no recent deploys."
    }],
    response_model=DiagnosisResult,
)

print(result.reasoning)          # shows the step-by-step analysis
print(result.likely_cause)       # 'Runaway process or query'
print(result.escalate_to_human)  # True
The reasoning field serves two purposes: it improves accuracy (chain-of-thought effect), and it makes the output auditable — you can log the reasoning alongside the decision for explainability.
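One way to exploit that auditability is to serialise the whole result, reasoning included, into a log record. A sketch (the `audit_record` helper and its JSON shape are illustrative, not part of Instructor; the instance is hand-constructed so no API call is made):

```python
import json
from pydantic import BaseModel

class DiagnosisResult(BaseModel):
    reasoning: str
    likely_cause: str
    confidence: str
    escalate_to_human: bool

def audit_record(result: DiagnosisResult) -> str:
    # model_dump() keeps the chain-of-thought next to the decision it produced,
    # so the record can later explain *why* the system decided what it did
    return json.dumps({"decision": result.likely_cause, "audit": result.model_dump()})

record = audit_record(DiagnosisResult(
    reasoning="CPU spike with no recent deploy suggests a runaway process.",
    likely_cause="Runaway process or query",
    confidence="high",
    escalate_to_human=True,
))
print(record)
```

Writing the record through your normal structured-logging pipeline then gives you searchable, per-decision explanations for free.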
Multi-Step Extraction Pipelines
Some extraction tasks are too complex for a single call. Break them into stages where each stage's output feeds the next:
import instructor
from anthropic import Anthropic
from pydantic import BaseModel
from typing import List
client = instructor.from_anthropic(Anthropic())
# Stage 1: Extract raw entities
class RawEntities(BaseModel):
companies: List[str]
people: List[str]
dates: List[str]
# Stage 2: Enrich each entity
class EnrichedCompany(BaseModel):
name: str
role_in_document: str # 'acquirer', 'target', 'advisor', etc.
mentioned_amount: float | None
class EnrichedDocument(BaseModel):
companies: List[EnrichedCompany]
event_type: str
event_date: str | None
def extract_document(text: str) -> EnrichedDocument:
    # Stage 1: fast, cheap extraction
    entities = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        messages=[{"role": "user", "content": f"Extract all companies, people, and dates from:\n{text}"}],
        response_model=RawEntities,
    )
    # Stage 2: richer extraction with context from stage 1
    enriched = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": (
                f"Document: {text}\n\n"
                f"Known entities: companies={entities.companies}, people={entities.people}\n\n"
                "Now extract enriched company details and the event type."
            )
        }],
        response_model=EnrichedDocument,
    )
    return enriched
Run Stage 1 with a fast, cheap model (Haiku, GPT-4o-mini) to identify entities, and Stage 2 with a more capable model only for the entities that need enrichment. This cuts cost significantly compared to running the capable model over the full document.
Iterable Extraction: Multiple Objects from One Document
When a document contains multiple instances of the same entity (e.g. all jobs from a careers page), use create_iterable:
import instructor
from anthropic import Anthropic
from pydantic import BaseModel
from typing import Iterable
client = instructor.from_anthropic(Anthropic())
class JobListing(BaseModel):
    title: str
    department: str
    location: str
    salary_range: str | None
jobs_text = """
We are hiring:
- Senior Engineer, Platform team, Remote, $180-220k
- Product Designer, Design team, New York, $120-150k
- Data Scientist, ML team, San Francisco
"""
jobs = client.messages.create_iterable(
    model="claude-haiku-4-5-20251001",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"Extract all job listings:\n{jobs_text}"}],
    response_model=JobListing,
)

for job in jobs:  # iterates as each JobListing is extracted
    print(f"{job.title} — {job.department}")
Async Support
The same patterns work asynchronously: build the client from AsyncAnthropic and fan out concurrent extractions with asyncio.gather:
import instructor
from anthropic import AsyncAnthropic
from pydantic import BaseModel
import asyncio
async_client = instructor.from_anthropic(AsyncAnthropic())

class Summary(BaseModel):
    title: str
    key_points: list[str]

async def summarise_many(texts: list[str]) -> list[Summary]:
    tasks = [
        async_client.messages.create(
            model="claude-haiku-4-5-20251001",
            max_tokens=512,
            messages=[{"role": "user", "content": f"Summarise:\n{text}"}],
            response_model=Summary,
        )
        for text in texts
    ]
    return await asyncio.gather(*tasks)
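An unbounded asyncio.gather over a large batch can trip provider rate limits. A common refinement, sketched here with stand-in coroutines instead of real API calls (the `bounded_gather` helper is illustrative, not part of Instructor), caps in-flight requests with a semaphore:

```python
import asyncio

async def bounded_gather(coros, limit: int = 5):
    # Allow at most `limit` coroutines to run concurrently
    sem = asyncio.Semaphore(limit)

    async def run(coro):
        async with sem:
            return await coro

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(run(c) for c in coros))

# Stand-in for an Instructor API call
async def fake_call(i: int) -> int:
    await asyncio.sleep(0)
    return i * 2

results = asyncio.run(bounded_gather([fake_call(i) for i in range(4)]))
print(results)  # [0, 2, 4, 6]
```

Swapping `fake_call` for the `async_client.messages.create(...)` coroutines from summarise_many gives you the same concurrency with a safety cap.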
Summary
Instructor's advanced features — partial streaming, multi-label classification, chain-of-thought, multi-step pipelines, and iterable extraction — all build on the same Pydantic foundation as basic extraction. The library stays out of your way: there is no Instructor-specific DSL to learn, just Python type hints and Pydantic validators. The complexity ceiling is high enough for production document processing pipelines, classification systems, and any scenario where you need reliable structured data from unstructured text.