When Basic Extraction Is Not Enough

Simple field extraction — name, email, amount — is the entry point. Production systems need more: streaming partial results to reduce perceived latency, classification with confidence scores, step-by-step reasoning before final answers, and pipelines that chain multiple extraction steps.

Instructor supports all of these patterns through standard Pydantic features plus a few Instructor-specific utilities.

Partial Streaming

Instructor's create_partial returns an iterable of partially-filled model instances as tokens stream in. This lets you start rendering UI before the full response is complete:

import instructor
from anthropic import Anthropic
from pydantic import BaseModel
from typing import Optional, List
 
client = instructor.from_anthropic(Anthropic())
 
class ProductAnalysis(BaseModel):
    product_name: str
    strengths: List[str]
    weaknesses: List[str]
    verdict: str
    score: Optional[float] = None
 
# create_partial streams partial objects as tokens arrive
partial_analysis = client.messages.create_partial(
    model="claude-haiku-4-5-20251001",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Analyse the iPhone 16 Pro for a developer audience"}],
    response_model=ProductAnalysis
)
 
for partial in partial_analysis:
    # Each iteration: a partially-filled ProductAnalysis object
    if partial.product_name:
        print(f"\rAnalysing: {partial.product_name}", end="")
    if partial.strengths:
        print(f"\nStrengths so far: {len(partial.strengths)}")
 
# After the loop, `partial` holds the final, fully-populated object
print(f"\nFinal score: {partial.score}")
 
create_partial is ideal for long-form structured responses where early fields can be shown to the user while later fields are still generating — for example, showing a summary while a detailed breakdown is still streaming.
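
To render only fields that have just finished arriving, you can diff consecutive partial snapshots. This is a minimal sketch using plain dicts (as `model_dump()` would produce) in place of real partial objects; `newly_completed` is a hypothetical helper, not part of Instructor's API:

```python
from typing import Any, Dict, List

def newly_completed(previous: Dict[str, Any], current: Dict[str, Any]) -> List[str]:
    """Return field names that were empty in the previous partial but now
    hold a value, i.e. fields that are safe to render for the first time.
    Note: a list field counts as soon as its first item arrives."""
    return [
        name for name, value in current.items()
        if value not in (None, [], "") and previous.get(name) in (None, [], "")
    ]

# Simulated sequence of partial snapshots, oldest first
snapshots = [
    {"product_name": None, "strengths": [], "verdict": None},
    {"product_name": "iPhone 16 Pro", "strengths": [], "verdict": None},
    {"product_name": "iPhone 16 Pro", "strengths": ["battery"], "verdict": None},
]

previous = snapshots[0]
for current in snapshots[1:]:
    for field in newly_completed(previous, current):
        print(f"render: {field} = {current[field]}")
    previous = current
```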

Multi-Label Classification with Confidence

Support tickets rarely fit a single bucket. Model every applicable category with its own confidence score, and have the model commit to one primary category as well:

import instructor
from anthropic import Anthropic
from pydantic import BaseModel, Field
from typing import List, Literal
 
client = instructor.from_anthropic(Anthropic())
 
class CategoryScore(BaseModel):
    category: Literal["billing", "technical", "account", "feature-request", "complaint"]
    confidence: float = Field(ge=0.0, le=1.0, description="Confidence score 0-1")
 
class TicketClassification(BaseModel):
    categories: List[CategoryScore] = Field(
        description="All applicable categories with confidence scores, highest confidence first"
    )
    primary_category: Literal["billing", "technical", "account", "feature-request", "complaint"]
    urgency: Literal["low", "medium", "high", "critical"]
    needs_human: bool
 
result = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": "I've been charged twice this month and now I can't log in. Very frustrated."
    }],
    response_model=TicketClassification
)
 
print(result.primary_category)   # billing
print(result.needs_human)        # True
for cat in result.categories:
    print(f"{cat.category}: {cat.confidence:.0%}")
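
The per-category confidences become useful when you route on them, for example falling back to a human whenever the top score is weak. A sketch with a hypothetical `route` helper and threshold (tune the floor against labelled tickets):

```python
from typing import List, Tuple

CONFIDENCE_FLOOR = 0.6  # assumed threshold; calibrate on real data

def route(categories: List[Tuple[str, float]], needs_human: bool) -> str:
    """Pick a queue from (category, confidence) pairs. Fall back to a
    human when the model flags it or the top confidence is weak."""
    top_category, top_confidence = max(categories, key=lambda pair: pair[1])
    if needs_human or top_confidence < CONFIDENCE_FLOOR:
        return "human-review"
    return f"auto:{top_category}"

print(route([("billing", 0.92), ("technical", 0.55)], needs_human=False))  # auto:billing
print(route([("complaint", 0.45)], needs_human=False))                     # human-review
```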
 

Chain-of-Thought Before Structured Output

For complex reasoning tasks, add a reasoning field before the structured answer. The LLM works through the problem in the reasoning field, which improves the quality of the structured fields that follow:

from pydantic import BaseModel, Field
from typing import Literal
import instructor
from anthropic import Anthropic
 
client = instructor.from_anthropic(Anthropic())
 
class DiagnosisResult(BaseModel):
    reasoning: str = Field(
        description="Step-by-step analysis of the symptoms before reaching a conclusion"
    )
    likely_cause: str
    confidence: Literal["low", "medium", "high"]
    recommended_action: str
    escalate_to_human: bool
 
result = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Server CPU is at 100%, response times spiked 10 minutes ago, no recent deploys."
    }],
    response_model=DiagnosisResult
)
 
print(result.reasoning)          # shows the step-by-step analysis
print(result.likely_cause)       # 'Runaway process or query'
print(result.escalate_to_human)  # True
 

The reasoning field serves two purposes: it improves accuracy (chain-of-thought effect), and it makes the output auditable — you can log the reasoning alongside the decision for explainability.
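
Logging that audit trail can be as simple as serialising the reasoning next to the decision it produced. A sketch with a dataclass standing in for the DiagnosisResult model above (`audit_record` is a hypothetical helper):

```python
import json
from dataclasses import dataclass

@dataclass
class Decision:  # stand-in for the DiagnosisResult Pydantic model
    reasoning: str
    likely_cause: str
    escalate_to_human: bool

def audit_record(decision: Decision, request_id: str) -> str:
    """Serialise the decision together with its reasoning, so the 'why'
    is stored next to the 'what' in the log stream."""
    return json.dumps({
        "request_id": request_id,
        "likely_cause": decision.likely_cause,
        "escalated": decision.escalate_to_human,
        "reasoning": decision.reasoning,
    })

record = audit_record(
    Decision("CPU spike with no recent deploy suggests a runaway process.",
             "Runaway process", True),
    request_id="req-123",
)
print(record)
```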

Multi-Step Extraction Pipelines

Some extraction tasks are too complex for a single call. Break them into stages where each stage's output feeds the next:

import instructor
from anthropic import Anthropic
from pydantic import BaseModel
from typing import List
 
client = instructor.from_anthropic(Anthropic())
 
# Stage 1: Extract raw entities
class RawEntities(BaseModel):
    companies: List[str]
    people: List[str]
    dates: List[str]
 
# Stage 2: Enrich each entity
class EnrichedCompany(BaseModel):
    name: str
    role_in_document: str  # 'acquirer', 'target', 'advisor', etc.
    mentioned_amount: float | None
 
class EnrichedDocument(BaseModel):
    companies: List[EnrichedCompany]
    event_type: str
    event_date: str | None
 
def extract_document(text: str) -> EnrichedDocument:
    # Stage 1: fast, cheap extraction
    entities = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        messages=[{"role": "user", "content": f"Extract all companies, people, and dates from:\n{text}"}],
        response_model=RawEntities
    )
 
    # Stage 2: richer extraction with context from stage 1
    enriched = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": (
                f"Document: {text}\n\n"
                f"Known entities: companies={entities.companies}, people={entities.people}\n\n"
                "Now extract enriched company details and the event type."
            )
        }],
        response_model=EnrichedDocument
    )
 
    return enriched
 
Run Stage 1 with a fast, cheap model (Haiku, GPT-4o-mini) to identify entities, then run Stage 2 with a more capable model, and only when Stage 1 actually found entities worth enriching. The example above uses Haiku for both stages for brevity; in production, swap a stronger model into Stage 2. This reduces cost significantly compared to running a capable model over the full document in a single pass.
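
Much of the saving comes from skipping Stage 2 whenever Stage 1 comes back empty. A minimal sketch of that guard, with stub stage functions standing in for the two API calls above (all names here are hypothetical):

```python
from typing import Callable, List, Optional

def extract_with_short_circuit(
    text: str,
    cheap_stage: Callable[[str], List[str]],            # stand-in for the Stage 1 call
    expensive_stage: Callable[[str, List[str]], dict],  # stand-in for the Stage 2 call
) -> Optional[dict]:
    """Only pay for the expensive Stage 2 call when Stage 1 found something."""
    companies = cheap_stage(text)
    if not companies:
        return None  # nothing to enrich, the second call never happens
    return expensive_stage(text, companies)

# Stub stages show the control flow without real API calls
calls = []
def stub_cheap(text):
    calls.append("cheap")
    return ["Acme Corp"] if "Acme" in text else []

def stub_expensive(text, companies):
    calls.append("expensive")
    return {"companies": companies}

extract_with_short_circuit("no entities here", stub_cheap, stub_expensive)
print(calls)  # only the cheap stage ran
```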

Iterable Extraction: Multiple Objects from One Document

When a document contains multiple instances of the same entity (e.g. all jobs from a careers page), use create_iterable:

import instructor
from anthropic import Anthropic
from pydantic import BaseModel
from typing import Iterable
 
client = instructor.from_anthropic(Anthropic())
 
class JobListing(BaseModel):
    title: str
    department: str
    location: str
    salary_range: str | None
 
jobs_text = """
We are hiring:
- Senior Engineer, Platform team, Remote, $180-220k
- Product Designer, Design team, New York, $120-150k
- Data Scientist, ML team, San Francisco
"""
 
jobs = client.messages.create_iterable(
    model="claude-haiku-4-5-20251001",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"Extract all job listings:\n{jobs_text}"}],
    response_model=JobListing
)
 
for job in jobs:  # iterates as each JobListing is extracted
    print(f"{job.title} — {job.department}")
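
Because create_iterable yields each object as soon as it completes, a consumer can stop early, for example once it finds the item it needs, instead of parsing the whole stream client-side. A sketch with a plain generator standing in for the streamed result (`first_match` is a hypothetical helper):

```python
from typing import Iterator, Optional, Tuple

def first_match(jobs: Iterator[Tuple[str, str]], department: str) -> Optional[Tuple[str, str]]:
    """Consume jobs lazily and stop at the first one in the target
    department; remaining items are never pulled from the iterator."""
    for title, dept in jobs:
        if dept == department:
            return (title, dept)
    return None

def fake_stream():  # stand-in for the create_iterable result
    yield ("Senior Engineer", "Platform")
    yield ("Product Designer", "Design")
    yield ("Data Scientist", "ML")

print(first_match(fake_stream(), "Design"))  # ('Product Designer', 'Design')
```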
 

Async Support

Instructor also wraps the async Anthropic client, so independent extractions can run concurrently:

import instructor
from anthropic import AsyncAnthropic
from pydantic import BaseModel
import asyncio
 
async_client = instructor.from_anthropic(AsyncAnthropic())
 
class Summary(BaseModel):
    title: str
    key_points: list[str]
 
async def summarise_many(texts: list[str]) -> list[Summary]:
    tasks = [
        async_client.messages.create(
            model="claude-haiku-4-5-20251001",
            max_tokens=512,
            messages=[{"role": "user", "content": f"Summarise:\n{text}"}],
            response_model=Summary
        )
        for text in texts
    ]
    return await asyncio.gather(*tasks)
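
asyncio.gather launches every request at once, which can trip provider rate limits on large batches. A common pattern is to bound concurrency with a semaphore; this is a sketch with a hypothetical `gather_bounded` helper and stub coroutines in place of real API calls:

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")

async def gather_bounded(tasks: List[Callable[[], Awaitable[T]]], limit: int) -> List[T]:
    """Run task factories with at most `limit` in flight at once,
    so a large batch does not hit the provider all at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def run(factory: Callable[[], Awaitable[T]]) -> T:
        async with semaphore:
            return await factory()

    # gather preserves input order in its results
    return await asyncio.gather(*(run(factory) for factory in tasks))

async def main():
    async def work(i):  # stub for an API call
        await asyncio.sleep(0)
        return i * 2
    results = await gather_bounded([lambda i=i: work(i) for i in range(5)], limit=2)
    print(results)  # [0, 2, 4, 6, 8]

asyncio.run(main())
```

Pass factories (zero-argument callables) rather than coroutine objects so that no request starts before the semaphore admits it.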
 

Summary

Instructor's advanced features — partial streaming, multi-label classification, chain-of-thought, multi-step pipelines, and iterable extraction — all build on the same Pydantic foundation as basic extraction. The library stays out of your way: there is no Instructor-specific DSL to learn, just Python type hints and Pydantic validators. The complexity ceiling is high enough for production document processing pipelines, classification systems, and any scenario where you need reliable structured data from unstructured text.