The problem: LLMs speak prose, forms need data
You have an insurance form with 30 fields. The user uploads a document that contains all the answers — but as unstructured text. The LLM can understand the document, but its default output is a paragraph of prose, not a JSON object matching your form schema.
Structured output solves this: you tell the LLM exactly what shape the output should have, and it generates valid, typed data you can pipe directly into a form.
Structured output with the Vercel AI SDK
This is the cleanest approach for TypeScript developers: define a Zod schema, and the SDK guarantees that the data the LLM returns matches it.
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
const InsuranceFormSchema = z.object({
firstName: z.string().describe('First name of the policyholder'),
lastName: z.string().describe('Last name of the policyholder'),
dateOfBirth: z.string().describe('Date of birth in YYYY-MM-DD format'),
address: z.object({
street: z.string(),
city: z.string(),
state: z.string(),
zipCode: z.string(),
}),
policyType: z.enum(['auto', 'home', 'life', 'health']),
coverageAmount: z.number().describe('Coverage amount in USD'),
startDate: z.string().describe('Policy start date in YYYY-MM-DD'),
});
type InsuranceForm = z.infer<typeof InsuranceFormSchema>;
async function extractFormData(documentText: string): Promise<InsuranceForm> {
const { object } = await generateObject({
model: openai('gpt-4o'),
schema: InsuranceFormSchema,
prompt: 'Extract insurance form data from this document: ' + documentText,
});
return object; // Fully typed InsuranceForm — no parsing needed
}
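Calling it is a one-liner; documentText here stands in for whatever your upload or OCR step produces:
const form = await extractFormData(documentText);
console.log(form.policyType); // typed as 'auto' | 'home' | 'life' | 'health'
console.log(form.coverageAmount); // a number, not a string to parse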
The schema descriptions are critical. They guide the LLM on what each field means and what format to use. Without descriptions, the LLM guesses — and guesses wrong on dates, phone numbers, and enums.
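For example, a phone number field is ambiguous without guidance; a one-line description pins the format (the field below is illustrative, not part of the schema above):
phoneNumber: z.string().describe('Phone number in E.164 format, e.g. +14155550123'),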
Adding confidence scores
Raw extraction is not enough. Users need to know which fields the AI is confident about and which ones need manual verification.
// Schema with confidence scores per field: a small helper wraps any
// value schema with a confidence score and the source snippet
const withConfidence = <T extends z.ZodTypeAny>(valueSchema: T) =>
  z.object({
    value: valueSchema,
    confidence: z.number().min(0).max(1).describe('Confidence score from 0 to 1'),
    source: z.string().describe('The text snippet this was extracted from'),
  });
const InsuranceExtractionSchema = z.object({
  firstName: withConfidence(z.string()),
  lastName: withConfidence(z.string()),
  dateOfBirth: withConfidence(z.string()),
  address: z.object({
    street: withConfidence(z.string()),
    city: withConfidence(z.string()),
    state: withConfidence(z.string()),
    zipCode: withConfidence(z.string()),
  }),
  policyType: withConfidence(z.enum(['auto', 'home', 'life', 'health'])),
  coverageAmount: withConfidence(z.number()),
});
// Usage with confidence-aware UI
async function extractWithConfidence(documentText: string) {
const { object } = await generateObject({
model: openai('gpt-4o'),
schema: InsuranceExtractionSchema,
prompt: `Extract insurance form data from this document. For each field, provide:
- The extracted value
- A confidence score (0-1) based on how clearly the information is stated
- The exact text snippet you extracted from
Document:
${documentText}`,
});
return object;
}
The form auto-fill UX
The extraction is only half the problem. The other half is how you present AI-filled data to the user.
The trust spectrum for auto-filled fields:
// Map confidence to visual treatment
interface FieldStyle {
  border: 'green' | 'amber' | 'red' | 'default';
  icon: string | null;
  tooltip: string | null;
  requiresReview: boolean;
}
function getFieldStyle(confidence: number): FieldStyle {
if (confidence >= 0.9) {
return {
border: 'green', // High confidence — pre-filled, editable
icon: 'check-circle',
tooltip: 'Auto-filled with high confidence',
requiresReview: false,
};
}
if (confidence >= 0.6) {
return {
border: 'amber', // Medium — pre-filled, highlighted for review
icon: 'alert-circle',
tooltip: 'Please verify this auto-filled value',
requiresReview: true,
};
}
if (confidence > 0) {
return {
border: 'red', // Low — suggested but not pre-filled
icon: 'help-circle',
tooltip: 'AI suggestion — click to accept or type your own',
requiresReview: true,
};
}
return {
border: 'default', // Not found — empty field
icon: null,
tooltip: null,
requiresReview: false,
};
}
Key UX principles:
- Never auto-submit. Always show extracted data for user review before submission.
- Highlight uncertain fields. Use color coding (green/amber/red) so users can quickly scan for fields that need attention.
- Show the source. When the user hovers over an auto-filled field, show the exact text snippet the AI extracted from. This builds trust.
- Allow easy override. Clicking any auto-filled field should let the user type their own value. The AI suggestion should feel like a helpful default, not a locked-in choice (see the sketch after this list).
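Here is a minimal sketch of these principles as a React component, reusing getFieldStyle from above. The component shape, prop names, and data attributes are illustrative, not from any particular UI library:
import { useState } from 'react';
interface Extracted {
  value: string;
  confidence: number;
  source: string;
}
function AutoFilledField({ label, extracted }: { label: string; extracted: Extracted | null }) {
  const style = getFieldStyle(extracted?.confidence ?? 0);
  // Pre-fill only medium/high confidence; a low-confidence value becomes
  // a placeholder suggestion the user must actively accept
  const prefill = extracted && extracted.confidence >= 0.6 ? extracted.value : '';
  const [value, setValue] = useState(prefill);
  return (
    <label>
      {label}
      <input
        value={value}
        onChange={(e) => setValue(e.target.value)} // typing replaces the suggestion
        data-border={style.border} // map green/amber/red to styles in CSS
        // Show provenance on hover: the snippet the value was extracted from
        title={extracted ? `${style.tooltip ?? ''} (from: "${extracted.source}")` : undefined}
        placeholder={extracted && extracted.confidence < 0.6 ? extracted.value : undefined}
      />
    </label>
  );
}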
Extraction with LangChain.js
If your stack uses LangChain.js rather than the Vercel AI SDK, withStructuredOutput gives you the same schema-constrained generation:
import { ChatOpenAI } from '@langchain/openai';
import { z } from 'zod';
const model = new ChatOpenAI({
modelName: 'gpt-4o',
}).withStructuredOutput(
z.object({
contacts: z.array(z.object({
name: z.string(),
email: z.string().email(),
role: z.string(),
company: z.string().optional(),
})),
})
);
const result = await model.invoke(
'Extract all contacts from this email thread: ...'
);
// result.contacts is a typed array — no JSON.parse needed
Handling extraction failures
LLMs are not perfect extractors. Some documents are ambiguous, some fields are missing, and some formats are unexpected.
interface ExtractionResult<T> {
data: Partial<T>;
filledFields: string[];
missingFields: string[];
lowConfidenceFields: string[];
errors: Array<{ field: string; error: string }>;
}
async function safeExtract<T>(
documentText: string,
schema: z.ZodSchema<T>,
requiredFields: string[]
): Promise<ExtractionResult<T>> {
try {
const { object } = await generateObject({
model: openai('gpt-4o'),
schema,
prompt: 'Extract data from: ' + documentText,
});
const filled: string[] = [];
const missing: string[] = [];
const lowConf: string[] = [];
for (const field of requiredFields) {
const val = (object as any)[field];
// Check for null/undefined explicitly: a bare truthiness test would
// misclassify legitimate falsy values such as 0 as missing
if (val == null || val === '' || (typeof val === 'object' && val.value == null)) {
missing.push(field);
} else if (typeof val === 'object' && val.confidence < 0.6) {
lowConf.push(field);
filled.push(field);
} else {
filled.push(field);
}
}
return {
data: object as Partial<T>,
filledFields: filled,
missingFields: missing,
lowConfidenceFields: lowConf,
errors: [],
};
} catch (error) {
// Structured output generation failed — schema mismatch or model error
return {
data: {} as Partial<T>,
filledFields: [],
missingFields: requiredFields,
lowConfidenceFields: [],
errors: [{ field: '_all', error: String(error) }],
};
}
}
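Wiring the result into the form might look like this; showBlankForm, prefillForm, markMissing, and flagForReview are hypothetical UI helpers:
const result = await safeExtract(documentText, InsuranceExtractionSchema, [
  'firstName',
  'lastName',
  'dateOfBirth',
  'policyType',
  'coverageAmount',
]);
if (result.errors.length > 0) {
  showBlankForm(); // extraction failed entirely, fall back to manual entry
} else {
  prefillForm(result.data);
  result.missingFields.forEach(markMissing); // the user must fill these
  result.lowConfidenceFields.forEach(flagForReview); // amber highlight
}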
Don't make users wait: streaming partial results
For large documents, extraction can take several seconds. Instead of making the user wait, stream partial results as fields are extracted:
import { streamObject } from 'ai';
async function streamExtraction(documentText: string, onField: (field: string, value: any) => void) {
const { partialObjectStream } = streamObject({
model: openai('gpt-4o'),
schema: InsuranceFormSchema,
prompt: 'Extract from: ' + documentText,
});
  const lastValues: Record<string, string> = {};
  for await (const partial of partialObjectStream) {
    // Detect new and updated fields. Partial string values stream in
    // token by token, so a field can fire several times before it
    // settles; the latest emission for each field wins.
    for (const [field, value] of Object.entries(partial)) {
      if (value === undefined) continue;
      const serialized = JSON.stringify(value);
      if (serialized !== lastValues[field]) {
        lastValues[field] = serialized;
        onField(field, value);
      }
    }
  }
}
// Usage: form fields fill in and update live as the LLM processes the document
streamExtraction(documentText, (field, value) => {
formState[field] = value;
highlightField(field); // Animate the field filling in
});
Practice designing this
- AI-Powered Form Auto-Fill — design an extraction pipeline with confidence scoring, user override, and partial failure handling
- Conversational UI Agent — agents also use structured output for tool arguments
For the broader context on AI patterns, see 5 AI Patterns Every Frontend Engineer Will Build in 2026 and Tool-Calling Agents and the ReAct Pattern.
LLM-friendly summary
A guide to extracting structured data from LLMs using schema-constrained generation with Zod, Vercel AI SDK, and LangChain.js, covering form auto-fill, confidence scoring, and validation patterns.