Structured Outputs Create False Confidence

By Sam Lijin | December 21, 2025 | Hacker News: Front Page

Constrained decoding seems like the greatest thing since sliced bread, but it forces models to prioritize output conformance over output quality.

If you use LLMs, you've probably heard about structured outputs. You might think they're the greatest thing since sliced bread. Unfortunately, structured outputs also degrade response quality.

Specifically, if you use an LLM provider's structured outputs API, you will get a lower quality response than if you use their normal text output API:

⚠️ you're more likely to make mistakes when extracting data, even in simple cases;
⚠️ you're probably not modeling errors correctly;
⚠️ it's harder to use techniques like chain-of-thought reasoning; and
⚠️ in the extreme case, it can be easier to steal your customer data using prompt injection.

These are very contentious claims, so let's start with an example: extracting data from a receipt.

If I use an LLM to extract the receipt entries, it should be able to tell me that one of the items is (name="banana", quantity=0.46), right?

Well, using OpenAI's structured outputs API with gpt-5.2 (released literally this week!), it will claim that the banana quantity is 1.0:

    {
      "establishment_name": "PC Market of Choice",
      "date": "2007-01-20",
      "total": 0.32,
      "currency": "USD",
      "items": [
        {
          "name": "Bananas",
          "price": 0.32,
          "quantity": 1
        }
      ]
    }

However, with the same model, if you just use the completions API and then parse the output, it will return the correct quantity:

    {
      "establishment_name": "PC Market of Choice",
      "date": "2007-01-20",
      "total": 0.32,
      "currency": "USD",
      "items": [
        {
          "name": "Bananas",
          "price": 0.69,
          "quantity": 0.46
        }
      ]
    }

The code that was used to generate the above outputs appears further below. This code is also available on GitHub.
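One thing worth noticing about the two outputs above: both are internally consistent. The following is a minimal sketch, not part of the original script, that checks whether the line items multiply out to the stated total. The helper name `totals_agree` and the half-cent tolerance are assumptions made for illustration.

```python
# Hypothetical sanity check: does sum(price * quantity) match the receipt total?
# The two dicts below copy the item and total fields from the outputs shown above.

STRUCTURED_OUTPUT = {
    "total": 0.32,
    "items": [{"name": "Bananas", "price": 0.32, "quantity": 1}],
}

FREE_FORM_OUTPUT = {
    "total": 0.32,
    "items": [{"name": "Bananas", "price": 0.69, "quantity": 0.46}],
}


def totals_agree(receipt: dict, tolerance: float = 0.005) -> bool:
    """Return True if the line items add up to the total, within half a cent."""
    computed = sum(item["price"] * item["quantity"] for item in receipt["items"])
    return abs(computed - receipt["total"]) <= tolerance


if __name__ == "__main__":
    print("structured output consistent:", totals_agree(STRUCTURED_OUTPUT))
    print("free-form output consistent:", totals_agree(FREE_FORM_OUTPUT))
```

Both receipts pass this check (0.32 × 1 = 0.32, and 0.69 × 0.46 = 0.3174, which rounds to 0.32), so a downstream validator of this kind would not flag the incorrect quantity. The structured response is schema-valid and arithmetically plausible, yet it still misreports what was on the receipt.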
#!/usr/bin/env -S uv run
# /// script
# requires-python = ">=3.10"
# dependencies = ["openai", "pydantic", "rich"]
# ///
"""
If you have uv, you can run this code by saving it as structured_outputs_quality_demo.py and then running:

    chmod u+x structured_outputs_quality_demo.py
    ./structured_outputs_quality_demo.py

This script is a companion to https://boundaryml.com/blog/structured-outputs-create-false-confidence
"""

import json
import re

from openai import OpenAI
from pydantic import BaseModel, Field
from rich.console import Console
from rich.pretty import Pretty


class Item(BaseModel):
    name: str
    price: float = Field(description="per-unit item price")
    quantity: float = Field(default=1, description="If not specified, assume 1")


class Receipt(BaseModel):
    establishment_name: str
    date: str = Field(description="YYYY-MM-DD")
    total: float = Field(description="The total amount of the receipt")
    currency: str = Field(description="The currency used for everything on the receipt")
    items: list[Item] = Field(description="The items on the receipt")


client = OpenAI()
console = Console()


def run_receipt_extraction_structured(image_url: str):
    """Call the LLM to extract receipt data from an image URL and return the raw response."""
    prompt_text = (
        """
        Extract data from the receipt.
        """
    )
    response = client.beta.chat.completions.parse(
        model="gpt-5.2-2025-12-11",
        messages=[
            {
                "role": "system",
                "content": "You are a precise receipt extraction engine. Return only structured data matching the Receipt schema.",
            },
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": prompt_text,
                    },
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
        ],
        response_format=Receipt,
    )
    return response.choices[0].message.content,...
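The script is cut off before the free-form variant, so the author's actual parsing code is not visible here. As a rough illustration of what "use the completions API and then parse the output" can look like, the sketch below pulls the first JSON object out of free-form model text and converts its items into typed records. Everything in it is an assumption made for illustration: the function names `extract_first_json_object` and `parse_items`, the `SAMPLE_OUTPUT` text, and the use of a standard-library dataclass in place of the script's pydantic models.

```python
import json
from dataclasses import dataclass


@dataclass
class Item:
    # Mirrors the fields of the script's Item model; quantity defaults to 1.
    name: str
    price: float
    quantity: float = 1.0


def extract_first_json_object(text: str) -> dict:
    """Return the first JSON object embedded in free-form text."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever text follows it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # this brace did not begin valid JSON; keep scanning
        if isinstance(obj, dict):
            return obj
    raise ValueError("no JSON object found in model output")


def parse_items(text: str) -> list[Item]:
    """Extract the receipt JSON from model text and build Item records."""
    data = extract_first_json_object(text)
    return [
        Item(
            name=str(raw["name"]),
            price=float(raw["price"]),
            quantity=float(raw.get("quantity", 1)),
        )
        for raw in data["items"]
    ]


# Hypothetical model reply: prose around the JSON, as a text API may return.
SAMPLE_OUTPUT = """Here is the extracted receipt:

{"establishment_name": "PC Market of Choice", "date": "2007-01-20",
 "total": 0.32, "currency": "USD",
 "items": [{"name": "Bananas", "price": 0.69, "quantity": 0.46}]}

Let me know if you need anything else."""

if __name__ == "__main__":
    for item in parse_items(SAMPLE_OUTPUT):
        print(item)
```

Parsing after the fact leaves the model free to write text before and after the JSON, which is what the structured outputs API rules out. The trade-off is that the caller has to handle replies that contain no valid JSON, which this sketch surfaces as a `ValueError`.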
