LangChain Structured Output

Most of the time, what you need is not free-form text but structured data, for example a JSON object.

LangChain Structured Output lets the Agent return results in the format you specify, so programs can use them directly.

Why Structured Output Is Needed

Suppose you need to extract the name, age, and occupation from a user's description:

| Method | Output Format | Post-processing |
|---|---|---|
| Normal Response | "Zhang San is 28 years old and works as an engineer." | Requires regex or another model call to parse |
| Structured Output | {name: "Zhang San", age: 28, job: "engineer"} | Directly usable as a Python object |

Structured output eliminates the step of "parsing data from text," so AI outputs can be used directly by programs.

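To make the contrast in the table concrete, here is a minimal sketch that uses only the standard library. The regular expression, the `Person` dataclass, and the JSON string are illustrative assumptions; the JSON string stands in for what a structured-output call might return, and no model is invoked.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int
    job: str


def parse_free_text(text: str) -> Person:
    """Fragile approach: a regex tied to one exact sentence shape."""
    pattern = r"(?P<name>.+?) is (?P<age>\d+) years old and works as an? (?P<job>\w+)\."
    match = re.match(pattern, text)
    if match is None:
        raise ValueError(f"Could not parse: {text!r}")
    return Person(match["name"], int(match["age"]), match["job"])


def parse_structured(payload: str) -> Person:
    """Structured approach: the fields arrive already separated and typed."""
    return Person(**json.loads(payload))


free_text = "Zhang San is 28 years old and works as an engineer."
structured = '{"name": "Zhang San", "age": 28, "job": "engineer"}'

print(parse_free_text(free_text))
print(parse_structured(structured))
```

The regex version breaks as soon as the sentence is phrased differently (for example "Zhang San, 28, is an engineer."), while the structured version does not depend on wording at all.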
The Simplest Usage: Passing a Pydantic Model

Pass a Pydantic model to the response_format parameter:

Example

    from dotenv import load_dotenv

    load_dotenv()

    from pydantic import BaseModel, Field
    from langchain.agents import create_agent
    from langchain.chat_models import init_chat_model
    from langchain.messages import HumanMessage

    # Define the expected output structure
    class CourseInfo(BaseModel):
        """Result of course extraction from TUTORIAL ()"""

        course_name: str = Field(description="Course name")
        difficulty: str = Field(description="Difficulty: beginner/intermediate/advanced")
        estimated_hours: int = Field(description="Estimated learning duration (hours)")
        is_free: bool = Field(description="Is it free?")

    model = init_chat_model("deepseek:deepseek-v4-flash", temperature=0)

    agent = create_agent(
        model=model,
        response_format=CourseInfo,  # Pass in the Pydantic model
        system_prompt="You are a course assistant for TUTORIAL (). Extract course information from user descriptions.",
    )

    # User provides an unstructured description
    result = agent.invoke({
        "messages": [HumanMessage(
            content="I've been learning the Python3 Basics Tutorial, which is beginner-level, "
                    "takes about 20 hours, and is completely free."
        )]
    })

    # Retrieve the structured result from the "structured_response" key
    if "structured_response" in result:
        course = result["structured_response"]
        print(f"Course name: {course.course_name}")
        print(f"Difficulty: {course.difficulty}")
        print(f"Estimated duration: {course.estimated_hours} hours")
        print(f"Free: {'Yes' if course.is_free else 'No'}")
        print(f"Object type: {type(course)}")

Output:

    Course name: Python3 Basics Tutorial
    Difficulty: beginner
    Estimated duration: 20 hours
    Free: Yes
    Object type: <class '__main__.CourseInfo'>

The returned structured_response is a Pydantic model instance, not a plain dictionary. This means you can access attributes such as .course_name, and your IDE can provide auto-completion.

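Because the result is a Pydantic instance, it can be converted to a plain dictionary or a JSON string when another part of the program needs one. The sketch below builds a `CourseInfo` by hand as a stand-in for `result["structured_response"]`, so no model call is made. It assumes Pydantic v2, where `model_dump()` and `model_dump_json()` are the conversion methods.

```python
from pydantic import BaseModel, Field


class CourseInfo(BaseModel):
    """Same fields as the CourseInfo model in the example above."""

    course_name: str = Field(description="Course name")
    difficulty: str = Field(description="Difficulty: beginner/intermediate/advanced")
    estimated_hours: int = Field(description="Estimated learning duration (hours)")
    is_free: bool = Field(description="Is it free?")


# Hand-built stand-in for result["structured_response"] (assumption: no Agent call here)
course = CourseInfo(
    course_name="Python3 Basics Tutorial",
    difficulty="beginner",
    estimated_hours=20,
    is_free=True,
)

print(course.course_name)            # attribute access on the model instance
as_dict = course.model_dump()        # plain dict, e.g. for storage
as_json = course.model_dump_json()   # JSON string, e.g. for an HTTP response
print(as_dict["estimated_hours"])
print(as_json)
```

Keeping the instance around and converting only at the boundary (storage, network) preserves type checking for the rest of the code.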
Structured Output with Tools

response_format and tools can be used together: the Agent calls tools as needed and then outputs structured data as its final result.

Example

    from pydantic import BaseModel, Field
    from langchain.agents import create_agent
    from langchain.chat_models import init_chat_model
    from langchain.messages import HumanMessage
    from langchain.tools import tool

    @tool
    def search_course(keyword: str) -> str:
        """Search for course information on TUTORIAL ()"""
        # Keys are lowercase because the lookup below lowercases the keyword
        courses = {
            "python": "Python3 Basics Tutorial | Beginner | Free | 30 chapters | About 20 hours",
            "java": "Java Basics Tutorial | Beginner | Free | 35 chapters | About 25 hours",
            "data analysis": "Python Data Analysis | Advanced | Members only | 25 chapters | About 30 hours",
        }
        return courses.get(keyword.lower(), f"No course found for '{keyword}'")

    class CourseRecommendation(BaseModel):
        """Course recommendation result"""

        course_name: str = Field(description="Recommended course name")
        reason: str = Field(description="Reason for recommendation")
        difficulty: str = Field(description="Difficulty: beginner/intermediate/advanced")

    model = init_chat_model("deepseek:deepseek-v4-flash", temperature=0)

    agent = create_agent(
        model=model,
        tools=[search_course],
        response_format=CourseRecommendation,
        system_prompt="You are a course advisor for TUTORIAL (). First search for courses, then give recommendations.",
    )

    result = agent.invoke({
        "messages": [HumanMessage(content="I want to learn Python. Any recommendations?")]
    })

    rec = result["structured_response"]
    print(f"Recommended course: {rec.course_name}")
    print(f"Reason: {rec.reason}")
    print(f"Difficulty: {rec.difficulty}")

    # View the full execution process: tool results appear as messages of type "tool"
    print("\n=== Execution Process ===")
    for msg in result["messages"]:
        if msg.type == "tool":
            print(f"  Called {msg.name}: {msg.content}")

Output:

    Recommended course: Python3 Basics Tutorial
    Reason: This course is free and suitable for Python beginners, with a study duration of about 20 hours.
    Difficulty: beginner

    === Execution Process ===
      Called search_course: Python3 Basics Tutorial | Beginner | Free | 30 chapters | About 20 hours

Complex Nested Structures

Pydantic supports complex structures such as nested models, lists, and enumerated values:

Example

    from typing import Literal

    from pydantic import BaseModel, Field
    from langchain.agents import create_agent
    from langchain.chat_models import init_chat_model
    from langchain.messages import HumanMessage

    class Topic(BaseModel):
        """Knowledge point"""

        name: str = Field(description="Knowledge point name")
        order: int = Field(description="Learning order, starting from 1")
        minutes: int = Field(description="Suggested learning duration in minutes")

    class LearningPlan(BaseModel):
        """Learning plan"""

        goal: str = Field(description="Overview of learning goal")
        level: Literal["Beginner", "Intermediate", "Advanced"] = Field(description="Difficulty level")
        total_hours: float = Field(description="Total duration in hours")
        # list[Topic] (not a bare list) so each item is parsed into a Topic instance
        topics: list[Topic] = Field(description="List of knowledge points")

    model = init_chat_model("deepseek:deepseek-v4-flash", temperature=0)

    agent = create_agent(
        model=model,
        response_format=LearningPlan,
        system_prompt="You are a learning planner for TUTORIAL ().",
    )

    result = agent.invoke({
        "messages": [HumanMessage(
            content="Help me create a Python beginner learning plan with total duration under 10 hours"
        )]
    })

    plan = result["structured_response"]
    print(f"Goal: {plan.goal}")
    print(f"Level: {plan.level}")
    print(f"Total duration: {plan.total_hours} hours")
    print(f"\nTopics ({len(plan.topics)} items):")
    for topic in plan.topics:
        print(f"  {topic.order}. {topic.name} ({topic.minutes} minutes)")

Output:

    Goal: Master Python basic syntax and be able to independently write simple Python programs
    Level: Beginner
    Total duration: 9.5 hours

    Topics (6 items):
      1. Environment Setup and Basic Syntax (60 minutes)
      2. Data Types and Variables (90 minutes)
      3. Conditional Statements and Loops (120 minutes)
      4. Functions and Modules (120 minutes)
      5. Lists and Dictionaries (90 minutes)
      6. Comprehensive Exercises (90 minutes)

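Nested models are validated recursively, and this can be checked offline without calling an Agent. In the sketch below, a hand-written dictionary stands in for raw model output (an assumption for illustration only). It assumes Pydantic v2 and Python 3.9+, and the English level names are illustrative choices that may differ from the labels in your own schema.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class Topic(BaseModel):
    name: str
    order: int
    minutes: int


class LearningPlan(BaseModel):
    goal: str
    level: Literal["Beginner", "Intermediate", "Advanced"]
    total_hours: float
    topics: list[Topic]


# Hypothetical raw data standing in for what the model would return
raw = {
    "goal": "Learn Python basics",
    "level": "Beginner",
    "total_hours": 2.5,
    "topics": [
        {"name": "Basic Syntax", "order": 1, "minutes": 60},
        {"name": "Data Types", "order": 2, "minutes": 90},
    ],
}

plan = LearningPlan.model_validate(raw)
print(type(plan.topics[0]).__name__)        # nested dicts become Topic instances
print(sum(t.minutes for t in plan.topics))  # total minutes across all topics

# A level outside the Literal choices is rejected during validation
try:
    LearningPlan.model_validate({**raw, "level": "Expert"})
except ValidationError as exc:
    print(f"Rejected: {exc.error_count()} validation error(s)")
```

Declaring the field as `list[Topic]` rather than a bare `list` is what lets `topic.order` and `topic.name` work as attributes in the loop above.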
Getting Structured Output Directly from Messages

If you don't need the Agent's tool-calling capability and only want to extract structured information from text, you can use the model directly:

Example

    from pydantic import BaseModel, Field
    from langchain.chat_models import init_chat_model

    class SentimentResult(BaseModel):
        """Sentiment analysis result"""

        sentiment: str = Field(description="Positive/Negative/Neutral")
        score: float = Field(description="Sentiment intensity, 0 to 1")
        keywords: list[str] = Field(description="Key sentiment words")

    model = init_chat_model("deepseek:deepseek-v4-flash", temperature=0)

    # Use with_structured_output() directly on the model
    # No Agent required
    structured_model = model.with_structured_output(SentimentResult)

    texts = [
        "TUTORIAL () is incredibly useful, highly recommended!",
        "This tutorial has too little content, not worth it.",
        "The weather is nice today.",
    ]

    for text in texts:
        result = structured_model.invoke(text)
        # Show only the first 30 characters of each input
        print(f"Text: {text[:30]}...")
        print(f"  Sentiment: {result.sentiment}, Intensity: {result.score}, Keywords: {result.keywords}")

Output:

    Text: TUTORIAL () is incredibly usef...
      Sentiment: Positive, Intensity: 0.95, Keywords: ['useful', 'recommended']
    Text: This tutorial has too little c...
      Sentiment: Negative, Intensity: 0.7, Keywords: ['too little content', 'not worth it']
    Text: The weather is nice today....
      Sentiment: Neutral, Intensity: 0.1, Keywords: ['nice']

with_structured_output() is a method of the model and can be used without an Agent. If your use case is "information extraction" rather than "multi-step reasoning," calling with_structured_output() directly is simpler and avoids the overhead of the Agent loop.
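The field descriptions in SentimentResult only describe what is expected; they do not enforce it. One possible tightening, sketched here under the assumption of Pydantic v2, uses `Literal` for the label and `ge`/`le` bounds for the score. `StrictSentimentResult` is a hypothetical name, and the dictionaries stand in for model output, so no model is called. Whether a given provider respects such constraints while generating is not guaranteed; the checks below apply when the data is validated against the schema.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class StrictSentimentResult(BaseModel):
    """Hypothetical stricter variant of SentimentResult."""

    sentiment: Literal["Positive", "Negative", "Neutral"]
    score: float = Field(ge=0, le=1, description="Sentiment intensity, 0 to 1")
    keywords: list[str] = Field(default_factory=list, description="Key sentiment words")


# Values inside the allowed range validate normally
ok = StrictSentimentResult.model_validate(
    {"sentiment": "Positive", "score": 0.95, "keywords": ["useful"]}
)
print(ok)

# A score above 1 is rejected instead of silently passing through
try:
    StrictSentimentResult.model_validate({"sentiment": "Positive", "score": 1.7})
except ValidationError as exc:
    print(f"Rejected out-of-range score: {exc.errors()[0]['type']}")
```

A schema like this could be passed to with_structured_output() in place of SentimentResult, trading a little flexibility for stronger guarantees about downstream values.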