Claude Structured Outputs in Practice: Guaranteeing 100% Format-Compliant Output with JSON Schema

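Say goodbye to unreliable prompt constraints and brittle regex extraction. This article walks through how to use the Structured Outputs capability of the Claude API to force the model to output structured data that conforms to a JSON Schema.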

Developer Guide · Tool Use · JSON Schema · Estimated reading time: 15 minutes
Published 2026.04.30

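In AI application development, getting a model to output structured JSON reliably is a classic problem:

  • Constrain it with a prompt? Extra text occasionally slips in.
  • Extract with regex? It breaks as soon as the format shifts slightly.
  • Write fallback logic? Time-consuming and hard to maintain.

The Structured Outputs capability of the Claude API addresses this problem at its root: a JSON Schema constraint forces the model to output data that matches the specified structure, with no extra fields and no missing fields.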


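What Is Structured Outputs

Structured Outputs is the structured-output capability of the Claude API. You pass a JSON Schema with the call, every model output follows that Schema, and the returned data can be used directly without any post-processing.

Key advantages: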

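Comparison                | Traditional approach     | Structured Outputs
Format guarantee          | Unstable                 | 100% Schema-compliant
Type safety               | Manual validation needed | Field types strictly matched
Post-processing cost      |                          | None required
Implementation complexity | Medium to high           |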

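How It Works

Structured Outputs is implemented on top of Claude's Tool Use mechanism:

  1. Define your JSON Schema as the input_schema of a virtual tool
  2. Force the model to call that tool via tool_choice
  3. Read the structured data from the tool_use content block in the response

The model does not output free text; it returns only a structured result that conforms to the Schema.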
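
The three steps can be sketched without touching the network. This is only an illustration: build_tool_request and extract_tool_input are hypothetical helper names, not part of the SDK, and fake_content is a stand-in for message.content so the sketch runs without an API key.

```python
from types import SimpleNamespace

def build_tool_request(tool_name, description, schema, user_prompt):
    """Steps 1 and 2: wrap the schema in a tool and force the model to call it."""
    return {
        "tools": [{
            "name": tool_name,
            "description": description,
            "input_schema": schema,  # step 1: the schema becomes the tool's input_schema
        }],
        "tool_choice": {"type": "tool", "name": tool_name},  # step 2: force this tool
        "messages": [{"role": "user", "content": user_prompt}],
    }

def extract_tool_input(content_blocks, tool_name):
    """Step 3: return the input dict of the matching tool_use block."""
    for block in content_blocks:
        if block.type == "tool_use" and block.name == tool_name:
            return block.input
    raise ValueError(f"no tool_use block for {tool_name!r} in response")

# Stand-in for message.content (assumption: mimics the shape of a real response).
fake_content = [SimpleNamespace(
    type="tool_use",
    name="product_info",
    input={"name": "iPhone 16", "price": 5999, "in_stock": True},
)]

request = build_tool_request(
    "product_info",
    "Return structured product information",
    {"type": "object", "properties": {"name": {"type": "string"}}, "required": ["name"]},
    "Generate product information for the iPhone 16",
)
data = extract_tool_input(fake_content, "product_info")
print(request["tool_choice"], data["name"])
```

The dictionary returned by build_tool_request can be merged with model and max_tokens and passed to client.messages.create, as the full examples in this article do inline.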


Basic Python Example

Install the SDK

pip install anthropic

Basic Call

import anthropic

client = anthropic.Anthropic(
    api_key="your-api-key",
    base_url="https://gw.claudeapi.com"
)

# Define the output structure
response_schema = {
    "type": "object",
    "properties": {
        "name":     {"type": "string",  "description": "Product name"},
        "price":    {"type": "number",  "description": "Price"},
        "in_stock": {"type": "boolean", "description": "Whether the product is in stock"},
        "tags": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Product tags"
        }
    },
    "required": ["name", "price", "in_stock"]
}

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=[{
        "name": "product_info",
        "description": "Return structured product information",
        "input_schema": response_schema,
    }],
    tool_choice={"type": "tool", "name": "product_info"},
    messages=[
        {"role": "user", "content": "Generate product information for the iPhone 16"}
    ],
)

# The input field of the tool_use block is already a dict; no json.loads needed
result = next(b.input for b in message.content if b.type == "tool_use")
print(result)
# Example output (actual values will vary):
# {'name': 'iPhone 16', 'price': 5999.0, 'in_stock': True, 'tags': ['phone', 'Apple', '5G']}
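
If you would rather verify a result before handing it to downstream code, a lightweight top-level check can be written with the standard library alone. The sketch below is an assumption-laden illustration: check_top_level is a hypothetical helper, it only looks at required keys and primitive types at the top level, and it is not a full JSON Schema validator.

```python
# Map JSON Schema type names to the Python types they decode to.
_JSON_TYPES = {
    "string": str, "boolean": bool, "integer": int,
    "number": (int, float), "array": list, "object": dict,
}

def check_top_level(data, schema):
    """Return a list of problems; an empty list means the top level looks fine."""
    problems = [f"missing required field: {key}"
                for key in schema.get("required", []) if key not in data]
    for key, value in data.items():
        expected = schema.get("properties", {}).get(key, {}).get("type")
        py_type = _JSON_TYPES.get(expected)
        if py_type is None:
            continue  # undeclared field or no declared type: not checked here
        # bool is a subclass of int in Python, so reject it for numeric fields
        bool_as_number = isinstance(value, bool) and expected in ("integer", "number")
        if not isinstance(value, py_type) or bool_as_number:
            problems.append(f"{key}: expected {expected}, got {type(value).__name__}")
    return problems

product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price", "in_stock"],
}

good = {"name": "iPhone 16", "price": 5999, "in_stock": True}
bad = {"name": "iPhone 16", "price": "5999"}

print(check_top_level(good, product_schema))
print(check_top_level(bad, product_schema))
```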


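Advanced Usage: Nested Structures

Real-world business data often requires multiple levels of nesting, which Structured Outputs fully supports. Take order data as an example: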

import anthropic

client = anthropic.Anthropic(
    api_key="your-api-key",
    base_url="https://gw.claudeapi.com"
)

order_schema = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "customer": {
            "type": "object",
            "properties": {
                "name":  {"type": "string"},
                "email": {"type": "string"},
                "phone": {"type": "string"}
            },
            "required": ["name", "email"]
        },
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "product_name": {"type": "string"},
                    "quantity":     {"type": "integer"},
                    "unit_price":   {"type": "number"}
                },
                "required": ["product_name", "quantity", "unit_price"]
            }
        },
        "total_amount": {"type": "number"},
        "status": {
            "type": "string",
            "enum": ["pending", "paid", "shipped", "delivered"]
        }
    },
    "required": ["order_id", "customer", "items", "total_amount", "status"]
}

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    tools=[{
        "name": "order_info",
        "description": "Return structured order data",
        "input_schema": order_schema,
    }],
    tool_choice={"type": "tool", "name": "order_info"},
    messages=[
        {"role": "user", "content": "Generate order data containing 3 items for a customer named Zhang San"}
    ],
)

order = next(b.input for b in message.content if b.type == "tool_use")
print(f"Order ID:     {order['order_id']}")
print(f"Customer:     {order['customer']['name']}")
print(f"Item count:   {len(order['items'])}")
print(f"Total amount: {order['total_amount']}")
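
A schema of this kind describes shape and types, not relationships between fields: nothing in order_schema says that total_amount must equal the sum of the line items. If that matters for your use case, a small check in application code is one option. The sketch below assumes a hypothetical helper name (order_total_matches), a tolerance chosen arbitrarily, and a hand-written sample order rather than real model output.

```python
import math

def order_total_matches(order, tolerance=0.01):
    """Check that total_amount equals the sum of quantity * unit_price."""
    computed = sum(item["quantity"] * item["unit_price"] for item in order["items"])
    return math.isclose(computed, order["total_amount"], abs_tol=tolerance)

# Hand-written sample in the shape of order_schema (not actual model output).
sample_order = {
    "order_id": "ORD-001",
    "customer": {"name": "Zhang San", "email": "zhangsan@example.com"},
    "items": [
        {"product_name": "Keyboard", "quantity": 2, "unit_price": 199.5},
        {"product_name": "Mouse", "quantity": 1, "unit_price": 99.0},
    ],
    "total_amount": 498.0,
    "status": "pending",
}

print(order_total_matches(sample_order))
print(order_total_matches({**sample_order, "total_amount": 500.0}))
```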


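Real-World Scenario: Extracting Information from Unstructured Text

One of the most frequent uses of Structured Outputs is extracting structured fields precisely from free text.

Take resume parsing as an example: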

import anthropic
import json

client = anthropic.Anthropic(
    api_key="your-api-key",
    base_url="https://gw.claudeapi.com"
)

resume_schema = {
    "type": "object",
    "properties": {
        "name":  {"type": "string"},
        "phone": {"type": "string"},
        "email": {"type": "string"},
        "education": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "school": {"type": "string"},
                    "degree": {"type": "string"},
                    "major":  {"type": "string"},
                    "year":   {"type": "integer"}
                }
            }
        },
        "skills": {
            "type": "array",
            "items": {"type": "string"}
        },
        "work_experience_years": {"type": "integer"}
    },
    "required": ["name", "skills"]
}

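resume_text = """
Zhang Wei, male, born 1995
Phone: 138-1234-5678
Email: [email protected]

Education:
Graduated from Tsinghua University in 2017, Computer Science and Technology, master's degree

Skills: Python, Java, machine learning, deep learning, PyTorch, distributed systems

Work experience: 2017-2024, 7 years of development experience
"""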

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=[{
        "name": "resume_info",
        "description": "Extract structured information from resume text",
        "input_schema": resume_schema,
    }],
    tool_choice={"type": "tool", "name": "resume_info"},
    messages=[
        {"role": "user", "content": f"Extract information from the following resume:\n\n{resume_text}"}
    ],
)

resume_data = next(b.input for b in message.content if b.type == "tool_use")
print(json.dumps(resume_data, ensure_ascii=False, indent=2))

The same pattern carries over directly to scenarios such as contract term extraction, product review analysis, and log structuring.
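
As an illustration of how little changes between scenarios, here is a sketch for product review analysis: only the schema, the tool name, and the prompt differ from the resume example. The field names (sentiment, rating, mentioned_features), the tool name review_info, and the helper review_request are all made up for this example; the function only builds keyword arguments and makes no API call.

```python
review_schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "rating": {"type": "integer", "description": "Star rating from 1 to 5"},
        "mentioned_features": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["sentiment", "rating"],
}

def review_request(review_text):
    """Build keyword arguments for client.messages.create (no API call here)."""
    return {
        "model": "claude-sonnet-4-6",
        "max_tokens": 1024,
        "tools": [{
            "name": "review_info",
            "description": "Extract structured information from a product review",
            "input_schema": review_schema,
        }],
        "tool_choice": {"type": "tool", "name": "review_info"},
        "messages": [{
            "role": "user",
            "content": f"Analyze the following product review:\n\n{review_text}",
        }],
    }

kwargs = review_request("Battery life is great, but the screen scratches easily.")
print(kwargs["tools"][0]["name"], kwargs["tool_choice"])
```

With a configured client, the call would then be client.messages.create(**kwargs), followed by the same tool_use extraction used throughout this article.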


TypeScript / Node.js Example

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "your-api-key",
  baseURL: "https://gw.claudeapi.com",
});

interface ProductInfo {
  name: string;
  price: number;
  in_stock: boolean;
  tags?: string[]; // optional: not listed in required below
}

const schema = {
  type: "object" as const, // the SDK's input_schema type expects the literal "object"
  properties: {
    name:     { type: "string" },
    price:    { type: "number" },
    in_stock: { type: "boolean" },
    tags:     { type: "array", items: { type: "string" } },
  },
  required: ["name", "price", "in_stock"],
};

async function getProductInfo(prompt: string): Promise<ProductInfo> {
  const message = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    tools: [{
      name: "product_info",
      description: "Return structured product information",
      input_schema: schema,
    }],
    tool_choice: { type: "tool", name: "product_info" },
    messages: [{ role: "user", content: prompt }],
  });

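  const toolUse = message.content.find((b) => b.type === "tool_use");
  if (!toolUse || toolUse.type !== "tool_use") {
    throw new Error("Response contains no tool_use block");
  }
  // input is not validated at runtime; this cast only informs the compiler
  return toolUse.input as ProductInfo;
}

const product = await getProductInfo("Generate product information for the MacBook Pro");
console.log(product.name);  // typed as string via ProductInfo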


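FAQ

Q: How do I troubleshoot a Schema validation error?

The most common cause is a field name in the required array that does not match a key in properties. Compare them one by one.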
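
For larger nested schemas, the comparison can be automated. The sketch below uses a hypothetical helper, find_undeclared_required, which walks properties and array items and reports every required name that has no matching key; it only covers the schema keywords used in this article.

```python
def find_undeclared_required(schema, path="$"):
    """Yield (path, field) for every required name missing from properties."""
    if not isinstance(schema, dict):
        return
    properties = schema.get("properties", {})
    for field in schema.get("required", []):
        if field not in properties:
            yield path, field
    # Recurse into nested objects and into array item schemas.
    for key, sub_schema in properties.items():
        yield from find_undeclared_required(sub_schema, f"{path}.{key}")
    if "items" in schema:
        yield from find_undeclared_required(schema["items"], f"{path}[]")

bad_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"quantity": {"type": "integer"}},
                "required": ["qty"],  # typo: the key is "quantity"
            },
        },
    },
    "required": ["names"],  # typo: the key is "name"
}

mismatches = list(find_undeclared_required(bad_schema))
for location, field in mismatches:
    print(f"{location}: required field {field!r} is not defined in properties")
```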

# Incorrect vs. correct
"required": ["names"]   # ❌ the key in properties is "name"
"required": ["name"]    # ✅

Q: The output is truncated and the JSON is incomplete?

Increase max_tokens. For deeply nested structures, a value of 4096 or higher is recommended.
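
Truncation can also be detected in code: the Messages API reports stop_reason as "max_tokens" when generation hit the limit. The sketch below is a minimal illustration with a hypothetical helper name (tool_input_or_raise); the two SimpleNamespace objects are stand-ins for real responses so that it runs offline.

```python
from types import SimpleNamespace

def tool_input_or_raise(message):
    """Return the tool_use input, refusing responses cut off by max_tokens."""
    if message.stop_reason == "max_tokens":
        raise RuntimeError("response stopped at max_tokens; the tool input may be incomplete")
    return next(b.input for b in message.content if b.type == "tool_use")

# Stand-ins for API responses (assumption: same attribute names as the SDK objects).
complete = SimpleNamespace(
    stop_reason="tool_use",
    content=[SimpleNamespace(type="tool_use", input={"order_id": "A-1"})],
)
truncated = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(type="tool_use", input={})],
)

print(tool_input_or_raise(complete))
try:
    tool_input_or_raise(truncated)
except RuntimeError as exc:
    print(f"retry with a larger max_tokens: {exc}")
```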

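Q: What if I do not want to make certain fields mandatory?

List only the mandatory fields in required. Any field defined in properties but left out of required is optional.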
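
On the consuming side, optional fields may be absent from the result, so it is safer to read them with dict.get than with direct indexing. The sketch below follows the resume_schema from earlier, where only name and skills are required; summarize_resume is a hypothetical helper and the default values are arbitrary choices.

```python
def summarize_resume(resume):
    """Read required fields directly and optional fields with defaults."""
    return {
        "name": resume["name"],                        # required: safe to index
        "skills": resume["skills"],                    # required: safe to index
        "email": resume.get("email"),                  # optional: None if absent
        "years": resume.get("work_experience_years"),  # optional: None if absent
        "schools": [e.get("school") for e in resume.get("education", [])],
    }

minimal = {"name": "Zhang Wei", "skills": ["Python", "Java"]}
print(summarize_resume(minimal))
```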


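Summary

Through the Tool Use mechanism, Structured Outputs upgrades the binding force of a JSON Schema from a "hint" to a "requirement". It suits any scenario that needs stable structured output: data extraction, form filling, API response generation, content classification, and so on.

Integration is minimal: just replace base_url, and everything else stays fully compatible with the official SDK: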

client = anthropic.Anthropic(
    api_key="your-api-key",
    base_url="https://gw.claudeapi.com"   # direct access from mainland China, no other configuration needed
)

See the full API documentation and pricing: claudeapi.com
