跳到内容

JSON Schema

API 文档

pydantic.json_schema

Pydantic 允许从模型自动创建和自定义 JSON schema。生成的 JSON schema 符合以下规范

生成 JSON Schema

使用以下函数生成 JSON schema

注意

这些方法不要与 BaseModel.model_dump_jsonTypeAdapter.dump_json 混淆,后者分别序列化模型或适配类型的实例。这些方法返回 JSON 字符串。相比之下,BaseModel.model_json_schemaTypeAdapter.json_schema 返回表示模型或适配类型 JSON schema 的 jsonable dict。

关于 JSON schema 的 "jsonable" 性质

关于 model_json_schema 结果的 "jsonable" 性质,调用 json.dumps(m.model_json_schema()) 在某些 BaseModel m 上会返回有效的 JSON 字符串。类似地,对于 TypeAdapter.json_schema,调用 json.dumps(TypeAdapter(<some_type>).json_schema()) 会返回有效的 JSON 字符串。

提示

Pydantic 为以下两种方式提供支持:

  1. 自定义 JSON Schema
  2. 自定义 JSON Schema 生成过程

第一种方法通常具有更窄的范围,允许针对更具体的案例和类型自定义 JSON schema。第二种方法通常具有更广泛的范围,允许自定义整个 JSON schema 生成过程。相同效果可以通过任一方法实现,但根据您的用例,一种方法可能比另一种方法提供更简单的解决方案。

这是一个从 BaseModel 生成 JSON schema 的示例

import json
from enum import Enum
from typing import Annotated, Union

from pydantic import BaseModel, Field
from pydantic.config import ConfigDict


class FooBar(BaseModel):
    count: int
    size: Union[float, None] = None


class Gender(str, Enum):
    male = 'male'
    female = 'female'
    other = 'other'
    not_given = 'not_given'


class MainModel(BaseModel):
    """
    This is the description of the main model
    """

    model_config = ConfigDict(title='Main')

    foo_bar: FooBar
    gender: Annotated[Union[Gender, None], Field(alias='Gender')] = None
    snap: int = Field(
        default=42,
        title='The Snap',
        description='this is the value of snap',
        gt=30,
        lt=50,
    )


main_model_schema = MainModel.model_json_schema()  # (1)!
print(json.dumps(main_model_schema, indent=2))  # (2)!

JSON 输出

{
  "$defs": {
    "FooBar": {
      "properties": {
        "count": {
          "title": "Count",
          "type": "integer"
        },
        "size": {
          "anyOf": [
            {
              "type": "number"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Size"
        }
      },
      "required": [
        "count"
      ],
      "title": "FooBar",
      "type": "object"
    },
    "Gender": {
      "enum": [
        "male",
        "female",
        "other",
        "not_given"
      ],
      "title": "Gender",
      "type": "string"
    }
  },
  "description": "This is the description of the main model",
  "properties": {
    "foo_bar": {
      "$ref": "#/$defs/FooBar"
    },
    "Gender": {
      "anyOf": [
        {
          "$ref": "#/$defs/Gender"
        },
        {
          "type": "null"
        }
      ],
      "default": null
    },
    "snap": {
      "default": 42,
      "description": "this is the value of snap",
      "exclusiveMaximum": 50,
      "exclusiveMinimum": 30,
      "title": "The Snap",
      "type": "integer"
    }
  },
  "required": [
    "foo_bar"
  ],
  "title": "Main",
  "type": "object"
}
  1. 这会生成 MainModel schema 的 "jsonable" dict。
  2. 在 schema dict 上调用 json.dumps 会生成 JSON 字符串。
import json
from enum import Enum
from typing import Annotated

from pydantic import BaseModel, Field
from pydantic.config import ConfigDict


class FooBar(BaseModel):
    count: int
    size: float | None = None


class Gender(str, Enum):
    male = 'male'
    female = 'female'
    other = 'other'
    not_given = 'not_given'


class MainModel(BaseModel):
    """
    This is the description of the main model
    """

    model_config = ConfigDict(title='Main')

    foo_bar: FooBar
    gender: Annotated[Gender | None, Field(alias='Gender')] = None
    snap: int = Field(
        default=42,
        title='The Snap',
        description='this is the value of snap',
        gt=30,
        lt=50,
    )


main_model_schema = MainModel.model_json_schema()  # (1)!
print(json.dumps(main_model_schema, indent=2))  # (2)!

JSON 输出

{
  "$defs": {
    "FooBar": {
      "properties": {
        "count": {
          "title": "Count",
          "type": "integer"
        },
        "size": {
          "anyOf": [
            {
              "type": "number"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Size"
        }
      },
      "required": [
        "count"
      ],
      "title": "FooBar",
      "type": "object"
    },
    "Gender": {
      "enum": [
        "male",
        "female",
        "other",
        "not_given"
      ],
      "title": "Gender",
      "type": "string"
    }
  },
  "description": "This is the description of the main model",
  "properties": {
    "foo_bar": {
      "$ref": "#/$defs/FooBar"
    },
    "Gender": {
      "anyOf": [
        {
          "$ref": "#/$defs/Gender"
        },
        {
          "type": "null"
        }
      ],
      "default": null
    },
    "snap": {
      "default": 42,
      "description": "this is the value of snap",
      "exclusiveMaximum": 50,
      "exclusiveMinimum": 30,
      "title": "The Snap",
      "type": "integer"
    }
  },
  "required": [
    "foo_bar"
  ],
  "title": "Main",
  "type": "object"
}
  1. 这会生成 MainModel schema 的 "jsonable" dict。
  2. 在 schema dict 上调用 json.dumps 会生成 JSON 字符串。

TypeAdapter 类允许您创建一个对象,该对象具有用于验证、序列化以及为任意类型生成 JSON schema 的方法。这可以完全替代 Pydantic V1 中的 schema_of(现已弃用)。

这是一个从 TypeAdapter 生成 JSON schema 的示例

from pydantic import TypeAdapter

adapter = TypeAdapter(list[int])
print(adapter.json_schema())
#> {'items': {'type': 'integer'}, 'type': 'array'}

您还可以为 BaseModelTypeAdapter 的组合生成 JSON schema,如本示例所示

import json
from typing import Union

from pydantic import BaseModel, TypeAdapter


class Cat(BaseModel):
    name: str
    color: str


class Dog(BaseModel):
    name: str
    breed: str


ta = TypeAdapter(Union[Cat, Dog])
ta_schema = ta.json_schema()
print(json.dumps(ta_schema, indent=2))

JSON 输出

{
  "$defs": {
    "Cat": {
      "properties": {
        "name": {
          "title": "Name",
          "type": "string"
        },
        "color": {
          "title": "Color",
          "type": "string"
        }
      },
      "required": [
        "name",
        "color"
      ],
      "title": "Cat",
      "type": "object"
    },
    "Dog": {
      "properties": {
        "name": {
          "title": "Name",
          "type": "string"
        },
        "breed": {
          "title": "Breed",
          "type": "string"
        }
      },
      "required": [
        "name",
        "breed"
      ],
      "title": "Dog",
      "type": "object"
    }
  },
  "anyOf": [
    {
      "$ref": "#/$defs/Cat"
    },
    {
      "$ref": "#/$defs/Dog"
    }
  ]
}

配置 JsonSchemaMode

通过 model_json_schemaTypeAdapter.json_schema 方法中的 mode 参数指定 JSON schema 生成模式。默认情况下,模式设置为 'validation',这会生成与模型的验证 schema 相对应的 JSON schema。

JsonSchemaMode 是一个类型别名,表示 mode 参数的可用选项

  • 'validation'
  • 'serialization'

这是一个关于如何指定 mode 参数以及它如何影响生成的 JSON schema 的示例

from decimal import Decimal

from pydantic import BaseModel


class Model(BaseModel):
    a: Decimal = Decimal('12.34')


print(Model.model_json_schema(mode='validation'))
"""
{
    'properties': {
        'a': {
            'anyOf': [{'type': 'number'}, {'type': 'string'}],
            'default': '12.34',
            'title': 'A',
        }
    },
    'title': 'Model',
    'type': 'object',
}
"""

print(Model.model_json_schema(mode='serialization'))
"""
{
    'properties': {'a': {'default': '12.34', 'title': 'A', 'type': 'string'}},
    'title': 'Model',
    'type': 'object',
}
"""

自定义 JSON Schema

生成的 JSON schema 可以在字段级别和模型级别进行自定义,通过

  1. 字段级自定义,使用 Field 构造函数
  2. 模型级自定义,使用 model_config

在字段级别和模型级别,您都可以使用 json_schema_extra 选项向 JSON schema 添加额外信息。下面的 使用 json_schema_extra 部分提供了关于此选项的更多详细信息。

对于自定义类型,Pydantic 提供了其他工具来定制 JSON schema 生成

  1. WithJsonSchema 注解
  2. SkipJsonSchema 注解
  3. 实现 __get_pydantic_core_schema__
  4. 实现 __get_pydantic_json_schema__

字段级自定义

可选地,可以使用 Field 函数来提供关于字段和验证的额外信息。

一些字段参数专门用于自定义生成的 JSON Schema

  • title:字段的标题。
  • description:字段的描述。
  • examples:字段的示例。
  • json_schema_extra:要添加到字段的额外 JSON Schema 属性。
  • field_title_generator:一个函数,根据字段的名称和信息以编程方式设置字段的标题。

这是一个示例

import json

from pydantic import BaseModel, EmailStr, Field, SecretStr


class User(BaseModel):
    age: int = Field(description='Age of the user')
    email: EmailStr = Field(examples=['[email protected]'])
    name: str = Field(title='Username')
    password: SecretStr = Field(
        json_schema_extra={
            'title': 'Password',
            'description': 'Password of the user',
            'examples': ['123456'],
        }
    )


print(json.dumps(User.model_json_schema(), indent=2))

JSON 输出

{
  "properties": {
    "age": {
      "description": "Age of the user",
      "title": "Age",
      "type": "integer"
    },
    "email": {
      "examples": [
        "[email protected]"
      ],
      "format": "email",
      "title": "Email",
      "type": "string"
    },
    "name": {
      "title": "Username",
      "type": "string"
    },
    "password": {
      "description": "Password of the user",
      "examples": [
        "123456"
      ],
      "format": "password",
      "title": "Password",
      "type": "string",
      "writeOnly": true
    }
  },
  "required": [
    "age",
    "email",
    "name",
    "password"
  ],
  "title": "User",
  "type": "object"
}

未强制执行的 Field 约束

如果 Pydantic 发现未被强制执行的约束,则会引发错误。如果您想强制约束出现在 schema 中,即使在解析时未进行检查,您也可以使用可变参数到 Field,并使用原始 schema 属性名称

from pydantic import BaseModel, Field, PositiveInt

try:
    # this won't work since `PositiveInt` takes precedence over the
    # constraints defined in `Field`, meaning they're ignored
    class Model(BaseModel):
        foo: PositiveInt = Field(lt=10)

except ValueError as e:
    print(e)


# if you find yourself needing this, an alternative is to declare
# the constraints in `Field` (or you could use `conint()`)
# here both constraints will be enforced:
class ModelB(BaseModel):
    # Here both constraints will be applied and the schema
    # will be generated correctly
    foo: int = Field(gt=0, lt=10)


print(ModelB.model_json_schema())
"""
{
    'properties': {
        'foo': {
            'exclusiveMaximum': 10,
            'exclusiveMinimum': 0,
            'title': 'Foo',
            'type': 'integer',
        }
    },
    'required': ['foo'],
    'title': 'ModelB',
    'type': 'object',
}
"""

您也可以通过 Field 构造函数,经由 typing.Annotated 来指定 JSON schema 修改

import json
from typing import Annotated
from uuid import uuid4

from pydantic import BaseModel, Field


class Foo(BaseModel):
    id: Annotated[str, Field(default_factory=lambda: uuid4().hex)]
    name: Annotated[str, Field(max_length=256)] = Field(
        'Bar', title='CustomName'
    )


print(json.dumps(Foo.model_json_schema(), indent=2))

JSON 输出

{
  "properties": {
    "id": {
      "title": "Id",
      "type": "string"
    },
    "name": {
      "default": "Bar",
      "maxLength": 256,
      "title": "CustomName",
      "type": "string"
    }
  },
  "title": "Foo",
  "type": "object"
}

程序化字段标题生成

field_title_generator 参数可用于根据字段的名称和信息以编程方式生成字段的标题。

请参阅以下示例

import json

from pydantic import BaseModel, Field
from pydantic.fields import FieldInfo


def make_title(field_name: str, field_info: FieldInfo) -> str:
    return field_name.upper()


class Person(BaseModel):
    name: str = Field(field_title_generator=make_title)
    age: int = Field(field_title_generator=make_title)


print(json.dumps(Person.model_json_schema(), indent=2))
"""
{
  "properties": {
    "name": {
      "title": "NAME",
      "type": "string"
    },
    "age": {
      "title": "AGE",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "title": "Person",
  "type": "object"
}
"""

模型级自定义

您还可以使用 模型配置 来自定义模型上的 JSON schema 生成。具体来说,以下配置选项相关

使用 json_schema_extra

json_schema_extra 选项可用于向 JSON schema 添加额外信息,可以在 字段级别模型级别。您可以将 dictCallable 传递给 json_schema_extra

json_schema_extradict 一起使用

您可以将 dict 传递给 json_schema_extra 以向 JSON schema 添加额外信息

import json

from pydantic import BaseModel, ConfigDict


class Model(BaseModel):
    a: str

    model_config = ConfigDict(json_schema_extra={'examples': [{'a': 'Foo'}]})


print(json.dumps(Model.model_json_schema(), indent=2))

JSON 输出

{
  "examples": [
    {
      "a": "Foo"
    }
  ],
  "properties": {
    "a": {
      "title": "A",
      "type": "string"
    }
  },
  "required": [
    "a"
  ],
  "title": "Model",
  "type": "object"
}

json_schema_extraCallable 一起使用

您可以将 Callable 传递给 json_schema_extra 以使用函数修改 JSON schema

import json

from pydantic import BaseModel, Field


def pop_default(s):
    s.pop('default')


class Model(BaseModel):
    a: int = Field(default=1, json_schema_extra=pop_default)


print(json.dumps(Model.model_json_schema(), indent=2))

JSON 输出

{
  "properties": {
    "a": {
      "title": "A",
      "type": "integer"
    }
  },
  "title": "Model",
  "type": "object"
}

合并 json_schema_extra

从 v2.9 开始,Pydantic 合并来自注解类型的 json_schema_extra 字典。这种模式提供了一种更具累加性的合并方法,而不是之前的覆盖行为。这对于在多种类型中重用 json schema 额外信息的情况非常有帮助。

我们将此更改主要视为错误修复,因为它解决了 BaseModelTypeAdapter 实例之间 json_schema_extra 合并行为的意外差异 - 有关更多详细信息,请参阅 此问题

import json
from typing import Annotated

from typing_extensions import TypeAlias

from pydantic import Field, TypeAdapter

ExternalType: TypeAlias = Annotated[
    int, Field(json_schema_extra={'key1': 'value1'})
]

ta = TypeAdapter(
    Annotated[ExternalType, Field(json_schema_extra={'key2': 'value2'})]
)
print(json.dumps(ta.json_schema(), indent=2))
"""
{
  "key1": "value1",
  "key2": "value2",
  "type": "integer"
}
"""
import json
from typing import Annotated

from typing import TypeAlias

from pydantic import Field, TypeAdapter

ExternalType: TypeAlias = Annotated[
    int, Field(json_schema_extra={'key1': 'value1'})
]

ta = TypeAdapter(
    Annotated[ExternalType, Field(json_schema_extra={'key2': 'value2'})]
)
print(json.dumps(ta.json_schema(), indent=2))
"""
{
  "key1": "value1",
  "key2": "value2",
  "type": "integer"
}
"""

注意

我们不再(也从未完全)支持组合 dictcallable 类型的 json_schema_extra 规范。如果这是您的用例的要求,请 打开一个 pydantic issue 并解释您的情况 - 当呈现有说服力的案例时,我们很乐意重新考虑此决定。

WithJsonSchema 注解

API 文档

pydantic.json_schema.WithJsonSchema

提示

对于自定义类型,建议使用 WithJsonSchema 而不是 实现 __get_pydantic_json_schema__,因为它更简单且不易出错。

WithJsonSchema 注解可用于覆盖给定类型的生成(基本)JSON schema,而无需在该类型本身上实现 __get_pydantic_core_schema____get_pydantic_json_schema__。请注意,这会覆盖字段的整个 JSON Schema 生成过程(在以下示例中,还需要提供 'type')。

import json
from typing import Annotated

from pydantic import BaseModel, WithJsonSchema

MyInt = Annotated[
    int,
    WithJsonSchema({'type': 'integer', 'examples': [1, 0, -1]}),
]


class Model(BaseModel):
    a: MyInt


print(json.dumps(Model.model_json_schema(), indent=2))

JSON 输出

{
  "properties": {
    "a": {
      "examples": [
        1,
        0,
        -1
      ],
      "title": "A",
      "type": "integer"
    }
  },
  "required": [
    "a"
  ],
  "title": "Model",
  "type": "object"
}

注意

您可能会想使用 WithJsonSchema 注解来微调附加了 验证器 的字段的 JSON Schema。相反,建议使用 json_schema_input_type 参数

SkipJsonSchema 注解

API 文档

pydantic.json_schema.SkipJsonSchema

SkipJsonSchema 注解可用于从生成的 JSON schema 中跳过包含的字段(或字段规范的一部分)。有关更多详细信息,请参阅 API 文档。

实现 __get_pydantic_core_schema__

自定义类型(用作 field_name: TheTypefield_name: Annotated[TheType, ...])以及 Annotated 元数据(用作 field_name: Annotated[int, SomeMetadata])可以通过实现 __get_pydantic_core_schema__ 来修改或覆盖生成的 schema。此方法接收两个位置参数:

  1. 对应于此类型的类型注解(因此在 TheType[T][int] 的情况下,它将是 TheType[int])。
  2. 一个处理程序/回调,用于调用 __get_pydantic_core_schema__ 的下一个实现者。

处理程序系统的工作方式与 wrap 字段验证器 类似。在这种情况下,输入是类型,输出是 core_schema

这是一个自定义类型的示例,它覆盖了生成的 core_schema

from dataclasses import dataclass
from typing import Any

from pydantic_core import core_schema

from pydantic import BaseModel, GetCoreSchemaHandler


@dataclass
class CompressedString:
    dictionary: dict[int, str]
    text: list[int]

    def build(self) -> str:
        return ' '.join([self.dictionary[key] for key in self.text])

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source: type[Any], handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        assert source is CompressedString
        return core_schema.no_info_after_validator_function(
            cls._validate,
            core_schema.str_schema(),
            serialization=core_schema.plain_serializer_function_ser_schema(
                cls._serialize,
                info_arg=False,
                return_schema=core_schema.str_schema(),
            ),
        )

    @staticmethod
    def _validate(value: str) -> 'CompressedString':
        inverse_dictionary: dict[str, int] = {}
        text: list[int] = []
        for word in value.split(' '):
            if word not in inverse_dictionary:
                inverse_dictionary[word] = len(inverse_dictionary)
            text.append(inverse_dictionary[word])
        return CompressedString(
            {v: k for k, v in inverse_dictionary.items()}, text
        )

    @staticmethod
    def _serialize(value: 'CompressedString') -> str:
        return value.build()


class MyModel(BaseModel):
    value: CompressedString


print(MyModel.model_json_schema())
"""
{
    'properties': {'value': {'title': 'Value', 'type': 'string'}},
    'required': ['value'],
    'title': 'MyModel',
    'type': 'object',
}
"""
print(MyModel(value='fox fox fox dog fox'))
"""
value = CompressedString(dictionary={0: 'fox', 1: 'dog'}, text=[0, 0, 0, 1, 0])
"""

print(MyModel(value='fox fox fox dog fox').model_dump(mode='json'))
#> {'value': 'fox fox fox dog fox'}

由于 Pydantic 不知道如何为 CompressedString 生成 schema,如果您在其 __get_pydantic_core_schema__ 方法中调用 handler(source),您将收到 pydantic.errors.PydanticSchemaGenerationError 错误。对于大多数自定义类型来说,情况都是如此,因此您几乎永远不想为自定义类型调用 handler

Annotated 元数据的过程大致相同,只是您通常可以调用 handler 以让 Pydantic 处理生成 schema。

from dataclasses import dataclass
from typing import Annotated, Any, Sequence

from pydantic_core import core_schema

from pydantic import BaseModel, GetCoreSchemaHandler, ValidationError


@dataclass
class RestrictCharacters:
    alphabet: Sequence[str]

    def __get_pydantic_core_schema__(
        self, source: type[Any], handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        if not self.alphabet:
            raise ValueError('Alphabet may not be empty')
        schema = handler(
            source
        )  # get the CoreSchema from the type / inner constraints
        if schema['type'] != 'str':
            raise TypeError('RestrictCharacters can only be applied to strings')
        return core_schema.no_info_after_validator_function(
            self.validate,
            schema,
        )

    def validate(self, value: str) -> str:
        if any(c not in self.alphabet for c in value):
            raise ValueError(
                f'{value!r} is not restricted to {self.alphabet!r}'
            )
        return value


class MyModel(BaseModel):
    value: Annotated[str, RestrictCharacters('ABC')]


print(MyModel.model_json_schema())
"""
{
    'properties': {'value': {'title': 'Value', 'type': 'string'}},
    'required': ['value'],
    'title': 'MyModel',
    'type': 'object',
}
"""
print(MyModel(value='CBA'))
#> value='CBA'

try:
    MyModel(value='XYZ')
except ValidationError as e:
    print(e)
    """
    1 validation error for MyModel
    value
      Value error, 'XYZ' is not restricted to 'ABC' [type=value_error, input_value='XYZ', input_type=str]
    """
from dataclasses import dataclass
from typing import Annotated, Any
from collections.abc import Sequence

from pydantic_core import core_schema

from pydantic import BaseModel, GetCoreSchemaHandler, ValidationError


@dataclass
class RestrictCharacters:
    alphabet: Sequence[str]

    def __get_pydantic_core_schema__(
        self, source: type[Any], handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        if not self.alphabet:
            raise ValueError('Alphabet may not be empty')
        schema = handler(
            source
        )  # get the CoreSchema from the type / inner constraints
        if schema['type'] != 'str':
            raise TypeError('RestrictCharacters can only be applied to strings')
        return core_schema.no_info_after_validator_function(
            self.validate,
            schema,
        )

    def validate(self, value: str) -> str:
        if any(c not in self.alphabet for c in value):
            raise ValueError(
                f'{value!r} is not restricted to {self.alphabet!r}'
            )
        return value


class MyModel(BaseModel):
    value: Annotated[str, RestrictCharacters('ABC')]


print(MyModel.model_json_schema())
"""
{
    'properties': {'value': {'title': 'Value', 'type': 'string'}},
    'required': ['value'],
    'title': 'MyModel',
    'type': 'object',
}
"""
print(MyModel(value='CBA'))
#> value='CBA'

try:
    MyModel(value='XYZ')
except ValidationError as e:
    print(e)
    """
    1 validation error for MyModel
    value
      Value error, 'XYZ' is not restricted to 'ABC' [type=value_error, input_value='XYZ', input_type=str]
    """

到目前为止,我们一直在包装 schema,但如果您只想修改它或忽略它,您也可以这样做。

要修改 schema,请首先调用处理程序,然后更改结果

from typing import Annotated, Any

from pydantic_core import ValidationError, core_schema

from pydantic import BaseModel, GetCoreSchemaHandler


class SmallString:
    def __get_pydantic_core_schema__(
        self,
        source: type[Any],
        handler: GetCoreSchemaHandler,
    ) -> core_schema.CoreSchema:
        schema = handler(source)
        assert schema['type'] == 'str'
        schema['max_length'] = 10  # modify in place
        return schema


class MyModel(BaseModel):
    value: Annotated[str, SmallString()]


try:
    MyModel(value='too long!!!!!')
except ValidationError as e:
    print(e)
    """
    1 validation error for MyModel
    value
      String should have at most 10 characters [type=string_too_long, input_value='too long!!!!!', input_type=str]
    """

提示

请注意,您必须返回一个 schema,即使您只是就地更改它。

要完全覆盖 schema,请不要调用处理程序并返回您自己的 CoreSchema

from typing import Annotated, Any

from pydantic_core import ValidationError, core_schema

from pydantic import BaseModel, GetCoreSchemaHandler


class AllowAnySubclass:
    def __get_pydantic_core_schema__(
        self, source: type[Any], handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        # we can't call handler since it will fail for arbitrary types
        def validate(value: Any) -> Any:
            if not isinstance(value, source):
                raise ValueError(
                    f'Expected an instance of {source}, got an instance of {type(value)}'
                )

        return core_schema.no_info_plain_validator_function(validate)


class Foo:
    pass


class Model(BaseModel):
    f: Annotated[Foo, AllowAnySubclass()]


print(Model(f=Foo()))
#> f=None


class NotFoo:
    pass


try:
    Model(f=NotFoo())
except ValidationError as e:
    print(e)
    """
    1 validation error for Model
    f
      Value error, Expected an instance of <class '__main__.Foo'>, got an instance of <class '__main__.NotFoo'> [type=value_error, input_value=<__main__.NotFoo object at 0x0123456789ab>, input_type=NotFoo]
    """

实现 __get_pydantic_json_schema__

您还可以实现 __get_pydantic_json_schema__ 来修改或覆盖生成的 json schema。修改此方法仅影响 JSON schema - 它不会影响核心 schema,后者用于验证和序列化。

这是一个修改生成的 JSON schema 的示例

import json
from typing import Any

from pydantic_core import core_schema as cs

from pydantic import GetCoreSchemaHandler, GetJsonSchemaHandler, TypeAdapter
from pydantic.json_schema import JsonSchemaValue


class Person:
    name: str
    age: int

    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Any, handler: GetCoreSchemaHandler
    ) -> cs.CoreSchema:
        return cs.typed_dict_schema(
            {
                'name': cs.typed_dict_field(cs.str_schema()),
                'age': cs.typed_dict_field(cs.int_schema()),
            },
        )

    @classmethod
    def __get_pydantic_json_schema__(
        cls, core_schema: cs.CoreSchema, handler: GetJsonSchemaHandler
    ) -> JsonSchemaValue:
        json_schema = handler(core_schema)
        json_schema = handler.resolve_ref_schema(json_schema)
        json_schema['examples'] = [
            {
                'name': 'John Doe',
                'age': 25,
            }
        ]
        json_schema['title'] = 'Person'
        return json_schema


print(json.dumps(TypeAdapter(Person).json_schema(), indent=2))

JSON 输出

{
  "examples": [
    {
      "age": 25,
      "name": "John Doe"
    }
  ],
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "age": {
      "title": "Age",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "title": "Person",
  "type": "object"
}

使用 field_title_generator

field_title_generator 参数可用于根据字段的名称和信息以编程方式生成字段的标题。这类似于字段级别的 field_title_generator,但 ConfigDict 选项将应用于该类的所有字段。

请参阅以下示例

import json

from pydantic import BaseModel, ConfigDict


class Person(BaseModel):
    model_config = ConfigDict(
        field_title_generator=lambda field_name, field_info: field_name.upper()
    )
    name: str
    age: int


print(json.dumps(Person.model_json_schema(), indent=2))
"""
{
  "properties": {
    "name": {
      "title": "NAME",
      "type": "string"
    },
    "age": {
      "title": "AGE",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "title": "Person",
  "type": "object"
}
"""

使用 model_title_generator

model_title_generator 配置选项类似于 field_title_generator 选项,但它应用于模型本身的标题,并接受模型类作为输入。

请参阅以下示例

import json

from pydantic import BaseModel, ConfigDict


def make_title(model: type) -> str:
    return f'Title-{model.__name__}'


class Person(BaseModel):
    model_config = ConfigDict(model_title_generator=make_title)
    name: str
    age: int


print(json.dumps(Person.model_json_schema(), indent=2))
"""
{
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "age": {
      "title": "Age",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "title": "Title-Person",
  "type": "object"
}
"""

JSON schema 类型

类型、自定义字段类型和约束(如 max_length)按照以下优先级顺序映射到相应的规范格式(当有等效项可用时):

  1. JSON Schema Core
  2. JSON Schema Validation
  3. OpenAPI 数据类型
  4. 标准 format JSON 字段用于为更复杂的 string 子类型定义 Pydantic 扩展。

从 Python 或 Pydantic 到 JSON schema 的字段 schema 映射如下:

{{ schema_mappings_table }}

顶层 schema 生成

您还可以生成顶层 JSON schema,该 schema 仅在其 $defs 中包含模型列表和相关的子模型

import json

from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Foo(BaseModel):
    a: str = None


class Model(BaseModel):
    b: Foo


class Bar(BaseModel):
    c: int


_, top_level_schema = models_json_schema(
    [(Model, 'validation'), (Bar, 'validation')], title='My Schema'
)
print(json.dumps(top_level_schema, indent=2))

JSON 输出

{
  "$defs": {
    "Bar": {
      "properties": {
        "c": {
          "title": "C",
          "type": "integer"
        }
      },
      "required": [
        "c"
      ],
      "title": "Bar",
      "type": "object"
    },
    "Foo": {
      "properties": {
        "a": {
          "default": null,
          "title": "A",
          "type": "string"
        }
      },
      "title": "Foo",
      "type": "object"
    },
    "Model": {
      "properties": {
        "b": {
          "$ref": "#/$defs/Foo"
        }
      },
      "required": [
        "b"
      ],
      "title": "Model",
      "type": "object"
    }
  },
  "title": "My Schema"
}

自定义 JSON Schema 生成过程

API 文档

pydantic.json_schema

如果您需要自定义 schema 生成,可以使用 schema_generator,根据您的应用程序的需要修改 GenerateJsonSchema 类。

可用于生成 JSON schema 的各种方法接受一个关键字参数 schema_generator: type[GenerateJsonSchema] = GenerateJsonSchema,您可以将自定义子类传递给这些方法,以便使用您自己的 JSON schema 生成方法。

GenerateJsonSchema 实现了将类型的 pydantic-core schema 转换为 JSON schema。根据设计,此类将 JSON schema 生成过程分解为更小的方法,这些方法可以在子类中轻松覆盖,以修改 JSON schema 生成的“全局”方法。

from pydantic import BaseModel
from pydantic.json_schema import GenerateJsonSchema


class MyGenerateJsonSchema(GenerateJsonSchema):
    def generate(self, schema, mode='validation'):
        json_schema = super().generate(schema, mode=mode)
        json_schema['title'] = 'Customize title'
        json_schema['$schema'] = self.schema_dialect
        return json_schema


class MyModel(BaseModel):
    x: int


print(MyModel.model_json_schema(schema_generator=MyGenerateJsonSchema))
"""
{
    'properties': {'x': {'title': 'X', 'type': 'integer'}},
    'required': ['x'],
    'title': 'Customize title',
    'type': 'object',
    '$schema': 'https://json-schema.fullstack.org.cn/draft/2020-12/schema',
}
"""

以下是您可以用来从 schema 中排除任何没有有效 json schema 的字段的方法

from typing import Callable

from pydantic_core import PydanticOmit, core_schema

from pydantic import BaseModel
from pydantic.json_schema import GenerateJsonSchema, JsonSchemaValue


class MyGenerateJsonSchema(GenerateJsonSchema):
    def handle_invalid_for_json_schema(
        self, schema: core_schema.CoreSchema, error_info: str
    ) -> JsonSchemaValue:
        raise PydanticOmit


def example_callable():
    return 1


class Example(BaseModel):
    name: str = 'example'
    function: Callable = example_callable


instance_example = Example()

validation_schema = instance_example.model_json_schema(
    schema_generator=MyGenerateJsonSchema, mode='validation'
)
print(validation_schema)
"""
{
    'properties': {
        'name': {'default': 'example', 'title': 'Name', 'type': 'string'}
    },
    'title': 'Example',
    'type': 'object',
}
"""
from collections.abc import Callable

from pydantic_core import PydanticOmit, core_schema

from pydantic import BaseModel
from pydantic.json_schema import GenerateJsonSchema, JsonSchemaValue


class MyGenerateJsonSchema(GenerateJsonSchema):
    def handle_invalid_for_json_schema(
        self, schema: core_schema.CoreSchema, error_info: str
    ) -> JsonSchemaValue:
        raise PydanticOmit


def example_callable():
    return 1


class Example(BaseModel):
    name: str = 'example'
    function: Callable = example_callable


instance_example = Example()

validation_schema = instance_example.model_json_schema(
    schema_generator=MyGenerateJsonSchema, mode='validation'
)
print(validation_schema)
"""
{
    'properties': {
        'name': {'default': 'example', 'title': 'Name', 'type': 'string'}
    },
    'title': 'Example',
    'type': 'object',
}
"""

JSON schema 排序

默认情况下,Pydantic 通过按字母顺序对键进行排序来递归地对 JSON schema 进行排序。值得注意的是,Pydantic 跳过对 properties 键的值进行排序,以保留字段在模型中定义的顺序。

如果您想自定义此行为,可以在自定义 GenerateJsonSchema 子类中覆盖 sort 方法。以下示例使用 no-op sort 方法完全禁用排序,这反映在模型字段和 json_schema_extra 键的保留顺序中

import json
from typing import Optional

from pydantic import BaseModel, Field
from pydantic.json_schema import GenerateJsonSchema, JsonSchemaValue


class MyGenerateJsonSchema(GenerateJsonSchema):
    def sort(
        self, value: JsonSchemaValue, parent_key: Optional[str] = None
    ) -> JsonSchemaValue:
        """No-op, we don't want to sort schema values at all."""
        return value


class Bar(BaseModel):
    c: str
    b: str
    a: str = Field(json_schema_extra={'c': 'hi', 'b': 'hello', 'a': 'world'})


json_schema = Bar.model_json_schema(schema_generator=MyGenerateJsonSchema)
print(json.dumps(json_schema, indent=2))
"""
{
  "type": "object",
  "properties": {
    "c": {
      "type": "string",
      "title": "C"
    },
    "b": {
      "type": "string",
      "title": "B"
    },
    "a": {
      "type": "string",
      "c": "hi",
      "b": "hello",
      "a": "world",
      "title": "A"
    }
  },
  "required": [
    "c",
    "b",
    "a"
  ],
  "title": "Bar"
}
"""
import json

from pydantic import BaseModel, Field
from pydantic.json_schema import GenerateJsonSchema, JsonSchemaValue


class MyGenerateJsonSchema(GenerateJsonSchema):
    def sort(
        self, value: JsonSchemaValue, parent_key: str | None = None
    ) -> JsonSchemaValue:
        """No-op, we don't want to sort schema values at all."""
        return value


class Bar(BaseModel):
    c: str
    b: str
    a: str = Field(json_schema_extra={'c': 'hi', 'b': 'hello', 'a': 'world'})


json_schema = Bar.model_json_schema(schema_generator=MyGenerateJsonSchema)
print(json.dumps(json_schema, indent=2))
"""
{
  "type": "object",
  "properties": {
    "c": {
      "type": "string",
      "title": "C"
    },
    "b": {
      "type": "string",
      "title": "B"
    },
    "a": {
      "type": "string",
      "c": "hi",
      "b": "hello",
      "a": "world",
      "title": "A"
    }
  },
  "required": [
    "c",
    "b",
    "a"
  ],
  "title": "Bar"
}
"""

自定义 JSON Schema 中的 $refs

可以通过使用 ref_template 关键字参数调用 model_json_schema()model_dump_json() 来更改 $refs 的格式。定义始终存储在键 $defs 下,但可以为引用使用指定的前缀。

如果您需要扩展或修改 JSON schema 默认定义位置,这将非常有用。例如,使用 OpenAPI

import json

from pydantic import BaseModel
from pydantic.type_adapter import TypeAdapter


class Foo(BaseModel):
    a: int


class Model(BaseModel):
    a: Foo


adapter = TypeAdapter(Model)

print(
    json.dumps(
        adapter.json_schema(ref_template='#/components/schemas/{model}'),
        indent=2,
    )
)

JSON 输出

{
  "$defs": {
    "Foo": {
      "properties": {
        "a": {
          "title": "A",
          "type": "integer"
        }
      },
      "required": [
        "a"
      ],
      "title": "Foo",
      "type": "object"
    }
  },
  "properties": {
    "a": {
      "$ref": "#/components/schemas/Foo"
    }
  },
  "required": [
    "a"
  ],
  "title": "Model",
  "type": "object"
}

关于 JSON Schema 生成的杂项说明

  • Optional 字段的 JSON schema 指示允许值 null
  • Decimal 类型在 JSON schema 中(和序列化时)作为字符串公开。
  • 由于 namedtuple 类型在 JSON 中不存在,因此模型的 JSON schema 不会将 namedtuple 保留为 namedtuple
  • 使用的子模型会添加到 $defs JSON 属性中并被引用,按照规范。
  • 具有修改(通过 Field 类)的子模型,例如自定义标题、描述或默认值,将递归包含而不是引用。
  • 模型的 description 来自类的文档字符串或 Field 类的参数 description
  • schema 默认使用别名作为键生成,但可以通过调用 model_json_schema()model_dump_json() 并使用 by_alias=False 关键字参数来使用模型属性名称生成。