Skip to content

Fields

Much of the time, fields can be defined automatically based on their type annotations. However, these annotations are generally just shorthand for a more explicit field definition. Explicit field definitions have much more flexible behavior.

Basic Fields

Let's start with a simple example: let's say we're recording test results. A simple test result schema might look like:

from cleanchausie import Schema

class TestResult(Schema):
    name: str
    score: int

The more explicit version of this schema would look like:

from cleanchausie import field, Schema, StrField, IntField

class TestResult(Schema):
    name = field(StrField())
    score = field(IntField())

Explicit Fields

Now let's say we want to be a bit stricter with our score field. We want to ensure that the score is between 0 and 100, inclusive. It turns out that the built-in IntField already supports this behavior:

from cleanchausie import field, Schema, StrField, IntField

class TestResult(Schema):
    name = field(StrField())
    score = field(IntField(min_value=0, max_value=100))

Using the field function explicitly allows for more granular control over the field's behavior. For example, field accepts:

  • parents - a list of fields the validation logic is chained through before being considered validated. In this case, the first argument is automatically considered to be the parent field.
  • accepts - a list of field names this field should accept from any source data. A common reason to use this would be renaming a field or accepting multiple deprecated field names over an API. accepts can also be None, meaning that no values are accepted for this field. Instead, the field may be derived from other fields or contextual information.
  • serialize_to - The name this field should take when serialized (if different from the field name)
  • serialize_func - A custom function to use for serializing any non-standard types or for applying any field-specific transformations to the value.
  • nullability - A Nullability object that defines how this field should behave when passed None or omitted. See the Nullability section for more details.

Custom Field Validation

But let's say we want a bit stricter validation on the name field. We want to ensure that the name is at least 3 characters long, doesn't have trailing spaces, and starts with a capital letter.

There's actually a few options for how to define this. If a field is only going to be used once, the most convenient option is to define it using field as a decorator:

from typing import Union
from cleanchausie import field, Schema, StrField, IntField, Error

class TestResult(Schema):
    score = field(IntField(min_value=0, max_value=100))

    @field(parents=StrField(min_length=3))
    def name(self, value: str) -> Union[str, Error]:
        value = value.strip()
        if not value[0].isupper():
            return Error("Name must start with a capital letter")
        return value

A few things to note here:

  • When field is used as a decorator, the name of the decorated function becomes the field name.
  • The decorated function is given a value arg, which has already been validated by the parent fields (StrField in this case).
  • If there's an error, the function should return an Error object.
  • If there isn't an error, the function should return the (perhaps modified) value. When a field has multiple parents, the value is passed through each parent in order, with the final returned value as the result.

When defining fields with the intent to reuse them, the same field can be defined as a standalone function (minus the self arg):

from typing import Union
from cleanchausie import field, Schema, StrField, IntField, Error

@field(parents=StrField(min_length=3))
def name_field(value: str) -> Union[str, Error]:
  value = value.strip()
  if not value[0].isupper():
    return Error("Name must start with a capital letter")
  return value

class TestResult(Schema):
  name = name_field
  score = field(IntField(min_value=0, max_value=100))

Dependent Fields

In our TestResult example, let's say we wanted to also automatically assign a letter grade to each result. Here's where fields get really powerful: they can depend on the validated value of sibling fields. All you have to do is add an arg with the same name as the field you want to depend on:

from cleanchausie import clean, field, serialize, Schema, IntField

class TestResult(Schema):
    name: str
    score = field(IntField(min_value=0, max_value=100))

    @field(accepts=None)
    def letter_grade(self, score: int) -> str:
        if score >= 90:
            return "A"
        elif score >= 80:
            return "B"
        elif score >= 70:
            return "C"
        elif score >= 60:
            return "D"
        else:
            return "F"

result = clean(TestResult, {"name": "Janice", "score": 89})
print(serialize(result))
"""
{
    "name": "Janice",
    "score": 89,
    "letter_grade": "B"
}
"""

Now we have a field which doesn't accept any input, but instead depends solely on the score. CleanChausie sees the score arg, validates the score field first, then passes the validated result into letter_grade.

However, we can also define fields that both depend on other fields and accept their own input:

from cleanchausie import Schema, field, StrField

class TestResult(Schema):
  name: str
  score: int

  @field(parents=StrField())
  def description(self, value: str, name: str, score: int) -> str:
    return f"{name} scored {score} on the {value} test"

Contextual Validation

Sometimes the parameters of a field's validation logic depend on the context, such as the current time, information associated with the authenticated user, or other validation-relevant data that's not part of the data being cleaned.

CleanChausie supports this by allowing fields to accept a context argument. For example:

from typing import Union, Set

import attrs
from cleanchausie import clean, field, serialize, Schema, IntField, StrField, Error

@attrs.frozen
class ValidationCtx:
    supported_languages: Set[str]

class TestResult(Schema):
    name: str
    score = field(IntField(min_value=0, max_value=100))

    @field(parents=StrField())
    def language(self, value: str, context: ValidationCtx) -> Union[str, Error]:
        if value not in context.supported_languages:
            return Error(f"Test was not offered in language: {value}")
        return value

result = clean(
    TestResult,
    {"name": "Janice", "score": 89, "language": "de"},
    context=ValidationCtx({"en", "es"})
)
print(serialize(result))
"""
{
    'errors': [
        {
            'field': ('language',),
            'msg': 'Test was not offered in language: de',
        }
    ]
}
"""

This pattern is particularly useful with explicit session management, where field validation relies on accessing a database. A database session often has a short lifecycle and should be discarded after it's been used. If this was passed in as a field, a reference would stick around on the validated schema. If we're just trying to be explicit about session management, we should pass it in using a context instead:

import attrs
from cleanchausie.fields import clean, field, StrField
from cleanchausie.schema import Schema

class MyModel:  # some ORM model
  id: str
  created_by_id: str  # User id

@attrs.frozen
class Context:
  authenticated_user: 'User'  # the User making a request
  session: 'Session'  # active ORM Session

class ContextExampleSchema(Schema):
  @field(parents=StrField(), accepts=("id",))
  def obj(self, value: str, context: Context) -> MyModel:
    # in real usage this might look more like:
    #   context.session
    #     .query(MyModel)
    #     .filter(MyModel.created_by_id == authenticated_user.id)
    #     .filter(MyModel.id == value)
    return context.session.find_by_user_and_id(
      value, context.authenticated_user.id
    )

with atomic() as session:
  result = clean(
    ContextExampleSchema,
    data={'id': 'mymodel_primarykey'},
    context=Context(authenticated_user=EXAMPLE_USER, session=session)
  )
assert isinstance(result, ContextExampleSchema)
assert isinstance(result.obj, MyModel)