Fields
Much of the time, fields can be defined automatically based on their type annotations. However, these annotations are generally just shorthand for a more explicit field definition. Explicit field definitions have much more flexible behavior.
Basic Fields
Let's start with a simple example: say we're recording test results. A simple test result schema might look like:

from cleanchausie import Schema

class TestResult(Schema):
    name: str
    score: int

The more explicit version of this schema would look like:
from cleanchausie import field, Schema, StrField, IntField

class TestResult(Schema):
    name = field(StrField())
    score = field(IntField())
Explicit Fields
Now let's say we want to be a bit stricter with our score field. We want to ensure that the score is between 0 and 100, inclusive. It turns out that the built-in IntField already supports this behavior:
from cleanchausie import field, Schema, StrField, IntField

class TestResult(Schema):
    name = field(StrField())
    score = field(IntField(min_value=0, max_value=100))
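To make "inclusive" concrete, here is a plain-Python sketch of what a bounds check of this kind amounts to. It does not use CleanChausie: check_int_range and RangeError are hypothetical names chosen for illustration, and the real IntField may differ in details such as its error messages.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class RangeError:
    """Hypothetical stand-in for a validation error (not CleanChausie's Error)."""
    msg: str


def check_int_range(value: int, min_value: int, max_value: int) -> Union[int, RangeError]:
    """Return value if min_value <= value <= max_value, else a RangeError."""
    if value < min_value:
        return RangeError(f"value must be at least {min_value}")
    if value > max_value:
        return RangeError(f"value must be at most {max_value}")
    return value
```

With bounds of 0 and 100, both 0 and 100 pass through unchanged, while -1 and 101 come back as errors.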
Using the field function explicitly allows for more granular control over the field's behavior. For example, field accepts:
- parents: a list of fields the validation logic is chained through before the value is considered validated. In the examples above, the first positional argument to field is automatically treated as the parent field.
- accepts: a list of field names this field should accept from any source data. A common reason to use this is renaming a field or accepting multiple deprecated field names over an API. accepts can also be None, meaning that no values are accepted for this field. Instead, the field may be derived from other fields or contextual information.
- serialize_to: the name this field should take when serialized (if different from the field name).
- serialize_func: a custom function to use for serializing any non-standard types or for applying any field-specific transformations to the value.
- nullability: a Nullability object that defines how this field should behave when passed None or omitted. See the Nullability section for more details.
Custom Field Validation
But let's say we want a bit stricter validation on the name field. We want to ensure that the name is at least 3 characters long, doesn't have trailing spaces, and starts with a capital letter.
There are a few options for how to define this. If a field is only going to be used once, the most convenient option is to define it using field as a decorator:
from typing import Union

from cleanchausie import field, Schema, StrField, IntField, Error

class TestResult(Schema):
    score = field(IntField(min_value=0, max_value=100))

    @field(parents=StrField(min_length=3))
    def name(self, value: str) -> Union[str, Error]:
        # Drop surrounding whitespace before checking capitalization.
        value = value.strip()
        # value[:1] is safe even if stripping left an empty string.
        if not value[:1].isupper():
            return Error("Name must start with a capital letter")
        return value
A few things to note here:
- When field is used as a decorator, the name of the decorated function becomes the field name.
- The decorated function is given a value arg, which has already been validated by the parent fields (StrField in this case).
- If there's an error, the function should return an Error object.
- If there isn't an error, the function should return the (perhaps modified) value.

When a field has multiple parents, the value is passed through each parent in order, with the final returned value as the result.
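The chaining described above can be sketched in plain Python. This is an illustration only, not CleanChausie's code: run_chain, ChainError, min_length_3, and capitalized are hypothetical names, and stopping at the first error is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Union


@dataclass(frozen=True)
class ChainError:
    """Hypothetical stand-in for a validation error."""
    msg: str


Validator = Callable[[Any], Union[Any, ChainError]]


def run_chain(value: Any, validators: Sequence[Validator]) -> Union[Any, ChainError]:
    """Pass value through each validator in order, stopping at the first error."""
    for validate in validators:
        value = validate(value)
        if isinstance(value, ChainError):
            return value
    return value


def min_length_3(value: Any) -> Union[str, ChainError]:
    """Plays the role of a parent field: type and length checks."""
    if not isinstance(value, str):
        return ChainError("Expected a string")
    if len(value) < 3:
        return ChainError("Must be at least 3 characters")
    return value


def capitalized(value: str) -> Union[str, ChainError]:
    """Plays the role of the decorated function: runs after the parent."""
    value = value.strip()
    if not value[:1].isupper():
        return ChainError("Name must start with a capital letter")
    return value
```

Here run_chain("Janice ", [min_length_3, capitalized]) returns "Janice", while "Jo" is rejected by the first validator and never reaches the second.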
When defining fields with the intent to reuse them, the same field can be defined as a standalone function (minus the self arg):
from typing import Union

from cleanchausie import field, Schema, StrField, IntField, Error

@field(parents=StrField(min_length=3))
def name_field(value: str) -> Union[str, Error]:
    # Drop surrounding whitespace before checking capitalization.
    value = value.strip()
    # value[:1] is safe even if stripping left an empty string.
    if not value[:1].isupper():
        return Error("Name must start with a capital letter")
    return value

class TestResult(Schema):
    name = name_field
    score = field(IntField(min_value=0, max_value=100))
Dependent Fields
In our TestResult example, let's say we wanted to also automatically assign a letter grade to each result. Here's where fields get really powerful: they can depend on the validated value of sibling fields. All you have to do is add an arg with the same name as the field you want to depend on:
from cleanchausie import clean, field, serialize, Schema, IntField

class TestResult(Schema):
    name: str
    score = field(IntField(min_value=0, max_value=100))

    @field(accepts=None)
    def letter_grade(self, score: int) -> str:
        if score >= 90:
            return "A"
        elif score >= 80:
            return "B"
        elif score >= 70:
            return "C"
        elif score >= 60:
            return "D"
        else:
            return "F"

result = clean(TestResult, {"name": "Janice", "score": 89})
print(serialize(result))
"""
{
    "name": "Janice",
    "score": 89,
    "letter_grade": "B"
}
"""
Now we have a field which doesn't accept any input, but instead depends solely on the score. CleanChausie sees the score arg, validates the score field first, then passes the validated result into letter_grade.
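One way to picture name-based dependencies like this is a small resolver that reads each function's parameter names and evaluates those fields first. The sketch below uses only the standard library and is not how CleanChausie is implemented: resolve_fields, summary, and the cycle check are hypothetical additions for illustration.

```python
import inspect
from typing import Any, Callable, Dict


def resolve_fields(
    data: Dict[str, Any],
    derived: Dict[str, Callable[..., Any]],
) -> Dict[str, Any]:
    """Compute derived fields, evaluating each one's dependencies first.

    data holds already-validated plain fields; derived maps a field name to a
    function whose parameter names are the fields it depends on.
    """
    resolved = dict(data)
    in_progress = set()

    def resolve(name: str) -> Any:
        if name in resolved:
            return resolved[name]
        if name not in derived:
            raise KeyError(f"unknown field: {name}")
        if name in in_progress:
            raise ValueError(f"circular dependency involving: {name}")
        in_progress.add(name)
        func = derived[name]
        # Parameter names double as the names of the fields depended on.
        deps = list(inspect.signature(func).parameters)
        resolved[name] = func(**{dep: resolve(dep) for dep in deps})
        in_progress.discard(name)
        return resolved[name]

    for name in derived:
        resolve(name)
    return resolved


def letter_grade(score: int) -> str:
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return grade
    return "F"


def summary(name: str, letter_grade: str) -> str:
    return f"{name}: {letter_grade}"
```

Calling resolve_fields({"name": "Janice", "score": 89}, {"summary": summary, "letter_grade": letter_grade}) computes letter_grade before summary even though summary is listed first, because summary names letter_grade as a parameter.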
However, we can also define fields that both depend on other fields and accept their own input:
from cleanchausie import Schema, field, StrField

class TestResult(Schema):
    name: str
    score: int

    @field(parents=StrField())
    def description(self, value: str, name: str, score: int) -> str:
        return f"{name} scored {score} on the {value} test"
Contextual Validation
Sometimes the parameters of a field's validation logic depend on the context, such as the current time, information associated with the authenticated user, or other validation-relevant data that's not part of the data being cleaned.
CleanChausie supports this by allowing fields to accept a context argument.
For example:
from typing import Union, Set

import attrs

from cleanchausie import clean, field, serialize, Schema, IntField, StrField, Error

@attrs.frozen
class ValidationCtx:
    supported_languages: Set[str]

class TestResult(Schema):
    name: str
    score = field(IntField(min_value=0, max_value=100))

    @field(parents=StrField())
    def language(self, value: str, context: ValidationCtx) -> Union[str, Error]:
        if value not in context.supported_languages:
            return Error(f"Test was not offered in language: {value}")
        return value

result = clean(
    TestResult,
    {"name": "Janice", "score": 89, "language": "de"},
    context=ValidationCtx({"en", "es"})
)
print(serialize(result))
"""
{
    'errors': [
        {
            'field': ('language',),
            'msg': 'Test was not offered in language: de',
        }
    ]
}
"""
This pattern is particularly useful with explicit session management, where field validation relies on accessing a database. A database session often has a short lifecycle and should be discarded after it has been used. If the session were passed in as a field, a reference to it would stick around on the validated schema. If we're just trying to be explicit about session management, we should pass it in using a context instead:
import attrs

from cleanchausie.fields import clean, field, StrField
from cleanchausie.schema import Schema

class MyModel:  # some ORM model
    id: str
    created_by_id: str  # User id

@attrs.frozen
class Context:
    authenticated_user: 'User'  # the User making a request
    session: 'Session'  # active ORM Session

class ContextExampleSchema(Schema):
    @field(parents=StrField(), accepts=("id",))
    def obj(self, value: str, context: Context) -> MyModel:
        # in real usage this might look more like:
        # context.session
        #     .query(MyModel)
        #     .filter(MyModel.created_by_id == context.authenticated_user.id)
        #     .filter(MyModel.id == value)
        return context.session.find_by_user_and_id(
            value, context.authenticated_user.id
        )

# atomic() and EXAMPLE_USER are placeholders for your own session manager and User.
with atomic() as session:
    result = clean(
        ContextExampleSchema,
        data={'id': 'mymodel_primarykey'},
        context=Context(authenticated_user=EXAMPLE_USER, session=session)
    )
assert isinstance(result, ContextExampleSchema)
assert isinstance(result.obj, MyModel)
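The idea of handing a context only to the fields that ask for it can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not CleanChausie's mechanism: call_with_context, ExampleCtx, CtxError, check_language, and upper are hypothetical names, and detecting the parameter by the literal name context is an assumption of the sketch.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet, Union


@dataclass(frozen=True)
class ExampleCtx:
    """Hypothetical context object holding data that is not being cleaned."""
    supported_languages: FrozenSet[str]


@dataclass(frozen=True)
class CtxError:
    """Hypothetical stand-in for a validation error."""
    msg: str


def call_with_context(func: Callable[..., Any], value: Any, context: Any = None) -> Any:
    """Call func(value), adding context= only if func declares that parameter."""
    params = inspect.signature(func).parameters
    if "context" in params:
        return func(value, context=context)
    return func(value)


def check_language(value: str, context: ExampleCtx) -> Union[str, CtxError]:
    """Needs the context to know which languages are allowed."""
    if value not in context.supported_languages:
        return CtxError(f"Test was not offered in language: {value}")
    return value


def upper(value: str) -> str:
    """Does not declare context, so it never receives one."""
    return value.upper()
```

Because the context is passed per call rather than stored on the result, short-lived objects such as a database session do not outlive the validation step.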