Table of Contents
Introduction
Data integrity refers to the accuracy of data stored inside a database. When creating web applications, data integrity becomes an important issue, and rules help enforce data integrity at various levels including inside Django by writing Python code or at the database level with constraints. I prefer using database constraints because they require very little code and maintenance, and the rule gets enforced regardless of where in the code the database table is used. A Django Model’s clean method is an alternative to database constraints. Using the clean method can be confusing because it is not called by a Model’s save method. I’ve found this leads to bugs and invalid data entering the database.
For this article, we’ll create a Django application to store employee data for various organizations. Part of our application allows organizations to set employee goals with a score between 1 and 5. The goal includes a baseline score where the employee starts and a target score where the employee should end.
Example Code
import decimal
from django.contrib.auth import get_user_model
from django.contrib.postgres.constraints import ExclusionConstraint
from django.contrib.postgres.fields import (
DateRangeField,
RangeBoundary,
RangeOperators,
)
from django.core.validators import MaxValueValidator, MinValueValidator
from django.db import models
User = get_user_model()
class DateRangeFunc(models.Func):
function = "daterange"
output_field = DateRangeField()
class Organization(models.Model):
name = models.CharField(max_length=255)
class Employee(models.Model):
class Meta:
constraints = [
models.UniqueConstraint(
fields=["user", "organization"], name="unique_employee_user"
)
]
user = models.ForeignKey(
User, on_delete=models.CASCADE, related_name="employees"
)
organization = models.ForeignKey(
Organization, on_delete=models.CASCADE, related_name="employees"
)
class EmployeeGoal(models.Model):
class Meta:
constraints = [
ExclusionConstraint(
name="exclude_overlapping_goals",
expressions=(
(
DateRangeFunc(
"start_date", "end_date", RangeBoundary()
),
RangeOperators.OVERLAPS,
),
("employee", RangeOperators.EQUAL),
),
),
models.CheckConstraint(
name="baseline_less_than_target",
check=models.Q(baseline_score__lte=models.F("target_score")),
),
]
employee = models.ForeignKey(
Employee,
on_delete=models.CASCADE,
)
description = models.TextField()
target_score = models.DecimalField(
decimal_places=2,
max_digits=3,
validators=[
MinValueValidator(decimal.Decimal(1)),
MaxValueValidator(decimal.Decimal(5)),
],
)
baseline_score = models.DecimalField(
decimal_places=2,
max_digits=3,
default=decimal.Decimal("2.50"),
validators=[
MinValueValidator(decimal.Decimal(1)),
MaxValueValidator(decimal.Decimal(5)),
],
)
start_date = models.DateField()
end_date = models.DateField()
Unique Constraint
Our Employee
model contains a unique constraint to ensure we don’t duplicate an employee in an organization.
models.UniqueConstraint(
fields=["user", "organization"], name="unique_employee_user"
)
In the fields argument, we pass a list of model fields that should be unique together. We use the Employee
model to map Django’s default User
model to an Organization
model that we defined above our Employee
model, and we want to ensure that duplicate employee records don’t exist. Let’s say we have an organization, LAAC Technology; a User, Steven; and an employee record linking these stored in our database. If we attempt to create another employee linking LAAC Technology and Steven, an IntegrityError
gets raised, and the database prevents this new record from being created.
Check Constraint
Our check constraint exists on the EmployeeGoal model to prevent the goal’s baseline score from being greater than the target score. In our application, an employee goal where the baseline score is greater than the target score doesn’t make sense because we want the employee to improve not decline in score.
models.CheckConstraint(
name="baseline_less_than_target",
check=models.Q(baseline_score__lte=models.F("target_score")),
),
For the check argument, we pass a Q object which contains our query, baseline score is less than or equal to the target score. We wrap the target score in an F object, which lets us refer to the value of target_score within the check constraint. If the baseline score is greater than the target score, the check constraint will result in an IntegrityError
and prevent the change in the database.
Exclusion Constraint
Django supports a PostgreSQL specific database constraint called the exclusion constraint. This constraint allows us to define complex rules for our database tables. To use this constraint, you need to run a migration to install the btree gist extension.
For our EmployeeGoal
model, we only want one goal active at a time. To implement this, we use an exclusion constraint to prevent overlapping date ranges for start date and end date per employee. We define a database function that allows our constraint to determine the date range.
class DateRangeFunc(models.Func):
function = "daterange"
output_field = DateRangeField()
ExclusionConstraint(
name="exclude_overlapping_goals",
expressions=(
(
DateRangeFunc(
"start_date", "end_date", RangeBoundary()
),
RangeOperators.OVERLAPS,
),
("employee", RangeOperators.EQUAL),
),
),
The exclusion constraint allows us to pass in a list of expressions that make up the constraint. An expression consists of a reference to a field or database function and a SQL operator. Our first expression says that the date range between start_date
and end_date
should not overlap. Our second expression says that the employee should not be equal. Combining these rules creates our exclusion constraint. Let’s say that our Steven employee of LAAC Technology has a goal with a start date of Feburary 1, 2021 and end date of March 1, 2021. If we attempt to create an employee goal for Steven with a start date of Februray 15, 2021 and an end date of March 15, 2021, the attempt results in an IntegrityError
, and the new goal won’t be created.
Final Thoughts
Whenever I want to define rules for my application’s data, I use database constraints. In my experience, they result in the most consistent data because no matter where in the code a database table is operated on the rule is enforced. Occasionally, the database constraints are too limited, and I’ll be forced to write Python code. In these cases, I’ll reach for a Model’s clean method. Since this method is not called during a Model’s save method, you have to ensure that you call the clean method in the appropriate places such as inside a Form.
from Planet Python
via read more
No comments:
Post a Comment