Validation in Domain-Driven Design (DDD)

Validation is a broad subject because it is prevalent throughout all areas of an application. Validation is difficult to implement in practice because it must be implemented throughout all areas of an application, typically employing different methods for each area. In a general sense, validation is a mechanism for ensuring that operations result in valid states. The ambiguity in that statement must not be overlooked because it illustrates several important characteristics of validation. One characteristic is context: the context under which validation is invoked. Context is critical because validation in one context may not be applicable in another context. Another characteristic is the open-endedness of what is regarded as valid. Validity may be a trivial statement such as “The string representing a customer’s name must not be null” or it may be a complex sequence of CycL assertions. This post addresses validation as it manifests in DDD-based enterprise applications. Validation, in this post, is distinct from the related discipline of correctness in theoretical computer science as researched by the likes of Edsger Dijkstra.

Always Valid

In domain-driven design, there are two schools of thought regarding validation which revolve around the notion of the always valid entity. Jeffrey Palermo proposes that the always valid entity is a fallacy. He suggests that validation logic should be decoupled from the entity, which would defer the determination of the validation rules to invoke until runtime. The other school of thought, supported by Greg Young and others, asserts that entities should be always valid. (I must admit that I side with the always-valid school of thought and therefore my statements are biased.)

The scenarios explored by Palermo are certainly suitable and typical; however, solutions involving always-valid entities can be implemented. He raises the following considerations for his example:

  • The fact that name is required needs to be context-bound. When is it invalid?
  • The message should be the responsibility of the presentation layer.
  • When loading historical data, some genders may be missing. Should the application blow up when loading data?
  • When loading historical data, perhaps the user needs to enter a gender when he edits his profile the next time.

Point one addresses the fact that if the user profile entity prevents a null name from being assigned, all application code where a null name might be assigned will blow up. Palermo’s solution is to decouple validation such that code where this might happen invokes a different set of validation rules, if any. An alternative solution is to use a different model designed for that particular scenario. In fact, the read-model pattern can be a fitting approach.

The second point discusses error messages in the presentation layer. User error messages certainly are the responsibility of the presentation layer, but the always-valid entity does not imply that error messages in exceptions raised by the entity should be propagated directly to the presentation layer. On the contrary, this is regarded as an anti-pattern and a potential security threat. Instead, the presentation layer, viewed as an adapter in a Hexagonal architecture, should catch and interpret the exception, translating it into a form applicable to the UI framework at hand.

Finally, the last two points discuss an interesting evolutionary issue. Suppose that a gender attribute is introduced into the user profile entity. It is evident that existing users won’t have a gender specified. This is a realistic business scenario, and from a business perspective the users without a specified gender are simply users without a specified gender. When translated into code, this could mean a gender type of “unspecified”. This gender type can serve as a flag to initiate the workflow which asks the user to specify a gender. For new users, the presentation layer can enforce the rule that a gender must be specified. There is no need to allow the entity to enter an invalid state.

From the DDD perspective, validation rules can be viewed as invariants. One of the central responsibilities of an aggregate is enforcement of invariants across state changes.
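The “unspecified” gender and the translation of entity exceptions in the presentation layer might be sketched as follows. This is only an illustration in Java; the names Gender, UserProfile, needsGenderPrompt, specifyGender and ProfilePresenter are hypothetical and do not come from Palermo’s example.

```java
import java.util.Objects;

// All names here are hypothetical; this is a sketch, not the original example.
enum Gender { UNSPECIFIED, FEMALE, MALE }

final class UserProfile {
    private final String name;
    private Gender gender;

    // Historical records are loaded with Gender.UNSPECIFIED, so loading old
    // data never requires the entity to be in an invalid state.
    UserProfile(String name, Gender gender) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("name is required");
        }
        this.name = name;
        this.gender = Objects.requireNonNull(gender, "gender");
    }

    String getName() { return name; }

    Gender getGender() { return gender; }

    // Flag that can start the "please specify a gender" workflow.
    boolean needsGenderPrompt() { return gender == Gender.UNSPECIFIED; }

    void specifyGender(Gender newGender) {
        if (newGender == null || newGender == Gender.UNSPECIFIED) {
            throw new IllegalArgumentException("a concrete gender is required");
        }
        this.gender = newGender;
    }
}

// Presentation-side adapter: the entity's exception is caught and replaced
// with wording chosen by the UI; the raw exception message is not shown.
final class ProfilePresenter {
    static String createProfile(String name) {
        try {
            new UserProfile(name, Gender.UNSPECIFIED);
            return "Profile created.";
        } catch (IllegalArgumentException e) {
            return "Please enter your name.";
        }
    }
}
```

A profile loaded from historical data reports needsGenderPrompt() as true until specifyGender is called, yet at no point is the entity itself invalid.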

Jimmy Bogard writes:

If we start looking at command/query separation and closure of operations not only on our service objects but our entities as well, we can treat our entities with a little more respect and not drag them around into areas they don’t really belong. Simply put, if we control the operation side of the equation, why in the world would we allow our entities to get into an invalid state? Life becomes much more complicated if we start having “IsValid” properties on our entities.

These succinct statements carry a great deal of information. First is the idea of the “IsValid” property. A requirement to invoke validation or to query an “IsValid” property makes the calling code non-atomic, which can lead to inconsistencies and a greater potential for human error. It is the difference between creating a user profile through a constructor that rejects invalid arguments, and creating an empty user profile, assigning its properties, and then having to check “IsValid” before using it.

The second form requires clients of the UserProfile class to be aware of the “IsValid” property and always use it consistently. The first form avoids this altogether: the operation of instantiating a user profile is atomic. This is a good example of leveraging programming language constructs to represent real world constraints. The next important part of Bogard’s statement is “not drag them around into areas they don’t really belong”, which leads into the subsequent section on application layers. If entity validation seems inappropriate in a certain area, then this may be an indication that an entity doesn’t belong in that area.
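The two forms contrasted above might look like this in Java. This is a minimal sketch with hypothetical names (CheckedUserProfile and Callers are invented for the illustration), not a reproduction of the post’s original listings.

```java
// First form: construction is atomic. An instance either exists and is
// valid, or the constructor throws and no instance exists at all.
final class UserProfile {
    private final String name;

    UserProfile(String name) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("name is required");
        }
        this.name = name;
    }

    String getName() { return name; }
}

// Second form (hypothetical name): an instance can exist in an invalid
// state, and every caller has to remember to ask whether it is valid.
final class CheckedUserProfile {
    private String name;

    void setName(String name) { this.name = name; }

    String getName() { return name; }

    boolean isValid() { return name != null && !name.trim().isEmpty(); }
}

final class Callers {
    // One step; nothing to forget.
    static UserProfile first(String name) {
        return new UserProfile(name);
    }

    // Three steps; omitting the check lets an invalid object escape.
    static CheckedUserProfile second(String name) {
        CheckedUserProfile profile = new CheckedUserProfile();
        profile.setName(name);
        if (!profile.isValid()) {
            throw new IllegalStateException("invalid user profile");
        }
        return profile;
    }
}
```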

Application Layers

All sufficiently complex enterprise applications consist of multiple layers. From a user’s perspective the layers are abstracted away and they exist solely to assist the programmer in managing all of the emergent complexity. Distinct layers imply that translation must happen between the layers in order for information to propagate. For example, in a typical enterprise use case, an entity is loaded from the database, operated upon, persisted back to the database, and information regarding the operation is returned to the user through a presentation layer via perhaps a REST adapter. Application layers imply the existence of boundaries and, as Mark Seemann’s post puts it, At the Boundaries, Applications are Not Object-Oriented. The entity is contained within the domain layer and should not be dragged into areas where it doesn’t belong. In the presentation layer, a specific MVC view may require a user to enter a name and then a gender. After having entered a name, the gender is still unspecified and the target entity is in an invalid state. An always-valid entity cannot be bound to this view and in fact it should not be bound to the view - this is what the view model is for. The view model is a building block of the presentation layer and the domain entity doesn’t belong there. Instead, an appropriate domain layer entity should be created based on data contained in the view model. This can be done directly or by passing a DTO to a service.
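A rough sketch of that boundary in Java, assuming hypothetical UserProfileViewModel, UserProfile and UserProfileMapper types: the view model may sit half-filled while the user works through the view, and the entity is only created once the data crosses into the domain layer.

```java
enum Gender { FEMALE, MALE }

// Presentation layer: mutable, and allowed to be incomplete while the
// view is being filled in.
final class UserProfileViewModel {
    String name;    // entered first
    Gender gender;  // stays null until the user reaches the second field

    boolean isComplete() {
        return name != null && !name.trim().isEmpty() && gender != null;
    }
}

// Domain layer: cannot be constructed in an incomplete state.
final class UserProfile {
    private final String name;
    private final Gender gender;

    UserProfile(String name, Gender gender) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("name is required");
        }
        if (gender == null) {
            throw new IllegalArgumentException("gender is required");
        }
        this.name = name;
        this.gender = gender;
    }

    String getName() { return name; }

    Gender getGender() { return gender; }
}

// Translation at the boundary (here done directly rather than via a DTO).
final class UserProfileMapper {
    // Incomplete input is rejected by the entity's constructor.
    static UserProfile toEntity(UserProfileViewModel viewModel) {
        return new UserProfile(viewModel.name, viewModel.gender);
    }
}
```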

Validation Frameworks

Validation can be implemented with trivial if-then control flows, but this can become cumbersome and the programmer’s answer is the validation framework. Validation frameworks abound, including data annotations, FluentValidation, NHibernate Validators, Enterprise Library Validation Block, etc. Validation frameworks, however, can be abused because one can be led into thinking that a framework solves all validation concerns, across all application layers. Unfortunately, this is not always possible. In practice, I’ve found that validation frameworks are best suited for use at application layer boundaries, such as validating user input in the presentation layer, ensuring database constraints at the persistence layer, or enforcing conformance to a schema in a REST adapter. Implementing validation at each layer separately allows validation to be context specific. However, this can lead to a degree of duplication, in response to which DRY fanatics will scream blasphemy. This subject is addressed in the next section.

The domain layer is best kept lean with use of plain-old-exceptions to enforce validation rules. This is because validation frameworks carry a requirement to invoke the validation framework, similar to the “IsValid” methodology addressed above. A presentation layer developed with ASP.NET MVC provides action filters to inject validation invocation logic. In this way, a validation framework can be applied globally toward the entire application. Similar injection points don’t exist in plain C# or Java code, and use of an AOP framework can add needless complexity. This programming language “shortcoming” can be overcome with extensions to homoiconicity. .NET Code Contracts can be regarded as a validation framework with extended static verification. The more general paradigm of Design by contract proposes that software should be written in terms of formal and verifiable specifications. The Eiffel programming language, the creator of which gave us command-query separation (http://bit.ly/oAo2b) among other things, is based on this principle at its very core.

Duplicating Validation Logic

A Stack Overflow question asks whether there is a way to re-use validation logic. In practice, it is often simpler to allow a degree of duplication rather than to strive for complete consistency. Consider the following example. Suppose we have a customer entity where the customer’s name is required. Enforcing this constraint in the entity is trivial:
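A minimal sketch of such an entity, written in Java with a hypothetical Customer class; the guard clause in the constructor is the whole of the validation.

```java
// Hypothetical entity: the guard clause makes a nameless customer
// impossible to construct.
final class Customer {
    private final String name;

    Customer(String name) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("Customer name is required.");
        }
        this.name = name;
    }

    String getName() { return name; }
}
```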

An entity never stands alone, however, and we must consider the clients of this class. Who calls the constructor? In a web application implemented using ASP.NET MVC there would be a corresponding customer view model. This view model is part of the presentation layer and is designed with data binding in mind. Data annotations can be used to declare validation rules:
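The sketch below approximates such a view model in Java rather than C#. The @Required annotation and the ViewModelValidator class are hypothetical stand-ins, defined here only for illustration: the former plays the role of a data annotations attribute, the latter the role of the presentation framework that reads it. Neither is part of ASP.NET MVC or of any Java library.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a "required" validation attribute.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Required {
    String message() default "This field is required.";
}

// View model: parameterless constructor, plain setter, declarative rule.
final class CustomerViewModel {
    @Required(message = "Please enter a name.")
    private String name;

    CustomerViewModel() { }

    String getName() { return name; }

    // No guard clause: data binding may assign any value, including null.
    void setName(String name) { this.name = name; }
}

// Hypothetical stand-in for the framework step that evaluates the rules.
final class ViewModelValidator {
    static List<String> validate(Object viewModel) {
        List<String> errors = new ArrayList<>();
        for (Field field : viewModel.getClass().getDeclaredFields()) {
            Required required = field.getAnnotation(Required.class);
            if (required == null) {
                continue;
            }
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(viewModel);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (value == null || value.toString().trim().isEmpty()) {
                errors.add(required.message());
            }
        }
        return errors;
    }
}
```

Unlike the entity, an empty CustomerViewModel can exist; the rule is only evaluated when the presentation layer asks for it, and the result is a list of user-facing messages rather than an exception.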

This class is very similar to the domain entity class but with several important differences. It has a parameterless constructor, a name property without a guard clause and a data annotations validation attribute. These aspects of this class make it suitable for use in the presentation layer. Instead of attempting to carry validation rules from the entity class, the name requirement constraint is effectively duplicated. This is indeed duplication, but it is much simpler to manage than some sort of validation mapping framework.

Complex Validation

Entities can enforce certain invariants, but the scope of these invariants is always limited by the entity itself. Since entities should be lean and self-contained, without access to external services or repositories, they may not have access to the resources required to enforce certain validation rules. In this case, the application service can serve a mediating role and procure the resources required to enforce validity. There exist business rules that are not natural responsibilities of an entity or a validation framework. For example, a uniqueness constraint on a user’s user name cannot be verified by an entity because the entity does not and should not have access to the database of existing users. Instead, this rule can be enforced in the application service. Furthermore, rules may become sufficiently complex to warrant a business rules engine, in which case the application service is once again tasked with enforcing validation. An even more ambitious discipline is ontology engineering, where CycL is an ontology language. Ontological engineering purports to formalize all business rules in machine executable representations. Jeff Zhuk, one of the leading practitioners in this field, proposes a Knowledge Driven Architecture based on these technologies.
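The user name uniqueness rule might be enforced in an application service along these lines. The sketch is in Java and every type in it (User, UserRepository, InMemoryUserRepository, UserNameTakenException, UserRegistrationService) is hypothetical; the in-memory repository merely stands in for a real database.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Entity: enforces only what it can know by itself.
final class User {
    private final String userName;

    User(String userName) {
        if (userName == null || userName.trim().isEmpty()) {
            throw new IllegalArgumentException("user name is required");
        }
        this.userName = userName.trim();
    }

    String getUserName() { return userName; }
}

// Hypothetical repository abstraction; the entity never sees it.
interface UserRepository {
    boolean existsByUserName(String userName);

    void add(User user);
}

// In-memory stand-in for a database, comparing names case-insensitively.
final class InMemoryUserRepository implements UserRepository {
    private final Set<String> userNames = new HashSet<>();

    public boolean existsByUserName(String userName) {
        return userNames.contains(userName.toLowerCase(Locale.ROOT));
    }

    public void add(User user) {
        userNames.add(user.getUserName().toLowerCase(Locale.ROOT));
    }
}

final class UserNameTakenException extends RuntimeException {
    UserNameTakenException(String userName) {
        super("user name already taken: " + userName);
    }
}

// Application service: mediates between the entity and the repository.
final class UserRegistrationService {
    private final UserRepository users;

    UserRegistrationService(UserRepository users) {
        this.users = users;
    }

    User register(String userName) {
        User user = new User(userName);  // entity-level invariant
        if (users.existsByUserName(user.getUserName())) {
            throw new UserNameTakenException(user.getUserName());  // cross-entity rule
        }
        users.add(user);
        return user;
    }
}
```

The check and the insert are two separate steps here, so with concurrent registrations a unique constraint in the database would still be needed as the final safeguard.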

Summary

  • Entities should enforce their own consistency and be always-valid. What is the purpose of an entity if not to enforce its own consistency?
  • If a need arises to allow an entity to enter an invalid state, consider whether application boundaries are at play which call for a different object model.
  • Validation frameworks are best used at specific application layers, not across all layers.
  • It is easier to duplicate validation logic than to keep it consistent across application layers.
  • The application service can enforce complex validation rules not accessible to an entity.
