ORM Lazy Loading Pitfalls

Object-relational mappers (ORMs) provide a mapping layer between object-oriented code and relational databases. ORMs such as NHibernate and Entity Framework support lazy-loaded associations, which allow specific subsets of an object graph to be loaded from the underlying relational store. This is beneficial because an object model can describe an object graph that is infeasible to hold in main memory in its entirety. Lazy loading can prevent unnecessary data from being loaded, and as such it is often presented as a performance optimization technique.

This technique, however, has several drawbacks and is limited in its scalability. One drawback is that classes are static declarations, so object associations are accessible regardless of whether they are lazy loaded or eager loaded. As a result, code becomes harder to understand because it isn’t immediately clear whether navigating an association will trigger a database call behind the scenes. Moreover, care must be taken to ensure that an ORM session is still open when the association is navigated, lest we run into the dreaded LazyInitializationException. Given that lazy loading is typically implemented using the proxy pattern, data access implementation details inevitably and invisibly leak into the rest of the application. In a sense, these characteristics can be regarded as a violation of the principle of least astonishment.
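To make the hidden database call concrete, here is a minimal sketch in Python using only the standard library's sqlite3 module. It is not how NHibernate or Entity Framework are implemented: a lazily evaluated property stands in for the generated proxy, and the names `CountingConnection`, `Customer`, `load_customers` and `SessionClosedError` (a stand-in for LazyInitializationException) are all hypothetical, as is the schema.

```python
import sqlite3


class SessionClosedError(Exception):
    """Hypothetical stand-in for an ORM's LazyInitializationException."""


class CountingConnection:
    """Hypothetical 'session': wraps a sqlite3 connection and counts queries."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self.query_count = 0
        self.closed = False

    def execute(self, sql, params=()):
        if self.closed:
            raise SessionClosedError("session is closed; cannot lazy load")
        self.query_count += 1
        return self._conn.execute(sql, params)

    def close(self):
        self.closed = True
        self._conn.close()


class Customer:
    """Entity whose `orders` association is loaded on first access."""

    def __init__(self, session, customer_id, name):
        self._session = session
        self.id = customer_id
        self.name = name
        self._orders = None  # association not loaded yet

    @property
    def orders(self):
        # Reads like plain attribute access, but may issue a query.
        if self._orders is None:
            rows = self._session.execute(
                "SELECT total FROM orders WHERE customer_id = ? ORDER BY id",
                (self.id,),
            ).fetchall()
            self._orders = [row[0] for row in rows]
        return self._orders


def load_customers(session):
    rows = session.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    return [Customer(session, cid, name) for cid, name in rows]


session = CountingConnection()
session.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
session.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
session.execute("INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace')")
session.execute("INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5)")
session.query_count = 0  # ignore the setup statements

customers = load_customers(session)          # one visible query
totals = [sum(c.orders) for c in customers]  # one hidden query per customer
queries_used = session.query_count           # 1 + 2 = 3

late = load_customers(session)
session.close()
try:
    late[0].orders  # association touched after the session is gone
    failed_after_close = False
except SessionClosedError:
    failed_after_close = True
```

Nothing at the call site `c.orders` reveals that a query runs, and the same expression fails once the session has been closed, which is the surprise described above.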

The problem that lazy loading attempts to address can be illustrated with an analogy to the world wide web. The success of the world wide web can be attributed in part to its hyperlinked nature: resources are connected by links, which allows the web graph to be navigated and resources to be loaded as they are needed. It is unrealistic for the entire web to be loaded into memory, unless you are Google, of course. A relational database can be viewed as a web which ORMs attempt to navigate while at the same time mapping relational data to an object model. The reality is that there is a subtle mismatch between the object model and the relational model. More accurately, NoSQL, corresponding to the object model, and SQL, corresponding to the relational model, are mathematical duals, as described by Erik Meijer in “A co-Relational Model of Data for Large Shared Data Banks”. (Meijer was also responsible for the Reactive Extensions Framework and for demonstrating the duality between IEnumerable and IObservable.)

In practice, the mismatch exists because SQL is best suited for ad hoc queries and ad hoc field selection, whereas OOP is best suited for static models. From the relational perspective, there is a tension to select data specifically for a given query, which is in turn designed for a specific use case. From the OOP perspective, there is a tension to conform the query to an object model which is designed for a variety of use cases. An important observation is that these mapping issues are most prominent on the query side of the equation. Consequently, a technique such as read-models can be used to sidestep lazy loading issues altogether. Instead of devising intricate fetching strategies with lazy loading, it is much simpler to create read-model classes that represent queries for specific use cases. In the CQRS and event sourcing world, persistent read-models or projections are a similar technique for implementing queries.
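A read-model can be sketched in a few lines. The example below is an illustration under assumed names, not a prescribed design: the `CustomerOrderSummary` class, the `customer_order_summaries` function and the two-table schema are hypothetical, and plain sqlite3 from the Python standard library stands in for whatever query mechanism the ORM offers.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerOrderSummary:
    """Hypothetical read-model: only the fields one report screen needs."""

    customer_name: str
    order_count: int
    total_spent: float


def customer_order_summaries(conn):
    """One explicit query shaped for one use case; nothing loads lazily later."""
    rows = conn.execute(
        """
        SELECT c.name, COUNT(o.id), COALESCE(SUM(o.total), 0.0)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
        """
    ).fetchall()
    return [CustomerOrderSummary(*row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus')")
conn.execute("INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5)")

summaries = customer_order_summaries(conn)
conn.close()

# The read-models are plain values, so they stay usable after the connection closes.
names = [s.customer_name for s in summaries]
```

The query selects exactly what the use case needs in a single round trip, and the resulting objects carry no reference to a session, so there is no association left to trigger a later database call.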

Summary

Lazy loading can bring performance benefits in certain use cases.
Lazy loading is not a data loading panacea and can lead to unexpected results.
Lazy loading doesn’t scale because it is always restricted by the static nature of the target object model.
Read-models or projections can be used in place of lazy loading.
