Abstraction and Encapsulation

The notions of abstraction and encapsulation are prevalent throughout programming and are intimately related. Their differences are subtle and warrant careful examination; they are most apparent in terms of intent. Ultimately, encapsulation can be viewed as a structural aspect of abstraction, one that can also be employed without any intent to abstract.

Abstraction is frequently associated with the intent to reuse. Identifying an abstraction and preparing a suitable representation for it allows code that operates on the abstraction to be shared among all derived instances of that abstraction. This is traditional OOP polymorphism at play.
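As a minimal sketch of this kind of reuse, consider the Python snippet below. The Shape, Circle and Rectangle names are hypothetical and chosen purely for illustration. The point is that total_area is written once against the abstraction and works for any derived type.

```python
from abc import ABC, abstractmethod
from typing import Iterable
import math


class Shape(ABC):
    """Hypothetical abstraction: anything that can report its area."""

    @abstractmethod
    def area(self) -> float:
        """Return the area of the shape."""


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


class Rectangle(Shape):
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


def total_area(shapes: Iterable[Shape]) -> float:
    # Written once against the abstraction; shared by every derived type.
    return sum(shape.area() for shape in shapes)
```

Adding a new shape requires no change to total_area, which is the reuse the abstraction was introduced for. The price is that every shape now depends on the Shape contract.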

Abstraction must be applied judiciously because it incurs a non-trivial cost. It forges a dependency chain that requires maintenance, and the cost of that maintenance can outweigh the benefits. Typically, this happens when effort is spent on abstractions in non-critical corners of the code, far beneath its higher-level structures. The importance of proper abstraction tends to increase at higher levels of abstraction. A large number of abstractions at low levels results in significant refactoring friction. Ideally, the pull of the DRY principle should be balanced by the YAGNI and KISS principles. A cautionary tale of abstractions is the Limit Your Abstractions series by Ayende.

Encapsulation is a trait of an abstraction. An interface is abstract because implementation is delegated to the implementing classes. As a by-product, it also encapsulates the implementation, thereby facilitating new semantic levels. New semantic levels, however, need not be the immediate intent of encapsulation, which is also suitable for purely organizational purposes. For example, to improve readability, a private class method can be used to encapsulate an operation even if that operation is only invoked in a single place.
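The single-use private method can be sketched as follows. The InvoicePrinter class and its tax calculation are hypothetical, and Python marks the method as private only by the leading-underscore convention. The helper is called from exactly one place; it exists to give the operation a name, not to enable reuse or polymorphism.

```python
class InvoicePrinter:
    """Hypothetical class; the names are illustrative only."""

    def __init__(self, tax_rate: float) -> None:
        self._tax_rate = tax_rate

    def render(self, description: str, net_amount: float) -> str:
        # The only call site of the private helper below.
        total = self._total_with_tax(net_amount)
        return f"{description}: {total:.2f}"

    def _total_with_tax(self, net_amount: float) -> float:
        # Encapsulated purely for readability; there is no intent to abstract.
        return net_amount * (1 + self._tax_rate)
```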

These observations can be applied to discussions about the value of certain types of abstractions. There is, for instance, a debate about the value of the repository abstraction. The repository tends to be a very leaky abstraction because it is difficult to reuse in its entirety across distinct persistence implementations. As a result, significant investment in intricate repository abstraction design ends up as wasted effort - the abstractions are never actually reused. However, the repository can still reap the benefits of encapsulation. This can be done without any interfaces at all, simply by referencing a repository class that contains the data access methods. This “repository” doesn’t implement an interface and isn’t intended for polymorphism - it is only used to encapsulate.
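A rough sketch of such an interface-free repository is shown below, using SQLite from the Python standard library. The PostRepository name, the posts table and its columns are assumptions made for this example and are not taken from any particular project.

```python
import sqlite3
from typing import List


class PostRepository:
    """Concrete class with no interface; it only groups data access code.

    The table name and columns are hypothetical.
    """

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS posts ("
            "id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
        )

    def add(self, title: str) -> int:
        # Insert a post and return the row id assigned by SQLite.
        cursor = self._connection.execute(
            "INSERT INTO posts (title) VALUES (?)", (title,)
        )
        self._connection.commit()
        return cursor.lastrowid

    def titles(self) -> List[str]:
        # Return all post titles in insertion order.
        rows = self._connection.execute(
            "SELECT title FROM posts ORDER BY id"
        ).fetchall()
        return [row[0] for row in rows]
```

Callers reference PostRepository directly. Nothing here is designed to be swapped for another persistence mechanism, yet the SQL stays in one place and out of the calling code.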

For example, the Raccoon Blog project avoids repositories and places data access logic directly into the controller. This has the immediate benefit of eliminating two code files - the interface declaration file and the implementation file. On the other hand, it increases the amount of code in the controller. This can make it difficult to distinguish between the responsibilities of the controller and those of the data access layer. Additionally, reasoning about the data access layer of an application becomes trickier because the layer isn’t explicit. Ultimately, this is a matter of preference: organizations and individual developers can choose the approach best suited to them, provided they consider the implications.
