Data model objects such as Active Record models have many negative impacts on software.

This article presents a few effective mitigation strategies to reduce the negative business impact of this approach.

For direct factual content about the short and long-term costs of using Active Record for fully automated business processes, read this thorough article.

Properties of Effective Business Software

The software that enables critical business value often has properties which are a critical liability. For example, the failure of the system to quickly adapt to changing business requirements is blamed on shortcuts that were taken in the past.

I assert that this is often an incorrect assessment.

Software often fails not because the business cut corners to meet its demands, but because the software was not written to support the speed of change that the business needs.

In order to fix such systems, developers secure the budget for a system rewrite. Then they end up building a system with the exact same medium and long-term life cycles.

Corners MUST be cut if the cost to change a system is too high.

Business software has a single goal: to empower the business in its mission. It generally needs to be reliable and to enable low lead times for business process changes.

While this seems to be common sense, I assert that we are not measuring our software development approach against these metrics. When we apply this measurement, our software falls short.

Just as declarative programmers discover ways to improve their code, behavioral modeling can help us to improve our data-model code. The following are some easy principles to apply to build much better systems using data model objects.

Principle: Expose Behavior, Not State

Due to massive leakage of the data model, the cost of comprehending and modifying all code coupled to data-model objects is significantly increased. Encapsulation is completely absent and its benefits are lost.

Typical Approach

ORM implementations, including but not limited to ActiveRecord, often expose the database schema as public properties on data model objects.

$order->discount_code

$order is an ActiveRecord entity and discount_code is a database column in the orders table.

Use cases are implemented as one or more “service objects” working together to mutate these model objects field by field.

$order = Order::create();

$order->discount_code = $discountCode;
$order->other_field = $otherField;
$order->something_id = $somethingId;
$order->other_id = $otherId;
$order->timestamp = $timestamp;

return $order;

Preferred Approach

Instead of assigning values to public properties, define life cycle behaviors.

$order = Order::place(
    $discountCode,
    $otherField,
    $somethingId,
    $otherId,
    $timestamp
);

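class Order extends ORM
{
    // Named constructor: the only place that knows which columns
    // make up a newly placed order.
    public static function place(
        $discountCode,
        $otherField,
        $somethingId,
        $otherId,
        $timestamp
    ): self {
        $order = Order::create();

        $order->discount_code = $discountCode;
        $order->other_field = $otherField;
        $order->something_id = $somethingId;
        $order->other_id = $otherId;
        $order->timestamp = $timestamp;

        return $order;
    }

    public function pay($paidAt): void
    {
        // assign paid at timestamp
    }
}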

Define ‘state change’ methods on the models and encapsulate the implementation details related to these behaviors.

This pays off in a few ways. It’s easier to build models to the correct state in both production and testing scopes. It also disincentivizes deeply nested control structures.

Testing in isolation improves because test cases do not need to manually assign many fields which become out-of-sync with production systems.

It’s now possible to construct entity state for tests using production code.

$order = Order::place(...);
$order->finalize(...);
$order->pay(...);

The important behavior is encapsulated comfortably inside the object boundary, and if the production code changes, the tests won’t need to be updated.

Providing Guardrails

Other developers still have access to the public properties. This is part of the negative cost of Active Record. Some implementations support removal of public property access. Make use of this feature to gain the benefits of encapsulation.

Disable access to public properties and force usage of behavioral methods.

This reduces the leakage of database schemas in the application and emphasizes behaviors over direct state manipulation.
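How property access is disabled depends on the ORM in use. As a minimal sketch that does not rely on any real ORM’s API, the hypothetical GuardedModel base class below keeps column values in a private array and rejects outside reads and writes through PHP’s magic __get and __set methods. The class names, the write() and read() helpers, and the isPaid() method are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical base class, standing in for whatever the ORM provides.
abstract class GuardedModel
{
    /** @var array<string, mixed> column name => value */
    private array $attributes = [];

    // Only the model's own behaviors may write columns.
    protected function write(string $column, mixed $value): void
    {
        $this->attributes[$column] = $value;
    }

    // Only the model's own behaviors may read columns.
    protected function read(string $column): mixed
    {
        return $this->attributes[$column] ?? null;
    }

    // Direct property writes are rejected.
    public function __set(string $name, mixed $value): void
    {
        throw new LogicException("Direct write to '{$name}' is disabled; use a behavior method.");
    }

    // Direct property reads are rejected.
    public function __get(string $name): mixed
    {
        throw new LogicException("Direct read of '{$name}' is disabled; use a behavior method.");
    }
}

final class Order extends GuardedModel
{
    public static function place(string $discountCode): self
    {
        $order = new self();
        $order->write('discount_code', $discountCode);

        return $order;
    }

    public function pay(DateTimeImmutable $paidAt): void
    {
        $this->write('paid_at', $paidAt);
    }

    public function isPaid(): bool
    {
        return $this->read('paid_at') !== null;
    }
}
```

With this guard in place, `$order->discount_code = 'X'` throws a LogicException, while `Order::place(...)` and `$order->pay(...)` keep working.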

Principle: Encapsulate Local and Distributed State Changes

Increasingly we implement message-driven systems in order to handle concerns such as scale, complexity, governance, and more. It’s critical that we ensure both local and distributed state changes occur correctly and atomically.

Typical Approach

Active Record models utilize their direct access to a database handle to store changes. Event dispatchers are used to dispatch messages to a bus.

$order->save();

$this->eventDispatcher->dispatch(
    new OrderWasPaid(...)
);

Another common approach is to use “observers” to react to state changes.

public function onPaid(Order $order): void
{
    $this->events->dispatch(
        new OrderWasPaid(...)
    );
}

There are issues with both approaches.

Multiple behavioral routes often result in similar conclusions. For example, an order can become PAID when a customer “checks out” on the website or when a recurring payment is charged automatically at the end of every month.

Out of Sync Behaviors

When multiple routes arrive at the same conclusion, they must duplicate behavior. The dispatch of the events or the correct assignment of fields may become out of sync. This risk grows as new developers are added to the project.

Transactional Consistency

If the system fails between the storage of local data and the dispatch of events, critical message consumers will not receive the event. Important business processes may not be triggered.

It’s possible for an Order to become paid without the order fulfillment processes being triggered. The customer must then contact support and ask why their purchase stalled.

Preferred Approach

To mitigate these issues, prefer the following techniques.

Encapsulate State Changes in Models

Couple and encapsulate relational and distributed state changes.

Don’t:

  1. mutate the order model
  2. call save()
  3. then dispatch an event

Instead, encapsulate the distributed state change within the data model itself, so that there is only one ‘end point’ for this state change.

public function pay($paidAt): void
{
    // assign paid at timestamp
    $this->recordedEvents[] = new OrderWasPaid(...);
}

public function flushEvents(): Collection
{
    // Hand over the buffered events and clear the buffer,
    // so the same event is never flushed twice.
    $eventsToFlush = $this->recordedEvents;
    $this->recordedEvents = [];

    return Collection::of($eventsToFlush);
}

Now, we have encapsulated both local state changes and distributed state changes into a single entry-point, making it impossible to pay an order without buffering the correct event for dispatch.
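To illustrate the point with the two routes mentioned earlier, here is a minimal self-contained sketch. It is not tied to any ORM: the Order constructor, the OrderWasPaid fields, and the checkOut() and chargeRecurringPayment() functions are hypothetical, and flushEvents() returns a plain array instead of a Collection to keep the example dependency-free.

```php
<?php
declare(strict_types=1);

// Hypothetical event; its fields are assumptions for this sketch.
final class OrderWasPaid
{
    public function __construct(
        public readonly string $orderId,
        public readonly DateTimeImmutable $paidAt,
    ) {
    }
}

final class Order
{
    private ?DateTimeImmutable $paidAt = null;

    /** @var list<object> events recorded but not yet handed to a dispatcher */
    private array $recordedEvents = [];

    public function __construct(private readonly string $id)
    {
    }

    // The single entry point: the local change and the event always travel together.
    public function pay(DateTimeImmutable $paidAt): void
    {
        $this->paidAt = $paidAt;
        $this->recordedEvents[] = new OrderWasPaid($this->id, $paidAt);
    }

    /** @return list<object> */
    public function flushEvents(): array
    {
        $eventsToFlush = $this->recordedEvents;
        $this->recordedEvents = [];

        return $eventsToFlush;
    }
}

// Route 1: the customer checks out on the website.
function checkOut(Order $order, DateTimeImmutable $now): void
{
    $order->pay($now);
}

// Route 2: a recurring payment is charged automatically.
function chargeRecurringPayment(Order $order, DateTimeImmutable $now): void
{
    $order->pay($now);
}
```

Neither route mentions OrderWasPaid, yet both buffer it, because the event is recorded inside pay() itself.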

Exclusively Use Repositories and Transactional Outbox Event Dispatch

Do not use the Active Record save() method outside of repositories. Implement a repository interface.

interface OrderRepository
{
    public function findById($orderId): Order;
    public function store(Order $order): void;
}

Within the repository implementation:

  1. begin a database transaction
  2. store the local state changes
  3. dispatch the pending events to an outbox table
  4. commit the database transaction

class OrmOrderRepository implements OrderRepository
{
    public function __construct(
        private readonly OutboxEventDispatcher $outboxEventDispatcher,
    ) {
    }

    public function store(
        Order $order
    ): void {
        /* 1. Open a database transaction */
        DB::transaction(function () use ($order) {
            /* 2. Store the model */
            $order->save();

            /* 3. Dispatch Events to a relational db table. */
            $this->outboxEventDispatcher->dispatch(
                $order->flushEvents()
            );
        });
        /* 4. The transaction commits when the closure returns */
    }
}

If you aren’t familiar with the outbox pattern, I recommend learning about transactional outbox dispatch from Frank de Jonge. The core concept is that events are stored in a database table in the same commit as the local state changes. An external “relayer” process then reads from this table and dispatches the events to a message bus, resulting in improved resiliency.

Exclusively using this OrderRepository provides an important guarantee: local state changes and their events are committed together or not at all.
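The relayer side of the outbox pattern can be sketched as follows. This is a simplified, in-memory illustration under stated assumptions, not Frank de Jonge’s implementation: InMemoryOutbox stands in for the outbox database table, OutboxRelay is a hypothetical class, and the message bus is represented by a closure.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the outbox table: one row per stored event.
final class InMemoryOutbox
{
    /** @var array<int, array{payload: string, relayed: bool}> */
    private array $rows = [];

    public function append(string $payload): void
    {
        $this->rows[] = ['payload' => $payload, 'relayed' => false];
    }

    /** @return array<int, string> row id => payload, oldest first */
    public function pending(): array
    {
        $pending = [];
        foreach ($this->rows as $id => $row) {
            if (!$row['relayed']) {
                $pending[$id] = $row['payload'];
            }
        }

        return $pending;
    }

    public function markRelayed(int $id): void
    {
        $this->rows[$id]['relayed'] = true;
    }
}

final class OutboxRelay
{
    // $publish sends one payload to the message bus (assumed interface).
    public function __construct(
        private readonly InMemoryOutbox $outbox,
        private readonly Closure $publish,
    ) {
    }

    // Publishes every pending row and returns how many were relayed.
    // A row is marked only after publishing succeeds, so a failure in
    // between leads to a retry on the next run rather than a lost event.
    public function relayPending(): int
    {
        $relayed = 0;
        foreach ($this->outbox->pending() as $id => $payload) {
            ($this->publish)($payload);
            $this->outbox->markRelayed($id);
            $relayed++;
        }

        return $relayed;
    }
}
```

Because a row may be published and then fail to be marked, a relay built this way delivers each event at least once, so consumers should tolerate duplicates.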

Principle: Prefer Passing Independent Variables to Methods

Because data models carry an incredible amount of context, they make poor function arguments. Applications that pass data models to many functions end up with brittle unit test suites, and the resulting systems resist refactoring.

Typical Approach

Often, functions accept data models, which may read or write to any field, and then pass the models on for further mutation.

class FirstBehavior
{
    public function do(...): Model
    {
        $entity = $this->createEntity(...);

        // other validations
        // other assignments

        // hand the whole model on for further mutation
        return $this->secondBehavior->do($entity);
    }
}

class SecondBehavior
{
    public function do(Model $entity): Model
    {
        // do stuff using the $entity
        return $entity;
    }
}

While this is sometimes desirable or unavoidable, it comes at a cost.

Preferred Approach

Avoid passing data models to functions where practical.

Prefer Flat Structures

If we receive a request for an order, we first process the order, then we process a charge for the order.

The following example couples the processing of the order with the processing of the charge using nesting.

Nesting Calls
-> ProcessOrder -> ProcessCharge

class ProcessOrder
{
    public function process(...): Order
    {
        $order = Order::create(...);

        // validations
        // mutations
        // decisions

        $this->processCharge->process($order);

        return $order;
    }
}

class ProcessCharge
{
    public function process(...): Charge
    {
        // ...
        return $charge;
    }
}

The following example processes the order and the charge independently, utilizing a third scope (implemented as a command handler) to represent the whole of the process.

Instead of tracing through ProcessOrder and following it to ProcessCharge in order to understand the steps of this business process, we now have a single location which represents the process.

Instead of using nested calls, we have made the process flatter.

class ProcessOrder
{
    public function process(...): Order
    {
        // ...
        return $order;
    }
}

class ProcessCharge
{
    public function process(...): Charge
    {
        // ...
        return $charge;
    }
}

class MakeOrderCommandHandler
{
    public function handle(MakeOrder $command): void
    {
        // each step of the business process is visible in one place
        $order = $this->processOrder->process(...);
        $charge = $this->processCharge->process(...);
    }
}

Flat Calls
-> ProcessOrder
-> ProcessCharge

The coupling between ProcessOrder and ProcessCharge has been removed from those classes and relocated to MakeOrderCommandHandler.

Prefer to Pass Individual Fields Instead of Data Models

The following examples show the difference between passing data models as arguments versus individual fields.

function makeDecision(Order $order): string
{
    if (
        PaymentMethod::CreditCard()->equals($order->paymentMethod())
    ) {
        // do thing
        return 'outcome 1';
    }

    // do other thing
    return 'outcome 2';
}

function usingArgument(PaymentMethod $paymentMethod): void
{
    if (PaymentMethod::CreditCard()->equals($paymentMethod)) {
        // ...
    }
}
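As a minimal sketch of what this looks like from the caller’s side, the example below defines a small stand-in for PaymentMethod and a decision function that accepts only that field. The class body, the BankTransfer() constructor, and the decideOutcome() function are hypothetical and exist only to make the example self-contained.

```php
<?php
declare(strict_types=1);

// Hypothetical value object mirroring the PaymentMethod used above.
final class PaymentMethod
{
    private function __construct(private readonly string $value)
    {
    }

    public static function CreditCard(): self
    {
        return new self('credit_card');
    }

    // Assumed second payment method, added only for contrast.
    public static function BankTransfer(): self
    {
        return new self('bank_transfer');
    }

    public function equals(self $other): bool
    {
        return $this->value === $other->value;
    }
}

// The decision depends on one field, so it asks for exactly that field.
function decideOutcome(PaymentMethod $paymentMethod): string
{
    if (PaymentMethod::CreditCard()->equals($paymentMethod)) {
        return 'outcome 1';
    }

    return 'outcome 2';
}
```

A test can now call `decideOutcome(PaymentMethod::CreditCard())` directly, without building an Order in a valid state first.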

Some benefits of only passing the necessary arguments:

Summary

These are just some mitigation patterns for improving reliability and reducing lead-time for changes.

Additional Resources