Wednesday, June 16, 2010

Agile Testing UK meetup

While in England for DDD Exchange, I caught the opportunity to participate in a meetup of the UK Agile Testing community. Gojko Adzic led a very interesting workshop about how to write acceptance tests.

Neil McLaughlin, who happened to be in my team in the hands-on simulation wrote a nice report on the evening, and then Gojko himself posted about what he showed us. I'll try add something, from the participant perspective.

The developer mindset

Given we had quite a few people in the room, around 60, it was really interesting to note all different behaviors, but there were also some striking similarities. Most of the discussion we had were focused on

  • Formal correctness of the test
  • Splitting tests to be testing one concern at that time
  • avoiding duplication in test fixture setup

When discussing the outcome, Gojko pointed out that the main concern for acceptance tests was readability, instead. Readability is a subjective evaluation criteria, and might lead to a not-so-precise definition about how to make it right. But that's the only way to write a useful acceptance test. But the developer's mindset really had a strong influence on us, it was really hard to give up.

Constrained by the tools

Despite Skills Matter (that was hosting the evening) doing a fantastic job in providing an impressive amount of whiteboards for all the team, some of the discussions were constrained by the amount of space available on the whiteboard, so we ended up pruning some concerns, thinking they were of little value to the discussion. It turned out we were wrong.

But, as I said, we had an amount of writable space that was a lot larger than what I normally found in a typical workplace... and still this was constraining us. Buying one more whiteboard is cheap, making wrong assumptions because of writable space constraints is a lot more costly.

We're problem solvers, after all

During the exercise, Gojko was wandering around from one team to another one, playing the role of the Domain Expert. However, most of the teams seemed more interested in his Teacher role and asked some questions about how to solve the exercise, but barely none about ambiguities in the domain. We preferred to use assumptions instead.

Even if I knew the trick, it worked on me as well. But it's scary to note that even when writing acceptance tests, which are a lot more translation than design, our built-in problem solving brain leads us so easily on the road of guessing instead of asking.

Tuesday, June 15, 2010

About Entities, Aggregates and Data Duplication.

There's been an interesting discussion about Aggregates on the Italian DDD mailing lists. When things become complex, a simple example might just turn too simple. So I came up with this medium-sized one. Hope it won't be too long. Ok, so let's start from our first User Story

User Story #1: Placing an order
As a Customer
I Want to place an order
In order to purchase some goods

The simplest implementation of the story is essentially stateless: every time a customer wants to order something, needs to re-enter the data. In the DDD perspective, the resulting model is based on a single aggregate (we're deliberately ignoring the catalog for now) whose root is the Order class.

DDD order story 1.1.jpg

The stateless nature of the service makes it really easy to implement: a Customer is just a Value Object, created and eventually dropped at needs. Also we took some shortcut: we've chosen to implement Address as a String. Item is sort of natural value object, while LineItem is somewhat in the middle: we can change quantities, while the order is in open state, but we can implement this also using droppable Value Objects, easing the integrity burden for the aggregate root.

Some businesses (like buying train tickets) might just work like this, but our marketing is more inclined to manage customers in a more long-term way, so here are two more user stories.

User Story #2: Returning customer
As a Customer
I Want to retrieve my profile
In order to place more orders

Story #2 breaks our assumption about the aggregate boundaries. If we stick to the aggregate rule-of-thumb, if we want to delete an Order, we probably don't want to delete also the corresponding Customer. So we need a separate aggregate for that. What would happen if we decide to delete a Customer? Should we delete all orders? We don't have enough information to answer that, yet, we'll mark it as an outstanding question for our next meeting with the domain expert. Let's try the model with two aggregates and see how does it perform.

We now have a relationship crossing the aggregate boundary. We have promoted Customer to become the root of the newly created aggregate, so there is no potential integrity violation. Still we have to watch it closely, because this is where problems related to lazy/eager loading will arise. We also added username and password to Customer.

User Story #3: Different shipping address
As a Customer
I Want to specify a valid shipping address
In order to ship to a different destination

Multiple addresses are a call for a separate type to manage Address. We don't have so many responsibilities so far, for this class, except validation (which as Udi Dahan would say, doesn't necessarily belong to the domain layer) but the smell of duplication is probably enough to go for a separate class. We try to keep the model as simple as possible, so we treat Address like a value object.

DDD order story 3.jpg

User Story #4: Editable customer profile
As a Customer
I Want to edit my profile
In order to update it if needed

Story #4 makes explicit what we've been suspecting: Customer needs to be an entity, because it has a nontrivial lifecycle. No revolutions at Domain Model level, but this triggers a question: "What happens if I have an outstanding order and the customer changes its data before the order is dispatched?" This is the type of questions you don't want to answer as software developer. So we walk up the stairs to have a talk with the Domain Expert. We come back with two fresh user stories:

User Story #5: Specify Billing and Shipping address
As a Customer
I Want to specify independent billing and shipping addresses
In order to deliver goods to different locations

This one is relatively easy: just reinforcing our design. We'll now have two references from Order to Address, which are managed by the Customer. We just store a default Address in the Customer aggregate (we could do better) but's ok for now.

DDD order story 5.jpg

Story #6 is a little trickier, it comes from the legal department and tell us what we probably expected.

User Story #6: Track past orders
As a Legal Department
I Want to track orders
In order to in order to manage litigations

The domain expert state it clear: once an order is placed, it can't be changed in any of its parts, be it the content or the Customer. In case of litigation, it must behave exactly like printed paper. But Customer does not, its lifecycle is different from our needs, we'd need a separate class for that. We're lacking fantasy and call it CustomerData.

DDD order story 6.jpg

Everything is looking a lot different from the beginning. The two aggregates are now largely decoupled: we can change or delete an order without affecting the customer or deleting or unregistering a customer without losing tracks of its past orders. On the other hand we have explicit duplication here. Customer and CustomerData look so similar we're feeling guilty. Did we violate DRY principle? At first look the data is the same, but if we think about behavior, or class lifecycle, Customer and CustomerData are clearly two different beasts. But more often than not, when the starting point is the data model, instead of the domain model we end up thinking that's the same data, hence the same class. I bet experienced data modelers do not fall into these pitfalls as well, but I've seen these problems recurring quite often.

Once we accept that little bit of data duplication we have a system which is a lot easier to evolve and maintain, with aggregate roots as integrity enforcers within their boundaries, and agnostic about the rest.

To add some salt, don't forget aggregates are also building blocks of distributed systems: suppose we'll need to send orders to a remote system. Sending just the single aggregate and the referenced value objects is probably the cleanest way.