Now, Turning to Reason, & Its Just Sweetness (the design)

This is the first part in a 3 part blog series on how to write code so you can build testable systems with external dependencies.  Link to part 2 is at the end of the post.

There has been some discussion over the last year or so about testing. Some of this discussion has been naive but well meaning. The intent is good, we all want to build reliable and robust systems, but it’s been naive because the chosen path leads to the opposite.

The general thrust of this discussion is about “mocking out external interfaces”. Which, on the surface seems like a sensible thing to do. After all, we want to test our software and we don’t want to have these slow external interfaces (like a database, a message queue, a file system) impact our development.  So clearly, the approach is to mock/stub out the calls to AWS/MQ/database in our code and “pow” – tested, reliable software.

Well, no.

What we have now is coupled, fragile software and those tests you spent all that time writing, they’re basically testing “does String.equals() work?”

So, it’s clear that there’s a gulf in both education and experience in this space, so prompted by the recent discussions on Slack, I’ll present to you, a narrative on the design and development of a multi-component software system.

First, we have some context so people can do some reading.

  1. The assertion that we need “AWS stubs” to test systems that are deployed to AWS
  2. A wonderful video by Gary Bernhardt about testing which covers a lot of what I’m going to go through Boundaries — Destroy All Software Talks
  3. A wonderful blog post by Ken Scambler about why mocks and stubs should be avoided. To Kill a Mockingtest | Tech Blog

Second, I’m going to talk a little bit about testing before launching into some solution design so that people can understand the why part.

Testing is hard. Much harder than people give credit for, and what’s worse, most people think that testing is easier than it is, leading to lots and lots of terrible tests. Mocks and stubs are an indication of a design smell. Ken’s comments in the Slack conversation and his post provide more concrete description of why.

Along with testing being hard, there are different concerns with testing, and this is fundamentally where the big issue occurs. Not all testing is symmetrical, and not all techniques are sensible or desirable.

Within a software system we have 3 main categories of tests.

  1. Functionality tests – these are tests that ensure that the software WE are writing behaves as WE expect it to. The most notable type of test that are encountered in this space are Unit tests.
  2. Contract verification – these are fixtures that VALIDATE the components and interfaces that the software validated with unit tests will continue to work. Think of these as a sort of pre-condition. They’re not so much tests as they are contract verification. It just so happens a lot of the testing frameworks that we have in the software ecosystem are very good to be able to build contract verification suites.
  3. Smoke tests – these are fixtures that VALIDATE the components deployed into an environment are correctly configured, the interfaces are available as expected and they all operate together correctly. This can be a single verification, it can be a sub-set of the contract verification tests, it can be a synthetic transaction e-2-e through the system. So many choices, so many options.

For the purposes of this narrative, I’m going to be only interested in the first 2, they are the general consideration for component design and development. This doesn’t, nor shouldn’t mean that for an operational system the 3rd category isn’t equally, or even more important – it’s just that I’m going to deliberately put them out of scope for now.

Ok, context done. Let’s do some design. Step one is to have a look at the problem we want to solve, and fortunately for us we have a spare one.

The system receives an SQS event which indicate which files in S3 to load and process, the files contain some numbers which we “do math” and the result is written out to S3

Many people would at this point launch into TDD, and while that might seem sensible, I’d always advocate it’s worth some thinking about the problem, and some analysis and preliminary high level design.

30 minutes later, add a small amount of Ken Scambler for sanity and we have the following initial thoughts about how the system design will proceed.  Note, this isn’t locked in stone, but when doing TDD, it’s not some random wandering in the dark about where your design will end up – you should be doing science here. Have a hypothesis, and let the code help you work that all out.


We can see the main components, and have identified the basic flows. Nothing too exciting, probably 30mins worth of chatting with Ken. For those interested, he did a “functional style” analysis and having us both work on the design we ended up with substantially the same system components and interaction design. Was a lot of fun. Recommended. Would pair with Ken again.

Now we want to think about one of the most interesting parts of the implementation. How will the use of the data store work with the event queue. Part of the requirements says that the events are only to be removed when the data store items have been successfully processed – so we need some form of signalling between the two. We can couple the two together with some horrible “if” code, and expose the innards of the event queue. Guaranteed this will be hard to test, so we’ll just dependency inject in some processor into the event queue – seems like the best approach. Writing code will test it out, but if you don’t know the direction you’re heading in – you’ll just wander all over the map.


(Note: You’ll see that I’ve put some form of “attach()” method in the interface/contract. This gives me some way of doing “authentication” / “connection” to the external systems. Probably not going to implement in the initial phases, but just a reminder that it’s probably going to be important at some point)

The important part of this is the process(Processor p):boolean method. This enables us to “tell, don’t ask” when processing things on the event queue. For now, we’re only going to get one type of thing on that queue so this is probably the simplest implementation and all that is needed. If there was a bunch of different things on the event queue, I’d probably construct the event queue with some form of Factory that would allow each of the events to be processed, but no need for that now.

The last little bits are pretty similar, and don’t really require any major thought – just simple data sources and sinks.



As stated above, names-not-final. There’s nothing about what I’ve scrawled here that is “forcing” me to do it in this way, and the code may well change my thoughts as I get into it. However, spending the (about) 60 minutes to draw these 4 pictures and talk with Ken gives me confidence I have a robust solution that’s going to fulfil the solution requirements as well as have the right sorts of extension points. The discussion and some thought experiments means that I’m pretty sure I can implement this solution using any underlying implementation technologies. Files, databases, queues, sockets etc. This is the most important thing when designing something – it’s not about “can I build this using technology <X>”, it’s “can I build this in ANY technology”.

Finally, if we look at this now slightly differently, we have the classic “boundaries” model where our business logic (the calculation) is all in the “middle” with our external interfaces providing the interfaces to the horrible messiness of the outside world. Functional core, imperative shell. This is another good indication that our design proposal has merit.


This also helps us understand where our testing styles should be going. We should have our unit/functional tests for the “functional core”, and contract/verification tests for our “imperative shell”. Our code is the core – this is really, really important and is the key point that needs to be made from this entire narrative. Our job is not (NOT!) to test the AWS libraries, the DB connection libraries, the SNS/SQS libraries – these can be verified at run-time using smoke tests, or at various points in the development cycle using contract/validation tests.

For people who worry about the protocols – that’s not a testing job, that’s a business logic task. If the payload in the event queue is “different” – then the system should just fail (gracefully or otherwise). The contract is broken, you no longer have to continue to behave rationally and can make sensible decisions about your own reactions. Under no circumstances should you attempt to “massage/hide” the broken contract. This leads to hard to detect errors and is a significant source of production failures. Just fail early – and in close proximity to the broken contract. This is a fundamental of good software implementations.


Now, Turning to Reason, & Its Just Sweetness (the aftermath)

This is the last part in a 3 part blog series on how to write code so you can build testable systems with external dependencies.  The first post can be found here

Thanks to Ken Scambler, Milly Rowett and Alyssa Biasi who all made contributions to the posts.

Things that happened about the design.

The initial design is pretty compact, and the components are well factored in terms of their responsibility, but I noticed a bit of a smell in the implementation where there was a bit of twisty-turny logic to deal with the receipt of the message on a queue, and then waiting around to see if the file existed.

The implementation problems the design creates (which I ended up ignoring for this example) is that if the file _never_ appears, you end up with all sorts of knots, and I don’t really want that sort of logic to be part of my “process the file” code.

Going through the design a second time with Alyssa Biasi we came up with separating the logic for the “got the message, wait for the file” and “process the file”. We can decouple these 2 problems and the responsibility for each part makes it a lot cleaner. Then we can tune the “retry logic” in the “wait for the file” code, and the “process the file” logic never needs to know. A simple IPC mechanism (another queue message suffices) and responsibility separation seems to be a lot cleaner. The nice thing is that it’s just another form of QueueProcessor, so we can re-use most of the code framework, with just some changed wiring. Winning.

In the implementation of S3DataStore(), there’s the opportunity to decouple the Authentication method from the implementation. For the purpose of the example I didn’t bother adding this complexity in, but my original notes in the design highlighted how it might happen. The Java AWS libraries actually make the implementation of such a design very straightforward. (Currently the implementation assumes Anonymous “world read” access.)

Things that happened about the code.

It was fun writing code again. There’s parts of the code that I’d refactor to make neater. A protocol implementation decision about how to pass information in the request queue and some of my test code is kinda bleh and I’d like to fix that in the future.

My choice of code structure (and lack of new Java 1.8 features) is based on history and a lack of writing much code. I didn’t particularly want to use this as a mechanism for learning new coding techniques, and to be frank, there’s nothing complicated enough in here that warrants it.  There’s definitely areas that could be cleaned up, generally where I create local variables for return values. Most of those can be inlined (the compiler is doing this anyway), but it’s handier for me to examine the variables in a debugger when I’m tracing things.

I was also working with Milly Rowett for a significant part of the development. When mentoring, I prefer to keep things obvious, even if it means a little more typing is involved. Milly might be able to provide feedback on how valuable it was – it certainly was easier for me to explain as I typed.

The code structure completely changed when I decided to use maven (to get the AWS dependencies managed) which was pretty painless, but annoying. Not sure I would have done it differently, because I didn’t need maven until half way through. The final structure is fine, but created a change-set which was nothing but structural changes.

The code isn’t really meant to be a demonstration of “great art in writing code” or “faultless”. It was done as an implementation to show how to design code so you can test external dependencies (such as AWS) without relying on mocking libraries or having the code be solely dependent on that implementation.

If you want to download and play – and finish it –

Now, Turning to Reason, & Its Just Sweetness (the code)

This is the second part in a 3 part blog series on how to write code so you can build testable systems with external dependencies.  Link to part 3 is at the end of the post. The first post can be found here

Authors comment:This post was written over a period of a couple of weeks, and developed along side the code. There is inconsistency in what I say at the start and what ends up in the code. This is to be expected, and probably worth examining how things changed – as much to see the thought process, and that while design tends to remain fairly static, the implementation changes as new information is gathered. The total time spend on the code is about 10 hours. Of that time about 4-5 hours was code developed while mentoring a (very smart) graduate. Other consideration is my unfamiliarity these days with code authoring (sadly). I imagine if I was to start again, I’d probably be able to do it all in about 4 hours (if that). The code is pretty trivial (like most code).

—- Actual post follows —


First, simple structure, as would be expected from most projects. The “contracts” section contains the verification tests for our external interfaces. The sorts of testing that is “pact-like” in that if our code breaks for unexpected or unexplained reasons, we can run our contract tests, and see if the remote systems that we have no control over have changed. These are our sanity check. They’re not there to “prove” anything – other than our confidence that at the time the code was written, they were against those contract tests. It’s a good idea, you should think about it.



Now, our code layout looks like our design, even after working through it TDD-styles.  That’s pretty handy. There’s 2 useful observations and questions at this point. The first is “how do we calculate the math part?”. The answer to that is “we implement a Processor” for it. The second is “where is the application?”. The answer to that is “there isn’t one yet”, but it’s basically a simple wrapper around the EventProcessor. We can see how it will look if we examine one of the EventProcessor tests.


We can see our alternative implementations of the interfaces for DataStore, Processor, EventQueue and OutputDevice for our testing. I’m not completely happy with the EventProcessor and needing to set the delay and retry. It seems to be the right thing internally, it just looks ugly. Maybe something better will emerge further down the line and EventProcessor will become an interface, with some form of concrete implementation. For now it’s concrete, and it’s the only real co-ordination point within the code.


Where the interesting things occur within the code is in the EventQueue. The queue has the defined responsibility for processing the events. This is by design, and we can see the SuccessRemovalEventQueue in the example above has injected a List of Events (the queue) and the Processor which in this case is the KeyExistsInStoreProcessor. These particular implementations were chosen because I want to start modelling and investigating parts of the real solution that’s going to be needed. The concrete implementations here for EventQueue and OutputDevice are used as a “dipstick” (Testing with mock objects – JUnit in Action – page 106) .

This form of development makes it trivial to then compose up the application. Turns out – it’s just a setup, then construction of the dependencies. I wanted to print out when things were processed in the first version (let’s call it the MVP) so to do that we just implemented ConsoleOutput to implement the output device. Total time to create the application – 2 minutes.


You’ll notice that it’s one of the test cases, that’s been modified. That’s ok – it’s a great way to start – the actual application doesn’t need to know, and we can focus on implementing each of the features one at a time. We’ll start with the DataStore, because that’s probably the easiest part to implement first.

In the blink of an eye, the BatchFileProcessor has changed to the following. And with the great confidence that as the injected strategies have the same contracts as the tested ones, our imperative shell works as advertised.


Now, build out DoSomeMathProcessor() with TDD.





Essentially we test this by having a known set of values passed into our data store. This is made easier by having an “object mother” (ObjectMother)for the creation of the test data. Notice how we’re testing results, not trying to delve into the class to see if “math works” – but if we’re getting the right results.

We’re at the point of our code where we now have great confidence if we have a DataStore that returns us the valid data – we can add it up correctly. So, do that next.

And the contract test for S3DataStore(). Note here that I’m not attempting to mock, or stub or do anything to test the implementation of S3DataStore() with the BatchFileProcessor. This will be creating the contract for the use of the S3DataStore(). If I can use it against S3, then it works. That’s it !


There’s a little bit of context worth discussing here. For this particular bit of code to work, that means that the S3DataStore() class needs to be implemented and working. This was done in a few stages (over about 20-40 minutes as I looked up various bits of the S3 API). I started with the ds.exists() test, because that also allowed me to see if the Authentication parts were going to work.
For this test to ever work, we need to set up the world. That’s ok – we know that, we’re not trying to fake out the world here, this is a contract test to verify if our code works against the real world. This could also form part of a smoke test. I manually set up an S3 bucket in the Sydney region, and manually copied the “testdata.txt” file into it. I could use a shell script to do this, I could use a bit of Java code to do this and clean up after. That’s all completely valid, but doesn’t really help in understanding about “how to test AWS code” (really, it’s the implementations of the imperative shell we’re testing, AWS is just a particular implementation).

The implementation is pretty simple. Current implementation is naive and will not fulfil the “requirements” if there are errors, but the interesting part is it’s trivial to add all the business logic and test it. If we need to check for malformed files, we can do that – and have the load(String path) method do sensible things. We can trivially create test cases for this to make sure the processor acts correctly.

At this point – the code is now “working” – and we can run our implementation. We would then choose the next part of the project to implement. If I was going to continue, I’d probably do the output notification – mostly because that requires the least amount of setup.