Toy Robot Coding Puzzle

It may come as a bit of a surprise to many developers in Melbourne (and possibly other places) that I’m the author of the much loved and used “Toy Robot Test”. What may surprise people more is that it’s been around since 2007, and that I wrote it when I needed to evaluate a large number of candidates for roles at ANZ.

It shouldn’t be a huge surprise that it’s not entirely original – let’s face it, nothing is. It was heavily influenced by the “Mars Rover” code puzzle that was in use at ThoughtWorks when I was there. I was also involved in hiring people at TW during this period and while I didn’t do this as part of my evaluation, I probably wrote and re-wrote code for this puzzle about 20-30 times as I explored different solutions so I could talk to candidates about it.

When it came to needing one for ANZ, it was all very familiar.

The other part of this that is also very familiar is the one question I keep being asked, which is “Shouldn’t we change it? It’s common and people could ‘cheat’ and copy the solution.” To this I normally sigh, then start telling the story all over again. This happened again recently when one of our wonderful HR recruiting staff asked the same question, so I wrote her an email in response. I was chatting with a good buddy (hey there Travo) who asked me about the code puzzle in a similar way, so I showed him the email.

He then suggested that I post my response in a blog. It seems like a good idea, but it’s probably worth understanding how the code puzzle works in my recruitment pipeline. It’s the first filter, which is quite unusual for many people (something else I often get asked about).

Why? Well, the sorts of filters for somebody you want to hire as a software developer are:

  1. Are they a competent software developer?
  2. Are they an asshole?
  3. Can they cope with working with/near me?

In my life experience, people are generally not assholes. So as a result, doing some form of “phone screen” or “cultural interview” first is pointless – because it’s really, really likely they’re a nice person who you’d enjoy hanging out and having a beer/wine/halal drink of choice with. However, if I’m going to hire them as somebody who will write code, whether they can actually do that is pretty important. So, I figure – maybe I should ask that. And, in my experience, most people are *fucking terrible* at writing code (no really, you all are, deal with it).

So, I use the filter that’s going to let the smallest number of potentially non-asshole people through. Then I can chat with them about technical stuff and I can do things like work out exactly how much they do know about technology, if they’re smart and use emacs, and other important things like keybindings and directory structure preferences. Also, doing this gives them a reasonable idea about what it might be like to work with me. So, that kinda sorts out 2 and 3.

Apart from letting some Scala people through this process when even I was too naive to think people would use that borg of a system, it’s worked pretty well. So yeah, the point of this blog post – why I don’t care that the Toy Robot Coding puzzle is well known.

Hi Jon,

Can we write a new Robot puzzle please?



Hiya (redacted),

I suppose I’m trying to get you to articulate why you think you want to do these things.

You’re telling me things about what is happening (which I already knew about), but there’s another part to this that I want to discuss with you, but I want to understand what you want.

I suspect that you’re worried that people are going to “cheat” the code review and that’s why you want it changed. I’ve had this discussion with everybody since I started doing this – so it’s by no means a new conversation.

Now, this is the very funny part. The general quality of responses from candidates is so poor that they’re too ineffective to even copy a good version of it. Think about that for a minute. Most of the candidates are incapable of copying a working solution to try and “trick” me. What does that say about most programmers? It’s a pretty sad indictment of our industry that they can’t Google a good solution, modify slightly and understand what was written enough to trick me through an interview. It’s embarrassing.

There is also the problem (regardless of what we do) where somebody else can write the code for the candidate.

Now, this is where we have a conversation with the candidate and ask them about the code. Turns out if they didn’t write it – they don’t know how it works, or how to extend it, or how to talk about the potential design issues with it.

Our potential outcomes are as follows:

  1. Candidate is able to complete code puzzle honestly and “passes”
  2. Candidate is able to complete code puzzle honestly and “fails”
  3. Candidate copies from github and passes
  4. Candidate copies from github and fails
  5. Candidate has submission written by 3rd party and passes
  6. Candidate has submission written by 3rd party and fails

Now, our recruitment process will treat 1,3,5 as identical and 2,4,6 as identical during the code review phase. Of course, obvious copying (leaving the wrong names in, etc.) may make for great amusement, but copying is otherwise undetectable at this stage.

2,4,6 we don’t care about – because they failed anyway.

Of 1,3,5 we really need to try and identify 1 separately from 3 and 5, as 1 is the candidate we are most interested in progressing with.

If we examine the characteristics of 3 and 5, we notice that they are the same. It’s really just “code not written by the candidate”. So, if we choose a different puzzle we are working under the incorrect assumption that it will somehow make it “harder” for the candidate to cheat – but option 5 is still untouched.

So, that’s why we talk to the candidate about the code. In depth, and often with reference to the design decisions they have made (or not made) and ask them about how they would extend it.  This allows us to detect both 3 and 5.

Having said all of this, I hope you can understand more about why “changing the code puzzle” really doesn’t matter at all. We can change it for other reasons (such as a different problem for the fun of it) but having an “original” problem doesn’t mean anything in terms of improving our hiring decision strategy.
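For anyone who prefers code to prose, the six outcomes in the email can be sketched in a few lines of Python. This is only an illustration: every name in it is made up, and it bakes in the email’s assumption that a conversation about the code reliably exposes anyone who didn’t write it.

```python
from itertools import product

# Who actually wrote the submission, and how it fared at code review.
AUTHORS = ("candidate", "github", "third_party")
RESULTS = ("pass", "fail")


def survives_code_review(author, result):
    # The reviewer only sees the code, so authorship is invisible here.
    return result == "pass"


def survives_interview(author, result):
    # Assumption from the email: someone who didn't write the code can't
    # explain or extend it, so the conversation filters them out.
    return survives_code_review(author, result) and author == "candidate"


def blocked_by_new_puzzle(author):
    # A fresh puzzle only closes the "copy it from GitHub" route.
    return author == "github"


if __name__ == "__main__":
    for author, result in product(AUTHORS, RESULTS):
        print(author, result,
              "review:", survives_code_review(author, result),
              "interview:", survives_interview(author, result))
```

Changing the puzzle only affects the `github` row; the `third_party` row is untouched, whereas the interview step removes both.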

Going paperless

I decided to go paperless at home, and have purchased a scanner and am going to digitise all the random pieces of paper I get during the year.

I’m wading through all my tax from last year and realise that it would be a lot easier if I didn’t have to shuffle the papers around, and store them.

If you’ve done this – what are some good pieces of advice you can give? I don’t plan to scan receipts, maybe bills. What should or shouldn’t I bother with?

Providing a framework for communication

A lot of my work at present is helping the technical leaders of our business aligned groups do their work effectively. There’s lots of communication individually, and as a team.

Having all the smart people (and me) in a room greatly aids decision making, but one of the most valuable communication aids that I’ve implemented is describing responsibility and accountability in terms of bounded contexts.

Eric Evans uses this term in Domain Driven Design to separate out modelling concepts and to try and stop the “grand unified theory of Customer object definition”. What I’m using the bounded contexts for is to describe the areas of accountability for the tech leads and say “whatever happens inside that boundary is your problem, but you have to communicate to other contexts using my rules”.

So far, this has been working very well. It really helps in guiding the communication to “what bounded context should this functionality logically be located in” and hence the ownership. Then, the tech lead working in that area is completely free to implement the functionality according to their methods, exposing the interaction via an interface that is “prescribed” (but generally we thrash about as a team defining what this will look like).

As I was trying to describe this process internally to managers at REA, I found that this article provided a nice starting point.


Chicken and Vegetable Green Curry

I love this dish, because it’s nice and easy to cook, super cheap to make and freezes really well. You can also substitute like crazy on any of the items. Cooking to me is all about the exploration. Swap the chicken with fish (so good, so, so good), use red curry or massaman curry, drop the meat completely and just have vegetables. Drop the vegetables and add bacon (only kidding, never tried that).

Don’t be daunted by the number of items in the list – it’s mostly just random different vegetables that I like to add in, and the volume of curry it makes is pretty large. I normally get 6-8 decent sized serves out of it, which I freeze in 500ml storage containers. If you want to reduce the volume, reduce the meat to 250g and only use 1 can of coconut liquid as well as reducing random vegetables in both volume and number. This is more art than science, it will work out fine – trust me.

When I talk about adding oil to things, I mean a small amount, like 1 tsp – not a cup. You’re not deep frying things, it’s just to help the meat and vegetables brown without sticking. For the frying parts, you can exclude the oil completely if you like.


Time

  • 10 minutes for prep / chopping
  • 30 minutes for cooking (depending on how you chop the vegies)


Serves

  • The ingredients listed serve about 6-8, depending on how much you dish up.  I use a 250ml cup and just scoop the curry out onto the rice, and each serving for me is about 1 to 1.5 scoops.

Kitchen Cookware

  • Medium or large frypan
  • Large frypan or wok (I use a wok)
  • Microwave and microwave proof dishes (not required, but makes prep and cooking time much faster)
  • Nice sharp knife
  • Wooden spoon/spatula


Ingredients

  • 1/2 jar of Valcom Green Curry Paste (could use any brand, but this is what I like)
  • 400ml (one big can) of Coconut Cream (replace with Coconut milk if you want a thinner sauce)
  • 400ml (one big can) of Coconut Milk (replace with Cream if you want a creamier/thicker curry sauce)
  • 1 tsp of Fish Sauce
  • 1 tsp of sugar
  • 1 kaffir lime leaf
  • 1 small hot chilli (optional – but I like it)
  • 500g of chicken (I prefer thighs, but breasts work ok)
  • 2 medium potatoes
  • 1 medium sweet potato
  • 1 small green capsicum (or half a larger one)
  • 1 small red capsicum
  • 2 small brown onions
  • 2 carrots
  • big handful of green beans
  • vegetable oil of some form – I’ve used Peanut and Olive – both work fine
  • rice for serving on (I use long grained rice of a random sort that was on special at the supermarket)


Method

  1. Pour some oil in the wok and heat on high, add 1/2 jar of Green Curry Paste – cook until slightly spitty and fragrant, reduce heat to low (this is really important, or bad things happen to the about to be added coconut milk)
  2. Cook for a little longer (~1 min) until the paste stops popping and mix in the Coconut Cream (if you use milk here, be super careful, I’ve had the milk separate and go all curdly at this point – much safer to use the Cream)
  3. Mix with a wooden spoon and keep a careful watch – if it starts boiling hard, then remove from heat (add back to heat when you start adding in the vegetables)
  4. Chop the (optional) chilli and the kaffir lime leaf into tiny weeny little pieces – add to the wok – mix in
  5. Peel the sweet potato  – chop into 1cm cubes (or whatever you feel like – it doesn’t matter to be honest)
  6. Cook the sweet potato in the microwave for about 2 mins with a bit of water – this should just soften it. You’ll have to adjust for your microwave and the size of the potato, but the objective is to “half cook it” not turn it to mush.
  7. Add oil to the 2nd frypan, and lightly fry the sweet potato – just make the outside bits all nice and brown – should take about 1-2 minutes
  8. Add the fried sweet potato to the wok, mix in (wok should be on low heat – sauce should not be boiling rapidly – as you add the vegetables it will cool the sauce, just keep an eye on it)
  9. Repeat steps 5-8 for the potato (peeling optional)
  10. Repeat steps 5-8 for the carrots (if you peel them, I’m coming to your house to slap you)
  11. Roughly chop the onions and add to the wok
  12. Roughly chop the capsicum (both red and green) and add to the wok
  13. Rice – this is about when I put it on (see the note at the end)
  14. Chop the beans into 1/2 or 1/3 and add to the wok
  15. At this point the vegetables are probably sitting above the level of the curry sauce, don’t stress, we’ll fix that in a minute, just keep stirring it
  16. Chop up the chicken – you can make strips if you want, I like to make cubes – and fry it in the 2nd frypan until it’s browned on the outside. More colour is good in my opinion, but don’t overcook it – you want the inside still pink/raw at this point (it finishes cooking in the sauce)
  17. Add the browned chicken to the wok
  18. Add in the 2nd can of Coconut liquid – stir in thoroughly
  19. Add in 1 tsp of Fish sauce – mix well
  20. Add in 1 tsp of sugar – mix well
  21. Taste the sauce – it should be mildly fragrant with a little bit of heat.
  22. Keep stirring, turning and making sure it’s not sticking to the bottom of the pan – adjust heat
  23. Cook slowly until the potato is cooked through. That’s my trigger for it “being ready”. It normally takes about 30 mins, but that’s my microwave, my timing and my chopping sizes. Don’t be stressed if it takes another 10-15 minutes – grab a glass of wine, watch a bit of TV, set the table, panic because the guests are about to arrive and you forgot to sweep the floor.
  24. Serve to your guests – bask in the glory of their compliments

Rice – I use a rice cooker which helps because it’s zero effort and I don’t have to watch it. Depending on how you cook your rice, I would normally put it on at about the time I start chopping the beans.  If you have a kitchen minion, you can direct them to do it at that point, or cook the rice at the end. Cooking the curry for longer on low heat IS GOOD so don’t worry about that, the only thing to watch is that it’s on a low heat and isn’t sticking on the bottom of the pan/wok.  

When I made this last night I put the rice on at the very end, sat and ate cheese, dip and crackers drinking wine with my guest. It was probably 60 minutes after the curry had “finished” before I served up and it was delicious.

I ride so I can eat

In an attempt to rekindle my joy of writing, I’m going to start blogging a bit more about more mundane things in the hope that it will spur me on to write some more comprehensive entries.

In the last 18 – 24 months I’ve been riding a lot more, and because of that, I’ve been able to indulge in all the food I could ever want. Even to the point where I end up being in calorie deficit over the week. In 2012 I managed just over 11,000km on the bike. For the first 6 months of 2013 I’m a bit behind, at about 4,800km.

For those interested in how much energy you can burn while riding, here are some very rough numbers. You’ll burn about 400-600kcal per hour depending on work rate, which translates into about 1675 – 2500kJ per hour.  As the average adult daily kJ requirement is about 8700, going riding for 1-2 hours consumes a lot of extra energy.

This can be “reverse engineered” if you have a power meter, or use a site like Strava which will estimate your power output. I’ve read that cycling is about 19-26% efficient in turning “energy spent by the body” into “energy used at the pedal”. So, if you take the energy measured at the pedal (power multiplied by time) and multiply by 4 or 5 (or 4.5 for a middle ground), the numbers should be close.
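Here’s a rough sketch of that arithmetic in Python. The function names and the 150 W ride are made up for illustration, and the default efficiency is just a value picked from the 19-26% range above, not something I’ve measured.

```python
KJ_PER_KCAL = 4.184  # 1 kcal is 4.184 kJ


def energy_burned_kj(avg_power_watts, hours, efficiency=0.22):
    """Estimate energy spent by the body (kJ) from power measured at the pedal.

    efficiency is an assumed value from the 19-26% range quoted above.
    """
    # 1 watt is 1 joule per second, so work at the pedal in kJ is W * s / 1000.
    work_at_pedal_kj = avg_power_watts * hours * 3600 / 1000
    return work_at_pedal_kj / efficiency


def kj_to_kcal(kj):
    """Convert kilojoules to kilocalories."""
    return kj / KJ_PER_KCAL


if __name__ == "__main__":
    # Hypothetical ride: 1 hour at an average of 150 W.
    burned = energy_burned_kj(150, 1.0)
    print(f"{burned:.0f} kJ, about {kj_to_kcal(burned):.0f} kcal")
```

An hour at 150 W is 540 kJ of work at the pedal, and dividing by 0.22 is the same as multiplying by roughly 4.5, which lands inside the 1675 – 2500kJ per hour range.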

I’ve always enjoyed eating healthily, and have had a good food intake. Now I’m enjoying cooking for myself, and for George and exploring new and interesting things to cook.  The other part of my cooking is that now (at least for some part of the time) I’m a single Dad, so meals need to be quick and easy to make, and be child friendly.

Over the next couple of weeks/months, I’ll keep posting the things I cook and eat. Hopefully others will find this useful and enjoyable.

For every question there is an answer that is obvious, simple…

… and wrong.   Never have I seen such a cluster of these answers as around the questions “How long will <X> take?” and “Are we there yet?”. Today I’m going to describe a bit about how to answer them, and be a lot less wrong in the process.

How long will <X> take?
This is the classic estimation problem.  A wiser man than myself said “Normally longer than you think”, but while accurate, it’s not very fulfilling nor informative.  My experience is that most people are pretty damn good at estimating how long “their bit” takes, and what’s more – how long “their bit” takes when they actually get to work on it.  However, nearly everybody is terrible at understanding the effects of external influences on the time.  This is where the significant source of error in estimation occurs.  Not only that – this is where great caution needs to be applied to the “well, it took Z days last time” without understanding the influences on that outcome.
At this point we get to segue into a real world example of exactly this.  I ride to work every day.  I’ve ridden to work every day for over 18 months: ~6km each way for 12 months to ANZ (~400 samples), and ~10km each way for 6 months to REA (~200 samples).  I travel on the same route every day and at approximately the same time every day.  I have a Garmin 500 GPS unit that tracks all my travels – so I have a long historical record of doing exactly the same thing every day.  With all this wonderful data, you would think I’d be able to accurately predict how long it takes me to get to and/or from work.  Here’s the news: for what is, on average, a 30 minute journey, I cannot predict within 10% what my journey time will be.  How the fuck is that possible?  My fastest time home is 25 minutes, and my slowest is nearly 35 minutes.
So, you’re an astute reader (well, you’re reading my blog – so you must be), you’re scratching your head trying to work out how I’m getting nearly a 30% variation over the time.  Time of day? (no) Weather? (no) Fitness? (no) Bike chosen to ride on? (no) Traffic? (no)
Here’s the crucial piece of information that my awesome Garmin unit has.  It has my average speed, and my average _moving_ speed.  Turns out that my average moving speed is very stable (pretty low variation) – a fairly comfortable 26km/hr.  So, if my moving speed is constant at 26km/hr – how on earth is there a 30% variation?
Externalities.  In this case – traffic lights.  There are 30 traffic and pedestrian lights on my trip into work.  I’ve not done the analysis on all of them – but I know that 2 of the traffic lights have a cycle of 2 minutes.  So – from a best case of 0, to a worst case of 4 minutes – that’s more than a 10% variation on a 30 minute trip just from those two.  Wow.   On the upside, I can say that my _expected_ time is 30 minutes, but it could be from 25 to 35 minutes.
So, here’s a tip.  When estimating – even for things you know you do all the time – look at the external influences on the task at hand.  Count them – that should give you a good idea of the level of variation that may occur.  The more external influences you don’t have any control over, the lower the confidence that should be placed in the estimate, and the greater the need to have a conversation about “minimum, expected and maximum”.
This is also a very good reason to use synthetic values for estimation (function points, story points) and, instead of predicting, to use tracking to determine task and project length.
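
To see how a constant moving speed plus a pile of traffic lights produces that sort of spread, here’s a small simulation sketch in Python. The 10km distance, 26km/hr moving speed and 30 lights come from the story above; the chance of hitting a red and the maximum wait per light are numbers I’ve invented purely for illustration, not anything measured.

```python
import random


def simulate_commute(rng, distance_km=10.0, moving_speed_kmh=26.0,
                     lights=30, p_red=0.5, max_wait_s=60.0):
    """Return one simulated trip time in minutes.

    p_red and max_wait_s are hypothetical values, not measured ones.
    """
    # Time spent actually moving is the same on every trip.
    moving_min = distance_km / moving_speed_kmh * 60
    # Each light is red with probability p_red; a red light costs a
    # random wait of up to max_wait_s seconds.
    waiting_s = sum(rng.uniform(0, max_wait_s)
                    for _ in range(lights) if rng.random() < p_red)
    return moving_min + waiting_s / 60


def summarise(trips):
    """Return (minimum, expected, maximum) trip time in minutes."""
    return min(trips), sum(trips) / len(trips), max(trips)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so reruns give the same numbers
    trips = [simulate_commute(rng) for _ in range(600)]
    shortest, expected, longest = summarise(trips)
    print(f"min {shortest:.1f}, expected {expected:.1f}, max {longest:.1f} minutes")
```

The moving time never changes (about 23 minutes), so all of the spread between the minimum and maximum comes from the lights.
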
Are we there yet?
Not only is this the dreaded question for parents of children, it’s also a bleeding sore for most software developers.  Provided you’ve already moved away from aggregating single estimates in hours or days and have decided that a synthetic proxy is the way to go (great first step) we need to have some coherent way of determining when we’re likely to be finished.
We’ve all read the “past performance does not guarantee future performance” disclaimer, yet that is exactly what we’re relying on when we take a time-slice of the project, and then project from the work already completed to determine the end-point.  The good (and bad) news is that while we should keep the quote in the back of our heads, there’s not a better way to determine the end-point.
However, the big kicker is this, the value of “past performance” is relative to the volume of activity performed, and the amount of variance external activities have caused on those activities _relative_ to the possible impacts remaining.  The first ~200m of my journey has no traffic lights, so it should come as no surprise that there’s low variation, but also should come as no surprise that the predictive value of that first 200m of the journey is low.  Yet, I see people doing this every day in projects.  “At the end of the first iteration we did 40 units of work, excellent – we’re going to finish in <X>” and then getting frustrated, angry or disappointed that next iteration only 20 units of work were completed.
At what point can we have a discussion about the end-point? You probably could at the very start, but it’s hardly a valuable discussion. The very end is probably too late – so it’s somewhere in between. Sure, but where? This is the hard part, and it’s a function of the number of external influences remaining. As we’ve seen from my cycling story above, when there’s a large number of external influences (approximately 1 every minute) – we’re looking at a 30% variation, regardless of where we choose to make a projection.  Clearly as we get closer to the end, there’s less total impact, but the ratio of impact remains (mostly) constant.
Sadly for this story, there’s not an easy answer to the question of “Are we done yet?”.  The best advice I can give is to reduce the external impacts or at the very least be able to quantify them and reduce the problem to understanding your average moving speed.
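
As a sketch of what tracking with “minimum, expected and maximum” might look like, here’s a small Python example that projects the iterations remaining from the fastest, average and slowest iteration seen so far. It’s one simple way of doing it under my own assumptions, not a recommended formula; the function name and the 200 units of remaining work are made up, and the 40 and 20 are the iteration numbers from the story above.

```python
import math


def projected_iterations(remaining_units, velocities):
    """Return (best, expected, worst) whole iterations left.

    velocities is the list of units of work completed in each past iteration.
    """
    if not velocities or min(velocities) <= 0:
        raise ValueError("need at least one iteration with positive velocity")
    fastest = max(velocities)
    average = sum(velocities) / len(velocities)
    slowest = min(velocities)
    # Round up: a partly used iteration still has to happen.
    return tuple(math.ceil(remaining_units / v)
                 for v in (fastest, average, slowest))


if __name__ == "__main__":
    # After one iteration the range collapses to a single, overconfident number.
    print(projected_iterations(200, [40]))
    # After a second, slower iteration the real spread starts to show.
    print(projected_iterations(200, [40, 20]))
```

With only the first iteration to go on, best, expected and worst are all the same number, which is the “first 200m of the journey” problem: very little variation, and very little predictive value.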

Once more unto the breach

One of the (very few) things that I completely love about using a Mac for software development is the integrated command line which does a pretty good job of approximating a Unix environment.  This is great for me, and supplements how I work nicely. 

To get a similar environment under Windows has always been a great goal, but somewhat hindered by the complete shit that is Cygwin.  So, as I commented in my last entry, I have started using VirtualBox to provide a great scripting environment, but I’m still stuck in the mire of not having a decent terminal client to access the command line shell.

There are a number of alternatives, of which SecureCRT is probably the best (an awesome application that I’ve owned for many, many years) but it’s a commercial piece of software that many people may not feel the need to buy.  PuTTY is pretty good, but it’s only a marginally better application than the standard Windows console application.

Frustration over the weekend finally got me to the point of doing something about it. I investigated a number of alternatives including Console2, Xshell and MinGW, and finally settled on using MobaXterm.  It’s a great application with a number of awesome features that I’ve not completely finished exploring yet.  It’s free for personal use, with an upgrade to a professional version for €49.  I’ll keep using it for a while to see if I want to upgrade, but I might do it just to support the work.

Hope that helps some people become more productive.