Being Unitized

I was interested to read "Don't Let Yourself Get Unitized", I guess because the author refers to problems with JUnit, and unittest is based on JUnit, and I don't like unittest that much.

But the article turns into a curmudgeonly rant and his underlying resentment of unit testing comes out:

Consider the goals and tenets of unit testing:

Very small "units" are tested

Testing is almost always done of individual components in isolation from other components

Mocking strengthens the isolation aspect

The code and the tests are almost always written by the same person

Taken together this means that unit tests are testing the lowest level pieces of your code, each in turn and in isolation from all other pieces, and the definition of the tests and the code are done by the same person.

This sort of testing catches what I consider "low hanging fruit". It catches problems-in-the-small. It'll find individual methods or classes which don't match what the unit tests say should happen.

This is a good thing and provides very valuable feedback on the correctness of your code. But keep in mind it _only_ catches low hanging fruit. By design, unit testing is supposed to be easy, and to consider individual small pieces of a system in isolation. Because of this, by its very nature, unit testing does not consider the _composition_ of a system, only its individual parts. Unit tests never check the interconnections of an application, it never checks how they are wired together.

In my experience, the interconnections and "wiring" of an app is where most of the complexity of the application lies. The wiring defines your design, and if considered at a high enough level it can even be considered to capture your architecture. How information flows across many software layers and between many components really defines what an application does. And the very definition of Unit Testing is that it does not test these aspects of an application. Unit testing ignores information flow across software layers and components, ignores how classes and objects are interrelated and are put together into larger designs and architecture. This means that unit tests can catch simple errors in individual pieces of code, but say nothing whatsoever about your system's design or architecture. And what makes or breaks an application really is the overall design and architecture. The design and architecture capture your system's performance, its memory use, the "end-to-end" correctness from the user's inputs out to whatever servers you might be using, and the round trip back again. How all the wiring interconnects shows the true system behavior, and it is in this area where the toughest bugs and problems lie, and where people sweat blood to get things right. Writing individual components in isolation is easy. It's hooking them together into a cohesive whole that's hard - and unit tests only pass judgement on the individual parts in isolation, not the whole.

Getting one component to act "correctly" in a system is almost always a pretty trivial exercise. Writing one component in isolation is not the difficult part of computer programming. Any single small component of a system is generally easy to code. The hard part of development comes in getting all of the components of a system to work together - to get the wiring right. Unit tests can verify that each of your individual components does what you the developer thinks it should do. But by its very definition, unit testing cannot check the more complex "wiring" - and the wiring is where most of our design, development, and debugging time goes into.

There are also some weird digs at Martin Fowler (and I guess all consultants), which seem out of the blue and a little mean. The guy has a chip on his shoulder. But anyway, I'll respond with my thinking on unit tests.

First: "Getting one component to act 'correctly' in a system is almost always a pretty trivial exercise." Sure, getting it to act "correctly" is easy. Getting a component to act correctly without scare quotes is much harder. If your component acts correctly, all that wiring will work correctly too (by definition). When he's talking about "correctly" I think he really means "according to some spec that was delivered to the programmer." This is a symptom of defensive programming, where the programmer isn't at fault if there's a bug in the spec, or a bug in the larger design, or what-have-you. A responsible programmer cares about the success of the larger project, and judges correctness based on the utility and reliability of their code in that larger project, so there is no "correct" that does not include the system.

But another error is in how he views "units". A unit test is small and isolated, and tests a small amount of code. But a "small amount of code" is a matter of perspective. I shudder to think about the amount of code involved in the Python expression a = some_dict.copy(). There's the Python compiler, and VM, and the dictionary implementation, and the code behind classes and types, and underneath that is libc and the kernel and who knows what. And the combinations! Most of those pieces have multiple implementations: alternate dictionary-like objects, alternate VMs, many underlying platforms. We have created ourselves a Tower of Babel, an incredible monument to abstraction. But God hasn't struck us down... well, we have been "cursed" with a multitude of languages, but our efforts have not collapsed.

And yet despite all that underlying code, that one line is too small to be unit tested. So how did we get there? For the purpose of what we're testing we ignore the code and trust the abstractions. All those pieces underneath it are mature and well tested, having undergone years of development and testing by a large number of developers.

When I create a system, I have to build those same kinds of abstractions, reliable pieces that I can trust. When I later create code that depends on those libraries, I test that code, I do not test the library.
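As a rough illustration of testing my own code while trusting the library underneath, here is a minimal sketch. The helper with_defaults and its test names are hypothetical, made up for this example; the tests check only what the helper promises and take dict.copy() on faith.

```python
import unittest


def with_defaults(settings, defaults):
    """Return a new dict: defaults overridden by settings (hypothetical helper)."""
    merged = defaults.copy()  # trusted: dict.copy() itself is not under test
    merged.update(settings)
    return merged


class WithDefaultsTest(unittest.TestCase):
    def test_settings_override_defaults(self):
        result = with_defaults({"port": 8080}, {"host": "localhost", "port": 80})
        self.assertEqual(result, {"host": "localhost", "port": 8080})

    def test_inputs_are_not_mutated(self):
        # This is a promise made by with_defaults, so it is tested here.
        defaults = {"port": 80}
        with_defaults({"port": 8080}, defaults)
        self.assertEqual(defaults, {"port": 80})
```

Saved in a file, this can be run with python -m unittest. Nothing in it checks that copy() really copies; that part is someone else's mature, well-tested abstraction.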

But unlike foundational code like libc, I can't put years of effort into building my abstractions. I have deadlines, and anyway I don't actually want to work on a single project for years at a time, all the way through to some time-based maturity, suffering through an unreliable interim and using QA departments and other systems that require bureaucracy.

Unit testing is in this way circular. Unit testing allows me to bring something to maturity faster, to practically will it to maturity. Along the way I have to make compromises -- I have to keep my code decoupled, and I have to write for testability. But I make those compromises because they allow me to move beyond my code, to build up things that don't need to be constantly revisited and retested according to context.

This is circular, because only with this foundation can I move to testing higher-level code in "isolation". Because of the Stable Dependencies Principle, I can't build something reliable and mature on top of something that is unreliable and immature. I can only make my code "isolated" if I have real trust that my underlying code is well defined and functional, that I can ignore it the way I ignore libc and the virtual machine. And I can only do that with unit tests.
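To make the layering concrete, here is a small sketch with two hypothetical functions, parse_pair and parse_config, invented for this example. The lower layer gets its edge cases pinned down once; the test for the higher layer then covers only what that layer adds, and otherwise treats parse_pair the way I treat libc.

```python
import unittest


def parse_pair(text):
    """Split 'key=value' into a (key, value) tuple, stripping whitespace."""
    key, sep, value = text.partition("=")
    if not sep or not key.strip():
        raise ValueError("expected key=value, got %r" % text)
    return key.strip(), value.strip()


def parse_config(lines):
    """Build a dict from lines, skipping blank lines and '#' comments."""
    pairs = (parse_pair(line) for line in lines
             if line.strip() and not line.lstrip().startswith("#"))
    return dict(pairs)


class ParsePairTest(unittest.TestCase):
    # Lower layer: the edge cases are settled here, once.
    def test_strips_whitespace(self):
        self.assertEqual(parse_pair(" a = 1 "), ("a", "1"))

    def test_value_may_contain_equals(self):
        self.assertEqual(parse_pair("a=b=c"), ("a", "b=c"))

    def test_rejects_missing_separator(self):
        with self.assertRaises(ValueError):
            parse_pair("no separator")


class ParseConfigTest(unittest.TestCase):
    # Higher layer: only what parse_config adds (skipping and collecting).
    def test_skips_blanks_and_comments(self):
        lines = ["# comment", "", "a=1", "  b = 2"]
        self.assertEqual(parse_config(lines), {"a": "1", "b": "2"})
```

The test for parse_config does not repeat the whitespace or separator cases. It can afford to be that narrow only because the tests underneath it exist.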

Ian Bicking: the old part of his blog
