Ian Bicking: a blog :: Runtime vs. Test time

{ 2008 02 06 }

Runtime vs. Test time

Defenders of static typing have long claimed that it helps them avoid bugs, the compiler providing a constant test of code consistency. Agile practitioners counter that tests do the same thing and much more.

I find myself inclined to something more in the middle. I don’t care for static typing, but I think programming by contract has a lot of validity, is more powerful and helpful than static typing, and works just as well in a dynamically typed language. Programming by contract is more a principle than a particular technology. Unit testing is not xUnit; that’s just one way to do it. You can do unit testing perfectly well without any framework or special infrastructure at all. Similarly, programming by contract just means: state and check your up-front expectations in a piece of code, and state and check your expectations for what that code returns or does. This can be as simple as “assert isinstance(n, int)” in your code.
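As a minimal sketch of what that can look like in Python, the function below states its expectations as plain assertions: preconditions on the way in, a postcondition on the way out. The name `apply_discount` and its parameters are invented for illustration and do not come from any library.

```python
def apply_discount(price_cents, percent):
    """Return the price in cents after taking `percent` off (hypothetical example)."""
    # Preconditions: state and check what we expect from the caller.
    assert isinstance(price_cents, int) and price_cents >= 0, \
        "price_cents must be a non-negative int, got %r" % (price_cents,)
    assert isinstance(percent, int) and 0 <= percent <= 100, \
        "percent must be an int between 0 and 100, got %r" % (percent,)

    discounted = price_cents * (100 - percent) // 100

    # Postcondition: state and check what we promise to return.
    assert 0 <= discounted <= price_cents
    return discounted
```

Calling `apply_discount("1000", 25)` fails immediately with an `AssertionError` instead of passing a string further into the system. Keep in mind that Python skips `assert` statements when run with the `-O` option, so checks written this way are only active in a normal, non-optimized run.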

Programming by contract is not enough to produce quality code. Checking the contract at runtime only means that exceptions are raised early, so bad or inconsistent data does not propagate into the system. And it only works if you actually go through the code paths that invoke those contracts.
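To illustrate the code-path point, here is a small sketch in which a contract guards only one branch. The `LABELS` table and the `describe` function are hypothetical names made up for this example.

```python
LABELS = {1: "one", 2: "two"}  # hypothetical lookup table


def describe(n, verbose=False):
    """Describe an integer, optionally with its label (hypothetical example)."""
    assert isinstance(n, int), "n must be an int, got %r" % (n,)
    if not verbose:
        return str(n)
    label = LABELS.get(n)
    # This contract is only checked when the verbose branch actually runs.
    assert isinstance(label, str), "no label for %r" % (n,)
    return "%d (%s)" % (n, label)
```

Here `describe(3)` returns normally, while `describe(3, verbose=True)` raises an `AssertionError`. Until something exercises the verbose branch with an unlabeled number, the contract never gets a chance to complain.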

Here’s where functional tests come in. A functional test does stuff. In a web application a functional test does a simulated HTTP request and gives you the result. Generally they are high-level and go through the entire application stack. This is in contrast to unit tests that test just one piece of code.
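A rough sketch of such a simulated request, using only the standard library and assuming the application is a WSGI callable. The `app` and `simulate_get` names are invented for this example; a real project would more likely use a dedicated testing library.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A tiny stand-in WSGI application (hypothetical)."""
    body = ("Hello from %s" % environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def simulate_get(wsgi_app, path):
    """Call the app the way a server would; return (status, headers, body)."""
    environ = {"REQUEST_METHOD": "GET", "SCRIPT_NAME": "", "PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], captured["headers"], body


# The functional test itself: one request through the whole stack.
status, headers, body = simulate_get(app, "/greeting")
assert status == "200 OK"
assert b"/greeting" in body
```

No socket or server is involved; the request is just a dictionary handed to the application, which keeps a test like this reasonably fast while still running the full stack.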

Proponents of unit tests say that functional tests are problematic. They often give confusing output, so that you have to work harder to debug problems when you are invoking so much code. You have combinatorial problems where it can be difficult to run tests that really get to all branches of the code. Functional tests can be slow.

Of course no one really claims you shouldn’t have a mixture of both functional and unit tests for your code; this is more a question of what the best balance is. Functional tests without runtime checks in your code are a problem. They can be fragile because people check every detail of how the application responds in order to verify that specific features work. But every detail that is verified is a detail that might change and require updates to the tests.

With runtime checks, functional tests can be much more compact. You can do things like simply verify that a page renders, or that code completes without an exception. If you have some confidence that it won’t be completely wrong (wrong as in without any exception but with errors), then simply exercising the code can be sufficient. In addition to getting more use out of your functional tests, you also have some assurance that production code will act in a sane way: that is, if it is broken, the code will act broken. Given a good deployment situation, the turnaround on fixing deployed bugs can be much shorter.
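A sketch of that kind of compact functional test, assuming the rendering code carries its own runtime checks. Both `render_profile` and `smoke_test` are hypothetical names for this example.

```python
def render_profile(user):
    """Render a (hypothetical) profile page as a string."""
    # Runtime checks: fail loudly instead of rendering nonsense.
    assert isinstance(user, dict), "user must be a dict, got %r" % (user,)
    name, age = user.get("name"), user.get("age")
    assert isinstance(name, str) and name, "user needs a non-empty name"
    assert isinstance(age, int) and age >= 0, "user needs a non-negative age"
    return "<h1>%s</h1><p>Age: %d</p>" % (name, age)


def smoke_test(users):
    """Exercise the page for each user; any failed check raises."""
    for user in users:
        render_profile(user)  # deliberately no assertions on the markup


smoke_test([{"name": "Ada", "age": 36}, {"name": "Bob", "age": 0}])
```

The test says nothing about how the page looks, so changing the markup does not break it. It relies on the checks inside `render_profile` to turn bad data into an exception.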

10 Comments

Jake McArthur says:

February 7, 2008 at 12:19 am

You have the right idea, but there is another step you can take with this. Your contracts can be statically checked! Unfortunately, we don’t have many (any?) general purpose programming languages that can do this yet, but the theory is sound, and theorem provers (Coq, for example) already use the technique.

The idea is that, in a dependently typed language (one in which your types can depend on the values… which sounds weird but these can be statically checked), you pass around proofs. That is, arguments in your functions call for propositions (preconditions to assert), and calling these functions will only compile if you supply appropriate proofs of these propositions. Likewise, the function can return proofs (postconditions) to be used elsewhere as well. That’s a very rough metaphor of what you actually do, anyway.

So design by contract isn’t really a substitute for static typing after all, since it can be statically checked!

Ian Bicking says:

February 7, 2008 at 12:25 am

Sometimes a contract can be statically checked. But as we all have learned from statically typed languages, this kind of language analysis can lead to blowback for the programs, restricting the kinds of programs you can write in order to facilitate the static analysis. There’s simply no static analysis that is smart enough to accept all correct programs. This goes double for contracts.

Of course a contract can also get kind of expensive at runtime. E.g., “for item in seq: assert item is None or isinstance(item, int)”. You have to consider just how far you are willing to go to check values up-front. But then an assert in the middle of a function (e.g., inside a for loop) can be just as good, and often much cheaper.

Jake McArthur says:

February 7, 2008 at 12:49 am

That is true. You generally have to be willing to sacrifice Turing completeness in order to be able to statically check all properties. Then again, the kinds of contracts that would impose this restriction could just be checked at runtime anyway. There are not many cases I can think of that this would be strictly necessary though. Turing completeness is overrated most of the time.

Anyway, it’s still an open area of research. We will have practical results from it eventually though!

Tony Morris says:

February 7, 2008 at 2:28 am

Please read http://cdsmith.twu.net/types.html and stop spreading the Agile memetic fallacy.

Michael Foord says:

February 7, 2008 at 6:35 am

Although you don’t mention it, I wonder if you miss the point about TDD. TDD doesn’t produce better code because it produces more tested code, it produces better code because it is a better design process. This is overwhelmingly my experience (based on a massive two years worth of experience!) and has irrevocably changed the way I program.

Having a full test suite is a great side effect that is very useful for refactoring, but it is only a side effect.

(Your ‘Website’ field in the comment form doesn’t allow enough characters by the way!)

mike bayer says:

February 7, 2008 at 10:06 am

ian -

how do you deal with the performance latency as well as source code bloat introduced by injecting argument checkers throughout all methods and functions? what I like about unit tests is that you can get similarly good results without weighing down the application’s core functionality, with the “argument checkers” focused mainly on the public-facing API functions. it’s still “design by contract” but the contract lives externally, with no restrictions on how deeply and completely it can validate functionality.

mmj says:

February 8, 2008 at 3:17 am

(term “type” in this post means anything type-checkable – class, method, generic types, etc.)

Tests (and similarly runtime contract checks) are not that suitable if you want to enforce certain rules across the whole code base. I learned to program by designing types so that certain (design?) errors that I encountered during development are impossible (or very hard) to express using the new types. For example, when I find a bug, I always think: “could the bug fix be expressed as a type?”. If it can, you introduce the new type and delete the old one and the compiler will simply flag all possible bugs (= usage of old type) for you**.

Of course, it takes quite some practice to design classes/functions in a way that bugs will actually be prevented AND your team members will understand your code. I do use both unit and functional tests in addition, though, because they are more suitable for more localized types of bugs.

So, the last 6 years on the project were largely just a constant iteration (sound familiar?:) of type modifications hugely aided by static compiler checks.

** Note, this isn’t just refactoring, which is done in dynamic languages. For example, we introduced new multi-threading primitives, because old ones were found to be problematic in certain cases. You might know that multi-threading is simply not testable dynamically (not with current tools at least); you just have to enforce it statically.

mmj says:

February 8, 2008 at 3:27 am

Oh, the important part: you have to use a language with an expressive type system. I was strongly influenced by Haskell and ML, but the project is in C++ for practical reasons. Java for example would be too weak to express all the types that we use. We’d probably have to make a Java “precompiler” and insert our own checks in the process.

Wouter Lievens says:

February 8, 2008 at 4:22 am

Static typing is, imo, a subset of design by contract: an error will be raised when an inappropriate type is passed to a function.

The mere fact that static typing produces these errors at compile time is a detail, it just means that some contracts could (or should?) be checked at run time.

Ian Bicking says:

February 12, 2008 at 3:22 pm

Michael Foord: I agree, TDD is a useful design methodology. But I’m describing a difference between unit and system and runtime tests, not test-first and test-last. Maybe it would be helpful to clarify that my post is much more applicable to the maintenance and debugging of existing software, software that has already been designed.

Mike Bayer: I don’t think it’s usually very bad in practice, especially for the places where runtime checks are most applicable, which is inside larger integration projects, as opposed to small reusable libraries. Though reusable libraries also should have good checks, as it makes the library much more pleasant to use. So… maybe it applies to both. Either way, I don’t think the efficiency is a big problem, and I think it’s a good tradeoff. A simple assertion is generally quite fast.

Also, these checks are often best added in response to real errors found during debugging (even during debugging of code written with TDD, as TDD also implies lots of failing tests). The total number of errors you could check for is huge, but the number of errors that a programmer is likely to make is much smaller. So maybe coming back to Michael’s post, this actually does make sense along with TDD, as you still have lots of bugs in a TDD process; those are just the bugs you pre-identified using TDD, and fixing the test and writing the code becomes the same thing.
