Key Takeaways
- We use more than one kind of logic in everyday life and when writing code.
- Certain kinds of logic are unintuitive in abstract situations, meaning we can miss opportunities to reason logically about code.
- Much of the power of tests is that they let us apply logical reasoning automatically, even in situations that are too abstract to think intuitively about.
- We can use tests to analyse code dynamically (by running it), which can be quicker and more effective than a static code analysis.
- These techniques are valuable for understanding software with a high degree of accumulated complexity.
This article discusses a number of debugging techniques, as well as some of the theory behind the logic of common and less common software testing techniques. The main goal is to think through the role of tests in helping you understand complex code, especially when you are starting from a position of unfamiliarity with the code base.
I think most of us would agree that tests allow us to automate the process of answering a question like "Is my software working right now?". Since the need to answer this question comes up all the time, at least as frequently as you deploy, it makes sense to spend time automating the process of answering it. However, even a large test suite can be a poor proxy for this question, since it can only ever really answer the question "Do all my tests pass?". Fortunately, tests can be useful in helping us answer a larger range of questions. In some cases they allow us to dynamically analyse code, giving us a genuine understanding of how complex systems operate that might otherwise be hard won.
We will look at some code examples shortly, but first, we need to think about a simple, but famous, puzzle.
Here we have some cards with numbers on one side but the backs are different colours for some reason; maybe I mixed up the decks. I tell you that all the even-numbered cards are red on the back. How can you check that? This puzzle was devised by Peter Wason in his paper, Reasoning about a Rule.
When presented with this puzzle pretty much everyone spots that it would be a good idea to turn over the even-numbered card to see if it’s red on the back. The trick that a lot of people miss, however, is that turning over the card with the brown side showing to check that it has an odd number on the other side is just as valid. In other words, you should check the negative case as well as the more intuitive positive case.
The interesting thing that the original study showed is that one reason people often miss the trick is that our reasoning relies a lot on real-world intuition. Few people miss the negative cases if the puzzle is posed using everyday items in a way that puts money or fairness at stake. That tells us something that I think should be relevant to all programmers: When thinking in the abstract, as is usual with software, it’s easy for the logic of the situation to get lost, especially when dealing with negative cases.
Another lesson from Wason’s selection puzzle is that there are two quite different ways in which people can think. One way, called inductive thinking, looks for positive cases with something in common, such as several even-numbered cards with red backs, and attempts to draw general conclusions, often the conjecture that the pattern applies in all cases. The alternative is deductive thinking, which in some cases can be used to argue by contradiction, e.g. if there is a brown card with an even number on the other side, then not all the even-numbered cards are red on the back. This kind of logic can sometimes be unintuitive.
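To make the two directions of the check concrete, here is a small Python sketch. The deck and helper names are hypothetical, invented for illustration; in the real puzzle you only see one face of each card, which is why `cards_to_flip` works from visible faces alone.

```python
# Hypothetical deck for illustration: each card is (number, back_colour).
deck = [(8, "red"), (3, "brown"), (4, "red"), (7, "red")]

def rule_holds(cards):
    # "Every even-numbered card is red on the back" fails only on a
    # counterexample: an even number paired with a non-red back.
    return not any(n % 2 == 0 and colour != "red" for n, colour in cards)

def cards_to_flip(visible_faces):
    # Given only the visible faces, the informative flips are the even
    # numbers (the positive case) and the brown backs (the negative case).
    return [face for face in visible_faces
            if (isinstance(face, int) and face % 2 == 0) or face == "brown"]
```

Note that flipping the odd cards or the red cards tells you nothing: the rule makes no claim about them, which is exactly why the brown card is the one people tend to miss.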
As programmers, I think we’ll take any kind of inference that works for us. We don’t need to be academic about it, but if we overlook the cases where we could argue by contradiction, we may be missing opportunities to use deduction to better understand our code. How to apply deductive arguments by contradiction is the thing I want us to think through in this article. This takes us straight back to tests, because the logic of tests often has the form of an argument by contradiction. Time for an example.
If you’ll excuse the pseudocode, I’m trying here to display the name of a user on a web page, let’s say it’s their profile page.
String formatUserName(User user) {
    return user.firstName.formatName() +
        user.lastName.formatName();
}
This code works fine, but now we’re asked to support users with middle names so I change the code to handle that case.
String formatUserName(User user) {
    return user.firstName.formatName() +
        user.middleName.formatName() +
        user.lastName.formatName();
}
I’m a diligent professional so before I ship my change I run up the app in a test environment and check that the middle name now appears on the user profile page. In this case, I see the name John Peter McIntyre on the profile page, so job done, ship it.
Unfortunately, while the page works for the user I tested, it now returns a 500 error for 90% of other users. These are the ones who don’t have a middle name, so for them user.middleName is null, and trying to format it throws an exception. I don’t mean to suggest that you are silly enough to make this mistake, though I think most of us would admit to having made similar ones in the past, but the thought process we follow to spot this mistake before we make it is an argument by contradiction. The case of someone without a middle name invalidates the general rule that the code asserts.
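The fix itself is small. Here is a null-safe sketch in runnable Python; the `User` fields and the `format_name` helper are stand-ins for the article’s pseudocode, and I’m assuming `format_name` just tidies up a single name part.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    first_name: str
    middle_name: Optional[str]  # may be missing
    last_name: str

def format_name(part: str) -> str:
    # Stand-in for the article's formatName(); assume it just trims whitespace.
    return part.strip()

def format_user_name(user: User) -> str:
    # Skip missing parts instead of formatting None, so users without a
    # middle name no longer cause an exception.
    parts = [user.first_name, user.middle_name, user.last_name]
    return " ".join(format_name(p) for p in parts if p is not None)
```

The point of the article, though, is not the fix but how we find the need for it.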
Fortunately, I wasn’t actually so cavalier with my client’s website. I have some tests from when I wrote the original code and when I ran them before deploying the code this one failed:
testFormattingNameOfUserWithFirstNameAndLastName() {
    User user = new User("John", null, "McIntyre");
    String name = formatUserName(user);
    assertEqual("John McIntyre", name);
}
We didn’t have to argue by contradiction; our test did it for us. The test suite as a whole argued by contradiction in hundreds of different cases, saving us a lot of effort. Wason’s selection puzzle tells us why this is so valuable: we’re not very good at thinking deductively in abstract cases, but our tests can do that for us.
We’ve just looked at a simple example, but we can apply similar ideas when trying to understand much more complex code. Let’s suppose that we get a bug report saying that on the profile page, the middle name and the last name are appearing in the wrong order, so we actually get John McIntyre Peter. We’ve just fixed the formatting code, and we know it’s tested, so the problem must be elsewhere. It might be time to start digging through code, potentially starting with the input data and following it all the way through to the UI. It’s looking like a long day of debugging ahead of us. However, there might be a simpler way: if we can find an appropriate test that processes the user name data, it could help us understand the situation. Tests can be useful here, even if they’re currently passing, because they show the inputs and outputs of the relevant part of the system, and because each test executes only a subset of the code, the problem is often a narrower one.
So it seems we just need to look through the tests, which might not take too long. Of course, we’re assuming that the tests for data input are with the data input code, the tests for persistence are with the persistence code, and the tests for business logic are with the business-level code. But before you get too confident, I’m going to be mean and introduce a bit more reality into the scenario.
How do you organise your tests? Perhaps you read a book about software development and it told you to organise your tests like this:
This implies that while most of the tests you write are unit tests, you also have a fair number of integration tests that test more than one part of the system together, and your high-level tests that run the whole system are restricted to a few that focus on particular areas of value. If you follow this pattern then it’s probably not too hard for you to look through your tests to find the one we need to help us fix our bug. Most of your tests will probably be in predictable places and most of them will be simple enough to quickly decide if they’re relevant.
However, in reality this pattern is often not followed. I don’t want to be too normative about it, since there are reasonable people on the internet who think the right-side-up pyramid is not optimal, and some even advocate an upside-down pyramid. Let’s just acknowledge that you may well be dealing with a situation like this:
In this case, the test you’re looking for might not be close to the code it’s testing and a quick inspection might not be enough to understand each test. That’s the problem in front of us. We’d like to find a test that will help us track down our bug, but we’re dealing with a complex system that no single person understands completely. Or, we might be relatively new to the project so we don’t have much understanding of the system yet, and trying to understand it is exactly the problem we’re trying to solve.
Here’s the trick: we’re going to break the code, just slightly. Let’s rewrite our formatting function:
String formatUserName(User user) {
    return user.firstName.formatName();
}
We know our unit test won’t pass now, but that’s OK. What’s really going to give us insight into the system is running the entire suite of tests. If your suite is an upside-down pyramid, that could take a few minutes, but it’s still much faster than reading all the tests. When the tests have finished running, some of them will have failed because we broke the code. Those tests will be the ones that process user name data, which is great, because we know there’s a bug in the user name code. It turns out this integration test failed:
testSavingAndRetrievingUser() {
    SaveUserRequest request = new SaveUserRequest("{
        \"id\": 1234,
        \"user\": {
            \"firstName\": \"John\",
            \"lastName\": \"McIntyre\"
        }
    }");
    UserSystem system = new UserSystem(
        new Logging(),
        new UserPersistence(),
        new BusinessLogic(),
        new FakeExternalDependencyClient());
    system.handleSaveUserRequest(request);
    ReadUserResponse response = system.handleReadUserRequest(
        new ReadUserRequest("{ \"id\": 1234 }")
    );
    User user = User.parseUser(response.body);
    assertEqual("John McIntyre", formatUserName(user));
}
It’s not the most readable test — I’ve seen much worse in practice — but it shows us all the components that go into the process. Let’s look at UserPersistence.
class UserPersistence {
    DataSource dataSource = new DataSource();

    saveUser(User user) {
        dataSource.execute(
            "INSERT INTO user (firstName, lastName, middleName) " +
            "VALUES (?, ?, ?)",
            user.firstName, user.middleName, user.lastName);
    }
}
And we found the bug! The middle name and the last name are persisted in the wrong order. We can also see that we don’t have a test covering the persistence of the middle name, so we can make sure that’s tested now. And don’t forget to put back the formatting code we broke!
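The missing regression test might look something like this Python sketch. The in-memory store is a hypothetical stand-in for the real persistence layer, but it shows the shape of the round-trip check we want: save a user with a middle name, read them back, and assert that the parts come back in the right order.

```python
class InMemoryUserStore:
    # Hypothetical in-memory stand-in for UserPersistence; storing each name
    # part under an explicit key makes column-order mix-ups impossible here,
    # but the same round-trip test applies to the real database-backed store.
    def __init__(self):
        self.rows = {}

    def save_user(self, user_id, first, middle, last):
        self.rows[user_id] = {"first": first, "middle": middle, "last": last}

    def read_user(self, user_id):
        row = self.rows[user_id]
        return (row["first"], row["middle"], row["last"])

def test_saving_and_retrieving_middle_name():
    store = InMemoryUserStore()
    store.save_user(1234, "John", "Peter", "McIntyre")
    # The round trip must preserve the order: first, middle, last.
    assert store.read_user(1234) == ("John", "Peter", "McIntyre")
```

Run against the real persistence code, a test like this would have argued by contradiction for us the moment the column order went wrong.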
So breaking the code slightly brought us straight to a test that helped us understand the buggy part of the system and fix the problem. The tests helped us do more than inspect the code statically; we performed a dynamic analysis that identified relevant parts of the code. I’m not sure how well-known this technique is, but I’ve explained it to enough experienced developers to believe it’s underused. It’s also a great example of using negative cases to understand complex software systems.
I’m not suggesting that we all abandon inductive thinking and rely solely on deduction. We’re not machines and inductive thinking comes naturally to us even if, as we saw in the first example, it can sometimes lead to disaster. But understanding where we can take advantage of the logic of arguing by contradiction can really help when developing software.
As we saw in the first example, a test that was originally written to confirm the correctness of a particular feature becomes a negative test for subsequent features that relate to the same code, potentially telling us if our new code doesn’t satisfy the original requirement. Repeatedly subjecting our codebase to attempts to falsify our implementation, by running the suite of all previously written tests, gives us more confidence in our implementation than we would get if each test only applied to the case that it was written for.
Software is complex, often to the extent that it’s impossible to understand in totality, so having the ability to dynamically analyse code using tests can lead us to answers much more quickly than a static inspection of the code itself. Few of us study logic on the way to becoming great programmers, but a little understanding can help elucidate some common and less common programming techniques.