What Are Unit Tests For?

July 26, 2017

I was really confused about this, so I talked to Victoria Kirst who teaches Computer Science at Stanford and was a Google engineer in the past. She really elucidated unit tests for me! Here are my takeaways.

Unit tests are to make sure you haven’t broken anything in the program when you change it. In order words, unit tests are there to prevent regression in your code.

When should I write unit tests?

For exploratory code, unit tests should be written when you have a first prototype of your codeUnless you’re doing test-driven development. . The point is to write tests when your code stabilizes. If you write before that, chances are your classes are going to change so much re-writing tests will slow you down.

There are exceptions. If you are writing a small fixed module as part of a large codebase, for example a weekly calendar calculator, then it’s a good idea to write a test: off-by-one errors are easy to make, the small module isn’t likely to change, and it’s quicker to run the test than check manually.

What should I test for?

You should test for every. single. use case your program executes. If you have three if statements, then your test needs to execute all three.

Unit tests are an automatic test against stupid errors that you would do every time you change your code if you were disciplined enough to do it.

Writing the right tests will catch your errors, but it won’t prevent you from writing erroneous code in the first place.

Chances are you will write wrong code, then test hopefully catches your error. When it does, you will have to manually to rectify your mistake in a way that doesn’t introduce more mistakes.

What should it return?

A good unit test returns nothing or very little when everything works, and a precise error when something doesn’t.

A good way to do this is with assert statements. Let’s do an example.

Example

Say you have a function which takes as input a number $n$, and outputs $n/2$ if $n$ is even, and $3n + 1$ if $n$ is odd.Also known as the Collatz function ;).

Your code file code.py will look this:

def f(n):
	if n % 2 == 0:
		return 2 * n
	else:
		return 3 * n + 1

If our function works as intended, then $f(4) = 8$ and $f(5) = 16$. The associated unit test file test.py look like this:

from code import *

assert f(4) == 8
assert f(5) == 16

print "Tests passed!"

Running test.py will print “Tests passed!” as the function is correct.

Now let’s say that in handling your code, you’ve inadvertently deleted the + 1 at the end of the return function, such that the last line of the code went from

		return 3 * n + 1

		return 3 * n

then running test.py will return this to the command line:

$ python code.py
Traceback (most recent call last):
  File "code.py", line 8, in <module>
    assert f(5) == 16
AssertionError 

alerting you that you’ve accidentally broke your code without you having to check that specific portion of your code. If the n % 2 == 0 branch broke, test.py would have returned an assert error on f(4) without you having to manually check for the error.

The larger your code becomes, the more this pays off.

This is especially important when you have two or more people working on the same code base. The unit test can make sure your changes to the code doesn’t break someone else’s feature that you may not even know about, and vice versa.

How should I use a unit test?

This is where unit tests really shine! Once you’ve written them, they are super efficient and allow you to be an unattainably disciplined and meticulous human being by just typing a command! Say you’re done with the first prototype of your code and now you want to change something in it.

The temptation is to the the following:

Make the change.
Run a print statement or some manual test.
Eyeball that everything works as it should.

Instead, we can do the following:

Make the change.
Run the unit test, for example by typing $ python test.py.

on the command line. You’re now verifying every single line of your code and checking that your change hasn’t broken anything with respect to the test.

If something is wrong, an assert error will be thrown. Otherwise, “tests passed” or whatever string you decided to use will be printed!

What is a well-written test file?

The reader can understand what the unit test is doing.
The reader can understand why the unit test is doing what it’s doing.
The unit test should read like a set of instructions (unlike in code, redundancy is ok).

For example, your unit test could check that login is doing what it’s supposed to do on a web app. So your unit test should read like the following:

case 1

Go to the page.
Enter wrong username and password.
Check that the login fails.

case 2

Go to the page.
Enter correct username and password.
Check that login succeeds.

Line 1 and 2 are redundant, and you wouldn’t want that in a code. But in unit tests, this is much better than say, making 1 and 2 a function with the user-password combination, because that’s too complicated and hard to read.

If unit tests are too hard to write, there might be something wrong with the code itself. Unit tests, like code, should be as simple as possible.

When unit tests are well-written, they can even serve as documentation. In some sense, they are the first user of your code!

When to update the unit tests?

Whenever you have added a new functionality into your code. In particular, if you find a bug in your program and resolve it, then you should write a unit test which reproduces the bug (the edge case) and test that the code is no longer erroneous.

That way, you will never unknowingly encounter the bug again as long as you run the unit test after each code change!

Note on random variable assignments

Unit tests are written using expected outputs. If the output depends on random value, then obviously this becomes impossible to do without a fix.

If you are using random generators in your code, it is possible to make the code generator “random” things in a deterministic pattern by using a seed.

If a seed is not possible, the pickle module in Python can save a data structure in your hard disk and restore it. Instead of assigning a new variable that’s random, you can simply reload the random variable.

A big thank you to Victoria Kirst. This post was written from the perspective of a beginner.