Introduction to pytest

Introduction to `pytest`

In this chapter, we are going to introduce you to pytest, a library for unit testing your code in Python.

The pytest framework makes it easy to write small tests, yet scales to support complex functional testing for applications and libraries.

Unit testing fundamentals

If this is your first time learning about unit testing, read on this section for a crash course on what it's all about. (If you're familiar with unit testing, but want to know more about pytest's idioms, move on to the next section.)

From Wikipedia:

unit testing is a software testing method by which individual units of source code—sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures—are tested to determine whether they are fit for use.

If that sounded confusing, fret not: here's a colloquial version that might help you:

Unit testing is a software testing method where we test logical "units" of source code, in isolation from other "units" of source code, to make sure they work as expected.

For the astute amongst you, there's a key skill involved in unit testing: that is learning how to break a problem down into logical "units" of work. This is an art, is problem-dependent, and developing a shared taste with others takes practice and time; learning how to write unit tests will accelerate your development on this matter.

`pytest` idioms

Before I get you writing code, it's important to make sure that you know some of pytest's idioms.

Firstly, pytest expects that the block of code that you want to test is wrapped inside a function or a class. That way, you can import it into your testing suite (more on what the "testing suite" is later!). In other words, you should have a function defined in an importable module, and that module should be installed in the conda environment. (And if even this paragraph feels awkward to you, I'd recommend reading the "Prerequisite Knowledge" chapter again.)

Secondly, pytest looks for all files that are prefixed with test_ (for example, test_something.py), and then looks for all functions in those files that are also prefixed with test_ (for example def test_somefunc()). Those will then comprise the "test suite" for your project.

Thirdly, pytest is invoked at the command line. A single basic command, with no configuration flags, looks like this:

pytest .

This tells pytest to look recursively under the current working directory for all tests to execute, and then it will execute them in order.

Exercises in Testing

With that background information out of the way, I'm going to give you a few exercises to get your hands wet with testing.

Exercise 1: Write a test for the `increment` function.

We have provided a function inside this tutorial's custom source, under testing_tutorial.functions, called increment.

The source code for the function is below:

from testing_tutorial.functions import increment 

increment??

We're going to use this function to show you the anatomy of writing a test.

Firstly, we define a new function that informatively identifies it as a test for increment:

def test_increment():
    pass

Now, we can fill it up with the "setup" for the test, i.e. the inputs that go into the function.

def test_increment():
    x = 1  # one example, "set up" to be passed into the test.

Then, we execute the function to be tested using the setup inputs:

def test_increment():
    x = 1                  # setup to be passed into the test.
    result = increment(x)  # execute the function

Finally, we assert that the result matches some expectation:

def test_increment():
    x = 1                  # setup to be passed into the test.
    result = increment(x)  # execute the function and get result.
    assert result == 2     # assert equality of result with expectation.
</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered" markdown="1">
<div class="inner_cell" markdown="1">
<div class="text_cell_render border-box-sizing rendered_html" markdown="1">
Now, I'd like you to write a more general version of this test, and place it inside `./src/testing_tutorial/tests/test_functions_student.py`. Don't break your skull trying to come up with a mathematically rigorous version, though. The intent here is to make sure you're actually thinking about some form of testing, and not just copy/pasting code.
</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered" markdown="1">
<div class="inner_cell" markdown="1">
<div class="text_cell_render border-box-sizing rendered_html" markdown="1">
Now, in your terminal, execute the following command:

```bash
$ pytest

This is the output you should expect:

$ pytest
================================= test session starts =================================
platform darwin -- Python 3.6.10, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/ericmjl/github/tutorials/data-testing-tutorial/src
plugins: cov-2.10.1
collected 2 items                                                                     

tests/test_functions.py ..                                                      [100%]

================================== 2 passed in 0.04s ==================================

This tells you that:

The collected 2 items indicates how many test functions were written.
The bottom line contains tests/test_functions.py, and is the place where the tests showed up. (These are the instructor versions, yours should show up under tests/test_functions_student.py)
The .. shows that the tests passed! (If they failed, they would show up with red Xs.)

Congratulations! You have written the first test of the tutorial!

Exercise 2: Simulating what happens when you accidentally break code

We're now going to simulate what happens when you accidentally break your code. Go ahead and change the increment function in functions.py such that it returns the wrong thing.

Then, execute the test suite!

You should see something that looks like the following:

================================= test session starts =================================
platform darwin -- Python 3.6.10, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/ericmjl/github/tutorials/data-testing-tutorial/src
plugins: cov-2.10.1
collected 2 items                                                                     

tests/test_functions.py FF                                                      [100%]

====================================== FAILURES =======================================
___________________________________ test_increment ____________________________________

    def test_increment():
        """Test for increment function."""
        x = 1
        result = increment(x)
>       assert result == 2
E       assert 3 == 2

tests/test_functions.py:8: AssertionError
_______________________________ test_increment_general ________________________________

    def test_increment_general():
        """A slightly more test for increment function."""
        x = 1
        result = increment(x)
>       assert result - 1 == x
E       assert (3 - 1) == 1

tests/test_functions.py:15: AssertionError
=============================== short test summary info ===============================
FAILED tests/test_functions.py::test_increment - assert 3 == 2
FAILED tests/test_functions.py::test_increment_general - assert (3 - 1) == 1
================================== 2 failed in 0.27s ==================================

Now, we'll see the tests fail, and the error messages will show us what's wrong!

Reading the test error message immediately tells us that the function that failed the test was increment. In particular, it failed both the example-based test and the slightly more general test. When we execute the test, we've caught where breaking changes happen, because the tests serve as an independent check on the correctness of the function.

Go ahead and fix the test, then re-run the test suite. Everything should now pass.

Some notes

This is a very good just-in-time moment to emphasize a few points points.

Anatomy of a Test

Firstly, let's revise now what the anatomy of a test is like.

You will always have a setup, a result from the execution of the function, and an assertion about the result.

from module import function

def test_function():  # `test_` in the name is very important!
    """Docstring about the test."""
    setup_value = ...               # set up the test
    result = function(setup_value)  # execution of the function
    assert result == correct_val    # assertion statement

The function should do something that is "testable", and ideally, it should run fast (like under a few hundred milliseconds). After all, if you start accumulating a bunch of functions and need to test them, then you'll end up waiting a long time for all of your tests to run.

Testing Loop

The other thing to introduce just-in-time here is the so-called "testing loop":

Write a test for a function.
Write the function.
Execute pytest.
Go back to step 1.

There's nothing complex behind the ideas of testing, 80% of your cases will boil down to doing this loop.

What kinds of tests

There are a few kinds of tests that you can write, and I'll list here the "bare minimum".

Execution tests

If you don't do anything else, then simply executing the test and making sure it runs without erroring is better than nothing. (Basically, just eliminate the assertions.)

I don't recommend doing this for everything, but if you're genuinely at a loss, then call the function inside the test and just make sure it doesn't error out.

Example-based tests

I did this in test_increment, where I provided one example input in the test, and tested that the expected result was some exact value. This is already one level up from a simple execution test.

Parametrized tests

You can read more about parametrizing a test on the pytest docs. Here, we parametrize the test function, so that pytest can provide to it a range of examples that we have pre-defined.

Property-based tests

If you read test_increment_general, that is an example of a property-based test. There, we test a property of the output that is invariant to the input given, i.e. that the output should always be one more than the input, regardless of what input is given. In the later chapters, you will see how to do this with Hypothesis.

Exercise 3: Testing a Min-Max Scaler

In functions.py, we have a function called min_max_scaler(x) for your data. It takes in a numpy array and scales all of the values to be between 0 and 1 inclusive. The min value should be 0, and the max value should be 1.

Try writing a test for the min-max scaler. It should check the following:

Given a particular array-like input (e.g. numpy array, list, tuples), it should be equal to some other array. Use the np.allclose(arr1, arr2) function to test closeness of two floating point values.
The minimum value should be 0, and the maximum value of the output should be 1.

Note: This function is also implemented in the scikit-learn library as part of their preprocessing module. However, in case an engineering decision that you make is that you don't want to import an entire library just to use one function, it is considered good practice to rewrite it on your own, provided you also test it sufficiently.

Once you're done, execute the test and check that the min_max_scaler adheres to the behaviour laid out in its docstring.

Exercise 4: Testing functions on textual data.

Imagine we have textual data, and we want to clean it up. There are two functions we may want to write to standardize the data:

bag_of_words(text), which takes in the text and tokenizes the text into its set of constituent words.
strip_punctuation(text), which strips punctuation from the text.

Now, I'd like you to invert the flow we've been following, and start by designing the tests first, before implementing the two functions in src/student_functions.py; you may wish to write additional helper functions to manage the business logic. There's leeway in this exercise; feel free to get creative! (And if you want a reference starting point, feel free to peek inside src/functions.py!)

The reason I'm asking you to design the tests first is because that will clarify how you intend to use the function, which will bring much more clarity to how you implement the function. This is also known as "Test-Driven Development".

Once you're done, execute the tests and make sure they all pass.

Conclusions

This level of testing is sufficient to get you through 80% of the day-to-day. I have used this level of testing at work to raise the level of confidence I have in the functions I write. It facilitates code sharing as well, because my colleagues can now have the confidence that the code I write is reliable and works as expected. My engineering colleagues will have confidence in taking over what I have developed, because they know they have an easy starting point for studying the code (in the test suite).

There are advanced patterns for testing that I have not gone through here, such as parametrizing the tests, checking that errors are raised, and more. For these, the pytest docs are a wonderful resource, and I would encourage you to check them out.

Meanwhile, we are going to go to the next chapter, which is on how writing tests for our data.

Introduction to pytest