Unit testing for Data Science in Python

All We Need is Data, itself ! · April 3, 2022

Data Science datacamp python

pytest docs

API ref: https://docs.pytest.org/en/6.2.x/reference.html

Assert

assert: raises an AssertionError if the condition that follows it is not true

ref: https://wikidocs.net/21050

understanding test result report

general information
test result
- F : failure (an exception is raised)
- . : passed

.F. : pass, fail, pass

information about failed tests

test types

data module
feature module
models module
unit test
- unit : small, independent piece of code

Mastering assert statements

The optional message is shown when the assert fails and raises an AssertionError.

actual = [ sth ]
expected = None
message = ( sth )

assert actual is expected, message
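
The template above can be filled in as follows. This is only a sketch: row_to_list() is a hypothetical helper written for this example, not something defined in these notes.

```python
def row_to_list(row):
    """Hypothetical helper: return the two tab-separated fields of a row,
    or None if a field is missing."""
    fields = row.rstrip("\n").split("\t")
    if len(fields) != 2 or "" in fields:
        return None
    return fields


def test_on_missing_value():
    actual = row_to_list("\t293,410\n")  # the first field is empty
    expected = None
    # The message is only displayed if the assertion fails
    message = f"Expected {expected!r}, but got {actual!r}"
    assert actual is expected, message
```

Using `is` rather than `==` is the usual way to compare against None.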

Don't do this

Don't compare float values with ==. Use pytest.approx() instead.
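
A minimal sketch of why this matters, using the classic 0.1 + 0.1 + 0.1 case (the test name is made up for the example):

```python
import pytest


def test_float_sum():
    total = 0.1 + 0.1 + 0.1
    # Binary floating point cannot represent 0.1 exactly,
    # so the sum is not exactly equal to 0.3
    assert total != 0.3
    # pytest.approx() compares within a small tolerance
    assert total == pytest.approx(0.3)
```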

testing for exceptions instead of return values

with pytest.raises(ValueError):
    sth  # code that is expected to raise ValueError
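
As a rough illustration, the test below passes only if the code inside the with block raises ValueError. split_row() is a hypothetical function invented for this sketch.

```python
import pytest


def split_row(row):
    """Hypothetical function: split a tab-separated row into its fields."""
    if "\t" not in row:
        raise ValueError(f"row has no tab separator: {row!r}")
    return row.split("\t")


def test_raises_on_missing_tab():
    # The test fails if no ValueError is raised inside the block
    with pytest.raises(ValueError) as exc_info:
        split_row("1,801 201,411")
    # exc_info.value is the exception that was raised
    assert "no tab separator" in str(exc_info.value)
```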

Test Driven Development (TDD)

def convert_to_int(integer_string_with_commas):
    comma_separated_parts = integer_string_with_commas.split(",")
    for i in range(len(comma_separated_parts)):
        # Missing comma: no part may be longer than 3 characters
        if len(comma_separated_parts[i]) > 3:
            return None
        # Incorrectly placed comma: every part after the first must have exactly 3 characters
        if i != 0 and len(comma_separated_parts[i]) != 3:
            return None
    integer_string_without_commas = "".join(comma_separated_parts)
    try:
        return int(integer_string_without_commas)
    # int() raises ValueError for float-valued strings such as "6.9"
    except ValueError:
        return None
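
In TDD the tests are written before the function. The sketch below shows tests that convert_to_int() is meant to satisfy. The test names and input strings are my own choices, and a compact copy of the function is repeated only so that the snippet runs on its own.

```python
def convert_to_int(integer_string_with_commas):
    # Compact copy of the function above, repeated so this snippet is self-contained
    parts = integer_string_with_commas.split(",")
    for i, part in enumerate(parts):
        if len(part) > 3 or (i != 0 and len(part) != 3):
            return None
    try:
        return int("".join(parts))
    except ValueError:
        return None


def test_with_no_comma():
    assert convert_to_int("756") == 756


def test_with_one_comma():
    assert convert_to_int("2,081") == 2081


def test_on_string_with_missing_comma():
    # "178100" is longer than 3 characters, so a comma is missing
    assert convert_to_int("178100,301") is None


def test_on_string_with_incorrectly_placed_comma():
    # "72" follows a comma but does not have exactly 3 characters
    assert convert_to_int("12,72,891") is None


def test_on_float_valued_string():
    # int("6.9") raises ValueError, which the function turns into None
    assert convert_to_int("6.9") is None
```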

How to organize a growing set of tests?

Run all the tests in a test class using its node ID, where [name] is the test class name:
!pytest models/test_train.py::[name]
Run only the previously failing test, again using its node ID:
!pytest models/[name].py::[f name]
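
A sketch of how a test class maps to node IDs. The function to_int(), the class TestToInt, and the file path in the comments are all hypothetical and exist only for this example.

```python
def to_int(text):
    """Hypothetical function under test."""
    return int(text.replace(",", ""))


class TestToInt:
    # Test classes group all the tests for one function.
    # Their methods take self as the first argument.
    def test_without_comma(self):
        assert to_int("756") == 756

    def test_with_one_comma(self):
        assert to_int("2,081") == 2081


# Assuming this file were saved as data/test_to_int.py (hypothetical path):
#   pytest data/test_to_int.py::TestToInt
#       runs every test in the class
#   pytest data/test_to_int.py::TestToInt::test_with_one_comma
#       runs a single test in the class
```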

Expected failures and conditional skipping

conditional skipping

class [name]:
    @pytest.mark.skipif([condition], reason="sth")
    def [test name](self):
        assert ~

Commands that show reasons in the test result report:
- only the reasons for skipped tests: !pytest -rs
- only the reasons for expected failures: !pytest -rx
- both skipped tests and expected failures: !pytest -rsx
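
A minimal sketch of both markers side by side. train_model() is a hypothetical unimplemented function, and the version condition is only an example.

```python
import math
import sys

import pytest


def train_model(points):
    """Hypothetical function that has not been written yet."""
    raise NotImplementedError


# Expected failure: the test is reported as xfailed instead of failed
@pytest.mark.xfail(reason="train_model() is not implemented yet")
def test_train_model():
    assert train_model([(1, 1), (2, 2)]) == (1.0, 0.0)


# Conditional skipping: the test is skipped when the condition is True
@pytest.mark.skipif(sys.version_info < (3, 8),
                    reason="math.prod needs Python 3.8 or newer")
def test_product():
    assert math.prod([2, 3, 4]) == 24
```

Both decorators can also be placed on a test class, in which case they apply to every test in it.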

Mocking

Mocking means testing functions independently of their dependencies. Packages used:
- pytest-mock
- unittest.mock

Mocker

# call is imported with: from unittest.mock import call
# The mocker argument is the mocking fixture provided by pytest-mock
def test_on_raw_data(self, raw_and_clean_data_file, mocker):
    raw_path, clean_path = raw_and_clean_data_file
    # Replace the dependency with the bug-free mock
    convert_to_int_mock = mocker.patch("data.preprocessing_helpers.convert_to_int",
                                       side_effect=convert_to_int_bug_free)
    preprocess(raw_path, clean_path)
    # Check if preprocess() called the dependency correctly
    assert convert_to_int_mock.call_args_list == [
        call("1,801"), call("201,411"), call("2,002"), call("333,209"),
        call("1990"), call("782,911"), call("1,285"), call("389129"),
    ]
    with open(clean_path, "r") as f:
        lines = f.readlines()
    first_line = lines[0]
    assert first_line == "1801\t201411\n"
    second_line = lines[1]
    assert second_line == "2002\t333209\n"
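
The same idea can be sketched without any project files, using only unittest.mock from the standard library. Everything here is a stand-in: helpers is a SimpleNamespace that plays the role of a module such as data.preprocessing_helpers, and row_to_integers() is a hypothetical function under test.

```python
from types import SimpleNamespace
from unittest.mock import call, patch


def convert_to_int_buggy(text):
    # Stand-in for a dependency that is currently broken
    raise RuntimeError("dependency is broken")


def convert_to_int_bug_free(text):
    # Hard-coded, known-good answers for the inputs used in the test
    return {"1,801": 1801, "201,411": 201411}[text]


# Hypothetical stand-in for a helpers module
helpers = SimpleNamespace(convert_to_int=convert_to_int_buggy)


def row_to_integers(row):
    # Function under test: looks up its dependency on helpers at call time
    return [helpers.convert_to_int(field) for field in row.split("\t")]


def test_row_to_integers():
    # Replace the dependency with a mock for the duration of the block
    with patch.object(helpers, "convert_to_int",
                      side_effect=convert_to_int_bug_free) as mock:
        result = row_to_integers("1,801\t201,411")
    assert result == [1801, 201411]
    # The mock records every call, so the arguments can be checked too
    assert mock.call_args_list == [call("1,801"), call("201,411")]
```

The test passes even though the real dependency is broken, which is the point: a failure here would indicate a bug in row_to_integers() itself, not in convert_to_int().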