Test-Driven Development with Python / R

Workshop

Frie & Alexandra

Why use TDD?

"Classic" coding

Starting a new project

  • You start writing code, and it works fine

What then happens

  • Refactoring
  • New feature that builds on top of the existing code
  • New team members work with your code

-> bugs, long debugging sessions, and anxiety about touching or refactoring old code

The solution: write tests

Example Test

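As a stand-in illustration, here is a minimal sketch of what a test can look like in Python. The function `mean` and both test names are hypothetical and not taken from the workshop material. The tests are written as plain functions with `assert`, which is the style pytest collects (functions whose names start with `test_`).

```python
# Hypothetical function under test (an assumption for illustration).
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


# A test calls the code with known input and checks the result.
def test_mean_of_several_values():
    assert mean([1, 2, 3, 4]) == 2.5


# A test can also check that invalid input is rejected.
def test_mean_of_empty_list_raises():
    try:
        mean([])
    except ValueError:
        pass  # expected: empty input is an error
    else:
        raise AssertionError("expected ValueError for an empty list")
```

Saved in a file such as `test_mean.py` (name is an assumption), these tests could be run with `pytest test_mean.py`.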

Test-Driven Development: Write tests before code

  • first think about how the code will be used, then about how to implement it
  • clearer code, less duplication
  • less unused code
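The "tests before code" idea can be sketched as follows. The function `slugify` and its expected behaviour are hypothetical, chosen only to show the order of work: the test is written first and describes the desired usage, and the implementation follows with just enough code to satisfy it.

```python
# Step 1: write the test first. It documents how we want to call the code.
def test_slugify_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  TDD  ") == "tdd"


# Step 2: running the test at this point would fail with a NameError,
# because slugify does not exist yet. That failure is expected.


# Step 3: write just enough code to make the test pass.
def slugify(text):
    """Lowercase the text and join its words with hyphens."""
    return "-".join(text.lower().split())
```

Because only behaviour demanded by a test gets implemented, this order of work tends to keep unused code out of the project.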

How does TDD work?

TDD for Data Science 1/2

Great article on TDD for data science: https://towardsdatascience.com/tdd-datascience-689c98492fcc

TDD is probably not worth the effort in the following scenarios:

  • You are exploring a data source, especially if you do it to get an idea of the potential and pitfalls of said source.
  • You are building a simple and straightforward proof of concept. Your goal is to evaluate whether further efforts are promising or not.
  • You are working with a complete and manageable data source.
  • You are (and you will be) the only person who is working on a project. This assumption is stronger than it might appear at first glance but holds for ad-hoc analyses.

TDD for Data Science 2/2

In contrast, TDD is great in these cases:

  • Analytics pipeline
  • Complicated proof of concept, i.e. different ways to solve a subproblem, clean data etc…
  • Working with a subset of data, so you have to make sure that you capture problems when new issues come up without destroying working code.
  • You are working in a team, yet you want to make sure that no one breaks the functioning code.

source: https://towardsdatascience.com/tdd-datascience-689c98492fcc
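For the pipeline and subset-of-data cases, a test can pin down how a cleaning step treats problematic records, so that a later fix for a newly discovered issue does not silently break what already works. The sketch below is an assumption for illustration only: `clean_records`, the record layout, and the age limits are hypothetical and not taken from the article.

```python
# Hypothetical cleaning step of an analytics pipeline.
def clean_records(records):
    """Keep only records with a plausible integer age (0 to 120)."""
    cleaned = []
    for record in records:
        age = record.get("age")  # None if the key is missing
        if isinstance(age, int) and 0 <= age <= 120:
            cleaned.append(record)
    return cleaned


# The test encodes every data problem found so far. When new data
# reveals another problem, add a record here first, then fix the code.
def test_clean_records_drops_missing_and_implausible_ages():
    raw = [
        {"id": 1, "age": 34},    # valid
        {"id": 2, "age": None},  # missing value
        {"id": 3, "age": -5},    # negative age
        {"id": 4},               # missing key
        {"id": 5, "age": 250},   # implausibly high
    ]
    assert clean_records(raw) == [{"id": 1, "age": 34}]
```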

Unit tests, integration tests, ...

different "levels" of testing

  • unit tests
    • Your unit tests call a function with different parameters and ensure that it returns the expected values.
    • In an object-oriented language a unit can range from a single method to an entire class.
  • integration tests: They test the integration of your application with all the parts that live outside of your application. For example: making API calls, database connections.
  • end-to-end tests: Testing your deployed application via its user interface is the most end-to-end way you could test your application.

source: https://martinfowler.com/articles/practical-test-pyramid.html
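The difference between the first two levels can be sketched in Python. All names here (`to_celsius`, `save_reading`, the `readings` table) are hypothetical. The in-memory SQLite database from the standard library stands in for a real external database, which is a simplification made for this sketch.

```python
import sqlite3


def to_celsius(fahrenheit):
    """Pure function: convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9


def save_reading(connection, celsius):
    """Store one temperature reading in the database."""
    connection.execute("INSERT INTO readings (celsius) VALUES (?)", (celsius,))
    connection.commit()


# Unit test: exercises one function in isolation, no outside systems.
def test_to_celsius_unit():
    assert to_celsius(32) == 0
    assert to_celsius(212) == 100


# Integration test: checks that our code and the database work together.
def test_save_reading_integration():
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE readings (celsius REAL)")
    save_reading(connection, to_celsius(212))
    rows = connection.execute("SELECT celsius FROM readings").fetchall()
    connection.close()
    assert rows == [(100.0,)]
```

The unit test runs quickly and points directly at the failing function. The integration test is slower to set up but catches problems at the boundary, such as a wrong table or column name.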

Contact

Alexandra Kapp
alexandra.k@correlaid.org
@lxndrkp

Frie Preu
frie.p@correlaid.org
@ameisen_strasse