Defining your regression tests can be a tedious and often difficult task. Which unit tests are likely to detect future bugs in your system? How can you keep the number of regression tests that you have to maintain and execute over the lifespan of your software small, while still covering a large share of your code?
In this article, I’ll present a very satisfying answer to all of these questions: automation.
1. Automatically assess the effectiveness of your tests
In the end, what defining regression tests comes down to is assessing the future value of each unit test.
Luckily, there are already metrics that let you judge this value for a given unit test or set of unit tests. Code coverage tells you how many lines of code are executed when a test runs (with some drawbacks, as I’ve showcased in this article: Limits at which code coverage fails catastrophically). Mutation testing goes one step further: it injects artificial defects (so-called mutants) into the code under test and reruns your tests against each of them. By measuring how many of these injected mutants a test detects, the mutation score gives you insight into the bug-finding potential of your unit tests. And isn’t this the true potential of a regression test?
So using these two metrics, you can automatically get insights into the effectiveness of your existing tests. You can even rank each unit test based on the covered lines of code, or the number of killed mutants.
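Such a ranking could look like the following Python sketch. The test names and numbers are invented for illustration; in practice they would come from the reports of your coverage and mutation testing tools, and `UnitTestMetrics` and `rank_by` are hypothetical helper names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitTestMetrics:
    name: str
    covered_lines: int    # lines executed by this test
    killed_mutants: int   # mutants this test detects


# Invented example data; real values would come from tool reports.
METRICS = [
    UnitTestMetrics("test_parse_date", covered_lines=40, killed_mutants=12),
    UnitTestMetrics("test_checkout_flow", covered_lines=310, killed_mutants=9),
    UnitTestMetrics("test_tax_rounding", covered_lines=25, killed_mutants=15),
]


def rank_by(metrics, attribute):
    """Return test names sorted by the given metric, best first."""
    ordered = sorted(metrics, key=lambda m: getattr(m, attribute), reverse=True)
    return [m.name for m in ordered]


if __name__ == "__main__":
    print("By coverage:", rank_by(METRICS, "covered_lines"))
    print("By mutants: ", rank_by(METRICS, "killed_mutants"))
```

Note that the two rankings disagree in this example: the test that covers the most lines kills the fewest mutants, which is one reason to look at both metrics.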
2. Automatically select the most effective regression tests
Even when you know the effectiveness of your individual tests, selecting which ones should be used as regression tests can be tricky. If you simply maximize the metrics described above, you will end up executing all of your tests. Obviously, the more tests you run, the more code you cover, and the higher your chance of detecting mutants.
But there is also a cost associated with each test that you keep as a regression test. You have to keep it up to date after every change in your system, and execute it after every check-in. Both cost time that you could better spend on developing new features. So you want to keep your regression test suite as small as possible.
Of course, you will choose the tests that cover a lot of code and detect many mutants along the way. But what about tests that detect some mutants, but take a very long time to execute? Or tests that cover a lot of code only because they are not actual unit tests (Good Unit Tests VS Bad Unit Tests (with C# and NUnit Examples)), but exercise several parts of your software at once and are therefore very difficult to maintain?
By defining this tradeoff mathematically (e.g. test code effectiveness = 0.5 * coverage + 0.5 * mutation score - 0.004 * execution time in milliseconds), you can automate the selection of the most effective tests for your individual project requirements, and even define a boundary for automatically selecting regression tests (e.g. test code effectiveness > 7). If you don’t want to experiment with such boundaries, you can automate further by using genetic algorithms to select the set of most effective regression tests for you (see Search Algorithms for Regression Test Case Prioritization).
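The weighted score and the boundary from the example above can be sketched in a few lines of Python. The weights (0.5, 0.5, 0.004) and the boundary of 7 are taken from the example formula; everything else is an assumption for illustration: I treat coverage and mutation score as percentages between 0 and 100, and the candidate tests and their numbers are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    coverage: float        # assumed: percent of lines covered (0-100)
    mutation_score: float  # assumed: percent of mutants killed (0-100)
    exec_time_ms: float    # execution time in milliseconds


def effectiveness(test):
    """Example tradeoff: reward coverage and mutation score, penalize runtime."""
    return (0.5 * test.coverage
            + 0.5 * test.mutation_score
            - 0.004 * test.exec_time_ms)


def select_regression_tests(candidates, boundary=7.0):
    """Names of all tests scoring above the boundary, best first."""
    selected = [t for t in candidates if effectiveness(t) > boundary]
    selected.sort(key=effectiveness, reverse=True)
    return [t.name for t in selected]


# Invented example data.
CANDIDATES = [
    Candidate("test_checkout_flow", coverage=35, mutation_score=10, exec_time_ms=4500),
    Candidate("test_parse_date", coverage=8, mutation_score=14, exec_time_ms=20),
    Candidate("test_logging_format", coverage=3, mutation_score=2, exec_time_ms=5),
    Candidate("test_tax_rounding", coverage=12, mutation_score=20, exec_time_ms=15),
]

if __name__ == "__main__":
    for test in CANDIDATES:
        print(f"{test.name}: {effectiveness(test):.2f}")
    print("Selected:", select_regression_tests(CANDIDATES))
```

With these numbers, the slow `test_checkout_flow` drops out despite its high coverage, because its 4.5 seconds of execution time cost it 18 points. Tuning the weights and the boundary is where your own project requirements come in.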
That was my take on automating regression tests – let me know your take on it in the comments section below!