Functional Programming Unit Testing - Part 4
In our previous installment, we talked about bringing traditional xUnit tests and QuickCheck property-based tests together in a single cohesive step. For this installment, let's talk about test coverage.
But, before we continue, let's get caught up to where we are today:
- Part 1 - xUnit Frameworks - HUnit
- Part 2 - Property-Based Tests - QuickCheck
- Part 3 - QuickCheck + xUnit Tests Together
Code Coverage
Code coverage is an important metric in our design process that describes the degree to which our source code has been tested. Coverage tools inspect the code directly, which makes this a form of white-box testing. I believe a high code coverage percentage is important, although a hard-line requirement of 100% path coverage is most often unnecessary, and I'd go as far as calling it evil. However, for some applications, such as safety-critical systems, some form of 100% coverage should be considered.
What do we consider as part of the criteria when we're calculating code coverage?
- Function coverage: All functions in the program called?
- Statement coverage: All lines in the program called?
- Branch coverage: All control structures such as if/then/else evaluated to true and false?
- Condition coverage: All boolean sub-expressions evaluated to true and false?
- Path coverage: All possible routes through the program called?
- Entry/exit coverage: All possible call and return of the function executed?
Of course, some of these criteria are connected to one another:
- Decision (branch) coverage includes statement coverage, since exercising every branch must lead to exercising every statement.
- Path coverage includes branch coverage.
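To make the difference between branch and condition coverage concrete, here is a minimal sketch. The `discount` function and both input lists are hypothetical and exist only for illustration. The first list takes the `if` down both branches, yet never sees `total > 100` evaluate to True, so it falls short of condition coverage.

```haskell
module Main where

-- Hypothetical rule: one decision (the `if`) built from two conditions.
discount :: Bool -> Int -> Int
discount isMember total =
  if isMember || total > 100
    then 10
    else 0

-- Branch (decision) coverage: the `if` is seen both True and False.
-- `total > 100` is only evaluated for the second input, where it is False.
branchInputs :: [(Bool, Int)]
branchInputs = [(True, 50), (False, 50)]

-- Condition coverage: each sub-expression is seen both True and False.
-- Because (||) short-circuits, `total > 100` is only evaluated when
-- `isMember` is False, so it needs two inputs of its own.
conditionInputs :: [(Bool, Int)]
conditionInputs = [(True, 50), (False, 150), (False, 50)]

main :: IO ()
main = do
  print (map (uncurry discount) branchInputs)     -- [10,0]
  print (map (uncurry discount) conditionInputs)  -- [10,10,0]
```

Both input sets exercise every statement, which is why statement coverage alone would not reveal the gap.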
Where should we focus? Measured by statement coverage, decision coverage, and/or condition coverage, around 80-90% code coverage should suffice. Getting to 100% is unrealistic for most projects and doesn't by itself ensure quality, and the energy required to get there is usually wasted. The number we're looking for is somewhere above 80%.
We can use the above metrics to determine how well we're writing tests for our applications. For many algorithms, it's important to ensure that we have our edge cases covered, especially in safety-critical systems. Let's walk through a code coverage example in Haskell.
Code Coverage with Haskell Program Coverage (HPC)
The Haskell Program Coverage (HPC) tool ships with GHC, the Glasgow Haskell Compiler, and is used to record and display the parts of the code that were executed during a run of your program. With the criteria given above, we are able to record which functions, branches, and expressions, among other things, were evaluated.
The HPC tool is designed to give you the following metrics:
- Expressions used (Function coverage)
- Boolean coverage
- Guard coverage
- if conditions
- Qualifiers
- Alternatives used
- Local declarations used
- Top-level declarations used
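As a rough guide to how these categories line up with source code, here is a small sketch. The `describe` and `evens` functions are hypothetical, and the comments reflect my reading of the HPC categories rather than an authoritative mapping.

```haskell
module Main where

-- Top-level declaration.
describe :: Int -> String
describe n
  | n < 0     = "negative"   -- a guard, and one alternative
  | otherwise = sizeOf n     -- `otherwise` is a guard that is always True
  where
    -- Local declaration containing an 'if' condition.
    sizeOf :: Int -> String
    sizeOf k = if k > 9 then "large" else "small"

-- Top-level declaration with a boolean qualifier in a list comprehension.
evens :: [Int] -> [Int]
evens xs = [x | x <- xs, even x]

main :: IO ()
main = do
  mapM_ (putStrLn . describe) [-1, 5, 42]  -- negative, small, large
  print (evens [1 .. 10])                  -- [2,4,6,8,10]
```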
Let's walk through an example of how to use this tool to your advantage. In the previous post, I showed some QuickCheck code that doesn't reach 100% code coverage, which gives us a chance to see how to get closer to it. Let's look at the example again.
First, let's look at the implementation of the ROT13 algorithm again:
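-- The tests below import this module as Encryption.
module Encryption (rot13) where

import Data.Char

rot13 :: String -> String
rot13 =
  map mapRot
  where mapRot :: Char -> Char
        mapRot c | c >= 'A' && c <= 'Z' = rot 'A' c
                 | c >= 'a' && c <= 'z' = rot 'a' c
                 | otherwise            = c
        -- Shift c by 13 places within the 26 letters starting at base b.
        rot :: Char -> Char -> Char
        rot b c = chr $ (ord c - ord b + 13) `mod` 26 + ord b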
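The arithmetic in `rot` is easier to follow with a worked example. This sketch lifts the same formula to the top level, purely for illustration, and traces two characters by hand in the comments.

```haskell
module Main where

import Data.Char (chr, ord)

-- Same formula as `rot` above, lifted to the top level for illustration.
rot :: Char -> Char -> Char
rot b c = chr $ (ord c - ord b + 13) `mod` 26 + ord b

main :: IO ()
main = do
  -- 'H': ord 'H' - ord 'A' = 7; 7 + 13 = 20; 20 `mod` 26 = 20; 'A' + 20 = 'U'
  print (rot 'A' 'H')  -- 'U'
  -- 'u': ord 'u' - ord 'a' = 20; 20 + 13 = 33; 33 `mod` 26 = 7; 'a' + 7 = 'h'
  print (rot 'a' 'u')  -- 'h'
```

The `mod` 26 step is what wraps letters in the second half of the alphabet back around to the first half.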
Now, let's look at the QuickCheck property-based tests we use to check the correctness of our algorithm.
import Data.List
import Encryption
import Test.Framework
import Test.Framework.Providers.QuickCheck
import Test.QuickCheck

instance Arbitrary Char where
  arbitrary = elements (['A'..'Z'] ++ ['a'..'z'])

-- Equal
prop_rot13_equals s =
  rot13 s == rot13 s

-- Single is inequal to original
prop_rot13_single_notEquals s =
  rot13 s /= s

-- Double is equal to original
prop_rot13_double_equals s =
  (rot13 . rot13) s == s

-- Distribution shapes should be equal
prop_rot13_group_equals s =
  getDistro s == getDistro (rot13 s)
  where getDistro = sort . map length . group . sort

tests = [
  testGroup "ROT13 Tests" [
    testProperty "prop_rot13_equals" prop_rot13_equals,
    testProperty "prop_rot13_single_notEquals" prop_rot13_single_notEquals,
    testProperty "prop_rot13_double_equals" prop_rot13_double_equals,
    testProperty "prop_rot13_group_equals" prop_rot13_group_equals]
  ]

main = defaultMain tests
In order to capture the test coverage data from HPC, we need to add the -fhpc flag to the command line when compiling our tests.
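As a rough sketch of the whole workflow, the header comment below lists the commands against a tiny stand-alone program. The file name Coverage.hs, the output name coverage, and the `safeDiv` function are all made up for illustration. This is not the test module from this post, so substitute your own file names.

```haskell
-- Coverage.hs (hypothetical file name)
--
--   ghc -fhpc Coverage.hs -o coverage   compile with HPC instrumentation
--   ./coverage                          run it, which writes coverage.tix
--   hpc report coverage                 print the summary percentages
--   hpc markup coverage                 generate HTML drill-down pages
module Main where

-- Division that refuses to divide by zero.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

main :: IO ()
main = print (safeDiv 10 2)  -- the divide-by-zero equation is never exercised
```

Because `main` never divides by zero, the coverage report for this program should show the first equation of `safeDiv` as unused.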
After instrumenting the code, we run it in order to capture the results. You may have noticed that the compiler created a .hpc folder with a .mix file. When we run our code, we get the following results as usual.
ROT13 Tests:
  prop_rot13_equals: [OK, passed 100 tests]
  prop_rot13_single_notEquals: [OK, passed 100 tests]
  prop_rot13_double_equals: [OK, passed 100 tests]
  prop_rot13_group_equals: [OK, passed 100 tests]

         Properties  Total
 Passed  4           4
 Failed  0           0
 Total   4           4
You will also note that it created a .tix file which captures the actual code coverage metrics. Let's now analyze the results of our run:
97% expressions used (95/97)
33% boolean coverage (1/3)
33% guards (1/3), 1 always True, 1 unevaluated
100% 'if' conditions (0/0)
100% qualifiers (0/0)
66% alternatives used (2/3)
100% local declarations used (3/3)
100% top-level declarations used (8/8)
Analyzing the results, we realize we've made a mistake. If you look back at our Arbitrary Char instance, we're only generating alphabetic characters. The problem is that we never test the portion of our rot13 function that handles a character that isn't alphabetic. But when we change this, we have to be mindful that our tests will have to change as well. Why? Because the inequality check will fail if the string contains no letters, since rot13 leaves such a string unchanged. Let's make some changes and then check the results again.
  arbitrary = elements (['A'..'Z'] ++ ['a'..'z'] ++ "!@#$%^&*()")

-- Single is inequal to original
-- (isAlpha comes from Data.Char, which the test module must also import)
prop_rot13_single_notEquals s =
  any isAlpha s ==> rot13 s /= s
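To see why the precondition is needed, here is a minimal sketch that uses only the base library. The `guardedNotEquals` function is a hand-rolled stand-in for QuickCheck's `==>` and is an assumption made for illustration: it treats a failed precondition as vacuously True, whereas QuickCheck discards such test cases instead of counting them as passes.

```haskell
module Main where

import Data.Char (chr, isAlpha, ord)

-- Same algorithm as the post's rot13, repeated so the sketch is self-contained.
rot13 :: String -> String
rot13 = map mapRot
  where
    mapRot c
      | c >= 'A' && c <= 'Z' = rot 'A' c
      | c >= 'a' && c <= 'z' = rot 'a' c
      | otherwise            = c
    rot b c = chr $ (ord c - ord b + 13) `mod` 26 + ord b

-- The unguarded property from the first version of the tests.
notEquals :: String -> Bool
notEquals s = rot13 s /= s

-- Hypothetical stand-in for (==>): only check strings containing a letter.
guardedNotEquals :: String -> Bool
guardedNotEquals s = not (any isAlpha s) || notEquals s

main :: IO ()
main = do
  print (notEquals "!@#")         -- False: nothing to rotate, so the strings match
  print (guardedNotEquals "!@#")  -- True: precondition not met
  print (guardedNotEquals "a!")   -- True: "a!" becomes "n!"
```

One caveat: `isAlpha` is Unicode-aware while `rot13` only rotates ASCII letters, so the guard is only safe with a generator restricted to ASCII characters, as it is here.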
Now we can recompile our code once again as we did above and do the run once more.
100% expressions used (99/99)
66% boolean coverage (2/3)
66% guards (2/3), 1 always True
100% 'if' conditions (0/0)
100% qualifiers (0/0)
100% alternatives used (3/3)
100% local declarations used (3/3)
100% top-level declarations used (8/8)
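Much better! Every expression and every alternative in our ROT13 implementation is now exercised. The one guard still reported as "always True" is otherwise, which can never evaluate to False, so that is as far as boolean coverage can go here. We also have the ability to dig deeper into the analysis through the use of the markup command. This will generate web pages which contain drill-down information about our code metrics. Below is a sample screen shot of my final results of my last run.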
This tool is quite powerful for the code analysis we need to ensure that we're writing the right kind of tests for our specifications and implementations. Now, let's turn our attention to the F# world. What options do we have?
Code Coverage with TestDriven.NET and NCover
Once again, the TestDriven.NET add-in saves us in F# when it comes to code coverage. With the integration of NCover, we have the ability to perform rather rich analytics on our code, much like we did above using HPC. Let's take the code from the previous post and look at the relevant parts.
namespace CodeBetter.Samples

module EncryptionTests =
    open System
    open FsCheck
    open FsCheck.Generator
    open Xunit
    open Encryption
    open ListExtensions
    open FsCheckExtensions

    type CharGenerator =
        static member Chars =
            elements(['A'..'Z'] @
                     ['a'..'z'])
    overwriteGenerators (typeof<CharGenerator>)

    let prop_rot13_equals s =
        propl (rot13 s = rot13 s)
    [<Fact>]
    let test_prop_rot13_equals() =
        check config prop_rot13_equals

    let prop_rot13_double_equals s =
        propl ((rot13 >> rot13) s = s)
    [<Fact>]
    let test_prop_rot13_double_equals() =
        check config prop_rot13_double_equals

    let prop_rot13_single_notEquals s =
        propl (rot13 s <> s)
    [<Fact>]
    let test_prop_rot13_single_notEquals() =
        check config prop_rot13_single_notEquals

    let prop_rot13_group_equals s =
        let getDistro = ListExtensions.defaultSort >>
                        ListExtensions.group >>
                        List.map List.length >>
                        ListExtensions.defaultSort
        propl (getDistro s = getDistro (rot13 s))
    [<Fact>]
    let test_prop_rot13_group_equals() =
        check config prop_rot13_group_equals
In order to get the code metrics we need, simply right-click on the project and click Test With => Coverage. This will bring up NCover explorer. We can then browse our results to once again see our mistake.
Now that we realize our mistake of not including non-alphabetic characters, let's make two changes. First, let's remove the char generator because the default should suffice. Unlike the Haskell version, FsCheck comes with an arbitrary char instance already created. Second, let's ensure the success of the prop_rot13_single_notEquals function by requiring that the input contains at least one letter, such as the following:
    List.exists Char.IsLetter s ==>
        propl (rot13 s <> s)
With at least one letter present, the ROT13 transformation is guaranteed to change the string, so the two strings cannot be equal. We can now prove our success by once again running the Test With => Coverage option and viewing the results as below.
Conclusion
Tools such as NCover and the Haskell Program Coverage tool keep us honest when it comes to our tests, and give us a glaring reminder when we aren't. These tools, when combined with our traditional xUnit and property-based tests with saturation test generation, can make for a satisfying experience. We've now covered the creation and combination of traditional xUnit tests with property-based tests, and how to leverage code coverage as a tool for refining them. There is still more to be covered in this series, including refactoring.