Functional Programming Unit Testing - Part 4

Sunday, December 21, 2008

In our previous installment, we talked about bringing traditional xUnit tests and QuickCheck property-based tests together in a single cohesive step. For this installment, let's talk about test coverage.

Code Coverage

Code coverage is an important metric in our design process that describes the degree to which our source code has been tested. Code coverage tools inspect the code directly, as a form of white-box testing. I believe a high code coverage percentage is important, although a hard-line requirement of 100% path coverage is most often unnecessary and, frankly, evil. However, for some applications, such as safety-critical systems, some form of 100% coverage should be considered.

What do we consider as part of the criteria when we're calculating code coverage?

Function coverage: Has every function in the program been called?
Statement coverage: Has every statement in the program been executed?
Branch coverage: Has every control structure, such as if/then/else, evaluated to both true and false?
Condition coverage: Has every boolean sub-expression evaluated to both true and false?
Path coverage: Has every possible route through the program been taken?
Entry/exit coverage: Has every possible call to and return from each function been executed?

Of course, some of these are related to one another:

Branch coverage (also called decision coverage) includes statement coverage, since exercising every branch must lead to exercising every statement.
Path coverage includes branch coverage.
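The difference between branch coverage and condition coverage is easy to miss, so here is a minimal sketch of it. The helper names inUpperRange and describe are hypothetical and exist only for this illustration. The inputs 'M' and '!' are enough to make the whole decision both True and False, yet the right-hand sub-expression has still never been False; only a third input such as 'a' achieves that.

```haskell
-- Hypothetical helper for illustration only: one decision made of two conditions.
inUpperRange :: Char -> Bool
inUpperRange c = c >= 'A' && c <= 'Z'

-- Report the decision outcome and each sub-expression for one input.
-- (&&) short-circuits, so the right operand is only evaluated when the
-- left operand is True.
describe :: Char -> String
describe c =
  show c ++ ": decision=" ++ show (inUpperRange c)
         ++ ", left=" ++ show left
         ++ ", right=" ++ (if left then show right else "not evaluated")
  where
    left  = c >= 'A'
    right = c <= 'Z'

main :: IO ()
main = mapM_ (putStrLn . describe) "M!a"
```

A test suite that stops at 'M' and '!' would report full branch coverage for this function while leaving one condition only half exercised.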

Where should we focus? Using measures such as statement coverage, branch (decision) coverage, and condition coverage, around 80-90% coverage usually suffices. Getting to 100% is often unrealistic and doesn't by itself ensure quality, and the energy required to get there is frequently wasted. The number we're looking for is somewhere above 80%.

We can use the above metrics to determine how well we're writing tests for our applications. For many algorithms, it's important to ensure that we have our edge cases covered, especially in safety-critical systems. Let's walk through a code coverage example in Haskell.

Code Coverage with Haskell Program Coverage (HPC)

The Haskell Program Coverage (HPC) tool ships with GHC and records and displays the parts of the code that were executed during a run of your program. With the criteria given above, we can record which functions, branches, and expressions, among other things, were evaluated.

The HPC tool is designed to give you the following metrics:

Expressions used (Function coverage)
Boolean coverage
- Guard coverage
- if conditions
- Qualifiers
Alternatives used
Local declarations used
Top-level declarations used

Let's walk through an example of how to use this tool to your advantage. In the previous post, I showed some QuickCheck code that doesn't reach 100% code coverage, which lets me show you how to get closer to it. Let's look at the example again.

First, let's look at the implementation of the ROT13 algorithm again:

-- file Encryption.hs
module Encryption (rot13) where

import Data.Char

rot13 :: String -> String
rot13 = map mapRot
  where
    mapRot :: Char -> Char
    mapRot c | c >= 'A' && c <= 'Z' = rot 'A' c
             | c >= 'a' && c <= 'z' = rot 'a' c
             | otherwise            = c

    rot :: Char -> Char -> Char
    rot b c = chr $ (ord c - ord b + 13) `mod` 26 + ord b
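To make the arithmetic in rot concrete, here is a small worked sketch. It copies the rot helper from the listing above to the top level so it can be run on its own; the sample characters are my own choice.

```haskell
import Data.Char (chr, ord)

-- Copy of the article's rot helper, lifted to the top level for this demo.
-- b is the base letter ('A' or 'a'); c is the character to rotate.
rot :: Char -> Char -> Char
rot b c = chr $ (ord c - ord b + 13) `mod` 26 + ord b

main :: IO ()
main = do
  -- 'A' is offset 0 from 'A': (0 + 13) mod 26 = 13, which is 'N'.
  print (rot 'A' 'A')
  -- 'o' is offset 14 from 'a': (14 + 13) mod 26 = 1, which is 'b'.
  print (rot 'a' 'o')
  -- Rotating twice adds 26 in total, so we land back on the original letter.
  print (rot 'A' (rot 'A' 'W'))
```

The wrap-around in the second case is what the `mod` 26 is for, and the third case is the reason applying ROT13 twice returns the original string.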

Now, let's look at the QuickCheck property-based tests we use to check the correctness of our algorithm.

-- file EncryptionTests.hs
import Data.Char
import Data.List
import Encryption
import Test.Framework
import Test.Framework.Providers.QuickCheck
import Test.QuickCheck

-- Generate only alphabetic characters
instance Arbitrary Char where
  arbitrary   = elements (['A'..'Z'] ++ ['a'..'z'])

-- Applying rot13 to the same input gives the same result
prop_rot13_equals s =
  rot13 s == rot13 s

-- A single application is not equal to the original
prop_rot13_single_notEquals s =
  rot13 s /= s

-- A double application is equal to the original
prop_rot13_double_equals s =
  (rot13 . rot13) s == s

-- Distribution shapes should be equal
prop_rot13_group_equals s =
  getDistro s == getDistro (rot13 s)
  where getDistro = sort . map length . group . sort

-- The closing bracket is indented so it stays part of this declaration
tests = [
  testGroup "ROT13 Tests" [
    testProperty "prop_rot13_equals" prop_rot13_equals,
    testProperty "prop_rot13_single_notEquals" prop_rot13_single_notEquals,
    testProperty "prop_rot13_double_equals" prop_rot13_double_equals,
    testProperty "prop_rot13_group_equals" prop_rot13_group_equals]
  ]

main = defaultMain tests

To capture the test coverage data from HPC, we need to add the -fhpc flag when compiling our tests, like this:

>ghc -fhpc EncryptionTests.hs --make

After instrumenting the code, we then run our code in order to capture the results. You may have noticed that it created a .hpc folder with a .mix file. When we run our code, we get the following results as usual.

>EncryptionTests
ROT13 Tests:
  prop_rot13_equals: [OK, passed 100 tests]
  prop_rot13_single_notEquals: [OK, passed 100 tests]
  prop_rot13_double_equals: [OK, passed 100 tests]
  prop_rot13_group_equals: [OK, passed 100 tests]

         Properties  Total
Passed   4           4
Failed   0           0
Total    4           4

You will also note that it created a .tix file which captures the actual code coverage metrics. Let's now analyze the results of our run:

>hpc report encryptiontests
 97% expressions used (95/97)
 33% boolean coverage (1/3)
      33% guards (1/3), 1 always True, 1 unevaluated
     100% 'if' conditions (0/0)
     100% qualifiers (0/0)
 66% alternatives used (2/3)
100% local declarations used (3/3)
100% top-level declarations used (8/8)
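Judging only from the figures in the report above, each percentage looks like the ratio of used to total items rounded down, with an empty category (0/0) shown as 100%. That rounding rule is my inference from these numbers, not something taken from the HPC documentation, and the percent helper below is hypothetical. A minimal sketch that reproduces the figures:

```haskell
-- Hypothetical helper: integer percentage rounded down, with 0/0 treated as 100.
-- The rounding rule is an assumption inferred from the report figures.
percent :: Int -> Int -> Int
percent _    0     = 100
percent used total = used * 100 `div` total

main :: IO ()
main = mapM_ line
  [ ("expressions used",  95, 97)
  , ("guards",             1,  3)
  , ("'if' conditions",    0,  0)
  , ("alternatives used",  2,  3)
  ]
  where
    -- Print one row in roughly the same shape as the report.
    line (label, used, total) =
      putStrLn (show (percent used total) ++ "% " ++ label
                ++ " (" ++ show used ++ "/" ++ show total ++ ")")
```

Reading the report this way makes it clear that the raw counts in parentheses are the figures to watch: 95/97 means exactly two expressions were never evaluated.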

Analyzing the results, we realize we've made a mistake. If you look back at our Arbitrary Char instance, we're only generating alphabetic characters. The problem is that we never test the part of our rot13 function that handles a character that isn't alphabetic. But when we change this, we have to be mindful that our tests will have to change as well. Why? Because the inequality check cannot succeed for a string that contains no letters, since rot13 leaves such a string unchanged. Let's make some changes and then check the results again.

instance Arbitrary Char where
  arbitrary   = elements (['A'..'Z'] ++ ['a'..'z'] ++ "!@#$%^&*()" )

-- Single is inequal to original
prop_rot13_single_notEquals s =
  any isAlpha s ==> rot13 s /= s
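To see why the precondition matters, here is a sketch that uses only base rather than QuickCheck. The function singleNotEquals is a hypothetical plain-Bool stand-in for the guarded property (an implication written with ||), and the sample inputs are my own.

```haskell
import Data.Char (chr, isAlpha, ord)

-- Same algorithm as Encryption.rot13, repeated so this sketch is self-contained.
rot13 :: String -> String
rot13 = map mapRot
  where
    mapRot c | c >= 'A' && c <= 'Z' = rot 'A' c
             | c >= 'a' && c <= 'z' = rot 'a' c
             | otherwise            = c
    rot b c = chr $ (ord c - ord b + 13) `mod` 26 + ord b

-- Hypothetical Bool version of the guarded property: vacuously True when the
-- input has no letters. Note that isAlpha is Unicode-aware while rot13 only
-- rotates ASCII letters, so this assumes ASCII input as in the generator above.
singleNotEquals :: String -> Bool
singleNotEquals s = not (any isAlpha s) || rot13 s /= s

main :: IO ()
main = mapM_ check ["", "!@#", "abc", "a!Z"]
  where
    -- Show the unguarded and guarded outcomes side by side for one input.
    check s = putStrLn (show s ++ " -> unguarded: " ++ show (rot13 s /= s)
                               ++ ", guarded: " ++ show (singleNotEquals s))
```

The unguarded check is False for the empty string and for "!@#", while the guarded version holds for all four inputs.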

Now we can recompile our code once again as we did above and do the run once more.

>hpc report encryptiontests
100% expressions used (99/99)
 66% boolean coverage (2/3)
      66% guards (2/3), 1 always True
     100% 'if' conditions (0/0)
     100% qualifiers (0/0)
100% alternatives used (3/3)
100% local declarations used (3/3)
100% top-level declarations used (8/8)

Much better! Every expression and alternative in our ROT13 implementation is now covered. The one guard still reported is otherwise, which is always True by definition and so can never evaluate to False. We can also dig deeper into the analysis with the hpc markup command, which generates web pages containing drill-down information about our code metrics. Below is a screenshot of the final results of my last run.

[Screenshot: hpc_markup]

This tool is quite powerful for the code analysis we need to ensure that we're writing the right kind of tests for our specifications and implementations. Now, let's turn our attention to the F# world. What options do we have?

Code Coverage with TestDriven.NET and NCover

Once again, TestDriven.NET's support for F# saves us when it comes to code coverage. With the integration of NCover, we can perform rather rich analytics on our code, much as we did above using HPC. Let's take the code from the previous post and look at the relevant parts.

#light

namespace CodeBetter.Samples

module EncryptionTests =

  open System
  open FsCheck
  open FsCheck.Generator
  open Xunit
  open Encryption
  open ListExtensions
  open FsCheckExtensions

  type CharGenerator =
    static member Chars =
      elements(['A'..'Z'] @
               ['a'..'z'])

  overwriteGenerators (typeof<CharGenerator>)

  let prop_rot13_equals s =
    propl (rot13 s = rot13 s)

  [<Fact>]
  let test_prop_rot13_equals() =
    check config prop_rot13_equals

  let prop_rot13_double_equals s =
    propl ((rot13 >> rot13) s = s)

  [<Fact>]
  let test_prop_rot13_double_equals() =
    check config prop_rot13_double_equals

  let prop_rot13_single_notEquals s =
    propl (rot13 s <> s)

  [<Fact>]
  let test_prop_rot13_single_notEquals() =
    check config prop_rot13_single_notEquals

  let prop_rot13_group_equals s =
    let getDistro = ListExtensions.defaultSort >>
                    ListExtensions.group >>
                    List.map List.length >>
                    ListExtensions.defaultSort
    propl (getDistro s = getDistro (rot13 s))

  [<Fact>]
  let test_prop_rot13_group_equals() =
    check config prop_rot13_group_equals

In order to get the code metrics we need, simply right-click on the project and click Test With => Coverage. This will bring up NCover explorer. We can then browse our results to once again see our mistake.

[Screenshot: ncover_failed]

Now that we realize our mistake of not including non-alphabetic characters, let's make two changes. First, let's remove the char generator, because the default should suffice. Unlike the Haskell version, FsCheck comes with an arbitrary char instance already created. Second, let's make the prop_rot13_single_notEquals property hold by requiring that its input contain at least one letter, as follows:

  let prop_rot13_single_notEquals s =
    List.exists Char.IsLetter s ==>
      propl (rot13 s <> s)

With at least one letter in the input, the ROT13 transformation is guaranteed to produce a string that differs from the original. We can now confirm our success by running the Test With => Coverage option once more and viewing the results below.

[Screenshot: ncover_success]

Conclusion

Tools such as NCover and the Haskell Program Coverage tool keep us honest when it comes to tests, and give us a glaring reminder when we aren't. Combining these tools with our traditional xUnit tests and property-based tests with saturation test generation can be a satisfying experience. We've now covered the creation and combination of traditional xUnit tests with property-based tests, and how to leverage code coverage as a tool for refining them. There is still more to be covered in this series, including refactoring.