Showing posts with label Testing.

Tuesday, 14 January 2014

NOTE: Thoughts on Lineage

I got quite a lot of interested feedback on the BI Lineage post I made last week. My post highlighted a most informative article from Metacoda's Paul Homes.

Paul himself commented on my post and offered an additional tip. Here's what Paul said:
I agree it would be nice if BI developers could do their own scans without relying on unrestricted admins to do them ahead of time. This would be similar to how DI developers can do their own impact analysis for DI content in SAS Data Integration Studio. Ideally, as with DI, they could be done dynamically, without having to do a scan and have a BI Lineage custom repository to store them in.

In the meantime, one tip I'd suggest to make it easier for the BI developers, is that BI Lineage scans can also be scheduled. An unrestricted admin can schedule a scan, at a high level in the metadata tree, to be done every night for example.
A useful tip indeed. Thanks Paul.

Friday, 10 January 2014

NOTE: Wrap-Up on Test Coverage and MCOVERAGE

I've spent this week describing the functionality and purpose of the MCOVERAGE system option introduced in SAS V9.3. Coverage testing is an important consideration for your testing strategy - you need to know how much of your code has actually been tested.

As its name suggests, MCOVERAGE only logs macro coverage. It's a great shame that there isn't an equivalent for DATA steps. Perhaps it will be added in due course, to DATA steps or DS2, or both.

With some planning, and judicious use of some post-processing capability to make sense of the log(s), MCOVERAGE can be an important tool in your testing arsenal.

I note that HMS Analytical Software GmbH's testing tool (SASUnit) includes coverage testing through the use of MCOVERAGE. I've not used SASUnit myself, and I can't speak for how complete, reliable and supported it may be, but if you're interested in learning more I suggest you read the SASUnit: General overview and recent developments paper from the 2013 PhUSE conference and take a look at SASUnit's SourceForge pages.

What is your experience with using coverage testing and/or MCOVERAGE? Post a comment, I'd love to hear from you.

MCOVERAGE:

NOTE: Macros Newness in 9.4 and 9.3 (MCOVERAGE), 6-Jan-2014
NOTE: Macro Coverage in Testing (MCOVERAGE), 7-Jan-2014
NOTE: Making Sense of MCOVERAGE for Coverage Testing of Your Macros, 8-Jan-2014
NOTE: Expanding Our Use of MCOVERAGE for Coverage Analysis of our Macro Testing, 9-Jan-2014
NOTE: Wrap-Up on Test Coverage and MCOVERAGE, 10-Jan-2014 (this article!)

Thursday, 9 January 2014

NOTE: Expanding Our Use of MCOVERAGE for Coverage Analysis of our Macro Testing

Over the last few days I've been revealing the features and benefits of the MCOVERAGE system option introduced in SAS V9.3. This system option creates a log file to show which lines of our macro(s) were executed, e.g. during our tests.

Knowing whether we tested all lines of code, or only (say) 66% of them, is important when judging whether we have tested enough of our code to give sufficient confidence to put the new/updated system into production. This information relates back to our testing strategy (where we specified targets for the proportion of code lines tested). It also helps us spot dead lines of code, i.e. lines of code that will never be executed (perhaps due to redundant logic).

Yesterday I showed code to read an mcoverage log file and create a table to show which macro lines had been executed and which had not. My code was basic and only worked for one execution of the tested macro. Quite often we need to run our code more than once to test all branches through our logic, so today I'll discuss upgrading my mcoverage processing code so that it handles multiple executions of the tested macro.

We might start by running our tested macro twice, with two different parameter values...

filename MClog "~/mcoverage2.log";

options mcoverage mcoverageloc=MClog;

%fred(param=2);
%fred(param=1); /* Take a different path through the code */

filename MClog clear;
* BUT, see my note about closing MClog at 
  the end of my earlier blog post;

The mcoverage log produced from these two consecutive executions looks like this:


1 1 18 FRED
2 1 1 FRED
3 17 17 FRED
2 1 1 FRED
2 2 2 FRED
2 3 3 FRED
2 4 4 FRED
2 4 4 FRED
2 4 4 FRED
2 5 5 FRED
2 6 6 FRED
2 7 7 FRED
2 8 8 FRED
2 8 8 FRED
2 9 9 FRED
2 13 13 FRED
2 18 18 FRED
1 1 18 FRED
2 1 1 FRED
3 17 17 FRED
2 1 1 FRED
2 2 2 FRED
2 3 3 FRED
2 4 4 FRED
2 4 4 FRED
2 4 4 FRED
2 5 5 FRED
2 6 6 FRED
2 7 7 FRED
2 8 8 FRED
2 8 8 FRED
2 9 9 FRED
2 13 13 FRED
2 14 14 FRED
2 15 15 FRED
2 16 16 FRED
2 16 16 FRED
2 18 18 FRED

You will recall that type 1 records mark the beginning of execution of a macro, type 3 records indicate non-compiled lines (such as blank lines), and type 2 records indicate executed lines of code.

Note how we now get two type 1 records. These each mark the start of a new execution of the %fred macro. Close inspection of the type 2 records shows different sets of line numbers for the first and second executions, reflecting different paths through the %fred macro code.

We're aiming to create an output that shows whether the lines of %fred macro code were executed in one or more tests, or not. So, given that non-executed rows of macro code don't create a record in the mcoverage log, we can process the mcoverage log quite simply by counting the number of type 2 records for each line of macro code. For simplicity, we'll count the type 3s too. The output that we get will look like this:


Recordnum Record Rectype Executions Analysis
1 %macro fred(param=2); 2 4 Used
2   * comment ; 2 2 Used
3   %put hello world: &param; 2 2 Used
4   %if 1 eq 1 %then %put TRUE; 2 6 Used
5   %if 1 eq 1 %then 2 2 Used
6   %do; 2 2 Used
7     %put SO TRUE; 2 2 Used
8   %end; 2 4 Used
9   %if 1 eq 0 %then 2 2 Used
10   %do; . . NOT used!
11     %put FALSE; . . NOT used!
12   %end; . . NOT used!
13   %if &param eq 1 %then 2 2 Used
14   %do; 2 1 Used
15     %put FOUND ME; 2 1 Used
16   %end; 2 2 Used
17 3 2 Not compiled
18 %mend fred; 2 2 Used

So, we can see that executing the %fred macro with two different values for param has resulted in all but three lines of code being executed. We might choose to add additional tests in order to exercise the remaining lines, or a closer inspection might reveal that they are dead lines of code.

The code to create the above output is included at the end of this post. The sequence followed by the code is as follows:
  • Read the mcoverage log file into a data set. Process the data set in order to i) remove type 1 records, and ii) count the number of rows for each line of macro code
  • Read the macro source into a data set, adding a calculated column that contains a line numbering scheme that matches the scheme used by the mcoverage log. We are careful to preserve leading blanks in order to preserve indentation from the code
  • Join the two data sets and produce the final report. Use a monospace font for the code and be careful to preserve leading blanks for indentation
I'll wrap-up this series tomorrow with a summary of what we learned plus some hints and tips on additional features that could be added.

Here's the code:


/* This code will not cope reliably if the macro    */
/* source does not have a line beginning with the   */
/* %macro statement for the macro under inspection. */
/* This code expects a coverage log file from one   */
/* macro. It cannot cope reliably with log files    */
/* containing executions of more than one different */
/* macro.                                           */
/* Multiple different macros might be required if */
/* testing a suite of macros.                     */
filename MClog "~/mcoverage2.log"; /* The coverage log file (MCOVERAGELOC=) */
filename MacSrc "~/fred.sas";      /* The macro source  */

/* Go get the coverage file. Create macro */
/* var NUMLINES with number of lines      */
/* specified in (first) type 1 record.    */
data LogFile;
  length macname $32;
  keep Macname Start End Rectype;
  infile MClog;
  input Rectype start end macname $;
  prevmac = compress(lag(macname));

  if _n_ ne 1 and prevmac ne compress(macname) then
    put "ERR" "OR: Can only process one macro";

  if rectype eq 1 then
    call symputx('NUMLINES',end);

  if rectype ne 1 and start ne end then
    put "ERR" "OR: Not capable of handling START <> END";
run;

%put NUMLINES=&numlines;

/* Count the number of log records for each line of code. */
proc summary data=LogFile nway;
  where rectype ne 1;
  class start rectype;
  var start; /* Irrelevant choice because we only want N statistic */
  output out=LogFile_summed n=Executions;
run;

/* Go get macro source and add a line number value that */
/* starts at the %macro statement (because this is how  */
/* MCOVERAGE refers to lines).                          */
/* Restrict number of lines stored to the number we got */
/* from the coverage log file.                          */
/* NUMLINES includes the %mend line, so LastLine lands  */
/* on the line containing %mend, which we retain purely */
/* for aesthetic reasons in the final report.           */
data MacroSource;
  length Record $132;
  retain FoundStart 0 
    LastLine 
    Recordnum 0;
  keep record recordnum;
  infile MacSrc pad;
  input record $char132.; /* Keep leading blanks */

  if not FoundStart and upcase(left(record)) eq: '%MACRO' then
    do;
      FoundStart = 1;
      LastLine = _n_ + &NumLines - 1;
    end;

  if FoundStart then
    recordnum + 1;

  if FoundStart and _n_ le LastLine then
    OUTPUT;
run;

/* Bring it all together by marking each line of code */
/* with the record type from the coverage log.        */
proc sql;
  create table dilly as
    select  code.recordnum
      ,code.record
      ,log.rectype
      ,log.Executions
      ,case log.rectype
         when 2 then "Used"
         when 3 then "Not compiled"
         when . then "NOT used!"
         else "UNEXPECTED record type!!"
       end as Analysis
      from MacroSource code left join LogFile_summed log
        on code.recordnum eq log.start;
quit;

proc report data=dilly nowd;
  define record /display style(column)={fontfamily="courier" asis=on};
run;

filename MacSrc clear;
filename MClog clear;

*** end ***;
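
One feature that could be added with very little effort: since the final join gives us one row per line of code, an overall coverage percentage is only one step further. Here's a minimal sketch (my own addition, querying the dilly table created above, and excluding the non-compiled lines from the denominator):

proc sql;
  select count(*)               as Executable_lines
        ,sum(Analysis = 'Used') as Lines_executed
        ,calculated Lines_executed / calculated Executable_lines
           format=percent8.1    as Coverage
    from dilly
    where Analysis ne 'Not compiled';
quit;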

MCOVERAGE:

NOTE: Macros Newness in 9.4 and 9.3 (MCOVERAGE), 6-Jan-2014
NOTE: Macro Coverage in Testing (MCOVERAGE), 7-Jan-2014
NOTE: Making Sense of MCOVERAGE for Coverage Testing of Your Macros, 8-Jan-2014
NOTE: Expanding Our Use of MCOVERAGE for Coverage Analysis of our Macro Testing, 9-Jan-2014 (this article!)
NOTE: Wrap-Up on Test Coverage and MCOVERAGE, 10-Jan-2014

Wednesday, 8 January 2014

NOTE: Making Sense of MCOVERAGE for Coverage Testing of Your Macros

Over the last couple of days I've been uncovering the MCOVERAGE system option for coverage testing of macro code. Coverage testing shows which lines were executed by your tests (and which were not). Clearly, knowing the percentage of code lines that were executed by your test suite is an important measure of the thoroughness of your testing efforts.

Yesterday we saw what the mcoverage log contained for a typical execution of a macro. What we would like to do is make the information more presentable. That's what we'll do today. We'll produce some code that will output the following summary (from which we can determine that 33% of our code lines weren't executed by our test).

recordnum record rectype analysis
1 %macro fred(param=2); 2 Used
2 * comment ; 2 Used
3 %put hello world: &param; 2 Used
4 %if 1 eq 1 %then %put TRUE; 2 Used
5 %if 1 eq 1 %then 2 Used
6 %do; 2 Used
7 %put SO TRUE; 2 Used
8 %end; 2 Used
9 %if 1 eq 0 %then 2 Used
10 %do; . NOT used!
11 %put FALSE; . NOT used!
12 %end; . NOT used!
13 %if &param eq 1 %then 2 Used
14 %do; . NOT used!
15 %put FOUND ME; . NOT used!
16 %end; . NOT used!
17 3 Not compiled
18 %mend fred; 2 Used

To create this table, we need to read the mcoverage log and the macro source for %fred as follows:

  • We need to process the mcoverage log by reading it into a data set and i) removing record type 1 (because it has no part to play in the above table), and ii) removing duplicated log rows for the same code line (which happens when a line of code is executed more than once).
  • We need to process the macro source by reading it into a data set and adding a column to record the line number (matching the numbers used in the coverage log).
  • Having read both files into separate data sets (and processed them as outlined above), we can join them and produce our report. The code to achieve this is shown at the end of this post.

The code that I've created expects a coverage log file from one execution of one macro. It cannot cope reliably with log files containing either multiple executions of the same macro or executions of more than one different macro. Is this a problem? Well, multiple executions of the same macro might be required if testing various permutations of inputs (parameters and data); and multiple different macros might be required if testing a suite of macros.

Tomorrow I'll augment the code so that it can deal with multiple executions of the same macro, e.g. testing %fred with param=2 and param=1.

Meanwhile, here's today's code...

/* This code will not cope reliably if the macro    */
/* source does not have a line beginning with the   */
/* %macro statement for the macro under inspection. */

/* This code expects a coverage log file from ONE */
/* execution of ONE macro. It cannot cope         */
/* reliably with log files containing either      */
/* multiple executions of the same macro or       */
/* executions of more than one different macro.   */

filename MClog "~/mcoverage1.log"; /* The coverage log file (MCOVERAGELOC=) */
filename MacSrc "~/fred.sas";     /* The macro source  */

/* Go get the coverage file. Create macro */
/* var NUMLINES with number of lines      */
/* specified in type 1 record.            */
data LogFile;
  length macname $32;
  keep macname start end rectype;
  infile MClog;
  input rectype start end macname $;
  prevmac = lag(macname);

  if _n_ ne 1 and prevmac ne macname then
    put "ERR" "OR: Can only process one macro";

  if rectype eq 1 then
    call symputx('NUMLINES',end);

  if rectype ne 1 and start ne end then
    put "ERR" "OR: Not capable of handling START <> END";
run;

%put NUMLINES=&numlines;

/* Remove duplicates by sorting START with NODUPKEY. */
/* Hence we have no more than one data set row per   */
/* line of code.                                     */

  /* This assumes the log file did not contain different 
     RECTYPEs for the same start number */
  /* This assumes log file does not contain differing
     permutations of START and END */
proc sort data=LogFile out=LogFileProcessed NODUPKEY;
  where rectype ne 1;
  by start;
run;

/* Go get macro source and add a line number value that */
/* starts at the %macro statement (because this is how  */
/* MCOVERAGE refers to lines).                          */
/* Restrict number of lines stored to the number we got */
/* from the coverage log file.                          */
data MacroSource;
  length record $132;
  retain FoundStart 0 
    LastLine 
    recordnum 0;
  keep record recordnum;
  infile MacSrc pad;
  input record $132.;

  if not FoundStart and upcase(left(record)) eq: '%MACRO' then
    do;
      FoundStart = 1;
      LastLine = _n_ + &NumLines - 1;
    end;

  if FoundStart then
    recordnum + 1;

  if FoundStart and _n_ le LastLine then
    OUTPUT;
run;

/* Bring it all together by marking each line of code */
/* with the record type from the coverage log.        */
proc sql;
  select  code.recordnum
    ,code.record
    ,log.rectype
    ,case log.rectype
       when 2 then "Used"
       when 3 then "Not compiled"
       when . then "NOT used!"
       else "UNEXPECTED record type!!"
     end as analysis
  from MacroSource code left join LogFileProcessed log
    on code.recordnum eq log.start;
quit;

filename MacSrc clear;
filename MClog clear;

As an endnote, I should explain my personal/idiosyncratic coding style:
  • I want to be able to search the log and find "ERROR" only if errors have occurred. But if I code put "ERROR: message"; then I will always find "ERROR" when I search the log (because my source code will be echoed to the log). By coding put "ERR" "OR: message"; my code looks a little odd but I can be sure that "ERROR" gets written to the log only if an error has occurred

Tuesday, 7 January 2014

NOTE: Broadening Access to the BI Lineage Plug-In

Metacoda's Paul Homes recently wrote a most informative article entitled Providing User Access to the SAS BI Lineage Plug-in. As Paul says in his article, the BI Lineage plug-in can be used to do impact analysis for BI content (reports, information maps etc.) in a similar way that SAS Data Integration Studio provides impact analysis for DI content (jobs, tables, etc).

The plug-in was new with the November 2010 release of V9.2. Its results include lineage and reverse lineage, i.e. predecessor and successor objects.

Developers find this information useful for understanding the impact that changing an information map (for example) will have on reports and, conversely, for understanding which BI objects (such as information maps) will need to change in order to add a new column to a report. It helps capture the full scope of a proposed change and hence estimate the required effort more accurately.

Testers also find this information useful because it helps to give them a gauge of the amount of coverage their testing is achieving (this week's theme on NOTE:!).

Paul describes how to make lineage reports viewable by any authorised user, but he concludes that only a strictly limited set of users can create the reports, i.e. what SAS calls "unrestricted users". This is a shame because the functionality is of broad interest and value. Let's hope that SAS makes the creation of lineage reports more accessible in future. If you agree, hop over to the SASware ballot community and propose the enhancement. If you're unfamiliar with the ballot, read my overview from August 2012.

In addition, the ability to join lineage reports for BI and DI objects would provide the full provenance of data items. Now that's something I'd love to see!

NOTE: Macro Coverage in Testing (MCOVERAGE)

Yesterday I introduced the MCOVERAGE system option (introduced in V9.3) for capturing coverage of macro execution. This is useful in testing, to be sure you executed all lines of your macro. This may take more than one execution of your macro, with different input parameters and data.

I finished yesterday's post by showing the mcoverage log file created from the execution of a sample macro. I've listed all three files below. They are:
  1. The program that I ran
  2. The mcoverage log file
  3. The macro source for %fred (with line numbers added; the blank lines were intentional, to show how they are dealt with by MCOVERAGE)


filename MClog "~/mcoverage1.log";

options mcoverage mcoverageloc=MClog;

%fred(param=2);

filename MClog clear;

* BUT, see my note about closing MClog at
  the end of yesterday's blog post;

1 1 18 FRED
2 1 1 FRED
3 17 17 FRED
2 1 1 FRED
2 2 2 FRED
2 3 3 FRED
2 4 4 FRED
2 4 4 FRED
2 4 4 FRED
2 5 5 FRED
2 6 6 FRED
2 7 7 FRED
2 8 8 FRED
2 8 8 FRED
2 9 9 FRED
2 13 13 FRED
2 18 18 FRED

1.
2.  %macro fred(param=2);
3.    * comment ;
4.    %put hello world: &param;
5.    %if 1 eq 1 %then %put TRUE;
6.    %if 1 eq 1 %then 
7.    %do;
8.      %put SO TRUE;
9.    %end;
10.   %if 1 eq 0 %then 
11.   %do;
12.     %put FALSE;
13.   %end;
14.   %if &param eq 1 %then
15.   %do;
16.     %put FOUND ME;
17.   %end;
18.
19. %mend fred;
20.

The SAS 9.4 Macro Language: Reference manual tells us that the format of the coverage analysis data is a space delimited flat text file that contains three types of records. Field one of the log file contains the record type indicator. The record type indicator can be:
  • 1 = indicates the beginning of the execution of a macro. Record type 1 appears once for each invocation of a macro
  • 2 = indicates the lines of a macro that have executed. A single line of a macro might cause more than one record to be generated.
  • 3 = indicates which lines of the macro cannot be executed because no code was generated from them. These lines might be either commentary lines or lines that cause no macro code to be generated.
We can see examples of these in the listing shown above. The second and third fields contain the starting and ending record number, and the fourth field contains the name of the macro (you figured that out yourself, right?).

So, record type 1 from our log is telling us that %fred is 18 lines long; record type 3 is telling us that line 17 has no executable elements within it (because it's blank); and the record type 2 lines are telling us which code lines were executed. By implication, lines of code that were not executed don't feature in the mcoverage log. How do we interpret all of this?

The first thing to note is that the line numbers shown in the mcoverage log are relative to the %macro statement and hence don't align with our own line numbers (I deliberately included a blank first and last line in the fred.sas file in order to demonstrate this). The type 2 records show that all lines were executed by our test except 10-12 and 14-17 (these are numbered 11-13 and 15-18 above). Given the logic and the fact that we supplied param=2 when we executed the macro (see yesterday's post), this would seem understandable/correct.

However, surely we can write a quick bit of SAS code to do the brainwork for us and show which lines were executed and which weren't. Of course we can, and I'll show an example program to do this tomorrow...

MCOVERAGE:

NOTE: Macros Newness in 9.4 and 9.3 (MCOVERAGE), 6-Jan-2014
NOTE: Macro Coverage in Testing (MCOVERAGE), 7-Jan-2014 (this article!)
NOTE: Making Sense of MCOVERAGE for Coverage Testing of Your Macros, 8-Jan-2014
NOTE: Expanding Our Use of MCOVERAGE for Coverage Analysis of our Macro Testing, 9-Jan-2014
NOTE: Wrap-Up on Test Coverage and MCOVERAGE, 10-Jan-2014

Monday, 6 January 2014

NOTE: Macros Newness in 9.4 and 9.3 (MCOVERAGE)

The SAS macro language is almost as old as SAS itself (who knows exactly?) so you'd think the need to add new functionality would have ceased - particularly with the ability to access most DATA step functions through %sysfunc. But apparently not...

SAS V9.4 introduces a few new macro features, but not a huge number. The two that caught my eye were:
  1. The SYSDATASTEPPHASE automatic macro variable which offers an insight into the current running phase of the DATA step
  2. The READONLY option on %local and %global.
Not so long ago, SAS V9.3 introduced a raft of new automatic macro variables, macro functions, macro statements and macro system options.

When 9.3 was launched, one of the new system options caught my eye: MCOVERAGE. It claimed to offer coverage analysis for macros, i.e. highlighting which macro code lines were executed and which were not (particularly useful whilst testing your macros). When I wrote about the release of 9.3 I didn't have immediate access to it, the documentation offered little in the way of real-world explanation, and (I confess) I forgot to return to the topic when I later got access to a copy of 9.3.

Well, I was reminded of MCOVERAGE recently and I've spent a bit of time over Christmas figuring out how it works and what it offers in real terms (what is Christmas for if it's not for indulging yourself in things you love?). If you do a lot of macro coding then you'll be interested to know that MCOVERAGE offers plenty. Read on...

Consider this piece of code:

filename MClog "~/mcoverage1.log";

options mcoverage mcoverageloc=MClog;

%fred(param=2);

filename MClog clear;
The SAS log doesn't include any extra information, but we've created a new file named mcoverage1.log in our unix home directory (if you're on Windows, substitute "~/mcoverage1.log" with "C:\mcoverage1.log"). I'll describe what the %fred macro does later but, for now, let's just say it's a macro that we want to test. So, we've tested it (with param=2), it worked fine, but have we tested all of the lines of code, or did we only execute a sub-set of the whole macro? If we look into mcoverage1.log we can find the answer. It looks like this:

1 1 18 FRED
2 1 1 FRED
3 17 17 FRED
2 1 1 FRED
2 2 2 FRED
2 3 3 FRED
2 4 4 FRED
2 4 4 FRED
2 4 4 FRED
2 5 5 FRED
2 6 6 FRED
2 7 7 FRED
2 8 8 FRED
2 8 8 FRED
2 9 9 FRED
2 13 13 FRED
2 18 18 FRED
What does this mean? I'll explain tomorrow...

But before tomorrow, I must add one further piece of information. In order to see the mcoverage log, it needs to be closed by SAS. One does this by coding filename MClog clear;. However, I found that SAS refused to close the file because it was "in use". Even coding options nomcoverage; before closing it didn't help. In the end I resorted to running another (null) macro after setting nomcoverage. This did the trick, but if anybody can suggest how I can more easily free-up the mcoverage log I'd be very interested to hear. Here's the full code that I used:

%macro null;%mend null;

filename MClog "~/mcoverage1.log";

options mcoverage mcoverageloc=MClog;

%include "~/fred.sas";

%fred(param=2);

options nomcoverage mcoverageloc='.';

%null;

filename MClog clear;
MCOVERAGE:

NOTE: Macros Newness in 9.4 and 9.3 (MCOVERAGE), 6-Jan-2014 (this article!)
NOTE: Macro Coverage in Testing (MCOVERAGE), 7-Jan-2014
NOTE: Making Sense of MCOVERAGE for Coverage Testing of Your Macros, 8-Jan-2014
NOTE: Expanding Our Use of MCOVERAGE for Coverage Analysis of our Macro Testing, 9-Jan-2014
NOTE: Wrap-Up on Test Coverage and MCOVERAGE, 10-Jan-2014

Tuesday, 17 December 2013

Regression Tests, Holding Their Value

Last week I wrote about how our test cases should be considered an asset and added to an ever growing library of regression tests. I had a few correspondents ask how this could be the case when their test cases would only work with specific data; the specific data, they said, might not be available in the future because their system only held (say) six months of data.

It's a fair challenge. My answer is: design your test cases to be more robust. So, for instance, instead of choosing comparison data from a specific date (which might eventually get archived out of your data store), specify a relative date, e.g. instruct your tester to refer to data from the date preceding the execution of the test. Test cases have steps, with expected results for each step. Begin your test case by writing steps that instruct the tester to refer to the source/comparison data and to write down the values observed. In your subsequent steps, you can instruct your tester to refer to these values as expected results.
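
To make that concrete, here's a sketch of a relative-date step in SAS; the table and column names (warehouse.sales, load_date) are hypothetical:

/* Capture yesterday's figure at test execution time rather  */
/* than hard-coding a date that may later be archived away.  */
%let prevday = %sysfunc(intnx(day, %sysfunc(today()), -1), date9.);

proc sql noprint;
  select count(*) into :expected_rows trimmed
    from warehouse.sales              /* hypothetical source table */
    where load_date = "&prevday"d;
quit;

%put Expected row count for &prevday is &expected_rows;

Subsequent test steps can then compare their observed counts against &expected_rows instead of a value that was frozen when the test case was written.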

Other correspondents said that their tests were manual, i.e. using a user interface and clicking buttons, and hence couldn't be re-run because they were too time-consuming. In this case, I draw attention to my observations about a) deciding what degree of confidence the test exercise should engender in the modified system, and b) deciding what tests need be (re)run in order to provide the confidence. It's fine to choose not to re-run some of your regression tests, but be aware that you're making a decision that impacts the degree of confidence delivered by your tests. If sufficient confidence is delivered without rerunning the manual steps then all is good; if not, you need to revisit your decisions and get them back into balance. There's often no easy answer to this balancing act, but being open and honest about time/effort/cost versus confidence is important.

The longer term answer is to try to increase the number of automated tests and reduce those needing manual activity. But that's a topic for another day!

Wednesday, 11 December 2013

Test Cases, an Investment

It never ceases to frustrate and disappoint me when I hear people talking of test cases as use-once, throwaway artefacts. Any team worth its salt will be building a library of tests and will see that library as an asset and something worth investing in.

Any system change needs to be tested from two perspectives:
  1. Has our changed functionality taken effect? (incremental testing)
  2. Have we broken any existing functionality? (regression testing)
The former tends to be the main focus; the latter is often overlooked (it is assumed that nothing got broken). Worse still, since today's change will be different to tomorrow's (or next week's), there's a tendency to throw away today's incremental test cases. Yet, today's incremental test cases are tomorrow's regression test cases.

At one extreme, such as when building software for passenger jet aircraft, we might adopt the following strategy:
  • When introducing a system, write and execute test cases for all testable elements
  • When we introduce a new function, we should write test cases for the new function, we should run those new test cases to make sure the new function works, and we should re-run all the previous test cases to make sure we didn't break anything (they should all work perfectly because nothing else changed, right?)
  • When we update existing functionality, we should update the existing test cases for the updated function, we should run those updated test cases to make sure the updated function works, and we should re-run all the previous test cases to make sure we didn't break anything (again, they should all work perfectly because nothing else changed)
Now, if we're not building software for passenger jets, we need to take a more pragmatic, risk-based approach. Testing is not about creating guarantees, it's about establishing sufficient confidence in our software product. We only need to do sufficient amounts of testing to establish the desired degree of confidence. So there are two relatively subjective decisions to be made:
  1. How much confidence do we need?
  2. How many tests (and what type) do we need to establish the desired degree of confidence?
Wherever we draw the line of "sufficient confidence", our second decision ought to conclude that we need to run a mixture of incremental tests and regression tests. And, rather than writing fresh regression tests every time, we should be calling upon our library of past incremental tests and re-running them. And the bottom line here is that today's incremental tests are tomorrow's regression tests - they should work (unedited and without modification) because no other part of the system has changed.

Every one of our test cases is an investment, not an ephemeral object. If we're investing in test cases and managing our technical debt, then we are on the way to having a responsibly managed development team!

Monday, 25 February 2013

Giving Focus to Peer Reviews

I'm a keen advocate for peer reviews. I've written about them before (here and here) but there's always more to say.

Peer reviews must always be treated as a constructive exercise. The style of questions can play a big part in the atmosphere and tone of the exercise.

There are many ways you can judge the quality of code. It's easy for developers to put too much time and effort in the wrong places, building things no one would use to begin with. So, aside from the basic questions of the code meeting design and coding guidelines, I ask myself the following questions:

a. How easy will it be to add new features?
b. How easy will it be to change existing features?
c. How easy will it be for a new team member to become productive?

I look for a good balance between these potentially contradictory questions. Using a complex architecture might help with (a) and sometimes (b) but will probably hurt (c), so there is an interesting compromise to be delivered.

These questions ensure that the review takes account of the productivity of the team (and company) in addition to the regular technical factors.

Monday, 21 January 2013

NOTE: I'll be Busy at SAS Global Forum! #sasgf13

I was very pleased to be invited to present a paper at this year's SAS Global Forum in San Francisco in April/May. To then have my contributed paper accepted too was icing on the cake. I don't yet know the dates and times when my two papers will be on the agenda, but it looks like I'll be busy this year.

Firstly, I was honoured to be invited to present "Visual Techniques for Problem Solving and Debugging" in the Reporting and Information Visualisation stream.
Abstract: No matter how well we plan, issues and bugs inevitably occur. Some are easily solved, but others are far more difficult and complex. This paper presents a range of largely visual techniques for understanding, investigating, solving, and monitoring your most difficult problems. Whether you have an intractable SAS coding bug or a repeatedly failing SAS server, this paper offers practical advice and concrete steps to get you to the bottom of the problem. Tools and techniques discussed in this paper include Ishikawa (fishbone) diagrams, sequence diagrams, tabular matrices, and mind maps.
And I had already submitted "Automated Testing of Your SAS Code and Collation of Results (Using Hash Tables)" into the Applications Development stream. It was subsequently accepted.
Abstract: Testing is an undeniably important part of the development process, but its multiple phases and approaches can be under-valued. I describe some of the principles I apply to the testing phases of my projects and then show some useful macros that I have developed to aid the re-use of tests and to collate their results automatically. Tests should be used time and again for regression testing. The collation of the results hinges on the use of hash tables, and the paper gives detail on the coding techniques employed. The small macro suite can be used for testing of SAS code written in a variety of tools including SAS Enterprise Guide, SAS Data Integration Studio, and the traditional SAS Display Manager Environment.
So, if you're attending SAS Global Forum this year, please stop by one or both of my papers, and be sure to say "hi"!

Wednesday, 17 October 2012

Mutation Testing

I've published a number of articles on testing in the past, and I thought I had a decent awareness and knowledge of the subject. But, as they say, every day is a learning day and I came across a new testing technique recently: Mutation Testing.

I've not yet tried this technique myself, but it certainly seems to offer benefits, and it's a useful extra consideration when you're creating your testing strategy for your project. Mutation Testing is, in essence, a test of your tests, and so it is not of value in all cases.

In a nutshell, mutation testing involves creating multiple copies of your code, introducing small changes into each copy - with the deliberate intention of breaking the code in some small way - and then running your tests. If your suite of tests is of sufficient quality then each mutant copy of your code will fail at least one of your tests. If not then you need to enhance your tests so that all mutants fail at least one test.

The types of mutations can vary but they typically include 1) negation of logic operators, 2) setting values to zero, and 3) use of wrong variable names. The general nature of the mutations is to emulate common programming errors.
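
To make the first of those concrete, here's a small, invented SAS illustration; %is_adult and its mutant are mine, not taken from any mutation testing tool:

/* Unit under test */
%macro is_adult(age);
  %if %eval(&age >= 18) %then 1;
  %else 0;
%mend is_adult;

/* Mutant: the comparison operator has been negated */
%macro is_adult_mutant(age);
  %if %eval(&age < 18) %then 1;
  %else 0;
%mend is_adult_mutant;

/* A test that kills the mutant: substituting %is_adult_mutant */
/* below flips the outcome from PASS to FAIL.                  */
data _null_;
  if %is_adult(30) eq 1 then
    put "TESTING: is_adult(30), OUTCOME=PASS";
  else
    put "TESTING: is_adult(30), OUTCOME=FAIL";
run;

If every mutant fails at least one test in this fashion, the suite has demonstrated its sensitivity to that class of error.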

Wikipedia tells us that "Mutation testing was originally proposed by Richard Lipton as a student in 1971,[2] and first developed and published by DeMillo, Lipton and Sayward. The first implementation of a mutation testing tool was by Timothy Budd as part of his PhD work (titled Mutation Analysis) in 1980 from Yale University."

As I said earlier, Mutation Testing is not of benefit in all cases. The exercise of testing is about engendering confidence, not offering cast-iron guarantees. As a technique to engender greater confidence, Mutation Testing is certainly of value. However, not all projects will require the degree of confidence that Mutation Testing brings. For some projects, the cost versus confidence balance will be tipped before Mutation Testing becomes appropriate.

Nonetheless, for those projects where a high degree of confidence is required, Mutation Testing certainly has a role to play.

Have you used Mutation Testing? Please let me know (through a comment) if you have; I'm keen to hear some experiences (good or bad).

Monday, 20 August 2012

NOTE: BI Testing

I recently spotted a valuable posting from SAS's Angela Hall on her Real BI for Real Users blog. I hesitate to suggest it was valuable because that implies her other posts are of less value! Not true, but this particular article was titled Testing recommendations for SAS BI Dashboard & SAS Web Report Studio and, with its structure following the data flow of a BI report, it offered a comprehensive check-list and set of suggestions for end-to-end testing of your BI reports.

Testing delivers confidence not guarantees, and Angela's posting offers advice on how to decide what amounts of testing are appropriate for you.

SAS technology is easy to work with and provides quick and efficient tools for getting your results quickly, but it cannot prevent its users from making mistakes. Beware of using the ease and efficiency of SAS technology to produce the wrong results quickly. As Angela says in her posting, "the most important thing is to do SOME testing of your reports"; that has to be the most valuable piece of advice in the posting!

Tuesday, 10 January 2012

NOTE: Keeping Track of Defects

I recently stumbled across an article on David Biesack's Peer Revue blog regarding SAS's defect tracking process. David provides a detailed and extensive insight into how the guys and girls in Cary (and beyond) deal with defects, from beginning to end.

If you've ever raised a Track that was acknowledged as a bug, this article tells you what went on in the background prior to you getting a hotfix or a new version of SAS code. It's interesting reading, plus there's plenty of good practice that we can all use in our own development efforts. David highlights a number of things:
  • Effective tracking of defects (through their whole lifecycle)
  • The use of a source management system to keep track of versions and releases of code
  • Adoption of the Open/Closed Principle whereby code should be open for extension, but closed for modification. Expressed another way: when a single change to a program results in a cascade of changes to dependent modules (e.g. other macros or DI Studio jobs), that program exhibits the undesirable attributes that we have come to associate with “bad” design. The program becomes fragile, rigid, unpredictable and unreusable. The open/closed principle attacks this in a very straightforward way. It says that you should design modules that never change. When requirements change, you extend the behavior of such modules by adding new code, not by changing old code that already works. I wrote about this in NOTE: Coupling, Bad back in May of this year (see the sketch below).
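
To illustrate the open/closed principle in SAS macro terms (a sketch of my own, not from David's article): the first driver below must be edited, and therefore re-tested, every time a report type is added; the second is extended simply by defining a new macro that follows the naming convention.

/* Closed for extension: adding a report type means editing     */
/* (and re-testing) this driver.                                */
%macro run_report(type);
  %if %upcase(&type) eq SALES %then %sales_report;
  %else %if %upcase(&type) eq STOCK %then %stock_report;
%mend run_report;

/* Open for extension: a new report type is just a new macro    */
/* named <type>_report; this driver never changes. (Assumes     */
/* %sales_report, %stock_report etc. are defined elsewhere.)    */
%macro run_report2(type);
  %&type._report
%mend run_report2;
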
David's article mentions the SAS Quality Imperative - SAS's commitment to producing quality software and services. This is an informative white paper and well worth a read. David's link is broken but you can find the paper here.

Tuesday, 13 December 2011

Testing - Peer Reviews

In my recent series on testing I concentrated on dynamic testing, i.e. executing some or all of your code to see if it works. The other type of testing is static testing. This was featured briefly in my "SAS Software Development With The V-Model" paper at this year's SAS Global Forum (SGF).

The principal form of static testing is peer review. Like all tests and reviews, you have to have something to test or review against. For testing you will have a set of requirements or a set of specifications (ref: the V-Model). For peer review, a fellow programmer will take a look at your code and offer opinion on whether it meets the team’s coding guidelines and other standards (and whether it gives the impression of meeting the unit specification). Thus, to perform an effective peer review, you need some documented standards against which the review can be set.

I wrote a NOTE: post on peer review back in 2009, so I won't repeat those thoughts here; however, I did see a good post from Tricia on the BI Notes blog back in October. Tricia used the post to run through a number of potential pitfalls for those who are new to the process. In the article, Tricia offered valuable tips on how to keep the review focused & at the right level, and how to be properly prepared.

Done properly, peer reviews offer increased code quality, reduced cost of rework, increased amounts of shared knowledge of your applications, and increased sharing of code and design good practice.

Thursday, 8 December 2011

NOTE: Hash Tables, An Introduction

In my recent series of articles on testing I used the hash table to provide a means to write/update a table from within a DATA step without demanding changes to the DATA statement, etc. I've had some very kind feedback on the series, plus some requests to explain what the hash table does and how it works. So here goes...

The hash table was introduced to the DATA step as a means of performing fast, in-memory lookups. The fact that the hash table is a "component object" means that there's a steeper than usual learning curve for this feature. The SAS Language Reference: Concepts manual provides an introductory section on the hash table, but for many DATA step coders the hash table remains unused. A better route to getting started with hash tables is the "Getting Started with the DATA Step Hash Object" paper in the DATA Step sub-section of R&D's SAS/BASE section of the SAS Support web site (follow the link to Component Objects and Dot Notation).

In a nutshell, the hash table is a lookup table that's stored (temporarily) in memory and allows you to search for values within it and thereby get associated values returned. Let's introduce ourselves to the hash table by taking a two step approach: firstly we'll create the hash table, secondly we'll use it for lookups against each of the rows in our input table. Our DATA Step will look like this:

data result;
  set input;
  if _n_ eq 1 then
  do; /* It's first time through, so let's create hash table */

    <create the hash table>
  end;
  /* For each row in input table, do a lookup in the hash table */

  <do a lookup>
run;


Let's make ourselves some test data and assume it contains the sales from our small car sales operation last week:

data SalesThisWeek;
  length make $13 model $40;
  infile cards dsd;
  input make model;
  cards;
Jaguar,XJR 4dr
Land Rover,Discovery SE
Jaguar,XKR convertible 2dr
;
run;


We have a price list for all of the cars we sell; it's in sashelp.cars and contains variables named make, model and invoice. Frustratingly, the MODEL column contains leading blanks, so we use a quick DATA Step to get rid of them, thereby creating work.cars.

data work.cars; set sashelp.cars; model=left(model); run;

We want to load the price list into a hash table, then lookup each of our sold cars to find its invoice value. Here's the code to <create the hash table>:

  DECLARE HASH notehash (DATASET:'work.cars');
  notehash.DEFINEKEY('make','model');
  notehash.DEFINEDATA('invoice');
  notehash.DEFINEDONE();


Whoa! That code looks like no other SAS code we've ever seen!! That's because the hash table is a "component object" and the syntax for dealing with component objects differs from mainstream DATA step syntax. It's called "dot notation". It quickly makes sense once you get over the fact that it's different.

The first line tells SAS that we want to create a new hash table. Hash tables only exist temporarily in memory for the duration of the DATA Step. We use the DECLARE statement to begin to create a new component object; the first parameter (HASH) says what kind of component object we want to create; the second parameter (notehash) is an arbitrary name that we have chosen for the hash table; within the brackets we have told SAS that we're going to use some of the columns of the work.cars table as our lookup table.

The following two lines tell SAS a bit more about how we'd like to load and use the hash table; the fourth line (with DEFINEDONE) tells SAS we've nothing more to tell it about the hash table.

When we use dot notation we type i) the name of a component object, ii) an action we want to perform on the object, and optionally iii) parameters for the action. Parts (i) and (ii) are separated by dots, and the parameters (iii) are enclosed in brackets.

When we create a hash table, we have to declare it, then we have to specify a) the key column(s), i.e. the column(s) that we'll use to find something in the hash table, and b) the data column(s), i.e. the column(s) whose values will be returned once the key values are found in the hash table. In our case, MAKE and MODEL are our key columns, and INVOICE is our data column.

After specifying our key and data columns (with the DEFINEKEY and DEFINEDATA actions) we tell SAS that we're done by performing the DEFINEDONE action on the hash table.

The dot notation is different to what we're used to, but it's not too tricky to get your head around.

Now that we've created our hash table in memory, for use during the DATA Step, all we need to do now is use it. We lookup things in the table by performing the FIND action on the hash table. If SAS finds the key value(s) in the hash table, it will automatically put the associated data value(s) into the data variable(s) in the DATA Step. So, in our case, we need a variable in the DATA Step named INVOICE. If we don't create that variable prior to using the hash table we'll get an error.

When we do a FIND, we're given a return code value that tells us whether SAS found the key(s) in the hash table. A return code value of zero tells us that all is well and the value was found; any other value tells us that SAS did not find the value. So, our code to <do a lookup> will look like this:

  length invoice 8;
  rc = notehash.FIND();
  if rc ne 0 then
    put "Oh dear, we sold something we can't identify";


Notice that there's no direct reference to INVOICE when we do the find. The fact that FIND will put a value into INVOICE is implicit from our preceding DEFINEDATA.

Put all together, our code looks like this:

/* Create our test data */
data SalesThisWeek;
  length make $13 model $40;
  infile cards dsd;
  input make model;
  put make $quote22. model $quote50.;
cards;
Jaguar,XJR 4dr
Land Rover,Discovery SE
Jaguar,XKR convertible 2dr
;
run;

/* Strip leading blanks from MODEL */
data work.cars; set sashelp.cars; model=left(model); run;

/* Add invoice values to sales by using lookup */
data result;
  set SalesThisWeek;
  keep make model invoice;
  if _n_ eq 1 then
  do; /* It's first time through, so let's create hash table */
    DECLARE HASH notehash (dataset:'work.cars');
    notehash.DEFINEKEY('make','model');
    notehash.DEFINEDATA('invoice');
    notehash.DEFINEDONE();
  end;
  /* For each row in input table, do a lookup in the hash table */
  length invoice 8;
  rc = notehash.FIND();
  if rc ne 0 then
    put "Oh dear, we sold something we can't identify";
run;


Once you've got the basic hang of hash tables, the two best sources of reference information are:

a) The hash table tip sheet, available from R&D's SAS/BASE section of the SAS Support web site (see the link to the tip sheet over on the right hand side of the page)

b) Component Objects: Reference in the SAS Programmer's Bookshelf

There are many ways to perform lookups in SAS. Some examples would be i) formats, ii) the KEY= parameter of the SET statement, iii) table joins. The hash table is another option which can offer advantages in many cases. Have fun...
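
For comparison, here's a sketch of the same lookup using option i), a format built from the price list with CNTLIN=. A format maps a single key, so a fully equivalent composite MAKE+MODEL key would need the two columns concatenated first; I've used MODEL alone to keep the sketch short:

/* Build a character format that maps MODEL to INVOICE */
data cntlin;
  set work.cars(keep=model invoice);
  retain fmtname '$carinv';
  start = model;
  label = put(invoice, 8.);
run;

/* Formats cannot contain duplicate start values */
proc sort data=cntlin nodupkey;
  by start;
run;

proc format cntlin=cntlin;
run;

/* Unmatched models fall through as unformatted text and yield */
/* a missing INVOICE (?? suppresses the invalid-data note).    */
data result2;
  set SalesThisWeek;
  invoice = input(put(model, $carinv.), ?? 8.);
run;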

Tuesday, 6 December 2011

NOTE: Testing Macros - Parameters Revisited

As my planned series on testing drew to a close last week, I got an email from Quentin McMullen with some very kind words about the NOTE: blog, but also some very erudite comments about my choice of parameters for my testing macros. Rather than paraphrase Quentin's comments, I decided to publish his email verbatim (with his permission). Here's the heart of Quentin's email, followed by a few brief comments from me.
Just a quick thought:

I have a similar macro to %assert_condition, but it only has one (main) parameter, &CONDITION, instead of three: &LEFT, &OPERATOR, and &RIGHT.  So it looks like:

%macro assert_condition(condition,tag=);
 if &CONDITION then
   put "TESTING: &sysmacroname: TAG=&tag, OUTCOME=PASS";
 else
   put "TESTING: &sysmacroname: TAG=&tag, OUTCOME=FAIL";
%mend assert_condition;


So you can call it like:

%assert_condition(incount eq outcount)
or
%assert_condition (age > 0)
or
%assert_condition ( (incount=outcount) )

I tend to like the one parameter approach.

The only tricky part is if you have an equals sign in the condition, you have to put parentheses around the condition so the macro processor does not interpret the left side as a keyword parameter.  The nifty thing is that the parentheses also mask any commas, e.g.:

%assert_condition(gender IN ("M","F") )

Do you see benefits to the 3 parameter approach vs 1 parameter?
Yes, Quentin, I do very much see the benefits of your approach. Your example, using the IN operator, is particularly well chosen. Rest assured I'll be adopting the McMullen approach in future. Thanks for the comments. Keep reading NOTE:!

Thursday, 1 December 2011

NOTE: Testing (Presenting the Results)

The preceding two articles in this series on testing presented a simple, generic macro for testing and recording test results. All that remains now is for us to tidy-up some loose ends.

Firstly, the macro assumes data set work.results already exists. And it also assumes that the data set contains appropriate variables named Tag and Result. We can quickly arrange that by being sure to include a call to the following macro in our testing code:

%macro assert_init(resultdata=work.results);
  data &resultdata;
    length Tag $32 Result $4;
    stop;
  run; 
%mend assert_init;

Finally, we want to present our results. We can do this easily with a simple call to PROC REPORT:

%macro assert_term(resultdata=work.results);
  title "Test Results";
  proc report data=&resultdata;
    columns tag result;
    define tag / order; 
  run; 
%mend assert_term;

Equipped thus, we can focus on our testing code, not the mechanics of collating and presenting results. For example, let's imagine we have some new code to test; the purpose of the code is to read a raw file (a.txt), create some computed columns, and write-out a SAS data set (perm.a). One of our tests is to check that the number of rows in the raw file matches the number of rows in the SAS data set. Here's our code to test this functionality:

%assert_init;
%include "code_to_be_tested.sas";
%assert_EqualRowCount(infile=a.txt,outdata=perm.a,tag=T01-1);
%assert_term;


We can make the results a tad more visual by colourising the pass/fail values:

%macro assert_term(resultdata=work.results);
  proc format;
    value $bkres 'PASS'='Lime'
                 'FAIL'='Red';
  run;

  title "Test Results";
  proc report data=&resultdata;
    columns tag result;
    define tag / order;
    define result / style(column)={background=$bkres.};
  run;
%mend assert_term;


This assumes you're using SAS Enterprise Guide. If not, you'll need to add some appropriate ODS statements around the PROC REPORT.

The downside of the macros as they stand at this point is that the results data set gets recreated every time we run the code. Maybe we don't want that because we want to collate test results from a number of separate bits of test code. So, finally, we can make the creation of the results data set conditional, i.e. if it doesn't exist we'll create it; if it already exists then we'll leave it alone:

%macro assert_init(resultdata=work.results);
  %if not %sysfunc(exist(&resultdata)) %then
  %do;
    data &resultdata;
      length Tag $32 Result $4;
      stop;
    run;
  %end;
%mend assert_init;
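
As a final aside, %assert_EqualRowCount (called in the example above) was built in an earlier post in this series. If you don't have that post to hand, a macro along these lines would do the job; this is my reconstruction layered on top of %assert_condition, not the original:

%macro assert_EqualRowCount(infile=, outdata=, tag=);
  /* Count the records in the raw file */
  data _null_;
    infile "&infile" end=eof;
    input;
    if eof then call symputx('raw_count', _n_);
  run;

  /* Count the observations in the data set */
  %local dsid ds_count rc;
  %let dsid = %sysfunc(open(&outdata));
  %let ds_count = %sysfunc(attrn(&dsid, nlobs));
  %let rc = %sysfunc(close(&dsid));

  /* Delegate the comparison, and the recording of the result, */
  /* to %assert_condition (%assert_init must have run first).  */
  data _null_;
    %assert_condition(&raw_count, eq, &ds_count, tag=&tag)
  run;
%mend assert_EqualRowCount;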

Tuesday, 29 November 2011

NOTE: Testing (Collating the Results)

We began this series of posts on testing with an introduction to testing disciplines, and I followed that with a description of how we could quickly create some small macros that would allow us to automate some of our testing and present the results in a consistent fashion in the SAS log. However, putting the results into a SAS data set rather than the log was the next step to improve our efforts.

We can use the tag as a unique key for the test results data set so that the data set has the most recent test result for each and every test. We can re-run individual tests and have an individual row in the results data set updated.

To give us the greatest flexibility to add more macros to our test suite, we don't want the process of writing results to a data set to interfere with activities that are external to the macro. So, using a SET statement, for example, would require the data set to be named in the DATA statement. This seems a good opportunity to use the OUTPUT method for a hash table. We can load the results data set into the hash table, use the TAG as the key for the table, and add/update a row with the result before outputting the hash table as the updated results data set. Here's the code:

%macro assert_condition(left,operator,right,tag=
                       ,resultdata=work.results);
  /* Load results into hash table */
  length Tag $32 Result $4;
  declare hash hrslt(dataset:"&resultdata");
  rc = hrslt.defineKey('TAG');
  rc = hrslt.defineData('TAG','RESULT');
  rc = hrslt.defineDone();
  /* Update the hash table */
  tag = "&tag";
  if &left &operator &right then
    result="PASS";
  else
    result="FAIL";
  rc=hrslt.replace(); /* Add/update */
  /* Write back the results data set */
  rc = hrslt.output(dataset:"&resultdata");
  rc = hrslt.delete();
%mend assert_condition;


By adding the maintenance of the results data set to our basic assert macro, the functionality gets inherited by any higher-level macro (such as yesterday's %assert_EqualRowCount).

Clearly, the new macro won't work if the results data set doesn't already exist, and we'd like to present the results in a format better than a plain data set. We'll cover that in the next post.

NOTE: Testing is Like Visiting the Dentist?

In a comment in response to my recent Testing - Discipline article, Rick@SAS provided a link to an article he'd posted on the American Statistical Association community forum wherein he drew an analogy between testing and going to the dentist. I think it's a very well drawn analogy. Thanks Rick.