Thursday 17 September 2009

NOTE: Be Of Good Type (Revisited)

I love it when one discovery leads to another. In my previous blog entry I highlighted SAS V9.2's new NESTED argument for the DATA statement. Given that it's a new argument, I wouldn't expect it to work in previous versions of SAS, but it's always worth trying these things because often they were available in older versions of the software, albeit undocumented and unsupported. What did I discover when I tried NESTED in V9.1.3? It didn't work, but the error message told me of other DATA statement arguments I'd never come across before!

ERROR 22-322: Syntax error, expecting one of the following: BUFFERED, MISSOPT, NOMISSOPT, NONOTE2ERR, NOPASSTHRU, NOPMML, NOTE2ERR, PASSTHRU, PGM, PMML, UNBUFFERED, VIEW.

Whilst I recognised VIEW and a couple of others, I had to look up the rest in the SAS documentation. I didn't find most of them there, so I used Google. The most intriguing results were for (NO)NOTE2ERR.

Back in issue 7 ("SAS With Style") and issue 8 ("Be Of Good Type") of our NOTE: newsletter we spoke of good programming practice, namely the proper and consistent usage of character and numeric variables and values. We offered the following code as a typical example of sloppy programming that wouldn't be accepted by most other program compilers/interpreters:

14 data _null_;
15 x = 1;
16 y = 'Day#' !! x;
17 z = 'Day#' !! left(x);
18 put x= y= z=;
19 run;

NOTE: Numeric values have been converted to character values at the places given by:
(Line):(Column).
16:17 17:22

x=1 y=Day# 1 z=Day#1


In the code shown above, X is numeric yet it's being used as an operand of the character-only concatenation operator (!!) and as a parameter of the character-only LEFT function. Most compilers/interpreters would refuse to allow this type mismatching, but SAS tries to be helpful, assumes you know what you're doing and does an automatic type conversion for you. This is sloppy programming and many SAS shops won't allow code to be shipped if it includes this kind of coding.
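For comparison, here is a minimal sketch of the same step with the conversion made explicit. It isn't taken from the newsletter articles; it simply assumes the usual approach of converting with the PUT function (the BEST12. format is my choice for illustration) and removing the leading blanks with STRIP, so that SAS has no implicit conversion to report.

```sas
data _null_;
  x = 1;
  /* PUT converts the numeric X to character explicitly;       */
  /* STRIP removes the leading blanks left by right-alignment. */
  y = 'Day#' !! strip(put(x, best12.));
  put x= y=;
run;
```

Because the type change is spelled out, the step should run without the "Numeric values have been converted" note.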

Well, imagine my surprise when I discovered that the NOTE2ERR argument for the DATA statement tells SAS that you don't want sloppy programming allowed and you want an error message if any is found! Quite simply, if we specify data _null_ / note2err; in the preceding code, we'll get an error message instead of a note. It seems that NOTE2ERR will convert a whole range of notes to errors, all representing (in my opinion) sloppy programming. All of the note messages relate to either a) invalid type (as discussed), b) uninitialised variables, c) division by zero, or d) (in)format not found.
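As a sketch of what that looks like, here is the earlier step with the argument added. Bear in mind that NOTE2ERR is undocumented, so the exact message text and behaviour may vary between SAS releases; treat this as an illustration of what I saw rather than a guaranteed result.

```sas
/* NOTE2ERR is an undocumented DATA statement argument */
data _null_ / note2err;
  x = 1;
  y = 'Day#' !! x;        /* implicit numeric-to-character conversion */
  z = 'Day#' !! left(x);  /* LEFT expects a character argument        */
  put x= y= z=;
run;
```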

There's a rider here. The option is undocumented and hence unsupported by SAS (even in V9.2). If, like me, you don't like putting unsupported code into production, you might consider using NOTE2ERR in some of your testing instead.

The suggestion of using the option in some of your test runs is made more practical by the fact that the option can be applied globally with the OPTIONS statement (options dsoptions="note2err";), so you don't need to apply it to each individual DATA step.
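A minimal sketch of a test run using the global setting follows. The DATA step is a made-up example of one of the other note categories mentioned above (an uninitialised variable), and the variable names are hypothetical. As before, the option is undocumented, so this is an assumption about how it behaves, not a supported recipe.

```sas
/* Apply NOTE2ERR to every DATA step that follows (undocumented option) */
options dsoptions="note2err";

data _null_;
  total = count + 1;  /* COUNT is never assigned a value */
  put total=;
run;
```

Placing the OPTIONS statement at the top of a test job, rather than in the production code itself, keeps the unsupported option out of what you ship.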

Finally, you need to be aware that some SAS procedures use the DATA step underneath the covers. PROC REPORT is an example. If you set the NOTE2ERR option using the OPTIONS statement, it will influence the behaviour of PROC REPORT and others that use the DATA step.

So, what are those other undocumented DATA statement arguments? I don't know but I'm guessing they're used by SAS solutions and/or procedures. And, why hasn't NOTE2ERR been documented and supported?... I'll have to do some digging and save the answer for another day.