Tuesday 1 May 2012

NOTE: Libnames, Who Needs 'Em?

My team received what turned out to be an interesting call for help from one of our clients today. We resolved the client's coding error, but it also served as a reminder of a little-used feature of Base SAS, namely the ability to specify directory names in code rather than bother with libnames. There are pros and cons to doing this. I'll discuss these below after I explain the feature.

We're used to specifying data sets on DATA statements in the "libname.dataset" style. However, instead of using a data set name, you can specify the physical pathname to the file, using syntax that your operating system understands. The pathname must be enclosed in single or double quotation marks. Here's an example:

data "c:\mydata\mydataset";

In the foregoing example, the DATA step would create a SAS data set file named mydataset.sas7bdat in the c:\mydata directory.

There's more information in the section titled "Accessing Permanent SAS Files without a Libref" in the SAS 9.3 Language Reference: Concepts. You will see that we can use the same naming technique in almost any situation where a library and data set name are expected, e.g. a SET statement, a MERGE statement, an UPDATE statement, a MODIFY statement, the DATA= option of a SAS procedure, and the OPEN function.
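To make that list a little more concrete, here's a minimal sketch of the same quoted-path technique on a SET statement and on the DATA= option of a procedure. The c:\mydata directory and the work.copy data set are just illustrative names of my own; the code assumes the directory already exists and that you can write to it.

```sas
/* Assumes c:\mydata exists and is writable (hypothetical path). */
data "c:\mydata\mydataset";
  x = 1; output;
  x = 2; output;
run;

/* Read the same file back with SET, again with no libref. */
data work.copy;
  set "c:\mydata\mydataset";
run;

/* The DATA= option of a procedure accepts the quoted path too. */
proc print data="c:\mydata\mydataset";
run;
```

No LIBNAME statement appears anywhere, yet mydataset.sas7bdat is written to and read from c:\mydata.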

My client's coding error arose because they had passed a macro parameter intended as a data set name and had surrounded it with quotes. The call %demo("name") produced the statement data "name"; and, as a result, SAS tried to create a file named name.sas7bdat in the SAS session's current directory. That directory was the root directory of the SASApp server, the user didn't have permission to write to it, and hence the code failed. The intention was to create a data set named name in the WORK library; the actuality was significantly different. It was all caused by a common misunderstanding: using quotes around character strings in macro code, where the quotes become part of the value.
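Here's a rough reconstruction of the problem. The %demo macro below is a hypothetical stand-in of my own, not the client's actual code; it simply drops its parameter onto a DATA statement. The MPRINT option writes the generated statements to the log so you can see the difference.

```sas
options mprint;  /* show the SAS statements generated by the macro */

/* %demo is a hypothetical stand-in for the client's macro. */
%macro demo(dsname);
  data &dsname;
    x = 1;
  run;
%mend demo;

/* Correct: no quotes, so this generates  data name;  and, assuming no
   USER library is defined, creates WORK.NAME. */
%demo(name)

/* Incorrect (left commented out): the quotes are part of the value, so
   this would generate  data "name";  and try to write name.sas7bdat to
   the session's current directory.
%demo("name")
*/
```

If you uncomment the second call in a session whose current directory is writable, it won't fail; it will quietly create name.sas7bdat there, which is arguably worse.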

So, we understand how we can dispense with LIBNAME statements, but should we take advantage of this capability? Well, I can't see too many advantages, but I can see plenty of disadvantages!

The disadvantages include: i) you need to specify directory paths accurately throughout the program (rather than eight-character libnames), ii) you cannot quickly and easily change a directory location (as can be useful when testing), and iii) you cannot specify an engine for the library.
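For comparison, here's a sketch of the conventional libname approach. The libref mylib, the macro variable datadir and both directory names are my own illustrative choices, and the code assumes the directory exists. The path and the engine are each stated once, so switching to a test directory means changing one line.

```sas
/* Both paths are hypothetical; point datadir at whichever directory you need. */
%let datadir = c:\mydata;        /* change to, say, c:\testdata when testing */
libname mylib base "&datadir";   /* the engine (here BASE) is stated once */

data mylib.mydataset;
  x = 1;
run;

proc print data=mylib.mydataset;
run;
```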

Can you think of any advantages? Let us know your suggestions in a comment.