Wednesday 24 March 2010

NOTE: Spring Cleaning

Spring has sprung, although it's difficult to detect here in South East London. With spring come thoughts of spring cleaning. In the SAS context, the following two tips spring to mind:

1) Clean Out Your Dead WORK Libraries

SAS uses the WORK library throughout the duration of your SAS session (local and remote). When your SAS session ends, SAS will automatically delete the WORK library. Specifically, your configuration options specify a "master WORK directory" where each new SAS session creates a sub-directory to use as its individual WORK library. Each individual SAS session deletes its individual WORK directory when it terminates. However, if SAS is cancelled or terminates abnormally then neither the individual WORK directory nor its contents will be deleted. Over time, the master WORK directory will fill up with orphan WORK (sub)directories and their contents.
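
To see this arrangement for yourself, the following minimal sketch writes the physical path of the current session's individual WORK directory to the log. It assumes only the standard PATHNAME function called via %SYSFUNC; the path you see will depend on your own configuration, and its parent directory is normally the master WORK directory.

/* Show the physical location of this session's individual WORK directory */
%put NOTE: This session's WORK directory is %sysfunc(pathname(work));

Comparing that path with the other sub-directories alongside it gives a quick feel for how many orphans have built up.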

This applies not just to remote sessions belonging to SAS/CONNECT and workspaces pertaining to Enterprise Guide; don't forget that your metadata server is a SAS session, as are your stored process server sessions, etc.

You need to clear out the master directory regularly in order to avoid running out of space, but you need to avoid accidentally deleting sub-directories that belong to active (non-orphan) SAS tasks. The SAS-supplied cleanwork utility is the answer. Yes, that's right, SAS supply a utility to keep your master directory clean, and it will check the provenance of each directory before deleting it, so it won't try to delete directories that belong to active SAS sessions. Whilst it can be run by individual users to tidy their own directories, it is best run as a scheduled admin task and given appropriate privileges.

Check out the operating system-specific SAS documentation for details. Read the documentation for Windows carefully - the last time I used cleanwork on Windows it was just a SAS program that deleted all directories older than a given date.

2) Tidy Up As You Go

Help to reduce overall WORK space usage by deleting temporary data sets as you go along. If I'm writing a modular program composed of a number of macros, I use the macro name as a prefix for all temporary tables created by the macro. By doing this I can easily delete all of a macro's temporary data sets at the end of the macro's execution by using the colon (:) wildcard in the DELETE statement of PROC DATASETS. The SYSMACRONAME automatic macro variable contains the name of the active macro, so it's easy to use the macro name as a prefix for all temporary data sets, and then delete them at the end of the macro. See the example below.

%macro leyland(...);

  data work.&sysmacroname._demotemp;
    set sashelp.class;
  run;

  ...

  proc datasets lib=work nolist;
    delete &sysmacroname.: ;
  quit;
%mend leyland;


Of course, this is not good for debugging! Deleting all of the temporary data sets automatically makes development and debugging very tricky. So, I add a parameter to each macro (passed down from one macro to any macros it calls) that indicates whether the macro should tidy up at its end. See below.

%macro leyland(tidy=y, ...);

  data work.&sysmacroname._demotemp;
    set sashelp.class;
  run;

  ...

  %* Compare case-insensitively so that tidy=Y and tidy=y both tidy up;
  %if %upcase(&tidy) eq Y %then
  %do;
    proc datasets lib=work nolist;
      delete &sysmacroname.: ;
    quit;
  %end;
%mend leyland;
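
To illustrate passing the parameter from one macro to another, here is a minimal sketch. The macro names outer and inner and the _temp suffix are hypothetical, invented purely for illustration; the pattern is the same as in the leyland example above.

%macro inner(tidy=y);

  /* Temporary data set prefixed with this macro's name (INNER) */
  data work.&sysmacroname._temp;
    set sashelp.class;
  run;

  /* Tidy up only if the caller asked for it */
  %if %upcase(&tidy) eq Y %then
  %do;
    proc datasets lib=work nolist;
      delete &sysmacroname._: ;
    quit;
  %end;
%mend inner;

%macro outer(tidy=y);

  /* Pass our own tidy setting down to the macro we call */
  %inner(tidy=&tidy);

%mend outer;

/* Whilst debugging: keep every macro's temporary data sets */
%outer(tidy=n);

Note that the sketch deletes &sysmacroname._: (with the underscore) rather than &sysmacroname.: so that a macro named inner would not also delete data sets created by a macro whose name merely begins with "inner". Bear in mind too that SAS data set names are limited to 32 characters, so long macro names leave little room for the suffix.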


Keeping your usage of WORK to a minimum will help your operational jobs run more reliably, and will save on the cost of disk space too. The two tips above will help you with the challenge.