Monday 20 September 2010

NOTE: SAS 9.2 Procedures by Name and Product

I came across a list of SAS procedures by name and by product the other day. It was very useful to me at the time and I thought I'd pass it on...

http://support.sas.com/documentation/cdl/en/allprodsproc/63875/HTML/default/viewer.htm#/documentation/cdl/en/allprodsproc/63875/HTML/default/a003178332.htm

I was trying to establish which SAS programmers used which products. Most programmers can tell you what PROCs they use, but they don't know what products these PROCs belong to. Hence my pleasure at finding the nice PROC/product look-up! I was able to get programmers send me a list of PROCs typed into spreadsheets, and I was able to read the web page into a SAS data set, and could then do cross-references between my data set and their spreadsheets (read into SAS of course!).

Monday 13 September 2010

BetterManagement Moves to the SAS Knowledge Exchange

I noticed that the oft valuable contributions at BetterManagement have moved to the SAS Knowledge Exchange web site. This was announced in a recent post from BetterManagement.

BetterManagement.com was formed in 2000 by ABC Technologies Inc.The aim was to launch an online service that provided a soft-sell style of message to potential customers. SAS acquired ABC in 2002, and hence BetterManagement too. The BM web site hosts an excellent collection of articles and on-demand webinars on a wide variety of topics, all of which gently lead the reader to consider the role of SAS software in the subject area. It will continue to be available until July 2011.

SAS tell me that content is being shifted to the SAS web site. To date, not much has been shifted. Searching the BM web site for "bicc" produces four hits, yet the SAS KE site produces none. Not a scientific test, but indicative nonetheless I think.

The four hits on the BM web site include an excellent business intelligence competency centre (BICC) audio & slides presentation by SAS's Stefanie Gerlach and Dagmar Bräutigam that I've watched previously. However, upon doing research for this article I was unable to get the presentation to roll. I hope this isn't a sign on an earlier than planned end for BM! On the plus-side, "Beyond Spreadsheets: Risks and Productivity in the Real World (video interview)" worked fine for me.

Wednesday 8 September 2010

NOTE: Sort Your Sorted Data

In my previous post I spoke of using views to give you access in EG visual coding to tables with dynamic names. This works well, but there's one final observation to make if you want to avoid poor performance in some circumstances.

If you're used to using DATA step merges then you'll also be used to sorting your input data sets prior to the merge. If you're using EG visual coding then you'll have realised that you don't need to sort the data sets that you use as input to your joins. Why? Because SQL will automatically sort your input data sets for you.

Neat huh? Well, yes it is, except in cases where your input data is already properly sorted and re-sorting is a waste of resource and causes unnecessary delays in the execution of your job.

So, can we tell SQL that one or more input data sets are already sorted? Yes, we can, by using the SORTEDBY data set option.

Monday 6 September 2010

NOTE: Marking Time with Enterprise Guide

In my recent review of a client's SAS/Base jobs' suitability for EG visual coding, I wasn't surprised to come across some jobs that created new tables each month where the table names included the date. The table names were of the form MASTER.SALES_2010_08, indicating sales for August 2010. However, the fact that the table was referenced with the use of a macro variable, i.e. MASTER.SALES_&month., meant that EG was less than happy.

This style of table naming doesn't sit well with EG's visual coding (nor DI Studio) because the process flow needs a static table name for inputs and outputs. You can use the Query Builder to create an output table whose names includes a macro variable. The code will execute and the table will be created, but it won't appear in the process flow. And you can't use a macro variable in the name of an input table for visual coding.

There are at least two approaches to take to this problem...

Project Management, Churchill-Style

Having recently marked the 70th anniversary of the start of World War II's Battle of Britain, my interest was piqued by a book sub-titled "Project Lessons from Winston Churchill and Battle of Britain". The idea of viewing World War II as a project surprised me. But the more you think about it, the more you see that, in detail, the requirements of executing a war are no different to any other project - you need leadership and governance, and you have to manage resources and tasks whilst keeping your eye on delivering the project's objectives.

Clearly, Winston Churchill's project had a higher cost of failure than any project I'll ever manage, and the risks and tasks are of a different nature to those I'm familiar with, but I can see the similarities too. So, the book's on order and I'm looking forward to reading it.

"Agile Leadership and Management of Change - Project Lessons from Winston Churchill and Battle of Britain" was written by Mark Kozak-Holland as part of the Lessons from History series.

Saturday 4 September 2010

NOTE: Celebrating Our 1st Birthday

Though I did a couple of test posts in July 2009, NOTE: really got started with its first post on Friday September 4th 2009. So, today we celebrate our first birthday.

In that year we've built a loyal band of followers. Thank you to all of you.

NOTE: is accessible in many different ways:

Just under 200 people use an RSS reader to get their SAS fix.
100 people get an automatic email copy of posts on any day in which an article is published.
73 people follow us on Twitter and get a near-instant tweet every time a new posts appears on NOTE:.
We get an average of 12 visitors to the web site each day, and they've created over 8,000 page views in the last year

I'm glad you find the content interesting. Please help us to grow further by telling your fellow SAS practitioners how informative NOTE: is, and by telling us what additional topics you'd like to read about.

Cheers!

Wednesday 1 September 2010

NOTE: Reflecting on EG, Here's 4.3

I knew that EG V4.3 was due soon, but I thought I'd have time to post some EG V4.2 observations before V4.3 saw the light of day. Wrong!

Chris Hemedinger highlighted yesterday that EG V4.3 is available to those who want it.

It has much to recommend it. Most notable to me was the emphasis on new features for users of the SAS program editor. It is heartening to see that SAS recognises that visual-coding does not (yet) satisfy the needs of all SAS programmers. Thanks SAS!

See the What's New section of the SAS V9.2 documentation for the low-down on the new features.

NOTE: Reflections on Enterprise Guide (EG)

I've recently had the benefit of spending some "quality time" with Enterprise Guide V4.2 (and SAS V9.2). I've enjoyed migrating a selected sample of my client's SAS/Base programs into EG projects. The foibles of my client's code gave me a great opportunity to re-evaluate EG's visual-coding, and to inform my client of the benefits and challenges of adopting visual-coding wholesale.

Visual-coding is the term I use to describe the creation of programs by the visual connection of process nodes with data nodes using a drag-and-drop graphical user interface. A visually-coded program usually results in a flow chart-like diagram of the program's activity.

The first point I should make is that EG does not force you to adopt visual-coding. You can use EG's program node with its code editor and ignore visual-coding altogether. Compared with SAS's classic display manager interface, you'll lose the ability to use the interactive DATA step debugger and to view the content of SAS catalogs, you'll gain an integrated FTP client, seamless remote submission of your code, and you can use visual-coding for wizard-like code generation for thoser PROCs you're less familiar with.

My two primary observations about EG are:

1) It's not a visual ETL tool. If you want to generate ETL in a visual manner, get DI Studio. EG's in-built task nodes do not provide any means of updating your data. Your only output options are a) create a new data set, or b) completely over-write an existing data set. In some cases option (b) is not available within a single task (most notably in the APPEND task). In many cases EG's options will be sufficient, but the absence of SQL's INSERT and UPDATE statements, and DATA step's MODIFY and UPDATE statements, means that it may not be the best tool for developing visual ETL.

2) You need to become competent with SQL if you want to use EG's visual-coding in an effective manner. Most of EG's data manipulation tasks are based upon SQL. Particularly the Query Builder task. If you want to go beyond the basics you need to understand how to use SQL syntax such as DISTINCT, GROUP and CASE. And you need to understand what each of the different types of joins do. If you're an experienced SAS programmer you won't struggle to get to terms with the new syntax, but you do need to know it.

I'll offer some more specific findings in the next few posts. In the meantime, if you have any observations of your own, please provide a comment to this post and start a conversation. You can also make use of the EG discussion forum on the SAS Support web site; and learn lots from Chris Hemedinger's blog.