Wednesday 24 February 2010

NOTE: Successes and Tools

My previous post was our 100th and marks quite a milestone. Since starting in July last year, Google Feedburner tells us:
  • yesterday we achieved our highest ever number of subscribers: 156
  • we've clocked-up 36,000 views
  • our most popular post of all time is Dashboards & Scorecards: What To Measure with 1,055 views (note that the widget in our right-margin shows popularity over the last 7 days)
And Site Meter tells us that the average visit length is over 2 minutes, so we know that you're reading the articles as well as just looking at them ;)

NOTE: Success Demonstrates BI Scope (the BI evolution)

Business Intelligence (BI) spans simple historic reporting to embedded real-time analytics. This is our 100th post and we're sharing our (minor) celebration with a SAS success.

BI is a commonly used term with a raft of different interpretations. Wikipedia begins to define it thus:
BI technologies provide historical, current, and predictive views of business operations. Common functions of Business Intelligence technologies are reporting, online analytical processing, analytics, data mining, business performance management, benchmarking, text mining, and predictive analytics.
There's a BI evolutionary path that starts with simple, static reporting on historic data (often delivered with spreadsheets) through to real-time predictive analytics embedded into front-office transactional systems. Many suppliers who claim to offer BI systems barely get off the ground on the BI flight to delivering real value to the enterprise.

All those products that offer sexy, shiny, slick graphics with animated 2.5D fuel gauges that make your historic data look exciting but don't begin to tell you about where you're headed are flattering to deceive. If you're considering implementing a BI solution, make sure your chosen software will give you the headroom to grow the value that the solution delivers. Don't box yourself in with a sexy solution that ultimately offers no real intelligence.

NOTE: Issue 49 of VIEWS News is Available

Phil Holland has published issue 49 of VIEWS News, the quarterly journal of the VIEWS International SAS Programming Community. This quarter's issue includes graphical impact, encryption algorithms & format merges, and Excel functions available in SAS 9.2, plus a prize draw especially for VIEWS News readers.

You can view the 48 back-issues in the archive.

NOTE: PROC MEANS Gives You All It's Got (and More!)

An oft-overlooked option for PROC MEANS (and PROC SUMMARY) is COMPLETETYPES. It tells MEANS to create all possible combinations of the values of the classification variables, even if some of those combinations don't occur together in the data. And the CLASS statement's PRELOADFMT option, used with a format, will create combinations from values that don't exist anywhere in your input data. This can be very useful in presenting what appears to be a more complete picture of the input data and can be equally useful in presenting a consistent layout amongst a group of reports (or regularly produced reports).

Here's a simple code example:

data sales;
  region = 'North'; product = 'Widget'; sales = 500; OUTPUT;
  region = 'North'; product = 'Foobar'; sales = 300; OUTPUT;
  region = 'South'; product = 'Widget'; sales = 100; OUTPUT;
run;

/* Remove the comment markers around COMPLETETYPES to see the extra row */
proc means data=sales /*completetypes*/ sum;
  class region product;
  var sales;
run;

If we run it without COMPLETETYPES we get:

       Analysis Variable : sales

region    product    Obs             Sum
North     Foobar       1     300.0000000
          Widget       1     500.0000000
South     Widget       1     100.0000000

And if we run it with COMPLETETYPES, we get an additional row for the South/Foobar combination, which has no observations in the data:

       Analysis Variable : sales

region    product    Obs             Sum
North     Foobar       1     300.0000000
          Widget       1     500.0000000
South     Foobar       0               .
          Widget       1     100.0000000
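
PRELOADFMT can be sketched in the same way. This is a minimal example that assumes the sales data set created above; the $regfmt format and the 'East' region are hypothetical, invented purely to show a value that never appears in the input. PRELOADFMT is specified on the CLASS statement and needs COMPLETETYPES (or EXCLUSIVE, or ORDER=DATA) on the PROC statement.

```sas
/* Hypothetical format listing every region we want in the report,    */
/* including East, which has no rows in the sales data set            */
proc format;
  value $regfmt
    'North' = 'North'
    'South' = 'South'
    'East'  = 'East';
run;

proc means data=sales completetypes sum;
  class region / preloadfmt;  /* take region values from the format */
  class product;              /* product values still come from the data */
  format region $regfmt.;
  var sales;
run;
```

With this in place the output should gain East/Foobar and East/Widget rows, each showing zero observations and a missing sum.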

NOTE: More Online Training

Having mentioned Sunil's online training and featured his mooted courses in India yesterday, I have to mention Andrew Karp's ever growing catalog of online and face-to-face courses at Sierra Information Services, and the Virtual SAS Users Group, who offer yet more online courses. Sitting alongside the online courses that SAS offer, there's a wealth of material waiting for you to sign-up and learn more about the technology and your industry.

Tuesday 23 February 2010

What is a Project?

Alongside our series of posts on creating Gantt charts in Excel for the purpose of managing small to medium sized projects, a discussion on "what is a project?" might be useful. Most of us feel we understand the general usage of the term "project", but what does it mean in the context of Project Management?

The Cambridge Advanced Learner's Dictionary defines the noun project as "a piece of planned work or an activity which is finished over a period of time and intended to achieve a particular aim". The key attributes are a) having an aim or purpose, and b) being able to define a start and end date/time. These attributes make projects distinct from ongoing operational, live, production or business as usual (BAU) activities.

The existence of the end date is an important element of the project, and a lot of work goes into agreeing the end date and then making sure the project is delivered/completed by the end date. Plans are drawn-up, often including Gantt charts.

Project Plans in Excel - Adding Dates

In the previous post in this series I described how to use Conditional Formatting to create a neat and simple Gantt chart alongside a simple Excel-based project plan. In this post I’ll describe how to use dates in addition to the day numbers that were featured in the previous post. The picture alongside (right) shows the result from today's post.

As with the previous case, I’m going to describe a quick and simple method. This method also takes weekends into account as non-working days. We ended the last post with what you see alongside (left).

So, let’s begin by adding the date for day 1 into cell F1. I'm typing “22/2” (to represent 22nd February). It’s not readable in the small width of the cell, so we’ll go to the Format Cells window (you can use Ctrl-1 to get there quickly) and select text orientation as 90 degrees. Then, to get the date format that we want, we’ll stay in the Format Cells window and specify a custom number format of “dd-mmm (ddd)”. If the height of row 1 doesn’t automatically increase for you, just do it manually. You should have a result like this:

NOTE: New Course in India: Best Practices in SAS Statistical Programming for Regulatory Submission

Having successfully conducted his long-titled course in the USA, Netherlands, and online, my friend Sunil Gupta is considering the possibility of running it in one or two cities in India. While the schedule is not yet finalised, the favoured period for the two-day class is around the end of April or early May, in Bangalore and maybe Hyderabad.

For further details of the course, see the overview of the online version. To register interest in the Indian course(s), email Sunil directly (and tell him NOTE: sent you!).

You might also like to know that Sunil is presenting his "Preparing SAS Programmers for the Pharmaceutical Industry (An Introduction)" course as a pre-conference course at SAS Global Forum (SGF) 2010. There may still be places available.

Wednesday 17 February 2010

NOTE: The Missing Semicolon Just Arrived

Systems Seminar Consultants' newsletter (named The Missing Semicolon) is always a good read, so I was pleased to get notification of the Winter 2010 issue last week. Featuring a mixture of topics, this issue seems to focus on writing good documentation (program documentation and system documentation). Please don't view this as a switch-off topic! Read the articles and you'll better understand the benefits that properly targeted and focused documentation offers.

However, I do strongly disagree with the author's rule of adding a comment to every line of code. Programming standards always give rise to a strong degree of discussion, but in my opinion slavishly putting comments onto every line of code doesn't add anything to the reader's knowledge of the code. Indeed, in the example code given, the vast majority of on-the-line comments are stating the obvious. Comments should describe what is not obvious in the code - that typically means describing what blocks of code are doing and/or why a particular approach was taken (and why other approaches were considered but discarded).

The issue also offers a review of The Little SAS Book (by Lora Delwiche and Susan Slaughter, whom I featured yesterday), and a nice tip regarding the INFILE statement's MISSOVER option.

I recommend you hop over to Systems Seminar Consultants' publications page and a) sign-up for a free subscription, and b) take some time to browse through the archive of issues.

Project Plans in Excel (simple, quick and effective)

For any piece of work other than the smallest, it’s worth planning. Planning doesn’t have to mean creating a huge monster in Microsoft Project - I find that Microsoft Excel (or similar) is often sufficient (and a lot more accessible to the team). This post (and the series of posts that follow) describes how to quickly and efficiently create an adequate plan for small to medium sized projects.

I don't expect all developers to be expert project managers, but I do expect my team members to understand the role of the project manager, to know how to work to a plan, and to focus on delivery. And I do expect developers to run their own (small to medium sized) projects from time-to-time.

A project plan can consist of just a list of tasks (preferably with start and end dates) together with the name of the person who will complete each task, but it can be made to communicate a lot more if you can deliver a Gantt Chart too. The name “Gantt Chart” sounds challenging to anybody who hasn’t met one before, but actually it’s a rather simple format that most people are familiar with (often without knowing the name). Gantt Charts can contain a lot of detail and embellishment, but I’m going to describe how to create a simple yet communicative chart very quickly.

Tuesday 16 February 2010

NOTE: Informats and Enterprise Guide 4.2 (beware)

I'm a keen follower of Susan Slaughter's books (in conjunction with Lora Delwiche) and her Avocet Solutions web site. The web site is very nicely structured and contains a wealth of solid information. Last week, the Little SAS Book Series featured an article about informats and Enterprise Guide 4.2. The article highlighted useful, user-friendly features of EG 4.2's Data Grid, but also warned of the fact that said Data Grid ignores informats.

The site and the article are recommended reading.

Encourage the (New) Conference Speakers

Reflecting further on the unconference & BarCamp format of the Analytics Camp NC event that I mentioned last week, whereby sessions are proposed and scheduled each day by the attendees and based upon pitches from the potential speakers, I realised that this is a good means of giving feedback to potential speakers and thereby encouraging new speakers.

It's a little daunting to write a paper and send it off to some anonymous conference organiser in the hope that you might be seen to offer something of interest to fellow conference attendees. And I've recently seen at first-hand how conference organisers can be dismissive of those whose papers are not selected (to the extent of not even bothering to tell them that their paper was not selected). To get some constructive criticism out of them, in order to do a better job next time around, can be like getting blood out of a stone. People who have had papers accepted for conferences on previous occasions will not be put off by such behaviour; however, a first-timer can easily be put off ever submitting a paper again by an anonymous rejection.

By contrast, the atmosphere at Analytics Camp seems to have been very informal and welcoming. It sounds like just the sort of atmosphere where a novice might be tempted to propose a topic and be given positive encouragement to proceed with their idea.

I continue to warm to the unconference & BarCamp ideas and ideals. More importantly, if you're organising a conference, please be sure your section chairs show respect and offer encouragement for all of those who take the time and effort to prepare a paper and submit it. For conferences to thrive they need a regular influx of new thoughts and ideas; don't stifle and discourage first-timers.

Wednesday 10 February 2010

Analytics Camp NC

I noticed a lot of tweeting last weekend with hashtags of sas and acampnc. I managed to figure-out that there was some kind of informal, analytics event in North Carolina named Analytics Camp NC. I took the time this week to find out more. Seems it was a useful event, and an interesting format too.

Angela Hall (she of the SAS-BI blog and latterly a Technical Architect at SAS) has offered two very informative posts: Designing Dashboards Successfully (answering the question "What should all dashboards have to make them useful and successful?") and Content Analytics "All Abouts" on text analytics. Thanks Angela. Other follow-up articles are listed on the home page of the Analytics Camp web site. Social media is a growing area of organisations' marketing plans, and it's clear that there's a lot of growing interest in the area of social media analytics, i.e. tracking readers, followers, fans, and (most importantly) buyers.

The Analytics Camp web site offers further information about the objectives and organisation of the event:

Tuesday 9 February 2010

NOTE: Enterprise Guide Add-Ins Are Out There

I wrote recently about the paucity of 3rd-party add-ins for Enterprise Guide. I finished the post by wondering aloud whether there were more that I hadn't found. Well, I've had no reports of any 3rd-party add-ins, but in Chris Hemedinger's SAS Dummy blog he summarised a good range of SAS-supplied add-ins that are available for free download.

Whilst they are largely unsupported, they certainly provide some jolly useful functionality. And the source code is provided so you can support them yourself.

Thanks Chris!

Monday 8 February 2010

NOTE: The DIVIDE Function

At the risk of stumbling over the description of another "new" function, I discovered the DIVIDE function alongside my discovery of the IFC/IFN functions. This is definitely new in V9.2; confirmed by the What's New in the Base SAS 9.2 Language web page.

It caught my eye because it handles those division scenarios where SAS normally issues notes or warnings to the log, e.g. divide by zero. In many cases you would want to be warned of missing values and division by zero, but there are some cases where you do not, and until now you have needed long-winded conditional coding around your division in order to avoid the messages. Well, with the DIVIDE function you don't.
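
To make the contrast concrete, here is a minimal sketch with made-up variable names (sales, units and the three price variables are purely illustrative). It shows the plain division, the traditional conditional wrapper, and the DIVIDE call side by side.

```sas
data _null_;
  sales = 100;
  units = 0;

  /* Plain division: SAS writes a division-by-zero note to the log */
  /* and sets the result to missing                                */
  price_plain = sales / units;

  /* Traditional workaround: only divide when it is safe to do so */
  if units not in (., 0) then price_guarded = sales / units;
  else price_guarded = .;

  /* DIVIDE function: no conditional code and no log message */
  price_divide = divide(sales, units);

  put price_plain= price_guarded= price_divide=;
run;
```

As I read the documentation, DIVIDE returns a special missing value rather than an ordinary missing value when dividing by zero, so check how your downstream code treats special missing values before swapping it in everywhere.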

NOTE: More on IFC and IFN

It seems my recent post on the IFC and IFN functions caused a fair bit of interest. Not least in further related functions.

Firstly I must own-up to my mistake of originally stating IFC and IFN were new with 9.2. A number of correspondents pointed-out that they were available from the beginning of V9. Plus, I have it on good authority that they were experimental in V8.2 (albeit possibly with different names). They were implemented primarily to provide a logical construct that could be used interchangeably in both SQL and data step code. They happen to be a lot shorter than SQL's case/when/else construct too! My thanks to the little ex-SAS birdie and the correspondent who passed the message along.

Secondly, The SAS Plumber started a discussion thread. In response, Data _Null_ pointed-out that the third conditional value only gets returned when the value of the first parameter is actually missing; hence my description of the functionality was incorrect. Jason Secosky pointed-out the same thing in comments on the original post. And Ron Fehd highlighted the safe, traditional option of using the SELECT statement.
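
The corrected behaviour is easy to check for yourself. The following sketch (variable names are illustrative only) loops over a true, a false and a missing value and passes each to IFC and IFN with all three conditional values supplied.

```sas
data _null_;
  length result $7;
  do flag = 1, 0, .;
    /* 2nd argument: returned when flag is true (non-zero, non-missing) */
    /* 3rd argument: returned when flag is false (zero)                 */
    /* 4th argument: returned only when flag is missing                 */
    result = ifc(flag, 'true', 'false', 'missing');
    bonus  = ifn(flag, 100, 0, -1);
    put flag= result= bonus=;
  end;
run;
```

The log should show 'true' and 100 for flag=1, 'false' and 0 for flag=0, and 'missing' and -1 for the missing value.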

Finally, Jack Hamilton suggested a look at CHOOSEC and CHOOSEN. These were new to me too - I've clearly been walking around with my eyes closed (and bumped into something, causing the poor quality of the original IFC/IFN post!). The syntax for both is:

CHOOSEx (index-expression, selection-1 <,...selection-n>)

The CHOOSEx function uses the value of index-expression to select from the arguments that follow. For example, if index-expression is three, CHOOSEx returns the value of selection-3. If the first argument is negative, the function counts backwards from the end of the list of arguments, and returns that value.
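
A small sketch of both functions, using arbitrary illustrative values:

```sas
data _null_;
  length day $9;
  /* Positive index: pick the third selection, i.e. Wednesday */
  day = choosec(3, 'Monday', 'Tuesday', 'Wednesday', 'Thursday');
  /* Negative index: count back from the end, so -1 picks the last value, 30 */
  last = choosen(-1, 10, 20, 30);
  put day= last=;
run;
```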

In his comments to the original posting, Jack also mentioned that IFC/IFN are also available in the macro language via %SYSFUNC, and so provide a primitive IF/THEN/ELSE mechanism in open code. Jack highlighted the example in the wiki.
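
This is not the wiki example that Jack highlighted, just a minimal sketch of the idea. It assumes the sashelp.class data set that ships with most SAS installations, and the dsn macro variable is my own invention. Note that character arguments are not quoted when a function is called through %SYSFUNC.

```sas
/* sashelp.class is used purely for illustration; any data set name will do */
%let dsn = sashelp.class;

/* EXIST returns 1 or 0; IFC turns that into a piece of text, in open code */
%put &dsn was %sysfunc(ifc(%sysfunc(exist(&dsn)), found, missing));
```

One point to bear in mind: the macro processor resolves all of the arguments before IFC is called, so any macro code placed in the "unused" result still gets resolved (unless you mask it, e.g. with %NRSTR).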

So, I now know four functions that I'd not heard of just a few short weeks ago. And I've made a note to read the documentation more carefully before posting next time! My thanks to all of you who contributed to the discussion.

Friday 5 February 2010

Round the World: Race 5 Ends - More Incidents for SAS Consultant's Yacht

The Clipper 09-10 Round the World Yacht Race, featuring UK-based SAS consultant Andy Phillips onboard the 68-foot Team Finland, continues to offer incidents and accidents. As race 5 was building to a close-fought contest between a number of yachts, one of Team Finland's rivals, Cork, struck a submerged reef in the Java Sea, leaving its crew to hurriedly launch the life rafts and paddle to the nearby small island of Gosong Mampango. Thankfully nobody was hurt. All 16 crew were subsequently evacuated to two sister yachts, Team Finland and California.