Wednesday, 4 December 2013

NOTE: Enterprise Guide vs DI Studio - What's the difference?

A favourite interview question of mine is: Compare and contrast SAS 9's stored process server and workspace server. This question is very good at revealing whether candidates actually understand some of what's going on behind the scenes of SAS 9. I mentioned this back in 2010, together with some notes on my expectations for an answer.

I was amused to see Michelle Homes post another of my favourite interview questions on the BI Notes blog recently: What’s the difference between SAS Enterprise Guide and SAS DI Studio? This question, and the ensuing conversation, establishes whether the candidate has used either or both of the tools, and it reveals how much the candidate is thinking about their environment and the tools within.

For me, there are two key differences: metadata, and primary use.

Michelle focuses on the former and gives a very good run-down of the use of metadata in Data Integration Studio (and its limited use in Enterprise Guide).

With regard to primary use, take a look at the visual nodes available in the two tools. The nodes in DI Studio are focused upon data extraction, transformation and loading (as you would expect), whilst the nodes in Enterprise Guide (EG) are focused upon analysing data. Sure, EG has nodes for sorting, transposing and other data-related activities (including SQL queries), but its data manipulation nodes are not as extensive as those in DI Studio. In addition to sorting and transposing, DI Studio offers nodes that understand data models, e.g. an SCD loader and a surrogate key generator (I described slowly changing dimensions (SCDs) and other elements of star schema data models in a post in 2009). On the other hand, EG has lots of nodes for tabulating, graphing, charting, analysing, and modelling your data.

One final distinction I'd draw is that EG's nodes are each based around one SAS procedure, whilst DI Studio's nodes are based around an ETL technique or requirement. You can see that DI Studio was produced for a specific purpose, whilst EG was produced as a user-friendly layer on top of the SAS language and thereby offers a more general-purpose solution.

For the most part, I'm stating the obvious above, but the interview candidate's answer to the question provides a great deal of insight into their approach to their work, their sense of curiosity and awareness, and their technical insight.