Wednesday, 6 April 2011

NOTE: DI Studio Version 4.3

Version 4.3 of SAS Data Integration Studio (DI Studio) is to be released alongside SAS 9.3 later this year. I stopped by the Data Integration booth at SAS Global Forum to get some detail from those who should know. There is a raft of enhancements worthy of note, but the big one that caught my eye was job versioning and rollback. Here's what I found (and saw)...
  • Increased ability to perform ELT (Extract, Load and Transform) in addition to traditional ETL: doing your transformations down in the database and thus minimising the miles travelled by your data
  • The code import wizard in 4.2 already understands macros to the extent that it can create and display a process node representing the macro, but DI Studio 4.3 will give you the option of expanding the macro so that you see the component tasks of the macro in your DI process flow. This gives you a clearer picture of what your job is doing and allows job performance to be collected at a more granular level
  • Talking of job performance, DI Studio 4.3 provides more breadth and depth with regard to the performance profile of your jobs and their steps. You have access to predefined reports that can (optionally) be created in Web Report Studio
  • A Type 1 SCD (slowly changing dimension) loader transform has been added for those who don't need the complexity of the Type 2 loader. Not sure what Type 1 and Type 2 are? See my earlier article on the subject
  • Job deployment can now be done from an OS command line using a newly-supplied shell script. This script supports deployment of one or more jobs at a time, and it can be scheduled. For those of you who have invested in automating the promotion and deployment of your jobs using the programmatic interfaces to export/import that DI Studio 4.2 brought, you can now consider completing the task by automating the deployment of your promoted jobs. Apparently, job deployment hasn't been made available through the programmable API because the task of deployment requires a great deal of Java activity that SAS weren't comfortable with delivering through the APIs. Disappointing, but there you go
  • And finally, for those who take configuration management seriously, we have versioning of jobs (and packages). Sadly this isn't a top-down delivery of a release management capability, but it's a step in the right direction. SAS have provided plug-ins for CVS and Subversion, and they say they'll publish the API so that you can write plug-ins for your own preferred source code management system, such as IBM Rational's ClearCase. These plug-ins facilitate the process of pushing exported packages (with one or more jobs and dependent objects) into your source code management system. From within the DI Studio interface you can see and inspect versions of your packages. You can even compare objects within packages, so you can answer the common Release Manager questions like "what changes did you make to the job between version 2 and version 4?". Sadly, the comparison can only be done on one object at a time.
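
For anybody hazy on the Type 1 versus Type 2 distinction mentioned in the list above, here is a rough sketch in Python. It illustrates the general slowly changing dimension idea only and says nothing about how the DI Studio loader transforms are implemented; the function names, the row layout, and the valid_from/valid_to columns are all my own assumptions for the sake of the example.

```python
from datetime import date


def apply_type1(dimension, key, new_attrs):
    """Type 1: overwrite the attributes in place, so no history is kept.

    `dimension` is a hypothetical dict mapping business key -> attributes.
    """
    dimension[key] = dict(new_attrs)
    return dimension


def apply_type2(rows, key, new_attrs, effective):
    """Type 2: close off the current row and append a new one, keeping history.

    `rows` is a hypothetical list of dicts with key, attrs, valid_from and
    valid_to; the current row for a key is the one whose valid_to is None.
    """
    for row in rows:
        if row["key"] == key and row["valid_to"] is None:
            if row["attrs"] == new_attrs:
                return rows  # nothing has changed, so leave the history alone
            row["valid_to"] = effective  # expire the old version
    rows.append(
        {
            "key": key,
            "attrs": dict(new_attrs),
            "valid_from": effective,
            "valid_to": None,
        }
    )
    return rows
```

With Type 1, a customer who moves from Leeds to York simply ends up as "York" and the old value is gone. With Type 2, the Leeds row is kept with an end date and a new York row becomes the current one, which is where the extra complexity comes from.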
So, all-in-all, DI Studio 4.3 has a lot to offer alongside the introduction of SAS 9.3. The versioning leaves a lot more work still to be done with regard to release management, support for a wider range of source code management systems, and detailed reporting. Nonetheless, it's a job well done by the DI Studio team.

The plug-ins are not available from Management Console and thus cannot be used and applied to any SAS metadata objects that you might expect to export/import. Thus there's no support for BI objects. That's disappointing, but SAS tell me they are aware of the shortcoming and hope to plug the gap in a future release.

[See all of my posts regarding SAS Global Forum]