Tuesday 28 July 2015

Great Developers Communicate!

There are so many skills that make a difference between a good developer (someone who knows their syntax and has a bagful of good design patterns) and a great developer. Many of those differentiating skills are related to communication.

I wrote about good communication skills nearly six years ago, so it's probably about time to revisit the subject!

Few if any of us work alone. The vast majority of us work as a team. Communication within the team is a necessity, not an option. As a developer, good communication will ensure you deliver the right thing and gain recognition for your abilities; as a manager, good communication is important if you and your team are to deliver what the business needs when the business needs it so that they can get maximum business benefit.

Many developers like working independently and are proud of the autonomy they may be given. But the paradox here is that the autonomy and independence can lead to a lack of recognition of their abilities. Many of us have worked damn hard only to see the boss give plaudits to somebody who may have contributed much less to the collective. Why does this happen, and how can we avoid it?

Surely there must be some hard numbers that will make it clear who's contributing the most to the team. Well, in my experience, traditional metrics such as lines of code, bugs closed and features added all have drawbacks in the reality of day-to-day software development, e.g. are two small features more valuable than one larger feature? And clearly, softer activities such as writing-for-maintainability and helping/coaching others have no reliable metrics for comparison. So, if there are no reliable metrics, how do you get noticed?

I think it comes down to trust. When a manager gives autonomy and independence to team members they are trusting them to complete the assigned task, make wise and strategic decisions along the way, and pro-actively communicate issues long before they become a problem. For someone to invest their trust in us, we have to show that we are in fact a good investment. We might ask ourselves:

Does my boss trust me?
Do my team-mates and peers trust me?
Have I done a good job to earn their trust?
How would my peers describe me to someone else?
How influential am I within the organisation?

As a team member, we want to be judged by our contributions, and we want autonomy and the ability to own substantial things. As a manager, we want to give recognition and praise to the people who deserve it, but we don’t want to micromanage and spend our days being Big Brother. So there's an implicit contract: I will give you autonomy and independence, but it is your responsibility to share status and information with me.

For example, a team member once told me he had worked hard and had really given it his best, but from my viewpoint his progress wasn't up to the same level as that of his team-mates. When he was leaving the company he told me all the things he had done – and I asked him "Why didn't you share this with me before?" You see, I would have advised him to spend his time elsewhere on priorities that were more important to the business. He responded with "I thought you would know." Don't make that same mistake.

And so my conclusion in all of this is: if you want autonomy, and the ability to own and control your own domain and projects – it is your job to push information and build trust with your team members.

In other words, you need to learn and do the following:

Follow through. Do what you say and consistently deliver on your commitments.

Pro-actively communicate when a task takes you longer than you thought, and why.

Improve your communication skills. In order for others to hear you, sometimes you have to hone the way you deliver your message.

Volunteer information and make an effort to explain vague or hard-to-understand ideas and concepts. Make an effort to share the details of your decisions and diversions. This is also important when you make mistakes – letting others know before they figure it out on their own will show ownership of the situation and can prevent misunderstandings later.

Be forthright and authentic with your feelings. Even when you hold a contrary opinion, communicate your thoughts (respectfully and with tact).

Don’t talk behind the backs of others. It is very difficult to build trust if someone knows that you will say something negative about your boss, the company leadership, or another team-member.

Be objective and neutral in difficult situations. Learn how to be calm under pressure and act as a diplomat resolving conflicts instead of causing them.

Show consistency in your behaviour. Not just in follow-through but by eliminating any double standards that may exist.

Learn to trust others. This is one of the hardest ones, but trust is a two-way street. Giving others the benefit of the doubt and learning how to work with them is essential to a strong mutual working relationship.

In turn, hopefully, you have a good manager who will be able to ask you good questions and take the time to understand your contributions. And if that is not your situation, then make sure you are sharing information with those around you, such as your peers, your boss, and other stakeholders.

Good leadership is keeping everyone on the same page, and if you want independence it is your responsibility to make sure people know what you are contributing.

I don't claim to come close to following my own advice in all situations, but I do keep reminding myself of what I believe is the right route to trust and autonomy. What is your route?

Thursday 23 July 2015

NOTE: SAS Grid Manager for Hadoop

I've recently written about how much new functionality is getting released by SAS on an almost monthly basis without much fanfare, and I've also written about how Hadoop is becoming a new "operating system" and we should expect to see Grid and LASR running within Hadoop in due course. Well, the release of SAS v9.4 m3 earlier this month brought: SAS Grid Manager for Hadoop.

In fact, the m3 release of SAS Grid Manager brought a raft of changes that point towards a different future for grid computing with SAS.
  • SAS Grid Manager for Hadoop has been added. It brings workload management, accelerated processing, and scheduling to a Hadoop environment
  • Support has been added for using an Oozie scheduling server. This server is used in a SAS Grid Manager for Hadoop environment
  • An agent plug-in and a management module have been added to SAS Environment Manager. In short, we can now monitor and manage our Platform grids using Environment Manager instead of RTM (although some features remain unique to RTM for the moment)
So, grid computing in SAS 9.4 m3 now offers a choice between Platform Suite for SAS and Grid Manager for Hadoop. And if you choose the Platform grid, you may no longer need to install and operate RTM.

Licensing issues aside, you may choose to run one or both of the types of grid technology. This article focuses on Grid Manager for Hadoop. From a user's perspective, there is little or no difference between the two choices because Grid Manager for Hadoop accepts all of the existing Grid syntax and submission modes; integration with other SAS products and solutions is supported by Grid Manager for Hadoop. However, from an architectural and administrative point of view, I believe there are two key advantages for Grid Manager for Hadoop:
  1. If your data is in Hadoop, you don't need to extract it out of the Hadoop cluster in order to process it on the grid. A key tenet of big data is to minimise "data miles" by sending the code to the data rather than transferring terabytes or petabytes of data to the compute server
  2. SAS Platform grids require a clustered file system ("shared data"); Grid Manager for Hadoop uses a shared-nothing approach and hence a bane of my life is eliminated! I've never shared a happy coexistence with a clustered file system. They have often been new/unknown technology for my client's IT infrastructure team, and they have often been unreliable (there may be a link between these two facts). When the clustered file system is the heart of the grid, unreliability is not a good quality
I must point out that the documentation does not state that the full syntax of SAS/BASE and associated products is available when run on a Hadoop-based grid. Certainly, up to this point in time, the SAS processes embedded into Hadoop have only been able to run a subset of SAS syntax, via DS2, plus high performance (HP) procedures. Furthermore, if we think of the no-shared-data model, it would seem inefficient in the extreme to run a SAS job on one grid node and expect the Hive/HDFS data to be streamed to that one node from all of the data nodes where it resides. So, efficient use of the in-Hadoop capability necessitates the use of DS2 or HP procedures.
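To make the "data miles" point concrete, here is a minimal Python sketch. It is a toy model rather than SAS or Hadoop code: the node names, the row values and both helper functions are hypothetical, invented purely for illustration. It compares how many values cross the network when every data node streams its rows to a single node with how many cross when each node computes a partial result locally and ships only that.

```python
# Illustrative only: a toy model of "data miles", not SAS or Hadoop code.
# The node names and row values below are made-up assumptions.

partitions = {
    "node1": [3, 5, 8],
    "node2": [1, 4],
    "node3": [7, 2, 6, 9],
}


def centralised_sum(parts):
    """Stream every row to one node, then sum there."""
    moved = [row for rows in parts.values() for row in rows]
    return sum(moved), len(moved)


def local_sum(parts):
    """Sum on each node; only one partial result per node moves."""
    partials = [sum(rows) for rows in parts.values()]
    return sum(partials), len(partials)


central_total, central_moved = centralised_sum(partitions)
local_total, local_moved = local_sum(partitions)

print(f"centralised: total={central_total}, values moved={central_moved}")
print(f"in-place:    total={local_total}, values moved={local_moved}")
```

Both approaches give the same answer, but the in-place version moves one value per node rather than one value per row. With real tables holding billions of rows, that gap is the whole argument for sending the code to the data.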

The SAS Grid Computing in SAS 9.4, Fourth Edition manual gives you all the information you need to plan, install and utilise your Grid within Hadoop with your v9.4 m3 environment. You will see that YARN is used for resource management and Oozie for scheduling. Cloudera, Hortonworks and MapR distributions of Hadoop are supported.

The manual tells us that the install process involves six steps:
  1. Install Hadoop services
  2. Enable Kerberos on the Hadoop cluster
  3. Enable SSL
  4. Update YARN parameters
  5. Set up HDFS directories
  6. Run the SAS Deployment Wizard to install and configure a SAS Grid Manager for Hadoop control server
I'm sure this install won't be plain sailing because there are a lot of new technologies and components involved. Equally, there are doubtless some features of the Platform grid that are not (yet) available in the Hadoop-hosted grid. But if you are planning a big data project and you need a grid, I suggest you give due consideration to this new option.

Wednesday 22 July 2015

NOTE: HTML 5 is in VA Hub Already!

Aside from comments about my SAS Enterprise Guide vs SAS Studio article, Metacoda's Michelle Homes (@HomesAtMetacoda) was quick to write a comment about my Flash & SAS Visual Analytics (VA) article and to point out that HTML5 is already an option for the VA Hub. Michelle said:
HTML 5 has been available as a configurable option in the hub in SAS VA 7.1 which was released in October 2014. Some information on this can be found at https://communities.sas.com/docs/DOC-8254

SAS VA 7.2 has a nice HTML 5 hub by default.
As a Session Recap from a SAS Live Q&A session states (along with nice comparison screenshots):
in Visual Analytics 7.1, the Home Page can be displayed using Flash or HTML5. Someone who has the Visual Analytics: Administration role can change the vah.client.ui.mode property in SAS Management Console. On the Plug-ins tab, navigate to Application Management --> Configuration Manager --> SAS Application Infrastructure --> Visual Analytics. Right-click the node and select Properties to access Advanced properties. The vah.client.ui.mode property specifies which mode of the Home Page to use. The default value, classic, specifies to use Flash to display the Home Page. The alternate value, modern, specifies to use HTML5 to display the Home Page. Note that the vah.client.ui.mode property is a site-wide setting that affects all users.
To learn more about SAS VA Mobile BI and HTML5, the TechTalks video of Himesh Patel (Sr Director, Research and Development) from this year's SAS Global Forum is a good place to start.

Tuesday 21 July 2015

NOTE: Your Response: EG & Studio

As Mark Twain is oft (incorrectly) quoted as saying: "Reports of my death are much exaggerated". I didn't say that Enterprise Guide (EG) was anywhere close to death when I (contentiously) wrote NOTE: What is SAS Studio? RIP Enterprise Guide? but I did suggest that there were good reasons to think that the web-based SAS Studio is in the ascendancy and that there might be a point in the future where it has sufficient features to make most sites seriously question whether it should be offered to users.

I got a number of comments in response (online and offline). Of the online comments, I was pretty certain when writing the original article that it would elicit a thoughtful and balanced response from Chris Hemedinger (@cjdinger), and I wasn't disappointed! Chris pointed out that EG is still receiving new features (a sure sign of life in a software product). As Chris said, the rate of change has slowed (as it should for a mature product), but many SAS users still see it as an essential part of the toolbox. And Chris provided a link to a neat TechTalks video from this year's SAS Global Forum of Christie Corcoran (Development Manager for SAS Studio) talking about SAS Studio and its place alongside SAS Enterprise Guide. It's a nicely informative video, hosted informally by Chris. As Christie says in the video, SAS Studio is "another great way to get to your SAS".

Of the features in EG but not Studio, Eric Winslow highlighted Stored Processes. Eric pointed out that Stored Processes are a great means for groups to share code. Within EG there is more than one means of simply and easily accessing Stored Processes (and editing and updating them). Whilst it could be said that PROC STP allows Stored Processes to be accessed from any SAS program (and hence Stored Processes are accessible from Studio), I imagine that Eric appreciates the interactivity available for executing Stored Processes in EG, plus the ability to manage the software development lifecycle in a more integrated and coherent fashion. Doubtless, explicit support for Stored Processes will come to Studio in time.

And, my old friend Phil Holland reminded me that he presented a paper on EG and Studio at this year's SAS Global Forum: "SAS Enterprise Guide or SAS Studio: Which is Best for You?". Oops, sorry Phil! You can find Phil's excellent paper at the bottom of his extremely long list of papers presented at conferences on his web site. Phil's 23-page paper takes the reader through features, techniques and tips before offering some sound recommendations that are based upon the experience of the user in question.

Sunday 19 July 2015

VR - Now I get it!

I'm a regular viewer of the BBC Click technology programme on the BBC News channel (@BBCClick). It covers a broad variety of technology subjects in a very accessible manner - ideal Sunday morning viewing.

Until this weekend's Click I'd always associated virtual reality (VR) headsets with not-quite-there software along with games that involve shooting aliens. No more! If you're able to view Click on iPlayer, jump to 21 minutes 19 seconds in this week's episode and see Spencer Kelly introduce a car being driven by a driver who is wearing a VR headset. Awesome. I want one!

If you can't view iPlayer in your region of the world, take a look at Engadget's more detailed backgrounder and its links to the associated YouTube videos: a) the (somewhat over-produced) final advert for Castrol oil, and b) a behind-the-scenes view.

If/when I get a VR headset I want one of those Mustang controllers with it!

Actually, if I'm honest, we have a couple of VR headsets in the house already. They're cheap and they're great fun. I'm talking about Google Cardboard. Upon delivery from Amazon, and after folding our £15 headsets into shape, we insert our Android phones, and we have a great VR headset. 

One of the first (free) apps I viewed on the headset was Paul McCartney. You don't have to be a fan of Sir Paul's music to get a buzz from standing on the edge of the stage whilst he and his band perform Live and Let Die, with fireworks aplenty. A great introduction to VR.

The Cardboard app itself provides a launcher for the various cardboard apps on your phone. YouTube has a growing number of 360-degree videos that work nicely with Cardboard. Have fun!

Wednesday 15 July 2015

NOTE: SAS v9.4 M3 is available

Further to my post on flavours of SAS v9.4 (indeed, flavours of SAS v9.4 m2), this week sees the release of SAS v9.4 m3 (otherwise known as 15w29). I've not had a chance to use it yet(!), but the documented features that caught my eye include:


  • The pre-production MSCHART procedure provides the ability to include "native" Excel charts in the Excel destination (see Chevell Parker's paper from this year's Global Forum)
  • Product upgrades: SAS Studio 3.4, SAS/STAT 14.1, SAS/ETS 14.1, Enterprise Miner 14.1, Data Integration Studio v4.901 and others
  • A new product: SAS Factory Miner. Amusingly, this brand new product ships as v14.1! (to tie-in with the associated release of the SAS Analytics products)
  • Increased support for secure configurations of SAS
  • The upgrades to DI Studio include:
    • Three new transformations (Fork, Fork End, and Wait For Completion) to manage parallel execution of branches of nodes
    • The ability to embed a loop within a loop
    • Support for Hadoop With Query (HAWQ) with the addition of a source designer that provides an SQL interface to store data natively in the Hadoop Distributed File System (HDFS)
  • Environment Manager 2.5's features include:
    • Administration functions that were previously only available (interactively) in SAS Management Console: Managing metadata definitions for SAS users, servers, and libraries. User definitions can be viewed, created, and edited. Server and library definitions can be viewed, and SAS LASR libraries and servers and Base SAS libraries can be created and edited
    • Grid monitoring functions that were previously only available (interactively) in RTM: Collecting metric data from a SAS grid. Metric data is collected and reported upon for the grid and for individual grid nodes
    • SAS Backup Manager for scheduling, configuring, monitoring, and performing integrated backups. SAS Backup Manager can be accessed from the Administration tab of SAS Environment Manager
  • Also for those who install & administer SAS systems:
    • Changes that are expected to result in a 40% to 50% decrease in start-up time for SAS Web Application Server
    • Greater re-startability in the SAS Deployment Wizard (SDW). If the SDW is interrupted during an install and then restarted during the installation phase, it will install only those SAS products that it has not already installed
    • The SDW enables you to reduce the number of password prompts for required SAS internal accounts, metadata-based server accounts, and SAS Web Infrastructure Data Server accounts
    • Support has also been added for compressing and validating SAS Software Depots. In addition, the SAS Migration Utility has been enhanced to protect passwords in the migration package from being exposed
    • The installation and configuration of the SAS Embedded Process for Hadoop has been improved and simplified: for Cloudera and Hortonworks, Cloudera Manager and Ambari are used to install the SAS Embedded Process and the SAS Hadoop MapReduce JAR files.
For further details, start your journey in the M3 section of the What's New document.
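The Fork, Fork End and Wait For Completion transformations listed above follow the familiar fork/join pattern for parallel work. As a rough analogy only, the Python sketch below shows the same idea using the standard library's `concurrent.futures`. It is not DI Studio code, nor the code DI Studio generates; the branch names and the `run_branch` function are hypothetical placeholders.

```python
# Analogy only: fork/join in plain Python, not DI Studio generated code.
# The branch names and run_branch are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor, wait


def run_branch(name):
    """Stand-in for one branch of job nodes; returns a completion message."""
    return f"{name} finished"


branches = ["extract_customers", "extract_orders", "extract_products"]

with ThreadPoolExecutor(max_workers=len(branches)) as pool:
    # "Fork": start every branch in parallel.
    futures = [pool.submit(run_branch, name) for name in branches]
    # "Wait For Completion": block until all branches are done.
    wait(futures)

# Downstream steps run only once every branch has completed.
results = [f.result() for f in futures]
print(results)
```

The key property is that nothing after the wait runs until every forked branch has finished, which is what lets independent branches of a job overlap safely.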



Tuesday 14 July 2015

It's Not Too Late to Volunteer for the DataDive

I wrote in a post in June about the good that DataKind does by using teams of volunteers with data science knowledge. It's not too late to join the DataDive this coming weekend in London (July 17-19). If you'd like to contribute your time and knowledge, check out the sign-up page on Eventbrite. Will I see you there?

What is a DataDive?
DataDives are weekend events that bring the data science community together with the non-profit community to tackle tough data problems in a short period of time. DataKind UK has selected specific charity projects to work on over the weekend. You need to bring your own hardware and software, your data skills, and the belief that you can help change the world for the better!

Who are the social organisations bringing projects to the event?
DataKind UK will be working with My Help at Home, Ark, Centrepoint and The Key.

Tuesday 7 July 2015

Hadoop is the New Black

It feels like any SAS-related project in 2015 not using Hadoop is simply not ambitious enough. The key question seems to be "how big should our Hadoop cluster be" rather than "do we need a Hadoop cluster".

Of course, I'm exaggerating; not every project needs to use Hadoop. But there is an element of new thinking required when you consider what data sources are available to your next project and what value they would add to your end goal. Internal and external data sources are easier to acquire, and volume is less and less of an issue (or, stated another way, you can realistically aim to acquire larger and larger data sources if they will add value to your enterprise).

Whilst SAS is busy moving clients from PC to web, there's a lot of work being done by SAS to move the capabilities of the SAS server inside of Hadoop. And that's to minimise "data miles" by moving the code to the data rather than vice-versa. It surely won't be long before we see SAS Grid and LASR running inside of Hadoop. It's almost like Hadoop has become a new operating system on which all of our server-side capabilities must be available.

We tend to think of Hadoop as being a central destination for data but it doesn't always start its presence in an organisation in that way. Hadoop may enter an organisation for a specific use case, but data attracts data, and so once in the door Hadoop tends to become a centre of gravity. This effect is caused in no small part by the appeal of big data being not just about the data size, but the agility it brings to an organisation.

SAS's Senior Director of the EMEA and AP Analytical Platform Centre of Excellence, Mark Torr (that's one heck of a title Mark!) recently wrote a well-founded article on the four levels of Hadoop adoption maturity based upon his experiences with many SAS customers. His experiences chime with my far more limited observations. Mark lists the four levels as:
  1. Monitoring - enterprises that don't yet see a use for Hadoop within their organisation, or are focused on other priorities
  2. Investigating - those at this level have no clear, focused use for Hadoop but they are open to the idea that it could bring value and hence they are experimenting to see where and how it can deliver benefit(s)
  3. Implementing - the first one or two Hadoop projects are the riskiest because there's little or no in-house experience, and maybe even some negative political undercurrents too. As Mark notes, the exit from Investigating into Implementing often marks the point where enterprises choose to move from the Apache distribution to a commercial distribution that offers more industrial-strength capabilities such as Hortonworks, Cloudera or MapR
  4. Established - At this level, Hadoop has become a strategic architectural tool for organisations and, given the relative immaturity of Hadoop, the organisations are working with their vendors to influence development towards full production-strength capabilities
Hadoop is (or will be) a journey for all of us. Many organisations are just starting to kick the tyres. Of those who are using Hadoop, most are in the early stages of this process in level 2, with a few front-runners living at level 3. Those organisations at level 3 are typically big enough to face and invest in solutions to the challenges that the vendors haven’t yet stepped up to, such as managing provenance, data discovery and fine-grained security.

Does anybody live the dream fully yet? Arguably, yes, the internal infrastructures developed at Google and Facebook certainly provide their developers with the advantages and agility of the data lake dream. For most of us, we must be content to continue our journey...

Thursday 2 July 2015

Summer of Coding

I'm always keen to encourage an awareness and uptake of coding in my kids. I think that coding brings a lot more than the simple ability to write programs. Coding requires a set of disciplines and an approach that are of great benefit in all walks of life.

As the summer holidays are upon us, with weeks upon weeks for kids to idle away their time, now is a good moment to revisit some of the online opportunities to give kids an insight into the joys of coding.

I've previously mentioned Scratch and App Inventor 2 (AI2) as two very accessible means for getting kids (and adults!) started, and producing a useful app that they can share with their friends very quickly. Both sites are free and use a clever building blocks interface to allow budding programmers to quickly understand the requirements of syntax. Scratch builds web-based apps and AI2 builds apps for Android devices (phones and tablets) with surprisingly powerful blocks for accessing web-based resources.

Scratch has always encouraged its users to share their work. Earlier this year App Inventor added its own gallery for showing and sharing.

Whilst it's not free, I've heard good things about Tynker. Tynker also takes the building blocks approach to syntax, and offers structured courses to help guide its students to exciting results.

Another means of getting your kids inspired is Lightbot. This is a series of programming-related puzzles featuring a cute robot character in a games app - available for Apple iOS, Android and other platforms. Great fun, and challenging too when you get to some of the higher levels.

As technology becomes more pervasive, traditional trades disappear, and the world of work becomes more globalised, the skills that newer members of the workforce need are changing: problem solving, team working, and communication are but three "21st century skills". Digital literacy (the ability to find and use internet-based resources and information) and creativity, along with the latter’s close relative, entrepreneurship, are close behind. And, the young have become more comfortable learning on their own, especially on topics of interest. They just need to be pointed in the right direction!