Monday, 8 September 2014

The Imitation Game @TIGmovie @BletchleyPark

For wholly different reasons, my daughter and I are thrilled to see a date for the premiere of The Imitation Game movie in London and the UK. She's thrilled because it features Benedict "Sherlock" Cumberbatch; I'm thrilled because it's a big screen depiction of the life and work of Alan Turing - the British pioneer of modern day computing.

Many would say that Alan Turing started the digital revolution. Although others around the world (such as the American Alonzo Church) had done some work, it was Alan Turing who envisaged and designed a machine that could be programmed to solve an infinite number of problems by being given a rule set upon which it would base its actions. In fact, it could theoretically solve any problem for which there was a solution (hence, it is basically a modern computer).
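That "rule set" idea can be sketched in a few lines of Python. This is only my own illustration of the concept, not anything from Turing's papers: a tape, a read/write head, and a table mapping (state, symbol) to (new symbol, move, new state).

```python
# A minimal sketch of a Turing machine: a tape, a head, and a rule set of
# (state, symbol) -> (new symbol, move, new state). The machine and the
# bit-flipping rules below are illustrative only.

def run_turing_machine(rules, tape, state="start", pos=0, max_steps=1000):
    """Run the machine until it enters the 'halt' state."""
    cells = dict(enumerate(tape))  # sparse tape; blank cells read as '_'
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(pos, "_")
        new_symbol, move, state = rules[(state, symbol)]
        cells[pos] = new_symbol
        pos += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells))

# Example rule set: flip every bit on the tape, halting at the first blank.
flip = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine(flip, "1011"))  # -> 0100_
```

Change the rule table and the same loop computes something entirely different - which is precisely the point of a programmable machine.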

It was Alan Turing who was instrumental in the development of the Bombe at Bletchley Park and it was Alan Turing whose ground-breaking ideas about Artificial Intelligence (AI) really pushed the boundaries of mathematical thinking at that time.

When Time magazine published its list of the 100 most important people of the twentieth century in 1999, they included Alan Turing in that list and said of him:
"The fact remains that everyone who taps at a keyboard, opening a spreadsheet or a word-processing program, is working on an incarnation of a Turing machine."
Turing's sorry, shabby reward for the instrumental role he played in winning the war for Britain was to be persecuted during the Cold War because his homosexuality was viewed as a security risk, to the point that he committed suicide. His pardon last year was a small recognition of his country's past mistakes.

The two most well known of his papers are:
  1. On Computable Numbers, with an Application to the Entscheidungsproblem (1936), which introduced the machine we now call the Turing machine
  2. Computing Machinery and Intelligence (1950), which proposed the "imitation game" that gives the film its title
If you can't wait until the general release of the film to get a fix of Bletchley or Enigma or Turing, you might like to read Robert Harris's Enigma (or watch the film version), or Jack Norman's Broken Crystal. Both are good semi-fictional reads.

Monday, 1 September 2014

Graphics on Android

Last week, writing about admin and deployment enhancements in SAS v9.4, I mentioned my estimation of the proportion of SAS customers on the latest version of SAS (I confidently estimated less than 50%).

These figures are available in other contexts. For instance, Google publish figures for Android versions on a monthly basis. This is useful for developers and gives them guidance on how backwards-compatible their apps should be.

When combined with the number of different manufacturers and handsets, the proliferation of Android versions is known as "fragmentation" and is seen in some quarters as a bad thing. From my perspective as a consumer, I think choice is a good thing, but I do see how it can create support and maintenance headaches for developers.

Anyway, my reason for mentioning this, aside from the nod back to last week's article, was to draw your attention to a recently published report on Android fragmentation by Open Signal. The quality and style of the graphics in their report really caught my eye, so I thought I'd share it with you. I like the look and style of the graphics, but I also like the interactivity you get when you move your mouse over the graphics.

What do you think? Could you replicate these graphics in SAS?

Thursday, 28 August 2014

NOTE: What's More in 9.4 - Admin & Deployment

We can't consider SAS version 9.4 to be "new" any more (it first shipped in July 2013), but if we had the numbers to show it, I'm sure we'd see that less than 50% of customers have upgraded, so it's worth revisiting 9.4's attractions.

The complexity, effort and cost of upgrading have grown over the years. I don't know many clients who still perform upgrades themselves; most rely on SAS Professional Services to do the heavy lifting. Whether this lack of self-sufficiency is good for the clients in the long-term is debatable. SAS themselves claim to recognise the issue and are making efforts to ease the burden of upgrading. Perhaps we'll see significant changes in this area in 9.5, but I won't hold my breath.

Anyway, to return to v9.4, I recently took a look at the What's New in SAS 9.4 book. Wow, that's a big tome! 140 pages. I've not bothered to check, but it's the biggest "What's New" that I recall. So there must be plenty of juicy new features to justify an upgrade. In fact, there are, and I'll spread them over a number of articles. To counterpoint my comments above, I'll start with a mention of the changes in the areas of deployment and administration.

Firstly, the web-based Environment Manager (EM) shows SAS's direction for a new Management Console. EM allows admin and monitoring from a web-based interface and hence does not require the installation of any client-side software.

Secondly, there's far greater and more explicit support for virtual SAS instances, either hosted on-site or off-site. This gives IT departments far greater flexibility to build and deploy multiple instances of SAS; this is a good thing if you think that multiple instances are a good thing.

Thirdly, many of those 3rd-party bits and bobs in the middle tier have been replaced by SAS Web Application Server. On the face of it, we no longer need to recruit and retain support personnel with skills in non-SAS technologies. Certainly, it's good to have just one supplier to turn to in the event of questions or problems. However, the skills and knowledge required to install and operate SAS Web Application Server are similar to those required for the bits and bobs of mid-tier used with v9.3 and v9.2, so it's not a complete "get out of jail free" card. And if you look carefully, you'll see that SAS Web Application Server is largely a rebranded version of a 3rd party toolset. Nonetheless, it's a positive step.

And finally, availability and resilience have been much improved with the ability to have more than one metadata server. I wrote about this in May last year. Alongside clustered metadata servers, we can also have clustered middle-tier servers. Simplistically, this means that if one server fails then the service can continue uninterrupted.

Unplanned Sabbatical - NO MORE

So, it's been a bit quiet in NOTE:land for the last six months. I last wrote in January, and it's now August.

I started the NOTE: blog in July 2009. Since then, up to January, I have posted 459 articles. That's an average of more than 8 posts per month. I wasn't aware of these numbers until I calculated them to write this article, but maybe they go some way to explaining why I hit a point in January where I felt I had to take a break from writing. At first it was just "a few weeks" but it's turned into months.

Despite publishing nothing for the last six months, the NOTE: site has received 7,500+ hits per month, so I guess there must be something of interest in those 459 articles.

I've always enjoyed writing my posts for NOTE:, so I knew that my "sabbatical" would end sooner rather than later. And so, you can now reinstate your expectations of a steady stream of stuff about SAS software, software development practices, data warehousing, business intelligence, and analytics, plus occasional mentions of broader technical topics that are of personal interest to me, e.g. Android and Bletchley Park.

And finally, thank you to those who sent kind messages regarding my sudden silence in January. I was touched by your concern.

Wednesday, 22 January 2014

Estimation (Effort and Duration) #2/2

Yesterday I spoke about the art of estimation. I highlighted how valuable it is to be able to plan and estimate. I also highlighted that estimation is about getting a number that is close enough for the purpose, not about being accurate.

To get to the meat of things... here is my recipe for estimation success... It won't surprise you to see me say that the key elements of estimation are:
  1. Understand the task, i.e. what needs to be produced
  2. Comparison with similar tasks already completed
  3. Decomposition of tasks into smaller, more measurable activities
  4. Attention to detail (sufficient for our purposes)
The first (requirements) is obvious, but is very often not given enough attention, resulting in an incomplete set of items to be produced. In a SAS context, this list might include technical objects such as Visual Analytics reports, stored processes, information maps, macros, DI Studio jobs, table metadata, library metadata, job schedules, security ACTs and ACEs, userIDs, data sets, views, and control files. On the business side, your list might include a user guide, training materials, a schedule of training delivery, a service model that specifies who supports which elements of your solution, a service catalogue that tells users how to request support services, and a process guide that tells support teams how to fulfil support requests. And on the documentation side, your list might include requirements, designs & specifications, and test cases.

Beyond identifying the products of your work, you'll need to identify what inputs are required in order to allow you to perform the task.

I'll offer further hints, tips and experience on requirements gathering in a subsequent article.

As regards comparisons, we need to compare our planned task with similar tasks that have already been completed (and hence we know how many people worked on them and how long they took). When doing this, we need to look for differences between the two tasks and account for them by adjusting our estimate up or down relative to the time the completed task actually took. In doing this, we're already starting to decompose the task, because we're looking for the partial elements of the task that differ.

Decomposition is the real key, along with a solid approach to understanding what each of the sub-tasks does. As you decompose a unique task into more recognisable sub-tasks, you'll be able to more confidently estimate the effort/duration of the sub-tasks.

As we decompose the task into smaller tasks, we must be sure that we are clear which of the decomposed tasks is responsible for producing each of the deliverable items. We need to look out for intermediate items that are produced by one sub-task as an input to another sub-task; and pay the same attention to inputs - we must be certain that we understand the inputs and outputs of each sub-task.
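That bookkeeping - which sub-task produces which deliverable, and whether every input is accounted for - can be sketched as a simple consistency check. The task names and items below are made up for illustration:

```python
# A sketch of the decomposition bookkeeping described above: each sub-task
# lists its inputs and outputs; we check that every input is either an
# external input or produced by some sub-task, and that each deliverable
# is produced exactly once. The task names here are illustrative only.

def check_decomposition(subtasks, external_inputs, deliverables):
    produced = [out for task in subtasks.values() for out in task["outputs"]]
    problems = []
    for name, task in subtasks.items():
        for item in task["inputs"]:
            if item not in external_inputs and item not in produced:
                problems.append(f"{name}: input '{item}' is never produced")
    for item in deliverables:
        if produced.count(item) != 1:
            problems.append(f"deliverable '{item}' produced "
                            f"{produced.count(item)} times")
    return problems

subtasks = {
    "extract":   {"inputs": ["source tables"], "outputs": ["staging data"]},
    "transform": {"inputs": ["staging data"],  "outputs": ["mart tables"]},
    "report":    {"inputs": ["mart tables"],   "outputs": ["VA report"]},
}

# No problems: every input is covered and the deliverable appears once.
print(check_decomposition(subtasks, ["source tables"], ["VA report"]))  # -> []
```

An empty result means the decomposition hangs together; anything else points at a sub-task whose inputs or outputs you haven't yet understood.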

I'll offer a deeper article on decomposition in a subsequent article.

You're probably thinking that requirements, comparisons, and decomposition are quite obvious. So they should be! We already established that we all perform estimation every day of our lives. All I've done is highlight the things that we do subconsciously. But there is one more key element: attention to detail. We must pay attention to understanding each sub-task in our decomposition. We must be sure to understand its inputs, its outputs, and how we're going to achieve the outputs from the inputs.

Having a clear understanding of the inputs, the outputs and the process is crucial, and it can often help to do the decomposition with a colleague. Much like with pair programming, two people can challenge each other's understanding of the task in hand and, in our context, make sure that the ins, outs and process of each sub-task are jointly understood.

I hope the foregoing has helped encourage you to estimate with more confidence, using your existing everyday skills. However, we should recognise that the change in context from supermarket & driving to software development means that we need a different bank of comparisons. We may need to build that bank with experience.

To learn more about estimating, talk to your more experienced colleagues and do some estimating with them. I'm not a great fan of training courses for estimation; I believe they're too generic. In my opinion, you're far better off learning from your colleagues in the context of the SAS environment and your enterprise. However, to the extent that some courses offer practical exercises, and those exercises offer experience, I can see some merit in them.

Good luck!

Tuesday, 21 January 2014

Estimation (Effort and Duration) #1/2

Estimation: the art of approximating how much time or effort a task might take. I could write a book on the subject (yes, it'd probably be a very dull book!); it's something that I've worked hard at over the years. It's a very divisive subject: some people buy into the idea whilst others persuade themselves that it can't be done. There's a third group who repeatedly try to estimate but find their estimates wildly inaccurate and seemingly worthless, and so they eventually end up in the "can't be done" camp.

Beware. When we say we can't do estimates, it doesn't stop others doing the estimation on our behalf. The end result of somebody else estimating how long it'll take us to deliver a task is a mixture of inaccuracy and false expectations placed upon us. Our own estimates of our own tasks should be better than somebody else's estimate.

My personal belief is that anybody can (with a bit of practice and experience) make decent estimates, but only if they perceive that there is a value in doing so. In this article I'll address both: value, and how to estimate.

So, let's start by understanding the value to be gained from estimation. The purpose is not to beat up the estimator if the work takes longer than estimated! All teams need to be able to plan - it allows them to balance their use of constrained resources, e.g. money and staff. No team has enough staff and money to do everything it wants at the same time. Having a plan, with estimated effort and duration for each activity, helps the team keep on top of what's happening now and what's happening next; it allows the team to start things in sufficient time to make sure they're finished by when they need to be; and it allows the team to spot early that a task probably won't be done in time, and hence to do something to bypass the issue.

Estimates form a key part of a plan. For me, the value of a plan comes from a) the thought processes used to create the plan, and b) the information gained from tracking actual activity against the planned activities and spotting the deviations. There's very little value, in my opinion, in simply having a plan; it's what you do with it that's important.

Estimates for effort or duration of individual tasks combine to form a plan - along with dependencies between tasks, etc.

Okay, so what's the magic set of steps to get an accurate, bullet-proof estimate?...

Well, before we get to the steps, let's be clear that an estimate is (by definition) neither accurate nor bullet-proof. I remember the introduction of estimation in maths classes in my youth. Having been taught how to accurately answer mathematical questions, I recall that many of us struggled with the concept of estimation. In hindsight, I can see that estimating required a far higher level of skill than cranking out an accurate answer to a calculation. Instead of dealing with the numbers I was given, I had to make decisions about whether to round them up or round them down, and I had to choose whether to round to the nearest integer, ten or hundred before I performed the eventual calculation.

We use estimation every day. We tot up an estimate of the bill in our heads as we walk around the supermarket putting things into our trolley (to make sure we don't go over budget). We estimate the distance and speed of approaching vehicles before overtaking the car in front of us. So, it's a skill that we all have and use regularly.

When I'm doing business analysis, I most frequently find that the person who is least able to provide detail on what they do is the person whose job is being studied. It's the business analyst's responsibility to help coax the information out of them. It's a skill that the business analyst needs to possess or learn.

So, it shouldn't surprise us to find that we use estimation every day of our life yet we feel challenged to know what to do when we need to use the same skills in a different context, i.e. we do it without thinking in the supermarket and in the car, yet we struggle when asked to consciously use the same techniques in the office. And, let's face it, the result of an inaccurate estimate in the office is unlikely to be as damaging as an inaccurate estimate whilst driving, so we should have confidence in our office-based estimations - if only we could figure out how to do it!

We should have confidence in our ability to estimate, and we should recognise that the objective is to get a value (or set of values) that are close enough; we're not trying to get the exact answer; we're trying to get something that is good enough for our purposes.

Don't just tell your project manager the value that you think they want to hear. That doesn't help you, and it doesn't help the project manager either.

Don't be afraid to be pessimistic and add a fudge factor or contingency factor. If you think it'll take you a day then it'll probably take you a day and a half! I used to work with somebody who was an eternal optimist with his estimates. He wasn't trying to pull the wool over anybody's eyes; he honestly believed his estimates (JH, you know I'm talking about you!). Yet everybody in the office knew that his estimates were completely unreliable. Typically, his current piece of work would be "done by lunchtime" or "done by the end of today". We need to be realistic with our estimates, and we need to look back and compare how long each task actually took with our estimate. If you estimated one day and it took two, make sure you double your next estimate. If somebody questions why you think it'll "take you so long", point them to your last similar piece of work and tell them how long it took.

When creating and supplying an estimate it's worth thinking about three values: the most likely estimate, the smallest estimate, and the biggest estimate. For example, if I want to estimate the time it might take to develop a single DI Studio job that requires a significant degree of complexity, perhaps I can be sure it'll take at least a couple of days to develop the job, certainly no more than two weeks. Armed with those boundaries, I can more confidently estimate that it'll take one week.
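One standard way of combining those three values - not something the article prescribes, but it fits the idea - is the three-point (PERT) formula, which weights the most likely value four times more heavily than the two bounds:

```python
# The classic three-point (PERT) estimate: a weighted average that leans
# towards the most likely value. This is a standard technique offered as
# one way to combine the three values discussed above.

def three_point_estimate(optimistic, most_likely, pessimistic):
    return (optimistic + 4 * most_likely + pessimistic) / 6

# The DI Studio job example, in working days: at least 2 days, most
# likely 5 (one week), and certainly no more than 10 (two weeks).
print(round(three_point_estimate(2, 5, 10), 1))  # -> 5.3
```

Reassuringly, the weighted figure lands close to the one-week gut feel, while still acknowledging the spread between the bounds.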

If you're not confident with your estimates, try supplying upper and lower bounds alongside them, so that the recipient of your estimates can better understand your degree of confidence.

Tomorrow I'll get into the meat of things and offer my recipe for estimation success.