In recent years, I’ve seen a new breed of analyst emerge. Indeed, more than a few of my superiors have encouraged me to join their ranks. This new breed of analyst knows statistics and finance but prefers to leave the coding to others. They have “people skills”. They know how to sit in meetings and provide “thought leadership” and “design policy”. However, these analysts cannot pull the actual data to support their point of view!
This approach to the technical side of our profession is doing them a serious disservice. Here’s why:
Analytics has many similarities to medicine. A new analyst needs to spend time to understand statistics and basic technique. This is easily the equivalent of a master’s degree. Some degree of social skills are required to be effective. You must regularly present information and collaborate with others. Furthermore, you will need to be able to build credibility (your bedside manner). Finally, you need to apply these skills within a specific problem space. This is our equivalent to the residency requirement.
The net result of this is that you become a trusted advisor, able to deliver useful guidance at the highest levels of the business. However, like medicine, there are some basic expectations.
1) First, you’re working with real world data from actual processes. Collected using imperfect tools and inconsistent operating procedures. Generally repeatedly copied / aggregated into some form of storage system. Which is subject to failure / corruption. To quote one of my favorite movies: “I’m smelling a lot of if coming off this plan”. How can you ethically render judgment about a dataset without being able to personally verify it? My joke on a recent project was – “I’ve written 500 queries for this project, 490 of which are titled QA”. But that’s the nature of the beast – an analyst need to be able to understand what is happening at a detailed level. Well over 30% of my time as a Six Sigma Black Belt was spent validating measurement systems and checking data quality. It’s our equivalent of checking a patient’s vital signs.
2) Second, a corollary to the first point. Generally your best insights occur from looking closely at the details of a business and noting where they differ from expectations. Unless you can write your own code (at least SQL, preferably Python, SAS, or R), you’re never going to get a good look at this. At best, you can try to copy the ideas of other analysts in your space (say…Tom said they use quasi-linear regression models to predict product demand at competitor X, lets try it). Which isn’t going to get you very far. In medicine, a good doctor tries to build a total picture, blending the patient’s data with the patient’s history, knowledge of current science, and other events in the community (eg. we’ve got a flu going around) and noting the differences from expected results.
3a) From a practical perspective, you regain control of your schedule – knowing how to code doesn’t preclude you from leveraging others (indeed, I get a lot of help from my team – couldn’t do it without them) but rather allows you to push forward in key areas and step in to address gaps when you need to. You’re no longer beholden to the IT priority queue. Imagine a doctor having to call a nurse to check a patient’s pulse…
3b) You also gain a great deal of political freedom within the organization – particularly on a controversial project where the stewards of the data have their own agenda. It’s really easy to stop the non-coding data analyst – just delay / defer the request until they go away, citing other emergencies. It’s harder to do that when the analyst can pull their own data.
4) And should I mention productivity and the ability to scale up a project? Between scripting repetitive processes and using scheduled jobs, you can build systems that effectively run themselves – sucking data out of your data warehouse, manipulating it, and dropping it into neat little packages you can distribute to the business. Need to create workbooks for 100 business managers? No problem, I’ve got this little concept called a for-loop…. No more repetitive excel workbook generation. One other nice advantage of using scripts is that you can often build some basic QA into the process via test scripts. Which comes in handy when you’re tweaking a thirty step process at 2AM for the fifth time after adding five new steps earlier that evening.
5) Finally, you lose the opportunity to claim a larger role in the implementation of your ideas and accelerate their speed-to-market. Once you master the basics of data slinging, go learn something about how to build GUI applications (MS Access is a great place to start) or web sites. Prototypes speak louder than PowerPoints and are often easier for the typical line manager to understand. A good analytics hacker with some basic GUI skills can take a significant project from cocktail napkin to apps-on-desk in under a couple of weeks. Plus they will build your credibility within the organization.
I should also note that once you’re able to set up your own LAMP / Python / Ruby site, the only thing which separates you from owning your own business is a good idea, a couple of months of hard work, and the cost of a nice steak dinner for two. And you’ll have all the tools you need to start your own word solver site – or whatever else floats your boat….
Excel, Access, Minitab, Crystal Ball, and drag-and-drop database tools are useful. But take the time to learn how to cut real code in a scripting language and learn the syntax of SQL (or the query language for your database of choice). It pays big dividends.
Would you trust a doctor that is unable to check a patient’s vitals? The profession has figured out how to manage a lousy bedside manner (paging Dr. House), but an inability to interact with the patient (in our case, the data) at a basic level? Of course not…
Follow me on twitter to hear about future articles: @MarginHound