Not a Data Scientist? You Can Still Be Data Savvy

I’ve been amused for a while by the tone of articles I read that marvel at the rise of the data scientist role. While not every article went so far as to declare that data scientists would have the “sexiest job of the 21st Century” as Harvard Business Review did, most of the posts I’ve seen echoed the we-have-seen-the-future tone. I don’t think they’re necessarily wrong (although this short Fortune article is a good reminder that the laws of supply and demand apply to data scientists too), but I don’t see what is surprising or new about this trend. If The Onion were covering this story, I’d expect a headline like:

New study reveals that people who are good at math and programming are employed, affluent.

Where’s the news here? People with math and programming chops have been getting rich on Wall Street since the seventies. As more companies generate big data, the need for these skills has expanded to new industries, to say nothing of the demand in the tech sector, but it’s all part of a long-term upward trend in the value of quantitative skills. My favorite example of the media enthusiasm, though, was this New York Times article from this past July. It told the story of a young man named Paul Minton, a waiter in San Francisco (where else?) who decided to become a data scientist and, after taking a three-month course in programming and data analysis, went from making $20,000 per year to a six-figure salary. Behold the miracle of data science!

One small caveat (which to its credit the article mentions, though just barely): Mr. Minton had earned an undergraduate degree in math. In other words, he was a pretty smart waiter. I don’t know Mr. Minton and don’t want to pick on him, the NYT, or waiters, but most people don’t have the aptitude to make that transition, much less so quickly. I am a huge fan of coding academies and other non-traditional education outlets, as I’ve written about before on my blog and in my book. But noting that Mr. Minton made his remarkable transition in “just three months” ignores his years of prior education spanning calculus, statistics, probability theory and other tough courses on top of the programming skills you need to earn a degree in math. I’m guessing Mr. Minton had at least a passing familiarity with MATLAB. Not only do most people not have those years of training under their belts; they wouldn’t be able to understand it if you gave it to them for free. And I consider myself one of those people.

It was no more a secret that math and statistics were marketable skills when I was in college than it is today, but that didn’t make them easier to learn. Among my friends from college, I was maybe a little above average at best when it came to quantitative skills. I did well enough through calculus and statistics, but at some point I had enough math behind me to major in economics and also enough to know that math was never going to be my long-term advantage. I would clearly have to build a career on more than just being better at math than most others. I think the vast majority of us have that realization with math at some point, when you see that you’re struggling more than everyone else around you.

I bring this up because if that moment of clarity takes place when you are well shy of a degree in mathematics, hearing that data scientists are in high demand is like hearing that NFL quarterbacks are well-paid. “Yup, I bet they are.”

The good news is that even if you can’t be a data scientist, you can still make yourself much more valuable and better at your job by becoming more data savvy.

The Secret Skills Gap in Companies Today: People Who Can Answer Questions

One of the things I’ve found most surprising over the years is how little understanding most employees have of their company’s own data. Forget about having enough data scientists; most of the companies I’ve come across have shockingly few people who are capable of analyzing their data in the most basic ways. For instance, I recently spoke at length with a marketing manager at a major hotel group who confided that “maybe two or three people in the company” understood the business and the internal systems well enough to analyze the company’s raw (i.e., non-aggregated) booking and sales data. A commercial products distributor had perhaps a half-dozen people out of tens of thousands of employees who understood both their databases and the business well enough to be able to quickly answer questions for the executive team. Another category-leading retail chain had only a tiny cabal of specialists who could analyze their raw data quickly. At many companies, the mandate of the “customer insights” team is to serve as a shared resource for other departments when they need someone who can understand the damn data and answer their questions.

Why is this?

The systems companies have in place are partly to blame. Many enterprises, particularly ones that grew by acquisition and inherited multiple IT departments as a result, store their data in systems that are difficult for non-technical employees to use. That alone discourages the vast majority of people from ever touching their company’s raw data. But the larger obstacle is simply that even if decent tools are available, it takes know-how and patience most people don’t have to analyze data that’s in a relational database as opposed to in a dashboard or an Excel file. It’s not just learning SQL, either. Understanding a company’s data model and how it stores data well enough that you can query it accurately takes patience and a lot of trial and error. There’s a big difference between the data you work with in business school and what you often see in the real world in terms of data reliability and quality. This is why the vast majority of people rely on aggregated reports and cleansed data they get from their IT departments; they can trust the data without thinking twice about it.

The problem with relying on dashboards and pre-built reports to do your analysis is that it’s hard to do work that sets you apart when you’re looking at the same small sliver of the facts as everyone else. Data quality is important, and companies emphasize having a single version of the truth for good reason, but it can seriously constrain your creativity. What happens when you have a question that you can’t answer with the cut of data someone else made available to you? How do you, for example, test whether your hotel is sufficiently meeting the needs of road-tripping families if you can’t analyze for yourself the spending patterns of people who only visited your hotel once, ordered room service off the children’s menu and requested roll-away cots? That’s the kind of analysis that makes your boss lean in and listen to what you’re saying. Being able to do it on your own is ten times better than having to ask someone else to do it for you.

Best of all, you don’t need more than junior-high math to answer that question. All you need is an inquisitive mind and the right data.
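
To make the hotel question concrete, here is a minimal sketch of what that analysis might look like, written in Python with its built-in sqlite3 module so the SQL can run anywhere. The table names (stays, stay_items), the column names, the item labels and every row of data are assumptions I invented for illustration; a real hotel's data model will look different, and figuring out how it differs is most of the work.

```python
import sqlite3

# Hypothetical schema and sample rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stays (stay_id INTEGER PRIMARY KEY, guest_id INTEGER, total_spend REAL);
CREATE TABLE stay_items (stay_id INTEGER, item TEXT);

INSERT INTO stays VALUES
    (1, 101, 420.0),   -- one-time guest, kids' menu + cot
    (2, 102, 310.0),   -- one-time guest, kids' menu + cot
    (3, 103, 250.0),   -- one-time guest, no family signals
    (4, 104, 500.0),   -- repeat guest, first stay
    (5, 104, 480.0);   -- repeat guest, second stay

INSERT INTO stay_items VALUES
    (1, 'kids_menu_room_service'), (1, 'rollaway_cot'),
    (2, 'kids_menu_room_service'), (2, 'rollaway_cot'),
    (3, 'minibar'),
    (4, 'kids_menu_room_service'), (4, 'rollaway_cot');
""")

# The business question, restated as three precise conditions on a stay.
FAMILY_QUERY = """
SELECT COUNT(*) AS guests, AVG(s.total_spend) AS avg_spend
FROM stays AS s
WHERE s.guest_id IN (            -- guests who stayed exactly once
        SELECT guest_id FROM stays GROUP BY guest_id HAVING COUNT(*) = 1)
  AND s.stay_id IN (             -- stays with a children's-menu order
        SELECT stay_id FROM stay_items WHERE item = 'kids_menu_room_service')
  AND s.stay_id IN (             -- stays with a roll-away cot request
        SELECT stay_id FROM stay_items WHERE item = 'rollaway_cot')
"""

guests, avg_spend = conn.execute(FAMILY_QUERY).fetchone()
print(f"{guests} likely road-tripping families, average spend ${avg_spend:.2f}")
```

Notice that the only math involved is a count and an average. The hard part is deciding what "road-tripping family" means in terms of the data you actually have.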

The Joy of Asking Simple Questions

It’s said that smart people ask hard questions while really smart people ask simple ones. Indeed, many of the most important questions you can ask about your company are the simplest. Why do people choose our products over our competitors’? Why do customers leave us when they do? Should we offer discounts to boost sales? When you’re up to your neck in being a good do-er it’s easy to lose sight of these fundamental questions because people don’t ask you to answer them when you’re still green. But oh, the liberation when you can! This is how you can begin to understand and contribute to solving some of the most important challenges facing your business today.

Learning SQL and how to interrogate a company’s raw operations data to answer fundamental questions about its business was probably the most useful business skill I acquired in the early years of my career. As it turned out, I was a natural at asking good questions and just needed the tools to be able to answer them. But more than that, a marvelous thing happens inside the businessperson’s mind as a result of analyzing a business through its internal databases: the discipline of querying databases teaches you to ask better questions. More specifically, it teaches you how to structure big questions in such a way that they can actually be answered with precision. It forces you to clean up lazy thinking, because computers don’t allow vague questions. It teaches you to think in sets, an incredibly valuable mindset, without your even realizing it. In short, it makes you a better businessperson by allowing you to more fully capitalize on your domain expertise. I know it changed my career tremendously for the better.
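
Here is a small sketch of what "thinking in sets" can look like in practice. A vague question like "Why do customers leave us?" first has to become a precise one, such as "Which customers ordered in the first half of the year but not in the second?" That is simply one set of customers minus another. As before, I'm using Python's built-in sqlite3 module, and the orders table, its columns, the cutoff date and all of the rows are hypothetical stand-ins rather than anyone's real data.

```python
import sqlite3

# Hypothetical orders table and sample rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer_id INTEGER, order_date TEXT);
INSERT INTO orders VALUES
    (1, '2015-02-10'), (1, '2015-09-03'),   -- ordered in both halves
    (2, '2015-03-22'),                      -- ordered only in the first half
    (3, '2015-05-14'), (3, '2015-11-30'),   -- ordered in both halves
    (4, '2015-08-19');                      -- new in the second half
""")

# Set A: customers who ordered before the cutoff.
# Set B: customers who ordered on or after the cutoff.
# EXCEPT returns A minus B: customers who bought once and never came back.
LAPSED_QUERY = """
SELECT customer_id FROM orders WHERE order_date <  '2015-07-01'
EXCEPT
SELECT customer_id FROM orders WHERE order_date >= '2015-07-01'
ORDER BY customer_id
"""

lapsed = [row[0] for row in conn.execute(LAPSED_QUERY)]
print("Lapsed customers:", lapsed)
```

The query doesn't tell you why those customers left, but it hands you the exact list of people to study, which is where a real answer starts.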

In a subsequent post, I’ll cover the set of steps a novice can take to go from zero familiarity to proficiency regardless of their quantitative skills. If you’re unfamiliar with the concepts I’ve described here, visit the links I’ve provided in this post and check back next week. You’ll be surprised by how achievable it really is.