If you missed last week’s post about how to be data savvy if you’re not a quant, you may find it helpful to read that first.
About seven years ago, I had a small epiphany in my career. I was at a startup, and while we were building out our analytics platform most of our revenue came from consulting engagements. My client at the time was preparing to launch a series of marketing campaigns featuring price discounts on many of their products (which were things people normally buy every few weeks). They were timing the campaigns to coincide with the peak demand seasons for each product so that they would capture more sales when demand picked up. The approach sounded reasonable at first, but something struck me as off. When I started asking questions I was surprised by how little proof lay behind the underlying assumptions of the strategy. Questions like, “How do we know this is the best time to run a discount?” or “How much existing revenue do discounts cannibalize?” weren’t just unanswered; they largely went unasked.
It didn’t make sense for such questions to not come up among this group. My clients were smart people with plenty of experience, so a lack of mental horsepower wasn’t the problem. What they lacked, I found, were the tools and some of the skills they needed to ask the questions they should have. They had dashboards and canned reports that were better than nothing. But if they wanted to ask a new question for which they didn’t have the answers in front of them, they were usually out of luck. As a result, the team wasn’t asking those new, important questions in the first place. It showed me two things:
- A business that has lost the ability to analyze itself critically sacrifices an unknown amount of innovation that never takes place
- Mastering the basic skills and tools to do this on your own, especially when coupled with ample domain expertise, can make you valuable very quickly
As I mentioned last week, the most important questions you can ask about a business are often the simplest. There were only two ways that the client’s discount approach could make them more money: existing customers had to buy more, or new customers needed to show up. Would those things happen? Since I had the tools, I could start trying to answer that. After some digging around in the database and creating some data sets out of theirs, I came up with this (scrubbed to preserve client anonymity):
There was more to it, but what this ultimately allowed us to see was that people were less likely to try new products during the periods the client was planning to promote them (shaded in gray above and in the red columns below). The peak season for “first time buyers” was not the same as the overall peak sales period for that product. That insight totally changed how the client planned campaigns and thought about demand for their products. And for the most part, my analysis didn’t require anything beyond ninth-grade math – just a lot of thinking and playing around with their raw data. The hardest part was isolating all of the first-time sales in a single table I could analyze.
This is a small example of how liberating (and fun) it can be to work with raw data. When you understand your domain well and can answer big questions on your own, you can have a major influence on an organization very early in your career.
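For readers curious what "isolating first-time sales" might look like in practice, here is a minimal sketch using Python's built-in sqlite3 module. It is not the client's actual analysis: the `sales` table, its columns, and every row in it are hypothetical, made up only to show the idea of finding each customer's earliest purchase of a product and counting those first purchases by month.

```python
import sqlite3

# Hypothetical sales table; the names and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (customer_id INTEGER, product TEXT, sale_date TEXT);
    INSERT INTO sales VALUES
        (1, 'soap', '2023-01-15'),
        (1, 'soap', '2023-03-02'),
        (2, 'soap', '2023-01-20'),
        (3, 'soap', '2023-03-10'),
        (2, 'soap', '2023-03-25');
""")

# Inner query: each customer's earliest purchase date for each product.
# Outer query: count those first purchases by calendar month.
FIRST_TIME_BY_MONTH = """
    SELECT product,
           strftime('%m', first_date) AS month,
           COUNT(*) AS first_time_buyers
    FROM (
        SELECT customer_id, product, MIN(sale_date) AS first_date
        FROM sales
        GROUP BY customer_id, product
    )
    GROUP BY product, month
    ORDER BY product, month
"""

# For comparison: all sales by calendar month, first-time or not.
ALL_SALES_BY_MONTH = """
    SELECT strftime('%m', sale_date) AS month, COUNT(*) AS sales
    FROM sales
    GROUP BY month
    ORDER BY month
"""

first_time = conn.execute(FIRST_TIME_BY_MONTH).fetchall()
all_sales = conn.execute(ALL_SALES_BY_MONTH).fetchall()

print("first-time buyers by month:", first_time)
print("all sales by month:", all_sales)
```

In this toy data, March has the most sales overall while January has the most first-time buyers, which is the kind of gap between the two peaks described above.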
Three Steps to Get Moving with Raw Data
If you’re already familiar with working with raw data, this will be review for you. I’m writing this for people who are unfamiliar with the tools and techniques involved.
Step 1: Locate the Data You Need
The first thing to do is define the data you need access to in order to do meaningful analysis of your company’s business. Start with the transactions you need and then think about all the information that gives context to those transactions. For my client, it was critical to have the sales and product cost data handy, as well as all of the product and customer information. For a hotel company, you’d want all of the booking data, prices paid, how long the guests stayed, how many rooms they booked, and everything else that described a customer’s stay at the hotel in order to really dig into the business. For an e-commerce company, you would want the sales data, what else they browsed for, and so on. You get the idea.
The next step is finding out how you can access the data. You need to know where and how it’s stored and what tools are available for querying it. Ask around if you don’t know the answers. Explain what you’re trying to do and that you’d like to have “read-only” access to the underlying databases so that you can analyze the business’ data. Eventually, you’ll find the right person who can answer your questions, though what you’ll find varies. At newer companies, particularly those that were born digital, getting database access can be as easy as someone in IT getting you credentials for a tool like pgAdmin or SQL Server Management Studio and pointing you at the right data sources. In some environments it can be more of a headache, and you might even be denied access outright. If that happens, don’t give up. The goal is to have unconstrained access to the data so that you can be better at your job. IT professionals (understandably) don’t want you to do things that can muck up their infrastructure, so just emphasize that you only want to analyze data in a sandbox environment. Get your boss involved if you need to.
It can be a pain to ask so many people for help and you may have to twist some arms, but just stay focused on the goal (unfettered access to the business’ data) and be flexible on everything else.
Step 2: Learn the Tools of the Trade
Assuming that the data you get access to is in a relational database, you’ll need to learn the basics of Structured Query Language (SQL). If you’ve never done any programming, SQL might seem scary at first, but it’s one of the easiest languages to learn. In fact, you’ll quickly find that the hardest thing about writing SQL queries is usually understanding the little oddities that exist in most companies’ data rather than the language itself. That’s the downside of working with raw data: you have to do the cleanup yourself. But once you get the lay of the land, you’ll be running circles around your peers who are all looking at the same data in canned reports.
With just a few SQL concepts under your belt, you’ll have a solid foundation that you can keep building on:
- Select statements that allow you to specify the variables you want to analyze
- Table joins which allow you to look at data stored in more than one place
- Where clauses which allow you to restrict the data you want to look at
- Group by clauses which let you summarize numbers (sums, counts, averages) across groups of rows
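To make the four concepts concrete, here is a minimal sketch that uses all of them in one query. It runs against a throwaway in-memory SQLite database through Python's built-in sqlite3 module. The `products` and `sales` tables, their columns, and the rows are all hypothetical, and your company's database will look different.

```python
import sqlite3

# Throwaway in-memory database with made-up tables and rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE sales (product_id INTEGER, sale_date TEXT, amount REAL);
    INSERT INTO products VALUES (1, 'snacks'), (2, 'drinks');
    INSERT INTO sales VALUES
        (1, '2023-05-01', 10.0),
        (1, '2023-05-03', 15.0),
        (2, '2023-05-02', 7.5),
        (2, '2023-04-28', 99.0);
""")

QUERY = """
    SELECT p.category,                 -- SELECT: the columns to analyze
           SUM(s.amount) AS revenue    -- aggregate computed per group
    FROM sales AS s
    JOIN products AS p                 -- JOIN: combine data from two tables
      ON p.product_id = s.product_id
    WHERE s.sale_date >= '2023-05-01'  -- WHERE: keep only the rows of interest
    GROUP BY p.category                -- GROUP BY: one result row per category
    ORDER BY p.category
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('drinks', 7.5), ('snacks', 25.0)]
```

The April sale is dropped by the where clause before anything is summed, so it never reaches the drinks total.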
For a solid desk reference book on how to write queries, I recommend O’Reilly’s SQL Cookbook. You can focus on the parts about writing queries and ignore most of the other stuff.
Lastly, if you don’t get access to data in a relational database, you might have to learn another tool like Tableau, SAS, SPSS, MicroStrategy or something similar. That’s fine too – as long as you have the ability to answer any new questions you want to answer about the business, you’ve achieved the goal. Speaking of which…
Step 3: Practice Answering Big Questions
To sharpen your skills once you have the data you need and know which tools you’ll be using, I always recommend people do the following:
- Think of 5-10 questions about the business that you would never have been able to answer before. Make them wacky if it helps get the creative juices flowing, but they should still be relevant to the business. For example, “How much did we earn on sales last month to customers who had spent less than $500 in the year-to-date?” is the kind of question that is very hard to answer with canned reports. This is what you want.
- Answer every question you think of using either SQL or whatever tools have been made available to you.
- Sanity check your answers, and fix the bugs in your queries until you have answers you believe are right.
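As a rough sketch of how the example question above might be answered and sanity checked, here is one possible approach using Python's built-in sqlite3 module. Everything specific in it is an assumption: the `orders` table and its rows are invented, "earned" is read as revenue, "last month" is fixed to June 2023, and year-to-date is taken to run through the end of that month. The `HAVING` clause is a filter applied to group totals after a group by.

```python
import sqlite3

# Hypothetical orders table; names, dates, and amounts are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, '2023-02-10', 450.0),
        (1, '2023-06-05', 100.0),
        (2, '2023-03-01', 250.0),
        (2, '2023-06-12', 200.0),
        (3, '2022-12-15', 900.0),
        (3, '2023-06-20',  80.0);
""")

# Assumed reporting window: "last month" is June 2023.
params = {
    "year_start": "2023-01-01",
    "month_start": "2023-06-01",
    "month_end": "2023-07-01",
}

# Revenue last month from customers whose year-to-date spend is under $500.
SEGMENT_REVENUE = """
    SELECT COALESCE(SUM(amount), 0)
    FROM orders
    WHERE order_date >= :month_start AND order_date < :month_end
      AND customer_id IN (
          SELECT customer_id
          FROM orders
          WHERE order_date >= :year_start AND order_date < :month_end
          GROUP BY customer_id
          HAVING SUM(amount) < 500
      )
"""

# Sanity check input: total revenue last month across all customers.
TOTAL_REVENUE = """
    SELECT COALESCE(SUM(amount), 0)
    FROM orders
    WHERE order_date >= :month_start AND order_date < :month_end
"""

segment_revenue = conn.execute(SEGMENT_REVENUE, params).fetchone()[0]
total_revenue = conn.execute(TOTAL_REVENUE, params).fetchone()[0]

# A segment can never bring in more than the whole customer base did.
assert segment_revenue <= total_revenue

print(segment_revenue, total_revenue)
```

Comparing a slice against a total you can verify some other way (a canned report, for instance) is one cheap way to catch a bad filter or a join that duplicates rows.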
Not only does this kind of exercise teach you how to structure questions in a way that they can be answered, it forces you to learn the oddities of your company’s data and learn to account for them. This exercise will also show you whether you have enough “data freedom”. If you still can’t answer the questions you think of, you either need access to more data or more granular data than you currently have.
The Light at the End of the Tunnel
There is definitely a learning curve to all of this, but it’s worth it. While you’re still getting up to speed on techniques and tools, be cautious about declaring interesting findings until you’ve triple-checked your work. You’re guaranteed to make mistakes, but it’s one of those things where your skills and knowledge snowball pretty quickly into being able to do really useful things. Keep at it.
The value in this all comes back to knowing what your job really is. No matter your role or your level in the organization, your job is to make the organization better. Learning to use different tools or write simple code in a language is a means to an end, not an end in and of itself. The end is your ability to analyze your organization without restrictions on your creativity. When you can ask better questions about the business than your peers and have the ability to answer them, driving adoption of your ideas becomes that much easier.