Data scientists: Every company wants one, but few people can articulate exactly what the role entails. That’s probably because it requires these professionals to be good at a little bit of everything.
I caught up with Frank Lo, director of data science at Wayfair, who built the data science department at the online furniture company from the ground up. Lo has been in Boston his entire life. He was born and raised around here, attended MIT for college and stuck around for work following graduation. And he’s been in the data science space since before it was cool.
Now going on five years spearheading data science at Wayfair, Lo has grown all the wiser. He explained, “I’ve seen data science grow tremendously at Wayfair. I started the team back in 2011 and, for a time, I was basically just a one-person team that grew to a small team of a few people, and we’ve become an entire data science department now of 22 people.”
What does he have to say about the ever-booming space and the high-demand professionals flooding it? I found out.
Olivia Vanni: Data Science is such a hot term nowadays, but what does it really mean to those of you in the trenches?
Frank Lo: Some people view it as this big word and make associations with very specific things. So, for example, they think: Data science, big data, Hadoop – those are all just bundled together in the same word cloud. Or, they think that if you know data science, you need to know a certain type of software technology, or that data science is machine learning.
But I would take a big step back and say that data science is the intersection of a lot of different areas and a lot of different skills. Three areas matter. One is mathematics and quantitative technique. Two is engineering because data science is very technical. And three is, I’d say, business strategy and core knowledge of whatever subject matter you’re focused on.
One thing I like to make clear is that we’re not just engineers in the sense that we can develop software. Yes, we take the quantitative aspect of that and say, “OK, now we can build algorithms,” or solve data analysis problems. But, most importantly, we are able to take a look at a business and determine where different data adds value to that business. Our work is not just done in isolation with requests being thrown over the wall. We work very, very, very closely with business partners. In a way, I say we’re like consultants.
One of the big advantages we have is we have such close proximity to information that we can see things that other people might not see. I actually think part of our responsibility is to play a role in trying to understand how the business is moving with all of the data analysis that we are doing.
OV: Where does data science fit into the bigger picture of a company? What are the interactions with other departments like?
FL: Here in data science, we’ve structured things in a very specific way where every data scientist is closely aligned with a long-term project. I’ll give you an example: product recommendations… which products we show you when you visit our website, based on what we think your interests are. With all of our projects, we work with a business partner in the company who’s in, say, product management or marketing management. Basically, they act like internal clients and are usually top-level executives. Very frequently, we have entry-level data scientists dealing with these directors and VPs in the company. They’re serious about the stuff we’re discovering.
The dynamic in which we work with them is a partnership. We make sure that it’s not like somebody has a request and throws it over the wall, it falls into our request queue, we go down the checklist and do everything. It’s more like if you have a question, we want to talk with you, have a discussion with you and even try to read in between the lines on what you’re trying to figure out so we can really help you out there. It’s very much a partnership.
OV: You’ve been in the data science game for 8 years now. How has it evolved since then?
FL: If you just look at the name, it wasn’t even called data science. You can call it whatever you want to call it, but it wasn’t until 4 or 5 years ago that people even started using the term data science. But that’s semantics.
I would say one of the things that has been really neat is that because there has been so much interest, a lot of material has popped up for people who are interested in the space to get deeper into it. They call them MOOCs – Massive Open Online Courses – which have been a great resource for people who have an interest but think, “Hey, I’m not going to go to school for this, I don’t have the money for this but I just want to learn.”
We’ve had people join our team whose education in data science has been a huge dose of curiosity and the fact that they’re able to go to something like Coursera or edX and learn all about machine learning, the analytics space or data technologies just from signing up for these online courses. I think that’s amazing.
OV: When you’re looking to build your team, what do you look for on a resume?
FL: I’d say that from a business strategy perspective, you’re good at that based off of experience in the corporate world and street smarts. It’s one of those things that’s harder to take a class and learn it. It’s more or less strategy and how you approach problems.
I personally have a soft spot for entrepreneurs. I love entrepreneurs because they take initiative. But in our own interview process, we make sure they can handle a very open-ended problem – not just technically and quantitatively, but also by tying it to the business.
On the quantitative side of things, one thing I ask everyone to do is, if you’re not familiar with the math you want to use, you should learn all about it. We don’t want you jumping into something new that you don’t know about already. I think a great resource for that is the open courses that I mentioned, but also, literally go buy a textbook and read about it… You can self-educate or take a course. I’d say those are some of the best ways to make sure you have the core, quantitative skills that you need.
On the technical side of things, you have to be comfortable with programming. You can’t not know how to code because there’s a lot of coding in data science. One of the best ways to practice that is work on your own projects where you need to code out some sort of solution. I think that a lot of people learn really well when they’re super hands-on. You learn by doing it. That’s how you get really good.
Seek out opportunities for yourself. Something that I look for in a resume and that pops out at me is that people have side projects because that shows me they were eager to learn and took initiative to try it out themselves.
A question that I ask is, “Tell me about data science work that you’ve done outside of school and work.” Because, let’s say you took a class, were assigned a project and did that project. Sure, it was nice that you did that project; however, you were told to do that project. When people engage in their own side projects, their own investigations, they’re showing that they are curious and they pursue it because they want to know the answers. I think that’s what people who want to get into this space do.
OV: What does the day-to-day look like?
FL: Let’s link back to how a data scientist is on a project for many months: The whole point is that you can then cover a project end-to-end very well and have ownership and expert knowledge about everything that’s going on with that project. So every data scientist is neck-deep in data all the time, becoming super knowledgeable about everything to do with their project from a data standpoint. There’s a lot of exploratory data analysis on every single aspect. We don’t just take data as is. We need to understand how it’s useful so we’re using tools, like, we’re a heavy Python shop and heavy SQL shop. They’re great tools for getting into very large data sets, processing through them, trying to understand them and learning new things.
And we discover new things in a way that’s useful to our objectives… Then, another thing that we do on the product side is developing the core product. We do a fair amount of programming… We’re prototyping, mocking up something to see what result we get. Maybe it worked, maybe it didn’t work so we keep trying. It’s a blend of art and science, what I like to call architecture. Once we have something that we want to move forward with, we bring it into production where our customers can see it, but we can still run tests to validate that we’re accomplishing our business goals…
We also analyze the tests very deeply. Based on test results coming in, we can determine what changes we should make or what ideas we should come up with to iterate on what we’ve already built. Then, we test again. Throughout the entire process, we’re in close contact with the business partners. We’re definitely not doing it in isolation; it’s in collaboration. Communication is really important. There is no scenario where we’ll disappear into our coding cave, emerge with something and they’re like, “Oh, they’re back.”
Image via Wayfair.