June 25, 2020

What does it mean to have a data-driven mindset?

Today, I had a chat with someone about the barriers to data science projects. At one point, I was explaining what skills someone needs to become a data scientist. One of the things I mentioned was that I believe a data scientist should have a data-driven mindset. After I said it, I realized I didn’t quite know what I meant by that.

There are many explanations of what a data-driven mentality is, how it affects companies, what the levels of data awareness are, and so on. That corporate mumbo jumbo is not what I was referring to. What I meant was more on a personal level: how we, as people, can be data-driven. After thinking about it, I concluded that it comes down to three key concepts:

Having the habit of supporting your decisions with data

When you become a data scientist, your life will revolve around data. You will need to get used to making decisions based on that data. This also includes the decisions you’re making on how to approach analyzing a piece of data. You have to get into the habit of not making a lot of unnecessary assumptions.

  • Need to choose parameters for your model? Compare candidate values and see which one works best. Don't just pick an arbitrary value.
  • Not sure what to do with your missing values? Look into the data and see if there is anything you can do to fill those gaps. Don't just remove them.
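To make the first bullet concrete, here is a minimal sketch of comparing candidate parameter values instead of picking one blindly. Everything in it is an assumption made for illustration: the toy numbers are invented, the model is a tiny nearest-neighbour averager, and the helper names (`knn_predict`, `validation_error`, `compare_candidates`) are hypothetical. The only point is the habit: score each candidate on held-out data and let the numbers decide.

```python
from statistics import mean

# Hypothetical toy data: (feature, target) pairs, invented for illustration.
train = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1), (6.0, 6.0)]
valid = [(1.5, 1.5), (3.5, 3.5), (5.5, 5.5)]


def knn_predict(x, data, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(data, key=lambda pair: abs(pair[0] - x))[:k]
    return mean(y for _, y in nearest)


def validation_error(k, train, valid):
    """Mean absolute error of the k-neighbour predictions on held-out points."""
    return mean(abs(knn_predict(x, train, k) - y) for x, y in valid)


def compare_candidates(candidates, train, valid):
    """Return {k: validation error} for every candidate value of k."""
    return {k: validation_error(k, train, valid) for k in candidates}


# Score every candidate and keep the one with the lowest held-out error.
scores = compare_candidates([1, 2, 3, 6], train, valid)
best_k = min(scores, key=scores.get)

for k, err in scores.items():
    print(f"k={k}: validation error = {err:.3f}")
print("best k:", best_k)
```

With a real model you would swap the toy predictor for your own and probably use cross-validation rather than a single held-out split, but the decision would still come from the comparison, not from a guess.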

This takes a while to get used to because, as humans, we survive thanks to our brain's ability to assume things based on “seeming” patterns. But when you are working with data, the most important thing is to set aside your instinct to assume based on belief and to start forming hypotheses based on observation.

When I was starting out, I found it helpful to note down the assumptions I made while working on a project. I would then go over them again and see if I could support them using the data I had.
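If you prefer code to a notebook page, the same habit can be sketched as a list of assumptions, each paired with a check the data can confirm or refute. This is only one possible way to do it, not the method I used back then: the records, the thresholds and the helper names (`missing_share`, `review_assumptions`) are all made up for illustration.

```python
# Hypothetical records, invented for illustration; None marks a missing value.
records = [
    {"age": 34, "income": 52000},
    {"age": 29, "income": None},
    {"age": 41, "income": 61000},
    {"age": 25, "income": None},
    {"age": 52, "income": 75000},
]


def missing_share(rows, column):
    """Fraction of rows in which the given column is missing."""
    return sum(row[column] is None for row in rows) / len(rows)


# Each assumption is written down next to a check that tests it on the data.
assumptions = {
    "income is missing in fewer than 10% of rows":
        lambda rows: missing_share(rows, "income") < 0.10,
    "age is never missing":
        lambda rows: missing_share(rows, "age") == 0,
    "all recorded ages are between 18 and 100":
        lambda rows: all(18 <= r["age"] <= 100 for r in rows if r["age"] is not None),
}


def review_assumptions(rows, assumptions):
    """Return {assumption text: True if the data supports it, else False}."""
    return {text: check(rows) for text, check in assumptions.items()}


results = review_assumptions(records, assumptions)
for text, holds in results.items():
    print("supported" if holds else "NOT supported", "-", text)
```

An assumption that comes back unsupported, like the one about missing income values here, is a prompt to look closer at the data before deciding whether to fill the gaps or drop the rows.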

Ability to be objective while reading the data

Apart from basing your decisions on data, you also want to make sure you are seeing exactly what is being shown. It is very easy to fall into the trap of seeing only what pleases us and ignoring the rest. I have been guilty of this too: I would ignore a funny-looking (as in a little bit too bad for comfort) performance metric in the hope that it didn’t mean anything. Trust me, in the end, it means something and it will cause trouble. So you need to be able to address the problems that come your way without filtering them out.

Understanding possibilities and limitations of working with data

Being aware of what you can do with data is one of the main responsibilities of a data scientist. Knowing what you can't do, though, is at least as important. It's all about keeping realistic expectations while being aware of the potential.

That's why being able to approach problems and solutions realistically is part of having a data-driven mindset. You can gauge your understanding of each project by asking yourself questions such as:

  • Can I formalise this problem in my head?
  • Does it fit into a data science format?
  • What are the input and the output?
  • How are they represented?

A data-driven mindset is not something you can learn overnight. You have to practice it, be open to criticism and keep your radar on to catch any mistakes you might have made. It is basically a framework that you train your brain to look at life through. In that sense, it is not only for data scientists. It applies to real life too.

Especially lately, there has been a lot of misinformation all around the internet. We probably see many unproven and untrue “facts” swirling around in our feeds. This is exactly where a data-driven mindset can help you. Instead of believing the first thing you see, you can research it, collect information and data on it and then make your own decision. By finding facts and evidence that support what you think and say, you are being data-driven.

I hope this article helps the vague term "data-driven mindset" make sense to you. A very data-driven week to all!