August 20, 2020

You should start working with data at the company you want to leave

If you’re new in data science, "doing data science" likely sounds like a big deal to you. You might think that you need meticulously collected data, all the tools for data science and flawless knowledge before you can claim that you “do data science”. But this is not true. You can, and in fact should, start working and playing with data as soon as you can. You don’t have to call it “doing data science” if you don’t want to, but working with data will only do you good.

In this article, I will explain why working with data as part of your current job is a great idea. I will then give you examples of projects to get your imagination going. Lastly, we will look into some things to keep in mind while working on data science projects at your current company.

Why is it so important to start doing projects?

First of all, you will learn a ton of skills you didn’t even know you needed by doing hands-on work. Secondly, it is a great way to show your future employer that you mean business: you are interested in the job and you take every opportunity to improve yourself.

To learn new skills, any type of project counts. Work with data that you can get your hands on. Simple datasets from the internet, your own WhatsApp chat history, data you found on Reddit, anything goes. Take opportunities to stretch your working-with-data muscles. You can take on bigger challenges as you up your game.

Impressing a potential employer is a different deal though. It's hard to get ahead of the competition by showing simple projects. At that point, your proof will be a portfolio of projects that have had a good deal of thought put into them. Creating a portfolio of projects is a vast topic, so in this article I want to focus on professional portfolio projects.

What are professional portfolio projects?

Those are the data science projects you do at your current job. As I said in my article on Quick tips for career switchers who doesn't want to start from a junior position, one of your main selling points is your professional experience and experience working with other people. You need to make use of this advantage to the fullest.

Professional projects are good indicators of a promising future data scientist for several reasons:

  • They show that you are interested
  • They show that you take charge when you need to
  • They show that you can work in a professional environment
  • They show that you know how to formulate a data science project
  • They show initiative

"It all sounds awesome but I can’t imagine what I can do at my job."

Well, your title really doesn’t have to be "data scientist" or "something analyst" for you to get access to some data and work with it. It doesn’t matter whether you work in marketing, design or HR: if you ask for it, you can get the data as long as it is not confidential. Many companies are not yet making nearly enough use of the data they collect, and they welcome opportunities for anyone to analyse it and draw conclusions from it. It's a win-win situation because you probably won't be able to find better data on the internet.

Let me give you examples of some cases.

Your company is selling beauty products (or any other retail product). Ask for a rundown of all sales. You can analyse the data to look for trends, or try to predict daily sales per region. Add a couple of your own features and see if the performance gets better. You can also take it to the next level and use explainability techniques to explain why your model is predicting what it’s predicting.
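To make the "add a feature and see if the performance gets better" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the sales numbers are made up by a hypothetical `make_toy_sales` helper, the "model" is just a lookup of averages, and mean absolute error is only one of several metrics you could pick. With real data you would load your company's sales export instead.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def make_toy_sales():
    """Hypothetical data: (day, region, units_sold). All numbers are made up."""
    rows = []
    start = date(2020, 6, 1)  # a Monday
    for offset in range(56):  # eight weeks of daily sales
        day = start + timedelta(days=offset)
        weekend = day.weekday() >= 5
        rows.append((day, "north", 100 + (40 if weekend else 0) + offset % 3))
        rows.append((day, "south", 60 + (25 if weekend else 0) + offset % 2))
    return rows


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def fit_global_mean(train):
    """Baseline: always predict the overall average of the training data."""
    overall = mean(units for _, _, units in train)
    return lambda day, region: overall


def fit_region_weekday_mean(train):
    """Uses two features, region and day of week, via per-group averages."""
    groups = defaultdict(list)
    for day, region, units in train:
        groups[(region, day.weekday())].append(units)
    table = {key: mean(values) for key, values in groups.items()}
    fallback = mean(units for _, _, units in train)
    return lambda day, region: table.get((region, day.weekday()), fallback)


def evaluate(model, test):
    actual = [units for _, _, units in test]
    predicted = [model(day, region) for day, region, _ in test]
    return mean_absolute_error(actual, predicted)


rows = make_toy_sales()
cutoff = date(2020, 7, 13)  # train on the first six weeks, test on the last two
train = [row for row in rows if row[0] < cutoff]
test = [row for row in rows if row[0] >= cutoff]

baseline_error = evaluate(fit_global_mean(train), test)
feature_error = evaluate(fit_region_weekday_mean(train), test)
print(f"baseline MAE: {baseline_error:.2f}")
print(f"region + weekday MAE: {feature_error:.2f}")
```

The split is by date rather than random because you want to test on days the model has not seen yet, which is closer to how a sales forecast would actually be used. A real project would likely swap the lookup table for a proper model, but comparing every new idea against a simple baseline is a habit worth showing in your write-up.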

Let’s say you work for a scooter rental company. If your product collects data, can you create a model that anticipates how long it will be before a scooter breaks down? Can you predict the need for maintenance ahead of time?

Maybe you work at a gym. If you collect members' entrance data, you can anonymise it and try to see who is more likely to keep showing up. This is a tricky one though. You wouldn’t want to be using any variable that might cause unethical results, such as someone’s race or gender. Though gender might be okay to use here unless your company decides to give discounts to people who are likely to show up more. (Because that would be discrimination.) You can also look into predicting how busy the gym would be at a certain time based on season, time of day, the weather, etc. That is probably a safer project.

If you work at a recruitment company, you can try to get a hold of past hires and see how people's profiles correlate with the companies they were hired at. One option is to make a model that predicts how good a fit a person and a company are. Some safe-to-use features would be education level, school, degree, years of experience in the industry and years of general professional experience for the person, and seniority level, industry, maturity and other similar factors for a job listing or company. You might want to keep an eye out for potential proxy variables (e.g. postcode might be a proxy for race in a city where people with the same ethnicity live in the same neighbourhoods).
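One rough way to check for a proxy variable is to ask how much better you can guess the sensitive attribute once you know the candidate feature. The sketch below is only an illustration under a few assumptions: the records are invented, the group labels "A" and "B" are placeholders, and it assumes you are allowed to look at the sensitive attribute for auditing purposes, which may not be the case at your company.

```python
from collections import Counter, defaultdict

# Hypothetical records: (postcode, sensitive_group). All values are made up.
records = (
    [("1011", "A")] * 18 + [("1011", "B")] * 2
    + [("2022", "A")] * 3 + [("2022", "B")] * 17
    + [("3033", "A")] * 10 + [("3033", "B")] * 10
)


def majority_accuracy(labels):
    """Share of labels matched by always guessing the most common one."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def accuracy_given_feature(pairs):
    """Accuracy when guessing the most common label within each feature value."""
    by_value = defaultdict(list)
    for value, label in pairs:
        by_value[value].append(label)
    correct = sum(
        Counter(labels).most_common(1)[0][1] for labels in by_value.values()
    )
    return correct / len(pairs)


baseline = majority_accuracy([group for _, group in records])
with_postcode = accuracy_given_feature(records)
print(f"guessing without postcode: {baseline:.2f}")
print(f"guessing with postcode:    {with_postcode:.2f}")
```

If knowing the postcode makes the guess much more accurate, a model that uses postcode can indirectly pick up on the sensitive attribute, and that is worth noting in your report. This check is measured on the same data it was computed from, so treat it as a first warning sign rather than a proper fairness audit.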

The projects don’t have to be groundbreaking or novel. You can replicate projects that are already done. As long as it’s your own work and you take your specific situation into consideration, it’s valuable data science work.

What should you focus on during the project and what should you highlight while presenting this work?

  • Make sure you understand your company’s business
  • Create a list of potential projects you’re interested in looking into; not all of them need to be feasible.
  • Take notes on your process of collecting the data for your final reporting. The struggles you face while getting data are part of the data science process.
  • Clearly state your goal and your approach. Your approach might take its final form as you go. Just make sure to note down your decision points and the way you decided to go forward.
  • Note ethical issues you faced: which variables you decided to use, which ones you decided to omit, and why.
  • Talk about the features you created or added and why.

Tip: Apart from being technical work, data science is very much a creative and critical thinking process. Explain your decisions and highlight smart solutions you came up with.

  • At the end of your projects, make sure to have something presentable. Have your results either on some slides, on a webpage or maybe on GitHub as a readme document. You can use this in your portfolio or just to check back on your work afterwards.

To get extra points during your job hunt, you can:

  • Try and arrange a time to present your work to your team
  • Try to see if anyone in the company is interested in your results and treat them like your stakeholders
  • Big plus points if you can get your company to use your results/model. Deployment to real life is one of the most problematic parts of the data science pipeline, so showing experience in implementation will be very important for you.

Tip: If you can’t implement a big project you worked on, do a simple one and get it implemented. It’s always impressive to deploy a piece of work in a company. Even if it’s just a simple analysis, it will show your capability to work with complex situations and get things done.

The secret to getting a job as a data scientist when you’re switching from a different career is to play your professional experience card. If, on top of that, you have some data science projects you have done in a professional environment, that will be a huge plus for you. Keep your eyes and ears open for opportunities at your current job and don’t hesitate to ask around. I’m sure you’d be surprised at how eager your company would be to have data analysis done for free.