Data Analysis

Best Question-Driven Approach Award at ASA Five College Datafest 2018

This March, my friends and I attended the ASA Five College Datafest 2018 and we actually won an award! In this post, I just want to briefly explain our process.

Approach

Given that the event lasted only about two days, we decided to split our approach into three stages:

  • Friday Night: Exploration
  • Saturday: Gather most compelling insights, develop conclusions, and create a presentation
  • Sunday Morning: Finalize presentation and wow the judges!

Data

The data consisted of completely anonymized job listings from the job hunting site Indeed. Its columns included location, post date, and clicks, among others.

Exploration

Pretty early on, we realized there were a lot of problems with the data. First, the clicks data was estimated, and with some basic investigation you could discover the function that was used for the approximation. Second, even the company, job title, and description were masked, so there was very little information to gather about the actual post. Finally, just like in the real world, there was a considerable amount of missing data.

After running into a considerable number of dead ends, we landed on something good.

Insights

Using a basic choropleth plot and 2010 census data, we displayed the number of jobs listed per capita by state. The Northeast led the entire country with almost three job postings per 100 people. The South had the fewest, with about one listing per 1000 people.
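For anyone curious about the arithmetic behind that map, here is a minimal sketch of a per-capita calculation in Python. It is not our actual Datafest code: the function name `listings_per_100`, the state codes, and all of the numbers are made-up placeholders, and I'm assuming each listing carries a state label that can be matched against a population table.

```python
from collections import Counter


def listings_per_100(listing_states, population_by_state):
    """Return job listings per 100 residents for each state.

    listing_states: iterable with one state label per job listing.
    population_by_state: dict mapping state label -> population.
    Both inputs are hypothetical stand-ins for the real data.
    """
    counts = Counter(listing_states)
    rates = {}
    for state, population in population_by_state.items():
        if population <= 0:
            continue  # skip states with missing or invalid population
        rates[state] = 100 * counts.get(state, 0) / population
    return rates


# Toy placeholder values, NOT the Datafest or census figures.
toy_listings = ["AA", "AA", "AA", "BB"]
toy_population = {"AA": 100, "BB": 1000}

rates = listings_per_100(toy_listings, toy_population)
for state, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state}: {rate:.2f} listings per 100 people")
```

The resulting rate per state is the value that would be passed to whatever plotting library draws the choropleth.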

What made this discovery especially compelling was that, when we checked Google Trends for Indeed, the site received much more traffic from users in the South. This discrepancy became the foundation for our presentation.

Presentation

Using the power of our insights, we decided to take a business approach to our presentation. Instead of expecting the platform to adjust to the offset in demand, we created a new business strategy for Indeed so that it could take an active role in capturing the market.

The slides we used in our presentation are available here.