Job Market Text Mining

  • Scraped over 1000 job descriptions from Indeed web pages using Python
  • Visualized the companies and the locations with the highest number of job listings
  • Calculated the average minimum and maximum salary
  • Engineered features from the text of each job description to quantify the value companies put on specific tools, platforms, skills and data science roles
  • Created a word cloud highlighting the most frequently used words in job descriptions

GitHub


Which are the companies with the highest number of job listings?

Notice that Microsoft has the highest number of jobs, followed by Amazon and Zillow. There are 69 companies that posted only 1 job and 28 companies that listed 2 jobs. Most companies have between 1 and 10 jobs.
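
Counts like these could be produced along the following lines. This is a minimal sketch, assuming the scraped listings are available as a list with one company name per job; the sample names and numbers are made up for illustration and are not the project's data.

```python
from collections import Counter

# Hypothetical sample: one company name per scraped job listing.
companies = ["Microsoft", "Amazon", "Microsoft", "Zillow", "Amazon", "Microsoft", "Acme"]

# Number of listings per company, most frequent first.
listings_per_company = Counter(companies)
top_companies = listings_per_company.most_common(3)

# How many companies posted exactly n jobs (e.g. how many posted only 1).
companies_per_listing_count = Counter(listings_per_company.values())
```

Here `companies_per_listing_count[1]` gives the number of companies with a single listing, which is the kind of figure quoted above.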


What is the frequency of words for specific tools and platforms, skills, and roles?

Tableau is the most frequently mentioned data analytics tool, while Azure is the most frequently mentioned cloud computing platform. Data scientist roles are in higher demand than data analyst roles. Other roles, such as data engineer or machine learning engineer, appear less often. Most positions are senior level, followed by intern and entry level.
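
One way to turn description text into such frequencies is to count how many descriptions mention each term. The sketch below assumes that approach; the `KEYWORDS` groups and the `count_mentions` helper are hypothetical, since the project's actual term lists are not given here.

```python
import re

# Hypothetical keyword groups, chosen only for illustration.
KEYWORDS = {
    "tools": ["tableau", "excel", "power bi"],
    "platforms": ["azure", "aws"],
    "roles": ["data scientist", "data analyst", "data engineer"],
}

def count_mentions(descriptions, keywords=KEYWORDS):
    """Return, per group, how many descriptions mention each term at least once."""
    counts = {group: {term: 0 for term in terms} for group, terms in keywords.items()}
    for text in descriptions:
        lowered = text.lower()
        for group, terms in keywords.items():
            for term in terms:
                # Word boundaries avoid matching a term inside a longer word.
                if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                    counts[group][term] += 1
    return counts

sample = [
    "Senior Data Scientist with Azure and Tableau experience",
    "Data Analyst intern: Tableau, Excel",
]
mentions = count_mentions(sample)
```

Counting each description at most once per term keeps a single posting that repeats a word many times from dominating the totals.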


Which are the most frequent words in job descriptions?

A count of all words that occur in job descriptions and job titles highlights the importance of specific words. As the word cloud shows, the most frequent word is data, followed by machine learning, experience, engineering and customer.
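
The word count behind a word cloud can be sketched as follows. This is an illustrative version only: the `word_frequencies` helper and the short stop-word list are assumptions, and it counts single words, so a phrase such as machine learning would need extra handling (for example, counting word pairs).

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real run would likely use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "with", "for", "is"}

def word_frequencies(texts):
    """Count lowercase word tokens across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts

sample = [
    "Experience with data pipelines and machine learning",
    "Analyze customer data to improve the data platform",
]
freqs = word_frequencies(sample)
```

Calling `freqs.most_common(5)` then lists the words that would appear largest in the cloud.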


What is the average salary?

Calculating the average salary involved checking the formatting of the salary values, then defining a function to clean each value, split it into minimum and maximum figures, and compute the averages. The results show that the average salary for a data scientist or data analyst role ranges from a minimum of $115,887 a year to a maximum of $159,665 a year.
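
A function of this kind might look like the sketch below. It assumes salary strings in a form such as "$100,000 - $150,000 a year"; the `parse_salary` name, the sample strings, and the resulting numbers are hypothetical, and hourly or monthly figures would need converting to a yearly amount first.

```python
import re
from statistics import mean

def parse_salary(text):
    """Return (minimum, maximum) salary from a string, or None if no figure is found."""
    # Grab dollar amounts such as "$115,000" or "$95,500.50".
    amounts = [float(m.replace(",", "")) for m in re.findall(r"\$([\d,]+(?:\.\d+)?)", text)]
    if not amounts:
        return None
    # A single figure serves as both minimum and maximum.
    return min(amounts), max(amounts)

# Made-up sample values, including a listing without a salary.
sample = ["$100,000 - $150,000 a year", "$120,000 a year", "Not disclosed"]
parsed = [p for p in (parse_salary(s) for s in sample) if p is not None]

# Average of the lower bounds and of the upper bounds across listings.
avg_min = mean(low for low, _ in parsed)
avg_max = mean(high for _, high in parsed)
```

Listings without a salary are skipped rather than counted as zero, so they do not pull the averages down.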