Analyzing LinkedIn Job Recommendations with R

LinkedIn is a powerful tool used to network, find jobs and spy on people (I joke). Among other things, LinkedIn has some powerful recommendation features that push job recommendations your way. A few times a week LinkedIn sends me recommended positions I should apply for, and recently I’ve been getting some weird suggestions. I’ve gotten recommendations for Senior .NET Developer and Senior Database Administrator, for example. I don’t know .NET, and my experience with SQL is far too limited to be senior anything. In any event, I wanted to put some rigor behind my thoughts on this issue and see if these job recommendations were just one-offs or if I should consider a career change.

Data

I grabbed an IFTTT recipe that takes new LinkedIn job recommendations and puts them in a Google Drive spreadsheet. I was able to capture some info about every job, including where the job was, the posting company, the industry, a job description, salary and a nice company icon. I wanted to use this data to get a better picture of the types of recommendations LinkedIn was serving up to me. First, let’s take a look at where most of the jobs are:



To make this plot, I had to fix a few things about the location data. I had to remove words like “Greater” and “Area” so that the getGeoCode function would work. This was pretty straightforward with the stringr package. As we can see, the jobs are all over the U.S., with the majority of them in New York, SLC and San Francisco. As for New York City and San Francisco, please don’t tempt me, LinkedIn.
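For the curious, the cleanup step might look something like the sketch below. This is not the code from my gist: the location strings are made up for illustration, and I've swapped in base R's `gsub` and `trimws` so the snippet runs without any packages (stringr's `str_replace_all` would do the same job).

```r
# Hypothetical location strings in the style LinkedIn reports them
locations <- c("Greater New York City Area",
               "Greater Salt Lake City Area",
               "San Francisco Bay Area")

# clean_location is an illustrative helper name, not from the original code
clean_location <- function(x) {
  x <- gsub("\\b(Greater|Area)\\b", "", x, perl = TRUE)  # drop the filler words
  x <- gsub("\\s+", " ", x, perl = TRUE)                 # collapse repeated spaces
  trimws(x)                                              # strip leading/trailing spaces
}

cleaned <- clean_location(locations)
print(cleaned)
```

The cleaned strings are what would then get handed to the geocoding function.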


Now let’s take a look at the most frequently occurring industries. Marketing and Advertising is the most frequent, followed by Computer Software and Internet. The numbers might actually be higher, as the industry field can span multiple industries for a single job. You can search through the original dataset and see the overlap.

  1. Marketing and Advertising - 43
  2. Computer Software - 25
  3. Internet - 23
  4. Information Technology and Services - 20
  5. Computer Software - 9
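A ranking like the one above can be produced with a couple of lines of base R. The sketch below uses a small made-up `industry` vector in place of the real spreadsheet column, and it trims whitespace first on the assumption that stray spaces could otherwise split one label into two rows.

```r
# Hypothetical industry column; the real one came from the IFTTT spreadsheet
industry <- c("Marketing and Advertising", "Internet", "Computer Software",
              "Marketing and Advertising", "Computer Software ",
              "Marketing and Advertising")

# Normalize whitespace so "Computer Software " and "Computer Software" match
industry <- trimws(industry)

# Count each label and sort from most to least frequent
counts <- sort(table(industry), decreasing = TRUE)
top <- head(counts, 5)
print(top)
```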

If you look through the job titles in the data table, there is a considerable amount of marketing-related work. The job titles were interesting, but I wanted to look at the descriptions and see what trends lay in the natural text. I used Latent Dirichlet Allocation (LDA) to model the descriptions and figure out what the most frequent topics were. I included the code in a gist for this post.

10 Terms Occurring More Than 30 Times:

  1. Advanced
  2. Analytics/Analysis/Analytical
  3. Business
  4. Clients/Client
  5. Candidate
  6. Data
  7. Company
  8. Customer
  9. Development
  10. Director
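A frequent-terms list like this one comes down to tokenizing the descriptions, counting words and keeping those above a threshold. Here is a minimal base R sketch of that idea. The descriptions, the tiny stop word list and the threshold of 1 are all placeholders for illustration (the list above used a threshold of 30 on the full dataset).

```r
# Hypothetical job descriptions standing in for the spreadsheet column
descriptions <- c("Advanced analytics for business clients.",
                  "Data analytics and business development.",
                  "Director of data and customer analytics.")

# Lowercase, replace punctuation with spaces, split on whitespace
text  <- gsub("[[:punct:]]", " ", tolower(descriptions))
words <- unlist(strsplit(text, "[[:space:]]+"))
words <- words[nchar(words) > 0]

# Drop a few filler words (a toy stop word list, not a complete one)
stop_words <- c("and", "of", "for", "the")
words <- words[!(words %in% stop_words)]

# Keep terms that occur more than `threshold` times
threshold <- 1
freq <- table(words)
frequent <- names(freq[freq > threshold])
print(frequent)
```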

First Topic Set

  1. Analytics
  2. Data
  3. Marketing
  4. Insights
  5. Responsible

Second Topic Set

  1. Will
  2. Analytics
  3. Business
  4. Marketing
  5. Data
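For anyone who wants to try topic sets like these on their own data, here is a rough sketch of fitting LDA in R. It is not the code from my gist: it assumes the `tm` and `topicmodels` packages are installed, the descriptions are invented, and the choices of two topics and a fixed seed are arbitrary.

```r
library(tm)           # assumed installed; builds the document-term matrix
library(topicmodels)  # assumed installed; provides LDA() and terms()

# Hypothetical job descriptions standing in for the real dataset
descriptions <- c(
  "Responsible for marketing analytics and customer insights reporting.",
  "Build data pipelines and dashboards for business analytics teams.",
  "Lead digital marketing campaigns and measure performance with data.",
  "Develop software services and maintain database infrastructure."
)

# Turn the raw text into a document-term matrix
corpus <- VCorpus(VectorSource(descriptions))
dtm <- DocumentTermMatrix(
  corpus,
  control = list(tolower = TRUE, removePunctuation = TRUE, stopwords = TRUE)
)

# LDA() fails on documents with no remaining terms, so drop empty rows
dtm <- dtm[slam::row_sums(dtm) > 0, ]

# Fit a two-topic model and pull the five top terms per topic
fit <- LDA(dtm, k = 2, control = list(seed = 42))
top_terms <- terms(fit, 5)
print(top_terms)
```

With so few documents the topics themselves won't mean much; the point is just the shape of the workflow.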


These topics are what you’d expect given my profile. For example, I’ve worked in the web analytics space, which often sits in the marketing department. While I wanted to show how LinkedIn was getting me useless job recommendations, they are actually pretty on point. There are other ways I could analyze this data, and I will look to do that in a future post. For example, I could use dictionaries (hash tables) to get a better idea of the programming languages or technologies in the descriptions. In any event, I won’t be deleting my LinkedIn profile anytime soon.


©2010–2014 Jowanza's World