Rate Prediction
Making realistic estimations about an individual's worth is necessary for forecasting.
At the time of job application, an estimate of the applicant's technical and soft skills is made. This has always been based on human experience. Removing the human factor allows for a more objective estimate.
Step 1: Human Analysis
The first step is to look at how recruiters and sales employees currently make their estimations.
Most of the information, bar the soft skills, is available on the CV. So that's where we started: would a prediction based solely on CV data be able to provide accurate results?
To figure that out, we had to find a good way to extract data from the CVs. But first, which information do we extract? We settled on:
Start Year (when the individual started working)
Age (or date of birth)
Number of employments
Department (.NET, Java, ...)
Degree
Employment (Payroll or contractor)
Province (Antwerp, East-Flanders, ...)

Step 2: Extracting Data
After deciding on which data to extract, we had to find a way to automate the information extraction.
We chose to self-host a version of Llama on our own AI Platform, which manages auto-scaling and so forth.
This way we could send our AI a CV-file and have it return all required data as a JSON object.
We ended up with a CSV of approximately 13,000 CV records, each including the actual selling rate.
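A minimal sketch of that last step, turning one JSON response into a CSV row. The field names, the `CvRecord` class and the sample payload are illustrative assumptions, not our actual schema, and the call to the self-hosted model is left out. Only the CV-derived columns are covered here.

```python
import csv
import io
import json
from dataclasses import dataclass, asdict, fields


@dataclass
class CvRecord:
    # Hypothetical field names mirroring the list of extracted data.
    start_year: int
    age: int
    employments: int
    department: str
    degree: str
    employment: str  # "payroll" or "contractor"
    province: str


def parse_cv_json(raw: str) -> CvRecord:
    """Parse the model's JSON answer and fail loudly on missing keys."""
    data = json.loads(raw)
    names = [f.name for f in fields(CvRecord)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return CvRecord(**{n: data[n] for n in names})


def to_csv(records: list[CvRecord]) -> str:
    """Serialise records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(CvRecord)])
    writer.writeheader()
    for r in records:
        writer.writerow(asdict(r))
    return buf.getvalue()


# Made-up example payload, standing in for one model response.
sample = json.dumps({
    "start_year": 2015, "age": 32, "employments": 3,
    "department": ".NET", "degree": "Bachelor",
    "employment": "payroll", "province": "Antwerp",
})
record = parse_cv_json(sample)
csv_text = to_csv([record])
```

Rejecting responses with missing keys keeps incomplete extractions out of the training set instead of silently filling them with blanks.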
Step 3: Building the model
We built our model as a neural network in PyTorch. What we noticed is that predicting a number, namely the rate in euros, from data that contains no euro amounts at all is quite difficult.
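For readers who have not built a regression network before, here is a rough sketch of what such a model can look like in PyTorch. The layer sizes, the number of input features, the optimizer and the placeholder data are all assumptions made for illustration and do not describe our actual architecture.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Assumption: 16 numeric inputs after encoding the categorical fields.
N_FEATURES = 16

model = nn.Sequential(
    nn.Linear(N_FEATURES, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),  # single output: the predicted rate
)

# Synthetic placeholder data, only there to make the sketch runnable.
X = torch.randn(256, N_FEATURES)
y = X.sum(dim=1, keepdim=True)

loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Plain full-batch training loop, recording the loss per epoch.
losses = []
for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```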
Our first attempt ended up looking something like this:

Not very good. This graph means that we only reach an accuracy of 43.70% with a delta of 1 euro. That means we're not very close to the actual results.
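One way to read "accuracy with a delta of 1 euro" is as the share of predictions that land within 1 euro of the actual rate. Assuming that reading, the metric could be computed roughly like this. The function name and the sample rates are made up for illustration.

```python
def accuracy_within_delta(predicted, actual, delta=1.0):
    """Share of predictions whose absolute error is at most `delta` euros."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one prediction")
    hits = sum(abs(p - a) <= delta for p, a in zip(predicted, actual))
    return hits / len(predicted)


# Made-up rates in euros, for illustration only.
predicted = [80.5, 92.0, 70.0, 101.2]
actual = [80.0, 95.0, 70.9, 101.0]
score = accuracy_within_delta(predicted, actual)
```

In this toy example three of the four predictions fall within 1 euro, so `score` is 0.75.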
One strategy to improve your model is to add a value that kind of correlates with the result you want. In our case that would be the selling price. Yet we can't include the selling price in our training data, because then we'd need the rate to predict... the rate.
We chose to go for a grouping strategy. This means you group your training data based on a few parameters, let's say age, department and degree. Then we add the average rate of each group to our training data.
So we add one parameter:
Start Year (when the individual started working)
Age (or date of birth)
Number of employments
Department (.NET, Java, ...)
Degree
Employment (Payroll or contractor)
Province (Antwerp, East-Flanders, ...)
Average rate for people with the same start year, age and degree.
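The grouping idea can be sketched in a few lines of plain Python. The column names, the helper functions and the sample rows are hypothetical, and falling back to the overall average for groups that never occurred in training is our own assumption for this sketch.

```python
from collections import defaultdict
from statistics import mean

GROUP_KEYS = ("start_year", "age", "degree")


def build_group_averages(rows):
    """Map each (start_year, age, degree) group to its mean rate."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in GROUP_KEYS)].append(row["rate"])
    return {key: mean(rates) for key, rates in groups.items()}


def add_group_average(row, averages, fallback):
    """Return a copy of `row` with the group average as an extra feature."""
    key = tuple(row[k] for k in GROUP_KEYS)
    return {**row, "group_avg_rate": averages.get(key, fallback)}


# Made-up training rows, for illustration only.
train = [
    {"start_year": 2015, "age": 32, "degree": "Bachelor", "rate": 80.0},
    {"start_year": 2015, "age": 32, "degree": "Bachelor", "rate": 90.0},
    {"start_year": 2010, "age": 38, "degree": "Master", "rate": 100.0},
]
averages = build_group_averages(train)
overall = mean(r["rate"] for r in train)

# A new CV has no rate yet, but its group average can still be looked up.
new_cv = {"start_year": 2015, "age": 32, "degree": "Bachelor"}
enriched = add_group_average(new_cv, averages, fallback=overall)
```

Here `enriched["group_avg_rate"]` is 85.0, the mean of the two matching training rows.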
The average can easily be retrieved when doing the actual prediction. Let's retrain our model and check the results.

Bang! Much better. This means we can now predict a consultant's rate with a delta of 1 euro with an accuracy of almost 85%.
Interested in getting your own rate prediction model or similar? Get in touch!
