With the advances in natural language processing and big data analytics, the labor market community has introduced the emerging field of Labor Market Intelligence (LMI). This field aims to design and use Artificial Intelligence (AI) algorithms and frameworks that analyze labor market data to support policy and decision-making. This paper addresses the automatic classification of free-text Web job vacancies into a standard taxonomy of occupations. To this end, we draw on well-established approaches for extracting textual features, which are then used to train machine learning algorithms. Our models were trained and evaluated on data extracted from online sources, pre-processed, and hand-annotated following the ISCO taxonomy. The results show that the proposed model is promising. Its main advantage is its simplicity: applied to a relatively small and difficult-to-clean dataset, it achieved good accuracy. Finally, we discuss how real-life applications for skill anticipation and matching could benefit from our approach.
*** Title, author list and abstract as seen in the Camera-Ready version of the paper that was provided to Conference Committee. Small changes that may have occurred during processing by Springer may not appear in this window.