LEAP: Linking Employee Aspirations to job Positions using natural language processing

Bibliographic details
Main author: Afonso Eurico Cerqueira Pereira Santos (author)
Format: masterThesis
Language: eng
Published: 2021
Subjects:
Full text: https://hdl.handle.net/10216/137287
Country: Portugal
OAI: oai:repositorio-aberto.up.pt:10216/137287
Description
Abstract: Much of the information processed by organisations nowadays comes in the form of unstructured text. Using natural language processing tools, it is possible to extract valuable information from huge datasets that would otherwise remain inaccessible to the various business units. This dissertation focuses on the task of establishing links between employee comments and a broad set of professional occupations. Using machine learning tools, it aims to leverage the features of the information provided to link an employee's career perspectives with the jobs for which they may be eligible. Several natural language processing techniques were applied for information extraction, including stemming, lemmatization and part-of-speech tagging. Given that the datasets used are somewhat unbalanced and composed of relatively short texts, this study focused heavily on usability and knowledge extraction. The first experiment approached a multiclass problem by applying a clustering algorithm: an attempt is made to find the likely category for a given professional occupation, in order to define a set of distinct labour market areas. The second experiment uses a set of classifiers, including SVM and Naive Bayes, to assign each job description to a predetermined cluster. Finally, a similarity algorithm was implemented to link the aspirations of a given professional to the set of most suitable professional occupations. The resulting models are a starting point for applying more complex natural language processing algorithms, or for strengthening human resources taxonomies that allow a more exhaustive analysis of the job descriptions performed by professionals in the labour market.
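
The final step described in the abstract, linking an employee's stated aspiration to the most suitable occupations, can be illustrated with a minimal sketch. The abstract does not say which similarity algorithm the dissertation uses, so this example assumes TF-IDF weighting with cosine similarity, a common baseline for short texts. The function names, the `OCCUPATIONS` data and the sample aspiration are hypothetical and are not taken from the thesis; stemming and lemmatization are omitted to keep the sketch dependency-free.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (no stemming or lemmatization)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, so terms present in every document keep a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either one is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_occupations(aspiration, occupations, top_k=3):
    """Rank occupation descriptions by similarity to an aspiration text."""
    names = list(occupations)
    vectors = tfidf_vectors([occupations[n] for n in names] + [aspiration])
    query = vectors[-1]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, vectors[:-1])]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical occupation descriptions, for illustration only.
OCCUPATIONS = {
    "data analyst": "analyse data, build reports and dashboards, statistics",
    "software developer": "design, write and test software applications and code",
    "recruiter": "interview candidates, manage hiring and recruitment processes",
}

if __name__ == "__main__":
    aspiration = "I would like to work with statistics and data reports"
    for name, score in rank_occupations(aspiration, OCCUPATIONS):
        print(f"{name}: {score:.3f}")
```

In a fuller pipeline of the kind the abstract outlines, the candidate occupations could first be narrowed to the cluster predicted by a classifier, and the tokens could be stemmed or lemmatized before weighting; both refinements are left out here.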