Using Google Trends, Gaussian Mixture Models and DBSCAN for the estimation of Twitter user home location


Bibliographic details
Main author: Zola, Paola (author)
Other authors: Cortez, Paulo (author), Tesconi, Maurizio (author)
Format: conferencePaper
Language: eng
Published in: 2020
Subjects:
Full text: http://hdl.handle.net/1822/68509
Country: Portugal
OAI: oai:repositorium.sdum.uminho.pt:1822/68509
Description
Abstract: In this work we propose a novel approach to estimate the home location of Twitter users. Given a list of Twitter users, we extract their timelines (up to 3,200 tweets each) using the Twitter Application Programming Interface (API) service. We use Google Trends to obtain a list of cities in which the nouns used by a specific Twitter user are most popular. Then, based on word popularity, we sample geographical coordinates (latitude, longitude) over the entire surface of the world. Finally, the Gaussian Mixture Model and DBSCAN clustering algorithms are applied to estimate the users' geographic coordinates. The results are evaluated using the mean and median error computed on the Haversine distance. Competitive results are achieved when compared with a baseline approach that estimates each user's location as the mode of the Google Trends cities.
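
The evaluation step named in the abstract (mean and median error on the Haversine distance) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the Earth radius constant, and the sample coordinates are assumptions made for illustration only.

```python
import math
from statistics import mean, median

# Mean Earth radius in kilometres (an assumed constant; the record does not state one).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def location_errors(predicted, actual):
    """Return (mean, median) Haversine error in km over paired coordinate lists."""
    distances = [haversine_km(p[0], p[1], a[0], a[1])
                 for p, a in zip(predicted, actual)]
    return mean(distances), median(distances)


if __name__ == "__main__":
    # Hypothetical estimated and true home coordinates, for illustration only.
    predicted = [(38.7, -9.1), (41.2, -8.6), (43.7, 10.4)]
    actual = [(38.72, -9.14), (41.15, -8.61), (45.46, 9.19)]
    mean_err, median_err = location_errors(predicted, actual)
    print(f"mean error: {mean_err:.1f} km, median error: {median_err:.1f} km")
```

Reporting the median alongside the mean is useful here because a few users placed on the wrong continent can dominate the mean error while leaving the median largely unchanged.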