Using Google Trends, Gaussian Mixture Models and DBSCAN for the estimation of Twitter user home location

Bibliographic Details
Main Author: Zola, Paola (author)
Other Authors: Cortez, Paulo (author), Tesconi, Maurizio (author)
Format: conferencePaper
Language: eng
Published: 2020
Online Access: http://hdl.handle.net/1822/68509
Country: Portugal
Oai: oai:repositorium.sdum.uminho.pt:1822/68509
Description
Summary: In this work we propose a novel approach to estimate the home location of Twitter users. Given a list of Twitter users, we extract their timelines (up to 3,200 tweets each) using the Twitter Application Programming Interface (API). We use Google Trends to obtain a list of cities in which the nouns used by a specific Twitter user are most popular. Then, based on word popularity, we sample geographical coordinates (latitude, longitude) across the entire surface of the world. Finally, the Gaussian Mixture Model and DBSCAN clustering algorithms are applied to estimate the users’ geographic coordinates. The results are evaluated using the mean and median error computed with the Haversine distance. Competitive results are achieved when compared with a baseline approach that estimates the users’ location from the Google Trends city mode.
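
The evaluation step described in the summary (mean and median error over Haversine distances) can be illustrated with a minimal sketch. This is not the authors' code: the function names `haversine_km` and `error_summary`, the Earth radius constant, and the use of kilometres are assumptions made only for this example.

```python
import math
from statistics import mean, median

# Mean Earth radius in kilometres (an assumption of this sketch;
# the record does not state which radius or unit the authors used).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def error_summary(predicted, actual):
    """Return (mean, median) Haversine error over paired (lat, lon) tuples."""
    errors = [haversine_km(p[0], p[1], t[0], t[1])
              for p, t in zip(predicted, actual)]
    return mean(errors), median(errors)


if __name__ == "__main__":
    # Hypothetical coordinates, used only to show the call pattern.
    predicted = [(41.15, -8.61), (38.72, -9.14)]
    actual = [(41.55, -8.42), (38.72, -9.14)]
    mean_err, median_err = error_summary(predicted, actual)
    print(f"mean error: {mean_err:.1f} km, median error: {median_err:.1f} km")
```

Reporting the median alongside the mean is useful here because a few users placed on the wrong continent can inflate the mean while leaving the median largely unchanged.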