Novel Trends in Scaling Up Machine Learning Algorithms


Bibliographic details
Main author: Lopes, Noel (author)
Other authors: Ribeiro, Bernardete (author)
Format: article
Language: English
Published: 2018
Full text: http://hdl.handle.net/10314/4183
Country: Portugal
OAI: oai:bdigital.ipg.pt:10314/4183
Description
Abstract: Big Data has been a catalyst for the Machine Learning (ML) area, forcing us to rethink existing strategies in order to create innovative solutions that will push the field forward. This paper presents an overview of the strategies for using machine learning in Big Data, with emphasis on high-performance parallel implementations on many-core hardware. The rationale is to increase the practical applicability of ML implementations to large-scale data problems. The common underlying thread has been the recent progress in the usability, cost effectiveness and diversity of parallel computing platforms, specifically Graphics Processing Units (GPUs), tailored for a broad set of data analysis and Machine Learning tasks. In this context, we present the main outcomes of the GPU Machine Learning Library (GPUMLib) framework, which empowers researchers to tackle larger and more complex problems by using high-performance implementations of well-known ML algorithms. Moreover, we attempt to give insights into the future trends of Big Data Analytics and the challenges lying ahead.