Resumo: | Big Data has been a catalyst force for the Machine Learning (ML) area, forcing us to rethink existing strategies in order to create innovative solutions that will push forward the field. This paper presents an overview of the strategies for using machine learning in Big Data with emphasis on the high-performance parallel implementations on many-core hardware. The rationale is to increase the practical applicability of ML implementations to large-scale data problems. The common underlying thread has been the recent progress in usability, cost effectiveness and diversity of parallel computing platforms, specifically, the Graphics Processing Units (GPUs), tailored for a broad set of data analysis and Machine Learning tasks. In this context, we provide the main outcomes of a GPU Machine Learning Library (GPUMLib) framework, which empowers researchers with the capacity to tackle larger and more complex problems, by using high-performance implementations of wellknown ML algorithms. Moreover, we attempt to give insights on the future trends of Big Data Analytics and the challenges lying ahead.
|