Benchmarking polystores: the CloudMdsQL experience

Bibliographic Details
Main Author: Kolev, Boyan (author)
Other Authors: Pau, Raquel (author), Levchenko, Oleksandra (author), Valduriez, Patrick (author), Jimenez-Peris, Ricardo (author), Pereira, José (author)
Format: Conference paper
Language: eng
Published: 2016
Subjects:
Online Access: http://hdl.handle.net/1822/52872
Country: Portugal
OAI: oai:repositorium.sdum.uminho.pt:1822/52872
Description
Summary: The CloudMdsQL polystore provides integrated access to multiple heterogeneous data stores, such as RDBMS, NoSQL or even HDFS through a big data analytics framework such as MapReduce or Spark. The CloudMdsQL language is a functional SQL-like query language with a flexible nested data model. A major capability is to exploit the full power of each of the underlying data stores by allowing native queries to be expressed as functions and used in SQL statements. The CloudMdsQL polystore has been validated with a variety of data stores: HDFS, key-value, document, graph, RDBMS and an OLAP engine. In this paper, we introduce the benchmarking of the CloudMdsQL polystore and evaluate the performance benefits of important features enabled by the query language and engine.