Tyrex | Project

Distributed SPARQL Evaluation

SPARQL is the W3C standard language for querying data expressed in the Resource Description Framework (RDF). The growing volume of available RDF data creates a strong need for, and research interest in, efficient and scalable distributed SPARQL query evaluators. In this context, we propose strategies for storing RDF datasets in a distributed manner and methods for querying them efficiently.
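To make the storage and querying problem concrete, here is a minimal, single-machine sketch in Python. It is not the implementation of any evaluator listed on this page. It assumes a toy dataset, groups triples by predicate (one possible storage layout among many), and evaluates a basic graph pattern by extending variable bindings one triple pattern at a time. All names (`TRIPLES`, `partition_by_predicate`, `evaluate_bgp`) are hypothetical, and predicates are assumed to be constants.

```python
from collections import defaultdict

# Hypothetical toy dataset of (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "inria"),
    ("carol", "worksAt", "inria"),
]


def partition_by_predicate(triples):
    """Group (subject, object) pairs under their predicate.

    In a distributed setting each group could live on a different node;
    here everything stays in one in-memory dictionary.
    """
    store = defaultdict(list)
    for s, p, o in triples:
        store[p].append((s, o))
    return store


def is_var(term):
    """Variables are written SPARQL-style, with a leading '?'."""
    return term.startswith("?")


def match_pattern(store, pattern, binding):
    """Yield every extension of `binding` that satisfies one triple pattern."""
    s, p, o = pattern
    # Assumption of this sketch: the predicate is always a constant.
    for subj, obj in store.get(p, []):
        new = dict(binding)
        ok = True
        for term, value in ((s, subj), (o, obj)):
            if is_var(term):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def evaluate_bgp(store, patterns):
    """Evaluate a conjunction of triple patterns, one pattern at a time."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings
                    for b2 in match_pattern(store, pattern, b)]
    return bindings


STORE = partition_by_predicate(TRIPLES)
# Rough equivalent of: SELECT ?x ?y WHERE { ?x knows ?y . ?y worksAt inria }
RESULT = evaluate_bgp(STORE, [("?x", "knows", "?y"), ("?y", "worksAt", "inria")])
print(RESULT)
```

In this sketch the second pattern only reads the `worksAt` group, which hints at why the choice of storage layout matters: a query touches only the partitions its predicates name. A real distributed evaluator must additionally decide where partitions live and how joins are shipped across nodes.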

Experimental Studies

This project is strongly backed by practical experiments and cluster tuning: this is where theoretical concepts face the test of reality. We therefore compare our systems with competitors from the scientific literature.

  1. We systematically benchmarked a panel of open-source, distributed SPARQL evaluators that are recent or popular in the state of the art, namely 4store, CliqueSquare, CumulusRDF, CouchBaseRDF, S2RDF, RYA and PigSPARQL. We present tutorials and the obtained results here.
  2. We also propose a new set of criteria for ranking SPARQL evaluators, designed specifically for a distributed context: velocity, immediacy, dynamicity, parsimony and resiliency.

Sources

All the SPARQL evaluators we developed are openly available under the terms of the CeCILL license on the team's GitHub, along with other related software. In this particular project, we share the following evaluators:

  • SPARQLGX
    An Efficient Distributed SPARQL Evaluator Based on Apache Spark.

  • RDFHive
    A Direct Evaluator of SPARQL on top of Apache Hive.

  • SDE (i.e. SPARQLGX as a Direct Evaluator)
    A Solution to Directly Evaluate SPARQL using Apache Spark.

Related Publications

  • [CONFERENCE] SPARQLGX : Une Solution Distribuée pour RDF Traduisant SPARQL vers Spark [HAL, PDF, Abstract]
    Damien Graux, Louis Jachiet, Pierre Genevès, Nabil Layaïda
    BDA 2016 - 32ème Conférence sur la Gestion de Données - Principes, Technologies et Applications, Nov 2016, Poitiers, France.

  • [CONFERENCE] SPARQLGX in Action: Efficient Distributed Evaluation of SPARQL with Apache Spark [HAL, PDF, Abstract]
    Damien Graux, Louis Jachiet, Pierre Genevès, Nabil Layaïda
    15th International Semantic Web Conference (ISWC 2016 demo paper), Oct 2016, Kobe, Japan.

  • [CONFERENCE] SPARQLGX: Efficient Distributed Evaluation of SPARQL with Apache Spark [HAL, PDF, Abstract]
    Damien Graux, Louis Jachiet, Pierre Genevès, Nabil Layaïda
    The 15th International Semantic Web Conference, Oct 2016, Kobe, Japan. DOI: 10.1007/978-3-319-46547-0_9