CLEAR: Compilation of Intermediate Languages into Efficient Big Data Runtimes.

CLEAR is a research project funded by ANR, January 2017 – March 2022.

Leader: Pierre Genevès

Project Overview

This project addresses one fundamental challenge of our time: the construction of effective programming models and compilation techniques for the correct, efficient and scalable exploitation of large amounts of data. We investigate new theories for synthesizing code that is optimized for distributed big data platforms. We implement systems for querying large graphs, with applications notably in healthcare.

Major updates

March 2022: The CLEAR project ended successfully, with congratulations from ANR. The final project report is available upon request.

September 2021: We are developing methods for the distributed evaluation of graph queries: see our preprints.

September 2020: See our SIGMOD 2020 paper on extending the relational algebra for optimizing recursive queries, with an application to graph queries.

October 2018: Our work has an application in healthcare: we analyze very large amounts of electronic health records and train machine learning models for predicting risks of clinical outcomes such as adverse effects or in-hospital mortality. See more in our Big Data Research article, and in our follow-up paper presented at the DSAA conference.