Project Publications

Efficient Enumeration of Recursive Plans in Transformation-based Query Optimizers. Amela Fejza, Pierre Genevès and Nabil Layaïda. Proc. VLDB Endow. Vol. 17(11), 2024 (VLDB'24).
Abstract: Query optimizers built on the transformation-based Volcano/Cascades framework are used in many database systems. Transformations proposed earlier on the logical query dag (LQDAG) data structure, which is key in such a framework, focus only on recursion-free queries. In this paper, we propose the recursive logical query dag (RLQDAG) which extends the LQDAG with the ability to capture and transform recursive queries, leveraging recent developments in recursive relational algebra. Specifically, this extension includes: (i) the ability of capturing and transforming sets of recursive relational terms thanks to (ii) annotated equivalence nodes used for guiding transformations that are more complex in the presence of recursion; and (iii) RLQDAG rewrite rules that transform sets of subterms in a grouped manner, instead of transforming individual terms in a sequential manner; and that (iv) incrementally update the necessary annotations. Core concepts of the RLQDAG are formalized using a syntax and formal semantics with a particular focus on subterm sharing and recursion. The result is a clean generalization of the LQDAG transformation-based approach, enabling more efficient explorations of plan spaces for recursive queries. An implementation of the proposed approach shows significant performance gains compared to the state-of-the-art.
BibTeX:
@article{geneves-vldb2024,
  author = {Amela Fejza and Pierre Genevès and Nabil Layaïda},
  title = {Efficient Enumeration of Recursive Plans in Transformation-based Query Optimizers},
  journal = {Proc. VLDB Endow.},
  year = {2024},
  volume = {17},
  number = {11},
  pages = {3095--3108},
  url = {https://www.vldb.org/pvldb/vol17/p3095-geneves.pdf}
}
A Fast Plan Enumerator for Recursive Queries. Amela Fejza, Pierre Genevès and Nabil Layaïda. In 40th IEEE International Conference on Data Engineering, ICDE 2024, Utrecht, The Netherlands, May 13-16, 2024. IEEE, 2024 (ICDE'24).
Abstract: Plan enumeration is one of the most crucial components in relational query optimization. We demonstrate RLQDAG, a system implementation of a top-down plan enumerator for the purpose of transforming sets of recursive relational terms efficiently. We describe a complete system of query optimization with parsers and compilers adapted for recursive queries over knowledge and property graphs. We focus on the enumeration component of this system, the RLQDAG, and especially on its efficiency in generating plans out of reach of other approaches. We show graphical representations of explored plan spaces for queries on real datasets. We demonstrate the plan enumerator and its benefits in finding more efficient query plans.
BibTeX:
@inproceedings{geneves-icde2024,
  author = {Amela Fejza and Pierre Genevès and Nabil Layaïda},
  title = {A Fast Plan Enumerator for Recursive Queries},
  booktitle = {40th IEEE International Conference on Data Engineering, ICDE 2024, Utrecht, The Netherlands, May 13-16, 2024},
  publisher = {IEEE},
  year = {2024},
  pages = {5449--5452},
  url = {https://doi.org/10.1109/ICDE60146.2024.00425},
  doi = {10.1109/ICDE60146.2024.00425}
}
Reproduce, Replicate, Reevaluate. The Long but Safe Way to Extend Machine Learning Methods. Luisa Werner, Nabil Layaïda, Pierre Genevès, Jérôme Euzenat and Damien Graux. In Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024 (AAAI'24).
Abstract: Reproducibility is a desirable property of scientific research. On the one hand, it increases confidence in results. On the other hand, reproducible results can be extended on a solid basis. In rapidly developing fields such as machine learning, the latter is particularly important to ensure the reliability of research. In this paper, we present a systematic approach to reproducing (using the available implementation), replicating (using an alternative implementation) and reevaluating (using different datasets) state-of-the-art experiments. This approach enables the early detection and correction of deficiencies and thus the development of more robust and transparent machine learning methods. We detail the independent reproduction, replication, and reevaluation of the initially published experiments with a method that we want to extend. For each step, we identify issues and draw lessons learned. We further discuss solutions that have proven effective in overcoming the encountered problems. This work can serve as a guide for further reproducibility studies and generally improve reproducibility in machine learning.
BibTeX:
@inproceedings{geneves-aaai2024,
  author = {Luisa Werner and Nabil Layaïda and Pierre Genevès and Jérôme Euzenat and Damien Graux},
  editor = {Michael J. Wooldridge and Jennifer G. Dy and Sriraam Natarajan},
  title = {Reproduce, Replicate, Reevaluate. The Long but Safe Way to Extend Machine Learning Methods},
  booktitle = {Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2024, February 20-27, 2024, Vancouver, Canada},
  publisher = {AAAI Press},
  year = {2024},
  pages = {15850--15858},
  url = {https://doi.org/10.1609/aaai.v38i14.29515},
  doi = {10.1609/AAAI.V38I14.29515}
}
