I’m a data scientist passionate about artificial intelligence, with a background in computer science and mathematics. I currently work as a data scientist apprentice at Dassault Systèmes SE in Vélizy-Villacoublay, France, where I build AI tooling that helps developers fix code automatically using Large Language Models.
Experience
Data Scientist Apprentice — Dassault Systèmes SE
August 2021 – present · Vélizy-Villacoublay, France
R&D project to help developers fix bugs in their code automatically with AI (Large Language Models):
- Literature review of the state of the art in automatic code correction (deep-learning approaches).
- Pipelines for massive data collection (Spark, HDFS, Hive).
- Data exploration, analysis and pre-processing (e.g. abstract syntax tree representations).
- ML models proposing corrections (Seq2Seq, Graph Neural Networks, Transformers: GPT, BERT, T5).
- Evaluation, comparison and characterization of the models on real production data.
- Integration of models into production DevOps chains (web service for an IDE plugin).
- Presentation of results to customers.
Image Analysis & Statistical Support Intern — Enza Zaden France
May 2021 – August 2021 · 3-month remote internship
Enza Zaden is an international vegetable seed-breeding company that increasingly uses AI to support its breeding work:
- Image analysis by segmentation and statistical reporting in R.
- Application of ML algorithms to the extracted features for prediction.
- Optimization of execution times and deployment of the solution to the cloud (Azure).
Education
- Master of Artificial Intelligence, Systems, Data (apprenticeship track) — University of Paris Dauphine – PSL, Paris · 2022–2023
- Master of Data Science (apprenticeship track) — University of Paris-Saclay, Orsay · 2021–2022
- Bachelor’s Year 3, Computer Science & Mathematics for Decision-Making and Data — University of Paris Dauphine – PSL, Paris · 2020–2021
- Preparatory Cycle in Computer Science (Bac +2) — ESI – École Nationale Supérieure d’Informatique, Algiers · 2018–2020
Skills
Python · TensorFlow · Keras · PyTorch · Hugging Face · Google Cloud Platform · Spark · SQL