Research Engineer
Meta Superintelligence Labs - Fundamental AI Research (MSL - FAIR)
I'm a researcher in the Generative Modelling Foundations group at FAIR, advancing the foundations of flow matching and diffusion as well as their applications at scale. In 2024 I obtained a Ph.D. in information theory, generative modelling, and data compression from the University of Toronto and the Vector Institute for A.I. I spent most of grad school interning at Meta (FAIR Labs) and Google AI. Before grad school I worked as an electronics engineer (hardware/firmware for embedded systems) and as a machine learning engineer in recommendation systems and ML for health.
- (ICLR 2026) Efficient learning and inference of distributions over permutations applied to re-ranking in recommendation systems (https://arxiv.org/abs/2505.24664)
- (ICLR 2025 oral) Advancing flow matching pre-training (https://arxiv.org/abs/2412.03487)
- (NeurIPS 2025) Reducing the number of forward passes for code generation with masked diffusion models, without re-training (https://arxiv.org/abs/2505.24857)
- (Under Review) Improving the memory efficiency of vector databases (FAISS) via lossless compression, with no impact on search quality (https://arxiv.org/abs/2501.10479)
For a complete list, please see my Google Scholar profile.
I'm originally from Florianópolis, Brazil, but I've lived in NYC, Orlando, Toronto, São Paulo, and (now) Montréal, as well as a few smaller cities in the south of Brazil.
I obtained a Ph.D. from the University of Toronto and the Vector Institute for A.I. in information theory and generative modelling. My thesis studies, and proposes algorithms for, lossless compression of combinatorial objects such as graphs, multisets, and partitions. Thesis: Random Permutation Codes: Lossless Source Coding of Non-Sequential Data
- Slides from DCC 2025 (link) / (mirror), 2025
- ICML 2023 Workshop on Neural Compression and Information Theory, 2023
- Asymmetric Numeral Systems (ANS) codec in pure Python, 2021 (a toy sketch of the core idea appears after this list)
- A tutorial on bits-back with Huffman coding, 2021
- Vectorized Run-Length Encoding, 2021 (minimal sketch after this list)
- Persisting lru_cache to disk while using hashable pandas objects for parallel experiments, 2020
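To give a flavor of the ANS codec linked above, here is a toy, unbounded-precision rANS encoder/decoder. This is a sketch of the core mechanism only, not the repo's code: the frequency table is a made-up example, and a practical codec adds streaming/renormalization so the state fits in a machine word.

```python
# Toy rANS (range Asymmetric Numeral Systems) sketch.
# Made-up static frequency table; counts must sum to M.
freq = {"a": 3, "b": 1}
cum = {"a": 0, "b": 3}   # cumulative frequencies
M = 4

def encode(symbols):
    x = 1                         # unbounded int state, for clarity
    for s in reversed(symbols):   # rANS encodes in reverse order
        f, c = freq[s], cum[s]
        x = (x // f) * M + c + (x % f)
    return x

def decode(x, n):
    out = []
    for _ in range(n):
        r = x % M
        # find the symbol whose cumulative range contains r
        s = next(s for s in freq if cum[s] <= r < cum[s] + freq[s])
        out.append(s)
        x = freq[s] * (x // M) + r - cum[s]
    return out

msg = list("abab")
assert decode(encode(msg), len(msg)) == msg
```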
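Similarly, a minimal sketch of the vectorized run-length encoding idea, assuming NumPy and a non-empty 1-D array (the 2021 post is the authoritative version):

```python
import numpy as np

def rle_encode(a):
    """Encode a non-empty 1-D array into (run values, run lengths)."""
    a = np.asarray(a)
    # a run starts at index 0 and wherever the value changes
    starts = np.flatnonzero(np.concatenate(([True], a[1:] != a[:-1])))
    values = a[starts]
    lengths = np.diff(np.append(starts, a.size))
    return values, lengths

def rle_decode(values, lengths):
    return np.repeat(values, lengths)

a = np.array([5, 5, 5, 2, 2, 7])
values, lengths = rle_encode(a)   # values=[5 2 7], lengths=[3 2 1]
assert np.array_equal(rle_decode(values, lengths), a)
```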