I am a second-year PhD student in the newly founded Data Systems Lab at UTN, advised by Andreas Kipf. My current focus is on robust query processing.
Previously, I worked as a student research assistant in the TUM database group (Prof. Thomas Neumann) and in the DAML group (Prof. Stephan Günnemann).
During my studies, I completed two industry internships, at Oracle Labs and Amazon Redshift.
[Google Scholar] | [GitHub] | [Twitter]
Forget about $O(3^n)$-time dynamic programming evaluations.
DPconv: Super-Polynomially Faster Join Ordering
[code]
Mihail Stoian, Andreas Kipf
SIGMOD 2025
[Slides @Microsoft GSL]
TL;DR
Virtual: Compressing Data Lake Files
Mihail Stoian, Alexander van Renen, Jan Kobiolka, Ping-Lin Kuo, Andreas Zimmerer, Josif Grabocka, Andreas Kipf
EDBT Demo 2025
TL;DR
Virtual learns sparse regressors to level up Parquet file sizes, while having bounded column scan overhead in the number of reference columns.
Optimizing Linearized Join Enumeration by Adapting to the Query Structure
Altan Birler, Mihail Stoian, Thomas Neumann
BTW 2025
Lightweight Correlation-Aware Table Compression
[code]
[recording]
Mihail Stoian, Alexander van Renen, Jan Kobiolka, Ping-Lin Kuo, Josif Grabocka, Andreas Kipf
Table Representation Learning @ NeurIPS 2024
[Slides @TRL]
TL;DR
Virtual learns sparse regressors to level up Parquet file sizes, while having bounded column scan overhead in the number of reference columns.
Unified Mechanism-Specific Amplification by Subsampling and Group Privacy Amplification
[code]
The differential privacy framework for deriving mechanism-specific guarantees.
Jan Schuchardt, Mihail Stoian*, Arthur Kosmala*, Stephan Günnemann
NeurIPS 2024
TL;DR
DataLoom: Simplifying Data Loading with LLMs
[code]
Alexander van Renen, Mihail Stoian, Andreas Kipf
VLDB Demo 2024
Approximate Min-Sum Subset Convolution
First proposal for approximate min-sum subset convolution. This results in out-of-the-box exponential-time $(1 + \varepsilon)$-approximations for prize-collecting Steiner tree, min-cost $k$-coloring, protein networks, and more applications in computational biology.
Mihail Stoian
WAOA @ ALGO 2024
[Slides @WAOA]
TL;DR
Corra: Correlation-Aware Column Compression
Are you still using FOR, Delta, or RLE encodings? Correlation-aware column encodings can compress your data even better!
Hanwen Liu, Mihail Stoian, Alexander van Renen, Andreas Kipf
CloudDB @ VLDB 2024
TL;DR
On the Optimal Linear Contraction Order of Tree Tensor Networks, and Beyond
Polynomial-time contraction ordering algorithm for tree tensor networks for the total contraction cost. Extension of the well-known IKKBZ algorithm in databases.
Mihail Stoian, Richard Milbradt, Christian B. Mendl
SIAM Journal on Scientific Computing
→ Check out our package netzwerk with plug-in for opt_einsum and cotengra.
TL;DR
Fast Joint Shapley Values
[code] [recording]
Mihail Stoian
SRC @ SIGMOD 2023
Faster FFT-based Wildcard Pattern Matching
[code] [recording]
Mihail Stoian
SRC @ SIGMOD 2023
Concurrent Link-Cut Trees [code] [recording]
Mihail Stoian | Advised by Jana Giceva and Philipp Fent
SRC @ SIGMOD 2022
Towards Practical Learned Indexing [code] [recording]
Mihail Stoian, Andreas Kipf, Ryan Marcus, and Tim Kraska
AIDB @ VLDB 2021
Benchmarking Learned Indexes [blog] [code] [leaderboard]
Ryan Marcus, Andreas Kipf, Alexander van Renen, Mihail Stoian, Sanchit Misra, Alfons Kemper, Thomas Neumann, and Tim Kraska
VLDB 2021
RadixSpline: A Single-Pass Learned Index [code] [talk]
Andreas Kipf*, Ryan Marcus*, Alexander van Renen*, Mihail Stoian, Alfons Kemper, Tim Kraska, and Thomas Neumann
aiDM @ SIGMOD 2020
SOSD: A Benchmark for Learned Indexes [code]
Andreas Kipf*, Ryan Marcus*, Alexander van Renen*, Mihail Stoian, Alfons Kemper, Thomas Neumann, and Tim Kraska
ML For Systems @ NeurIPS 2019
Did Fourier Really Meet Möbius? Fast Subset Convolution via FFT
There is no need for Zeta/Möbius transforms in fast subset convolution. FFT suffices. Even in the same running time.
Mihail Stoian
TL;DR
TSP Escapes the $O(2^n n^2)$ Curse [proof-of-concept] [blog]
Mihail Stoian
DPconv: Super-Polynomially Faster Join Ordering
January, 2025
@Microsoft GSL
Virtual: Compressing World's Parquet Files
January, 2025
@TUMuchData
What do databases and tensor networks have in common?
August, 2023
@Universität Jena @Joachim Giesen's group. Check out their amazing tools: Matrix Calculus, used in RelationalAI's AutoDiff, and more!
mihail.stoian@utn.de