I am a second-year PhD student in the newly founded Data Systems Lab at UTN, advised by Andreas Kipf. My current focus is on robust query processing.
Previously, I worked as a student research assistant in the TUM database group (Prof. Thomas Neumann) and in the DAML group (Prof. Stephan Günnemann).
During my studies I did two industry internships at Oracle Labs and Amazon Redshift.
Whenever I see a problem in another area that resembles a database problem, I cannot leave it unsolved (see my blog).
[Google Scholar] | [GitHub] | [Twitter]
DPconv: Super-Polynomially Faster Join Ordering
[code]
Mihail Stoian, Andreas Kipf
SIGMOD 2025
TL;DR: Forget about $O(3^n)$-time dynamic programming evaluations.
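For context, here is a minimal sketch of the classic $O(3^n)$ subset dynamic program ("DPsub") for join ordering, i.e., the baseline this line of work speeds up. The $C_{out}$-style cost model and the toy cardinalities are illustrative assumptions, not the paper's setup.

```python
# A minimal sketch of the classic O(3^n) subset DP ("DPsub") for join ordering.
# The C_out-style cost model and the toy cardinalities are illustrative assumptions.
import math
from itertools import combinations

def dpsub(n, join_cost):
    """Cheapest bushy join plan cost over relations {0, ..., n-1} (bitmask DP)."""
    best = {1 << i: 0.0 for i in range(n)}                # single relations cost nothing
    for size in range(2, n + 1):
        for rels in combinations(range(n), size):
            S = sum(1 << r for r in rels)
            # Enumerate every split S = T | (S \ T); summed over all S, this is Theta(3^n).
            cands = []
            T = (S - 1) & S
            while T:
                cands.append(best[T] + best[S ^ T] + join_cost(T, S ^ T))
                T = (T - 1) & S
            best[S] = min(cands)
    return best[(1 << n) - 1]

# Toy usage: 4 relations, cost = estimated join output cardinality (independence assumed).
cards = [100, 50, 200, 10]
card = lambda S: math.prod(c for i, c in enumerate(cards) if S >> i & 1)
print(dpsub(4, lambda T, U: card(T | U)))
```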
Lightweight Correlation-Aware Table Compression
[code]
Mihail Stoian, Alexander van Renen, Jan Kobiolka, Ping-Lin Kuo, Josif Grabocka, Andreas Kipf
Table Representation Learning @ NeurIPS 2024
TL;DR: Virtual learns sparse regressors to shrink Parquet file sizes, while keeping column-scan overhead bounded in the number of reference columns.
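A rough sketch of the underlying idea: predict a target column from a reference column with a sparse regressor and store only the residuals, which have a small range and compress well. The Lasso regressor, column names, and data below are assumptions for illustration, not Virtual's actual pipeline.

```python
# Rough sketch of correlation-aware table compression, not Virtual's actual pipeline.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
base_price = rng.integers(10, 1_000, size=100_000).astype(float)            # reference column
taxed_price = 1.19 * base_price + rng.integers(0, 3, size=base_price.size)  # correlated target

model = Lasso(alpha=0.1).fit(base_price.reshape(-1, 1), taxed_price)
residuals = taxed_price - model.predict(base_price.reshape(-1, 1))

# Persist (model coefficients, residuals) instead of the raw target column;
# decoding is predict + residual, so scans only touch the reference column(s).
print(residuals.min(), residuals.max())   # tiny range -> cheap to encode
```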
Unified Mechanism-Specific Amplification by Subsampling and Group Privacy Amplification
[code]
Jan Schuchardt, Mihail Stoian*, Arthur Kosmala*, Stephan Günnemann
NeurIPS 2024
TL;DR: The differential privacy framework for deriving mechanism-specific guarantees.
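For context, the classic mechanism-agnostic amplification-by-subsampling bound for Poisson subsampling with rate $q$ is $\varepsilon' = \log(1 + q(e^{\varepsilon} - 1))$; the paper's point is to derive tighter, mechanism-specific guarantees, which the snippet below does not implement.

```python
# The generic amplification-by-subsampling bound eps' = log(1 + q * (exp(eps) - 1)).
# Baseline only; this does not implement the paper's mechanism-specific framework.
import math

def amplified_epsilon(eps: float, q: float) -> float:
    return math.log1p(q * math.expm1(eps))

print(amplified_epsilon(eps=1.0, q=0.01))  # ~0.017, much stronger than eps = 1.0
```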
DataLoom: Simplifying Data Loading with LLMs
[code]
Alexander van Renen, Mihail Stoian, Andreas Kipf
VLDB Demo 2024
Approximate Min-Sum Subset Convolution
[slides]
Mihail Stoian
WAOA @ ALGO 2024
TL;DR: First proposal of approximate min-sum subset convolution, yielding out-of-the-box exponential-time $(1 + \varepsilon)$-approximations for prize-collecting Steiner tree, min-cost $k$-coloring, protein network problems, and further applications in computational biology.
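For reference, the exact operation being approximated is the min-sum subset convolution $(f \oplus g)(S) = \min_{T \subseteq S} \{ f(T) + g(S \setminus T) \}$. Below is a naive $O(3^n)$ evaluation as a baseline sketch, not the paper's approximation algorithm.

```python
# Naive O(3^n) min-sum subset convolution over the subset lattice of {0, ..., n-1}.
# f and g map bitmasks to values; this exact baseline is what the paper approximates.
def min_sum_subset_convolution(f, g, n):
    h = [float("inf")] * (1 << n)
    for S in range(1 << n):
        T = S
        while True:                       # iterate over all subsets T of S
            h[S] = min(h[S], f[T] + g[S ^ T])
            if T == 0:
                break
            T = (T - 1) & S
    return h

# Tiny usage example with n = 3 and arbitrary values.
f = [0, 2, 3, 4, 1, 5, 2, 7]
g = [0, 1, 1, 2, 3, 3, 4, 4]
print(min_sum_subset_convolution(f, g, 3))
```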
Corra: Correlation-Aware Column Compression
Hanwen Liu, Mihail Stoian, Alexander van Renen, Andreas Kipf
CloudDB @ VLDB 2024
TL;DR: Are you still using FOR, Delta, and RLE encodings? Correlation-aware column encodings can compress your data even better!
On the Optimal Linear Contraction Order of Tree Tensor Networks, and Beyond
Mihail Stoian, Richard Milbradt, Christian B. Mendl
SIAM Journal on Scientific Computing
→ Check out our package netzwerk, with plug-ins for opt_einsum and cotengra.
TL;DR: Polynomial-time contraction ordering algorithm for tree tensor networks under the total contraction cost. An extension of the well-known IKKBZ algorithm from databases.
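A minimal example of contracting a small tree-shaped tensor network with opt_einsum, using one of its built-in path optimizers as a stand-in for the ordering step; the netzwerk plug-in interface itself is not shown here, and the index structure is an illustrative assumption.

```python
# Contracting a small tree-shaped tensor network with opt_einsum. A built-in
# optimizer stands in for the ordering step; the netzwerk plug-in would supply
# the contraction path instead (its exact interface is not shown here).
import numpy as np
import opt_einsum as oe

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 4))       # leaf
B = rng.standard_normal((4, 6, 5))    # internal node
C = rng.standard_normal((5, 3))       # leaf
D = rng.standard_normal((6, 7))       # leaf

# Tree structure: B is connected to A (index a), C (index c), and D (index b).
path, info = oe.contract_path("ia,abc,cd,be->ide", A, B, C, D, optimize="greedy")
print(info)                            # per-step costs of the chosen contraction order
result = oe.contract("ia,abc,cd,be->ide", A, B, C, D, optimize=path)
print(result.shape)                    # (8, 3, 7)
```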
Fast Joint Shapley Values
[code] [recording]
Mihail Stoian
SRC @ SIGMOD 2023
Faster FFT-based Wildcard Pattern Matching
[code] [recording]
Mihail Stoian
SRC @ SIGMOD 2023
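A sketch of the classic FFT-based wildcard matching test (in the style of Clifford and Clifford) that the title builds on, shown as a baseline rather than the paper's faster variant; the character encoding and the floating-point tolerance are assumptions.

```python
# Classic FFT-based wildcard matching: pattern p matches text t at shift j iff
# sum_i p_i * t_{i+j} * (p_i - t_{i+j})^2 == 0, with wildcards encoded as 0.
# Baseline sketch only, not the paper's faster variant.
import numpy as np

def correlate_fft(a, b):
    """Cross-correlation of a with b via FFT (linear convolution of a with reversed b)."""
    n = len(a) + len(b) - 1
    N = 1 << (n - 1).bit_length()
    return np.fft.irfft(np.fft.rfft(a, N) * np.fft.rfft(b[::-1], N), N)[:n]

def wildcard_match(text, pattern, wildcard="?"):
    t = np.array([0 if c == wildcard else ord(c) + 1 for c in text], dtype=float)
    p = np.array([0 if c == wildcard else ord(c) + 1 for c in pattern], dtype=float)
    m = len(p)
    score = (
        correlate_fft(t, p ** 3)
        - 2 * correlate_fft(t ** 2, p ** 2)
        + correlate_fft(t ** 3, p)
    )
    # Correlation index m-1+j corresponds to shift j; a tolerance absorbs FFT round-off.
    return [j for j, s in enumerate(score[m - 1 : len(t)]) if abs(s) < 0.5]

print(wildcard_match("abcabdab", "ab?"))  # [0, 3]
```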
Concurrent Link-Cut Trees [code] [recording]
Mihail Stoian | Advised by Jana Giceva and Philipp Fent
SRC @ SIGMOD 2022
Towards Practical Learned Indexing [code] [recording]
Mihail Stoian, Andreas Kipf, Ryan Marcus, and Tim Kraska
AIDB @ VLDB 2021
Benchmarking Learned Indexes [blog] [code] [leaderboard]
Ryan Marcus, Andreas Kipf, Alexander van Renen, Mihail Stoian, Sanchit Misra, Alfons Kemper, Thomas Neumann, and Tim Kraska
VLDB 2021
RadixSpline: A Single-Pass Learned Index [code] [talk]
Andreas Kipf*, Ryan Marcus*, Alexander van Renen*, Mihail Stoian, Alfons Kemper, Tim Kraska, and Thomas Neumann
aiDM @ SIGMOD 2020
SOSD: A Benchmark for Learned Indexes [code]
Andreas Kipf*, Ryan Marcus*, Alexander van Renen*, Mihail Stoian, Alfons Kemper, Thomas Neumann, and Tim Kraska
ML For Systems @ NeurIPS 2019
Did Fourier Really Meet Möbius? Fast Subset Convolution via FFT
Mihail Stoian
TL;DR: There is no need for zeta/Möbius transforms in fast subset convolution. FFT suffices, even within the same running time.
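For reference, the (sum-product) subset convolution of $f, g : 2^{[n]} \to \mathbb{R}$ is $(f * g)(S) = \sum_{T \subseteq S} f(T)\, g(S \setminus T)$; the fast algorithm evaluates it on all $2^n$ subsets in $O(2^n n^2)$ time rather than the naive $O(3^n)$.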
TSP Escapes the $O(2^n n^2)$ Curse [proof-of-concept] [blog]
Mihail Stoian
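For context, the $O(2^n n^2)$ bound in the title is that of the classic Held-Karp dynamic program, sketched below as the baseline; the distance matrix is an illustrative assumption, and this is not the improved algorithm.

```python
# The classic Held-Karp O(2^n n^2) dynamic program for TSP, i.e., the bound the
# title refers to. Shown only for context; this is not the improved algorithm.
def held_karp(dist):
    n = len(dist)
    FULL, INF = 1 << n, float("inf")
    # dp[S][j] = cheapest path that starts at city 0, visits exactly set S, and ends at j.
    dp = [[INF] * n for _ in range(FULL)]
    dp[1][0] = 0.0
    for S in range(FULL):
        if not S & 1:
            continue                          # all partial tours are anchored at city 0
        for j in range(n):
            if dp[S][j] == INF:
                continue
            for k in range(n):
                if S >> k & 1:
                    continue                  # k already visited
                nS = S | (1 << k)
                cand = dp[S][j] + dist[j][k]
                if cand < dp[nS][k]:
                    dp[nS][k] = cand
    return min(dp[FULL - 1][j] + dist[j][0] for j in range(1, n))

# Toy usage: 4 cities with a symmetric distance matrix.
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 8],
    [10, 4, 8, 0],
]
print(held_karp(dist))  # optimal tour cost: 23
```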
What do databases and tensor networks have in common? August 2023, at Joachim Giesen's group, Universität Jena. Check out their amazing tools: Matrix Calculus, used in RelationalAI's AutoDiff, and more!
mihail.stoian@utn.de