ISG Talks are sponsored by Couchbase.


Yiming Lin: QUIP: Query-driven Missing Value Imputation

February 10, 2023 @ 1:00 pm - 2:00 pm

QUIP: Query-driven Missing Value Imputation

This paper develops a query-time missing value imputation framework, entitled QUIP, that minimizes the joint cost of imputation and query execution. QUIP achieves this by modifying how relational operators are processed. It adds a cost-based decision function to each operator that checks whether the operator should invoke imputation before execution or defer the imputations for downstream operators to resolve. QUIP implements a new approach to evaluating outer joins that preserves missing values during query processing, and a Bloom-filter-based index structure to reduce the space and runtime overhead. We have implemented QUIP using ImputeDB, a specialized database engine for data cleaning. Extensive experiments on both real and synthetic data sets demonstrate the effectiveness and efficiency of QUIP, which outperforms the state-of-the-art ImputeDB by 2 to 10 times on different query sets and data sets, and achieves orders-of-magnitude improvement over the offline approach.
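To make the per-operator "impute now or defer" decision more concrete, here is a minimal toy sketch in Python. It is not QUIP's actual decision function, which the abstract does not specify. The cost model, the `OperatorStats` fields, and the function names are all hypothetical and chosen only for illustration. The assumption is that deferring lets downstream operators discard some tuples before they ever need imputing, at the price of carrying tuples with missing values further through the plan.

```python
from dataclasses import dataclass


@dataclass
class OperatorStats:
    """Hypothetical per-operator statistics (illustrative assumptions only)."""
    n_missing: int                 # tuples with a missing value this operator needs
    impute_cost: float             # assumed cost of imputing one missing value
    downstream_selectivity: float  # assumed fraction of these tuples that survive downstream
    carry_cost: float              # assumed extra cost of carrying one unresolved tuple onward


def cost_impute_now(stats: OperatorStats) -> float:
    # Every missing value is imputed before the operator runs.
    return stats.n_missing * stats.impute_cost


def cost_defer(stats: OperatorStats) -> float:
    # Only tuples that survive downstream operators are imputed later,
    # but every deferred tuple pays the carrying overhead.
    per_tuple = stats.downstream_selectivity * stats.impute_cost + stats.carry_cost
    return stats.n_missing * per_tuple


def should_impute_now(stats: OperatorStats) -> bool:
    # Impute eagerly only when that is no more expensive than deferring.
    return cost_impute_now(stats) <= cost_defer(stats)


if __name__ == "__main__":
    selective = OperatorStats(n_missing=100, impute_cost=10.0,
                              downstream_selectivity=0.1, carry_cost=1.0)
    unselective = OperatorStats(n_missing=100, impute_cost=10.0,
                                downstream_selectivity=0.95, carry_cost=1.0)
    print("selective downstream, impute now?", should_impute_now(selective))
    print("unselective downstream, impute now?", should_impute_now(unselective))
```

Under these assumed numbers, a highly selective downstream operator favors deferring, because most tuples are eliminated before imputation would be needed, while an unselective one favors imputing right away.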

Yiming is a final-year PhD student working with Prof. Sharad Mehrotra. His research focuses on data management, especially efficient query processing, query optimization, data quality, and data integration.



DBH 4011