BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Information Systems Group - ECPv6.4.0.1//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Information Systems Group
X-ORIGINAL-URL:https://isg.ics.uci.edu
X-WR-CALDESC:Events for Information Systems Group
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20260308T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20261101T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20260116T130000
DTEND;TZID=America/Los_Angeles:20260116T140000
DTSTAMP:20260508T150629
CREATED:20260113T215534Z
LAST-MODIFIED:20260113T215534Z
UID:2297-1768568400-1768572000@isg.ics.uci.edu
SUMMARY:Yiming Lin (UC Berkeley): AI-Powered Data Systems for Multimodal Analytics
DESCRIPTION:Time & Location:\n\nFriday Jan 16\, 2026\, 1:00 PM – 2:00 PM\nDonald Bren Hall 3011\, ICS\, UC Irvine\nLunch will be provided.\n\nTitle:\nAI-Powered Data Systems for Multimodal Analytics\n\nAbstract:\n\nWe live in a world overflowing with data\, and the emergence of AI\, such as Large Language Models (LLMs)\, is revolutionizing data analytics. However\, directly using AI to process massive and complex data is neither effective nor scalable.\nIn this talk\, I introduce my work on building database systems powered by AI to analyze and process multimodal data at scale\, focusing on tables and documents. On one hand\, when analyzing tables\, AI is often used to prepare data\, such as cleaning\, enriching\, or synthesizing data prior to query processing. This becomes prohibitively expensive when the data scale is large. To support scalable analysis over expensive data ingestion\, my work leverages the fact that not all data are needed to answer a query and explores a set of techniques to reduce AI operations unnecessary to analytics by optimizing the query engine in the database. On the other hand\, when analyzing documents\, current systems treat them as plain text and ignore underlying structures\, leading to limited accuracy and performance. In this regard\, we exhaustively identified three document structures that encompass most real-world documents we have encountered\, and we designed tools and systems to extract their structures and leverage them for accurate and efficient document analytics. Finally\, I’ll share my vision for building data systems for multimodal analytics\, including aspects of trustworthy systems\, interaction with hardware\, and co-optimization among different data modalities.\n\nBio:\n\nYiming Lin is a postdoctoral researcher at UC Berkeley\, and he received his Ph.D. from UC Irvine. His research interests span document analytics\, query processing and optimization\, and data cleaning\, with a current focus on building databases for multimodal analytics powered by AI. His work has had real-world impact: document analytics help public defenders\, journalists\, and the California police department process over 30\,000 pages\, while his efforts as part of TippersDB deliver high-quality IoT services to nursing homes\, industries\, and universities across five sites over six years. He has a number of publications and serves on the program committees of VLDB\, SIGMOD\, and ICDE.\n\nVolunteer:\nGuangxue Zhang
URL:https://isg.ics.uci.edu/event/yiming-lin-uc-berkeley-ai-powered-data-systems-for-multimodal-analytics/
LOCATION:DBH 3011
END:VEVENT
END:VCALENDAR