BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Information Systems Group - ECPv6.4.0.1//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://isg.ics.uci.edu
X-WR-CALDESC:Events for Information Systems Group
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20240310T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20241103T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240202T130000
DTEND;TZID=America/Los_Angeles:20240202T140000
DTSTAMP:20260719T160700
CREATED:20240131T055243Z
LAST-MODIFIED:20240131T055243Z
UID:1674-1706878800-1706882400@isg.ics.uci.edu
SUMMARY:Shengquan Ni: Supporting time-travel debugging in Texera
DESCRIPTION:Title: Supporting time-travel debugging in Texera \nSpeaker: Shengquan Ni \nAbstract: Dataflow systems\, traditionally used for relational analysis\, now support a variety of tasks including complex user-defined functions. As dataflow jobs become more diverse and complex\, there is an increasing need for better debugging support to understand their runtime behaviors and identify issues in either the data or the analysis. To achieve this goal in the Texera system\, we develop techniques to support “time-travel debugging.” In particular\, the system allows users to interact with an execution during runtime to retrieve an execution state\, which is a consistent snapshot of the engine. The user can “travel back to the past” to access the execution state of a previous interaction\, and thus retrospectively explore and analyze that state. We will show a demo of this powerful feature and give an overview of the underlying techniques. \nBio: Shengquan Ni is a Ph.D. student in the Department of Computer Science\, advised by Professor Chen Li. His current research interests include big data processing\, distributed systems\, data analytics\, and data science. He was a summer intern at Google.
URL:https://isg.ics.uci.edu/event/shengquan-ni-supporting-time-travel-debugging-in-texera/
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240209T110000
DTEND;TZID=America/Los_Angeles:20240209T120000
DTSTAMP:20260719T160700
CREATED:20240122T191410Z
LAST-MODIFIED:20240122T191430Z
UID:1669-1707476400-1707480000@isg.ics.uci.edu
SUMMARY:Joseph Hellerstein (UC Berkeley): Hydro: A Compiler Stack for Distributed Programs
DESCRIPTION:The Computer Science Department and Information Systems Group (ISG) \nat UC Irvine welcomes \n \nJoseph Hellerstein \nUC Berkeley and Sutter Hill Ventures \nTITLE: Hydro: A Compiler Stack for Distributed Programs \nABSTRACT:   \nNearly all programs of interest today are distributed. Unfortunately\, the traditional languages and compilers in common use today offer little assistance in ensuring the correctness of distributed programs. This state of affairs makes infrastructure development and tuning unduly expensive\, and hampers the ability of less-technical but highly creative individuals to invent new applications that take advantage of the ubiquity of cloud and mobile computing. \n  \nThe Hydro project at Berkeley is an effort to build a compiler stack to address these issues\, taking lessons from the success of scaling data management software. The foundation of the Hydro stack is Hydroflow\, a Rust-based dataflow runtime with an IR based on algebraic dataflow. Hydroflow enables a compiler to make correct program transformations that are natural in the context of distributed systems. Transformations include: \n– Refactoring: Given an arbitrary block of code\, refactor it into smaller blocks that can be launched on independent machines \n– Replication: Given an arbitrary block of code\, determine whether it can be safely replicated in deployment \n– Partitioning: Given an arbitrary block of code\, determine how its inputs can be safely partitioned (“sharded”) to multiple machines in deployment \n  \nThese transformations in turn allow distributed programs to be optimized for various goals\, including parallelism (both pipelines and partitioning)\, memory scaling\, performance isolation\, geoproximity and physical security. 
 \n  \nAlthough the Hydro project is still in early stages\, I will present case studies showing correctness\, latency and scaling results when optimizing programs ranging from infrastructure like key-value stores\, to applications like shopping carts and messaging systems\, to tricky consensus protocols. \n  \nJoint work with colleagues at UC Berkeley and Sutter Hill Ventures. \n  \nBIO: Joseph M. Hellerstein is the Jim Gray Professor of Computer Science at UC Berkeley\, and a Faculty Fellow at Sutter Hill Ventures. His academic recognition includes the ACM SIGMOD Codd Innovations Award\, ACM Fellow and Sloan Research Fellow awards\, and six “Test of Time” awards for his papers. Hellerstein is a longtime participant in the computing industry\, co-founding startups\, advising companies and venture funds\, and directing industry research. He also enjoys playing music\, and has performed live with legendary musicians including Joe Henderson\, Joshua Redman and Michael J. Carey. \n 
URL:https://isg.ics.uci.edu/event/joseph-hellerstein-uc-berkeley-hydro-a-compiler-stack-for-distributed-programs/
LOCATION:DBH 6011
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240212T130000
DTEND;TZID=America/Los_Angeles:20240212T140000
DTSTAMP:20260719T160700
CREATED:20240206T000852Z
LAST-MODIFIED:20260405T013527Z
UID:1676-1707742800-1707746400@isg.ics.uci.edu
SUMMARY:Raul Castro Fernandez (U. Chicago): On Data Ecology\, Data Markets\, the Value of Data\, and Dataflow Governance
DESCRIPTION:Abstract: \nData shapes our social\, economic\, cultural\, and technological environments. Data is valuable\, so people seek it\, inducing data to flow. The resulting dataflows distribute data and thus value. For example\, large Internet companies profit from accessing data from their users\, and engineers of large language models seek large and diverse data sources to train powerful models. It is possible to judge the impact of data in an environment by analyzing how the dataflows in that environment impact the participating agents. My research hypothesizes that it is also possible to design (better) data environments by controlling what dataflows materialize; not only can we analyze environments but also synthesize them. In this talk\, I present the research agenda on “data ecology\,” which seeks to build the principles\, theory\, algorithms\, and systems to design beneficial data environments. I will also present examples of data environments my group has designed\, including data markets for machine learning\, data-sharing\, and data integration. I will conclude by discussing the impact of dataflows in data governance and how the ideas are interwoven with the concepts of trust\, privacy\, and the elusive notion of “data value.” As part of the technical discussion\, I will complement the data market designs with the design of a data escrow system that permits controlling dataflows. \nBio (Raul Castro Fernandez): \nIn my research\, I ask what is the value of data and explore the potential of data markets to unlock that value. My group collaborates with economists\, legal scholars\, statisticians\, and domain scientists. We build systems to share\, discover\, prepare\, integrate\, and process data. I have traditionally worked on distributed query processing systems and continue to do so. I have received a SIGMOD’23 Test-of-time-Award. 
 \nI am an assistant professor in the Department of Computer Science and on the Committee of Data Science at The University of Chicago. Before UChicago\, I did a postdoc at MIT with Sam Madden and Mike Stonebraker. And before that\, I completed a PhD at Imperial College London with Peter Pietzuch.
URL:https://isg.ics.uci.edu/event/raul-castro-fernandez-u-chicago-on-data-ecology-data-markets-the-value-of-data-and-dataflow-governance/
LOCATION:DBH 4011
END:VEVENT
END:VCALENDAR