Keynote · Day 1
Monday, June 22, 2026
Talk Title
On Securing Networked Embedded/Cyber-Physical Systems
Abstract
There has been an exponential growth of cyber-physical applications that rely on diverse types of embedded end-systems and devices, such as smart phones/watches/glasses, home appliances, consumer and industrial electronics, smart sensors and actuators. These applications require diverse types of Quality-of-Service (QoS) including timeliness, dependability, security and privacy, from end-systems/devices which are usually networked together via heterogeneous networking technologies and protocols.
We now know how to guarantee timeliness and, to a lesser extent, how to provide fault-tolerance, on both end-systems and their interconnection networks. However, far less is known about how to secure them, despite the growing importance of protecting information stored in the end-systems/devices and exchanged over their interconnection networks. Moreover, timeliness, fault-tolerance, security and privacy must be supported simultaneously, often under a tight budget of resources such as memory, computation and communication bandwidth, and battery power. This talk will cover issues in, and approaches to, securing networked embedded systems.
If time allows, I will discuss our work-in-progress on context-aware autonomous vehicles.
Speaker Biography
Kang G. Shin (Life Fellow, IEEE) is currently the Kevin & Nancy O'Connor Professor Emeritus of Computer Science with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor. His current research focuses on safe and secure embedded real-time and cyber-physical systems and QoS-sensitive computing and networking. He has supervised the completion of 93 Ph.D.'s and authored/co-authored about 1000 technical articles, a textbook and more than 60 patents or invention disclosures. He has received numerous awards including the 2026 IEEE TC on Distributed Processing (TCDP) Award for Outstanding Technical Achievement, 2023 SIGMOBILE Test-of-Time Award, 2019 Caspar Bowden Award for Outstanding Research in Privacy Enhancing Technologies, and the 2006 Ho-Am Prize in Engineering.
Keynote · Day 2
Tuesday, June 23, 2026
Talk Title
Quo Vadis, Parallel Computing Systems?
Abstract
Parallel computing is entering a new era, driven by the rapid rise of AI, and undergoing a significant transformation. Previously associated primarily with supercomputers and scientific workloads, it now serves as the foundation of contemporary AI infrastructure. Large-scale training, real-time inference, agentic systems, simulation, robotics, and physical AI require extensive parallelism across accelerators, memory, networks, storage, software stacks, and energy infrastructure.
This talk explores the future direction of parallel computing systems in the context of AI. The next generation of systems will be defined by comprehensive full-stack co-design, rather than incremental improvements in processor speed or cluster size. Industry initiatives such as NVIDIA's AI factories and Google's AI Hypercomputer exemplify this transition. The talk will address major challenges including heterogeneous accelerators, communication bottlenecks, scheduling, reliability, observability, data locality, energy efficiency, sustainability, and openness.
Speaker Biography
Jaejin Lee is a professor in the Department of Data Science and the Department of Computer Science and Engineering at Seoul National University (SNU), and Dean of the Graduate School of Data Science. He is also the director of the Center for Optimizing Hyperscale AI Models and Platforms (CHAMP). He received his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign (UIUC) in 1999, an M.S. from Stanford University in 1995, and a B.S. in Physics from SNU in 1991. He is an IEEE Fellow, and his research interests include programming systems for heterogeneous machines, building Large Language Models, and programming environments for quantum computers.
Keynote · Day 3
Wednesday, June 24, 2026
Talk Title
Towards Memory-Efficient LLM Inference for On-device AI
Abstract
The deployment of Large Language Models (LLMs) on devices faces significant challenges due to their extensive memory requirements. This talk introduces a series of works toward memory-efficient LLM inference, from the perspectives of both model size and the KV cache.
Double Compression combines model compression (quantization and pruning) with lossless data compression, achieving a 2.2x compression ratio while keeping the drop in model accuracy within 1%. FlexInfer leverages advanced system techniques such as prefetching and memory locking to maximize memory efficiency, achieving up to 12.5x faster inference under memory constraints compared to traditional methods. Together, these solutions enable memory-efficient LLM deployment on edge devices. The talk will also present the recent project ClawMobile, which runs efficient agents on mobile devices.
Speaker Biography
Prof. Chun Jason Xue is currently a professor of computer science at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi. His research focuses on memory and storage systems. He is currently an associate editor for ACM Transactions on Embedded Computing Systems, ACM Transactions on Cyber-Physical Systems, and ACM Transactions on Storage. He is a Distinguished Member of ACM and a Fellow of IEEE.







