Third IEEE Workshop on Coding for Machines

September 2025, Anchorage, Alaska

Satellite Workshop at IEEE ICIP 2025 

Workshop scope

Multimedia signals – speech, audio, images, video, point clouds, light fields, … – have traditionally been acquired, processed, and compressed for human use. However, it is estimated that in the near future the majority of Internet connections will be machine-to-machine (M2M), so, increasingly, the data communicated across networks is intended primarily for automated machine analysis. Applications include remote monitoring, surveillance, and diagnostics; autonomous driving and navigation; smart homes, buildings, neighborhoods, and cities; and so on. This necessitates a rethinking of traditional compression and pre-/post-processing methods to facilitate efficient machine-based analysis of multimedia signals. As a result, standardization efforts such as MPEG VCM (Video Coding for Machines), MPEG FCM (Feature Coding for Machines), and JPEG AI have been launched.

Both theory and early design examples have shown that, compared to traditional human-oriented coding approaches, significant bit savings are possible for a given inference accuracy. However, a number of open issues remain. These include a thorough understanding of the tradeoffs involved in coding for machines; coding for multiple machine tasks as well as for combined human-machine use; model architectures; software and hardware optimization; error resilience; privacy; security; and others. The workshop is intended to bring together researchers from academia, industry, and government who are working on related problems, to provide a snapshot of current research and standardization efforts in the area, and to generate ideas for future work. We welcome papers on the following and related topics:


Important dates

Paper submission: 28 May 2025

Acceptance notification: 25 June 2025

Camera-ready papers: 2 July 2025

Workshop date: September 2025

Keynote lecture

Video Coding for Machines

Video occupies about 80% of today’s Internet traffic, and an increasing share of this video content is consumed by machines, in applications as broad as video surveillance, healthcare monitoring, transportation, and smart cities. Machine vision differs from human vision in many respects, and in recent years there has been great interest in, and demand for, video coding technologies and solutions that address machine-vision requirements and use cases. This talk will introduce recent advances in video coding technologies for machines and the associated standards development.

Dr. Shan Liu

Tencent America

Shan Liu received the B.Eng. degree in electronic engineering from Tsinghua University, and the M.S. and Ph.D. degrees in electrical engineering from the University of Southern California. She is a Distinguished Scientist and General Manager at Tencent. She was formerly Director of the Media Technology Division at MediaTek USA, and was previously with MERL and Sony, among others. She has been a long-time contributor to international standardization, with many technical proposals adopted into standards such as VVC, HEVC, OMAF, DASH, MMT, and PCC, and she served as a Project Editor of the ISO/IEC | ITU-T H.266/VVC standard. Among the numerous standards AHGs and WGs she has chaired, she is currently co-chair of the MPEG AHG on Video Coding for Machines (VCM). She is a recipient of the ISO & IEC Excellence Award, the Technology Lumiere Award, the USC SIPI Distinguished Alumni Award, and two IEEE TCSVT Best AE Awards. She is a Fellow of the IEEE. She currently serves as Associate Editor-in-Chief of IEEE Transactions on Circuits and Systems for Video Technology and as Vice Chair of the IEEE Data Compression Standards Committee, and she serves or has served on several other boards and committees. She holds more than 600 granted US patents and has published more than 100 peer-reviewed papers and one book. Her interests include audio-visual, volumetric, immersive, and emerging multimedia compression, intelligence, transport, and systems.

Invited lecture

Human-Machine-Oriented Image and Video Coding

The exponential growth of image and video data has made it impractical for humans to process and analyze such large amounts of data entirely on their own. At the same time, relying solely on machines for processing and analysis cannot fully guarantee accuracy. This has led to the emergence of a new application scenario: human-machine collaborative judgment. Traditional coding methods, optimized primarily for human perception, exhibit high complexity and low efficiency when applied to machine-centric tasks. To address the issue of high complexity, this talk will follow the JPEG AI framework and explore direct image and video analysis in the compressed domain, improving analytical efficiency while maintaining low computational complexity. To tackle the problem of low efficiency, this talk will introduce two semantically scalable coding schemes, information layering and information decomposition, based on the assumption that machine tasks require less information than human vision. The base layer is designed for machine tasks, while the enhancement layer caters to human vision, thereby maximizing coding efficiency for both human and machine vision. Finally, the talk provides a forward-looking perspective on the future of human-machine-oriented image and video coding.

Prof. Li Li

University of Science and Technology of China

Li Li is a Professor and Doctoral Supervisor in the Department of Electronic Engineering and Information Science at the University of Science and Technology of China (USTC). He obtained his Bachelor’s degree in 2011 and his Ph.D. in 2016, both from USTC. From 2016 to 2020, he conducted postdoctoral research at the University of Missouri-Kansas City in the United States. Upon returning to China, he joined USTC and was promoted to Professor in 2022 after being awarded the Overseas Excellent Young Scholar grant. His primary research area is multimedia compression. He has published over 100 papers in top-tier international journals and conferences, accumulating more than 4,000 citations on Google Scholar. He also holds over 20 granted patents and has had more than 20 technical proposals adopted by leading standardization organizations. His achievements have been recognized with awards including the Second Prize of the National Technology Invention Award in 2019 (ranked 5th), the 2024 DAMO Academy Young Fellow Award, and the 2023 Multimedia Rising Star Award. He serves as an Associate Editor of IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT) and IEEE Transactions on Multimedia (T-MM).

Organizers

Changsheng Gao, Nanyang Technological University, Singapore

Ying Liu, Santa Clara University, USA

Heming Sun, Yokohama National University, Japan

Hyomin Choi, InterDigital, USA

Fengqing Maggie Zhu, Purdue University, USA

Ivan V. Bajić, Simon Fraser University, Canada

Technical Program Committee (to be confirmed)

Balu Adsumilli, Google/YouTube, USA

Nilesh Ahuja, Intel Labs, USA

João Ascenso, Instituto Superior Técnico, Portugal

Zhihao Duan, Purdue University, USA

Yuxing (Erica) Han, Tsinghua University, China

Wei Jiang, Futurewei, USA

Hari Kalva, Florida Atlantic University, USA

André Kaup, Friedrich-Alexander University Erlangen-Nuremberg, Germany

Xiang Li, Google, USA

Weisi Lin, Nanyang Technological University, Singapore

Jiaying Liu, Peking University, China

Saeed Ranjbar Alvar, Huawei, Canada

Shiqi Wang, City University of Hong Kong, Hong Kong

Shurun Wang, Alibaba DAMO Academy, China

Li Zhang, ByteDance, USA