
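From September 14 to 18, 2025, the annual IEEE International Conference on Image Processing (ICIP) took place as scheduled. IEEE ICIP is widely regarded as the largest and most comprehensive conference of the IEEE Signal Processing Society (SPS), focusing on image and video processing and computer vision. The theme of this year's conference was “Imaging in the Age of GenAI”.
At the industry forum held during the conference, Principal Researcher Dr. Liang Zhao and Senior Researchers Madhu Peringassery Krishnan, Dr. Tianqi Liu, and Dr. Jayasingam Adhuran of Tencent Media Lab represented Tencent and, together with video technology teams from Google, Meta, Apple, and Netflix, presented the main video coding technologies in the AOMedia AV2 standard.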

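In the conference tutorial program, Dr. Shan Liu and Senior Researcher Dr. Pranav Kadam of Tencent Media Lab, together with Google researcher Dr. Ondrej Stava, delivered a three-hour tutorial on the Polygonal Mesh Coding Standard.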

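Four papers from Tencent Media Lab were accepted to ICIP 2025. They cover intra prediction, block partitioning, entropy coding, and super resolution, and reflect the lab's technical capabilities and innovation in video compression and image enhancement.
An overview of the accepted papers follows: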
01
Hardware Friendly Multi-Hypothesis Cross Component Prediction
Tianqi Liu, Liang Zhao, Madhu Peringassery Krishnan, Shan Liu, Jing Ye, Minhao Tang
Abstract:
This paper introduces a novel Multi-Hypothesis Cross Component Prediction (MHCCP) method to enhance coding efficiency in image and video compression on top of AOMedia Video Model (AVM). Inspired by prior cross-component coding techniques, this work firstly presents a new cross component prediction method and then introduces a hardware-friendly design for practical deployment. The proposed MHCCP method initially generates multiple hypothesis predictions, and a linear combination of these hypothesis predictions is used to estimate the final chroma intensity. To determine the coefficients of the linear model in MHCCP, both the encoder and decoder use the Gaussian elimination method, which involves significant computational complexity. To address hardware implementation challenges, such as the high complexity for parameter derivation and limited above line buffer at the superblock boundary, the proposed method incorporates several optimizations, including decoupled vertical-horizontal prediction modes to capture diverse texture patterns, a padding mechanism to overcome line buffer constraints, and a sub-sampling strategy to reduce computational complexity. Experimental results on Common Test Condition (CTC) v7 demonstrate consistent coding gains: -0.57% (YUV-PSNR), -0.42% (Y-PSNR), -1.92% (U-PSNR), and -2.09% (V-PSNR) under all-intra settings on anchor research-v8.0.0. Notably, classes A1 and A2 achieve significant gains of -0.88% and -0.61%, respectively, highlighting the efficacy of the proposed approach.

Link:
https://ieeexplore.ieee.org/document/11084653
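As a rough illustration of the parameter derivation described in the abstract above, the sketch below fits the weights of a linear combination of hypothesis predictions by solving the least-squares normal equations with Gaussian elimination. The hypothesis layout (co-located luma, left-neighbor luma, and a constant bias), the sample values, and the function names gaussian_solve, fit_weights, and predict are assumptions made for this example; the actual MHCCP design in AVM is not reproduced here.

```python
from typing import List


def gaussian_solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fit_weights(hyps: List[List[float]], chroma: List[float]) -> List[float]:
    """Least-squares weights from the normal equations (H^T H) w = H^T y.

    Each row of `hyps` holds the hypothesis values for one template sample.
    """
    k = len(hyps[0])
    hth = [[sum(h[i] * h[j] for h in hyps) for j in range(k)] for i in range(k)]
    hty = [sum(h[i] * y for h, y in zip(hyps, chroma)) for i in range(k)]
    return gaussian_solve(hth, hty)


def predict(hyp: List[float], weights: List[float]) -> float:
    """Chroma estimate as a linear combination of hypothesis values."""
    return sum(w * h for w, h in zip(weights, hyp))


# Hypothetical template samples: [co-located luma, left luma, bias].
template_hyps = [
    [100.0, 90.0, 1.0],
    [120.0, 100.0, 1.0],
    [80.0, 110.0, 1.0],
    [60.0, 70.0, 1.0],
    [140.0, 130.0, 1.0],
]
# Synthetic chroma generated as 0.5 * luma + 0.25 * left + 10.
template_chroma = [82.5, 95.0, 77.5, 57.5, 112.5]

weights = fit_weights(template_hyps, template_chroma)
estimate = predict([110.0, 95.0, 1.0], weights)
```

Fitting on every second template sample (for example template_hyps[::2]) is one simple way to picture the sub-sampling idea mentioned in the abstract, since it shrinks the sums that build the normal equations.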
02
Enhanced Frame Context Initialization for Video Coding Beyond AV1
Madhu Peringassery Krishnan, Liang Zhao, Zhipeng Dong, Tianqi Liu, Wei Kuang, Minhao Tang, Shan Liu
Abstract:
Entropy coding is an integral part of all modern hybrid block-based video codecs. The context adaptive binary arithmetic coding (CABAC) is a normative part of ITU-T/ISO/IEC video coding standards (H.264/AVC, HEVC and VVC), while the Alliance for Open Media (AOMedia) video standard (AV1) utilizes a context adaptive multi-symbol version of it for entropy coding. Recent explorations of new coding tools beyond AV1 capabilities have led to the improvement of the context adaptive multi-symbol arithmetic entropy coder. In this study, improvements to the context initialization process of the entropy coder are discussed in detail and results reported. The improvements include a) optimal selection of reference frame pairs and b) context modeling from selected reference frame pairs for frame context initialization. Two variants of the proposed method, Variant1 and Variant2 are implemented on top of the reference codebase (research-v8.0.0). Experimental results with CTCv7 show that for random access (RA) configuration, average BDRATE (PSNR-YUV) gains of -0.13% and -0.18% are achievable for Variant1 and Variant2 respectively. Meanwhile, for low delay (LD) configuration the reported gains are -0.47% and -0.50% respectively for Variant1 and Variant2.

Link:
https://ieeexplore.ieee.org/document/11084336
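The abstract above does not spell out how contexts are modeled from a selected reference frame pair. Purely as an illustration, one simple possibility is a weighted average of the cumulative distribution functions (CDFs) stored with the two reference frames. The sketch below assumes plain non-decreasing CDF arrays at 15-bit precision; the function name blend_cdfs, the weights, and the sample values are hypothetical and are not taken from the paper or from AVM.

```python
from typing import List

CDF_TOP = 1 << 15  # assumed 15-bit probability precision


def blend_cdfs(cdf_a: List[int], cdf_b: List[int],
               weight_a: int = 1, weight_b: int = 1) -> List[int]:
    """Blend two multi-symbol CDFs entry by entry with rounding.

    A weighted average of two non-decreasing sequences is itself
    non-decreasing, so the result is still a valid CDF.
    """
    if len(cdf_a) != len(cdf_b):
        raise ValueError("CDFs must describe the same alphabet")
    total = weight_a + weight_b
    return [
        (weight_a * a + weight_b * b + total // 2) // total
        for a, b in zip(cdf_a, cdf_b)
    ]


# Hypothetical 4-symbol CDFs saved with two reference frames.
ref0_cdf = [8192, 16384, 24576, CDF_TOP]
ref1_cdf = [4096, 20480, 28672, CDF_TOP]

initial_cdf = blend_cdfs(ref0_cdf, ref1_cdf)
```

Unequal weights could favor the reference frame judged closer to the current frame, which is one way to picture why the choice of the reference pair matters.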
03
Extension of Semi-Decoupled Partitioning in Inter Frames
Liang Zhao, Madhu Peringassery Krishnan, Shan Liu, Jayasingam Adhuran, Minhao Tang (Tencent), Jianle Chen, Urvang Joshi, Mohammed Sarwer, Debargha Mukerjee (Google)
Abstract:
The Alliance for Open Media (AOMedia) has been exploring new coding tools to enhance AV1 capabilities. Semi-Decoupled Partitioning (SDP), originally designed for intra frames in research-v2.0.0, improves coding by decoupling luma and chroma block partitioning. This study extends SDP to inter frames by introducing intra region coding, where the root node is explicitly signaled in the bitstream. Within the intra region, luma components of the intra-coded blocks can be further split, while chroma components remain unsplit. The experiments are implemented on the 8th anchor, research-v8.0.0, of AVM reference software with CTCv7, and experimental results show that the proposed method can achieve 0.12%, 2.39%, 2.58% coding gain for Y, U, and V component separately with random access configuration and 5% encoding time increase and almost no decoding time increase.

Link:
https://ieeexplore.ieee.org/document/11084310
04
GOP-Level Adaptive Resampling with CNN-Based Super Resolution
Renjie Chang, Liqiang Wang, Xiaozhong Xu, Shan Liu
Abstract:
Deep learning-based super resolution methods have been studied for resampling-based video coding to compress high resolution images with limited bandwidth. In this paper, a group of picture-level (GOP-level) adaptive resampling method with convolutional neural network-based (CNN-based) super resolution is proposed to improve the coding gains beyond the latest video coding standard, named Versatile Video Coding (VVC). Specifically, to better restore the detailed information of high resolution video, a super resolution network using multiple side information is first proposed for generating the up-sampled videos. Besides, to further improve the overall performance, an encoder decision strategy is proposed to adaptively select the best scale factor from ×1.0 (original size) and ×2.0 (half size) to determine the encoding resolution at the GOP level. Experimental results demonstrate that the proposed method achieves {-5.34%, -2.38%, -2.08%} and {-3.35%, -8.02%, -4.98%} BD-rate savings for {Y, U, V} under random access and all intra configurations, respectively.

Link:
https://ieeexplore.ieee.org/document/11084415
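The abstract above does not state the criterion the encoder uses to pick a scale factor. A minimal sketch, assuming a conventional rate-distortion cost J = D + lambda * R evaluated per GOP for each candidate scale, might look like the following. The Trial type, the function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trial:
    """Outcome of encoding one GOP at one candidate scale factor."""
    distortion: float  # measured at the original resolution, after up-sampling
    bits: float        # bits spent on the GOP


def choose_scale(trials: Dict[float, Trial], lam: float) -> float:
    """Return the scale factor with the lowest cost D + lam * R."""
    return min(trials, key=lambda s: trials[s].distortion + lam * trials[s].bits)


def plan_sequence(gops: List[Dict[float, Trial]], lam: float) -> List[float]:
    """Pick one scale factor per GOP, independently of the other GOPs."""
    return [choose_scale(trials, lam) for trials in gops]


# Hypothetical trial results for two GOPs at x1.0 and x2.0.
gops = [
    {1.0: Trial(distortion=10.0, bits=1000.0), 2.0: Trial(distortion=14.0, bits=400.0)},
    {1.0: Trial(distortion=8.0, bits=900.0), 2.0: Trial(distortion=30.0, bits=500.0)},
]

decisions = plan_sequence(gops, lam=0.01)
```

With these made-up numbers the first GOP is coded at half size (cost 18 versus 20) and the second at the original size (cost 17 versus 35), which shows how the decision can change from one GOP to the next.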
Conference agenda:
https://2025.ieeeicip.org/aomedia-industry-workshop/
Feel free to contact us and share your needs.
Tencent Media Lab
medialab@tencent.com