The USC Andrew and Erna Viterbi School of Engineering
USC Signal and Image Processing Institute
USC Ming Hsieh Department of Electrical Engineering
University of Southern California

Technical Report USC-SIPI-381

“Rate Controlled Techniques of H.264 AVC Video with Enhanced Rate-Distortion Modeling”

by Do-Kyoung Kwon

December 2006

In this research, we propose rate control algorithms with enhanced rate and distortion modeling for H.264 video in various applications.

We first propose an enhanced rate control scheme for real-time conversational applications. Compared with existing H.264 rate control, the proposed scheme offers several new features. First, the inter-dependency between rate-distortion optimization (RDO) and rate control is resolved by allowing different quantization parameter values in the RDO process and the quantization process. Second, to address the increased importance of header bits, a header rate model is established to estimate header bits more accurately. Specifically, the number of header bits is modeled as a function of the number of non-zero motion vector (MV) elements and the number of MVs. Third, a new source rate model and a distortion model are proposed. For this purpose, coded 4x4 blocks are identified, and the number of source bits and the distortion are modeled as functions of the quantization step size and the complexity of the coded 4x4 blocks. Built upon these ideas, a rate control algorithm is developed for real-time conversational applications under the constant bit rate (CBR) constraint.

For non-conversational H.264 video that has a fixed GOP (Group of Pictures) structure, we propose frame-layer bit allocation algorithms. Under the assumption of frame independence, a two-pass algorithm based on the Lagrange optimization framework is proposed first as a fundamental study. Then, to reduce the encoding complexity, a one-pass algorithm via GOP-based rate modeling is proposed, also based on the Lagrange optimization framework. Instead of estimating the R-D data of future frames directly, the one-pass algorithm estimates the Lagrange multiplier using GOP rate models, which characterize the number of bits consumed by a GOP. For this purpose, GOP-based R-Q and R-lambda models are investigated. Finally, we propose a simplified one-pass algorithm by exploiting the monotonicity property. The simplified algorithm does not require any frame rate or distortion model, so the rate control process is greatly simplified.

The GOP structure may change adaptively according to the spatial and temporal scene contexts, so we address the rate control problem with varying GOP structures as well. We point out an important issue in this case: the inter-dependency of frame-layer bit allocation and GOP structure decision. We resolve this problem using the simplified bit allocation scheme and GOP rate and distortion models. Finally, we propose a GOP-layer bit allocation algorithm using GOP rate and distortion models. This algorithm achieves higher average quality as well as smoother visual quality variation.
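
As a rough illustration of frame-layer bit allocation in a Lagrange optimization framework under the frame-independence assumption, the sketch below uses the generic textbook formulation, not the report's own algorithm. Each frame is assumed to have a small set of measured (rate, distortion) operating points, for example one per quantization parameter. Every frame independently picks the point that minimizes D + lambda * R, and lambda is found by bisection, relying on the fact that the total rate never increases as lambda grows. The names `pick_point` and `allocate`, the toy R-D values, and the bit budget are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (rate in bits, distortion); illustrative units


def pick_point(points: List[Point], lam: float) -> Point:
    """Return the operating point minimizing D + lam * R for one frame.

    Ties are broken toward the lower-rate point.
    """
    return min(points, key=lambda p: (p[1] + lam * p[0], p[0]))


def allocate(frames: List[List[Point]], budget: float,
             iters: int = 60) -> Tuple[float, List[Point]]:
    """Bisect on lam to find the smallest multiplier that meets the budget.

    Assumes frames are independent, so each frame can be optimized
    separately once lam is fixed.
    """
    def total_rate(lam: float) -> float:
        return sum(pick_point(f, lam)[0] for f in frames)

    lo, hi = 0.0, 1.0
    # Grow the upper bound until the allocation fits the budget.
    while total_rate(hi) > budget:
        hi *= 2.0
        if hi > 1e12:
            raise ValueError("budget is below the minimum achievable rate")

    # Total rate is non-increasing in lam, so bisection is valid.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total_rate(mid) > budget:
            lo = mid   # too many bits: penalize rate more
        else:
            hi = mid   # feasible: try a smaller penalty
    return hi, [pick_point(f, hi) for f in frames]


# Hypothetical R-D points for two frames (not data from the report).
frames = [
    [(100, 10), (60, 20), (30, 50)],
    [(200, 5), (120, 15), (50, 60)],
]
lam, choice = allocate(frames, budget=200)
```

With these toy numbers the selected points use 180 of the 200 budgeted bits. Because the operating points are discrete, a Lagrangian search of this kind generally cannot hit the budget exactly, and it can only reach points on each frame's convex hull.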


This report is not currently available in PDF format for downloading. Contact the Signal and Image Processing Institute for information on its availability.