USC-SIPI REPORT #406

Technical Report USC-SIPI-406

“Advanced Intra Prediction Techniques for Image and Video Coding”

by Yunyang Dai

August 2010

Intra prediction has been used in the H.264/AVC video coding standard to improve the coding efficiency of the intra frame. In this research, we present three intra prediction techniques that outperform the existing ones adopted by H.264/AVC and JPEG-LS:

1. joint block/line-based intra prediction (JBLIP),
2. hierarchical (or multi-resolution) intra prediction (HIP), and
3. context-based hierarchical intra prediction (CHIP).

We consider two image/video coding scenarios: lossy compression and lossless compression. For lossy compression, we conduct a comprehensive study and show that the existing line-based prediction (LIP) technique adopted by the H.264/AVC standard is effective only in smooth regions and regions with simple edges; it is much less useful in predicting complex regions that contain texture patterns. To overcome this difficulty, we propose a JBLIP scheme with 2D geometrical manipulation to improve coding efficiency. The complexity of the JBLIP scheme is, however, quite high due to the need to search for the best-matched block for prediction. Thus, we propose a fast search algorithm to reduce the coding complexity. The proposed JBLIP scheme outperforms the LIP scheme in H.264/AVC by up to 1.68 dB in PSNR at the same bit rate.

Next, for lossless compression, we present an advanced intra frame coding scheme using a hierarchical (or multi-resolution) approach called HIP. The objective is to support lossless image/video compression with spatial scalability. We analyze the characteristics of the underlying input signal as well as previously proposed signal modeling algorithms, and show that most existing signal models cannot capture the dynamic signal characteristics through one fixed model. Hence, we propose a spatially scalable intra prediction scheme that decomposes signals according to their characteristics in the frequency domain. A block-based linear combination with edge detection and training set optimization is used to improve coding efficiency for complex textured areas in the enhancement layer (EL). Experimental results show that the proposed lossless HIP scheme outperforms the lossless LIP scheme of H.264/AVC and JPEG-LS by a bit rate saving of 10%.

Finally, we analyze the inefficiency of the proposed lossless HIP scheme and present an enhanced hierarchical intra prediction coding scheme called context-based hierarchical intra prediction (CHIP). To save bits for the coding of modes, we propose a mode estimation scheme. To improve prediction accuracy, we employ principal component analysis (PCA) to extract dominant features from the coarse representation of the base layer. The extracted features are clustered using a k-means clustering algorithm. Then, the context-based interlayer prediction (CIP) scheme is used to select the best prediction candidate without any side information. To further enhance coding efficiency, an adaptive precoding process is performed by analyzing the characteristics of the prediction residual signal, and a more accurate approach is proposed to estimate the context model. Experimental results show that the proposed lossless CHIP scheme outperforms the lossless LIP scheme of H.264/AVC and JPEG-LS by 16% in bit rate saving.
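
The abstract does not give implementation details for the feature extraction step of CHIP, so the following is only a minimal sketch of the general idea: PCA on blocks of a coarse base-layer image, followed by k-means clustering of the resulting features. The block size, the number of components, the number of clusters, and all function names (`extract_blocks`, `pca_features`, `kmeans`) are assumptions made for illustration and are not taken from the report.

```python
import numpy as np

def extract_blocks(image, size):
    """Split a 2D array into non-overlapping size x size blocks (one flattened block per row)."""
    h, w = image.shape
    h, w = h - h % size, w - w % size  # drop any partial blocks at the borders
    blocks = image[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, size * size).astype(np.float64)

def pca_features(blocks, n_components):
    """Project mean-centered blocks onto their top principal components."""
    centered = blocks - blocks.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def assign(features, centroids):
    """Index of the nearest centroid (Euclidean distance) for every feature vector."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(features, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (hypothetical settings); returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct feature vectors (fancy indexing copies).
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(features, centroids)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return assign(features, centroids), centroids
```

As a usage example under the same assumptions, a 32x32 base-layer image split into 4x4 blocks gives 64 block vectors; `pca_features(blocks, 3)` reduces each to three features, and `kmeans(features, k=4)` assigns every block a cluster label that could serve as its context. How the report maps such clusters to prediction candidates in the CIP scheme is not stated in the abstract and is not modeled here.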


To download the report in PDF format click here: USC-SIPI-406.pdf (3.9Mb)