“Efficient Transforms for Graph Signals with Applications to Video Coding”
by Keng-Shih Lu
December 2020
Graph signal processing (GSP) extends tools of conventional signal processing to graph signals, i.e., structured data residing on an irregular domain. The graph Fourier transform (GFT), as an extension of the discrete Fourier transform, defines the notion of frequency on graph, and can be used as a tool for frequency analysis of graph signals. However, challenges are posed by the computational complexity of graph signal processing techniques. First, a high dimensional graph Fourier transform can be computationally expensive and there is no fast algorithms for general graphs. Second, in graph filtering, which is a crucial tool in many GSP applications, an efficient implementation without GFT computation is a challenging problem that receives a great amount of attention. Third, while graph-based transforms have been shown to enhance the coding efficiency in MSE-based video compression framework, this application in coding schemes based on perceptually-driven metrics remains an open problem.
With the aim of answering these questions, we propose several techniques in this work. First, we study algebraic properties of the graph Laplacian matrix that lead to fast implementations of the graph Fourier transform, based on which we can design fast algorithms when the graph satisfies certain symmetry or bipartition properties. Second, we propose data-driven approaches to obtain fast GFT solutions for efficient separable and non-separable transforms. Third, we extend the notion of lapped transform to a graph-based setting, and propose a generalized version of the well-known lapped orthogonal transform. Fourth, we explore sparse graph operators for the well-known discrete cosine transform (DCT) and discrete sine transform (DST), which lead us to an efficient DCT/DST graph filtering approach and a fast transform type selection scheme for video encoder. Finally, we apply irregularity-aware graph Fourier transform (IAGFT) to perceptual image coding, which gives promising experimental results in terms of a popular perceptual quality metric, structural similarity (SSIM).