“Advanced Technologies for Learning-based Image/Video Enhancement, Image Generation and Attribute Editing”
by Zohreh Azizi
August 2023
Recent technological advances have led to the production of massive volumes of visual data. Images and videos are now among the most widely shared forms of content across social platforms. In this dissertation, we propose novel methodologies to enhance, generate, and manipulate visual content. Our contributions are outlined as follows:
Low-light image enhancement. We first present a simple and effective low-light image enhancement method based on a noise-aware texture-preserving retinex model. The new method, called NATLE, strikes a balance between noise removal and natural texture preservation through a low-complexity solution. Its cost function comprises an estimated piecewise-smooth illumination map and a noise-free texture-preserving reflectance map. After decomposing an image into the illumination and reflectance maps, NATLE adjusts the illumination map and recombines it with the reflectance map to form the enhanced image. Extensive experiments on common low-light image enhancement datasets demonstrate the superior performance of NATLE.
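The decompose-adjust-recombine pipeline can be illustrated with a minimal sketch. This is not NATLE's actual optimization: the Gaussian blur stands in for its piecewise-smooth illumination solver, the mild reflectance smoothing stands in for its noise-aware term, and the gamma value is an illustrative choice.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def retinex_enhance_sketch(img, gamma=2.2, sigma_l=5.0, eps=1e-3):
    """Retinex-style low-light enhancement sketch.

    img: HxWx3 float array in [0, 1].
    The blurred illumination approximates a piecewise-smooth
    illumination map; NATLE instead minimizes a variational cost.
    """
    # Initial illumination estimate: max over color channels.
    l0 = img.max(axis=2)
    # Smooth illumination (stand-in for the piecewise-smooth solver).
    l = np.maximum(gaussian_filter(l0, sigma=sigma_l), eps)
    # Reflectance = observation / illumination, per channel.
    r = np.clip(img / l[..., None], 0.0, 1.0)
    # Mild spatial smoothing as a stand-in for noise removal.
    r = gaussian_filter(r, sigma=(0.5, 0.5, 0.0))
    # Brighten the illumination via gamma correction and recombine.
    l_adj = l ** (1.0 / gamma)
    return np.clip(r * l_adj[..., None], 0.0, 1.0)
```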
Low-light video enhancement. We also present SALVE, a self-supervised adaptive low-light video enhancement method. SALVE first enhances a few keyframes of an input low-light video using a retinex-based low-light image enhancement technique. For each keyframe, it learns a mapping from low-light image patches to enhanced ones via ridge regression; these mappings are then used to enhance the remaining frames of the video. The combination of traditional retinex-based image enhancement and learning-based ridge regression yields a robust, adaptive, and computationally inexpensive solution for enhancing low-light videos. Extensive experiments confirm these benefits, and in a user study, 87% of participants preferred SALVE over prior work.
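The keyframe-supervised ridge regression step admits a short sketch. The patch size, stride, regularization weight, and grayscale float frames are illustrative assumptions; the supervision comes from a keyframe pair produced by the retinex-based enhancer (NATLE in this dissertation).

```python
import numpy as np
from sklearn.linear_model import Ridge

def extract_patches(frame, size=8, stride=8):
    """Collect non-overlapping patches of a 2D float frame as rows."""
    h, w = frame.shape
    return np.stack([frame[i:i + size, j:j + size].ravel()
                     for i in range(0, h - size + 1, stride)
                     for j in range(0, w - size + 1, stride)])

def fit_patch_mapping(low_key, enhanced_key, alpha=1.0):
    """Learn a ridge mapping from low-light to enhanced patches,
    using one keyframe before and after enhancement."""
    X = extract_patches(low_key)
    Y = extract_patches(enhanced_key)
    return Ridge(alpha=alpha).fit(X, Y)

def enhance_frame(model, frame, size=8):
    """Apply the learned patch mapping to a non-keyframe."""
    out = np.zeros_like(frame)
    for i in range(0, frame.shape[0] - size + 1, size):
        for j in range(0, frame.shape[1] - size + 1, size):
            patch = frame[i:i + size, j:j + size].ravel()[None, :]
            out[i:i + size, j:j + size] = model.predict(patch).reshape(size, size)
    return out
```

Because the regression is refit on each keyframe, the mapping adapts to the video's own statistics without any external training data, which is what makes the approach self-supervised.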
Image generation. Next, we present a generative modeling approach based on successive subspace learning (SSL). Unlike most generative models in the literature, our method does not rely on neural networks to analyze the underlying source distribution and synthesize images. The resulting method, called the progressive attribute-guided extendable robust image generative (PAGER) model, offers mathematical transparency, progressive content generation, shorter training time, robust performance with fewer training samples, and extendibility to conditional image generation. PAGER consists of three modules: a core generator, a resolution enhancer, and a quality booster. The core generator learns the distribution of low-resolution images and performs unconditional image generation; the resolution enhancer increases image resolution via conditional generation; and the quality booster adds finer details to the generated images. Extensive experiments on the MNIST, Fashion-MNIST, and CelebA datasets demonstrate the generative performance of PAGER.
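A minimal sketch of the core generator's idea follows, with PCA standing in for the Saab transform used in successive subspace learning and a Gaussian mixture modeling the latent distribution. The latent dimension and mixture count are illustrative, not PAGER's settings, and the enhancer and booster stages are omitted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

class CoreGeneratorSketch:
    """Unconditional low-resolution generation without neural networks.

    PCA is a stand-in for the SSL (Saab) transform; a GMM models
    the resulting latent distribution.
    """
    def __init__(self, n_components=64, n_mixtures=10):
        self.pca = PCA(n_components=n_components)
        self.gmm = GaussianMixture(n_components=n_mixtures)

    def fit(self, images):
        # images: N x D matrix of flattened low-resolution images.
        latents = self.pca.fit_transform(images)
        self.gmm.fit(latents)
        return self

    def sample(self, n):
        # Draw latent codes from the learned mixture, then map them
        # back to image space with the inverse transform.
        z, _ = self.gmm.sample(n)
        return self.pca.inverse_transform(z)

# Usage: gen = CoreGeneratorSketch().fit(train_flat)
#        samples = gen.sample(16)  # 16 flattened low-res images
```

Because every stage is a fitted linear transform or a fitted mixture, training reduces to a handful of closed-form or EM fits, which is the source of the transparency and low training cost claimed above.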
Facial attribute editing. Finally, we present a facial attribute editing method based on the Gaussian mixture model (GMM). Our proposed method, named AttGMM, is the first to perform facial attribute editing without exploiting neural networks. AttGMM first reconstructs the given image in a low-dimensional latent space through a posterior probability distribution. Next, it manipulates the low-dimensional latent vectors toward a target attribute. Finally, AttGMM uses the difference between the results of the previous two steps, together with the given image, to generate a refined, sharp image that possesses the target attribute. We show that AttGMM substantially lowers computational cost, and we present several experimental results to demonstrate its performance.
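The reconstruct-manipulate-refine flow can be sketched in a simplified linear form. Here PCA replaces AttGMM's GMM posterior inference, and a difference of class-conditional latent means replaces its learned manipulation; both are loud simplifications, and the function and parameter names are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA

def edit_attribute_sketch(images, labels, target_img, strength=1.0, dim=128):
    """Attribute editing sketch in a linear latent space.

    images: N x D matrix of flattened face images.
    labels: length-N 0/1 array marking the attribute's presence.
    target_img: flattened D-vector to edit.
    """
    pca = PCA(n_components=dim).fit(images)
    z = pca.transform(images)
    # Attribute direction: difference of class-conditional means
    # (a stand-in for AttGMM's latent manipulation step).
    direction = z[labels == 1].mean(axis=0) - z[labels == 0].mean(axis=0)
    # Step 1: reconstruct the input in the latent space.
    z0 = pca.transform(target_img[None, :])
    recon = pca.inverse_transform(z0)
    # Step 2: shift the latent code toward the target attribute.
    edited = pca.inverse_transform(z0 + strength * direction)
    # Step 3: add the edit residual back onto the original image so
    # its high-frequency detail survives the low-dimensional detour.
    return target_img + (edited - recon)[0]
```

The final step mirrors the refinement described above: only the difference between the manipulated and unmodified reconstructions is applied to the original image, which keeps the output sharp.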