[Review] Image denoising + lightweight networks (2021~2023)

jeongjeong2 · February 20, 2023

Medical AI Lab · 9/11

2021

Single-Image Dehazing via Compositional Adversarial Network

Single-image dehazing has been an important topic given the commonly occurring image degradation caused by adverse atmospheric aerosols. The key to haze removal relies on an accurate estimation of the global air-light and the transmission map. Most existing methods estimate these two parameters using separate pipelines, which reduces efficiency and accumulates errors, thus leading to a suboptimal approximation, hurting model interpretability, and degrading performance. To address these issues, this article introduces a novel generative adversarial network (GAN) for single-image dehazing. The network consists of a novel compositional generator and a novel deeply supervised discriminator. The compositional generator is a densely connected network, which combines fine-scale and coarse-scale information. Benefiting from the new generator, our method can directly learn the physical parameters from data and recover clean images from hazy ones in an end-to-end manner. The proposed discriminator is deeply supervised, which enforces the output of the generator to look similar to clean images from low-level details to high-level structures. To the best of our knowledge, this is the first end-to-end generative adversarial model for image dehazing that simultaneously outputs clean images, transmission maps, and air-lights. Extensive experiments show that our method remarkably outperforms the state-of-the-art methods. Furthermore, to facilitate future research, we create a new dataset.
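
Since this paper and several others below build on the atmospheric scattering model I(x) = J(x)t(x) + A(1 - t(x)), a minimal NumPy sketch of the model inversion may help. Estimating t and A is exactly what the networks learn, so the function below only shows the final recovery step (names and the t-floor value are illustrative):

```python
import numpy as np

def recover_clean_image(hazy, transmission, airlight, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t).

    hazy:         H x W x 3 image in [0, 1]
    transmission: H x W map in (0, 1], e.g. a network prediction
    airlight:     length-3 global air-light vector
    """
    a = np.asarray(airlight)
    t = np.clip(transmission, t_min, 1.0)[..., None]  # floor t to avoid blow-up
    clean = (hazy - a) / t + a
    return np.clip(clean, 0.0, 1.0)
```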

CrossNet++: Cross-Scale Large-Parallax Warping for Reference-Based Super-Resolution

The ability of camera arrays to efficiently capture a higher space-bandwidth product than single cameras has led to various multiscale and hybrid systems. These systems play vital roles in computational photography, including light field imaging, 360 VR cameras, gigapixel videography, etc. One of the critical tasks in multiscale hybrid imaging is matching and fusing cross-resolution images from different cameras under perspective parallax. In this paper, we investigate the reference-based super-resolution (RefSR) problem associated with dual-camera or multi-camera systems. RefSR consists of super-resolving a low-resolution (LR) image given an external high-resolution (HR) reference image, where the two suffer both a significant resolution gap (8x) and large parallax (~10% pixel displacement). We present CrossNet++, an end-to-end network containing novel two-stage cross-scale warping modules, an image encoder, and a fusion decoder. Stage I learns to narrow down the parallax distinctively with the strong guidance of landmarks and intensity distribution consensus. Stage II then performs more fine-grained alignment and aggregation in the feature domain to synthesize the final super-resolved image. To further address the large parallax, new hybrid loss functions comprising a warping loss, a landmark loss, and a super-resolution loss are proposed to regularize training and enable better convergence. CrossNet++ significantly outperforms the state-of-the-art on light field datasets as well as real dual-camera data. We further demonstrate the generalization of our framework by transferring it to video super-resolution and video denoising.
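
At the heart of such warping modules is flow-guided feature resampling. A hedged PyTorch sketch of a generic flow-based warp (the actual CrossNet++ modules add landmark guidance and multi-scale structure; names here are illustrative):

```python
import torch
import torch.nn.functional as F

def warp_features(ref_feat, flow):
    """Warp HR-reference features toward the LR view with a per-pixel flow.

    ref_feat: (N, C, H, W) features from the reference image
    flow:     (N, 2, H, W) displacements in pixels, ordered (x, y)
    """
    n, _, h, w = ref_feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(ref_feat.device)  # (2, H, W)
    coords = base.unsqueeze(0) + flow              # absolute sampling positions
    gx = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0  # normalize to [-1, 1]
    gy = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)           # (N, H, W, 2)
    return F.grid_sample(ref_feat, grid, align_corners=True)
```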

A Noise Removal Algorithm Based on OPTICS for Photon-Counting LiDAR Data

Ice, Cloud, and Land Elevation Satellite-2 (ICESat-2) shows great potential for forest height retrieval. However, there are abundant noise photons in the ICESat-2 data, which make the accurate extraction of global forest heights challenging. In this letter, a novel algorithm based on the clustering method of ordering points to identify the clustering structure (OPTICS) is proposed to remove noise photons. First, we modified the circular search area in the OPTICS algorithm to an elliptical shape. Second, a distance ordering of all photons was generated using the modified OPTICS algorithm. Finally, signal photons were effectively detected using distance thresholds set by the Otsu method. To evaluate the algorithm's performance, both simulated and real ICESat-2 data were applied to our proposed algorithm. In addition, we compared our algorithm with another noise removal algorithm based on the modified density-based spatial clustering of applications with noise (DBSCAN). The results show that our algorithm works well in distinguishing signal and noise photons, as indicated by high F values. Compared with the modified DBSCAN, our algorithm performs better in filtering out noise photons on both the simulated and real ICESat-2 data sets. The results also indicate that our algorithm is robust, since it is insensitive to the clustering parameters. Overall, the newly proposed algorithm is effective for removing noise photons in ICESat-2 data.
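
The elliptical search area can be emulated by rescaling the along-track and elevation axes before measuring Euclidean distance, since an ellipse in the original coordinates becomes a circle in the rescaled ones. A short sketch under that assumption (the semi-axis values are placeholders, not the paper's):

```python
import numpy as np

def elliptical_distances(photons, a=10.0, b=1.0):
    """Pairwise distances under an elliptical metric for photon clustering.

    photons: (N, 2) array of (along_track_m, elevation_m)
    a, b:    semi-axes of the search ellipse (along-track, elevation)
    """
    scaled = photons / np.array([a, b])          # ellipse -> unit circle
    diff = scaled[:, None, :] - scaled[None, :, :]
    return np.linalg.norm(diff, axis=-1)

# The signal/noise cut on the resulting ordering distances can then be set
# automatically, e.g. with skimage.filters.threshold_otsu(ordering_distances).
```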

Low-Illumination Image Enhancement for Night-Time UAV Pedestrian Detection

To accomplish reliable pedestrian detection using unmanned aerial vehicles (UAVs) under night-time conditions, an image enhancement method is developed in this article to improve low-illumination image quality. First, the image brightness is mapped to a desirable level by a hyperbolic tangent curve. Second, block-matching and 3-D filtering methods are developed into an unsharp filter in YCbCr color space for image denoising and sharpening. Finally, pedestrian detection is performed using a convolutional neural network model to complete the surveillance task. Experimental results show that the Minkowski distance measurement index of enhanced images is increased to 0.975, and the detection accuracies, in F-measure and confidence coefficient, reach 0.907 and 0.840, respectively, the highest compared with other image enhancement methods. The developed method has potential value for night-time UAV visual monitoring in smart city applications.
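
The first step, a tanh brightness mapping, is easy to picture; a minimal sketch (the paper's exact curve parameters are not given in the abstract, so the gain here is an assumed knob):

```python
import numpy as np

def tanh_brightness_map(img, gain=2.0):
    """Brightness mapping with a hyperbolic tangent curve.

    img: float image in [0, 1]; dark values are lifted, highlights compressed.
    """
    out = np.tanh(gain * img) / np.tanh(gain)  # normalized so 1.0 stays 1.0
    return np.clip(out, 0.0, 1.0)
```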

TBEFN: A Two-Branch Exposure-Fusion Network for Low-Light Image Enhancement

Images obtained under low-light conditions are usually accompanied by varied and highly unpredictable degradation. The uncertainty of the imaging environment makes enhancement even more challenging. In this paper, we present a two-branch exposure-fusion network to tackle the problem of blind low-light image enhancement. In the first part of the paper, we provide a basic insight into the degradation mechanism of low-light images, and propose a quick and effective enhancement strategy by estimating the transfer function for varied illumination levels. To further deal with the challenge posed by the blind setting, a novel generation-and-fusion strategy is then introduced, where the enhancements for slightly and heavily distorted images are carried out respectively in the two enhancing branches, followed by a self-adaptive attention unit that performs the final fusion. Moreover, a two-stage denoising strategy is also proposed to ensure effective noise reduction in a data-driven manner. To evaluate the performance of the proposed method, three commonly used datasets are adopted for quantitative evaluation and six for visual evaluation, where our method outperforms many existing state-of-the-art ones, showing great effectiveness and potential.

MCnet: Multiple Context Information Segmentation Network of No-Service Rail Surface Defects

Surface defect segmentation of no-service rail is important for its quality assessment. No-service rail surface defects (NRSDs) pose several challenges: uneven illumination, complex background, and difficulty of sample collection. In this article, we propose an acquisition scheme with two lamp lights and a color line-scan charge-coupled device (CCD) to alleviate uneven illumination. Then, a multiple context information segmentation network is proposed to improve NRSD segmentation. The network makes full use of context information based on dense blocks, a pyramid pooling module, and multi-information integration. Besides, an attention mechanism is applied to optimize the extracted information by filtering noise. For the problem of real-sample shortage, we propose to utilize artificial samples to train the network, and an NRSD dataset, NRSD-MN, is built with artificial and natural NRSDs. Experimental results show that our method is feasible and has a good segmentation effect on both artificial and natural NRSDs.

Decomposition Makes Better Rain Removal: An Improved Attention-Guided Deraining Network

Rain streaks in the air show diverse characteristics in shape, direction, and density, and even complex overlapping, posing great challenges for the deraining task. Recently, deep learning based image deraining methods have been extensively investigated due to their excellent performance. However, most existing algorithms still have limitations in removing rain streaks while preserving rich textural details under complicated rain conditions. To this end, we propose to decompose rain streaks into multiple rain layers and individually estimate each of them along the network stages to cope with the increasing levels of abstraction. To better characterize rain layers, an improved non-local block is designed to exploit the self-similarity of rain information by learning holistic spatial feature correlations while reducing computational complexity. Moreover, a mixed attention mechanism is applied to guide the fusion of rain layers by focusing on the local and global overlaps among them. Extensive experiments on synthetic rainy/rain-haze/raindrop datasets, real-world samples, and haze and low-light scenarios show substantial improvements in both quantitative indicators and visual effects over current state-of-the-art technologies. The source code is available at https://github.com/kuihua/IADN.
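
The decomposition rests on an additive rain model, O = B + sum_i R_i, with one layer estimated per network stage. A minimal sketch of how the pieces recombine (predicting each R_i is the network's job, so `predicted_layers` is just a stand-in):

```python
import torch

def compose_rainy(background, rain_layers):
    """Additive rain model O = B + sum_i R_i used by layer-decomposition
    deraining methods.

    background:  (N, 3, H, W) clean image B
    rain_layers: list of (N, 3, H, W) rain layers R_i
    """
    return background + torch.stack(rain_layers).sum(dim=0)

def derain(rainy, predicted_layers):
    """Recover the background by subtracting all predicted rain layers."""
    return rainy - torch.stack(predicted_layers).sum(dim=0)
```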

Deep Dehazing Network With Latent Ensembling Architecture and Adversarial Learning

Most existing dehazing algorithms recover the haze-free image by solving the hazy imaging model with an estimated transmission map and global atmospheric light. However, inaccurate estimation of these variables and the strong assumptions of the imaging model result in unrealistic dehazing results. In this paper, we use the adversarial game between a pair of neural networks to accomplish end-to-end photo-realistic dehazing. To avoid uniform contrast enhancement, the generator learns to simultaneously restore the haze-free image and capture the non-uniformity of haze. The modules for the two tasks are assembled in sequential and parallel manners to enable information sharing at different levels, and the architecture of the generator implicitly forms an ensemble of dehazing models that allows for feature selection. A multi-scale discriminator competes with the generator by learning to detect dehazing artifacts and the inconsistency between the dehazed image and the spatial variation of haze. Unlike existing works that penalize dehazing artifacts via hand-crafted losses, the proposed algorithm uses the identity mapping in the space of clear-scene images to regularize data-driven dehazing. The proposed work also addresses the adaptability of data-driven dehazing to high-level computer vision tasks. We propose a task-driven training strategy that can optimize object detection performance on dehazed images without updating the parameters of the object detector. Performance of the proposed algorithm is assessed on the RESIDE, I-Haze, and O-Haze benchmarks. The comparison with ten state-of-the-art algorithms shows that the proposed work is the best performer in most comparisons.

Attenuation Coefficient Guided Two-Stage Network for Underwater Image Restoration

Underwater images suffer from severe color casts, low contrast, and blurriness, which are caused by scattering and absorption as light propagates through water. However, existing deep learning methods treat the restoration process as a whole and do not fully consider the underwater physical distortion process. Thus, they cannot adequately tackle both absorption and scattering, leading to poor restoration results. To address this problem, we propose a novel two-stage network for underwater image restoration (UIR), which divides the restoration process into two parts, viz. horizontal and vertical distortion restoration. In the first stage, a model-based network is proposed to handle horizontal distortion by directly embedding the underwater physical model into the network. The attenuation coefficient, as a feature representation characterizing water-type information, is first estimated to guide the accurate estimation of the parameters of the physical model. In the second stage, to tackle vertical distortion and reconstruct the clear underwater image, we put forth a novel attenuation coefficient prior attention block (ACPAB) to adaptively recalibrate the RGB channel-wise feature maps of the image suffering from vertical distortion. Experiments on both a synthetic dataset and real-world underwater images demonstrate that our method can effectively tackle scattering and absorption compared with several state-of-the-art methods.
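
The attenuation coefficient enters through a Beer-Lambert-style formation model, in which each color channel decays exponentially with distance. A simplified sketch (this omits the backscatter term B_c(1 - e^(-beta_c d)) of the full underwater model, and all values are illustrative):

```python
import numpy as np

def attenuate(clean, beta, depth):
    """Simplified per-channel underwater image formation:
    I_c = J_c * exp(-beta_c * d), where beta_c is the per-channel
    attenuation coefficient the first-stage network estimates.

    clean: H x W x 3 scene radiance J
    beta:  length-3 attenuation coefficients (1/m); red decays fastest in water
    depth: H x W distance map d in meters
    """
    t = np.exp(-depth[..., None] * np.asarray(beta))  # per-channel transmission
    return clean * t
```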

Sparse Gradient Regularized Deep Retinex Network for Robust Low-Light Image Enhancement

Due to the absence of a desirable objective for low-light image enhancement, previous data-driven methods may produce undesirable enhanced results, including amplified noise, degraded contrast, and biased colors. In this work, inspired by Retinex theory, we design an end-to-end signal prior-guided layer separation and data-driven mapping network with layer-specified constraints for single-image low-light enhancement. A Sparse Gradient Minimization sub-Network (SGM-Net) is constructed to remove low-amplitude structures and preserve major edge information, which facilitates extracting paired illumination maps of low/normal-light images. After the learned decomposition, two sub-networks (Enhance-Net and Restore-Net) are utilized to predict the enhanced illumination and reflectance maps, respectively, which helps stretch the contrast of the illumination map and remove intensive noise in the reflectance map. The effects of all these configured constraints, including the signal structure regularization and losses, combine reciprocally, leading to good reconstruction results in overall visual quality. The evaluation on both synthetic and real images, particularly those containing intensive noise, compression artifacts, and their interleaved artifacts, shows the effectiveness of our novel models, which significantly outperform the state-of-the-art methods.
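
Retinex-based methods like this one (and several others in this list) assume the image factors into reflectance and illumination, I = R * L. A tiny sketch of the model plus a crude classical decomposition for intuition (the paper learns the split with SGM-Net instead of this heuristic):

```python
import numpy as np

def retinex_compose(reflectance, illumination):
    """Retinex model I = R * L (illumination broadcast over color channels)."""
    return reflectance * illumination[..., None]

def naive_decompose(img, eps=1e-3):
    """Heuristic baseline: channel max as illumination, divided out for
    reflectance. Learned decompositions replace exactly this step.

    img: H x W x 3 float image in [0, 1]
    """
    illumination = img.max(axis=-1)
    reflectance = img / (illumination[..., None] + eps)
    return reflectance, illumination
```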

Single Image Dehazing Via Region Adaptive Two-Shot Network

Single image dehazing is the key to enhancing image visibility in outdoor scenes, which facilitates human observation and computer recognition. Existing approaches generally utilize a one-shot strategy that indiscriminately applies the same filters to all local regions. However, by neglecting inhomogeneous illumination and detail distortion, their dehazed results easily suffer from under-filtering or over-filtering across different regions. To tackle this issue, we propose a region-adaptive two-shot network (RATNet) that follows a coarse-to-fine framework. First, a lightweight subnetwork is applied to execute regular global filtering and obtain an initially restored image. Then, a two-branch subnetwork is put forward whose branches separately refine its illumination and detail. Eventually, we derive the final prediction by adaptively aggregating the results after illumination modification and detail restoration, whose region-variant weights are jointly optimized by maximizing the similarity between our fused result and the haze-free counterpart. Extensive experiments validate the superiority of the proposed algorithm.

LLISP: Low-Light Image Signal Processing Net via Two-Stage Network

Images taken in extremely low light suffer from various problems such as heavy noise, blur, and color distortion. Assuming the low-light images contain a good representation of the scene content, current enhancement methods focus on finding a suitable illumination adjustment but often fail to deal with heavy noise and color distortion. Recently, some works have tried to suppress noise and reconstruct low-light images from raw data. But these works apply a network instead of an image signal processing pipeline (ISP) to map the raw data to enhanced results, which imposes a heavy learning burden on the network and yields unsatisfactory results. In order to remove heavy noise, correct color bias, and enhance details more effectively, we propose a two-stage Low Light Image Signal Processing Network named LLISP. The design of our network is inspired by the traditional ISP: processing the images in multiple stages according to the attributes of the different tasks. In the first stage, a simple denoising module is introduced to reduce heavy noise. In the second stage, we propose a two-branch network to reconstruct the low-light images and enhance texture details. One branch aims at correcting color distortion and restoring image content, while the other focuses on recovering realistic texture. Experimental results demonstrate that the proposed method can reconstruct high-quality images from low-light raw data and replace the traditional ISP.

ICEBIN: Image Contrast Enhancement Based on Induced Norm and Local Patch Approaches

Traditional histogram equalization may produce degraded, over-enhanced results under uneven illumination. In this paper, a simple and effective image contrast enhancement method is proposed to achieve high dynamic range imaging. First, the illumination of each pixel is estimated by using an induced norm of a patch of the image. Second, a pre-gamma correction is proposed to appropriately enhance the contrast of the illumination component; the gamma correction parameters are set dynamically based on the local patch of the image. Third, an automatic Contrast-Limited Adaptive Histogram Equalization (CLAHE), whose clip point is set automatically, is applied to the processed image for further contrast enhancement. Fourth, a noise reduction algorithm based on the local patch is developed to reduce image noise and increase image quality. Finally, a post-gamma correction is applied to slightly enhance the dark regions of images without affecting the brighter areas. Experimental results show that the proposed method is superior to several state-of-the-art enhancement techniques in both qualitative and quantitative evaluations.
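
Steps two and three map directly onto standard OpenCV calls; a hedged sketch with fixed parameters (ICEBIN sets gamma and the clip point adaptively from local patches, which this sketch does not reproduce):

```python
import cv2
import numpy as np

def gamma_then_clahe(gray_u8, gamma=0.8, clip_limit=2.0):
    """Pre-gamma correction followed by CLAHE on a uint8 grayscale image."""
    x = (gray_u8.astype(np.float32) / 255.0) ** gamma   # pre-gamma correction
    x = (x * 255.0).astype(np.uint8)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=(8, 8))
    return clahe.apply(x)                  # contrast-limited equalization
```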

C-LIENet: A Multi-Context Low-Light Image Enhancement Network

Enhancement of low-light images is a challenging task due to the impact of low brightness, low contrast, and high noise. The difficulty of collecting labeled natural data intensifies this problem further. Many researchers have attempted to solve this problem using learning-based approaches; however, most models ignore the impact of noise in low-lit images. In this paper, an encoder-decoder architecture made up of separable convolution layers, which solves the issues encountered in low-light image enhancement, is proposed. The architecture is trained end-to-end on a custom low-light image dataset (LID) comprising both clean and noisy images. We introduce a unique multi-context feature extraction module (MC-FEM) where the input first passes through a feature pyramid of dilated separable convolutions for hierarchical-context feature extraction, followed by separable convolutions for feature compression. The model is optimized using a novel three-part loss function that focuses on high-level contextual features, structural similarity, and patch-wise local information. We conducted several ablation studies to determine the optimal model for low-light image enhancement under noisy and noiseless conditions. We use performance metrics like peak signal-to-noise ratio, structural similarity index measure, visual information fidelity, and average brightness to demonstrate the superiority of the proposed work over state-of-the-art algorithms. Qualitative results presented in this paper demonstrate the strength and suitability of our model for real-time applications.
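
A pyramid of dilated separable convolutions is a common construct; a hedged PyTorch sketch in the spirit of the MC-FEM (the exact dilation rates, widths, and compression layout are assumptions, not the paper's):

```python
import torch
import torch.nn as nn

class DilatedSeparableConv(nn.Module):
    """Depthwise-separable 3x3 convolution with a given dilation rate."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=dilation,
                                   dilation=dilation, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class MultiContextBlock(nn.Module):
    """Feature pyramid of dilated separable convolutions, then 1x1 compression."""
    def __init__(self, channels, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(DilatedSeparableConv(channels, r)
                                      for r in rates)
        self.compress = nn.Conv2d(channels * len(rates), channels, 1)

    def forward(self, x):
        return self.compress(torch.cat([b(x) for b in self.branches], dim=1))
```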

Lightweight Deep Extraction Networks for Single Image De-raining

In bad weather, artifacts such as rain streaks degrade image quality. In addition, artifacts in the damaged image obstruct human vision and adversely affect the accuracy of object detection. Hence, single image rain removal is an important issue for improving image quality. However, state-of-the-art methods have the limitation of requiring a lot of training data. This paper proposes a lightweight Deep Extraction Network (DEN), which performs well on image de-raining even with a small training dataset. In particular, we design a novel Light Residual Block (LRB), which is connected in five cascading layers for extracting deep local features. Furthermore, DEN deploys residual learning to train on the artifacts only. The experimental results on synthetic and real-world rainy images demonstrate its effectiveness in terms of visual and quantitative performance.
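
"Residual learning to train on the artifacts only" means the network regresses the rain map and the clean image is obtained by subtraction; a minimal sketch (the `backbone` argument stands in for the cascaded LRBs, whose internals the abstract does not specify):

```python
import torch.nn as nn

class ResidualDerainer(nn.Module):
    """Predict only the rain artifacts; output = input - predicted rain."""
    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone   # e.g. the cascade of Light Residual Blocks

    def forward(self, rainy):
        rain = self.backbone(rainy)   # estimated artifact map
        return rainy - rain           # de-rained image
```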

Attention-Based Multi-Branch Network for Low-Light Image Enhancement

Low-light conditions make the obtained images suffer a series of degradations, such as low contrast, noise interference, and color distortion. Many previous learning-based methods have made remarkable progress, but they may still produce unsatisfactory results because they ignore noise in low-light regions. An attention-based multi-branch network is proposed, which can adequately enhance the image and suppress latent noise. The proposed method first estimates the illumination component and the reflectance component through a decomposition process. Then the illumination component is brightened to reconstruct the global lighting distribution, and the reflectance component is restored to remove noise and maintain details. A lightweight but effective attention block is employed to guide the restoration of the reflectance component, so as to concentrate on the distribution of lighting in different regions and effectively suppress noise in the dim environment. Extensive experiments on several datasets show the proposed method achieves good results compared with classic and state-of-the-art methods.

D3Net: Joint Demosaicking, Deblurring and Deringing

Images acquired with standard digital cameras have Bayer patterns and suffer from lens blur. A demosaicking step is implemented in every digital camera, yet blur often remains unattended due to the computational cost and instability of deblurring algorithms. Linear methods, which are computationally less demanding, produce ringing artifacts in deblurred images. Complex non-linear deblurring methods avoid such artifacts; however, their complexity implies offline application after camera demosaicking, which leads to sub-optimal performance. In this work, we propose a joint demosaicking, deblurring, and deringing network with a light-weight architecture inspired by the alternating direction method of multipliers. The proposed network has a transparent and clear interpretation compared to other black-box data-driven approaches. We experimentally validate its superiority over state-of-the-art demosaicking methods with offline deblurring.

Hierarchically Aggregated Residual Transformation for Single Image Super Resolution

Visual patterns usually appear at different scales/sizes in natural images. Multi-scale feature representation is of great importance for the single-image super-resolution (SISR) task to reconstruct image objects at different scales. However, this characteristic has rarely been considered by CNN-based SISR methods. In this work, we propose a novel building block, i.e. hierarchically aggregated residual transformation (HART), to achieve multi-scale feature representation in each layer of the network. Within each HART block, we connect multiple convolutions in a hierarchical residual-like manner, which efficiently provides a wide range of effective receptive fields at a more granular level to detect both local and global image features. To theoretically understand the proposed HART block, we recast SISR as an optimal control problem and show that HART effectively approximates the classical 4th-order Runge-Kutta method, which has the merit of a small local truncation error for solving ordinary differential equations numerically. By cascading the proposed HART blocks, we establish our high-performing HARTnet. Through extensive experiments on various benchmark datasets under different degradation models, we demonstrate that HARTnet compares favourably against existing state-of-the-art methods (including those on the NTIRE 2019 SR Challenge leaderboard) in terms of both quantitative metrics and visual quality. Moreover, the same HARTnet architecture achieves promising performance on other image restoration tasks such as image denoising and low-light image enhancement.
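
The ODE view is worth spelling out: a plain residual block x + f(x) is one forward-Euler step of x' = f(x), while the classical RK4 update combines four evaluations of f for a much smaller local truncation error (O(h^5) per step). A sketch of that update, with f standing in for the learned transformation:

```python
def rk4_step(f, x, h=1.0):
    """One classical 4th-order Runge-Kutta step for x' = f(x).
    HART's hierarchical residual aggregation is shown to approximate
    this combination of intermediate evaluations."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
```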

Deep Fusion of RGB and NIR Paired Images Using Convolutional Neural Networks

In low-light conditions, the captured color (RGB) images are highly degraded by noise, with severe texture loss. In this paper, we propose deep fusion of RGB and NIR paired images in low-light conditions using convolutional neural networks (CNNs). The proposed deep fusion network consists of three independent sub-networks: DenoisingNet, EnhancingNet, and FusionNet. We build a denoising sub-network to eliminate noise from noisy RGB images. After denoising, we apply an enhancing sub-network to increase the brightness of the low-light RGB images. Since the NIR image contains fine details, we fuse it with the Y channel of the RGB image through a fusion sub-network. Experimental results demonstrate that the proposed method successfully fuses RGB and NIR images, and generates high-quality fusion results containing textures and colors.

Automatical Enhancement and Denoising of Extremely Low-light Images

Deep convolutional neural network (DCNN) based methodologies have recently achieved remarkable performance on various low-level vision tasks. Restoring images captured at night is one of the trickiest low-level vision tasks due to their high noise level and low intensity. We propose a DCNN-based methodology, the Illumination and Noise Separation Network (INSNet), which performs both denoising and enhancement on these extremely low-light images. INSNet fully utilizes global-aware and local-aware features using a modified network structure and image sampling scheme. Compared to well-designed complex neural networks, our proposed methodology only needs to add a bypass network to the existing network. Yet it can dramatically boost the quality of recovered images while increasing the computational cost by less than 0.1%. Even without any manual settings, INSNet can stably restore extremely low-light images to the desired high-quality images.

Ancient Horoscopic Palm Leaf Binarization Using A Deep Binarization Model - RESNET

Binarization of ancient documents is a challenging task. Many traditional binarization algorithms exist with good accuracy, but they cannot remove all the kinds of noise present in ancient documents. A traditional RESNET does not use batch normalization, because of which it takes too much time for training, whereas the proposed RESNET uses batch normalization, which increases the speed of model training. Also, a huge dataset cannot be used all at once for enhancement, so deep learning models like RESNET are used to remove noise from ancient documents with good accuracy. The modified RESNET model gives good accuracy in enhancing degraded ancient images; the residual network removes noise such as ink bleed and uneven illumination. The proposed work is mainly based on a modified RESNET in which convolution and batch normalization, along with ReLU, form one block, and five such blocks are used for image binarization. It works as a two-phase method of down-sampling and up-sampling, which efficiently binarizes degraded ancient palm leaf manuscripts with an accuracy of 95.38%.

Learning Model-Blind Temporal Denoisers without Ground Truths

Denoisers trained with synthetic noise often fail to cope with the diversity of real noise, giving way to methods that can adapt to unknown noise without noise modeling or ground truth. Previous image-based methods lead to noise overfitting if directly applied to temporal denoising, and manage temporal information inadequately, especially with respect to occlusion and lighting variation. In this paper, we propose a general framework for temporal denoising that successfully addresses these challenges. A novel twin sampler assembles training data by decoupling inputs from targets without altering semantics, which not only solves the noise overfitting problem, but also generates better occlusion masks by checking optical flow consistency. Lighting variation is quantified based on the local similarity of aligned frames. Our method consistently outperforms the prior art by 0.6-3.2 dB PSNR across multiple noises, datasets, and network architectures. State-of-the-art results on reducing model-blind video noise are achieved.
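
Occlusion masks from optical-flow consistency are a standard construct: a pixel is trusted only where the forward and (warped-back) backward flows roughly cancel. A sketch with the conventional thresholds from the flow literature (not necessarily the paper's values):

```python
import numpy as np

def occlusion_mask(flow_fw, flow_bw_warped, alpha=0.01, beta=0.5):
    """Forward-backward flow consistency check.

    flow_fw:        (H, W, 2) flow from frame t to t+1
    flow_bw_warped: (H, W, 2) flow from t+1 to t, warped back to frame t
    Returns True where pixels pass the check (visible in both frames).
    """
    diff = np.sum((flow_fw + flow_bw_warped) ** 2, axis=-1)
    mag = np.sum(flow_fw ** 2, axis=-1) + np.sum(flow_bw_warped ** 2, axis=-1)
    return diff < alpha * mag + beta
```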

PD-GAN: Perceptual-Details GAN for Extremely Noisy Low Light Image Enhancement

Extremely noisy low-light enhancement suffers from high-level noise, loss of texture detail, and color degradation. When recovering color or illumination for images taken in a dark environment, the challenge for networks is how to balance the enhancement of noise and texture details for a good visual effect. A single network is not suitable for solving the ill-posed problem of mapping the input image's noise to the clean target in the ground truth. To solve these problems, we propose the perceptual-details GAN (PD-GAN), which utilizes Zero-DCE to initially recover illumination and combines a residual dense-block encoder-decoder structure to suppress noise while finely adjusting the illumination. Besides, fractional differential gradient masks are integrated into the discriminator to enhance details. Experimental results demonstrate that PD-GAN outperforms other methods on the extremely low-light image dataset.

Deep Learning For Light Field Microscopy Using Physics-Based Models

Light Field Microscopy (LFM) is an imaging technique that captures 3D spatial information in a single 2D image. LFM is attractive because of its relatively simple implementation and fast acquisition rate. However, classic 3D reconstruction typically suffers from high computational cost, low lateral resolution, and reconstruction artifacts. In this work, we propose a new physics-based learning approach to improve reconstruction performance under realistic conditions, namely lack of training data, background noise, and high data dimensionality. First, we propose a novel description of the system using a linear convolutional neural network, complemented by a method that compacts the number of views of the acquired light field. Then, this model is used to solve the inverse problem under two scenarios. If labelled data is available, we train an end-to-end network that uses the Learned Iterative Shrinkage and Thresholding Algorithm (LISTA). If no labelled data is available, we propose an unsupervised technique that trains LISTA using only unlabelled data by making use of Wasserstein Generative Adversarial Networks (WGANs). We experimentally show that our approach performs better than classic strategies in terms of artifact reduction and image quality.

Study of robust facial recognition under occlusion using different techniques

This paper presents the different techniques used for the study of robust facial recognition. The need for facial recognition is expanding very rapidly in current technologies, as it is possible to identify human facial features through different digital mediums. However, facial recognition is not easily achieved in real-world situations, as the human face often undergoes various occlusions, making it hard for the system to complete the identification process. The sole purpose of this article is to compare the techniques used and identify the most accurate and efficient one among them. Robust facial recognition is the task of performing recognition under uncertain environments and features; its main aim is to analyze and identify face images with difficult viewpoints, angles, poses, illumination, noise, and expressions. The techniques studied for this article are Local Binary Patterns, HOG features, Occlusion-adaptive Deep Networks (ODN), Robust Principal Component Analysis (RPCA), Progressive Convolutional Neural Networks, and Region Attention Networks (RAN). A table is provided summarizing the experiments and results of the techniques studied.

A Denoising Method for Light Field Imaging Sensor Based on Spatial-Angular Collaborative Encoding Network

Light field (LF) imaging sensors based on micro-lens arrays are susceptible to considerable noise pollution when collecting raw 4D LF data due to their special structural design, which affects visual perception and subsequent applications such as depth estimation. Unfortunately, existing 2D image denoising methods are difficult to apply directly to 4D LF images. To this end, this paper proposes a new LF denoising method based on a spatial-angular collaborative encoding network that considers the inherent 4D structure of LF images. Specifically, convolutions in the spatial and angular branches are first constructed to extract specific 2D spatial and 2D angular features from noisy LF data. Then, a tailored spatial-angular collaborative encoder is designed to co-process spatial-angular features and improve the expressive ability of the features. After that, a spatial-angular feature fusion module is constructed to fuse the extracted features. Finally, the denoised LF image is reconstructed by a residual prediction module integrating an attention mechanism. In particular, the proposed method reconstructs all sub-aperture images of the LF simultaneously in one forward inference, so as to preserve the angular consistency of the denoised LF image. Extensive experimental results show that the proposed method outperforms the state-of-the-art methods in both subjective visual perception and objective quality evaluation. Furthermore, the proposed method preserves the parallax structure well, which is beneficial for subsequent LF applications.

Pixel-Wise Wasserstein Autoencoder for Highly Generative Dehazing

We propose a highly generative dehazing method based on pixel-wise Wasserstein autoencoders. In contrast to existing dehazing methods based on generative adversarial networks, our method can produce a variety of dehazed images with different styles. It significantly improves the dehazing accuracy via pixel-wise matching from hazy to dehazed images through 2-dimensional latent tensors of the Wasserstein autoencoder. In addition, we present an advanced feature fusion technique to deliver rich information to the latent space. For style transfer, we introduce a mapping function that transforms existing latent spaces to new ones. Thus, our method can produce highly generative haze-free images with various tones, illuminations, and moods, which induces several interesting applications, including low-light enhancement, daytime dehazing, nighttime dehazing, and underwater image enhancement. Experimental results demonstrate that our method quantitatively outperforms existing state-of-the-art methods for synthetic and real-world datasets, and simultaneously generates highly generative haze-free images, which are qualitatively diverse.

Outside Box and Contactless Palm Vein Recognition Based on a Wavelet Denoising ResNet

Palm vein recognition is a high-security biometric. Recognition outside the NIR capture box and contactless palm vein recognition are more popular but challenging: users feel more comfortable outside the NIR capture box but face more optical blurring caused by visible light, while contactless capture gestures solve the hygiene problem but introduce image rotation, position translation, and scale transformation, which make classification difficult, especially on large-scale databases. To address these problems, we develop a wavelet denoising ResNet, which consists of two models: the wavelet denoising (WD) model and the squeeze-and-excitation ResNet18 (SER) model. The WD model focuses on removing noise from skin scattering and optical blurring in palm vein images; it promotes the low-frequency feature into a deep learning feature by residual learning, a strategy that increases the weight of an effective handcrafted feature in the deep learning network. The SER model overcomes rotation, position translation, and scale transformation by selectively emphasizing classification features and weakening less useful ones. To train and verify the network, an inside-box palm vein image database and an outside-box palm vein image database were set up. The Tongji contactless palm vein image database was also employed in the experiments. The validity and superiority of our network are verified in a series of experiments.

Rethinking Noise Modeling in Extreme Low-Light Environments

Recent research has shown that Convolutional Neural Networks (CNNs) trained in a fully supervised fashion achieve promising performance on the extreme low-light image denoising task. However, a large number of "noisy-clean" image pairs are required to train a network, and these are difficult to obtain. In this paper, we propose a compact yet effective noise model to generate synthetic noisy images for training. In particular, we address the severe color distortion problem in low-light images by identifying a novel noise component, black calibration error, as its physical origin. We prove that a small error at the sensing stage strongly affects the subsequent in-camera signal processing (ISP) pipeline and eventually leads to color bias. Experimental results demonstrate that the proposed model is superior in preserving perceptual quality and achieves state-of-the-art performance among existing noise synthesis methods.
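
Physics-based low-light noise synthesis typically combines Poisson shot noise, Gaussian read noise, and sensor offsets; the component this paper singles out is a small error in the black level. A hedged sketch of such a generator (all parameter values are illustrative, not the paper's calibration):

```python
import numpy as np

def synthesize_low_light_raw(clean_electrons, gain=4.0, read_std=2.0,
                             black_level=64.0, black_error=1.5, white=1023.0):
    """Synthesize a noisy low-light raw frame from a clean signal.

    clean_electrons: non-negative array of expected photoelectron counts
    gain:            electrons per digital number (DN)
    black_error:     mis-calibrated black-level offset (in DN) that
                     propagates through the ISP and biases colors
    """
    shot = np.random.poisson(clean_electrons)                 # Poisson shot noise
    read = np.random.normal(0.0, read_std, clean_electrons.shape)
    dn = (shot + read) / gain + black_level + black_error
    return np.clip(dn, 0.0, white)
```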

On Constructing A Better Correlation Predictor For PRNU-Based Image Forgery Localization

Localizing image forgeries is one of the key topics in multimedia forensics. Among many image forgery localization techniques, the one based on the photo-response non-uniformity (PRNU) noise has attracted substantial attention because of its capability of localizing forgeries regardless of the type of forgery. However, despite the devoted efforts to improving the performance of PRNU-based forgery localization, there remain challenges to be overcome, especially for detecting subtle forgeries in PRNU-attenuated regions due to complex image content. In this work, we investigate the feasibility and effectiveness of convolutional neural networks (CNN) in predicting PRNU correlations under complex backgrounds for more accurate forgery localization. The experimental results on 20 cameras and 200 realistic forgery images show that significant improvement in correlation prediction and forgery localization can be achieved even with a light-weight CNN model. The robustness of different correlation predictors against JPEG compression is also evaluated.

Multi-Stage Raw Video Denoising with Adversarial Loss and Gradient Mask

In this paper, we propose a learning-based approach for denoising raw videos captured under low lighting conditions. We do this by first explicitly aligning the neighboring frames to the current frame using a convolutional neural network (CNN). We then fuse the registered frames using another CNN to obtain the final denoised frame. To avoid directly aligning temporally distant frames, we perform the two processes of alignment and fusion in multiple stages. Specifically, at each stage, we perform the denoising process on three consecutive input frames to generate intermediate denoised frames, which are then passed as the input to the next stage. By performing the process in multiple stages, we can effectively utilize the information of neighboring frames without directly aligning the temporally distant frames. We train our multi-stage system using an adversarial loss with a conditional discriminator. Specifically, we condition the discriminator on a soft gradient mask to prevent introducing high-frequency artifacts in smooth regions. We show that our system is able to produce temporally coherent videos with realistic details. Furthermore, we demonstrate through extensive experiments that our approach outperforms state-of-the-art image and video denoising methods both numerically and visually.

DeRenderNet: Intrinsic Image Decomposition of Urban Scenes with Shape-(In)dependent Shading Rendering

We propose DeRenderNet, a deep neural network that decomposes the albedo and latent lighting, and renders shape-(in)dependent shadings, given a single image of an outdoor urban scene, trained in a self-supervised manner. To achieve this goal, we propose to use the albedo maps extracted from scenes in video games as direct supervision and to pre-compute the normal and shadow prior maps based on the provided depth maps as indirect supervision. Compared with state-of-the-art intrinsic image decomposition methods, DeRenderNet produces shadow-free albedo maps with clean details and an accurate prediction of shadows in the shape-independent shading, which is shown to be effective in re-rendering and in improving the accuracy of high-level vision tasks for urban scenes.

Haze Relevant Feature Attention Network for Single Image Dehazing

Single image dehazing methods based on deep learning techniques have made great achievements in recent years. However, some methods recover haze-free images by estimating the so-called transmission map and global atmospheric light, which strictly limits them to the simplified atmospheric scattering model and does not exploit the advantage of deep learning in fitting complex functions. Other methods require pairs of training data, whereas in practice pairs of hazy and corresponding haze-free images are difficult to obtain. To address these problems, inspired by the cycle generative adversarial model, we have developed an end-to-end haze relevant feature attention network for single image dehazing, which does not require paired training images. Specifically, we make explicit use of haze-relevant features by embedding an attention module into a novel dehazing generator that combines an encoder-decoder structure with dense blocks. The constructed network adopts a novel strategy that derives attention maps from several hand-designed priors, such as the dark channel, color attenuation, and maximum contrast. Since haze is usually unevenly distributed across an image, the attention maps can serve as guidance on the amount of haze at image pixels. Meanwhile, dense blocks can maximize information flow among features from different levels. Furthermore, a color loss is proposed to avoid color distortion and generate visually better haze-free images. Extensive experiments demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods.

Detection and Blur-Removal of Single Motion Blurred Image using Deep Convolutional Neural Network

This paper proposes a simple and efficient motion blur detection and removal method based on a deep CNN. The domain of computer vision has gained significant importance in recent years due to a surge in the fields of self-driving cars, UAVs, medical image processing, etc. Due to low-light conditions and fast camera motion, a large portion of the generated image data is wasted. Such motion-blurred images impose a great challenge on the algorithms used for decision-making in machine vision. Although there have been significant improvements in denoising such image data, these methods are challenged by time constraints, insufficient training data, reconstructed image quality, etc. This paper employs a learning method to detect and deblur a single input image even in the absence of a ground-truth sharp image. We use a synthetic dataset for experimental evaluation. This synthetic dataset, which we created and used to train the DCNN model, has been made open source on Kaggle at the following link: https://www.kaggle.com/dikshaadke/motionblurdataset

Rain-Free and Residue Hand-in-Hand: A Progressive Coupled Network for Real-Time Image Deraining

Rainy weather is a challenge for many vision-oriented tasks (e.g., object detection and segmentation), as it causes performance degradation. Image deraining is an effective solution to avoid the performance drop of downstream vision tasks. However, most existing deraining methods either fail to produce satisfactory restoration results or cost too much computation. In this work, considering both the effectiveness and efficiency of image deraining, we propose a progressive coupled network (PCNet) to well separate rain streaks while preserving rain-free details. To this end, we investigate the blending correlations between them and devise a novel coupled representation module (CRM) to learn the joint features and the blending correlations. By cascading multiple CRMs, PCNet extracts the hierarchical features of multi-scale rain streaks and separates the rain-free content and rain streaks progressively. To promote computational efficiency, we employ depth-wise separable convolutions and a U-shaped structure, and construct the CRM in an asymmetric architecture to reduce model parameters and memory footprint. Extensive experiments are conducted to evaluate the efficacy of the proposed PCNet in two aspects: (1) image deraining on several synthetic and real-world rain datasets and (2) joint image deraining and downstream vision tasks (e.g., object detection and segmentation). Furthermore, we show that the proposed CRM can easily be adopted for similar image restoration tasks, including image dehazing and low-light enhancement, with competitive performance. The source code is available at https://github.com/kuijiang0802/PCNet.

Subband Adaptive Enhancement Of Low Light Images Using Wavelet-Based Convolutional Neural Networks

Images captured in low-light conditions have a narrow dynamic range with a dark tone, and are seriously degraded by noise due to the low signal-to-noise ratio (SNR). The discrete wavelet transform (DWT) is invertible and thus able to decompose an image into subbands without information loss while minimizing redundancy. In this paper, we propose subband adaptive enhancement of low-light images using wavelet-based convolutional neural networks. We adopt the DWT to achieve joint contrast enhancement and noise reduction, combining it with convolutional neural networks (CNNs), i.e. a wavelet-based CNN, to facilitate subband adaptive processing. First, we decompose the input image into LL, LH, HL, and HH subbands to obtain the low and high frequency components. Second, we perform contrast enhancement on the LL subband and noise reduction on the LH, HL, and HH subbands. Finally, we perform refinement to enhance image details. Experimental results show that the proposed method enhances low-light images while successfully removing noise, and outperforms state-of-the-art methods in terms of visual quality and quantitative measurements.
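
The subband split itself is a one-liner with PyWavelets; a sketch of the described pipeline with the CNN branches left as stand-in callables:

```python
import pywt

def subband_process(img, enhance, denoise, wavelet="haar"):
    """One-level DWT, subband-adaptive processing, lossless reconstruction.

    img:      2-D float array (e.g. a luminance channel)
    enhance:  callable applied to the LL (low-frequency) subband
    denoise:  callable applied to each of the LH/HL/HH subbands
    """
    ll, (lh, hl, hh) = pywt.dwt2(img, wavelet)          # analysis
    ll = enhance(ll)                                    # contrast enhancement
    lh, hl, hh = (denoise(s) for s in (lh, hl, hh))     # noise reduction
    return pywt.idwt2((ll, (lh, hl, hh)), wavelet)      # synthesis
```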

Using the Overlapping Score to Improve Corruption Benchmarks

Neural networks are sensitive to various corruptions that commonly occur in real-world applications, such as blur, noise, low-lighting conditions, etc. To estimate the robustness of neural networks to these common corruptions, we generally use a group of modeled corruptions gathered into a benchmark. Unfortunately, no objective criterion exists to determine whether a benchmark is representative of a large diversity of independent corruptions. In this paper, we propose a metric called the corruption overlapping score, which can be used to reveal flaws in corruption benchmarks. Two corruptions overlap when the robustness of neural networks to these corruptions is correlated. We argue that taking overlaps between corruptions into account can help to improve existing benchmarks or build better ones.

Image Enhancement of Low Light UAV via Global Illumination Self-aware feature Estimation

UAV images acquired under low-light conditions are often characterized by low contrast and poor visual effect. To improve image quality, a low-light UAV image enhancement method via global illumination self-aware feature estimation is proposed. First, a novel lightweight GhostNet is introduced to extract deeper image features. Second, a self-aware module is used to correct possible missing information between the encoder and decoder networks. Finally, gradient loss and structural similarity loss are used to constrain the network, achieving edge preservation and detail restoration. Extensive experiments show that the proposed method can effectively improve the visual effect and obtain more natural and realistic results.

Low Light GAN-Based Photo Enhancement

This paper is based on a novel learning-based pipeline that replaces the conventional image processing pipeline in order to enhance the performance of current multi-camera devices for low-light photography. It proposes a suggestive algorithm that recommends a better position to capture the image based on region (object-based) segmentation, camera calibration, parameter selection, and frame regeneration (merging and alignment of frames), and then uses a Generative Adversarial Network (GAN) for image enhancement. Low-light image enhancement is one of the hardest digital image processing problems due to factors like noise, low exposure, and incorrect edge detection of objects in low light, so this work proposes a low-light image enhancement model and compares it with current models. The proposed work mainly targets mobile devices, as they are currently used by over 90% of users to capture images. The proposed method increases the SNR by 7.63% over other existing approaches.

Light-DehazeNet: A Novel Lightweight CNN Architecture for Single Image Dehazing

Due to the rapid development of artificial intelligence technology, industrial sectors are being revolutionized in automation, reliability, and robustness, thereby significantly increasing quality and productivity. Most surveillance and industrial sectors are monitored by visual sensor networks capturing images of the surrounding environment. However, during tempestuous weather conditions, the visual quality of the images is reduced by contaminated suspended atmospheric particles, which affects the overall surveillance systems. To tackle these challenges, this article presents a computationally efficient lightweight convolutional neural network referred to as Light-DehazeNet (LD-Net) for the reconstruction of hazy images. Unlike other learning-based approaches, which separately estimate the transmission map and the atmospheric light, our proposed LD-Net jointly estimates both the transmission map and the atmospheric light using a transformed atmospheric scattering model. Furthermore, a color visibility restoration method is proposed to avoid color distortion in the dehazed image. Finally, we conduct extensive experiments using synthetic and natural hazy images. The quantitative and qualitative evaluation on different benchmark hazy datasets verifies the superiority of the proposed method over other state-of-the-art image dehazing techniques. Moreover, additional experimentation validates the applicability of the proposed method to object detection tasks. Considering its lightweight architecture with minimal computational cost, the proposed system can be incorporated as an integral part of vision-based monitoring systems to improve their overall performance.
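
A well-known way to fold t(x) and A into one jointly estimated variable is AOD-Net's reparameterization J(x) = K(x)·I(x) − K(x) + b; LD-Net's transformed model is in this spirit, though its exact form may differ in detail. A sketch of the recovery step given a network-predicted K:

```python
import torch

def transformed_dehaze(hazy, k, bias=1.0):
    """Recover a clean image from a single fused variable K(x).

    hazy: (N, 3, H, W) input image I in [0, 1]
    k:    (N, 1, H, W) or (N, 3, H, W) network-predicted K(x)
    bias: the constant b of the reparameterized model
    """
    return torch.clamp(k * hazy - k + bias, 0.0, 1.0)
```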

Hyperspectral Target Detection with Hierarchical Denoising Autoencoder and Subspace Projection

Target detection techniques for hyperspectral imagery have been widely applied in various applications. However, their performance is severely limited by the useless interference contained in hyperspectral images (HSIs), mainly caused by the atmosphere, illumination, issues within the sensor itself, and other factors. In this paper, we propose a hyperspectral target detector based on the linear mixture model (LMM), which consists of three components. First, a hierarchical denoising autoencoder (HDAE) is specifically designed for redundant interference removal; then we apply an adaptive clustering approach to extract several representative background samples from the clean HSI; lastly, a target detector with subspace projection is developed for background suppression and target enhancement based on the clean HSI, the representative background, and prior-known target signatures. Experimental results on two real-world HSIs show the superiority of our proposed method, the HDASP detector, compared with other state-of-the-art target detection methods.

Multispectral Fusion of RGB and NIR Images Using Weighted Least Squares and Convolution Neural Networks

In low-light conditions, color (RGB) images captured by visible-light sensors suffer from severe noise, causing loss of colors and textures. However, near-infrared (NIR) images captured by NIR sensors are robust to noise even in low light, although they lack color. Since RGB and NIR images are complementary in low-light conditions, the multispectral fusion of RGB and NIR images provides a viable solution to low-light imaging. In this paper, we propose multispectral fusion of RGB and NIR images using weighted least squares (WLS) and convolutional neural networks (CNNs). We combine traditional WLS filtering for layer decomposition and denoising with recent deep learning for image enhancement and texture transfer, taking the advantages of both. We build two networks based on CNNs: an image enhancement network (IEN) and a texture transfer network (TTN) for NIR texture transfer. First, we perform RGB image denoising based on WLS filtering and generate the base layer, using both the RGB and NIR images as weights in the WLS filter to remove noise from the low-light RGB image. Second, we apply the IEN to enhance the contrast of the base layer. Third, we apply the TTN to deliver NIR details completely and naturally to the fusion result. The combination of WLS, TTN, and IEN leads to noise reduction, contrast enhancement, and detail preservation in the fusion. Experimental results show that the proposed method achieves good performance in both noise reduction and detail transfer, and outperforms state-of-the-art methods in terms of visual quality and quantitative measurements.
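
WLS filtering here refers to the classic edge-preserving smoother of Farbman et al., which solves a sparse linear system (I + L)u = g whose Laplacian weights shrink across strong gradients. A hedged single-image sketch (the paper additionally derives the weights from both RGB and NIR, which this omits; parameter values follow common defaults):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve

def wls_base_layer(lum, lam=1.0, alpha=1.2, eps=1e-4):
    """Edge-preserving WLS smoothing; returns the base layer of `lum`.

    lum: 2-D luminance image in [0, 1]. Solves (I + L) u = g, where L is a
    4-neighbour Laplacian with gradient-dependent smoothness weights.
    """
    h, w = lum.shape
    n = h * w
    g = np.log(lum + eps)
    gy = np.zeros((h, w)); gy[:-1] = np.abs(np.diff(g, axis=0))
    gx = np.zeros((h, w)); gx[:, :-1] = np.abs(np.diff(g, axis=1))
    wy = lam / (gy ** alpha + eps); wy[-1] = 0.0      # edge to pixel below
    wx = lam / (gx ** alpha + eps); wx[:, -1] = 0.0   # edge to pixel right
    wx, wy = wx.ravel(), wy.ravel()
    west = np.concatenate(([0.0], wx[:-1]))           # weight to pixel left
    north = np.concatenate((np.zeros(w), wy[:-w]))    # weight to pixel above
    deg = wx + west + wy + north                      # Laplacian degree terms
    A = diags([deg + 1.0, -wx[:-1], -wx[:-1], -wy[:-w], -wy[:-w]],
              [0, 1, -1, w, -w], format="csc")
    return spsolve(A, lum.ravel()).reshape(h, w)
```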

Wavelength-Tunable OTDR for DWDM-PON Based on Optimized Wavelet Denoising

In this letter, a wavelength-tunable Optical Time-Domain Reflectometer (OTDR) for Dense Wavelength Division Multiplexing Passive Optical Networks (DWDM-PON) is proposed and experimentally demonstrated. An Integrated Tunable Laser Assembly (ITLA) serves as the light source, permitting measurements of 80 DWDM channels in the C-band. Dither modulation is applied to the ITLA to suppress coherent noise resulting from the narrow linewidth of the light source. A Semiconductor Optical Amplifier (SOA) modulates the continuous light into pulses while amplifying the peak optical power of the pulsed light to 13 dBm. In addition, we optimize the wavelet denoising algorithm for further signal-to-noise ratio (SNR) enhancement. The dynamic range and spatial resolution of the proposed OTDR reach 16.2 dB and 2 m, respectively. A DWDM-PON fiber link including an Arrayed Waveguide Grating (AWG) is measured to verify the system performance, and the OTDR profiles of different channels coincide with each other. The designed wavelength-tunable OTDR is shown to be appropriate for DWDM-PON.
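
For reference, a generic wavelet-shrinkage pass over a 1-D OTDR trace might look like the sketch below, using PyWavelets with a universal soft threshold; the paper's optimized threshold selection is not reproduced, and the wavelet choice and decomposition level are assumptions.

```python
import numpy as np
import pywt

def wavelet_denoise(trace, wavelet="sym8", level=5):
    """Soft-threshold wavelet denoising of a 1-D OTDR trace."""
    coeffs = pywt.wavedec(trace, wavelet, level=level)
    # Robust noise estimate (MAD) from the finest detail band.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(len(trace)))  # universal threshold
    coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft")
                            for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(trace)]
```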

Learning to Denoise Gated Cardiac PET Images Using Convolutional Neural Networks

Noise and motion artifacts in positron emission tomography (PET) scans can interfere with diagnosis and result in inaccurate interpretations. PET gating techniques effectively reduce motion blurring, but at the cost of increased noise, as only a subset of the data is used to reconstruct the image. Deep convolutional neural networks (DCNNs) could complement gating techniques by correcting such noise. However, there is little research on the specific application of DCNNs to gated datasets, which present additional challenges not yet considered in these studies, such as the varying level of noise depending on the gate, and performance pitfalls due to changes in the noise properties between non-gated and gated scans. To extend the current status of artificial intelligence (AI) in gated-PET imaging, we present a post-reconstruction denoising approach based on U-Net architectures, applied to cardiac dual-gated PET images obtained from 40 patients. To this end, we first evaluate the denoising performance of four variants of the U-Net architecture (2D, semi-3D, 3D, hybrid) on non-gated data to better understand the advantages of each type of model and to shed more light on the factors to take into consideration when selecting a denoising architecture. Then, we tackle the denoising of gated-PET reconstructions, reviewing challenges and limitations, and propose two training approaches that overcome the need for gated targets. Quantification results show that the proposed deep learning (DL) frameworks can successfully reduce noise levels while correctly preserving the original motionless resolution of the gates.
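
For orientation, a one-level 2D U-Net denoiser in the spirit of the compared variants can be sketched as follows; the widths, depth, and single-channel input are toy assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet2D(nn.Module):
    """One-level 2D U-Net denoiser sketch (toy widths, not the paper's)."""
    def __init__(self):
        super().__init__()
        self.enc = block(1, 16)
        self.down = nn.MaxPool2d(2)
        self.mid = block(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = block(32, 16)
        self.out = nn.Conv2d(16, 1, 1)

    def forward(self, x):                      # x: noisy gated PET slice
        e = self.enc(x)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([self.up(m), e], dim=1))  # skip connection
        return self.out(d)

# usage: denoised = TinyUNet2D()(torch.rand(1, 1, 128, 128))
```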

Towards Low-Visibility Enhancement in Maritime Video Surveillance: An Efficient and Effective Multi-Deep Neural Network

Limited by insufficient illumination, the images collected by maritime imaging devices often suffer from low brightness, low contrast, a low signal-to-noise ratio, severe information loss, and so on. These problems restrict the development of maritime-related work such as intelligent supervision, collision warning, accident investigation, etc. To improve the imaging quality of maritime video images, we propose an efficient and effective multi-deep neural network (termed EEMNN) for low-visibility enhancement. In particular, we fuse the multi-scale information extracted from the encoder-decoder module using dense blocks (DBs) and attention blocks (ABs), which enhances the fused information and preserves edges, textures, and other fine details. To prevent overexposure in the enhanced images, we fuse and reconstruct the output features of the DBs and ABs with the raw low-light image through two residual blocks (RBs) to obtain the final enhanced image. Mixing multiple network modules effectively improves the generalization ability and robustness of our network. Extensive experiments show that EEMNN achieves higher objective evaluation scores, more efficient enhancement, more natural maritime scenes, and stronger detail preservation than other enhancement methods.

Adversarial 3D Human Pointcloud Completion From Limited Angle Depth Data

Most research on reconstructing 3D objects and their occluded regions from a single viewpoint focuses on object completion using synthetically generated datasets. This leaves a major knowledge gap when reconstructing 3D objects from imperfect real-world frames. As a solution to this problem, we propose a three-stage deep auto-refining adversarial neural network capable of denoising and refining real-world depth data for full human body posture shape completion. The proposed solution achieves results on par with other state-of-the-art approaches in both Earth Mover's and Chamfer distances, 0.059 and 0.079 respectively, while having the benefit of reconstructing from mask-less depth frames. Visual inspection of the reconstructed point clouds suggests strong adaptation to the majority of real-world depth sensor noise deformities for both LiDAR and structured-light depth sensors.

Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments

We introduce a neural network-based method to denoise pairs of images taken in quick succession, with and without a flash, in low-light environments. Our goal is to produce a high-quality rendering of the scene that preserves the color and mood from the ambient illumination of the noisy no-flash image, while recovering surface texture and detail revealed by the flash. Our network outputs a gain map and a field of kernels, the latter obtained by linearly mixing elements of a per-image low-rank kernel basis. We first apply the kernel field to the no-flash image, and then multiply the result with the gain map to create the final output. We show that our network effectively learns to produce high-quality images by combining a smoothed-out estimate of the scene's ambient appearance from the no-flash image with high-frequency albedo details extracted from the flash input. Our experiments show significant improvements over alternative captures without a flash, and over baseline denoisers that use flash/no-flash pairs. In particular, our method produces images that are both noise-free and contain accurate ambient colors, without the sharp shadows or strong specular highlights visible in the flash image.
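
The compositing step described above, a kernel field applied to the no-flash image followed by a gain-map multiply, can be sketched roughly as follows. The softmax normalization and tensor shapes are our assumptions, and the low-rank basis mixing is assumed to have produced the kernels upstream.

```python
import torch
import torch.nn.functional as F

def apply_kernel_field_and_gain(noflash, kernels, gain, k=5):
    """Composite a denoised output from per-pixel kernels and a gain map.

    noflash : (B, C, H, W) noisy no-flash image
    kernels : (B, k*k, H, W) predicted per-pixel kernel weights
    gain    : (B, C, H, W) predicted gain map
    """
    B, C, H, W = noflash.shape
    # Gather k x k neighbourhoods around every pixel: (B, C*k*k, H*W).
    patches = F.unfold(noflash, k, padding=k // 2)
    patches = patches.view(B, C, k * k, H, W)
    # Normalize kernel weights per pixel (softmax is our assumption).
    weights = torch.softmax(kernels, dim=1).view(B, 1, k * k, H, W)
    filtered = (patches * weights).sum(dim=2)      # per-pixel filtering
    return filtered * gain                          # gain-map multiply

# usage (hypothetical predictor nets net_k, net_g):
# out = apply_kernel_field_and_gain(img, net_k(img), net_g(img))
```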

Invertible Denoising Network: A Light Solution for Real Noise Removal

Invertible networks have various benefits for image denoising since they are lightweight, information-lossless, and memory-saving during back-propagation. However, applying invertible models to noise removal is challenging because the input is noisy and the reversed output is clean, and the two follow different distributions. We propose an invertible denoising network, InvDN, to address this challenge. InvDN transforms the noisy input into a low-resolution clean image and a latent representation containing noise. To discard the noise and restore the clean image, InvDN replaces the noisy latent representation with one sampled from a prior distribution during reversion. The denoising performance of InvDN is better than all existing competitive models, achieving a new state-of-the-art result on the SIDD dataset while requiring less runtime. Moreover, InvDN is far smaller, with only 4.2% of the parameters of the recently proposed DANet. Further, by manipulating the noisy latent representation, InvDN can also generate noise more similar to the original. Our code is available at: https://github.com/Yang-Liu1082/InvDN.git.

Spk2ImgNet: Learning to Reconstruct Dynamic Scene from Continuous Spike Stream

The recently invented retina-inspired spike camera has shown great potential for capturing dynamic scenes. Unlike conventional digital cameras, which compress the photoelectric information within the exposure interval into a single snapshot, the spike camera produces a continuous spike stream that records the dynamic light intensity variation process. For spike cameras, image reconstruction remains an important and challenging issue. To this end, this paper develops a spike-to-image neural network (Spk2ImgNet) to reconstruct the dynamic scene from the continuous spike stream. In particular, to handle the challenges brought by both noise and high-speed motion, we propose a hierarchical architecture that exploits the temporal correlation of the spike stream progressively. First, a spatially adaptive light inference subnet is proposed to exploit local temporal correlation, producing basic light intensity estimates at different moments. Then, pyramid deformable alignment is utilized to align the intermediate features so that the feature fusion module can exploit long-term temporal correlation while avoiding undesired motion blur. In addition, to train the network, we simulate the working mechanism of the spike camera to generate a large-scale spike dataset composed of spike streams and corresponding ground-truth images. Experimental results demonstrate that the proposed network evidently outperforms state-of-the-art spike camera reconstruction methods.

NBNet: Noise Basis Learning for Image Denoising with Subspace Projection

In this paper, we introduce NBNet, a novel framework for image denoising. Unlike previous works, we tackle this challenging problem from a new perspective: noise reduction by image-adaptive projection. Specifically, we train a network that can separate signal and noise by learning a set of reconstruction bases in the feature space. Image denoising can then be achieved by selecting the corresponding basis of the signal subspace and projecting the input into that space. Our key insight is that projection naturally preserves the local structure of the input signal, especially for areas with low light or weak textures. To this end, we propose SSA, a non-local attention module designed to explicitly learn both basis generation and subspace projection. We further incorporate SSA into NBNet, a UNet-structured network designed for end-to-end image denoising. We conduct evaluations on benchmarks including SIDD and DND, and NBNet achieves state-of-the-art performance on PSNR and SSIM with significantly less computational cost.
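
The projection step itself is standard linear algebra; below is a sketch of a batched least-squares projection of features onto K learned basis vectors. The basis is assumed to come from an attention module such as SSA, which is not shown.

```python
import torch

def project_to_signal_subspace(feats, basis):
    """Project feature maps onto a learned signal subspace.

    feats : (B, C, H, W) image features
    basis : (B, K, C)    K learned basis vectors per image
    Implements the least-squares projection V (V^T V)^{-1} V^T f.
    """
    B, C, H, W = feats.shape
    x = feats.flatten(2)                         # (B, C, H*W)
    v = basis.transpose(1, 2)                    # (B, C, K)
    vtv = v.transpose(1, 2) @ v                  # (B, K, K)
    # Coefficients of each feature vector in the basis, then project back.
    coef = torch.linalg.solve(vtv, v.transpose(1, 2) @ x)
    return (v @ coef).view(B, C, H, W)
```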

Linear Fusion of Multi-Scale Transmissions for Image Dehazing

Images acquired in inclement weather conditions (haze, mist, fog, rain, etc.) suffer from various degradation problems such as low contrast, diminished visibility, and color distortion. Such low-quality images do not meet the demands of computer vision applications like object recognition, smart transportation, remote sensing, weather forecasting, etc. Restoring the haze-free image requires estimating two parameters: the transmission and the atmospheric light. Existing work has focused on estimating the transmission; however, two issues remain unresolved in the dehazed image due to inaccurate transmission estimates: halo artifacts at sudden depth changes and over-enhancement. Traditional methods estimate the transmission either pixel-wise or patch-wise; the former suffers from over-saturation or loss of detail, while the latter produces halo artifacts. This paper proposes a linear fusion of multi-scale transmissions to overcome these problems. Experiments are performed on various hazy images, and both qualitative and quantitative results are presented, revealing that the proposed method successfully overcomes these problems.
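
A rough sketch of the idea, under our own assumptions, follows: dark-channel transmission maps are computed at several patch sizes (a patch size of 1 behaves pixel-wise, larger patches behave patch-wise) and combined with fixed linear weights. The paper's actual scales and fusion weights are not given in the abstract, so the values here are purely illustrative.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def transmission(dark_src, patch, omega=0.95):
    """Dark-channel transmission estimate at one patch scale."""
    return 1.0 - omega * minimum_filter(dark_src, size=patch)

def fused_transmission(hazy, atmos, patches=(3, 7, 15),
                       weights=(0.2, 0.3, 0.5)):
    """Linear fusion of multi-scale transmissions (illustrative weights).

    hazy  : (H, W, 3) image in [0, 1]
    atmos : (3,) estimated atmospheric light
    """
    # Per-pixel dark channel of the normalized image.
    dark_src = np.min(hazy / atmos[None, None, :], axis=2)
    t = sum(w * transmission(dark_src, p)
            for p, w in zip(patches, weights))
    return np.clip(t, 0.1, 1.0)
```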

Burst Photography for Learning to Enhance Extremely Dark Images

Capturing images under extremely low-light conditions poses significant challenges for the standard camera pipeline. Images become too dark and too noisy, which makes traditional enhancement techniques almost impossible to apply. Recently, learning-based approaches have shown very promising results for this task since they have substantially more expressive capabilities to allow for improved quality. Motivated by these studies, in this paper, we aim to leverage burst photography to boost the performance and obtain much sharper and more accurate RGB images from extremely dark raw images. The backbone of our proposed framework is a novel coarse-to-fine network architecture that generates high-quality outputs progressively. The coarse network predicts a low-resolution, denoised raw image, which is then fed to the fine network to recover fine-scale details and realistic textures. To further reduce the noise level and improve the color accuracy, we extend this network to a permutation invariant structure so that it takes a burst of low-light images as input and merges information from multiple images at the feature-level. Our experiments demonstrate that our approach leads to perceptually more pleasing results than the state-of-the-art methods by producing more detailed and considerably higher quality images.

Efficient Wavelet Boost Learning-Based Multi-stage Progressive Refinement Network for Underwater Image Enhancement

Raw underwater images suffer from low contrast and color cast due to wavelength-selective light scattering and attenuation. Distortions in color and luminance appear mainly at low frequencies, while those in edges and textures appear mainly at high frequencies. Such hybrid distortions are difficult to recover simultaneously with existing methods, which mainly operate in the spatial domain. To tackle these issues, we propose a novel deep learning network that progressively refines underwater images with a wavelet boost learning strategy (PRWNet), in both the spatial and frequency domains. Specifically, a multi-stage refinement strategy is adopted to efficiently enhance the spatially varying degradations in a coarse-to-fine way. In each refinement stage, a Wavelet Boost Learning (WBL) unit decomposes the hierarchical features into high- and low-frequency components and enhances each with normalization and attention mechanisms. A modified boosting strategy is also adopted in the WBL to further enhance the feature representations. Extensive experiments show that our method achieves state-of-the-art results. Our network is efficient and has potential for real-world applications. The code is available at: https://github.com/huofushuo/PRWNet.

A Convolutional Neural Network for Small Sample's Ring Structured Light Denoising

Denoising is an indispensable step when measuring depth with an omnidirectional ring structured light depth perception system. However, it is difficult to obtain uniform, clear, and continuous structured light stripes with general image denoising algorithms. To solve this problem, this paper combines Deep Convolutional Generative Adversarial Networks (DCGAN) with a Denoising Convolutional Neural Network for Ring Structured Light (DnCNN-RSL). We first use the DCGAN to generate additional training data, and then use the DnCNN-RSL network to denoise the images. The DnCNN-RSL network is suitable for removing noise from small-sample datasets, as reflected in improved image PSNR values and SSIM values closer to 1. Experiments show that the DnCNN-RSL network achieves better denoising results than traditional image processing methods and yields clear, continuous structured light stripes. In this way, the depth information obtained from the denoised structured light stripes is more accurate.
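
As background, the DnCNN family predicts the noise residual and subtracts it from the input; a compact sketch follows, with the depth and width reduced for brevity and the RSL-specific modifications omitted.

```python
import torch
import torch.nn as nn

class DnCNNSketch(nn.Module):
    """Compact DnCNN-style residual denoiser (toy depth and width)."""
    def __init__(self, depth=7, width=32):
        super().__init__()
        layers = [nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1, bias=False),
                       nn.BatchNorm2d(width), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(width, 1, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        # The body predicts the noise map; subtracting it yields the
        # clean estimate (residual learning).
        return noisy - self.body(noisy)

# usage: clean = DnCNNSketch()(torch.rand(1, 1, 64, 64))
```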

Recursive Video Denoising Algorithm for Low Light Surveillance Applications

We designed a video denoising algorithm for surveillance applications under low-light conditions, targeted at weak CPUs. State-of-the-art algorithms do not meet these requirements because of their huge memory bandwidth consumption. Among them, algorithms based on neural networks have generalization issues, especially when there are no references for training; besides, the complexity of such methods is unaffordable in real-time embedded applications. Hence, we propose three techniques: 1) adaptive noise strength estimation to fit the noise profile of real applications; 2) multi-resolution background segmentation inspired by the human vision system; and 3) a multi-pass denoising strategy. The algorithm is recursive, with a first-order Markov property. Adaptive noise strength estimation also eliminates the pre-calibration steps usually required by denoising algorithms, which eases deployment. Experiments show that our method achieves better subjective denoising quality than state-of-the-art methods in the target applications, especially in extremely low-light scenes. Moreover, it requires little computation and storage, making it very suitable for implementation on weak CPUs.
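
A first-order recursive temporal filter of the kind such a method builds on can be sketched as follows; the fixed blending factors and the simple difference-based motion test are our stand-ins for the paper's adaptive noise estimation and background segmentation, and intensities are assumed normalized to [0, 1].

```python
import numpy as np

def recursive_denoise(frames, base_alpha=0.2, motion_thresh=0.1):
    """First-order recursive (IIR) temporal denoiser sketch.

    Blends each frame into a running estimate; blending toward the new
    frame increases where it differs strongly from the estimate, as a
    crude motion test.
    """
    out = frames[0].astype(np.float32)
    results = [out.copy()]
    for f in frames[1:]:
        f = f.astype(np.float32)
        diff = np.abs(f - out)
        # Large difference -> likely motion -> trust the new frame more.
        alpha = np.where(diff > motion_thresh, 0.8, base_alpha)
        out = alpha * f + (1.0 - alpha) * out
        results.append(out.copy())
    return results
```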

Scatter denoising technique using Fourier domain filtering and integral imaging

Scatter denoising, i.e., removing noise caused by scattering media such as fog and turbid water, is a challenging problem. In previous studies, the blurring caused by scattering particles in fog or turbid water was addressed by low-pass filtering, or by removing the estimated scattering medium and then amplifying the brightness with a photon-counting algorithm. However, these methods lose accurate color information of the object through low-frequency filtering or the probability-based photon-counting algorithm. Moreover, it is difficult to obtain shape information of an object under heavy turbidity. To solve this problem, we propose a new scatter-removal algorithm that combines Fourier domain filtering with integral imaging. To validate the proposed method, we carry out a numerical analysis of experimental results from a previous method and our proposed method.
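
A bare-bones frequency-domain filtering pass might look like the sketch below, which keeps only a radial band of spatial frequencies; the cutoff values are illustrative assumptions, and the integral-imaging reconstruction is a separate stage not shown.

```python
import numpy as np

def fourier_bandpass(img, lo=0.02, hi=0.35):
    """Radial band-pass filtering of a grayscale image in the Fourier
    domain (illustrative cutoffs, not the paper's)."""
    H, W = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))
    yy, xx = np.mgrid[0:H, 0:W]
    # Normalized radial frequency from the spectrum center.
    r = np.hypot((yy - H / 2) / H, (xx - W / 2) / W)
    mask = (r >= lo) & (r <= hi)
    return np.fft.ifft2(np.fft.ifftshift(F * mask)).real
```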

Robust and Guided Super-resolution for Single-Photon Depth Imaging via a Deep Network

The number of applications that use depth imaging is rapidly increasing, e.g., autonomous vehicles and auto-focus assist on smartphone cameras. Light detection and ranging (LiDAR) via single-photon avalanche diode (SPAD) arrays is an emerging technology that enables the acquisition of depth images at high frame rates. However, the spatial resolution of this technology is typically low in comparison to the intensity images recorded by conventional cameras. To increase the native resolution of depth images from a SPAD camera, we develop a deep network built to take advantage of the multiple features that can be extracted from a camera's histogram data. The network then uses the intensity images and multiple features extracted from down-sampled histograms to guide the up-sampling of the depth. Our network provides significant image resolution enhancement and image denoising across a wide range of signal-to-noise ratios and photon levels.

Video based Heart Rate Extraction using Skin ROI Segmentation and Attention CNN

Photoplethysmography imaging can be used to extract heart rate (HR) from video. Existing deep learning and denoising methods are not effective for video with strong RGB background content. This paper addresses the issue with a Bayes level-set-based lightweight region-of-interest segmentation combined with a convolutional attention network. Evaluated on the COHFACE dataset, the proposed model shows the highest HR extraction accuracy, with an average absolute error of 3.058 bpm, a root mean square error of 0.81 bpm, and a correlation coefficient of 0.848.

Multi-object Paper Money Recognition Technology Based on AlexNet Model

Physical currency is currently the most widely used method of transaction, and the pattern recognition of paper currency has a wide range of applications, such as self-checkout machines, ATMs, vending machines, and devices for the blind. With the growing number of convenience stores, an RMB recognition system could save cashiering labor, which motivates this research direction. However, the system initially fails to identify images rotated beyond a certain angle or corrupted by excessive noise, and its accuracy is not yet perfect, though it continues to be improved. To realize this function, this paper proposes a neural network model that can accurately identify RMB. The best image preprocessing performance is achieved using automatic white balance, automatic brightness adjustment, and automatic denoising. The image is then cropped and its features extracted. Finally, the image is fed into the neural network model for recognition. A limitation of this study is that it ignores the influence of real-world fluctuations in image quality on the recognition results. Related work established a radial basis function network based on salient banknote features and correlations between images, with image processing technology used to improve image quality. Through the above methods, the system can identify all kinds of RMB and already ensures a recognition success rate above 90%.

Comparative Study of Image to Image Translation Models for Underwater Image Enhancement

Understanding the underwater environment is important for navigation, surveillance, and exploration. The aim is to restore underwater images degraded by light absorption and scattering in the water medium, which result in poor visibility, low brightness, and low contrast. In this paper, we introduce a network based on image-to-image translation, the Contextual Conditional Generative Adversarial Network (CCGAN), and compare its performance qualitatively and quantitatively with existing image-to-image translation models, namely Unsupervised Image Translation (UNIT), CycleGAN, Contrastive Unpaired Translation (CUT), and FastCUT. Both supervised and unsupervised models are analyzed, and we show that the proposed model outperforms the alternatives.

SARN: A Lightweight Stacked Attention Residual Network for Low-Light Image Enhancement

Low-light images suffer from low contrast and brightness. If we increase the brightness of such an image, the noise hidden in the dark regions is amplified, and color and detail information may be lost after brightness enhancement. In this paper, we propose a lightweight Stacked Attention Residual Network (SARN) for low-light image enhancement. We insert the channel attention (SE) module into the residual block and its shortcut to construct an Attention Residual Block (ARB) for noise removal, and then stack ARBs as the backbone of SARN. We insert the Bottleneck Attention Module (BAM) into the bottlenecks to deal specifically with the severe noise in real-world images. We first extract the shallow features of the low-light image, and then fuse them with the high-level output features of the backbone through a global skip connection to preserve color information. Extensive ablation and comparative experiments demonstrate that our method outperforms many other state-of-the-art methods at a much lower time cost.
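
An Attention Residual Block in this spirit might look like the following sketch, which places a squeeze-and-excitation module inside a plain residual block; the channel widths are illustrative, and SARN's SE-on-shortcut and BAM placements are omitted.

```python
import torch
import torch.nn as nn

class SEResidualBlock(nn.Module):
    """Residual block with squeeze-and-excitation channel attention."""
    def __init__(self, ch=64, r=16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))
        self.se = nn.Sequential(                 # squeeze-and-excitation
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // r, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // r, ch, 1), nn.Sigmoid())

    def forward(self, x):
        y = self.conv(x)
        return x + y * self.se(y)   # re-weight channels, add shortcut

# usage: out = SEResidualBlock()(torch.rand(1, 64, 32, 32))
```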

Deep Learning Prediction of Chlorophyll Content in Tomato Leaves

Precision agriculture has improved crop production around the world. Non-destructive evaluation of the chlorophyll content of plant leaves can be a useful solution in the field of precision farming. In order to take the required measures, it is sometimes essential to evaluate the chlorophyll content precisely without cutting the target leaves. In this work, a deep learning methodology is proposed to assess the quantity of chlorophyll in tomato plant leaves through image processing; the methodology can be extended to any other type of leaf. The proposed method uses a convolutional denoising autoencoder to reduce ambient light noise. Then, using a deep autoencoder network, the valuable features of the leaf image are extracted and fed as input to another neural network that evaluates the chlorophyll content of the leaf using support vector regression. To validate the accuracy of the proposed method, measurements were performed with a SPAD chlorophyll meter. The validation results confirm the desired accuracy and efficiency of the developed approach.

A GAN-based Background Noise Removal Method on Infrared Image of Gas-Insulated Transmission Line

The results of infrared inspection on the 1100 kV pipe-gallery gas-insulated transmission line (GIL) project are seriously interfered with by background noise such as LED lights and induced heating on steel structures. In this paper, an image background noise removal method based on a generative adversarial network (GAN) is proposed. First, a convolutional neural network (CNN) is used to classify the different parts of the GIL. Second, thresholding and graying are used to mark the classified parts. Finally, the generative adversarial network is used to repair the marked noise-interference regions: the generator repairs the marked regions to generate new noise-free infrared images, while the discriminator judges whether the image output by the generator has been successfully repaired. The results show that the proposed method achieves a better background removal effect on infrared images, and the texture features of the image are well preserved when using the GAN to remove noise, compared with a VAE-based denoising method. On-site application results show that it takes 0.26 seconds to classify each infrared image with the CNN and 4 seconds to remove noise with the GAN.

Visible to Thermal Image Synthesis using Light Weight Pyramid Network

We investigate the lightweight pyramid network as a general-purpose solution for image-to-image synthesis. Existing techniques based on deep convolutional neural networks (CNNs) have found reasonable success, but at the cost of a large number of parameters that result in high computational cost. These methods also incorporate various kinds of post-processing to further refine the transformed image, making the whole process cumbersome and time-consuming. In this paper, we use the lightweight pyramid network (LPNet), originally proposed for image deraining, for image synthesis. We find that by using Laplacian-Gaussian image pyramid decomposition coupled with reconstruction and SSIM computation within the network, the heat signature in the synthesized thermal image becomes much more pronounced, while the contours of image objects stay prominent even without any post-processing. Training is also less computationally intensive owing to the shallow network. We further demonstrate the efficacy of our approach through quantitative analysis with SSIM, PSNR, and UQI.

Underwater Image Dehazing Based on Disparity Estimation and Color Constraint

For underwater image quality improvement, we regard image restoration as a dehazing problem and propose a restoration method combining a disparity map with color constraints. According to the dark channel prior model, an image can be recovered by estimating the background light and the transmission. Considering the consistency between the disparity map and depth information, the disparity map and dark channel are fused in the non-subsampled contourlet transform domain. The background light closest to the true depth in RGB color space is then estimated from the fused map using haze-lines theory. Given the different attenuation characteristics of light at different wavelengths in water, the transmission estimate is built from the red-based attenuation coefficient ratio, which helps correct color distortion, especially in textured areas. The experimental results show that the proposed underwater image restoration method achieves better brightness and contrast enhancement, edge sharpness, and color recovery.

Attention-guided Dual Enhancement Train Driver Fatigue Detection Based on MTCNN

A train driver's fatigued driving affects the normal operation of the train and can even threaten the lives and property of the driver, the passengers, and the surrounding environment. It is therefore very important to detect whether a train driver is fatigued. However, when the light intensity is weak, the accuracy of fatigue detection is low. To avoid the influence of low light on fatigue detection, a low-light-enhanced fatigue detection algorithm is proposed. First, low-light enhancement is performed on the collected video of the driver's face to complete exposure enhancement and image denoising. Second, a multi-task cascaded convolutional neural network (MTCNN) is used to detect the face and locate key points. Then, the eye and mouth positions among the key parts of the driver's face are located, and the corresponding eye and mouth fatigue characteristic parameters are extracted. Finally, the two fatigue characteristic parameters are fused to judge the fatigue of train drivers according to the PERCLOS criterion and fuzzy reasoning. The experimental results show that the proposed method can accurately detect the driver's fatigue state under low-light conditions, with greatly improved accuracy.

GADO-Net: An Improved AOD-Net Single Image Dehazing Algorithm

With the development of modern industry, special weather such as haze has become more frequent, and images taken on foggy days suffer from blurred details, reduced contrast, and partial loss of image information, making subsequent image processing difficult. Previous studies have proposed many algorithms with excellent performance for such problems, among which AOD-Net is widely used. However, it suffers from a series of problems, such as overfitting during the joint estimation of parameters in training, a too-small receptive field, and a poor defogging effect on low-illumination images. The proposed algorithm, GADO-Net, uses a depthwise-separable convolutional neural network instead of a 5-layer convolutional neural network for the joint estimation of the atmospheric light value and transmittance, and adds a pyramid pooling module; this extends the receptive field to a certain extent, effectively mitigates overfitting during training, and improves the algorithm's ability to capture global information from foggy images. Finally, the peak signal-to-noise ratio (PSNR) is significantly improved by using a weighted combination of MS-SSIM and L1 as the loss function and by using an optimization search to obtain the optimal model parameters. Experiments show that GADO-Net performs better than the dark channel prior (DCP), the histogram equalization algorithm, and the conventional AOD-Net in terms of SSE and PSNR, and dehazes foggy images more effectively.
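
The parameter saving from swapping a standard convolution for a depthwise-separable one is easy to see in code; the sketch below is generic, with illustrative layer sizes rather than GADO-Net's.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 convolution followed by a pointwise 1x1 mix.

    Needs cin*k*k + cin*cout weights instead of cin*cout*k*k for a
    standard convolution, which is where the lightness comes from.
    """
    def __init__(self, cin, cout, k=3):
        super().__init__()
        self.depthwise = nn.Conv2d(cin, cin, k, padding=k // 2, groups=cin)
        self.pointwise = nn.Conv2d(cin, cout, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# usage: out = DepthwiseSeparableConv(3, 16)(torch.rand(1, 3, 64, 64))
```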

Adaptive Unfolding Total Variation Network for Low-Light Image Enhancement

Real-world low-light images suffer from two main degradations, namely inevitable noise and poor visibility. Since the noise exhibits different levels, recent works estimate it when enhancing low-light images from raw Bayer space. In sRGB color space, noise estimation becomes more complicated due to the effects of the image processing pipeline. Nevertheless, most existing enhancement algorithms in sRGB space only address the low-visibility problem, or suppress noise under a hypothetical noise level, rendering them impractical due to a lack of robustness. To address this issue, we propose an adaptive unfolding total variation network (UTVNet), which approximates the noise level of a real sRGB low-light image by learning the balancing parameter of a model-based denoising method with total variation regularization. Meanwhile, we learn the noise level map by unrolling the corresponding minimization process to provide the inferences of smoothness and fidelity constraints. Guided by the noise level map, our UTVNet recovers finer details and is more capable of suppressing noise in real captured low-light scenes. Extensive experiments on real-world low-light images clearly demonstrate the superior performance of UTVNet over state-of-the-art methods.
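
To make the unrolling idea concrete, the sketch below takes a fixed number of gradient steps on a smoothed total-variation objective with a learnable balancing weight, echoing the stated idea of learning the balance parameter; UTVNet's full noise-level-map design is not reproduced, and the step count and step size are assumptions.

```python
import torch
import torch.nn as nn

class UnfoldedTV(nn.Module):
    """Unrolled TV denoising: gradient steps on
    0.5 * ||u - y||^2 + lam * TV(u), with lam learned."""
    def __init__(self, steps=8, step_size=0.1):
        super().__init__()
        self.lam = nn.Parameter(torch.tensor(0.1))
        self.steps, self.step_size = steps, step_size

    def tv_grad(self, u, eps=1e-3):
        # Gradient of a smoothed (Charbonnier-like) total variation.
        dx = u[..., :, 1:] - u[..., :, :-1]
        dy = u[..., 1:, :] - u[..., :-1, :]
        dx = dx / torch.sqrt(dx * dx + eps)
        dy = dy / torch.sqrt(dy * dy + eps)
        g = torch.zeros_like(u)
        g[..., :, :-1] -= dx; g[..., :, 1:] += dx
        g[..., :-1, :] -= dy; g[..., 1:, :] += dy
        return g

    def forward(self, y):
        u = y.clone()
        for _ in range(self.steps):   # one unrolled stage per iteration
            u = u - self.step_size * ((u - y) + self.lam * self.tv_grad(u))
        return u
```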

Image denoising with heterogeneous low rank matrix factorization

Low rank matrix factorization (LRMF) is an important research direction in computer vision: it learns a low-dimensional subspace from high-dimensional data. In optimization problems, LRMF is mostly constructed with an L1 or L2 loss function, and in either case it is important to describe the noise in the dataset. To describe the noise well, this paper characterizes it from a pixel-level perspective when constructing the LRMF. Considering that there may be uneven light intensity and differences in light reflection intensity between different parts of each picture in the dataset, this paper assumes that all noise in the image dataset is heterogeneous. Based on this assumption, the noise is described with Student-t distributions with different parameters, two hierarchical Bayesian models are constructed, and all model parameters are derived through variational Bayesian inference. This not only improves the calculation accuracy of the LRMF but also improves the calculation speed of the model to a certain extent. Extensive experiments on face reconstruction and medical image denoising demonstrate the superiority of the methods.

Discriminative Feature Extraction and Enhancement Network for Low-Light Image

Photos taken in low-light conditions exhibit a series of visual degradations due to underexposure, such as low brightness, information loss, noise, and color distortion. To solve these problems, a discriminative feature extraction and enhancement network is proposed for low-light image enhancement. First, shallow features are extracted by Inception V2, and deep features are further extracted by a residual module. The shallow and deep features are then fused, and the fusion result is fed into the discriminative feature enhancement module. Specifically, a residual channel attention module is introduced after each stage to capture important feature information, which helps restore the colors of low-light images and reduce artifacts. Finally, a brightness adjustment module adjusts the brightness of the image. In addition, a hybrid loss function is designed to measure the training loss at multiple levels. Experimental results on the LOL-v2 dataset show that the proposed algorithm reduces noise while improving image brightness, lessens color distortion and artifacts, and outperforms related algorithms on objective metrics; its results also look more real and natural subjectively.

2022

Low-Light Image Restoration With Short- and Long-Exposure Raw Pairs

Low-light imaging with handheld mobile devices is a challenging issue. Limited by the existing models and training data, most existing methods cannot be effectively applied in real scenarios. In this paper, we propose a new low-light image restoration method by using the complementary information of short- and long-exposure images. We first propose a novel data generation method to synthesize realistic short- and long-exposure raw images by simulating the imaging pipeline in low-light environment. Then, we design a new long-short-exposure fusion network (LSFNet) to deal with the problems of low-light image fusion, including high noise, motion blur, color distortion and misalignment. The proposed LSFNet takes pairs of short- and long-exposure raw images as input, and outputs a clear RGB image. Using our data generation method and the proposed LSFNet, we can recover the details and color of the original scene, and improve the low-light image quality effectively. Experiments demonstrate that our method can outperform the state-of-the-art methods.

RGBT Tracking via Noise-Robust Cross-Modal Ranking

Existing RGBT tracking methods usually localize a target object with a bounding box, in which the trackers are often affected by the inclusion of background clutter. To address this issue, this article presents a novel algorithm, called noise-robust cross-modal ranking, to suppress background effects in target bounding boxes for RGBT tracking. In particular, we handle the noise interference in cross-modal fusion and seed labels from the following two aspects. First, the soft cross-modality consistency is proposed to allow the sparse inconsistency in fusing different modalities, aiming to take both collaboration and heterogeneity of different modalities into account for more effective fusion. Second, the optimal seed learning is designed to handle label noises of ranking seeds caused by some problems, such as irregular object shape and occlusion. In addition, to deploy the complementarity and maintain the structural information of different features within each modality, we perform an individual ranking for each feature and employ a cross-feature consistency to pursue their collaboration. A unified optimization framework with an efficient convergence speed is developed to solve the proposed model. Extensive experiments demonstrate the effectiveness and efficiency of the proposed approach compared with state-of-the-art tracking methods on the GTOT and RGBT234 benchmark datasets.

Deep Spatial-Angular Regularization for Light Field Imaging, Denoising, and Super-Resolution

Coded aperture is a promising approach for capturing the 4-D light field (LF), in which the 4-D data are compressively modulated into 2-D coded measurements that are further decoded by reconstruction algorithms. The bottleneck lies in the reconstruction algorithms, resulting in rather limited reconstruction quality. To tackle this challenge, we propose a novel learning-based framework for the reconstruction of high-quality LFs from acquisitions via learned coded apertures. The proposed method incorporates the measurement observation into the deep learning framework elegantly to avoid relying entirely on data-driven priors for LF reconstruction. Specifically, we first formulate the compressive LF reconstruction as an inverse problem with an implicit regularization term. Then, we construct the regularization term with a deep efficient spatial-angular separable convolutional sub-network in the form of local and global residual learning to comprehensively explore the signal distribution free from the limited representation ability and inefficiency of deterministic mathematical modeling. Furthermore, we extend this pipeline to LF denoising and spatial super-resolution, which could be considered as variants of coded aperture imaging equipped with different degradation matrices. Extensive experimental results demonstrate that the proposed methods outperform state-of-the-art approaches to a significant extent both quantitatively and qualitatively, i.e., the reconstructed LFs not only achieve much higher PSNR/SSIM but also preserve the LF parallax structure better on both real and synthetic LF benchmarks. The code will be publicly available at https://github.com/MantangGuo/DRLF.

Physics-Based Noise Modeling for Extreme Low-Light Photography

Enhancing the visibility in extreme low-light environments is a challenging task. Under nearly lightless conditions, existing image denoising methods can easily break down due to significantly low SNR. In this paper, we systematically study the noise statistics in the imaging pipeline of CMOS photosensors, and formulate a comprehensive noise model that can accurately characterize the real noise structures. Our novel model considers the noise sources caused by digital camera electronics, which are largely overlooked by existing methods yet have significant influence on raw measurements in the dark. It provides a way to decouple the intricate noise structure into different statistical distributions with physical interpretations. Moreover, our noise model can be used to synthesize realistic training data for learning-based low-light denoising algorithms. In this regard, although promising results have been shown recently with deep convolutional neural networks, their success heavily depends on abundant noisy-clean image pairs for training, which are tremendously difficult to obtain in practice, and generalizing their trained models to images from new devices is also problematic. Extensive experiments on multiple low-light denoising datasets, including a newly collected one in this work covering various devices, show that a deep neural network trained with our proposed noise formation model can reach surprisingly high accuracy. The results are on par with or sometimes even outperform training with paired real data, opening a new door to real-world extreme low-light photography.
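
As a toy version of physics-based noise synthesis, the sketch below draws Poisson shot noise on the photoelectron counts and adds Gaussian read noise in digital numbers; the paper's full model additionally covers heavier-tailed read noise, row noise, and quantization, and all camera constants here are hypothetical.

```python
import numpy as np

def synthesize_raw_noise(clean, gain=4.0, read_sigma=2.0, black=64,
                         white=1023, rng=None):
    """Poisson-Gaussian raw-noise synthesis sketch.

    clean : linear raw intensities in [0, 1]
    gain  : electrons-to-DN system gain (hypothetical value)
    """
    rng = np.random.default_rng() if rng is None else rng
    electrons = clean * (white - black) / gain       # expected photoelectrons
    shot = rng.poisson(electrons).astype(np.float32) # signal-dependent
    read = rng.normal(0.0, read_sigma, clean.shape)  # signal-independent
    dn = shot * gain + read + black                  # back to digital numbers
    return np.clip(dn, 0, white) / white
```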

Pasadena: Perceptually Aware and Stealthy Adversarial Denoise Attack

Image denoising can remove natural noise that widely exists in images captured by multimedia devices due to low-quality imaging sensors, unstable image transmission processes, or low-light conditions. Recent works also find that image denoising benefits high-level vision tasks, e.g., image classification. In this work, we challenge this common sense and explore a totally new problem, i.e., whether image denoising can be given the capability of fooling state-of-the-art deep neural networks (DNNs) while enhancing image quality. To this end, we initiate the very first attempt to study this problem from the perspective of adversarial attack and propose the adversarial denoise attack. More specifically, our main contributions are three-fold: First, we identify a new task that stealthily embeds attacks inside the image denoising module widely deployed in multimedia devices as an image post-processing operation, simultaneously enhancing visual image quality and fooling DNNs. Second, we formulate this new task as a kernel prediction problem for image filtering and propose the adversarial-denoising kernel prediction that can produce adversarial-noiseless kernels for effective denoising and adversarial attacking simultaneously. Third, we implement an adaptive perceptual region localization to identify semantic-related vulnerability regions with which the attack can be more effective while not doing too much harm to the denoising. We name the proposed method Pasadena (Perceptually Aware and Stealthy Adversarial DENoise Attack) and validate it on the NeurIPS'17 adversarial competition dataset, CVPR2021-AIC-VI: unrestricted adversarial attacks on ImageNet, and the Tiny-ImageNet-C dataset. The comprehensive evaluation and analysis demonstrate that our method not only realizes denoising but also achieves a significantly higher success rate and transferability than state-of-the-art attacks.

Physics-Based Shadow Image Decomposition for Shadow Removal

We propose a novel deep learning method for shadow removal. Inspired by physical models of shadow formation, we use a linear illumination transformation to model the shadow effects in the image that allows the shadow image to be expressed as a combination of the shadow-free image, the shadow parameters, and a matte layer. We use two deep networks, namely SP-Net and M-Net, to predict the shadow parameters and the shadow matte respectively. This system allows us to remove the shadow effects from images. We then employ an inpainting network, I-Net, to further refine the results. We train and test our framework on the most challenging shadow removal dataset (ISTD). Our method improves the state-of-the-art in terms of mean absolute error (MAE) for the shadow area by 20%. Furthermore, this decomposition allows us to formulate a patch-based weakly-supervised shadow removal method. This model can be trained without any shadow-free images (that are cumbersome to acquire) and achieves competitive shadow removal results compared to state-of-the-art methods that are trained with fully paired shadow and shadow-free images. Last, we introduce SBU-Timelapse, a video shadow removal dataset for evaluating shadow removal methods.

Spatial Temporal Video Enhancement Using Alternating Exposures

High-speed video acquisition under poor illumination conditions is a challenging task. Imaging using long exposure can ensure brightness and suppress noise. However, the captured images may be blurry due to fast object movements or camera shakes. Imaging with short exposure can record sharp textures, but the high camera gain may cause noticeable noise. To alleviate this dilemma, we design a camera system using alternating exposures, where frames expose cyclically in a short-long way. The system consists of restoration and interpolation modules to reconstruct sharp, noise-reduced, high-frame-rate frames from low-frame-rate alternate-exposed input images. We design an optical-flow-based alternate-complementary alignment architecture for spatial enhancement, which effectively aligns the short-exposed and long-exposed images in a two-stage progressive way. Moreover, it explores complementary information from short-exposed and long-exposed inputs to ensure consistency between outputs. We propose a flow-enhanced frame interpolation module for temporal enhancement, which refines the intermediate flows and reconstructs the intermediate images based on the restored images of the alignment network and warped input neighboring frames. The whole network with two modules is end-to-end jointly learnable. We first evaluate the algorithm on simulation data. To demonstrate practicality, we then test it on real data by setting up a prototype camera. We propose an effective spatial degradation regularization strategy to reduce the domain gap between simulation and real data. Besides, we extend our method by integrating multi-frame exposure fusion technology to reduce overexposure areas in real scenarios. Experimental results show that our method performs favorably against state-of-the-art methods on both synthetic data and real-world data.

Better Than Reference in Low-Light Image Enhancement: Conditional Re-Enhancement Network

Low-light images suffer from severe noise, low brightness, low contrast, etc. Many image enhancement methods have been proposed in previous research, but few can deal with these problems simultaneously. In this paper, we propose a low-light image enhancement method that can be combined with supervised learning and with previous HSV (Hue, Saturation, Value) or Retinex model-based image enhancement methods. First, we analyse the relationship between the HSV color space and the Retinex theory, and show that the V channel of the enhanced image (the V channel in HSV color space, which equals the maximum channel in RGB color space) can well represent the contrast and brightness enhancement process. Then, a data-driven conditional re-enhancement network (denoted as CRENet) is proposed. The network takes low-light images as input and the enhanced V channel as a condition during testing; it then re-enhances the contrast and brightness of the low-light image while reducing noise and color distortion. In addition, it takes 23 ms to process a color image at 400×600 resolution on a 1080Ti GPU. Finally, comparative experiments are implemented to prove the effectiveness of the method. The results show that the proposed method can significantly improve the quality of the enhanced image, and by combining it with other image contrast enhancement methods, the final result can even surpass the reference image in contrast and brightness when the reference's contrast and brightness are poor.

Structure-Texture Aware Network for Low-Light Image Enhancement

Global structure and local detailed texture have different effects on image enhancement tasks. However, most existing works treated these two components in the same way, without fully considering the characteristics of the global structure and local detailed texture. In this work, we propose a structure-texture aware network (STANet) that successfully exploits structure and texture features of low-light images to improve perceptual quality. To construct STANet, a fine-scale contour map guided filter is introduced to decompose the image into a structure component and a texture component. Then, structure-attention and texture-attention subnetworks are designed to fully exploit the characteristics of these two components. Finally, a fusion subnetwork with attention mechanisms is utilized to explore the internal correlations among the global and local features. Furthermore, to optimize the proposed STANet model, we propose a hybrid loss function; specifically, a color loss function is introduced to alleviate color distortion in the enhanced image. Extensive experiments demonstrate that the proposed method improves the visual quality of images; moreover, STANet outperforms most other state-of-the-art approaches.

MAGAN: Unsupervised low-light image enhancement guided by mixed-attention

Most learning-based low-light image enhancement methods suffer from two problems. First, they require a large amount of paired data for training, which is difficult to acquire in most cases. Second, during enhancement, image noise is difficult to remove and may even be amplified; in other words, performing denoising and illumination enhancement at the same time is difficult. As an alternative to the supervised learning strategies of previous work, which use a large amount of paired data, this paper presents a mixed-attention guided generative adversarial network called MAGAN for low-light image enhancement in a fully unsupervised fashion. We introduce a mixed-attention module layer that models the relationship between each pixel and each feature of the image. In this way, our network can enhance a low-light image and remove its noise simultaneously. In addition, we conduct extensive experiments on paired and no-reference datasets to show the superiority of our method in enhancing low-light images.

A Multiple Light Scenes Suited Turbidity Analysis Method Based on Image Recognition and Information Fusion

Turbidity has been used as a significant indicator of water quality, so turbidity measurement is widely applied in sewage treatment and other fields. In the traditional measurement method of turbidity, a dark, closed measuring environment is required to reduce the interference of ambient light, which limits the application of turbidity measurement. To improve the adaptability of turbidity measurement to different light scenes, a multiple light scenes suited turbidity analysis method based on image recognition and information fusion is proposed. First, a turbidity image acquisition system is designed. After image preprocessing, prediction network groups for multiple light scenes are established, and two optimal prediction networks are adaptively selected according to different ambient light scenes, improving adaptability to multiple measuring environments. Second, to improve prediction accuracy, Dempster–Shafer (D–S) evidence theory is adopted to realize the information fusion of network prediction results. Three different light scenes of 0, 50, and 100 lx are built through experiments, and the results show that the accuracy of the proposed method in the three light scenes is above 95%, which demonstrates the adaptability to multiple light scenes and provides a new way of industrial online measurement.

Image Denoising of Seam Images With Deep Learning for Laser Vision Seam Tracking

Seam tracking with structured light vision has been widely applied in robotic welding, and precise laser stripe extraction is a prerequisite for automatic laser-vision seam tracking. However, conventional laser stripe extraction methods based on image processing lack flexibility and robustness, being easily affected by the considerable image noise generated during welding, such as arc light, smoke, and spatter. To address this issue, inspired by image segmentation and exploiting the strong contextual feature expression ability of deep convolutional neural networks (DCNNs), a novel image denoising method for seam images is proposed in this paper for automatic laser stripe extraction, serving intelligent robotic welding applications such as seam tracking, seam type detection, weld bead detection, etc. Within a deep encoder-decoder framework, an attention dense convolutional block is proposed to extract and accumulate multi-scale feature maps, addressing the information loss caused by repeated convolution and pooling operations in DCNNs. Meanwhile, a residual bi-directional ConvLSTM block (BiConvLSTM) is proposed to effectively learn multi-scale and long-range spatial contexts from local feature maps. Finally, a weighted loss function is proposed for model training to address the class imbalance issue. Experimental results on the seam image set show that the proposed network correctly extracts laser stripes from seam images, demonstrating high detection precision and good robustness against the strong image noise interference of the welding process.

IDEA-Net: Adaptive Dual Self-Attention Network for Single Image Denoising

Image denoising is a challenging task due to possible data bias and prediction variance, and existing approaches usually suffer from high computational cost. In this work, we propose an unsupervised image denoiser, dubbed the adaptIve Dual sElf-Attention Network (IDEA-Net), to handle these challenges. IDEA-Net benefits from a generatively learned image-wise dual self-attention region in which the denoising process is enforced. Besides being robust to possible data bias, IDEA-Net helps reduce the prediction variance by applying a simplified encoder-decoder with Poisson dropout operations to merely a single noisy image. IDEA-Net outperforms other single-image learning-based and non-learning denoisers on four benchmark datasets. It is also an appropriate choice for removing real-world noise in low-light, noisy scenes, which in turn contributes to more accurate dark face detection. The source code is available at https://github.com/zhemingzuo/IDEA-Net.

Unsupervised Image Restoration With Quality-Task-Perception Loss

Image restoration comprises various tasks, such as image denoising, image deraining, and low-light image enhancement. Due to the domain shift problem of current supervised methods, researchers tend to adopt unsupervised image restoration methods. However, fake colors or blurred images, insufficient restoration, and missing semantic information are three common problems when utilizing these methods. In this paper, we propose a new hybrid loss named Quality-Task-Perception (QTP) to deal with these three problems simultaneously. This hybrid loss includes three components: quality, task, and perception. The quality part overcomes the fake-color or blurred-image problem by enforcing that the image quality scores of the restored images and those of unpaired clean images be similar. For the task part, we tackle the insufficient restoration problem by applying a task probability network, learned from our proposed pipeline, to convert unsupervised image restoration into a supervised classification problem. The perception part handles the missing semantic information by restricting the multi-scale phase consistency between the degraded image and its restored version. Comprehensive experiments on both supervised and unsupervised datasets across three image restoration tasks demonstrate the superiority of our proposed approach.

A Low-Light Enhancement Network-LLENet

Low-light enhancement is one of the essential technologies for the construction of modern smart cities and is particularly important for improving urban real-time information capabilities: destabilizing factors cannot hide in the dark, which contributes to smart city construction. Inspired by Retinex theory, this paper designs a Low-Light Enhancement Network (LLENet) consisting of two parts, image decomposition and enhancement. The enhancement part is further divided into an image restoration section and an adjustment section. In the restoration section, we abandon the traditional interpolation method for recovery in favor of a U-Net, and additionally add an attention mechanism before the U-Net enhancement channel. The network remains end-to-end trainable. The LOL dataset is used in the experiments. Extensive experiments show that this method achieves very satisfactory low-light enhancement and performs well in noise removal. With the addition of the U-Net, image distortion is avoided by sampling on the reflectance channel, revealing the real scene.

Progressive Joint Low-Light Enhancement and Noise Removal for Raw Images

Low-light imaging on mobile devices is typically challenging due to insufficient incident light coming through the relatively small aperture, resulting in low image quality. Most of the previous works on low-light imaging focus either only on a single task such as illumination adjustment, color enhancement, or noise removal; or on a joint illumination adjustment and denoising task that heavily relies on short-long exposure image pairs from specific camera models. These approaches are less practical and generalizable in real-world settings where camera-specific joint enhancement and restoration is required. In this paper, we propose a low-light imaging framework that performs joint illumination adjustment, color enhancement, and denoising to tackle this problem. Considering the difficulty in model-specific data collection and the ultra-high definition of the captured images, we design two branches: a coefficient estimation branch and a joint operation branch. The coefficient estimation branch works in a low-resolution space and predicts the coefficients for enhancement via bilateral learning, whereas the joint operation branch works in a full-resolution space and progressively performs joint enhancement and denoising. In contrast to existing methods, our framework does not need to recollect massive data when adapted to another camera model, which significantly reduces the efforts required to fine-tune our approach for practical usage. Through extensive experiments, we demonstrate its great potential in real-world low-light imaging applications.

Adaptive Material Matching for Hyperspectral Imagery Destriping

Due to instrument instability, slit contamination, and light interference, hyperspectral images often suffer from striping artifacts, which greatly impairs the data quality. Real hyperspectral data are usually characterized by a small amount of historical data, complex material distribution, insignificant periodicity of noise, and so on, which brings significant challenges for the destriping task. However, the assumptions made by traditional destriping methods are often inconsistent with these characteristics. To this end, we propose a novel destriping method based on adaptive material matching (MAM) without making explicit assumptions of hyperspectral data. Specifically, to identify pixels that belong to the same material, we propose a principal material analysis (PMA) to adaptively generate thresholds within each superpixel. The pixels are matched by thresholding their vertical gradients and leveraging both inner stripe gradient feature (ISGF) and neighbor-stripe geometry feature (NSGF). Correction pixels selected from the same material can then be used to calculate the offsets and gains of pixels to adjust adjacent columns. To further improve the stability of the destriping process, we generate a set of correction candidates for each column and select the optimal candidate by considering the prior distribution and destriping nonuniformity. The stripe noise within the whole image is finally removed by iteratively performing the correction between adjacent columns. We compare the proposed model against traditional and deep learning methods on both synthetic and real hyperspectral images. The promising results indicate that MAM can effectively remove the image stripes, retain original image information, and improve the nonuniformity.
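
The gain/offset correction between adjacent columns is the arithmetic core here; a hedged sketch of the classic column-wise moment-matching version of that step (MAM's material matching and candidate selection are not reproduced):

```python
# Hedged sketch: classic column-wise moment matching for vertical stripe removal.
# MAM refines this idea with per-material pixel matching; this is only the
# gain/offset core applied between adjacent columns.
import numpy as np

def destripe_moment_matching(band):
    out = band.astype(np.float64).copy()
    for c in range(1, out.shape[1]):
        ref, cur = out[:, c - 1], out[:, c]
        gain = ref.std() / (cur.std() + 1e-8)       # match column variance
        offset = ref.mean() - gain * cur.mean()     # match column mean
        out[:, c] = gain * cur + offset
    return out
```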

Low-light Enhancement Using Retinex-Decomposition Convolutional Neural Networks

This paper proposes a new retinex-decomposition convolutional network (DC-Net) to enhance low-light images based on Retinex theory. The proposed method estimates the reflectance and illumination components using DC-Net. Bright-Net and Smooth-Net are used to refine the illumination, and Denoise-Net returns the noise-removed reflectance. Finally, the resultant image is estimated by multiplying the noise-removed reflectance map by the brightness-improved illumination. The experimental results show that the proposed scheme can provide high-quality images without saturation.
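
The final recomposition step is explicit in the abstract: the result is the product of the denoised reflectance and the refined illumination. A minimal sketch, assuming maps normalized to [0, 1] and a single-channel illumination map:

```python
# Hedged sketch of the Retinex recomposition step: enhanced image =
# (denoised reflectance) * (refined illumination), element-wise.
import numpy as np

def recompose(reflectance, illumination):
    # reflectance: (H, W, 3) in [0, 1]; illumination: (H, W) in [0, 1],
    # broadcast across the color channels
    result = reflectance * illumination[..., None]
    return np.clip(result, 0.0, 1.0)
```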

A Study on the Noise Removal in Road Images Acquired from Black Box Based on Convolutional Neural Networks

In this paper, we propose a convolutional neural network-based method to remove noise from road images acquired from a vehicle black-box. In general, various kinds of noise are included in road images acquired from a black-box installed in a vehicle while driving, for example due to vehicle exhaust gas, light reflected by objects, fog, and sunlight. To remove this noise, a regression model that estimates the original image from a noisy road image is generated by training a convolutional neural network on original road images and their noise-added counterparts. Tests of the proposed method on actual road images showed an average noise removal rate of about 24%.

I-GANs for Synthetical Infrared Images Generation

Due to the insensitivity of infrared images to changes in light intensity and weather conditions, these images are used in many surveillance systems and different fields. However, despite all the applications and benefits of these images, not enough data is available in many applications due to the high cost, time-consuming, and complicated data preparation. Two deep neural networks based on Conditional Generative Adversarial Networks are introduced to solve this problem and produce synthetical infrared images. One of these models is only for problems where the pair to pair visible and infrared images are available, and as a result, the mapping between these two domains will be learned. Given that in many of the problems we face unpaired data, another network is proposed in which the goal is to obtain a mapping from visible to infrared images so that the distribution of synthetical infrared images is indistinguishable from the real ones. Two publicly available datasets have been used to train and test the proposed models. Results properly demonstrate that the evaluation of the proposed system in regard to peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) has improved by 4.6199% and 3.9196%, respectively, compared to previous models.

Fast and Lightweight Network for Single Frame Structured Illumination Microscopy Super-Resolution

Structured illumination microscopy (SIM) is an important super-resolution-based microscopy technique that breaks the diffraction limit and enhances optical microscopy systems. With the development of biology and medical engineering, there is a high demand for real-time and robust SIM imaging under extreme low-light and short-exposure environments. Existing SIM techniques typically require multiple structured illumination frames to produce a high-resolution image. In this article, we propose a single-frame SIM (SF-SIM) based on deep learning. Our SF-SIM only needs one shot of a structured illumination frame and generates similar results compared with the traditional SIM systems that typically require 15 shots. In our SF-SIM, we propose a noise estimator that can effectively suppress the noise in the image and enable our method to work in the low-light and short-exposure environment without the need for stacking multiple frames for nonlocal denoising. We also design a bandpass attention module that makes our deep network more sensitive to the change of frequency and enhances the imaging quality. Our proposed SF-SIM is almost 14 times faster than traditional SIM methods when achieving similar results. Therefore, our method is significantly valuable for the development of microbiology and medicine.

Image Enhancement Algorithm Based on GAN Neural Network

Deep underwater color images have problems such as low brightness, poor contrast, and loss of local details. In order to effectively enhance low-quality underwater images, this paper proposes an enhancement method based on a GAN (Generative Adversarial Network). It studies low-light image enhancement algorithms, aiming to improve the quality of low-light images and restore the original scene information of low-quality images, so as to obtain natural and clear images with complete details and structural information. To verify the effectiveness of this method, image databases such as DIARETDB0 and SID are used as research objects, with multi-scale Retinex with color restoration and contrast-limited adaptive histogram equalization as comparison baselines for the enhancement algorithm. The results show that the processed images are better than those of other image enhancement methods in terms of color preservation, contrast enhancement, and image detail enhancement, and the proposed method significantly improves the indicators reported in the article.

Underwater Image Enhancement Using Dual Convolutional Neural Network with Skip Connections

High-quality underwater images are important for many applications, but captured images are often of poor quality since they suffer from fog, low brightness, color distortion, and reduced contrast. Underwater image quality degrades with water depth, since red light is absorbed more than blue and green light and light is scattered by suspended particles. Although several traditional and deep learning based approaches have been proposed to enhance and restore such images, producing a high-quality enhanced image with natural color is still challenging. In this paper, a novel convolutional neural network architecture is proposed with two identical branches that take a raw degraded image and a color-balanced image as inputs. Dense blocks are utilized to train the model with fewer parameters. In addition, skip connections are introduced over the dense blocks to preserve spatial information. The proposed approach is evaluated on the publicly available UIEB dataset and achieves a PSNR of 28.67 dB and an SSIM of 0.89, which are better than the state-of-the-art approaches.

Unsupervised Decomposition and Correction Network for Low-Light Image Enhancement

Vision-based intelligent driving assistance systems and transportation systems can be improved by enhancing the visibility of scenes captured in extremely challenging conditions. In particular, many low-light image enhancement (LIE) algorithms have been proposed to facilitate such applications in low-light conditions. While deep learning-based methods have achieved substantial success in this field, most of them require paired training data, which is difficult to collect. This paper advocates a novel Unsupervised Decomposition and Correction Network (UDCN) for LIE that does not depend on paired data for training. Inspired by the Retinex model, our method first decomposes images into illumination and reflectance components with an image decomposition network (IDN). Then, the decomposed illumination is processed by an illumination correction network (ICN) and fused with the reflectance to generate a primary enhanced result. In contrast with fully supervised learning approaches, UDCN is an unsupervised one trained only with low-light images and their corresponding histogram equalized (HE) counterparts (which can be derived from the low-light image itself) as input. Both the decomposition and correction networks are optimized under the guidance of hybrid no-reference quality-aware losses and inter-consistency constraints between the low-light image and its HE counterpart. In addition, we also utilize an unsupervised noise removal network (NRN) to remove the noise previously hidden in the darkness to further improve the primary result. Qualitative and quantitative comparison results are reported to demonstrate the efficacy of UDCN and its superiority over several representative alternatives in the literature. The results and code will be made publicly available at https://github.com/myd945/UDCN .
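
The HE counterpart that UDCN trains against can be derived from the low-light input itself; a minimal sketch, assuming an 8-bit BGR image and equalizing only the luminance channel (the channel choice is my assumption, not stated in the abstract):

```python
# Hedged sketch: deriving the histogram-equalized (HE) counterpart that serves
# as UDCN's only "reference", directly from the low-light input itself.
import cv2

def he_counterpart(low_light_bgr):
    # assumes an 8-bit BGR image; equalize luminance only to avoid color shifts
    ycrcb = cv2.cvtColor(low_light_bgr, cv2.COLOR_BGR2YCrCb)
    ycrcb[..., 0] = cv2.equalizeHist(ycrcb[..., 0])
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
```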

SpO2 Measurement: Non-Idealities and Ways to Improve Estimation Accuracy in Wearable Pulse Oximeters

The blood oxygen saturation level (SpO2) has become one of the vital body parameters for the early detection, monitoring, and tracking of the symptoms of coronavirus disease 2019 (COVID-19) and is clinically accepted for patient care and diagnostics. Pulse oximetry provides non-invasive SpO2 monitoring at home and in ICUs without the need for a physician. However, the accuracy of SpO2 estimation in wearable pulse oximeters remains a challenge due to various non-idealities. We propose a method to improve estimation accuracy by denoising the red and IR signals, detecting the signal quality, and providing feedback to the hardware to adjust signal-chain parameters such as LED current or transimpedance amplifier gain and enhance the signal quality. SpO2 is calculated using the red and infrared photoplethysmogram (PPG) signals acquired from the wrist using the Texas Instruments AFE4950EVM. We introduce the green PPG signal as a reference to obtain the window size of the moving average filter for baseline wander removal and as a timing reference for peak and valley detection in the red and infrared PPG signals. We propose an improved peak and valley detection algorithm based on the incremental merge segmentation algorithm. Kurtosis, entropy, and signal-to-noise ratio (SNR) are used as signal quality parameters, and SNR is further related to the variance in the SpO2 measurement. A closed-loop implementation is performed to enhance signal quality based on the signal quality parameters of the recorded PPG signals. The proposed algorithm aims to estimate SpO2 with a variance of 1% for pulse oximetry devices.
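
A hedged sketch of the baseline-wander-removal step, assuming the window length has already been derived from the green-channel pulse period as the abstract describes:

```python
# Hedged sketch: baseline-wander removal by subtracting a moving average whose
# window length comes from the pulse period estimated on the green PPG channel.
import numpy as np

def remove_baseline(ppg, pulse_period_samples):
    win = int(pulse_period_samples) | 1            # odd window ~ one pulse period
    kernel = np.ones(win) / win
    baseline = np.convolve(ppg, kernel, mode="same")
    return ppg - baseline
```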

Low-Light Image Enhancement via Feature Restoration

Besides poor visibility, under-exposed images often suffer from severe noise and color distortion. Most existing Retinex-based methods deal with the noise and color distortion via careful designs for denoising and/or color correction. In this paper, we propose a simple yet effective network from the perspective of feature map restoration that mitigates such issues without constructing any explicit modules. More concretely, we build an encoder-decoder network to reconstruct images, while a feature restoration subnet is introduced to transform the features of low-light images into those of corresponding clear ones. The enhanced images are consequently acquired by assembling the restored features with the decoder, whereby the noise and possible color distortion can be greatly remedied. Extensive experiments on widely-used datasets are conducted to validate the superiority of our design over other state-of-the-art alternatives both quantitatively and qualitatively. Our code is available at https://github.com/YaN9-Y/FRLIE.

Content Preserving Scale Space Network for Fast Image Restoration from Noisy-Blurry Pairs

Hand-held photography in low-light conditions presents a number of challenges for capturing high quality images. Capturing with a high ISO results in noisy images, while capturing with a longer exposure results in blurry images. This necessitates post-processing techniques to restore the latent image. Most existing methods try to estimate the latent image either by denoising or by deblurring a single image. Both of these approaches are ill-posed and often produce unsatisfactory results. A few methods try to alleviate this ill-posedness using a pair of noisy-blurry images as inputs; however, most methods using this approach are computationally very expensive. In this paper, we propose a fast method to estimate a latent image given a pair of noisy-blurry images. To accomplish this, we propose a deep-learning based approach that uses a scale space representation of the images. To improve computational efficiency, we process higher scale spaces using shallower networks and the lowest scale using a deeper network. Also, unlike existing scale-space methods that use bi-cubic interpolation, we propose a content preserving scale space transformation for decimation and interpolation. The proposed method generates state-of-the-art results at reduced computational complexity compared to state-of-the-art methods. Finally, we also show that computational efficiency can be improved by 90% compared to the baseline with only a marginal drop in PSNR.

A Multiresolution Method for Non-Contact Heart Rate Estimation Using Facial Video Frames

In recent years, camera-based non-contact heart rate (HR) measurement technology has grown immensely. The system captures the reflection of light from the facial tissues, which leads to the formation of a remote photoplethysmogram (rPPG) signal that can be used to measure physiological parameters for cardiac health assessment. Due to environmental interference, extraction of a reliable rPPG signal is a challenging task and thus requires a robust denoising algorithm. In this paper, a discrete wavelet transform (DWT)-based multiresolution method is used to remove the noise from the video frames caused by illumination variation and motion artifacts. Subsequently, the rPPG signal is extracted and HR is measured from two regions of interest (ROIs): the facial and forehead regions. The study evaluates the performance of the proposed method on each of the RGB color channels from both ROIs. The performance results for the COHFACE dataset show that the proposed method works well for the estimation of HR values. Furthermore, they reveal that the forehead region on the green channel is more suitable for HR measurement.
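
A minimal sketch of DWT-based denoising by soft-thresholding detail coefficients, shown on a 1-D trace for brevity (the paper applies the idea to video frames); the `db4` wavelet, decomposition level, and universal-threshold rule are illustrative choices, not necessarily the paper's:

```python
# Hedged sketch of DWT multiresolution denoising: decompose, soft-threshold
# the detail coefficients, and reconstruct.
import numpy as np
import pywt

def dwt_denoise(signal, wavelet="db4", level=4):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745          # robust noise estimate
    thresh = sigma * np.sqrt(2 * np.log(len(signal)))       # universal threshold
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)
```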

Research on Laser Polarization Image Reconstruction Based on Wavelet Transform and Deep Learning

The traditional laser polarization image reconstruction method is affected by environmental noise, resulting in poor reconstruction results. For this reason, a laser polarization image reconstruction method based on wavelet transform and deep learning is designed. A convolutional neural network is used to denoise the image, the wavelet transform is used to extract the image texture features, and the holistically nested edge detection method from deep learning is introduced to detect edges. In addition, the feature fusion module in the wavelet transform is used for processing, with a Multiscale Dilated Dense Block (MDDB) added for experimental laser polarization image reconstruction. The experimental comparison results show that the proposed method can accurately identify the target in the image, make full use of the activation functions to learn and identify image features, and effectively prevent the loss of important information during image feature learning and identification. This method significantly improves the quality of reconstructed images and achieves better visual effects.

BP-EVD: Forward Block-Output Propagation for Efficient Video Denoising

Denoising videos in real-time is critical in many applications, including robotics and medicine, where varying-light conditions, miniaturized sensors, and optics can substantially compromise image quality. This work proposes the first video denoising method based on a deep neural network that achieves state-of-the-art performance on dynamic scenes while running in real-time on VGA video resolution with no frame latency. The backbone of our method is a novel, remarkably simple, temporal network of cascaded blocks with forward block output propagation. We train our architecture with short, long, and global residual connections by minimizing the restoration loss of pairs of frames, leading to a more effective training across noise levels. It is robust to heavy noise following Poisson-Gaussian noise statistics. The algorithm is evaluated on RAW and RGB data. We propose a denoising algorithm that requires no future frames to denoise a current frame, reducing its latency considerably. The visual and quantitative results show that our algorithm achieves state-of-the-art performance among efficient algorithms, achieving from two-fold to two-orders-of-magnitude speed-ups on standard benchmarks for video denoising.
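
A hedged toy sketch of the forward block-output propagation idea: each block's output for frame t is carried into frame t+1, so no future frames are needed (the real architecture, channel widths, and residual connections are not reproduced):

```python
# Hedged sketch: causal, zero-latency video denoising where each cascaded
# block's output at frame t is propagated forward into frame t+1.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.conv = nn.Conv2d(2 * ch, ch, 3, padding=1)
    def forward(self, feat, prev_out):
        # fuse current features with this block's output from the previous frame
        return torch.relu(self.conv(torch.cat([feat, prev_out], dim=1)))

blocks = nn.ModuleList([Block() for _ in range(3)])
head = nn.Conv2d(1, 32, 3, padding=1)
tail = nn.Conv2d(32, 1, 3, padding=1)

video = torch.rand(10, 1, 64, 64)                    # T noisy frames
state = [torch.zeros(1, 32, 64, 64) for _ in blocks] # per-block carried outputs
denoised = []
for t in range(video.shape[0]):                      # strictly causal: no future frames
    feat = torch.relu(head(video[t:t + 1]))
    for i, blk in enumerate(blocks):
        feat = blk(feat, state[i])
        state[i] = feat.detach()                     # propagate forward in time
    denoised.append(tail(feat))
```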

Self-Supervised Low-Light Image Enhancement Using Discrepant Untrained Network Priors

This paper proposes a deep learning method for low-light image enhancement, which exploits the generation capability of Neural Networks (NNs) while requiring no training samples except the input image itself. Based on the Retinex decomposition model, the reflectance and illumination of a low-light image are parameterized by two untrained NNs. The ambiguity between the two layers is resolved by the discrepancy between the two NNs in terms of architecture and capacity, while the complex noise with spatially-varying characteristics is handled by an illumination-adaptive self-supervised denoising module. The enhancement is done by jointly optimizing the Retinex decomposition and the illumination adjustment. Extensive experiments show that the proposed method not only outperforms existing non-learning-based and unsupervised-learning-based methods, but also competes favorably with some supervised-learning-based methods in extreme low-light conditions.

CERL: A Unified Optimization Framework for Light Enhancement With Realistic Noise

Low-light images captured in the real world are inevitably corrupted by sensor noise. Such noise is spatially variant and highly dependent on the underlying pixel intensity, deviating from the oversimplified assumptions in conventional denoising. Existing light enhancement methods either overlook the important impact of real-world noise during enhancement, or treat noise removal as a separate pre- or post-processing step. We present Coordinated Enhancement for Real-world Low-light Noisy Images (CERL), which seamlessly integrates the light enhancement and noise suppression parts into a unified and physics-grounded optimization framework. For the real low-light noise removal part, we customize a self-supervised denoising model that can easily be adapted without referring to clean ground-truth images. For the light enhancement part, we also improve the design of a state-of-the-art backbone. The two parts are then jointly formulated into one principled plug-and-play optimization. Our approach is compared against state-of-the-art low-light enhancement methods both qualitatively and quantitatively. Besides standard benchmarks, we further collect and test on a new realistic low-light mobile photography dataset (RLMP), whose mobile-captured photos display heavier realistic noise than those taken by high-quality cameras. CERL consistently produces the most visually pleasing and artifact-free results across all experiments. Our RLMP dataset and codes are available at: https://github.com/VITA-Group/CERL .

FIBS-Unet: Feature Integration and Block Smoothing Network for Single Image Dehazing

Dehazing algorithms are based on the haze simulation equation: they remove haze and restore the input image feature maps by estimating the intensity coefficient of the atmospheric light source and the scattering coefficient of the atmosphere. However, coefficient prediction is often inaccurate, resulting in artifact noise in the dehazed output image. Deep learning algorithms are increasingly being applied in computer vision to combat noise and interference in hazy pictures. This paper proposes FIBS-Unet, an efficient Feature Integration and Block Smoothing framework built on a U-Net encoder-decoder with an intensity attention block. We modified the Res2Net residual block with customized convolutions and added instance normalization to improve the encoder's feature extraction efficiency. Besides, we designed the Intensity Attention Block (IAB) using a sub-pixel layer and 1×1 convolution to amplify input feature maps and fused feature maps. We developed an efficient decoder employing sub-pixel convolutions, concatenations, contrived convolutions, and multipliers to recover smooth, high-quality feature maps. The proposed FIBS-Unet minimizes the Mean Absolute Error (MAE) with a perceptual loss function on the RESIDE dataset. We calculated the Peak Signal-to-Noise Ratio (PSNR), the Structural Similarity Index Measure (SSIM), and a subjective visual color difference to evaluate the model's effectiveness. The proposed FIBS-Unet achieves high-quality dehazing results (PSNR: 34.122, SSIM: 0.9890) on outdoor scenes with dense haze and backlit images from the Synthetic Objective Testing Set (SOTS). Our extensive experimental results indicate that the proposed FIBS-Unet is extendable to real-time applications.

Multi-scale Feature Recovery of Low-Light Enhancement Algorithm Based on U-net Network and Perceptual Loss

Handling low-light images is a challenging problem in the field of image processing. A mature low-light enhancement technique not only benefits human visual perception but also lays a good foundation for subsequent advanced tasks such as target detection and image classification. To balance the visual effect of an image and its contribution to subsequent tasks, we propose an innovative multi-scale feature recovery low-light enhancement model combining a U-net network and perceptual loss. It first passes the image through a denoising module to reduce the uncertainty of the subsequent enhancement process, and then trains a tandem end-to-end network on a synthetic image training dataset to complete the entire enhancement process. The unique U-shaped structure with multiple loss function constraints allows the algorithm to better recover feature information at different scales of the image, resulting in a more distinctive result image with normal illumination. Extensive experiments on synthetic and real low-light images show that the algorithm not only produces enhancement results with more natural colors, bringing a pleasant visual experience to the observer and satisfying basic low-light enhancement needs, but also recovers as many features and image details as possible, which benefits subsequent image vision tasks.

An Efficient Transformer with Shifted Windows for Low-light Image Enhancement

In low-light environments, images show varying degrees of color and texture degradation and contain a lot of noise, which seriously affects human visual perception. State-of-the-art methods for low-light image enhancement are usually based on convolutional neural networks (CNNs). Due to the CNN's limited ability to capture global contextual information, the images enhanced by these methods lack details and noise cannot be effectively suppressed. Since the Transformer can effectively capture global contextual information, we combine it with CNNs to make up for these defects and improve the luminance, texture, and color of images. In this paper, we propose an efficient Transformer with shifted windows for low-light image enhancement (SwinLIE). SwinLIE includes two important modules: the Enhanced Swin Transformer block (ESTB) and the adaptive feature fusion block (AFFB). ESTB is embedded in each stage of the model to fully capture global contextual information, while AFFB fuses features from different stages. We conduct extensive comparisons with state-of-the-art methods on the LOL dataset, and experiments show that the proposed method achieves the best PSNR, SSIM, and NIQE. At the same time, the proposed method has low equipment requirements and quick running time. The images reconstructed by our method have rich colors and textures, and noise is effectively suppressed.

Unsupervised Region-Based Denoising for Optical Coherence Tomography Framework

Optical Coherence Tomography (OCT) is an emerging imaging tool that is now widely adopted in medical settings such as cardiology and ophthalmology and is emerging in dentistry. In OCT, low-coherence light is used for image capture, which introduces speckle noise. Specifically, a degraded signal-to-noise ratio accentuates ambiguity in feature extraction and contributes to the introduction of artefacts, which ultimately impacts clinical utility where clear diagnostic detail is essential. In this work an unsupervised deep learning denoising framework for OCT images is proposed, incorporating attention gate encoders. Attention gates are utilized to focus denoising on the foreground and to 'hard-threshold' the background. Training data was created by processing the images with state-of-the-art denoisers (BM3D, NLM, etc.) to emphasize only essential data removal. The proposed framework was analysed quantitatively and visually against state-of-the-art denoising algorithms. The experimental results show that the approach verifiably removes speckle noise and achieves superior quality to well-known denoisers, improving PSNR by 29.6 dB, CNR by 11.5 dB, and ENL by 1196.6 compared to the original image and state-of-the-art denoisers.

Con-Net: A Consolidated Light-Weight Image Restoration Network

There has been a considerable gap between recent high-resolution display technologies and the shortage of content matching them. Moreover, most existing restoration methods are restricted by local convolution operations and equal treatment of the diverse information in a degraded image. These degradation-specific approaches employ the same rigid spatial processing across different images, ultimately resulting in high memory consumption. To overcome this limitation we propose Con-Net, a network design capable of exploiting the non-uniformities of degradations in the spatial domain with a limited number of parameters (656k). Our proposed Con-Net comprises two main components: (1) a spatial-degradation-aware network for extracting the diverse information inherent in any degraded image, and (2) a holistic attention refinement network for exploiting the knowledge from the degradation-aware network to selectively restore the degraded pixels. In a nutshell, our proposed method generalizes to three applications: image denoising, super-resolution, and real-world low-light enhancement. Extensive qualitative and quantitative comparison with prior arts on 8 benchmark datasets demonstrates the efficacy of our proposed Con-Net over existing state-of-the-art degradation-specific architectures, with huge parameter and FLOPs reductions in all three tasks.

Joint Contrast Enhancement and Noise Suppression of Low-light Images Via Deep Learning

Captured images suffer from low sharpness, low contrast, and unwanted noise when the imaging device is used in conditions such as backlighting, overexposure, or darkness. Learning-based low-light enhancement methods have robust feature learning and mapping capabilities. Therefore, we propose a learning-based joint contrast enhancement and noise suppression method for low-light images (termed JCENS). JCENS is mainly composed of three subnetworks: the low-light image denoising network (LDNet), the attention feature extraction network (AENet), and the low-light image enhancement network (LENet). In particular, LDNet produces a low-light image devoid of unwanted noise. AENet mitigates the impact of LDNet's denoising process on local details and generates attention enhancement features. Ultimately, LENet combines the outputs of LDNet and AENet to produce a noise-free, normal-light enhanced image. The proposed network achieves a balance between brightness enhancement and noise suppression, thereby better preserving salient image details. Both synthetic and realistic experiments have demonstrated the superior performance of our JCENS in terms of quantitative evaluations and visual image quality.

Intelligent detection UAV for illegal fishing based on YOLOv5 algorithm

With the arrival of 2021, the Yangtze River entered a ten-year fishing ban. The purpose of the ban is to protect the fish species living in the river: under years of massive fishing, most of them faced population decline, and the ban can effectively alleviate this crisis and restore the river's ecology. However, since the ban's implementation, poaching in the Yangtze River Basin has not stopped, while traditional manual surveys are limited by conditions such as night and fog and by the lack of visibility in most waters. To address these problems, this project designs an intelligent detection system for illegal fishing vessels in the Yangtze River under complex environments. Using UAVs as the carrier, and considering that most illegal fishing is concentrated at night, the project focuses on infrared enhancement and recognition, supplemented by visible-light enhancement and recognition, with deep learning theory and methods as the technical core. Video images are enhanced by a convolutional-neural-network-based image enhancement method, and infrared/visible intelligent visual perception enhancement together with clustered-UAV detection and tracking of illegal fishermen is constructed. Combined with an intelligent auxiliary law enforcement system and a variety of classical image processing algorithms, the detection of illegal fishing vessels is finally realized.

Object Detection using Image Dehazing: A Journey Of Visual Improvement

Object detection in hazy conditions is very challenging, as haze significantly degrades image visibility, especially in outdoor settings. Here we introduce a method to deal with haze present in images: before applying any object detection method to a hazy input image, the image is dehazed first and recognition is performed afterwards. For dehazing we use the All-in-One Dehazing Network (AOD-Net), which is based on a reformulation of the atmospheric scattering model and generates clean, clear images through a lightweight CNN; for recognition we use the third version of the well-known YOLO, i.e., YOLOv3. We test our method on various real-world hazy images and compare object-similarity results on the hazy image as well as the dehazed image, along with the number of objects recognised in the hazy image versus the output clear image.

Auto-Refining 3D Mesh Reconstruction Algorithm From Limited Angle Depth Data

3D object reconstruction is a very rapidly developing field, especially from a single perspective. Yet the majority of modern research focuses on developing algorithms around single static object reconstruction, in most cases derived from synthetically generated datasets, failing, or at least working insufficiently accurately, in real-world data scenarios when restoring a 3D object from a deficient real-world frame. To solve that problem, we introduce an extended version of the three-staged deep auto-refining adversarial neural network architecture that can denoise and refine real-world depth sensor data, outperforming current methods for full human body pose reconstruction in both Earth Mover's (0.059) and Chamfer (0.079) distances. Visual inspection of the reconstructed point cloud showed potential for future adaptation to most depth-sensor noise defects, for both structured-light depth sensors and LiDAR sensors.

Self-supervision versus synthetic datasets: which is the lesser evil in the context of video denoising?

Supervised training has led to state-of-the-art results in image and video denoising. However, its application to real data is limited since it requires large datasets of noisy-clean pairs that are difficult to obtain. For this reason, networks are often trained on realistic synthetic data. More recently, some self-supervised frameworks have been proposed for training such denoising networks directly on the noisy data without requiring ground truth. On synthetic denoising problems supervised training outperforms self-supervised approaches, however in recent years the gap has become narrower, especially for video. In this paper, we propose a study aiming to determine which is the best approach to train denoising networks for real raw videos: supervision on synthetic realistic data or self-supervision on real data. A complete study with quantitative results in case of natural videos with real motion is impossible since no dataset with clean-noisy pairs exists. We address this issue by considering three independent experiments in which we compare the two frameworks. We found that self-supervision on the real data outperforms supervision on synthetic data, and that in normal illumination conditions the drop in performance is due to the synthetic ground truth generation, not the noise model.

DeblurGAN-CNN: Effective Image Denoising and Recognition for Noisy Handwritten Characters

Many problems can reduce handwritten character recognition performance, such as image degradation, light conditions, low-resolution images, and even the quality of the capture devices. In this research, we focus on noise in character images that could decrease the accuracy of handwritten character recognition. Many types of noise penalties influence recognition performance, for example, low resolution, Gaussian noise, low contrast, and blur. First, this research proposes a method that learns from noisy handwritten character images and synthesizes clean character images using the robust deblur generative adversarial network (DeblurGAN). Second, we combine the DeblurGAN architecture with a convolutional neural network (CNN), called DeblurGAN-CNN. Subsequently, two state-of-the-art CNN architectures are combined with DeblurGAN, namely DeblurGAN-DenseNet121 and DeblurGAN-MobileNetV2, to address many noise problems and enhance recognition performance on handwritten character images. Finally, DeblurGAN-CNN can transform noisy characters into new clean characters and recognize the clean characters simultaneously. We have evaluated and compared the experimental results of the proposed DeblurGAN-CNN architectures with existing methods on four handwritten character datasets: n-THI-C68, n-MNIST, THI-C68, and THCC-67. On the n-THI-C68 dataset, DeblurGAN-CNN achieved above 98% accuracy and outperformed the other existing methods. On n-MNIST, the proposed DeblurGAN-CNN achieved an accuracy of 97.59% when the AWGN+Contrast noise method was applied to the handwritten digits. On the THCC-67 dataset, the proposed DeblurGAN-CNN achieved an accuracy of 80.68%, approximately 10 percentage points higher than the existing method.

Removal of Gaussian Distributed Noise in Images with Deep Neural Network Models

The removal of noise caused by environmental factors in microscopic imaging has become an important research topic in the field of medical imaging. In medical imaging with any digital microscopy method (confocal, fluorescence, etc.), undesirable noise is added to the obtained image due to factors such as excessive or low illumination, high or low temperature, or electronic circuit equipment. The most basic noise model arising from these environmental factors is the Gaussian normal distribution, or a characteristic function close to this distribution. It is widely known that spatial filters (mean, median, Gaussian smoothing) are applied to eliminate Gaussian noise in digital image processing. However, undesirable results may occur when spatial filters are used to remove noise from images: because high frequencies are suppressed, details are lost in the final image and a blurred image is obtained. For this reason, four different convolutional neural network-based models are used in this study for noise removal and to improve PSNR values. As a result, the modified U-Net improved the PSNR values for different noise levels by +6.23, +7.88, and +10.52 dB.
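
A minimal sketch of the spatial-filter baseline the paper argues against, showing how Gaussian smoothing trades noise for blur; the sigma and noise level are illustrative:

```python
# Hedged sketch: the classical Gaussian-smoothing baseline; it removes noise
# but also suppresses high frequencies, blurring edges and fine details.
import numpy as np
from scipy.ndimage import gaussian_filter

def psnr(a, b, peak=1.0):
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

clean = np.random.rand(128, 128)                         # placeholder image
noisy = clean + np.random.normal(0, 0.1, clean.shape)    # additive Gaussian noise
smoothed = gaussian_filter(noisy, sigma=1.0)             # spatial-filter denoising
print(psnr(noisy, clean), psnr(smoothed, clean))
```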

Low Illumination Image Enhancement Algorithm Based on HSV-RNET

Aiming at the problems of color distortion, blurred edges, and heavy noise in the RetinexNet image enhancement algorithm, a low-illumination image enhancement algorithm based on HSV-RNET is proposed. First, the image is converted to the HSV color space and the illuminance channel V is enhanced separately. Second, a denoising loss is added to control the noise of the reflectance image in the decomposition network, and a color loss function is added to the enhancement network to reduce color distortion. Third, after denoising the decomposed reflectance image R, Canny-based image sharpening is applied and the result is combined with the enhanced illumination image Î. Finally, the HSV image is converted back to RGB. The experimental results show that, compared with image enhancement algorithms such as LIME and SRIE, the proposed algorithm can effectively reduce color distortion and edge blur and improve the visual effect of the image, with clear advantages on quality evaluation indexes.
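
A minimal sketch of the HSV skeleton of such a pipeline, with a simple gamma curve standing in for the learned V-channel enhancement (the decomposition/denoising networks and loss terms are not reproduced):

```python
# Hedged sketch: convert to HSV, enhance only the V (illuminance) channel,
# then convert back; the gamma curve is a hypothetical stand-in enhancer.
import cv2
import numpy as np

def enhance_v_channel(bgr, gamma=0.5):
    # assumes an 8-bit BGR image
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    v = hsv[..., 2] / 255.0
    hsv[..., 2] = np.clip((v ** gamma) * 255.0, 0, 255)  # brighten dark regions
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
```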

NonReference Mapping Net

It is very challenging to collect images under low-light conditions due to low photon counts. When the exposure time is short, photons are so scarce that objects in the image cannot be recognized and the noise is amplified, yet long exposure times are almost impossible to achieve in daily shooting. In response to these challenges, many methods have been proposed, but most of them target only specific lighting conditions. In recent years, deep learning models trained on large amounts of images have been used for low-light image enhancement, but many of these methods require strictly paired images. This paper proposes a new idea, the no-reference mapping network (NRM-Net), which only needs images collected in dark scenes. NRM-Net is a pixel-wise nonlinear mapping network, which realizes pixel-wise adjustment of low-light images over a high dynamic range. This is achieved mainly through our carefully designed high-order nonlinear mapping function and a set of loss functions.

Terrain Recognition Based on the Carrier-Free UWB Radar Using Stacked Denoising Autoencoder

The carrier-free ultra-wideband (UWB) sensor is characterized by high distance resolution and high interference immunity. It is not easily affected by weather and lighting conditions, and its received echoes contain detailed structural information of the target. In this paper, we propose a terrain recognition framework based on the carrier-free UWB sensor. For the purpose of extracting noise-robust features, a deep network named stacked denoising autoencoder (SDAE) is developed. Given that the convolutional neural network (CNN) is insensitive to translation to some extent, we combine several CNNs as middle structures of the proposed model. Experimental results demonstrate that the proposed algorithm can effectively learn essential representations and improve classification accuracy in the presence of low signal-to-noise ratios (SNRs), making it very suitable for use in a classification scheme.

Traffic Sign Detection Based on Driving Sight Distance in Haze Environment

To explore the relationship between traffic sign detection performance and driving sight distance in haze environments, this paper proposes a UV correlation model among sight distance, haze grade, and traffic sign detection performance. First, the German Traffic Sign Detection Benchmark (GTSDB) is synthesized into experimental data sets according to three haze levels: light haze, haze, and dense haze. The Faster R-CNN model is utilized to detect the traffic signs after dehazing by a guided-filter dehazing algorithm. The detection accuracy is as high as 95.11%, which shows that the model has strong generalization ability and adaptability. Second, with weights determined by the haze level, the driving sight distance is taken as the U layer and the detection result of the Faster R-CNN model as the V layer to establish the UV correlation model. Finally, the KM algorithm is used to solve the correlation model, and the best matching between the UV layers is obtained. The experimental results show that the haze level significantly affects the driving sight distance, which in turn affects the detection accuracy of traffic signs. When the driving sight distance thresholds are 300 meters, 100 meters, and 50 meters in light haze, haze, and dense haze, the KM algorithm obtains detection accuracy levels of A (higher than 93%), B (88%-93%), and C (85%-88%), respectively.
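
The KM (Kuhn-Munkres) step is a linear assignment problem; a minimal sketch using SciPy's Hungarian-algorithm implementation, with a random placeholder for the paper's UV correlation weights:

```python
# Hedged sketch: solving the UV matching as a linear assignment problem.
# The score matrix is a hypothetical placeholder for the UV correlation weights.
import numpy as np
from scipy.optimize import linear_sum_assignment

score = np.random.rand(3, 3)                  # U-to-V correlation weights (placeholder)
rows, cols = linear_sum_assignment(-score)    # negate: KM maximizes total correlation
best_pairs = list(zip(rows, cols))            # optimal U-V pairing
```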

Refocusing Metric of Light Field Image Using Region-Adaptive Multi-Scale Focus Measure

Compared with conventional photography, the newly emerging light field image capturing technique has dramatically extended the potential capabilities of post-processing. Among these new capabilities, refocusing is of the most interest. In this paper, we first investigate a region-adaptive multi-scale focus measure (RA-MSFM) that can more robustly and accurately measure the focus of light field images. It is especially superior when measuring focus in flat areas where previous methods struggle. We then design a novel refocusing measure metric which employs RA-MSFM as its core technique. Using the metric, the refocusing capability of a given light field image as a whole can be measured by a single number, obtained by combining the focus score maps of each refocused image in the focal stack. The focus score maps are generated using the proposed RA-MSFM, in which different multi-scale factors are adaptively selected depending on the region (e.g., texture-rich or flat areas) using a multi-layer perceptron network. Different from most light field image metrics that assess image quality, our metric targets refocusing capability. Our experiments have shown that not only does the proposed refocusing metric have high correlation with subjective evaluations given in the form of mean opinion scores, but it also produces all-in-focus images with 0.7~4.6 dB higher PSNRs compared to previous state-of-the-art methods. The proposed refocusing metric can be used to measure refocusing loss in practical applications such as compression, tone mapping, denoising, and smoothing.

Dual-Scale Single Image Dehazing via Neural Augmentation

Model-based single image dehazing algorithms restore haze-free images with sharp edges and rich details for real-world hazy images at the expense of low PSNR and SSIM values for synthetic hazy images. Data-driven ones restore haze-free images with high PSNR and SSIM values for synthetic hazy images but with low contrast, and even some remaining haze for real-world hazy images. In this paper, a novel single image dehazing algorithm is introduced by combining model-based and data-driven approaches. Both transmission map and atmospheric light are first estimated by the model-based methods, and then refined by dual-scale generative adversarial networks (GANs) based approaches. The resultant algorithm forms a neural augmentation which converges very fast while the corresponding data-driven approach might not converge. Haze-free images are restored by using the estimated transmission map and atmospheric light as well as the Koschmieder’s law. Experimental results indicate that the proposed algorithm can remove haze well from real-world and synthetic hazy images.
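
The restoration step via Koschmieder's law follows directly from the model I = J·t + A·(1 - t); a minimal sketch, assuming images in [0, 1] and a scalar or per-channel atmospheric light:

```python
# Hedged sketch of the final restoration step via Koschmieder's law:
# I = J*t + A*(1 - t), so the haze-free image is J = (I - A) / t + A.
import numpy as np

def restore(hazy, transmission, airlight, t_min=0.1):
    # clamp transmission to avoid amplifying noise where t is near zero
    t = np.clip(transmission, t_min, 1.0)[..., None]
    return np.clip((hazy - airlight) / t + airlight, 0.0, 1.0)
```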

URetinex-Net: Retinex-based Deep Unfolding Network for Low-light Image Enhancement

Retinex model-based methods have shown to be effective in layer-wise manipulation with well-designed priors for low-light image enhancement. However, the commonly used handcrafted priors and optimization-driven solutions lead to the absence of adaptivity and efficiency. To address these issues, in this paper, we propose a Retinex-based deep unfolding network (URetinex-Net), which unfolds an optimization problem into a learnable network to decompose a low-light image into reflectance and illumination layers. By formulating the decomposition problem as an implicit priors regularized model, three learning-based modules are carefully designed, responsible for data-dependent initialization, high-efficient unfolding optimization, and user-specified illumination enhancement, respectively. Particularly, the proposed unfolding optimization module, introducing two networks to adaptively fit implicit priors in data-driven manner, can realize noise suppression and details preservation for the final decomposition results. Extensive experiments on real-world low-light images qualitatively and quantitatively demonstrate the effectiveness and superiority of the proposed method over state-of-the-art methods. The code is available at https://github.com/AndersonYong/URetinex-Net.

Uncooled Thermal Image Denoising using Deep Convolutional Neural Network

Thermal imaging initially originated in military applications because it can produce clear images on the darkest nights: thermal cameras need no light to operate and thus allow seeing without being seen. Thermal imaging cameras can also see to some extent through snow, rain, and fog, and therefore find application in thermal weapon sights, night vision for tanks, and surveillance. However, the captured images are contaminated by noise during image acquisition, compression, and transmission, which can severely hamper successful image analysis and tracking. In this work we use a denoising convolutional neural network to reduce Gaussian noise in images acquired through uncooled thermal imagers. From the acquired images, 100 images were segmented into patches to train the network, which improved the image quality metrics, as indicated by experimental results showing a higher peak signal-to-noise ratio.

Recurrent Attentive Decomposition Network for Low-Light Image Enhancement

This paper aims to solve the problems of low-light image enhancement based on the classical RetinexNet method. Given that the original results contain heavy noise and color distortion, this paper proposes a novel recurrent attentive decomposition network, which combines a spatial attention mechanism and an Encoder-Decoder structure to better capture the key information of images and perform a thorough image decomposition. Furthermore, another attention-based network is added to denoise the reflectance image and improve the restoration of image details. Compared with RetinexNet and other popular methods, the overall style of images processed by our method is more consistent with that of the real scene. Both visual comparison and quantitative comparison of Structural Similarity (SSIM) and Peak Signal-to-Noise Ratio (PSNR) demonstrate that our method is superior to several state-of-the-art methods.

Unrolling Graph Total Variation for Light Field Image Denoising

A light field (LF) image is composed of multiple sub-aperture images (SAIs) from slightly offset viewpoints. To denoise a noise-corrupted LF image, leveraging recent development in deep algorithm unfolding, we pursue a hybrid graph-model-based / data-driven approach. Specifically, we first connect each pixel in a target patch of an SAI to neighboring pixels within the patch, and to pixels in co-located "similar" patches in adjacent SAIs. Given graph connectivity, we formulate a maximum a posteriori (MAP) problem using graph total variation (GTV) as signal prior. We then unroll the iterations of a corresponding optimization algorithm into a sequence of neural layers. In each unrolled layer, we learn relevant features per pixel from data using a convolutional neural net (CNN) in a supervised manner, so that edge weights can be computed as functions of feature distances. Each neural layer can be interpreted as a graph low-pass filter for a 4D LF image patch. Experiments show that our proposal outperformed two model-based and two deep-learning-based implementations in numerical and visual comparisons.

ColorPolarNet: Residual Dense Network-Based Chromatic Intensity-Polarization Imaging in Low-Light Environment

Polarization imaging provides more dimensional information than traditional intensity imaging and has been widely used in both military and civil areas, such as battlefield reconnaissance, environmental monitoring, and autonomous driving. With the rapid development of color polarization imaging sensors, polarization detection that allows simultaneous acquisition of multiple images (intensity maps modulated by different polarizers) and real-time extraction of polarization information is becoming increasingly mature. However, the imaging quality of color polarization cameras is always unsatisfactory in low-light environments with low photon counts. In this article, we present a residual dense block (RDB)-based multitask convolutional network, called ColorPolarNet, to extract and enhance the intensity and polarization information from four RGB polarized light intensity images captured in low-light environments with different polarization orientations (0°, 45°, 90°, and 135°). This proposed network consists of two parts: 1) an intensity network for initial denoising and color deviation correction of the four polarized light intensity maps, and 2) a polarization network used to enhance the details of the total intensity (S0) map, the degree of linear polarization (DoLP) map, and the angle of polarization (AoP) map. Also, to train the model, we collect a low-light chromatic polarization (LLCP) dataset that contains 300 paired sets of low-light and normal-light images captured indoors and outdoors. Both qualitative and quantitative experimental results show that the proposed ColorPolarNet outperforms other conventional low-light image enhancement methods in terms of signal fidelity, contrast enhancement, and color reproduction. Our codes and the LLCP dataset are available at https://github.com/MinjieWan/ColorPolarNet .

Nonlocal Spatial–Spectral Neural Network for Hyperspectral Image Denoising

Hyperspectral image (HSI) denoising is an essential preprocessing step to improve the quality of HSIs. The difficulty of HSI denoising lies in effectively modeling the intrinsic characteristics of HSIs, such as spatial–spectral correlation (SSC), global spectral correlation (GSC), and nonlocal spatial correlation. This article introduces a nonlocal spatial–spectral neural network (NSSNN) for HSI denoising by considering the above three factors in a unified network. More specifically, NSSNN is based on the residual U-Net and embedded with the introduced spatial–spectral recurrent (SSR) blocks and nonlocal self-similarity (NSS) blocks. The SSR block comprises 3-D convolutions, one light recurrence, and one highway network. 3-D convolution helps exploit the SSC. The light recurrence and highway network make up the recurrent computation component and refined component, respectively, to model the GSC. The NSS block is based on crisscross attention and can exploit long-range spatial contexts effectively and efficiently. Attributing to effective modeling of the SSC, the GSC, and the nonlocal spatial correlation, our NSSNN has a strong denoising ability. Extensive experiments show the superior denoising effectiveness of our method on synthetic and real-world datasets compared to alternative methods. The source code will be available at https://github.com/lronkitty/NSSNN .

Contactless SpO2 Detection from Face Using Consumer Camera

We describe a novel computational framework for contactless oxygen saturation (SpO2) detection using videos recorded from human faces with smartphone cameras under ambient light. For contact pulse oximeters, a ratio-of-ratios (RoR) metric derived from selected regions of interest (ROIs) combined with linear regression modeling is the standard approach. However, when applied to contactless remote PPG (rPPG), the assumptions of this standard approach do not hold automatically: 1) the rPPG signal is usually derived from the face area, where light reflection may not be uniform due to variation in skin tissue composition and/or lighting conditions (moles, hairs, beard, partial shadowing, etc.); 2) for most consumer-level cameras under ambient light, the rPPG signal is converted from light reflection associated with wide-band spectra, which creates complicated nonlinearity for SpO2 mappings. We propose a computational framework that overcomes these challenges by 1) determining and dynamically tracking the ROIs according to both spatial and color proximity, and calculating the RoR based on selected individual ROIs that have homogeneous skin reflections, and 2) using a nonlinear machine learning model to map the SpO2 levels from RoRs derived from two different color combinations. We validated the framework with 30 healthy participants during various breathing tasks and achieved 1.24% root mean square error for the across-subjects model and 1.06% for within-subject models, which surpasses the FDA-recognized ISO 80601-2-61:2017 standard.
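
For reference, a minimal sketch of the standard contact-oximetry ratio-of-ratios baseline that this framework generalizes; the AC/DC approximations and linear calibration constants are illustrative, not the paper's model:

```python
# Hedged sketch of the classic ratio-of-ratios (RoR) computation:
# RoR = (AC_red / DC_red) / (AC_ir / DC_ir), then a linear SpO2 calibration.
import numpy as np

def ratio_of_ratios(red, ir):
    # crude approximations: AC ~ standard deviation, DC ~ mean of each PPG trace
    ac_red, dc_red = red.std(), red.mean()
    ac_ir, dc_ir = ir.std(), ir.mean()
    return (ac_red / dc_red) / (ac_ir / dc_ir)

def spo2_linear(ror, a=110.0, b=25.0):   # placeholder calibration constants
    return a - b * ror
```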

Enhancement of Remote PPG and Heart Rate Estimation with Optimal Signal Quality Index

With the popularity of non-invasive vital signs detection, remote photoplethysmography (rPPG) is drawing attention in the community. Remote PPG (rPPG) signals are extracted in a contactless manner and are therefore more prone to artifacts than PPG signals collected by wearable sensors. To develop a robust and accurate pipeline for estimating heart rate (HR) from rPPG signals, we propose a novel real-time dynamic ROI tracking algorithm that is robust to slight motion and lighting changes. Furthermore, we develop and include a signal quality index (SQI) to improve HR estimation accuracy. Studies have explored optimal SQIs for PPG signals, but not for remote PPG signals. In this paper, we select and test six SQIs: perfusion, kurtosis, skewness, zero-crossing, entropy, and signal-to-noise ratio (SNR) on 124 rPPG sessions from 30 participants wearing masks. Based on the mean absolute error (MAE) of HR estimation, the optimal SQI is selected and validated by the Mann–Whitney U test (MWU). Lastly, we show that HR estimation accuracy improves by 29% after removing outliers flagged by the optimal SQI, and the best result achieves an MAE of 2.308 bpm.
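
To make the candidate SQIs concrete, here is one plausible NumPy/SciPy implementation of five of the six indices; definitions vary across the literature, so these formulas are illustrative rather than the paper's (SNR usually requires a frequency-domain estimate around the heart-rate band and is omitted):

```python
import numpy as np
from scipy.stats import kurtosis, skew

def signal_quality_indices(x):
    """Illustrative PPG/rPPG signal quality indices for a 1-D trace x."""
    dc = np.mean(x)
    ac = x - dc
    perfusion = 100.0 * (ac.max() - ac.min()) / abs(dc)   # peak-to-peak AC over DC
    zcr = np.mean(np.abs(np.diff(np.sign(ac))) > 0)       # zero-crossing rate
    hist, _ = np.histogram(x, bins=16, density=True)      # histogram entropy
    p = hist[hist > 0] / hist[hist > 0].sum()
    entropy = -np.sum(p * np.log2(p))
    return {"perfusion": perfusion, "kurtosis": kurtosis(x),
            "skewness": skew(x), "zero_crossing": zcr, "entropy": entropy}
```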

A Deep Learning Approach to Enhance Underwater Images with Low Contrast, Blurriness and Degraded Color

This paper presents how to improve underwater images with non-uniform lighting, low contrast, blurriness, and degraded color using a Physical Neural Network (PNN)-based image-enhancing approach. The suggested method is built on deep learning principles and operates on the input images, weight maps, and white-balance data derived from a damaged or noisy underwater image. The proposed method employs a variety of weight maps, including luminance, contrast, chromatic, and saliency maps, to create an image that overcomes the limits of the initial noised image, which lacks distinct clarity. The resulting underwater image exhibits reduced noise levels, better-exposed dark regions, increased global contrast, and finer details and edges. The experiments are carried out on the EUVP dataset, and the proposed method is observed to surpass other state-of-the-art methods in terms of efficiency.

CNN-Based Scheme on EEG Hand-Motion Recognition Without Signal Preprocessing

In many methods, before Electroencephalography (EEG) signals are input into a Convolutional Neural Network (CNN), they are preprocessed by low-pass filtering, wavelet-threshold denoising, and so on. This paper finds that these operations can be completely replaced by the CNN itself, which automatically learns to ignore noise interference during recognition training. A light and effective network named Shallow Residual CNN (SRCNN) is proposed to recognize EEG hand-motion signals from a Kaggle competition dataset. The experimental results indicate that the proposed method can handle raw data directly and performs better under the ROC-curve-area metric than the classic deep network ResNet-34. This paper can serve as an example for extending the CNN-based scheme to more types of EEG signals.

Low Light Image Enhancement by Multispectral Fusion and Convolutional Neural Networks

In this paper, we propose low-light image enhancement by multispectral fusion and convolutional neural networks (CNNs). We adopt multispectral fusion of color (RGB) and near-infrared (NIR) images for low-light image enhancement based on pyramid feature selection and attention maps. The proposed fusion network consists of two subnetworks: denoising and fusion. Since low-light RGB images contain severe noise and detail loss, we first utilize a denoising subnetwork to preprocess the RGB images. In the denoising subnetwork, we use a concatenation operation to prevent the loss of image features during training. We train the denoising subnetwork independently because denoising datasets are easy to obtain. After denoising, we use gamma correction to enhance the denoised low-light image. Finally, we apply the fusion subnetwork for hidden texture recovery based on pyramid feature selection and attention maps. To build the fusion subnetwork, we synthesize a low-light image dataset based on smoothing and gamma correction, generating the ground truth for training by adding the details of the NIR images into the smoothed RGB images. Experimental results show that the proposed fusion method generates high-quality images with little noise, fine details, and good colors.

A Deep Retinex Framework for Light Field Restoration under Low-light Conditions

Light field (LF) images can record the scene from multiple directions and have many applications, such as refocusing and depth estimation. However, these applications can be heavily affected by poor lighting conditions and noise. This work aims to recover high-quality LF images from their low-light captures. First, a decomposition network is employed to decompose each LF image into its reflectance and illumination following Retinex theory. Then, two enhancement networks are designed to denoise the reflectance and enhance the illumination, respectively. They adopt alternating spatial-angular feature extraction and process all the views synchronously with high efficiency. A parallel dual attention mechanism is integrated into both the spatial and angular feature extraction to encode the more important information. Moreover, a discriminator is introduced during training to generate more realistic LF images by judging both the spatial and angular characteristics. Experimental results demonstrate the superior performance of our method, which effectively restores the content, luminance, color, and geometric structures of LF images.
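
For background, Retinex theory models an image as the pixelwise product of reflectance and illumination, I = R ⊙ L. A naive, non-learned decomposition along these lines (not the paper's decomposition network) might look like:

```python
import numpy as np
import cv2

def retinex_decompose(img):
    """Naive single-image Retinex split: illumination estimated as a
    smoothed max-channel map, reflectance as the pixelwise quotient."""
    img = img.astype(np.float32)
    illum = cv2.GaussianBlur(img.max(axis=2), (31, 31), 0)  # smooth L
    refl = img / (illum[..., None] + 1e-4)                  # R = I / L
    return refl, illum
```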

Unsupervised Low-light Image Enhancement with Decoupled Networks

In this paper, we tackle the problem of enhancing real-world low-light images with significant noise in an unsupervised fashion. Conventional unsupervised approaches focus primarily on illumination or contrast enhancement but fail to suppress the noise in real-world low-light images. To address this issue, we decouple this task into two sub-tasks: illumination enhancement and noise suppression. We propose a two-stage, fully unsupervised model to handle these tasks separately. In the noise suppression stage, we propose an illumination-aware denoising model so that real noise at different locations is removed with the guidance of the illumination conditions. To facilitate the unsupervised training, we construct pseudo triplet samples and propose an adaptive content loss correspondingly to preserve contextual details. To thoroughly evaluate the performance of the enhancement models, we build a new unpaired real-world low-light enhancement dataset. Extensive experiments show that our proposed method outperforms the state-of-the-art unsupervised methods concerning both illumination enhancement and noise reduction.

Research on the Application of Hotel Cleanliness Compliance Detection Algorithm Based on WGAN

Aiming at the problems of irregular cleaning and difficult supervision in the cleaning of hotel bathrooms, a deep-learning-based target detection algorithm is proposed to detect the cleaning process transmitted by the sensor in real time and analyze its compliance. However, the cleaning process involves occlusion, lighting effects, and insufficient data volume, resulting in inefficient detection. Therefore, this paper uses a deep convolutional generative adversarial network (DCGAN) as the basic framework to expand the dataset and improve the adaptability and robustness of the detector to different detection targets, takes advantage of the speed and accuracy of the YOLOv5 target detection network to detect the target, and then designs a compliance detection algorithm to determine whether the target meets the cleanliness standards. Experimental results show that the method is fast, practical, and accurate, and fully meets the engineering needs of hotel cleaning process detection and supervision.

Photon-Limited Blind Deconvolution Using Unsupervised Iterative Kernel Estimation

Blind deconvolution is a challenging problem, and in low light it is even more difficult. Existing algorithms, both classical and deep-learning based, are not designed for this condition. When the photon shot noise is strong, conventional deconvolution methods fail because (1) the image does not have enough signal-to-noise ratio to perform blur estimation; (2) while deep neural networks are powerful, many of them do not consider the forward process, and when the noise is strong they fail to simultaneously deblur and denoise; (3) while iterative schemes are known to be robust in classical frameworks, they are seldom used with deep neural networks because they require a differentiable non-blind solver. This paper addresses the above challenges by presenting an unsupervised blind deconvolution method. At its core is a reformulation of the general blind deconvolution framework from the conventional image-kernel alternating minimization to a purely kernel-based minimization. This kernel-based minimization leads to a new iterative scheme that backpropagates an unsupervised loss through a pre-trained non-blind solver to update the blur kernel. Experimental results show that the proposed framework achieves superior results to state-of-the-art blind deconvolution algorithms in low-light conditions.
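
A hedged sketch of what such a kernel-only scheme could look like in PyTorch is given below; `nonblind_solver` stands in for the pre-trained differentiable non-blind solver, and the reblur loss is one plausible choice of unsupervised loss, not necessarily the paper's:

```python
import torch
import torch.nn.functional as F

def estimate_kernel(y, nonblind_solver, ksize=15, iters=200, lr=1e-2):
    """Kernel-only blind deconvolution sketch: the kernel is the sole
    unknown; a differentiable non-blind solver runs inside the loop and
    a reblur consistency loss is backpropagated to the kernel.
    y: blurry observation, shape (1, 1, H, W)."""
    logits = torch.zeros(1, 1, ksize, ksize, requires_grad=True)
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(iters):
        # softmax keeps the kernel non-negative and sum-to-one
        k = torch.softmax(logits.flatten(), 0).view(1, 1, ksize, ksize)
        x_hat = nonblind_solver(y, k)                  # latent sharp image
        reblur = F.conv2d(x_hat, k, padding=ksize // 2)
        loss = F.mse_loss(reblur, y)                   # self-supervised loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.softmax(logits.flatten(), 0).view(ksize, ksize).detach()
```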

Infrared and visible image fusion method based on dual domain enhancement in low illumination environment

To address the low contrast and loss of detail in fused infrared and visible images under a low-illumination environment, this paper proposes a fusion method based on dual-domain enhancement. First, the infrared and visible images are preprocessed: the Retinex model is used to enhance and then denoise the infrared image, and the logarithmic image processing (LIP) model is used to enhance the visible image. An edge-preserving filter then decomposes each preprocessed image into a base layer and a detail layer. For the base layers, a fusion strategy based on visual saliency mapping is used to control image quality; for the detail layers, a simple summation fusion strategy is used to retain more details. Finally, the fused image is reconstructed by linear combination. Experimental results show that the proposed algorithm is superior to the comparison algorithms in both subjective visual effect and objective index evaluation.
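
A toy version of this base/detail pipeline is sketched below; the bilateral filter stands in for the unspecified edge-preserving filter, and the Laplacian-based weight is a simple stand-in for the paper's visual saliency map:

```python
import cv2
import numpy as np

def two_scale_fusion(ir, vis):
    """Base/detail fusion of single-channel IR and visible images."""
    ir = ir.astype(np.float32)
    vis = vis.astype(np.float32)
    base_ir = cv2.bilateralFilter(ir, 9, 75, 75)       # edge-preserving base
    base_vi = cv2.bilateralFilter(vis, 9, 75, 75)
    det_ir, det_vi = ir - base_ir, vis - base_vi       # detail layers
    w = cv2.GaussianBlur(np.abs(cv2.Laplacian(ir, cv2.CV_32F)), (11, 11), 0)
    w = w / (w.max() + 1e-8)                           # crude saliency weight
    base = w * base_ir + (1 - w) * base_vi             # weighted base fusion
    detail = det_ir + det_vi                           # simple summation
    return np.clip(base + detail, 0, 255).astype(np.uint8)
```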

LIGHT-DCSFN: A Light Cross-scale Fusion Network for Single Image Dehazing

At present, deep-learning-based image dehazing algorithms suffer from problems such as large parameter counts and long per-image dehazing times. To address this, a fast image dehazing method based on a depthwise cross-scale fusion network is proposed. First, depthwise separable convolutions replace the standard convolutions in the original network to improve single-image dehazing efficiency. Second, a color feature extraction module is added to the original network so that the improved network dehazes images better. Finally, the network predicts the haze concentration map of the input image rather than directly outputting the haze-free image, so that the final predicted image is clearer and more natural. The results indicate that the proposed algorithm greatly reduces the dehazing time for a single image and can remove haze in most scenes.
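
Depthwise separable convolution itself is standard; a minimal PyTorch sketch shows the factorization that yields the parameter savings:

```python
import torch.nn as nn

# A depthwise separable convolution factorizes a standard conv into a
# per-channel (depthwise) conv plus a 1x1 (pointwise) conv.
def depthwise_separable(cin, cout, k=3):
    return nn.Sequential(
        nn.Conv2d(cin, cin, k, padding=k // 2, groups=cin),  # depthwise
        nn.Conv2d(cin, cout, 1),                             # pointwise
    )
```

For a 3×3 kernel with 64 input and 64 output channels, this drops the weight count from 64·64·9 ≈ 36.9k to 64·9 + 64·64 ≈ 4.7k, which is where the claimed efficiency gain comes from.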

IReF: Improved Residual Feature For Video Frame Deletion Forensics

Frame deletion forensics has been a major area of video forensics in recent years. Current deep neural network-based methods outperform previous traditional detection methods. Recently, researchers have used residual features as network inputs to detect frame deletion and have achieved promising results. We propose IReF (Improved Residual Features) after analyzing the effect of residual features on frame deletion traces. IReF preserves the main motion features and edge information by denoising and enhancing the residual features, making it easier for the network to identify tampered features, while the sparse noise reduction lowers the storage requirement. Experiments show that with a 2D convolutional neural network, IReF increases accuracy over plain residual features by 3.81% and reduces the storage space requirement by 78%. With a 3D convolutional neural network taking video clips as input, IReF increases accuracy by 5.63% and inference efficiency by 18%.

Dual attention unit-based generative adversarial networks for low-light image enhancement

Images taken in low-light conditions have insufficient light intensity and high noise. Many existing methods do not work well in low-light environments; for example, noise and artifacts in dark regions become more obvious after enhancement. Low-light image enhancement is therefore a challenging task in computer vision. To solve this problem, this paper proposes a lightweight generative adversarial network with dual-attention units to enhance underexposed photos. The generator contains only a simple two-layer convolution, with a dual-attention unit added between the two convolutions to suppress the noise generated during enhancement and the deviation in color reproduction. Non-local correlations of the image are used in the spatial attention module for denoising, and the channel attention module guides the network to optimize redundant color features. In addition, the ideas of PatchGAN and Relativistic GAN are combined in the discriminator so that it measures relative rather than absolute realness. Experimental results show that our method achieves better enhancement on low-illumination image datasets, with more natural color, better exposure, and fewer noise artifacts.
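
A generic dual-attention unit in this spirit (an SE-style channel gate plus a CBAM-style spatial gate; the paper's unit, which uses non-local correlations, may differ) could be sketched as:

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Channel attention reweights feature maps; spatial attention
    reweights locations. Illustrative, not the paper's exact unit."""
    def __init__(self, c, r=8):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(c, c // r, 1),
            nn.ReLU(), nn.Conv2d(c // r, c, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel(x)                    # reweight channels
        s = torch.cat([x.mean(1, keepdim=True),
                       x.max(1, keepdim=True)[0]], dim=1)
        return x * self.spatial(s)                 # reweight locations
```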

LiSnowNet: Real-time Snow Removal for LiDAR Point Clouds

Light Detection and Ranging (LiDAR) sensors have been widely adopted in modern self-driving vehicles, providing 3D information about the scene and surrounding objects. However, adverse weather conditions still pose significant challenges to LiDARs, since point clouds captured during snowfall can easily be corrupted. The resulting noisy point clouds degrade downstream tasks such as mapping. Existing works on denoising point clouds corrupted by snow are based on nearest-neighbor search and thus do not scale well to modern LiDARs, which usually capture 100k or more points at 10 Hz. In this paper, we introduce an unsupervised denoising algorithm, LiSnowNet, running 52× faster than the state-of-the-art methods while achieving superior denoising performance. Unlike previous methods, the proposed algorithm is based on a deep convolutional neural network and can be easily deployed to hardware accelerators such as GPUs. In addition, we demonstrate how to use the proposed method for mapping even with corrupted point clouds.

Gaze tracking technology for flight simulator

In order to equip flight simulators with a gaze tracking system to further enhance training quality for pilot cadets, a cross-ratio-based gaze tracking system with an inexpensive monocular infrared camera and four infrared LEDs is proposed in this paper. Since traditional machine learning algorithms struggle to extract eye features in a large field-of-view (FOV) infrared environment, a robust face detector based on a deep neural network (DNN) was adopted to locate the eyes more accurately using 68 facial landmarks. On this basis, the Laplacian of Gaussian (LoG) blob detection algorithm was used to extract the glints directly. The eye image was then Gaussian-filtered for denoising, and the pupil region was highlighted by global thresholding. Because some areas of the pupil region are eroded by glints, a morphological closing operation was used to fill them, after which the pupil center was estimated by ellipse fitting. Finally, the gaze points were estimated by the constant cross ratio in the 2D projective space. Experimental analysis after calibration shows that the system has high accuracy and practical value.
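
The pupil-extraction steps described here map almost one-to-one onto OpenCV primitives; a hedged sketch (Otsu is one common choice for the unspecified global threshold):

```python
import cv2

def pupil_center(eye_gray):
    """Gaussian denoising -> global (Otsu) threshold -> morphological
    closing to fill glint holes -> ellipse fit on the largest contour."""
    blur = cv2.GaussianBlur(eye_gray, (5, 5), 0)
    _, mask = cv2.threshold(blur, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    c = max(contours, key=cv2.contourArea)
    (cx, cy), axes, angle = cv2.fitEllipse(c)  # needs >= 5 contour points
    return cx, cy
```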

Performance Analysis of Conditional GANs based Image-to-Image Translation Models for Low-Light Image Enhancement

With the evolution of generative adversarial networks (GANs) for image-to-image translation, conditional GANs (cGANs) have been explored and employed for various digital image preprocessing (enhancement and denoising) tasks, including image enhancement, dehazing, denoising, resolution enhancement, and more. Within image enhancement, this work investigates increasing the brightness of low-light (poorly illuminated) images. For low-light image enhancement, the performance of the pix2pix and pix2pixHD models is demonstrated and analyzed. An analysis of low-light image enhancement using the pix2pix model with other loss functions is also presented. Furthermore, pix2pix performance with instance normalization layers is studied, and improved full-reference Image Quality Assessment (FIQA) metric values along with entropy (a no-reference IQA (NIQA) metric) are reported. The quantitative and qualitative results are also compared with selected cutting-edge deep learning frameworks for low-light image enhancement. This research finds that the pix2pix model's enhancement metrics are better than those of the RetinexNet model, and the pix2pixHD results are comparable to the latest low-light enhancement frameworks such as MIRNet and LLFlow. Furthermore, pix2pix models are lighter than MIRNet, and pix2pix achieves the lowest inference times on both CPU and GPU.

Application of Median Filter Method for Classification of Oil Palm Tree on LiDAR Images

This research addresses the problem of classifying oil palm trees, which are widespread in Indonesian plantations, using machine learning. The palm oil industry in Indonesia needs innovations to make the plantation process more efficient. This study tests the median filter preprocessing method in a classification pipeline using a Convolutional Neural Network on light detection and ranging (LiDAR) images. The researchers hypothesized that a median filter, which removes salt-and-pepper noise, could improve image clarity. Classification effectiveness was measured by accuracy and F1-score. The results show that the median filter does not significantly affect LiDAR image classification of oil palm trees compared with previous related studies, because the image data does not contain much salt-and-pepper noise. This study achieved an accuracy of 84.98% and an F1-score of 0.85093, which is lower than previous research without median filter preprocessing (86% accuracy).
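
The preprocessing step under test is a one-liner in OpenCV; the file path below is a placeholder:

```python
import cv2

img = cv2.imread("lidar_tile.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
denoised = cv2.medianBlur(img, 3)  # 3x3 median: removes salt-and-pepper noise
```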

Image denoising technology of power equipment based on deep residual network

The continuous development and expansion of the power system place ever higher requirements on the reliability of power equipment. Early detection and elimination of power equipment faults is therefore very important: it not only reduces equipment losses but also helps avoid accidents. With the development of artificial intelligence technology, computer vision has begun to be widely used in power systems, and for computer vision, high-quality images determine whether power equipment identification and defect detection can achieve high performance; in practice, the captured images are noisy. To solve the problem of noise interference in visible-light images of power equipment, this paper proposes a deep-learning-based denoising method. First, the paper analyzes the existing problems and requirements in power equipment image denoising and presents the overall system architecture. Second, building on the basic principle of residual networks, it proposes a deep residual network for denoising visible-light images of power equipment, in which an improved activation function is used to boost network performance. Denoising experiments on visible-light images of power equipment demonstrate that the proposed method outperforms traditional methods.
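
A DnCNN-style residual denoiser, the family this method belongs to, can be sketched in PyTorch as follows (the paper's improved activation function and exact depth are not specified, so this is a generic instance):

```python
import torch.nn as nn

class ResidualDenoiser(nn.Module):
    """Residual-learning denoising: the CNN predicts the noise map,
    which is subtracted from the noisy input."""
    def __init__(self, c=64, depth=8):
        super().__init__()
        layers = [nn.Conv2d(3, c, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(c, c, 3, padding=1),
                       nn.BatchNorm2d(c), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(c, 3, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x - self.body(x)   # clean = noisy - predicted noise
```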

2023

Dual Watermarking for Security of COVID-19 Patient Record

In recent years, smart healthcare systems have gained popularity due to the ease of sharing e-patient records over open networks, and maintaining the security of these records has attracted many researchers. Thus, a robust dual watermarking scheme based on the redundant discrete wavelet transform (RDWT), Hessenberg decomposition (HD), and randomized singular value decomposition (RSVD) is put forward for CT scan images of COVID-19 patients. To ensure a high level of authentication, multiple watermarks in the form of Electronic Patient Record (EPR) text and a medical image are embedded in the cover. The EPR is encoded via a turbo code to reduce or eliminate channel noise, if any. Further, both imperceptibility and robustness are achieved by a fuzzy inference system, and the marked image is encrypted using a lightweight encryption technique. Moreover, the extracted watermark is denoised using a deep neural network (DNN) to improve its robustness. Experimental results and performance analyses verify the proposed dual watermarking scheme.

Learning Enriched Features for Fast Image Restoration and Enhancement

Given a degraded input image, image restoration aims to recover the missing high-quality image content. Numerous applications demand effective image restoration, e.g., computational photography, surveillance, autonomous vehicles, and remote sensing. Significant advances in image restoration have been made in recent years, dominated by convolutional neural networks (CNNs). The widely used CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatial details are preserved but the contextual information cannot be precisely encoded. In the latter case, generated outputs are semantically reliable but spatially less accurate. This paper presents a new architecture with a holistic goal of maintaining spatially-precise high-resolution representations through the entire network, while receiving complementary contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing the following key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) a non-local attention mechanism for capturing contextual information, and (d) attention-based multi-scale feature aggregation. Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Extensive experiments on six real image benchmark datasets demonstrate that our method, named MIRNet-v2, achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement. The source code and pre-trained models are available at https://github.com/swz30/MIRNetv2 .

Interpretable Poisson Optimization-Inspired Deep Network for Single-Photon Counting Image Denoising

Single-photon counting (SPC) imaging is a versatile approach for detecting targets under extremely low-light situations. To increase the quality of SPC images degraded by noise, traditional optimization-based methods seek priors from handcrafted features, which are inadequate for handling different kinds of targets in natural scenes, leading to poor denoising performance. The difficulty of designing optimal parameters that balance denoising performance against spatial clarity is another obstacle. To address these issues, we develop a novel interpretable Poisson optimization-inspired deep network for SPC image denoising. First, we build a learnable prior that regularizes the Poisson optimization problem for SPC imaging to enhance denoising performance. We construct a deep network by unfolding the iterative shrinkage-thresholding algorithm that solves the Poisson optimization problem; therefore, all modules in the network have strong interpretability, enabling good generalization capability in real situations. Second, all parameters are optimized in a data-driven manner within the network. Finally, we conduct both simulated and real experiments to test the effectiveness of the proposed method. Experimental results demonstrate that the proposed method outperforms other state-of-the-art methods in terms of visual effect and quantitative analysis.
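
The optimization being unrolled can be illustrated with classical ISTA on the Poisson negative log-likelihood sum(x − y·log x), whose gradient is 1 − y/x; the ℓ1 pixel prior below is a toy stand-in for the paper's learned prior (each unrolled network layer would replace this fixed threshold with a learned module):

```python
import numpy as np

def poisson_ista(y, lam=0.05, step=0.5, iters=50):
    """Classical ISTA for Poisson denoising with an L1 prior.
    y: photon-count image (non-negative)."""
    y = y.astype(np.float64)
    x = np.maximum(y, 1e-3)
    for _ in range(iters):
        grad = 1.0 - y / x                    # gradient of Poisson NLL
        z = x - step * grad                   # gradient descent step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0)  # soft-threshold
        x = np.maximum(x, 1e-3)               # keep intensities positive
    return x
```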

DEANet: Decomposition Enhancement and Adjustment Network for Low-Light Image Enhancement

Poor illumination greatly affects the quality of obtained images. In this paper, a novel convolutional neural network named DEANet is proposed on the basis of Retinex for low-light image enhancement. DEANet combines the frequency and content information of images and is divided into three subnetworks: decomposition, enhancement, and adjustment networks, which perform image decomposition; denoising, contrast enhancement, and detail preservation; and image adjustment and generation, respectively. The model is trained on the public LOL dataset, and the experimental results show that it outperforms the existing state-of-the-art methods regarding visual effects and image quality.

Multi-Stream Progressive Restoration for Low-Light Light Field Enhancement and Denoising

Light fields (LFs) are easily degraded by noise and low light. Low-light LF enhancement and denoising are more challenging than single-image tasks because the epipolar information among views must be taken into consideration. In this work, we propose a multi-stream progressive restoration network to restore the whole LF in just one forward pass. To make full use of the supplementary information of the multiple views and preserve the epipolar information, we design three types of input composed of view stacking. Each type of input corresponds to a restoration stream and provides specific complementary information. In addition, the weights are shared within each type of input in order to better maintain the epipolar information among views. To fully utilize the supplementary information, we then design a multi-stream interaction module to aggregate features from the different restoration streams. Finally, multi-stage restoration is introduced to reconstruct the LF progressively. We carry out extensive experiments demonstrating that our model outperforms state-of-the-art techniques on a real-world low-light LF dataset and a synthetic noisy LF dataset.

Self-Supervised Image Denoising for Real-World Images With Context-Aware Transformer

In recent years, the development of deep learning has pushed image denoising to a new level. Among these approaches, self-supervised denoising is increasingly popular because it does not require any prior knowledge. Most existing self-supervised methods are based on convolutional neural networks (CNNs), which are restricted by the locality of the receptive field and can cause color shifts or texture loss. In this paper, we propose a novel Denoise Transformer for real-world image denoising, constructed mainly from Context-aware Denoise Transformer (CADT) units and a Secondary Noise Extractor (SNE) block. CADT is designed as a dual-branch structure, where the global branch uses a window-based Transformer encoder to extract global information, while the local branch focuses on extracting local features with a small receptive field. By incorporating CADT as a basic component, we build a hierarchical network to directly learn the noise distribution through residual learning and obtain the first-stage denoised output. We then design the SNE, with low computational cost, for secondary global noise extraction. Finally, the blind spots are collected from the Denoise Transformer output and reconstructed, forming the final denoised image. Extensive experiments on the real-world SIDD benchmark achieve 50.62/0.990 PSNR/SSIM, which is competitive with the current state-of-the-art method and only 0.17/0.001 lower. Visual comparisons on public sRGB, raw-RGB, and greyscale datasets show that our proposed Denoise Transformer has competitive performance, especially on blurred textures and low-light images, without using additional knowledge (e.g., noise level or noise type) about the underlying unknown noise.

T2V-DDPM: Thermal to Visible Face Translation using Denoising Diffusion Probabilistic Models

Modern-day surveillance systems perform person recognition using deep learning-based face verification networks. Most state-of-the-art facial verification systems are trained using visible-spectrum images, but acquiring images in the visible spectrum is impractical in low-light and nighttime conditions, and images are often captured in an alternate domain such as the thermal infrared domain. Facial verification on thermal images is often performed after retrieving the corresponding visible-domain images. This is a well-established problem known as Thermal-to-Visible (T2V) image translation. In this paper, we propose a Denoising Diffusion Probabilistic Model (DDPM)-based solution for T2V translation, specifically for facial images. During training, the model learns the conditional distribution of visible facial images given their corresponding thermal images through the diffusion process. During inference, the visible-domain image is obtained by starting from Gaussian noise and performing denoising repeatedly. The existing inference process for DDPMs is stochastic and time-consuming; hence, we propose a novel inference strategy to speed up DDPM inference, specifically for T2V image translation. We achieve state-of-the-art results on multiple datasets. The code and pretrained models are publicly available at http://github.com/Nithin-GK/T2V-DDPM
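
For orientation, one standard DDPM ancestral-sampling step (Ho et al., 2020) is shown below; the paper conditions the noise predictor on the thermal image and accelerates this loop, neither of which is shown here:

```python
import torch

@torch.no_grad()
def ddpm_reverse_step(x_t, t, eps_pred, betas):
    """One reverse diffusion step x_t -> x_{t-1}.
    x_t: current noisy sample; t: integer timestep;
    eps_pred: the network's noise estimate at step t;
    betas: 1-D tensor of the forward noise schedule."""
    alphas = 1.0 - betas
    abar = torch.cumprod(alphas, dim=0)
    coef = betas[t] / torch.sqrt(1.0 - abar[t])
    mean = (x_t - coef * eps_pred) / torch.sqrt(alphas[t])
    if t == 0:
        return mean                       # final step adds no noise
    noise = torch.randn_like(x_t)
    return mean + torch.sqrt(betas[t]) * noise   # sigma_t^2 = beta_t variant
```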
