Image Fusion

Recent advancements in deep learning have significantly impacted the field of image fusion, particularly in the context of IR and visible image fusion. Various methods have been developed to address this challenge, with a focus on effectively maintaining edge detail information in the fused images. One innovative approach is the infrared and visible image fusion with edge detail implantation method. This technique aims to enhance edge detail preservation by processing the source image information and the edge detail information separately, ultimately supplementing the edge details into the main framework.

1. Introduction

Image fusion is an essential information fusion technique that has been widely used in practical applications such as target detection, industrial production, military applications, and biomedical science. In industrial production in particular, infrared and visible image fusion is a reliable tool for surveillance.

In recent decades, many image fusion methods based on different schemes have been proposed. Deep learning, which has strong feature extraction ability, was introduced into the image fusion community; however, these methods cannot take full advantage of the extracted features.

Figure 1.

Based on the above analysis, existing fusion methods have three weaknesses that prevent them from obtaining high-quality details.

(1) The tiny details cannot be decomposed into the detail part. This brings about uneven texture and poor visibility in the fused image.

(2) These methods cannot extract different features of the source images, leading to the loss of various features in the fused image.

(3) The extracted features cannot be fully utilized, which causes blurring of the fused image.

Three solutions to these problems are proposed, as reflected in the following fusion procedure.

Firstly, the source images are decomposed into base and detail parts by guided filtering, with the image that contains only the strong edge information of the source images, obtained by the Canny operator, as the guidance image and the source images as the input images. Secondly, a rough CNN and a sophisticated CNN are designed to extract the features of the infrared and visible detail parts, respectively. Then a multi-layer feature fusion strategy is used to integrate the extracted features. Moreover, the base parts are fused through a weighting method. Finally, the fused image is reconstructed by adding the fused detail and base parts.
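As a rough illustration of the last two steps, the following sketch fuses the base parts with a fixed weight and reconstructs the result by adding the fused detail part back; the weight value and the clipping are illustrative assumptions, since the text only states that the base parts are fused through a weighting method.

```python
import numpy as np

def fuse_base_and_reconstruct(base_ir, base_vis, detail_fused, w=0.5):
    """Weighted fusion of the base parts followed by reconstruction.

    Inputs are float arrays in [0, 1]; w = 0.5 is an illustrative weight,
    not the setting used in the paper.
    """
    base_fused = w * base_ir + (1.0 - w) * base_vis  # weighting method for base parts
    fused = base_fused + detail_fused                # add the fused detail part back
    return np.clip(fused, 0.0, 1.0)
```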

Figure 2.

An image generally contains information from many different components, and image applications are often concerned with only one or a few of them. It is therefore necessary to decompose images into different parts, which not only eliminates the influence of the other parts on the processing result but also reduces the complexity and difficulty of image processing. In this paper, the source images are decomposed by a guided filter into a detail part containing the details and a base part containing the gray-level distribution.
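A minimal sketch of this decomposition, assuming OpenCV's Canny operator and guided filter (guidedFilter requires the opencv-contrib package), is shown below; the edge thresholds, filter radius, and regularization value are illustrative, not the parameters used in the paper.

```python
import cv2
import numpy as np

def decompose(src_gray, radius=8, eps=1e-2):
    """Split an 8-bit grayscale image into a base part and a detail part.

    The guidance image keeps only the strong edges found by the Canny
    operator; the guided filter then smooths the source while respecting
    those edges, so base = filtered output and detail = source - base.
    """
    src = src_gray.astype(np.float32) / 255.0
    guide = cv2.Canny(src_gray, 100, 200).astype(np.float32) / 255.0
    base = cv2.ximgproc.guidedFilter(guide, src, radius, eps)
    detail = src - base
    return base, detail
```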

Figure 3.

Comparison of different decomposition methods. (a,c) are the base part and the detail part obtained by the common method. (b,d) are the base part and the detail part obtained by our proposed method. (e) is the local amplification of the yellow box in (c). (f) is the local amplification of the yellow box in (d). From top to bottom, the figures are the infrared image and the visible image, respectively.

Figure 4.

The visible sensor can acquire clear image details and is more suitable for human visual observation, so the visible detail layer contains rich and useful features. According to the characteristics of the two detail parts, a rough CNN and a sophisticated CNN are designed to extract the features of the infrared and visible detail parts, respectively.
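The exact network configurations are not given here, so the sketch below only illustrates a plausible pair of three-layer extractors (matching the conv1-conv3 layers referred to in Figure 5); the channel counts and kernel sizes are assumptions used to convey the rough-versus-sophisticated distinction.

```python
import torch.nn as nn

class DetailCNN(nn.Module):
    """Three-layer convolutional feature extractor (conv1-conv3) for a
    single-channel detail part; returns all intermediate feature maps so
    that a multi-layer fusion strategy can use them."""
    def __init__(self, channels=(16, 32, 64)):
        super().__init__()
        c1, c2, c3 = channels
        self.conv1 = nn.Sequential(nn.Conv2d(1, c1, 3, padding=1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(c1, c2, 3, padding=1), nn.ReLU())
        self.conv3 = nn.Sequential(nn.Conv2d(c2, c3, 3, padding=1), nn.ReLU())

    def forward(self, x):
        f1 = self.conv1(x)
        f2 = self.conv2(f1)
        f3 = self.conv3(f2)
        return f1, f2, f3

# Hypothetical configurations: fewer channels for the rough infrared network,
# more channels for the sophisticated visible network.
rough_ir_net = DetailCNN(channels=(8, 16, 32))
sophisticated_vis_net = DetailCNN(channels=(16, 32, 64))
```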

3. Training

In the training phase, the main consideration is that each convolutional layer should extract rich features. Selecting different training data for different purposes allows the model to be trained effectively. Therefore, unlike other deep learning-based fusion methods, we propose that infrared and visible images be used as the training data for the infrared and visible detail-part networks, respectively. The 105 pairs of infrared and visible images from the TNO database are selected as training data. However, this is insufficient to train a good model, so we rotate the images by 90° and then randomly divide each image into 50 patches of size 224 × 224. After this operation, we obtain 22,500 pairs of training data, expanding the dataset. CNN-based image classification has been shown to extract image features effectively and has already been applied in the image fusion field, so we use the same approach to train our models.
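The sketch below shows one plausible way to carry out this augmentation, rotating each registered pair by 90° and cutting 50 random aligned 224 × 224 patches per orientation; how the original work combines rotation and cropping to reach exactly 22,500 pairs is not spelled out, so this pipeline is an assumption.

```python
import numpy as np

def augment_pair(ir, vis, patch=224, patches_per_image=50, rng=None):
    """Rotate a registered infrared/visible pair by 90 degrees and cut
    random aligned patches from both orientations.

    Assumes both images are at least patch x patch pixels.
    """
    rng = rng or np.random.default_rng()
    pairs = []
    for a, b in [(ir, vis), (np.rot90(ir), np.rot90(vis))]:
        h, w = a.shape[:2]
        for _ in range(patches_per_image):
            y = rng.integers(0, h - patch + 1)
            x = rng.integers(0, w - patch + 1)
            pairs.append((a[y:y + patch, x:x + patch], b[y:y + patch, x:x + patch]))
    return pairs
```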

Figure 5 shows the feature saliency maps of each convolutional layer. The infrared feature saliency maps not only preserve the salient features but also suppress the noise features, while the visible feature saliency maps focus on active details and capture the features of tiny details.

Figure 5.

Feature saliency map of each convolutional layer. “conv1”, “conv2” and “conv3” denote the first, second and third convolutional layer, respectively. From top to bottom, the figures are infrared and visible feature saliency maps, respectively.
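The text does not state how these feature saliency maps are computed; a common choice in fusion work is the channel-wise l1-norm of the convolutional feature maps, as in the hypothetical sketch below.

```python
import torch

def feature_saliency(features: torch.Tensor) -> torch.Tensor:
    """Collapse a (C, H, W) feature tensor into an (H, W) saliency map by
    summing absolute activations over channels, then rescale to [0, 1]
    for visualization."""
    sal = features.abs().sum(dim=0)
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)
```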

4. Objective Performance Evaluation

Objective performance evaluation relies on evaluation metrics defined by mathematical models, which are not affected by human visual characteristics or mental state. Therefore, in addition to the subjective performance evaluation, objective metrics are adopted to measure the objective performance.
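The specific metrics used in the evaluation are not listed in this excerpt; as one example of such a metric, the sketch below computes the Shannon entropy (EN) of a fused image, a widely used objective measure of information content.

```python
import numpy as np

def entropy(img_u8, bins=256):
    """Shannon entropy (EN), in bits per pixel, of an 8-bit grayscale image."""
    hist, _ = np.histogram(img_u8, bins=bins, range=(0, 256))
    p = hist.astype(np.float64) / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))
```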

In addition to the subjective and objective performance evaluations, the computational cost of the fusion methods must be measured, since it determines the practical application value of a method. The running time is used to evaluate the computational cost of all fusion methods, and the infrared and visible image pairs are taken as an example for this analysis.
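One simple way to report this cost is the mean wall-clock running time per image pair, as in the sketch below; fuse stands for any fusion method under test and is a hypothetical callable.

```python
import time

def mean_running_time(fuse, image_pairs, repeats=3):
    """Average wall-clock time (in seconds) of a fusion function over a set
    of registered infrared/visible image pairs."""
    start = time.perf_counter()
    for _ in range(repeats):
        for ir, vis in image_pairs:
            fuse(ir, vis)
    return (time.perf_counter() - start) / (repeats * len(image_pairs))
```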

5. Conclusions

We presented a novel infrared and visible image fusion method based on detail preservation, which obtains excellent detail information while simultaneously retaining the gray-level distribution information of the source images. Experiments on the TNO dataset indicate that our fused images look like sharpened images with abundant details, which is beneficial for observing the actual scene. In the future, an adaptive fusion framework will be built and the stability of the method will be enhanced.