Abstract:
Deep convolutional neural networks (CNNs) have substantially improved RGB image-based salient object detection owing to their strong feature learning ability. However, such detection remains difficult under cluttered backgrounds, poor illumination, and lighting variations.
Rather than further refining single-modal RGB-based saliency detection, this paper introduces thermal infrared images as complementary information and proposes an end-to-end network for multi-modal salient object detection, casting RGB-T saliency detection as a CNN feature fusion problem.
To this end, a backbone network (e.g., VGG-16) first extracts coarse features from each RGB and thermal infrared image individually. Then, several adjacent-depth feature combination (ADFC) modules extract multi-level refined features for each single-modal input, since features captured at different depths carry different semantic information and visual detail.
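As a purely illustrative sketch (not the authors' released code), the backbone tapping and adjacent-depth combination might look like the following PyTorch fragment; the ADFC class, the tapped VGG-16 layers, and all channel sizes are assumptions made for exposition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

class ADFC(nn.Module):
    """Hypothetical adjacent-depth feature combination: refines a shallow
    feature map with its deeper, coarser neighbour."""
    def __init__(self, shallow_ch, deep_ch, out_ch):
        super().__init__()
        self.reduce = nn.Conv2d(shallow_ch + deep_ch, out_ch,
                                kernel_size=3, padding=1)

    def forward(self, shallow, deep):
        # Upsample the deeper (lower-resolution) map to the shallow map's
        # spatial size, concatenate along channels, and refine with a 3x3 conv.
        deep = F.interpolate(deep, size=shallow.shape[2:], mode='bilinear',
                             align_corners=False)
        return F.relu(self.reduce(torch.cat([shallow, deep], dim=1)))

# Tap coarse features at several depths of a VGG-16 backbone (one modality).
backbone = vgg16(weights=None).features
x = torch.randn(1, 3, 224, 224)        # an RGB or thermal image
taps, feats = {15, 22, 29}, []         # relu3_3, relu4_3, relu5_3 outputs
for i, layer in enumerate(backbone):
    x = layer(x)
    if i in taps:
        feats.append(x)

adfc = ADFC(shallow_ch=256, deep_ch=512, out_ch=128)
refined = adfc(feats[0], feats[1])     # one refined level; repeat per pair
```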
Next, a multi-branch group fusion (MGF) module captures cross-modal features at each level by fusing the refined features that the ADFC modules produce for an RGB-T image pair. Finally, a joint attention guided bi-directional message passing (JABMP) module predicts the saliency map from the multi-level fused features output by the MGF modules.
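The sketch below illustrates one plausible form of such per-level cross-modal fusion; the gated residual design, the MGF class name as a module, and the feature shapes are assumptions rather than the paper's actual architecture.

```python
import torch
import torch.nn as nn

class MGF(nn.Module):
    """Hypothetical multi-branch group fusion: merges same-level RGB and
    thermal features into one cross-modal feature map."""
    def __init__(self, ch):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, kernel_size=3, padding=1)
        self.gate = nn.Conv2d(2 * ch, ch, kernel_size=1)

    def forward(self, f_rgb, f_t):
        cat = torch.cat([f_rgb, f_t], dim=1)
        # A sigmoid gate weighs the merged features before a residual add,
        # so the thermal branch contributes only where it is informative.
        return f_rgb + torch.sigmoid(self.gate(cat)) * torch.relu(self.fuse(cat))

# One fused map per level; a JABMP-style stage would then pass messages
# across these levels (top-down and bottom-up) to predict the saliency map.
rgb_levels = [torch.randn(1, 128, s, s) for s in (56, 28, 14)]
thr_levels = [torch.randn(1, 128, s, s) for s in (56, 28, 14)]
mgf = MGF(ch=128)
fused = [mgf(r, t) for r, t in zip(rgb_levels, thr_levels)]
```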
Experiments on several public RGB-T salient object detection datasets show that our algorithm outperforms state-of-the-art methods, especially under challenging conditions such as poor illumination, complex backgrounds, and low contrast.