The paper presents a novel deep convolutional neural network architecture called SegNet, which is designed for semantic pixel-wise segmentation. It consists of an encoder network, a decoder network, and a pixel-wise classification layer.
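To make the role of the pixel-wise classification layer concrete, here is a minimal sketch that assumes the decoder has already produced one score per class for every pixel. The summary does not specify the classifier, so a per-pixel softmax followed by an argmax is used here as an assumption. The class names, the toy score map, and the helper names `softmax` and `classify_pixels` are all hypothetical.

```python
import math


def softmax(scores):
    """Turn a list of raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_pixels(score_map):
    """Assign each pixel the index of its most probable class.

    score_map[r][c] is a list of K class scores for pixel (r, c).
    """
    labels = []
    for row in score_map:
        label_row = []
        for scores in row:
            probs = softmax(scores)
            label_row.append(max(range(len(probs)), key=probs.__getitem__))
        labels.append(label_row)
    return labels


# Hypothetical class list and a toy 2 x 2 score map (not from the paper).
CLASSES = ["road", "car", "sky"]
scores = [
    [[2.0, 0.1, -1.0], [0.3, 0.2, 3.1]],
    [[0.0, 1.5, 0.2], [4.0, 0.0, 0.0]],
]
label_map = classify_pixels(scores)
named = [[CLASSES[i] for i in row] for row in label_map]
```

The output has the same height and width as the input, with one label per pixel, which is what distinguishes semantic segmentation from whole-image classification.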
The paper analyzes the decoding technique of SegNet and compares it with the widely used Fully Convolutional Network (FCN). It evaluates the performance of SegNet on two scene segmentation tasks: CamVid road scene segmentation and SUN RGB-D indoor scene segmentation.
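The decoding technique in question is, per the SegNet paper, upsampling with the max-pooling indices recorded by the corresponding encoder stage, rather than learning the upsampling. The following is a rough pure-Python illustration of that idea on a single 2D feature map, not the authors' implementation. The function names and the toy input are hypothetical, and the trainable convolutions that SegNet applies after unpooling to densify the sparse map are omitted.

```python
def max_pool_with_indices(fmap, k=2):
    """Non-overlapping k x k max pooling over a 2D list.

    Returns the pooled map and, for each pooled value, the (row, col)
    position in the input where that maximum was found.
    """
    h, w = len(fmap), len(fmap[0])
    pooled, indices = [], []
    for r in range(0, h - h % k, k):
        prow, irow = [], []
        for c in range(0, w - w % k, k):
            window = [(fmap[r + dr][c + dc], (r + dr, c + dc))
                      for dr in range(k) for dc in range(k)]
            value, pos = max(window, key=lambda t: t[0])
            prow.append(value)
            irow.append(pos)
        pooled.append(prow)
        indices.append(irow)
    return pooled, indices


def unpool_with_indices(pooled, indices, out_shape):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    h, w = out_shape
    out = [[0.0] * w for _ in range(h)]
    for prow, irow in zip(pooled, indices):
        for value, (r, c) in zip(prow, irow):
            out[r][c] = value
    return out


# Toy 4 x 4 feature map (illustrative values only).
feature_map = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [6.0, 2.0, 3.0, 1.0],
]
pooled, indices = max_pool_with_indices(feature_map)
sparse = unpool_with_indices(pooled, indices, (4, 4))
```

Only the positions of the maxima need to be carried from encoder to decoder, not the full encoder feature maps, which is the source of the memory savings during inference.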
The paper demonstrates the efficacy of SegNet by providing a real-time online demo of road scene segmentation for autonomous driving.
SegNet is shown to be efficient in both memory and computation time during inference, while achieving competitive segmentation performance compared to other architectures.