Abstract:
Spotting text in natural scenes underpins many image understanding tasks and comprises two subtasks: text detection and text recognition. We propose a unified network that localizes and recognizes text in a single forward pass, without image cropping, feature recalculation, word separation, or character grouping. Because the framework is trained end to end, it can handle text of arbitrary shape. Convolutional features are computed once and shared by the detection and recognition modules, and multi-task training makes the learned features more discriminative, which improves overall performance. To cope with irregular text, the word recognition module uses a 2D attention model. The attention model also provides the spatial location and orientation angle of each character, which are used to refine text localization and local feature extraction during word recognition. Our method outperforms existing methods on both regular and irregular text spotting benchmarks. Extensive ablation experiments validate the design of each module.
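As a rough illustration of the 2D attention idea mentioned above, the following minimal NumPy sketch computes one attention step over a convolutional feature map. It is not the network proposed here: the additive scoring form, the function name `attention_2d`, the weight names `w_f`, `w_h`, `v`, and the use of the attention-weighted centroid as a proxy for a character's location are all assumptions made for illustration.

```python
import numpy as np


def attention_2d(features, hidden, w_f, w_h, v):
    """One step of additive attention over a 2D feature map (illustrative sketch).

    features: (H, W, C) shared convolutional features
    hidden:   (D,) decoder state for the character being predicted
    w_f: (C, A), w_h: (D, A), v: (A,) hypothetical learned projections
    """
    h, w, _ = features.shape
    # Score every spatial position against the current decoder state.
    scores = np.tanh(features @ w_f + hidden @ w_h) @ v  # (H, W)
    # Softmax over all H*W positions (shifted by the max for numerical stability).
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Glimpse: attention-weighted sum of the feature vectors, shape (C,).
    glimpse = np.tensordot(weights, features, axes=([0, 1], [0, 1]))
    # Assumed proxy for the character location: centroid of the attention map.
    ys, xs = np.mgrid[0:h, 0:w]
    centre = (float((weights * ys).sum()), float((weights * xs).sum()))
    return weights, glimpse, centre


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 32, 16))   # H=8, W=32, C=16
    state = rng.normal(size=(24,))         # D=24
    weights, glimpse, centre = attention_2d(
        feats, state,
        rng.normal(size=(16, 12)), rng.normal(size=(24, 12)), rng.normal(size=(12,)),
    )
    print(weights.shape, glimpse.shape, centre)
```

Because the weights are normalized over both spatial axes rather than along a single row, the attended region can follow curved or rotated text, which is the motivation for a 2D rather than 1D attention model.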