The following is a summary of “A brain-inspired object-based attention network for multiobject recognition and visual reasoning,” published in the May 2023 issue of Ophthalmology by Adeli et al.
Researchers investigated how attention control is learned in the visual system, focusing on object-based attention and on sequential glimpses of objects. To simulate these processes, they developed an encoder-decoder model inspired by the brain's recognition-attention system.
The “what” encoder consisted of a hierarchy of feedforward, recurrent, and capsule layers. At each iteration, the model took a new glimpse of the image and passed it through the encoder to obtain an object-centric representation. That representation was then passed to the “where” decoder, which used its recurrent representation to provide top-down attentional modulation, both to plan subsequent glimpses and to influence routing in the encoder.
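The alternation between encoder and decoder described above can be pictured as a simple loop. The sketch below is only a toy illustration of that control flow, not the authors' network: nothing is learned, there are no capsule layers and no routing, and every name in it (take_glimpse, what_encoder, where_decoder, run_glimpses) is hypothetical. The encoder is assumed to reduce a glimpse to a single number, and the decoder is assumed to pick the brightest location that earlier glimpses have not covered.

```python
from typing import List, Tuple

Image = List[List[float]]
Location = Tuple[int, int]


def take_glimpse(image: Image, center: Location, size: int) -> Image:
    """Crop a size x size window centred on `center`, zero-padding at the borders."""
    half = size // 2
    height, width = len(image), len(image[0])
    top, left = center[0] - half, center[1] - half
    return [
        [
            image[r][c] if 0 <= r < height and 0 <= c < width else 0.0
            for c in range(left, left + size)
        ]
        for r in range(top, top + size)
    ]


def what_encoder(patch: Image) -> float:
    """Placeholder encoder (assumption): summarise the glimpse as its total intensity."""
    return sum(sum(row) for row in patch)


def where_decoder(state: float, code: float, image: Image,
                  visited: List[Location], size: int) -> Tuple[float, Location]:
    """Placeholder decoder (assumption): update a recurrent state, then choose
    the brightest pixel that no earlier glimpse window has covered."""
    new_state = 0.5 * state + code  # toy recurrent update, not a trained RNN
    half = size // 2
    best, best_loc = -1.0, visited[-1]  # fall back to the last location
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            covered = any(abs(r - vr) <= half and abs(c - vc) <= half
                          for vr, vc in visited)
            if not covered and value > best:
                best, best_loc = value, (r, c)
    return new_state, best_loc


def run_glimpses(image: Image, start: Location, steps: int,
                 size: int = 3) -> Tuple[List[Location], List[float]]:
    """Alternate the encoder and decoder for `steps` iterations."""
    state, location = 0.0, start
    visited: List[Location] = []
    codes: List[float] = []
    for _ in range(steps):
        visited.append(location)
        code = what_encoder(take_glimpse(image, location, size))
        codes.append(code)
        state, location = where_decoder(state, code, image, visited, size)
    return visited, codes


# Two bright "objects" on an otherwise blank 6 x 6 image.
image = [[0.0] * 6 for _ in range(6)]
image[1][1] = 1.0
image[4][4] = 0.8
visited, codes = run_glimpses(image, start=(1, 1), steps=2)
```

In this toy run, the first glimpse is centred on the object at (1, 1) and the decoder then moves the second glimpse to the object at (4, 4), so each iteration yields a representation of one object at a time. In the model summarized here, both components are learned, and the decoder also modulates routing inside the encoder, which this sketch leaves out.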
The attention mechanism implemented in the model significantly improved the accuracy of classifying highly overlapping digits. In a visual reasoning task that involved comparing two objects, the model achieved near-perfect accuracy and outperformed larger models in generalizing to unseen stimuli.
The study demonstrated that object-based attention mechanisms and sequential glimpses of objects can improve performance on visual processing tasks.
The encoder-decoder model, inspired by the recognition-attention system in the brain, successfully learned attention control and enhanced performance in classifying overlapping digits and visual reasoning tasks. These findings contribute to understanding how attention is learned in the visual system and highlight the importance of object-based attention mechanisms in goal-directed behavior.