Article summary of Hierarchical models of object recognition in cortex by Riesenhuber & Poggio - Chapter

Recognition of visual objects

Recognizing visual objects is a fundamental, frequently performed cognitive task with two essential requirements: invariance and specificity. Cells in the inferotemporal cortex (IT, the highest visual area in the ventral visual pathway) appear to play a key role in object recognition. These cells respond to views of complex objects such as faces, and some neurons respond selectively to particular faces and not to others. The question remains how these cells can respond differently to different faces even though such similar stimuli produce nearly the same activation pattern on the retina, while still responding consistently to the same object when its position or size changes.

A similar puzzle appears on a smaller scale in the striate cortex of cats. Both simple and complex cells respond to a presented bar, but simple cells have small receptive fields and are strongly position-dependent, whereas complex cells have larger receptive fields and are not position-dependent. Hubel and Wiesel proposed a model in which simple cells with neighboring receptive fields, that is, cells that view adjacent parts of the visual field, feed into the same complex cell. The complex cell then responds wherever the bar falls within that group of simple cells. A direct extension of this model leads to a hierarchy running from simple cells to higher-order complex cells.

Although the receptive fields of cells in V4 can be shifted or modulated by attention, there is little evidence that this mechanism is used for translation-invariant object recognition. Instead, invariance to any transformation can be built up by pooling over afferent cells tuned to different transformed versions of the same stimulus. There is now evidence that cells tuned to whole views or to parts of views of an object are acquired through a learning process. View-invariant recognition can then be achieved by pooling over a small number of such neurons. This idea raises two problems.
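The pooling idea can be illustrated with a small sketch. It assumes hypothetical view-tuned afferents with Gaussian tuning around a preferred viewing angle; the function names, the tuning width and the four stored views are invented for illustration and are not taken from the article.

```python
import math

def view_tuned_response(view_deg, preferred_deg, sigma_deg=20.0):
    """Gaussian tuning of one hypothetical afferent around its preferred view."""
    d = view_deg - preferred_deg
    return math.exp(-(d * d) / (2.0 * sigma_deg ** 2))

def pooled_response(view_deg, preferred_views, pool=max):
    """Pool over afferents tuned to different views of the same object."""
    return pool(view_tuned_response(view_deg, p) for p in preferred_views)

preferred_views = [0.0, 30.0, 60.0, 90.0]   # assumed: four stored views
test_views = [0.0, 15.0, 45.0, 75.0, 90.0]

# A single afferent tuned to 0 degrees versus a unit pooling over all four.
single = [view_tuned_response(v, 0.0) for v in test_views]
pooled = [pooled_response(v, preferred_views) for v in test_views]

for v, s, p in zip(test_views, single, pooled):
    print(f"view {v:5.1f}  single {s:.3f}  pooled {p:.3f}")
```

In this toy setting the single afferent falls silent as the object rotates away from its preferred view, while the pooled unit keeps responding across the whole range covered by its afferents.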

Problem 1

In monkeys, learning previously unfamiliar stimuli (such as faces) appears to be possible from just one view of the object, with part of the invariance acquired from that single view. If the object is presented with many distractor objects around it, it may be learned in combination with those objects. The cells still become invariant to other positions.

Problem 2

The model specifies how view-tuned units (VTUs, units that fire for a specific view of an object) are put together, but not how they arise.

Results

The model is based on a simple hierarchical feedforward architecture. Its structure reflects the assumption that invariance and feature specificity must be built up by different mechanisms. The pooling mechanism should yield robust feature detectors: it must allow specific features to be detected without being confused by clutter or context in the receptive field.
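As a rough illustration of the two separate mechanisms, the sketch below stacks a template-matching stage (specificity) and a pooling stage (invariance) on a one-dimensional binary input. It is a toy example under stated assumptions, not the architecture from the article: the names `s_layer` and `c_layer`, the Gaussian match function, the template and the inputs are all hypothetical, and MAX is used for pooling as just one of the options discussed in the text.

```python
import math

def s_layer(signal, template, sigma=0.5):
    """Template matching: one Gaussian-tuned unit per position (specificity)."""
    n = len(template)
    responses = []
    for i in range(len(signal) - n + 1):
        patch = signal[i:i + n]
        dist2 = sum((p - t) ** 2 for p, t in zip(patch, template))
        responses.append(math.exp(-dist2 / (2.0 * sigma ** 2)))
    return responses

def c_layer(responses):
    """Pooling over all positions (invariance); MAX is one possible choice."""
    return max(responses)

template = [1, 1, 0]                    # assumed feature: a two-pixel bar
pattern = [0, 1, 1, 0, 0, 0, 0, 0]      # bar near the left edge
shifted = [0, 0, 0, 0, 1, 1, 0, 0]      # same bar, different position
other = [0, 1, 0, 1, 0, 0, 0, 0]        # a different pattern

out_pattern = c_layer(s_layer(pattern, template))
out_shifted = c_layer(s_layer(shifted, template))
out_other = c_layer(s_layer(other, template))
print(out_pattern, out_shifted, out_other)
```

The pooled output is the same for the bar at either position, while the non-matching pattern produces a clearly weaker output, so position invariance comes from the pooling stage and selectivity from the matching stage.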

There are two candidate pooling mechanisms.

Linear addition = SUM.

All afferents are given equal weights. The response of a complex cell is invariant as long as the stimulus stays within the cell's receptive field. However, the response level does not indicate whether there actually is a bar in the receptive field: because the output is the sum of all afferent activity, feature specificity is lost.

Non-linear maximum operation = MAX.

The strongest afferent cell determines the postsynaptic response. With MAX, the response equals that of the most active afferent cell, and this signal is taken as the best match for a portion of the stimulus. This makes MAX more robust than SUM.

In both cases, the response of a complex cell is invariant to the position of the bar within the receptive field. A non-linear MAX function, however, is a suitable pooling operation for achieving invariance. It amounts to an implicit scan over afferent cells of the same type: the strongest of the responding cells is selected, and its response is the one most consistent with the stimulus regardless of position. Pooling by combining afferent cells, in contrast, yields a mixed signal caused by different stimuli.
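The difference between the two pooling rules can be made concrete with a small sketch. The afferent activations below are invented numbers standing in for bar detectors of the same type at four positions; the function names are hypothetical.

```python
def sum_pool(afferents):
    """Linear pooling: add all afferent responses with equal weights."""
    return sum(afferents)

def max_pool(afferents):
    """Non-linear pooling: pass on the strongest afferent response."""
    return max(afferents)

# Assumed activations of four same-type bar detectors at different positions.
bar_left = [0.0, 0.9, 0.0, 0.0]       # bar at one position
bar_right = [0.0, 0.0, 0.0, 0.9]      # same bar, shifted
clutter_only = [0.3, 0.3, 0.3, 0.3]   # no bar, every detector weakly driven

for name, pool in (("SUM", sum_pool), ("MAX", max_pool)):
    print(name, pool(bar_left), pool(bar_right), pool(clutter_only))
```

Both rules give the same output for the bar at either position. With clutter alone, however, the summed output exceeds the output for a real bar, whereas the MAX output stays low, which is the loss of feature specificity described for SUM.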

The MAX mechanism is consistent with neurophysiological data in several respects. For example, if two stimuli are presented in the receptive field of an IT neuron, the neuron's response is dominated by the stimulus that produces the stronger response when presented alone. This is what the MAX model predicts for the afferent neurons. A number of studies support the MAX model. These studies often find highly non-linear tuning of IT cells, which matches the MAX response function; a linear model cannot produce such strong response changes from a small change in input.

In some cases, clutter can change the value returned by the MAX function. The quality of the match in the final stage then changes, so the strength of the VTU response changes as well. One solution is to use more specific features. Simulations have shown that the model is able to recognize objects in context.
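A toy calculation can show how clutter might alter a VTU response and why more specific features help. The sketch assumes a VTU with Gaussian tuning around a stored feature vector and feature channels that each report the MAX over object and clutter activations; all names and numbers are hypothetical and chosen only for illustration.

```python
import math

def vtu_response(features, stored, sigma=0.3):
    """Gaussian tuning of a view-tuned unit around its stored feature vector."""
    dist2 = sum((f - s) ** 2 for f, s in zip(features, stored))
    return math.exp(-dist2 / (2.0 * sigma ** 2))

def max_pool_with_clutter(target, clutter):
    """Each feature channel reports the MAX over target and clutter activations."""
    return [max(t, c) for t, c in zip(target, clutter)]

target = [0.9, 0.2, 0.7]             # assumed channel activations, object alone
clutter_broad = [0.3, 0.8, 0.1]      # clutter strongly drives a weakly specific channel
clutter_specific = [0.3, 0.1, 0.1]   # more specific channels are driven less by clutter

clean = vtu_response(target, target)
hijacked = vtu_response(max_pool_with_clutter(target, clutter_broad), target)
robust = vtu_response(max_pool_with_clutter(target, clutter_specific), target)
print(clean, hijacked, robust)
```

When clutter takes over one channel, the feature vector moves away from the stored one and the VTU response drops sharply. When the channels are specific enough that clutter never exceeds the object's own activation, the MAX values and therefore the VTU response are unchanged.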

The MAX model is well suited to describing brain processes. MAX-like responses probably arise from cortical microcircuits based on lateral inhibition between neurons in a cortical layer. In addition, the MAX response is important for object recognition.
