Article summary of Hierarchical models of object recognition in cortex by Riesenhuber & Poggio

Recognition of visual objects

The recognition of visual objects is a fundamental, frequently performed cognitive task with two essential requirements: invariance and specificity. Cells in the inferotemporal cortex (IT, the highest visual area in the ventral visual pathway) appear to play a key role in object recognition. These cells respond to views of complex objects such as faces. Certain neurons respond specifically to certain faces and not to others. The question remains: how can they respond differently to different faces when the stimulation these faces produce on the retina is practically the same?

A similar puzzle appears, on a smaller scale, in the striate cortex of cats. Both simple and complex cells respond to a presented bar. Simple cells have small receptive fields and are strongly position-dependent, whereas complex cells have large receptive fields and are not position-dependent. Hubel and Wiesel proposed a model in which simple cells with neighboring receptive fields feed into the same complex cell: cells that sit next to each other also see adjacent parts of the visual world, so a group of such cells often fires together. A direct extension of this model leads to a scheme of higher-order complex cells.

Responses of cells in V4 can be modulated by attention, and their receptive fields can adapt accordingly. There is little evidence, however, that this mechanism is used for translation-invariant object recognition. Instead, invariance to any transformation can be built up by pooling over afferent cells tuned to different transformed versions of the same stimulus. There is now evidence that groups of cells tuned to full or partial views are acquired through a learning process. The view-invariance problem can then be handled by a small number of neurons. This idea leaves two problems.

Problem 1

In monkeys, learning stimuli that are unfamiliar to them (such as faces) appears to be possible because part of the invariance is learned from just one view of the object. If the object is presented with many distractor objects around it, it can be learned in combination with these objects. The cells thus become invariant to other positions.

Problem 2

The model does indicate how view-tuned units (VTUs, groups of cells that fire in response to a specific object) are built, but not how they arise.

Results

The model is based on a simple hierarchical feedforward architecture. Its structure reflects the assumption that invariance and feature specificity must be built up by separate mechanisms. The pooling mechanism should provide robust feature detectors: it must allow specific features to be detected without being confused by clutter or context in the receptive field.

There are two alternative pooling mechanisms.

Linear addition = SUM.

All afferents are weighted equally. The response of a complex cell is invariant as long as the stimulus remains in the cell's receptive field. However, the response does not indicate whether there actually is a bar in the receptive field. The output signal is the sum over the afferent cells, so feature specificity is lost.

Non-linear maximum operation = MAX.

The strongest afferent cell determines the postsynaptic response. With MAX, the most active afferent cell sets the response, and its signal is taken as the best match for a portion of the stimulus. This makes the MAX response more robust than the SUM response.

In both cases, the response of a complex cell is invariant to the position of the bar within the receptive field. A non-linear MAX function is a suitable description of pooling for invariance. It amounts to implicitly scanning over afferent cells of the same type: the strongest response is selected from the cells that respond, and this is the one most consistent with invariance. Pooling by combining afferent responses, in contrast, yields a mixed signal caused by different stimuli.
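The difference between the two pooling rules can be illustrated with a small sketch. The afferent activity values are invented for illustration, and the function names `sum_pool` and `max_pool` are hypothetical; this is a toy example under those assumptions, not the authors' implementation.

```python
# Toy comparison of SUM and MAX pooling over afferent responses.
# All activity values (arbitrary units) are made up for illustration.

def sum_pool(afferents):
    """Linear pooling: add all afferent responses with equal weights."""
    return sum(afferents)

def max_pool(afferents):
    """Non-linear pooling: the strongest afferent sets the response."""
    return max(afferents)

# Each list holds the responses of five afferents tuned to the same bar
# at five neighboring positions in the receptive field.
bar_left = [90, 10, 0, 0, 0]        # bar at the left edge
bar_right = [0, 0, 0, 10, 90]       # same bar at the right edge
clutter_only = [20, 20, 20, 20, 20] # no bar, weak activity everywhere

for name, aff in [("bar left", bar_left),
                  ("bar right", bar_right),
                  ("clutter only", clutter_only)]:
    print(f"{name}: SUM={sum_pool(aff)}  MAX={max_pool(aff)}")
```

In this toy case both rules are invariant to the position of the bar, but only MAX separates the bar from diffuse clutter: SUM returns 100 in all three cases, whereas MAX returns 90 for either bar position and 20 for clutter alone.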

MAX systems are consistent in some respects with neurophysiological data. For example, if two stimuli are presented in the receptive field of an IT neuron, the neuron's response is dominated by the stimulus that evokes the stronger response when presented alone. This is what the MAX model predicts at the level of afferent neurons. A number of studies provide support for the MAX model. These studies often find highly non-linear tuning of IT cells, which corresponds to the MAX response function. A linear model cannot produce such strong changes in response from a small change in input.
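The two-stimulus case can be written down as a small sketch. The firing rates are invented, and `predict_pair_response` is a hypothetical helper introduced only for illustration; it is not taken from the article.

```python
# Toy prediction of a neuron's response to two stimuli shown together,
# given its response to each stimulus shown alone. Rates are invented.

def predict_pair_response(rate_a, rate_b, rule):
    """Return the predicted response to the pair under a pooling rule."""
    if rule == "MAX":
        return max(rate_a, rate_b)  # dominated by the more effective stimulus
    if rule == "SUM":
        return rate_a + rate_b      # linear addition of both responses
    raise ValueError(f"unknown rule: {rule}")

stimulus_a_alone = 60  # hypothetical response (spikes/s) to stimulus A alone
stimulus_b_alone = 15  # hypothetical response (spikes/s) to stimulus B alone

max_prediction = predict_pair_response(stimulus_a_alone, stimulus_b_alone, "MAX")
sum_prediction = predict_pair_response(stimulus_a_alone, stimulus_b_alone, "SUM")
print("MAX prediction:", max_prediction)
print("SUM prediction:", sum_prediction)
```

Under these assumed rates, MAX predicts that the pair evokes the same response as the stronger stimulus alone (60), while SUM predicts a larger response than either stimulus alone (75).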

In some cases, clutter can change the value returned by the MAX function. The quality of the match at the final stage then changes, so that the strength of the VTU response changes as well. One solution is to use more specific features. Simulations have shown that the model is able to recognize objects in context.

The MAX model is well suited to describing brain processes. MAX responses probably arise from cortical microcircuits involving lateral inhibition between neurons in a cortical layer. In addition, the MAX response is important for object recognition.
