Perception Across the Senses


Perception of the environment does not occur within a single sense. The world is perceived through interactions among the senses, which integrate information and cue one another (Fairhall & Macaluso, 2009). These interactions among the auditory, visual, and tactile sensory systems have been an area of rapid growth within the study of cognition over the last couple of decades. Indeed, a few decades earlier, interaction between modalities would have been viewed as somewhat exotic (Spence, 2004). How the brain represents space is of particular interest to modern psychology, has informed the creation of many new technologies, and is integral to the understanding of consciousness. Of particular relevance to studies of the brain’s construction of space is attention; therefore, the properties of attention must also be examined in detail to explain how attention is modulated between the sensory modalities. Multisensory attention can be viewed as a way that human beings control their senses to perceive the world.

The senses are linked together in a regulatory process that allows attention to pass through multiple modalities while simultaneously drawing information from each one. These cross-modal links among vision, touch, and audition have been tested to reveal an underlying cortical system devoted to multisensory integration (Kida, 2009). The brain seems to use the senses in concert, with each sense able to activate or prime the others and to integrate sensory information from multiple modalities (Fairhall & Macaluso, 2009). This paper will examine many of the underlying mechanisms of multisensory integration, as well as the base senses themselves, especially vision. Because multisensory attention cannot be understood without understanding attention itself, an in-depth review of the properties of attention is a necessary first step toward explaining how the different senses communicate to create a complete and seamless conscious stream.

Properties of Attention

            Attention and its principles are among the most researched topics in psychology, especially cognitive psychology. Attention is required in almost any conscious process, and some have indeed equated attention with consciousness itself. Many aspects of how attention turns sensation into conscious perception and subsequent reaction have yet to be resolved (Lavie, 1995). One controversial subject within the study of attention is the late selection model versus the early selection model. This argument concerns where the “bottleneck” of attention lies, that is, at what stage the limited capacity of attention restricts processing. The early and late selection models were created to explain why irrelevant stimuli are sometimes perceived and why certain stimuli will be perceived whereas others will not. In 1995, Lavie combined these two approaches into a hybrid theory of selective attention based on perceptual load and task difficulty. This reconciled the two competing models of attention and provides a consistent, well-founded starting point for studying how the perceptual systems modulate attentional resources.

            Lavie’s theory of attention should be examined in more detail so that unimodal and cross-modal sensory integration and attention can be more completely understood. One problem that the approach does not solve is the ability to perceive certain information even under high-load conditions (Lavie, 2010). The properties of attention also seem to vary from individual to individual, which makes them hard to study comprehensively. Nevertheless, the hybrid model has resolved many of the disputes between the early and late selection views by allowing for both the limited capacity of attention and the assumption that perception is somewhat automatic, in that the perceiver cannot consciously shut it down (Lavie, 2010). According to Lavie, the locus of selection shifts with the difficulty of the task: easier tasks allow more irrelevant stimuli to be perceived than more difficult tasks. This implies a constantly shifting “filter” with a threshold that limits what can be perceived.

            The shifting locus of attention can be viewed as a filter for irrelevant stimuli that also has a built-in “alarm system” that responds quickly to important stimuli. This filter has been the subject of the majority of Lavie’s studies, which focus on task difficulty and the limits it places on how much stimulus input attention can take in. It has been shown that even the perception of biologically relevant stimuli such as optic flow is reduced by a demanding yet unrelated task (Rees, Frith, & Lavie, 1997). This suggests that the resources of attention are shared across modalities and distributed between the senses based on what can be perceived by each modality. The information is then integrated across multiple levels in various parts of cortex. This multisensory integration among the three primary senses of touch, audition, and vision is the next step in the study of cross-modal attention.

Multisensory Integration 

            Multisensory integration is necessary to recognize different inputs from the sensory modalities as pertaining to the same object (Koelewijn, 2010). Results have even demonstrated that during multisensory integration the brain not only combines inputs from the sensory modalities but also acts upon these inputs in concert with the peripheral nervous system to allow for perceptual enhancements (Lugo, Doti, Wittich, & Faubert, 2008). However, multisensory integration is not the same thing as attention, even though many studies have shown correlations between the two. Some studies have suggested that multisensory integration is preattentive and immune to top-down influences (Fairhall & Macaluso, 2009). It is probable, however, that integration and cortical communication between modalities are not yet comprehensively understood, which is the position many researchers have taken on the issue.

Integration of multisensory inputs occurs when different senses detect the same stimulus at about the same time and at about the same location. One example of this as an illusion is ventriloquism, in which vision captures a sound that is produced in a slightly different (but unnoticeably different) location. Because the sound occurs simultaneously with the puppet’s lip movement, it is perceived as emanating from the puppet. In 2000, Shams, Kamitani, and Shimojo reported another illusory effect, which shows that audition can bias the visual system. They showed that a single flash accompanied by multiple short auditory beeps was perceived as multiple flashes. These illusions show that the multisensory integration systems can be fooled; the systems themselves must now be examined in detail to understand how the senses work across modalities in unison to control attention.

            Multisensory integration can enhance visual search by increasing the salience of objects (Koelewijn, 2010). When a short sound is presented simultaneously with a color change of a target stimulus, the stimulus seems to stand out from the display, indicating that visual search is enhanced by audition (Koelewijn, 2010). What these experiments show is the power of the multisensory integration system to bias, enhance, or facilitate the other senses. It is a kind of additive process that increases the perceiver’s ability to discriminate, search, and react to the surrounding environment. However, there are constraints on how the senses can integrate, namely temporal and spatial limitations.

            It is believed that unimodal information converges on integration sites that may themselves be unimodal or multimodal. This convergence is believed to occur only under certain constraints on the location of the stimuli and the time between their presentations. There is also believed to be a rule of inverse effectiveness, which states that the multisensory integration effect is larger for less perceptually powerful, or less salient, stimuli. The temporal and spatial constraints on the multisensory integration system have been the subject of previous research and are relatively well known. The results of these studies show a time window of about 100 milliseconds within which the stimuli must be presented, or multisensory integration will be significantly reduced. This provides a fairly clear difference from preparatory states, or the cueing and alerting of the different sensory systems (Koelewijn, 2010).
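
The rule of inverse effectiveness can be made concrete with a small numerical sketch. The index below, the percentage gain of the combined response over the best single-modality response, is one measure commonly used in the multisensory literature; it is not taken from the studies cited here, and the function name and the response values are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: the response values are made up,
# not data from any study cited in this paper.


def enhancement_percent(multisensory, best_unisensory):
    """Percent gain of the multisensory response over the best unisensory one."""
    if best_unisensory <= 0:
        raise ValueError("best_unisensory must be positive")
    return 100.0 * (multisensory - best_unisensory) / best_unisensory


# Hypothetical responses in arbitrary units.
# Weak stimuli: the best single modality gives 2.0, the combination gives 6.0.
weak = enhancement_percent(multisensory=6.0, best_unisensory=2.0)
# Strong stimuli: the best single modality gives 20.0, the combination gives 24.0.
strong = enhancement_percent(multisensory=24.0, best_unisensory=20.0)

print(weak, strong)  # 200.0 20.0
```

In this made-up case the absolute gain is the same for both stimulus pairs, but the proportional enhancement is far larger for the weak pair, which is the pattern the rule describes.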

The location of the stimulus is also very important for integration to occur. However, these spatial constraints seem to be limited to the periphery; if the target is within the region of the fovea, multisensory integration will almost certainly occur. A sound will enhance the visual perception of an object even if the sound is only temporally relevant, provided the visual stimulus is at the center of fixation. This implies that sound works in concert with vision in the periphery to enhance the salience of objects, but that in the center of the visual field the object will always be enhanced by simultaneous multimodal stimuli. This suggests that multimodal integration occurs at many cortical sites, as many researchers have indeed shown.

            The cortical sites known to integrate multisensory events are not modality specific. The primary visual cortex integrates auditory information, just as there are sites specific to multisensory integration. The primary brain regions involved in multisensory integration are the superior temporal sulcus and gyrus, the ventral and lateral intraparietal areas, and subcortical areas such as the superior colliculus (Koelewijn, 2010). However, typically unimodal sites, such as the primary visual cortex and primary auditory cortex, are also involved in the processes of multisensory integration. The superior temporal sulcus is primarily involved in audiovisual integration and is one of the most highly studied cortical areas involved in multisensory integration. It is believed that all of these areas communicate using feedforward connections and polysynaptic feedback loops, creating an additive integration process that increases in effectiveness as the stimuli decrease in intensity (Fairhall & Macaluso, 2009).

            The audiovisual integration effect is one of the more studied, and probably the most often used, of the multimodal integration effects. We use it when attending to the visible speech patterns of another person. The superior temporal sulcus is the brain area shown to integrate speech and visual patterns of movement. A direct integration effect exists between vocal tract shape, speech acoustics, and deformation of the face, which can signal the starting and stopping of words, sentences, and ideas. The biological motion of the mouth, neck, and head provides an enormous amount of information about the sound that is about to be received. Indeed, the activity of the superior temporal sulcus is superadditive with congruent stimuli, as is that of the primary visual and auditory cortices (Callan, Jones, Munhall, Kroos, Callan, & Vatikiotis-Bateson, 2004).

            Another brain area that has important implications for audiovisual integration is the superior colliculus, which is known to reorient visual gaze, especially in saccadic eye movements. The colliculus’ ability to reorient gaze to peripheral events shows that the brain integrates multisensory information in many areas that are also devoted largely to one mode of perception. The superior colliculus is known to reorient gaze to a visual movement with a saccade, but it also receives input from other modalities, making it a polymodal integration site (Hughes, Nelson, & Aronchick, 1998). The summation, or additive effect, of the multimodal inputs to the superior colliculus was observed in Hughes, Nelson, and Aronchick’s study, confirming both the spatial constraints of multimodal integration and the role of the superior colliculus in multimodal perception. Multimodal integration has been studied effectively over the last decade, but brain imaging capabilities have greatly improved during this time, so fMRI studies can greatly increase the knowledge of how audiovisual stimuli are integrated in the cortex.

            Returning to the superior temporal sulcus, it has been shown that fMRI voxels exhibiting audiovisual interactions contain a mixture of unisensory and multisensory subpopulations, some with exclusively unisensory inputs and some with exclusively multisensory inputs; these subpopulations were not visible until high-resolution fMRI was used to image the activations in cortex (Van Atteveldt, Blau, Blomert, & Goebel, 2010). This means that there are certain areas of cortex completely devoted to multisensory integration, as well as sites that process both a single modality and multiple modalities, such as the primary visual or auditory cortex. Related to this research on intermodal connections within brain sites is the brain activity of the blind and deaf, which bears on the plasticity of the brain.

            It seems that the brain can compensate for the lack of one modality. fMRI data have shown that cross-modal plasticity occurs predominantly in the right auditory cortex for the deaf. Studies have shown that the auditory cortex in the deaf, which is normally used for auditory processing in hearing people, shows activation in response to visual stimuli such as lip movements (Finney, Fine, & Dobkins, 2001). Therefore, the right auditory cortex of the deaf, because it receives no auditory input, might be able to process motion in the visual modality. But perhaps the most important aspect of this finding is the reciprocal finding in blind subjects. In many blind people, moving auditory stimuli have been observed to activate the right visual cortex. This again occurs in the right hemisphere, raising the possibility that plasticity for motion processing is concentrated in that hemisphere and supporting the right visual cortex’s predisposition towards motion processing (Finney, Fine, & Dobkins, 2001). This neuroplasticity supports the idea of an integrated and connected system of sensory processing that works in parallel to create a conscious perception of the world, especially between the two modalities of vision and audition. There is even more evidence of such plasticity in multisensory dorsal stream functioning.

            Localizing objects and guiding motor actions in the environment have been shown to be functions of the dorsal visual stream, which receives input from the early visual areas (primary visual cortex) and projects to the posterior parietal cortex (Fiehler & Rösler, 2010). The posterior parietal cortex is believed to work as an integrator for multiple senses that guides movement in space and can even provide a unified representation of space. It seems that this dorsal pathway is heavily used by the brain to work out how to use motor movements to interact correctly with the environment, whereas the ventral stream is implicated in the accurate perception of the object, such as its size or distance. Fiehler and Rösler’s (2010) study found evidence for a polymodal integration system in the dorsal stream that parallels the ideas of multisensory integration already discussed. This study provides another example of the everyday use of multisensory information and of the necessary integration of the tactile and visual modalities.

So far, we have examined various aspects of the integrative systems of the senses, the properties of attention, and how many of the primary sensory areas of cortex are multimodal. This is only a brief examination of the subject matter; the field is extraordinarily complex and intricate, with a large body of findings on the integration of information within the brain. There is much research left to be done in the fields of attention and the integration of the senses, but this review should provide a basic overview of the knowledge obtained thus far in the two fields discussed. However, to fully understand how the senses work in unison, we must step out of the constraints of multisensory integration and the basic tenets of attention into the realm of cross-modal attention.

Cross-modal Attention

Construction, maintenance, and updating of the cognitive representations of the space surrounding an organism are essential for higher functioning and adaptation to the environment, and combining cues from the different sensory modalities is often the best way to achieve an adaptive, representative perception of the environment (Spence, 2005). The fields of multisensory integration and cross-modal attention overlap heavily; however, it would seem that the main difference between the two fields of research is temporal. Indeed, cross-modal attention necessarily implies the use of attentive resources, whereas multisensory integration does not. The distinction may be a product of modern laboratory techniques for testing reaction times and cueing, because the differences between the two ideas are minimal and possibly artificial, arising from the constraints of laboratory settings. It is possible that in the real world such a dichotomy does not exist, which is the view I put forward. However, for the sake of consistency with the research in the field, I have maintained the distinction between the two.

The difference between multisensory integration and cross-modal attention as defined by Koelewijn (2010) is that multisensory integration is preattentive, occurring at many different levels, whereas cross-modal attention has to do with the focusing of the resources of attention. Multisensory integration only occurs when the cue in one modality and the target in a different modality are close together and nearly simultaneous. Cross-modal cueing effects occur when the cue precedes the target by at least some time, between about fifty and three hundred milliseconds (Spence, 2010). But as discussed earlier, multisensory integration takes place within a window of zero to about one hundred milliseconds; therefore, the time courses of multisensory integration and cross-modal attention overlap.
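
The overlap between the two time windows can be shown with a short sketch. It assumes only the approximate ranges quoted in this paper (zero to about 100 ms for integration, about 50 to 300 ms for cueing) and treats them as sharp boundaries, which real data would not support; the function and constant names are hypothetical.

```python
# Illustrative sketch: the ranges are the approximate values quoted in the text,
# treated here as hard boundaries purely for demonstration.
INTEGRATION_RANGE_MS = (0, 100)   # multisensory integration window
CUEING_RANGE_MS = (50, 300)       # cross-modal cueing window


def candidate_processes(soa_ms):
    """List which processes could operate at a given cue-target interval (ms)."""
    labels = []
    if INTEGRATION_RANGE_MS[0] <= soa_ms <= INTEGRATION_RANGE_MS[1]:
        labels.append("multisensory integration")
    if CUEING_RANGE_MS[0] <= soa_ms <= CUEING_RANGE_MS[1]:
        labels.append("cross-modal cueing")
    return labels


def overlap(range_a, range_b):
    """Return the interval shared by two (low, high) ranges, or None."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return (low, high) if low <= high else None


print(overlap(INTEGRATION_RANGE_MS, CUEING_RANGE_MS))  # (50, 100)
for soa in (20, 75, 200):
    print(soa, candidate_processes(soa))
```

Under these assumptions, a cue-target interval of roughly 50 to 100 ms falls inside both windows, which is why the two phenomena are hard to separate on timing alone.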

Cross-modal attention has been the subject of a large body of research over the past couple of decades. Spence (2004) claims that the interaction between the senses is the essence of the perceptual construction of the environment. There is indeed good evidence for a neural system underlying cross-modal links (Kida, Inui, Tanaka, & Kakigi, 2011). fMRI studies have shown that the intraparietal sulcus and the temporoparietal junction are activated in spatial cueing tasks, and TMS (transcranial magnetic stimulation) has also provided support for the involvement of the intraparietal region (Kida, Inui, Tanaka, & Kakigi, 2011). However, just as multisensory integration was shown to affect the primary receiving areas for unimodal sensations, these areas are also responsive to cueing from different modalities. This means that attention affects not only the processing of the primary modality but also the modality-specific brain areas of an irrelevant modality (Nager, Estorf, & Münte, 2006). There does seem to be one area of cortex devoted to modulating the field of attention that is not dependent upon a modality. This is referred to as the supramodal effect of cross-modal attention, and it is still highly debated in the field of cognitive neuroscience.

Supramodal Model of Attention

            The supramodal model of attention holds that there is a common neural pathway that can control spatial shifts of the field of attention within and between different modalities (Macaluso, Frith, & Driver, 2002). Driver and Frith (2000) also found that vision and touch share a supramodal effect in the intraparietal sulcus. There has also been evidence that the right hemisphere controls the shifting and general allocation of attention to different modalities, but this is not absolute, because both hemispheres show activation in cross-modal attention tasks. The inferior premotor cortex was also found to be active during cross-modal tasks, giving reason to believe that attention primes the motor cortex to react to the environment (Macaluso, Frith, & Driver, 2002). Many studies have also found that the superior premotor cortex and superior temporo-parietal junction were active during visual attention tasks, suggesting that they may be responsible for major shifts of attention. This shifting of attention between the senses is not automatic, contrary to what many researchers had thought prior to 2007.

            Experiments have shown that cross-modal cueing effects are eliminated under conditions where participants have to attend to a high-load task in a single modality (Spence, 2010). This is consistent with Lavie’s load theory and helps to extend the limitations of attention across modalities. It seems that tactile cueing as a whole can be eliminated when the participant monitors a rapid serial visual stream, meaning that cross-modal attention is not automatic.


Beauchamp, M. S., Argall, B. D., Bodurka, J., Duyn, J. H., & Martin, A. (2004). Unraveling multisensory integration: patchy organization within human STS multisensory cortex. Nature Neuroscience, 7, 1190-1192.

Callan, D. E., Jones, J. A., Munhall, K., Kroos, C., Callan, A. M., & Vatikiotis-Bateson, E. (2004). Multisensory Integration Sites Identified by Perception of Spatial Wavelet Filtered Visual Speech Gesture Information. Journal of Cognitive Neuroscience, 16, 805-816.

Fairhall, S. L., & Macaluso, E. E. (2009). Spatial attention can modulate audiovisual integration at multiple cortical and subcortical sites. European Journal of Neuroscience, 29, 1247-1257.

Fiehler, K., & Rösler, F. (2010). Plasticity of multisensory dorsal stream functions: Evidence from congenitally blind and sighted adults. Restorative Neurology & Neuroscience, 28(2), 193-205. 

Finney, E. M., Fine, I., & Dobkins, K. R. (2001). Visual stimuli activate auditory cortex in the deaf. Nature Neuroscience, 4, 1171.

Hughes, H., Nelson, M., & Aronchick, D. (1998). Spatial characteristics of visual-auditory summation in human saccades. Vision Research, 38, 3955-3963.

Kida, T., Inui, K., Tanaka, E., & Kakigi, R. (2011). Dynamics of within-, inter-, and cross-modal attentional modulation. Journal of Neurophysiology, 105(2), 674-686.

Lavie, N. (1995). Perceptual load as a necessary condition for selective attention. Journal of Experimental Psychology: Human Perception and Performance, 21, 451-468.

Lavie, N. (2010). Attention, distraction, and cognitive control under load. Current Directions in Psychological Science, 19, 143-148.

Lavie, N., Hirst, A., de Fockert, J. W., & Viding, E. (2004). Load Theory of Selective Attention and Cognitive Control. Journal of Experimental Psychology: General, 133, 339-354.

Lavie, N. (2005). Distracted and confused?: Selective attention under load. Trends in Cognitive Sciences, 9, 75-82.

Lugo, J. E., Doti, R. R., Wittich, W., & Faubert, J. (2008). Multisensory integration: Central processing modifies peripheral systems. Psychological Science, 19, 989-997.

Macaluso, E. E., Frith, C. D., & Driver, J. J. (2002). Supramodal Effects of Covert Spatial Orienting Triggered by Visual or Tactile Events. Journal of Cognitive Neuroscience, 14(3), 389-401. 

Nager, W., Estorf, K., & Münte, T. F. (2006). Crossmodal attention effects on brain responses to different stimulus classes. BMC Neuroscience, 731-738.

Rees, G., Frith, C. D., & Lavie, N. (1997). Modulating irrelevant motion perception by varying attentional load in an unrelated task. Science, 278, 1616-1619.

Sotto-Faraco, S. (2005). Book review. European Journal of Cognitive Psychology, 17(6), 882-885. 

Spence, C. (2010). Crossmodal spatial attention. Annals of the New York Academy of Sciences, 1191, 182-200.

Spence, C., & Parise, C. (2010). Prior-entry: a review. Consciousness and Cognition, 19(1), 364-379. 

Van Atteveldt, N. M., Blau, V. C., Blomert, L., & Goebel, R. (2010). fMR-adaptation indicates selectivity to audiovisual content congruency in distributed clusters in human superior temporal cortex. BMC Neuroscience.
