To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1x16w
Full metadata record
DC Field | Value | Language
dc.contributor.author | Chandrasekaran, Chandramouli | -
dc.contributor.author | Trubanova, Andrea | -
dc.contributor.author | Stillittano, Sébastien | -
dc.contributor.author | Caplier, Alice | -
dc.contributor.author | Ghazanfar, Asif A. | -
dc.date.accessioned | 2019-10-28T15:55:17Z | -
dc.date.available | 2019-10-28T15:55:17Z | -
dc.date.issued | 2009-07-17 | en_US
dc.identifier.citation | Chandrasekaran, Chandramouli, Trubanova, Andrea, Stillittano, Sébastien, Caplier, Alice, Ghazanfar, Asif A. (2009). The Natural Statistics of Audiovisual Speech. PLoS Computational Biology, 5 (7), e1000436 - e1000436. doi:10.1371/journal.pcbi.1000436 | en_US
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1x16w | -
dc.description.abstract | Humans, like other animals, are exposed to a continuous stream of signals, which are dynamic, multimodal, extended, and time varying in nature. This complex input space must be transduced and sampled by our sensory systems and transmitted to the brain where it can guide the selection of appropriate actions. To simplify this process, it’s been suggested that the brain exploits statistical regularities in the stimulus space. Tests of this idea have largely been confined to unimodal signals and natural scenes. One important class of multisensory signals for which a quantitative input space characterization is unavailable is human speech. We do not understand what signals our brain has to actively piece together from an audiovisual speech stream to arrive at a percept versus what is already embedded in the signal structure of the stream itself. In essence, we do not have a clear understanding of the natural statistics of audiovisual speech. In the present study, we identified the following major statistical features of audiovisual speech. First, we observed robust correlations and close temporal correspondence between the area of the mouth opening and the acoustic envelope. Second, we found the strongest correlation between the area of the mouth opening and vocal tract resonances. Third, we observed that both area of the mouth opening and the voice envelope are temporally modulated in the 2–7 Hz frequency range. Finally, we show that the timing of mouth movements relative to the onset of the voice is consistently between 100 and 300 ms. We interpret these data in the context of recent neural theories of speech which suggest that speech communication is a reciprocally coupled, multisensory event, whereby the outputs of the signaler are matched to the neural processes of the receiver. | en_US
dc.format.extent | e1000436 - e1000436 | en_US
dc.language.iso | en_US | en_US
dc.relation.ispartof | PLoS Computational Biology | en_US
dc.rights | Final published version. This is an open access article. | en_US
dc.title | The Natural Statistics of Audiovisual Speech | en_US
dc.type | Journal Article | en_US
dc.identifier.doi | doi:10.1371/journal.pcbi.1000436 | -
dc.date.eissued | 2009-07-17 | en_US
dc.identifier.eissn | 1553-7358 | -
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/journal-article | en_US

Files in This Item:
File | Description | Size | Format
Natural_Statistics_Audiovisual_Speech_2009.PDF | | 1.96 MB | Adobe PDF


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.