Summary: A new study shows how our brains distinguish between music and speech using simple acoustic parameters. Researchers found that slower, steadier sounds are perceived as music, while faster, more irregular sounds are perceived as speech.
These insights could help optimize therapeutic programs for language disorders such as aphasia, and they provide deeper insight into auditory processing.
Key Facts:
- Simple parameters: The brain uses basic acoustic parameters to distinguish music from speech.
- Therapeutic potential: Findings may improve therapies for language disorders such as aphasia.
- Research details: The study involved more than 300 participants listening to synthesized audio samples.
Source: NYU
Music and speech are among the most common types of sounds we hear. But how do we identify what we think are the differences between the two?
An international team of researchers has mapped this process through a series of experiments, yielding insights that offer a potential means of optimizing therapeutic programs that use music to help people with aphasia regain the ability to speak.
This language disorder affects more than 1 in 300 Americans every year, including Wendy Williams and Bruce Willis.
“Although music and speech are different in many ways, ranging from pitch to timbre to sound texture, our results show that the auditory system uses remarkably simple acoustic parameters to distinguish between music and speech,” explains Andrew Chang, a postdoctoral researcher in New York University’s Department of Psychology and the lead author of the article, which appears in the journal PLOS Biology.
“In general, slower, steadier clips of mere noise sound more like music, while faster, more irregular clips sound more like speech.”
Scientists measure the rate of signals in hertz (Hz): the number of events (or cycles) per second, so a higher Hz value means more events per second. For example, people typically walk at a pace of 1.5 to 2 steps per second, which is 1.5-2 Hz.
The beat of Stevie Wonder’s 1972 hit “Superstition” clocks in at approximately 1.6 Hz, while Anna Karina’s 1967 hit “Roller Girl” clocks in at 2 Hz. Speech, on the other hand, is typically two to three times faster, at 4-5 Hz.
It is well documented that the way a song’s volume or loudness changes over time (what is known as ‘amplitude modulation’) is relatively steady, at 1-2 Hz. In contrast, the amplitude modulation of speech is typically 4-5 Hz, meaning the volume changes more frequently.
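The idea of an amplitude-modulation rate can be sketched in code. The following is a minimal illustration (an assumption-laden sketch, not the study's stimulus-generation code): it synthesizes white noise whose volume rises and falls at a chosen rate, then recovers that rate from the spectrum of the volume envelope.

```python
import numpy as np

def am_noise(am_rate_hz, duration_s=4.0, fs=16000, seed=None):
    """Synthesize white noise whose loudness rises and falls am_rate_hz times per second."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration_s * fs)) / fs
    envelope = 0.5 * (1 + np.sin(2 * np.pi * am_rate_hz * t))  # slow volume contour
    return envelope * rng.standard_normal(t.size)

def dominant_am_rate(signal, fs=16000):
    """Estimate the AM rate: extract a crude volume envelope, then find its strongest frequency."""
    envelope = np.abs(signal)            # rectification as a simple envelope estimate
    envelope -= envelope.mean()          # drop the DC (average-loudness) component
    spectrum = np.abs(np.fft.rfft(envelope))
    freqs = np.fft.rfftfreq(envelope.size, d=1 / fs)
    low = freqs < 20                     # speech/music AM rates live well below 20 Hz
    return freqs[low][np.argmax(spectrum[low])]

music_like = am_noise(1.5, seed=0)   # ~1.5 Hz envelope: a "music-like" rate
speech_like = am_noise(5.0, seed=1)  # ~5 Hz envelope: a "speech-like" rate
print(dominant_am_rate(music_like))   # should be close to 1.5
print(dominant_am_rate(speech_like))  # should be close to 5.0
```

With a 4-second clip the FFT resolves the envelope spectrum in 0.25 Hz steps, so a 1.5 Hz or 5 Hz modulation lands exactly on a frequency bin.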
Despite the ubiquity and prominence of music and speech, scientists previously had no clear understanding of how we effortlessly and automatically identify a sound as music or speech.
To better understand this process, Chang and colleagues conducted a series of four experiments for their PLOS Biology study, in which more than 300 participants listened to a series of audio segments of synthesized music- and speech-like noise at different amplitude modulation rates and regularities.
Only volume and speed could be made out in these noise clips. Participants were asked to judge whether these ambiguous samples, which they were told were noise-masked music or speech, sounded like music or speech.
Observing how participants sorted hundreds of noise fragments as music or speech revealed how much each rate and/or regularity feature influenced their judgment. It’s the auditory version of “seeing faces in clouds,” the scientists conclude: if a sound wave contains a feature that matches listeners’ ideas of what music or speech should be like, even a fragment of white noise can sound like music or speech.
The results showed that our auditory system uses surprisingly simple and basic acoustic parameters to distinguish between music and speech: for participants, clips with lower frequencies (<2 Hz) and more regular amplitude modulation sounded more like music, while clips with higher frequencies (~4 Hz) and more irregular amplitude modulation sounded more like speech.
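The pattern described above amounts to a two-cue decision rule. The sketch below is a toy illustration of that rule; the threshold and the tie-handling are illustrative assumptions, not values fitted by the study.

```python
def judge(am_rate_hz, regular):
    """Toy decision rule mirroring the reported pattern: slow, regular amplitude
    modulation leans toward 'music'; fast, irregular leans toward 'speech'.
    The 2 Hz threshold is an illustrative assumption, not a fitted value."""
    cues = 0
    cues += 1 if am_rate_hz < 2.0 else -1  # <2 Hz leaned toward music; ~4 Hz toward speech
    cues += 1 if regular else -1           # regular AM leaned toward music
    if cues > 0:
        return "music"
    if cues < 0:
        return "speech"
    return "ambiguous"                     # conflicting cues: no clear lean

print(judge(1.6, regular=True))    # slow, regular envelope -> "music"
print(judge(4.5, regular=False))   # fast, irregular envelope -> "speech"
```

In the actual study the cues were weighted differently across listeners (see the abstract below); this sketch only captures the direction of each cue, not its strength.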
Knowing how the human brain distinguishes between music and speech could potentially benefit people with auditory or language disorders such as aphasia, the authors note.
For example, melodic intonation therapy is a promising approach to training people with aphasia to sing what they want to say, using their intact “musical mechanisms” to bypass damaged speech mechanisms. Knowing what makes music and speech similar or different in the brain could help design more effective rehabilitation programs.
The paper’s other authors were Xiangbin Teng of the Chinese University of Hong Kong, M. Florencia Assaneo of the National Autonomous University of Mexico (UNAM), and David Poeppel, professor of psychology at NYU and director of the Ernst Strüngmann Institute for Neuroscience in Frankfurt, Germany.
Funding: The research was supported by a grant from the National Institute on Deafness and Other Communication Disorders, part of the National Institutes of Health (F32DC018205), and by Leon Levy Fellowships in Neuroscience.
About this auditory neuroscience research news
Author: James Devitt
Source: NYU
Contact: James Devitt-NYU
Image: The image is credited to Neuroscience News
Original research: Open access.
“The human auditory system uses amplitude modulation to distinguish music from speech” by Andrew Chang et al. PLOS Biology
Abstract
The human auditory system uses amplitude modulation to distinguish music from speech
Music and speech are complex and distinct auditory signals that are both fundamental to the human experience. The mechanisms underlying each domain have been extensively investigated.
But how a sound is transformed into the percept of music or speech, and how simple the acoustic information needed to distinguish them might be, remain open questions.
Here we hypothesized that the amplitude modulation (AM) of a sound, an essential temporal acoustic feature that guides the auditory system across processing levels, is crucial for distinguishing between music and speech.
In particular, in contrast to paradigms that use naturalistic acoustic signals (which can be difficult to interpret), we used a noise interrogation approach to disentangle the auditory mechanism: if AM rate and regularity are critical for perceptually distinguishing music and speech, judgments of artificially noise-synthesized ambiguous audio signals should align with their AM parameters.
In 4 experiments (N = 335), signals with a higher AM peak frequency were often rated as speech, and those with a lower peak frequency as music. Interestingly, all listeners applied this principle consistently for speech judgments, but only musically sophisticated listeners did so for music judgments.
Furthermore, signals with more regular AM are judged as music over speech, and this feature is of greater importance for music judgment regardless of musical sophistication.
The data suggest that the auditory system may rely on a low-level acoustic property as fundamental as AM to distinguish music from speech, a simple principle that invites both neurophysiological and evolutionary experimentation and speculation.