Definition of a Spectrogram
As valuable as spectrograms are, they can initially be a bit overwhelming. To understand what's going on in a spectrogram, you need some background information.
A spectrogram is a graph of a sound wave's component frequencies over time. Component frequencies are the range of frequencies present in the sound.
To clarify, when you hear a single sound, you're really hearing lots of different frequencies stacked on top of one another. These stacked frequencies are the wave's components, and the lowest component is the pitch you hear (also called the fundamental frequency, or F0).
A spectrogram shows time on the x-axis and frequency on the y-axis. That means the bottom of the spectrogram is the lowest frequency, and the top is the highest frequency. Moving left to right on the spectrogram represents moving forward in time.
A spectrogram also shows a third dimension: amplitude (loudness). Differences in amplitude are shown as differences in color or darkness on the spectrogram. The darker lines are frequencies with higher amplitude, while the lighter areas are frequencies with lower amplitude.
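To make the three dimensions concrete, here is a minimal sketch that builds a tiny spectrogram from scratch using only Python's standard library. The test signal (a 500 Hz tone followed by a 1500 Hz tone), the 8000 Hz sampling rate, and the function names are all assumptions chosen for illustration; real analysis software uses faster and more refined methods.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude of each frequency bin from 0 Hz up to the Nyquist frequency."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(samples, sample_rate, window_size, hop):
    """Return (times, freqs, rows), where rows[t][f] is the amplitude
    at time times[t] (x-axis) and frequency freqs[f] (y-axis)."""
    # A Hann window tapers the edges of each slice of sound.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (window_size - 1))
              for i in range(window_size)]
    starts = range(0, len(samples) - window_size + 1, hop)
    rows = [dft_magnitudes([samples[s + i] * window[i] for i in range(window_size)])
            for s in starts]
    times = [(s + window_size / 2) / sample_rate for s in starts]
    freqs = [k * sample_rate / window_size for k in range(window_size // 2 + 1)]
    return times, freqs, rows

# Hypothetical test signal: 0.05 s of a 500 Hz tone, then 0.05 s of a 1500 Hz tone.
sample_rate = 8000
samples = [math.sin(2 * math.pi * (500 if i < 400 else 1500) * i / sample_rate)
           for i in range(800)]

times, freqs, rows = spectrogram(samples, sample_rate, window_size=64, hop=32)

# The "darkest" (highest-amplitude) frequency in the first and last time slices.
first_peak = freqs[rows[0].index(max(rows[0]))]
last_peak = freqs[rows[-1].index(max(rows[-1]))]
```

Plotting `rows` with time running left to right, frequency running bottom to top, and larger values drawn darker would give the kind of picture described above: a dark band low on the left that jumps to a higher band on the right.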
Spectrogram Vs. Spectrum
The word spectrogram comes from the word spectrum.
A spectrum is a plot of a wave's components at a given point in time.
You can think of a spectrum as a single snapshot of a spectrogram. If you want to think about it another way, a spectrogram consists of lots and lots of spectra lined up next to each other. Each large "spike" visible on the spectrum is one of the darker horizontal lines visible on the spectrogram.
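As a rough illustration of a single snapshot, the sketch below computes the spectrum of a made-up complex tone and lists its spikes. The component frequencies and amplitudes, the 8000 Hz sampling rate, and the 0.1 threshold are assumptions for the example, not values from any real recording.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, assumed for illustration
N = 256             # number of samples in the snapshot (32 ms)

# Hypothetical complex tone: a 125 Hz fundamental plus two weaker components.
components = {125: 1.0, 250: 0.5, 375: 0.25}  # frequency in Hz -> amplitude
wave = [
    sum(amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
        for freq, amp in components.items())
    for i in range(N)
]

def spectrum(samples, sample_rate):
    """Return (frequency, amplitude) pairs for one slice of sound."""
    n = len(samples)
    pairs = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # Scaling by 2/n recovers sine amplitudes for bins between 0 Hz and Nyquist.
        pairs.append((k * sample_rate / n, abs(coeff) * 2 / n))
    return pairs

# Keep only the large spikes; everything else is numerical noise here.
spikes = [freq for freq, amp in spectrum(wave, SAMPLE_RATE) if amp > 0.1]
f0 = min(spikes)  # the lowest component is the fundamental frequency
```

Here `spikes` holds the three component frequencies and `f0` is the lowest of them. Repeating this calculation on one short slice after another and lining the results up side by side is, in essence, how a spectrogram is assembled.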
Spectrogram Examples
There are two types of spectrograms: wide-band spectrograms and narrow-band spectrograms.
Wide-Band Spectrogram
The most common type of spectrogram used for analysis is a wide-band spectrogram. This kind of spectrogram looks "fuzzier," with lots of vertical lines. In speech, these vertical lines represent glottal pulses: the repeated opening and closing of the glottis. These glottal pulses represent voicing in speech sounds. A wide-band spectrogram helps you to see how a sound changes over time.
To view a wide-band spectrogram in your analysis software, set the "window length" to 0.005 s (Boersma & Weenink, 2022).
Narrow-Band Spectrogram
A narrow-band spectrogram looks like a series of thin horizontal stripes, sort of like a filet of fish. These thin stripes are the wave's components. On a narrow-band spectrogram, it's easy to see the differences in amplitude between the individual components.
To view a narrow-band spectrogram, set the "window length" to 0.05 s, or even 0.5 s (Boersma & Weenink, 2022).
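The link between window length and the two spectrogram types can be checked with a little arithmetic. The sketch below assumes a 16,000 Hz sampling rate (not stated in the text) and uses the spacing between frequency bins of a plain discrete Fourier transform as a rough stand-in for frequency resolution. The exact bandwidth in any particular program also depends on the window shape it uses.

```python
def window_tradeoff(window_length_s, sample_rate_hz):
    """Return (samples per window, spacing between frequency bins in Hz)."""
    samples_per_window = round(window_length_s * sample_rate_hz)
    bin_spacing_hz = sample_rate_hz / samples_per_window
    return samples_per_window, bin_spacing_hz

SAMPLE_RATE = 16000  # Hz, assumed for illustration

wide_band = window_tradeoff(0.005, SAMPLE_RATE)   # short window
narrow_band = window_tradeoff(0.05, SAMPLE_RATE)  # long window

print(wide_band)    # (80, 200.0)
print(narrow_band)  # (800, 20.0)
```

A voice with an F0 of about 100 Hz has components roughly 100 Hz apart. With 200 Hz between bins, those components blur together, but each short window captures a brief moment, so individual glottal pulses show up. With 20 Hz between bins, the components separate into thin stripes, at the cost of smearing events together in time.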
Spectrogram Analysis
It's possible to estimate what a person is saying just by looking at the spectrogram of the utterance. You'll get some hands-on practice with this in a bit. In the meantime, here are some signals that linguists look for when analyzing a spectrogram.
- When you see several dark horizontal stripes across the spectrogram, you're probably looking at a vowel. On a wide-band spectrogram, you'll also see vertical lines representing glottal pulses during a vowel.
- When the spectrogram is lighter and doesn't show clear stripes, you're probably looking at a consonant.
- Random and "fuzzy" sections of a spectrogram often indicate fricatives, like [f, v, s, z, ʃ, ʒ, h].
- A dark line at the bottom of the spectrogram during a consonant indicates voicing. You'll see this in voiced consonants like [b, d, ɡ, m, n, ŋ, l, v, z]. If you don't see this line, you're probably looking at a voiceless consonant like [p, t, k, f, s, θ, ʃ].
- During a consonant, a very dark area at the top of the spectrogram likely indicates a sibilant. Sibilants are s-like sounds with loud noise at high frequencies, like [s, z, ʃ, ʒ].
- When part of the spectrogram resembles a vowel but contains fewer, lighter horizontal stripes, you might be looking at an approximant like [w, ɹ, l, j].
These signals don't tell you everything about an utterance, but they can help you make educated guesses.
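One way to keep these signals straight is to write them down as a small set of rules. The sketch below is only a hypothetical encoding of the list above: the `Segment` fields, the `guess_sound_class` function, and the order in which the rules are checked are assumptions made for illustration, and the observations are entered by hand rather than measured from audio.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """What you observe by eye in one stretch of the spectrogram."""
    dark_stripes: bool = False   # several dark horizontal stripes
    faint_stripes: bool = False  # vowel-like, but fewer and lighter stripes
    noisy: bool = False          # random, "fuzzy" texture
    dark_top: bool = False       # very dark area at high frequencies
    voicing_bar: bool = False    # dark line at the bottom

def guess_sound_class(seg):
    """Return an educated guess, not a definitive label."""
    if seg.dark_stripes:
        return "vowel"
    if seg.faint_stripes:
        return "approximant"
    voicing = "voiced" if seg.voicing_bar else "voiceless"
    if seg.noisy and seg.dark_top:
        return f"{voicing} sibilant"
    if seg.noisy:
        return f"{voicing} fricative"
    return f"{voicing} consonant"

print(guess_sound_class(Segment(dark_stripes=True)))            # vowel
print(guess_sound_class(Segment(noisy=True, dark_top=True)))    # voiceless sibilant
print(guess_sound_class(Segment(noisy=True, voicing_bar=True))) # voiced fricative
```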
Vowels on a Spectrogram
Remember those dark horizontal stripes that you see on the spectrogram during vowels? Those stripes are the vowel's formants. Relative formant values help you determine the vowel's place of articulation, or the position of the vocal tract when producing the vowel. The most relevant formants for linguistic analysis are the first three formants: F1, F2, and F3.
The lowest formant, F1, is inversely related to vowel height: the lower the F1, the higher the vowel. F1 is the dark line closest to the bottom of the spectrogram. High vowels are sounds like [i], as in bee or sheep, or [u], as in soup or blue. These vowels have the lowest F1 values. Low vowels are sounds like [a], as in box or party. These vowels have the highest F1 values.
Vowel height refers to how high the tongue is in the mouth when producing a vowel. If you pay attention to the position of your mouth, you can feel that your tongue is higher when you say sheep than when you say shop.
The next formant, F2, tells you how far back a vowel is. The lower the F2, the further back the vowel. The frontmost vowels are sounds like [i] and [e], as in plate. These have the highest F2. The back vowels are sounds like [u] and [o], as in pole or order. These have the lowest F2 value.
Backness refers to the horizontal position of the tongue when producing a vowel. If you say the word boot, you'll notice that your tongue is pushed toward the back of your mouth and that the back part of your tongue carries the most tension. Compare that to the word beet, where your tongue is pushed forward and the front part of your tongue is tense.
This table summarizes the relative F1 and F2 values for the five vowel sounds present in most languages.
Vowel | F1 Value | F2 Value |
i (high front) | low | high |
e (mid front) | mid | high |
a (low central) | high | mid |
o (mid back) | mid | low |
u (high back) | low | low |
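The table can also be read in reverse: given rough F1 and F2 levels, look up the most likely vowel. The sketch below simply restates the table as a dictionary. The names `VOWELS_BY_FORMANTS` and `guess_vowel` are made up for this example, and the levels are relative labels, not measurements in hertz.

```python
# Keys are (relative F1, relative F2); values are (vowel, description).
VOWELS_BY_FORMANTS = {
    ("low", "high"): ("i", "high front"),
    ("mid", "high"): ("e", "mid front"),
    ("high", "mid"): ("a", "low central"),
    ("mid", "low"): ("o", "mid back"),
    ("low", "low"): ("u", "high back"),
}

def guess_vowel(f1_level, f2_level):
    """Return (vowel, description), or None if the combination is not in the table."""
    return VOWELS_BY_FORMANTS.get((f1_level, f2_level))

print(guess_vowel("low", "high"))   # ('i', 'high front')
print(guess_vowel("high", "mid"))   # ('a', 'low central')
print(guess_vowel("high", "high"))  # None
```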
The next highest formant is F3. F3 doesn't tell you much about most vowels, but it plays a unique role in r-colored vowels. R-sounds, as in the General American pronunciation of bird, have a very low F3 value compared to other sounds. This makes these sounds easy to spot on a spectrogram.
You might notice that a fourth formant line is visible on a spectrogram. Higher formants, including F4, F5, etc., appear in speech sounds. However, these formants don't reveal as much about speech sounds as F1-F3 and are not commonly considered in linguistic analysis.
Lastly, formant transitions can help you identify the place of articulation of neighboring consonants. A vowel's formants shift as the speaker moves out of the preceding consonant and into the following one. The direction of these formant changes can help you determine where in the vocal tract the consonants are articulated. For example, moving from a vowel into a [k] sound would result in a rising F2 and a falling F3 (this is called a "velar pinch" on a spectrogram).
Spectrogram Reading Practice
Now for some practice analyzing a spectrogram. The example spectrograms in this explanation have all visualized the same utterance. Zoom in on the first quarter of the utterance: what do you see?
- This spectrogram begins with a long segment containing just a voicing bar. This indicates a voiced consonant that can be sustained for a long time. There is also no random loud noise, so this is probably not a fricative. Some likely candidates are [m], [n], or [l].
- The next segment looks louder, based on the large dark sections. You can also see visible glottal pulses and formants. This looks like a vowel. F1 looks fairly low, and F2 is very high compared to F1. This is probably a relatively high front vowel.
- The next segment still has visible formants and glottal pulses but is much quieter. This hints at an approximant. F2 and F3 are very close together in this utterance, but it's clear that both dip down to a low point in this segment. This low F3 is characteristic of an r-sound.
- The last segment looks a lot like the second segment. This suggests that this is a vowel with a similar place of articulation to the previous vowel.
You've made some educated guesses. What word are you looking at here? As it turns out, this spectrogram shows a speaker saying the word Mary!
Try repeating this analysis on the rest of the utterance for some extra practice! You can see the answer below.
This spectrogram shows a speaker saying Mary loves raspberries!
Spectrogram - Key takeaways
- A spectrogram is a graph of a sound wave's component frequencies over time. Component frequencies are the range of frequencies present in the sound.
- There are two types of spectrograms: wide-band spectrograms and narrow-band spectrograms.
- A wide-band spectrogram helps you see how a sound changes over time, while a narrow-band spectrogram helps you see the differences in amplitude between components.
- The dark horizontal stripes on a spectrogram represent a vowel's formants.
- The signals visible on a spectrogram don't tell you everything about an utterance, but they can help you make educated guesses.
References
- Boersma, Paul & Weenink, David (2022). Praat: doing phonetics by computer [Computer program]. Version 6.2.23, retrieved 8 October 2022 from http://www.praat.org/
Frequently Asked Questions about Spectrogram
What is a spectrogram for speech?
A spectrogram is a graph of a sound wave's component frequencies over time. Component frequencies are the range of frequencies present in the sound.
What is the difference between a spectrum and a spectrogram?
A spectrum is a plot of a wave's components at a given point in time. You can think of a spectrum as a single snapshot of a spectrogram.
What are the types of spectrograms?
There are two types of spectrograms: wide-band spectrograms and narrow-band spectrograms. A wide-band spectrogram helps you see how a sound changes over time, while a narrow-band spectrogram helps you see the differences in amplitude between components.
How do you identify place of articulation on a spectrogram?
The vowel formants visible on a spectrogram help you see the place of articulation of vowels. Formant transitions between vowels and neighboring consonants help you see the place of articulation of those consonants.
Why are spectrograms useful?
Spectrograms are useful for linguistic analysis because they allow you to see multiple speech signals at once. For example, you can see component frequencies, glottal pulses, voicing, vowel formants, and place of articulation all on a single spectrogram.