Transcribing Spoken Data

In the study of English Language or linguistics, we often look at how people talk and interact with each other. This means that the data we collect is often spoken language, which we call spoken data. To get spoken data into a form we can use and analyse, we have to transcribe it.


    To transcribe something is to put it into a written or printed form.

    Once we have transcribed spoken data, we then have a transcription that we can use to analyse the spoken data.

    A transcription (or transcript) is a written or printed version of something.

    In this article, we’re going to look at why we transcribe spoken data, how we transcribe, how the International Phonetic Alphabet is used in transcription, and then how to cite speech transcription.

    Why do we Transcribe Spoken Data?

    Spoken language is fleeting: unless it is recorded, once we’ve heard it, we generally can’t hear it again.

    Spoken data is simply language data that represents how something was actually said. It differs from written data in that it usually shows the more informal language features that are rarely present in written language.

    To collect spoken data that we can listen to again, we must record it. This can be done either as an audio recording or as an audio-visual recording (video) where we can then listen to the spoken data as many times as we need.

    Although audio recordings are important when analysing spoken data, they are not always the most useful way to store data: a recording is hard to analyse directly, and it can be difficult to find a specific piece of data quickly.

    We transcribe spoken data so that we have a written form of it. This makes it much easier to analyse what has been said and how. Looking at the content of the spoken data (such as topics, words and interruptions) can be useful in areas of linguistics like sociolinguistics where we may need to analyse and compare the language of different speakers.

    Language use can vary among speakers, and these differences can be related to social factors such as age, class, gender, occupation, ethnicity and region.

    Another reason why we transcribe spoken data is to look at a person’s accent and pronunciation features. This is done by transcribing data using the International Phonetic Alphabet, which we’ll look at in a bit more detail later. Doing this allows greater and more specific speech analysis in fields such as phonetics and phonology.

    Accent and pronunciation features are the aspects of spoken language that can differ between speakers. For example, the vowel in ‘bath’ is pronounced differently across British accents: a short /a/ in ‘bath’ is a feature of northern accents, whereas southern accents typically have a long /ɑː/.

    Fig. 1 - Transcribing data involves writing it out.

    Transcription of data in research

    Before transcribing, you first need to collect the data. This is most often done by recording spoken language, either as audio or as video. Having a video may be useful for looking at things such as NVC alongside a person's speech.

    NVC stands for non-verbal communication and is the name given to any sort of gesture, movement or facial expression used to communicate something. NVC is often used in conjunction with verbal communication (speech) but can also be used on its own.

    When recording and transcribing data, certain factors need to be considered. These are ethics and the observer’s paradox.


    Ethics

    In relation to ethics, we need to think about what the morally right practice is for us as researchers. As spoken language is produced by an individual and is unique to that individual, you need their permission to record them.

    If you don’t ask permission before recording someone, it could be considered a breach of that person’s privacy. Every study that requires spoken data has to first go through ethical considerations and make sure that permission has been asked for where it is needed.

    The observer’s paradox

    The observer’s paradox is the name given to the problem that arises when trying to record natural spoken language. Most natural speech occurs when the speakers are completely at ease and talking casually amongst themselves.

    When recording data though, there is usually an observer (the person recording the data) or at the very least a recording device. Due to ethical considerations, the speakers will also know that they are being recorded. As much as people may try to speak naturally, there is always an element of being a bit on edge when you know you’re being recorded or listened to. This may cause the speaker to either consciously or subconsciously alter how they speak.

    How to overcome observer’s paradox

    When collecting data, you can take certain steps to reduce the effect of the observer’s paradox. One option is to ask for permission to record someone’s speech in advance and then record them when they’re not expecting it. With this method, you’ll have to let them listen to the recording before you use it as data, to make sure they’re happy for you to use it.

    Another way to try and sidestep the observer’s paradox is to let people know that you are recording them and then lead the conversation through some casual topics before you get to the conversation you want to record.

    By doing this, you’ll allow the speakers to get accustomed to being recorded and settle into speaking more naturally by the time it gets to the data you need. This will hopefully encourage more natural speech.

    Transcribing Data

    Before you start writing out your data into transcript form, you’ll need to write a sentence or two outlining some basic context. This will need to include:

    • Where and when the interaction is taking place

    • Who the speakers are

    • Any contextual information relevant to your study, for example, the gender of the speakers if you’re looking at language and gender

    When writing out a transcript, you’ll first need to listen to your recording and write out what was said. It’s a good idea to listen to the recording a few times to make sure you write what you actually hear and not what you expect to hear.

    It’s easy to mishear and automatically correct what you hear when you write it down. You’ve got to be careful not to do this when transcribing as you want a true representation of the spoken data.

    If something is said that is unusual or of note (this will depend on what you’re looking for), it’s a good idea to annotate this on your transcript and to listen through again to see if it appears anywhere else as well.

    Features of communication that can be shown in transcriptions:

    • False start
      Definition: Where someone starts speaking, pauses, and starts again.
      As it would be shown in a transcript: John: I don't think... I didn't really see him.

    • Micro-pauses
      Definition: A pause in speech that is less than a tenth of a second.
      As it would be shown in a transcript: (.)

    • Pause
      Definition: A pause in speech longer than a tenth of a second, showing the length of the pause in seconds.
      As it would be shown in a transcript: (0.6)

    • Interruptions
      Definition: Where one speaker interrupts another. Two slashes indicate at what point the speaker interrupts.
      As it would be shown in a transcript:
        John: I did see that the game // was on over the weekend.
        Peter: // The game was amazing!

    • Simultaneous speech
      Definition: This is where two speakers are speaking at the same time, indicated with lines on either side of simultaneous speech.
      As it would be shown in a transcript:
        John: Did you see the game? It was amazing, | there was a goal right at the end of the second half! |
        Peter: | It was so close! I couldn't believe they got in there so quick with that goal. |

    • Repetition
      Definition: Where the same word or utterance is repeated.
      As it would be shown in a transcript: John: I did see that. I did see that yeah.

    • Stutter
      Definition: Where a speaker struggles to keep a flow in speech.
      As it would be shown in a transcript: Tom: D d d did you see the g g game?

    • Filler
      Definition: A small word inserted by a speaker in-between utterances.
      As it would be shown in a transcript: John: I erm, did see uh, that it like, was really sudden.
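
    If you keep your transcripts as plain text, the notation above can also be searched automatically. The following is only a minimal sketch in Python, assuming the symbols are typed exactly as in the table; the function name `annotate` and the short list of fillers are illustrative assumptions, not part of any standard. Words such as ‘like’ are left out of the filler list because they are often ordinary words too.

```python
import re

# Patterns for the notation described above (names are illustrative).
MICRO_PAUSE = re.compile(r"\(\.\)")                # (.)
TIMED_PAUSE = re.compile(r"\((\d+(?:\.\d+)?)\)")   # (0.6) or (2)
INTERRUPTION = re.compile(r"//")                   # //
OVERLAP = re.compile(r"\|([^|]*)\|")               # | simultaneous speech |
FILLERS = {"erm", "er", "uh", "um"}                # assumed filler list


def annotate(utterance: str) -> dict:
    """Return the transcription features found in one utterance."""
    words = re.findall(r"[a-z']+", utterance.lower())
    return {
        "micro_pauses": len(MICRO_PAUSE.findall(utterance)),
        "timed_pauses": [float(s) for s in TIMED_PAUSE.findall(utterance)],
        "interrupted": bool(INTERRUPTION.search(utterance)),
        "overlaps": [s.strip() for s in OVERLAP.findall(utterance)],
        "fillers": [w for w in words if w in FILLERS],
    }


features = annotate("I erm (.) did see uh (0.6) that the game // was on")
print(features)
```

    A tool like this can only find what you have already marked up. Deciding where the pauses, fillers and interruptions are still depends on listening carefully to the recording.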

    Specific speech sounds, such as phonemes, can be noted using the International Phonetic Alphabet.

    What is the International Phonetic Alphabet?

    The International Phonetic Alphabet (IPA) was developed in the 19th century as an internationally recognised system of phonetic symbols. Each symbol corresponds to one specific speech sound, removing the confusion caused by having multiple sounds represented by the same letters.

    In English, the letter ‘c’ can sound like either ‘k’ or ‘s’, as in the words ‘cat’ and ‘centipede’. The IPA helps us differentiate between these sounds as there is a different symbol for each one, as in /kæt/ for cat and /sɛntɪpiːd/ for centipede.

    You can have a look at all of the different symbols in the IPA chart (Fig. 2).


    Fig. 2 - IPA Chart.

    How to use the IPA when Transcribing Spoken Data

    Using IPA in transcribing spoken data can make your data much more accurate and can be especially useful if you're looking at accent features such as vowel pronunciation in your spoken data. In A-level English language, you won’t be expected to transcribe whole extracts into IPA, but you will be expected to have a basic understanding of it.

    Let's look at an example of how the IPA can be used to show pronunciation features.

    A glottal stop is a closure of the glottis (the space between the vocal folds in the throat), which briefly stops the airflow. In certain languages and dialects, glottal stops often replace consonants in the middle or at the end of words; in English, this is most commonly /t/. In the IPA, the glottal stop is represented by the symbol /ʔ/.

    Let's look at the glottal stop that appears in the word hat in certain dialects.

    If the ‘t’ is pronounced, it would be written as /hat/.

    If the ‘t’ isn’t pronounced and is replaced with a glottal stop, it would be written as /haʔ/.

    When you write something using the IPA, make sure to put slashes (/ /) on either side of it to indicate your use of the IPA. For example, /kat/ for ‘cat,’ /waʊ/ for ‘wow,’ and /beɪð/ for ‘bathe.’ Slashes are used for phonemic transcription (otherwise known as broad transcription), which is language-specific and records just enough detail to show how words differ from one another in a language. Square brackets [ ] are used for narrow transcription, which records as much detail of the sound as possible.
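
    If you type your transcriptions on a computer, it helps to know that IPA symbols are ordinary Unicode characters. The sketch below, written in Python, is only an illustration: the helper names `broad`, `narrow` and `glottalise_final_t` are hypothetical, and the replacement rule is a deliberate simplification that only handles a ‘t’ at the very end of a word.

```python
import unicodedata

GLOTTAL_STOP = "\u0294"  # the IPA glottal stop symbol


def broad(symbols: str) -> str:
    """Wrap symbols in slashes to mark a phonemic (broad) transcription."""
    return f"/{symbols}/"


def narrow(symbols: str) -> str:
    """Wrap symbols in square brackets to mark a narrow transcription."""
    return f"[{symbols}]"


def glottalise_final_t(symbols: str) -> str:
    """Replace a word-final 't' with a glottal stop (simplified, hypothetical rule)."""
    if symbols.endswith("t"):
        return symbols[:-1] + GLOTTAL_STOP
    return symbols


print(unicodedata.name(GLOTTAL_STOP))     # LATIN LETTER GLOTTAL STOP
print(broad("hat"))                       # /hat/
print(broad(glottalise_final_t("hat")))   # /haʔ/
print(narrow(glottalise_final_t("hat")))  # [haʔ]
```

    Keeping the bracket choice in separate helpers mirrors the distinction made above: the symbols stay the same, and the brackets tell the reader how much detail the transcription claims to record.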

    In the IPA chart, there are also diacritics and suprasegmentals. These are the small marks placed next to, under, or on top of vowel or consonant symbols. Diacritics add detail about how a sound is articulated, while suprasegmentals give much greater information about the prosodic features of speech.

    Prosodic features are the extra elements of speech sound, such as tone, intonation, rhythm, and stress.

    Suprasegmentals and diacritics can be used to show stress, syllable breaks and the linking of speech, so that you can represent in written form exactly how something has been said. When adding diacritics and suprasegmentals to your transcription, you need to use square brackets around the transcribed speech to show that it's narrow transcription.

    Transcript example

    This transcript is an extract from a recorded conversation between two friends (Polly and Laura) who are planning a trip. You can spot some of the features from the table earlier.

    1  Polly: Well I was thinking that we could all get the train together.
    2  Laura: (0.5) Yeah… Yeah well I was going to say I could drive some of (.) four
    3         of us.
    4  Polly: Oh yeah (2) Well how about (.) | how about girls | in the car and boys
    5         on the train.
    6  Laura: | How about we |
    7         Yeah that sounds okay (1) We’ll have to //
    8  Polly: // I mean (.) we’ll have to see (.) Like we’ll have to ask the boys what
    9         they think
    10 Laura: Yeah yeah

    What are we looking at in this example?

    • Line 1 is an example of an utterance without any notable speech features.

    • In line 2, we can see that Laura took a pause of half a second before she started speaking, and then took another micro-pause later on in her utterance.

    • In line 4, Polly pauses for two seconds and then we see an example of simultaneous speech. In this simultaneous speech, Polly on line 4 says "how about girls" while Laura on line 6 says "how about we." As the lines are around those two sections of utterances, these are the only two sections that are spoken simultaneously.

    • In lines 7 and 8, we can see an interruption where the double slashes (//) are. Here, Polly interrupts Laura and then carries on speaking.
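
    The same reading of the extract can be sketched in code. The example below is a rough illustration in Python, assuming the transcript is typed as plain text with one numbered line per row; the names `parse`, `LINE` and `TIMED_PAUSE` are hypothetical. The extract is retyped inside the code in simplified form. A line with no speaker name is treated as a continuation of the previous speaker's utterance.

```python
import re

LINE = re.compile(r"^(\d+)\s+(?:([A-Z][a-z]+):\s*)?(.*)$")
TIMED_PAUSE = re.compile(r"\((\d+(?:\.\d+)?)\)")

TRANSCRIPT = """\
1 Polly: Well I was thinking that we could all get the train together.
2 Laura: (0.5) Yeah… Yeah well I was going to say I could drive some of (.) four
3 of us.
4 Polly: Oh yeah (2) Well how about (.) | how about girls | in the car and boys
5 on the train.
6 Laura: | How about we |
7 Yeah that sounds okay (1) We’ll have to //
8 Polly: // I mean (.) we’ll have to see (.) Like we’ll have to ask the boys what
9 they think
10 Laura: Yeah yeah
"""


def parse(transcript: str) -> list[tuple[int, str, str]]:
    """Return (line number, speaker, text); unlabelled lines keep the last speaker."""
    rows, speaker = [], ""
    for raw in transcript.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue  # skip anything that is not a numbered transcript line
        number, name, text = match.groups()
        speaker = name or speaker
        rows.append((int(number), speaker, text))
    return rows


rows = parse(TRANSCRIPT)
pauses = [
    (number, who, float(length))
    for number, who, text in rows
    for length in TIMED_PAUSE.findall(text)
]
for number, who, seconds in pauses:
    print(f"Line {number}: {who} pauses for {seconds} seconds")
```

    Storing the line number and the speaker with every line makes it straightforward to refer back to a specific line later in your analysis.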

    An utterance is a spoken sound, word or sentence. ‘Utterance’ is often used in relation to transcription instead of ‘sentence.’

    Citing speech transcriptions

    When you first reference the transcript you’re talking about in your work, it’s usually good to cite the year and to give an overview of the general context, saying briefly who the speakers are and where the conversation is taking place (providing it’s relevant to what you’re discussing). From then on, it’s usually fine to reference a line number (as all transcripts should have numbered lines) and also state who is speaking to make it clear for your reader.

    Quoting transcriptions

    When quoting a short utterance or a word, simply put it in quote marks as you would when quoting a book.

    In line 4, Polly pauses for 2 seconds, saying "Oh yeah (2) Well how about."

    When you are explaining something with the help of the IPA, make sure to put that part in slashes (/ /).

    When quoting multiple lines, do it as a separate section underneath your paragraph and then do your explanation underneath, making sure to still reference specific line numbers.

    ----- Paragraph explaining your point -----

    Quoted lines from the transcript

    ----- Paragraph discussing the quoted text -----
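
    If your transcript is stored digitally, the quoted lines can be pulled out with their line numbers intact. This is a minimal sketch in Python, assuming the transcript is held as a dictionary of line numbers to text; the function name `quote_lines` is hypothetical.

```python
def quote_lines(lines: dict[int, str], first: int, last: int) -> str:
    """Format transcript lines first..last (inclusive) as an indented quotation."""
    if first > last:
        raise ValueError("first must not be greater than last")
    missing = [n for n in range(first, last + 1) if n not in lines]
    if missing:
        raise KeyError(f"no such transcript line(s): {missing}")
    width = len(str(last))  # right-align the numbers so the text lines up
    return "\n".join(
        f"    {n:>{width}} {lines[n]}" for n in range(first, last + 1)
    )


# The first three lines of the example extract, keyed by line number.
extract = {
    1: "Polly: Well I was thinking that we could all get the train together.",
    2: "Laura: (0.5) Yeah… Yeah well I was going to say I could drive some of (.) four",
    3: "of us.",
}

print("Laura hesitates before offering to drive (lines 2-3):")
print()
print(quote_lines(extract, 2, 3))
```

    Keeping the original line numbers in the quotation means your discussion underneath can still reference specific lines.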

    Transcribing Spoken Data - Key Takeaways

    • A transcription is a written or printed version of something.

    • When recording data for transcription, we have to consider ethics and the observer’s paradox.

    • Transcripts can be used to show features of spoken language such as interruptions, pauses and simultaneous speech.

    • The International Phonetic Alphabet (IPA) can be used to represent specific sounds of speech.

    • When citing speech transcripts, you can either quote a short utterance or a longer extract.


    1. Fig. 2: IPA chart 2020 by International Phonetic Association, licensed under CC BY-SA 3.0.

    Frequently Asked Questions about Transcribing Spoken Data

    What is transcription?

    Transcription is the process of putting spoken data into a written or printed form so that it can be analysed.

    How to cite a transcription of speech?

    When first introducing the transcript, give the year and some basic context. Then (throughout your discussion and analysis), reference the line number for what you're discussing. It's also a good idea to state who is speaking for greater clarity in your explanation.

    How do you transcribe a speech?

    To transcribe speech, you need to record it, then write out what was said in the recording. When you have done this, you need to make clear where any features such as interruptions, pauses and simultaneous speech are.

    How should a transcript look?

    A transcript should have a sentence or two giving context at the beginning. Then the text should be arranged with a new line for each speaker with the speakers' names down the left of the page. Every line should be numbered.

    What should be included in a transcript?

    • Context of the interaction including anything that's relevant to your area of research.
    • Line numbers.
    • Speech features such as pauses, interruptions, simultaneous speech, fillers and false starts.

    StudySmarter Editorial Team

    Team English Teachers
