Automated Transcript Poetry #1

Jul 25, 2017

“Everyone Safari”
by Y.A.T.*
I have wine
my name is Gonzales
and Mrs. Chandler sorry because everyone Safari
today my guests hara-hara-hara
I am you say how about you
And goofy and carefully deduce yourself
Hi I’m gonna terrorize some facilities
and with the famous Fiona solid water
so here are she here must I learn
drops a lot of clothes and still flows

We get a fair number of requests to incorporate automated transcripts as a feature of Transana. We understand that transcription can be a time-consuming and potentially expensive part of doing qualitative research with audio or video data. Many people want to skip what seems like a tedious typing task and jump right to coding and analysis!

As AI technology has improved, some services, such as YouTube, have begun to offer automatically generated transcripts for video and audio content. These transcripts could help people with hearing impairments, or people learning a foreign language, to better understand speech in a video that has not been professionally captioned. They may be helpful when generated from very clearly recorded audio of a single speaker with very good diction, but with the present state of the technology, results can vary widely.

Software developers continue to improve voice-recognition software, but we are a long way from having reliable software that can understand accented speech, tell apart the individual speakers in a conversation, or accurately transcribe the words of people who are talking over each other or speaking in a noisy environment. Now and in the near future, researchers who collect media data involving human speech and conversation will be transcribing that data by hand. And because repeatedly hearing the voices and seeing the expressions, gestures, and interactions of research subjects is the most direct, hands-on, and irreplaceable connection a researcher will ever have to their own data, we are OK with that.

We don’t have to look far to find examples of automatically transcribed speech that illustrate its limitations, and sometimes we are given the gift of garbled prose that nearly jumps off the page. We are pleased to bring you the first in a periodic series, “Automated Transcript Poetry.”

“Everyone Safari” * by YouTube Automatic Transcription | “The Actual Transcript of a Media File” by a human transcriber using Transana
I have wine. my name is Gonzales | F: [to camera] Hi everyone. My name is Fiona *****
and Mrs. Chandler sorry because everyone Safari | and this is Gender Stories, because everyone has a story.
today my guests hara-hara-hara | Today my guest is Hera. [to guest] Hera, how are you?
I am you say how about you | H: I am good thanks. How about you?
And goofy and carefully deduce yourself | F: I’m good, thank you. Please introduce yourself.
Hi I’m gonna terrorize some facilities | H: Hi! My name is Hera and I am from the Philippines.
and with the famous Fiona solid water | And I am here with the famous Fiona *****!
so here are she here must I learn | F: [to camera] So Hera, she is here in Bangkok, Thailand,
drops a lot of clothes and still flows | where the shops sell a lot of clothes and hormones.

Want to know more about why we are nowhere near being able to replace human transcribers?

Jesse Jarnow, “Why Our Crazy-Smart AI Still Sucks at Transcribing Speech”