We receive frequent inquiries about importing transcripts from automated online transcription services into Transana. There are several issues to consider if you are thinking about having your media transcribed through a service, whether automated or manual.
Online services typically offer at least a couple of formats for transcripts, often Word DOCx (*.docx) files and plain text (*.txt) files. Some may also offer Rich Text Format (*.rtf) files. I’ve found that services sometimes try to get a bit too fancy with their DOCx formatting, adding features (such as links to their websites) that end up not being compatible with Transana. One service created DOCx files of such complexity that every single word in the transcript included 24 (always identical) formatting specifiers, and every single space between words required another 24 specifiers.
As a result, it may take a bit of experimentation to find a formula that maximizes compatibility between a given transcription service and Transana. Try the DOCx file first. If that doesn’t work, try a less complicated format, such as RTF. Plain text files, at least in theory, should always work with Transana. (Use UTF-8 encoding, if offered a choice.) I found one site whose DOCx files Transana could not import, but if I opened the DOCx file in Word and selected the whole document, I could then use Copy and Paste to transfer the data to a blank Transana transcript in Edit mode.
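If a service hands you a plain text file in some other encoding, a short script can re-save it as UTF-8 before you import it. The following is only a minimal Python sketch. The function names, the file names in the usage note, and the list of encodings to try are all assumptions made for illustration, not anything Transana or a particular service prescribes.

```python
from pathlib import Path

# Encodings to try, in order. This list is an assumption; adjust it to
# whatever your transcription service actually produces.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")


def decode_transcript(raw: bytes) -> str:
    """Decode raw transcript bytes with the first encoding that fits."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode transcript with the candidate encodings")


def save_as_utf8(source: str, target: str) -> None:
    """Read a transcript in an unknown encoding and write it back as UTF-8."""
    text = decode_transcript(Path(source).read_bytes())
    # Writing bytes keeps the original line endings untouched.
    Path(target).write_bytes(text.encode("utf-8"))
```

For example, save_as_utf8("interview.txt", "interview_utf8.txt") would write a UTF-8 copy next to the original. Guessing an encoding is never fully reliable, so check the result for garbled quotation marks or accented characters before importing it.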
A second issue is that some transcription services include time stamp information in their transcripts. The ones I have seen offer time values rounded to the nearest second. That is practically useless to Transana, which operates at an accuracy of 1/30th of a second: a rounded value can be off by as much as half a second, or 15 of those units. At least one site makes each time value a hyperlink to an online version of the transcript linked to an online audio file. These hyperlinks were structured in a way that was incompatible with Transana and caused import failures. When I chose the option to leave the time stamp information out of the files, I had much better luck importing the transcripts. Leaving it out costs you little, as you will need to go through the transcripts from these services to add Transana time codes and proofread the transcripts anyway.
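If a service offers no option to leave the time stamps out, you may be able to strip them from a plain text export yourself. Here is a rough Python sketch of that idea. It assumes time stamps appear at the start of a line in a form such as 00:01:23, [00:01:23], or (1:23). That layout is a guess, as is the function name, and real services may format their time stamps differently.

```python
import re

# Matches a time stamp at the start of a line, optionally wrapped in
# brackets or parentheses: 1:23, 01:23, 00:01:23, [00:01:23], (1:23).
# The pattern is an assumption about how a service might format them.
LEADING_TIMESTAMP = re.compile(
    r"^[ \t]*[\[(]?\d{1,2}:\d{2}(?::\d{2})?(?!\d)[\])]?[ \t]*",
    re.MULTILINE,
)


def strip_leading_timestamps(text: str) -> str:
    """Remove a time stamp from the start of each line, if one is there."""
    return LEADING_TIMESTAMP.sub("", text)


sample = (
    "[00:00:05] Interviewer: Hello.\n"
    "00:12 Participant: Hi, we met at 10:30.\n"
)
cleaned = strip_leading_timestamps(sample)
print(cleaned)
```

Only time values at the start of a line are removed, so a time mentioned in the conversation itself (the "10:30" in the sample) is left alone. Compare the cleaned text against the original before importing it, in case the pattern removed something it should not have.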
If you are generating a transcript using YouTube’s automated transcript generator as described in a previous blog post, you’ll want to turn off the auto-time code feature before you create the transcript, then copy and paste it into a blank Transana transcript.
With a little patience and experimentation, I’ve always been successful in transferring files from a transcription service into Transana.