PREPARING TEXT FOR ANALYSIS

Because text data can come in so many different forms and formats, you may need to prepare your text documents for import into Transana. At this time, Transana can import text in the following formats:

Format                                                File Extension

Microsoft Word .docx file                             *.docx
Rich Text Format                                      *.rtf
Transana’s XML Format                                 *.xml
Plain Text Format (encoded using Latin-1 or UTF-8)    *.txt
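If a plain text file was saved in an encoding other than Latin-1 or UTF-8, it can be re-encoded before import. The following is a minimal Python sketch, not part of Transana; the function names `to_utf8` and `convert_file` are hypothetical, and it assumes you already know the encoding the file was saved in (for example, "utf-16").

```python
from pathlib import Path


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known source encoding and re-encode them as UTF-8."""
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


def convert_file(src: str, dst: str, source_encoding: str) -> None:
    """Read src in source_encoding and write a UTF-8 copy to dst.

    Both paths are supplied by the caller; the original file is left unchanged.
    """
    Path(dst).write_bytes(to_utf8(Path(src).read_bytes(), source_encoding))
```

For example, `convert_file("interview.txt", "interview_utf8.txt", "utf-16")` would write a UTF-8 copy of a file that had been saved as UTF-16. The file names here are placeholders.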

If you have data in Portable Document Format (*.pdf), please see the Tutorial sections for PDF data. PDF data places significant emphasis on formatting and layout, so it requires different handling than plain text documents.

If you have text in a different format that you want to import into Transana, you will need to convert your source data. The easiest way to do this is to open the original document in your word processor, then choose File > Save As. You will see an option labelled “File type,” “Save as type,” or something similar. Select *.docx or *.rtf in this field to save your text data as a Word document or Rich Text Format file that you can import into Transana.

Transana does not currently support the import of tables in documents. If your document contains tables, you will need to convert each table to plain text, with tabs separating the columns, for that data to be imported correctly into Transana.
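If the table contents are available outside the word processor (for example, exported from a spreadsheet), the conversion to tab-separated text can be sketched in Python. This is an illustration only, not a Transana feature; the function name `table_to_tabbed_text` is hypothetical, and it assumes the table has already been read into a list of rows, each row being a list of cell values.

```python
def table_to_tabbed_text(rows):
    """Turn rows of cells into text with one line per row and tabs between cells.

    Whitespace inside a cell (including tabs and line breaks) is collapsed to
    single spaces so it cannot be mistaken for a column or row boundary.
    """
    lines = []
    for row in rows:
        cells = [" ".join(str(cell).split()) for cell in row]
        lines.append("\t".join(cells))
    return "\n".join(lines)
```

The resulting text can be pasted into the document in place of the original table before the document is saved as *.docx, *.rtf, or *.txt.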

Advanced Topic: 

If you want to transfer text-based data (Documents or Transcripts) from one copy of Transana to another, we recommend using Transana’s XML format. Although this format can only be loaded in Transana, it loads very quickly and is less prone than other formats to subtle distortions in layout measurements such as margins.