Word Frequency Reports and Word Clouds

Transana’s Word Frequency Reports and Word Clouds can provide interesting information about the language and terms your participants use. Transana supports Word Groups, which allow you to group alternate word forms and synonyms together and to customize the reports.

Select the Report Scope

The item you select in the Database Tree to trigger the Word Frequency Report determines the contents of the report.

  • If you trigger the report from the main Libraries node, the Word Frequency Report will include data from all of your Libraries, including all text and PDF Documents and all Transcripts from all Episodes in your database.
  • If you trigger the report from a Library, the Word Frequency Report will include all data from all Documents, PDF Documents, and Episode Transcripts contained in that Library.
  • If you trigger the report from a Document or PDF Document, the Word Frequency Report will include all text from that Document or PDF Document.
  • If you trigger the report from an Episode, the Word Frequency Report will include the text of all Transcripts for that Episode.
  • If you trigger the report from a Transcript, the contents of the Word Frequency Report will be limited to the text in that Transcript.
  • If you trigger the report from the root Collections node, the Word Frequency Report will include text from all text-based selections contained in all Collections, including nested Collections, in your Database.
  • If you trigger the report from a Collection, the Word Frequency Report will include all text-based selections from that Collection and all nested Collections.

The Results Tab

The Results Tab of the Word Frequency Report shows the words that appear in the data, the frequency of these words, and the Word Group each word represents. These values are, of course, dependent upon the word groups you have defined and the options you have set.

Word Groups are words that get grouped together for counting in the Word Frequency Report. Word Groups can contain different forms of the same word, such as “go,” “goes,” and “going.” They can also contain synonyms that you feel should be combined, such as “doctor,” “physician,” “surgeon,” “general practitioner,” and “GP.” There is also a special word group called the Do Not Show list, which may contain common words that don’t contribute meaning, such as “a,” “and,” and “the,” as well as non-word text such as “um” and “uhh,” which should not be included in the Word Frequency Report results.

Word Groups can be created, expanded, and edited from this page.
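
If it helps to picture how Word Groups and the Do Not Show list shape the counts, the idea can be sketched in a few lines of Python. This is only an illustration, not Transana's own code: the function name `count_words`, the sample groups, and the simple tokenizing rule (lowercase runs of ASCII letters and apostrophes) are all assumptions made for the example, and multi-word entries such as “general practitioner” are left out to keep it short.

```python
import re
from collections import Counter

# Hypothetical Word Groups: each group label maps to the words it absorbs.
WORD_GROUPS = {
    "go": {"go", "goes", "going"},
    "doctor": {"doctor", "physician", "surgeon", "gp"},
}

# Hypothetical Do Not Show list: words left out of the results entirely.
DO_NOT_SHOW = {"a", "and", "the", "um", "uhh"}


def count_words(text, word_groups=WORD_GROUPS, do_not_show=DO_NOT_SHOW):
    """Count words, folding grouped words into their group label."""
    # Invert the groups so each member word points at its group label.
    word_to_group = {
        word: label for label, members in word_groups.items() for word in members
    }
    counts = Counter()
    # Assumed tokenizer: lowercase the text, keep runs of letters and apostrophes.
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in do_not_show:
            continue
        # Words outside any group are counted under their own spelling.
        counts[word_to_group.get(word, word)] += 1
    return counts


sample = "Um, the doctor goes first and the physician is going after the GP."
result = count_words(sample)
```

In this sample, “doctor,” “physician,” and “GP” are counted together under one entry, “goes” and “going” are counted together under “go,” and “um,” “the,” and “and” do not appear in the results at all.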

The Word Cloud Tab

The Word Cloud tab displays the data in the Results table in the form of a Word Cloud graphic. Changes you make that affect the Results table will also influence the Word Cloud.

The Group Words by Pattern Tab

This tab facilitates the creation of Word Groups using pattern matching. In English, words are often made plural by adding the letter “s,” and verbs can be made past tense with the “ed” suffix. Type the desired word ending into the Word Ending box and press the Apply Pattern button. The tab will generate a list of words it grouped together as a result.

Common suffixes I search for include “s,” “’s,” “ed,” “d,” “ing,” and “ly.”

It is important to scan this list manually to eliminate incorrect matches. For example, “used” is not the past tense of the word “us” and should be removed from the pattern-based word grouping.

It can be helpful to alphabetize the Results list and scan it for words with the same root, then apply patterns you notice using the Group Words by Pattern tab.
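
The reason a manual review is needed becomes clear if you sketch the pattern-matching idea in Python. The example below is a rough illustration under stated assumptions, not a description of how Transana performs the match: the function name `group_by_suffix` is hypothetical, and it simply pairs a word with its suffixed form whenever both appear in the word list.

```python
def group_by_suffix(words, suffix):
    """Propose a group for each word whose suffixed form is also in the list."""
    vocabulary = set(words)
    proposals = {}
    for word in sorted(vocabulary):
        candidate = word + suffix
        if candidate in vocabulary:
            # Pure string matching: nothing here checks that the two
            # words are actually related in meaning.
            proposals[word] = [word, candidate]
    return proposals


words = ["us", "used", "walk", "walked", "talk", "talked", "jump"]
proposals = group_by_suffix(words, "ed")

# Manual review step: drop the match that is not a real word form.
reviewed = {root: forms for root, forms in proposals.items() if root != "us"}
```

Because the match is purely on spelling, “us” and “used” are proposed as a group alongside the genuine pairs “walk”/“walked” and “talk”/“talked.” Only a human reader can tell that the first pairing is wrong.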

The Options Tab

The Options tab provides a number of useful settings.

  • The Minimum Frequency setting allows you to alter the length of your Results list and the density of your Word Cloud by removing less-frequent words and word groups from the results.
  • The Minimum Word Length setting allows you to remove short words from your list should you desire to do so.
  • You can create, select, and apply a text-based Skip Words file to define common words that should not be included in Word Frequency Results and Word Clouds. Transana comes with a list of common English words that you may wish to consider as an example.
  • If you are experiencing difficulty with the Word Cloud display due to character encoding issues, you can select a Word Cloud Font to use that supports your desired characters.
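
The combined effect of the first three settings can be pictured as a filter applied to the word counts. The Python sketch below is an illustration only, not Transana's implementation. The names `apply_options` and `load_skip_words` are hypothetical, and the Skip Words file is assumed to be a plain text file with one word per line, which may not match the format Transana actually uses.

```python
from collections import Counter


def load_skip_words(path):
    """Read skip words from a text file (assumed format: one word per line)."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def apply_options(counts, min_frequency=1, min_word_length=1, skip_words=()):
    """Keep only entries that pass all three filters."""
    skip = {word.lower() for word in skip_words}
    return {
        word: count
        for word, count in counts.items()
        if count >= min_frequency
        and len(word) >= min_word_length
        and word.lower() not in skip
    }


# Made-up counts for illustration.
counts = Counter({"the": 40, "interview": 12, "go": 9, "nurse": 3, "ward": 1})
filtered = apply_options(
    counts, min_frequency=3, min_word_length=3, skip_words={"the"}
)
```

With these made-up counts, “the” is dropped as a skip word, “go” is dropped for being shorter than three letters, and “ward” is dropped for appearing fewer than three times, leaving only “interview” and “nurse.” Raising the minimum frequency shortens the list further, which is also what thins out a crowded Word Cloud.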