Coding and Categorization

There are two ways in Transana to approach the organization of data selections that are analytically important. Coding is the application of analytic tags, or codes, to selections of text, PDF, still-image, and media data. The codes indicate what is analytically meaningful or interesting about the selection. Categorization is the creation of groupings, or categories, for gathering together data selections that are analytically similar. Like a code, the category identifies something analytically meaningful or interesting about the selection.

In Transana, the distinctions between coded selections and categorized selections are intentionally very permeable, and it is easy to categorize initially-coded selections or to code initially-categorized selections.

On a more practical level, here’s how those concepts are implemented in Transana.


Codes are implemented in Transana as Keywords. You can create a coding system by creating Keyword Groups with Keywords in them to represent your codes.


Categories are implemented in Transana as Collections. You can nest Collections within other Collections to represent sub-categories. All data selections in Transana (Quotes, Snapshots, and Clips) must be contained in a Collection.
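
The relationship between these pieces can be pictured as a small data model. The Python sketch below is purely illustrative: the class names `Keyword`, `Collection`, and `Selection`, their fields, and the example names are assumptions made for this example, not Transana's internal implementation or any API it exposes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Keyword:
    """A code, identified by its Keyword Group and its own name."""
    group: str
    name: str


# eq=False keeps identity-based comparison, since Collections and
# Selections refer to each other.
@dataclass(eq=False)
class Collection:
    """A category; Collections can nest to represent sub-categories."""
    name: str
    parent: Optional["Collection"] = None
    children: List["Collection"] = field(default_factory=list)
    items: List["Selection"] = field(default_factory=list)

    def add_child(self, name: str) -> "Collection":
        child = Collection(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Return the full nested name, e.g. 'Themes > Trust'."""
        if self.parent is None:
            return self.name
        return f"{self.parent.path()} > {self.name}"


@dataclass(eq=False)
class Selection:
    """A Quote, Snapshot, or Clip. It always lives in a Collection."""
    name: str
    kind: str
    collection: Collection
    keywords: List[Keyword] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Register the new selection with the Collection that contains it.
        self.collection.items.append(self)


# Hypothetical example data.
themes = Collection("Themes")
trust = themes.add_child("Trust")
quote = Selection("Opening remark", "Quote", trust,
                  [Keyword("Affect", "Hesitation")])
print(trust.path())  # Themes > Trust
```

Because `collection` is a required field, the sketch cannot create a selection without a containing Collection, which mirrors the rule stated above.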

Coded Selections

Coded Selections are created by making a selection, then double-clicking the Keyword with which you want to code the selection. These are referred to as “Quick” quotes or clips because you do not have to indicate a Collection where the item should be placed and you do not have to name the newly created item. The item is named automatically and is placed in a “Quick Quotes and Clips” Collection.

Categorized Selections

Categorized Selections are created by making a selection, then double-clicking the Collection in which they should be placed. You then give the selection a name and optionally add Codes. They are referred to as “Standard” Quotes and Clips. (All Snapshots must be categorized initially, although it is common practice to apply Coding Shapes to Snapshots as part of the creation process.)

Considerations for the Coding model

  • You can apply multiple codes to a single selection. That is, you can describe a single selection along more than one axis.
  • This makes coded items accessible to complex searches, where you can specify multiple search criteria that must be met.
  • The creation of “Quick” items is fast.
  • Coding requires less “analytic” thinking up front. You can code a lot of data quickly, and figure out what it all means later in the analytic process.
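
To illustrate why applying multiple codes to a selection matters for searching, the sketch below treats each selection's codes as a set and returns only the selections that meet every search criterion. The selection names, code names, and the `search` helper are hypothetical; Transana's own search tool is operated through its interface and is not shown here.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical coded selections: item name -> set of codes applied to it.
coded: Dict[str, Set[str]] = {
    "Quick Clip 1": {"Teacher question", "Wait time"},
    "Quick Clip 2": {"Teacher question"},
    "Quick Quote 1": {"Wait time", "Student answer"},
}


def search(items: Dict[str, Set[str]], all_of: Iterable[str]) -> List[str]:
    """Return the names of items that carry every code in all_of."""
    required = set(all_of)
    return sorted(name for name, codes in items.items() if required <= codes)


print(search(coded, ["Teacher question", "Wait time"]))  # ['Quick Clip 1']
print(search(coded, ["Wait time"]))  # ['Quick Clip 1', 'Quick Quote 1']
```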

Considerations for the Categorization model

  • “Standard” selections in Collections are more visual and less abstract than coded selections. You can easily see how many items are in each Collection.
  • Because of the ways Collections work, Categorized data items can be considered in groups, and in relation to one another, instead of independently.
    • Items in the Categorization model are grouped together. That is what a Collection is: a place to group data items that are somehow alike or related.
    • Items in Collections have an order, which can be changed. This is particularly important for narrative analysis, but can be important with other methodologies as well.
  • Categorization often takes a bit more “analytic” thinking during the selection definition process. You need to have a sense of what your most important themes are likely to be to determine appropriate Collection selection for new items.

Moving between Coding and Categorization

It is very easy to turn a Coded selection into a Categorized selection. You change its name from the automatically-generated name to a more meaningful name, and you move it from the “Quick Quotes and Clips” Collection to a more analytically meaningful Collection.
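
Conceptually, this conversion is just two changes to the same item: a rename and a move. The sketch below models that with plain dictionaries. The automatically generated names shown, the target Collection name, and the `categorize` helper are all made up for illustration and do not reflect Transana's naming scheme or code.

```python
from typing import Dict, List

# Collections modeled as name -> ordered list of item names (illustrative only).
collections: Dict[str, List[str]] = {
    "Quick Quotes and Clips": ["Quick Clip 1", "Quick Clip 2"],
    "Turn-taking": [],
}


def categorize(cols: Dict[str, List[str]], old_name: str, new_name: str,
               source: str, target: str) -> None:
    """Rename an item and move it from the source Collection to the target."""
    cols[source].remove(old_name)  # raises ValueError if the item is not there
    cols[target].append(new_name)  # placed at the end of the target's order


categorize(collections, "Quick Clip 1", "Teacher interrupts student",
           source="Quick Quotes and Clips", target="Turn-taking")
print(collections)
```

Any codes already applied to the item are untouched by these two steps, which is why the item remains searchable after it has been categorized.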

It is very easy to add Codes to a Categorized selection. You can code the selection during creation, and there are several ways to alter the coding of a selection later in the analytic process.

The distinctions described above have a lot to do with the process of creating selections and much less to do with how those selections work later in the analytic process. There is actually little difference between Coded selections and Categorized selections once the creation process is complete. So experiment with both approaches and move forward with whichever is more comfortable for you.

From a slightly different perspective

Here’s an older video that compares the Standard Method of Quote and Clip creation (Categorization) to the Quick Method of Quote and Clip creation (Coding).

Stages in the Analytic Process

When I am doing a significant analysis myself, I have noticed that I tend to follow this pattern:

  • Early on, when I don’t yet have a sense of where my data analysis is taking me, I tend to do a lot of coding. I create many codes that represent different facets of the data that are interesting and relevant to the research questions at hand.
  • Periodically, I review my data using Transana’s Collection Report. The Summary section of the report gives me indications of what sorts of things I am seeing repeatedly in the data. Searches help me follow up to see how different ideas play out in the data.
  • I start creating Collections to represent the most important themes I have seen from reviewing the coding in my data. I move the selections I have created into those thematic Collections, organizing and re-organizing my Collections as my understanding of the themes within the data grows and changes.
  • When I bring new data into my analysis later in the analytic process, I tend to use Categorization a lot more. As my thematic understanding grows, I have a clearer sense of where new selections belong in my emerging analytic structure.

This is, of course, a very abstracted oversimplification of the analytic process I go through. However, it is fair to conclude that I use Transana’s analytic tools differently at different phases of the analytic process as my understanding of my data changes with time.