
PDF QUOTES and SNAPSHOTS
PDF Quotes and Snapshots – An Overview
This page describes the different options for coding PDF data and the implications of the choices you are offered.
You have two options when working with PDF data in Transana. You can focus on text by creating PDF Quotes, or you can focus on visual appearance by creating PDF Snapshots. This page explains the difference.
The Whole PDF page
To the right, you can see a whole PDF Document as it appears in Transana. It contains a text selection (indicated by a solid blue rectangle) that covers the instructions given in the PDF, and two sponsor logos (indicated by orange dashed ellipses) at the bottom of the page.
PDF Quotes
To the left, you can see a PDF Text Selection, called a PDF Quote. The PDF Window is zoomed in and positioned to show our selection with just a little of the PDF page’s context. The actual text selection was made with a Text Selection coding shape, as shown by the solid blue rectangle.
This selection from the PDF is treated as text by Transana. (Importantly, this part of the PDF was represented as text in the original PDF. Transana cannot automatically extract text from letter-shaped graphics.) This means that the contents are available wherever Transana looks for text, such as when you are doing a text search or running a report that processes text, such as the Word Frequency Report.
PDF Snapshots
To the right, you can see a PDF image selection, called a PDF Snapshot. The PDF Window is zoomed in and positioned to show the desired context, and the coding is added by drawing coding shapes associated with specific keywords.
This selection from the PDF is treated as an image by Transana. The framing of the PDF in the PDF Window is important, because the Snapshot is saved exactly as the window displays it. No text is extracted from the selection, even if text is present.
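For readers who think in code, the difference between the two selection types can be pictured with a small Python sketch. This is only an illustration of the idea and not Transana's actual data model or API: the class names `PdfQuote` and `PdfSnapshot`, the `word_frequency` function, and the sample text are all hypothetical. It shows why a text-based report only ever sees Quotes.

```python
from collections import Counter
from dataclasses import dataclass
import re


@dataclass
class PdfQuote:
    """Hypothetical stand-in for a PDF Quote: a named selection that carries text."""
    name: str
    text: str


@dataclass
class PdfSnapshot:
    """Hypothetical stand-in for a PDF Snapshot: a named image with no text."""
    name: str
    image_bytes: bytes


def word_frequency(selections):
    """Count words across selections, using only those that carry text."""
    counts = Counter()
    for selection in selections:
        text = getattr(selection, "text", None)
        if text is None:
            continue  # snapshots are images, so they contribute no words
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


# Made-up sample content, used only to exercise the sketch.
selections = [
    PdfQuote("Instructions", "Please complete the form and return the form."),
    PdfSnapshot("Sponsor Logos", b"\x89PNG placeholder"),
]
frequencies = word_frequency(selections)
print(frequencies.most_common(2))
```

In this sketch the Snapshot is skipped entirely, which mirrors the behavior described above: anything saved as a Snapshot will not appear in a text search or a word count, even if the captured region visibly contains words.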
Implications
The implications of these two approaches can be seen in the PDF Document report to the left that includes the two PDF selections described above.
We can clearly see that the PDF Quote called Instructions contains text, and is limited to the text included within the Text Selection box drawn during Quote creation.
We can also clearly see that the PDF Snapshot called Sponsor Logos is a graphic that appears exactly the way the PDF was displayed in the PDF Window when it was saved. It is not enough to draw Coding Shapes when coding images in a PDF; you must also zoom, resize, and position the document in the window so it looks the way you want your PDF Snapshot to look.