Transcript Annotation Tool
Introduction
The transcript annotation tool is a web-based tool for annotating a transcript synchronized with an audio stream. It is especially useful for working with spoken word materials such as speeches, debates, and court proceedings. Transcripts are supplied in an XML format called TRS (see http://www.ldc.upenn.edu/mirror/Transcriber/), which is produced with a tool called Transcriber. Users use Transcriber to match segments of the transcript with time segments in the audio stream. In some cases we use additional software to match text and audio on a word-by-word basis. The Transcript Annotation Tool lets a user attach notes to highlighted sections of the transcript or to time intervals on a timeline. Because the program maintains the correspondence between transcript and time coordinates, users can easily move back and forth between the transcript and the audio timeline. You can add categories or structured information to each annotation using a simple form defined for the annotation document. This lets you set up a coding or categorization scheme that can be used to collect research data or to prompt a student to think about specific issues when marking up a transcript. The Transcript Annotation Tool's user interface has five main areas as shown below.
Opening a Transcript Annotation Document
Use the Chooser to select a transcript annotation document to edit with the tool. An annotation document links to but doesn't contain the audio clip and transcript you are annotating. To open an annotation document:
Playing Audio and Navigating
The Control Panel provides controls for playing and navigating the annotated audio stream and transcript. Use the Play button to play back audio starting at the current playback time. As the audio plays, the tool highlights the current position in the text transcript. Use the Play Selection button to play the interval you've selected in the timeline or transcript. Playback halts when it reaches the end of the selection. Use the beginning and end buttons to position the playback time at the very beginning or end of the audio stream. Use the note ahead / back buttons to jump to the next note position or return to the last note. Use the turn back and turn ahead buttons to move the playback time to the beginning of the previous or next "turn" in the transcript. The time code field displays the current playback time followed by the total duration of the audio stream. Click the current time to enter a new playback time.
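The turn back and turn ahead behavior can be pictured as a lookup in a sorted list of turn start times. The following is a minimal sketch of that idea, not the tool's actual code: the function names are hypothetical, and it assumes that "turn back" lands on the latest turn start strictly before the current playback time and that "turn ahead" falls back to the end of the audio stream when there is no later turn.

```python
from bisect import bisect_left, bisect_right


def previous_turn(turn_starts, now):
    """Return the latest turn start strictly before `now` (assumed behavior).

    `turn_starts` is a sorted list of turn start times in seconds.
    Falls back to 0.0 (start of the stream) when no earlier turn exists.
    """
    i = bisect_left(turn_starts, now)
    return turn_starts[i - 1] if i > 0 else 0.0


def next_turn(turn_starts, now, duration):
    """Return the earliest turn start strictly after `now` (assumed behavior).

    Falls back to `duration` (end of the stream) when no later turn exists.
    """
    i = bisect_right(turn_starts, now)
    return turn_starts[i] if i < len(turn_starts) else duration


# Hypothetical turn start times for a 60-second recording.
starts = [0.0, 10.0, 25.5, 40.0]
print(previous_turn(starts, 12.0))    # start of the turn being played
print(next_turn(starts, 12.0, 60.0))  # start of the following turn
```

Binary search keeps each jump fast even for long transcripts with many turns, provided the start times are kept sorted.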
Using The Timeline
The Ruler provides a visual representation of the audio timeline.
The current playback time is shown by the green line. The ruler scrolls when this green line moves outside the view. The gray area with handles is the currently selected time segment. When you add a note to the timeline, the shaded area shows the time interval being annotated. Adjust the selection by dragging the triangle handles at each end of the selection or by dragging and selecting text in the transcript area. The transcript maintains a correspondence between sections of text and time segments in the audio stream. When you are zoomed in, the larger green highlight shows the boundaries of the transcript segment in time. The size of this highlight depends on the size of the marked-up passage in the transcript.
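To make the correspondence between text and time concrete, here is a minimal sketch that reads time-stamped segments from a TRS transcript and finds the segment that contains a given playback time. It assumes the usual Transcriber layout, in which each Turn element carries startTime and endTime attributes and each Sync element carries a time attribute followed by the text spoken from that point. The sample transcript and the helper names read_segments and segment_at are made up for illustration, and the tool itself may represent this mapping differently.

```python
import xml.etree.ElementTree as ET
from bisect import bisect_right

# Hypothetical TRS fragment; real files also carry speaker and section metadata.
SAMPLE = """
<Trans>
 <Episode>
  <Section type="report" startTime="0" endTime="9.0">
   <Turn startTime="0" endTime="5.0">
    <Sync time="0"/>
    good morning
    <Sync time="2.5"/>
    thank you all
   </Turn>
   <Turn startTime="5.0" endTime="9.0">
    <Sync time="5.0"/>
    my pleasure
   </Turn>
  </Section>
 </Episode>
</Trans>
"""


def read_segments(root):
    """Collect {start, end, text} segments from the Sync points of each Turn."""
    segments = []
    for turn in root.iter("Turn"):
        turn_end = float(turn.get("endTime"))
        current = None
        for child in turn:
            if child.tag == "Sync":
                t = float(child.get("time"))
                if current is not None:
                    current["end"] = t  # previous segment ends where this one starts
                current = {"start": t, "end": turn_end, "text": ""}
                segments.append(current)
            # Text following any child element belongs to the open segment.
            if current is not None and child.tail:
                current["text"] = (current["text"] + " " + child.tail.strip()).strip()
    return segments


def segment_at(segments, t):
    """Return the segment whose [start, end) interval contains time t, or None."""
    starts = [s["start"] for s in segments]
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < segments[i]["end"]:
        return segments[i]
    return None


segments = read_segments(ET.fromstring(SAMPLE))
print(segment_at(segments, 3.0))
```

For a transcript stored on disk, `ET.parse(path).getroot()` would supply the root element instead of `ET.fromstring`.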
Adding A Note to the Timeline or Transcript
To add a note, select a time segment in the timeline or drag and select text in the transcript. When you are satisfied with your selection, press the +Note button on the Control Panel. Use the Note Area to type in the text of your note. The text of the note is saved as soon as you finish typing.
When you add a note you will see a small icon appear to the left of the transcript:
This icon scrolls with the transcript and remains attached to the line where the note's selected segment begins. To go back to your annotation, click the icon. An icon will also appear to the right of the timeline ruler:
This icon will scroll as you scroll the ruler. To go back to your annotation you may also click this icon.
Removing Notes
Use the trash button to remove a note from the timeline: select a note icon, then click the trash button.
Searching the Transcript
To search the transcript for a word or sequence of words.
Changing the Title of the Document
To edit the title of the current document, click the "Edit Title" button (
Saving a Copy of the Annotation Document
The "Save As" button allows you to save a copy of the annotation document. When you click the "Save As" button, you can choose a workspace and folder in which to save the document. If you want to store the revised annotation document in the same folder, you will need to give the copy a new file name. This prevents overwriting any earlier annotation documents in the folder.
Preparing Audio and Transcripts for Annotation
To add audio to Project Pad, you need to encode it in a format that Project Pad tools can use. The transcript annotation tool expects audio files encoded in MP3 format. We recommend bit rates from 16 to 32 kbps. The open source LAME utility program (http://lame.sourceforge.net/) can be used to encode files for annotation. Transcript files (that synchronize with MP3 audio) must be in TRS format. The open source program 'transcriber' (http://trans.sourceforge.net/en/presentation.php) can be used to prepare transcripts for annotation. Once you have prepared the audio and transcript files in the proper format, place them in a directory that is visible to the Project Pad system. Both files should have the same name except for the file extensions. If you are running Project Pad with the default configuration, you already have a folder /sakai-ppad/media that is visible to the system. Just drop your files into this folder. Use the "Import" button (