Imagine a common movie scene: a hero confronts a villain. Captioning such a moment would at first glance seem as basic as transcribing the dialogue. But consider the choices involved: How do you convey the sarcasm in a comeback? Do you include a henchman’s muttering in the background? Does the villain emit a scream, a grunt, or a howl as he goes down? And how do you note a gunshot without spoiling the scene?
These are the choices closed captioners face every day. Captioners must decide whether and how to describe background noises, accents, laughter, musical cues, and even silences. When captioners describe a sound—or choose to ignore it—they are applying their own subjective interpretations to otherwise objective noises, creating meaning that does not necessarily exist in the soundtrack or the script.
Reading Sounds looks at closed captioning as a potent source of meaning in rhetorical analysis. Through nine engrossing chapters, Sean Zdenek demonstrates how the choices captioners make affect the way deaf and hard of hearing viewers experience media. He draws on hundreds of real-life examples, as well as interviews with both professional captioners and regular viewers of closed captioning. Zdenek’s analysis is a compelling look at how we make the audible visible, one that proves that better standards for closed captioning create a better entertainment experience for all viewers.
Publisher: University of Chicago Press
Read an Excerpt
Closed-Captioned Media and Popular Culture
By Sean Zdenek
The University of Chicago Press. Copyright © 2015 The University of Chicago
All rights reserved.
A Rhetorical View of Captioning
Four New Principles of Closed Captioning
Closed captioning has been around since 1980 — it's not "new media" by any means — but you wouldn't know it from the passionate captioning advocacy campaigns, new web accessibility laws, revised international standards, ongoing lawsuits, new and imperfect web-based captioning solutions, corporate foot-dragging, and millions of uncaptioned web videos. Situated at the intersection of a number of competing discourses and perspectives, closed captioning offers a key location for exploring the rhetoric of disability in the age of digital media. Reading Sounds offers the first extended study of closed captioning from a humanistic perspective. Instead of treating closed captioning as a legal requirement, a technical problem, or a matter of simple transcription, this book considers how captioning can be a potent source of meaning in rhetorical analysis.
Reading Sounds positions closed captioning as a significant variable in multimodal analysis, questions narrow definitions that reduce captioning to the mere "display" of text on the screen, broadens current treatments of quality captioning, and explores captioning as a complex rhetorical and interpretative practice. This book argues that captioners not only select which sounds are significant, and hence which sounds are worthy of being captioned, but also rhetorically invent words for sounds. Drawing on a number of examples from a range of popular movies and television shows, Reading Sounds develops a rhetorical sensitivity to the interactions among sounds, captions, contexts, constraints, writers, and readers.
This view is founded on a number of key but rarely acknowledged and little-understood principles of closed captioning. Taken together, these principles set us on a path toward a new, more complex theory of captioning for deaf and hard-of-hearing viewers. These principles also offer an implicit rationale for the development of theoretically informed caption studies, a research program that is deeply invested in questions of meaning at the interface of sound, writing, and accessibility.
1. Every sound cannot be closed captioned.
Captioning is not mere transcription or the dutiful recording of every sound. There's not enough space or reading time to try to provide captions for every sound, particularly when sounds are layered on top of each other in the typical big-budget flick. Multiple soundtracks create a wall of sound: foreground speech, background speech, sound effects, music with lyrics, and other ambient sounds overlap and in some cases compete with each other. Sound is simultaneous; print is linear. It's not possible to convert the entire soundscape of a major film or TV production into a highly condensed print form. It can also be distracting and confusing to readers when the caption track is filled with references to sounds that are incidental to the main narrative. Caption readers may mistake an ambient, stock, or "keynote" sound (Schafer 1977, 9) for a significant plot sound when that sound is repeatedly captioned. A professional captioner shared the following example with me: Consider a dog barking in an establishing shot of a suburban home. When the dog's bark is repeatedly captioned, one may begin to wonder if there's something wrong with that dog. Is that sound relevant to this scene? (See figure 1.2.) Very few discussions of captioning acknowledge or even seem to recognize that captioning, done well, must be a selective inscription of the soundscape, even when the goal is so-called "verbatim captioning."
2. Captioners must decide which sounds are significant.
If every sound cannot be captioned, then someone has to figure out which sounds should be. Speech sounds usually take precedence over nonspeech sounds, but it's not that simple. What about speech sounds in the background that border on indistinct but are discernible through careful and repeated listening by a well-trained captioner? Should these sounds be captioned (1) verbatim, (2) with a short description such as (indistinct chatter), or (3) not at all? Answering this question by appealing to volume levels (under the assumption that louder sounds are more important) may downplay the important role that quieter sounds sometimes play in a narrative (see figure 1.3). What is needed is an awareness of how sounds are situated in specific contexts. Context trumps volume level. Only through a complete understanding of the entire program can the captioner effectively interpret and reconstruct it. Just as earlier scenes in a movie anticipate later ones, so too should earlier captions anticipate later ones. In the case of a television series, the captioner may need to be familiar with previous episodes (including, when applicable, the work of other captioners on those episodes) in order to identify which sounds have historical significance. The concept of significance (or "relevant" sounds [see Sydik 2007, 181]) shifts our attention away from captioning as copying and toward captioning as the creative selection and interpretation of sounds.
3. Captioners must rhetorically invent and negotiate the meaning of the text.
The caption track isn't a simple reflection of the production script. The script is not poured wholesale into the caption file. Rather, the movie is transformed into a new text through the process of captioning it. In fact, as we will see in chapter 4, when the captioner relies too heavily on the script (for example, mistaking ambient sounds for distinct speech sounds), the results can be disastrous. In other cases, words must be rhetorically invented, which is typical for nonspeech sounds. I don't mean that the captioner must invent neologisms — I issue a warning about neologistic onomatopoeia in chapter 8. Rather, the captioner must choose the best word(s) to convey the meaning of a sound in the context of a scene and under the constraints of space and time. The best way to understand this process, as this book argues throughout, is in terms of a rhetorical negotiation of meaning that is dependent on context, purpose, genre, and audience.
4. Captions are interpretations.
Captioning is not an objective science. The meaning is not waiting there to be written down. While the practice of captioning will present a number of simple scenarios for the captioner, the subjectivity of the captioner and the ideological pressures that shape the production of closed captions will always be close to the surface of the captioned text. The practice of captioning movies and TV shows is typically performed independently, as contract work by captioning companies for major production studios, with little oversight, interest, or input from the content producers beyond the need to ensure legal compliance (Udo and Fels 2010, 209). In the case of nonspeech sounds, these independent contractors possess near-total control over the selection of significant sounds and the creation of captions for them. The resulting caption track is not an objective reflection of the text but what Abé Mark Nornes (2007, 15) calls, in the context of foreign language subtitling, a "new text." This view of captioning as rhetorical invention or textual performance, with the captioner serving as a rhetorical proxy agent, is likely to seem at odds with the goal of "equal access for all." But access to captioned content will never, strictly speaking, be the same as access to the sonic landscape. Rather, the captioned text will always be inflected by the captioners' interpretative powers and the different affordances of sound and writing.
It will take a book to explain and defend these four principles. They are new and challenge the conventional wisdom about closed captioning. They have the potential to transform how we think about captioning, accessibility for deaf and hard-of-hearing viewers, and the relationships between sound and writing in the digital age. Researchers in rhetorical studies and disability studies have yet to provide a sustained analysis of closed captioning. (For exceptions, see Lueck 2011, Lueck 2013, and my own previous research: Zdenek 2011a, Zdenek 2011b, and Zdenek 2014.) We haven't paused to pay attention to captioning as rhetoric, even as we've held up captioning as one of the centerpieces of an accessible web. By rhetoric, I don't simply mean language pressed into the service of persuasion but, more broadly, signs and symbols that construct worlds of meaning for us to inhabit. Closed captions are not windowpanes on a sonic reality but mediate that reality in the course of providing access to it (cf. Miller 1979, 611). The conventional view of closed captioning tends to simplify questions of quality and focus on questions of quantity. For example, the Twenty-First Century Communications and Video Accessibility Act of 2010 (CVAA) requires that only certain types of TV-like content on the Internet be closed captioned, leading advocates to ask: How do we compel producers of independent web series, which aren't covered under the new law, to caption their programs? Quality tends to be defined narrowly in terms of completeness (Is the entire show captioned?) and accuracy (Is every speech sound captioned correctly? Are any captions garbled as a result of poor autotranscription?). Just as quality in foreign language subtitling too often gets reduced to mistranslation or "misprision" — what Nornes (2007, 16) calls "red meat" for critics of subtitling — quality in closed captioning too often gets reduced to questions of accuracy (e.g., "caption fails").
This book offers a new approach to quality in captioning by considering how captions create new meanings, manipulate space and time, call attention to productive tensions between sound and writing, and reflect captioners' subjectivities and interpretative skills. In short, this book offers a humanistic rationale for closed captioning — the first of its kind — by countering the popular perception that captioning is straightforward, objective, or simple. If captioning can be shown to be a complex rhetorical practice, then universal design advocates will have even more ammunition to argue that closed captioning should be an integral aspect of the production cycle, not an add-on or afterthought (see Udo and Fels 2010).
Despite the age of captioning technology, we still do not have a comprehensive approach to caption quality that goes beyond important but basic issues of typography, placement, accuracy, timing, and presentation rate. Current practice, at least on television, is too often burdened by a legacy of styling captions in all capital letters with centered alignment, among other lingering and pressing problems. Caption quality has been evaluated in terms of visual design — how legibility and readability interact with screen placement, timing, and caption style (e.g., scroll-up style vs. pop-on style). What we do not have yet is a way of thinking about captioning as a rhetorical and interpretative practice that warrants further analysis and criticism from scholars in the humanities and social sciences. In short, while we have captioning style guidelines for quality, we have not explored quality rhetorically. A rhetorical perspective recasts quality in terms of how writers and readers make meaning: What do captioners need to know about a text or plot in order to provide access to it? Which sounds are essential to the plot? Which sounds do not need to be captioned? How should genre, audience, context, and purpose shape the captioning act? What are the differences between making meaning through reading and making meaning through listening? Given the inherent differences between, and different affordances of, writing and sound, how can captioners ensure that deaf and hard-of-hearing viewers are sufficiently accommodated? The concepts that structure these questions — effectiveness, meaning, purpose, context, genre, audience — are of abiding interest to rhetoricians.
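As an illustrative aside (not part of Zdenek's text), the "presentation rate" named above is conventionally measured in words per minute, and the arithmetic is simple enough to sketch. The example caption and the two-second display window below are assumptions chosen for illustration, not figures from the book:

```python
def presentation_rate_wpm(caption_text: str, duration_seconds: float) -> float:
    """Words per minute a viewer must sustain to read a caption
    before it leaves the screen."""
    word_count = len(caption_text.split())
    return word_count * 60.0 / duration_seconds

# A hypothetical 8-word caption displayed for 2 seconds:
rate = presentation_rate_wpm("You have no idea what I'm capable of.", 2.0)
print(round(rate))  # 240
```

A sketch like this captures only the quantitative side of quality; the rhetorical questions above are precisely those that such a calculation cannot answer.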
My argument, developed over the following chapters, is that a rhetorical view of captioning calls attention to seven transformations of meaning:
1. Captions contextualize. Captioning is about meaning, not sound per se. Captions don't describe sounds so much as convey the purpose and meaning of sounds in specific contexts. The meaning of a sound in a particular context may transcend its origins. The precise sonic qualities of a squeaky water tap may be less significant than the act of turning the tap off: (TURNS TAP OFF). In such cases, the action trumps the sound. Additional examples include [TURNS OFF RADIO], [unbuckles seat belt], [BLADE PULLS FREE], [Snaps Oscar's Neck], and [HITS CYMBAL]. Onomatopoeia has a role to play in captioning, but it must be used with care, and only when the visual context clearly informs the meaning of the captions. Media: http://ReadingSounds.net/chapter1/#contextualize.
2. Captions clarify. Captions tell us which sounds are important, what people are saying, and what nonspeech sounds mean. As a hearing viewer, I continually find myself relying on captions to learn characters' names and apprehend unusual words such as "flobberworms." (So that's what Peter Pettigrew just said in the background of the Harry Potter movie!) Reading provides superior access compared with listening, particularly when a noisy environment works against the listener's ability to make out clearly what people are saying. The same goes for music lyrics that are transcribed on the screen for easy reading, as lyrics are well known for being misinterpreted by hearing fans. Media: http://ReadingSounds.net/chapter1/#clarify.
3. Captions formalize. Captions tend to be presented in standard written English, with information about manner of speaking relegated to identifiers such as (drunken slurring). Nothing else about the speech will mark it as inflected or accented (e.g., drunk) except for a lone identifier at the beginning of the first speech caption. While standard English provides the fastest access to information, it comes at the expense of conveying the embodied aspects of speech. Embodiment is carried almost entirely by manner of speaking identifiers or simple phonetic transformations (e.g., gonna, can't). While it is easy to find examples of substandard or phonetic spellings in speech captions, even these examples are informed by a desire to make the captions as fast to read as possible. Phonetic transcriptions are rhetorical insofar as they balance accuracy with accessibility. In this way, we might say that captions rationalize the teeming soundscape. Sounds that resist easy classification or simple description, such as mood music, are tamed or ignored altogether. Media: http://ReadingSounds.net/chapter1/#formalize.
4. Captions equalize. Every sound tends to play at the same "volume" on the caption track. While there are ways of modulating the volume of captioned sounds and differentiating background from foreground sounds in the captions, these ways are limited and space consuming. As a result, every sound tends to occupy the same sonic plane, making every sound equally "loud." Media: http://ReadingSounds.net/chapter1/#equalize.
5. Captions linearize. Sounds that are heard simultaneously cannot be read simultaneously. Captions linearize by presenting the soundscape in a form that can be read one sound/caption at a time. Although it is unusual, multiple nonspeech parentheticals can be presented on the screen at the same time. Multiple sounds can also occupy the same caption — see, for example, District 9's (2009) [ALIEN GROWLS AND PEOPLE SHOUTING INDISTINCTLY] and [RAPID GUNFIRE AND MEN SHOUTING IN DISTANCE]. Multiple, simultaneous sounds can also be reduced to single captions such as [overlapping chatter] and [overlapping shouts] from Silver Linings Playbook (2012). But simultaneous sounds must still be read one at a time. The caption reader thus experiences the film soundscape as a series of individual captions. Media: http://ReadingSounds.net/chapter1/#linearize.
Excerpted from Reading Sounds by Sean Zdenek. Copyright © 2015 The University of Chicago. Excerpted by permission of The University of Chicago Press.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Table of Contents
Preface
1 A Rhetorical View of Captioning
2 Reading and Writing Captions
3 Context and Subjectivity in Sound Effects Captioning
5 Captioned Irony
6 Captioned Silences and Ambient Sounds
7 Cultural Literacy, Sonic Allusions, and Series Awareness
8 In a Manner of Speaking
9 The Future of Closed Captioning