An urgent, erudite, and practical book that redefines literacy to embrace how we think and communicate now
We live in a world that is awash in visual storytelling. The recent technological revolutions in video recording, editing, and distribution are more akin to the development of movable type than any other such revolution in the last five hundred years. And yet we are not popularly cognizant of or conversant with visual storytelling's grammar, the coded messages of its style, and the practical components of its production. We are largely, in a word, illiterate.
But this is not a gloomy diagnosis of the collapse of civilization; rather, it is a celebration of the progress we've made and an exhortation and a plan to seize the potential we're poised to enjoy. The rules that define effective visual storytelling, much like the rules that define written language, do in fact exist, and Stephen Apkon has long experience in deploying them, teaching them, and witnessing their power in the classroom and beyond. In The Age of the Image, drawing on the history of literacy (from scroll to codex, scribes to printing presses, SMS to social media), on the science of how various forms of storytelling work on the human brain, and on the practical value of literacy in real-world situations, Apkon convincingly argues that now is the time to transform the way we teach, create, and communicate so that we can all step forward together into a rich and stimulating future.
Publisher: Farrar, Straus and Giroux
Product dimensions: 5.10(w) x 7.90(h) x 0.80(d)
About the Author
Stephen Apkon is the founder and executive director of the Jacob Burns Film Center, a non-profit film and education organization located in Pleasantville, New York. The JBFC presents a wide array of documentary, independent, and foreign film programs in a three-theater state-of-the-art film complex, and has developed educational programs focused on twenty-first century literacy. Under Apkon's leadership, the JBFC opened a 27,000-square-foot media arts lab in 2009. Since its doors opened in 2001, JBFC education programs have reached more than 100,000 children.
Steve serves on the boards of The World Cinema Foundation and Advancing Human Rights. He is President of Big 20 Productions; the director and producer of The Patron, a collaboration with Ido Haar; a producer of Enlistment Days, directed by Ido Haar; and a producer of I'm Carolyn Parker: The Good, the Mad, and the Beautiful, directed by Jonathan Demme.
Read an Excerpt
ALL THE WORLD’S A SCREEN
We often look for signs of systemic change in the highest of places, yet it is often in the most profane of places that we find the fuel for this change.
So it seems appropriate that a look at the shape of literacy in the twenty-first century can be gleaned in a visit to a set of ratty industrial warehouses at the edge of downtown Los Angeles, in a neighborhood favored by filmmakers for its moody alleys and car-chase venues.
Inside a drafty space that bears the most charitable appellation of “office,” a stout young man in a T-shirt and flip-flops named Freddie Wong, along with several of his friends from the University of Southern California, thinks of new ideas for simple Internet videos that millions will watch.
Wong’s first breakout success played off the video game Guitar Hero, in which the player makes exaggerated riffs on a plastic guitar in order to score points and keep an electronic “crowd” cheering. Some of the nation’s more exuberant players were in the habit of videotaping themselves jamming on the fake guitars and uploading this footage to YouTube, the massive file-sharing site that had been launched only a year earlier.
Those who were paying attention noticed a commonality in the really popular clips: they emphasized ridiculous quasi–rock star poses, and the cameras zoomed in on the player’s face while ignoring the game unfolding on the screen.
This emerging trope seemed ripe for parody. Wong enlisted a few friends; borrowed a motorcycle; donned a black leather jacket, a red sequined shirt, and medieval chains; and preened like a vain thug. “What’s up, Internet?” he said. “My name’s Freddie and I’ve come from a long hard day of rocking faces and doing jumps with my sweet bike here to come and rock you.” A flunky removes the jacket from Wong’s shoulders like an obedient valet. Wong sneers that the chains on his chest are there to keep his soul tied down; otherwise, it would fly off and “impregnate women.”
Then comes an utterly ridiculous set piece in which Wong’s exaggerated licks on the fake guitar are kept in time with the scoring of the game, run as a split screen. Wong does a relatively average job on the scoring, but he acts throughout as though he were supremely pleased with himself. And then, in an unexpected coup de grâce, he smashes the plastic guitar, punk-style, with a triumphant “Yes!”
The performance was designed to be ludicrous—which it was—but the larger purpose was to engage the Internet’s boisterous hive of largely anonymous users who watch, criticize, and share amateur videos. It didn’t matter that the strap of the guitar was accidentally hanging on Freddie the wrong way, or that the lighting was crude. The video was smart and it was literate and it shared a set of in-gestures with the audience. Within a few weeks, it received more than a million hits on YouTube, as friends e-mailed it through the exponential matrix of social connections. Wong has since duplicated this success many times over.
“Gamer Commute,” for example, which received more than nine million views within two months, begins with a shot of Wong waking up in an ordinary bedroom and making a choice of clothes from an electronic menu—glasses, gray T-shirt, flip-flops, cargo shorts. Three guns fly toward him and embed themselves in his body with a metallic click. He then gets into his Toyota, and after taking it up to the speed limit on an ordinary Los Angeles street, he climbs on top of the roof of the moving car and fires a pistol in the air. All of this was achieved with green screens and special effects. The video ends with Wong coming into his ordinary-looking office cubicle and sitting down with a bored expression, resigned to the mundane workday. The video builds upon a foundation of cultural knowledge, and then leaves an unstated moral conclusion: that the gaming world contains far too many thrills and blood spatters to be sustainable in the dreary existence of working life.
One wonders how many of the nine million clicks “Gamer Commute” received were made from cubicles such as the one Wong occupies at the video’s end.
Freddie Wong’s success on YouTube was anything but a random accident. He did not make a video and just throw it against the wall of the Internet to see if it would stick. In fact, his career has been built not so much on creative randomness as on deliberate calculation, in much the same way Jack Kerouac wrote On the Road not as a freewheeling, spontaneous howl of beatnik joy, but rather as a shrewd attempt to write a bestseller that would embody the rambling spirit of the late 1950s. Like Kerouac, Wong found cultural receptivity, and he has done more to crack the nebulous market “science” behind effective amateur filmmaking than just about anybody working today.
This inquiry into the base code of successful videos started when Wong was still an undergraduate at USC. He wrote a thesis called the “10^6 Project,” its name implying a force with exponential power. “Why do videos go viral?” he asked. “What kind of content goes viral? And what are the strategies and techniques utilized to promote them?” Wong and his classmate and partner Brandon Laatsch started looking closer at the videos that had gone viral, and at the core factors in their doing so.
“They repeat a formula,” Wong said. “The success of videos was seen as this random force, but when you have an enormous body of people doing the same thing, that element of randomness disappears.”
Herein lies a paradox: watching videos on the Web is usually a solitary experience, but Wong tapped into a social gold mine. People who watch homemade videos love to pass them on to their friends. It is a way to have a connection with others and even to claim a little credit for the creativity of the filmmaker, because it is you who first noticed and laughed at his brilliance; the humor accrues to the sharer. People take a social risk when they e-mail a link to a Web video to friends, or call them over to the laptop. “You’re putting yourself out there,” says Wong. So it had better be amusing.
The factor he is aiming for, then, is what, for newspapers, used to be called the “Hey, Martha” factor: the quirky, indescribable story that would persuade a reader to toss the Metro section over to his wife and say, “Hey, Martha, you have to read this.” Enjoying the story together and then talking about it, even arguing about it, become as much a part of the experience as the viewing. So in Wong’s case, the maker’s imperative is to provoke a specific response in the viewer, namely, “What will make me want to show this to other people?”
This medium of exchange is critical in an era when our choices of what to watch are so easily driven by the recommendations of our friends, which usually come in the form not of spoken plaudits but of a forwarded e-mail. This is the way we share literacy in this century.
Henry Jenkins, a professor of communication at the University of Southern California, writes about this in a book called Spreadable Media: Creating Meaning and Value in a Networked Culture, written with Sam Ford and Joshua Green. Jenkins says the act of forwarding a video link both gives an inherent sense of added value to the product and instantaneously creates a mini-community.
“Rather than seeing circulation as the empty exchange of information stripped of context and meaning, we see these acts of circulation as constituting bids for meaning and value,” Jenkins writes with his coauthors. “We feel that it very much matters who sends the message, who receives it, and most importantly, what messages get sent.”
In short, context matters in these mini-communities. Freddie Wong, too, noticed a contextual factor at work in videos that were truly successful in eliciting page clicks. The sound and the dialogue needed to be only basically intelligible, as they were likely to be competing with other sounds. And for that reason, the sound element didn’t matter nearly as much as the visual components. The filmmaker’s credo that “plot matters” holds true more than ever, but plot must be expressed in a way that can be seen. It lends itself to a more physical type of acting—almost a mimetic method that dates to the silent era of movies at the turn of the twentieth century.
“You have to assume that these videos are being played on laptops, tablets and cell phones with tiny speakers,” says Wong. “And that there’s going to be noise and other distractions in the room. And so you have to make it visually interesting in order to cut through the static noise and draw the proper attention to the central message of the video.”
The new era of the digital peep show has something else in common with the brief films of the silent era. Like those early films, today’s videos are meant to be international in their appeal. The early film studios in Paterson, New Jersey, were cranking out three-minute reels that could be understood in Paris and Buenos Aires as well as in New York. Short films with limited dialogue are more easily appreciated, and therefore more likely to be watched and shared in nations where English is not widely spoken. Visuals are not hampered by the constraints of tongue; they work in most any culture—which is why Freddie Wong has a following in Croatia.
Thus stripped of most traditional linguistic elements, the short film has to move fast, but it must strive not to confuse the viewer with too many moving objects or jarring cuts. The format of a short online video is subject to a concept called “dropoff,” in which the viewer simply gets bored and stops watching. Conventional Hollywood movies longer than ninety minutes and shown to a captive paying audience in a theater have the luxury of padding out the material or slowing down the story to draw out the narrative experience. Video shorts do not enjoy that luxury; viewers can “walk out” with a simple click.
This isn’t to say that filmmakers succeed when they aim for the lowest common denominator. The video has to demonstrate a balance between accessibility and sophistication.
The Harvard scholar Marjorie Garber, in her landmark book The Use and Abuse of Literature, lays out the tricky question of what distinguishes a piece of literary writing from the merely everyday. Garber postulates that a text ought to draw upon some of the foundational works that came before it, if even in shadow—the myth of the flawed hero, a Homeric sea journey, a pair of doomed lovers. The work also ought to have a quality of openness to it, a certain ambiguity that leaves the author’s intentions at least partly in the dark, so that the “meaning” of the text might be pleasingly unclear and the conclusion left for the reader to draw from his own experiences.
This is a hard thing to get right, and Wong’s videos meet both standards of Garber’s literacy test. He draws upon a body of work that is familiar to most of his viewers: the world of video games, in which certain tropes (the car chase, the first-person shooter, and the explosions of colorful blood in simulated combat) are recognizable and even shopworn to the audience. Wong’s videos take these standards and apply them to ridiculous scenarios so that the “fish out of water” quality might itself be the locus for the joke, and for the narrative pleasure of the audience, who is acquainted with the in-references.
Wong is not alone in tapping old literary veins with new visual technology. Just across town from him live two transplanted New Yorkers, brothers Benny and Rafi Fine, who have produced several narrative Web series and their own sort of talk-show format called Kids Watch Viral Videos (which evokes memories of Art Linkletter’s 1960s television program Kids Say the Darndest Things). The Fine brothers’ videos have garnered millions of hits, and Benny and Rafi have become celebrities in the Web video world, complete with squealing fans. On a recent trip to London, they were spotted by a group of young teens who trailed them down the sidewalk until they had the courage to approach them and gush over their videos. While the Fine brothers are not exactly the Beatles, their videos have recognition and panache. And they are hard at work creating new pieces while meeting with corporations that want to hire them to create videos that feature the companies’ products.
When a mysterious person who called herself “wigoutgirl” posted the video “Bride Has Massive Hair Wig Out” on the Internet, it spread like angry lice, reaching 2.8 million viewers in a few weeks. The home video, shot by an unseen bridesmaid, shows a freaked-out bride on her wedding day who has just received a bad haircut. She grows progressively more upset, then seizes a pair of scissors and begins cutting it all off—a scenario that perhaps fed directly into every woman’s worst nightmare. Most men laughed at the distaff drama of it, but the scene was compelling, and lent itself to multiple watchings and sharings.
Turns out, the video was not documentary footage but a scripted set of images filmed in a real hotel suite. The shampoo company Unilever had done it as a publicity stunt—but a clever one. The footage has been viewed over twelve million times, essentially because it is a well-constructed piece of drama that transfixes the viewer with a story that almost everyone can relate to.
There is big money to be made in creating literate visual stories, even if they are somewhat silly. You don’t need to be highbrow to be literate.
* * *
Among the more than three billion videos watched each day on sites such as YouTube, there is undoubtedly a lot of garbage. But in what medium is there not? Of all the paintings hanging on walls of museums around the world, there is a small subset that I’d like to hang on my own walls. In every medium there is a wide range of content, and a difference in taste among consumers. We must never assume that an appeal to the masses represents illiteracy. In fact, it implies a high degree of literacy. And in the new century, that increasingly means visual media.
The language of the modern cinema is only about a century old, and it continues to develop. Some of its conventions will never change, yet it has suddenly become relevant not just to passive moviegoers, but also to ordinary citizens, who can now “write” in a way that was once reserved for the elite.
Freddie Wong’s guitar stunt would have been impossible even seven years ago. There was no giant information commons of video files, no YouTube or Vimeo. His creations definitely would have gotten some laughs among his friends, but they would never have gotten such instant traction in the world at large. The proliferation of devices that allow us easily and proficiently to capture moving images, the introduction of inexpensive and accessible editing tools, and at the same time the emergence of distribution sites such as YouTube, which made its debut in 2005, have changed the game forever.
Today more than forty-eight hours of fresh video is uploaded to YouTube every minute, which translates to eight years of content added every day. To put that in perspective: each month, there is more video added to the site than the collective output of the three major television networks since their founding after World War II. There are more than eight hundred million unique visitors watching videos each month on YouTube, with more than 70 percent of this traffic coming from outside the United States. We are part of a global visual conversation. The medium of television itself is moving quickly away from the polished network-produced shows and into more renegade do-it-yourself programming, where the rules are being rewritten.
“With more and more people being connected, the economics are improving, so it makes sense that storytellers of all kinds would want to come to us,” YouTube’s global head of content, Robert Kyncl, told a reporter. “The more connected devices we get, the more this system will open up … We’ll take the couch potato, sure. But what we’re really after is the couch potato who is willing to get up and lean in and get engaged.”
Meanwhile, newspapers are dwindling rapidly as they lose their perch as the world’s most trusted medium of news and information. According to the Pew Research Center’s Project for Excellence in Journalism, there were about sixty-two million paid newspaper subscriptions in 1990. What a difference twenty years and the Internet have made: by 2010, that number had fallen (and it continues to fall) to about forty-three million, and several old lions such as the Seattle Post-Intelligencer, The Ann Arbor News, the Rocky Mountain News, and the Tucson Citizen had gone out of business entirely.
Those papers that want to survive are struggling to find ways to make their copy accessible and relevant on the Web. Big stories now routinely come with video components. Schools of journalism are racing to re-create themselves lest they remain trapped in a world of print that no longer exists in the way it did even ten years ago.
What we are now seeing is the gradual ascendance of the moving image as the primary mode of communication around the world: one that transcends languages, cultures, and borders. And what makes this new era different from the dawn of television is that the means of production—once in the hands of big-time broadcasting companies with their large budgets—is now available to anyone with a camera, a computer, and the will. “Hollywood will always bring great content,” Chad Hurley, the head of YouTube, told Forbes, “but amateurs can create something just as interesting—and do it in two minutes.”
The rate of change has been dizzying. The first job I ever had was delivering daily newspapers: the Boston Globe in the morning and the South Middlesex News in the afternoon. Those papers and the Big Three networks of ABC, CBS, and NBC were the primary links we had then to the larger world of information. We depended on them to tell us the truth. As one of the few places where corporate marketing departments could reach their target consumers, these media outlets commanded the advertising revenues that kept them profitable and allowed them to develop teams of reporters who could take their time pursuing a story. We had no other alternatives for our news. The networks and the papers represented a wall of information with very few cracks.
Now the spread of mini-publishing presses and mini-television studios has turned that wall into rubble. There are no longer limited outlets of information, and consequently there is diminished consensus on what constitutes the most important events of the day. Now there is only a bewildering array of choices for the reader/viewer—and a tremendous opportunity for those who seek to tell their own stories and hurl their own papers onto the porches of the world. Now the new media presents each of us with the power—even beyond that of print—to tap deeply into the well of obscurity and bring forth a message, an idea, an image, that was heretofore unknown or unimagined.
Even in the midst of an economic crisis (in fact, maybe even because of it) the Arab entrepôt of Dubai is opening the Mohammed Bin Rashid School for Communication, to teach students the basics of film and new media storytelling in Arabic. “This is the first time in the Arab world you have a school of communications teaching the indigenous population in their indigenous language how to work within their own countries,” the school’s new dean, Ali Jaber, told Variety. “It is only when you tell your own stories to your own people that you’ll be able to tell them to others.”
The grammar of violence is a particularly powerful means of communication unleashed by this new visual potency. Napoleon once said he feared three hostile newspapers as much as a thousand bayonets. The image is a tool of revolutionaries as well as counterrevolutionaries.
During the protests surrounding the disputed 2009 elections in Iran, a woman named Neda Agha-Soltan was hit in the chest by a sniper’s bullet as she stepped from a car that belonged to her singing instructor. She bled to death almost immediately. In almost any other circumstance, she would have been known in the public consciousness only as one of an estimated ninety-eight people killed during the upheaval in Tehran and in the months afterward—a statistic instead of a person. But a man with a video camera happened to be standing nearby and filmed the fifty-two horrifying seconds of her shooting and death in the street, as her friends tried to stop the bleeding. That image became one of the signal facts—yes, facts—about the uprising that took place around that election stolen by Mahmoud Ahmadinejad. It was reproduced in thousands of propaganda videos and distributed immediately.
Neda has become the “Marianne” of Iran, the symbol of the people’s yearning for liberty that once was captured in oils by Eugène Delacroix in his painting Liberty Leading the People, and is now captured without warning by a cell phone camera and spread around the world in the space of a day. We see her as she actually died on Karekar Street, wearing sneakers and jeans, saying her last words: “It burned me.” In this instance, and in many others, the video form of expression has become a global vernacular for expressing reality.
Even the most radical organizations understand the power of visual media. In 2001, Al Qaeda reversed a decade-long Taliban prohibition on video as it created its own production company, As-Sahab, in order to spread its message and recruit new members. Headed up by the media-savvy disaffected American Adam Gadahn, the company once produced close to a hundred videos a year deep in the mountains bordering Afghanistan and Pakistan. Its reach, though, is global.
Evan Kohlmann, the director of the international consulting firm Flashpoint Partners, which follows terrorist activity, has respect for the skill of the As-Sahab productions, even if he finds their content despicable. “It’s actually amazing,” he said. “You’re talking about very, very high-quality video subtitling. You’re talking about English translations. Graphic sequences have been done showing rockets being fired into an American flag, and having the American flag exploding into pieces. And it, you know, these are very high-quality videos. They’re very dramatic. They get passed around like baseball cards. They’re being distributed in formats that you can even watch on your cell phone. So it shows us there are dedicated teams of individuals working on this. And that they’re spending quite a bit of time on this.”
In this instance, and in many others, the video form of expression has become the preferred method for expressing a point of view to an international audience, where a spoken language is not always held in common by interested parties. The Mohammed Bin Rashid School for Communication is in partnership with the Annenberg School for Communication and Journalism at the University of Southern California. Elizabeth Daley, the Annenberg School’s founding executive director and current dean of the USC School of Cinematic Arts, goes even further in her assessment about the historically transformative powers of visual media.
Daley likes to draw an analogy between our current state of linguistic history and that of Italy in the fourteenth century, where, cloistered behind monastery and university walls, scholars lectured in Latin while, in the streets and marketplaces, people spoke, joked, bargained, and yelled in vernacular Italian dialects.
“The corresponding argument today,” she writes, “simply put, is that for most people—including students—film, television, computer, and online games and music, constitute the current vernacular.”
If you take this metaphor to its logical end point, written speech fades away, like Latin, to mere ceremony and obscurity, whereas video becomes the only mode of future semiosis. Instead of sending e-mails, we’ll be sending videos.
This is not, however, an obituary for the printed word. There will never be a “death” of words on paper (or on screens or some other delivery mechanism), or an end to the sequentially ordered sentences that define how we transmit and preserve ideas. That form of expression will never cease to be relevant. I am choosing to use it right now, and you are choosing to receive it this way, from this book. More than this, the literacy of images shares a mutually reinforcing relationship with the literacy of words. The two are forever entangled.
Reading is so often a sensual experience, contingent on the ability of the reader to form pictures and shadows along with the writer. This works on the creative end, too. William Faulkner once wrote that the whole idea for The Sound and the Fury, arguably one of the greatest novels ever to come from the American South, derived from a single, puzzling image that he could not get out of his mind: a boy in the branches of a tree, peering into a girl’s bedroom through a window. First came the mind movie, then came the novel.
Yet we are breaking away from the past in one critical way that makes Elizabeth Daley’s point worth considering. The unstoppable rise of visual expression as a popular means of conveying truth is going to require a new discernment on the part of the reader/viewer: a combination of skepticism and incisiveness that assesses the value of the image-based argument rather than the spoken. It is a sensual kind of literacy.
* * *
We have to understand that writing ultimately charts back not to the eyes but to the ears. The building blocks of text are merely repurposed symbols of the spoken word.
Thirty thousand years ago, humans learned to transmit feelings and ideas to one another through a series of animal grunts and whines; these sounds grew more complex as the ideas grew more complex, and eventually our tribal units accepted certain whines as the common understanding of a concept. The utterance was a way to store and catalog experience for later retrieval. These ideas were eventually expressed as graven symbols: first letters, and then words assembled from those letters, atomizing and alienating the sign from its original home in the realm of grunts.
Images are traced to a different home, and also a different method of acquiring knowledge. They are more primal and less filtered. And we have a more emotional relationship with those messages. Sound-writing is primarily an intellectual exercise. Seeing is more libidinal. Naked images bring us back to some atavistic forest in our species’ memory, where beauty dapples the branches but killers lurk in the shadows. Delight and fear are intermingled.
We get a perverse and uncomfortable feeling when, for example, we see Neda Agha-Soltan struck in the chest by a sniper’s bullet and watch the life leave her eyes in less than thirty seconds. It twinges a chord of sympathetic pain; it makes us imagine what it would feel like to see an innocent friend or relative suffer the same inexplicable fate. Ultimately, the image reminds us of our own precarious existence.
Yet understanding the larger meaning of the Neda Agha-Soltan death video is impossible without knowing the background on the state of affairs in Iran, and why people were out protesting that day against Ahmadinejad and the stolen election. A broader conception is necessary, and text is an important element in supplying that context, even if that text is delivered as spoken word in the form of a voice-over to the images. Without it, you have only a meaningless act of violence. With the text and the accompanying context, the image becomes a powerful indictment of the sitting Iranian government and its callous treatment of its own citizens. The image does not live in isolation. The emotion it provokes has to be anchored in a larger conceptual and cultural understanding.
When ex-soldier Joe Cook and his father set out to shoot a video entitled “Dear Mr. Obama,” they relied on the public’s knowledge that many soldiers were returning from war in Iraq and Afghanistan missing limbs that had been blown off by IEDs. This allowed them to craft an intimate message from Cook to then-candidate Barack Obama, admonishing him not to abandon the U.S. troops. When it came time for Cook to walk away in the video, the camera provided the most jarring nonverbal message: Cook’s leg was missing—and we understood the context. This contextual understanding is a function of multiple forms of texts, including print, visual, and oral texts, each utilizing its particular strengths to create a foundation for deeper understanding.
When a filmmaker can tap into reserves of experience-based, image-based, and text-based knowledge already present inside the viewer, a bit of neurological necromancy takes over. One simple image becomes like an electric cord plugged into an existing grid of knowledge acquired through reading, listening, or experiencing, all of which make the image brighter and more immediate, pouring a surge of intellectual electricity into a rebus that fills the eye and dominates the discussion. The image touches us in a way that speech cannot; it becomes the handle for carrying the whole luggage. It brings a superbly human dimension into the picture.
Accomplishing this trick is a key element of visual literacy, not just unconsciously on the part of the viewer, but also for the one making the film. There is an implied contract between filmmaker and viewer that what is being shown is not only an accurate depiction of what is inside the camera’s frame, but an accurate distillation of the larger story outside it.
Here’s a personal story that helps explain what I mean. We live not far from the town of Armonk, New York, where an apple orchard and farm stand stood for decades. My wife and I for years had been in the habit of dropping by for a dozen apples or a jug of cider whenever we drove down that road, and so we were dismayed to hear that the family that owned the orchard was selling it off to a condominium developer. This is not an uncommon fate for agricultural land in this part of New York State, as everyone around here is aware, and the slipping-away of the bucolic character of the landscape is a subject you hear a lot of at every regional zoning meeting. I certainly don’t begrudge any of the parties involved their desire to make a buck. The farmer was making a reasonable business decision, as was the condo builder. Nobody is a bad guy here. But it does spell the loss of that orchard.
My son was particularly attached to this place and wanted to make a short film about the change that was imminent. So I took him out there to film on a day when the developer was setting up a sales office for McMansions next to the apple trees that would shortly be uprooted. We all happened to be sitting in the fellow’s office as he explained what was happening. “We’re going to make this place beautiful,” he said. “We’re calling it Cider Mill.”
Behind the guy’s shoulder was a window that looked out onto the farmhouse yard, and you could plainly see a bulldozer ripping down one of the structures. His words were given a new visual layering. And the viewer could immediately understand the irony: they were building the very opposite of a cider mill; in fact, they were destroying the mill in order to create a simulacrum.
We were not trying to create a “gotcha” moment, or make the developer look bad. But that ten-second moment represented a marriage of sound-based knowledge and an eye-based event that illustrated a home truth about the place where we live. All in all, an effective piece of film. Like Freddie Wong’s compression of reality, it worked because it was literate.
When you tell a good visual story, you are creating a mysterious chain of events in the viewer’s mind with every electronic visual choice you make, and the final truth and beauty of it depend largely on the degree of literacy brought to its creation.
* * *
We have seen a historical-change moment like this before. Television was introduced to the American public right after World War II, and by 1968 its influences had sunk so deep into the culture that a new way of understanding it had to be developed.
An academic named John Debes was particularly out in front on this. With the help of the Eastman Kodak company, he convened a seminar dedicated to what he called “visual literacy”—the idea that comprehension of what we see in movies, photography, and television is as vital a skill as that of reading the written sentence. Being visually literate, said Debes, enables the viewer “to interpret the visible actions, objects, symbols, natural or man-made, that he encounters in his environment.”
This idea coincided with a similar movement in “media literacy,” a more politicized discipline of the 1960s that encouraged citizens to be more insightful consumers of what they were absorbing from radio and television. The effects of American political demagoguery, it was thought, would be easier to understand if the listener-viewer had a more sophisticated understanding of how a message was constructed. The anthropologist Edmund Carpenter said, “The new mass media—film, radio, TV—are new languages, their grammar as yet unknown. Each codifies reality differently, each conceals a unique metaphysics.”
We Americans fell in love with that reality. In 1945, fewer than ten thousand televisions populated our homes. Only fifteen years later, that population had grown to almost sixty million. Today there are well over a quarter of a billion sets in the United States, with almost 97 percent of all households owning at least one set. (This is actually down from 98.9 percent, as fewer people now rely on television to get their visual fix.) But that doesn’t tell even half the story, as our computers, tablets, personal entertainment devices, and cell phones stream an endless array of content. And that content is available not just when we choose to watch. There are screens at our restaurants and retail stores, at our offices and gas stations, and even in our cars and airplane seats. It took less than thirty years for the science fiction of Ridley Scott’s Blade Runner to become reality, and in cities like New York, Tokyo, and Seoul, skins of buildings have been turned into screens, and visual media has become an essential part of the vocabulary of architecture. The Grand Indonesia tower, a fifty-seven-story building in Jakarta, is wrapped in more than sixty thousand square feet of screens. In an interview with a trade publication, the architect Darryl Yamamoto says that “up to now architecture has been about fitting buildings into three-dimensional space. The inclusion of video screens completely covering a building’s surface changes that equation of how a building occupies that space. In a sense, video-screen coverage on a building surface places it in a fourth dimension where pictorial and iconic imagery now becomes a representational feature of how the building presents itself.” Moving images increasingly occupy our public spaces and add to the ever-expanding body of visual data we are steeped in.
The time has come to rebuild the idea of “visual literacy”—not to overthrow it, but to expand it. We must now take into account what is technologically inevitable in the twenty-first century: the proliferation of messages that are increasingly divorced from their written foundations and increasingly married to the wordless visual pathways of knowledge that have been the human birthright since before we were even human.
We have now reached a point in our romance with the electronic image where it has moved in to stay, and all of us will be called upon to be not just consumers but producers. Listening to the story and judging it is no longer enough. We now have to start telling the story ourselves, all of us, if this is to be a literate society.
The medium is ripe not just for entertainment and frivolity. It has a powerful social effect that we cannot ignore. Those in power have much to fear from it, just as Napoleon said he feared newspapers more than bayonets.
One of the founding intellectuals of the media literacy movement was Marshall McLuhan, a Canadian professor whose book Understanding Media: The Extensions of Man made clear the distinction between what he called “hot media” and “cool media.” This had nothing to do with temperature or style. It had to do with the level of participation required by the receiver. The more the medium filled up the frame of reality (as in a photograph or a movie), the more it was “hot,” and the more passive the experience. The starker the medium, and the more it called upon the viewer to fill in his or her own mental spackle (as with a cartoon, a discussion, or a snippet of jazz), the more it was “cool”—that is, an experience that asks for a bit more intellectual participation, a little more cognitive presence.
By this definition, videos on YouTube and elsewhere are mostly hot, because all we have to do is click and watch. But the larger field of possibility they open is truly breathtaking because of the newfound ease of creating one’s own moving images and adding them to the growing body of society’s collective dialogue. Only the very talented and the very lucky could manage this in McLuhan’s time because of the huge commercial barrier (represented by publishers, movie studios, magazine editors, and television stations) that stood between the creator and the market, and the creative barrier presented by the lack of access to affordable tools of creation.
Those barriers are now falling fast. The experience of media is growing less and less passive.
When German playwright and author Bertolt Brecht wrote a famous essay called “Radio as an Apparatus of Communication” in 1932, he was not imagining the distribution of image-driven media through the television and now the Internet. Still, he understood well the importance of being proficient not just in listening but also in speaking, and that communication is, eventually, a two-way street of dialogue as opposed to monologue, even within mechanically mediated forms of communicating.
“It is purely an apparatus for distribution, for mere sharing out. So here is a positive suggestion: change this apparatus over from distribution to communication. The radio would be the finest possible communication apparatus in public life, a vast network of pipes. That is to say, it would be if it knew how to receive as well as to transmit, how to let the listener speak as well as hear, how to bring him into a relationship instead of isolating him. On this principle the radio should step out of the supply business and organize its listeners as suppliers. Any attempt by the radio to give a truly public character to public occasions is a step in the right direction.”
Brecht was living in the golden age of cinema, but he focused on the radio because it was one of the most dominant forms of communication available in his day. Today the vast majority of our information is delivered through visual media: the television, the cinema, the Internet, and the screens that surround us where we work, shop, socialize, and learn.
Update Brecht’s observation for changing technology and we’re essentially reading about the Internet and the ease of sharing the enormous power of the wordless icon, the thing that has no alphabetic expression. I can only imagine that Brecht would have smiled at the dramatic explosion of video-sharing outlets such as YouTube, whose slogan implores its billions of users to “Broadcast Yourself.”
The manipulation of video is shortly to become a global language, and its seductive powers will be on display for all to see. But it is morally neutral. It can be used for good and evil and everything in between.
The most pertinent question now facing us is not how can we resist this revolution in thought, but how can we respond with the maximum amount of thoughtfulness, energy, and smarts? How will we present ourselves to the world?
How literate will we be?
Copyright © 2013 by Stephen Apkon
Foreword copyright © 2013 by Martin Scorsese
Table of Contents
Foreword by Martin Scorsese
All the World's a Screen
What Is Literacy?
The Brain Sees Pictures First
The Evolution of the Audience
The Big Business of Images
Grammar, Rhythm, and Rhyme in the Age of the Image
Teaching a New Generation
The Sharpening Picture
Notes on Sources