Stop Staring: Facial Modeling and Animation Done Right [NOOK Book]


NOOK Book (eBook)
Price: $28.49
List Price: $49.99 (Save 43%)
Note: This NOOK Book can be purchased in bulk. Please email us for more information.


The de facto official source on facial animation—now updated!

If you want to do character facial modeling and animation at the high levels achieved in today’s films and games, Stop Staring: Facial Modeling and Animation Done Right, Third Edition, is for you. While thoroughly covering the basics such as squash and stretch, lip syncs, and much more, this new edition has been thoroughly updated to capture the very newest professional design techniques, as well as changes in software, including using Python to automate tasks.

  • Shows you how to create facial animation for movies, games, and more
  • Provides in-depth techniques and tips for everyone from students and beginners to high-level professional animators and directors currently in the field
  • Features the author’s valuable insights from his own extensive experience in the field
  • Covers the basics such as squash and stretch, color and shading, and lip syncs, as well as how to automate processes using Python

Breathe life into your creations with this important book, considered by many studio 3D artists to be the quintessential reference on facial animation.


Editorial Reviews

From Barnes & Noble
The Barnes & Noble Review
From earliest infancy, humans not only recognize faces: They recognize when something's wrong with the image of a face. If you're a 3D animator, that means literally everyone's a critic, even babies. With no room for error, facial animation is one of the most difficult tasks you'll ever face. Fortunately, you now have an expert consultant for all your facial animation projects: Jason Osipa, in Stop Staring, Second Edition.

There are no Ferraris or alien landscapes in this book: just faces. Osipa begins by deconstructing speech from a visual point of view, showing which facial movements matter, which you can ignore, and how to do lip syncing that works. Next, he brings the same insight to the eyes, brows, and eyelids: the parts of the face most important to communicating emotion.

Osipa notes that many animators spend too much time on brows, not enough on eyes. And he uses that insight to introduce the crucial concept of landmarking: it's the area surrounding what you think you're looking at that delivers the cues people rely on most. Failure to landmark is why so many beginning animators' smiles look fake, tortured, creepy, wrong.

Osipa spends an entire section of his book on the mouth -- introducing more sophisticated lip-sync techniques; then showing how to use "visimes" and "mouth keys" to achieve even greater realism. He returns to eyes and brows with the same depth, then walks through the rest of the process: connecting features, skeletal setup, weighting, rigging, interfaces, and taking your shot through production. There's even a chapter on leading-edge squash-and-stretch deformation. While Osipa's concepts are tool-independent, his examples are built in Maya (and Alias provides the Maya Learning Edition on CD-ROM so you can follow along).

Jason Schleifer, who animated Gollum, endorsed this book. You will, too. -- Bill Camarda, from the May 2007 Read Only


Product Details

  • ISBN-13: 9780470939611
  • Publisher: Wiley
  • Publication date: 9/14/2010
  • Sold by: Barnes & Noble
  • Format: eBook
  • Edition number: 3
  • Pages: 416
  • Sales rank: 1,333,152
  • File size: 45 MB
  • Note: This product may take a few minutes to download.

Meet the Author

Jason Osipa has been working in 3D since 1997, holding titles in all levels of animation, rigging, and directing in real-time and rendered 3D. He is currently running Osipa Entertainment, which offers contracting, consulting, and classes for games, TV, Direct-to-Video, and film. Prior to opening his own company, he worked at gaming industry giants LucasArts and EA, among others. He is the author of both previous editions of Stop Staring: Facial Modeling and Animation Done Right.


Read an Excerpt

Stop Staring

By Jason Osipa

John Wiley & Sons

ISBN: 0-7821-4129-3

Chapter One

Learning the Basics of Lip Sync

I love this stuff. When more than a few people told me I should write a book about my approach to facial animation, I started to take note. They saw how I took something so complex, so daunting, so evil, and made it so easy, simple, and -- dare I say it? -- fun! The trick is the implementation: the building techniques to make sure the mouth can get into all sorts of shapes convincingly and to have those shapes interact with each other attractively. In modeling for facial animation, mix and match is the name of the game. Instead of building individual specialized shapes for every phoneme and expression, we'll build shapes that are broader in their application and use combinations of those to create all those other specialized shapes. On the animation front, it's all about interface, and maximum control for minimum effort. You want to spend your time being creative and animating, not fighting with the complexities that can emerge from having a face with great range. It doesn't sound like there's much to these concepts for modeling and animating, and, yeah, they really are small and simple -- but they're huge in their details, so let's get into them.

Before we can jump into re-creating the things we see and understand on faces, we first need to figure out what those are. Starting on the ground floor, we're going to break down the essentials of lip sync and learn the only absolutes. Next, I'll go into how basic speech can be broken into two simple and basic cycles of movement, which is what makes the sync portion of this book so simple. Finally, at the end of this chapter, I take those two things -- what's essential and the two cycles -- and actually build them into a technique for working.

* The bare-bones essentials of lip sync
* The two speech cycles
* Starting with what's most important: visimes
* Building the simplest sync

The Essentials of Lip Sync

People overcomplicate things. It's easy to assume that anything that looks good must also be complex. In the world of 3D animation, where programs are packed with mile after mile of options, tools, and dialog boxes, overcomplication can be an especially easy trap to fall into. Not using every feature available to you is a good start in refining any technique in 3D, and not always using the recommended tools is when you're really advancing and thinking outside the box. Many programs have controls and systems geared for facial animation, but those same programs usually have better tools in their arsenals for the job.

If you've ever tried lip sync in CGI, it has probably been frustrating, complicated, difficult, and unrewarding. By the end of it, most people are just glad to see it get done and regret deciding to involve sync in their project. Another approach currently being explored is facial motion capture or automated sync. Neither of these looks very good or has much artistic flair -- yet.

Don't despair; I will get you set up for sync quickly and painlessly so you can spend your time on performance (the fun stuff!). If your bag is automation, there's still a lot of information on how to bump the quality up a couple of notches on that, too.

The lip sync portion of facial animation is the easiest to understand, because it's the simplest. You see, people's mouths don't do that much during speech. Things like smiles and frowns and all sorts of neat gooey faces are cool, and we'll get to them later, but for now we're just talking sync. Plain old speech. Deadpan and emotionless and, well, boring, is where our base will be. Now, you're thinking, "Hey! My face can do all sorts of stuff! I don't want to do boring animation!" You're right, your face can do all sorts of things, and who would ever want to do boring animation? For the basics, however, we're not going to complicate it yet -- that'll come later. In a very short while, in Part II, we'll build a model of a mouth that can do anything your mouth can do, and more, but you need all this stuff in your head before you can get there.

When dealing with the bare-bones essentials of lip sync and studying people, we've whittled it down to two basic motions. The mouth goes closed/open, and it goes narrow/wide; all of these are illustrated in Figure 1.1.

That's really, at its core, all that speech entails. If we were lip-syncing a character with a plain circle for a mouth (and we will in just a minute!), the shapes in Figure 1.2 would be all the keys we would need to create the illusion of speech.

Your reaction to this very short list might be, "What about things like F where I bite my lip, or L where I roll up my tongue?" That's just the point of these early chapters. We ignore those unique and complicated shapes, strip the process down to what is absolutely necessary to be understood visually, and then build it back up from a solid base. If these two controls -- Open/Closed and Wide/Narrow -- are all you have to draw on, you get creative about how to utilize them. Things like F just get pared back to "sort-of closed." If you were to animate this way and stop the animation on the frame where the "sort-of closed" is standing in for an F and say "That's not an F!" you'd be right, but in motion, you hardly notice, and what we're talking about here is motion. As a standard in this book, I'm going to try not to concern us with the individual frames, so much as the motion and the impression it gives.

Animating lip sync is all smoke and mirrors. What is really happening isn't relevant; it's all about the impression. How about M? "I need to roll my lips in together to say M; I can't do that with a circle-mouth-thingamajig." Sure you can, or at least you can give that same impression -- just close it all the way; that's good enough. When you get the lip sync good enough and focus on the acting, people only notice the acting. The sync becomes visual noise!
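As a rough illustration of the two-control idea, a circle-mouth pose can be described with just two numbers. This is a minimal sketch and not code from the book: the class name `MouthPose`, the 0-to-1 ranges, and every numeric value below are assumptions chosen only to show how F and M can be cheated onto Open/Closed and Wide/Narrow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MouthPose:
    """Hypothetical two-control pose; both values run from 0.0 to 1.0."""
    open_amount: float  # 0.0 = fully closed, 1.0 = fully open
    wide_amount: float  # 0.0 = narrow, 1.0 = wide

    def __post_init__(self):
        # Reject values outside the assumed control range.
        for name in ("open_amount", "wide_amount"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


# Assumed resting width for sounds that do not care about Wide/Narrow.
NEUTRAL_WIDTH = 0.5

# Illustrative numbers only: the text pares F back to "sort-of closed"
# and says M can simply be closed all the way.
CHEATS = {
    "M": MouthPose(open_amount=0.0, wide_amount=NEUTRAL_WIDTH),
    "F": MouthPose(open_amount=0.1, wide_amount=NEUTRAL_WIDTH),
}
```

Neither entry needs a dedicated sculpted shape; each is only a position on the two existing controls, which is the point the passage makes about impression over accuracy.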

Analyzing the Right Things

Let me take you on a little real-world tutorial of what's important, and what's not, in action. There can be a tendency -- and it's not necessarily a bad one -- to slow things down to the frame-by-frame level and analyze in detail what happens so that we can re-create it as animators. Here's an example:

Look in the mirror (or don't) and slowly, deliberately, and clearly enunciate the word pebble: PEH-BULL. We're trying to see just what exactly happens visually, on our face and lips, during that word, so we can re-create it in animation. Think about or watch what your lips are doing, in all the details: The little puff in your cheek after the B. The way the pursing of your lips for P is different than for B. How your tongue starts its way to the roof of your mouth early in the B sound and stays there until after the end of the word. All these details give you a pretty good idea of how to analyze and re-create the word pebble in animation, right? Wrong! That's exactly the wrong way to do it. That's how you would do it for a character who was speaking slowly and deliberately, and enunciating clearly. This is how a mirror can be dangerous if used incorrectly: over-analysis. None of these things, these details, are wrong -- they're just not necessary, and I'll explain why in the next paragraph.

This time, at regular, comfortable, conversational speed, say "How far do you think this pebble would go if I threw it?" How did the word pebble look? Check it out a few times, resisting the urge to do it slowly. As far as the word pebble is concerned, the overall visual impression is merely closed, a little open, closed, a little open. That's it. In a sentence spoken regularly, the word pebble will most likely look the same as mama or papa. Say the phrase again with that in mind. Try not to change what your mouth does, but instead notice that the Opens and the Closeds are the most significant things happening during the word. The mouth doesn't open wide enough (in this case) to see a tongue, so why would you animate it or need to spend time thinking about it? Because it's "correct"? That would be like animating a character's innards. You can't see them, but they're there, so animating them would be the "correct" thing to do, right? Wrong. It's a silly waste of the time you could otherwise spend on the acting.

The Opens and the Closeds are the most important of any of the things a mouth does. That's why puppets work. Does it look like a puppet is really saying anything? Of course not, but with the flapping of the jaw happening around the same time as the sounds the actor makes, your brain fills in the connection. You want to believe that the character is talking, and that's why the only truly important action in the word pebble is open, closed, open, closed.

This is how you analyze the right things: search for the overall impression, not the details. It's very easy to learn how to do this, but very hard to master; luckily you have a good coach.

Speech Cycles

This approach of identifying cycles and "visimes," which you'll learn more about in just a moment, is likely very different than what you know now. If you're looking for a phoneme-to-picture comparison chart, you're not going to get it here, because in this approach there is no absolute shape for each sound, and to point you in such a direction would do more harm than good. Each sound's shape is going to be unique, and you'll learn to identify it and its components. To start, let's talk about the two speech cycles.

In its simplest form, there are two distinct and separate cycles in speech: open and closed, as in jaw movement, and narrow and wide, as in lip movement.

These two cycles don't necessarily occur at the same times as each other, nor do they go all the way back and forth from one extreme to the other all the time. The open and closed motion generally lines up with the puppet motion of the jaw, or flow of air -- almost any sound being created -- while the narrow and wide motions have more to do with the kind of sound being created. In the phrase "Why are we watching you?" we get this sequence for the Wide/Narrow:

Word Wide/Narrow Sequence

why Narrow, wide

are No change/shape

we Narrow, wide

watching Narrow, slightly wide

you Narrow

Simple, right? Now let's look at the jaw or Open/Closed cycle. "Closed" refers to a position not completely closed, but closer to closed than to open.

Word Open/Closed Sequence

why Closed, open, closed

are Closed, open, closed

we Closed, slightly open

watching Closed, open, closed, slightly open, closed

you Closed / no change

That's it for the essentials. We're going to get into more shapes and controls and special cases, but there it is. The backbone of this book's lip sync technique has to do with simple analysis of the Wide/Narrow and Open/Closed cycles; over time, we'll add more and more levels, each one simple on its own, to create complex, believable performances.
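The two tables can be written down as two independent tracks. The sketch below is only an illustration: the per-word states are transcribed from the tables above, but the dictionary layout, the `flatten` helper, and the choice to drop a state that repeats the previous one are assumptions, not the book's technique.

```python
# Per-word states transcribed from the Wide/Narrow table.
WIDE_NARROW = {
    "why": ["narrow", "wide"],
    "are": [],  # listed as "No change/shape"
    "we": ["narrow", "wide"],
    "watching": ["narrow", "slightly wide"],
    "you": ["narrow"],
}

# Per-word states transcribed from the Open/Closed table.
OPEN_CLOSED = {
    "why": ["closed", "open", "closed"],
    "are": ["closed", "open", "closed"],
    "we": ["closed", "slightly open"],
    "watching": ["closed", "open", "closed", "slightly open", "closed"],
    "you": ["closed"],  # listed as "Closed / no change"
}


def flatten(track, words):
    """Concatenate per-word states, skipping a state equal to the previous one."""
    states = []
    for word in words:
        for state in track[word]:
            if not states or states[-1] != state:
                states.append(state)
    return states


PHRASE = ["why", "are", "we", "watching", "you"]
lip_track = flatten(WIDE_NARROW, PHRASE)
jaw_track = flatten(OPEN_CLOSED, PHRASE)

print(len(lip_track), lip_track)
print(len(jaw_track), jaw_track)
```

Under these assumptions the lip track ends up with 7 states and the jaw track with 11, which makes the earlier point concrete: the two cycles run side by side but do not change at the same moments.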

These cycles will be the foundation on which we build everything else. Taking the lead from the human mouth, I've based all of this book on the "simpler is better" mindset. Your mouth is lazy. Go on, admit it. It hits the major sounds and fudges over the rest, like someone whose name you forgot but say hi to anyway, hoping they don't notice you garbled it. It makes good sense; it's efficient.

When I say "cycle," I'm merely referring to how the shape will go from one to the other and then back again. There are no other stops along the way. The mouth will go open, closed, open, closed; the lips will go wide, narrow, wide, narrow.

I've had programs and books and teachers all show me sets of shape keys for sync I had to build, that included things like G. Why would we build a shape for or pay special attention to the letter G? Whether it's hard G or soft G, you can say it with your mouth in any of the shapes shown in Figure 1.3.

Now, these are obviously all pretty different. If you were, however, to try and say a soft or hard G with your mouth held in each of these poses, you could do it without much trouble. In 95% of cases when the letter G comes up in sync, we're going to ignore it-that is to say, it will get no keys in the animation. Further to that, we will most certainly not build a shape for G; how would we pick just one?

The G sound is actually created in the throat, not by the lips nor the open/closed positions of the mouth. This whole example with G is to illustrate the criteria of visimes. What are visimes, you ask? Read on.

Starting with What's Most Important: Visimes

So, we've decided that we're going to go with a less-is-more approach. That's good. For this non-inclusive approach, however, where we're trying to exclude all the extraneous mouth positions, something you'll need to know is what must be included. There are certain sounds that we make with our mouths that absolutely need to be represented visually, no matter what: visimes. These are the sounds that can only be made by the mouth with specific characteristics to the mouth shape, or range -- such as narrow for OO, as in food, or closed, for M, as in mom. There are more visimes to address than the Open, Closed, Wide, and Narrow can properly do, but even these must-see shapes can be "cheated" to fit into the "circle mouth" setup you've seen and we're about to build.

Why Phonemes Aren't Best for CGI

The most common key sets and setups out there for public consumption are based on phonemes, which means "the sounds your mouth makes during speech." To base sync on phonemes seemingly makes perfect sense -- it's the way it's been done for years with classical animation -- but for the newer world of CGI, it can be overly complicated. Phonemes worked fantastically on paper, where nothing comes for free; every frame must be drawn, and a little popping from frame to frame was just part of the style. In CGI -- in anything, really -- the eye is drawn to what is out of place, and generally, most computer animations don't have keys on every frame, or even every second frame. If just on the mouth there's a key on every frame of your lip sync, you had better believe that's where all eyes will be; not a good thing.

In the search for a better system for CGI sync, something became very apparent: There are three different kinds of sounds you can make during speech, and not all of these are very easy to see! Phoneme-based sync lumps all of these sounds together, and that is what precludes it from being the best solution for us. The important point I'm coming to here is that during speech some sounds are made primarily with your lips, some are made primarily by your tongue, and others are made in your throat and vocal cords. The only ones you absolutely have to worry about every time in animation are the sounds made primarily by the lips.

Phonemes are sounds, but what matters in animation is what can be seen. Instead of phonemes, of which there are about 38 in English (depending on your reference), what we're going to base our system on is "visual phonemes," or visimes. Visimes are the significant shapes or visuals that are made by your lips. Phonemes are sounds; visimes are shapes. They're all you really need to see to be convinced. You obviously cue these shapes based on the sounds you hear, but there aren't nearly as many to be seen as there are heard. The necessary visimes are listed in Table 1.1. Remember that these are shapes tied to sounds, not necessarily collections of letters exactly in the text.

Words are made up of these even if they aren't spelled this way; the word you comprises the two visimes EE and then OO, to make the EE-OO sound of the word. As we move forward you'll learn that if there is no exact visime for the sound, we'll merely use the next closest thing. For instance, the sound OH as in M-OH-N (moan) is not really shown on this chart, whereas OO is. They're not really the same, but they're close enough that you can funnel OH over to an OO-type shape.

That's just 7 shapes to hit, and only a few of those are their own unique shape to build! Analysis and breakdown of speech has just gone from 38 sounds to account for, to 7. Some sounds can show up as the same shape, such as UH and AW, which only need to be represented by the jaw opening.
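The funneling described above can be sketched as a small lookup. This is a hedged illustration, not the book's Table 1.1, which is not reproduced in this excerpt: the mapping below contains only sounds the excerpt itself mentions, and the names `VISIME_FOR_SOUND`, `JAW_ONLY`, `IGNORED`, and `visimes_for`, along with the rule of skipping any sound not listed, are assumptions made for the example.

```python
# Only sounds discussed in the excerpt; the full visime table is not shown here.
VISIME_FOR_SOUND = {
    "OO": "OO",  # narrow, as in "food"
    "EE": "EE",  # first half of the EE-OO in "you"
    "M": "M",    # closed, as in "mom"
    "OH": "OO",  # no exact visime, so it is funneled to the closest shape
}
JAW_ONLY = {"UH", "AW"}  # the text says these only need the jaw opening
IGNORED = {"G"}          # made in the throat; the text usually gives it no key


def visimes_for(sounds):
    """Reduce a list of heard sounds to the shapes that need to be seen."""
    keys = []
    for sound in sounds:
        if sound in IGNORED:
            continue
        if sound in JAW_ONLY:
            keys.append("jaw open")
        elif sound in VISIME_FOR_SOUND:
            keys.append(VISIME_FOR_SOUND[sound])
        # Sounds missing from this partial table are skipped (an assumption).
    return keys


print(visimes_for(["EE", "OO"]))       # the word "you"
print(visimes_for(["M", "OH", "N"]))   # the word "moan"
```

The sound breakdowns passed in are written by hand for the example; the design choice being illustrated is that the lookup is many-to-few, so several heard sounds collapse onto one visible shape or onto nothing at all.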


Excerpted from Stop Staring by Jason Osipa. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.


Table of Contents

Part I: Getting to Know the Face.

1 Learning the Basics of Lip Sync.

2 What the Eyes and Brows Tell Us.

3 Facial Landmarking.

Part II: Animating and Modeling the Mouth.

4 Visimes and Lip Sync Technique.

5 Constructing a Mouth and Nose.

6 Mouth Keys.

Part III: Eyes and Brows.

7 Building Emotion: The Basics of the Eyes.

8 Constructing Eyes and Brows.

9 Eye and Brow Keys.

Part IV: Bringing It Together.

10 Connecting the Features.

11 Skeletal Setup, Weighting, and Rigging.

12 Interfaces for Your Faces.

13 Squash, Stretch, and Secondaries.

14 A Shot in Production.


Customer Reviews

Average Rating: 5 (2 reviews)
Showing all of 2 Customer Reviews
  • Anonymous

    Posted July 8, 2013


    I would like to hav res 13

  • Anonymous

    Posted July 21, 2004

    really helped

    great help even for not-too-experienced modelers and animators
