Series: Anthem Series on Russian, East European and Eurasian Studies
Product dimensions: 5.90(w) x 8.90(h) x 0.80(d)
About the Author
Veronika Makarova is an Associate Professor in the Department of Languages and Linguistics and the Interdisciplinary Linguistics Program Chair at the University of Saskatchewan, Canada.
Read an Excerpt
Russian Language Studies in North America
New Perspectives from Theoretical and Applied Linguistics
By Veronika Makarova
Wimbledon Publishing Company
Copyright © 2012 Veronika Makarova; individual contributors
All rights reserved.
PHONETICS. TRACING EMOTIONS IN RUSSIAN VOWELS
Veronika Makarova
University of Saskatchewan
Valery A. Petrushin
Opera Solutions, San Diego, California
The advantage of the emotions is that they lead us astray, and the advantage of science is that it is not emotional.
— Oscar Wilde (1891)
This chapter examines acoustic cues of six emotional states (neutral, surprise, happiness, anger, sadness and fear) in the production of Russian vowels. The findings are presented and discussed for three groups of vowels: unstressed, stressed and pitch accented. The research data come from the RUSLANA (Russian Language Affective) database of Standard Russian.
Emotions "convey the psychological state of a person" (Iliev et al. 2010, 445). They are "conceived to be natural bodily experiences and expressions, older than language, irrational and subjective, unconscious rather than deliberate, genuine rather than artificial, feelings rather than thoughts" (Edwards 1999, 272). Humans can express and identify emotions with a variety of communication forms including vocal (linguistic, verbal art) and non-vocal (facial expressions, shaking, changes in skin coloration, blood pressure, heart rate, sweating, posture, clothing, hairstyle, non-verbal art, gesticulation and behavioral patterns) (Anolli and Ciceri 2001; Iliev et al. 2010).
Expression of emotions in speech currently attracts scholars from a wide range of disciplines, such as literary criticism, neuroscience, anthropology, pragmatics, communication sciences, psychology, physiology, linguistics, applied linguistics, education, engineering, computer science, psychotherapy and psychiatry (Wierzbicka 1997; Johnstone and Scherer 2000; Pavlenko 2005; Imai 2007). All the structural levels and most functional forms of language serve to express emotions. Linguistically, emotions are rendered via phonetic (acoustic), graphic, phonological, morphological, lexical, syntactic, sociolinguistic and discoursal (textual and pragmatic) devices as well as their combinations (Cowie et al. 2001; Bazzanella 2004). Specifically, lexical cues of affect include words perceived to be associated with particular emotions, e.g., 'wrong' and 'damn' are associated with negative emotions (Cowie et al. 2001). Discourse cues of emotions include particular types of verbal responses influenced by emotions, such as rejection, repetition, rephrase, ask, start-over, etc. (Edwards 1999). Emotions in discourse are seen as "a way of talking" that can be contrasted and used on occasion, and may include rhetorical opposites and contrasts or sets of conversational templates and scenarios (Edwards 1999, 278). An example of a mixed cue (both visual and acoustic) is the degree of jaw movement correlating with the emotion of irritation (Banse and Scherer 1996). It is often extremely hard to disentangle some of the emotional cues or estimate their exact contribution to perceived emotion, since they are expressed at multiple levels of language as well as by non-linguistic cues and their interactions (Dietrich et al. 2006). Despite all the variability in the expression of emotions in language, human subjects can identify emotions even in very short extracts of speech (such as one vowel) containing only acoustic cues (Toivanen et al. 2006).
The expression of emotion in languages has identifiable common characteristics as well as unique language-specific features (Lutz 1988; Wierzbicka 1997; Goddard and Wierzbicka 2002). It has been claimed in earlier research that Russian has some "specifically Russian" emotional terms as well as unique syntactical ways of expressing emotions (Levontina and Zalizniak 2001). This chapter focuses on the expression of emotion in Russian via the acoustic parameters of Russian speech.
The task of analyzing linguistic portrayals of emotion is made even more challenging due to disagreements among scholars about the definitions and classifications of emotions (Nordstrand et al. 2004; Scherer 2000; Zervas et al. 2007). Dimensional approaches view emotion as a continuum or gradual transition; they often map emotions in two- or three-dimensional space continua (Osgood 1957; Davitz 1964; Plutchik 1980; Nordstrom et al. 2004; Grimm et al. 2007). However, for reasons of simplicity, most phonetic studies (Nordstrom et al. 2004; Waaramaa et al. 2006) follow the discrete or category approach, which identifies a few basic emotions that are considered distinct from each other (Ekman 1979; Iida 2002). In this study, we also follow the discrete approach. From the commonly identified list of basic emotions (Iliev et al. 2010), we have selected five states (fear, joy, sadness, surprise and anger) which are examined against the "neutral" or un-emotive state.
In phonetic studies of emotive and affective speech, most of the attention so far has been given to prosodic correlates of emotion, primarily to pitch parameters, such as the types, magnitudes, duration and steepness of pitch movements and the declination within phrases (Banse and Scherer 1996; Paeschke and Sendlmeier 2000). Some characteristics of the temporal and rhythmical organization of speech as well as intensity have also been shown to be relevant for emotive information (Scherer 1989; Arnfield et al. 1995; Stibbard 2000). Other suprasegmental parameters which have been shown to contribute to the expression of emotion in speech include voice quality, pauses and boundaries (Cowie et al. 2001; Gobl and Chasaide 2003; Min Lee and Narayanan 2005).
It has been observed that some segmental features, such as segmental durations, spectra and formant frequencies, are also salient for the expression of emotion (Min Lee and Narayanan 2005; Kienast and Sendlmeier 2000; Cowie et al. 2001; Tickle 2000; Fernandez 2004). The total list of features singled out by researchers as acoustic correlates of emotive states can vary from approximately thirty (McGilloway et al. 2000) to over one hundred (Fernandez 2004).
Acoustic characteristics of vowels have been named among segmental cues of emotion, but there has been some disagreement with respect to what exactly happens to vowel quality under affect. While some studies conclude that vowel quality changes significantly under emotion (Fernandez 2004), others find that the impact of emotion on vowel quality is minimal (Szameitat et al. 2009). Some explanations of the changes in vowel characteristics under affect are found in speech production studies that show articulatory changes in emotional states, such as changes in lip opening, rising, protrusion and rounding (Caldognetto et al. 2004) and in tongue movements (Fónagy 1976). Another observed change in emotive vowels is the increase in the values of the higher formants (F3 and F4) under some negative emotions, which is explained by a more tense and shortened vocal tract (Waaramaa et al. 2006). Some recent experiments suggest that the observed impact of some emotions on articulation (such as the vertical and lateral labial distance) may differ by the type of vowel (Nordstrom et al. 2004). All the above-mentioned studies were performed on languages other than Russian.
This chapter contributes to the field by investigating emotion-related parameters in the acoustic characteristics of Russian vowels. The materials for the study were retrieved from RUSLANA, a Russian affective speech database which represents the phonemes, major syntactical and intonation contour types in Russian (Makarova and Petrushin 2002). Emotions were simulated by the speakers. This procedure is so commonly employed in other phonetic experiments and emotive databases (Nordstrom et al. 2004; Toivanen et al. 2006) that it has been called "the preferred way of obtaining emotional voice samples in the field" (Scherer 2003, 232). The database and the extracted features are described in the following section.
Our study pursued the following major goals:
1. To investigate the effect of emotive-affective state on major acoustic parameters of Russian vowels grouped by accentual type (accented, stressed, unstressed);
2. To investigate the effect of emotive-affective state on major acoustic parameters of individual Russian vowels.
In this study, we investigated the parameters of the Russian vowel phonemes, which we denote in SAMPA transcription as /a/, /i/, /u/, /e/, /o/ and /1/. The symbol /1/ is used in this chapter to represent the high central vowel, as in the Russian word syr ('cheese').
II. Materials and Methods
II.1. The database
The RUSLANA (Russian Language Affective) database includes the recordings of speakers of Standard (St. Petersburg) Russian portraying the following six emotional states: neutral, anger, fear, happiness, sadness, and surprise. These emotions are typically represented in phonetics and speech processing studies among 'archetypal emotions' (Cowie et al. 2001; Banse and Scherer 1996). The database also represents the major syntactical types of Russian (statements, 'yes-no,' alternative and wh-questions, echo-questions and exclamations) and basic intonation contours that are linked with those sentence types (Bryzgunova 1977). All the phonemes of Russian have been included in the database. RUSLANA includes utterances from 61 subjects (12 male and 49 female). Each speaker recorded ten sentences of different syntactical type and intonation pattern portraying the above-mentioned six emotional states, that is, each speaker produced 60 utterances. Figure 1.1 shows one exclamatory utterance produced with different emotions by a female speaker.
All the data were recorded on a portable digital audio tape recorder, Sony TCD-D8 at 48 kHz sampling rate, via Sennheiser headphone set, in a soundproof recording studio of the Department of Phonetics, St. Petersburg State University, St. Petersburg, Russia. The obtained recordings were converted into monophonic Windows PCM format at 32 kHz sampling frequency and 16-bit resolution.
In this study, we used 600 utterances from ten speakers (five male and five female). These ten speakers were selected based on the results of the database evaluation. In the process of evaluation, 30 speakers of Standard Russian (10 males and 20 females) were requested to perform two evaluations of the randomly presented stimuli. In the first evaluation, the listeners were requested to identify the emotion they heard portrayed in each utterance. In the second evaluation, they ranked how well every utterance portrayed a given emotion on a ten-point Likert scale. The speakers whose utterances ranked the highest in both evaluations were selected for the study to ensure the quality of emotion portrayal.
II.2. Feature extraction
The RUSLANA database provides a number of acoustic features for each utterance with a 10 ms step interval. It also provides phoneme-level labeling for all utterances. We used these data to estimate features for each phoneme in every utterance. The following features have been extracted and analyzed:
Phoneme duration (Dur);
Percentage of voiceness;
Average energy (E);
Average fundamental frequency value (F0);
Average F0 derivative (F0deriv);
Average formant values (F1, F2, F3);
Average formant bandwidths (BW1, BW2, BW3).
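As a rough illustration of how frame-level values at a 10 ms step can be collapsed into per-phoneme features, the following Python sketch averages hypothetical F0 and energy tracks over one labeled segment. It does not reproduce the RUSLANA tooling: the `Segment` class, the `phoneme_features` function, the millisecond boundaries and the convention that an F0 of 0.0 marks an unvoiced frame are all assumptions made for the example, and the numbers are invented.

```python
from dataclasses import dataclass
from statistics import mean

FRAME_STEP_MS = 10  # frame step stated in the text


@dataclass
class Segment:
    """Hypothetical phoneme-level label: name plus boundaries in ms."""
    label: str
    start_ms: int  # inclusive
    end_ms: int    # exclusive


def phoneme_features(segment, f0_track, energy_track):
    """Aggregate frame-level tracks over one labeled phoneme.

    f0_track[i] and energy_track[i] describe the frame that starts at
    i * FRAME_STEP_MS. An F0 of 0.0 is assumed to mark an unvoiced frame.
    """
    first = segment.start_ms // FRAME_STEP_MS
    last = -(-segment.end_ms // FRAME_STEP_MS)  # ceiling division
    f0 = f0_track[first:last]
    energy = energy_track[first:last]
    if not f0:
        raise ValueError("segment lies outside the feature tracks")
    voiced = [value for value in f0 if value > 0.0]
    return {
        "Dur": segment.end_ms - segment.start_ms,          # ms
        "voiced_pct": 100.0 * len(voiced) / len(f0),       # % voiced frames
        "E": mean(energy),                                 # average energy
        "F0": mean(voiced) if voiced else 0.0,             # Hz, voiced frames only
    }


# Invented tracks covering 60 ms of speech (six 10 ms frames).
f0_track = [0.0, 210.0, 220.0, 230.0, 0.0, 0.0]
energy_track = [0.01, 0.05, 0.06, 0.05, 0.02, 0.01]

feats = phoneme_features(Segment("a", 10, 40), f0_track, energy_track)
onset = phoneme_features(Segment("t", 0, 20), f0_track, energy_track)
```

Averaging F0 over voiced frames only is one possible design choice; the chapter does not say how unvoiced frames were treated.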
Additionally, the values of average power spectrum on logarithmic scale were estimated for the following 16 sub-bands: 0–500 Hz, 501–1000 Hz, 1001–1500 Hz, 1501–2000 Hz, 2001–2500 Hz, 2501–3000 Hz, 3001–3500 Hz, 3501–4000 Hz, 4001–5000 Hz, 5001–6000 Hz, 6001–7000 Hz, 7001–8000 Hz, 8001–10000 Hz, 10001–12000 Hz, 12001–14000 Hz, 14001–16000 Hz. These power spectrum features are denoted here by the letters "Fq" followed by the upper bound of the frequency range; for example, the sub-band 2501–3000 Hz is denoted as Fq3000.
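The naming scheme for the power spectrum features can be sketched in a few lines of Python. The upper bounds are taken from the list above; the function name `subband_label` is hypothetical, and assigning a fractional frequency such as 3000.5 Hz to the next band up is an assumption, since the text gives integer bounds only.

```python
from bisect import bisect_left

# Upper bounds (Hz) of the 16 sub-bands listed in the text.
UPPER_BOUNDS_HZ = [500, 1000, 1500, 2000, 2500, 3000, 3500, 4000,
                   5000, 6000, 7000, 8000, 10000, 12000, 14000, 16000]


def subband_label(freq_hz):
    """Return the 'Fq' label of the sub-band that contains freq_hz."""
    if not 0 <= freq_hz <= UPPER_BOUNDS_HZ[-1]:
        raise ValueError("frequency outside the 0-16000 Hz range")
    # Index of the first upper bound that is >= freq_hz.
    index = bisect_left(UPPER_BOUNDS_HZ, freq_hz)
    return f"Fq{UPPER_BOUNDS_HZ[index]}"


label = subband_label(2750)  # falls in the 2501-3000 Hz band
```

The top bound of 16000 Hz matches the Nyquist frequency of the 32 kHz recordings described earlier.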
In total, the features for about 17,100 occurrences of phonemes have been extracted. The phoneme-level labeling for vowels allows for the distinction between unstressed, stressed, and pitch accented vowels. In our analysis, we used 325 occurrences of pitch accented vowels, 1,393 occurrences of stressed vowels and 3,891 occurrences of unstressed vowels.
The extracted features were subjected to univariate ANOVA to determine the effect of emotion type on the variability of every parameter. The analysis was conducted for all the instances of every vowel phoneme in the database, as well as separately for accented, stressed and unstressed vowels. The effects were considered significant at a p-value less than 0.05. Post-hoc analysis then employed multiple pairwise comparison tests using Tukey's honestly significant difference criterion. These tests were performed for each pair of emotion types, such as anger/sadness, anger/fear, etc., and for all parameter means that showed a significant effect of emotion type in the preceding ANOVA. These procedures allowed us to determine whether each feature varies significantly with emotion type and, if so, which pairs of emotional states differ significantly in their average feature values.
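To make the first step of this procedure concrete, the Python sketch below computes the F statistic of a one-way ANOVA for a single feature grouped by emotion. It is a minimal illustration and not the authors' analysis: the function name `one_way_anova_f` is hypothetical, the duration values are invented, and the p-value and the Tukey post-hoc comparisons, which would normally come from a statistics package, are left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups maps a label (here an emotion) to a list of measurements.
    """
    samples = [list(values) for values in groups.values()]
    k = len(samples)                       # number of groups
    n = sum(len(s) for s in samples)       # total number of observations
    grand_mean = mean([x for s in samples for x in s])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented vowel durations in ms, for illustration only.
durations_ms = {
    "neutral": [60.0, 62.0, 64.0],
    "sad": [70.0, 72.0, 74.0],
    "angry": [76.0, 78.0, 80.0],
}

f_stat, df_between, df_within = one_way_anova_f(durations_ms)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom corresponds to a small p-value, which is the quantity compared against the 0.05 threshold.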
III.1. Vowel groups (accented, stressed and unstressed vowels)
This section describes the features which were found to be significantly different across the six emotive-affective states for the three vowel groups: unstressed, stressed and accented.
III.1.1. The effect of emotion type on major vowel parameters
For all three vowel types, emotions have a statistically significant effect on the variability of all 25 parameters analyzed in the study: vowel duration (Dur), average energy (E), average fundamental frequency (F0), and all the power spectrum features; first, second and third formants (F1, F2, F3), F0 derivative, and formant bandwidths (BW1, BW2, BW3).
Average values for the analyzed parameters by emotive type of the three groups of vowels are represented below in Tables 1.1–1.3.
We will comment briefly on some of the findings represented in Tables 1.1–1.3.
Emotion type is a significant factor in the variability of duration in all three vowel groups (unstressed, stressed and accented). In all three groups, neutral vowels have the shortest duration, i.e., all emotive states lengthen vowels. Emotive unstressed vowels are on average 7.7 ms longer than the neutral ones (11 percent of the average duration of an unstressed vowel in the dataset); emotive stressed vowels are on average 12.2 ms longer than neutral vowels (14 percent of the average vowel duration); and the analogous extension for emotive accented vowels is 25.0 ms (22 percent of the average vowel length). Predictably, vowel duration within one emotive set decreases consistently from 'accented' to 'unstressed,' which is a consequence of vowel lengthening with an increased degree of prominence.
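The lengthening figures above also imply an approximate average duration for each vowel group, since each extension is reported together with its share of the group average. The Python sketch below performs that back-calculation; the function name `implied_average_ms` is hypothetical, and because the percentages are rounded in the text the results are approximations only.

```python
# (mean lengthening in ms, share of the group's average vowel duration),
# as reported in the text.
reported = {
    "unstressed": (7.7, 0.11),
    "stressed": (12.2, 0.14),
    "accented": (25.0, 0.22),
}


def implied_average_ms(extension_ms, share):
    """Average duration implied by an extension and its relative size."""
    return extension_ms / share


implied = {
    group: round(implied_average_ms(extension_ms, share), 1)
    for group, (extension_ms, share) in reported.items()
}
```

The implied averages come to roughly 70 ms, 87 ms and 114 ms for unstressed, stressed and accented vowels, respectively.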
In all three groups, there are significant differences in the durations of sad/neutral and afraid/neutral vowels. In the accented vowels group, maximal vowel duration is found in the production of the simulated 'sad' emotive state (132.5 ms average), followed by 'surprised' and 'afraid' (117.9 and 117.4 ms), with the shortest emotive vowels belonging to the 'happy' state (106.1 ms). These differences in duration across emotive states are statistically significant for 'sad/happy,' 'sad/neutral,' 'afraid/neutral' and 'neutral/surprised' states.
A different distribution of duration across emotive states is found in the stressed vowel group, where the 'happy' state has the longest duration (88.1 ms), closely followed by the 'angry' (88.1 ms), 'sad' (86.8 ms), 'afraid' (85.4 ms) and 'surprised' (84.3 ms) emotive states. Statistically significant are the differences in duration between neutral vowels, on the one hand, and all the emotive vowels, on the other. There are no statistically significant differences across any pairs of emotive stressed vowels.
Yet another picture emerges from the durational distributions in the group of unstressed vowels, in which the longest vowel duration is found in the 'angry' state (76.3 ms), followed by 'afraid' (72.0 ms), 'sad' (70.8 ms), 'happy' (68.9 ms) and 'surprised' (68.1 ms). The unstressed group of vowels has significant differences in duration between 'sad/neutral,' 'afraid/neutral,' 'angry/sad,' 'angry/happy,' 'angry/neutral' and 'angry/surprised' states. Figure 1.2 below represents the average duration values for accented and unstressed vowels.
In all three vowel datasets, vowels in neutral utterances have the lowest energy (0.027, 0.028 and 0.024 rms in the unstressed, stressed and accented datasets, respectively), closely followed by vowels in the 'sad' emotive state (0.032, 0.032 and 0.028 rms, respectively). The emotive states displaying the highest vowel energy in all the datasets are 'angry' (0.075, 0.069 and 0.063 rms in the unstressed, stressed and accented datasets) and 'happy' (0.063, 0.077 and 0.063 rms). 'Surprised' and 'afraid' vowels have very close values in the medium energy range, varying between 0.039 and 0.051 rms. Figure 1.3 below shows the energy distribution of accented, stressed and unstressed vowels by emotive state.
Excerpted from Russian Language Studies in North America by Veronika Makarova. Copyright © 2012 Veronika Makarova; individual contributors. Excerpted by permission of Wimbledon Publishing Company.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Table of Contents
List of Tables and Figures
Introduction – Veronika Makarova
PART ONE: LANGUAGE STRUCTURES AND THEIR INTERFACE
1. Phonetics. Tracing Emotions in Russian Vowels – Veronika Makarova and Valery A. Petrushin
2. Phonology. Vowel–Zero Alternations in Russian Prepositions: Prosodic Constituency and Productivity – Lev Blumenfeld
3. Morphology and Lexicology Interface. Latest Russian Neologisms: The Next Step towards Analytism? – Julia Rochtchina
4. Syntax. Bi-nominative Sentences in Russian – Igor Mel’čuk
5. Psycholinguistics. The Effect of Grammatical Gender in Russian Spoken-Word Recognition – Irina A. Sekerina
PART TWO: APPLIED LINGUISTIC AND SOCIOLINGUISTIC ANALYSIS
6. Communicative Language Teaching and Russian: The Current State of the Field – William J. Comer
7. Low-Proficiency Heritage Speakers of Russian: Their Interlanguage System as a Basis for Fast Language (Re)Building – Alla Smyslova
8. Superior Speakers or “Super” Russian: OPI Guidelines Revisited – Ludmila Isurin
9. Who Am I?: Cultural Identities among Russian-Speaking Immigrants of the Third (and Fourth?) Wave and their Effects on Language Attitudes – David R. Andrews
10. Russian Language History in Canada. Doukhobor Internal and External Migrations: Effects on Language Development and Structure – Gunter Schaarschmidt
Afterword – Veronika Makarova
Index
What People are Saying About This
“The volume is an excellent collective effort that demonstrates the vibrancy and diversity of Russian language studies in North America.” —Dr Lara Ryazanova-Clarke, University of Edinburgh
“This collection of papers is a wonderful resource for everyone interested in the deep structure of the Russian language, linguistic identity and culture. The quality of research and scientific accuracy is brilliant, the articles providing comprehensive coverage of the modern state of scientific knowledge of the Russian language in Canada and North America.” —Professor Karina Evgrafova, St. Petersburg State University
“This book presents a very profound analysis of the Russian language spoken in North America. Scholars working in this field of science have a unique opportunity to be provided with reliable Russian-language investigation results, obtained thorough linguistic, methodological and cross-cultural research.” —Professor Ekaterina Kostina, Novosibirsk State Pedagogical University