The Acquisition of Complex Sentences

This comprehensive account of how children acquire complex sentences investigates spontaneous speech in English-speaking children between ages two and five. After examining the acquisition of numerous types of clauses, Holger Diessel argues that the acquisition process is determined by a variety of factors: the frequency of the various complex sentences in the language, the complexity of the emerging constructions, the communicative functions of complex sentences, and the child's social-cognitive development.

Product Details

  • ISBN-13: 9780521831932
  • Publisher: Cambridge University Press
  • Publication date: 8/15/2004
  • Series: Cambridge Studies in Linguistics Series, #105
  • Pages: 242
  • Product dimensions: 5.98 (w) x 8.98 (h) x 1.02 (d)

Meet the Author

Holger Diessel is a Research Fellow in the Department of Comparative and Developmental Psychology, Max Planck Institute for Evolutionary Anthropology, Leipzig.

Read an Excerpt

Cambridge University Press
0521831938 - The Acquisition of Complex Sentences - by Holger Diessel

1 Introduction

1.1 The scope and goal of this study

Complex sentences are grammatical constructions consisting of multiple clauses. They are commonly divided into two types: sentences including co-ordinate clauses, and sentences including a matrix clause and a subordinate clause. Three different types of subordinate clauses can be distinguished: complement clauses, relative clauses, and adverbial clauses. In traditional grammar, complement clauses are defined as arguments of a predicate in the superordinate clause; relative clauses are analysed as attributes of a noun or noun phrase; and adverbial clauses are seen as some sort of modifier of the associated matrix clause or verb phrase. All three types of subordinate clauses can be finite or nonfinite. Nonfinite subordinate clauses comprise infinitival and participial constructions. Examples of the various subordinate and co-ordinate clauses are given in (1)-(7).

(1) Peter promised that he would come. [finite COMP-clause]
(2) Sue wants Peter to leave. [nonfinite COMP-clause]
(3) Sally bought the bike that was on sale. [finite REL-clause]
(4) Is that the driver causing the accidents? [nonfinite REL-clause]
(5) He arrived when Mary was just about to leave. [finite ADV-clause]
(6) She left the door open to hear the baby. [nonfinite ADV-clause]
(7) He tried hard, but he failed. [COOR-clause]

This study examines the development of complex sentences in early child speech. It is based on observational data from five English-speaking children between the ages of 1;8 and 5;1. The data consist of about 12,000 multiple-clause utterances, which is probably the largest database that has ever been used in a study on the acquisition of complex sentence constructions. The existing literature is primarily concerned with children's comprehension of complex sentences based on data from experiments. There are only a few observational studies examining children's use of complex sentences in spontaneous speech. These are mainly concerned with the early use of complex sentences including adverbial and co-ordinate clauses; the literature on relative and complement clauses is almost entirely experimental. The current investigation is the first observational study to examine systematically the development of all multiple-clause structures in English and thus fills an important gap in the literature on the acquisition of complex sentences.

The primary goal of the study is to describe the development of complex sentences and subordinate clauses in spontaneous child speech. When do the first complex sentences emerge? What characterizes the earliest subordinate clauses? How does the development proceed? However, the study also addresses a number of more general questions concerning the organization of grammar and grammatical development.

The theoretical approach taken in this study combines construction grammar with the usage-based model (cf. Fillmore, Kay, and O'Connor 1988; Lakoff 1987; Goldberg 1995; Bybee 1985, 1995; Langacker 1987a, 1988, 1991; Barlow and Kemmer 2000; and Bybee and Hopper 2001). In construction grammar, grammar consists of interrelated symbolic units, combining a specific form with a specific function or meaning. In the usage-based model, grammar is seen as a dynamic system shaped by the psychological mechanisms involved in language use. In order to understand the dynamics of the system, one has to study the development of grammatical knowledge, both historically and in language acquisition. From this perspective, the current study is not just concerned with the acquisition of complex sentences in English but also with the structure and organization of grammar and the emergence of grammatical knowledge.

The investigation proceeds as follows. The remainder of the current chapter presents the central hypotheses of the study and provides an overview of the data. Chapter 2 discusses some central principles of construction grammar and the usage-based model, providing the theoretical background for the study. Chapter 3 gives a short definition of complex sentences and subordinate clauses. Chapters 4-7 present the bulk of the empirical analysis: chapter 4 describes the development of infinitival and participial complement clauses; chapter 5 is concerned with the early use of finite complement clauses; chapter 6 investigates the acquisition of relative clauses; and chapter 7 examines the emergence of co-ordinate and adverbial clauses; finally, chapter 8 provides a summary of the results and discusses the implications of the empirical findings for the theory of grammar and grammatical development.

1.2 Hypotheses

The study proposes two major hypotheses:

  • First, it is argued that the development of complex sentences originates from simple nonembedded sentences that are gradually 'transformed' to multiple-clause constructions. Two different developmental pathways are distinguished: (i) complex sentences including complement or relative clauses emerge from simple sentences that are gradually expanded to multiple-clause structures; and (ii) complex sentences including adverbial or co-ordinate clauses develop by integrating two independent sentences into a specific biclausal unit.

  • Second, it is shown that children's early complex sentences are organized around concrete lexical expressions. More schematic representations of complex sentences emerge only later when children have learned a sufficient number of lexically specific constructions to generalize across them.

In what follows I discuss the two hypotheses in turn.

1.2.1 From simple sentences to complex sentence constructions

The first multiple-clause structures that seem to consist of a subordinate clause and a matrix clause contain a single proposition (i.e. they describe a single situation). Consider the following examples:

(8) I wanna see it. [Nina 1;11]
(9) I think it's a little bear. [Nina 2;2]
(10) Here's a rabbit that I'm patting. [Nina 3;0]

Example (8) includes an infinitival construction that one might analyse as an early instance of a nonfinite complement clause. However, the current study shows that the complement-taking verbs of children's early nonfinite complement clauses basically function as quasi-modals that specify the meaning of the infinitive: rather than denoting an independent state of affairs the complement-taking verbs elaborate the semantic structure of the activity expressed by the nonfinite verb.

Example (9) shows a construction that seems to include an early instance of a finite complement clause. However, if we look at children's early finite complement constructions more closely we find that the apparent matrix clauses do not designate an independent state of affairs; rather, they function as epistemic markers, attention getters, or markers of illocutionary force, guiding the hearer's interpretation of the associated complement clause.

The sentence in (10) is characteristic of children's early relative clauses, which tend to emerge a few months after the first complement clauses. The sentence consists of a presentational copular clause and a relative clause that is attached to the predicate nominal. Following Lambrecht (1988), I argue that the presentational copular clause is propositionally empty: rather than denoting an independent state of affairs, it functions to establish a new referent in focus position making it available for the predication expressed in the relative clause.

Although the sentences in (8)-(10) consist of two clauses, or clause-like elements, they designate only a single situation (i.e. they contain only a single proposition) and do not involve embedding. As children grow older, the three constructions become semantically and morphologically more complex. The whole development can be seen as a process of clause expansion: starting from structures that designate a single situation and do not involve embedding, children gradually learn the use of complex sentences in which a matrix clause and a subordinate clause express a specific relationship between two propositions.

Like complement and relative clauses, adverbial and co-ordinate clauses develop from simple nonembedded sentences. However, the development takes a different pathway. It originates from two independent sentences that are pragmatically combined in the ongoing discourse. Two typical examples are given in (11) and (12):

(11) ADULT: It's not raining today. [Peter 2;6]
     CHILD: But . . . it's raining here.
(12) CHILD: Don't touch this camera. [Peter 2;7]
     CHILD: Cause it's broken.

Although the clauses in these examples are combined by a connective, they do not constitute a grammatical construction. The two conjuncts are expressed by utterances that are grammatically independent. Starting from such discourse structures, children gradually learn the use of complex sentences in which the matrix clause and the adverbial clause (or two co-ordinate clauses) are tightly integrated in a biclausal construction. Thus, while complement and relative clauses evolve via clause expansion, adverbial and co-ordinate clauses develop through a process of clause integration.

1.2.2 From lexically specific constructions to constructional schemas

The second major hypothesis asserts that children's early complex sentences are lexically specific: they are organized around concrete lexical expressions that are part of the constructions. In studies of adult grammar, constructions including subordinate clauses are defined over abstract grammatical categories. For instance, a relative clause is commonly defined as a subordinate clause modifying a noun or noun phrase in the matrix clause (i.e. [N(P) [REL-clause]S]NP), and a complement clause is a subordinate clause functioning as an argument of the matrix clause predicate (i.e. [V [COMP-clause]S]VP). However, adult grammar also includes lexically specific constructions, which are often overlooked (or ignored) in the syntactic literature. For instance, the comparative conditional construction (e.g. The faster you walk the sooner you'll be there) consists of two comparative phrases that are combined by two concrete lexical expressions: The___ the___ (cf. Fillmore, Kay, and O'Connor 1988). Such lexically specific constructions exist side by side with abstract grammatical representations in adult grammar (cf. chapter 2). However, in child language abstract grammatical representations are initially absent. A number of recent studies have shown that children's early grammatical constructions are organized around concrete lexical material: they are lexically specific constructions consisting of a relational term, usually a verb, and an open slot that can be filled by various elements (cf. Tomasello 1992, 2000a, 2000b, 2003; Tomasello and Brooks 1999; Pine and Lieven 1993; Pine, Lieven, and Rowland 1998; Lieven, Pine, and Baldwin 1997; Diessel and Tomasello 2000, 2001; Dabrowska 2000; Israel, Johnson, and Brooks 2000; Theakston, Lieven, Pine, and Rowland 2001, 2003; Abbot-Smith, Lieven, and Tomasello 2001; Wilson 2003; see also the older works by Braine 1976; MacWhinney 1975; and Bowerman 1976).

Consider, for instance, the examples in (13), adopted from a diary study by Tomasello (1992: 285ff.). The sentences were produced by his 2-year-old daughter.

(13) That's Daddy.               More corn.         Block get-it.
     That's Weezer.              More that.         Bottle get-it.
     That's my chair.            More cookie.       Phone get-it.
     That's him.                 More mail.         Mama get-it.
     That's a paper too.         More popsicle.     Towel get-it.
     That's Mark's book.         More jump.         Dog get-it.
     That's too little for me.   More Pete water.   Books get-it.

The formulaic character of these utterances suggests that they are defined upon the occurrence of specific lexical expressions. They consist of a constant part associated with an open slot that is usually filled by a nominal expression: That's___, More___, ___ get-it. Such lexically specific constructions are characteristic of early child speech. Virtually all of the multi-word utterances produced by Tomasello's 2-year-old daughter are organized around specific verbs (or other relational terms).
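The notion of a constant part combined with an open slot can be illustrated with a small sketch. It is only an illustration of the three frames in (13), not a model of how children represent constructions; the function names `frame_to_regex` and `match_frame` are hypothetical, and the slot is simply treated as any non-empty string.

```python
import re

# Frames taken from example (13); "___" marks the open slot.
FRAMES = ["That's ___", "More ___", "___ get-it"]


def frame_to_regex(frame):
    """Turn a frame such as 'More ___' into a regex with one capturing slot."""
    constant_parts = [re.escape(part) for part in frame.split("___")]
    return re.compile("^" + "(.+)".join(constant_parts) + "$")


def match_frame(utterance):
    """Return (frame, slot filler) for the first frame that fits, else None."""
    text = utterance.strip().rstrip(".")
    for frame in FRAMES:
        m = frame_to_regex(frame).match(text)
        if m:
            return frame, m.group(1)
    return None


# Utterances from (13) fit one of the frames; example (8) fits none of them.
for utterance in ["More corn.", "Block get-it.", "That's my chair.", "I wanna see it."]:
    print(utterance, "->", match_frame(utterance))
```

Each frame is keyed to a specific lexical item rather than to a category such as "verb" or "noun phrase", which is the sense in which these constructions are lexically specific.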

The current study shows that such lexically specific constructions are not only characteristic of children's early simple sentences but also of their early multiple-clause structures. Like simple sentences, complex sentences are tied to concrete lexical expressions in early child speech. They are associated with a specific conjunction, a formulaic matrix clause, or some other lexical expression providing a frame for the rest of the utterance. Abstract grammatical representations of complex sentences emerge only later when children have learned enough lexically specific constructions to extract a constructional schema from the data.

1.2.3 Determining factors

How do we explain the development of complex sentences from simple item-based constructions? The current study argues that the acquisition process is determined by multiple factors: the frequency of the various complex sentences in the ambient language, the complexity of the emerging constructions, the communicative functions of complex sentences, and the social-cognitive development of the child.

As we will see throughout this book, there is a close correlation between the age at which children begin to use a specific construction and the frequency of this construction in the ambient language. The more frequently a complex sentence occurs in the input data, the earlier it emerges in children's speech. This suggests that input frequency plays a key role in the acquisition process.

However, input frequency alone does not suffice to account for the data; there are various other factors that seem to have an effect on the development. In particular, the complexity of the emerging constructions appears to influence the acquisition process. If we look at children's early complex sentences, we find that they tend to be very simple: although they consist of two clauses (or clause-like elements), they contain only a single proposition and involve very little grammatical marking. More complex constructions denoting a relationship between two propositions in two full-fledged clauses emerge only later. This suggests that the order of acquisition is at least partially determined by the semantic and morphosyntactic complexity of the emerging constructions. Specifically, one might hypothesize that children's early complex sentences are simple (both semantically and formally) because more complex constructions are initially too difficult to plan and to produce.

Since the earliest complex sentences are not only simple but also frequent, complexity and frequency are difficult to disentangle; both correlate very closely with the order of acquisition. However, that complexity is an important factor independent of frequency is suggested by the fact that there are some very complex structures that should have emerged earlier if the development were solely determined by input frequency.

In addition to frequency and complexity, there are several other factors that seem to affect the acquisition process. In particular, the pragmatic functions of complex sentences have an important effect on the development. Most of the complex sentences that children begin to use early are especially well suited for the specific communicative needs of young children. For instance, the earliest relative clauses occur in presentational constructions that are not only frequent and simple but also pragmatically very useful in parent-child speech. Presentational relative constructions consist of a presentational copular clause that identifies a referent in the speech situation and a relative clause that expresses a predication about the previously established referent. Since children tend to talk about elements that are present in the speech situation, presentational relatives are well suited for the particular communicative needs of young children. It is thus a plausible hypothesis that the early appearance of these constructions is partly motivated by their pragmatic functions.

Finally, the development of complement clauses seems to be related to the child's developing 'theory of mind' (cf. Shatz, Wellman, and Silber 1983; see also Lohmann and Tomasello 2003). Complement clauses are commonly used as arguments of 'complement-taking verbs' such as think, know, and guess, which denote mental states and cognitive activities. However, in early child language complement-taking verbs occur almost exclusively in formulaic matrix clauses functioning as epistemic markers, attention getters, or markers of illocutionary force. Since the assertive use of these verbs presupposes a theory of mind that develops only gradually during the preschool years, one might hypothesize that young children do not use assertive matrix clauses because they lack the cognitive prerequisites for this use.

In general, the development of complex sentences seems to be determined by multiple factors. Frequency and complexity appear to be involved in the acquisition of all complex sentence constructions, but there are also pragmatic and general cognitive factors that play an important role in the developmental process.

1.3 Data

The analysis is based on observational data from five English-speaking children aged 1;8 to 5;1. The data come from 357 computerized transcripts of spontaneous parent-child speech. All data are taken from the CHILDES database (cf. MacWhinney 1995). The transcripts are in the CHAT format, which has been specifically designed to facilitate the computerized analysis of child language data (cf. MacWhinney 1995: ch. 5). The frequency counts and lists of examples presented throughout this study have been prepared with the help of the CLAN computer programs, which are part of the CHILDES system (cf. MacWhinney 1995: ch. 21).

The five children of this study are well known from the literature: Adam and Sarah are two of the children that Roger Brown investigated in his classical study (cf. Brown 1973); Peter is one of the children studied by Lois Bloom (1973); and the data from Nina and Naomi were provided by Patrick Suppes (1973) and Jacqueline Sachs (1983), respectively. All five children were born in the late sixties or early seventies.

Adam was the child of a minister and an elementary school teacher. Although he was African American, he did not speak African American English (cf. MacWhinney 2000:28). Adam's data comprise 55 transcripts that cover the time from age 2;3 to 4;10. The recordings occurred at regular intervals of one to three weeks. Adam's corpus is the biggest corpus of the study; it includes a total of 46,498 child utterances.

Sarah was the child of a working-class family (cf. MacWhinney 2000:29). Her data comprise 139 recordings that were collected at regular intervals from age 2;3 to 5;1. Although Sarah's corpus includes more than twice as many files as Adam's corpus, her database is smaller; it contains a total of 37,021 child utterances.

Peter was the first-born child of an upper middle-class family with college-educated parents (MacWhinney 2000:21). His corpus comprises 20 transcripts including a total of 30,256 child utterances. The transcripts are based on recordings that were prepared at regular intervals between the ages of 1;9 and 3;2.

Naomi was the child of the investigator Jacqueline Sachs. Her corpus consists of 87 files covering the time from age 1;8 to 3;5. In addition to these files, the CHILDES database includes six other recordings from Naomi that were excluded from the current investigation because they are temporally separated from the bulk of her data: two of them are very early recordings from the ages of 1;3 and 1;6; the four others were prepared after a gap of several months at the age of 3;8 and between 4;7 and 4;9. The 87 files that have been included in the current database contain a total of 14,656 child utterances, which is the smallest corpus of the study.

Table 1.1 General overview of the data

Children   Age range   Number of utterances   Number of files
Adam       2;3-4;10    46,498                 55
Sarah      2;3-5;1     37,021                 139
Nina       1;11-3;4    32,212                 56
Peter      1;9-3;2     30,256                 20
Naomi      1;8-3;5     14,656                 87
Total      1;8-5;1     160,643                357

Table 1.2 MLUs at 2;3 and 3;2

Children   MLU at 2;3   MLU at 3;2
Adam       2.11         3.55
Sarah      1.63         2.47
Nina       3.22         3.58
Peter      2.49         3.45
Naomi      2.35         3.34
Mean       2.36         3.28

Finally, Nina's corpus consists of 56 files containing transcripts from the age of 1;11 to 3;4. There is a gap in Nina's data between the ages of 2;6 and 2;9 during which no recordings were prepared. The recordings before and after the gap occurred at regular intervals of one to two weeks. Nina's corpus contains a total of 32,212 child utterances. Table 1.1 provides an overview of the data.

There are significant individual differences in the development of the five children. Table 1.2 shows the children's mean length of utterances (MLU) at the ages of 2;3 and 3;2. The MLU indicates the average number of morphemes that occur per utterance at a specific time; it is commonly used as a measure for children's level of language development (cf. Brown 1973). The numbers have been automatically computed by the MLU program of the CHILDES system.

As can be seen in this table, at the age of 2;3 Adam, Peter, and Naomi have similar MLUs ranging from 2.11 to 2.49, Nina's MLU is significantly higher, and Sarah's MLU is by far the lowest. At the age of 3;2 the gap between Nina and the other children has become smaller, but Sarah's MLU is still lower than the MLUs of the four other children: while Adam, Peter, Nina, and Naomi produce an average of about 3.5 morphemes per utterance at this age, Sarah's utterances include only an average of 2.47 morphemes. This suggests that Sarah is somewhat lagging behind in her development. As we will see throughout this study, Sarah begins to produce most complex sentences several months after they emerge in the speech of the four other children.
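The MLU measure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the MLU program of the CHILDES system: the utterances are assumed to be already segmented into morphemes, the sample is hand-made for illustration, and the function name `mean_length_of_utterance` is hypothetical. The second half only recomputes the means reported in Table 1.2.

```python
def mean_length_of_utterance(utterances):
    """Average number of morphemes per utterance.

    `utterances` is a list of utterances, each already segmented into a
    list of morphemes (an assumption of this sketch).
    """
    if not utterances:
        raise ValueError("need at least one utterance")
    total_morphemes = sum(len(u) for u in utterances)
    return total_morphemes / len(utterances)


# Hypothetical, hand-segmented sample (not taken from the transcripts).
sample = [
    ["more", "cookie"],
    ["that", "'s", "my", "chair"],
    ["dog", "get", "it"],
]
mlu = mean_length_of_utterance(sample)  # (2 + 4 + 3) / 3

# Values from Table 1.2, in the order Adam, Sarah, Nina, Peter, Naomi.
mlu_at_2_3 = [2.11, 1.63, 3.22, 2.49, 2.35]
mlu_at_3_2 = [3.55, 2.47, 3.58, 3.45, 3.34]
mean_2_3 = round(sum(mlu_at_2_3) / len(mlu_at_2_3), 2)
mean_3_2 = round(sum(mlu_at_3_2) / len(mlu_at_3_2), 2)
print(mlu, mean_2_3, mean_3_2)
```

How an utterance is segmented into morphemes is itself an analytic decision, so MLU values depend on the transcription and segmentation conventions in use.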

Table 1.3 Total number of the children's multiple-clause utterances

Children   Total number of multiple-clause utterances
Adam       4,389
Sarah      2,496
Nina       2,545
Peter      1,746
Naomi      802
Total      11,978

Fig. 1.1 Mean proportions of complex sentences in the transcripts of the five children

All 357 computer files have been searched for multiple-clause utterances defined upon the occurrence of at least two verbs (disregarding auxiliaries and modals). Whenever possible, the search was conducted automatically using the CLAN programs of the CHILDES system, but all 357 files have also been searched manually by the investigator and an assistant. Table 1.3 shows the total number of multiple-clause utterances that occur in the transcripts of each child.
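The search criterion stated above (at least two verbs, disregarding auxiliaries and modals) can be sketched as a simple filter. The sketch assumes that each utterance is available as a list of (word, tag) pairs and uses a hypothetical tag set ("V", "AUX", "MOD", and so on); it does not reproduce the CLAN commands used in the study, and treating the copula as a verb is an assumption made here.

```python
# Hypothetical tag set for this sketch: "V" main verb (including the
# copula, by assumption), "AUX" auxiliary, "MOD" modal; other tags are ignored.
def is_multiple_clause(tagged_utterance):
    """True if the utterance contains at least two verbs,
    disregarding auxiliaries and modals."""
    main_verbs = [word for word, tag in tagged_utterance if tag == "V"]
    return len(main_verbs) >= 2


# Example (9), "I think it's a little bear.", hand-tagged for illustration.
think_bear = [
    ("I", "PRO"), ("think", "V"), ("it", "PRO"), ("'s", "V"),
    ("a", "DET"), ("little", "ADJ"), ("bear", "N"),
]

# Hypothetical single-clause utterance with a modal: "He can swim."
can_swim = [("he", "PRO"), ("can", "MOD"), ("swim", "V")]

print(is_multiple_clause(think_bear), is_multiple_clause(can_swim))
```

A purely automatic filter of this kind will both miss and over-select cases, which is consistent with the files also having been searched manually.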

The earliest multiple-clause utterances appear around the second birthday. Before the age of 2;0 the children's speech consists almost exclusively of simple nonembedded sentences. Figure 1.1 shows the mean proportions of multiple-clause structures that occurred in the total corpus of all child utterances up to the age of 4;0 (the numbers on which this figure is based are given in table 1a in the appendix).
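As a rough cross-check, the figures in tables 1.1 and 1.3 can be combined to give each child's overall share of multiple-clause utterances. This is only an arithmetic sketch over the published totals; the variable names are hypothetical, and the whole-corpus shares computed here are not the age-specific mean proportions plotted in figure 1.1.

```python
# Number of child utterances per child (Table 1.1).
total_utterances = {
    "Adam": 46498, "Sarah": 37021, "Nina": 32212, "Peter": 30256, "Naomi": 14656,
}
# Number of multiple-clause utterances per child (Table 1.3).
multiple_clause = {
    "Adam": 4389, "Sarah": 2496, "Nina": 2545, "Peter": 1746, "Naomi": 802,
}

# The per-child figures add up to the totals given in the tables.
assert sum(total_utterances.values()) == 160643
assert sum(multiple_clause.values()) == 11978

# Whole-corpus share of multiple-clause utterances for each child.
shares = {
    child: multiple_clause[child] / total_utterances[child]
    for child in total_utterances
}
overall = sum(multiple_clause.values()) / sum(total_utterances.values())

for child, share in shares.items():
    print(f"{child}: {share:.1%}")
print(f"All children: {overall:.1%}")
```

Because the corpora cover different age ranges, these overall shares are not directly comparable across children.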

© Cambridge University Press

Table of Contents

1. Introduction; 2. A dynamic network model of grammatical constructions; 3. Towards a definition of complex sentences and subordinate clauses; 4. Infinitival and participial complement constructions; 5. Complement clauses; 6. Relative clauses; 7. Adverbial and co-ordinate clauses; 8. Conclusion.

