Stanley Wasserman, John Scott, and Peter J. Carrington
Interest in social network analysis has grown massively in recent years. This growth has been matched by an increasing sophistication in the technical tools available to users. Models and Methods in Social Network Analysis (MMSNA) presents the most important of those developments in quantitative models and methods for analyzing social network data that have appeared during the 1990s. It is a collection of original chapters by leading methodologists, commissioned by the three editors to review recent advances in their particular areas of network methods.
As is well known, social network analysis has been used since the mid-1930s to advance research in the social and behavioral sciences, but it progressed slowly and linearly until the end of the century. Sociometry (sociograms, sociomatrices), graph theory, dyads, triads, subgroups, and blockmodels - reflecting substantive concerns such as reciprocity, structural balance, transitivity, clusterability, and structural equivalence - all made their appearances and were quickly adopted by the relatively small number of "network analysts." It was easy to trace the evolution of network theories and ideas from professors to students, from one generation to the next. The field of network analysis was even analyzed as a network (see, for example, Mullins 1973, as well as analyses by Burt in 1978, and Hummon and Carley in 1993). Many users eventually became analysts, and some even methodologists. A conference of methodologists, held at Dartmouth College in the mid-1970s, consisted of about thirty researchers (see Holland and Leinhardt 1979) and really did constitute a "who's who" of the field - an auspicious, but rather small gathering. Developments at this time were also summarized in such volumes as the methodological collection edited by Linton Freeman and his colleagues (1989), which presented a collection of papers given at a conference in Laguna Beach, California, in the early 1980s, and the collection edited by Barry Wellman and the late Stephen Berkowitz (2003). Much of this early research has been brought together in a recent compilation, together with some later contributions (Scott 2002).
However, something occurred in about 1990. It is not completely clear to us what caused it. Interest in social networks and use of the wide-ranging collection of social network methodology began to grow at a much more rapid (maybe even increasing) rate. There was a realization in much of behavioral science that the "social contexts" of actions matter. Epidemiologists realized that epidemics do not progress uniformly through populations (which are almost never homogeneous). The slightly controversial view that sex research had to consider sexual networks, even if such networks are just dyads, took hold. Organizational studies were recognized as being at the heart of management research (roughly one-third of the presentations at the Academy of Management annual meetings now have a network perspective). Physicists latched onto the web and metabolic systems, developing applications of the paradigm that a few social and behavioral scientists had been working on for many, many years. This came as a surprise to many of these physicists, and some of them did not even seem to be aware of the earlier work - although their maniacal focus on the small world problem (Watts 1999, 2003; Buchanan 2002) has made most of their research rather routine and unimaginative (see Barabasi, 2002, for a lower-level overview). Researchers in the telecommunications industry have started to look at individual telephone networks to detect user fraud. In addition, there is the media attention given to terrorist networks, spawning a number of methodologists to dabble in the area - see Connections 24(3) (2001): a special issue on terrorist networks, as well as the proceedings from a recent conference (Breiger, Carley, and Pattison 2003) on this topic. Perhaps the ultimate occurred more recently when Business 2.0 (November 2003) named social network applications the "Hottest New Technology of 2003." All in all, an incredible diversity of new applications for what is now a rather established paradigm.
Sales of network analysis textbooks have increased: an almost unheard-of occurrence for academic texts (whose sales tend to hit zero several years after publication). It has been 10 years since the publication of the leading text in the area - Social Network Analysis: Methods and Applications (Wasserman and Faust 1994) - and almost 15 years since work on it began. It is remarkable not only that it is still in print, but also that increasing numbers of people are buying it, maybe even looking at parts of it. Yet, much has happened in social network analysis since the mid-1990s. Some general introductory texts have since appeared (Degenne and Forsé 1999; Scott 2000), but clearly, there is a need for an update to the methodological material discussed in Wasserman and Faust's standard reference.
Consequently, we intend MMSNA to be a sequel to Social Network Analysis: Methods and Applications. Although our view of the important research during the 1990s is somewhat subjective, we do believe (as do our contributors) that we have covered the field with MMSNA, including chapters on all the topics in the quantitative analysis of social networks in which sufficient important work has been recently published. The presentations of methodological advances found in these pages are illustrated with substantive applications, reflecting the belief that it is usually problems arising from empirical research that motivate methodological innovation. The contributions review only already published work: they avoid reference to work that is still "in progress."
Currently, no volume completely reviews the state of the art in social network analysis, nor does any volume present the most recent developments in the field. MMSNA is a complement, a supplement, not a competitor, to Wasserman and Faust (1994). We expect that anyone who has trained in network methods using Wasserman and Faust or who uses it as a reference will want to update his or her knowledge of network methods with the material found herein. As mentioned, the range of topics in this volume is somewhat selective, so its coverage of the entire field of network methods is not nearly as comprehensive as that of Wasserman and Faust. Nevertheless, the individually authored chapters of MMSNA are more in-depth, definitely more up-to-date, and more advanced in places than presentations in that book.
We turn now to the individual chapters in MMSNA. Peter Marsden's "Recent Developments in Network Measurement" is a significant scene-setting chapter for this whole volume. He explores the central issues in the measurement of social relations that underpin the other techniques examined in the book. His particular concern is not with measuring network structures themselves, but with the acquisition of relevant and reliable data. To this end, he looks specifically at the design of network studies and the collection of source data on social relations.
Marsden's starting point is the recognition that whole network and egocentric approaches can be complementary viewpoints on the same data. Whole network studies are concerned with the structural properties of networks at the global level, whereas egocentric studies focus on the network as it appears from the standpoint of those situated at particular locations within it. Despite this complementarity, however, issues of sampling and data selection mean that it is rarely possible to move with any ease from the "structure" to the "agent," or vice versa. Marsden examines, in particular, the implications of the identification of network boundaries on the basis of positional, event-based, and relational measures, showing how recent developments have moved beyond the conventional, and often inadequate, approaches to boundary setting.
Data collection for network analysis, in whatever kind of study, has most typically involved survey and questionnaire methods, and Marsden reviews the work of recent authors on the specific response formats for collecting factual and judgmental data on social relations. He considers in particular depth the problems of recall and recognition in egocentric approaches, especially with the use of name-generator methods, and he gives focused attention to studies that aim to collect data on subjective images and perceptions of networks rather than merely reporting actual connections. A key issue in both types of research is the meaning given to the relations by the actors - most particularly, the meaning of such apparently obvious terms as "friend." Marsden shows that a number of issues in this area are significantly related to the position that the respondent occupies in the network on which he or she is reporting. The chapter concludes with some briefer remarks on archival and observational methods where the researcher has less direct control (if any at all) over the nature of the raw data.
Marsden's remarks on the sampling problem are further considered in Ove Frank's chapter, "Network Sampling and Model Fitting." Frank has been the leading contributor to work on network sampling for many years, and here he begins from a consideration of the general issues in sampling methodology that he sees as central to the analysis of multivariate network data. A common method in network analysis has been implicit or explicit snowball sampling, and Frank looks at the use of this method in relation to line (edge) sampling as well as point (vertex) sampling, and he shows that the limitations of this method can be partly countered through the use of probabilistic network models (i.e., basing the sampling on population model assumptions). These are examined through the method of random graphs, especially the uniform and Bernoulli models, and the more interesting models such as Holland-Leinhardt's p1, p*, and Markov random graphs.
Frank gives greatest attention, however, to dyad-dependence models that explicitly address the issue of how points and lines are related. These are models in which network structure is determined by the latent individual preferences for local linkages, and Frank suggests that these can be seen as generalizations of the Holland-Leinhardt p1 model and that they are equally useful for Bayesian models. He examines log-linear and clustering approaches to choosing such models, arguing that the most effective practical solution may be to combine the two. These general conclusions are illustrated through actual studies of drug abuse, the spread of AIDS, participation in crime, and social capital.
The next group of chapters turns from issues of data design and collection to structural measurement and analysis. Centrality has been one of the most important areas of investigation in substantive studies of social networks. Not surprisingly, many measures of centrality have been proposed. The chapter by Martin Everett and Stephen Borgatti, "Extending Centrality," notes that these measures have been limited to individual actors and one-mode data. Their concern is with the development of novel measures that would enlarge the scope of centrality analysis, seeking to generalize the three primary concepts of centrality (degree, closeness, and betweenness) and Freeman's notion of centralization. They first show that it is possible to analyze the centrality of groups, whether these are defined by some external attribute such as ethnicity, sex, or political affiliation, or by structural network criteria (as cliques or blocks). A more complex procedure is to shift the measurement of centrality from one-mode to two-mode data, such as, for example, both individuals and the events in which they are involved. Although such measures are more difficult to interpret substantively, Everett and Borgatti note that they involve less loss of the original data and do not require any arbitrary dichotomizing of adjacency matrices. Finally, they look at a core-periphery approach to centrality, which identifies those sub-graphs that share common structural locations within networks.
Patrick Doreian, Vladimir Batagelj, and Anuška Ferligoj, in "Positional Analyses of Sociometric Data," examine blockmodeling procedures, reviewing both structural equivalence and regular equivalence approaches. Noting that few empirical examples of exact partitioning exist, they argue that the lack of fit between model and reality can be measured and used as a way of comparing the adequacy of different models. Most importantly, they combine this with a generalization of the blockmodeling method that permits many types of models to be constructed and compared. Sets of "permitted" ideal blocks are constructed, and the model that shows minimum inconsistency is sought. In an interesting convergence with the themes raised by Everett and Borgatti, they use their method on Little League data and discover evidence for the existence of a center-periphery structure. They go on to explore the implications of imposing pre-specified models (such as a center-periphery model) on empirical data, allowing the assessment of the extent to which actual data exhibit particular structural characteristics. They argue that this hypothesis-testing approach is to be preferred to the purely inductive approach that is usually employed to find positions in a network.
Thomas Valente's "Network Models and Methods for Studying the Diffusion of Innovations" turns to the implications of network structure for the flow of information through a network. In this case, the flow considered is information about innovations, and Valente reviews existing studies in search of evidence for diffusion processes. His particular concern is for the speed of diffusion in different networks and the implications of this for rates of innovation. A highly illuminating comparison of available mathematical models with existing empirical studies in public health using event history analysis shows that network influences are important, but that the available data prevent more definitive conclusions from being drawn. Valente argues for the collection of more adequate data, combining evidence on both information and network structure, and the construction of more adequately theorized models of the diffusion process.
Katherine Faust's "Using Correspondence Analysis for Joint Displays of Affiliation Networks" convincingly shows the need for formal and strict representational models of the joint space of actors and relational ties. Correspondence analysis (a scaling method), she argues, allows a high level of precision in this task. Having specified the nature of the method and its relevance for social network data, rather than the more typical "actors x variable" data with which it is often used, Faust presents a novel analysis of a global trading network, consisting of international organizations and their member countries. This discloses a clear regional structure in which the first dimension separates South American from Central American countries and organizations, whereas the second dimension separates North American and North Atlantic countries from all others.
The exponential family of random graphs, p*, has received a lot of attention in recent years, and in "An Introduction to Random Graphs, Dependence Graphs, and p*," Stanley Wasserman joins with Garry Robins to review this recent work. Wasserman and Robins made the important generalization of the model from Markov random graphs to a larger family of models. In this chapter, however, they begin with dependence graphs to further clarify the models. They see the great value of p* models as making possible an effective and informed move from local, micro phenomena to overall, macro phenomena. Using maximum likelihood and pseudolikelihood (based on logit models) estimation techniques, they show that the often-noted tendency towards model degeneracy (the production of trivial or uninteresting results) can be offset by using more complex models in which 3- or 4-star configuration counts are used. That is, the model incorporates the first three or four moments of the degree distribution to produce more realistic models. Evidence from simulation studies confirms the power of this approach. Indeed, degenerate models may not always be trivial, but may point to regions where stochastic processes have broken down. In making this point, they make important connections with recent developments in small world networks.
Although analyses of two-mode, affiliation networks involve one significant move away from the conventional one-mode analysis of relational, adjacency data, analyses of multiple networks involve a complementary broadening of approach. Laura Koehly and Philippa Pattison ("Random Graph Models for Social Networks: Multiple Relations or Multiple Raters?") turn to this issue of multiple networks, arguing that most real networks are of this kind. Building on simpler, univariate p* models, they make a generalization to random graph models for multiple networks using dependence graphs. They examine both actual relations and cognitive perceptions of these relations among managers in high-technology industries, showing that the multiple network methods lead to conclusions that simply would not be apparent in a conventional single network approach. Their work is the first step toward richer models of generalized relational structures.
The idea of dependence graphs was central to the chapters of Wasserman and Robins and of Koehly and Pattison. Garry Robins and Philippa Pattison join forces to explore this key idea in "Interdependencies and Social Processes: Dependence Graphs and Generalized Dependence Structures." They make the Durkheimian point that dependence must be seen as central to the very idea of sociality and use this to reconstruct the idea of social space. As they correctly point out, the element or unit in social space is not the individual but the ties that connect them, and they hold that the exploration of dependence models allows the grasping of the variety of ties that enter into the construction of social spaces. From this point of view, dependence graphs are to be seen as representations of proximity in social space, and network analysts are engaged in social geometry.
The analysis of social networks over time has long been recognized as something of a Holy Grail for network researchers, and Tom Snijders reviews this quest in "Models for Longitudinal Network Data." In particular, he examines ideas of network evolution, in which change in network structure is seen as an endogenous product of micro-level network dynamics. Exploring what he terms the independent arcs model, the reciprocity model, the popularity model, and the more encompassing actor-oriented model, Snijders concludes that the latter offers the best potential. In this model, actors are seen as changing their outgoing ties (choices), each change aiming at increasing the value derived from a particular network configuration. Such changes are "myopic," concerned only with the immediate consequences. A series of such rational choices means that small, incremental changes accumulate to the point at which substantial macro-level transformations of structure occur. He concludes with the intriguing suggestion that such techniques can usefully be allied with multiple network methods such as those discussed by Koehly and Pattison.
The final two chapters in the book are reviews of available software sources for visualization and analysis of social networks. The visualization of networks began with Moreno and the early sociograms, but the use of social network analysis for larger social networks has made the task of visualization more difficult. For some time, Linton Freeman has been concerned with the development of techniques, and in "Graphical Techniques for Exploring Social Network Data," he presents the latest and most up-to-date overview. The two families of approaches that he considers are those based on some form of multidimensional scaling (MDS) and those that involve an algebraic procedure. In MDS, points are optimally located in a specified, hopefully small, number of dimensions, using metric or non-metric approaches to proximity. In the algebraic methods of correspondence analysis and principal component analysis, points are located in relation to dimensions identified through procedures akin to the analysis of variance. Using data on beachgoers, Freeman shows that the two techniques produce consistent results, but an algebraic method produces a more dramatic visualization of the structure. Importantly, he also notes that wherever a network is plotted as a disc or sphere, it has few interesting structural properties. Freeman goes on to examine the use of specific algorithms for displaying and manipulating network images, focusing on MAGE, which allows points to be coded for demographic variables such as gender, age, and ethnicity. The use of this method is illustrated from a number of data sets. The longitudinal issues addressed by Snijders are also relevant to the visualization issue, and Freeman considers the use of MOVIEMOL as an animation device for representing small-scale and short-term changes in network structure. 
He shows the descriptive power of this technique for uncovering social change, but also shows how it can be used in more analytical ways to begin to uncover some of the processes at work.
The final chapter turns to the issue of the software available for different kinds of network analysis. Mark Huisman and Marijtje van Duijn, in "Software for Social Network Analysis," present what is the most up-to-date review of a continually changing field. A total of twenty-seven packages are considered, excluding the visualization software considered by Freeman. Detailed attention is given to six major packages: UCINET, Pajek, MultiNet, NetMiner, STRUCTURE, and StOCNET. Wherever possible, the packages are compared using the same data set (Freeman's EIES network). This is a true road test, with interesting and somewhat surprising results. The authors conclude that there is no single "best buy" and that the package of choice depends very much on the particular questions that are of interest to the analyst.
Recent Developments in Network Measurement
Peter V. Marsden
This chapter considers study design and data collection methods for social network studies, emphasizing methodological research and applications that have appeared since an earlier review (Marsden 1990). It concentrates on methods and instruments for measuring social relationships linking actors or objects. Many analytical techniques discussed in other chapters identify patterns and regularities that measure structural properties of networks (such as centralization or global density), and/or relational properties of particular objects/actors within them (such as centrality or local density). The focus here is on acquiring the elementary data elements themselves.
Beginning with common designs for studying social networks, the chapter then covers methods for setting network boundaries. A discussion of data collection techniques follows. Survey and questionnaire methods receive primary attention: they are widely used, and much methodological research has focused on them. More recent work emphasizes methods for measuring egocentric networks and variations in network perceptions; questions of informant accuracy or competence in reporting on networks remain highly salient. The chapter closes with a brief discussion of network data from informants, archives, and observations, and issues in obtaining them.
2.1 Network Study Designs
The broad majority of social network studies use either "whole-network" or "egocentric" designs. Whole-network studies examine sets of interrelated objects or actors that are regarded for analytical purposes as bounded social collectives, although in practice network boundaries are often permeable and/or ambiguous. Egocentric studies focus on a focal actor or object and the relationships in its locality.
Freeman (1989) formally defined forms of whole-network data in set-theoretic, graph-theoretic, and matrix terms. The minimal network database consists of one set of objects (also known as actors or nodes) linked by one set of relationships observed at one occasion; the cross-sectional study of women's friendships in voluntary associations given by Valente (Figure 6.1.1, Chapter 6, this volume) is one example. The matrix representation of this common form of network data is known as a "who to whom" matrix or a "sociomatrix." Wasserman and Faust (1994) termed this form a one-mode data set because of its single set of objects.
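The one-mode "who to whom" matrix described above can be sketched in a few lines of Python. This is only an illustration: the actor names, the tie list, and the helper `build_sociomatrix` are hypothetical and do not come from the chapter or from Valente's data.

```python
# Hypothetical actors and directed friendship nominations (illustrative data only).
actors = ["Ann", "Bea", "Cal", "Dee"]
ties = [("Ann", "Bea"), ("Bea", "Ann"), ("Ann", "Cal"), ("Dee", "Cal")]


def build_sociomatrix(actors, ties):
    """Return an n x n 0/1 matrix; entry [i][j] is 1 if actor i names actor j."""
    index = {name: i for i, name in enumerate(actors)}
    n = len(actors)
    matrix = [[0] * n for _ in range(n)]
    for sender, receiver in ties:
        matrix[index[sender]][index[receiver]] = 1
    return matrix


sociomatrix = build_sociomatrix(actors, ties)
for name, row in zip(actors, sociomatrix):
    print(f"{name:>3}", row)
```

Rows index the actors sending ties and columns the actors receiving them, so a single set of objects labels both dimensions. That is what makes the data set one-mode.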
Elaborations of the minimal design consider more than one set of relationships, measure relationships at multiple occasions, and/or allow multiple sets of objects (which may change over occasions). Data sets with two sets of objects - termed two-mode by Wasserman and Faust (1994) - are common; Table 7.4.1 of Chapter 7 in this volume gives an example, a network of national memberships in trade and treaty organizations. Many studies also measure multiple relations, as in Lazega's (1999) study of collaboration, advising, and friendships among attorneys. As Snijders (Chapter 11, this volume) indicates, interest in longitudinal questions about social networks is rising; most extant data sets remain single occasion, however. In addition to relationships, almost all network data sets measure attributes (either time constant or time varying) of objects, but this chapter does not consider issues of measurement for these.
A further variation known as a cognitive social structure (CSS) design (Krackhardt 1987) obtains measurements of the relationship(s) under study from multiple sources or observers. Chapter 9 in this volume presents models for such data. The CSS design is widely used to study informant variations in the social perception of networks. In applications to date, observers have been actors in the networks under study, but in principle the sets of actors and observers could be disjoint.
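One way to picture CSS data is as a stack of sociomatrices, one per observer. The sketch below assumes, as in the applications mentioned above, that the observers are the actors themselves. The data and the helpers `css_array` and `report_counts` are hypothetical, and counting reports per tie is only one simple summary, not a method taken from the chapter.

```python
actors = ["A", "B", "C"]

# perceptions[observer] = set of (sender, receiver) ties that observer believes exist.
# Illustrative data only.
perceptions = {
    "A": {("A", "B"), ("B", "C")},
    "B": {("A", "B")},
    "C": {("A", "B"), ("B", "C"), ("C", "A")},
}


def css_array(actors, perceptions):
    """Return cube[k][i][j] = 1 if observer k reports a tie from actor i to actor j."""
    idx = {a: i for i, a in enumerate(actors)}
    n = len(actors)
    cube = [[[0] * n for _ in range(n)] for _ in range(n)]
    for observer, ties in perceptions.items():
        for sender, receiver in ties:
            cube[idx[observer]][idx[sender]][idx[receiver]] = 1
    return cube


def report_counts(cube):
    """For each ordered pair (i, j), count how many observers report the tie."""
    n = len(cube)
    return [[sum(cube[k][i][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


cube = css_array(actors, perceptions)
counts = report_counts(cube)
print(counts)
```

Disagreement among observers shows up as counts strictly between zero and the number of observers.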
Egocentric network designs assemble data on relationships involving a focal object (ego) and the objects (alters) to which it is linked. Focal objects are often sampled from a larger population. The egocentric network data in the 1985 General Social Survey (GSS; see Marsden 1987), for example, include information on up to five alters with whom each survey respondent "discusses important matters."
Egocentric and whole-network designs are usually distinguished sharply from one another, but they are interrelated. A whole network contains an egocentric network for each object within it (Marsden 2002). Conversely, if egos are sampled "densely," whole networks may be constructed using egocentric network data. Kirke (1996), for instance, elicited egocentric networks for almost all youth in a particular district, and later used them in a whole-network analysis identifying within-district clusters. Egocentric designs in which respondents report on the relationships among alters in their egocentric networks may be seen as restricted CSS designs - in which informants report on clusters of proximate relationships, rather than on all linkages.
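The statement that a whole network contains an egocentric network for each object can be made concrete with a small sketch. It assumes undirected ties and defines the egocentric network as ego, its direct alters, and the ties among them. The function name `ego_network` and the data are hypothetical.

```python
def ego_network(ties, ego):
    """Return (alters, local_ties) for `ego` from an undirected tie list.

    alters: actors directly tied to ego.
    local_ties: ties whose two endpoints both lie in {ego} | alters.
    """
    alters = set()
    for a, b in ties:
        if a == ego and b != ego:
            alters.add(b)
        elif b == ego and a != ego:
            alters.add(a)
    members = alters | {ego}
    local_ties = [(a, b) for a, b in ties if a in members and b in members]
    return alters, local_ties


# Illustrative whole network.
whole = [("Ann", "Bea"), ("Ann", "Cal"), ("Bea", "Cal"),
         ("Cal", "Dee"), ("Dee", "Eve")]

alters, local = ego_network(whole, "Ann")
print(sorted(alters), local)
```

Running the same function for every actor yields one egocentric network per node. Going the other way, pooling densely sampled egocentric reports to approximate the whole network is the approach attributed to Kirke (1996) above.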
Aside from egocentric designs and one-mode (single-relation or multirelational), two-mode, and CSS designs for whole networks, some studies sample portions of networks. Frank discusses network sampling in depth in Chapter 3 (this volume). One sampling design observes relationships for a random sample of nodes (Granovetter 1976). Another, known as the "random walk" design (Klovdahl et al. 1977; McGrady et al. 1995), samples chains of nodes, yielding insight into indirect connectedness in large, open populations.
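The idea of sampling a chain of nodes can be sketched as a simple random walk over a known adjacency structure. This is a toy simulation under strong assumptions, not the field procedure of the cited studies: in practice each step requires interviewing a respondent to learn his or her alters. The names `random_walk_sample` and `neighbors` and the data are hypothetical.

```python
import random


def random_walk_sample(neighbors, start, steps, rng):
    """Follow up to `steps` randomly chosen ties from `start`; return the chain of nodes."""
    chain = [start]
    current = start
    for _ in range(steps):
        options = sorted(neighbors.get(current, ()))
        if not options:  # dead end: no alters to follow
            break
        current = rng.choice(options)
        chain.append(current)
    return chain


# Illustrative undirected network stored as adjacency sets.
neighbors = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

chain = random_walk_sample(neighbors, "A", steps=4, rng=random.Random(0))
print(chain)
```

Each consecutive pair in the chain is directly tied, so the chain traces one path of indirect connection outward from the starting node.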
2.2 Setting Network Boundaries
Deciding on the set(s) of objects that lie within a network is a difficult problem for whole-network studies. Laumann, Marsden, and Prensky (1989) outlined three generic boundary specification strategies: a positional approach based on characteristics of objects or formal membership criteria, an event-based approach resting on participation in some class of activities, and a relational approach based on social connectedness. Employment by an organization (e.g., Krackhardt 1990) is one positional criterion. The "regulars" at a beach depicted by Freeman (Figure 12.2.3, Chapter 12, this volume; see also Freeman and Webster 1994) were identified via an event-based approach; regulars were defined as persons observed 3 or more days during the study period.
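The event-based rule used for the beach data (persons observed on 3 or more days) amounts to a simple threshold on participation. A minimal sketch follows, with hypothetical observation records and a hypothetical helper `regulars`.

```python
from collections import defaultdict


def regulars(observations, min_days=3):
    """Return persons observed on at least `min_days` distinct days.

    observations: iterable of (person, day) records; repeated sightings
    on the same day count once.
    """
    days_seen = defaultdict(set)
    for person, day in observations:
        days_seen[person].add(day)
    return {person for person, days in days_seen.items() if len(days) >= min_days}


# Illustrative observation log.
log = [("Pat", 1), ("Pat", 2), ("Pat", 2), ("Pat", 5),
       ("Lee", 1), ("Lee", 3),
       ("Kim", 2), ("Kim", 3), ("Kim", 4), ("Kim", 6)]

inside_boundary = regulars(log)
print(sorted(inside_boundary))
```

Raising or lowering `min_days` tightens or loosens the boundary. The threshold of 3 is the one reported for the beach study, while the log itself is invented.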
Doreian and Woodard (1992) outlined a specific version of the relational approach called expanding selection. Beginning with a provisional "fixed" list of objects deemed to be in a network, it then adds objects linked to those on the initial list. This approach is closely related to the snowball sampling design discussed by Frank in Chapter 3, this volume; Doreian and Woodard, however, added a new object only after finding that it had several links (not just one) to elements on the fixed list. They review logistical issues in implementing expanding selection, and compare it with the fixed-list approach in a study of social services networks. More than one-half of the agencies located via expanding selection were not on the fixed list. Added agencies were closely linked to one another, although the fixed-list agencies were relatively central within the expanded network. The fixed-list approach presumes substantial prior investigator knowledge of network boundaries, whereas expanding selection draws on participant knowledge about them.
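A single round of expanding selection, with the requirement that a candidate have more than one link to the fixed list, might be sketched as follows. The threshold parameter `min_links` is an assumption: the text says only "several links", so the default of 2 is illustrative. The function name and data are likewise hypothetical.

```python
def expand_selection(fixed, ties, min_links=2):
    """Add objects outside `fixed` that are tied to at least `min_links` fixed-list members."""
    fixed = set(fixed)
    partners = {}  # candidate -> set of fixed-list members it is tied to
    for a, b in ties:
        for inside, outside in ((a, b), (b, a)):
            if inside in fixed and outside not in fixed:
                partners.setdefault(outside, set()).add(inside)
    added = {node for node, linked in partners.items() if len(linked) >= min_links}
    return fixed | added


# Illustrative agencies: A1-A3 are on the fixed list; X, Y, Z are candidates.
fixed_list = {"A1", "A2", "A3"}
reported_ties = [("A1", "X"), ("A2", "X"), ("A1", "Y"), ("X", "Z"), ("A3", "A1")]

expanded = expand_selection(fixed_list, reported_ties)
print(sorted(expanded))
```

With `min_links=1` the rule reduces to a one-wave snowball from the fixed list, which is the contrast drawn in the paragraph above.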
Elsewhere, Doreian and Woodard (1994) suggested methods for identifying a "reasonably complete" network within a larger network data set. They used expanding selection to identify a large set of candidate objects, and then selected a dense segment of this for study. They adopted Seidman's (1983) "k-core" concept (a subset of objects, each linked to at least k others within the subset) as a criterion for setting network boundaries. By varying k, investigators can set more and less restrictive criteria for including objects.
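The k-core criterion can be computed by repeatedly removing objects that have fewer than k ties to the objects still retained. The sketch below assumes undirected ties and uses this standard pruning procedure. The function name `k_core` and the data are illustrative.

```python
def k_core(ties, k):
    """Return the largest set of nodes in which every node has >= k neighbors inside the set."""
    neighbors = {}
    for a, b in ties:
        if a != b:  # ignore self-ties
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    core = set(neighbors)
    changed = True
    while changed:
        changed = False
        for node in list(core):
            if len(neighbors[node] & core) < k:
                core.discard(node)
                changed = True
    return core


# Illustrative network: a triangle A-B-C with a tail C-D-E.
ties = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]

for k in (1, 2, 3):
    print(k, sorted(k_core(ties, k)))
```

Raising k shrinks the retained set (here from all five nodes, to the triangle, to nothing), which is how varying k gives more and less restrictive boundaries.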
Egocentric network studies typically set boundaries during data collection. The "name generator" questions discussed in this chapter accomplish this.
2.3 Survey and Questionnaire Methods
Network studies draw extensively on survey and questionnaire data. Surveys allow investigators to decide on relationships to measure and on actors/objects to be approached for data. In the absence of archival records, surveys are often the most practical alternative: they make much more modest demands on participants than do diary methods or observation, for example. Surveys do introduce artificiality, however, and findings rest heavily on the presumed validity of self-reports.
Both whole-network and egocentric network studies use survey methods, but the designs typically differ in how they obtain network data and in what they ask of respondents. A whole-network study usually compiles a roster of actors before data collection begins. Survey and questionnaire instruments incorporate the roster, allowing respondents to recognize rather than recall their relationships. Egocentric studies, however, are often conducted in large, open populations. The alters in a respondent's network are not known beforehand, so setting network boundaries must rely on respondent recall.
Whole-network studies ordinarily seek interviews with all actors in the population, and ask respondents to report only on their direct relationships. (The CSS studies discussed later are an exception; they ask for much more data.) In egocentric studies, however, practical and resource considerations usually preclude interviewing a respondent's alters. Such studies ask respondents for data on their own relationships to alters, and also often ask for information on linkages between alters; moreover, they commonly request proxy reports about alters.
Table of Contents
2. Recent developments in network measurement
3. Network sampling and model fitting
5. Positional analyses of sociometric data
6. Network models and methods for studying the diffusion of innovations
7. Using correspondence analysis for joint displays of affiliation networks
8. An introduction to random graphs, dependence graphs, and p*
9. Random graph models for social networks: multiple relations or multiple raters
10. Interdependencies and social processes: dependence graphs and generalized dependence structures
11. Models for longitudinal network data
12. Graphic techniques for exploring social network data
13. Software for social network analysis