Uh-oh, it looks like your Internet Explorer is out of date.
For a better shopping experience, please upgrade now.
Database Modeling and Design / Edition 3 available in Paperback
Temporarily Out of Stock Online
This new edition of Database Modeling & Design continues to focus on the techniques for relational database design introduced in previous editions, starting with the entity-relationship (ER) approach for data requirements specification and conceptual modeling. Author Toby Teorey then looks ahead to the common properties in data modeling and operations shared among the relational model and advanced database technologies such as the object-oriented, temporal, and multimedia models. A full chapter is devoted to database design techniques for data warehousing and online analytical processing (OLAP).
Teorey covers the database life cycle from requirements analysis and logical design to physical design for local, distributed, and multidatabases. The discussion of basic principles is supplemented with a common, running example: a company personnel and project database based on real-life experiences and classroom testing.
Written for both the novice and the professional database designer, this book is the essential resource for database modeling, including the building of standard SQL data definitions. The design rules set forth in this book are applicable to any SQL-based system, including IBM DB2, Oracle V8.0, Informix IDS-UDO, Microsoft Access, Microsoft SQL Server, and Sybase Adaptive Server Enterprise.
• Continued focus on relational model
• Integration of information about data warehouse and OLAP, plus other advanced database technologies, including object oriented, multimedia, and temporal database
• Discussion of basic principles is supplemented by examples based on real life cases
|Series:||Morgan Kaufmann Series in Data Management Systems Series|
|Product dimensions:||7.37(w) x 9.23(h) x 0.80(d)|
Read an Excerpt
Chapter 3: ER Modeling in Logical Database Design
This chapter shows how the ER approach can be applied to the database life cycle, particularly in steps I through II(b) (as defined in Section 1.3), which include the requirements analysis and conceptual modeling stages of logical database design. The example introduced in Chapter 2 is used again to illustrate the ER modeling principles developed in this chapter.
Logical database design is accomplished with a variety of approaches, including the top-down, bottom-up, and combined methodologies. The traditional approach, particularly for relational databases, has been a low-level, bottom-up activity, synthesizing individual data elements into normalized relations (tables) after careful analysis of the data element interdependencies defined by the requirements analysis. Although the traditional process has had some success for small- to medium-sized databases, its complexity for large databases can be overwhelming to the point where practicing designers do not bother to use it with any regularity. In practice, a combination of the top-down and bottom-up approaches is used; in some cases, tables can be defined directly from the requirements analysis. A new form of the combined approach has recently become popular because of the introduction of the ER model into the process.
The ER model has been most successful as a tool for communication between the designer and the end user during the requirements analysis and logical design phases. Its success is due to the fact that the model is easy to understand and convenient to represent. Another reason for its effectiveness is that it is a top-down approach using the concept of abstraction. The number of entities in a database is typically an order of magnitude less than the number of data elements because data elements usually represent the attributes. Therefore, using entities as an abstraction for data elements and focusing on the interentity relationships greatly reduces the number of objects under consideration and simplifies the analysis. Though it is still necessary to represent data elements by attributes of entities at the conceptual level, their dependencies are normally confined to the other attributes within the entity or, in some cases, to those attributes associated with other entities that have a direct relationship to their entity.
The major interattribute dependencies that occur in data models are the dependencies between the entity keys, the unique identifiers of different entities that are captured in the ER modeling process. Special cases such as dependencies among data elements of unrelated entities can be handled when they are identified in the ensuing data analysis.
The logical database design approach defined here uses both the ER model and the relational model in successive stages. It benefits from the simplicity and ease of use of the ER model and the structure and associated formalism of the relational model. In order to facilitate this approach, it is necessary to build a framework for transforming the variety of ER constructs into tables that are already normalized or can be normalized with a minimum of transformation. Before we do this, however, we need to first define the major steps of the relational design methodology in the context of the database life cycle.
3.2 Requirements Analysis and ER ModelingRequirements analysis is the most important step (step 1) of the database life cycle and is typically the most labor intensive. The database designer must interview the end-user population and determine exactly what the database is to be used for and what it must contain. The basic objectives of requirements analysis are
- to delineate the data requirements of the enterprise in terms of primitive objects;
- to describe the information about the objects and the relationships among objects needed to model these data requirements;
- to determine the types of transactions that are intended to be executed on the database and the interaction between the transactions and the data objects;
- to define any performance, integrity, security, or administrative constraints that must be imposed on the resulting database;
- to specify the hardware and software platform for the database implementation; and
- to thoroughly document all of the preceding in a detailed requirements specification. The data objects can also be defined in a data dictionary system, often provided as an integral part of the database management system.
The ER model helps designers to accurately capture the real data requirements because it requires them to focus on semantic detail in the data relationships, which is greater than the detail that would be provided by FDs alone. The semantics of ER allow for direct transformations of entities and relationships to at least first normal form tables. They also provide clear guidelines for integrity constraints. In addition, abstraction techniques such as generalization provide useful tools for integrating end-user views to define a global conceptual schema.
Let us now look more closely at the basic objects and relationships that should be defined during requirements analysis and conceptual design. These two life cycle steps are often done simultaneously.
Consider the substeps in step IIa), ER modeling:
- Classify entities and attributes.
- Identify the generalization hierarchies.
- Define relationships.
The remainder of this section discusses the tasks involved in each substep.
3.2.1 Classify Entities and Attributes
Though it is easy to define entity, attribute, and relationship constructs, it is not as easy to distinguish their roles in modeling the database. What makes an object an entity, an attribute, or even a relationship? For example, project headquarters are located in cities. Should "city" be an entity or an attribute? A vita is kept for each employee. Is "vita" an entity or a relationship?
The following guidelines for classifying entities and attributes will help the designer's thoughts converge to a normalized relational database design.
- Entities should contain descriptive information.
- Classify multivalued attributes as entities.
- Attach attributes to the entities they most directly describe.
Now we examine each guideline in turn.
Entities should contain descriptive information. If there is descriptive information about an object, the object should be classified as an entity. If an object requires only an identifier and does not have relationships, the object should be classified as an attribute. With "city," for example, if there is some descriptive information such as "country" and "population" for cities, then "city" should be classified as an entity. If only the city name is needed to identify a city, then "city" should be classified as an attribute associated with some entity, such as Project. Examples of objects in the real world that are typically classified as entities are employee, task, project, department, company, customer, and so on.
Classify multivalued attributes as entities. If more than one value of a descriptor attribute corresponds to one value of an identifier, the descriptor should be classified as an entity instead of an attribute, even though it does not have descriptors itself. A large company, for example, could have many offices, some of them possibly in different cities. In that case, "office" could be classified as a multivalued attribute of "company," but it would be better to be classified as an entity, with "office-address" as its identifier. If attributes are restricted to be single valued only, the later design and implementation decisions will be simplified.
Attach attributes to the entities they most directly describe. For example, attribute "office-building-name" should normally be an attribute of the entity Department instead of the entity Employee.
The procedure of identifying entities and attaching attributes to entities is iterative: Classify some objects as entities and attach identifiers and descriptors to them. If you find some violation of the preceding guidelines, change some objects from entity to attribute (or from attribute to entity), attach attributes to the new entities, and so forth....
Table of Contents
2 The ER Model: Basic Concepts
3 ER Modeling in Logical Database Design
4 Transformation of the ER Model to SQL
6 Access Methods
7 An Example of Relational Database Design
8 Distributed Data Allocation
9 Data Warehousing, OLAP, and Data Mining
10 Advanced Database Technologies
In this third edition we continue to concentrate on techniques for database design in relational database systems, starting with the entityrelationship (ER) approach for data requirements specification and conceptual modeling; but we also look ahead to the common properties in data modeling and operations between the relational model and advanced database technologies such as the object-oriented, temporal, and multimedia models. We cover the database life cycle from requirements analysis and logical design to physical design for local, distributed, and multidatabases. The discussion of basic principles is supplemented with a common example-a company personnel and project database, based on real-lifeexperiences and thoroughly classroom tested.
OrganizationThe database life cycle is described in Chapter 1. We also review the basic concepts of database modeling. Entity-relationship (ER) modeling is a popular method for the conceptualization of users data requirements. Currently, there is no standard ER model, and published articles and textbooks have introduced an enormous variety of constructs and notation to represent fundamental modeling concepts. In Chapter 2 we present the most fundamental ER concepts and provide a simple set of notational constructs-that is, the Chen notation-to represent them. We then provide guidance to reading some of the most common alternative notations used in the literature. We look at ER models at two levels: the simple level that is currently used in most computer-aided software engineering (CASE) tools and that satisfies requirements for end-user readability, and a complex level that is useful to database designers who want to clearly define complex relationships.
Chapters 3 and 4 show how to use ER concepts in the database design process. Chapter 3 is devoted to direct application of ER modeling in logical database design. Chapter 4 explains the transformation of the ER model to the relational model and to SQL syntax specifically.
Chapter 5 is devoted to the fundamentals of database normalization through fifth normal form, showing the functional equivalence between the ER model and the relational model for the higher normal forms. Chapter 6 introduces the concepts of physical design, access methods, and join strategies and illustrates how to modify a relational schema to be more efficient when usage for specific applications is well known. The case study in Chapter 7 summarizes the techniques presented in Chapters I through 6 with a new problem environment.
Chapter 8 presents the fundamentals of fragmentation and data allocation in a distributed database system. Chapters 9 and 10 are entirely new. Chapter 9 describes the main database design issues in data warehousing and gives some examples of how OLAP and data mining are being used today to get new information from old data. Chapter 10 discusses the new database technologies, such as spatial, temporal, multimedia, object-oriented, and object-relational, and how they overlap and interact in terms of the database design life cycle.
An extensive review of the popular relational database query language SQL is presented in Appendix A for those readers who lack familiarity with database query languages. Appendix B takes a brief look at the major database parameters controlled by either the user or database administrator that affect performance in a significant way and can be tuned to improve efficiency. The relationship between database reliability and performance is explored in Appendix C, leading to a new way to compute mean transaction (response) time for any database application. As a supplement to Chapter 9, Appendix D lists many of the major vendors associated with data warehousing, including those specifically selling software for OLAP and data mining. Since a complete list is impossible to make, given the dynamics of the industry, several key Web sites are given to lead the reader to more up-to-date listings. Finally, Appendix E summarizes some recent research in the various ways databases can be integrated with the World Wide Web.
This book can be used by the database practitioner as a useful guide to database modeling and its application to database designs from business and office environments to scientific and engineering databases. Whether you are a novice database user or an experienced professional, this book offers new insights into database modeling and the ease of transition from the ER model to the relational model, including the building of standard SQL data definitions. Thus, whether you are using DB2, Oracle, Sybase, Informix, SQL Server, NonStop SQL, or any other SQL-based system, the design rules set forth here will be applicable. The case studies used for the examples throughout the book are from real-life databases that were designed using the principles formulated here. This book can also be used by the advanced undergraduate or beginning graduate student to supplement a course textbook in introductory database management or for a stand-alone course in data modeling or database design.