MCSE: Microsoft SQL Server 2000 Database Design and Implementation Training Kitby Microsoft Press, Microsoft Corporation
This official "MCSE Training Kit" teaches how to design and implement database solutions using Microsoft SQL Server 2000 Enterprise Edition, as well as providing in-depth preparation for MCP Exam 70-229 a core exam for MCDBA certification and an elective on the Microsoft Windows "RM" 2000 MCSE track. Topics map directly to the objectives
This official "MCSE Training Kit" teaches how to design and implement database solutions using Microsoft SQL Server 2000 Enterprise Edition, as well as providing in-depth preparation for MCP Exam 70-229 a core exam for MCDBA certification and an elective on the Microsoft Windows "RM" 2000 MCSE track. Topics map directly to the objectives measured by the MCP exam, including developing a logical data model, implementing the physical database, retrieving and modifying data, programming business logic, tuning and optimizing data access, and designing a database security plan. Students learn through an integrated system of skill-building tutorials, case study examples, and self-assessment exercises. An economical alternative to classroom instruction, this kit enables students to set their own pace and learn by doing!
Read an Excerpt
Chapter 3: Designing a SQL Server DatabaseAbout This Chapter
A SQL Server database consists of a collection of tables that contain data and other objectsincluding views, indexes, stored procedures, and triggersthat are defined so as to support activities performed with the data. The data stored in a database is usually related to a particular subject or process, such as a retailer's customer information or a manufacturer's sales information. This chapter introduces you to the process of creating a SQL Server database and describes the basic concepts of database design. This chapter also provides information about planning a database, identifying system requirements, and developing a logical data model.
Before You Begin
To complete the lessons in this chapter, you must have
- SQL Server 2000 Enterprise Edition installed on a Windows 2000 Server computer.
- The ability to log on to the Windows 2000 Server computer and to SQL Server as the Windows 2000 administrator.
- Paper and a pen or pencil to complete part of the exercises.
Lesson 1: Introduction to Database DesignBefore you can develop a logical data modeland subsequently create a database and the objects it containsyou should understand the fundamental concepts of database design. In addition, you should be familiar with the basic components of a database and how those components work together to provide efficient data storage and to provide access to those who require specific types of data, in specific formats, from the database. This lesson introduces you to the basic components of a database and the terminology that describes those components. The lesson then discusses normalization and entity relationshipstwo concepts that are integral to understanding relational database design.
After this lesson, you will be able to:
- Describe the main components of a relational database.
- Describe the process of normalization and normalize tables in a database design.
- Identify the relationships that exist between entities.
Estimated lesson time: 30 minutes
Components of a SQL Server DatabaseA SQL Server database consists of a collection of tables that store specific sets of structured data. A table (entity) contains a collection of rows (tuples) and columns (attributes). Each column in the table is designed to store a certain type of information (for example, dates, names, dollar amounts, or numbers). Tables have several types of controls (constraints, rules, triggers, defaults, and customized user data types) that ensure the validity of the data. Tables can have indexes (similar to those in books) that enable rows to be found quickly. You can add declarative referential integrity constraints to the tables to ensure consistency between interrelated data in different tables. A database can also store procedures that use Transact-SQL programming code to perform operations with the data in the database, such as storing views that provide customized access to table data.
For example, suppose that you create a database named MyCoDB to manage the data in your company. In the MyCoDB database, you create a table named Employees to store information about each employee, and the table contains columns named EmpID, LastName, FirstName, Dept, and Title. To ensure that no two employees share the same EmpID and that the Dept column contains only valid numbers for the departments in your company, you must add constraints to the table. Because you want to quickly find the data for an employee based on the employee's ID or last name, you define indexes. For each employee, you will have to add a row of data to the Employees table, so you create a stored procedure named AddEmployee that is customized to accept the data values for a new employee and that performs the operation of adding the row to the Employees table. You might need a departmental summary of employees, in which case you define a view named DeptEmps that combines data from the Departments and Employees tables and produces the output. Figure 3.1 shows the parts of the MyCoDB database.
Click to view graphic
Figure 3.1 The MyCoDB database, the Employees table, and the DeptEmps view.
Normalizing a Database DesignOptimizing a database design includes the process of normalization. Normalizing a logical database design involves using formal methods to separate the data into multiple, related tables. Having a greater number of narrow tables (with fewer columns) is characteristic of a normalized database; having a few wide tables (with more columns) is characteristic of a denormalized database.
Reasonable normalization often improves performance. When useful indexes are available, the SQL Server 2000 query optimizer is efficient at selecting rapid, efficient joins between tables.
As normalization increases, so do the number and complexity of joins required to retrieve data. Too many complex relational joins between too many tables can hinder performance. Reasonable normalization should include few regularly executed queries that use joins involving more than four tables.
A database that is used primarily for decision support (as opposed to update-intensive transaction processing) might not have redundant updates and might be more understandable and efficient for queries if the design is not fully normalized. Nevertheless, data that is not normalized is a more common design problem in database applications than over-normalized data. Starting with a normalized design and then selectively denormalizing tables for specific reasons is a good strategy.
Sometimes the logical database design is already fixed, however, and total redesign is not feasible. But even then, it might be possible to normalize a large table selectively into several smaller tables. If the database is accessed through stored procedures, this schema change could take place without affecting applications. If not, it might be possible to create a view that hides the schema change from the applications.
Achieving a Well-Designed Database
In relational database design theory, normalization rules identify certain attributes that must be present or absent in a well-designed database. These rules can become quite complicated and go well beyond the scope of this book. There are a few rules that can help you achieve a sound database design, however. A table should have an identifier, it should store data for only a single type of entity, it should avoid nullable columns, and it should not have repeating values or columns.
A Table Should Have an Identifier
The fundamental rule of database design theory is that each table should have a unique row identifier, which is a column or a set of columns used to distinguish any single record from every other record in the table. Each table should have an ID column, and no two records can share the same ID value. The column (or columns) that serves as the unique row identifier for a table is the primary key of the table.
In Figure 3.2, the Employees table does not include a column that uniquely identifies each row within the table. Notice that the name David Mendlen appears twice. Because there is no unique identifier in this table, there is no way to easily distinguish one row from another. This situation could be worse if both employees worked in the same department and had the same job title.
Click to view graphic
Figure 3.2 A table that has no unique identifier.
Click to view graphic
Figure 3.3 A normalized table that has a unique identifier.
You can normalize the table by adding a column that uniquely identifies each row, as shown in Figure 3.3. Notice that each instance of David Mendlen has a unique EmpID value.
A Table Should Store Data for Only a Single Type of Entity
Attempting to store too much information in a table can prevent the efficient and reliable management of the data in the table. For example, in Figure 3.4, the Books table includes information about each book's publisher.
Click to view graphic
Figure 3.4 A Table that includes title and publisher information.
Although it is possible to have columns that contain information for both the book and the publisher in the same table, this design leads to several problems. The publisher information must be added and stored redundantly for each book published by a given publisher. This information uses extra storage space in the database. If the address for the publisher changes, the change must be made for each book. Furthermore, if the last book for a publisher is removed from the Books table, the information for that publisher is lost.
In a normalized database, the information about books and publishers would be stored in at least two tables: one for book titles and one for publishers (as shown in Figure 3.5).
Click to view graphic
Figure 3.5 A normalized database design that includes a table for book titles and a table for publisher information.
The information about the publisher now has to be entered only once and then linked to each book. If the publisher information changes, it must be changed in only one place, and the publisher information will be there even if the publisher has no books in the database.
A Table Should Avoid Nullable Columns
Tables can have columns defined to allow null values. A null value indicates that the record has no value for that attribute. Although it can be useful to allow null values in isolated cases, it is best to use them sparingly because they require special handling that increases the complexity of data operations. If you have a table that has several nullable columns and several of the rows have null values in the columns, you should consider placing these columns in another table linked to the primary table. Storing the data in two separate tables enables the primary table to be simple in design but capable of accommodating the occasional need for storing this information.
A Table Should Not Have Repeating Values or Columns
A table should not contain a list of values for a specific piece of information. For example, suppose that you want to track book titles and their authors. Although most books might have only one author, many of them might have two or more. If there is only one column in the Books table for the name of the author, this situation presents a problem. One solution is to store the name of both authors in the column, but showing a list of individual authors would then be difficult. Another solution is to change the structure of the table to add another column for the name of the second author, but this solution accommodates only two authors. Yet another column must be added if a book has three authors.
Figure 3.6 shows two methods of handling multiple authors per title.
Click to view graphic
Figure 3.6 Two methods for structuring the Books table.
If you find that you need to store a list of values in a single column or if you have multiple columns for a single piece of data (Author1, Author2, and so on), you should consider placing the duplicated data in another table with a link to the primary table. In the case of the Books table, you could create an additional primary table for authors and then create a third table that matches books to authors and accommodates repeating values, as shown in Figure 3.7. This design enables any number of authors for a book without modifying the definition of the table and allocates no unused storage space for books that have a single author.
Click to view graphic
Figure 3.7 Three tables that store information about books and their authors. (Image unavailable)
Entity RelationshipsIn a relational database, relationships help to prevent redundant data. A relationship works by matching data in key columnsusually columns that have the same name in both tables. In most cases, the relationship matches the primary key from one table, which provides a unique identifier for each row with an entry in the foreign key in the other table. Primary keys and foreign keys are discussed in more detail in Chapter 5, "Implementing Data Integrity."
There are three types of relationships between tables: one-to-one, one-to-many, and many-to-many. The type of relationship depends on how the related columns are defined.
In a one-to-one relationship, a row in table A can have no more than one matching row in table B (and vice versa). A one-to-one relationship is created if both of the related columns are primary keys or have unique constraints. This type of relationship is not common, however, because information related in this way would usually be in one table.
A one-to-many relationship is the most common type of relationship. In this type of relationship, a row in table A can have many matching rows in table B, but a row in table B can have only one matching row in table A. For example, the Publishers and Titles tables mentioned previously have a one-to-many relationship. Each publisher produces many titles, but each title comes from only one publisher. A one-to-many relationship is created if only one of the related columns is a primary key or has a unique constraint.
In a many-to-many relationship, a row in table A can have many matching rows in table B (and vice versa). You create such a relationship by defining a third table, called a junction table, whose primary key consists of the foreign keys from both table A and table B. In Figures 3-6 and 3-7, you saw how the author information could be separated into another table. The Books table and the Authors table have a many-to-many relationship. Each of these tables has a one-to-many relationship with the BookAuthor table, which serves as the junction table between the two primary tables.
Exercise 1: Exploring the Basic Concepts of Database Design
In this exercise, you will view the primary objects that are contained in a SQL Server database. You will then apply the principles of normalization to a database design and identify the relationships that exist between entities within a database. To perform this exercise, you should be logged into your Windows 2000 Server computer as Administrator. You will use SQL Query Analyzer and SQL Server Enterprise Manager for part of the exercise, and you will need paper and a pencil to complete the rest of the exercise.
To identify the main components of a SQL Server database
- Open SQL Query Analyzer and use Windows authentication to log on to SQL Server.
- Open the Object Browser window if it is not already displayed.
- Review the list of database objects that appear in the tree. Notice that the Northwind and Pubs databases appear as objects in the tree. The tree also includes a list of common objects, such as aggregate functions and system data types.
- Expand the Northwind node.
- Expand the Users Tables node.
The Object Browser window displays a hierarchical tree of database objects contained in the instance of SQL Server that you are logged on to.
A list of object types appears. The list includes users tables, system tables, stored procedures, functions, and user-defined data types. Each category contains objects specific to the Northwind database.
A list of users tables in the Northwind database appears. Notice that each table object is preceded by the object owner (which, in this case, is dbo).
To view the contents of a table
- Right-click dbo.Categories, then click Open.
- Review the columns and rows within the table.
- Close the Open Table window.
- Review each users table object to determine the columns in each one. Be sure to close the Open Table window after you review that table. You can also review the system tables to view their attributes.
The Open Table window appears and displays the contents of the Categories table.
What are the column names (attributes) in the Categories table, and how many rows of data are displayed?
To use the sp_help system stored procedure to view table information
- In the Query window, execute the following Transact-SQL statement:
- A list of all objects in the Northwind database appears on the Grids tab of the Results pane.
- Close SQL Query Analyzer.
After the result appears on the Grids tab, click within the Results pane. A second scroll bar will appear, and you can scroll through all of the objects in the database.
To normalize a database design
- Review the following table:
- Keeping in mind the table's design, apply the four basic rules that you should follow when designing a database. The rules are listed here for your convenience:
- A table should have an identifier.
- A table should store data for only a single type of entity.
- A table should avoid nullable columns.
- A table should not have repeating values or columns.
Which rule is being violated in the Customers table?
How should you modify the data?
|103||Neil||Smith||Denver||CDs, videos, DVDs|
The PurchaseType column contains a list of the types of products that the customer has purchased.
How should you modify the current design?
|101||Elizabeth||Boyle||Cleveland||Spring candles||Pavlova, Ltd.||Ian Devling|
|102||Rob||Caron||Chicago||Sandalwood incense||Mayumi's||Mayumi Ohno|
|103||Neil||Smith||Denver||Sage||Pavlova, Ltd||Ian Devling|
|104||Denise||Smith||Boston||Hanging crystal||Leka Trading||Chandra Leka|
How should you modify the current design?
To generate a database diagram in SQL Server
- On the Start menu, point to Programs and then point to Microsoft SQL Server. Then, click Enterprise Manager.
- In the Tree tab, expand the Microsoft SQL Servers node, expand SQL Server Group, expand the node for your local computer, expand Databases, and then expand the Pubs database.
- Right-click Diagrams, then click New Database Diagram.
- Click Next.
- Select the Add Selected Tables Automatically checkbox.
- Change the level in the How Many Levels Of Related Tables list box to 2.
- Select the Authors table from the Available Tables list, then click Add.
- Click Next.
- Click Finish.
SQL Server Enterprise Manager appears.
A list of object types in the Pubs database should now be displayed.
The Create Database Diagram wizard appears.
The Select Tables To Be Added screen appears. The list of tables in the Pubs database appears in the left window.
The How Many Levels Of Related Tables option becomes active.
The Authors table, TitleAuthor table, and Titles table are added to the Tables To Add To Diagram list.
The Completing The Create Database Diagram Wizard screen appears.
The diagram is generated and appears in a new window. The diagram includes the Authors table, the TitleAuthor table, and the Titles table.
To view entity relationships in a database diagram
- If the tables overlap, click the Arrange Tables button on the toolbar.
- If necessary, click the Zoom button on the toolbar and select a size that would make the relationships easier to identify. Maximize the window if necessary.
- View the connector that links the Authors table to the TitleAuthor table and the connector that links that Titles table to the TitleAuthor table.
- Close the diagram window without saving the changes that you made, then close SQL Server Enterprise Manager.
The tables should now be arranged so that you can see the relationships between them.
The connector indicates that a relationship exists between the tables. Notice that there is no connector between the Author table and the Titles table because no direct relationship exists between the tables.
At one end of the connecter is a key, which indicates one. The other side of the connector is an infinity sign, which indicates many. These symbols tell you that a one-to-many relationship exists between the Authors table and the TitleAuthor table and that a one-to-many relationship exists between the Titles table and the TitleAuthor table.
What is the implied relationship between the Titles table and the Authors table?
What type of table is the TitleAuthor table (in terms of the implied relationship between Titles and Authors)?
As an extra exercise, try creating diagrams for other users tables within the Pubs database and for tables in the Northwind database. You can also try selecting levels greater than two in the How Many Levels Of Related Tables list box. Experiment with the different levels and tables.
Lesson SummaryA SQL Server database consists of a collection of tables that store a specific set of structured data. A table contains a collection of rows and columns. Each column in the table is designed to store a certain type of information (for example, dates, names, dollar amounts, or numbers). The logical design of the database, including the tables and the relationships between them, is the core of an optimized, relational database. Optimizing a database design includes the process of normalization. Normalizing a logical database design involves using formal methods to separate the data into multiple, related tables. As normalization increases, so do the number and complexity of joins that are required to retrieve data. Normalization rules identify certain attributes that must be present or absent in a well-designed database. Tables in a normalized database should have an identifier, should only store data for a single type of entity, should avoid nullable columns, and should not have repeating values or columns. You can create relationships between your tables in a database diagram to show how the columns in one table are linked to columns in another table. In a relational database, relationships help to prevent redundant data. A relationship works by matching data in key columnsusually columns that have the same name in both tables. There are three types of relationships between tables: one-to-one, one-to-many, and many-to-many. The type of relationship depends on how you define the related columns....
Meet the Author
Founded in 1975, Microsoft Corporation (Nasdaq 'MSFT') is the worldwide leader in software for personal and business computing. The company offers a wide range of products and services designed to empower people through great software—any time, any place, and on any device.
Most Helpful Customer Reviews
See all customer reviews