MCSA/MCSE Training Kit: Microsoft SQL Server 2000 System Administrationby Microsoft Press, Carl Rabeler
Official kit teaches database administrators how to set up and support Microsoft SQL Server 2000 as they prepare for MCP Exam 70-228, a core exam for MCDBA certification and an elective on the Microsoft Windows 2000 MCSE track.Topics map directly to the exam objectives including installing and configuring SQL Server 2000, creating databases, optimizing database… See more details below
Official kit teaches database administrators how to set up and support Microsoft SQL Server 2000 as they prepare for MCP Exam 70-228, a core exam for MCDBA certification and an elective on the Microsoft Windows 2000 MCSE track.Topics map directly to the exam objectives including installing and configuring SQL Server 2000, creating databases, optimizing database performance, extracting and transforming data, managing security and troubleshooting Server system activity.
Read an Excerpt
Chapter 7: Populating a Database
About This ChapterThis chapter prepares you to populate your database from an existing data source. The chapter begins with a discussion of the data transformation, consistency, and validation issues you will encounter when you import data from one or multiple sources into a SQL Server 2000 database. Next, you are introduced to the primary SQL Server 2000 tools used for populating a database. Finally, you learn how and when to use each of these tools.
Before You BeginTo complete this chapter, you must have
- A computer that meets or exceeds the minimum hardware requirements listed in Table 2.1 in the Lesson 1 section of Chapter 2.
- Microsoft Windows 2000 Server running on your computer on an NTFS partition.
- A computer with a NetBIOS name of SelfPacedCPU, configured as a domain controller in the SelfPacedSQL.MSFT domain.
- Installed a default instance and at least one named instance of SQL Server 2000 (see Chapter 2).
Lesson 1: Transferring and Transforming DataAfter you have created your user database, you need to populate it with data.
Frequently, this consists of importing and transforming existing data from one or more external data sources (such as another database system) to SQL Server 2000. In this lesson, you learn the issues you will face when you choose to import data from another data source. This lesson will also introduce you to the primary tools used to import and transform data, including a brief discussion of the capabilities of each tool. These tools include DTS, Bcp, and the BULK INSERT Transact-SQL statement.
After this lesson, you will be able to
- Describe the process of evaluating the quality and consistency of data in an external data source prior to data import
- Understand the types of data transformations that may be necessary when importing data from existing data sources
- Describe the tools provided by SQL Server 2000 for importing data
Estimated lesson time: 15 minutes
Importing DataPopulating your SQL Server 2000 user database frequently consists of the process of importing existing data from external data sources to a newly created destination database. These external data sources include Microsoft or third-party databases, spreadsheets, or text files. Before you can import this data, you must perform the following preliminary tasks to evaluate the external data and determine the steps that will be involved in the import process. These preliminary steps will also help you select the appropriate tool to use.
- Determine the consistency of the existing data within the external data source. The external data source may not have enforced consistency when data was initially input (for example, if the names of states were entered as two-letter abbreviations in some instances and were fully spelled out in other instances).
- Determine whether additional columns must be added. The existing data may be internally consistent but not include necessary columns because these values were assumed (such as values for country or an international telephone area code).
- Determine whether the existing data format should be modified. The existing data may be internally consistent but not be represented in the manner you want to use in the destination database (for example, requiring a change in the date format or the conversion of numerical values to more readable string values, such as 1, 2, and 3 being converted to poor, average, and excellent).
- Determine whether existing data columns should be modified. The existing data may be internally consistent but contain columns that need to be aggregated or separated (such as separating first and last names into separate columns or summarizing daily data into weekly or monthly data).
- Determine whether the import of data will be a one-time task or a task that must be performed periodically. The existing data may need to be migrated from an existing legacy system that will be retired or may need to be extracted from an existing system from which you will extract data on a weekly or monthly basis.
- Determine how to access the existing data. Is the external data source directly accessible, and do you have permission to access the data directly? (If not, the data will have to be exported into a format that SQL Server 2000 can work with, such as a delimited text file.)
DTS Data TransformationsAfter you have evaluated the data in each external data source, you need to determine how to proceed. Occasionally, changes to data can be made in the external data source, but usually these changes cannot be made in the external data source without either breaking existing applications (for example, adding columns or changing data formats) or consuming too much time (for example, manually enforcing data consistency where none existed previously). These changes can either be made after the data has been imported into SQL Server, using temporary tables and using Transact-SQL statements to scrub and cleanse the data, or can be made during the import process itself. Changes to the data made during the import and export process are referred to as DTS transformations. A DTS transformation occurs when one or more operations or functions are applied against data before that data arrives at the destination. The data at the source is not changed. Transformations make it easy to implement data scrubbing, conversions, and complex data validations during the import and export process.
The type and extent of modifications that must be made will help determine the SQL Server 2000 data transfer tool that you will use, and when you will perform the modifications. Also, whether the data import is a one-time task or a periodic task will frequently determine the tool you will use and how much you automate the necessary transformations. Planning and testing (using data subsets) is essential, particularly for large data sets.
Introducing the Data Transfer ToolsSQL Server 2000 provides a number of tools for importing and exporting data. These tools have differing capabilities to extract subsets of data from an existing data source and to transform data. Table 7.1 briefly describes each of these primary data transfer tools and their capabilities.
Table 7.1 Data Transfer Tools and Their Functions
|DTS||DTS is a graphical tool used to import, export, and transform data. DTS can work directly with a wide variety of data sources. DTS creates packages that can be scheduled. DTS can also import and export database objects schema (meta data) between SQL Server instances.|
|Bcp||Bcp is a command-prompt utility used to copy data from a text file to a SQL Server 2000 table or view (or from a SQL Server 2000 table or view to a text file) using ODBC. The transformation capabilities of Bcp are limited and require cryptic format files. Working with Microsoft or third-party databases is a two-step process.|
|BULK INSERT Transact-SQL statement||BULK INSERT is a Transact-SQL command used to copy data from an ASCII text file to a SQL Server 2000 table or view (but not from SQL Server 2000 to a text file) using OLE DB. The BULK INSERT statement provides the same functionality of Bcp (and has the same limitations) in a Transact-SQL statement and can be embedded in a DTS package.|
There are a number of other methods of moving data between data sources, but they are not normally used to initially populate a database. These additional methods include backup and restore (see Chapters 8 and 9), replication (see Chapter 15), ActiveX scripts, and the INSERT and SELECT INTO Transact-SQL statements using distributed queries (see Chapter 12).
Lesson SummaryWhen you are populating a database from existing data stored in an external data source, you must evaluate that data to determine what transformations, if any, must be performed on that data prior to or during importation. You must determine whether the import is a one-time task or will be a periodic task. You must also determine how you will access this data, either directly or through an exported text file. These factors will determine the data transfer tool you use, and when you transform the data.
Lesson 2: Introducing Microsoft Data Transformation Services (DTS)DTS is a powerful set of graphical tools (and programmable objects) that you can use to import, export, and transform data to and from a wide variety of data sources and destinations. In this lesson, you learn about the components of DTS packages, including connections, tasks, transformations, and workflow. You are also introduced to storage options for DTS packages. Finally, this lesson introduces the DTS tools, including the DTS Import/Export Wizard (Dtswiz.exe), DTS Designer, and the DTS package execution utilities. Subsequent lessons will teach you how to use each of these tools.
After this lesson, you will be able to
- Describe the components of a DTS package
- Describe the data sources to which DTS can establish direct connections
- Describe the type of data transformations that are available
- Describe the DTS workflow options
- Describe the DTS package storage options
- Describe each of the DTS tools
Estimated lesson time: 15 minutes
Understanding a DTS PackageYou can create a DTS package to connect to a data source, copy and manage data and database objects, run tasks as jobs from within a package, transform the data, and then store the transformed data and the database objects to the same data source or to a different data destination. You can also notify a user (or process) of the success or failure of the package, including attaching a file to an e-mail message. You define these tasks as discrete steps (known as DTS tasks) and control the execution using precedence constraints (such as performing a certain task if a task succeeds and a different task if the task fails). You create a DTS package in the DTS Import/Export Wizard, in DTS Designer, or programmatically.
A DTS package must have a valid data source and a valid data destination to which it connects. It can connect to additional data sources during execution (such as lookup tables located on a third data connection). Table 7.2 describes the variety of connections that you can establish during the package creation process.
Table 7.2 Connection Types
|Data source connection||A connection to a standard database (such as Microsoft SQL Server 2000, Microsoft Access 2000, Oracle, dBase, or Paradox), an OLE DB connection to an ODBC data source, a Microsoft Excel 2000 spreadsheet, an HTML source, or any other OLE DB provider. The properties of the data source connection specify the necessary connection parameters.|
|File connection||A connection to a text file (normally delimited). The properties of the file connection specify the format of the text file. There are multiple formats you can use. For example, the text file format can be either a delimited or a fixed field format.|
|Data link connection||A connection to an intermediate file (.UDL file) that stores a connection string to create an OLE DB connection that is resolved at run time. The data link connection allows you to encapsulate the connection properties into a separate .UDL data link file. You can then edit the connection string in the data link file (from one data source to another) without changing the SQL Server 2000 DTS package.|
A DTS package can perform a plethora of tasks, either sequentially or in parallel. Parallel tasks run independently of each other (using separate operating system threads to enhance performance), whereas sequential tasks execute in a defined order based on the success, failure, or completion of predecessor tasks. A DTS task is a discrete unit of work that is part of a data movement and transformation process (such as copying the contents of a text file to a SQL Server 2000 table). DTS tasks that transform data (such as the Parallel Data Pump, the Transform Data, and the Data Driven Query tasks) are based on an architectural component called the DTS data pump. The DTS data pump is an OLE DB service provider that provides the interfaces and the means to import, export, and transform data from an OLE DB data source to an OLE DB destination.
SQL Server 2000 provides a number of DTS tasks that transform data, copy and manage data and meta data, and function as jobs. These tasks are accessed through the DTS Designer tool (except the Parallel Data Pump task, which can only be accessed programmatically). Table 7.3 describes the tasks that are available through DTS Designer (all of which are also accessible programmatically).
DTS can perform column-level transformations on data during the importation process. These transformations can be as simple as changing column names and as complex as your transformation process requires. Using DTS Designer, you can use the Transform Data task or the Data Driven Query task, or you can write an ActiveX script. Using Visual Basic or Visual C++, you can also write transformations programmatically and use the Parallel Data Pump task. You perform a transformation by mapping a relationship between one or more columns in the data source with one or more columns in the data destination. You then define the transformations you want to occur during importation....
and post it to your social network
Most Helpful Customer Reviews
See all customer reviews >