Beginning Transact-SQL with SQL Server 2000 and 2005
Transact-SQL is a powerful implementation of the ANSI standard SQL database query language. In order to build effective database applications, you must gain a thorough understanding of these features. This book provides you with a comprehensive introduction to the T-SQL language and shows you how it can be used to work with both the SQL Server 2000 and 2005 releases.
Beginning with an overview of the SQL Server query operations and tools that are used with T-SQL, the author goes on to explain how to design and build applications of increasing complexity. By gaining an understanding of the power of the T-SQL language, you'll be prepared to meet the ever-increasing demands of programming.
What you will learn from this book
- How T-SQL provides you with the means to create tools for managing hundreds of databases
- Various programming techniques that use views and stored procedures
- Ways to optimize query performance
- How to create databases that will be an essential foundation to applications you develop later
Who this book is for
This book is for database developers and administrators who have not yet programmed with Transact-SQL. Some familiarity with relational databases and basic SQL is helpful, and some programming experience is helpful.
Wrox Beginning guides are crafted to make learning programming languages and technologies easier than you think, providing a structured, tutorial format that will guide you through all the techniques involved.
About the Author
Paul Turley (Seattle, WA) is a Senior Consultant for Hitachi Consulting, where he architects and develops business reporting solutions and database systems for many high-profile business clients. He has been developing database solutions since 1991 for companies such as Hewlett-Packard, Boise Cascade, Disney, and Microsoft. He has been a Microsoft Certified Professional and Trainer since 1996 and currently holds his MCDBA, MCSD, MSF Practitioner, IT Project+, and A+ certifications.
Paul designed and maintains www.Scout-Master.com, a web-based service that enables Boy Scouts and their leaders to manage their own unit web sites, membership, and advancement records on-line using SQL Server and ASP.NET. Paul has been a contributing or lead author on Professional SQL Server Reporting Services (1st and 2nd editions), Beginning Access 2002 VBA, Professional SQL Server 2000 Data Warehousing with Analysis Services, and Professional Access 2000 Programming from WROX Press.
Dan Wood (Silverdale, WA) is the Operations Manager, Database Administrator, and SQL Server Trainer for Netdesk Corporation, a Microsoft Gold Certified Partner for Learning Solutions in Seattle where he manages and develops database solutions as well as trains database professionals from organizations throughout the Northwest. He has been a Microsoft Certified Professional and Trainer since 1999 and currently holds his MCDBA, MCSD, and MCSE certifications.
Read an Excerpt
Beginning Transact-SQL with SQL Server 2000 and 2005
By Paul Turley
John Wiley & SonsISBN: 0-7645-7955-X
Chapter OneIntroducing Transact-SQL and Data Management Systems
Welcome to the world of Transact-Structured Query Language programming. Transact-SQL, or T-SQL, is Microsoft Corporation's implementation of the Structured Query Language, which was designed to retrieve, manipulate, and add data to Relational Database Management Systems (RDBMS). Hopefully, you already have a basic idea of what SQL is used for because you purchased this book, but you may not have a good understanding of the concepts behind relational databases and the purpose of SQL. This first chapter introduces you to some of the fundamentals of the design and architecture of relational databases and presents a brief description of SQL as a language. If you are brand new to SQL and database technologies, this chapter will provide a foundation to help ensure the rest of the book is as effective as possible. If you are already comfortable with the concepts of relational databases and Microsoft's implementation, specifically, you may want to skip on ahead to Chapter 2, "SQL Server Fundamentals," or Chapter 3, "Tools for Accessing SQL Server." Both of these chapters introduce some of the features and tools in SQL Server 2000 as well as the new features and tools coming with SQL Server 2005.
Another great, more in-depth source for SQL 2000 and SQL 2005 programming from the application developer's perspective are the Wrox Press books authored by Rob Viera: Professional SQL Server 2000 Programming, Beginning SQL Server 2005 Programming, and Professional SQL Server 2005 Programming. Throughout the chapters ahead, I will refer back to both the basic concepts introduced in this chapter and to areas in the books mentioned here for further clarification in the use or nature of the Transact-SQL language.
Transact-Structured Query Language
T-SQL is Microsoft's implementation of a standard established by the American National Standards Institute (ANSI) for the Structured Query Language (SQL). SQL was first developed by researchers at IBM. They called their first pre-release version of SQL "SEQUEL," which stood for Structured English QUEry Language. The first release version was renamed to SQL, dropping the English part but retaining the pronunciation to identify it with its predecessor. Today, several implementations of SQL by different stakeholders are in the database marketplace, and as you sojourn through the sometimes-mystifying lands of database technology you will undoubtedly encounter these different varieties of SQL. What makes them all similar is the ANSI standard to which IBM, more than any other vendor, adheres to with tenacious rigidity. However, what differentiate the many implementations of SQL are the customized programming objects and extensions to the language that make it unique to that particular platform. Microsoft SQL Server 2000 implements ANSI-92, or the 1992 standard as set by ANSI. SQL Server 2005 implements ANSI-99. The term "implements" is of significance. T-SQL is not fully compliant with ANSI standards in its 2000 or 2005 implementation; neither is Oracle's P/L SQL, Sybase's SQLAnywhere, or the open-source MySQL. Each implementation has custom extensions and variations that deviate from the established standard. ANSI has three levels of compliance: Entry, Intermediate, and Full. T-SQL is certified at the entry level of ANSI compliance. If you strictly adhere to the features that are ANSI-compliant, the same code you write for Microsoft SQL Server should work on any ANSI-compliant platform; that's the theory, anyway. If you find that you are writing cross-platform queries, you will most certainly need to take extra care to ensure that the syntax is perfectly suited for all the platforms it affects. Really, the simple reality of this issue is that very few people will need to write queries to work on multiple database platforms. These standards serve as a guideline to help keep query languages focused on working with data, rather than other forms of programming, perhaps slowing the evolution of relational databases just enough to keep us sane.
T-SQL: Programming Language or Query Language?
T-SQL was not really developed to be a full-fledged programming language. Over the years the ANSI standard has been expanded to incorporate more and more procedural language elements, but it still lacks the power and flexibility of a true programming language. Antoine, a talented programmer and friend of mine, refers to SQL as "Visual Basic on Quaaludes." I share this bit of information not because I agree with it, but because I think it is funny. I also think it is indicative of many application developers' view of this versatile language.
The Structured Query Language was designed with the exclusive purpose of data retrieval and data manipulation. Microsoft's T-SQL implementation of SQL was specifically designed for use in Microsoft's Relational Database Management System (RDBMS), SQL Server. Although T-SQL, like its ANSI sibling, can be used for many programming-like operations, its effectiveness at these tasks varies from excellent to abysmal. That being said, I am still more than happy to call T-SQL a programming language if only to avoid someone calling me a SQL "Queryer" instead of a SQL Programmer. However, the undeniable fact still remains; as a programming language, T-SQL falls short. The good news is that as a data retrieval and set manipulation language it is exceptional. When T-SQL programmers try to use T-SQL like a programming language they invariably run afoul of the best practices that ensure the efficient processing and execution of the code. Because T-SQL is at its best when manipulating sets of data, try to keep that fact foremost in your thoughts during the process of developing T-SQL code.
Performing multiple recursive row operations or complex mathematical computations is quite possible with T-SQL, but so is writing a .NET application with Notepad. Antoine was fond of responding to these discussions with, "Yes, you can do that. You can also crawl around the Pentagon on your hands and knees if you want to." His sentiments were the same as my father's when I was growing up; he used to make a point of telling me that "Just because you can do something doesn't mean you should." The point here is that oftentimes SQL programmers will resort to creating custom objects in their code that are inefficient as far as memory and CPU consumption are concerned. They do this because it is the easiest and quickest way to finish the code. I agree that there are times when a quick solution is the best, but future performance must always be taken into account. This book tries to show you the best way to write T-SQL so that you can avoid writing code that will bring your server to its knees, begging for mercy.
What's New in SQL Server 2005
Several books and hundreds of web sites have already been published that are devoted to the topic of "What's New in SQL Server 2005," so I won't spend a great deal of time describing all the changes that come with this new release. Instead, throughout the book I will identify those changes that are applicable to the subject being described. However, in this introductory chapter I want to spend a little time discussing one of the most significant changes and how it will impact the SQL programmer. This change is the incorporation of the .NET Framework with SQL Server.
T-SQL and the .NET Framework
The integration of SQL Server with Microsoft's .NET Framework is an awesome leap forward in database programming possibilities. It is also a significant source of misunderstanding and trepidation, especially by traditional infrastructure database administrators.
This new feature, among other things, allows developers to use programming languages to write stored procedures and functions that access and manipulate data with object-oriented code, rather than SQL statements.
Kiss T-SQL Goodbye?
Any reports of T-SQL's demise are premature and highly exaggerated. The ability to create database programming objects in managed code instead of SQL does not mean that T-SQL is in danger of becoming extinct. A marketing-minded executive at one of Microsoft's partner companies came up with a cool tagline about SQL Server 2005 and the .NET Framework that said "SQL Server 2005 and .NET; Kiss SQL Good-bye." He was quickly dissuaded by his team when presented with the facts. However, the executive wasn't completely wrong. What his catchy tagline could say and be accurate is "SQL Server 2005 and .NET; Kiss SQL Cursors Good-bye." It could also have said the same thing about complex T-SQL aggregations or a number of T-SQL solutions presently used that will quickly become obsolete with the release of SQL Server 2005.
Transact-SQL cursors are covered in detail in Chapter 10, so for the time being, suffice it to say that they are generally a bad thing and should be avoided. Cursors are all about recursive operations with single or row values. They consume a disproportionate amount of memory and CPU resources compared to set operations.
With the integration of the .NET Framework and SQL Server, expensive cursor operations can be replaced by efficient, compiled assemblies, but that is just the beginning. A whole book could be written about the possibilities created with SQL Server's direct access to the .NET Framework. Complex data types, custom aggregations, powerful functions, and even managed code triggers can be added to a database to exponentially increase the flexibility and power of the database application. Among other things, one of the chief advantages of the .NET Framework's integration is the ability of T-SQL developers to have complete access to the entire .NET object model and operating system application programming interface (API) library without the use of custom extended stored procedures. Extended stored procedures and especially custom extended stored procedures, which are almost always implemented through unmanaged code, have typically been the source of a majority of the security and reliability issues involving SQL Server. By replacing extended stored procedures, which can only exist at the server level, with managed assemblies that exist at the database level, all kinds of security and scalability issues virtually disappear.
Database Management System (DBMS)
A DBMS is a set of programs that are designed to store and maintain data. The role of the DBMS is to manage the data so that the consistency and integrity of the data is maintained above all else. Quite a few types and implementations of Database Management Systems exist:
Hierarchical Database Management Systems (HDBMS)-Hierarchical databases have been around for a long time and are perhaps the oldest of all databases. It was (and in some cases still is) used to manage hierarchical data. It has several limitations such as only being able to manage single trees of hierarchical data and the inability to efficiently prevent erroneous and duplicate data. HDBMS implementations are getting increasingly rare and are constrained to specialized, and typically, non-commercial applications.
Network Database Management System (NDBMS)-The NDBMS has been largely abandoned. In the past, large organizational database systems were implemented as network or hierarchical systems. The network systems did not suffer from the data inconsistencies of the hierarchical model but they did suffer from a very complex and rigid structure that made changes to the database or its hosted applications very difficult.
Relational Database Management System (RDBMS)-An RDBMS is a software application used to store data in multiple related tables using SQL as the tool for creating, managing, and modifying both the data and the data structures. An RDBMS maintains data by storing it in tables that represent single entities and storing information about the relationship of these tables to each other in yet more tables. The concept of a relational database was first described by E.F. Codd, an IBM scientist who defined the relational model in 1970. Relational databases are optimized for recording transactions and the resultant transactional data. Most commercial software applications use an RDBMS as their data store. Because SQL was designed specifically for use with an RDBMS, I will spend a little extra time covering the basic structures of an RDBMS later in this chapter.
Object-Oriented Database Management System (ODBMS)-The ODBMS emerged a few years ago as a system where data was stored as objects in a database. ODBMS supports multiple classes of objects and inheritance of classes along with other aspects of object orientation. Currently, no international standard exists that specifies exactly what an ODBMS is and what it isn't. Because ODBMS applications store objects instead of related entities, it makes the system very efficient when dealing with complex data objects and object-oriented programming (OOP) languages such as the new .NET languages from Microsoft as well as C and Java. When ODBMS solutions were first released they were quickly touted as the ultimate database system and predicted to make all other database systems obsolete. However, they never achieved the wide acceptance that was predicted. They do have a very valid position in the database market, but it is a niche market held mostly within the Computer-Aided Design (CAD) and telecommunications industries.
Object-Relational Database Management System (ORDBMS)-The ORDBMS emerged from existing RDBMS solutions when the vendors who produced the relational systems realized that the ability to store objects was becoming more important. They incorporated mechanisms to be able to store classes and objects in the relational model. ORDBMS implementations have, for the most part, usurped the market that the ODBMS vendors were targeting for a variety of reasons that I won't expound on here. However, Microsoft's SQL Server 2005, with its XML data type and incorporation of the .NET Framework, could arguably be labeled an ORDBMS.
SQL Server as a Relational Database Management System
This section introduces you to the concepts behind relational databases and how they are implemented from a Microsoft viewpoint. This will, by necessity, skirt the edges of database object creation, which is covered in great detail in Chapter 11, so for the purpose of this discussion I will avoid the exact mechanics and focus on the final results.
As I mentioned earlier, a relational database stores all of its data inside tables. Ideally, each table will represent a single entity or object. You would not want to create one table that contained data about both dogs and cars. That isn't to say you couldn't do this, but it wouldn't be very efficient or easy to maintain if you did.
Tables are divided up into rows and columns. Each row must be able to stand on its own, without a dependency to other rows in the table. The row must represent a single, complete instance of the entity the table was created to represent. Each column in the row contains specific attributes that help define the instance. This may sound a bit complex, but it is actually very simple. To help illustrate, consider a real-world entity, an employee. If you want to store data about an employee you would need to create a table that has the properties you need to record data about your employee. For simplicity's sake, call your table Employee.
For more information on naming objects, check out the "Naming Conventions" section in Chapter 4.
When you create your employee table you also need to decide on what attributes of the employee you want to store. For the purposes of this example you have decided to store the employee's last name, first name, social security number, department, extension, and hire date. The resulting table would look something like that shown in Figure 1-1.
The data in the table would look something like that shown in Figure 1-2.
To efficiently manage the data in your table you need to be able to uniquely identify each individual row in the table. It is much more difficult to retrieve, update, or delete a single row if there is not a single attribute that identifies each row individually. In many cases, this identifier is not a descriptive attribute of the entity. For example, the logical choice to uniquely identify your employee is the social security number attribute. However, there are a couple of reasons why you would not want to use the social security number as the primary mechanism for identifying each instance of an employee. So instead of using the social security number you will assign a non-descriptive key to each row. The key value used to uniquely identify individual rows in a table is called a primary key.
Excerpted from Beginning Transact-SQL with SQL Server 2000 and 2005 by Paul Turley Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.
Table of Contents
Chapter 1: Introducing Transact-SQL and Data Management Systems.
Chapter 2: SQL Server Fundamentals.
Chapter 3: Tools for Accessing SQL Server.
Chapter 4: Introducing Transact-SQL Language.
Chapter 5: Data Retrieval.
Chapter 6: SQL Functions.
Chapter 7: Aggregation and Grouping.
Chapter 8: Multi-Table Queries.
Chapter 9: Data Transactions.
Chapter 10: Advanced Queries and Scripting.
Chapter 11: Full-Text Index Queries.
Chapter 12: Creating and Managing Database Objects.
Chapter 13: Transact-SQL Programming Objects.
Chapter 14: Transact-SQL in Applications and Reporting.
Appendix A: Command Syntax Reference.
Appendix B: System Variables and Functions Reference.
Appendix C: System Stored Procedure Reference.
Appendix D: Information Schema Views Reference.
Appendix E: Answers to Exercises.