Table of Contents
INTRODUCTION xxix CHAPTER 1: WELCOME TO SQL SERVER INTEGRATION SERVICES 1
 SQL Server SSIS Historical Overview 2
 What’s New in SSIS 2
 Tools of the Trade 2
 Import and Export Wizard 3
 The SQL Server Data Tools Experience 4
 SSIS Architecture 5
 Packages 5
 Control Flow 6
 Data Flow 9
 Variables 13
 Parameters 14
 Error Handling and Logging 14
 Editions of SQL Server 14
 Summary 15
 CHAPTER 2: THE SSIS TOOLS 17
 Import and Export Wizard 17
 SQL Server Data Tools 23
 Creating Your First Package 25
 The Solution Explorer Window 26
 The Toolbox 27
 The Properties Windows 28
 The SSIS Package Designer 28
 Control Flow 29
 Connection Managers 32
 Variables 32
 Data Flow 34
 Parameters 34
 Event Handlers 34
 Package Explorer 35
 Executing a Package 36
 Management Studio 36
 Summary 37
 CHAPTER 3: SSIS TASKS 39
 SSIS Task Objects 40
 Using the Task Editor 40
 The Task Editor Expressions Tab 40
 SSDTCommon Properties 41
 Looping and Sequence Tasks 42
 Script Task (.NET) 43
 Analysis Services Tasks 45
 Analysis Services Execute DDL Task 45
 Analysis Services Processing Task 46
 Data Mining Query Task 46
 Data Flow Task 47
 Data Preparation Tasks 48
 Data Profi ler 48
 File System Task 50
 Archiving a File 52
 FTP Task 53
 Getting a File Using FTP 54
 Web Service Task 55
 Retrieving Data Using the Web Service Task and XML Source Component 57
 XML Task 60
 Validating an XML File 62
 RDBMS Server Tasks 64
 Bulk Insert Task 64
 Using the Bulk Insert Task 67
 Execute SQL Task 68
 Workflow Tasks 80
 Execute Package Task 80
 Execute Process Task 81
 Message Queue Task 82
 Send Mail Task 83
 WMI Data Reader Task 84
 WMI Event Watcher Task 86
 Polling a Directory for the Delivery of a File 86
 SMO Administration Tasks 87
 Transfer Database Task 88
 Transfer Error Messages Task 89
 Transfer Logins Task 89
 Transfer Master Stored Procedures Task 90
 Transfer Jobs Task 91
 Transfer SQL Server Objects Task 91
 Summary 92
 CHAPTER 4: CONTAINERS 93
 Task Host Containers 93
 Sequence Containers 94
 Groups 95
 For Loop Container 95
 Foreach Loop Container 97
 Foreach File Enumerator Example 98
 Foreach ADO Enumerator Example 100
 Summary 103
 CHAPTER 5: THE DATA FLOW 105
 Understanding the Data Flow 105
 Data Viewers 106
 Sources 106
 OLE DB Source 107
 Excel Source 109
 Flat File Source 110
 Raw File Source 114
 XML Source 115
 ADO.NET Source 115
 Destinations 115
 Excel Destination 116
 Flat File Destination 116
 OLE DB Destination 116
 Raw File Destination 117
 Recordset Destination 117
 SQL Server and Mobile Destinations 118
 Data Mining Model Training 118
 DataReader Destination 118
 Dimension and Partition Processing 118
 Common Transformations 119
 Synchronous versus Asynchronous Transformations 119
 Aggregate 119
 Conditional Split 121
 Data Conversion 122
 Derived Column 122
 Lookup 123
 Row Count 124
 Script Component 125
 Slowly Changing Dimension 126
 Sort 126
 Union All 127
 Other Transformations 128
 Audit 128
 Character Map 129
 Copy Column 130
 Data Mining Query 130
 DQS Cleansing 131
 Export Column 131
 Fuzzy Lookup 133
 Fuzzy Grouping 138
 Import Column 142
 Merge 144
 Merge Join 144
 Multicast 145
 OLE DB Command 145
 Percentage and Row Sampling 147
 Pivot Transform 147
 Unpivot 150
 Term Extraction 152
 Term Lookup 156
 Data Flow Example 157
 Summary 160
 CHAPTER 6: USING VARIABLES, PARAMETERS, AND EXPRESSIONS 161
 Dynamic Package Objects 162
 Variable Overview 162
 Parameter Overview 162
 Expression Overview 163
 Understanding Data Types 164
 SSIS Data Types 164
 Date and Time Type Support 166
 How Wrong Data Types and Sizes Can Affect Performance 167
 Unicode and Non-Unicode Conversion Issues 167
 Casting in SSIS Expressions 169
 Using Variables and Parameters 170
 Defi ning Variables 170
 Defining Parameters 171
 Variable and Parameter Data Types 172
 Working with Expressions 173
 C#-Like? Close, but Not Completely 174
 The Expression Builder 175
 Syntax Basics 176
 Using Expressions in SSIS Packages 190
 Summary 200
 CHAPTER 7: JOINING DATA 201
 The Lookup Transformation 202
 Using the Merge Join Transformation 203
 Contrasting SSIS and the Relational Join 203
 Lookup Features 206
 Building the Basic Package 207
 Using a Relational Join in the Source 209
 Using the Merge Join Transformation 211
 Using the Lookup Transformation 216
 Full-Cache Mode 216
 No-Cache Mode 219
 Partial-Cache Mode 220
 Multiple Outputs 223
 Expressionable Properties 226
 Cascaded Lookup Operations 227
 Cache Connection Manager and Cache Transform 229
 Summary 231
 CHAPTER 8: CREATING AN END-TO-END PACKAGE 233
 Basic Transformation Tutorial 233
 Creating Connections 234
 Creating the Control Flow 237
 Creating the Data Flow 237
 Completing the Package 239
 Saving the Package 239
 Executing the Package 240
 Typical Mainframe ETL with Data Scrubbing 241
 Creating the Data Flow 242
 Handling Dirty Data 242
 Finalizing 246
 Handling More Bad Data 247
 Looping and the Dynamic Tasks 250
 Looping 250
 Making the Package Dynamic 250
 Summary 252
 CHAPTER 9: SCRIPTING IN SSIS 253
 Introducing SSIS Scripting 253
 Getting Started in SSIS Scripting 255
 Selecting the Scripting Language 255
 Using the VSTA Scripting IDE 256
 Example: Hello World 257
 Adding Code and Classes 259
 Using Managed Assemblies 260
 Example: Using Custom .NET Assemblies 261
 Using the Script Task 264
 Confi guring the Script Task Editor 265
 The Script Task Dts Object 266
 Accessing Variables in the Script Task 267
 Connecting to Data Sources in a Script Task 271
 Raising an Event in a Script Task 281
 Writing a Log Entry in a Script Task 287
 Using the Script Component 288
 Diff erences from a Script Task 288
 Confi guring the Script Component Editor 289
 Accessing Variables in a Script Component 291
 Connecting to Data Sources in a Script Component 292
 Raising Events 292
 Logging 293
 Example: Data Validation 294
 Synchronous versus Asynchronous 302
 Essential Coding, Debugging, and Troubleshooting Techniques 305
 Structured Exception Handling 305
 Script Debugging and Troubleshooting 308
 Summary 311
 CHAPTER 10: LOADING A DATA WAREHOUSE 313
 Data Profiling 315
 Initial Execution of the Data Profiling Task 315
 Reviewing the Results of the Data Profiling Task 317
 Turning Data Profi le Results into Actionable ETL Steps 321
 Data Extraction and Cleansing 322
 Dimension Table Loading 322
 Loading a Simple Dimension Table 323
 Loading a Complex Dimension Table 327
 Considerations and Alternatives to the SCD Transformation 335
 Fact Table Loading 337
 SSAS Processing 345
 Using a Master ETL Package 350
 Summary 351
 CHAPTER 11: ADVANCED DATA CLEANSING IN SSIS 353
 Advanced Derived Column Use 354
 Text Parsing Example 355
 Advanced Fuzzy Lookup and Fuzzy Grouping 357
 Fuzzy Lookup 357
 Fuzzy Grouping 363
 DQS Cleansing 366
 Data Quality Services 366
 DQS Cleansing Transformation 370
 Summary 373
 CHAPTER 12: USING THE RELATIONAL ENGINE 375
 Data Extraction 376
 SELECT * Is Bad 376
 WHERE Is Your Friend 377
 Transform during Extract 378
 Many ANDs Make Light Work 381
 SORT in the Database 382
 Modularize 384
 SQL Server Does Text Files Too 385
 Using Set-Based Logic 389
 SQL Server Change Data Capture 391
 Benefi ts of SQL Server CDC 392
 Preparing CDC 393
 Capture Instance Tables 394
 The CDC API 396
 Using the New SSIS CDC Tools 398
 Querying CDC in SSIS 401
 Data Loading 405
 Database Snapshots 406
 The MERGE Operator 408
 Summary 411
 CHAPTER 13: ACCESSING HETEROGENEOUS DATA 413
 Excel and Access 415
 64-Bit Support 415
 Working with Excel Files 417
 Working with Access 421
 Importing from Oracle 427
 Oracle Client Setup 428
 Importing Oracle Data 429
 Using XML and Web Services 431
 Confi guring the Web Service Task 431
 Working with XML Data as a Source 439
 Flat Files 442
 Loading Flat Files 443
 Extracting Data from Flat Files 444
 ODBC 447
 Other Heterogeneous Sources 450
 Summary 451
 CHAPTER 14: RELIABILITY AND SCALABILITY 453
 Restarting Packages 453
 Simple Control Flow 455
 Containers within Containers and Checkpoints 457
 Variations on a Theme 459
 Inside the Checkpoint File 461
 Package Transactions 463
 Single Package, Single Transaction 464
 Single Package, Multiple Transactions 466
 Two Packages, One Transaction 468
 Single Package Using a Native Transaction in SQL Server 469
 Error Outputs 471
 Scaling Out 473
 Architectural Features 474
 Scaling Out Memory Pressures 474
 Scaling Out by Staging Data 475
 Scaling Out with Parallel Loading 479
 Summary 485
 CHAPTER 15: UNDERSTANDING AND TUNING THE DATA FLOW ENGINE 487
 The SSIS Engine 488
 Understanding the SSIS Data Flow and Control Flow 488
 Handling Workflows with the Control Flow 491
 Data Processing in the Data Flow 491
 Memory Buff er Architecture 492
 Types of Transformations 493
 Advanced Data Flow Execution Concepts 501
 SSIS Data Flow Design and Tuning 508
 Data Flow Design Practices 508
 Optimizing Package Processing 513
 Troubleshooting Data Flow Performance Bottlenecks 516
 Pipeline Performance Monitoring 518
 Summary 520
 CHAPTER 16: SSIS SOFTWARE DEVELOPMENT LIFE CYCLE 521
 Introduction to Software Development Life Cycles 523
 SDLCs: A Brief History 524
 Types of Software Development Life Cycles 524
 Versioning and Source Code Control 525
 Subversion (SVN) 526
 Team Foundation Server, Team System, and SSIS 533
 Summary 547
 CHAPTER 17: ERROR AND EVENT HANDLING 549
 Using Precedence Constraints 549
 Precedence Constraint Basics 550
 Advanced Precedence Constraints and Expressions 551
 Event Handling 557
 Events 558
 Using Event Handlers 560
 Event Handler Inheritance 567
 Breakpoints 569
 Error Rows 572
 Logging 576
 Logging Providers 577
 Log Events 577
 Catalog Logging 582
 Summary 584
 CHAPTER 18: PROGRAMMING AND EXTENDING SSIS 585
 The Sample Components 586
 Component 1: Source Adapter 586
 Component 2: Transform 587
 Component 3: Destination Adapter 588
 The Pipeline Component Methods 588
 Design-Time Functionality 589
 Runtime 593
 Connection Time 594
 Building the Components 596
 Preparation 596
 Building the Source Component 602
 Building the Transformation
 Component 614
 Building the Destination Adapter 625
 Using the Components 633
 Installing the Components 633
 Debugging Components 634
 Design Time 635
 Building the Complete Package 636
 Runtime Debugging 637
 Upgrading to SQL Server 2012 641
 Summary 641
 CHAPTER 19: ADDING A USER INTERFACE TO YOUR COMPONENT 643
 Three Key Steps for Designing the UI: An Overview 644
 Building the User Interface 644
 Adding the Project 645
 Implementing IDtsComponentUI 647
 Setting the UITypeName 651
 Building the Form 653
 Extending the User Interface 658
 Runtime Connections 658
 Component Properties 661
 Handling Errors and Warnings 663
 Column Properties 665
 Other UI Considerations 667
 Summary 667
 CHAPTER 20: EXTERNAL MANAGEMENT AND WMI TASK IMPLEMENTATION 669
 External Management of SSIS with Managed Code 670
 Setting Up a Test SSIS Package for Demonstration Purposes 670
 The Managed Object Model Code Library 671
 Catalog Management 672
 Folder Management 673
 Environments 674
 The DTS Runtime Managed Code Library 676
 SSIS Deployment Projects 676
 Parameter Objects 677
 Server Deployment 679
 Executing SSIS Packages Deployed to the SSIS Catalog 680
 EnvironmentReferences 681
 Package Operations 682
 Application Object Maintenance Operations 683
 Package Operations 683
 Package Monitoring 686
 Project, Folder, and Package Listing 688
 A Package Management Example 689
 Package Log Providers 699
 Specifying Events to Log 701
 Programming to Log Providers 702
 SQL Server 2012 Operation Logs 703
 Package Confi gurations 705
 Creating a Confi guration 706
 Programming the Confi guration Object 707
 Confi guration Object 708
 Windows Management Instrumentation Tasks 709
 WMI Reader Task Explained 709
 WMI Data Reader Example 710
 WMI Event Watcher Task 715
 WMI Event Watcher Task Example 716
 Summary 718
 CHAPTER 21: USING SSIS WITH EXTERNAL APPLICATIONS 719
 InfoPath Documents 720
 ASP.NET Applications 727
 Winform .NET Applications 731
 Executing a Package with T-SQL 736
 Summary 741
 CHAPTER 22: ADMINISTERING SSIS 743
 Using the SSIS Catalog 743
 Setting the SSIS Catalog Properties 744
 SSISDB 747
 Deployment Models 748
 Project Deployment Model 749
 Package Deployment Model 751
 Using T-SQL with SSIS 757
 Executing Packages 757
 Using Parameters 758
 Querying Tables for Parameter Values 759
 Using Environments 760
 Using Data Taps 765
 Creating a Central SSIS Server 766
 Clustering SSIS 768
 Package Confi guration 770
 Command-Line Utilities 774
 DTExec 774
 DTExecUI 775
 DTUtil 780
 Security 782
 Securing the SSIS Catalog 782
 Legacy Security 785
 Scheduling Packages 787
 SQL Server Agent 787
 Proxy Accounts 789
 64-Bit Issues 790
 Monitoring Package Executions 791
 Built-in Reporting 791
 Custom Reporting 795
 Performance Counters 796
 Summary 796
 CHAPTER 23: CASE STUDY: A PROGRAMMATIC EXAMPLE 797
 What You Will Take Away 798
 Background 798
 Business Problem 799
 Solution Summary 799
 Solution Architecture 801
 Naming Conventions and Tips 804
 Additional SSIS Tips Before You Start a Large Project 805
 Data Architecture 805
 File Storage Location Setup 806
 Bank ACH Payments 806
 Lockbox Files 807
 PayPal or Direct Credits to the Corporate Account 808
 Case Study Database Model 808
 Database Setup 810
 Case Study Load Packages 819
 Bank File Load Package 819
 ACH Load Package 845
 E-mail Load Package 861
 Testing 866
 Case Study Invoice Matching Process 867
 Matching Process Control Flow 867
 Matching Process High-Confidence Data Flow 870
 Matching Process Medium-Confidence Data Flow 875
 Interpreting the Results 879
 Creating a Parent Driver Package 880
 Driver Package Setup 880
 Summary 881
 INDEX 883