CHAPTER 1 INTRODUCING INTEGRATION SERVICES
2
So we’ll begin our journey into SSIS by looking at how ETL has evolved in the SQL Server world. Up
to SQL Server 6.5, the bulk copy program (bcp) was the primary tool for loading data into SQL Server
databases. A command-line utility, bcp made loading basic text files into database tables fairly simple.
Unfortunately, the flip side of that simplicity was that you could use bcp only to load data from flat files,
and you couldn’t perform additional validations or transformations on the data during the load. A
common database-to-database ETL scenario with bcp might include extracting data from a database
server to a delimited text file, importing the file into a SQL Server database, and finally using T-SQL to
perform transformations on the data in the database. The bcp utility is still provided with all versions of
SQL Server, and is still used for simple one-off data loads from flat files on occasion.
In response to the increasing demands of ETL processing, Data Transformation Services (DTS)
made its first appearance in SQL Server 7. With DTS, you could grab data from a variety of sources,
transform it on the fly, and load it into the database. Although DTS was a much more sophisticated tool
than bcp, it still lacked much of the functionality required to develop enterprise-class ETL solutions.
With the release of SQL Server 2005, Microsoft replaced DTS with SQL Server Integration Services
(SSIS). SSIS is a true enterprise ETL solution with several advancements over its predecessors, including
built-in logging; support for a wide variety of complex transformation, data validation, and data
cleansing components; separation of process control from data flow; support for several types of data
sources and destinations; and the ability to create custom components, to name a few.
SSIS in SQL Server 12 represents the first major enhancement to SSIS since its introduction way
back in 2005. In this newest release, Microsoft has implemented major improvements in functionality
and usability. Some of the new goodness includes the ability to move ETL packages seamlessly between
environments, centralized storage and administration of SSIS packages, and a host of usability
enhancements. In this book, you’ll explore the core functionality you need to get up and running with
SSIS and the advanced functionality you need to implement the most complex ETL processing.
ETL: THE LOST YEARS
Although bcp is efficient, many developers and DBAs over the years found the need for solutions that can
perform more-complex solutions. During the “lost years” of SQL Server ETL, a large number of home-
grown ETL applications began to sprout up in shops all over the world. Many of these solutions were very
inefficient, featuring hard-coded sources and destinations and inflexible transformations. Even in the 21st
century, there are quite a few of these legacy home-brewed ETL applications running at some of the
world’s largest corporations. Building and maintaining in-house ETL applications from scratch can be an
interesting academic exercise, but it’s terribly inefficient. The extra time and money spent trying to
maintain and administer the code base for these applications can take a significant chunk of the resources
you could otherwise devote to designing, developing, and building out actual ETL solutions with an
enterprise ETL platform.
What Can SSIS Do for You?
SSIS provides a wide array of out-of-the-box functionality to accomplish common ETL-related tasks. The
major tasks you’ll encounter during most ETL processing include the following: