Today, companies capture and store tremendous amounts of information about every aspect of their business: their customers, partners, vendors, markets, and more. But with the rise in the quantity of information has come a corresponding decrease in its quality, a problem businesses recognize and are working feverishly to solve.
Enterprise Knowledge Management: The Data Quality Approach presents an easily adaptable methodology for defining, measuring, and improving data quality. Author David Loshin begins by presenting an economic framework for understanding the value of data quality, then proceeds to outline data quality rules and domain- and mapping-based approaches to consolidating enterprise knowledge. Written for both a managerial and a technical audience, this book will be indispensable to the growing number of companies committed to wresting every possible advantage from their vast stores of business information.
- Expert advice from a highly successful data quality consultant
- The only book on data quality offering the business acumen to appeal to managers and the technical expertise to appeal to IT professionals
- Details the high costs of bad data and the options available to companies that want to transform mere data into true enterprise knowledge
- Presents conceptual and practical information complementing companies' interest in data warehousing, data mining, and knowledge discovery
|Series:|Morgan Kaufmann Series in Data Management Systems|
|Product dimensions:|1.03(w) x 7.50(h) x 9.25(d)|
About the Author
David Loshin is President of Knowledge Integrity, Inc., a company specializing in data management consulting. The author of numerous books on performance computing and data management, including “Master Data Management” (2008) and “Business Intelligence - The Savvy Manager’s Guide” (2003), and creator of courses and tutorials on all facets of data management best practices, David is often looked to for thought leadership in the information management industry.
Read an Excerpt
Chapter 1: Introduction
Without even realizing it, everyone is affected by poor data quality. Some are affected directly in annoying ways, such as receiving two or three identical mailings from the same sales organization in the same week. Some are affected in less direct ways, such as the 20-minute wait on hold for a customer service department. Some are affected more malevolently through deliberate fraud, such as identity theft. But whenever poor data quality, inconsistencies, and errors bloat both companies and government agencies and hamper their ability to provide the best possible service, everyone suffers.
Data quality seems to be a hazy concept, but the lack of data quality severely hampers the ability of organizations to effectively accumulate and manage enterprise-wide knowledge. The goal of this book is to demonstrate that data quality is not an esoteric notion but something that can be quantified, measured, and improved, all with a strict focus on return on investment. Our approach is that knowledge management is a pillar that must stand securely on a pedestal of data quality, and by the end of this book, the reader should be able to build that pedestal.
This book covers the following areas:
- Data ownership paradigms
- The definition of data quality
- An economic framework for data quality, including steps in building a return on investment model to justify the costs of a data quality program
- The dimensions of data quality
- Using statistical process control as a tool for measurement
- Data domains and mappings between those domains
- Data quality rules and business rules
- Measurement and current state assessment
- Data quality requirements analysis
- Metadata and policy
- Rules-based processing
- Discovery of metadata and data quality and business rules
- Data cleansing
- Root cause analysis and supplier management
- Data enhancement
- Putting it all into practice
1.1.1 Bank Deposit?
In November 1998, the Associated Press reported that a New York man allegedly brought a dead deer into a bank in Stamford, Connecticut, because he was upset with the bank's service. Police said the 70-year-old had argued with a teller over a clerical mistake with his checking account. Apparently unhappy with the teller, he went home, got the deer carcass, and brought it back to the branch office.
1.1.2 CD Mail Fraud
Here is a news story taken from the Associated Press newswire. The text is printed with permission.
Newark - For four years a Middlesex County man fooled the computer fraud programs at two music-by-mail clubs, using 1,630 aliases to buy music CDs at rates offered only to first-time buyers.
David Russo, 33, of Sayerville, NJ, admitted yesterday that he received 22,260 CDs by making each address - even if it listed the same post office box - different enough to evade fraud-detection computer programs.
Among his methods: adding fictitious apartment numbers, unneeded direction abbreviations and extra punctuation marks. (Emphasis mine)
The scam is believed to be the largest of its kind in the nation, said Assistant U.S. Attorney Scott S. Christie, who prosecuted the case.
The introductory offers typically provided nine free CDs with the purchase of one CD at the regular price, plus shipping and handling. Other CDs then had to be purchased later to fulfill club requirements.
Russo paid about $56,000 for CDs, said Paul B. Brickfield, his lawyer, or an average of $2.50 each. He then sold the CDs at flea markets for about $10 each, Brickfield said.
Russo pleaded guilty to a single count of mail fraud. He faces about 12 to 18 months in prison and a fine of up to $250,000.
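The three evasion methods named in the story (fictitious apartment numbers, unneeded direction abbreviations, extra punctuation) suggest what a matching key for addresses might have to strip out. The following is a minimal sketch in Python, not a method taken from the book or from the clubs' software. The function name `normalize_address`, the unit keywords, and the direction list are all hypothetical choices made for illustration.

```python
import re

# Hypothetical pattern for apartment-style designators ("Apt 4B", "Suite 12", "#7").
UNIT_PATTERN = re.compile(
    r"\b(?:apt|apartment|unit|suite|ste)\b\s*#?\s*\w+|#\s*\w+",
    re.IGNORECASE,
)

# Hypothetical list of direction words and abbreviations to ignore.
DIRECTIONS = {
    "n", "s", "e", "w", "ne", "nw", "se", "sw",
    "north", "south", "east", "west",
}


def normalize_address(raw: str) -> str:
    """Reduce an address line to a crude key for duplicate detection."""
    text = raw.lower()
    text = text.replace(".", "")           # "P.O." and "PO" become the same token
    text = UNIT_PATTERN.sub(" ", text)     # drop apartment-style designators
    text = re.sub(r"[^\w\s]", " ", text)   # remaining punctuation becomes spaces
    tokens = [t for t in text.split() if t not in DIRECTIONS]
    return " ".join(tokens)


if __name__ == "__main__":
    variants = [
        "P.O. Box 123",
        "PO Box 123, Apt. 4B",
        "P.O. Box 123 N.",
        "PO BOX 123 #7",
    ]
    for v in variants:
        print(repr(v), "->", repr(normalize_address(v)))
```

A key this aggressive will also merge some addresses that are genuinely different (for example, two real apartments in one building), so in practice it would more plausibly flag candidates for review than reject orders outright.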
1.1.3 Mars Orbiter
The Mars Climate Orbiter, a key part of NASA's program to explore the planet Mars, vanished in September 1999 after rockets were fired to bring it into orbit around the planet. An investigative board later discovered that NASA engineers had failed to convert English measures of rocket thrust to newtons, the metric unit of force, and that this was the root cause of the loss of the spacecraft. The discrepancy between the two measures, which was relatively small, caused the orbiter to approach Mars at too low an altitude, and it smashed into the planet instead of reaching a safe orbit. The result was the loss of a $125 million spacecraft and a significant setback in NASA's ability to explore Mars...
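One way to guard against this kind of mismatch is to keep the unit attached to the number, so that a bare value can never be mistaken for newtons. The sketch below is illustrative only: the `Force` class and its unit labels are hypothetical, and the excerpt gives no actual thrust figures, so the sample value is made up. The conversion factor is the standard definition of the pound-force in newtons.

```python
from dataclasses import dataclass

# One pound-force expressed in newtons (standard conversion factor).
LBF_TO_NEWTON = 4.4482216152605


@dataclass(frozen=True)
class Force:
    """A force value that always carries its unit (hypothetical helper)."""
    value: float
    unit: str  # "N" for newtons, "lbf" for pounds-force

    def to_newtons(self) -> float:
        if self.unit == "N":
            return self.value
        if self.unit == "lbf":
            return self.value * LBF_TO_NEWTON
        # Refuse to guess: an unknown unit is an error, not a default.
        raise ValueError(f"unknown force unit: {self.unit!r}")


if __name__ == "__main__":
    reading = Force(10.0, "lbf")  # made-up sample value
    print(reading.to_newtons(), "N")
```

The design choice here is that conversion fails loudly on an unrecognized unit instead of silently treating the number as metric.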
Table of Contents
1. Introduction
2. Who Owns Information?
3. Data Quality in Practice
4. Economic Framework of Data Quality and the Value Proposition
5. Dimensions of Data Quality
6. Statistical Process Control and the Improvement Cycle
7. Domains, Mappings, and Enterprise Reference Data
8. Data Quality Assertions and Business Rules
9. Measurement and Current State Assessment
10. Data Quality Requirements
11. Metadata, Guidelines, and Policy
12. Rule-Based Data Quality
13. Metadata and Rule Discovery
14. Data Cleansing
15. Root Cause Analysis and Supplier Management
16. Data Enrichment/Enhancement
17. Data Quality and Business Rules in Practice
18. Building the Data Quality Practice