From the Author: The Information Architecture To Pursue
A top priority of CIOs and organizations everywhere is how to best adapt the environment to manage the information asset. There is a plethora of available systems to throw into that equation. The possibilities can be daunting.
- "One size fits all" does not apply to information architecture.
- Gone are the days when vendors could bring their laminated architectures to a client with credibility.
- Organizations must go forward incrementally from where they are and deliver business returns with each increment, on a quarterly rather than yearly turnaround.
For such an important asset, the barometer cannot be a competitor’s environment. Early adopters of good practices will reap the most rewards. Following are several key actions to take to improve a company’s information architecture.
Move Key Operational Systems To In-Memory
In-memory for operational systems is appropriate wherever SQL is used operationally and the performance gains of in-memory can be utilized.
Configurations differ. Products like VoltDB are NewSQL systems purpose-built for the data storage and throughput demands of transactional systems. NewSQL is used today for traditional high-performance applications such as capital markets data feeds, financial trades, telco record streams, sensor-based distribution systems, wireless, online gaming, fraud detection, digital ad exchanges, and micro transaction systems.
NewSQL systems are in-memory, schema-based DBMS systems that scale out in a cluster. They have high availability architectures that use synchronous, multi-master, active-active replication. As the name implies, NewSQL supports full SQL – aggregate functions, LIKE, UNION, materialized views, indexes, etc.
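To make the "full SQL" point concrete, here is a minimal sketch of an operational workload that uses an aggregate function, LIKE, UNION, and an index against an in-memory database. It assumes Python's built-in sqlite3 module in ":memory:" mode purely as a stand-in: SQLite is not a NewSQL product, does not scale out in a cluster, and has no materialized views. The trades and quotes tables and the function names are hypothetical.

```python
import sqlite3

# sqlite3 in ":memory:" mode is only a stand-in for an in-memory SQL engine.
# It is NOT a NewSQL system and offers no clustering or replication.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE trades (id INTEGER PRIMARY KEY, symbol TEXT, qty INTEGER, price REAL);
    CREATE TABLE quotes (symbol TEXT, price REAL);
    CREATE INDEX idx_trades_symbol ON trades(symbol);
    """
)
conn.executemany(
    "INSERT INTO trades (symbol, qty, price) VALUES (?, ?, ?)",
    [("ACME", 100, 10.0), ("ACME", 50, 12.0), ("ACORN", 10, 5.0), ("ZETA", 5, 99.0)],
)
conn.executemany("INSERT INTO quotes VALUES (?, ?)", [("ZETA", 98.5), ("BETA", 1.0)])


def notional_by_symbol(connection, pattern):
    """Aggregate (SUM) traded value for symbols matching a LIKE pattern."""
    cursor = connection.execute(
        "SELECT symbol, SUM(qty * price) FROM trades "
        "WHERE symbol LIKE ? GROUP BY symbol ORDER BY symbol",
        (pattern,),
    )
    return cursor.fetchall()


def all_symbols(connection):
    """UNION the symbols seen in either table, without duplicates."""
    cursor = connection.execute(
        "SELECT symbol FROM trades UNION SELECT symbol FROM quotes ORDER BY symbol"
    )
    return [row[0] for row in cursor]


print(notional_by_symbol(conn, "AC%"))
print(all_symbols(conn))
```

The same statements would be expected to run largely unchanged on a SQL-compliant in-memory engine, which is the practical appeal of keeping SQL in the operational tier.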
In-memory is also found in DBMS environments that primarily scale up, like SAP HANA, Teradata, and IBM PureData. With in-memory systems like HANA, a company can store its entire operational database in RAM as the primary persistence layer. With an increasing number of cores (multi-core CPUs) becoming standard, CPUs are able to process increased data volumes in parallel. Main memory is no longer a limited resource. These systems recognize this and fully exploit main memory. Caches and layers are eliminated because the entire physical database is sitting on the motherboard and is therefore in memory all the time. By providing added performance and full ACID compliance, these systems are pushing up the threshold of size and complexity where NoSQL systems make sense.
Selectively Utilize Data Stream Processing
Data stream processing and event stream processing can hardly be considered data stores alongside DBMSs and file systems, since they do not actually store data. However, they are data processing platforms. Data is only stored in data stores for later processing anyway, so if an organization can perform all processing without the storage, it can skip the storage. Profile data, such as that found in a Master Data Management hub, can be added to the processing alongside the stream, providing instantaneous, context-sensitive processing in real time.
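The idea of enriching a stream with profile data, without persisting the events, can be sketched as follows. This is a minimal illustration in plain Python, not a real stream processing product: the CUSTOMER_PROFILES dictionary stands in for a Master Data Management hub, and the event fields, the daily_limit attribute, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical profile data, standing in for a Master Data Management hub.
CUSTOMER_PROFILES = {
    "c1": {"tier": "gold", "daily_limit": 500.0},
    "c2": {"tier": "basic", "daily_limit": 100.0},
}


def enrich_and_flag(events, profiles):
    """Yield an alert for each event that pushes a customer past their limit.

    Events are consumed one at a time and never written to a data store;
    only a small running total per customer is held in memory.
    """
    running_total = defaultdict(float)
    for event in events:
        customer_id = event["customer_id"]
        profile = profiles.get(customer_id)
        if profile is None:
            continue  # no master record, so there is no context to apply
        running_total[customer_id] += event["amount"]
        if running_total[customer_id] > profile["daily_limit"]:
            yield {
                "customer_id": customer_id,
                "tier": profile["tier"],
                "total": running_total[customer_id],
            }


def sample_stream():
    """A tiny hypothetical event stream; a real one would be unbounded."""
    yield {"customer_id": "c1", "amount": 300.0}
    yield {"customer_id": "c2", "amount": 80.0}
    yield {"customer_id": "c2", "amount": 30.0}
    yield {"customer_id": "c3", "amount": 999.0}
    yield {"customer_id": "c1", "amount": 250.0}


alerts = list(enrich_and_flag(sample_stream(), CUSTOMER_PROFILES))
for alert in alerts:
    print(alert)
```

Because the function is a generator, each alert is produced as soon as the triggering event arrives, which is the property that makes the storage step optional.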
Examine The Syndicated Data Marketplace
Data has been available for purchase for a while, but it has mostly been sourced for a very specific need, such as a marketing list for a promotion. As organizations move to more widespread data access and leverageable data structures, an investment in syndicated data can be leveraged throughout the enterprise.
Embrace Master Data Management
When facing a mounting workload of adding value to an enterprise through information management, it is wise to consider the key components of each application that can be managed separately. The most prominent of these components is master data. Building master data in a scalable, sharable manner, such as with a master data management approach, will shorten project development time and bring consistent data to multiple applications.
Utilize Data Virtualization
Dispense with the notion that each query must be served from a single data store. With data virtualization making big gains in recent years, data can be stored selectively in its best-fit platform and still be served to queries requiring data from elsewhere. Data virtualization can save the day for one-off queries, or selective queries using data virtualization can be architected into scheduled operations.
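As a rough illustration of one query drawing on two best-fit platforms, the sketch below joins order data held in a relational store with customer attributes held elsewhere, without copying either into the other. It is a hand-rolled stand-in, not a data virtualization product: sqlite3 plays the relational store, a plain dictionary plays the second platform, and all table, field, and function names are hypothetical.

```python
import sqlite3

# Store 1: a relational platform holding orders (sqlite3 as a stand-in).
orders_db = sqlite3.connect(":memory:")
orders_db.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id TEXT, amount REAL)"
)
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "c1", 120.0), (2, "c2", 40.0), (3, "c1", 60.0)],
)

# Store 2: a different platform holding customer attributes
# (a dictionary standing in for, say, a document store).
customer_store = {
    "c1": {"name": "Acme Corp", "region": "EMEA"},
    "c2": {"name": "Beta LLC", "region": "APAC"},
}


def revenue_by_region(orders_conn, customers):
    """Answer one question from two stores, joining at query time."""
    totals = {}
    cursor = orders_conn.execute(
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    )
    for customer_id, amount in cursor:
        # Look up the region in the second store; data stays where it lives.
        region = customers.get(customer_id, {}).get("region", "UNKNOWN")
        totals[region] = totals.get(region, 0.0) + amount
    return totals


print(revenue_by_region(orders_db, customer_store))
```

Pushing the aggregation down to the relational store and fetching only the small lookup from the second store mirrors the kind of optimization a virtualization layer would be expected to perform.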
Marginalize Multidimensional Databases
The hyper-denormalized multidimensional structure has proved very difficult to use effectively. When created "spot on" to a query need, it performs well. When mismatched, either with too few columns in the structure (requiring "drill through") or with too many (creating overhead), it becomes an encumbrance.
Use Data Warehouses Strategically
The data warehouse concept is still necessary in any modern environment. The idea of sharing the data, the DBMS platform, a model, the methods, and the tools across different data sets and subject areas brings many benefits.
Making data warehouse data columnar in orientation would generally help a data warehouse more than it would hurt. However, people generally don’t like any downsides with their upgrades. A data warehouse community is not just multiple people; it is a set of very disparate user groups. The "groupthink" of the data warehouse will also limit finding the value proposition for the in-memory data warehouse, although SSD storage is a must.
The data warehouse can be the lowest common denominator approach to storing data, which is not bad for the mid-specification analytic workload.
Data warehouses will see evolutionary change, but new applications and those who want specific analytic features may just source their data from the data warehouse. There are many of these "marts" being built today. The expansion of platform features in DBMS will continue as marts go searching for their best-fit platform.
Make Analytic Marts Columnar And In-Memory
Analytic marts built to support a single application, subject area, or department are well served by optimizing around their specific requirements. These marts, which are multiplying throughout enterprises, have eschewed joining forces with the data warehouse, often because the analytic features will not be turned on for the data warehouse.
At less than 5 terabytes, analytic marts provide a great playground, with little downside or bureaucracy to prevent trying out a columnar orientation and in-memory processing. Once tried, these features have quick appeal because they often deliver performance that is orders of magnitude greater for analytic processing.
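The difference between row and columnar orientation can be sketched in a few lines. This is a toy, in-memory illustration using only the Python standard library, not a columnar DBMS; the sample records and function names are hypothetical, and it makes no claim about actual speedups.

```python
from array import array

# Hypothetical mart data in row orientation: one record per order.
rows = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "APAC", "amount": 40.0},
    {"order_id": 3, "region": "EMEA", "amount": 60.0},
]


def to_columns(records):
    """Pivot row records into one contiguous sequence per column."""
    return {
        "order_id": array("q", (r["order_id"] for r in records)),
        "region": [r["region"] for r in records],
        "amount": array("d", (r["amount"] for r in records)),
    }


def total_amount_row_store(records):
    """Row orientation: every whole record is visited to read one field."""
    return sum(r["amount"] for r in records)


def total_amount_column_store(columns):
    """Columnar orientation: only the 'amount' column is scanned."""
    return sum(columns["amount"])


columns = to_columns(rows)
print(total_amount_row_store(rows))
print(total_amount_column_store(columns))
```

Both functions return the same total, but the columnar version reads a single densely packed array and ignores the other columns entirely, which is why analytic queries that touch a few columns of wide tables tend to benefit from this layout.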
Praise for Information Management:
"This is an excerpt from the first chapter of Information Management: Strategies for Gaining a Competitive Advantage with Data, written by William McKnight…he addresses the relationship between information management and business value, explores data management technologies, and offers advice on maximizing the potential of enterprise information." - SearchDataManagement.com, March 31, 2014
"…overall it does provide some very useful information and guidance that could be used as part of a preparation and planning exercise towards developing a suitable data and information management strategy… it would make a suitable first guide for anyone who has been given the task of developing such a scheme, and might help to clarify some of the key issues in such a way as to make the task a little bit easier." Score: 7 out of 10 - BCS.org, April 2014
"William McKnight has delivered a very clear and concise explanation about how to get the most from your organization’s data. He steps the reader through an assortment of data processing technologies and approaches and show which deliver the best ROI for which types of workloads. This is a desperately needed mapping that many users will find invaluable!" - Wayne Eckerson, business intelligence thought leader and president of Eckerson Group, a business-technology management consulting firm specializing in BI, performance management, and analytics.
"A blueprint and action plan for a corporate information management strategy, this book is a useful guide for anyone who wishes to improve business success with technology. Author William McKnight provides the foundation and tools for information managers to set policies and programs for the improved management of information, while addressing advances in architecture and technology principles." - Julie Langenkamp-Muenkel, Editorial Director of Information-Management.com
"I always enjoy William’s writing, especially his balance between inspiring foresight and pragmatic advice rooted in real-world experience. He has skillfully shown that poise again: with his guidance you’ll find Information Management transforms what can be a burdensome responsibility into an insightful practice." - Donald Farmer, VP Product Management, qlikview.com
"Many claim we're in the golden age of data management; every traditional paradigm and approach seems to have a newer, better, and faster alternative. This book provides a terrific overview of the new class of technologies that must be integrated into every CIO's technology plan." - Evan Levy, Co-Author, Customer Data Integration: Reaching a Single Version of the Truth
"Big data is no longer just an IT topic. It’s one that’s now top-of-mind for executives, too. William McKnight takes the increasingly knotty hairball of information management (its practices, technologies, and skills) and unravels it in this timely and relevant book. A must-read for business and IT pros alike." - Jill Dyché, SAS Vice President and author of The New IT
"I challenge any Information Management professional to not get value from this book. William covers a range of topics, and has so much knowledge he is able to offer usable insights across them all. The book is unique in the way it provides such a solid grounding for anyone making architectural or process decisions in the field of information management, and should be required reading for organizations looking to understand how newer approaches and technologies can be used to enable better decision making." - Michael Whitehead, CEO and Co-Founder, WhereScape Software