Administering Informix Dynamic Server
Building the Foundation
By Carlton Doe
MC Press
Copyright © 2008 Carlton Doe
All rights reserved.
Introduction to Informix Dynamic Server
In this Chapter
* Understanding the server's architecture
* Definition of key terms
By virtue of the fact that you're reading this book, either you are new to Informix Dynamic Server (IDS) or you're migrating from an earlier version of the data server and want to know how to use and manage the new functionality available to you. In either case, you're in for a good learning experience. There's a lot involved in understanding how this database server operates and how to make it perform in your particular environment.
The intent of this book is to make the learning process easier by distilling for you what you really need to know to configure, run, and tune a database environment using Informix Dynamic Server. Although at first glance IDS may seem similar to other data server products, this server is in fact unlike any other. It won't take you long to see that IDS is far more advanced technologically as well as more stable, easier to administer, and more robust than competing servers.
IDS 11, the focus of this book, is both very much like earlier versions of the product and also radically different. Version 11 incorporates significant new technology built on a completely modified server architecture introduced with IDS 9.1. Yet for all the changes, the server is still managed and operates as before — in many cases, it's even easier to administer!
Today's IDS is not a "regular" data server as many people might classify it; rather, it is an "object-relational" server. IDS includes high-performance core server technology developed in the early 1990s to take advantage of emerging symmetric multiprocessing (SMP) and massively parallel processing (MPP) technology, and it has been continually enhanced since then. The biggest enhancement was the addition of object-oriented database functionality throughout the server in 1995. This feature completely changed how DBAs and application developers can and should model and use data within their databases and applications. IDS now offers many more tools and options than the standard, relational-only data servers.
This chapter introduces the architecture of the data server and its three main components. We'll also go over some key terminology that is either unique to Informix Dynamic Server or has a new or different meaning when used with this product. By the end of the chapter, you should understand why the product has the name it does, what a thread is, and what the fundamental components of the data server are.
This book assumes you have some level of familiarity with the SQL language and with standard relational and object-oriented database concepts. However, we won't engage in a heavy bits-and-bytes discussion. If you need that level of detail, consult the documentation accompanying your distribution of the software or visit IBM's Informix Web site.
What Is Informix Dynamic Server?
Informix Dynamic Server is a data server — or, to use a marketing buzzword, an object-relational database management system (ORDBMS). The server can work both with "standard" (or relational) data types, such as character and numeric values, and with object-oriented data types. This new technology is an extension to the ability the server has had for years to store nonnumeric or non-ASCII data in binary large objects (BLOBs). Today, IDS can do more than just write the binary stream to disk as it does with simple BLOBs. Using the appropriate functions, you can not only store the data "object" but also manipulate, search, alter, and correlate it; you can execute any operation against it that makes sense for the data and is provided by the function. This new functionality and associated data type support is commonly referred to as extensibility and extensible data types.
In general terms, though, the server's job is to provide an environment whereby data can be stored, retrieved, changed, and deleted in such a way that data itself is not lost, compromised, or modified outside the rules established by the data server or the database administrator. IDS contains both logical and physical mechanisms to accomplish these tasks.
From a logical point of view, IDS provides the ability to set rules and conditions governing not only the acceptable range of values for a column in a table but also where a row will be stored on disk. You can specify the conditions that must exist for data elements in the row to be modified or deleted. You can set up procedures to be invoked automatically and execute specific database actions to enforce still other rules when data in a table or column is added, modified, or deleted.
From a physical point of view, IDS keeps a series of logs that record changes made to data as they occur and provides a locking mechanism you can use to ensure that data requested by one user session can't be changed or deleted by another. The data server can create copies, in whole or in part, of database environments, either within the same physical server or on a separate server, to minimize the impact of a physical server failure, to distribute/collect data between database environments, to enable load balancing, or to provide continuous availability of data services. Last, IDS provides the ability to create backups of database environments that can be used to restore some or all of a database environment should a mechanical failure or user error occur. You can even configure the restore to stop at a specific point in time so the user error doesn't recur, permitting full data recovery up to the moment the operation took place. Other recently added functionality enables restoring a backup created on one physical server to another physical server even if the second server isn't using the same operating system (O/S).
Built on the widely heralded Dynamic Scalable Architecture (DSA), the IDS data server was designed to run on, and take advantage of, today's computer systems with multiple physical CPUs and larger memory stores. In fact, field studies have shown that as more physical resources (e.g., CPUs) are added to the system, IDS performance increases linearly.
Central to the design of DSA and its functionality is a concept called process parallelization, or the processing of compatible tasks in parallel. The general SQL-processing mechanism of the data server is built to work in smaller, discrete steps. These steps are allocated across the CPU resources so that they occur more or less simultaneously, or in parallel. Figure 1.1 shows how this process works from a conceptual point of view.
The figure illustrates how a query might be executed in parallel. At the beginning of the process, a series of disk reads occurs. The results from this step, and from every other in the process, are passed in real time up the processing ladder of functional operations. At each level of the process, there are fewer rows to work with, and the results generated by each operation are joined with the results of the other operations at the same level. Eventually, the data server returns the final result to the application in significantly less time than if the query executed the steps serially, with each step waiting for the previous one to finish and working on larger amounts of data.
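The ladder just described can be sketched in a few lines of Python. This is only a conceptual illustration, not how IDS is implemented: the table fragments, the function names (process_fragment, run_query), and the worker count are all assumptions made for the example. Each fragment is scanned, filtered, and sorted by its own worker, and the smaller partial results are then merged at the top of the ladder. Python threads show the structure of the work here, not the speedup a real data server would get from multiple CPUs.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table, already split into fragments (for example, one per disk).
FRAGMENTS = [
    [("alice", 120), ("bob", 45), ("carol", 300)],
    [("dave", 80), ("erin", 510)],
    [("frank", 15), ("grace", 230), ("heidi", 95)],
]


def process_fragment(fragment, min_amount):
    """Lower rungs of the ladder: scan one fragment, filter it, sort what is left."""
    qualifying = [row for row in fragment if row[1] >= min_amount]
    return sorted(qualifying, key=lambda row: row[1])


def run_query(fragments, min_amount, workers=3):
    """Process every fragment in its own worker, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda f: process_fragment(f, min_amount), fragments))
    # Top rung: merge the already sorted partial results into one ordered answer.
    return list(heapq.merge(*partials, key=lambda row: row[1]))


if __name__ == "__main__":
    print(run_query(FRAGMENTS, min_amount=100))
```

Note that the merge step sees only the rows that survived filtering, which mirrors the point that each level of the process has fewer rows to work with.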
In addition to processing SQL operations more quickly, IDS executes most administrative functions — such as building indexes, updating database statistical information, and checking and potentially repairing the database system after a failure — in parallel as well. This type of functionality brings with it the responsibility to monitor and tune for it. As the data server administrator, you must set the resource limitations within which parallel processing of queries and other activities must occur. Look for coverage of this topic and other advanced tuning operations in the companion to this book, Administering Informix Dynamic Server, Advanced Topics.
Another key feature of IDS's architecture is the ability of the server to allocate and release physical server resources dynamically when necessary. For example, you might configure an IDS database environment to use x MB of system RAM, y data locks, and so on when the database environment starts. If the data processing load spikes, IDS will try to secure more system resources (e.g., memory) to handle the increased load rather than fail due to insufficient resources. Knowing that IDS's attempt to obtain additional system resources is not unbounded, you can set explicit boundaries to the resources that the database environment can take from the system.
Finally, you can adjust most IDS database environment configuration parameters while the database environment is online and processing user transactions. IBM continues to enhance this functionality with every release of the data server, and it is nearly complete now. IDS's ability to intelligently self-manage required resources and be administered without interruption accounts for the word "dynamic" in the product's name.
With the addition of object-oriented technology, IDS delivers proven functionality that efficiently integrates new and complex data types directly into the database. It handles time-series, spatial, geodetic, Extensible Markup Language (XML), video, image, and other user-defined data side by side with traditional legacy data to meet today's most rigorous data and business demands. IDS is also a development-neutral environment that supports a comprehensive array of application development tools for rapid deployment of applications under Linux, Unix, Apple Mac OS X, and Microsoft Windows operating environments.
The Informix Dynamic Server Model
Data server architecture is a significant differentiator and contributor to IDS's performance, scalability, and ability to support new data types and processing requirements. Nearly all data servers available today use an older technological design that requires each database operation for an individual user (e.g., read, sort, write, communications tasks) to invoke a separate operating system process. This architecture worked well when database sizes and user counts were relatively small. Today, these types of servers spawn many hundreds, even thousands, of individual processes that the operating system must create, queue, schedule, manage/control, and then terminate when they're no longer needed. Given that, generally speaking, any individual system CPU can work on only one thing at a time — and that the operating system must work on each process before returning to the top of the queue — this data server architecture creates an environment in which individual database operations must wait for one or more passes through the queue to complete their task. Scalability with this type of architecture has nothing to do with the software; it depends entirely on the speed of the processor — how fast it can work through the queue before it starts over again.
As I mentioned in the previous section, the Dynamic Scalable Architecture on which Informix Dynamic Server is built was designed to work with multiple physical CPUs and larger memory stores to create an operating environment with greater data server performance and improved stability. The DSA includes built-in multithreading and parallel-processing capabilities, dynamic and self-tuning shared memory components, and intelligent logical data storage capabilities, supporting the most efficient use of all available system resources. Three major functional components make up the architectural model for Informix Dynamic Server:
* The processor component
* The shared memory component
* The disk component
Let's look at each of these pieces individually.
The Processor Component
IDS provides the unique ability to scale the database environment by employing a dynamically configurable pool of data server processes called virtual processors (VPs). (Look for an in-depth explanation of exactly how VPs work in Administering Informix Dynamic Server, Advanced Topics.) As you saw in Figure 1.1, IDS takes a database operation such as a sorted data query and segments it into task-oriented subtasks (data scan, join, group, sort) for rapid processing by virtual processors that specialize in each type of subtask. VPs mimic the functionality of the hardware CPUs in that they schedule and manage user requests using multiple, concurrent threads. Figure 1.2 illustrates how IDS's pool of virtual processors operates.
A thread represents a discrete task within a data server process. Multiple threads can execute simultaneously, in parallel, across the pool of virtual processors. Unlike a CPU process-based (or single-threaded) engine, which leaves each task on the system CPU for its given unit of time (even if no work can be done, thus wasting processing time), IDS's virtual processors are multithreaded. As a consequence, when a thread either is waiting for a resource or has completed its task, a thread switch occurs and the virtual processor immediately begins work on another thread. As a result, precious CPU processing power is used to satisfy as many user requests as possible in the given amount of time. Figure 1.3 illustrates this capability, known as fan-in parallelism.
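To make the idea of a thread switch concrete, here is a minimal sketch in Python of one virtual processor serving several user threads. It is a toy model built on generators, not the server's actual scheduler; the names session and run_virtual_processor are hypothetical. Each yield stands for a thread giving up the virtual processor because it has to wait, at which point the processor immediately picks up the next ready thread.

```python
from collections import deque


def session(name, steps):
    """A hypothetical user thread that does some work, then has to wait."""
    for step in range(1, steps + 1):
        # Yielding models the thread giving up the virtual processor
        # because it must wait on a resource (for example, a disk read).
        yield f"{name}: step {step}"


def run_virtual_processor(threads):
    """Run all threads on one virtual processor, switching whenever one waits."""
    ready = deque(threads)
    trace = []
    while ready:
        thread = ready.popleft()
        try:
            trace.append(next(thread))
        except StopIteration:
            continue  # thread finished; go straight to the next ready thread
        ready.append(thread)  # thread is waiting; requeue it and switch
    return trace


if __name__ == "__main__":
    for line in run_virtual_processor([session("A", 2), session("B", 1)]):
        print(line)
```

The trace interleaves the work of the two sessions, so the single virtual processor is never idle while any thread still has work to do.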
Not only can one virtual processor respond to multiple user requests in any given unit of time, as this figure illustrates, but one user request also can be distributed across multiple virtual processors. For example, with a processing-intensive request such as a multitable join, the data server divides the task into multiple subtasks and then spreads these subtasks across all available virtual processors. With the ability to distribute tasks, the request is completed more quickly. Figure 1.4 illustrates this capability, referred to as fan-out parallelism.
The net effect of IDS's two types of parallelism is more work being accomplished more quickly than with single-threaded architectures. In other words, the data server is faster.
Because threads aren't statically assigned to virtual processors, load balancing occurs dynamically within IDS. Outstanding requests are serviced by the first available virtual processor, balancing the workload across all available resources. For efficient execution and versatile tuning, you can group VPs into classes, each optimized for a particular function. Figure 1.5 illustrates this capability, showing VPs optimized for CPU operations, disk I/O, communications, administrative, and other tasks.
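The following Python sketch models this kind of dynamic load balancing. It rests on simplifying assumptions: each class of VP is represented by a handful of worker threads sharing one ready queue, and the class labels ("cpu", "disk_io") and function names are invented for the example rather than taken from the product. Because the workers in a class pull from a shared queue, each request goes to whichever one is free first.

```python
import queue
import threading


def start_vp_class(class_name, count, results):
    """Start `count` workers that share a single ready queue for this class."""
    ready = queue.Queue()

    def vp_loop(vp_id):
        while True:
            request = ready.get()
            if request is None:  # sentinel: shut this worker down
                break
            # Record which worker of which class serviced the request.
            results.append((request, f"{class_name}-{vp_id}"))

    workers = [threading.Thread(target=vp_loop, args=(i,)) for i in range(count)]
    for worker in workers:
        worker.start()
    return ready, workers


def dispatch(requests, vp_counts):
    """Route each (class_name, request) pair to its class queue, then wait."""
    results = []
    classes = {name: start_vp_class(name, n, results) for name, n in vp_counts.items()}
    for class_name, request in requests:
        classes[class_name][0].put(request)
    for ready, workers in classes.values():
        for _ in workers:
            ready.put(None)  # one shutdown sentinel per worker
        for worker in workers:
            worker.join()
    return results


if __name__ == "__main__":
    work = [("cpu", "q1"), ("disk_io", "read1"), ("cpu", "q2")]
    print(dispatch(work, {"cpu": 2, "disk_io": 1}))
```

Which of the two "cpu" workers handles a given request can differ from run to run. That is the point of the design: no request is tied to a particular worker in advance.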
The design of IDS's VPs also includes administrative access, resulting in the ability to easily look at and analyze the activities requested by users and the data server. With single-threaded servers, each operation is a separate and independent operating system process with its own data stack, instruction cache, and other O/S overhead, making it difficult to build a comprehensive view of what's happening inside the data server. In contrast, IDS's onstat and oncheck administrative utilities gather information from the database environment's shared resources and can easily display who is doing what and how much of an impact it's having on the system.
You can configure the database environment with the appropriate number of VPs in each class to handle the expected workload for that environment. You can even define custom VPs to be used only for specific functions. Called user-defined virtual processors (UDVPs), these VPs have the same processing power as the database environment's core CPU VPs but are isolated from operating on core functionality. With this separation, if the user function that a UDVP is executing "misbehaves," the function can't intentionally or unintentionally cause an abnormal shutdown of the database environment.
If necessary, you can also adjust the number and type of VPs while the database environment is online, without interrupting database operations — for example, to handle different load mixes or occasional periods of heavy activity. In Linux, Mac OS X, and Unix systems, the use of multithreaded virtual processors significantly reduces the number of O/S processes, requiring less context switching as a result. Windows systems implement VPs as threads to take advantage of the inherent multithreading capability of the operating system. Because IDS includes its own threading capability for servicing client requests, it requires fewer Windows threads, reducing the system thread-scheduling overhead and providing better throughput.
In making full use of the hardware processing cycles, IDS needs less hardware power to achieve performance comparable to or better than that of other database servers. In fact, real-world tests and customer experiences indicate IDS needs only 25 percent to 40 percent of the hardware resources to meet or exceed the performance characteristics of single-threaded or process-based data servers. This efficiency means your business can save money on hardware purchases as well as on ongoing maintenance.
The Shared Memory Component
With the consolidation of tasks and processes into VPs, all the memory used by the data server is consolidated as well. This large, single block of shared memory enables IDS to transfer data easily among the VPs. It also lets other user connections determine whether the data they need has already been queried by another user and can be used for their request, rather than having to go out to disk to get it. The memory inside this block is used and reused as needed to process user connections. When a user session terminates, the thread-specific memory for the session is freed and reused by another session.
If the database environment requires more memory to process its workload, the data server allocates additional blocks of memory dynamically from the operating system until it reaches the limit set during the database environment's configuration. When the need for the additional memory is gone, the additional segments of memory are released. You can make similar changes manually while the database environment is running. This ability to dynamically add and release memory helps eliminate down time to retune the environment as the workload increases and decreases. Released memory is returned to the general O/S pool for use by other processes, further enhancing the efficiency of the server's use of shared memory.
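The grow-and-release behavior can be pictured with a small Python model. This is a sketch under stated assumptions, not the server's memory manager: the class name SharedMemoryPool, its parameters, and the segment sizes are hypothetical. The pool adds whole segments when a request would not fit, refuses to grow past its configured limit, and hands extra segments back once they are no longer needed.

```python
class SharedMemoryPool:
    """Toy model of a memory pool that grows and shrinks in whole segments."""

    def __init__(self, initial_mb, segment_mb, limit_mb):
        self.initial_mb = initial_mb    # size allocated at startup
        self.segment_mb = segment_mb    # size of each additional segment
        self.limit_mb = limit_mb        # upper bound set by the administrator
        self.allocated_mb = initial_mb
        self.used_mb = 0

    def request(self, mb):
        """Reserve memory, adding segments if needed; refuse past the limit."""
        needed = self.used_mb + mb
        target = self.allocated_mb
        while target < needed:
            target += self.segment_mb
        if target > self.limit_mb:
            return False  # growing would exceed the configured limit
        self.allocated_mb = target
        self.used_mb = needed
        return True

    def release(self, mb):
        """Free memory and give back any additional segments now unused."""
        self.used_mb = max(0, self.used_mb - mb)
        while (self.allocated_mb - self.segment_mb >= self.initial_mb
               and self.allocated_mb - self.segment_mb >= self.used_mb):
            self.allocated_mb -= self.segment_mb


if __name__ == "__main__":
    pool = SharedMemoryPool(initial_mb=100, segment_mb=50, limit_mb=200)
    print(pool.request(80), pool.allocated_mb)
    print(pool.request(60), pool.allocated_mb)
    print(pool.request(100), pool.allocated_mb)
    pool.release(60)
    print(pool.allocated_mb)
```

In this model a refused request leaves the pool unchanged, and the pool never shrinks below its startup size.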
Excerpted from Administering Informix Dynamic Server by Carlton Doe. Copyright © 2008 Carlton Doe. Excerpted by permission of MC Press.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.