Read an Excerpt
Chapter 2: Availability EnhancementsThe downtime of production databases can have a significant impact on the organization: You can lose revenue, you can lose credibility, you can lose customers, you can lose orders, you may not deliver on obligations, you may violate contract requirements for uptime, and you may encounter legal proceedings. This chapter discusses the following:
- General high-availability technology
- Data guard
- Online operations
- Replication enhancements
General High-Availability TechnologyOracle 9i introduced a number of enhancements to its comprehensive data availability solution. For example, Oracle9i realizes that downtime is an eventuality, and that one aspect of database availability is to minimize the amount of downtime by focusing on recovering from an outage very quickly.
Some of the high-availability enhancements are
- Minimal input/output (I/O) recovery
- Fast-start time-based recovery limit
- Oracle Flashback
- Resumable space allocation
- Export/import features
Minimal I/O RecoveryOracle9i recognized that failure is a reality and focused specifically on recovering rapidly from the outage by optimizing the speed with which it can apply redo logs. One obvious solution is to use disks and drivers that can quickly access the logs and can transfer the data into memory rapidly. A second solution is to ensure that the redo logs reside in a contiguous area on disk. A third solution is to minimize the number of redo entries that must be read and applied; this concept is called minimal I/O recovery.
Oracle documents every change that is made to the database in the redo log. Changes are recorded in the redo log as redo entries, which are comprised of a set of change vectors. A change vector tracks a specific change that is made to a single data block. The redo logs also track the rollback segments in the database. Oracle uses the redo log to recover from a database crash. Oracle scans the redo logs, reads the change vectors, and applies them to the corresponding data block in the database.
To recover from a crash, you must recover the data blocks in the cache that were dirty at the time of the failure. The entries that pertain to the dirty blocks in the data cache are documented in the redo log. However, the redo log is very likely to contain entries for dirty data blocks that were written back to disk before the crash, and therefore were not dirty at the time of the failure. Oracle9i optimized the recovery process to exclude the redo log entries that do not apply to blocks that were dirty at the time of the failure (see Figure 2-1).
Oracle first does a quick pass through the redo logs to identify the change vectors that pertain to the dirty blocks and then stores this information in the PGA. In a second pass, Oracle uses the information in the PGA to apply just the identified changes. Since Oracle only sequentially reads the online log and does not actually access the data blocks themselves, the overall time to recover is minimized by the two-pass approach.
Fast-Start Time-Based Recovery LimitOracle9i periodically stores the checkpoint redo byte address (RBA) in the redo log. A checkpoint is designated by a system change number (SCN) in the control file. The checkpoint assures you that all data blocks with an SCN that is less than or equal to the checkpoint SCN have been written to disk. In the event of a failure, only redo entries with an SCN higher than the checkpoint SCN need to be read and applied. More specifically, the checkpoint RBA identifies the starting point in the redo log from which redo entries must be applied.
The time to recover is dependent on how recently the last checkpoint was done, since it determines the number of data blocks with a higher SCN than the checkpoint SCN. The time to recover is also dependent upon the number of redo log blocks that must be read to identify changes. Therefore, frequent checkpointing actions result in a lower time for instance recovery; the price for this benefit is the adverse impact on performance. This performance degradation occurs because checkpointing uses the database writer (DBWn) processes. You must make a trade-off between instance recovery time and operational database performance.
Oracle9i introduces the fast-start checkpointing architecture to implement fast-start instance recovery functionality. The fast-start checkpointing architecture performs incremental checkpointing, and it writes the oldest dirty blocks first to ensure that the most recent dirty blocks are associated with a higher SCN. You can now specify the target mean time to recover (MTTR) for the instance recovery by assigning the initialization parameter FAST_START_MTTR_TARGET a value between 0 and 3,600 seconds. When you set this parameter to 0, Oracle9i will not use this parameter to manage instance recovery time. The optimal value for FAST_START_MTTR_TARGET depends upon the database utilization, the system global area (SGA) size, and the Service Level Agreement (SLA) for that site. Your challenge is to optimize the FAST_START_MTTR_TARGET parameter such that it is in accordance with the SLA without an undue degradation in operational performance. You can verify checkpoint statistics with the Statspack and by querying the V$INSTANCE_RECOVERY view.
Oracle9i added the following three columns to V$INSTANCE_RECOVERY:
- TARGET_MTTR The user's setting for the FAST_START_MTTR_TARGET parameter
- ESTIMATED_MTTR The MTTR that is estimated based on the number of dirty blocks and the number of redo log blocks
- CKPT_BLOCK_WRITES The number of data blocks that were written by the checkpoint write operation
ALTER SYSTEM SET FAST_START_MTTR_TARGET = 240;
Oracle internally uses the FAST_START_MTTR_TARGET parameter value to calculate and set the FAST_START_IO_TARGET and LOG_CHECKPOINT_INTERVAL parameters. If you specify the FAST_START_IO_TARGET and LOG_CHECKPOINT_INTERVAL parameter values, they will override the values Oracle calculates from the FAST_START_MTTR_TARGET parameter.
Some parameters that you use to manage instance recovery are as follows:
- The LOG_CHECKPOINT_INTERVAL parameter value specifies the maximum number of redo blocks that you can have after the checkpoint. You can eliminate checkpoint intervals by setting LOG_CHECKPOINT_INTERVAL to 0 (zero).
- The LOG_CHECKPOINT_TIMEOUT parameter specifies the maximum number of seconds of redo logging action after the checkpoint.
- The FAST_START_IO_TARGET parameter specifies the number of data blocks that must be recovered with the two-pass minimal I/O recovery.
- The DB_BLOCK_MAX_DIRTY_TARGET parameter specifies the maximum number of dirty data blocks that you can have in the buffer cache. This parameter has been retired in Oracle9i.
Oracle FlashbackOracle Flashback enables you to query the database as of a certain time or a specific SCN in the past. All DML operations that were committed prior to that specific time or SCN are included, while data that was committed after the specific time or SCN is not.
Some applications where Oracle Flashback is useful are
- Decision support systems (OLAP)
- Document management systems and e-mail systems
- Database repair
Here is an example:
1. All the rows of the database are intact at 10:25; the database is at SCN#1.
2. The user accidentally deletes critical rows of information at 10:28; the database is at SCN#2.
3. The user recognizes this loss of information at 10:29.
4. At 10:30, the user flashes back to the database at SCN#1 before the information was deleted and retrieves the "missing" rows into a PL/SQL cursor.
5. At 10:32, the user returns to the current database by disabling the flashback and inserts the missing data from the PL/SQL cursor.
Requirements for FlashbackHere are some requirements in order to use Flashback:
1. You must use automatic undo management as opposed to the older technique of rollback segments:
UNDO_MANAGEMENT = AUTO
2. You must set the initialization parameter UNDO_RETENTION to specify how far back in seconds Oracle should retain undo information and therefore how far back you can perform a Flashback query. You can also update this parameter dynamically with the following statement:
ALTER SYSTEM SET UNDO_RETENTION=1800
3. You must have EXECUTE privilege on the DBMS_FLASHBACK package....