Chapter 5: The I/O Request Packet
The operating system uses a data structure known as an I/O request packet, or IRP, to communicate with a kernel-mode device driver. In this chapter, I'll discuss this important data structure and the means by which it's created, sent, processed, and ultimately destroyed. I'll end with a discussion of the relatively complex subject of IRP cancellation. This chapter is rather abstract, I'm afraid, because I haven't yet talked about any of the concepts that surround specific types of IRPs. You might, therefore, want to skim this chapter and refer back to it while you're reading later chapters.
Two data structures are crucial to the handling of I/O requests: the I/O request packet itself and the IO_STACK_LOCATION structure. I'll describe both structures in this section.
Structure of an IRP
Figure 5-1 illustrates the IRP data structure, with opaque fields shaded in the usual convention of this book. A brief description of the important fields follows.
MdlAddress (PMDL) is the address of a memory descriptor list (MDL) describing the user-mode buffer associated with this request. The I/O Manager creates this MDL for IRP_MJ_READ and IRP_MJ_WRITE requests if the topmost device object's flags indicate DO_DIRECT_IO. It creates an MDL for the output buffer used with an IRP_MJ_DEVICE_CONTROL request if the control code indicates METHOD_IN_DIRECT or METHOD_OUT_DIRECT. The MDL itself describes the user-mode virtual buffer and also contains the physical addresses of locked pages containing that buffer. A driver has to do additional work, which can be quite minimal, to actually access the user-mode buffer.
Figure 5-1. I/O request packet data structure. (Image unavailable)
Flags (ULONG) contains flags that a device driver can read but not directly alter. None of these flags are relevant to a Windows Driver Model driver.
AssociatedIrp (union) is a union of three possible pointers. The alternative that a typical WDM driver might want to access is named AssociatedIrp.SystemBuffer. The SystemBuffer pointer holds the address of a data buffer in nonpaged kernel-mode memory. For IRP_MJ_READ and IRP_MJ_WRITE operations, the I/O Manager creates this data buffer if the topmost device object's flags specify DO_BUFFERED_IO. For IRP_MJ_DEVICE_CONTROL operations, the I/O Manager creates this buffer if the I/O control function code indicates that it should. (See Chapter 9, "Specialized Topics.") The I/O Manager copies data sent by user-mode code to the driver into this buffer as part of the process of creating the IRP. Such data includes the data involved in a WriteFile call or the so-called input data for a call to DeviceIoControl. For read requests, the device driver fills this buffer with data; the I/O Manager later copies the buffer back to the user-mode buffer. For control operations that specify METHOD_BUFFERED, the driver places the so-called output data in this buffer, and the I/O Manager copies it to the user-mode output buffer.
IoStatus (IO_STATUS_BLOCK) is a structure containing two fields that drivers set when they ultimately complete a request. IoStatus.Status will receive an NTSTATUS code, while IoStatus.Information is a ULONG_PTR that will receive an information value whose exact content depends on the type of IRP and the completion status. A common use of the Information field is to hold the total number of bytes transferred by an operation like IRP_MJ_READ that transfers data. Certain Plug and Play (PnP) requests use this field as a pointer to a structure that you can think of as the answer to a query.
RequestorMode will equal one of the enumeration constants UserMode or KernelMode, depending on where the original I/O request originated. Drivers sometimes inspect this value to know whether to trust some parameters.
PendingReturned (BOOLEAN) is TRUE if the lowest-level dispatch routine to process this IRP returned STATUS_PENDING. Completion routines reference this field to avoid a potential race condition between completion and dispatch routines.
Cancel (BOOLEAN) is TRUE if IoCancelIrp has been called to cancel this request and FALSE if it hasn't (yet) been called. IRP cancellation is a relatively complex topic that I'll discuss fully later on in this chapter (in "Cancelling I/O Requests").
CancelIrql (KIRQL) is the interrupt request level (IRQL) at which the special cancel spin lock was acquired. You reference this field in a cancel routine when you release the spin lock.
CancelRoutine (PDRIVER_CANCEL) is the address of an IRP cancellation routine in your driver. You use IoSetCancelRoutine to set this field instead of modifying it directly.
UserBuffer (PVOID) contains the user-mode virtual address of the output buffer for an IRP_MJ_DEVICE_CONTROL request for which the control code specifies METHOD_NEITHER. It also holds the user-mode virtual address of the buffer for read and write requests, but a driver should usually specify one of the device flags DO_BUFFERED_IO or DO_DIRECT_IO and should therefore not usually need to access the field for reads or writes. When handling a METHOD_NEITHER control operation, the driver can create its own MDL using this address.
Tail.Overlay is a structure within a union that contains several members potentially useful to a WDM driver. Refer to Figure 5-2 for a map of the Tail union. In the figure, items at the same level as you read left to right are alternatives within a union, while the vertical dimension portrays successive locations within a structure. Tail.Overlay.DeviceQueueEntry (KDEVICE_QUEUE_ENTRY) and Tail.Overlay.DriverContext (PVOID) are alternatives within an unnamed union within Tail.Overlay. The I/O Manager uses DeviceQueueEntry as a linking field within the standard queue of requests for a device. At moments when the IRP is not on some queue that uses this field and when you own the IRP, you can use the four pointers in DriverContext in any way you please. Tail.Overlay.ListEntry (LIST_ENTRY) is available for you to use as a linking field for IRPs on any private queue you choose to implement.
Figure 5-2. Map of the Tail union in an IRP. (Image unavailable)
CurrentLocation (CHAR) and Tail.Overlay.CurrentStackLocation (PIO_STACK_LOCATION) are not documented for use by drivers because support functions like IoGetCurrentIrpStackLocation can be used instead. During debugging, however, it might help you to realize that CurrentLocation is the index of the current I/O stack location and CurrentStackLocation is a pointer to it.
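To make the buffered-I/O description above more concrete, here is a minimal user-mode sketch of how SystemBuffer and IoStatus.Information cooperate during a read. It is only a model under stated assumptions: ModelIrp, ModelIoStatus, model_driver_read, and model_buffered_read are hypothetical names invented for illustration, not WDK definitions, and the real I/O Manager does considerably more work than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for illustration only; NOT the real WDK structures. */
typedef struct {
    long   Status;       /* models IoStatus.Status (0 means success in this model) */
    size_t Information;  /* models IoStatus.Information: bytes transferred */
} ModelIoStatus;

typedef struct {
    void         *SystemBuffer;  /* models AssociatedIrp.SystemBuffer */
    size_t        Length;        /* requested transfer length */
    ModelIoStatus IoStatus;
} ModelIrp;

/* Hypothetical driver read handler: fills the system buffer with device
   data and records how many bytes it actually produced. */
static void model_driver_read(ModelIrp *irp)
{
    static const char device_data[] = "hello";
    size_t n = sizeof(device_data) - 1;      /* 5 bytes available */

    if (n > irp->Length)
        n = irp->Length;                     /* never overrun the buffer */
    memcpy(irp->SystemBuffer, device_data, n);
    irp->IoStatus.Status = 0;
    irp->IoStatus.Information = n;
}

/* Models the I/O Manager's side of a buffered read: allocate a system
   buffer, let the driver fill it, then copy Information bytes back to the
   caller's buffer. Returns the byte count, or (size_t)-1 on failure. */
static size_t model_buffered_read(void *user_buffer, size_t length)
{
    ModelIrp irp;
    size_t copied = (size_t)-1;

    assert(user_buffer != NULL);
    irp.SystemBuffer = malloc(length ? length : 1);
    if (irp.SystemBuffer == NULL)
        return copied;
    irp.Length = length;
    irp.IoStatus.Status = -1;
    irp.IoStatus.Information = 0;

    model_driver_read(&irp);                 /* the driver sees only the system buffer */

    if (irp.IoStatus.Status == 0) {
        copied = irp.IoStatus.Information;
        memcpy(user_buffer, irp.SystemBuffer, copied);
    }
    free(irp.SystemBuffer);
    return copied;
}
```

The point of the sketch is the division of labor: the driver never touches the caller's buffer, and the count it leaves in the Information field determines how much data is copied back.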
The I/O Stack
Whenever any kernel-mode program creates an IRP, it also creates an associated array of IO_STACK_LOCATION structures: one stack location for each of the drivers that will process the IRP and often one more stack location for the use of the originator of the IRP. (See Figure 5-3.) A stack location contains type codes and parameter information for the IRP as well as the address of a completion routine. Refer to Figure 5-4 for an illustration of the stack structure.
Figure 5-3. Parallelism between driver and I/O stacks. (Image unavailable)
I'll discuss the mechanics of creating IRPs a bit further on in this chapter. It helps to know right now that the StackCount field of a DEVICE_OBJECT indicates how many locations to reserve for an IRP sent to that device's driver.
Figure 5-4. I/O stack location data structure. (Image unavailable)
MajorFunction (UCHAR) is the major function code associated with this IRP. This would be a value like IRP_MJ_READ that corresponds to one of the dispatch function pointers in the MajorFunction table of a driver object. Since this code is in the I/O stack location for a particular driver, it's conceivable that an IRP could start life as an IRP_MJ_READ (for example) and be transformed into something else as it progresses down the stack of drivers. I'll show you examples in Chapter 11, "The Universal Serial Bus," of how a USB driver changes the personality of a read or write request into an internal control operation in order to submit the request to the USB bus driver.
MinorFunction (UCHAR) is a minor function code that further identifies an IRP belonging to a few major function classes. IRP_MJ_PNP requests, for example, are divided into a dozen or so subtypes with minor function codes such as IRP_MN_START_DEVICE, IRP_MN_REMOVE_DEVICE, and so on.
Parameters (union) is a union of substructures, one for each type of request that has specific parameters. The substructures include, for example, Create (for IRP_MJ_CREATE requests), Read (for IRP_MJ_READ requests), and StartDevice (for the IRP_MN_START_DEVICE subtype of IRP_MJ_PNP).
DeviceObject (PDEVICE_OBJECT) is the address of the device object that corresponds to this stack entry. IoCallDriver fills in this field.
FileObject (PFILE_OBJECT) is the address of the kernel file object to which the IRP is directed. Drivers often use the FileObject pointer to correlate IRPs in a queue with a request (in the form of an IRP_MJ_CLEANUP) to cancel all queued IRPs in preparation for closing the file object.
CompletionRoutine (PIO_COMPLETION_ROUTINE) is the address of an I/O completion routine installed by the driver above the one to which this stack location corresponds. You never set this field directly—instead, you call IoSetCompletionRoutine, which knows to reference the stack location below the one that your driver owns. The lowest-level driver in the hierarchy of drivers for a given device never needs a completion routine because it must complete the request. The originator of a request, however, sometimes does need a completion routine but doesn't usually have its own stack location. That's why each level in the hierarchy uses the next lower stack location to hold its own completion routine pointer.
Context (PVOID) is an arbitrary context value that will be passed as an argument to the completion routine. You never set this field directly; it's set automatically from one of the arguments to IoSetCompletionRoutine.
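The rule that each level stores its completion routine in the next lower stack location can be sketched with a small user-mode model. Everything here is an assumption made for illustration: the Model* types and model_* functions are hypothetical and only imitate the behavior the text attributes to IoSetCompletionRoutine and to request completion. The indexing (0 for the topmost driver's location, -1 while the originator owns the IRP) is a convenience of the model, not the real layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_STACK 8

struct ModelIrp;
typedef void (*ModelCompletion)(struct ModelIrp *irp, void *context);

typedef struct {
    ModelCompletion CompletionRoutine;  /* installed by the level above */
    void           *Context;            /* passed back to that routine */
} ModelStackLocation;

typedef struct ModelIrp {
    int stack_count;
    int current;   /* -1: originator owns the IRP; 0: topmost driver */
    ModelStackLocation stack[MODEL_MAX_STACK];
} ModelIrp;

typedef struct { char order[MODEL_MAX_STACK + 1]; int count; } ModelLog;
typedef struct { ModelLog *log; char tag; } ModelContext;

static void model_init_irp(ModelIrp *irp, int stack_count)
{
    assert(stack_count > 0 && stack_count <= MODEL_MAX_STACK);
    memset(irp, 0, sizeof(*irp));
    irp->stack_count = stack_count;
    irp->current = -1;
}

/* Stores the routine in the location BELOW the caller's own location. */
static void model_set_completion(ModelIrp *irp, ModelCompletion routine, void *context)
{
    int next = irp->current + 1;
    assert(next < irp->stack_count);  /* the lowest driver has nothing below it */
    irp->stack[next].CompletionRoutine = routine;
    irp->stack[next].Context = context;
}

/* Hands the IRP to the next lower driver. */
static void model_call_next_driver(ModelIrp *irp)
{
    assert(irp->current + 1 < irp->stack_count);
    irp->current++;
}

/* Completes the IRP: walk back up the stack, calling each installed routine. */
static void model_complete(ModelIrp *irp)
{
    while (irp->current >= 0) {
        ModelStackLocation *loc = &irp->stack[irp->current];
        irp->current--;               /* ownership returns to the installer */
        if (loc->CompletionRoutine != NULL)
            loc->CompletionRoutine(irp, loc->Context);
    }
}

/* Example completion routine: appends its one-letter tag to a log. */
static void model_record(ModelIrp *irp, void *context)
{
    ModelContext *c = context;
    (void)irp;
    if (c->log->count < MODEL_MAX_STACK) {
        c->log->order[c->log->count++] = c->tag;
        c->log->order[c->log->count] = '\0';
    }
}

/* Originator (O) and two upper drivers (A, B) each install a routine; the
   third, lowest driver completes the request. */
static void model_demo(ModelLog *log)
{
    ModelIrp irp;
    ModelContext o = { log, 'O' }, a = { log, 'A' }, b = { log, 'B' };

    log->count = 0;
    log->order[0] = '\0';
    model_init_irp(&irp, 3);
    model_set_completion(&irp, model_record, &o);  /* stored in location 0 */
    model_call_next_driver(&irp);                  /* top driver owns the IRP */
    model_set_completion(&irp, model_record, &a);  /* stored in location 1 */
    model_call_next_driver(&irp);                  /* middle driver */
    model_set_completion(&irp, model_record, &b);  /* stored in location 2 */
    model_call_next_driver(&irp);                  /* lowest driver */
    model_complete(&irp);
}
```

In this model the routines run in the order B, A, O: completion unwinds the stack from the bottom, and the originator's routine runs last even though the originator has no stack location of its own.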
The "Standard Model" for IRP Processing
Particle physics has its "standard model" for the universe, and so does WDM. Figure 5-5 illustrates a typical flow of ownership for an IRP as it progresses through various stages in its life. Not every type of IRP would go through these steps, and some of the steps might be missing or altered depending on the type of device and the type of IRP. Notwithstanding the possible variability, however, the picture provides a useful starting point for discussion.
Figure 5-5. The "standard model" for IRP processing. (Image unavailable)
It's Even More Complicated than You Thought…
The first time you encounter the concepts that make up the standard model for IRP processing, they'll probably seem pretty complicated. Unfortunately, the standard model is also not quite sufficient to handle all the problems that can arise in a regime that includes hot pluggable devices, dynamic resource reconfiguration, and power management. In later chapters, I'll describe another way of queuing and cancelling IRPs that deals with these extra problems. The standard model will seem like a model of clarity when you're done reading about that!
Despite the problems that some devices present, many devices can still employ the standard model (which is, of course, why I'm bothering to explain it here). If your device cannot be removed or reconfigured while the system is running and can reject I/O requests while in a low-power state, you can use the standard model.
Creating an IRP
The IRP begins life when some entity calls an I/O Manager function to create it. In the figure, I used the term I/O Manager to describe this entity, as though there were a single system component responsible for creating IRPs. In reality, no such single actor in the population of operating system routines exists, and it would have been more accurate to just say that something creates the IRP. Your own driver will be creating IRPs from time to time, for example, and you will occupy the initial ownership box for those particular IRPs.
You can use any of four functions to create a new IRP:
- IoBuildAsynchronousFsdRequest builds an IRP on whose completion you don't plan to wait. This function and the next are appropriate for building only certain types of IRP.
- IoBuildSynchronousFsdRequest builds an IRP on whose completion you do plan to wait.
- IoBuildDeviceIoControlRequest builds a synchronous IRP_MJ_DEVICE_CONTROL or IRP_MJ_INTERNAL_DEVICE_CONTROL request.
- IoAllocateIrp builds an IRP that is not one of the types supported by the preceding three functions.
The Fsd in the first two of these function names stands for file system driver (FSD). Although FSDs are the primary users of the functions, any driver is allowed to call them. The DDK also documents a function named IoMakeAssociatedIrp for building an IRP that's subordinate to some other IRP. WDM drivers should not call this function. Indeed, completion of associated IRPs doesn't work correctly in Microsoft Windows 98 anyway.
Deciding which of these functions to call and determining what additional initialization you need to perform on an IRP is a rather complicated matter. I'll come back to this subject, therefore, at the end of this chapter...