MICRO-36 Proceedings of the 36th Annual International Symposium on Microarchitecture: Micro-36 2003

Thirty-five papers from the December 2003 symposium present emerging research in high-performance processor microarchitecture, with sessions on voltage scaling and transient faults, cache design, dynamic optimization systems, energy efficiency, secure processors, and scaling design. The contributors propose a low-power pipeline based on circuit-level timing speculation, a method to compute architectural vulnerability factors, and a two-phase dynamic translator for IA-32 applications running on Itanium-based systems. Other topics include processor acceleration through automated instruction set customization, fast path-based neural branch prediction, and hardware support for control transfers in code caches. There is no subject index. Annotation ©2004 Book News, Inc., Portland, OR

Table of Contents

Message from the General Chair
Message from the Program Chair
Keynote 1
Microarchitecture on the MOSFET Diet
Session 1: Voltage Scaling and Transient Faults
Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation
VSV: L2-Miss-Driven Variable Supply-Voltage Scaling for Low Power
A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor
Session 2: Cache Design
TLC: Transmission Line Caches
Distance Associativity for High-Performance Energy-Efficient Non-Uniform Cache Architectures
Near-Optimal Precharging in High-Performance Nanoscale CMOS Caches
Session 3: Power and Energy Efficient Architectures
Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction
Runtime Power Monitoring in High-End Processors: Methodology and Empirical Data
Power-Driven Design of Router Microarchitectures in On-Chip Networks
Optimum Power/Performance Pipeline Depth
Session 4: Application Specific Optimization and Analysis
Processor Acceleration through Automated Instruction Set Customization
The Reconfigurable Streaming Vector Processor (RSVP)
Scaling and Characterizing Database Workloads: Bridging the Gap between Research and Practice
Keynote 2
In Memory of Bob Rau
Session 5: Dynamic Optimization Systems
Generational Cache Management of Code Traces in Dynamic Optimization Systems
The Performance of Runtime Data Cache Prefetching in a Dynamic Optimization System
IA-32 Execution Layer: A Two-Phase Dynamic Translator Designed to Support IA-32 Applications on Itanium-Based Systems
Session 6: Dynamic Program Analysis and Optimization
LLVA: A Low-Level Virtual Instruction Set Architecture
Comparing Program Phase Detection Techniques
Using Interaction Costs for Microarchitectural Bottleneck Analysis
Session 7: Branch, Value and Scheduling Optimizations
Fast Path-Based Neural Branch Prediction
Hardware Support for Control Transfers in Code Caches
Exploiting Value Locality in Physical Register Files
Macro-op Scheduling: Relaxing Scheduling Loop Constraints
Session 8: Dataflow, Data Parallel, and Clustered Architectures
Universal Mechanisms for Data-Parallel Architectures
Flexible Compiler-Managed L0 Buffers for Clustered VLIW Processors
Instruction Replication for Clustered Microarchitectures
Session 9: Secure and Network Processors
Efficient Memory Integrity Verification and Encryption for Secure Processors
Fast Secure Processor for Inhibiting Software Piracy and Tampering
IPStash: A Power-Efficient Memory Architecture for IP-Lookup
Design and Implementation of High-Performance Memory Systems for Future Packet Buffers
Session 10: Scaling Design
Beating In-Order Stalls with "Flea-Flicker" Two-Pass Pipelining
Scalable Hardware Memory Disambiguation for High ILP Processors
Reducing Design Complexity of the Load/Store Queue
Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors
Author Index

