High Performance Memory Systems

Gershon Kedem
Computer Science Department
Duke University

Abstract

As general-purpose processors get faster, the speed gap between the processor and the main memory system grows. While processor speed generally increases by a factor of 2 every 18 months, DRAM speed improves by only about 5% a year. Moreover, since DRAM is external to the microprocessor, considerable delay occurs between the time the processor requests a cache line and the time it receives the first word. As a result, the relative cost of accessing DRAM, measured as the number of instructions the processor could execute in the time it takes a DRAM access to complete, is more than 250 instructions per access (IPA). For example, the cost for an UltraSparc-2 machine is about 256 IPA. We expect this cost to rise substantially in the near future.
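The IPA measure defined above can be sketched as a small calculation. This is only an illustration: the function name and the sample numbers (a 1000 MIPS processor and a 256 ns access latency) are hypothetical and are not measurements of any particular machine.

```python
def instructions_per_access(mips: float, latency_ns: float) -> float:
    """Instructions the processor could execute during one DRAM access.

    mips       -- peak instruction rate, in millions of instructions per second
    latency_ns -- time to complete one DRAM access, in nanoseconds

    One MIPS is 1e6 instructions/s and one ns is 1e-9 s, so the product
    of the two is scaled by 1e-3.
    """
    return mips * latency_ns / 1000.0


# Hypothetical example values, chosen only to illustrate the units.
print(instructions_per_access(mips=1000, latency_ns=256))
```

Under this definition the cost grows in direct proportion to processor speed whenever DRAM latency stays roughly constant.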

Albert Yu of Intel predicts that by the year 2006 processors will be able to execute 20,000 MIPS. The relative cost of accessing DRAM for a 20,000 MIPS processor will be more than 2000 IPA. A general technique that helps alleviate this problem is cache memory. However, as processors get faster, they execute larger programs with larger data sets, and for such programs caches can be ineffective. That is, these programs could suffer a large number of cache misses, even on a processor with a large cache. When the relative DRAM-access cost is 1500 IPA, a 1% cache miss rate could reduce performance to 15% of peak performance.
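The effect of the miss rate on performance can be approximated with a simple stall model. The abstract does not say how the 15% figure was derived, so the sketch below is an assumption: the processor runs at peak rate except when it stalls for a full DRAM access on each miss, and the miss rate is taken per memory reference. The function name and the value of 0.38 memory references per instruction are hypothetical.

```python
def fraction_of_peak(ipa: float, miss_rate: float, refs_per_instr: float) -> float:
    """Achieved performance as a fraction of peak under a full-stall model.

    ipa            -- DRAM access cost, in instructions per access
    miss_rate      -- cache misses per memory reference (0.01 means 1%)
    refs_per_instr -- memory references per instruction (assumed value)

    Each instruction costs one time slot plus, on average,
    refs_per_instr * miss_rate * ipa slots of stall time.
    """
    stall_per_instr = refs_per_instr * miss_rate * ipa
    return 1.0 / (1.0 + stall_per_instr)


# Assumed: 0.38 memory references per instruction (not from the source).
print(fraction_of_peak(ipa=1500, miss_rate=0.01, refs_per_instr=0.38))  # roughly 0.15

# If the 1% miss rate were instead counted per instruction, the same
# model would give a lower fraction of peak.
print(fraction_of_peak(ipa=1500, miss_rate=0.01, refs_per_instr=1.0))
```

The model ignores overlap between misses and computation, so it should be read as a rough estimate rather than a prediction for a real machine.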

The High Performance Memory Systems project is an effort to develop new architectures and hardware mechanisms that will help narrow the growing CPU-memory speed gap.

A list of on-line projects

Design and Evaluation of a Distributed Cache Architecture with Prediction

WCDRAM: A fully associative integrated Cached-DRAM with wide cache lines

DRAM-page Based Prediction and Prefetching

The Case for Building High Performance Intermediate-Level Memory Systems