Caching In: P4 Extreme Edition
The Basics

At the top of said hierarchy is Level 1 cache. As a side note, this is probably the only thing in the computer architecture world whose numbering starts with one instead of zero. In both AMD and Intel architectures, L1 cache is split into two separate parts: the instruction cache (i-cache) and the data cache (d-cache). The data cache, logically enough, holds the floating point and integer values the processor will soon need. The i-cache is an incredibly small area on the chip that does the same for processor instructions.

How, then, does the cache know what the processor will need next? This is where hardware logic and compiler optimizations come into play. If you’re important enough to be in the VIP room at Club L1 Caché, you probably got there in one of two ways. The first possibility is that you were just in there and the bouncer remembered you. The second (and often more likely) possibility is that your next-door neighbor just went in and you snuck in behind him. This is called the Principle of Locality, and it is the fundamental principle of cache design. If you were just needed by the processor, you’ll probably be needed again (temporal locality); and if you sit next to something that was just used, you’ll more often than not be needed too (spatial locality). This principle is what makes cache so effective at holding whatever the processor needs next.
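To see the bouncer's two rules in action, here is a minimal sketch of a toy cache simulator. Everything in it is an assumption made for illustration: the 64-byte line size, the 8-line capacity, the least-recently-used eviction rule, and the names `ToyCache` and `run` are hypothetical and do not describe any particular AMD or Intel design. It only counts hits and misses for three access patterns.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed for illustration)


class ToyCache:
    """Fully associative toy cache with least-recently-used (LRU) eviction."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # line number -> None, oldest entry first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        line = address // LINE_SIZE  # neighbors within 64 bytes share a line
        if line in self.lines:
            self.hits += 1
            self.lines.move_to_end(line)  # the bouncer remembers you
        else:
            self.misses += 1
            self.lines[line] = None  # fetch the whole line, neighbors included
            if len(self.lines) > self.num_lines:
                self.lines.popitem(last=False)  # evict the least recently used

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def run(addresses, num_lines=8):
    """Replay a list of byte addresses through a fresh cache; return the hit rate."""
    cache = ToyCache(num_lines)
    for address in addresses:
        cache.access(address)
    return cache.hit_rate()


# Spatial locality: walk 1024 consecutive 4-byte values.
sequential = run(range(0, 4096, 4))

# Temporal locality: touch the same four far-apart addresses 256 times each.
repeated = run([0, 1000, 2000, 3000] * 256)

# No locality: every access lands on a line that was never seen before.
scattered = run(range(0, 1024 * LINE_SIZE, LINE_SIZE))

print(f"sequential: {sequential:.4f}")  # 0.9375
print(f"repeated:   {repeated:.4f}")    # 0.9961
print(f"scattered:  {scattered:.4f}")   # 0.0000
```

In the sequential walk, only the first access to each 64-byte line misses and the next fifteen ride in behind it, which is the neighbor sneaking in. In the repeated pattern, each address misses once and hits on every later visit. With no locality at all, this toy cache never hits.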
Unfortunately, Club L1 Caché wouldn’t be effective if it weren’t so
exclusive. Ever try to find a friend in a crowded bar? The larger a
cache is, the longer each lookup takes, so a large L1 would leave the
processor waiting on the data or instructions it needed, negating the
purpose of cache in the first place. Level 1 cache needs to be
relatively small so that whenever the processor wants something from
it, it won’t have to spend time searching.

This is where Level 2 cache comes in. L2 cache is considerably larger
than L1 and provides a unified holding area for both instructions and
data, with direct access to both memory and L1 cache. This way, if the
processor needs something that isn’t in L1 cache, it doesn’t have to
wait for it to come all the way from memory. In fact, the data is
probably already in L2 cache because of the Principle of Locality.

So, is Level 3 cache any different from L2? Not really. It sits between
L2 and memory as an additional buffer for both instructions and data.
For processor-intensive applications like graphics processing, web
serving, and data analysis, the additional L3 cache provides a key
performance boost: its large, fast storage eliminates many trips to
memory over the system bus. This is why you generally find L3 cache
only on high-end processors such as the Xeon and Itanium.
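The way each extra level cuts down on trips to memory can be sketched with a small simulation. This is only an illustration under stated assumptions: the level sizes (8, 64, and 512 lines), the 64-byte line size, the strict LRU eviction, and the names `Level`, `simulate`, `working_set`, and `hierarchy` are all hypothetical and are not taken from any real processor.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed for illustration)


class Level:
    """One cache level: fully associative, LRU eviction."""

    def __init__(self, name, num_lines):
        self.name = name
        self.num_lines = num_lines
        self.lines = OrderedDict()  # line number -> None, oldest entry first
        self.hits = 0

    def lookup(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)  # mark as most recently used
            self.hits += 1
            return True
        return False

    def fill(self, line):
        self.lines[line] = None
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used


def simulate(addresses, levels):
    """Return how many accesses had to go all the way to memory."""
    memory_trips = 0
    for address in addresses:
        line = address // LINE_SIZE
        found_at = len(levels)  # assume a miss everywhere until a level hits
        for depth, level in enumerate(levels):
            if level.lookup(line):
                found_at = depth
                break
        if found_at == len(levels):
            memory_trips += 1
        for level in levels[:found_at]:
            level.fill(line)  # copy the line into every level that missed
    return memory_trips


def working_set(num_lines, passes):
    """Addresses that sweep num_lines distinct cache lines, repeated `passes` times."""
    return [n * LINE_SIZE for n in range(num_lines)] * passes


def hierarchy(*sizes):
    """Build fresh levels named L1, L2, L3 with the given line counts."""
    return [Level(name, size) for name, size in zip(["L1", "L2", "L3"], sizes)]


small_loop = working_set(32, 10)   # fits in the hypothetical L2, not in L1
big_loop = working_set(128, 10)    # fits only in the hypothetical L3

small_l1 = simulate(small_loop, hierarchy(8))            # 320 trips
small_l1_l2 = simulate(small_loop, hierarchy(8, 64))     # 32 trips
big_l1_l2 = simulate(big_loop, hierarchy(8, 64))         # 1280 trips
big_l1_l2_l3 = simulate(big_loop, hierarchy(8, 64, 512)) # 128 trips

print(small_l1, small_l1_l2, big_l1_l2, big_l1_l2_l3)
```

Once a loop's data fits in some level, only the first pass reaches memory and every later pass is served from cache. The all-or-nothing numbers are an artifact of strict LRU meeting a loop slightly too big for the cache; real hardware behaves less starkly, but the trend is the same.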