Parallel Programming: for Multicore and Cluster Systems
2 Parallel Computer Architecture

leading to a large number of cache misses and therefore a long execution time. This phenomenon is also called thrashing.

Fully Associative Caches

In a fully associative cache, each memory block can be placed in any cache position, thus overcoming the disadvantage of direct mapped caches. As for direct mapped caches, a memory address can again be partitioned into a block address (the s leftmost bits) and a word address (the w rightmost bits). Since each cache block can contain any memory block, the entire block address must be used as the tag and must be stored with the cache block to allow the identification of the memory block stored. Thus, each memory address is partitioned as follows:

    | tag = block address (s bits) | word address (w bits) |

To check whether a given memory block is stored in the cache, all the entries in the cache must be searched, since the memory block can be stored at any cache position. This is illustrated in Fig. b.

The advantage of fully associative caches lies in the increased flexibility when loading memory blocks into the cache. The main disadvantage is that, for each memory access, all cache positions must be considered to check whether the corresponding memory block is currently held in the cache. To make this search practical, it must be done in parallel using a separate comparator for each cache position, thus increasing the required hardware effort significantly.
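The lookup described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not a description of real hardware. The names `split_address`, `FullyAssociativeCache`, `lookup`, and `access` are hypothetical. The word-address width w = 2 (4-byte blocks) and the FIFO choice of which position to overwrite on a miss are also assumptions made for illustration. The sequential loop in `lookup` stands in for the parallel comparators, one per cache position.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

W_BITS = 2  # assumption: 4-byte blocks, so the word address has w = 2 bits


def split_address(addr: int, w_bits: int = W_BITS) -> Tuple[int, int]:
    """Split an address into (tag, word_address).

    In a fully associative cache the tag is the entire block address,
    i.e. all bits to the left of the w word-address bits.
    """
    tag = addr >> w_bits
    word_address = addr & ((1 << w_bits) - 1)
    return tag, word_address


@dataclass
class CacheEntry:
    valid: bool = False
    tag: int = 0


class FullyAssociativeCache:
    """Hypothetical model: any memory block may occupy any cache position."""

    def __init__(self, num_positions: int, w_bits: int = W_BITS) -> None:
        self.entries = [CacheEntry() for _ in range(num_positions)]
        self.w_bits = w_bits
        self.next_victim = 0  # assumption: FIFO replacement on a miss

    def lookup(self, addr: int) -> Optional[int]:
        """Return the position holding addr's block, or None on a miss.

        Every entry must be compared against the tag; hardware performs
        these comparisons in parallel, this model does them one by one.
        """
        tag, _ = split_address(addr, self.w_bits)
        for position, entry in enumerate(self.entries):
            if entry.valid and entry.tag == tag:
                return position
        return None

    def access(self, addr: int) -> bool:
        """Access addr; return True on a hit, False on a miss (block is loaded)."""
        if self.lookup(addr) is not None:
            return True
        tag, _ = split_address(addr, self.w_bits)
        self.entries[self.next_victim] = CacheEntry(valid=True, tag=tag)
        self.next_victim = (self.next_victim + 1) % len(self.entries)
        return False
```

With a cache of four positions, `access(0x0000)` followed by `access(0x10000)` gives two misses, and both blocks then stay resident side by side, because neither is tied to a fixed cache position. A repeated `access(0x0000)` is therefore a hit.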
Another disadvantage is that the tags to be stored for each cache block are significantly larger than for direct mapped caches. For the example cache introduced above, the tags must be 30 bits long for a fully associative cache, i.e., for each 32-bit memory block, a 30-bit tag must be stored. Because of the large search effort, a fully associative mapping is useful only for caches with a small number of positions.

Set Associative Caches

Set associative caches are a compromise between direct mapped and fully associative caches. In a set associative cache the cache is partitioned