Analysis of NUCA Policies for CMPs Using Parsec Benchmark Suite
Javier Lira, Carlos Molina, Antonio González
Intel Barcelona Research Center, Intel Labs - UPC, Barcelona, Spain
Dept. Enginyeria Informàtica, Universitat Rovira i Virgili, Tarragona, Spain
Dept. Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain
MMCS 2009, Washington DC (USA) - Ma

Slide 2 Outline
Introduction
Methodology
Bank Policy Approaches
Bank Placement Policy
Bank Access Policy
Bank Migration Policy
Bank Replacement Policy
Conclusions

Slide 3 Introduction
CMPs have emerged as a dominant paradigm in system design.
Keep performance improvement while reducing power consumption.
Take advantage of thread-level parallelism.
CMPs incorporate larger, shared last-level caches.

Slide 4 NUCA
Non-Uniform Cache Architecture (NUCA) was first proposed in ASPLOS 2002 by Kim et al.
NUCA divides a large cache into smaller and faster banks.
Banks close to the cache controller have smaller latencies than farther banks.
An Adaptive, non-uniform cache structure for wire-delay dominated on-chip caches. ASPLOS 02

Slide 5 NUCA Policies
Bank Placement Policy
Bank Access Policy
Bank Replacement Policy
Bank Migration Policy

Slide 6 Outline

Slide 7 Methodology
Simulation tools: Simics + GEMS, CACTI v6.0, PARSEC Benchmark Suite
Number of cores: 8
Core processor: Out-of-order SPARCv9
Main Memory Size: 4 GBytes
Memory Bandwidth: 512 Bytes/cycle
On-chip wire delay: 1 cycle
Off-chip wire delay: 20 cycles
Switch delay: 1 cycle
Private L1 data caches: 8 KBytes
Private L1 instr. caches: 8 KBytes
Shared L2 NUCA cache: 1 MBytes, 256 Banks

Slide 8 Baseline NUCA cache architecture
8 cores
256 banks
B. Managing wire delay in large chip-multiprocessor caches. MICRO 04

Slide 9 Outline

Slide 10 Bank Placement Policy
1B + Static
16B + Static
16B + Local

Slide 11 Bank Placement Policy
1B + Static placement provides a fair distribution.
16B configurations concentrate data in a few banks.

Slide 12 Outline

Slide 13 Bank Access Policy
Partially Serial 9P + 7P Parallel

Slide 14 Bank Access Policy
Power efficiency vs.
9P + 7P is a trade-off, but it is still far from the performance potential.
These results suggest broad room for improvement in this policy.

Slide 15 Outline

Slide 16 Bank Migration Policy
Static
Gradual + Swapping
Gradual + Replication

Slide 17 Bank Migration Policy
Replication reduces the effective size of the cache.
Migration approaches concentrate data blocks in a few banks.
The static approach distributes data blocks fairly across the whole cache.
Placement and migration policies are strictly correlated.
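
The slides state that banks close to the cache controller have smaller latencies than farther banks, and they name "Gradual + Swapping" as one migration approach. The following is a minimal Python sketch of both ideas, not the authors' simulator. The wire and switch delays come from the methodology slide; the 16 x 16 mesh layout, the controller position, the bank access time, the one-block-per-bank simplification, and the names `bank_latency` and `promote_on_hit` are all assumptions made for illustration.

```python
# Values taken from the methodology slide.
ON_CHIP_WIRE_DELAY = 1  # cycles per hop
SWITCH_DELAY = 1        # cycles per hop

# Assumptions (not from the slides): mesh shape and bank access time.
GRID_ROWS, GRID_COLS = 16, 16  # 256 banks arranged as a square mesh
BANK_ACCESS_CYCLES = 3         # hypothetical per-bank access time


def bank_latency(row, col, controller=(0, 0)):
    """Round-trip latency in cycles from the controller to bank (row, col).

    Assumes Manhattan routing: each hop costs one wire delay plus one
    switch delay, and the request and the reply travel the same path.
    """
    if not (0 <= row < GRID_ROWS and 0 <= col < GRID_COLS):
        raise ValueError("bank coordinates outside the assumed mesh")
    hops = abs(row - controller[0]) + abs(col - controller[1])
    per_hop = ON_CHIP_WIRE_DELAY + SWITCH_DELAY
    return BANK_ACCESS_CYCLES + 2 * hops * per_hop


def promote_on_hit(bankset, block):
    """Gradual migration with swapping, simplified to one block per bank.

    `bankset` is ordered from the nearest bank to the farthest. On a hit,
    the block moves one bank closer and swaps with the block stored there.
    """
    i = bankset.index(block)
    if i > 0:
        bankset[i - 1], bankset[i] = bankset[i], bankset[i - 1]
    return bankset


if __name__ == "__main__":
    print(bank_latency(0, 0))      # nearest bank under these assumptions
    print(bank_latency(15, 15))    # farthest bank under these assumptions
    print(promote_on_hit(["A", "B", "C"], "C"))
```

Under these assumptions, repeated hits move a block one bank closer each time, which is consistent with the slides' observation that migration approaches concentrate data blocks in a few banks.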