Title: Memory Organization and Management for Linear Algebra Computations (HwA4)
Authors: J.J. Navarro, T. Lang, T. Juan, J. Gallardo, E. Garcia, O. Temam, C. Fricker, Y. Jegou, W. Jalby, and D. Snelling
Email: juanjo@ac.upc.es
Date: August 1994
Abstract:
It has become apparent in recent years that the performance of linear algebra computations, which access large amounts of data, depends on the behavior of the memory hierarchy. In this research we contribute to the effort of improving the exploitation of the potential locality of these algorithms with proposals at both the software and the hardware level. Two software approaches are proposed and evaluated. In Part I it is shown that, in a system with a multilevel memory hierarchy, algorithms with one level of blocking do not achieve the best performance. As a consequence, a family of multilevel blocking algorithms that we call `Multilevel Orthogonal Block' (MOB) is proposed and shown to be optimal and easy to design. In Part II a method is proposed for estimating interference misses in a regular do-loop nest, and that knowledge is used to derive the optimal block size. Two hardware solutions are also proposed and evaluated. In Part III the `Virtual Line Scheme' is presented. This scheme allows the use of large virtual lines when fetching data from memory, for better exploitation of spatial locality, while the actual physical cache line is smaller than the cache lines currently in use, for better exploitation of temporal locality. Finally, Part IV describes a hardware extension of the cache, named `Cache Bypass Buffers' (CBBs), that allows the program to manage the placement of data, avoiding most of the interference misses.
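Illustration:
As a rough illustration of what blocking at more than one level can look like, the following is a minimal sketch of a generic two-level blocked (tiled) matrix multiplication. It is not the MOB algorithm from the report, whose details are not given in this abstract. The function name `blocked_matmul` and the block sizes `outer` and `inner` are hypothetical, chosen only for the example; in practice block sizes would be matched to the capacities of the levels of the memory hierarchy.

```python
def blocked_matmul(a, b, n, outer=4, inner=2):
    """Multiply two n x n matrices (lists of lists) using two levels of blocking.

    `outer` and `inner` are illustrative block sizes (assumptions), standing
    in for blocks sized to a larger and a smaller level of the hierarchy.
    """
    c = [[0.0] * n for _ in range(n)]
    # Outer blocks: meant to fit the larger, slower level.
    for i0 in range(0, n, outer):
        i_end = min(i0 + outer, n)
        for j0 in range(0, n, outer):
            j_end = min(j0 + outer, n)
            for k0 in range(0, n, outer):
                k_end = min(k0 + outer, n)
                # Inner blocks: meant to fit the smaller, faster level.
                for i1 in range(i0, i_end, inner):
                    for j1 in range(j0, j_end, inner):
                        for k1 in range(k0, k_end, inner):
                            # Innermost kernel over one inner block.
                            for i in range(i1, min(i1 + inner, i_end)):
                                for j in range(j1, min(j1 + inner, j_end)):
                                    s = c[i][j]
                                    for k in range(k1, min(k1 + inner, k_end)):
                                        s += a[i][k] * b[k][j]
                                    c[i][j] = s
    return c
```

Each (i, j, k) index triple is visited exactly once, so the result matches the unblocked triple loop; only the order of memory accesses changes, which is what blocking relies on to improve locality.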
This report is available through
Last modified on May 13, 1996 by J.H.M.Dassen.
(C) 1995 by Leiden University