Title: Multilevel Blocking and Prefetching for Linear Algebra Computations (HwA5b)
Authors: E. Garcia, J.L. Larriba-Pey, J.R. Herrero, T. Juan, J.J. Navarro, and T. Lang
Email: juanjo@ac.upc.es
Date: July 1995
Abstract:
Previous work showed that the performance of linear algebra computations, which access large amounts of data, depends on the behavior of the memory hierarchy. This research aims to use the multilevel orthogonal blocking approach in conjunction with other software techniques to further improve the performance of linear algebra computations. The work is divided into two parts. In Part I, blocking techniques are applied to improve sparse matrix computations that appear in many linear algebra kernels of scientific applications. Combining several software techniques (loop unrolling, software pipelining) with blocking in the sparse matrix by dense matrix multiplication introduces a very large search space. In Part II, the performance of the dense matrix by matrix multiplication executed on a superscalar high-performance workstation is improved by using binding and nonbinding prefetching to hide the memory latency, together with the well-known technique of blocking.
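As a rough illustration of the blocking idea mentioned in the abstract, the following is a minimal sketch of a single-level blocked (tiled) dense matrix multiplication. It is not taken from the report: the function names `matmul_naive` and `matmul_blocked`, the `block` parameter, and the loop order are assumptions chosen for illustration. Python shows only the loop structure; it cannot express prefetching and does not reflect the performance behavior the report studies.

```python
def matmul_naive(a, b):
    """Reference product of two dense matrices stored as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            aik = a[i][k]
            for j in range(p):
                c[i][j] += aik * b[k][j]
    return c


def matmul_blocked(a, b, block=2):
    """Same product, computed tile by tile.

    The three outer loops walk over tiles of at most block x block
    elements; the three inner loops do the work inside one tile. On a
    real machine the tile size would be chosen so that the tiles in use
    fit in one level of the memory hierarchy (hypothetical default: 2).
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                # min() handles edge tiles when a dimension is not a
                # multiple of the block size.
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, m)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + block, p)):
                            c[i][j] += aik * b[k][j]
    return c
```

Both functions compute the same product; only the order in which elements are visited changes. A multilevel scheme would, under the same assumptions, nest further tile loops with one block size per level of the memory hierarchy.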
This report is available through
Last modified on May 13, 1996 by J.H.M.Dassen.
(C) 1995 by Leiden University