Optimizing dense linear algebra algorithms on heterogeneous machines


Bibliographic details
Main author: Jorge Barbosa (author)
Other authors: João Tavares (author), A. J. Padilha (author)
Format: book
Language: eng
Published: 2006
Subjects:
Full text: https://hdl.handle.net/10216/92545
Country: Portugal
OAI: oai:repositorio-aberto.up.pt:10216/92545
Description
Abstract: This paper addresses the execution of inherently sequential linear algebra algorithms, namely LU factorization, tridiagonal reduction, and the symmetric QR factorization algorithm used for eigenvector computation, which are significant building blocks for applications in our target image processing and analysis domain. These algorithms are harder to optimize for processing time because the computational load for data matrix columns increases with their index, requiring a fine-tuned load assignment and distribution. We present an efficient methodology to determine the optimal number of processors to be used in a computation, as well as a new static load distribution strategy that achieves better results than other algorithms developed for the same purpose.
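
The load imbalance described in the abstract can be illustrated with a small sketch. It is not the distribution strategy proposed in the paper, which this record does not describe. It assumes right-looking LU without pivoting, counts only multiply-add updates per column, and compares equal contiguous column blocks against a generic greedy heuristic (heaviest column first, assigned to the processor that would finish it earliest). The function names and the relative processor speeds are hypothetical, and the model ignores communication and per-step synchronization.

```python
def column_work(n):
    """Multiply-add count per column for right-looking LU without pivoting."""
    # Column j is updated at steps k = 0..j-1; step k touches rows k+1..n-1.
    return [sum(n - 1 - k for k in range(j)) for j in range(n)]


def makespan(work, owner, speeds):
    """Finishing time of the slowest processor for a column-to-owner map."""
    load = [0.0] * len(speeds)
    for w, p in zip(work, owner):
        load[p] += w / speeds[p]
    return max(load)


def contiguous(n, p):
    """Equal-sized contiguous column blocks, one block per processor."""
    block = -(-n // p)  # ceiling division
    return [j // block for j in range(n)]


def greedy_weighted(work, speeds):
    """Generic heuristic (an assumption, not the paper's strategy):
    heaviest column first, to the processor that would finish it earliest."""
    load = [0.0] * len(speeds)
    owner = [0] * len(work)
    for j in sorted(range(len(work)), key=lambda j: -work[j]):
        p = min(range(len(speeds)), key=lambda q: load[q] + work[j] / speeds[q])
        owner[j] = p
        load[p] += work[j] / speeds[p]
    return owner


n, speeds = 64, [1.0, 2.0, 4.0]  # hypothetical relative processor speeds
work = column_work(n)
ideal = sum(work) / sum(speeds)  # perfectly balanced finishing time

# Ratio of achieved to ideal finishing time; closer to 1.0 is better.
print("contiguous blocks:", makespan(work, contiguous(n, len(speeds)), speeds) / ideal)
print("greedy weighted:  ", makespan(work, greedy_weighted(work, speeds), speeds) / ideal)
```

Because per-column work grows with the column index, equal-sized contiguous blocks give the processor holding the last columns far more work than the one holding the first, which is why a static distribution has to account for both column index and processor speed.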