Newsgroups: comp.parallel
From: jeckstei@rutcor.rutgers.edu (Jonathan Eckstein)
Subject: Re: HPF: Sparse Matrices in Compressed Row Storage format
Organization: Rutgers University
Date: 6 Jan 1997 13:11:45 GMT
Message-ID: <5aqtmh$p0v@server1.ctc.com>

When I was at Thinking Machines we used to worry about this sort of
thing a fair amount.

I think you can improve things a bit by using a segmented scan
operation: the segments are rows.  You do a gather operation, a
multiply, and then a scan to collect the proper value in the first
element of each row.  Then you can do a less expensive gather or
scatter to compress the results.  See Guy Blelloch's book or the CMSSL
library documentation.
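As a rough illustration of the gather / multiply / segmented-sum idea for a
matrix in compressed row storage, here is a sequential NumPy sketch (not HPF,
and not the post's exact code; the function name and use of `np.add.reduceat`
as the segmented sum are my own choices):

```python
import numpy as np

def spmv_segmented(values, colidx, rowptr, x):
    """y = A @ x for A in compressed row storage, written in the
    gather / multiply / segmented-sum style described above.
    Sketch only; a data-parallel version would do each step
    as a collective operation."""
    # Gather: pull in the x entry needed by each stored nonzero.
    gathered = x[colidx]
    # Multiply: elementwise product with the stored values.
    products = values * gathered
    # Segmented sum: rowptr marks the segment (row) boundaries.
    # np.add.reduceat sums each run [start, next start); empty rows
    # are skipped and left at zero.
    nrows = len(rowptr) - 1
    y = np.zeros(nrows)
    nonempty = rowptr[:-1] < rowptr[1:]
    y[nonempty] = np.add.reduceat(products, rowptr[:-1][nonempty])
    return y
```

The gather is where the communication cost hides on a distributed machine:
`x[colidx]` can touch arbitrary off-processor elements.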

However, this still stinks.

I think the best bet is to use some kind of graph partitioning
algorithm to partition the rows and columns so as to minimize
off-processor data references.  I think the U. Minnesota CS department
(Vipin Kumar et al.) has some very nice graph partitioning
heuristics.  However, you may have to struggle with the HPF compiler to
get it to generate efficient code for the multiplication.
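The quantity a partitioner would try to minimize can be counted directly: a
nonzero A(i,j) causes an off-processor reference when row i and vector entry j
(assumed aligned with row j) live on different processors. A small sketch, with
a hypothetical `row_owner` map rather than any particular partitioning library:

```python
def off_processor_refs(colidx, rowptr, row_owner):
    """Count nonzeros whose column index is owned by a different
    processor than the row using it, for a given row partition.
    row_owner[i] = processor owning row i and vector entry x[i].
    (Illustrative helper, not from any partitioning package.)"""
    count = 0
    nrows = len(rowptr) - 1
    for i in range(nrows):
        for k in range(rowptr[i], rowptr[i + 1]):
            if row_owner[colidx[k]] != row_owner[i]:
                count += 1
    return count
```

A good partition drives this count down, which is exactly what makes the
gather step in the multiply cheap.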

Jonathan Eckstein

--
Articles to bigrigg+parallel@pitt.edu (Administrative: bigrigg@cs.pitt.edu)
Archive: http://www.hensa.ac.uk/parallel/internet/usenet/comp.parallel
