mirror of https://git.FreeBSD.org/ports.git
COSMA is a parallel, high-performance, GPU-accelerated matrix-matrix
multiplication algorithm that is communication-optimal for all
combinations of matrix dimensions, number of processors and memory
sizes, without the need for any parameter tuning. The key idea behind
COSMA is to first derive a tight optimal sequential schedule and only
then parallelize it, preserving I/O optimality between processes. This
stands in contrast with the 2D and 3D algorithms, which fix the process
domain decomposition upfront and then map it to the matrix dimensions,
and can therefore incur asymptotically more communication. The final
design of COSMA facilitates the overlap of computation and
communication, ensuring speedups and applicability of modern
mechanisms such as RDMA. COSMA may also leave some processors unused
in order to optimize the processor grid, which reduces the
communication volume even further and increases the computation volume
per processor.
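
The effect of leaving processors unused can be illustrated with a toy
cost model. The sketch below is not COSMA's actual schedule or API: it
assumes a plain 2D grid over the output matrix, where each process must
read a block of rows of A and a block of columns of B, and the names
per_process_volume and best_grid are hypothetical helpers introduced
only for this illustration.

```python
def per_process_volume(m, n, k, rows, cols):
    """Words of A and B one process reads in a toy 2D decomposition.

    Assumption: C (m x n) is split over a rows x cols grid, so each
    process needs an (m/rows) x k slab of A and a k x (n/cols) slab of B.
    """
    return k * (m / rows + n / cols)


def best_grid(m, n, k, procs):
    """Search all grids using at most `procs` processes.

    Returns (rows, cols, volume) for the grid with the smallest
    per-process input volume under the toy model above.
    """
    best = None
    for used in range(1, procs + 1):          # allow idle processes
        for rows in range(1, used + 1):
            if used % rows:                   # rows must divide used
                continue
            cols = used // rows
            volume = per_process_volume(m, n, k, rows, cols)
            if best is None or volume < best[2]:
                best = (rows, cols, volume)
    return best


if __name__ == "__main__":
    # 17 is prime, so using every process forces a 1 x 17 or 17 x 1 grid.
    print(best_grid(1024, 1024, 1024, 17))
    print(per_process_volume(1024, 1024, 1024, 1, 17))
```

Under this simplified model, square 1024 x 1024 matrices on 17 processes
are best served by a 4 x 4 grid that uses only 16 of them: each process
reads roughly half as much input as in the forced 1 x 17 layout, while
computing a slightly larger share of the product.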