Running Spatter

Jeffrey Young edited this page Oct 14, 2022 · 10 revisions

Basic Spatter Execution

More detailed specification

While we eventually plan to script these selections, it is useful to understand how Spatter allocates buffer space so that you know the maximum data size and sparsity you can use on a given system.

Let's say we want to do a gather operation that transfers 500 MB with a sparsity of 1/2, i.e., reading every second element:

  1. Each element is a double or 8 bytes.
  2. To perform a gather, we need a wide source buffer (2x the size of the gathered data, since only every second element is read), an index buffer that is 1x the size of the data to be moved, and a target buffer that is 1x the size of the data to be moved. In total we need 4x the number of bytes moved, or 2 GB of RAM to perform a 500 MB gather.

Measuring 500 MB in powers of 10 (MB as opposed to MiB), this works out to (500*10^6 B) / (8 B / element) = 62,500,000 elements transferred.
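
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the sizing rule described on this page, not part of Spatter. The function name `gather_footprint` is hypothetical, and scaling the source buffer with the sparsity value is an assumption generalized from the 2x case.

```python
BYTES_PER_ELEMENT = 8  # each element is a double


def gather_footprint(bytes_moved, sparsity=2):
    """Return (elements, total_buffer_bytes) for a gather.

    Assumption: the source buffer is `sparsity` times the data moved
    (2x for sparsity 1/2); index and target buffers are 1x each.
    """
    elements = bytes_moved // BYTES_PER_ELEMENT
    source = sparsity * bytes_moved
    index = bytes_moved
    target = bytes_moved
    return elements, source + index + target


# 500 MB, measured in powers of 10
elements, total = gather_footprint(500 * 10**6, sparsity=2)
print(f"-l {elements}")              # value to pass to -l
print(f"{total / 10**9:.1f} GB")     # estimated RAM needed
```

For the 500 MB case this gives 62,500,000 elements and an estimated 2.0 GB of buffers, matching the numbers above.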

So we'd run this test as follows:

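#Requires 2 GB of RAM to avoid paging to disk
#10 iterations are run by default - this can be changed with the -runs flag
#Note: the sample output below reports "Length not specified", i.e. it was captured with the default length of 16 elements, so its buffer sizes are much smaller than those of a 500 MB gather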
./spatter -s 2 -l 62500000 -runs 4
Warning: Length not specified. Default is 16 (elements)
Warning: No backend specified, guessing OpenMP
Warning: Kernel unspecified, guess GATHER
Warning: Kernel file unspecified, guessing kernels/kernels_vector.cl
backend kernel op time source_size target_size idx_size bytes_moved usable_bandwidth actual_bandwidth omp_threads vector_len block_dim
OPENMP GATHER COPY 0.000054 256 128 128 256 4.784600 7.176899 128 1 1 0
OPENMP GATHER COPY 0.000117 256 128 128 256 2.196275 3.294412 128 1 1 0
OPENMP GATHER COPY 0.000086 256 128 128 256 2.977714 4.466571 128 1 1 0
OPENMP GATHER COPY 0.000078 256 128 128 256 3.288206 4.932309 128 1 1 0

Larger test cases and using the wrapping parameter

Spatter can also use wrapping to perform larger scatter/gather operations. With wrapping, the benchmark uses different elements when going over the source buffer again, so the second pass over the indices may have worse alignment than the first. Wrapping is not turned on by default, but it can be selected for larger scatter/gather operations using the -w flag with a value less than or equal to the selected sparsity.

Let's test the case of transferring 50 GB of data. This gather would require ~200 GB of DRAM using the 4x buffer size formula from above.

(50*10^9 B) / (8 B / element) = 6,250,000,000 elements transferred.
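
The same rule can be turned around to estimate the largest gather that fits on a given machine. The sketch below is a rough estimate only. The function name `max_gather_elements` and the `headroom_pct` parameter are hypothetical, and the `sparsity + 2` multiplier is an assumption generalized from the 4x formula for sparsity 1/2.

```python
BYTES_PER_ELEMENT = 8  # each element is a double


def max_gather_elements(ram_bytes, sparsity=2, headroom_pct=90):
    """Estimate the largest -l value whose buffers fit in ram_bytes.

    Assumptions: buffers total (sparsity + 2) x the bytes moved, and only
    headroom_pct percent of RAM is treated as available to the benchmark.
    """
    multiplier = sparsity + 2                 # source + index + target
    budget = ram_bytes * headroom_pct // 100  # integer math avoids float rounding
    return budget // multiplier // BYTES_PER_ELEMENT


limit = max_gather_elements(315 * 10**9)      # e.g. a system with 315 GB of RAM
wanted = 50 * 10**9 // BYTES_PER_ELEMENT      # the 50 GB gather
print(limit, wanted <= limit)
```

Treat the result as an estimate: the peak memory usage reported by htop for the runs on this page differs from the 4x figure, so leaving some headroom is sensible.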

Now let's test this on a system with 315 GB of RAM:

#./spatter -s [sparsity] -l [num_doubles_moved] -w [wrap must be <= sparsity]
#To test with sparsity 1/2 on a 50 GB gather with a wrapping factor of 2
#Running this benchmark on a Power9 system peaks at about 190 GB of memory usage via htop
./spatter -s 2 -l 6250000000 -w 2 -runs 4
Warning: No backend specified, guessing OpenMP
Warning: Kernel unspecified, guess GATHER
Warning: Kernel file unspecified, guessing kernels/kernels_vector.cl
backend kernel op time source_size target_size idx_size bytes_moved usable_bandwidth actual_bandwidth omp_threads vector_len block_dim
OPENMP GATHER COPY 2.249260 50000000000 50000000000 50000000000 100000000000 44459.061280 66688.591919 128 1 1 0
OPENMP GATHER COPY 2.106428 50000000000 50000000000 50000000000 100000000000 47473.743241 71210.614862 128 1 1 0
OPENMP GATHER COPY 2.120846 50000000000 50000000000 50000000000 100000000000 47151.000651 70726.500977 128 1 1 0
OPENMP GATHER COPY 2.136164 50000000000 50000000000 50000000000 100000000000 46812.892604 70219.338906 128 1 1 0

#Run the same test without wrapping - this test peaks at 236 GB usage via htop
#Note that performance is better without wrapping, since wrapping further destroys any potential locality in the address stream
./spatter -s 2 -l 6250000000 -runs 4
Warning: No backend specified, guessing OpenMP
Warning: Kernel unspecified, guess GATHER
Warning: Kernel file unspecified, guessing kernels/kernels_vector.cl
backend kernel op time source_size target_size idx_size bytes_moved usable_bandwidth actual_bandwidth omp_threads vector_len block_dim
OPENMP GATHER COPY 1.659674 100000000000 50000000000 50000000000 100000000000 60252.783383 90379.175074 128 1 1 0
OPENMP GATHER COPY 1.666801 100000000000 50000000000 50000000000 100000000000 59995.179291 89992.768937 128 1 1 0
OPENMP GATHER COPY 1.622255 100000000000 50000000000 50000000000 100000000000 61642.605108 92463.907661 128 1 1 0
OPENMP GATHER COPY 1.609857 100000000000 50000000000 50000000000 100000000000 62117.313324 93175.969986 128 1 1 0
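
To compare the wrapped and unwrapped gather runs, the result rows can be parsed with a short script. The sketch below is a convenience, not part of Spatter, and the helper names are hypothetical. It assumes the first ten columns line up with the header names, and that `usable_bandwidth` equals `bytes_moved / time` expressed in MB/s. That relationship is inferred from the sample rows on this page and is not a documented definition.

```python
HEADER = ("backend kernel op time source_size target_size idx_size bytes_moved "
          "usable_bandwidth actual_bandwidth omp_threads vector_len block_dim").split()

WRAPPED = """\
OPENMP GATHER COPY 2.249260 50000000000 50000000000 50000000000 100000000000 44459.061280 66688.591919 128 1 1 0
OPENMP GATHER COPY 2.106428 50000000000 50000000000 50000000000 100000000000 47473.743241 71210.614862 128 1 1 0
OPENMP GATHER COPY 2.120846 50000000000 50000000000 50000000000 100000000000 47151.000651 70726.500977 128 1 1 0
OPENMP GATHER COPY 2.136164 50000000000 50000000000 50000000000 100000000000 46812.892604 70219.338906 128 1 1 0
"""

UNWRAPPED = """\
OPENMP GATHER COPY 1.659674 100000000000 50000000000 50000000000 100000000000 60252.783383 90379.175074 128 1 1 0
OPENMP GATHER COPY 1.666801 100000000000 50000000000 50000000000 100000000000 59995.179291 89992.768937 128 1 1 0
OPENMP GATHER COPY 1.622255 100000000000 50000000000 50000000000 100000000000 61642.605108 92463.907661 128 1 1 0
OPENMP GATHER COPY 1.609857 100000000000 50000000000 50000000000 100000000000 62117.313324 93175.969986 128 1 1 0
"""


def parse_rows(text):
    """Turn result lines into dicts keyed by header name; skip warnings and the header."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < len(HEADER) or fields[0] == "backend":
            continue
        rows.append(dict(zip(HEADER, fields)))  # zip drops the extra trailing field
    return rows


def mean_usable(rows):
    """Average of the reported usable_bandwidth column."""
    return sum(float(r["usable_bandwidth"]) for r in rows) / len(rows)


def recomputed_usable(row):
    """bytes_moved / time in MB/s (assumed meaning of usable_bandwidth)."""
    return float(row["bytes_moved"]) / float(row["time"]) / 1e6


wrapped, unwrapped = parse_rows(WRAPPED), parse_rows(UNWRAPPED)
print(f"wrapped:   {mean_usable(wrapped):.0f}")
print(f"unwrapped: {mean_usable(unwrapped):.0f}")
print(f"ratio:     {mean_usable(unwrapped) / mean_usable(wrapped):.2f}")
```

On the rows above, the unwrapped runs average roughly 1.31x the usable bandwidth of the wrapped runs, which agrees with the note that wrapping reduces locality.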

Running scatter with the same data size:

./spatter -s 2 --kernel-name SCATTER -l 6250000000 -runs 4 -w 2
Warning: No backend specified, guessing OpenMP
Warning: Kernel file unspecified, guessing kernels/kernels_vector.cl
backend kernel op time source_size target_size idx_size bytes_moved usable_bandwidth actual_bandwidth omp_threads vector_len block_dim
OPENMP SCATTER COPY 1.911930 50000000000 50000000000 50000000000 100000000000 52303.170588 78454.755881 128 1 1 0
OPENMP SCATTER COPY 1.919919 50000000000 50000000000 50000000000 100000000000 52085.522146 78128.283219 128 1 1 0
OPENMP SCATTER COPY 1.916936 50000000000 50000000000 50000000000 100000000000 52166.579185 78249.868777 128 1 1 0
OPENMP SCATTER COPY 1.925537 50000000000 50000000000 50000000000 100000000000 51933.570468 77900.355701 128 1 1 0