The following code creates and manipulates 2 TB of randomly generated data.
import dask.array as da
rs = da.random.RandomState()
x = rs.normal(10, 1, size=(500000, 500000), chunks=(10000, 10000))  # ~2 TB of float64, split into 10,000 x 10,000 chunks
(x + 1)[::2, ::2].sum().compute(scheduler='threads')
On a single CPU, this computation takes two hours.
On an eight-GPU single-node system this computation takes nineteen seconds.
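The GPU figure comes from running the same computation with the array chunks backed by CuPy rather than Numpy. The snippet below is a minimal sketch of that backend swap, assuming the cupy package, a CUDA-capable GPU, and a Dask version whose da.random.RandomState accepts a backend RandomState class; the multi-GPU scheduler setup used for the measured run is not shown here.

import cupy
import dask.array as da

# Each chunk is now generated and stored as a CuPy array on the GPU.
rs = da.random.RandomState(RandomState=cupy.random.RandomState)
x = rs.normal(10, 1, size=(500000, 500000), chunks=(10000, 10000))
(x + 1)[::2, ::2].sum().compute(scheduler='threads')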
Combine Dask Array with CuPy

Actually this computation isn't that impressive. It's a simple workload, for which most of the time is spent creating and destroying random data. The computation and communication patterns are simple, reflecting the simplicity common to data processing workloads.
What is impressive is that we were able to create a distributed parallel GPU array quickly by composing these three existing libraries:
- CuPy provides a partial implementation of Numpy on the GPU.
- Dask Array provides chunked algorithms on top of Numpy-like libraries like Numpy and CuPy. This enables us to operate on more data than we could fit in memory by operating on that data in chunks (see the short sketch after this list).
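As a small CPU-only illustration of the chunking idea (the array and chunk sizes here are illustrative, not from the benchmark above), Dask Array builds a lazy task graph over fixed-size blocks and only materializes a few blocks at a time when computing:

import dask.array as da

x = da.ones((40000, 40000), chunks=(10000, 10000))  # ~13 GB logical array: 16 blocks of ~800 MB each
print(x.chunks)     # ((10000, 10000, 10000, 10000), (10000, 10000, 10000, 10000))
y = (x + 1).sum()   # lazily builds a graph of per-block adds and a tree reduction
print(y.compute())  # executes block-wise; peak memory is roughly a few blocks, not the full 13 GB

Swapping the backing library for CuPy, as sketched earlier, runs this same chunked graph on GPU arrays instead of Numpy arrays.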