I recently moved from Anaconda to NVIDIA within the RAPIDS team, which is building a PyData-friendly GPU-enabled data science stack. For my first week I explored some of the current challenges of working with GPUs in the PyData ecosystem. This post shares my first impressions and also outlines plans for near-term work.
First, let's start with the value proposition of GPUs: significant speed increases over traditional CPUs.
GPU Performance

Like many PyData developers, I'm loosely aware that GPUs are sometimes fast, but I don't deal with them often enough to have strong feelings about them.
To get a more visceral feel for the performance differences, I logged into a GPU machine, opened up CuPy (a NumPy-like GPU library developed mostly by the Chainer team in Japan) and cuDF (a Pandas-like library in development at NVIDIA), and did a couple of small speed comparisons:
Compare NumPy and CuPy:
>>> import numpy, cupy
>>> x = numpy.random.random((10000, 10000))
>>> y = cupy.random.random((10000, 10000))
>>> %timeit (numpy.sin(x) ** 2 + numpy.cos(x) ** 2 == 1).all()
402 ms ± 8.59 ms per loop
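Because CuPy mirrors NumPy's API, the same timing can be run against either backend. Here is a minimal sketch of that comparison, assuming CuPy may not be installed (it falls back to NumPy on CPU-only machines, so the numbers it prints are illustrative, not the GPU timings above):

```python
import time

# Hedged sketch: pick CuPy if a GPU stack is available, else NumPy.
# CuPy's API deliberately mirrors NumPy's, so the code below is identical
# for both backends.
try:
    import cupy as xp
except ImportError:
    import numpy as xp

x = xp.random.random((1000, 1000))

start = time.perf_counter()
# The same trigonometric identity check as above; note that exact
# equality with 1 may fail element-wise due to floating-point rounding.
result = bool((xp.sin(x) ** 2 + xp.cos(x) ** 2 == 1).all())
elapsed = time.perf_counter() - start

print(f"identity check took {elapsed:.4f}s, result={result}")
```

One caveat when timing CuPy for real: GPU kernel launches are asynchronous, so accurate benchmarks should synchronize the device before reading the clock.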