Monday, December 17, 2018

First Impressions of GPUs and PyData

I recently moved from Anaconda to NVIDIA within the RAPIDS team, which is building a PyData-friendly GPU-enabled data science stack. For my first week I explored some of the current challenges of working with GPUs in the PyData ecosystem. This post shares my first impressions and also outlines plans for near-term work.

First, lets start with the value proposition of GPUs, significant speed increases over traditional CPUs.

GPU Performance

Like many PyData developers, I’m loosely aware that GPUs are sometimes fast, but don’t deal with them often enough to have strong feeling about them.

To get a more visceral feel for the performance differences, I logged into a GPU machine, opened up CuPy (a Numpy-like GPU library developed mostly by Chainer in Japan) and cuDF (a Pandas-like library in development at NVIDIA) and did a couple of small speed comparisons:

Compare Numpy and Cupy
>>> import numpy, cupy

>>> x = numpy.random.random((10000, 10000))
>>> y = cupy.random.random((10000, 10000))

>>> %timeit (np.sin(x) ** 2 + np.cos(x) ** 2 == 1).all()
402 ms ± 8.59 ms per
(continued...)

from Planet SciPy
read more

No comments:

Post a Comment

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production. from Planet Python via read...