Monday, July 8, 2019

PSF GSoC students blogs: Improving uarray performance

What did you do this week?

I have been focused on my uarray PR (uarray#1780). uarray defines a protocol for dispatching function calls to multiple different backend implementations. In my PR, I've been reimplementing the core function dispatch mechanism in C++ using the Python C-API. This week I moved the backend registration system into C++, which means the protocol is now 100% C++. This has brought the overhead down from ~5 µs per function call to just ~700 ns, or about 10 times that of a normal Python function call. This overhead was one of the main blockers for the adoption of uarray, so it is very nice to see it come down.
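To give a rough idea of what "dispatching function calls to backends" means, here is a toy sketch in pure Python. It is not uarray's API or implementation: `register_backend`, `multimethod` and the two backend classes are hypothetical names made up for this illustration. It only shows the general shape of the per-call work (look up backends, try each one, fall through on `NotImplemented`), which is the kind of work that adds overhead to every call.

```python
# Toy illustration only: every name here is hypothetical, NOT uarray's API.
_backends = []  # registered backends, most recently registered first


def register_backend(backend):
    """Give `backend` the highest priority for future calls."""
    _backends.insert(0, backend)


def multimethod(name):
    """Create a function that forwards calls to the first backend handling them."""
    def dispatcher(*args, **kwargs):
        for backend in _backends:
            impl = getattr(backend, name, None)
            if impl is None:
                continue  # this backend does not provide the function
            result = impl(*args, **kwargs)
            if result is not NotImplemented:
                return result  # the backend accepted the arguments
        raise TypeError(f"no registered backend implements {name!r}")
    dispatcher.__name__ = name
    return dispatcher


class ListBackend:
    @staticmethod
    def total(values):
        if not isinstance(values, list):
            return NotImplemented  # let another backend try
        return sum(values)


class TupleBackend:
    @staticmethod
    def total(values):
        if not isinstance(values, tuple):
            return NotImplemented
        return sum(values)


register_backend(ListBackend)
register_backend(TupleBackend)
total = multimethod("total")
```

Calling `total([1, 2, 3])` first asks `TupleBackend`, which declines, and then `ListBackend`, which handles it. Each of those steps costs time on every call.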
    
I've also updated the vendored version of pypocketfft in scipy#10393. This new version includes a small cache for the FFT "twiddle factors", which I helped implement. This improves benchmarks by ~20% in most cases, or by as much as 60% for some input sizes.
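For readers unfamiliar with the term, the twiddle factors for a length-n transform are the complex roots of unity exp(-2πik/n). The sketch below shows the general idea of caching them per transform size, using `functools.lru_cache` and a naive O(n²) DFT. This is only an illustration and an assumption on my part about how to present the idea; pocketfft's actual cache is written in C++ and works differently in its details.

```python
import cmath
from functools import lru_cache


@lru_cache(maxsize=16)  # small cache: keeps only the most recently used sizes
def twiddle_factors(n):
    """Return exp(-2*pi*i*k/n) for k = 0..n-1 as an immutable tuple."""
    return tuple(cmath.exp(-2j * cmath.pi * k / n) for k in range(n))


def dft(x):
    """Naive O(n^2) DFT that reuses the cached twiddle factors for len(x)."""
    n = len(x)
    w = twiddle_factors(n)  # computed once per size, then served from the cache
    return [sum(x[m] * w[(m * k) % n] for m in range(n)) for k in range(n)]
```

Repeated transforms of the same length skip the trigonometric evaluation entirely, which is where the saving comes from; `twiddle_factors.cache_info()` reports the hits and misses.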

What is coming up next?

My uarray PR was merged over the weekend, so I can update my scipy.fft code and rerun the benchmarks there. I also plan on using the new version of pypocketfft to add support for Hermitian FFTs (like numpy's hfft). This would make scipy.fft a complete replacement for numpy.fft's functionality.
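As a reminder of what a Hermitian FFT computes: the input is the first n//2 + 1 samples of a signal with Hermitian symmetry, and the output is its spectrum, which is purely real. The pure-Python sketch below spells that out with a naive O(n²) DFT. `naive_hfft` is a hypothetical helper written for this post, not the scipy.fft or pypocketfft implementation, and it is meant to follow the same convention as numpy's `hfft(a, n)`.

```python
import cmath


def naive_hfft(a, n):
    """O(n^2) sketch of a Hermitian FFT.

    `a` holds the first n//2 + 1 samples of a Hermitian-symmetric signal;
    the result is the length-n real spectrum of the full signal.
    """
    if len(a) != n // 2 + 1:
        raise ValueError("expected n//2 + 1 input samples")
    # Rebuild the full signal: full[n - m] is the complex conjugate of full[m].
    full = [0j] * n
    for m, value in enumerate(a):
        full[m] = complex(value)
    for m in range(1, (n + 1) // 2):
        full[n - m] = full[m].conjugate()
    # Forward DFT; the imaginary parts cancel by symmetry, so keep the real part.
    return [
        sum(full[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n)).real
        for k in range(n)
    ]
```

For example, `naive_hfft([0, 1, 0], 4)` rebuilds the signal `[0, 1, 0, 1]` and returns values close to `[2, 0, -2, 0]`. A real implementation would exploit the symmetry to do roughly half the work instead of rebuilding the full signal.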

I can also work on adding pre-planned transforms to the scipy.fft interface. This would also require making pocketfft's plan caching much more flexible, so I expect we can add user config options to our automatic plan caching.
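Nothing is designed yet, but to make "user config options" concrete, here is one possible shape for a plan cache whose capacity the user can change at runtime. The `PlanCache` class, its methods, and the `make_plan` callback are all hypothetical; neither scipy.fft nor pocketfft exposes this interface.

```python
from collections import OrderedDict


class PlanCache:
    """Hypothetical LRU cache of per-size plans with an adjustable capacity."""

    def __init__(self, max_size=16):
        self.max_size = max_size
        self._plans = OrderedDict()  # least recently used entry comes first

    def get(self, n, make_plan):
        """Return the plan for size n, building it with make_plan(n) on a miss."""
        if n in self._plans:
            self._plans.move_to_end(n)  # mark as most recently used
            return self._plans[n]
        plan = make_plan(n)
        if self.max_size > 0:
            self._plans[n] = plan
            self._evict()
        return plan

    def resize(self, max_size):
        """User-facing knob: change the capacity; 0 disables caching."""
        self.max_size = max_size
        self._evict()

    def _evict(self):
        # Drop least recently used plans until the cache fits its capacity.
        while len(self._plans) > self.max_size:
            self._plans.popitem(last=False)

    def __len__(self):
        return len(self._plans)
```

A pre-planned transform would then be the case where the user holds on to the plan object directly instead of relying on the cache to keep it alive.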

Did you get stuck anywhere?

No significant blockers this week.

 


