Friday, May 31, 2019

Quansight Labs Blog: metadsl: A Framework for Domain Specific Languages in Python

metadsl: A Framework for Domain Specific Languages in Python

Hello, my name is Saul Shanabrook and for the past year or so I have been at Quansight exploring the array computing ecosystem. This started with working on the xnd project, a set of low level primitives to help build cross platform NumPy-like APIs, and then started exploring Lenore Mullin's work on a mathematics of arrays. After spending quite a bit of time working on an integrated solution built on these concepts, I decided to step back to try to generalize and simplify the core concepts. The trickiest part was not actually compiling mathematical descriptions of array operations in Python, but figuring out how to make it useful to existing users. To do this, we need to meet users where they are at, which is with the APIs they are already familiar with, like numpy. The goal of metadsl is to make it easier to tackle parts of this problem seperately so that we can collaborate on tackling it together.

Libraries for Scientific Computing

Much of the recent rise of Python's popularity is due to its usage for scientific computing and machine learning. This work is built on different frameworks, like Pandas, NumPy, Tensorflow, and scikit-learn. Each of these are meant to be used from Python, but have their own concepts and abstractions to learn on top of the core language, so we can look at them as Domain Specific Languages (DSLs). As the ecosystem has matured, we are now demanding more flexibility for how these languages are executed. Dask gives us a way to write Pandas or NumPy and execute it across many cores or computers, Ibis allows us to write Pandas but on a SQL database, with CuPy we can execute NumPy on our GPU, and with Numba we can optimize our NumPy expession on a CPU or GPU. These projects prove that it is possible to write optimizing compilers that target varying hardware paradigms for existing Python numeric APIs. However, this isn't straightforward and these projects success is a testament to the perserverence and ingenuity of the authors. We need to make it easy to add reusable optimizations to libraries like these, so that we can support the latest hardware and compiler optimizations from Python. metadsl is meant to be a place to come together to build a framework for DSLs in Python. It provides a way to seperate the user experience from the the specific of execution, to enable consistency and flexibility for users. In this post, I will go through an example of creating a very basic DSL. It will not use the metadsl library, but will created in the same style as metadsl to illustrate its basic principles.

Read more… (4 min remaining to read)



from Planet Python
via read more

No comments:

Post a Comment

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production. from Planet Python via read...