Monday, April 5, 2021

Python⇒Speed: Loading SQL data into Pandas without running out of memory

You have some data in a relational database, and you want to process it with Pandas. So you use Pandas’ handy read_sql() API to get a DataFrame—and promptly run out of memory.
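
For context, here is a minimal sketch of that usual pattern; the SQLAlchemy connection URL and the users table are hypothetical, not taken from the article:

    import pandas as pd
    from sqlalchemy import create_engine

    # Hypothetical database URL and table name, purely for illustration.
    engine = create_engine("postgresql://localhost/mydb")

    # read_sql() materializes every row of the result in a single DataFrame,
    # so a sufficiently large query exhausts RAM.
    df = pd.read_sql("SELECT * FROM users", engine)
    print(df.memory_usage(deep=True).sum())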

The problem: you’re loading all the data into memory at once. If you have enough rows in the SQL query’s results, it simply won’t fit in RAM.

Pandas does have a batching option for read_sql(), which can reduce memory usage, but it’s still not perfect: with the default settings, the database driver can still end up loading all the data into memory at once!
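
Concretely, that batching option is the chunksize parameter, which turns read_sql() into an iterator of smaller DataFrames. A sketch, reusing the hypothetical engine and users table from above:

    total_rows = 0
    # With chunksize, read_sql() yields DataFrames of at most 10,000 rows
    # each, instead of one giant DataFrame.
    for chunk in pd.read_sql("SELECT * FROM users", engine, chunksize=10_000):
        total_rows += len(chunk)  # stand-in for real per-chunk processing
    print(total_rows)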

So how do you process larger-than-memory queries with Pandas? Let’s find out.
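
The full answer is in the linked article, but one technique worth sketching here is to pair chunksize with a server-side cursor via SQLAlchemy's stream_results option, so the driver fetches rows incrementally instead of pulling the whole result set client-side; again, the URL and table above are hypothetical:

    # stream_results=True requests a server-side cursor (supported by
    # PostgreSQL, among others), so rows are streamed to the client in
    # batches rather than fetched all at once.
    conn = engine.connect().execution_options(stream_results=True)
    try:
        for chunk in pd.read_sql("SELECT * FROM users", conn, chunksize=10_000):
            print(len(chunk))  # stand-in for real per-chunk processing
    finally:
        conn.close()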

Read more...

from Planet Python


TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production.