You have some data in a relational database, and you want to process it with Pandas. So you use Pandas’ handy read_sql() API to get a DataFrame—and promptly run out of memory.
The problem: you’re loading all the data into memory at once. If you have enough rows in the SQL query’s results, it simply won’t fit in RAM.
Pandas does have a batching option for read_sql(), which can reduce the memory needed to process the results, but it’s still not perfect: the full query result can still end up loaded into memory at once!
So how do you process larger-than-memory queries with Pandas? Let’s find out.