Tuesday, August 25, 2020

Artem Golubin: How to turn an ordinary gzip archive into a database

This article demonstrates how specially crafted gzip archives can be used as database-like storage. It also introduces a Python package and explains how it works.

gzip is a popular file compression format for storing large amounts of raw data. It offers a good compression ratio, but compression and decompression are relatively slow.

Many companies use it in big data applications to store compressed CSV or JSON Lines files. Such file formats are row-oriented and are usually processed line by line. gzip can save a lot of space, especially when JSON files repeat the same field names in every record.
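As a rough illustration of that point, here is a minimal sketch using only Python's standard `gzip`, `io`, and `json` modules. The sample records and their field names (`user_id`, `event_type`, `page_url`) are made up for this example and do not come from the article. It compresses a JSON Lines payload in memory, compares sizes, and then reads the records back line by line.

```python
import gzip
import io
import json

# Hypothetical sample data: every record repeats the same field names,
# which is the kind of redundancy gzip handles well.
records = [
    {"user_id": i, "event_type": "click", "page_url": f"/items/{i}"}
    for i in range(1000)
]

# Serialize as JSON Lines: one JSON document per line.
raw = "".join(json.dumps(record) + "\n" for record in records).encode("utf-8")

# Compress the whole payload into a single gzip stream.
compressed = gzip.compress(raw)
print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")

# Read it back as a stream, decoding one line (one record) at a time.
with gzip.open(io.BytesIO(compressed), mode="rt", encoding="utf-8") as stream:
    decoded = [json.loads(line) for line in stream]
```

The exact sizes depend on the data, so no specific ratio is claimed here. Note that the read loop walks the stream from the beginning, which is how row-oriented gzip files are normally consumed.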

Unfortunately, a compressed file can only be accessed in the streaming[....]



from Planet Python

