Saturday, May 22, 2021

Python Pool: Achieving String Compression in Python

Introduction

Hello geeks! Today we are here with another module of string. The topic of discussion will be String compression using the python programming language. String compression is crucial for in-memory database management systems (IMDBMS) to reduce memory consumption and processing time. It has many applications, such as shortening the URL and messages as well to save space. Let’s have a detailed discussion on string compression.

What do you mean by String compression?

String compression in python basically means to shorten any string whose length is very long. Compressing the string will never change the original intent of the string. Let me give you an example for a better understanding.

Let us suppose we have an URL of an image .

URL: https://ift.tt/3vhmQ9Y

As you can see, the length of the URL is very long. So to shorten this URL, we will use string compression. By compressing the URL, the length of the URL changes, but the URL which you will get after shortening will take you to the same image if you paste that URL to google.

string compression pythonString Compression

Why do we need string compression?

The main aim of string compression in python is to save the memory as much as possible. This is because consumption of memory will lead to requiring more resources which are ultimately very expensive.

Nowadays, the world wants speed in whichever task they are searching for. The compressed data or the string will take less processing time and will deliver the output as quickly as possible.

It also features fast read operations, i.e., if a message is compressed, it will take less time for a user to read.

Thus, String compression will reduce the consumption of memory and the processing time, and the user’s time to read a message.

Until now, you must have understood the importance of string compression and its use when it comes to real-life problems.

Algorithm for string compression in python

Let us consider the following example

  • ‘AAABAACCDDDD’ —> ‘A3B1A2C2D4’
  • ‘BAAACCDDDD’ —> ‘B1A3C2D4’
  • ‘AABBCC’ –> ‘A2B2C2’
  • ‘BBBAACCCCDD’ –> ‘B3A2C4D2
  1. Pick the first character from the input string (str).
  2. Append it to the compressed string.
  3. Count the number of subsequent occurrences of the character (in str) and append the count to the compressed string if it is more than 1 only​.
  4. Then, pick the next character and repeat the steps above until the end of str is reached.

Code in Python Without Using Any Library

ind = 0
def compressed(string):
  global ind
  comp_str = ""
  len_str = len(string)
  while (ind != len_str):
    count = 1
    while ((ind < (len_str-1)) and (string[ind] == string[ind+1])):
      count = count + 1
      ind = ind + 1
    if (count == 1):
      comp_str = comp_str + str(string[ind])
    else:
      comp_str = comp_str + str(string[ind]) + str(count)
    
    ind = ind + 1
  return comp_str

string = input("Enter the string: ")
print("After compression:")
print(compressed(string))     
OUTPUT:- Enter the string : AABBBCCC
After compression: A2B3C3

Time complexity

The time complexity of this code is O(n).

Code in Python Using itertools Library

from itertools import takewhile, dropwhile

def parse_linq(s):
    return '' if not s \
        else str(len(list (takewhile(lambda c: c == s[0], s)))) + s[0] + 
        \ parse_linq("".join(list (dropwhile(lambda c: c = s[0], s))))

print(parse_linq ('AAABB'))
OUTPUT:-  A3B2

Explanation of the code

Lets look at each function used in the code above

  1. Right after return you can see Python variation of ternary operator – [value_true] if [condition] else [value_false].
    So line return ” if not s is identical to the guardian clause in the first algorithm. The else return value is implementing the rest of the logic.
  2. The loop is implemented as takewhile(lambda c: c == s[0], s)
    This will iterate over characters of the string argument for as long as the letter equals the first letter of the string argument (s[0]). I used “s” to make it shorter on the screen. However, “takewhile” returns a generator. Therefore we need the next function
  3. And the next function in this chain is the list([generator]). The generator returns one item at a time, and the list() function gets all items out of it.
    However, it returns a list, so string “AAA” will become [‘A’, ‘A’, ‘A’]
  4. Using the len([list]) function, we get the number of elements found in this substring. Combining it with the first element of the argument, we get “3A” out of [‘A’, ‘A’, ‘A’]. And this creates a “head” of every recursive call. Now we need to create a tail.
  5. The tail is created by using dropwhile(lambda c: c == s[0], s) function, which drops the number of elements already taken by the “head”.
  6. Since dropwhile is also a generator, we use list() as well.
  7. But we cannot pass a list with the recursive call, so “”.join() function will join elements of the list into a string, and the resulting string will be passed as a new argument into the recursive call.
  8. The recursion will stop when we take all characters out of the string and pass in an empty string.

Using collections Library

from collections import Counter

def compress(string):
    temp = Counter()
    result = " "
    for x in string:
        temp[x] += 1

    for key, value in temp.items():
        result += str(key) + str(value)
    return result

if __name__ == '__main__':
    s = input("Enter the string:")
    print(compress(s))

OUTPUT:-
Enter the string:ZZURRRRRR
Z2U1R6

Also, Read

FAQs

1. How to use zlib for performing string compression?

Ans :-

import zlib
text = b"Python Pool!" * 100
compressed = zlib.compress(text)

print("size of original: " + str(len(text)))
print("size of compressed: " + str(len(compressed)))

decompressed = zlib.decompress(compressed)
print("size of decompressed: " + str(len(decompressed)))
OUTPUT:-
size of original: 1300
size of compressed: 32

Conclusion

I hope you have understood today’s detailed tutorial on string compression using python. We have discussed the importance of string compression in real life and why it is important. We also understood the algorithm that is supposed to be used and a detailed explanation of the code using the library and without using the library.

If you have any confusion or doubt feel free to comment down below. Practice some question on string compression to gain confidence. Keep pythonning geeks!

The post Achieving String Compression in Python appeared first on Python Pool.



from Planet Python
via read more

No comments:

Post a Comment

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production. from Planet Python via read...