Wednesday, May 26, 2021

Stack Abuse: Checking Vulnerabilities in Your Python Code with Bandit

Introduction

As developers, we're encouraged from the start of the journey to write clean code. Equally important, but less talked about, is writing and using secure code.

In Python projects, we typically install modules and third-party packages to avoid rebuilding solutions that already exist. That convenience has a cost: attackers can exploit the way we use those modules and dependencies, so we need to be able to detect when something is amiss. Tools like Bandit, an open-source security analysis utility for Python projects, help with exactly that.

In this guide, we'll explore how simple lines of code can end up being destructive, and how we can use Bandit to help us identify them.

Security Vulnerabilities in Python

A security vulnerability in our code is a flaw that malicious agents can take advantage of to exploit our systems and/or data. As you program in Python, some function calls or module imports may be safe when invoked locally but open the door for malicious users to tamper with the system once the code is deployed without the right configuration.

You've probably come across several of these in your day-to-day coding activities. Some of the more common attacks and exploits are largely dealt with by modern frameworks and systems which anticipate such attacks.

Here are a few:

  • OS Command Injection - This attack often starts with the humble subprocess module, which you use to execute command-line utilities and invoke OS-level processes. The following snippet uses the subprocess module to perform a DNS lookup and print the output:
# nslookup.py
import subprocess
domain = input("Enter the Domain: ")
output = subprocess.check_output(f"nslookup {domain}", shell=True, encoding='UTF-8')
print(output)

What could go wrong here?

In an ideal scenario, the end-user provides a domain name and the script returns the results of the nslookup command. But if they append an OS command such as ls to the domain name, that command is run too, as the following output shows:

$ python3 nslookup.py
Enter the Domain: stackabuse.com ; ls
Server:         218.248.112.65
Address:        218.248.112.65#53

Non-authoritative answer:
Name:   stackabuse.com
Address: 172.67.136.166
Name:   stackabuse.com
Address: 104.21.62.141
Name:   stackabuse.com
Address: 2606:4700:3034::ac43:88a6
Name:   stackabuse.com
Address: 2606:4700:3036::6815:3e8d

config.yml
nslookup.py

By allowing someone to pass in part of a command, we've given them access to the OS-level shell.

Imagine how destructive things might get if the malicious actor were to provide a command such as cat /etc/passwd, which would reveal the user accounts that exist on the system. As simple as it looks, the subprocess module can be very risky to use.
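One common way to reduce this risk is to avoid shell=True and pass the command as a list of arguments, so that characters such as ; are never interpreted by a shell. The following is a minimal sketch, not a complete defence: the helper names build_nslookup_args and lookup, and the simple allow-list pattern for domain names, are assumptions made for illustration.

```python
import re
import subprocess

# Hypothetical allow-list: the value must start with a letter or digit and may
# only contain letters, digits, dots and hyphens (no spaces, ';', '|', '&').
_DOMAIN_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.-]{0,252}")


def build_nslookup_args(domain):
    """Validate the user input and return the argument list for nslookup."""
    domain = domain.strip()
    if not _DOMAIN_RE.fullmatch(domain):
        raise ValueError(f"not a valid domain name: {domain!r}")
    return ["nslookup", domain]


def lookup(domain):
    """Run nslookup without a shell and return its output as text."""
    # The list is handed to the OS directly, so no shell ever parses the
    # input and "stackabuse.com ; ls" cannot smuggle in a second command.
    return subprocess.check_output(build_nslookup_args(domain), encoding="utf-8")
```

With this sketch, build_nslookup_args("stackabuse.com ; ls") raises a ValueError before any process is started. Requiring a leading letter or digit also keeps the input from being read as a command-line option.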

  • SQL Injection - SQL Injection attacks are less common these days, thanks to the widespread use of ORMs. But if you still write raw SQL, you need to be aware of how your SQL queries are constructed and how safely your query parameters are validated and passed in.

Consider the following snippet:

from django.db import connection

def find_user(username):
    with connection.cursor() as cur:
        cur.execute(f"""select username from USERS where name = '%s'""" % username)
        output = cur.fetchone()
    return output

The function call is simple - you pass in a string as an argument, say "Foobar" and the string gets inserted into the SQL query, resulting in:

select username from USERS where name = 'Foobar'

However, much like the previous issue, if someone were to add a ; character, they could chain multiple statements. For example, inserting '; DROP TABLE USERS; -- would result in:

select username from USERS where name = ''; DROP TABLE USERS; --'

On a database driver that accepts several statements in one call, the first statement would run right before the database drops the entire USERS table. Yikes!

Notice how the last quote has been commented out using the double dashes. SQL query parameters can become nightmares, if not reviewed properly. Here's where security tools can help in spotting such unintentional yet harmful lines of code.
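The usual remedy is to pass user input as a query parameter instead of formatting it into the SQL string. The sketch below illustrates the idea with the standard-library sqlite3 module and an in-memory table so that it runs on its own; the table contents are made up for illustration. With Django's connection.cursor(), the equivalent is to use a %s placeholder and pass the values as a second argument to cur.execute().

```python
import sqlite3


def find_user(conn, username):
    """Look up a user with a parameterized query."""
    # The "?" placeholder makes the driver treat username purely as data,
    # so quotes and semicolons inside it cannot change the statement.
    cur = conn.execute("SELECT username FROM USERS WHERE name = ?", (username,))
    return cur.fetchone()


# Illustrative in-memory database with a single, made-up row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE USERS (username TEXT, name TEXT)")
conn.execute("INSERT INTO USERS VALUES ('foobar01', 'Foobar')")

print(find_user(conn, "Foobar"))                   # ('foobar01',)
print(find_user(conn, "'; DROP TABLE USERS; --"))  # None
```

The malicious string is simply compared against the name column, matches nothing, and the USERS table is left untouched.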

Bandit

Bandit is an open-source tool written in Python that helps you analyze your Python code and find common security issues in it. It scans your code and flags vulnerable patterns such as the ones shown in the previous section. Bandit can be installed system-wide or inside your virtual environment via pip:

$ pip install bandit

Bandit can be used from the following perspectives:

  • DevSecOps: Bandit can be included as part of Continuous Integration (CI) practices.
  • Development: Bandit can be used as part of the local development setup, where developers can catch risky function calls before committing the code.

Using Bandit

Bandit can be easily integrated into CI tests, so that common vulnerability checks run before code ships to production. For example, DevSecOps engineers can invoke Bandit whenever a pull request is raised or code is committed. Depending on the organization's guidelines, specific module imports and function calls can be allowed or restricted.

Bandit lets users control which modules are allowed and which are blacklisted. This is defined in a configuration file, which can be generated using the bandit-config-generator tool. The results of a run can be exported in formats such as CSV and JSON.

The configuration file can be generated as:

$ bandit-config-generator -o config.yml

The generated config.yml file contains several parts: the tests that can be included or skipped, the function calls that can be allowed or flagged, and settings for individual tests such as the key sizes used when checking cryptographic keys. You can run Bandit with this configuration file, or perform all tests simply by passing in the project's directory:

$ bandit -r code/ -f csv -o out.csv
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
434 [0.. 50.. 100.. 150.. 200.. 250.. 300.. 350.. 400.. ]
[csv]   INFO    CSV output written to file: out.csv

In this Bandit call, the -r flag tells Bandit to scan the project directory recursively, -f csv selects the CSV output format, and -o names the file the report is written to. Bandit tests all the Python scripts inside the project directory and writes the results to out.csv. The output is very detailed and here's what it looks like:

Fig 1. Bandit Output

As mentioned in the previous section, Bandit flags both the subprocess module import and the shell=True argument as security risks. If using this module and argument is unavoidable, you can make Bandit skip the corresponding tests by listing the codes B602 (subprocess_popen_with_shell_equals_true) and B404 (import_subprocess) under "skips" in the configuration file. You can find these codes in the generated config file. The skips section then looks like this:

skips: [B602, B404]

If you re-run Bandit using the edited configuration file, the result is an empty CSV file, which means the remaining tests reported no issues:

$ bandit -c code/config.yml -r code/ -f csv -o out2.csv
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: B404,B602
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    using config: code/config.yml
[main]  INFO    running on Python 3.8.5
434 [0.. 50.. 100.. 150.. 200.. 250.. 300.. 350.. 400.. ]
[csv]   INFO    CSV output written to file: out2.csv

When collaborating inside an organization, include this Bandit configuration file in every newly created project so that developers can run the same checks locally.
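As a rough sketch of the CI use case described in this section, the script below runs Bandit with JSON output and fails the build when a high-severity issue is reported. It assumes Bandit is installed and prints its JSON report to standard output when no -o file is given; the function names run_bandit, high_severity_issues and main are hypothetical, and only the "results", "filename", "line_number", "test_id", "issue_severity" and "issue_text" fields of the report are used.

```python
import json
import subprocess


def run_bandit(path):
    """Run Bandit recursively on path and return the parsed JSON report."""
    # Bandit exits with a non-zero status when it finds issues, so the
    # return code is deliberately not checked here.
    proc = subprocess.run(
        ["bandit", "-r", path, "-f", "json"],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)


def high_severity_issues(report):
    """Return one summary line per HIGH severity finding in the report."""
    return [
        f"{r['filename']}:{r['line_number']} {r['test_id']} {r['issue_text']}"
        for r in report.get("results", [])
        if r.get("issue_severity") == "HIGH"
    ]


def main(path):
    """Print high-severity findings; return 1 if there are any, else 0."""
    issues = high_severity_issues(run_bandit(path))
    for line in issues:
        print(line)
    return 1 if issues else 0
```

A CI job could call sys.exit(main("code/")) so that the pipeline stops on high-severity findings while lower-severity ones are only recorded in the report.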

Conclusion

Code should be clean and safe. In this short guide, we've taken a look at Bandit, a Python tool for identifying common security issues in code that uses modules you're probably already relying on.



