Monday, September 30, 2019

Codementor: Writing a simple Pytest hook

Write some basic pytest hooks to capture test failures in a text file.

from Planet Python
via read more

Podcast.__init__: Building A Modern Discussion Forum In Python To Support Healthy Communities

Building and sustaining a healthy community requires a substantial amount of effort, especially online. The design and user experience of the digital space can impact the overall interactions of the participants and guide them toward respectful conversation. In this episode Rafał Pitoń shares his experience building the Misago platform for creating community forums. He explains his motivation for creating the project, the lessons he has learned in the process, and how it is being used by himself and others. This was a great conversation about how technology is just a means, and not the end in itself.

Summary

Building and sustaining a healthy community requires a substantial amount of effort, especially online. The design and user experience of the digital space can impact the overall interactions of the participants and guide them toward respectful conversation. In this episode Rafał Pitoń shares his experience building the Misago platform for creating community forums. He explains his motivation for creating the project, the lessons he has learned in the process, and how it is being used by himself and others. This was a great conversation about how technology is just a means, and not the end in itself.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, Data Council in Barcelona, and the Data Orchestration Summit. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Rafał Pitoń about Misago, a fully featured modern forum application that is fast, scalable, and responsive

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what Misago is and your motivation for creating it?
    • How does it compare to other modern forum options such as Discourse and Flarum?
  • How did you generate and prioritize the set of features that you have implemented and what are the main capabilities that are still on your roadmap?
  • Is Misago intended to be run in isolation, or does it allow for integrating into a larger Django project?
    • Is there any support for multi-tenancy?
  • How is Misago itself implemented and how has the architecture evolved since you first began working on it?
    • If you were to start it today, what are some of the choices that you would make differently?
  • What are the extension points that developers can hook into for adding custom functionality?
  • In addition to the technical challenges, managing a forum involves a fair amount of social challenges. How does Misago help with management of a healthy community?
    • How do different design elements factor into promoting healthy conversation and sustainable engagement?
    • What are some of the aspects of community management and the accompanying platform features that enable them which aren’t initially obvious?
  • For someone who wants to use Misago, what is involved in deploying and configuring it?
    • What are some of the routine maintenance tasks that they should be aware of?
  • What are some of the most interesting or unexpected ways that you have seen Misago used?
  • What have you found to be the most interesting, unexpected, and challenging aspects of building and maintaining a forum platform?
  • What do you have planned for the future of Misago?

Keep In Touch

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com) with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA



from Planet Python
via read more

Evennia: Blackifying and fixing bugs

Since version 0.9 of Evennia, the MU*-creation framework, was released, work has mainly been focused on bug fixing. But there few new features also already sneaked into master branch, despite technically being changes slated for Evennia 1.0.




On Frontends

Contributor friarzen has chipped away at improving Evennia's HTML5 web client. It already had the ability to structure and spawn any number of nested text panes. In the future we want to extend the user's ability to save an restore its layouts and allow developers to offer pre-prepared layouts for their games. Already now though, it has gotten plugins for handling both graphics, sounds and video:

Inline image by me (griatch-art.deviantart.com)


A related fun development is Castlelore Studios' development of an Unreal Engine Evennia plugin (this is unaffiliated with core Evennia development and I've not tried it, but it looks pretty nifty!): 

Image ©Castlelore Studios
 
On Black

Evennia's source code is extensively documented and was sort of adhering to the Python formatting standard PEP8. But many places were sort of hit-and-miss and others were formatted with slight variations due to who wrote the code.
 
After pre-work and recommendation by Greg Taylor, Evennia has adopted the black autoformatter for its source code. I'm not really convinced that black produces the best output of all possible outputs every time, but as Greg puts it, it's at least consistent in style. We use a line width of 100.

I have set it up so that whenever a new commit is added to the repo, the black formatter will run on it. It may still produce line widths >100 at times (especially for long strings), but otherwise this reduces the number of different PEP8 infractions in the code a lot.

On Python3

Overall the move to Python3 appears to have been pretty uneventful for most users. I've not heard almost any complaints or requests for help with converting an existing game.
The purely Python2-to-Python3 related bugs have been very limited after launch; almost all have been with unicode/bytes when sending data over the wire.

People have wholeheartedly adopted the new f-strings though, and some spontaneous PRs have already been made towards converting some of Evennia existing code into using them.

Post-launch we moved to Django 2.2.2, but the Django 2+ upgrades have been pretty uneventful so far.Some people had issues installing Twisted on Windows since there was no py3.7 binary wheel (causing them to have to compile it from scratch). The rise of the Linux Subsystem on Windows have alleviated most of this though and I've not seen any Windows install issues in a while.

On Future

For now we'll stay in bug-fixing mode, with the ocational new feature popping up here and there. In the future we'll move to the develop branch again. I have a slew of things in mind for 1.0. 

Apart from bug fixing and cleaning up the API in several places, I plan to make use of the feedback received over the years to make Evennia a little more accessible for a new user. This means I'll also try reworking and consolidating the tutorials so one can follow them with a more coherent "red thread", as well as improving the documentation in various other ways to help newcomers with the common questions we hear a lot. 

The current project plan (subject to change) is found here. Lots of things to do!



Top image credit: https://ift.tt/24BA3Hs (public domain)


from Planet Python
via read more

PyCharm: Webinar: “React+TypeScript+TDD in PyCharm” with Paul Everitt

ReactJS is wildly popular and thus wildly supported. TypeScript is increasingly popular, and thus increasingly supported.

The two together? Not as much. Given that they both change quickly, it’s hard to find accurate learning materials.

React+TypeScript, with PyCharm? That three-part combination is the topic of this webinar. We’ll show a little about a lot. Meaning, the key steps to getting productive, in PyCharm, for React projects using TypeScript. Along the way we’ll show test-driven development and emphasize tips-and-tricks in the IDE.

  • Wednesday, October 16
  • 6:00 PM – 7:00 PM CEST (12:00 PM – 1:00 PM EDT)
  • Aimed at intermediate web developers familiar with React
  • Register here

About This Webinar

This webinar is based on a 12 part tutorial with write-ups, videos, and working code for each step. The tutorial covers: getting started with Jest testing, debugging, TSX, functional components, sharing props with types, class based components, interfaces, testing event handlers, and “dumb” components.

You could, of course, skip this webinar and go through the material. Or you could go through the material and use the webinar to ask questions. Either way, we’ll give a quick treatment of each topic.

As a note, I’ll be doing this same webinar, but in webinars with IntelliJ and Rider, later in the year.

One final point: the tutorial and this webinar teach React+TS while sitting in tests, rather than the browser. It’s a productive way to work and makes for a good learning experience.

Speaking To You

Paul is the PyCharm Developer Advocate at JetBrains. Before that, Paul was a co-founder of Zope Corporation, taking the first open source application server through $14M of funding. Paul has bootstrapped both the Python Software Foundation and the Plone Foundation. Paul was an officer in the US Navy, starting www.navy.mil in 1993.



from Planet Python
via read more

Real Python: Preventing SQL Injection Attacks With Python

Every few years, the Open Web Application Security Project (OWASP) ranks the most critical web application security risks. Since the first report, injection risks have always been on top. Among all injection types, SQL injection is one of the most common attack vectors, and arguably the most dangerous. As Python is one of the most popular programming languages in the world, knowing how to protect against Python SQL injection is critical.

In this tutorial, you’re going to learn:

  • What Python SQL injection is and how to prevent it
  • How to compose queries with both literals and identifiers as parameters
  • How to safely execute queries in a database

This tutorial is suited for users of all database engines. The examples here use PostgreSQL, but the results can be reproduced in other database management systems (such as SQLite, MySQL, Microsoft SQL Server, Oracle, and so on).

Free Bonus: 5 Thoughts On Python Mastery, a free course for Python developers that shows you the roadmap and the mindset you'll need to take your Python skills to the next level.

Understanding Python SQL Injection

SQL Injection attacks are such a common security vulnerability that the legendary xkcd webcomic devoted a comic to it:

A humorous webcomic by xkcd about the potential effect of SQL injection"Exploits of a Mom" (Image: xkcd)

Generating and executing SQL queries is a common task. However, companies around the world often make horrible mistakes when it comes to composing SQL statements. While the ORM layer usually composes SQL queries, sometimes you have to write your own.

When you use Python to execute these queries directly into a database, there’s a chance you could make mistakes that might compromise your system. In this tutorial, you’ll learn how to successfully implement functions that compose dynamic SQL queries without putting your system at risk for Python SQL injection.

Setting Up a Database

To get started, you’re going to set up a fresh PostgreSQL database and populate it with data. Throughout the tutorial, you’ll use this database to witness firsthand how Python SQL injection works.

Creating a Database

First, open your shell and create a new PostgreSQL database owned by the user postgres:

$ createdb -O postgres psycopgtest

Here you used the command line option -O to set the owner of the database to the user postgres. You also specified the name of the database, which is psycopgtest.

Note: postgres is a special user, which you would normally reserve for administrative tasks, but for this tutorial, it’s fine to use postgres. In a real system, however, you should create a separate user to be the owner of the database.

Your new database is ready to go! You can connect to it using psql:

$ psql -U postgres -d psycopgtest
psql (11.2, server 10.5)
Type "help" for help.

You’re now connected to the database psycopgtest as the user postgres. This user is also the database owner, so you’ll have read permissions on every table in the database.

Creating a Table With Data

Next, you need to create a table with some user information and add data to it:

psycopgtest=# CREATE TABLE users (
    username varchar(30),
    admin boolean
);
CREATE TABLE

psycopgtest=# INSERT INTO users
    (username, admin)
VALUES
    ('ran', true),
    ('haki', false);
INSERT 0 2

psycopgtest=# SELECT * FROM users;
 username | admin
----------+-------
 ran      | t
 haki     | f
(2 rows)

The table has two columns: username and admin. The admin column indicates whether or not a user has administrative privileges. Your goal is to target the admin field and try to abuse it.

Setting Up a Python Virtual Environment

Now that you have a database, it’s time to set up your Python environment. For step-by-step instructions on how to do this, check out Python Virtual Environments: A Primer.

Create your virtual environment in a new directory:

~/src $ mkdir psycopgtest
~/src $ cd psycopgtest
~/src/psycopgtest $ python3 -m venv venv

After you run this command, a new directory called venv will be created. This directory will store all the packages you install inside the virtual environment.

Connecting to the Database

To connect to a database in Python, you need a database adapter. Most database adapters follow version 2.0 of the Python Database API Specification PEP 249. Every major database engine has a leading adapter:

Database Adapter
PostgreSQL Psycopg
SQLite sqlite3
Oracle cx_oracle
MySql MySQLdb

To connect to a PostgreSQL database, you’ll need to install Psycopg, which is the most popular adapter for PostgreSQL in Python. Django ORM uses it by default, and it’s also supported by SQLAlchemy.

In your terminal, activate the virtual environment and use pip to install psycopg:

~/src/psycopgtest $ source venv/bin/activate
~/src/psycopgtest $ python -m pip install psycopg2>=2.8.0
Collecting psycopg2
  Using cached https://....
  psycopg2-2.8.2.tar.gz
Installing collected packages: psycopg2
  Running setup.py install for psycopg2 ... done
Successfully installed psycopg2-2.8.2

Now you’re ready to create a connection to your database. Here’s the start of your Python script:

import psycopg2

connection = psycopg2.connect(
    host="localhost",
    database="psycopgtest",
    user="postgres",
    password=None,
)
connection.set_session(autocommit=True)

You used psycopg2.connect() to create the connection. This function accepts the following arguments:

  • host is the IP address or the DNS of the server where your database is located. In this case, the host is your local machine, or localhost.

  • database is the name of the database to connect to. You want to connect to the database you created earlier, psycopgtest.

  • user is a user with permissions for the database. In this case, you want to connect to the database as the owner, so you pass the user postgres.

  • password is the password for whoever you specified in user. In most development environments, users can connect to the local database without a password.

After setting up the connection, you configured the session with autocommit=True. Activating autocommit means you won’t have to manually manage transactions by issuing a commit or rollback. This is the default behavior in most ORMs. You use this behavior here as well so that you can focus on composing SQL queries instead of managing transactions.

Note: Django users can get the instance of the connection used by the ORM from django.db.connection:

from django.db import connection

Executing a Query

Now that you have a connection to the database, you’re ready to execute a query:

>>>
>>> with connection.cursor() as cursor:
...     cursor.execute('SELECT COUNT(*) FROM users')
...     result = cursor.fetchone()
... print(result)
(2,)

You used the connection object to create a cursor. Just like a file in Python, cursor is implemented as a context manager. When you create the context, a cursor is opened for you to use to send commands to the database. When the context exits, the cursor closes and you can no longer use it.

Note: To learn more about context managers, check out Python Context Managers and the “with” Statement.

While inside the context, you used cursor to execute a query and fetch the results. In this case, you issued a query to count the rows in the users table. To fetch the result from the query, you executed cursor.fetchone() and received a tuple. Since the query can only return one result, you used fetchone(). If the query were to return more than one result, then you’d need to either iterate over cursor or use one of the other fetch* methods.

Using Query Parameters in SQL

In the previous section, you created a database, established a connection to it, and executed a query. The query you used was static. In other words, it had no parameters. Now you’ll start to use parameters in your queries.

First, you’re going to implement a function that checks whether or not a user is an admin. is_admin() accepts a username and returns that user’s admin status:

# BAD EXAMPLE. DON'T DO THIS!
def is_admin(username: str) -> bool:
    with connection.cursor() as cursor:
        cursor.execute("""
            SELECT
                admin
            FROM
                users
            WHERE
                username = '%s'
        """ % username)
        result = cursor.fetchone()
    admin, = result
    return admin

This function executes a query to fetch the value of the admin column for a given username. You used fetchone() to return a tuple with a single result. Then, you unpacked this tuple into the variable admin. To test your function, check some usernames:

>>>
>>> is_admin('haki')
False
>>> is_admin('ran')
True

So far so good. The function returned the expected result for both users. But what about non-existing user? Take a look at this Python traceback:

>>>
>>> is_admin('foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in is_admin
TypeError: cannot unpack non-iterable NoneType object

When the user does not exist, a TypeError is raised. This is because .fetchone() returns None when no results are found, and unpacking None raises a TypeError. The only place you can unpack a tuple is where you populate admin from result.

To handle non-existing users, create a special case for when result is None:

# BAD EXAMPLE. DON'T DO THIS!
def is_admin(username: str) -> bool:
    with connection.cursor() as cursor:
        cursor.execute("""
            SELECT
                admin
            FROM
                users
            WHERE
                username = '%s'
        """ % username)
        result = cursor.fetchone()

    if result is None:
        # User does not exist
        return False

    admin, = result
    return admin

Here, you’ve added a special case for handling None. If username does not exist, then the function should return False. Once again, test the function on some users:

>>>
>>> is_admin('haki')
False
>>> is_admin('ran')
True
>>> is_admin('foo')
False

Great! The function can now handle non-existing usernames as well.

Exploiting Query Parameters With Python SQL Injection

In the previous example, you used string interpolation to generate a query. Then, you executed the query and sent the resulting string directly to the database. However, there’s something you may have overlooked during this process.

Think back to the username argument you passed to is_admin(). What exactly does this variable represent? You might assume that username is just a string that represents an actual user’s name. As you’re about to see, though, an intruder can easily exploit this kind of oversight and cause major harm by performing Python SQL injection.

Try to check if the following user is an admin or not:

>>>
>>> is_admin("'; select true; --")
True

Wait… What just happened?

Let’s take another look at the implementation. Print out the actual query being executed in the database:

>>>
>>> print("select admin from users where username = '%s'" % "'; select true; --")
select admin from users where username = ''; select true; --'

The resulting text contains three statements. To understand exactly how Python SQL injection works, you need to inspect each part individually. The first statement is as follows:

select admin from users where username = '';

This is your intended query. The semicolon (;) terminates the query, so the result of this query does not matter. Next up is the second statement:

select true;

This statement was constructed by the intruder. It’s designed to always return True.

Lastly, you see this short bit of code:

--'

This snippet defuses anything that comes after it. The intruder added the comment symbol (--) to turn everything you might have put after the last placeholder into a comment.

When you execute the function with this argument, it will always return True. If, for example, you use this function in your login page, an intruder could log in with the username '; select true; --, and they’ll be granted access.

If you think this is bad, it could get worse! Intruders with knowledge of your table structure can use Python SQL injection to cause permanent damage. For example, the intruder can inject an update statement to alter the information in the database:

>>>
>>> is_admin('haki')
False
>>> is_admin("'; update users set admin = 'true' where username = 'haki'; select true; --")
True
>>> is_admin('haki')
True

Let’s break it down again:

';

This snippet terminates the query, just like in the previous injection. The next statement is as follows:

update users set admin = 'true' where username = 'haki';

This section updates admin to true for user haki.

Finally, there’s this code snippet:

select true; --

As in the previous example, this piece returns true and comments out everything that follows it.

Why is this worse? Well, if the intruder manages to execute the function with this input, then user haki will become an admin:

psycopgtest=# select * from users;
 username | admin
----------+-------
 ran      | t
 haki     | t
(2 rows)

The intruder no longer has to use the hack. They can just log in with the username haki. (If the intruder really wanted to cause harm, then they could even issue a DROP DATABASE command.)

Before you forget, restore haki back to its original state:

psycopgtest=# update users set admin = false where username = 'haki';
UPDATE 1

So, why is this happening? Well, what do you know about the username argument? You know it should be a string representing the username, but you don’t actually check or enforce this assertion. This can be dangerous! It’s exactly what attackers are looking for when they try to hack your system.

Crafting Safe Query Parameters

In the previous section, you saw how an intruder can exploit your system and gain admin permissions by using a carefully crafted string. The issue was that you allowed the value passed from the client to be executed directly to the database, without performing any sort of check or validation. SQL injections rely on this type of vulnerability.

Any time user input is used in a database query, there’s a possible vulnerability for SQL injection. The key to preventing Python SQL injection is to make sure the value is being used as the developer intended. In the previous example, you intended for username to be used as a string. In reality, it was used as a raw SQL statement.

To make sure values are used as they’re intended, you need to escape the value. For example, to prevent intruders from injecting raw SQL in the place of a string argument, you can escape quotation marks:

>>>
>>> # BAD EXAMPLE. DON'T DO THIS!
>>> username = username.replace("'", "''")

This is just one example. There are a lot of special characters and scenarios to think about when trying to prevent Python SQL injection. Lucky for you, modern database adapters, come with built-in tools for preventing Python SQL injection by using query parameters. These are used instead of plain string interpolation to compose a query with parameters.

Note: Different adapters, databases, and programming languages refer to query parameters by different names. Common names include bind variables, replacement variables, and substitution variables.

Now that you have a better understanding of the vulnerability, you’re ready to rewrite the function using query parameters instead of string interpolation:

 1 def is_admin(username: str) -> bool:
 2     with connection.cursor() as cursor:
 3         cursor.execute("""
 4             SELECT
 5                 admin
 6             FROM
 7                 users
 8             WHERE
 9                 username = %(username)s
10         """, {
11             'username': username
12         })
13         result = cursor.fetchone()
14 
15     if result is None:
16         # User does not exist
17         return False
18 
19     admin, = result
20     return admin

Here’s what’s different in this example:

  • In line 9, you used a named parameter username to indicate where the username should go. Notice how the parameter username is no longer surrounded by single quotation marks.

  • In line 11, you passed the value of username as the second argument to cursor.execute(). The connection will use the type and value of username when executing the query in the database.

To test this function, try some valid and invalid values, including the dangerous string from before:

>>>
>>> is_admin('haki')
False
>>> is_admin('ran')
True
>>> is_admin('foo')
False
>>> is_admin("'; select true; --")
False

Amazing! The function returned the expected result for all values. What’s more, the dangerous string no longer works. To understand why, you can inspect the query generated by execute():

>>>
>>> with connection.cursor() as cursor:
...    cursor.execute("""
...        SELECT
...            admin
...        FROM
...            users
...        WHERE
...            username = %(username)s
...    """, {
...        'username': "'; select true; --"
...    })
...    print(cursor.query.decode('utf-8'))
SELECT
    admin
FROM
    users
WHERE
    username = '''; select true; --'

The connection treated the value of username as a string and escaped any characters that might terminate the string and introduce Python SQL injection.

Passing Safe Query Parameters

Database adapters usually offer several ways to pass query parameters. Named placeholders are usually the best for readability, but some implementations might benefit from using other options.

Let’s take a quick look at some of the right and wrong ways to use query parameters. The following code block shows the types of queries you’ll want to avoid:

# BAD EXAMPLES. DON'T DO THIS!
cursor.execute("SELECT admin FROM users WHERE username = '" + username + '");
cursor.execute("SELECT admin FROM users WHERE username = '%s' % username);
cursor.execute("SELECT admin FROM users WHERE username = '{}'".format(username));
cursor.execute(f"SELECT admin FROM users WHERE username = '{username}'");

Each of these statements passes username from the client directly to the database, without performing any sort of check or validation. This sort of code is ripe for inviting Python SQL injection.

In contrast, these types of queries should be safe for you to execute:

# SAFE EXAMPLES. DO THIS!
cursor.execute("SELECT admin FROM users WHERE username = %s'", (username, ));
cursor.execute("SELECT admin FROM users WHERE username = %(username)s", {'username': username});

In these statements, username is passed as a named parameter. Now, the database will use the specified type and value of username when executing the query, offering protection from Python SQL injection.

Using SQL Composition

So far you’ve used parameters for literals. Literals are values such as numbers, strings, and dates. But what if you have a use case that requires composing a different query—one where the parameter is something else, like a table or column name?

Inspired by the previous example, let’s implement a function that accepts the name of a table and returns the number of rows in that table:

# BAD EXAMPLE. DON'T DO THIS!
def count_rows(table_name: str) -> int:
    with connection.cursor() as cursor:
        cursor.execute("""
            SELECT
                count(*)
            FROM
                %(table_name)s
        """, {
            'table_name': table_name,
        })
        result = cursor.fetchone()

    rowcount, = result
    return rowcount

Try to execute the function on your users table:

>>>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in count_rows
psycopg2.errors.SyntaxError: syntax error at or near "'users'"
LINE 5:                 'users'
                        ^

The command failed to generate the SQL. As you’ve seen already, the database adapter treats the variable as a string or a literal. A table name, however, is not a plain string. This is where SQL composition comes in.

You already know it’s not safe to use string interpolation to compose SQL. Luckily, Psycopg provides a module called psycopg.sql to help you safely compose SQL queries. Let’s rewrite the function using psycopg.sql.SQL():

from psycopg2 import sql

def count_rows(table_name: str) -> int:
    with connection.cursor() as cursor:
        stmt = sql.SQL("""
            SELECT
                count(*)
            FROM
                {table_name}
        """).format(
            table_name = sql.Identifier(table_name),
        )
        cursor.execute(stmt)
        result = cursor.fetchone()

    rowcount, = result
    return rowcount

There are two differences in this implementation. First, you used sql.SQL() to compose the query. Then, you used sql.Identifier() to annotate the argument value table_name. (An identifier is a column or table name.)

Note: Users of the popular package django-debug-toolbar might get an error in the SQL panel for queries composed with psycopg.sql.SQL(). A fix is expected for release in version 2.0.

Now, try executing the function on the users table:

>>>
>>> count_rows('users')
2

Great! Next, let’s see what happens when the table does not exist:

>>>
>>> count_rows('foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in count_rows
psycopg2.errors.UndefinedTable: relation "foo" does not exist
LINE 5:                 "foo"
                        ^

The function throws the UndefinedTable exception. In the following steps, you’ll use this exception as an indication that your function is safe from a Python SQL injection attack.

Note: The exception UndefinedTable was added in psycopg2 version 2.8. If you’re working with an earlier version of Psycopg, then you’ll get a different exception.

To put it all together, add an option to count rows in the table up to a certain limit. This feature might be useful for very large tables. To implement this, add a LIMIT clause to the query, along with query parameters for the limit’s value:

from psycopg2 import sql

def count_rows(table_name: str, limit: int) -> int:
    with connection.cursor() as cursor:
        stmt = sql.SQL("""
            SELECT
                COUNT(*)
            FROM (
                SELECT
                    1
                FROM
                    {table_name}
                LIMIT
                    {limit}
            ) AS limit_query
        """).format(
            table_name = sql.Identifier(table_name),
            limit = sql.Literal(limit),
        )
        cursor.execute(stmt)
        result = cursor.fetchone()

    rowcount, = result
    return rowcount

In this code block, you annotated limit using sql.Literal(). As in the previous example, psycopg will bind all query parameters as literals when using the simple approach. However, when using sql.SQL(), you need to explicitly annotate each parameter using either sql.Identifier() or sql.Literal().

Note: Unfortunately, the Python API specification does not address the binding of identifiers, only literals. Psycopg is the only popular adapter that added the ability to safely compose SQL with both literals and identifiers. This fact makes it even more important to pay close attention when binding identifiers.

Execute the function to make sure that it works:

>>>
>>> count_rows('users', 1)
1
>>> count_rows('users', 10)
2

Now that you see the function is working, make sure it’s also safe:

>>>
>>> count_rows("(select 1) as foo; update users set admin = true where name = 'haki'; --", 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 18, in count_rows
psycopg2.errors.UndefinedTable: relation "(select 1) as foo; update users set admin = true where name = '" does not exist
LINE 8:                     "(select 1) as foo; update users set adm...
                            ^

This traceback shows that psycopg escaped the value, and the database treated it as a table name. Since a table with this name doesn’t exist, an UndefinedTable exception was raised and you were not hacked!

Conclusion

You’ve successfully implemented a function that composes dynamic SQL without putting your system at risk for Python SQL injection! You’ve used both literals and identifiers in your query without compromising security.

You’ve learned:

  • What Python SQL injection is and how it can be exploited
  • How to prevent Python SQL injection using query parameters
  • How to safely compose SQL statements that use literals and identifiers as parameters

You’re now able to create programs that can withstand attacks from the outside. Go forth and thwart the hackers!


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]



from Planet Python
via read more

Python Software Foundation: Grants Awarded for Python in Education

The Python Software Foundation has been asked about Python in education quite a bit recently. People have asked, “Is there an official curriculum we can use?”, “Are there online resources?”, “Are there efforts happening to improve Python on mobile?”, and so on.

9 years ago we instituted the Education Summit at PyCon US where educators as well as students work together on initiatives and obstacles. Earlier this year we decided we needed to do more. In November of 2018, the PSF created the Python in Education Board Committee and it was tasked with finding initiatives to fund to help improve the presence of Python in education.

In January of this year, the Python in Education Board Committee launched a “request for ideas” phase taking suggestions from the community on what we should focus our funding on. After the RFI period, we came up with 3 areas of education we wanted to focus on and asked to receive grant proposals on the following: resources (curriculums, evaluations, studies, multidisciplinary projects), localization (primarily translations), and mobile (development on mobile devices).

We are happy to publish more details on the grants the PSF approved from this initiative!

Beeware

The BeeWare Project wants to make it possible for all Python developers to write native apps for desktop and mobile platforms. Most desktop operating systems and iOS are supported already, but Android needs attention. Since Android users outnumber other mobile OS users worldwide by over 3 to 1, we determined it is important to fund this project. Beeware was awarded a $50,000 grant to help improve Python on Android. Phase one will be starting soon with this set of goals:

  1. A port of the CPython runtime to Android, delivered as a binary library ready to install into an Android project.
  2. A JNI-based library for bridging between the Android runtime and the CPython runtime.
  3. A template for a Gradle project that can be used to deploy Python code on Android devices. 

Beeware announced that they are looking for contractors to help with the work. Check out their blog post for more information.

Python in Education Website

Educational resources are in demand.  The PSF awarded a grant of $12,000 USD to Meg Ray, to work on creating a Python in Education website where we can curate educational information from all over the world. Meg will begin by collecting resources and after auditing the shared information, she will work on organizing it on an official PSF webpage. This work will begin in October of 2019 so please keep an eye out for updates via tweets and blogs!

Friendly-tracebacks

Lastly is a project called friendly-tracebacks. This project is not in need of financial support but is asking the PSF to help publicize it.  Friendly-traceback aims to provide simplified tracebacks translated into as many languages as possible. The project maintainer is looking for volunteers to help with tasks such as documenting possible SyntaxError use cases and documenting exceptions that haven't already been covered. Read more on their blog for the full call to action from the maintainer.


We hope to continue this initiative yearly! Companies that are passionate about supporting Python in Education should get in touch; we can't continue our work without your support!  As a non-profit organization, the PSF depends on sponsorships and donations to support the Python community.

Donate to the PSF: https://www.python.org/psf/donations/
Sponsor the PSF: https://www.python.org/psf/sponsorship/


from Planet Python
via read more

Grants Awarded for Python in Education

The Python Software Foundation has been asked about Python in education quite a bit recently. People have asked, “Is there an official curriculum we can use?”, “Are there online resources?”, “Are there efforts happening to improve Python on mobile?”, and so on.

9 years ago we instituted the Education Summit at PyCon US where educators as well as students work together on initiatives and obstacles. Earlier this year we decided we needed to do more. In November of 2018, the PSF created the Python in Education Board Committee and it was tasked with finding initiatives to fund to help improve the presence of Python in education.

In January of this year, the Python in Education Board Committee launched a “request for ideas” phase taking suggestions from the community on what we should focus our funding on. After the RFI period, we came up with 3 areas of education we wanted to focus on and asked to receive grant proposals on the following: resources (curriculums, evaluations, studies, multidisciplinary projects), localization (primarily translations), and mobile (development on mobile devices).

We are happy to publish more details on the grants the PSF approved from this initiative!

Beeware

The BeeWare Project wants to make it possible for all Python developers to write native apps for desktop and mobile platforms. Most desktop operating systems and iOS are supported already, but Android needs attention. Since Android users outnumber other mobile OS users worldwide by over 3 to 1, we determined it is important to fund this project. Beeware was awarded a $50,000 grant to help improve Python on Android. Phase one will be starting soon with this set of goals:

  1. A port of the CPython runtime to Android, delivered as a binary library ready to install into an Android project.
  2. A JNI-based library for bridging between the Android runtime and the CPython runtime.
  3. A template for a Gradle project that can be used to deploy Python code on Android devices. 

Beeware announced that they are looking for contractors to help with the work. Check out their blog post for more information.

Python in Education Website

Educational resources are in demand.  The PSF awarded a grant of $12,000 USD to Meg Ray, to work on creating a Python in Education website where we can curate educational information from all over the world. Meg will begin by collecting resources and after auditing the shared information, she will work on organizing it on an official PSF webpage. This work will begin in October of 2019 so please keep an eye out for updates via tweets and blogs!

Friendly-tracebacks

Lastly is a project called friendly-tracebacks. This project is not in need of financial support but is asking the PSF to help publicize it.  Friendly-traceback aims to provide simplified tracebacks translated into as many languages as possible. The project maintainer is looking for volunteers to help with tasks such as documenting possible SyntaxError use cases and documenting exceptions that haven't already been covered. Read more on their blog for the full call to action from the maintainer.


We hope to continue this initiative yearly! Companies that are passionate about supporting Python in Education should get in touch; we can't continue our work without your support!  As a non-profit organization, the PSF depends on sponsorships and donations to support the Python community.

Donate to the PSF: https://www.python.org/psf/donations/
Sponsor the PSF: https://www.python.org/psf/sponsorship/


from Python Software Foundation News
via read more

Webinar: “React+TypeScript+TDD in PyCharm” with Paul Everitt

ReactJS is wildly popular and thus wildly supported. TypeScript is increasingly popular, and thus increasingly supported.

The two together? Not as much. Given that they both change quickly, it’s hard to find accurate learning materials.

React+TypeScript, with PyCharm? That three-part combination is the topic of this webinar. We’ll show a little about a lot. Meaning, the key steps to getting productive, in PyCharm, for React projects using TypeScript. Along the way we’ll show test-driven development and emphasize tips-and-tricks in the IDE.

  • Wednesday, October 16
  • 6:00 PM – 7:00 PM CEST (12:00 PM – 1:00 PM EDT)
  • Aimed at intermediate web developers familiar with React
  • Register here

webinar2019_watch the recording-171

About This Webinar

This webinar is based on a 12 part tutorial with write-ups, videos, and working code for each step. The tutorial covers: getting started with Jest testing, debugging, TSX, functional components, sharing props with types, class based components, interfaces, testing event handlers, and “dumb” components.

You could, of course, skip this webinar and go through the material. Or you could go through the material and use the webinar to ask questions. Either way, we’ll give a quick treatment of each topic.

As a note, I’ll be doing this same webinar, but in webinars with IntelliJ and Rider, later in the year.

One final point: the tutorial and this webinar teach React+TS while sitting in tests, rather than the browser. It’s a productive way to work and makes for a good learning experience.

Speaking To You

Paul is the PyCharm Developer Advocate at JetBrains. Before that, Paul was a co-founder of Zope Corporation, taking the first open source application server through $14M of funding. Paul has bootstrapped both the Python Software Foundation and the Plone Foundation. Paul was an officer in the US Navy, starting www.navy.mil in 1993.



from PyCharm Blog
read more

Preventing SQL Injection Attacks With Python

Every few years, the Open Web Application Security Project (OWASP) ranks the most critical web application security risks. Since the first report, injection risks have always been on top. Among all injection types, SQL injection is one of the most common attack vectors, and arguably the most dangerous. As Python is one of the most popular programming languages in the world, knowing how to protect against Python SQL injection is critical.

In this tutorial, you’re going to learn:

  • What Python SQL injection is and how to prevent it
  • How to compose queries with both literals and identifiers as parameters
  • How to safely execute queries in a database

This tutorial is suited for users of all database engines. The examples here use PostgreSQL, but the results can be reproduced in other database management systems (such as SQLite, MySQL, Microsoft SQL Server, Oracle, and so on).

Understanding Python SQL Injection

SQL Injection attacks are such a common security vulnerability that the legendary xkcd webcomic devoted a comic to it:

A humorous webcomic by xkcd about the potential effect of SQL injection
"Exploits of a Mom" (Image: xkcd)

Generating and executing SQL queries is a common task. However, companies around the world often make horrible mistakes when it comes to composing SQL statements. While the ORM layer usually composes SQL queries, sometimes you have to write your own.

When you use Python to execute these queries directly into a database, there’s a chance you could make mistakes that might compromise your system. In this tutorial, you’ll learn how to successfully implement functions that compose dynamic SQL queries without putting your system at risk for Python SQL injection.

Setting Up a Database

To get started, you’re going to set up a fresh PostgreSQL database and populate it with data. Throughout the tutorial, you’ll use this database to witness firsthand how Python SQL injection works.

Creating a Database

First, open your shell and create a new PostgreSQL database owned by the user postgres:

$ createdb -O postgres psycopgtest

Here you used the command line option -O to set the owner of the database to the user postgres. You also specified the name of the database, which is psycopgtest.

Your new database is ready to go! You can connect to it using psql:

$ psql -U postgres -d psycopgtest
psql (11.2, server 10.5)
Type "help" for help.

You’re now connected to the database psycopgtest as the user postgres. This user is also the database owner, so you’ll have read permissions on every table in the database.

Creating a Table With Data

Next, you need to create a table with some user information and add data to it:

psycopgtest=# CREATE TABLE users (
    username varchar(30),
    admin boolean
);
CREATE TABLE

psycopgtest=# INSERT INTO users
    (username, admin)
VALUES
    ('ran', true),
    ('haki', false);
INSERT 0 2

psycopgtest=# SELECT * FROM users;
 username | admin
----------+-------
 ran      | t
 haki     | f
(2 rows)

The table has two columns: username and admin. The admin column indicates whether or not a user has administrative privileges. Your goal is to target the admin field and try to abuse it.

Setting Up a Python Virtual Environment

Now that you have a database, it’s time to set up your Python environment. For step-by-step instructions on how to do this, check out Python Virtual Environments: A Primer.

Create your virtual environment in a new directory:

~/src $ mkdir psycopgtest
~/src $ cd psycopgtest
~/src/psycopgtest $ python3 -m venv venv

After you run this command, a new directory called venv will be created. This directory will store all the packages you install inside the virtual environment.

Connecting to the Database

To connect to a database in Python, you need a database adapter. Most database adapters follow version 2.0 of the Python Database API Specification PEP 249. Every major database engine has a leading adapter:

Database Adapter
PostgreSQL Psycopg
SQLite sqlite3
Oracle cx_oracle
MySql MySQLdb

To connect to a PostgreSQL database, you’ll need to install Psycopg, which is the most popular adapter for PostgreSQL in Python. Django ORM uses it by default, and it’s also supported by SQLAlchemy.

In your terminal, activate the virtual environment and use pip to install psycopg:

~/src/psycopgtest $ source venv/bin/activate
~/src/psycopgtest $ python -m pip install psycopg2>=2.8.0
Collecting psycopg2
  Using cached https://....
  psycopg2-2.8.2.tar.gz
Installing collected packages: psycopg2
  Running setup.py install for psycopg2 ... done
Successfully installed psycopg2-2.8.2

Now you’re ready to create a connection to your database. Here’s the start of your Python script:

import psycopg2

connection = psycopg2.connect(
    host="localhost",
    database="psycopgtest",
    user="postgres",
    password=None,
)
connection.set_session(autocommit=True)

You used psycopg2.connect() to create the connection. This function accepts the following arguments:

  • host is the IP address or the DNS of the server where your database is located. In this case, the host is your local machine, or localhost.

  • database is the name of the database to connect to. You want to connect to the database you created earlier, psycopgtest.

  • user is a user with permissions for the database. In this case, you want to connect to the database as the owner, so you pass the user postgres.

  • password is the password for whoever you specified in user. In most development environments, users can connect to the local database without a password.

After setting up the connection, you configured the session with autocommit=True. Activating autocommit means you won’t have to manually manage transactions by issuing a commit or rollback. This is the default behavior in most ORMs. You use this behavior here as well so that you can focus on composing SQL queries instead of managing transactions.

Executing a Query

Now that you have a connection to the database, you’re ready to execute a query:

>>>
>>> with connection.cursor() as cursor:
...     cursor.execute('SELECT COUNT(*) FROM users')
...     result = cursor.fetchone()
... print(result)
(2,)

You used the connection object to create a cursor. Just like a file in Python, cursor is implemented as a context manager. When you create the context, a cursor is opened for you to use to send commands to the database. When the context exits, the cursor closes and you can no longer use it.

While inside the context, you used cursor to execute a query and fetch the results. In this case, you issued a query to count the rows in the users table. To fetch the result from the query, you executed cursor.fetchone() and received a tuple. Since the query can only return one result, you used fetchone(). If the query were to return more than one result, then you’d need to either iterate over cursor or use one of the other fetch* methods.

Using Query Parameters in SQL

In the previous section, you created a database, established a connection to it, and executed a query. The query you used was static. In other words, it had no parameters. Now you’ll start to use parameters in your queries.

First, you’re going to implement a function that checks whether or not a user is an admin. is_admin() accepts a username and returns that user’s admin status:

# BAD EXAMPLE. DON'T DO THIS!
def is_admin(username: str) -> bool:
    with connection.cursor() as cursor:
        cursor.execute("""
            SELECT
                admin
            FROM
                users
            WHERE
                username = '%s'
        """ % username)
        result = cursor.fetchone()
    admin, = result
    return admin

This function executes a query to fetch the value of the admin column for a given username. You used fetchone() to return a tuple with a single result. Then, you unpacked this tuple into the variable admin. To test your function, check some usernames:

>>>
>>> is_admin('haki')
False
>>> is_admin('ran')
True

So far so good. The function returned the expected result for both users. But what about non-existing user? Take a look at this Python traceback:

>>>
>>> is_admin('foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in is_admin
TypeError: cannot unpack non-iterable NoneType object

When the user does not exist, a TypeError is raised. This is because .fetchone() returns None when no results are found, and unpacking None raises a TypeError. The only place you can unpack a tuple is where you populate admin from result.

To handle non-existing users, create a special case for when result is None:

# BAD EXAMPLE. DON'T DO THIS!
def is_admin(username: str) -> bool:
    with connection.cursor() as cursor:
        cursor.execute("""
            SELECT
                admin
            FROM
                users
            WHERE
                username = '%s'
        """ % username)
        result = cursor.fetchone()

    if result is None:
        # User does not exist
        return False

    admin, = result
    return admin

Here, you’ve added a special case for handling None. If username does not exist, then the function should return False. Once again, test the function on some users:

>>>
>>> is_admin('haki')
False
>>> is_admin('ran')
True
>>> is_admin('foo')
False

Great! The function can now handle non-existing usernames as well.

Exploiting Query Parameters With Python SQL Injection

In the previous example, you used string interpolation to generate a query. Then, you executed the query and sent the resulting string directly to the database. However, there’s something you may have overlooked during this process.

Think back to the username argument you passed to is_admin(). What exactly does this variable represent? You might assume that username is just a string that represents an actual user’s name. As you’re about to see, though, an intruder can easily exploit this kind of oversight and cause major harm by performing Python SQL injection.

Try to check if the following user is an admin or not:

>>>
>>> is_admin("'; select true; --")
True

Wait… What just happened?

Let’s take another look at the implementation. Print out the actual query being executed in the database:

>>>
>>> print("select admin from users where username = '%s'" % "'; select true; --")
select admin from users where username = ''; select true; --'

The resulting text contains three statements. To understand exactly how Python SQL injection works, you need to inspect each part individually. The first statement is as follows:

select admin from users where username = '';

This is your intended query. The semicolon (;) terminates the query, so the result of this query does not matter. Next up is the second statement:

select true;

This statement was constructed by the intruder. It’s designed to always return True.

Lastly, you see this short bit of code:

--'

This snippet defuses anything that comes after it. The intruder added the comment symbol (--) to turn everything you might have put after the last placeholder into a comment.

When you execute the function with this argument, it will always return True. If, for example, you use this function in your login page, an intruder could log in with the username '; select true; --, and they’ll be granted access.

If you think this is bad, it could get worse! Intruders with knowledge of your table structure can use Python SQL injection to cause permanent damage. For example, the intruder can inject an update statement to alter the information in the database:

>>>
>>> is_admin('haki')
False
>>> is_admin("'; update users set admin = 'true' where username = 'haki'; select true; --")
True
>>> is_admin('haki')
True

Let’s break it down again:

';

This snippet terminates the query, just like in the previous injection. The next statement is as follows:

update users set admin = 'true' where username = 'haki';

This section updates admin to true for user haki.

Finally, there’s this code snippet:

select true; --

As in the previous example, this piece returns true and comments out everything that follows it.

Why is this worse? Well, if the intruder manages to execute the function with this input, then user haki will become an admin:

psycopgtest=# select * from users;
 username | admin
----------+-------
 ran      | t
 haki     | t
(2 rows)

The intruder no longer has to use the hack. They can just log in with the username haki. (If the intruder really wanted to cause harm, then they could even issue a DROP DATABASE command.)

Before you forget, restore haki back to its original state:

psycopgtest=# update users set admin = false where username = 'haki';
UPDATE 1

So, why is this happening? Well, what do you know about the username argument? You know it should be a string representing the username, but you don’t actually check or enforce this assertion. This can be dangerous! It’s exactly what attackers are looking for when they try to hack your system.

Crafting Safe Query Parameters

In the previous section, you saw how an intruder can exploit your system and gain admin permissions by using a carefully crafted string. The issue was that you allowed the value passed from the client to be executed directly to the database, without performing any sort of check or validation. SQL injections rely on this type of vulnerability.

Any time user input is used in a database query, there’s a possible vulnerability for SQL injection. The key to preventing Python SQL injection is to make sure the value is being used as the developer intended. In the previous example, you intended for username to be used as a string. In reality, it was used as a raw SQL statement.

To make sure values are used as they’re intended, you need to escape the value. For example, to prevent intruders from injecting raw SQL in the place of a string argument, you can escape quotation marks:

>>>
>>> # BAD EXAMPLE. DON'T DO THIS!
>>> username = username.replace("'", "''")

This is just one example. There are a lot of special characters and scenarios to think about when trying to prevent Python SQL injection. Lucky for you, modern database adapters, come with built-in tools for preventing Python SQL injection by using query parameters. These are used instead of plain string interpolation to compose a query with parameters.

Now that you have a better understanding of the vulnerability, you’re ready to rewrite the function using query parameters instead of string interpolation:

 1 def is_admin(username: str) -> bool:
 2     with connection.cursor() as cursor:
 3         cursor.execute("""
 4             SELECT
 5                 admin
 6             FROM
 7                 users
 8             WHERE
 9                 username = %(username)s
10         """, {
11             'username': username
12         })
13         result = cursor.fetchone()
14 
15     if result is None:
16         # User does not exist
17         return False
18 
19     admin, = result
20     return admin

Here’s what’s different in this example:

  • In line 9, you used a named parameter username to indicate where the username should go. Notice how the parameter username is no longer surrounded by single quotation marks.

  • In line 11, you passed the value of username as the second argument to cursor.execute(). The connection will use the type and value of username when executing the query in the database.

To test this function, try some valid and invalid values, including the dangerous string from before:

>>>
>>> is_admin('haki')
False
>>> is_admin('ran')
True
>>> is_admin('foo')
False
>>> is_admin("'; select true; --")
False

Amazing! The function returned the expected result for all values. What’s more, the dangerous string no longer works. To understand why, you can inspect the query generated by execute():

>>>
>>> with connection.cursor() as cursor:
...    cursor.execute("""
...        SELECT
...            admin
...        FROM
...            users
...        WHERE
...            username = %(username)s
...    """, {
...        'username': "'; select true; --"
...    })
...    print(cursor.query.decode('utf-8'))
SELECT
    admin
FROM
    users
WHERE
    username = '''; select true; --'

The connection treated the value of username as a string and escaped any characters that might terminate the string and introduce Python SQL injection.

Passing Safe Query Parameters

Database adapters usually offer several ways to pass query parameters. Named placeholders are usually the best for readability, but some implementations might benefit from using other options.

Let’s take a quick look at some of the right and wrong ways to use query parameters. The following code block shows the types of queries you’ll want to avoid:

# BAD EXAMPLES. DON'T DO THIS!
cursor.execute("SELECT admin FROM users WHERE username = '" + username + '");
cursor.execute("SELECT admin FROM users WHERE username = '%s' % username);
cursor.execute("SELECT admin FROM users WHERE username = '{}'".format(username));
cursor.execute(f"SELECT admin FROM users WHERE username = '{username}'");

Each of these statements passes username from the client directly to the database, without performing any sort of check or validation. This sort of code is ripe for inviting Python SQL injection.

In contrast, these types of queries should be safe for you to execute:

# SAFE EXAMPLES. DO THIS!
cursor.execute("SELECT admin FROM users WHERE username = %s'", (username, ));
cursor.execute("SELECT admin FROM users WHERE username = %(username)s", {'username': username});

In these statements, username is passed as a named parameter. Now, the database will use the specified type and value of username when executing the query, offering protection from Python SQL injection.

Using SQL Composition

So far you’ve used parameters for literals. Literals are values such as numbers, strings, and dates. But what if you have a use case that requires composing a different query—one where the parameter is something else, like a table or column name?

Inspired by the previous example, let’s implement a function that accepts the name of a table and returns the number of rows in that table:

# BAD EXAMPLE. DON'T DO THIS!
def count_rows(table_name: str) -> int:
    with connection.cursor() as cursor:
        cursor.execute("""
            SELECT
                count(*)
            FROM
                %(table_name)s
        """, {
            'table_name': table_name,
        })
        result = cursor.fetchone()

    rowcount, = result
    return rowcount

Try to execute the function on your users table:

>>>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in count_rows
psycopg2.errors.SyntaxError: syntax error at or near "'users'"
LINE 5:                 'users'
                        ^

The command failed to generate the SQL. As you’ve seen already, the database adapter treats the variable as a string or a literal. A table name, however, is not a plain string. This is where SQL composition comes in.

You already know it’s not safe to use string interpolation to compose SQL. Luckily, Psycopg provides a module called psycopg.sql to help you safely compose SQL queries. Let’s rewrite the function using psycopg.sql.SQL():

from psycopg2 import sql

def count_rows(table_name: str) -> int:
    with connection.cursor() as cursor:
        stmt = sql.SQL("""
            SELECT
                count(*)
            FROM
                {table_name}
        """).format(
            table_name = sql.Identifier(table_name),
        )
        cursor.execute(stmt)
        result = cursor.fetchone()

    rowcount, = result
    return rowcount

There are two differences in this implementation. First, you used sql.SQL() to compose the query. Then, you used sql.Identifier() to annotate the argument value table_name. (An identifier is a column or table name.)

Now, try executing the function on the users table:

>>>
>>> count_rows('users')
2

Great! Next, let’s see what happens when the table does not exist:

>>>
>>> count_rows('foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in count_rows
psycopg2.errors.UndefinedTable: relation "foo" does not exist
LINE 5:                 "foo"
                        ^

The function throws the UndefinedTable exception. In the following steps, you’ll use this exception as an indication that your function is safe from a Python SQL injection attack.

To put it all together, add an option to count rows in the table up to a certain limit. This feature might be useful for very large tables. To implement this, add a LIMIT clause to the query, along with query parameters for the limit’s value:

from psycopg2 import sql

def count_rows(table_name: str, limit: int) -> int:
    with connection.cursor() as cursor:
        stmt = sql.SQL("""
            SELECT
                COUNT(*)
            FROM (
                SELECT
                    1
                FROM
                    {table_name}
                LIMIT
                    {limit}
            ) AS limit_query
        """).format(
            table_name = sql.Identifier(table_name),
            limit = sql.Literal(limit),
        )
        cursor.execute(stmt)
        result = cursor.fetchone()

    rowcount, = result
    return rowcount

In this code block, you annotated limit using sql.Literal(). As in the previous example, psycopg will bind all query parameters as literals when using the simple approach. However, when using sql.SQL(), you need to explicitly annotate each parameter using either sql.Identifier() or sql.Literal().

Execute the function to make sure that it works:

>>>
>>> count_rows('users', 1)
1
>>> count_rows('users', 10)
2

Now that you see the function is working, make sure it’s also safe:

>>>
>>> count_rows("(select 1) as foo; update users set admin = true where name = 'haki'; --", 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 18, in count_rows
psycopg2.errors.UndefinedTable: relation "(select 1) as foo; update users set admin = true where name = '" does not exist
LINE 8:                     "(select 1) as foo; update users set adm...
                            ^

This traceback shows that psycopg escaped the value, and the database treated it as a table name. Since a table with this name doesn’t exist, an UndefinedTable exception was raised and you were not hacked!

Conclusion

You’ve successfully implemented a function that composes dynamic SQL without putting your system at risk for Python SQL injection! You’ve used both literals and identifiers in your query without compromising security.

You’ve learned:

  • What Python SQL injection is and how it can be exploited
  • How to prevent Python SQL injection using query parameters
  • How to safely compose SQL statements that use literals and identifiers as parameters

You’re now able to create programs that can withstand attacks from the outside. Go forth and thwart the hackers!


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]



from Real Python
read more

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production. from Planet Python via read...