Tuesday, November 27, 2018

Stack Abuse: Asynchronous vs Synchronous Python Performance Analysis

Introduction

This article is the second part of a series on using Python for developing asynchronous web applications. The first part provides more in-depth coverage of concurrency in Python and asyncio, as well as aiohttp.

If you'd like to read more about Asynchronous Python for Web Development, we've got it covered.

Due to the non-blocking nature of asynchronous libraries like aiohttp, we would hope to make and handle more requests in a given amount of time than analogous synchronous code can. This is because asynchronous code can rapidly switch between tasks, minimizing the time spent idle while waiting for I/O.
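To make the idea concrete, here is a minimal sketch that uses asyncio.sleep as a stand-in for network I/O (no real requests are made, and the helper names are made up for illustration). It assumes Python 3.7+ for asyncio.run. Five simulated waits of 0.1 seconds are run one after another and then concurrently:

```python
import asyncio
import time


async def fake_io(delay):
    # asyncio.sleep stands in for a network call: it yields control to the
    # event loop instead of blocking the thread.
    await asyncio.sleep(delay)
    return delay


async def run_sequentially(delays):
    # Each wait finishes before the next one starts.
    return [await fake_io(d) for d in delays]


async def run_concurrently(delays):
    # All waits overlap; the event loop switches between them.
    return await asyncio.gather(*(fake_io(d) for d in delays))


def timed(coro_fn, delays):
    # Wall-clock time for running one of the coroutines above to completion.
    start = time.perf_counter()
    asyncio.run(coro_fn(delays))
    return time.perf_counter() - start


delays = [0.1] * 5
sequential_s = timed(run_sequentially, delays)
concurrent_s = timed(run_concurrently, delays)
print(f"sequential: {sequential_s:.2f}s, concurrent: {concurrent_s:.2f}s")
```

The sequential run should take roughly the sum of the delays, while the concurrent run should take roughly as long as the single longest delay.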

Client-Side vs Server-Side Performance

Testing client-side performance of an asynchronous library like aiohttp is relatively straightforward. We choose some website as reference, and then make a certain number of requests, timing how long it takes our code to complete them. We'll be looking at the relative performance of aiohttp and requests when making requests to https://example.com.

Testing server-side performance is a little trickier. Libraries like aiohttp come with built-in development servers, which are fine for testing routes on a local network. However, these development servers are not suited to deploying applications on the public web, as they cannot handle the load expected of a publicly available website, and they are not good at serving up static assets, like JavaScript, CSS, and image files.

In order to get a better idea of the relative performance of aiohttp and an analogous synchronous web framework, we're going to re-implement our web app using Flask and then we'll compare development and production servers for both implementations.

For the production server, we're going to be using gunicorn.

Client-Side: aiohttp vs requests

For a traditional, synchronous approach, we just use a simple for loop. Though, before you run the code, make sure to install the requests module:

$ pip install --user requests

With that out of the way, let's go ahead and implement it in a more traditional manner:

# multiple_sync_requests.py
import requests  
def main():  
    n_requests = 100
    url = "https://example.com"
    session = requests.Session()
    for i in range(n_requests):
        print(f"making request {i} to {url}")
        resp = session.get(url)
        if resp.status_code == 200:
            pass

main()  

Analogous asynchronous code is a little more complicated, though. Making multiple requests with aiohttp leverages the asyncio.gather function to run the requests concurrently:

# multiple_async_requests.py
import asyncio  
import aiohttp

async def make_request(session, req_n):  
    url = "https://example.com"
    print(f"making request {req_n} to {url}")
    async with session.get(url) as resp:
        if resp.status == 200:
            await resp.text()

async def main():  
    n_requests = 100
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *[make_request(session, i) for i in range(n_requests)]
        )

# asyncio.run (Python 3.7+) creates the event loop, runs main(), and closes the loop
asyncio.run(main())
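One property of asyncio.gather worth knowing: it returns results in the order the awaitables were passed in, not the order in which they finished. The sketch below illustrates this with a hypothetical fake_request coroutine that sleeps for a random time instead of calling session.get (Python 3.7+ assumed for asyncio.run):

```python
import asyncio
import random


async def fake_request(req_n):
    # Hypothetical stand-in for session.get(): sleeps for a random short time.
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return req_n, 200


async def main():
    results = await asyncio.gather(*(fake_request(i) for i in range(10)))
    # Results come back in the order the coroutines were passed in,
    # regardless of which one finished first.
    return results


results = asyncio.run(main())
print(results)
```

This makes it straightforward to match each response back to the request that produced it.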

Running both synchronous and asynchronous code with the bash time utility:

me@local:~$ time python multiple_sync_requests.py  
real    0m13.112s  
user    0m1.212s  
sys     0m0.053s  

me@local:~$ time python multiple_async_requests.py  
real    0m1.277s  
user    0m0.695s  
sys     0m0.054s  
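The time utility measures the whole process, including interpreter start-up and imports. To time only the request loop, something like the following sketch could be used. The fake_request function is a hypothetical placeholder that sleeps instead of touching the network; a real call such as session.get(url) would go in its place:

```python
import time


def time_requests(make_request, n_requests):
    """Return the wall-clock seconds spent on n_requests calls."""
    start = time.perf_counter()
    for i in range(n_requests):
        make_request(i)
    return time.perf_counter() - start


def fake_request(i):
    # Hypothetical stand-in for session.get(url); sleeps to mimic latency.
    time.sleep(0.01)


elapsed = time_requests(fake_request, 20)
print(f"20 requests took {elapsed:.3f}s ({elapsed / 20 * 1000:.1f} ms each)")
```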

The concurrent/asynchronous code is far faster. But what happens if we multi-thread the synchronous code? Could it match the speed of concurrent code?

# multiple_sync_request_threaded.py
import threading  
import argparse  
import requests

def create_parser():  
    parser = argparse.ArgumentParser(
        description="Specify the number of threads to use"
    )

    parser.add_argument("-nt", "--n_threads", default=1, type=int)

    return parser

def make_requests(session, n, url, name=""):  
    for i in range(n):
        print(f"{name}: making request {i} to {url}")
        resp = session.get(url)
        if resp.status_code == 200:
            pass

def main():  
    parsed = create_parser().parse_args()

    n_requests = 100
    n_requests_per_thread = n_requests // parsed.n_threads

    url = "https://example.com"
    session = requests.Session()

    threads = [
        threading.Thread(
            target=make_requests,
            args=(session, n_requests_per_thread, url, f"thread_{i}")
        ) for i in range(parsed.n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

main()  

Running this rather verbose piece of code will yield:

me@local:~$ time python multiple_sync_request_threaded.py -nt 10  
real    0m2.170s  
user    0m0.942s  
sys     0m0.104s  

And we can increase performance by using more threads, but returns diminish rapidly:

me@local:~$ time python multiple_sync_request_threaded.py -nt 20  
real    0m1.714s  
user    0m1.126s  
sys     0m0.119s  
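One detail of the threaded script is worth noting: n_requests // parsed.n_threads rounds down, so a thread count that does not divide 100 evenly would send fewer than 100 requests in total. The runs above use 10 and 20 threads, which divide evenly, so their timings are unaffected. A small sketch of one way to spread the remainder (split_requests is a made-up helper, not part of the original script):

```python
def split_requests(n_requests, n_threads):
    """Split n_requests into n_threads chunk sizes that sum to n_requests."""
    base, extra = divmod(n_requests, n_threads)
    # The first `extra` threads each take one additional request.
    return [base + 1 if i < extra else base for i in range(n_threads)]


chunks = split_requests(100, 8)
print(chunks)       # [13, 13, 13, 13, 12, 12, 12, 12]
print(sum(chunks))  # 100
```

Each thread would then be given its own entry from the list instead of a shared n_requests_per_thread value.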

By introducing threading, we can come close to matching the performance of the asynchronous code, at the cost of increased code complexity.

While the threaded version offers a similar response time, that alone doesn't justify complicating code that could be simple. Extra complexity and extra lines do not make code better.
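If threads are still preferred, the standard library's concurrent.futures module can trim some of the boilerplate. The following is only a sketch: fetch is a hypothetical stand-in that sleeps rather than calling session.get, so it does not reproduce the timings above.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch(req_n):
    # Hypothetical stand-in for session.get(url); swap in a real call to test.
    time.sleep(0.01)
    return 200


def run_pool(n_requests, n_threads):
    # The pool hands out one request at a time, so no manual chunking is needed.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(fetch, range(n_requests)))


statuses = run_pool(100, 10)
print(len(statuses), set(statuses))
```

The pool also removes the need for the explicit start and join loops, since leaving the with block waits for all submitted work to finish.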

Server-Side: aiohttp vs Flask

We'll use the Apache Benchmark (ab) tool to test the performance of different servers.

With ab we can specify the total number of requests to make, in addition to the number of concurrent requests to make.

Before we can start testing, we have to reimplement our planet tracker app (from the previous article) using a synchronous framework. We'll use Flask, as its API is similar to aiohttp's (in fact, the aiohttp routing API is modeled on Flask's):

# flask_app.py
from flask import Flask, jsonify, render_template, request

from planet_tracker import PlanetTracker

__all__ = ["app"]

app = Flask(__name__, static_url_path="",  
            static_folder="./client",
            template_folder="./client")

@app.route("/planets/<planet_name>", methods=["GET"])
def get_planet_ephmeris(planet_name):  
    data = request.args
    try:
        geo_location_data = {
            "lon": str(data["lon"]),
            "lat": str(data["lat"]),
            "elevation": float(data["elevation"])
        }
    except KeyError as err:
        # default to Greenwich observatory
        geo_location_data = {
            "lon": "-0.0005",
            "lat": "51.4769",
            "elevation": 0.0,
        }
    print(f"get_planet_ephmeris: {planet_name}, {geo_location_data}")
    tracker = PlanetTracker()
    tracker.lon = geo_location_data["lon"]
    tracker.lat = geo_location_data["lat"]
    tracker.elevation = geo_location_data["elevation"]
    planet_data = tracker.calc_planet(planet_name)
    return jsonify(planet_data)

@app.route('/')
def hello():  
    return render_template("index.html")

if __name__ == "__main__":  
    app.run(
        host="localhost",
        port=8000,
        threaded=True
    )
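Note that the view above only catches KeyError, so a request with a non-numeric elevation would raise an uncaught ValueError from float(). One possible way to handle both cases is sketched below as a plain function that works on any dict-like object; parse_geo_location and GREENWICH are names invented for this example and are not part of the original app.

```python
GREENWICH = {"lon": "-0.0005", "lat": "51.4769", "elevation": 0.0}


def parse_geo_location(args):
    """Build the location dict from query args, or fall back to Greenwich."""
    try:
        return {
            "lon": str(args["lon"]),
            "lat": str(args["lat"]),
            "elevation": float(args["elevation"]),
        }
    except (KeyError, ValueError):
        # Missing key or non-numeric elevation: use the default location.
        return dict(GREENWICH)


print(parse_geo_location({"lon": "145.051", "lat": "-39.754", "elevation": "0"}))
print(parse_geo_location({"lon": "145.051"}))
print(parse_geo_location({"lon": "1", "lat": "2", "elevation": "high"}))
```

Inside the Flask view, request.args could be passed to such a helper in place of the inline try/except.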

If you're jumping in without reading the previous article, we have to set up our project a little before testing. I've put all the Python server code in a directory planettracker, itself a sub-directory of my home folder.

me@local:~/planettracker$ ls  
planet_tracker.py  
flask_app.py  
aiohttp_app.py  

I strongly suggest that you visit the previous article and get familiar with the application we've already built before proceeding.

aiohttp and Flask Development Servers

Let's see how long it takes our servers to handle 1000 requests, made 20 at a time.

First, I'll open two terminal windows. In the first, I run the server:

# terminal window 1
me@local:~/planettracker$ pipenv run python aiohttp_app.py  

In the second, let's run ab:

# terminal window 2
me@local:~/planettracker$ ab -k -c 20 -n 1000 "localhost:8000/planets/mars?lon=145.051&lat=-39.754&elevation=0"  
...
Concurrency Level:      20  
Time taken for tests:   0.494 seconds  
Complete requests:      1000  
Failed requests:        0  
Keep-Alive requests:    1000  
Total transferred:      322000 bytes  
HTML transferred:       140000 bytes  
Requests per second:    2023.08 [#/sec] (mean)  
Time per request:       9.886 [ms] (mean)  
Time per request:       0.494 [ms] (mean, across all concurrent requests)  
Transfer rate:          636.16 [Kbytes/sec] received  
...

ab outputs a lot of information, and I've only displayed the most relevant part. Of these figures, the one we should pay the most attention to is the "Requests per second" field.
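When running many benchmarks, it can be handy to pull that one figure out of the report automatically. The sketch below parses a saved ab report with a regular expression; parse_requests_per_second is a hypothetical helper, and the sample text is a shortened copy of the output shown above.

```python
import re


def parse_requests_per_second(ab_output):
    """Return the 'Requests per second' value from ab output as a float."""
    match = re.search(r"^Requests per second:\s+([\d.]+)", ab_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Requests per second' line found")
    return float(match.group(1))


sample = """\
Concurrency Level:      20
Time taken for tests:   0.494 seconds
Complete requests:      1000
Requests per second:    2023.08 [#/sec] (mean)
Time per request:       9.886 [ms] (mean)
"""
print(parse_requests_per_second(sample))  # 2023.08
```

The value is essentially the number of completed requests divided by the time taken: 1000 / 0.494 is about 2024, which matches the reported figure up to the rounding of the displayed time.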

Now, after shutting down the server in the first window, let's fire up our Flask app:

# terminal window 1
me@local:~/planettracker$ pipenv run python flask_app.py  

Running the test script again:

# terminal window 2
me@local:~/planettracker$ ab -k -c 20 -n 1000 "localhost:8000/planets/mars?lon=145.051&lat=-39.754&elevation=0"  
...
Concurrency Level:      20  
Time taken for tests:   1.385 seconds  
Complete requests:      1000  
Failed requests:        0  
Keep-Alive requests:    0  
Total transferred:      210000 bytes  
HTML transferred:       64000 bytes  
Requests per second:    721.92 [#/sec] (mean)  
Time per request:       27.704 [ms] (mean)  
Time per request:       1.385 [ms] (mean, across all concurrent requests)  
Transfer rate:          148.05 [Kbytes/sec] received  
...

It looks like the aiohttp app is roughly 2.8x faster than the Flask app when each runs on its library's own development server.

What happens if we use gunicorn to serve up our apps?

aiohttp and Flask as Served by gunicorn

Before we can test our apps in production mode, we have to first install gunicorn and figure out how to run our apps using an appropriate gunicorn worker class. In order to test the Flask app we can use the standard gunicorn worker, but for aiohttp we have to use the gunicorn worker bundled with aiohttp. We can install gunicorn with pipenv:

me@local:~/planettracker$ pipenv install gunicorn  

We can run the aiohttp app with the appropriate gunicorn worker:

# terminal window 1
me@local:~/planettracker$ pipenv run gunicorn aiohttp_app:app --worker-class aiohttp.GunicornWebWorker  

Moving forward, when displaying ab test results I'm only going to show the "Requests per second" field for the sake of brevity:

# terminal window 2
me@local:~/planettracker$ ab -k -c 20 -n 1000 "localhost:8000/planets/mars?lon=145.051&lat=-39.754&elevation=0"  
...
Requests per second:    2396.24 [#/sec] (mean)  
...

Now let's see how the Flask app fares:

# terminal window 1
me@local:~/planettracker$ pipenv run gunicorn flask_app:app  

Testing with ab:

# terminal window 2
me@local:~/planettracker$ ab -k -c 20 -n 1000 "localhost:8000/planets/mars?lon=145.051&lat=-39.754&elevation=0"  
...
Requests per second:    1041.30 [#/sec] (mean)  
...

Using gunicorn definitely resulted in increased performance for both the aiohttp and Flask apps. The aiohttp app still performs better, although not by as wide a margin as with the development server.

gunicorn allows us to use multiple workers to serve up our apps. We can use the -w command line argument to tell gunicorn to spawn more worker processes. Using 4 workers results in a significant performance bump for our apps:

# terminal window 1
me@local:~/planettracker$ pipenv run gunicorn aiohttp_app:app -w 4 --worker-class aiohttp.GunicornWebWorker  

Testing with ab:

# terminal window 2
me@local:~/planettracker$ ab -k -c 20 -n 1000 "localhost:8000/planets/mars?lon=145.051&lat=-39.754&elevation=0"  
...
Requests per second:    2541.97 [#/sec] (mean)  
...

Moving on to the Flask version:

# terminal window 1
me@local:~/planettracker$ pipenv run gunicorn flask_app:app -w 4  

Testing with ab:

# terminal window 2
me@local:~/planettracker$ ab -k -c 20 -n 1000 "localhost:8000/planets/mars?lon=145.051&lat=-39.754&elevation=0"  
...
Requests per second:    1729.17 [#/sec] (mean)  
...

The Flask app saw a more significant boost in performance when using multiple workers!
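The command line options used in these runs can also be kept in a gunicorn configuration file, which is itself a Python module. The following is a sketch only: the file name and its location in the project directory are assumptions, while bind, workers, and worker_class are standard gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file name and location)

# Address to listen on; matches the localhost:8000 URL used with ab.
bind = "127.0.0.1:8000"

# Number of worker processes, equivalent to -w 4.
workers = 4

# Only needed for the aiohttp app; the Flask app runs on the default sync worker.
worker_class = "aiohttp.GunicornWebWorker"
```

It would then be passed with the -c option, for example: pipenv run gunicorn -c gunicorn.conf.py aiohttp_app:app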

Summarizing Results

Let's take a step back and look at the results of testing development and production servers for both aiohttp and Flask implementations of our planet tracker app in a table:

                                      aiohttp    Flask      % difference
Development server (Requests/sec)     2023.08    721.92     180.24
gunicorn (Requests/sec)               2396.24    1041.30    130.12
  % increase over development server  18.45      44.24
gunicorn -w 4 (Requests/sec)          2541.97    1729.17    47.01
  % increase over development server  25.65      139.52
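The percentage figures in the table follow directly from the measured requests per second. As a quick check, this sketch recomputes them with a made-up pct_increase helper; the input numbers are the ones reported by ab above.

```python
def pct_increase(new, old):
    """Percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100


results = {
    "development": {"aiohttp": 2023.08, "flask": 721.92},
    "gunicorn": {"aiohttp": 2396.24, "flask": 1041.30},
    "gunicorn -w 4": {"aiohttp": 2541.97, "flask": 1729.17},
}

# "% difference" column: how much faster aiohttp is than Flask in each setup.
for name, rps in results.items():
    diff = pct_increase(rps["aiohttp"], rps["flask"])
    print(f"{name}: aiohttp is {diff:.2f}% faster than Flask")

# "% increase over development server" rows, per framework.
dev = results["development"]
for name in ("gunicorn", "gunicorn -w 4"):
    for framework in ("aiohttp", "flask"):
        gain = pct_increase(results[name][framework], dev[framework])
        print(f"{name} / {framework}: {gain:.2f}% over development server")
```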

Conclusion

In this article, we've compared the performance of an asynchronous web application with that of its synchronous counterpart, using several tools to do so.

