Wednesday, September 30, 2020

LAAC Technology: Making Concurrent HTTP requests with Python AsyncIO

Introduction

Python 3.4 added the asyncio module to the standard library. Asyncio allows us to run IO-bound tasks asynchronously to increase the performance of our program. Common IO-bound tasks include calls to a database, reading and writing files to disk, and sending and receiving HTTP requests. A Django web application is a common example of an IO-bound application.
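To make the benefit concrete, here is a minimal sketch that uses asyncio.sleep as a stand-in for an IO-bound call. The names fake_io, run_sequentially, run_concurrently, and timed, as well as the 0.1 second delay, are assumptions made for illustration and are not part of the example program in this post.

```python
import asyncio
import time


async def fake_io(delay: float) -> float:
    # asyncio.sleep stands in for a network or disk wait (illustration only).
    await asyncio.sleep(delay)
    return delay


async def run_sequentially(delays):
    # Each await finishes before the next one starts.
    return [await fake_io(d) for d in delays]


async def run_concurrently(delays):
    # gather schedules every coroutine up front, so the waits overlap.
    return await asyncio.gather(*(fake_io(d) for d in delays))


def timed(coro_func, delays):
    # Run one of the coroutines above and measure the wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(coro_func(delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.1, 0.1, 0.1]
    _, sequential_seconds = timed(run_sequentially, delays)
    _, concurrent_seconds = timed(run_concurrently, delays)
    print(f"sequential: {sequential_seconds:.2f}s")
    print(f"concurrent: {concurrent_seconds:.2f}s")
```

The sequential version should take roughly the sum of the delays, while the concurrent version should take roughly as long as the longest single delay.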

We’ll demonstrate concurrent HTTP requests by fetching prices for stock tickers. The only third-party package we’ll use is httpx. Httpx is very similar to the popular requests package, but httpx supports asyncio.

Project Set Up

Requires Python 3.8+

  1. Create a project directory
  2. Create a virtual environment inside the directory
    • python3 -m venv async_http_venv
  3. Activate the virtual environment
    • source ./async_http_venv/bin/activate
  4. Install httpx
    • pip install httpx
  5. Copy the example code below into a Python file named async_http.py

Example Code

import argparse
import asyncio
import itertools
import pprint
from decimal import Decimal
from typing import List, Tuple

import httpx

YAHOO_FINANCE_URL = "https://query1.finance.yahoo.com/v8/finance/chart/{}"


async def fetch_price(
    ticker: str, client: httpx.AsyncClient
) -> Tuple[str, Decimal]:
    print(f"Making request for {ticker} price")
    # Control returns to the event loop here while we wait for the response.
    response = await client.get(YAHOO_FINANCE_URL.format(ticker))
    print(f"Received results for {ticker}")
    price = response.json()["chart"]["result"][0]["meta"]["regularMarketPrice"]
    return ticker, Decimal(price).quantize(Decimal("0.01"))


async def fetch_all_prices(tickers: List[str]) -> List[Tuple[str, Decimal]]:
    # One shared client for all requests; closed when the block exits.
    async with httpx.AsyncClient() as client:
        return await asyncio.gather(
            *map(fetch_price, tickers, itertools.repeat(client))
        )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-t",
        "--tickers",
        nargs="+",
        help="List of tickers separated by a space",
        required=True,
    )
    args = parser.parse_args()
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    result = asyncio.run(fetch_all_prices(args.tickers))
    pprint.pprint(result)

Test the Example Code

With the newly created virtual environment activated and the Python file ready, let’s run the program to test our setup.

python async_http.py -t VTSAX VTIAX IJS VSS AAPL ORCL GOOG MSFT FB

[Screenshot: terminal output of the first run]

If you look at the output, the “Received results” lines do not appear in the order the requests were made. In a synchronous program, the request for VTSAX would be made first and finish first. Only afterward would the next request, for VTIAX, start. In our asynchronous program, the requests are made back to back and finish out of order, whenever the API responds. Let’s run the script again with the same arguments and see what order the results arrive in.

[Screenshot: terminal output of the second run]

As you can see, in the first run we received results for IJS first, but in the second run the results for IJS came back fourth. Let’s walk through the code to see what our program does.

Walk Through

Let’s start with the fetch_all_prices function. The function starts by creating an AsyncClient that we’ll pass in every time we call fetch_price.

async with httpx.AsyncClient() as client:

Creating a client allows us to take advantage of HTTP connection pooling, which reuses existing TCP connections to the same host instead of opening a new one for each request. This reduces the overhead of each HTTP request. Additionally, we’re using an async with statement to automatically close our client when the block exits.
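To illustrate what async with does here, the sketch below uses a hypothetical stand-in class, FakeClient, instead of httpx itself so that it runs without network access. It only shows that the cleanup hook runs when the block exits; it does not model connection pooling.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for httpx.AsyncClient (illustration only)."""

    def __init__(self) -> None:
        self.is_closed = False

    async def __aenter__(self) -> "FakeClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs when the async with block exits, even if an exception was raised.
        self.is_closed = True


async def use_client() -> FakeClient:
    async with FakeClient() as client:
        # Inside the block the client is still open and can be shared.
        assert not client.is_closed
    return client


if __name__ == "__main__":
    client = asyncio.run(use_client())
    print(client.is_closed)
```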

Next, let’s look at our return statement.

return await asyncio.gather(
    *map(fetch_price, tickers, itertools.repeat(client))
)

First, we’re calling asyncio.gather, which accepts awaitables such as coroutines, tasks, and futures. In our case, we’re using an asterisk to unpack a map of fetch_price calls, each of which produces a coroutine object. To build that map, we pass the list of tickers along with itertools.repeat(client), which supplies the same client as the second argument for every ticker. Because calling an async function only creates a coroutine object without running it, the map gives us one coroutine per ticker, which asyncio.gather then runs concurrently.
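The map and itertools.repeat pairing can be checked in isolation. In this small sketch, pair is a hypothetical plain function standing in for fetch_price, and the client is just a placeholder string rather than a real AsyncClient.

```python
import itertools


def pair(ticker: str, client: str):
    # Hypothetical plain function standing in for fetch_price.
    return ticker, client


tickers = ["AAPL", "MSFT", "GOOG"]
client = "shared-client"  # placeholder value instead of a real AsyncClient

# map stops at the shortest iterable, so the endless repeat is safe here.
via_map = list(map(pair, tickers, itertools.repeat(client)))

# Equivalent generator expression, which some readers find easier to scan.
via_generator = list(pair(ticker, client) for ticker in tickers)

print(via_map)
```

Both forms produce one call per ticker with the same shared client, so the choice between them is mostly a matter of taste.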

Now let’s look at our fetch_price function.

response = await client.get(YAHOO_FINANCE_URL.format(ticker))

We’re using the AsyncClient that we passed in to make an HTTP GET request to Yahoo Finance. We use the await keyword here because this is where the IO happens. Once the program reaches this line, it makes the HTTP GET request and yields control to the event loop while the request finishes.
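The effect of yielding to the event loop can be reproduced without any network calls. The following sketch assumes a hypothetical fake_fetch coroutine that sleeps for a made-up delay in place of the HTTP round trip, and records the order in which the waits finish.

```python
import asyncio


async def fake_fetch(ticker: str, delay: float, finished: list) -> str:
    # asyncio.sleep stands in for the HTTP round trip (illustration only).
    await asyncio.sleep(delay)
    finished.append(ticker)  # record the order in which the waits end
    return ticker


async def main():
    finished = []
    results = await asyncio.gather(
        fake_fetch("VTSAX", 0.3, finished),
        fake_fetch("IJS", 0.1, finished),
        fake_fetch("AAPL", 0.2, finished),
    )
    return results, finished


if __name__ == "__main__":
    results, finished = asyncio.run(main())
    print("gather order:    ", results)
    print("completion order:", finished)
```

The finished list follows the delays, while the list returned by asyncio.gather follows the order of its arguments. This is why the "Received results" lines can print in any order even though the final list of prices stays in ticker order.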

price = response.json()["chart"]["result"][0]["meta"]["regularMarketPrice"]
return ticker, Decimal(price).quantize(Decimal("0.01"))

Once the request finishes, we extract the JSON from the response and pull out the price. Before returning, we convert the price to a Decimal and round it to two decimal places. We return the ticker alongside the price so the caller can tell which price belongs to which ticker.
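The extraction and rounding steps can be tried on their own. The payload below is a hypothetical dictionary shaped like the fields our code reads; the real response contains many more keys, and the price value is made up.

```python
from decimal import Decimal

# Hypothetical payload with only the keys the example code accesses.
sample = {"chart": {"result": [{"meta": {"regularMarketPrice": 123.456}}]}}

# Same nested lookup as in fetch_price.
price = sample["chart"]["result"][0]["meta"]["regularMarketPrice"]

# Decimal(float) captures the float's exact binary value; quantize then
# rounds it to two decimal places.
rounded = Decimal(price).quantize(Decimal("0.01"))

print(rounded)
```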

Final Thoughts

The package ecosystem around the asyncio module is still maturing. Httpx looks like a quality replacement for requests. Starlette and FastAPI are two promising ASGI-based web frameworks. Django has supported ASGI since version 3.0, and version 3.1 added asynchronous views. Finally, more libraries are being released with asyncio in mind. As of this writing, asyncio has not seen widespread usage, but I predict it will see a lot more adoption within the Python community over the next few years.


