Tuesday, March 30, 2021

Juri Pakaste: Swift networking with AsyncHTTPClient

When you need to access resources over HTTP in Swift, in most cases the answer is URLSession from Foundation. On the server side that's most probably not the right choice; there you are most likely running on SwiftNIO and you'll want something that integrates with it. On the command line it's a toss-up; on macOS URLSession is great, on other platforms… well, hope you don't run into any unimplemented corners.

I needed a command line tool that fetches a JSON file, parses it, downloads the files listed in the JSON file, and saves the JSON file too. I'm mostly on macOS so URLSession would have been fine, but I wanted to explore my options. SwiftNIO ships with a low-level HTTP client implementation, but that's not the right choice for a quick utility. The good news is there's also a higher-level implementation: AsyncHTTPClient. It's a lovely future-based (while we wait for async/await in Swift) asynchronous HTTP client that makes this task a breeze.

The format of the JSON manifest file looks like this:

{
    "files": [
        {
            "file": "filename1",
            "bytes": 42,
            "sha256": "8e079926d7340822e6b4c501811af5d1edc47d796b97f56c1cbe3177b47d588b"
        },

        {
            "file": "filename2",
            "bytes": 10,
            "sha256": "4998ab8c155e03ebf843d4adf51d702b9560dc0fbafe35405c826d9a76460289"
        }
    ]
}

And so on. That's just a couple of structs:

struct ManifestFile: Codable {
    let file: String
}

struct Manifest: Codable {
    let files: [ManifestFile]
}

I'll just ignore bytes and sha256. They are relevant in other contexts, but here they don't matter.
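To see that the extra keys really can be left out, here is a minimal sketch that decodes a shortened copy of the sample manifest with the two structs above. The inline JSON, the placeholder hashes, and the helper name decodeFileNames are made up for this example; the only behaviour it relies on is that JSONDecoder skips keys that have no matching property.

```swift
import Foundation

struct ManifestFile: Codable {
    let file: String
}

struct Manifest: Codable {
    let files: [ManifestFile]
}

// Shortened sample manifest; the sha256 values are placeholders.
let sampleJSON = """
{
    "files": [
        { "file": "filename1", "bytes": 42, "sha256": "placeholder1" },
        { "file": "filename2", "bytes": 10, "sha256": "placeholder2" }
    ]
}
"""

/// Hypothetical helper: decode a manifest string and return the listed file names.
func decodeFileNames(from json: String) throws -> [String] {
    // bytes and sha256 have no matching properties, so the decoder ignores them.
    let manifest = try JSONDecoder().decode(Manifest.self, from: Data(json.utf8))
    return manifest.files.map { $0.file }
}

// Fall back to an empty list if decoding fails, to keep the sketch short.
let fileNames = (try? decodeFileNames(from: sampleJSON)) ?? []
print(fileNames)
```

If the other fields become relevant later, adding let bytes: Int and let sha256: String to ManifestFile should be enough, since the property names match the JSON keys.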

Command line

Let's start by defining a command line interface. We'll use Swift Argument Parser:

import ArgumentParser
import Foundation

struct DownloadManifestFiles: ParsableCommand {
    static var configuration = CommandConfiguration(
        commandName: "manifestdl",
        abstract: "Download manifest files"
    )
    
    @Argument(help: "URL to the manifest")
    var url: String

    @Option(help: "Target directory, default to working directory")
    var directory: String = "."

    mutating func validate() throws {
        guard URL(string: self.url) != nil else {
            throw ValidationError("url \(self.url) was not valid")
        }
    }

    mutating func run() throws {
        guard let reallyURL = URL(string: self.url) else {
            throw ValidationError("URL \(self.url) was not valid")
        }
        let cwd = URL(fileURLWithPath: FileManager.default.currentDirectoryPath, isDirectory: true)
        let directory = URL(fileURLWithPath: self.directory, isDirectory: true, relativeTo: cwd)

        fatalError("TODO")
    }
}

DownloadManifestFiles.main()

Nice and easy. An interesting wrinkle is that ArgumentParser doesn't support URLs, probably because they're in Foundation and ArgumentParser uses only stdlib. And NSURL, which backs URL, is a class with far more responsibilities than is comfortable in a simple data wrapper.
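One thing to be aware of with the validate method above: URL(string:) also accepts relative strings such as manifest.json, so the check only catches input that can't be parsed at all. If you want something stricter, a sketch along these lines could work. parseHTTPURL is a hypothetical helper, not part of the tool as written, and requiring an http or https scheme plus a host is my assumption about what "valid" should mean here.

```swift
import Foundation

/// Hypothetical helper: returns a URL only if the string parses
/// and looks like an absolute HTTP(S) URL with a host.
func parseHTTPURL(_ string: String) -> URL? {
    guard let url = URL(string: string),
          let scheme = url.scheme?.lowercased(),
          scheme == "http" || scheme == "https",
          let host = url.host, !host.isEmpty
    else {
        return nil
    }
    return url
}

let accepted = parseHTTPURL("https://example.com/releases/manifest.json")
// Parses fine as a URL, but has no scheme, so the helper rejects it.
let relative = parseHTTPURL("manifest.json")
// Has a scheme and a host, but not one an HTTP client can fetch.
let wrongScheme = parseHTTPURL("ftp://example.com/manifest.json")

print(accepted != nil, relative == nil, wrongScheme == nil)
```

Inside validate you would then throw a ValidationError when the helper returns nil, instead of comparing URL(string:) against nil.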

Downloading…

Next we need the actual networking. Let's wrap it in a helper class:

import AsyncHTTPClient
import Foundation
import NIO
import NIOHTTP1

class Downloader {
    let httpClient = HTTPClient(eventLoopGroupProvider: .createNew)

    func syncShutdown() throws {
        try self.httpClient.syncShutdown()
    }

    func downloadListedFiles(url: URL, directory: URL) -> EventLoopFuture<Void> {
        fatalError("TODO")
    }
}

We don't really care about any details of the download; we just want to know whether it succeeded or not, hence EventLoopFuture<Void>. EventLoopFuture uses untyped errors, so we don't need to include an error in the type signature. That looks a bit odd next to Result and Combine's Publisher, but it does help when integrating with Swift's throwing functions.

Next let's implement the downloadListedFiles method.

class Downloader {
    /* … */
    func downloadListedFiles(url manifestURL: URL, directory: URL) -> EventLoopFuture<Void> {
        let future = self.downloadManifest(url: manifestURL)
            .flatMap { manifest, data in
                self.downloadManifestContent(
                    manifest: manifest,
                    manifestURL: manifestURL,
                    directory: directory
                ).map { data }
            }
            .flatMap { data in
                self.saveManifest(directory: directory, data: data)
            }
        return future
    }

    func downloadManifest(url: URL) -> EventLoopFuture<(manifest: Manifest, data: Data)> {
        fatalError("TODO")
    }

    func downloadManifestContent(
        manifest: Manifest,
        manifestURL: URL,
        directory: URL
    ) -> EventLoopFuture<Void> {
        fatalError("TODO")
    }

    func saveManifest(directory: URL, data: Data) -> EventLoopFuture<Void> {
        fatalError("TODO")
    }
}

That looks like an acceptable outline for it. Give downloadListedFiles a URL to a listing file and a directory to download to, and it'll download the file manifest, parse it, download the listed files, and finally save the manifest too. I'll fill in the blanks one by one.

First, let's see how downloadManifest should look.

func downloadManifest(url: URL) -> EventLoopFuture<(manifest: Manifest, data: Data)> {
    self.httpClient.get(url: url.absoluteString)
        .flatMapThrowing { response in
            guard var body = response.body,
                    let data = body.readData(length: body.readableBytes)
            else {
                throw MissingBodyError()
            }
            return (manifest: try JSONDecoder().decode(Manifest.self, from: data), data: data)
        }
}

As you can see, just like ArgumentParser, HTTPClient eschews URL as a Foundation type. Other than that, HTTPClient gives us a really easy interface. Just .get a String containing a URL, and then do whatever you need to do with a chaining method like .map, .flatMap, or as in this case, .flatMapThrowing.
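MissingBodyError isn't defined in the listing. Presumably it's a small custom error type; the sketch below shows one shape it could take, and that shape is my assumption. requireBody is a hypothetical stand-in for the guard in downloadManifest, using Data? instead of a ByteBuffer so the example needs nothing beyond Foundation.

```swift
import Foundation

/// Assumed shape of the error thrown when a response has no body.
struct MissingBodyError: Error, CustomStringConvertible {
    var description: String { "response had no body" }
}

/// Hypothetical stand-in for the guard in downloadManifest: unwrap or throw.
func requireBody(_ body: Data?) throws -> Data {
    guard let body = body else {
        throw MissingBodyError()
    }
    return body
}

// Capture both outcomes as Result values so they can be inspected.
let missingBodyResult = Result { try requireBody(nil) }
let presentBodyResult = Result { try requireBody(Data("{}".utf8)) }

if case .failure(let error) = missingBodyResult {
    print("Failed as expected: \(error)")
}
```

Because .flatMapThrowing turns a thrown error into a failed future, an error like this ends up surfacing from the wait() call at the end of the pipeline.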

Next we can tackle downloadManifestContent. It's the function that's responsible for downloading all the listed files.

func downloadManifestContent(
    manifest: Manifest,
    manifestURL: URL,
    directory: URL
) -> EventLoopFuture<Void> {
    let baseURL = manifestURL.deletingLastPathComponent()
    let requestFutures: [EventLoopFuture<Void>]
    do {
        requestFutures = try manifest.files.map { manifestFile in
            let localURL = directory.appendingPathComponent(manifestFile.file)
            try FileManager.default.createDirectory(
                at: localURL.deletingLastPathComponent(),
                withIntermediateDirectories: true,
                attributes: nil
            )
            let localPath = localURL.path
            // self.threadPool is the NIOThreadPool added to Downloader in the File I/O section.
            let delegate = try FileDownloadDelegate(path: localPath, pool: self.threadPool)
            let request = try HTTPClient.Request(
                url: baseURL.appendingPathComponent(manifestFile.file).absoluteString
            )
            return self.httpClient.execute(request: request, delegate: delegate)
                .futureResult
                .map { _ in () }
        }
    } catch {
        return self.httpClient.eventLoopGroup.next().makeFailedFuture(error)
    }
    return EventLoopFuture.andAllSucceed(requestFutures, on: self.eventLoopGroup.next())
}

var eventLoopGroup: EventLoopGroup {
    self.httpClient.eventLoopGroup
}

This one's not quite as simple, but it's not too bad. For each listed file we create a future for downloading it. The download to disk, as opposed to memory, happens with the help of FileDownloadDelegate, a delegate included in AsyncHTTPClient that can write downloads to disk and report progress. Then once we have a list of futures, we smash them all together with andAllSucceed. Again we care only about success, so Void is a perfectly fine value type.

One detail I need to point out here is the eventLoopGroup property. SwiftNIO works with EventLoops, and EventLoops are apparently usually threads. While we're working only with networking code it's probably not a problem to ask HTTPClient for its EventLoopGroup instance.
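The URL juggling in downloadManifestContent is easy to get wrong, so here is a small Foundation-only sketch of how the remote and local URLs are derived. The manifest URL, the target directory, and the listed name are hypothetical; the name containing a subdirectory is my guess at why the code creates the parent directory before each download.

```swift
import Foundation

// Hypothetical inputs for illustration.
let manifestURL = URL(string: "https://example.com/releases/manifest.json")!
let targetDirectory = URL(fileURLWithPath: "/tmp/manifestdl-example", isDirectory: true)
let listedFile = "assets/filename1"

// Remote side: drop "manifest.json", then append the listed name.
let baseURL = manifestURL.deletingLastPathComponent()
let remoteURL = baseURL.appendingPathComponent(listedFile)

// Local side: the same relative name under the target directory.
let localURL = targetDirectory.appendingPathComponent(listedFile)
// This is the directory that createDirectory has to ensure exists.
let parentDirectory = localURL.deletingLastPathComponent()

print(remoteURL.absoluteString)
print(localURL.path)
print(parentDirectory.path)
```

Note that this scheme assumes the listed files live next to the manifest on the server; a manifest URL with a query string, for example, would need different handling.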

File I/O

We have read the manifest and written the files listed in it to disk. One thing left to do: saving the manifest. Writing the file with SwiftNIO isn't quite as friendly as AsyncHTTPClient is, and if you were doing more of this you'd want to put a nicer façade on it, but here we just need it this once.

To prepare for this, let's first set up a bit of scaffolding. It feels cleaner to move the management of the EventLoopGroup to our own code now that we're using it for more than just the HTTP client, and we'll also need a thread pool for the file I/O.

class Downloader {
    let eventLoopGroup: EventLoopGroup // replaces the computed property
    let httpClient: HTTPClient
    let threadPool: NIOThreadPool

    init() {
        self.eventLoopGroup = NIOTSEventLoopGroup()
        self.httpClient = HTTPClient(eventLoopGroupProvider: .shared(self.eventLoopGroup))
        self.threadPool = NIOThreadPool(numberOfThreads: 1)

        self.threadPool.start()
    }

    func syncShutdown() throws {
        try self.httpClient.syncShutdown()
        try self.threadPool.syncShutdownGracefully()
        try self.eventLoopGroup.syncShutdownGracefully()
    }

    /* … */
}

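NIOTSEventLoopGroup lives in the NIOTransportServices module, so this needs an import NIOTransportServices, and forcing it here probably ties this code to Apple platforms. For portability, there are other implementations. Here's what AsyncHTTPClient does when you ask it to create the event loop group itself: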

#if canImport(Network)
    if #available(OSX 10.14, iOS 12.0, tvOS 12.0, watchOS 6.0, *) {
        self.eventLoopGroup = NIOTSEventLoopGroup()
    } else {
        self.eventLoopGroup = MultiThreadedEventLoopGroup(numberOfThreads: 1)
    }
#else
    self.eventLoopGroup = MultiThreadedEventLoopGroup(numberOfThreads: 1)
#endif

Doing something similar in your own code should help make this more cross-platform.

With that setup done, we can dive into the file writing itself.

func saveManifest(directory: URL, data: Data) -> EventLoopFuture<Void> {
    let io = NonBlockingFileIO(threadPool: self.threadPool)
    let eventLoop = self.eventLoopGroup.next()
    let buffer = ByteBuffer(data: data)
    return io
        .openFile(
            path: directory.appendingPathComponent("manifest.json").path,
            mode: .write,
            flags: .allowFileCreation(),
            eventLoop: eventLoop
        ).flatMap { handle in
            io.write(fileHandle: handle, buffer: buffer, eventLoop: eventLoop)
                .map { handle }
        }.flatMapThrowing { handle in
            try handle.close()
        }
}

It's two async operations and one synchronous one in a pipeline. Open file asynchronously, write file asynchronously, close it synchronously. The flatMaps can feel a little daunting if you're not used to them, as always with future libraries. But once you get used to them, it's pretty OK. Async/await will hopefully help.
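For comparison, and not as a replacement for the NIO pipeline above, this is roughly what the blocking Foundation equivalent looks like. saveManifestBlocking is a hypothetical name, and the scratch directory exists only so the sketch can be run on its own. It blocks the calling thread, so it shouldn't be called from an event loop.

```swift
import Foundation

/// Blocking counterpart to saveManifest; the name is hypothetical.
func saveManifestBlocking(directory: URL, data: Data) throws -> URL {
    let target = directory.appendingPathComponent("manifest.json")
    // .atomic writes to a temporary file first and then moves it into place.
    try data.write(to: target, options: .atomic)
    return target
}

// Try it out in a scratch directory under the system temporary directory.
let scratchDirectory = FileManager.default.temporaryDirectory
    .appendingPathComponent("manifestdl-\(UUID().uuidString)", isDirectory: true)
let payload = Data(#"{"files": []}"#.utf8)
var roundTripped: Data? = nil

do {
    try FileManager.default.createDirectory(
        at: scratchDirectory,
        withIntermediateDirectories: true
    )
    let written = try saveManifestBlocking(directory: scratchDirectory, data: payload)
    // Read the file back, then clean up the scratch directory.
    roundTripped = try Data(contentsOf: written)
    try FileManager.default.removeItem(at: scratchDirectory)
} catch {
    print("Blocking save failed: \(error)")
}
```

For a one-off manifest write in a command line tool the blocking version would probably be fine; the NIO version earns its keep when file writes have to share threads with in-flight network requests.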

After all that work we're ready to loop back to our run method. We left it calling fatalError after processing the arguments. Now we can finish it up:

mutating func run() throws {
    guard let reallyURL = URL(string: self.url) else {
        throw ValidationError("URL \(self.url) was not valid")
    }
    let cwd = URL(fileURLWithPath: FileManager.default.currentDirectoryPath, isDirectory: true)
    let directory = URL(fileURLWithPath: self.directory, isDirectory: true, relativeTo: cwd)

    let downloader = Downloader()
    let dlFuture = downloader.downloadListedFiles(url: reallyURL, directory: directory)

    defer {
        do {
            try downloader.syncShutdown()
        } catch {
            print("Error shutting down: \(error)")
        }
    }
    try dlFuture.wait()
}

And that's it! Create a Downloader, call it to get a future, set up cleanup, then wait until the future is done.

Conclusion

SwiftNIO is a fantastic library that powers the Swift on the server ecosystem. It's also great with command line tooling. It can occasionally be a bit more involved than Foundation, but especially with HTTP requests the difference is negligible. You'd have had to bring in Combine too to make URLSessions composable.

The Foundation/standard library split is a bit awkward here, as it often is when working with Swift command line tools. It's not that Foundation doesn't work, but it's clear that often there's the Swift way and then there's the Foundation way. And Foundation's cross-platform story has been a bit rocky.

As Swift's async story progresses a lot of this code can be simplified, I hope. In the ideal case the structure would stay pretty much as is, but those nested maps and flatMaps could be replaced with more straightforward code. However, I don't think you need to wait for async/await and all the related enhancements to arrive. This is already pretty great.


