Using for await to batch async work
I have recently run into quite a few issues with rate limits and have needed to batch requests. For example, I am trying to collate all the books I have ever bought and read, so I have been enriching this data using the Google Books API. Google isn’t super clear on what the rate limit is, but I have worked out that you can’t do more than roughly 120 books per minute.
Batching has also been very useful at work when writing scripts to migrate large amounts of data. Often you don’t want to overwhelm a DB or spawn a ton of lambdas by firing off too many concurrent requests.
To batch this work I have been using for await loops (docs here). I had no idea this existed until a colleague showed me it and, frankly, if I had read the documentation without knowing what it did I doubt I would have ever understood it. So this article will walk through how it is useful.
Using my Google Books example, let’s imagine that we have a list of book titles. For each title we want to make a request to the Google Books API to get more info about the book (e.g. title, genre, description).
Because Google Books has a rate limit, we only want to get 10 books at a time, waiting 10 seconds between each batch. This means we can’t just iterate over the books array and make all the requests at once:
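Something like this (a minimal sketch, where fetchBookInfo is a hypothetical helper wrapping the Google Books volumes endpoint) kicks off every request immediately:

```ts
// A minimal sketch of the naive approach. fetchBookInfo is a hypothetical
// helper that queries the Google Books volumes endpoint for a title.
async function fetchBookInfo(title: string): Promise<unknown> {
  const res = await fetch(
    `https://www.googleapis.com/books/v1/volumes?q=${encodeURIComponent(title)}`
  );
  return res.json();
}

const bookTitles = ["Dune", "Neuromancer", "Snow Crash"];

// Every request starts at once; nothing throttles them, so a long list
// of titles will blow straight through the rate limit.
const allAtOnce = await Promise.all(bookTitles.map(fetchBookInfo));
console.log(allAtOnce.length);
```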
We also don’t want to await each call to the API, as that would mean waiting for each request to finish before starting the next one:
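For contrast, here is what awaiting each call in turn looks like, reusing the hypothetical fetchBookInfo helper and bookTitles list from the sketch above:

```ts
// The other extreme: awaiting every call in turn. This stays well under
// the rate limit, but each request waits for the previous one to finish,
// so enriching a few thousand books takes a very long time.
const oneAtATime: unknown[] = [];
for (const title of bookTitles) {
  oneAtATime.push(await fetchBookInfo(title));
}
```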
Ideally, we want to make 10 calls to the API, wait for the results and the delay, then repeat until we have done them all. for await can help here: it allows us to await a whole batch completing, so we can make 10 concurrent calls and then wait a bit before starting the next batch. This gives us lots of flexibility to avoid hitting the rate limit.
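One way to put for await to work is an async generator that yields one batch of results at a time. This is a minimal sketch under my own assumptions: the chunk and delay helpers, the batch size of 10, and the 10 second wait are all illustrative, and fetchBookInfo is the hypothetical helper from the earlier sketch.

```ts
// chunk() splits the titles into groups of `size`.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// delay() resolves after the given number of milliseconds.
const delay = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Yields one batch of results at a time: 10 concurrent requests, with a
// 10 second pause before each batch after the first.
async function* fetchInBatches(titles: string[], batchSize = 10, waitMs = 10_000) {
  const batches = chunk(titles, batchSize);
  for (let i = 0; i < batches.length; i++) {
    if (i > 0) await delay(waitMs);
    // An async generator awaits any promise it yields, so the consumer
    // receives the resolved array of results for the whole batch.
    yield Promise.all(batches[i].map(fetchBookInfo));
  }
}

const enriched: unknown[] = [];
// The loop body doesn't run until the current batch has fully resolved.
for await (const batchResults of fetchInBatches(bookTitles)) {
  enriched.push(...batchResults);
}
```

Because the generator is lazy, the next batch of requests isn’t even created until the previous batch has resolved and the delay has elapsed, which is exactly what keeps us under the limit.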