Just Change Direction

Using for await to batch async work

I have recently run into quite a few issues with rate limits and needed to batch my requests. For example, I am trying to collate all the books I have ever bought and read, so I have been enriching this data using the Google Books API. Google isn’t super clear on what the rate limit is, but I have worked out that you can’t do more than roughly 120 books per minute.

Batching has also been very useful at work when writing scripts to migrate large amounts of data. Often you don’t want to overwhelm a DB or spawn a ton of lambdas by firing out too many concurrent requests.

To batch this work I have been using for await loops. I had no idea this existed until a colleague showed it to me and, frankly, if I had read the documentation without knowing what it did, I doubt I would ever have understood it. So this article walks through how it is useful.

Using my Google Books example, let’s imagine that we have a list of book titles. For each title we want to make a request to the Google Books API to get more info about the book (e.g. title, genre, description).

const bookTitles = [
  '1984',
  'The Three-Body Problem',
  'What We Owe The Future',
  'Leviathan Wakes',
  /* 100s more */
];

// Fetch and store the details for a single book, logging progress
const getBookData = async (bookTitle: string) => {
  console.log('Getting data for book: ', bookTitle);
  await getAndSaveDataFromGoogleBooks(bookTitle);
  console.log('Completed: ', bookTitle);
};

Because Google Books has a rate limit, we only want to get 10 books at a time, waiting 10 seconds between batches. That works out to at most 60 books per minute, comfortably under the limit. It also means we can’t just iterate over the books array and make all the requests at once:

// ❌ This would result in all the calls to the API happening pretty much at the same time
const run = async () => {
  await Promise.all(bookTitles.map((bookTitle) => getBookData(bookTitle)));
};
run();

We also don’t want to await each call to the API as this means we would be waiting for each request to finish before we can start the next one.

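// ❌ Here we await the completion of each API call, so they happen one by one - slow!
const run = async () => {
  // A plain for...of loop is used because await is not allowed inside a
  // non-async forEach callback
  for (const bookTitle of bookTitles) {
    await getBookData(bookTitle);
  }
};
run();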

Ideally, we want to make 10 calls to the API, wait for the results and then for the delay, and repeat until we have done them all. for await can help here.

// Batch the book titles into an array of arrays with each subarray holding 10 titles
// [
//   ['1984', 'The Three-Body Problem' /* and 8 more */],
//   ['What We Owe The Future', 'Leviathan Wakes' /* and 8 more */],
// ];
const batchedBookTitles = batched(bookTitles, 10);

const run = async () => {
  for await (const batch of batchedBookTitles) {
    // For each book title, we make concurrent calls to the API
    const promises = batch.map((bookTitle) => {
      return getBookData(bookTitle);
    });
    // We await the completion of all 10 calls
    await Promise.all(promises);
    // Wait 10 seconds to avoid hitting the rate limits
    await delay(10);
  }
};
run();
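
The loop above relies on two helpers, batched and delay, that are not shown. Here is a minimal sketch of what they might look like. Both implementations are assumptions rather than the originals, and delay is assumed to take seconds (not milliseconds) so that it matches the delay(10) call and its comment.

```typescript
// Hypothetical helper: split an array into consecutive chunks holding
// at most `size` items each. The last chunk may be shorter.
function batched<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Hypothetical helper: resolve after the given number of seconds.
// Assumption: the argument is in seconds, so it is converted to the
// milliseconds that setTimeout expects.
function delay(seconds: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, seconds * 1000));
}
```

With these definitions, batched(['a', 'b', 'c'], 2) returns [['a', 'b'], ['c']], and await delay(10) pauses the surrounding async function for about 10 seconds.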

Inside the for await loop we await the whole batch completing, so we can make 10 concurrent calls and then wait a bit before starting the next batch. This gives us lots of flexibility to avoid hitting the rate limit.
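
As a variation that is not part of the example above: for await can also consume an async generator, which lets the pause between batches live in the batch source rather than in the loop body. The sketch below is only illustrative. pacedBatches and processAll are hypothetical names, the pause is given in milliseconds, and worker stands in for something like getBookData.

```typescript
// Hypothetical async generator: yields one batch at a time and pauses
// `pauseMs` milliseconds before every batch except the first.
async function* pacedBatches<T>(
  items: readonly T[],
  batchSize: number,
  pauseMs: number,
): AsyncGenerator<T[]> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  for (let start = 0; start < items.length; start += batchSize) {
    if (start > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, pauseMs));
    }
    yield items.slice(start, start + batchSize);
  }
}

// Usage sketch: run `worker` on every item, 10 at a time, with a
// 10 second pause between batches.
async function processAll<T>(
  items: readonly T[],
  worker: (item: T) => Promise<unknown>,
): Promise<void> {
  for await (const batch of pacedBatches(items, 10, 10_000)) {
    // The generator is only asked for the next batch once this one has
    // finished, so the pause starts after the batch completes.
    await Promise.all(batch.map((item) => worker(item)));
  }
}
```

One small difference from the loop above is that this version does not wait after the final batch, since the pause happens before each subsequent batch instead of after every one.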