I am trying to set up an asyncio program that takes one AsyncGenerator as its input and returns data via another AsyncGenerator. Here's a sample program that illustrates the basic flow of data:
from collections.abc import AsyncGenerator
import asyncio


async def input_gen() -> AsyncGenerator[str, None]:
    '''a simple generator that yields strings'''
    for char in "abc123xyz789":
        await asyncio.sleep(0.1)
        yield char


async def slow_task(item: str) -> str:
    '''simulate a slow task that operates on a single item'''
    await asyncio.sleep(0.5)
    return f"{item}_loaded"


async def my_gen() -> AsyncGenerator[str, None]:
    '''a second generator that yields the results of slow_task(item)'''
    async for item in input_gen():
        # each item is awaited in turn, so nothing runs concurrently here
        yield await slow_task(item)


async def main() -> None:
    # an async comprehension must run inside a coroutine
    results = [x async for x in my_gen()]
    print(results)


asyncio.run(main())
In this flow, I would like to a) process the slow_task calls for the individual items concurrently, and b) start yielding the outputs of slow_task(item) as soon as they become available. The outputs of my_gen need not be in the original order but should otherwise be identical.
I have been trying to find a way to do this using asyncio's Queue in a producer/consumer pattern, but I haven't managed to get very far. I'm hoping folks have suggestions for approaches that would work here.
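For reference, here is a minimal sketch of one way the Queue-based producer/consumer idea could be wired up. It repeats input_gen and slow_task from the sample so that it runs on its own. The _DONE sentinel, the nested worker and producer helpers, and collect are hypothetical names introduced only for illustration. It assumes unbounded concurrency is acceptable and does only minimal error handling, so treat it as a starting point rather than a finished solution.

```python
import asyncio
from collections.abc import AsyncGenerator

_DONE = object()  # hypothetical sentinel: marks the end of the result stream


async def input_gen() -> AsyncGenerator[str, None]:
    for char in "abc123xyz789":
        await asyncio.sleep(0.1)
        yield char


async def slow_task(item: str) -> str:
    await asyncio.sleep(0.5)
    return f"{item}_loaded"


async def my_gen() -> AsyncGenerator[str, None]:
    queue = asyncio.Queue()

    async def worker(item: str) -> None:
        # Run one slow_task and hand its result to the consumer side.
        queue.put_nowait(await slow_task(item))

    async def producer() -> None:
        tasks = []
        try:
            # Start one task per input item without waiting for it to finish.
            async for item in input_gen():
                tasks.append(asyncio.create_task(worker(item)))
            await asyncio.gather(*tasks)
        finally:
            # Always signal the end so the consumer loop cannot hang.
            queue.put_nowait(_DONE)

    producer_task = asyncio.create_task(producer())
    try:
        # Yield results in completion order until the sentinel arrives.
        while (result := await queue.get()) is not _DONE:
            yield result
        await producer_task  # re-raise any error from the producer or a worker
    finally:
        producer_task.cancel()  # no-op if the producer already finished


async def collect() -> list[str]:
    return [x async for x in my_gen()]


if __name__ == "__main__":
    print(asyncio.run(collect()))
```

Results come out in completion order, so callers that need the input order would have to carry an index alongside each item. To cap the number of slow_task calls in flight, an asyncio.Semaphore acquired inside worker is one option.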