My use case here is to launch a number of subprocesses from Python, using multiple threads, and capture the output of those commands (which is just a sometimes-quite-large blob of text) into the values of another dict.
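For concreteness, the input and output shapes look roughly like this (the command names are made up):

# made-up example of the two dicts: keys are labels, values are argv lists
cmdTable = {
    "version": ["show", "version"],
    "interfaces": ["show", "interfaces"],
}
# desired result, one (possibly huge) string per command:
# procOutput = {"version": "<captured text>", "interfaces": "<captured text>"}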
The OS I'm running this on is Linux, but with a number of customizations, one of which is a pretty aggressive memory watchdog for processes not launched in a specific way (and my Python script is not special enough to get around the watchdog...).
When I run "too many" threads and some combination of the subprocesses returns "too much" output, the memory watchdog barks and my script gets killed.
I did just enough digging to find that the .communicate() method buffers the entire output in memory until the subprocess is done, which certainly aligns with the problem I'm having. But I haven't been able to find a better way to do this, although it seems like a common enough use case that maybe I'm missing something REALLY obvious?
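To make sure I understand the buffering claim: with .communicate(), the whole blob sits in my process until the child exits, whereas something like this sketch (the command name "my_cmd" is a placeholder) would let me read the pipe in chunks as output arrives and decide what to do with each chunk:

import subprocess

# hypothetical sketch: read stdout incrementally instead of letting
# communicate() accumulate everything at once
proc = subprocess.Popen(["my_cmd"], stdout=subprocess.PIPE,
                        stderr=subprocess.DEVNULL)
chunks = []
for chunk in iter(lambda: proc.stdout.read(65536), b""):
    # each chunk could be processed or spilled to disk here; appending
    # like this still accumulates the whole blob, so it only helps if
    # I do something smarter with each chunk
    chunks.append(chunk)
proc.wait()
output = b"".join(chunks).decode("utf-8", errors="ignore")

I'm not sure that actually helps my case, though, since I still want the full text per command in the end.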
import subprocess
from multiprocessing.pool import ThreadPool


def runShowCommands(cmdTable) -> dict:
    """Return a dictionary of captured output from commands defined in cmdTable."""

    def handle_proc_stdout(handle):
        try:
            procOutput[handle] = procHandles[handle].communicate(timeout=180)[0].decode("utf-8", errors="ignore")
        except subprocess.TimeoutExpired:
            # naughty process! kill it, then communicate() again to reap it
            procHandles[handle].kill()
            procHandles[handle].communicate()
            procOutput[handle] = f"KILLED: command {handle} timed out and was killed"

    procOutput = {}   # dict to store output text from show commands
    procHandles = {}  # dict of Popen handles, keyed by command label
    # note: every process is launched here up front; the thread pool below
    # only limits how many are being read at once
    for cmd in cmdTable.keys():
        try:
            procHandles[cmd] = subprocess.Popen(cmdTable[cmd],
                                                stdout=subprocess.PIPE,
                                                stderr=subprocess.PIPE)
        except FileNotFoundError:
            procOutput[cmd] = f"NOT FOUND: command {cmd} was not found?"

    threadpool = ThreadPool(processes=4)
    threadpool.map(handle_proc_stdout, procHandles.keys())
    threadpool.close()
    return procOutput
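One direction I've been toying with (no idea if it's the idiomatic answer, which is partly why I'm asking) is to point each command's stdout at a temporary file and return file paths instead of the text, so my process never holds the blobs at all. Roughly this sketch, where runShowCommandsToDisk, the ".out" suffix, and merging stderr into stdout are all just my own choices:

import subprocess
import tempfile
from multiprocessing.pool import ThreadPool


def runShowCommandsToDisk(cmdTable) -> dict:
    """Sketch: return {command: path-to-output-file} rather than the text
    itself, so this process never holds the big blobs in memory."""
    procOutput = {}

    def run_one(cmd):
        # delete=False so the file outlives this function; the caller is
        # responsible for reading and removing it later
        outfile = tempfile.NamedTemporaryFile(delete=False, suffix=".out")
        with outfile:
            try:
                # stdout goes straight to disk; no PIPE buffer to fill up
                proc = subprocess.Popen(cmdTable[cmd], stdout=outfile,
                                        stderr=subprocess.STDOUT)
            except FileNotFoundError:
                procOutput[cmd] = f"NOT FOUND: command {cmd} was not found?"
                return
            try:
                proc.wait(timeout=180)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()  # reap the killed child
                procOutput[cmd] = f"KILLED: command {cmd} timed out and was killed"
                return
        procOutput[cmd] = outfile.name

    with ThreadPool(processes=4) as pool:
        pool.map(run_one, cmdTable.keys())
    return procOutput

A side effect of moving Popen inside the worker is that only four commands run at a time instead of all of them launching up front, which might also help the watchdog situation. Is something like this reasonable, or is there a standard way to do this that I'm missing?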