I need to pull data from a web API for a series of SNs. I implemented it as a function called in a for loop; each iteration takes about 1.5 s because the query is complex, so I tried using multiple processes to speed it up. Strangely, the multiprocess version takes longer than the single-process one. My guess was that the web API may not handle concurrent requests well, but even when I run only one process with the queue approach, it still takes about twice as long as the plain loop. I can't figure out why.
Initial code:

    import time

    import requests

    def pull_data(row):
        url_api = '*web api*'
        post_json = {row}
        x = requests.post(url_api, data=post_json)
        return x.json()

    for i in range(len(rows)):
        json_f = []
        t_s = time.time()
        json_f = pull_data(rows[i])  # takes about 1.5 s
        t_e = time.time()
        print(f"\nSelect query completed in {format(t_e - t_s, '.2f')}seconds")
Multiprocessing code:

    import time
    from multiprocessing import Process, Queue

    import requests

    def pull_data(row, q):
        url_api = '*web api*'
        post_json = {row}
        x = requests.post(url_api, data=post_json)
        q.put(x.json())

    # Process the rows in batches of three, one process per row.
    for i in range(0, len(rows), 3):
        jobs = []
        json_f = []
        q = Queue()
        t_s = time.time()
        if 0 <= i < len(rows):
            p1 = Process(target=pull_data, args=(rows[i], q))
            jobs.append(p1)
            p1.start()
        if 1 <= i + 1 < len(rows):
            p2 = Process(target=pull_data, args=(rows[i + 1], q))
            jobs.append(p2)
            p2.start()
        if 2 <= i + 2 < len(rows):
            p3 = Process(target=pull_data, args=(rows[i + 2], q))
            jobs.append(p3)
            p3.start()
        for proc in jobs:
            proc.join()
        t_e = time.time()
        while not q.empty():
            json_f.append(q.get())
        print(f"\nSelect query completed in {format(t_e - t_s, '.2f')}seconds")
        # Three processes take about 6 s in total. Even with only one process
        # it takes about 3 s, which is longer than the initial code.
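One way to see where the extra time goes is to record timestamps inside the worker and compare them with the parent's clock. The sketch below is only an illustration: it uses threads and a simulated 0.2 s request so that it runs anywhere, and `fake_request`, `timed_worker` and `run_batch` are hypothetical names, not part of the real code. Swapping `threading.Thread` and `queue.Queue` for `multiprocessing.Process` and `multiprocessing.Queue` should, I assume, show how much of the 3 s is startup delay rather than the query itself.

```python
import queue
import threading
import time


def fake_request(row, delay=0.2):
    """Hypothetical stand-in for requests.post(...).json(); it only sleeps."""
    time.sleep(delay)
    return {"row": row}


def timed_worker(row, q):
    started = time.time()      # when the worker actually began running
    result = fake_request(row)
    finished = time.time()
    q.put((row, started, finished, result))


def run_batch(batch):
    q = queue.Queue()
    launched = time.time()     # parent clock just before starting the workers
    workers = [threading.Thread(target=timed_worker, args=(row, q)) for row in batch]
    for w in workers:
        w.start()
    # Take one result per worker before joining. With multiprocessing.Queue,
    # joining first can block while a worker still has buffered items.
    records = [q.get() for _ in workers]
    for w in workers:
        w.join()
    total = time.time() - launched
    report = [
        {
            "row": row,
            "startup_delay": started - launched,
            "request_time": finished - started,
        }
        for row, started, finished, _ in records
    ]
    return total, report


total, report = run_batch(["SN1", "SN2", "SN3"])
print(f"batch wall time: {total:.2f}s")
for entry in sorted(report, key=lambda e: e["row"]):
    print(entry)
```

If `startup_delay` turns out to be large when run with processes, the slowdown would come from launching the workers and not from the web API.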
Thanks to @AKX's suggestion, this now works for me. The modified code is below:
    from multiprocessing import Pool

    import requests

    def threaded_post_requests(rows, max_workers=5):
        # Despite the name, this uses a pool of worker processes.
        with Pool(processes=max_workers) as pool:
            results = pool.map(pull_data, rows)
        return results

    def pull_data(row):
        url_api = '*web api*'
        post_json = {row}
        x = requests.post(url_api, data=post_json)
        fixture_json = x.json()  # assumed: the parsed JSON response
        return [fixture_json, row]

    def database_test():
        rows = ['post list']
        results = threaded_post_requests(rows)
        for i in results:
            print("\n", i)

    if __name__ == '__main__':
        database_test()
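Since each call mostly waits on the HTTP response, a thread pool might work as well as a process pool with less startup cost. I have not measured this against the real API, so the following is only a minimal sketch: `fake_pull_data` is a hypothetical stand-in that sleeps for 0.2 s in place of the real POST request, and `threaded_pull` is an illustrative name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_pull_data(row, delay=0.2):
    """Hypothetical stand-in for the real POST request; sleeps to simulate the wait."""
    time.sleep(delay)
    return [{"sn": row}, row]


def threaded_pull(rows, max_workers=5):
    # executor.map yields results in the same order as `rows`.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fake_pull_data, rows))


rows = ["SN1", "SN2", "SN3", "SN4", "SN5"]
t_s = time.perf_counter()
results = threaded_pull(rows)
elapsed = time.perf_counter() - t_s
print(f"{len(results)} rows in {elapsed:.2f}s")
```

With five workers the five simulated requests overlap, so the total should be close to one request's delay instead of the sum of all five.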