Channel: Active questions tagged python - Stack Overflow

Why is the multiprocessing method slower than the single-process method?


I need to pull data from a web API for a series of serial numbers (SNs). I implemented this as a for-loop, and each iteration takes about 1.5 s (the query is complex), so I tried using multiple processes to improve efficiency. The strange thing is that the multiprocessing version takes longer than the single-process one. I guessed that the web API might not handle concurrent requests well, but even when I run only one process with the queue-based approach, it still takes roughly double the time of the plain single-process code. I can't figure out why.

Initial code:

import time
import requests

def pull_data(row):
    url_api = '*web api*'
    post_json = {row}
    x = requests.post(url_api, data=post_json)
    return x.json()

for i in range(len(rows)):
    t_s = time.time()
    json_f = pull_data(rows[i])  # each call takes about 1.5 s
    t_e = time.time()
    print(f"\nSelect query completed in {t_e - t_s:.2f} seconds")

Multiprocessing code:

import time
import requests
from multiprocessing import Process, Queue

def pull_data(row, q):
    url_api = '*web api*'
    post_json = {row}
    x = requests.post(url_api, data=post_json)
    q.put(x.json())

for i in range(0, len(rows), 3):
    jobs = []
    json_f = []
    q = Queue()
    t_s = time.time()
    # start up to 3 processes per batch
    for offset in range(3):
        if i + offset < len(rows):
            p = Process(target=pull_data, args=(rows[i + offset], q))
            jobs.append(p)
            p.start()
    for proc in jobs:
        proc.join()
    t_e = time.time()
    while not q.empty():
        json_f.append(q.get())
    print(f"\nSelect query completed in {t_e - t_s:.2f} seconds")
    # A batch of 3 processes takes about 6 s; even running only one
    # process takes about 3 s, slower than the initial code.
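A likely explanation for the slowdown is that every batch pays the fixed cost of spawning brand-new worker processes (and a fresh Queue), which can easily dominate a 1.5 s request. This is a minimal sketch, assuming a Unix-like fork start method, that times nothing but the startup and teardown of a process whose target does no work at all:

```python
import time
from multiprocessing import Process

def noop():
    # Deliberately empty: any time measured below is pure
    # process startup/teardown overhead, not real work.
    pass

if __name__ == '__main__':
    t0 = time.time()
    p = Process(target=noop)
    p.start()
    p.join()
    overhead = time.time() - t0
    print(f"Process startup/teardown overhead: {overhead:.3f} s")
```

Whatever this prints on your machine is paid again for every `Process` you create; a `Pool` (as in the accepted fix) pays it once per worker and then reuses the workers.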

Thanks to @AKX's suggestion, which worked for me. Below is the modified code:

import requests
from multiprocessing import Pool

def threaded_post_requests(rows, max_workers=5):
    with Pool(processes=max_workers) as pool:
        results = pool.map(pull_data, rows)
    return results

def pull_data(row):
    url_api = '*web api*'
    post_json = {row}
    x = requests.post(url_api, data=post_json)
    return [x.json(), row]

def database_test():
    rows = ['post list']
    results = threaded_post_requests(rows)
    for i in results:
        print("\n", i)

if __name__ == '__main__':
    database_test()
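Since `requests.post` is I/O-bound (the worker spends nearly all its time waiting on the network, during which the GIL is released), plain threads are usually sufficient and start much faster than processes. A minimal sketch of the same pool pattern with `concurrent.futures.ThreadPoolExecutor`; `pull_data` here is a stand-in stub, not the real API call:

```python
from concurrent.futures import ThreadPoolExecutor

def pull_data(row):
    # Stub standing in for the real requests.post(...) call,
    # so the example is runnable without a live API.
    return {'row': row}

def threaded_post_requests(rows, max_workers=5):
    # Threads, not processes: fine for I/O-bound HTTP work,
    # with no per-worker interpreter startup cost.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map preserves the input order of rows in the results
        return list(pool.map(pull_data, rows))

if __name__ == '__main__':
    print(threaded_post_requests(['sn1', 'sn2', 'sn3']))
```

The call signatures are interchangeable with the `Pool` version, so you can switch between threads and processes by swapping the executor.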
