I'm quite new to using multiprocessing and I'm trying to figure out if there is a better way to use multiprocessing when it has to be within a loop. Or at least I think it has to be within a loop. Let me try to describe the outline and then a better solution may be obvious.
I'm processing large spatial datasets with lots of data points. I'm trying to calculate values for smaller sub-regions of the large dataset by splitting it up with a regular grid. Each grid-bin's value can be calculated independently of the others, which makes this perfect for parallel processing - I just need to know what data to send to each bin. Typically there will be thousands of these smaller grid-bins.
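To make the binning step concrete, here's a toy sketch of what I mean by assigning points to a regular grid (the data, grid sizes, and variable names are made up for illustration; I'm using numpy):

```python
import numpy as np

# toy stand-in for the spatial binning step: scatter random 2-D points
# over the unit square and assign each one a flat grid-bin id
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=1000)
y = rng.uniform(0.0, 1.0, size=1000)

num_bins_x, num_bins_y = 10, 10
ix = np.clip((x * num_bins_x).astype(int), 0, num_bins_x - 1)
iy = np.clip((y * num_bins_y).astype(int), 0, num_bins_y - 1)
bin_id = ix * num_bins_y + iy  # flat id in [0, num_bins_x * num_bins_y)

# the points belonging to any one bin can be pulled out independently
points_in_first_bin = np.flatnonzero(bin_id == 0)
```

Each bin's point indices can then be handed to a worker on their own, which is what makes the per-bin calculations embarrassingly parallel.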
However, I'm trying to track these changes over time, so I then repeat the same process for each timestep of data. Typically there will be 10-500 timesteps to process. Each timestep's data is stored in several files on disk that I have to load when processing that timestep.
Currently my code looks a bit like the following pseudo-code, and it takes about 30-60s to process a timestep. Reading and binning probably only take a few seconds of that time; the rest is the multiprocessing of each bin.
```python
from multiprocessing import Pool

for t in range(num_timesteps_to_process):
    # read data files for this timestep
    data_1 = read_file1_data(path, t)
    data_2 = read_file2_data(path, t)
    data_3 = read_file3_data(path, t)

    # find what data is in each grid_bin - returns data and bin_id
    binned_data = spatial_binning_function(data_1, num_bins_x=100, num_bins_y=100)

    # loop through all bins building the argument list for the calculation
    bin_calc_arg_list = []
    for bin_indx, data_id in enumerate(binned_data):
        bin_calc_arg_list.append((bin_indx, data_id, data_1, data_2, data_3, other_settings))

    print(" Running Multiprocessing Pool")

    # create the process pool - multiprocessing
    with Pool(processes=num_processors) as pool:
        # execute a task
        results = pool.starmap(multiprocess_timesteps_bins, bin_calc_arg_list)
        # close the process pool
        pool.close()
        # wait for issued tasks to complete
        pool.join()

    # collate results here and return
    ...
```

That's the general flow of my code, and it's quite a bit quicker than any single-threaded implementation. However, I'm aware that opening and closing the process pool inside each timestep loop is probably a really bad idea. The reading and subdivision of data does need to happen per timestep, though - I don't think having each of several processes read the same data a few thousand times would be any better.
From a few print statements it seems that several seconds may be spent starting up the pool each timestep, and perhaps more shutting it down.
As I said, I'm not overly familiar with multiprocessing yet, so I'm not quite sure of the best way to set up a problem like this. I'm assuming my implementation is not how it should be done, but it works and it's faster than single-threaded.
Should I instead be using some kind of dynamic queue that I open once at the start, push items into during each timestep, and then close once at the end?
Any pointers or suggestions welcome.