Channel: Active questions tagged python - Stack Overflow

Multiprocessing pool for loop in Python


I have code that calculates statistics from my simulation. I want to use a multiprocessing pool (similar to parfor in MATLAB). Below is my code. I have n data points, and I want to distribute them across nCPUs to calculate the result.

In def process_data(pts), the calculation covers the range ii_start <= ii <= ii_end. I use the following line to implement the loop:

results = pool.map(process_data, range(ii_start, ii_end + 1))

In Matlab, I simply use

parfor ii = ii_start:ii_end

Do I need something similar in Python? Also, my confusion is this: for e.g. pts = 98, the call uses tmean[98] to perform the calculation and I then obtain ea[98]; however, I am not sure how the index from ii_start:ii_end is obtained and passed to process_data.
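For reference, pool.map already plays the role of the parfor loop: each element of the range is passed, in order, as the single argument of the worker function, and the return values come back in the same order. A minimal sketch (with a hypothetical squaring function standing in for the real calculation):

```python
from multiprocessing import Pool

def process_data(ii):
    # ii takes each value from the range, just like the loop variable in parfor
    return ii * ii  # hypothetical per-index computation

if __name__ == '__main__':
    ii_start, ii_end = 98, 101
    with Pool(2) as pool:
        # pool.map calls process_data(98), process_data(99), ..., process_data(101)
        results = pool.map(process_data, range(ii_start, ii_end + 1))
    print(results)  # [9604, 9801, 10000, 10201]
```

So there is no extra index bookkeeping needed: the value of pts inside process_data is exactly the loop index supplied by range(ii_start, ii_end + 1).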

The script begins here.

import numpy as np
from multiprocessing import Pool, cpu_count
import os
import h5py

filepath = '/actual_file_path/rry.mat'
rry_data = {}
f = h5py.File(filepath)
for k, v in f.items():
    rry_data[k] = np.array(v)

C_sorted        = rry_data['C_sorted'].T
ea              = rry_data['ea'].T
idxnew          = rry_data['idxnew'].T
ii              = rry_data['ii'].T        # prev stop point
ny              = rry_data['ny'].T
rhostarprofile  = rry_data['rhostarprofile'].T
rry             = rry_data['rry'].T
tmean           = rry_data['tmean'].T
ymean           = rry_data['ymean'].T
Ystar           = rry_data['Ystar'].T

n = len(ea)
saveInterval = int(1e6)
saveFilename = '/actual_file_path/ea_temp'
ii_start = 98
ii_end = 110   # setting 110 for testing, ii_end = 172439820
pts = np.linspace(1, n, n)

# Initialize a global list to store results
all_results = []

def process_data(pts):
    rho = tmean[pts]  # Python uses 0-based indexing
    rystar = int(idxnew[pts])
    sortedrhoheight = Ystar[rystar]
    mod = int((pts - 1) % ny)
    ry = int(rry[mod])
    rhoheight = Ystar[ry]
    if sortedrhoheight > rhoheight:
        rhobar = (1 / (rhoheight - sortedrhoheight)) * -1 * np.sum(rhostarprofile[ry:rystar, 0] * C_sorted[ry:rystar, 1])
    elif sortedrhoheight == rhoheight:
        rhobar = 0
    else:
        rhobar = (1 / (rhoheight - sortedrhoheight)) * np.sum(rhostarprofile[rystar:ry, 0] * C_sorted[rystar:ry, 1])
    ea[pts] = (rhoheight - sortedrhoheight) * (rho - rhobar)

# number of cores you have allocated for your slurm task:
number_of_cores = int(os.environ['SLURM_CPUS_PER_TASK'])
print(number_of_cores)
# number_of_cores = cpu_count()  # if not on the cluster you should do this instead

if __name__ == '__main__':
    # Parallel loop for ii_start to ii_end
#    with Pool(processes=int(os.environ['SLURM_CPUS_PER_TASK'])) as pool:
    with Pool(number_of_cores) as pool:
        pool.map(process_data, range(ii_start, ii_end + 1))
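One caveat worth noting: because multiprocessing workers are separate processes, the assignment ea[pts] = ... inside process_data only updates each worker's own copy of ea; the parent process's array is left unchanged. A common pattern (a sketch with placeholder data, not the original calculation) is to return an (index, value) pair from the worker and write results back in the parent:

```python
import numpy as np
from multiprocessing import Pool

ea = np.zeros(200)         # stand-in for the array loaded from the .mat file
tmean = np.arange(200.0)   # hypothetical data, for illustration only

def process_data(pts):
    # compute the value and return it instead of assigning to the global array
    value = 2.0 * tmean[pts]   # placeholder for the real calculation
    return pts, value          # return the index together with the result

if __name__ == '__main__':
    ii_start, ii_end = 98, 110
    with Pool(4) as pool:
        # pool.map preserves input order, so results line up with the range
        for idx, value in pool.map(process_data, range(ii_start, ii_end + 1)):
            ea[idx] = value    # write back in the parent process
```

This avoids any shared-memory machinery: the workers stay read-only, and all writes to ea happen in one place.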
