The matrix has about 4000 columns and 1e8 rows. I need to divide the 1e8 rows into 1000 bins based on the value in the 2nd column and sum the rows in each bin. The final result should have 4000 columns and 1000 rows.
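    import numpy as np

    aa = np.random.rand(10000000, 4000)
    # range of the 2nd column, which is used as the binning key
    l, r = aa[:, 1].min(), aa[:, 1].max()
    bins = np.linspace(l, r, 1001)  # 1001 edges define 1000 bins
    for bn in range(1000):
        ll, rr = bins[bn], bins[bn + 1]
        # sum of all rows whose key falls in this bin: one of the 1000 lines
        np.sum(aa[(aa[:, 1] > ll) & (aa[:, 1] < rr)], axis=0)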
The final step above (the masked sum inside the loop) is slow.
After that, I need to stack all 1000 summed rows to produce the 1000 x 4000 matrix.
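For reference, here is a minimal sketch of one way the binning, summing and stacking could be done in a single pass, so the 2nd column is not rescanned once per bin. The function name `bin_and_sum` and the small `demo` array are hypothetical and only for illustration. Note one assumption that differs from the loop above: rows lying exactly on a bin edge are kept here (bins are half-open, and the last bin includes the maximum), whereas the strict `>` and `<` comparisons drop them. I have not timed this on the full-size matrix.

```python
import numpy as np

def bin_and_sum(aa, n_bins=1000):
    """Sum the rows of `aa` per bin of its 2nd column (hypothetical helper)."""
    key = aa[:, 1]
    edges = np.linspace(key.min(), key.max(), n_bins + 1)
    # Bin index per row, in 0..n_bins-1. Only the interior edges are searched,
    # so the minimum lands in bin 0 and the maximum in the last bin.
    idx = np.searchsorted(edges[1:-1], key, side="right")
    out = np.zeros((n_bins, aa.shape[1]), dtype=aa.dtype)
    # Unbuffered accumulation: each row is added to the output row of its bin.
    np.add.at(out, idx, aa)
    return out

# Small demonstration on random data (stand-in for the real matrix).
demo = np.random.default_rng(0).random((1000, 5))
result = bin_and_sum(demo, n_bins=10)
print(result.shape)
```

The output already has one row per bin, so no separate stacking step is needed. For a matrix too large to fit in memory, the same accumulation could presumably be applied chunk by chunk into one shared `out` array, as long as the bin edges are computed once up front.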