I would like to get the area of overlap of 2 histograms.
I was able to extract the histogram data of my 2 datasets (from similar columns of 2 dataframes). I also used density = True so that the area under the curve would be 1.
hist_data1, bin_edges1 = np.histogram(data1.iloc[:, 1], bins=50, density=True)
hist_data2, bin_edges2 = np.histogram(data2.iloc[:, 1], bins=50, density=True)
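For reference, here is a quick check of the normalization. It is only a sketch: `sample` is synthetic data standing in for one of my dataframe columns, not my real data. With density=True, the bar heights multiplied by the bin widths should sum to 1.

```python
import numpy as np

# Synthetic stand-in for a dataframe column (assumption, not the real data).
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=1000)

hist, edges = np.histogram(sample, bins=50, density=True)

# Area under the histogram = sum of (bar height * bin width).
area = np.sum(hist * np.diff(edges))
print(np.isclose(area, 1.0))
```

Note that `edges` has one more element than `hist` (51 edges for 50 bins), which is why the intersection function below calls `numpy.diff` on the edges.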
I am trying to follow the method used here:
def histogram_intersection(h1, h2, bins):
    bins = numpy.diff(bins)
    sm = 0
    for i in range(len(bins)):
        sm += min(bins[i]*h1[i], bins[i]*h2[i])
    return sm
However, bin_edges1 may not be exactly the same as bin_edges2, because each call to np.histogram picks its edges from the range of its own data. So there is no single set of bins that can be passed to histogram_intersection(). How can I use the above method such that both histograms share the same, uniform bins?
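One direction that might work, as a sketch rather than a confirmed solution: build a single array of edges spanning the combined range of both datasets and pass it as `bins` to both np.histogram calls, so the two histograms line up bin for bin. The helper names `shared_bin_edges` and `histogram_overlap` are hypothetical, the data is synthetic, and the sketch assumes there are no NaN values and that the combined range is not a single point.

```python
import numpy as np

def shared_bin_edges(a, b, n_bins=50):
    """Hypothetical helper: equal-width edges covering both samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    return np.linspace(lo, hi, n_bins + 1)

def histogram_overlap(a, b, n_bins=50):
    """Overlap area of two density histograms computed on common bins."""
    edges = shared_bin_edges(a, b, n_bins)
    h1, _ = np.histogram(a, bins=edges, density=True)
    h2, _ = np.histogram(b, bins=edges, density=True)
    widths = np.diff(edges)
    # Same quantity as histogram_intersection(), vectorized.
    return float(np.sum(np.minimum(h1, h2) * widths))

# Synthetic stand-ins for data1.iloc[:, 1] and data2.iloc[:, 1].
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=2000)
y = rng.normal(1.0, 1.0, size=2000)

overlap_xy = histogram_overlap(x, y)  # somewhere between 0 and 1
overlap_xx = histogram_overlap(x, x)  # identical data, so the overlap is 1
print(overlap_xy, overlap_xx)
```

With my real data I would call it as `histogram_overlap(data1.iloc[:, 1], data2.iloc[:, 1])`. The result still depends on the number of bins, so I am not sure how stable the value is.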