I need to calculate the average of the numbers in a column every 5 minutes. The data can arrive every 30 seconds or every 1 minute, so I would like the timestamp, not the row count, to be used as the reference for calculating this average.
I tried the method below, but I had to add the time back in afterwards, and it only works if the data arrive every 1 minute, because it averages every 5 rows instead of taking the time into account.

    import pandas as pd

    Planilha = pd.DataFrame({
        'Data/hora': ['01/02/2024 05:01', '01/02/2024 05:02', '01/02/2024 05:03',
                      '01/02/2024 05:04', '01/02/2024 05:05', '01/02/2024 05:06',
                      '01/02/2024 05:07', '01/02/2024 05:08', '01/02/2024 05:09',
                      '01/02/2024 05:10', '01/02/2024 05:11', '01/02/2024 05:12',
                      '01/02/2024 05:13', '01/02/2024 05:14', '01/02/2024 05:15'],
        'Valores_ok': [21.48544006, 32.41119499, 44.18326492, 59.37920151,
                       76.55416718, 93.16954193, 121.0470154, 164.0023529,
                       207.9371277, 198.1840485, 150.4580994, 144.5747345,
                       155.5020691, 155.5775085, 160.8874695],
    })

    # Filtering the necessary data
    Irrad = Planilha.iloc[:, Planilha.columns.str.contains('Valores')]

    # Average every 5 consecutive rows (the timestamps are not used)
    IrradM = pd.DataFrame(columns=Irrad.columns)
    for i in range(0, len(Irrad), 5):
        Irrad5m = Irrad.iloc[i:i+5].mean(numeric_only=True)
        Irrad5m = pd.DataFrame(Irrad5m).T
        IrradM = pd.concat([IrradM, Irrad5m], ignore_index=True)

Another issue is that this is slow because of the amount of data. I imagine there must be a much easier way to perform this operation.

Input data

    Data/hora           Valores_ok
    01/02/2024 05:01     21.485440
    01/02/2024 05:02     32.411195
    01/02/2024 05:03     44.183265
    01/02/2024 05:04     59.379202
    01/02/2024 05:05     76.554167
    01/02/2024 05:06     93.169542
    01/02/2024 05:07    121.047015
    01/02/2024 05:08    164.002353
    01/02/2024 05:09    207.937128
    01/02/2024 05:10    198.184048
    01/02/2024 05:11    150.458099
    01/02/2024 05:12    144.574735
    01/02/2024 05:13    155.502069
    01/02/2024 05:14    155.577508
    01/02/2024 05:15    160.887470

Expected output

    Data/hora           Valores_ok
    01/02/2024 05:05     46.802654
    01/02/2024 05:10    156.868017
    01/02/2024 05:15    153.399976
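
For reference, the time-based averaging described in this post could be sketched with pandas' `resample`, which bins rows by timestamp rather than by row count, so it should not matter whether readings arrive every 30 seconds or every minute. This is only a sketch under two assumptions that the post does not state: the dates are day-first (`%d/%m/%Y`), and each 5-minute window is closed and labelled on its right edge (05:01 to 05:05 is reported as 05:05), which is what the expected output suggests. The name `media_5min` is introduced here purely for illustration.

```python
import pandas as pd

# Sample data copied from the question
Planilha = pd.DataFrame({
    'Data/hora': ['01/02/2024 05:01', '01/02/2024 05:02', '01/02/2024 05:03',
                  '01/02/2024 05:04', '01/02/2024 05:05', '01/02/2024 05:06',
                  '01/02/2024 05:07', '01/02/2024 05:08', '01/02/2024 05:09',
                  '01/02/2024 05:10', '01/02/2024 05:11', '01/02/2024 05:12',
                  '01/02/2024 05:13', '01/02/2024 05:14', '01/02/2024 05:15'],
    'Valores_ok': [21.48544006, 32.41119499, 44.18326492, 59.37920151,
                   76.55416718, 93.16954193, 121.0470154, 164.0023529,
                   207.9371277, 198.1840485, 150.4580994, 144.5747345,
                   155.5020691, 155.5775085, 160.8874695],
})

# Assumption: the strings are day-first (dd/mm/yyyy HH:MM)
Planilha['Data/hora'] = pd.to_datetime(Planilha['Data/hora'],
                                       format='%d/%m/%Y %H:%M')

# Hypothetical result name; one row per 5-minute window.
# closed='right' puts 05:05 in the window ending at 05:05,
# label='right' names each window by its end time.
media_5min = (
    Planilha.set_index('Data/hora')   # resample needs a datetime index
            .filter(like='Valores')   # keep only the value columns
            .resample('5min', closed='right', label='right')
            .mean()
)

print(media_5min)
```

On the sample data this produces three rows, labelled 05:05, 05:10 and 05:15, whose means match the expected output above. With 30-second data the same call would average ten rows per window instead of five, and since there is no Python-level loop or repeated `pd.concat`, it would likely also be faster on large inputs.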