I am working on a deep learning model to predict a value. The setup is as follows: we have
time vs. A vs. B vs. C in multiple .xls files (all stored in a folder named Data).
I want to predict A from the previous values of A, B, and C. This of course means the first prediction will be N/A, since there is no older A to use.
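To make "previous values" concrete, here is a minimal sketch of the kind of windowing I have in mind. It assumes the rows are already sorted by time; `make_windows`, the window length of 2, and the toy numbers are all hypothetical and only for illustration, not taken from my real files.

```python
import numpy as np


def make_windows(values, window, target_col=0):
    """Turn a time-ordered table into (X, y) pairs.

    X[i] holds `window` consecutive rows (all columns), and y[i] is the
    target column of the row that comes right after those rows.
    """
    values = np.asarray(values, dtype=float)
    if values.ndim != 2:
        raise ValueError("values must be 2-D: (time steps, features)")
    n_samples = len(values) - window
    if n_samples <= 0:
        raise ValueError("need more rows than the window length")
    X = np.stack([values[i:i + window] for i in range(n_samples)])
    y = values[window:, target_col]
    return X, y


# Made-up toy data: 6 time steps, columns in the order A, B, C.
rows = [
    [1, 10, 100],
    [2, 20, 200],
    [3, 30, 300],
    [4, 40, 400],
    [5, 50, 500],
    [6, 60, 600],
]
X, y = make_windows(rows, window=2)
print(X.shape)  # (4, 2, 3): samples, time steps, features
print(y)        # [3. 4. 5. 6.]
```

The first `window` rows get no prediction because they have no full history, which is the N/A case mentioned above. An array shaped (samples, time steps, features) is also the layout a 1-D convolution over time would take as input.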
Currently I have this (I'm really new to this and pieced it together from online searches):
    def load_data(folder_path):
        data_frames = []
        for file_name in os.listdir(folder_path):
            if file_name.endswith('.xls'):
                file_path = os.path.join(folder_path, file_name)
                df = pd.read_excel(file_path)  # Assuming the data is in Excel format
                data_frames.append(df)
        return pd.concat(data_frames, ignore_index=True)

    def preprocess_data(data):
        X = data[['S', 'A', 'T']]
        y = data['F']
        scaler = StandardScaler()
        X_scaled = scaler.fit_transform(X)
        X_reshaped = X_scaled.reshape(X_scaled.shape[0], X_scaled.shape[1], 1)  # Reshape for CNN
        return X_reshaped, y, scaler

    def build_cnn_model(input_shape):
        model = Sequential()
        model.add(Conv1D(filters=1, kernel_size=10, activation='relu', input_shape=(1, 1, 3)))
        model.add(MaxPooling1D(pool_size=2))
        model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
        model.add(MaxPooling1D(pool_size=2))
        model.add(Flatten())
        model.add(Dense(64, activation='relu'))
        model.add(Dense(1))
        return model

    def train_and_predict_next_frequency(folder_path, current_scheduled, current_actual):
        data = load_data(folder_path)
        X, y, scaler = preprocess_data(data)
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
        input_shape = (X_train.shape[1], 1)
        X_train_reshaped = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
        X_test_reshaped = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
        print(input_shape)
        model = build_cnn_model(input_shape)
        model.compile(optimizer='adam', loss='mean_squared_error')
        model.fit(X_train_reshaped, y_train, epochs=50, batch_size=32, verbose=1)
        mse = model.evaluate(X_test_reshaped, y_test, verbose=0)
        print("Mean Squared Error:", mse)
        scaled_input = scaler.transform([[current_scheduled, current_actual]])
        input_reshaped = scaled_input.reshape(1, 2, 1)  # Reshape for CNN
        next_frequency_scaled = model.predict(input_reshaped)
        next_frequency = scaler.inverse_transform(next_frequency_scaled.reshape(-1, 1))
        return next_frequency[0][0]

    folder_path = "Data"  # Folder containing the Excel files
    current_scheduled = 10
    current_actual = 8
    predicted_frequency = train_and_predict_next_frequency(folder_path, current_scheduled, current_actual)
    print("Predicted Next Frequency:", predicted_frequency)
and I am getting this error:
ValueError: One of the dimensions in the output is <= 0 due to downsampling in conv1d_10. Consider increasing the input size. Received input shape [None, 1, 1, 3] which would produce output shape with a zero or negative value in a dimension.
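As far as I can tell, the message is about output-length arithmetic: with the default "valid" padding, a 1-D convolution over n time steps with kernel size k and stride s leaves floor((n - k) / s) + 1 steps. The sketch below just evaluates that formula; `conv1d_output_length` is a hypothetical helper of mine, not a Keras function, and reading the reported shape [None, 1, 1, 3] as 1 time step with 3 channels is my assumption.

```python
def conv1d_output_length(steps, kernel_size, stride=1, padding="valid"):
    """Number of time steps left after a 1-D convolution (dilation of 1)."""
    if padding == "same":
        # "same" padding keeps ceil(steps / stride) steps.
        return -(-steps // stride)
    if padding == "valid":
        # "valid" padding only keeps positions where the kernel fully fits.
        return (steps - kernel_size) // stride + 1
    raise ValueError("padding must be 'valid' or 'same'")


# The hard-coded input_shape=(1, 1, 3) would mean 1 time step, kernel_size=10.
print(conv1d_output_length(1, 10))   # -8
# Even the 3 feature columns treated as 3 steps are too short for the kernel.
print(conv1d_output_length(3, 10))   # -6
# A hypothetical window of 24 time steps would leave a positive length.
print(conv1d_output_length(24, 10))  # 15
```

So with `kernel_size=10` the input seems to need at least 10 time steps (or a smaller kernel, or `padding="same"`) before the first layer can produce any output, and each `MaxPooling1D(pool_size=2)` roughly halves what is left.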
To restate the goal: predict A using the previous values of A, B, and C. (What counts as "previous"? We have the timeblock to tell us that.)
I tried changing the layer functions, but it didn't work. How can I achieve the above?