While learning PyTorch, I've tried to replicate an existing TensorFlow architecture, but I've run into a significant performance gap: the TensorFlow model learns rapidly and generalizes well within 25 epochs, whereas the PyTorch model needs at least 250 epochs to reach comparable generalization. I've scrutinized the code carefully and aligned the architectures of both networks, yet the disparity persists. Can anyone shed light on what else might be amiss here?
Below I present the full Python code for both implementations, along with the CLI output and plots.
PyTorch code:
# Standard library imports
import pandas as pd
import matplotlib.pyplot as plt

# External library imports
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, MinMaxScaler, StandardScaler
from sklearn.metrics import max_error, mean_absolute_error, mean_squared_error

# Loading dataset
df_data = pd.read_csv("./data_inverter.csv", names=["pvt", "edge", "slew", "load", "delay"])

# Selecting subset of data based on specific conditions
df_select = df_data[(df_data["pvt"] == "PtypV1500T027") & (df_data["edge"] == "rise")]

# Splitting features and target variable
X = df_select.drop(["pvt", "edge", "delay"], axis='columns')
y = df_select["delay"]

# Scaling input features using Min-Max scaling
slew_scaler = MinMaxScaler()
load_scaler = MinMaxScaler()
X_scaled = X.copy()
X_scaled["slew"] = slew_scaler.fit_transform(X_scaled.slew.values.reshape(-1, 1))
X_scaled["load"] = load_scaler.fit_transform(X_scaled.load.values.reshape(-1, 1))

# Splitting data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.1, random_state=42)

# Converting data to PyTorch tensors
X_train_tensor = torch.FloatTensor(X_train.values)
y_train_tensor = torch.FloatTensor(y_train.values).view(-1, 1)
X_test_tensor = torch.FloatTensor(X_test.values)
y_test_tensor = torch.FloatTensor(y_test.values).view(-1, 1)

# Setting random seed for reproducibility
torch.manual_seed(42)

# Defining neural network architecture
model = torch.nn.Sequential(
    torch.nn.Linear(X_train_tensor.shape[1], 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 1),
    torch.nn.ELU()
)

# Loss function and optimizer
criterion = torch.nn.MSELoss()
criterion_val = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters())

# Training the model
num_epochs = 25
progress = {'loss': [], 'mae': [], 'mse': [],
            'val_loss': [], 'val_mae': [], 'val_mse': []}
for epoch in range(num_epochs):
    # Forward pass
    y_predict = model(X_train_tensor)
    loss = criterion(y_predict, y_train_tensor)

    # Backward and optimize
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    # Validation
    with torch.no_grad():
        model.eval()
        y_test_predict = model(X_test_tensor)
        loss_val = criterion_val(y_test_predict, y_test_tensor)
        model.train()

    # Record progress
    progress['loss'].append(loss.item())
    progress['mae'].append(mean_absolute_error(y_train_tensor, y_predict.detach().numpy()))
    progress['mse'].append(mean_squared_error(y_train_tensor, y_predict.detach().numpy()))
    progress['val_loss'].append(loss_val.item())
    progress['val_mae'].append(mean_absolute_error(y_test_tensor, y_test_predict.detach().numpy()))
    progress['val_mse'].append(mean_squared_error(y_test_tensor, y_test_predict.detach().numpy()))
    print("Epoch %i/%i - loss: %0.5f" % (epoch, num_epochs, loss.item()))

# Displaying model summary
print(model)

# Plotting training progress
df_progress = pd.DataFrame(progress)
df_progress.plot()
plt.title("Model training progress: DNN PyTorch")
plt.tight_layout()
plt.show()

# Making predictions on the testing set
with torch.no_grad():
    model.eval()
    y_predict_tensor = model(X_test_tensor)
    y_predict = y_predict_tensor.numpy()

# Displaying model performance metrics
print("Model performance metrics: DNN PyTorch")
print("MAX error:", max_error(y_test_tensor, y_predict))
print("MAE error:", mean_absolute_error(y_test_tensor, y_predict))
print("MSE error:", mean_squared_error(y_test_tensor, y_predict, squared=False))
plt.scatter(y_test, y_predict)
plt.scatter(y_test, y_test, marker='.')
plt.title("Model predictions: DNN PyTorch")
plt.tight_layout()
plt.show()

TensorFlow code:
# Standard library imports
import pandas as pd
import matplotlib.pyplot as plt

# External library imports
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, MinMaxScaler, StandardScaler
from sklearn.metrics import max_error, mean_absolute_error, mean_squared_error

# Loading dataset
df_data = pd.read_csv("./data_inverter.csv", names=["pvt", "edge", "slew", "load", "delay"])

# Selecting subset of data based on specific conditions
df_select = df_data[(df_data["pvt"] == "PtypV1500T027") & (df_data["edge"] == "rise")]

# Splitting features and target variable
X = df_select.drop(["pvt", "edge", "delay"], axis='columns')
y = df_select["delay"]

# Scaling input features using Min-Max scaling
slew_scaler = MinMaxScaler()
load_scaler = MinMaxScaler()
X_scaled = X.copy()
X_scaled["slew"] = slew_scaler.fit_transform(X_scaled.slew.values.reshape(-1, 1))
X_scaled["load"] = load_scaler.fit_transform(X_scaled.load.values.reshape(-1, 1))

# Splitting data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.1, random_state=42)

# Converting data to TensorFlow tensors
X_train_tensor = tf.constant(X_train.values, dtype=tf.float32)
y_train_tensor = tf.constant(y_train.values, dtype=tf.float32)
X_test_tensor = tf.constant(X_test.values, dtype=tf.float32)
y_test_tensor = tf.constant(y_test.values, dtype=tf.float32)

# Setting random seed for reproducibility
tf.keras.utils.set_random_seed(42)

# Defining neural network architecture
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_dim=X_train_tensor.shape[1]),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='elu')
])

# Compiling the model
model.compile(
    loss=tf.keras.losses.MeanSquaredError(),  # Using Mean Squared Error loss function
    optimizer=tf.keras.optimizers.Adam(),     # Using Adam optimizer
    metrics=['mae', 'mse']                    # Using Mean Absolute Error and Mean Squared Error as metrics
)

# Training the model
progress = model.fit(X_train_tensor, y_train_tensor, validation_data=(X_test_tensor, y_test_tensor), epochs=25)

# Evaluating model performance on the testing set
model.evaluate(X_test_tensor, y_test_tensor, verbose=2)

# Displaying model summary
print(model.summary())

# Plotting training progress
pd.DataFrame(progress.history).plot()
plt.title("Model training progress: DNN TensorFlow")
plt.tight_layout()
plt.show()

# Making predictions on the testing set
y_predict = model.predict(X_test_tensor)

# Displaying model performance metrics
print("Model performance metrics: DNN TensorFlow")
print("MAX error:", max_error(y_test_tensor, y_predict))
print("MAE error:", mean_absolute_error(y_test_tensor, y_predict))
print("MSE error:", mean_squared_error(y_test_tensor, y_predict, squared=False))
plt.scatter(y_test, y_predict)
plt.scatter(y_test, y_test, marker='.')
plt.title("Model predictions: DNN TensorFlow")
plt.tight_layout()
plt.show()

CLI output of PyTorch model performance metrics after 25 epochs:
Sequential(
  (0): Linear(in_features=2, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=128, bias=True)
  (3): ReLU()
  (4): Linear(in_features=128, out_features=64, bias=True)
  (5): ReLU()
  (6): Linear(in_features=64, out_features=32, bias=True)
  (7): ReLU()
  (8): Linear(in_features=32, out_features=16, bias=True)
  (9): ReLU()
  (10): Linear(in_features=16, out_features=1, bias=True)
  (11): ELU(alpha=1.0)
)
Model performance metrics: DNN PyTorch
MAX error: 1.2864852
MAE error: 0.3353702
MSE error: 0.42874745

CLI output of TensorFlow model performance metrics after 25 epochs:
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense (Dense)               (None, 128)               384
 dense_1 (Dense)             (None, 128)               16512
 dense_2 (Dense)             (None, 64)                8256
 dense_3 (Dense)             (None, 32)                2080
 dense_4 (Dense)             (None, 16)                528
 dense_5 (Dense)             (None, 1)                 17
=================================================================
Total params: 27777 (108.50 KB)
Trainable params: 27777 (108.50 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
None
6/6 [==============================] - 0s 750us/step
Model performance metrics: DNN TensorFlow
MAX error: 0.013849139
MAE error: 0.0029576812
MSE error: 0.0036013061

PyTorch scatter plot (orange = target against itself, blue = target against prediction):
TensorFlow scatter plot (orange = target against itself, blue = target against prediction):
..............................................................................................................................................
Appending additional info (reacting to the questions and comments):
torch.optim.Adam - the default learning rate is set to 0.001.
tf.keras.optimizers.Adam - the default learning rate is set to 0.001.
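To confirm that both optimizers really start from the same step size, the defaults can be inspected programmatically (a quick sanity check, not part of the training scripts above):

```python
import torch

# Inspect the default hyperparameters of torch.optim.Adam
opt = torch.optim.Adam(torch.nn.Linear(2, 1).parameters())
print(opt.defaults["lr"])  # 0.001

# The Keras counterpart can be checked the same way (requires TensorFlow):
# import tensorflow as tf
# print(float(tf.keras.optimizers.Adam().learning_rate))  # 0.001
```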
..............................................................................................................................................
CLI output of PyTorch model performance metrics after 250 epochs:
Sequential(
  (0): Linear(in_features=2, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=128, bias=True)
  (3): ReLU()
  (4): Linear(in_features=128, out_features=64, bias=True)
  (5): ReLU()
  (6): Linear(in_features=64, out_features=32, bias=True)
  (7): ReLU()
  (8): Linear(in_features=32, out_features=16, bias=True)
  (9): ReLU()
  (10): Linear(in_features=16, out_features=1, bias=True)
  (11): ELU(alpha=1.0)
)
Model performance metrics: DNN PyTorch
MAX error: 0.025619686
MAE error: 0.006687804
MSE error: 0.008531998



