Dataset Structure: Temporal directed graph; nodes have features; edges don't have features; nodes are labelled. Using the Elliptic dataset.
Task: Classify nodes (predict node labels).
Data Structure: 2 .csv files of nodes and edges.
- For the nodes csv: #Rows = #Nodes and #Columns = #Features
- For the edges csv: #Rows = #Edges
- Finally, both files are converted into tensors and turned into a PyTorch Geometric Data object.
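For reference, the CSV-to-graph step can be sketched roughly as follows. This is a dependency-free illustration, not the actual loading code: the column names (`node_id`, `src`, `dst`), the inline CSV contents, and the helper `load_graph` are all made up for the example. The part that matters is mapping raw node IDs to contiguous row indices before building the edge index.

```python
import csv
import io

# Hypothetical miniature stand-ins for the two CSV files (made-up values).
NODES_CSV = """node_id,f0,f1
101,0.5,1.0
205,0.1,0.2
309,0.9,0.3
"""
EDGES_CSV = """src,dst
101,205
205,309
"""


def load_graph(nodes_file, edges_file):
    """Return (x, edge_index, id_to_idx) as plain Python containers."""
    node_rows = list(csv.DictReader(nodes_file))
    # Raw node IDs are arbitrary, so map each one to a row index 0..N-1.
    id_to_idx = {row["node_id"]: i for i, row in enumerate(node_rows)}
    feature_cols = [c for c in node_rows[0] if c != "node_id"]
    # One row per node, one column per feature.
    x = [[float(row[c]) for c in feature_cols] for row in node_rows]

    # COO layout: edge_index[0] holds source indices, edge_index[1] targets.
    edge_index = [[], []]
    for row in csv.DictReader(edges_file):
        edge_index[0].append(id_to_idx[row["src"]])
        edge_index[1].append(id_to_idx[row["dst"]])
    return x, edge_index, id_to_idx


x, edge_index, id_to_idx = load_graph(io.StringIO(NODES_CSV), io.StringIO(EDGES_CSV))
print(len(x), "nodes,", len(x[0]), "features,", len(edge_index[0]), "edges")
```

From here, the lists would go through torch.tensor (float dtype for the features, long dtype for the edge index) and into torch_geometric.data.Data(x=..., edge_index=..., y=...).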
I want to train various Graph Neural Networks on the data and extract node embeddings from the networks. I know that is possible because the authors of the Elliptic dataset extracted node embeddings from a GCN.
Below is the code for the GAT I am using.
    import torch
    import torch.nn.functional as F
    from torch_geometric.nn import GATv2Conv


    class GAT(torch.nn.Module):
        """Graph Attention Network"""

        def __init__(self, dim_in, dim_h, dim_out, heads=24):
            super().__init__()
            self.gat1 = GATv2Conv(dim_in, dim_h, heads=heads)
            # gat1 concatenates its heads, so gat2 receives dim_h * heads inputs
            self.gat2 = GATv2Conv(dim_h * heads, dim_out, heads=1)
            self.optimizer = torch.optim.Adam(self.parameters(), lr=0.25, weight_decay=5e-4)

        def forward(self, x, edge_index):
            h = F.dropout(x, p=0.5, training=self.training)
            h = self.gat1(h, edge_index)  # pass h, not x, so the input dropout takes effect
            h = F.elu(h)
            h = F.dropout(h, p=0.5, training=self.training)
            h = self.gat2(h, edge_index)
            # Returns the raw gat2 output and its log-probabilities
            return h, F.log_softmax(h, dim=1)

This function returns a trained model:
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


    def train(model, data, epochs=200):
        """Train a GNN model and return the trained model."""
        # CrossEntropyLoss applies log_softmax itself; feeding it log-probabilities
        # gives the same loss, though NLLLoss is the more usual pairing.
        criterion = torch.nn.CrossEntropyLoss()
        optimizer = model.optimizer
        model = model.to(device)
        data = data.to(device)  # move features, edges, labels and masks once

        model.train()
        for epoch in range(epochs + 1):
            # Training step
            optimizer.zero_grad()
            _, out = model(data.x, data.edge_index)
            loss = criterion(out[data.train_mask], data.y[data.train_mask])
            loss.backward()
            optimizer.step()

            # Print metrics every 10 epochs
            if epoch % 10 == 0:
                print(f'Epoch {epoch:>3} | Train Loss: {loss.item():.3f}')

        return model

What modifications do I need to make to the code to extract the node embeddings?
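One direction that may work without touching the model code is a forward hook on the hidden layer, so the intermediate activations are captured during a normal forward pass. The sketch below is only an illustration under assumptions: `extract_embeddings` is a hypothetical helper, and `ToyModel` is a made-up stand-in built from linear layers so the snippet runs without torch_geometric. It has not been tried on the Elliptic data.

```python
import torch
import torch.nn.functional as F


def extract_embeddings(model, layer, *inputs):
    """Run one forward pass and return the detached output of `layer`."""
    captured = {}

    def hook(module, args, output):
        # Called by PyTorch right after `layer` computes its output.
        captured["h"] = output.detach()

    handle = layer.register_forward_hook(hook)
    was_training = model.training
    model.eval()  # disable dropout so the embeddings are deterministic
    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        handle.remove()            # always unregister the hook
        model.train(was_training)  # restore the previous mode
    return captured["h"]


class ToyModel(torch.nn.Module):
    """Hypothetical stand-in with the same forward signature as the GAT."""

    def __init__(self, dim_in, dim_h, dim_out):
        super().__init__()
        self.layer1 = torch.nn.Linear(dim_in, dim_h)
        self.layer2 = torch.nn.Linear(dim_h, dim_out)

    def forward(self, x, edge_index):
        # edge_index is ignored here; it only mirrors the GAT signature.
        h = F.elu(self.layer1(x))
        h = F.dropout(h, p=0.5, training=self.training)
        h = self.layer2(h)
        return h, F.log_softmax(h, dim=1)


torch.manual_seed(0)
x = torch.randn(5, 4)                              # 5 nodes, 4 features (made up)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])  # made-up edges
toy = ToyModel(dim_in=4, dim_h=8, dim_out=2)
emb = extract_embeddings(toy, toy.layer1, x, edge_index)
print(emb.shape)  # one row per node, one column per hidden unit
```

With the GAT above, the call would presumably be extract_embeddings(model, model.gat1, data.x.to(device), data.edge_index.to(device)), which should give a [num_nodes, dim_h * heads] tensor taken before the ELU. The first value that forward already returns is the gat2 output, so it can also serve as a dim_out-sized embedding if that is enough.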