Graph Neural Networks - Team Study (Part 3)

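This task finally reaches the hands-on stage: given the node attributes (categorical or numerical), the edge information, the edge attributes (if any), and the labels of the known nodes, we predict the labels of the unlabeled nodes.
We compare MLP, GCN, and GAT on the node classification task on the Cora dataset, looking at both their accuracy and the quality of the node representations they learn.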

1. Introduction to the Cora Dataset

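The Cora graph has 2708 nodes and 10556 edges, with an average node degree of 3.9. It is an undirected graph with no isolated nodes. We train on 140 nodes with ground-truth labels (20 per class), so labeled nodes make up only about 5% of all nodes.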

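from torch_geometric.datasets import Planetoid
from torch_geometric.transforms import NormalizeFeatures
dataset = Planetoid(root='./dataset/Cora', name='Cora', transform=NormalizeFeatures())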
print()
print(f'Dataset:{dataset}:')
print(f'Number of graphs:{len(dataset)}')
print(f'Number of features:{dataset.num_features}')
print(f'Number of classes:{dataset.num_classes}')
data = dataset[0]
print()
print(data)
print('===========')
print(f'Number of nodes:{data.num_nodes}')
print(f'Number of edges:{data.num_edges}')
print(f'Average node degree:{data.num_edges/data.num_nodes:.2f}')
print(f'Number of training nodes:{data.train_mask.sum()}')
print(f'Training node label rate: {int(data.train_mask.sum()) / data.num_nodes:.2f}')
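# has_isolated_nodes() / has_self_loops() are the PyG 2.x names; releases before 2.0 call them contains_isolated_nodes() / contains_self_loops()
print(f'Contains isolated nodes: {data.has_isolated_nodes()}')
print(f'Contains self-loops: {data.has_self_loops()}')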
print(f'Is undirected: {data.is_undirected()}')

The output appeared as a screenshot in the original post.

2. MLP Node Classifier

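The MLP consists of two linear layers, a ReLU nonlinearity, and a dropout operation. The first linear layer embeds the 1433-dimensional feature vector into a low-dimensional space, and the second linear layer maps the node representation into the class space. The network is then trained with a cross-entropy loss and the Adam optimizer.
Construction of the MLP: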

import torch
from torch.nn import Linear
import torch.nn.functional as F
class MLP(torch.nn.Module):
    def __init__(self, hidden_channels):
        super(MLP, self).__init__()
        torch.manual_seed(12345)
        self.lin1 = Linear(dataset.num_features, hidden_channels)
        self.lin2 = Linear(hidden_channels, dataset.num_classes)
    def forward(self, x):
        x = self.lin1(x)
        x = x.relu()
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.lin2(x)
        return x

Training code:

model = MLP(hidden_channels=16)
# print(model)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)
def train():
    model.train()
    optimizer.zero_grad()
    out = model(data.x)
    loss = criterion(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
    return loss
for epoch in range(1, 201):
    loss = train()
    print(f'Epoch:{epoch:03d}, Loss:{loss:.4f}')

Testing its classification accuracy:

def test():
    model.eval()
    out = model(data.x)
    pred = out.argmax(dim=1)
    test_correct = pred[data.test_mask] == data.y[data.test_mask]
    test_acc = int(test_correct.sum()) / int(data.test_mask.sum())
    return test_acc
test_acc = test()
print(f'Test Accuracy:{test_acc:.4f}')

The classification accuracy is only about 59%.

3. GCN and Its Application to Classification

GCNConv constructor interface:

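# GCNConv(in_channels: int, out_channels: int, improved: bool = False, cached: bool = False, add_self_loops: bool = True, normalize: bool = True, bias: bool = True, **kwargs)
# improved: if True, A~ = A + 2I, which strengthens the central node's own information
# cached: whether to store D~^(-1/2) A~ D~^(-1/2) for reuse on later calls; set to True only in transductive learning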
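
For reference, the layer that these flags configure can be written as a single propagation rule. This is the rule as I understand it from the GCN formulation documented for PyG, not something stated in the notes above; the symbol $\Theta$ for the layer's weight matrix and $X$ for the node feature matrix are notational assumptions.

```latex
X' = \tilde{D}^{-1/2} \, \tilde{A} \, \tilde{D}^{-1/2} X \Theta,
\qquad
\tilde{A} = A + I \quad (\text{or } A + 2I \text{ when improved=True}),
\qquad
\tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}
```

The product $\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ depends only on the graph, which is why it can be cached when the same graph is used on every call.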

Replacing the torch.nn.Linear layers with PyG's GNN conv layers turns the MLP model into a GNN model.

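The visualize function (a helper that is not shown in this post) embeds the 7-dimensional node outputs into a 2D plane. The original post showed the resulting plot for the untrained model as a figure here.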

Next we train the GCN node classifier and check its accuracy:

from torch_geometric.nn import GCNConv
class GCN(torch.nn.Module):
    def __init__(self, hidden_channels):
        super(GCN, self).__init__()
        torch.manual_seed(12345)
        self.conv1 = GCNConv(dataset.num_features, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, dataset.num_classes)
    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = x.relu()
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index)
        return x
model = GCN(hidden_channels=16)
# print(model)
# Forward pass through the untrained model, used for the "before training" plot.
model.eval()
out = model(data.x, data.edge_index)
# visualize(out, color=data.y)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)
criterion = torch.nn.CrossEntropyLoss()
def train():
    model.train()
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = criterion(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
    return loss
for epoch in range(1, 201):
    loss = train()
    print(f'Epoch:{epoch:03d}, Loss:{loss:.4f}')
def test():
    model.eval()
    out = model(data.x, data.edge_index)
    pred = out.argmax(dim=1)
    test_correct = pred[data.test_mask] == data.y[data.test_mask]
    test_acc = int(test_correct.sum()) / int(data.test_mask.sum())
    return test_acc
test_acc = test()
print(f'Test Accuracy:{test_acc:.4f}')

The training result appeared as a screenshot in the original post.

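Swapping the linear layers for GCN layers raises the test accuracy to 81.4%, a large improvement over the MLP. This shows that the neighborhood information of the nodes plays a key role in reaching better accuracy.
Visualizing the node representations output by the trained model shows that nodes of the same class cluster together (shown as a figure in the original post).
Tying the GCN design back to the material from the previous section, the layer can be written as: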

from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops,degree
class GCN(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super(GCN, self).__init__(aggr='add')
        self.lin = torch.nn.Linear(in_channels, out_channels)
    def forward(self, x, edge_index):
        #step1:Add self-loops to the adjacency matrix.
        edge_index, _=add_self_loops(edge_index, num_nodes=x.size(0))
        #step2:Linearly transform node feature matrix.
        x = self.lin(x)
        #step3:Compute normalization
        row, col = edge_index
        deg = degree(col, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]
        #step4-5:Start propagating messages.
        return self.propagate(edge_index, x=x,norm=norm)
    def message(self, x_j, norm):
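        #step4:Normalize node features
        #x_j is a "lifted" tensor of shape [num_edges, out_channels] holding the source node features of each edge
        #node-level tensors passed to propagate() are lifted automatically when _i or _j is appended to the variable name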
        return norm.view(-1, 1) * x_j
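
To make steps 1 and 3 concrete, here is a small worked example in plain Python, without PyG. The three-node path graph and all variable names are my own illustration, not part of the original code; the arithmetic mirrors what add_self_loops and degree compute in the layer above.

```python
import math

# Toy undirected path graph 0 - 1 - 2, stored as directed (source, target)
# pairs, mirroring the two rows of a PyG edge_index.
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
num_nodes = 3

# Step 1: add a self-loop to every node.
edges_with_loops = edges + [(i, i) for i in range(num_nodes)]

# Step 3a: degree of each node, counted over the target index of each edge.
deg = [0] * num_nodes
for _, target in edges_with_loops:
    deg[target] += 1

# Step 3b: one coefficient per edge, 1 / sqrt(deg[source] * deg[target]).
norm = {
    (source, target): 1.0 / math.sqrt(deg[source] * deg[target])
    for source, target in edges_with_loops
}

print(deg)                      # degrees including the self-loops
print(round(norm[(0, 1)], 4))   # coefficient on the edge 0 -> 1
print(round(norm[(1, 1)], 4))   # coefficient on node 1's self-loop
```

Each message x_j is multiplied by its edge's coefficient before the 'add' aggregation sums the messages arriving at a node, so high-degree nodes contribute and receive smaller per-edge weights.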

4. GAT and Its Application to Classification

GATConv constructor interface:

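# GATConv(in_channels: Union[int, Tuple[int, int]], out_channels: int, heads: int = 1, concat: bool = True, negative_slope: float = 0.2, dropout: float = 0.0, add_self_loops: bool = True, bias: bool = True, **kwargs)
# heads: number of attention heads used in GATConv
# concat: if True, the node representations from the different heads are concatenated (the dimension becomes heads * out_channels); otherwise they are averaged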
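
The effect of the concat flag on the output dimension can be sketched in plain Python. The three head outputs below are made-up numbers and combine_heads is a hypothetical helper written for illustration; it is not a PyG function.

```python
# Hypothetical outputs of 3 attention heads for one node, each with
# out_channels = 2.
head_outputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]


def combine_heads(heads, concat=True):
    """Combine per-head node representations as the concat flag describes."""
    if concat:
        # Concatenate: the dimension becomes heads * out_channels.
        return [value for head in heads for value in head]
    # Average: the dimension stays at out_channels.
    return [sum(values) / len(heads) for values in zip(*heads)]


concatenated = combine_heads(head_outputs, concat=True)
averaged = combine_heads(head_outputs, concat=False)
print(len(concatenated), concatenated)
print(len(averaged), averaged)
```

This matters when stacking layers: with concat=True, the next layer's in_channels has to be heads * out_channels of the previous layer.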

This time we replace the linear layers with GATConv layers to build a GAT-based node classification network.

from torch_geometric.nn import GATConv
class GAT(torch.nn.Module):
    def __init__(self, hidden_channels):
        super(GAT, self).__init__()
        torch.manual_seed(12345)
        self.conv1 = GATConv(dataset.num_features, hidden_channels)
        self.conv2 = GATConv(hidden_channels, dataset.num_classes)
    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = x.relu()
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index)
        return x

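Training and testing follow the same procedure as for GCN above. The final classification accuracy of GAT is 73.8%.
The original post also showed a visualization of the resulting node clusters at this point.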

5. Summary

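MLP considers only each node's own attributes and ignores the connections between nodes, so its result is the worst. GCN and GAT take both a node's own attributes and those of its neighbors into account, and both outperform MLP. The difference between GCN and GAT lies in how they normalize neighbor information during aggregation:

GCN computes the normalization coefficient from the degrees of the central node and the neighbor node, while GAT computes it from the similarity between the central node and the neighbor node.
GCN's normalization depends on the topology of the graph.
GAT's normalization depends on the similarity between the central node and its neighbors. It is learned during training rather than fixed by node degrees, although aggregation still runs only over each node's neighbors.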
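
Written out, the two kinds of coefficient look as follows. The notation is my own assumption rather than the original notes': $\tilde{d}_i$ is the degree of node $i$ counting its self-loop, $h_i$ is the linearly transformed feature of node $i$, $a$ is the learnable attention vector, and $\Vert$ denotes concatenation.

```latex
\text{GCN:}\quad c_{ij} = \frac{1}{\sqrt{\tilde{d}_i \, \tilde{d}_j}}
\qquad\qquad
\text{GAT:}\quad \alpha_{ij} =
\frac{\exp\left(\mathrm{LeakyReLU}\left(a^{\top} [\, h_i \,\Vert\, h_j \,]\right)\right)}
     {\sum_{k \in \mathcal{N}(i) \cup \{i\}} \exp\left(\mathrm{LeakyReLU}\left(a^{\top} [\, h_i \,\Vert\, h_k \,]\right)\right)}
```

The GCN coefficient is fixed once the graph is known, while the GAT coefficient changes with the learned parameters and the node features.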

Assignment

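Following the source code, use different graph convolution layers from PyG on different PyG datasets to implement a node classification or regression task.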

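The SGC model simplifies the layered structure of GCN by removing the activation function of each GCN layer, so no nonlinear transformation is applied between layers. It uses the relationships between nodes to directly average over each node's local neighborhood, and it repeats this 1-hop averaging several times to get the effect of stacked convolutions.
This simplified version of the graph convolution model is called Simple Graph Convolution (SGC).
SGC applies a fixed low-pass filter followed by a linear classifier. This structure greatly simplifies the training process of the original GCN.
[Figure omitted; source: arXiv:1902.07153, 2019]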
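
The repeated 1-hop averaging can be sketched in plain Python on a toy graph. This is a minimal illustration under my own assumptions (a dense normalized adjacency with self-loops, a made-up three-node path graph, hypothetical helper names), not the implementation used by PyG's SGConv.

```python
import math


def normalized_adjacency(edges, num_nodes):
    """Dense S = D~^(-1/2) (A + I) D~^(-1/2) for an undirected edge list."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loop
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]


def propagate(s, x, k):
    """Apply S to the feature matrix k times, with no nonlinearity in between."""
    for _ in range(k):
        x = [[sum(s[i][n] * x[n][f] for n in range(len(x)))
              for f in range(len(x[0]))]
             for i in range(len(s))]
    return x


# Toy path graph 0 - 1 - 2 with one scalar feature per node.
s = normalized_adjacency([(0, 1), (1, 2)], num_nodes=3)
x = [[1.0], [0.0], [0.0]]
one_hop = propagate(s, x, k=1)
two_hop = propagate(s, x, k=2)
print(one_hop)
print(two_hop)
```

After one step, node 2 has received nothing from node 0; after two steps it has, which is the stacking effect described above. Because the propagation has no trainable parameters, the smoothed features can be computed once and then passed to a linear classifier.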
Here I use SGC to see how it performs on the classification task:

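from torch_geometric.nn import SGConv
# Note: SGConv defaults to K=1, so two SGConv layers with a ReLU in between
# behave roughly like the two-layer GCN above. A closer match to the SGC
# description would be a single SGConv(in_channels, out_channels, K=2) layer.
class SGC(torch.nn.Module):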
    def __init__(self, hidden_channels):
        super(SGC, self).__init__()
        torch.manual_seed(12345)
        self.conv1 = SGConv(dataset.num_features, hidden_channels)
        self.conv2 = SGConv(hidden_channels, dataset.num_classes)
    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = x.relu()
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index)
        return x

The classification accuracy is 81.10%.

Original post: https://blog.csdn.net/Etc_in_the_great/article/details/118109459