RE-NET

Notes on the paper "Recurrent Event Network: Autoregressive Structure Inference over Temporal Knowledge Graphs"

RGCN

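W_r^(l) is the weight matrix of one specific relation r (e.g. impeach, initiate diplomatic relations) in the l-th GCN layer. Every relation r has its own weight matrix, so r₁ corresponds to W₁ and so on.

h_o^(l) is the hidden representation / feature vector (hidden state / embedding) of one specific object o in the l-th GCN layer, i.e. the object o that the current entity s points to. h_s^(l) is defined in the same way for s.

W_o^(l) is the self-loop weight matrix: at every layer it is multiplied with node s's own features, and the result is carried into the next layer, where the same thing happens again.

N_t^(s,r) is the set of neighbors (neighborhood) of node s under relation r at timestamp t. N_t^(s) is defined in the same way over all relations.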

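c_s is the normalization constant. In the formula it accounts for the double sum over the relations r and the objects o of entity s at time t: however many neighbor messages are summed, they are averaged out. Put simply, c_s is the total number of neighbors of entity s at time t.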

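The role of the RGCN aggregator is to aggregate the information of node s's neighbors. Read against the formula: when RGCN updates the features of node s, it takes into account the node's own features, the features of its neighbors, and the type of relation between them (different relations use different weight matrices).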
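
For reference, the symbols defined above fit together as follows. This is written out from those definitions, assuming the usual RGCN layout; the nonlinearity σ is an assumed symbol that the notes above do not define.

```latex
h_s^{(l+1)} = \sigma\!\left(
    \sum_{r} \sum_{o \in N_t^{(s,r)}} \frac{1}{c_s}\, W_r^{(l)} h_o^{(l)}
    \;+\; W_o^{(l)} h_s^{(l)}
\right)
```

The double sum is the neighbor aggregation, and the last term is the self-loop contribution.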

The RGCN module code is as follows:

import torch
import torch.nn as nn
import dgl.function as fn


class RGCNLayer(nn.Module):
    def __init__(self, in_feat, out_feat, bias=None, activation=None, self_loop=False, dropout=0.0):
        super(RGCNLayer, self).__init__()
        self.activation = activation
        self.self_loop = self_loop

        if bias:
            # xavier_uniform_ needs a tensor with at least 2 dimensions,
            # so the 1-D bias is zero-initialized instead
            self.bias = nn.Parameter(torch.zeros(out_feat))
        else:
            self.bias = None

        # Weight matrix
        if self.self_loop:
            # Self-loop weight matrix of shape (in_feat, out_feat)
            self.loop_weight = nn.Parameter(torch.Tensor(in_feat, out_feat))
            # xavier_uniform_ keeps the variance of a layer's inputs and outputs roughly equal,
            # so that across layers values neither grow until gradients vanish
            # nor shrink until the nonlinearity is effectively lost
            nn.init.xavier_uniform_(self.loop_weight, gain=nn.init.calculate_gain('relu'))

        if dropout:
            self.dropout = nn.Dropout(dropout)
        else:
            self.dropout = None

    def propagate(self, g, reverse):
        raise NotImplementedError

    def forward(self, g, reverse):
        if self.self_loop:
            # g is a DGL graph; g.ndata['h'] is its node-data field holding the node feature vectors.
            # In GNNs, node features may combine static attributes (node category, properties)
            # with structural ones (number of neighbors, edge weights).
            loop_message = torch.mm(g.ndata['h'], self.loop_weight)
            if self.dropout is not None:
                loop_message = self.dropout(loop_message)

        self.propagate(g, reverse)

        # Apply the bias, the self-loop message, and the activation
        node_repr = g.ndata['h']
        if self.bias is not None:
            node_repr = node_repr + self.bias
        if self.self_loop:
            node_repr = node_repr + loop_message
        if self.activation:
            node_repr = self.activation(node_repr)

        # Update the node feature vectors
        g.ndata['h'] = node_repr
        return g

    
class RGCNBlockLayer(RGCNLayer):  # RGCN block layer
    def __init__(self, in_feat, out_feat, num_rels, num_bases, bias=None, activation=None, self_loop=False, dropout=0.0):
        super(RGCNBlockLayer, self).__init__(in_feat, 
                                             out_feat, 
                                             bias,
                                             activation, 
                                             self_loop=self_loop,
                                             dropout=dropout)
        self.num_rels = num_rels
        self.num_bases = num_bases
        assert self.num_bases > 0

        self.out_feat = out_feat

        self.submat_in = in_feat // self.num_bases
        self.submat_out = out_feat // self.num_bases

        # assuming in_feat and out_feat are both divisible by num_bases
        # if self.num_rels == 2:
        #     self.in_feat = in_feat
        #     self.weight = nn.Parameter(torch.Tensor(
        #         self.num_rels, in_feat, out_feat))
        # else:
        self.weight = nn.Parameter(torch.Tensor(self.num_rels, self.num_bases * self.submat_in * self.submat_out))
        nn.init.xavier_uniform_(self.weight, gain=nn.init.calculate_gain('relu'))

    def msg_func(self, edges, reverse):  # multiply node features by the weights, then reshape to the output format
        if reverse:
            # edges.data['type_o'] holds the relation type of each edge;
            # it selects the weight matrix that belongs to that relation
            weight = self.weight.index_select(0, edges.data['type_o']).view(-1, self.submat_in, self.submat_out)
        else:
            weight = self.weight.index_select(0, edges.data['type_s']).view(-1, self.submat_in, self.submat_out)

        # edges.src['h'] holds the source node features, one row per edge: (num_edges, in_feat).
        # The view splits each row into num_bases chunks: (num_edges * num_bases, 1, submat_in).
        node = edges.src['h'].view(-1, 1, self.submat_in)
        # msg is the result of message passing.
        # It is a tensor of size (num_edges, out_feat), where out_feat is the output feature dimension.
        msg = torch.bmm(node, weight).view(-1, self.out_feat)
        return {'msg': msg}

    def propagate(self, g, reverse):
        g.update_all(lambda x: self.msg_func(x, reverse), fn.sum(msg='msg', out='h'), self.apply_func)

    def apply_func(self, nodes):
        return {'h': nodes.data['h'] * nodes.data['norm']}
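
The reshapes in msg_func are easier to follow with concrete sizes. The following is a minimal sketch in plain PyTorch (no DGL) that repeats the same view/bmm steps and compares them with an explicit block-diagonal matrix per edge. The sizes and the names h_src, blocks, and ref are illustrative assumptions, not part of the RE-Net code.

```python
import torch

num_edges, num_bases, in_feat, out_feat = 3, 2, 4, 6   # illustrative sizes
submat_in, submat_out = in_feat // num_bases, out_feat // num_bases

torch.manual_seed(0)
# One flattened weight row per edge, as index_select would return it.
weight = torch.randn(num_edges, num_bases * submat_in * submat_out)
h_src = torch.randn(num_edges, in_feat)   # stand-in for edges.src['h']

# Same reshapes as msg_func: one small (submat_in x submat_out) product per block.
blocks = weight.view(-1, submat_in, submat_out)
msg = torch.bmm(h_src.view(-1, 1, submat_in), blocks).view(-1, out_feat)

# Reference: build the full block-diagonal matrix of each edge explicitly.
ref = torch.stack([
    h_src[e] @ torch.block_diag(*blocks[e * num_bases:(e + 1) * num_bases])
    for e in range(num_edges)
])
```

Both msg and ref have shape (num_edges, out_feat) and agree up to floating-point error, which shows that each relation's weight acts as a block-diagonal matrix with num_bases blocks.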

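Reading the formula alongside the code, the self-loop term, the bias, and the activation correspond to this part of forward:

        # First multiply the node feature vectors by the self-loop weight matrix, then apply dropout
        loop_message = torch.mm(g.ndata['h'], self.loop_weight)
        if self.dropout is not None:
            loop_message = self.dropout(loop_message)

        # (self.propagate(g, reverse) runs at this point, so g.ndata['h'] now holds the aggregated neighbor messages)

        # Then add the bias and the self-loop product to the node feature vectors and apply the activation.
        # The self-loop term lets each layer keep the previous layer's information, somewhat like the idea behind ResNet
        node_repr = g.ndata['h']
        if self.bias is not None:
            node_repr = node_repr + self.bias
        if self.self_loop:
            node_repr = node_repr + loop_message
        if self.activation:
            node_repr = self.activation(node_repr)

        # Use the final node representation (node_repr) as this layer's output node feature vectors
        g.ndata['h'] = node_repr
        return g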

The neighbor-aggregation term of the formula corresponds to:

    def msg_func(self, edges, reverse):
        # First look up the relation type of each edge (e.g. impeach / impeached by; elect / elected by)
        # and fetch the weight matrix that belongs to that relation
        if reverse:
            weight = self.weight.index_select(0, edges.data['type_o']).view(-1, self.submat_in, self.submat_out)
        else:
            weight = self.weight.index_select(0, edges.data['type_s']).view(-1, self.submat_in, self.submat_out)
        # Multiply the source node's feature vector by that weight
        node = edges.src['h'].view(-1, 1, self.submat_in)
        msg = torch.bmm(node, weight).view(-1, self.out_feat)
        return {'msg': msg}

    def propagate(self, g, reverse):
        # Sum the msg results that arrive at each node from all of its neighbors
        g.update_all(lambda x: self.msg_func(x, reverse), fn.sum(msg='msg', out='h'), self.apply_func)

    def apply_func(self, nodes):
        # Finally multiply by the normalization constant nodes.data['norm']
        return {'h': nodes.data['h'] * nodes.data['norm']}
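
The three steps above (message, sum, normalize) can be traced by hand on a toy graph. This is a minimal sketch in plain Python with scalar features and one scalar weight per relation; the graph, the numbers, and the choice norm = 1 / in-degree are assumptions made for illustration only, since the notes do not show how nodes.data['norm'] is filled in.

```python
from collections import defaultdict

# Hypothetical toy graph: (source, relation, destination) edges.
edges = [(0, "r1", 2), (1, "r1", 2), (3, "r2", 2), (0, "r2", 1)]
h = {0: 1.0, 1: 2.0, 2: 0.5, 3: 4.0}   # scalar node features
w = {"r1": 0.5, "r2": 2.0}              # one scalar weight per relation


def aggregate(edges, h, w):
    summed = defaultdict(float)
    degree = defaultdict(int)
    for src, rel, dst in edges:
        summed[dst] += w[rel] * h[src]   # msg_func followed by fn.sum
        degree[dst] += 1
    # apply_func: multiply by norm, assumed here to be 1 / in-degree
    return {n: summed[n] * (1.0 / degree[n]) for n in summed}


out = aggregate(edges, h, w)
```

Node 2 receives three messages (0.5, 1.0, and 8.0), so its result is 9.5 / 3. Node 1 receives a single message and keeps it unchanged at 2.0.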

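RGCN captures information from multi-hop neighbors by applying the graph convolution several times.

In the first iteration, each node aggregates information from its direct neighbors.
In the second iteration, each node aggregates information from its neighbors' neighbors (i.e. its second-hop neighbors).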
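
The growth of the receptive field can be checked on a tiny example. This is a minimal sketch in plain Python that tracks which nodes' information has reached each node after every round rather than computing real features; the chain graph and the names propagate and reach are hypothetical.

```python
# Hypothetical chain graph: 0 -> 1 -> 2 (edges point from source to destination).
edges = [(0, 1), (1, 2)]


def propagate(reach, edges):
    """One aggregation round: each node absorbs what its in-neighbors already know."""
    new = {n: set(s) for n, s in reach.items()}
    for src, dst in edges:
        new[dst] |= reach[src]
    return new


reach0 = {n: {n} for n in range(3)}   # layer 0: each node only knows itself
reach1 = propagate(reach0, edges)     # after the first iteration
reach2 = propagate(reach1, edges)     # after the second iteration
```

After one round node 2 has only seen node 1 and itself; after two rounds it has also seen node 0, its second-hop neighbor.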

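Reading the RE-Net architecture diagram

Task: predict the triples at timestamp t.

Given: the graphs of the past few time steps (the architecture diagram uses 3 time steps as its example).

RE-Net_global encodes the whole knowledge graph (the graphs that can be observed; in the architecture diagram the whole graph consists of the graphs of 3 time steps) and produces the global embedding (global_emb). At prediction time it outputs the subject and object distributions for the predicted timestamp t together with the updated global embedding.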

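RE-Net (train): s and r from the observable local graph, together with global_emb and the other inputs, are fed into the Aggregator (RGCN). The aggregator outputs two sequences, s_packed_input and s_packed_input_r, which go into encoder and encoder_r respectively to encode the entity and relation representations (self.encoder in the RE-Net class is defined as nn.GRU). The results are finally passed through the linear layers self.linear and self.linear_r (nn.Linear) and a dropout layer to predict the object and the relation, and a loss is computed for each.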
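
The encoder, dropout, linear layer, and loss described above can be sketched as follows. This is a minimal illustration under stated assumptions, not RE-Net's actual code: the aggregator output is replaced by random tensors, all sizes are made up, and the exact inputs to the linear layer in the real model may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
batch, seq_len, in_dim, h_dim, num_entities = 4, 3, 8, 16, 10   # illustrative sizes

encoder = nn.GRU(in_dim, h_dim, batch_first=True)
dropout = nn.Dropout(0.2)
linear = nn.Linear(h_dim, num_entities)

# Stand-in for the aggregator output: one padded history sequence per query,
# with true lengths in descending order as pack_padded_sequence expects by default.
seqs = torch.randn(batch, seq_len, in_dim)
lengths = torch.tensor([3, 3, 2, 1])
packed = nn.utils.rnn.pack_padded_sequence(seqs, lengths, batch_first=True)

_, h_n = encoder(packed)                    # h_n: (num_layers, batch, h_dim)
logits = linear(dropout(h_n.squeeze(0)))    # scores over candidate objects
targets = torch.randint(0, num_entities, (batch,))
loss = F.cross_entropy(logits, targets)
```

The relation branch (encoder_r and linear_r) would follow the same pattern with the number of relations as the output size.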

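RE-Net (valid, test): after RE-Net_global predicts the subject and object distributions at timestamp t, the top self.num_k entries are sampled; the subject and object lists, indices, and history caches are updated; the graph for the new time point is built; and the graph dictionary (graph_dict) and the global embedding (global_emb) are updated. These steps are then repeated.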
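
The top-k selection step can be illustrated on its own. This is a minimal sketch in plain Python; the function top_k, the probabilities, and the value of num_k are hypothetical and only show the idea of keeping the num_k most likely candidates from a predicted distribution.

```python
def top_k(probs, k):
    """Return (index, probability) pairs for the k largest entries."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


object_probs = [0.05, 0.40, 0.10, 0.30, 0.15]   # made-up predicted distribution
num_k = 2
candidates = top_k(object_probs, num_k)
candidate_ids = [idx for idx, _ in candidates]
```

Here candidate_ids contains entities 1 and 3, the two most probable objects, which would then be used to build the graph for the next time point.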