Abstract

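The paper addresses estimated time of arrival (ETA) prediction. A gap in existing work: few studies take structured graph data into account, let alone heterogeneous information networks.
The paper proposes the HetETA model to exploit heterogeneous graph data in the ETA task. Specifically:
(1) The road map is converted into a multi-relational network, and a vehicle-trajectory graph is introduced so that vehicle behavior patterns are considered jointly.
(2) Temporal information is split into recent periods, daily periods, and weekly periods, and each temporal component is modeled separately.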

1. Introduction & Related Work

(Not a full translation of the paper, only notes on the key points.)

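Problems with existing work:

Road networks are treated as homogeneous graphs, ignoring the differences between edges.
Most studies use datasets such as METR-LA or PEMS, whereas real traffic conditions are far more complex.
Some models work on small networks but are hard to scale to large ones.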

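This paper:

Takes a route-based ETA approach (rather than using sensor data). The rough idea is to first predict the travel time of each road segment and then sum these along the known trajectory.
Uses a GNN to model the road-network data. The modeling difficulties are:
(1) Compared with sensor data, which has a single type of relation, road-network (graph) data has more complex connections. For example, as in Figure 2(a), there may be different states such as turning left, turning right, and going straight; as in Figure 2(b), switching to a different road segment on a highway also requires slowing down.
(2) Road segments are more sparsely connected. Taking the Shenyang road network as an example, there are 74,685 nodes, and each node has 2.52 neighbors on average.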
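The route-based idea above (predict each segment, then sum along the known route) can be illustrated with a minimal sketch. The segment IDs, the `route_eta` helper, and the predicted times are all hypothetical; in HetETA the per-segment times would come from the model itself.

```python
from typing import Dict, List


def route_eta(route: List[str], segment_time: Dict[str, float]) -> float:
    """Sum predicted per-segment travel times (seconds) along a known route."""
    return sum(segment_time[seg] for seg in route)


# Hypothetical per-segment predictions, in seconds.
predicted = {"s1": 30.0, "s2": 45.5, "s3": 12.5}

eta = route_eta(["s1", "s2", "s3"], predicted)
print(eta)  # 88.0
```

The sum is only as good as the per-segment predictions, which is why the rest of the notes focus on how those are modeled.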

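To address these problems, the paper:

Introduces a heterogeneous information network (HIN), converting the road map into a multi-relational network (a kind of HIN, as in Figure 2(a)).
Also employs a vehicle-trajectory graph, whose nodes are the same as those of the multi-relational network and whose edges reflect the connection frequency between nodes.
Heterogeneity also covers temporal data, which is divided into three categories: recent periods, daily periods, and weekly periods.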
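As a rough illustration of the vehicle-trajectory graph, the sketch below counts how often consecutive nodes co-occur in a set of trajectories and uses the raw count as the edge weight. This is an assumption made for illustration: the notes only say that edges reflect connection frequency, so the paper's exact weighting (for example, any normalization or thresholding) may differ. The node names and the `trajectory_edge_counts` helper are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def trajectory_edge_counts(
    trajectories: Iterable[List[str]],
) -> Dict[Tuple[str, str], int]:
    """Count directed transitions between consecutive nodes in each trajectory."""
    counts: Counter = Counter()
    for traj in trajectories:
        # zip(traj, traj[1:]) yields each consecutive (from, to) pair.
        for src, dst in zip(traj, traj[1:]):
            counts[(src, dst)] += 1
    return dict(counts)


# Hypothetical trajectories over nodes shared with the road-network graph.
trajs = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
edges = trajectory_edge_counts(trajs)
print(edges)  # {('a', 'b'): 2, ('b', 'c'): 2}
```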

2. Methodology

Time slices fall into three categories. Using the notation from this blog post as an example (see figure):
Paper notes: "HetETA: Heterogeneous Information Network Embedding for Estimating Time of Arrival"

Recent component: $\mathbf{X}_{\mathcal{R}}=\left[X^{(t_q-L_{\mathcal{R}}+1)}, X^{(t_q-L_{\mathcal{R}}+2)}, \ldots, X^{(t_q)}\right] \in \mathbb{R}^{L_{\mathcal{R}} \times |V| \times n}$
Daily-period component: $\mathbf{X}_{\mathcal{D}}=\left[X^{(t_q+1-L_{\mathcal{D}} \cdot T_D)}, X^{(t_q+1-(L_{\mathcal{D}}-1) \cdot T_D)}, \ldots, X^{(t_q+1-T_D)}\right] \in \mathbb{R}^{L_{\mathcal{D}} \times |V| \times n}$
Weekly-period component: $\mathbf{X}_{\mathcal{W}}=\left[X^{(t_q+1-L_{\mathcal{W}} \cdot T_D \cdot 7)}, X^{(t_q+1-(L_{\mathcal{W}}-1) \cdot T_D \cdot 7)}, \ldots, X^{(t_q+1-T_D \cdot 7)}\right] \in \mathbb{R}^{L_{\mathcal{W}} \times |V| \times n}$
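To make the three index patterns concrete, here is a small sketch that lists which time-slice indices each component would pick. It reads $T_D$ as the number of time slices per day, which is an interpretation consistent with the formulas rather than something stated in these notes; the `period_indices` helper and the example values (288 five-minute slices per day, $t_q = 5000$) are hypothetical.

```python
from typing import List, Tuple


def period_indices(
    t_q: int, L_R: int, L_D: int, L_W: int, T_D: int
) -> Tuple[List[int], List[int], List[int]]:
    """Return time-slice indices for the recent, daily, and weekly components."""
    # Recent: the L_R slices ending at the query time t_q.
    recent = [t_q - L_R + 1 + i for i in range(L_R)]
    # Daily: the slice at t_q + 1, shifted back by k days, for k = L_D, ..., 1.
    daily = [t_q + 1 - k * T_D for k in range(L_D, 0, -1)]
    # Weekly: the same slice shifted back by k weeks, for k = L_W, ..., 1.
    weekly = [t_q + 1 - k * T_D * 7 for k in range(L_W, 0, -1)]
    return recent, daily, weekly


recent, daily, weekly = period_indices(t_q=5000, L_R=3, L_D=2, L_W=2, T_D=288)
print(recent)  # [4998, 4999, 5000]
print(daily)   # [4425, 4713]
print(weekly)  # [969, 2985]
```

Note that the daily and weekly components are anchored at $t_q+1$ (the slice to be predicted), while the recent component ends at $t_q$.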

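Each of these three time-slice components is fed into a module with the same architecture:

First, it goes through a Temporal Gated CNN, which models temporal correlations (see Section 2.1 of these notes).
Then the multi-relational network and the vehicle-trajectory graph are each fed into Het-ChebNet, which models spatial correlations (see Section 2.2 of these notes).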

2.1 Gated CNNs for Temporal Correlations

The idea in this section is similar to: Yu B, Yin H, Zhu Z. Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting[J]. IJCAI 2018. Convolution is applied along the time dimension to capture temporal correlations.

The input $X$ has width $C_{\mathrm{in}}$ (number of input channels), height $L_{\mathrm{in}}$ (length of the input sequence), and depth $|V|$ (number of nodes).
There are two sets of convolution kernels with the same structure; each set consists of $C_{\mathrm{out}}$ kernels of dimension $L_K \times 1 \times C_{\mathrm{in}}$. Note that the kernel depth is 1, so this is a 1-D convolution.
The output of one convolution is passed through a sigmoid function, which acts as a gate, and is then multiplied element-wise with the output of the other convolution.
This gives the hidden state $\mathbf{H}=(\mathbf{K}_1 \star \mathbf{x}) \odot \sigma(\mathbf{K}_2 \star \mathbf{x}) \in \mathbb{R}^{(L_{\mathrm{in}}-L_K+1) \times |V| \times C_{\mathrm{out}}}$
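The gated convolution above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: kernels are stored as arrays of shape `(L_K, C_in, C_out)`, no bias or padding is used, and the "convolution" is the usual deep-learning cross-correlation (no kernel flip). The helper names are hypothetical.

```python
import numpy as np


def temporal_conv(x: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid 1-D convolution along time, applied to every node independently.

    x:      (L_in, |V|, C_in)
    kernel: (L_K, C_in, C_out)
    returns (L_in - L_K + 1, |V|, C_out)
    """
    L_in, num_nodes, _ = x.shape
    L_K, _, C_out = kernel.shape
    out = np.zeros((L_in - L_K + 1, num_nodes, C_out))
    for t in range(L_in - L_K + 1):
        window = x[t:t + L_K]  # (L_K, |V|, C_in)
        # Sum over the kernel's time taps (k) and the input channels (c).
        out[t] = np.einsum("kvc,kco->vo", window, kernel)
    return out


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def gated_temporal_conv(x: np.ndarray, k1: np.ndarray, k2: np.ndarray) -> np.ndarray:
    """H = (K1 * x) ⊙ sigmoid(K2 * x)."""
    return temporal_conv(x, k1) * sigmoid(temporal_conv(x, k2))


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 5, 3))    # L_in=6, |V|=5, C_in=3
k1 = rng.normal(size=(2, 3, 4))   # L_K=2, C_out=4
k2 = rng.normal(size=(2, 3, 4))
h = gated_temporal_conv(x, k1, k2)
print(h.shape)  # (5, 5, 4), i.e. (L_in - L_K + 1, |V|, C_out)
```

Because the kernel spans only the time axis, each node is processed with the same weights and no information is exchanged between nodes at this stage; that is left to the spatial module.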

Looking at the model as a whole:

The initial input is $\mathbf{x}=\mathbf{X} \in \mathbb{R}^{L \times |V| \times n}$.
After the first gated CNN layer, we get $\mathbf{H}=\mathbf{H}_1 \in \mathbb{R}^{(L_{\mathrm{in}}-L_K+1) \times |V| \times C_{\mathrm{out}}}$, with $L_{\mathrm{in}}=L$ for this first layer.
Next, $\mathbf{H}_1$ is fed into the Het-ChebNets, giving $\mathbf{H}_2 \in \mathbb{R}^{(L-L_K+1) \times |V| \times 2 \cdot C_1}$.
Then $\mathbf{H}_2$ is fed into the second gated CNN layer, giving $\mathbf{H}=\mathbf{H}_3 \in \mathbb{R}^{(L-2 \cdot (L_K-1)) \times |V| \times C_3}$.
Finally, $\mathbf{H}_3$ is fed into the third gated CNN layer, whose kernel dimension is $[(L-2 \cdot (L_K-1)) \times 1 \times C_3,\ C_3]$, and the output is $\mathbf{H}=\mathbf{H}_4 \in \mathbb{R}^{1 \times |V| \times C_3}$.
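The shape bookkeeping in the walk-through above can be checked with a few lines of arithmetic. The sketch assumes that the first gated CNN layer outputs $C_1$ channels (inferred from the $2 \cdot C_1$ channels of $\mathbf{H}_2$, not stated explicitly in these notes); the `block_shapes` helper and the example sizes are hypothetical.

```python
from typing import Dict, Tuple

Shape = Tuple[int, int, int]


def block_shapes(L: int, V: int, n: int, L_K: int, C_1: int, C_3: int) -> Dict[str, Shape]:
    """Track (time, nodes, channels) through one temporal/spatial block."""
    t1 = L - L_K + 1   # time length after the first gated CNN layer
    t3 = t1 - L_K + 1  # after the second gated CNN layer: L - 2 * (L_K - 1)
    if t3 < 1:
        raise ValueError("L is too short for two temporal kernels of length L_K")
    # The third gated CNN layer uses a kernel as long as the remaining time
    # axis, so the time dimension collapses to t3 - t3 + 1 = 1.
    t4 = t3 - t3 + 1
    return {
        "x": (L, V, n),
        "H1": (t1, V, C_1),      # assumption: first layer outputs C_1 channels
        "H2": (t1, V, 2 * C_1),  # Het-ChebNets keep the time length
        "H3": (t3, V, C_3),
        "H4": (t4, V, C_3),
    }


shapes = block_shapes(L=4, V=5, n=1, L_K=2, C_1=8, C_3=16)
for name, shape in shapes.items():
    print(name, shape)
# x (4, 5, 1)
# H1 (3, 5, 8)
# H2 (3, 5, 16)
# H3 (2, 5, 16)
# H4 (1, 5, 16)
```

Each gated CNN layer with kernel length $L_K$ shortens the time axis by $L_K-1$, which is where the $L-2 \cdot (L_K-1)$ term comes from.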

(To be updated)

