CPT-DF: Congestion Prediction on Toll-Gates Using Deep Learning and Fuzzy Evaluation for Freeway Network in China

Hindawi
Journal of Advanced Transportation
Volume 2023, Article ID 2941035, 16 pages
https://doi.org/10.1155/2023/2941035

Research Article

Tongtong Shi (1,2), Ping Wang (1,2), Xudong Qi (2), Jiacheng Yang (3), Rui He (2), Jingwen Yang (2), and Yu Han (1,4)

(1) School of Intelligent System Engineering, Sun Yat-Sen University, Guangzhou 518000, China
(2) School of Electronics and Control Engineering, Chang’an University, Xi’an 710064, China
(3) School of Automation, Southeast University, Nanjing 210096, China
(4) Guangdong Province Key Laboratory of Fire Science and Technology, Guangzhou 510006, China

Correspondence should be addressed to Ping Wang; wang0372@e.ntu.edu.sg and Yu Han; hanyu25@mail.sysu.edu.cn

Received 26 April 2022; Accepted 3 September 2022; Published 10 April 2023

Academic Editor: Dong-Kyu Kim

Copyright © 2023 Tongtong Shi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Toll-gates are crucial points of management and key congestion bottlenecks of the freeway. To avoid traffic deterioration and alleviate congestion in advance, it is necessary to predict and evaluate the congestion at toll-gates scattered over a large-scale region of the freeway network. In this paper, traffic volume and operational delay time are selected from various traffic indicators to evaluate congestion, considering the particular characteristics of the traffic flow within the toll-gate area. The congestion prediction method consists of two modules: a deep learning (DL) prediction and a fuzzy evaluation. We propose a modified deep learning method based on a graph convolutional network (GCN) structure fused with a dilated causal mechanism, and we optimize the spatial feature extraction by constructing a new adjacency matrix. This new AI network can process the spatiotemporal information of traffic volume and operational delay time extracted from large-scale toll-gates simultaneously and predict the key indicators 15/30/60 min ahead. The evaluation module is built on these predicted results. The fuzzy C-means (FCM) algorithm is then further modified by determining a coupling weight for the two key indicators in order to detect the congestion state. The original traffic data are obtained from the 186 toll-gates currently serving the freeway network in Shaanxi Province, China. Experimental tests are carried out on four months of historical data after preprocessing. The comparative tests show that the proposed CPT-DF (congestion prediction on toll-gates using deep learning and fuzzy evaluation) outperforms the other models currently in use by 6-15%. Successful prediction could be extended to real-time prediction and early warning of traffic congestion in the toll system, improving to some extent the intelligence level of traffic emergency management and of guidance on key roads during disasters.

1. Introduction

Freeways are the backbone of long-distance transport because of their low disruption and excellent road conditions [1]. As the demand for long-distance transport increases with economic development, the freeway mileage in China is growing rapidly, as shown in Figure 1(a). At present, researchers have done a large amount of work on the traffic state of urban road sections, but there are relatively few studies on freeway toll-gates [2]. Since the freeway is under closed management and toll-gates are scattered over a large-scale region of the freeway network, the characteristics of the traffic flow within the toll-gate area differ from those on other roads. A vehicle entering the toll-gate area goes through multiple steps such as deceleration, lane changing, merging, and toll payment [3]. The overall traffic efficiency is strongly affected by the passing time through the toll-gates. Thus, toll-gates are crucial points of management and key congestion bottlenecks for the freeway network, especially in China. However, serious traffic congestion often occurs in the toll-gate area. As shown in Figure 1(b), in the top 10 toll-gates ranked by congestion in China, the vehicle speed may drop to 10 km/h and the congestion index may even reach 50-60. Congestion might additionally cause traffic accidents, energy consumption, and environmental pollution in the areas of these toll-gates [4]. To alleviate or avoid these problems, an increasing number of researchers are working on two aspects: traffic prediction and congestion evaluation of the toll-gate area, in which evaluating congestion from quantitatively predicted traffic indicators might be an alternative way to solve the problem. Successful prediction and evaluation would make advance management of the toll-gate and guidance of traffic routes possible. Traffic perception will become faster and more accurate with the development and breakthroughs of artificial intelligence (AI) and deep learning (DL), providing more effective intelligent technology for alleviating congestion and for traffic emergency management.

Figure 1: (a) Mileage of freeways in China (10,000 km), 2012 to 2021. (b) Top 10 toll-gates ranked by congestion index, with driving speed (km/h) and congestion index (11:10, April 9, 2022). (Data source: https://report.amap.com/congest.do#).

In terms of traffic prediction, the freeway network is a topological structure with dynamic spatiotemporal features, which is manifested by the periodicity of traffic flow and the spatial correlation between upstream and downstream toll-gates. Initially, due to the limitations of intelligent technology, most work on traffic congestion prediction focused on the temporal dimension while ignoring the information of the large-scale spatial dimension. The graph neural network (GNN) based on deep learning was first proposed by Gori et al. [5] and Scarselli et al. [6], who introduced the graph structure to model the spatial correlation between objects. Since then, various algorithms based on graph models have been widely used in different fields, including social networks, biomedicine, and knowledge graphs [7]. Similarly, the toll-gates and ordinary road sections of the freeway network can be mapped to the points and edges of a graph structure. Recently, as a branch of GNN, Graph Convolutional Networks (GCN) [8, 9] were introduced to traffic work and efficiently implement congestion prediction from a spatiotemporal perspective. However, the results in long-term prediction are still unsatisfactory due to the defects of the models. For example, vanishing and exploding gradients in Recurrent Neural Networks (RNNs) lead to the loss of historical data, and RNNs cannot be processed in parallel on a large scale, resulting in slow computation. Therefore, a graph convolutional network fusing the dilated causal mechanism is introduced in this paper to compensate for this deficiency. In addition, most existing graph-based methods build spatial features based only on distances or only on correlations between toll-gates [10–12]. However, two situations arise, as shown in Figure 2: (1) short Euclidean distances between gates but little spatial correlation of traffic flows (Region A: $A_1$ and $A_2$ are not directly related), and (2) long Euclidean distances between gates but strong spatial correlation of traffic flows (Region B). This means that a single factor cannot accurately capture the large-scale spatial features and can even interfere strongly with the prediction accuracy. Based on this, a new spatial adjacency matrix that combines the two features of toll-gates, correlation and distance, is proposed in this paper, which can explain the spatial characteristics of toll-gates on a large scale more accurately.

Figure 2: The special situation of toll-gate location and traffic flow distribution (panels plot flow against time of day, 00:00 to 24:00, for toll-gates 210105, 510202, 740103, and 590106 in Regions A and B).

In terms of congestion evaluation, the task for freeway toll-gates is to further analyze and summarize the prediction results (single or multiple traffic indicators). This part includes the selection of indicators and of evaluation methods to measure traffic congestion. Firstly, researchers mostly use traffic indicators such as speed, traffic volume, occupancy, operational delay time, and queuing length as the detection criteria for urban traffic congestion. The freeway is a closed system consisting of “toll entrances-road sections-toll exits”. Under normal circumstances, a vehicle drives at a high and uniform speed on ordinary road sections. When reaching the toll-gate area, the vehicle slows down to an extremely low speed until the toll payment is completed and finally accelerates to drive away.
Therefore, the speed itself varies greatly in the toll-gate area, so it is suitable as a congestion indicator on ordinary road sections rather than in the toll-gate area. In addition, since there are often sections without detectors or with damaged detectors upstream and downstream of the toll-gate, the queuing length and occupancy rate cannot be completely and directly detected. In view of this, the traffic volume based on toll-data is preferred in this paper as one of the indicators to identify the congestion state of freeway toll-gates. Besides, due to the difference in the scale of toll-gates, the evaluation results of toll-gate i and toll-gate j may differ under the same traffic volume. Therefore, this paper selects operational delay time as another indicator and designs a calculation method based on toll-data. Secondly, there is no standard congestion evaluation method, because the Electronic Toll Collection (ETC) channel has been added to the toll-gates of China’s freeways only in recent years, which is an important renovation of the toll-gates [13]. References include the “Highway Capacity Manual (HCM)” issued by the United States and the “Urban Road Traffic Operation Evaluation Indicator System” issued by China, both of which quantify the service level into n intervals according to road characteristics. Therefore, the idea of clustering, and in particular the currently popular methods based on fuzzy clustering, is introduced in this paper to classify traffic states. There are already some studies in this area. A congestion evaluation model based on the fuzzy C-means (FCM) algorithm was proposed in [14]. This unsupervised learning method, which uses fuzzy clustering to analyze hidden data patterns, is helpful for the congestion evaluation of toll-gates. However, the traditional FCM algorithm does not consider the influence of each traffic index on the clustering results, and the algorithm is prone to falling into a local minimum. To address these problems, this paper improves the FCM method by combining the coupling weights of the two traffic indicators, so as to realize the congestion evaluation accurately. The main contributions can be summarized as follows.

(i) Based on toll-data, a new calculation method for “traffic volume” and “operational delay time” is proposed in this paper to complete the congestion prediction and evaluation of toll-gates more accurately

(ii) A new AI network (CPT-DF) for congestion prediction of freeway toll-gates using deep learning and fuzzy evaluation is proposed. The GCN of the prediction module captures the spatial information of the road network by weighting the neighbourhood node features and embedding them into the graph structure, and the improved dilated causal mechanism preserves the nonlinear ability of the model by adding residual connections and gated linear units, so that temporal information can be better captured. The evaluation module uses the improved FCM algorithm for accurate interval classification of the toll-gate traffic state

(iii) A new adjacency matrix that combines both the correlation and distance features of toll-gates is constructed in this paper, which improves on existing methods by extracting the spatial features of large-scale toll-gates more accurately

(iv) This paper verifies the proposed CPT-DF model based on toll data, and some toll-gates are selected to complete the work of congestion prediction, which could efficiently improve the intelligence level of traffic emergency management and of guidance on key roads during disasters

2. Related Work

2.1. Prediction Algorithms. In recent years, with the increase in the amount of data and the development of fusion technology, data-driven prediction algorithms represented by statistical models, traditional machine learning, and deep learning models have become more and more popular in various research fields. In the field of traffic prediction, the ARIMA model [15] and the Kalman filter model [16] rely on statistical methods to model the relationship between traffic parameters and predict the road traffic state. However, such models are usually based on linear assumptions, which cannot adequately explain the high dimensionality and nonlinearity of traffic data. Later, machine learning methods such as Support Vector Regression (SVR) [17] were proposed to handle the nonlinear relationships in traffic data, and such methods are widely used on freeways and urban roads. However, the changes in road environment and traffic flow near toll-gates are more complicated than the temporal characteristics of ordinary road sections, and there are few studies on toll-gates at present. Wang et al. [18] fused vehicle detector data, long-range microwave sensor data, and toll-data and employed a Deep Belief Network (DBN) to predict toll-gate traffic flow on small-scale ring roads at 30/60/120-minute intervals. Shuai et al. [19] adopted a modified Long Short-Term Memory (LSTM) network to predict the traffic volume of 51 screened toll-gates. These deep learning-based methods are good at capturing traffic trends in complex toll-gate environments or at exploring the spatial connectivity between one or more road segments in a single temporal dimension, but they do not consider the spatial characteristics between toll-gates on a large scale. Therefore, researchers extracted the spatial features of the road network by introducing the convolutional neural network (CNN) [20] and converting the structure of the traffic road network into a standard grid structure, but CNN is not suitable for processing non-Euclidean data such as toll-gates. Recently, GCNs have been widely used for many graph-based tasks, and many studies have further explored the use of GCNs to model the topology of road networks. The STGCN model [10] combined GCN and CNN for the first time to model the traffic network and the spatiotemporal sequence. This paper also follows the fusion principle of STGCN in the model construction, but STGCN can only use CNN to process the signal of each layer of the network; when the signal propagates to the upper layer, samples are processed independently at each moment, so it cannot cope well with long-term prediction. The DCRNN model [11] models the spatial correlation as a diffusion process on a directed graph and builds a diffusion convolutional recurrent neural network that captures the spatial and temporal dependencies of long-term sequences using the seq2seq framework. This paper is inspired by DCRNN in the construction of the time series model. The ASTGCN model [21] introduces an attention mechanism (GAT) into GCN to model temporal and spatial correlations effectively. However, this model cannot capture spatial and temporal dependencies simultaneously and only considers low-order neighbourhood relationships between nodes, ignoring the correlations between different historical periods. The T-GCN model [12] integrates GCN and GRU to capture spatiotemporal traffic features and derives its predictions from them. These models capture the information between adjacent nodes well from the spatial and temporal perspectives and thus complete short-term traffic prediction. However, they have not been successfully applied to traffic prediction at toll stations, so this paper proposes a new model and uses toll-data to apply it to the actual scene.

2.2. Evaluation Algorithms. The methods of congestion evaluation can be divided into traffic theory and data-driven algorithms [22]. The former uses physical and mathematical theory to describe the characteristics of traffic behaviour and then evaluates the traffic congestion, but it is not suitable for highly complex scenarios with unknown conditions, such as toll-gates. The latter evaluates traffic congestion with neural networks and clustering algorithms [23]. Neural network-based methods need traffic states to be classified manually in advance, but it is difficult to classify states accurately in unknown scenarios. In contrast, clustering methods are very suitable for the special scenario of toll-gates, which lack a standard evaluation. A clustering algorithm divides a collection of research objects into multiple classes consisting of similar objects. Initially, the K-means clustering algorithm [24], combined with the three parameters traffic volume, speed, and occupancy, was applied to achieve a simple state evaluation. However, the algorithm cannot evaluate the critical threshold of the sample. Later, considering that the traffic state is a fuzzy concept, fuzzy clustering was also applied to traffic state evaluation. The FCM algorithm was first proposed by Dunn [25], and improved algorithms based on FCM have since been widely used in the field of traffic evaluation [26, 27].

3. Methodology

3.1. Problem Definition

3.1.1. Congestion Domain of Toll-Gates. To study the traffic congestion of the toll-gates, this paper constructs a “toll-gate congestion domain” according to the traffic characteristics of the toll-gates, which includes three areas (upstream section, deceleration section, and toll section), as shown in Figure 3. After the vehicle passes through the congestion domain, it quickly resumes normal driving through the acceleration section and the downstream section.

Upstream Section. A section of fixed distance before the vehicle enters the toll-gate. The first vehicle detector (VD), marked (1) in Figure 3, is installed in the upstream section and detects parameters such as the speed of passing vehicles and the current time.

Deceleration Section. The vehicle starts to slow down, enters the toll-gate, and selects a toll lane.

Toll Section. The toll lanes can be divided into Manual Toll Collection (MTC) and Electronic Toll Collection (ETC) lanes based on the tolling method. The vehicle decelerates through the railing locomotive detector (RLD), marked (2) in Figure 3, which records the current time.

Figure 3: Composition of toll-gates and construction of congestion domain (sections in driving order: upstream, $T_u$; deceleration, $T_D$; toll gate, $T_t$; acceleration; downstream. Detectors: (1) vehicle detector (upstream), recording $T_{i(VD)}$; (2) railing locomotive detector, recording $T_{i(RLD)}$).

3.1.2. Calculation of Indicator. Firstly, the original data collected in this paper are aggregated every 5 minutes, and the traffic volume can be counted directly (the details are presented in Section 4.1.1). Secondly, the operational delay time on ordinary roads can be obtained from “speed-acceleration-distance”, but the limitations of this method introduce a large error. According to the characteristics of toll-gates, the operational delay time is calculated as

$$D = \left(\frac{1}{n}\sum_{i=1}^{n} T_i\right) - \left(\frac{1}{K}\sum_{k=1}^{K} T_{Pk}\right), \tag{1}$$

$$T_i = T_u + T_D + T_t = T_{i(RLD)} - T_{i(VD)}, \tag{2}$$

$$T_{Pk} = \sum_{j=1,\ \mathrm{volume}<Q} \left(T_{jk(RLD)} - T_{jk(VD)}\right), \tag{3}$$

where $D$ is the calculated average operational delay time per five minutes. $T_i$ is the passing time of the i-th vehicle through the congestion domain. $T_u$, $T_D$, and $T_t$ are the travel times in the upstream section, the deceleration section, and the toll section shown in Figure 3, respectively. $T_{Pk}$ is the travel time of the vehicle through toll channel $k$ in the congestion domain. $Q$ is the flow threshold below which the traffic state is unblocked. $T_{i(VD)}$ and $T_{i(RLD)}$ are the times recorded when the i-th vehicle passes the upstream vehicle detector and the railing locomotive detector, respectively.

3.1.3. Preparing for Input/Output. As shown in Table 1, two sets of inputs/outputs are designed to perform both prediction and evaluation. For the prediction module, the input values are the two previously selected feature data (traffic volume and operational delay time) and the graph data. The graph data represent the spatial relationship between the toll-gates. The two traffic indicators represent the temporal relationship between the historical and future traffic state of a toll-gate. The output values are the two traffic indicators predicted by the model for future moments. For the evaluation module, the input values are the output of the prediction module and the congestion threshold corresponding to each toll-gate, and the output value is the final traffic state. This part mainly introduces the construction of the spatiotemporal data input to the prediction module.

Table 1: The input/output values of the proposed model.

Module            | Input values                                                                                                                  | Output values
Prediction module | Feature data (historical): traffic volume (number/n min), operational delay time (s); graph data: D&C-matrix                  | Traffic volume (number/n min) (future)
Evaluation module | Feature data (future): traffic volume (number/n min); threshold data: congestion boundaries                                   | Traffic state

(1) Description of Feature Data. Traffic prediction is a typical spatiotemporal prediction problem. Given the previous M observations of the historical traffic features, the data measured at the N toll-gates over these M time steps can be viewed as a matrix of size M × N. Then, the predicted values closest to the true values in the next H time steps are

$$F_{t+1}, \ldots, F_{t+H} = \arg\max \log P\left(F_{t+1}, \ldots, F_{t+H} \mid F_{t-M+1}, \ldots, F_t\right), \tag{4}$$

where $F_t \in \mathbb{R}^n$ is a vector of observations for n toll-gates at time step t; $F$ here refers to both the traffic volume and the operational delay time.

(2) Construction of Graph Data. For unordered gate network traffic data, the observations $F_t$ are not independent and can be viewed as graph signals defined on an undirected graph $G$, as shown in Figure 4. The graph is expressed as $G_t = (F_t, E, W)$, where $E$ is the set of edges representing the connections between gates and $W \in \mathbb{R}^{n \times n}$ is the adjacency matrix of $G_t$.

Figure 4: (a) Toll-gates in a road network. (b) Spatiotemporal correlation (time axis from t-M to t+H).

Each toll-gate can be regarded as a vertex in the graph structure, and the road segments connecting the toll-gates can be regarded as edges. To represent the spatial relationship between toll-gates (vertices), the Euclidean distance is usually chosen, but this leads to the two defects shown in Figure 2 and mentioned earlier. Therefore, we introduce a Distance Matrix (D-Matrix) and a Correlation Matrix (C-Matrix) to represent spatial features.

(i) The D-Matrix holds the distance between each pair of gates, which can be calculated from the latitude and longitude values of the gates by “Vincenty solutions of geodesics on the ellipsoid” [28].

(ii) The C-Matrix is determined by whether the gates are directly connected. As shown in Figure 5, $P_1$ and $P_4$ are directly connected, but $P_1$ and $P_2$ are not, so $C_{P_1 P_4} = 1$ and $C_{P_1 P_2} = 0$.

Finally, a novel Distance and Correlation Matrix (D&C-Matrix) is constructed to calculate the adjacency matrix $W$ as

$$w_{ij} = \begin{cases} \exp\left(-\dfrac{\left([D_{ij}] \odot [C_{ij}]\right)^2}{\sigma^2}\right), & i \neq j \ \text{and} \ \exp\left(-\dfrac{\left([D_{ij}] \odot [C_{ij}]\right)^2}{\sigma^2}\right) \geq \varepsilon, \\ 1, & \text{otherwise}, \end{cases} \tag{5}$$

where $[D_{ij}]$ and $[C_{ij}]$ are the D-Matrix and C-Matrix, respectively, $\odot$ is the Hadamard product, and $D_{ij}$ and $C_{ij}$ are the distance and correlation between gates i and j.
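The element-wise combination of the D-Matrix and the C-Matrix can be checked on the four-gate example of Figure 5. The following is a minimal sketch in plain Python, assuming the illustrative matrix values shown in that figure; the helper name `hadamard` is hypothetical, and the exponential kernel and thresholding of Equation (5) are deliberately not applied here.

```python
# Illustrative values for gates P1..P4, as listed in panel (b) of Figure 5.
# Rows and columns are ordered P1, P2, P3, P4.
C = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
D = [
    [1, 4, 95, 89],
    [4, 1, 76, 1],
    [95, 76, 1, 5],
    [89, 1, 5, 1],
]


def hadamard(a, b):
    """Element-wise (Hadamard) product of two equally sized matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# D&C-matrix before the kernel of Equation (5) is applied.
dc = hadamard(D, C)
for row in dc:
    print(row)
```

In this example the pair (P2, P4) is only 1 distance unit apart but its entry becomes 0 because the gates are not directly connected, while the distant but connected pair (P1, P4) keeps its distance of 89. This is the behaviour that a distance-only matrix cannot express.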
P P C-matrix D-matrix D&C-matrix 1 4 P P P P P P P P P P P P 1 2 3 4 1 2 3 4 1 2 3 4 P P P 1 0 0 1 1 4 95 89 1 0 0 89 1 1 1 P P P 0 1 1 0 4 1 76 1 0 1 76 0 2 2 2 P P 0 1 1 0 95 76 1 5 P 0 76 1 0 3 3 3 P P 2 3 P P 1 0 0 1 89 1 5 1 P 89 0 0 1 4 4 4 (a) (b) Figure 5: Construction of adjacency matrix. (a) Location distribution of the four toll-gates (Yellow lines represent dividing fences that vehicles on the freeway cannot pass through). (b) The process of building an adjacency matrix. GCN fused with dilated causal convolutions. Finally, the distance and correlation between gates i and j. σ and ε is the threshold of control matrix distribution and sparsity. evaluation module combines the prediction indicators with the FCM clustering mechanism to realize the congestion detection of the toll-gate in the future period. 3.2. Overview. In this paper, we propose an AI network (CPT-DF) of deep learning that integrates a fine-grained 3.3. Prediction Module congestion evaluation mechanism, as shown in Figure 6. The CPT-DF network includes two modules: prediction 3.3.1. Spatial Feature Extraction. GCN is a basic operation module and evaluation module. The prediction module based on spectral decomposition method or spatial struc- includes input/output layer 1 and spatiotemporal convolu- ture. The spectral decomposition-based method is to deal tion layer, and the evaluation module includes input/out- with the spectral domain correlation representation of the put layer 2 and congestion evaluation layer. The output graph. In this paper, the spectral decomposition method is layer 1 and the input layer 2 are marked with green fonts, introduced to extract node spatial features given node infor- just because of the transmission relationship in the calcu- mation. As early as 2014, Bruna et al. [29] proposed Spectral lation process. 
Network to define convolution operations in the Fourier First, the preliminary work completed the construction domain, which can be defined as the product of feature x of the congestion domain of the toll-gate, the selection of ∈ ℝ and a convolution kernel G = diag ðθÞ as indicators, and the data required by the input layer. Then, the prediction module detects traffic indicators (traffic vol- ume and operational delay time) for future periods based x ∗ G = UG ðÞ Λ U x, ð6Þ θ θ on the spatiotemporal convolutional layers constructed by Correlation Journal of Advanced Transportation 7 Prediction module Input layer1 Spatiotemporal convolutional layer Output layer1 Temporal-dataset TCN module Traffic ··· Delay time Temporal Causal conv volume feature Ŷ BN t+T Dilated conv 186 × M (5 min) X X X X t-m t-n+1 t-1 t Spatial ST-fusion module Spatio-dataset feature 186 × 1 GCN module D&C-matrix Traffic 1 186 Temporal volume 2 Conv 15 min 30 min feature GCN 1 hour 186 × 186 Conv Delay time Evaluation module Input layer2 Congestion evaluation layer Output layer2 Data preparation Road network Time T Traffic t+T state Evaluation Congestion threshold Congestion factor Volume 183 184 185 186 Threshold Figure 6: CPT-DF network: congestion prediction on toll-gates using deep learning and fuzzy evaluation. The blue line boxes represent the input values of the prediction module and the evaluation module, respectively, the red line boxes represent the output values, and the output value of the prediction module is also one of the input values of the evaluation module. where U is a matrix composed of the eigenvectors of the nor- mation, which achieves a larger receptive field and reduces malized Laplacian matrix, and Λ is a diagonal matrix of traffic the number of convolution kernels. indicators. However, this method of convolution, which com- putes the Eigen decomposition of the Laplacian matrix of the 3.3.2. Temporal Feature Extraction. 
graph, leads to potentially intensive computations and results in unsatisfactory locality of the convolution kernels.

In order to alleviate the problems of the Spectral Network, in 2016, Michael et al. [30] proposed the ChebNet K-hop convolution to define convolution on the graph, thus eliminating the time cost of calculating the eigenvectors of the Laplacian matrix. On this basis, this paper sets $K = 1$ to alleviate the local overfitting problem. Therefore, the graph convolution can be written as

$$\Theta *_{\mathcal{G}} x = \theta_0' x - \theta_1' D^{-1/2} W D^{-1/2} x = \theta \left( I + D^{-1/2} W D^{-1/2} \right) x, \tag{7}$$

where the adjustable parameter is $\theta = \theta_0' = -\theta_1'$ and $D$ is the degree matrix. The GCN model constructs filters in the Fourier domain, constructs spatial features by stacking multiple local Conv layers, and extracts the structural information of the network in the form of convolutions. Therefore, a deeper structure can be constructed to deeply recover spatial information.

As a derivative of CNN, the Temporal Convolutional Network (TCN) [31] is a network framework that can accurately process sequences or data containing time series. It aims to extract features across time steps by directly exploiting the powerful properties of convolutions, and it uses fully convolutional networks and dilated causal convolution to produce a corresponding output for each input and to ensure that no historical data is missed. In this paper, an improved TCN is designed to extract temporal features by fusing dilated convolution, GLU, and residual blocks. The specific improved TCN structure is shown in Figure 7.

Figure 7: Structure of improved temporal feature extraction. (a) The composition structure of the overall framework. (b) Structure of Dilated Causal Convolutions.

(1) Dilated Causal Convolution. Dilated causal convolution is used to handle the long time dimension of big data. The expansion (dilation) coefficients of the convolution kernel can be arbitrarily combined from the range [1, 2, 4, 8, 16, 32]. Comparative experiments show that the results obtained with [1, 2, 4], [1, 2, 4, 8, 16], and [8, 16, 32] are relatively stable. At the same time, in order to maintain the temporal relationship of historical information, the kernel size is set to 2, the expansion coefficient is used as the sliding jump value, and the receptive field is set to $2 \times 2^{(4-1)} = 16$.

Formally, for a real-valued one-dimensional sequence input $x$ and a real-valued kernel function $\phi$, let $d$ be the expansion coefficient, let the flow data sequence ordered by time be $\{F_0, F_1, \cdots, F_N\}$, and let the output be denoted as $\{g_0, g_1, \cdots, g_N\}$. The mapping relationship $S$ between $F$ and $g$ can be expressed as

$$\hat{g}_0, \hat{g}_1, \cdots, \hat{g}_N = S(F_0, F_1, \cdots, F_N). \tag{8}$$

The convolution operation $S$ on the element $F$ is

$$S(F) = (x *_d \phi)(F) = \sum_{i=0}^{k-1} \phi(i) \cdot x_{F - d \cdot i}, \tag{9}$$

where $k$ represents the kernel size and $F - d \cdot i$ maps to the history information of the upper layer. At the same time, the residual block is introduced in the TCN.

(2) Gated Linear Units (GLU). After the residual module is added, the TCN has 3 layers of dilated convolution. The data distribution is regularized by weight normalization, and the GLU replaces the ReLU of the original structure to preserve the nonlinearity of the residual blocks. At the same time, dropout is added after each dilated convolution to prevent overfitting. Furthermore, a 1 × 1 Conv with a width of $K_t$ is introduced to obtain the output $Y$ of the time sequence as the input of the next stage. At this point, the temporal convolution input of each node can be regarded as a sequence of length $M$ with $C_t$ channels, so $Y \in \mathbb{R}^{M \times C_t}$. The convolution kernel $\Gamma \in \mathbb{R}^{K_t \times C_t \times 2C_0}$ is used to map the input $Y$ to a single output element $[P\ Q] \in \mathbb{R}^{(M-K_t+1) \times 2C_0}$. The 1 × 1 Conv keeps the remaining input and output dimensions the same, and the convolution can be defined as

$$\Gamma *_{\tau} Y = P \odot \sigma(Q) \in \mathbb{R}^{(M-K_t+1) \times C_0}, \tag{10}$$

where $P$ and $Q$ are the inputs of the GLU gate and $\odot$ represents the element-wise Hadamard product. $\sigma(Q)$ controls the dynamic change of the input $P$, the added nonlinear link ensures the stacked input of the temporal layer, and the residual connection is realized in the stacked temporal layers. Using the same convolution kernel $\Gamma$ for each node $y \in \mathbb{R}^{M \times C_t}$ in the traffic graph, the temporal convolution $\Gamma *_{\tau} y$ can be extended to the three-dimensional variable $y \in \mathbb{R}^{M \times n \times C_t}$.

(3) ST-Fusion Module. To fuse spatiotemporal features, a spatiotemporal fusion module (ST-Fusion Module) inspired by [32] is constructed in this paper. The modules can be stacked or expanded depending on the size and complexity of a particular case. As shown in Figure 8, each spatial convolutional layer bridges two temporal convolutional layers, which achieves a fast transition between the states of the temporal and spatial layers. In addition, this design scales the channel $C$ through the graph convolution layer, which also helps the network to fully apply the bottleneck strategy and achieve scale and feature compression.

The input and output of the ST-Fusion Module are both 3D tensors. For the input $F^{l} \in \mathbb{R}^{M \times n \times C^{l}}$ of block $l$, the output $F^{l+1} \in \mathbb{R}^{(M-2(K_t-1)) \times n \times C^{l+1}}$ is

$$F^{l+1} = \Gamma_1^{l} *_{\tau} \mathrm{ReLU}\left( \Theta^{l} *_{\mathcal{G}} \left( \Gamma_0^{l} *_{\tau} F^{l} \right) \right), \tag{11}$$

where $\Gamma_0^{l}$ and $\Gamma_1^{l}$ are the upper and lower kernels of the temporal convolutional layers that enclose the graph convolution, $\Theta^{l}$ is the spectral domain convolution kernel of the graph convolution, and $\mathrm{ReLU}(\cdot)$ represents the activation unit.
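The dilated causal convolution of Equation (9) and the GLU gate of Equation (10) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the zero-padding at the start of the sequence, the toy input, and the dilation list [1, 2, 4, 8] are assumptions made only for illustration, and the receptive-field helper uses the standard formula for stacked dilated layers rather than anything stated in the paper.

```python
import numpy as np


def dilated_causal_conv(x, phi, d):
    """Equation (9): y[s] = sum_{i=0}^{k-1} phi[i] * x[s - d*i].

    Indices before the start of the sequence are treated as zeros
    (an assumed left zero-padding), so y[s] never uses future values.
    """
    x = np.asarray(x, dtype=float)
    phi = np.asarray(phi, dtype=float)
    y = np.zeros_like(x)
    for s in range(len(x)):
        for i in range(len(phi)):
            j = s - d * i          # step back d*i positions in time
            if j >= 0:
                y[s] += phi[i] * x[j]
    return y


def glu(pq):
    """Equation (10): split the last axis into P and Q, return P * sigmoid(Q)."""
    p, q = np.split(np.asarray(pq, dtype=float), 2, axis=-1)
    return p / (1.0 + np.exp(-q))


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated causal layers (standard formula)."""
    return 1 + (kernel_size - 1) * sum(dilations)


flow = np.arange(6)  # hypothetical toy flow sequence F_0..F_5
print(dilated_causal_conv(flow, phi=[1.0, 1.0], d=2))
print(glu(np.array([[2.0, 0.0]])))
print(receptive_field(2, [1, 2, 4, 8]))
```

With a kernel size of 2 and the assumed dilations [1, 2, 4, 8], the helper returns 16, which agrees with the receptive field quoted in the text; other dilation combinations from the listed range would give different values.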
Figure 8: Spatiotemporal block and output connection diagram.

After the fusion of temporal convolution and spatial convolution, a linear transformation $F = Zw + b$ is applied across the channels $C$ to obtain the predicted traffic values of the $n$ nodes, where $w$ is the weight vector and $b$ is the bias. Considering the convergence speed, and using the L2 loss to measure the model performance, the flow loss function is expressed as

$$L(\hat{v}; W_{\theta}) = \sum_{t} \left\lVert \hat{v}(v_{t-M+1}, \cdots, v_t, W_{\theta}) - v_{t+1} \right\rVert^{2}. \tag{12}$$

Since training gradually slows down as the spatiotemporal blocks described above are deepened, this paper introduces Batch Normalization (BN) before the hidden layer activation function to fix the distribution of the input and pull it back to a standard normal distribution (mean 0, variance 1), which speeds up convergence and makes the optimization smoother. The specific transformation is

$$\hat{x}^{(k)} = \frac{x^{(k)} - E\left[x^{(k)}\right]}{\sqrt{\mathrm{Var}\left[x^{(k)}\right]}}, \tag{13}$$

where $x^{(k)}$ represents the output of the activation function of the hidden layer, $E[\cdot]$ represents the mean value, and $\mathrm{Var}[\cdot]$ represents the variance. At the same time, two parameters $\gamma$ and $\beta$ are added to perform the inverse transformation:

$$y^{(k)} = \gamma^{(k)} \hat{x}^{(k)} + \beta^{(k)}. \tag{14}$$

3.4. Evaluation Module

3.4.1. Entropy Weight Method (EWM). The traditional FCM algorithm does not consider the influence of traffic indicators and individual samples on the clustering results. The EWM uses the entropy value to judge the degree of dispersion of an indicator and determines the weight of each indicator through the information entropy. In this paper, the EWM is used to determine the weights of different indicators, and the degree of influence of each sample on the clustering results is defined by designing a sample weighting method. First, calculate the proportion $p_{ij}$ of sample $i$ under indicator $j$ and the entropy value $E_j$ of indicator $j$. The weight $w_j$ of each traffic indicator is then calculated as

$$p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{n} x_{ij}}, \tag{15}$$

$$E_j = -\frac{1}{\ln(n)} \sum_{i=1}^{n} p_{ij} \cdot \ln p_{ij}, \quad j = 1, 2, \cdots, z, \tag{16}$$

$$w_j = \frac{1 - E_j}{\sum_{j=1}^{z} (1 - E_j)}. \tag{17}$$

The specific congestion evaluation process is shown in Figure 9. The improved FCM algorithm introduced in this paper is an unsupervised fuzzy clustering method, that is, a data clustering method based on the optimization of an objective function. The degree of membership to each cluster center is represented by a numerical value. Given the feature prediction sample set $X = \{x_1, x_2, \cdots, x_n\}$ and the number of traffic state categories, the objective function of the improved FCM clustering algorithm for traffic state classification is calculated as

$$J(X, U, V) = \sum_{m=1}^{c} \sum_{i=1}^{n} (u_{im})^{\sigma} d_{im}^{2}, \tag{18}$$

$$t_i = \exp\left( -\sum_{m=1}^{c} u_{im} \cdot d_{im} \right), \qquad d_{im} = \sum_{j=1}^{z} w_j \left\lVert x_i - v_m \right\rVert, \tag{19}$$

where $U$ is the affiliation matrix of each sample belonging to the different traffic states, $V$ is the matrix composed of all clustering centres, $n$ is the total number of samples, and $u_{im}$ is the affiliation degree of sample $i$ belonging to traffic state $m$. $\sigma$ is the weighted index, indicating that the less fuzzy the algorithm is, the more accurately the state is divided. $t_i$ is the weight of sample $i$, and $d_{im}$ is the weighted Euclidean distance between sample $i$ and cluster center $m$.

Figure 9: The algorithm flow of traffic congestion evaluation.

4. Experiments

This section contains the experimental settings and experimental results.

4.1. Experimental Settings

4.1.1. Study Site and Datasets. In this paper, 186 toll-gates on freeways in Shaanxi Province, China are selected as the study site. The locations of the toll-gates are shown in Figure 10(a), showing a radial distribution. Based on the calculation of the D&C-Matrix in Equation (5), the spatial heat map relationship between the toll-gates obtained by further analysis is shown in Figure 10(b).

In addition, this paper collects the original toll data for 4 months (December 2018-March 2019) and selects some fields as shown in Figure 10(c), including the toll-gate number, the vehicle detector, the arrival times at the VD and the RLD (VD and RLD are the positions of (1) and (2) in Figure 3, respectively), the travel time, and the payment method (choice of ETC and MTC channels). This paper then converts the collected raw data into the traffic volume and operational delay time required by the experiment. The collected traffic volume data are aggregated at 5-minute intervals as shown in Figure 10(d), and the operational delay time is calculated according to Equations (1)–(3). Due to equipment failure and other reasons, data are missing in some periods, as shown in Figure 10(e). To address the missing temporal data, we analyze the distribution characteristics of the missing data and establish a reasonable complementary rule framework to interpolate the temporal data. Finally, the data completion effect shown in Figure 10(f) is achieved, and a relatively complete data set is prepared for the experimental part.

Figure 10: Data sources of freeway tolls and data processing of traffic indicators.

4.1.2. Parameters Setting. This experiment uses the ADAM optimizer for training, with an initial learning rate of 0.001, a learning rate decay factor of 0.7 applied every 5 epochs, and a batch size of 50. The channels of the spatiotemporal block are set to 32, 16, 32. In addition, the experiment selects the following three evaluation metrics, as shown in Equations (20)–(22).

(i) Root Mean Squared Error (RMSE).

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^{2}}. \tag{20}$$

(ii) Mean Absolute Error (MAE).

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \hat{x}_i \right|, \tag{21}$$

(iii) Mean Absolute Percentage Error (MAPE).

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{x_i - \hat{x}_i}{x_i} \right| \times 100\%, \tag{22}$$

where $x_i$ is the true value, and $\hat{x}_i$ is the predicted value.

4.2. Experimental Results. This part uses the feature matrix and adjacency matrix datasets based on the Shaanxi Province toll data to demonstrate the long-term prediction ability of the proposed model on large-scale networks. The experimental results are discussed from three aspects: prediction results, evaluation results, and ablation experiments.

4.2.1. Prediction Results. From the overall comparison results, as shown in Figure 11, all models based on temporal and spatial features have better prediction performance than models based on temporal features only. This also proves that there is a strong spatial correlation between the toll stations in the large-scale road network.

Figure 11: Visualization of prediction performance (RMSE) of different models at different time steps.

Table 2 lists the comparison results of the prediction performance at 15/30/60 minutes of the seven models that only consider the temporal feature. Here, MAE and RMSE have the same units as the quantity being estimated. When the predicted object is traffic, the unit is the number of vehicles/n min; when the predicted object is delay time, the unit is min. The error of the traditional methods is larger because the temporal storage capacity of a linear model is limited, and the toll-gates of the freeway have more complex and nonlinear traffic flow characteristics than ordinary road sections. The historical average (HA) model in particular relies only on the average, which makes it difficult to predict accurate results. The ARIMA and SVR models, by comparison, are more suitable for short-term prediction work; when the time step increases, these models have difficulty converging. Deep learning methods (LSTM, SAE, GRU, and TCN) are affected by the data distribution and produce larger prediction errors for gates with large peak-to-valley fluctuations. Among the four methods, the TCN model shows the best performance. Because the causal convolution in TCN is different from the traditional time series network, it has a one-way structure in which the value at the next moment depends only on the values of the previous steps. Furthermore, the receptive field is enlarged by adding dilated convolutions to capture longer dependencies. Therefore, considering the long-term prediction performance, this paper selects and improves the dilated causal convolutional network.

Table 2: Comparison of the performance of temporal prediction.

| Task | 15-min MAE | 15-min MAPE | 15-min RMSE | 30-min MAE | 30-min MAPE | 30-min RMSE | 60-min MAE | 60-min MAPE | 60-min RMSE |
|---|---|---|---|---|---|---|---|---|---|
| HA | 9.32 | 17.85% | 22.98 | 9.32 | 17.85% | 22.98 | 9.32 | 17.85% | 22.98 |
| ARIMA | 6.97 | 13.20% | 16.29 | 7.44 | 14.09% | 18.37 | 7.90 | 14.19% | 18.89 |
| SVR | 6.11 | 12.91% | 14.70 | 6.55 | 13.43% | 17.21 | 7.21 | 13.83% | 17.93 |
| LSTM | 5.65 | 12.02% | 13.11 | 6.09 | 12.71% | 15.92 | 6.95 | 13.01% | 17.07 |
| SAEs | 5.23 | 11.69% | 12.78 | 5.60 | 12.01% | 14.91 | 6.26 | 12.61% | 16.13 |
| GRU | 4.78 | 10.20% | 11.30 | 5.22 | 10.92% | 13.84 | 5.68 | 11.22% | 14.72 |
| TCN | 4.57 | 9.21% | 10.80 | 4.91 | 9.85% | 12.18 | 5.17 | 10.15% | 13.44 |

Table 3 lists the comparison results of the prediction performance at 15/30/60 minutes of the four models considering both spatial and temporal features. In graph-related models (T-GCN, DCRNN, STGCN, and CPT-DF), the information of node features and graph structure can be learned end-to-end by GCNs, so the topological structure and spatial correlation features of the toll-gates are well captured. The obtained time series with spatial characteristics is then fed into the unit model of the temporal module, and the dynamic changes are obtained through the information transfer between units to capture the temporal characteristics. As a result, the characteristics of regional gates are better predicted. In contrast, for time series of highly nonlinear and complex toll data, the capture ability and memory ability of the CNN temporal feature are slightly insufficient compared with RNN and its variants (LSTM and GRU). Similarly, the spatiotemporal prediction ability of the STGCN model based on GCN and CNN is slightly insufficient compared with the diffusion convolutional recurrent neural network (DCRNN) based on RNN and the T-GCN model based on GRU. In the short-term prediction (15-minute) work, the T-GCN model adopts GCN to learn complex topology for spatial correlation and GRU to learn dynamic changes of traffic data for temporal correlation. GRU solves the gradient vanishing and gradient explosion problems faced by RNN when training on a large amount of data and retains the trend of historical toll data, so, combined with GCN, it can better predict future traffic features in the short-term prediction work. However, in the long-term prediction (30-minute and 60-minute) work, unlike the RNN structure, the CPT-DF model proposed in this paper can be massively parallelized due to the dilated causal convolution. The expansion coefficient and the size of the filter change the receptive field and also avoid the problems of gradient vanishing and gradient explosion in RNN. Therefore, the CPT-DF model achieves the best results in the long-term prediction work.

Table 3: Comparison of the performance of spatiotemporal prediction.

| Task | 15-min MAE | 15-min MAPE | 15-min RMSE | 30-min MAE | 30-min MAPE | 30-min RMSE | 60-min MAE | 60-min MAPE | 60-min RMSE |
|---|---|---|---|---|---|---|---|---|---|
| STGCN | 4.06 | 8.02% | 9.31 | 4.35 | 9.12% | 11.43 | 4.61 | 9.62% | 12.29 |
| DCRNN | 3.79 | 7.79% | 8.73 | 4.13 | 8.07% | 10.75 | 4.39 | 8.37% | 11.01 |
| T-GCN | 3.69 | 7.06% | 8.21 | 3.81 | 7.83% | 9.79 | 4.07 | 8.03% | 10.68 |
| CPT-DF | 3.71 | 7.11% | 8.26 | 3.78 | 7.44% | 9.05 | 3.97 | 7.82% | 10.02 |

Furthermore, real-time traffic flow prediction is a basic requirement of intelligent transportation systems and has strict requirements on the total time cost of model training and testing [33]. Therefore, in this paper, taking 1 hour as the historical time window, the calculation time of the four graph-based models is measured as shown in Figure 12. The comparison shows that, since the gated convolution in STGCN is replaced by the improved dilated causal convolution, the training time of the CPT-DF model is greatly reduced. DCRNN and T-GCN, which use RNN and its variants to capture time series, take longer to train than models based on dilated causal convolutions. Compared with the STGCN model before the improvement, the training time of the improved CPT-DF model is reduced by 27.16%, which is an advantage for real-time traffic flow prediction.

Figure 12: Comparison results of calculation time of different models (time in min; models: TCN-STGCN, STGCN, DCRNN, T-GCN).

4.2.2. Evaluation Results. This paper selects the 3-month collection data of toll-gate 30101 (toll-gate ID) as an example and calculates the delay time in the congestion area of the toll-gate according to Equations (1), (2), and (3). The delay time and the traffic volume are used as the input of the clustering algorithm. According to the “Urban Road Traffic Operation Evaluation Indicator System” issued by China, an evaluation index called the Road Traffic Performance Index (TPI) is proposed, as shown in Figure 13. Traffic congestion is divided into 5 grades in TPI [34]: unblocked (A), basically unblocked (B), lightly congested (C), moderately congested (D), and severely congested (E). Therefore, the number of clustering traffic state categories is set to $C = 5$, the fuzzy factor to $\sigma = 2$, and the objective function change threshold to $\varepsilon = 10^{-5}$. The data are normalized in order to speed up training and optimize the clustering results, and the calculation time of the algorithm is about 25-30 s. Figure 14 shows the visualization results of the traditional and improved FCM algorithms on actual traffic data. From left to right, the clusters correspond to the five traffic states (A)-(E). According to the clustering results of the improved FCM algorithm, the value ranges and cluster centers of the two indicators under each traffic state are given in Figure 13.

Figure 13: The value range and cluster center corresponding to the two normalized indicators.

| Traffic condition | Unblocked | Basically unblocked | Lightly congested | Moderately congested | Severely congested |
|---|---|---|---|---|---|
| Congestion level | A | B | C | D | E |
| Traffic volume | (0, 0.164) | (0.164, 0.363) | (0.363, 0.565) | (0.565, 0.753) | (0.753, 1) |
| Delay time | (0, 0.151) | (0.151, 0.324) | (0.324, 0.484) | (0.484, 0.669) | (0.669, 1) |
| Cluster centre | (0.066, 0.058) | (0.244, 0.220) | (0.434, 0.377) | (0.563, 0.504) | (0.784, 0.690) |

Figure 14: Improved congestion evaluation FCM algorithm clustering results (x-axis: traffic flow; y-axis: traffic delay time).

Furthermore, the proposed clustering method is verified by comparing the actual and predicted values on a working day (March 20, 2019) and a holiday (March 24, 2019). As shown in Figure 15, the clustering results of the improved FCM algorithm proposed in this paper are basically in line with the actual traffic conditions, and the CPT-DF algorithm proposed in this paper can accurately predict future traffic congestion except for special interference points.

Figure 15: Comparison of actual and predicted traffic conditions on weekdays and holidays (x-axis: time of day, 00:00 to 24:00; y-axis: congestion level; series: True, Predict).

4.2.3. Ablation Experiment. In order to further study the optimization effect of different modules on the CPT-DF model, under the same conditions, this paper removes the temporal module and the spatial module, respectively, to complete the ablation experiment. In addition, this paper also discusses the effect of three adjacency matrices on the prediction performance.

Figure 16(a) shows the comparison of the prediction performance of each module of the CPT-DF model. The performance of the improved CPT-DF model is about 20% higher than that of the TCN model that only considers the temporal dimension. The performance of the GCN model is reduced by about 12% after removing the TCN module. The improved CPT-DF model has stronger spatiotemporal prediction ability and higher traffic flow prediction efficiency. Therefore, the experimental results show that both the temporal module and the spatial module have great effects on the model proposed in this paper.

Figure 16(b) shows the prediction results obtained by using three different matrices (C-Matrix, D-Matrix, and D&C-Matrix). It can be seen that the improved D&C-Matrix has better prediction accuracy than the C-Matrix and the D-Matrix, by about 3%-4%, so it can be used as a better construction method for the spatial adjacency matrix.

Figure 16: (a) Model improvement before and after comparison. TCN represents the prediction model after removing the spatial module; GCN represents the prediction model after replacing the temporal module. (b) Comparison of prediction accuracy under different matrices. (Both panels: x-axis time in min at 15, 30, 60; y-axis RMSE.)
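The three error metrics of Equations (20)–(22), which underlie the comparisons reported in this section, can be sketched in a few lines of NumPy. The function names and the toy flow values below are hypothetical and serve only to illustrate the formulas; they are not taken from the toll-gate data set.

```python
import numpy as np


def rmse(x, x_hat):
    """Equation (20): root mean squared error."""
    x, x_hat = np.asarray(x, dtype=float), np.asarray(x_hat, dtype=float)
    return float(np.sqrt(np.mean((x - x_hat) ** 2)))


def mae(x, x_hat):
    """Equation (21): mean absolute error."""
    x, x_hat = np.asarray(x, dtype=float), np.asarray(x_hat, dtype=float)
    return float(np.mean(np.abs(x - x_hat)))


def mape(x, x_hat):
    """Equation (22): mean absolute percentage error, in percent.

    The true values x must be nonzero, otherwise the ratio is undefined.
    """
    x, x_hat = np.asarray(x, dtype=float), np.asarray(x_hat, dtype=float)
    return float(np.mean(np.abs((x - x_hat) / x)) * 100.0)


# Hypothetical 5-minute traffic volumes (true vs. predicted).
true_flow = [10.0, 20.0, 40.0]
pred_flow = [12.0, 18.0, 40.0]
print(rmse(true_flow, pred_flow), mae(true_flow, pred_flow), mape(true_flow, pred_flow))
```

Because MAPE divides by the true value, intervals with zero observed traffic would need to be masked or handled separately; how the paper treats such intervals is not stated.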
5. Conclusions and Discussion

Large-scale congestion occurs frequently at toll-gates on freeways, especially during holidays or daily peak times. In order to alleviate the congestion of the toll-gate and prevent the occurrence of additional traffic accidents, environmental pollution, etc., it is necessary to select the toll-gate as the study site and effectively predict and evaluate the future congestion based on historical traffic data. In this paper, the topology of the freeway network is modelled as a graph structure. Toll-gates and the ordinary road segments between gates are regarded as vertices and edges in the graph structure, and the traffic volume and operational delay time are selected as feature matrices at the vertices. Then, by analysing the compositional characteristics of toll-gates and constructing a congestion domain, a new AI network of toll-gate congestion based on GCN fusing dilated causal mechanism in DL and FCM clustering integrating coupling weight is proposed in this paper. Further, 4-month toll data of freeways in China’s Shaanxi Province are collected to complete the experimental work.

In the experiment, congestion detection of toll-gate is realized from two aspects: prediction and evaluation.

Analysing the prediction results: the performance of the graph-based models is about 8%-35% better than other non-graph models in the long-term prediction (60-min) work. The reason is that the graph-based algorithm additionally verifies the correlation between each toll-gate from the global spatiotemporal dimension and quantifies it using the D&C-Matrix. It provides the possibility for the advance management and traffic guidance for toll-gates of large-scale freeways.

Analysing the evaluation results: the traffic state is reasonably divided into five levels and the congestion of the toll-gate is accurately evaluated using the fuzzy clustering method. It provides a possibility to accurately release the congestion information and avoid wrong alarm of the toll-gates.

The successful prediction could extend to the real-time prediction and early warning of traffic congestion in the toll system to improve the intelligent level of traffic emergency management and guidance on the key road of disasters. On the one hand, when the management department of the freeway receives the accurate traffic indicators of the toll-gate in the future period, it can not only grasp the regional traffic evolution from the global toll-gate view but also adjust the service instructions of a single toll-gate. For example, based on the predicted information, the manager adjusts the optimal ratio of opening and closing for the toll lanes of the freeway in advance to ensure the smooth flow of vehicles. On the other hand, the driver can choose the optimal route and travel time under guidance based on the accurate congestion state of the toll-gate predicted in advance. For example, a driver will have n routes that can be selected from the origin A to the destination B, and then based on the congestion evaluation and the calculation of the travel time, multiple options (the least delay, the shortest distance, or the least congestion, etc.) are provided for the road user, which is especially effective in the emergency of key road of disasters.

Furthermore, if real-time and multisource traffic data based on toll-gates or other sensors are collected, the work of traffic prediction and evaluation of multitime span of day/month/year and multiscale (road network, local lanes, vehicle, etc.) can be further analysed in a more fine-grained manner using the AI graph network proposed in this paper.

Of course, each method will have some unexpected problems when it is applied in practice. For example, although our proposed model can predict future traffic congestion in spatiotemporal dimension, it cannot intuitively explain the rules for the evolution of traffic flow outside the “congestion domain” of toll-gates, including how congestion flow forms and dissipates. Therefore, in the follow-up work, we will combine the evolution rules of actual traffic flow to better control the change of traffic to improve the intelligent level of traffic emergency management.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Acknowledgments

The research is financially supported by National Key Research and Development Program of China, Key technologies for derivative composite disaster assessment and emergency adapting in the Guangdong-Hong Kong-Macao Greater Bay Area (2021YFC3001000).

References

[1] Y. Wang, X. Yu, S. Zhang et al., “Freeway traffic control in presence of capacity drop,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 3, pp. 1497–1516, 2021.
[2] S. Zahedian, A. Nohekhan, and K. F. Sadabadi, “Dynamic toll prediction using historical data on toll roads: case study of the I-66 inner beltway,” Transportation Engineering, vol. 5, article 100084, 2021.
[3] L. Shen, J. Lu, D. Geng, and L. Deng, “Peak traffic flow predictions: exploiting toll data from large expressway networks,” Sustainability, vol. 13, no. 1, pp. 260–318, 2021.
[4] L. Yan, P. Wang, J. Yang, Y. Hu, Y. Han, and J. Yao, “Refined path planning for emergency rescue vehicles on congested urban arterial roads via reinforcement learning approach,” Journal of Advanced Transportation, vol. 2021, no. 1, Article ID 8772688, p. 12, 2021.
[5] M. Gori, G. Mandarina, and F. Scarcella, “A new model for learning in graph domains,” Proceedings of the International Joint Conference on Neural Networks, vol. 2, no. 2, pp. 729–734, 2005.
[6] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “The graph neural network model,” IEEE Transactions on Neural Networks, vol. 20, no. 1, pp. 61–80, 2009.
[7] J. Zhou, G. Cui, S. Hu et al., “Graph neural networks: a review of methods and applications,” AI Open, vol. 1, pp. 57–81, 2020.
[8] B. Yu, Y. Lee, and K. Sohn, “Forecasting road traffic speeds by considering area-wide spatio-temporal dependencies based on a graph convolutional neural network (GCN),” Transportation Research Part C: Emerging Technologies, vol. 114, pp. 189–204,
[9] X. Shi, H. Qi, Y. Shen, G. Wu, and B. Yin, “A spatial–temporal attention approach for traffic prediction,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 8, pp. 4909–4918, 2021.
[10] B. Yu, H. Yin, and Z. Zhu, “Spatiotemporal graph convolutional networks: a deep learning framework for traffic forecasting,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, pp. 3634–3640, Stockholm, 2018.
[11] Y. Li, R. Yu, C. Shahabi, and Y. Liu, “Diffusion convolutional recurrent neural network: Data-driven traffic forecasting,” 2017, https://arxiv.org/abs/1707.01926.
[12] L. Zhao, Y. Song, C. Zhang et al., “T-GCN: a temporal graph convolutional network for traffic prediction,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 9, pp. 3848–3858, 2020.
[13] J. Zhao, Y. Gao, Z. Bai, H. Wang, and S. Lu, “Traffic speed prediction under non-recurrent congestion: based on LSTM method and BeiDou navigation satellite system data,” IEEE Intelligent Transportation Systems Magazine, vol. 11, no. 2, pp. 70–81, 2019.
[14] Z. Lv, L. Qiao, K. Cai, and Q. Wang, “Big data analysis technology for electric vehicle networks in smart cities,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 3, pp. 1807–1816, 2021.
[15] B. Williams and L. Hoel, “Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: theoretical basis and empirical results,” Journal of Transportation Engineering, vol. 129, no. 6, pp. 664–672, 2003.
[16] H. Yang, P. Jin, B. Ran, D. Yang, Z. Duan, and L. He, “Freeway traffic state estimation: a Lagrangian-space Kalman filter approach,” Journal of Intelligent Transportation Systems, vol. 23, no. 6, pp. 525–540, 2019.
[17] C. Wu, J. Ho, and D. Lee, “Travel-time prediction with support vector regression,” IEEE Transactions on Intelligent Transportation Systems, vol. 5, no. 4, pp. 276–281, 2004.
[18] P. Wang, W. Hao, and Y. Jin, “Fine-grained traffic flow prediction of various vehicle types via fusion of multisource data and deep learning approaches,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 11, pp. 6921–6930, 2021.
[19] C. Shuai, W. Wang, G. Xu, M. He, and J. Lee, “Short-term traffic flow prediction of expressway considering spatial influences,” Journal of Transportation Engineering, Part A: Systems, vol. 148, no. 6, 2022.
[20] W. Zhang, Y. Yu, Y. Qi, F. Shu, and Y. Wang, “Short-term traffic flow prediction based on spatio-temporal analysis and CNN deep learning,” Transportmetrica A Transport Science, vol. 15, no. 2, pp. 1688–1711, 2019.
[21] S. Guo, Y. Lin, N. Feng, C. Song, and H. Wan, “Attention based spatial-temporal graph convolutional networks for traffic flow forecasting,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 1, pp. 922–929, 2021.
[22] M. Akhtar and S. Moridpour, “A review of traffic congestion prediction using artificial intelligence,” Journal of Advanced Transportation, vol. 2021, Article ID 8878011, 18 pages, 2021.
[23] T. Afrin and N. Yodo, “A probabilistic estimation of traffic congestion using Bayesian network,” Measurement, vol. 174, article 109051, 2021.
[24] R. Esfahani, F. Shahbazi, and M. Akbarzadeh, “Three-phase classification of an uninterrupted traffic flow: a k-means clustering study,” Transportmetrica B: Transport Dynamics, vol. 7, no. 1, pp. 546–558, 2019.
[25] J. Dunn, “Well-Separated clusters and optimal fuzzy partitions,” Journal of Cybernetics, vol. 4, no. 1, pp. 95–104, 1974.
[26] Z. Cheng, W. Wang, J. Lu, and X. Xing, “Classifying the traffic state of urban expressways: a machine-learning approach,” Transportation Research Part A: Policy and Practice, vol. 137, pp. 411–428, 2020.
[27] S. Gan, S. Liang, K. Li, J. Deng, and T. Cheng, “Trajectory length prediction for intelligent traffic signaling: a data-driven approach,” IEEE Transactions on Intelligent Transportation Systems, vol. 19, no. 2, pp. 426–435, 2018.
[28] T. Vincenty, “Direct and inverse solutions of geodesics on the ellipsoid with application of nested equations,” Survey Review, vol. 23, no. 176, pp. 88–93, 1975.
[29] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, “Spectral networks and locally connected networks on graphs,” 2014, https://arxiv.org/abs/1312.6203.
[30] E. Michael, K. Hermann, and B. Dagmar, “Large-scale quantum networks based on graphs,” New Journal of Physics, vol. 18, no. 5, article 053036, 2016.
[31] S. Bai, J. Kolter, and V. Koltun, “An empirical evaluation of generic convolutional and recurrent networks for sequence modelling,” 2018, https://arxiv.org/abs/1803.01271.
[32] J. Wang, Y. Zhang, Y. Wei, Y. Hu, X. Piao, and B. Yin, “Metro passenger flow prediction via dynamic hypergraph convolution networks,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 12, pp. 7891–7903, 2021.
[33] J. Yang, P. Wang, W. Yuan, Y. Ju, W. Han, and J. Zhao, “Automatic generation of optimal road trajectory for the rescue vehicle in case of emergency on mountain freeway using reinforcement learning approach,” IET Intelligent Transport Systems, vol. 15, no. 9, pp. 1142–1152, 2021.
[34] J. Jiang, Q. Chen, J. Xue, H. Wang, and Z. Chen, “A novel method about the representation and discrimination of traffic state,” Sensors, vol. 20, no. 18, p. 5039, 2020.

CPT-DF: Congestion Prediction on Toll-Gates Using Deep Learning and Fuzzy Evaluation for Freeway Network in China



Publisher
Hindawi Publishing Corporation
ISSN
0197-6729
eISSN
2042-3195
DOI
10.1155/2023/2941035

Abstract

Hindawi
Journal of Advanced Transportation
Volume 2023, Article ID 2941035, 16 pages
https://doi.org/10.1155/2023/2941035

Research Article
CPT-DF: Congestion Prediction on Toll-Gates Using Deep Learning and Fuzzy Evaluation for Freeway Network in China

Tongtong Shi (1,2), Ping Wang (1,2), Xudong Qi (2), Jiacheng Yang (3), Rui He (2), Jingwen Yang (2), and Yu Han (1,4)

(1) School of Intelligent System Engineering, Sun Yat-Sen University, Guangzhou 518000, China
(2) School of Electronics and Control Engineering, Chang’an University, Xi’an 710064, China
(3) School of Automation, Southeast University, Nanjing 210096, China
(4) Guangdong Province Key Laboratory of Fire Science and Technology, Guangzhou 510006, China

Correspondence should be addressed to Ping Wang; wang0372@e.ntu.edu.sg and Yu Han; hanyu25@mail.sysu.edu.cn
Received 26 April 2022; Accepted 3 September 2022; Published 10 April 2023
Academic Editor: Dong-Kyu Kim
Copyright © 2023 Tongtong Shi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Toll-gates are crucial points of management and key congestion bottlenecks for the freeway. To avoid traffic deterioration and alleviate traffic congestion in advance, it is necessary to predict and evaluate the congestion at toll-gates scattered across a large-scale region of the freeway network. In this paper, traffic volume and operational delay time are selected from various traffic indicators to evaluate congestion, considering the particular characteristics of the traffic flow within the toll-gate area. The congestion prediction method comprises two modules: a deep learning (DL) prediction and a fuzzy evaluation.
We propose a modified deep learning method based on a graph convolutional network (GCN) structure fused with a dilated causal mechanism, and we optimize spatial feature extraction by constructing a new adjacency matrix. This new AI network can process the spatiotemporal information of traffic volume and operational delay time extracted simultaneously from toll-gates at large scale, and predict these key indicators 15, 30, and 60 min ahead. The evaluation module builds on these predicted results: the fuzzy C-means (FCM) algorithm is modified by determining coupling weights for the two key indicators to detect the congestion state. The original traffic data are obtained from the 186 toll-gates currently operating on the freeway network in Shaanxi Province, China. Experimental tests are carried out on four months of historical data after preprocessing. The comparative tests show that the proposed CPT-DF (congestion prediction on toll-gates using deep learning and fuzzy evaluation) outperforms other currently used models by 6-15%. Successful prediction could be extended to real-time prediction and early warning of traffic congestion in the toll system, improving to some extent the intelligence of traffic emergency management and guidance on key roads during disasters.

1. Introduction

The freeway is the backbone of long-distance transport because of its low disruption and excellent road conditions [1]. As the demand for long-distance transport increases with economic development, freeway mileage in China is growing rapidly, as shown in Figure 1(a). Researchers have done a large amount of work on the traffic state of urban road sections, but there are relatively few studies on freeway toll-gates [2]. Since the freeway is under closed management and toll-gates are scattered across a large-scale region of the freeway network, the characteristics of the traffic flow within the toll-gate area differ from those on other roads. A vehicle entering the toll-gate area goes through multiple steps such as deceleration, lane changing, merging, and toll payment [3]. Overall traffic efficiency is strongly affected by the passing time through the toll-gates. Thus, toll-gates are crucial points of management and key congestion bottlenecks for the freeway network, especially in China. However, serious traffic congestion often occurs in the toll-gate area. As shown in Figure 1(b), at the top 10 toll-gates ranked by congestion in China, the vehicle speed may drop to 10 km/h and the congestion index may even reach 50-60. Congestion may additionally cause traffic accidents, energy consumption, and environmental pollution in the areas of these toll-gates [4]. To alleviate or avoid these problems, an increasing number of researchers are working on two aspects, traffic prediction and congestion evaluation of the toll-gate area, in which quantitatively predicted traffic indicators serve as the basis for evaluating congestion. Successful prediction and evaluation would make advance management of the toll-gate and guidance of traffic routes possible. Traffic perception will become faster and more accurate with the development of artificial intelligence (AI) and deep learning (DL), providing more effective intelligent technology for alleviating congestion and for traffic emergency management.

Figure 1: (a) Mileage of freeways in China (10,000 km), 2012 to 2021. (b) Top 10 toll-gates ranked by congestion index, with driving speed in km/h (11:10, April 9, 2022). Data source: https://report.amap.com/congest.do#.

In terms of traffic prediction, the freeway network is a topological structure with dynamic spatiotemporal features, manifested in the periodicity of traffic flow and the spatial correlation between upstream and downstream toll-gates. Initially, due to the limitations of intelligent technology, most work on traffic congestion prediction focused on the temporal dimension while ignoring the information in the large-scale spatial dimension. The graph neural network (GNN) was first proposed by Gori et al. [5] and Scarselli et al. [6], who introduced a graph structure to model the spatial correlation between objects. Since then, various algorithms based on graph models have been widely used in different fields, including social networks, biomedicine, and knowledge graphs [7]. Similarly, the toll-gates and ordinary road sections of the freeway network can be mapped to the vertices and edges of a graph. Recently, as a branch of GNN, Graph Convolutional Networks (GCN) [8, 9] were introduced to traffic work and efficiently implement congestion prediction from a spatiotemporal perspective. However, results in long-term prediction remain unsatisfactory because of defects in the models. For example, vanishing and exploding gradients in Recurrent Neural Networks (RNNs) lead to the loss of historical information, and RNNs cannot be processed in parallel on a large scale, resulting in slow computation. Therefore, a graph convolutional network fused with the dilated causal mechanism is introduced in this paper to compensate for this deficiency. In addition, most existing graph-based methods build spatial features based only on distances or only on correlations between toll-gates [10–12]. Two situations then arise, as shown in Figure 2: (1) short Euclidean distances between gates but little spatial correlation of traffic flows (Region A: $A_1$ and $A_2$ are not directly related), and (2) long Euclidean distances between gates but strong spatial correlation of traffic flows (Region B). Relying on a single factor therefore cannot accurately capture the large-scale spatial features and can even seriously degrade prediction accuracy. For this reason, a new spatial adjacency matrix that combines two features of toll-gates, correlation and distance, is proposed in this paper; it describes the spatial characteristics of toll-gates at large scale more accurately.

Figure 2: The special situation of toll-gate location and traffic flow distribution (Regions A and B; flow versus time of day for toll-gates 210105, 510202, 740103, and 590106).

In terms of congestion evaluation, the task is to further analyze and summarize the prediction results (single or multiple traffic indicators). It includes the selection of indicators and of evaluation methods to measure traffic congestion. Firstly, researchers mostly use traffic indicators such as speed, traffic volume, occupancy, operational delay time, and queuing length as the detection criteria for urban traffic congestion. The freeway is a closed system consisting of “toll entrances-road sections-toll exits”. Under normal circumstances, a vehicle drives at a high and uniform speed on ordinary road sections. On reaching the toll-gate area, the vehicle slows down to an extremely low speed until the toll is completed and then accelerates away. Speed therefore varies greatly within the toll-gate area, so it is suitable as a congestion indicator on ordinary road sections rather than in the toll-gate area. In addition, since sections upstream and downstream of the toll-gate often have no detectors or have damaged detectors, the queuing length and occupancy rate cannot be completely and directly detected. In view of this, the traffic volume based on toll-data is preferred in this paper as one of the indicators to identify the congestion state of freeway toll-gates. Besides, because toll-gates differ in scale, the evaluation results of toll-gate $i$ and toll-gate $j$ may differ under the same traffic volume. Therefore, this paper selects operational delay time as another indicator and designs a calculation method based on toll-data. Secondly, there is no standard congestion evaluation method, because the Electronic Toll Collection (ETC) channel has been added to the toll-gates of China’s freeways in recent years, which is an important renovation of the toll-gates [13]. Available references include the “Highway Capacity Manual (HCM)” issued by the United States and the “Urban Road Traffic Operation Evaluation Indicator System” issued by China, both of which quantify the service level into $n$ intervals according to road characteristics. Therefore, clustering, and in particular the currently popular methods based on fuzzy clustering, is introduced in this paper to perform traffic state classification. There are some studies in this area. A congestion evaluation model based on the fuzzy C-means (FCM) algorithm was proposed in [14]. This unsupervised learning method, which uses fuzzy clustering to analyze hidden data patterns, is helpful for the congestion evaluation of toll-gates. However, the traditional FCM algorithm does not consider the influence of each traffic indicator on the clustering results, and the algorithm is prone to falling into a local minimum. To address these problems, this paper improves the FCM method by combining the coupling weights of the two traffic indicators to realize the congestion evaluation accurately. The main contributions can be summarized as follows.

(i) Based on toll-data, a new calculation method for “traffic volume” and “operational delay time” is proposed in this paper to complete the congestion prediction and evaluation of toll-gates more accurately.

(ii) A new AI network (CPT-DF) for congestion prediction of freeway toll-gates using deep learning and fuzzy evaluation is proposed. The GCN of the prediction module captures the spatial information of the road network by weighting the neighbourhood node features and embedding them into the graph structure, and the improved dilated causal mechanism preserves the nonlinear ability of the model by adding residual connections and gated linear units, so that temporal information can be better captured. The evaluation module uses the improved FCM algorithm for accurate interval classification of the toll-gate traffic state.

(iii) A new adjacency matrix that combines both the correlation and distance features of toll-gates is constructed in this paper, which improves on existing methods by extracting the spatial features of large-scale toll-gates more accurately.

(iv) This paper verifies the proposed CPT-DF model on toll data, and some toll-gates are selected to complete the work of congestion prediction, which could improve the intelligence of traffic emergency management and guidance on key roads during disasters.

2. Related Work

2.1. Prediction Algorithms. In recent years, with the increase in the amount of data and the development of fusion technology, data-driven prediction algorithms represented by statistical models, traditional machine learning, and deep learning models have become more and more popular in various research fields. In the field of traffic prediction, the ARIMA model [15] and the Kalman filter model [16] rely on statistical methods to model the relationship between traffic parameters and predict the road traffic state. However, such models are usually based on linear assumptions and cannot adequately explain the high-dimensional and nonlinear nature of traffic data. Later, machine learning methods such as Support Vector Regression (SVR) [17] were shown to handle the nonlinear relationships in traffic data well, so such methods are widely used on freeways and urban roads. However, the changes in road environment and traffic flow near toll-gates are more complicated than the temporal characteristics of ordinary road sections, and there are few studies on toll-gates at present. Wang et al. [18] fused vehicle detector data, long-range microwave sensor data, and toll-data and employed a Deep Belief Network (DBN) to predict toll-gate traffic flow on a small-scale ring road at time intervals of 30, 60, and 120 minutes. Shuai et al. [19] adopted a modified Long Short-Term Memory (LSTM) network and predicted the traffic volume of 51 screened toll-gates. These deep learning-based methods are good at capturing traffic trends in complex toll-gate environments or exploring spatial connectivity between one or more road segments in a single temporal dimension, but the spatial characteristics between toll-gates on a large scale are not considered. Researchers therefore extracted the spatial features of the road network by introducing the convolutional neural network (CNN) [20], converting the structure of the road network into a standard grid structure, but CNN is not suitable for processing non-Euclidean data such as toll-gates. Recently, GCNs have been widely used for many graph-based tasks, and many studies have further explored the use of GCNs to model the topology of road networks. The STGCN model [10] was the first to combine GCN and CNN to model the traffic network and the spatiotemporal sequence. This paper also follows the fusion principle of STGCN in the model construction, but STGCN can only use CNN to process the signal of each layer of the network; when propagating to the upper layer, the processing of samples is independent at each moment, so it cannot cope well with long-term prediction. The DCRNN model [11] models the spatial correlation as a diffusion process on a directed graph, yielding a diffusion convolutional recurrent neural network capable of capturing the spatial and temporal dependencies between long-term sequences using the seq2seq framework. This paper is inspired by DCRNN in the construction of the time series model. The ASTGCN model [21] introduces an attention mechanism (GAT) into GCN to model temporal and spatial correlations. However, this model cannot capture spatial and temporal dependencies simultaneously and only considers low-order neighbourhood relationships between nodes, ignoring the correlations between different historical periods. The T-GCN model [12] integrates GCN and GRU to capture traffic spatiotemporal features. These models capture the information between adjacent nodes well from both the spatial and the temporal perspective and so complete short-term traffic prediction. However, they have not been successfully applied to traffic prediction at toll stations, so this paper proposes a new model and uses toll-data to apply it to a real scenario.

2.2. Evaluation Algorithms. The methods of congestion evaluation can be divided into traffic theory and data-driven algorithms [22]. The former uses physical and mathematical theory to describe the characteristics of traffic behaviour and then evaluates the traffic congestion, but it is not suitable for scenarios with high complexity and unknown degree, such as toll-gates. The latter uses neural networks and clustering algorithms to evaluate traffic congestion [23]. Neural network-based methods need traffic states to be classified manually in advance, but it is difficult to classify states accurately in unknown scenarios. In contrast, clustering methods are well suited to the special scenario of toll-gates, for which no standard evaluation exists. A clustering algorithm divides a collection of research objects into multiple classes consisting of similar objects. Initially, the K-means clustering algorithm [24], combined with the three parameters of traffic volume, speed, and occupancy, was applied to achieve a simple state evaluation. However, the algorithm cannot evaluate the critical threshold of the sample. Later, considering that the traffic state is a fuzzy concept, fuzzy clustering was also applied to traffic state evaluation. The FCM algorithm was first proposed by Dunn [25], and improved algorithms based on FCM have since been widely used in the field of traffic evaluation [26, 27].

3. Methodology

3.1. Problem Definition

3.1.1. Congestion Domain of Toll-Gates. To study the traffic congestion of the toll-gates, this paper constructs a “toll-gate congestion domain” according to the traffic characteristics of the toll-gates, which includes three areas (upstream section, deceleration section, and toll section), as shown in Figure 3. After the vehicle passes through the congestion domain, it quickly resumes normal driving through the acceleration section and the downstream section.

Upstream Section. A section of fixed distance before the vehicle enters the toll-gate. The first vehicle detector (VD), marked (1) in Figure 3, is installed in the upstream section and is used to detect parameters such as the speed of passing vehicles and the current time.

Deceleration Section. The vehicle starts to slow down, enters the toll-gate, and selects a toll lane.

Toll Section. The toll lanes can be divided into Manual Toll Collection (MTC) and Electronic Toll Collection (ETC) based on the tolling method. The vehicle decelerates through the railing locomotive detector (RLD), marked (2) in Figure 3, which records the current time.

Figure 3: Composition of toll-gates and construction of congestion domain (upstream, deceleration, toll-gate, acceleration, and downstream sections, with travel times $T_u$, $T_D$, and $T_t$; (1) vehicle detectors (upstream), (2) railing locomotive detector).

3.1.2. Calculation of Indicator. Firstly, the original data collected in this paper are aggregated every 5 minutes, and the traffic volume can be counted directly (the details are presented in Section 4.1.1). Secondly, the operational delay time on ordinary roads can be obtained from “speed-acceleration-distance”, but the limitations of this method introduce a large error. According to the characteristics of toll-gates, the operational delay time is calculated as

$$D = \left(\frac{1}{n}\sum_{i=1}^{n} T_i\right) - \left(\frac{1}{K}\sum_{k=1}^{K} T_{Pk}\right), \tag{1}$$

$$T_i = T_u + T_D + T_t = T_{i(\mathrm{RLD})} - T_{i(\mathrm{VD})}, \tag{2}$$

$$T_{Pk} = \sum_{j=1,\ \mathrm{volume}<Q} \left(T_{jk(\mathrm{RLD})} - T_{jk(\mathrm{VD})}\right), \tag{3}$$

where $D$ is the calculated average operational delay time per five minutes. $T_i$ is the passing time of the $i$-th vehicle through the congestion domain. $T_u$, $T_D$, and $T_t$ are the travel times in the upstream section, the deceleration section, and the toll section shown in Figure 3, respectively. $T_{Pk}$ is the travel time of the vehicle through the toll channel in the congestion domain. $Q$ is the flow threshold below which the traffic state is unblocked. $T_{i(\mathrm{VD})}$ and $T_{i(\mathrm{RLD})}$ are the times at which the $i$-th vehicle is detected by the upstream vehicle detector and the railing locomotive detector, respectively.

3.1.3. Preparing for Input/Output. As shown in Table 1, two sets of inputs/outputs are designed to perform both prediction and evaluation. For the prediction module, the input values are the two previously selected feature data (traffic volume and operational delay time) and the graph data. The graph data represent the spatial relationship between the toll-gates. The two traffic indicators represent the temporal relationship between the historical traffic state and the future traffic state of a toll-gate. The output values are the two traffic indicators predicted by the model for future moments. For the evaluation module, the input values are the output of the prediction module and the congestion threshold corresponding to each toll-gate, and the output value is the final traffic state. This part mainly introduces the construction of the spatiotemporal data input to the prediction module.

(1) Description of Feature Data. Traffic prediction is a typical spatiotemporal prediction problem. Given the previous $M$ observations of the historical traffic features, the data measured at the $N$ toll-gates can be viewed as a matrix of size $M \times N$. The prediction is then the most likely sequence of values over the next $H$ time steps:

$$\hat{F}_{t+1}, \ldots, \hat{F}_{t+H} = \underset{F_{t+1}, \ldots, F_{t+H}}{\arg\max}\ \log P\left(F_{t+1}, \ldots, F_{t+H} \mid F_{t-M+1}, \ldots, F_t\right), \tag{4}$$

where $F_t \in \mathbb{R}^n$ is a vector of observations for $n$ toll-gates at time step $t$; here $F_t$ refers to both the traffic volume and the operational delay time.

(2) Construction of Graph Data. For unordered gate network traffic data, the observations $F_t$ are not independent and can be viewed as graph signals defined on an undirected graph $G$, as shown in Figure 4. The graph is expressed as $G_t = (F_t, E, W)$, where $E$ is the set of edges representing the connections between gates, and $W \in \mathbb{R}^{n \times n}$ is the adjacency matrix of $G_t$.

Each toll-gate can be regarded as a vertex in the graph structure, and the road segments connecting the toll-gates can be regarded as edges. To represent the spatial relationship between toll-gates (vertices), the Euclidean distance is usually chosen, but this leads to the two defects shown in Figure 2 and mentioned earlier. Therefore, we introduce a Distance Matrix (D-Matrix) and a Correlation Matrix (C-Matrix) to represent spatial features.

(i) The D-Matrix holds the distance between each pair of gates, which can be calculated from the latitude and longitude values of the gates by “Vincenty solutions of geodesics on the ellipsoid” [28].

(ii) The C-Matrix records whether two gates are directly connected. As shown in Figure 5, $P_1$ and $P_4$ are directly connected, but $P_1$ and $P_2$ are not, so $C_{p_1 p_4} = 1$ and $C_{p_1 p_2} = 0$.

Finally, a novel Distance and Correlation Matrix (D&C-Matrix) is constructed to calculate the adjacency matrix $W$ as

$$w_{ij} = \begin{cases} \exp\left(-\dfrac{\left([D_{ij}] \odot [C_{ij}]\right)^2}{\sigma^2}\right), & i \neq j \ \text{and} \ \exp\left(-\dfrac{\left([D_{ij}] \odot [C_{ij}]\right)^2}{\sigma^2}\right) \geq \epsilon, \\ 1, & \text{others}, \end{cases} \tag{5}$$

where $[D_{ij}]$ and $[C_{ij}]$ are the D-Matrix and C-Matrix, respectively, and $\odot$ is the Hadamard product.

Table 1: The input/output values of the proposed model.
Prediction module, input values: feature data (historical): traffic volume (number/n min) and operational delay time (s); graph data: D&C-matrix.
Prediction module, output values: traffic volume (number/n min) (future).
Evaluation module, input values: feature data (future): traffic volume (number/n min); threshold data: congestion boundaries.
Evaluation module, output values: traffic state.

Figure 4: (a) Toll-gates in a road network. (b) Spatiotemporal correlation.
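As a rough illustration of how a D&C adjacency matrix of the kind described by Eq. (5) could be assembled, the sketch below masks a distance matrix with a connectivity matrix and applies a Gaussian kernel. It uses the four-gate example values that appear in Figure 5. The function name `dc_adjacency`, the values of `sigma` and `eps`, and the choice to set pruned or unconnected pairs to 0 are assumptions made for this sketch, not details confirmed by the paper.

```python
import math

def dc_adjacency(dist, conn, sigma, eps):
    """Sketch of a distance-and-correlation adjacency matrix.

    dist:  n x n distance matrix (D-Matrix)
    conn:  n x n 0/1 connectivity matrix (C-Matrix)
    sigma, eps: kernel width and sparsity threshold (hypothetical values)
    """
    n = len(dist)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                w[i][j] = 1.0  # diagonal entries set to 1, as in the Figure 5 example
                continue
            masked = dist[i][j] * conn[i][j]  # element of the Hadamard product D ⊙ C
            weight = math.exp(-(masked ** 2) / sigma ** 2)
            # Assumption: unconnected pairs and weights below eps are pruned to 0.
            if conn[i][j] and weight >= eps:
                w[i][j] = weight
    return w

# Four-gate example values as they appear in Figure 5.
D = [[1, 4, 95, 89], [4, 1, 76, 1], [95, 76, 1, 5], [89, 1, 5, 1]]
C = [[1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1]]
W = dc_adjacency(D, C, sigma=100.0, eps=0.1)
for row in W:
    print([round(v, 3) for v in row])
```

In this sketch gates that are close but not connected (for example $P_1$ and $P_2$, distance 4) receive weight 0, while connected gates that are far apart (for example $P_1$ and $P_4$, distance 89) keep a nonzero weight, which is the behaviour the two situations in Figure 2 call for.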
Figure 5: Construction of adjacency matrix. (a) Location distribution of the four toll-gates $P_1$ to $P_4$ (yellow lines represent dividing fences that vehicles on the freeway cannot pass through). (b) The process of building an adjacency matrix. In the example, the C-matrix rows are (1, 0, 0, 1), (0, 1, 1, 0), (0, 1, 1, 0), (1, 0, 0, 1); the D-matrix rows are (1, 4, 95, 89), (4, 1, 76, 1), (95, 76, 1, 5), (89, 1, 5, 1); and the D&C-matrix rows are (1, 0, 0, 89), (0, 1, 76, 0), (0, 76, 1, 0), (89, 0, 0, 1).

In Eq. (5), $D_{ij}$ and $C_{ij}$ are the distance and correlation between gates $i$ and $j$, and $\sigma$ and $\epsilon$ are thresholds that control the distribution and sparsity of the matrix.

3.2. Overview. In this paper, we propose a deep learning AI network (CPT-DF) that integrates a fine-grained congestion evaluation mechanism, as shown in Figure 6. The CPT-DF network includes two modules: a prediction module and an evaluation module. The prediction module includes input/output layer 1 and the spatiotemporal convolution layer, and the evaluation module includes input/output layer 2 and the congestion evaluation layer. Output layer 1 and input layer 2 are marked in green to indicate that data are passed from one to the other during the calculation.

First, the preliminary work completes the construction of the congestion domain of the toll-gate, the selection of indicators, and the data required by the input layer. Then, the prediction module predicts the traffic indicators (traffic volume and operational delay time) for future periods using spatiotemporal convolutional layers built from GCN fused with dilated causal convolutions. Finally, the evaluation module combines the predicted indicators with the FCM clustering mechanism to detect the congestion of the toll-gate in the future period.

Figure 6: CPT-DF network: congestion prediction on toll-gates using deep learning and fuzzy evaluation. The blue line boxes represent the input values of the prediction module and the evaluation module, respectively, the red line boxes represent the output values, and the output value of the prediction module is also one of the input values of the evaluation module.

3.3. Prediction Module

3.3.1. Spatial Feature Extraction. GCN is a basic operation based on either spectral decomposition or spatial structure. The spectral decomposition-based method works with the spectral-domain representation of the graph. In this paper, the spectral decomposition method is used to extract node spatial features from the given node information. As early as 2014, Bruna et al. [29] proposed the Spectral Network to define convolution operations in the Fourier domain, where the convolution of a feature $x \in \mathbb{R}^n$ with a kernel $G_\theta = \mathrm{diag}(\theta)$ is defined as

$$x * G_\theta = U G_\theta(\Lambda) U^{T} x, \tag{6}$$

where $U$ is the matrix composed of the eigenvectors of the normalized Laplacian matrix, and $\Lambda$ is the diagonal matrix of its eigenvalues. However, this form of convolution requires the eigendecomposition of the Laplacian matrix of the graph, which leads to potentially intensive computations and to unsatisfactory locality of the convolution kernels.

To alleviate the problems of the Spectral Network, in 2016, Michael et al. [30] proposed the ChebNet K-hop convolution to define convolution on the graph, thus eliminating the time cost of computing the eigenvectors of the Laplacian matrix. On this basis, this paper sets $K = 1$ to alleviate the local overfitting problem. The graph convolution can then be written as

$$x * G_\theta = \theta'_0 x - \theta'_1 \left(D^{-1/2} W D^{-1/2}\right) x = \theta \left(I + D^{-1/2} W D^{-1/2}\right) x, \tag{7}$$

where the adjustable parameter is $\theta = \theta'_0 = -\theta'_1$ and $D$ is the degree matrix. The GCN model constructs filters in the Fourier domain, builds spatial features by stacking multiple local Conv layers, and extracts the structural information of the network in the form of convolutions. A deeper structure can therefore be constructed to recover spatial information in depth, which achieves a larger receptive field and reduces the number of convolution kernels.

3.3.2. Temporal Feature Extraction. As a derivative of CNN, the Temporal Convolutional Network (TCN) [31] is a network framework for processing sequences or data containing time series. It extracts features across time steps by directly exploiting the properties of convolutions, and it uses fully convolutional networks and dilated causal convolution to produce a corresponding output for each input while ensuring that no historical data are missed. In this paper, an improved TCN is designed to extract temporal features by fusing dilated convolution, GLU, and residual blocks. The improved TCN structure is shown in Figure 7.

(1) Dilated Causal Convolution. Dilated causal convolution is used to handle the long time dimension of large data sets. The expansion (dilation) coefficients of the convolution kernel can be combined arbitrarily from the set [1, 2, 4, 8, 16, 32]. Comparative experiments show that the results obtained with [1, 2, 4], [1, 2, 4, 8, 16], and [8, 16, 32] are relatively stable. At the same time, to maintain the temporal relationship of historical information, the kernel size is set to 2.
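To make the dilated causal mechanism concrete, the following minimal sketch stacks three dilated causal convolutions with kernel size 2 and the dilation set [1, 2, 4] mentioned above. It is an illustration under stated assumptions, not the authors' implementation: the function names, the all-ones kernel, the zero left-padding, and the toy input are hypothetical.

```python
def dilated_causal_conv(x, kernel, dilation):
    """y[t] = sum_i kernel[i] * x[t - dilation * i], with zeros before t = 0.

    Only current and past samples are read, so the output at time t
    never depends on future inputs (causality).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, weight in enumerate(kernel):
            idx = t - dilation * i
            if idx >= 0:
                acc += weight * x[idx]
        out.append(acc)
    return out

def receptive_field(kernel_size, dilations):
    """Number of input steps seen by one output of the stacked layers."""
    return 1 + (kernel_size - 1) * sum(dilations)

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # toy 5-min series (hypothetical)
h = signal
for d in [1, 2, 4]:
    h = dilated_causal_conv(h, [1.0, 1.0], d)  # all-ones kernel for easy checking

print(receptive_field(2, [1, 2, 4]))  # steps covered by the three layers
print(h)
```

With the all-ones kernel, each output of the last layer is simply the sum of the inputs inside its receptive field, which makes it easy to verify by hand how doubling the dilation widens the history covered without adding parameters.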
Therefore, a deeper same time, in order to maintain the temporal relationship structure can be constructed to deeply recover spatial infor- of historical information, the kernel is set to 2, and the 8 Journal of Advanced Transportation (i-1) (i-1) (i-1) Outputs {F , F ... F } 1 2 N Dropout Dilated convolution . . . (1) (1) N–1 Dilated convolution 1 1 Conv Dropout GLU 1 1 The weighted Conv normalization Dilated convolution (i-1) (i-1) (i-1) Inputs {F , F ... F } F , F ... F , F 1 2 N 0 1 N–1 N (a) (b) Figure 7: Structure of improved temporal feature extraction. (a) The composition structure of the overall framework. (b) Structure of Dilated Causal Convolutions. ðÞ M−K +1 ×C expansion coefficient is used as the sliding jump value, and t 0 Γ ∗ τY = P ⊙ σ Q ∈ ℝ , ð10Þ ðÞ the receptive field is set to2×2^ð4 − 1Þ =16: where P and Q are the input of the GLU gate, respec- Formally, for the one-dimensional sequence input x ∈ ℝ tively, and ⊙ represents the element-wise Hadamard and the kernel function ϕ ∈ ℝ , d is the expansion coefficient, product. σðQÞ controls the dynamic change of the input P, the flow data sequence input by time is fF , F , ⋯F g,and 0 1 N and the added nonlinear link ensures the stacked input of the output result is denoted as fg , g , ⋯g g, the mapping 0 1 N the time layer, and the residual connection is realized in the relationship S between F and g can be expressed as: time layer of the stack. Using the same convolution kernel Γ M×C for each node y ∈ ℝ in the traffic graph, the time domain g ̂ , g ̂ , ⋯g ̂ =SF , F , ⋯F : ð8Þ ðÞ 0 1 N 0 1 N convolution Γ ∗ τy can be extended to the three-dimensional M×n×C variable y ∈ ℝ . The convolution operation S on the element F is (3) ST-Fusion Module. To fuse spatiotemporal features, a k−1 spatiotemporal fusion module (ST-Fusion Module) inspired SF = x ∗ dϕ F = 〠 ϕ i · x , ð9Þ ðÞ ðÞðÞ ðÞ F−d·i by [32] is constructed in this paper. 
The modules can be i=0 stacked or expanded depending on the size and complexity of a particular case. As shown in Figure 8, each spatial con- where k represents the kernel size, F − d · i maps the upper volutional layer bridges two temporal convolutional layers, layer history information, and at the same time introduces which can achieve fast transition of the states of the tempo- the residual block in the TCN. ral and spatial layers. In addition, this design scales the chan- nel C through the graph convolution layer, which also helps (2) Gated Linear Units (GLU). After adding the residual mod- the network to fully apply the bottleneck strategy and ule, the TCN has 3 layers of dilated convolution, and the data achieve scale and feature compression. distribution is normalized by weights, and then the GLU is used to replace the ReLU in the original structure to save the The input and output of ST-Fusion Module are both 3D nonlinearity of the remaining blocks, at the same time to l M×n×C expand the volume every time add dropout after the product tensors. For the input F ∈ ℝ of block l, the output l+1 l+1 ðM−2ðK −1ÞÞ×n×C to prevent overfitting. Furthermore, 1 ∗ 1Conv with a width F ∈ ℝ is of K is introduced to obtain the output Y of the time sequence as the input of the next stage. At this time, the time convolu- l+1 l l l l F = Γ ∗ τReLU Θ ∗ g Γ ∗ τF , ð11Þ 1 0 tion input of each node can be regarded as a sequence of M×C length M, and the number of channels is C ,so Y ∈ ℝ . K ×C ×2C l l t t 0 The convolution kernel τ ∈ ℝ is used to map the where Γ , Γ are the upper and lower kernels of the temporal 0 1 ðM−K +1Þ×2C l t 0 input Y to a single output element ½PQŠ ∈ ℝ . 1 ∗ convolutional layer of the inclusion graph convolution. Θ is 1Conv keeps the remaining input and output dimensions the spectral domain convolution kernel in graph convolu- the same, the convolution can be defined as: tion, and Re LUð·Þ represents the activation unit. 
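As an illustration of the temporal module, the following minimal Python sketch implements the dilated causal convolution of Equation (9) and the GLU gate of Equation (10) for a single one-dimensional sequence. It is not the authors' implementation: the function names, the zero padding of positions before the start of the sequence, and the dilation stack (1, 2, 4, 8) used to reproduce the stated receptive field of 16 with kernel size 2 are assumptions made for this example.

```python
import math


def dilated_causal_conv(x, kernel, dilation):
    """Eq. (9): y[t] = sum_i kernel[i] * x[t - dilation * i].

    Positions before the start of the sequence are treated as zero
    (an assumed padding choice), so y[t] never depends on x[t + 1:].
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            j = t - dilation * i  # look back i * dilation steps
            if j >= 0:
                acc += w * x[j]
        out.append(acc)
    return out


def glu(p, q):
    """Eq. (10): element-wise P * sigmoid(Q)."""
    return [a / (1.0 + math.exp(-b)) for a, b in zip(p, q)]


def receptive_field(kernel_size, dilations):
    """Number of input steps seen by one output of a stack of dilated layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    flow = [1, 2, 3, 4, 5]  # hypothetical 5-minute flow counts
    print(dilated_causal_conv(flow, [1, 1], dilation=2))
    print(receptive_field(2, [1, 2, 4, 8]))
```

With a kernel of size 2, each layer extends the receptive field by its dilation, so a stack with dilations 1, 2, 4, and 8 covers 1 + (1 + 2 + 4 + 8) = 16 time steps.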
Figure 8: Spatiotemporal block and output connection diagram.

After the fusion of temporal convolution and spatial convolution, the linear transformation $F = Zw + b$ is applied across channel $C$ to obtain the predicted traffic values of the $n$ nodes, where $w \in \mathbb{R}^{C}$ is the weight vector and $b$ is the bias. Considering the convergence speed, the L2 loss is used to measure the model performance, and the flow loss function is expressed as

$$L(\hat{v}; W_{\theta}) = \sum_{t} \left\lVert \hat{v}(v_{t-M+1}, \cdots, v_t, W_{\theta}) - v_{t+1} \right\rVert^{2}. \tag{12}$$

Since training gradually slows down as the spatiotemporal blocks mentioned above are deepened, this paper introduces Batch Normalization (BN) before the hidden layer activation function. BN fixes the distribution of the input and pulls it back to a standard normal distribution (mean 0, variance 1), which speeds up convergence and makes the optimization smoother. The specific transformation is

$$\hat{x}^{(k)} = \frac{x^{(k)} - E\left[x^{(k)}\right]}{\sqrt{\mathrm{Var}\left[x^{(k)}\right]}}, \tag{13}$$

where $x^{(k)}$ represents the output of the activation function of the hidden layer, $E[\cdot]$ represents the mean value, and $\mathrm{Var}[\cdot]$ represents the variance. At the same time, two parameters $\gamma$ and $\beta$ are added to perform the inverse activation transformation:

$$y^{(k)} = \gamma^{(k)} \hat{x}^{(k)} + \beta^{(k)}. \tag{14}$$

3.4. Evaluation Module

3.4.1. Entropy Weight Method (EWM). The traditional FCM algorithm does not consider the influence of traffic indicators and individual samples on the clustering results. The EWM uses the entropy value to judge the degree of dispersion of an indicator and determines the weight of each indicator through the information entropy. In this paper, the EWM is used to determine the weights of different indicators, and the degree of influence of each sample on the clustering results is defined by designing a sample weighting method. First, the proportion $p_{ij}$ of sample $i$ under indicator $j$ and the entropy value $E_j$ of indicator $j$ are calculated. The weight $w_j$ of each traffic indicator is then calculated as

$$p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{n} x_{ij}}, \tag{15}$$

$$E_j = -\frac{1}{\ln(n)} \sum_{i=1}^{n} p_{ij} \cdot \ln p_{ij}, \quad j = 1, 2, \cdots, z, \tag{16}$$

$$w_j = \frac{1 - E_j}{\sum_{j=1}^{z} \left(1 - E_j\right)}. \tag{17}$$

The specific congestion evaluation process is shown in Figure 9. The improved FCM algorithm introduced in this paper is an unsupervised fuzzy clustering method, that is, a data clustering method based on the optimization of an objective function. The membership degree with respect to each cluster center is represented by a numerical value. Given the feature prediction sample set $X = \{x_1, x_2, \cdots, x_n\}$ and the number of traffic state categories, the objective function of the improved FCM clustering algorithm for traffic state classification is calculated as

$$J(X, U, V) = \sum_{m=1}^{c} \sum_{i=1}^{n} (u_{im})^{\sigma} d_{im}^{2}, \tag{18}$$

$$t_i = \exp\left( -\sum_{m=1}^{c} u_{im} \cdot d_{im} \right), \qquad d_{im} = \sum_{j=1}^{z} w_j \left\lVert x_i - v_m \right\rVert, \tag{19}$$

where $U$ is the affiliation matrix of each sample belonging to different traffic states, $V$ is the matrix composed of all clustering centres, $n$ is the total number of samples, and $u_{im}$ is the affiliation degree of sample $i$ belonging to traffic state $m$. $\sigma$ is the weighted index, indicating that the less fuzzy the algorithm is, the more accurately the state is divided.
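The entropy weights of Equations (15)-(17) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `entropy_weights`, the convention that 0 · ln 0 counts as 0, and the toy sample values are assumptions. Inputs are assumed to be non-negative, with a positive sum per indicator, at least two samples, and at least one indicator that is not constant.

```python
import math


def entropy_weights(samples):
    """Entropy Weight Method for a list of samples (rows) by indicators (columns).

    Returns one weight per indicator, following Eqs. (15)-(17).
    """
    n = len(samples)       # number of samples
    z = len(samples[0])    # number of indicators
    divergence = []        # stores 1 - E_j for each indicator j
    for j in range(z):
        column = [row[j] for row in samples]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total              # Eq. (15)
            if p > 0:                      # assumed convention: 0 * ln(0) = 0
                entropy -= p * math.log(p)
        entropy /= math.log(n)             # Eq. (16)
        divergence.append(1.0 - entropy)
    norm = sum(divergence)
    return [d / norm for d in divergence]  # Eq. (17)


if __name__ == "__main__":
    # Hypothetical normalized samples: (traffic volume, delay time)
    data = [[0.20, 0.10], [0.45, 0.30], [0.80, 0.75]]
    print(entropy_weights(data))
```

An indicator whose values are identical across all samples has entropy 1 and therefore receives a weight of (numerically almost) zero, which matches the intent of the method: indicators that do not discriminate between samples contribute little to the clustering distance.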
In Equations (18) and (19), $t_i$ is the weight of sample $i$, and $d_{im}$ is the weighted Euclidean distance between sample $i$ and cluster center $m$.

Figure 9: The algorithm flow of traffic congestion evaluation.

4. Experiments

This section contains the experimental settings and experimental results.

4.1. Experimental Settings

4.1.1. Study Site and Datasets. In this paper, 186 toll-gates on freeways in Shaanxi Province, China are selected as the study site. The locations of the toll-gates are shown in Figure 10(a), showing a radial distribution. Based on the calculation of the D&C-Matrix in Equation (5), the spatial heat map relationship between the toll-gates obtained by further analysis is shown in Figure 10(b).

In addition, this paper collects the original toll data for 4 months (December 2018-March 2019) and selects some fields as shown in Figure 10(c), including the toll-gate number, vehicle detector, the arrival times at VD and RLD (VD and RLD are the positions of (1) and (2) in Figure 3, respectively), travel time, and payment method (choice of ETC and MTC channels). The collected raw data are then converted into the traffic volume and operational delay time required by the experiment. The collected traffic volume data are aggregated at 5-minute intervals as shown in Figure 10(d), and the operational delay time is calculated according to Equations (1)-(3). Due to equipment failure and other reasons, data are missing in some periods, as shown in Figure 10(e). To address the problem of missing temporal data, we analyze the distribution characteristics of the missing data and establish a reasonable complementary rule framework to interpolate the temporal data. Finally, the data completion effect shown in Figure 10(f) is achieved, and a relatively complete data set is prepared for the experimental part.

Figure 10: Data sources of freeway tolls and data processing of traffic indicators.

4.1.2. Parameters Setting. This experiment uses the ADAM optimizer for training, with an initial learning rate of 0.001 that is decayed by a factor of 0.7 every 5 epochs, and a batch size of 50. The channels of the spatiotemporal block are set to 32, 16, 32. In addition, the experiment selected the following three evaluation metrics as shown in Equations (20)-(22).

(i) Root Mean Squared Error (RMSE).

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(x_i - \hat{x}_i\right)^{2}}. \tag{20}$$

(ii) Mean Absolute Error (MAE).

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \hat{x}_i \right|, \tag{21}$$

(iii) Mean Absolute Percentage Error (MAPE).

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{x_i - \hat{x}_i}{x_i} \right| \times 100\%, \tag{22}$$

where $x_i$ is the true value and $\hat{x}_i$ is the predicted value.

4.2. Experimental Results. This part uses the feature matrix and adjacency matrix datasets based on the Shaanxi Province toll data to demonstrate the long-term prediction ability of the proposed model on large-scale networks. The experimental results are discussed from three aspects: prediction results, evaluation results, and ablation experiments.

4.2.1. Prediction Results. From the overall comparison results, as shown in Figure 11, all models based on temporal and spatial features have better prediction performance than models based on temporal features only. This also indicates that there is a strong spatial correlation between the toll stations in the large-scale road network.

Figure 11: Visualization of prediction performance (RMSE) of different models at different time steps.

Table 2 lists the comparison results of the prediction performance at 15/30/60 minutes of the seven models that only consider the temporal feature. Here, MAE and RMSE have the same units as the quantity being estimated. When the predicted object is traffic, the unit is the number of vehicles/n min; when the predicted object is delay time, the unit is min. The error of the traditional methods is larger because the temporal storage capacity of linear models is limited and the toll-gates of the freeway have more complex and nonlinear traffic flow characteristics than ordinary road sections. In particular, the historical average (HA) model relies only on the average, which makes it difficult to predict accurate results. By comparison, the ARIMA and SVR models are more suitable for short-term prediction: when the time step increases, these models have difficulty converging. Deep learning methods (LSTM, SAE, GRU, and TCN) are affected by the data distribution and produce larger prediction errors for gates with large peak-to-valley fluctuations. Among these four methods, the TCN model shows the best performance. The causal convolution in TCN differs from the traditional time series network in that it has a one-way structure in which the value at the next moment depends only on the values of the previous steps. Furthermore, the receptive field is enlarged by adding dilated convolutions to capture longer dependencies. Therefore, considering the long-term prediction performance, this paper selects and improves the dilated causal convolutional network.

Table 2: Comparison of the performance of temporal prediction.

Model    15-min                   30-min                   60-min
         MAE    MAPE     RMSE     MAE    MAPE     RMSE     MAE    MAPE     RMSE
HA       9.32   17.85%   22.98    9.32   17.85%   22.98    9.32   17.85%   22.98
ARIMA    6.97   13.20%   16.29    7.44   14.09%   18.37    7.90   14.19%   18.89
SVR      6.11   12.91%   14.70    6.55   13.43%   17.21    7.21   13.83%   17.93
LSTM     5.65   12.02%   13.11    6.09   12.71%   15.92    6.95   13.01%   17.07
SAEs     5.23   11.69%   12.78    5.60   12.01%   14.91    6.26   12.61%   16.13
GRU      4.78   10.20%   11.30    5.22   10.92%   13.84    5.68   11.22%   14.72
TCN      4.57    9.21%   10.80    4.91    9.85%   12.18    5.17   10.15%   13.44

Table 3: Comparison of the performance of spatiotemporal prediction.

Model    15-min                   30-min                   60-min
         MAE    MAPE     RMSE     MAE    MAPE     RMSE     MAE    MAPE     RMSE
STGCN    4.06    8.02%    9.31    4.35    9.12%   11.43    4.61    9.62%   12.29
DCRNN    3.79    7.79%    8.73    4.13    8.07%   10.75    4.39    8.37%   11.01
T-GCN    3.69    7.06%    8.21    3.81    7.83%    9.79    4.07    8.03%   10.68
CPT-DF   3.71    7.11%    8.26    3.78    7.44%    9.05    3.97    7.82%   10.02

Table 3 lists the comparison results of the prediction performance at 15/30/60 minutes of the four models considering both spatial and temporal features. In graph-related models (T-GCN, DCRNN, STGCN, and CPT-DF), the information of node features and graph structure can be learned end-to-end by GCNs, so the topological structure and spatial correlation features of the toll-gates are well captured. The resulting time series with spatial characteristics is then fed into the units of the temporal module, and the dynamic changes are obtained through the information transfer between units to capture the temporal characteristics. As a result, the characteristics of regional gates are predicted better. In contrast, on the highly nonlinear and complex time series of toll data, the ability of a CNN to capture and memorize temporal features is slightly insufficient compared with RNN and its variants (LSTM and GRU). Similarly, the spatiotemporal prediction ability of the STGCN model based on GCN and CNN is slightly insufficient compared with the diffusion convolutional recurrent neural network (DCRNN) based on RNN and the T-GCN model based on GRU. In the short-term prediction (15-minute) work, the T-GCN model adopts GCN to learn complex topology for spatial correlation and GRU to learn dynamic changes of traffic data for temporal correlation. GRU solves the vanishing and exploding gradient problems faced by RNN when training on a large amount of data and retains the trend of the historical toll data, so in combination with GCN it can better predict future traffic features in the short-term prediction work. However, in the long-term prediction (30-minute and 60-minute) work, unlike the RNN structure, the CPT-DF model proposed in this paper can be massively parallelized due to the dilated causal convolution. The expansion coefficient and the size of the filter change the receptive field and also avoid the vanishing and exploding gradient problems of RNN. Therefore, the CPT-DF model achieves the best results in the long-term prediction work.

Furthermore, real-time traffic flow prediction is a basic requirement of intelligent transportation systems and has strict requirements on the total time cost of model training and testing [33]. Therefore, taking 1 hour as the historical time window, the calculation time of the four graph-based models is compared in Figure 12. The comparison shows that, since the gated convolution in STGCN is replaced by the improved dilated causal convolution, the training time of the CPT-DF model is greatly reduced. The DCRNN and T-GCN models, which use RNN and its variants to capture time series, take longer to train than models based on dilated causal convolutions. Compared with the STGCN model before the improvement, the training time of the improved CPT-DF model is reduced by 27.16%, which is an advantage for real-time traffic flow prediction.

Figure 12: Comparison results of calculation time of different models.

4.2.2. Evaluation Results. This paper selects the 3-month collection data of toll-gate 30101 (toll-gate ID) as an example and calculates the delay time in the congestion area of the toll-gate according to Equations (1), (2), and (3). The delay time and the traffic volume are used as the input of the clustering algorithm. According to the "Urban Road Traffic Operation Evaluation Indicator System" issued by China, an evaluation index called the Road Traffic Performance Index (TPI) is adopted, as shown in Figure 13. Traffic congestion is divided into 5 grades in TPI [34]: unblocked (A), basically unblocked (B), lightly congested (C), moderately congested (D), and severely congested (E). Therefore, the number of traffic state categories is set to $c = 5$, the fuzzy factor to $\sigma = 2$, and the threshold on the change of the objective function to $\varepsilon = 10^{-5}$. To speed up training and improve the clustering results, the data are normalized; the calculation time of the algorithm is about 25-30 s. Figure 14 shows the visualization results of the improved FCM algorithm on actual traffic data. From left to right, the clusters correspond to the five traffic states (A)-(E). According to the clustering results of the improved FCM algorithm, the value range and cluster center of the two indicators under each traffic state are given in Figure 13.

Figure 13: The value range and cluster center corresponding to the two normalized indicators.

Congestion level   Traffic condition      Traffic volume    Delay time        Cluster centre
A                  Unblocked              (0, 0.164)        (0, 0.151)        (0.066, 0.058)
B                  Basically unblocked    (0.164, 0.363)    (0.151, 0.324)    (0.244, 0.220)
C                  Lightly congested      (0.363, 0.565)    (0.324, 0.484)    (0.434, 0.377)
D                  Moderately congested   (0.565, 0.753)    (0.484, 0.669)    (0.563, 0.504)
E                  Severely congested     (0.753, 1)        (0.669, 1)        (0.784, 0.690)

Figure 14: Improved congestion evaluation FCM algorithm clustering results (horizontal axis: traffic flow; vertical axis: traffic delay time).

Furthermore, the proposed clustering method is verified by comparing the actual and predicted values on a working day (March 20, 2019) and a holiday (March 24, 2019). As shown in Figure 15, the clustering results of the improved FCM algorithm proposed in this paper are basically in line with the actual traffic conditions, and the CPT-DF algorithm proposed in this paper can accurately predict future traffic congestion except at special interference points.

Figure 15: Comparison of actual and predicted traffic conditions on weekdays and holidays.

4.2.3. Ablation Experiment. In order to further study the optimization effect of different modules on the CPT-DF model, this paper removes the temporal module and the spatial module in turn, under the same conditions, to complete the ablation experiment. In addition, this paper also discusses the effect of three adjacency matrices on the prediction performance.

Figure 16: (a) Model improvement before and after comparison. TCN represents the prediction model after removing the spatial module; GCN represents the prediction model after replacing the temporal module. (b) Comparison of prediction accuracy under different matrices.

Figure 16(a) shows the comparison of the prediction performance of each module of the CPT-DF model. The performance of the improved CPT-DF model is about 20% higher than that of the TCN model, which only considers the temporal dimension. The performance of the GCN model, obtained by removing the TCN module, is about 12% lower. The improved CPT-DF model has stronger spatiotemporal prediction ability and higher traffic flow prediction efficiency. Therefore, the experimental results show that both the temporal module and the spatial module contribute substantially to the model proposed in this paper.

Figure 16(b) shows the prediction results obtained by using three different matrices (C-Matrix, D-Matrix, and D&C-Matrix). It can be seen that the improved D&C-Matrix has better prediction accuracy than the C-Matrix and D-Matrix, by about 3%-4%, so it can be used as a better construction method for the spatial adjacency matrix.
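The comparisons in this section rely on the three error metrics of Equations (20)-(22). The following minimal Python sketch shows how they can be computed for one series of true and predicted values. The function names and the toy numbers are assumptions for illustration, and MAPE is only defined here when no true value is zero.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error, Eq. (20)."""
    n = len(y_true)
    return math.sqrt(sum((x - p) ** 2 for x, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean Absolute Error, Eq. (21)."""
    n = len(y_true)
    return sum(abs(x - p) for x, p in zip(y_true, y_pred)) / n


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent, Eq. (22).

    Assumes every true value is non-zero.
    """
    n = len(y_true)
    return sum(abs((x - p) / x) for x, p in zip(y_true, y_pred)) / n * 100.0


if __name__ == "__main__":
    # Hypothetical 5-minute traffic volumes at one toll-gate
    true_flow = [10, 20, 40]
    pred_flow = [12, 18, 40]
    print(rmse(true_flow, pred_flow))
    print(mae(true_flow, pred_flow))
    print(mape(true_flow, pred_flow))
```

RMSE and MAE carry the unit of the predicted quantity, while MAPE is scale-free, which is why the tables report all three side by side.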
5. Conclusions and Discussion

Large-scale congestion occurs frequently at toll-gates on freeways, especially during holidays or daily peak times. In order to alleviate the congestion of the toll-gate and prevent the occurrence of additional traffic accidents, environmental pollution, etc., it is necessary to select the toll-gate as the study site and effectively predict and evaluate the future congestion based on historical traffic data. In this paper, the topology of the freeway network is modelled as a graph structure. Toll-gates and the ordinary road segments between gates are regarded as vertices and edges in the graph structure, and the traffic volume and operational delay time are selected as feature matrices at the vertices. Then, by analysing the compositional characteristics of toll-gates and constructing a congestion domain, a new AI network of toll-gate congestion is proposed, based on a GCN fused with the dilated causal mechanism in DL and on FCM clustering integrated with coupling weights. Further, 4-month toll data of freeways in China's Shaanxi Province are collected to complete the experimental work.

In the experiment, congestion detection of toll-gates is realized from two aspects: prediction and evaluation. Regarding the prediction results, the performance of the graph-based models is about 8%-35% better than that of the nongraph models in the long-term prediction (60-min) work. The reason is that the graph-based algorithm additionally verifies the correlation between the toll-gates from the global spatiotemporal dimension and quantifies it using the D&C-Matrix. This makes advance management and traffic guidance possible for the toll-gates of large-scale freeways.

Regarding the evaluation results, the traffic state is reasonably divided into five levels, and the congestion of the toll-gate is accurately evaluated using the fuzzy clustering method. This makes it possible to release congestion information accurately and to avoid false alarms at the toll-gates.

The successful prediction could be extended to the real-time prediction and early warning of traffic congestion in the toll system, to improve the intelligent level of traffic emergency management and guidance on the key roads in disasters. On the one hand, when the management department of the freeway receives accurate traffic indicators of the toll-gates for a future period, it can not only grasp the regional traffic evolution across all toll-gates but also adjust the service instructions of a single toll-gate. For example, based on the predicted information, the manager adjusts the optimal ratio of open and closed toll lanes of the freeway in advance to ensure the smooth flow of vehicles. On the other hand, the driver can choose the optimal route and travel time under guidance, based on the congestion state of the toll-gate predicted in advance. For example, a driver will have n routes that can be selected from the origin A to the destination B, and then, based on the congestion evaluation and the calculation of the travel time, multiple options (the least delay, the shortest distance, or the least congestion, etc.) are provided for the road user, which is especially effective in emergencies on the key roads in disasters.

Furthermore, if real-time and multisource traffic data based on toll-gates or other sensors are collected, traffic prediction and evaluation over multiple time spans (day/month/year) and at multiple scales (road network, local lanes, vehicle, etc.) can be analysed in a more fine-grained manner using the AI graph network proposed in this paper.

Of course, each method will have some unexpected problems when it is applied in practice. For example, although our proposed model can predict future traffic congestion in the spatiotemporal dimension, it cannot intuitively explain the rules for the evolution of traffic flow outside the "congestion domain" of toll-gates, including how congestion forms and dissipates. Therefore, in the follow-up work, we will combine the evolution rules of actual traffic flow to better control the change of traffic and to improve the intelligent level of traffic emergency management.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Acknowledgments

The research is financially supported by National Key Research and Development Program of China, Key technologies for derivative composite disaster assessment and emergency adapting in the Guangdong-Hong Kong-Macao Greater Bay Area (2021YFC3001000).

References

[1] Y. Wang, X. Yu, S. Zhang et al., "Freeway traffic control in presence of capacity drop," IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 3, pp. 1497–1516, 2021.
[2] S. Zahedian, A. Nohekhan, and K. F. Sadabadi, "Dynamic toll prediction using historical data on toll roads: case study of the I-66 inner beltway," Transportation Engineering, vol. 5, article 100084, 2021.
[3] L. Shen, J. Lu, D. Geng, and L. Deng, "Peak traffic flow predictions: exploiting toll data from large expressway networks," Sustainability, vol. 13, no. 1, pp. 260–318, 2021.
[4] L. Yan, P. Wang, J. Yang, Y. Hu, Y. Han, and J. Yao, "Refined path planning for emergency rescue vehicles on congested urban arterial roads via reinforcement learning approach," Journal of Advanced Transportation, vol. 2021, no. 1, Article ID 8772688, p. 12, 2021.
[5] M. Gori, G. Mandarina, and F. Scarcella, "A new model for learning in graph domains," Proceedings of the International Joint Conference on Neural Networks, vol. 2, no. 2, pp. 729–734, 2005.
[6] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, "The graph neural network model," IEEE Transactions on Neural Networks, vol. 20, no. 1, pp. 61–80, 2009.
[7] J. Zhou, G. Cui, S. Hu et al., "Graph neural networks: a review of methods and applications," AI Open, vol. 1, pp. 57–81, 2020.
[8] B. Yu, Y. Lee, and K. Sohn, "Forecasting road traffic speeds by considering area-wide spatio-temporal dependencies based on a graph convolutional neural network (GCN)," Transportation Research Part C: Emerging Technologies, vol. 114, pp. 189–204.
[9] X. Shi, H. Qi, Y. Shen, G. Wu, and B. Yin, "A spatial–temporal attention approach for traffic prediction," IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 8, pp. 4909–4918, 2021.
[10] B. Yu, H. Yin, and Z. Zhu, "Spatiotemporal graph convolutional networks: a deep learning framework for traffic forecasting," in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, pp. 3634–3640, Stockholm, 2018.
[11] Y. Li, R. Yu, C. Shahabi, and Y. Liu, "Diffusion convolutional recurrent neural network: Data-driven traffic forecasting," 2017, https://arxiv.org/abs/1707.01926.
[12] L. Zhao, Y. Song, C. Zhang et al., "T-GCN: a temporal graph convolutional network for traffic prediction," IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 9, pp. 3848–3858, 2020.
[13] J. Zhao, Y. Gao, Z. Bai, H. Wang, and S. Lu, "Traffic speed prediction under non-recurrent congestion: based on LSTM method and BeiDou navigation satellite system data," IEEE Intelligent Transportation Systems Magazine, vol. 11, no. 2, pp. 70–81, 2019.
[14] Z. Lv, L. Qiao, K. Cai, and Q. Wang, "Big data analysis technology for electric vehicle networks in smart cities," IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 3, pp. 1807–1816, 2021.
[15] B. Williams and L. Hoel, "Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: theoretical basis and empirical results," Journal of Transportation Engineering, vol. 129, no. 6, pp. 664–672, 2003.
[16] H. Yang, P. Jin, B. Ran, D. Yang, Z. Duan, and L. He, "Freeway traffic state estimation: a Lagrangian-space Kalman filter approach," Journal of Intelligent Transportation Systems, vol. 23, no. 6, pp. 525–540, 2019.
[17] C. Wu, J. Ho, and D. Lee, "Travel-time prediction with support vector regression," IEEE Transactions on Intelligent Transportation Systems, vol. 5, no. 4, pp. 276–281, 2004.
[18] P. Wang, W. Hao, and Y. Jin, "Fine-grained traffic flow prediction of various vehicle types via fusion of multisource data and deep learning approaches," IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 11, pp. 6921–6930, 2021.
[19] C. Shuai, W. Wang, G. Xu, M. He, and J. Lee, "Short-term traffic flow prediction of expressway considering spatial influences," Journal of Transportation Engineering, Part A: Systems, vol. 148, no. 6, 2022.
[20] W. Zhang, Y. Yu, Y. Qi, F. Shu, and Y. Wang, "Short-term traffic flow prediction based on spatio-temporal analysis and CNN deep learning," Transportmetrica A Transport Science, vol. 15, no. 2, pp. 1688–1711, 2019.
[21] S. Guo, Y. Lin, N. Feng, C. Song, and H. Wan, "Attention based spatial-temporal graph convolutional networks for traffic flow forecasting," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 1, pp. 922–929, 2021.
[22] M. Akhtar and S. Moridpour, "A review of traffic congestion prediction using artificial intelligence," Journal of Advanced Transportation, vol. 2021, Article ID 8878011, 18 pages, 2021.
[23] T. Afrin and N. Yodo, "A probabilistic estimation of traffic congestion using Bayesian network," Measurement, vol. 174, article 109051, 2021.
[24] R. Esfahani, F. Shahbazi, and M. Akbarzadeh, "Three-phase classification of an uninterrupted traffic flow: a k-means clustering study," Transportmetrica B: transport dynamics, vol. 7, no. 1, pp. 546–558, 2019.
[25] J. Dunn, "Well-Separated clusters and optimal fuzzy partitions," Journal of cybernetics, vol. 4, no. 1, pp. 95–104, 1974.
[26] Z. Cheng, W. Wang, J. Lu, and X. Xing, "Classifying the traffic state of urban expressways: a machine-learning approach," Transportation Research Part A: Policy and Practice, vol. 137, pp. 411–428, 2020.
[27] S. Gan, S. Liang, K. Li, J. Deng, and T. Cheng, "Trajectory length prediction for intelligent traffic signaling: a data-driven approach," IEEE Transactions on Intelligent Transportation Systems, vol. 19, no. 2, pp. 426–435, 2018.
[28] T. Vincenty, "Direct and inverse solutions of geodesics on the ellipsoid with application of nested equations," Survey Review, vol. 23, no. 176, pp. 88–93, 1975.
[29] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, "Spectral networks and locally connected networks on graphs," 2014, https://arxiv.org/abs/1312.6203.
[30] E. Michael, K. Hermann, and B. Dagmar, "Large-scale quantum networks based on graphs," New Journal of Physics, vol. 18, no. 5, article 053036, 2016.
[31] S. Bai, J. Kolter, and V. Koltun, "An empirical evaluation of generic convolutional and recurrent networks for sequence modelling," 2018, https://arxiv.org/abs/1803.01271.
[32] J. Wang, Y. Zhang, Y. Wei, Y. Hu, X. Piao, and B. Yin, "Metro passenger flow prediction via dynamic hypergraph convolution networks," IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 12, pp. 7891–7903, 2021.
[33] J. Yang, P. Wang, W. Yuan, Y. Ju, W. Han, and J. Zhao, "Automatic generation of optimal road trajectory for the rescue vehicle in case of emergency on mountain freeway using reinforcement learning approach," IET Intelligent Transport Systems, vol. 15, no. 9, pp. 1142–1152, 2021.
[34] J. Jiang, Q. Chen, J. Xue, H. Wang, and Z. Chen, "A novel method about the representation and discrimination of traffic state," Sensors, vol. 20, no. 18, p. 5039, 2020.

Journal

Journal of Advanced Transportation, Hindawi Publishing Corporation

Published: Apr 10, 2023
