Journal of Advanced Transportation, Volume 2023 – Mar 30, 2023

- Publisher: Hindawi Publishing Corporation
- ISSN: 0197-6729
- eISSN: 2042-3195
- DOI: 10.1155/2023/2765937

Hindawi
Journal of Advanced Transportation
Volume 2023, Article ID 2765937, 21 pages
https://doi.org/10.1155/2023/2765937

Research Article

Classification of the Traffic Status Subcategory with ETC Gantry Data: An Improved Support Tensor Machine Approach

Yan Zhao (1,2), Wenqi Lu (1,2), Yikang Rui (1,2), and Bin Ran (1,2)

(1) School of Transportation, Southeast University, Nanjing, Jiangsu 211189, China
(2) Joint Research Institute on Internet of Mobility, Southeast University and University of Wisconsin-Madison, Southeast University, Nanjing 211189, China

Correspondence should be addressed to Yikang Rui; 101012189@seu.edu.cn

Received 31 October 2022; Revised 17 February 2023; Accepted 1 March 2023; Published 30 March 2023

Academic Editor: Jinjun Tang

Copyright © 2023 Yan Zhao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Accurate and reliable traffic state identification is the prerequisite for developing intelligent traffic programs. With the improvement of intelligent traffic control measures, the traffic state of some highways has gradually stabilized. Current research on traffic state identification does not yet fully meet the needs of highly informative intelligent traffic systems and traffic state subcategory analysis. To fill this gap, we propose an improved support tensor machine (STM) method based on self-training and multiclassification (ISTM) for traffic state subcategory identification with ETC gantry data. This paper takes the successful application of the support vector machine (SVM) in traffic state identification as the starting point of method design and extends it to the STM. The ETC gantry data are represented as a third-order tensor model. This paper uses the similarity among tensor samples to construct the kernel function and recognize the traffic states.
We simplify the STM calculation with a one-against-one model and a self-training idea. An optimal fit of the characteristics is obtained by maximizing inter-subcategory tensor block distances and minimizing intra-subcategory tensor block distances through the joint use of the STM and multiscale training theories. The experiment in this paper uses ETC gantry data from the Jingtai highway in Shandong Province, and the findings reveal that the ISTM has optimum values of 0.2578 and 0.3254 for the SumD and 0.1718 and 0.1901 for the DBI as compared to K-means clustering and the SVM. The ISTM trains the traffic state subcategory classifiers with high accuracy and strong generalization ability.

1. Introduction

Practical and reliable traffic state identification is one of the critical components of the intelligent transportation system (ITS) and is the foundation for the ITS to improve highway operating efficiency and develop traffic control schemes. With the improvement of intelligent control measures, the traffic state of some highways gradually tends to be smooth and less fluctuating. An effective traffic state identification method helps to grasp the changes in the traffic state from a global perspective, which provides an essential scientific basis for traffic management and control. A fine-grained traffic state classification can explain the chaotic part of the traffic state and enable a cognitive upgrading of traffic states. The refined traffic state subcategories are more specific and provide an in-depth understanding of the traveler's needs. Management can develop fine-grained traffic management content for different objectives. Traffic operators can analyze and compare the subcategories of each traffic state and make the necessary control technology updates. For example, traffic operators can provide travel guidance to maximize road network efficiency. After the traffic conditions are refined, it becomes easier to obtain information feedback and improve the resilience of highways.

At present, the criteria of highway traffic states can be divided into absolute and relative metrics [1]. Relative metrics use historical traffic data to describe the highway traffic condition in a specific time and space. The relative metrics of traffic state include flow rate, speed, and density. The relative metrics are usually divided into four categories, including "free," "smooth," "congested," and "blocked" [2]. These four traffic states are highly variable: they differ significantly in the granularity of the traffic flow parameters, and it is easy to classify them. Currently, research on traffic state identification mainly focuses on these four categories of traffic states. Yue et al. [3] classified traffic states into five levels (free flow, near free flow, light congestion, moderate congestion, and severe congestion) based on the vehicle kilometers traveled (VKT) for a specific travel speed. However, these five levels still do not fully portray the characteristics of different traffic state subcategories and still belong to larger-grained traffic state categories. Researchers have started to study the vulnerability of traffic qualitatively or quantitatively [4]. The highway traffic system is a complex time-varying system subject to many uncertain traffic events, and the traffic state pattern may be variable [5]. Traffic state identification methods based on large-granularity classification cannot meet the needs of increasingly complex and detailed research on traffic flow changes. There is a need to refine the existing traffic state categories and further deepen the research on traffic state identification methods. There is also a need for traffic state identification methods that can efficiently identify traffic state subcategories from a large amount of data. This paper aims to conduct research on traffic state subcategory identification and design a method based on the STM.

Data mining and machine learning methods strongly drive the development of traffic state research [6, 7]. Traffic state identification is affected to some extent by the quality of the research data and by model performance, and traffic state subcategory research places demands on both the traffic data and the models used. (1) The collection of highway traffic data may rely on commonly used detectors, such as magnetic detectors, video detectors, and mobile phone data [8]. Due to the limitation of detection ability (e.g., vehicle type detection and vehicle diversion detection) and the low deployment density of the above detectors, the quality of the traffic data obtained is difficult to guarantee. (2) The data structure is the basis for building the data model and the data operations based on it. Implementing ITS traffic data is the key to building a traffic state subcategory identification model. While ensuring high interpretability, sensitivity, and stability, the model needs to dig into the multimodal correlation between the data and learn the spatial and temporal patterns of traffic data under the traffic state subcategories.

Different standard traffic data collections are shown in Tables 1 and 2. At present, ETC gantries are rapidly being popularized and developed. The ETC gantry system includes static data (e.g., toll station information) and dynamic data (e.g., travel information and toll information). These data hide a large amount of valuable information. The value of ETC gantry data gradually manifests with the continuous development of big data technology. First, ETC gantry data are updated quickly, and an ETC gantry can accumulate a huge data volume in a short period. Second, ETC gantry data have a highly consistent structure, and the update overhead is negligible. In addition, ETC gantry data are rich in dimensions, which may provide various perspectives for traffic data analysis [9]. This paper uses the rich dimensionality and fast update speed of ETC gantry data as the entry point to support the parameter analysis of traffic state subcategories.

Table 1: Comparison of traffic data collection methods.

Collection method | Advantages | Disadvantages
Coil detector | Less investment, not affected by external conditions such as climate and light, strong adaptability, high accuracy of speed measurement and traffic counting, and good working stability. | The compatibility with traffic information collection systems based on other technologies could be better. When the vehicle speed is too fast or traffic is congested, the coil current does not change significantly, and the detection accuracy is reduced.
Infrared detector | Easy to install and maintain, with clear discrimination of vehicle models. Low energy consumption and no radiation impact on the surrounding environment. Accurate measurement of vehicle position, speed, and vehicle type. | Easily affected by the weather and the reflection of sunlight; the energy reflected from the vehicles in the detection area is diffused or absorbed by particles in the air, such as rain, snow, and haze, thus affecting the detection results.
Ultrasonic detector | High detection accuracy and continuous all-weather detection; can also detect stationary vehicles and traffic flow information under congestion. The detector is small, easy to install, and has a long service life. | The detection accuracy is affected by the environment, especially by high winds and rainstorms. Moreover, the detection range of ultrasonic waves is tapered and is affected by the model and height of the vehicle; when the vehicle is at medium and high speeds, it generates a large number of ultrasonic pulse repetitions that quickly make the measured occupancy small.
Radar detector | Easy to install and maintain, long service life, less affected by weather and climate, and able to carry out multiple lane detection simultaneously. | It cannot accurately measure the speed of a single vehicle. On road sections with more congested traffic, uneven distribution of vehicle types, and more large vehicles, the detection accuracy drops sharply due to severe occlusion.
Video image detection | Flexible in use, can detect multiple traffic parameters at once, and has an extensive detection range. | Relying on optical principles, factors such as dust, shadows, weather, and lighting can affect the detection results. Small vehicles may be obscured by accompanying large vehicles, resulting in inaccurate detection results. The detection performance is affected when the contrast between vehicle and road is low. The equipment and image processing cost is high, and the real-time performance could be better.
Cell phone signaling data | Wide coverage, large analysis samples, low implementation cost, and long-term continuous monitoring. | The decrease in detection accuracy caused by base station positioning error leads to the loss of short-distance travel within the base station cell. Low signaling sampling frequency causes the trajectory tracking of cell phones to be noncontinuous, making it difficult to know accurately the user's departure time, arrival time, and dwell time. Limitations caused by non-full-sample detection.
GPS data | Global all-weather positioning with high positioning accuracy can provide globally unified 3D geocentric coordinates; short acquisition period, long duration, and large data scale. | Location is influenced by climate, ionosphere, troposphere, air, and electromagnetic waves. Low sampling rates introduce sampling errors and positioning errors.
ETC gantry data | A high level of intelligent automation, wide coverage, fast update speed, and rich ETC gantry data dimensions can provide multiple perspectives for traffic data analysis. High accuracy and good development prospects. | Due to evasion behaviours such as blocking license plates and unstable equipment working conditions, problems such as missing fields and abnormal data may occur in the ETC gantry data. The traffic detection between ETC gantry zones needs to be improved, and the quality of ETC gantry data cannot be guaranteed.

Table 2: Types of traffic data available to the detector.

Collection method | Traffic flow | Occupancy | Speed | Vehicle type | Other parameters
Coil detector | ✓ | ✓ | ✓ | ✓ | Vehicle length
Infrared detector | ✓ | ✓ | ✓ | ✓ | Queue length
Ultrasonic detector | ✓ | × | ✓ | × | —
Radar detector | ✓ | ✓ | ✓ | × | Headway
Video image detection | ✓ | ✓ | ✓ | ✓ | Headway, vehicle length, density, and queue length
Cell phone signaling data | × | × | ✓ | × | Travel path
GPS data | ✓ | × | ✓ | × | Travel path
ETC gantry data | ✓ | ✓ | ✓ | ✓ | Headway, vehicle length, density, total mileage, and trip OD

Note. ✓: detectable; ×: not detectable.

This paper proposes a traffic state subcategory identification method based on the STM. The tensor model is highly general: it gives a concise representation of multiple linear types in linear and dyadic spaces and has high program readability. Data representation is a critical factor in model performance. Research shows that the third-order tensor model is superior to the matrix and fourth-order tensor representations in preserving information in the data set [10]. The STM has a good recognition effect when the samples are uneven and the sample size is small. Therefore, the tensor model meets the model construction requirements of this study. The traditional STM has some limitations when faced with the problem of traffic state subcategory identification. First, the traditional STM is a binary classification algorithm, which cannot deal with multicategory problems. Second, the traditional STM is a supervised learning algorithm, which requires a large amount of sample labeling work and increases the data processing workload. Finally, different tensor blocks contain different amounts of information. The traditional STM uses the same scale to express tensors and deals with fixed-size tensor objects, so tensor blocks of different scales require additional analysis.

This research makes some improvements to the STM.

(1) A three-order tensor model of ETC gantry data is constructed to explore in depth the relationship between the multilevel parameters of ETC gantry data.

(2) A support tensor classifier based on self-training learning is proposed. The improved STM based on self-training and multiclassification (ISTM) builds on the recent experience of many applications of the SVM in traffic state identification. At the same time, the radial basis tensor kernel function is used for the high-dimensional mapping of the input tensor samples to explore the degree of correlation among different traffic parameters. The improvements made by the ISTM mainly include embedding the one-against-one mechanism to achieve multicategory classification. The ISTM also introduces the idea of self-training to reduce the workload of sample calibration and extracts different tensor blocks to represent multiscale tensor slices.

(3) This paper uses ETC gantry data for experimental comparison to verify the effectiveness of the ISTM.

The main contributions of this article can be described as follows:

(i) This paper takes the small range of traffic state changes on an essential highway section as the research object, refines the traffic state categories, and enriches the practical meaning of traffic state identification.

(ii) A three-order tensor model of the ETC gantry data is built to preserve the spatiotemporal information in the ETC gantry data. The tensor
model is more suitable for the case of a narrow Te model-driven trafc state method has solid ex- calculation range of parameter data, and small data planatory power, makes idealized assumptions, and has granularity of trafc states subcategory research. limited application scenarios for any given highway seg- ment. Tis approach may ignore certain factors, such as (iii) Te multiscale training idea and the self-training weather type, that may signifcantly afect trafc charac- idea are introduced to the learning process of the teristics when considering the impact of site roadway ge- STM. Te multiscale training idea improves the ometry and trafc features on the baseline conditions [14]. classifcation training content, and the self-training Model-driven trafc state identifcation methods based on idea realizes the expansion of the STM labeled data the model are unable to segment trafc states [15]. set to overcome the problem of the small size of the initial labeled data set. 2.2. Identifcation Methods Driven by Big Data. Seo et al. [16] Section 2 gives a general overview of the work related to and Li et al. [17] reviewed the research theories and methods highway trafc state identifcation. Section 3 defnes the of highway trafc state identifcation and concluded that tensor basis and the method of constructing three-order future research hotspots are machine learning and deep tensor models for ETC gantry data. Section 4 introduces the learning. Tsubota et al. [18] compared the trafc state ISTM method. In Section 5, the results and discussion of the identifcation methods based on two kinds of data, trafc experiments are provided. We discuss the conclusions and volume and GPS vehicle trajectory. Tang et al. [19] used the future work in Section 6. K-mean method to classify the input samples into clusters. Xu et al. [20] proposed a trafc state identifcation algorithm 2. Related Work for road trafc networks based on compressed sensing. 
With the maturity of fuzzy control technology, scholars in- Trafc state identifcation refers to characterize and judge troduced fuzzy theory and integration methods into the the highway trafc state by constructing a specifc identi- trafc feld. Stutz and Runkler [21] applied fuzzy clustering fcation model through current and historical trafc fow to study trafc state classifcation methods to improve the data and their mapping relationship. It is the crucial premise classical clustering model. Wang et al. [22] proposed an for evaluating road efciency, discovering trafc congestion improved trafc state identifcation model based on selective bottlenecks, and formulating trafc control plans. In recent integration learning (SEL). years, domestic and foreign scholars have conducted ex- In addition, trafc state identifcation algorithms also tensive research on trafc state identifcation methods and contain classifcation models such as the SVM [23, 24], K- proposed various identifcation methods, which have laid nearest neighbor (KNN) [25], decision trees [26], and neural a theoretical foundation. Te trafc state identifcation networks [27], whose input features are primarily in vector methods could be divided into those based on trafc fow form [28]. However, the dimensionality of the trafc net- theory, and model-driven identifcation methods and data- work feature set is large. Te dimensionality reduction is driven artifcial intelligence identifcation methods. processed as feature vector input to the classifcation model. Te structural information between diferent dimensions of the trafc fow network is easily lost, and the connection 2.1. Identifcation Methods Driven by Trafc Flow Teory Model. Te trafc fow theory-based and model-driven between road segments on the road needs to be addressed identifcation methods include the following: Yuan et al. [29]. 
Considering a robust spatialtemporal correlation [11] proposed a new trafc state estimator based on the model when modeling trafc fow data is crucial. extended Kalman flter (EKF) technique using an LWR Tensor modeling is a standard technique for capturing model as the process equation. Hara et al. [12] improved the multidimensional structure dependencies. Te tensor model Gaussian graphical model using the EM algorithm and the uses a compact structure to simulate the original multidi- graph lasso technique to determine the model parameters. mensional data. In addition, tensor models efectively solve Herrera and Bayen [13] integrated GPS data into the trafc the problem of the high dimensionality of feature data, and tensor-based classifcation methods have been widely used state estimation model and compared it with the application of the Kalman flter. in artifcial intelligence and other felds [30]. Large data sizes Journal of Advanced Transportation 5 characterize ETC gantry data, multiple data types, and high 3. Tensor Basis and Data Preparation dimensionality. Te previous research shows that tensors Tensor is a multiple linear mapping defned on the Cartesian could maintain the inherent structural characteristics of the product of vector space. We organize the spatialtemporal data to the maximum extent [31]. trafc data into a multidimensional structure by combining In conclusion, a tensor is a convenient tool in trans- trafc information (trafc volume and average travel speed), portation research, especially when modeling complex time series information with other locations information, transportation datasets with spatial-temporal multidimen- and the spatialtemporal trafc data is structured as a mul- sionality and efectively handling high-dimensional data tidimensional array (location × time × trafc information), information. It is necessary to further improve the accuracy i.e., a tensor structure. 
Te tensor array of N order is denoted of trafc state identifcation by mining complex trafc fow I ×...×I 1 N as X ∈ R , where the Kth order contains k compo- dynamics from the massive trafc big data through in- nents within elements denoted as x , where i , i , ..., i telligent techniques. Terefore, this paper proposes to use i ,i ,...,i 1 2 k 1 2 k represents the component expression under the the STM to efectively preserve the correlation structure respective order. features among trafc states of diferent road sections and Tere are many types and complex structures of data reduce the loss of feature information. tables within the ETC gantry system. Te data are updated quickly, and the value density of each type of data is uneven. 2.3. Research on Evaluation Indicators. Te results of trafc Tis paper mainly uses the property that tensor models state identifcation and the selection of identifcation in- could satisfy the transformation of tensor components in dicators have a very close correlation and are directly diferent spatial systems and use the relationship between infuenced by the selection. Te current trafc state study tensor computation and linear algebra for multidimensional combines multiple trafc state indicators to generate more trafc data computation [40]. In this section, an ETC gantry accurate and comprehensive trafc state identifcation in- data three-order tensor model is proposed to analyze the dicators. El-Hamdani and Benamar [32] integrated the trafc data contained in the ETC gantry data table based on design of diferent evaluation indicators for communication the tensor model, and the extracted various data used for type, vehicle priority, system behavior, road model, and trafc data calculation. parking avoidance. Sun et al. [33] used the average speed of Te ETC gantry license plate data table and ETC gantry roads to assess trafc conditions in urban business districts. 
transaction data table in the ETC gantry system are selected Zhao and Hu [34] used the extra travel time to analyze the to set the ETC gantry subtensor block model. Te subtensor spatial and temporal patterns of trafc congestion in Beijing blocks are extended to construct the tensor block model. Te over six months. Chen et al. [35] proposed a speed per- ETC gantry tensor block model is sequentially arranged into formance index to assess trafc conditions and congestion the ETC gantry three-order tensor model along the time patterns on urban freeways by considering trafc fow speeds series. We use the ETC gantry three-order tensor model to and road speed limits. Deng et al. [36] developed a new extract the required trafc data. method for trafc state evaluation based on a cloud model Te structure of ETC gantry data based on the tensor that integrates the advantages of structured trafc state model is shown in Figure 1. indicators. Evaluation based on performance and indicators is the mainstream research in evaluating trafc 3.1. ETC Gantry Subtensor Block Construction. ETC gantry conditions [37]. I ×I ×...×I x y z subtensor block model X ∈ R performs data di- To enhance the objectivity of trafc indicators, we should mensionality reduction on the ETC gantry data table and show the most comprehensive and critical information with transcodes the selected data attributes. First, the ETC gantry the help of the least number of indicators. Adequate cre- number, capture time, and identifcation license plate in the dentials for trafc management control, trafc volume, ETC gantry license plate data table are selected to build the average travel speed, and space occupancy could compre- ETC gantry license plate data subtensor block. We select hensively refect the road trafc operation [38]. Te average time, space, and vehicle user as the subtensor block di- travel speed is the most direct refection of the trafc state mension division scale and categorize the above three felds [39]. 
In this paper, the above three are selected as trafc state into each subtensor block dimension. identifcation indicators, calculated and characterized using a three-order tensor model. I ×I ×I time space car X ∈ R , I � t , I � s , I � c , (1) LPR time pictime space id car license where t is the vehicle capture time, s is the ETC gantry Similarly, the three felds of transaction time, gantry pictime id information, and c is the captured vehicle ID number, identifcation license plate, model information, and license information. transaction situation in the ETC gantry transaction data 6 Journal of Advanced Transportation I ×I ×I time space car X ∈ R , I � t , I � s ETC gantry Database trade time trade time space id (2) I � c , c , c , car license type match ETC gantry license ETC gantry where t is the vehicle capture time, s is the ETC gantry pictime id plate recognition Transaction datasheet information, and c is the captured vehicle ID in- datasheet license formation, vehicle type information, and transaction ETC gantry ETC gantry situation. sub-tensor block sub-tensor block construction construction 3.2. ETC Gantry Tensor Block Construction. We obtain the ETC gantry tensor block by extending each subtensor SPACE TIME CAR block model. ETC gantry tensor ETC gantry tensor ETC gantry tensor block construction block construction block construction Defnition 1. Subtensor block merging: the subtensor block ETC gantry high-order tensor models are projected on the high-order tensor dimension, model and the same-order components of the subtensor block models in the same dimension are merged. In contrast, the Trafc data calculation based on components of diferent orders are retained. Te subtensor a high-order tensor model blocks are merged in the following way: Figure 1: ETC gantry data structures. 
table are selected to construct the ETC gantry transaction data subtensor block as follows: I×J×K A ∈ R , I � i , i , i , J � j , j , j , K � k , k , 1 2 3 1 2 3 1 2 I×J (3) B ∈ R , I � i , i , i , J � j , j , 1 2 4 1 2 I×J×K f: A× B ⟶ C, C ∈ R , I � i , i , i , i , J � j , j , j , K � k , k . 1 2 3 4 1 2 3 1 2 �������������������������������� � i �I ,...,i �I 1 1 N N Te ETC gantry tensor block represents the trafc sta- ‖A‖ � A i , . . . , i × A i , . . . , i . (4) F 1 N 1 N tistics of an ETC gantry within 1 minute and is sorted in i �1,...,i �1 1 N a specifc sequence to simplify the calculation. Te tensor block selects a time, space, and vehicle users as the di- mensional spaces. Among them, the time dimension com- Defnition 3. Tensor inner product: the inner product be- ponent parameter is set to 1 minute, containing 60-time I ×...×I I ×...×I 1 N 1 N tween the tensorA ∈ R ,B ∈ R , defned as the component parameters. Te space dimension contains the sum of the products of its corresponding components. following two spatial factors: gantry number and lane number. Te vehicle user’s dimension contains the following i �I ,...,i �I 1 1 N N three parameters: license plate information, vehicle type ⟨A,B⟩ � A i , . . . , i × B i , . . . , i . (5) 1 N 1 N information, and vehicle transaction information. i �1,...,i �1 1 N 3.3. Trafc Data Calculation Based on the High-Order Tensor Model. Figure 2(a) shows the three-order tensor model Defnition 4. Tensor K mode product: the mode product of I ×...×I I ,I 1 N k k extracted from the 30-minute trafc statistics of two ETC tensor A ∈ R and matrix B ∈ R , tensor A and gantries (three lanes) with diferent colour tensor blocks matrix B is denoted as A× B, and the result is tensor I ×...I ×...×I 1 K N indicating diferent types of cars. Te basic tensor algebra C ∈ R , which is calculated by the following operations used in this paper are described as follows: formula: Defnition 2. Frobenius parametrization: the Frobenius A× B ≜ A I , . 
$\ldots, I_{K-1}, p, I_{K+1}, \ldots, I_N \times B(p, q). \quad (6)$

The norm $\|\cdot\|$ of a tensor $A \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ is calculated as follows:

Figure 2: Diagram of traffic data calculation based on the ETC gantry high-order tensor model: (a) diagram of the ETC gantry high-order tensor model; (b) diagram of high-order ETC gantry tensor models; (c) diagram of vehicle-level retrieval information. [Axes in each panel: GANTRYID (G005032001000110010, G005032001000110020), PICTIME (9:00 to 9:30 in 1-minute steps), and CARTYPE.]

We take the traffic data calculation of one ETC gantry as an example. All traffic information of this ETC gantry can be obtained by taking the tensor slice of this gantry, as shown in Figure 2(b), including traffic volume, average headway time, the proportion of vehicle types, average interval speed, and space occupancy. The vehicle-level retrieval information is shown in Figure 2(c). Given a vehicle's license plate information, the information for that vehicle can be obtained, the tensor block in which the vehicle is located can be indexed, and traffic parameters can be calculated, including average travel speed and headway time.

4. Methodology

4.1. Support Tensor Machine. To distinguish the vector samples $x_i \in \mathbb{R}^n$, $i = 1, \ldots, l$, the SVM designs the hyperplane $w^T x + b = 0$ to classify them in a high-dimensional space, where $w \in \mathbb{R}^n$ is the hyperplane normal vector and $b \in \mathbb{R}$ is the hyperplane intercept [41]. The classification hyperplane separates two types of data by a specific interval, and this interval distance reflects the difference between the two types of data. The larger the interval between the two types of data, the more significant the difference between them. Therefore, the problem of finding the best decision hyperplane can be transformed into solving for the maximum interval between the two types of data, which ensures that the hyperplane is robust. The upper and lower boundaries of the delineated interval must pass through the sample points closest to the decision hyperplane, and the sample vectors that determine the size of the interval distance are called support vectors [42]. Classification is performed based on a point's position relative to the decision hyperplane. The SVM classification problem can be transformed into solving for the best $w^*, b^*$ that maximize the category interval $L$.

The core idea of the STM is the same as that of the SVM. The STM takes tensors as the training sample points, finds the tensor hyperplane in the tensor space, and generalizes better over the data features. The STM achieves the classification by solving the multidimensional hyperplane simultaneously, as shown in Figure 3.

The classifiers are obtained by supervised learning on tensor sample points. They classify the input tensor points and map from the tensor dimensions to the actual dimensions, which can be expressed as follows:

$$y = f(X; W, b) = \langle X, W \rangle + b, \tag{7}$$

where $y \in \{+1, -1\}$ is the binary category label of the input sample, $X \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ is the input $N$th-order tensor training sample, $W \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ is the weight tensor of the classification hyperplane, $X$ and $W$ have the same dimensions, and $\langle X, W \rangle$ is the inner product of the two tensors. Therefore, the STM classification problem can be transformed into solving for the optimal $W^*, b^*$ that maximize the category interval. Moreover, the constraint can be written as an affine function constraint, so the problem is a convex optimization problem in the tensor space. The supervised learning-based STM optimization model is expressed as follows:

$$\begin{aligned} \min_{W, \varepsilon} \ & f(W, \varepsilon) = \frac{1}{2}\|W\|^2 + C \sum_{i=1}^{l} \varepsilon_i \\ \text{s.t.} \ & y_i \left( \langle W, X_i \rangle + b \right) \ge 1 - \varepsilon_i, \\ & \varepsilon_i \ge 0, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{8}$$

Here $\varepsilon_i$ represents the error tolerance of the decision hyperplane for the sample points. Correctly classified points outside the classification interval have $\varepsilon_i = 0$, points inside the classification interval have $\varepsilon_i \in (0, 1)$, and incorrectly classified points have $\varepsilon_i \in (1, +\infty)$. The STM seeks the optimal solution of the objective function, maximizing the classification interval while minimizing the error. It is solved using the Lagrange multiplier method, where the inequality constraints above are handled by introducing non-negative Lagrange multipliers $\lambda_i$ and $\beta_i$:

$$L(W, b, \lambda_i, \beta_i) = \frac{1}{2}\|W\|^2 + C \sum_{i=1}^{l} \varepsilon_i - \sum_{i=1}^{l} \lambda_i \left[ y_i \left( \langle W, X_i \rangle + b \right) - 1 + \varepsilon_i \right] - \sum_{i=1}^{l} \beta_i \varepsilon_i. \tag{9}$$

To improve the efficiency of the solution, the model is transformed into its dual problem. A sufficient condition for the original problem and the dual problem to attain their optimal solutions simultaneously is that $W^*, \lambda^*, \beta^*$ meet the KKT conditions. The partial derivatives with respect to $W$, $b$, and $\varepsilon$ are taken and set to zero:

$$\begin{aligned} W &= \sum_{i=1}^{l} \lambda_i y_i X_i, \\ \sum_{i=1}^{l} \lambda_i y_i &= 0, \\ \lambda_i + \beta_i &= C, \quad 1 \le i \le l. \end{aligned} \tag{10}$$

Figure 3: Diagram of the STM classification. [Diagram labels: input traffic data; ETC gantry data high-order tensor model; ISTM; multiclassification decision.]
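As a minimal sketch of equations (7) and (8), the snippet below evaluates the linear STM decision function and the soft-margin objective for a fixed weight tensor. It is not the authors' MATLAB implementation: the helper names (`flatten`, `inner`, `decision`, `primal_objective`) and the toy 2×2 samples are hypothetical, and the slack of each sample is taken as the smallest value that satisfies the constraints in equation (8), namely $\varepsilon_i = \max(0, 1 - y_i(\langle W, X_i \rangle + b))$.

```python
from itertools import chain


def flatten(t):
    """Flatten a nested-list tensor of any order into a flat list of floats."""
    if isinstance(t, (int, float)):
        return [float(t)]
    return list(chain.from_iterable(flatten(s) for s in t))


def inner(a, b):
    """Tensor inner product <A, B>: the sum of elementwise products."""
    fa, fb = flatten(a), flatten(b)
    if len(fa) != len(fb):
        raise ValueError("tensors must have the same shape")
    return sum(x * y for x, y in zip(fa, fb))


def decision(x, w, b):
    """Equation (7): f(X; W, b) = <X, W> + b."""
    return inner(x, w) + b


def primal_objective(w, b, samples, labels, c):
    """Equation (8), using the smallest feasible slack for each sample."""
    slacks = [max(0.0, 1.0 - y * decision(x, w, b))
              for x, y in zip(samples, labels)]
    return 0.5 * inner(w, w) + c * sum(slacks), slacks


# Hypothetical toy data: second-order tensors for brevity.
W = [[1, 0], [0, 1]]
B = -1.0
SAMPLES = [
    [[2, 0], [0, 1]],    # outside the interval, correct side
    [[0, 0], [0, 0.5]],  # correct side, but inside the interval
    [[1, 1], [1, 1]],    # wrong side of the hyperplane
]
LABELS = [+1, -1, -1]

objective, slacks = primal_objective(W, B, SAMPLES, LABELS, c=1.0)
print(objective, slacks)
```

The three toy samples illustrate the three slack regimes discussed above: zero slack, slack between 0 and 1, and slack above 1.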
Substituting equation (10) into equation (9) yields an equivalent expression for the STM model optimization problem:

$$\begin{aligned} \max_{\lambda} \ & \left( \sum_{i=1}^{l} \lambda_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \lambda_i \lambda_j y_i y_j \langle X_i, X_j \rangle \right) \\ \text{s.t.} \ & \sum_{i=1}^{l} \lambda_i y_i = 0, \\ & 0 \le \lambda_i \le C, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{11}$$

A calculation based only on the linear STM cannot fully preserve the different feature orders of tensor data in the tensor feature space. This model therefore introduces a kernel function to compute the inner product among tensor data, which maps the samples from the original tensor space to a high-dimensional feature space. Given the tensor $X \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ in the original feature space, and in order to reflect the data characteristics of the tensor well, let $H$ be the corresponding high-dimensional Hilbert space under the nonlinear mapping $\psi(\cdot)$, where $\psi(\cdot)$ may map to a high (possibly infinite) dimension. The mapping relationship can be expressed as follows:

$$\psi: X \longrightarrow \psi(X) \in H \subset \mathbb{R}^{I_1 \times \cdots \times I_N}. \tag{12}$$

Then equation (11) is updated as follows:

$$\begin{aligned} \max_{\lambda} \ & \left( \sum_{i=1}^{l} \lambda_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \lambda_i \lambda_j y_i y_j \langle \psi(X_i), \psi(X_j) \rangle \right) \\ \text{s.t.} \ & \sum_{i=1}^{l} \lambda_i y_i = 0, \\ & 0 \le \lambda_i \le C, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{13}$$

In practical applications, the Gaussian kernel function can achieve a relatively stable correct rate for randomly distributed samples when reasonable parameters are selected. This section chooses the Gaussian kernel function because the distribution of the traffic data is unknown and a priori knowledge is insufficient, and it uses the distance between two tensor samples to reflect the current structural information. Let $K\langle X_i, X_j \rangle$ satisfy Mercer's theorem and use the Gaussian kernel function to make the dimensional transformation. The model is as follows:

$$K\langle X_i, X_j \rangle = \langle \psi(X_i), \psi(X_j) \rangle = \exp\left( -\frac{\|X_i - X_j\|_F^2}{2\delta^2} \right). \tag{14}$$

The STM dual problem is updated as follows:

$$\begin{aligned} \max_{\lambda} \ & \left( \sum_{i=1}^{l} \lambda_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \lambda_i \lambda_j y_i y_j K\langle X_i, X_j \rangle \right) \\ \text{s.t.} \ & \sum_{i=1}^{l} \lambda_i y_i = 0, \\ & 0 \le \lambda_i \le C, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{15}$$

The STM classification decision function is as follows:

$$f(X) = \operatorname{sgn}\left( \langle X, W^* \rangle + b^* \right) = \operatorname{sgn}\left( \sum_{i=1}^{l} \lambda_i^* y_i K\langle X_i, X \rangle + b^* \right). \tag{16}$$

The derivation above covers the binary STM for a data set containing only $y_i \in \{+1, -1\}$, and we extend the binary STM to a data set whose labels $y$ contain $m$ classifications. The multiclassification support tensor machine (MSTM) problem can be described as follows: the data set is $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_l, y_l)\}$, with $X_i \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ and $y_i \in \{1, 2, \ldots, m\}$, which contains $m$ classifications.

A multiclassification SVM can be constructed by direct or indirect methods [43]. The direct method modifies the objective function and combines the parameter solutions of multiple classification surfaces into one optimization problem. It separates multiple classes at once, at the cost of high computational complexity and a long training time. The indirect method constructs multiclassification machines by combining multiple binary classifiers, and the standard methods are one-against-one and one-against-all [44]. In one-against-one, the training samples of each classifier involve only two classes, so the training samples are balanced and the training accuracy is high. This model selects one-against-one traffic state identification according to the number of traffic state classes and the practical application requirements. We assume that the STM must divide $m$ categories. The one-against-one model designs an STM classifier between any two categories of samples, so each classifier solves a binary classification problem. The $m$ categories of samples require $m(m - 1)/2$ STM classifiers, and the category with the most votes is the category of the unknown sample.

4.2. Self-Training Multiclassification Improved Support Tensor Machine, ISTM. Traffic state classification evaluates traffic operation based on traffic parameters from a global perspective. To improve the training content of traffic state classification, we propose the idea of multiscale training, which improves STM training by extracting different numbers of tensor blocks to represent multiscale tensor slices. Because the category of tensor blocks is unknown, the same tensor block can be extracted into $L$ slices of different sizes $X_i|^{(l)}$, thus establishing a multiscale sample set $\{X_i|^{(l)}, y_i\}_{i=1}^{l}$. The slice size $L$ is calculated as follows.

First, the difference values $\alpha$ of traffic flow between all neighboring tensor blocks in the sample space of the ETC gantry tensor model are calculated, the traffic difference sequence $\Phi = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, $\alpha_i > \alpha_{i+1}$, is obtained by sorting in descending order, and a threshold value is introduced to exclude the anomalous disturbances generated by noise in the sample. The calculation formula is as follows:

$$\alpha_{th} = E(\alpha) + 10 \times D(\alpha). \tag{17}$$

$E(\alpha)$ and $D(\alpha)$ are the expectation and variance of the traffic difference sequence $\Phi$, respectively. The training radius sequence is obtained by removing from $\Phi$ the elements that are larger than the threshold $\alpha_{th}$. After the training radius sequence is determined, the mathematical expectation is used to generate the slice size $L$. The slice size $L$ of the tensor block is obtained by calculating the mathematical expectation of the number of tensor samples of all tensor blocks in the training radius sequence as follows:

$$L = \frac{1}{l} \sum_{i=1}^{l} P_i, \tag{18}$$

where $P_i$ is the number of tensor block samples whose difference values between a particular tensor block and all the remaining tensor blocks are smaller than $\alpha_{th}$ in the training radius sequence.

Equation (8) can be further extended to a multiclassification STM under multiscale training, and the corresponding optimization problem is shown as follows. The solution method is the same as described above.

$$\begin{aligned} \min_{W, \varepsilon} \ & f(W, \varepsilon) = \frac{1}{2}\|W\|^2 + C \sum_{i=1}^{l} \varepsilon_i \\ \text{s.t.} \ & \sum_{l=1}^{L} y_i \langle W, X_i|^{(l)} \rangle + y_i b \ge 1 - \varepsilon_i, \\ & \varepsilon_i \ge 0, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{19}$$

The multiscale training process of the ISTM model is summarized in Algorithm 1.

In the learning process of the supervised STM described above, the error cost coefficient $C$ ensures, as far as possible, that the tensor data label $y_i = \operatorname{sgn}(X_i)$ is classified correctly. If the label $y_i$ does not exist, it cannot be judged whether the classification is correct, and the penalty part of the objective function cannot be computed. Moreover, the information provided by the labeled samples may be subjective and limited. This paper proposes an ISTM method based on the self-training idea to solve the problem that supervised learning requires many labeled samples. The ISTM uses a small number of labeled samples to train the initial classifier, pre-classifies the unlabeled samples, and then selects the unlabeled samples with high confidence to expand the labeled set. The ISTM consists of the following steps, and Figure 4 shows the ISTM learning process.

(i) Set up a traffic state label preclassifier according to the actual situation, label a small amount of data, and complete the ISTM initialization.
(ii) Use the initialized ISTM classifier to pre-classify unlabeled samples, generate soft labels $y_i = \operatorname{sgn}(X_i) \in \{1, 2, \ldots, m\}$, and then select unlabeled samples with high confidence to iteratively add to the labeled data [45].
(iii) Repeat steps (i) and (ii) until the error is within the tolerance.

The self-training process of the ISTM model is summarized in Algorithm 2.

ALGORITHM 1: The multiscale training process of the ISTM model.
Input: tensor blocks in the ETC gantry tensor model $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_l, y_l)\}$
Output: multiscale tensor sample set $\{X_i|^{(l)}, y_i\}_{i=1}^{l}$
Begin
(1) For each $i \in \{1, 2, \ldots, l\}$
(2) Solve for the difference $\alpha$ between the traffic flow in each tensor block and the neighboring tensor blocks; the traffic difference sequence is obtained after sorting in descending order
(3) Introduce the threshold $\alpha_{th} = E(\alpha) + 10 D(\alpha)$, where $E(\alpha)$ is the mathematical expectation of the traffic difference sequence and $D(\alpha)$ is its variance
(4) After eliminating the elements larger than the threshold, the training radius sequence is obtained
(5) End for
(6) For each $i \in \{1, 2, \ldots, l\}$
(7) $P_i$ is the number of tensor block samples whose difference values between the tensor block and the traffic flow within all the remaining tensor blocks are smaller than $\alpha_{th}$ in the training radius sequence
(8) Get the slice size corresponding to each tensor block
(9) End for
(10) Build the multiscale sample set
End

Figure 4: ISTM learning process. [Flowchart labels: begin; data input; labeled data; unlabeled data; training classifiers; importing unlabeled data into the current classifiers; filtering high-confidence unlabeled data; updating labeled data sets; iteration termination conditions; input the full data into the final classifiers; final classifiers; end.]

ALGORITHM 2: The self-training process of the ISTM model.
Input: unlabeled tensor sample set $X_1, X_2, \ldots, X_i$, $X \in \mathbb{R}^{I_1 \times \cdots \times I_N}$, and label predictor $\operatorname{sgn}(X_i) \in \{1, 2, \ldots, m\}$
Output: ISTM model traffic state classifier and tensor set with traffic state labels $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_l, y_l)\}$
Begin
(1) Set traffic soft labels for the unlabeled tensor sample set based on the traffic state label predictor $\operatorname{sgn}(X_i) \in \{1, 2, \ldots, m\}$
(2) Select the soft label tensor samples with high confidence as the input label samples
(3) Perform multiscale training of the ISTM model on the input soft label tensor samples
(4) Obtain the initial ISTM model traffic state classifier
(5) While the iteration termination conditions are not met do
(6) Input the unlabeled data to the ISTM model traffic state classifier
(7) Select the classified tensor samples with high confidence and add them to the input label samples
(8) Perform multiscale training of the ISTM model on the input label samples
(9) Obtain an ISTM model traffic state classifier
(10) Obtain the tensor set with traffic status labels $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_l, y_l)\}$
End

5. Experiment and Result

5.1. ETC Gantry Data Description and Preprocessing. This section selects the highway section from K61 + 100 to K109 + 800 as the research object. Several ETC gantries are deployed in the section with comprehensive coverage and large-scale availability, which can provide high-quality traffic parameter data. In this paper, we obtain the ETC traffic status parameter data set of this road section and complete the data preprocessing. Using the third-order ETC gantry tensor model and the traffic data extraction method in Section 3 above, this paper uses the traffic volume, average travel speed, and space occupancy as traffic state parameters. Figure 5 compares the traffic volume data of 9 ETC gantries on January 8, 2022. The traffic flow data of the 9 ETC gantries in the period of 0:00–8:59 have the same trend and a similar rise. The Pearson correlation coefficients among the ETC gantries are all above 0.985, showing a robust correlation.
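The gantry-to-gantry correlation check described above can be sketched as follows. This is an illustration only: the gantry names and the volume series are hypothetical placeholders, not the paper's data, and the functions `pearson` and `correlation_matrix` are names introduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(series):
    """Pairwise Pearson coefficients, keyed by (gantry, gantry)."""
    ids = list(series)
    return {(a, b): pearson(series[a], series[b]) for a in ids for b in ids}


# Hypothetical hourly traffic volumes for three gantries, 0:00 to 8:59.
VOLUMES = {
    "gantry_a": [12, 10, 8, 9, 15, 40, 90, 150, 210],
    "gantry_b": [14, 11, 9, 10, 17, 44, 95, 160, 220],
    "gantry_c": [11, 9, 9, 8, 14, 38, 88, 148, 205],
}

matrix = correlation_matrix(VOLUMES)
weakest_pair = min(matrix, key=matrix.get)
print(weakest_pair, round(matrix[weakest_pair], 4))
```

Reporting the weakest pair is a compact way to support a claim of the form "all coefficients are above a threshold", since every other pair is at least as strongly correlated.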
Figure 5: Comparative analysis of traffic volume of different ETC gantries: (a) comparison of traffic flow of different ETC gantries; (b) correlation coefficients of different ETC gantries traffic volumes. [Panel (a) axes: GANTRYID (G005032001000110010 to G005032001000910010), TIME, and traffic volume (veh/h); panel (b): correlation heat map over the same nine gantries.]

According to the high correlation among the ETC gantry data, this section takes ETC gantry G005032001000110010 as an example to analyze ETC gantry traffic data. Figure 6(a) shows the traffic flow data of this ETC gantry section for five days from January 7, 2022, to January 11, 2022, and Figure 6(b) shows the heat map of the correlation coefficient of traffic flow for the five days. The overall trend of traffic flow for this ETC gantry is similar on each day, and the overall traffic volume is saddle-shaped.

Figure 6: ETC gantry G005032001000110010 traffic flow analysis: (a) comparison of ETC gantry traffic volume; (b) heat map of ETC gantry traffic volume correlation coefficient. [Panel (a) axes: hour (0 to 23) and traffic volume (veh/h), one series per day; panel (b) axes: date labels 20220107 to 20220111.]

It can be inferred that the traffic flow distribution of this section was relatively sparse, without noticeable changes, from 0:00 a.m. to 6:00 a.m. The traffic flow distribution fluctuated between 7:00 a.m. and 6:00 p.m. The traffic flow gradually decreased at night, and the difference in the traffic states throughout the day was slight. With the conventional traffic state classification and identification method, the road section contains only two types of traffic states. However, the traffic flow state changes in these road sections have various characteristics, and the changes differ in duration and scope of influence. Dividing the traffic into only two types of states therefore provides little guidance for traffic control. The traffic state subcategory was determined by using the 24-hour (5 minutes/time step) data from January 1 to January 7 (7 days) as the training set and the data from January 8 and January 9 as the test set.

5.2. Experimental Setting. To construct the ISTM traffic state labeling preclassifier, the ratio of the average interval speed to the free flow speed (mobility index) is used as the initial basis for traffic state subcategory classification. The traffic state of this section is classified into four subcategories according to the actual traffic condition, as shown in Table 3. We selected 10% of all training data as the labeled training data set.

Table 3: Research section traffic status subcategory level.

| Traffic status subcategory | Mobility index |
| --- | --- |
| Free I | > 0.95 |
| Free II | [0.85, 0.95] |
| Smooth I | [0.75, 0.85] |
| Smooth II | < 0.75 |

To evaluate the sensitivity of the ISTM to changes in traffic parameters, this section sets the RBF-based K-means clustering unsupervised identification method and the SVM supervised identification method as the experimental control group. All three methods are implemented on the MATLAB platform. The K-means clustering clusters the traffic data of this road section autonomously, and the number of input clustering centers is 4. The SVM expresses the traffic state parameter data as vectors, sets the input labels for the training traffic data with Table 3 as the standard, and calculates the hyperplane weights of each decision.

This paper selects the sum of closeness distance (SumD) and the Davies–Bouldin Index (DBI) as the evaluation indexes of this traffic state identification experiment. The norm is used to calculate the distance from each test sample point to the traffic state classification point (tensor classifier, clustering center, or decision hyperplane), and the sum of these distances is the SumD. The smaller the SumD, the better the classification aggregation effect of the method. The calculation formula is as follows:

$$\mathrm{SumD} = \sqrt{ \sum_{i=1}^{l} \sum_{j=1}^{4} \left\| X_{(ij)} - W_{(j)} \right\|^2 }. \tag{20}$$

The DBI is computed in four steps. First, the average intraclass distances of two traffic state subcategories are summed. Second, the distance between the two traffic state classification points (tensor classifier, cluster center, or decision hyperplane) is calculated. Third, the sum is divided by the distance. Finally, the DBI is the average, over the subcategories, of the maximum of these ratios. A smaller DBI means smaller intraclass and larger interclass distances for the subcategories, and a value close to zero indicates a better partition. The calculation formula is as follows:

$$\begin{aligned} S_j &= \sqrt{ \frac{1}{l} \sum_{i=1}^{l} \left\| X_{(ij)} - W_{(j)} \right\|^2 }, \\ M_{j,k} &= \left\| W_{(j)} - W_{(k)} \right\|, \\ R_{j,k} &= \frac{S_j + S_k}{M_{j,k}}, \\ D_j &= \max_{j \ne k} R_{j,k}, \\ \mathrm{DBI} &= \frac{1}{4} \sum_{j=1}^{4} D_j, \end{aligned} \tag{21}$$

where $S_j$ is the average distance from the data within traffic state subcategory $j$ to its traffic state classification point, representing the degree of dispersion within the subcategory. $M_{j,k}$ is the distance between traffic state classification point $j$ and traffic state classification point $k$. $R_{j,k}$ measures the similarity. $D_j$ is the maximum value of $R_{j,k}$ calculated for each traffic state subcategory, that is, the maximum similarity between traffic state subcategory $j$ and the other subcategories. The DBI is the mean value of the maximum similarity of the four traffic state subcategories.

5.3. Experimental Results. According to K-means clustering, the traffic data are classified into two categories of regular traffic states. The traffic flow data and the corresponding traffic states are shown in Figure 7. The skewness and kurtosis of the traffic data show that the peak value, tail data, and outliers of the traffic data are small under conventional traffic state classification. However, in the free flow traffic state, there is a severe left deviation of the data, which indicates that the classification and identification of the conventional traffic state of this road section need to be improved.

Figure 7: Distribution of traffic flow data in the corresponding traffic state: (a) distribution of traffic flow data under free flow; (b) traffic flow data distribution under steady flow. [Histograms with traffic volume on the x-axis. Annotations in the figure, in source order: kurtosis −1.0429, skewness 0.4166; kurtosis −0.1058, skewness 0.04228.]

The K-means clustering traffic state subcategory identification yields the traffic state clustering centers shown in Table 4 and the test data identification results shown in Figure 8.

Table 4: K-means clustering results.

| Traffic status subcategory level | Traffic volume (veh/5-min) | Average interval speed (km/h) | Space occupancy (%) |
| --- | --- | --- | --- |
| Free I | 15 | 109.33 | 1.49 |
| Free II | 45 | 101.87 | 3.58 |
| Smooth I | 91 | 92.75 | 6.70 |
| Smooth II | 118 | 83.33 | 10.99 |

The values of the traffic flow parameters corresponding to the different traffic states in Table 4 are consistent with the traffic flow operation characteristics. Figure 8 shows that the test data under K-means clustering are distributed around the clustering centers (marked with differently shaped red dots in the figure), and the boundary of each traffic state subcategory is clear. However, the spacing between smooth I and smooth II is small, and individual data points overlap. The overall intracategory distance is moderate, but the deviation between the clustering center and the test data points under the smooth II state is significant, and the intercategory distance is small.

The results of the SVM-based traffic identification are shown in Figure 9. In Figure 9, the test data are divided into planes with clear boundaries for each traffic state subcategory. However, the distance between smooth I and smooth II is small, and individual data points overlap. The distance between the traffic state subcategories is noticeable and consistent with the traffic flow operation characteristics. The distance within the group of smooth I is significant. However, the SVM uses hyperplanes as the basis classifiers for traffic state subcategory identification, which gives it a broader range of classification criteria than the K-means clustering method.

The results of the ISTM-based traffic state subcategory identification are shown in Figure 10. In Figure 10, the boundaries of the various traffic state subcategories are clear. The traffic state features under the same subcategory are evident and consistent with the traffic flow operation characteristics. The traffic state identification based on the ISTM works well.

Comparing the three methods, it is clear that the error of the SVM increases as the number of test data samples increases. The SVM tends to be less effective in identifying traffic state subcategories in that range when the traffic volume increases. The K-means clustering performs poorly in a learning environment with small test data and is more sensitive to noisy data. The ISTM sets up a multiscale training idea to flexibly extract tensor slices of different sizes. The ISTM adapts well to the variable data volume of test samples and satisfies the category classification when the data set has a small range of differences.

The results of the SumD calculation for the three traffic state identification methods are shown in Table 5, and the comparison is shown in Figure 11(a). The results of the DBI calculation are shown in Table 6, and the comparison is shown in Figure 11(b).

The calculations of the three methods in Tables 5 and 6 show that the ISTM has the best effect on traffic state subcategory identification when the traffic state discrepancy is slight. The unsatisfactory performance of the SVM in this experiment may be related to the label set. The SVM is supervised learning, and its label setting significantly affects the traffic state identification results and interferes with its intracategory distance. The K-means clustering is unsupervised learning, which is better than the SVM in choosing the same weight for the traffic state parameters. The ISTM directly uses traffic tensor data as input to retain the topology of traffic data more effectively, dramatically reduces the number of decision variables, and helps to overcome the overfitting problem in the calculation of the SVM method.
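The two evaluation indexes in equations (20) and (21) can be sketched in a few lines. The snippet below is a hedged illustration rather than the authors' MATLAB code: it assumes each subcategory is given as a list of feature vectors together with one classification point (for example, a cluster center), it takes $l$ in $S_j$ to be the number of samples in subcategory $j$, and it uses two hypothetical toy subcategories instead of the paper's four. The function and variable names are introduced here for illustration.

```python
import math


def dist(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def sum_d(groups, points):
    """Equation (20): root of the summed squared distances from every
    sample to the classification point of its own subcategory."""
    total = sum(dist(x, points[j]) ** 2
                for j, samples in groups.items() for x in samples)
    return math.sqrt(total)


def davies_bouldin(groups, points):
    """Equation (21): mean over subcategories of the worst-case ratio
    (S_j + S_k) / M_jk. Returns (DBI, per-subcategory D_j)."""
    scatter = {
        j: math.sqrt(sum(dist(x, points[j]) ** 2 for x in samples) / len(samples))
        for j, samples in groups.items()
    }
    worst = {
        j: max((scatter[j] + scatter[k]) / dist(points[j], points[k])
               for k in groups if k != j)
        for j in groups
    }
    return sum(worst.values()) / len(worst), worst


# Hypothetical toy data: two subcategories in a 2-D feature space.
GROUPS = {
    "free": [(0.0, 0.0), (2.0, 0.0)],
    "smooth": [(10.0, 0.0), (12.0, 0.0)],
}
POINTS = {"free": (1.0, 0.0), "smooth": (11.0, 0.0)}

dbi, per_group = davies_bouldin(GROUPS, POINTS)
print(sum_d(GROUPS, POINTS), dbi, per_group)
```

Returning the per-subcategory $D_j$ alongside the average mirrors the layout of a per-subcategory results table with an "Average" column.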
Figure 8: Traffic state subcategory identification results based on K-means clustering: (a) traffic status identification results on January 8; (b) traffic status identification results on January 9. [Axes: speed (km/h), flow (veh/5 min), and occupancy (%); legend: Free I, Free II, Smooth I, Smooth II.]

Figure 9: Traffic state subcategory identification results based on SVM: (a) traffic status identification results on January 8; (b) traffic status identification results on January 9. [Axes: speed (km/h), flow (veh/5 min), and occupancy (%); legend: Free I, Free II, Smooth I, Smooth II.]

Figure 10: Traffic state subcategory identification results based on ISTM: (a) traffic status identification results on January 8; (b) traffic status identification results on January 9. [Axes: speed (km/h), flow (veh/5 min), and occupancy (%); legend: Free I, Free II, Smooth I, Smooth II.]

Table 5: Traffic state subcategory identification results (SumD).

| Method | Free I | Free II | Smooth I | Smooth II | Average |
| --- | --- | --- | --- | --- | --- |
| K-means | 9.27478 | 9.23825 | 9.01315 | 8.98892 | 9.12878 |
| SVM | 9.00476 | 9.21502 | 9.28233 | 9.28339 | 9.19638 |
| ISTM | 8.91179 | 8.9073 | 8.95905 | 8.70588 | 8.87101 |

Table 6: Traffic state subcategory identification results (DBI).

| Method | Free I | Free II | Smooth I | Smooth II | Average |
| --- | --- | --- | --- | --- | --- |
| K-means | 0.3786 | 0.3496 | 0.4011 | 0.4210 | 0.3876 |
| SVM | 0.4796 | 0.3989 | 0.3833 | 0.3617 | 0.4059 |
| ISTM | 0.2125 | 0.2033 | 0.2149 | 0.2324 | 0.2158 |

Figure 11: The comparison of traffic state subcategory identification results (DBI): (a) SumD; (b) DBI. [Grouped bar charts by traffic state subcategory (Free I, Free II, Smooth I, Smooth II) for K-means, SVM, and ISTM; y-axes: SumD and DBI.]

6. Conclusion

This paper considers the narrow computable range and small data granularity of highway traffic data. We address the problem that the coarse-granularity calibration of traffic state parameters in conventional methods cannot meet the needs of refined traffic state identification. This paper proposes a traffic state subcategory identification method based on ETC gantry data that synthesizes the STM with multiscale training and self-training learning ideas. The ISTM extends the SVM and its kernel function to the STM and RBF, which can handle tensor structure. The ISTM uses the high-order tensor data structure to describe traffic data and uses multiscale tensor slices to reflect the proportional relationship between traffic state parameters as much as possible. To overcome the problem of a small initial labeled data set, the ISTM fuses self-training learning with the STM to make up for the shortcoming that the supervised learning STM cannot be trained when the samples are unlabeled. The experimental results show that the ISTM performs better for traffic state recognition in a small variation range and can improve the accuracy of fine-grained traffic state classification in this experimental section.

In the future, we will conduct more detailed research on the internal changes in the traffic state. We will combine theoretical analysis of the correlation between macroscopic traffic flow and microscopic traffic flow and evaluate various factors affecting the traffic state of the highway to make up for the shortcomings in this study. In addition, we will use multisource ITS traffic data to optimize tensor models, build online recognition models for traffic state subcategories, and accelerate the traffic state training speed.

Data Availability

The ETC gantry data used to support the findings of this study have not been made available because the ETC gantry data includes user privacy.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this manuscript.

Acknowledgments

This research was supported by the Shandong Province Key R&D Program (Major Science and Technology Innovation Project) Project "Research and Application of Key Technology of Highway Vehicle-Road Collaboration" (no. 2020CXGC010118).

References

[1] H. Li, Y. Zhou, Y. Xing, C. Zhang, and X. Zhang, "An improved mixed distribution model of time headway for urban roads based on a new traffic state classification method," IEEE Access, vol. 9, pp. 12635–12647, 2021.
[2] G. Aceto, D. Ciuonzo, A. Montieri, and A. Pescapé, "Mobile encrypted traffic classification using deep learning: experimental evaluation, lessons learned, and challenges," IEEE Transactions on Network and Service Management, vol. 16, no. 2, pp. 445–458, 2019.
[3] Y. Yue, L. Yu, L. Zhu, G. Song, and X. Chen, "Macroscopic Model for Evaluating Traffic Conditions on the Expressway Based on Speed-Specific VKT Distributions," in Journal of Transportation Systems Engineering & Information Technology, vol. 14, pp. 85–92, 2014.
[4] H. Wen-jun and Z. Xi-zhao, "A multi-mode two-level transportation model based on network vulnerability," in Proceedings of the 2019 5th International Conference on Transportation Information and Safety (Ictis 2019), p. 104, New York, NY, USA, June, 2019.
[5] D. Huang, Y. Chai, L. Zhao, and G. Sun, "Traffic Congestion Status Identification Method for Road Network With Multi-Source Uncertain Information," in Acta Automatica Sinica, vol. 44, pp. 533–544, 2018.
[6] Z. Cui, R. Ke, Z. Pu, X. Ma, and Y. Wang, "Learning traffic as a graph: a gated graph wavelet recurrent neural network for network-scale traffic prediction," Transportation Research Part C: Emerging Technologies, vol. 115, Article ID 102620.
[7] X. Chen, S. Wu, C. Shi et al., "Sensing data supported traffic flow prediction via denoising schemes and ann: a comparison," IEEE Sensors Journal, vol. 20, no. 23, pp. 14317–14328.
[8] Q. Liu, J. Xie, and F. Ding, "A data-driven feature based learning application to detect freeway segment traffic status using mobile phone data," Sustainability, vol. 13, no. 13, p. 7131, 2021.
[9] Y. Zhao, B. Zhang, Y. Cao, Y. Rui, and B. Ran, "Application of data fusion based on clustering-neural network for etc gantry flow capacity correction," in Proceedings of the 22nd COTA International Conference of Transportation Professionals, pp. 239–249, Changsha, China, September, 2022.
[10] L. Li, X. Lin, H. Liu, W. Lu, B. Zhou, and J. Zhu, "Displacement data imputation in urban internet of things system based on tucker decomposition with L2 regularization," IEEE Internet of Things Journal, vol. 9, no. 15, pp. 13315–13326.
[11] Y. Yuan, J. W. C. van Lint, R. E. Wilson, F. van Wageningen-Kessels, and S. P. Hoogendoorn, "Real-time Lagrangian traffic state estimator for freeways," IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 1, pp. 59–70, 2012.
[12] Y. Hara, J. Suzuki, and M. Kuwahara, "Network-wide traffic state estimation using a mixture Gaussian graphical model and graphical lasso," Transportation Research Part C: Emerging Technologies, vol. 86, pp. 622–638, 2018.
[13] J. C. Herrera and A. M. Bayen, "Incorporation of Lagrangian measurements in freeway traffic state estimation," Transportation Research Part B: Methodological, vol. 44, no. 4, pp. 460–481, 2010.
[14] L. Li, L. Qin, X. Qu, J. Zhang, Y. Wang, and B. Ran, "Day-ahead traffic flow forecasting based on a deep belief network optimized by the multi-objective particle swarm algorithm," Knowledge-Based Systems, vol. 172, pp. 1–14, 2019.
[15] F. Xu, Z. He, Z. Sha, W. Sun, and L. Zhuang, "Traffic state evaluation based on macroscopic fundamental Diagram of urban road network," Procedia - Social and Behavioral Sciences, vol. 96, pp. 480–489, 2013.
[16] T. Seo, A. M. Bayen, T. Kusakabe, and Y. Asakura, "Traffic state estimation on highway: a comprehensive survey," Annual Reviews in Control, vol. 43, pp. 128–151, 2017.
[17] L. Li, H. Zhou, H. Liu, C. Zhang, and J. Liu, "A hybrid method coupling empirical mode decomposition and a long short-term memory network to predict missing measured signal data of SHM systems," Structural Health Monitoring, vol. 20, no. 4, pp. 1778–1793, 2021.
[18] T. Tsubota, A. Bhaskar, A. Nantes, E. Chung, and V. V. Gayah, "Comparative analysis of traffic state estimation cumulative counts-based and trajectory-based methods," Transportation Research Record: Journal of the Transportation Research Board, vol. 2491, no. 1, pp. 43–52, 2015.
[19] J. Tang, F. Liu, Y. Zou, W. Zhang, and Y. Wang, "An improved fuzzy neural network for traffic speed prediction considering periodic characteristic," IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 9, pp. 2340–2350, 2017.
[20] D. W. Xu, H. H. Dong, H. J. Li, L. M. Jia, and Y. J. Feng, "The estimation of road traffic states based on compressive sensing," Transportation Business, vol. 3, no. 2, pp. 131–152, 2015.
[21] C. Stutz and T. A. Runkler, "Classification and prediction of road traffic using application-specific fuzzy clustering," IEEE Transactions on Fuzzy Systems, vol. 10, no. 3, pp. 297–308.
[22] Z. Wang, R. Chu, M. Zhang, X. Wang, and S. Luan, "An improved selective ensemble learning method for highway traffic flow state identification," IEEE Access, vol. 8, pp. 212623–212634, 2020.
[23] Y. Zhang and Y. Liu, "Traffic forecasting using least squares support vector machines," Transportmetrica, vol. 5, no. 3, pp. 193–213, 2009.
[24] Z. Cheng, W. Wang, J. Lu, and X. Xing, "Classifying the traffic state of urban expressways: a machine-learning approach," Transportation Research Part A: Policy and Practice, vol. 137, pp. 411–428, 2020.
[25] D. Xu, Y. Wang, P. Peng, S. Beilun, Z. Deng, and H. Guo, "Real-time road traffic state prediction based on kernel-KNN," Transportmetrica: Transportation Science, vol. 16, no. 1, pp. 104–118, 2020.
[26] X. Wen, Y. Xie, L. Jiang, Z. Pu, and T. Ge, "Applications of machine learning methods in traffic crash severity modelling: current status and future directions," Transport Reviews, vol. 41, no. 6, pp. 855–879, 2021.
[27] J. Yang, J. Li, L. Wei, L. Gao, and F. Mao, "Spatiotemporal DeepWalk gated recurrent neural network: a deep learning framework for traffic learning and forecasting," Journal of Advanced Transportation, vol. 2022, Article ID 4260244, 11 pages, 2022.
[28] X. Lin, "A road network traffic state identification method based on macroscopic fundamental Diagram and spectral clustering and support vector machine," Mathematical Problems in Engineering, vol. 2019, Article ID 6571237, 10 pages, 2019.
[29] L. Zhou, S. Zhang, J. Yu, and X. Chen, "Spatial-temporal deep tensor neural networks for large-scale urban network speed prediction," IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 9, pp. 3718–3729, 2020.
[30] Z. Hao, L. He, B. Chen, and X. Yang, "A linear support higher-order tensor machine for classification," IEEE Transactions on Image Processing, vol. 22, no. 7, pp. 2911–2920, 2013.
[31] M. Signoretto, R. Van de Plas, B. De Moor, and J. A. K. Suykens, "Tensor versus matrix completion: a comparison with application to spectral data," IEEE Signal Processing Letters, vol. 18, no. 7, pp. 403–406, 2011.
[32] S. El Hamdani and N. Benamar, "A comprehensive study of intelligent transportation system architectures for road congestion avoidance," Ubiquitous Networking, vol. 10542, pp. 95–106, 2017.
[33] Q. Sun, Y. Sun, L. Sun et al., "Research on traffic congestion characteristics of city business circles based on TPI data: the case of Qingdao, China," Physica A: Statistical Mechanics and Its Applications, vol. 534, Article ID 122214, 2019.
[34] P. Zhao and H. Hu, "Geographical patterns of traffic congestion in growing megacities: big data analytics from Beijing," Cities, vol. 92, pp. 164–174, 2019.
[35] Y. Chen, C. Chen, Q. Wu, J. Ma, G. Zhang, and J. Milton, "Spatial-temporal traffic congestion identification and correlation extraction using floating car data," Journal of Intelligent Transportation Systems, vol. 25, no. 3, pp. 263–280.
[36] Z. Deng, D. Huang, J. Liu, B. Mi, and Y. Liu, "An assessment method for traffic state vulnerability based on a cloud model for urban road network traffic systems," IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 11, pp. 7155–7168, 2021.
[37] T. Afrin and N. Yodo, "A survey of road traffic congestion measures towards a sustainable and resilient transportation system," Sustainability, vol. 12, no. 11, p. 4660, 2020.
[38] H. Chen, R. Zhou, H. Chen, and A. Lau, "A resilience-oriented evaluation and identification of critical thresholds for traffic congestion diffusion," Physica A: Statistical Mechanics and Its Applications, vol. 600, Article ID 127592, 2022.
[39] Q.
Ma, Z. Zou, and S. Ullah, “An approach to urban trafc condition estimation by aggregating GPS data,” Cluster Computing, vol. 22, no. S3, pp. 5421–5434, 2019. [40] H. Tan, Y. Wu, B. Shen, P. J. Jin, and B. Ran, “Short-Term trafc prediction based on dynamic tensor completion,” IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 8, pp. 2123–2133, 2016. [41] C. C. Chang, “LIBSVM: A library for support vector ma- chines,” ACM Transactions on Intelligent Systems and Tech- nology, vol. 2, no. 3, 2022. [42] J. Gao, H. Wang, and H. Shen, “Task failure prediction in cloud data centers using deep learning,” in Proceedings of the 2019 Ieee International Conference on Big Data (Big Data), pp. 1111–1116, New York, NY, USA, December, 2019. [43] J. Cervantes, F. Garcia-Lamont, L. Rodrıguez-Mazahua, and A. Lopez, “A comprehensive survey on support vector ma- chine classifcation: applications, challenges and trends,” Neurocomputing, vol. 408, pp. 189–215, 2020. [44] P. Lingras and C. Butz, “Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classifcation,” Information Sciences, vol. 177, no. 18, pp. 3782–3798, 2007. [45] C. Liu, D. Wu, Y. Li, and Y. Du, “Large-scale pavement roughness measurements with vehicle crowdsourced data using semi-supervised learning,” Transportation Research Part C: Emerging Technologies, vol. 125, Article ID 103048,

Journal of Advanced Transportation – Hindawi Publishing Corporation

**Published:** Mar 30, 2023
