Hindawi Journal of Robotics, Volume 2022, Article ID 4163992, 12 pages. https://doi.org/10.1155/2022/4163992

Research Article

Support Vector Machine and Granular Computing Based Time Series Volatility Prediction

Yuan Yang and Xu Ma

School of Mathematics and Computer Science, Ningxia Normal University, Guyuan 756000, Ningxia, China

Correspondence should be addressed to Yuan Yang; sjyangyuan@nxnu.edu.cn

Received 17 December 2021; Revised 13 February 2022; Accepted 14 February 2022; Published 16 April 2022

Academic Editor: Shan Zhong

Copyright © 2022 Yuan Yang and Xu Ma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

With the development of information technology, a large amount of time-series data is generated and stored in the field of economic management, and the potential and valuable knowledge and information in these data can be mined to support management and decision-making activities by using data mining algorithms. In this paper, three different time-series information granulation methods are proposed, covering both the time axis and the theoretical domain: a time-axis information granulation method based on fluctuation points, a time-axis information granulation method based on the cloud model, and a fuzzy time-series prediction method based on theoretical domain information granulation. At the same time, the granulation idea of granular computing is introduced into time-series analysis: the original high-dimensional time series is granulated into a low-dimensional grain time series by information granulation, and the constructed information grains can portray and reflect the structural characteristics of the original time-series data, realizing efficient dimensionality reduction and laying the foundation for the subsequent data mining work.
Finally, the grains of the decision tree are analyzed, and different support vector machine classifiers corresponding to each grain are designed to construct a global multiclassification model.

1. Introduction

With the rapid development of internet technology and the improved performance of data storage devices in recent years, a large amount of data is generated and stored in various industries. A large portion of these data are time-tagged, that is, series of observations recorded in chronological order, called time series. How to effectively analyze and process such time-series data to uncover potential and valuable knowledge and information, so as to support more efficient production, operation, management, and decision-making activities of enterprises, is one of the important tasks in today's big data era [1]. Granular computing is a new approach to simulating human problem-solving thinking and to solving complex tasks with big data, and it is an emerging research direction in artificial intelligence in recent years. The main idea of the theory is to abstract and divide complex problems into several simpler problems (i.e., granulation), thus contributing to better analysis and problem-solving. The existing research on time-series information granulation is mainly divided into two aspects, the time axis and the theoretical domain, that is, solving the problems of effectively dividing and representing the time window and the theoretical domain. Support vector machines show many unique advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems; the technique avoids local minima and realizes capacity control, and it can be extended to other machine learning problems such as function fitting.

Research on time-axis information granulation of time series usually uses a fixed time interval to divide the series, that is, hard division, and then represents the information grains obtained after the division. This ignores the changing characteristics of the time series along the time axis and does not conform to the essential meaning of information grains, so it is necessary to design the granulation method according to those changing characteristics, such that the obtained information grains have internal structures that are similar within themselves while the grains remain distinct from each other [2]. Studies of time-series domain information granulation usually cannot combine the requirements of both interpretability and prediction accuracy of the domain partition intervals, so there is a need to design a time-series domain information granulation method with both strong interpretability and high prediction accuracy [3].

This paper introduces the granulation idea of granular computing into time-series analysis. By information granulation, the original high-dimensional time series is granulated into a low-dimensional grain time series, and the constructed information grains can portray and reflect the structural characteristics of the original time-series data, thus realizing efficient dimensionality reduction and laying the foundation for subsequent data mining work. Addressing the shortcomings of existing research methods, the study of time-series information granulation oriented to clustering and prediction proposes three different granulation methods covering both the time axis and the theoretical domain and applies them to stock time-series data for clustering and prediction analysis. To address the long training time and low efficiency of existing support vector machines on multiclassification problems, the idea of granular computing is introduced to construct support vector machine multiclassification models, and the learning algorithm for constructing decision trees is improved so as to raise training efficiency and classification accuracy.

2. Related Work

Combining other, more mature theories and methods with SVM has become a research topic with great potential for development; at the same time, it faces problems such as difficult classification and inaccurate prediction. Current research on granular support vector machines mainly focuses on combinations with specific models: SVM with rough sets, decision trees, clustering, quotient spaces, association rules, and so on. These results only preprocess the data, but the resulting models are important for the theoretical study of machine learning and support vector machines, as well as for the exploration of problems such as intelligent information processing.

Egrioglu E [4] studies rough lower and upper approximations on the space of grain approximations from the perspective of rough set theory. Subsequently, the concept of grain logic (G-logic) is given in [5], where a similar inference system is built based on rough logic, and instance verification and analysis are carried out on medical diagnosis problems. Many results have also been achieved in terms of practical applications. The importance of attributes, as elaborated in [6], was added to the granular computation of knowledge and used in solving the minimal attribute approximation, among other problems. In subsequent research, fuzzy quotient space theory was created in [7], improved in [8], perfected in the context of data mining, and so on. He Y [9] dealt with word computation and language dynamics and proposed a language dynamics system. The subsequent literature [10] elaborates a grain computation model based on tolerance relations, giving a grain operation criterion for incomplete information systems, a grain representation, and a grain decomposition method. At the same time, in connection with the attribute simplification of rough sets, determination conditions are given, and problems such as the acquisition of attribute necessity rules for incomplete information systems are addressed. Luo C [11] applies the compatible granularity space model in the field of image segmentation. Kim S T [12] combines granularity with neural networks, applied to efficient knowledge discovery. Dong G [13] elaborates on the connection between concept description and concept hierarchy transformation based on the similarity of the concept lattice and granularity partitioning in the process of concept clustering. Su W H [14] combines grain vector space with artificial neural networks, which improves the timeliness and comprehensibility of the knowledge representation of the artificial neural network.

Literature [15] decomposed copper and wheat prices with the EMD and EEMD methods, respectively, based on a multiscale perspective; BP neural network, SVM, and ARIMA models were then used for prediction and integration, and the prediction results showed that the combined model predicts better. Although decomposition-based ensemble models perform better, there are some defects: the wavelet decomposition method suffers from weak adaptability and poor robustness of network training during data decomposition, while the EMD method suffers from modal overlap and a lack of theoretical grounding in the decomposition process. Moreover, for price series with multiscale structure and high noise, these methods produce many components after decomposition, which is not conducive to subsequent forecasting. Later, literature [16] constructed a new sequence decomposition method, the empirical wavelet transform (EWT), based on the wavelet transform and combining the advantages of EMD. Literature [17] and others used EWD and EWT to decompose wind power sequences and then combined them with neural network methods for cross-combination prediction; after comparison, it was found that the sequences decomposed by EWT had a better prediction effect. The basic idea of a rough set is to form concepts and rules through analytical induction and to study target equivalence relations as well as categorical approximation for knowledge discovery. Zhao Y [18] combines multilevel and multiperspective granularity methods by defining the division sequence product space and using nested division sequences to define different granular layers over the theoretical domain; finally, a granulation model based on the division order is given using the division order product space.
Chen W [19] proposes a neighborhood granulation method, introducing inter- and intraclass thresholds to construct a supervised neighborhood-based rough set model, and gives the rough approximation quality and conditional entropy monotonicity theorems for this model by analyzing the neighborhood particle change under double thresholds. Literature [20] studies the operation mechanism of data information particles: nonstandard analysis is used as the operation rule of information particles, the accompanying binary relation is proposed, and the division of coarse and fine particle layers under the binary relation is analyzed in depth; the algorithm can realize the merging and decomposition of particle layer space, which effectively reduces the computational intensity and simplifies the data analysis process.

3. Support Vector Machine-Based Algorithms for Granular Computing

Set theory is the foundation of modern mathematics, and fuzzy set theory is one of its newer mathematical tools and theories. Once the concept of fuzzy sets and the problem of the granularity of fuzzy information were introduced, the scope of their use rapidly expanded and extended the theory of fuzzy logic, followed by the "theory of word computation", which aims to use language for fuzzy computation and reasoning to achieve fuzzy intelligent control. At the same time, the integration of fuzzy set theory and quotient space, using fuzzy equivalence relations, completed the expansion of the quotient space model to grain computation and was able to accurately map and solve uncertainty problems. Therefore, a proper hierarchical, progressive granularity structure can solve such problems effectively. However, the theory lacks the means and technical algorithms to complete the transformations involved: between a granularity and a granularity world, between granularities, and between granularity worlds. If this problem can be solved, it will improve the theory and extend the scope of use of the quotient space [21].

Machine learning solves for an estimate of a system's behavior based on known training samples: given a dependency between the inputs and outputs of a system, it makes as accurate an estimate of unknown data as possible. The problem of machine learning can then be modeled as the existence of some unknown law of dependency between input variables and output variables [22]. The basic idea of the support vector machine is to map the input nonlinearly into a high-dimensional space, solve the optimal linear classification there, and finally define an appropriate inner product (kernel) function to complete this nonlinear transformation. The triadic theory of granular computing comprises multiperspective and multilevel granular structures and the granular computing triangle: the methodology of granular computing is structured problem-solving, and its computational model is structured information processing. The triad emphasizes the mutual support of the philosophical, methodological, and computational models of granular computation. The study of granular computing attempts to organize, abstract, and combine granular processing ideas from various disciplines to obtain a higher-level, systematic, discipline-independent principle of granular computing.

The traditional algorithm steps are as follows:

(1) Select the number of grains to be divided, p:

f_φ(p, x) = d_p(x − λ1θ1/p)/d_p(x) + d_p(x − λ2θ2/p)/d_p(x), (1)

where f_φ(p, x) is the overall feature function of the data set.

(2) Determine the objective optimization function:

∇_{b_ij} J(w, p; x^l, y) = ∂J(w, p)/∂b_ij = δ^{l+1} + λα_ij. (2)

In the objective optimization function, w is the penalty parameter; nonseparable samples fall in overlap regions and may belong to one class of samples or to multiple classes.

(3) Generate k decision functions as follows:

X_k = Σ_{i=1, j=1}^{2} (T_k + b_ij X). (3)

(4) The radial basis kernel function is then obtained as follows:

G(J) = ∂c/∂j + (1/n) Σ_{i=1}^{n} X_i Y_i. (4)

(5) The final if-then form and the main fuzzy constraint propagation rules are proposed.

The fusion of the three models in turn produces fuzzy rough sets, fuzzy quotient spaces, and so on, so that the three models are both distinct and related. Between rough sets and fuzzy sets, the former process grains afterwards while the latter are given in advance; both describe and generalize the incompleteness and inaccuracy of information grains, yet there are significant differences in how the grains are processed. Rough sets focus on the coarseness of information grains, describe grains by upper and lower approximation operators, and emphasize indistinguishability and the classification of different equivalence classes. Fuzzy sets focus on fuzziness, describe and emphasize the indistinguishability of boundaries using membership and membership functions, and study only the degree of membership within the same equivalence class. Figure 1 shows the framework of the algorithm flow of support vector machine-based granular computing.

Figure 1: Support vector machine-based algorithm flow framework for granular computing.
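The per-grain classifier construction described above can be illustrated with a small sketch. This is not the authors' exact algorithm: the k-means granulator, the toy data, and the routing rule are all placeholder assumptions; the sketch only shows the general pattern of dividing the sample space into p grains, training one SVM per grain, and routing queries through the granulator to form a global multiclassification model.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy 2-D data with three classes (a stand-in for a real data set).
X = rng.normal(size=(300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int) + (X[:, 0] - X[:, 1] > 1).astype(int)

# Step 1: granulate the sample space into p grains (here via k-means).
p = 4
granulator = KMeans(n_clusters=p, n_init=10, random_state=0).fit(X)

# Step 2: train one RBF-kernel SVM per grain on the samples it contains.
experts = {}
for g in range(p):
    mask = granulator.labels_ == g
    if np.unique(y[mask]).size < 2:
        # Degenerate grain with a single class: remember that class directly.
        experts[g] = int(y[mask][0])
    else:
        experts[g] = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X[mask], y[mask])

# Step 3: route each query to its grain's expert to form the global model.
def predict(Xq):
    grains = granulator.predict(Xq)
    out = np.empty(len(Xq), dtype=int)
    for i, (x, g) in enumerate(zip(Xq, grains)):
        m = experts[g]
        out[i] = m if isinstance(m, int) else m.predict(x.reshape(1, -1))[0]
    return out

acc = (predict(X) == y).mean()
```

Because each expert only sees one grain, its quadratic-programming problem is small, which is the training-efficiency argument made in the introduction.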
The main fuzzy constraint propagation rule takes the form

T(i, j) = (g(i, j) + g_max)/(g_max − g_min) + |X − Y|. (5)

It follows that cluster analysis can be considered a concrete implementation of the idea of granulation, which is another layer of abstraction on the idea of cluster analysis. Granular computing is a concrete implementation of the ideas of granularity and hierarchy in the solution of machine problems. The core concept of granular computing is the multilevel and multiview granular structure [23]. The fundamental framework of granular computing consists of particles, granular layers, and granular structures. The most fundamental element of granular computation is called a particle, which is composed of a collection of individuals described by internal properties and a whole described by external properties. An abstracted description of the space of problems or samples of operations is called a granule layer, and the whole set of particles obtained under some granulation criterion constitutes a granule layer whose internal particles share some identical (or similar) granularity or properties. Granularity comes from the way people perceive the world: the observation, definition, and transformation of practical problems are granular calculations for different problems measured from different perspectives or levels, and in different applications granularity can be interpreted as size, abstraction, complexity, and so on. Different grains can be ordered by their granularity. Each grain provides a local, partial description, and all grains in a layer combine to provide a global, complete description. Grain calculations are often solved on different grain layers. A multilevel grain structure is a description of the relationships and connections between grains and grains, grains and layers, and layers and layers; the grain structure is a relational structure consisting of interconnections between grain layers. There are three modes, top-down, bottom-up, and center-out, which are three common modes of information processing for humans and computers.

4. Support Vector Machine and Granular Computing Based Time Series Volatility Prediction

4.1. Empirical Modal Decomposition of Time-Series Fluctuation Algorithms. Empirical modal decomposition (EMD) is a signal decomposition processing method whose principle is to decompose an originally complex price signal sequence into a finite number of simpler intrinsic mode functions (IMFs). Each IMF represents the information contained in the original price series at a different scale level and can effectively reflect the embedded characteristics of the original price series at a low level. EMD is a data processing method that can smooth a complex signal series: it decomposes the original signal into several component series ordered by frequency level; the first component has the highest frequency, the following ones decrease in order, and the last has the lowest frequency. Many component sequences at different feature scale levels are thus obtained, each corresponding to an IMF eigenmode component of the relevant frequency. The superimposed waves in the original signal can be removed, and symmetric modal waveforms can be obtained. In the EMD algorithm, IMF1 contains the component of the original sequence with the smallest period. The residual term after subtracting IMF1 from the original sequence contains the part of the vibration signal whose period is larger than that of IMF1, so the average period of IMF2 is generally larger than that of IMF1. By analogy, along the IMF sequence filtered by the EMD algorithm the signal frequency decreases, the fluctuation intensity decreases, and the average period increases; the final residual term is a constant or monotonic function, which reflects the long-term trend of the sequence.

The empirical modal decomposition algorithm can be understood as a set of adaptive filters that sieve the data layer by layer according to its essential scale characteristics, separating the characteristic time scales in order from small to large. After such decomposition, the frequency of each component indicates the degree of price-series fluctuation: the higher the frequency of a component, the more violent the corresponding fluctuations; the lower the frequency, the more moderate. The most moderate component produced last in the decomposition can therefore represent the overall direction of the price series and is generally referred to as the residual component [24].

EMD decomposition is simple, and its results are comparatively accurate, because the mathematical processing is adaptive and the decomposition results are generated automatically without human interference. EMD automatically generates different basis functions and the most appropriate number of components according to the fluctuation of each price series. In contrast, the wavelet decomposition method requires the basis function to be selected in advance when processing the original sequence and then requires several training trials to determine the most appropriate number of components. Usually, the total number of components obtained by EMD decomposition is log N, where N is the number of data samples in the original series. The problem of unbalanced data classification has also become an important research direction in data mining and machine learning; a better solution to the classification of unbalanced data distributions is required to handle data classification more comprehensively. Time-series information granulation based on the support vector machine introduces the idea of the granular computing support vector machine into time-series analysis and is a new research direction of time-series analysis [25]. The idea of information granulation is to decompose a whole into small parts and then study the decomposed parts, each of which is a particle; in other words, information granules are elements that are similar and indistinguishable or that have a certain function. Information granulation of time series is the basis for compressing the scale of time-series data and using it for subsequent time-series analysis, interpretation, and modeling [26]. Therefore, compared with the wavelet decomposition method, the EMD decomposition method has obvious advantages in the operation of the decomposition process. However, EMD has some drawbacks in application, such as the easy occurrence of component stacking and endpoint contamination.

Since actual time series are composed of both real signal and noise, empirical modal decomposition processes data containing noise, and for some time-series data with signal jump changes, the jump signals may cause scale loss, so the decomposed results can suffer from modal confounding. When there is a jump in the scale of the original time-series signal, the EMD decomposition result may exhibit modal mixing. The so-called modal overlap manifests in the decomposition results as follows: where there should be only one scale feature, the subsequence of that scale feature is not unique, and signals of multiple scale features are mixed in one sequence. In particular, influenced by the signal's collection frequency, frequency components, and amplitude, modal blending can easily occur when empirical modal decomposition is performed directly, and modal blending mainly refers to the following two aspects:

(1) A single IMF contains components of entirely heterogeneous scales.

(2) Signal components of the same scale appear in different IMFs.

The endpoint effect arises because EMD needs to construct the upper and lower envelopes of the sequence with the cubic spline method during decomposition, but the cubic spline diverges near the boundary points of the original sequence, and as the EMD decomposition proceeds, the endpoint effect gradually spreads inward and pollutes the whole sequence, interfering with the final decomposition. The simplest way to cope with this problem is to keep discarding the nonextreme part at the endpoints during decomposition, but this causes data waste and thus harms the later prediction. If the data at the boundary points of the sequence are not deleted, it is generally only possible to extend each end by various methods, and this extension process is disturbed by human factors, which eventually affects the decomposition. Figure 2 shows the process of variational modal decomposition.

Figure 2: Variational modal decomposition flow.

To obtain a relatively stable speed of the vehicles within the cluster, we use the average speed of the vehicles within the cluster to characterize the stability of the cluster and filter the vehicle nodes within the above set of pairs of neighboring nodes by motion consistency, removing vehicle nodes whose speed differs greatly from the average, so that the cluster can travel on the road in a relatively stable manner. Specifically, the average speed of the vehicles within the cluster at time t can be expressed as follows:

v_c = (δx/δt) · (n!/(r!(n − r)!)) + μ, (6)

where N_i(t) denotes the number of elements in the set of neighboring nodes of V_i at time t and V_it represents the n-th element within the set N_vi of neighboring nodes of V_i at time t. If the velocity of V_jn satisfies the following equation, it will be removed:

N_vi(t) = Σ_{i=1} N · V_in + A. (7)

The set of neighbor nodes of vehicle V_i at moment t can be expressed as follows:

N_vi = V_i t + θ + η. (8)

The average end-to-end delay reflects the effectiveness of the protocol and can be determined using the following formula:

ẍ = (δy/δx) √(c²y² + x³ + y) + c. (9)

Another approach works in the direction of the classification algorithms: based on the flaws and deficiencies found in previous algorithms for solving imbalance problems, the algorithms are appropriately improved and extended to strengthen the ability to handle imbalanced classification. The use of a pseudosignal has also been proposed to solve the modal aliasing problem, introducing a pseudosignal to prevent an IMF from covering too wide a band; however, this method also requires human subjective judgment to intervene and suffers from the same problem of weakened adaptivity.

4.2. Time-Series Volatility Prediction Model Based on Support Vector Machine Grain Calculation. Support vector machine-based time-series information granulation is a new research direction of time-series analysis that introduces the idea of the granular computing support vector machine into time-series analysis.
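As a concrete illustration of the sifting step at the heart of the EMD procedure described in Section 4.1 (cubic-spline envelopes over the local extrema, subtraction of the envelope mean), the following numpy/scipy sketch extracts one candidate IMF. It is a didactic simplification, not a full EMD implementation: the fixed iteration count, the test signal, and the naive pinning of envelope endpoints to the signal are assumptions, and the latter is exactly the source of the endpoint effect discussed above.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def sift_imf(x, n_sift=10):
    """Extract one candidate IMF by repeated envelope-mean subtraction."""
    h = x.copy()
    t = np.arange(len(x))
    for _ in range(n_sift):
        # Locate interior local maxima and minima.
        maxima = np.where((h[1:-1] > h[:-2]) & (h[1:-1] > h[2:]))[0] + 1
        minima = np.where((h[1:-1] < h[:-2]) & (h[1:-1] < h[2:]))[0] + 1
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema to build envelopes
        # Cubic-spline upper/lower envelopes; endpoints are naively pinned
        # to the signal, which reproduces the endpoint effect near the borders.
        up = CubicSpline(np.r_[0, maxima, len(h) - 1],
                         np.r_[h[0], h[maxima], h[-1]])(t)
        lo = CubicSpline(np.r_[0, minima, len(h) - 1],
                         np.r_[h[0], h[minima], h[-1]])(t)
        h = h - (up + lo) / 2.0  # remove the local mean (the slower scale)
    return h

t = np.linspace(0, 1, 500)
# Fast 25 Hz oscillation riding on a slow 3 Hz wave.
signal = np.sin(2 * np.pi * 25 * t) + 0.5 * np.sin(2 * np.pi * 3 * t)
imf1 = sift_imf(signal)          # ~ the fast component (smallest period)
residue = signal - imf1          # ~ the slower component plus trend
```

Repeating the same sifting on the residue would yield IMF2, IMF3, and so on, with increasing average period, as the text describes.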
The concept of information granulation is to decompose a whole into small parts and then study the parts obtained from the decomposition; each part so divided is a grain. Put another way, an information grain is a set of elements that are similar, close, and indistinguishable, or that are combined by some function. Information granulation of time series is the basis for compressing the size of time-series data and using it for subsequent time-series analysis, interpretation, and modeling; its research framework is shown in Figure 3.

When the input signal of the network is the k-th training sample X_k, the output value of the j-th neuron after the nonlinear transformation of the i-th hidden-layer neuron is as follows:

X_k = Σ_{i=1, j=1} (T_k + b_ij X). (10)

To address the phenomenon of modal confusion in the empirical modal decomposition algorithm, Huang proposed the method of interruption detection. The specific idea is that, after each decomposition, the final decomposition result is analyzed and judged; if modal confusion is found, the result is filtered by selecting an appropriate interruption scale and then decomposed again. However, interruption detection is an a posteriori means of judgment, which may lead to the following situation: scales genuinely contained in the signal are incorrectly filtered out, and the adaptiveness of empirical modal decomposition itself is thus greatly weakened. In some specific cases, therefore, this interruption detection method has shortcomings that affect its inspection effect.

Granulating information on a time series mainly includes two steps: information grain division and information grain description. Information grain division divides the time series into several small subsequences, each of which is called an information grain; information grain description constructs a description method to effectively characterize the information grains obtained from the division. Through the information granulation operation on a time series, the research object can be abstracted from the low-level, fine-grained original high-dimensional time series to a high-level, coarse-grained, low-dimensional grain time series, and the constructed information grains can portray and reflect the local features of the original time-series data, which achieves efficient dimensionality reduction and lays the foundation for the subsequent data mining work. For the information granulation of time series, some scholars have conducted relevant research and achieved certain results.

Figure 3: A research framework for granulation of time-series information.

Figure 4: Interval information granulation of time series.

Combing the published research results, the existing research on time-series information granulation can be divided into two aspects: (1) time-axis information granulation of time series, that is, solving the problem of effectively dividing the time series into time windows and representing them; and (2) research on the domain information granulation of time series, that is, solving the problem of effectively dividing and representing the theoretical domain of the time series.
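The two-step procedure (grain division, then grain description) can be sketched as follows. The fixed-length window and the (min, mean, max) triple are illustrative assumptions corresponding to the simplest hard division and a triangular-style descriptor, not to the fluctuation-point or cloud-model methods proposed in this paper.

```python
import numpy as np

def granulate(series, window):
    """Information-grain division: split the series into fixed-length windows
    (the 'hard division' described above). Information-grain description:
    summarize each window by a (low, typical, high) triple."""
    n = len(series) // window
    grains = []
    for k in range(n):
        seg = series[k * window:(k + 1) * window]
        grains.append((float(np.min(seg)), float(np.mean(seg)), float(np.max(seg))))
    return grains

# Toy random-walk series standing in for a stock price series.
prices = np.cumsum(np.random.default_rng(1).normal(size=120))
grain_series = granulate(prices, window=12)
# 120 raw points become 10 grains: a 12-fold dimensionality reduction,
# while each grain still records the window's range and typical level.
```

Subsequent clustering or prediction then operates on `grain_series` instead of the raw points, which is the dimensionality-reduction step the text describes.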
,e problem of granulation, that is, to solve the problem of an eﬀective unbalanced data classiﬁcation is a key problem in the ﬁeld of division of time series’ domain representation. machine learning and data mining. From a realistic point of Time-axis information granulation of time series is to view, the distribution of data sets in a large number of divide the time series into some time windows according to classiﬁcation problems is unbalanced, and the importance of its change characteristics on the time axis according to some each category is diﬀerent; usually, sparse categories of data method, and the subsequence on each time window is are more worthy of study in a particular context. ,erefore, regarded as an information grain, and then the subsequence it is necessary to design the information granulation method on the divided time window is characterized eﬀectively. ,e of time series according to the changing characteristics of resulting interval information grain can achieve full cov- time series on the time axis so that the obtained information erage for the data samples on the time windows, as shown in grains have similar internal structures among themselves Figure 4. and distinguishable information grains from each other. ,e theoretical domain information granulation of time Real-life time-series data are usually characterized by high series is to divide the time series into several theoretical dimensionality and high noise, so it is crucial to eﬀectively domain intervals according to its variation characteristics on perform information granulation operations on time series the theoretical domain according to some method, and each to reduce the data size of time series and reduce the impact of theoretical domain interval is regarded as an information noise. grain, and then the divided theoretical domain intervals are ,e information granulation operation on time series in characterized eﬀectively. 
,e research on the theoretical terms of the time axis is essentially the same as the traditional domain information granulation of time series is mainly time-series dimensionality reduction representation divided into four types: the ﬁrst is the equal interval theo- method; both are to reasonably compress the data size while retical domain division method; the second is the equal keeping the important features of the original time series as frequency theoretical domain division method; the third is much as possible. However, the traditional time-series di- the clustering-based theoretical domain division method; mensionality reduction representation method does not and the fourth is the optimization theory-based theoretical compress the time series to a high degree and does not reﬂect domain division method. ,e main research methods of the structural characteristics of the time series well, thus time-series information granulation in terms of time axis are aﬀecting the eﬀectiveness of the subsequent analysis work. interval-based time-axis information granulation, cluster- ,e granular time series obtained after the information ing-based time-axis information granulation, and fuzzy set- granulation operation cannot directly participate in the data based time-axis information granulation. ,ese methods mining work of time series yet and need to combine the usually use a ﬁxed time interval to divide the time series, that characteristics of the information granular to propose the is, hard division, and then represent the subsequence (in- corresponding similarity measure before the subsequent formation grain) obtained after the division, ignoring the analysis calculation. 
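As an illustrative sketch (not the paper's implementation), the interval-based time-axis granulation described above can be realized by splitting a series into fixed-length windows and representing each window by a [min, max] interval grain that fully covers its samples; the function name and toy data are assumptions:

```python
import numpy as np

def interval_granulate(series, window):
    """Split a series into fixed-length time windows (hard division) and
    represent each window by an interval grain (min, max) that fully
    covers the samples in that window."""
    grains = []
    for start in range(0, len(series) - window + 1, window):
        seg = series[start:start + window]
        grains.append((float(np.min(seg)), float(np.max(seg))))
    return grains

# toy series: 12 points granulated into 3 interval grains of width 4
series = np.array([1.0, 2.0, 1.5, 3.0, 2.5, 2.0, 4.0, 3.5, 5.0, 4.5, 6.0, 5.5])
print(interval_granulate(series, 4))  # [(1.0, 3.0), (2.0, 4.0), (4.5, 6.0)]
```

Each grain is a low-dimensional stand-in for its window, which is exactly the compression the hard-division methods above perform.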
The most important decision tree in the multiclassification problem is established by combining granular computing with the Huffman tree. Exploiting the Huffman tree's property of minimizing the weighted path length, a sample can be attributed to its category in the shortest time: the Huffman model of multiclassification is constructed, the grains of the decision tree are analyzed, and granularity and the decision tree are used to construct a different multiclassifier for each grain. Finally, the global model is constructed. In solving the multiclassification problem, granular computing, time series, and support vector machines are combined, which not only inherits the advantages of each but also compensates for the disadvantages of each, yielding a synergistic enhancement effect.

5. Experimental Verification and Conclusions

5.1. Time Series Domain Division. In the multiclassification of textual problems, the huge amount of information is a major obstacle. First, the problem is granularized: using the text background knowledge, the text content is categorized as environment, computer, transportation, education, economy, military, sports, medicine, art, politics, and so on. The different disciplines are then considered as particles in a multilevel granular structure, and all these categories belong to the same granular layer. The particles in the layer above are combinations of particles with similar granularity characteristics; they are coarse particles relative to the lower layer, and the lower layer consists of fine particles relative to the upper layer. The processed weights sum to 1; for easier computation, the weights are multiplied by 1,000 to become integers. VC++ 6.0 is used to program the decision tree: the function CrtHuffTree(Huffnode ht[], int n) implements the tree construction, and the function code(Huffnode ht[], Huffcode hcd[], Huffcode ss, int n) implements the encoding.

Figure 5: Time-series domain partitioning based on the support vector machine class approach.

From Figure 5, it can be seen that the subintervals of the theoretical domain obtained by the support vector machine class method are consistent with the distribution characteristics of the data; that is, the divided subintervals are smaller in regions with dense data and larger in regions with sparse data.
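The paper builds its tree with VC++ routines (CrtHuffTree and code); as a hedged sketch of the same idea in Python, the following fragment builds a Huffman tree over integer class weights (weights normalized to 1 and scaled by 1,000, as in the text), so that heavier classes sit closer to the root and are reached with fewer classifier decisions. The function name and example weights are illustrative assumptions:

```python
import heapq
from itertools import count

def huffman_codes(class_weights):
    """Build a Huffman tree over integer class weights and return the
    binary code (root-to-leaf path) for each class; heavier classes get
    shorter codes, i.e. fewer decisions to reach their category."""
    tiebreak = count()  # unique counter keeps heap comparisons off the dicts
    heap = [(w, next(tiebreak), {label: ""}) for label, w in class_weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {k: "0" + v for k, v in left.items()}
        merged.update({k: "1" + v for k, v in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

# weights already normalized to sum to 1, then multiplied by 1,000 (as in the text)
weights = {"economy": 400, "sports": 300, "military": 200, "art": 100}
print(huffman_codes(weights))
```

The dominant class ("economy" here) receives a one-bit code, so in a one-classifier-per-tree-node scheme it is separated first, which is the "shortest weighted path" property the text relies on.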
However, there is a great difference in the amount of data contained in the subintervals of the theoretical domain obtained by this method: subintervals in regions with dense data contain a large amount of data, while subintervals in sparse regions contain little, so the subintervals are optimized using the information granulation method below.

From the decomposition results, the multiscale characteristics of the original time series can be initially observed. From IMF1 to IMF10, the IMF components gradually change from high-frequency to low-frequency vibrations. The series with higher vibration frequencies keep the same fluctuation frequency as the original series but differ somewhat in amplitude; they represent the short-term effects triggered by the normal fluctuations of the securities market and the occurrence of irregular events. The series with lower vibration frequencies vary relatively flatly, and each change represents a long-term impact triggered by a major event. The original series always fluctuates up and down around the residual term and shows an increasing trend in the long run.

5.2. Accuracy Comparison between Different Models. According to the results of the volume-price relationship model, the regression with the volume-price relationship as input is better, the prediction effect is better for the more volatile segments, and the volatility of the volume-price relationship is closely related to the volatility of the closing price of the stock. For different step sizes, the loss function of the training set decreases and finally converges; the loss function of the test set behaves well with step sizes 10 and 30, while with step size 50 it decreases and converges poorly. In terms of the final mean absolute error (MAE), the MAE of the training set is close to 0.04 for all step sizes, and the best test performance, 0.011784855, is achieved by the model with step size 30.

Refining the classification decision for the local neighborhood data distribution, adjusting the posterior probability estimates, and using rough set approximation theory to handle extreme distribution cases eliminate the uncertainty caused by the lack of rare-class data. After the reclassification decision based on the refined instance distribution, the dynamic mean-neighbor classification algorithm based on neighborhood rough sets can classify query instances more accurately.

In terms of training time, the longer the step-size window, the longer the training time. Based on the above conclusions, a window with step size 30 was chosen for the subsequent model tests. According to the loss-function curves of the technical indicator model, the loss functions of both the training and test sets decrease and finally converge.
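The step-size experiments above can be pictured with a small sliding-window sketch; this is an assumed reconstruction for illustration, not the authors' code. Larger steps advance the window further each time and therefore produce fewer (window, target) training pairs:

```python
import numpy as np

def make_windows(series, window, step):
    """Slice a series into (input window, next-value target) pairs,
    advancing the window start by `step` points each time; a larger
    step yields fewer, more coarsely spaced training samples."""
    X, y = [], []
    for start in range(0, len(series) - window, step):
        X.append(series[start:start + window])
        y.append(series[start + window])
    return np.array(X), np.array(y)

series = np.sin(np.linspace(0, 20, 300))  # stand-in for a price series
for step in (10, 30, 50):
    X, y = make_windows(series, window=30, step=step)
    print(step, X.shape, y.shape)  # step 10 -> 27 samples, 30 -> 9, 50 -> 6
```

Any regression model (SVM, neural network) can then be trained on X, y, and the per-step MAE compared as in the experiment above.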
According to the results of the technical indicator model, the test-set loss (MAE) with technical indicators as input is 0.01071587, slightly smaller than that of the volume-price relationship model, and the overall regression result is slightly better than that of the volume-price relationship model. However, the fitted curves show that the technical indicator model predicts the more volatile segments better and the less volatile segments more weakly.

Figure 6: Variation of loss-function values for each model.

For the interval prediction results, we compared the SVQR method and the ARIMA-SVM-GRC method. The coverage probabilities of both methods were above 95% and the bandwidths were below 30%, indicating that both methods reflect the interval prediction results well. Compared with the SVQR method, the FIG-SVQR method improves the coverage probability by 0.59% and reduces the bandwidth by 0.99%, which indicates that the method proposed in this section is significantly better than the SVQR method in terms of interval prediction.
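The coverage probability and bandwidth quoted above are standard interval-prediction metrics (often called PICP and PINAW); a minimal sketch with made-up data, assuming coverage is the fraction of observations falling inside the predicted interval and bandwidth is the mean interval width normalized by the observed range:

```python
import numpy as np

def interval_metrics(y_true, lower, upper):
    """Coverage probability (PICP): fraction of observations inside the
    predicted interval. Normalized bandwidth (PINAW): mean interval
    width divided by the range of the observations."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    picp = float(np.mean((y_true >= lower) & (y_true <= upper)))
    pinaw = float(np.mean(upper - lower) / (np.max(y_true) - np.min(y_true)))
    return picp, pinaw

y = np.array([10.0, 12.0, 11.0, 15.0, 14.0])
lo, hi = y - 1.0, y + 1.0
hi[3] = 14.5  # shrink one upper bound so that y[3] = 15.0 falls outside
picp, pinaw = interval_metrics(y, lo, hi)
print(picp, pinaw)  # 0.8 coverage (4 of 5 points inside)
```

A good interval predictor pushes PICP toward 1 while keeping PINAW small, which is the trade-off the 95%/30% figures above describe.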
The ARIMA-SVM-GRC method is applied to the normalized volume-price relationship and technical indicator data to extract their features, reducing the six-dimensional volume-price relationship data to three dimensions and the technical indicator data to three dimensions. According to the principal component ratios, the first principal component of the volume-price relationship accounts for the largest share and incorporates all the information of the sample, and the first and second principal components of the technical indicators account for about 50%. The change in the loss-function value of each model is shown in Figure 6.

5.3. Comparison of Model Error Scenarios. In terms of model error, the prediction error of the decomposition integration model is lower than that of the model under multiple factors, but the overall deviation is not significant.
Moreover, the decomposition integration model can improve the prediction accuracy of the neural network model as much as possible. Therefore, when the difference between the prediction results of the multifactor model and the decomposition integration model is small, the prediction results of the two models should be considered together; when the difference between the two forecasts is large, the model with the smaller error can be used for forecasting.

As shown in Figure 7, the ARIMA-SVM-GRC method increases the validity of the prediction results by performing deterministic prediction of runoff while also obtaining uncertainty information from the experimental data. The prediction interval can usually reflect the fluctuation of runoff, and the ARIMA-SVM-GRC method can handle discrete and nonlinear relationships. To further demonstrate the predictive power of the proposed ARIMA-SVM-GRC method, we compare it with other traditional data-driven methods in a cross-sectional manner.

Figure 7: Comparison of runoff prediction accuracy.

In this section, we conduct comparison experiments with the BP, RBF, and SVQR methods in terms of both point prediction and interval prediction. The BP and RBF algorithms give poor results for runoff prediction, with mean absolute percentage errors above 6%, while SVQR and ARIMA-SVM-GRC give smaller mean absolute percentage errors, both below 4%, the best being 2.7% for the SVQR method. For the relative mean squared error, both BP and RBF are above 3%, while both SVQR and ARIMA-SVM-GRC remain stable below 2%. The mean absolute error is greater than 30 for the BP and RBF methods and below 30 for the SVQR and ARIMA-SVM-GRC methods.

Figure 8: Comparison of model error scenarios.

From the comparison in Figure 8, it can be seen that CMS takes the longest time, mainly because CMS must consider both intra-attribute and inter-attribute similarity when computing the similarity of two objects; as introduced in Section 3, the algorithm has a high time complexity and therefore takes the longest time. The HM, OF, IOF, Eskin, and k-modes algorithms are the more classical algorithms.

After algorithm optimization, the running time of the ARIMA-SVM-GRC algorithm depends mainly on the baseline-scale clustering results and the scale conversion method, and it is affected by the number of clusters in the baseline-scale clustering results: the more clusters obtained, the longer the time required for scale up-projection, and the fewer clusters, the shorter the time, independent of the size of the original data set. Since the number of clusters obtained from the benchmark-scale clustering is much smaller than the sample size of the original data set, the ARIMA-SVM-GRC algorithm requires much less running time than the other comparison algorithms. Since the running time is affected by the experimental environment and the degree of algorithm optimization, the CMS and Eskin methods in the experiments are implemented directly from the formulas in the literature without optimization, so their running times are relatively long.

According to the dynamic mean query neighborhood, a more scientific and rigorous method is needed to calculate the local and global confidence intervals in the query neighborhood in order to determine the actual distribution of the minority-class samples in the neighborhood.
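The point-prediction comparison above relies on mean absolute percentage error and mean absolute error; a small self-contained sketch of these metrics follows (the model outputs are made-up stand-ins, not the paper's data):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))

def mae(y_true, y_pred):
    """Mean absolute error, in the original units of the series."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

y_true = np.array([100.0, 200.0, 400.0])
good = np.array([102.0, 196.0, 408.0])  # stand-in for an SVQR-like model
poor = np.array([90.0, 220.0, 360.0])   # stand-in for a BP/RBF-like model
print(mape(y_true, good), mape(y_true, poor))  # 2.0 vs 10.0 (percent)
```

MAPE is scale-free, so it supports cross-model comparisons like "above 6% versus below 4%", while MAE (the "greater than 30 / below 30" figures) stays in the units of the predicted quantity.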
6. Conclusions

With the development of technology and networks, multiclassification problems occupy an increasingly important position in people's lives, and they are becoming more and more difficult to solve as disturbing factors and the amount of data grow.

By studying the multiclassification problem, a support vector machine multiclassification model based on granular computing is proposed and combined with a time-series fluctuation prediction model to analyze and handle the multiclassification problem. In this paper, granular computation is incorporated into the data preprocessing model. The problems of large-scale data processing and low training speed can be effectively solved by using ideas such as the granulation and hierarchy of the granular computing triad, analyzed and transformed from the perspective of granular computing. In the multiclassification problem, whether the decision tree is constructed reasonably is crucial; using the granularity of granular computing combined with the Huffman tree to construct the optimal binary decision tree makes it possible to obtain the classification in the shortest time and solves the problem of uneven samples within a class, which provides a practical method for the analysis of multiclassification problems.

In terms of the time axis, a method of granulating time-series information based on fluctuation points is proposed for the structural characteristics of low-frequency time series. The key to this method is the definition and identification of fluctuation points in the time series: fluctuation points are identified by operating on the original series, the information grains are divided at the fluctuation points, and each divided grain is then described by a linear function, completing the information granulation operation and transforming the original time series into a granular time series.

Since the number of information grains and the corresponding time-window sizes differ between granular time series, a new similarity measure based on linear information granulation is proposed to facilitate subsequent data mining on granular time series. First, to ensure a one-to-one correspondence between the linear information grains of different time series, a segmentation matching algorithm for linear information grains is proposed; second, for the matched linear information grains, a corresponding similarity metric algorithm is proposed.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest in this article.

Acknowledgments

This work was supported by the projects of the Ningxia Natural Science Foundation (No. 2022AAC03315, time-series analysis and method research based on granular computing; No. 2022AAC03328, research on vegetable supply chain traceability based on combined RFID technology and service-oriented architecture (SOA); No. 2021AAC03235, fusion method of interest target detection in low-illumination visible and infrared images; No. 2022AAC03314, high-precision numerical simulation for incompressible magnetohydrodynamic problems; and No. 2022AAC03301, high-order difference schemes on an adaptive algorithm for convection-diffusion-reaction equations).
Journal of Robotics – Hindawi Publishing Corporation
Published: Apr 16, 2022