Mathematical and Computer Modelling of Dynamical Systems, Vol. 11, No. 2, June 2005, 159-169

Optimal and Suboptimal Algorithms in Set Membership Identification

BOLESŁAW KACEWICZ
Department of Applied Mathematics, AGH University of Science and Technology, Al. Mickiewicza 30, A3/A4, 301, 30-059 Cracow, Poland.

We discuss in this paper optimality properties of identification algorithms in a set membership framework. We deal with restricted-complexity (conditional) identification, where approximations (models) to a possibly complex system are selected from a low-dimensional space. We discuss the worst- and average-case settings. In the worst-case setting, we present results on optimality, or suboptimality, of algorithms based on computing the unconditional or conditional Chebyshev centres of an uncertainty set. In the average-case setting, we show that the optimal algorithm is given by the projection of the unconditional Chebyshev centre. We show explicit formulas for its average errors, allowing us to see the contribution of all problem parameters to the minimal error. We discuss the case of weighted average errors corresponding to non-uniform distributions over uncertainty sets, and show how the weights influence the minimal identification error.

Keywords: Identification, worst-case setting, average-case setting, weights, optimal algorithm, suboptimal algorithm.

1. Introduction

The idea behind the set membership approach is to use a priori knowledge about the problem being solved, together with the measured (or computed, or observed) information about unknown quantities, to define a feasible set of elements consistent with the available information. A number of results are devoted to the characterization of feasible sets in system identification, see [1-13] and the references therein. In many cases, such as, e.g., in restricted-complexity estimation, the desired estimate is often required to be a member of a set of given simple structure.
In this case estimation is referred to as conditional [6].

In this paper, we present results obtained within a set membership framework in two settings: the worst-case and average-case settings. In the worst-case setting, the error of an algorithm is measured by the worst performance for problems consistent with given information, see e.g. [1-6], [9-11]. A conservative point of view is thus chosen in the worst-case analysis: exceptionally difficult elements, even if they appear rarely, are taken into account. The optimal algorithm in this setting is defined by the conditional Chebyshev centre of the feasible set. The conditional Chebyshev centre is in general difficult to compute, even for sets of a simple structure [3]. For this reason, we turn our attention to suboptimal, but easy-to-compute estimators. To this end, we discuss, based on [2], the reliability level of the projection algorithm.

Compared to the worst-case approach, the average-case setting is less conservative. The error of an algorithm is now defined on the average, which excludes outliers [7]. The optimal algorithm turns out to be the projection algorithm whose suboptimality has been established in the worst-case setting. We provide formulas for its two local average errors, which explicitly exhibit the contribution of factors such as information, measurement errors, and the restrictions imposed on the models. In addition to the uniform distribution over uncertainty sets considered in [7], we discuss, based on [8], weighted average errors for a class of weight functions. Results are given that explain the contribution of the weights to the minimal errors.

*Email: kacewicz@uci.agh.edu.pl
(Mathematical and Computer Modelling of Dynamical Systems, ISSN 1387-3954 print / ISSN 1744-5051 online, © 2005 Taylor & Francis Group Ltd, DOI: 10.1080/13873950500068575)
The contribution is through certain integral parameters that can be computed, or estimated, for a given selection of the weights. An example is given illustrating what the estimators and the errors look like for a particular identification problem.

2. Problem formulation

We wish to identify a system x belonging to the system space X = R^n. We look for a model (an approximation) to x, having at our disposal information about x given by a vector of measurements y which belongs to the information space Y = R^m, with m ≥ n. The spaces X and Y are equipped with the norms ||x||_X = ||x|| and ||y||_Y = ||Ay||, where A is a non-singular m×m matrix, and || · || denotes the Euclidean norm in X or Y, with

  ||c|| = ( Σ_i c_i² )^{1/2},

for any vector c with components c_i. Given a full-rank m×n matrix U (called the information matrix), the vector y is given as a perturbed value of Ux,

  ||y − Ux||_Y ≤ ε,   (1)

where ε > 0 is the accuracy of computing (or measuring) information.

For example, the information vector may be obtained as follows. For the input sequence u = [u_0, u_1, ..., u_{m−1}] with u_0 ≠ 0, the components of y can be given as output measurements related to u by

  y_l = Σ_{k=1}^{l} x_k u_{l−k} + e_l,  l = 1, 2, ..., m,   (2)

where e_l is an error corrupting the output (we set x_k = 0 for k > n). Some normalization condition is imposed on the sequence u, saying, for instance, that a norm of u does not exceed 1. Hence,

  y = Ux + e,   (3)

where e = [e_1, e_2, ..., e_m]^T is assumed to satisfy ||e||_Y ≤ ε. The matrix U in this case is the m×n Toeplitz-type matrix

  U = [ u_0      0        0    ...  0
        u_1      u_0      0    ...  0
        u_2      u_1      u_0  ...  0
        ...      ...      ...       ...
        u_{n−1}  u_{n−2}  ...       u_0
        u_n      u_{n−1}  ...       u_1
        ...      ...      ...       ...
        u_{m−1}  u_{m−2}  ...       u_{m−n} ].   (4)

Based on y, we compute an approximation to x using an algorithm, by which we mean any mapping

  f : Y → M,

where M is a given subset of X.
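The construction above can be sketched numerically. The following snippet (an illustrative sketch, not from the paper; the input sequence, system, and noise level are invented for demonstration) builds the Toeplitz-type matrix U of (4) and simulates the measurements y = Ux + e of (3):

```python
# Sketch: building the Toeplitz-type information matrix U of (4)
# and simulating noisy measurements y = U x + e as in (3).
import numpy as np

def information_matrix(u, n):
    """m x n matrix with U[l, k] = u[l - k] for l >= k, and 0 otherwise."""
    m = len(u)
    U = np.zeros((m, n))
    for l in range(m):
        for k in range(n):
            if l - k >= 0:
                U[l, k] = u[l - k]
    return U

u = np.array([1.0, 0.5, 0.25, 0.125, 0.0625])   # example input, u_0 != 0
n = 3
U = information_matrix(u, n)

x = np.array([2.0, -1.0, 0.5])                  # "true" system (illustrative)
rng = np.random.default_rng(0)
e = rng.normal(scale=1e-3, size=len(u))         # output errors
eps = np.linalg.norm(e)                         # realised noise level
y = U @ x + e                                   # measurements, ||y - Ux|| <= eps
```

With A = I, condition (1) holds by construction with the realised eps.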
An approximation to x is provided by the vector f(y) ∈ M. If M is a proper subset of the system space X, then identification is called conditional; see, e.g., [2], [6]. The choice of M is usually aimed at restricting the complexity of the approximating models. Identification is unconditional if M = X.

3. Worst-case setting

In this setting, motivated by the aim of constructing robust algorithms, the behaviour of an approximant is studied for the worst system compatible with computed information. For a given measurement y, the error of an algorithm f, the y-local worst-case error, is defined by

  e_y^{worst}(f) = sup_{x ∈ E_y} ||x − f(y)||_X,   (5)

where the set E_y (assumed non-empty) contains all systems consistent with the information y,

  E_y = { x ∈ X : ||Ux − y||_Y ≤ ε }.   (6)

The error (5) measures the worst uncertainty that an algorithm may produce for a given measurement y.

If the estimate selection is unconstrained, the best choice of z = f(y) is given by the Chebyshev centre of the set E_y. (This remains true for an arbitrary set E_y, not necessarily given by (6), and any norm in X.) Indeed, the Chebyshev centre of E_y is defined as an element c(E_y) that minimizes the maximal distance from elements of E_y among all z in X,

  c(E_y) = arg min_{z ∈ X} max_{x ∈ E_y} ||z − x||.   (7)

The algorithm defined by f(y) = c(E_y) is optimal, and its worst-case error is equal to the geometric radius of E_y,

  e_y^{worst}(f) = sup_{x ∈ E_y} ||x − c(E_y)||_X ( = rad(E_y) ).   (8)

In many cases of interest, the Chebyshev centre is readily available. For instance, the uncertainty set E_y in (6) is an ellipsoid in R^n with the centre (Chebyshev centre) given by

  c(E_y) = (Ū^T Ū)^{-1} Ū^T ȳ,

where Ū = AU and ȳ = Ay. Computational aspects of the optimality problem are more complicated for conditional identification, even for simple uncertainty sets.
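For the ellipsoidal set (6), the Chebyshev centre above is just the least-squares estimate. A minimal numerical sketch (taking A = I for simplicity; the data are illustrative assumptions):

```python
# Sketch: the unconditional Chebyshev centre of the ellipsoid (6),
# c(E_y) = (Ub^T Ub)^{-1} Ub^T yb with Ub = A U, yb = A y.
import numpy as np

def chebyshev_centre(U, y, A=None):
    if A is None:
        A = np.eye(len(y))
    Ub, yb = A @ U, A @ y
    c, *_ = np.linalg.lstsq(Ub, yb, rcond=None)   # least-squares solution
    return c

U = np.array([[1.0, 0.0],
              [0.5, 1.0],
              [0.25, 0.5]])
y = np.array([2.0, 0.1, 0.4])
c = chebyshev_centre(U, y)
```

The result agrees with solving the normal equations U^T U c = U^T y directly.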
The optimal algorithm, denoted by f^{opt}, is now defined by the conditional Chebyshev centre, a minimizer selected among admissible models in the set M:

  c^{cond}(E_y) = arg min_{z ∈ M} max_{x ∈ E_y} ||z − x||.   (9)

If a linear parametrization of the selected estimate is desired, then the set M is defined as a linear p-dimensional manifold in R^n (p ≤ n). The computation of the conditional Chebyshev centre in this case, even for such a simple set as an ellipsoid, is not a straightforward task; see [3] for an iterative method for finding c^{cond}(E_y).

A simple idea to overcome the difficulties with computing the conditional Chebyshev centre is to take the optimal unconditional estimate given by the unconditional Chebyshev centre c(E_y) (which is easy to compute for many sets E_y), and to find its projection onto the manifold M. The algorithm obtained in this way is denoted by f^{proj}. It is quite obvious that this algorithm is in general not optimal, and the question is: how much do we lose with respect to the optimal one? Is the projection algorithm suboptimal within a reasonable margin? The answer to these questions is given in the following result [2], which holds for much more general uncertainty sets than ellipsoids. We only assume that

  the set of feasible elements is a compact subset of R^n,   (10)

and that M is a p-dimensional linear manifold,

  M = { x ∈ R^n : x = z^0 + a_1 z^1 + ... + a_p z^p },   (11)

where a_i ∈ R, z^i ∈ R^n, (z^i)^T z^j = 0 for i, j = 0, 1, ..., p, i ≠ j, and ||z^i|| = 1 for i = 1, 2, ..., p. In [2], the following theorem has been shown.

Theorem 1. If (10) and (11) hold, then

  e_y^{worst}(f^{proj}) ≤ √(4/3) · e_y^{worst}(f^{opt}).   (12)

Furthermore, the constant √(4/3) cannot be improved.

The theorem ensures that the projection algorithm, although not optimal, is suboptimal within a reasonable constant of √(4/3) ≈ 1.155, i.e., less than 16% of accuracy is lost with respect to the accuracy of the optimal algorithm. This holds independently of the dimensions n and p.
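The projection onto the manifold (11) used by f^{proj} is itself elementary. A sketch (the vectors below are illustrative assumptions, chosen to satisfy the orthogonality conditions of (11)):

```python
# Sketch: orthogonal projection onto the linear manifold (11),
# M = { z0 + a_1 z^1 + ... + a_p z^p }, with the z^i stored as
# orthonormal columns of a matrix Z (and z0 orthogonal to them).
import numpy as np

def project_onto_manifold(v, z0, Z):
    """Project v onto the affine manifold { z0 + Z a : a in R^p }."""
    return z0 + Z @ (Z.T @ (v - z0))

z0 = np.array([0.0, 0.0, 1.0])
Z = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])            # orthonormal directions z^1, z^2
v = np.array([3.0, 4.0, 5.0])
p = project_onto_manifold(v, z0, Z)   # -> [3, 4, 1]
```

The residual v − p is orthogonal to every direction z^i, as required of an orthogonal projection.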
The second statement of the theorem follows from the following example [2]. Let n = 2,

  E_y = { (x_1, x_2) ∈ R² : x_2 = x_1/√2 + 1, |x_1| ≤ √2 },

and M = { (x_1, x_2) ∈ R² : x_2 = 0 }. Then c(E_y) = (0, 1), the projection of it onto M is (0, 0), and c^{cond}(E_y) = (1/√2, 0). The error of the projection algorithm is equal to √6, while the optimal error is 3/√2. The ratio is exactly √(4/3).

4. Average-case setting

The worst-case analysis is necessary when robust approximations are of interest. If, however, robustness is not our main concern, the worst-case error criterion may be viewed as conservative. In [7] a less conservative error measure has been analysed, which reflects the average performance of identification algorithms.

We remain in the set membership framework, and assume, similarly to the worst-case setting, that systems consistent with a measurement y belong to the set E_y given in (6). The basic quantity is now the y-local average error of an algorithm, defined by

  e_y^{aver}(f) = ( (1/|E_y|) ∫_{E_y} ||x − f(y)||_X² dx )^{1/2},   (13)

where |A| is the Lebesgue measure of a set A and the integral is the Lebesgue integral in R^n. In contrast to (5), the error of an algorithm is now measured by its mean-square (not the worst-case) behaviour on E_y.

In addition to (13), the other error measure has also been considered in [7]. It is sometimes desirable to assess the behaviour of an algorithm for a fixed system x (not for a fixed measurement y). Based on the errors { e_y^{aver}(f) } for x, we define the x-local average error, which describes the behaviour of an algorithm on the set of all measurements consistent with a fixed system x,

  E^x = { y ∈ Y : ||y − Ux||_Y ≤ ε }.

The x-local average error is given by

  e^{x,aver}(f) = ( (1/|E^x|) ∫_{E^x} e_y^{aver}(f)² dy )^{1/2},   (14)

for a measurable function f.
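Returning for a moment to the two-dimensional worst-case example of Section 3, its error values can be checked numerically. The uncertainty set there is a segment, and the worst-case error of a point z is the maximal distance from z to the segment (attained at an endpoint, since the distance is convex along the segment):

```python
# Sketch: numerical check of the two-dimensional example from Section 3.
# E_y is the segment from (-sqrt(2), 0) to (sqrt(2), 2).
import numpy as np

a = np.array([-np.sqrt(2), 0.0])
b = np.array([ np.sqrt(2), 2.0])
ts = np.linspace(0.0, 1.0, 2001)
pts = a[None, :] + ts[:, None] * (b - a)[None, :]   # points of E_y

def worst_error(z):
    return np.max(np.linalg.norm(pts - z, axis=1))

e_proj = worst_error(np.array([0.0, 0.0]))            # projection of c(E_y) = (0, 1)
e_opt  = worst_error(np.array([1/np.sqrt(2), 0.0]))   # conditional Chebyshev centre
ratio = e_proj / e_opt                                # should be sqrt(4/3)
```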
The following result provides the formula for the y-local error of an arbitrary algorithm f, see [7]. Let K_n = { t ∈ R^n : ||t|| ≤ 1 } be the unit ball in the standard Euclidean norm, let

  ε_y = ( ε² − ȳ^T ( I − Ū (Ū^T Ū)^{-1} Ū^T ) ȳ )^{1/2}

(for E_y ≠ ∅, the real non-negative number ε_y is well defined and ε_y ≤ ε), and let B be any n×n matrix such that B^T B = Ū^T Ū.

Theorem 2. For any M ⊆ R^n and f,

  e_y^{aver}(f)² = || f(y) − (Ū^T Ū)^{-1} Ū^T ȳ ||² + ( ε_y² / |K_n| ) ∫_{K_n} || B^{-1} t ||² dt.   (15)

Hence, the optimal algorithm f^{opt} is defined by

  f^{opt}(y) = the projection of (Ū^T Ū)^{-1} Ū^T ȳ onto M   (16)

(if the projection is well defined). This is the projection of the Chebyshev centre of E_y onto M. Unlike in the worst-case setting, the optimal algorithm in the average-case setting coincides with the projection algorithm. Its error is given in the following result [7].

Theorem 3.

  e_y^{aver}(f^{opt})² = dist²( (Ū^T Ū)^{-1} Ū^T ȳ, M ) + ( ε_y² / (n + 2) ) tr (Ū^T Ū)^{-1}.   (17)

For a linear manifold M in (11), the x-local error of the optimal algorithm is given in the next result, see [7]. Let M = [z^1, ..., z^p] be the orthogonal matrix with column vectors that span the space M.

Theorem 4. Let M be given by (11). Then

  e^{x,aver}(f^{opt})² = dist(x, M)² + ( ε² / (m + 2) ) ( tr (Ū^T Ū)^{-1} + tr( (I − MM^T)(Ū^T Ū)^{-1}(I − MM^T) ) ).   (18)

This formula explicitly shows the influence on the minimal error of factors such as the number of measurements m, the information matrix U, the accuracy ε of measuring information, the matrix M representing the subspace M, and the unmodelled-dynamics term dist(x, M). For instance, we can now formulate, in algebraic language, criteria for the optimal selection of the information matrix or the subspace M to minimize the error of the optimal algorithm, see [7].

5. Average-case setting with weights

The average errors considered in the previous section are defined under the assumption that the distribution of elements in the feasible sets E_y and E^x is uniform.
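Before introducing weights, the optimal average-case estimator (16) and its y-local error (17) from the previous section can be sketched as code (assuming A = I, and M given as in (11) by a point z0 and orthonormal direction matrix Z; the small check at the end uses invented data that can be verified by hand):

```python
# Sketch: the optimal average-case algorithm (16) and the squared
# y-local error of formula (17), with A = I.
import numpy as np

def f_opt(U, y, z0, Z):
    G = U.T @ U
    ls = np.linalg.solve(G, U.T @ y)        # least-squares estimate
    return z0 + Z @ (Z.T @ (ls - z0)), ls, G

def y_local_error_sq(U, y, eps, z0, Z):
    f, ls, G = f_opt(U, y, z0, Z)
    n = U.shape[1]
    eps_y_sq = eps**2 - y @ (y - U @ ls)    # eps_y^2 from Section 4
    dist_sq = np.sum((ls - f)**2)           # squared distance of ls to M
    return dist_sq + eps_y_sq / (n + 2) * np.trace(np.linalg.inv(G))

# Hand-checkable case: U = 2 I_2, y = (2, 4), eps = 1, M = span{e1}.
# Then ls = (1, 2), f = (1, 0), eps_y = 1, and the error is 4 + 0.5/4.
U = 2.0 * np.eye(2)
y = np.array([2.0, 4.0])
z0 = np.zeros(2)
Z = np.array([[1.0], [0.0]])
err_sq = y_local_error_sq(U, y, 1.0, z0, Z)   # -> 4.125
```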
This is not a bad assumption when no additional information concerning the distribution is available. In many cases, however, such knowledge is available, and other distributions of measurements and problem elements, such as, for instance, the Gaussian distribution, may better describe the problem. The question arises: how far do the results on optimal estimation given in the previous section rely on the assumption of a uniform distribution? The results given in this section, based on [8], provide the answer for a wide class of distributions.

Let us generalize the notion of the average error, and define the weighted y-local average error. Assume for a fixed y that a non-negative Lebesgue-integrable function p_y : E_y → R is given such that μ_y(E_y) = ∫_{E_y} p_y(x) dx > 0. The weighted y-local average error of an algorithm f is defined by

  e_y^{aver}(f) = ( (1/μ_y(E_y)) ∫_{E_y} ||x − f(y)||_X² p_y(x) dx )^{1/2}.   (19)

This error criterion reflects the average behaviour of an algorithm with respect to a weighted Lebesgue measure concentrated on E_y.

To define the weighted x-local error, we assume that a non-negative Lebesgue-integrable function q_x : E^x → R is given such that μ_x(E^x) = ∫_{E^x} q_x(y) dy > 0. For a measurable function f, we define the weighted x-local average error of f by

  e^{x,aver}(f) = ( (1/μ_x(E^x)) ∫_{E^x} e_y^{aver}(f)² q_x(y) dy )^{1/2}.   (20)

Weight functions p_y and q_x under consideration will be defined by transferring some basic functions p̂ and q̂ to the proper sets as follows. Let p̂ : K_n → R be a non-negative Lebesgue-integrable function with κ_n(p̂) = ∫_{K_n} p̂(t) dt > 0. To define p_y, we express the set E_y in an equivalent form. Recall that an n×n non-singular matrix B is such that B^T B = Ū^T Ū. Defining b_y = (B^T)^{-1} Ū^T ȳ, we have that ||Ux − y||_Y ≤ ε iff ||Bx − b_y|| ≤ ε_y, where ε_y is defined in the previous section.
Since we wish to study uncertain information, for which the unique recovery of x from the knowledge of y is not possible, we assume that E_y is not a singleton, i.e., ε_y > 0. We define the weight function p_y by

  p_y(x) = p̂( (Bx − b_y)/ε_y ) for x ∈ E_y.   (21)

The function p_y(x) depends on the matrix B; a possible choice is B = (Ū^T Ū)^{1/2}. We shall see that the optimal algorithm is independent of the selection of the matrix B. For a wide class of weight functions the same holds true for the optimal error. Similarly to p_y, we define the function q_x by

  q_x(y) = q̂( A(y − Ux)/ε ), y ∈ E^x,   (22)

where q̂ : K_m → R is a non-negative integrable function with κ_m(q̂) = ∫_{K_m} q̂(t) dt > 0.

The choice of weight functions is not an easy task. Selecting the weights in fact provides additional a priori information about the problem, which should reflect our knowledge about it. A proper weight selection allows us to include relevant a priori information in the problem formulation. For a discussion of the problem of measure selection in a general average setting, the reader is referred to [12], p. 68. We shall see that in many interesting cases the optimal algorithm is independent of the weights. The error of the optimal algorithm is weight-dependent, and the dependence will be shown explicitly in terms of certain parameters that are easy to compute for any selection of the weights.

We now restrict ourselves to isotropic weight functions; for a discussion of the more general case, the reader is referred to [8]. We assume that the basic weight functions p̂ and q̂ take constant values on spheres in the respective spaces.
Letting w and w̄ be real non-negative continuous functions on [0, 1], we define

  p̂(t) = w(||t||²) for t ∈ K_n and q̂(t) = w̄(||t||²) for t ∈ K_m.   (23)

First of all, it is possible to show [8] that, similarly to the case of uniform distributions, the optimal algorithm f^{opt} is given by the projection of the least-squares approximation onto M,

  f^{opt}(y) = the projection of (Ū^T Ū)^{-1} Ū^T ȳ onto M,   (24)

provided that the projection is well defined. (This holds for any symmetric weight function p̂.) Hence, f^{opt} does not depend on the weights. The following result shown in [8] gives explicit formulas for the errors of the optimal algorithm, and explains the influence of the weights. Let a_n and a_m be parameters depending on the weights, given by

  a_n = ( (n + 2) I_{n+1} ) / ( n I_{n−1} ), where I_n = I_n(w) = ∫_0^1 w(r²) r^n dr.

A similar formula describes the numbers a_m, with w replaced by w̄ (and n by m). The following theorem holds.

Theorem 5. For weight functions given by (23), we have

  e_y^{aver}(f^{opt})² = dist²( (Ū^T Ū)^{-1} Ū^T ȳ, M ) + ( a_n ε_y² / (n + 2) ) tr (Ū^T Ū)^{-1},   (25)

and

  e^{x,aver}(f^{opt})² = dist(x, M)² + ε² a_n ( (1 − a_m)/(n + 2) + a_m/(m + 2) ) tr (Ū^T Ū)^{-1}
      + ( a_m ε² / (m + 2) ) tr( (I − MM^T)(Ū^T Ū)^{-1}(I − MM^T) ).   (26)

The errors of the optimal algorithm depend on the weights through the parameters a_n and a_m that can be computed (or estimated) for each particular choice of the weights. Bounds on these parameters for three classes of weight functions (continuous, differentiable and truncated Gaussian functions) can be found in [8]. The case of uniform distributions over E_y and E^x corresponds to p̂(t) ≡ 1 and q̂(t) ≡ 1, that is, to w(t) ≡ 1 and w̄(t) ≡ 1. Hence, in this case a_n = 1 and a_m = 1, and (25) and (26) agree with (17) and (18). Note that for isotropic weight functions the error depends on the matrix Ū, but is independent of the choice of the matrix B in (21). Following [8], we illustrate the results given above for an exemplary problem.
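The parameters a_n are straightforward to evaluate numerically for a given weight. A sketch (illustrative; the trapezoid rule and the two sample weights are our choices, not the paper's code) that checks a_n = 1 for the uniform case and a_n = (n + 4)/(n + 6) for w(r) = 1 − r²:

```python
# Sketch: numerical evaluation of a_n = (n+2) I_{n+1}(w) / (n I_{n-1}(w)),
# with I_j(w) = \int_0^1 w(r^2) r^j dr, via the trapezoid rule.
import numpy as np

def I(w, j, num=200001):
    r = np.linspace(0.0, 1.0, num)
    v = w(r**2) * r**j
    return float(np.sum((v[1:] + v[:-1]) / 2) * (r[1] - r[0]))

def a_param(w, n):
    return (n + 2) * I(w, n + 1) / (n * I(w, n - 1))

n = 4
a_uniform = a_param(lambda s: np.ones_like(s), n)   # uniform weight: a_n = 1
a_quadr   = a_param(lambda s: 1.0 - s**2, n)        # w(r) = 1 - r^2: a_n = (n+4)/(n+6)
```

For n = 4 the second value should be close to 8/10 = 0.8.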
Example. Let A = I_m, where I_m is the m×m identity matrix, i.e., we consider the standard Euclidean norm in the space Y. Let information be given by k repeated measurements of all components of the system x. That is, the information matrix is given by U = (I_n, ..., I_n)^T, with I_n appearing k times, and exact information is the m×1 vector Ux = [x^T, ..., x^T]^T, with x appearing k times. The total number of measurements equals m = k·n.

Assume that we are able to measure components of x in such a way that the result obtained, an m×1 vector y = [(y¹)^T, ..., (y^k)^T]^T with k vector components y^i, has mean-squared error at most ε,

  ( (1/k) Σ_{i=1}^k ||y^i − x||² )^{1/2} ≤ ε.   (27)

The uncertainty set E_y thus has the form

  E_y = { x : Σ_{i=1}^k ||y^i − x||² ≤ k ε² }   (28)

(so that the accuracy bound in (1) becomes √k ε). One can easily verify that U^T U = k I_n, B = (U^T U)^{1/2} = √k I_n, and b_y = (1/√k) Σ_{i=1}^k y^i. Hence, E_y can be written as

  E_y = { x : || x − (1/k) Σ_{i=1}^k y^i || ≤ (1/√k) ε_y },   (29)

where

  ε_y² = (1/k) ( k² ε² − (k − 1) Σ_{i=1}^k ||y^i||² + Σ_{i,j=1, i≠j}^k (y^i)^T y^j ).

The optimal algorithm is given by the projection onto M of the arithmetic mean of the y^i,

  f^{opt}(y) = the projection onto M of the vector (U^T U)^{-1} U^T y = (1/k) Σ_{i=1}^k y^i.   (30)

We show what the formula (26) looks like for a particular choice of weight functions and a subspace M. Let n ≥ 2, let the weights be defined by the functions w(r) = w̄(r) = 1 − r², and let M = { x ∈ R^n : x_3 = ... = x_n = 0 }. Then I_n(w) = I_n(w̄) = 4/((n + 1)(n + 5)), and a_n = (n + 4)/(n + 6), a_m = (m + 4)/(m + 6).
We have that

  e^{x,aver}(f^{opt})² = dist(x, M)²
      + ( n(n + 4)/(n + 6) ) ε² ( 2/((n + 2)(m + 6)) + (m + 4)/((m + 2)(m + 6)) )
      + ( (m + 4)/((m + 2)(m + 6)) ) k ε² tr( (I_n − MM^T)(U^T U)^{-1}(I_n − MM^T) ).   (31)

Since dist(x, M)² = Σ_{i=3}^n x_i² (it is equal to zero if n = 2), and

  tr( (I_n − MM^T)(U^T U)^{-1}(I_n − MM^T) ) = (n − 2)/k,

the error of f^{opt} can be expressed as

  e^{x,aver}(f^{opt})² = Σ_{i=3}^n x_i² + h(n, k),   (32)

where

  h(n, k) = (ε²/k) ( 2n(n + 4) / ((n + 2)(n + 6)(n + 6/k))
      + n(n + 4)(n + 4/k) / ((n + 6)(n + 2/k)(n + 6/k))
      + (n − 2)(n + 4/k) / ((n + 2/k)(n + 6/k)) ).   (33)

The formula (32) shows the dependence of the average error on ε and k. The dependence of h(n, k) on k is not crucial. For a fixed n, when ε goes to zero or k goes to infinity, the error tends to the unmodelled-dynamics error Σ_{i=3}^n x_i² (whose appearance cannot be avoided, since the identification is conditional), essentially as ε²/k.

For further illustration, we specify the example above for a simple problem with exemplary data. We wish to identify a system x depending on three parameters, x = [x_1, x_2, x_3]^T (n = 3). We measure each component of x three times (k = 3) in such a way that (27) holds, the precision of the measurement device being ε² = 8·10^{-3}. Let the measured vectors be

  y¹ = [5.035, −2.955, 0.003]^T,
  y² = [5.003, −3.02, 0.01]^T,
  y³ = [4.993, −2.867, −0.005]^T.

The least-squares approximation is then given by

  (1/3) Σ_{i=1}^3 y^i = [5.01033, −2.94733, 0.00267]^T.

For the subspace M defined as above, the optimal (on the average) approximation is defined by

  f^{opt}(y) = [5.01033, −2.94733, 0]^T,

and has the y-local error

  e_y^{aver}(f^{opt})² = 1.739·10^{-3}.

The x-local error for an exemplary x consistent with the measurements y^i, given by x = [5, −3, 0]^T, is equal to

  e^{x,aver}(f^{opt})² = 2.599·10^{-3}

(since x is in M, dist(x, M) = 0), while the squared distance ||x − f^{opt}(y)||² is 2.881·10^{-3} (as we see, the actual distance is here greater than the average distance).
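The numbers in this example can be reproduced directly from formulas (25) and (32)-(33). The sketch below (our code, not the paper's; it assumes ε² = 8·10⁻³, n = k = 3, the weights w(r) = w̄(r) = 1 − r², and M = { x ∈ R³ : x_3 = 0 } as in the text) recovers all three values:

```python
# Sketch: reproducing the exemplary numbers via (25) and (32)-(33).
import numpy as np

n, k = 3, 3
eps_sq = 8e-3                                  # squared device precision
ys = np.array([[5.035, -2.955,  0.003],
               [5.003, -3.02,   0.01 ],
               [4.993, -2.867, -0.005]])
yhat = ys.mean(axis=0)                         # least-squares estimate (1/k) sum y^i
f = np.array([yhat[0], yhat[1], 0.0])          # projection onto M = {x_3 = 0}

# y-local error (25): a_n = (n+4)/(n+6) for w(r) = 1 - r^2,
# eps_y^2 = k*eps^2 - sum_i ||y^i - yhat||^2, and tr (U^T U)^{-1} = n/k.
a_n = (n + 4) / (n + 6)
eps_y_sq = k * eps_sq - np.sum((ys - yhat)**2)
e_y_sq = yhat[2]**2 + a_n * eps_y_sq / (n + 2) * (n / k)

# x-local error (32)-(33) for x = [5, -3, 0]^T, where dist(x, M) = 0.
h = (eps_sq / k) * (2*n*(n + 4) / ((n + 2)*(n + 6)*(n + 6/k))
                    + n*(n + 4)*(n + 4/k) / ((n + 6)*(n + 2/k)*(n + 6/k))
                    + (n - 2)*(n + 4/k) / ((n + 2/k)*(n + 6/k)))

x = np.array([5.0, -3.0, 0.0])
dist_sq = np.sum((x - f)**2)                   # actual squared distance to f_opt(y)
```

The computed values agree with 1.739·10⁻³, 2.599·10⁻³, and 2.881·10⁻³ to the displayed precision.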
Let us finally note that the formula (32) suggests an optimization problem concerning the experiment design. For a given positive δ, we wish to reduce the second term in (32) to be of order δ (neglecting the mild dependence of h(n, k) on k). In order to achieve this, we can perform, for any k, k (vector) measurements with accuracy ε = √(kδ). In reasonable models of experimental cost, the cost should increase with an increasing number of experiments, k, and it should decrease if we relax the accuracy, i.e., if we increase ε (which is the case when k grows). The question is to find an optimal k that minimizes the cost.

6. Conclusions

We have considered the performance of conditional estimators in set membership identification in the worst-case and average-case settings. Optimal estimators in the worst-case setting are in general difficult to compute, and suboptimal estimators such as the projection estimator are often considered. We have shown that the error of the projection algorithm never exceeds √(4/3) times the error of the optimal algorithm. In the average-case setting, the projection algorithm turns out to be optimal. Explicit formulas for its local errors are given, showing the influence of the problem parameters on the minimal identification error. The case of weighted average errors is discussed, and the influence of the weights is explicitly shown.

Acknowledgment

This research was partly supported by local AGH grant No. 10.420.03.

References

[1] Chernousko, F.L., 1994, State Estimation for Dynamic Systems (Boca Raton: CRC Press).
[2] Garulli, A., Kacewicz, B., Vicino, A. and Zappa, G., 1999, Reliability of Projection Algorithms in Conditional Estimation. Journal of Optimization Theory and Applications, 101, 1-14.
[3] Garulli, A., Vicino, A. and Zappa, G., 1997, Conditional Central Algorithms for Worst-Case Estimation and Filtering. In: Proc. 36th IEEE Conference on Decision and Control, San Diego.
[4] Helmicki, A.J., Jacobson, C.A.
and Nett, C.N., 1991, Control Oriented System Identification: A Worst-Case/Deterministic Approach in H∞. IEEE Transactions on Automatic Control, 36, 1163-1176.
[5] Jacobson, C.A., Nett, C.N. and Partington, J.R., 1992, Worst Case System Identification in l₁: Optimal Algorithms and Error Bounds. Systems and Control Letters, 19, 419-424.
[6] Kacewicz, B., Milanese, M. and Vicino, A., 1988, Conditionally Optimal Algorithms and Estimation of Reduced Order Models. Journal of Complexity, 4, 73-85.
[7] Kacewicz, B., 2000, Optimal Average Case Estimation in Hilbert Norms. Mathematics of Control, Signals and Systems, 13, 347-359.
[8] Kacewicz, B., 2003, Weighted Average Errors in Set-Membership Estimation. Mathematics of Control, Signals and Systems, 16, 238-253.
[9] Mäkilä, P. and Partington, J., 1999, On Robustness in System Identification. Automatica, 35, 907-916.
[10] Milanese, M. and Vicino, A., 1993, Information Based Complexity and Nonparametric Worst-Case System Identification. Journal of Complexity, 9, 427-445.
[11] Milanese, M., Norton, J., Piet-Lahanier, H. and Walter, E. (Eds), 1996, Bounding Approaches to System Identification (New York: Plenum Press).
[12] Novak, E., 1988, Deterministic and Stochastic Error Bounds in Numerical Analysis, Lecture Notes in Mathematics 1349 (Berlin: Springer-Verlag).
[13] Walter, E. (Ed.), 1990, Parameter Identification with Error Bound. Special issue of Mathematics and Computers in Simulation, 32 (5-6).