<p><em>Kalman filter for professionals (https://kalman-filter.com, feed generated 2023-03-18)</em></p> <h1>Unscented Kalman Filter</h1> <p><em>2023-02-18, https://kalman-filter.com/Unscented-Kalman-Filter</em></p> <p>For the state estimation of nonlinear systems, a variant of the Kalman filter was published shortly after Rudolf Kalman's original paper which allows the Kalman filter to be applied to nonlinear systems as well. Here, the nonlinear system is linearized at the current estimate, which requires the equations to be derived analytically. For many years this procedure was the standard solution for the nonlinear systems that occur frequently in practice. An essential disadvantage of this method is that the filter can diverge easily: since the estimate does not correspond to the actual state, the system is always linearized at the wrong point. With unfavorable nonlinearities, the estimate drifts further away, so that linearization is performed at an even more incorrect position. After a few steps this quickly leads to complete divergence.</p> <p>The famous publication of Simon J. Julier and Jeffrey K. Uhlmann approached the problem of estimating nonlinear systems differently. It is based on the idea of not linearizing the involved functions, since they are known exactly. Instead, a procedure is used to choose points that are propagated through the nonlinear functions; the results are then reassembled so that they can be worked with. This method went down in history as the Unscented Transformation, and the filter belonging to it has become world-famous as the Unscented Kalman Filter (UKF).</p> <h3>Unscented Transformation</h3> <p>The Unscented Transformation yields the following calculation rules for the sigma points. 
For $$i=1,2,…,n_x$$ with $$\text{dim}( \mathbf{x} ) = n_x$$ the sigma points are</p> <p>$\mathcal{X}^{(0)}(k-1) = \mathbf{x}(k-1|k-1) ,$</p> <p>$\mathcal{X}^{(i)}(k-1) = \mathbf{x}(k-1|k-1) + \sqrt{\frac{n_x}{1 - W_0}} \bigl(\mathbf{P}^{\frac{1}{2}}(k-1|k-1)\bigr)_i ,$</p> <p>and</p> <p>$\mathcal{X}^{(i+n_x)}(k-1) = \mathbf{x}(k-1|k-1) - \sqrt{\frac{n_x}{1 - W_0}} \bigl(\mathbf{P}^{\frac{1}{2}}(k-1|k-1)\bigr)_i$</p> <p>where the parenthesis $$\bigl( \cdot \bigr)_i$$ indicates that the $$i$$-th column of the matrix is to be used. The weight $$W_0$$ of the central sigma point is a free tuning parameter; the remaining weights are computed as</p> <p>$W_i = \frac{1-W_0}{2n_x} .$</p> <p>Subsequently, the sigma points $$\mathcal{X}$$ are propagated through the (nonlinear) function $$\mathbf{f}$$ and the weights $$W_i$$ are used to calculate the first moment</p> <p>$\mathbf{x}(k|k-1) = \sum^{2n_x}_{i=0} \mathbf{f}(\mathcal{X}^{(i)}(k-1)) W_i .$</p> <p>With additive process noise, the covariance matrix of the predicted state follows as</p> <p>$\mathbf{P}(k|k-1) = \sum^{2n_x}_{i=0} \bigl( \mathbf{f}(\mathcal{X}^{(i)}(k-1))-\mathbf{x}(k|k-1) \bigr) \bigl( \mathbf{f}(\mathcal{X}^{(i)}(k-1))-\mathbf{x}(k|k-1) \bigr)^T W_i + \mathbf{Q}_d(k-1) .$</p> <p>For the correction step, another set of sigma points must be calculated from the predicted state $$\mathbf{x}(k|k-1)$$ and the predicted covariance $$\mathbf{P}(k|k-1)$$,</p> <p>$\mathcal{X}^{(0)}(k) = \mathbf{x}(k|k-1) ,$</p> <p>$\mathcal{X}^{(i)}(k) = \mathbf{x}(k|k-1) + \sqrt{\frac{n_x}{1-W_0}} \bigl(\mathbf{P}^{\frac{1}{2}}(k|k-1)\bigr)_i$</p> <p>and</p> <p>$\mathcal{X}^{(i+n_x)}(k) = \mathbf{x}(k|k-1) - \sqrt{\frac{n_x}{1-W_0}} \bigl(\mathbf{P}^{\frac{1}{2}}(k|k-1)\bigr)_i .$</p> <p>To perform the correction step, the measurement vector must be predicted</p> <p>$\mathbf{y}(k|k-1) = \sum^{2n_x}_{i=0} \mathbf{h}(\mathcal{X}^{(i)}(k)) W_i .$</p> <p>With the help of the intermediate variables, cross covariance</p> <p>$\mathbf{P}_{xy}(k|k-1) 
= \sum^{2n_x}_{i=0} \bigl( \mathcal{X}^{(i)}(k) - \mathbf{x}(k|k-1) \bigr) \bigl( \mathbf{h}(\mathcal{X}^{(i)}(k)) - \mathbf{y}(k|k-1) \bigr)^T W_i$</p> <p>and innovation covariance</p> <p>$\mathbf{S}(k|k-1) = \sum^{2n_x}_{i=0} W_i \bigl( \mathbf{h}(\mathcal{X}^{(i)}(k))- \mathbf{y}(k|k-1) \bigr) \bigl( \mathbf{h}(\mathcal{X}^{(i)}(k))- \mathbf{y}(k|k-1) \bigr)^T + \mathbf{R}(k)$</p> <p>the state</p> <p>$\mathbf{x}(k|k) = \mathbf{x}(k|k-1) + \mathbf{P}_{xy}(k|k-1) \mathbf{S}(k|k-1)^{-1} (\mathbf{y}(k) - \mathbf{y}(k|k-1) )$</p> <p>and covariance</p> <p>$\mathbf{P}(k|k) = \mathbf{P}(k|k-1) - \mathbf{P}_{xy}(k|k-1) \mathbf{S}(k|k-1)^{-1} \mathbf{P}_{xy}(k|k-1)^T$</p> <p>can be updated.</p> <h3>Conclusion</h3> <p>The Unscented Transformation is fascinating and is a fundamental contribution to the use of deterministic methods for state estimation of nonlinear systems. A major disadvantage of the UKF is that the weight $$W_0$$ is a tuning parameter.</p> <h3>References</h3> <p> Julier, Simon J.; Uhlmann, Jeffrey K. (1997). “New extension of the Kalman filter to nonlinear systems” (PDF). In Kadar, Ivan (ed.). Signal Processing, Sensor Fusion, and Target Recognition VI. Proceedings of SPIE. Vol. 3068. pp. 182–193.</p>
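<p>A minimal numeric sketch of the sigma-point generation and moment reconstruction described above (Python with NumPy; the center weight $$W_0 = 1/3$$ and the Cholesky factor as the matrix square root $$\mathbf{P}^{\frac{1}{2}}$$ are illustrative assumptions, not prescribed by the method):</p>

```python
import numpy as np

def sigma_points(x, P, W0=1.0 / 3.0):
    """Sigma points according to the rules above; W0 is the tuning weight."""
    n = x.size
    S = np.linalg.cholesky(P)                 # one possible square root P^(1/2)
    c = np.sqrt(n / (1.0 - W0))
    pts = np.vstack([x] + [x + c * S[:, i] for i in range(n)]
                        + [x - c * S[:, i] for i in range(n)])
    W = np.concatenate([[W0], np.full(2 * n, (1.0 - W0) / (2 * n))])
    return pts, W

def unscented_transform(pts, W, f):
    """Propagate the sigma points through f and reassemble mean and covariance."""
    Y = np.array([f(p) for p in pts])
    mean = W @ Y
    d = Y - mean
    return mean, (d.T * W) @ d                # sum_i W_i (Y_i - mean)(Y_i - mean)^T

# Sanity check with a linear function, where the transform is exact:
x = np.array([1.0, 2.0])
P = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 1.0], [0.0, 1.0]])
pts, W = sigma_points(x, P)
mean, cov = unscented_transform(pts, W, lambda p: A @ p)
```

<p>For a linear function the reconstructed moments equal $$\mathbf{A}\mathbf{x}$$ and $$\mathbf{A}\mathbf{P}\mathbf{A}^T$$ exactly, which is a convenient test of an implementation.</p>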
<h1>Cubature Kalman Filter</h1> <p><em>2023-01-29, https://kalman-filter.com/Cubature-Kalman-Filter</em></p> <p>The Cubature Kalman Filter (CKF) is the newest representative of the sigma-point methods. The selection of sigma points in the CKF differs slightly from the Unscented Kalman Filter (UKF) and is based on the Cubature rule derived by Arasaratnam and Haykin. As with the UKF, the CKF follows the idea that it is easier to approximate a probability distribution than to linearize a nonlinear function.</p> <p>Essentially, the challenge in nonlinear state estimation is that the involved nonlinear functions cause, for example, previously Gaussian distributed variables to no longer be Gaussian distributed. Mathematically speaking, the cause is that an integral must be computed over the nonlinear function, which is not analytically possible in most cases. Since Kalman filters only consider the first two statistical moments, it is sufficient to obtain only the first two moments, as long as they correspond to the first two moments of the true distribution function.</p> <h3>Cubature Rule</h3> <p>Explained in short, the idea of the Cubature rule is to transform the state defined in Cartesian space into a spherical-radial coordinate system and to use the <a href="https://en.wikipedia.org/wiki/Gaussian_quadrature">Gaussian quadrature</a> there. This yields the following calculation rules for the sigma points. 
For $$i=1,2,…,n_x$$ with $$\text{dim}( \mathbf{x} ) = n_x$$ the sigma points are</p> <p>$\mathcal{X}^{(i)}(k-1) = \mathbf{x}(k-1|k-1) + \sqrt{n_x} \bigl(\mathbf{P}^{\frac{1}{2}}(k-1|k-1)\bigr)_i$</p> <p>and</p> <p>$\mathcal{X}^{(i+n_x)}(k-1) = \mathbf{x}(k-1|k-1) - \sqrt{n_x} \bigl(\mathbf{P}^{\frac{1}{2}}(k-1|k-1)\bigr)_i$</p> <p>where the parenthesis $$\bigl( \cdot \bigr)_i$$ indicates that the $$i$$-th column of the matrix is to be used. The corresponding weight is computed with</p> <p>$W_i = \frac{1}{2n_x} .$</p> <p>Subsequently, the sigma points $$\mathcal{X}$$ are propagated through the (nonlinear) function $$\mathbf{f}$$ and the weights $$W_i$$ are used to calculate the first moment</p> <p>$\mathbf{x}(k|k-1) = \sum^{2n_x}_{i=1} \mathbf{f}(\mathcal{X}^{(i)}(k-1)) W_i .$</p> <p>With additive process noise, the covariance matrix of the predicted state follows as</p> <p>$\mathbf{P}(k|k-1) = \sum^{2n_x}_{i=1} \bigl( \mathbf{f}(\mathcal{X}^{(i)}(k-1))-\mathbf{x}(k|k-1) \bigr) \bigl( \mathbf{f}(\mathcal{X}^{(i)}(k-1))-\mathbf{x}(k|k-1) \bigr)^T W_i + \mathbf{Q}_d(k-1) .$</p> <p>For the correction step, another set of sigma points must be calculated from the predicted state $$\mathbf{x}(k|k-1)$$ and the predicted covariance $$\mathbf{P}(k|k-1)$$,</p> <p>$\mathcal{X}^{(i)}(k) = \mathbf{x}(k|k-1) + \sqrt{n_x} \bigl(\mathbf{P}^{\frac{1}{2}}(k|k-1)\bigr)_i$</p> <p>and</p> <p>$\mathcal{X}^{(i+n_x)}(k) = \mathbf{x}(k|k-1) - \sqrt{n_x} \bigl(\mathbf{P}^{\frac{1}{2}}(k|k-1)\bigr)_i .$</p> <p>To perform the correction step, the measurement vector must be predicted</p> <p>$\mathbf{y}(k|k-1) = \sum^{2n_x}_{i=1} \mathbf{h}(\mathcal{X}^{(i)}(k)) W_i .$</p> <p>With the help of the intermediate variables, cross covariance</p> <p>$\mathbf{P}_{xy}(k|k-1) = \sum^{2n_x}_{i=1} \bigl( \mathcal{X}^{(i)}(k) - \mathbf{x}(k|k-1) \bigr) \bigl( \mathbf{h}(\mathcal{X}^{(i)}(k)) - \mathbf{y}(k|k-1) \bigr)^T W_i$</p> <p>and 
innovation covariance</p> <p>$\mathbf{S}(k|k-1) = \sum^{2n_x}_{i=1} W_i \bigl( \mathbf{h}(\mathcal{X}^{(i)}(k))- \mathbf{y}(k|k-1) \bigr) \bigl( \mathbf{h}(\mathcal{X}^{(i)}(k))- \mathbf{y}(k|k-1) \bigr)^T + \mathbf{R}(k)$</p> <p>the state</p> <p>$\mathbf{x}(k|k) = \mathbf{x}(k|k-1) + \mathbf{P}_{xy}(k|k-1) \mathbf{S}(k|k-1)^{-1} (\mathbf{y}(k) - \mathbf{y}(k|k-1) )$</p> <p>and covariance</p> <p>$\mathbf{P}(k|k) = \mathbf{P}(k|k-1) - \mathbf{P}_{xy}(k|k-1) \mathbf{S}(k|k-1)^{-1} \mathbf{P}_{xy}(k|k-1)^T$</p> <p>can be updated.</p> <h3>Conclusion</h3> <p>The Cubature rule bears great similarity to the Unscented Transformation of the UKF and is even a special case of the UT method. The advantage is that no parameters have to be tuned, and since no negative weights are used, there is no danger of generating negative definite covariance matrices. The disadvantage is that the larger $$n_x$$ is, the more widely the sigma points are spread compared to the UKF, so that the outer regions of the nonlinear function significantly influence the value of the sigma points.</p> <h3>References</h3> <p> I. Arasaratnam and S. Haykin, “Cubature Kalman Filters,” in IEEE Transactions on Automatic Control, vol. 54, no. 6, pp. 1254-1269, June 2009, doi: 10.1109/TAC.2009.2019800.</p> <p> D. F. Crouse, “Cubature/unscented/sigma point Kalman filtering with angular measurement models,” 2015 18th International Conference on Information Fusion (Fusion), 2015, pp. 1550-1557.</p>
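<p>The cubature point set itself is a short piece of code. The following sketch (Python with NumPy; the Cholesky factor as $$\mathbf{P}^{\frac{1}{2}}$$ is an illustrative choice) generates the $$2n_x$$ equally weighted points and verifies that they reproduce the first two moments:</p>

```python
import numpy as np

def cubature_points(x, P):
    """Cubature rule: 2*n_x equally weighted points, no center point."""
    n = x.size
    S = np.linalg.cholesky(P)              # one possible square root P^(1/2)
    c = np.sqrt(n)
    pts = np.vstack([x + c * S[:, i] for i in range(n)]
                    + [x - c * S[:, i] for i in range(n)])
    W = np.full(2 * n, 1.0 / (2 * n))      # W_i = 1/(2 n_x)
    return pts, W

x = np.array([1.0, -2.0, 0.5])
P = np.array([[2.0, 0.3, 0.0],
              [0.3, 1.0, 0.1],
              [0.0, 0.1, 0.5]])
pts, W = cubature_points(x, P)
mean = W @ pts                             # weighted mean of the points
d = pts - mean
cov = (d.T * W) @ d                        # weighted scatter of the points
```

<p>Because the points are symmetric around $$\mathbf{x}$$ and scaled by $$\sqrt{n_x}$$, the weighted mean and scatter recover $$\mathbf{x}$$ and $$\mathbf{P}$$ exactly before any nonlinear function is applied.</p>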
<h1>Estimating the state of a DC motor</h1> <p><em>2022-11-26, https://kalman-filter.com/estimating-the-state-of-a-dc-motor</em></p> <p>This example focuses on estimating the angular position $$\theta$$, angular velocity $$\dot{\theta}$$ and armature current $$i$$ of a DC motor with a <a href="/linear-kalman-filter/">linear Kalman filter</a>. When modeling DC motors, it should be mentioned that nonlinear models are certainly superior to linear ones. For didactic purposes, the following widely used linear model is sufficient.</p> <h2>Continuous-time system model</h2> <p>The modeling of a DC motor is structured into two parts. The electrical part is modeled with the electrical equivalent circuit and the kinematics with the basic equations of a DC motor.</p> <h3>Electrical system model</h3> <p>The electrical part consists of the armature resistance $$R$$ and the inductance of the armature winding $$L$$, which are connected in series, and the electromechanical energy converter $$M$$, which is the DC motor. 
The circuit is driven by the voltage source $$V_s$$.</p> <p align="center"> <img src="/assets/images/dc_motor/electrical-system-model-dc-motor.png" title="Electrical system model of a DC motor" width="500" /> </p> <p>The following equation can be determined using the mesh equation</p> <p>$V_s = L \dot{i} + R i + V_e .$</p> <p>The DC motor induces the voltage $$V_e$$, which can be regarded as linear in the angular velocity, $$V_e = K_e \dot{\theta}$$, with the <a href="https://en.wikipedia.org/wiki/Motor_constants">motor velocity constant</a> $$K_e$$.</p> <p>Since $$L \dot{i}$$ is the differentiating term, we solve for it and get the following first-order differential equation</p> <p>$\dot{i} = \frac{1}{L} V_s - \frac{R}{L} i - \frac{K_e}{L} \dot{\theta}$</p> <p>that we will later transfer to the state-space.</p> <h3>Kinematic system model</h3> <p>For the mechanical part, the DC motor is described by the physics of <a href="https://en.wikipedia.org/wiki/Rotation_around_a_fixed_axis">rotations around a fixed axis</a>. It is assumed that the mechanical shaft and the rotated body, i.e. the actuated load, are known and subsumed under the moment of inertia $$J$$.</p> <p align="center"> <img src="/assets/images/dc_motor/kinematic-system-model-dc-motor.png" title="Kinematic system model of a DC motor" width="300" /> </p> <p>The relationship between the angular acceleration $$\ddot{\theta}$$ and the net torque is then as follows</p> <p>$J \ddot{\theta} = m_m - m_R - m_L ,$</p> <p>where $$m_m$$ is the motor torque. The friction torque $$m_R$$ and load torque $$m_L$$ counteract the motor torque $$m_m$$. For the motor torque $$m_m$$ there is a linear relationship to the current, $$m_m = K_T i$$, with the factor $$K_T$$ referred to as the <a href="https://en.wikipedia.org/wiki/Motor_constants">motor torque constant</a>. The friction torque $$m_R$$ is proportional to the angular velocity, $$m_R = b \dot{\theta}$$, with the friction coefficient $$b$$. 
The torque $$m_L$$ is the mechanical load torque, which is $$m_L = 0$$ if the DC motor is operated without payload; later, however, we will assume an unknown load torque. Since $$\ddot{\theta}$$ is the term with the highest order of differentiation, we solve for it and get the following second-order differential equation</p> <p>$\ddot{\theta} = \frac{K_T i}{J} - \frac{b}{J} \dot{\theta} - \frac{m_L}{J} .$</p> <h3>Combined system model</h3> <p>The two differential equations above can now be transferred into one combined state-space model of the form</p> <p>$\dot{\mathbf{x}}(t) = \mathbf{A} \mathbf{x}(t) + \mathbf{B} \mathbf{u}(t) + \mathbf{G} \mathbf{z}(t) .$</p> <p>For this purpose, the second-order kinematic differential equation is converted into first-order differential equations by defining $$x_1:=\theta$$ as the first state variable and $$x_2:=\dot{\theta}$$ as its derivative. Additionally, we model the derivative of the load torque $$m_L$$ as an unknown disturbance</p> <p>$\dot{m_L} = z(t),$</p> <p>which means we define the load torque as a third state variable $$x_3:=m_L$$. For the first-order electrical differential equation, we can directly define $$x_4:=i$$ as the fourth and last state variable. 
Furthermore, the combined system is only controlled by the voltage supply, i.e.,</p> <p>$\mathbf{u}(t) = V_s.$</p> <p>Arranging everything into state-space form, we get</p> <p>$\underbrace{\begin{bmatrix} \dot{\theta} \\ \ddot{\theta} \\ \dot{m_L} \\ \dot{i} \end{bmatrix}}_{= \dot{\mathbf{x}}(t)}= \underbrace{\begin{bmatrix} 0 &amp; 1 &amp; 0 &amp; 0\\ 0 &amp; -\frac{b}{J} &amp; -\frac{1}{J} &amp;\frac{K_T}{J} \\ 0 &amp; 0 &amp; 0 &amp; 0 \\ 0 &amp; -\frac{K_e}{L} &amp; 0 &amp; -\frac{R}{L}\end{bmatrix}}_{=\mathbf{A}} \underbrace{\begin{bmatrix} \theta \\ \dot{\theta} \\ m_L \\ i \end{bmatrix}}_{=\mathbf{x}(t)}+\underbrace{\begin{bmatrix} 0 \\ 0 \\ 0 \\ \frac{1}{L}\end{bmatrix}}_{=\mathbf{B}} \mathbf{u}(t) + \underbrace{\begin{bmatrix} 0 \\ 0 \\ 1 \\ 0\end{bmatrix}}_{=\mathbf{G}} \mathbf{z}(t) .$</p> <p>For the measurement equation</p> <p>$\mathbf{y}(t) = \mathbf{C} \mathbf{x}(t) + \mathbf{v}(t),$</p> <p>we assume we measure the angular position $$\theta = x_1$$ with an absolute encoder, from which the measurement matrix follows</p> <p>$\mathbf{C} = \begin{bmatrix} 1 &amp; 0 &amp; 0 &amp; 0\end{bmatrix}.$</p> <p><em>Some notes: the choice of state variables is part of the modeling, and both another choice and another order are possible. When modeling the load torque $$m_L$$, it was assumed that the load torque $$m_L$$ is unknown and noisy. Since the load torque $$m_L$$ is not zero-mean, it is not possible to set $$\mathbf{z}(t) = m_L$$. One practice is therefore to assume the change (i.e. the derivative) to be zero-mean and to introduce the variable as an additional state (here $$x_3$$).</em></p> <h2>Discrete-time system model</h2> <p>Discretization of the matrices $$\mathbf{A}$$ and $$\mathbf{B}$$ does not yield handy formulations if carried out analytically. 
Therefore, the discretization is carried out numerically with the following values, which are later also used for the simulation: $$J=0.0001 \ kg \cdot m^2$$, $$b=0.0001 \ \frac{Nm \cdot s}{rad}$$, $$K_T=0.03 \ \frac{Nm}{A}$$, $$K_e=0.03 \ \frac{V \cdot s}{rad}$$, $$R=0.5 \ \Omega$$, $$L=0.0004 \ H$$ and $$T_s=0.1 \ s$$, using the known equations</p> <p>$\mathbf{A}_d = e^{\mathbf{A} T_s}$</p> <p>$\mathbf{B}_d = \int^{T_s}_{0} e^{\mathbf{A} \tau} d \tau \ \mathbf{B} .$</p> <p>For the sake of completeness, the discrete system model is $\mathbf{x}(k) = \mathbf{A}_d(k-1) \mathbf{x}(k-1) + \mathbf{B}_d(k-1) \mathbf{u}(k-1) + \mathbf{z}(k-1)$</p> <p>$\mathbf{y}(k) = \mathbf{C}(k) \mathbf{x}(k) + \mathbf{v}(k) .$</p> <p>It is worth taking a look at the third state variable $$x_3$$ with the load torque $m_L(k) = m_L(k-1) + v_{m_L}(k-1) .$ This line states that the change of the load torque from time $$k-1$$ to time $$k$$ corresponds to Gaussian noise, i.e. a random walk. This is the primary modeling choice made here. The rest corresponds to the standard procedure for modeling a DC motor.</p> <h3>Observability</h3> <p>In order to check whether the chosen measurement mapping allows all states to be observed, one can check the rank of the observability matrix $\text{rank} \ \mathbf{O} = \text{rank} \begin{bmatrix} \mathbf{C} \\ \mathbf{C} \mathbf{A}_d \\ \mathbf{C} (\mathbf{A}_d)^2 \\ \mathbf{C} (\mathbf{A}_d)^3 \end{bmatrix} = 4 .$ In this case, the rank corresponds to the dimension of the state vector and thus the system is fully observable with the chosen measurement mapping $$\mathbf{C}$$.</p> <h3>Determination of the system and measurement noise</h3> <p>There are different ways to determine the process noise. Since the work here is purely simulative, all of them would work. In practice, it would be necessary to evaluate more precisely which approach and which parameterization should be selected. 
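<p>The matrix exponential discretization above can be evaluated numerically with the parameter values from the text. The following Python sketch uses the standard augmented-matrix trick, in which the exponential of the block matrix built from $$\mathbf{A}$$ and $$\mathbf{B}$$ yields $$\mathbf{A}_d$$ and $$\mathbf{B}_d$$ simultaneously; the small series-based <code>expm</code> is only a stand-in for a library routine such as <code>scipy.linalg.expm</code>:</p>

```python
import numpy as np

def expm(M, terms=30):
    # Scaling-and-squaring with a truncated Taylor series; a simple stand-in
    # for a library matrix exponential such as scipy.linalg.expm.
    s = max(0, int(np.ceil(np.log2(max(1.0, np.linalg.norm(M, np.inf))))) + 1)
    A = M / 2.0**s
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        E = E + term
    for _ in range(s):
        E = E @ E
    return E

# Parameter values from the text
J, b, K_T, K_e, R, L, Ts = 1e-4, 1e-4, 0.03, 0.03, 0.5, 4e-4, 0.1
A = np.array([[0.0, 1.0,     0.0,   0.0],
              [0.0, -b / J, -1 / J, K_T / J],
              [0.0, 0.0,     0.0,   0.0],
              [0.0, -K_e / L, 0.0, -R / L]])
B = np.array([[0.0], [0.0], [0.0], [1 / L]])

# Augmented-matrix trick: exp([[A, B], [0, 0]] * Ts) = [[A_d, B_d], [0, I]]
M = np.zeros((5, 5))
M[:4, :4], M[:4, 4:] = A * Ts, B * Ts
E = expm(M)
A_d, B_d = E[:4, :4], E[:4, 4:]
```

<p>Two structural properties make a quick check possible: the first column of $$\mathbf{A}$$ is zero, so the $$\theta$$-column of $$\mathbf{A}_d$$ stays $$(1,0,0,0)^T$$, and the zero row belonging to $$\dot{m}_L = 0$$ turns into the random-walk row $$(0,0,1,0)$$ of $$\mathbf{A}_d$$.</p>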
Here the variance is set to $$\sigma_{m_L}^2 = 2.25 \cdot 10^{-6}$$ with $\mathbf{Q} = \begin{bmatrix} 0 &amp; 0 &amp; 0 &amp; 0\\ 0 &amp; 0 &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; \sigma^2_{m_L} &amp; 0 \\ 0 &amp; 0 &amp; 0 &amp; 0 \end{bmatrix} .$ For the calculation of the covariance matrix of the discrete process noise, the discretization with zero-order hold is chosen $\mathbf{Q}_d = \int^{T_s}_{0} e^{\mathbf{A} \tau} \mathbf{Q} e^{\mathbf{A}^T \tau} d \tau .$ <em>The calculation does not provide a handy result for $$\mathbf{Q}_d$$, therefore it is not given here.</em></p> <p>The determination of the measurement noise is based on the assumption that the angular position of the DC motor is measured by an absolute encoder with a resolution of 12 bits. This results in 4096 discrete positions, where one position corresponds approximately to an interval length of 0.088 degrees. The variance of the measurement noise needs to be set equal to the variance of the continuous uniform distribution $\sigma^2_r = \frac{1}{12} \left( \frac{2 \pi}{4096} \right)^2 = 1.96 \cdot 10^{-7} \ \text{rad}^2 .$</p> <h2>Implementation and results</h2> <p>The following results were obtained from a simulation in MATLAB. To make the simulation more lively and realistic, the input voltage is set to 6 V and, in the middle of the simulation, to 12 V in order to mimic a controller (see the figure below).</p> <p align="center"> <img src="/assets/images/dc_motor/u.png" title="input voltage" /> </p> <p>The next four figures show the course of the estimate of $$\mathbf{x}$$ and the ground truth of $$\mathbf{x}$$. Additionally, for the first state $$x_1$$ the measurements $$y$$ are plotted. For the second state $$x_2$$, i.e. the angular velocity $$\dot{\theta}$$, the approximate derivative $$y_{diff}$$ is plotted. 
Qualitatively, it can be stated for all states that the estimate is very close to the true value.</p> <p align="center"> <img src="/assets/images/dc_motor/x1.png" title="x1" /> </p> <p align="center"> <img src="/assets/images/dc_motor/x2.png" title="x2" /> </p> <p align="center"> <img src="/assets/images/dc_motor/x3.png" title="x3" /> </p> <p align="center"> <img src="/assets/images/dc_motor/x4.png" title="x4" /> </p> <p>The previous result was mainly qualitative; a widely used quantitative way is to evaluate the estimation error with a metric such as the <a href="/root-mean-square-error/">Root Mean Square Error (RMSE)</a>. The next figure shows the RMSE for $$x_1$$ and the measurement $$y$$. It can be seen that the RMSE for $$x_1$$ of the Kalman filter matches the RMSE of the measurement $$y$$. The main reason is that the measurement noise is very low.</p> <p align="center"> <img src="/assets/images/dc_motor/RMSE_x1.png" title="RMSE(x1)" /> </p> <p>Usually the angular velocity $$\dot{\theta}$$ is used to control a DC motor. Without a state estimator, one may be inclined to differentiate the measured position numerically and use this quantity for control. In the next figure the RMSE for the differentiated measurement is also shown. It can be clearly seen that the RMSE spikes when the input voltage changes abruptly.</p> <p align="center"> <img src="/assets/images/dc_motor/RMSE_x2.png" title="RMSE(x2)" /> </p> <p>As the last step, the correctness of the simulation can be confirmed by the <a href="/normalized-estimation-error-squared/">Normalized Estimation Error Squared (NEES)</a>. Since 1000 Monte Carlo runs were performed, a 95% confidence interval $$[3.83;4.18]$$ can be determined. The next figure shows the NEES that stays within the confidence interval. 
Moreover, the mean of the NEES is close to 4, which is the number of degrees of freedom of the used state-space system.</p> <p align="center"> <img src="/assets/images/dc_motor/NEES.png" title="NEES" /> </p> <h2>Source Code</h2> <p>The author is working on an extended code base which also includes the system shown above. For now, the code can only be made available after completing the larger code base. Stay tuned!</p><h1>Multiple Model State Estimation</h1> <p><em>2022-10-17, https://kalman-filter.com/Multiple-Model</em></p> <p>Central to the performance of a model-based state estimator is how well the model corresponds to the process to be observed. In practice, the question arises whether the process to be estimated always behaves exactly according to one model or whether the process switches between different models. The solution is to take this into account by using multiple models; in the literature this approach is called multiple model (MM) estimation. The challenge is both how exactly this is done, i.e. with only one model or a mixture of several models, and according to which rules the process switches between the different models.</p> <h3>First generation</h3> <p>In the so-called first generation of MM methods, it is assumed that the system behaves according to a single model which is not known in advance. It is only known that it is one model out of a known set of $$r$$ models. Therefore, the approach is to use $$r$$ parallel Kalman filters, i.e. one Kalman filter for each model. 
By considering the innovation covariance matrix, a recursive estimator can be derived which over time gives a higher weighting to the Kalman filter whose model corresponds to the actual model.</p> <h3>Second generation</h3> <p>The second generation removes the need to stick to a specific model for the entire runtime and allows the system to switch between the different models. However, this freedom means that $$r^n$$ parallel filters are needed after $$n$$ time steps for the optimal result. In order to have practical solutions, several sub-optimal ones exist.</p> <h3>Third generation</h3> <p>The third generation basically drops all of the previously mentioned restrictions, i.e. the system can also behave according to unknown models. Solutions to this problem, however, are not quite as rigorous.</p> <h3>Further reading</h3> <p>Probably the most comprehensive work on the subject is the survey by Li and Jilkov, although it refers to the tracking use case.</p> <p>For the IMM approach, S. Dingler’s paper goes into more detail and describes the advantages and disadvantages.</p> <h3>References</h3> <p> X. Rong Li and V. P. Jilkov, “Survey of maneuvering target tracking. Part V. Multiple-model methods,” in IEEE Transactions on Aerospace and Electronic Systems, vol. 41, no. 4, pp. 1255-1321, Oct. 2005, doi: 10.1109/TAES.2005.1561886.</p> <p> Dingler, Sebastian. “State estimation with the Interacting Multiple Model (IMM) method.” <a href="https://arxiv.org/abs/2207.04875">arXiv preprint arXiv:2207.04875, 2022</a>.</p>
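<p>The first-generation weighting described above can be sketched as follows in Python: each parallel filter $$i$$ supplies its innovation $$\nu_i$$ and innovation covariance $$\mathbf{S}_i$$, and the model probabilities are re-weighted with the Gaussian likelihood of the innovation. The function name and interface are illustrative, not from a library:</p>

```python
import numpy as np

def update_model_probabilities(w, residuals, S_list):
    """One recursion of the static multiple-model weighting: the Gaussian
    likelihood of each filter's innovation re-weights its model probability."""
    likelihoods = []
    for nu, S in zip(residuals, S_list):
        m = nu.size
        norm = np.sqrt((2.0 * np.pi) ** m * np.linalg.det(S))
        likelihoods.append(np.exp(-0.5 * nu @ np.linalg.solve(S, nu)) / norm)
    w = np.asarray(w) * np.array(likelihoods)
    return w / w.sum()          # normalize so the probabilities sum to one

# Two candidate models; the first filter's innovation is far more plausible:
w = update_model_probabilities([0.5, 0.5],
                               [np.array([0.1]), np.array([3.0])],
                               [np.array([[1.0]]), np.array([[1.0]])])
```

<p>Applied recursively over time, the weight of the filter whose model matches the actual system converges towards one.</p>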
<h1>Root Mean Square Error (RMSE)</h1> <p><em>2022-09-25, https://kalman-filter.com/RMSE</em></p> <p>To evaluate the performance of state estimators, the estimation error $$\tilde{\mathbf{x}}(k) = \mathbf{x}(k) - \mathbf{\hat{x}}(k)$$ is evaluated. The root mean square error (RMSE), a widely used quality measure, is suitable for this purpose. The basis is the estimation error $$\tilde{\mathbf{x}}(k)$$ for each time step $$k \in \{1,…,K\}$$.</p> <h3>In simulation</h3> <p>For a simulation with a length of $$K$$ time increments, the RMSE is averaged over $$N$$ Monte Carlo runs in order to achieve a high statistical significance, $\text{RMSE}(\tilde{\mathbf{x}}(k)) = \sqrt{\frac{1}{N} \sum^N_{i=1} (\tilde{x}^i_1(k)^2 + … + \tilde{x}^i_n(k)^2)}$ where $$n = \text{dim}(\tilde{\mathbf{x}}(k))$$.</p> <p>The RMSE should always be calculated only for elements with the same unit, because it is not correct to add e.g. velocity and acceleration. In such cases, RMSEs are determined per state or unit. If, for example, a velocity is part of the state vector, the result is called the RMSE of the velocity. In general, the smaller the RMSE, the better the state estimator.</p> <h3>In real-world applications</h3> <p>If real measurement data are to be evaluated, the RMSE can only be calculated from a single run. The RMSE thus has no high statistical significance. 
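<p>The Monte Carlo RMSE from the simulation section can be sketched compactly in Python; the array layout (runs, time steps, states) is an assumption for illustration:</p>

```python
import numpy as np

def rmse(errors):
    """RMSE per time step, averaged over Monte Carlo runs.

    errors: array of shape (N_runs, K, n) holding the estimation error
    x_tilde for every run, time step and state variable."""
    return np.sqrt(np.mean(np.sum(errors ** 2, axis=-1), axis=0))

# Two runs, one time step, two states with error (3, 4) each:
e = np.array([[[3.0, 4.0]],
              [[3.0, 4.0]]])
result = rmse(e)
```

<p>With the error $$(3, 4)$$ in both runs, the squared error per run is $$25$$ and the RMSE for the single time step is $$5$$.</p>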
The RMSE can therefore alternatively be calculated via the mean estimation error</p> <p>$\mu_{\tilde{x}} = \frac{1}{K} \sum^K_{k=1} \tilde{\mathbf{x}}(k)$</p> <p>and the standard deviation of the estimation error</p> <p>$\sigma_{\tilde{x}} = \sqrt{ \frac{1}{K} \sum^K_{k=1} (\tilde{\mathbf{x}}(k) - \mu_{\tilde{x}} )^2 }$</p> <p>with</p> <p>$\text{RMSE}(\tilde{\mathbf{x}}) = \sqrt{ \mu_{\tilde{x}}^2 + \sigma_{\tilde{x}}^2 } .$</p> <p>However, since the true state $$\mathbf{x}$$, i.e. a “ground truth”, must be available, this can only be done in practice with highly accurate reference data. It may also be the case that not all states can be measured by reference measuring devices. It is therefore legitimate to calculate the RMSE only for the states for which reference data are available.</p><h1>Normalized Innovation Squared (NIS)</h1> <p><em>2022-09-25, https://kalman-filter.com/NIS</em></p> <p>The Normalized Innovation Squared (NIS) metric allows one to check whether the Kalman filter is consistent with the measurement residual $$\nu (k)$$ and the associated innovation covariance matrix $$\mathbf{S}(k)$$.</p> <p><em>The NIS hypothesis test uses the same mechanics as the <a href="/normalized-estimation-error-squared/">NEES metric</a>. 
It is therefore recommended to read that article first.</em></p> <p>The measurement residual $$\nu (k)$$ is the difference between the measurement prediction $$\mathbf{\hat{y}}(k)$$ and the measurement $$\mathbf{y}(k)$$. The consistency check of the innovation covariance matrix uses the normalization with the measurement residual $$\nu (k)$$</p> <p>$\text{NIS} (\nu(k),\mathbf{S}(k)) = \epsilon_\nu (k) = \nu(k)^T \mathbf{S}(k)^{-1} \nu(k) \ .$</p> <p>As with <a href="/normalized-estimation-error-squared/">NEES</a>, the quantity $$\epsilon_\nu (k)$$ is chi-squared distributed with $$\text{dim}( \nu(k) ) = n_z$$ degrees of freedom, i.e. the dimension of the measurement vector $$\mathbf{y}(k)$$.</p> <p>In order to achieve an increased statistical significance, an averaged NIS can be calculated by running the test $$N$$ times $\text{ANIS}( \epsilon^i_{\nu}(k) ) = \overline{\epsilon}_{\nu} (k) = \frac{1}{N} \sum^N_{i=1} \epsilon^i_\nu(k) .$</p> <p>The quantity $$N \overline{\epsilon}_{\nu}(k)$$ is chi-squared distributed with $$N n_z$$ degrees of freedom, so its mean converges towards $$N n_z$$.</p> <h3>Hypothesis testing</h3> <p>The hypothesis test is performed with the hypothesis</p> <p>$$H_0$$: <em>the measurement residual $$\nu (k)$$ is consistent with the innovation covariance matrix $$\mathbf{S}(k)$$</em>,</p> <p>which is accepted if</p> <p>$\epsilon_{\nu} \in [r_1,r_2] .$</p> <p>The acceptance interval is chosen such that the probability $$P$$ that $$H_0$$ is accepted is $$(1 - \alpha)$$, i.e.</p> <p>$P \{ \epsilon_{\nu} \in [r_1,r_2] | H_0 \} = 1 - \alpha .$</p> <p>The confidence interval $$[r_1,r_2]$$ can be calculated with the inverse cumulative distribution function $$F^{-1}$$ of the chi-squared distribution</p> <p>$r_1 = F^{-1} ( \frac{\alpha}{2}, n_z ), \ r_2 = F^{-1} ( 1- \frac{\alpha}{2}, n_z ) .$</p> <h3>Conclusion</h3> <p>The NIS metric in conjunction with a hypothesis test allows one to check the consistency of the innovation covariance matrix 
$$\mathbf{S}(k)$$. Since the innovation covariance matrix directly influences the Kalman gain $$\mathbf{K}$$, the consistency of the innovation covariance matrix $$\mathbf{S}(k)$$ has a direct influence on the efficiency of the Kalman filter. The hypothesis procedure is analogous to the NEES metric, with the difference that the NIS only uses the measurement space and not the full state-space. This is particularly relevant in practice, since ground truth data is not always available.</p><h1>Normalized Estimation Error Squared (NEES)</h1> <p><em>2022-09-11, https://kalman-filter.com/NEES</em></p> <p>A desired property of a state estimator is that it is able to indicate the quality of its estimate correctly. This ability is called consistency of a state estimator and has a direct impact on the estimation error $$\tilde{\mathbf{x}}(k)$$, i.e. an inconsistent state estimator does not provide an optimal result. A state estimator should be able to indicate the quality of the estimate correctly, because an increase in sample size leads to a growth in information content, and the state estimate $$\hat{\mathbf{x}}(k)$$ shall be as close as possible to the true state $$\mathbf{x}(k)$$. This results from the requirement that a state estimator shall be unbiased. 
Mathematically speaking, this is expressed by the expected value of the estimation error $$\tilde{\mathbf{x}}(k)$$ being zero $E [ \mathbf{x}(k) - \hat{\mathbf{x}}(k) ] = E [ \tilde{\mathbf{x}}(k) ] \overset{!}{=} 0$</p> <p>and that the covariance matrix of the estimation error shall be equal to the covariance matrix $$\mathbf{P}(k|k)$$ of the state estimator $E [ [\mathbf{x}(k) - \hat{\mathbf{x}}(k)] [\mathbf{x}(k) - \hat{\mathbf{x}}(k)]^T ] = E [ \tilde{\mathbf{x}}(k|k) \tilde{\mathbf{x}}(k|k)^T ] \overset{!}{=} \mathbf{P}(k|k) .$</p> <p>Checking the consistency of the state estimator can be done with a hypothesis test using the Normalized Estimation Error Squared (NEES) metric. For this purpose, a normalization of the estimation error $$\tilde{\mathbf{x}}$$ with the error covariance matrix $$\mathbf{P}$$ is performed</p> <p>$\epsilon (k) = \tilde{\mathbf{x}}(k|k)^T \mathbf{P}(k|k)^{-1} \tilde{\mathbf{x}}(k|k) \ .$</p> <p>The quantity $$\epsilon (k)$$ is chi-squared distributed with $$\text{dim}( \tilde{\mathbf{x}}(k)) = n_x$$ degrees of freedom.</p> <p><em>A brief explanation why chi-squared distributed: The sum of squared standard normally distributed random variables is chi-squared distributed. Under the assumption that the estimation error $$\tilde{\mathbf{x}}$$ is zero-mean Gaussian with covariance $$\mathbf{P}$$, the normalized error $$\mathbf{P}^{-\frac{1}{2}} \tilde{\mathbf{x}}$$ is standard normally distributed, so $$\epsilon (k)$$ is a sum of $$n_x$$ squared standard normal random variables. <a href="https://en.wikipedia.org/wiki/Chi-squared_distribution">Consult Wikipedia for more details.</a></em></p> <h3>Hypothesis testing</h3> <p>A very useful property of $$\epsilon(k)$$ is that the expectation is</p> <p>$E [ \epsilon(k) ] = n_x .$</p> <p>This can now be used to check whether a state estimator delivers consistent results, i.e. whether it correctly estimates the quality of its estimation.
The hypothesis</p> <p>$H_0 : \text{The estimation error } \tilde{\mathbf{x}}(k) \text{ is consistent with the filter's covariance matrix } \mathbf{P}(k)$</p> <p>is accepted if</p> <p>$\epsilon (k) \in [r_1,r_2] .$</p> <p>The acceptance interval is chosen such that the probability $$P$$ that $$H_0$$ is accepted is $$(1 - \alpha)$$, i.e.</p> <p>$P \{ \epsilon (k) \in [r_1,r_2] | H_0 \} = 1 - \alpha .$</p> <p>The confidence interval $$[r_1,r_2]$$ can be calculated with the inverse cumulative distribution function $$F^{-1}$$ of the chi-squared distribution</p> <p>$r_1 = F^{-1} ( \frac{\alpha}{2}, n_x ), \ r_2 = F^{-1} ( 1- \frac{\alpha}{2}, n_x ) .$</p> <h3>Analysis of the NEES</h3> <p>Most often, one can test the hypothesis by plotting the $$\text{NEES}( \epsilon(k) )$$ together with the confidence interval $$[r_1,r_2]$$ and checking visually whether the NEES lies inside or outside.</p> <p align="center"> <img src="/assets/images/dc_motor/NEES.png" title="Normalized Estimation Error Squared (NEES)" /> <center>Figure 1: Example of ANEES with 4 degrees of freedom.</center> </p> <p>Quantitatively, one counts the number of NEES values that lie inside the confidence interval and sets them in relation to the total number of $$\epsilon(k)$$. Formally speaking, the hypothesis is rejected if</p> <p>$\frac{1}{K} \sum_{ k : \epsilon(k) \in [r_1,r_2]} 1 &lt; (1 - \alpha)$</p> <p>where $$K \in \mathbb{N}^+$$ is the total number of time steps $$\{ 1,2,\dots,K \}$$.</p> <p>If the NEES is greater than the upper bound $$r_2$$, then the filter estimates the error to be lower than it is in reality. Provided that the measurement noise was selected properly, the process noise was chosen too low. If the NEES is smaller than $$r_1$$, then the error is estimated larger than it really is. In this case, the filter is called inefficient. The process noise was chosen too large.
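<p>The test described above can be sketched numerically. The following is a minimal sketch in Python with NumPy/SciPy rather than the site's MATLAB; the function names, state dimension, and numbers are purely illustrative:</p>

```python
import numpy as np
from scipy.stats import chi2

def nees(x_true, x_est, P):
    """Normalized Estimation Error Squared for a single time step."""
    e = x_true - x_est                       # estimation error x_tilde
    return float(e @ np.linalg.solve(P, e))  # e^T P^-1 e

def chi2_interval(alpha, dof):
    """Two-sided (1 - alpha) acceptance interval [r1, r2] via the inverse CDF."""
    return chi2.ppf(alpha / 2, dof), chi2.ppf(1 - alpha / 2, dof)

# Illustrative 2-state example
x_true = np.array([1.0, 0.0])
x_est = np.array([1.1, -0.2])
P = np.diag([0.04, 0.09])            # filter-reported error covariance
eps = nees(x_true, x_est, P)         # about 0.69
r1, r2 = chi2_interval(0.05, dof=2)  # 95% interval, about [0.051, 7.378]
accept_H0 = r1 <= eps <= r2          # True: consistent at this time step
```

<p>Counting the fraction of time steps for which the interval check succeeds then gives the quantitative test described in the text.</p>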
In both cases, an inconsistent state estimator can be expected not to reach the smallest possible estimation error.</p> <h3>Higher statistical significance</h3> <p>To increase statistical significance, $$N$$ additional independent samples of $$\epsilon(k)$$ can be generated and an average NEES can be formed</p> <p>$\text{ANEES}( \epsilon_i(k) ) = \overline{\epsilon} (k) = \frac{1}{N} \sum^N_{i=1} \epsilon_i(k) .$</p> <p>$$\overline{\epsilon}$$ has the property that the expectation converges towards $$n_x$$ and that $$N \overline{\epsilon}$$ is chi-squared distributed with $$N n_x$$ degrees of freedom. The confidence interval must therefore be normalized by $$N$$, i.e. $r_1 = \frac{1}{N} F^{-1} ( \frac{\alpha}{2},N n_x ), \ r_2 = \frac{1}{N} F^{-1} ( 1- \frac{\alpha}{2},N n_x ) .$ The hypothesis test is repeated in the same way as described above.</p> <h3>Conclusion</h3> <p>Obviously, the NEES is only correct in a “truth model simulation”, because the true state and the measurements are generated from the same system model with known statistical properties that the Kalman filter uses. A Kalman filter may therefore pass the NEES hypothesis test within the simulation but fail in a real application. In practice, it is therefore recommended to perform the <a href="/normalized-innovation-squared/">Normalized Innovation Squared (NIS)</a> test instead of or in addition to the NEES, because it does not require access to the true state vector. However, the NEES hypothesis test offers deeper insight because the entire state vector is used, which helps, among other things, in debugging within an application.
This is based on the idea that if the implementation does not work correctly in a simulation with the “truth model”, why should it work in reality?</p>A desired property of a state estimator is that it is able to indicate the quality of the estimate correctly. This ability is called consistency of a state estimator and has a direct impact on the estimation error $$\tilde{\mathbf{x}}(k)$$, i.e. an inconsistent state estimator does not provide an optimal result. Correctly indicating the quality of the estimate matters because an increase in sample size leads to a growth in information content, and the state estimate $$\hat{\mathbf{x}}(k)$$ shall be as close as possible to the true state $$\mathbf{x}(k)$$. This results from the requirement that a state estimator shall be unbiased.
Mathematically speaking, this is expressed by the expected value of the estimation error $$\tilde{\mathbf{x}}(k)$$ being zero $E [ \mathbf{x}(k) - \hat{\mathbf{x}}(k) ] = E [ \tilde{\mathbf{x}}(k) ] \overset{!}{=} 0$Discretization of linear state-space model2022-01-08T00:00:00+00:002022-01-08T00:00:00+00:00https://kalman-filter.com/discretization-of-linear-state-space-model<p>In practice, the discretization of the state-space equations is of particular importance.</p> <p>For systems with more than two degrees of freedom, the computation is not trivial and, due to the matrix algebra involved, presents considerable challenges even to experienced practitioners.</p> <p>The details of the derivation are omitted here; instead, we focus on the practical computation with the help of MATLAB.</p> <p>In general, it is recommended to simplify the matrices in continuous time as much as possible before performing the discretization.</p> <h3>Formulas based on ZOH discretization</h3> <p>$\mathbf{A}_d = e^{\mathbf{A} T_s}$</p> <p>$\mathbf{B}_d = \int^{T_s}_{0} e^{\mathbf{A} \tau} d \tau \mathbf{B}$</p> <p>$\mathbf{G}_d = \int^{T_s}_{0} e^{\mathbf{A} \tau} \mathbf{G} d \tau$</p> <p>$\mathbf{Q}_d = \int^{T_s}_{0} e^{\mathbf{A} \tau} \mathbf{Q} e^{\mathbf{A}^T \tau} d \tau$</p> <p>$\mathbf{R}_d = \mathbf{R} \frac{1}{T_s}$</p> <h3>Computation in MATLAB</h3> <figure class="highlight"><pre><code class="language-matlab" data-lang="matlab">Ad = expm(A*Ts);
syms tau
Bd = vpa(int(expm(A*tau)*B, tau, 0, Ts));
Gd = vpa(int(expm(A*tau)*G, tau, 0, Ts));</code></pre></figure> <h3>An elegant way</h3> <p>A particularly elegant way is to construct an augmented matrix whose matrix exponential yields the same result as described above.</p> <p>$e^{\begin{bmatrix}\mathbf{A} &amp; \mathbf{B}\\ \mathbf{0} &amp; \mathbf{0}\end{bmatrix} \cdot T_s} = \begin{bmatrix}\mathbf{A}_d &amp; \mathbf{B}_d \\ \mathbf{0} &amp; \mathbf{I}\end{bmatrix}$</p> <p>For example, for a system with $$\mathbf{A}_{3 \times 3}$$ and $$\mathbf{B}_{3 \times 1}$$</p> <figure class="highlight"><pre><code class="language-matlab" data-lang="matlab">big = expm([A B; zeros(1,4)]*Ts);
Ad = big(1:3,1:3);
Bd = big(1:3,4);</code></pre></figure>In practice, the discretization of the state-space equations is of particular importance.
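<p>The augmented-matrix discretization can also be verified numerically. Below is a small sketch in Python with NumPy/SciPy, an assumed translation of the MATLAB snippet; the double-integrator model and sampling time are only illustrative:</p>

```python
import numpy as np
from scipy.linalg import expm

# Continuous-time double integrator: state x = [position, velocity]
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Ts = 0.1  # sampling time in seconds

# Augmented-matrix trick: expm([[A, B], [0, 0]] * Ts) = [[Ad, Bd], [0, I]]
n, m = A.shape[0], B.shape[1]
M = np.block([[A, B], [np.zeros((m, n + m))]])
big = expm(M * Ts)
Ad = big[:n, :n]  # zero-order-hold state transition matrix
Bd = big[:n, n:]  # zero-order-hold input matrix
```

<p>For the double integrator this reproduces the well-known closed forms $$\mathbf{A}_d = \begin{bmatrix}1 &amp; T_s\\ 0 &amp; 1\end{bmatrix}$$ and $$\mathbf{B}_d = \begin{bmatrix}T_s^2/2\\ T_s\end{bmatrix}$$; the same matrix-exponential trick extends to computing $$\mathbf{Q}_d$$ (Van Loan's method).</p>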