A Kalman Filter is a tool for incorporating noisy measurements over time to estimate the true state of a dynamic system. It is widely used in control systems, robotics, and signal processing.
An Underlying Dynamical System Model
A great way to understand the Kalman Filter is to think of it as a hidden Markov model where the hidden states evolve over time according to a linear dynamical system, and the observations are noisy measurements of these states.

Figure: Model underlying the Kalman filter. Squares represent matrices; ellipses represent multivariate normal distributions (with the mean and covariance matrix enclosed); unenclosed values are vectors. In the simple case the various matrices are constant with time, so the subscripts can be dropped, but Kalman filtering allows any of them to change at each time step.
The hidden state and the observations evolve according to
$$x_k = F_k x_{k-1} + B_k u_k + w_k, \qquad w_k \sim \mathcal{N}(0, Q_k)$$
$$z_k = H_k x_k + v_k, \qquad v_k \sim \mathcal{N}(0, R_k)$$
Where:
- $F_k$ is the state transition model
- $H_k$ is the observation model
- $Q_k$ is the covariance of the process noise $w_k$
- $R_k$ is the covariance of the observation noise $v_k$
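As a sketch of this generative model in NumPy (the function name and arguments are illustrative, not from the text, and the control term $B_k u_k$ is omitted for brevity), one can sample a state trajectory together with its noisy observations:

```python
import numpy as np

def simulate(F, H, Q, R, x0, steps, seed=0):
    """Sample a trajectory from the linear-Gaussian model
    x_k = F x_{k-1} + w_k,   z_k = H x_k + v_k."""
    rng = np.random.default_rng(seed)
    x, xs, zs = x0, [], []
    for _ in range(steps):
        x = F @ x + rng.multivariate_normal(np.zeros(len(x)), Q)      # process noise w_k ~ N(0, Q)
        z = H @ x + rng.multivariate_normal(np.zeros(H.shape[0]), R)  # observation noise v_k ~ N(0, R)
        xs.append(x)
        zs.append(z)
    return np.array(xs), np.array(zs)
```

The filter described next only ever sees the sampled measurements, not the hidden states themselves.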
The Kalman Filter Algorithm

The Kalman Filter operates in two stages: predict and update. At each time step $k$, we maintain an estimate of the state $\hat{x}_{k|k}$ and its uncertainty (covariance) $P_{k|k}$.
Prediction Step
First, we predict the next state based on the system dynamics:
$$\hat{x}_{k|k-1} = F_k \hat{x}_{k-1|k-1} + B_k u_k$$
$$P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k$$
Where:
- $\hat{x}_{k|k-1}$ is the a priori state estimate at step $k$ given observations up to step $k-1$
- $P_{k|k-1}$ is the a priori estimate covariance
- $B_k$ is the control input model (often omitted if there is no control input $u_k$)
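A minimal sketch of this step in NumPy (assuming `F`, `Q`, and optionally `B`, `u` are arrays of compatible shapes; the function name is mine):

```python
import numpy as np

def predict(x, P, F, Q, B=None, u=None):
    """Kalman prediction: propagate the state estimate and its covariance
    through the linear dynamics."""
    x_pred = F @ x
    if B is not None and u is not None:
        x_pred = x_pred + B @ u       # optional control input
    P_pred = F @ P @ F.T + Q          # uncertainty grows by the process noise
    return x_pred, P_pred
```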
Update Step
When we receive a measurement $z_k$, we update our estimate.
First, compute the Kalman gain:
$$K_k = P_{k|k-1} H_k^T \left(H_k P_{k|k-1} H_k^T + R_k\right)^{-1}$$
Then update the state estimate:
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \left(z_k - H_k \hat{x}_{k|k-1}\right)$$
And the covariance:
$$P_{k|k} = \left(I - K_k H_k\right) P_{k|k-1}$$
Where:
- $K_k$ is the Kalman gain, determining how much to trust the measurement
- $\tilde{y}_k = z_k - H_k \hat{x}_{k|k-1}$ is the innovation or measurement residual
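And a matching sketch for the update step (again NumPy; `H` and `R` are the observation model and measurement noise covariance defined above):

```python
import numpy as np

def update(x_pred, P_pred, z, H, R):
    """Kalman update: correct the prediction using the measurement z."""
    y = z - H @ x_pred                                   # innovation (measurement residual)
    S = H @ P_pred @ H.T + R                             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)                  # Kalman gain
    x_new = x_pred + K @ y                               # a posteriori state estimate
    P_new = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred   # a posteriori covariance
    return x_new, P_new
```

In practice the Joseph form of the covariance update is often preferred for numerical stability, but the simple form above matches the equations as written.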
Mathematical Intuition
Bayesian Perspective
The Kalman Filter is the optimal Bayesian estimator for linear-Gaussian systems. At each step, we’re computing:
- Prior: $p(x_k \mid z_{1:k-1})$ comes from the prediction step
- Likelihood: $p(z_k \mid x_k)$ is given by the observation model
- Posterior: $p(x_k \mid z_{1:k})$ is our updated belief after seeing $z_k$
Because the prior and the likelihood are both Gaussian, their product is also Gaussian, and the posterior can be computed in closed form.
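To see this concretely in one dimension, suppose (purely for illustration) the prior is $\mathcal{N}(\mu, \sigma^2)$ and the measurement $z$ has noise variance $r$. Multiplying the two Gaussians gives a Gaussian posterior whose mean is a precision-weighted average:
$$\mu_{\text{post}} = \frac{\sigma^{-2}\mu + r^{-1}z}{\sigma^{-2} + r^{-1}} = \mu + \underbrace{\frac{\sigma^2}{\sigma^2 + r}}_{K}\,(z - \mu), \qquad \sigma^2_{\text{post}} = (1 - K)\,\sigma^2$$
This is exactly the scalar form of the Kalman gain and update.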
The Kalman Gain Intuition
The Kalman gain balances trust between our prediction and the measurement:
- If measurement noise is small ($R_k$ small): $K_k$ is large → trust the measurement more
- If prediction uncertainty is small ($P_{k|k-1}$ small): $K_k$ is small → trust the prediction more
The update equation can be read as:
“Start with our prediction, then correct it proportionally to how surprising the measurement was.”
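A quick scalar sanity check of these two limits, for a 1D state observed directly (so $H = 1$ and the gain reduces to $K = P/(P + R)$; the numbers below are arbitrary):

```python
def scalar_gain(P, R):
    """Scalar Kalman gain for a 1D state observed directly (H = 1)."""
    return P / (P + R)

print(scalar_gain(P=1.0, R=0.01))  # small R -> gain near 1: trust the measurement
print(scalar_gain(P=0.01, R=1.0))  # small P -> gain near 0: trust the prediction
```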
Optimality
The Kalman Filter minimizes the mean squared error of the state estimate. It is provably optimal for linear systems with Gaussian noise. The covariance matrix $P_{k|k}$ tracks our uncertainty, which typically decreases as we incorporate more measurements.
1D Example: Tracking Position
Consider a simple scenario: estimating the position of an object moving at constant velocity, given noisy position measurements.
Setup
State: $x = \begin{bmatrix} p \\ v \end{bmatrix}$ (position and velocity)
Dynamics (constant velocity):
$$x_k = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix} x_{k-1} + w_k$$
where $w_k$ represents process noise.
Observation (we only measure position):
$$z_k = \begin{bmatrix} 1 & 0 \end{bmatrix} x_k + v_k$$
where $v_k$ is measurement noise.
Numerical Example
Let’s say:
- $\Delta t = 1$ second
- Process noise: small relative to the measurement noise
- Measurement noise: large (quite noisy!)
- True trajectory: object starts at position 0, moving at 1 m/s
Initial estimate: an arbitrary $\hat{x}_{0|0}$ with a large covariance $P_{0|0}$ (very uncertain)
Time step 1:
Predict:
$$\hat{x}_{1|0} = F \hat{x}_{0|0}, \qquad P_{1|0} = F P_{0|0} F^T + Q$$
Because $P_{0|0}$ is large, the predicted covariance $P_{1|0}$ is also large: before the first measurement we know very little.
Suppose we measure $z_1$ (the true position is 1, but the measurement is noisy).
Update:
- Innovation: $\tilde{y}_1 = z_1 - H \hat{x}_{1|0}$
- Innovation covariance: $S_1 = H P_{1|0} H^T + R$
- Kalman gain: $K_1 = P_{1|0} H^T S_1^{-1}$
- Updated state: $\hat{x}_{1|1} = \hat{x}_{1|0} + K_1 \tilde{y}_1$
Notice: because the dynamics couple position and velocity, $P_{1|0}$ has off-diagonal terms, so the gain $K_1$ updates both components; the filter not only corrects the position estimate but also infers the velocity!
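Putting the pieces together, here is a runnable sketch of this example. The time step and true trajectory follow the setup above, but the specific noise covariances, the initial covariance, and the random seed are assumptions chosen for illustration, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

dt = 1.0                               # 1 second time step
F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity dynamics
H = np.array([[1.0, 0.0]])             # we only measure position
Q = 0.01 * np.eye(2)                   # process noise covariance (assumed small)
R = np.array([[1.0]])                  # measurement noise covariance (assumed, quite noisy)

x = np.array([0.0, 0.0])               # initial state estimate (assumed)
P = 1000.0 * np.eye(2)                 # very uncertain initial covariance (assumed)
true_x = np.array([0.0, 1.0])          # true state: starts at 0 m, moving at 1 m/s

for k in range(10):
    # Simulate the true system and a noisy position measurement.
    true_x = F @ true_x
    z = H @ true_x + rng.normal(0.0, np.sqrt(R[0, 0]), size=1)

    # Predict.
    x = F @ x
    P = F @ P @ F.T + Q

    # Update.
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P

    print(f"step {k + 1}: position ~ {x[0]:+.2f} m, velocity ~ {x[1]:+.2f} m/s")
```

Running this prints the evolving position and velocity estimates; the velocity is inferred purely from the pattern of position measurements.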
Visualization Interpretation
Over time, the Kalman Filter produces:
- Smoother trajectory: The filtered estimate is less noisy than raw measurements
- Decreasing uncertainty: The covariance shrinks as more measurements arrive
- Velocity inference: Even though we only measure position, the filter estimates velocity from the pattern of position changes
The filter balances:
- Physical model (constant velocity) vs. measurements
- High measurement noise (large $R$) means trusting the model more
- As uncertainty decreases, the Kalman gain decreases, and updates become more conservative
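A short plotting sketch (assuming matplotlib is available, with the same assumed noise values as the code above) makes these effects visible:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)

dt, steps = 1.0, 30
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[1.0]])                  # large measurement noise (assumed)

x, P = np.array([0.0, 0.0]), 1000.0 * np.eye(2)
true_x = np.array([0.0, 1.0])
truth, meas, est = [], [], []

for _ in range(steps):
    true_x = F @ true_x
    z = H @ true_x + rng.normal(0.0, 1.0, size=1)

    # Same predict/update equations as above.
    x = F @ x
    P = F @ P @ F.T + Q
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(2) - K @ H) @ P

    truth.append(true_x[0])
    meas.append(z[0])
    est.append(x[0])

plt.plot(truth, label="true position")
plt.plot(meas, ".", label="noisy measurements")
plt.plot(est, label="Kalman estimate")
plt.xlabel("time step")
plt.ylabel("position (m)")
plt.legend()
plt.show()
```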
This is why Kalman Filters are so powerful: they optimally fuse noisy data with a model of system dynamics to produce the best possible estimate.