
Markov Measurement Error

Suppose you had a memoryless (Markov) variable that changed over time, $X_t$. However, each day when you checked its value, Gaussian noise ($N_t$) was added, so you instead measured $Y_t$.

If there were no noise, a linear model predicting $Y_t$ from $Y_{t-1}$ and $Y_{t-2}$ would find the coefficient on $Y_{t-2}$ to be zero: by the Markov property, $Y_{t-1}$ already carries everything $Y_{t-2}$ knows about $Y_t$.

However, since there is noise, that coefficient won't be zero. To see this, notice that if noise is responsible for the vast majority of $Y$'s variance, then $Y_{t-1}$ and $Y_{t-2}$ are both noisy reads of nearly the same underlying signal, so adding the $Y_{t-2}$ term effectively doubles your "sample size", which makes your model more accurate. In other words, if $N$ is large, then the coefficients on $Y_{t-1}$ and $Y_{t-2}$ tend towards equality.
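
Here is a minimal simulation sketch of that claim; the parameter values, series length, and use of plain least squares are illustrative assumptions rather than anything fixed by the setup above.

```python
# Illustrative sketch only: alpha, beta, and T are assumed values, chosen so the
# signal is persistent and measurement noise dominates Var(Y).
import numpy as np

rng = np.random.default_rng(0)
T, alpha, beta = 500_000, 0.95, 0.2

# Markov signal: X_t = alpha * X_{t-1} + eps_t, scaled so Var(X_t) = 1.
x = np.zeros(T)
eps = rng.normal(0.0, np.sqrt(1 - alpha**2), size=T)
for t in range(1, T):
    x[t] = alpha * x[t - 1] + eps[t]

# Noisy daily measurement: Y_t = beta * X_t + N_t, scaled so Var(Y_t) = 1.
y = beta * x + rng.normal(0.0, np.sqrt(1 - beta**2), size=T)

# Least-squares regression of Y_t on Y_{t-1} and Y_{t-2}.
design = np.column_stack([y[1:-1], y[:-2]])   # columns: Y_{t-1}, Y_{t-2}
target = y[2:]                                # Y_t
coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
print("coef on Y_{t-1}:", round(coefs[0], 4))
print("coef on Y_{t-2}:", round(coefs[1], 4))  # non-zero, and close to the Y_{t-1} coefficient
```

Setting $\beta = 1$ (no measurement noise) in the same script drives the $Y_{t-2}$ coefficient back to roughly zero.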

How can our model account for measurement error?

Suppose $X$ and $Y$ are standard normal variables and that $X_t = \alpha X_{t-1}$ plus noise while $Y_t = \beta X_t$ plus noise.
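
To make that concrete, here is one parameterization consistent with both series being standard normal (the explicit noise variances are a scaling choice on my part, not something stated above):

$$ X_t = \alpha X_{t-1} + \epsilon_t, \qquad \epsilon_t \sim \mathcal{N}(0,\ 1 - \alpha^2), $$
$$ Y_t = \beta X_t + N_t, \qquad N_t \sim \mathcal{N}(0,\ 1 - \beta^2), $$

so that $Var(X_t) = Var(Y_t) = 1$ and every correlation below is also a covariance.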

If we build a causal graph model, we find that

  • $Cor(Y_{t-2}, Y_{t-1}) = \alpha \beta^2$
  • $Cor(Y_{t-2}, Y_{t}) = \alpha^2 \beta^2$
  • $Cor(Y_{t-1}, Y_{t}) = \alpha \beta^2$
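
To see where these come from: the measurement noises are independent of everything else and both $Y$'s have unit variance, so

$$ Cor(Y_{t-1}, Y_{t}) = Cov(\beta X_{t-1} + N_{t-1},\ \beta X_t + N_t) = \beta^2 \, Cov(X_{t-1}, X_t) = \alpha \beta^2, $$

and $Cor(Y_{t-2}, Y_{t-1})$ is the same quantity shifted by one step. Stepping back one more lag multiplies in another factor of $\alpha$, giving $Cor(Y_{t-2}, Y_{t}) = \alpha^2 \beta^2$.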

Note that this provides a preliminary test of whether the Markov model fits well: check whether $Cor(Y_{t-2}, Y_{t-1}) = Cor(Y_{t-1}, Y_{t})$ holds (approximately) in the data.

Note also that we can estimate $\alpha$ with

$$ \hat{\alpha} = \frac{Cor(Y_{t-2}, Y_{t})}{Cor(Y_{t-2}, Y_{t-1})} $$

Estimating $\beta$ is then straightforward:

$$ \hat{\beta}^2 = \frac{Cor(Y_{t-2}, Y_{t-1})}{\hat{\alpha}} = \frac{Cor(Y_{t-2}, Y_{t-1})^2}{Cor(Y_{t-2}, Y_{t})}$$
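
As a sanity check, a small simulation sketch (the parameter values are again assumptions for illustration) recovers $\alpha$ and $\beta$ from the lagged correlations of $Y$ alone:

```python
# Illustrative check of the estimators; alpha, beta, and T are assumed values.
import numpy as np

rng = np.random.default_rng(1)
T, alpha, beta = 500_000, 0.7, 0.5

# Simulate the latent Markov signal and its noisy measurements (unit variances).
x = np.zeros(T)
eps = rng.normal(0.0, np.sqrt(1 - alpha**2), size=T)
for t in range(1, T):
    x[t] = alpha * x[t - 1] + eps[t]
y = beta * x + rng.normal(0.0, np.sqrt(1 - beta**2), size=T)

# Empirical lag-1 and lag-2 correlations of Y.
r1 = np.corrcoef(y[:-1], y[1:])[0, 1]   # Cor(Y_{t-1}, Y_t), equal to Cor(Y_{t-2}, Y_{t-1}) under stationarity
r2 = np.corrcoef(y[:-2], y[2:])[0, 1]   # Cor(Y_{t-2}, Y_t)

alpha_hat = r2 / r1                     # alpha^2 beta^2 / (alpha beta^2) = alpha
beta_hat = np.sqrt(r1**2 / r2)          # sqrt(beta^2) = beta
print(f"alpha_hat = {alpha_hat:.3f}  (true alpha = {alpha})")
print(f"beta_hat  = {beta_hat:.3f}  (true beta  = {beta})")
```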