Moving Average/Probability Distribution

Probability Distribution as Distribution of Importance
The definition of expected value provides the mathematical foundation for moving averages in both the discrete and the continuous setting, and the mathematical theory is an application of basic principles of probability theory. Nevertheless, the notion of probability is a bit misleading, because the semantics of a moving average do not refer to the probability of events. The probability must instead be regarded as a distribution of importance. In a time series, for example, less importance is assigned to older data, which does not mean that older data is less likely than recent data. In general, the events that generate the collected data are not considered from a probabilistic perspective.

In a moving average, importance can be defined by proximity to a reference point $$v_0\in V$$, e.g. by
 * proximity in time (old and recent data)
 * proximity in space (see application of the moving average on images above)

To quantify this proximity, a metric or norm on the underlying vector space $$V$$ can be used. A greater distance to the reference point $$v_0\in V$$ leads to less importance, e.g. by
 * $$\displaystyle w_v:=\frac{1}{1+\|v-v_0\|} \le 1$$.

The importance weight is 1 for $$v=v_0$$. With increasing distance, measured by the norm $$\|\cdot\|$$, the weight decreases towards 0. Standardization by $$s(n)$$, the sum of all weights of a discrete moving average (as mentioned for the EMA), leads to the defining property of a probability distribution:
 * $$\sum_{x\in T} p_t(x) = 1$$.
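The weighting and standardization steps can be illustrated with a minimal Python sketch. It assumes the Euclidean norm and one-dimensional time stamps as points of $$V$$. The function names `raw_weight` and `importance_distribution` and the sample data are hypothetical and chosen only for illustration.

```python
import math

def raw_weight(v, v0):
    """Unnormalized importance w_v = 1 / (1 + ||v - v0||), Euclidean norm assumed."""
    return 1.0 / (1.0 + math.dist(v, v0))

def importance_distribution(points, v0):
    """Divide every raw weight by the sum s of all raw weights."""
    raw = [raw_weight(v, v0) for v in points]
    s = sum(raw)  # standardization constant
    return [w / s for w in raw]

# Hypothetical data: five time stamps, reference point is the most recent one.
times = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
v0 = (4.0,)
p = importance_distribution(times, v0)

print(p)       # weights grow towards the reference point
print(sum(p))  # equals 1 up to floating-point rounding
```

The raw weight of the reference point itself is 1, and the most distant time stamp receives the smallest share of the total importance.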

Furthermore, there are other moving averages that incorporate negative weights. The weights then no longer form a probability distribution, and their sum can differ from 1:
 * $$\sum_{x\in T} p_t(x) \not= 1$$.

This can happen when the positive or negative impact $$I(t)\in \R$$ of the collected data $$C(t)$$ is folded into the weight and thereby into the probability mass function. Assigning impact factors of the collected data to the probability/importance values mixes two different properties. This should be avoided, and the impact $$I(t)$$ on $$C(t)$$ should be kept separate for a transparent definition of the moving average, i.e.
 * $$MA(t) := \sum_{k\in T} p_t(k) \cdot I(k) \cdot C(k)$$ with $$\sum_{k\in T} p_t(k) = 1$$.
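As a rough sketch of this separation, the following Python example keeps the importance distribution, the impact factors, and the collected data in three separate lists. The function name `moving_average` and all numbers are hypothetical. The final lines show that folding the impact into the weights produces weights that no longer sum to 1.

```python
def moving_average(p, impact, data, tol=1e-9):
    """MA = sum_k p(k) * I(k) * C(k), where p must sum to 1."""
    if abs(sum(p) - 1.0) > tol:
        raise ValueError("importance weights must sum to 1")
    return sum(pk * ik * ck for pk, ik, ck in zip(p, impact, data))

# Hypothetical example values (not taken from the text).
p = [0.1, 0.2, 0.3, 0.4]          # importance distribution, sums to 1
impact = [1.0, -1.0, 1.0, 1.0]    # impact I(k), may be negative
data = [10.0, 5.0, 8.0, 12.0]     # collected data C(k)

ma = moving_average(p, impact, data)
print(ma)

# Folding the impact into the weights mixes both properties:
mixed = [pk * ik for pk, ik in zip(p, impact)]
print(sum(mixed))  # differs from 1 because one weight became negative
```

Keeping `impact` as its own argument leaves `p` a valid distribution of importance, so the check on the sum of the weights remains meaningful.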