User:Akela1101/ML Algorithms Evaluation

Data separation
A common way to check an algorithm is to separate the data into training, validation and test sets (60%, 20%, 20% respectively).
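As a minimal sketch of such a split (the `split_data` helper and the toy arrays are illustrative, not part of the original; NumPy is assumed):

```python
import numpy as np

def split_data(X, y, seed=0):
    """Shuffle and split into 60% train, 20% validation, 20% test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(0.6 * len(X))      # end of the training slice
    n_val = int(0.8 * len(X))        # end of the validation slice
    tr, va, te = idx[:n_train], idx[n_train:n_val], idx[n_val:]
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])

# Toy data: 10 samples, 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)
train, val, test = split_data(X, y)
```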

Given a set of hypotheses $$h_k(x)$$, where $$k \in 1 .. p$$, the strategy is as follows:

 * Find $$w_k\ |\ J_k(w_k) \xrightarrow[w_k]{} \min$$ for each k on the training set.
 * Find the k that gives $$\min_{k}(J_k^v(w_k))$$ on the validation set.
 * If this minimum is large, come up with another set of hypotheses and repeat.
 * Check $$J_k^t(w_k)$$ on the test set.
 * If it is much bigger than $$J_k^v$$, throw away $$h_k(x)$$ and go back to the previous step.
 * If it is about as small as $$J_k^v$$, a solution is found.
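The steps above can be sketched with polynomial hypotheses, where $$h_k(x)$$ is a degree-k polynomial fitted by least squares (the synthetic quadratic data and the helper names are assumptions for illustration):

```python
import numpy as np

def fit_poly(x, y, k):
    """Least-squares fit of a degree-k polynomial: one hypothesis h_k."""
    return np.polyfit(x, y, k)

def mse(w, x, y):
    """Squared-error cost J for a fitted polynomial w."""
    return np.mean((np.polyval(w, x) - y) ** 2)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100)
y = x**2 + 0.05 * rng.normal(size=100)   # true relation is quadratic
x_tr, y_tr = x[:60], y[:60]              # 60% training
x_va, y_va = x[60:80], y[60:80]          # 20% validation
x_te, y_te = x[80:], y[80:]              # 20% test

# Step 1: minimize J_k on the training set for each k.
ws = {k: fit_poly(x_tr, y_tr, k) for k in range(1, 6)}
# Step 2: pick the k with the smallest validation error J_k^v.
best_k = min(ws, key=lambda k: mse(ws[k], x_va, y_va))
# Step 3: check the test error J_k^t of the chosen hypothesis.
test_err = mse(ws[best_k], x_te, y_te)
```

If `test_err` were much larger than the validation error, the loop would restart with $$h_k(x)$$ excluded.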

Regularization parameter
Searching for the optimal $$\lambda$$ is very similar to the strategy above, except that instead of $$h_k(x)$$, we try different values $$\lambda_r \in \{0.01, 0.02, \ldots\}$$, $$r \in 1..u$$.

$$J(\bar{w}) = \frac{1}{2m}\sum_{j = 1}^m \biggl(h(x^j) - y^j\biggr)^2 + \frac{\lambda}{2m}\sum_{i = 1}^n w_i^2 \xrightarrow[w]{} \min$$
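For a linear hypothesis $$h(x) = \bar{w}^T x$$ this cost can be computed directly; the sketch below assumes the bias weight sits in `w[0]` and, matching the sum from i = 1, is excluded from the penalty:

```python
import numpy as np

def ridge_cost(w, X, y, lam):
    """Regularized squared-error cost J(w) for h(x) = X @ w.

    First term: (1/2m) * sum of squared residuals.
    Second term: (lam/2m) * sum of w_i^2 for i >= 1 (bias w[0] not penalized).
    """
    m = len(y)
    residual = X @ w - y
    return (residual @ residual) / (2 * m) + lam * (w[1:] @ w[1:]) / (2 * m)

# A perfect fit has zero data term; only the penalty remains.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])  # first column is the bias
y = np.array([0.0, 1.0, 2.0])
w = np.array([0.0, 1.0])
```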

It may also make sense to check the $$h_k(x)$$ one by one.

For each $$h_k(x)$$, use the strategy to find the $$\lambda_r$$ that gives the least validation error, and then find the k that minimizes that error among the $$h_k(x)$$.
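For a single hypothesis, the inner λ search could be sketched as follows, using closed-form ridge regression as the training step (the synthetic data, the λ grid, and the helper names are assumptions for illustration):

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimize the regularized cost in closed form; bias is not penalized."""
    n = X.shape[1]
    penalty = np.eye(n)
    penalty[0, 0] = 0.0   # exclude the bias weight from regularization
    return np.linalg.solve(X.T @ X + lam * penalty, X.T @ y)

def val_error(w, X, y):
    """Unregularized validation error J^v."""
    return np.mean((X @ w - y) ** 2)

rng = np.random.default_rng(0)
X = np.c_[np.ones(80), rng.normal(size=(80, 5))]   # bias column + 5 features
true_w = np.array([1.0, 2.0, 0.0, 0.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=80)
X_tr, y_tr, X_va, y_va = X[:60], y[:60], X[60:], y[60:]

lambdas = [0.01, 0.02, 0.04, 0.08, 0.16, 0.32]
# For each lambda_r: train on the training set, score on the validation set,
# and keep the lambda_r with the smallest validation error.
best_lam = min(lambdas,
               key=lambda lam: val_error(fit_ridge(X_tr, y_tr, lam), X_va, y_va))
best_w = fit_ridge(X_tr, y_tr, best_lam)
```

Running this per hypothesis $$h_k(x)$$ and then comparing the resulting validation errors across k completes the outer loop described above.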