Prior probability:
The probability of a variable before any evidence is available. In machine learning, it is the initial probability before training, with no samples observed: P(w).
Posterior probability:
The probability of the variable after it has been revised in light of sample data. For example, if g is an observation, the posterior is P(w | g).
Bayes' formula:
Formally, Bayes' formula obtains the posterior from the prior and the likelihood:
P(w | g) = P(w) P(g | w) / P(g)
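Since the denominator P(g) only renormalizes, the update can be read as posterior ∝ prior × likelihood. A minimal R sketch of that one-step update (the function name posterior_of is mine, for illustration):

```r
# One Bayes update for a discrete variable w:
# multiply prior by likelihood, then divide by the
# total probability P(g) so the result sums to 1.
posterior_of <- function(prior, lik) {
  unnorm <- prior * lik    # P(w) * P(g | w) for each value of w
  unnorm / sum(unnorm)     # the sum over w is exactly P(g)
}
```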
A worked example of Bayes' formula in R:
Prior: the machine is in one of two states, working (probability 0.9) or broken (probability 0.1).
Likelihood: in each state, a produced part is either good (g) or broken (b):

           good (g)   broken (b)
working    0.95       0.05
broken     0.70       0.30
Now, given the sequence of outcomes "g","b","g","g","g","g","g","g","g","b","g","b", find the posterior probabilities,
i.e. P(w | g), P(w | b), P(b | g), P(b | b).
For example, P(w | g) = P(w) P(g | w) / P(g),
where the total probability is P(g) = P(g | w)P(w) + P(g | b)P(b).
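As a sanity check, the first update can be computed by hand in R with these numbers; it reproduces the "1 g" row of the output shown further below (0.9243 working, 0.0757 broken):

```r
prior <- c(working = 0.9, broken = 0.1)
lik.g <- c(working = 0.95, broken = 0.7)  # P(g | working), P(g | broken)
p.g   <- sum(lik.g * prior)               # total prob: 0.95*0.9 + 0.7*0.1 = 0.925
post  <- lik.g * prior / p.g              # P(working | g) = 0.855/0.925
post
```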
The R code follows:

########################################################
# Illustration of function bayes to illustrate
# sequential learning in Bayes' rule
########################################################
bayes <- function(prior, likelihood, data)
{
  # one row per update, starting from the prior
  probs <- matrix(NA, nrow = length(data) + 1, ncol = length(prior))
  dimnames(probs) <- list(c("prior", data), names(prior))
  probs[1, ] <- prior
  for(j in 1:length(data))
    probs[j + 1, ] <- probs[j, ] * likelihood[, data[j]] /
      sum(probs[j, ] * likelihood[, data[j]])
  dimnames(probs)[[1]] <- paste(0:length(data), dimnames(probs)[[1]])
  data.frame(probs)
}
# quality control example
# machine is either working or broken with prior probs .9 and .1
prior <- c(working = 0.9, broken = 0.1)
# outcomes are good (g) or broken (b)
# likelihood matrix gives probs of each outcome for each model
like.working <- c(g = 0.95, b = 0.05)
like.broken  <- c(g = 0.7,  b = 0.3)
likelihood <- rbind(working = like.working, broken = like.broken)
# sequence of data outcomes
data <- c("g","b","g","g","g","g","g","g","g","b","g","b")
# function bayes computes the posteriors, one datum at a time
# inputs are the prior vector, likelihood matrix, and vector of data
posterior <- bayes(prior, likelihood, data)
posterior
Output:
working broken
0 prior 0.9000 0.10000
1 g 0.9243 0.07568
2 b 0.6706 0.32941
3 g 0.7342 0.26576
4 g 0.7894 0.21055
5 g 0.8358 0.16424
6 g 0.8735 0.12649
7 g 0.9036 0.09641
8 g 0.9271 0.07289
9 g 0.9452 0.05476
10 b 0.7421 0.25793
11 g 0.7961 0.20389
12 b 0.3942 0.60578