
Laplace Mixture Model (LMM)


Author: 桂

Date: 2017-03-21 07:25:17

Link: http://www.cnblogs.com/xingshansi/p/6592599.html

Note: reposting is welcome, but please remember to credit the source.

 

Preface

This article is part of a series on curve fitting and distribution fitting; it covers the mixed Laplace distribution (Laplace Mixture Model, LMM). The Laplace distribution is one of the commonly used statistical models. There are plenty of online examples for the Gaussian mixture model (GMM), but very few LMM implementations. In fact, any mixture model can be derived with the EM algorithm; only the closed-form updates differ slightly. The article covers:

  1) LMM derivation;

  2) LMM implementation.

Much of the content draws on others' work; the sources are listed at the end.

 

I. LMM Derivation

  A - The model

A single Laplace distribution has density:

$f(Y) = \frac{1}{2b} e^{-\frac{\left| Y - \mu \right|}{b}}$

A mixture of $K$ such components is:

$P\left( Y_j | \Theta \right) = \sum\limits_{k = 1}^K w_k f\left( Y_j | \mu_k, b_k \right)$
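As a quick numerical sanity check, the mixture density can be evaluated directly. A minimal NumPy sketch (the function name is mine; the parameter values match the demo script later in the post):

```python
import numpy as np

def laplace_mixture_pdf(y, w, mu, b):
    """Evaluate P(y|Theta) = sum_k w_k * (1/(2*b_k)) * exp(-|y - mu_k| / b_k)."""
    y = np.asarray(y, dtype=float)[..., None]          # shape (..., 1)
    comp = np.exp(-np.abs(y - mu) / b) / (2.0 * b)     # per-component densities
    return (comp * w).sum(axis=-1)                     # weighted sum over k

# Two-component example; the weights must sum to 1
w  = np.array([0.7, 0.3])
mu = np.array([3.0, -4.0])
b  = np.array([0.9, 0.4])
p  = laplace_mixture_pdf([3.0, -4.0, 0.0], w, mu, b)
```

At a component mean the density is dominated by that component's peak, $w_k / (2 b_k)$.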

How do we fit this? Below, EM is used to derive the iteration formulas; only the case of one-dimensional $Y$ is analyzed, and other cases follow analogously. (A result figure is shown at this point in the original post.)

  B - EM derivation

E-Step:

1) Introduce the latent variables and form the complete data set

As in the GMM derivation, the law of total probability gives the posterior of the latent assignment:

$P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right) = \frac{w_k^{\left( i \right)} f\left( Y_j | \mu_k^{\left( i \right)}, b_k^{\left( i \right)} \right)}{\sum\limits_{l = 1}^K w_l^{\left( i \right)} f\left( Y_j | \mu_l^{\left( i \right)}, b_l^{\left( i \right)} \right)}$
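As in the GMM case, the E-step amounts to computing these posterior responsibilities. A NumPy sketch (variable and function names are mine):

```python
import numpy as np

def responsibilities(y, w, mu, b):
    """E-step: gamma[j, k] = P(Z_j in Upsilon_k | Y_j, Theta)."""
    y = np.asarray(y, dtype=float)[:, None]            # (N, 1)
    comp = np.exp(-np.abs(y - mu) / b) / (2.0 * b)     # (N, K) component densities
    num = w * comp                                     # weighted densities
    return num / num.sum(axis=1, keepdims=True)        # normalize over k

gamma = responsibilities(np.array([3.0, -4.0, 0.5]),
                         np.array([0.7, 0.3]),
                         np.array([3.0, -4.0]),
                         np.array([0.9, 0.4]))
```

Each row of `gamma` sums to one, and a sample near a component mean is assigned almost entirely to that component.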

2) Construct the Q function

Following the earlier discussion of the Gaussian mixture model (GMM), the Q function of a mixture model under EM can be written as:

$Q\left( \Theta, \Theta^{\left( i \right)} \right) = \sum\limits_{j = 1}^N \sum\limits_{k = 1}^K \log \left( w_k \right) P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right) + \sum\limits_{j = 1}^N \sum\limits_{k = 1}^K \log \left( f_k\left( Y_j | Z_j \in \Upsilon_k, \theta_k \right) \right) P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right)$

where $\theta_k = [\mu_k, b_k]$ are the parameters of component $k$, $\Theta = \{\theta_1, \theta_2, \ldots, \theta_K\}$ is the full parameter set, $N$ is the number of samples, and $K$ is the number of mixture components.

M-Step:

1) Parameter estimation by MLE

First, optimize the weights $w_k$.

Since $\sum\limits_{k = 1}^K w_k = 1$, solve with a Lagrange multiplier:

$J_w = \sum\limits_{j = 1}^N \sum\limits_{k = 1}^K \left[ \log \left( w_k \right) P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right) \right] + \lambda \left[ \sum\limits_{k = 1}^K w_k - 1 \right]$

Taking the partial derivative:

$\frac{\partial J_w}{\partial w_k} = \sum\limits_{j = 1}^N \left[ \frac{1}{w_k} P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right) \right] + \lambda = 0$

Summing over $k$ and applying the constraint gives $\lambda = -N$, hence:

$w_k^{\left( i + 1 \right)} = \frac{1}{N} \sum\limits_{j = 1}^N P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right)$
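Given the $N \times K$ responsibility matrix from the E-step, the weight update is a column average. A one-line sketch (the helper name is mine):

```python
import numpy as np

def update_weights(gamma):
    """M-step for the mixture weights: w_k = (1/N) * sum_j gamma[j, k]."""
    return gamma.mean(axis=0)

# Toy responsibilities for N = 4 samples, K = 2 components
gamma = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.1, 0.9],
                  [0.2, 0.8]])
w = update_weights(gamma)
```

Because each row of the responsibility matrix sums to one, the updated weights automatically sum to one as well.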

Next, optimize each component's internal parameters $\theta_k$.

The criterion function is:

$J_\Theta = \sum\limits_{j = 1}^N \sum\limits_{k = 1}^K \log \left( f_k\left( Y_j | Z_j \in \Upsilon_k, \theta_k \right) \right) P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right)$

We only discuss one-dimensional $Y_j$; other cases follow analogously. For the Laplace distribution,

$\log f_k\left( Y_j | \theta_k \right) = -\log \left( 2 b_k \right) - \frac{\left| Y_j - \mu_k \right|}{b_k}$

and $\theta_k$ is then obtained by MLE.

First, the update for $b_k$: setting $\partial J_\Theta / \partial b_k = 0$ gives

$b_k^{\left( i + 1 \right)} = \frac{\sum\limits_{j = 1}^N \left| Y_j - \mu_k \right| P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right)}{\sum\limits_{j = 1}^N P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right)}$

Because the term in $\mu_k$ involves an absolute value, a small trick is needed. Differentiating $J_\Theta$ with respect to $\mu_k$ (using $\partial \left| x \right| / \partial x = x / \left| x \right|$) gives:

$\frac{\partial J_\Theta}{\partial \mu_k} = \sum\limits_{j = 1}^N \frac{Y_j - \mu_k}{\left| Y_j - \mu_k \right|} \frac{1}{b_k} P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right) = 0$

The resulting estimate of $\mu_k$ is:

$\mu_k^{\left( i + 1 \right)} = \hat{\mu}_k$

In the final stage of the iteration, the parameters at step $i$ and step $i+1$ can be regarded as approximately equal, so the $\left| Y_j - \mu_k \right|$ in the derivative can be evaluated at the current estimate $\mu_k^{\left( i \right)}$. Solving the resulting linear condition for $\mu_k$ gives the iteration formula:

$\mu_k^{\left( i + 1 \right)} = \frac{\sum\limits_{j = 1}^N \frac{Y_j}{\left| Y_j - \mu_k^{\left( i \right)} \right|} P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right)}{\sum\limits_{j = 1}^N \frac{1}{\left| Y_j - \mu_k^{\left( i \right)} \right|} P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right)}$
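Combining the E-step with the M-step updates for $w_k$, $\mu_k$, and $b_k$, one full EM pass can be sketched in NumPy. This is a sketch, not the author's MATLAB implementation; names are mine, and a small epsilon guards the divisions by $\left| Y_j - \mu_k^{(i)} \right|$:

```python
import numpy as np

def em_step(y, w, mu, b, eps=1e-6):
    """One EM iteration for a 1-D Laplace mixture model."""
    y = np.asarray(y, dtype=float)[:, None]            # (N, 1)
    # E-step: responsibilities gamma[j, k]
    comp = np.exp(-np.abs(y - mu) / b) / (2.0 * b)
    num = w * comp
    gamma = num / (num.sum(axis=1, keepdims=True) + eps)
    # M-step: mu first (reweighted by |Y_j - mu_k^(i)|), then b, then w
    r = np.abs(y - mu) + eps                           # guarded |Y_j - mu_k^(i)|
    mu_new = (y / r * gamma).sum(axis=0) / ((gamma / r).sum(axis=0) + eps)
    b_new = (np.abs(y - mu_new) * gamma).sum(axis=0) / (gamma.sum(axis=0) + eps)
    w_new = gamma.mean(axis=0)
    return w_new, mu_new, b_new

# Fit a two-component LMM to synthetic data (same targets as the demo below)
rng = np.random.default_rng(0)
data = np.concatenate([rng.laplace(3.0, 0.9, 7000),
                       rng.laplace(-4.0, 0.4, 3000)])
w, mu, b = np.array([0.5, 0.5]), np.array([1.0, -1.0]), np.array([1.0, 1.0])
for _ in range(200):
    w, mu, b = em_step(data, w, mu, b)
```

With well-separated components the estimates settle close to the generating parameters $w = (0.7, 0.3)$, $\mu = (3, -4)$, $b = (0.9, 0.4)$.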

To summarize, LMM fitting alternates the two steps:

E-Step: compute the responsibilities $P\left( Z_j \in \Upsilon_k | Y_j, \Theta^{\left( i \right)} \right)$ for all $j$, $k$.

M-Step: update $w_k$, $b_k$, and $\mu_k$ with the three formulas derived above.

 

II. LMM Implementation

Starting from the GMM code in the previous post, changing a few lines gives the LMM:

```matlab
function [u,b,t,iter] = fit_mix_laplace( X,M )
%
% fit_mix_laplace - fit parameters for a mixed-laplacian distribution using EM algorithm
%
% format:   [u,b,t,iter] = fit_mix_laplace( X,M )
%
% input:    X    - input samples, Nx1 vector
%           M    - number of laplacians which are assumed to compose the distribution
%
% output:   u    - fitted mean for each laplacian
%           b    - fitted scale parameter for each laplacian
%           t    - probability of each laplacian in the complete distribution
%           iter - number of iterations done by the function
%
N = length( X );
Z = ones(N,M) * 1/M;              % indicators vector
P = zeros(N,M);                   % probabilities vector for each sample and each model
t = ones(1,M) * 1/M;              % distribution of the laplacian models in the samples
u = linspace(0.2,1.4,M);          % mean vector
b = ones(1,M) * var(X) / sqrt(M); % scale vector
C = 1/sqrt(2*pi);                 % just a constant (unused leftover from the GMM code)
Ic = ones(N,1);                   % enable a row replication by the * operator
Ir = ones(1,M);                   % enable a column replication by the * operator
Q = zeros(N,M);                   % used to determine when we have converged to a steady solution
thresh = 1e-7;
step = N;
last_step = 300;
iter = 0;
min_iter = 3000;

while ((( abs((step/last_step)-1) > thresh) & (step>(N*1e-10)) ) & (iter<min_iter) )

    % E step
    % ========
    Q = Z;
    P = 1./(Ic*b) .* exp( -(1e-6+abs(X*Ir - Ic*u))./(Ic*b) );
    for m = 1:M
        Z(:,m) = (P(:,m)*t(m))./(P*t(:));
    end

    % estimate convergence step size and update iteration number
    prog_text = sprintf(repmat( '\b',1,(iter>0)*12+ceil(log10(iter+1)) ));
    iter = iter + 1;
    last_step = step * (1 + eps) + eps;
    step = sum(sum(abs(Q-Z)));
    fprintf( '%s%d iterations\n',prog_text,iter );

    % M step
    % ========
    Zm = sum(Z);                  % sum each column
    Zm(find(Zm==0)) = eps;        % avoid division by zero
    u = sum(((X*Ir)./abs(X*Ir - Ic*u)).*Z) ./ sum((1./abs(X*Ir - Ic*u)).*Z);
    b = sum((abs(X*Ir - Ic*u)).*Z) ./ Zm;
    t = Zm/N;
end
end
```

The fitting script for the statistical distribution shown above:

```matlab
clc; clear all;

% generate random samples from a two-component Laplace mixture
% by rejection sampling
xmin = -10; xmax = 10;
Len = 10000000;
x = linspace(xmin,xmax,Len);
mu = [3,-4];
b = [0.9 0.4];
w = [0.7 0.3];
fx = w(1)/2/b(1)*exp(-abs(x-mu(1))/b(1)) + w(2)/2/b(2)*exp(-abs(x-mu(2))/b(2));
ymax = 1/b(2);
ymin = 0;
Y = (ymax-ymin)*rand(1,Len)-ymin;
data = x(Y<=fx);

% Laplace Mixture Model fitting
K = 2;
[mu_new,b_new,w_new,iter] = fit_mix_laplace( data',K);

% figures
subplot 221
hist(data,2000); grid on;
subplot 222
numter = [xmin:.2:xmax];
plot(numter,w_new(1)/2/b_new(1)*exp(-abs(numter-mu_new(1))/b_new(1)),'r','linewidth',2); hold on;
plot(numter,w_new(2)/2/b_new(2)*exp(-abs(numter-mu_new(2))/b_new(2)),'g','linewidth',2); hold on;
subplot (2,2,[3,4])
[histFreq, histXout] = hist(data, numter);
binWidth = histXout(2)-histXout(1);
% bar plot of the normalized histogram with the fitted components overlaid
bar(histXout, histFreq/binWidth/sum(histFreq)); hold on; grid on;
plot(numter,w_new(1)/2/b_new(1)*exp(-abs(numter-mu_new(1))/b_new(1)),'r','linewidth',2); hold on;
plot(numter,w_new(2)/2/b_new(2)*exp(-abs(numter-mu_new(2))/b_new(2)),'g','linewidth',2);
```

The corresponding result figure (the same one referenced above) is shown in the original post.

References

Mitianoudis N, Stathaki T. Batch and online underdetermined source separation using Laplacian mixture models[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(6): 1818-1832.
