Least squares for regression equations: steps of a partial least squares regression analysis

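Preface

This blog records the concepts a beginner runs into when learning ordinary least squares (OLS) regression, together with the process of working through the problems.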


Ordinary least squares regression

Regression on existing data

Dataset: cal_housing.csv

Description: information from the 1990 census on all block groups in California, USA, covering 9 variables.

Variable                         Coefficient    t-value
Intercept                        11.4939        275.7518
MEDIAN INCOME                    0.4790         45.7768
MEDIAN INCOME^2                  -0.0166        …
…
ln(totalRooms/population)        -0.8582        -56.1280
ln(be…)                          0.8043         38.0685
ln(population/household)         …              …
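The regressors in the table are transformations of the raw census columns. A minimal sketch of building such a design matrix with NumPy is given here. The numbers are made up for illustration, `build_design_matrix` is a hypothetical helper, and reading the truncated row as ln(totalBedrooms/population) is an assumption.

```python
import numpy as np

# Hypothetical raw values for three block groups (made up for illustration).
median_income = np.array([8.3, 3.1, 5.6])
total_rooms = np.array([880.0, 1500.0, 2400.0])
total_bedrooms = np.array([129.0, 310.0, 480.0])
population = np.array([322.0, 900.0, 1200.0])
households = np.array([126.0, 300.0, 450.0])


def build_design_matrix(income, rooms, bedrooms, pop, hh):
    """Stack an intercept column and the transformed regressors column-wise."""
    return np.column_stack([
        np.ones_like(income),     # intercept
        income,                   # MEDIAN INCOME
        income ** 2,              # MEDIAN INCOME^2
        np.log(rooms / pop),      # ln(totalRooms / population)
        np.log(bedrooms / pop),   # ln(totalBedrooms / population), assumed
        np.log(pop / hh),         # ln(population / households)
    ])


design = build_design_matrix(median_income, total_rooms, total_bedrooms,
                             population, households)
print(design.shape)
```

Each row of `design` is one block group and each column is one regressor, which is the layout a least-squares routine expects.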

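    import pandas as pd
    import numpy as np

    data = pd.read_csv('cal_housing.csv')
    name = data.columns
    X = data[name[0:8]]   # first 8 columns: the independent variables
    y = data[name[8:9]]   # 9th column: the dependent variable
    print(data.shape, X.shape, y.shape)   # dimensions of each matrix
    print('X name:', X.columns)
    print('y name:', y.columns)

Output (partly lost in the source):

    X name: Index([…, u'totalBedrooms', u'population', u'households', u'medianIncome'], dtype='object')
    y name: …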

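Development environment: PyCharm 2018.1.2, Python 2.7.14 :: Anaconda, Inc.

Split the data into a training set and a test set. You choose the test-set proportion yourself; it should be less than 0.5 (less than 50%).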

    from sklearn.model_selection import train_test_split

    seed = 8888        # random seed
    proportion = 0.1   # proportion of the data held out as the test set
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=proportion, random_state=seed)
    print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
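What a seeded random split does can be sketched with NumPy alone. This is an illustration and not scikit-learn's implementation: `split_indices` is a hypothetical helper, the sample count of 200 is made up, and rounding the test size to the nearest integer is a simplification.

```python
import numpy as np


def split_indices(n_samples, test_proportion, seed):
    """Return (train_idx, test_idx) after a seeded shuffle of 0..n_samples-1."""
    rng = np.random.RandomState(seed)
    shuffled = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_proportion))
    # The first n_test shuffled indices form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


train_idx, test_idx = split_indices(n_samples=200, test_proportion=0.1, seed=8888)
print(len(train_idx), len(test_idx))
```

Fixing the seed makes the split reproducible, so a later run evaluates the model on the same held-out rows.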

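    from sklearn.linear_model import LinearRegression

    reg = LinearRegression()            # linear regression
    res = reg.fit(X_train, y_train)     # train on the training set X_train, y_train
    y_hat = res.predict(X_test)         # predict for X_test with the fitted estimator
    SSE_cv = np.mean((y_test - y_hat) ** 2)               # mean of the squared residuals on the test set
    SSE_test = np.mean((y_test - np.mean(y_test)) ** 2)   # mean squared deviation of y_test from its mean
    NMSE_cv = SSE_cv / SSE_test         # normalized mean squared error
    R2_cv = 1 - NMSE_cv                 # coefficient of determination
    print(R2_cv)

Output (the row label is truncated in the source):

    …    0.342814
    dtype: float64

Regression on simulated data

Choose the sample size (n), the number of independent variables (p), and the coefficient values (b) yourself, and likewise the mean m and standard deviation s of the normally distributed error.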

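    import numpy as np

    seed = 8888                      # random seed
    n = 100                          # sample size
    p = 7                            # number of independent variables
    m = 0                            # mean of the error term
    s = 5                            # standard deviation of the error term
    b = [2, 5, 16, 9, -3, -5, -2]    # beta values

    np.random.seed(seed)
    # The call that generates X is garbled in the source;
    # standard-normal draws of shape (n, p) are assumed here.
    X = np.random.normal(0, 1, (n, p))
    y = X.dot(b) + np.random.normal(m, s, n)
    print(X.shape, y.shape)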
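With simulated data the true coefficients are known, so the least-squares estimate can be checked against them directly. The sketch below solves the problem two ways with NumPy, using `np.linalg.lstsq` and the normal equations (X'X)b = X'y. The names `X_sim`, `y_sim`, `b_lstsq` and `b_normal` are introduced here for illustration, and the standard-normal regressors are an assumption.

```python
import numpy as np

rng = np.random.RandomState(8888)
n, p = 100, 7
b = np.array([2, 5, 16, 9, -3, -5, -2], dtype=float)

# Assumption: regressors drawn from a standard normal distribution.
X_sim = rng.normal(0.0, 1.0, size=(n, p))
y_sim = X_sim.dot(b) + rng.normal(0.0, 5.0, size=n)

# Least-squares solution of min ||y - X b||^2 (no intercept column).
b_lstsq = np.linalg.lstsq(X_sim, y_sim, rcond=None)[0]

# The same estimate from the normal equations (X'X) b = X'y.
b_normal = np.linalg.solve(X_sim.T.dot(X_sim), X_sim.T.dot(y_sim))

print(np.round(b_lstsq, 3))
print(np.round(b_lstsq - b, 3))   # estimation error per coefficient
```

Both routes give the same estimate here because X has full column rank; `lstsq` is usually the safer choice numerically when the columns are close to collinear.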

    import statsmodels.api as sm

    mod = sm.OLS(y, X)       # ordinary least squares model, without an intercept term
    res = mod.fit()
    print(res.rsquared)      # R-squared
    print(res.summary())

Output (values that are lost in the source are shown as …):

    OLS Regression Results
    Dep. Variable: y             R-squared: 0.926
    Model: OLS                   Adj. R-squared: 0.920
    Method: Least Squares        F-statistic: 165.4
    Date: Mon, 07 May …          Prob (F-statistic): 1.32e-49
    Time: 09:54:25               Log-Likelihood: -304.71
    No. Observations: 100        AIC: 623.4
    Df Residuals: 93             BIC: …

          coef       std err   t         P>|t|   95% CI lower   95% CI upper
    …     …          …         …         …       …              6.597
    x3    15.7619    0.596     26.444    0.000   14.578         16.946
    x4    8.9595     0.521     17.181    0.000   7.924          9.995
    x5    -3.…       …         …         …       …              …
    …     …          …         …         0.000   -5.968         -4.019
    x7    -2.0126    0.536     -3.754    0.000   -3.077         -0.948

    Jarque-Bera (JB): 0.227      Skew: -0.078
    Prob(JB): 0.893              Kurtosis: 3.174
    Cond. No.: 1.51
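The main columns of the summary can be reproduced by hand, which helps in reading them. The sketch below computes the coefficients, standard errors, t-values and R-squared with NumPy for a model without an intercept. The data are regenerated under the assumption of standard-normal regressors, so the numbers will not match the table above. As far as I know, statsmodels uses the uncentered total sum of squares for R-squared when the model has no constant, and the sketch follows that convention.

```python
import numpy as np

rng = np.random.RandomState(8888)
n, p = 100, 7
b = np.array([2, 5, 16, 9, -3, -5, -2], dtype=float)
X_sim = rng.normal(0.0, 1.0, size=(n, p))        # assumed regressor distribution
y_sim = X_sim.dot(b) + rng.normal(0.0, 5.0, size=n)

xtx_inv = np.linalg.inv(X_sim.T.dot(X_sim))
coef = xtx_inv.dot(X_sim.T).dot(y_sim)           # OLS estimate
resid = y_sim - X_sim.dot(coef)

df_resid = n - p                                 # residual degrees of freedom
sigma2 = resid.dot(resid) / df_resid             # estimate of the error variance
std_err = np.sqrt(sigma2 * np.diag(xtx_inv))     # "std err" column
t_values = coef / std_err                        # "t" column

# No intercept in the model, so the total sum of squares is uncentered.
r2_uncentered = 1.0 - resid.dot(resid) / y_sim.dot(y_sim)

for j in range(p):
    print("x%d  coef=%8.4f  se=%6.3f  t=%8.3f"
          % (j + 1, coef[j], std_err[j], t_values[j]))
print("R-squared (uncentered): %.3f" % r2_uncentered)
```

With n = 100 and p = 7 the residual degrees of freedom are 93, matching the "Df Residuals" entry in the summary, which is consistent with the model having no intercept.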