Python OLS parameters: can a nonlinear model be estimated with OLS?

Time: 2023-05-05 20:18:52 Views: 244890 Author: 2670

I'm not sure why I get slightly different results for a simple OLS, depending on whether I run the regression in R through pandas' experimental rpy interface or in Python with statsmodels.

import pandas
from rpy2.robjects import r
from functools import partial

loadcsv = partial(pandas.DataFrame.from_csv,
                  index_col="seqn", parse_dates=False)

demoq = loadcsv("csv/DEMO.csv")
rxq = loadcsv("csv/quest/RXQ_RX.csv")

# Count prescriptions per respondent; non-numeric entries count as 0
num_rx = {}
for seqn, num in rxq.rxd295.iteritems():
    try:
        val = int(num)
    except ValueError:
        val = 0
    num_rx[seqn] = val

series = pandas.Series(num_rx, name="num_rx")
demoq = demoq.join(series)

import pandas.rpy.common as com
df = com.convert_to_r_dataframe(demoq)
r.assign("demoq", df)
r('lmout <- lm(demoq$num_rx ~ demoq$ridageyr)')
r('print(summary(lmout))')  # print from R

From R, I get the following summary:

Call:
lm(formula = demoq$num_rx ~ demoq$ridageyr)

Residuals:
    Min      1Q  Median      3Q     Max
-2.9086 -0.6908 -0.2940  0.1358 15.7003

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)    -0.1358216  0.0241399  -5.626 1.89e-08 ***
demoq$ridageyr  0.0358161  0.0006232  57.469  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.545 on 9963 degrees of freedom
Multiple R-squared: 0.249, Adjusted R-squared: 0.2489
F-statistic: 3303 on 1 and 9963 DF, p-value: < 2.2e-16

Using statsmodels.api to run the OLS:

import statsmodels.api as sm
results = sm.OLS(demoq.num_rx, demoq.ridageyr).fit()
results.summary()

The results are similar to R's output, but not the same:

OLS Regression Results
Adj. R-squared: 0.247
Log-Likelihood: -18488.
No. Observations: 9965    AIC: 3.698e+04
Df Residuals: 9964        BIC: 3.698e+04

              coef   std err        t    P>|t|   [95.0% Conf. Int.]
ridageyr    0.0331     0.000   82.787    0.000      0.032     0.034

The setup is a bit cumbersome to install. However, there is an IPython notebook here that reproduces the inconsistency.
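One plausible reading of the two outputs: the R summary lists an `(Intercept)` row and 9963 residual degrees of freedom, while the statsmodels summary lists only `ridageyr` and 9964. That pattern is what you would expect if the R fit includes an intercept and the statsmodels fit does not. R's `lm` formula adds an intercept by default. `sm.OLS` uses the design matrix exactly as given, and `sm.add_constant` is the usual way to supply the constant column. The sketch below does not use the survey data. It uses made-up synthetic numbers and two hypothetical helper functions, written in plain Python, only to illustrate that the slope of a line forced through the origin generally differs from the slope of a line fitted with an intercept.

```python
import random


def slope_through_origin(x, y):
    """OLS slope for the model y = b*x (no intercept term)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def intercept_and_slope(x, y):
    """OLS intercept a and slope b for the model y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Synthetic data (an assumption for illustration, not the survey data):
# a small negative intercept plus Gaussian noise.
rng = random.Random(0)
x = [rng.uniform(0, 85) for _ in range(1000)]
y = [-0.14 + 0.036 * xi + rng.gauss(0, 1.5) for xi in x]

b_no_const = slope_through_origin(x, y)
a_const, b_const = intercept_and_slope(x, y)

print("slope without intercept:", b_no_const)
print("intercept and slope with intercept:", a_const, b_const)
```

If that is the cause here, passing `sm.add_constant(demoq.ridageyr)` as the second argument to `sm.OLS` should bring the statsmodels coefficients in line with R's. I have not checked this against the notebook.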
