
pandas: what is the difference between cut and qcut?

Date: 2023-05-04 09:32:35  Author: 4140

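The documentation says:

"Continuous values can be discretized using the cut (bins based on values) and qcut (bins based on sample quantiles) functions"

That sounds very abstract to me. I can see the differences in the example below, but what does qcut (sample quantiles) actually mean? When would you use qcut rather than cut?

Thanks.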

import numpy as np
import pandas as pd

factors = np.random.randn(30)

In [11]:

pd.cut(factors, 5)

Out[11]:

[(-0.411, 0.575], (-0.411, 0.575], (-0.411, 0.575], (-0.411, 0.575], (0.575, 1.561], ..., (-0.411, 0.575], (-1.397, -0.411], (0.575, 1.561], (-2.388, -1.397], (-0.411, 0.575]]

Length: 30

Categories (5, object): [(-2.388, -1.397] < (-1.397, -0.411] < (-0.411, 0.575] < (0.575, 1.561] < (1.561, 2.547]]

In [14]:

pd.qcut(factors, 5)

Out[14]:

[(-0.348, 0.0899], (-0.348, 0.0899], (0.0899, 1.19], (0.0899, 1.19], (0.0899, 1.19], ..., (0.0899, 1.19], (-1.137, -0.348], (1.19, 2.547], [-2.383, -1.137], (-0.348, 0.0899]]

Length: 30

Categories (5, object): [[-2.383, -1.137] < (-1.137, -0.348] < (-0.348, 0.0899] < (0.0899, 1.19] < (1.19, 2.547]]
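
One way to make the difference concrete is to count how many values land in each bin. The sketch below is only an illustration, not part of the original session: the seed and the variable names (`rng`, `width_bins`, `quantile_bins`, `cut_counts`, `qcut_counts`) are assumptions chosen here, and the printed intervals will differ from the ones above because the random sample is different.

```python
import numpy as np
import pandas as pd

# Fixed seed (an arbitrary choice) so the sketch is reproducible.
rng = np.random.default_rng(0)
factors = rng.standard_normal(30)

# cut: split the range min..max into 5 intervals of (nearly) equal width.
# The number of values per interval depends on where the data happens to fall.
width_bins = pd.cut(factors, 5)

# qcut: choose the interval edges at the 0/20/40/60/80/100% sample quantiles,
# so every interval holds (roughly) the same number of values.
quantile_bins, edges = pd.qcut(factors, 5, retbins=True)

cut_counts = pd.Series(width_bins).value_counts(sort=False)
qcut_counts = pd.Series(quantile_bins).value_counts(sort=False)

print(cut_counts)   # uneven counts, equal-width intervals
print(qcut_counts)  # equal counts, intervals of varying width
print(edges)        # the quantile values qcut used as bin edges
```

Read this way, "bins based on sample quantiles" means the edges are computed from the data so that each bin gets an equal share of the observations. As a rule of thumb, cut fits when the interval boundaries themselves matter (equal widths or explicit edges), and qcut fits when you want groups of similar size, such as quintiles.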
