import pandas as pd
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
label_ds = pd.read_csv("D:/intern/bll_beijing.csv")
array = label_ds.values
label_X = array[:,1:]
label_y = array[:,0]
test = SelectKBest(score_func=chi2, k=4)
fit = test.fit(label_X, label_y)
I got this:
Traceback (most recent call last):
    fit = test.fit(label_X, label_y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\feature_selection\univariate_selection.py", line 349, in fit
    score_func_ret = self.score_func(X, y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\feature_selection\univariate_selection.py", line 217, in chi2
    Y = LabelBinarizer().fit_transform(y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\preprocessing\label.py", line 307, in fit_transform
    return self.fit(y).transform(y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\preprocessing\label.py", line 284, in fit
    self.classes_ = unique_labels(y)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\utils\multiclass.py", line 97, in unique_labels
    raise ValueError("Unknown label type: %s" % repr(ys))
ValueError: Unknown label type: (array([0.55, 0.84, 0.72, 0.54, 0.59, 0.77, 0.85, 1.03, 1.62, 3.04, 3.6 ]),)
[Finished in 3.4s]
[0.55, 0.84, 0.72, 0.54, 0.59, 0.77, 0.85, 1.03, 1.62, 3.04, 3.6] is the first column of the CSV file.
What is wrong with it?
Posted on 2018-03-01 13:28:58
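Your label_y contains continuous values, but you have specified chi2 as the scoring function. As the traceback shows, chi2 passes y through LabelBinarizer, which rejects float targets as an unknown label type.
According to the documentation of chi2, it is only valid for classification tasks:
"Compute chi-squared stats between each non-negative feature and class."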
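One way to confirm this diagnosis before fitting is to ask scikit-learn how it categorizes the target. The sketch below uses sklearn.utils.multiclass.type_of_target on the values reported in the question's first column; the integer array class_y is a made-up example added only for contrast.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Values of the first CSV column, as reported in the question.
label_y = np.array([0.55, 0.84, 0.72, 0.54, 0.59, 0.77,
                    0.85, 1.03, 1.62, 3.04, 3.6])

# Hypothetical class labels (not from the question), for comparison.
class_y = np.array([0, 1, 1, 0, 2, 1, 0, 2, 2, 1, 0])

# Floats with non-integer values are treated as a regression target.
target_kind = type_of_target(label_y)
# Small sets of integer labels are treated as a classification target.
class_kind = type_of_target(class_y)

print(target_kind)
print(class_kind)
```

A target reported as continuous is a regression target, so classification-only scorers such as chi2 will not accept it.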
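For regression tasks, use a scoring function meant for continuous targets instead, such as scikit-learn's f_regression or mutual_info_regression.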
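A minimal sketch of swapping in a regression scorer, assuming f_regression from sklearn.feature_selection suits the data. The CSV from the question is not available, so the arrays here are synthetic stand-ins: 11 rows to match the question, with a hypothetical six feature columns and a made-up continuous target.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic stand-in for the question's CSV (assumption: 6 feature columns).
rng = np.random.RandomState(0)
X = rng.rand(11, 6)

# Made-up continuous target that depends on columns 0 and 3 plus noise.
y = 2.0 * X[:, 0] - X[:, 3] + 0.05 * rng.rand(11)

# f_regression scores each feature with a univariate linear F-test,
# so a continuous y is accepted. mutual_info_regression could be
# passed as score_func in the same way.
selector = SelectKBest(score_func=f_regression, k=4)
X_selected = selector.fit_transform(X, y)

# Column indices of the 4 features that were kept.
selected_idx = selector.get_support(indices=True)

print(X_selected.shape)
print(selected_idx)
```

With the real data, label_X and label_y from the question would take the place of X and y, provided they are numeric arrays.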
https://stackoverflow.com/questions/49025207