----------+
|[3.0,-4.0]|(4,[1,3],[4.0,3.0])|[0.6,-0.8]|
+----------+-------------------+----------+
OneHotEncoderEstimator... usage example:
from pyspark.ml.feature import OneHotEncoderEstimator
from pyspark.ml.linalg import Vectors
df...= spark.createDataFrame([(0.0, ), (1.0, ), (2.0, )], ["input"])
ohe = OneHotEncoderEstimator(inputCols...(Vectors.dense([0.6, -1.1, -3.0, 4.5, 3.3]), )],
["features"])
vs...= VectorSlicer(inputCol="features", outputCol="sliced", indices=[1, 4])
vs.transform(df).show(truncate
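
As a rough illustration of what the `VectorSlicer` call above selects, the sketch below does the same index-based slicing in plain Python, without Spark. `slice_vector` is a hypothetical helper introduced only for this example and is not part of `pyspark.ml`. It assumes `indices=[1, 4]` means "keep the entries at positions 1 and 4, in that order".

```python
# Plain-Python illustration of index-based vector slicing (no Spark needed).
# slice_vector is a hypothetical helper, not a pyspark.ml function.

def slice_vector(features, indices):
    """Return the entries of `features` at the given positions, in order."""
    return [features[i] for i in indices]

# Same feature vector and indices as in the VectorSlicer snippet.
features = [0.6, -1.1, -3.0, 4.5, 3.3]
sliced = slice_vector(features, [1, 4])
print(sliced)  # [-1.1, 3.3]
```

With the input vector from the snippet, the `sliced` column would therefore be expected to hold the two values at positions 1 and 4.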