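I am ingesting a value that is usually an int but can also be None or inf, and I am using it to create a Spark DataFrame. When I tried to declare the column as LongType, PySpark complained, because inf is a float: File "/opt/spark/python/lib/pyspark.zip/pyspark/worker.py", line 177, in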
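One possible workaround, sketched here under the assumption that the column may be declared as a nullable DoubleType rather than LongType: coerce every value to a float (leaving None alone) before building the DataFrame. PySpark's schema verification accepts only Python floats for DoubleType, so plain ints need converting too. The helper name `to_nullable_double` and the sample values are hypothetical and are not taken from the original post.

```python
import math


def to_nullable_double(value):
    """Return None unchanged; convert ints and floats (including inf) to float."""
    if value is None:
        return None
    return float(value)


# Hypothetical sample input: mostly ints, with a None and an inf mixed in.
raw_values = [1, None, math.inf, 42]

# One single-field tuple per value, so each tuple maps to one DataFrame row.
rows = [(to_nullable_double(v),) for v in raw_values]
print(rows)
```

The resulting `rows` could then be passed to `createDataFrame` together with a schema whose single field is a nullable DoubleType. If the column has to stay a LongType, inf would need to be replaced by some other marker such as None, which loses information.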
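I am trying to return a specific struct from a pandas_udf. It works on one cluster but fails on another. I am trying to run the udf over groups, which requires the return type to be a DataFrame.
from pyspark.sql.functions import pandas_udf
import numpy as np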
from pyspark.sql.types\python\pyspark\sql\types.py in to_arrow_
from pyspark.sql import SparkSession
from pyspark.sql.types import *
rdd = sc.sparkContext.parallelize([1.1,2.3,3,4,5,6,7,8,9,10])
print(rdd.collect: StructType can not accept object 1.1 i
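The "StructType can not accept object 1.1" message typically appears when bare scalars are handed to `createDataFrame` together with a StructType schema, because each record is expected to be a row-like object (tuple, list, dict, or Row). A minimal sketch of the usual fix, assuming that is what happens in the truncated part of the snippet: wrap each value in a one-element tuple. The variable names below are illustrative only.

```python
# Same values as in the snippet above; note the mix of floats and ints.
values = [1.1, 2.3, 3, 4, 5, 6, 7, 8, 9, 10]

# Wrap each scalar in a one-element tuple so that every record looks like a row
# with a single field. Casting to float keeps the column type uniform.
rows = [(float(v),) for v in values]
print(rows[:3])

# On the RDD itself, the equivalent transformation would be:
#   rdd.map(lambda v: (float(v),))
```

After this step every element matches a schema with a single floating-point field, which is the shape a StructType expects.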