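From this question: How to group data and construct a new column - python pandas?, I learned how to group by multiple columns in pandas and construct a new unique id. How can I implement the same functionality with Apache Beam in Python, and then write the new data to newline-delimited JSON? I am new to Apache Beam; this is what I have so far:
import pandas
import apache_beam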
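A minimal sketch of one way to approach this in Beam, not a definitive answer. It assumes each row is a dictionary and that a hash of the key columns is an acceptable unique id (a pandas-style sequential group number would need a different approach). The names `group_id`, `add_group_id`, `run`, the `group_id` column, and the output prefix are hypothetical and do not come from the question.

```python
import hashlib
import json


def group_id(key_values):
    """Derive a stable id from the group's key values.

    Assumption: a 12-character hash prefix is unique enough for the data.
    """
    payload = json.dumps(list(key_values), default=str).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()[:12]


def add_group_id(row, key_columns, id_column="group_id"):
    """Return a copy of the row with an id shared by all rows in its group."""
    out = dict(row)
    out[id_column] = group_id(row[c] for c in key_columns)
    return out


def run(rows, key_columns, output_prefix):
    """Tag every row with its group id and write newline-delimited JSON."""
    # Imported here so the helpers above can be used without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as p:
        (
            p
            | "Create" >> beam.Create(rows)
            | "AddGroupId" >> beam.Map(add_group_id, key_columns)
            | "ToJson" >> beam.Map(lambda row: json.dumps(row))
            # WriteToText emits one element per line, i.e. one JSON object per line.
            | "Write" >> beam.io.WriteToText(output_prefix, file_name_suffix=".json")
        )
```

Calling `run(rows, ["a", "b"], "output/rows")` (hypothetical column names and path) would write sharded text files with one JSON object per line. Because the id depends only on the row's own key values, this sketch does not need a `GroupByKey` step.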
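I have just started with the getting-started guide in the Apache documentation, and it looks like this particular pipeline import is no longer available.
from apache_beam.options.pipeline_options import PipelineOptions
Error:
ImportError Traceback (most recent call last) in () -> 1 from apache_beam.options.pipeline_options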
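The traceback above is cut off, so the actual cause is not visible. As a hedged diagnostic sketch, the helpers below report which version of a distribution is installed and whether a dotted module path can be found at all. `installed_version` and `module_available` are hypothetical helper names; they rely only on the standard library (`importlib.metadata` requires Python 3.8 or later).

```python
import importlib.util
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def module_available(dotted_name):
    """Return True if the module path can be located in this environment."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False
```

For example, `installed_version("apache-beam")` and `module_available("apache_beam.options.pipeline_options")` show whether the interpreter running the notebook sees the same installation as the one the guide assumes.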
job_name is preprocess-ga360-190523-130005
Module versions are apache-beam 2.5.0, google-cloud-dataflow
(flags=[], **options)
with beam.Pipeline(options=opts) as
outfile = "gs://s
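The fragments above suggest a job name with a timestamp suffix and options passed as `(flags=[], **options)`. A rough sketch under those assumptions follows; `make_job_name` and `make_pipeline_options` are hypothetical helpers, and the `%y%m%d-%H%M%S` format is inferred from the job name shown rather than taken from the original code.

```python
from datetime import datetime


def make_job_name(prefix, now=None):
    """Build a job name such as 'prefix-190523-130005' from a timestamp."""
    now = now or datetime.now()
    return "{}-{}".format(prefix, now.strftime("%y%m%d-%H%M%S"))


def make_pipeline_options(options):
    """Turn a plain dict of option values into Beam pipeline options."""
    # Imported lazily so make_job_name works without Beam installed.
    from apache_beam.options.pipeline_options import PipelineOptions

    # flags=[] keeps Beam from parsing sys.argv (useful inside notebooks).
    return PipelineOptions(flags=[], **options)
```

A caller would pass something like `{"job_name": make_job_name("preprocess-ga360"), ...}` together with whatever project and location settings the job needs; those values are not shown in the source and are left out here.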
(print)
~/PROJECTS/Apache_Beam/env/lib/python3.8/site-packages/apache_beam/pvalue.py in __or__(self, ptransform
/env/lib/python3.8/site-packages/apache_beam/pipel
/venv/lib/python3.7/site-packages/apache_beam/transforms/core.py", line 1415, in
wrapper = lambda x:
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/