I am trying to select a random subset of a pd.DataFrame and set a value in one column for those rows. Here is a toy example:
import pandas as pd

df = pd.DataFrame({
    'species': ['platypus', 'monkey', 'possum'],
    'name': ['mike', 'paul', 'doug'],
    'group': ['control', 'control', 'control']
})

    species  name    group
0  platypus  mike  control
1    monkey  paul  control
2    possum  doug  control
I tried the following to randomly assign two individuals to the experimental group, but it does not work:
df.sample(2)['group'] = 'experimental'
In fact, this does not work either:
df.iloc[[0, 1]]['group'] = 'experimental'
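A likely reason neither attempt changes df: both df.sample(2) and df.iloc[[0, 1]] return a new DataFrame, so the assignment lands on that temporary object rather than on df. The sketch below makes this visible by keeping a reference to the sampled frame. The variable name sampled and the random_state value are illustrative additions, not part of the question.

```python
import pandas as pd

df = pd.DataFrame({
    'species': ['platypus', 'monkey', 'possum'],
    'name': ['mike', 'paul', 'doug'],
    'group': ['control', 'control', 'control']
})

# df.sample returns a new DataFrame; keep a reference so it can be inspected.
sampled = df.sample(2, random_state=0)  # random_state is illustrative only
sampled['group'] = 'experimental'

# The assignment changed the sampled copy ...
print(sampled['group'].tolist())
# ... while the original frame still holds 'control' in every row.
print(df['group'].tolist())
```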
Posted on 2022-04-29 06:40:49
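You can use df.sample(2).index to get the index labels of the randomly sampled rows in df, then pass them to .loc to set the group column for those rows to 'experimental', like this: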
df.loc[df.sample(2).index, 'group'] = 'experimental'
Output (one possible result, since the rows are sampled at random):
    species  name         group
0  platypus  mike  experimental
1    monkey  paul  experimental
2    possum  doug       control
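For a repeatable draw, the same idea can be sketched with a fixed seed. The random_state value and the variable name chosen are assumptions added for illustration. The sketch also assumes the index labels are unique, because .loc selects by label and would update every row that shares a sampled label.

```python
import pandas as pd

df = pd.DataFrame({
    'species': ['platypus', 'monkey', 'possum'],
    'name': ['mike', 'paul', 'doug'],
    'group': ['control', 'control', 'control']
})

# Sample two rows and keep only their index labels.
# random_state fixes the draw so reruns pick the same rows (illustrative).
chosen = df.sample(n=2, random_state=42).index

# A single .loc call selects rows and column together, so it writes to df itself.
df.loc[chosen, 'group'] = 'experimental'

print(df)
```

Dropping random_state gives a different pair of rows on each run.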
Posted on 2022-04-29 06:53:01
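Here is an approach that assigns the value at randomly chosen index labels, a random number of times.

import pandas as pd
import random

def custom_randomizer(df, col):
    labels = list(df.index)
    # Number of assignments to make: between 1 and len(df).
    total_randoms = random.randint(1, len(labels))
    for _ in range(total_randoms):
        # The same row can be picked more than once, so fewer than
        # total_randoms distinct rows may end up changed.
        df.loc[random.choice(labels), col] = 'experimental'
    return df

df = pd.DataFrame({
    'species': ['platypus', 'monkey', 'possum'],
    'name': ['mike', 'paul', 'doug'],
    'group': ['control', 'control', 'control']
})
df = custom_randomizer(df, 'group')
print(df)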
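If the goal is a random number of distinct rows, one possible variation is to draw the labels without replacement using random.sample. This is only a sketch: the helper name assign_random_subset and its parameters are hypothetical and do not come from the answer above.

```python
import random

import pandas as pd


def assign_random_subset(df, col, value):
    """Hypothetical helper: set `col` to `value` on a random number of distinct rows."""
    # Choose how many rows to change: at least 1, at most all of them.
    k = random.randint(1, len(df))
    # random.sample draws without replacement, so no label repeats.
    labels = random.sample(list(df.index), k)
    df.loc[labels, col] = value
    return labels


df = pd.DataFrame({
    'species': ['platypus', 'monkey', 'possum'],
    'name': ['mike', 'paul', 'doug'],
    'group': ['control', 'control', 'control']
})

changed = assign_random_subset(df, 'group', 'experimental')
print(changed)
print(df)
```

Returning the chosen labels makes it easy to check afterwards which rows were reassigned.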
Posted on 2022-04-29 07:24:03
df['group'].iloc[[0, 1]] = 'experimental'
Output:
    species  name         group
0  platypus  mike  experimental
1    monkey  paul  experimental
2    possum  doug       control
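The line above chains two indexing steps (df['group'], then .iloc), which pandas calls chained assignment. Depending on the pandas version it may emit a warning, and with Copy-on-Write enabled it does not update df. A single-step positional alternative can be sketched as follows. The variable name col_pos is illustrative, and the sketch assumes 'group' is a unique column name so that get_loc returns one integer.

```python
import pandas as pd

df = pd.DataFrame({
    'species': ['platypus', 'monkey', 'possum'],
    'name': ['mike', 'paul', 'doug'],
    'group': ['control', 'control', 'control']
})

# .iloc is purely positional, so translate the column name to its position.
col_pos = df.columns.get_loc('group')

# One indexing call for rows and column: the assignment targets df directly.
df.iloc[[0, 1], col_pos] = 'experimental'

print(df)
```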
https://stackoverflow.com/questions/72059569