Here is an example:
a = ['Ibrutinib', 'Ibrutinib', 'Ibrutinib',
'Ibrutinib-containing product', 'Ibrutinib 140 MG',
'Ibrutinib Oral Product',
'Ibrutinib-containing product in oral dose form', 'Ibrutinib Pill',
'Ibrutinib Oral Capsule', 'Ibrutinib 140 MG Oral Capsule',
'Ibrutinib 140 MG [Imbruvica]',
'Ibrutinib Oral Capsule [Imbruvica]',
'Ibrutinib 140 MG Oral Capsule [Imbruvica]']
pd.Series(a).value_counts()
Output:
Ibrutinib 3
Ibrutinib-containing product in oral dose form 1
Ibrutinib Pill 1
Ibrutinib Oral Product 1
Ibrutinib 140 MG Oral Capsule [Imbruvica] 1
Ibrutinib 140 MG Oral Capsule 1
Ibrutinib Oral Capsule 1
Ibrutinib-containing product 1
Ibrutinib 140 MG [Imbruvica] 1
Ibrutinib 140 MG 1
Ibrutinib Oral Capsule [Imbruvica] 1
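dtype: int64
I would like 'Ibrutinib 140 MG' to appear in the third position, because it comes earlier in the original series.
Posted on 2020-08-13 06:37:03
To keep the order of the original list, convert the counts to a DataFrame, then add a rank column and sort by it.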
import pandas as pd
a = ['Ibrutinib', 'Ibrutinib', 'Ibrutinib',
'Ibrutinib-containing product', 'Ibrutinib 140 MG',
'Ibrutinib Oral Product',
'Ibrutinib-containing product in oral dose form', 'Ibrutinib Pill',
'Ibrutinib Oral Capsule', 'Ibrutinib 140 MG Oral Capsule',
'Ibrutinib 140 MG [Imbruvica]',
'Ibrutinib Oral Capsule [Imbruvica]',
'Ibrutinib 140 MG Oral Capsule [Imbruvica]']
s = pd.Series(a).value_counts()
df = s.rename_axis('value').reset_index(name='count') # convert to dataframe
df["rank"] = df['value'].apply(lambda x : a.index(x)) # create rank column, ranked by list index
dfsrt = df.sort_values(by='rank') # sort by rank
print(dfsrt[['value','count']].to_string(index=False, justify='left', # display value and count
    formatters={'value':'{{:<{}s}}'.format(dfsrt['value'].str.len().max()).format}))
Output:
value count
Ibrutinib 3
Ibrutinib-containing product 1
Ibrutinib 140 MG 1
Ibrutinib Oral Product 1
Ibrutinib-containing product in oral dose form 1
Ibrutinib Pill 1
Ibrutinib Oral Capsule 1
Ibrutinib 140 MG Oral Capsule 1
Ibrutinib 140 MG [Imbruvica] 1
Ibrutinib Oral Capsule [Imbruvica] 1
Ibrutinib 140 MG Oral Capsule [Imbruvica] 1
Posted on 2020-08-13 05:57:00
Try this:
df = pd.DataFrame(a)
df = df.groupby(0, sort=False).size()\
    .sort_values(ascending=False, kind='mergesort')
By default, value_counts sorts with quicksort, which is not a stable sort, so the order of tied values is not guaranteed.
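The tie order can also be made explicit instead of being left to the sort algorithm. The following is a minimal sketch, assuming a shortened copy of the list from the question; the `first_pos` column and the `result` and `ordered` names are illustrative and do not come from either answer.

```python
import pandas as pd

# Shortened sample taken from the question's list (assumption: six entries are enough to show the idea)
a = ['Ibrutinib', 'Ibrutinib', 'Ibrutinib',
     'Ibrutinib-containing product', 'Ibrutinib 140 MG',
     'Ibrutinib Oral Product']

s = pd.Series(a)

# groupby(sort=False) keeps the groups in order of first appearance
counts = s.groupby(s, sort=False).size()

# Record that order in a hypothetical helper column so ties are broken explicitly
df = pd.DataFrame({'value': counts.index, 'count': counts.to_numpy()})
df['first_pos'] = range(len(df))

# Highest count first; equal counts fall back to first appearance
result = df.sort_values(['count', 'first_pos'], ascending=[False, True])
ordered = result['value'].tolist()

print(result[['value', 'count']].to_string(index=False))
```

Because the sort key includes the position of first appearance, the result does not depend on which `kind` of sort pandas uses internally.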
https://stackoverflow.com/questions/63384321