首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >问答首页 >可以根据单元格的前五个字符匹配数据帧的行吗?

可以根据单元格的前五个字符匹配数据帧的行吗?
EN

Stack Overflow用户
提问于 2020-10-13 06:23:51
回答 1查看 51关注 0票数 0

如果有人有这个数据帧:

代码语言:javascript
运行
复制
    0           1           2
0   RC100_1     RC100_1     RC100_1
1   RC101_1     RC101_1     RC101_1
2   RC101_3     RC102_1     RC102_1
3   RC102_1     RC103_3     RC103_3
4   RC102_3     RC104_1     RC104_1
5   RC103_3     RC109_2     RC109_2
6   RC104_1     RC111_1     RC111_1
7   RC109_2     RC114_2     RC114_2
8   RC111_1     RC115_2     RC115_2
9   RC114_1     RC116_1     RC116_1
10  RC115_4     RC116_2     RC116_2
11  RC116_1     RC117_2     RC117_2
12  RC117_4     RC117_4     RC117_4
13  RC117_4     RC118_2     RC118_2
14  RC118_1     RC119_2     RC119_2
15  RC119_4     RC120_2     RC120_2
16  RC120_4     RC121_2     RC121_2
17  RC121_4     RC122_2     RC122_2
18  RC122_4     RC125_2     RC125_2
19  RC125_2     RC126_3     RC126_3
20  RC126_3     RC129_2     RC129_2
21  RC129_4     RC12_24     RC12_24

是否可以将其转换为每行只包含前五个字符匹配的内容,否则将被排序?我的意思是不需要手动更换每个单元格。以下是我想要将其转换为

代码语言:javascript
运行
复制
    0           1           2
0   RC100_1     RC100_1     RC100_1
1   RC101_1     RC101_1     RC101_1
2   RC101_3     NaN         NaN
3   RC102_1     RC102_1     RC102_1
4   RC102_3     NaN         NaN
5   RC103_3     RC103_3     RC103_3
6   RC104_1     RC104_1     RC104_1
7   RC109_2     RC109_2     RC109_2
8   RC111_1     RC111_1     RC111_1
9   RC114_1     RC114_2     RC114_2
10  RC115_4     RC115_2     RC115_2
11  RC116_1     RC116_1     RC116_1
12  NaN         RC116_2     RC116_2
13  RC117_4     RC117_2     RC117_2
14  RC117_4     RC117_4     RC117_4
15  RC118_1     RC118_2     RC118_2
16  RC119_4     RC119_2     RC119_2
17  RC120_4     RC120_2     RC120_2
18  RC121_4     RC121_2     RC121_2
19  RC122_4     RC122_2     RC122_2
20  RC125_2     RC125_2     RC125_2
21  RC126_3     RC126_3     RC126_3
22  RC129_4     RC129_2     RC129_2
23  NaN         RC12_24     RC12_24
EN

回答 1

Stack Overflow用户

发布于 2020-10-13 07:19:57

只需列出所有文件的名称,并使用它来索引和对齐所有的文件名。以下是您发布的数据的工作示例。

代码语言:javascript
运行
复制
import pandas as pd
import numpy as np

# in your case do something like names_idx = excel_names + csv_names + txt_names
names_idx = ['RC100_1', 'RC100_1', 'RC100_1',
             'RC101_1', 'RC101_1', 'RC101_1',
             'RC101_3', 'RC102_1', 'RC102_1',
             'RC102_1', 'RC103_3', 'RC103_3',
             'RC102_3', 'RC104_1', 'RC104_1',
             'RC103_3', 'RC109_2', 'RC109_2',
             'RC104_1', 'RC111_1', 'RC111_1',
             'RC109_2', 'RC114_2', 'RC114_2',
             'RC111_1', 'RC115_2', 'RC115_2',
             'RC114_1', 'RC116_1', 'RC116_1',
             'RC115_4', 'RC116_2', 'RC116_2',
             'RC116_1', 'RC117_2', 'RC117_2',
             'RC117_4', 'RC117_4', 'RC117_4',
             'RC117_4', 'RC118_2', 'RC118_2',
             'RC118_1', 'RC119_2', 'RC119_2',
             'RC119_4', 'RC120_2', 'RC120_2',
             'RC120_4', 'RC121_2', 'RC121_2',
             'RC121_4', 'RC122_2', 'RC122_2',
             'RC122_4', 'RC125_2', 'RC125_2',
             'RC125_2', 'RC126_3', 'RC126_3',
             'RC126_3', 'RC129_2', 'RC129_2',
             'RC129_4', 'RC12_24', 'RC12_24']

# This is not needed if you already have them separately from globbing earlier
csv_names = names_idx[::3]
excel_names = names_idx[1::3]
txt_names = names_idx[2::3]

# remove duplicates
names_idx = set(names_idx)

# create an empty dataframe with index as unique file names
df = pd.DataFrame(index=names_idx)

# create empty columns and position the file names 
df['csv'] = np.nan
df.csv[csv_names] = csv_names

df['excel'] = np.nan
df.excel[excel_names] = excel_names

df['txt'] = np.nan
df.txt[txt_names] = txt_names

print(df)

输出

代码语言:javascript
运行
复制
             csv    excel      txt
RC111_1  RC111_1  RC111_1  RC111_1
RC125_2  RC125_2  RC125_2  RC125_2
RC114_2      NaN  RC114_2  RC114_2
RC116_2      NaN  RC116_2  RC116_2
RC129_4  RC129_4      NaN      NaN
RC118_1  RC118_1      NaN      NaN
RC12_24      NaN  RC12_24  RC12_24
RC121_4  RC121_4      NaN      NaN
RC126_3  RC126_3  RC126_3  RC126_3
RC129_2      NaN  RC129_2  RC129_2
RC102_1  RC102_1  RC102_1  RC102_1
RC101_1  RC101_1  RC101_1  RC101_1
RC109_2  RC109_2  RC109_2  RC109_2
RC119_4  RC119_4      NaN      NaN
RC100_1  RC100_1  RC100_1  RC100_1
RC120_2      NaN  RC120_2  RC120_2
RC122_2      NaN  RC122_2  RC122_2
RC121_2      NaN  RC121_2  RC121_2
RC117_4  RC117_4  RC117_4  RC117_4
RC118_2      NaN  RC118_2  RC118_2
RC103_3  RC103_3  RC103_3  RC103_3
RC117_2      NaN  RC117_2  RC117_2
RC102_3  RC102_3      NaN      NaN
RC119_2      NaN  RC119_2  RC119_2
RC114_1  RC114_1      NaN      NaN
RC116_1  RC116_1  RC116_1  RC116_1
RC101_3  RC101_3      NaN      NaN
RC115_2      NaN  RC115_2  RC115_2
RC122_4  RC122_4      NaN      NaN
RC104_1  RC104_1  RC104_1  RC104_1
RC115_4  RC115_4      NaN      NaN
RC120_4  RC120_4      NaN      NaN
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/64326084

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档