merge()
函数可以让DataFrame对象具有标准数据库操作:
pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None,
left_index=False, right_index=False, sort=True)
import pandas as pd
left = pd.DataFrame({
'id':[1,2,3,4,5],
'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
'subject_id':['sub1','sub2','sub4','sub6','sub5']})
right = pd.DataFrame(
{'id':[1,2,3,4,5],
'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
'subject_id':['sub2','sub4','sub3','sub6','sub5']})
print(left)
print("---------")
print(right)
'''
id Name subject_id
0 1 Alex sub1
1 2 Amy sub2
2 3 Allen sub4
3 4 Alice sub6
4 5 Ayoung sub5
---------
id Name subject_id
0 1 Billy sub2
1 2 Brian sub4
2 3 Bran sub3
3 4 Bryce sub6
4 5 Betty sub5
'''
print(pd.merge(left,right,on="id"))
'''
id Name_x subject_id_x Name_y subject_id_y
0 1 Alex sub1 Billy sub2
1 2 Amy sub2 Brian sub4
2 3 Allen sub4 Bran sub3
3 4 Alice sub6 Bryce sub6
4 5 Ayoung sub5 Betty sub5
'''
print(pd.merge(left,right,on=["id", "subject_id"]))
'''
id Name_x subject_id Name_y
0 4 Alice sub6 Bryce
1 5 Ayoung sub5 Betty
'''
使用how
参数,指定连接方式,如果组合键没有出现在左侧或者右侧表中,连接表值为NA:
合并方法 | SQL等效 | 描述 |
---|---|---|
left | LEFT OUTER JOIN | 使用左侧对象的键 |
right | RIGHT OUTER JOIN | 使用右侧对象的键 |
outer | FULL OUTER JOIN | 使用键的联合 |
inner | INNER JOIN | 使用键的交集 |
left join:
print(pd.merge(left,right,on=["subject_id"],how="left"))
'''
id_x Name_x subject_id id_y Name_y
0 1 Alex sub1 NaN NaN
1 2 Amy sub2 1.0 Billy
2 3 Allen sub4 2.0 Brian
3 4 Alice sub6 4.0 Bryce
4 5 Ayoung sub5 5.0 Betty
'''