Python merge on multiple conditions: [Joins] Using merge in pandas


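In data analysis, "join" is a term that comes up constantly. In Python, the pandas merge function is one of the main ways to perform a join.

merge combines two datasets, a left one and a right one, by matching rows on one or more key columns.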

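1. By default, merge joins on the columns whose names appear in both DataFrames and keeps only the rows whose key is present on both sides (an inner join).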

import pandas as pd

df1 = pd.DataFrame({"name": ["kate", "herz", "catherine", "sally"],
                    "age": [25, 28, 39, 35]})
df2 = pd.DataFrame({"name": ["kate", "herz", "sally"],
                    "score": [70, 60, 90]})
# No key is given, so merge joins on the shared column "name"
pd.merge(df1, df2)
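
To make the default behavior explicit, the call above should be equivalent to passing on="name" and how="inner". The following is a minimal sketch that reuses the same two DataFrames; the variable names default_result and explicit_result are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["kate", "herz", "catherine", "sally"],
                    "age": [25, 28, 39, 35]})
df2 = pd.DataFrame({"name": ["kate", "herz", "sally"],
                    "score": [70, 60, 90]})

# Default call: pandas uses the shared column ("name") as the key
default_result = pd.merge(df1, df2)

# The same join with the key and the join type spelled out
explicit_result = pd.merge(df1, df2, on="name", how="inner")

# "catherine" is absent from the result because she has no row in df2
print(default_result)
print(default_result.equals(explicit_result))
```

Spelling out on and how is usually worth the extra typing, since it keeps the join stable if a second shared column is added to either DataFrame later.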


2. When the key columns have different names in the left and right DataFrames, use left_on and right_on.

df1 = pd.DataFrame({"name": ["kate", "herz", "catherine", "sally"],
                    "age": [25, 28, 39, 35]})
df2 = pd.DataFrame({"call_name": ["kate", "herz", "sally"],
                    "score": [70, 60, 90]})
# Match df1["name"] against df2["call_name"]
pd.merge(df1, df2, left_on="name", right_on="call_name")
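
Note that with left_on and right_on, both key columns are carried into the result. A short check, reusing the DataFrames above (the name merged is only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["kate", "herz", "catherine", "sally"],
                    "age": [25, 28, 39, 35]})
df2 = pd.DataFrame({"call_name": ["kate", "herz", "sally"],
                    "score": [70, 60, 90]})

merged = pd.merge(df1, df2, left_on="name", right_on="call_name")

# Both key columns survive the merge
print(merged.columns.tolist())

# For an inner join the two key columns hold identical values row by row
print((merged["name"] == merged["call_name"]).all())
```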


3. After merging, drop the duplicated key column.

df1 = pd.DataFrame({"name": ["kate", "herz", "catherine", "sally"],
                    "age": [25, 28, 39, 35]})
df2 = pd.DataFrame({"call_name": ["kate", "herz", "sally"],
                    "score": [70, 60, 90]})
# "name" and "call_name" hold the same values after the merge, so drop one of them
pd.merge(df1, df2, left_on="name", right_on="call_name").drop("name", axis=1)
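
Another option, sketched here on the assumption that the two key columns really mean the same thing, is to rename the right-hand key before merging so that only one key column appears in the result. This is an alternative to the drop call above, not the approach the steps above use; df2_renamed and merged are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["kate", "herz", "catherine", "sally"],
                    "age": [25, 28, 39, 35]})
df2 = pd.DataFrame({"call_name": ["kate", "herz", "sally"],
                    "score": [70, 60, 90]})

# Give the right-hand key the same name as the left-hand key
df2_renamed = df2.rename(columns={"call_name": "name"})

# With a shared key name, the result contains a single "name" column
merged = pd.merge(df1, df2_renamed, on="name")
print(merged)
```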


4. Using the how parameter

  • inner: inner join, keeps only the keys found on both sides (the intersection)
pd.merge(df1, df2, left_on="name", right_on="call_name", how="inner")

  • outer: outer join, keeps the keys from either side (the union) and fills the missing values with NaN
df3 = pd.DataFrame({"name": ["kate", "herz", "sally", "cristin"],
                    "score": [70, 60, 90, 30]})
pd.merge(df1, df3, on="name", how="outer")

  • left: left join, keeps every row from the left DataFrame and only the matching rows from the right
pd.merge(df1, df3, on="name", how="left")

  • right: right join, keeps every row from the right DataFrame and only the matching rows from the left
pd.merge(df1, df3, on="name", how="right")
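
The four join types can be compared side by side. The sketch below reuses df1 and df3 from the bullets above and counts the rows each join returns; it also passes indicator=True, a pandas merge option that adds a "_merge" column recording whether each row came from the left side, the right side, or both. The names row_counts and outer are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["kate", "herz", "catherine", "sally"],
                    "age": [25, 28, 39, 35]})
df3 = pd.DataFrame({"name": ["kate", "herz", "sally", "cristin"],
                    "score": [70, 60, 90, 30]})

# Number of rows returned by each join type
row_counts = {}
for how in ["inner", "outer", "left", "right"]:
    row_counts[how] = len(pd.merge(df1, df3, on="name", how=how))
print(row_counts)

# indicator=True labels each row as "both", "left_only" or "right_only"
outer = pd.merge(df1, df3, on="name", how="outer", indicator=True)
print(outer)
```

In the outer result, "catherine" exists only in df1, so her score is NaN, and "cristin" exists only in df3, so her age is NaN.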
