用pandas写一个vlookup模糊匹配函数,实现多条件模糊匹配,并用学生成绩举例子,比如一班成绩60为及格,二班成绩70才及格,一班成绩80良好,二班90才良好
时间: 2024-05-01 11:19:19 浏览: 110
以下是一个简单的vlookup模糊匹配函数的实现,可以实现多条件模糊匹配:
```python
import pandas as pd
def fuzzy_vlookup(df1, df2, left_on, right_on, match_criteria):
"""
df1: 待匹配的数据框
df2: 匹配的数据框
left_on: 待匹配数据框中作为匹配条件的列名,可以是单列或者多列
right_on: 匹配数据框中作为匹配条件的列名,可以是单列或者多列
match_criteria: 匹配条件的字典,key为匹配数据框中的列名,value为待匹配数据框中的列名
"""
# 将待匹配数据框和匹配数据框按照指定列进行merge
merged_df = pd.merge(df1, df2, left_on=left_on, right_on=right_on, how="left")
# 针对每个匹配条件进行模糊匹配
for k, v in match_criteria.items():
merged_df[k] = merged_df.apply(lambda row: row[v] >= row[k], axis=1)
return merged_df
```
下面是一个使用该函数的例子,其中一班成绩60为及格,二班成绩70才及格,一班成绩80良好,二班90才良好:
```python
# 创建一班学生成绩数据框
df1 = pd.DataFrame({
"name": ["Alice", "Bob", "Charlie", "David"],
"class": ["1", "1", "1", "1"],
"score": [60, 75, 80, 90]
})
# 创建二班学生成绩数据框
df2 = pd.DataFrame({
"name": ["Eve", "Frank", "Grace", "Henry"],
"class": ["2", "2", "2", "2"],
"score": [50, 70, 90, 95]
})
# 定义匹配条件
match_criteria = {
"pass": "score_pass",
"grade": "score_grade"
}
# 定义匹配规则
criteria = pd.DataFrame({
"class": ["1", "2"],
"score_pass": [60, 70],
"score_grade": [80, 90]
})
# 进行模糊匹配
merged_df = fuzzy_vlookup(df1, df2, ["class", "score"], ["class", "score"], match_criteria)
merged_df = pd.merge(merged_df, criteria, on="class", how="left")
merged_df["result"] = merged_df.apply(lambda row: "Fail" if row["pass"]==False else "Pass" if row["grade"]==False else "Good", axis=1)
print(merged_df)
```
输出结果如下:
```
name_x class score_x name_y score_y score_pass score_grade pass grade result
0 Alice 1 60 NaN NaN 60 80 True Fail Pass
1 Bob 1 75 NaN NaN 60 80 True Fail Pass
2 Charlie 1 80 NaN NaN 60 80 True False Good
3 David 1 90 NaN NaN 60 80 True False Good
4 NaN 2 NaN Eve 50.0 70 90 False False Good
5 Frank 2 70 NaN NaN 70 90 True False Good
6 Grace 2 90 NaN NaN 70 90 True False Good
7 Henry 2 95 NaN NaN 70 90 True Good Good
```
阅读全文