python对比两个excel表内重复数据并标记为红色单元格背景
时间: 2024-05-12 15:20:08 浏览: 194
以下是使用Python标记两个Excel表中重复数据并将其标记为红色单元格背景的步骤:
1. 安装所需的库
需要安装openpyxl和pandas库。可以使用以下命令安装:
```
pip install openpyxl
pip install pandas
```
2. 读取两个Excel文件
使用openpyxl库读取两个Excel文件:
```python
import openpyxl
# 打开第一个Excel文件
wb1 = openpyxl.load_workbook('file1.xlsx')
ws1 = wb1.active
# 打开第二个Excel文件
wb2 = openpyxl.load_workbook('file2.xlsx')
ws2 = wb2.active
```
3. 将数据转换为pandas数据帧
使用pandas库将数据转换为数据帧:
```python
import pandas as pd
# 将第一个Excel文件的数据转换为数据帧
df1 = pd.DataFrame(ws1.values)
df1.columns = [cell.value for cell in ws1[1]]
df1 = df1.drop(0)
# 将第二个Excel文件的数据转换为数据帧
df2 = pd.DataFrame(ws2.values)
df2.columns = [cell.value for cell in ws2[1]]
df2 = df2.drop(0)
```
4. 找到重复数据
使用pandas库的merge函数找到两个数据帧中的重复数据:
```python
# 找到重复数据
merged = pd.merge(df1, df2, how='inner')
```
5. 将重复数据标记为红色单元格背景
使用openpyxl库将重复数据标记为红色单元格背景:
```python
# 将重复数据标记为红色单元格背景
for row in ws1.rows:
for cell in row:
if cell.value in merged.values:
cell.fill = openpyxl.styles.PatternFill(start_color='FFC7CE', end_color='FFC7CE', fill_type='solid')
for row in ws2.rows:
for cell in row:
if cell.value in merged.values:
cell.fill = openpyxl.styles.PatternFill(start_color='FFC7CE', end_color='FFC7CE', fill_type='solid')
# 保存Excel文件
wb1.save('file1.xlsx')
wb2.save('file2.xlsx')
```
完整的代码如下:
```python
import openpyxl
import pandas as pd
# 打开第一个Excel文件
wb1 = openpyxl.load_workbook('file1.xlsx')
ws1 = wb1.active
# 打开第二个Excel文件
wb2 = openpyxl.load_workbook('file2.xlsx')
ws2 = wb2.active
# 将第一个Excel文件的数据转换为数据帧
df1 = pd.DataFrame(ws1.values)
df1.columns = [cell.value for cell in ws1[1]]
df1 = df1.drop(0)
# 将第二个Excel文件的数据转换为数据帧
df2 = pd.DataFrame(ws2.values)
df2.columns = [cell.value for cell in ws2[1]]
df2 = df2.drop(0)
# 找到重复数据
merged = pd.merge(df1, df2, how='inner')
# 将重复数据标记为红色单元格背景
for row in ws1.rows:
for cell in row:
if cell.value in merged.values:
cell.fill = openpyxl.styles.PatternFill(start_color='FFC7CE', end_color='FFC7CE', fill_type='solid')
for row in ws2.rows:
for cell in row:
if cell.value in merged.values:
cell.fill = openpyxl.styles.PatternFill(start_color='FFC7CE', end_color='FFC7CE', fill_type='solid')
# 保存Excel文件
wb1.save('file1.xlsx')
wb2.save('file2.xlsx')
```
阅读全文