ValueError: cannot reindex on an axis with duplicate labels
Date: 2024-05-09 08:18:36
This error occurs when you try to reindex a pandas DataFrame or Series on an axis (rows or columns) that has duplicate labels.
For example, if you have a DataFrame with two rows that have the same label and you try to reindex it with a new set of labels, you will get this error:
```
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['a', 'a'])
df.reindex(['a', 'b', 'c'])
```
Output:
```
ValueError: cannot reindex on an axis with duplicate labels
```
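Before reindexing, it can help to confirm that the index really contains repeated labels and to see which ones they are. The following is a minimal sketch using the same `df` as above; the variable name `dup_labels` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['a', 'a'])

# is_unique is False when at least one label appears more than once
print(df.index.is_unique)

# keep=False marks every occurrence of a repeated label, not only the later ones
dup_labels = df.index[df.index.duplicated(keep=False)].unique().tolist()
print(dup_labels)
```

If `is_unique` is `True`, the reindex call should not raise this particular error and the cause lies elsewhere.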
To fix this error, first remove the duplicate labels from the axis you want to reindex. One way to do this is to build a boolean mask with the `duplicated()` method of the index (or columns) and keep only the first occurrence of each label:
```
df = df[~df.index.duplicated(keep='first')]
df.reindex(['a', 'b', 'c'])
```
Output:
```
     A    B
a  1.0  3.0
b  NaN  NaN
c  NaN  NaN
```
In this example, `df.index.duplicated(keep='first')` returns a boolean array that is `False` for the first occurrence of each label in the index and `True` for every subsequent occurrence of the same label. The `~` operator negates this array, so it becomes `True` only for the first occurrence of each label. Finally, we use the negated array to select those rows, which leaves a DataFrame whose index has no duplicates and can be reindexed safely. Note that the other rows sharing a label are discarded, so `keep='first'` or `keep='last'` decides which data survives.
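The pattern above can be wrapped in a small function when it is needed in several places. This is a sketch only: `dedupe_then_reindex` is a hypothetical helper name, not part of pandas, and it assumes that discarding all but one row per label is acceptable for your data.

```python
import pandas as pd

def dedupe_then_reindex(df, new_labels, keep='first'):
    """Hypothetical helper: drop repeated row labels, then reindex."""
    # duplicated() flags repeats as True; ~ flips it so the kept rows are True
    mask = ~df.index.duplicated(keep=keep)
    return df[mask].reindex(new_labels)

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['a', 'a'])

# Keep the first row labelled 'a' (A=1, B=3)
result = dedupe_then_reindex(df, ['a', 'b', 'c'])

# Keep the last row labelled 'a' instead (A=2, B=4)
last = dedupe_then_reindex(df, ['a', 'b', 'c'], keep='last')
```

Passing `keep='last'` retains the final occurrence of each label rather than the first, which may be preferable when later rows hold more recent data.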