pandas: create a new category column when a field contains two given strings

Posted: 2024-03-09 11:44:50 Views: 71

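This can be done with pandas' `apply` method combined with a lambda expression. The steps are:

1. For each value in the field, check whether it contains both target strings. The lambda below does this with Python's `in` operator; `str.contains` is the vectorized way to run the same check and returns a boolean Series.
2. Have the lambda map the result of that check to a label, and assign what `apply` returns to a new column that serves as the category.

The code is as follows:

```python
import pandas as pd

# Build sample data
data = {
    'text': ['hello world', 'world hello', 'hello', 'world', 'hi world']
}
df = pd.DataFrame(data)

# Label each row: 'category1' if the text contains both 'hello' and 'world',
# otherwise 'category2'
df['category'] = df['text'].apply(
    lambda x: 'category1' if 'hello' in x and 'world' in x else 'category2'
)

print(df)
```

The output is:

```
          text   category
0  hello world  category1
1  world hello  category1
2        hello  category2
3        world  category2
4     hi world  category2
```

In the code above, `apply` runs the lambda on every value of the `text` column to produce the `category` column. If a value contains both `hello` and `world`, in either order, it is labeled `category1`; otherwise it is labeled `category2`. Note that `hi world` falls into `category2` because it contains only one of the two strings. The result is stored in `df['category']`.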
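The same check can also be written without `apply`, using `str.contains` to build the boolean Series directly. The following is a minimal sketch, not part of the original answer: the helper name `categorize` and its parameters are hypothetical, and the choice of `numpy.where` to turn the boolean mask into labels is an assumption about what is convenient here. It assumes the targets should be matched as literal substrings (`regex=False`) and that missing values should count as "does not contain" (`na=False`).

```python
import numpy as np
import pandas as pd


def categorize(texts: pd.Series, first: str, second: str) -> pd.Series:
    """Hypothetical helper: label rows whose text contains both substrings."""
    # regex=False matches the targets literally; na=False treats missing
    # values as "no match" so the masks stay purely boolean.
    has_first = texts.str.contains(first, regex=False, na=False)
    has_second = texts.str.contains(second, regex=False, na=False)
    # Combine the two boolean Series element-wise and map them to labels.
    labels = np.where(has_first & has_second, "category1", "category2")
    return pd.Series(labels, index=texts.index)


df = pd.DataFrame(
    {"text": ["hello world", "world hello", "hello", "world", "hi world"]}
)
df["category"] = categorize(df["text"], "hello", "world")
print(df)
```

One practical difference from the lambda version: `'hello' in x` raises a `TypeError` when `x` is a missing value such as `None` or `NaN`, whereas the sketch above assigns such rows to `category2`. Whether that default is appropriate depends on the data.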
