Pandas: select only the rows where a unique string in one column has only one specific string in another column

Basic Concepts

Pandas is a Python data analysis library that provides data structures and tools for manipulating and analyzing tabular data sets.

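Types

Series: a one-dimensional labeled array, similar to a list or a single table column.
DataFrame: a two-dimensional tabular data structure, similar to an Excel spreadsheet.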
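
As a minimal sketch of the two structures (the values and the names s and df_demo are made up for illustration):

```python
import pandas as pd

# A Series: one-dimensional labeled values
s = pd.Series(['A', 'B', 'C'], name='column_A')

# A DataFrame: a two-dimensional table; each column is itself a Series
df_demo = pd.DataFrame({'column_A': ['A', 'B'], 'column_B': ['x', 'y']})

print(s.shape)        # (3,)
print(df_demo.shape)  # (2, 2)
```

Selecting a single column of a DataFrame, such as df_demo['column_A'], gives back a Series.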

Use Cases

Data cleaning and preprocessing.
Data analysis and statistics.
Data visualization and report generation.

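Problem Description

Suppose you have a DataFrame with two columns, column_A and column_B. You want to select the rows of those distinct strings in column_A that are paired with only one specific string (for example "specific_string") in column_B. In other words, a column_A value qualifies only if every row containing it holds that string in column_B.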

Solution

The following example code shows one way to do this:

import pandas as pd

# Create a sample DataFrame
data = {
    'column_A': ['A', 'B', 'C', 'A', 'D', 'E'],
    'column_B': ['specific_string', 'other_string', 'specific_string', 'other_string', 'specific_string', 'other_string']
}
df = pd.DataFrame(data)

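# For each row, check whether column_B holds the target string
is_target = df['column_B'] == 'specific_string'

# A column_A value qualifies only if all of its rows hold the target string;
# transform('all') broadcasts that per-group result back to every row
only_target = is_target.groupby(df['column_A']).transform('all')

# Keep the rows of the qualifying column_A values (here: C and D; A is dropped
# because it also appears with 'other_string')
result_df = df[only_target]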

print(result_df)
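
The problem statement can also be read more strictly: keep a row only when its column_A value occurs exactly once in the whole table and that single row holds the target string. The following is a minimal sketch under that assumption. The data and the names demo, occurs_once, is_target and strict_df are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Sample data (assumed): 'C' appears twice, both times with the target string
demo = pd.DataFrame({
    'column_A': ['A', 'B', 'C', 'A', 'D', 'C'],
    'column_B': ['specific_string', 'other_string', 'specific_string',
                 'other_string', 'specific_string', 'specific_string'],
})

# True for rows whose column_A value occurs exactly once in the table
occurs_once = ~demo['column_A'].duplicated(keep=False)

# True for rows that hold the target string in column_B
is_target = demo['column_B'] == 'specific_string'

# Both conditions must hold for a row to be kept
strict_df = demo[occurs_once & is_target]
print(strict_df)
```

With this data only the row for D remains: A and C each occur more than once in column_A, and B occurs once but with a different string. Which reading is the right one depends on what "unique" is meant to say in the requirement.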