Google Vision text detection returns too much unnecessary data

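The text detection feature of the Google Cloud Vision API is powerful, but its response often carries far more data than you need. If you only want specific information, the methods below help you filter the response and extract it.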

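1. Use the appropriate API request

First, make sure you are sending the appropriate request. The Google Cloud Vision API offers several features, including text detection, face detection, and label detection. Request only text detection, so that the response contains no annotations for features you do not need.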

from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Load the image
with open('path/to/your/image.jpg', 'rb') as image_file:
    content = image_file.read()

image = vision.Image(content=content)

# Run text detection
response = client.text_detection(image=image)

# Get the detection results
texts = response.text_annotations
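
To make "request only text detection" concrete, here is a minimal sketch of the JSON body that the REST endpoint images:annotate accepts when TEXT_DETECTION is the only feature listed. The helper name build_annotate_request is hypothetical and the image bytes are a placeholder; the snippet only builds the body and does not call the API.

```python
import base64
import json


def build_annotate_request(image_bytes, language_hints=None):
    """Build an images:annotate body that asks for TEXT_DETECTION only.

    Hypothetical helper for illustration; it does not send anything.
    """
    request = {
        # The REST API expects the image content as a base64 string.
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        # Listing a single feature keeps other annotation types out of the response.
        "features": [{"type": "TEXT_DETECTION"}],
    }
    if language_hints:
        request["imageContext"] = {"languageHints": list(language_hints)}
    return {"requests": [request]}


# Placeholder bytes stand in for a real image file.
body = build_annotate_request(b"fake image bytes", language_hints=["en"])
print(json.dumps(body, indent=2))
```

The client.text_detection helper used above sends a single-feature request of this kind on your behalf, so the sketch is mainly useful for checking what is actually being requested.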

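2. Filter out unnecessary data

In the text_annotations result, the first element normally holds the entire detected text, and each later element is an individual word with its own bounding polygon. Keep whichever part you need.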

# Get the full detected text
full_text = texts[0].description if texts else ""

# Get the individual words
individual_texts = [text.description for text in texts[1:]]
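
If you need positions as well as text, one option is to reduce each annotation to a small dictionary before doing anything else. The sketch below assumes each annotation exposes description and bounding_poly.vertices with x and y attributes, as the entries of text_annotations do; slim_annotation and _fake_annotation are hypothetical helpers, and the sample data is made up so the snippet runs without calling the API.

```python
from types import SimpleNamespace


def slim_annotation(annotation):
    """Reduce one annotation to its text and an axis-aligned bounding box."""
    xs = [vertex.x for vertex in annotation.bounding_poly.vertices]
    ys = [vertex.y for vertex in annotation.bounding_poly.vertices]
    return {
        "text": annotation.description,
        "box": (min(xs), min(ys), max(xs), max(ys)),  # left, top, right, bottom
    }


def _fake_annotation(text, points):
    """Stand-in for an API annotation, used only for this demonstration."""
    vertices = [SimpleNamespace(x=x, y=y) for x, y in points]
    return SimpleNamespace(
        description=text,
        bounding_poly=SimpleNamespace(vertices=vertices),
    )


# Made-up data shaped like response.text_annotations.
texts = [
    _fake_annotation("Hello\nworld", [(10, 10), (120, 10), (120, 60), (10, 60)]),
    _fake_annotation("Hello", [(10, 10), (60, 10), (60, 30), (10, 30)]),
    _fake_annotation("world", [(10, 40), (120, 40), (120, 60), (10, 60)]),
]

# Skip the first element, which holds the full text rather than a single word.
words = [slim_annotation(text) for text in texts[1:]]
print(words)
```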

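3. Extract specific information

If you only need certain pieces of information, process the text further. For example, to pull out text in a known format (such as email addresses or phone numbers), match it with regular expressions.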

import re

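# Example: extract email addresses
# (the final character class is [A-Za-z]; a "|" inside brackets would match a literal pipe)
email_pattern = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')
emails = email_pattern.findall(full_text)

# Example: extract 10-digit phone numbers with optional separators
phone_pattern = re.compile(r'\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b')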
phone_numbers = phone_pattern.findall(full_text)

print("Emails:", emails)
print("Phone Numbers:", phone_numbers)
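
OCR output usually contains line breaks, and \s in a pattern also matches a newline, so a match over the whole text can join digits from two adjacent lines. A minimal sketch of one way around this is to match line by line and record where each hit was found. The function name extract_by_line, the stricter phone pattern, and the sample text are assumptions for illustration.

```python
import re

EMAIL_RE = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')
# Only "-", "." or a plain space may separate the digit groups here.
PHONE_RE = re.compile(r'\b\d{3}[-. ]?\d{3}[-. ]?\d{4}\b')


def extract_by_line(full_text):
    """Return (line_number, kind, value) tuples for each match, line by line."""
    results = []
    for line_no, line in enumerate(full_text.splitlines(), start=1):
        for value in EMAIL_RE.findall(line):
            results.append((line_no, "email", value))
        for value in PHONE_RE.findall(line):
            results.append((line_no, "phone", value))
    return results


# Made-up text standing in for texts[0].description.
sample = "Contact: jane.doe@example.com\nTel 555-123-4567\nRef 12345"
matches = extract_by_line(sample)
print(matches)
```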

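4. Handle multilingual text

If your images contain text in several languages, you can pass language_hints in the image context to tell the API which languages to expect. These are hints rather than filters: they steer recognition toward the listed languages, which can cut down on misread characters, but they should not be relied on to remove text in other languages from the response.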

image_context = vision.ImageContext(language_hints=['en'])

response = client.text_detection(image=image, image_context=image_context)
texts = response.text_annotations
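
If you need to drop words in other scripts entirely, you can filter the detected words yourself after the call. The sketch below is one simple approach, assuming that "Latin script" is a good enough proxy for the language you want: it checks the Unicode name of each letter. is_latin_word is a hypothetical helper and the word list is made up.

```python
import unicodedata


def is_latin_word(word):
    """True if every alphabetic character in `word` is a Latin-script letter.

    Words with no letters at all (numbers, punctuation) are kept.
    """
    return all(
        unicodedata.name(ch, "").startswith("LATIN")
        for ch in word
        if ch.isalpha()
    )


# Made-up words standing in for the descriptions in texts[1:].
words = ["Hello", "\u4e16\u754c", "caf\u00e9", "2024", "\u041f\u0440\u0438\u0432\u0435\u0442"]
latin_words = [word for word in words if is_latin_word(word)]
print(latin_words)
```

This keeps accented Latin letters and digit-only tokens, and drops the Chinese and Cyrillic words in the sample.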

5. Error handling

Make sure you handle errors and edge cases, such as a failed API request or an empty result.

if response.error.message:
    raise Exception(f'{response.error.message}')
else:
    texts = response.text_annotations
    if texts:
        full_text = texts[0].description
        individual_texts = [text.description for text in texts[1:]]
    else:
        full_text = ""
        individual_texts = []
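
The check above covers errors reported inside the response. A request can also fail before any response arrives, for example on a network problem, in which case the call raises an exception. The following is a generic retry sketch with exponential backoff; call_with_retries and its parameters are hypothetical, and the flaky function merely simulates a call that fails twice. With the real client you would wrap the text_detection call and choose the exception types to retry on; the client library may also apply its own retry policy.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.5,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: let the caller see the error.
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated call that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # Record the delays instead of actually sleeping.
result = call_with_retries(flaky, attempts=4, base_delay=0.5,
                           retry_on=(ConnectionError,), sleep=delays.append)
print(result, delays)
```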

Complete example

The following complete example shows how to run text detection with the Google Cloud Vision API and extract specific information from the result.

from google.cloud import vision
import re

def detect_text(image_path):
    client = vision.ImageAnnotatorClient()

    with open(image_path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)
    response = client.text_detection(image=image)

    if response.error.message:
        raise Exception(f'{response.error.message}')
    
    texts = response.text_annotations
    if not texts:
        return "", []

    full_text = texts[0].description
    individual_texts = [text.description for text in texts[1:]]

    return full_text, individual_texts

def extract_emails_and_phones(text):
    # [A-Za-z] for the top-level domain; a "|" inside brackets would match a literal pipe
    email_pattern = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')
    phone_pattern = re.compile(r'\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b')

    emails = email_pattern.findall(text)
    phone_numbers = phone_pattern.findall(text)

    return emails, phone_numbers

# Example usage
image_path = 'path/to/your/image.jpg'
full_text, individual_texts = detect_text(image_path)
emails, phone_numbers = extract_emails_and_phones(full_text)

print("Full Text:", full_text)
print("Individual Texts:", individual_texts)
print("Emails:", emails)
print("Phone Numbers:", phone_numbers)

With these steps, you can filter the text detection results returned by the Google Cloud Vision API and extract just the information you need.