Exploring the Inner Workings and Capabilities of DeepFace

buzzfrog

Published 2024-07-13 00:11:11

Featured in the column: 云上修行

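Facial recognition is changing how we interact with devices and the digital world. Its accuracy, convenience, and efficiency have made it a fixture in security, marketing, and social media. This article takes a close look at DeepFace, a face recognition and analysis framework that applies modern deep learning models to detect, analyze, and verify faces.

DeepFace: Introduction and Background

DeepFace is a face recognition framework backed by a collection of deep learning models. Its name reflects the core idea behind it: using deep learning to achieve accurate face recognition and attribute analysis. The framework covers the full pipeline, from face detection through facial attribute analysis (age, gender, emotion, and race) to face recognition.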

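How It Works

A DeepFace workflow typically follows these steps:

Face detection: a detection algorithm first locates the faces in the image.

Face alignment: each detected face is then aligned in a preprocessing step to improve recognition accuracy.

Feature extraction: a pretrained deep learning model such as VGG-Face or Facenet extracts a feature vector (an embedding) from the aligned face.

Matching and analysis: finally, the extracted features are used for face recognition, comparison, or facial attribute analysis.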
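The final matching step can be sketched in a few lines. The snippet below is a minimal, dependency-free illustration of the three distance metrics named in the API documentation further down ('cosine', 'euclidean', 'euclidean_l2') and of a threshold decision. The helper names and the 0.5 threshold are assumptions made for illustration; they are not DeepFace's internal code or its pre-tuned values.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def euclidean_l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance after L2-normalizing both embeddings."""
    return euclidean_distance(l2_normalize(a), l2_normalize(b))


def is_same_person(
    a: Sequence[float], b: Sequence[float], threshold: float = 0.5
) -> bool:
    """Illustrative decision rule: match when the cosine distance is at or
    below the threshold. The default of 0.5 is an assumed placeholder."""
    return cosine_distance(a, b) <= threshold
```

In practice the embeddings would come from a recognition model and have hundreds or thousands of dimensions; the arithmetic is the same.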

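Key Features

DeepFace's appeal lies in its versatility: beyond face recognition, it also analyzes age, gender, emotion, and race. Its API covers a range of use cases, for example:

Verification: decide whether two faces belong to the same person.

Recognition: identify a given face against a database.

Analysis: estimate emotion, gender, age, and race from a face image.

Representation: encode a face as a feature vector for later analysis or comparison.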
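As a rough illustration of the verification use case, the sketch below wraps DeepFace.verify and formats its result. It assumes the deepface package is installed and that the result dictionary carries the 'verified', 'distance', 'threshold', and 'model' keys described in the verify docstring later in this article. The helper names describe_verification and compare_faces are hypothetical, introduced only for this example.

```python
from typing import Any, Dict


def describe_verification(result: Dict[str, Any]) -> str:
    """Turn a verify()-style result dict into a one-line summary."""
    verdict = "same person" if result["verified"] else "different people"
    return (
        f"{verdict} (distance={result['distance']:.3f}, "
        f"threshold={result['threshold']:.3f}, model={result['model']})"
    )


def compare_faces(img1_path: str, img2_path: str) -> str:
    """Verify two images with DeepFace's defaults and summarize the result."""
    # Imported lazily so describe_verification stays usable
    # even where deepface is not installed.
    from deepface import DeepFace

    result = DeepFace.verify(img1_path=img1_path, img2_path=img2_path)
    return describe_verification(result)
```

Calling compare_faces("img1.jpg", "img2.jpg") (placeholder file names) would run detection, alignment, embedding, and matching with the default model and detector. The other entry points, analyze, find, and represent, are called in the same style.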

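Technical Deep Dive

DeepFace is built on TensorFlow and implements its core features with pretrained models. It is flexible and extensible, supporting multiple face recognition models and several face detection backends.

Code Walkthrough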

# Common dependencies
import os
import warnings
import logging
from typing import Any, Dict, List, Union, Optional

# Must be set before importing tensorflow
os.environ["TF_USE_LEGACY_KERAS"] = "1"

# pylint: disable=wrong-import-position

# Third-party dependencies
import numpy as np
import pandas as pd
import tensorflow as tf

# Package dependencies
from deepface.commons import package_utils, folder_utils
from deepface.commons import logger as log
from deepface.modules import (
    modeling,
    representation,
    verification,
    recognition,
    demography,
    detection,
    streaming,
    preprocessing,
)
from deepface import __version__

logger = log.get_singletonish_logger()

# -----------------------------------
# Dependency configuration

# Users on tf 2.16 or later should install the tf_keras package
package_utils.validate_for_keras3()

warnings.filterwarnings("ignore")
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
tf_version = package_utils.get_tf_major_version()
if tf_version == 2:
    tf.get_logger().setLevel(logging.ERROR)
# -----------------------------------

# Create the folders needed to store model weights (if necessary)
folder_utils.initialize_folder()


def build_model(model_name: str) -> Any:
    """
    Build a deepface model.
    Args:
        model_name (string): face recognition or facial attribute model
            VGG-Face, Facenet, OpenFace, DeepFace, DeepID for face recognition
            Age, Gender, Emotion, Race for facial attributes
    Returns:
        built_model
    """
    return modeling.build_model(model_name=model_name)


def verify(
    img1_path: Union[str, np.ndarray, List[float]],
    img2_path: Union[str, np.ndarray, List[float]],
    model_name: str = "VGG-Face",
    detector_backend: str = "opencv",
    distance_metric: str = "cosine",
    enforce_detection: bool = True,
    align: bool = True,
    expand_percentage: int = 0,
    normalization: str = "base",
    silent: bool = False,
    threshold: Optional[float] = None,
    anti_spoofing: bool = False,
) -> Dict[str, Any]:
    """
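    Verify whether a pair of images shows the same person or different people.
    Args:
        img1_path (str or np.ndarray or List[float]): Path to the first image.
            Accepts an exact image path as a string, a numpy array (BGR), a base64-encoded image, or a precomputed embedding.

        img2_path (str or np.ndarray or List[float]): Path to the second image.
            Accepts an exact image path as a string, a numpy array (BGR), a base64-encoded image, or a precomputed embedding.

        model_name (str): Face recognition model. Options: VGG-Face, Facenet, Facenet512, OpenFace, DeepFace, DeepID, Dlib, ArcFace, SFace and GhostFaceNet (default is VGG-Face).

        detector_backend (string): Face detector backend. Options: 'opencv', 'retinaface', 'mtcnn', 'ssd', 'dlib', 'mediapipe', 'yolov8', 'centerface' or 'skip' (default is opencv).

        distance_metric (string): Metric for measuring similarity. Options: 'cosine', 'euclidean', 'euclidean_l2' (default is cosine).

        enforce_detection (boolean): Raise an exception if no face is detected. Set to False to avoid the exception for low-resolution images (default is True).

        align (bool): Enable face alignment (default is True).

        expand_percentage (int): Expand the detected facial area by a percentage (default is 0).

        normalization (string): Normalize the input image before feeding it to the model. Options: base, raw, Facenet, Facenet2018, VGGFace, VGGFace2, ArcFace (default is base).

        silent (boolean): Suppress or allow some log messages for a quieter analysis process (default is False).

        threshold (float): Threshold used to decide whether a pair represents the same person or different individuals. The distance is compared against it. If unset, a default pre-tuned threshold is applied based on the specified model name and distance metric (default is None).

        anti_spoofing (boolean): Enable anti-spoofing (default is False).

    Returns:
        result (dict): Dictionary of verification results with the following keys.

        - 'verified' (bool): Whether the images show the same person (True) or different people (False).

        - 'distance' (float): Distance between the face vectors. The smaller the distance, the higher the similarity.

        - 'threshold' (float): Maximum threshold used for verification. If the distance is below this threshold, the images are considered a match.

        - 'model' (str): The chosen face recognition model.

        - 'distance_metric' (str): The chosen similarity metric for measuring distance.

        - 'facial_areas' (dict): Rectangular regions of interest for the faces in both images.
            - 'img1': {'x': int, 'y': int, 'w': int, 'h': int} Region of interest for the first image.
            - 'img2': {'x': int, 'y': int, 'w': int, 'h': int} Region of interest for the second image.

        - 'time' (float): Time taken by the verification process, in seconds.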
    """

    return verification.verify(
        img1_path=img1_path,
        img2_path=img2_path,
        model_name=model_name,
        detector_backend=detector_backend,
        distance_metric=distance_metric,
        enforce_detection=enforce_detection,
        align=align,
        expand_percentage=expand_percentage,
        normalization=normalization,
        silent=silent,
        threshold=threshold,
        anti_spoofing=anti_spoofing,
    )


def analyze(
    img_path: Union[str, np.ndarray],
    actions: Union[tuple, list] = ("emotion", "age", "gender", "race"),
    enforce_detection: bool = True,
    detector_backend: str = "opencv",
    align: bool = True,
    expand_percentage: int = 0,
    silent: bool = False,
    anti_spoofing: bool = False,
) -> List[Dict[str, Any]]:
    """
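    Analyze facial attributes such as age, gender, emotion, and race in the provided image.
    Args:
        img_path (str or np.ndarray): An exact image path, a numpy array in BGR format, or a base64-encoded image. If the source image contains multiple faces, the result includes information for each detected face.

        actions (tuple): Attributes to analyze. The default is ('emotion', 'age', 'gender', 'race'). You can exclude some of these attributes as needed.

        enforce_detection (boolean): Raise an exception if no face is detected in the image. Set to False to avoid the exception for low-resolution images (default is True).

        detector_backend (string): Face detector backend. Options: 'opencv', 'retinaface', 'mtcnn', 'ssd', 'dlib', 'mediapipe', 'yolov8', 'centerface' or 'skip' (default is opencv).

        align (boolean): Align based on eye positions (default is True).

        expand_percentage (int): Expand the detected facial area by a percentage (default is 0).

        silent (boolean): Suppress or allow some log messages for a quieter analysis process (default is False).

        anti_spoofing (boolean): Enable anti-spoofing (default is False).

    Returns:
        results (List[Dict[str, Any]]): A list of dictionaries, one per detected face. Each dictionary contains the following keys:

        - 'region' (dict): Rectangular region of the detected face in the image.
            - 'x': x-coordinate of the top-left corner of the face.
            - 'y': y-coordinate of the top-left corner of the face.
            - 'w': Width of the detected face region.
            - 'h': Height of the detected face region.

        - 'age' (float): Estimated age of the detected face.

        - 'face_confidence' (float): Confidence score for the detected face. Indicates the reliability of the face detection.

        - 'dominant_gender' (str): Dominant gender of the detected face. Possible values are "Man" or "Woman".

        - 'gender' (dict): Confidence scores for each gender category.
            - 'Man': Confidence score for the male gender.
            - 'Woman': Confidence score for the female gender.

        - 'dominant_emotion' (str): Dominant emotion of the detected face. Possible values include "sad", "angry", "surprise", "fear", "happy", "disgust", and "neutral".

        - 'emotion' (dict): Confidence scores for each emotion category.
            - 'sad': Confidence score for sadness.
            - 'angry': Confidence score for anger.
            - 'surprise': Confidence score for surprise.
            - 'fear': Confidence score for fear.
            - 'happy': Confidence score for happiness.
            - 'disgust': Confidence score for disgust.
            - 'neutral': Confidence score for neutrality.

        - 'dominant_race' (str): Dominant race of the detected face. Possible values include "indian", "asian", "latino hispanic", "black", "middle eastern", and "white".

        - 'race' (dict): Confidence scores for each race category.
            - 'indian': Confidence score for Indian ethnicity.
            - 'asian': Confidence score for Asian ethnicity.
            - 'latino hispanic': Confidence score for Latino/Hispanic ethnicity.
            - 'black': Confidence score for Black ethnicity.
            - 'middle eastern': Confidence score for Middle Eastern ethnicity.
            - 'white': Confidence score for White ethnicity.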
    """
    return demography.analyze(
        img_path=img_path,
        actions=actions,
        enforce_detection=enforce_detection,
        detector_backend=detector_backend,
        align=align,
        expand_percentage=expand_percentage,
        silent=silent,
        anti_spoofing=anti_spoofing,
    )


def find(
    img_path: Union[str, np.ndarray],
    db_path: str,
    model_name: str = "VGG-Face",
    distance_metric: str = "cosine",
    enforce_detection: bool = True,
    detector_backend: str = "opencv",
    align: bool = True,
    expand_percentage: int = 0,
    threshold: Optional[float] = None,
    normalization: str = "base",
    silent: bool = False,
    refresh_database: bool = True,
    anti_spoofing: bool = False,
) -> List[pd.DataFrame]:
    """
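    Identify faces in a database.
    Args:
        img_path (str or np.ndarray): An exact image path, a numpy array in BGR format, or a base64-encoded image. If the source image contains multiple faces, the result includes information for each detected face.

        db_path (string): Path to the folder containing image files. All faces detected in the database are considered in the decision.

        model_name (str): Face recognition model. Options: VGG-Face, Facenet, Facenet512, OpenFace, DeepFace, DeepID, Dlib, ArcFace, SFace and GhostFaceNet (default is VGG-Face).

        distance_metric (string): Metric for measuring similarity. Options: 'cosine', 'euclidean', 'euclidean_l2' (default is cosine).

        enforce_detection (boolean): Raise an exception if no face is detected. Set to False to avoid the exception for low-resolution images (default is True).

        detector_backend (string): Face detector backend. Options: 'opencv', 'retinaface', 'mtcnn', 'ssd', 'dlib', 'mediapipe', 'yolov8', 'centerface' or 'skip' (default is opencv).

        align (boolean): Align based on eye positions (default is True).

        expand_percentage (int): Expand the detected facial area by a percentage (default is 0).

        threshold (float): Threshold used to decide whether a pair represents the same person or different individuals. The distance is compared against it. If unset, a default pre-tuned threshold is applied based on the specified model name and distance metric (default is None).

        normalization (string): Normalize the input image before feeding it to the model. Options: base, raw, Facenet, Facenet2018, VGGFace, VGGFace2, ArcFace (default is base).

        silent (boolean): Suppress or allow some log messages for a quieter analysis process (default is False).

        refresh_database (boolean): Synchronize the image representations (pkl) file with the directory or database files. If set to False, any file changes inside db_path are ignored (default is True).

        anti_spoofing (boolean): Enable anti-spoofing (default is False).

    Returns:
        results (List[pd.DataFrame]): A list of pandas DataFrames. Each DataFrame holds the identity information for one individual detected in the source image. The DataFrame columns include:

        - 'identity': Identity label of the detected individual.

        - 'target_x', 'target_y', 'target_w', 'target_h': Bounding box coordinates of the target face in the database.

        - 'source_x', 'source_y', 'source_w', 'source_h': Bounding box coordinates of the face detected in the source image.

        - 'threshold': Threshold for deciding whether a pair is the same person or different people.

        - 'distance': Similarity score between the faces based on the specified model and distance metric.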
    """
    return recognition.find(
        img_path=img_path,
        db_path=db_path,
        model_name=model_name,
        distance_metric=distance_metric,
        enforce_detection=enforce_detection,
        detector_backend=detector_backend,
        align=align,
        expand_percentage=expand_percentage,
        threshold=threshold,
        normalization=normalization,
        silent=silent,
        refresh_database=refresh_database,
        anti_spoofing=anti_spoofing,
    )


def represent(
    img_path: Union[str, np.ndarray],
    model_name: str = "VGG-Face",
    enforce_detection: bool = True,
    detector_backend: str = "opencv",
    align: bool = True,
    expand_percentage: int = 0,
    normalization: str = "base",
    anti_spoofing: bool = False,
) -> List[Dict[str, Any]]:
    """
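    Represent face images as multi-dimensional vector embeddings.

    Args:
        img_path (str or np.ndarray): An exact image path, a numpy array in BGR format, or a base64-encoded image. If the source image contains multiple faces, the result includes information for each detected face.

        model_name (str): Face recognition model. Options: VGG-Face, Facenet, Facenet512, OpenFace, DeepFace, DeepID, Dlib, ArcFace, SFace and GhostFaceNet (default is VGG-Face).

        enforce_detection (boolean): Raise an exception if no face is detected. Set to False to avoid the exception for low-resolution images (default is True).

        detector_backend (string): Face detector backend. Options: 'opencv', 'retinaface', 'mtcnn', 'ssd', 'dlib', 'mediapipe', 'yolov8', 'centerface' or 'skip' (default is opencv).

        align (boolean): Align based on eye positions (default is True).

        expand_percentage (int): Expand the detected facial area by a percentage (default is 0).

        normalization (string): Normalize the input image before feeding it to the model. Options: base, raw, Facenet, Facenet2018, VGGFace, VGGFace2, ArcFace (default is base).

        anti_spoofing (boolean): Enable anti-spoofing (default is False).

    Returns:
        results (List[Dict[str, Any]]): A list of dictionaries, each containing the following fields:

        - embedding (List[float]): Multi-dimensional vector representing the facial features. The number of dimensions depends on the reference model (e.g., FaceNet returns 128 dimensions, VGG-Face returns 4096 dimensions).

        - facial_area (dict): Facial area found by face detection, in dictionary format. Contains 'x' and 'y' as the top-left corner point, and 'w' and 'h' as the width and height. If `detector_backend` is set to 'skip', it represents the full image area and is meaningless.

        - face_confidence (float): Confidence score of the face detection. If `detector_backend` is set to 'skip', the confidence will be 0 and is meaningless.
    """
    return representation.represent(
        img_path=img_path,
        model_name=model_name,
        enforce_detection=enforce_detection,
        detector_backend=detector_backend,
        align=align,
        expand_percentage=expand_percentage,
        normalization=normalization,
        anti_spoofing=anti_spoofing,
    )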

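The code above offers a window into how the DeepFace framework exposes face recognition and attribute analysis. Each public function is a thin wrapper that forwards its arguments to the corresponding module (verification, demography, recognition, or representation). For example, verify checks whether two images show the same person, while analyze estimates the facial attributes in an image. Behind these functions sit TensorFlow-based models and a set of integrated computer vision techniques.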
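To make the shape of the analyze return value more concrete, the sketch below walks a list of per-face dictionaries and prints a one-line summary for each. It relies only on keys named in the analyze docstring ('region', 'age', 'dominant_gender', 'dominant_emotion') and assumes the matching actions were requested. The function name summarize_faces is hypothetical, and the sample data is hand-written in the documented shape, not real model output.

```python
from typing import Any, Dict, List


def summarize_faces(results: List[Dict[str, Any]]) -> List[str]:
    """Build one summary line per detected face from analyze()-style output."""
    summaries = []
    for index, face in enumerate(results, start=1):
        region = face["region"]
        summaries.append(
            f"face {index} at ({region['x']}, {region['y']}): "
            f"age~{face['age']}, {face['dominant_gender']}, "
            f"{face['dominant_emotion']}"
        )
    return summaries


# Hand-written sample in the documented shape; not real model output.
sample = [
    {
        "region": {"x": 10, "y": 20, "w": 100, "h": 120},
        "age": 31,
        "dominant_gender": "Woman",
        "dominant_emotion": "happy",
    }
]

for line in summarize_faces(sample):
    print(line)
```

With real data, the list passed in would be the return value of DeepFace.analyze, with one dictionary per face found in the image.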

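Use Cases

DeepFace has broad potential applications, from strengthening security systems to improving user experience to supporting the development and deployment of new technologies. Examples include automatic tagging on social media platforms and automated customer service and screening workflows in retail and recruiting.

Conclusion

DeepFace is more than a single technique; it is a significant step forward for face recognition and analysis. As DeepFace and similar technologies continue to develop, they will keep pushing the boundaries of artificial intelligence and computer vision and make everyday life safer, more convenient, and more connected.

As digitalization accelerates, understanding tools like DeepFace matters for practitioners and industry observers alike. As the technology advances and its applications widen, its influence will only grow.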

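Original content statement: this article is published in the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, please contact cloudcommunity@tencent.com for removal.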
