【启发式算法】灰狼优化算法（Grey Wolf Optimizer, GWO）详细介绍（Python）

不去幼儿园

发布于 2025-11-02 12:33:00

5500

运行总次数：0

📢本篇文章是博主启发式算法领域学习时，用于个人学习、研究或者欣赏使用，并基于博主对相关等领域的一些理解而记录的学习摘录和笔记，若有不当和侵权之处，指出后将会立即改正，还望谅解。文章分类在👉启发式算法专栏：【启发式算法】（13）---《灰狼优化算法（Grey Wolf Optimizer, GWO）（Python）》

灰狼优化算法（Grey Wolf Optimizer, GWO）详细介绍（Python）

一、引言

在现代计算智能领域，优化问题广泛存在于机器学习、工程设计、图像处理、路径规划等诸多应用中。传统梯度类优化方法在处理非凸、高维、不可导问题时表现受限，因此，基于自然界启发的**元启发式优化算法（Metaheuristics）**成为研究热点。

灰狼优化算法（Grey Wolf Optimizer, GWO）由 Mirjalili 等人于 2014 年提出，是一种模拟灰狼社会等级与狩猎行为的群体智能优化算法。GWO 算法结构简洁、参数少、全局搜索能力强，已在多个 benchmark 函数与工程问题中展现出优越性能。

二、灰狼社会等级与算法映射

灰狼群体具有严格的社会等级结构，GWO 算法将其映射为优化过程中的候选解：

狼等级	算法角色	优化含义
α（Alpha）	最优解	当前群体中的最佳候选解
β（Beta）	次优解	第二优解，辅助 Alpha 决策
δ（Delta）	第三优解	第三优解，参与引导搜索
ω（Omega）	其他个体	被引导的候选解，向 α、β、δ 学习

三、狩猎行为的数学建模

灰狼的狩猎行为主要包括三个阶段：包围猎物、追踪猎物、攻击猎物。GWO 算法将这三个阶段建模为如下数学形式。

3.1 包围猎物（Encircling Prey）

设猎物位置为

，灰狼个体位置为

，则包围行为定义如下：

\vec{D}(t) = \left| \vec{C} \cdot \vec{X}_p(t) - \vec{X}(t) \right|

\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D}(t)

其中，

和

为系数向量，定义如下：

\vec{A} = 2a \cdot \vec{r}_1 - a,\quad \vec{C} = 2 \cdot \vec{r}_2

随迭代线性减小，从 2 降至 0，控制探索与开发的转换； -

为随机向量，增强搜索的随机性。

3.2 协同狩猎（Hunting）

在 GWO 中，Alpha、Beta 和 Delta 狼负责估计猎物位置，其余个体根据它们的引导更新位置。具体公式如下：

首先计算与三只头狼的距离：

\begin{aligned} \vec{D}_\alpha &= \left| \vec{C}_1 \cdot \vec{X}_\alpha - \vec{X} \right| \\ \vec{D}_\beta &= \left| \vec{C}_2 \cdot \vec{X}_\beta - \vec{X} \right| \\ \vec{D}_\delta &= \left| \vec{C}_3 \cdot \vec{X}_\delta - \vec{X} \right| \end{aligned}

然后根据它们的引导更新位置：

\begin{aligned} \vec{X}_1 &= \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha \\ \vec{X}_2 &= \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta \\ \vec{X}_3 &= \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta \end{aligned}

最终位置更新为三者平均：

\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}

3.3 探索与开发（Exploration vs Exploitation）

GWO 通过调节参数

实现探索与开发的动态平衡：

- 当

：搜索个体远离猎物，进行**全局探索**； - 当

：搜索个体靠近猎物，进行**局部开发**。

此外，

向量引入随机扰动，有助于跳出局部最优，增强全局搜索能力。

四、算法流程

GWO 算法流程如下：

1. 初始化灰狼种群 X_i (i = 1, 2, ..., N)
2. 计算每个个体的适应度
3. 记录 Alpha、Beta、Delta 狼
4. 对于每次迭代：
   a. 更新参数 a, A, C
   b. 对于每个个体：
      i. 根据 Alpha、Beta、Delta 更新位置
      ii. 计算新位置的适应度
   c. 更新 Alpha、Beta、Delta
5. 满足终止条件则结束，否则返回步骤 4

算法示例：

1.问题设定：Rastrigin 二维函数最小化

函数：f(x,y)=20+x2+y2−10(cos2πx+cos2πy) 全局最小值在 (0,0)，f=0。搜索范围：x,y∈[-5,5]。（曲面高低起伏，有很多局部极小，适合观察 GWO 如何跳出陷阱。）

2.初始化（t=0）

种群大小 N=5，随机生成 5 只灰狼（为便于手算，只保留 2 位小数）：

编号	(x,y)	f(x,y)	等级
1	( 2.1, -1.3)	27.84	ω
2	(-0.8, 0.4)	8.24	β
3	( 1.2, 2.4)	35.56	ω
4	(-0.3, -0.2)	1.92	α
5	( 3.0, -2.5)	58.30	ω

立刻选出前三狼： α = 4 号 (-0.3, -0.2) β = 2 号 (-0.8, 0.4) δ = 1 号 ( 2.1, -1.3)

3.第一次迭代（t=1）

3.1 更新控制参数 a = 2 → 0 线性递减，这里第一次迭代取 a=2。 r1,r2∈[0,1] 随机，假设本次：

r1=0.3, r2=0.8 → A = 2a·r1 - a = 2·2·0.3 - 2 = -0.8 C = 2·r2 = 1.6

3.2 对任意一只普通狼（以 3 号为例）演示完整更新 3 号当前位置 X=(1.2, 2.4)

Step-1 计算与 α,β,δ 的距离 Dα=|C·Xα - X|=|1.6·(-0.3, -0.2) - (1.2, 2.4)| = |(-0.48, -0.32) - (1.2, 2.4)| = (1.68, 2.72)

同理（手算省略）得 Dβ=(2.08, 1.92) Dδ=(4.56, 4.48)

Step-2 计算三只“引导狼”给出的下一步位置 X1 = Xα - A·Dα = (-0.3, -0.2) - (-0.8)·(1.68, 2.72) = (-0.3, -0.2) + (1.344, 2.176) = (1.044, 1.976)

X2 = Xβ - A·Dβ = (-0.8, 0.4) + 0.8·(2.08, 1.92) = (0.864, 1.936)

X3 = Xδ - A·Dδ = (2.1, -1.3) + 0.8·(4.56, 4.48) = (5.748, 2.284)

Step-3 取平均得到新位置 X(t+1)= (X1+X2+X3)/3 = ( (1.044+0.864+5.748)/3 , (1.976+1.936+2.284)/3 ) = (2.55, 2.06)

3.3 计算新位置适应度 f(2.55, 2.06)= 20 + 2.55² + 2.06² -10(cos2π·2.55 + cos2π·2.06) ≈ 22.4 比原 f=35.56 明显下降，移动有效！

3.4 对整个种群重复上述操作 得到 5 个新位置后，重新排序，选出新的 α,β,δ，进入下一次迭代。

4.迭代轨迹可视化（示意图）

初始 α 在 (-0.3,-0.2)
第 1 步后 α 可能已被更优个体取代，向 (0,0) 靠近
随着 a 减小，搜索步长自动缩小，群体逐渐“收缩”到全局最小值 (0,0)

五、论文结果

论文中GWO 在 29 个标准测试函数上与 PSO、GSA、DE 等算法对比，结果如下：

测试类型	函数特性	GWO 表现
单峰函数	检验开发能力	收敛速度快，精度高
多峰函数	检验探索能力	避免局部最优能力强
复合函数	综合性能	探索与开发平衡良好

[Python] 灰狼算法实现

项目代码我已经放入GitCode里面，可以通过下面链接跳转：🔥 【启发式算法】灰狼优化算法（Grey Wolf Optimizer, GWO）（Python）后续相关启发式算法也会不断在【HeuristicAlgorithm】项目里更新，如果该项目对你有所帮助，请帮我点一个星星✨✨✨✨✨，鼓励分享，十分感谢！！！若是下面代码复现困难或者有问题，也欢迎评论区留言。

"""《灰狼算法》
    时间：2024.10.30
    作者：不去幼儿园
"""
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['font.family'] = 'SimHei'  # Windows系统常用中文字体

# 参数设置
Pop_size = 10
Max_T = 75
Dim = 2
Num_obstacles = 50
safe_distance = 5
bounds = np.array([[0, 100], [0, 100]])


# 随机生成起点和终点
def generate_start_and_goal(bounds, safe_distance,
                            min_sep=None, margin=5):
    """
    保证起点在左上矩形，终点在右下矩形，且两者欧氏距离 >= min_sep
    bounds:  np.array([[xmin, xmax], [ymin, ymax]])
    margin:  距离边界的最小留白，防止起点/终点贴边
    min_sep: 两点间最小距离，默认等于 safe_distance
    """
    if min_sep is None:
        min_sep = safe_distance

    x_min, x_max = bounds[0, 0], bounds[0, 1]
    y_min, y_max = bounds[1, 0], bounds[1, 1]

    # 计算中心点，把地图切成左上 / 右下两个矩形
    x_c = (x_min + x_max) / 2
    y_c = (y_min + y_max) / 2

    # 左上矩形边界
    left_x1, right_x1 = x_min + margin, x_c - margin
    bottom_y1, top_y1    = y_c + margin, y_max - margin

    # 右下矩形边界
    left_x2, right_x2 = x_c + margin, x_max - margin
    bottom_y2, top_y2    = y_min + margin, y_c - margin

    # 确保矩形有效
    assert left_x1 < right_x1 and bottom_y1 < top_y1, "左上区域过小，请调大地图或减小 margin"
    assert left_x2 < right_x2 and bottom_y2 < top_y2, "右下区域过小，请调大地图或减小 margin"

    while True:
        # 在左上矩形采起点
        start = np.array([np.random.uniform(left_x1, right_x1),
                          np.random.uniform(bottom_y1, top_y1)])
        # 在右下矩形采终点
        goal  = np.array([np.random.uniform(left_x2, right_x2),
                          np.random.uniform(bottom_y2, top_y2)])

        if np.linalg.norm(start - goal) >= min_sep:
            return start, goal


# 随机生成障碍物
def generate_obstacles(num_obstacles, bounds, start, goal, safe_distance):
	obstacles = []
	while len(obstacles) < num_obstacles:
		obs = np.random.uniform(bounds[:, 0], bounds[:, 1], size=(1, 2)).flatten()
		if (np.linalg.norm(obs - start) >= safe_distance and
				np.linalg.norm(obs - goal) >= safe_distance and
				all(np.linalg.norm(obs - other) >= 2 * safe_distance for other in obstacles)):
			obstacles.append(obs)
	return np.array(obstacles)


# 适应度函数（标准GWO与改进GWO共用）
def fitness_function(position, goal, obstacles, safe_distance):
	distance_to_goal = np.linalg.norm(position - goal)
	penalty = sum([1000 / dis_to_obs for obs in obstacles
				   if (dis_to_obs := np.linalg.norm(position - obs)) < safe_distance])
	return distance_to_goal + penalty


# 标准GWO算法实现
def standard_gwo(Pop_size, Max_T, Dim, start, goal, bounds, obstacles, safe_distance):
	positions = np.random.uniform(bounds[:, 0], bounds[:, 1], size=(Pop_size, Dim))
	alpha_pos, beta_pos, delta_pos = start.copy(), None, None
	alpha_score, beta_score, delta_score = float('inf'), float('inf'), float('inf')
	best_path, best_scores = [start.copy()], []

	for iter in range(Max_T):
		for i in range(Pop_size):
			fitness = fitness_function(positions[i, :], goal, obstacles, safe_distance)

			if fitness < alpha_score:
				delta_score, delta_pos = beta_score, beta_pos
				beta_score, beta_pos = alpha_score, alpha_pos
				alpha_score, alpha_pos = fitness, positions[i, :].copy()
			elif fitness < beta_score:
				delta_score, delta_pos = beta_score, beta_pos
				beta_score, beta_pos = fitness, positions[i, :].copy()
			elif fitness < delta_score:
				delta_score, delta_pos = fitness, positions[i, :].copy()

		best_path.append(alpha_pos.copy())
		best_scores.append(alpha_score)

		for i in range(Pop_size):
			for j in range(Dim):
				r1_alpha, r2_alpha = np.random.rand(), np.random.rand()
				A1 = 2 * (2 - iter * (2 / Max_T)) * r1_alpha - (2 - iter * (2 / Max_T))
				C1 = 2 * r2_alpha
				D_alpha = abs(C1 * alpha_pos[j] - positions[i, j])
				X1 = alpha_pos[j] - A1 * D_alpha

				A2, C2 = 2 * (2 - iter * (2 / Max_T)) * np.random.rand() - (
						2 - iter * (2 / Max_T)), 2 * np.random.rand()
				D_beta = abs(C2 * beta_pos[j] - positions[i, j]) if beta_pos is not None else 0
				X2 = beta_pos[j] - A2 * D_beta

				A3, C3 = 2 * (2 - iter * (2 / Max_T)) * np.random.rand() - (
						2 - iter * (2 / Max_T)), 2 * np.random.rand()
				D_delta = abs(C3 * delta_pos[j] - positions[i, j]) if delta_pos is not None else 0
				X3 = delta_pos[j] - A3 * D_delta

				positions[i, j] = (X1 + X2 + X3) / 3

		positions = np.clip(positions, bounds[:, 0], bounds[:, 1])

	return best_path, best_scores, alpha_pos


# 改进GWO算法实现
def modified_gwo(Pop_size, Max_T, Dim, start, goal, bounds, obstacles, safe_distance):
	positions = np.random.uniform(bounds[:, 0], bounds[:, 1], size=(Pop_size, Dim))
	alpha_pos, beta_pos, delta_pos = start.copy(), None, None
	alpha_score, beta_score, delta_score = float('inf'), float('inf'), float('inf')
	best_path, best_scores = [start.copy()], []

	for iter in range(Max_T):
		a = 2 - iter * (2 / Max_T)

		for i in range(Pop_size):
			fitness = fitness_function(positions[i, :], goal, obstacles, safe_distance)
			if fitness < alpha_score:
				delta_score, delta_pos = beta_score, beta_pos
				beta_score, beta_pos = alpha_score, alpha_pos
				alpha_score, alpha_pos = fitness, positions[i, :].copy()
			elif fitness < beta_score:
				delta_score, delta_pos = beta_score, beta_pos
				beta_score, beta_pos = fitness, positions[i, :].copy()
			elif fitness < delta_score:
				delta_score, delta_pos = fitness, positions[i, :].copy()

		best_path.append(alpha_pos.copy())
		best_scores.append(alpha_score)

		for i in range(Pop_size):
			for j in range(Dim):
				r1, r2 = np.random.rand(), np.random.rand()
				A1, C1 = 2 * a * r1 - a, 2 * r2
				D_alpha = abs(C1 * alpha_pos[j] - positions[i, j])
				X1 = alpha_pos[j] - A1 * D_alpha

				A2, C2 = 2 * a * np.random.rand() - a, 2 * np.random.rand()
				D_beta = abs(C2 * beta_pos[j] - positions[i, j]) if beta_pos is not None else 0
				X2 = beta_pos[j] - A2 * D_beta

				A3, C3 = 2 * a * np.random.rand() - a, 2 * np.random.rand()
				D_delta = abs(C3 * delta_pos[j] - positions[i, j]) if delta_pos is not None else 0
				X3 = delta_pos[j] - A3 * D_delta

				positions[i, j] = (0.6 * X1 + 0.3 * X2 + 0.1 * X3)

		positions = np.clip(positions, bounds[:, 0], bounds[:, 1])

	return best_path, best_scores, alpha_pos


start, goal = generate_start_and_goal(bounds, safe_distance)
obstacles = generate_obstacles(Num_obstacles, bounds, start, goal, safe_distance)

# 运行两种GWO算法
standard_path, standard_scores, standard_final_pos = standard_gwo(Pop_size, Max_T, Dim, start, goal, bounds, obstacles,
																  safe_distance)
modified_path, modified_scores, modified_final_pos = modified_gwo(Pop_size, Max_T, Dim, start, goal, bounds, obstacles,
																  safe_distance)

# 绘制最终路径
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.title("标准GWO - 最终路径")
plt.plot(start[0], start[1], 'go', markersize=10, label='起点')
plt.plot(goal[0], goal[1], 'ro', markersize=10, label='终点')
plt.plot(obstacles[:, 0], obstacles[:, 1], 'ks', markersize=4, label='障碍物')
standard_path_np = np.array(standard_path)
plt.plot(standard_path_np[:, 0], standard_path_np[:, 1], 'c-', linewidth=1.5, label='路径')
plt.scatter(standard_path_np[:, 0], standard_path_np[:, 1], color='c', s=20, marker='o')
plt.plot(standard_final_pos[0], standard_final_pos[1], 'c*', markersize=12, label='最终Alpha位置')
plt.legend()

plt.subplot(1, 2, 2)
plt.title("改进GWO - 最终路径")
plt.plot(start[0], start[1], 'go', markersize=10, label='起点')
plt.plot(goal[0], goal[1], 'ro', markersize=10, label='终点')
plt.plot(obstacles[:, 0], obstacles[:, 1], 'ks', markersize=4, label='障碍物')
modified_path_np = np.array(modified_path)
plt.plot(modified_path_np[:, 0], modified_path_np[:, 1], 'm-', linewidth=1.5, label='路径')
plt.scatter(modified_path_np[:, 0], modified_path_np[:, 1], color='m', s=20, marker='o')
plt.plot(modified_final_pos[0], modified_final_pos[1], 'm*', markersize=12, label='最终Alpha位置')
plt.legend()
plt.show()

# 绘制收敛曲线对比
plt.figure()
plt.plot(range(Max_T), standard_scores, 'b-', label='标准GWO')
plt.plot(range(Max_T), modified_scores, 'r-', label='改进GWO')
plt.xlabel('迭代次数')
plt.ylabel('最优适应度值')
plt.title('收敛曲线对比')
plt.legend()
plt.show()

# 计算最优路径长度
standard_path_Len = sum(np.linalg.norm(standard_path_np[i] - standard_path_np[i + 1])
						for i in range(len(standard_path_np) - 1))

print(f'标准路径长度: {standard_path_Len}')

modified_path_Len = sum(np.linalg.norm(modified_path_np[i] - modified_path_np[i + 1])
						for i in range(len(modified_path_np) - 1))

print(f'改进路径长度: {modified_path_Len}')

[Results] 运行结果

[Notice] 注意事项

GWO 就是： “让最厉害的三只狼（α,β,δ）当‘导航员’，其余狼每隔一步都向它们学习一次，逐渐把整群狼拖到谷底。” 把上面的手算步骤对照代码走一遍，你会发现“包围-狩猎-攻击”不再抽象，而是每一步都在做向量加加减减——这就是灰狼优化的全部秘密。

# 环境配置
Python                  3.11.5
torch                   2.1.0
torchvision             0.16.0
gym                     0.26.2

六、总结

灰狼优化算法（GWO）作为一种群体智能优化方法，具有以下优势：

- 结构简洁，参数少，易于实现； - 模拟社会等级与协同行为，具备良好全局搜索能力； - 在探索与开发之间实现动态平衡； - 在标准测试函数与工程问题中表现优异，具备广泛应用前景。

参考资料： Mirjalili S., Mirjalili S.M., Lewis A. (2014). *Grey Wolf Optimizer*, Advances in Engineering Software, Vol. 69, pp. 46–61.

https://github.com/Dev-Soni-2608/Grey-Wolf-Optimizer-for-Robot-Path-Planning/tree/main

本文参与腾讯云自媒体同步曝光计划，分享自作者个人站点/博客。

原始发表：2025-11-01，如有侵权请联系 cloudcommunity@tencent.com 删除

函数

本文分享自作者个人站点/博客前往查看

如有侵权，请联系 cloudcommunity@tencent.com 删除。

本文参与腾讯云自媒体同步曝光计划，欢迎热爱写作的你一起参与！

登录后参与评论

0 条评论

热度