C++效率掌握之STL库：list底层剖析及迭代器万字详解

DARLING Zero two

发布于 2025-02-25 14:38:13

11800

代码可运行

文章被收录于专栏：C语言C语言

运行总次数：0

代码可运行

list 的函数用法与 STL 库中其他的大差不差，本文章难度有些上升，将针对前面忽略的迭代器和模版进行深度解析，真正了解到底什么是迭代器，和迭代器的实现原理

1.学习list底层的重要性

list 底层是双向带头循环链表，在链表的任意位置进行插入和删除操作的时间复杂度都是 O(1)。这是因为链表插入或删除节点只需要修改相邻节点的指针。例如，在实现一个任务调度系统时，任务可能会随时被添加或移除，如果使用 std::list 来存储任务，就可以高效地处理这些操作

为了与库里的 list 进行区分，所有的类和函数都放在自定义的命名空间 bit 进行区分

2.节点模版

template<class T>
struct list_node
{
	list_node<T>* _next;
	list_node<T>* _prev;
	T _val;

	list_node(const T& val = T())
		:_next(nullptr)
		, _prev(nullptr)
		, _val(val)
	{}
};

经过数据结构的学习我们知道，每个节点都是一个结构体，因为是双向循环链表，所以有 prev 和 next，使用初始化列表进行常规的初始化

🔥值得注意的是：

这里使用 struct 而不是 class，是因为 struct 的默认访问权限是 public，class 的默认访问权限是 private ，虽然用 class＋友元 的方式也是可行的，但不如用 struct 来的方便
构造函数参数部分为 const T& val = T() ，而不是常写的 0 ，是因为 T 无法确定是自定义类型还是内置类型，所以当没有参数传入时就调用各自类型的默认构造函数进行传参

3.迭代器模版及实现

迭代器就是一个桥梁，让容器能通过迭代器实现算法

容器 <–> 迭代器 <–> 算法

根据迭代器的容器底层结构决定的性质，可以大致分为三类：

单向迭代器（forward iterator）：支持运算符重载 ++，常用容器为 forward_list、unordered_map、unordered_set
双向迭代器（bidirectional iterator）：支持运算符重载 ++、--，常用容器为 list、map、set
随机迭代器（random access iterator）：支持运算符重载 ++、--、+、-，常用容器为vector、string、deque

从下往上为依次包含的关系，比如 list 迭代器为双向，那么既可以用双向，也可以用单向，不能用随机。通常 std 库里的是随机迭代器

因此这也解释了为什么 vector 迭代器可以使用 std::iterator，也可以使用 vector<T>::iterator。但是 list 迭代器不可以使用 std::iterator，一般使用 list<T>::iterator

3.1 初步实现

	void test_list1()
	{
		list<int> lt;
		lt.push_back(1);
		lt.push_back(2);
		lt.push_back(3);
		lt.push_back(4);

		list<int>::iterator it = lt.begin();
		while (it != lt.end())
		{
			cout << *it << " ";
			++it;
		}
		cout << endl;

		for (auto e : lt)
		{
			cout << e << " ";
		}
		cout << endl;
	}

剖析底层代码先从功能入手，通常我们实现迭代器如代码所示

3.1.1 迭代器模版

那么要先实现基本的迭代器模版

template<class T>
struct _list_iterator
{
	typedef list_node<T> Node;
	Node* _node;

	_list_iterator(Node* node)
		:_node(node)
	{}
};

实现基本的构造函数，_node 是一个指向 Node 类型对象的指针，它用于存储当前迭代器所指向的链表节点

实现 push_back 往链表中放置数据，实现方式还是和双线链表中一样，可以去数据结构专题中回顾

3.1.2 push_back

传送门：【初阶数据结构】逆流的回环链桥：双链表

void push_back(const T& x)
{
	Node* tail = _head->_prev;
	Node* newnode = new Node(x);
	tail->_next = newnode;
	newnode->_prev = tail;
	newnode->_next = _head;
	_head->_prev = newnode;
	//insert(end(), x);
}

3.1.3 begin和end

接下来实现 begin() 和 end() ，还是比较简单的，这里直接展示放在 list 模版中的效果，后续就只单独列出实现功能的代码了

template<class T>
class list
{
	typedef list_node<T> Node;
public:
	typedef _list_iterator<T> iterator;
	iterator begin() const
	{
		return _head->_next;
	}

	iterator end() const
	{
		return _head;
	}

	list()
	{
		_head = new Node;
		_head->_prev = _head;
		_head->_next = _head;
	}

private:
	Node* _head;
};

list 模版有一个头节点，作为哨兵位，实现了双链表的构造函数，看似简单的代码，实则隐藏了很多的细节

🔥值得注意的是：

begin() 和 end() 返回的是 iterator 类型，C++ 标准库提供了大量基于迭代器的通用算法（如 std::find、std::sort、std::for_each 等）。当自定义容器的 begin() 函数返回迭代器时，这些通用算法可以直接应用到自定义容器上，实现代码的复用
iterator 是 _list_iterator<T> 模版的重命名，更符合使用习惯
begin() 和 end() 返回的值看似是 Node* 类型，实际上单参数的函数支持隐式转换，Node* 隐式转换为 iterator 类型，return _head->_next 其实是 return iterator(_head->_next)
begin() 和 end() 函数都用 const 修饰，使函数更具普遍性，包括 const 变量的情况，普通类型调用也是权限的缩小，两种情况共用一个函数

3.1.4 *解引用运算符重载

T& operator*()
{
	return _node->_val;
}

这里 _node 应该是一个指向链表节点的指针，_val 是链表节点中存储的数据成员。该函数返回节点数据 _val

3.1.5 ++运算符重载

//前置
_list_iterator<T>& operator++()
{
	_node = _node->_next;
	return *this;
}
//后置
_list_iterator<T> operator++(int)
{
	self temp(*this);
	_node = _node->_next;
	return temp;
}

为了区分前置后置，前置 ++ 和后置 ++ 虽然是运算符重载，但是形式上也构成函数重载，后置 ++ 增加这个 int 参数不是为了接受具体的值，仅仅是占位，跟前置 ++ 构成重载， int 这个位置传任何值都可以，实际调用的时候前置 ++ 和后置 ++ 可能分别为 operator++() 和 operator++(0) ，括号内的值是随机的

🔥值得注意的是： 前置 ++ 返回的对象还存在，所以可以用引用，而后置 ++ 返回的对象是个临时对象，所以不能返回引用

3.1.6 !=运算符重载

bool operator!=(const _list_iterator<T>& it) const
{
	return _node != it._node;
}

该函数通过比较当前对象的 _node 成员变量和传入对象 it 的 _node 成员变量来判断两个对象是否不相等。如果 _node 和 it._node 不相等，则返回 true，表示两个对象不相等；否则返回 false，表示两个对象相等

🔥值得注意的是： 调用 end() 时传值返回，具有常性，所以 != 重载要参数加 const

以上就已经基本实现了迭代器，能够跑起来实现正常功能，范围 for 的底层也是迭代器，所以也能够使用了，但是还有很多需要完善的

3.2 再完善实现

通常迭代器分为 iterator 和 const_iterator

那么我们难道再写一个来实现吗，显然是不符合编程范式的，代码就显得太冗余了

有人或许会想到这么写：

typedef _list_iterator<T> iterator;
typedef const _list_const_iterator<T> const_iterator;

这样看起或许可行，但是我们要思考 const 的使用

举个例子：

const T* ptr1;
T* const ptr2;

这两段代码的含义分别为：不可以修改指向对象的内容，不可以修改指向别的对象

显然 typedef const _list_const_iterator<T> const_iterator 是第一种 const 的使用，在 ++ 运算符重载中有 _node = _node->_next ，需要通过这段代码来到下一个节点，按照我们这样加 const 就和这段代码冲突了。我们只是不想修改其内容，而不是无法跳转节点

所以发明 STL库的人真是个天才，想到了巧妙的办法：

迭代器模版
template<class T, class Ref>
......
list模版
typedef _list_iterator<T, T&> iterator;
typedef _list_iterator<T, const T&> const_iterator;

iterator 和 const_iterator 的区别就在于 * 运算符重载返回的值是否能够被修改，因此增加一个新的模版参数，当使用对应的迭代器就会调用相应的模版

3.3 最终完善实现

在查看STL库里的list底层代码时，会发现实际上的迭代器代码有三个参数，单纯去看是很难发现为什么要有第三个参数的，必须要结合实际的应用场景

举个例子：

struct A
{
	A(int a1 = 0, int a2 = 0)
		:_a1(a1)
		,_a2(a2)
	{}

	int _a1;
	int _a2;
};

void test_list2()
{
	list<A> lt;
	lt.push_back(A(1, 1));
	lt.push_back(A(2, 2));
	lt.push_back(A(3, 3));
	lt.push_back(A(4, 4));
	list<A>::iterator it = lt.begin();
	while (it != lt.end())
	{
		/*cout << (*it)._a1 << " " << (*it)._a2 << endl;*/
		cout << it->_a1 << " " << it->_a2 << endl;
		++it;
	}
	cout << endl;
}

假设有个类A，需要遍历输出，通常我们会写成 cout << (*it)._a1 << " " << (*it)._a2 << endl ，但是这并不符合编程范式

可是为什么写成 cout << it->_a1 << " " << it->_a2 << endl 也能通过？

严格来说，应该写成 it->->_a2，才符合语法，因为运算符重载要求可读性，所以编译器特殊处理，省略一个 ->

3.3.1 ->运算符重载

所以我们可以写出 -> 运算符重载

Ptr operator->()
{
	return &(_node->_val);
}

同样的道理，为了实现iterator 和 const_iterator两种迭代器，再增加一个模版参数，同时因为模版参数有点过多，导致类型太长了，对类型也进行重命名

迭代器模版
template<class T, class Ref, class Ptr>;
typedef _list_iterator<T, Ref, Ptr> self;
......
list模版
typedef _list_iterator<T, T&, T*> iterator;
typedef _list_iterator<T, const T&, const T*> const_iterator;

4.其余list函数功能实现

4.1 析构函数

~list()
{
	clear();
	delete _head;
	_head = nullptr;
}

4.2 拷贝构造函数

list(const list<T>& lt)
{
	_head = new Node;
	_head->_prev = _head;
	_head->_next = _head;
	//万一拷贝的是string、vector等代价就大了，所以用&
	for (auto& x : lt)
	{
		push_back(x);
	}
}

4.3 =运算符重载

list<T>& operator=(list<T> lt)
{
	swap(lt);
	return *this;
}

4.4 swap

void swap(list<T>& lt)
{
	std::swap(_head, lt._head);
}

4.5 clear

void clear()
{
	iterator it = begin();
	while (it != end())
	{
		it = erase(it);
	}
}

4.6 insert

iterator insert(iterator pos, const T& x)
{
	Node* cur = pos._node;
	Node* prev = cur->_prev;
	Node* newnode = new Node(x);

	prev->_next = newnode;
	newnode->_next = cur;
	cur->_prev = newnode;
	newnode->_prev = prev;

	return newnode;
}

4.7 erase

iterator erase(iterator pos)
{
	assert(pos != end());

	Node* cur = pos._node;
	Node* prev = cur->_prev;
	Node* next = cur->_next;

	prev->_next = next;
	next->_prev = prev;

	delete cur;
	return next;
}

4.8 push_front、pop_back、pop_front

void push_front(const T& x)
{
	insert(begin(), x);
}

void pop_back()
{
	erase(--end());
}

void pop_front()
{
	erase(begin());
}

4.9 size

size_t size()
{
	size_t sz = 0;
	iterator it = begin();
	while (it != end())
	{
		++sz;
		++it;
	}
	return sz;
}

4.10 --运算符重载

//前置
self& operator--()
{
	_node = _node->_prev;
	return *this;
}

//后置
self operator--(int)
{
	self temp(*this);
	_node = _node->_prev;
	return temp;
}

4.11 ==运算符重载

bool operator==(const self& it) const
{
	return _node == it._node;
}

5.list模拟实现代码及测试代码展示

#pragma once
#include <iostream>
using namespace std;
#include <assert.h>

namespace bit
{
	//节点模版
	template<class T>
	struct list_node
	{
		list_node<T>* _next;
		list_node<T>* _prev;
		T _val;

		list_node(const T& val = T())
			:_next(nullptr)
			, _prev(nullptr)
			, _val(val)
		{}
	};
	//迭代器模版
	template<class T, class Ref, class Ptr>
	struct _list_iterator
	{
		typedef list_node<T> Node;
		typedef _list_iterator<T, Ref, Ptr> self;
		Node* _node;

		_list_iterator(Node* node)
			:_node(node)
		{}

		//const T& operator*()就可以实现const_iterator
		Ref& operator*()
		{
			return _node->_val;
		}
		
		Ptr operator->()
		{
			return &(_node->_val);
		}
		//前置
		self& operator++()
		{
			_node = _node->_next;
			return *this;
		}

		self& operator--()
		{
			_node = _node->_prev;
			return *this;
		}

		//后置
		self operator++(int)
		{
			self temp(*this);
			_node = _node->_next;
			return temp;
		}

		self operator--(int)
		{
			self temp(*this);
			_node = _node->_prev;
			return temp;
		}

		bool operator!=(const self& it) const
		{
			return _node != it._node;
		}

		bool operator==(const self& it) const
		{
			return _node == it._node;
		}
	};
	//list类模版
	template<class T>
	class list
	{
		typedef list_node<T> Node;
	public:
		typedef _list_iterator<T, T&, T*> iterator;
		typedef _list_iterator<T, const T&, const T*> const_iterator;
		//typedef const _list_const_iterator<T> const_iterator;
		//不能这样写，因为++没法next,就没法遍历了
		iterator begin() const
		{
			return iterator(_head->_next);
			//单参数的函数支持隐式转换
		}

		iterator end() const
		{
			return _head;
		}
		
		list()
		{
			_head = new Node;
			_head->_prev = _head;
			_head->_next = _head;
		}

		list(const list<T>& lt)
		{
			_head = new Node;
			_head->_prev = _head;
			_head->_next = _head;
			//万一拷贝的是string、vector等代价就大了，所以用&
			for (auto& x : lt)
			{
				push_back(x);
			}
		}

		void swap(list<T>& lt)
		{
			std::swap(_head, lt._head);
		}

		list<T>& operator=(list<T> lt)
		{
			swap(lt);
			return *this;
		}
		
		~list()
		{
			clear();
			delete _head;
			_head = nullptr;
		}

		void clear()
		{
			iterator it = begin();
			while (it != end())
			{
				it = erase(it);
			}
		}

		void push_back(const T& x)
		{
			/*Node* tail = _head->_prev;
			Node* newnode = new Node(x);
			tail->_next = newnode;
			newnode->_prev = tail;
			newnode->_next = _head;
			_head->_prev = newnode;*/
			insert(end(), x);
		}

		void push_front(const T& x)
		{
			insert(begin(), x);
		}

		void pop_back()
		{
			erase(--end());
		}

		void pop_front()
		{
			erase(begin());
		}

		iterator insert(iterator pos, const T& x)
		{
			Node* cur = pos._node;
			Node* prev = cur->_prev;
			Node* newnode = new Node(x);

			prev->_next = newnode;
			newnode->_next = cur;
			cur->_prev = newnode;
			newnode->_prev = prev;

			return newnode;
		}

		iterator erase(iterator pos)
		{
			assert(pos != end());

			Node* cur = pos._node;
			Node* prev = cur->_prev;
			Node* next = cur->_next;

			prev->_next = next;
			next->_prev = prev;

			delete cur;
			return next;
		}

		size_t size()
		{
			size_t sz = 0;
			iterator it = begin();
			while (it != end())
			{
				++sz;
				++it;
			}
			return sz;
		}
	private:
		Node* _head;
	};

	void test_list1()
	{
		list<int> lt;
		lt.push_back(1);
		lt.push_back(2);
		lt.push_back(3);
		lt.push_back(4);

		list<int>::iterator it = lt.begin();
		//调用end的函数时传值返回，具有常性,所以!=重载要参数加const
		while (it != lt.end())
		{
			cout << *it << " ";
			++it;
		}
		cout << endl;

		for (auto e : lt)
		{
			cout << e << " ";
		}
		cout << endl;
	}

	struct A
	{
		A(int a1 = 0, int a2 = 0)
			:_a1(a1)
			,_a2(a2)
		{}

		int _a1;
		int _a2;
	};

	void test_list2()
	{
		list<A> lt;
		lt.push_back(A(1, 1));
		lt.push_back(A(2, 2));
		lt.push_back(A(3, 3));
		lt.push_back(A(4, 4));
		//严格来说，应该写成it->->_a2，才符合语法
		//因为运算符重载要求可读性，所以编译器特殊处理，省略一个->
		list<A>::iterator it = lt.begin();
		while (it != lt.end())
		{
			/*cout << (*it)._a1 << " " << (*it)._a2 << endl;*/
			cout << it->_a1 << " " << it->_a2 << endl;
			++it;
		}
		cout << endl;
	}

	void test_list3()
	{
		list<int> lt;
		lt.push_back(1);
		lt.push_back(2);
		lt.push_back(3);
		lt.push_back(4);
		lt.push_front(5);
		lt.push_front(6);
		lt.push_front(7);
		lt.push_front(8);

		lt.pop_front();
		lt.pop_back();

		for (auto e : lt)
		{
			cout << e << " ";
		}
		cout << endl;
		
		lt.clear();
		lt.push_back(10);
		lt.push_back(20);
		lt.push_back(30);
		lt.push_back(40);
		for (auto e : lt)
		{
			cout << e << " ";
		}
		cout << endl;
		cout << lt.size() << endl;
	}

	void test_list4()
	{
		list<int> lt1;
		lt1.push_back(1);
		lt1.push_back(2);
		lt1.push_back(3);
		lt1.push_back(4);
		for (auto e : lt1)
		{
			cout << e << " ";
		}
		cout << endl;
		list<int> lt2;
		lt2 = lt1;
		for (auto e : lt2)
		{
			cout << e << " ";
		}
		cout << endl;
	}
}