我们经常会使用weak来解决OC中的循环引用问题,因为weak不会使引用计数加1;并且weak修饰的指针还会在对象被销毁后自动置空,这有效的解决了野指针调用的问题。
那么weak 的底层原理是怎样的呢?我们接下来就来分析一下。
首先随便在一个工程中,写入下面类似的代码,然后在weak的那行打断点:
运行到断点处,转成汇编分析:
转成汇编后,我发现断点下方有一个objc_initWeak:
为了查看objc_initWeak函数是在哪个库里面,我对其打了个符号断点:
然后点击下一步:
此时,我发现objc_initWeak函数是在objc库中。
1、objc_initWeak
然后我找到objc库的源码,搜索objc_initWeak,找到了其源码实现:
id
objc_initWeak(id *location, id newObj)
{
if (!newObj) {
*location = nil;
return nil;
}
return storeWeak<DontHaveOld, DoHaveNew, DoCrashIfDeallocating>
(location, (objc_object*)newObj);
}
该方法有两个参数 location 和 newObj :
从上面的代码也可以看出来,objc_initWeak函数只是一个深层次函数调用的入口,在该函数内部会调用storeWeak函数。接下来我们就来看下storeWeak函数的实现。
2、storeWeak
实现如下:
// Template parameters.
enum HaveOld { DontHaveOld = false, DoHaveOld = true };
enum HaveNew { DontHaveNew = false, DoHaveNew = true };
enum CrashIfDeallocating {
DontCrashIfDeallocating = false, DoCrashIfDeallocating = true
};
template <HaveOld haveOld, HaveNew haveNew,
CrashIfDeallocating crashIfDeallocating>
static id
storeWeak(id *location, objc_object *newObj)
{
assert(haveOld || haveNew);
if (!haveNew) assert(newObj == nil);
Class previouslyInitializedClass = nil;
id oldObj;
SideTable *oldTable;
SideTable *newTable;
// Acquire locks for old and new values.
// Order by lock address to prevent lock ordering problems.
// Retry if the old value changes underneath us.
retry:
if (haveOld) { // 如果weak ptr之前弱引用过一个obj,则将这个obj所对应的SideTable取出,赋值给oldTable
oldObj = *location;
oldTable = &SideTables()[oldObj];
} else {
oldTable = nil; // 如果weak ptr之前没有弱引用过一个obj,则oldTable = nil
}
if (haveNew) { // 如果weak ptr要weak引用一个新的obj,则将该obj对应的SideTable取出,赋值给newTable
newTable = &SideTables()[newObj];
} else {
newTable = nil; // 如果weak ptr不需要引用一个新obj,则newTable = nil
}
// 加锁操作,防止多线程中竞争冲突
SideTable::lockTwo<haveOld, haveNew>(oldTable, newTable);
// location 应该与 oldObj 保持一致,如果不同,说明当前的 location 已经处理过 oldObj 可是又被其他线程所修改
if (haveOld && *location != oldObj) {
SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
goto retry;
}
// Prevent a deadlock between the weak reference machinery
// and the +initialize machinery by ensuring that no
// weakly-referenced object has an un-+initialized isa.
if (haveNew && newObj) {
Class cls = newObj->getIsa();
if (cls != previouslyInitializedClass &&
!((objc_class *)cls)->isInitialized()) // 如果cls还没有初始化,先初始化,再尝试设置weak
{
SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
_class_initialize(_class_getNonMetaClass(cls, (id)newObj));
// If this class is finished with +initialize then we're good.
// If this class is still running +initialize on this thread
// (i.e. +initialize called storeWeak on an instance of itself)
// then we may proceed but it will appear initializing and
// not yet initialized to the check above.
// Instead set previouslyInitializedClass to recognize it on retry.
previouslyInitializedClass = cls; // 这里记录一下previouslyInitializedClass, 防止改if分支再次进入
goto retry; // 重新获取一遍newObj,这时的newObj应该已经初始化过了
}
}
// Clean up old value, if any.
if (haveOld) {
weak_unregister_no_lock(&oldTable->weak_table, oldObj, location); // 如果weak_ptr之前弱引用过别的对象oldObj,则调用weak_unregister_no_lock,在oldObj的weak_entry_t中移除该weak_ptr地址
}
// Assign new value, if any.
if (haveNew) { // 如果weak_ptr需要弱引用新的对象newObj
// (1) 调用weak_register_no_lock方法,将weak ptr的地址记录到newObj对应的weak_entry_t中
newObj = (objc_object *)
weak_register_no_lock(&newTable->weak_table, (id)newObj, location,
crashIfDeallocating);
// weak_register_no_lock returns nil if weak store should be rejected
// (2) 更新newObj的isa的weakly_referenced bit标志位
// Set is-weakly-referenced bit in refcount table.
if (newObj && !newObj->isTaggedPointer()) {
newObj->setWeaklyReferenced_nolock();
}
// Do not set *location anywhere else. That would introduce a race.
// (3)*location 赋值,也就是将weak ptr直接指向了newObj。可以看到,这里并没有将newObj的引用计数+1
*location = (id)newObj; // 将weak ptr指向object
}
else {
// No new value. The storage is not changed.
}
// 解锁,其他线程可以访问oldTable, newTable了
SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
return (id)newObj; // 返回newObj,此时的newObj与刚传入时相比,weakly-referenced bit位置1
}
storeWeak函数的实现代码虽然很长,但是并不难理解,下面我们来分析下该方法的实现:
不难看出,storeWeak函数的重点是weak_unregister_no_lock和weak_register_no_lock这两个函数,而这两个函数都是操作的SideTable这样一个结构的变量,因此我们需要先了解下SideTable。
3、SideTable
先来看下SideTable的定义:
struct SideTable {
spinlock_t slock;
RefcountMap refcnts;
weak_table_t weak_table;
};
SideTable散列表,其定义很清晰,它有三个成员:
3.1 weak_table_t
先来看看weak_table_t的底层代码:
/**
* The global weak references table. Stores object ids as keys,
* and weak_entry_t structs as their values.
*/
struct weak_table_t {
weak_entry_t *weak_entries;
size_t num_entries;
uintptr_t mask;
uintptr_t max_hash_displacement;
};
weak_table_t是一个典型的hash结构,其里面有一个weak_entries数组,其元素是weak_entry_t类型。
3.2 weak_entry_t
#define WEAK_INLINE_COUNT 4
#define REFERRERS_OUT_OF_LINE 2
struct weak_entry_t {
DisguisedPtr<objc_object> referent; // 被弱引用的对象
// 引用该对象的对象列表,联合。引用个数小于4,用inline_referrers数组。用个数大于4,用动态数组weak_referrer_t *referrers
union {
struct {
weak_referrer_t *referrers; // 弱引用该对象的对象指针地址的hash数组
uintptr_t out_of_line_ness : 2; // 是否使用动态hash数组标记位
uintptr_t num_refs : PTR_MINUS_2; // hash数组中的元素个数
uintptr_t mask; // hash数组长度-1,会参与hash计算。(注意,这里是hash数组的长度,而不是元素个数。比如,数组长度可能是64,而元素个数仅存了2个)素个数)。
uintptr_t max_hash_displacement; // 可能会发生的hash冲突的最大次数,用于判断是否出现了逻辑错误(hash表中的冲突次数绝不会超过改值)
};
struct {
// out_of_line_ness field is low bits of inline_referrers[1]
weak_referrer_t inline_referrers[WEAK_INLINE_COUNT];
};
};
bool out_of_line() {
return (out_of_line_ness == REFERRERS_OUT_OF_LINE);
}
weak_entry_t& operator=(const weak_entry_t& other) {
memcpy(this, &other, sizeof(other));
return *this;
}
weak_entry_t(objc_object *newReferent, objc_object **newReferrer)
: referent(newReferent) // 构造方法,里面初始化了静态数组
{
inline_referrers[0] = newReferrer;
for (int i = 1; i < WEAK_INLINE_COUNT; i++) {
inline_referrers[i] = nil;
}
}
};
可以看到,weak_entry_t也是一个hash结构,其存储的元素是弱引用对象指针的指针,通过操作弱指针的指针,就可以使weak指针在其指向的对象析构后指向nil。
weak_entry_t的结构定义中是有联合体的,在联合体的内部有定长数组inline_referrers和动态数组referrers两种方式来存储弱引用对象指针的地址。
通过out_of_line()函数来判断到底使用哪种方式。当弱引用该对象的指针数目小于等于WEAK_INLINE_COUNT时,使用定长数组inline_referrers;当弱引用该对象的指针数目大于WEAK_INLINE_COUNT时,会将定长数组中的元素转移到动态数组中,并且之后都是使用动态数组存储。
到这里,我们已经知道了弱引用表weak_table_t是一个hash结构的表,其Key是所指对象的地址,其Value是weak指针的地址(地址的值就是所指对象的地址)数组。接下来我们就看一下这个弱引用表是怎么维护这些数据的。
4、weak_register_no_lock函数添加弱引用
id
weak_register_no_lock(weak_table_t *weak_table, id referent_id,
id *referrer_id, bool crashIfDeallocating)
{
objc_object *referent = (objc_object *)referent_id;
objc_object **referrer = (objc_object **)referrer_id;
// 如果referent为nil 或 referent 采用了TaggedPointer计数方式,直接返回,不做任何操作
if (!referent || referent->isTaggedPointer()) return referent_id;
// 确保被引用的对象可用(没有在析构,同时应该支持weak引用)
bool deallocating;
if (!referent->ISA()->hasCustomRR()) {
deallocating = referent->rootIsDeallocating();
}
else {
BOOL (*allowsWeakReference)(objc_object *, SEL) =
(BOOL(*)(objc_object *, SEL))
object_getMethodImplementation((id)referent,
SEL_allowsWeakReference);
if ((IMP)allowsWeakReference == _objc_msgForward) {
return nil;
}
deallocating =
! (*allowsWeakReference)(referent, SEL_allowsWeakReference);
}
// 正在析构的对象,不能够被弱引用
if (deallocating) {
if (crashIfDeallocating) {
_objc_fatal("Cannot form weak reference to instance (%p) of "
"class %s. It is possible that this object was "
"over-released, or is in the process of deallocation.",
(void*)referent, object_getClassName((id)referent));
} else {
return nil;
}
}
// now remember it and where it is being stored
// 在 weak_table中找到referent对应的weak_entry,并将referrer加入到weak_entry中
weak_entry_t *entry;
if ((entry = weak_entry_for_referent(weak_table, referent))) { // 如果能找到weak_entry,则讲referrer插入到weak_entry中
append_referrer(entry, referrer); // 将referrer插入到weak_entry_t的引用数组中
}
else { // 如果找不到,就新建一个
weak_entry_t new_entry(referent, referrer);
weak_grow_maybe(weak_table);
weak_entry_insert(weak_table, &new_entry);
}
// Do not set *referrer. objc_storeWeak() requires that the
// value not change.
return referent_id;
}
weak_register_no_lock函数中需要传入4个参数:
从上面的源码中我们可以分析出,weak_register_no_lock函数中主要做了以下几件事情:
4.1 weak_entry_for_referent取元素
static weak_entry_t *
weak_entry_for_referent(weak_table_t *weak_table, objc_object *referent)
{
assert(referent);
weak_entry_t *weak_entries = weak_table->weak_entries;
if (!weak_entries) return nil;
size_t begin = hash_pointer(referent) & weak_table->mask; // 这里通过 & weak_table->mask的位操作,来确保index不会越界
size_t index = begin;
size_t hash_displacement = 0;
while (weak_table->weak_entries[index].referent != referent) {
index = (index+1) & weak_table->mask;
if (index == begin) bad_weak_table(weak_table->weak_entries); // 触发bad weak table crash
hash_displacement++;
if (hash_displacement > weak_table->max_hash_displacement) { // 当hash冲突超过了可能的max hash 冲突时,说明元素没有在hash表中,返回nil
return nil;
}
}
return &weak_table->weak_entries[index];
}
4.2 append_referrer添加元素
static void append_referrer(weak_entry_t *entry, objc_object **new_referrer)
{
if (! entry->out_of_line()) { // 如果weak_entry 尚未使用动态数组,走这里
// Try to insert inline.
for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
if (entry->inline_referrers[i] == nil) {
entry->inline_referrers[i] = new_referrer;
return;
}
}
// 如果inline_referrers的位置已经存满了,则要转型为referrers,做动态数组。
// Couldn't insert inline. Allocate out of line.
weak_referrer_t *new_referrers = (weak_referrer_t *)
calloc(WEAK_INLINE_COUNT, sizeof(weak_referrer_t));
// This constructed table is invalid, but grow_refs_and_insert
// will fix it and rehash it.
for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
new_referrers[i] = entry->inline_referrers[I];
}
entry->referrers = new_referrers;
entry->num_refs = WEAK_INLINE_COUNT;
entry->out_of_line_ness = REFERRERS_OUT_OF_LINE;
entry->mask = WEAK_INLINE_COUNT-1;
entry->max_hash_displacement = 0;
}
// 对于动态数组的附加处理:
assert(entry->out_of_line()); // 断言:此时一定使用的动态数组
if (entry->num_refs >= TABLE_SIZE(entry) * 3/4) { // 如果动态数组中元素个数大于或等于数组位置总空间的3/4,则扩展数组空间为当前长度的一倍
return grow_refs_and_insert(entry, new_referrer); // 扩容,并插入
}
// 如果不需要扩容,直接插入到weak_entry中
// 注意,weak_entry是一个哈希表,key:w_hash_pointer(new_referrer) value: new_referrer
// 细心的人可能注意到了,这里weak_entry_t 的hash算法和 weak_table_t的hash算法是一样的,同时扩容/减容的算法也是一样的
size_t begin = w_hash_pointer(new_referrer) & (entry->mask); // '& (entry->mask)' 确保了 begin的位置只能大于或等于 数组的长度
size_t index = begin; // 初始的hash index
size_t hash_displacement = 0; // 用于记录hash冲突的次数,也就是hash再位移的次数
while (entry->referrers[index] != nil) {
hash_displacement++;
index = (index+1) & entry->mask; // index + 1, 移到下一个位置,再试一次能否插入。(这里要考虑到entry->mask取值,一定是:0x111, 0x1111, 0x11111, ... ,因为数组每次都是*2增长,即8, 16, 32,对应动态数组空间长度-1的mask,也就是前面的取值。)
if (index == begin) bad_weak_table(entry); // index == begin 意味着数组绕了一圈都没有找到合适位置,这时候一定是出了什么问题。
}
if (hash_displacement > entry->max_hash_displacement) { // 记录最大的hash冲突次数, max_hash_displacement意味着: 我们尝试至多max_hash_displacement次,肯定能够找到object对应的hash位置
entry->max_hash_displacement = hash_displacement;
}
// 将ref存入hash数组,同时,更新元素个数num_refs
weak_referrer_t &ref = entry->referrers[index];
ref = new_referrer;
entry->num_refs++;
}
这段代码中做的事情总结一下就是:首先确定是使用定长数组还是动态数组,如果是使用定长数组,则直接将weak指针地址添加到数组即可;如果定长数组已经用尽,则需要将将定长数组中的元素转存到动态数组中。
5、weak_unregister_no_lock移除引用
如果weak指针之前指向了一个弱引用,则会调用weak_unregister_no_lock函数将旧的weak指针地址从弱引用表中移除。
void
weak_unregister_no_lock(weak_table_t *weak_table, id referent_id,
id *referrer_id)
{
objc_object *referent = (objc_object *)referent_id;
objc_object **referrer = (objc_object **)referrer_id;
weak_entry_t *entry;
if (!referent) return;
if ((entry = weak_entry_for_referent(weak_table, referent))) { // 查找到referent所对应的weak_entry_t
remove_referrer(entry, referrer); // 在referent所对应的weak_entry_t的hash数组中,移除referrer
// 移除元素之后, 要检查一下weak_entry_t的hash数组是否已经空了
bool empty = true;
if (entry->out_of_line() && entry->num_refs != 0) {
empty = false;
}
else {
for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
if (entry->inline_referrers[i]) {
empty = false;
break;
}
}
}
if (empty) { // 如果weak_entry_t的hash数组已经空了,则需要将weak_entry_t从weak_table中移除
weak_entry_remove(weak_table, entry);
}
}
到这里位置,我们实际上就已经介绍完了【对一个对象做weak操作的时候底层所做的事情】:
首先,会有一张SideTable散列表,这个散列表包含了引用计数表、弱引用表等;
然后,散列表里面会有一张全局的弱引用表weak_table,来存储所有的弱指针信息;
弱引用表weak_table中又有一个数组weak_entries,其元素类型是weak_entry_t,每一个弱引用对象都会对应一个weak_entry_t;
weak_entry_t中会使用定长数组或者动态数组来存储指向对应对象的所有弱指针的地址。
我们可以看到,在整个流程中,都没有对引用计数的操作,因此,weak修饰的指针并不会使对象的引用计数加1。
我们知道,weak指针会在其所指向的对象被销毁的时候自动置为nil,这是怎么做到的呢?
6、dealloc
- (void)dealloc {
_objc_rootDealloc(self);
}
_objc_rootDealloc:
void
_objc_rootDealloc(id obj)
{
assert(obj);
obj->rootDealloc();
}
rootDealloc:
inline void
objc_object::rootDealloc()
{
if (isTaggedPointer()) return; // fixme necessary?
if (fastpath(isa.nonpointer &&
!isa.weakly_referenced &&
!isa.has_assoc &&
!isa.has_cxx_dtor &&
!isa.has_sidetable_rc))
{
assert(!sidetable_present());
free(this);
}
else {
object_dispose((id)this);
}
}
object_dispose:
id
object_dispose(id obj)
{
if (!obj) return nil;
objc_destructInstance(obj);
free(obj);
return nil;
}
objc_destructInstance:
void *objc_destructInstance(id obj)
{
if (obj) {
// Read all of the flags at once for performance.
bool cxx = obj->hasCxxDtor();
bool assoc = obj->hasAssociatedObjects();
// This order is important.
if (cxx) object_cxxDestruct(obj);
if (assoc) _object_remove_assocations(obj);
obj->clearDeallocating();
}
return obj;
}
clearDeallocating:
inline void
objc_object::clearDeallocating()
{
if (slowpath(!isa.nonpointer)) {
// Slow path for raw pointer isa.
sidetable_clearDeallocating();
}
else if (slowpath(isa.weakly_referenced || isa.has_sidetable_rc)) {
// Slow path for non-pointer isa with weak refs and/or side table data.
clearDeallocating_slow();
}
assert(!sidetable_present());
}
sidetable_clearDeallocating:
void
objc_object::sidetable_clearDeallocating()
{
SideTable& table = SideTables()[this];
// clear any weak table items
// clear extra retain count and deallocating bit
// (fixme warn or abort if extra retain count == 0 ?)
table.lock();
RefcountMap::iterator it = table.refcnts.find(this);
if (it != table.refcnts.end()) {
if (it->second & SIDE_TABLE_WEAKLY_REFERENCED) {
weak_clear_no_lock(&table.weak_table, (id)this);
}
table.refcnts.erase(it);
}
table.unlock();
}
weak_clear_no_lock:
void
weak_clear_no_lock(weak_table_t *weak_table, id referent_id)
{
objc_object *referent = (objc_object *)referent_id;
weak_entry_t *entry = weak_entry_for_referent(weak_table, referent); // 找到referent在weak_table中对应的weak_entry_t
if (entry == nil) {
/// XXX shouldn't happen, but does with mismatched CF/objc
//printf("XXX no entry for clear deallocating %p\n", referent);
return;
}
// zero out references
weak_referrer_t *referrers;
size_t count;
// 找出weak引用referent的weak 指针地址数组以及数组长度
if (entry->out_of_line()) {
referrers = entry->referrers;
count = TABLE_SIZE(entry);
}
else {
referrers = entry->inline_referrers;
count = WEAK_INLINE_COUNT;
}
for (size_t i = 0; i < count; ++i) {
objc_object **referrer = referrers[i]; // 取出每个weak ptr的地址
if (referrer) {
if (*referrer == referent) { // 如果weak ptr确实weak引用了referent,则将weak ptr设置为nil,这也就是为什么weak 指针会自动设置为nil的原因
*referrer = nil;
}
else if (*referrer) { // 如果所存储的weak ptr没有weak 引用referent,这可能是由于runtime代码的逻辑错误引起的,报错
_objc_inform("__weak variable at %p holds %p instead of %p. "
"This is probably incorrect use of "
"objc_storeWeak() and objc_loadWeak(). "
"Break on objc_weak_error to debug.\n",
referrer, (void*)*referrer, (void*)referent);
objc_weak_error();
}
}
}
weak_entry_remove(weak_table, entry); // 由于referent要被释放了,因此referent的weak_entry_t也要移除出weak_table
}
weak_clear_no_lock这个函数中做的事情就是:在弱引用表weak_table中获取到弱引用对象所对应的weak_entry_t,然后将weak_entry_t的定长数组或者动态数组中的所有弱指针地址信息进行遍历移除置空,最后将weak_entry_t从weak_table中移除。
上面就是weak指针能够自动置空的原因。
以上。