Runtime: the methods and cache member variables

2022-06-25 07:45:00 【chabuduoxs】

The methods member variable of classes and metaclasses

When we analyzed classes earlier, we saw that a class's methods member stores all of the instance method information for that class. How is it actually stored?

class method_array_t :
    public list_array_tt<method_t, method_list_t>
{
    typedef list_array_tt<method_t, method_list_t> Super;

 public:
    method_list_t **beginCategoryMethodLists() {
        return beginLists();
    }

    method_list_t **endCategoryMethodLists(Class cls);

    method_array_t duplicate() {
        return Super::duplicate<method_array_t>();
    }
};

The source code shows that methods is an array of pointers. With N categories, the array occupies (N + 1) * 8 bytes: it holds one pointer per one-dimensional method list, and each pointer refers to a real list of instance methods, such as the instance method list of category 1 or the method list of the class itself. Each list stores its instance methods as method_t entries. Here is the definition of method_t.

// MARK: - method_t structure declaration
struct method_t {
    SEL name;            // selector
    const char *types;   // type encoding: return type and parameter types
    MethodListIMP imp;   // pointer to the implementation

    struct SortBySELAddress :
        public std::binary_function<const method_t&,
                                    const method_t&, bool>
    {
        bool operator() (const method_t& lhs,
                         const method_t& rhs)
        { return lhs.name < rhs.name; }
    };
};
typedef struct method_t *Method; // Method declaration
// In essence, a Method is a pointer to a method_t structure, so it can point to any method.

It contains only three pointer-sized members, so each method_t takes up just 24 bytes (on a 64-bit platform), and this memory lives in the static area. A metaclass's methods works the same way, except that it stores the list of class methods.
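The size figures above can be checked with a small stand-in. This is only a sketch: `FakeMethod` and `listArrayBytes` are hypothetical names, plain pointers stand in for `SEL` and `MethodListIMP`, and the 24-byte and 8-byte figures assume a 64-bit platform.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for method_t: three pointer-sized members.
struct FakeMethod {
    const char *name;   // stands in for SEL
    const char *types;  // type encoding string
    void (*imp)();      // stands in for the IMP function pointer
};

// Three pointers with no padding between them (24 bytes when pointers are 8 bytes).
static_assert(sizeof(FakeMethod) == 3 * sizeof(void *),
              "FakeMethod should be exactly three pointers wide");

// Bytes used by the outer array of list pointers:
// one pointer per category list plus one for the class's own list.
inline std::size_t listArrayBytes(std::size_t categoryCount) {
    std::size_t listCount = categoryCount + 1;
    assert(listCount > categoryCount);  // guard against overflow of N + 1
    return listCount * sizeof(void *);
}
```

With two categories, `listArrayBytes(2)` is three pointers wide, which matches the (N + 1) * 8 figure on a 64-bit platform.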
Let's analyze these three member variables .

SEL: the method selector. It corresponds one-to-one with the method name and uniquely identifies a method, so it can be treated as the method name. The @selector(methodName) expression used earlier obtains a method selector.
types: the type encoding string, which describes the method's return value and parameters. The first character of the encoding is the return type, and the following characters are the types of each parameter in turn. The first number is the total memory occupied by all parameters, and the following numbers are the memory offsets of each parameter.
IMP: a function pointer. It stores an address that points to the method's concrete implementation in the code segment.
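To make the types layout concrete, here is a minimal sketch that splits a simple encoding into (type, number) pairs. `splitTypeEncoding` and `EncodedItem` are hypothetical helpers, not runtime API, and the parser only handles simple encodings (array codes, which contain digits, are out of scope). The sample string `i24@0:8i16f20` is the encoding of a method that returns an int and takes an int and a float in addition to the implicit self and _cmd.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One piece of a type encoding: a type code followed by a number.
struct EncodedItem {
    std::string type;  // e.g. "i" (int), "f" (float), "@" (object), ":" (SEL), "v" (void)
    int number;        // total argument size for the first item, offset for the rest
};

// Hypothetical helper: splits simple encodings such as "i24@0:8i16f20".
std::vector<EncodedItem> splitTypeEncoding(const std::string &encoding) {
    std::vector<EncodedItem> items;
    std::size_t i = 0;
    while (i < encoding.size()) {
        EncodedItem item{"", 0};
        // Type code: everything up to the next digit.
        while (i < encoding.size() &&
               !std::isdigit(static_cast<unsigned char>(encoding[i]))) {
            item.type += encoding[i++];
        }
        assert(!item.type.empty());  // every number must follow a type code
        // Number: the run of digits that follows the type code.
        while (i < encoding.size() &&
               std::isdigit(static_cast<unsigned char>(encoding[i]))) {
            item.number = item.number * 10 + (encoding[i++] - '0');
        }
        items.push_back(item);
    }
    return items;
}
```

For `i24@0:8i16f20` this yields five items: return type `i` with a total of 24 bytes, then self (`@`) at offset 0, _cmd (`:`) at offset 8, the int at offset 16 and the float at offset 20.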

The cache member variable of classes and metaclasses

When an object receives a message, the runtime follows its isa pointer to the class it belongs to, finds all the method lists through the class's methods, and then walks those lists in turn to find the method to execute. In practice, only a few methods of an object are used frequently and the rest are called very rarely. If every message required a walk through all the lists, performance would be very poor.
The cache member variable of the class exists to solve this problem.
Whenever a method is called, it is stored in cache. The next call searches cache first and only falls back to methods if nothing is found there, which greatly improves the efficiency of method lookup.
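The cache-first lookup order described above can be sketched as follows. This is an illustration only: `FakeClass`, `FakeImp` and `lookUpImp` are hypothetical names, selectors are modeled as strings, and `std::unordered_map` stands in for the real cache_t hash table.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using FakeImp = int (*)();  // hypothetical stand-in for IMP

struct FakeClass {
    std::vector<std::pair<std::string, FakeImp>> methods;  // full method list
    std::unordered_map<std::string, FakeImp> cache;        // stand-in for cache_t
    int slowLookups = 0;                                   // how often the list was walked
};

// Look in the cache first; fall back to walking the method list and
// remember the result so the next call for the same selector is fast.
FakeImp lookUpImp(FakeClass &cls, const std::string &sel) {
    assert(!sel.empty());
    auto hit = cls.cache.find(sel);
    if (hit != cls.cache.end()) return hit->second;

    ++cls.slowLookups;
    for (const auto &method : cls.methods) {
        if (method.first == sel) {
            cls.cache[sel] = method.second;  // fill the cache on first use
            return method.second;
        }
    }
    return nullptr;  // not found in this class
}
```

Calling `lookUpImp` twice with the same selector walks the method list only once; the second call is served from the cache.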
Let's look at the cache source code.

struct cache_t {
    struct bucket_t *_buckets;
    mask_t _mask;
    mask_t _occupied;
};
// MARK: - bucket_t structure declaration
struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    MethodCacheIMP _imp;
    cache_key_t _key;
#else
    cache_key_t _key;
    MethodCacheIMP _imp;
#endif

public:
    inline cache_key_t key() const { return _key; }
    inline IMP imp() const { return (IMP)_imp; }
    inline void setKey(cache_key_t newKey) { _key = newKey; }
    inline void setImp(IMP newImp) { _imp = newImp; }

    void set(cache_key_t newKey, IMP newImp);
};

Let's analyze the member variables:

_buckets: the method cache hash table
_mask: the hash table length - 1
_occupied: the number of cached methods

The elements of the hash table are not method_t directly but bucket_t. Its definition shows two member variables, an IMP and a cache_key_t. The IMP is a function pointer, so a reasonable guess is that cache_key_t is the method's identifier.
Let's see how Apple implements this hash table.

// Class points to cache. SEL is key. Cache buckets store SEL+IMP.
// Caches are never built in the dyld shared cache.
// "SEL is key" tells us that cache_key_t is the SEL that uniquely identifies
// the method; the cache hash table stores SEL + IMP pairs.
static inline mask_t cache_hash(cache_key_t key, mask_t mask)
{
    // The hash function is very simple: key & (length - 1) gives the index.
    // SEL is the unique identifier of the method.
    return (mask_t)(key & mask);
}
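The masking step can be tried in isolation. The sketch below assumes the table length is a power of two, so that `key & mask` is the same as `key % length`; `fakeCacheHash`, `FakeKey` and `FakeMask` are hypothetical stand-ins for `cache_hash`, `cache_key_t` and `mask_t`.

```cpp
#include <cassert>
#include <cstdint>

using FakeKey = std::uintptr_t;   // stand-in for cache_key_t
using FakeMask = std::uint32_t;   // stand-in for mask_t

// Same masking step as cache_hash: keep only the low bits of the key.
inline FakeMask fakeCacheHash(FakeKey key, FakeMask mask) {
    // mask must be length - 1 with length a power of two (all low bits set).
    assert(((static_cast<std::uint64_t>(mask) + 1) & mask) == 0);
    return static_cast<FakeMask>(key & mask);
}
```

With a table of length 8 (mask 7), the keys 0x100D and 0x2005 both map to index 5, which is exactly the kind of collision discussed next.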

cache_hash collisions

In practice, different SELs can produce the same index after sel & (n - 1), which results in a collision. How is this handled?

// Hash table lookup
bucket_t * cache_t::find(cache_key_t k, id receiver)
{
    assert(k != 0);
    // Get the hash table and its mask (length - 1)
    bucket_t *b = buckets();
    mask_t m = mask();
    // Compute the starting index with the hash function
    mask_t begin = cache_hash(k, m);
    mask_t i = begin;
    do {
        // Return the slot if its key matches the SEL we are looking for,
        // or if it is empty (first call of this method, so it can be stored here)
        if (b[i].key() == 0  ||  b[i].key() == k) {
            return &b[i];
        }
        // Otherwise move to index - 1 and keep probing the hash table
    } while ((i = cache_next(i, m)) != begin);

    // hack
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    // The whole table was scanned without a match or a free slot: report the error
    cache_t::bad_cache(receiver, (SEL)k, cls);
}

As you can see, the lookup checks whether the slot at the computed index is free or holds the SEL we are searching for. If neither is the case, it moves to index - 1 and keeps probing the hash table until it finds a free slot or the matching method.
When there is no collision, a read goes straight to the method stored at the computed index, with no traversal required.
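The probing scheme can be sketched in a self-contained form. This is a simplified model, not the runtime's code: `FakeCache` and `FakeBucket` are hypothetical, the IMP is replaced by an int, and the `next` step (index - 1, wrapping from 0 to mask) is an assumption based on the description above, since the definition of `cache_next` is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct FakeBucket {
    std::uintptr_t key = 0;  // 0 means the slot is free
    int imp = 0;             // stand-in for the IMP
};

struct FakeCache {
    std::vector<FakeBucket> buckets;
    std::uint32_t mask;

    explicit FakeCache(std::uint32_t capacity)
        : buckets(capacity), mask(capacity - 1) {
        assert(capacity > 0 && (capacity & (capacity - 1)) == 0);  // power of two
    }

    // Step to the previous slot, wrapping from 0 back to mask.
    std::uint32_t next(std::uint32_t i) const { return i ? i - 1 : mask; }

    // Returns the slot holding `key`, or the first free slot on its probe path.
    // Returns nullptr if the table is full and the key is absent.
    FakeBucket *find(std::uintptr_t key) {
        assert(key != 0);
        std::uint32_t begin = static_cast<std::uint32_t>(key & mask);
        std::uint32_t i = begin;
        do {
            if (buckets[i].key == 0 || buckets[i].key == key) return &buckets[i];
        } while ((i = next(i)) != begin);
        return nullptr;
    }

    void insert(std::uintptr_t key, int imp) {
        FakeBucket *b = find(key);
        assert(b != nullptr);  // this sketch never grows, so the caller must leave room
        b->key = key;
        b->imp = imp;
    }
};
```

With a capacity of 4, the keys 0x11, 0x15 and 0x19 all hash to index 1, so inserting them in that order places them in slots 1, 0 and 3.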

Expanding the cache hash table

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    bool freeOld = canBeFreed();

    bucket_t *oldBuckets = buckets();
    // Allocate a new hash table
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);

    if (freeOld) {
        // Release the old hash table; its cached entries are discarded
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}
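The key point of reallocate is that the old entries are not copied into the new table. The sketch below models that behavior under stated assumptions: `GrowingCache` and `Slot` are hypothetical, and the initial capacity of 4, the doubling, and the "grow when more than 3/4 full" trigger are assumptions made for illustration, since the code that decides when to call reallocate is not shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Slot {
    std::uintptr_t key = 0;  // 0 means the slot is free
    int imp = 0;             // stand-in for the IMP
};

struct GrowingCache {
    std::vector<Slot> buckets = std::vector<Slot>(4);  // assumed initial capacity
    std::uint32_t mask = 3;                            // capacity - 1
    std::uint32_t occupied = 0;                        // number of cached entries

    // Same shape as reallocate(): swap in an empty table and drop the old entries.
    void reallocate(std::uint32_t newCapacity) {
        assert(newCapacity > 0 && (newCapacity & (newCapacity - 1)) == 0);
        buckets.assign(newCapacity, Slot{});
        mask = newCapacity - 1;
        occupied = 0;
    }

    void insert(std::uintptr_t key, int imp) {
        assert(key != 0);
        std::uint32_t capacity = mask + 1;
        // Assumed policy: double the table once the new entry would pass 3/4 full.
        if (occupied + 1 > capacity / 4 * 3) reallocate(capacity * 2);
        // Probe downwards from the hashed index until a free or matching slot is found.
        std::uint32_t i = static_cast<std::uint32_t>(key & mask);
        while (buckets[i].key != 0 && buckets[i].key != key) i = i ? i - 1 : mask;
        if (buckets[i].key == 0) ++occupied;
        buckets[i].key = key;
        buckets[i].imp = imp;
    }
};
```

Starting from 4 slots, three inserts fit without growing. The fourth insert triggers a reallocation to 8 slots, after which only the newly inserted entry is present; the earlier three have to be cached again the next time they are called.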