当前位置:网站首页>Detailed explanation of dictionary source code in C #
Detailed explanation of dictionary source code in C #
2022-06-21 13:43:00 【Guo Mahua】
Dictionary
The last article introduced HashTable Implementation principle of ,Dictionary Very different .
- Dictionary Use zipper method to solve hash conflict ,HashTable Use Double Hash.
- Dictionary It's a generic type , For value types and reference types key,Hash The comparator is different .
- Dictionary Again resize New random may be used Hash The comparator .
Constructors
Dictionary Two very important arrays are maintained internally , They are the basis for the realization of the zipper method .
private int[]? _buckets;
private Entry[]? _entries;private struct Entry
{
public uint hashCode;
public int next;
public TKey key;
public TValue value;
}
initialization Dictionary You can specify one capacity, But the length of the two arrays is the smallest prime number greater than this value . This is to reduce the probability of hash collision , And HashTable equally , The default is 3.
Zipper method Operating mechanism
The common zipper pictures on the Internet are as follows , In fact, this diagram will bring misunderstanding to friends who have not understood the principle .
First , The following figure shows all the elements on the linked list , It's actually all stored in Entry[]? _entries, In this array .
Observe Entry Structure The implementation of the , It can be found that it uses next Fields to represent hash The next node address subscript on the linked list .

int[]? _buckets Bucket array , Express Hash The beginning of the list . according to Key It's worth it hashcode To determine which bucket to allocate .
It should be noted that , stay Dictionary In the implementation ,_buckets The values saved in the array are from 1 At the beginning , If the chain header node is located in _entries Subscript to be 0, Then the value saved in this bucket is index+1 = 1.
in addition , When a new node is added , Is added to Hash First place of chain , And modify the subscript address pointed to by the bucket .
in addition ,Dictionary In the implementation _freeCount、_freeList、StartOfFreeList The three fields are also very interesting .
At first I thought it meant the remaining capacity of the dictionary , It turned out _freeCount Only in Remove Add... After operation , That's what I understand , Originally, it represents the released space capacity , and _freeList Indicates the last free space address .
actually , Dictionary In a very clever way , Maintains a free space address chain , and _freeList Is the subscript to the address of the last release .
When adding a new element , Will preferentially select from the released addresses , according to _freeList After the address represented is added ,_freeList Will automatically return to the last free address . Note that “ the previous ”, Give priority to the recently released address .
See the following source code for specific implementation .
Remove Realization principle
actually , Through to Hashtable and Dictionary Two object source code learning , I found that only the combination Remove and Add Methods to see , To understand the author's design ideas .
And HashTable equally , Let's start with the relatively simple Remove Method , How is it realized :
public bool Remove(TKey key)
{
if (_buckets != null)
{
Debug.Assert(_entries != null, "entries should be non-null");
uint collisionCount = 0;
uint hashCode = (uint)(_comparer?.GetHashCode(key) ?? key.GetHashCode());
// GetBucket(uint hashCode) => buckets[hashCode % (uint)buckets.Length];
ref int bucket = ref GetBucket(hashCode);
Entry[]? entries = _entries;
int last = -1;
// i Used to represent the element in entries Position in , guess buckets The minimum valid values stored in the array are 1.
int i = bucket - 1;
// i<0 Subscript indicating the current bucket storage bucket by 0, Describe what to remove key non-existent ,return false.
while (i >= 0)
{
ref Entry entry = ref entries[i];
// If the element is found
if (entry.hashCode == hashCode && (_comparer?.Equals(entry.key, key) ?? EqualityComparer<TKey>.Default.Equals(entry.key, key)))
{
if (last < 0)
{
// last<0 Indicates that the element to be removed is the first element in the hash chain ,
// Therefore, you need to update the index value stored in the bucket to The position of the second element + 1;
// This should also prove what we mentioned above ,buckets The minimum valid values stored in the array are 1
bucket = entry.next + 1; // Value in buckets is 1-based
}
else
{
// last>=0 Indicates that the element to be removed is an element at the middle or end of the hash chain ,
// Therefore, there is no need to update the bucket address value , Just add the last element's next The node is set to the... Of the element to be removed next The node can be
entries[last].next = entry.next;
}
// Why should I set... For elements that have been removed next?? This step is very important , See below
entry.next = StartOfFreeList - _freeList;
if (RuntimeHelpers.IsReferenceOrContainsReferences<TKey>())
{
entry.key = default!;
}
if (RuntimeHelpers.IsReferenceOrContainsReferences<TValue>())
{
entry.value = default!;
}
// The free pointer points to i, Indicates that the address is free and available
_freeList = i;
_freeCount++;
return true;
}
// I didn't go into the one above if, That there is hash Chain but no corresponding key,
// The pointer moves down one bit , Keep looking for , When the end is found ,i Will be equal to the -1
last = i;
i = entry.next;
collisionCount++;
if (collisionCount > (uint)entries.Length)
{ ThrowHelper.ThrowInvalidOperationException_ConcurrentOperationsNotSupported();
}
}
}
return false;
}Add Realization principle
According to the above , Combined with the source code discovery :
Assign the new element to _freeList Address to , then _freeList It points to StartOfFreeList - entries[_freeList].next Why? ?
Here we need to combine remove In terms of operation , There is a line of code that I can't understand :
entry.next = StartOfFreeList - _freeList;
Of the space key-value Although it has been released , But it's next The pointer remains , And points to StartOfFreeList - _freeList Why? ?
Notice the two expressions I've bold !!
If now _freeList = 1, And then remove The address dropped is 2 The elements of . So address 2 Situated next The pointer points to StartOfFreeList( Constant for the -3 The constant ) - _freeList, That is to say -4. Next we'll _freeList Adjust to the most recently released address , namely 2.
then Add Operation to add a new element , Of course, it is allocated to 2 Address . So at this time
entries[_freeList].next How many ?-4!
StartOfFreeList - entries[_freeList].next, How many ? 1!
after _freeList It points to StartOfFreeList - entries[_freeList].next, It should be a few ?1!
Equal to say ,Dictionary The released free addresses are strung into a chain ! When allocating new elements , A free address will be taken from the end of the chain , then _freeList Automatically move to the last free address , Become a new chain tail !!!
Think of it here. , I finally got it , Why is the idle pointer called _freeList, Because it is really a chain !
private bool TryInsert(TKey key, TValue value, InsertionBehavior behavior)
{
...
uint hashCode = (uint)((comparer == null) ? key.GetHashCode() : comparer.GetHashCode(key));
// Collision frequency
uint collisionCount = 0;
// according to hashCode The barrel in which the calculation is made , Note that ref
// GetBucket(uint hashCode) => buckets[hashCode % (uint)buckets.Length];
ref int bucket = ref GetBucket(hashCode);
// here i Why bucket - 1 Well ?
int i = bucket - 1;
// here if...else Medium while The main purpose is to detect whether there is a ring in the hash chain ,
// If so update, Will update the element directly and return
// If you add the same key The elements of , It throws an exception
if (comparer == null)
{
// Is not specified comparer Words ,Dictionary Will use the default comparator
// There are value types key and Reference type Key Two cases , The code is highly similar to the following , Look straight down ↓
if (typeof(TKey).IsValueType)
{ ... }
else
{ ... }
}
else
{
while (true)
{
// Be careful : Here is when adding elements , The only way out of the loop , Very skillfully will i Into the uint.
// The dictionary is added for the first time after initialization ,bucket It must be 0,i This is the case -1.
// Because the sign bit is the first bit of binary , A negative number is 1, So turn to uint after , It would be a very large integer .
if ((uint)i >= (uint)entries.Length)
{
break;
}
//Key Equal to the current element , If yes, an exception will be thrown , If yes, change the value and return
if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key))
{
if (behavior == InsertionBehavior.OverwriteExisting)
{
entries[i].value = value;
return true;
}
if (behavior == InsertionBehavior.ThrowOnExisting)
{
ThrowHelper.ThrowAddingDuplicateWithKeyArgumentException(key);
}
return false;
}
i = entries[i].next;
collisionCount++;
if (collisionCount > (uint)entries.Length)
{ ThrowHelper.ThrowInvalidOperationException_ConcurrentOperationsNotSupported();
}
}
}
// Be careful : Although the dictionary capacity after initialization is 3, But this time _freeCount The value is 0, and _freeList Then for -1
// After my analysis :_freeCount Express the classics remove Number of spaces freed ,
// _freeList It means remove Free space address .
// combination remove You'll find that , The design here is very clever .
int index;
if (_freeCount > 0)
{
index = _freeList;
Debug.Assert((StartOfFreeList - entries[_freeList].next) >= -1, "shouldn't overflow because `next` cannot underflow");
_freeList = StartOfFreeList - entries[_freeList].next;
_freeCount--;
}
else
{
int count = _count;
if (count == entries.Length)
{
Resize();
bucket = ref GetBucket(hashCode);
}
index = count;
_count = count + 1;
entries = _entries;
}
// bucket Is used to record the starting address of the hash chain ,
// from entry.next = bucket - 1; The operation shows that the hash chain will add a new element to the header ,
// So the last of the hash chain next The value pointed to by the pointer must be -1
// and bucket = index + 1; explain bucket The minimum valid value in is 1
ref Entry entry = ref entries![index];
entry.hashCode = hashCode;
entry.next = bucket - 1;
entry.key = key;
entry.value = value;
bucket = index + 1;
_version++;
if (!typeof(TKey).IsValueType && collisionCount > HashHelpers.HashCollisionThreshold && comparer is NonRandomizedStringEqualityComparer)
{
Resize(entries.Length, true);
}
return true;
}
Capacity expansion
Dictionary Capacity expansion and HashTable The expansion of is similar to , Are based on Current capacity ×2 Take the minimum prime number greater than this value .
however ,Dictionary Whether to force refresh during capacity expansion HashCode The option to ( For reference types only Key Work ), If forceNewHashCodes by True, A new one will be randomly generated IEqualityComparer, Used to generate new HashCode.
Indexer lookup
Indexer lookup and Remove Find something similar to , No more details here .
边栏推荐
- Turn to the countdown for coupon issuance! First look at the rules of interstellar pocket donation
- ###数据库的高可用配置(mysql)
- Chapter IX Cisco ASA application nat
- C#&. Net to implement a distributed event bus from 0 (1)
- What is software testing?
- 哪个期货平台 交易更安全放心。求推荐。
- Apache shardingsphere 5.1.2 release | new driving API + cloud native deployment to create a high-performance data gateway
- 如何使用搜索引擎?
- 应用配置管理,基础原理分析
- 3D slicer saves segmentation results
猜你喜欢

Must the database primary key be self incremented? What scenarios do not suggest self augmentation?

###数据库的高可用配置(mysql)

Lamp Architecture 3 -- compilation and use of PHP source code

Navigation bar switching, message board, text box losing focus

C language -- program compilation and linking

Automatic operation and maintenance 4 - variables and encryption in ansible
![[deeply understand tcapulusdb technology] tmonitor system upgrade](/img/ec/bbfb7e2f19a94b69ec0a6092fd3032.png)
[deeply understand tcapulusdb technology] tmonitor system upgrade

Highly available configuration of database (MySQL)

Turn to the countdown for coupon issuance! First look at the rules of interstellar pocket donation

SCCM creates a client collection based on the installed app and periodically pushes application updates
随机推荐
Lamp architecture 5 - MySQL Cluster and master-slave structure
May the mountains and rivers be safe
7. pointer
Kubernetes快速實戰與核心原理剖析
Artifacial Intelligent Project
服务治理的工作内容
Tomorrow's interview, I can't sleep in the middle of the night to review the bottom implementation of STL
Quelle plate - forme à terme est plus sûre. Je vous en prie.
MySQL - data type
2. reference
Questions and answers No. 43: application performance probe monitoring principle PHP probe
Application configuration management, basic principle analysis
Sort query results according to the input order of fuzzy query jancode
Heat mapping using Seaborn
Distributed transactions, simple in principle, are all pits in writing
如何使用搜索引擎?
Alibaba cloud link tracking is on the Net project (Jaeger trace)
Work content of service governance
[course assignment] floating point operation analysis and precision improvement
A blazor webassembly application that can automatically generate page components based on objects or types