当前位置:网站首页>[game theory] basic knowledge

[game theory] basic knowledge

2022-06-23 07:14:00 Meet the demon king

【 Game theory 】 Basic knowledge of

1 Strategic game

Strategic game also called Static game , It is once game . Strategic game G G G Formalized as :
G = { N , { A i } i = 1 N , { u i } i = 1 N } G=\left\{N,\left\{A_{i}\right\}_{i=1}^{N},\left\{u_{i}\right\}_{i=1}^{N}\right\} G={ N,{ Ai}i=1N,{ ui}i=1N}
among :

  • N N N: Player set , A limited set of all players ;
  • A i A_{i} Ai: The player i i i The strategy set of : It means the player i i i A collection of selectable policies ;
  • u i u_i ui: The player i i i The return function of : A 1 × A 2 × … A N → R A_{1} \times A_{2} \times \ldots A_{N} \rightarrow R A1×A2×ANR Represents the benefits under a set of strategies .

Complete information static game It has the following characteristics :

  • Non cooperative game
  • Act at the same time (simultaneous move): All players make strategic choices at the same time . I don't know the strategy choice of other players when I choose the strategy
  • Full information (complete information): Each player's strategy and revenue function are the common knowledge of all players (common knowledge)

Incomplete information static game ( Also known as Static Bayesian game ) It has the following characteristics :

  • Non cooperative game
  • Act at the same time (simultaneous move): All players make strategic choices at the same time . I don't know the strategy choice of other players when I choose the strategy
  • Incomplete information (incomplete information): At least one player does not know exactly the payoff function of another player .

With Prisoner's dilemma For example :

Complete information static game :

Two suspect were arrested , To different cells , But the police don't have enough evidence , Two suspects were told :

  1. If neither side confesses , Were sentenced to one month ;
  2. If both sides confess , Were sentenced to six months ;
  3. If one party confesses and the other does not , Then the Confessor releases , The party who did not confess was sentenced to nine months .

A bivariate matrix can be used to represent the benefits of both :

 Insert picture description here

Incomplete information static game :

Two suspect were arrested , To different cells , But the police don't have enough evidence , Two suspects were told :

  1. If neither side confesses , Were sentenced to one month ;
  2. If both sides confess , Were sentenced to six months ;
  3. If one party confesses and the other does not , Then the Confessor releases , The party who did not confess was sentenced to nine months .

besides , There are additional things to note :

  1. Prisoner 1 Always rational , That is, selfish
  2. Prisoner 2 May be rational , It may also be altruistic , It depends on his mood
  3. When Prisoner 2 It's altruistic , Then he is not so frank , He thinks confession equals “ An extra four months in prison ”
  4. Prisoner 1 Unable to judge accurately Prisoner 2 Is it selfish or altruistic , But he can infer Prisoner 2 The probability of rationality is 0.8, The probability of altruism is 0.2.

 Insert picture description here

2 Extended game

Extended game , Also known as Dynamic game , It corresponds to the strategic game . In an extended game , Players take turns making decisions , Usually available Game tree Depict it .

The game tree consists of node (node) and edge (edge) form , Corresponding to the game player 、 Strategy and benefits .

  1. Non leaf node : representative Game player , It indicates which game player makes a decision at this time . Each non leaf node has only one game player .
  2. leaf node : Represents each player's... At this time earnings . Benefits only exist in leaf nodes .
  3. edge : Express Strategy

 Insert picture description here

Extended game G G G Formalized as :
G = { N , H , P , { u i } } G=\left\{N, H, P,\left\{u_{i}\right\}\right\} G={ N,H,P,{ ui}}
among :

  • N N N Gather for players ;
  • H H H by “ Sequence of policies ” A collection of components , It can be a finite set or an infinite set . H H H Is called history (history). The nature includes :
    • ∅ ∈ H \emptyset \in H H, Represents the root node of the game tree .
    • If the policy sequence a 1 a 2 … a k ∈ H a^{1} a^{2} \ldots a^{k} \in H a1a2akH also s < k s<k s<k, that a 1 a 2 … a s ∈ H a^{1} a^{2} \ldots a^{s} \in H a1a2asH
    • If the infinite policy sequence ( a k ) k = 1 ∞ \left(a^{k}\right)_{k=1}^{\infty} (ak)k=1 Contentment for any k k k, a 1 a 2 … a k ∈ H a^{1} a^{2} \ldots a^{k} \in H a1a2akH, that ( a k ) k = 1 ∞ ∈ H \left(a^{k}\right)_{k=1}^{\infty} \in H (ak)k=1H
    • Each historical sequence corresponds to a node of the game tree , Corresponding to the node reached at the end of the historical sequence
    • In the complete information extended game , History set size = Number of nodes
    • Final history collection (Terminal history set): Z Z Z = {All Terminal history}, On these nodes are earnings
  • P P P For the game player function (Player Function):
    • P : H \ Z → N P: H \backslash Z \rightarrow N P:H\ZN Assign a player set to each non - ending history N N N An element in
    • P ( h ) P(h) P(h) In history h h h after , Which player makes the decision
  • u i u_i ui Is the income function , It means the first one i i i Revenue from players

3 Nash equilibrium

Go on up Complete information static game An example of the prisoner's dilemma . Let's stand before the prisoner 1 From the perspective of :

  • If 2 Frankly , I choose to confess , Then the income is -6; If I don't choose to confess , Then the income is -9. So I will choose to confess ;
  • If 2 No, No , I choose to confess , Then the income is 0; If I don't choose to confess , Then the income is -9, So I will choose to confess .

meanwhile , The prisoner 2 I think so . So both will confess .

Optimal response : When the opponent's strategy is chosen , Players will adjust their strategies , Make your income the largest among several strategic choices .

Nash equilibrium : Any player changes his strategy in this strategy combination ( The strategies of other players remain the same ) Will not improve their own income .

That is, every player's strategy is The best response When , A stable situation will be formed , That is to achieve Nash equilibrium .

The formal definition of Nash equilibrium is as follows :

Nash equilibrium is the result of game a ∗ = ( a 1 ∗ , a 2 ∗ , … , a N ∗ ) a^{*}=\left(a_{1}^{*}, a_{2}^{*}, \ldots, a_{N}^{*}\right) a=(a1,a2,,aN), That is, for each player i i i There are :
u i ( a i ∗ , a − i ∗ ) ≥ u i ( a i , a − i ∗ ) u_{i}\left(a_{i}^{*}, a_{-i}^{*}\right) \geq u_{i}\left(a_{i}, a_{-i}^{*}\right) ui(ai,ai)ui(ai,ai)
therefore , We can By finding the game result that satisfies the best response of all people at the same time , To solve the Nash equilibrium .

Reference

https://zhuanlan.zhihu.com/p/148407108

https://zhiqianghe.blog.csdn.net/article/details/107330041

https://www.docin.com/p-2590113104.html

https://zhuanlan.zhihu.com/p/199047997

原网站

版权声明
本文为[Meet the demon king]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/174/202206230624541702.html