
Kubernetes Interview Cram Notes


Contents

1. k8s architecture and the function of each component
2. What is the difference between the Pods brought up by Deployments and StatefulSets, and how does deletion differ?
3. Pod rolling update strategies
4. Pod network plugins (flannel, openvswitch, calico); why use an overlay network; overlay vs. underlay
    1. Flannel
        1) VxLAN
        2) host-gw
        3) UDP
    2. Calico
        Calico workflow
        Calico's two network modes
        Calico advantages and disadvantages
    3. Open vSwitch
        OVS framework
        The docker0 network
        Network division
5. Pod startup process
6. Pod deletion process
7. Graceful shutdown (voluntary exit)
8. Readiness probe vs. liveness probe
9. The informer mechanism
    The List-Watch mechanism
    The informer module
    How Watch is implemented
10. Pod lifecycle with hooks
11. k8s declarative API
12. CRI, CNI, CSI
    CRI (Container Runtime Interface): container runtime interface, providing compute resources
    CNI (Container Network Interface): container network interface, providing network resources
    CSI (Container Storage Interface): container storage interface, providing storage resources
13. QoS classes and how the system scores them
14. Scheduling strategies
15. Why design the Pod abstraction, and what are its benefits
16. CRD
17. Operator programming practice
18. Ingress (Router)
19. The difference between iptables and ipvs
20. Access control
    ServiceAccount
    RBAC (role-based access control)
    Role
    ClusterRole
    RoleBinding and ClusterRoleBinding

1. k8s architecture and the function of each component

The master node consists mainly of the API Server, Scheduler, Controller Manager, and etcd.

Worker nodes run the kubelet, the kube-proxy module, and the Pod objects themselves.

API Server: the hub for data exchange and communication between all other modules. Every other module queries or modifies data through the API Server; only the API Server interacts with etcd directly.

Scheduler: responsible for scheduling resources within the cluster, acting as its "control room". It receives Pod-creation tasks from kube-apiserver, finds all Nodes that satisfy the Pod's requirements using a predicate (pre-selection) strategy followed by a priority (preference) strategy, runs the scheduling logic, and on success binds the Pod to the target node.

Predicate strategy:
  1. Exclude Nodes whose resources cannot satisfy the Pod's resource constraints.
  2. Check host-level requirements, e.g. whether the Pod must share the host's network namespace: if a Pod shares the host network namespace and needs port 80, any Node whose port 80 is already occupied does not qualify and is excluded.
Priority strategy:
  This phase feeds the relevant attributes of the pre-selected nodes into a series of functions, computes a priority score for each node, and sorts in descending order; the highest-scoring node is chosen to run the Pod. If several Nodes end up with the same score after the calculation, one of them is picked at random as the Node that should run the Pod.

Controller Manager: in charge of cluster resource objects such as Node, Namespace, Service, Token, and Replication, keeping the cluster's resource objects in their expected working state. Each controller monitors the state of its resource objects in real time through the RESTful interface provided by the api-server; when a failure changes an object's working state, the controller intervenes and tries to restore the object from its current state to the expected working state. Common controllers include the Namespace Controller, Node Controller, Service Controller, ServiceAccount Controller, Token Controller, ResourceQuota Controller, and Replication Controller.

etcd: in a kubernetes cluster, etcd stores data and propagates change notifications; the cluster's key data is kept in etcd.

Pod: the most basic unit of execution in Kubernetes. A Pod represents a process running in the cluster and encapsulates one or more closely related containers.

kubelet: runs on every compute node. Through the interface provided by the api-server, the kubelet watches the desired state of its Pods and calls the corresponding interfaces to realize that state. It also monitors the Pods assigned to its Node, periodically collects container status and reports it to the api-server, which in turn notifies the other components. The kubelet is also responsible for image and container cleanup.

kube-proxy: runs as a daemon on every node and uses a watch to monitor the latest Pod state recorded in etcd. As soon as it detects that a Pod resource has been deleted or created, or that its IP has changed, it immediately reflects those changes in the iptables or ipvs rules, so that when a later request reaches a service, the request is forwarded according to the policy kube-proxy has installed, achieving load balancing.

2. What is the difference between the Pods brought up by Deployments and StatefulSets, and how does deletion differ?
Deployments deploy stateless services.

StatefulSets deploy stateful services.

In a StatefulSet, Pods are deployed in order by ordinal index {0 … N-1}.

Pods in a StatefulSet have a unique ordinal index and a stable network identity.

Deletion proceeds in reverse ordinal order, one Pod at a time; the next Pod is deleted only after the previous one has shut down completely. (A sketch follows.)
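A minimal StatefulSet sketch illustrating this ordered, stable-identity behaviour (the names and image are illustrative, not from the original article):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"       # headless Service that gives each Pod a stable DNS name
  replicas: 3                # creates web-0, web-1, web-2 strictly in order
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21

Scaling down removes web-2 first, then web-1, matching the reverse-order deletion described above.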

3. Pod rolling update strategies
In a Deployment's YAML definition file, the spec.strategy.type field sets the Pod update strategy. It has two possible values:

RollingUpdate (the default): create new Pods step by step while deleting old Pods, replacing old Pods with new ones.

Recreate: before any new Pod is created, all old Pods must terminate completely.

With the RollingUpdate strategy, two options let you fine-tune the update process:

maxSurge: during the update, the maximum number of Pods allowed above the count defined in the desired state.

maxUnavailable: during the update, the maximum number of Pods that may be unavailable.

A fully deployed Pod is marked Ready, a Pod being created is marked NotReady, and a Pod being deleted is marked Terminating.
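A sketch of where these fields sit in a Deployment spec (the replica count and values are illustrative):

spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most 4+1=5 Pods may exist during the update
      maxUnavailable: 1    # at least 4-1=3 Pods stay available during the update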

4. Pod network plugins (flannel, openvswitch, calico); why use an overlay network; overlay vs. underlay
K8s networking falls roughly into two categories: overlay networks and direct routing.

An overlay network wraps the original packet in an extra layer-3/layer-4 protocol (e.g. a UDP encapsulation) and routes and forwards it over the host network.

Direct routing instead works by programming the routing table with the next-hop IP address.

Pod networking on a Kubernetes system is implemented by third-party plugins; the three common ones are Flannel, Calico, and Open vSwitch.

1. Flannel
flannel has three working modes; the default is VXLAN. flanneld relies on etcd to guarantee that IP allocation does not conflict across the cluster.

1) VxLAN:
        VXLAN is a network virtualization technology supported natively by the Linux kernel (it is a kernel module). Encapsulation and decapsulation happen in kernel space, building an overlay network: effectively a virtual layer-2 network formed by the flannel.1 devices on each host.

2) host-gw:
        In this mode, the Pods on a host are interconnected through a virtual bridge, and the host's physical NIC acts as the gateway. To reach a Pod on another Node, the packet is sent to the host's physical NIC; the host consults its local routing table and forwards the packet, achieving cross-host Pod communication. The problems with this mode: when the k8s cluster is very large, the routing tables on the hosts become very large; and all Nodes must sit in the same layer-2 network, otherwise the routes cannot be forwarded.

3) UDP:
        The core is the TUN device flannel0 (a TUN device is a virtual layer-3 network device whose function is to pass IP packets between the operating-system kernel and a user-space application).

Compared with direct host-to-host communication, every packet takes an extra trip through flanneld. In this process, using the flannel0 TUN device, merely sending one IP packet involves three data copies between user space and kernel space (and Linux context switches are expensive), so performance is very poor.
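The backend mode is chosen in flannel's net-conf.json; a sketch of the kube-flannel ConfigMap as shipped in flannel's reference manifest, assuming the conventional 10.244.0.0/16 Pod network:

kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
data:
  # "Type" selects the mode: "vxlan" (default), "host-gw", or "udp"
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }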

2. Calico
calico's basic components are etcd, felix, the BGP client, and the BGP route reflector.

Etcd: a distributed key-value store, mainly responsible for the consistency of network metadata.

Felix: runs on every node and is mainly responsible for configuring routes and ACLs (packet filtering) to ensure endpoint connectivity.

BGP route reflector: mainly responsible for distributing the routing information that Felix writes into the kernel to the rest of the Calico network.

Calico workflow
Felix watches the etcd store and receives events from it. For example, when a user creates a pod on a machine, Felix sets up its NIC, IP, and MAC, then writes a line into the kernel routing table marking that this IP goes out through that interface. The route is then propagated to the other hosts via the standard BGP routing protocol, so the rest of the network knows where this IP lives and routes traffic here accordingly.

Calico's two network modes
1) IPIP
Taken literally, an IP packet nested inside another IP packet: a tunnel from the IP layer to the IP layer, essentially a bridge at the IP layer. An ordinary bridge operates at the MAC layer and needs no IP, whereas ipip builds a tunnel between the routers at both ends, connecting two otherwise separate networks point-to-point. The ipip source lives in the kernel at net/ipv4/ipip.c.

2) BGP
The Border Gateway Protocol (BGP) is the core decentralized autonomous routing protocol of the Internet. It achieves reachability between autonomous systems (AS) by maintaining IP routing ("prefix") tables, and it is a vector routing protocol. BGP does not use the metrics of a traditional Interior Gateway Protocol (IGP); it decides routes based on paths, network policies, or rule sets, so it is better described as a path-vector protocol than a conventional routing protocol. In data-center parlance, BGP usually means merging multiple uplinks into a machine room (telecom, Unicom, mobile, etc.) into one, achieving multi-line single-IP. The advantages of a BGP machine room: the server needs only one IP address, the optimal access route is chosen by the backbone routers based on hop count and other technical metrics, and none of the server's own system resources are consumed.

Calico advantages and disadvantages
IPIP networking: for scenarios where the communicating Pods are on different subnets; the outer IP encapsulation solves cross-subnet routing. Traffic has to be encapsulated by the tunl0 device, which costs some efficiency.
BGP networking: for Pods on the same subnet; native host-gw style routing, efficient. However, the large number of iptables rules brings complexity and is hard to debug, and there is some performance loss.
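The mode is selected per address pool; a sketch using Calico's v3 IPPool resource, applied with calicoctl (the pool name and CIDR are illustrative):

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: Always         # "Never" means pure BGP; "CrossSubnet" tunnels only across subnets
  natOutgoing: true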

3. Open vSwitch
Open vSwitch, abbreviated OVS, is a high-quality, multi-layer virtual switch. It is designed to support large-scale network automation through programmatic extension, while still supporting standard management interfaces and protocols.

Viewed simply, OVS consists of three parts:

ovsdb-server: the OVS database server, which stores the virtual switch's configuration. It exchanges information with the manager and with ovs-vswitchd over OVSDB (JSON-RPC).

ovs-vswitchd: the core OVS component. It talks to the upper-layer controller via the OpenFlow protocol, to ovsdb-server via the OVSDB protocol, and to the kernel module via netlink. It supports multiple independent datapaths (bridges) and implements binding and VLANs by manipulating flow tables.

ovs kernel module: the OVS kernel module, which handles packet switching and tunneling and caches flows. If a forwarding rule is found in the kernel cache the packet is forwarded directly; otherwise it is passed up to user space for handling.

OVS framework
In a k8s/Docker scenario, the main job is to build L3-to-L3 tunnels. First, to avoid address conflicts between the docker0 bridges Docker creates, the docker0 address ranges of the Nodes must be manually configured and kept disjoint.
Next, create an Open vSwitch bridge (ovs) and use the ovs-vsctl command to add a GRE port to the ovs bridge, setting the remote IP of the GRE port to the peer Node's IP address. This must be done for every peer IP (for a large network, an automation script is needed).
Finally, add the ovs bridge as a network interface to the Docker bridge, restart the ovs bridge and the Docker bridge, and add a routing rule for each peer Docker address range pointing at the Docker bridge. The networks of the two sets of containers are then connected.

When an application in a container accesses the address of another container, the packet is sent through the container's default route to the docker0 bridge. Since the ovs bridge exists as a port on the docker0 bridge, the data is handed to the ovs bridge. Over the GRE/VxLAN tunnels that ovs has established to the other ovs bridges, the data naturally reaches the peer Node and is delivered to its docker0 bridge and Pods.

This mode reuses docker's own network, with containers started through docker.

A POD created through openshift, by contrast, is attached directly to OVS's br0 bridge.

The docker0 network
When Docker starts, it automatically creates a docker0 bridge on the host. It is a real Linux bridge; every container started without an explicit network mode in docker run is attached to the docker0 bridge. This is how containers can talk to the host and even to other containers.

Each time a docker container starts, docker assigns it an IP. Simply installing docker gives you the docker0 NIC in bridge mode; the underlying technique is veth-pair: each started container adds one more (virtual) NIC on the host.

Reference: "Docker0 网络详解", Bertram's blog on CSDN.

The docker0 bridge defaults to the 172.17.0.0/16 network segment.

Use docker network ls to list docker's networks.

Network division
The openshift installation sets two parameters in /etc/ansible/hosts:

osm_cluster_network_cidr=10.1.0.0/16

osm_host_subnet_length=9 (if not explicitly defined, the default is 9)

osm_cluster_network_cidr defines the SDN network CIDR used to allocate the POD network.

osm_host_subnet_length defines the length of the subnet the SDN allocates to each host.

With the two parameter values above:

Allocatable POD IPs: 10.1.0.1-10.1.255.254, i.e. 2^(32-16)-2 = 65534 usable IPs.

Allocatable subnets (the number of cluster nodes) = 2^(32-16-9) = 128.

POD IPs allocatable per node = 2^9 = 512.

openshift_portal_net: k8s Services are created in this subnet; the default is 172.30.0.0/16.

Reference: "4.2. Configuring cluster variables", OpenShift Container Platform 3.11, Red Hat Customer Portal.

In openshift, the command oc get hostsubnet shows the network segment assigned to each node.

The kubelet calls the k8s API to obtain the node's network segment, then passes the parameters to the CNI interface; CNI hands off to IPAM, IPAM returns an IP to CNI via the host-local plugin, and CNI then configures the IP on the POD.

The IPs already allocated on a node can be inspected under the /var/lib/cni/networks/openshift-sdn directory.

5. Pod startup process
1. The client submits a creation request, either through the API Server's RESTful API or with the kubectl command-line tool. Supported payload formats are JSON and YAML.

2. The API Server processes the request and stores the Pod data in etcd.

3. The scheduler looks for unbound Pods through the API Server and tries to assign a host to each Pod.

4. Host filtering (scheduling predicates): the scheduler filters out non-conforming hosts with a set of rules. For example, if the Pod specifies required resource amounts, hosts with fewer available resources than the Pod requires are filtered out.

5. Host scoring (scheduling priorities): the hosts that passed the first step are scored. At this stage the scheduler applies global optimization strategies, such as spreading the replicas of a Replication Controller across different hosts and preferring the least-loaded host.

6. Host selection: the highest-scoring host is chosen and the binding operation is performed; the result is stored in etcd.

7. The kubelet performs the Pod creation according to the scheduling result: after a successful binding, the scheduler calls the API Server's API to create a bound-pod object in etcd, describing all the pods bound to run on a worker node. The kubelet on each worker node periodically synchronizes the bound-pod information from etcd; once it finds a bound-pod object it has not yet realized, it calls the Docker API to create and start the pod's containers.

The kubelet watches the etcd directory through the API Server and synchronizes the pod list. If it finds a new pod bound to its node, it creates the pod as the pod list requires; if it finds a pod has been updated, it makes the corresponding changes. After reading the pod information, if the task is to create or modify a pod, it does the following:
1. Create a data directory for the pod.
2. Read the pod manifest from the API Server.
3. Mount the external volumes for the pod.
4. Download the Secrets the pod needs.
5. Check the pods already running on the node; if this pod has no container or its pause container is not started, first stop all container processes in the pod.
6. Create a container from the pause image for each pod; this container carries the network of all the other containers in the Pod.
7. For each container in the pod: compute a hash for the container, then look up the hash of the docker container with that name. If a container is found but the hashes differ, stop the container process in docker along with its associated pause container; if the hashes match, do nothing. If a container has terminated and no restart policy applies, do nothing. Otherwise call the docker client to pull the container image and start the container.

Reference: "图解 kubernetes Pod 创建流程大揭秘", Kubernetes 中文社区.

6. Pod deletion process
The general flow is described in "kubernetes 源码分析 — pod 删除流程", hahachenchen789's blog on CSDN.

7. Graceful shutdown (voluntary exit)
        Graceful shutdown is a term borrowed from operating systems: let the OS do some cleanup before shutting down, as opposed to a hard shutdown, such as pulling the power plug.

        In a distributed system, a graceful stop concerns not just the process itself on a single machine; it usually has to be coordinated with other components in the system. Say we start a microservice and the gateway routes part of the traffic to us. Then:

If we simply kill the process, that slice of traffic cannot be handled correctly and some users are affected. Not fatal, though: the gateway or service registry usually keeps a heartbeat with our service, and after a heartbeat timeout it removes the service automatically, so the problem resolves itself. That is a hard stop; a well-written system can heal itself, but there is still some jitter and even some errors.
If instead we first tell the gateway or service registry that we are going offline, wait for it to finish removing the service, and only then stop the process, no traffic is affected at all. That is a graceful stop: it minimizes the impact of a single component's start and stop on the whole system.
        By convention, SIGKILL is the hard-stop signal and SIGTERM is the signal that tells a process to exit gracefully, so many microservice frameworks listen for SIGTERM and, on receiving it, deregister and clean up, achieving a graceful exit.

        Beyond taking the Pod out of its k8s Service and the graceful exit inside the process, we may need extra work, such as deregistering from a service registry outside k8s. That is what the PreStop hook is for. k8s currently provides two kinds of PreStop hook, Exec and HTTP; in practice they are configured per container through the Pod's .spec.containers[].lifecycle.preStop field, for example:

spec:
  containers:
  - name: my-awesome-container
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh","-c","/pre-stop.sh"]
We can put our own cleanup logic in the /pre-stop.sh script. The overall sequence is roughly:

1. The user deletes the Pod.
2. The Pod enters the Terminating state.
3. At the same time, k8s removes the Pod from the corresponding Service.
4. At the same time, for containers with a preStop hook, the kubelet calls the hook in each container; if a preStop hook runs past the grace period (default 30 seconds), the kubelet sends SIGTERM and waits another 2 seconds.
5. At the same time, for containers without a preStop hook, the kubelet sends SIGTERM immediately.
6. Once the grace period has expired, the kubelet sends SIGKILL to kill any container that has not yet exited.
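The grace period itself is tunable per Pod; a minimal sketch (the field name is real, the value is only an example):

spec:
  terminationGracePeriodSeconds: 60   # default is 30; raise it if preStop cleanup needs longer
  containers:
  - name: my-awesome-container
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh","-c","/pre-stop.sh"]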
8. Readiness probe vs. liveness probe
Readiness probe: decides whether to send traffic to the POD.

Liveness probe: decides whether to restart the POD.
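A sketch of both probes on one container (the paths, port, and timings are illustrative):

spec:
  containers:
  - name: my-app
    image: my-app:1.0
    readinessProbe:           # failure only removes the Pod from Service endpoints
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:            # failure restarts the container
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20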

9. The informer mechanism
etcd stores the cluster's data, and the apiserver is the single entry point: any operation on the data must go through the apiserver. Clients (kubelet / scheduler / controller-manager) use list-watch against the apiserver to monitor create, update, and delete events on resources (pod / rs / rc, etc.) and invoke the matching event-handling function for each event type.

The List-Watch mechanism
So what exactly is list-watch? As the name suggests, it has two parts: list and watch. List is easy to understand: call the resource's list API to enumerate resources, implemented over short-lived HTTP connections. Watch calls the resource's watch API to listen for resource change events, implemented over a long-lived HTTP connection.

For example, the List API: GET /api/v1/pods

and the Watch API: GET /api/v1/watch/pods

The informer module
The K8S informer module encapsulates the list-watch API; users only specify the resource and write the event handlers (AddFunc, UpdateFunc, DeleteFunc, etc.). The informer first lists the resources via the list API, then calls the watch API to listen for resource change events, and puts the results into a FIFO queue; at the other end of the queue a process takes events out and calls the corresponding registered handler. The informer also maintains a read-only map as a cache, mainly to improve query efficiency and reduce the load on the apiserver.

How Watch is implemented
Watch receives the resource change events pushed by the apiserver over a long-lived HTTP connection, implemented mainly with chunked transfer encoding.

When a client calls the watch API, the apiserver sets Transfer-Encoding to chunked in the response HTTP headers, indicating chunked transfer encoding. The client keeps the connection to the server open and waits for the next data chunk, i.e. the next resource event. For example:

$ curl -i http://{kube-api-server-ip}:8080/api/v1/watch/pods?watch=yes
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Date: Thu, 02 Jan 2019 20:22:59 GMT

{"type":"ADDED", "object":{"kind":"Pod","apiVersion":"v1",...}}
{"type":"ADDED", "object":{"kind":"Pod","apiVersion":"v1",...}}
{"type":"MODIFIED", "object":{"kind":"Pod","apiVersion":"v1",...}}
Reference: "理解 K8S 的设计精髓之 List-Watch 机制和 Informer 模块", Zhihu.

10. Pod lifecycle with hooks
k8s provides lifecycle hooks, i.e. Pod Hooks. They are initiated by the kubelet when a process in a container starts or terminates, and they are part of the container lifecycle.

Kubernetes provides two hook functions:

PostStart: executed immediately after the container is created. However, there is no guarantee that the hook runs before the container's ENTRYPOINT, and no parameters are passed to the handler. It is mainly used for resource preparation, environment setup, and so on. Note that if the hook runs too long or hangs, the container cannot reach the Running state. (In short: PostStart runs after the container starts, but its ordering relative to the container's ENTRYPOINT is undefined.)
PreStop: called immediately before the container terminates. It is blocking, i.e. synchronous, so it must complete before the call to delete the container is issued. It is mainly used to shut down an application gracefully, notify other systems, and so on. If the hook hangs during execution, the Pod stays in the Running phase and never reaches Failed. PreStop runs before the container is terminated, in a blocking manner; only after it finishes does the kubelet actually start destroying the container.
If a PostStart or PreStop hook fails, the container is killed, so hook functions should be kept as lightweight as possible. Of course, in some cases long-running commands make sense, such as saving state before the container stops.

A configuration sketch showing both hooks (the postStart command is an illustrative addition; the preStop part is from the original):

spec:
  containers:
  - name: my-awesome-container
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh","-c","echo started > /tmp/started"]   # illustrative
      preStop:
        exec:
          command: ["/bin/sh","-c","/pre-stop.sh"]
11. k8s declarative API
The usual way to work with Kubernetes API objects is to write a YAML file for the object and submit it to Kubernetes (rather than operating on the API with a sequence of commands). "Declarative" means you only submit a defined API object as a "declaration" (the YAML file is in effect that declaration), stating what the desired end state looks like. Submitting commands one by one to direct, step by step, how to reach the desired state would be "imperative".

An API object's complete resource path in etcd is made up of three parts: Group (API group), Version (API version), and Resource (API resource type). For example, a Deployment lives under group apps, version v1, resource deployments.

12. CRI, CNI, CSI

CRI (Container Runtime Interface): the container runtime interface, providing compute resources.
CRI defines interfaces for the container and image services. Because the container runtime and the image lifecycle are isolated from each other, two services need to be defined:

RuntimeService: runtime management for containers and Sandboxes.
ImageService: RPCs for pulling, inspecting, and removing images from an image registry.

CNI (Container Network Interface): the container network interface, providing network resources.
CNI is a CNCF project consisting of a specification and libraries for configuring the network interfaces of Linux containers, plus a number of plugins. CNI is concerned only with allocating the network when a container is created and releasing the network resources when the container is deleted.

The CNI interface includes the following methods: add network, delete network, add network list, delete network list.

type CNI interface {
    AddNetworkList(net *NetworkConfigList, rt *RuntimeConf) (types.Result, error)
    DelNetworkList(net *NetworkConfigList, rt *RuntimeConf) error

    AddNetwork(net *NetworkConfig, rt *RuntimeConf) (types.Result, error)
    DelNetwork(net *NetworkConfig, rt *RuntimeConf) error
}

CSI (Container Storage Interface): the container storage interface, providing storage resources.
K8S abstracts its interface to external storage components into CSI, served over gRPC. Third-party storage vendors can release and deploy public storage plugins without touching the K8S core code.

13. QoS classes and how the system scores them
QoS (Quality of Service) is usually translated as "service quality class", sometimes as "service quality guarantee". When k8s creates a Pod, it assigns the Pod a QoS class, which is one of the following:

Guaranteed: every container in the Pod must have memory and CPU limits and requests, and the values must be equal. If a container specifies only a limit and no request, the request is set equal to the limit.

Burstable: at least one container in the Pod has a memory or CPU request, but the Guaranteed requirements are not met, i.e. the memory/CPU values are set differently.

BestEffort: no container may have any memory or CPU limit or request.

The class is not set through a configuration item; it is derived from the configured CPU/memory limits and requests values, as the sketch below shows.
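For example, a Pod whose container sets equal requests and limits lands in the Guaranteed class (the values are illustrative):

spec:
  containers:
  - name: app
    image: app:1.0
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "500m"        # equal to the request, so the Pod is Guaranteed
        memory: "256Mi"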

Priority of the three QoS classes, from high to low: Guaranteed --> Burstable --> BestEffort.

Kubernetes reclamation strategy: when the cluster detects that a node's memory or CPU is running out, it starts the resource reclamation strategy to keep the node working normally, evicting Pods from the node to reduce resource consumption.

Eviction priority of the three QoS classes, from high to low (left to right):

BestEffort --> Burstable --> Guaranteed

14. Scheduling strategies
After the API Server accepts a client's Pod-creation request, an important subsequent step is for the scheduler program, kube-scheduler, to pick the best available node in the current cluster to receive and run it; by default kube-scheduler performs this task. For each Pod object, scheduling usually proceeds in two phases, filtering then scoring: the filtering phase weeds out the Nodes that do not satisfy the scheduling rules, and the scoring phase builds on it, scoring each remaining Node; the higher a Node's score, the higher the probability it is chosen.

Besides the system's default kube-scheduler behaviour, Pod placement can be influenced in the following ways (see the sketch after the reference below):

nodeName (directly specifying the Node host name)
nodeSelector (node selector: label the Node, then have the Pod select labeled Nodes via nodeSelector)
Taints and tolerations
NodeAffinity (node affinity)
PodAffinity (Pod affinity)
PodAntiAffinity (Pod anti-affinity)

Reference: "史上最全的 Pod 调度策略", Zhihu.
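A sketch combining two of these mechanisms; the label and the taint key/value are hypothetical:

spec:
  nodeSelector:
    disktype: ssd            # schedule only onto Nodes labeled disktype=ssd
  tolerations:
  - key: "dedicated"         # tolerate the taint dedicated=gpu:NoSchedule
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  containers:
  - name: app
    image: app:1.0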

15. Why design the Pod abstraction, and what are its benefits
        Imagine this scenario: we use docker to start one container running tomcat and another running mysql, a group of related containers each serving the outside world over its own ip+port. Suppose the mysql container dies for some reason. We then have no quick way to sense whether the service as a whole is still healthy. Meanwhile, for tomcat to reach mysql in the plain-container era, we must inject mysql's IP into the tomcat container via environment variables, and map the tomcat container's port onto a host port so the outside world can reach it.

        With the Pod design pattern, which groups multiple containers, we start the mysql and tomcat containers inside one pod, and the pod's liveness alone tells us quickly whether its two related processes are alive or dead. To operations and development staff the pod is a black box: there is no need to look directly at what happens inside it, only at whether the pod as a whole is healthy. Moreover, the pause container introduced in the Pod design elegantly solves IP communication and file sharing between the business containers, solving much of the container-to-container communication on the same node.

16. CRD
        In Kubernetes, everything can be seen as a resource. After Kubernetes 1.7, the CRD custom-resource capability was added to extend the Kubernetes API: through a CRD we can add new resource types to the Kubernetes API without modifying Kubernetes source code or building a custom API server, which greatly improves Kubernetes' extensibility.
        When you create a new CustomResourceDefinition (CRD), the Kubernetes API server creates a new RESTful resource path for each version you specify, and we can create our own typed resources under that api path. A CRD can be namespaced or cluster-wide, as determined by the CRD's scope field; as with existing built-in objects, deleting a namespace deletes all custom objects in that namespace. The customresourcedefinition itself has no namespace and is available to all namespaces.

Create a custom resource with a crd resource, i.e. define a custom RESTful API:

$ vi resourcedefinition.yaml:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  # the name must match the spec fields below, in the form <plural>.<group>
  name: crontabs.stable.example.com
spec:
  # the group name, used in the REST API: /apis/<group>/<version>
  group: stable.example.com
  # the versions supported by this CustomResourceDefinition
  versions:
    - name: v1
      # each version can be enabled/disabled via the served flag
      served: true
      # one and only one version must be marked as the storage version
      storage: true
  # whether the crd resource is namespaced or cluster-scoped
  scope: Namespaced
  names:
    # the plural name used in the URL: /apis/<group>/<version>/<plural>
    plural: crontabs
    # the singular name, used as an alias on the CLI and for display
    singular: crontab
    # the kind field uses CamelCase; resource manifests use it
    kind: CronTab
    # shortNames allow short strings to match the resource on the CLI,
    # i.e. you can use the short name with kubectl when viewing resources
    shortNames:
    - ct
Create the custom crontab resource definition:
$ kubectl create -f resourcedefinition.yaml
A new namespaced RESTful API endpoint is then created at:
/apis/stable.example.com/v1/namespaces/*/crontabs/... and we can use this URL to create and manage custom object resources.
View the custom crontab resources:
$ kubectl get crontabs        # or use the short name: kubectl get ct
1.2. Create a custom resource object:

Create a crontab-type resource object against the RESTful API generated from the crd:
$ vi my-crontab.yaml:

apiVersion: "stable.example.com/v1"
kind: CronTab
metadata:
  name: my-new-cron-object
spec:
  cronSpec: "* * * * */5"
  image: my-awesome-cron-image
After the CustomResourceDefinition object is created, we can create custom resource objects. Custom objects can contain custom fields, and these fields can hold arbitrary JSON. In the example above, the custom fields cronSpec and image are set on the CronTab type, which comes from the CustomResourceDefinition object specification you created above.

Create an object of the custom crontab resource:
$ kubectl create -f my-crontab.yaml
Reference: "Kubernetes (K8s) CRD 资源详解", 简书 (Jianshu).

17. Operator programming practice
        An operator is a form of kubernetes extension that uses custom resource objects (Custom Resources) to manage applications and components, letting users manage applications and services in the declarative style of the Kubernetes API. An operator defines a way to package and deploy complex business applications in a Kubernetes cluster; it mainly addresses how to run and deploy a specific application or service, and how to handle its problems, in a specific custom way.

Reference: "十分钟弄懂 k8s Operator 应用的制作流程", Zhihu.

18. Ingress (Router)
        k8s exposes services (service) externally mainly in two ways: NodePort and LoadBalancer; in addition, externalIPs can also expose Services externally. But when the cluster has many services, the biggest drawback of NodePort is that it occupies many ports on the cluster machines, and the biggest drawback of one LB per service is that it is wasteful and troublesome and needs support from outside k8s. Ingress, on the other hand, needs only one NodePort or one LB to satisfy the external-exposure needs of all services.

Ingress consists of two main parts: the Ingress Controller and the Ingress.

The Ingress solves the mapping between domain names and services after a new service is added. It is essentially an ingress object, created, updated, and loaded via yaml.

The Ingress Controller turns changes to Ingress objects into a piece of Nginx configuration, writes that configuration into the Nginx Pod through the Kubernetes API, and then reloads. (Note: what gets written into nginx.conf is not the service's address but the addresses of the service's backend pods, avoiding an extra layer of load-balancing forwarding through the service.)
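A minimal Ingress object of the kind such a yaml would declare (the host, service name, and port are hypothetical):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
  - host: app.example.com          # requests for this domain ...
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service       # ... are routed to this Service's backend pods
            port:
              number: 80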

Reference: "k8s 对外服务 ingress", yrx420909's blog on CSDN.

k8s' Ingress is called a Route in OpenShift; it is implemented with an HAProxy-based Ingress Controller that routes externally originated traffic and runs on the infra nodes.

19. The difference between iptables and ipvs
        Starting with k8s 1.8, kube-proxy introduced IPVS mode. Both IPVS mode and iptables mode are based on Netfilter, but ipvs uses hash tables whereas iptables walks a list of rules one by one. iptables was also designed as a firewall: the bigger the cluster, the more iptables rules there are, and since iptables matches rules top-down, it becomes ever less efficient. So once the number of services reaches a certain scale, the speed advantage of ipvs hash-table lookup shows, improving service performance.

        The kube-proxy on each node is responsible for watching the API server for changes to services and endpoints. It writes the change information into the local userspace, iptables, or ipvs implementation to achieve service load balancing, using NAT to steer vip traffic to the endpoints. The userspace mode has long been deprecated for reliability and performance reasons (frequent kernel/user-space switching): every client request to a svc first went through iptables and then through kube-proxy to the pod, so performance was very poor.

The differences between the two are as follows:

ipvs provides better scalability and performance for large clusters.

ipvs supports more sophisticated load-balancing algorithms than iptables (least load, least connections, weighted, etc.).

ipvs supports server health checks and connection retries.

ipvs uses hash tables; iptables walks a list of rules one by one.
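IPVS mode is selected in the kube-proxy configuration; a hedged sketch (the scheduler value is one of several ipvs algorithms):

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"          # defaults to "iptables" when unset
ipvs:
  scheduler: "rr"     # round-robin; "lc" = least connections, "wrr" = weighted round-robin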

20. Access control
k8s has two kinds of accounts: UserAccount (used by people) and ServiceAccount (used by Pods).

UserAccount: the account used when the API Server is accessed by a person, or by a script someone wrote.

ServiceAccount: the account a Pod uses when it connects to the API Server itself.

To operate on any resource in K8s, a request must pass three checks: authentication, authorization, and admission control.

Of course these apply only to ordinary users; k8s' default clusteradmin holds the highest privileges.

Authentication: to log in and operate on resources in k8s, a user must present valid credentials. Authentication methods include:

        token (shared secret)

        SSL (mutual SSL authentication)

Authorization: after a user is authenticated, check which resources the user may operate on; if the requested resource falls within the permitted range, the request passes. There are several authorization modes; the default is RBAC.

        ABAC (attribute-based access control)

        RBAC (role-based access control)

        NODE (node-based access control)

        WEBHOOK (access control via custom HTTP callback methods)

Admission control: this part mainly refers to checks, performed after authentication and authorization, on the cascading objects touched by the operation.

ServiceAccount
A ServiceAccount exists so that processes inside a Pod can conveniently call the Kubernetes API or other external services.

1. A serviceaccount is scoped to the namespace it lives in.

2. Every namespace automatically gets a default ServiceAccount; by default it only has permission to pull images.

3. The Token controller detects ServiceAccount creation and creates a secret for each one.

4. With the ServiceAccount Admission Controller enabled:

        1. Every Pod automatically gets spec.serviceAccount set to default after creation.

        2. It verifies that the service account the Pod references exists, otherwise creation is rejected.

        3. If the Pod specifies no ImagePullSecrets, the service account's ImagePullSecrets are added to the Pod.

        4. Every container, after starting, has the service account's token and ca.crt mounted at /var/run/secrets/kubernetes.io/serviceaccount/.

If you want a POD to use a particular sa, specify it via spec.serviceAccountName and grant that sa permissions through RBAC, as in the sketch below.
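A sketch of a Pod referencing a non-default ServiceAccount (the names are hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  serviceAccountName: my-sa     # instead of the namespace's "default" ServiceAccount
  containers:
  - name: app
    image: app:1.0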

RBAC (role-based access control)
By default the authorization mechanism is RBAC (role-based access control). It only grants permissions, never denies them: since everything is denied by default, we only need to define what a user is allowed to do.

RBAC introduces four new resource objects: Role, ClusterRole, RoleBinding, ClusterRoleBinding.

Role
A Role can be understood as a bundle of permissions: a packaged set of permissions under one name. The permissions here are all grants; there are no deny rules. A Role targets a single namespace, which must be specified when it is created.

Operation verbs (get, create, list, delete, update, edit, watch, exec, etc.)

Operable objects (Pods, PV, ConfigMaps, Deployments, Nodes, Secrets, Namespaces, etc.)

ClusterRole
Like a Role, a bundle of permissions, but targeting the entire cluster; no namespace needs to be specified.

RoleBinding and ClusterRoleBinding
They bind a user to a role, much like appointing someone to a position in a company: A is capable, so the company makes him Director of Engineering, and he additionally serves as the company's deputy CTO, and so on.

A RoleBinding can also reference a ClusterRole, but the bound user's permissions are then limited to the RoleBinding's own namespace.

A ClusterRoleBinding can only reference a ClusterRole.
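A sketch of a namespaced Role plus its binding (the namespace, role name, and user are hypothetical):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-reader
rules:
- apiGroups: [""]                     # "" means the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
- kind: User
  name: jane                          # grants the permissions to this user, in namespace dev only
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io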

The following two cluster scenarios illustrate Role/RoleBinding versus ClusterRole/ClusterRoleBinding:
Cluster 1:
        To define an administrator for each namespace, you would have to create the same Admin Role in every namespace and use a RoleBinding to link that Admin Role to a user, making that user the administrator of that namespace. That is very cumbersome: with many namespaces you have to repeat the same work in each of them.
Cluster 2:
        Create one cluster-level admin ClusterRole for the whole k8s cluster, then use a RoleBinding to reference that ClusterRole and associate the user with the RoleBinding. The user then has administrator rights in the current namespace. Much easier: only one cluster-level admin role needs to be created.

If you also want a user that owns and manages all resources in all namespaces, use a ClusterRoleBinding to bind the ClusterRole to the administrative user, granting that user cluster-administrator privileges.
Original article by CSDN blogger "huangfukui", under the CC 4.0 BY-SA license: https://blog.csdn.net/huangfukui/article/details/122147867
