[TKE] Analysis of intranet CLB loopback in IPVS forwarding mode

2022-06-24 16:17:00 jokey

Problem description

In a customer's cluster, calls between two Services occasionally timed out. Investigation showed that the timeouts were caused by the TKE intranet CLB loopback problem (a public network CLB in the same scenario has no such problem). However, the customer reported that another cluster had a similar call pattern and had never timed out. Comparing the two confirmed that the Service call scenarios were indeed the same, but the called Service differed in its externalTrafficPolicy setting: the cluster with the loopback problem used "Local", while the cluster without it used "Cluster".

Conclusion first

Why does "externalTrafficPolicy=Local" lead to the loopback problem while "externalTrafficPolicy=Cluster" does not?

Packet captures on the nodes' outbound network interfaces in the two clusters show the difference. In the cluster with the loopback problem, requests from a Pod to the CLB IP leave the node without SNAT. In the cluster without the problem, they are SNATed on the way out (the request source IP is rewritten to the node IP). When the CLB forwards a request, it checks the request source IP: if the backend list contains a backend with the same IP as the request source, the CLB does not forward to that backend and selects another one instead. The cluster that does SNAT therefore happens to avoid the loopback problem.
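The backend selection described above can be sketched as follows. This is a minimal illustration, not the CLB's actual implementation: the `Backend` type, the `eligible_backends` function, and all IP addresses and ports are hypothetical and chosen only to show why the source IP matters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """A CLB backend: a node IP plus the Service's NodePort (hypothetical model)."""
    node_ip: str
    node_port: int


def eligible_backends(source_ip, backends):
    """Return the backends the CLB may choose, skipping any whose IP
    equals the request source IP (the behaviour described in the text)."""
    return [b for b in backends if b.node_ip != source_ip]


# Hypothetical addresses: Pod A runs on the node 10.0.0.1.
backends = [Backend("10.0.0.1", 30080), Backend("10.0.0.2", 30080)]
pod_a_ip = "172.16.0.5"
node_a_ip = "10.0.0.1"

# Without SNAT the source is the Pod IP, which matches no backend,
# so Pod A's own node stays eligible and loopback can occur.
no_snat = eligible_backends(pod_a_ip, backends)

# With SNAT the source is the node IP, so Pod A's own node is skipped.
with_snat = eligible_backends(node_a_ip, backends)

print([b.node_ip for b in no_snat])
print([b.node_ip for b in with_snat])
```

In this model the request that keeps the Pod IP as its source can still be sent back to the node it came from, which matches the occasional timeouts: only the requests that happen to land on that node fail.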

Problem analysis

Scenario that triggers the loopback problem:

The problem may occur when Pod A in the cluster accesses Service B (Pod B), which is exposed through an intranet CLB.

Request path that triggers the loopback problem

Pod A (client) -> CLB (intranet) -> Service B (the CLB happens to forward to the NodePort on the node hosting Pod A) -> Pod B.

Both clusters are deployed the same way, so both meet the trigger conditions for the loopback problem. In TKE IPVS forwarding mode, packets that a Pod sends to a LoadBalancer type Service from inside the cluster have to leave the node (because the LB IP is not bound to the ipvs0 interface), so by default the iptables rules should SNAT them on the way out. In the cluster with the loopback problem, however, the packets leave the node without SNAT. The following analyzes how the different externalTrafficPolicy settings of a Service affect the forwarding path of container network packets.

Comparing iptables rule differences

First, compare the iptables forwarding rules (NAT table) on the nodes of the two clusters:

[Figure: iptables rule differences]

In the cluster whose Service sets "externalTrafficPolicy=Local", the nodes carry two additional iptables rules that are added specifically for "externalTrafficPolicy=Local".

How the rule differences affect packets leaving the node from a container:

Setting externalTrafficPolicy to Local preserves the client source IP and avoids the second forwarding hop for LoadBalancer and NodePort type Services. For details, see the externalTrafficPolicy introduction.

From how iptables rules work, we know that a network packet that is about to leave the node must first hit the OUTPUT chain:

-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES

Then follow the subsequent chains and their forwarding rules:

A node in the cluster whose Service uses the default "externalTrafficPolicy=Cluster", where SNAT is done:

[Figure: analysis of a packet leaving the node with SNAT]

A node in the cluster whose Service sets "externalTrafficPolicy=Local", where SNAT is not done:

[Figure: analysis of a packet leaving the node without SNAT]

The comparison shows that although the externalTrafficPolicy setting of a Service is intended to govern traffic coming in to the Service, the rules it adds also change how packets from a container to the CLB are handled when they leave the node, and this determines whether the intranet CLB loopback problem occurs.
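The difference between the two clusters can be modeled with a short sketch. The chain and ipset names used here (KUBE-LOAD-BALANCER, KUBE-LOAD-BALANCER-LOCAL, KUBE-MARK-MASQ, KUBE-POSTROUTING) and the 0x4000 mark come from upstream kube-proxy in IPVS mode. That they match the rules on the TKE nodes analyzed above is an assumption. The function name and the addresses are hypothetical.

```python
MASQ_MARK = 0x4000  # default kube-proxy masquerade mark (assumed unchanged here)


def snat_on_egress(dst, lb_set, lb_local_set):
    """Model the NAT-table path of a locally originated packet to dst=(ip, port).

    lb_set       models the KUBE-LOAD-BALANCER ipset (all LB ip:port entries).
    lb_local_set models the KUBE-LOAD-BALANCER-LOCAL ipset
                 (entries whose Service uses externalTrafficPolicy=Local).
    Returns True if the packet would be masqueraded when leaving the node.
    """
    mark = 0
    # OUTPUT -> KUBE-SERVICES: LB traffic jumps to the KUBE-LOAD-BALANCER chain.
    if dst in lb_set:
        if dst in lb_local_set:
            # Rule added for externalTrafficPolicy=Local: RETURN, no mark set.
            pass
        else:
            # Fall through to KUBE-MARK-MASQ: mark the packet for masquerade.
            mark |= MASQ_MARK
    # POSTROUTING -> KUBE-POSTROUTING: only marked packets are masqueraded.
    return bool(mark & MASQ_MARK)


clb_vip = ("10.0.100.10", 80)  # hypothetical intranet CLB address and port

# externalTrafficPolicy=Cluster: the entry is only in the general LB set.
cluster_policy = snat_on_egress(clb_vip, {clb_vip}, set())

# externalTrafficPolicy=Local: the entry is also in the local set.
local_policy = snat_on_egress(clb_vip, {clb_vip}, {clb_vip})

print(cluster_policy, local_policy)
```

Under these assumptions the early return for Local entries is what leaves the packet unmarked, so it exits the node with the Pod IP as its source.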

Reference: https://kubernetes.io/blog/2018/07/09/ipvs-based-in-cluster-load-balancing-deep-dive/

Copyright notice
This article was written by [jokey]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2021/04/20210428225406786D.html