Kernel(Memif) + Wireguard is unstable on Calico #527

Open
glazychev-art opened this issue Mar 16, 2022 · 4 comments
@glazychev-art

Description

Note: this issue was observed when running on Calico VPP.

Ping sometimes fails when kernel (or memif) interfaces are combined with wireguard interfaces. Running the Kernel2Wireguard2Kernel example is enough to reproduce it.

Additional info:

I added packet traces and found that the failure most often occurs here, where the reply is dropped at FWD1 on its way back to the NSC:

NSC ----> FWD1 ----> FWD2 ----> NSE
NSC -xxx- FWD1 <---- FWD2 <---- NSE

FWD1 information.
The trace below was captured on FWD1 for a packet on the return path, arriving from FWD2:

vpp# show trace
Packet 3

02:08:36:559914: af-packet-input
  af_packet: hw_if_index 1 next-index 4
    tpacket2_hdr:
      status 0x1 len 158 snaplen 158 mac 66 net 80
      sec 0x62319535 nsec 0x159e4b2 vlan 0 vlan_tpid 0
02:08:36:559925: ethernet-input
  IP4: 02:42:ac:12:00:03 -> 02:42:ac:12:00:04
02:08:36:559931: ip4-input
  UDP: 172.18.0.3 -> 172.18.0.4
    tos 0x00, ttl 63, length 144, checksum 0x2332 dscp CS0 ecn NON_ECN
    fragment id 0x0000
  UDP: 51820 -> 51820
    length 124, checksum 0x0000
02:08:36:559936: cnat-input-ip4
  session not found
  in:host-eth0 out:DELETED 
02:08:36:559944: ip4-lookup
  fib 0 dpo-idx 20 flow hash: 0x00000000
  UDP: 172.18.0.3 -> 172.18.0.4
    tos 0x00, ttl 63, length 144, checksum 0x2332 dscp CS0 ecn NON_ECN
    fragment id 0x0000
  UDP: 51820 -> 51820
    length 124, checksum 0x0000
02:08:36:559948: ip4-receive
    UDP: 172.18.0.3 -> 172.18.0.4
      tos 0x00, ttl 63, length 144, checksum 0x2332 dscp CS0 ecn NON_ECN
      fragment id 0x0000
    UDP: 51820 -> 51820
      length 124, checksum 0x0000
02:08:36:559951: ip4-udp-lookup
  UDP: src-port 51820 dst-port 51820
02:08:36:559953: wg4-input
  Wireguard input: 
    Type: Data
    Peer: 0
    Length: 84
    Keepalive: false
02:08:36:560288: ip4-input-no-checksum
  ICMP: 172.16.1.100 -> 172.16.1.101
    tos 0x00, ttl 63, length 84, checksum 0x8351 dscp CS0 ecn NON_ECN
    fragment id 0x9d6e
  ICMP echo_reply checksum 0x3e2a id 62
02:08:36:560295: l3xc-input-ip4
  l3xc-index:0 lb-index:48
02:08:36:560300: ip4-rewrite
  tx_sw_if_index 12 dpo-idx 48 : ipv4 via 0.0.0.0 tun5: mtu:8920 next:12 flags:[] flow hash: 0x00000000
  00000000: 450000549d6e00003e018451ac100164ac10016500003e2a003e00113f6a821c
  00000020: 00000000000000000000000000000000000000000000000000000000
02:08:36:560302: interface-12-output-deleted
  tun5 
  00000000: 450000549d6e00003e018451ac100164ac10016500003e2a003e00113f6a821c
  00000020: 0000000000000000000000000000000000000000000000000000000000000000
  00000040: 0000000000000000000000000000000000000000552b95de843e0a103afc478b
  00000060: 63032fb80cd7034820f49dff7d10b1144dd7a61d261a9260
02:08:36:560304: error-drop
  rx:wg0
02:08:36:560305: drop
  interface-12-output-deleted: interface is deleted
vpp# show l3xc 
l3xc:[0]: wg0
    path-list:[179] locks:1 flags:shared,no-uRPF, uRPF-list: None
      path:[162] pl-index:179 ip4 weight=1 pref=0 attached-nexthop:  oper-flags:resolved,
        172.16.1.101 tun5 (p2p)
      [@0]: ipv4 via 0.0.0.0 tun5: mtu:8920 next:12 flags:[]

  [@4]: ipv4 via 0.0.0.0 tun5: mtu:8920 next:12 flags:[]
l3xc:[1]: wg0
    path-list:[168] locks:1 flags:shared,no-uRPF, uRPF-list: None
      path:[161] pl-index:168 ip6 weight=1 pref=0 attached:  oper-flags:resolved,
         tun5

  [@3]: ipv6 via :: tun5: mtu:8920 next:15 flags:[]
l3xc:[2]: tun5
    path-list:[167] locks:1 flags:shared,no-uRPF, uRPF-list: None
      path:[159] pl-index:167 ip4 weight=1 pref=0 attached-nexthop:  oper-flags:resolved,
        172.16.1.100 wg0
      [@0]: ipv4 [features] via 172.16.1.100 wg0: mtu:8920 next:8 flags:[features ] 00000000: 4500000000000000401122c2ac120004ac120003ca6cca6c0000000000000000
                                                                                    00000020: 000000000000000000000000
             stacked-on entry:55:
               [@2]: ipv4 via 172.18.0.3 host-eth0: mtu:1500 next:5 flags:[features ] 0242ac1200030242ac1200040800

  [@3]: ipv4 [features] via 172.16.1.100 wg0: mtu:8920 next:8 flags:[features ] 00000000: 4500000000000000401122c2ac120004ac120003ca6cca6c0000000000000000
                                                                                00000020: 000000000000000000000000
    stacked-on entry:55:
      [@2]: ipv4 via 172.18.0.3 host-eth0: mtu:1500 next:5 flags:[features ] 0242ac1200030242ac1200040800
l3xc:[3]: tun5
    path-list:[180] locks:1 flags:shared,no-uRPF, uRPF-list: None
      path:[208] pl-index:180 ip6 weight=1 pref=0 attached:  oper-flags:resolved,
         wg0

  [@1]: dpo-drop ip6

vpp# show int
              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count     
host-eth0                         1      up          1500/0/0/0     rx packets               1387688
                                                                    rx bytes              1729621362
                                                                    tx packets                727018
                                                                    tx bytes               169224852
                                                                    drops                       2435
                                                                    punt                         198
                                                                    ip4                      1384504
                                                                    ip6                         3154
ipip0                             3      up          9000/0/0/0     rx packets                 24012
                                                                    rx bytes                 7451017
                                                                    tx packets                 25636
                                                                    tx bytes                 9032818
                                                                    ip4                        24012
ipip1                             6      up          9000/0/0/0     rx packets                 38450
                                                                    rx bytes                 7956542
                                                                    tx packets                 45448
                                                                    tx bytes                 7614480
                                                                    drops                         44
                                                                    ip4                        38450
local0                            0     down          0/0/0/0       drops                          1
loop0                             4     down         9000/0/0/0     
loop1                             17    down         9000/0/0/0     
loop2                             7     down         9000/0/0/0     
loop3                             9     down         9000/0/0/0     
tap0                              2      up          9216/0/0/0     rx packets                638394
                                                                    rx bytes               152464785
                                                                    tx packets               1349584
                                                                    tx bytes              1718452054
                                                                    drops                         19
                                                                    ip4                       635362
                                                                    ip6                         3013
tun1                              5      up          9216/0/0/0     rx packets                 25086
                                                                    rx bytes                10375976
                                                                    tx packets                 26789
                                                                    tx bytes                 3804974
                                                                    drops                         12
                                                                    ip4                        25074
                                                                    ip6                           12
tun2                              18     up          9216/0/0/0     rx packets                  1046
                                                                    rx bytes                  230263
                                                                    tx packets                  1092
                                                                    tx bytes                  121961
                                                                    drops                          8
                                                                    ip4                         1038
                                                                    ip6                            8
tun3                              13     up          9216/0/0/0     rx packets                  2493
                                                                    rx bytes                  353878
                                                                    tx packets                  2347
                                                                    tx bytes                  464337
                                                                    drops                          8
                                                                    ip4                         2485
                                                                    ip6                            8
tun4                              10     up          9216/0/0/0     rx packets                    13
                                                                    rx bytes                     768
                                                                    drops                          9
                                                                    ip4                            4
                                                                    ip6                            9
tun5                              12     up     8920/8920/8920/8920 rx packets                   128
                                                                    rx bytes                   10464
                                                                    drops                          8
                                                                    ip4                          120
                                                                    ip6                            8
wg0                               11     up     8920/8920/8920/8920 tx packets                   120
                                                                    tx bytes                   17280
                                                                    drops                        120
                                                                    ip4                          120

vpp# show node ip4-rewrite
node ip4-rewrite, type internal, state active, index 601
  node function variants:
    Name             Priority  Active  Description
    icl                    -1          Intel Ice Lake
    skx                    -1          Intel Skylake (server) / Cascade Lake
    hsw                    50    yes   Intel Haswell
    default                 0          default

  next nodes:
    next-index  node-index               Node               Vectors
         0          593                ip4-drop                0   
         1          609             ip4-icmp-error             0   
         2          536                ip4-frag                0   
         3          405                 gso-ip4                0   
         4          257             cnat-output-ip4         2192208
         5          681            host-eth0-output            0   
         6          683               tap0-output              0   
         7          276          acl-plugin-out-ip4-fa      1440390
         8          381              tunnel-output             0   
         9          687               tun1-output              0   
        10          691       interface-8-output-deleted       0   
        11          695              loop3-output            11654 
        12          699       interface-12-output-deleted    19727 
        13          703       interface-14-output-deleted      0   
        14           9              wg4-output-tun             0   
        15          705       interface-15-output-deleted     32   
        16          664            interface-output            0   
        17          709       interface-16-output-deleted      0   
        18          717       interface-20-output-deleted      0   
        19          693               tun3-output              0   
        20          713               tun2-output              0   

  known previous nodes:
    srv6-as-localsid (47)              srv6-ad-flow-localsid (52)         srv6-ad-localsid (56)              
    lisp-tunnel-output (148)           l3xc-input-ip4 (173)               cnat-input-ip4 (259)               
    lookup-ip4-src (367)               lookup-ip4-dst-itf (368)           lookup-ip4-dst (369)               
    tunnel-output-no-count (380)       tunnel-output (381)                adj-midchain-tx (382)              
    sr-localsid-un-perf (415)          sr-localsid-un (416)               sr-localsid (417)                  
    sr-localsid-d (418)                tcp4-output (462)                  ip4-frag (536)                     
    ip4-punt-redirect (594)            ip4-load-balance (605)             ip4-lookup (606)                   
    ip4-classify (614)                 vxlan4-encap (623)                 

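My guess is that many interfaces are created and deleted during the tests (by both NSM and Calico), and at some point the state of the interface list becomes inconsistent (perhaps due to reallocation). In the output above, tun5 (index 12) is up, yet ip4-rewrite still forwards to interface-12-output-deleted and the packets are dropped.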
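
The mismatch visible in the output above (an interface that is up while ip4-rewrite still points at an "interface-N-output-deleted" node with a non-zero vector count) can be checked for mechanically. The following is only a minimal sketch, not part of NSM or Calico: the function names are hypothetical, and it assumes that the N in "interface-N-output-deleted" matches the Idx column of `show int`, as it does for tun5 here. It takes the text of `show int` and `show node ip4-rewrite` as plain strings.

```python
import re
from typing import Dict, List, Tuple


def up_interface_indices(show_int: str) -> Dict[int, str]:
    """Map interface index -> name for interfaces whose State is 'up'."""
    result = {}
    for line in show_int.splitlines():
        # Interface rows start at column 0: "<name> <idx> <state> ...".
        # Header and counter continuation rows start with whitespace.
        m = re.match(r"^(\S+)\s+(\d+)\s+(up|down)\b", line)
        if m and m.group(3) == "up":
            result[int(m.group(2))] = m.group(1)
    return result


def deleted_output_nodes(show_node: str) -> Dict[int, int]:
    """Map N -> vector count for 'interface-N-output-deleted' next nodes."""
    result = {}
    for line in show_node.splitlines():
        m = re.search(r"interface-(\d+)-output-deleted\s+(\d+)", line)
        if m:
            result[int(m.group(1))] = int(m.group(2))
    return result


def stale_next_nodes(show_int: str, show_node: str) -> List[Tuple[int, str, int]]:
    """Return (index, name, vectors) for 'up' interfaces whose index still
    appears as a '-deleted' next node that has forwarded packets.

    Assumption: N in the node name equals the Idx column of 'show int'.
    """
    up = up_interface_indices(show_int)
    deleted = deleted_output_nodes(show_node)
    return sorted(
        (idx, up[idx], vectors)
        for idx, vectors in deleted.items()
        if idx in up and vectors > 0
    )
```

Fed with the two outputs from this issue, such a check would flag tun5 (index 12, 19727 vectors), while interface-15-output-deleted would be ignored because no interface with index 15 is currently up.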

@glazychev-art

@edwarnicke
Do you have any thoughts?

@glazychev-art

I tried to reproduce this locally on bare VPP, without success.
To investigate further, it would be very helpful to use the latest VPP revision. It contains several patches for wireguard, ip, and vnet that may affect this problem.
For that, we need to wait for an update in Calico-VPP - https://github.com/projectcalico/vpp-dataplane/blob/master/vpplink/binapi/vpp_clone_current.sh#L87
According to our latest communication with the Calico team, they have a VPP upgrade in mind, but given the amount of negotiation involved, they will probably postpone it as long as nothing major breaks.
