But the kernel has the neighbor listed as added by BGP/Zebra:
root@sw2:~# ip neigh show dev Vlan2
10.0.0.71 lladdr 74:86:e2:43:33:05 extern_learn NOARP proto zebra
10.0.0.50 lladdr 18:5a:58:2a:e8:20 extern_learn NOARP proto zebra
fe80::7686:e2ff:fe43:3305 lladdr 74:86:e2:43:33:05 extern_learn NOARP proto zebra
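For anyone who wants to check this on their own switch, here is a minimal sketch that picks the zebra-installed entries out of `ip neigh show` output. The function names are hypothetical, and the parsing assumes the token layout shown above (an address followed by keyword/value pairs and bare flags); it is not an official tool.

```python
def parse_neigh_line(line):
    """Parse one line of `ip neigh show` output into a dict.

    Assumes the layout seen above: the address first, then
    keyword/value pairs (dev, lladdr, proto) and bare flags/states.
    """
    tokens = line.split()
    if not tokens:
        return None
    entry = {"ip": tokens[0], "dev": None, "lladdr": None,
             "proto": None, "flags": set()}
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("dev", "lladdr", "proto") and i + 1 < len(tokens):
            entry[tok] = tokens[i + 1]
            i += 2
        else:
            # Bare tokens such as extern_learn, NOARP, REACHABLE
            entry["flags"].add(tok)
            i += 1
    return entry


def zebra_learned(output):
    """Return entries marked extern_learn and installed with proto zebra."""
    entries = (parse_neigh_line(line) for line in output.splitlines())
    return [e for e in entries
            if e and e["proto"] == "zebra" and "extern_learn" in e["flags"]]
```

Feeding it the three lines above should return all three entries, while a locally learned neighbor (no `proto zebra`) would be filtered out.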
And the type-2 routes look good:
# vtysh -c "show bgp l2vpn evpn"
BGP table version is 7, local router ID is 172.16.0.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[EthTag]:[ESI]:[IPlen]:[VTEP-IP]:[Frag-id]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]
Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 172.16.0.1:2
*> [2]:[0]:[48]:[18:5a:58:2a:e8:20]
172.16.0.1 0 4210000001 i
RT:32897:10002 ET:8
*> [2]:[0]:[48]:[18:5a:58:2a:e8:20]:[32]:[10.0.0.50]
172.16.0.1 0 4210000001 i
RT:32897:10002 ET:8
*> [2]:[0]:[48]:[74:86:e2:43:33:05]:[32]:[10.0.0.71]
172.16.0.1 0 4210000001 i
RT:32897:10002 ET:8
*> [2]:[0]:[48]:[74:86:e2:43:33:05]:[128]:[fe80::7686:e2ff:fe43:3305]
172.16.0.1 0 4210000001 i
RT:32897:10002 ET:8
*> [3]:[0]:[32]:[172.16.0.1]
172.16.0.1 0 4210000001 i
RT:32897:10002 ET:8
Route Distinguisher: 172.16.0.2:2
*> [2]:[0]:[48]:[74:86:e2:43:28:05]:[32]:[10.0.0.72]
172.16.0.2 32768 i
ET:8 RT:32898:10002
*> [2]:[0]:[48]:[74:86:e2:43:28:05]:[128]:[fe80::7686:e2ff:fe43:2805]
172.16.0.2 32768 i
ET:8 RT:32898:10002
*> [3]:[0]:[32]:[172.16.0.2]
172.16.0.2 32768 i
ET:8 RT:32898:10002
Displayed 8 out of 8 total prefixes
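To cross-check these routes against the kernel neighbors, the type-2 prefixes can be split according to the `[2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]` legend printed above. This is a rough sketch with a hypothetical helper name; it extracts the bracketed fields with a regex because the MAC and IPv6 fields themselves contain colons.

```python
import re

# Each field is wrapped in [...]; matching the brackets avoids
# splitting on the colons inside MAC and IPv6 addresses.
_FIELD = re.compile(r"\[([^\]]*)\]")


def parse_type2(prefix):
    """Parse an EVPN type-2 prefix string as printed by vtysh.

    Returns None for other route types. MAC-only routes (four fields)
    come back with ip and ip_len set to None.
    """
    fields = _FIELD.findall(prefix)
    if not fields or fields[0] != "2":
        return None
    if len(fields) not in (4, 6):
        raise ValueError("unexpected type-2 prefix: %r" % prefix)
    route = {
        "eth_tag": int(fields[1]),
        "mac_len": int(fields[2]),
        "mac": fields[3],
        "ip_len": None,
        "ip": None,
    }
    if len(fields) == 6:
        route["ip_len"] = int(fields[4])
        route["ip"] = fields[5]
    return route
```

For example, `parse_type2("*> [2]:[0]:[48]:[18:5a:58:2a:e8:20]:[32]:[10.0.0.50]")` yields the MAC `18:5a:58:2a:e8:20` and the IP `10.0.0.50`, which matches the kernel neighbor entry on sw2.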
Going over to the originating VTEP (sw1), where the host is directly connected, the NEIGH_TABLE is populated as expected, and we see these log entries:
2024 Nov 20 22:13:42.369623 sw1 NOTICE swss#orchagent: :- addNeighbor: Created neighbor ip 10.0.0.50, 18:5a:58:2a:e8:20 on Vlan2
2024 Nov 20 22:13:42.370310 sw1 NOTICE syncd#syncd: [none] SAI_API_NEXT_HOP:brcm_sai_create_next_hop:334 nhid 3 vr_id 0 ip af:v4 addr:10.0.0.50 rif-id 1 tunnel-id 0 vni 0
2024 Nov 20 22:13:42.370474 sw1 NOTICE syncd#syncd: [none] SAI_API_NEXT_HOP:_brcm_sai_xgs_create_ip_nexthop:554 nhid 3 eg-if 400004 rif 0 vid 0 port/tid(0x0) is_trunk(0)
2024 Nov 20 22:13:42.371069 sw1 NOTICE swss#orchagent: :- addNextHop: Created next hop 10.0.0.50 on Vlan2
I'm assuming there is some event that should cause the NEIGH_TABLE on sw2 to be populated, which in turn should trigger programming of the neighbor into the ASIC.
The observed behavior is that pings between 10.0.0.50 and 10.0.0.72 go through for about 10s, then stop for 300s (which appears to be the MAC aging timer), then go through for about 10s again, then stop for another 300s, and so on. I'm assuming this is due to some slow path (during learning) vs. fast path logic in the ASIC.
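To make the gap concrete, here is a small sketch that reports which kernel neighbors have no matching NEIGH_TABLE entry. It assumes APPL_DB keys have the form `NEIGH_TABLE:<ifname>:<ip>` and that the key list has already been collected by some means (for example with something like `sonic-db-cli APPL_DB keys 'NEIGH_TABLE:*'`, which is also an assumption on my part). The function names are hypothetical.

```python
def neigh_table_ips(keys, ifname):
    """Collect the IPs found in NEIGH_TABLE keys for one interface.

    Assumes keys look like NEIGH_TABLE:<ifname>:<ip>. Splitting at most
    twice keeps the colons of an IPv6 address inside the IP field.
    """
    ips = set()
    for key in keys:
        parts = key.split(":", 2)
        if len(parts) == 3 and parts[0] == "NEIGH_TABLE" and parts[1] == ifname:
            ips.add(parts[2])
    return ips


def missing_from_appl_db(kernel_ips, keys, ifname):
    """Return kernel neighbor IPs with no NEIGH_TABLE entry on ifname."""
    return sorted(set(kernel_ips) - neigh_table_ips(keys, ifname))
```

With the sw2 state described here (three zebra-installed kernel neighbors on Vlan2 and no matching NEIGH_TABLE keys), this would list all three addresses as missing. Addresses are compared as plain strings, so differently formatted IPv6 addresses would need normalizing first.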
bradh352 changed the title from "NEIGH_TABLE not populated with VXLAN routes leading to WARNING" to "NEIGH_TABLE not populated with VXLAN routes" on Dec 2, 2024.
Observed on master and 202405 (with PR #3383 applied to make VXLANs actually work).
Basic architecture is VXLAN EVPN with an L3 IRB/VNI interface on the switches participating in the VXLAN fabric.
On sw2 I've noticed log entries like:
Then when I investigate, NEIGH_TABLE in APPL_DB doesn't have any neighbors listed for Vlan2.