Calico BGP random nodes cannot establish peer connections with new nodes #12938

Description

@MrYuanZhen

Current Behavior

In Calico BGP mode, when a new node is added to the Kubernetes cluster, one randomly selected existing node fails to establish a BGP peer connection with the new node.

  1. The IP address of the newly added node is 11.22.234.205. After it joins the cluster, one randomly selected existing node fails to establish a peer connection with 11.22.234.205.

Run the following command in the calico-node container on the abnormal node:

calico-node -show-status
  2. The calico-node logs from the abnormal node are as follows:
2026-06-09 07:43:09.733 [INFO][2465752] felix/ipip_mgr.go 221: All-hosts IP set out-of sync, refreshing it.
2026-06-09 07:43:09.733 [INFO][2465752] felix/ipsets.go 159: Queueing IP set for creation family="inet" setID="all-hosts-net" setType="hash:net"
2026-06-09 07:43:09.740 [INFO][2465813] confd/client.go 1036: Recompute BGP peerings: HostBGPConfig(node=xmnctstestarmk8sworker06; name=ip_addr_v4) updated; HostBGPConfig(node=xmnctstestarmk8sworker06; name=ip_addr_v6) updated; HostBGPConfig(node=xmnctstestarmk8sworker06; name=rr_cluster_id) updated; xmnctstestarmk8sworker06 updated
bird: Reconfiguration requested by SIGHUP
bird: Reconfiguring
bird: device1: Reconfigured
bird: direct1: Reconfigured
bird: Mesh_11_22_234_193: Reconfigured
bird: Mesh_11_22_234_194: Reconfigured
bird: Mesh_11_22_234_195: Reconfigured
bird: Mesh_11_22_234_190: Reconfigured
bird: Mesh_11_22_234_191: Reconfigured
bird: Mesh_11_22_234_200: Reconfigured
bird: Mesh_11_22_234_201: Reconfigured
bird: Mesh_11_22_234_202: Reconfigured
bird: Mesh_11_22_234_203: Reconfigured
bird: Reconfigured
2026-06-09 07:43:09.789 [INFO][2465813] confd/resource.go 290: Target config /etc/calico/confd/config/bird.cfg has been updated due to change in key: /calico/bgp/v1/host
2026-06-09 07:43:09.810 [INFO][2465752] felix/int_dataplane.go 1954: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"xmnctstestarmk8sworker06" labels:<key:"beta.kubernetes.io/arch" value:"arm64" > labels:<key:"beta.kubernetes.io/instance-type" value:"rke2" > labels:<key:"beta.kubernetes.io/os" value:"linux" > labels:<key:"cattle.io/os" value:"linux" > labels:<key:"kubernetes.io/arch" value:"arm64" > labels:<key:"kubernetes.io/hostname" value:"xmnctstestarmk8sworker06" > labels:<key:"kubernetes.io/os" value:"linux" > labels:<key:"node.kubernetes.io/instance-type" value:"rke2" > labels:<key:"rke.cattle.io/machine" value:"746c17c3-9c03-4976-86df-ce373b88dcbb" > 
2026-06-09 07:43:09.817 [INFO][2465813] confd/client.go 1036: Recompute BGP peerings: xmnctstestarmk8sworker06 updated
2026-06-09 07:43:09.856 [INFO][2465752] felix/int_dataplane.go 1954: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"xmnctstestarmk8sworker06" labels:<key:"beta.kubernetes.io/arch" value:"arm64" > labels:<key:"beta.kubernetes.io/instance-type" value:"rke2" > labels:<key:"beta.kubernetes.io/os" value:"linux" > labels:<key:"cattle.io/os" value:"linux" > labels:<key:"kubernetes.io/arch" value:"arm64" > labels:<key:"kubernetes.io/hostname" value:"xmnctstestarmk8sworker06" > labels:<key:"kubernetes.io/os" value:"linux" > labels:<key:"node.kubernetes.io/instance-type" value:"rke2" > labels:<key:"rke.cattle.io/machine" value:"746c17c3-9c03-4976-86df-ce373b88dcbb" > 
2026-06-09 07:43:29.223 [INFO][2465813] confd/client.go 1036: Recompute BGP peerings: xmnctstestarmk8sworker06 updated
2026-06-09 07:43:29.223 [INFO][2465752] felix/int_dataplane.go 1954: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"xmnctstestarmk8sworker06" labels:<key:"beta.kubernetes.io/arch" value:"arm64" > labels:<key:"beta.kubernetes.io/instance-type" value:"rke2" > labels:<key:"beta.kubernetes.io/os" value:"linux" > labels:<key:"cattle.io/os" value:"linux" > labels:<key:"kubernetes.io/arch" value:"arm64" > labels:<key:"kubernetes.io/hostname" value:"xmnctstestarmk8sworker06" > labels:<key:"kubernetes.io/os" value:"linux" > labels:<key:"node-role.kubernetes.io/worker" value:"true" > labels:<key:"node.kubernetes.io/instance-type" value:"rke2" > labels:<key:"rke.cattle.io/machine" value:"746c17c3-9c03-4976-86df-ce373b88dcbb" > 
2026-06-09 07:43:37.016 [INFO][2465752] felix/int_dataplane.go 1954: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"xmnctstestarmk8sworker06" ipv4_addr:"11.22.234.205/24" labels:<key:"beta.kubernetes.io/arch" value:"arm64" > labels:<key:"beta.kubernetes.io/instance-type" value:"rke2" > labels:<key:"beta.kubernetes.io/os" value:"linux" > labels:<key:"cattle.io/os" value:"linux" > labels:<key:"kubernetes.io/arch" value:"arm64" > labels:<key:"kubernetes.io/hostname" value:"xmnctstestarmk8sworker06" > labels:<key:"kubernetes.io/os" value:"linux" > labels:<key:"node-role.kubernetes.io/worker" value:"true" > labels:<key:"node.kubernetes.io/instance-type" value:"rke2" > labels:<key:"rke.cattle.io/machine" value:"746c17c3-9c03-4976-86df-ce373b88dcbb" > 
2026-06-09 07:43:37.026 [INFO][2465813] confd/client.go 1036: Recompute BGP peerings: HostBGPConfig(node=xmnctstestarmk8sworker06; name=ip_addr_v4) updated; HostBGPConfig(node=xmnctstestarmk8sworker06; name=network_v4) updated
2026-06-09 07:43:37.034 [INFO][2465813] confd/resource.go 290: Target config /etc/calico/confd/config/bird.cfg has been updated due to change in key: /calico/bgp/v1/host
2026-06-09 07:43:37.292 [INFO][2465752] felix/int_dataplane.go 1954: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"xmnctstestarmk8sworker06" ipv4_addr:"11.22.234.205/24" labels:<key:"beta.kubernetes.io/arch" value:"arm64" > labels:<key:"beta.kubernetes.io/instance-type" value:"rke2" > labels:<key:"beta.kubernetes.io/os" value:"linux" > labels:<key:"cattle.io/os" value:"linux" > labels:<key:"kubernetes.io/arch" value:"arm64" > labels:<key:"kubernetes.io/hostname" value:"xmnctstestarmk8sworker06" > labels:<key:"kubernetes.io/os" value:"linux" > labels:<key:"node-role.kubernetes.io/worker" value:"true" > labels:<key:"node.kubernetes.io/instance-type" value:"rke2" > labels:<key:"rke.cattle.io/machine" value:"746c17c3-9c03-4976-86df-ce373b88dcbb" > 
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 46713)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 43107)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 52401)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 48981)
2026-06-09 07:43:49.776 [INFO][2465752] felix/summary.go 100: Summarising 16 dataplane reconciliation loops over 1m3.2s: avg=6ms longest=13ms (resync-ipsets-v4)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 58581)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 55011)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 47195)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 60343)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 33427)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 37737)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 57825)
2026-06-09 07:44:03.975 [INFO][2465752] felix/int_dataplane.go 1954: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"xmnctstestarmk8sworker06" ipv4_addr:"11.22.234.205/24" labels:<key:"beta.kubernetes.io/arch" value:"arm64" > labels:<key:"beta.kubernetes.io/instance-type" value:"rke2" > labels:<key:"beta.kubernetes.io/os" value:"linux" > labels:<key:"cattle.io/os" value:"linux" > labels:<key:"kubernetes.io/arch" value:"arm64" > labels:<key:"kubernetes.io/hostname" value:"xmnctstestarmk8sworker06" > labels:<key:"kubernetes.io/os" value:"linux" > labels:<key:"node-role.kubernetes.io/worker" value:"true" > labels:<key:"node.kubernetes.io/instance-type" value:"rke2" > labels:<key:"plan.upgrade.cattle.io/system-agent-upgrader" value:"9bb1010f7f487d1fb565e26c004071b7c56489fa9a1d3ce128297483" > labels:<key:"rke.cattle.io/machine" value:"746c17c3-9c03-4976-86df-ce373b88dcbb" > 
2026-06-09 07:44:03.978 [INFO][2465813] confd/client.go 1036: Recompute BGP peerings: xmnctstestarmk8sworker06 updated
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 34975)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 51201)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 56149)

  3. To recover from the failure, run the following command in the calico-node pod, or restart the pod:
sv hup bird || true 
  4. After the reload or restart, running calico-node -show-status again shows that the peer connection with node 11.22.234.205 has been restored.
  5. Below is the log after recovery, showing that the peer session with node 11.22.234.205 has been established (Mesh_11_22_234_205 reaches state up):
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 47073)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 37937)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 38259)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 53055)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 49627)
bird: BGP: Unexpected connect from unknown address 11.22.234.205 (port 59853)
bird: Reconfiguration requested by SIGHUP
bird: Reconfiguring
bird: device1: Reconfigured
bird: direct1: Reconfigured
bird: Mesh_11_22_234_193: Reconfigured
bird: Mesh_11_22_234_194: Reconfigured
bird: Mesh_11_22_234_195: Reconfigured
bird: Mesh_11_22_234_190: Reconfigured
bird: Mesh_11_22_234_191: Reconfigured
bird: Mesh_11_22_234_200: Reconfigured
bird: Mesh_11_22_234_201: Reconfigured
bird: Mesh_11_22_234_202: Reconfigured
bird: Mesh_11_22_234_203: Reconfigured
bird: Adding protocol Mesh_11_22_234_205
bird: Mesh_11_22_234_205: Initializing
bird: Mesh_11_22_234_205: Starting
bird: Mesh_11_22_234_205: State changed to start
bird: Reconfigured
bird: Mesh_11_22_234_205: Connected to table master
bird: Mesh_11_22_234_205: State changed to feed
bird: Mesh_11_22_234_205: State changed to up
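
The pattern in the two log excerpts above can be checked mechanically. The following is a minimal Python sketch, not part of Calico, that scans calico-node log text for addresses BIRD rejected with "Unexpected connect from unknown address" and for which no Mesh_ protocol line appears anywhere in the same text. The function name missing_mesh_peers is hypothetical, and the regular expressions assume the exact line shapes shown in this issue.

```python
import re
from collections import Counter

# Line shapes copied from the log excerpts in this issue (assumed stable).
UNEXPECTED = re.compile(
    r"bird: BGP: Unexpected connect from unknown address (\S+) \(port \d+\)"
)
# Matches both "bird: Mesh_a_b_c_d: ..." and "bird: Adding protocol Mesh_a_b_c_d".
MESH = re.compile(r"bird: (?:Adding protocol )?Mesh_(\d+)_(\d+)_(\d+)_(\d+)\b")


def missing_mesh_peers(log_text):
    """Return {address: rejected_connect_count} for peers that BIRD rejected
    and that have no Mesh_ protocol mentioned anywhere in log_text."""
    configured = set()
    rejected = Counter()
    for line in log_text.splitlines():
        m = UNEXPECTED.search(line)
        if m:
            rejected[m.group(1)] += 1
            continue
        m = MESH.search(line)
        if m:
            # Mesh_11_22_234_205 corresponds to peer address 11.22.234.205.
            configured.add(".".join(m.groups()))
    return {addr: n for addr, n in rejected.items() if addr not in configured}
```

Run on the first excerpt, this reports 11.22.234.205 as rejected with no matching protocol. Run on the post-recovery excerpt, it reports nothing, because Mesh_11_22_234_205 appears there. The sketch ignores line ordering, so it only answers whether the peer was ever configured within the given text.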

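Based on the logs, I suspect a problem with the ordering between the configuration file update and bird's reload. In the first excerpt, confd reports at 07:43:37.034 that bird.cfg has been updated, but no bird reconfiguration follows, and bird keeps rejecting 11.22.234.205 until it is reloaded manually. Strangely, the problem occurs on a random node in the cluster each time.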
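
One way to narrow this down is to check whether the rendered configuration file already contains the new peer while BIRD is still rejecting it. The following is a minimal Python sketch under stated assumptions: it assumes the mesh peers in /etc/calico/confd/config/bird.cfg are written as "neighbor <address> as <asn>;" statements, which should be verified against the actual file. The helper names configured_neighbors and diagnose are hypothetical.

```python
import re

# Assumed shape of a BIRD neighbor statement inside a "protocol bgp" block.
# Commented-out lines (starting with '#') are not matched.
NEIGHBOR = re.compile(r"^\s*neighbor\s+(\S+)\s+as\s+\d+\s*;", re.M)


def configured_neighbors(bird_cfg_text):
    """Return the set of neighbor addresses found in the config text."""
    return set(NEIGHBOR.findall(bird_cfg_text))


def diagnose(bird_cfg_text, rejected_addr):
    """Classify a rejected peer by whether the config file already lists it."""
    if rejected_addr in configured_neighbors(bird_cfg_text):
        return "in config file; BIRD may not have reloaded it"
    return "not in config file; confd has not rendered the peer"
```

If diagnose reports the address as present in the file while the "Unexpected connect from unknown address" lines continue, that would be consistent with a missed reload rather than a missing peer in the rendered configuration.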

Your Environment

  • Calico version: calico-node:v3.27.3
  • Orchestrator version (e.g. kubernetes, openshift, etc.): v1.27.15+rke2r1
  • Operating System and version: Kylin V10 SP2, Kylin Linux Advanced Server release V10 (Sword)
  • Kernel Version: 4.19.90-25.43.v2101.ky10.aarch64
