bpf: Add BPF_FIB_LOOKUP_VLAN flag to bpf_fib_lookup() helper#12415

Open
kernel-patches-daemon-bpf[bot] wants to merge 1 commit into bpf-next_base from series/1108746=>bpf-next

@kernel-patches-daemon-bpf

Pull request for series with
subject: bpf: Add BPF_FIB_LOOKUP_VLAN flag to bpf_fib_lookup() helper
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1108746

@kernel-patches-daemon-bpf

Upstream branch: f1a660b
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1108746
version: 1

@kernel-patches-review-bot

commit 34c95a78800d7969199ef48dd9f56442df7cfba4
Author: Avinash Duduskar <avinash.duduskar@gmail.com>

bpf: Add BPF_FIB_LOOKUP_VLAN flag to bpf_fib_lookup() helper

bpf_fib_lookup() returns the FIB-resolved egress ifindex straight from
the fib result. When the egress is a VLAN device, the returned ifindex
is the VLAN netdev's, which has no XDP xmit handler; XDP programs must
instead target the underlying physical device. This patch adds the
BPF_FIB_LOOKUP_VLAN flag to let programs discover both the real egress
ifindex and the VLAN tag in one call.

> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 11dd610fa5fa..aa7fe378a35d 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h

[ ... ]

> diff --git a/net/core/filter.c b/net/core/filter.c
> index 9590877b0714..782fa86df95a 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -6119,10 +6119,28 @@ static const struct bpf_func_proto bpf_skb_get_xfrm_state_proto = {
>  #endif
>
>  #if IS_ENABLED(CONFIG_INET) || IS_ENABLED(CONFIG_IPV6)
> -static int bpf_fib_set_fwd_params(struct bpf_fib_lookup *params, u32 mtu)
> +static int bpf_fib_set_fwd_params(struct net_device *dev,
> +				  struct bpf_fib_lookup *params,
> +				  u32 flags, u32 mtu)
>  {
>  	params->h_vlan_TCI = 0;
>  	params->h_vlan_proto = 0;
> +
> +	if ((flags & BPF_FIB_LOOKUP_VLAN) && is_vlan_dev(dev)) {
> +		struct net_device *real_dev = vlan_dev_real_dev(dev);

Does this correctly handle stacked VLANs (QinQ)? Both the commit message
and the inline comment below mention that stacked VLANs are not walked and
only the immediate parent is resolved.

However, looking at the implementation of vlan_dev_real_dev() in
net/8021q/vlan_core.c:

const struct net_device *vlan_dev_real_dev(const struct net_device *dev)
{
    while (is_vlan_dev(dev))
        dev = vlan_dev_priv(dev)->real_dev;
    return dev;
}

It appears vlan_dev_real_dev() loops through all stacked VLANs to return
the ultimate non-VLAN underlying device.

If the configuration is QinQ, wouldn't this set params->ifindex to the base
physical device while only returning the topmost VLAN tag in
params->h_vlan_TCI? This could cause an XDP program to push only a single
tag and drop the intermediate VLAN tags, bypassing the outer VLAN entirely.

Should this use vlan_dev_priv(dev)->real_dev instead to only get the
immediate parent as intended?
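
To make the concern concrete, here is a minimal userspace sketch of the two resolution strategies. It assumes a toy device model: `struct toy_dev`, the `toy_*` helpers, the device names, and the ifindex values are all hypothetical stand-ins, not kernel types or APIs. It only illustrates how "walk to the base device" and "immediate parent only" diverge on a two-level VLAN stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct net_device (hypothetical, not the kernel type). */
struct toy_dev {
	const char *name;
	int ifindex;
	const struct toy_dev *real_dev; /* NULL for a non-VLAN device */
};

/* Hypothetical QinQ stack: eth0 <- eth0.100 <- eth0.100.200 */
static const struct toy_dev eth0  = { "eth0",         2, NULL   };
static const struct toy_dev outer = { "eth0.100",     3, &eth0  };
static const struct toy_dev inner = { "eth0.100.200", 4, &outer };

static bool toy_is_vlan(const struct toy_dev *dev)
{
	return dev->real_dev != NULL;
}

/* Walks every level and returns the bottom, non-VLAN device. */
static const struct toy_dev *toy_walk_to_base(const struct toy_dev *dev)
{
	assert(dev != NULL);
	while (toy_is_vlan(dev))
		dev = dev->real_dev;
	return dev;
}

/* Resolves one level only; a non-VLAN device resolves to itself. */
static const struct toy_dev *toy_immediate_parent(const struct toy_dev *dev)
{
	assert(dev != NULL);
	return toy_is_vlan(dev) ? dev->real_dev : dev;
}
```

With this model, resolving `inner` by walking yields `eth0` (ifindex 2), while resolving only the immediate parent yields `outer` (ifindex 3), which is itself still a VLAN device. Reporting the walked ifindex together with only the innermost tag is the mismatch described above.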

> +
> +		/* Only the immediate parent is resolved; stacked VLANs

This isn't a bug, but the comment style here differs from BPF subsystem
guidelines. The opening /* has comment text on the same line, but the BPF
comment style prefers the opening /* on its own line with text starting
on the next line.

> +		 * (QinQ) are not walked, and a NULL real_dev (which
> +		 * is_vlan_dev() rules out in practice) keeps the
> +		 * original ifindex.
> +		 */
> +		if (real_dev) {
> +			params->h_vlan_proto = vlan_dev_vlan_proto(dev);
> +			params->h_vlan_TCI = htons(vlan_dev_vlan_id(dev));
> +			params->ifindex = real_dev->ifindex;
> +		}
> +	}
> +
>  	if (mtu)
>  		params->mtu_result = mtu; /* union with tot_len */
>
> @@ -6265,7 +6283,7 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
>  	memcpy(params->smac, dev->dev_addr, ETH_ALEN);
>
>  set_fwd_params:
> -	return bpf_fib_set_fwd_params(params, mtu);
> +	return bpf_fib_set_fwd_params(dev, params, flags, mtu);
>  }
>  #endif

[ ... ]

> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 11dd610fa5fa..aa7fe378a35d 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h

[ ... ]



AI reviewed your patch. Please fix the bug or reply by email explaining why it is not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: bpf: Add BPF_FIB_LOOKUP_VLAN flag to bpf_fib_lookup() helper
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/27223878961

@kernel-patches-daemon-bpf

Upstream branch: 68f4e48
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1108746
version: 1

@kernel-patches-daemon-bpf

Upstream branch: c15261b
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1108746
version: 1

@kernel-patches-daemon-bpf

Upstream branch: 140fa23
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1108746
version: 1

bpf_fib_lookup() returns the FIB-resolved egress ifindex straight
from the fib result. When the egress is a VLAN device, the returned
ifindex is the VLAN netdev's, which has no XDP xmit handler; XDP
programs that want to forward the frame (e.g. xdp-forward) must
instead target the underlying physical device and push the VLAN tag
themselves. Today the program has no way to learn either the
underlying ifindex or the VLAN tag without maintaining its own
VLAN-to-ifindex map in userspace and refreshing it on netlink
events.

Add BPF_FIB_LOOKUP_VLAN. When the caller sets this flag and the fib
result is a VLAN device, populate the existing output fields
params->h_vlan_proto and params->h_vlan_TCI from the VLAN device,
and replace params->ifindex with the underlying real device's
ifindex. params->h_vlan_TCI carries the VID only, with PCP and DEI
bits zero; a consumer wanting to set egress priority writes PCP
itself. Only the immediate parent is resolved; stacked VLANs (QinQ)
are not walked. When the flag is not set, behaviour is unchanged:
h_vlan_proto and h_vlan_TCI are zeroed and ifindex is left at the
FIB result.
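
As an illustration of the "consumer writes PCP itself" point, the following userspace sketch shows one way a program might fold a priority into the VID-only TCI. The helper `toy_tci_with_pcp()` and the `TOY_*` constants are hypothetical names introduced here; the bit layout (3-bit PCP, 1-bit DEI, 12-bit VID) is the standard 802.1Q TCI layout, and the input is assumed to be in network byte order, as the patch stores it with htons().

```c
#include <arpa/inet.h> /* htons, ntohs */
#include <assert.h>
#include <stdint.h>

/* 802.1Q TCI layout: PCP in bits 15..13, DEI in bit 12, VID in bits 11..0. */
#define TOY_VLAN_VID_MASK   0x0fff
#define TOY_VLAN_PRIO_SHIFT 13

/*
 * Takes a VID-only TCI in network byte order and returns a TCI, also in
 * network byte order, with the given PCP (0..7) set and DEI left at zero.
 */
static uint16_t toy_tci_with_pcp(uint16_t h_vlan_tci_be, uint8_t pcp)
{
	uint16_t tci = ntohs(h_vlan_tci_be);

	assert(pcp <= 7);
	tci = (uint16_t)((tci & TOY_VLAN_VID_MASK) |
			 ((uint16_t)pcp << TOY_VLAN_PRIO_SHIFT));
	return htons(tci);
}
```

For example, VID 100 with PCP 5 gives a host-order TCI of 0xa064. An actual XDP program would do the equivalent arithmetic with the BPF byte-order helpers before writing the tag into the frame.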

This lets an XDP redirect target the physical device and learn the
tag to push in a single lookup, which xdp-forward's optional VLAN
mode (xdp-project/xdp-tools#504) wants from the kernel side.

The change extends bpf_fib_set_fwd_params() to take the egress dev
and the lookup flags so the VLAN swap happens in the same place the
vlan output fields are zeroed by default. Both IPv4 and IPv6
callers pass through. The helper's input semantics are unchanged.
Under !CONFIG_VLAN_8021Q, is_vlan_dev() returns false and the new
block is a no-op.

Suggested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Avinash Duduskar <avinash.duduskar@gmail.com>
kernel-patches-daemon-bpf bot force-pushed the series/1108746=>bpf-next branch from 055a5af to f4da50a on June 10, 2026 04:30