netdev.vger.kernel.org archive mirror
* [PATCH net] bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress
@ 2025-12-28  2:00 Alexandre Knecht
  2025-12-30  9:31 ` Ido Schimmel
  2025-12-30  9:38 ` Nikolay Aleksandrov
  0 siblings, 2 replies; 6+ messages in thread
From: Alexandre Knecht @ 2025-12-28  2:00 UTC
  To: netdev; +Cc: roopa, Alexandre Knecht

When using an 802.1ad bridge with vlan_tunnel, the C-VLAN tag is
incorrectly stripped from frames during egress processing.

br_handle_egress_vlan_tunnel() uses skb_vlan_pop() to remove the S-VLAN
from hwaccel before VXLAN encapsulation. However, skb_vlan_pop() also
moves any "next" VLAN from the payload into hwaccel:

    /* move next vlan tag to hw accel tag */
    __skb_vlan_pop(skb, &vlan_tci);
    __vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci);

For QinQ frames where the C-VLAN sits in the payload, this moves it to
hwaccel where it gets lost during VXLAN encapsulation.

Fix by calling __vlan_hwaccel_clear_tag() directly, which clears only
the hwaccel S-VLAN and leaves the payload untouched.

This path is only taken when vlan_tunnel is enabled and tunnel_info
is configured, so 802.1Q bridges are unaffected.

Tested with 802.1ad bridge + VXLAN vlan_tunnel, verified C-VLAN
preserved in VXLAN payload via tcpdump.

Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths")
Signed-off-by: Alexandre Knecht <knecht.alexandre@gmail.com>
---
 net/bridge/br_vlan_tunnel.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/bridge/br_vlan_tunnel.c b/net/bridge/br_vlan_tunnel.c
index 12de0d1df0bc..a1b62507e521 100644
--- a/net/bridge/br_vlan_tunnel.c
+++ b/net/bridge/br_vlan_tunnel.c
@@ -189,7 +189,6 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
 	IP_TUNNEL_DECLARE_FLAGS(flags) = { };
 	struct metadata_dst *tunnel_dst;
 	__be64 tunnel_id;
-	int err;

 	if (!vlan)
 		return 0;
@@ -199,9 +198,13 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
 		return 0;

 	skb_dst_drop(skb);
-	err = skb_vlan_pop(skb);
-	if (err)
-		return err;
+	/* For 802.1ad (QinQ), skb_vlan_pop() incorrectly moves the C-VLAN
+	 * from payload to hwaccel after clearing S-VLAN. We only need to
+	 * clear the hwaccel S-VLAN; the C-VLAN must stay in payload for
+	 * correct VXLAN encapsulation. This is also correct for 802.1Q
+	 * where no C-VLAN exists in payload.
+	 */
+	__vlan_hwaccel_clear_tag(skb);

 	if (BR_INPUT_SKB_CB(skb)->backup_nhid) {
 		__set_bit(IP_TUNNEL_KEY_BIT, flags);
--
2.43.0

* Re: [PATCH net] bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress
  2025-12-28  2:00 [PATCH net] bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress Alexandre Knecht
@ 2025-12-30  9:31 ` Ido Schimmel
  2025-12-30  9:38 ` Nikolay Aleksandrov
  1 sibling, 0 replies; 6+ messages in thread
From: Ido Schimmel @ 2025-12-30  9:31 UTC
  To: Alexandre Knecht, razor; +Cc: netdev, roopa

+ Nik (Please use scripts/get_maintainer.pl next time)

On Sun, Dec 28, 2025 at 03:00:57AM +0100, Alexandre Knecht wrote:
> When using an 802.1ad bridge with vlan_tunnel, the C-VLAN tag is
> incorrectly stripped from frames during egress processing.
> 
> br_handle_egress_vlan_tunnel() uses skb_vlan_pop() to remove the S-VLAN
> from hwaccel before VXLAN encapsulation. However, skb_vlan_pop() also
> moves any "next" VLAN from the payload into hwaccel:
> 
>     /* move next vlan tag to hw accel tag */
>     __skb_vlan_pop(skb, &vlan_tci);
>     __vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci);
> 
> For QinQ frames where the C-VLAN sits in the payload, this moves it to
> hwaccel where it gets lost during VXLAN encapsulation.
> 
> Fix by calling __vlan_hwaccel_clear_tag() directly, which clears only
> the hwaccel S-VLAN and leaves the payload untouched.
> 
> This path is only taken when vlan_tunnel is enabled and tunnel_info
> is configured, so 802.1Q bridges are unaffected.

It's not clear from the commit message why 802.1Q bridges are
unaffected. I think you mean that in 802.1Q bridges the C-VLAN is never
in the payload and always in hwaccel (even when Tx VLAN offload is
disabled, thanks to commit 12464bb8de021).

> 
> Tested with 802.1ad bridge + VXLAN vlan_tunnel, verified C-VLAN
> preserved in VXLAN payload via tcpdump.
> 
> Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths")
> Signed-off-by: Alexandre Knecht <knecht.alexandre@gmail.com>

Looks correct to me:

Reviewed-by: Ido Schimmel <idosch@nvidia.com>

> ---
>  net/bridge/br_vlan_tunnel.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/net/bridge/br_vlan_tunnel.c b/net/bridge/br_vlan_tunnel.c
> index 12de0d1df0bc..a1b62507e521 100644
> --- a/net/bridge/br_vlan_tunnel.c
> +++ b/net/bridge/br_vlan_tunnel.c
> @@ -189,7 +189,6 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
>  	IP_TUNNEL_DECLARE_FLAGS(flags) = { };
>  	struct metadata_dst *tunnel_dst;
>  	__be64 tunnel_id;
> -	int err;
> 
>  	if (!vlan)
>  		return 0;
> @@ -199,9 +198,13 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
>  		return 0;
> 
>  	skb_dst_drop(skb);
> -	err = skb_vlan_pop(skb);
> -	if (err)
> -		return err;
> +	/* For 802.1ad (QinQ), skb_vlan_pop() incorrectly moves the C-VLAN
> +	 * from payload to hwaccel after clearing S-VLAN. We only need to
> +	 * clear the hwaccel S-VLAN; the C-VLAN must stay in payload for
> +	 * correct VXLAN encapsulation. This is also correct for 802.1Q
> +	 * where no C-VLAN exists in payload.
> +	 */
> +	__vlan_hwaccel_clear_tag(skb);
> 
>  	if (BR_INPUT_SKB_CB(skb)->backup_nhid) {
>  		__set_bit(IP_TUNNEL_KEY_BIT, flags);
> --
> 2.43.0

* Re: [PATCH net] bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress
  2025-12-28  2:00 [PATCH net] bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress Alexandre Knecht
  2025-12-30  9:31 ` Ido Schimmel
@ 2025-12-30  9:38 ` Nikolay Aleksandrov
  2025-12-30  9:43   ` Nikolay Aleksandrov
  1 sibling, 1 reply; 6+ messages in thread
From: Nikolay Aleksandrov @ 2025-12-30  9:38 UTC
  To: Alexandre Knecht, netdev; +Cc: roopa

On 28/12/2025 04:00, Alexandre Knecht wrote:
> When using an 802.1ad bridge with vlan_tunnel, the C-VLAN tag is
> incorrectly stripped from frames during egress processing.
> 
> br_handle_egress_vlan_tunnel() uses skb_vlan_pop() to remove the S-VLAN
> from hwaccel before VXLAN encapsulation. However, skb_vlan_pop() also
> moves any "next" VLAN from the payload into hwaccel:
> 
>      /* move next vlan tag to hw accel tag */
>      __skb_vlan_pop(skb, &vlan_tci);
>      __vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci);
> 
> For QinQ frames where the C-VLAN sits in the payload, this moves it to
> hwaccel where it gets lost during VXLAN encapsulation.
> 
> Fix by calling __vlan_hwaccel_clear_tag() directly, which clears only
> the hwaccel S-VLAN and leaves the payload untouched.
> 
> This path is only taken when vlan_tunnel is enabled and tunnel_info
> is configured, so 802.1Q bridges are unaffected.
> 
> Tested with 802.1ad bridge + VXLAN vlan_tunnel, verified C-VLAN
> preserved in VXLAN payload via tcpdump.
> 
> Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths")
> Signed-off-by: Alexandre Knecht <knecht.alexandre@gmail.com>
> ---
>   net/bridge/br_vlan_tunnel.c | 11 +++++++----
>   1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/net/bridge/br_vlan_tunnel.c b/net/bridge/br_vlan_tunnel.c
> index 12de0d1df0bc..a1b62507e521 100644
> --- a/net/bridge/br_vlan_tunnel.c
> +++ b/net/bridge/br_vlan_tunnel.c
> @@ -189,7 +189,6 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
>   	IP_TUNNEL_DECLARE_FLAGS(flags) = { };
>   	struct metadata_dst *tunnel_dst;
>   	__be64 tunnel_id;
> -	int err;
> 
>   	if (!vlan)
>   		return 0;
> @@ -199,9 +198,13 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
>   		return 0;
> 
>   	skb_dst_drop(skb);
> -	err = skb_vlan_pop(skb);
> -	if (err)
> -		return err;
> +	/* For 802.1ad (QinQ), skb_vlan_pop() incorrectly moves the C-VLAN
> +	 * from payload to hwaccel after clearing S-VLAN. We only need to
> +	 * clear the hwaccel S-VLAN; the C-VLAN must stay in payload for
> +	 * correct VXLAN encapsulation. This is also correct for 802.1Q
> +	 * where no C-VLAN exists in payload.
> +	 */
> +	__vlan_hwaccel_clear_tag(skb);
> 
>   	if (BR_INPUT_SKB_CB(skb)->backup_nhid) {
>   		__set_bit(IP_TUNNEL_KEY_BIT, flags);
> --
> 2.43.0
> 

Nice catch. As Ido said, please use get_maintainer.pl next time.
The change looks good to me as well. Thanks!

Acked-by: Nikolay Aleksandrov <razor@blackwall.org>




* Re: [PATCH net] bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress
  2025-12-30  9:38 ` Nikolay Aleksandrov
@ 2025-12-30  9:43   ` Nikolay Aleksandrov
  2025-12-30 11:25     ` Alexandre Knecht
  0 siblings, 1 reply; 6+ messages in thread
From: Nikolay Aleksandrov @ 2025-12-30  9:43 UTC
  To: Alexandre Knecht, netdev; +Cc: roopa, Ido Schimmel

On 30/12/2025 11:38, Nikolay Aleksandrov wrote:
> Nice catch. As Ido said, please use get_maintainer.pl next time.
> The change looks good to me as well. Thanks!
> 
> Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
> 
> 
> 

Haha, and I forgot to add Ido to the reply.
Need more coffee this morning :)
Sorry about that!

+ Ido

Cheers,
  Nik


* Re: [PATCH net] bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress
  2025-12-30  9:43   ` Nikolay Aleksandrov
@ 2025-12-30 11:25     ` Alexandre Knecht
  2025-12-30 12:03       ` Ido Schimmel
  0 siblings, 1 reply; 6+ messages in thread
From: Alexandre Knecht @ 2025-12-30 11:25 UTC
  To: Nikolay Aleksandrov; +Cc: netdev, roopa, Ido Schimmel

Hi Ido, Nik, thanks for the review. I'm sorry for the missing
maintainers; I felt bad when I saw the bot result after posting.
It won't happen again.

> It's not clear from the commit message why 802.1Q bridges are
> unaffected. I think you mean that in 802.1Q bridges the C-VLAN is never
> in the payload and always in hwaccel (even when Tx VLAN offload is
> disabled, thanks to commit 12464bb8de021).

You're right that I should clarify, but I think we may be talking
about different scenarios.

The 802.1Q bridge I mentioned is specifically the case with
vlan_tunnel enabled (for VXLAN bridging). In this setup:

1. The bridge uses per-VLAN br_tunnel_info to map C-VLAN (0x8100) to VNI
2. On egress via br_handle_egress_vlan_tunnel(), the C-VLAN is cleared
   from hwaccel and used to set the tunnel_id in dst_metadata
3. After skb_vlan_pop() clears hwaccel, skb->protocol is the actual
   payload type (e.g., IPv4), not a VLAN ethertype, so the second phase
   of skb_vlan_pop() that pops nested VLANs from payload never triggers

So in this 802.1Q + vlan_tunnel path, there's no C-VLAN in hwaccel at
the VXLAN driver entry point.
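
To make point 3 concrete, here is a minimal user-space sketch (not kernel
code) of the branch structure of skb_vlan_pop() as I understand it, for an
skb whose outer tag is in hwaccel. The struct model_skb and the helper names
are hypothetical and exist only for this illustration; the ethertype values
are the standard ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_IP     0x0800
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/* Hypothetical stand-in for the two sk_buff properties that matter here. */
struct model_skb {
	bool     hwaccel_present; /* models skb_vlan_tag_present() */
	uint16_t protocol;        /* models skb->protocol, host byte order */
};

/* Models eth_type_vlan(): true for both VLAN ethertypes. */
static bool is_vlan_ethertype(uint16_t proto)
{
	return proto == ETH_P_8021Q || proto == ETH_P_8021AD;
}

/*
 * Returns true when the "move next vlan tag to hw accel tag" step would
 * run, i.e. when a tag would be pulled out of the payload.
 * Frames without a hwaccel tag are outside the scope of this model.
 */
static bool pop_touches_payload(struct model_skb *skb)
{
	if (!skb->hwaccel_present)
		return false;             /* not the case discussed here */

	skb->hwaccel_present = false;     /* phase 1: clear the hwaccel tag */
	return is_vlan_ethertype(skb->protocol); /* phase 2 gate */
}
```

With protocol set to ETH_P_IP (the 802.1Q bridge case) the function returns
false, so the payload is left alone. With protocol set to ETH_P_8021Q (a
QinQ frame on an 802.1ad bridge) it returns true, which is the case the
patch addresses.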

For 802.1ad bridges, here is what I understood of the flow at egress
in br_handle_egress_vlan_tunnel():

  // 1. S-VLAN is in hwaccel at this point (bridge put it there)
  tunnel_id = READ_ONCE(vlan->tinfo.tunnel_id);
  if (!tunnel_id || unlikely(!skb_vlan_tag_present(skb))) // ← checks hwaccel!
      return 0;

  // 2. Drop the old dst; the tunnel metadata (VNI) goes in skb_dst, NOT hwaccel
  skb_dst_drop(skb);

  // 3. Remove the S-VLAN from hwaccel (it's been "consumed" for VNI lookup)
  err = skb_vlan_pop(skb); // ← BUG: also pops C-VLAN from payload

  // 4. ... creates metadata_dst with tunnel_id ...
  skb_dst_set(skb, &tunnel_dst->dst);

So at the moment skb_vlan_pop() is called:
- S-VLAN: in hwaccel (skb->vlan_tci) : needs to be cleared
- C-VLAN: in packet payload : should NOT be touched
- VNI: already read into tunnel_id, destined for skb_dst() metadata :
  separate from VLAN

The S-VLAN was used to look up the VNI, but it's still sitting in
hwaccel and needs to be removed before the packet goes into the VXLAN
tunnel. The problem is that skb_vlan_pop() does more than just clear
hwaccel: it also "normalizes" any nested VLAN it finds in the payload.
The fix uses __vlan_hwaccel_clear_tag() instead, which only clears
hwaccel without touching the payload.
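
The difference between the two calls can be sketched at the byte level. The
following is a simplified user-space model, not kernel code: model_skb,
model_clear_tag(), model_vlan_pop() and make_qinq_frame() are hypothetical
names, the VIDs 100 and 10 are arbitrary, and only the hwaccel path of
skb_vlan_pop() is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN     6
#define VLAN_HLEN    4
#define ETH_P_IP     0x0800
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct model_skb {
	uint8_t  data[64];    /* frame bytes: dst MAC, src MAC, ethertype, ... */
	size_t   len;
	bool     tag_present; /* models skb_vlan_tag_present() */
	uint16_t vlan_tci;    /* models the hwaccel tag */
};

static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Ethertype that directly follows the two MAC addresses. */
static uint16_t outer_ethertype(const struct model_skb *skb)
{
	return get_be16(skb->data + 2 * ETH_ALEN);
}

/* Models __vlan_hwaccel_clear_tag(): drops the hwaccel tag only. */
static void model_clear_tag(struct model_skb *skb)
{
	skb->tag_present = false;
	skb->vlan_tci = 0;
}

/*
 * Models the hwaccel path of skb_vlan_pop(): clear the hwaccel tag, then
 * move the next in-payload VLAN tag (if any) into hwaccel.
 */
static void model_vlan_pop(struct model_skb *skb)
{
	uint8_t *tag = skb->data + 2 * ETH_ALEN;
	uint16_t proto = outer_ethertype(skb);

	model_clear_tag(skb);
	if (proto != ETH_P_8021Q && proto != ETH_P_8021AD)
		return; /* no nested tag: nothing to move */

	skb->vlan_tci = get_be16(tag + 2); /* TCI follows the TPID */
	skb->tag_present = true;
	/* Close the 4-byte gap left by the removed tag. */
	memmove(tag, tag + VLAN_HLEN, skb->len - 2 * ETH_ALEN - VLAN_HLEN);
	skb->len -= VLAN_HLEN;
}

/* QinQ frame as seen at egress: S-VLAN 100 in hwaccel, C-VLAN 10 in payload. */
static struct model_skb make_qinq_frame(void)
{
	struct model_skb skb = { .len = 2 * ETH_ALEN + VLAN_HLEN + 2,
				 .tag_present = true, .vlan_tci = 100 };
	uint8_t *p = skb.data + 2 * ETH_ALEN; /* MAC addresses left as zeros */

	p[0] = 0x81; p[1] = 0x00; /* TPID 0x8100 (C-VLAN) */
	p[2] = 0x00; p[3] = 0x0a; /* TCI: VID 10 */
	p[4] = 0x08; p[5] = 0x00; /* inner ethertype: IPv4 */
	return skb;
}
```

After model_vlan_pop() on such a frame, the C-VLAN is no longer in the frame
bytes and sits in the modelled hwaccel field instead, which is where it gets
lost before encapsulation according to the analysis above. After
model_clear_tag(), the frame bytes still start with the 0x8100 tag and
hwaccel is empty.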

* Re: [PATCH net] bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress
  2025-12-30 11:25     ` Alexandre Knecht
@ 2025-12-30 12:03       ` Ido Schimmel
  0 siblings, 0 replies; 6+ messages in thread
From: Ido Schimmel @ 2025-12-30 12:03 UTC
  To: Alexandre Knecht; +Cc: Nikolay Aleksandrov, netdev, roopa

On Tue, Dec 30, 2025 at 12:25:39PM +0100, Alexandre Knecht wrote:
> Hi Ido, Nik, thanks for the review. I'm sorry for the missing
> maintainers, I felt so bad when I saw the bot result after posting,
> will not happen ever again.
> 
> > It's not clear from the commit message why 802.1Q bridges are
> > unaffected. I think you mean that in 802.1Q bridges the C-VLAN is never
> > in the payload and always in hwaccel (even when Tx VLAN offload is
> > disabled, thanks to commit 12464bb8de021).
> 
> You're right that I should clarify, but I think we may be talking
> about different scenarios.
> 
> The 802.1Q bridge I mentioned is specifically the case with
> vlan_tunnel enabled (for VXLAN bridging). In this setup:
> 
> 1. The bridge uses per-vlan br_tunnel_info to map C-VLAN (0x8100) to VNI
> 2. On egress via br_handle_egress_vlan_tunnel(), the C-VLAN is cleared
> from hwaccel and used to set the tunnel_id in dst_metadata
> 3. After skb_vlan_pop() clears hwaccel, skb->protocol is the actual
> payload type (e.g., IPv4), not a VLAN ethertype, so the second phase
> of skb_vlan_pop() that pops nested VLANs from payload never triggers
> 
> So in this 802.1Q + vlan_tunnel path, there's no C-VLAN in hwaccel at
> the VXLAN driver entry point.
> 
> For 802.1ad bridges, here is what I understood of the flow at egress
> in br_handle_egress_vlan_tunnel():
> 
>   // 1. S-VLAN is in hwaccel at this point (bridge put it there)
>   tunnel_id = READ_ONCE(vlan->tinfo.tunnel_id);
>   if (!tunnel_id || unlikely(!skb_vlan_tag_present(skb))) // ← checks hwaccel!
>       return 0;
> 
>   // 2. Drop the old dst; the tunnel metadata (VNI) goes in skb_dst, NOT hwaccel
>   skb_dst_drop(skb);
> 
>   // 3. Remove the S-VLAN from hwaccel (it's been "consumed" for VNI lookup)
>   err = skb_vlan_pop(skb); // ← BUG: also pops C-VLAN from payload
> 
>   // 4. ... creates metadata_dst with tunnel_id ...
>   skb_dst_set(skb, &tunnel_dst->dst);
> 
> So at the moment skb_vlan_pop() is called:
> - S-VLAN: in hwaccel (skb->vlan_tci) : needs to be cleared
> - C-VLAN: in packet payload : should NOT be touched
> - VNI: already read into tunnel_id, destined for skb_dst() metadata :
>   separate from VLAN
> 
> The S-VLAN was used to lookup the VNI, but it's still sitting in
> hwaccel and needs to be removed before the packet goes into the VXLAN
> tunnel. The problem is skb_vlan_pop() does more than just clear
> hwaccel, it also "normalizes" any nested VLAN it finds in the payload.
> The fix uses __vlan_hwaccel_clear_tag() instead, which only clears
> hwaccel without touching the payload.

I agree with the analysis and the fix looks correct to me. Also tested
with test_vxlan_vnifiltering.sh which tests the 802.1Q scenario.
