netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Bridge VLAN memory leak
@ 2022-07-04 15:47 Vlad Buslov
  2022-07-05 13:55 ` Ido Schimmel
  0 siblings, 1 reply; 5+ messages in thread
From: Vlad Buslov @ 2022-07-04 15:47 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, Maor Dickman, Roopa Prabhu, razor

Hi Ido,

While implementing QinQ offload support in mlx5 I encountered a memory
leak[0] in the bridge implementation which seems to be related to the new
BR_VLFLAG_ADDED_BY_SWITCHDEV flag that you have recently added.

To reproduce the issue netdevice must support bridge VLAN offload, so I
can't provide a simple script that uses veth or anything like that.
Instead, I'll describe the issue step-by-step:

1. Create a bridge, add offload-capable netdevs to it and assign some
VLAN to them. __vlan_vid_add() function will set the
BR_VLFLAG_ADDED_BY_SWITCHDEV flag since br_switchdev_port_vlan_add()
should return 0 if dev can offload VLANs and will also skip call to
vlan_vid_add() in such case:

        /* Try switchdev op first. In case it is not supported, fallback to
         * 8021q add.
         */
        err = br_switchdev_port_vlan_add(dev, v->vid, flags, false, extack);
        if (err == -EOPNOTSUPP)
                return vlan_vid_add(dev, br->vlan_proto, v->vid);
        v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV;

2. Enable filtering and set VLAN protocol to 802.1ad. That will trigger
the following code in __br_vlan_set_proto() that re-creates existing
VLANs with vlan_vid_add() function call whether they have the
BR_VLFLAG_ADDED_BY_SWITCHDEV flag set or not:

         /* Add VLANs for the new proto to the device filter. */
         list_for_each_entry(p, &br->port_list, list) {
                 vg = nbp_vlan_group(p);
                 list_for_each_entry(vlan, &vg->vlan_list, vlist) {
                         err = vlan_vid_add(p->dev, proto, vlan->vid);
                         if (err)
                                 goto err_filt;
                 }
         }

3. Now delete the bridge. That will delete all existing VLANs via
__vlan_vid_del() function, which skips calling vlan_vid_del() (that is
necessary to clean up after vlan_vid_add()) if VLAN has
BR_VLFLAG_ADDED_BY_SWITCHDEV flag set:

         /* Try switchdev op first. In case it is not supported, fallback to
          * 8021q del.
          */
         err = br_switchdev_port_vlan_del(dev, v->vid);
         if (!(v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV))
                 vlan_vid_del(dev, br->vlan_proto, v->vid);


The issue doesn't reproduce for me anymore if I just clear the
BR_VLFLAG_ADDED_BY_SWITCHDEV flag when re-creating VLANs on step 2.
However, I'm not sure whether it is the right approach in this case.
WDYT?

[0]:

unreferenced object 0xffff8881f6771200 (size 256):
  comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
  hex dump (first 32 bytes):
    00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000012819ac>] vlan_vid_add+0x437/0x750
    [<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
    [<000000000632b56f>] br_changelink+0x3d6/0x13f0
    [<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
    [<00000000f6276baf>] rtnl_newlink+0x5f/0x90
    [<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
    [<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
    [<0000000010588814>] netlink_unicast+0x438/0x710
    [<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
    [<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
    [<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
    [<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
    [<00000000684f7e25>] __sys_sendmsg+0xab/0x130
    [<000000004538b104>] do_syscall_64+0x3d/0x90
    [<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0


Regards,
Vlad

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bridge VLAN memory leak
  2022-07-04 15:47 Bridge VLAN memory leak Vlad Buslov
@ 2022-07-05 13:55 ` Ido Schimmel
  2022-07-05 14:46   ` Vlad Buslov
  0 siblings, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2022-07-05 13:55 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, Maor Dickman, Roopa Prabhu, razor

On Mon, Jul 04, 2022 at 06:47:29PM +0300, Vlad Buslov wrote:
> Hi Ido,
> 
> While implementing QinQ offload support in mlx5 I encountered a memory
> leak[0] in the bridge implementation which seems to be related to the new
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag that you have recently added.

FTR, added here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=279737939a8194f02fa352ab4476a1b241f44ef4

> 
> To reproduce the issue netdevice must support bridge VLAN offload, so I
> can't provide a simple script that uses veth or anything like that.
> Instead, I'll describe the issue step-by-step:
> 
> 1. Create a bridge, add offload-capable netdevs to it and assign some
> VLAN to them. __vlan_vid_add() function will set the
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag since br_switchdev_port_vlan_add()
> should return 0 if dev can offload VLANs and will also skip call to
> vlan_vid_add() in such case:
> 
>         /* Try switchdev op first. In case it is not supported, fallback to
>          * 8021q add.
>          */
>         err = br_switchdev_port_vlan_add(dev, v->vid, flags, false, extack);
>         if (err == -EOPNOTSUPP)
>                 return vlan_vid_add(dev, br->vlan_proto, v->vid);
>         v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV;
> 
> 2. Enable filtering and set VLAN protocol to 802.1ad. That will trigger
> the following code in __br_vlan_set_proto() that re-creates existing
> VLANs with vlan_vid_add() function call whether they have the
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set or not:
> 
>          /* Add VLANs for the new proto to the device filter. */
>          list_for_each_entry(p, &br->port_list, list) {
>                  vg = nbp_vlan_group(p);
>                  list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>                          err = vlan_vid_add(p->dev, proto, vlan->vid);
>                          if (err)
>                                  goto err_filt;
>                  }
>          }
> 
> 3. Now delete the bridge. That will delete all existing VLANs via
> __vlan_vid_del() function, which skips calling vlan_vid_del() (that is
> necessary to clean up after vlan_vid_add()) if VLAN has
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set:
> 
>          /* Try switchdev op first. In case it is not supported, fallback to
>           * 8021q del.
>           */
>          err = br_switchdev_port_vlan_del(dev, v->vid);
>          if (!(v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV))
>                  vlan_vid_del(dev, br->vlan_proto, v->vid);

Looking at the code before the change, I'm pretty sure you will be able
to reproduce the leak prior to above mentioned commit:

```
-	err = br_switchdev_port_vlan_del(dev, vid);
-	if (err == -EOPNOTSUPP) {
-		vlan_vid_del(dev, br->vlan_proto, vid);
-		return 0;
-	}
-	return err;
```

> 
> 
> The issue doesn't reproduce for me anymore if I just clear the
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag when re-creating VLANs on step 2.
> However, I'm not sure whether it is the right approach in this case.
> WDYT?

As a switchdev driver you already know about the new VLAN protocol via
'SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL' so do we really need the VLANs
to be programmed again? The VLAN protocol is not communicated in
'SWITCHDEV_OBJ_ID_PORT_VLAN' anyway.

Can you try the below (compile tested only)?

```
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 6e53dc991409..9ffd40b8270c 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -959,6 +959,8 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
 	list_for_each_entry(p, &br->port_list, list) {
 		vg = nbp_vlan_group(p);
 		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
+			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
+				continue;
 			err = vlan_vid_add(p->dev, proto, vlan->vid);
 			if (err)
 				goto err_filt;
@@ -973,8 +975,11 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
 	/* Delete VLANs for the old proto from the device filter. */
 	list_for_each_entry(p, &br->port_list, list) {
 		vg = nbp_vlan_group(p);
-		list_for_each_entry(vlan, &vg->vlan_list, vlist)
+		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
+			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
+				continue;
 			vlan_vid_del(p->dev, oldproto, vlan->vid);
+		}
 	}
 
 	return 0;
@@ -983,13 +988,19 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
 	attr.u.vlan_protocol = ntohs(oldproto);
 	switchdev_port_attr_set(br->dev, &attr, NULL);
 
-	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist)
+	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist) {
+		if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
+			continue;
 		vlan_vid_del(p->dev, proto, vlan->vid);
+	}
 
 	list_for_each_entry_continue_reverse(p, &br->port_list, list) {
 		vg = nbp_vlan_group(p);
-		list_for_each_entry(vlan, &vg->vlan_list, vlist)
+		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
+			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
+				continue;
 			vlan_vid_del(p->dev, proto, vlan->vid);
+		}
 	}
 
 	return err;
```

> 
> [0]:
> 
> unreferenced object 0xffff8881f6771200 (size 256):
>   comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
>   hex dump (first 32 bytes):
>     00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<00000000012819ac>] vlan_vid_add+0x437/0x750
>     [<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
>     [<000000000632b56f>] br_changelink+0x3d6/0x13f0
>     [<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
>     [<00000000f6276baf>] rtnl_newlink+0x5f/0x90
>     [<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
>     [<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
>     [<0000000010588814>] netlink_unicast+0x438/0x710
>     [<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
>     [<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
>     [<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
>     [<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
>     [<00000000684f7e25>] __sys_sendmsg+0xab/0x130
>     [<000000004538b104>] do_syscall_64+0x3d/0x90
>     [<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
> 
> 
> Regards,
> Vlad

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Bridge VLAN memory leak
  2022-07-05 13:55 ` Ido Schimmel
@ 2022-07-05 14:46   ` Vlad Buslov
  2022-07-05 15:42     ` Ido Schimmel
  0 siblings, 1 reply; 5+ messages in thread
From: Vlad Buslov @ 2022-07-05 14:46 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, Maor Dickman, Roopa Prabhu, razor

On Tue 05 Jul 2022 at 16:55, Ido Schimmel <idosch@nvidia.com> wrote:
> On Mon, Jul 04, 2022 at 06:47:29PM +0300, Vlad Buslov wrote:
>> Hi Ido,
>> 
>> While implementing QinQ offload support in mlx5 I encountered a memory
>> leak[0] in the bridge implementation which seems to be related to the new
>> BR_VLFLAG_ADDED_BY_SWITCHDEV flag that you have recently added.
>
> FTR, added here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=279737939a8194f02fa352ab4476a1b241f44ef4
>
>> 
>> To reproduce the issue netdevice must support bridge VLAN offload, so I
>> can't provide a simple script that uses veth or anything like that.
>> Instead, I'll describe the issue step-by-step:
>> 
>> 1. Create a bridge, add offload-capable netdevs to it and assign some
>> VLAN to them. __vlan_vid_add() function will set the
>> BR_VLFLAG_ADDED_BY_SWITCHDEV flag since br_switchdev_port_vlan_add()
>> should return 0 if dev can offload VLANs and will also skip call to
>> vlan_vid_add() in such case:
>> 
>>         /* Try switchdev op first. In case it is not supported, fallback to
>>          * 8021q add.
>>          */
>>         err = br_switchdev_port_vlan_add(dev, v->vid, flags, false, extack);
>>         if (err == -EOPNOTSUPP)
>>                 return vlan_vid_add(dev, br->vlan_proto, v->vid);
>>         v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV;
>> 
>> 2. Enable filtering and set VLAN protocol to 802.1ad. That will trigger
>> the following code in __br_vlan_set_proto() that re-creates existing
>> VLANs with vlan_vid_add() function call whether they have the
>> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set or not:
>> 
>>          /* Add VLANs for the new proto to the device filter. */
>>          list_for_each_entry(p, &br->port_list, list) {
>>                  vg = nbp_vlan_group(p);
>>                  list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>>                          err = vlan_vid_add(p->dev, proto, vlan->vid);
>>                          if (err)
>>                                  goto err_filt;
>>                  }
>>          }
>> 
>> 3. Now delete the bridge. That will delete all existing VLANs via
>> __vlan_vid_del() function, which skips calling vlan_vid_del() (that is
>> necessary to clean up after vlan_vid_add()) if VLAN has
>> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set:
>> 
>>          /* Try switchdev op first. In case it is not supported, fallback to
>>           * 8021q del.
>>           */
>>          err = br_switchdev_port_vlan_del(dev, v->vid);
>>          if (!(v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV))
>>                  vlan_vid_del(dev, br->vlan_proto, v->vid);
>
> Looking at the code before the change, I'm pretty sure you will be able
> to reproduce the leak prior to above mentioned commit:
>
> ```
> -	err = br_switchdev_port_vlan_del(dev, vid);
> -	if (err == -EOPNOTSUPP) {
> -		vlan_vid_del(dev, br->vlan_proto, vid);
> -		return 0;
> -	}
> -	return err;
> ```
>
>> 
>> 
>> The issue doesn't reproduce for me anymore if I just clear the
>> BR_VLFLAG_ADDED_BY_SWITCHDEV flag when re-creating VLANs on step 2.
>> However, I'm not sure whether it is the right approach in this case.
>> WDYT?
>
> As a switchdev driver you already know about the new VLAN protocol via
> 'SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL' so do we really need the VLANs
> to be programmed again? The VLAN protocol is not communicated in
> 'SWITCHDEV_OBJ_ID_PORT_VLAN' anyway.

In my WIP implementation of 802.1ad offload I just re-create all
existing VLANs in hardware with new protocol upon receiving
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL notification.

>
> Can you try the below (compile tested only)?

With the patch applied memleak no longer reproduces.

>
> ```
> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
> index 6e53dc991409..9ffd40b8270c 100644
> --- a/net/bridge/br_vlan.c
> +++ b/net/bridge/br_vlan.c
> @@ -959,6 +959,8 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
>  	list_for_each_entry(p, &br->port_list, list) {
>  		vg = nbp_vlan_group(p);
>  		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
> +			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
> +				continue;
>  			err = vlan_vid_add(p->dev, proto, vlan->vid);
>  			if (err)
>  				goto err_filt;
> @@ -973,8 +975,11 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
>  	/* Delete VLANs for the old proto from the device filter. */
>  	list_for_each_entry(p, &br->port_list, list) {
>  		vg = nbp_vlan_group(p);
> -		list_for_each_entry(vlan, &vg->vlan_list, vlist)
> +		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
> +			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
> +				continue;
>  			vlan_vid_del(p->dev, oldproto, vlan->vid);
> +		}
>  	}
>  
>  	return 0;
> @@ -983,13 +988,19 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
>  	attr.u.vlan_protocol = ntohs(oldproto);
>  	switchdev_port_attr_set(br->dev, &attr, NULL);
>  
> -	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist)
> +	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist) {
> +		if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
> +			continue;
>  		vlan_vid_del(p->dev, proto, vlan->vid);
> +	}
>  
>  	list_for_each_entry_continue_reverse(p, &br->port_list, list) {
>  		vg = nbp_vlan_group(p);
> -		list_for_each_entry(vlan, &vg->vlan_list, vlist)
> +		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
> +			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
> +				continue;
>  			vlan_vid_del(p->dev, proto, vlan->vid);
> +		}
>  	}
>  
>  	return err;
> ```
>
>> 
>> [0]:
>> 
>> unreferenced object 0xffff8881f6771200 (size 256):
>>   comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
>>   hex dump (first 32 bytes):
>>     00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00  ................
>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>   backtrace:
>>     [<00000000012819ac>] vlan_vid_add+0x437/0x750
>>     [<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
>>     [<000000000632b56f>] br_changelink+0x3d6/0x13f0
>>     [<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
>>     [<00000000f6276baf>] rtnl_newlink+0x5f/0x90
>>     [<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
>>     [<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
>>     [<0000000010588814>] netlink_unicast+0x438/0x710
>>     [<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
>>     [<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
>>     [<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
>>     [<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
>>     [<00000000684f7e25>] __sys_sendmsg+0xab/0x130
>>     [<000000004538b104>] do_syscall_64+0x3d/0x90
>>     [<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
>> 
>> 
>> Regards,
>> Vlad


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bridge VLAN memory leak
  2022-07-05 14:46   ` Vlad Buslov
@ 2022-07-05 15:42     ` Ido Schimmel
  2022-07-05 16:25       ` Vlad Buslov
  0 siblings, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2022-07-05 15:42 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, Maor Dickman, Roopa Prabhu, razor

On Tue, Jul 05, 2022 at 05:46:49PM +0300, Vlad Buslov wrote:
> On Tue 05 Jul 2022 at 16:55, Ido Schimmel <idosch@nvidia.com> wrote:
> > On Mon, Jul 04, 2022 at 06:47:29PM +0300, Vlad Buslov wrote:
> >> Hi Ido,
> >> 
> >> While implementing QinQ offload support in mlx5 I encountered a memory
> >> leak[0] in the bridge implementation which seems to be related to the new
> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag that you have recently added.
> >
> > FTR, added here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=279737939a8194f02fa352ab4476a1b241f44ef4
> >
> >> 
> >> To reproduce the issue netdevice must support bridge VLAN offload, so I
> >> can't provide a simple script that uses veth or anything like that.
> >> Instead, I'll describe the issue step-by-step:
> >> 
> >> 1. Create a bridge, add offload-capable netdevs to it and assign some
> >> VLAN to them. __vlan_vid_add() function will set the
> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag since br_switchdev_port_vlan_add()
> >> should return 0 if dev can offload VLANs and will also skip call to
> >> vlan_vid_add() in such case:
> >> 
> >>         /* Try switchdev op first. In case it is not supported, fallback to
> >>          * 8021q add.
> >>          */
> >>         err = br_switchdev_port_vlan_add(dev, v->vid, flags, false, extack);
> >>         if (err == -EOPNOTSUPP)
> >>                 return vlan_vid_add(dev, br->vlan_proto, v->vid);
> >>         v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV;
> >> 
> >> 2. Enable filtering and set VLAN protocol to 802.1ad. That will trigger
> >> the following code in __br_vlan_set_proto() that re-creates existing
> >> VLANs with vlan_vid_add() function call whether they have the
> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set or not:
> >> 
> >>          /* Add VLANs for the new proto to the device filter. */
> >>          list_for_each_entry(p, &br->port_list, list) {
> >>                  vg = nbp_vlan_group(p);
> >>                  list_for_each_entry(vlan, &vg->vlan_list, vlist) {
> >>                          err = vlan_vid_add(p->dev, proto, vlan->vid);
> >>                          if (err)
> >>                                  goto err_filt;
> >>                  }
> >>          }
> >> 
> >> 3. Now delete the bridge. That will delete all existing VLANs via
> >> __vlan_vid_del() function, which skips calling vlan_vid_del() (that is
> >> necessary to clean up after vlan_vid_add()) if VLAN has
> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set:
> >> 
> >>          /* Try switchdev op first. In case it is not supported, fallback to
> >>           * 8021q del.
> >>           */
> >>          err = br_switchdev_port_vlan_del(dev, v->vid);
> >>          if (!(v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV))
> >>                  vlan_vid_del(dev, br->vlan_proto, v->vid);
> >
> > Looking at the code before the change, I'm pretty sure you will be able
> > to reproduce the leak prior to above mentioned commit:
> >
> > ```
> > -	err = br_switchdev_port_vlan_del(dev, vid);
> > -	if (err == -EOPNOTSUPP) {
> > -		vlan_vid_del(dev, br->vlan_proto, vid);
> > -		return 0;
> > -	}
> > -	return err;
> > ```
> >
> >> 
> >> 
> >> The issue doesn't reproduce for me anymore if I just clear the
> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag when re-creating VLANs on step 2.
> >> However, I'm not sure whether it is the right approach in this case.
> >> WDYT?
> >
> > As a switchdev driver you already know about the new VLAN protocol via
> > 'SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL' so do we really need the VLANs
> > to be programmed again? The VLAN protocol is not communicated in
> > 'SWITCHDEV_OBJ_ID_PORT_VLAN' anyway.
> 
> In my WIP implementation of 802.1ad offload I just re-create all
> existing VLANs in hardware with new protocol upon receiving
> SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL notification.

Would it be easier for you if you got the VLAN protocol in
'SWITCHDEV_OBJ_ID_PORT_VLAN' and __br_vlan_set_proto() would invoke
__vlan_vid_add() and __vlan_vid_del() instead of calling the 8021q
driver directly?

> 
> >
> > Can you try the below (compile tested only)?
> 
> With the patch applied memleak no longer reproduces.
> 
> >
> > ```
> > diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
> > index 6e53dc991409..9ffd40b8270c 100644
> > --- a/net/bridge/br_vlan.c
> > +++ b/net/bridge/br_vlan.c
> > @@ -959,6 +959,8 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
> >  	list_for_each_entry(p, &br->port_list, list) {
> >  		vg = nbp_vlan_group(p);
> >  		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
> > +			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
> > +				continue;
> >  			err = vlan_vid_add(p->dev, proto, vlan->vid);
> >  			if (err)
> >  				goto err_filt;
> > @@ -973,8 +975,11 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
> >  	/* Delete VLANs for the old proto from the device filter. */
> >  	list_for_each_entry(p, &br->port_list, list) {
> >  		vg = nbp_vlan_group(p);
> > -		list_for_each_entry(vlan, &vg->vlan_list, vlist)
> > +		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
> > +			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
> > +				continue;
> >  			vlan_vid_del(p->dev, oldproto, vlan->vid);
> > +		}
> >  	}
> >  
> >  	return 0;
> > @@ -983,13 +988,19 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
> >  	attr.u.vlan_protocol = ntohs(oldproto);
> >  	switchdev_port_attr_set(br->dev, &attr, NULL);
> >  
> > -	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist)
> > +	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist) {
> > +		if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
> > +			continue;
> >  		vlan_vid_del(p->dev, proto, vlan->vid);
> > +	}
> >  
> >  	list_for_each_entry_continue_reverse(p, &br->port_list, list) {
> >  		vg = nbp_vlan_group(p);
> > -		list_for_each_entry(vlan, &vg->vlan_list, vlist)
> > +		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
> > +			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
> > +				continue;
> >  			vlan_vid_del(p->dev, proto, vlan->vid);
> > +		}
> >  	}
> >  
> >  	return err;
> > ```
> >
> >> 
> >> [0]:
> >> 
> >> unreferenced object 0xffff8881f6771200 (size 256):
> >>   comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
> >>   hex dump (first 32 bytes):
> >>     00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00  ................
> >>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>   backtrace:
> >>     [<00000000012819ac>] vlan_vid_add+0x437/0x750
> >>     [<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
> >>     [<000000000632b56f>] br_changelink+0x3d6/0x13f0
> >>     [<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
> >>     [<00000000f6276baf>] rtnl_newlink+0x5f/0x90
> >>     [<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
> >>     [<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
> >>     [<0000000010588814>] netlink_unicast+0x438/0x710
> >>     [<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
> >>     [<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
> >>     [<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
> >>     [<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
> >>     [<00000000684f7e25>] __sys_sendmsg+0xab/0x130
> >>     [<000000004538b104>] do_syscall_64+0x3d/0x90
> >>     [<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
> >> 
> >> 
> >> Regards,
> >> Vlad
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bridge VLAN memory leak
  2022-07-05 15:42     ` Ido Schimmel
@ 2022-07-05 16:25       ` Vlad Buslov
  0 siblings, 0 replies; 5+ messages in thread
From: Vlad Buslov @ 2022-07-05 16:25 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, Maor Dickman, Roopa Prabhu, razor

On Tue 05 Jul 2022 at 18:42, Ido Schimmel <idosch@nvidia.com> wrote:
> On Tue, Jul 05, 2022 at 05:46:49PM +0300, Vlad Buslov wrote:
>> On Tue 05 Jul 2022 at 16:55, Ido Schimmel <idosch@nvidia.com> wrote:
>> > On Mon, Jul 04, 2022 at 06:47:29PM +0300, Vlad Buslov wrote:
>> >> Hi Ido,
>> >> 
>> >> While implementing QinQ offload support in mlx5 I encountered a memory
>> >> leak[0] in the bridge implementation which seems to be related to the new
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag that you have recently added.
>> >
>> > FTR, added here:
>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=279737939a8194f02fa352ab4476a1b241f44ef4
>> >
>> >> 
>> >> To reproduce the issue netdevice must support bridge VLAN offload, so I
>> >> can't provide a simple script that uses veth or anything like that.
>> >> Instead, I'll describe the issue step-by-step:
>> >> 
>> >> 1. Create a bridge, add offload-capable netdevs to it and assign some
>> >> VLAN to them. __vlan_vid_add() function will set the
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag since br_switchdev_port_vlan_add()
>> >> should return 0 if dev can offload VLANs and will also skip call to
>> >> vlan_vid_add() in such case:
>> >> 
>> >>         /* Try switchdev op first. In case it is not supported, fallback to
>> >>          * 8021q add.
>> >>          */
>> >>         err = br_switchdev_port_vlan_add(dev, v->vid, flags, false, extack);
>> >>         if (err == -EOPNOTSUPP)
>> >>                 return vlan_vid_add(dev, br->vlan_proto, v->vid);
>> >>         v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV;
>> >> 
>> >> 2. Enable filtering and set VLAN protocol to 802.1ad. That will trigger
>> >> the following code in __br_vlan_set_proto() that re-creates existing
>> >> VLANs with vlan_vid_add() function call whether they have the
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set or not:
>> >> 
>> >>          /* Add VLANs for the new proto to the device filter. */
>> >>          list_for_each_entry(p, &br->port_list, list) {
>> >>                  vg = nbp_vlan_group(p);
>> >>                  list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>> >>                          err = vlan_vid_add(p->dev, proto, vlan->vid);
>> >>                          if (err)
>> >>                                  goto err_filt;
>> >>                  }
>> >>          }
>> >> 
>> >> 3. Now delete the bridge. That will delete all existing VLANs via
>> >> __vlan_vid_del() function, which skips calling vlan_vid_del() (that is
>> >> necessary to clean up after vlan_vid_add()) if VLAN has
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set:
>> >> 
>> >>          /* Try switchdev op first. In case it is not supported, fallback to
>> >>           * 8021q del.
>> >>           */
>> >>          err = br_switchdev_port_vlan_del(dev, v->vid);
>> >>          if (!(v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV))
>> >>                  vlan_vid_del(dev, br->vlan_proto, v->vid);
>> >
>> > Looking at the code before the change, I'm pretty sure you will be able
>> > to reproduce the leak prior to above mentioned commit:
>> >
>> > ```
>> > -	err = br_switchdev_port_vlan_del(dev, vid);
>> > -	if (err == -EOPNOTSUPP) {
>> > -		vlan_vid_del(dev, br->vlan_proto, vid);
>> > -		return 0;
>> > -	}
>> > -	return err;
>> > ```
>> >
>> >> 
>> >> 
>> >> The issue doesn't reproduce for me anymore if I just clear the
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag when re-creating VLANs on step 2.
>> >> However, I'm not sure whether it is the right approach in this case.
>> >> WDYT?
>> >
>> > As a switchdev driver you already know about the new VLAN protocol via
>> > 'SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL' so do we really need the VLANs
>> > to be programmed again? The VLAN protocol is not communicated in
>> > 'SWITCHDEV_OBJ_ID_PORT_VLAN' anyway.
>> 
>> In my WIP implementation of 802.1ad offload I just re-create all
>> existing VLANs in hardware with new protocol upon receiving
>> SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL notification.
>
> Would it be easier for you if you got the VLAN protocol in
> 'SWITCHDEV_OBJ_ID_PORT_VLAN' and __br_vlan_set_proto() would invoke
> __vlan_vid_add() and __vlan_vid_del() instead of calling the 8021q
> driver directly?

For me it is easy to iterate and re-create existing VLANs with new
protocol inside the driver since I already have all the necessary
structures for that. Also, with current architecture I pre-create
required flow groups based on current bridge VLAN proto (during new
bridge creation or on reception of
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL event), so having protocol inside
switchdev_obj_port_vlan will be redundant.

>
>> 
>> >
>> > Can you try the below (compile tested only)?
>> 
>> With the patch applied memleak no longer reproduces.
>> 
>> >
>> > ```
>> > diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
>> > index 6e53dc991409..9ffd40b8270c 100644
>> > --- a/net/bridge/br_vlan.c
>> > +++ b/net/bridge/br_vlan.c
>> > @@ -959,6 +959,8 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
>> >  	list_for_each_entry(p, &br->port_list, list) {
>> >  		vg = nbp_vlan_group(p);
>> >  		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>> > +			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
>> > +				continue;
>> >  			err = vlan_vid_add(p->dev, proto, vlan->vid);
>> >  			if (err)
>> >  				goto err_filt;
>> > @@ -973,8 +975,11 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
>> >  	/* Delete VLANs for the old proto from the device filter. */
>> >  	list_for_each_entry(p, &br->port_list, list) {
>> >  		vg = nbp_vlan_group(p);
>> > -		list_for_each_entry(vlan, &vg->vlan_list, vlist)
>> > +		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>> > +			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
>> > +				continue;
>> >  			vlan_vid_del(p->dev, oldproto, vlan->vid);
>> > +		}
>> >  	}
>> >  
>> >  	return 0;
>> > @@ -983,13 +988,19 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
>> >  	attr.u.vlan_protocol = ntohs(oldproto);
>> >  	switchdev_port_attr_set(br->dev, &attr, NULL);
>> >  
>> > -	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist)
>> > +	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist) {
>> > +		if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
>> > +			continue;
>> >  		vlan_vid_del(p->dev, proto, vlan->vid);
>> > +	}
>> >  
>> >  	list_for_each_entry_continue_reverse(p, &br->port_list, list) {
>> >  		vg = nbp_vlan_group(p);
>> > -		list_for_each_entry(vlan, &vg->vlan_list, vlist)
>> > +		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>> > +			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
>> > +				continue;
>> >  			vlan_vid_del(p->dev, proto, vlan->vid);
>> > +		}
>> >  	}
>> >  
>> >  	return err;
>> > ```
>> >
>> >> 
>> >> [0]:
>> >> 
>> >> unreferenced object 0xffff8881f6771200 (size 256):
>> >>   comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
>> >>   hex dump (first 32 bytes):
>> >>     00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00  ................
>> >>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>> >>   backtrace:
>> >>     [<00000000012819ac>] vlan_vid_add+0x437/0x750
>> >>     [<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
>> >>     [<000000000632b56f>] br_changelink+0x3d6/0x13f0
>> >>     [<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
>> >>     [<00000000f6276baf>] rtnl_newlink+0x5f/0x90
>> >>     [<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
>> >>     [<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
>> >>     [<0000000010588814>] netlink_unicast+0x438/0x710
>> >>     [<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
>> >>     [<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
>> >>     [<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
>> >>     [<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
>> >>     [<00000000684f7e25>] __sys_sendmsg+0xab/0x130
>> >>     [<000000004538b104>] do_syscall_64+0x3d/0x90
>> >>     [<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
>> >> 
>> >> 
>> >> Regards,
>> >> Vlad
>> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-07-05 16:40 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-07-04 15:47 Bridge VLAN memory leak Vlad Buslov
2022-07-05 13:55 ` Ido Schimmel
2022-07-05 14:46   ` Vlad Buslov
2022-07-05 15:42     ` Ido Schimmel
2022-07-05 16:25       ` Vlad Buslov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).