From: Vlad Buslov <vladbu@nvidia.com>
To: Ido Schimmel <idosch@nvidia.com>
Cc: <netdev@vger.kernel.org>, Maor Dickman <maord@nvidia.com>,
Roopa Prabhu <roopa@nvidia.com>, <razor@blackwall.org>
Subject: Re: Bridge VLAN memory leak
Date: Tue, 5 Jul 2022 19:25:14 +0300 [thread overview]
Message-ID: <871quzbji8.fsf@nvidia.com> (raw)
In-Reply-To: <YsRb/cJJ+zSKQkdD@shredder>
On Tue 05 Jul 2022 at 18:42, Ido Schimmel <idosch@nvidia.com> wrote:
> On Tue, Jul 05, 2022 at 05:46:49PM +0300, Vlad Buslov wrote:
>> On Tue 05 Jul 2022 at 16:55, Ido Schimmel <idosch@nvidia.com> wrote:
>> > On Mon, Jul 04, 2022 at 06:47:29PM +0300, Vlad Buslov wrote:
>> >> Hi Ido,
>> >>
>> >> While implementing QinQ offload support in mlx5 I encountered a memory
>> >> leak[0] in the bridge implementation which seems to be related to the new
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag that you have recently added.
>> >
>> > FTR, added here:
>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=279737939a8194f02fa352ab4476a1b241f44ef4
>> >
>> >>
>> >> To reproduce the issue netdevice must support bridge VLAN offload, so I
>> >> can't provide a simple script that uses veth or anything like that.
>> >> Instead, I'll describe the issue step-by-step:
>> >>
>> >> 1. Create a bridge, add offload-capable netdevs to it and assign some
>> >> VLAN to them. __vlan_vid_add() function will set the
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag since br_switchdev_port_vlan_add()
>> >> should return 0 if dev can offload VLANs and will also skip call to
>> >> vlan_vid_add() in such case:
>> >>
>> >> /* Try switchdev op first. In case it is not supported, fallback to
>> >> * 8021q add.
>> >> */
>> >> err = br_switchdev_port_vlan_add(dev, v->vid, flags, false, extack);
>> >> if (err == -EOPNOTSUPP)
>> >> return vlan_vid_add(dev, br->vlan_proto, v->vid);
>> >> v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV;
>> >>
>> >> 2. Enable filtering and set VLAN protocol to 802.1ad. That will trigger
>> >> the following code in __br_vlan_set_proto() that re-creates existing
>> >> VLANs with vlan_vid_add() function call whether they have the
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set or not:
>> >>
>> >> /* Add VLANs for the new proto to the device filter. */
>> >> list_for_each_entry(p, &br->port_list, list) {
>> >> vg = nbp_vlan_group(p);
>> >> list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>> >> err = vlan_vid_add(p->dev, proto, vlan->vid);
>> >> if (err)
>> >> goto err_filt;
>> >> }
>> >> }
>> >>
>> >> 3. Now delete the bridge. That will delete all existing VLANs via
>> >> __vlan_vid_del() function, which skips calling vlan_vid_del() (that is
>> >> necessary to clean up after vlan_vid_add()) if VLAN has
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set:
>> >>
>> >> /* Try switchdev op first. In case it is not supported, fallback to
>> >> * 8021q del.
>> >> */
>> >> err = br_switchdev_port_vlan_del(dev, v->vid);
>> >> if (!(v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV))
>> >> vlan_vid_del(dev, br->vlan_proto, v->vid);
>> >
>> > Looking at the code before the change, I'm pretty sure you will be able
>> > to reproduce the leak prior to above mentioned commit:
>> >
>> > ```
>> > - err = br_switchdev_port_vlan_del(dev, vid);
>> > - if (err == -EOPNOTSUPP) {
>> > - vlan_vid_del(dev, br->vlan_proto, vid);
>> > - return 0;
>> > - }
>> > - return err;
>> > ```
>> >
>> >>
>> >>
>> >> The issue doesn't reproduce for me anymore if I just clear the
>> >> BR_VLFLAG_ADDED_BY_SWITCHDEV flag when re-creating VLANs on step 2.
>> >> However, I'm not sure whether it is the right approach in this case.
>> >> WDYT?
>> >
>> > As a switchdev driver you already know about the new VLAN protocol via
>> > 'SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL' so do we really need the VLANs
>> > to be programmed again? The VLAN protocol is not communicated in
>> > 'SWITCHDEV_OBJ_ID_PORT_VLAN' anyway.
>>
>> In my WIP implementation of 802.1ad offload I just re-create all
>> existing VLANs in hardware with new protocol upon receiving
>> SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL notification.
>
> Would it be easier for you if you got the VLAN protocol in
> 'SWITCHDEV_OBJ_ID_PORT_VLAN' and __br_vlan_set_proto() would invoke
> __vlan_vid_add() and __vlan_vid_del() instead of calling the 8021q
> driver directly?
For me it is easy to iterate and re-create existing VLANs with new
protocol inside the driver since I already have all the necessary
structures for that. Also, with current architecture I pre-create
required flow groups based on current bridge VLAN proto (during new
bridge creation or on reception of
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL event), so having protocol inside
switchdev_obj_port_vlan will be redundant.
>
>>
>> >
>> > Can you try the below (compile tested only)?
>>
>> With the patch applied memleak no longer reproduces.
>>
>> >
>> > ```
>> > diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
>> > index 6e53dc991409..9ffd40b8270c 100644
>> > --- a/net/bridge/br_vlan.c
>> > +++ b/net/bridge/br_vlan.c
>> > @@ -959,6 +959,8 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
>> > list_for_each_entry(p, &br->port_list, list) {
>> > vg = nbp_vlan_group(p);
>> > list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>> > + if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
>> > + continue;
>> > err = vlan_vid_add(p->dev, proto, vlan->vid);
>> > if (err)
>> > goto err_filt;
>> > @@ -973,8 +975,11 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
>> > /* Delete VLANs for the old proto from the device filter. */
>> > list_for_each_entry(p, &br->port_list, list) {
>> > vg = nbp_vlan_group(p);
>> > - list_for_each_entry(vlan, &vg->vlan_list, vlist)
>> > + list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>> > + if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
>> > + continue;
>> > vlan_vid_del(p->dev, oldproto, vlan->vid);
>> > + }
>> > }
>> >
>> > return 0;
>> > @@ -983,13 +988,19 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
>> > attr.u.vlan_protocol = ntohs(oldproto);
>> > switchdev_port_attr_set(br->dev, &attr, NULL);
>> >
>> > - list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist)
>> > + list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist) {
>> > + if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
>> > + continue;
>> > vlan_vid_del(p->dev, proto, vlan->vid);
>> > + }
>> >
>> > list_for_each_entry_continue_reverse(p, &br->port_list, list) {
>> > vg = nbp_vlan_group(p);
>> > - list_for_each_entry(vlan, &vg->vlan_list, vlist)
>> > + list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>> > + if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
>> > + continue;
>> > vlan_vid_del(p->dev, proto, vlan->vid);
>> > + }
>> > }
>> >
>> > return err;
>> > ```
>> >
>> >>
>> >> [0]:
>> >>
>> >> unreferenced object 0xffff8881f6771200 (size 256):
>> >> comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
>> >> hex dump (first 32 bytes):
>> >> 00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00 ................
>> >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> >> backtrace:
>> >> [<00000000012819ac>] vlan_vid_add+0x437/0x750
>> >> [<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
>> >> [<000000000632b56f>] br_changelink+0x3d6/0x13f0
>> >> [<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
>> >> [<00000000f6276baf>] rtnl_newlink+0x5f/0x90
>> >> [<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
>> >> [<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
>> >> [<0000000010588814>] netlink_unicast+0x438/0x710
>> >> [<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
>> >> [<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
>> >> [<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
>> >> [<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
>> >> [<00000000684f7e25>] __sys_sendmsg+0xab/0x130
>> >> [<000000004538b104>] do_syscall_64+0x3d/0x90
>> >> [<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
>> >>
>> >>
>> >> Regards,
>> >> Vlad
>>
prev parent reply other threads:[~2022-07-05 16:40 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-04 15:47 Bridge VLAN memory leak Vlad Buslov
2022-07-05 13:55 ` Ido Schimmel
2022-07-05 14:46 ` Vlad Buslov
2022-07-05 15:42 ` Ido Schimmel
2022-07-05 16:25 ` Vlad Buslov [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=871quzbji8.fsf@nvidia.com \
--to=vladbu@nvidia.com \
--cc=idosch@nvidia.com \
--cc=maord@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=razor@blackwall.org \
--cc=roopa@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.