* Re: [PATCH net] net: bridge: fix early call to br_stp_change_bridge_id
2017-12-16 11:31 ` [PATCH net] net: bridge: fix early call to br_stp_change_bridge_id Nikolay Aleksandrov
@ 2017-12-16 12:38 ` Nikolay Aleksandrov
2017-12-16 18:32 ` Stephen Hemminger
2017-12-18 2:24 ` Toshiaki Makita
2 siblings, 0 replies; 13+ messages in thread
From: Nikolay Aleksandrov @ 2017-12-16 12:38 UTC (permalink / raw)
To: netdev; +Cc: roopa, davem, makita.toshiaki, stephen, avagin, bridge
On 16/12/17 13:31, Nikolay Aleksandrov wrote:
> The early call to br_stp_change_bridge_id in bridge's newlink can cause
> a memory leak if an error occurs during the newlink because the fdb
> entries are not cleaned up if a different lladdr was specified, also
> another minor issue is that it generates fdb notifications with
> ifindex = 0. To remove this special case the call is done after netdev
> register and we cleanup any bridge fdb entries on changelink error.
> That also doesn't slow down normal bridge removal, alternative is to call
> it in its ndo_uninit.
>
> To reproduce the issue:
> $ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1
> RTNETLINK answers: Invalid argument
>
> $ rmmod bridge
> [ 1822.142525] =============================================================================
> [ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown()
> [ 1822.144821] -----------------------------------------------------------------------------
>
> [ 1822.145990] Disabling lock debugging due to kernel taint
> [ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100
> [ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G B O 4.15.0-rc2+ #87
> [ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> [ 1822.150008] Call Trace:
> [ 1822.150510] dump_stack+0x78/0xa9
> [ 1822.151156] slab_err+0xb1/0xd3
> [ 1822.151834] ? __kmalloc+0x1bb/0x1ce
> [ 1822.152546] __kmem_cache_shutdown+0x151/0x28b
> [ 1822.153395] shutdown_cache+0x13/0x144
> [ 1822.154126] kmem_cache_destroy+0x1c0/0x1fb
> [ 1822.154669] SyS_delete_module+0x194/0x244
> [ 1822.155199] ? trace_hardirqs_on_thunk+0x1a/0x1c
> [ 1822.155773] entry_SYSCALL_64_fastpath+0x23/0x9a
> [ 1822.156343] RIP: 0033:0x7f929bd38b17
> [ 1822.156859] RSP: 002b:00007ffd160e9a98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
> [ 1822.157728] RAX: ffffffffffffffda RBX: 00005578316ba090 RCX: 00007f929bd38b17
> [ 1822.158422] RDX: 00007f929bd9ec60 RSI: 0000000000000800 RDI: 00005578316ba0f0
> [ 1822.159114] RBP: 0000000000000003 R08: 00007f929bff5f20 R09: 00007ffd160e8a11
> [ 1822.159808] R10: 00007ffd160e9860 R11: 0000000000000202 R12: 00007ffd160e8a80
> [ 1822.160513] R13: 0000000000000000 R14: 0000000000000000 R15: 00005578316ba090
> [ 1822.161278] INFO: Object 0x000000007645de29 @offset=0
> [ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128
>
> Fixes: a4b816d8ba1c ("bridge: Change local fdb entries whenever mac address of bridge device changes")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> ---
> Consequently this also would fix the null ptr deref due to the rhashtable
> not being initialized in net-next when br_stp_change_bridge_id is called.
>
> Toshiaki, any reason you called br_stp_change_bridge_id before
> register_netdevice when you introduced it in 30313a3d5794 ?
>
> net/bridge/br_netlink.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
Oops, wrong Fixes commit id, this one was introduced in Feb/2014, should be:
Fixes: 30313a3d5794 ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device")
as I said before since that was introduced in Apr/2014.
Dave, would you like me to send v2 with the proper Fixes tag ?
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH net] net: bridge: fix early call to br_stp_change_bridge_id
2017-12-16 11:31 ` [PATCH net] net: bridge: fix early call to br_stp_change_bridge_id Nikolay Aleksandrov
2017-12-16 12:38 ` Nikolay Aleksandrov
@ 2017-12-16 18:32 ` Stephen Hemminger
2017-12-18 2:24 ` Toshiaki Makita
2 siblings, 0 replies; 13+ messages in thread
From: Stephen Hemminger @ 2017-12-16 18:32 UTC (permalink / raw)
To: Nikolay Aleksandrov; +Cc: netdev, roopa, bridge, avagin, davem
On Sat, 16 Dec 2017 13:31:36 +0200
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> The early call to br_stp_change_bridge_id in bridge's newlink can cause
> a memory leak if an error occurs during the newlink because the fdb
> entries are not cleaned up if a different lladdr was specified, also
> another minor issue is that it generates fdb notifications with
> ifindex = 0. To remove this special case the call is done after netdev
> register and we cleanup any bridge fdb entries on changelink error.
> That also doesn't slow down normal bridge removal, alternative is to call
> it in its ndo_uninit.
>
> To reproduce the issue:
> $ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1
> RTNETLINK answers: Invalid argument
>
> $ rmmod bridge
> [ 1822.142525] =============================================================================
> [ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown()
> [ 1822.144821] -----------------------------------------------------------------------------
>
> [ 1822.145990] Disabling lock debugging due to kernel taint
> [ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100
> [ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G B O 4.15.0-rc2+ #87
> [ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> [ 1822.150008] Call Trace:
> [ 1822.150510] dump_stack+0x78/0xa9
> [ 1822.151156] slab_err+0xb1/0xd3
> [ 1822.151834] ? __kmalloc+0x1bb/0x1ce
> [ 1822.152546] __kmem_cache_shutdown+0x151/0x28b
> [ 1822.153395] shutdown_cache+0x13/0x144
> [ 1822.154126] kmem_cache_destroy+0x1c0/0x1fb
> [ 1822.154669] SyS_delete_module+0x194/0x244
> [ 1822.155199] ? trace_hardirqs_on_thunk+0x1a/0x1c
> [ 1822.155773] entry_SYSCALL_64_fastpath+0x23/0x9a
> [ 1822.156343] RIP: 0033:0x7f929bd38b17
> [ 1822.156859] RSP: 002b:00007ffd160e9a98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
> [ 1822.157728] RAX: ffffffffffffffda RBX: 00005578316ba090 RCX: 00007f929bd38b17
> [ 1822.158422] RDX: 00007f929bd9ec60 RSI: 0000000000000800 RDI: 00005578316ba0f0
> [ 1822.159114] RBP: 0000000000000003 R08: 00007f929bff5f20 R09: 00007ffd160e8a11
> [ 1822.159808] R10: 00007ffd160e9860 R11: 0000000000000202 R12: 00007ffd160e8a80
> [ 1822.160513] R13: 0000000000000000 R14: 0000000000000000 R15: 00005578316ba090
> [ 1822.161278] INFO: Object 0x000000007645de29 @offset=0
> [ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128
>
> Fixes: a4b816d8ba1c ("bridge: Change local fdb entries whenever mac address of bridge device changes")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> ---
> Consequently this also would fix the null ptr deref due to the rhashtable
> not being initialized in net-next when br_stp_change_bridge_id is called.
>
> Toshiaki, any reason you called br_stp_change_bridge_id before
> register_netdevice when you introduced it in 30313a3d5794 ?
>
> net/bridge/br_netlink.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
Thanks for working on this.
I agree that fixing this in ndo_uninit would be wrong. There are less bugs
if init and uninit do logically equivalent steps.
A bridge device can be created either with netlink or ioctl.
This change is also makes both ways of adding MAC have
the same semantics;
If bridge is created with ioctl then the bridge_id (and MAC)
will not be changed until later device is added or MAC address
is set by other operation.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH net] net: bridge: fix early call to br_stp_change_bridge_id
2017-12-16 11:31 ` [PATCH net] net: bridge: fix early call to br_stp_change_bridge_id Nikolay Aleksandrov
2017-12-16 12:38 ` Nikolay Aleksandrov
2017-12-16 18:32 ` Stephen Hemminger
@ 2017-12-18 2:24 ` Toshiaki Makita
2017-12-18 14:22 ` Nikolay Aleksandrov
2 siblings, 1 reply; 13+ messages in thread
From: Toshiaki Makita @ 2017-12-18 2:24 UTC (permalink / raw)
To: Nikolay Aleksandrov, netdev; +Cc: bridge, roopa, avagin, davem
On 2017/12/16 20:31, Nikolay Aleksandrov wrote:
> The early call to br_stp_change_bridge_id in bridge's newlink can cause
> a memory leak if an error occurs during the newlink because the fdb
> entries are not cleaned up if a different lladdr was specified, also
> another minor issue is that it generates fdb notifications with
> ifindex = 0. To remove this special case the call is done after netdev
> register and we cleanup any bridge fdb entries on changelink error.
> That also doesn't slow down normal bridge removal, alternative is to call
> it in its ndo_uninit.
...
> Fixes: a4b816d8ba1c ("bridge: Change local fdb entries whenever mac address of bridge device changes")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> ---
> Consequently this also would fix the null ptr deref due to the rhashtable
> not being initialized in net-next when br_stp_change_bridge_id is called.
>
> Toshiaki, any reason you called br_stp_change_bridge_id before
> register_netdevice when you introduced it in 30313a3d5794 ?
Thank you for taking care of this.
It's my bad. I just missed the problem.
> net/bridge/br_netlink.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index d0ef0a8e8831..b0362cadb7c8 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -1262,19 +1262,23 @@ static int br_dev_newlink(struct net *src_net, struct net_device *dev,
...
> err = br_changelink(dev, tb, data, extack);
> - if (err)
> + if (err) {
> + /* clean possible fdbs from br_stp_change_bridge_id above */
> + br_fdb_delete_by_port(br, NULL, 0, 1);
Don't we need to call br_dev_delete (br_link_ops.dellink) after
successful register instead of br_fdb_delete?
Particularly I'm wondering if not calling br_sysfs_delbr() is ok or not.
--
Toshiaki Makita
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH net] net: bridge: fix early call to br_stp_change_bridge_id
2017-12-18 2:24 ` Toshiaki Makita
@ 2017-12-18 14:22 ` Nikolay Aleksandrov
2017-12-18 14:23 ` Nikolay Aleksandrov
0 siblings, 1 reply; 13+ messages in thread
From: Nikolay Aleksandrov @ 2017-12-18 14:22 UTC (permalink / raw)
To: Toshiaki Makita, netdev; +Cc: roopa, davem, stephen, avagin, bridge
On 12/18/2017 04:24 AM, Toshiaki Makita wrote:
> On 2017/12/16 20:31, Nikolay Aleksandrov wrote:
[snip]
> ...
>> err = br_changelink(dev, tb, data, extack);
>> - if (err)
>> + if (err) {
>> + /* clean possible fdbs from br_stp_change_bridge_id above */
>> + br_fdb_delete_by_port(br, NULL, 0, 1);
>
> Don't we need to call br_dev_delete (br_link_ops.dellink) after
> successful register instead of br_fdb_delete?
> Particularly I'm wondering if not calling br_sysfs_delbr() is ok or not.
>
Funny, that is actually the only reason we need to call it (br_sysfs_delbr). :-)
Good catch, that is another leak - the bridge sysfs entries are registered when
NETDEV_REGISTER event happens (register_netdevice) but are not properly cleaned up
on error there. This has also been present since the introduction of changelink
during newlink, commit:
b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
I'll send v2 that does br_dev_delete(dev, NULL) instead of the current cleanup.
With kobject debug enabled and that I can see "brif" and the rest of the sysfs
files getting freed properly, while before they weren't.
Thanks,
Nik
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH net] net: bridge: fix early call to br_stp_change_bridge_id
2017-12-18 14:22 ` Nikolay Aleksandrov
@ 2017-12-18 14:23 ` Nikolay Aleksandrov
2017-12-18 15:35 ` [PATCH net v2] net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks Nikolay Aleksandrov
0 siblings, 1 reply; 13+ messages in thread
From: Nikolay Aleksandrov @ 2017-12-18 14:23 UTC (permalink / raw)
To: Toshiaki Makita, netdev; +Cc: bridge, roopa, avagin, davem
On 12/18/2017 04:22 PM, Nikolay Aleksandrov wrote:
> On 12/18/2017 04:24 AM, Toshiaki Makita wrote:
>> On 2017/12/16 20:31, Nikolay Aleksandrov wrote:
> [snip]
>> ...
>>> err = br_changelink(dev, tb, data, extack);
>>> - if (err)
>>> + if (err) {
>>> + /* clean possible fdbs from br_stp_change_bridge_id above */
>>> + br_fdb_delete_by_port(br, NULL, 0, 1);
>>
>> Don't we need to call br_dev_delete (br_link_ops.dellink) after
>> successful register instead of br_fdb_delete?
>> Particularly I'm wondering if not calling br_sysfs_delbr() is ok or not.
>>
>
> Funny, that is actually the only reason we need to call it (br_sysfs_delbr). :-)
>
> Good catch, that is another leak - the bridge sysfs entries are registered when
> NETDEV_REGISTER event happens (register_netdevice) but are not properly cleaned up
> on error there. This has also been present since the introduction of changelink
> during newlink, commit:
> b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
err, since the changelink was fixed to be after device registration in commit:
5b8d5429daa0 ("bridge: netlink: register netdevice before executing changelink")
>
> I'll send v2 that does br_dev_delete(dev, NULL) instead of the current cleanup.
> With kobject debug enabled and that I can see "brif" and the rest of the sysfs
> files getting freed properly, while before they weren't.
>
> Thanks,
> Nik
>
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH net v2] net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks
2017-12-18 14:23 ` Nikolay Aleksandrov
@ 2017-12-18 15:35 ` Nikolay Aleksandrov
2017-12-18 18:31 ` David Miller
0 siblings, 1 reply; 13+ messages in thread
From: Nikolay Aleksandrov @ 2017-12-18 15:35 UTC (permalink / raw)
To: netdev
Cc: makita.toshiaki, roopa, davem, stephen, avagin, bridge, idosch,
Nikolay Aleksandrov
The early call to br_stp_change_bridge_id in bridge's newlink can cause
a memory leak if an error occurs during the newlink because the fdb
entries are not cleaned up if a different lladdr was specified, also
another minor issue is that it generates fdb notifications with
ifindex = 0. Another unrelated memory leak is the bridge sysfs entries
which get added on NETDEV_REGISTER event, but are not cleaned up in the
newlink error path. To remove this special case the call to
br_stp_change_bridge_id is done after netdev register and we cleanup the
bridge on changelink error via br_dev_delete to plug all leaks.
This patch makes netlink bridge destruction on newlink error the same as
dellink and ioctl del which is necessary since at that point we have a
fully initialized bridge device.
To reproduce the issue:
$ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1
RTNETLINK answers: Invalid argument
$ rmmod bridge
[ 1822.142525] =============================================================================
[ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown()
[ 1822.144821] -----------------------------------------------------------------------------
[ 1822.145990] Disabling lock debugging due to kernel taint
[ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100
[ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G B O 4.15.0-rc2+ #87
[ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 1822.150008] Call Trace:
[ 1822.150510] dump_stack+0x78/0xa9
[ 1822.151156] slab_err+0xb1/0xd3
[ 1822.151834] ? __kmalloc+0x1bb/0x1ce
[ 1822.152546] __kmem_cache_shutdown+0x151/0x28b
[ 1822.153395] shutdown_cache+0x13/0x144
[ 1822.154126] kmem_cache_destroy+0x1c0/0x1fb
[ 1822.154669] SyS_delete_module+0x194/0x244
[ 1822.155199] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1822.155773] entry_SYSCALL_64_fastpath+0x23/0x9a
[ 1822.156343] RIP: 0033:0x7f929bd38b17
[ 1822.156859] RSP: 002b:00007ffd160e9a98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
[ 1822.157728] RAX: ffffffffffffffda RBX: 00005578316ba090 RCX: 00007f929bd38b17
[ 1822.158422] RDX: 00007f929bd9ec60 RSI: 0000000000000800 RDI: 00005578316ba0f0
[ 1822.159114] RBP: 0000000000000003 R08: 00007f929bff5f20 R09: 00007ffd160e8a11
[ 1822.159808] R10: 00007ffd160e9860 R11: 0000000000000202 R12: 00007ffd160e8a80
[ 1822.160513] R13: 0000000000000000 R14: 0000000000000000 R15: 00005578316ba090
[ 1822.161278] INFO: Object 0x000000007645de29 @offset=0
[ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128
Fixes: 30313a3d5794 ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device")
Fixes: 5b8d5429daa0 ("bridge: netlink: register netdevice before executing changelink")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
v2: it's better to use br_dev_delete to clean up because the sysfs entries
were leaked as well, basically we use it for the bridge fdb flush and sysfs
entries delete
Consequently this also would fix the null ptr deref due to the rhashtable
not being initialized in net-next when br_stp_change_bridge_id is called.
net/bridge/br_netlink.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index d0ef0a8e8831..015f465c514b 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1262,19 +1262,20 @@ static int br_dev_newlink(struct net *src_net, struct net_device *dev,
struct net_bridge *br = netdev_priv(dev);
int err;
+ err = register_netdevice(dev);
+ if (err)
+ return err;
+
if (tb[IFLA_ADDRESS]) {
spin_lock_bh(&br->lock);
br_stp_change_bridge_id(br, nla_data(tb[IFLA_ADDRESS]));
spin_unlock_bh(&br->lock);
}
- err = register_netdevice(dev);
- if (err)
- return err;
-
err = br_changelink(dev, tb, data, extack);
if (err)
- unregister_netdevice(dev);
+ br_dev_delete(dev, NULL);
+
return err;
}
--
2.14.3
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH net v2] net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks
2017-12-18 15:35 ` [PATCH net v2] net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks Nikolay Aleksandrov
@ 2017-12-18 18:31 ` David Miller
0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2017-12-18 18:31 UTC (permalink / raw)
To: nikolay; +Cc: netdev, makita.toshiaki, roopa, stephen, avagin, bridge, idosch
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Mon, 18 Dec 2017 17:35:09 +0200
> The early call to br_stp_change_bridge_id in bridge's newlink can cause
> a memory leak if an error occurs during the newlink because the fdb
> entries are not cleaned up if a different lladdr was specified, also
> another minor issue is that it generates fdb notifications with
> ifindex = 0. Another unrelated memory leak is the bridge sysfs entries
> which get added on NETDEV_REGISTER event, but are not cleaned up in the
> newlink error path. To remove this special case the call to
> br_stp_change_bridge_id is done after netdev register and we cleanup the
> bridge on changelink error via br_dev_delete to plug all leaks.
>
> This patch makes netlink bridge destruction on newlink error the same as
> dellink and ioctl del which is necessary since at that point we have a
> fully initialized bridge device.
>
> To reproduce the issue:
> $ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1
> RTNETLINK answers: Invalid argument
>
> $ rmmod bridge
> [ 1822.142525] =============================================================================
> [ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown()
> [ 1822.144821] -----------------------------------------------------------------------------
...
> Fixes: 30313a3d5794 ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device")
> Fixes: 5b8d5429daa0 ("bridge: netlink: register netdevice before executing changelink")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> ---
> v2: it's better to use br_dev_delete to clean up because the sysfs entries
> were leaked as well, basically we use it for the bridge fdb flush and sysfs
> entries delete
>
> Consequently this also would fix the null ptr deref due to the rhashtable
> not being initialized in net-next when br_stp_change_bridge_id is called.
Applied, but doing state changes after register_netdev() is always dicey.
I hope things go well if the device is brought up immediately during
that register_netdev() call and before the stp bridge ID setting is
done.
^ permalink raw reply [flat|nested] 13+ messages in thread