* [PATCH bpf-next V3] net/xdp: Fix suspicious RCU usage warning
From: Tariq Toukan @ 2018-08-13 9:21 UTC
To: Alexei Starovoitov, Daniel Borkmann
Cc: netdev, Eran Ben Elisha, Tariq Toukan, Jesper Dangaard Brouer
Fix the warning below by calling rhashtable_lookup_fast(), which takes
the RCU read lock itself.
Also, restructure the code slightly for better readability.
[ 342.450870] WARNING: suspicious RCU usage
[ 342.455856] 4.18.0-rc2+ #17 Tainted: G O
[ 342.462210] -----------------------------
[ 342.467202] ./include/linux/rhashtable.h:481 suspicious rcu_dereference_check() usage!
[ 342.476568]
[ 342.476568] other info that might help us debug this:
[ 342.476568]
[ 342.486978]
[ 342.486978] rcu_scheduler_active = 2, debug_locks = 1
[ 342.495211] 4 locks held by modprobe/3934:
[ 342.500265] #0: 00000000e23116b2 (mlx5_intf_mutex){+.+.}, at: mlx5_unregister_interface+0x18/0x90 [mlx5_core]
[ 342.511953] #1: 00000000ca16db96 (rtnl_mutex){+.+.}, at: unregister_netdev+0xe/0x20
[ 342.521109] #2: 00000000a46e2c4b (&priv->state_lock){+.+.}, at: mlx5e_close+0x29/0x60 [mlx5_core]
[ 342.531642] #3: 0000000060c5bde3 (mem_id_lock){+.+.}, at: xdp_rxq_info_unreg+0x93/0x6b0
[ 342.541206]
[ 342.541206] stack backtrace:
[ 342.547075] CPU: 12 PID: 3934 Comm: modprobe Tainted: G O 4.18.0-rc2+ #17
[ 342.556621] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
[ 342.565606] Call Trace:
[ 342.568861] dump_stack+0x78/0xb3
[ 342.573086] xdp_rxq_info_unreg+0x3f5/0x6b0
[ 342.578285] ? __call_rcu+0x220/0x300
[ 342.582911] mlx5e_free_rq+0x38/0xc0 [mlx5_core]
[ 342.588602] mlx5e_close_channel+0x20/0x120 [mlx5_core]
[ 342.594976] mlx5e_close_channels+0x26/0x40 [mlx5_core]
[ 342.601345] mlx5e_close_locked+0x44/0x50 [mlx5_core]
[ 342.607519] mlx5e_close+0x42/0x60 [mlx5_core]
[ 342.613005] __dev_close_many+0xb1/0x120
[ 342.617911] dev_close_many+0xa2/0x170
[ 342.622622] rollback_registered_many+0x148/0x460
[ 342.628401] ? __lock_acquire+0x48d/0x11b0
[ 342.633498] ? unregister_netdev+0xe/0x20
[ 342.638495] rollback_registered+0x56/0x90
[ 342.643588] unregister_netdevice_queue+0x7e/0x100
[ 342.649461] unregister_netdev+0x18/0x20
[ 342.654362] mlx5e_remove+0x2a/0x50 [mlx5_core]
[ 342.659944] mlx5_remove_device+0xe5/0x110 [mlx5_core]
[ 342.666208] mlx5_unregister_interface+0x39/0x90 [mlx5_core]
[ 342.673038] cleanup+0x5/0xbfc [mlx5_core]
[ 342.678094] __x64_sys_delete_module+0x16b/0x240
[ 342.683725] ? do_syscall_64+0x1c/0x210
[ 342.688476] do_syscall_64+0x5a/0x210
[ 342.693025] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
---
net/core/xdp.c | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
V2 -> V3:
* Fix return value test for rhashtable_remove_fast, per Jesper's comment.
V1 -> V2:
* Use rhashtable_lookup_fast and make some code movements, per Daniel's
and Alexei's comments.
Please queue to -stable v4.18.
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 3dd99e1c04f5..8b1c7b699982 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -105,16 +105,9 @@ static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
 
 	mutex_lock(&mem_id_lock);
 
-	xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
-	if (!xa) {
-		mutex_unlock(&mem_id_lock);
-		return;
-	}
-
-	err = rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
-	WARN_ON(err);
-
-	call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
+	xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
+	if (xa && !rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
+		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
 
 	mutex_unlock(&mem_id_lock);
 }
--
1.8.3.1
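
The difference between the two lookup variants that this patch relies on can be sketched with a small userspace mock-up. This is not kernel code: the counter-based rcu_read_lock()/rcu_read_unlock() stubs, struct entry, the suspicious_warnings counter, and the mock_lookup()/mock_lookup_fast() helpers are all hypothetical stand-ins. The only assumption carried over from the thread is that the plain lookup expects the caller to already be inside an RCU read-side section, while the _fast variant takes the read lock itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the RCU read-side primitives. */
static int rcu_depth;            /* nesting depth of read-side sections */
static int suspicious_warnings;  /* stands in for the lockdep splat */

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

struct entry { int id; };

/*
 * Mock of a plain lookup: the caller must already be inside a
 * read-side section, otherwise a warning is counted.
 */
static struct entry *mock_lookup(struct entry *tbl, size_t n, int id)
{
	if (rcu_depth == 0)
		suspicious_warnings++;	/* "suspicious RCU usage" */

	for (size_t i = 0; i < n; i++)
		if (tbl[i].id == id)
			return &tbl[i];
	return NULL;
}

/* Mock of the _fast variant: wraps the lookup in its own section. */
static struct entry *mock_lookup_fast(struct entry *tbl, size_t n, int id)
{
	struct entry *e;

	rcu_read_lock();
	e = mock_lookup(tbl, n, id);
	rcu_read_unlock();
	return e;
}
```

In this mock-up, calling mock_lookup() with only a mutex held (and no read-side section) bumps the warning counter, which mirrors the situation in the trace above; mock_lookup_fast() does not.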
* Re: [PATCH bpf-next V3] net/xdp: Fix suspicious RCU usage warning
From: Jesper Dangaard Brouer @ 2018-08-13 10:31 UTC
To: Tariq Toukan
Cc: Alexei Starovoitov, Daniel Borkmann, netdev, Eran Ben Elisha,
brouer, Neil Brown, Paul E. McKenney
On Mon, 13 Aug 2018 12:21:58 +0300
Tariq Toukan <tariqt@mellanox.com> wrote:
> Fix the warning below by calling rhashtable_lookup_fast.
> Also, make some code movements for better quality and human
> readability.
>
> [...]
>
> Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping")
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
> net/core/xdp.c | 13 +++----------
> 1 file changed, 3 insertions(+), 10 deletions(-)
>
> V2 -> V3:
> * Fix return value test for rhashtable_remove_fast, per Jesper's comment.
>
> V1 -> V2:
> * Use rhashtable_lookup_fast and make some code movements, per Daniel's
> and Alexei's comments.
>
> Please queue to -stable v4.18.
>
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 3dd99e1c04f5..8b1c7b699982 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -105,16 +105,9 @@ static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
>
> mutex_lock(&mem_id_lock);
>
> - xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
> - if (!xa) {
> - mutex_unlock(&mem_id_lock);
> - return;
> - }
> -
> - err = rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
> - WARN_ON(err);
> -
> - call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
> + xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
> + if (xa && !rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
> + call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>
> mutex_unlock(&mem_id_lock);
> }
I would personally prefer to write it as "== 0", as in the "Object
removal" example in [1], but it is semantically the same to write
!rhashtable_remove_fast(). So, I'm fine with this.
In the example [1], the sequence is wrapped in rcu_read_lock/unlock,
while you have not done so. The rhashtable_lookup_fast and
rhashtable_remove_fast calls take their own rcu_read_lock/unlock, but
the outer rcu_read_lock/unlock makes sure that an RCU grace period
cannot slip in between the two calls.
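
As a rough illustration of the wrapped sequence, here is a userspace mock-up (not kernel code). The counter-based rcu_read_lock()/rcu_read_unlock() stubs, struct object, and the lookup_fast()/remove_fast()/remove_by_key() helpers are all hypothetical stand-ins; only the shape of the sequence, lookup and removal inside one outer read-side section, is meant to match:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the RCU read-side primitives. */
static int rcu_depth;	/* nesting depth of read-side sections */

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

struct object { int key; int in_table; };

/* Mock lookup that takes its own (nested) read-side section. */
static struct object *lookup_fast(struct object *tbl, size_t n, int key)
{
	struct object *found = NULL;

	rcu_read_lock();
	for (size_t i = 0; i < n; i++)
		if (tbl[i].in_table && tbl[i].key == key)
			found = &tbl[i];
	rcu_read_unlock();
	return found;
}

/* Mock removal: 0 on success, -ENOENT if the object is not in the table. */
static int remove_fast(struct object *obj)
{
	int ret = -ENOENT;

	rcu_read_lock();
	if (obj->in_table) {
		obj->in_table = 0;
		ret = 0;
	}
	rcu_read_unlock();
	return ret;
}

/*
 * Generic pattern: one outer read-side section spans both calls, so the
 * object found by the lookup cannot be freed before the removal runs.
 * Returns 1 if the object was removed, 0 otherwise.
 */
static int remove_by_key(struct object *tbl, size_t n, int key)
{
	struct object *obj;
	int removed = 0;

	rcu_read_lock();
	obj = lookup_fast(tbl, n, key);
	if (obj && remove_fast(obj) == 0)
		removed = 1;	/* kernel code would defer the free here */
	rcu_read_unlock();
	return removed;
}
```

In the patch under review the outer pair is replaced by mem_id_lock, which is held across both calls.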
I still think your fix is correct because of the mutex_lock, given that
the mutex synchronizes removal and insertion in this rhashtable.
I do wonder whether it would be better to add the outer
rcu_read_lock/unlock calls anyway, in case someone else reads and
copy-pastes this code without having a mutex sync scheme?
If you think this is all fine, and want to proceed as is then you have
my ack:
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
[1] https://lwn.net/Articles/751374/
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
* Re: [PATCH bpf-next V3] net/xdp: Fix suspicious RCU usage warning
From: Tariq Toukan @ 2018-08-13 10:57 UTC
To: Jesper Dangaard Brouer, Tariq Toukan
Cc: Alexei Starovoitov, Daniel Borkmann, netdev, Eran Ben Elisha,
Neil Brown, Paul E. McKenney
On 13/08/2018 1:31 PM, Jesper Dangaard Brouer wrote:
> On Mon, 13 Aug 2018 12:21:58 +0300
> Tariq Toukan <tariqt@mellanox.com> wrote:
>
>> Fix the warning below by calling rhashtable_lookup_fast.
>> Also, make some code movements for better quality and human
>> readability.
>>
>> [...]
>>
>> Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping")
>> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
>> Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
>> ---
>> net/core/xdp.c | 13 +++----------
>> 1 file changed, 3 insertions(+), 10 deletions(-)
>>
>> V2 -> V3:
>> * Fix return value test for rhashtable_remove_fast, per Jesper's comment.
>>
>> V1 -> V2:
>> * Use rhashtable_lookup_fast and make some code movements, per Daniel's
>> and Alexei's comments.
>>
>> Please queue to -stable v4.18.
>>
>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>> index 3dd99e1c04f5..8b1c7b699982 100644
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -105,16 +105,9 @@ static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
>>
>> mutex_lock(&mem_id_lock);
>>
>> - xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
>> - if (!xa) {
>> - mutex_unlock(&mem_id_lock);
>> - return;
>> - }
>> -
>> - err = rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
>> - WARN_ON(err);
>> -
>> - call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>> + xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
>> + if (xa && !rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
>> + call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>>
>> mutex_unlock(&mem_id_lock);
>> }
>
> I would personally prefer to write it as "== 0", as in the "Object
> removal" example in [1], but it is semantically the same to write
> !rhashtable_remove_fast(). So, I'm fine with this.
>
I thought the coding convention is to not explicitly compare to zero,
just like we do not compare to NULL on page allocation, but do:
	if (!page)
But I don't mind changing this one.
> In the example [1], the sequence is wrapped in rcu_read_lock/unlock,
> while you have not done so. The rhashtable_lookup_fast and
> rhashtable_remove_fast calls take their own rcu_read_lock/unlock, but
> the outer rcu_read_lock/unlock makes sure that an RCU grace period
> cannot slip in between the two calls.
>
> I still think your fix is correct because of the mutex_lock, given that
> the mutex synchronizes removal and insertion in this rhashtable.
>
Right, we rely here on the mutex to avoid the scenario you described.
So the outer rcu lock calls are not necessary.
> I do wonder whether it would be better to add the outer
> rcu_read_lock/unlock calls anyway, in case someone else reads and
> copy-pastes this code without having a mutex sync scheme?
>
Yeah it'll be safer for future unaware developers, but I think reviewers
should always comment and make it clear that the best generic reference
is [1], not any specific/optimized use case.
If you guys still want me to fix this then please let me know and
I'll re-spin.
> If you think this is all fine, and want to proceed as is then you have
> my ack:
>
> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
>
>
> [1] https://lwn.net/Articles/751374/
>
* Re: [PATCH bpf-next V3] net/xdp: Fix suspicious RCU usage warning
From: Jesper Dangaard Brouer @ 2018-08-13 12:02 UTC
To: Tariq Toukan
Cc: Alexei Starovoitov, Daniel Borkmann, netdev, Eran Ben Elisha,
Neil Brown, Paul E. McKenney, brouer
On Mon, 13 Aug 2018 13:57:04 +0300
Tariq Toukan <tariqt@mellanox.com> wrote:
> On 13/08/2018 1:31 PM, Jesper Dangaard Brouer wrote:
> > On Mon, 13 Aug 2018 12:21:58 +0300
> > Tariq Toukan <tariqt@mellanox.com> wrote:
> >
> >> Fix the warning below by calling rhashtable_lookup_fast.
> >> Also, make some code movements for better quality and human
> >> readability.
> >>
> >> [...]
> >>
> >> Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping")
> >> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> >> Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
> >> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> >> ---
> >> net/core/xdp.c | 13 +++----------
> >> 1 file changed, 3 insertions(+), 10 deletions(-)
> >>
> >> V2 -> V3:
> >> * Fix return value test for rhashtable_remove_fast, per Jesper's comment.
> >>
> >> V1 -> V2:
> >> * Use rhashtable_lookup_fast and make some code movements, per Daniel's
> >> and Alexei's comments.
> >>
> >> Please queue to -stable v4.18.
> >>
> >> diff --git a/net/core/xdp.c b/net/core/xdp.c
> >> index 3dd99e1c04f5..8b1c7b699982 100644
> >> --- a/net/core/xdp.c
> >> +++ b/net/core/xdp.c
> >> @@ -105,16 +105,9 @@ static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
> >>
> >> mutex_lock(&mem_id_lock);
> >>
> >> - xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
> >> - if (!xa) {
> >> - mutex_unlock(&mem_id_lock);
> >> - return;
> >> - }
> >> -
> >> - err = rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
> >> - WARN_ON(err);
> >> -
> >> - call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
> >> + xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
> >> + if (xa && !rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
> >> + call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
> >>
> >> mutex_unlock(&mem_id_lock);
> >> }
> >
> > I would personally prefer to write it as "== 0", as in the "Object
> > removal" example in [1], but it is semantically the same to write
> > !rhashtable_remove_fast(). So, I'm fine with this.
> >
>
> I thought the coding convention is to not explicitly compare to zero,
> just like we do not compare to NULL on page allocation, but do:
> if (!page)
If the return value is a pointer, or a bool, then I use the (!ptr) style
of check. In this case, where success is 0, I find it slightly confusing
that if (!remove) leads into the success case.
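
A minimal sketch of the two spellings, as standalone userspace C. The struct item type and the item_find()/item_remove() helpers are hypothetical and only mimic the "0 on success, negative errno on failure" convention; they are not the rhashtable API:

```c
#include <errno.h>
#include <stddef.h>

struct item { int present; };

/* Pointer result: "!ptr" reads naturally as "nothing found". */
static struct item *item_find(struct item *it)
{
	return it->present ? it : NULL;
}

/* Integer result: 0 on success, negative errno on failure. */
static int item_remove(struct item *it)
{
	if (!it->present)
		return -ENOENT;
	it->present = 0;
	return 0;
}

/* Both success tests are equivalent in C; they differ only in readability. */
static int removed_with_not(struct item *it)
{
	return !item_remove(it);	/* "not remove" means success */
}

static int removed_with_eq0(struct item *it)
{
	return item_remove(it) == 0;	/* success spelled out explicitly */
}
```

Both helpers return 1 when the removal succeeded and 0 when it failed, so the choice between them is purely a matter of style.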
> But I don't mind changing this one.
I also don't care much... if you do respin, it would be nice to do.
> > In the example [1], the sequence is wrapped in rcu_read_lock/unlock,
> > while you have not done so. The rhashtable_lookup_fast and
> > rhashtable_remove_fast calls take their own rcu_read_lock/unlock,
> > but the outer rcu_read_lock/unlock makes sure that an RCU grace
> > period cannot slip in between the two calls.
> >
> > I still think your fix is correct because of the mutex_lock, given that
> > the mutex synchronizes removal and insertion in this rhashtable.
> >
>
> Right, we rely here on the mutex to avoid the scenario you described.
> So the outer rcu lock calls are not necessary.
>
> > I do wonder whether it would be better to add the outer
> > rcu_read_lock/unlock calls anyway, in case someone else reads and
> > copy-pastes this code without having a mutex sync scheme?
> >
>
> Yeah it'll be safer for future unaware developers, but I think
> reviewers should always comment and make it clear that the best
> generic reference is [1], not any specific/optimized use case.
>
> If you guys still want me to fix this then please let me know and
> I'll re-spin.
I'll let Daniel make the choice.
> > If you think this is all fine, and want to proceed as is then you
> > have my ack:
> >
> > Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
> >
> >
> > [1] https://lwn.net/Articles/751374/
> >
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
* Re: [PATCH bpf-next V3] net/xdp: Fix suspicious RCU usage warning
From: Daniel Borkmann @ 2018-08-13 12:22 UTC
To: Jesper Dangaard Brouer, Tariq Toukan
Cc: Alexei Starovoitov, netdev, Eran Ben Elisha, Neil Brown,
Paul E. McKenney
On 08/13/2018 02:02 PM, Jesper Dangaard Brouer wrote:
> On Mon, 13 Aug 2018 13:57:04 +0300
> Tariq Toukan <tariqt@mellanox.com> wrote:
>> On 13/08/2018 1:31 PM, Jesper Dangaard Brouer wrote:
>>> On Mon, 13 Aug 2018 12:21:58 +0300
>>> Tariq Toukan <tariqt@mellanox.com> wrote:
[...]
>>> In the example [1], the sequence is wrapped in rcu_read_lock/unlock,
>>> while you have not done so. The rhashtable_lookup_fast and
>>> rhashtable_remove_fast calls take their own rcu_read_lock/unlock,
>>> but the outer rcu_read_lock/unlock makes sure that an RCU grace
>>> period cannot slip in between the two calls.
>>>
>>> I still think your fix is correct because of the mutex_lock, given that
>>> the mutex synchronizes removal and insertion in this rhashtable.
>>
>> Right, we rely here on the mutex to avoid the scenario you described.
>> So the outer rcu lock calls are not necessary.
Agree.
>>> I do wonder whether it would be better to add the outer
>>> rcu_read_lock/unlock calls anyway, in case someone else reads and
>>> copy-pastes this code without having a mutex sync scheme?
>>
>> Yeah it'll be safer for future unaware developers, but I think
>> reviewers should always comment and make it clear that the best
>> generic reference is [1], not any specific/optimized use case.
>>
>> If you guys still want me to fix this then please let me know and
>> I'll re-spin.
>
> I'll let Daniel make the choice.
Patch is fine as is. If we added the RCU read lock/unlock pair even
though it is not necessary, just so that other developers can copy-paste
from it, I think this would be doubly confusing: people reading the
current code will wonder why the additional RCU read side is needed
here (so it will leave them puzzled), and people trying to copy-paste
from it will wonder whether they need a similar mutex scheme in
addition. So I strongly prefer to 'do the right thing' based on the
situation. Given that the BPF PR is still pending, I'll get the patch in
once it has been pulled.
Thanks,
Daniel
* Re: [PATCH bpf-next V3] net/xdp: Fix suspicious RCU usage warning
From: Daniel Borkmann @ 2018-08-16 0:11 UTC
To: Jesper Dangaard Brouer, Tariq Toukan
Cc: Alexei Starovoitov, netdev, Eran Ben Elisha, Neil Brown,
Paul E. McKenney
On 08/13/2018 02:22 PM, Daniel Borkmann wrote:
[...]
> I'll get the patch in once it has been pulled.
Applied to bpf, thanks Tariq!