From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Tariq Toukan <tariqt@mellanox.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	netdev@vger.kernel.org, Eran Ben Elisha <eranbe@mellanox.com>,
	brouer@redhat.com, Neil Brown <neilb@suse.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH bpf-next V3] net/xdp: Fix suspicious RCU usage warning
Date: Mon, 13 Aug 2018 12:31:59 +0200
Message-ID: <20180813123159.1f447108@redhat.com>
In-Reply-To: <1534152118-15968-1-git-send-email-tariqt@mellanox.com>

On Mon, 13 Aug 2018 12:21:58 +0300
Tariq Toukan <tariqt@mellanox.com> wrote:

> Fix the warning below by calling rhashtable_lookup_fast.
> Also, make some code movements for better quality and human
> readability.
> 
> [  342.450870] WARNING: suspicious RCU usage
> [  342.455856] 4.18.0-rc2+ #17 Tainted: G           O
> [  342.462210] -----------------------------
> [  342.467202] ./include/linux/rhashtable.h:481 suspicious rcu_dereference_check() usage!
> [  342.476568]
> [  342.476568] other info that might help us debug this:
> [  342.476568]
> [  342.486978]
> [  342.486978] rcu_scheduler_active = 2, debug_locks = 1
> [  342.495211] 4 locks held by modprobe/3934:
> [  342.500265]  #0: 00000000e23116b2 (mlx5_intf_mutex){+.+.}, at: mlx5_unregister_interface+0x18/0x90 [mlx5_core]
> [  342.511953]  #1: 00000000ca16db96 (rtnl_mutex){+.+.}, at: unregister_netdev+0xe/0x20
> [  342.521109]  #2: 00000000a46e2c4b (&priv->state_lock){+.+.}, at: mlx5e_close+0x29/0x60 [mlx5_core]
> [  342.531642]  #3: 0000000060c5bde3 (mem_id_lock){+.+.}, at: xdp_rxq_info_unreg+0x93/0x6b0
> [  342.541206]
> [  342.541206] stack backtrace:
> [  342.547075] CPU: 12 PID: 3934 Comm: modprobe Tainted: G           O      4.18.0-rc2+ #17
> [  342.556621] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
> [  342.565606] Call Trace:
> [  342.568861]  dump_stack+0x78/0xb3
> [  342.573086]  xdp_rxq_info_unreg+0x3f5/0x6b0
> [  342.578285]  ? __call_rcu+0x220/0x300
> [  342.582911]  mlx5e_free_rq+0x38/0xc0 [mlx5_core]
> [  342.588602]  mlx5e_close_channel+0x20/0x120 [mlx5_core]
> [  342.594976]  mlx5e_close_channels+0x26/0x40 [mlx5_core]
> [  342.601345]  mlx5e_close_locked+0x44/0x50 [mlx5_core]
> [  342.607519]  mlx5e_close+0x42/0x60 [mlx5_core]
> [  342.613005]  __dev_close_many+0xb1/0x120
> [  342.617911]  dev_close_many+0xa2/0x170
> [  342.622622]  rollback_registered_many+0x148/0x460
> [  342.628401]  ? __lock_acquire+0x48d/0x11b0
> [  342.633498]  ? unregister_netdev+0xe/0x20
> [  342.638495]  rollback_registered+0x56/0x90
> [  342.643588]  unregister_netdevice_queue+0x7e/0x100
> [  342.649461]  unregister_netdev+0x18/0x20
> [  342.654362]  mlx5e_remove+0x2a/0x50 [mlx5_core]
> [  342.659944]  mlx5_remove_device+0xe5/0x110 [mlx5_core]
> [  342.666208]  mlx5_unregister_interface+0x39/0x90 [mlx5_core]
> [  342.673038]  cleanup+0x5/0xbfc [mlx5_core]
> [  342.678094]  __x64_sys_delete_module+0x16b/0x240
> [  342.683725]  ? do_syscall_64+0x1c/0x210
> [  342.688476]  do_syscall_64+0x5a/0x210
> [  342.693025]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping")
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>  net/core/xdp.c | 13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)
> 
> V2 -> V3:
> * Fix return value test for rhashtable_remove_fast, per Jesper's comment.
> 
> V1 -> V2:
> * Use rhashtable_lookup_fast and make some code movements, per Daniel's
>   and Alexei's comments.
> 
> Please queue to -stable v4.18.
> 
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 3dd99e1c04f5..8b1c7b699982 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -105,16 +105,9 @@ static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
>  
>  	mutex_lock(&mem_id_lock);
>  
> -	xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
> -	if (!xa) {
> -		mutex_unlock(&mem_id_lock);
> -		return;
> -	}
> -
> -	err = rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
> -	WARN_ON(err);
> -
> -	call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
> +	xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
> +	if (xa && !rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
> +		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>  
>  	mutex_unlock(&mem_id_lock);
>  }

I would personally prefer to write it with an explicit "== 0", as in the
example in [1] (section "Object removal"), but it is semantically the
same as writing !rhashtable_remove_fast(). So, I'm fine with this.

In the example [1], the sequence is wrapped in rcu_read_lock/unlock,
which you have not done. The rhashtable_lookup_fast and
rhashtable_remove_fast calls each take their own rcu_read_lock/unlock,
but the outer rcu_read_lock/unlock makes sure that an RCU grace period
cannot slip in between the two calls.
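
To make the shape of that sequence concrete, here is a minimal userspace
sketch. It is not kernel code: the toy_* names, the depth counter and the
array-backed table are all stand-ins made up for illustration, and nothing
here models a real grace period. Only the control flow (an outer read-side
section spanning a lookup and a remove that each take their own inner
section) mirrors the pattern discussed above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-ins (assumptions, not kernel APIs): a counter models the
 * read-side nesting depth and a small array models the hash table.
 */
static int read_depth;   /* > 0 while inside a read-side section */
static int frees_queued; /* counts deferred frees, standing in for call_rcu() */

static void toy_rcu_read_lock(void)   { read_depth++; }
static void toy_rcu_read_unlock(void) { read_depth--; }

struct toy_entry {
	int id;
	bool present;
};

#define TOY_TABLE_SIZE 4
static struct toy_entry toy_table[TOY_TABLE_SIZE] = {
	{ .id = 1, .present = true },
	{ .id = 2, .present = true },
};

/* Plays the role of the lookup call: takes its own inner read-side section. */
static struct toy_entry *toy_lookup_fast(int id)
{
	struct toy_entry *found = NULL;

	toy_rcu_read_lock();
	for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
		if (toy_table[i].present && toy_table[i].id == id)
			found = &toy_table[i];
	}
	toy_rcu_read_unlock();
	return found;
}

/* Plays the role of the remove call: 0 on success, -ENOENT if already gone. */
static int toy_remove_fast(struct toy_entry *e)
{
	int err = -ENOENT;

	toy_rcu_read_lock();
	if (e->present) {
		e->present = false;
		err = 0;
	}
	toy_rcu_read_unlock();
	return err;
}

/* Lookup + remove under one outer read-side section. True if a free was queued. */
static bool toy_unreg(int id)
{
	struct toy_entry *e;
	bool removed = false;

	toy_rcu_read_lock();            /* outer section spans both calls */
	e = toy_lookup_fast(id);
	if (e && toy_remove_fast(e) == 0) {
		frees_queued++;         /* deferred free would be queued here */
		removed = true;
	}
	toy_rcu_read_unlock();
	return removed;
}
```

Calling toy_unreg() twice with the same id queues only one deferred free,
because the second remove fails and the "== 0" test skips the free.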

I still think your fix is correct because of the mutex_lock: the mutex
synchronizes removals and inserts in this rhashtable.

I do wonder if it would be better to add the outer rcu_read_lock/unlock
calls anyway, in case someone else reads and copy-pastes this code (and
doesn't have a mutex sync scheme)?

If you think this is all fine, and want to proceed as is then you have
my ack:

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>


[1] https://lwn.net/Articles/751374/

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

