lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Chris Horn <chris.horn@hpe.com>,
	Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 32/33] lnet: ksocklnd: ksocklnd_ni_get_eth_intf_speed() must use only rtnl lock
Date: Sun,  2 Feb 2025 15:46:32 -0500	[thread overview]
Message-ID: <20250202204633.1148872-33-jsimmons@infradead.org> (raw)
In-Reply-To: <20250202204633.1148872-1-jsimmons@infradead.org>

A kernel with debug enable is reporting the following:

include/linux/inetdevice.h:221 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
5 locks held by lctl/42289:
 #0: ffffffffa04263f8 ((libcfs_ioctl_list).rwsem){++++}-{3:3}, at: __blocking_notifier_call_chain+0x44/0xa0
 #1: ffffffffa04fa7b0 (lnet_config_mutex){+.+.}-{3:3}, at: lnet_configure+0x1d/0xc0 [lnet]
 #2: ffffffffa04f06e8 (the_lnet.ln_api_mutex){+.+.}-{3:3}, at: LNetNIInit+0x4c/0x960 [lnet]
 #3: ffffffffa04f0788 (&the_lnet.ln_lnd_mutex){+.+.}-{3:3}, at: lnet_startup_lndnet+0x11e/0xa90 [lnet]
 #4: ffffffff834a0cd0 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x1b/0x30

stack backtrace:
CPU: 2 PID: 42289 Comm: lctl Kdump: loaded Tainted: G        W  O     --------- -  - 4.18.0rh8.5-debug #2
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  Call Trace:
  ? dump_stack+0x119/0x18e
  ? lockdep_rcu_suspicious+0x141/0x14f
  ? ksocklnd_ni_get_eth_intf_speed.isra.1+0x308/0x360 [ksocklnd]
  ? mark_held_locks+0x6a/0xc0
  ? ktime_get_with_offset+0x233/0x2b0
  ? trace_hardirqs_on+0x4e/0x220
  ? ksocknal_tunables_setup+0xed/0x200 [ksocklnd]
  ? ksocknal_startup+0x4ff/0x1180 [ksocklnd]

The function __ethtool_get_link_ksettings() has a hard requirement
to have the in_dev device protected by the rtnl mutex. At the
same time we are aquiring in_dev using the rcu lock. The in_dev
needs to be stabilized by the same lock. So use rtnl functions
instead.

WC-bug-id: https://jira.whamcloud.com/browse/LU-16807
Lustre-commit: 65d664bff40590366 ("LU-16807 ksocklnd: ksocklnd_ni_get_eth_intf_speed() must use only rtnl lock")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51028
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Neil Brown <neilb@suse.de>
---
 net/lnet/klnds/socklnd/socklnd_modparams.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/lnet/klnds/socklnd/socklnd_modparams.c b/net/lnet/klnds/socklnd/socklnd_modparams.c
index 5eb58ca97c06..031fe13d2a38 100644
--- a/net/lnet/klnds/socklnd/socklnd_modparams.c
+++ b/net/lnet/klnds/socklnd/socklnd_modparams.c
@@ -183,11 +183,11 @@ static int ksocklnd_ni_get_eth_intf_speed(struct lnet_ni *ni)
 		if (!(flags & IFF_UP))
 			continue;
 
-		in_dev = __in_dev_get_rcu(dev);
+		in_dev = __in_dev_get_rtnl(dev);
 		if (!in_dev)
 			continue;
 
-		in_dev_for_each_ifa_rcu(ifa, in_dev) {
+		in_dev_for_each_ifa_rtnl(ifa, in_dev) {
 			if (strcmp(ifa->ifa_label, ni->ni_interface) == 0)
 				intf_idx = dev->ifindex;
 		}
-- 
2.39.3

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

  parent reply	other threads:[~2025-02-02 21:14 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-02 20:46 [lustre-devel] [PATCH 00/33] lustre: sync to OpenSFS branch May 31, 2023 James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 01/33] lnet: set msg field for lnet message header James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 02/33] Revert "lustre: llite: Check vmpage in releasepage" James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 03/33] lustre: llite: EIO is possible on a race with page reclaim James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 04/33] lustre: llite: add __GFP_NORETRY for read-ahead page James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 05/33] lustre: obd: change lmd flags to bitmap James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 06/33] lustre: uapi: cleanup FSFILT defines James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 07/33] lustre: obd: Reserve metadata overstriping flags James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 08/33] lnet: selftest: manage the workqueue state properly James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 09/33] lustre: remove cl_{offset, index, page_size} helpers James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 10/33] lustre: csdc: reserve layout bits for compress component James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 11/33] lustre: obd: replace simple_strtoul() James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 12/33] lnet: Use dynamic allocation for LND tunables James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 13/33] lustre: cksum: fix generating T10PI guard tags for partial brw page James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 14/33] lustre: llite: remove OBD_ -> CFS_ macros James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 15/33] lustre: obd: " James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 16/33] lnet: improve numeric NID to CPT hashing James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 17/33] lnet: libcfs: Remove unsed LASSERT_ATOMIC_* macros James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 18/33] lustre: misc: replace obsolete ioctl numbers James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 19/33] lustre: lmv: treat unknown hash type as sane type James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 20/33] lustre: llite: Fix return for non-queued aio James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 21/33] lnet: collect data about routes by using Netlink James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 22/33] lustre: ptlrpc: switch sptlrpc_rule_set_choose to large nid James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 23/33] lnet: use list_first_entry() where appropriate James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 24/33] lustre: statahead: using try lock for batched RPCs James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 25/33] lnet: libcfs: use round_up directly James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 26/33] lustre: mdc: md_open_data should keep ref on close_req James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 27/33] lustre: llite: update comment of ll_swap_layouts_close James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 28/33] lustre: ldlm: replace OBD_ -> CFS_ macros James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 29/33] lustre: mdc: remove " James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 30/33] lnet: libcfs: move cfs_expr_list_print to nidstrings.c James Simmons
2025-02-02 20:46 ` [lustre-devel] [PATCH 31/33] lnet: libcfs: Remove reference to LASSERT_ATOMIC_POS James Simmons
2025-02-02 20:46 ` James Simmons [this message]
2025-02-02 20:46 ` [lustre-devel] [PATCH 33/33] lustre: ldlm: convert ldlm extent locks to linux extent-tree James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250202204633.1148872-33-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=adilger@whamcloud.com \
    --cc=chris.horn@hpe.com \
    --cc=green@whamcloud.com \
    --cc=lustre-devel@lists.lustre.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).