Intel-Wired-Lan Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: James Hogan <jhogan@kernel.org>
To: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>,
	netdev@vger.kernel.org,
	Aleksandr Loktionov <aleksandr.loktionov@intel.com>,
	intel-wired-lan@lists.osuosl.org
Subject: Re: [Intel-wired-lan] [WIP v2] igc: fix deadlock caused by taking RTNL in RPM resume path
Date: Sun, 02 Oct 2022 11:56:28 +0100	[thread overview]
Message-ID: <3329047.e9J7NaK4W3@saruman> (raw)
In-Reply-To: <3186253.aeNJFYEL58@saruman>

On Monday, 29 August 2022 09:16:33 BST James Hogan wrote:
> On Saturday, 13 August 2022 18:18:25 BST James Hogan wrote:
> > On Saturday, 13 August 2022 01:05:41 BST Vinicius Costa Gomes wrote:
> > > James Hogan <jhogan@kernel.org> writes:
> > > > On Thursday, 11 August 2022 21:25:24 BST Vinicius Costa Gomes wrote:
> > > >> It was reported a RTNL deadlock in the igc driver that was causing
> > > >> problems during suspend/resume.
> > > >> 
> > > >> The solution is similar to commit ac8c58f5b535 ("igb: fix deadlock
> > > >> caused by taking RTNL in RPM resume path").
> > > >> 
> > > >> Reported-by: James Hogan <jhogan@kernel.org>
> > > >> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> > > >> ---
> > > >> Sorry for the noise earlier, my kernel config didn't have runtime PM
> > > >> enabled.
> > > > 
> > > > Thanks for looking into this.
> > > > 
> > > > This is identical to the patch I've been running for the last week.
> > > > The
> > > > deadlock is avoided, however I now occasionally see an assertion from
> > > > netif_set_real_num_tx_queues due to the lock not being taken in some
> > > > cases
> > > > via the runtime_resume path, and a suspicious
> > > > rcu_dereference_protected()
> > > > warning (presumably due to the same issue of the lock not being
> > > > taken).
> > > > See here for details:
> > > > https://lore.kernel.org/netdev/4765029.31r3eYUQgx@saruman/
> > > 
> > > Oh, sorry. I missed the part that the rtnl assert splat was already
> > > using similar/identical code to what I got/copied from igb.
> > > 
> > > So what this seems to be telling us is that the "fix" from igb is only
> > > hiding the issue,
> > 
> > I suppose the patch just changes the assumption from "lock will never be
> > held on runtime resume path" (incorrect, deadlock) to "lock will always be
> > held on runtime resume path" (also incorrect, probably racy).
> > 
> > > and we would need to remove the need for taking the
> > > RTNL for the suspend/resume paths in igc and igb? (as someone else said
> > > in that igb thread, iirc)
> > 
> > (I'll defer to others on this. I'm pretty unfamiliar with networking code
> > and this particular lock.)
> 
> I'd be great to have this longstanding issue properly fixed rather than
> having to carry a patch locally that may not be lock safe.
> 
> Also, any tips for diagnosing the issue of the network link not coming back
> up after resume? I sometimes have to unload and reload the driver module to
> get it back again.

Any thoughts on this from anybody?

Cheers
James


_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

  reply	other threads:[~2022-10-02 10:56 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-14  8:14 [Intel-wired-lan] I225-V (igc driver) hangs after resume in igc_resume/igc_tsn_reset James Hogan
2022-07-15 17:25 ` Tony Nguyen
2022-07-17 19:59 ` Vinicius Costa Gomes
     [not found]   ` <4773114.31r3eYUQgx@saruman>
2022-07-23 15:52     ` James Hogan
2022-07-27 14:37       ` Vinicius Costa Gomes
2022-07-28 17:36         ` James Hogan
2022-08-04 13:03           ` James Hogan
2022-08-04 13:27             ` Paul Menzel
2022-08-04 21:41               ` James Hogan
2022-08-04 22:07                 ` James Hogan
2022-08-05 11:25                   ` James Hogan
2022-08-11 15:13                     ` [Intel-wired-lan] [PATCH] igc: fix deadlock caused by taking RTNL in RPM resume path Vinicius Costa Gomes
2022-08-11 18:58                       ` kernel test robot
2022-08-11 19:59                       ` kernel test robot
2022-08-11 20:25                       ` [Intel-wired-lan] [WIP v2] " Vinicius Costa Gomes
2022-08-11 21:41                         ` James Hogan
2022-08-13  0:05                           ` Vinicius Costa Gomes
2022-08-13 17:18                             ` James Hogan
2022-08-29  8:16                               ` James Hogan
2022-10-02 10:56                                 ` James Hogan [this message]
2023-08-14 11:04                                   ` James Hogan
2023-08-29  1:58                                     ` Vinicius Costa Gomes
2023-09-03 17:57                                       ` James Hogan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3329047.e9J7NaK4W3@saruman \
    --to=jhogan@kernel.org \
    --cc=aleksandr.loktionov@intel.com \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=netdev@vger.kernel.org \
    --cc=pmenzel@molgen.mpg.de \
    --cc=vinicius.gomes@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox