netdev.vger.kernel.org archive mirror
From: Lukas Wunner <lukas@wunner.de>
To: Jacob Keller <jacob.e.keller@intel.com>
Cc: "Tony Nguyen" <anthony.l.nguyen@intel.com>,
	davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	edumazet@google.com, netdev@vger.kernel.org,
	sasha.neftin@intel.com, "Roman Lozko" <lozko.roma@gmail.com>,
	"Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>,
	"Kurt Kanzenbach" <kurt@linutronix.de>,
	"Heiner Kallweit" <hkallweit1@gmail.com>,
	"Simon Horman" <horms@kernel.org>,
	"Naama Meir" <naamax.meir@linux.intel.com>
Subject: Re: [PATCH net] igc: Fix LED-related deadlock on driver unbind
Date: Tue, 23 Apr 2024 09:53:38 +0200
Message-ID: <ZidpAp1CL3iKfcGz@wunner.de>
In-Reply-To: <96939b80-b789-41a6-bea6-78f16833bbc9@intel.com>

On Mon, Apr 22, 2024 at 04:32:01PM -0700, Jacob Keller wrote:
> On 4/22/2024 1:45 PM, Tony Nguyen wrote:
> > Roman reports a deadlock on unplug of a Thunderbolt docking station
> > containing an Intel I225 Ethernet adapter.
> > 
> > The root cause is that led_classdev's for LEDs on the adapter are
> > registered such that they're device-managed by the netdev.  That
> > results in recursive acquisition of the rtnl_lock() mutex on unplug:
> > 
> > When the driver calls unregister_netdev(), it acquires rtnl_lock(),
> > then frees the device-managed resources.  Upon unregistering the LEDs,
> > netdev_trig_deactivate() invokes unregister_netdevice_notifier(),
> > which tries to acquire rtnl_lock() again.
> > 
> > Avoid by using non-device-managed LED registration.
> > 
> 
> Could we instead switch to using devm with the PCI device struct instead
> of the netdev struct?

No, unfortunately that doesn't work:

The unregistering of the LEDs would then happen after unbind of the
pci_dev, i.e. after igc_release_hw_control() and pci_disable_device().
The LED registers aren't even accessible at that point, but the LEDs
are still exposed in sysfs.  I tried that approach but then realized
it's a mistake:

https://lore.kernel.org/all/ZhBN9p1yOyciXkzw@wunner.de/

Andrew Lunn concurred and wrote that "LEDs need to be added and
explicitly removed within the life cycle of the netdev":

https://lore.kernel.org/all/7cfb1af7-3270-447a-a2cf-16c2af02ec29@lunn.ch/

We'd have to convert the igc driver to use devm_*() for everything to
avoid this ordering issue.  I don't think that's something we can do
at this point in the cycle.  The present patch fixes a regression
introduced with v6.9-rc1.
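
To illustrate the ordering problem, here is a minimal userspace sketch.
It is not kernel code: every toy_* name, the event strings and the
simplified sequence inside toy_remove() are assumptions made up for
illustration.  It only models the point above, namely that resources
managed by the pci_dev are released after the driver's remove callback
has returned, so the LED would be unregistered after the hardware has
already been disabled.

```c
#include <assert.h>
#include <string.h>

#define MAX_EVENTS 8
#define MAX_RELEASE 4

/* Event log so the teardown order can be inspected afterwards. */
const char *events[MAX_EVENTS];
int n_events;

void log_event(const char *what)
{
	assert(n_events < MAX_EVENTS);
	events[n_events++] = what;
}

/* Returns 1 if the idx-th logged event has the given name, else 0. */
int event_is(int idx, const char *name)
{
	return idx >= 0 && idx < n_events && strcmp(events[idx], name) == 0;
}

typedef void (*release_fn)(void);

/* Toy device carrying a list of managed-resource release callbacks. */
struct toy_device {
	release_fn release[MAX_RELEASE];
	int n_release;
};

/* Toy stand-in for attaching a managed resource to a device. */
void toy_devm_add(struct toy_device *dev, release_fn fn)
{
	assert(dev->n_release < MAX_RELEASE);
	dev->release[dev->n_release++] = fn;
}

/* Release managed resources in reverse order of registration. */
void toy_devres_release_all(struct toy_device *dev)
{
	while (dev->n_release > 0)
		dev->release[--dev->n_release]();
}

void toy_led_unregister(void)
{
	log_event("led_unregister");
}

/* Toy stand-in for the driver's remove callback (heavily simplified). */
void toy_remove(void)
{
	log_event("unregister_netdev");
	log_event("release_hw_control");
	log_event("pci_disable_device");
}

/* Unbind: remove runs first, managed resources are released afterwards. */
void toy_unbind(struct toy_device *pdev)
{
	toy_remove();
	toy_devres_release_all(pdev);
}

/* Replays the scenario with the LED managed by the toy pci_dev and
 * returns the number of logged events. */
int toy_run_scenario(void)
{
	struct toy_device pdev = { .n_release = 0 };

	n_events = 0;
	toy_devm_add(&pdev, toy_led_unregister);
	toy_unbind(&pdev);
	return n_events;
}
```

After calling toy_run_scenario(), the "led_unregister" event is the last
entry in the log, behind "pci_disable_device".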


There's another reason this approach doesn't work:

The first argument to devm_led_classdev_register() serves two purposes:
(1) It's the device which manages the resource (i.e. the LED is
    unregistered when that device is unbound),
(2) and it's also the parent below which the LED appears in sysfs.

If I changed the argument to the pci_dev, the LED would suddenly appear
below the pci_dev in sysfs, instead of the netdev.  So the patch would
result in an undesired change of behavior.

Of course we can discuss introducing a new devm_*() helper which accepts
separate device arguments for the two purposes above.  But that would
likewise be something we can't do at this point in the cycle.
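
As a rough userspace sketch of the dual-purpose argument and of what a
split helper might look like: none of the toy_* functions below exist
in the kernel, the path strings are invented, and the real sysfs layout
is more involved.  The sketch only models the sysfs-placement point; a
split helper owned by the pci_dev would not by itself address the
teardown ordering issue described earlier.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy device: a place in an invented sysfs tree plus one managed LED slot. */
struct toy_dev {
	const char *sysfs_path;
	const char *managed_led;  /* name of LED released on unbind, or NULL */
};

struct toy_led {
	const char *name;
	char sysfs_path[128];
};

/* One argument, two jobs: 'parent' both owns the LED's lifetime and
 * determines where the LED shows up in the toy sysfs tree. */
void toy_devm_led_register(struct toy_dev *parent, struct toy_led *led)
{
	parent->managed_led = led->name;
	snprintf(led->sysfs_path, sizeof(led->sysfs_path), "%s/%s",
		 parent->sysfs_path, led->name);
}

/* Hypothetical split variant: 'owner' manages the lifetime, 'parent'
 * only determines the sysfs location. */
void toy_devm_led_register_split(struct toy_dev *owner,
				 struct toy_dev *parent,
				 struct toy_led *led)
{
	owner->managed_led = led->name;
	snprintf(led->sysfs_path, sizeof(led->sysfs_path), "%s/%s",
		 parent->sysfs_path, led->name);
}
```

With the single-argument variant, passing the toy pci_dev moves the LED
below the pci_dev's path.  With the split variant, the LED can stay
below the toy netdev while being owned by the toy pci_dev.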

We discussed the conundrum of the dual-purpose device argument in a
separate thread for r8169 (which suffered from the same LED deadlock):

https://lore.kernel.org/all/20240405205903.GA3458@wunner.de/

Thanks,

Lukas
