From: Joe Damato <jdamato@fastly.com>
To: Jacob Keller <jacob.e.keller@intel.com>
Cc: netdev@vger.kernel.org, dmantipov@yandex.ru,
Tony Nguyen <anthony.l.nguyen@intel.com>,
Przemek Kitszel <przemyslaw.kitszel@intel.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
"moderated list:INTEL ETHERNET DRIVERS"
<intel-wired-lan@lists.osuosl.org>,
open list <linux-kernel@vger.kernel.org>
Subject: Re: [Intel-wired-lan] [RFC iwl-net] e1000: Hold RTNL when e1000_down can be called
Date: Tue, 22 Oct 2024 14:17:14 -0700
Message-ID: <ZxgWWgJKx4h0Thfe@LQ3V64L9R2>
In-Reply-To: <270a914d-3b50-4eee-b564-1b8cff82facc@intel.com>
On Tue, Oct 22, 2024 at 02:15:27PM -0700, Jacob Keller wrote:
>
>
> On 10/22/2024 2:12 PM, Joe Damato wrote:
> > On Tue, Oct 22, 2024 at 01:00:47PM -0700, Joe Damato wrote:
> >> On Tue, Oct 22, 2024 at 05:21:53PM +0000, Joe Damato wrote:
> >>> e1000_down calls netif_queue_set_napi, which assumes that RTNL is held.
> >>>
> >>> There are a few paths for e1000_down to be called in e1000 where RTNL is
> >>> not currently being held:
> >>> - e1000_shutdown (pci shutdown)
> >>> - e1000_suspend (power management)
> >>> - e1000_reinit_locked (via e1000_reset_task delayed work)
> >>>
> >>> Hold RTNL in two places to fix this issue:
> >>> - e1000_reset_task
> >>> - __e1000_shutdown (which is called from both e1000_shutdown and
> >>> e1000_suspend).
> >>
> >> It looks like there's one other spot I missed:
> >>
> >> e1000_io_error_detected (pci error handler) which should also hold
> >> rtnl_lock:
> >>
> >> +       if (netif_running(netdev)) {
> >> +               rtnl_lock();
> >>                 e1000_down(adapter);
> >> +               rtnl_unlock();
> >> +       }
> >>
> >> I can send that update in the v2, but I'll wait to see if Intel has suggestions
> >> on the below.
> >>
> >>> The other paths which call e1000_down seemingly hold RTNL and are OK:
> >>> - e1000_close (ndo_stop)
> >>> - e1000_change_mtu (ndo_change_mtu)
> >>>
> >>> I'm submitting this as an RFC because:
> >>> - the e1000_reinit_locked issue appears very similar to commit
> >>> 21f857f0321d ("e1000e: add rtnl_lock() to e1000_reset_task"), which
> >>> fixes a similar issue in e1000e
> >>>
> >>> however
> >>>
> >>> - adding rtnl to e1000_reinit_locked seemingly conflicts with an
> >>> earlier e1000 commit b2f963bfaeba ("e1000: fix lockdep warning in
> >>> e1000_reset_task").
> >>>
> >>> Hopefully Intel can weigh in and shed some light on the correct way to
> >>> go.
> >
> > Regarding the above locations where rtnl_lock may need to be held,
> > here is how they compare with other Intel drivers:
> >
> > - e1000_reset_task: it appears that igc, igb, and e1000e all hold
> >   rtnl_lock in their reset_task functions, so I think adding an
> >   rtnl_lock / rtnl_unlock to e1000_reset_task should be OK,
> >   despite the existence of commit b2f963bfaeba ("e1000: fix
> >   lockdep warning in e1000_reset_task").
> >
> > - e1000_io_error_detected:
> >     - e1000e temporarily obtains and drops rtnl in
> >       e1000e_pm_freeze
> >     - ixgbe holds rtnl in the same path (toward the bottom of
> >       ixgbe_io_error_detected)
> >     - igb does NOT hold rtnl in this path (as far as I can tell)
> >     - it was suggested in another thread to hold rtnl in this path
> >       for igc [1].
> >
> >   Given that it will be added to igc and is held in this same
> >   path in e1000e and ixgbe, I think it is safe to add it for
> >   e1000, as well.
> >
> > - e1000_shutdown:
> >     - igb holds rtnl in the same path
> >     - e1000e temporarily holds it in this path (via
> >       e1000e_pm_freeze)
> >     - ixgbe holds rtnl in the same path
> >
> > So based on the recommendation for igc [1], and the precedent set in
> > the other Intel drivers in most cases (the one exception being igb in
> > the io_error path), I think adding rtnl to all 3 locations described
> > above is correct.
> >
> > Please let me know if you all agree. Thanks for reviewing this.
> >
> >
> [1]: https://lore.kernel.org/netdev/40242f59-139a-4b45-8949-1210039f881b@intel.com/
>
> I agree with this assessment.

Thanks for taking a look. I will send an official iwl-net PATCH with
these changes once the 24-hour timer has expired.
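
For anyone following along, here is a rough userspace model of the
locking pattern discussed above for the io_error_detected path. It is
only a sketch and not driver code: every fake_* name is hypothetical, a
plain flag stands in for RTNL, and fake_down() stands in for e1000_down
(whose callee assumes RTNL is held). The counter records calls made
without the lock, which is the situation the patch is meant to avoid.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for RTNL: a flag, not a real lock. */
static bool fake_rtnl_held;

/* Counts calls to fake_down() made without the fake lock held. */
static int unlocked_calls;

static void fake_rtnl_lock(void)
{
	assert(!fake_rtnl_held);	/* the model does not nest */
	fake_rtnl_held = true;
}

static void fake_rtnl_unlock(void)
{
	assert(fake_rtnl_held);
	fake_rtnl_held = false;
}

/* Hypothetical stand-in for the adapter state. */
struct fake_adapter {
	bool running;	/* models netif_running(netdev) */
	bool up;
};

/* Models e1000_down(): it assumes the caller holds the lock. */
static void fake_down(struct fake_adapter *adapter)
{
	if (!fake_rtnl_held)
		unlocked_calls++;	/* the case the patch wants to rule out */
	adapter->up = false;
}

/* Models the proposed change: take the lock around the down call. */
static void fake_io_error_detected(struct fake_adapter *adapter)
{
	if (adapter->running) {
		fake_rtnl_lock();
		fake_down(adapter);
		fake_rtnl_unlock();
	}
}
```

In this model, calling fake_io_error_detected() on a running adapter
brings it down without incrementing unlocked_calls, while a bare
fake_down() call increments it. The same wrapping would apply to the
reset_task and shutdown paths.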