netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ian Ray <ian.ray@gehealthcare.com>
To: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
	horms@kernel.org, Tony Nguyen <anthony.l.nguyen@intel.com>,
	Przemek Kitszel <przemyslaw.kitszel@intel.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	brian.ruley@gehealthcare.com, intel-wired-lan@lists.osuosl.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH v2] igb: Fix watchdog_task race with shutdown
Date: Fri, 27 Jun 2025 16:28:56 +0300	[thread overview]
Message-ID: <aF6cmKkrJSV_AWBN@b3410ffb93c4> (raw)
In-Reply-To: <3504878c-6b3f-4d5f-bcfd-2e7e4a912570@intel.com>

On Mon, Jun 16, 2025 at 02:47:29PM -0700, Jacob Keller wrote:
> On 6/10/2025 5:44 AM, Ian Ray wrote:
> > On Mon, Jun 09, 2025 at 04:10:39PM -0700, Jakub Kicinski wrote:
:
> > IIUC set_bit() is an atomic operation (via bitops.h), and so
> > my previous comment still stands.
> >
> > (Sorry if I have misunderstood your question.)
> >
> > Either watchdog_task runs just before __IGB_DOWN is set (and
> > the timer is stopped by this patch) -- or watchdog_task runs
> > just after __IGB_DOWN is set (and thus the timer will not be
> > restarted).
> >
> > In both cases, the final cancel_work_sync ensures that the
> > watchdog_task completes before igb_down() continues.
> >
> > Regards,
> > Ian
> 
> Hmm. Well set_bit is atomic, but I don't think it has ordering
> guarantees on its own. Wouldn't we need to be using a barrier here to
> guarantee ordering here?
> 
> Perhaps cancel_work_sync has barriers implied and that makes this work
> properly?

Ah, I see.  I checked the cancel_work_documentation and implementation
and I am not sure we can make any assumptions about barriers.

Would two additional calls to smp_mb__after_atomic() be acceptable?
Something like this (on top of this series v2).

-- >8 --
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index a65ae7925ae8..9b63dc594454 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2179,6 +2179,7 @@ void igb_down(struct igb_adapter *adapter)
         * disable watchdog from being rescheduled.
         */
        set_bit(__IGB_DOWN, &adapter->state);
+       smp_mb__after_atomic();
        timer_delete_sync(&adapter->watchdog_timer);
        timer_delete_sync(&adapter->phy_info_timer);

@@ -3886,6 +3887,7 @@ static void igb_remove(struct pci_dev *pdev)
         * disable watchdog from being rescheduled.
         */
        set_bit(__IGB_DOWN, &adapter->state);
+       smp_mb__after_atomic();
        timer_delete_sync(&adapter->watchdog_timer);
        timer_delete_sync(&adapter->phy_info_timer);
-- >8 --

Thanks,
Ian

> 
> > ORDERING
> > --------
> >
> > Like with atomic_t, the rule of thumb is:
> >
> >  - non-RMW operations are unordered;
> >
> >  - RMW operations that have no return value are unordered;
> >
> >  - RMW operations that have a return value are fully ordered.
> >
> >  - RMW operations that are conditional are fully ordered.
> >
> > Except for a successful test_and_set_bit_lock() which has ACQUIRE semantics,
> > clear_bit_unlock() which has RELEASE semantics and test_bit_acquire which has
> > ACQUIRE semantics.
> >
> 
> set_bit is listed as a RMW without a return value, so its unordered.
> That makes me think we'd want clear_bit_unlock() if the cancel_work_sync
> itself doesn't provide the barriers we need.
> 
> Thanks,
> Jake

      reply	other threads:[~2025-06-27 13:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-03  8:09 [PATCH v2] igb: Fix watchdog_task race with shutdown Ian Ray
2025-06-06  1:43 ` Jakub Kicinski
2025-06-09  6:32   ` Ian Ray
2025-06-09 23:10     ` Jakub Kicinski
2025-06-10 12:44       ` Ian Ray
2025-06-16 21:47         ` [Intel-wired-lan] " Jacob Keller
2025-06-27 13:28           ` Ian Ray [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aF6cmKkrJSV_AWBN@b3410ffb93c4 \
    --to=ian.ray@gehealthcare.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=anthony.l.nguyen@intel.com \
    --cc=brian.ruley@gehealthcare.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=jacob.e.keller@intel.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=przemyslaw.kitszel@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).