Intel-Wired-Lan Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Simon Horman <horms@kernel.org>
To: sergey.temerkhanov@intel.com
Cc: Simon Horman <horms@kernel.org>,
	intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org,
	pmenzel@molgen.mpg.de
Subject: Re: [Intel-wired-lan] [PATCH iwl-next v2] ixgbe: Implement PCI reset handler
Date: Sat, 20 Jun 2026 09:53:27 +0100	[thread overview]
Message-ID: <20260620085327.965597-1-horms@kernel.org> (raw)
In-Reply-To: <20260618142212.310475-1-sergey.temerkhanov@intel.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ixgbe: Implement PCI reset handler

Implement PCI device reset handler to allow the network device to
get re-initialized and function after a PCI-level reset.

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
[ ... ]
> @@ -12338,6 +12338,86 @@ static pci_ers_result_t ixgbe_io_slot_reset(struct pci_dev *pdev)
> +static void ixgbe_reset_prep(struct pci_dev *pdev)
> +{
> +	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
> +
> +	if (!adapter)
> +		return;
> +
> +	if (poll_timeout_us(test_and_set_bit(__IXGBE_RESETTING, &adapter->state),
> +			    test_bit(__IXGBE_RESETTING, &adapter->state),

[Severity: Medium]
Does this introduce a new instance of an ad-hoc bit lock?

According to the networking drivers guidelines, using bit operations like
test_and_set_bit() to guard critical sections instead of proper
synchronization primitives like a mutex or spinlock can hide issues from
lockdep and lacks deadlock detection.

[Severity: Critical]
Does this misuse the poll_timeout_us() macro?

The macro executes the operation before evaluating the condition. Since
test_and_set_bit() unconditionally sets the bit to 1, test_bit() will always
evaluate to true on the very first loop iteration, regardless of whether the
lock was already held.

Doesn't this completely break mutual exclusion?

> +			    IXGBE_RESET_PREP_POLL_INTERVAL_US,
> +			    IXGBE_RESET_PREP_TIMEOUT_US, false)) {
> +		/* ixgbe_reset_done() will exit early if this happens.
> +		 * A retry will be needed
> +		 */
> +		e_err(drv, "Timed out waiting for __IXGBE_RESETTING to be released. Reset is needed\n");
> +		return;

[Severity: High]
If this times out and returns early, won't it abandon the software
reinitialization process while the PCI subsystem proceeds with the physical
reset?

The .reset_prepare callback returns void, so the reset continues. By
returning early without setting __IXGBE_PCIE_RESET_IN_PROGRESS, critical
teardowns are skipped, and ixgbe_reset_done() will also bail out early:

ixgbe_reset_done() {
    if (!test_and_clear_bit(__IXGBE_PCIE_RESET_IN_PROGRESS, &adapter->state)) {
        e_err(drv, "Reset done called without PCIe reset in progress\n");
        return;
    }
}

Could this permanently break the network interface until the module is
reloaded?

> +	}
> +
> +	/* Sync __IXGBE_RESETTING */
> +	smp_mb__after_atomic();
> +
> +	if (test_bit(__IXGBE_SERVICE_INITED, &adapter->state)) {
> +		/* Prevent the service task from being requeued in the timer callback */
> +		timer_delete_sync(&adapter->service_timer);
> +		/* Cancel any possibly queued service task */
> +		cancel_work_sync(&adapter->service_task);

[Severity: Critical]
Can this sequence cause a deadlock with the service task?

Here in ixgbe_reset_prep(), __IXGBE_RESETTING has just been acquired,
and then cancel_work_sync(&adapter->service_task) blocks waiting for
the service task to complete.

However, if the active service task reaches ixgbe_reinit_locked():

ixgbe_reinit_locked() {
    while (test_and_set_bit(__IXGBE_RESETTING, &adapter->state))
        usleep_range(1000, 2000);
    ...
}

The service task will spin infinitely waiting for __IXGBE_RESETTING to be
cleared. Since ixgbe_reset_prep() holds the bit lock and is blocked waiting
for the service task, neither thread can make progress.

> +	}
> +
> +	pci_clear_master(pdev);
> +
> +	set_bit(__IXGBE_PCIE_RESET_IN_PROGRESS, &adapter->state);
> +}
[ ... ]
> +static void ixgbe_reset_done(struct pci_dev *pdev)
> +{
> +	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
[ ... ]
> +	/* Allow the service task to run */
> +	if (!test_bit(__IXGBE_REMOVING, &adapter->state)) {
> +		clear_bit(__IXGBE_RESETTING, &adapter->state);
> +		/* Sync __IXGBE_RESETTING */
> +		smp_mb__after_atomic();

[Severity: Medium]
Does this provide the correct release semantics for the ad-hoc lock?

The clear_bit() operation does not imply a memory barrier, and
smp_mb__after_atomic() only orders memory operations that occur after the
atomic operation.

Could memory accesses from within the critical section (like the
pci_set_master() call earlier) bleed past the lock release, allowing other
threads to acquire the lock too early?

Is it possible to use clear_bit_unlock() or place an smp_mb__before_atomic()
prior to the clear_bit() instead?

> +	}
[ ... ]

      reply	other threads:[~2026-06-20  8:53 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-18 14:22 [Intel-wired-lan] [PATCH iwl-next v2] ixgbe: Implement PCI reset handler Sergey Temerkhanov
2026-06-20  8:53 ` Simon Horman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260620085327.965597-1-horms@kernel.org \
    --to=horms@kernel.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=netdev@vger.kernel.org \
    --cc=pmenzel@molgen.mpg.de \
    --cc=sergey.temerkhanov@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox