From: Simon Horman
To: sergey.temerkhanov@intel.com
Cc: Simon Horman, intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org, pmenzel@molgen.mpg.de
Subject: Re: [PATCH iwl-next v2] ixgbe: Implement PCI reset handler
Date: Sat, 20 Jun 2026 09:53:27 +0100
Message-ID: <20260620085327.965597-1-horms@kernel.org>
In-Reply-To: <20260618142212.310475-1-sergey.temerkhanov@intel.com>
References: <20260618142212.310475-1-sergey.temerkhanov@intel.com>
X-Mailing-List: netdev@vger.kernel.org

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ixgbe: Implement PCI reset handler

Implement PCI device reset handler to allow the network device to get
re-initialized and function after a PCI-level reset.

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c

[ ... ]

> @@ -12338,6 +12338,86 @@ static pci_ers_result_t ixgbe_io_slot_reset(struct pci_dev *pdev)
> +static void ixgbe_reset_prep(struct pci_dev *pdev)
> +{
> +	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
> +
> +	if (!adapter)
> +		return;
> +
> +	if (poll_timeout_us(test_and_set_bit(__IXGBE_RESETTING, &adapter->state),
> +			    test_bit(__IXGBE_RESETTING, &adapter->state),

[Severity: Medium]
Does this introduce a new instance of an ad-hoc bit lock?
According to the networking drivers guidelines, using bit operations like
test_and_set_bit() to guard critical sections instead of proper
synchronization primitives like a mutex or spinlock can hide issues from
lockdep and lacks deadlock detection.

[Severity: Critical]
Does this misuse the poll_timeout_us() macro?

The macro executes the operation before evaluating the condition. Since
test_and_set_bit() unconditionally sets the bit to 1, test_bit() will always
evaluate to true on the very first loop iteration, regardless of whether the
lock was already held. The poll therefore reports success immediately, and
the return value of test_and_set_bit(), which is the only indication of
whether the bit was previously set, is discarded.

Doesn't this completely break mutual exclusion?

> +			    IXGBE_RESET_PREP_POLL_INTERVAL_US,
> +			    IXGBE_RESET_PREP_TIMEOUT_US, false)) {
> +		/* ixgbe_reset_done() will exit early if this happens.
> +		 * A retry will be needed
> +		 */
> +		e_err(drv, "Timed out waiting for __IXGBE_RESETTING to be released. Reset is needed\n");
> +		return;

[Severity: High]
If this times out and returns early, won't it abandon the software
reinitialization process while the PCI subsystem proceeds with the physical
reset?

The .reset_prepare callback returns void, so the reset continues. By
returning early without setting __IXGBE_PCIE_RESET_IN_PROGRESS, critical
teardowns are skipped, and ixgbe_reset_done() will also bail out early:

ixgbe_reset_done() {
	if (!test_and_clear_bit(__IXGBE_PCIE_RESET_IN_PROGRESS, &adapter->state)) {
		e_err(drv, "Reset done called without PCIe reset in progress\n");
		return;
	}
}

Could this permanently break the network interface until the module is
reloaded?

> +	}
> +
> +	/* Sync __IXGBE_RESETTING */
> +	smp_mb__after_atomic();
> +
> +	if (test_bit(__IXGBE_SERVICE_INITED, &adapter->state)) {
> +		/* Prevent the service task from being requeued in the timer callback */
> +		timer_delete_sync(&adapter->service_timer);
> +		/* Cancel any possibly queued service task */
> +		cancel_work_sync(&adapter->service_task);

[Severity: Critical]
Can this sequence cause a deadlock with the service task?
Here in ixgbe_reset_prep(), __IXGBE_RESETTING has just been acquired, and
then cancel_work_sync(&adapter->service_task) blocks waiting for the service
task to complete.

However, if the active service task reaches ixgbe_reinit_locked():

ixgbe_reinit_locked() {
	while (test_and_set_bit(__IXGBE_RESETTING, &adapter->state))
		usleep_range(1000, 2000);
	...
}

The service task will loop indefinitely waiting for __IXGBE_RESETTING to be
cleared. Since ixgbe_reset_prep() holds the bit lock and is blocked waiting
for the service task, neither thread can make progress.

> +	}
> +
> +	pci_clear_master(pdev);
> +
> +	set_bit(__IXGBE_PCIE_RESET_IN_PROGRESS, &adapter->state);
> +}

[ ... ]

> +static void ixgbe_reset_done(struct pci_dev *pdev)
> +{
> +	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);

[ ... ]

> +	/* Allow the service task to run */
> +	if (!test_bit(__IXGBE_REMOVING, &adapter->state)) {
> +		clear_bit(__IXGBE_RESETTING, &adapter->state);
> +		/* Sync __IXGBE_RESETTING */
> +		smp_mb__after_atomic();

[Severity: Medium]
Does this provide the correct release semantics for the ad-hoc lock?

The clear_bit() operation does not imply a memory barrier, and an
smp_mb__after_atomic() placed after it only orders later accesses against
the clear. It does not order the accesses that precede clear_bit() against
the clear itself.

Could memory accesses from within the critical section (like the
pci_set_master() call earlier) bleed past the lock release, allowing other
threads to acquire the lock too early?

Is it possible to use clear_bit_unlock() or place an smp_mb__before_atomic()
prior to the clear_bit() instead?

> +	}

[ ... ]
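
To make the poll_timeout_us() and release-ordering points above concrete,
here is a minimal userspace sketch using C11 atomics. It is not kernel code
and not a proposed replacement for the patch; every demo_* name is
hypothetical. It only illustrates two ideas: acquisition is decided by the
old value returned from the test-and-set (with acquire ordering), and the
release clears the bit with release ordering, which is roughly what the
test_and_set_bit_lock() / clear_bit_unlock() pair provides in the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the __IXGBE_RESETTING bit in adapter->state. */
#define DEMO_RESETTING_BIT 0x1UL

/*
 * Atomically set the bit and report whether it was ALREADY set,
 * mirroring the return convention of test_and_set_bit().
 * Acquire ordering keeps critical-section accesses after the set.
 */
static bool demo_test_and_set(atomic_ulong *state)
{
	unsigned long old = atomic_fetch_or_explicit(state, DEMO_RESETTING_BIT,
						     memory_order_acquire);

	return (old & DEMO_RESETTING_BIT) != 0;
}

/*
 * Try to take the bit lock, giving up after max_tries attempts.
 * Returns true only if this caller changed the bit from 0 to 1.
 */
static bool demo_bitlock_acquire(atomic_ulong *state, unsigned int max_tries)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		/* Success depends on the old value, not on re-reading the bit. */
		if (!demo_test_and_set(state))
			return true;
		/* A real caller would sleep here between attempts. */
	}
	return false;
}

/*
 * Drop the bit lock. Release ordering keeps critical-section accesses
 * before the clear, so they cannot drift past the unlock.
 */
static void demo_bitlock_release(atomic_ulong *state)
{
	atomic_fetch_and_explicit(state, ~DEMO_RESETTING_BIT,
				  memory_order_release);
}
```

With the bit initially clear, a first demo_bitlock_acquire() succeeds, a
second one fails after its retries because the bit is still held, and an
acquire after demo_bitlock_release() succeeds again. Re-reading the bit after
setting it, as the quoted poll_timeout_us() condition does, would report
success in all three cases.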