Linux kernel -stable discussions
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Tomas Henzl <thenzl@redhat.com>,
	Don Brace <don.brace@microchip.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Sasha Levin <sashal@kernel.org>,
	jejb@linux.ibm.com, storagedev@microchip.com,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 7.0-5.15] scsi: smartpqi: Silence a recursive lock warning
Date: Mon, 11 May 2026 18:19:23 -0400	[thread overview]
Message-ID: <20260511221931.2370053-24-sashal@kernel.org> (raw)
In-Reply-To: <20260511221931.2370053-1-sashal@kernel.org>

From: Tomas Henzl <thenzl@redhat.com>

[ Upstream commit d65efdf467ff935e35dfe6aa9a7ab93f17ac07ee ]

On systems with multiple controllers debug kernel shows

  WARNING: possible recursive locking detected

during shutdown.

Each controller does have its own ctrl_info (and mutex) and that isn't
correctly recognized by debug kernel.  Suppress the warning by releasing
the mutex at the end of pqi_shutdown().

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Don Brace <don.brace@microchip.com>
Link: https://patch.msgid.link/20260414124118.23661-1-thenzl@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

## Decision Summary
Backport this to stable trees that contain the mutex-based
`pqi_ctrl_block_device_reset()` shutdown path, verified in `v5.15`,
`v6.1`, `v6.6`, `v6.12`, `v6.16`, and `v6.19`. Do not apply it as-is to
`v5.10`, where that helper only sets `block_device_reset = true` and no
`pqi_ctrl_unblock_device_reset()` helper exists.

This is a one-line fix for an imbalanced `lun_reset_mutex` lock in
`pqi_shutdown()`. The direct user-visible symptom is a lockdep “possible
recursive locking detected” warning during shutdown/reboot on systems
with multiple smartpqi controllers. The impact is mostly debug-kernel
noise rather than data loss or a production crash, but the patch
corrects real locking hygiene, is tiny, subsystem-reviewed, and has very
low risk.

## Phase Walkthrough
Phase 1:
- Record 1.1: subsystem `scsi: smartpqi`; action `Silence`; intent is to
  stop a recursive lock warning during shutdown.
- Record 1.2: tags present: `Signed-off-by: Tomas Henzl`, `Acked-by: Don
  Brace`, `Link:
  https://patch.msgid.link/20260414124118.23661-1-thenzl@redhat.com`,
  `Signed-off-by: Martin K. Petersen`. No `Fixes:`, no `Reported-by:`,
  no `Cc: stable`.
- Record 1.3: message describes debug kernels warning on multi-
  controller systems because distinct per-controller mutexes are not
  recognized as distinct after shutdown leaves one held.
- Record 1.4: hidden bug fix: yes. It is described as silencing a
  warning, but the code adds a missing unlock for a mutex acquired
  earlier in the same function.

Phase 2:
- Record 2.1: one file, `drivers/scsi/smartpqi/smartpqi_init.c`, one
  insertion in `pqi_shutdown()`. Single-file surgical fix.
- Record 2.2: before, `pqi_shutdown()` locked
  `ctrl_info->lun_reset_mutex` via `pqi_ctrl_block_device_reset()` and
  returned after `pqi_reset()` without unlocking. After, it unlocks via
  `pqi_ctrl_unblock_device_reset()`.
- Record 2.3: bug category is synchronization/lock balancing. The
  changed helper is verified as
  `mutex_unlock(&ctrl_info->lun_reset_mutex)`.
- Record 2.4: fix quality is high: one existing helper call, no new API,
  no refactor. Main risk is allowing a reset waiter to proceed late in
  shutdown; Tomas explicitly discussed this risk on-list and said he
  checked it.

Phase 3:
- Record 3.1: blame shows the shutdown call to
  `pqi_ctrl_block_device_reset()` is old, but `9fa8202336096` changed
  the helper to a mutex-based block/unblock model. That is the relevant
  introduction point for the missing unlock.
- Record 3.2: no `Fixes:` tag, so no tagged introducing commit to
  follow.
- Record 3.3: recent file history shows normal smartpqi churn, including
  fixes and device-ID updates; no prerequisite for this one-line helper
  call was identified for v5.15+ style code.
- Record 3.4: Tomas Henzl has SCSI commits in history but no recent
  smartpqi commits found; Don Brace is listed as smartpqi maintainer and
  acked the patch.
- Record 3.5: dependency is the existing
  `pqi_ctrl_unblock_device_reset()` helper. It exists in v5.15+ verified
  tags, not in v5.10.

Phase 4:
- Record 4.1: candidate commit hash was not available locally, so `b4
  dig -c` could not be used for this candidate. `b4 mbox` and `b4 am`
  using the Link fetched the original thread.
- Record 4.2: original recipients were `linux-scsi` and Don Brace; Don
  Brace acked it; Martin Petersen applied it.
- Record 4.3: external thread and an earlier related LKML post show a
  real lockdep splat with call trace through `__do_sys_reboot ->
  device_shutdown -> pci_device_shutdown -> pqi_shutdown`.
- Record 4.4: no newer v2/v3 was reported by `b4 mbox -c`; thread had
  six messages. A separate 2025 lockdep-key proposal for the same
  warning was found, but it is not present in this tree.
- Record 4.5: web search found no relevant stable-list discussion.

Phase 5:
- Record 5.1: modified function: `pqi_shutdown()`.
- Record 5.2: caller is PCI driver `.shutdown = pqi_shutdown`; this is
  reached from PCI/device shutdown during reboot/poweroff paths.
- Record 5.3: relevant callees are `pqi_wait_until_ofa_finished()`,
  `pqi_scsi_block_requests()`, `pqi_ctrl_block_device_reset()`,
  `pqi_ctrl_block_requests()`, `pqi_ctrl_wait_until_quiesced()`,
  `pqi_flush_cache()`, `pqi_crash_if_pending_command()`, `pqi_reset()`,
  and now `pqi_ctrl_unblock_device_reset()`.
- Record 5.4: verified external call trace reaches `pqi_shutdown()` from
  reboot. Trigger requires multiple smartpqi controllers and a
  debug/lockdep kernel.
- Record 5.5: similar lock/unlock pairing exists in OFA and
  suspend/resume paths; shutdown was the unmatched case.

Phase 6:
- Record 6.1: verified `v5.15`, `v6.1`, `v6.6`, `v6.12`, `v6.16`, and
  `v6.19` have mutex-based block/unblock helpers and shutdown lacks the
  final unblock. Verified `v5.10` does not have the mutex helper.
- Record 6.2: expected backport difficulty is clean or trivial for
  v5.15+ style trees because the exact helper and shutdown context
  exist. v5.10 is not applicable as-is.
- Record 6.3: no related fix already present in the checked local tree;
  `lun_reset_key` proposal is absent.

Phase 7:
- Record 7.1: subsystem is SCSI storage driver, `smartpqi`; criticality
  is driver-specific but storage-related.
- Record 7.2: subsystem is active; recent history shows ongoing fixes,
  device IDs, and driver updates.

Phase 8:
- Record 8.1: affected users are systems with Microchip/Microsemi
  SmartPQI controllers, especially multiple controllers with
  debug/lockdep kernels.
- Record 8.2: trigger is shutdown/reboot. The verified external trace
  shows reboot path; unprivileged triggerability was not verified.
- Record 8.3: failure mode is lockdep warning/lock imbalance, severity
  medium-low in production terms but valid for debug-kernel correctness.
- Record 8.4: benefit is moderate for affected systems and CI/debug
  kernels; risk is very low because this is one line using an existing
  helper after a matching lock.

Phase 9:
- Record 9.1: evidence for backporting: real lock imbalance,
  reproducible lockdep warning, one-line fix, maintainer ack, existing
  helper, verified affected stable baselines v5.15+. Evidence against:
  symptom is mainly debug warning, not crash/data corruption; v5.10 not
  applicable as-is.
- Record 9.2: stable rules: obviously correct yes; fixes a real bug yes;
  important issue borderline but acceptable due lockdep warning and tiny
  risk; small/contained yes; no new features/APIs yes; applies to v5.15+
  style trees likely clean/trivial.
- Record 9.3: no exception category applies.
- Record 9.4: risk-benefit favors backporting for applicable stable
  trees.

## Verification
- Phase 1: Parsed supplied commit message and `b4 am` output; confirmed
  tags and absence of `Fixes:`/stable/Reported-by.
- Phase 2: Read `smartpqi_init.c`; confirmed
  `pqi_ctrl_block_device_reset()` is `mutex_lock()` and
  `pqi_ctrl_unblock_device_reset()` is `mutex_unlock()`.
- Phase 3: Used `git blame`, `git show 0530736e40a069`, and `git show
  9fa8202336096d`; confirmed helper semantics changed to mutex model in
  the shutdown/suspend update.
- Phase 4: `WebFetch` to `patch.msgid.link` was blocked by Anubis; `b4
  mbox`/`b4 am` fetched the lore thread successfully. `b4 am` reported
  the patch applies cleanly to current tree.
- Phase 4: Read lore mirror; confirmed Bart’s “patch looks fine”
  comment, Tomas’s risk discussion, Don Brace’s ack, and Martin
  Petersen’s apply notice.
- Phase 5: Used exact searches and file reads to trace `.shutdown =
  pqi_shutdown`, SCSI reset handlers, and related lock users.
- Phase 6: Used version tags to verify affected code in `v5.15+` and
  non-applicability to `v5.10`.
- Phase 7: Checked `MAINTAINERS`; confirmed Don Brace maintains
  smartpqi.
- Phase 8: External LKML mirror provided the concrete lockdep call trace
  and trigger conditions.
- Unverified: I did not build-test the patch and did not verify stable
  branch-specific conflicts beyond version-tag code presence.

**YES**

 drivers/scsi/smartpqi/smartpqi_init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index b4ed991976d06..2026ac645d6ab 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -9427,6 +9427,7 @@ static void pqi_shutdown(struct pci_dev *pci_dev)
 
 	pqi_crash_if_pending_command(ctrl_info);
 	pqi_reset(ctrl_info);
+	pqi_ctrl_unblock_device_reset(ctrl_info);
 }
 
 static void pqi_process_lockup_action_param(void)
-- 
2.53.0


  parent reply	other threads:[~2026-05-11 22:20 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 22:19 [PATCH AUTOSEL 7.0-5.10] ALSA: sparc/dbri: add missing fallthrough Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.6] docs: cgroup-v1: Update charge-commit section Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.10] drm/panel: feiyang-fy07024di26a30d: return display-on error Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.1] smb: client: Zero-pad short GSS session keys per MS-SMB2 Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.15] wifi: nl80211: re-check wiphy netns in nl80211_prepare_wdev_dump() continuation Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.6] ipv6: Implement limits on extension header parsing Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.12] net: usb: cdc_ncm: add Apple Mac USB-C direct networking quirk Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.15] net: usb: r8152: add TRENDnet TUC-ET2G v2.0 Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.10] i2c: dev: prevent integer overflow in I2C_TIMEOUT ioctl Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.10] powerpc/vmx: avoid KASAN instrumentation in enter_vmx_ops() for kexec Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.18] ALSA: usb-audio: add min_mute quirk for Razer Nommo V2 X Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.10] wifi: libertas: fix integer underflow in process_cmdrequest() Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0] io_uring/wait: honour caller's time namespace for IORING_ENTER_ABS_TIMER Sasha Levin
2026-05-12 15:47   ` Jens Axboe
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.10] wifi: nl80211: require CAP_NET_ADMIN over the target netns in SET_WIPHY_NETNS Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.12] media: qcom: camss: avoid format string warning Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.10] scsi: scsi_dh_alua: Increase default ALUA timeout to maximum spec value Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.18] Bluetooth: hci_uart: Fix NULL deref in recv callbacks when priv is uninitialized Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0] ALSA: hda/realtek: Add mute LED fixup for HP Pavilion 15-cs1xxx Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.15] ALSA: usb-audio: Add quirk flags for AlphaTheta EUPHONIA Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.18] ALSA: hda/realtek: Add codec SSID quirk for Lenovo Yoga Pro 9 16IMH9 Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.10] fbdev: ipu-v3: clean up kernel-doc warnings Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.6] ASoC: amd: yc: Add DMI quirk for MSI Bravo 15 C7VE Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.1] powerpc/pasemi: Drop redundant res assignment Sasha Levin
2026-05-11 22:19 ` Sasha Levin [this message]
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.18] powerpc/pseries/htmdump: Free the global buffers in htmdump module exit Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.10] i2c: acpi: Add ELAN0678 to i2c_acpi_force_100khz_device_ids Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-6.18] iommu/amd: Use maximum Event log buffer size when SNP is enabled on Family 0x19 Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0-5.10] ALSA: usb-audio: add clock quirk for Motu 1248 Sasha Levin
2026-05-11 22:19 ` [PATCH AUTOSEL 7.0] ASoC: sdw_utils: avoid the SDCA companion function not supported failure Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260511221931.2370053-24-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=don.brace@microchip.com \
    --cc=jejb@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    --cc=storagedev@microchip.com \
    --cc=thenzl@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox