From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Tomas Henzl <thenzl@redhat.com>,
	Don Brace <don.brace@microchip.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Sasha Levin <sashal@kernel.org>,
	jejb@linux.ibm.com, storagedev@microchip.com,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 7.0-5.15] scsi: smartpqi: Silence a recursive lock warning
Date: Mon, 11 May 2026 18:19:23 -0400
Message-ID: <20260511221931.2370053-24-sashal@kernel.org>
In-Reply-To: <20260511221931.2370053-1-sashal@kernel.org>

From: Tomas Henzl <thenzl@redhat.com>

[ Upstream commit d65efdf467ff935e35dfe6aa9a7ab93f17ac07ee ]

On systems with multiple controllers debug kernel shows

  WARNING: possible recursive locking detected

during shutdown.

Each controller does have its own ctrl_info (and mutex) and that isn't
correctly recognized by debug kernel.  Suppress the warning by releasing
the mutex at the end of pqi_shutdown().

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Don Brace <don.brace@microchip.com>
Link: https://patch.msgid.link/20260414124118.23661-1-thenzl@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

## Decision Summary
Backport this to stable trees that contain the mutex-based
`pqi_ctrl_block_device_reset()` shutdown path, verified in `v5.15`,
`v6.1`, `v6.6`, `v6.12`, `v6.16`, and `v6.19`. Do not apply it as-is to
`v5.10`, where that helper only sets `block_device_reset = true` and no
`pqi_ctrl_unblock_device_reset()` helper exists.

This is a one-line fix for an imbalanced `lun_reset_mutex` lock in
`pqi_shutdown()`. The direct user-visible symptom is a lockdep “possible
recursive locking detected” warning during shutdown/reboot on systems
with multiple smartpqi controllers. The impact is mostly debug-kernel
noise rather than data loss or a production crash, but the patch
corrects real locking hygiene, is tiny, subsystem-reviewed, and has very
low risk.
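
The imbalance described above can be illustrated with a small userspace
model. This is only a sketch: `struct ctrl_model` and the `model_*`
functions are hypothetical stand-ins, a boolean flag replaces the real
mutex, and the quiesce/flush/reset steps are elided. It assumes only what
this report states, namely that the block helper locks
`lun_reset_mutex` and the unblock helper unlocks it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-controller state; not the driver's struct. */
struct ctrl_model {
	bool lun_reset_held;	/* models ownership of ctrl_info->lun_reset_mutex */
};

/* Models pqi_ctrl_block_device_reset(): acquire the mutex. */
static void model_block_device_reset(struct ctrl_model *ctrl)
{
	assert(!ctrl->lun_reset_held);
	ctrl->lun_reset_held = true;
}

/* Models pqi_ctrl_unblock_device_reset(): release the mutex. */
static void model_unblock_device_reset(struct ctrl_model *ctrl)
{
	assert(ctrl->lun_reset_held);
	ctrl->lun_reset_held = false;
}

/*
 * Models the lock handling in pqi_shutdown(). With with_fix == false the
 * function ends with the mutex still held, as before the patch.
 * Returns true if the lock is still held when shutdown finishes.
 */
static bool model_shutdown_leaves_lock_held(bool with_fix)
{
	struct ctrl_model ctrl = { .lun_reset_held = false };

	model_block_device_reset(&ctrl);
	/* quiesce, cache flush, and controller reset would happen here */
	if (with_fix)
		model_unblock_device_reset(&ctrl);

	return ctrl.lun_reset_held;
}
```

Calling `model_shutdown_leaves_lock_held(false)` models the pre-patch
path and reports the lock as still held; passing `true` models the
patched path, where the lock is released before returning.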

## Phase Walkthrough
Phase 1:
- Record 1.1: subsystem `scsi: smartpqi`; action `Silence`; intent is to
  stop a recursive lock warning during shutdown.
- Record 1.2: tags present: `Signed-off-by: Tomas Henzl`, `Acked-by: Don
  Brace`, `Link:
  https://patch.msgid.link/20260414124118.23661-1-thenzl@redhat.com`,
  `Signed-off-by: Martin K. Petersen`. No `Fixes:`, no `Reported-by:`,
  no `Cc: stable`.
- Record 1.3: message describes debug kernels warning on multi-
  controller systems: each controller has its own mutex, but lockdep
  does not treat them as distinct once shutdown leaves one held.
- Record 1.4: hidden bug fix: yes. It is described as silencing a
  warning, but the code adds a missing unlock for a mutex acquired
  earlier in the same function.

Phase 2:
- Record 2.1: one file, `drivers/scsi/smartpqi/smartpqi_init.c`, one
  insertion in `pqi_shutdown()`. Single-file surgical fix.
- Record 2.2: before, `pqi_shutdown()` locked
  `ctrl_info->lun_reset_mutex` via `pqi_ctrl_block_device_reset()` and
  returned after `pqi_reset()` without unlocking. After, it unlocks via
  `pqi_ctrl_unblock_device_reset()`.
- Record 2.3: bug category is synchronization/lock balancing. The
  changed helper is verified as
  `mutex_unlock(&ctrl_info->lun_reset_mutex)`.
- Record 2.4: fix quality is high: one existing helper call, no new API,
  no refactor. Main risk is allowing a reset waiter to proceed late in
  shutdown; Tomas explicitly discussed this risk on-list and said he
  checked it.

Phase 3:
- Record 3.1: blame shows the shutdown call to
  `pqi_ctrl_block_device_reset()` is old, but `9fa8202336096` changed
  the helper to a mutex-based block/unblock model. That is the relevant
  introduction point for the missing unlock.
- Record 3.2: no `Fixes:` tag, so no tagged introducing commit to
  follow.
- Record 3.3: recent file history shows normal smartpqi churn, including
  fixes and device-ID updates; no prerequisite for this one-line helper
  call was identified for v5.15+ style code.
- Record 3.4: Tomas Henzl has SCSI commits in history but no recent
  smartpqi commits found; Don Brace is listed as smartpqi maintainer and
  acked the patch.
- Record 3.5: dependency is the existing
  `pqi_ctrl_unblock_device_reset()` helper. It exists in v5.15+ verified
  tags, not in v5.10.

Phase 4:
- Record 4.1: candidate commit hash was not available locally, so `b4
  dig -c` could not be used for this candidate. `b4 mbox` and `b4 am`
  using the Link fetched the original thread.
- Record 4.2: original recipients were `linux-scsi` and Don Brace; Don
  Brace acked it; Martin Petersen applied it.
- Record 4.3: external thread and an earlier related LKML post show a
  real lockdep splat with call trace through `__do_sys_reboot ->
  device_shutdown -> pci_device_shutdown -> pqi_shutdown`.
- Record 4.4: no newer v2/v3 was reported by `b4 mbox -c`; thread had
  six messages. A separate 2025 lockdep-key proposal for the same
  warning was found, but it is not present in this tree.
- Record 4.5: web search found no relevant stable-list discussion.

Phase 5:
- Record 5.1: modified function: `pqi_shutdown()`.
- Record 5.2: caller is PCI driver `.shutdown = pqi_shutdown`; this is
  reached from PCI/device shutdown during reboot/poweroff paths.
- Record 5.3: relevant callees are `pqi_wait_until_ofa_finished()`,
  `pqi_scsi_block_requests()`, `pqi_ctrl_block_device_reset()`,
  `pqi_ctrl_block_requests()`, `pqi_ctrl_wait_until_quiesced()`,
  `pqi_flush_cache()`, `pqi_crash_if_pending_command()`, `pqi_reset()`,
  and now `pqi_ctrl_unblock_device_reset()`.
- Record 5.4: verified external call trace reaches `pqi_shutdown()` from
  reboot. Trigger requires multiple smartpqi controllers and a
  debug/lockdep kernel.
- Record 5.5: similar lock/unlock pairing exists in OFA and
  suspend/resume paths; shutdown was the unmatched case.
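
Why the trigger needs more than one controller can be sketched with a toy
model of class-based lock tracking. It is a simplification, not lockdep's
implementation: it assumes only that locks initialised at the same site
share one class, that the checker flags an acquisition when a lock of the
same class is already held by the task, and that both controllers are
shut down from the same task (as the reboot call trace suggests). All
type and function names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* One class is shared by every lock instance of the same kind. */
struct lock_class {
	const char *name;
};

struct model_lock {
	const struct lock_class *class;
};

/* Locks currently held by the (single) modelled task. */
struct held_stack {
	const struct model_lock *locks[MAX_HELD];
	size_t depth;
};

/* Records the acquisition; returns true if the same class was already held. */
static bool model_acquire(struct held_stack *hs, const struct model_lock *l)
{
	bool same_class_held = false;

	for (size_t i = 0; i < hs->depth; i++)
		if (hs->locks[i]->class == l->class)
			same_class_held = true;
	if (hs->depth < MAX_HELD)
		hs->locks[hs->depth++] = l;
	return same_class_held;
}

/* Drops the lock from the held set, if present. */
static void model_release(struct held_stack *hs, const struct model_lock *l)
{
	for (size_t i = 0; i < hs->depth; i++) {
		if (hs->locks[i] == l) {
			hs->locks[i] = hs->locks[hs->depth - 1];
			hs->depth--;
			return;
		}
	}
}

/* Shuts down two controllers in one task; returns the number of warnings. */
static int model_shutdown_two_controllers(bool with_fix)
{
	static const struct lock_class lun_reset_class = { "lun_reset_mutex" };
	struct model_lock ctrl0 = { &lun_reset_class };
	struct model_lock ctrl1 = { &lun_reset_class };
	const struct model_lock *ctrls[] = { &ctrl0, &ctrl1 };
	struct held_stack hs = { .depth = 0 };
	int warnings = 0;

	for (size_t i = 0; i < 2; i++) {
		if (model_acquire(&hs, ctrls[i]))
			warnings++;	/* "possible recursive locking" analogue */
		if (with_fix)
			model_release(&hs, ctrls[i]);	/* the added unblock call */
	}
	return warnings;
}
```

In this model the unpatched sequence produces one warning, raised when
the second controller's lock is taken while the first is still held, and
the patched sequence produces none. A single controller never warns in
either case, which matches the multi-controller trigger condition.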

Phase 6:
- Record 6.1: verified `v5.15`, `v6.1`, `v6.6`, `v6.12`, `v6.16`, and
  `v6.19` have mutex-based block/unblock helpers and shutdown lacks the
  final unblock. Verified `v5.10` does not have the mutex helper.
- Record 6.2: expected backport difficulty is clean or trivial for
  v5.15+ style trees because the exact helper and shutdown context
  exist. v5.10 is not applicable as-is.
- Record 6.3: no related fix already present in the checked local tree;
  `lun_reset_key` proposal is absent.

Phase 7:
- Record 7.1: subsystem is SCSI storage driver, `smartpqi`; criticality
  is driver-specific but storage-related.
- Record 7.2: subsystem is active; recent history shows ongoing fixes,
  device IDs, and driver updates.

Phase 8:
- Record 8.1: affected users are systems with Microchip/Microsemi
  SmartPQI controllers, especially multiple controllers with
  debug/lockdep kernels.
- Record 8.2: trigger is shutdown/reboot. The verified external trace
  shows reboot path; unprivileged triggerability was not verified.
- Record 8.3: failure mode is a lockdep warning caused by a lock
  imbalance; severity is medium-low in production terms, but the fix is
  valid for debug-kernel correctness.
- Record 8.4: benefit is moderate for affected systems and CI/debug
  kernels; risk is very low because this is one line using an existing
  helper after a matching lock.

Phase 9:
- Record 9.1: evidence for backporting: real lock imbalance,
  reproducible lockdep warning, one-line fix, maintainer ack, existing
  helper, verified affected stable baselines v5.15+. Evidence against:
  symptom is mainly debug warning, not crash/data corruption; v5.10 not
  applicable as-is.
- Record 9.2: stable rules: obviously correct yes; fixes a real bug yes;
  important issue borderline but acceptable due to the lockdep warning
  and tiny risk; small/contained yes; no new features/APIs yes; likely
  applies cleanly or trivially to v5.15+ style trees.
- Record 9.3: no exception category applies.
- Record 9.4: risk-benefit favors backporting for applicable stable
  trees.

## Verification
- Phase 1: Parsed supplied commit message and `b4 am` output; confirmed
  tags and absence of `Fixes:`/stable/Reported-by.
- Phase 2: Read `smartpqi_init.c`; confirmed
  `pqi_ctrl_block_device_reset()` is `mutex_lock()` and
  `pqi_ctrl_unblock_device_reset()` is `mutex_unlock()`.
- Phase 3: Used `git blame`, `git show 0530736e40a069`, and `git show
  9fa8202336096d`; confirmed helper semantics changed to mutex model in
  the shutdown/suspend update.
- Phase 4: `WebFetch` to `patch.msgid.link` was blocked by Anubis; `b4
  mbox`/`b4 am` fetched the lore thread successfully. `b4 am` reported
  the patch applies cleanly to current tree.
- Phase 4: Read lore mirror; confirmed Bart’s “patch looks fine”
  comment, Tomas’s risk discussion, Don Brace’s ack, and Martin
  Petersen’s apply notice.
- Phase 5: Used exact searches and file reads to trace `.shutdown =
  pqi_shutdown`, SCSI reset handlers, and related lock users.
- Phase 6: Used version tags to verify affected code in `v5.15+` and
  non-applicability to `v5.10`.
- Phase 7: Checked `MAINTAINERS`; confirmed Don Brace maintains
  smartpqi.
- Phase 8: External LKML mirror provided the concrete lockdep call trace
  and trigger conditions.
- Unverified: I did not build-test the patch and did not verify stable
  branch-specific conflicts beyond version-tag code presence.

**YES**

 drivers/scsi/smartpqi/smartpqi_init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index b4ed991976d06..2026ac645d6ab 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -9427,6 +9427,7 @@ static void pqi_shutdown(struct pci_dev *pci_dev)
 
 	pqi_crash_if_pending_command(ctrl_info);
 	pqi_reset(ctrl_info);
+	pqi_ctrl_unblock_device_reset(ctrl_info);
 }
 
 static void pqi_process_lockup_action_param(void)
-- 
2.53.0

