Linux kernel -stable discussions
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev,
	Aliaksei Makarau <Aliaksei.Makarau@ibm.com>,
	Mahanta Jambigi <mjambigi@linux.ibm.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Alexandra Winter <wintera@linux.ibm.com>,
	Simon Horman <horms@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 6.6 23/76] s390/ism: fix concurrency management in ism_cmd()
Date: Wed, 30 Jul 2025 11:35:16 +0200	[thread overview]
Message-ID: <20250730093227.742807414@linuxfoundation.org> (raw)
In-Reply-To: <20250730093226.854413920@linuxfoundation.org>

6.6-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Halil Pasic <pasic@linux.ibm.com>

[ Upstream commit 897e8601b9cff1d054cdd53047f568b0e1995726 ]

The s390x ISM device data sheet clearly states that only one
request-response sequence is allowable per ISM function at any point in
time.  Unfortunately as of today the s390/ism driver in Linux does not
honor that requirement. This patch aims to rectify that.

This problem was discovered based on Aliaksei's bug report which states
that for certain workloads the ISM functions end up entering error state
(with PEC 2 as seen from the logs) after a while and as a consequence
connections handled by the respective function break, and for future
connection requests the ISM device is not considered -- given it is in a
dysfunctional state. During further debugging PEC 3A was observed as
well.

A kernel message like
[ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a
is a reliable indicator of the stated function entering error state
with PEC 2. Let me also point out that a kernel message like
[ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery
is a reliable indicator that the ISM function won't be auto-recovered
because the ISM driver currently lacks support for it.

On a technical level, without this synchronization, commands (inputs to
the FW) may be partially or fully overwritten (corrupted) by another CPU
trying to issue commands on the same function. There is hard evidence that
this can lead to DMB token values being used as DMB IOVAs, leading to
PEC 2 PCI events indicating invalid DMA. But this is only one of the
failure modes imaginable. In theory even completely losing one command
and executing another one twice and then trying to interpret the outputs
as if the command we intended to execute was actually executed and not
the other one is also possible.  Frankly, I don't feel confident about
providing an exhaustive list of possible consequences.

Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory")
Reported-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com>
Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Tested-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722161817.1298473-1-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/s390/net/ism_drv.c | 3 +++
 include/linux/ism.h        | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/s390/net/ism_drv.c b/drivers/s390/net/ism_drv.c
index af0d90beba638..76ba36c83e522 100644
--- a/drivers/s390/net/ism_drv.c
+++ b/drivers/s390/net/ism_drv.c
@@ -130,6 +130,7 @@ static int ism_cmd(struct ism_dev *ism, void *cmd)
 	struct ism_req_hdr *req = cmd;
 	struct ism_resp_hdr *resp = cmd;
 
+	spin_lock(&ism->cmd_lock);
 	__ism_write_cmd(ism, req + 1, sizeof(*req), req->len - sizeof(*req));
 	__ism_write_cmd(ism, req, 0, sizeof(*req));
 
@@ -143,6 +144,7 @@ static int ism_cmd(struct ism_dev *ism, void *cmd)
 	}
 	__ism_read_cmd(ism, resp + 1, sizeof(*resp), resp->len - sizeof(*resp));
 out:
+	spin_unlock(&ism->cmd_lock);
 	return resp->ret;
 }
 
@@ -630,6 +632,7 @@ static int ism_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return -ENOMEM;
 
 	spin_lock_init(&ism->lock);
+	spin_lock_init(&ism->cmd_lock);
 	dev_set_drvdata(&pdev->dev, ism);
 	ism->pdev = pdev;
 	ism->dev.parent = &pdev->dev;
diff --git a/include/linux/ism.h b/include/linux/ism.h
index 9a4c204df3da1..04e2fc1973ce4 100644
--- a/include/linux/ism.h
+++ b/include/linux/ism.h
@@ -28,6 +28,7 @@ struct ism_dmb {
 
 struct ism_dev {
 	spinlock_t lock; /* protects the ism device */
+	spinlock_t cmd_lock; /* serializes cmds */
 	struct list_head list;
 	struct pci_dev *pdev;
 
-- 
2.39.5




  parent reply	other threads:[~2025-07-30  9:37 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-30  9:34 [PATCH 6.6 00/76] 6.6.101-rc1 review Greg Kroah-Hartman
2025-07-30  9:34 ` [PATCH 6.6 01/76] Input: gpio-keys - fix a sleep while atomic with PREEMPT_RT Greg Kroah-Hartman
2025-07-30  9:34 ` [PATCH 6.6 02/76] virtio_ring: Fix error reporting in virtqueue_resize Greg Kroah-Hartman
2025-07-30  9:34 ` [PATCH 6.6 03/76] regulator: core: fix NULL dereference on unbind due to stale coupling data Greg Kroah-Hartman
2025-07-30  9:34 ` [PATCH 6.6 04/76] RDMA/core: Rate limit GID cache warning messages Greg Kroah-Hartman
2025-07-30  9:34 ` [PATCH 6.6 05/76] interconnect: qcom: sc7280: Add missing num_links to xm_pcie3_1 node Greg Kroah-Hartman
2025-07-30  9:34 ` [PATCH 6.6 06/76] iio: adc: ad7949: use spi_is_bpw_supported() Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 07/76] regmap: fix potential memory leak of regmap_bus Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 08/76] x86/hyperv: Fix usage of cpu_online_mask to get valid cpu Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 09/76] platform/x86: Fix initialization order for firmware_attributes_class Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 10/76] staging: vchiq_arm: Make vchiq_shutdown never fail Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 11/76] xfrm: interface: fix use-after-free after changing collect_md xfrm interface Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 12/76] net/mlx5: Fix memory leak in cmd_exec() Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 13/76] net/mlx5: E-Switch, Fix peer miss rules to use peer eswitch Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 14/76] i40e: Add rx_missed_errors for buffer exhaustion Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 15/76] i40e: report VF tx_dropped with tx_errors instead of tx_discards Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 16/76] i40e: When removing VF MAC filters, only check PF-set MAC Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 17/76] net: appletalk: Fix use-after-free in AARP proxy probe Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 18/76] net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 19/76] can: dev: can_restart(): reverse logic to remove need for goto Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 20/76] can: dev: can_restart(): move debug message and stats after successful restart Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 21/76] can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 22/76] drm/bridge: ti-sn65dsi86: Remove extra semicolon in ti_sn_bridge_probe() Greg Kroah-Hartman
2025-07-30  9:35 ` Greg Kroah-Hartman [this message]
2025-07-30  9:35 ` [PATCH 6.6 24/76] net: hns3: fix concurrent setting vlan filter issue Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 25/76] net: hns3: disable interrupt when ptp init failed Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 26/76] net: hns3: fixed vf get max channels bug Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 27/76] net: hns3: default enable tx bounce buffer when smmu enabled Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 28/76] platform/x86: ideapad-laptop: Fix kbd backlight not remembered among boots Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 29/76] i2c: qup: jump out of the loop in case of timeout Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 30/76] i2c: tegra: Fix reset error handling with ACPI Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 31/76] i2c: virtio: Avoid hang by using interruptible completion wait Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 32/76] bus: fsl-mc: Fix potential double device reference in fsl_mc_get_endpoint() Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 33/76] sprintf.h requires stdarg.h Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 34/76] ALSA: hda/realtek - Add mute LED support for HP Pavilion 15-eg0xxx Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 35/76] arm64/entry: Mask DAIF in cpu_switch_to(), call_on_irq_stack() Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 36/76] dpaa2-eth: Fix device reference count leak in MAC endpoint handling Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 37/76] dpaa2-switch: " Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 38/76] e1000e: disregard NVM checksum on tgp when valid checksum bit is not set Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 39/76] e1000e: ignore uninitialized checksum word on tgp Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 40/76] gve: Fix stuck TX queue for DQ queue format Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 41/76] ice: Fix a null pointer dereference in ice_copy_and_init_pkg() Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 42/76] kasan: use vmalloc_dump_obj() for vmalloc error reports Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 43/76] nilfs2: reject invalid file types when reading inodes Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 44/76] resource: fix false warning in __request_region() Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 45/76] selftests: mptcp: connect: also cover alt modes Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 46/76] selftests: mptcp: connect: also cover checksum Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 47/76] mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 48/76] drm/amdkfd: Dont call mmput from MMU notifier callback Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 49/76] usb: typec: tcpm: allow to use sink in accessory mode Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 50/76] usb: typec: tcpm: allow switching to mode accessory to mux properly Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 51/76] usb: typec: tcpm: apply vbus before data bringup in tcpm_src_attach Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 52/76] x86/bugs: Fix use of possibly uninit value in amd_check_tsa_microcode() Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 53/76] jfs: reject on-disk inodes of an unsupported type Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 54/76] comedi: comedi_test: Fix possible deletion of uninitialized timers Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 55/76] ALSA: hda/tegra: Add Tegra264 support Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 56/76] ALSA: hda: Add missing NVIDIA HDA codec IDs Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 57/76] drm/i915/dp: Fix 2.7 Gbps DP_LINK_BW value on g4x Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 58/76] mm: khugepaged: fix call hpage_collapse_scan_file() for anonymous vma Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 59/76] erofs: address D-cache aliasing Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 60/76] crypto: powerpc/poly1305 - add depends on BROKEN for now Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 61/76] crypto: qat - add shutdown handler to qat_dh895xcc Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 62/76] iio: hid-sensor-prox: Fix incorrect OFFSET calculation Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 63/76] iio: hid-sensor-prox: Restore lost scale assignments Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 64/76] ksmbd: fix use-after-free in __smb2_lease_break_noti() Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 65/76] mtd: rawnand: qcom: Fix last codeword read in qcom_param_page_type_exec() Greg Kroah-Hartman
2025-07-30  9:35 ` [PATCH 6.6 66/76] perf/x86/intel: Fix crash in icl_update_topdown_event() Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 67/76] wifi: mt76: mt7921: prevent decap offload config before STA initialization Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 68/76] ksmbd: add free_transport ops in ksmbd connection Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 69/76] arm64/cpufeatures/kvm: Add ARMv8.9 FEAT_ECBHB bits in ID_AA64MMFR1 register Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 70/76] mptcp: make fallback action and fallback decision atomic Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 71/76] mptcp: plug races between subflow fail and subflow creation Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 72/76] mptcp: reset fallback status gracefully at disconnect() time Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 73/76] ARM: 9448/1: Use an absolute path to unified.h in KBUILD_AFLAGS Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 74/76] drm/sched: Remove optimization that causes hang when killing dependent jobs Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 75/76] spi: cadence-quadspi: fix cleanup of rx_chan on failure paths Greg Kroah-Hartman
2025-07-30  9:36 ` [PATCH 6.6 76/76] Revert "selftests/bpf: Add a cgroup prog bpf_get_ns_current_pid_tgid() test" Greg Kroah-Hartman
2025-07-30 17:19 ` [PATCH 6.6 00/76] 6.6.101-rc1 review Brett A C Sheffield
2025-07-30 17:31 ` Peter Schneider
2025-07-30 17:38 ` Mark Brown
2025-07-30 20:10 ` Jon Hunter
2025-07-30 21:00 ` Shuah Khan
2025-07-30 22:12 ` Shuah Khan
2025-07-31  7:09 ` Harshit Mogalapalli
2025-07-31  8:54 ` Ron Economos
2025-07-31 10:38 ` Naresh Kamboju
2025-07-31 18:48 ` Miguel Ojeda
2025-08-01  1:28 ` Hardik Garg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250730093227.742807414@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=Aliaksei.Makarau@ibm.com \
    --cc=horms@kernel.org \
    --cc=mjambigi@linux.ibm.com \
    --cc=pabeni@redhat.com \
    --cc=pasic@linux.ibm.com \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=wintera@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox