stable.vger.kernel.org archive mirror
From: Niklas Schnelle <schnelle@linux.ibm.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, stable@vger.kernel.org
Cc: patches@lists.linux.dev, Julian Ruess <julianr@linux.ibm.com>,
	Gerd Bayer <gbayer@linux.ibm.com>,
	Farhan Ali <alifm@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Sasha Levin <sashal@kernel.org>
Subject: Re: [PATCH 6.1 60/81] s390/pci: Fix stale function handles in error handling
Date: Thu, 10 Jul 2025 10:14:17 +0200
Message-ID: <6bca64221f8954adcdcfe6b5639e29c7fee4b03a.camel@linux.ibm.com>
In-Reply-To: <20250708162226.893789793@linuxfoundation.org>

On Tue, 2025-07-08 at 18:23 +0200, Greg Kroah-Hartman wrote:
> 6.1-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Niklas Schnelle <schnelle@linux.ibm.com>
> 
> [ Upstream commit 45537926dd2aaa9190ac0fac5a0fbeefcadfea95 ]
> 
> The error event information for PCI error events contains a function
> handle for the respective function. This handle is generally captured at
> the time the error event was recorded. Due to delays in processing or
> cascading issues, it may happen that during firmware recovery multiple
> events are generated. When processing these events in order Linux may
> already have recovered an affected function making the event information
> stale. Fix this by doing an unconditional CLP List PCI function
> retrieving the current function handle with the zdev->state_lock held
> and ignoring the event if its function handle is stale.
> 
> Cc: stable@vger.kernel.org
> Fixes: 4cdf2f4e24ff ("s390/pci: implement minimal PCI error recovery")
> Reviewed-by: Julian Ruess <julianr@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  arch/s390/pci/pci_event.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
> index d969f36bf186f..dc512c8f82324 100644
> --- a/arch/s390/pci/pci_event.c
> +++ b/arch/s390/pci/pci_event.c
> @@ -257,6 +257,8 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
>  	struct zpci_dev *zdev = get_zdev_by_fid(ccdf->fid);
>  	struct pci_dev *pdev = NULL;
>  	pci_ers_result_t ers_res;
> +	u32 fh = 0;
> +	int rc;
>  
>  	zpci_dbg(3, "err fid:%x, fh:%x, pec:%x\n",
>  		 ccdf->fid, ccdf->fh, ccdf->pec);
> @@ -264,9 +266,23 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
>  	zpci_err_hex(ccdf, sizeof(*ccdf));
>  
>  	if (zdev) {
> +		mutex_lock(&zdev->state_lock);
> +		rc = clp_refresh_fh(zdev->fid, &fh);
> +		if (rc) {
> +			mutex_unlock(&zdev->state_lock);
> +			goto no_pdev;
> +		}
> +		if (!fh || ccdf->fh != fh) {
> +			/* Ignore events with stale handles */
> +			zpci_dbg(3, "err fid:%x, fh:%x (stale %x)\n",
> +				 ccdf->fid, fh, ccdf->fh);
> +			mutex_unlock(&zdev->state_lock);
> +			goto no_pdev;
> +		}
>  		zpci_update_fh(zdev, ccdf->fh);
>  		if (zdev->zbus->bus)
>  			pdev = pci_get_slot(zdev->zbus->bus, zdev->devfn);
> +		mutex_unlock(&zdev->state_lock);
>  	}
>  
>  	pr_err("%s: Event 0x%x reports an error for PCI function 0x%x\n",

Sorry, I only noticed this because of a build error report, but this
backport is NOT CORRECT. The mutex_lock(&zdev->state_lock) line that was
context in the original commit was added by commit bcb5d6c76903
("s390/pci: introduce lock to synchronize state of zpci_dev's"), which
also introduced the mutex and isn't in this tree. So without pulling
that in as a prerequisite, this won't compile.

Also, and kind of worse, the above puts the mutex_unlock() in the wrong
place! Please drop/revert this patch.

The original commit should work for its specific problem even without
the backport of the mutex, though I think it would be best to get that
into stable as well. Sorry for not marking it as a dependency. That
said, shouldn't there be a note that this backport deviates
significantly from the upstream commit?

Thanks,
Niklas
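
For readers following along, the stale-handle check discussed above can
be modelled in user space roughly as follows. This is only a sketch
under stated assumptions, not the kernel code: struct fake_zdev,
struct fake_err_event, fake_refresh_fh() and fake_handle_error_event()
are hypothetical stand-ins for zpci_dev, the event data, clp_refresh_fh()
and __zpci_event_error(), a pthread mutex stands in for
zdev->state_lock, and "recovery" is modelled as simply handing out a new
function handle. The sketch holds the lock until the event is fully
processed and releases it on a single exit path, which is one way to
avoid a misplaced unlock; whether that matches the upstream function
line for line is an assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct zpci_dev. */
struct fake_zdev {
	pthread_mutex_t state_lock;	/* models zdev->state_lock */
	uint32_t fid;			/* function ID */
	uint32_t current_fh;		/* handle a fresh query would report */
	int recoveries;			/* how often recovery actually ran */
};

/* Hypothetical stand-in for the error event data (ccdf). */
struct fake_err_event {
	uint32_t fid;
	uint32_t fh;			/* handle captured when the event was recorded */
};

/* Models clp_refresh_fh(): returns 0 and reports the current handle. */
static int fake_refresh_fh(const struct fake_zdev *zdev, uint32_t *fh)
{
	*fh = zdev->current_fh;
	return 0;
}

/*
 * Returns true if the event was acted on, false if it was ignored.
 * The lock is taken once and dropped on a single exit path, so the
 * handle comparison and the recovery happen under the same lock.
 */
static bool fake_handle_error_event(struct fake_zdev *zdev,
				    const struct fake_err_event *ev)
{
	uint32_t fh = 0;
	bool handled = false;

	assert(zdev != NULL && ev != NULL);

	pthread_mutex_lock(&zdev->state_lock);
	if (fake_refresh_fh(zdev, &fh))
		goto out_unlock;
	if (!fh || ev->fh != fh)
		goto out_unlock;	/* ignore events with stale handles */

	/* Assumption of this model: recovery yields a new handle. */
	zdev->recoveries++;
	zdev->current_fh = fh + 1;
	handled = true;
out_unlock:
	pthread_mutex_unlock(&zdev->state_lock);
	return handled;
}
```

In this model, delivering the same event twice runs recovery only once:
the second delivery carries a handle that no longer matches the
refreshed one and is dropped.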

