From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8753423D2A3 for ; Thu, 14 Aug 2025 19:29:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.8 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755199776; cv=none; b=M2jEXIDLSwKBaNiHuCVMh7rwYk6IPsJ30uqc7+QVuYY1Hn2DstaNfF/tGCxp59NeDDfYrwHVhFqctuRhBW/J4jMzr5hLS+LjbCdeKXKBlawx93TakB/37iPhv4eVUBjy39DS+TDNQuoM8/qJaUdclLyzATJpU/fS3xv6JLO8M+c= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755199776; c=relaxed/simple; bh=C722KyueZgnml44yiZSKGL6Emhwa8sk1grG/A7G7Xiw=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=tVWTOmMQ5rypBMlIrC+HiI52rvfgJY8EMb8f8Jz6p0X5qrGJLTbQLz2AQCLDQMSUdHu36L8ZG7kcdFEk38EyJMVrTK9M7k+UMEHPoIJ1pQZWdaRqpIMbQ4q/XCmOUidz6aXjkw7RdZD9dTQ00gzR1hbdPp7M0wtYC2TBXXIIqnc= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=PmfvYTwU; arc=none smtp.client-ip=192.198.163.8 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="PmfvYTwU" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1755199775; x=1786735775; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=C722KyueZgnml44yiZSKGL6Emhwa8sk1grG/A7G7Xiw=; b=PmfvYTwU2spLH4w7pG55EjAUAy8CVMKrMyy6pC5AKUl5R4rlCo33WK5r cGJAKI1UPMdOVRMmu+pC7/DEH5Blf77jtHWtIcYYoLtWO3W6goUW42so5 dZsjrKCf/AGqn9vUr/2ltPk4Y1JyYKD/EXEYFg/ZcbvfyXvRlvbTiswh0 /TT6O6T1aBFIjfY6gP2yeYEN9uAYnovIylCval13l4VpSefplcabVF3Fx qPW1Ce0c3M+bRwbYqHCtvwgF4Ig3daSV93j15fsTj112UOZ3WUbSuJGqS 83ect/NdyvHkZOAP/kH9Dylf00eInUxkTy3liFBsTkiM6uYCj2hXuATk1 w==; X-CSE-ConnectionGUID: ec+pJcC1R7WN7hNMJOQqDQ== X-CSE-MsgGUID: qA8J6T0TSGSxNC7dR2H+cg== X-IronPort-AV: E=McAfee;i="6800,10657,11522"; a="75096423" X-IronPort-AV: E=Sophos;i="6.17,290,1747724400"; d="scan'208";a="75096423" Received: from orviesa010.jf.intel.com ([10.64.159.150]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Aug 2025 12:29:34 -0700 X-CSE-ConnectionGUID: vx0X23xmR8+w93glgb0cww== X-CSE-MsgGUID: C81WM5VmTFmlpBHRfeABtg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,290,1747724400"; d="scan'208";a="166055654" Received: from linux.intel.com ([10.54.29.200]) by orviesa010.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Aug 2025 12:29:34 -0700 Received: from [10.124.223.17] (unknown [10.124.223.17]) by linux.intel.com (Postfix) with ESMTP id E9B1420B571C; Thu, 14 Aug 2025 12:29:32 -0700 (PDT) Message-ID: Date: Thu, 14 Aug 2025 12:29:25 -0700 Precedence: bulk X-Mailing-List: linux-pci@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH 1/5] PCI/AER: Allow drivers to opt in to Bus Reset on Non-Fatal Errors To: Lukas Wunner , Niklas Schnelle Cc: Bjorn Helgaas , Riana Tauro , Aravind Iddamsetty , "Sean C. Dardis" , Terry Bowman , Linas Vepstas , Mahesh J Salgaonkar , Oliver OHalloran , Manivannan Sadhasivam , linuxppc-dev@lists.ozlabs.org, linux-pci@vger.kernel.org References: <28fd805043bb57af390168d05abb30898cf4fc58.1755008151.git.lukas@wunner.de> <7c545fff40629b612267501c0c74bc40c3df29e2.camel@linux.ibm.com> Content-Language: en-US From: Sathyanarayanan Kuppuswamy In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 8/14/25 2:36 AM, Lukas Wunner wrote: > On Thu, Aug 14, 2025 at 09:56:09AM +0200, Niklas Schnelle wrote: >> On Wed, 2025-08-13 at 07:11 +0200, Lukas Wunner wrote: >>> +++ b/drivers/pci/pcie/err.c >>> @@ -217,15 +217,10 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, >>> pci_walk_bridge(bridge, pci_pm_runtime_get_sync, NULL); >>> >>> pci_dbg(bridge, "broadcast error_detected message\n"); >>> - if (state == pci_channel_io_frozen) { >>> + if (state == pci_channel_io_frozen) >>> pci_walk_bridge(bridge, report_frozen_detected, &status); >>> - if (reset_subordinates(bridge) != PCI_ERS_RESULT_RECOVERED) { >>> - pci_warn(bridge, "subordinate device reset failed\n"); >>> - goto failed; >>> - } >>> - } else { >>> + else >>> pci_walk_bridge(bridge, report_normal_detected, &status); >>> - } >>> >>> if (status == PCI_ERS_RESULT_CAN_RECOVER) { >>> status = PCI_ERS_RESULT_RECOVERED; >> On s390 PCI errors leave the device with MMIO blocked until either the >> error state is cleared or we reset via the firmware interface. With >> this change and the pci_channel_io_frozen case AER would now do the >> report_mmio_enabled() before the reset with nothing happening between >> report_frozen_detected() and report_mmio_enabled() is MMIO enabled at >> this point? > Yes, MMIO access is controlled through the Memory Space Enable and > Bus Master Enable bits in the Command Register (PCIe r7.0 sec 7.5.1.1.3). > Drivers generally set those bits on probe and they're not automatically > cleared by hardware upon an Uncorrectable Error. > > EEH and s390 blocking MMIO access likely serves the purpose of preventing > corrupted data being propagated upstream. AER doesn't support that > (or at least doesn't mandate that -- certain implementations might > be capable of blocking poisoned data). > > Thus with AER, MMIO access is usually possible already in ->error_detected(). > The only reason why drivers shouldn't be doing that and instead defer > MMIO access to ->mmio_enabled() is to be compatible with EEH and s390. > > There are exceptions to this rule: E.g. if the Uncorrectable Error > was "Surprise Down", the link to the device is down and MMIO access > isn't possible, neither in ->error_detected() nor ->mmio_enabled(). > Drivers should check whether an MMIO read results in an "all ones" > response (PCI_POSSIBLE_ERROR()), which is usually what the host bridge > fabricates upon a Completion Timeout caused by the link being down > and the Endpoint being inaccessible. > > (There's a list of all the errors with default severity etc in PCIe r7.0 > sec 6.2.7.) > > I believe DPC was added to the PCIe Base Specification to prevent > propagating corrupted data upstream, similarly to EEH and s390. > DPC brings down the link immediately upon an Uncorrectable Error > (the Linux driver confines this to Fatal Errors), but as a side > effect this results in a Hot Reset being propagated down the > hierarchy, making it impossible to access the device in the > faulting state to retrieve debug information etc. After the link > has been brought up again, the device is in post-reset state. > So DPC doesn't allow for reset-less recovery. > > With the ordering change introduced by this commit, ->mmio_enabled() > will no longer be able to access MMIO space in the DPC case because > the link hasn't been brought back up until ->slot_reset(). But as > explained in the commit message, I only found two drivers affected > by this. One seems to be EEH-specific (drivers/scsi/ipr.c). > And the other one (drivers/scsi/sym53c8xx_2/sym_glue.c) dumps debug > registers in ->mmio_enabled() and I'm arguing that with this commit > it's dumping "all ones", but right now it's dumping the post-reset > state of the device, which isn't any more useful. > >> I think this callback really only makes sense if you have >> an equivalent to s390's clearing of the error state that enables MMIO >> but doesn't otherwise reset. Similarly EEH has eeh_pci_enable(pe, >> EEH_OPT_THAW_MMIO). > Right. > >>> @@ -233,6 +228,14 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, >>> pci_walk_bridge(bridge, report_mmio_enabled, &status); >>> } >>> >>> + if (status == PCI_ERS_RESULT_NEED_RESET || >>> + state == pci_channel_io_frozen) { >>> + if (reset_subordinates(bridge) != PCI_ERS_RESULT_RECOVERED) { >>> + pci_warn(bridge, "subordinate device reset failed\n"); >>> + goto failed; >>> + } >>> + } >>> + >>> if (status == PCI_ERS_RESULT_NEED_RESET) { >>> /* >>> * TODO: Should call platform-specific >> I wonder if it might make sense to merge the reset into the above >> existing if. > There are drivers such as drivers/bus/mhi/host/pci_generic.c which > return PCI_ERS_RESULT_RECOVERED from ->error_detected(). So they > fall through directly to the ->resume() stage. They're doing this > even in the pci_channel_io_frozen case (i.e. for Fatal Errors). > > But for DPC we must call reset_subordinates() to bring the link back up. > And for Fatal Errors, Documentation/PCI/pcieaer-howto.rst suggests that > we must likewise call it because the link may be unreliable. For fatal errors, since we already ignore the value returned by ->error_detected() (by calling reset_subordinates() unconditionally), why not update status accordingly in report_frozen_detected() and notify the driver about the reset? That way, the reset logic could be unified under a single if (status == PCI_ERS_RESULT_NEED_RESET) condition. Checking the drivers/bus/mhi/host/pci_generic.c implementation, it looks like calling slot_reset callback looks harmless. > > Hence the if-condition must use a boolean OR, i.e.: > > if (status == PCI_ERS_RESULT_NEED_RESET || > state == pci_channel_io_frozen) { > > ... whereas if I would move the invocation of reset_subordinates() inside > the existing "if (status == PCI_ERS_RESULT_NEED_RESET)", it would be > equivalent to a boolean AND. > > I would have to amend drivers such as drivers/bus/mhi/host/pci_generic.c > to work with the changed logic and I figured that it's simpler to only > change pcie_do_recovery() rather than touching all affected drivers. > I don't have any of that hardware and so it seems risky touching all > those drivers. > >> I'm a bit confused by that TODO comment and >> anyway, it asks for a platform-specific reset to be added bu >> reset_subordinate() already seems to allow platform specific behavior, >> so maybe the moved reset should replace the TODO? > Manivannan has a patch pending which removes the TODO: > > https://lore.kernel.org/all/20250715-pci-port-reset-v6-1-6f9cce94e7bb@oss.qualcomm.com/ > >> Also I think for the >> kind of broken case where the state is pci_channel_io_frozen but the >> drivers just reports PCI_ERS_RESULT_CAN_RECOVER it looks like there >> would be a reset but no calls to ->slot_reset(). > Right, but that's what AER is currently doing and drivers such as > drivers/bus/mhi/host/pci_generic.c are written to work with the > existing flow. > > Thanks, > > Lukas > -- Sathyanarayanan Kuppuswamy Linux Kernel Developer