From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from rtits2.realtek.com.tw (rtits2.realtek.com [211.75.126.72]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C67D4169ACE for ; Wed, 29 May 2024 07:48:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=211.75.126.72 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716968888; cv=none; b=c6/es2rM3jsJjVyfXVNBDR18DVKgcK1SK208B76uB0nxfOvpAdKA5SeZJVC9nbxtzgEjxJJK2ElQIqOyjnqqiiWaiUziAjaRq37OlmQs/f+dCbirDlXpdJJyaBVksd857+F/GCKuG0r0ZPZ82+rNofSHj278rjz6W71NUoJc4Es= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716968888; c=relaxed/simple; bh=2tGAgjxZrLAxYVFU97kplp8b3wqFhZU1hIVNNF+AZ6k=; h=From:To:CC:Subject:Date:Message-ID:References:In-Reply-To: Content-Type:MIME-Version; b=KBBqRxm46szLg0CKHDXyE4uIYBxxboXP7e5ISVVTHaQS8HHK52MkWUmfxxdiC9BjXwPP8Guutgy+B64fq2R72aA7CI4XPNAwIGffqvAlbkPmLXPGS3vf/SHVPxguJ0cBY6weIaP4KkheH7/BnRIS1yShkN/e17ImQtcBzOQu4TE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=realtek.com; spf=pass smtp.mailfrom=realtek.com; arc=none smtp.client-ip=211.75.126.72 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=realtek.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=realtek.com X-SpamFilter-By: ArmorX SpamTrap 5.78 with qID 44T7lD5t93351131, This message is accepted by code: ctloc85258 Received: from mail.realtek.com (rtexh36505.realtek.com.tw[172.21.6.25]) by rtits2.realtek.com.tw (8.15.2/2.95/5.92) with ESMTPS id 44T7lD5t93351131 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 May 2024 15:47:13 +0800 Received: from RTEXMBS02.realtek.com.tw (172.21.6.95) by RTEXH36505.realtek.com.tw (172.21.6.25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Wed, 29 May 2024 15:47:14 +0800 Received: from RTEXMBS01.realtek.com.tw (172.21.6.94) by RTEXMBS02.realtek.com.tw (172.21.6.95) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.35; Wed, 29 May 2024 15:47:13 +0800 Received: from RTEXMBS01.realtek.com.tw ([fe80::d5b2:56ba:6b47:9975]) by RTEXMBS01.realtek.com.tw ([fe80::d5b2:56ba:6b47:9975%5]) with mapi id 15.01.2507.035; Wed, 29 May 2024 15:47:13 +0800 From: Ricky WU To: Lukas Wunner CC: "scott@spiteful.org" , "linux-pci@vger.kernel.org" , "kbusch@kernel.org" , "linux-nvme@lists.infradead.org" , "Mika Westerberg" Subject: RE: [bug report] nvme not re-recognize when changed M.2 SSD in S3 mode Thread-Topic: [bug report] nvme not re-recognize when changed M.2 SSD in S3 mode Thread-Index: AdqKP+LQtoSsBN9ZRz2XBQWf3XU8AwAq/q8AAAon9QAC8Ap+oAZFPNwAAGl8ipA= Date: Wed, 29 May 2024 07:47:13 +0000 Message-ID: <94918684e6a84122a6373ef45b3c117c@realtek.com> References: <8c3d00850e7449c184e4c45a3c9b9dfa@realtek.com> In-Reply-To: Accept-Language: zh-TW, en-US Content-Language: zh-TW x-kse-serverinfo: RTEXMBS02.realtek.com.tw, 9 x-kse-antispam-interceptor-info: fallback x-kse-antivirus-interceptor-info: fallback Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Precedence: bulk X-Mailing-List: linux-pci@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-KSE-AntiSpam-Interceptor-Info: fallback Hi Lukas, We use SDEX card replace M.2 nvme card because we don't have enough M.2 nvm= e card=20 We tried this patch as below situation:=20 1.keep the SDEX card enter S3 then resume ->PASS=20 the video can continue playing, not see the msg "device replace during syst= em sleep"=20 2. on S3 mode insert the SDEX card then resume -> PASS Can identify the card and can see the msg "device replace during system sle= ep" 3.on S3 mode remove the SDEX card then resume -> PASS No card show on system and can see the msg "device replace during system sl= eep" 4.replace card on S3 mode (different brand) ->PASS Can identify the second card and can see the msg "device replace during sys= tem sleep" 5.replace card on S3 mode (same brand and same capacity) ->FAIL Can not see the msg "device replace during system sleep" Against 5 we found a issue, most card not declare capability "PCI_EXT_CAP_I= D_DSN", even if there is the capability the config space value are 0, cause pci_get= _dsn() return 0 normally. However these cards can show the SN on CrystalDiskInfo.=20 below is the card pci config space log and lsblk disk info. Sandisk Corp Device 5007 (prog-if 02 [NVM Express]) this card can see the "= PCI_EXT_CAP_ID_DSN" capability But value is 0 ---------------------------------------------------------------------------= ----------------------------------------- cr@cr-desktop:~$ sudo lspci -s 02:00.0 -vvv 02:00.0 Non-Volatile memory controller: Sandisk Corp Device 5007 (prog-if 0= 2 [NVM Express]) Subsystem: Sandisk Corp Device 5007 Physical Slot: 8 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Steppi= ng- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=3Dfast >TAbort- SERR- Below please find a reworked patch which checks the Device Serial Number = in > addition to various other device identity information in config space. >=20 > I've moved the check for a replaced device to the ->resume_noirq phase, i= .e. > before the device driver's first access to the device. > I'm also marking the device removed to prevent the driver from accessing = it. >=20 > Does this fix the issue for you? >=20 > If it does reliably detect a replaced device, could you also double-check= the > opposite case, i.e. if the device is *not* replaced during system sleep? >=20 > I think this is about as much as we can do at the PCI layer to detect a r= eplaced > device. Drivers may have additional ways to identify a device (such as > reading a WWID from some register) and we could consider providing a libr= ary > function which drivers could call if they detect a replaced device on res= ume. >=20 > If you set CONFIG_DYNAMIC_DEBUG=3Dy and add the following to the command > line... >=20 > pciehp.pciehp_debug=3D1 dyndbg=3D"file pciehp* +p" log_buf_len=3D10M > ignore_loglevel >=20 > ...you should see "device replaced during system sleep" messages on resum= e if > a replaced device is detected. >=20 > -- >8 -- >=20 > From: Lukas Wunner > Subject: [PATCH] PCI: pciehp: Detect device replacement during system sle= ep >=20 > Ricky reports that replacing a device in a hotplug slot during ACPI sleep= state > S3 does not cause re-enumeration on resume, as one would expect. Instead= , > the new device is treated as if it was the old one. >=20 > There is no bulletproof way to detect device replacement, but as a heuris= tic, > check whether the device identity in config space matches cached data in > struct pci_dev (Vendor ID, Device ID, Class Code, Revision ID, Subsystem > Vendor ID, Subsystem ID). Additionally, cache and compare the Device Ser= ial > Number (PCIe r6.2 sec 7.9.3). If a mismatch is detected, mark the old de= vice > disconnected (to prevent its driver from accessing the new device) and > synthesize a Presence Detect Changed event. >=20 > Reported-by: Ricky Wu > Closes: > https://lore.kernel.org/r/a608b5930d0a48f092f717c0e137454b@realtek.com > Signed-off-by: Lukas Wunner > --- > drivers/pci/hotplug/pciehp.h | 4 ++++ > drivers/pci/hotplug/pciehp_core.c | 42 > ++++++++++++++++++++++++++++++++++++++- > drivers/pci/hotplug/pciehp_hpc.c | 5 +++++ > drivers/pci/hotplug/pciehp_pci.c | 4 ++++ > 4 files changed, 54 insertions(+), 1 deletion(-) >=20 > diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h = index > e0a614a..273dd8c 100644 > --- a/drivers/pci/hotplug/pciehp.h > +++ b/drivers/pci/hotplug/pciehp.h > @@ -46,6 +46,9 @@ > /** > * struct controller - PCIe hotplug controller > * @pcie: pointer to the controller's PCIe port service device > + * @dsn: cached copy of Device Serial Number of Function 0 in the hotplu= g > slot > + * (PCIe r6.2 sec 7.9.3); used to determine whether a hotplugged dev= ice > + * was replaced with a different one during system sleep > * @slot_cap: cached copy of the Slot Capabilities register > * @inband_presence_disabled: In-Band Presence Detect Disable supported > by > * controller and disabled per spec recommendation (PCIe r5.0, > appendix I > @@ -87,6 +90,7 @@ > */ > struct controller { > struct pcie_device *pcie; > + u64 dsn; >=20 > u32 slot_cap; /* capabilities and > quirks */ > unsigned int inband_presence_disabled:1; diff --git > a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > index ddd55ad..ff458e6 100644 > --- a/drivers/pci/hotplug/pciehp_core.c > +++ b/drivers/pci/hotplug/pciehp_core.c > @@ -284,6 +284,32 @@ static int pciehp_suspend(struct pcie_device *dev) > return 0; > } >=20 > +static bool pciehp_device_replaced(struct controller *ctrl) { > + struct pci_dev *pdev __free(pci_dev_put); > + u32 reg; > + > + pdev =3D pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0,= 0)); > + if (!pdev) > + return true; > + > + if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || > + reg !=3D (pdev->vendor | (pdev->device << 16)) || > + pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || > + reg !=3D (pdev->revision | (pdev->class << 8))) > + return true; > + > + if (pdev->hdr_type =3D=3D PCI_HEADER_TYPE_NORMAL && > + (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, > ®) || > + reg !=3D (pdev->subsystem_vendor | (pdev->subsystem_device > << 16)))) > + return true; > + > + if (pci_get_dsn(pdev) !=3D ctrl->dsn) > + return true; > + > + return false; > +} > + > static int pciehp_resume_noirq(struct pcie_device *dev) { > struct controller *ctrl =3D get_service_data(dev); @@ -293,9 +319= ,23 > @@ static int pciehp_resume_noirq(struct pcie_device *dev) > ctrl->cmd_busy =3D true; >=20 > /* clear spurious events from rediscovery of inserted card */ > - if (ctrl->state =3D=3D ON_STATE || ctrl->state =3D=3D BLINKINGOFF= _STATE) > + if (ctrl->state =3D=3D ON_STATE || ctrl->state =3D=3D BLINKINGOFF= _STATE) > + { > pcie_clear_hotplug_events(ctrl); >=20 > + /* > + * If hotplugged device was replaced with a different one > + * during system sleep, mark the old device disconnected > + * (to prevent its driver from accessing the new device) > + * and synthesize a Presence Detect Changed event. > + */ > + if (pciehp_device_replaced(ctrl)) { > + ctrl_dbg(ctrl, "device replaced during system > sleep\n"); > + pci_walk_bus(ctrl->pcie->port->subordinate, > + pci_dev_set_disconnected, > NULL); > + pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC); > + } > + } > + > return 0; > } > #endif > diff --git a/drivers/pci/hotplug/pciehp_hpc.c > b/drivers/pci/hotplug/pciehp_hpc.c > index b1d0a1b3..061f01f 100644 > --- a/drivers/pci/hotplug/pciehp_hpc.c > +++ b/drivers/pci/hotplug/pciehp_hpc.c > @@ -1055,6 +1055,11 @@ struct controller *pcie_init(struct pcie_device > *dev) > } > } >=20 > + pdev =3D pci_get_slot(subordinate, PCI_DEVFN(0, 0)); > + if (pdev) > + ctrl->dsn =3D pci_get_dsn(pdev); > + pci_dev_put(pdev); > + > return ctrl; > } >=20 > diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pcieh= p_pci.c > index ad12515..65e50be 100644 > --- a/drivers/pci/hotplug/pciehp_pci.c > +++ b/drivers/pci/hotplug/pciehp_pci.c > @@ -72,6 +72,10 @@ int pciehp_configure_device(struct controller *ctrl) > pci_bus_add_devices(parent); > down_read_nested(&ctrl->reset_lock, ctrl->depth); >=20 > + dev =3D pci_get_slot(parent, PCI_DEVFN(0, 0)); > + ctrl->dsn =3D pci_get_dsn(dev); > + pci_dev_put(dev); > + > out: > pci_unlock_rescan_remove(); > return ret; > -- > 2.43.0