public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: Ricky WU <ricky_wu@realtek.com>
Cc: "scott@spiteful.org" <scott@spiteful.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"kbusch@kernel.org" <kbusch@kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Mika Westerberg <mika.westerberg@linux.intel.com>
Subject: Re: [bug report] nvme not re-recognize when changed M.2 SSD in S3 mode
Date: Mon, 27 May 2024 13:54:42 +0200	[thread overview]
Message-ID: <ZlR0grWLqO9nch8Q@wunner.de> (raw)
In-Reply-To: <8c3d00850e7449c184e4c45a3c9b9dfa@realtek.com>

On Thu, Apr 25, 2024 at 06:02:36AM +0000, Ricky WU wrote:
> I tested this patch and the result looks good, but if the two SSD
> has same VID, PID and capacity it still has problem.
> So I think add S/N to compare is necessary if it can do
> 
> And I also tested SD7.0 card the result is same with M.2 SSD

Below please find a reworked patch which checks the Device Serial
Number in addition to various other device identity information
in config space.

I've moved the check for a replaced device to the ->resume_noirq
phase, i.e. before the device driver's first access to the device.
I'm also marking the device removed to prevent the driver from
accessing it.

Does this fix the issue for you?

If it does reliably detect a replaced device, could you also
double-check the opposite case, i.e. if the device is *not*
replaced during system sleep?

I think this is about as much as we can do at the PCI layer to
detect a replaced device.  Drivers may have additional ways
to identify a device (such as reading a WWID from some register)
and we could consider providing a library function which drivers
could call if they detect a replaced device on resume.

If you set CONFIG_DYNAMIC_DEBUG=y and add the following to the
command line...

pciehp.pciehp_debug=1 dyndbg="file pciehp* +p" log_buf_len=10M ignore_loglevel

...you should see "device replaced during system sleep" messages
on resume if a replaced device is detected.

-- >8 --

From: Lukas Wunner <lukas@wunner.de>
Subject: [PATCH] PCI: pciehp: Detect device replacement during system sleep

Ricky reports that replacing a device in a hotplug slot during ACPI
sleep state S3 does not cause re-enumeration on resume, as one would
expect.  Instead, the new device is treated as if it was the old one.

There is no bulletproof way to detect device replacement, but as a
heuristic, check whether the device identity in config space matches
cached data in struct pci_dev (Vendor ID, Device ID, Class Code,
Revision ID, Subsystem Vendor ID, Subsystem ID).  Additionally, cache
and compare the Device Serial Number (PCIe r6.2 sec 7.9.3).  If a
mismatch is detected, mark the old device disconnected (to prevent its
driver from accessing the new device) and synthesize a Presence Detect
Changed event.

Reported-by: Ricky Wu <ricky_wu@realtek.com>
Closes: https://lore.kernel.org/r/a608b5930d0a48f092f717c0e137454b@realtek.com
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/pci/hotplug/pciehp.h      |  4 ++++
 drivers/pci/hotplug/pciehp_core.c | 42 ++++++++++++++++++++++++++++++++++++++-
 drivers/pci/hotplug/pciehp_hpc.c  |  5 +++++
 drivers/pci/hotplug/pciehp_pci.c  |  4 ++++
 4 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index e0a614a..273dd8c 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -46,6 +46,9 @@
 /**
  * struct controller - PCIe hotplug controller
  * @pcie: pointer to the controller's PCIe port service device
+ * @dsn: cached copy of Device Serial Number of Function 0 in the hotplug slot
+ *	(PCIe r6.2 sec 7.9.3); used to determine whether a hotplugged device
+ *	was replaced with a different one during system sleep
  * @slot_cap: cached copy of the Slot Capabilities register
  * @inband_presence_disabled: In-Band Presence Detect Disable supported by
  *	controller and disabled per spec recommendation (PCIe r5.0, appendix I
@@ -87,6 +90,7 @@
  */
 struct controller {
 	struct pcie_device *pcie;
+	u64 dsn;
 
 	u32 slot_cap;				/* capabilities and quirks */
 	unsigned int inband_presence_disabled:1;
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index ddd55ad..ff458e6 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -284,6 +284,32 @@ static int pciehp_suspend(struct pcie_device *dev)
 	return 0;
 }
 
+static bool pciehp_device_replaced(struct controller *ctrl)
+{
+	struct pci_dev *pdev __free(pci_dev_put);
+	u32 reg;
+
+	pdev = pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0, 0));
+	if (!pdev)
+		return true;
+
+	if (pci_read_config_dword(pdev, PCI_VENDOR_ID, &reg) ||
+	    reg != (pdev->vendor | (pdev->device << 16)) ||
+	    pci_read_config_dword(pdev, PCI_CLASS_REVISION, &reg) ||
+	    reg != (pdev->revision | (pdev->class << 8)))
+		return true;
+
+	if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
+	    (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, &reg) ||
+	     reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16))))
+		return true;
+
+	if (pci_get_dsn(pdev) != ctrl->dsn)
+		return true;
+
+	return false;
+}
+
 static int pciehp_resume_noirq(struct pcie_device *dev)
 {
 	struct controller *ctrl = get_service_data(dev);
@@ -293,9 +319,23 @@ static int pciehp_resume_noirq(struct pcie_device *dev)
 	ctrl->cmd_busy = true;
 
 	/* clear spurious events from rediscovery of inserted card */
-	if (ctrl->state == ON_STATE || ctrl->state == BLINKINGOFF_STATE)
+	if (ctrl->state == ON_STATE || ctrl->state == BLINKINGOFF_STATE) {
 		pcie_clear_hotplug_events(ctrl);
 
+		/*
+		 * If hotplugged device was replaced with a different one
+		 * during system sleep, mark the old device disconnected
+		 * (to prevent its driver from accessing the new device)
+		 * and synthesize a Presence Detect Changed event.
+		 */
+		if (pciehp_device_replaced(ctrl)) {
+			ctrl_dbg(ctrl, "device replaced during system sleep\n");
+			pci_walk_bus(ctrl->pcie->port->subordinate,
+				     pci_dev_set_disconnected, NULL);
+			pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC);
+		}
+	}
+
 	return 0;
 }
 #endif
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index b1d0a1b3..061f01f 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -1055,6 +1055,11 @@ struct controller *pcie_init(struct pcie_device *dev)
 		}
 	}
 
+	pdev = pci_get_slot(subordinate, PCI_DEVFN(0, 0));
+	if (pdev)
+		ctrl->dsn = pci_get_dsn(pdev);
+	pci_dev_put(pdev);
+
 	return ctrl;
 }
 
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index ad12515..65e50be 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -72,6 +72,10 @@ int pciehp_configure_device(struct controller *ctrl)
 	pci_bus_add_devices(parent);
 	down_read_nested(&ctrl->reset_lock, ctrl->depth);
 
+	dev = pci_get_slot(parent, PCI_DEVFN(0, 0));
+	ctrl->dsn = pci_get_dsn(dev);
+	pci_dev_put(dev);
+
  out:
 	pci_unlock_rescan_remove();
 	return ret;
-- 
2.43.0


  reply	other threads:[~2024-05-27 11:54 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-09  5:44 [bug report] nvme not re-recognize when changed M.2 SSD in S3 mode Ricky WU
2024-04-09 13:16 ` Bjorn Helgaas
2024-04-10 10:07 ` Lukas Wunner
2024-04-10 14:58   ` Lukas Wunner
2024-04-25  6:02     ` Ricky WU
2024-05-27 11:54       ` Lukas Wunner [this message]
2024-05-29  7:47         ` Ricky WU
2024-05-29 14:43           ` Lukas Wunner
2024-05-15  7:31     ` Ricky WU
2024-05-15  8:46       ` Lukas Wunner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZlR0grWLqO9nch8Q@wunner.de \
    --to=lukas@wunner.de \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mika.westerberg@linux.intel.com \
    --cc=ricky_wu@realtek.com \
    --cc=scott@spiteful.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox