[PATCH v2 0/4] cxl/mem: Fix shutdown order

Linux CXL
 help / color / mirror / Atom feed

* [PATCH v2 0/4] cxl/mem: Fix shutdown order
@ 2023-09-29 23:09 Dan Williams
  2023-09-29 23:09 ` [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work Dan Williams
                   ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Dan Williams @ 2023-09-29 23:09 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, Dave Jiang, Ira Weiny, Davidlohr Bueso,
	ira.weiny

While fixing a crash where the cxlmd->cxlds validity needed to be
maintained over the span of the memdev being unregistered, I made a note
to come back and cleanup sanitize notifier setup/shutdown
implementation. Jonathan went further and noticed that the fix needs
that rework first. So the changes since v1 [1] are reworking the
sanitize notifier setup to be more idiomatic with how other driver
mechanisms are managed.

The special wrinkle of the sanitize notifier is that it interacts with
interrupts, which are enabled early in the flow, and it interacts with
memdev sysfs which is not initialized until late in the flow.

After some other cleanups, a self contained
cxl_sanitize_setup_notifier() is introduced to centralize that
incremental setup work, and leave cxl_memdev_shutdown() alone to
coordinate closing down the ioctl path relative to the unregister event
of the memdev.

[1]: http://lore.kernel.org/r/169596542307.790108.11339208844199665348.stgit@dwillia2-xfh.jf.intel.com

---

Dan Williams (4):
      cxl/pci: Remove unnecessary device reference management in sanitize work
      cxl/pci: Cleanup 'sanitize' to always poll
      cxl/pci: Fix sanitize notifier setup
      cxl/mem: Fix shutdown order

 drivers/cxl/core/memdev.c |   45 ------------------
 drivers/cxl/cxlmem.h      |    2 -
 drivers/cxl/pci.c         |  112 +++++++++++++++++++++++++++++----------------
 3 files changed, 73 insertions(+), 86 deletions(-)

base-commit: 6465e260f48790807eef06b583b38ca9789b6072

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work
  2023-09-29 23:09 [PATCH v2 0/4] cxl/mem: Fix shutdown order Dan Williams
@ 2023-09-29 23:09 ` Dan Williams
  2023-09-29 23:41   ` Ira Weiny
                     ` (3 more replies)
  2023-09-29 23:09 ` [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll Dan Williams
                   ` (2 subsequent siblings)
  3 siblings, 4 replies; 34+ messages in thread
From: Dan Williams @ 2023-09-29 23:09 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny

Given that any particular put_device() could be the final put of the
device, the fact that there are usages of cxlds->dev after
put_device(cxlds->dev) is a red flag. Drop the reference counting since
the device is pinned by being registered and will not be unregistered
without triggering the driver + workqueue to shutdown.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/pci.c |    5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 44a21ab7add5..aa1b3dd9e64c 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -152,8 +152,6 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
 	mutex_lock(&mds->mbox_mutex);
 	if (cxl_mbox_background_complete(cxlds)) {
 		mds->security.poll_tmo_secs = 0;
-		put_device(cxlds->dev);
-
 		if (mds->security.sanitize_node)
 			sysfs_notify_dirent(mds->security.sanitize_node);
 
@@ -296,9 +294,6 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
 		 */
 		if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
 			if (mds->security.poll) {
-				/* hold the device throughout */
-				get_device(cxlds->dev);
-
 				/* give first timeout a second */
 				timeout = 1;
 				mds->security.poll_tmo_secs = timeout;


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
  2023-09-29 23:09 [PATCH v2 0/4] cxl/mem: Fix shutdown order Dan Williams
  2023-09-29 23:09 ` [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work Dan Williams
@ 2023-09-29 23:09 ` Dan Williams
  2023-09-29 23:49   ` Ira Weiny
                     ` (4 more replies)
  2023-09-29 23:09 ` [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup Dan Williams
  2023-09-29 23:09 ` [PATCH v2 4/4] cxl/mem: Fix shutdown order Dan Williams
  3 siblings, 5 replies; 34+ messages in thread
From: Dan Williams @ 2023-09-29 23:09 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny

In preparation for fixing the init/teardown of the 'sanitize' workqueue
and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
to be the single location where the sysfs attribute is notified. With
that change there is no distinction between polled mode and interrupt
mode. All the interrupt does is accelerate the polling interval.

The change to check for "mds->security.sanitize_node" under the lock is
there to ensure that the interrupt, the work routine and the
setup/teardown code can all have a consistent view of the registered
notifier and the workqueue state. I.e. the expectation is that the
interrupt is live past the point that the sanitize sysfs attribute is
published, and it may race teardown, so it must be consulted under a
lock.

Lastly, some opportunistic replacements of
"queue_delayed_work(system_wq, ...)", which is just open coded
schedule_delayed_work(), are included.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c |    3 +-
 drivers/cxl/cxlmem.h      |    2 --
 drivers/cxl/pci.c         |   60 +++++++++++++++++++--------------------------
 3 files changed, 26 insertions(+), 39 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 14b547c07f54..2a7a07f6d165 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -561,8 +561,7 @@ static void cxl_memdev_security_shutdown(struct device *dev)
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 
-	if (mds->security.poll)
-		cancel_delayed_work_sync(&mds->security.poll_dwork);
+	cancel_delayed_work_sync(&mds->security.poll_dwork);
 }
 
 static void cxl_memdev_shutdown(struct device *dev)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 706f8a6d1ef4..55f00ad17a77 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -360,7 +360,6 @@ struct cxl_fw_state {
  *
  * @state: state of last security operation
  * @enabled_cmds: All security commands enabled in the CEL
- * @poll: polling for sanitization is enabled, device has no mbox irq support
  * @poll_tmo_secs: polling timeout
  * @poll_dwork: polling work item
  * @sanitize_node: sanitation sysfs file to notify
@@ -368,7 +367,6 @@ struct cxl_fw_state {
 struct cxl_security_state {
 	unsigned long state;
 	DECLARE_BITMAP(enabled_cmds, CXL_SEC_ENABLED_MAX);
-	bool poll;
 	int poll_tmo_secs;
 	struct delayed_work poll_dwork;
 	struct kernfs_node *sanitize_node;
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index aa1b3dd9e64c..ac4e434b0806 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -128,10 +128,10 @@ static irqreturn_t cxl_pci_mbox_irq(int irq, void *id)
 	reg = readq(cxlds->regs.mbox + CXLDEV_MBOX_BG_CMD_STATUS_OFFSET);
 	opcode = FIELD_GET(CXLDEV_MBOX_BG_CMD_COMMAND_OPCODE_MASK, reg);
 	if (opcode == CXL_MBOX_OP_SANITIZE) {
+		mutex_lock(&mds->mbox_mutex);
 		if (mds->security.sanitize_node)
-			sysfs_notify_dirent(mds->security.sanitize_node);
-
-		dev_dbg(cxlds->dev, "Sanitization operation ended\n");
+			mod_delayed_work(system_wq, &mds->security.poll_dwork, 0);
+		mutex_unlock(&mds->mbox_mutex);
 	} else {
 		/* short-circuit the wait in __cxl_pci_mbox_send_cmd() */
 		rcuwait_wake_up(&mds->mbox_wait);
@@ -160,8 +160,7 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
 		int timeout = mds->security.poll_tmo_secs + 10;
 
 		mds->security.poll_tmo_secs = min(15 * 60, timeout);
-		queue_delayed_work(system_wq, &mds->security.poll_dwork,
-				   timeout * HZ);
+		schedule_delayed_work(&mds->security.poll_dwork, timeout * HZ);
 	}
 	mutex_unlock(&mds->mbox_mutex);
 }
@@ -293,15 +292,11 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
 		 * and allow userspace to poll(2) for completion.
 		 */
 		if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
-			if (mds->security.poll) {
-				/* give first timeout a second */
-				timeout = 1;
-				mds->security.poll_tmo_secs = timeout;
-				queue_delayed_work(system_wq,
-						   &mds->security.poll_dwork,
-						   timeout * HZ);
-			}
-
+			/* give first timeout a second */
+			timeout = 1;
+			mds->security.poll_tmo_secs = timeout;
+			schedule_delayed_work(&mds->security.poll_dwork,
+					      timeout * HZ);
 			dev_dbg(dev, "Sanitization operation started\n");
 			goto success;
 		}
@@ -384,7 +379,9 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
 	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
 	struct device *dev = cxlds->dev;
 	unsigned long timeout;
+	int irq, msgnum;
 	u64 md_status;
+	u32 ctrl;
 
 	timeout = jiffies + mbox_ready_timeout * HZ;
 	do {
@@ -432,33 +429,26 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
 	dev_dbg(dev, "Mailbox payload sized %zu", mds->payload_size);
 
 	rcuwait_init(&mds->mbox_wait);
+	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
 
-	if (cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) {
-		u32 ctrl;
-		int irq, msgnum;
-		struct pci_dev *pdev = to_pci_dev(cxlds->dev);
-
-		msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
-		irq = pci_irq_vector(pdev, msgnum);
-		if (irq < 0)
-			goto mbox_poll;
-
-		if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
-			goto mbox_poll;
+	/* background command interrupts are optional */
+	if ((cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) == 0)
+		return 0;
 
-		/* enable background command mbox irq support */
-		ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
-		ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
-		writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
+	msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
+	irq = pci_irq_vector(to_pci_dev(cxlds->dev), msgnum);
+	if (irq < 0)
+		return 0;
 
+	if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
 		return 0;
-	}
 
-mbox_poll:
-	mds->security.poll = true;
-	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
+	dev_dbg(cxlds->dev, "Mailbox interrupts enabled\n");
+	/* enable background command mbox irq support */
+	ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
+	ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
+	writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
 
-	dev_dbg(cxlds->dev, "Mailbox interrupts are unsupported");
 	return 0;
 }
 


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-09-29 23:09 [PATCH v2 0/4] cxl/mem: Fix shutdown order Dan Williams
  2023-09-29 23:09 ` [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work Dan Williams
  2023-09-29 23:09 ` [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll Dan Williams
@ 2023-09-29 23:09 ` Dan Williams
  2023-09-30  2:42   ` Ira Weiny
                     ` (3 more replies)
  2023-09-29 23:09 ` [PATCH v2 4/4] cxl/mem: Fix shutdown order Dan Williams
  3 siblings, 4 replies; 34+ messages in thread
From: Dan Williams @ 2023-09-29 23:09 UTC (permalink / raw)
  To: linux-cxl; +Cc: Jonathan Cameron, Dave Jiang, Davidlohr Bueso, ira.weiny

Fix a race condition between the mailbox-background command interrupt
firing and the security-state sysfs attribute being removed.

The race is difficult to see due to the awkward placement of the
sanitize-notifier setup code and the multiple places the teardown calls
are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown().

Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for
the paired cxl_sanitize_teardown_notifier() to safely quiet the notifier
and let the cxl_memdev + irq be unregistered later in the flow.

This fix is also needed as a preparation fix for a memdev unregistration
crash.

Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c |   42 ----------------------------------------
 drivers/cxl/pci.c         |   47 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 42 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 2a7a07f6d165..a950091e5640 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -556,20 +556,11 @@ void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 }
 EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, CXL);
 
-static void cxl_memdev_security_shutdown(struct device *dev)
-{
-	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
-	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
-
-	cancel_delayed_work_sync(&mds->security.poll_dwork);
-}
-
 static void cxl_memdev_shutdown(struct device *dev)
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 
 	down_write(&cxl_memdev_rwsem);
-	cxl_memdev_security_shutdown(dev);
 	cxlmd->cxlds = NULL;
 	up_write(&cxl_memdev_rwsem);
 }
@@ -1001,35 +992,6 @@ static const struct file_operations cxl_memdev_fops = {
 	.llseek = noop_llseek,
 };
 
-static void put_sanitize(void *data)
-{
-	struct cxl_memdev_state *mds = data;
-
-	sysfs_put(mds->security.sanitize_node);
-}
-
-static int cxl_memdev_security_init(struct cxl_memdev *cxlmd)
-{
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
-	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
-	struct device *dev = &cxlmd->dev;
-	struct kernfs_node *sec;
-
-	sec = sysfs_get_dirent(dev->kobj.sd, "security");
-	if (!sec) {
-		dev_err(dev, "sysfs_get_dirent 'security' failed\n");
-		return -ENODEV;
-	}
-	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
-	sysfs_put(sec);
-	if (!mds->security.sanitize_node) {
-		dev_err(dev, "sysfs_get_dirent 'state' failed\n");
-		return -ENODEV;
-	}
-
-	return devm_add_action_or_reset(cxlds->dev, put_sanitize, mds);
- }
-
 struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
 {
 	struct cxl_memdev *cxlmd;
@@ -1058,10 +1020,6 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
 	if (rc)
 		goto err;
 
-	rc = cxl_memdev_security_init(cxlmd);
-	if (rc)
-		goto err;
-
 	rc = devm_add_action_or_reset(cxlds->dev, cxl_memdev_unregister, cxlmd);
 	if (rc)
 		return ERR_PTR(rc);
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index ac4e434b0806..b0023e479315 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -165,6 +165,49 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
 	mutex_unlock(&mds->mbox_mutex);
 }
 
+static void cxl_sanitize_teardown_notifier(void *data)
+{
+	struct cxl_memdev_state *mds = data;
+	struct kernfs_node *state;
+
+	/*
+	 * Prevent new irq triggered invocations of the workqueue and
+	 * flush inflight invocations.
+	 */
+	mutex_lock(&mds->mbox_mutex);
+	state = mds->security.sanitize_node;
+	mds->security.sanitize_node = NULL;
+	mutex_unlock(&mds->mbox_mutex);
+
+	cancel_delayed_work_sync(&mds->security.poll_dwork);
+	sysfs_put(state);
+}
+
+static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
+{
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
+	struct device *dev = cxlds->dev;
+	struct kernfs_node *sec;
+
+	if (!test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
+		return 0;
+
+	sec = sysfs_get_dirent(dev->kobj.sd, "security");
+	if (!sec) {
+		dev_err(dev, "sanitize notification setup failure\n");
+		return -ENOENT;
+	}
+	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
+	sysfs_put(sec);
+	if (!mds->security.sanitize_node) {
+		dev_err(dev, "sanitize notification setup failure\n");
+		return -ENOENT;
+	}
+
+	return devm_add_action_or_reset(dev, cxl_sanitize_teardown_notifier, mds);
+}
+
 /**
  * __cxl_pci_mbox_send_cmd() - Execute a mailbox command
  * @mds: The memory device driver data
@@ -875,6 +918,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
+	rc = cxl_sanitize_setup_notifier(cxlmd);
+	if (rc)
+		return rc;
+
 	pmu_count = cxl_count_regblock(pdev, CXL_REGLOC_RBI_PMU);
 	for (i = 0; i < pmu_count; i++) {
 		struct cxl_pmu_regs pmu_regs;


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 4/4] cxl/mem: Fix shutdown order
  2023-09-29 23:09 [PATCH v2 0/4] cxl/mem: Fix shutdown order Dan Williams
                   ` (2 preceding siblings ...)
  2023-09-29 23:09 ` [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup Dan Williams
@ 2023-09-29 23:09 ` Dan Williams
  2023-09-29 23:52   ` Ira Weiny
                     ` (2 more replies)
  3 siblings, 3 replies; 34+ messages in thread
From: Dan Williams @ 2023-09-29 23:09 UTC (permalink / raw)
  To: linux-cxl; +Cc: Ira Weiny, ira.weiny

Ira reports that removing cxl_mock_mem causes a crash with the following
trace:

 BUG: kernel NULL pointer dereference, address: 0000000000000044
 [..]
 RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core]
 [..]
 Call Trace:
  <TASK>
  cxl_region_detach+0xe8/0x210 [cxl_core]
  cxl_decoder_kill_region+0x27/0x40 [cxl_core]
  cxld_unregister+0x29/0x40 [cxl_core]
  devres_release_all+0xb8/0x110
  device_unbind_cleanup+0xe/0x70
  device_release_driver_internal+0x1d2/0x210
  bus_remove_device+0xd7/0x150
  device_del+0x155/0x3e0
  device_unregister+0x13/0x60
  devm_release_action+0x4d/0x90
  ? __pfx_unregister_port+0x10/0x10 [cxl_core]
  delete_endpoint+0x121/0x130 [cxl_core]
  devres_release_all+0xb8/0x110
  device_unbind_cleanup+0xe/0x70
  device_release_driver_internal+0x1d2/0x210
  bus_remove_device+0xd7/0x150
  device_del+0x155/0x3e0
  ? lock_release+0x142/0x290
  cdev_device_del+0x15/0x50
  cxl_memdev_unregister+0x54/0x70 [cxl_core]

This crash is due to the clearing out the cxl_memdev's driver context
(@cxlds) before the subsystem is done with it. This is ultimately due to
the region(s), that this memdev is a member, being torn down and expecting
to be able to de-reference @cxlds, like here:

static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
...
                if (cxlds->rcd)
                        goto endpoint_reset;
...

Fix it by keeping the driver context valid until memdev-device
unregistration, and subsequently the entire stack of related
dependencies, unwinds.

Fixes: 9cc238c7a526 ("cxl/pci: Introduce cdevm_file_operations")
Reported-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index a950091e5640..e78b5ead14fa 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -570,8 +570,8 @@ static void cxl_memdev_unregister(void *_cxlmd)
 	struct cxl_memdev *cxlmd = _cxlmd;
 	struct device *dev = &cxlmd->dev;

-	cxl_memdev_shutdown(dev);
 	cdev_device_del(&cxlmd->cdev, dev);
+	cxl_memdev_shutdown(dev);
 	put_device(dev);
 }

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work
  2023-09-29 23:09 ` [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work Dan Williams
@ 2023-09-29 23:41   ` Ira Weiny
  2023-10-02  9:55   ` Jonathan Cameron
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 34+ messages in thread
From: Ira Weiny @ 2023-09-29 23:41 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny

Dan Williams wrote:
> Given that any particular put_device() could be the final put of the
> device, the fact that there are usages of cxlds->dev after
> put_device(cxlds->dev) is a red flag. Drop the reference counting since
> the device is pinned by being registered and will not be unregistered
> without triggering the driver + workqueue to shutdown.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
  2023-09-29 23:09 ` [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll Dan Williams
@ 2023-09-29 23:49   ` Ira Weiny
  2023-09-29 23:51   ` Ira Weiny
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 34+ messages in thread
From: Ira Weiny @ 2023-09-29 23:49 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny

Dan Williams wrote:
> In preparation for fixing the init/teardown of the 'sanitize' workqueue
> and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
> to be the single location where the sysfs attribute is notified. With
> that change there is no distinction between polled mode and interrupt
> mode. All the interrupt does is accelerate the polling interval.
> 
> The change to check for "mds->security.sanitize_node" under the lock is
> there to ensure that the interrupt, the work routine and the
> setup/teardown code can all have a consistent view of the registered
> notifier and the workqueue state. I.e. the expectation is that the
> interrupt is live past the point that the sanitize sysfs attribute is
> published, and it may race teardown, so it must be consulted under a
> lock.
> 
> Lastly, some opportunistic replacements of
> "queue_delayed_work(system_wq, ...)", which is just open coded
> schedule_delayed_work(), are included.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c |    3 +-
>  drivers/cxl/cxlmem.h      |    2 --
>  drivers/cxl/pci.c         |   60 +++++++++++++++++++--------------------------
>  3 files changed, 26 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 14b547c07f54..2a7a07f6d165 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -561,8 +561,7 @@ static void cxl_memdev_security_shutdown(struct device *dev)
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>  
> -	if (mds->security.poll)
> -		cancel_delayed_work_sync(&mds->security.poll_dwork);
> +	cancel_delayed_work_sync(&mds->security.poll_dwork);
>  }
>  
>  static void cxl_memdev_shutdown(struct device *dev)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 706f8a6d1ef4..55f00ad17a77 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -360,7 +360,6 @@ struct cxl_fw_state {
>   *
>   * @state: state of last security operation
>   * @enabled_cmds: All security commands enabled in the CEL
> - * @poll: polling for sanitization is enabled, device has no mbox irq support
>   * @poll_tmo_secs: polling timeout
>   * @poll_dwork: polling work item
>   * @sanitize_node: sanitation sysfs file to notify
> @@ -368,7 +367,6 @@ struct cxl_fw_state {
>  struct cxl_security_state {
>  	unsigned long state;
>  	DECLARE_BITMAP(enabled_cmds, CXL_SEC_ENABLED_MAX);
> -	bool poll;
>  	int poll_tmo_secs;
>  	struct delayed_work poll_dwork;
>  	struct kernfs_node *sanitize_node;
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index aa1b3dd9e64c..ac4e434b0806 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -128,10 +128,10 @@ static irqreturn_t cxl_pci_mbox_irq(int irq, void *id)
>  	reg = readq(cxlds->regs.mbox + CXLDEV_MBOX_BG_CMD_STATUS_OFFSET);
>  	opcode = FIELD_GET(CXLDEV_MBOX_BG_CMD_COMMAND_OPCODE_MASK, reg);
>  	if (opcode == CXL_MBOX_OP_SANITIZE) {
> +		mutex_lock(&mds->mbox_mutex);
>  		if (mds->security.sanitize_node)
> -			sysfs_notify_dirent(mds->security.sanitize_node);
> -
> -		dev_dbg(cxlds->dev, "Sanitization operation ended\n");
> +			mod_delayed_work(system_wq, &mds->security.poll_dwork, 0);
> +		mutex_unlock(&mds->mbox_mutex);
>  	} else {
>  		/* short-circuit the wait in __cxl_pci_mbox_send_cmd() */
>  		rcuwait_wake_up(&mds->mbox_wait);
> @@ -160,8 +160,7 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
>  		int timeout = mds->security.poll_tmo_secs + 10;
>  
>  		mds->security.poll_tmo_secs = min(15 * 60, timeout);
> -		queue_delayed_work(system_wq, &mds->security.poll_dwork,
> -				   timeout * HZ);
> +		schedule_delayed_work(&mds->security.poll_dwork, timeout * HZ);
>  	}
>  	mutex_unlock(&mds->mbox_mutex);
>  }
> @@ -293,15 +292,11 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
>  		 * and allow userspace to poll(2) for completion.
>  		 */
>  		if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
> -			if (mds->security.poll) {
> -				/* give first timeout a second */
> -				timeout = 1;
> -				mds->security.poll_tmo_secs = timeout;
> -				queue_delayed_work(system_wq,
> -						   &mds->security.poll_dwork,
> -						   timeout * HZ);
> -			}
> -
> +			/* give first timeout a second */
> +			timeout = 1;
> +			mds->security.poll_tmo_secs = timeout;
> +			schedule_delayed_work(&mds->security.poll_dwork,
> +					      timeout * HZ);
>  			dev_dbg(dev, "Sanitization operation started\n");
>  			goto success;
>  		}
> @@ -384,7 +379,9 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
>  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
>  	struct device *dev = cxlds->dev;
>  	unsigned long timeout;
> +	int irq, msgnum;
>  	u64 md_status;
> +	u32 ctrl;
>  
>  	timeout = jiffies + mbox_ready_timeout * HZ;
>  	do {
> @@ -432,33 +429,26 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
>  	dev_dbg(dev, "Mailbox payload sized %zu", mds->payload_size);
>  
>  	rcuwait_init(&mds->mbox_wait);
> +	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);

NIT: The change below was ugly to review...


>  
> -	if (cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) {
> -		u32 ctrl;
> -		int irq, msgnum;
> -		struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> -
> -		msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> -		irq = pci_irq_vector(pdev, msgnum);
> -		if (irq < 0)
> -			goto mbox_poll;
> -
> -		if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
> -			goto mbox_poll;
> +	/* background command interrupts are optional */
> +	if ((cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) == 0)
> +		return 0;
>  
> -		/* enable background command mbox irq support */
> -		ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> -		ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
> -		writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> +	msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> +	irq = pci_irq_vector(to_pci_dev(cxlds->dev), msgnum);
> +	if (irq < 0)
> +		return 0;
>  
> +	if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
>  		return 0;
> -	}
>  
> -mbox_poll:
> -	mds->security.poll = true;
> -	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
> +	dev_dbg(cxlds->dev, "Mailbox interrupts enabled\n");
> +	/* enable background command mbox irq support */
> +	ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> +	ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
> +	writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
>  
> -	dev_dbg(cxlds->dev, "Mailbox interrupts are unsupported");
>  	return 0;
>  }
>  
> 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
  2023-09-29 23:09 ` [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll Dan Williams
  2023-09-29 23:49   ` Ira Weiny
@ 2023-09-29 23:51   ` Ira Weiny
  2023-10-02 10:02   ` Jonathan Cameron
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 34+ messages in thread
From: Ira Weiny @ 2023-09-29 23:51 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny

Dan Williams wrote:
> In preparation for fixing the init/teardown of the 'sanitize' workqueue
> and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
> to be the single location where the sysfs attribute is notified. With
> that change there is no distinction between polled mode and interrupt
> mode. All the interrupt does is accelerate the polling interval.
> 
> The change to check for "mds->security.sanitize_node" under the lock is
> there to ensure that the interrupt, the work routine and the
> setup/teardown code can all have a consistent view of the registered
> notifier and the workqueue state. I.e. the expectation is that the
> interrupt is live past the point that the sanitize sysfs attribute is
> published, and it may race teardown, so it must be consulted under a
> lock.
> 
> Lastly, some opportunistic replacements of
> "queue_delayed_work(system_wq, ...)", which is just open coded
> schedule_delayed_work(), are included.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c |    3 +-

[snip]

> @@ -384,7 +379,9 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
>  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
>  	struct device *dev = cxlds->dev;
>  	unsigned long timeout;
> +	int irq, msgnum;
>  	u64 md_status;
> +	u32 ctrl;
>  
>  	timeout = jiffies + mbox_ready_timeout * HZ;
>  	do {
> @@ -432,33 +429,26 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
>  	dev_dbg(dev, "Mailbox payload sized %zu", mds->payload_size);
>  
>  	rcuwait_init(&mds->mbox_wait);
> +	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);

NIT: The change below was ugly to review...  But I think it looks ok.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

>  
> -	if (cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) {
> -		u32 ctrl;
> -		int irq, msgnum;
> -		struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> -
> -		msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> -		irq = pci_irq_vector(pdev, msgnum);
> -		if (irq < 0)
> -			goto mbox_poll;
> -
> -		if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
> -			goto mbox_poll;
> +	/* background command interrupts are optional */
> +	if ((cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) == 0)
> +		return 0;
>  
> -		/* enable background command mbox irq support */
> -		ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> -		ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
> -		writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> +	msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> +	irq = pci_irq_vector(to_pci_dev(cxlds->dev), msgnum);
> +	if (irq < 0)
> +		return 0;
>  
> +	if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
>  		return 0;
> -	}
>  
> -mbox_poll:
> -	mds->security.poll = true;
> -	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
> +	dev_dbg(cxlds->dev, "Mailbox interrupts enabled\n");
> +	/* enable background command mbox irq support */
> +	ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> +	ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
> +	writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
>  
> -	dev_dbg(cxlds->dev, "Mailbox interrupts are unsupported");
>  	return 0;
>  }
>  
> 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] cxl/mem: Fix shutdown order
  2023-09-29 23:09 ` [PATCH v2 4/4] cxl/mem: Fix shutdown order Dan Williams
@ 2023-09-29 23:52   ` Ira Weiny
  2023-10-02 10:11     ` Jonathan Cameron
  2023-10-02 16:59   ` Dave Jiang
  2023-10-03 17:40   ` Davidlohr Bueso
  2 siblings, 1 reply; 34+ messages in thread
From: Ira Weiny @ 2023-09-29 23:52 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: Ira Weiny

Dan Williams wrote:
> Ira reports that removing cxl_mock_mem causes a crash with the following
> trace:
> 
>  BUG: kernel NULL pointer dereference, address: 0000000000000044
>  [..]
>  RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core]
>  [..]
>  Call Trace:
>   <TASK>
>   cxl_region_detach+0xe8/0x210 [cxl_core]
>   cxl_decoder_kill_region+0x27/0x40 [cxl_core]
>   cxld_unregister+0x29/0x40 [cxl_core]
>   devres_release_all+0xb8/0x110
>   device_unbind_cleanup+0xe/0x70
>   device_release_driver_internal+0x1d2/0x210
>   bus_remove_device+0xd7/0x150
>   device_del+0x155/0x3e0
>   device_unregister+0x13/0x60
>   devm_release_action+0x4d/0x90
>   ? __pfx_unregister_port+0x10/0x10 [cxl_core]
>   delete_endpoint+0x121/0x130 [cxl_core]
>   devres_release_all+0xb8/0x110
>   device_unbind_cleanup+0xe/0x70
>   device_release_driver_internal+0x1d2/0x210
>   bus_remove_device+0xd7/0x150
>   device_del+0x155/0x3e0
>   ? lock_release+0x142/0x290
>   cdev_device_del+0x15/0x50
>   cxl_memdev_unregister+0x54/0x70 [cxl_core]
> 
> This crash is due to the clearing out the cxl_memdev's driver context
> (@cxlds) before the subsystem is done with it. This is ultimately due to
> the region(s), that this memdev is a member, being torn down and expecting
> to be able to de-reference @cxlds, like here:
> 
> static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
> ...
>                 if (cxlds->rcd)
>                         goto endpoint_reset;
> ...
> 
> Fix it by keeping the driver context valid until memdev-device
> unregistration, and subsequently the entire stack of related
> dependencies, unwinds.
> 
> Fixes: 9cc238c7a526 ("cxl/pci: Introduce cdevm_file_operations")
> Reported-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-09-29 23:09 ` [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup Dan Williams
@ 2023-09-30  2:42   ` Ira Weiny
  2023-10-02 10:10   ` Jonathan Cameron
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 34+ messages in thread
From: Ira Weiny @ 2023-09-30  2:42 UTC (permalink / raw)
  To: Dan Williams, linux-cxl
  Cc: Jonathan Cameron, Dave Jiang, Davidlohr Bueso, ira.weiny

Dan Williams wrote:
> Fix a race condition between the mailbox-background command interrupt
> firing and the security-state sysfs attribute being removed.
> 
> The race is difficult to see due to the awkward placement of the
> sanitize-notifier setup code and the multiple places the teardown calls
> are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown().
> 
> Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for
> the paired cxl_sanitize_teardown_notifier() to safely quiet the notifier
> and let the cxl_memdev + irq be unregistered later in the flow.
> 
> This fix is also needed as a preparation fix for a memdev unregistration
> crash.
> 
> Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work
  2023-09-29 23:09 ` [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work Dan Williams
  2023-09-29 23:41   ` Ira Weiny
@ 2023-10-02  9:55   ` Jonathan Cameron
  2023-10-02 15:27   ` Davidlohr Bueso
  2023-10-02 16:48   ` Dave Jiang
  3 siblings, 0 replies; 34+ messages in thread
From: Jonathan Cameron @ 2023-10-02  9:55 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny

On Fri, 29 Sep 2023 16:09:33 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Given that any particular put_device() could be the final put of the
> device, the fact that there are usages of cxlds->dev after
> put_device(cxlds->dev) is a red flag. Drop the reference counting since
> the device is pinned by being registered and will not be unregistered
> without triggering the driver + workqueue to shutdown.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Makes sense

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/pci.c |    5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 44a21ab7add5..aa1b3dd9e64c 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -152,8 +152,6 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
>  	mutex_lock(&mds->mbox_mutex);
>  	if (cxl_mbox_background_complete(cxlds)) {
>  		mds->security.poll_tmo_secs = 0;
> -		put_device(cxlds->dev);
> -
>  		if (mds->security.sanitize_node)
>  			sysfs_notify_dirent(mds->security.sanitize_node);
>  
> @@ -296,9 +294,6 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
>  		 */
>  		if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
>  			if (mds->security.poll) {
> -				/* hold the device throughout */
> -				get_device(cxlds->dev);
> -
>  				/* give first timeout a second */
>  				timeout = 1;
>  				mds->security.poll_tmo_secs = timeout;
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
  2023-09-29 23:09 ` [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll Dan Williams
  2023-09-29 23:49   ` Ira Weiny
  2023-09-29 23:51   ` Ira Weiny
@ 2023-10-02 10:02   ` Jonathan Cameron
  2023-10-04  0:55     ` Dan Williams
  2023-10-02 15:49   ` Davidlohr Bueso
  2023-10-02 16:57   ` Dave Jiang
  4 siblings, 1 reply; 34+ messages in thread
From: Jonathan Cameron @ 2023-10-02 10:02 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny

On Fri, 29 Sep 2023 16:09:39 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for fixing the init/teardown of the 'sanitize' workqueue
> and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
> to be the single location where the sysfs attribute is notified. With
> that change there is no distinction between polled mode and interrupt
> mode. All the interrupt does is accelerate the polling interval.
> 
> The change to check for "mds->security.sanitize_node" under the lock is
> there to ensure that the interrupt, the work routine and the
> setup/teardown code can all have a consistent view of the registered
> notifier and the workqueue state. I.e. the expectation is that the
> interrupt is live past the point that the sanitize sysfs attribute is
> published, and it may race teardown, so it must be consulted under a
> lock.
> 
> Lastly, some opportunistic replacements of
> "queue_delayed_work(system_wq, ...)", which is just open coded
> schedule_delayed_work(), are included.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
One trivial comment inline, otherwise looks fine to me

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core/memdev.c |    3 +-
>  drivers/cxl/cxlmem.h      |    2 --
>  drivers/cxl/pci.c         |   60 +++++++++++++++++++--------------------------
>  3 files changed, 26 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 14b547c07f54..2a7a07f6d165 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -561,8 +561,7 @@ static void cxl_memdev_security_shutdown(struct device *dev)
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>  
> -	if (mds->security.poll)
> -		cancel_delayed_work_sync(&mds->security.poll_dwork);
> +	cancel_delayed_work_sync(&mds->security.poll_dwork);
>  }
>  
>  static void cxl_memdev_shutdown(struct device *dev)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 706f8a6d1ef4..55f00ad17a77 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -360,7 +360,6 @@ struct cxl_fw_state {
>   *
>   * @state: state of last security operation
>   * @enabled_cmds: All security commands enabled in the CEL
> - * @poll: polling for sanitization is enabled, device has no mbox irq support
>   * @poll_tmo_secs: polling timeout
>   * @poll_dwork: polling work item
>   * @sanitize_node: sanitation sysfs file to notify
> @@ -368,7 +367,6 @@ struct cxl_fw_state {
>  struct cxl_security_state {
>  	unsigned long state;
>  	DECLARE_BITMAP(enabled_cmds, CXL_SEC_ENABLED_MAX);
> -	bool poll;
>  	int poll_tmo_secs;
>  	struct delayed_work poll_dwork;
>  	struct kernfs_node *sanitize_node;
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index aa1b3dd9e64c..ac4e434b0806 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -128,10 +128,10 @@ static irqreturn_t cxl_pci_mbox_irq(int irq, void *id)
>  	reg = readq(cxlds->regs.mbox + CXLDEV_MBOX_BG_CMD_STATUS_OFFSET);
>  	opcode = FIELD_GET(CXLDEV_MBOX_BG_CMD_COMMAND_OPCODE_MASK, reg);
>  	if (opcode == CXL_MBOX_OP_SANITIZE) {
> +		mutex_lock(&mds->mbox_mutex);
>  		if (mds->security.sanitize_node)
> -			sysfs_notify_dirent(mds->security.sanitize_node);
> -
> -		dev_dbg(cxlds->dev, "Sanitization operation ended\n");
> +			mod_delayed_work(system_wq, &mds->security.poll_dwork, 0);
> +		mutex_unlock(&mds->mbox_mutex);
>  	} else {
>  		/* short-circuit the wait in __cxl_pci_mbox_send_cmd() */
>  		rcuwait_wake_up(&mds->mbox_wait);
> @@ -160,8 +160,7 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
>  		int timeout = mds->security.poll_tmo_secs + 10;
>  
>  		mds->security.poll_tmo_secs = min(15 * 60, timeout);
> -		queue_delayed_work(system_wq, &mds->security.poll_dwork,
> -				   timeout * HZ);
> +		schedule_delayed_work(&mds->security.poll_dwork, timeout * HZ);
>  	}
>  	mutex_unlock(&mds->mbox_mutex);
>  }
> @@ -293,15 +292,11 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
>  		 * and allow userspace to poll(2) for completion.
>  		 */
>  		if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
> -			if (mds->security.poll) {
> -				/* give first timeout a second */
> -				timeout = 1;
> -				mds->security.poll_tmo_secs = timeout;
> -				queue_delayed_work(system_wq,
> -						   &mds->security.poll_dwork,
> -						   timeout * HZ);
> -			}
> -
> +			/* give first timeout a second */
> +			timeout = 1;
> +			mds->security.poll_tmo_secs = timeout;
> +			schedule_delayed_work(&mds->security.poll_dwork,
> +					      timeout * HZ);
>  			dev_dbg(dev, "Sanitization operation started\n");
>  			goto success;
>  		}
> @@ -384,7 +379,9 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
>  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
>  	struct device *dev = cxlds->dev;
>  	unsigned long timeout;
> +	int irq, msgnum;
>  	u64 md_status;
> +	u32 ctrl;
>  
>  	timeout = jiffies + mbox_ready_timeout * HZ;
>  	do {
> @@ -432,33 +429,26 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
>  	dev_dbg(dev, "Mailbox payload sized %zu", mds->payload_size);
>  
>  	rcuwait_init(&mds->mbox_wait);
> +	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
>  
> -	if (cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) {
> -		u32 ctrl;
> -		int irq, msgnum;
> -		struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> -
> -		msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> -		irq = pci_irq_vector(pdev, msgnum);
> -		if (irq < 0)
> -			goto mbox_poll;
> -
> -		if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
> -			goto mbox_poll;
> +	/* background command interrupts are optional */
> +	if ((cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) == 0)

Nit, but it's a boolean flag, so to me
	if (!(cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ))

feels slightly better.

There are instances of this done with !() elsewhere in the file,
so not obviously a case of wanting to match local style.


> +		return 0;
>  
> -		/* enable background command mbox irq support */
> -		ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> -		ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
> -		writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> +	msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> +	irq = pci_irq_vector(to_pci_dev(cxlds->dev), msgnum);
> +	if (irq < 0)
> +		return 0;
>  
> +	if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
>  		return 0;
> -	}
>  
> -mbox_poll:
> -	mds->security.poll = true;
> -	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
> +	dev_dbg(cxlds->dev, "Mailbox interrupts enabled\n");
> +	/* enable background command mbox irq support */
> +	ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> +	ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
> +	writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
>  
> -	dev_dbg(cxlds->dev, "Mailbox interrupts are unsupported");
>  	return 0;
>  }
>  
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-09-29 23:09 ` [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup Dan Williams
  2023-09-30  2:42   ` Ira Weiny
@ 2023-10-02 10:10   ` Jonathan Cameron
  2023-10-04  0:59     ` Dan Williams
  2023-10-02 16:59   ` Dave Jiang
  2023-10-04  0:52   ` Davidlohr Bueso
  3 siblings, 1 reply; 34+ messages in thread
From: Jonathan Cameron @ 2023-10-02 10:10 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, Dave Jiang, Davidlohr Bueso, ira.weiny

On Fri, 29 Sep 2023 16:09:44 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Fix a race condition between the mailbox-background command interrupt
> firing and the security-state sysfs attribute being removed.
> 
> The race is difficult to see due to the awkward placement of the
> sanitize-notifier setup code and the multiple places the teardown calls
> are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown().
> 
> Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for
> the paired cxl_sanitize_teardown_notifier() to safely quiet the notifier
> and let the cxl_memdev + irq be unregistered later in the flow.
> 
> This fix is also needed as a preparation fix for a memdev unregistration
> crash.
> 
> Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

One trivial question inline about which parameter to pass in from the
many many interlocking state structures...

If you do make the suggested change, it's just complex enough I want another
look so I'm not giving a tag for now.

> ---
>  drivers/cxl/core/memdev.c |   42 ----------------------------------------
>  drivers/cxl/pci.c         |   47 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 47 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 2a7a07f6d165..a950091e5640 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -556,20 +556,11 @@ void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  }
>  EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, CXL);
>  
> -static void cxl_memdev_security_shutdown(struct device *dev)
> -{
> -	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
> -
> -	cancel_delayed_work_sync(&mds->security.poll_dwork);
> -}
> -
>  static void cxl_memdev_shutdown(struct device *dev)
>  {
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  
>  	down_write(&cxl_memdev_rwsem);
> -	cxl_memdev_security_shutdown(dev);
>  	cxlmd->cxlds = NULL;
>  	up_write(&cxl_memdev_rwsem);
>  }
> @@ -1001,35 +992,6 @@ static const struct file_operations cxl_memdev_fops = {
>  	.llseek = noop_llseek,
>  };
>  
> -static void put_sanitize(void *data)
> -{
> -	struct cxl_memdev_state *mds = data;
> -
> -	sysfs_put(mds->security.sanitize_node);
> -}
> -
> -static int cxl_memdev_security_init(struct cxl_memdev *cxlmd)
> -{
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> -	struct device *dev = &cxlmd->dev;
> -	struct kernfs_node *sec;
> -
> -	sec = sysfs_get_dirent(dev->kobj.sd, "security");
> -	if (!sec) {
> -		dev_err(dev, "sysfs_get_dirent 'security' failed\n");
> -		return -ENODEV;
> -	}
> -	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
> -	sysfs_put(sec);
> -	if (!mds->security.sanitize_node) {
> -		dev_err(dev, "sysfs_get_dirent 'state' failed\n");
> -		return -ENODEV;
> -	}
> -
> -	return devm_add_action_or_reset(cxlds->dev, put_sanitize, mds);
> - }
> -
>  struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
>  {
>  	struct cxl_memdev *cxlmd;
> @@ -1058,10 +1020,6 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
>  	if (rc)
>  		goto err;
>  
> -	rc = cxl_memdev_security_init(cxlmd);
> -	if (rc)
> -		goto err;
> -
>  	rc = devm_add_action_or_reset(cxlds->dev, cxl_memdev_unregister, cxlmd);
>  	if (rc)
>  		return ERR_PTR(rc);
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index ac4e434b0806..b0023e479315 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -165,6 +165,49 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
>  	mutex_unlock(&mds->mbox_mutex);
>  }
>  
> +static void cxl_sanitize_teardown_notifier(void *data)
> +{
> +	struct cxl_memdev_state *mds = data;
> +	struct kernfs_node *state;
> +
> +	/*
> +	 * Prevent new irq triggered invocations of the workqueue and
> +	 * flush inflight invocations.
> +	 */
> +	mutex_lock(&mds->mbox_mutex);
> +	state = mds->security.sanitize_node;
> +	mds->security.sanitize_node = NULL;
> +	mutex_unlock(&mds->mbox_mutex);
> +
> +	cancel_delayed_work_sync(&mds->security.poll_dwork);
> +	sysfs_put(state);
> +}
> +
> +static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> +{

Almost everything in cxl_pci_probe() passes in the mds.
Why not do the same here?

 
> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> +	struct device *dev = cxlds->dev;
> +	struct kernfs_node *sec;
> +
> +	if (!test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
> +		return 0;
> +
> +	sec = sysfs_get_dirent(dev->kobj.sd, "security");
> +	if (!sec) {
> +		dev_err(dev, "sanitize notification setup failure\n");
> +		return -ENOENT;
> +	}
> +	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
> +	sysfs_put(sec);
> +	if (!mds->security.sanitize_node) {
> +		dev_err(dev, "sanitize notification setup failure\n");
> +		return -ENOENT;
> +	}
> +
> +	return devm_add_action_or_reset(dev, cxl_sanitize_teardown_notifier, mds);
> +}
> +
>  /**
>   * __cxl_pci_mbox_send_cmd() - Execute a mailbox command
>   * @mds: The memory device driver data
> @@ -875,6 +918,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	rc = cxl_sanitize_setup_notifier(cxlmd);
> +	if (rc)
> +		return rc;
> +
>  	pmu_count = cxl_count_regblock(pdev, CXL_REGLOC_RBI_PMU);
>  	for (i = 0; i < pmu_count; i++) {
>  		struct cxl_pmu_regs pmu_regs;
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] cxl/mem: Fix shutdown order
  2023-09-29 23:52   ` Ira Weiny
@ 2023-10-02 10:11     ` Jonathan Cameron
  0 siblings, 0 replies; 34+ messages in thread
From: Jonathan Cameron @ 2023-10-02 10:11 UTC (permalink / raw)
  To: Ira Weiny; +Cc: Dan Williams, linux-cxl

On Fri, 29 Sep 2023 16:52:06 -0700
Ira Weiny <ira.weiny@intel.com> wrote:

> Dan Williams wrote:
> > Ira reports that removing cxl_mock_mem causes a crash with the following
> > trace:
> > 
> >  BUG: kernel NULL pointer dereference, address: 0000000000000044
> >  [..]
> >  RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core]
> >  [..]
> >  Call Trace:
> >   <TASK>
> >   cxl_region_detach+0xe8/0x210 [cxl_core]
> >   cxl_decoder_kill_region+0x27/0x40 [cxl_core]
> >   cxld_unregister+0x29/0x40 [cxl_core]
> >   devres_release_all+0xb8/0x110
> >   device_unbind_cleanup+0xe/0x70
> >   device_release_driver_internal+0x1d2/0x210
> >   bus_remove_device+0xd7/0x150
> >   device_del+0x155/0x3e0
> >   device_unregister+0x13/0x60
> >   devm_release_action+0x4d/0x90
> >   ? __pfx_unregister_port+0x10/0x10 [cxl_core]
> >   delete_endpoint+0x121/0x130 [cxl_core]
> >   devres_release_all+0xb8/0x110
> >   device_unbind_cleanup+0xe/0x70
> >   device_release_driver_internal+0x1d2/0x210
> >   bus_remove_device+0xd7/0x150
> >   device_del+0x155/0x3e0
> >   ? lock_release+0x142/0x290
> >   cdev_device_del+0x15/0x50
> >   cxl_memdev_unregister+0x54/0x70 [cxl_core]
> > 
> > This crash is due to the clearing out the cxl_memdev's driver context
> > (@cxlds) before the subsystem is done with it. This is ultimately due to
> > the region(s), that this memdev is a member, being torn down and expecting
> > to be able to de-reference @cxlds, like here:
> > 
> > static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
> > ...
> >                 if (cxlds->rcd)
> >                         goto endpoint_reset;
> > ...
> > 
> > Fix it by keeping the driver context valid until memdev-device
> > unregistration, and subsequently the entire stack of related
> > dependencies, unwinds.
> > 
> > Fixes: 9cc238c7a526 ("cxl/pci: Introduce cdevm_file_operations")
> > Reported-by: Ira Weiny <ira.weiny@intel.com>  
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Tested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks for cleaning up the whole area whilst fixing this.

Jonathan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work
  2023-09-29 23:09 ` [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work Dan Williams
  2023-09-29 23:41   ` Ira Weiny
  2023-10-02  9:55   ` Jonathan Cameron
@ 2023-10-02 15:27   ` Davidlohr Bueso
  2023-10-02 16:48   ` Dave Jiang
  3 siblings, 0 replies; 34+ messages in thread
From: Davidlohr Bueso @ 2023-10-02 15:27 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny

On Fri, 29 Sep 2023, Dan Williams wrote:

>Given that any particular put_device() could be the final put of the
>device, the fact that there are usages of cxlds->dev after
>put_device(cxlds->dev) is a red flag. Drop the reference counting since
>the device is pinned by being registered and will not be unregistered
>without triggering the driver + workqueue to shutdown.
>
>Signed-off-by: Dan Williams <dan.j.williams@intel.com>

If the refcounting is unnecessary, great.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
  2023-09-29 23:09 ` [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll Dan Williams
                     ` (2 preceding siblings ...)
  2023-10-02 10:02   ` Jonathan Cameron
@ 2023-10-02 15:49   ` Davidlohr Bueso
  2023-10-04  1:01     ` Dan Williams
  2023-10-02 16:57   ` Dave Jiang
  4 siblings, 1 reply; 34+ messages in thread
From: Davidlohr Bueso @ 2023-10-02 15:49 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny

On Fri, 29 Sep 2023, Dan Williams wrote:

>In preparation for fixing the init/teardown of the 'sanitize' workqueue
>and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
>to be the single location where the sysfs attribute is notified. With
>that change there is no distinction between polled mode and interrupt
>mode. All the interrupt does is accelerate the polling interval.
>
>The change to check for "mds->security.sanitize_node" under the lock is
>there to ensure that the interrupt, the work routine and the
>setup/teardown code can all have a consistent view of the registered
>notifier and the workqueue state. I.e. the expectation is that the
>interrupt is live past the point that the sanitize sysfs attribute is
>published, and it may race teardown, so it must be consulted under a
>lock.

That makes sense, but this is currently under hardirq, so we'd need the
threaded flavor instead to take the mutex.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work
  2023-09-29 23:09 ` [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work Dan Williams
                     ` (2 preceding siblings ...)
  2023-10-02 15:27   ` Davidlohr Bueso
@ 2023-10-02 16:48   ` Dave Jiang
  3 siblings, 0 replies; 34+ messages in thread
From: Dave Jiang @ 2023-10-02 16:48 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny



On 9/29/23 16:09, Dan Williams wrote:
> Given that any particular put_device() could be the final put of the
> device, the fact that there are usages of cxlds->dev after
> put_device(cxlds->dev) is a red flag. Drop the reference counting since
> the device is pinned by being registered and will not be unregistered
> without triggering the driver + workqueue to shutdown.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/pci.c |    5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 44a21ab7add5..aa1b3dd9e64c 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -152,8 +152,6 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
>  	mutex_lock(&mds->mbox_mutex);
>  	if (cxl_mbox_background_complete(cxlds)) {
>  		mds->security.poll_tmo_secs = 0;
> -		put_device(cxlds->dev);
> -
>  		if (mds->security.sanitize_node)
>  			sysfs_notify_dirent(mds->security.sanitize_node);
>  
> @@ -296,9 +294,6 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
>  		 */
>  		if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
>  			if (mds->security.poll) {
> -				/* hold the device throughout */
> -				get_device(cxlds->dev);
> -
>  				/* give first timeout a second */
>  				timeout = 1;
>  				mds->security.poll_tmo_secs = timeout;
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
  2023-09-29 23:09 ` [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll Dan Williams
                     ` (3 preceding siblings ...)
  2023-10-02 15:49   ` Davidlohr Bueso
@ 2023-10-02 16:57   ` Dave Jiang
  4 siblings, 0 replies; 34+ messages in thread
From: Dave Jiang @ 2023-10-02 16:57 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny



On 9/29/23 16:09, Dan Williams wrote:
> In preparation for fixing the init/teardown of the 'sanitize' workqueue
> and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
> to be the single location where the sysfs attribute is notified. With
> that change there is no distinction between polled mode and interrupt
> mode. All the interrupt does is accelerate the polling interval.
> 
> The change to check for "mds->security.sanitize_node" under the lock is
> there to ensure that the interrupt, the work routine and the
> setup/teardown code can all have a consistent view of the registered
> notifier and the workqueue state. I.e. the expectation is that the
> interrupt is live past the point that the sanitize sysfs attribute is
> published, and it may race teardown, so it must be consulted under a
> lock.
> 
> Lastly, some opportunistic replacements of
> "queue_delayed_work(system_wq, ...)", which is just open coded
> schedule_delayed_work(), are included.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/memdev.c |    3 +-
>  drivers/cxl/cxlmem.h      |    2 --
>  drivers/cxl/pci.c         |   60 +++++++++++++++++++--------------------------
>  3 files changed, 26 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 14b547c07f54..2a7a07f6d165 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -561,8 +561,7 @@ static void cxl_memdev_security_shutdown(struct device *dev)
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>  
> -	if (mds->security.poll)
> -		cancel_delayed_work_sync(&mds->security.poll_dwork);
> +	cancel_delayed_work_sync(&mds->security.poll_dwork);
>  }
>  
>  static void cxl_memdev_shutdown(struct device *dev)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 706f8a6d1ef4..55f00ad17a77 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -360,7 +360,6 @@ struct cxl_fw_state {
>   *
>   * @state: state of last security operation
>   * @enabled_cmds: All security commands enabled in the CEL
> - * @poll: polling for sanitization is enabled, device has no mbox irq support
>   * @poll_tmo_secs: polling timeout
>   * @poll_dwork: polling work item
>   * @sanitize_node: sanitation sysfs file to notify
> @@ -368,7 +367,6 @@ struct cxl_fw_state {
>  struct cxl_security_state {
>  	unsigned long state;
>  	DECLARE_BITMAP(enabled_cmds, CXL_SEC_ENABLED_MAX);
> -	bool poll;
>  	int poll_tmo_secs;
>  	struct delayed_work poll_dwork;
>  	struct kernfs_node *sanitize_node;
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index aa1b3dd9e64c..ac4e434b0806 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -128,10 +128,10 @@ static irqreturn_t cxl_pci_mbox_irq(int irq, void *id)
>  	reg = readq(cxlds->regs.mbox + CXLDEV_MBOX_BG_CMD_STATUS_OFFSET);
>  	opcode = FIELD_GET(CXLDEV_MBOX_BG_CMD_COMMAND_OPCODE_MASK, reg);
>  	if (opcode == CXL_MBOX_OP_SANITIZE) {
> +		mutex_lock(&mds->mbox_mutex);
>  		if (mds->security.sanitize_node)
> -			sysfs_notify_dirent(mds->security.sanitize_node);
> -
> -		dev_dbg(cxlds->dev, "Sanitization operation ended\n");
> +			mod_delayed_work(system_wq, &mds->security.poll_dwork, 0);
> +		mutex_unlock(&mds->mbox_mutex);
>  	} else {
>  		/* short-circuit the wait in __cxl_pci_mbox_send_cmd() */
>  		rcuwait_wake_up(&mds->mbox_wait);
> @@ -160,8 +160,7 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
>  		int timeout = mds->security.poll_tmo_secs + 10;
>  
>  		mds->security.poll_tmo_secs = min(15 * 60, timeout);
> -		queue_delayed_work(system_wq, &mds->security.poll_dwork,
> -				   timeout * HZ);
> +		schedule_delayed_work(&mds->security.poll_dwork, timeout * HZ);
>  	}
>  	mutex_unlock(&mds->mbox_mutex);
>  }
> @@ -293,15 +292,11 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
>  		 * and allow userspace to poll(2) for completion.
>  		 */
>  		if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
> -			if (mds->security.poll) {
> -				/* give first timeout a second */
> -				timeout = 1;
> -				mds->security.poll_tmo_secs = timeout;
> -				queue_delayed_work(system_wq,
> -						   &mds->security.poll_dwork,
> -						   timeout * HZ);
> -			}
> -
> +			/* give first timeout a second */
> +			timeout = 1;
> +			mds->security.poll_tmo_secs = timeout;
> +			schedule_delayed_work(&mds->security.poll_dwork,
> +					      timeout * HZ);
>  			dev_dbg(dev, "Sanitization operation started\n");
>  			goto success;
>  		}
> @@ -384,7 +379,9 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
>  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
>  	struct device *dev = cxlds->dev;
>  	unsigned long timeout;
> +	int irq, msgnum;
>  	u64 md_status;
> +	u32 ctrl;
>  
>  	timeout = jiffies + mbox_ready_timeout * HZ;
>  	do {
> @@ -432,33 +429,26 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
>  	dev_dbg(dev, "Mailbox payload sized %zu", mds->payload_size);
>  
>  	rcuwait_init(&mds->mbox_wait);
> +	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
>  
> -	if (cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) {
> -		u32 ctrl;
> -		int irq, msgnum;
> -		struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> -
> -		msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> -		irq = pci_irq_vector(pdev, msgnum);
> -		if (irq < 0)
> -			goto mbox_poll;
> -
> -		if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
> -			goto mbox_poll;
> +	/* background command interrupts are optional */
> +	if ((cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) == 0)
> +		return 0;
>  
> -		/* enable background command mbox irq support */
> -		ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> -		ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
> -		writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> +	msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> +	irq = pci_irq_vector(to_pci_dev(cxlds->dev), msgnum);
> +	if (irq < 0)
> +		return 0;
>  
> +	if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
>  		return 0;
> -	}
>  
> -mbox_poll:
> -	mds->security.poll = true;
> -	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
> +	dev_dbg(cxlds->dev, "Mailbox interrupts enabled\n");
> +	/* enable background command mbox irq support */
> +	ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
> +	ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
> +	writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
>  
> -	dev_dbg(cxlds->dev, "Mailbox interrupts are unsupported");
>  	return 0;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-09-29 23:09 ` [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup Dan Williams
  2023-09-30  2:42   ` Ira Weiny
  2023-10-02 10:10   ` Jonathan Cameron
@ 2023-10-02 16:59   ` Dave Jiang
  2023-10-04  0:52   ` Davidlohr Bueso
  3 siblings, 0 replies; 34+ messages in thread
From: Dave Jiang @ 2023-10-02 16:59 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: Jonathan Cameron, Davidlohr Bueso, ira.weiny



On 9/29/23 16:09, Dan Williams wrote:
> Fix a race condition between the mailbox-background command interrupt
> firing and the security-state sysfs attribute being removed.
> 
> The race is difficult to see due to the awkward placement of the
> sanitize-notifier setup code and the multiple places the teardown calls
> are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown().
> 
> Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for
> the paired cxl_sanitize_teardown_notifier() to safely quiet the notifier
> and let the cxl_memdev + irq be unregistered later in the flow.
> 
> This fix is also needed as a preparation fix for a memdev unregistration
> crash.
> 
> Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/memdev.c |   42 ----------------------------------------
>  drivers/cxl/pci.c         |   47 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 47 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 2a7a07f6d165..a950091e5640 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -556,20 +556,11 @@ void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  }
>  EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, CXL);
>  
> -static void cxl_memdev_security_shutdown(struct device *dev)
> -{
> -	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
> -
> -	cancel_delayed_work_sync(&mds->security.poll_dwork);
> -}
> -
>  static void cxl_memdev_shutdown(struct device *dev)
>  {
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  
>  	down_write(&cxl_memdev_rwsem);
> -	cxl_memdev_security_shutdown(dev);
>  	cxlmd->cxlds = NULL;
>  	up_write(&cxl_memdev_rwsem);
>  }
> @@ -1001,35 +992,6 @@ static const struct file_operations cxl_memdev_fops = {
>  	.llseek = noop_llseek,
>  };
>  
> -static void put_sanitize(void *data)
> -{
> -	struct cxl_memdev_state *mds = data;
> -
> -	sysfs_put(mds->security.sanitize_node);
> -}
> -
> -static int cxl_memdev_security_init(struct cxl_memdev *cxlmd)
> -{
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> -	struct device *dev = &cxlmd->dev;
> -	struct kernfs_node *sec;
> -
> -	sec = sysfs_get_dirent(dev->kobj.sd, "security");
> -	if (!sec) {
> -		dev_err(dev, "sysfs_get_dirent 'security' failed\n");
> -		return -ENODEV;
> -	}
> -	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
> -	sysfs_put(sec);
> -	if (!mds->security.sanitize_node) {
> -		dev_err(dev, "sysfs_get_dirent 'state' failed\n");
> -		return -ENODEV;
> -	}
> -
> -	return devm_add_action_or_reset(cxlds->dev, put_sanitize, mds);
> - }
> -
>  struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
>  {
>  	struct cxl_memdev *cxlmd;
> @@ -1058,10 +1020,6 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
>  	if (rc)
>  		goto err;
>  
> -	rc = cxl_memdev_security_init(cxlmd);
> -	if (rc)
> -		goto err;
> -
>  	rc = devm_add_action_or_reset(cxlds->dev, cxl_memdev_unregister, cxlmd);
>  	if (rc)
>  		return ERR_PTR(rc);
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index ac4e434b0806..b0023e479315 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -165,6 +165,49 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
>  	mutex_unlock(&mds->mbox_mutex);
>  }
>  
> +static void cxl_sanitize_teardown_notifier(void *data)
> +{
> +	struct cxl_memdev_state *mds = data;
> +	struct kernfs_node *state;
> +
> +	/*
> +	 * Prevent new irq triggered invocations of the workqueue and
> +	 * flush inflight invocations.
> +	 */
> +	mutex_lock(&mds->mbox_mutex);
> +	state = mds->security.sanitize_node;
> +	mds->security.sanitize_node = NULL;
> +	mutex_unlock(&mds->mbox_mutex);
> +
> +	cancel_delayed_work_sync(&mds->security.poll_dwork);
> +	sysfs_put(state);
> +}
> +
> +static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> +	struct device *dev = cxlds->dev;
> +	struct kernfs_node *sec;
> +
> +	if (!test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
> +		return 0;
> +
> +	sec = sysfs_get_dirent(dev->kobj.sd, "security");
> +	if (!sec) {
> +		dev_err(dev, "sanitize notification setup failure\n");
> +		return -ENOENT;
> +	}
> +	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
> +	sysfs_put(sec);
> +	if (!mds->security.sanitize_node) {
> +		dev_err(dev, "sanitize notification setup failure\n");
> +		return -ENOENT;
> +	}
> +
> +	return devm_add_action_or_reset(dev, cxl_sanitize_teardown_notifier, mds);
> +}
> +
>  /**
>   * __cxl_pci_mbox_send_cmd() - Execute a mailbox command
>   * @mds: The memory device driver data
> @@ -875,6 +918,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	rc = cxl_sanitize_setup_notifier(cxlmd);
> +	if (rc)
> +		return rc;
> +
>  	pmu_count = cxl_count_regblock(pdev, CXL_REGLOC_RBI_PMU);
>  	for (i = 0; i < pmu_count; i++) {
>  		struct cxl_pmu_regs pmu_regs;
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] cxl/mem: Fix shutdown order
  2023-09-29 23:09 ` [PATCH v2 4/4] cxl/mem: Fix shutdown order Dan Williams
  2023-09-29 23:52   ` Ira Weiny
@ 2023-10-02 16:59   ` Dave Jiang
  2023-10-03 17:40   ` Davidlohr Bueso
  2 siblings, 0 replies; 34+ messages in thread
From: Dave Jiang @ 2023-10-02 16:59 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: Ira Weiny



On 9/29/23 16:09, Dan Williams wrote:
> Ira reports that removing cxl_mock_mem causes a crash with the following
> trace:
> 
>  BUG: kernel NULL pointer dereference, address: 0000000000000044
>  [..]
>  RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core]
>  [..]
>  Call Trace:
>   <TASK>
>   cxl_region_detach+0xe8/0x210 [cxl_core]
>   cxl_decoder_kill_region+0x27/0x40 [cxl_core]
>   cxld_unregister+0x29/0x40 [cxl_core]
>   devres_release_all+0xb8/0x110
>   device_unbind_cleanup+0xe/0x70
>   device_release_driver_internal+0x1d2/0x210
>   bus_remove_device+0xd7/0x150
>   device_del+0x155/0x3e0
>   device_unregister+0x13/0x60
>   devm_release_action+0x4d/0x90
>   ? __pfx_unregister_port+0x10/0x10 [cxl_core]
>   delete_endpoint+0x121/0x130 [cxl_core]
>   devres_release_all+0xb8/0x110
>   device_unbind_cleanup+0xe/0x70
>   device_release_driver_internal+0x1d2/0x210
>   bus_remove_device+0xd7/0x150
>   device_del+0x155/0x3e0
>   ? lock_release+0x142/0x290
>   cdev_device_del+0x15/0x50
>   cxl_memdev_unregister+0x54/0x70 [cxl_core]
> 
> This crash is due to the clearing out the cxl_memdev's driver context
> (@cxlds) before the subsystem is done with it. This is ultimately due to
> the region(s), that this memdev is a member, being torn down and expecting
> to be able to de-reference @cxlds, like here:
> 
> static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
> ...
>                 if (cxlds->rcd)
>                         goto endpoint_reset;
> ...
> 
> Fix it by keeping the driver context valid until memdev-device
> unregistration, and subsequently the entire stack of related
> dependencies, unwinds.
> 
> Fixes: 9cc238c7a526 ("cxl/pci: Introduce cdevm_file_operations")
> Reported-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/memdev.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index a950091e5640..e78b5ead14fa 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -570,8 +570,8 @@ static void cxl_memdev_unregister(void *_cxlmd)
>  	struct cxl_memdev *cxlmd = _cxlmd;
>  	struct device *dev = &cxlmd->dev;
>  
> -	cxl_memdev_shutdown(dev);
>  	cdev_device_del(&cxlmd->cdev, dev);
> +	cxl_memdev_shutdown(dev);
>  	put_device(dev);
>  }
>  
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] cxl/mem: Fix shutdown order
  2023-09-29 23:09 ` [PATCH v2 4/4] cxl/mem: Fix shutdown order Dan Williams
  2023-09-29 23:52   ` Ira Weiny
  2023-10-02 16:59   ` Dave Jiang
@ 2023-10-03 17:40   ` Davidlohr Bueso
  2 siblings, 0 replies; 34+ messages in thread
From: Davidlohr Bueso @ 2023-10-03 17:40 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, Ira Weiny

On Fri, 29 Sep 2023, Dan Williams wrote:

>Ira reports that removing cxl_mock_mem causes a crash with the following
>trace:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000044
> [..]
> RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core]
> [..]
> Call Trace:
>  <TASK>
>  cxl_region_detach+0xe8/0x210 [cxl_core]
>  cxl_decoder_kill_region+0x27/0x40 [cxl_core]
>  cxld_unregister+0x29/0x40 [cxl_core]
>  devres_release_all+0xb8/0x110
>  device_unbind_cleanup+0xe/0x70
>  device_release_driver_internal+0x1d2/0x210
>  bus_remove_device+0xd7/0x150
>  device_del+0x155/0x3e0
>  device_unregister+0x13/0x60
>  devm_release_action+0x4d/0x90
>  ? __pfx_unregister_port+0x10/0x10 [cxl_core]
>  delete_endpoint+0x121/0x130 [cxl_core]
>  devres_release_all+0xb8/0x110
>  device_unbind_cleanup+0xe/0x70
>  device_release_driver_internal+0x1d2/0x210
>  bus_remove_device+0xd7/0x150
>  device_del+0x155/0x3e0
>  ? lock_release+0x142/0x290
>  cdev_device_del+0x15/0x50
>  cxl_memdev_unregister+0x54/0x70 [cxl_core]
>
>This crash is due to the clearing out the cxl_memdev's driver context
>(@cxlds) before the subsystem is done with it. This is ultimately due to
>the region(s), that this memdev is a member, being torn down and expecting
>to be able to de-reference @cxlds, like here:
>
>static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
>...
>                if (cxlds->rcd)
>                        goto endpoint_reset;
>...
>
>Fix it by keeping the driver context valid until memdev-device
>unregistration, and subsequently the entire stack of related
>dependencies, unwinds.
>
>Fixes: 9cc238c7a526 ("cxl/pci: Introduce cdevm_file_operations")
>Reported-by: Ira Weiny <ira.weiny@intel.com>
>Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-09-29 23:09 ` [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup Dan Williams
                     ` (2 preceding siblings ...)
  2023-10-02 16:59   ` Dave Jiang
@ 2023-10-04  0:52   ` Davidlohr Bueso
  2023-10-04  1:09     ` Dan Williams
  3 siblings, 1 reply; 34+ messages in thread
From: Davidlohr Bueso @ 2023-10-04  0:52 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, Jonathan Cameron, Dave Jiang, ira.weiny

On Fri, 29 Sep 2023, Dan Williams wrote:

>+static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
>+{
>+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
>+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>+	struct device *dev = cxlds->dev;

This wants to be dev = &cxlmd->dev;

>+	struct kernfs_node *sec;
>+
>+	if (!test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
>+		return 0;
>+
>+	sec = sysfs_get_dirent(dev->kobj.sd, "security");
>+	if (!sec) {
>+		dev_err(dev, "sanitize notification setup failure\n");

Nit: Should the error message differ from the next one? Maybe s/sanitize/security?

>+		return -ENOENT;
>+	}
>+	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
>+	sysfs_put(sec);
>+	if (!mds->security.sanitize_node) {
>+		dev_err(dev, "sanitize notification setup failure\n");
>+		return -ENOENT;
>+	}
>+
>+	return devm_add_action_or_reset(dev, cxl_sanitize_teardown_notifier, mds);
>+}

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
  2023-10-02 10:02   ` Jonathan Cameron
@ 2023-10-04  0:55     ` Dan Williams
  0 siblings, 0 replies; 34+ messages in thread
From: Dan Williams @ 2023-10-04  0:55 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams; +Cc: linux-cxl, ira.weiny

Jonathan Cameron wrote:
> On Fri, 29 Sep 2023 16:09:39 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > In preparation for fixing the init/teardown of the 'sanitize' workqueue
> > and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
> > to be the single location where the sysfs attribute is notified. With
> > that change there is no distinction between polled mode and interrupt
> > mode. All the interrupt does is accelerate the polling interval.
> > 
> > The change to check for "mds->security.sanitize_node" under the lock is
> > there to ensure that the interrupt, the work routine and the
> > setup/teardown code can all have a consistent view of the registered
> > notifier and the workqueue state. I.e. the expectation is that the
> > interrupt is live past the point that the sanitize sysfs attribute is
> > published, and it may race teardown, so it must be consulted under a
> > lock.
> > 
> > Lastly, some opportunistic replacements of
> > "queue_delayed_work(system_wq, ...)", which is just open coded
> > schedule_delayed_work(), are included.
> > 
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> One trivial comment inline, otherwise looks fine to me
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> > ---
> >  drivers/cxl/core/memdev.c |    3 +-
> >  drivers/cxl/cxlmem.h      |    2 --
> >  drivers/cxl/pci.c         |   60 +++++++++++++++++++--------------------------
> >  3 files changed, 26 insertions(+), 39 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index 14b547c07f54..2a7a07f6d165 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -561,8 +561,7 @@ static void cxl_memdev_security_shutdown(struct device *dev)
> >  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> >  	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
> >  
> > -	if (mds->security.poll)
> > -		cancel_delayed_work_sync(&mds->security.poll_dwork);
> > +	cancel_delayed_work_sync(&mds->security.poll_dwork);
> >  }
> >  
> >  static void cxl_memdev_shutdown(struct device *dev)
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 706f8a6d1ef4..55f00ad17a77 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -360,7 +360,6 @@ struct cxl_fw_state {
> >   *
> >   * @state: state of last security operation
> >   * @enabled_cmds: All security commands enabled in the CEL
> > - * @poll: polling for sanitization is enabled, device has no mbox irq support
> >   * @poll_tmo_secs: polling timeout
> >   * @poll_dwork: polling work item
> >   * @sanitize_node: sanitation sysfs file to notify
> > @@ -368,7 +367,6 @@ struct cxl_fw_state {
> >  struct cxl_security_state {
> >  	unsigned long state;
> >  	DECLARE_BITMAP(enabled_cmds, CXL_SEC_ENABLED_MAX);
> > -	bool poll;
> >  	int poll_tmo_secs;
> >  	struct delayed_work poll_dwork;
> >  	struct kernfs_node *sanitize_node;
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index aa1b3dd9e64c..ac4e434b0806 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -128,10 +128,10 @@ static irqreturn_t cxl_pci_mbox_irq(int irq, void *id)
> >  	reg = readq(cxlds->regs.mbox + CXLDEV_MBOX_BG_CMD_STATUS_OFFSET);
> >  	opcode = FIELD_GET(CXLDEV_MBOX_BG_CMD_COMMAND_OPCODE_MASK, reg);
> >  	if (opcode == CXL_MBOX_OP_SANITIZE) {
> > +		mutex_lock(&mds->mbox_mutex);
> >  		if (mds->security.sanitize_node)
> > -			sysfs_notify_dirent(mds->security.sanitize_node);
> > -
> > -		dev_dbg(cxlds->dev, "Sanitization operation ended\n");
> > +			mod_delayed_work(system_wq, &mds->security.poll_dwork, 0);
> > +		mutex_unlock(&mds->mbox_mutex);
> >  	} else {
> >  		/* short-circuit the wait in __cxl_pci_mbox_send_cmd() */
> >  		rcuwait_wake_up(&mds->mbox_wait);
> > @@ -160,8 +160,7 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
> >  		int timeout = mds->security.poll_tmo_secs + 10;
> >  
> >  		mds->security.poll_tmo_secs = min(15 * 60, timeout);
> > -		queue_delayed_work(system_wq, &mds->security.poll_dwork,
> > -				   timeout * HZ);
> > +		schedule_delayed_work(&mds->security.poll_dwork, timeout * HZ);
> >  	}
> >  	mutex_unlock(&mds->mbox_mutex);
> >  }
> > @@ -293,15 +292,11 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
> >  		 * and allow userspace to poll(2) for completion.
> >  		 */
> >  		if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
> > -			if (mds->security.poll) {
> > -				/* give first timeout a second */
> > -				timeout = 1;
> > -				mds->security.poll_tmo_secs = timeout;
> > -				queue_delayed_work(system_wq,
> > -						   &mds->security.poll_dwork,
> > -						   timeout * HZ);
> > -			}
> > -
> > +			/* give first timeout a second */
> > +			timeout = 1;
> > +			mds->security.poll_tmo_secs = timeout;
> > +			schedule_delayed_work(&mds->security.poll_dwork,
> > +					      timeout * HZ);
> >  			dev_dbg(dev, "Sanitization operation started\n");
> >  			goto success;
> >  		}
> > @@ -384,7 +379,9 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
> >  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> >  	struct device *dev = cxlds->dev;
> >  	unsigned long timeout;
> > +	int irq, msgnum;
> >  	u64 md_status;
> > +	u32 ctrl;
> >  
> >  	timeout = jiffies + mbox_ready_timeout * HZ;
> >  	do {
> > @@ -432,33 +429,26 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
> >  	dev_dbg(dev, "Mailbox payload sized %zu", mds->payload_size);
> >  
> >  	rcuwait_init(&mds->mbox_wait);
> > +	INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
> >  
> > -	if (cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) {
> > -		u32 ctrl;
> > -		int irq, msgnum;
> > -		struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > -
> > -		msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
> > -		irq = pci_irq_vector(pdev, msgnum);
> > -		if (irq < 0)
> > -			goto mbox_poll;
> > -
> > -		if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
> > -			goto mbox_poll;
> > +	/* background command interrupts are optional */
> > +	if ((cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) == 0)
> 
> Nit, but it's a boolean flag, so to me
> 	if (!(cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ))
> 
> feels slightly better.
> 
> There are instances of this done with !() elsewhere in the file,
> so not obviously a case of wanting to match local style.

Ok.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-10-02 10:10   ` Jonathan Cameron
@ 2023-10-04  0:59     ` Dan Williams
  2023-10-04 10:12       ` Jonathan Cameron
  0 siblings, 1 reply; 34+ messages in thread
From: Dan Williams @ 2023-10-04  0:59 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams
  Cc: linux-cxl, Dave Jiang, Davidlohr Bueso, ira.weiny

Jonathan Cameron wrote:
> On Fri, 29 Sep 2023 16:09:44 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > Fix a race condition between the mailbox-background command interrupt
> > firing and the security-state sysfs attribute being removed.
> > 
> > The race is difficult to see due to the awkward placement of the
> > sanitize-notifier setup code and the multiple places the teardown calls
> > are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown().
> > 
> > Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for
> > the paired cxl_sanitize_teardown_notifier() to safely quiet the notifier
> > and let the cxl_memdev + irq be unregistered later in the flow.
> > 
> > This fix is also needed as a preparation fix for a memdev unregistration
> > crash.
> > 
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> > Cc: Dave Jiang <dave.jiang@intel.com>
> > Cc: Davidlohr Bueso <dave@stgolabs.net>
> > Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery")
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> One trivial question inline about which parameter to pass in from the
> many many interlocking state structures...
> 
> If you do make the suggested change, it's just complex enough I want another
> look so I'm not giving a tag for now.
> 
> > ---
> >  drivers/cxl/core/memdev.c |   42 ----------------------------------------
> >  drivers/cxl/pci.c         |   47 +++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 47 insertions(+), 42 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index 2a7a07f6d165..a950091e5640 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -556,20 +556,11 @@ void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> >  }
> >  EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, CXL);
> >  
> > -static void cxl_memdev_security_shutdown(struct device *dev)
> > -{
> > -	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
> > -
> > -	cancel_delayed_work_sync(&mds->security.poll_dwork);
> > -}
> > -
> >  static void cxl_memdev_shutdown(struct device *dev)
> >  {
> >  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> >  
> >  	down_write(&cxl_memdev_rwsem);
> > -	cxl_memdev_security_shutdown(dev);
> >  	cxlmd->cxlds = NULL;
> >  	up_write(&cxl_memdev_rwsem);
> >  }
> > @@ -1001,35 +992,6 @@ static const struct file_operations cxl_memdev_fops = {
> >  	.llseek = noop_llseek,
> >  };
> >  
> > -static void put_sanitize(void *data)
> > -{
> > -	struct cxl_memdev_state *mds = data;
> > -
> > -	sysfs_put(mds->security.sanitize_node);
> > -}
> > -
> > -static int cxl_memdev_security_init(struct cxl_memdev *cxlmd)
> > -{
> > -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> > -	struct device *dev = &cxlmd->dev;
> > -	struct kernfs_node *sec;
> > -
> > -	sec = sysfs_get_dirent(dev->kobj.sd, "security");
> > -	if (!sec) {
> > -		dev_err(dev, "sysfs_get_dirent 'security' failed\n");
> > -		return -ENODEV;
> > -	}
> > -	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
> > -	sysfs_put(sec);
> > -	if (!mds->security.sanitize_node) {
> > -		dev_err(dev, "sysfs_get_dirent 'state' failed\n");
> > -		return -ENODEV;
> > -	}
> > -
> > -	return devm_add_action_or_reset(cxlds->dev, put_sanitize, mds);
> > - }
> > -
> >  struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
> >  {
> >  	struct cxl_memdev *cxlmd;
> > @@ -1058,10 +1020,6 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
> >  	if (rc)
> >  		goto err;
> >  
> > -	rc = cxl_memdev_security_init(cxlmd);
> > -	if (rc)
> > -		goto err;
> > -
> >  	rc = devm_add_action_or_reset(cxlds->dev, cxl_memdev_unregister, cxlmd);
> >  	if (rc)
> >  		return ERR_PTR(rc);
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index ac4e434b0806..b0023e479315 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -165,6 +165,49 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
> >  	mutex_unlock(&mds->mbox_mutex);
> >  }
> >  
> > +static void cxl_sanitize_teardown_notifier(void *data)
> > +{
> > +	struct cxl_memdev_state *mds = data;
> > +	struct kernfs_node *state;
> > +
> > +	/*
> > +	 * Prevent new irq triggered invocations of the workqueue and
> > +	 * flush inflight invocations.
> > +	 */
> > +	mutex_lock(&mds->mbox_mutex);
> > +	state = mds->security.sanitize_node;
> > +	mds->security.sanitize_node = NULL;
> > +	mutex_unlock(&mds->mbox_mutex);
> > +
> > +	cancel_delayed_work_sync(&mds->security.poll_dwork);
> > +	sysfs_put(state);
> > +}
> > +
> > +static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> > +{
> 
> Almost everything in cxl_pci_probe() passes in the mds.
> Why not do the same here?

Because this one really is built on top of a stack of things and needs
the 'device' because it is tying the device's sysfs attributes to the
completion notifications of the background workqueue.

I mentioned this in the cover, but failed to mention it again in this
changelog:

"The special wrinkle of the sanitize notifier is that it interacts with
interrupts, which are enabled early in the flow, and it interacts with
memdev sysfs which is not initialized until late in the flow."

There are no sysfs attributes reachable from an @mds.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
  2023-10-02 15:49   ` Davidlohr Bueso
@ 2023-10-04  1:01     ` Dan Williams
  2023-10-04  1:13       ` Davidlohr Bueso
  0 siblings, 1 reply; 34+ messages in thread
From: Dan Williams @ 2023-10-04  1:01 UTC (permalink / raw)
  To: Davidlohr Bueso, Dan Williams; +Cc: linux-cxl, ira.weiny

Davidlohr Bueso wrote:
> On Fri, 29 Sep 2023, Dan Williams wrote:
> 
> >In preparation for fixing the init/teardown of the 'sanitize' workqueue
> >and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
> >to be the single location where the sysfs attribute is notified. With
> >that change there is no distinction between polled mode and interrupt
> >mode. All the interrupt does is accelerate the polling interval.
> >
> >The change to check for "mds->security.sanitize_node" under the lock is
> >there to ensure that the interrupt, the work routine and the
> >setup/teardown code can all have a consistent view of the registered
> >notifier and the workqueue state. I.e. the expectation is that the
> >interrupt is live past the point that the sanitize sysfs attribute is
> >published, and it may race teardown, so it must be consulted under a
> >lock.
> 
> That makes sense, but this is currently under hardirq, so we'd need the
> threaded flavor instead to take the mutex.

Yes I missed that, will fix.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-10-04  0:52   ` Davidlohr Bueso
@ 2023-10-04  1:09     ` Dan Williams
  2023-10-04 16:21       ` Davidlohr Bueso
  0 siblings, 1 reply; 34+ messages in thread
From: Dan Williams @ 2023-10-04  1:09 UTC (permalink / raw)
  To: Davidlohr Bueso, Dan Williams
  Cc: linux-cxl, Jonathan Cameron, Dave Jiang, ira.weiny

Davidlohr Bueso wrote:
> On Fri, 29 Sep 2023, Dan Williams wrote:
> 
> >+static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> >+{
> >+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> >+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> >+	struct device *dev = cxlds->dev;
> 
> This wants to be dev = &cxlmd->dev;

No, the notifier needs to be torn down in the cxl_pci teardown path. If
cxlmd->dev was the devm operations host then this notifier would need to
be setup from the cxl_mem driver, not cxl_pci. The teardown order of
cxl_pci ends with @cxlds->dev as the devm host ends up as:

cxl_sanitize_teardown_notifier()
cxl_memdev_unregister()

...otherwise if the @cxlmd->dev is used then the devm callback may not
fire until device_release() time since it is possible that the cxl_mem
driver never attaches to trigger the typical devm action around
->remove() time.

> >+	struct kernfs_node *sec;
> >+
> >+	if (!test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
> >+		return 0;
> >+
> >+	sec = sysfs_get_dirent(dev->kobj.sd, "security");
> >+	if (!sec) {
> >+		dev_err(dev, "sanitize notification setup failure\n");
> 
> Nit: Should the error message differ from the next one? Maybe s/sanitize/security?

Maybe, but these things will likely never fire, as devm_cxl_add_memdev()
would have failed before getting here. Maybe just delete them?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
  2023-10-04  1:01     ` Dan Williams
@ 2023-10-04  1:13       ` Davidlohr Bueso
  0 siblings, 0 replies; 34+ messages in thread
From: Davidlohr Bueso @ 2023-10-04  1:13 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny

On Tue, 03 Oct 2023, Dan Williams wrote:

>Davidlohr Bueso wrote:
>> On Fri, 29 Sep 2023, Dan Williams wrote:
>>
>> >In preparation for fixing the init/teardown of the 'sanitize' workqueue
>> >and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
>> >to be the single location where the sysfs attribute is notified. With
>> >that change there is no distinction between polled mode and interrupt
>> >mode. All the interrupt does is accelerate the polling interval.
>> >
>> >The change to check for "mds->security.sanitize_node" under the lock is
>> >there to ensure that the interrupt, the work routine and the
>> >setup/teardown code can all have a consistent view of the registered
>> >notifier and the workqueue state. I.e. the expectation is that the
>> >interrupt is live past the point that the sanitize sysfs attribute is
>> >published, and it may race teardown, so it must be consulted under a
>> >lock.
>>
>> That makes sense, but this is currently under hardirq, so we'd need the
>> threaded flavor instead to take the mutex.
>
>Yes I missed that, will fix.

With the following this seem to be running just fine along with locking
debugging on (just need to confirm that the sysfs notification works with
polling in userspace upon completion).

root@cxl:/sys/bus/cxl/devices/mem0# cat security/state
disabled
root@cxl:/sys/bus/cxl/devices/mem0# echo 1 > security/sanitize
[   77.641857] cxl_pci:__cxl_pci_mbox_send_cmd:297: cxl_pci 0000:0d:00.0: Sending command: 0x4500
[   77.648236] cxl_pci:cxl_pci_mbox_wait_for_doorbell:72: cxl_pci 0000:0d:00.0: Doorbell wait took 0ms
[   77.656379] cxl_pci:__cxl_pci_mbox_send_cmd:297: cxl_pci 0000:0d:00.0: Sending command: 0x4400
[   77.662616] cxl_pci:cxl_pci_mbox_wait_for_doorbell:72: cxl_pci 0000:0d:00.0: Doorbell wait took 0ms
[   77.669227] cxl_pci:__cxl_pci_mbox_send_cmd:343: cxl_pci 0000:0d:00.0: Sanitization operation started
root@cxl:/sys/bus/cxl/devices/mem0# cat security/state
sanitize
root@cxl:/sys/bus/cxl/devices/mem0# echo 1 > security/sanitize
[   81.705796] cxl_pci 0000:0d:00.0: Failed to get security state : -16 -bash: echo: write error: Device or resource busy
[   85.761560] cxl_pci:cxl_mbox_sanitize_work:158: cxl_pci 0000:0d:00.0: Sanitization operation ended


----
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index b0023e479315..b8f69ce883be 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -187,7 +187,7 @@ static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
  {
	struct cxl_dev_state *cxlds = cxlmd->cxlds;
	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
-	struct device *dev = cxlds->dev;
+	struct device *dev = &cxlmd->dev;
	struct kernfs_node *sec;

	if (!test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
@@ -483,7 +483,7 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
	if (irq < 0)
		return 0;

-	if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
+	if (cxl_request_irq(cxlds, irq, NULL, cxl_pci_mbox_irq))
		return 0;

	dev_dbg(cxlds->dev, "Mailbox interrupts enabled\n");

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-10-04  0:59     ` Dan Williams
@ 2023-10-04 10:12       ` Jonathan Cameron
  2023-10-04 18:47         ` Dan Williams
  0 siblings, 1 reply; 34+ messages in thread
From: Jonathan Cameron @ 2023-10-04 10:12 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, Dave Jiang, Davidlohr Bueso, ira.weiny

On Tue, 3 Oct 2023 17:59:46 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Fri, 29 Sep 2023 16:09:44 -0700
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >   
> > > Fix a race condition between the mailbox-background command interrupt
> > > firing and the security-state sysfs attribute being removed.
> > > 
> > > The race is difficult to see due to the awkward placement of the
> > > sanitize-notifier setup code and the multiple places the teardown calls
> > > are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown().
> > > 
> > > Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for
> > > the paired cxl_sanitize_teardown_notifier() to safely quiet the notifier
> > > and let the cxl_memdev + irq be unregistered later in the flow.
> > > 
> > > This fix is also needed as a preparation fix for a memdev unregistration
> > > crash.
> > > 
> > > Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> > > Cc: Dave Jiang <dave.jiang@intel.com>
> > > Cc: Davidlohr Bueso <dave@stgolabs.net>
> > > Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery")
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>  
> > 
> > One trivial question inline about which parameter to pass in from the
> > many many interlocking state structures...
> > 
> > If you do make the suggested change, it's just complex enough I want another
> > look so I'm not giving a tag for now.
> >   
> > > ---
> > >  drivers/cxl/core/memdev.c |   42 ----------------------------------------
> > >  drivers/cxl/pci.c         |   47 +++++++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 47 insertions(+), 42 deletions(-)
> > > 
> > > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > > index 2a7a07f6d165..a950091e5640 100644
> > > --- a/drivers/cxl/core/memdev.c
> > > +++ b/drivers/cxl/core/memdev.c
> > > @@ -556,20 +556,11 @@ void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> > >  }
> > >  EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, CXL);
> > >  
> > > -static void cxl_memdev_security_shutdown(struct device *dev)
> > > -{
> > > -	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > > -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
> > > -
> > > -	cancel_delayed_work_sync(&mds->security.poll_dwork);
> > > -}
> > > -
> > >  static void cxl_memdev_shutdown(struct device *dev)
> > >  {
> > >  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > >  
> > >  	down_write(&cxl_memdev_rwsem);
> > > -	cxl_memdev_security_shutdown(dev);
> > >  	cxlmd->cxlds = NULL;
> > >  	up_write(&cxl_memdev_rwsem);
> > >  }
> > > @@ -1001,35 +992,6 @@ static const struct file_operations cxl_memdev_fops = {
> > >  	.llseek = noop_llseek,
> > >  };
> > >  
> > > -static void put_sanitize(void *data)
> > > -{
> > > -	struct cxl_memdev_state *mds = data;
> > > -
> > > -	sysfs_put(mds->security.sanitize_node);
> > > -}
> > > -
> > > -static int cxl_memdev_security_init(struct cxl_memdev *cxlmd)
> > > -{
> > > -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > > -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> > > -	struct device *dev = &cxlmd->dev;
> > > -	struct kernfs_node *sec;
> > > -
> > > -	sec = sysfs_get_dirent(dev->kobj.sd, "security");
> > > -	if (!sec) {
> > > -		dev_err(dev, "sysfs_get_dirent 'security' failed\n");
> > > -		return -ENODEV;
> > > -	}
> > > -	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
> > > -	sysfs_put(sec);
> > > -	if (!mds->security.sanitize_node) {
> > > -		dev_err(dev, "sysfs_get_dirent 'state' failed\n");
> > > -		return -ENODEV;
> > > -	}
> > > -
> > > -	return devm_add_action_or_reset(cxlds->dev, put_sanitize, mds);
> > > - }
> > > -
> > >  struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
> > >  {
> > >  	struct cxl_memdev *cxlmd;
> > > @@ -1058,10 +1020,6 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
> > >  	if (rc)
> > >  		goto err;
> > >  
> > > -	rc = cxl_memdev_security_init(cxlmd);
> > > -	if (rc)
> > > -		goto err;
> > > -
> > >  	rc = devm_add_action_or_reset(cxlds->dev, cxl_memdev_unregister, cxlmd);
> > >  	if (rc)
> > >  		return ERR_PTR(rc);
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index ac4e434b0806..b0023e479315 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -165,6 +165,49 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
> > >  	mutex_unlock(&mds->mbox_mutex);
> > >  }
> > >  
> > > +static void cxl_sanitize_teardown_notifier(void *data)
> > > +{
> > > +	struct cxl_memdev_state *mds = data;
> > > +	struct kernfs_node *state;
> > > +
> > > +	/*
> > > +	 * Prevent new irq triggered invocations of the workqueue and
> > > +	 * flush inflight invocations.
> > > +	 */
> > > +	mutex_lock(&mds->mbox_mutex);
> > > +	state = mds->security.sanitize_node;
> > > +	mds->security.sanitize_node = NULL;
> > > +	mutex_unlock(&mds->mbox_mutex);
> > > +
> > > +	cancel_delayed_work_sync(&mds->security.poll_dwork);
> > > +	sysfs_put(state);
> > > +}
> > > +
> > > +static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> > > +{  
> > 
> > Almost everything in cxl_pci_probe() passes in the mds.
> > Why not do the same here?  
> 
> Because this one really is built on top of a stack of things and needs
> the 'device' because it is tying the device's sysfs attributes to the
> completion notifications of the background workqueue.
> 
> I mentioned this in the cover, but failed to mention it again in this
> changelog:
> 
> "The special wrinkle of the sanitize notifier is that it interacts with
> interrupts, which are enabled early in the flow, and it interacts with
> memdev sysfs which is not initialized until late in the flow."
> 
> There are no sysfs attributes reachable from an @mds.

I'm confused.  This accesses the sysfs stuff via
sec = sysfs_get_dirent(dev->kobj.sd, "security");
where dev = cxlds->dev
and cxlds is embedded in mds.

So from a code point of view I can't see the restriction.
Is it more a semantic issue that it naturally feels better to use
the cxl_mdev?

Jonathan


> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-10-04  1:09     ` Dan Williams
@ 2023-10-04 16:21       ` Davidlohr Bueso
  2023-10-04 18:48         ` Dan Williams
                           ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Davidlohr Bueso @ 2023-10-04 16:21 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, Jonathan Cameron, Dave Jiang, ira.weiny

On Tue, 03 Oct 2023, Dan Williams wrote:

>Davidlohr Bueso wrote:
>> On Fri, 29 Sep 2023, Dan Williams wrote:
>>
>> >+static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
>> >+{
>> >+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
>> >+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>> >+	struct device *dev = cxlds->dev;
>>
>> This wants to be dev = &cxlmd->dev;
>
>No, the notifier needs to be torn down in the cxl_pci teardown path. If
>cxlmd->dev was the devm operations host then this notifier would need to
>be setup from the cxl_mem driver, not cxl_pci. The teardown order of
>cxl_pci ends with @cxlds->dev as the devm host ends up as:
>
>cxl_sanitize_teardown_notifier()
>cxl_memdev_unregister()
>
>...otherwise if the @cxlmd->dev is used then the devm callback may not
>fire until device_release() time since it is possible that the cxl_mem
>driver never attaches to trigger the typical devm action around
>->remove() time.

As is, the first sysfs_get_dirent() fails, fyi.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-10-04 10:12       ` Jonathan Cameron
@ 2023-10-04 18:47         ` Dan Williams
  0 siblings, 0 replies; 34+ messages in thread
From: Dan Williams @ 2023-10-04 18:47 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams
  Cc: linux-cxl, Dave Jiang, Davidlohr Bueso, ira.weiny

Jonathan Cameron wrote:
> On Tue, 3 Oct 2023 17:59:46 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > Jonathan Cameron wrote:
> > > On Fri, 29 Sep 2023 16:09:44 -0700
> > > Dan Williams <dan.j.williams@intel.com> wrote:
> > >   
> > > > Fix a race condition between the mailbox-background command interrupt
> > > > firing and the security-state sysfs attribute being removed.
> > > > 
> > > > The race is difficult to see due to the awkward placement of the
> > > > sanitize-notifier setup code and the multiple places the teardown calls
> > > > are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown().
> > > > 
> > > > Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for
> > > > the paired cxl_sanitize_teardown_notifier() to safely quiet the notifier
> > > > and let the cxl_memdev + irq be unregistered later in the flow.
> > > > 
> > > > This fix is also needed as a preparation fix for a memdev unregistration
> > > > crash.
> > > > 
> > > > Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> > > > Cc: Dave Jiang <dave.jiang@intel.com>
> > > > Cc: Davidlohr Bueso <dave@stgolabs.net>
> > > > Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery")
> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>  
> > > 
> > > One trivial question inline about which parameter to pass in from the
> > > many many interlocking state structures...
> > > 
> > > If you do make the suggested change, it's just complex enough I want another
> > > look so I'm not giving a tag for now.
> > >   
> > > > ---
> > > >  drivers/cxl/core/memdev.c |   42 ----------------------------------------
> > > >  drivers/cxl/pci.c         |   47 +++++++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 47 insertions(+), 42 deletions(-)
> > > > 
> > > > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > > > index 2a7a07f6d165..a950091e5640 100644
> > > > --- a/drivers/cxl/core/memdev.c
> > > > +++ b/drivers/cxl/core/memdev.c
> > > > @@ -556,20 +556,11 @@ void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> > > >  }
> > > >  EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, CXL);
> > > >  
> > > > -static void cxl_memdev_security_shutdown(struct device *dev)
> > > > -{
> > > > -	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > > > -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
> > > > -
> > > > -	cancel_delayed_work_sync(&mds->security.poll_dwork);
> > > > -}
> > > > -
> > > >  static void cxl_memdev_shutdown(struct device *dev)
> > > >  {
> > > >  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > > >  
> > > >  	down_write(&cxl_memdev_rwsem);
> > > > -	cxl_memdev_security_shutdown(dev);
> > > >  	cxlmd->cxlds = NULL;
> > > >  	up_write(&cxl_memdev_rwsem);
> > > >  }
> > > > @@ -1001,35 +992,6 @@ static const struct file_operations cxl_memdev_fops = {
> > > >  	.llseek = noop_llseek,
> > > >  };
> > > >  
> > > > -static void put_sanitize(void *data)
> > > > -{
> > > > -	struct cxl_memdev_state *mds = data;
> > > > -
> > > > -	sysfs_put(mds->security.sanitize_node);
> > > > -}
> > > > -
> > > > -static int cxl_memdev_security_init(struct cxl_memdev *cxlmd)
> > > > -{
> > > > -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > > > -	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> > > > -	struct device *dev = &cxlmd->dev;
> > > > -	struct kernfs_node *sec;
> > > > -
> > > > -	sec = sysfs_get_dirent(dev->kobj.sd, "security");
> > > > -	if (!sec) {
> > > > -		dev_err(dev, "sysfs_get_dirent 'security' failed\n");
> > > > -		return -ENODEV;
> > > > -	}
> > > > -	mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
> > > > -	sysfs_put(sec);
> > > > -	if (!mds->security.sanitize_node) {
> > > > -		dev_err(dev, "sysfs_get_dirent 'state' failed\n");
> > > > -		return -ENODEV;
> > > > -	}
> > > > -
> > > > -	return devm_add_action_or_reset(cxlds->dev, put_sanitize, mds);
> > > > - }
> > > > -
> > > >  struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
> > > >  {
> > > >  	struct cxl_memdev *cxlmd;
> > > > @@ -1058,10 +1020,6 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
> > > >  	if (rc)
> > > >  		goto err;
> > > >  
> > > > -	rc = cxl_memdev_security_init(cxlmd);
> > > > -	if (rc)
> > > > -		goto err;
> > > > -
> > > >  	rc = devm_add_action_or_reset(cxlds->dev, cxl_memdev_unregister, cxlmd);
> > > >  	if (rc)
> > > >  		return ERR_PTR(rc);
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > index ac4e434b0806..b0023e479315 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -165,6 +165,49 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
> > > >  	mutex_unlock(&mds->mbox_mutex);
> > > >  }
> > > >  
> > > > +static void cxl_sanitize_teardown_notifier(void *data)
> > > > +{
> > > > +	struct cxl_memdev_state *mds = data;
> > > > +	struct kernfs_node *state;
> > > > +
> > > > +	/*
> > > > +	 * Prevent new irq triggered invocations of the workqueue and
> > > > +	 * flush inflight invocations.
> > > > +	 */
> > > > +	mutex_lock(&mds->mbox_mutex);
> > > > +	state = mds->security.sanitize_node;
> > > > +	mds->security.sanitize_node = NULL;
> > > > +	mutex_unlock(&mds->mbox_mutex);
> > > > +
> > > > +	cancel_delayed_work_sync(&mds->security.poll_dwork);
> > > > +	sysfs_put(state);
> > > > +}
> > > > +
> > > > +static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> > > > +{  
> > > 
> > > Almost everything in cxl_pci_probe() passes in the mds.
> > > Why not do the same here?  
> > 
> > Because this one really is built on top of a stack of things and needs
> > the 'device' because it is tying the device's sysfs attributes to the
> > completion notifications of the background workqueue.
> > 
> > I mentioned this in the cover, but failed to mention it again in this
> > changelog:
> > 
> > "The special wrinkle of the sanitize notifier is that it interacts with
> > interrupts, which are enabled early in the flow, and it interacts with
> > memdev sysfs which is not initialized until late in the flow."
> > 
> > There are no sysfs attributes reachable from an @mds.
> 
> I'm confused.  This accesses the sysfs stuff via
> sec = sysfs_get_dirent(dev->kobj.sd, "security");
> where dev = cxlds->dev
> and cxlds is embedded in mds.

Ah, no that's a bug I introduced. The dev should be &cxlmd->dev and I
made this mistake when combining the code.

> So from a code point of view I can't see the restriction.
> Is it more a semantic issue that it naturally feels better to use
> the cxl_mdev?

This is also why I wanted a positive test result that I did not
introduce some silly bug, which I did. My typical capture of silly bugs
is cxl_test, but there is no cxl_test infrastructure for sanitize.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-10-04 16:21       ` Davidlohr Bueso
@ 2023-10-04 18:48         ` Dan Williams
  2023-10-04 18:50         ` Dan Williams
  2023-10-04 18:54         ` Dan Williams
  2 siblings, 0 replies; 34+ messages in thread
From: Dan Williams @ 2023-10-04 18:48 UTC (permalink / raw)
  To: Davidlohr Bueso, Dan Williams
  Cc: linux-cxl, Jonathan Cameron, Dave Jiang, ira.weiny

Davidlohr Bueso wrote:
> On Tue, 03 Oct 2023, Dan Williams wrote:
> 
> >Davidlohr Bueso wrote:
> >> On Fri, 29 Sep 2023, Dan Williams wrote:
> >>
> >> >+static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> >> >+{
> >> >+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> >> >+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> >> >+	struct device *dev = cxlds->dev;
> >>
> >> This wants to be dev = &cxlmd->dev;
> >
> >No, the notifier needs to be torn down in the cxl_pci teardown path. If
> >cxlmd->dev was the devm operations host then this notifier would need to
> >be setup from the cxl_mem driver, not cxl_pci. The teardown order of
> >cxl_pci ends with @cxlds->dev as the devm host ends up as:
> >
> >cxl_sanitize_teardown_notifier()
> >cxl_memdev_unregister()
> >
> >...otherwise if the @cxlmd->dev is used then the devm callback may not
> >fire until device_release() time since it is possible that the cxl_mem
> >driver never attaches to trigger the typical devm action around
> >->remove() time.
> 
> As is, the first sysfs_get_dirent() fails, fyi.

Yup, thanks for testing, I am using the wrong device to lookup the sysfs
entry.

What would it take to get your test scenario into cxl_test... or at
least a recipe for it in QEMU that can be checked into the cxl-cli
tests?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-10-04 16:21       ` Davidlohr Bueso
  2023-10-04 18:48         ` Dan Williams
@ 2023-10-04 18:50         ` Dan Williams
  2023-10-04 18:54         ` Dan Williams
  2 siblings, 0 replies; 34+ messages in thread
From: Dan Williams @ 2023-10-04 18:50 UTC (permalink / raw)
  To: Davidlohr Bueso, Dan Williams
  Cc: linux-cxl, Jonathan Cameron, Dave Jiang, ira.weiny

Davidlohr Bueso wrote:
> On Tue, 03 Oct 2023, Dan Williams wrote:
> 
> >Davidlohr Bueso wrote:
> >> On Fri, 29 Sep 2023, Dan Williams wrote:
> >>
> >> >+static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> >> >+{
> >> >+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> >> >+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> >> >+	struct device *dev = cxlds->dev;
> >>
> >> This wants to be dev = &cxlmd->dev;
> >
> >No, the notifier needs to be torn down in the cxl_pci teardown path. If
> >cxlmd->dev was the devm operations host then this notifier would need to
> >be setup from the cxl_mem driver, not cxl_pci. The teardown order of
> >cxl_pci ends with @cxlds->dev as the devm host ends up as:
> >
> >cxl_sanitize_teardown_notifier()
> >cxl_memdev_unregister()
> >
> >...otherwise if the @cxlmd->dev is used then the devm callback may not
> >fire until device_release() time since it is possible that the cxl_mem
> >driver never attaches to trigger the typical devm action around
> >->remove() time.
> 
> As is, the first sysfs_get_dirent() fails, fyi.

I should say that I misread your motivation for changing the dev because
that would also change the devm host, and now I see the bug.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-10-04 16:21       ` Davidlohr Bueso
  2023-10-04 18:48         ` Dan Williams
  2023-10-04 18:50         ` Dan Williams
@ 2023-10-04 18:54         ` Dan Williams
  2023-10-04 19:23           ` Davidlohr Bueso
  2 siblings, 1 reply; 34+ messages in thread
From: Dan Williams @ 2023-10-04 18:54 UTC (permalink / raw)
  To: Davidlohr Bueso, Dan Williams
  Cc: linux-cxl, Jonathan Cameron, Dave Jiang, ira.weiny

Davidlohr Bueso wrote:
> On Tue, 03 Oct 2023, Dan Williams wrote:
> 
> >Davidlohr Bueso wrote:
> >> On Fri, 29 Sep 2023, Dan Williams wrote:
> >>
> >> >+static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> >> >+{
> >> >+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> >> >+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> >> >+	struct device *dev = cxlds->dev;
> >>
> >> This wants to be dev = &cxlmd->dev;
> >
> >No, the notifier needs to be torn down in the cxl_pci teardown path. If
> >cxlmd->dev was the devm operations host then this notifier would need to
> >be setup from the cxl_mem driver, not cxl_pci. The teardown order of
> >cxl_pci ends with @cxlds->dev as the devm host ends up as:
> >
> >cxl_sanitize_teardown_notifier()
> >cxl_memdev_unregister()
> >
> >...otherwise if the @cxlmd->dev is used then the devm callback may not
> >fire until device_release() time since it is possible that the cxl_mem
> >driver never attaches to trigger the typical devm action around
> >->remove() time.
> 
> As is, the first sysfs_get_dirent() fails, fyi.

Here is the full fix, apologies for misreading your fix earlier:

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index f69f6195c465..a84932ad9392 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -187,25 +187,25 @@ static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
 {
        struct cxl_dev_state *cxlds = cxlmd->cxlds;
        struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
-       struct device *dev = cxlds->dev;
+       struct device *host = cxlds->dev;
        struct kernfs_node *sec;
 
        if (!test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
                return 0;
 
-       sec = sysfs_get_dirent(dev->kobj.sd, "security");
+       sec = sysfs_get_dirent(cxlmd->dev.kobj.sd, "security");
        if (!sec) {
-               dev_err(dev, "sanitize notification setup failure\n");
+               dev_err(host, "sanitize notification setup failure\n");
                return -ENOENT;
        }
        mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
        sysfs_put(sec);
        if (!mds->security.sanitize_node) {
-               dev_err(dev, "sanitize notification setup failure\n");
+               dev_err(host, "sanitize notification setup failure\n");
                return -ENOENT;
        }
 
-       return devm_add_action_or_reset(dev, cxl_sanitize_teardown_notifier, mds);
+       return devm_add_action_or_reset(host, cxl_sanitize_teardown_notifier, mds);
 }
 
 /**


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
  2023-10-04 18:54         ` Dan Williams
@ 2023-10-04 19:23           ` Davidlohr Bueso
  0 siblings, 0 replies; 34+ messages in thread
From: Davidlohr Bueso @ 2023-10-04 19:23 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, Jonathan Cameron, Dave Jiang, ira.weiny

On Wed, 04 Oct 2023, Dan Williams wrote:

>Davidlohr Bueso wrote:
>> On Tue, 03 Oct 2023, Dan Williams wrote:
>>
>> >Davidlohr Bueso wrote:
>> >> On Fri, 29 Sep 2023, Dan Williams wrote:
>> >>
>> >> >+static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
>> >> >+{
>> >> >+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
>> >> >+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>> >> >+	struct device *dev = cxlds->dev;
>> >>
>> >> This wants to be dev = &cxlmd->dev;
>> >
>> >No, the notifier needs to be torn down in the cxl_pci teardown path. If
>> >cxlmd->dev was the devm operations host then this notifier would need to
>> >be setup from the cxl_mem driver, not cxl_pci. The teardown order of
>> >cxl_pci ends with @cxlds->dev as the devm host ends up as:
>> >
>> >cxl_sanitize_teardown_notifier()
>> >cxl_memdev_unregister()
>> >
>> >...otherwise if the @cxlmd->dev is used then the devm callback may not
>> >fire until device_release() time since it is possible that the cxl_mem
>> >driver never attaches to trigger the typical devm action around
>> >->remove() time.
>>
>> As is, the first sysfs_get_dirent() fails, fyi.
>
>Here is the full fix, apologies for misreading your fix earlier:

Yep, looks good (and tested).

>
>diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>index f69f6195c465..a84932ad9392 100644
>--- a/drivers/cxl/pci.c
>+++ b/drivers/cxl/pci.c
>@@ -187,25 +187,25 @@ static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
> {
>        struct cxl_dev_state *cxlds = cxlmd->cxlds;
>        struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>-       struct device *dev = cxlds->dev;
>+       struct device *host = cxlds->dev;
>        struct kernfs_node *sec;
>
>        if (!test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
>                return 0;
>
>-       sec = sysfs_get_dirent(dev->kobj.sd, "security");
>+       sec = sysfs_get_dirent(cxlmd->dev.kobj.sd, "security");
>        if (!sec) {
>-               dev_err(dev, "sanitize notification setup failure\n");
>+               dev_err(host, "sanitize notification setup failure\n");
>                return -ENOENT;
>        }
>        mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
>        sysfs_put(sec);
>        if (!mds->security.sanitize_node) {
>-               dev_err(dev, "sanitize notification setup failure\n");
>+               dev_err(host, "sanitize notification setup failure\n");
>                return -ENOENT;
>        }
>
>-       return devm_add_action_or_reset(dev, cxl_sanitize_teardown_notifier, mds);
>+       return devm_add_action_or_reset(host, cxl_sanitize_teardown_notifier, mds);
> }
>
> /**
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2023-10-04 19:23 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-29 23:09 [PATCH v2 0/4] cxl/mem: Fix shutdown order Dan Williams
2023-09-29 23:09 ` [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work Dan Williams
2023-09-29 23:41   ` Ira Weiny
2023-10-02  9:55   ` Jonathan Cameron
2023-10-02 15:27   ` Davidlohr Bueso
2023-10-02 16:48   ` Dave Jiang
2023-09-29 23:09 ` [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll Dan Williams
2023-09-29 23:49   ` Ira Weiny
2023-09-29 23:51   ` Ira Weiny
2023-10-02 10:02   ` Jonathan Cameron
2023-10-04  0:55     ` Dan Williams
2023-10-02 15:49   ` Davidlohr Bueso
2023-10-04  1:01     ` Dan Williams
2023-10-04  1:13       ` Davidlohr Bueso
2023-10-02 16:57   ` Dave Jiang
2023-09-29 23:09 ` [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup Dan Williams
2023-09-30  2:42   ` Ira Weiny
2023-10-02 10:10   ` Jonathan Cameron
2023-10-04  0:59     ` Dan Williams
2023-10-04 10:12       ` Jonathan Cameron
2023-10-04 18:47         ` Dan Williams
2023-10-02 16:59   ` Dave Jiang
2023-10-04  0:52   ` Davidlohr Bueso
2023-10-04  1:09     ` Dan Williams
2023-10-04 16:21       ` Davidlohr Bueso
2023-10-04 18:48         ` Dan Williams
2023-10-04 18:50         ` Dan Williams
2023-10-04 18:54         ` Dan Williams
2023-10-04 19:23           ` Davidlohr Bueso
2023-09-29 23:09 ` [PATCH v2 4/4] cxl/mem: Fix shutdown order Dan Williams
2023-09-29 23:52   ` Ira Weiny
2023-10-02 10:11     ` Jonathan Cameron
2023-10-02 16:59   ` Dave Jiang
2023-10-03 17:40   ` Davidlohr Bueso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox