* [PATCH 0/7] Introduce DEFINE_ACQUIRE(), a scoped_cond_guard() replacement
@ 2025-05-07 7:21 Dan Williams
2025-05-07 7:21 ` [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking Dan Williams
` (6 more replies)
0 siblings, 7 replies; 29+ messages in thread
From: Dan Williams @ 2025-05-07 7:21 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, Alison Schofield, Dave Jiang, David Lechner,
Davidlohr Bueso, Fabio M. De Francesco, Ingo Molnar, Ira Weiny,
Jonathan Cameron, Linus Torvalds, Peter Zijlstra, Vishal Verma
As detailed in patch1, scoped_cond_guard() has some usability warts. The
"if_not_guard()" effort tried to improve upon the situation, but was
buggy and difficult to fix.
It turns out, however, that CLASS() already has the necessary semantics to
address both the scoped_cond_guard() and if_not_guard() problems.
CLASS() can be used to define an auto variable with a constructor
(lock()), and a destructor (unlock()). This is what guard() does with a
hidden variable for unconditional locks. For conditional locks, the
variable is simply unhidden and evaluated with IS_ERR() to determine if
the lock acquisition was successful.
The proposal goes one step further and forces conversions to this new
scheme to be type-safe. So, if a subsystem using a mutex wants to use
scope-based unlock for mutex_lock_interruptible() it needs to convert to
the 'struct mutex_acquire' object and convert the entirety of the
subsystem to using CLASS(), guard(), and/or scoped_guard() helpers.
Note, scoped_cond_guard() is not defined for 'struct mutex_acquire'.
The rework needed to accommodate type-safety enforcement is what makes
this series 7 patches instead of 2. It converts all existing 'struct rw_semaphore'
usage in the CXL subsystem to 'struct rw_semaphore_acquire'. That
requires cleaning up some "reverse" locking patterns and "unlock in the
middle of the function" patterns. The result is smaller and easier to
reason about, once familiarity with CLASS() is established.
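For anyone unfamiliar with scope-based cleanup, the pattern above can be
sketched in userspace with the GCC/Clang cleanup attribute and pthreads.
This is only an illustrative analogue, not the kernel implementation; all
names (err_ptr, try_acquire, do_work, etc.) are hypothetical:

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Minimal stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers. */
static inline void *err_ptr(long err) { return (void *)err; }
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* "Destructor": drops the lock only if acquisition succeeded. */
static void lock_cleanup(pthread_mutex_t **lock)
{
	if (*lock && !is_err(*lock))
		pthread_mutex_unlock(*lock);
}

/* "Constructor": conditional acquisition returns the lock or an error. */
static pthread_mutex_t *try_acquire(pthread_mutex_t *lock)
{
	int rc = pthread_mutex_trylock(lock);

	return rc ? err_ptr(-rc) : lock;
}

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

static int do_work(void)
{
	__attribute__((cleanup(lock_cleanup))) pthread_mutex_t *lock =
		try_acquire(&demo_lock);

	if (is_err(lock))
		return ptr_err(lock);
	/* lock held here; dropped automatically when 'lock' leaves scope */
	return 0;
}
```

The unhidden 'lock' variable is the point: it conveys the acquisition
result to the caller while the unlock remains tied to scope exit.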
Dan Williams (7):
cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
cxl/decoder: Move decoder register programming to a helper
cxl/decoder: Drop pointless locking
cxl/region: Split commit_store() into __commit() and queue_reset()
helpers
cxl/region: Move ready-to-probe state check to a helper
cxl/region: Introduce CLASS(cxl_decoder_detach...) consolidate
multiple paths
cleanup: Create an rwsem conditional acquisition class
drivers/cxl/core/cdat.c | 6 +-
drivers/cxl/core/core.h | 49 +++-
drivers/cxl/core/hdm.c | 115 +++++-----
drivers/cxl/core/mbox.c | 15 +-
drivers/cxl/core/memdev.c | 62 ++---
drivers/cxl/core/port.c | 18 +-
drivers/cxl/core/region.c | 463 ++++++++++++++++++--------------------
drivers/cxl/cxlmem.h | 2 +-
include/linux/cleanup.h | 62 +++++
include/linux/mutex.h | 24 ++
include/linux/rwsem.h | 37 +++
11 files changed, 491 insertions(+), 362 deletions(-)
base-commit: b4432656b36e5cc1d50a1f2dc15357543add530e
--
2.49.0
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-07 7:21 [PATCH 0/7] Introduce DEFINE_ACQUIRE(), a scoped_cond_guard() replacement Dan Williams
@ 2025-05-07 7:21 ` Dan Williams
2025-05-07 9:32 ` Peter Zijlstra
2025-05-09 19:10 ` kernel test robot
2025-05-07 7:21 ` [PATCH 2/7] cxl/decoder: Move decoder register programming to a helper Dan Williams
` (5 subsequent siblings)
6 siblings, 2 replies; 29+ messages in thread
From: Dan Williams @ 2025-05-07 7:21 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, David Lechner, Peter Zijlstra, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
The scoped_cond_guard() helper has two problems. First, with "convert to
cleanup" style patches, it results in messy diffs that re-indent large
swaths of code. Second, it swallows the return values of interruptible /
killable lock attempts, which are not always -EINTR.
The last attempt at improving this situation, if_not_guard() [1], was
reverted with a suggestion:
"I really think the basic issue is that 'cond_guard' itself is a pretty
broken concept. It simply doesn't work very well in the C syntax."
However, the observation is that simply exposing the CLASS() to callers
solves all three problems: the safety concern (compiler-enforced
"expected expression" checks), conveying the conditional acquisition
state of the lock, and the re-indentation caused by scoped_cond_guard().
See the commit below for the analysis behind the revert.
Commit b4d83c8323b0 ("headers/cleanup.h: Remove the if_not_guard() facility")
The DEFINE_ACQUIRE() macro wraps a lock type like 'struct mutex' into a
'struct mutex_acquire' type that can only be acquired and automatically
released by scope-based helpers. E.g.:
[scoped_]guard(mutex_acquire)(...)
CLASS(mutex_intr_acquire, ...)
CLASS(mutex_try_acquire, ...)
Use DEFINE_ACQUIRE() to create the new classes above and use
mutex_intr_acquire in the CXL subsystem to demonstrate the conversion.
Link: https://lore.kernel.org/all/CAHk-=whn07tnDosPfn+UcAtWHBcLg=KqA16SHVv0GV4t8P1fHw@mail.gmail.com [1]
Cc: David Lechner <dlechner@baylibre.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Fabio M. De Francesco" <fabio.maria.de.francesco@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
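A note on the mutex_try_or_busy() pattern in the diff below:
mutex_trylock() returns 1 on success and 0 on contention, and the array
index maps that boolean onto { -EBUSY, 0 } without a branch. A standalone
userspace sketch of the same trick (the demo_* stubs are hypothetical
stand-ins for real lock ops):

```c
#include <errno.h>

/* Demo lock ops: like mutex_trylock(), return 1 on success, 0 on failure. */
static int demo_trylock_success(void *lock) { (void)lock; return 1; }
static int demo_trylock_fail(void *lock) { (void)lock; return 0; }

/*
 * Map a boolean trylock result to an errno-style return without a
 * branch: index 0 (failure) yields -EBUSY, index 1 (success) yields 0.
 */
static int try_or_busy(int (*try_fn)(void *), void *lock)
{
	static const int ret[] = { -EBUSY, 0 };

	return ret[!!try_fn(lock)];
}
```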
drivers/cxl/core/mbox.c | 9 +++---
drivers/cxl/cxlmem.h | 2 +-
include/linux/cleanup.h | 62 +++++++++++++++++++++++++++++++++++++++++
include/linux/mutex.h | 24 ++++++++++++++++
4 files changed, 91 insertions(+), 6 deletions(-)
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index d72764056ce6..cec9dfb22567 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1394,9 +1394,9 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
int nr_records = 0;
int rc;
- rc = mutex_lock_interruptible(&mds->poison.lock);
- if (rc)
- return rc;
+ CLASS(mutex_intr_acquire, lock)(&mds->poison.lock);
+ if (IS_ERR(lock))
+ return PTR_ERR(lock);
po = mds->poison.list_out;
pi.offset = cpu_to_le64(offset);
@@ -1430,7 +1430,6 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
}
} while (po->flags & CXL_POISON_FLAG_MORE);
- mutex_unlock(&mds->poison.lock);
return rc;
}
EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, "CXL");
@@ -1466,7 +1465,7 @@ int cxl_poison_state_init(struct cxl_memdev_state *mds)
return rc;
}
- mutex_init(&mds->poison.lock);
+ mutex_acquire_init(&mds->poison.lock);
return 0;
}
EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, "CXL");
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 3ec6b906371b..9b4ab5d1a7c4 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -257,7 +257,7 @@ struct cxl_poison_state {
u32 max_errors;
DECLARE_BITMAP(enabled_cmds, CXL_POISON_ENABLED_MAX);
struct cxl_mbox_poison_out *list_out;
- struct mutex lock; /* Protect reads of poison list */
+ struct mutex_acquire lock; /* Protect reads of poison list */
};
/*
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index 7e57047e1564..926b95daa4b5 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -424,5 +424,67 @@ __DEFINE_LOCK_GUARD_0(_name, _lock)
static inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
{ return class_##_name##_lock_ptr(_T); }
+/*
+ * DEFINE_ACQUIRE(acquire_class_name, lock_type, unlock, cond_lock):
+ * Define a CLASS() that instantiates and acquires a conditional lock
+ * within an existing scope. In contrast to DEFINE_GUARD[_COND](), which
+ * hides the variable tracking the lock scope, CLASS(@acquire_class_name,
+ * @lock) instantiates @lock as either an ERR_PTR() or a cookie that drops
+ * the lock when it goes out of scope. An "_acquire" suffix is appended to
+ * @lock_type to provide type-safety against mixing explicit and implicit
+ * (scope-based) cleanup.
+ *
+ * Ex.
+ *
+ * DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
+ * mutex_lock_interruptible)
+ *
+ * int interruptible_operation(...)
+ * {
+ * ...
+ * CLASS(mutex_intr_acquire, lock)(&obj->lock);
+ * if (IS_ERR(lock))
+ * return PTR_ERR(lock);
+ * ...
+ * } <= obj->lock dropped here.
+ *
+ * Attempts to perform:
+ *
+ * mutex_unlock(&obj->lock);
+ *
+ * ...fail because obj->lock is a 'struct mutex_acquire' not 'struct mutex'
+ * instance.
+ *
+ * Also, attempts to use the CLASS() conditionally require the ambiguous
+ * scope to be clarified (compiler enforced):
+ *
+ * if (...)
+ * CLASS(mutex_intr_acquire, lock)(&obj->lock); // <-- "error: expected expression"
+ * if (IS_ERR(lock))
+ * return PTR_ERR(lock);
+ *
+ * vs:
+ *
+ * if (...) {
+ * CLASS(mutex_intr_acquire, lock)(&obj->lock);
+ * if (IS_ERR(lock))
+ * return PTR_ERR(lock);
+ * } // <-- lock released here
+ */
+#define DEFINE_ACQUIRE(_name, _locktype, _unlock, _cond_lock) \
+ DEFINE_CLASS(_name, struct _locktype##_acquire *, \
+ if (!IS_ERR_OR_NULL(_T)) _unlock(&_T->_locktype), ({ \
+ struct _locktype##_acquire *lock_result; \
+ int ret = _cond_lock(&to_lock->_locktype); \
+ \
+ if (ret) \
+ lock_result = ERR_PTR(ret); \
+ else \
+ lock_result = \
+ (struct _locktype##_acquire \
+ *)&to_lock->_locktype; \
+ lock_result; \
+ }), \
+ struct _locktype##_acquire *to_lock)
#endif /* _LINUX_CLEANUP_H */
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2143d05116be..283111f43b0f 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -64,6 +64,8 @@ do { \
__mutex_init((mutex), #mutex, &__key); \
} while (0)
+#define mutex_acquire_init(lock) mutex_init(&(lock)->mutex)
+
/**
* mutex_init_with_key - initialize a mutex with a given lockdep key
* @mutex: the mutex to be initialized
@@ -202,6 +204,28 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
+/* mutex type that only implements scope-based unlock */
+struct mutex_acquire {
+ /* private: */
+ struct mutex mutex;
+};
+DEFINE_GUARD(mutex_acquire, struct mutex_acquire *, mutex_lock(&_T->mutex),
+ mutex_unlock(&_T->mutex))
+DEFINE_GUARD_COND(mutex_acquire, _try, mutex_trylock(&_T->mutex))
+DEFINE_GUARD_COND(mutex_acquire, _intr, mutex_lock_interruptible(&_T->mutex) == 0)
+DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
+ mutex_lock_interruptible)
+
+static inline int mutex_try_or_busy(struct mutex *lock)
+{
+ int ret[] = { -EBUSY, 0 };
+
+ return ret[mutex_trylock(lock)];
+}
+
+DEFINE_ACQUIRE(mutex_try_acquire, mutex, mutex_unlock,
+ mutex_try_or_busy)
+
extern unsigned long mutex_get_owner(struct mutex *lock);
#endif /* __LINUX_MUTEX_H */
--
2.49.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 2/7] cxl/decoder: Move decoder register programming to a helper
2025-05-07 7:21 [PATCH 0/7] Introduce DEFINE_ACQUIRE(), a scoped_cond_guard() replacement Dan Williams
2025-05-07 7:21 ` [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking Dan Williams
@ 2025-05-07 7:21 ` Dan Williams
2025-05-07 7:21 ` [PATCH 3/7] cxl/decoder: Drop pointless locking Dan Williams
` (4 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2025-05-07 7:21 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny
In preparation for converting to rw_semaphore_acquire semantics, move the
contents of an open-coded {down,up}_read(&cxl_dpa_rwsem) section to a
helper function.
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/cxl/core/hdm.c | 77 +++++++++++++++++++++++-------------------
1 file changed, 42 insertions(+), 35 deletions(-)
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 70cae4ebf8a4..418539e859e3 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -763,14 +763,53 @@ static int cxld_await_commit(void __iomem *hdm, int id)
return -ETIMEDOUT;
}
+static void setup_hw_decoder(struct cxl_decoder *cxld, void __iomem *hdm)
+{
+ int id = cxld->id;
+ u64 base, size;
+ u32 ctrl;
+
+ /* common decoder settings */
+ ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
+ cxld_set_interleave(cxld, &ctrl);
+ cxld_set_type(cxld, &ctrl);
+ base = cxld->hpa_range.start;
+ size = range_len(&cxld->hpa_range);
+
+ writel(upper_32_bits(base), hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(id));
+ writel(lower_32_bits(base), hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id));
+ writel(upper_32_bits(size), hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(id));
+ writel(lower_32_bits(size), hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(id));
+
+ if (is_switch_decoder(&cxld->dev)) {
+ struct cxl_switch_decoder *cxlsd =
+ to_cxl_switch_decoder(&cxld->dev);
+ void __iomem *tl_hi = hdm + CXL_HDM_DECODER0_TL_HIGH(id);
+ void __iomem *tl_lo = hdm + CXL_HDM_DECODER0_TL_LOW(id);
+ u64 targets;
+
+ cxlsd_set_targets(cxlsd, &targets);
+ writel(upper_32_bits(targets), tl_hi);
+ writel(lower_32_bits(targets), tl_lo);
+ } else {
+ struct cxl_endpoint_decoder *cxled =
+ to_cxl_endpoint_decoder(&cxld->dev);
+ void __iomem *sk_hi = hdm + CXL_HDM_DECODER0_SKIP_HIGH(id);
+ void __iomem *sk_lo = hdm + CXL_HDM_DECODER0_SKIP_LOW(id);
+
+ writel(upper_32_bits(cxled->skip), sk_hi);
+ writel(lower_32_bits(cxled->skip), sk_lo);
+ }
+
+ writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+}
+
static int cxl_decoder_commit(struct cxl_decoder *cxld)
{
struct cxl_port *port = to_cxl_port(cxld->dev.parent);
struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
void __iomem *hdm = cxlhdm->regs.hdm_decoder;
int id = cxld->id, rc;
- u64 base, size;
- u32 ctrl;
if (cxld->flags & CXL_DECODER_F_ENABLE)
return 0;
@@ -803,39 +842,7 @@ static int cxl_decoder_commit(struct cxl_decoder *cxld)
}
down_read(&cxl_dpa_rwsem);
- /* common decoder settings */
- ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
- cxld_set_interleave(cxld, &ctrl);
- cxld_set_type(cxld, &ctrl);
- base = cxld->hpa_range.start;
- size = range_len(&cxld->hpa_range);
-
- writel(upper_32_bits(base), hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(id));
- writel(lower_32_bits(base), hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id));
- writel(upper_32_bits(size), hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(id));
- writel(lower_32_bits(size), hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(id));
-
- if (is_switch_decoder(&cxld->dev)) {
- struct cxl_switch_decoder *cxlsd =
- to_cxl_switch_decoder(&cxld->dev);
- void __iomem *tl_hi = hdm + CXL_HDM_DECODER0_TL_HIGH(id);
- void __iomem *tl_lo = hdm + CXL_HDM_DECODER0_TL_LOW(id);
- u64 targets;
-
- cxlsd_set_targets(cxlsd, &targets);
- writel(upper_32_bits(targets), tl_hi);
- writel(lower_32_bits(targets), tl_lo);
- } else {
- struct cxl_endpoint_decoder *cxled =
- to_cxl_endpoint_decoder(&cxld->dev);
- void __iomem *sk_hi = hdm + CXL_HDM_DECODER0_SKIP_HIGH(id);
- void __iomem *sk_lo = hdm + CXL_HDM_DECODER0_SKIP_LOW(id);
-
- writel(upper_32_bits(cxled->skip), sk_hi);
- writel(lower_32_bits(cxled->skip), sk_lo);
- }
-
- writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+ setup_hw_decoder(cxld, hdm);
up_read(&cxl_dpa_rwsem);
port->commit_end++;
--
2.49.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 3/7] cxl/decoder: Drop pointless locking
2025-05-07 7:21 [PATCH 0/7] Introduce DEFINE_ACQUIRE(), a scoped_cond_guard() replacement Dan Williams
2025-05-07 7:21 ` [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking Dan Williams
2025-05-07 7:21 ` [PATCH 2/7] cxl/decoder: Move decoder register programming to a helper Dan Williams
@ 2025-05-07 7:21 ` Dan Williams
2025-05-07 7:21 ` [PATCH 4/7] cxl/region: Split commit_store() into __commit() and queue_reset() helpers Dan Williams
` (3 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2025-05-07 7:21 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny
cxl_dpa_rwsem coordinates changes to dpa allocation settings for a given
decoder. cxl_decoder_reset() has no need for a consistent snapshot of the
dpa settings since it is merely clearing out whatever was there previously.
Otherwise, cxl_region_rwsem protects against 'reset' racing 'setup'.
In preparation for converting to rw_semaphore_acquire semantics, drop this
locking.
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/cxl/core/hdm.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 418539e859e3..1c195f495a59 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -913,7 +913,6 @@ static void cxl_decoder_reset(struct cxl_decoder *cxld)
"%s: out of order reset, expected decoder%d.%d\n",
dev_name(&cxld->dev), port->id, port->commit_end);
- down_read(&cxl_dpa_rwsem);
ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
ctrl &= ~CXL_HDM_DECODER0_CTRL_COMMIT;
writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
@@ -922,7 +921,6 @@ static void cxl_decoder_reset(struct cxl_decoder *cxld)
writel(0, hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(id));
writel(0, hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(id));
writel(0, hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id));
- up_read(&cxl_dpa_rwsem);
cxld->flags &= ~CXL_DECODER_F_ENABLE;
--
2.49.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 4/7] cxl/region: Split commit_store() into __commit() and queue_reset() helpers
2025-05-07 7:21 [PATCH 0/7] Introduce DEFINE_ACQUIRE(), a scoped_cond_guard() replacement Dan Williams
` (2 preceding siblings ...)
2025-05-07 7:21 ` [PATCH 3/7] cxl/decoder: Drop pointless locking Dan Williams
@ 2025-05-07 7:21 ` Dan Williams
2025-05-07 7:21 ` [PATCH 5/7] cxl/region: Move ready-to-probe state check to a helper Dan Williams
` (2 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2025-05-07 7:21 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny
Remove the complexity of dropping the lock mid-operation: split the
commit operation into a helper and leave all the complexities of
"decommit", with its different locking contexts, for commit_store() to
coordinate.
CPU cache invalidation in the decommit path is now handled solely by
cxl_region_decode_reset(). Previously, the CPU caches were needlessly
flushed twice in the decommit path, where the first flush had no
guarantee that the memory would not be immediately re-dirtied.
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/cxl/core/region.c | 99 +++++++++++++++++++++++++++------------
1 file changed, 70 insertions(+), 29 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c3f4dc244df7..d52e98e2f7a3 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -350,30 +350,42 @@ static int cxl_region_decode_commit(struct cxl_region *cxlr)
return rc;
}
-static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
- const char *buf, size_t len)
+static int queue_reset(struct cxl_region *cxlr)
{
- struct cxl_region *cxlr = to_cxl_region(dev);
struct cxl_region_params *p = &cxlr->params;
- bool commit;
- ssize_t rc;
+ int rc;
- rc = kstrtobool(buf, &commit);
+ rc = down_write_killable(&cxl_region_rwsem);
if (rc)
return rc;
+ /* Already in the requested state? */
+ if (p->state < CXL_CONFIG_COMMIT)
+ goto out;
+
+ p->state = CXL_CONFIG_RESET_PENDING;
+
+out:
+ up_write(&cxl_region_rwsem);
+
+ return rc;
+}
+
+static int __commit(struct cxl_region *cxlr)
+{
+ struct cxl_region_params *p = &cxlr->params;
+ int rc;
+
rc = down_write_killable(&cxl_region_rwsem);
if (rc)
return rc;
/* Already in the requested state? */
- if (commit && p->state >= CXL_CONFIG_COMMIT)
- goto out;
- if (!commit && p->state < CXL_CONFIG_COMMIT)
+ if (p->state >= CXL_CONFIG_COMMIT)
goto out;
/* Not ready to commit? */
- if (commit && p->state < CXL_CONFIG_ACTIVE) {
+ if (p->state < CXL_CONFIG_ACTIVE) {
rc = -ENXIO;
goto out;
}
@@ -386,31 +398,60 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
if (rc)
goto out;
- if (commit) {
- rc = cxl_region_decode_commit(cxlr);
- if (rc == 0)
- p->state = CXL_CONFIG_COMMIT;
- } else {
- p->state = CXL_CONFIG_RESET_PENDING;
- up_write(&cxl_region_rwsem);
- device_release_driver(&cxlr->dev);
- down_write(&cxl_region_rwsem);
-
- /*
- * The lock was dropped, so need to revalidate that the reset is
- * still pending.
- */
- if (p->state == CXL_CONFIG_RESET_PENDING) {
- cxl_region_decode_reset(cxlr, p->interleave_ways);
- p->state = CXL_CONFIG_ACTIVE;
- }
- }
+ rc = cxl_region_decode_commit(cxlr);
+ if (rc == 0)
+ p->state = CXL_CONFIG_COMMIT;
out:
up_write(&cxl_region_rwsem);
+ return rc;
+}
+
+static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct cxl_region *cxlr = to_cxl_region(dev);
+ struct cxl_region_params *p = &cxlr->params;
+ bool commit;
+ ssize_t rc;
+
+ rc = kstrtobool(buf, &commit);
if (rc)
return rc;
+
+ if (commit) {
+ rc = __commit(cxlr);
+ if (rc)
+ return rc;
+ return len;
+ }
+
+ rc = queue_reset(cxlr);
+ if (rc)
+ return rc;
+
+ /*
+ * Unmap the region and depend on the reset-pending state to ensure
+ * it does not go active again until after the reset
+ */
+ device_release_driver(&cxlr->dev);
+
+ /*
+ * With the reset pending take cxl_region_rwsem unconditionally
+ * to ensure the reset gets handled before returning.
+ */
+ guard(rwsem_write)(&cxl_region_rwsem);
+
+ /*
+ * Revalidate that the reset is still pending in case another
+ * thread already handled this reset.
+ */
+ if (p->state == CXL_CONFIG_RESET_PENDING) {
+ cxl_region_decode_reset(cxlr, p->interleave_ways);
+ p->state = CXL_CONFIG_ACTIVE;
+ }
+
return len;
}
--
2.49.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 5/7] cxl/region: Move ready-to-probe state check to a helper
2025-05-07 7:21 [PATCH 0/7] Introduce DEFINE_ACQUIRE(), a scoped_cond_guard() replacement Dan Williams
` (3 preceding siblings ...)
2025-05-07 7:21 ` [PATCH 4/7] cxl/region: Split commit_store() into __commit() and queue_reset() helpers Dan Williams
@ 2025-05-07 7:21 ` Dan Williams
2025-05-07 7:21 ` [PATCH 6/7] cxl/region: Introduce CLASS(cxl_decoder_detach...) consolidate multiple paths Dan Williams
2025-05-07 7:21 ` [PATCH 7/7] cleanup: Create an rwsem conditional acquisition class Dan Williams
6 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2025-05-07 7:21 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny
Rather than unlocking the region rwsem in the middle of cxl_region_probe(),
create a helper for determining when the region is ready-to-probe.
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/cxl/core/region.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index d52e98e2f7a3..11448824ddd4 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3529,9 +3529,8 @@ static void shutdown_notifiers(void *_cxlr)
unregister_mt_adistance_algorithm(&cxlr->adist_notifier);
}
-static int cxl_region_probe(struct device *dev)
+static int cxl_region_can_probe(struct cxl_region *cxlr)
{
- struct cxl_region *cxlr = to_cxl_region(dev);
struct cxl_region_params *p = &cxlr->params;
int rc;
@@ -3554,15 +3553,28 @@ static int cxl_region_probe(struct device *dev)
goto out;
}
- /*
- * From this point on any path that changes the region's state away from
- * CXL_CONFIG_COMMIT is also responsible for releasing the driver.
- */
out:
up_read(&cxl_region_rwsem);
if (rc)
return rc;
+ return 0;
+}
+
+static int cxl_region_probe(struct device *dev)
+{
+ struct cxl_region *cxlr = to_cxl_region(dev);
+ struct cxl_region_params *p = &cxlr->params;
+ int rc;
+
+ rc = cxl_region_can_probe(cxlr);
+ if (rc)
+ return rc;
+
+ /*
+ * From this point on any path that changes the region's state away from
+ * CXL_CONFIG_COMMIT is also responsible for releasing the driver.
+ */
cxlr->memory_notifier.notifier_call = cxl_region_perf_attrs_callback;
cxlr->memory_notifier.priority = CXL_CALLBACK_PRI;
--
2.49.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 6/7] cxl/region: Introduce CLASS(cxl_decoder_detach...) consolidate multiple paths
2025-05-07 7:21 [PATCH 0/7] Introduce DEFINE_ACQUIRE(), a scoped_cond_guard() replacement Dan Williams
` (4 preceding siblings ...)
2025-05-07 7:21 ` [PATCH 5/7] cxl/region: Move ready-to-probe state check to a helper Dan Williams
@ 2025-05-07 7:21 ` Dan Williams
2025-05-08 7:44 ` kernel test robot
2025-05-07 7:21 ` [PATCH 7/7] cleanup: Create an rwsem conditional acquisition class Dan Williams
6 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2025-05-07 7:21 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny
Both detach_target() and cxld_unregister() want to tear down a cxl_region
when an endpoint decoder is either detached or destroyed.
When a region is to be destroyed, cxl_decoder_detach() releases
cxl_region_rwsem, unbinds the cxl_region driver, and re-acquires the
rwsem. This "reverse" locking pattern is difficult to reason about and not
amenable to scope-based cleanup, and the minor differences in the calling
convention of cxl_decoder_detach() currently result in the
cxl_decoder_kill_region() wrapper.
Introduce CLASS(cxl_decoder_detach...) which creates an object that moves
the post-detach cleanup work to a destructor, and consolidates minor
preamble differences in the constructor.
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
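The constructor/destructor split above can be sketched in userspace with
the cleanup attribute: the constructor hands back a cookie, and the
follow-on release work runs in the destructor when the cookie leaves
scope. This is an illustrative analogue only; detach_cookie and the other
names are hypothetical:

```c
#include <stdlib.h>

/* Cookie returned by the "detach" constructor. */
struct detach_cookie {
	int region_id;	/* stand-in for the detached region */
	int *released;	/* records that the post-detach work ran */
};

/* "Destructor": performs the post-detach work at scope exit. */
static void detach_cleanup(struct detach_cookie **cp)
{
	struct detach_cookie *c = *cp;

	if (!c)
		return;
	*c->released = c->region_id;	/* post-detach work, outside the lock */
	free(c);
}

/* "Constructor": does the locked teardown and returns the cookie. */
static struct detach_cookie *detach(int region_id, int *released)
{
	struct detach_cookie *c = malloc(sizeof(*c));

	if (c) {
		c->region_id = region_id;
		c->released = released;
	}
	return c;
}

static int demo_detach(void)
{
	int released = 0;

	{
		__attribute__((cleanup(detach_cleanup)))
		struct detach_cookie *c = detach(7, &released);

		(void)c;	/* locked work would happen here */
	}
	/* the destructor has run by this point */
	return released;
}
```

Moving the "release the driver" step into the destructor is what removes
the unlock/relock dance from the middle of the function.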
drivers/cxl/core/core.h | 43 ++++++++++++++++++-
drivers/cxl/core/port.c | 6 +--
drivers/cxl/core/region.c | 88 ++++++++++++++++++---------------------
3 files changed, 83 insertions(+), 54 deletions(-)
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 17b692eb3257..44b09552f44e 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -12,6 +12,11 @@ extern const struct device_type cxl_pmu_type;
extern struct attribute_group cxl_base_attribute_group;
+enum cxl_detach_mode {
+ DETACH_ONLY,
+ DETACH_INVALIDATE,
+};
+
#ifdef CONFIG_CXL_REGION
extern struct device_attribute dev_attr_create_pmem_region;
extern struct device_attribute dev_attr_create_ram_region;
@@ -20,7 +25,11 @@ extern struct device_attribute dev_attr_region;
extern const struct device_type cxl_pmem_region_type;
extern const struct device_type cxl_dax_region_type;
extern const struct device_type cxl_region_type;
-void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled);
+
+struct cxl_region *cxl_decoder_detach(struct cxl_region *cxlr,
+ struct cxl_endpoint_decoder *cxled,
+ int pos, enum cxl_detach_mode mode);
+
#define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
#define CXL_REGION_TYPE(x) (&cxl_region_type)
#define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
@@ -48,7 +57,9 @@ static inline int cxl_get_poison_by_endpoint(struct cxl_port *port)
{
return 0;
}
-static inline void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
+static inline struct cxl_region *
+cxl_decoder_detach(struct cxl_region *cxlr, struct cxl_endpoint_decoder *cxled,
+ int pos, enum cxl_detach_mode mode)
{
}
static inline int cxl_region_init(void)
@@ -99,6 +110,34 @@ u16 cxl_rcrb_to_aer(struct device *dev, resource_size_t rcrb);
extern struct rw_semaphore cxl_dpa_rwsem;
extern struct rw_semaphore cxl_region_rwsem;
+DEFINE_CLASS(
+ cxl_decoder_detach, struct cxl_region *,
+ if (!IS_ERR_OR_NULL(_T)) {
+ device_release_driver(&_T->dev);
+ put_device(&_T->dev);
+ },
+ ({
+ int rc = 0;
+
+ /* when the decoder is being destroyed lock unconditionally */
+ if (mode == DETACH_INVALIDATE)
+ down_write(&cxl_region_rwsem);
+ else
+ rc = down_write_killable(&cxl_region_rwsem);
+
+ if (rc)
+ cxlr = ERR_PTR(rc);
+ else {
+ cxlr = cxl_decoder_detach(cxlr, cxled, pos, mode);
+ get_device(&cxlr->dev);
+ }
+ up_write(&cxl_region_rwsem);
+
+ cxlr;
+ }),
+ struct cxl_region *cxlr, struct cxl_endpoint_decoder *cxled, int pos,
+ enum cxl_detach_mode mode)
+
int cxl_memdev_init(void);
void cxl_memdev_exit(void);
void cxl_mbox_init(void);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 726bd4a7de27..20b65f13bded 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -2008,11 +2008,9 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, "CXL");
static void cxld_unregister(void *dev)
{
- struct cxl_endpoint_decoder *cxled;
-
if (is_endpoint_decoder(dev)) {
- cxled = to_cxl_endpoint_decoder(dev);
- cxl_decoder_kill_region(cxled);
+ CLASS(cxl_decoder_detach, cxlr)
+ (NULL, to_cxl_endpoint_decoder(dev), -1, DETACH_INVALIDATE);
}
device_unregister(dev);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 11448824ddd4..17e69f6cc772 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2122,27 +2122,52 @@ static int cxl_region_attach(struct cxl_region *cxlr,
return 0;
}
-static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
+/*
+ * Cleanup a decoder's interest in a region. There are 2 cases to
+ * handle, removing an unknown @cxled from a known position in a region
+ * (detach_target()) or removing a known @cxled from an unknown @cxlr
+ * (cxld_unregister())
+ *
+ * When the detachment finds a region, the caller is responsible for
+ * releasing the region driver.
+ */
+struct cxl_region *cxl_decoder_detach(struct cxl_region *cxlr,
+ struct cxl_endpoint_decoder *cxled,
+ int pos, enum cxl_detach_mode mode)
{
- struct cxl_port *iter, *ep_port = cxled_to_port(cxled);
- struct cxl_region *cxlr = cxled->cxld.region;
struct cxl_region_params *p;
- int rc = 0;
lockdep_assert_held_write(&cxl_region_rwsem);
- if (!cxlr)
- return 0;
+ if (!cxled) {
+ p = &cxlr->params;
+
+ if (pos >= p->interleave_ways) {
+ dev_dbg(&cxlr->dev, "position %d out of range %d\n",
+ pos, p->interleave_ways);
+ return ERR_PTR(-ENXIO);
+ }
+
+ if (!p->targets[pos])
+ return NULL;
+ cxled = p->targets[pos];
+ } else {
+ cxlr = cxled->cxld.region;
+ if (!cxlr)
+ return NULL;
+ p = &cxlr->params;
+ }
- p = &cxlr->params;
- get_device(&cxlr->dev);
+
+ if (mode == DETACH_INVALIDATE)
+ cxled->part = -1;
if (p->state > CXL_CONFIG_ACTIVE) {
cxl_region_decode_reset(cxlr, p->interleave_ways);
p->state = CXL_CONFIG_ACTIVE;
}
- for (iter = ep_port; !is_cxl_root(iter);
+ for (struct cxl_port *iter = cxled_to_port(cxled); !is_cxl_root(iter);
iter = to_cxl_port(iter->dev.parent))
cxl_port_detach_region(iter, cxlr, cxled);
@@ -2153,7 +2178,7 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
dev_WARN_ONCE(&cxlr->dev, 1, "expected %s:%s at position %d\n",
dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
cxled->pos);
- goto out;
+ return NULL;
}
if (p->state == CXL_CONFIG_ACTIVE) {
@@ -2167,21 +2192,7 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
.end = -1,
};
- /* notify the region driver that one of its targets has departed */
- up_write(&cxl_region_rwsem);
- device_release_driver(&cxlr->dev);
- down_write(&cxl_region_rwsem);
-out:
- put_device(&cxlr->dev);
- return rc;
-}
-
-void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
-{
- down_write(&cxl_region_rwsem);
- cxled->part = -1;
- cxl_region_detach(cxled);
- up_write(&cxl_region_rwsem);
+ return cxlr;
}
static int attach_target(struct cxl_region *cxlr,
@@ -2206,29 +2217,10 @@ static int attach_target(struct cxl_region *cxlr,
static int detach_target(struct cxl_region *cxlr, int pos)
{
- struct cxl_region_params *p = &cxlr->params;
- int rc;
-
- rc = down_write_killable(&cxl_region_rwsem);
- if (rc)
- return rc;
-
- if (pos >= p->interleave_ways) {
- dev_dbg(&cxlr->dev, "position %d out of range %d\n", pos,
- p->interleave_ways);
- rc = -ENXIO;
- goto out;
- }
-
- if (!p->targets[pos]) {
- rc = 0;
- goto out;
- }
-
- rc = cxl_region_detach(p->targets[pos]);
-out:
- up_write(&cxl_region_rwsem);
- return rc;
+ CLASS(cxl_decoder_detach, ret)(cxlr, NULL, pos, DETACH_ONLY);
+ if (IS_ERR(ret))
+ return PTR_ERR(ret);
+ return 0;
}
static size_t store_targetN(struct cxl_region *cxlr, const char *buf, int pos,
--
2.49.0
* [PATCH 7/7] cleanup: Create an rwsem conditional acquisition class
2025-05-07 7:21 [PATCH 0/7] Introduce DEFINE_ACQUIRE(), a scoped_cond_guard() replacement Dan Williams
` (5 preceding siblings ...)
2025-05-07 7:21 ` [PATCH 6/7] cxl/region: Introduce CLASS(cxl_decoder_detach...) consolidate multiple paths Dan Williams
@ 2025-05-07 7:21 ` Dan Williams
6 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2025-05-07 7:21 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, David Lechner, Peter Zijlstra, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
Introduce 'struct rw_semaphore_acquire' with the following helpers:
[scoped_]guard(rwsem_read_acquire)(...)
[scoped_]guard(rwsem_write_acquire)(...)
CLASS(rwsem_read_intr_acquire, ...)
CLASS(rwsem_write_kill_acquire, ...)
CLASS(rwsem_read_try_acquire, ...)
CLASS(rwsem_write_try_acquire, ...)
...and convert the CXL 'struct rw_semaphore' instances to 'struct
rw_semaphore_acquire'.
Recall that the "_acquire" flavor of locking primitives does not support
explicit 'lock' and 'unlock' operations, only guard() and CLASS(). This
mandates some pre-work before the mechanism can be adopted in a subsystem.
Specifically, some explicit unlock patterns in the CXL subsystem
(mid-function unlock and "reverse" locking patterns) were refactored into
sequences that are amenable to scope-based unlock.
The result is smaller code that is easier to read (once CLASS() semantics
are understood) and avoids the re-indentation implied by scoped_cond_guard().
One nit is that 'struct rw_semaphore_acquire' is not compatible with the
lockdep APIs, so, for now, the type-conversion back to the raw
'struct rw_semaphore' is open-coded at the lockdep_assert_*() call sites.
Cc: David Lechner <dlechner@baylibre.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Fabio M. De Francesco" <fabio.maria.de.francesco@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/cxl/core/cdat.c | 6 +-
drivers/cxl/core/core.h | 34 ++---
drivers/cxl/core/hdm.c | 38 +++--
drivers/cxl/core/mbox.c | 6 +-
drivers/cxl/core/memdev.c | 62 ++++-----
drivers/cxl/core/port.c | 12 +-
drivers/cxl/core/region.c | 286 +++++++++++++++-----------------------
include/linux/rwsem.h | 37 +++++
8 files changed, 219 insertions(+), 262 deletions(-)
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index edb4f41eeacc..dadfcf6fb84c 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -336,7 +336,7 @@ static int match_cxlrd_hb(struct device *dev, void *data)
cxlrd = to_cxl_root_decoder(dev);
cxlsd = &cxlrd->cxlsd;
- guard(rwsem_read)(&cxl_region_rwsem);
+ guard(rwsem_read_acquire)(&cxl_region_rwsem);
for (int i = 0; i < cxlsd->nr_targets; i++) {
if (host_bridge == cxlsd->target[i]->dport_dev)
return 1;
@@ -987,7 +987,7 @@ void cxl_region_shared_upstream_bandwidth_update(struct cxl_region *cxlr)
bool is_root;
int rc;
- lockdep_assert_held(&cxl_dpa_rwsem);
+ lockdep_assert_held(&cxl_dpa_rwsem.rw_semaphore);
struct xarray *usp_xa __free(free_perf_xa) =
kzalloc(sizeof(*usp_xa), GFP_KERNEL);
@@ -1057,7 +1057,7 @@ void cxl_region_perf_data_calculate(struct cxl_region *cxlr,
{
struct cxl_dpa_perf *perf;
- lockdep_assert_held(&cxl_dpa_rwsem);
+ lockdep_assert_held(&cxl_dpa_rwsem.rw_semaphore);
perf = cxled_get_dpa_perf(cxled);
if (IS_ERR(perf))
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 44b09552f44e..23afbbd650b7 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -107,8 +107,8 @@ u16 cxl_rcrb_to_aer(struct device *dev, resource_size_t rcrb);
#define PCI_RCRB_CAP_HDR_NEXT_MASK GENMASK(15, 8)
#define PCI_CAP_EXP_SIZEOF 0x3c
-extern struct rw_semaphore cxl_dpa_rwsem;
-extern struct rw_semaphore cxl_region_rwsem;
+extern struct rw_semaphore_acquire cxl_dpa_rwsem;
+extern struct rw_semaphore_acquire cxl_region_rwsem;
DEFINE_CLASS(
cxl_decoder_detach, struct cxl_region *,
@@ -117,22 +117,24 @@ DEFINE_CLASS(
put_device(&_T->dev);
},
({
- int rc = 0;
-
/* when the decoder is being destroyed lock unconditionally */
- if (mode == DETACH_INVALIDATE)
- down_write(&cxl_region_rwsem);
- else
- rc = down_write_killable(&cxl_region_rwsem);
-
- if (rc)
- cxlr = ERR_PTR(rc);
- else {
- cxlr = cxl_decoder_detach(cxlr, cxled, pos, mode);
- get_device(&cxlr->dev);
+ if (mode == DETACH_INVALIDATE) {
+ scoped_guard(rwsem_write_acquire, &cxl_region_rwsem) {
+ cxlr = cxl_decoder_detach(cxlr, cxled, pos,
+ mode);
+ if (!IS_ERR_OR_NULL(cxlr))
+ get_device(&cxlr->dev);
+ }
+ } else {
+ CLASS(rwsem_write_kill_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ cxlr = ERR_CAST(rwsem);
+ else
+ cxlr = cxl_decoder_detach(cxlr, cxled, pos,
+ mode);
+ if (!IS_ERR_OR_NULL(cxlr))
+ get_device(&cxlr->dev);
}
- up_write(&cxl_region_rwsem);
-
cxlr;
}),
struct cxl_region *cxlr, struct cxl_endpoint_decoder *cxled, int pos,
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 1c195f495a59..1624f066fde5 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -16,7 +16,7 @@
* for enumerating these registers and capabilities.
*/
-DECLARE_RWSEM(cxl_dpa_rwsem);
+DECLARE_RWSEM_ACQUIRE(cxl_dpa_rwsem);
static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
int *target_map)
@@ -213,7 +213,7 @@ void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds)
{
struct resource *p1, *p2;
- guard(rwsem_read)(&cxl_dpa_rwsem);
+ guard(rwsem_read_acquire)(&cxl_dpa_rwsem);
for (p1 = cxlds->dpa_res.child; p1; p1 = p1->sibling) {
__cxl_dpa_debug(file, p1, 0);
for (p2 = p1->child; p2; p2 = p2->sibling)
@@ -265,7 +265,7 @@ static void __cxl_dpa_release(struct cxl_endpoint_decoder *cxled)
struct resource *res = cxled->dpa_res;
resource_size_t skip_start;
- lockdep_assert_held_write(&cxl_dpa_rwsem);
+ lockdep_assert_held_write(&cxl_dpa_rwsem.rw_semaphore);
/* save @skip_start, before @res is released */
skip_start = res->start - cxled->skip;
@@ -280,7 +280,7 @@ static void __cxl_dpa_release(struct cxl_endpoint_decoder *cxled)
static void cxl_dpa_release(void *cxled)
{
- guard(rwsem_write)(&cxl_dpa_rwsem);
+ guard(rwsem_write_acquire)(&cxl_dpa_rwsem);
__cxl_dpa_release(cxled);
}
@@ -292,7 +292,7 @@ static void devm_cxl_dpa_release(struct cxl_endpoint_decoder *cxled)
{
struct cxl_port *port = cxled_to_port(cxled);
- lockdep_assert_held_write(&cxl_dpa_rwsem);
+ lockdep_assert_held_write(&cxl_dpa_rwsem.rw_semaphore);
devm_remove_action(&port->dev, cxl_dpa_release, cxled);
__cxl_dpa_release(cxled);
}
@@ -360,7 +360,7 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
struct resource *res;
int rc;
- lockdep_assert_held_write(&cxl_dpa_rwsem);
+ lockdep_assert_held_write(&cxl_dpa_rwsem.rw_semaphore);
if (!len) {
dev_warn(dev, "decoder%d.%d: empty reservation attempted\n",
@@ -469,7 +469,7 @@ int cxl_dpa_setup(struct cxl_dev_state *cxlds, const struct cxl_dpa_info *info)
{
struct device *dev = cxlds->dev;
- guard(rwsem_write)(&cxl_dpa_rwsem);
+ guard(rwsem_write_acquire)(&cxl_dpa_rwsem);
if (cxlds->nr_partitions)
return -EBUSY;
@@ -515,9 +515,8 @@ int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
struct cxl_port *port = cxled_to_port(cxled);
int rc;
- down_write(&cxl_dpa_rwsem);
- rc = __cxl_dpa_reserve(cxled, base, len, skipped);
- up_write(&cxl_dpa_rwsem);
+ scoped_guard(rwsem_write_acquire, &cxl_dpa_rwsem)
+ rc = __cxl_dpa_reserve(cxled, base, len, skipped);
if (rc)
return rc;
@@ -528,7 +527,7 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_dpa_reserve, "CXL");
resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled)
{
- guard(rwsem_read)(&cxl_dpa_rwsem);
+ guard(rwsem_read_acquire)(&cxl_dpa_rwsem);
if (cxled->dpa_res)
return resource_size(cxled->dpa_res);
@@ -539,7 +538,7 @@ resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled)
{
resource_size_t base = -1;
- lockdep_assert_held(&cxl_dpa_rwsem);
+ lockdep_assert_held(&cxl_dpa_rwsem.rw_semaphore);
if (cxled->dpa_res)
base = cxled->dpa_res->start;
@@ -551,7 +550,7 @@ int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
struct cxl_port *port = cxled_to_port(cxled);
struct device *dev = &cxled->cxld.dev;
- guard(rwsem_write)(&cxl_dpa_rwsem);
+ guard(rwsem_write_acquire)(&cxl_dpa_rwsem);
if (!cxled->dpa_res)
return 0;
if (cxled->cxld.region) {
@@ -581,7 +580,7 @@ int cxl_dpa_set_part(struct cxl_endpoint_decoder *cxled,
struct device *dev = &cxled->cxld.dev;
int part;
- guard(rwsem_write)(&cxl_dpa_rwsem);
+ guard(rwsem_write_acquire)(&cxl_dpa_rwsem);
if (cxled->cxld.flags & CXL_DECODER_F_ENABLE)
return -EBUSY;
@@ -613,7 +612,7 @@ static int __cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long lon
struct resource *p, *last;
int part;
- guard(rwsem_write)(&cxl_dpa_rwsem);
+ guard(rwsem_write_acquire)(&cxl_dpa_rwsem);
if (cxled->cxld.region) {
dev_dbg(dev, "decoder attached to %s\n",
dev_name(&cxled->cxld.region->dev));
@@ -841,9 +840,8 @@ static int cxl_decoder_commit(struct cxl_decoder *cxld)
}
}
- down_read(&cxl_dpa_rwsem);
- setup_hw_decoder(cxld, hdm);
- up_read(&cxl_dpa_rwsem);
+ scoped_guard(rwsem_read_acquire, &cxl_dpa_rwsem)
+ setup_hw_decoder(cxld, hdm);
port->commit_end++;
rc = cxld_await_commit(hdm, cxld->id);
@@ -881,7 +879,7 @@ void cxl_port_commit_reap(struct cxl_decoder *cxld)
{
struct cxl_port *port = to_cxl_port(cxld->dev.parent);
- lockdep_assert_held_write(&cxl_region_rwsem);
+ lockdep_assert_held_write(&cxl_region_rwsem.rw_semaphore);
/*
* Once the highest committed decoder is disabled, free any other
@@ -1029,7 +1027,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
else
cxld->target_type = CXL_DECODER_DEVMEM;
- guard(rwsem_write)(&cxl_region_rwsem);
+ guard(rwsem_write_acquire)(&cxl_region_rwsem);
if (cxld->id != cxl_num_decoders_committed(port)) {
dev_warn(&port->dev,
"decoder%d.%d: Committed out of order\n",
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index cec9dfb22567..9f17e9072e60 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -909,8 +909,8 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
* translations. Take topology mutation locks and lookup
* { HPA, REGION } from { DPA, MEMDEV } in the event record.
*/
- guard(rwsem_read)(&cxl_region_rwsem);
- guard(rwsem_read)(&cxl_dpa_rwsem);
+ guard(rwsem_read_acquire)(&cxl_region_rwsem);
+ guard(rwsem_read_acquire)(&cxl_dpa_rwsem);
dpa = le64_to_cpu(evt->media_hdr.phys_addr) & CXL_DPA_MASK;
cxlr = cxl_dpa_to_region(cxlmd, dpa);
@@ -1258,7 +1258,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd)
/* synchronize with cxl_mem_probe() and decoder write operations */
guard(device)(&cxlmd->dev);
endpoint = cxlmd->endpoint;
- guard(rwsem_read)(&cxl_region_rwsem);
+ guard(rwsem_read_acquire)(&cxl_region_rwsem);
/*
* Require an endpoint to be safe otherwise the driver can not
* be sure that the device is unmapped.
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index a16a5886d40a..8183f8099f89 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -231,15 +231,13 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
if (!port || !is_cxl_endpoint(port))
return -EINVAL;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_read_intr_acquire, region_rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(region_rwsem))
+ return PTR_ERR(region_rwsem);
- rc = down_read_interruptible(&cxl_dpa_rwsem);
- if (rc) {
- up_read(&cxl_region_rwsem);
- return rc;
- }
+ CLASS(rwsem_read_intr_acquire, dpa_rwsem)(&cxl_dpa_rwsem);
+ if (IS_ERR(dpa_rwsem))
+ return PTR_ERR(dpa_rwsem);
if (cxl_num_decoders_committed(port) == 0) {
/* No regions mapped to this memdev */
@@ -248,8 +246,6 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
/* Regions mapped, collect poison by endpoint */
rc = cxl_get_poison_by_endpoint(port);
}
- up_read(&cxl_dpa_rwsem);
- up_read(&cxl_region_rwsem);
return rc;
}
@@ -291,19 +287,17 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_read_intr_acquire, region_rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(region_rwsem))
+ return PTR_ERR(region_rwsem);
- rc = down_read_interruptible(&cxl_dpa_rwsem);
- if (rc) {
- up_read(&cxl_region_rwsem);
- return rc;
- }
+ CLASS(rwsem_read_intr_acquire, dpa_rwsem)(&cxl_dpa_rwsem);
+ if (IS_ERR(dpa_rwsem))
+ return PTR_ERR(dpa_rwsem);
rc = cxl_validate_poison_dpa(cxlmd, dpa);
if (rc)
- goto out;
+ return rc;
inject.address = cpu_to_le64(dpa);
mbox_cmd = (struct cxl_mbox_cmd) {
@@ -313,7 +307,7 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
};
rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
if (rc)
- goto out;
+ return rc;
cxlr = cxl_dpa_to_region(cxlmd, dpa);
if (cxlr)
@@ -326,11 +320,8 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
.length = cpu_to_le32(1),
};
trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
-out:
- up_read(&cxl_dpa_rwsem);
- up_read(&cxl_region_rwsem);
- return rc;
+ return 0;
}
EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, "CXL");
@@ -346,19 +337,17 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_read_intr_acquire, region_rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(region_rwsem))
+ return PTR_ERR(region_rwsem);
- rc = down_read_interruptible(&cxl_dpa_rwsem);
- if (rc) {
- up_read(&cxl_region_rwsem);
- return rc;
- }
+ CLASS(rwsem_read_intr_acquire, dpa_rwsem)(&cxl_dpa_rwsem);
+ if (IS_ERR(dpa_rwsem))
+ return PTR_ERR(dpa_rwsem);
rc = cxl_validate_poison_dpa(cxlmd, dpa);
if (rc)
- goto out;
+ return rc;
/*
* In CXL 3.0 Spec 8.2.9.8.4.3, the Clear Poison mailbox command
@@ -377,7 +366,7 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
if (rc)
- goto out;
+ return rc;
cxlr = cxl_dpa_to_region(cxlmd, dpa);
if (cxlr)
@@ -390,11 +379,8 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
.length = cpu_to_le32(1),
};
trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_CLEAR);
-out:
- up_read(&cxl_dpa_rwsem);
- up_read(&cxl_region_rwsem);
- return rc;
+ return 0;
}
EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, "CXL");
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 20b65f13bded..a157aef2cd06 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -34,14 +34,14 @@
* All changes to the interleave configuration occur with this lock held
* for write.
*/
-DECLARE_RWSEM(cxl_region_rwsem);
+DECLARE_RWSEM_ACQUIRE(cxl_region_rwsem);
static DEFINE_IDA(cxl_port_ida);
static DEFINE_XARRAY(cxl_root_buses);
int cxl_num_decoders_committed(struct cxl_port *port)
{
- lockdep_assert_held(&cxl_region_rwsem);
+ lockdep_assert_held(&cxl_region_rwsem.rw_semaphore);
return port->commit_end + 1;
}
@@ -176,7 +176,7 @@ static ssize_t target_list_show(struct device *dev,
ssize_t offset;
int rc;
- guard(rwsem_read)(&cxl_region_rwsem);
+ guard(rwsem_read_acquire)(&cxl_region_rwsem);
rc = emit_target_list(cxlsd, buf);
if (rc < 0)
return rc;
@@ -235,7 +235,7 @@ static ssize_t dpa_resource_show(struct device *dev, struct device_attribute *at
{
struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(dev);
- guard(rwsem_read)(&cxl_dpa_rwsem);
+ guard(rwsem_read_acquire)(&cxl_dpa_rwsem);
return sysfs_emit(buf, "%#llx\n", (u64)cxl_dpa_resource_start(cxled));
}
static DEVICE_ATTR_RO(dpa_resource);
@@ -560,7 +560,7 @@ static ssize_t decoders_committed_show(struct device *dev,
{
struct cxl_port *port = to_cxl_port(dev);
- guard(rwsem_read)(&cxl_region_rwsem);
+ guard(rwsem_read_acquire)(&cxl_region_rwsem);
return sysfs_emit(buf, "%d\n", cxl_num_decoders_committed(port));
}
@@ -1729,7 +1729,7 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
if (xa_empty(&port->dports))
return -EINVAL;
- guard(rwsem_write)(&cxl_region_rwsem);
+ guard(rwsem_write_acquire)(&cxl_region_rwsem);
for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
struct cxl_dport *dport = find_dport(port, target_map[i]);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 17e69f6cc772..65313a3548e4 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -139,18 +139,13 @@ static ssize_t uuid_show(struct device *dev, struct device_attribute *attr,
{
struct cxl_region *cxlr = to_cxl_region(dev);
struct cxl_region_params *p = &cxlr->params;
- ssize_t rc;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_read_intr_acquire, region_rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(region_rwsem))
+ return PTR_ERR(region_rwsem);
if (cxlr->mode != CXL_PARTMODE_PMEM)
- rc = sysfs_emit(buf, "\n");
- else
- rc = sysfs_emit(buf, "%pUb\n", &p->uuid);
- up_read(&cxl_region_rwsem);
-
- return rc;
+ return sysfs_emit(buf, "\n");
+ return sysfs_emit(buf, "%pUb\n", &p->uuid);
}
static int is_dup(struct device *match, void *data)
@@ -162,7 +157,7 @@ static int is_dup(struct device *match, void *data)
if (!is_cxl_region(match))
return 0;
- lockdep_assert_held(&cxl_region_rwsem);
+ lockdep_assert_held(&cxl_region_rwsem.rw_semaphore);
cxlr = to_cxl_region(match);
p = &cxlr->params;
@@ -192,27 +187,22 @@ static ssize_t uuid_store(struct device *dev, struct device_attribute *attr,
if (uuid_is_null(&temp))
return -EINVAL;
- rc = down_write_killable(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_write_kill_acquire, region_rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(region_rwsem))
+ return PTR_ERR(region_rwsem);
if (uuid_equal(&p->uuid, &temp))
- goto out;
+ return len;
- rc = -EBUSY;
if (p->state >= CXL_CONFIG_ACTIVE)
- goto out;
+ return -EBUSY;
rc = bus_for_each_dev(&cxl_bus_type, NULL, &temp, is_dup);
if (rc < 0)
- goto out;
+ return rc;
uuid_copy(&p->uuid, &temp);
-out:
- up_write(&cxl_region_rwsem);
- if (rc)
- return rc;
return len;
}
static DEVICE_ATTR_RW(uuid);
@@ -353,22 +343,18 @@ static int cxl_region_decode_commit(struct cxl_region *cxlr)
static int queue_reset(struct cxl_region *cxlr)
{
struct cxl_region_params *p = &cxlr->params;
- int rc;
- rc = down_write_killable(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_write_kill_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
/* Already in the requested state? */
if (p->state < CXL_CONFIG_COMMIT)
- goto out;
+ return 0;
p->state = CXL_CONFIG_RESET_PENDING;
-out:
- up_write(&cxl_region_rwsem);
-
- return rc;
+ return 0;
}
static int __commit(struct cxl_region *cxlr)
@@ -376,19 +362,17 @@ static int __commit(struct cxl_region *cxlr)
struct cxl_region_params *p = &cxlr->params;
int rc;
- rc = down_write_killable(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_write_kill_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
/* Already in the requested state? */
if (p->state >= CXL_CONFIG_COMMIT)
- goto out;
+ return 0;
/* Not ready to commit? */
- if (p->state < CXL_CONFIG_ACTIVE) {
- rc = -ENXIO;
- goto out;
- }
+ if (p->state < CXL_CONFIG_ACTIVE)
+ return -ENXIO;
/*
* Invalidate caches before region setup to drop any speculative
@@ -396,16 +380,15 @@ static int __commit(struct cxl_region *cxlr)
*/
rc = cxl_region_invalidate_memregion(cxlr);
if (rc)
- goto out;
+ return rc;
rc = cxl_region_decode_commit(cxlr);
- if (rc == 0)
- p->state = CXL_CONFIG_COMMIT;
+ if (rc)
+ return rc;
-out:
- up_write(&cxl_region_rwsem);
+ p->state = CXL_CONFIG_COMMIT;
- return rc;
+ return 0;
}
static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
@@ -441,7 +424,7 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
* With the reset pending take cxl_region_rwsem unconditionally
* to ensure the reset gets handled before returning.
*/
- guard(rwsem_write)(&cxl_region_rwsem);
+ guard(rwsem_write_acquire)(&cxl_region_rwsem);
/*
* Revalidate that the reset is still pending in case another
@@ -460,15 +443,11 @@ static ssize_t commit_show(struct device *dev, struct device_attribute *attr,
{
struct cxl_region *cxlr = to_cxl_region(dev);
struct cxl_region_params *p = &cxlr->params;
- ssize_t rc;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
- rc = sysfs_emit(buf, "%d\n", p->state >= CXL_CONFIG_COMMIT);
- up_read(&cxl_region_rwsem);
-
- return rc;
+ CLASS(rwsem_read_intr_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
+ return sysfs_emit(buf, "%d\n", p->state >= CXL_CONFIG_COMMIT);
}
static DEVICE_ATTR_RW(commit);
@@ -492,15 +471,11 @@ static ssize_t interleave_ways_show(struct device *dev,
{
struct cxl_region *cxlr = to_cxl_region(dev);
struct cxl_region_params *p = &cxlr->params;
- ssize_t rc;
-
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
- rc = sysfs_emit(buf, "%d\n", p->interleave_ways);
- up_read(&cxl_region_rwsem);
- return rc;
+ CLASS(rwsem_read_intr_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
+ return sysfs_emit(buf, "%d\n", p->interleave_ways);
}
static const struct attribute_group *get_cxl_region_target_group(void);
@@ -535,23 +510,21 @@ static ssize_t interleave_ways_store(struct device *dev,
return -EINVAL;
}
- rc = down_write_killable(&cxl_region_rwsem);
- if (rc)
- return rc;
- if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
- rc = -EBUSY;
- goto out;
- }
+ CLASS(rwsem_write_kill_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
+
+ if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
+ return -EBUSY;
save = p->interleave_ways;
p->interleave_ways = val;
rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
- if (rc)
+ if (rc) {
p->interleave_ways = save;
-out:
- up_write(&cxl_region_rwsem);
- if (rc)
return rc;
+ }
+
return len;
}
static DEVICE_ATTR_RW(interleave_ways);
@@ -562,15 +535,11 @@ static ssize_t interleave_granularity_show(struct device *dev,
{
struct cxl_region *cxlr = to_cxl_region(dev);
struct cxl_region_params *p = &cxlr->params;
- ssize_t rc;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
- rc = sysfs_emit(buf, "%d\n", p->interleave_granularity);
- up_read(&cxl_region_rwsem);
-
- return rc;
+ CLASS(rwsem_read_intr_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
+ return sysfs_emit(buf, "%d\n", p->interleave_granularity);
}
static ssize_t interleave_granularity_store(struct device *dev,
@@ -603,19 +572,15 @@ static ssize_t interleave_granularity_store(struct device *dev,
if (cxld->interleave_ways > 1 && val != cxld->interleave_granularity)
return -EINVAL;
- rc = down_write_killable(&cxl_region_rwsem);
- if (rc)
- return rc;
- if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
- rc = -EBUSY;
- goto out;
- }
+ CLASS(rwsem_write_kill_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
+
+ if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
+ return -EBUSY;
p->interleave_granularity = val;
-out:
- up_write(&cxl_region_rwsem);
- if (rc)
- return rc;
+
return len;
}
static DEVICE_ATTR_RW(interleave_granularity);
@@ -626,17 +591,14 @@ static ssize_t resource_show(struct device *dev, struct device_attribute *attr,
struct cxl_region *cxlr = to_cxl_region(dev);
struct cxl_region_params *p = &cxlr->params;
u64 resource = -1ULL;
- ssize_t rc;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_read_intr_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
+
if (p->res)
resource = p->res->start;
- rc = sysfs_emit(buf, "%#llx\n", resource);
- up_read(&cxl_region_rwsem);
-
- return rc;
+ return sysfs_emit(buf, "%#llx\n", resource);
}
static DEVICE_ATTR_RO(resource);
@@ -664,7 +626,7 @@ static int alloc_hpa(struct cxl_region *cxlr, resource_size_t size)
struct resource *res;
u64 remainder = 0;
- lockdep_assert_held_write(&cxl_region_rwsem);
+ lockdep_assert_held_write(&cxl_region_rwsem.rw_semaphore);
/* Nothing to do... */
if (p->res && resource_size(p->res) == size)
@@ -706,7 +668,7 @@ static void cxl_region_iomem_release(struct cxl_region *cxlr)
struct cxl_region_params *p = &cxlr->params;
if (device_is_registered(&cxlr->dev))
- lockdep_assert_held_write(&cxl_region_rwsem);
+ lockdep_assert_held_write(&cxl_region_rwsem.rw_semaphore);
if (p->res) {
/*
* Autodiscovered regions may not have been able to insert their
@@ -723,7 +685,7 @@ static int free_hpa(struct cxl_region *cxlr)
{
struct cxl_region_params *p = &cxlr->params;
- lockdep_assert_held_write(&cxl_region_rwsem);
+ lockdep_assert_held_write(&cxl_region_rwsem.rw_semaphore);
if (!p->res)
return 0;
@@ -747,15 +709,14 @@ static ssize_t size_store(struct device *dev, struct device_attribute *attr,
if (rc)
return rc;
- rc = down_write_killable(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_write_kill_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
if (val)
rc = alloc_hpa(cxlr, val);
else
rc = free_hpa(cxlr);
- up_write(&cxl_region_rwsem);
if (rc)
return rc;
@@ -769,17 +730,13 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
struct cxl_region *cxlr = to_cxl_region(dev);
struct cxl_region_params *p = &cxlr->params;
u64 size = 0;
- ssize_t rc;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_read_intr_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
if (p->res)
size = resource_size(p->res);
- rc = sysfs_emit(buf, "%#llx\n", size);
- up_read(&cxl_region_rwsem);
-
- return rc;
+ return sysfs_emit(buf, "%#llx\n", size);
}
static DEVICE_ATTR_RW(size);
@@ -803,28 +760,21 @@ static size_t show_targetN(struct cxl_region *cxlr, char *buf, int pos)
{
struct cxl_region_params *p = &cxlr->params;
struct cxl_endpoint_decoder *cxled;
- int rc;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_read_intr_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
if (pos >= p->interleave_ways) {
dev_dbg(&cxlr->dev, "position %d out of range %d\n", pos,
p->interleave_ways);
- rc = -ENXIO;
- goto out;
+ return -ENXIO;
}
cxled = p->targets[pos];
if (!cxled)
- rc = sysfs_emit(buf, "\n");
- else
- rc = sysfs_emit(buf, "%s\n", dev_name(&cxled->cxld.dev));
-out:
- up_read(&cxl_region_rwsem);
-
- return rc;
+ return sysfs_emit(buf, "\n");
+ return sysfs_emit(buf, "%s\n", dev_name(&cxled->cxld.dev));
}
static int check_commit_order(struct device *dev, void *data)
@@ -1127,7 +1077,7 @@ static int cxl_port_attach_region(struct cxl_port *port,
unsigned long index;
int rc = -EBUSY;
- lockdep_assert_held_write(&cxl_region_rwsem);
+ lockdep_assert_held_write(&cxl_region_rwsem.rw_semaphore);
cxl_rr = cxl_rr_load(port, cxlr);
if (cxl_rr) {
@@ -1228,7 +1178,7 @@ static void cxl_port_detach_region(struct cxl_port *port,
struct cxl_region_ref *cxl_rr;
struct cxl_ep *ep = NULL;
- lockdep_assert_held_write(&cxl_region_rwsem);
+ lockdep_assert_held_write(&cxl_region_rwsem.rw_semaphore);
cxl_rr = cxl_rr_load(port, cxlr);
if (!cxl_rr)
@@ -2137,7 +2087,7 @@ struct cxl_region *cxl_decoder_detach(struct cxl_region *cxlr,
{
struct cxl_region_params *p;
- lockdep_assert_held_write(&cxl_region_rwsem);
+ lockdep_assert_held_write(&cxl_region_rwsem.rw_semaphore);
if (!cxled) {
p = &cxlr->params;
@@ -2199,20 +2149,17 @@ static int attach_target(struct cxl_region *cxlr,
struct cxl_endpoint_decoder *cxled, int pos,
unsigned int state)
{
- int rc = 0;
-
- if (state == TASK_INTERRUPTIBLE)
- rc = down_write_killable(&cxl_region_rwsem);
- else
- down_write(&cxl_region_rwsem);
- if (rc)
- return rc;
+ if (state == TASK_INTERRUPTIBLE) {
+ CLASS(rwsem_write_kill_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
+ guard(rwsem_read_acquire)(&cxl_dpa_rwsem);
+ return cxl_region_attach(cxlr, cxled, pos);
+ }
- down_read(&cxl_dpa_rwsem);
- rc = cxl_region_attach(cxlr, cxled, pos);
- up_read(&cxl_dpa_rwsem);
- up_write(&cxl_region_rwsem);
- return rc;
+ guard(rwsem_write_acquire)(&cxl_region_rwsem);
+ guard(rwsem_read_acquire)(&cxl_dpa_rwsem);
+ return cxl_region_attach(cxlr, cxled, pos);
}
static int detach_target(struct cxl_region *cxlr, int pos)
@@ -2644,19 +2591,14 @@ static ssize_t region_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct cxl_decoder *cxld = to_cxl_decoder(dev);
- ssize_t rc;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
+ CLASS(rwsem_read_intr_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem))
+ return PTR_ERR(rwsem);
if (cxld->region)
- rc = sysfs_emit(buf, "%s\n", dev_name(&cxld->region->dev));
- else
- rc = sysfs_emit(buf, "\n");
- up_read(&cxl_region_rwsem);
-
- return rc;
+ return sysfs_emit(buf, "%s\n", dev_name(&cxld->region->dev));
+ return sysfs_emit(buf, "\n");
}
DEVICE_ATTR_RO(region);
@@ -2995,7 +2937,7 @@ static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
struct device *dev;
int i;
- guard(rwsem_read)(&cxl_region_rwsem);
+ guard(rwsem_read_acquire)(&cxl_region_rwsem);
if (p->state != CXL_CONFIG_COMMIT)
return -ENXIO;
@@ -3084,7 +3026,7 @@ static struct cxl_dax_region *cxl_dax_region_alloc(struct cxl_region *cxlr)
struct cxl_dax_region *cxlr_dax;
struct device *dev;
- guard(rwsem_read)(&cxl_region_rwsem);
+ guard(rwsem_read_acquire)(&cxl_region_rwsem);
if (p->state != CXL_CONFIG_COMMIT)
return ERR_PTR(-ENXIO);
@@ -3255,7 +3197,7 @@ static int match_region_by_range(struct device *dev, const void *data)
cxlr = to_cxl_region(dev);
p = &cxlr->params;
- guard(rwsem_read)(&cxl_region_rwsem);
+ guard(rwsem_read_acquire)(&cxl_region_rwsem);
if (p->res && p->res->start == r->start && p->res->end == r->end)
return 1;
@@ -3315,7 +3257,7 @@ static int __construct_region(struct cxl_region *cxlr,
struct resource *res;
int rc;
- guard(rwsem_write)(&cxl_region_rwsem);
+ guard(rwsem_write_acquire)(&cxl_region_rwsem);
p = &cxlr->params;
if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
dev_err(cxlmd->dev.parent,
@@ -3453,10 +3395,10 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
attach_target(cxlr, cxled, -1, TASK_UNINTERRUPTIBLE);
- down_read(&cxl_region_rwsem);
- p = &cxlr->params;
- attach = p->state == CXL_CONFIG_COMMIT;
- up_read(&cxl_region_rwsem);
+ scoped_guard(rwsem_read_acquire, &cxl_region_rwsem) {
+ p = &cxlr->params;
+ attach = p->state == CXL_CONFIG_COMMIT;
+ }
if (attach) {
/*
@@ -3484,7 +3426,7 @@ u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa)
if (!endpoint)
return ~0ULL;
- guard(rwsem_write)(&cxl_region_rwsem);
+ guard(rwsem_write_acquire)(&cxl_region_rwsem);
xa_for_each(&endpoint->regions, index, iter) {
struct cxl_region_params *p = &iter->region->params;
@@ -3524,32 +3466,24 @@ static void shutdown_notifiers(void *_cxlr)
static int cxl_region_can_probe(struct cxl_region *cxlr)
{
struct cxl_region_params *p = &cxlr->params;
- int rc;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc) {
+ CLASS(rwsem_read_intr_acquire, rwsem)(&cxl_region_rwsem);
+ if (IS_ERR(rwsem)) {
dev_dbg(&cxlr->dev, "probe interrupted\n");
- return rc;
+ return PTR_ERR(rwsem);
}
if (p->state < CXL_CONFIG_COMMIT) {
dev_dbg(&cxlr->dev, "config state: %d\n", p->state);
- rc = -ENXIO;
- goto out;
+ return -ENXIO;
}
if (test_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags)) {
dev_err(&cxlr->dev,
"failed to activate, re-commit region and retry\n");
- rc = -ENXIO;
- goto out;
+ return -ENXIO;
}
-out:
- up_read(&cxl_region_rwsem);
-
- if (rc)
- return rc;
return 0;
}
diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index c8b543d428b0..4c44a50d47d6 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -242,9 +242,46 @@ DEFINE_GUARD(rwsem_read, struct rw_semaphore *, down_read(_T), up_read(_T))
DEFINE_GUARD_COND(rwsem_read, _try, down_read_trylock(_T))
DEFINE_GUARD_COND(rwsem_read, _intr, down_read_interruptible(_T) == 0)
+struct rw_semaphore_acquire {
+ struct rw_semaphore rw_semaphore;
+};
+
+#define DECLARE_RWSEM_ACQUIRE(name) \
+ struct rw_semaphore_acquire name = { __RWSEM_INITIALIZER( \
+ name.rw_semaphore) }
+
+DEFINE_GUARD(rwsem_read_acquire, struct rw_semaphore_acquire *,
+ down_read(&_T->rw_semaphore), up_read(&_T->rw_semaphore))
+DEFINE_GUARD(rwsem_write_acquire, struct rw_semaphore_acquire *,
+ down_write(&_T->rw_semaphore), up_write(&_T->rw_semaphore))
+DEFINE_ACQUIRE(rwsem_read_intr_acquire, rw_semaphore, up_read,
+ down_read_interruptible)
+DEFINE_ACQUIRE(rwsem_write_kill_acquire, rw_semaphore, up_write,
+ down_write_killable)
+
+static inline int down_read_try_or_busy(struct rw_semaphore *rwsem)
+{
+ int ret[] = { -EBUSY, 0 };
+
+ return ret[down_read_trylock(rwsem)];
+}
+
+DEFINE_ACQUIRE(rwsem_read_try_acquire, rw_semaphore, up_read,
+ down_read_try_or_busy)
+
DEFINE_GUARD(rwsem_write, struct rw_semaphore *, down_write(_T), up_write(_T))
DEFINE_GUARD_COND(rwsem_write, _try, down_write_trylock(_T))
+static inline int down_write_try_or_busy(struct rw_semaphore *rwsem)
+{
+ int ret[] = { -EBUSY, 0 };
+
+ return ret[down_write_trylock(rwsem)];
+}
+
+DEFINE_ACQUIRE(rwsem_write_try_acquire, rw_semaphore, up_write,
+ down_write_try_or_busy)
+
/*
* downgrade write lock to read lock
*/
--
2.49.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-07 7:21 ` [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking Dan Williams
@ 2025-05-07 9:32 ` Peter Zijlstra
2025-05-07 21:18 ` Dan Williams
2025-05-09 19:10 ` kernel test robot
1 sibling, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2025-05-07 9:32 UTC (permalink / raw)
To: Dan Williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Wed, May 07, 2025 at 12:21:39AM -0700, Dan Williams wrote:
> The scoped_cond_guard() helper has two problems. First, with "convert to
> cleanup" style patches, it results in messy diffs that re-indent large
> swaths of code. Second, it swallows return values from interruptible /
> killable locks, i.e. the return value of those is not always -EINTR.
>
> The last attempt at improving this situation, if_not_guard() [1], was
> reverted with a suggestion:
>
> "I really think the basic issue is that 'cond_guard' itself is a pretty
> broken concept. It simply doesn't work very well in the C syntax."
>
> However, the observation is that just exposing the CLASS() to callers
> solves: the safety concern (compiler enforced "expected expression"
> checks), conveying the conditional acquisition state of the lock, and
> avoiding re-indentation caused by scoped_cond_guard(). See the commit below
> for the analysis of the revert.
>
> Commit b4d83c8323b0 ("headers/cleanup.h: Remove the if_not_guard() facility")
>
> The DEFINE_ACQUIRE() macro wraps a lock type like 'struct mutex' into a
> 'struct mutex_acquire' type that can only be acquired and automatically
> released by scope-based helpers. E.g.:
>
> [scoped_]guard(mutex_acquire)(...)
> CLASS(mutex_intr_acquire, ...)
> CLASS(mutex_try_acquire, ...)
>
> Use DEFINE_ACQUIRE to create the new classes above and use
> mutex_intr_acquire in the CXL subsystem to demonstrate the conversion.
>
> +#define DEFINE_ACQUIRE(_name, _locktype, _unlock, _cond_lock) \
> + DEFINE_CLASS(_name, struct _locktype##_acquire *, \
> + if (!IS_ERR_OR_NULL(_T)) _unlock(&_T->_locktype), ({ \
> + struct _locktype##_acquire *lock_result; \
> + int ret = _cond_lock(&to_lock->_locktype); \
> + \
> + if (ret) \
> + lock_result = ERR_PTR(ret); \
> + else \
> + lock_result = \
> + (struct _locktype##_acquire \
> + *)&to_lock->_locktype; \
> + lock_result; \
> + }), \
> + struct _locktype##_acquire *to_lock)
>
> #endif /* _LINUX_CLEANUP_H */
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index 2143d05116be..283111f43b0f 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -64,6 +64,8 @@ do { \
> __mutex_init((mutex), #mutex, &__key); \
> } while (0)
>
> +#define mutex_acquire_init(lock) mutex_init(&(lock)->mutex)
> +
> /**
> * mutex_init_with_key - initialize a mutex with a given lockdep key
> * @mutex: the mutex to be initialized
> @@ -202,6 +204,28 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
> DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
> DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
>
> +/* mutex type that only implements scope-based unlock */
> +struct mutex_acquire {
> + /* private: */
> + struct mutex mutex;
> +};
> +DEFINE_GUARD(mutex_acquire, struct mutex_acquire *, mutex_lock(&_T->mutex),
> + mutex_unlock(&_T->mutex))
> +DEFINE_GUARD_COND(mutex_acquire, _try, mutex_trylock(&_T->mutex))
> +DEFINE_GUARD_COND(mutex_acquire, _intr, mutex_lock_interruptible(&_T->mutex) == 0)
> +DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
> + mutex_lock_interruptible)
> +
> +static inline int mutex_try_or_busy(struct mutex *lock)
> +{
> + int ret[] = { -EBUSY, 0 };
> +
> + return ret[mutex_trylock(lock)];
> +}
> +
> +DEFINE_ACQUIRE(mutex_try_acquire, mutex, mutex_unlock,
> + mutex_try_or_busy)
> +
> extern unsigned long mutex_get_owner(struct mutex *lock);
>
> #endif /* __LINUX_MUTEX_H */
I'm terribly confused...
What's wrong with:
CLASS(mutex_intr, lock)(&foo);
if (IS_ERR(__guard_ptr(mutex_intr)(lock)))
return __guard_ptr(mutex_intr)(lock);
I mean, yeah __guard_ptr(mutex_intr) doesn't really roll off the tongue,
but if that is the whole objection, surely we can try and fix that bit
instead of building an entire parallel set of primitives.
Notably, you're going to be running into trouble the moment you want to
use your acquire stuff on things like raw_spin_trylock_irqsave(),
because all that already wraps the return type, in order to hide the
flags thing etc.
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-07 9:32 ` Peter Zijlstra
@ 2025-05-07 21:18 ` Dan Williams
2025-05-08 11:00 ` Peter Zijlstra
0 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2025-05-07 21:18 UTC (permalink / raw)
To: Peter Zijlstra, Dan Williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
Peter Zijlstra wrote:
[..]
> > @@ -202,6 +204,28 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
> > DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
> > DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
> >
> > +/* mutex type that only implements scope-based unlock */
> > +struct mutex_acquire {
> > + /* private: */
> > + struct mutex mutex;
> > +};
> > +DEFINE_GUARD(mutex_acquire, struct mutex_acquire *, mutex_lock(&_T->mutex),
> > + mutex_unlock(&_T->mutex))
> > +DEFINE_GUARD_COND(mutex_acquire, _try, mutex_trylock(&_T->mutex))
> > +DEFINE_GUARD_COND(mutex_acquire, _intr, mutex_lock_interruptible(&_T->mutex) == 0)
> > +DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
> > + mutex_lock_interruptible)
> > +
> > +static inline int mutex_try_or_busy(struct mutex *lock)
> > +{
> > + int ret[] = { -EBUSY, 0 };
> > +
> > + return ret[mutex_trylock(lock)];
> > +}
> > +
> > +DEFINE_ACQUIRE(mutex_try_acquire, mutex, mutex_unlock,
> > + mutex_try_or_busy)
> > +
> > extern unsigned long mutex_get_owner(struct mutex *lock);
> >
> > #endif /* __LINUX_MUTEX_H */
>
> I'm terribly confused...
I suspect the disconnect is that this proposal adds safety where guard()
does not today. That was driven by the mistake that Linus caught in the
RFC [1]
at the same time I note that your patch is horribly broken. Look
at your change to drivers/cxl/core/mbox.c: you made it use
+ struct mutex *lock __drop(mutex_unlock) =
+ mutex_intr_acquire(&mds->poison.lock);
but then you didn't remove the existing unlocking, so that
function still has
[1]: http://lore.kernel.org/CAHk-=wgRPDGvofj1PU=NemF6iFu308pFZ=w5P+sQyOMGd978fA@mail.gmail.com
I.e. in my haste I forgot to clean up a straggling open-coded
mutex_unlock(), but that is something the compiler warns about iff we
switch to the parallel primitive universe.
> What's wrong with:
>
> CLASS(mutex_intr, lock)(&foo);
> if (IS_ERR(__guard_ptr(mutex_intr)(lock)))
> return __guard_ptr(mutex_intr)(lock);
__guard_ptr() returns NULL on error, not an ERR_PTR, but I get the gist.
> I mean, yeah __guard_ptr(mutex_intr) doesn't really roll of the tongue,
> but if that is the whole objection, surely we can try and fix that bit
> instead of building an entire parallel set of primitives.
Yes, the "entire set of parallel primitives" was the least confident
part of this proposal, but the more I look, the more it seems a feature
(albeit inelegant), not a bug.
Today one can write:
guard(mutex_intr)(&lock);
...
	mutex_unlock(&lock);
...and the compiler does not tell you that the lock may not even be held
upon return, nor that this is unlocking a lock that will also be
unlocked when @lock goes out of scope.
The only type safety today is the BUILD_BUG_ON() in scoped_cond_guard()
when passing in the wrong lock class.
So the proposal is, if you know what you are doing, or have a need to
switch back and forth between scope-based and explicit unlock for a given
lock, use the base primitives. If instead you want to fully convert to
scope-based lock management (excise all explicit unlock() calls) *and*
you want the compiler to validate the conversion, switch to the _acquire
parallel universe.
> Notably, you're going to be running into trouble the moment you want to
> use your acquire stuff on things like raw_spin_trylock_irqsave(),
> because all that already wraps the return type, in order to hide the
> flags thing etc.
I think that is solvable, but only with a new DEFINE_LOCK_GUARD_1() that
knows that the @lock member of class_##name##_t needs to be cast to the
base lock type.
* Re: [PATCH 6/7] cxl/region: Introduce CLASS(cxl_decoder_detach...) consolidate multiple paths
2025-05-07 7:21 ` [PATCH 6/7] cxl/region: Introduce CLASS(cxl_decoder_detach...) consolidate multiple paths Dan Williams
@ 2025-05-08 7:44 ` kernel test robot
0 siblings, 0 replies; 29+ messages in thread
From: kernel test robot @ 2025-05-08 7:44 UTC (permalink / raw)
To: Dan Williams, linux-cxl
Cc: llvm, oe-kbuild-all, linux-kernel, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
Hi Dan,
kernel test robot noticed the following build warnings:
[auto build test WARNING on b4432656b36e5cc1d50a1f2dc15357543add530e]
url: https://github.com/intel-lab-lkp/linux/commits/Dan-Williams/cleanup-Introduce-DEFINE_ACQUIRE-a-CLASS-for-conditional-locking/20250507-152728
base: b4432656b36e5cc1d50a1f2dc15357543add530e
patch link: https://lore.kernel.org/r/20250507072145.3614298-7-dan.j.williams%40intel.com
patch subject: [PATCH 6/7] cxl/region: Introduce CLASS(cxl_decoder_detach...) consolidate multiple paths
config: um-allmodconfig (https://download.01.org/0day-ci/archive/20250508/202505081548.9HldF62Y-lkp@intel.com/config)
compiler: clang version 19.1.7 (https://github.com/llvm/llvm-project cd708029e0b2869e80abe31ddb175f7c35361f90)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250508/202505081548.9HldF62Y-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505081548.9HldF62Y-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from drivers/cxl/core/port.c:9:
In file included from include/linux/pci.h:38:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from arch/um/include/asm/hardirq.h:5:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:12:
In file included from arch/um/include/asm/io.h:24:
include/asm-generic/io.h:549:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
549 | val = __raw_readb(PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:567:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
567 | val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
| ~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
37 | #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
| ^
include/asm-generic/io.h:585:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
585 | val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
| ~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
35 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
| ^
include/asm-generic/io.h:601:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
601 | __raw_writeb(value, PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:616:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
616 | __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:631:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
631 | __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:724:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
724 | readsb(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:737:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
737 | readsw(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:750:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
750 | readsl(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:764:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
764 | writesb(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:778:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
778 | writesw(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:792:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
792 | writesl(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
In file included from drivers/cxl/core/port.c:17:
>> drivers/cxl/core/core.h:64:1: warning: non-void function does not return a value [-Wreturn-type]
64 | }
| ^
13 warnings generated.
vim +64 drivers/cxl/core/core.h
79ed8367834ee8 Dan Williams 2025-05-07 28
79ed8367834ee8 Dan Williams 2025-05-07 29 struct cxl_region *cxl_decoder_detach(struct cxl_region *cxlr,
79ed8367834ee8 Dan Williams 2025-05-07 30 struct cxl_endpoint_decoder *cxled,
79ed8367834ee8 Dan Williams 2025-05-07 31 int pos, enum cxl_detach_mode mode);
79ed8367834ee8 Dan Williams 2025-05-07 32
779dd20cfb56c5 Ben Widawsky 2021-06-08 33 #define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
8d48817df6ac20 Dan Williams 2021-06-15 34 #define CXL_REGION_TYPE(x) (&cxl_region_type)
779dd20cfb56c5 Ben Widawsky 2021-06-08 35 #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
04ad63f086d1a9 Dan Williams 2022-01-11 36 #define CXL_PMEM_REGION_TYPE(x) (&cxl_pmem_region_type)
09d09e04d2fcf8 Dan Williams 2023-02-10 37 #define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type)
8d48817df6ac20 Dan Williams 2021-06-15 38 int cxl_region_init(void);
8d48817df6ac20 Dan Williams 2021-06-15 39 void cxl_region_exit(void);
f0832a58639691 Alison Schofield 2023-04-18 40 int cxl_get_poison_by_endpoint(struct cxl_port *port);
b98d042698a325 Alison Schofield 2024-04-30 41 struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
9aa5f6235e16ac Alison Schofield 2024-07-02 42 u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
86954ff5032d9d Alison Schofield 2024-04-30 43 u64 dpa);
b98d042698a325 Alison Schofield 2024-04-30 44
779dd20cfb56c5 Ben Widawsky 2021-06-08 45 #else
9aa5f6235e16ac Alison Schofield 2024-07-02 46 static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
9aa5f6235e16ac Alison Schofield 2024-07-02 47 const struct cxl_memdev *cxlmd, u64 dpa)
86954ff5032d9d Alison Schofield 2024-04-30 48 {
86954ff5032d9d Alison Schofield 2024-04-30 49 return ULLONG_MAX;
86954ff5032d9d Alison Schofield 2024-04-30 50 }
b98d042698a325 Alison Schofield 2024-04-30 51 static inline
b98d042698a325 Alison Schofield 2024-04-30 52 struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
b98d042698a325 Alison Schofield 2024-04-30 53 {
b98d042698a325 Alison Schofield 2024-04-30 54 return NULL;
b98d042698a325 Alison Schofield 2024-04-30 55 }
f0832a58639691 Alison Schofield 2023-04-18 56 static inline int cxl_get_poison_by_endpoint(struct cxl_port *port)
f0832a58639691 Alison Schofield 2023-04-18 57 {
f0832a58639691 Alison Schofield 2023-04-18 58 return 0;
f0832a58639691 Alison Schofield 2023-04-18 59 }
79ed8367834ee8 Dan Williams 2025-05-07 60 static inline struct cxl_region *
79ed8367834ee8 Dan Williams 2025-05-07 61 cxl_decoder_detach(struct cxl_region *cxlr, struct cxl_endpoint_decoder *cxled,
79ed8367834ee8 Dan Williams 2025-05-07 62 int pos, enum cxl_detach_mode mode)
b9686e8c8e39d4 Dan Williams 2022-06-04 63 {
b9686e8c8e39d4 Dan Williams 2022-06-04 @64 }
8d48817df6ac20 Dan Williams 2021-06-15 65 static inline int cxl_region_init(void)
8d48817df6ac20 Dan Williams 2021-06-15 66 {
8d48817df6ac20 Dan Williams 2021-06-15 67 return 0;
8d48817df6ac20 Dan Williams 2021-06-15 68 }
8d48817df6ac20 Dan Williams 2021-06-15 69 static inline void cxl_region_exit(void)
8d48817df6ac20 Dan Williams 2021-06-15 70 {
8d48817df6ac20 Dan Williams 2021-06-15 71 }
779dd20cfb56c5 Ben Widawsky 2021-06-08 72 #define CXL_REGION_ATTR(x) NULL
8d48817df6ac20 Dan Williams 2021-06-15 73 #define CXL_REGION_TYPE(x) NULL
779dd20cfb56c5 Ben Widawsky 2021-06-08 74 #define SET_CXL_REGION_ATTR(x)
04ad63f086d1a9 Dan Williams 2022-01-11 75 #define CXL_PMEM_REGION_TYPE(x) NULL
09d09e04d2fcf8 Dan Williams 2023-02-10 76 #define CXL_DAX_REGION_TYPE(x) NULL
779dd20cfb56c5 Ben Widawsky 2021-06-08 77 #endif
779dd20cfb56c5 Ben Widawsky 2021-06-08 78
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-07 21:18 ` Dan Williams
@ 2025-05-08 11:00 ` Peter Zijlstra
2025-05-09 5:04 ` Dan Williams
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2025-05-08 11:00 UTC (permalink / raw)
To: Dan Williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Wed, May 07, 2025 at 02:18:25PM -0700, Dan Williams wrote:
> Peter Zijlstra wrote:
> [..]
> > > @@ -202,6 +204,28 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
> > > DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
> > > DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
> > >
> > > +/* mutex type that only implements scope-based unlock */
> > > +struct mutex_acquire {
> > > + /* private: */
> > > + struct mutex mutex;
> > > +};
> > > +DEFINE_GUARD(mutex_acquire, struct mutex_acquire *, mutex_lock(&_T->mutex),
> > > + mutex_unlock(&_T->mutex))
> > > +DEFINE_GUARD_COND(mutex_acquire, _try, mutex_trylock(&_T->mutex))
> > > +DEFINE_GUARD_COND(mutex_acquire, _intr, mutex_lock_interruptible(&_T->mutex) == 0)
> > > +DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
> > > + mutex_lock_interruptible)
> > > +
> > > +static inline int mutex_try_or_busy(struct mutex *lock)
> > > +{
> > > + int ret[] = { -EBUSY, 0 };
> > > +
> > > + return ret[mutex_trylock(lock)];
> > > +}
> > > +
> > > +DEFINE_ACQUIRE(mutex_try_acquire, mutex, mutex_unlock,
> > > + mutex_try_or_busy)
> > > +
> > > extern unsigned long mutex_get_owner(struct mutex *lock);
> > >
> > > #endif /* __LINUX_MUTEX_H */
> >
> > I'm terribly confused...
>
> I suspect the disconnect is that this proposal adds safety where guard()
> does not today. That was driven by the mistake that Linus caught in the
> RFC [1]
>
> at the same time I note that your patch is horribly broken. Look
> at your change to drivers/cxl/core/mbox.c: you made it use
>
> + struct mutex *lock __drop(mutex_unlock) =
> + mutex_intr_acquire(&mds->poison.lock);
>
> but then you didn't remove the existing unlocking, so that
> function still has
>
> [1]: http://lore.kernel.org/CAHk-=wgRPDGvofj1PU=NemF6iFu308pFZ=w5P+sQyOMGd978fA@mail.gmail.com
>
> I.e. in my haste I forgot to cleanup a straggling open-coded
> mutex_unlock(), but that is something the compiler warns about iff we
> switch to parallel primitive universe.
AFAICT all you've done is an effective rename. You can still write:
mutex_unlock(&foo->lock.lock);
and the compiler will again happily accept it.
> > What's wrong with:
> >
> > CLASS(mutex_intr, lock)(&foo);
> > if (IS_ERR(__guard_ptr(mutex_intr)(lock)))
> > return __guard_ptr(mutex_intr)(lock);
>
> __guard_ptr() returns NULL on error, not an ERR_PTR, but I get the gist.
(Yeah, I realized after sending, but didn't want to self-reply for
something minor like that :-)
> > I mean, yeah __guard_ptr(mutex_intr) doesn't really roll off the tongue,
> > but if that is the whole objection, surely we can try and fix that bit
> > instead of building an entire parallel set of primitives.
>
> Yes, the "entire set of parallel primitives" was the least confident
> part of this proposal, but the more I look, that is a feature (albeit
> inelegant) not a bug.
>
> Today one can write:
>
> guard(mutex_intr)(&lock);
> ...
> 	mutex_unlock(&lock);
>
> ...and the compiler does not tell you that the lock may not even be held
> upon return, nor that this is unlocking a lock that will also be
> unlocked when @lock goes out of scope.
>
> The only type safety today is the BUILD_BUG_ON() in scoped_cond_guard()
> when passing in the wrong lock class.
>
> So the proposal is, if you know what you are doing, or have a need to
> > switch back and forth between scope-based and explicit unlock for a given
> lock, use the base primitives. If instead you want to fully convert to
> scope-based lock management (excise all explicit unlock() calls) *and*
> you want the compiler to validate the conversion, switch to the _acquire
> parallel universe.
As with all refactoring ever, the rename trick always works. But I don't
think that warrants building a parallel infrastructure just for that.
Specifically, it very much does not disallow calling mutex_unlock() on
your new member. So all you get is some compiler help during refactor,
and again, just rename the lock member already.
Also, if we ever actually get LLVM's Thread Safety Analysis working,
that will help us with all these problems:
https://lore.kernel.org/all/20250304092417.2873893-1-elver@google.com/
But the compiler needs a little more work to grok C :-)
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-08 11:00 ` Peter Zijlstra
@ 2025-05-09 5:04 ` Dan Williams
2025-05-09 10:40 ` Peter Zijlstra
0 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2025-05-09 5:04 UTC (permalink / raw)
To: Peter Zijlstra, Dan Williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
Peter Zijlstra wrote:
[..]
> > So the proposal is, if you know what you are doing, or have a need to
> > switch back and forth between scope-based and explicit unlock for a give
> > lock, use the base primitives. If instead you want to fully convert to
> > scope-based lock management (excise all explicit unlock() calls) *and*
> > you want the compiler to validate the conversion, switch to the _acquire
> > parallel universe.
>
> As with all refactoring ever, the rename trick always works. But I don't
> think that warrants building a parallel infrastructure just for that.
>
> Specifically, it very much does not disallow calling mutex_unlock() on
> your new member. So all you get is some compiler help during refactor,
> and again, just rename the lock member already.
>
> Also, if we ever actually get LLVM's Thread Safety Analysis working,
> that will help us with all these problems:
>
> https://lore.kernel.org/all/20250304092417.2873893-1-elver@google.com/
That looks lovely.
> But the compiler needs a little more work go grok C :-)
Ok, here is a last shot that incorporates all the feedback:
1/ Conceptually no need for a new CLASS() save for the fact that
__guard_ptr() returns NULL on failure, not an ERR_PTR().
2/ The rename trick is not true type safety, especially if it leads to
parallel universe of primitives, but it is a useful trick.
3/ "IS_ERR(__guard_ptr(mutex_intr)(lock))" is a mouthful; it would be nice
to have something more succinct while maintaining some safety.
That leads me to a scheme like the following:
DEFINE_GUARD_ERR(mutex, _intr, mutex_lock_interruptible(_T))
...
ACQUIRE(mutex_intr, lock)(&obj->lock);
if (IS_ERR(lock))
return PTR_ERR(lock);
That guard class differs from mutex_intr in that it returns an
ERR_PTR(). Some type safety is provided by ACQUIRE(), which looks for the
"mutex_intr_err" class, not "mutex_intr".
The rename trick can be opt-in for helping to validate refactoring, but
the expectation is that something like Thread Safety Analysis should
make that rename trick moot. In the meantime the code keeps all the
same-named primitives, so there is only ever one primitive universe to handle.
I am open to making the rule be that "#define MUTEX_ACQUIRE" never ships
upstream; it's only a local hack during code refactoring to check
assumptions.
-- 8< --
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index d72764056ce6..b767d3e8f9c7 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1,9 +1,10 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
+#define MUTEX_ACQUIRE
+#include <linux/mutex.h>
#include <linux/security.h>
#include <linux/debugfs.h>
#include <linux/ktime.h>
-#include <linux/mutex.h>
#include <linux/unaligned.h>
#include <cxlpci.h>
#include <cxlmem.h>
@@ -1394,9 +1395,9 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
int nr_records = 0;
int rc;
- rc = mutex_lock_interruptible(&mds->poison.lock);
- if (rc)
- return rc;
+ ACQUIRE(mutex_intr, lock)(&mds->poison.lock);
+ if (IS_ERR(lock))
+ return PTR_ERR(lock);
po = mds->poison.list_out;
pi.offset = cpu_to_le64(offset);
@@ -1430,7 +1431,6 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
}
} while (po->flags & CXL_POISON_FLAG_MORE);
- mutex_unlock(&mds->poison.lock);
return rc;
}
EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, "CXL");
@@ -1466,7 +1466,7 @@ int cxl_poison_state_init(struct cxl_memdev_state *mds)
return rc;
}
- mutex_init(&mds->poison.lock);
+ mutex_init(&mds->poison.lock.mutex);
return 0;
}
EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, "CXL");
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 3ec6b906371b..9b4ab5d1a7c4 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -257,7 +257,7 @@ struct cxl_poison_state {
u32 max_errors;
DECLARE_BITMAP(enabled_cmds, CXL_POISON_ENABLED_MAX);
struct cxl_mbox_poison_out *list_out;
- struct mutex lock; /* Protect reads of poison list */
+ struct mutex_acquire lock; /* Protect reads of poison list */
};
/*
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index 7e57047e1564..403947d2537e 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -291,16 +291,21 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \
#define __DEFINE_CLASS_IS_CONDITIONAL(_name, _is_cond) \
static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
+#define __DEFINE_CLASS_IS_ERR_PTR(_name, _is_err) \
+static __maybe_unused const bool class_##_name##_is_err_ptr = _is_err
+
#define __DEFINE_GUARD_LOCK_PTR(_name, _exp) \
static inline void * class_##_name##_lock_ptr(class_##_name##_t *_T) \
{ return (void *)(__force unsigned long)*(_exp); }
-#define DEFINE_CLASS_IS_GUARD(_name) \
+#define DEFINE_CLASS_IS_GUARD(_name) \
__DEFINE_CLASS_IS_CONDITIONAL(_name, false); \
+ __DEFINE_CLASS_IS_ERR_PTR(_name, false); \
__DEFINE_GUARD_LOCK_PTR(_name, _T)
-#define DEFINE_CLASS_IS_COND_GUARD(_name) \
+#define DEFINE_CLASS_IS_COND_GUARD(_name) \
__DEFINE_CLASS_IS_CONDITIONAL(_name, true); \
+ __DEFINE_CLASS_IS_ERR_PTR(_name, false); \
__DEFINE_GUARD_LOCK_PTR(_name, _T)
#define DEFINE_GUARD(_name, _type, _lock, _unlock) \
@@ -309,6 +314,7 @@ static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
#define DEFINE_GUARD_COND(_name, _ext, _condlock) \
__DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
+ __DEFINE_CLASS_IS_ERR_PTR(_name##_ext, false); \
EXTEND_CLASS(_name, _ext, \
({ void *_t = _T; if (_T && !(_condlock)) _t = NULL; _t; }), \
class_##_name##_t _T) \
@@ -320,6 +326,7 @@ static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
#define __guard_ptr(_name) class_##_name##_lock_ptr
#define __is_cond_ptr(_name) class_##_name##_is_conditional
+#define __is_guard_err_ptr(_name) class_##_name##_is_err_ptr
/*
* Helper macro for scoped_guard().
@@ -346,6 +353,7 @@ _label: \
for (CLASS(_name, scope)(args); true; ({ goto _label; })) \
if (!__guard_ptr(_name)(&scope)) { \
BUILD_BUG_ON(!__is_cond_ptr(_name)); \
+ BUILD_BUG_ON(__is_guard_err_ptr(_name)); \
_fail; \
_label: \
break; \
@@ -424,5 +432,43 @@ __DEFINE_LOCK_GUARD_0(_name, _lock)
static inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
{ return class_##_name##_lock_ptr(_T); }
+#define EXTEND_CLASS_ERR(_name, ext, _init, _init_args...) \
+ typedef class_##_name##_t class_##_name##ext##_err_t; \
+ static inline void class_##_name##ext##_err_destructor( \
+ class_##_name##_t *p) \
+ { \
+ /* base destructors do not expect ERR_PTR */ \
+ if (IS_ERR(p)) \
+ p = NULL; \
+ class_##_name##_destructor(p); \
+ } \
+ static inline class_##_name##_t class_##_name##ext##_err_constructor( \
+ _init_args) \
+ { \
+ class_##_name##_t t = _init; \
+ return t; \
+ }
+
+#define DEFINE_GUARD_ERR(_name, _ext, _condlock) \
+ __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext##_err, true); \
+ __DEFINE_CLASS_IS_ERR_PTR(_name##_ext##_err, true); \
+ EXTEND_CLASS_ERR(_name, _ext, ({ \
+ void *_t = _T; \
+ int err = _condlock; \
+ \
+ if (err) \
+ _t = ERR_PTR(err); \
+ _t; \
+ }), \
+ class_##_name##_t _T) \
+ static inline void *class_##_name##_ext##_err_lock_ptr( \
+ class_##_name##_t *_T) \
+ { \
+ return class_##_name##_lock_ptr(_T); \
+ }
+
+#define ACQUIRE(_name, var) \
+ class_##_name##_err_t var __cleanup(class_##_name##_err_destructor) = \
+ class_##_name##_err_constructor
#endif /* _LINUX_CLEANUP_H */
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2143d05116be..bd4b449ea6bd 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -198,9 +198,37 @@ extern void mutex_unlock(struct mutex *lock);
extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
+/*
+ * For subsystems that want to use the "rename trick", so that the type
+ * system validates a full conversion to scope-based unlock, define
+ * MUTEX_ACQUIRE before including mutex.h.
+ */
+struct mutex_acquire {
+ /* private: */
+ struct mutex mutex;
+};
+
+static inline int mutex_try_or_busy(struct mutex *lock)
+{
+ int ret[] = { -EBUSY, 0 };
+
+ return ret[mutex_trylock(lock)];
+}
+
+#ifndef MUTEX_ACQUIRE
DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
+DEFINE_GUARD_ERR(mutex, _intr, mutex_lock_interruptible(_T))
+DEFINE_GUARD_ERR(mutex, _try, mutex_try_or_busy(_T))
+#else
+DEFINE_GUARD(mutex, struct mutex_acquire *, mutex_lock(&_T->mutex),
+ mutex_unlock(&_T->mutex))
+DEFINE_GUARD_COND(mutex, _try, mutex_trylock(&_T->mutex))
+DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(&_T->mutex) == 0)
+DEFINE_GUARD_ERR(mutex, _intr, mutex_lock_interruptible(&_T->mutex))
+DEFINE_GUARD_ERR(mutex, _try, mutex_try_or_busy(&_T->mutex))
+#endif
extern unsigned long mutex_get_owner(struct mutex *lock);
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-09 5:04 ` Dan Williams
@ 2025-05-09 10:40 ` Peter Zijlstra
2025-05-10 1:11 ` dan.j.williams
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2025-05-09 10:40 UTC (permalink / raw)
To: Dan Williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Thu, May 08, 2025 at 10:04:32PM -0700, Dan Williams wrote:
> Peter Zijlstra wrote:
> [..]
> > > So the proposal is, if you know what you are doing, or have a need to
> > > switch back and forth between scope-based and explicit unlock for a given
> > > lock, use the base primitives. If instead you want to fully convert to
> > > scope-based lock management (excise all explicit unlock() calls) *and*
> > > you want the compiler to validate the conversion, switch to the _acquire
> > > parallel universe.
> >
> > As with all refactoring ever, the rename trick always works. But I don't
> > think that warrants building a parallel infrastructure just for that.
> >
> > Specifically, it very much does not disallow calling mutex_unlock() on
> > your new member. So all you get is some compiler help during refactor,
> > and again, just rename the lock member already.
> >
> > Also, if we ever actually get LLVM's Thread Safety Analysis working,
> > that will help us with all these problems:
> >
> > https://lore.kernel.org/all/20250304092417.2873893-1-elver@google.com/
>
> That looks lovely.
>
> > But the compiler needs a little more work to grok C :-)
>
> Ok, here is a last shot that incorporates all the feedback:
>
> 1/ Conceptually no need for a new CLASS() save for the fact that
> __guard_ptr() returns NULL on failure, not an ERR_PTR().
>
> 2/ The rename trick is not true type safety, especially if it leads to
> parallel universe of primitives, but it is a useful trick.
>
> 3/ "IS_ERR(__guard_ptr(mutex_intr)(lock))" is a mouthful, would be nice
> to have something more succinct while maintaining some safety.
>
> That leads me to a scheme like the following:
>
> DEFINE_GUARD_ERR(mutex, _intr, mutex_lock_interruptible(_T))
> ...
> ACQUIRE(mutex_intr, lock)(&obj->lock);
> if (IS_ERR(lock))
> return PTR_ERR(lock);
Urgh.. can you live with something like this?
---
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index d72764056ce6..6b0ca400b393 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1394,8 +1394,8 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
int nr_records = 0;
int rc;
- rc = mutex_lock_interruptible(&mds->poison.lock);
- if (rc)
+ ACQUIRE(mutex_intr, lock)(&mds->poison.poison_lock);
+ if ((rc = ACQUIRE_ERR(mutex_intr, &lock)))
return rc;
po = mds->poison.list_out;
@@ -1430,7 +1430,6 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
}
} while (po->flags & CXL_POISON_FLAG_MORE);
- mutex_unlock(&mds->poison.lock);
return rc;
}
EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, "CXL");
@@ -1466,7 +1465,7 @@ int cxl_poison_state_init(struct cxl_memdev_state *mds)
return rc;
}
- mutex_init(&mds->poison.lock);
+ mutex_init(&mds->poison.poison_lock);
return 0;
}
EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, "CXL");
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 3ec6b906371b..85a160c778ae 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -257,7 +257,7 @@ struct cxl_poison_state {
u32 max_errors;
DECLARE_BITMAP(enabled_cmds, CXL_POISON_ENABLED_MAX);
struct cxl_mbox_poison_out *list_out;
- struct mutex lock; /* Protect reads of poison list */
+ struct mutex poison_lock; /* Protect reads of poison list */
};
/*
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index 7093e1d08af0..c0439fd63915 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -3,6 +3,8 @@
#define _LINUX_CLEANUP_H
#include <linux/compiler.h>
+#include <linux/err.h>
+#include <linux/args.h>
/**
* DOC: scope-based cleanup helpers
@@ -323,23 +325,40 @@ static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
__DEFINE_GUARD_LOCK_PTR(_name, _T)
#define DEFINE_GUARD(_name, _type, _lock, _unlock) \
- DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \
+ DEFINE_CLASS(_name, _type, if (!IS_ERR_OR_NULL(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
DEFINE_CLASS_IS_GUARD(_name)
-#define DEFINE_GUARD_COND(_name, _ext, _condlock) \
+#define DEFINE_GUARD_COND_4(_name, _ext, _lock, _cond) \
__DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
EXTEND_CLASS(_name, _ext, \
- ({ void *_t = _T; if (_T && !(_condlock)) _t = NULL; _t; }), \
+ ({ void *_t = _T; int _RET = (_lock); if (_T && !(_cond)) _t = ERR_PTR(_RET); _t; }), \
class_##_name##_t _T) \
static inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
{ return class_##_name##_lock_ptr(_T); }
+/*
+ * Default binary condition; success on 'true'.
+ */
+#define DEFINE_GUARD_COND_3(_name, _ext, _lock) \
+ DEFINE_GUARD_COND_4(_name, _ext, _lock, _RET)
+
+#define DEFINE_GUARD_COND(X...) CONCATENATE(DEFINE_GUARD_COND_, COUNT_ARGS(X))(X)
+
#define guard(_name) \
CLASS(_name, __UNIQUE_ID(guard))
#define __guard_ptr(_name) class_##_name##_lock_ptr
#define __is_cond_ptr(_name) class_##_name##_is_conditional
+#define ACQUIRE(_name, _var) \
+ CLASS(_name, _var)
+
+#define ACQUIRE_ERR(_name, _var) \
+ ({ long _rc = PTR_ERR(__guard_ptr(_name)(_var)); \
+ if (!_rc) _rc = -EBUSY; \
+ if (!IS_ERR_VALUE(_rc)) _rc = 0; \
+ _rc; })
+
/*
* Helper macro for scoped_guard().
*
@@ -401,7 +420,7 @@ typedef struct { \
\
static inline void class_##_name##_destructor(class_##_name##_t *_T) \
{ \
- if (_T->lock) { _unlock; } \
+ if (!IS_ERR_OR_NULL(_T->lock)) { _unlock; } \
} \
\
__DEFINE_GUARD_LOCK_PTR(_name, &_T->lock)
@@ -433,15 +452,20 @@ __DEFINE_CLASS_IS_CONDITIONAL(_name, false); \
__DEFINE_UNLOCK_GUARD(_name, void, _unlock, __VA_ARGS__) \
__DEFINE_LOCK_GUARD_0(_name, _lock)
-#define DEFINE_LOCK_GUARD_1_COND(_name, _ext, _condlock) \
+#define DEFINE_LOCK_GUARD_1_COND_4(_name, _ext, _lock, _cond) \
__DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
EXTEND_CLASS(_name, _ext, \
({ class_##_name##_t _t = { .lock = l }, *_T = &_t;\
- if (_T->lock && !(_condlock)) _T->lock = NULL; \
+ int _RET = (_lock); \
+ if (_T->lock && !(_cond)) _T->lock = ERR_PTR(_RET);\
_t; }), \
typeof_member(class_##_name##_t, lock) l) \
static inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
{ return class_##_name##_lock_ptr(_T); }
+#define DEFINE_LOCK_GUARD_1_COND_3(_name, _ext, _lock) \
+ DEFINE_LOCK_GUARD_1_COND_4(_name, _ext, _lock, _RET);
+
+#define DEFINE_LOCK_GUARD_1_COND(X...) CONCATENATE(DEFINE_LOCK_GUARD_1_COND_, COUNT_ARGS(X))(X)
#endif /* _LINUX_CLEANUP_H */
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2143d05116be..232fdde82bbb 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -200,7 +200,7 @@ extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
-DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
+DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T), _RET == 0)
extern unsigned long mutex_get_owner(struct mutex *lock);
diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index c8b543d428b0..c810deb88d13 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -240,7 +240,7 @@ extern void up_write(struct rw_semaphore *sem);
DEFINE_GUARD(rwsem_read, struct rw_semaphore *, down_read(_T), up_read(_T))
DEFINE_GUARD_COND(rwsem_read, _try, down_read_trylock(_T))
-DEFINE_GUARD_COND(rwsem_read, _intr, down_read_interruptible(_T) == 0)
+DEFINE_GUARD_COND(rwsem_read, _intr, down_read_interruptible(_T), _RET == 0)
DEFINE_GUARD(rwsem_write, struct rw_semaphore *, down_write(_T), up_write(_T))
DEFINE_GUARD_COND(rwsem_write, _try, down_write_trylock(_T))
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-07 7:21 ` [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking Dan Williams
2025-05-07 9:32 ` Peter Zijlstra
@ 2025-05-09 19:10 ` kernel test robot
1 sibling, 0 replies; 29+ messages in thread
From: kernel test robot @ 2025-05-09 19:10 UTC (permalink / raw)
To: Dan Williams, linux-cxl
Cc: oe-kbuild-all, linux-kernel, David Lechner, Peter Zijlstra,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
Hi Dan,
kernel test robot noticed the following build errors:
[auto build test ERROR on b4432656b36e5cc1d50a1f2dc15357543add530e]
url: https://github.com/intel-lab-lkp/linux/commits/Dan-Williams/cleanup-Introduce-DEFINE_ACQUIRE-a-CLASS-for-conditional-locking/20250507-152728
base: b4432656b36e5cc1d50a1f2dc15357543add530e
patch link: https://lore.kernel.org/r/20250507072145.3614298-2-dan.j.williams%40intel.com
patch subject: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
config: arm-randconfig-001-20250509 (https://download.01.org/0day-ci/archive/20250510/202505100206.85k3mymM-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250510/202505100206.85k3mymM-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505100206.85k3mymM-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
In file included from include/linux/irqflags.h:17,
from arch/arm/include/asm/bitops.h:28,
from include/linux/bitops.h:68,
from arch/arm/boot/compressed/../../../../lib/lz4/lz4defs.h:40,
from arch/arm/boot/compressed/../../../../lib/lz4/lz4_decompress.c:36,
from arch/arm/boot/compressed/../../../../lib/decompress_unlz4.c:10,
from arch/arm/boot/compressed/decompress.c:60:
include/linux/mutex.h: In function 'class_mutex_intr_acquire_destructor':
>> include/linux/cleanup.h:476:13: error: implicit declaration of function 'IS_ERR_OR_NULL' [-Werror=implicit-function-declaration]
if (!IS_ERR_OR_NULL(_T)) _unlock(&_T->_locktype), ({ \
^~~~~~~~~~~~~~
include/linux/cleanup.h:246:18: note: in definition of macro 'DEFINE_CLASS'
{ _type _T = *p; _exit; } \
^~~~~
include/linux/mutex.h:216:1: note: in expansion of macro 'DEFINE_ACQUIRE'
DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
^~~~~~~~~~~~~~
include/linux/mutex.h: In function 'class_mutex_intr_acquire_constructor':
>> include/linux/cleanup.h:481:24: error: implicit declaration of function 'ERR_PTR'; did you mean 'PERCPU_PTR'? [-Werror=implicit-function-declaration]
lock_result = ERR_PTR(ret); \
^~~~~~~
include/linux/cleanup.h:248:13: note: in definition of macro 'DEFINE_CLASS'
{ _type t = _init; return t; }
^~~~~
include/linux/mutex.h:216:1: note: in expansion of macro 'DEFINE_ACQUIRE'
DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
^~~~~~~~~~~~~~
>> include/linux/cleanup.h:481:22: warning: assignment to 'struct mutex_acquire *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
lock_result = ERR_PTR(ret); \
^
include/linux/cleanup.h:248:13: note: in definition of macro 'DEFINE_CLASS'
{ _type t = _init; return t; }
^~~~~
include/linux/mutex.h:216:1: note: in expansion of macro 'DEFINE_ACQUIRE'
DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
^~~~~~~~~~~~~~
include/linux/mutex.h: In function 'class_mutex_try_acquire_constructor':
>> include/linux/cleanup.h:481:22: warning: assignment to 'struct mutex_acquire *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
lock_result = ERR_PTR(ret); \
^
include/linux/cleanup.h:248:13: note: in definition of macro 'DEFINE_CLASS'
{ _type t = _init; return t; }
^~~~~
include/linux/mutex.h:226:1: note: in expansion of macro 'DEFINE_ACQUIRE'
DEFINE_ACQUIRE(mutex_try_acquire, mutex, mutex_unlock,
^~~~~~~~~~~~~~
In file included from include/linux/rwsem.h:17,
from include/linux/mm_types.h:13,
from include/linux/mmzone.h:22,
from include/linux/gfp.h:7,
from include/linux/umh.h:4,
from include/linux/kmod.h:9,
from include/linux/module.h:17,
from arch/arm/boot/compressed/../../../../lib/lz4/lz4_decompress.c:38,
from arch/arm/boot/compressed/../../../../lib/decompress_unlz4.c:10,
from arch/arm/boot/compressed/decompress.c:60:
include/linux/err.h: At top level:
>> include/linux/err.h:39:35: error: conflicting types for 'ERR_PTR'
static inline void * __must_check ERR_PTR(long error)
^~~~~~~
In file included from include/linux/irqflags.h:17,
from arch/arm/include/asm/bitops.h:28,
from include/linux/bitops.h:68,
from arch/arm/boot/compressed/../../../../lib/lz4/lz4defs.h:40,
from arch/arm/boot/compressed/../../../../lib/lz4/lz4_decompress.c:36,
from arch/arm/boot/compressed/../../../../lib/decompress_unlz4.c:10,
from arch/arm/boot/compressed/decompress.c:60:
include/linux/cleanup.h:481:24: note: previous implicit declaration of 'ERR_PTR' was here
lock_result = ERR_PTR(ret); \
^~~~~~~
include/linux/cleanup.h:248:13: note: in definition of macro 'DEFINE_CLASS'
{ _type t = _init; return t; }
^~~~~
include/linux/mutex.h:216:1: note: in expansion of macro 'DEFINE_ACQUIRE'
DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
^~~~~~~~~~~~~~
In file included from include/linux/rwsem.h:17,
from include/linux/mm_types.h:13,
from include/linux/mmzone.h:22,
from include/linux/gfp.h:7,
from include/linux/umh.h:4,
from include/linux/kmod.h:9,
from include/linux/module.h:17,
from arch/arm/boot/compressed/../../../../lib/lz4/lz4_decompress.c:38,
from arch/arm/boot/compressed/../../../../lib/decompress_unlz4.c:10,
from arch/arm/boot/compressed/decompress.c:60:
>> include/linux/err.h:82:33: error: conflicting types for 'IS_ERR_OR_NULL'
static inline bool __must_check IS_ERR_OR_NULL(__force const void *ptr)
^~~~~~~~~~~~~~
In file included from include/linux/irqflags.h:17,
from arch/arm/include/asm/bitops.h:28,
from include/linux/bitops.h:68,
from arch/arm/boot/compressed/../../../../lib/lz4/lz4defs.h:40,
from arch/arm/boot/compressed/../../../../lib/lz4/lz4_decompress.c:36,
from arch/arm/boot/compressed/../../../../lib/decompress_unlz4.c:10,
from arch/arm/boot/compressed/decompress.c:60:
include/linux/cleanup.h:476:13: note: previous implicit declaration of 'IS_ERR_OR_NULL' was here
if (!IS_ERR_OR_NULL(_T)) _unlock(&_T->_locktype), ({ \
^~~~~~~~~~~~~~
include/linux/cleanup.h:246:18: note: in definition of macro 'DEFINE_CLASS'
{ _type _T = *p; _exit; } \
^~~~~
include/linux/mutex.h:216:1: note: in expansion of macro 'DEFINE_ACQUIRE'
DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
^~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +/IS_ERR_OR_NULL +476 include/linux/cleanup.h
416
417 #define DEFINE_LOCK_GUARD_1_COND(_name, _ext, _condlock) \
418 __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
419 EXTEND_CLASS(_name, _ext, \
420 ({ class_##_name##_t _t = { .lock = l }, *_T = &_t;\
421 if (_T->lock && !(_condlock)) _T->lock = NULL; \
422 _t; }), \
423 typeof_member(class_##_name##_t, lock) l) \
424 static inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
425 { return class_##_name##_lock_ptr(_T); }
426
427 /*
428 * DEFINE_ACQUIRE(acquire_class_name, lock_type, unlock, cond_lock):
429 * Define a CLASS() that instantiates and acquires a conditional lock
430 * within an existing scope. In contrast to DEFINE_GUARD[_COND](), which
431 * hides the variable tracking the lock scope, CLASS(@acquire_class_name,
432 * @lock) instantiates @lock as either an ERR_PTR() or a cookie that drops
433 * the lock when it goes out of scope. An "_acquire" suffix is appended to
434 * @lock_type to provide type-safety against mixing explicit and implicit
435 * (scope-based) cleanup.
436 *
437 * Ex.
438 *
439 * DEFINE_ACQUIRE(mutex_intr_acquire, mutex, mutex_unlock,
440 * mutex_lock_interruptible)
441 *
442 * int interruptible_operation(...)
443 * {
444 * ...
445 * CLASS(mutex_intr_acquire, lock)(&obj->lock);
446 * if (IS_ERR(lock))
447 * return PTR_ERR(lock);
448 * ...
449 * } <= obj->lock dropped here.
450 *
451 * Attempts to perform:
452 *
453 * mutex_unlock(&obj->lock);
454 *
455 * ...fail because obj->lock is a 'struct mutex_acquire' not 'struct mutex'
456 * instance.
457 *
458 * Also, attempts to use the CLASS() conditionally require the ambiguous
459 * scope to be clarified (compiler enforced):
460 *
461 * if (...)
462 * CLASS(mutex_intr_acquire, lock)(&obj->lock); // <-- "error: expected expression"
463 * if (IS_ERR(lock))
464 * return PTR_ERR(lock);
465 *
466 * vs:
467 *
468 * if (...) {
469 * CLASS(mutex_intr_acquire, lock)(&obj->lock);
470 * if (IS_ERR(lock))
471 * return PTR_ERR(lock);
472 * } // <-- lock released here
473 */
474 #define DEFINE_ACQUIRE(_name, _locktype, _unlock, _cond_lock) \
475 DEFINE_CLASS(_name, struct _locktype##_acquire *, \
> 476 if (!IS_ERR_OR_NULL(_T)) _unlock(&_T->_locktype), ({ \
477 struct _locktype##_acquire *lock_result; \
478 int ret = _cond_lock(&to_lock->_locktype); \
479 \
480 if (ret) \
> 481 lock_result = ERR_PTR(ret); \
482 else \
483 lock_result = \
484 (struct _locktype##_acquire \
485 *)&to_lock->_locktype; \
486 lock_result; \
487 }), \
488 struct _locktype##_acquire *to_lock)
489
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-09 10:40 ` Peter Zijlstra
@ 2025-05-10 1:11 ` dan.j.williams
2025-05-12 10:50 ` Peter Zijlstra
0 siblings, 1 reply; 29+ messages in thread
From: dan.j.williams @ 2025-05-10 1:11 UTC (permalink / raw)
To: Peter Zijlstra, Dan Williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
Peter Zijlstra wrote:
> On Thu, May 08, 2025 at 10:04:32PM -0700, Dan Williams wrote:
> > Peter Zijlstra wrote:
> > [..]
> > > > So the proposal is, if you know what you are doing, or have a need to
> > > > switch back and forth between scope-based and explicit unlock for a given
> > > > lock, use the base primitives. If instead you want to fully convert to
> > > > scope-based lock management (excise all explicit unlock() calls) *and*
> > > > you want the compiler to validate the conversion, switch to the _acquire
> > > > parallel universe.
> > >
> > > As with all refactoring ever, the rename trick always works. But I don't
> > > think that warrants building a parallel infrastructure just for that.
> > >
> > > Specifically, it very much does not disallow calling mutex_unlock() on
> > > your new member. So all you get is some compiler help during refactor,
> > > and again, just rename the lock member already.
> > >
> > > Also, if we ever actually get LLVM's Thread Safety Analysis working,
> > > that will help us with all these problems:
> > >
> > > https://lore.kernel.org/all/20250304092417.2873893-1-elver@google.com/
> >
> > That looks lovely.
> >
> > > But the compiler needs a little more work to grok C :-)
> >
> > Ok, here is a last shot that incorporates all the feedback:
> >
> > 1/ Conceptually no need for a new CLASS() save for the fact that
> > __guard_ptr() returns NULL on failure, not an ERR_PTR().
> >
> > 2/ The rename trick is not true type safety, especially if it leads to
> > parallel universe of primitives, but it is a useful trick.
> >
> > 3/ "IS_ERR(__guard_ptr(mutex_intr)(lock))" is a mouthful, would be nice
> > to have something more succinct while maintaining some safety.
> >
> > That leads me to a scheme like the following:
> >
> > DEFINE_GUARD_ERR(mutex, _intr, mutex_lock_interruptible(_T))
> > ...
> > ACQUIRE(mutex_intr, lock)(&obj->lock);
> > if (IS_ERR(lock))
> > return PTR_ERR(lock);
>
> Urgh.. can you live with something like this?
>
> ---
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index d72764056ce6..6b0ca400b393 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -1394,8 +1394,8 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> int nr_records = 0;
> int rc;
>
> - rc = mutex_lock_interruptible(&mds->poison.lock);
> - if (rc)
> + ACQUIRE(mutex_intr, lock)(&mds->poison.poison_lock);
> + if ((rc = ACQUIRE_ERR(mutex_intr, &lock)))
> return rc;
Yes, I can live with that, and I like the compactness of the resulting
cleanup.h macros better than my attempt.
Although, how about this small ergonomic fixup for more symmetry between
ACQUIRE() and ACQUIRE_ERR()?
---
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 6b0ca400b393..e5d2171c9a48 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1395,7 +1395,7 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
int rc;
ACQUIRE(mutex_intr, lock)(&mds->poison.poison_lock);
- if ((rc = ACQUIRE_ERR(mutex_intr, &lock)))
+ if ((rc = ACQUIRE_ERR(mutex_intr, lock)))
return rc;
po = mds->poison.list_out;
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index 17d4655a6b73..b379ff445179 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -335,7 +342,7 @@ static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
CLASS(_name, _var)
#define ACQUIRE_ERR(_name, _var) \
- ({ long _rc = PTR_ERR(__guard_ptr(_name)(_var)); \
+ ({ long _rc = PTR_ERR(__guard_ptr(_name)(&(_var))); \
if (!_rc) _rc = -EBUSY; \
if (!IS_ERR_VALUE(_rc)) _rc = 0; \
_rc; })
---
Also, I needed the following to get this to compile:
---
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index 17d4655a6b73..052bbad6ac68 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -305,8 +305,15 @@ static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
__DEFINE_CLASS_IS_CONDITIONAL(_name, true); \
__DEFINE_GUARD_LOCK_PTR(_name, _T)
+/*
+ * For guard with a potential percpu lock, the address space needs to be
+ * cast away.
+ */
+#define IS_ERR_OR_NULL_ANY(x) \
+IS_ERR_OR_NULL((const void *)(__force const unsigned long)(x))
+
#define DEFINE_GUARD(_name, _type, _lock, _unlock) \
- DEFINE_CLASS(_name, _type, if (!IS_ERR_OR_NULL(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
+ DEFINE_CLASS(_name, _type, if (!IS_ERR_OR_NULL_ANY(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
DEFINE_CLASS_IS_GUARD(_name)
#define DEFINE_GUARD_COND_4(_name, _ext, _lock, _cond) \
@@ -401,7 +408,7 @@ typedef struct { \
\
static inline void class_##_name##_destructor(class_##_name##_t *_T) \
{ \
- if (!IS_ERR_OR_NULL(_T->lock)) { _unlock; } \
+ if (!IS_ERR_OR_NULL_ANY(_T->lock)) { _unlock; } \
} \
\
__DEFINE_GUARD_LOCK_PTR(_name, &_T->lock)
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-10 1:11 ` dan.j.williams
@ 2025-05-12 10:50 ` Peter Zijlstra
2025-05-12 18:25 ` Peter Zijlstra
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2025-05-12 10:50 UTC (permalink / raw)
To: dan.j.williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Fri, May 09, 2025 at 06:11:49PM -0700, dan.j.williams@intel.com wrote:
> ---
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 6b0ca400b393..e5d2171c9a48 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -1395,7 +1395,7 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> int rc;
>
> ACQUIRE(mutex_intr, lock)(&mds->poison.poison_lock);
> - if ((rc = ACQUIRE_ERR(mutex_intr, &lock)))
> + if ((rc = ACQUIRE_ERR(mutex_intr, lock)))
> return rc;
>
> po = mds->poison.list_out;
> diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
> index 17d4655a6b73..b379ff445179 100644
> --- a/include/linux/cleanup.h
> +++ b/include/linux/cleanup.h
> @@ -335,7 +342,7 @@ static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
> CLASS(_name, _var)
>
> #define ACQUIRE_ERR(_name, _var) \
> - ({ long _rc = PTR_ERR(__guard_ptr(_name)(_var)); \
> + ({ long _rc = PTR_ERR(__guard_ptr(_name)(&(_var))); \
> if (!_rc) _rc = -EBUSY; \
> if (!IS_ERR_VALUE(_rc)) _rc = 0; \
> _rc; })
>
No strong feelings about this. I see arguments either way.
> ---
>
> Also, I needed the following to get this to compile:
Ah, I didn't get beyond nvim/clangd not complaining about things :-)
> ---
>
> diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
> index 17d4655a6b73..052bbad6ac68 100644
> --- a/include/linux/cleanup.h
> +++ b/include/linux/cleanup.h
> @@ -305,8 +305,15 @@ static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
> __DEFINE_CLASS_IS_CONDITIONAL(_name, true); \
> __DEFINE_GUARD_LOCK_PTR(_name, _T)
>
> +/*
> + * For guard with a potential percpu lock, the address space needs to be
> + * cast away.
> + */
> +#define IS_ERR_OR_NULL_ANY(x) \
> +IS_ERR_OR_NULL((const void *)(__force const unsigned long)(x))
> +
> #define DEFINE_GUARD(_name, _type, _lock, _unlock) \
> - DEFINE_CLASS(_name, _type, if (!IS_ERR_OR_NULL(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
> + DEFINE_CLASS(_name, _type, if (!IS_ERR_OR_NULL_ANY(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
> DEFINE_CLASS_IS_GUARD(_name)
>
> #define DEFINE_GUARD_COND_4(_name, _ext, _lock, _cond) \
> @@ -401,7 +408,7 @@ typedef struct { \
> \
> static inline void class_##_name##_destructor(class_##_name##_t *_T) \
> { \
> - if (!IS_ERR_OR_NULL(_T->lock)) { _unlock; } \
> + if (!IS_ERR_OR_NULL_ANY(_T->lock)) { _unlock; } \
> } \
> \
> __DEFINE_GUARD_LOCK_PTR(_name, &_T->lock)
So looking at this I realized I might have inadvertently broken
scoped_guard() and scoped_cond_guard(), both rely on !__guard_ptr() for
the failure case, and this is no longer true.
The below seems to cover things. I even did a simple compile :-)
---
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index d72764056ce6..8a52e337dd0f 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1394,8 +1394,8 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
int nr_records = 0;
int rc;
- rc = mutex_lock_interruptible(&mds->poison.lock);
- if (rc)
+ ACQUIRE(mutex_intr, lock)(&mds->poison.mutex);
+ if ((rc = ACQUIRE_ERR(mutex_intr, lock)))
return rc;
po = mds->poison.list_out;
@@ -1430,7 +1430,6 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
}
} while (po->flags & CXL_POISON_FLAG_MORE);
- mutex_unlock(&mds->poison.lock);
return rc;
}
EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, "CXL");
@@ -1466,7 +1465,7 @@ int cxl_poison_state_init(struct cxl_memdev_state *mds)
return rc;
}
- mutex_init(&mds->poison.lock);
+ mutex_init(&mds->poison.mutex);
return 0;
}
EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, "CXL");
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 3ec6b906371b..1dfd361ed295 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -257,7 +257,7 @@ struct cxl_poison_state {
u32 max_errors;
DECLARE_BITMAP(enabled_cmds, CXL_POISON_ENABLED_MAX);
struct cxl_mbox_poison_out *list_out;
- struct mutex lock; /* Protect reads of poison list */
+ struct mutex mutex; /* Protect reads of poison list */
};
/*
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index 7093e1d08af0..a20d70e563e1 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -3,6 +3,8 @@
#define _LINUX_CLEANUP_H
#include <linux/compiler.h>
+#include <linux/err.h>
+#include <linux/args.h>
/**
* DOC: scope-based cleanup helpers
@@ -310,9 +312,17 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \
#define __DEFINE_CLASS_IS_CONDITIONAL(_name, _is_cond) \
static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
+#define __GUARD_ERR(_ptr) \
+ ({ long _rc = (__force unsigned long)(_ptr); \
+ if (!_rc) { _rc = -EBUSY; } if (!IS_ERR_VALUE(_rc)) { _rc = 0; } \
+ _rc; })
+
#define __DEFINE_GUARD_LOCK_PTR(_name, _exp) \
static inline void * class_##_name##_lock_ptr(class_##_name##_t *_T) \
- { return (void *)(__force unsigned long)*(_exp); }
+ { void *_ptr = (void *)(__force unsigned long)*(_exp); \
+ if (IS_ERR(_ptr)) { _ptr = NULL; } return _ptr; } \
+ static inline int class_##_name##_lock_err(class_##_name##_t *_T) \
+ { return __GUARD_ERR(*(_exp)); }
#define DEFINE_CLASS_IS_GUARD(_name) \
__DEFINE_CLASS_IS_CONDITIONAL(_name, false); \
@@ -323,23 +333,37 @@ static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
__DEFINE_GUARD_LOCK_PTR(_name, _T)
#define DEFINE_GUARD(_name, _type, _lock, _unlock) \
- DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \
+ DEFINE_CLASS(_name, _type, if (!__GUARD_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
DEFINE_CLASS_IS_GUARD(_name)
-#define DEFINE_GUARD_COND(_name, _ext, _condlock) \
+#define DEFINE_GUARD_COND_4(_name, _ext, _lock, _cond) \
__DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
EXTEND_CLASS(_name, _ext, \
- ({ void *_t = _T; if (_T && !(_condlock)) _t = NULL; _t; }), \
+ ({ void *_t = _T; int _RET = (_lock); if (_T && !(_cond)) _t = ERR_PTR(_RET); _t; }), \
class_##_name##_t _T) \
static inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
- { return class_##_name##_lock_ptr(_T); }
+ { return class_##_name##_lock_ptr(_T); } \
+ static inline int class_##_name##_ext##_lock_err(class_##_name##_t *_T) \
+ { return class_##_name##_lock_err(_T); }
+
+/*
+ * Default binary condition; success on 'true'.
+ */
+#define DEFINE_GUARD_COND_3(_name, _ext, _lock) \
+ DEFINE_GUARD_COND_4(_name, _ext, _lock, _RET)
+
+#define DEFINE_GUARD_COND(X...) CONCATENATE(DEFINE_GUARD_COND_, COUNT_ARGS(X))(X)
#define guard(_name) \
CLASS(_name, __UNIQUE_ID(guard))
#define __guard_ptr(_name) class_##_name##_lock_ptr
+#define __guard_err(_name) class_##_name##_lock_err
#define __is_cond_ptr(_name) class_##_name##_is_conditional
+#define ACQUIRE(_name, _var) CLASS(_name, _var)
+#define ACQUIRE_ERR(_name, _var) __guard_err(_name)(&(_var))
+
/*
* Helper macro for scoped_guard().
*
@@ -401,7 +425,7 @@ typedef struct { \
\
static inline void class_##_name##_destructor(class_##_name##_t *_T) \
{ \
- if (_T->lock) { _unlock; } \
+ if (!__GUARD_ERR(_T->lock)) { _unlock; } \
} \
\
__DEFINE_GUARD_LOCK_PTR(_name, &_T->lock)
@@ -433,15 +457,22 @@ __DEFINE_CLASS_IS_CONDITIONAL(_name, false); \
__DEFINE_UNLOCK_GUARD(_name, void, _unlock, __VA_ARGS__) \
__DEFINE_LOCK_GUARD_0(_name, _lock)
-#define DEFINE_LOCK_GUARD_1_COND(_name, _ext, _condlock) \
+#define DEFINE_LOCK_GUARD_1_COND_4(_name, _ext, _lock, _cond) \
__DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
EXTEND_CLASS(_name, _ext, \
({ class_##_name##_t _t = { .lock = l }, *_T = &_t;\
- if (_T->lock && !(_condlock)) _T->lock = NULL; \
+ int _RET = (_lock); \
+ if (_T->lock && !(_cond)) _T->lock = ERR_PTR(_RET);\
_t; }), \
typeof_member(class_##_name##_t, lock) l) \
static inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
- { return class_##_name##_lock_ptr(_T); }
+ { return class_##_name##_lock_ptr(_T); } \
+ static inline int class_##_name##_ext##_lock_err(class_##_name##_t *_T) \
+ { return class_##_name##_lock_err(_T); }
+
+#define DEFINE_LOCK_GUARD_1_COND_3(_name, _ext, _lock) \
+ DEFINE_LOCK_GUARD_1_COND_4(_name, _ext, _lock, _RET);
+#define DEFINE_LOCK_GUARD_1_COND(X...) CONCATENATE(DEFINE_LOCK_GUARD_1_COND_, COUNT_ARGS(X))(X)
#endif /* _LINUX_CLEANUP_H */
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2143d05116be..232fdde82bbb 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -200,7 +200,7 @@ extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
-DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
+DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T), _RET == 0)
extern unsigned long mutex_get_owner(struct mutex *lock);
diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index c8b543d428b0..c810deb88d13 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -240,7 +240,7 @@ extern void up_write(struct rw_semaphore *sem);
DEFINE_GUARD(rwsem_read, struct rw_semaphore *, down_read(_T), up_read(_T))
DEFINE_GUARD_COND(rwsem_read, _try, down_read_trylock(_T))
-DEFINE_GUARD_COND(rwsem_read, _intr, down_read_interruptible(_T) == 0)
+DEFINE_GUARD_COND(rwsem_read, _intr, down_read_interruptible(_T), _RET == 0)
DEFINE_GUARD(rwsem_write, struct rw_semaphore *, down_write(_T), up_write(_T))
DEFINE_GUARD_COND(rwsem_write, _try, down_write_trylock(_T))
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-12 10:50 ` Peter Zijlstra
@ 2025-05-12 18:25 ` Peter Zijlstra
2025-05-12 18:58 ` Peter Zijlstra
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2025-05-12 18:25 UTC (permalink / raw)
To: dan.j.williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Mon, May 12, 2025 at 12:50:26PM +0200, Peter Zijlstra wrote:
> +#define __GUARD_ERR(_ptr) \
> + ({ long _rc = (__force unsigned long)(_ptr); \
> + if (!_rc) { _rc = -EBUSY; } if (!IS_ERR_VALUE(_rc)) { _rc = 0; } \
> + _rc; })
> +
> #define DEFINE_GUARD(_name, _type, _lock, _unlock) \
> - DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \
> + DEFINE_CLASS(_name, _type, if (!__GUARD_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
> DEFINE_CLASS_IS_GUARD(_name)
GCC is 'stupid' and this generates atrocious code. I'll play with it.
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-12 18:25 ` Peter Zijlstra
@ 2025-05-12 18:58 ` Peter Zijlstra
2025-05-12 20:39 ` Linus Torvalds
2025-05-13 3:32 ` dan.j.williams
0 siblings, 2 replies; 29+ messages in thread
From: Peter Zijlstra @ 2025-05-12 18:58 UTC (permalink / raw)
To: dan.j.williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Mon, May 12, 2025 at 08:25:59PM +0200, Peter Zijlstra wrote:
> On Mon, May 12, 2025 at 12:50:26PM +0200, Peter Zijlstra wrote:
>
> > +#define __GUARD_ERR(_ptr) \
> > + ({ long _rc = (__force unsigned long)(_ptr); \
> > + if (!_rc) { _rc = -EBUSY; } if (!IS_ERR_VALUE(_rc)) { _rc = 0; } \
> > + _rc; })
> > +
>
> > #define DEFINE_GUARD(_name, _type, _lock, _unlock) \
> > - DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \
> > + DEFINE_CLASS(_name, _type, if (!__GUARD_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
> > DEFINE_CLASS_IS_GUARD(_name)
>
> GCC is 'stupid' and this generates atrocious code. I'll play with it.
PRE:
bf9e: 48 85 db test %rbx,%rbx
bfa1: 74 1a je bfbd <foo+0x5d>
bfa3: 48 81 fb 00 f0 ff ff cmp $0xfffffffffffff000,%rbx
bfaa: 77 11 ja bfbd <foo+0x5d>
POST:
bf9e: 48 8d 43 ff lea -0x1(%rbx),%rax
bfa2: 48 3d ff ef ff ff cmp $0xffffffffffffefff,%rax
bfa8: 77 11 ja bfbb <foo+0x5b>
---
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -293,17 +293,18 @@ static inline class_##_name##_t class_##
#define __DEFINE_CLASS_IS_CONDITIONAL(_name, _is_cond) \
static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
-#define __GUARD_ERR(_ptr) \
- ({ long _rc = (__force unsigned long)(_ptr); \
- if (!_rc) { _rc = -EBUSY; } if (!IS_ERR_VALUE(_rc)) { _rc = 0; } \
- _rc; })
+#define __GUARD_IS_ERR(_ptr) \
+ ({ unsigned long _rc = (__force unsigned long)(_ptr); \
+ unlikely((_rc-1) >= -MAX_ERRNO-1); })
#define __DEFINE_GUARD_LOCK_PTR(_name, _exp) \
static inline void * class_##_name##_lock_ptr(class_##_name##_t *_T) \
{ void *_ptr = (void *)(__force unsigned long)*(_exp); \
if (IS_ERR(_ptr)) { _ptr = NULL; } return _ptr; } \
static inline int class_##_name##_lock_err(class_##_name##_t *_T) \
- { return __GUARD_ERR(*(_exp)); }
+ { long _rc = (__force unsigned long)*(_exp); \
+ if (!_rc) { _rc = -EBUSY; } if (!IS_ERR_VALUE(_rc)) { _rc = 0; } \
+ return _rc; }
#define DEFINE_CLASS_IS_GUARD(_name) \
__DEFINE_CLASS_IS_CONDITIONAL(_name, false); \
@@ -314,7 +315,7 @@ static __maybe_unused const bool class_#
__DEFINE_GUARD_LOCK_PTR(_name, _T)
#define DEFINE_GUARD(_name, _type, _lock, _unlock) \
- DEFINE_CLASS(_name, _type, if (!__GUARD_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
+ DEFINE_CLASS(_name, _type, if (!__GUARD_IS_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
DEFINE_CLASS_IS_GUARD(_name)
#define DEFINE_GUARD_COND_4(_name, _ext, _lock, _cond) \
@@ -406,7 +407,7 @@ typedef struct { \
\
static inline void class_##_name##_destructor(class_##_name##_t *_T) \
{ \
- if (!__GUARD_ERR(_T->lock)) { _unlock; } \
+ if (!__GUARD_IS_ERR(_T->lock)) { _unlock; } \
} \
\
__DEFINE_GUARD_LOCK_PTR(_name, &_T->lock)
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-12 18:58 ` Peter Zijlstra
@ 2025-05-12 20:39 ` Linus Torvalds
2025-05-13 7:09 ` Peter Zijlstra
2025-05-13 3:32 ` dan.j.williams
1 sibling, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2025-05-12 20:39 UTC (permalink / raw)
To: Peter Zijlstra
Cc: dan.j.williams, linux-cxl, linux-kernel, David Lechner,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Mon, 12 May 2025 at 11:58, Peter Zijlstra <peterz@infradead.org> wrote:
>
> > GCC is 'stupid' and this generates atrocious code. I'll play with it.
>
> PRE:
> bf9e: 48 85 db test %rbx,%rbx
> bfa1: 74 1a je bfbd <foo+0x5d>
> bfa3: 48 81 fb 00 f0 ff ff cmp $0xfffffffffffff000,%rbx
> bfaa: 77 11 ja bfbd <foo+0x5d>
>
> POST:
> bf9e: 48 8d 43 ff lea -0x1(%rbx),%rax
> bfa2: 48 3d ff ef ff ff cmp $0xffffffffffffefff,%rax
> bfa8: 77 11 ja bfbb <foo+0x5b>
I'm not convinced that's actually an improvement.
Yes, it's one less instruction, and three bytes shorter. But it uses
an extra register, so now it might make surrounding code much worse by
making register allocation have a harder time.
If you *really* care about this, I think you should realize that the
non-error case is a valid kernel pointer.
And we could add some architecture-specific function to check for "is
this a valid non-NULL and non-error pointer" with a fallback to the
generic case.
Because then on a platform like x86, where kernel pointers are always
negative, but not *as* negative as the error pointers, you can check
for that with a single compare.
The logic is "add MAX_ERRNO, and if it's still negative, it wasn't
NULL and it wasn't ERR_PTR".
And while 'add' needs a destination register, 'sub' with the negated
value does not, and is called 'cmp'.
So I think you can do that with
cmp $-MAX_ERRNO,...
js ...
Sadly, I can't seem to get gcc to generate that code. But I didn't try
very hard.
Linus
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-12 18:58 ` Peter Zijlstra
2025-05-12 20:39 ` Linus Torvalds
@ 2025-05-13 3:32 ` dan.j.williams
1 sibling, 0 replies; 29+ messages in thread
From: dan.j.williams @ 2025-05-13 3:32 UTC (permalink / raw)
To: Peter Zijlstra, dan.j.williams
Cc: linux-cxl, linux-kernel, David Lechner, Linus Torvalds,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
Peter Zijlstra wrote:
> On Mon, May 12, 2025 at 08:25:59PM +0200, Peter Zijlstra wrote:
> > On Mon, May 12, 2025 at 12:50:26PM +0200, Peter Zijlstra wrote:
> >
> > > +#define __GUARD_ERR(_ptr) \
> > > + ({ long _rc = (__force unsigned long)(_ptr); \
> > > + if (!_rc) { _rc = -EBUSY; } if (!IS_ERR_VALUE(_rc)) { _rc = 0; } \
> > > + _rc; })
> > > +
> >
> > > #define DEFINE_GUARD(_name, _type, _lock, _unlock) \
> > > - DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \
> > > + DEFINE_CLASS(_name, _type, if (!__GUARD_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
> > > DEFINE_CLASS_IS_GUARD(_name)
> >
> > GCC is 'stupid' and this generates atrocious code. I'll play with it.
>
> PRE:
> bf9e: 48 85 db test %rbx,%rbx
> bfa1: 74 1a je bfbd <foo+0x5d>
> bfa3: 48 81 fb 00 f0 ff ff cmp $0xfffffffffffff000,%rbx
> bfaa: 77 11 ja bfbd <foo+0x5d>
>
> POST:
> bf9e: 48 8d 43 ff lea -0x1(%rbx),%rax
> bfa2: 48 3d ff ef ff ff cmp $0xffffffffffffefff,%rax
> bfa8: 77 11 ja bfbb <foo+0x5b>
>
FWIW this looks good to me, and the conversion to drop all explicit
locking passed tests. I threw it out on a branch to get some bot
coverage in the meantime.
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=cxl-acquire
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-12 20:39 ` Linus Torvalds
@ 2025-05-13 7:09 ` Peter Zijlstra
2025-05-13 8:50 ` Peter Zijlstra
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2025-05-13 7:09 UTC (permalink / raw)
To: Linus Torvalds
Cc: dan.j.williams, linux-cxl, linux-kernel, David Lechner,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Mon, May 12, 2025 at 01:39:19PM -0700, Linus Torvalds wrote:
> On Mon, 12 May 2025 at 11:58, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > > GCC is 'stupid' and this generates atrocious code. I'll play with it.
> >
> > PRE:
> > bf9e: 48 85 db test %rbx,%rbx
> > bfa1: 74 1a je bfbd <foo+0x5d>
> > bfa3: 48 81 fb 00 f0 ff ff cmp $0xfffffffffffff000,%rbx
> > bfaa: 77 11 ja bfbd <foo+0x5d>
> >
> > POST:
> > bf9e: 48 8d 43 ff lea -0x1(%rbx),%rax
> > bfa2: 48 3d ff ef ff ff cmp $0xffffffffffffefff,%rax
> > bfa8: 77 11 ja bfbb <foo+0x5b>
>
> I'm not convinced that's actually an improvement.
>
> Yes, it's one less instruction, and three bytes shorter. But it uses
> an extra register, so now it might make surrounding code much worse by
> making register allocation have a harder time.
I was going for the one less branch, but yeah, register pressure :/
Typically this is at the end of a scope, and I was hoping this is where
you have free regs etc.
> If you *really* care about this, I think you should realize that the
> non-error case is a valid kernel pointer.
>
> And we could add some architecture-specific function to check for "is
> this a valid non-NULL and non-error pointer" with a fallback to the
> generic case.
>
> Because then on a platform like x86, where kernel pointers are always
> negative, but not *as* negative as the error pointers, you can check
> for that with a single compare.
>
> The logic is "add MAX_ERRNO, and if it's still negative, it wasn't
> NULL and it wasn't ERR_PTR".
>
> And while 'add' needs a destination register, 'sub' with the negated
> value does not, and is called 'cmp'.
>
> So I think you can do that with
>
> cmp $-MAX_ERRNO,...
> js ...
>
> Sadly, I can't seem to get gcc to generate that code. But I didn't try
> very hard.
And so try I must :-)
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-13 7:09 ` Peter Zijlstra
@ 2025-05-13 8:50 ` Peter Zijlstra
2025-05-13 19:46 ` Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2025-05-13 8:50 UTC (permalink / raw)
To: Linus Torvalds
Cc: dan.j.williams, linux-cxl, linux-kernel, David Lechner,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Tue, May 13, 2025 at 09:09:18AM +0200, Peter Zijlstra wrote:
> On Mon, May 12, 2025 at 01:39:19PM -0700, Linus Torvalds wrote:
> > If you *really* care about this, I think you should realize that the
> > non-error case is a valid kernel pointer.
> >
> > And we could add some architecture-specific function to check for "is
> > this a valid non-NULL and non-error pointer" with a fallback to the
> > generic case.
> >
> > Because then on a platform like x86, where kernel pointers are always
> > negative, but not *as* negative as the error pointers, you can check
> > for that with a single compare.
> >
> > The logic is "add MAX_ERRNO, and if it's still negative, it wasn't
> > NULL and it wasn't ERR_PTR".
> >
> > And while 'add' needs a destination register, 'sub' with the negated
> > value does not, and is called 'cmp'.
> >
> > So I think you can do that with
> >
> > cmp $-MAX_ERRNO,...
> > js ...
> >
> > Sadly, I can't seem to get gcc to generate that code. But I didn't try
> > very hard.
Yeah, it seems to really like emitting add and lea.
Inline asm obviously works:
003e c09e: 48 81 fb 01 f0 ff ff cmp $0xfffffffffffff001,%rbx
0045 c0a5: 79 11 jns c0b8 <foo+0x58>
0047 c0a7: 48 89 df mov %rbx,%rdi
004a c0aa: e8 00 00 00 00 call c0af <foo+0x4f> c0ab: R_X86_64_PLT32 raw_spin_rq_unlock-0x4
...
0058 c0b8: 5b pop %rbx
0059 c0b9: 5d pop %rbp
005a c0ba: e9 00 00 00 00 jmp c0bf <foo+0x5f> c0bb: R_X86_64_PLT32 __x86_return_thunk-0x4
Just not sure it's worth it at this point.
---
diff --git a/arch/x86/include/asm/cleanup.h b/arch/x86/include/asm/cleanup.h
new file mode 100644
index 000000000000..7cef49be8570
--- /dev/null
+++ b/arch/x86/include/asm/cleanup.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_CLEANUP_H
+#define _ASM_X86_CLEANUP_H
+
+#define __GUARD_IS_ERR(_ptr) \
+ ({ unsigned long _var = (__force unsigned long)(_ptr); \
+ bool _s; \
+ asm_inline volatile ("cmp %[val], %[var]" \
+ : "=@ccns" (_s) \
+ : [val] "i" (-MAX_ERRNO), \
+ [var] "r" (_var)); \
+ unlikely(_s); })
+
+#endif /* _ASM_X86_CLEANUP_H */
+
diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild
index 8675b7b4ad23..a59a88c95277 100644
--- a/include/asm-generic/Kbuild
+++ b/include/asm-generic/Kbuild
@@ -12,6 +12,7 @@ mandatory-y += bug.h
mandatory-y += cacheflush.h
mandatory-y += cfi.h
mandatory-y += checksum.h
+mandatory-y += cleanup.h
mandatory-y += compat.h
mandatory-y += current.h
mandatory-y += delay.h
diff --git a/include/asm-generic/cleanup.h b/include/asm-generic/cleanup.h
new file mode 100644
index 000000000000..616ae558638e
--- /dev/null
+++ b/include/asm-generic/cleanup.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_CLEANUP_H
+#define _ASM_GENERIC_CLEANUP_H
+
+#endif /* _ASM_GENERIC_CLEANUP_H */
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index 18209e191973..00e5ef7aa314 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -6,6 +6,8 @@
#include <linux/err.h>
#include <linux/args.h>
+#include <asm/cleanup.h>
+
/**
* DOC: scope-based cleanup helpers
*
@@ -312,9 +314,11 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \
#define __DEFINE_CLASS_IS_CONDITIONAL(_name, _is_cond) \
static __maybe_unused const bool class_##_name##_is_conditional = _is_cond
+#ifndef __GUARD_IS_ERR
#define __GUARD_IS_ERR(_ptr) \
({ unsigned long _rc = (__force unsigned long)(_ptr); \
unlikely((_rc-1) >= -(MAX_ERRNO+1)); })
+#endif
#define __DEFINE_GUARD_LOCK_PTR(_name, _exp) \
static inline void * class_##_name##_lock_ptr(class_##_name##_t *_T) \
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-13 8:50 ` Peter Zijlstra
@ 2025-05-13 19:46 ` Linus Torvalds
2025-05-13 20:06 ` Al Viro
2025-05-14 6:46 ` Peter Zijlstra
0 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2025-05-13 19:46 UTC (permalink / raw)
To: Peter Zijlstra
Cc: dan.j.williams, linux-cxl, linux-kernel, David Lechner,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Tue, 13 May 2025 at 01:50, Peter Zijlstra <peterz@infradead.org> wrote:
>
> +#define __GUARD_IS_ERR(_ptr) \
> + ({ unsigned long _var = (__force unsigned long)(_ptr); \
> + bool _s; \
> + asm_inline volatile ("cmp %[val], %[var]" \
> + : "=@ccns" (_s) \
> + : [val] "i" (-MAX_ERRNO), \
> + [var] "r" (_var)); \
> + unlikely(_s); })
I think that this might be acceptable if it was some actual common operation.
But for just the conditional guard test, I think it's cute, but I
don't think it's really worth it.
Put another way: if we actually used this for IS_ERR_OR_NULL(), it
might be a worthwhile thing to look at. We have lots of those - some
of them in important core places.
Right now IS_ERR_OR_NULL() generates pretty disgusting code, with
clang doing things like this:
testq %rdi, %rdi
sete %al
cmpq $-4095, %rdi # imm = 0xF001
setae %cl
orb %al, %cl
je .LBB3_1
in order to avoid two jumps, while gcc generates that
testq %rdi, %rdi
je .L189
cmpq $-4096, %rdi
ja .L189
pattern.
Linus
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-13 19:46 ` Linus Torvalds
@ 2025-05-13 20:06 ` Al Viro
2025-05-13 20:31 ` Al Viro
2025-05-14 6:46 ` Peter Zijlstra
1 sibling, 1 reply; 29+ messages in thread
From: Al Viro @ 2025-05-13 20:06 UTC (permalink / raw)
To: Linus Torvalds
Cc: Peter Zijlstra, dan.j.williams, linux-cxl, linux-kernel,
David Lechner, Ingo Molnar, Fabio M. De Francesco,
Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny
On Tue, May 13, 2025 at 12:46:29PM -0700, Linus Torvalds wrote:
> Right now IS_ERR_OR_NULL() generates pretty disgusting code, with
> clang doing things like this:
>
> testq %rdi, %rdi
> sete %al
> cmpq $-4095, %rdi # imm = 0xF001
> setae %cl
> orb %al, %cl
> je .LBB3_1
>
> in order to avoid two jumps, while gcc generates that
>
> testq %rdi, %rdi
> je .L189
> cmpq $-4096, %rdi
> ja .L189
>
> pattern.
FWIW (unsigned long)v - 1 >= (unsigned long)(-MAX_ERRNO-1) yields
leaq -1(%rdi), %rax
cmpq $-4097, %rax
ja .L4
from gcc and
leaq 4095(%rdi), %rax
cmpq $4095, %rax # imm = 0xFFF
ja .LBB0_1
from clang...
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-13 20:06 ` Al Viro
@ 2025-05-13 20:31 ` Al Viro
2025-05-13 21:28 ` Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Al Viro @ 2025-05-13 20:31 UTC (permalink / raw)
To: Linus Torvalds
Cc: Peter Zijlstra, dan.j.williams, linux-cxl, linux-kernel,
David Lechner, Ingo Molnar, Fabio M. De Francesco,
Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny
On Tue, May 13, 2025 at 09:06:19PM +0100, Al Viro wrote:
> FWIW (unsigned long)v - 1 >= (unsigned long)(-MAX_ERRNO-1) yields
> leaq -1(%rdi), %rax
> cmpq $-4097, %rax
> ja .L4
> from gcc and
> leaq 4095(%rdi), %rax
> cmpq $4095, %rax # imm = 0xFFF
> ja .LBB0_1
> from clang...
Nevermind - should've read back through the thread for context.
Sorry.
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-13 20:31 ` Al Viro
@ 2025-05-13 21:28 ` Linus Torvalds
2025-05-17 9:17 ` David Laight
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2025-05-13 21:28 UTC (permalink / raw)
To: Al Viro
Cc: Peter Zijlstra, dan.j.williams, linux-cxl, linux-kernel,
David Lechner, Ingo Molnar, Fabio M. De Francesco,
Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny
On Tue, 13 May 2025 at 13:31, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Nevermind - should've read back through the thread for context.
Well, your comment did make me test what I can make gcc generate..
I still can't get gcc to do
cmpq $-4095,%rdi
jns .L189
for IS_ERR_OR_NULL() however hard I try.
The best I *can* get both gcc and clang to at least do
movq %rdi, %rcx
addq $4095, %rcx
jns .L189
which I suspect is much better than the "lea+cmpq", because a pure
register move is handled by the renaming and has no cost aside from
the front end (ie decoding).
So
#define IS_ERR_OR_NULL(ptr) (MAX_ERRNO + (long)(ptr) >= 0)
does seem to be potentially something we could use, and maybe we could
push the compiler people to realize that their current code generation
is bad.
Of course, it doesn't actually *really* work for IS_ERR_OR_NULL(),
because it gets the wrong results for user pointers, and while the
*common* case for the kernel is to test various kernel pointers, the
user pointer case does happen (ie mmap() and friends).
IOW, it's not actually correct in that form, I just wanted to see what
we could do for some limited form of this common pattern.
Anyway, I am surprised that neither gcc nor clang seem to have
realized that you can turn an "add" that just checks the condition
codes for sign or equality into a "cmp" of the negative value.
It seems such a trivial and obvious thing to do. But maybe I'm
confused and am missing something.
Linus
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-13 19:46 ` Linus Torvalds
2025-05-13 20:06 ` Al Viro
@ 2025-05-14 6:46 ` Peter Zijlstra
1 sibling, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2025-05-14 6:46 UTC (permalink / raw)
To: Linus Torvalds
Cc: dan.j.williams, linux-cxl, linux-kernel, David Lechner,
Ingo Molnar, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
On Tue, May 13, 2025 at 12:46:29PM -0700, Linus Torvalds wrote:
> On Tue, 13 May 2025 at 01:50, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > +#define __GUARD_IS_ERR(_ptr) \
> > + ({ unsigned long _var = (__force unsigned long)(_ptr); \
> > + bool _s; \
> > + asm_inline volatile ("cmp %[val], %[var]" \
> > + : "=@ccns" (_s) \
> > + : [val] "i" (-MAX_ERRNO), \
> > + [var] "r" (_var)); \
> > + unlikely(_s); })
>
> I think that this might be acceptable if it was some actual common operation.
>
> But for just the conditional guard test, I think it's cute, but I
> don't think it's really worth it.
It's actually every guard; the destructor is shared between unconditional
and conditional locks.
> Right now IS_ERR_OR_NULL() generates pretty disgusting code, with
> clang doing things like this:
>
> testq %rdi, %rdi
> sete %al
> cmpq $-4095, %rdi # imm = 0xF001
> setae %cl
> orb %al, %cl
> je .LBB3_1
Whee, that is creative indeed :-)
* Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking
2025-05-13 21:28 ` Linus Torvalds
@ 2025-05-17 9:17 ` David Laight
0 siblings, 0 replies; 29+ messages in thread
From: David Laight @ 2025-05-17 9:17 UTC (permalink / raw)
To: Linus Torvalds
Cc: Al Viro, Peter Zijlstra, dan.j.williams, linux-cxl, linux-kernel,
David Lechner, Ingo Molnar, Fabio M. De Francesco,
Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny
On Tue, 13 May 2025 14:28:37 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Tue, 13 May 2025 at 13:31, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Nevermind - should've read back through the thread for context.
>
> Well, your comment did make me test what I can make gcc generate..
>
> I still can't get gcc to do
>
> cmpq $-4095,%rdi
> jns .L189
>
> for IS_ERR_OR_NULL() however hard I try.
>
> The best I *can* get both gcc and clang to at least do
>
> movq %rdi, %rcx
> addq $4095, %rcx
> jns .L189
>
> which I suspect it much better than the "lea+cmpq", because a pure
> register move is handled by the renaming and has no cost aside from
> the front end (ie decoding).
>
> So
>
> #define IS_ERR_OR_NULL(ptr) (MAX_ERRNO + (long)(ptr) >= 0)
>
> does seem to be potentially something we could use, and maybe we could
> push the compiler people to realize that their current code generation
> is bad.
>
> Of course, it doesn't actually *really* work for IS_ERR_OR_NULL(),
> because it gets the wrong results for user pointers, and while the
> *common* case for the kernel is to test various kernel pointers, the
> user pointer case does happen (ie mmap() and friends).
>
> IOW, it's not actually correct in that form, I just wanted to see what
> we could do for some limited form of this common pattern.
>
> Anyway, I am surprised that neither gcc nor clang seem to have
> realized that you can turn an "add" that just checks the condition
> codes for sign or equality into a "cmp" of the negative value.
>
> It seems such a trivial and obvious thing to do. But maybe I'm
> confused and am missing something.
Doing the signed compare (long)(ptr) >= -MAX_ERRNO generates cmp + jl
(sign != overflow) which is a better test.
To let user pointers through it might be possible to generate:
leaq -1(%reg), %reg
cmpq $-4097, %reg
leaq 1(%reg), %reg
ja label
which trades a register for an instruction.
It wouldn't be too bad if the second 'leaq' were moved to the branch
target - especially on CPUs that lack an inc/dec that leaves the
flags untouched.
David
end of thread, other threads: [~2025-05-17 9:17 UTC | newest]
Thread overview: 29+ messages -- links below jump to the message on this page --
2025-05-07 7:21 [PATCH 0/7] Introduce DEFINE_ACQUIRE(), a scoped_cond_guard() replacement Dan Williams
2025-05-07 7:21 ` [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking Dan Williams
2025-05-07 9:32 ` Peter Zijlstra
2025-05-07 21:18 ` Dan Williams
2025-05-08 11:00 ` Peter Zijlstra
2025-05-09 5:04 ` Dan Williams
2025-05-09 10:40 ` Peter Zijlstra
2025-05-10 1:11 ` dan.j.williams
2025-05-12 10:50 ` Peter Zijlstra
2025-05-12 18:25 ` Peter Zijlstra
2025-05-12 18:58 ` Peter Zijlstra
2025-05-12 20:39 ` Linus Torvalds
2025-05-13 7:09 ` Peter Zijlstra
2025-05-13 8:50 ` Peter Zijlstra
2025-05-13 19:46 ` Linus Torvalds
2025-05-13 20:06 ` Al Viro
2025-05-13 20:31 ` Al Viro
2025-05-13 21:28 ` Linus Torvalds
2025-05-17 9:17 ` David Laight
2025-05-14 6:46 ` Peter Zijlstra
2025-05-13 3:32 ` dan.j.williams
2025-05-09 19:10 ` kernel test robot
2025-05-07 7:21 ` [PATCH 2/7] cxl/decoder: Move decoder register programming to a helper Dan Williams
2025-05-07 7:21 ` [PATCH 3/7] cxl/decoder: Drop pointless locking Dan Williams
2025-05-07 7:21 ` [PATCH 4/7] cxl/region: Split commit_store() into __commit() and queue_reset() helpers Dan Williams
2025-05-07 7:21 ` [PATCH 5/7] cxl/region: Move ready-to-probe state check to a helper Dan Williams
2025-05-07 7:21 ` [PATCH 6/7] cxl/region: Introduce CLASS(cxl_decoder_detach...) consolidate multiple paths Dan Williams
2025-05-08 7:44 ` kernel test robot
2025-05-07 7:21 ` [PATCH 7/7] cleanup: Create an rwsem conditional acquisition class Dan Williams