* [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile
@ 2024-08-26 6:30 NeilBrown
2024-08-26 6:30 ` [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue NeilBrown
` (7 more replies)
0 siblings, 8 replies; 16+ messages in thread
From: NeilBrown @ 2024-08-26 6:30 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Jens Axboe
Cc: linux-kernel, linux-fsdevel, linux-block
This is a second attempt to make wake_up_{bit,var} less fragile.
This version doesn't change those functions much, but instead
improves the documentation and provides some helpers which
both serve as patterns to follow and alternates so that use of the
fragile functions can be limited or eliminated.
The only change to either function is that wake_up_bit() is changed to
take an unsigned long * rather than a void *. This necessitates the
first patch which changes the one place where something other than
unsigned long * is passed to wake_up_bit() - it is in block/.
The final patch modifies the same bit of code as a demonstration of one
of the new APIs that has been added.
Thanks,
NeilBrown
[PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue,
[PATCH 2/7] sched: change wake_up_bit() and related function to
[PATCH 3/7] sched: Improve documentation for wake_up_bit/wait_on_bit
[PATCH 4/7] sched: Document wait_var_event() family of functions and
[PATCH 5/7] sched: Add test_and_clear_wake_up_bit() and
[PATCH 6/7] sched: Add wait/wake interface for variable updated under
[PATCH 7/7] Block: switch bd_prepare_to_claim to use
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue
2024-08-26 6:30 [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
@ 2024-08-26 6:30 ` NeilBrown
2024-09-17 3:12 ` Jens Axboe
2024-09-17 3:13 ` (subset) " Jens Axboe
2024-08-26 6:30 ` [PATCH 2/7] sched: change wake_up_bit() and related function to expect unsigned long * NeilBrown
` (6 subsequent siblings)
7 siblings, 2 replies; 16+ messages in thread
From: NeilBrown @ 2024-08-26 6:30 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Jens Axboe
Cc: linux-kernel, linux-fsdevel, linux-block
bd_prepare_to_claim() waits for a var to change, not for a bit to be
cleared.
So change from bit_waitqueue() to __var_waitqueue() and correspondingly
use wake_up_var().
This will allow a future patch which changes the "bit" function to expect
an "unsigned long *" instead of "void *".
Signed-off-by: NeilBrown <neilb@suse.de>
---
block/bdev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/bdev.c b/block/bdev.c
index c5507b6f63b8..21e688fb6449 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -548,7 +548,7 @@ int bd_prepare_to_claim(struct block_device *bdev, void *holder,
/* if claiming is already in progress, wait for it to finish */
if (whole->bd_claiming) {
- wait_queue_head_t *wq = bit_waitqueue(&whole->bd_claiming, 0);
+ wait_queue_head_t *wq = __var_waitqueue(&whole->bd_claiming);
DEFINE_WAIT(wait);
prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
@@ -571,7 +571,7 @@ static void bd_clear_claiming(struct block_device *whole, void *holder)
/* tell others that we're done */
BUG_ON(whole->bd_claiming != holder);
whole->bd_claiming = NULL;
- wake_up_bit(&whole->bd_claiming, 0);
+ wake_up_var(&whole->bd_claiming);
}
/**
--
2.44.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/7] sched: change wake_up_bit() and related function to expect unsigned long *
2024-08-26 6:30 [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
2024-08-26 6:30 ` [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue NeilBrown
@ 2024-08-26 6:30 ` NeilBrown
2024-09-16 11:28 ` Peter Zijlstra
2024-08-26 6:31 ` [PATCH 3/7] sched: Improve documentation for wake_up_bit/wait_on_bit family of functions NeilBrown
` (5 subsequent siblings)
7 siblings, 1 reply; 16+ messages in thread
From: NeilBrown @ 2024-08-26 6:30 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Jens Axboe
Cc: linux-kernel, linux-fsdevel, linux-block
wake_up_bit() currently allows a "void *". While this isn't strictly a
problem as the address is never dereferenced, it is inconsistent with
the corresponding wait_var_event() which requires "unsigned long *" and
does dereference the pointer.
And code that needs to wait for a change in something other than an
unsigned long would be better served by wake_up_var().
This patch changes all related "void *" to "unsigned long *".
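As a rough illustration of the effect of the stricter prototype (a
sketch only, with hypothetical names, not code from this series):

    #include <linux/bitops.h>
    #include <linux/wait_bit.h>

    struct bar {
            unsigned long flags;    /* bit 0 used as a "busy" flag */
            void *holder;           /* not an unsigned long */
    };

    static void bar_done(struct bar *b)
    {
            clear_bit(0, &b->flags);
            smp_mb__after_atomic();
            wake_up_bit(&b->flags, 0);      /* still fine: unsigned long * */

            b->holder = NULL;
            smp_mb();
            wake_up_var(&b->holder);        /* wake_up_bit(&b->holder, 0) would
                                             * no longer compile */
    }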
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NeilBrown <neilb@suse.de>
---
include/linux/wait_bit.h | 16 ++++++++--------
kernel/sched/wait_bit.c | 12 ++++++------
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index 7725b7579b78..48e123839892 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -8,7 +8,7 @@
#include <linux/wait.h>
struct wait_bit_key {
- void *flags;
+ unsigned long *flags;
int bit_nr;
unsigned long timeout;
};
@@ -23,14 +23,14 @@ struct wait_bit_queue_entry {
typedef int wait_bit_action_f(struct wait_bit_key *key, int mode);
-void __wake_up_bit(struct wait_queue_head *wq_head, void *word, int bit);
+void __wake_up_bit(struct wait_queue_head *wq_head, unsigned long *word, int bit);
int __wait_on_bit(struct wait_queue_head *wq_head, struct wait_bit_queue_entry *wbq_entry, wait_bit_action_f *action, unsigned int mode);
int __wait_on_bit_lock(struct wait_queue_head *wq_head, struct wait_bit_queue_entry *wbq_entry, wait_bit_action_f *action, unsigned int mode);
-void wake_up_bit(void *word, int bit);
-int out_of_line_wait_on_bit(void *word, int, wait_bit_action_f *action, unsigned int mode);
-int out_of_line_wait_on_bit_timeout(void *word, int, wait_bit_action_f *action, unsigned int mode, unsigned long timeout);
-int out_of_line_wait_on_bit_lock(void *word, int, wait_bit_action_f *action, unsigned int mode);
-struct wait_queue_head *bit_waitqueue(void *word, int bit);
+void wake_up_bit(unsigned long *word, int bit);
+int out_of_line_wait_on_bit(unsigned long *word, int, wait_bit_action_f *action, unsigned int mode);
+int out_of_line_wait_on_bit_timeout(unsigned long *word, int, wait_bit_action_f *action, unsigned int mode, unsigned long timeout);
+int out_of_line_wait_on_bit_lock(unsigned long *word, int, wait_bit_action_f *action, unsigned int mode);
+struct wait_queue_head *bit_waitqueue(unsigned long *word, int bit);
extern void __init wait_bit_init(void);
int wake_bit_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key);
@@ -327,7 +327,7 @@ do { \
* You can use this helper if bitflags are manipulated atomically rather than
* non-atomically under a lock.
*/
-static inline void clear_and_wake_up_bit(int bit, void *word)
+static inline void clear_and_wake_up_bit(int bit, unsigned long *word)
{
clear_bit_unlock(bit, word);
/* See wake_up_bit() for which memory barrier you need to use. */
diff --git a/kernel/sched/wait_bit.c b/kernel/sched/wait_bit.c
index 134d7112ef71..058b0e18727e 100644
--- a/kernel/sched/wait_bit.c
+++ b/kernel/sched/wait_bit.c
@@ -9,7 +9,7 @@
static wait_queue_head_t bit_wait_table[WAIT_TABLE_SIZE] __cacheline_aligned;
-wait_queue_head_t *bit_waitqueue(void *word, int bit)
+wait_queue_head_t *bit_waitqueue(unsigned long *word, int bit)
{
const int shift = BITS_PER_LONG == 32 ? 5 : 6;
unsigned long val = (unsigned long)word << shift | bit;
@@ -55,7 +55,7 @@ __wait_on_bit(struct wait_queue_head *wq_head, struct wait_bit_queue_entry *wbq_
}
EXPORT_SYMBOL(__wait_on_bit);
-int __sched out_of_line_wait_on_bit(void *word, int bit,
+int __sched out_of_line_wait_on_bit(unsigned long *word, int bit,
wait_bit_action_f *action, unsigned mode)
{
struct wait_queue_head *wq_head = bit_waitqueue(word, bit);
@@ -66,7 +66,7 @@ int __sched out_of_line_wait_on_bit(void *word, int bit,
EXPORT_SYMBOL(out_of_line_wait_on_bit);
int __sched out_of_line_wait_on_bit_timeout(
- void *word, int bit, wait_bit_action_f *action,
+ unsigned long *word, int bit, wait_bit_action_f *action,
unsigned mode, unsigned long timeout)
{
struct wait_queue_head *wq_head = bit_waitqueue(word, bit);
@@ -108,7 +108,7 @@ __wait_on_bit_lock(struct wait_queue_head *wq_head, struct wait_bit_queue_entry
}
EXPORT_SYMBOL(__wait_on_bit_lock);
-int __sched out_of_line_wait_on_bit_lock(void *word, int bit,
+int __sched out_of_line_wait_on_bit_lock(unsigned long *word, int bit,
wait_bit_action_f *action, unsigned mode)
{
struct wait_queue_head *wq_head = bit_waitqueue(word, bit);
@@ -118,7 +118,7 @@ int __sched out_of_line_wait_on_bit_lock(void *word, int bit,
}
EXPORT_SYMBOL(out_of_line_wait_on_bit_lock);
-void __wake_up_bit(struct wait_queue_head *wq_head, void *word, int bit)
+void __wake_up_bit(struct wait_queue_head *wq_head, unsigned long *word, int bit)
{
struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
@@ -144,7 +144,7 @@ EXPORT_SYMBOL(__wake_up_bit);
* may need to use a less regular barrier, such fs/inode.c's smp_mb(),
* because spin_unlock() does not guarantee a memory barrier.
*/
-void wake_up_bit(void *word, int bit)
+void wake_up_bit(unsigned long *word, int bit)
{
__wake_up_bit(bit_waitqueue(word, bit), word, bit);
}
--
2.44.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/7] sched: Improve documentation for wake_up_bit/wait_on_bit family of functions
2024-08-26 6:30 [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
2024-08-26 6:30 ` [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue NeilBrown
2024-08-26 6:30 ` [PATCH 2/7] sched: change wake_up_bit() and related function to expect unsigned long * NeilBrown
@ 2024-08-26 6:31 ` NeilBrown
2024-08-26 6:31 ` [PATCH 4/7] sched: Document wait_var_event() family of functions and wake_up_var() NeilBrown
` (4 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2024-08-26 6:31 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Jens Axboe
Cc: linux-kernel, linux-fsdevel, linux-block
This patch revises the documentation for wake_up_bit(),
clear_and_wake_up_bit(), and all the wait_on_bit() family of functions.
The new documentation places less emphasis on the pool of waitqueues
used (an implementation detail) and focuses instead on details of how
the functions behave.
The barriers included in the wait functions and clear_and_wake_up_bit()
and those required for wake_up_bit() are spelled out more clearly.
The error statuses returned are given explicitly.
The fact that the wait_on_bit_lock() functions set the bit is made more
obvious.
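For example, the documented pattern for a simple flag bit now reads
roughly as follows (a sketch with a hypothetical FOO_BUSY flag, not
taken from the patch):

    #include <linux/sched.h>
    #include <linux/wait_bit.h>

    #define FOO_BUSY        0               /* hypothetical flag bit */

    static int foo_wait_idle(unsigned long *flags)
    {
            /* returns 0 with ACQUIRE semantics, or -EINTR on a signal */
            return wait_on_bit(flags, FOO_BUSY, TASK_KILLABLE);
    }

    static void foo_finish(unsigned long *flags)
    {
            /* clear_and_wake_up_bit() provides the RELEASE ordering itself;
             * an open-coded clear_bit() would need smp_mb__after_atomic()
             * before the wake_up_bit() call. */
            clear_and_wake_up_bit(FOO_BUSY, flags);
    }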
Signed-off-by: NeilBrown <neilb@suse.de>
---
include/linux/wait_bit.h | 159 +++++++++++++++++++++------------------
kernel/sched/wait_bit.c | 37 +++++----
2 files changed, 110 insertions(+), 86 deletions(-)
diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index 48e123839892..b792a92a036e 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -53,19 +53,21 @@ extern int bit_wait_io_timeout(struct wait_bit_key *key, int mode);
/**
* wait_on_bit - wait for a bit to be cleared
- * @word: the word being waited on, a kernel virtual address
- * @bit: the bit of the word being waited on
+ * @word: the address containing the bit being waited on
+ * @bit: the bit at that address being waited on
* @mode: the task state to sleep in
*
- * There is a standard hashed waitqueue table for generic use. This
- * is the part of the hashtable's accessor API that waits on a bit.
- * For instance, if one were to have waiters on a bitflag, one would
- * call wait_on_bit() in threads waiting for the bit to clear.
- * One uses wait_on_bit() where one is waiting for the bit to clear,
- * but has no intention of setting it.
- * Returned value will be zero if the bit was cleared, or non-zero
- * if the process received a signal and the mode permitted wakeup
- * on that signal.
+ * Wait for the given bit in an unsigned long or bitmap (see DECLARE_BITMAP())
+ * to be cleared. The clearing of the bit must be signalled with
+ * wake_up_bit(), often as clear_and_wake_up_bit().
+ *
+ * The process will wait on a waitqueue selected by hash from a shared
+ * pool. It will only be woken on a wake_up for the target bit, even
+ * if other processes on the same queue are woken for other bits.
+ *
+ * Returned value will be zero if the bit was cleared in which case the
+ * call has ACQUIRE semantics, or %-EINTR if the process received a
+ * signal and the mode permitted wake up on that signal.
*/
static inline int
wait_on_bit(unsigned long *word, int bit, unsigned mode)
@@ -80,17 +82,20 @@ wait_on_bit(unsigned long *word, int bit, unsigned mode)
/**
* wait_on_bit_io - wait for a bit to be cleared
- * @word: the word being waited on, a kernel virtual address
- * @bit: the bit of the word being waited on
+ * @word: the address containing the bit being waited on
+ * @bit: the bit at that address being waited on
* @mode: the task state to sleep in
*
- * Use the standard hashed waitqueue table to wait for a bit
- * to be cleared. This is similar to wait_on_bit(), but calls
- * io_schedule() instead of schedule() for the actual waiting.
+ * Wait for the given bit in an unsigned long or bitmap (see DECLARE_BITMAP())
+ * to be cleared. The clearing of the bit must be signalled with
+ * wake_up_bit(), often as clear_and_wake_up_bit().
+ *
+ * This is similar to wait_on_bit(), but calls io_schedule() instead of
+ * schedule() for the actual waiting.
*
- * Returned value will be zero if the bit was cleared, or non-zero
- * if the process received a signal and the mode permitted wakeup
- * on that signal.
+ * Returned value will be zero if the bit was cleared in which case the
+ * call has ACQUIRE semantics, or %-EINTR if the process received a
+ * signal and the mode permitted wake up on that signal.
*/
static inline int
wait_on_bit_io(unsigned long *word, int bit, unsigned mode)
@@ -104,19 +109,24 @@ wait_on_bit_io(unsigned long *word, int bit, unsigned mode)
}
/**
- * wait_on_bit_timeout - wait for a bit to be cleared or a timeout elapses
- * @word: the word being waited on, a kernel virtual address
- * @bit: the bit of the word being waited on
+ * wait_on_bit_timeout - wait for a bit to be cleared or a timeout to elapse
+ * @word: the address containing the bit being waited on
+ * @bit: the bit at that address being waited on
* @mode: the task state to sleep in
* @timeout: timeout, in jiffies
*
- * Use the standard hashed waitqueue table to wait for a bit
- * to be cleared. This is similar to wait_on_bit(), except also takes a
- * timeout parameter.
+ * Wait for the given bit in an unsigned long or bitmap (see
+ * DECLARE_BITMAP()) to be cleared, or for a timeout to expire. The
+ * clearing of the bit must be signalled with wake_up_bit(), often as
+ * clear_and_wake_up_bit().
*
- * Returned value will be zero if the bit was cleared before the
- * @timeout elapsed, or non-zero if the @timeout elapsed or process
- * received a signal and the mode permitted wakeup on that signal.
+ * This is similar to wait_on_bit(), except it also takes a timeout
+ * parameter.
+ *
+ * Returned value will be zero if the bit was cleared in which case the
+ * call has ACQUIRE semantics, or %-EINTR if the process received a
+ * signal and the mode permitted wake up on that signal, or %-EAGAIN if the
+ * timeout elapsed.
*/
static inline int
wait_on_bit_timeout(unsigned long *word, int bit, unsigned mode,
@@ -132,19 +142,21 @@ wait_on_bit_timeout(unsigned long *word, int bit, unsigned mode,
/**
* wait_on_bit_action - wait for a bit to be cleared
- * @word: the word being waited on, a kernel virtual address
- * @bit: the bit of the word being waited on
+ * @word: the address containing the bit waited on
+ * @bit: the bit at that address being waited on
* @action: the function used to sleep, which may take special actions
* @mode: the task state to sleep in
*
- * Use the standard hashed waitqueue table to wait for a bit
- * to be cleared, and allow the waiting action to be specified.
- * This is like wait_on_bit() but allows fine control of how the waiting
- * is done.
+ * Wait for the given bit in an unsigned long or bitmap (see DECLARE_BITMAP())
+ * to be cleared. The clearing of the bit must be signalled with
+ * wake_up_bit(), often as clear_and_wake_up_bit().
+ *
+ * This is similar to wait_on_bit(), but calls @action() instead of
+ * schedule() for the actual waiting.
*
- * Returned value will be zero if the bit was cleared, or non-zero
- * if the process received a signal and the mode permitted wakeup
- * on that signal.
+ * Returned value will be zero if the bit was cleared in which case the
+ * call has ACQUIRE semantics, or the error code returned by @action if
+ * that call returned non-zero.
*/
static inline int
wait_on_bit_action(unsigned long *word, int bit, wait_bit_action_f *action,
@@ -157,23 +169,22 @@ wait_on_bit_action(unsigned long *word, int bit, wait_bit_action_f *action,
}
/**
- * wait_on_bit_lock - wait for a bit to be cleared, when wanting to set it
- * @word: the word being waited on, a kernel virtual address
- * @bit: the bit of the word being waited on
+ * wait_on_bit_lock - wait for a bit to be cleared, then set it
+ * @word: the address containing the bit being waited on
+ * @bit: the bit of the word being waited on and set
* @mode: the task state to sleep in
*
- * There is a standard hashed waitqueue table for generic use. This
- * is the part of the hashtable's accessor API that waits on a bit
- * when one intends to set it, for instance, trying to lock bitflags.
- * For instance, if one were to have waiters trying to set bitflag
- * and waiting for it to clear before setting it, one would call
- * wait_on_bit() in threads waiting to be able to set the bit.
- * One uses wait_on_bit_lock() where one is waiting for the bit to
- * clear with the intention of setting it, and when done, clearing it.
+ * Wait for the given bit in an unsigned long or bitmap (see
+ * DECLARE_BITMAP()) to be cleared. The clearing of the bit must be
+ * signalled with wake_up_bit(), often as clear_and_wake_up_bit(). As
+ * soon as it is clear, atomically set it and return.
*
- * Returns zero if the bit was (eventually) found to be clear and was
- * set. Returns non-zero if a signal was delivered to the process and
- * the @mode allows that signal to wake the process.
+ * This is similar to wait_on_bit(), but sets the bit before returning.
+ *
+ * Returned value will be zero if the bit was successfully set in which
+ * case the call has the same memory sequencing semantics as
+ * test_and_clear_bit(), or %-EINTR if the process received a signal and
+ * the mode permitted wake up on that signal.
*/
static inline int
wait_on_bit_lock(unsigned long *word, int bit, unsigned mode)
@@ -185,15 +196,18 @@ wait_on_bit_lock(unsigned long *word, int bit, unsigned mode)
}
/**
- * wait_on_bit_lock_io - wait for a bit to be cleared, when wanting to set it
- * @word: the word being waited on, a kernel virtual address
- * @bit: the bit of the word being waited on
+ * wait_on_bit_lock_io - wait for a bit to be cleared, then set it
+ * @word: the address containing the bit being waited on
+ * @bit: the bit of the word being waited on and set
* @mode: the task state to sleep in
*
- * Use the standard hashed waitqueue table to wait for a bit
- * to be cleared and then to atomically set it. This is similar
- * to wait_on_bit(), but calls io_schedule() instead of schedule()
- * for the actual waiting.
+ * Wait for the given bit in an unsigned long or bitmap (see
+ * DECLARE_BITMAP()) to be cleared. The clearing of the bit must be
+ * signalled with wake_up_bit(), often as clear_and_wake_up_bit(). As
+ * soon as it is clear, atomically set it and return.
+ *
+ * This is similar to wait_on_bit_lock(), but calls io_schedule() instead
+ * of schedule().
*
* Returns zero if the bit was (eventually) found to be clear and was
* set. Returns non-zero if a signal was delivered to the process and
@@ -209,21 +223,19 @@ wait_on_bit_lock_io(unsigned long *word, int bit, unsigned mode)
}
/**
- * wait_on_bit_lock_action - wait for a bit to be cleared, when wanting to set it
- * @word: the word being waited on, a kernel virtual address
- * @bit: the bit of the word being waited on
+ * wait_on_bit_lock_action - wait for a bit to be cleared, then set it
+ * @word: the address containing the bit being waited on
+ * @bit: the bit of the word being waited on and set
* @action: the function used to sleep, which may take special actions
* @mode: the task state to sleep in
*
- * Use the standard hashed waitqueue table to wait for a bit
- * to be cleared and then to set it, and allow the waiting action
- * to be specified.
- * This is like wait_on_bit() but allows fine control of how the waiting
- * is done.
+ * This is similar to wait_on_bit_lock(), but calls @action() instead of
+ * schedule() for the actual waiting.
*
- * Returns zero if the bit was (eventually) found to be clear and was
- * set. Returns non-zero if a signal was delivered to the process and
- * the @mode allows that signal to wake the process.
+ * Returned value will be zero if the bit was successfully set in which
+ * case the call has the same memory sequencing semantics as
+ * test_and_clear_bit(), or the error code returned by @action if that
+ * call returned non-zero.
*/
static inline int
wait_on_bit_lock_action(unsigned long *word, int bit, wait_bit_action_f *action,
@@ -320,12 +332,13 @@ do { \
/**
* clear_and_wake_up_bit - clear a bit and wake up anyone waiting on that bit
- *
* @bit: the bit of the word being waited on
- * @word: the word being waited on, a kernel virtual address
+ * @word: the address containing the bit being waited on
*
- * You can use this helper if bitflags are manipulated atomically rather than
- * non-atomically under a lock.
+ * The designated bit is cleared and any tasks waiting in wait_on_bit()
+ * or similar will be woken. This call has RELEASE semantics so that
+ * any changes to memory made before this call are guaranteed to be visible
+ * after the corresponding wait_on_bit() completes.
*/
static inline void clear_and_wake_up_bit(int bit, unsigned long *word)
{
diff --git a/kernel/sched/wait_bit.c b/kernel/sched/wait_bit.c
index 058b0e18727e..247997e1c9c4 100644
--- a/kernel/sched/wait_bit.c
+++ b/kernel/sched/wait_bit.c
@@ -128,21 +128,32 @@ void __wake_up_bit(struct wait_queue_head *wq_head, unsigned long *word, int bit
EXPORT_SYMBOL(__wake_up_bit);
/**
- * wake_up_bit - wake up a waiter on a bit
- * @word: the word being waited on, a kernel virtual address
- * @bit: the bit of the word being waited on
+ * wake_up_bit - wake up waiters on a bit
+ * @word: the address containing the bit being waited on
+ * @bit: the bit at that address being waited on
*
- * There is a standard hashed waitqueue table for generic use. This
- * is the part of the hash-table's accessor API that wakes up waiters
- * on a bit. For instance, if one were to have waiters on a bitflag,
- * one would call wake_up_bit() after clearing the bit.
+ * Wake up any process waiting in wait_on_bit() or similar for the
+ * given bit to be cleared.
*
- * In order for this to function properly, as it uses waitqueue_active()
- * internally, some kind of memory barrier must be done prior to calling
- * this. Typically, this will be smp_mb__after_atomic(), but in some
- * cases where bitflags are manipulated non-atomically under a lock, one
- * may need to use a less regular barrier, such fs/inode.c's smp_mb(),
- * because spin_unlock() does not guarantee a memory barrier.
+ * The wake-up is sent to tasks in a waitqueue selected by hash from a
+ * shared pool. Only those tasks on that queue which have requested
+ * wake_up on this specific address and bit will be woken, and only if the
+ * bit is clear.
+ *
+ * In order for this to function properly there must be a full memory
+ * barrier after the bit is cleared and before this function is called.
+ * If the bit was cleared atomically, such as by clear_bit(), then
+ * smp_mb__after_atomic() can be used, otherwise smp_mb() is needed.
+ *
+ * If, however, the wait_on_bit is performed under a lock, such as with
+ * wait_on_bit_action() where the action drops and reclaims the lock
+ * around schedule(), and if this wake_up_bit() call happens under the
+ * same lock, then no barrier is required.
+ *
+ * Normally the bit should be cleared by an operation with RELEASE
+ * semantics so that any changes to memory made before the bit is
+ * cleared are guaranteed to be visible after the matching wait_on_bit()
+ * completes.
*/
void wake_up_bit(unsigned long *word, int bit)
{
--
2.44.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 4/7] sched: Document wait_var_event() family of functions and wake_up_var()
2024-08-26 6:30 [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
` (2 preceding siblings ...)
2024-08-26 6:31 ` [PATCH 3/7] sched: Improve documentation for wake_up_bit/wait_on_bit family of functions NeilBrown
@ 2024-08-26 6:31 ` NeilBrown
2024-08-26 6:31 ` [PATCH 5/7] sched: Add test_and_clear_wake_up_bit() and atomic_dec_and_wake_up() NeilBrown
` (3 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2024-08-26 6:31 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Jens Axboe
Cc: linux-kernel, linux-fsdevel, linux-block
wake_up_var(), wait_var_event() and related interfaces are not
documented but have important ordering requirements. This patch adds
documentation and makes these requirements explicit.
The return values for those wait_var_event_* functions which return a
value are documented. Note that these are, perhaps surprisingly,
sometimes different from comparable wait_on_bit() functions.
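As a brief illustration of the intended usage and ordering (a sketch
with hypothetical names, not code from this series):

    #include <linux/atomic.h>
    #include <linux/wait_bit.h>

    struct foo {
            atomic_t pending;               /* outstanding work items */
    };

    static void foo_wait_all_done(struct foo *f)
    {
            /* only re-checked when a wake up arrives for &f->pending */
            wait_var_event(&f->pending, atomic_read_acquire(&f->pending) == 0);
    }

    static void foo_one_done(struct foo *f)
    {
            /* atomic_dec_and_test() is fully ordered, so no extra barrier
             * is needed before wake_up_var() */
            if (atomic_dec_and_test(&f->pending))
                    wake_up_var(&f->pending);
    }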
Signed-off-by: NeilBrown <neilb@suse.de>
---
include/linux/wait_bit.h | 71 ++++++++++++++++++++++++++++++++++++++++
kernel/sched/wait_bit.c | 30 +++++++++++++++++
2 files changed, 101 insertions(+)
diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index b792a92a036e..ca5c6e70f908 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -282,6 +282,22 @@ __out: __ret; \
___wait_var_event(var, condition, TASK_UNINTERRUPTIBLE, 0, 0, \
schedule())
+/**
+ * wait_var_event - wait for a variable to be updated and notified
+ * @var: the address of variable being waited on
+ * @condition: the condition to wait for
+ *
+ * Wait for a @condition to be true, only re-checking when a wake up is
+ * received for the given @var (an arbitrary kernel address which need
+ * not be directly related to the given condition, but usually is).
+ *
+ * The process will wait on a waitqueue selected by hash from a shared
+ * pool. It will only be woken on a wake_up for the given address.
+ *
+ * The condition should normally use smp_load_acquire() or a similarly
+ * ordered access to ensure that any changes to memory made before the
+ * condition became true will be visible after the wait completes.
+ */
#define wait_var_event(var, condition) \
do { \
might_sleep(); \
@@ -294,6 +310,24 @@ do { \
___wait_var_event(var, condition, TASK_KILLABLE, 0, 0, \
schedule())
+/**
+ * wait_var_event_killable - wait for a variable to be updated and notified
+ * @var: the address of variable being waited on
+ * @condition: the condition to wait for
+ *
+ * Wait for a @condition to be true or a fatal signal to be received,
+ * only re-checking the condition when a wake up is received for the given
+ * @var (an arbitrary kernel address which need not be directly related
+ * to the given condition, but usually is).
+ *
+ * This is similar to wait_var_event() but returns a value which is
+ * 0 if the condition became true, or %-ERESTARTSYS if a fatal signal
+ * was received.
+ *
+ * The condition should normally use smp_load_acquire() or a similarly
+ * ordered access to ensure that any changes to memory made before the
+ * condition became true will be visible after the wait completes.
+ */
#define wait_var_event_killable(var, condition) \
({ \
int __ret = 0; \
@@ -308,6 +342,26 @@ do { \
TASK_UNINTERRUPTIBLE, 0, timeout, \
__ret = schedule_timeout(__ret))
+/**
+ * wait_var_event_timeout - wait for a variable to be updated or a timeout to expire
+ * @var: the address of variable being waited on
+ * @condition: the condition to wait for
+ * @timeout: maximum time to wait in jiffies
+ *
+ * Wait for a @condition to be true or a timeout to expire, only
+ * re-checking the condition when a wake up is received for the given
+ * @var (an arbitrary kernel address which need not be directly related
+ * to the given condition, but usually is).
+ *
+ * This is similar to wait_var_event() but returns a value which is 0 if
+ * the timeout expired and the condition was still false, or the
+ * remaining time left in the timeout (but at least 1) if the condition
+ * was found to be true.
+ *
+ * The condition should normally use smp_load_acquire() or a similarly
+ * ordered access to ensure that any changes to memory made before the
+ * condition became true will be visible after the wait completes.
+ */
#define wait_var_event_timeout(var, condition, timeout) \
({ \
long __ret = timeout; \
@@ -321,6 +375,23 @@ do { \
___wait_var_event(var, condition, TASK_INTERRUPTIBLE, 0, 0, \
schedule())
+/**
+ * wait_var_event_interruptible - wait for a variable to be updated and notified
+ * @var: the address of variable being waited on
+ * @condition: the condition to wait for
+ *
+ * Wait for a @condition to be true or a signal to be received, only
+ * re-checking the condition when a wake up is received for the given
+ * @var (an arbitrary kernel address which need not be directly related
+ * to the given condition, but usually is).
+ *
+ * This is similar to wait_var_event() but returns a value which is 0 if
+ * the condition became true, or %-ERESTARTSYS if a signal was received.
+ *
+ * The condition should normally use smp_load_acquire() or a similarly
+ * ordered access to ensure that any changes to memory made before the
+ * condition became true will be visible after the wait completes.
+ */
#define wait_var_event_interruptible(var, condition) \
({ \
int __ret = 0; \
diff --git a/kernel/sched/wait_bit.c b/kernel/sched/wait_bit.c
index 247997e1c9c4..d7ac2ec09f8f 100644
--- a/kernel/sched/wait_bit.c
+++ b/kernel/sched/wait_bit.c
@@ -199,6 +199,36 @@ void init_wait_var_entry(struct wait_bit_queue_entry *wbq_entry, void *var, int
}
EXPORT_SYMBOL(init_wait_var_entry);
+/**
+ * wake_up_var - wake up waiters on a variable (kernel address)
+ * @var: the address of the variable being waited on
+ *
+ * Wake up any process waiting in wait_var_event() or similar for the
+ * given variable to change. wait_var_event() can be waiting for an
+ * arbitrary condition to be true and associates that condition with an
+ * address. Calling wake_up_var() suggests that the condition has been
+ * made true, but does not strictly require the condition to use the
+ * address given.
+ *
+ * The wake-up is sent to tasks in a waitqueue selected by hash from a
+ * shared pool. Only those tasks on that queue which have requested
+ * wake_up on this specific address will be woken.
+ *
+ * In order for this to function properly there must be a full memory
+ * barrier after the variable is updated (or more accurately, after the
+ * condtion waited on has been made to be true) and before this function
+ * is called. If the variable was updated atomically, such as a by
+ * atomic_dec() then smb_mb__after_atomic() can be used. If the
+ * variable was updated by a fully ordered operation such as
+ * atomic_dec_and_test() then no extra barrier is required. Othwewise
+ * smb_mb() is needed.
+ *
+ * Normally the variable should be updated (the condition should be made
+ * to be true) by an operation with RELEASE semantics such as
+ * smp_store_release() so that any changes to memory made before the
+ * variable was updated are guaranteed to be visible after the matching
+ * wait_var_event() completes.
+ */
void wake_up_var(void *var)
{
__wake_up_bit(__var_waitqueue(var), var, -1);
--
2.44.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/7] sched: Add test_and_clear_wake_up_bit() and atomic_dec_and_wake_up()
2024-08-26 6:30 [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
` (3 preceding siblings ...)
2024-08-26 6:31 ` [PATCH 4/7] sched: Document wait_var_event() family of functions and wake_up_var() NeilBrown
@ 2024-08-26 6:31 ` NeilBrown
2024-08-26 6:31 ` [PATCH 6/7] sched: Add wait/wake interface for variable updated under a lock NeilBrown
` (2 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2024-08-26 6:31 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Jens Axboe
Cc: linux-kernel, linux-fsdevel, linux-block
There are common patterns in the kernel of using test_and_clear_bit()
before wake_up_bit(), and atomic_dec_and_test() before wake_up_var().
These combinations don't need extra barriers but sometimes include them
unnecessarily.
To help avoid the unnecessary barriers and to help discourage the
general use of wake_up_bit/var (which is a fragile interface) introduce
two combined functions which implement these patterns.
Also add store_release_wake_up() which supports the pattern of simply
setting a non-atomic variable and sending a wakeup. This pattern
requires barriers which are often omitted.
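A rough sketch of how the new helpers replace the open-coded patterns
(hypothetical struct, field and flag names, not code from this series):

    #include <linux/wait_bit.h>

    #define FOO_BUSY        0               /* hypothetical flag bit */

    struct foo {
            unsigned long flags;            /* FOO_BUSY lives here */
            atomic_t count;
            int done;
    };

    /* three unrelated examples, shown together for brevity */
    static void foo_signal_all(struct foo *f)
    {
            /* replaces: test_and_clear_bit() + wake_up_bit() */
            test_and_clear_wake_up_bit(FOO_BUSY, &f->flags);

            /* replaces: atomic_dec_and_test() + wake_up_var() */
            atomic_dec_and_wake_up(&f->count);

            /* replaces: smp_store_release() + smp_mb() + wake_up_var() */
            store_release_wake_up(&f->done, 1);
    }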
Signed-off-by: NeilBrown <neilb@suse.de>
---
include/linux/wait_bit.h | 61 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)
diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index ca5c6e70f908..c1675457c077 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -419,4 +419,65 @@ static inline void clear_and_wake_up_bit(int bit, unsigned long *word)
wake_up_bit(word, bit);
}
+/**
+ * test_and_clear_wake_up_bit - clear a bit if it was set: wake up anyone waiting on that bit
+ *
+ * @bit: the bit of the word being waited on
+ * @word: the address of memory containing that bit
+ *
+ * If the bit is set and can be atomically cleared, any tasks waiting in
+ * wait_on_bit() or similar will be woken. This call has the same
+ * complete ordering semantics as test_and_clear_bit(). Any changes to
+ * memory made before this call are guaranteed to be visible after the
+ * corresponding wait_on_bit() completes.
+ *
+ * Returns %true if the bit was successfully cleared and the wake up was sent.
+ */
+static inline bool test_and_clear_wake_up_bit(int bit, unsigned long *word)
+{
+ if (!test_and_clear_bit(bit, word))
+ return false;
+ /* no extra barrier required */
+ wake_up_bit(word, bit);
+ return true;
+}
+
+/**
+ * atomic_dec_and_wake_up - decrement an atomic_t and if zero, wake up waiters
+ *
+ * @var: the variable to dec and test
+ *
+ * Decrements the atomic variable and, if it reaches zero, sends a wake_up to any
+ * processes waiting on the variable.
+ *
+ * This function has the same complete ordering semantics as atomic_dec_and_test().
+ *
+ * Returns %true if the variable reaches zero and the wake up was sent.
+ */
+
+static inline bool atomic_dec_and_wake_up(atomic_t *var)
+{
+ if (!atomic_dec_and_test(var))
+ return false;
+ wake_up_var(var);
+ return true;
+}
+
+/**
+ * store_release_wake_up - update a variable and send a wake_up
+ * @var: the address of the variable to be updated and woken
+ * @val: the value to store in the variable.
+ *
+ * Store the given value in the variable and send a wake up to any tasks
+ * waiting on the variable. All necessary barriers are included to ensure
+ * the task calling wait_var_event() sees the new value and all values
+ * written to memory before this call.
+ */
+#define store_release_wake_up(var, val) \
+do { \
+ smp_store_release(var, val); \
+ smp_mb(); \
+ wake_up_var(var); \
+} while (0)
+
#endif /* _LINUX_WAIT_BIT_H */
--
2.44.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 6/7] sched: Add wait/wake interface for variable updated under a lock.
2024-08-26 6:30 [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
` (4 preceding siblings ...)
2024-08-26 6:31 ` [PATCH 5/7] sched: Add test_and_clear_wake_up_bit() and atomic_dec_and_wake_up() NeilBrown
@ 2024-08-26 6:31 ` NeilBrown
2024-08-26 6:31 ` [PATCH 7/7] Block: switch bd_prepare_to_claim to use wait_var_event_mutex() NeilBrown
2024-09-15 23:52 ` [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
7 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2024-08-26 6:31 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Jens Axboe
Cc: linux-kernel, linux-fsdevel, linux-block
Sometimes we need to wait for a condition to be true which must be
tested while holding a lock. Correspondingly the condition is made
true while holding the lock and the wake up is sent under the lock.
This patch provides wake and wait interfaces which can be used for this
situation when the lock is either a mutex or a spinlock.
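A minimal sketch of the intended usage with a spinlock (hypothetical
struct and field names, not code from this series):

    #include <linux/spinlock.h>
    #include <linux/wait_bit.h>

    struct foo {
            spinlock_t lock;
            void *owner;                    /* protected by ->lock */
    };

    static void foo_claim(struct foo *f, void *who)
    {
            spin_lock(&f->lock);
            /* the condition is only tested with ->lock held */
            wait_var_event_spinlock(&f->owner, f->owner == NULL, &f->lock);
            f->owner = who;
            spin_unlock(&f->lock);
    }

    static void foo_release(struct foo *f)
    {
            spin_lock(&f->lock);
            f->owner = NULL;
            /* no extra barrier needed: the lock provides the ordering */
            wake_up_var_locked(&f->owner, &f->lock);
            spin_unlock(&f->lock);
    }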
Signed-off-by: NeilBrown <neilb@suse.de>
---
include/linux/wait_bit.h | 69 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index c1675457c077..6995a0d89ebd 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -401,6 +401,75 @@ do { \
__ret; \
})
+/**
+ * wait_var_event_spinlock - wait for a variable to be updated under a spinlock
+ * @var: the address of the variable being waited on
+ * @condition: condition to wait for
+ * @lock: the spinlock which protects updates to the variable
+ *
+ * Wait for a condition which can only be reliably tested while holding
+ * a spinlock. The variables assessed in the condition will normally be updated
+ * under the same spinlock, and the wake up should be signalled with
+ * wake_up_var_locked() under the same spinlock.
+ *
+ * This is similar to wait_var_event(), but assumes a spinlock is held
+ * while calling this function and while updating the variable.
+ *
+ * This must be called while the given lock is held and the lock will be
+ * dropped when schedule() is called to wait for a wake up, and will be
+ * reclaimed before testing the condition again.
+ */
+#define wait_var_event_spinlock(var, condition, lock) \
+do { \
+ might_sleep(); \
+ if (condition) \
+ break; \
+ ___wait_var_event(var, condition, TASK_UNINTERRUPTIBLE, 0, 0, \
+ spin_unlock(lock); schedule(); spin_lock(lock)); \
+} while (0)
+
+/**
+ * wait_var_event_mutex - wait for a variable to be updated under a mutex
+ * @var: the address of the variable being waited on
+ * @condition: condition to wait for
+ * @mutex: the mutex which protects updates to the variable
+ *
+ * Wait for a condition which can only be reliably tested while holding
+ * a mutex. The variables assessed in the condition will normally be
+ * updated under the same mutex, and the wake up should be signalled
+ * with wake_up_var_locked() under the same mutex.
+ *
+ * This is similar to wait_var_event(), but assumes a mutex is held
+ * while calling this function and while updating the variable.
+ *
+ * This must be called while the given mutex is held and the mutex will be
+ * dropped when schedule() is called to wait for a wake up, and will be
+ * reclaimed before testing the condition again.
+ */
+#define wait_var_event_mutex(var, condition, mutex) \
+do { \
+ might_sleep(); \
+ if (condition) \
+ break; \
+ ___wait_var_event(var, condition, TASK_UNINTERRUPTIBLE, 0, 0, \
+ mutex_unlock(mutex); schedule(); mutex_lock(mutex)); \
+} while (0)
+
+/**
+ * wake_up_var_locked - wake up waiters on a variable while holding a spinlock or mutex
+ * @var: the address of the variable being waited on
+ * @lock: The spinlock or mutex that protects the variable
+ *
+ * Send a wake up for the given variable which should be waited for with
+ * wait_var_event_spinlock() or wait_var_event_mutex(). Unlike wake_up_var(),
+ * no extra barriers are needed as the locking provides sufficient sequencing.
+ */
+#define wake_up_var_locked(var, lock) \
+do { \
+ lockdep_assert_held(lock); \
+ wake_up_var(var); \
+} while (0)
+
/**
* clear_and_wake_up_bit - clear a bit and wake up anyone waiting on that bit
* @bit: the bit of the word being waited on
--
2.44.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 7/7] Block: switch bd_prepare_to_claim to use wait_var_event_mutex()
2024-08-26 6:30 [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
` (5 preceding siblings ...)
2024-08-26 6:31 ` [PATCH 6/7] sched: Add wait/wake interface for variable updated under a lock NeilBrown
@ 2024-08-26 6:31 ` NeilBrown
2024-09-15 23:52 ` [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
7 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2024-08-26 6:31 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Jens Axboe
Cc: linux-kernel, linux-fsdevel, linux-block
bd_prepare_to_claim() contains an open-coded version of the new
wait_var_event_mutex().
Change it to use that function and re-organise the code to benefit from
this change.
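Schematically, the transformation looks roughly as follows (abstracted
from the bdev specifics, with hypothetical struct and field names):

    #include <linux/mutex.h>
    #include <linux/sched.h>
    #include <linux/wait_bit.h>

    struct obj {
            struct mutex lock;
            bool busy;                      /* protected by ->lock */
    };

    /* before: open-coded wait loop; returns with ->lock held */
    static void obj_wait_idle_before(struct obj *o)
    {
    retry:
            mutex_lock(&o->lock);
            if (o->busy) {
                    wait_queue_head_t *wq = __var_waitqueue(&o->busy);
                    DEFINE_WAIT(wait);

                    prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
                    mutex_unlock(&o->lock);
                    schedule();
                    finish_wait(wq, &wait);
                    goto retry;
            }
    }

    /* after: the same wait expressed with the new helper */
    static void obj_wait_idle_after(struct obj *o)
    {
            mutex_lock(&o->lock);
            wait_var_event_mutex(&o->busy, !o->busy, &o->lock);
    }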
Signed-off-by: NeilBrown <neilb@suse.de>
---
block/bdev.c | 49 +++++++++++++++++++------------------------------
1 file changed, 19 insertions(+), 30 deletions(-)
diff --git a/block/bdev.c b/block/bdev.c
index 21e688fb6449..6e827ee02e7d 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -487,10 +487,10 @@ long nr_blockdev_pages(void)
* Test whether @bdev can be claimed by @holder.
*
* RETURNS:
- * %true if @bdev can be claimed, %false otherwise.
+ * %0 if @bdev can be claimed, %-EBUSY otherwise.
*/
-static bool bd_may_claim(struct block_device *bdev, void *holder,
- const struct blk_holder_ops *hops)
+static int bd_may_claim(struct block_device *bdev, void *holder,
+ const struct blk_holder_ops *hops)
{
struct block_device *whole = bdev_whole(bdev);
@@ -503,9 +503,9 @@ static bool bd_may_claim(struct block_device *bdev, void *holder,
if (bdev->bd_holder == holder) {
if (WARN_ON_ONCE(bdev->bd_holder_ops != hops))
return false;
- return true;
+ return 0;
}
- return false;
+ return -EBUSY;
}
/*
@@ -514,8 +514,8 @@ static bool bd_may_claim(struct block_device *bdev, void *holder,
*/
if (whole != bdev &&
whole->bd_holder && whole->bd_holder != bd_may_claim)
- return false;
- return true;
+ return -EBUSY;
+ return 0;
}
/**
@@ -535,43 +535,32 @@ int bd_prepare_to_claim(struct block_device *bdev, void *holder,
const struct blk_holder_ops *hops)
{
struct block_device *whole = bdev_whole(bdev);
+ int err = 0;
if (WARN_ON_ONCE(!holder))
return -EINVAL;
-retry:
- mutex_lock(&bdev_lock);
- /* if someone else claimed, fail */
- if (!bd_may_claim(bdev, holder, hops)) {
- mutex_unlock(&bdev_lock);
- return -EBUSY;
- }
- /* if claiming is already in progress, wait for it to finish */
- if (whole->bd_claiming) {
- wait_queue_head_t *wq = __var_waitqueue(&whole->bd_claiming);
- DEFINE_WAIT(wait);
-
- prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
- mutex_unlock(&bdev_lock);
- schedule();
- finish_wait(wq, &wait);
- goto retry;
- }
+ mutex_lock(&bdev_lock);
+ wait_var_event_mutex(&whole->bd_claiming,
+ (err = bd_may_claim(bdev, holder, hops)) != 0 ||
+ whole->bd_claiming == NULL,
+ &bdev_lock);
- /* yay, all mine */
- whole->bd_claiming = holder;
+ /* if someone else claimed, fail */
+ if (!err)
+ /* yay, all mine */
+ whole->bd_claiming = holder;
mutex_unlock(&bdev_lock);
- return 0;
+ return err;
}
EXPORT_SYMBOL_GPL(bd_prepare_to_claim); /* only for the loop driver */
static void bd_clear_claiming(struct block_device *whole, void *holder)
{
- lockdep_assert_held(&bdev_lock);
/* tell others that we're done */
BUG_ON(whole->bd_claiming != holder);
whole->bd_claiming = NULL;
- wake_up_var(&whole->bd_claiming);
+ wake_up_var_locked(&whole->bd_claiming, &bdev_lock);
}
/**
--
2.44.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile
2024-08-26 6:30 [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
` (6 preceding siblings ...)
2024-08-26 6:31 ` [PATCH 7/7] Block: switch bd_prepare_to_claim to use wait_var_event_mutex() NeilBrown
@ 2024-09-15 23:52 ` NeilBrown
7 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2024-09-15 23:52 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Jens Axboe
Cc: linux-kernel, linux-fsdevel, linux-block
Hi Ingo and Peter,
have you had a chance to look at these yet? Should I resend? Maybe
after -rc is out?
Thanks,
NeilBrown
On Mon, 26 Aug 2024, NeilBrown wrote:
> This is a second attempt to make wake_up_{bit,var} less fragile.
> This version doesn't change those functions much, but instead
> improves the documentation and provides some helpers which
> both serve as patterns to follow and alternates so that use of the
> fragile functions can be limited or eliminated.
>
> The only change to either function is that wake_up_bit() is changed to
> take an unsigned long * rather than a void *. This necessitates the
> first patch which changes the one place where something other than
> unsigned long * is passed to wake_up_bit() - it is in block/.
>
> The final patch modifies the same bit of code as a demonstration of one
> of the new APIs that has been added.
>
> Thanks,
> NeilBrown
>
>
> [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue,
> [PATCH 2/7] sched: change wake_up_bit() and related function to
> [PATCH 3/7] sched: Improve documentation for wake_up_bit/wait_on_bit
> [PATCH 4/7] sched: Document wait_var_event() family of functions and
> [PATCH 5/7] sched: Add test_and_clear_wake_up_bit() and
> [PATCH 6/7] sched: Add wait/wake interface for variable updated under
> [PATCH 7/7] Block: switch bd_prepare_to_claim to use
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/7] sched: change wake_up_bit() and related function to expect unsigned long *
2024-08-26 6:30 ` [PATCH 2/7] sched: change wake_up_bit() and related function to expect unsigned long * NeilBrown
@ 2024-09-16 11:28 ` Peter Zijlstra
2024-09-16 11:48 ` NeilBrown
0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2024-09-16 11:28 UTC (permalink / raw)
To: NeilBrown
Cc: Ingo Molnar, Linus Torvalds, Jens Axboe, linux-kernel,
linux-fsdevel, linux-block
On Mon, Aug 26, 2024 at 04:30:59PM +1000, NeilBrown wrote:
> wake_up_bit() currently allows a "void *". While this isn't strictly a
> problem as the address is never dereferenced, it is inconsistent with
> the corresponding wait_var_event() which requires "unsigned long *" and
> does dereference the pointer.
I'm having trouble parsing this. The way I read it, you're contradicting
yourself. Where does wait_var_event() require 'unsigned long *' ?
> And code that needs to wait for a change in something other than an
> unsigned long would be better served by wake_up_var().
This, afaict the whole var thing is size invariant. It only cares about
the address.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/7] sched: change wake_up_bit() and related function to expect unsigned long *
2024-09-16 11:28 ` Peter Zijlstra
@ 2024-09-16 11:48 ` NeilBrown
2024-09-16 18:18 ` Peter Zijlstra
0 siblings, 1 reply; 16+ messages in thread
From: NeilBrown @ 2024-09-16 11:48 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Linus Torvalds, Jens Axboe, linux-kernel,
linux-fsdevel, linux-block
On Mon, 16 Sep 2024, Peter Zijlstra wrote:
> On Mon, Aug 26, 2024 at 04:30:59PM +1000, NeilBrown wrote:
> > wake_up_bit() currently allows a "void *". While this isn't strictly a
> > problem as the address is never dereferenced, it is inconsistent with
> > the corresponding wait_var_event() which requires "unsigned long *" and
> > does dereference the pointer.
>
> I'm having trouble parsing this. The way I read it, you're contradicting
> yourself. Where does wait_var_event() require 'unsigned long *' ?
Sorry, that was meant to say "the corresponding wait_on_bit()".
>
> > And code that needs to wait for a change in something other than an
> > unsigned long would be better served by wake_up_var().
>
> This, afaict the whole var thing is size invariant. It only cares about
> the address.
>
Again - wake_up_bit(). Sorry - bits and vars were swimming around my
brain and I didn't proof-read properly.
This patch is all "bit", no "var".
NeilBrown
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/7] sched: change wake_up_bit() and related function to expect unsigned long *
2024-09-16 11:48 ` NeilBrown
@ 2024-09-16 18:18 ` Peter Zijlstra
2024-09-16 20:37 ` NeilBrown
0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2024-09-16 18:18 UTC (permalink / raw)
To: NeilBrown
Cc: Ingo Molnar, Linus Torvalds, Jens Axboe, linux-kernel,
linux-fsdevel, linux-block
On Mon, Sep 16, 2024 at 09:48:11PM +1000, NeilBrown wrote:
> On Mon, 16 Sep 2024, Peter Zijlstra wrote:
> > On Mon, Aug 26, 2024 at 04:30:59PM +1000, NeilBrown wrote:
> > > wake_up_bit() currently allows a "void *". While this isn't strictly a
> > > problem as the address is never dereferenced, it is inconsistent with
> > > the corresponding wait_var_event() which requires "unsigned long *" and
> > > does dereference the pointer.
> >
> > I'm having trouble parsing this. The way I read it, you're contradicting
> > yourself. Where does wait_var_event() require 'unsigned long *' ?
>
> Sorry, that was meant to say "the corresponding wait_on_bit()".
>
>
> >
> > > And code that needs to wait for a change in something other than an
> > > unsigned long would be better served by wake_up_var().
> >
> > This, afaict the whole var thing is size invariant. It only cares about
> > the address.
> >
>
> Again - wake_up_bit(). Sorry - bits and vars were swimming around my
> brain and I didn't proof-read properly.
>
> This patch is all "bit", no "var".
OK :-)
Anyway, other than that the patches look fine, but given we're somewhat
in the middle of the merge window and all traveling to get into Vienna
and have a few beers, I would much prefer merging these patches after
-rc1, that okay?
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/7] sched: change wake_up_bit() and related function to expect unsigned long *
2024-09-16 18:18 ` Peter Zijlstra
@ 2024-09-16 20:37 ` NeilBrown
0 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2024-09-16 20:37 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Linus Torvalds, Jens Axboe, linux-kernel,
linux-fsdevel, linux-block
On Tue, 17 Sep 2024, Peter Zijlstra wrote:
> On Mon, Sep 16, 2024 at 09:48:11PM +1000, NeilBrown wrote:
> > On Mon, 16 Sep 2024, Peter Zijlstra wrote:
> > > On Mon, Aug 26, 2024 at 04:30:59PM +1000, NeilBrown wrote:
> > > > wake_up_bit() currently allows a "void *". While this isn't strictly a
> > > > problem as the address is never dereferenced, it is inconsistent with
> > > > the corresponding wait_var_event() which requires "unsigned long *" and
> > > > does dereference the pointer.
> > >
> > > I'm having trouble parsing this. The way I read it, you're contradicting
> > > yourself. Where does wait_var_event() require 'unsigned long *' ?
> >
> > Sorry, that was meant to say "the corresponding wait_on_bit()".
> >
> >
> > >
> > > > And code that needs to wait for a change in something other than an
> > > > unsigned long would be better served by wake_up_var().
> > >
> > > This, afaict the whole var thing is size invariant. It only cares about
> > > the address.
> > >
> >
> > Again - wake_up_bit(). Sorry - bits and vars were swimming around my
> > brain and I didn't proof-read properly.
> >
> > This patch is all "bit", no "var".
>
> OK :-)
>
> Anyway, other than that the patches look fine, but given we're somewhat
> in the middle of the merge window and all traveling to get into Vienna
> and have a few beers, I would much prefer merging these patches after
> -rc1, that okay?
>
Yes, that's OK. Thanks for having a look. Have fun in Vienna.
NeilBrown
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue
2024-08-26 6:30 ` [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue NeilBrown
@ 2024-09-17 3:12 ` Jens Axboe
2024-09-17 21:54 ` NeilBrown
2024-09-17 3:13 ` (subset) " Jens Axboe
1 sibling, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2024-09-17 3:12 UTC (permalink / raw)
To: NeilBrown, Ingo Molnar, Peter Zijlstra, Linus Torvalds
Cc: linux-kernel, linux-fsdevel, linux-block
On 8/26/24 12:30 AM, NeilBrown wrote:
> bd_prepare_to_claim() waits for a var to change, not for a bit to be
> cleared.
> So change from bit_waitqueue() to __var_waitqueue() and correspondingly
> use wake_up_var().
> This will allow a future patch which changes the "bit" function to expect
> an "unsigned long *" instead of "void *".
Looks fine to me - since this one is separate from the series, I can snag
it and shove it into the block side so it'll make 6.12-rc1. Then at least
it won't be a dependency for the rest of the series post that.
--
Jens Axboe
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: (subset) [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue
2024-08-26 6:30 ` [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue NeilBrown
2024-09-17 3:12 ` Jens Axboe
@ 2024-09-17 3:13 ` Jens Axboe
1 sibling, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2024-09-17 3:13 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Linus Torvalds, NeilBrown
Cc: linux-kernel, linux-fsdevel, linux-block
On Mon, 26 Aug 2024 16:30:58 +1000, NeilBrown wrote:
> bd_prepare_to_claim() waits for a var to change, not for a bit to be
> cleared.
> So change from bit_waitqueue() to __var_waitqueue() and correspondingly
> use wake_up_var().
> This will allow a future patch which changes the "bit" function to expect
> an "unsigned long *" instead of "void *".
>
> [...]
Applied, thanks!
[1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue
commit: aa3d8a36780ab568d528348dd8115560f63ea16b
Best regards,
--
Jens Axboe
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue
2024-09-17 3:12 ` Jens Axboe
@ 2024-09-17 21:54 ` NeilBrown
0 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2024-09-17 21:54 UTC (permalink / raw)
To: Jens Axboe
Cc: Ingo Molnar, Peter Zijlstra, Linus Torvalds, linux-kernel,
linux-fsdevel, linux-block
On Tue, 17 Sep 2024, Jens Axboe wrote:
> On 8/26/24 12:30 AM, NeilBrown wrote:
> > bd_prepare_to_claim() waits for a var to change, not for a bit to be
> > cleared.
> > So change from bit_waitqueue() to __var_waitqueue() and correspondingly
> > use wake_up_var().
> > This will allow a future patch which changes the "bit" function to expect
> > an "unsigned long *" instead of "void *".
>
> Looks fine to me - since this one is separate from the series, I can snag
> it and shove it into the block side so it'll make 6.12-rc1. Then at least
> it won't be a dependency for the rest of the series post that.
Thanks Jens!
NeilBrown
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2024-09-17 21:54 UTC | newest]
Thread overview: 16+ messages
2024-08-26 6:30 [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown
2024-08-26 6:30 ` [PATCH 1/7] block: change wait on bd_claiming to use a var_waitqueue, not a bit_waitqueue NeilBrown
2024-09-17 3:12 ` Jens Axboe
2024-09-17 21:54 ` NeilBrown
2024-09-17 3:13 ` (subset) " Jens Axboe
2024-08-26 6:30 ` [PATCH 2/7] sched: change wake_up_bit() and related function to expect unsigned long * NeilBrown
2024-09-16 11:28 ` Peter Zijlstra
2024-09-16 11:48 ` NeilBrown
2024-09-16 18:18 ` Peter Zijlstra
2024-09-16 20:37 ` NeilBrown
2024-08-26 6:31 ` [PATCH 3/7] sched: Improve documentation for wake_up_bit/wait_on_bit family of functions NeilBrown
2024-08-26 6:31 ` [PATCH 4/7] sched: Document wait_var_event() family of functions and wake_up_var() NeilBrown
2024-08-26 6:31 ` [PATCH 5/7] sched: Add test_and_clear_wake_up_bit() and atomic_dec_and_wake_up() NeilBrown
2024-08-26 6:31 ` [PATCH 6/7] sched: Add wait/wake interface for variable updated under a lock NeilBrown
2024-08-26 6:31 ` [PATCH 7/7] Block: switch bd_prepare_to_claim to use wait_var_event_mutex() NeilBrown
2024-09-15 23:52 ` [PATCH 0/7 v2 RFC] Make wake_up_{bit,var} less fragile NeilBrown