* [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
@ 2025-01-13 6:18 I Hsin Cheng
2025-01-13 9:45 ` kernel test robot
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: I Hsin Cheng @ 2025-01-13 6:18 UTC (permalink / raw)
To: yury.norov; +Cc: linux, jserv, linux-kernel, I Hsin Cheng
Original implementation of "cpumask_any_but()" isn't actually random as
the comment claims itself to be. It's behavior is in fact to select the
first cpu in "mask" which isn't equal to "cpu".
Re-implement the function so we can choose a random cpu by randomly
select the value of "n" and choose the nth cpu in "mask"
Experiments[1] are done below to verify it generate more random result than
orginal implementation which tends to select the same cpu over and over
again.
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
---
The test is done on x86_64 architecture with 6.8.0-48-generic kernel
version on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
[1]:
Test script:
int init_module(void)
{
const struct cpumask *cur_mask = cpu_online_mask;
unsigned int cpu = 5, result;
int times = 50;
pr_info("Old cpumask_any_but(): ");
for (int i = 0; i < times; i++) {
result = cpumask_any_but(cur_mask, cpu);
pr_cont("%u ", result);
}
pr_info("\n");
pr_info("New cpumask_any_but(): ");
for (int i = 0; i < times; i++) {
result = cpumask_any_but_v2(cur_mask, cpu);
pr_cont("%u ", result);
}
pr_info("\n");
return 0;
}
Experiment result showned as below display in dmesg:
[ 8036.558152] Old cpumask_any_but(): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[ 8036.558193] New cpumask_any_but(): 7 1 1 2 2 2 3 4 0 2 7 4 6 3 3 2 2 4 2 7 6 6 6 4 6 6 6 4 4 7 6 2 2 6 7 6 6 3 0 6 2 1 0 4 4 6 4 6 6 3
---
include/linux/cpumask.h | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 9278a50d5..336297960 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
static __always_inline
unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
{
- unsigned int i;
+ unsigned int i, n, weight;
cpumask_check(cpu);
- for_each_cpu(i, mask)
- if (i != cpu)
- break;
+ weight = cpumask_weight(mask);
+ n = get_random_u32() % weight;
+
+ /* If we accidentally pick "n" equal to "cpu",
+ * then simply choose "n + 1"th cpu.
+ */
+ if (n == cpu)
+ n = (n + 1) % weight;
+ i = cpumask_nth(n, mask);
return i;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-13 6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
@ 2025-01-13 9:45 ` kernel test robot
2025-01-13 9:45 ` kernel test robot
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2025-01-13 9:45 UTC (permalink / raw)
To: I Hsin Cheng; +Cc: oe-kbuild-all
Hi Hsin,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:
[auto build test ERROR on linus/master]
[also build test ERROR on v6.13-rc7 next-20250113]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/I-Hsin-Cheng/cpumask-Implement-random-version-of-cpumask_any_but/20250113-142111
base: linus/master
patch link: https://lore.kernel.org/r/20250113061839.22131-1-richard120310%40gmail.com
patch subject: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
config: i386-buildonly-randconfig-002-20250113 (https://download.01.org/0day-ci/archive/20250113/202501131732.I9VU5v0r-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250113/202501131732.I9VU5v0r-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501131732.I9VU5v0r-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from include/linux/smp.h:13,
from include/linux/lockdep.h:14,
from include/linux/spinlock.h:63,
from include/linux/swait.h:7,
from include/linux/completion.h:12,
from include/linux/crypto.h:15,
from arch/x86/kernel/asm-offsets.c:9:
include/linux/cpumask.h: In function 'cpumask_any_but':
>> include/linux/cpumask.h:407:18: error: implicit declaration of function 'cpumask_weight'; did you mean 'cpumask_next'? [-Werror=implicit-function-declaration]
407 | weight = cpumask_weight(mask);
| ^~~~~~~~~~~~~~
| cpumask_next
>> include/linux/cpumask.h:408:13: error: implicit declaration of function 'get_random_u32' [-Werror=implicit-function-declaration]
408 | n = get_random_u32() % weight;
| ^~~~~~~~~~~~~~
>> include/linux/cpumask.h:415:13: error: implicit declaration of function 'cpumask_nth'; did you mean 'cpumask_next'? [-Werror=implicit-function-declaration]
415 | i = cpumask_nth(n, mask);
| ^~~~~~~~~~~
| cpumask_next
include/linux/cpumask.h: At top level:
>> include/linux/cpumask.h:450:14: error: conflicting types for 'cpumask_nth'; have 'unsigned int(unsigned int, const struct cpumask *)'
450 | unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
| ^~~~~~~~~~~
include/linux/cpumask.h:415:13: note: previous implicit declaration of 'cpumask_nth' with type 'int()'
415 | i = cpumask_nth(n, mask);
| ^~~~~~~~~~~
>> include/linux/cpumask.h:779:37: error: conflicting types for 'cpumask_weight'; have 'unsigned int(const struct cpumask *)'
779 | static __always_inline unsigned int cpumask_weight(const struct cpumask *srcp)
| ^~~~~~~~~~~~~~
include/linux/cpumask.h:407:18: note: previous implicit declaration of 'cpumask_weight' with type 'int()'
407 | weight = cpumask_weight(mask);
| ^~~~~~~~~~~~~~
In file included from include/linux/nodemask.h:98,
from include/linux/mmzone.h:18,
from include/linux/gfp.h:7,
from include/linux/slab.h:16,
from include/linux/crypto.h:17:
>> include/linux/random.h:42:5: error: conflicting types for 'get_random_u32'; have 'u32(void)' {aka 'unsigned int(void)'}
42 | u32 get_random_u32(void);
| ^~~~~~~~~~~~~~
include/linux/cpumask.h:408:13: note: previous implicit declaration of 'get_random_u32' with type 'int()'
408 | n = get_random_u32() % weight;
| ^~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:102: arch/x86/kernel/asm-offsets.s] Error 1 shuffle=3780167214
make[3]: Target 'prepare' not remade because of errors.
make[2]: *** [Makefile:1263: prepare0] Error 2 shuffle=3780167214
make[2]: Target 'prepare' not remade because of errors.
make[1]: *** [Makefile:251: __sub-make] Error 2 shuffle=3780167214
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:251: __sub-make] Error 2 shuffle=3780167214
make: Target 'prepare' not remade because of errors.
vim +407 include/linux/cpumask.h
317
318 /**
319 * for_each_cpu_wrap - iterate over every cpu in a mask, starting at a specified location
320 * @cpu: the (optionally unsigned) integer iterator
321 * @mask: the cpumask pointer
322 * @start: the start location
323 *
324 * The implementation does not assume any bit in @mask is set (including @start).
325 *
326 * After the loop, cpu is >= nr_cpu_ids.
327 */
328 #define for_each_cpu_wrap(cpu, mask, start) \
329 for_each_set_bit_wrap(cpu, cpumask_bits(mask), small_cpumask_bits, start)
330
331 /**
332 * for_each_cpu_and - iterate over every cpu in both masks
333 * @cpu: the (optionally unsigned) integer iterator
334 * @mask1: the first cpumask pointer
335 * @mask2: the second cpumask pointer
336 *
337 * This saves a temporary CPU mask in many places. It is equivalent to:
338 * struct cpumask tmp;
339 * cpumask_and(&tmp, &mask1, &mask2);
340 * for_each_cpu(cpu, &tmp)
341 * ...
342 *
343 * After the loop, cpu is >= nr_cpu_ids.
344 */
345 #define for_each_cpu_and(cpu, mask1, mask2) \
346 for_each_and_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
347
348 /**
349 * for_each_cpu_andnot - iterate over every cpu present in one mask, excluding
350 * those present in another.
351 * @cpu: the (optionally unsigned) integer iterator
352 * @mask1: the first cpumask pointer
353 * @mask2: the second cpumask pointer
354 *
355 * This saves a temporary CPU mask in many places. It is equivalent to:
356 * struct cpumask tmp;
357 * cpumask_andnot(&tmp, &mask1, &mask2);
358 * for_each_cpu(cpu, &tmp)
359 * ...
360 *
361 * After the loop, cpu is >= nr_cpu_ids.
362 */
363 #define for_each_cpu_andnot(cpu, mask1, mask2) \
364 for_each_andnot_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
365
366 /**
367 * for_each_cpu_or - iterate over every cpu present in either mask
368 * @cpu: the (optionally unsigned) integer iterator
369 * @mask1: the first cpumask pointer
370 * @mask2: the second cpumask pointer
371 *
372 * This saves a temporary CPU mask in many places. It is equivalent to:
373 * struct cpumask tmp;
374 * cpumask_or(&tmp, &mask1, &mask2);
375 * for_each_cpu(cpu, &tmp)
376 * ...
377 *
378 * After the loop, cpu is >= nr_cpu_ids.
379 */
380 #define for_each_cpu_or(cpu, mask1, mask2) \
381 for_each_or_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
382
383 /**
384 * for_each_cpu_from - iterate over CPUs present in @mask, from @cpu to the end of @mask.
385 * @cpu: the (optionally unsigned) integer iterator
386 * @mask: the cpumask pointer
387 *
388 * After the loop, cpu is >= nr_cpu_ids.
389 */
390 #define for_each_cpu_from(cpu, mask) \
391 for_each_set_bit_from(cpu, cpumask_bits(mask), small_cpumask_bits)
392
393 /**
394 * cpumask_any_but - return a "random" in a cpumask, but not this one.
395 * @mask: the cpumask to search
396 * @cpu: the cpu to ignore.
397 *
398 * Often used to find any cpu but smp_processor_id() in a mask.
399 * Return: >= nr_cpu_ids if no cpus set.
400 */
401 static __always_inline
402 unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
403 {
404 unsigned int i, n, weight;
405
406 cpumask_check(cpu);
> 407 weight = cpumask_weight(mask);
> 408 n = get_random_u32() % weight;
409
410 /* If we accidentally pick "n" equal to "cpu",
411 * then simply choose "n + 1"th cpu.
412 */
413 if (n == cpu)
414 n = (n + 1) % weight;
> 415 i = cpumask_nth(n, mask);
416 return i;
417 }
418
419 /**
420 * cpumask_any_and_but - pick a "random" cpu from *mask1 & *mask2, but not this one.
421 * @mask1: the first input cpumask
422 * @mask2: the second input cpumask
423 * @cpu: the cpu to ignore
424 *
425 * Returns >= nr_cpu_ids if no cpus set.
426 */
427 static __always_inline
428 unsigned int cpumask_any_and_but(const struct cpumask *mask1,
429 const struct cpumask *mask2,
430 unsigned int cpu)
431 {
432 unsigned int i;
433
434 cpumask_check(cpu);
435 i = cpumask_first_and(mask1, mask2);
436 if (i != cpu)
437 return i;
438
439 return cpumask_next_and(cpu, mask1, mask2);
440 }
441
442 /**
443 * cpumask_nth - get the Nth cpu in a cpumask
444 * @srcp: the cpumask pointer
445 * @cpu: the Nth cpu to find, starting from 0
446 *
447 * Return: >= nr_cpu_ids if such cpu doesn't exist.
448 */
449 static __always_inline
> 450 unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
451 {
452 return find_nth_bit(cpumask_bits(srcp), small_cpumask_bits, cpumask_check(cpu));
453 }
454
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-13 6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
2025-01-13 9:45 ` kernel test robot
@ 2025-01-13 9:45 ` kernel test robot
2025-01-13 10:13 ` Kuan-Wei Chiu
2025-01-13 11:05 ` Mark Rutland
3 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2025-01-13 9:45 UTC (permalink / raw)
To: I Hsin Cheng; +Cc: llvm, oe-kbuild-all
Hi Hsin,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:
[auto build test ERROR on linus/master]
[also build test ERROR on v6.13-rc7 next-20250113]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/I-Hsin-Cheng/cpumask-Implement-random-version-of-cpumask_any_but/20250113-142111
base: linus/master
patch link: https://lore.kernel.org/r/20250113061839.22131-1-richard120310%40gmail.com
patch subject: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
config: i386-buildonly-randconfig-001-20250113 (https://download.01.org/0day-ci/archive/20250113/202501131709.T02YI6Rr-lkp@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250113/202501131709.T02YI6Rr-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501131709.T02YI6Rr-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from arch/x86/kernel/asm-offsets.c:9:
In file included from include/linux/crypto.h:15:
In file included from include/linux/completion.h:12:
In file included from include/linux/swait.h:7:
In file included from include/linux/spinlock.h:63:
In file included from include/linux/lockdep.h:14:
In file included from include/linux/smp.h:13:
>> include/linux/cpumask.h:407:11: error: call to undeclared function 'cpumask_weight'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
407 | weight = cpumask_weight(mask);
| ^
>> include/linux/cpumask.h:408:6: error: call to undeclared function 'get_random_u32'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
408 | n = get_random_u32() % weight;
| ^
>> include/linux/cpumask.h:415:6: error: call to undeclared function 'cpumask_nth'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
415 | i = cpumask_nth(n, mask);
| ^
include/linux/cpumask.h:415:6: note: did you mean 'cpumask_next'?
include/linux/cpumask.h:217:14: note: 'cpumask_next' declared here
217 | unsigned int cpumask_next(int n, const struct cpumask *srcp)
| ^
>> include/linux/cpumask.h:450:14: error: static declaration of 'cpumask_nth' follows non-static declaration
450 | unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
| ^
include/linux/cpumask.h:415:6: note: previous implicit declaration is here
415 | i = cpumask_nth(n, mask);
| ^
>> include/linux/cpumask.h:779:37: error: static declaration of 'cpumask_weight' follows non-static declaration
779 | static __always_inline unsigned int cpumask_weight(const struct cpumask *srcp)
| ^
include/linux/cpumask.h:407:11: note: previous implicit declaration is here
407 | weight = cpumask_weight(mask);
| ^
In file included from arch/x86/kernel/asm-offsets.c:9:
In file included from include/linux/crypto.h:17:
In file included from include/linux/slab.h:16:
In file included from include/linux/gfp.h:7:
In file included from include/linux/mmzone.h:18:
In file included from include/linux/nodemask.h:98:
>> include/linux/random.h:42:5: error: conflicting types for 'get_random_u32'
42 | u32 get_random_u32(void);
| ^
include/linux/cpumask.h:408:6: note: previous implicit declaration is here
408 | n = get_random_u32() % weight;
| ^
In file included from arch/x86/kernel/asm-offsets.c:14:
In file included from include/linux/suspend.h:5:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:17:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:98:11: warning: array index 3 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
98 | return (set->sig[3] | set->sig[2] |
| ^ ~
arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
24 | unsigned long sig[_NSIG_WORDS];
| ^
In file included from arch/x86/kernel/asm-offsets.c:14:
In file included from include/linux/suspend.h:5:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:17:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:98:25: warning: array index 2 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
98 | return (set->sig[3] | set->sig[2] |
| ^ ~
arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
24 | unsigned long sig[_NSIG_WORDS];
| ^
In file included from arch/x86/kernel/asm-offsets.c:14:
In file included from include/linux/suspend.h:5:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:17:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:114:11: warning: array index 3 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
114 | return (set1->sig[3] == set2->sig[3]) &&
| ^ ~
arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
24 | unsigned long sig[_NSIG_WORDS];
| ^
In file included from arch/x86/kernel/asm-offsets.c:14:
In file included from include/linux/suspend.h:5:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:17:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:114:27: warning: array index 3 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
114 | return (set1->sig[3] == set2->sig[3]) &&
| ^ ~
arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
24 | unsigned long sig[_NSIG_WORDS];
| ^
In file included from arch/x86/kernel/asm-offsets.c:14:
In file included from include/linux/suspend.h:5:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:17:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:115:5: warning: array index 2 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
115 | (set1->sig[2] == set2->sig[2]) &&
| ^ ~
arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
24 | unsigned long sig[_NSIG_WORDS];
| ^
In file included from arch/x86/kernel/asm-offsets.c:14:
In file included from include/linux/suspend.h:5:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:17:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:115:21: warning: array index 2 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
115 | (set1->sig[2] == set2->sig[2]) &&
| ^ ~
arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
24 | unsigned long sig[_NSIG_WORDS];
| ^
In file included from arch/x86/kernel/asm-offsets.c:14:
In file included from include/linux/suspend.h:5:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:17:
vim +/cpumask_weight +407 include/linux/cpumask.h
317
318 /**
319 * for_each_cpu_wrap - iterate over every cpu in a mask, starting at a specified location
320 * @cpu: the (optionally unsigned) integer iterator
321 * @mask: the cpumask pointer
322 * @start: the start location
323 *
324 * The implementation does not assume any bit in @mask is set (including @start).
325 *
326 * After the loop, cpu is >= nr_cpu_ids.
327 */
328 #define for_each_cpu_wrap(cpu, mask, start) \
329 for_each_set_bit_wrap(cpu, cpumask_bits(mask), small_cpumask_bits, start)
330
331 /**
332 * for_each_cpu_and - iterate over every cpu in both masks
333 * @cpu: the (optionally unsigned) integer iterator
334 * @mask1: the first cpumask pointer
335 * @mask2: the second cpumask pointer
336 *
337 * This saves a temporary CPU mask in many places. It is equivalent to:
338 * struct cpumask tmp;
339 * cpumask_and(&tmp, &mask1, &mask2);
340 * for_each_cpu(cpu, &tmp)
341 * ...
342 *
343 * After the loop, cpu is >= nr_cpu_ids.
344 */
345 #define for_each_cpu_and(cpu, mask1, mask2) \
346 for_each_and_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
347
348 /**
349 * for_each_cpu_andnot - iterate over every cpu present in one mask, excluding
350 * those present in another.
351 * @cpu: the (optionally unsigned) integer iterator
352 * @mask1: the first cpumask pointer
353 * @mask2: the second cpumask pointer
354 *
355 * This saves a temporary CPU mask in many places. It is equivalent to:
356 * struct cpumask tmp;
357 * cpumask_andnot(&tmp, &mask1, &mask2);
358 * for_each_cpu(cpu, &tmp)
359 * ...
360 *
361 * After the loop, cpu is >= nr_cpu_ids.
362 */
363 #define for_each_cpu_andnot(cpu, mask1, mask2) \
364 for_each_andnot_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
365
366 /**
367 * for_each_cpu_or - iterate over every cpu present in either mask
368 * @cpu: the (optionally unsigned) integer iterator
369 * @mask1: the first cpumask pointer
370 * @mask2: the second cpumask pointer
371 *
372 * This saves a temporary CPU mask in many places. It is equivalent to:
373 * struct cpumask tmp;
374 * cpumask_or(&tmp, &mask1, &mask2);
375 * for_each_cpu(cpu, &tmp)
376 * ...
377 *
378 * After the loop, cpu is >= nr_cpu_ids.
379 */
380 #define for_each_cpu_or(cpu, mask1, mask2) \
381 for_each_or_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
382
383 /**
384 * for_each_cpu_from - iterate over CPUs present in @mask, from @cpu to the end of @mask.
385 * @cpu: the (optionally unsigned) integer iterator
386 * @mask: the cpumask pointer
387 *
388 * After the loop, cpu is >= nr_cpu_ids.
389 */
390 #define for_each_cpu_from(cpu, mask) \
391 for_each_set_bit_from(cpu, cpumask_bits(mask), small_cpumask_bits)
392
393 /**
394 * cpumask_any_but - return a "random" in a cpumask, but not this one.
395 * @mask: the cpumask to search
396 * @cpu: the cpu to ignore.
397 *
398 * Often used to find any cpu but smp_processor_id() in a mask.
399 * Return: >= nr_cpu_ids if no cpus set.
400 */
401 static __always_inline
402 unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
403 {
404 unsigned int i, n, weight;
405
406 cpumask_check(cpu);
> 407 weight = cpumask_weight(mask);
> 408 n = get_random_u32() % weight;
409
410 /* If we accidentally pick "n" equal to "cpu",
411 * then simply choose "n + 1"th cpu.
412 */
413 if (n == cpu)
414 n = (n + 1) % weight;
> 415 i = cpumask_nth(n, mask);
416 return i;
417 }
418
419 /**
420 * cpumask_any_and_but - pick a "random" cpu from *mask1 & *mask2, but not this one.
421 * @mask1: the first input cpumask
422 * @mask2: the second input cpumask
423 * @cpu: the cpu to ignore
424 *
425 * Returns >= nr_cpu_ids if no cpus set.
426 */
427 static __always_inline
428 unsigned int cpumask_any_and_but(const struct cpumask *mask1,
429 const struct cpumask *mask2,
430 unsigned int cpu)
431 {
432 unsigned int i;
433
434 cpumask_check(cpu);
435 i = cpumask_first_and(mask1, mask2);
436 if (i != cpu)
437 return i;
438
439 return cpumask_next_and(cpu, mask1, mask2);
440 }
441
442 /**
443 * cpumask_nth - get the Nth cpu in a cpumask
444 * @srcp: the cpumask pointer
445 * @cpu: the Nth cpu to find, starting from 0
446 *
447 * Return: >= nr_cpu_ids if such cpu doesn't exist.
448 */
449 static __always_inline
> 450 unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
451 {
452 return find_nth_bit(cpumask_bits(srcp), small_cpumask_bits, cpumask_check(cpu));
453 }
454
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-13 6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
2025-01-13 9:45 ` kernel test robot
2025-01-13 9:45 ` kernel test robot
@ 2025-01-13 10:13 ` Kuan-Wei Chiu
2025-01-13 10:27 ` I Hsin Cheng
2025-01-13 11:05 ` Mark Rutland
3 siblings, 1 reply; 12+ messages in thread
From: Kuan-Wei Chiu @ 2025-01-13 10:13 UTC (permalink / raw)
To: I Hsin Cheng; +Cc: yury.norov, linux, jserv, linux-kernel
Hi I Hsin,
On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> Original implementation of "cpumask_any_but()" isn't actually random as
> the comment claims itself to be. It's behavior is in fact to select the
> first cpu in "mask" which isn't equal to "cpu".
>
> Re-implement the function so we can choose a random cpu by randomly
> select the value of "n" and choose the nth cpu in "mask"
>
This patch may slow down the efficiency of cpumask_any_but(). Are there
any in-tree users of cpumask_any_but() that require it to return a
truly random id, or benefit from such behavior?
Regards,
Kuan-Wei
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-13 10:13 ` Kuan-Wei Chiu
@ 2025-01-13 10:27 ` I Hsin Cheng
2025-01-13 11:09 ` Kuan-Wei Chiu
0 siblings, 1 reply; 12+ messages in thread
From: I Hsin Cheng @ 2025-01-13 10:27 UTC (permalink / raw)
To: Kuan-Wei Chiu; +Cc: yury.norov, linux, jserv, linux-kernel
On Mon, Jan 13, 2025 at 06:13:07PM +0800, Kuan-Wei Chiu wrote:
> Hi I Hsin,
>
> On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > Original implementation of "cpumask_any_but()" isn't actually random as
> > the comment claims itself to be. It's behavior is in fact to select the
> > first cpu in "mask" which isn't equal to "cpu".
> >
> > Re-implement the function so we can choose a random cpu by randomly
> > select the value of "n" and choose the nth cpu in "mask"
> >
> This patch may slow down the efficiency of cpumask_any_but(). Are there
> any in-tree users of cpumask_any_but() that require it to return a
> truly random id, or benefit from such behavior?
Hi Kuan-Wei,
Thanks for your review ! Yes indeed it may slow down the efficiency
abit.
However there are some use cases I think randomness is important
such as "dl_task_offline_migration()" under kernel/sched/deadline.c ,
where the operation of cpu picking shouldn't favor certain cpu too much.
Also "select_task_rq()" utitlize "cpumask_any()" to pick cpu, it doesn't
need to be perfectly random, but neither should it only favor certain
cpu.
What do you think?
Best regards,
I Hsin
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-13 6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
` (2 preceding siblings ...)
2025-01-13 10:13 ` Kuan-Wei Chiu
@ 2025-01-13 11:05 ` Mark Rutland
2025-01-13 18:00 ` Yury Norov
3 siblings, 1 reply; 12+ messages in thread
From: Mark Rutland @ 2025-01-13 11:05 UTC (permalink / raw)
To: I Hsin Cheng; +Cc: yury.norov, linux, jserv, linux-kernel
On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> Original implementation of "cpumask_any_but()" isn't actually random as
> the comment claims itself to be. It's behavior is in fact to select the
> first cpu in "mask" which isn't equal to "cpu".
What it says specifically is:
cpumask_any_but - return a "random" in a cpumask, but not this one.
... and by "random", it really means "arbitrary".
The idea here is that the caller is specifying that it doesn't care
which specific CPU is chosen, but this is not required to be a random
selection.
> Re-implement the function so we can choose a random cpu by randomly
> select the value of "n" and choose the nth cpu in "mask"
>
> Experiments[1] are done below to verify it generate more random result than
> orginal implementation which tends to select the same cpu over and over
> again.
I think what you're after here is similar to
cpumask_any_and_distribute(), and you should look at building
cpumask_any_but_distribute() in the same way, rather than changing
cpumask_any_but().
Mark.
> Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
> ---
> The test is done on x86_64 architecture with 6.8.0-48-generic kernel
> version on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
>
> [1]:
> Test script:
>
> int init_module(void)
> {
> const struct cpumask *cur_mask = cpu_online_mask;
> unsigned int cpu = 5, result;
> int times = 50;
>
> pr_info("Old cpumask_any_but(): ");
> for (int i = 0; i < times; i++) {
> result = cpumask_any_but(cur_mask, cpu);
> pr_cont("%u ", result);
> }
> pr_info("\n");
>
> pr_info("New cpumask_any_but(): ");
> for (int i = 0; i < times; i++) {
> result = cpumask_any_but_v2(cur_mask, cpu);
> pr_cont("%u ", result);
> }
> pr_info("\n");
>
> return 0;
> }
>
> Experiment result showned as below display in dmesg:
> [ 8036.558152] Old cpumask_any_but(): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> [ 8036.558193] New cpumask_any_but(): 7 1 1 2 2 2 3 4 0 2 7 4 6 3 3 2 2 4 2 7 6 6 6 4 6 6 6 4 4 7 6 2 2 6 7 6 6 3 0 6 2 1 0 4 4 6 4 6 6 3
> ---
> include/linux/cpumask.h | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 9278a50d5..336297960 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> static __always_inline
> unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> {
> - unsigned int i;
> + unsigned int i, n, weight;
>
> cpumask_check(cpu);
> - for_each_cpu(i, mask)
> - if (i != cpu)
> - break;
> + weight = cpumask_weight(mask);
> + n = get_random_u32() % weight;
> +
> + /* If we accidentally pick "n" equal to "cpu",
> + * then simply choose "n + 1"th cpu.
> + */
> + if (n == cpu)
> + n = (n + 1) % weight;
> + i = cpumask_nth(n, mask);
> return i;
> }
>
> --
> 2.43.0
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-13 10:27 ` I Hsin Cheng
@ 2025-01-13 11:09 ` Kuan-Wei Chiu
0 siblings, 0 replies; 12+ messages in thread
From: Kuan-Wei Chiu @ 2025-01-13 11:09 UTC (permalink / raw)
To: I Hsin Cheng; +Cc: yury.norov, linux, jserv, linux-kernel
On Mon, Jan 13, 2025 at 06:27:26PM +0800, I Hsin Cheng wrote:
> On Mon, Jan 13, 2025 at 06:13:07PM +0800, Kuan-Wei Chiu wrote:
> > Hi I Hsin,
> >
> > On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > > Original implementation of "cpumask_any_but()" isn't actually random as
> > > the comment claims itself to be. It's behavior is in fact to select the
> > > first cpu in "mask" which isn't equal to "cpu".
> > >
> > > Re-implement the function so we can choose a random cpu by randomly
> > > select the value of "n" and choose the nth cpu in "mask"
> > >
> > This patch may slow down the efficiency of cpumask_any_but(). Are there
> > any in-tree users of cpumask_any_but() that require it to return a
> > truly random id, or benefit from such behavior?
>
> Hi Kuan-Wei,
>
> Thanks for your review ! Yes indeed it may slow down the efficiency
> abit.
>
> However there are some use cases I think randomness is important
> such as "dl_task_offline_migration()" under kernel/sched/deadline.c ,
> where the operation of cpu picking shouldn't favor certain cpu too much.
>
> Also "select_task_rq()" utitlize "cpumask_any()" to pick cpu, it doesn't
> need to be perfectly random, but neither should it only favor certain
> cpu.
>
> What do you think?
>
If a true random number isn't needed, could next_pseudo_random32() be
used instead for better efficiency?
I'm not familiar with the scheduler, but if there are only one or two
scheduler use cases, would you consider creating a new
cpumask_random_but() API and converting those specific cases to use it?
In this case, the patch should also be CC'd to scheduler developers.
Additionally, you should explain the benefits of this approach in the
patch description.
Regards,
Kuan-Wei
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-13 11:05 ` Mark Rutland
@ 2025-01-13 18:00 ` Yury Norov
2025-01-14 7:15 ` I Hsin Cheng
0 siblings, 1 reply; 12+ messages in thread
From: Yury Norov @ 2025-01-13 18:00 UTC (permalink / raw)
To: Mark Rutland; +Cc: I Hsin Cheng, linux, jserv, linux-kernel
On Mon, Jan 13, 2025 at 11:05:19AM +0000, Mark Rutland wrote:
> On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > Original implementation of "cpumask_any_but()" isn't actually random as
> > the comment claims itself to be. It's behavior is in fact to select the
> > first cpu in "mask" which isn't equal to "cpu".
>
> What it says specifically is:
>
> cpumask_any_but - return a "random" in a cpumask, but not this one.
>
> ... and by "random", it really means "arbitrary".
>
> The idea here is that the caller is specifying that it doesn't care
> which specific CPU is chosen, but this is not required to be a random
> selection.
>
> > Re-implement the function so we can choose a random cpu by randomly
> > select the value of "n" and choose the nth cpu in "mask"
> >
> > Experiments[1] are done below to verify it generate more random result than
> > orginal implementation which tends to select the same cpu over and over
> > again.
>
> I think what you're after here is similar to
> cpumask_any_and_distribute(), and you should look at building
> cpumask_any_but_distribute() in the same way, rather than changing
> cpumask_any_but().
>
> Mark.
I agree with Mark. cpumask_any_but_distribute() is what you most
likely need. Anyways, whatever you end up please don't change existing
API, especially in a way that hurts performance so badly.
>
> > Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
This patch should go with a demonstration that some particular
system(s) benefits from it, and the others don't suffer.
> > ---
> > The test is done on x86_64 architecture with 6.8.0-48-generic kernel
> > version on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> >
> > [1]:
> > Test script:
> >
> > int init_module(void)
> > {
> > const struct cpumask *cur_mask = cpu_online_mask;
> > unsigned int cpu = 5, result;
> > int times = 50;
> >
> > pr_info("Old cpumask_any_but(): ");
> > for (int i = 0; i < times; i++) {
> > result = cpumask_any_but(cur_mask, cpu);
> > pr_cont("%u ", result);
> > }
> > pr_info("\n");
> >
> > pr_info("New cpumask_any_but(): ");
> > for (int i = 0; i < times; i++) {
> > result = cpumask_any_but_v2(cur_mask, cpu);
> > pr_cont("%u ", result);
> > }
> > pr_info("\n");
> >
> > return 0;
> > }
> >
> > Experiment result showned as below display in dmesg:
> > [ 8036.558152] Old cpumask_any_but(): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> >
> > [ 8036.558193] New cpumask_any_but(): 7 1 1 2 2 2 3 4 0 2 7 4 6 3 3 2 2 4 2 7 6 6 6 4 6 6 6 4 4 7 6 2 2 6 7 6 6 3 0 6 2 1 0 4 4 6 4 6 6 3
>
> > ---
> > include/linux/cpumask.h | 14 ++++++++++----
> > 1 file changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index 9278a50d5..336297960 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
#include <linux/random.h>
Which would be really good to avoid.
> > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> > static __always_inline
> > unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> > {
> > - unsigned int i;
> > + unsigned int i, n, weight;
> >
> > cpumask_check(cpu);
> > - for_each_cpu(i, mask)
> > - if (i != cpu)
> > - break;
> > + weight = cpumask_weight(mask);
> > + n = get_random_u32() % weight;
> > +
> > + /* If we accidentally pick "n" equal to "cpu",
> > + * then simply choose "n + 1"th cpu.
> > + */
> > + if (n == cpu)
> > + n = (n + 1) % weight;
> > + i = cpumask_nth(n, mask);
This is an entirely broken thing, and it works only because your CPU mask
is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5)
will return 5 exactly, if the get_random_u32() draws 4.
It looks broken even for a dense mask. By probability, your code returns:
P(0-4,7) == 1/8,
P(5) == 0,
P(6) == 1/4.
Assuming you are trying to implement a random uniform distribution drawing,
the correct probabilities should look like:
P(0-4,6-7) == 1/7,
P(5) == 0,
Thanks,
Yury
> > return i;
> > }
> >
> > --
> > 2.43.0
> >
> >
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-13 18:00 ` Yury Norov
@ 2025-01-14 7:15 ` I Hsin Cheng
2025-01-14 15:02 ` Mark Rutland
2025-01-14 15:43 ` Yury Norov
0 siblings, 2 replies; 12+ messages in thread
From: I Hsin Cheng @ 2025-01-14 7:15 UTC (permalink / raw)
To: Yury Norov; +Cc: mark.rutland, linux, jserv, linux-kernel, visitorckw
On Mon, Jan 13, 2025 at 01:00:56PM -0500, Yury Norov wrote:
> On Mon, Jan 13, 2025 at 11:05:19AM +0000, Mark Rutland wrote:
> > On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > > Original implementation of "cpumask_any_but()" isn't actually random as
> > > the comment claims itself to be. It's behavior is in fact to select the
> > > first cpu in "mask" which isn't equal to "cpu".
> >
> > What it says specifically is:
> >
> > cpumask_any_but - return a "random" in a cpumask, but not this one.
> >
> > ... and by "random", it really means "arbitrary".
> >
> > The idea here is that the caller is specifying that it doesn't care
> > which specific CPU is chosen, but this is not required to be a random
> > selection.
> >
> > > Re-implement the function so we can choose a random cpu by randomly
> > > select the value of "n" and choose the nth cpu in "mask"
> > >
> > > Experiments[1] are done below to verify it generate more random result than
> > > orginal implementation which tends to select the same cpu over and over
> > > again.
> >
> > I think what you're after here is similar to
> > cpumask_any_and_distribute(), and you should look at building
> > cpumask_any_but_distribute() in the same way, rather than changing
> > cpumask_any_but().
> >
> > Mark.
>
> I agree with Mark. cpumask_any_but_distribute() is what you most
> likely need. Anyways, whatever you end up please don't change existing
> API, especially in a way that hurts performance so badly.
I see, I didn't noticed cpumask_any*_distribute() APIs, they're the
thing I was looking for, thanks!
>
> >
> > > Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
>
> This patch should go with a demonstration that some particular
> system(s) benefits from it, and the others don't suffer.
Thanks for the head up, I'll be aware of this in the future.
>
> > > ---
> > > The test is done on x86_64 architecture with 6.8.0-48-generic kernel
> > > version on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> > >
> > > [1]:
> > > Test script:
> > >
> > > int init_module(void)
> > > {
> > > const struct cpumask *cur_mask = cpu_online_mask;
> > > unsigned int cpu = 5, result;
> > > int times = 50;
> > >
> > > pr_info("Old cpumask_any_but(): ");
> > > for (int i = 0; i < times; i++) {
> > > result = cpumask_any_but(cur_mask, cpu);
> > > pr_cont("%u ", result);
> > > }
> > > pr_info("\n");
> > >
> > > pr_info("New cpumask_any_but(): ");
> > > for (int i = 0; i < times; i++) {
> > > result = cpumask_any_but_v2(cur_mask, cpu);
> > > pr_cont("%u ", result);
> > > }
> > > pr_info("\n");
> > >
> > > return 0;
> > > }
> > >
> > > Experiment result showned as below display in dmesg:
> > > [ 8036.558152] Old cpumask_any_but(): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > >
> > > [ 8036.558193] New cpumask_any_but(): 7 1 1 2 2 2 3 4 0 2 7 4 6 3 3 2 2 4 2 7 6 6 6 4 6 6 6 4 4 7 6 2 2 6 7 6 6 3 0 6 2 1 0 4 4 6 4 6 6 3
> >
> > > ---
> > > include/linux/cpumask.h | 14 ++++++++++----
> > > 1 file changed, 10 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > > index 9278a50d5..336297960 100644
> > > --- a/include/linux/cpumask.h
> > > +++ b/include/linux/cpumask.h
>
> #include <linux/random.h>
>
> Which would be really good to avoid.
>
> > > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> > > static __always_inline
> > > unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> > > {
> > > - unsigned int i;
> > > + unsigned int i, n, weight;
> > >
> > > cpumask_check(cpu);
> > > - for_each_cpu(i, mask)
> > > - if (i != cpu)
> > > - break;
> > > + weight = cpumask_weight(mask);
> > > + n = get_random_u32() % weight;
> > > +
> > > + /* If we accidentally pick "n" equal to "cpu",
> > > + * then simply choose "n + 1"th cpu.
> > > + */
> > > + if (n == cpu)
> > > + n = (n + 1) % weight;
> > > + i = cpumask_nth(n, mask);
>
> This is an entirely broken thing, and it works only because your CPU mask
> is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5)
> will return 5 exactly, if the get_random_u32() draws 4.
>
Oh I'm sorry for this part, "cpumask_nth()" return value should be
checked instead of checking "get_random_u32()". May I ask why does it
only work for dense cpumask? I mean clearly original method will be
faster when cpumask isn't dense.
> It looks broken even for a dense mask. By probability, your code returns:
>
> P(0-4,7) == 1/8,
> P(5) == 0,
> P(6) == 1/4.
>
> Assuming you are trying to implement a random uniform distribution drawing,
> the correct probabilities should look like:
>
> P(0-4,6-7) == 1/7,
> P(5) == 0,
>
> Thanks,
> Yury
>
I did noticed this part, I was thinking that we do not actually need to
make the prbability to be uniform distribution, just want to make it
scatters more then picking certain number.
Thanks for your detailed review and explanation, also thanks to Mark and
Kuan-Wei !
I now realize "random" here is more of a convention for the caller to
states that it doesn't matter which cpu it gets, maybe we should
rephrase the comment to make it less confusing? Because I think "random"
itself does stands for a particular meaning.
Best regards,
Richard Cheng.
> > > return i;
> > > }
> > >
> > > --
> > > 2.43.0
> > >
> > >
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-14 7:15 ` I Hsin Cheng
@ 2025-01-14 15:02 ` Mark Rutland
2025-01-14 15:43 ` Yury Norov
1 sibling, 0 replies; 12+ messages in thread
From: Mark Rutland @ 2025-01-14 15:02 UTC (permalink / raw)
To: I Hsin Cheng; +Cc: Yury Norov, linux, jserv, linux-kernel, visitorckw
On Tue, Jan 14, 2025 at 03:15:43PM +0800, I Hsin Cheng wrote:
> On Mon, Jan 13, 2025 at 01:00:56PM -0500, Yury Norov wrote:
> > On Mon, Jan 13, 2025 at 11:05:19AM +0000, Mark Rutland wrote:
> > > On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > > > Original implementation of "cpumask_any_but()" isn't actually random as
> > > > the comment claims itself to be. It's behavior is in fact to select the
> > > > first cpu in "mask" which isn't equal to "cpu".
> > >
> > > What it says specifically is:
> > >
> > > cpumask_any_but - return a "random" in a cpumask, but not this one.
> > >
> > > ... and by "random", it really means "arbitrary".
> > >
> > > The idea here is that the caller is specifying that it doesn't care
> > > which specific CPU is chosen, but this is not required to be a random
> > > selection.
> I now realize "random" here is more of a convention for the caller to
> states that it doesn't matter which cpu it gets, maybe we should
> rephrase the comment to make it less confusing? Because I think "random"
> itself does stands for a particular meaning.
FWIW, I agree. I reckon (as above), we could replace "random" with
arbitrary, i.e. replace
return a "random" cpu in a cpumask ...
... with:
return an arbitrary cpu in a cpumask ...
Looking again I see that the comment for cpumask_any_but() misses the
word "cpu" too, so it'd be nice to clean that up regardless.
Mark.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-14 7:15 ` I Hsin Cheng
2025-01-14 15:02 ` Mark Rutland
@ 2025-01-14 15:43 ` Yury Norov
2025-01-15 7:24 ` I Hsin Cheng
1 sibling, 1 reply; 12+ messages in thread
From: Yury Norov @ 2025-01-14 15:43 UTC (permalink / raw)
To: I Hsin Cheng; +Cc: mark.rutland, linux, jserv, linux-kernel, visitorckw
> > > > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> > > > static __always_inline
> > > > unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> > > > {
> > > > - unsigned int i;
> > > > + unsigned int i, n, weight;
> > > >
> > > > cpumask_check(cpu);
> > > > - for_each_cpu(i, mask)
> > > > - if (i != cpu)
> > > > - break;
> > > > + weight = cpumask_weight(mask);
> > > > + n = get_random_u32() % weight;
> > > > +
> > > > + /* If we accidentally pick "n" equal to "cpu",
> > > > + * then simply choose "n + 1"th cpu.
> > > > + */
> > > > + if (n == cpu)
> > > > + n = (n + 1) % weight;
> > > > + i = cpumask_nth(n, mask);
> >
> > This is an entirely broken thing, and it works only because your CPU mask
> > is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5)
> > will return 5 exactly, if the get_random_u32() draws 4.
> >
>
> Oh I'm sorry for this part, "cpumask_nth()" return value should be
> checked instead of checking "get_random_u32()". May I ask why does it
> only work for dense cpumask?
Because for dense mask cpumask_nth(n) == n, and your
n = get_random_u32() % weight
is the same as
n = cpumask_nth(get_random_u32() % weight)
> I mean clearly original method will be
> faster when cpumask isn't dense.
>
> > It looks broken even for a dense mask. By probability, your code returns:
> >
> > P(0-4,7) == 1/8,
> > P(5) == 0,
> > P(6) == 1/4.
> >
> > Assuming you are trying to implement a random uniform distribution drawing,
> > the correct probabilities should look like:
> >
> > P(0-4,6-7) == 1/7,
> > P(5) == 0,
> >
> > Thanks,
> > Yury
> >
>
> I did noticed this part, I was thinking that we do not actually need to
> make the prbability to be uniform distribution, just want to make it
> scatters more then picking certain number.
On cpumask level you can't speculate whether users need true randomness
or not. That's why we pay so much attention to proper function naming.
See:
- cpumask_any() - 'any' requirement is much simpler than random;
- cpumask_any_distribute() - not random at all, just a better (good
enough) approximation;
- cpumask_random() - a true random drawing. Must pass Chi^2,
Kolmogorov-Smirnov and other fancy statistical tests. As you can see,
doesn't exist.
If in your function you use expensive true-random get_random_u32(),
one can expect that the output will not be worse than the original
uniform distribution, by the statistical properties means.
> Thanks for your detailed review and explanation, also thanks to Mark and
> Kuan-Wei !
>
> I now realize "random" here is more of a convention for the caller to
> states that it doesn't matter which cpu it gets, maybe we should
> rephrase the comment to make it less confusing? Because I think "random"
> itself does stands for a particular meaning.
Feel free to send a patch.
Thanks,
Yury
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
2025-01-14 15:43 ` Yury Norov
@ 2025-01-15 7:24 ` I Hsin Cheng
0 siblings, 0 replies; 12+ messages in thread
From: I Hsin Cheng @ 2025-01-15 7:24 UTC (permalink / raw)
To: Yury Norov; +Cc: mark.rutland, linux, jserv, linux-kernel, visitorckw
On Tue, Jan 14, 2025 at 10:43:56AM -0500, Yury Norov wrote:
> > > > > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> > > > > static __always_inline
> > > > > unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> > > > > {
> > > > > - unsigned int i;
> > > > > + unsigned int i, n, weight;
> > > > >
> > > > > cpumask_check(cpu);
> > > > > - for_each_cpu(i, mask)
> > > > > - if (i != cpu)
> > > > > - break;
> > > > > + weight = cpumask_weight(mask);
> > > > > + n = get_random_u32() % weight;
> > > > > +
> > > > > + /* If we accidentally pick "n" equal to "cpu",
> > > > > + * then simply choose "n + 1"th cpu.
> > > > > + */
> > > > > + if (n == cpu)
> > > > > + n = (n + 1) % weight;
> > > > > + i = cpumask_nth(n, mask);
> > >
> > > This is an entirely broken thing, and it works only because your CPU mask
> > > is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5)
> > > will return 5 exactly, if the get_random_u32() draws 4.
> > >
> >
> > Oh I'm sorry for this part, "cpumask_nth()" return value should be
> > checked instead of checking "get_random_u32()". May I ask why does it
> > only work for dense cpumask?
>
> Because for dense mask cpumask_nth(n) == n, and your
> n = get_random_u32() % weight
> is the same as
> n = cpumask_nth(get_random_u32() % weight)
>
> > I mean clearly original method will be
> > faster when cpumask isn't dense.
> >
> > > It looks broken even for a dense mask. By probability, your code returns:
> > >
> > > P(0-4,7) == 1/8,
> > > P(5) == 0,
> > > P(6) == 1/4.
> > >
> > > Assuming you are trying to implement a random uniform distribution drawing,
> > > the correct probabilities should look like:
> > >
> > > P(0-4,6-7) == 1/7,
> > > P(5) == 0,
> > >
> > > Thanks,
> > > Yury
> > >
> >
> > I did noticed this part, I was thinking that we do not actually need to
> > make the prbability to be uniform distribution, just want to make it
> > scatters more then picking certain number.
>
> On cpumask level you can't speculate whether users need true randomness
> or not. That's why we pay so much attention to proper function naming.
>
> See:
> - cpumask_any() - 'any' requirement is much simpler than random;
> - cpumask_any_distribute() - not random at all, just a better (good
> enough) approximation;
> - cpumask_random() - a true random drawing. Must pass Chi^2,
> Kolmogorov-Smirnov and other fancy statistical tests. As you can see,
> doesn't exist.
>
> If in your function you use expensive true-random get_random_u32(),
> one can expect that the output will not be worse than the original
> uniform distribution, by the statistical properties means.
>
I see, thanks for the explanation, it really helps me learn alot!
> > Thanks for your detailed review and explanation, also thanks to Mark and
> > Kuan-Wei !
> >
> > I now realize "random" here is more of a convention for the caller to
> > states that it doesn't matter which cpu it gets, maybe we should
> > rephrase the comment to make it less confusing? Because I think "random"
> > itself does stands for a particular meaning.
>
> Feel free to send a patch.
No problem, I'll send it ASAP.
Regards,
I Hsin Cheng.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-01-15 7:24 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-13 6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
2025-01-13 9:45 ` kernel test robot
2025-01-13 9:45 ` kernel test robot
2025-01-13 10:13 ` Kuan-Wei Chiu
2025-01-13 10:27 ` I Hsin Cheng
2025-01-13 11:09 ` Kuan-Wei Chiu
2025-01-13 11:05 ` Mark Rutland
2025-01-13 18:00 ` Yury Norov
2025-01-14 7:15 ` I Hsin Cheng
2025-01-14 15:02 ` Mark Rutland
2025-01-14 15:43 ` Yury Norov
2025-01-15 7:24 ` I Hsin Cheng
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.