All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
@ 2025-01-13  6:18 I Hsin Cheng
  2025-01-13  9:45 ` kernel test robot
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: I Hsin Cheng @ 2025-01-13  6:18 UTC (permalink / raw)
  To: yury.norov; +Cc: linux, jserv, linux-kernel, I Hsin Cheng

Original implementation of "cpumask_any_but()" isn't actually random as
the comment claims itself to be. It's behavior is in fact to select the
first cpu in "mask" which isn't equal to "cpu".

Re-implement the function so we can choose a random cpu by randomly
select the value of "n" and choose the nth cpu in "mask"

Experiments[1] are done below to verify it generate more random result than
orginal implementation which tends to select the same cpu over and over
again.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
---
The test is done on x86_64 architecture with 6.8.0-48-generic kernel
version on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz

[1]:
Test script:

int init_module(void)
{
    const struct cpumask *cur_mask = cpu_online_mask;
    unsigned int cpu = 5, result;
    int times = 50;

    pr_info("Old cpumask_any_but(): ");
    for (int i = 0; i < times; i++) {
        result = cpumask_any_but(cur_mask, cpu);
        pr_cont("%u ", result);
    }
    pr_info("\n");

    pr_info("New cpumask_any_but(): ");
    for (int i = 0; i < times; i++) {
        result = cpumask_any_but_v2(cur_mask, cpu);
        pr_cont("%u ", result);
    }
    pr_info("\n");

    return 0;
}

Experiment result showned as below display in dmesg:
[ 8036.558152] Old cpumask_any_but(): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[ 8036.558193] New cpumask_any_but(): 7 1 1 2 2 2 3 4 0 2 7 4 6 3 3 2 2 4 2 7 6 6 6 4 6 6 6 4 4 7 6 2 2 6 7 6 6 3 0 6 2 1 0 4 4 6 4 6 6 3
---
 include/linux/cpumask.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 9278a50d5..336297960 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
 static __always_inline
 unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
 {
-	unsigned int i;
+	unsigned int i, n, weight;
 
 	cpumask_check(cpu);
-	for_each_cpu(i, mask)
-		if (i != cpu)
-			break;
+	weight = cpumask_weight(mask);
+	n = get_random_u32() % weight;
+
+	/* If we accidentally pick "n" equal to "cpu",
+	 * then simply choose "n + 1"th cpu.
+	 */
+	if (n == cpu)
+		n = (n + 1) % weight;
+	i = cpumask_nth(n, mask);
 	return i;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-13  6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
@ 2025-01-13  9:45 ` kernel test robot
  2025-01-13  9:45 ` kernel test robot
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2025-01-13  9:45 UTC (permalink / raw)
  To: I Hsin Cheng; +Cc: oe-kbuild-all

Hi Hsin,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.13-rc7 next-20250113]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/I-Hsin-Cheng/cpumask-Implement-random-version-of-cpumask_any_but/20250113-142111
base:   linus/master
patch link:    https://lore.kernel.org/r/20250113061839.22131-1-richard120310%40gmail.com
patch subject: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
config: i386-buildonly-randconfig-002-20250113 (https://download.01.org/0day-ci/archive/20250113/202501131732.I9VU5v0r-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250113/202501131732.I9VU5v0r-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501131732.I9VU5v0r-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from include/linux/smp.h:13,
                    from include/linux/lockdep.h:14,
                    from include/linux/spinlock.h:63,
                    from include/linux/swait.h:7,
                    from include/linux/completion.h:12,
                    from include/linux/crypto.h:15,
                    from arch/x86/kernel/asm-offsets.c:9:
   include/linux/cpumask.h: In function 'cpumask_any_but':
>> include/linux/cpumask.h:407:18: error: implicit declaration of function 'cpumask_weight'; did you mean 'cpumask_next'? [-Werror=implicit-function-declaration]
     407 |         weight = cpumask_weight(mask);
         |                  ^~~~~~~~~~~~~~
         |                  cpumask_next
>> include/linux/cpumask.h:408:13: error: implicit declaration of function 'get_random_u32' [-Werror=implicit-function-declaration]
     408 |         n = get_random_u32() % weight;
         |             ^~~~~~~~~~~~~~
>> include/linux/cpumask.h:415:13: error: implicit declaration of function 'cpumask_nth'; did you mean 'cpumask_next'? [-Werror=implicit-function-declaration]
     415 |         i = cpumask_nth(n, mask);
         |             ^~~~~~~~~~~
         |             cpumask_next
   include/linux/cpumask.h: At top level:
>> include/linux/cpumask.h:450:14: error: conflicting types for 'cpumask_nth'; have 'unsigned int(unsigned int,  const struct cpumask *)'
     450 | unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
         |              ^~~~~~~~~~~
   include/linux/cpumask.h:415:13: note: previous implicit declaration of 'cpumask_nth' with type 'int()'
     415 |         i = cpumask_nth(n, mask);
         |             ^~~~~~~~~~~
>> include/linux/cpumask.h:779:37: error: conflicting types for 'cpumask_weight'; have 'unsigned int(const struct cpumask *)'
     779 | static __always_inline unsigned int cpumask_weight(const struct cpumask *srcp)
         |                                     ^~~~~~~~~~~~~~
   include/linux/cpumask.h:407:18: note: previous implicit declaration of 'cpumask_weight' with type 'int()'
     407 |         weight = cpumask_weight(mask);
         |                  ^~~~~~~~~~~~~~
   In file included from include/linux/nodemask.h:98,
                    from include/linux/mmzone.h:18,
                    from include/linux/gfp.h:7,
                    from include/linux/slab.h:16,
                    from include/linux/crypto.h:17:
>> include/linux/random.h:42:5: error: conflicting types for 'get_random_u32'; have 'u32(void)' {aka 'unsigned int(void)'}
      42 | u32 get_random_u32(void);
         |     ^~~~~~~~~~~~~~
   include/linux/cpumask.h:408:13: note: previous implicit declaration of 'get_random_u32' with type 'int()'
     408 |         n = get_random_u32() % weight;
         |             ^~~~~~~~~~~~~~
   cc1: some warnings being treated as errors
   make[3]: *** [scripts/Makefile.build:102: arch/x86/kernel/asm-offsets.s] Error 1 shuffle=3780167214
   make[3]: Target 'prepare' not remade because of errors.
   make[2]: *** [Makefile:1263: prepare0] Error 2 shuffle=3780167214
   make[2]: Target 'prepare' not remade because of errors.
   make[1]: *** [Makefile:251: __sub-make] Error 2 shuffle=3780167214
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:251: __sub-make] Error 2 shuffle=3780167214
   make: Target 'prepare' not remade because of errors.


vim +407 include/linux/cpumask.h

   317	
   318	/**
   319	 * for_each_cpu_wrap - iterate over every cpu in a mask, starting at a specified location
   320	 * @cpu: the (optionally unsigned) integer iterator
   321	 * @mask: the cpumask pointer
   322	 * @start: the start location
   323	 *
   324	 * The implementation does not assume any bit in @mask is set (including @start).
   325	 *
   326	 * After the loop, cpu is >= nr_cpu_ids.
   327	 */
   328	#define for_each_cpu_wrap(cpu, mask, start)				\
   329		for_each_set_bit_wrap(cpu, cpumask_bits(mask), small_cpumask_bits, start)
   330	
   331	/**
   332	 * for_each_cpu_and - iterate over every cpu in both masks
   333	 * @cpu: the (optionally unsigned) integer iterator
   334	 * @mask1: the first cpumask pointer
   335	 * @mask2: the second cpumask pointer
   336	 *
   337	 * This saves a temporary CPU mask in many places.  It is equivalent to:
   338	 *	struct cpumask tmp;
   339	 *	cpumask_and(&tmp, &mask1, &mask2);
   340	 *	for_each_cpu(cpu, &tmp)
   341	 *		...
   342	 *
   343	 * After the loop, cpu is >= nr_cpu_ids.
   344	 */
   345	#define for_each_cpu_and(cpu, mask1, mask2)				\
   346		for_each_and_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
   347	
   348	/**
   349	 * for_each_cpu_andnot - iterate over every cpu present in one mask, excluding
   350	 *			 those present in another.
   351	 * @cpu: the (optionally unsigned) integer iterator
   352	 * @mask1: the first cpumask pointer
   353	 * @mask2: the second cpumask pointer
   354	 *
   355	 * This saves a temporary CPU mask in many places.  It is equivalent to:
   356	 *	struct cpumask tmp;
   357	 *	cpumask_andnot(&tmp, &mask1, &mask2);
   358	 *	for_each_cpu(cpu, &tmp)
   359	 *		...
   360	 *
   361	 * After the loop, cpu is >= nr_cpu_ids.
   362	 */
   363	#define for_each_cpu_andnot(cpu, mask1, mask2)				\
   364		for_each_andnot_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
   365	
   366	/**
   367	 * for_each_cpu_or - iterate over every cpu present in either mask
   368	 * @cpu: the (optionally unsigned) integer iterator
   369	 * @mask1: the first cpumask pointer
   370	 * @mask2: the second cpumask pointer
   371	 *
   372	 * This saves a temporary CPU mask in many places.  It is equivalent to:
   373	 *	struct cpumask tmp;
   374	 *	cpumask_or(&tmp, &mask1, &mask2);
   375	 *	for_each_cpu(cpu, &tmp)
   376	 *		...
   377	 *
   378	 * After the loop, cpu is >= nr_cpu_ids.
   379	 */
   380	#define for_each_cpu_or(cpu, mask1, mask2)				\
   381		for_each_or_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
   382	
   383	/**
   384	 * for_each_cpu_from - iterate over CPUs present in @mask, from @cpu to the end of @mask.
   385	 * @cpu: the (optionally unsigned) integer iterator
   386	 * @mask: the cpumask pointer
   387	 *
   388	 * After the loop, cpu is >= nr_cpu_ids.
   389	 */
   390	#define for_each_cpu_from(cpu, mask)				\
   391		for_each_set_bit_from(cpu, cpumask_bits(mask), small_cpumask_bits)
   392	
   393	/**
   394	 * cpumask_any_but - return a "random" in a cpumask, but not this one.
   395	 * @mask: the cpumask to search
   396	 * @cpu: the cpu to ignore.
   397	 *
   398	 * Often used to find any cpu but smp_processor_id() in a mask.
   399	 * Return: >= nr_cpu_ids if no cpus set.
   400	 */
   401	static __always_inline
   402	unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
   403	{
   404		unsigned int i, n, weight;
   405	
   406		cpumask_check(cpu);
 > 407		weight = cpumask_weight(mask);
 > 408		n = get_random_u32() % weight;
   409	
   410		/* If we accidentally pick "n" equal to "cpu",
   411		 * then simply choose "n + 1"th cpu.
   412		 */
   413		if (n == cpu)
   414			n = (n + 1) % weight;
 > 415		i = cpumask_nth(n, mask);
   416		return i;
   417	}
   418	
   419	/**
   420	 * cpumask_any_and_but - pick a "random" cpu from *mask1 & *mask2, but not this one.
   421	 * @mask1: the first input cpumask
   422	 * @mask2: the second input cpumask
   423	 * @cpu: the cpu to ignore
   424	 *
   425	 * Returns >= nr_cpu_ids if no cpus set.
   426	 */
   427	static __always_inline
   428	unsigned int cpumask_any_and_but(const struct cpumask *mask1,
   429					 const struct cpumask *mask2,
   430					 unsigned int cpu)
   431	{
   432		unsigned int i;
   433	
   434		cpumask_check(cpu);
   435		i = cpumask_first_and(mask1, mask2);
   436		if (i != cpu)
   437			return i;
   438	
   439		return cpumask_next_and(cpu, mask1, mask2);
   440	}
   441	
   442	/**
   443	 * cpumask_nth - get the Nth cpu in a cpumask
   444	 * @srcp: the cpumask pointer
   445	 * @cpu: the Nth cpu to find, starting from 0
   446	 *
   447	 * Return: >= nr_cpu_ids if such cpu doesn't exist.
   448	 */
   449	static __always_inline
 > 450	unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
   451	{
   452		return find_nth_bit(cpumask_bits(srcp), small_cpumask_bits, cpumask_check(cpu));
   453	}
   454	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-13  6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
  2025-01-13  9:45 ` kernel test robot
@ 2025-01-13  9:45 ` kernel test robot
  2025-01-13 10:13 ` Kuan-Wei Chiu
  2025-01-13 11:05 ` Mark Rutland
  3 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2025-01-13  9:45 UTC (permalink / raw)
  To: I Hsin Cheng; +Cc: llvm, oe-kbuild-all

Hi Hsin,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.13-rc7 next-20250113]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/I-Hsin-Cheng/cpumask-Implement-random-version-of-cpumask_any_but/20250113-142111
base:   linus/master
patch link:    https://lore.kernel.org/r/20250113061839.22131-1-richard120310%40gmail.com
patch subject: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
config: i386-buildonly-randconfig-001-20250113 (https://download.01.org/0day-ci/archive/20250113/202501131709.T02YI6Rr-lkp@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250113/202501131709.T02YI6Rr-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501131709.T02YI6Rr-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from arch/x86/kernel/asm-offsets.c:9:
   In file included from include/linux/crypto.h:15:
   In file included from include/linux/completion.h:12:
   In file included from include/linux/swait.h:7:
   In file included from include/linux/spinlock.h:63:
   In file included from include/linux/lockdep.h:14:
   In file included from include/linux/smp.h:13:
>> include/linux/cpumask.h:407:11: error: call to undeclared function 'cpumask_weight'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     407 |         weight = cpumask_weight(mask);
         |                  ^
>> include/linux/cpumask.h:408:6: error: call to undeclared function 'get_random_u32'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     408 |         n = get_random_u32() % weight;
         |             ^
>> include/linux/cpumask.h:415:6: error: call to undeclared function 'cpumask_nth'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     415 |         i = cpumask_nth(n, mask);
         |             ^
   include/linux/cpumask.h:415:6: note: did you mean 'cpumask_next'?
   include/linux/cpumask.h:217:14: note: 'cpumask_next' declared here
     217 | unsigned int cpumask_next(int n, const struct cpumask *srcp)
         |              ^
>> include/linux/cpumask.h:450:14: error: static declaration of 'cpumask_nth' follows non-static declaration
     450 | unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
         |              ^
   include/linux/cpumask.h:415:6: note: previous implicit declaration is here
     415 |         i = cpumask_nth(n, mask);
         |             ^
>> include/linux/cpumask.h:779:37: error: static declaration of 'cpumask_weight' follows non-static declaration
     779 | static __always_inline unsigned int cpumask_weight(const struct cpumask *srcp)
         |                                     ^
   include/linux/cpumask.h:407:11: note: previous implicit declaration is here
     407 |         weight = cpumask_weight(mask);
         |                  ^
   In file included from arch/x86/kernel/asm-offsets.c:9:
   In file included from include/linux/crypto.h:17:
   In file included from include/linux/slab.h:16:
   In file included from include/linux/gfp.h:7:
   In file included from include/linux/mmzone.h:18:
   In file included from include/linux/nodemask.h:98:
>> include/linux/random.h:42:5: error: conflicting types for 'get_random_u32'
      42 | u32 get_random_u32(void);
         |     ^
   include/linux/cpumask.h:408:6: note: previous implicit declaration is here
     408 |         n = get_random_u32() % weight;
         |             ^
   In file included from arch/x86/kernel/asm-offsets.c:14:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:17:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:98:11: warning: array index 3 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
      98 |                 return (set->sig[3] | set->sig[2] |
         |                         ^        ~
   arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
      24 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/x86/kernel/asm-offsets.c:14:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:17:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:98:25: warning: array index 2 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
      98 |                 return (set->sig[3] | set->sig[2] |
         |                                       ^        ~
   arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
      24 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/x86/kernel/asm-offsets.c:14:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:17:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:114:11: warning: array index 3 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
     114 |                 return  (set1->sig[3] == set2->sig[3]) &&
         |                          ^         ~
   arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
      24 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/x86/kernel/asm-offsets.c:14:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:17:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:114:27: warning: array index 3 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
     114 |                 return  (set1->sig[3] == set2->sig[3]) &&
         |                                          ^         ~
   arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
      24 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/x86/kernel/asm-offsets.c:14:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:17:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:115:5: warning: array index 2 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
     115 |                         (set1->sig[2] == set2->sig[2]) &&
         |                          ^         ~
   arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
      24 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/x86/kernel/asm-offsets.c:14:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:17:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:115:21: warning: array index 2 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
     115 |                         (set1->sig[2] == set2->sig[2]) &&
         |                                          ^         ~
   arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
      24 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/x86/kernel/asm-offsets.c:14:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:17:


vim +/cpumask_weight +407 include/linux/cpumask.h

   317	
   318	/**
   319	 * for_each_cpu_wrap - iterate over every cpu in a mask, starting at a specified location
   320	 * @cpu: the (optionally unsigned) integer iterator
   321	 * @mask: the cpumask pointer
   322	 * @start: the start location
   323	 *
   324	 * The implementation does not assume any bit in @mask is set (including @start).
   325	 *
   326	 * After the loop, cpu is >= nr_cpu_ids.
   327	 */
   328	#define for_each_cpu_wrap(cpu, mask, start)				\
   329		for_each_set_bit_wrap(cpu, cpumask_bits(mask), small_cpumask_bits, start)
   330	
   331	/**
   332	 * for_each_cpu_and - iterate over every cpu in both masks
   333	 * @cpu: the (optionally unsigned) integer iterator
   334	 * @mask1: the first cpumask pointer
   335	 * @mask2: the second cpumask pointer
   336	 *
   337	 * This saves a temporary CPU mask in many places.  It is equivalent to:
   338	 *	struct cpumask tmp;
   339	 *	cpumask_and(&tmp, &mask1, &mask2);
   340	 *	for_each_cpu(cpu, &tmp)
   341	 *		...
   342	 *
   343	 * After the loop, cpu is >= nr_cpu_ids.
   344	 */
   345	#define for_each_cpu_and(cpu, mask1, mask2)				\
   346		for_each_and_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
   347	
   348	/**
   349	 * for_each_cpu_andnot - iterate over every cpu present in one mask, excluding
   350	 *			 those present in another.
   351	 * @cpu: the (optionally unsigned) integer iterator
   352	 * @mask1: the first cpumask pointer
   353	 * @mask2: the second cpumask pointer
   354	 *
   355	 * This saves a temporary CPU mask in many places.  It is equivalent to:
   356	 *	struct cpumask tmp;
   357	 *	cpumask_andnot(&tmp, &mask1, &mask2);
   358	 *	for_each_cpu(cpu, &tmp)
   359	 *		...
   360	 *
   361	 * After the loop, cpu is >= nr_cpu_ids.
   362	 */
   363	#define for_each_cpu_andnot(cpu, mask1, mask2)				\
   364		for_each_andnot_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
   365	
   366	/**
   367	 * for_each_cpu_or - iterate over every cpu present in either mask
   368	 * @cpu: the (optionally unsigned) integer iterator
   369	 * @mask1: the first cpumask pointer
   370	 * @mask2: the second cpumask pointer
   371	 *
   372	 * This saves a temporary CPU mask in many places.  It is equivalent to:
   373	 *	struct cpumask tmp;
   374	 *	cpumask_or(&tmp, &mask1, &mask2);
   375	 *	for_each_cpu(cpu, &tmp)
   376	 *		...
   377	 *
   378	 * After the loop, cpu is >= nr_cpu_ids.
   379	 */
   380	#define for_each_cpu_or(cpu, mask1, mask2)				\
   381		for_each_or_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
   382	
   383	/**
   384	 * for_each_cpu_from - iterate over CPUs present in @mask, from @cpu to the end of @mask.
   385	 * @cpu: the (optionally unsigned) integer iterator
   386	 * @mask: the cpumask pointer
   387	 *
   388	 * After the loop, cpu is >= nr_cpu_ids.
   389	 */
   390	#define for_each_cpu_from(cpu, mask)				\
   391		for_each_set_bit_from(cpu, cpumask_bits(mask), small_cpumask_bits)
   392	
   393	/**
   394	 * cpumask_any_but - return a "random" in a cpumask, but not this one.
   395	 * @mask: the cpumask to search
   396	 * @cpu: the cpu to ignore.
   397	 *
   398	 * Often used to find any cpu but smp_processor_id() in a mask.
   399	 * Return: >= nr_cpu_ids if no cpus set.
   400	 */
   401	static __always_inline
   402	unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
   403	{
   404		unsigned int i, n, weight;
   405	
   406		cpumask_check(cpu);
 > 407		weight = cpumask_weight(mask);
 > 408		n = get_random_u32() % weight;
   409	
   410		/* If we accidentally pick "n" equal to "cpu",
   411		 * then simply choose "n + 1"th cpu.
   412		 */
   413		if (n == cpu)
   414			n = (n + 1) % weight;
 > 415		i = cpumask_nth(n, mask);
   416		return i;
   417	}
   418	
   419	/**
   420	 * cpumask_any_and_but - pick a "random" cpu from *mask1 & *mask2, but not this one.
   421	 * @mask1: the first input cpumask
   422	 * @mask2: the second input cpumask
   423	 * @cpu: the cpu to ignore
   424	 *
   425	 * Returns >= nr_cpu_ids if no cpus set.
   426	 */
   427	static __always_inline
   428	unsigned int cpumask_any_and_but(const struct cpumask *mask1,
   429					 const struct cpumask *mask2,
   430					 unsigned int cpu)
   431	{
   432		unsigned int i;
   433	
   434		cpumask_check(cpu);
   435		i = cpumask_first_and(mask1, mask2);
   436		if (i != cpu)
   437			return i;
   438	
   439		return cpumask_next_and(cpu, mask1, mask2);
   440	}
   441	
   442	/**
   443	 * cpumask_nth - get the Nth cpu in a cpumask
   444	 * @srcp: the cpumask pointer
   445	 * @cpu: the Nth cpu to find, starting from 0
   446	 *
   447	 * Return: >= nr_cpu_ids if such cpu doesn't exist.
   448	 */
   449	static __always_inline
 > 450	unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
   451	{
   452		return find_nth_bit(cpumask_bits(srcp), small_cpumask_bits, cpumask_check(cpu));
   453	}
   454	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-13  6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
  2025-01-13  9:45 ` kernel test robot
  2025-01-13  9:45 ` kernel test robot
@ 2025-01-13 10:13 ` Kuan-Wei Chiu
  2025-01-13 10:27   ` I Hsin Cheng
  2025-01-13 11:05 ` Mark Rutland
  3 siblings, 1 reply; 12+ messages in thread
From: Kuan-Wei Chiu @ 2025-01-13 10:13 UTC (permalink / raw)
  To: I Hsin Cheng; +Cc: yury.norov, linux, jserv, linux-kernel

Hi I Hsin,

On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> Original implementation of "cpumask_any_but()" isn't actually random as
> the comment claims itself to be. It's behavior is in fact to select the
> first cpu in "mask" which isn't equal to "cpu".
> 
> Re-implement the function so we can choose a random cpu by randomly
> select the value of "n" and choose the nth cpu in "mask"
> 
This patch may slow down the efficiency of cpumask_any_but(). Are there
any in-tree users of cpumask_any_but() that require it to return a
truly random id, or benefit from such behavior?

Regards,
Kuan-Wei

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-13 10:13 ` Kuan-Wei Chiu
@ 2025-01-13 10:27   ` I Hsin Cheng
  2025-01-13 11:09     ` Kuan-Wei Chiu
  0 siblings, 1 reply; 12+ messages in thread
From: I Hsin Cheng @ 2025-01-13 10:27 UTC (permalink / raw)
  To: Kuan-Wei Chiu; +Cc: yury.norov, linux, jserv, linux-kernel

On Mon, Jan 13, 2025 at 06:13:07PM +0800, Kuan-Wei Chiu wrote:
> Hi I Hsin,
> 
> On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > Original implementation of "cpumask_any_but()" isn't actually random as
> > the comment claims itself to be. It's behavior is in fact to select the
> > first cpu in "mask" which isn't equal to "cpu".
> > 
> > Re-implement the function so we can choose a random cpu by randomly
> > select the value of "n" and choose the nth cpu in "mask"
> > 
> This patch may slow down the efficiency of cpumask_any_but(). Are there
> any in-tree users of cpumask_any_but() that require it to return a
> truly random id, or benefit from such behavior?

Hi Kuan-Wei,

Thanks for your review ! Yes indeed it may slow down the efficiency
abit.

However there are some use cases I think randomness is important
such as "dl_task_offline_migration()" under kernel/sched/deadline.c ,
where the operation of cpu picking shouldn't favor certain cpu too much.

Also "select_task_rq()" utitlize "cpumask_any()" to pick cpu, it doesn't
need to be perfectly random, but neither should it only favor certain
cpu.

What do you think?

Best regards,
I Hsin


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-13  6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
                   ` (2 preceding siblings ...)
  2025-01-13 10:13 ` Kuan-Wei Chiu
@ 2025-01-13 11:05 ` Mark Rutland
  2025-01-13 18:00   ` Yury Norov
  3 siblings, 1 reply; 12+ messages in thread
From: Mark Rutland @ 2025-01-13 11:05 UTC (permalink / raw)
  To: I Hsin Cheng; +Cc: yury.norov, linux, jserv, linux-kernel

On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> Original implementation of "cpumask_any_but()" isn't actually random as
> the comment claims itself to be. It's behavior is in fact to select the
> first cpu in "mask" which isn't equal to "cpu".

What it says specifically is:

  cpumask_any_but - return a "random" in a cpumask, but not this one.

... and by "random", it really means "arbitrary".

The idea here is that the caller is specifying that it doesn't care
which specific CPU is chosen, but this is not required to be a random
selection.

> Re-implement the function so we can choose a random cpu by randomly
> select the value of "n" and choose the nth cpu in "mask"
> 
> Experiments[1] are done below to verify it generate more random result than
> orginal implementation which tends to select the same cpu over and over
> again.

I think what you're after here is similar to
cpumask_any_and_distribute(), and you should look at building
cpumask_any_but_distribute() in the same way, rather than changing
cpumask_any_but().

Mark.

> Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
> ---
> The test is done on x86_64 architecture with 6.8.0-48-generic kernel
> version on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> 
> [1]:
> Test script:
> 
> int init_module(void)
> {
>     const struct cpumask *cur_mask = cpu_online_mask;
>     unsigned int cpu = 5, result;
>     int times = 50;
> 
>     pr_info("Old cpumask_any_but(): ");
>     for (int i = 0; i < times; i++) {
>         result = cpumask_any_but(cur_mask, cpu);
>         pr_cont("%u ", result);
>     }
>     pr_info("\n");
> 
>     pr_info("New cpumask_any_but(): ");
>     for (int i = 0; i < times; i++) {
>         result = cpumask_any_but_v2(cur_mask, cpu);
>         pr_cont("%u ", result);
>     }
>     pr_info("\n");
> 
>     return 0;
> }
> 
> Experiment result showned as below display in dmesg:
> [ 8036.558152] Old cpumask_any_but(): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 
> [ 8036.558193] New cpumask_any_but(): 7 1 1 2 2 2 3 4 0 2 7 4 6 3 3 2 2 4 2 7 6 6 6 4 6 6 6 4 4 7 6 2 2 6 7 6 6 3 0 6 2 1 0 4 4 6 4 6 6 3
> ---
>  include/linux/cpumask.h | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 9278a50d5..336297960 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
>  static __always_inline
>  unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
>  {
> -	unsigned int i;
> +	unsigned int i, n, weight;
>  
>  	cpumask_check(cpu);
> -	for_each_cpu(i, mask)
> -		if (i != cpu)
> -			break;
> +	weight = cpumask_weight(mask);
> +	n = get_random_u32() % weight;
> +
> +	/* If we accidentally pick "n" equal to "cpu",
> +	 * then simply choose "n + 1"th cpu.
> +	 */
> +	if (n == cpu)
> +		n = (n + 1) % weight;
> +	i = cpumask_nth(n, mask);
>  	return i;
>  }
>  
> -- 
> 2.43.0
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-13 10:27   ` I Hsin Cheng
@ 2025-01-13 11:09     ` Kuan-Wei Chiu
  0 siblings, 0 replies; 12+ messages in thread
From: Kuan-Wei Chiu @ 2025-01-13 11:09 UTC (permalink / raw)
  To: I Hsin Cheng; +Cc: yury.norov, linux, jserv, linux-kernel

On Mon, Jan 13, 2025 at 06:27:26PM +0800, I Hsin Cheng wrote:
> On Mon, Jan 13, 2025 at 06:13:07PM +0800, Kuan-Wei Chiu wrote:
> > Hi I Hsin,
> > 
> > On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > > Original implementation of "cpumask_any_but()" isn't actually random as
> > > the comment claims itself to be. It's behavior is in fact to select the
> > > first cpu in "mask" which isn't equal to "cpu".
> > > 
> > > Re-implement the function so we can choose a random cpu by randomly
> > > select the value of "n" and choose the nth cpu in "mask"
> > > 
> > This patch may slow down the efficiency of cpumask_any_but(). Are there
> > any in-tree users of cpumask_any_but() that require it to return a
> > truly random id, or benefit from such behavior?
> 
> Hi Kuan-Wei,
> 
> Thanks for your review ! Yes indeed it may slow down the efficiency
> abit.
> 
> However there are some use cases I think randomness is important
> such as "dl_task_offline_migration()" under kernel/sched/deadline.c ,
> where the operation of cpu picking shouldn't favor certain cpu too much.
> 
> Also "select_task_rq()" utitlize "cpumask_any()" to pick cpu, it doesn't
> need to be perfectly random, but neither should it only favor certain
> cpu.
> 
> What do you think?
>

If a true random number isn't needed, could next_pseudo_random32() be
used instead for better efficiency?

I'm not familiar with the scheduler, but if there are only one or two
scheduler use cases, would you consider creating a new
cpumask_random_but() API and converting those specific cases to use it?
In this case, the patch should also be CC'd to scheduler developers.
Additionally, you should explain the benefits of this approach in the
patch description.

Regards,
Kuan-Wei

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-13 11:05 ` Mark Rutland
@ 2025-01-13 18:00   ` Yury Norov
  2025-01-14  7:15     ` I Hsin Cheng
  0 siblings, 1 reply; 12+ messages in thread
From: Yury Norov @ 2025-01-13 18:00 UTC (permalink / raw)
  To: Mark Rutland; +Cc: I Hsin Cheng, linux, jserv, linux-kernel

On Mon, Jan 13, 2025 at 11:05:19AM +0000, Mark Rutland wrote:
> On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > Original implementation of "cpumask_any_but()" isn't actually random as
> > the comment claims itself to be. It's behavior is in fact to select the
> > first cpu in "mask" which isn't equal to "cpu".
> 
> What it says specifically is:
> 
>   cpumask_any_but - return a "random" in a cpumask, but not this one.
> 
> ... and by "random", it really means "arbitrary".
> 
> The idea here is that the caller is specifying that it doesn't care
> which specific CPU is chosen, but this is not required to be a random
> selection.
> 
> > Re-implement the function so we can choose a random cpu by randomly
> > select the value of "n" and choose the nth cpu in "mask"
> > 
> > Experiments[1] are done below to verify it generate more random result than
> > orginal implementation which tends to select the same cpu over and over
> > again.
> 
> I think what you're after here is similar to
> cpumask_any_and_distribute(), and you should look at building
> cpumask_any_but_distribute() in the same way, rather than changing
> cpumask_any_but().
> 
> Mark.

I agree with Mark. cpumask_any_but_distribute() is what you most
likely need. Anyways, whatever you end up please don't change existing
API, especially in a way that hurts performance so badly.

> 
> > Signed-off-by: I Hsin Cheng <richard120310@gmail.com>

This patch should go with a demonstration that some particular
system(s) benefits from it, and the others don't suffer.

> > ---
> > The test is done on x86_64 architecture with 6.8.0-48-generic kernel
> > version on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> > 
> > [1]:
> > Test script:
> > 
> > int init_module(void)
> > {
> >     const struct cpumask *cur_mask = cpu_online_mask;
> >     unsigned int cpu = 5, result;
> >     int times = 50;
> > 
> >     pr_info("Old cpumask_any_but(): ");
> >     for (int i = 0; i < times; i++) {
> >         result = cpumask_any_but(cur_mask, cpu);
> >         pr_cont("%u ", result);
> >     }
> >     pr_info("\n");
> > 
> >     pr_info("New cpumask_any_but(): ");
> >     for (int i = 0; i < times; i++) {
> >         result = cpumask_any_but_v2(cur_mask, cpu);
> >         pr_cont("%u ", result);
> >     }
> >     pr_info("\n");
> > 
> >     return 0;
> > }
> > 
> > Experiment result showned as below display in dmesg:
> > [ 8036.558152] Old cpumask_any_but(): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > 
> > [ 8036.558193] New cpumask_any_but(): 7 1 1 2 2 2 3 4 0 2 7 4 6 3 3 2 2 4 2 7 6 6 6 4 6 6 6 4 4 7 6 2 2 6 7 6 6 3 0 6 2 1 0 4 4 6 4 6 6 3
>
> > ---
> >  include/linux/cpumask.h | 14 ++++++++++----
> >  1 file changed, 10 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index 9278a50d5..336297960 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h

 #include <linux/random.h>

Which would be really good to avoid.

> > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> >  static __always_inline
> >  unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> >  {
> > -	unsigned int i;
> > +	unsigned int i, n, weight;
> >  
> >  	cpumask_check(cpu);
> > -	for_each_cpu(i, mask)
> > -		if (i != cpu)
> > -			break;
> > +	weight = cpumask_weight(mask);
> > +	n = get_random_u32() % weight;
> > +
> > +	/* If we accidentally pick "n" equal to "cpu",
> > +	 * then simply choose "n + 1"th cpu.
> > +	 */
> > +	if (n == cpu)
> > +		n = (n + 1) % weight;
> > +	i = cpumask_nth(n, mask);

This is an entirely broken thing, and it works only because your CPU mask
is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5)
will return 5 exactly, if the get_random_u32() draws 4.

It looks broken even for a dense mask. By probability, your code returns:

P(0-4,7) == 1/8,
P(5) == 0,
P(6) == 1/4.

Assuming you are trying to implement a random uniform distribution drawing,
the correct probabilities should look like:

P(0-4,6-7) == 1/7,
P(5) == 0,

Thanks,
Yury

> >  	return i;
> >  }
> >  
> > -- 
> > 2.43.0
> > 
> > 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-13 18:00   ` Yury Norov
@ 2025-01-14  7:15     ` I Hsin Cheng
  2025-01-14 15:02       ` Mark Rutland
  2025-01-14 15:43       ` Yury Norov
  0 siblings, 2 replies; 12+ messages in thread
From: I Hsin Cheng @ 2025-01-14  7:15 UTC (permalink / raw)
  To: Yury Norov; +Cc: mark.rutland, linux, jserv, linux-kernel, visitorckw

On Mon, Jan 13, 2025 at 01:00:56PM -0500, Yury Norov wrote:
> On Mon, Jan 13, 2025 at 11:05:19AM +0000, Mark Rutland wrote:
> > On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > > Original implementation of "cpumask_any_but()" isn't actually random as
> > > the comment claims itself to be. It's behavior is in fact to select the
> > > first cpu in "mask" which isn't equal to "cpu".
> > 
> > What it says specifically is:
> > 
> >   cpumask_any_but - return a "random" in a cpumask, but not this one.
> > 
> > ... and by "random", it really means "arbitrary".
> > 
> > The idea here is that the caller is specifying that it doesn't care
> > which specific CPU is chosen, but this is not required to be a random
> > selection.
> > 
> > > Re-implement the function so we can choose a random cpu by randomly
> > > select the value of "n" and choose the nth cpu in "mask"
> > > 
> > > Experiments[1] are done below to verify it generate more random result than
> > > orginal implementation which tends to select the same cpu over and over
> > > again.
> > 
> > I think what you're after here is similar to
> > cpumask_any_and_distribute(), and you should look at building
> > cpumask_any_but_distribute() in the same way, rather than changing
> > cpumask_any_but().
> > 
> > Mark.
> 
> I agree with Mark. cpumask_any_but_distribute() is what you most
> likely need. Anyways, whatever you end up please don't change existing
> API, especially in a way that hurts performance so badly.

I see, I didn't noticed cpumask_any*_distribute() APIs, they're the
thing I was looking for, thanks!
> 
> > 
> > > Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
> 
> This patch should go with a demonstration that some particular
> system(s) benefits from it, and the others don't suffer.

Thanks for the head up, I'll be aware of this in the future.

> 
> > > ---
> > > The test is done on x86_64 architecture with 6.8.0-48-generic kernel
> > > version on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> > > 
> > > [1]:
> > > Test script:
> > > 
> > > int init_module(void)
> > > {
> > >     const struct cpumask *cur_mask = cpu_online_mask;
> > >     unsigned int cpu = 5, result;
> > >     int times = 50;
> > > 
> > >     pr_info("Old cpumask_any_but(): ");
> > >     for (int i = 0; i < times; i++) {
> > >         result = cpumask_any_but(cur_mask, cpu);
> > >         pr_cont("%u ", result);
> > >     }
> > >     pr_info("\n");
> > > 
> > >     pr_info("New cpumask_any_but(): ");
> > >     for (int i = 0; i < times; i++) {
> > >         result = cpumask_any_but_v2(cur_mask, cpu);
> > >         pr_cont("%u ", result);
> > >     }
> > >     pr_info("\n");
> > > 
> > >     return 0;
> > > }
> > > 
> > > Experiment result showned as below display in dmesg:
> > > [ 8036.558152] Old cpumask_any_but(): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > > 
> > > [ 8036.558193] New cpumask_any_but(): 7 1 1 2 2 2 3 4 0 2 7 4 6 3 3 2 2 4 2 7 6 6 6 4 6 6 6 4 4 7 6 2 2 6 7 6 6 3 0 6 2 1 0 4 4 6 4 6 6 3
> >
> > > ---
> > >  include/linux/cpumask.h | 14 ++++++++++----
> > >  1 file changed, 10 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > > index 9278a50d5..336297960 100644
> > > --- a/include/linux/cpumask.h
> > > +++ b/include/linux/cpumask.h
> 
>  #include <linux/random.h>
> 
> Which would be really good to avoid.
> 
> > > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> > >  static __always_inline
> > >  unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> > >  {
> > > -	unsigned int i;
> > > +	unsigned int i, n, weight;
> > >  
> > >  	cpumask_check(cpu);
> > > -	for_each_cpu(i, mask)
> > > -		if (i != cpu)
> > > -			break;
> > > +	weight = cpumask_weight(mask);
> > > +	n = get_random_u32() % weight;
> > > +
> > > +	/* If we accidentally pick "n" equal to "cpu",
> > > +	 * then simply choose "n + 1"th cpu.
> > > +	 */
> > > +	if (n == cpu)
> > > +		n = (n + 1) % weight;
> > > +	i = cpumask_nth(n, mask);
> 
> This is an entirely broken thing, and it works only because your CPU mask
> is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5)
> will return 5 exactly, if the get_random_u32() draws 4.
> 

Oh I'm sorry for this part, "cpumask_nth()" return value should be
checked instead of checking "get_random_u32()". May I ask why does it
only work for dense cpumask? I mean clearly original method will be
faster when cpumask isn't dense.

> It looks broken even for a dense mask. By probability, your code returns:
> 
> P(0-4,7) == 1/8,
> P(5) == 0,
> P(6) == 1/4.
> 
> Assuming you are trying to implement a random uniform distribution drawing,
> the correct probabilities should look like:
> 
> P(0-4,6-7) == 1/7,
> P(5) == 0,
> 
> Thanks,
> Yury
> 

I did noticed this part, I was thinking that we do not actually need to
make the prbability to be uniform distribution, just want to make it
scatters more then picking certain number.

Thanks for your detailed review and explanation, also thanks to Mark and
Kuan-Wei !

I now realize "random" here is more of a convention for the caller to
states that it doesn't matter which cpu it gets, maybe we should
rephrase the comment to make it less confusing? Because I think "random"
itself does stands for a particular meaning.

Best regards,
Richard Cheng.

> > >  	return i;
> > >  }
> > >  
> > > -- 
> > > 2.43.0
> > > 
> > > 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-14  7:15     ` I Hsin Cheng
@ 2025-01-14 15:02       ` Mark Rutland
  2025-01-14 15:43       ` Yury Norov
  1 sibling, 0 replies; 12+ messages in thread
From: Mark Rutland @ 2025-01-14 15:02 UTC (permalink / raw)
  To: I Hsin Cheng; +Cc: Yury Norov, linux, jserv, linux-kernel, visitorckw

On Tue, Jan 14, 2025 at 03:15:43PM +0800, I Hsin Cheng wrote:
> On Mon, Jan 13, 2025 at 01:00:56PM -0500, Yury Norov wrote:
> > On Mon, Jan 13, 2025 at 11:05:19AM +0000, Mark Rutland wrote:
> > > On Mon, Jan 13, 2025 at 02:18:39PM +0800, I Hsin Cheng wrote:
> > > > Original implementation of "cpumask_any_but()" isn't actually random as
> > > > the comment claims itself to be. It's behavior is in fact to select the
> > > > first cpu in "mask" which isn't equal to "cpu".
> > > 
> > > What it says specifically is:
> > > 
> > >   cpumask_any_but - return a "random" in a cpumask, but not this one.
> > > 
> > > ... and by "random", it really means "arbitrary".
> > > 
> > > The idea here is that the caller is specifying that it doesn't care
> > > which specific CPU is chosen, but this is not required to be a random
> > > selection.

> I now realize "random" here is more of a convention for the caller to
> states that it doesn't matter which cpu it gets, maybe we should
> rephrase the comment to make it less confusing? Because I think "random"
> itself does stands for a particular meaning.

FWIW, I agree. I reckon (as above), we could replace "random" with
arbitrary, i.e. replace

	return a "random" cpu in a cpumask ...
	
... with:

	return an arbitrary cpu in a cpumask ...

Looking again I see that the comment for cpumask_any_but() misses the
word "cpu" too, so it'd be nice to clean that up regardless.

Mark.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-14  7:15     ` I Hsin Cheng
  2025-01-14 15:02       ` Mark Rutland
@ 2025-01-14 15:43       ` Yury Norov
  2025-01-15  7:24         ` I Hsin Cheng
  1 sibling, 1 reply; 12+ messages in thread
From: Yury Norov @ 2025-01-14 15:43 UTC (permalink / raw)
  To: I Hsin Cheng; +Cc: mark.rutland, linux, jserv, linux-kernel, visitorckw

> > > > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> > > >  static __always_inline
> > > >  unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> > > >  {
> > > > -	unsigned int i;
> > > > +	unsigned int i, n, weight;
> > > >  
> > > >  	cpumask_check(cpu);
> > > > -	for_each_cpu(i, mask)
> > > > -		if (i != cpu)
> > > > -			break;
> > > > +	weight = cpumask_weight(mask);
> > > > +	n = get_random_u32() % weight;
> > > > +
> > > > +	/* If we accidentally pick "n" equal to "cpu",
> > > > +	 * then simply choose "n + 1"th cpu.
> > > > +	 */
> > > > +	if (n == cpu)
> > > > +		n = (n + 1) % weight;
> > > > +	i = cpumask_nth(n, mask);
> > 
> > This is an entirely broken thing, and it works only because your CPU mask
> > is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5)
> > will return 5 exactly, if the get_random_u32() draws 4.
> > 
> 
> Oh I'm sorry for this part, "cpumask_nth()" return value should be
> checked instead of checking "get_random_u32()". May I ask why does it
> only work for dense cpumask?

Because for dense mask cpumask_nth(n) == n, and your
 	n = get_random_u32() % weight
is the same as 
 	n = cpumask_nth(get_random_u32() % weight)

> I mean clearly original method will be
> faster when cpumask isn't dense.
> 
> > It looks broken even for a dense mask. By probability, your code returns:
> > 
> > P(0-4,7) == 1/8,
> > P(5) == 0,
> > P(6) == 1/4.
> > 
> > Assuming you are trying to implement a random uniform distribution drawing,
> > the correct probabilities should look like:
> > 
> > P(0-4,6-7) == 1/7,
> > P(5) == 0,
> > 
> > Thanks,
> > Yury
> > 
> 
> I did noticed this part, I was thinking that we do not actually need to
> make the prbability to be uniform distribution, just want to make it
> scatters more then picking certain number.

On cpumask level you can't speculate whether users need true randomness
or not. That's why we pay so much attention to proper function naming.

See: 
 - cpumask_any() - 'any' requirement is much simpler than random;
 - cpumask_any_distribute() - not random at all, just a better (good
   enough) approximation;
 - cpumask_random() - a true random drawing. Must pass Chi^2, 
   Kolmogorov-Smirnov and other fancy statistical tests. As you can see,
   doesn't exist.

If in your function you use expensive true-random get_random_u32(),
one can expect that the output will not be worse than the original
uniform distribution, by the statistical properties means.

> Thanks for your detailed review and explanation, also thanks to Mark and
> Kuan-Wei !
> 
> I now realize "random" here is more of a convention for the caller to
> states that it doesn't matter which cpu it gets, maybe we should
> rephrase the comment to make it less confusing? Because I think "random"
> itself does stands for a particular meaning.

Feel free to send a patch.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but()
  2025-01-14 15:43       ` Yury Norov
@ 2025-01-15  7:24         ` I Hsin Cheng
  0 siblings, 0 replies; 12+ messages in thread
From: I Hsin Cheng @ 2025-01-15  7:24 UTC (permalink / raw)
  To: Yury Norov; +Cc: mark.rutland, linux, jserv, linux-kernel, visitorckw

On Tue, Jan 14, 2025 at 10:43:56AM -0500, Yury Norov wrote:
> > > > > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> > > > >  static __always_inline
> > > > >  unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> > > > >  {
> > > > > -	unsigned int i;
> > > > > +	unsigned int i, n, weight;
> > > > >  
> > > > >  	cpumask_check(cpu);
> > > > > -	for_each_cpu(i, mask)
> > > > > -		if (i != cpu)
> > > > > -			break;
> > > > > +	weight = cpumask_weight(mask);
> > > > > +	n = get_random_u32() % weight;
> > > > > +
> > > > > +	/* If we accidentally pick "n" equal to "cpu",
> > > > > +	 * then simply choose "n + 1"th cpu.
> > > > > +	 */
> > > > > +	if (n == cpu)
> > > > > +		n = (n + 1) % weight;
> > > > > +	i = cpumask_nth(n, mask);
> > > 
> > > This is an entirely broken thing, and it works only because your CPU mask
> > > is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5)
> > > will return 5 exactly, if the get_random_u32() draws 4.
> > > 
> > 
> > Oh I'm sorry for this part, "cpumask_nth()" return value should be
> > checked instead of checking "get_random_u32()". May I ask why does it
> > only work for dense cpumask?
> 
> Because for dense mask cpumask_nth(n) == n, and your
>  	n = get_random_u32() % weight
> is the same as 
>  	n = cpumask_nth(get_random_u32() % weight)
> 
> > I mean clearly original method will be
> > faster when cpumask isn't dense.
> > 
> > > It looks broken even for a dense mask. By probability, your code returns:
> > > 
> > > P(0-4,7) == 1/8,
> > > P(5) == 0,
> > > P(6) == 1/4.
> > > 
> > > Assuming you are trying to implement a random uniform distribution drawing,
> > > the correct probabilities should look like:
> > > 
> > > P(0-4,6-7) == 1/7,
> > > P(5) == 0,
> > > 
> > > Thanks,
> > > Yury
> > > 
> > 
> > I did noticed this part, I was thinking that we do not actually need to
> > make the prbability to be uniform distribution, just want to make it
> > scatters more then picking certain number.
> 
> On cpumask level you can't speculate whether users need true randomness
> or not. That's why we pay so much attention to proper function naming.
> 
> See: 
>  - cpumask_any() - 'any' requirement is much simpler than random;
>  - cpumask_any_distribute() - not random at all, just a better (good
>    enough) approximation;
>  - cpumask_random() - a true random drawing. Must pass Chi^2, 
>    Kolmogorov-Smirnov and other fancy statistical tests. As you can see,
>    doesn't exist.
> 
> If in your function you use expensive true-random get_random_u32(),
> one can expect that the output will not be worse than the original
> uniform distribution, by the statistical properties means.
>

I see, thanks for the explanation, it really helps me learn alot!

> > Thanks for your detailed review and explanation, also thanks to Mark and
> > Kuan-Wei !
> > 
> > I now realize "random" here is more of a convention for the caller to
> > states that it doesn't matter which cpu it gets, maybe we should
> > rephrase the comment to make it less confusing? Because I think "random"
> > itself does stands for a particular meaning.
> 
> Feel free to send a patch.

No problem, I'll send it ASAP.

Regards,
I Hsin Cheng.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-01-15  7:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-13  6:18 [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() I Hsin Cheng
2025-01-13  9:45 ` kernel test robot
2025-01-13  9:45 ` kernel test robot
2025-01-13 10:13 ` Kuan-Wei Chiu
2025-01-13 10:27   ` I Hsin Cheng
2025-01-13 11:09     ` Kuan-Wei Chiu
2025-01-13 11:05 ` Mark Rutland
2025-01-13 18:00   ` Yury Norov
2025-01-14  7:15     ` I Hsin Cheng
2025-01-14 15:02       ` Mark Rutland
2025-01-14 15:43       ` Yury Norov
2025-01-15  7:24         ` I Hsin Cheng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.