All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/2] cpumask: Fix invalid uniprocessor assumptions
@ 2022-06-02 21:04 Sander Vanheule
  2022-06-02 21:04 ` [PATCH v1 1/2] cpumask: Fix invalid uniprocessor mask assumption Sander Vanheule
  2022-06-02 21:04 ` [PATCH v1 2/2] cpumask: Add UP optimised for_each_*_cpu loops Sander Vanheule
  0 siblings, 2 replies; 5+ messages in thread
From: Sander Vanheule @ 2022-06-02 21:04 UTC (permalink / raw)
  To: Peter Zijlstra, Yury Norov, Andrew Morton, Valentin Schneider,
	Thomas Gleixner, Greg Kroah-Hartman, Marco Elver, Barry Song
  Cc: linux-kernel, Andy Shevchenko, Sander Vanheule

On uniprocessor builds, it is currently assumed that any cpumask will
contain the single CPU: cpu0. This assumption is used to provide
optimised implementations.

The current assumption also appears to be wrong, by ignoring the fact
that users can provide empty cpumask-s. This can result in bugs as
explained in [1].

This series seeks to fix this assumption, allowing for masks that
contain at most cpu0, i.e. are "1" or "0".

[1] https://lore.kernel.org/all/20220530082552.46113-1-sander@svanheule.net/

Best,
Sander

---
To test these changes, I used the following code:

static void cpumask_test(const struct cpumask *mask)
{
        int cpu;

        if (cpumask_empty(mask))
                pr_info("testing empty mask\n");
        else
                pr_info("testing non-empty mask\n");

        pr_info("first cpu: %d\n", cpumask_first(mask));
        pr_info("first zero: %d\n", cpumask_first_zero(mask));
        pr_info("first and: %d\n", cpumask_first_and(mask, cpu_possible_mask));
        pr_info("last cpu: %d\n", cpumask_last(mask));
        pr_info("next cpu at -1: %d\n", cpumask_next(-1, mask));
        pr_info("next cpu at 0: %d\n", cpumask_next(0, mask));
        pr_info("next zero at -1: %d\n", cpumask_next_zero(-1, mask));
        pr_info("next zero at 0: %d\n", cpumask_next_zero(0, mask));
        pr_info("next and at -1: %d\n", cpumask_next_and(-1, mask, cpu_possible_mask));
        pr_info("next and at 0: %d\n", cpumask_next_and(0, mask, cpu_possible_mask));
        pr_info("next wrap at -1,false: %d\n", cpumask_next_wrap(-1, mask, 0, false));
        pr_info("next wrap at 0,false: %d\n", cpumask_next_wrap(-1, mask, 0, false));
        pr_info("next wrap at -1,true: %d\n", cpumask_next_wrap(-1, mask, 0, true));
        pr_info("next wrap at 0,true: %d\n", cpumask_next_wrap(-1, mask, 0, true));

        for_each_cpu(cpu, mask)
                pr_info("for each: %d\n", cpu);

        for_each_cpu_not(cpu, mask)
                pr_info("for each not: %d\n", cpu);

        for_each_cpu_wrap(cpu, mask, 0)
                pr_info("for each wrap: %d\n", cpu);

        for_each_cpu_and(cpu, mask, cpu_possible_mask)
                pr_info("for each and: %d\n", cpu);
}

static void run_tests()
{
        cpumask_clear(&empty_mask);
        cpumask_test(&empty_mask);
        cpumask_test(cpu_possible_mask);
}

On an unpatched build, with NR_CPUS=1:
    [...] testing empty mask
    [...] first cpu: 0
    [...] first zero: 0
    [...] first and: 0
    [...] last cpu: 0
    [...] next cpu at -1: 0
    [...] next cpu at 0: 1
    [...] next zero at -1: 0
    [...] next zero at 0: 1
    [...] next and at -1: 0
    [...] next and at 0: 1
    [...] next wrap at -1,false: 0
    [...] next wrap at 0,false: 0
    [...] next wrap at -1,true: 0
    [...] next wrap at 0,true: 0
    [...] for each: 0
    [...] for each not: 0
    [...] for each wrap: 0
    [...] for each and: 0
    [...] testing non-empty mask
    [...] first cpu: 0
    [...] first zero: 0
    [...] first and: 0
    [...] last cpu: 0
    [...] next cpu at -1: 0
    [...] next cpu at 0: 1
    [...] next zero at -1: 0
    [...] next zero at 0: 1
    [...] next and at -1: 0
    [...] next and at 0: 1
    [...] next wrap at -1,false: 0
    [...] next wrap at 0,false: 0
    [...] next wrap at -1,true: 0
    [...] next wrap at 0,true: 0
    [...] for each: 0
    [...] for each not: 0
    [...] for each wrap: 0
    [...] for each and: 0

On a patched build, with NR_CPUS=1:
    [...] testing empty mask
    [...] first cpu: 1
    [...] first zero: 0
    [...] first and: 1
    [...] last cpu: 1
    [...] next cpu at -1: 1
    [...] next cpu at 0: 1
    [...] next zero at -1: 0
    [...] next zero at 0: 1
    [...] next and at -1: 1
    [...] next and at 0: 1
    [...] next wrap at -1,false: 1
    [...] next wrap at 0,false: 1
    [...] next wrap at -1,true: 1
    [...] next wrap at 0,true: 1
    [...] for each not: 0
    [...] testing non-empty mask
    [...] first cpu: 0
    [...] first zero: 1
    [...] first and: 0
    [...] last cpu: 0
    [...] next cpu at -1: 0
    [...] next cpu at 0: 1
    [...] next zero at -1: 1
    [...] next zero at 0: 1
    [...] next and at -1: 0
    [...] next and at 0: 1
    [...] next wrap at -1,false: 0
    [...] next wrap at 0,false: 0
    [...] next wrap at -1,true: 0
    [...] next wrap at 0,true: 0
    [...] for each: 0
    [...] for each wrap: 0

For reference, the generic implementation with NR_CPUS=2, CONFIG_SMP=y
    [...] testing empty mask
    [...] first cpu: 2
    [...] first zero: 0
    [...] first and: 2
    [...] last cpu: 2
    [...] next cpu at -1: 2
    [...] next cpu at 0: 2
    [...] next zero at -1: 0
    [...] next zero at 0: 1
    [...] next and at -1: 2
    [...] next and at 0: 2
    [...] next wrap at -1,false: 2
    [...] next wrap at 0,false: 2
    [...] next wrap at -1,true: 2
    [...] next wrap at 0,true: 2
    [...] for each not: 0
    [...] testing non-empty mask
    [...] first cpu: 0
    [...] first zero: 1
    [...] first and: 0
    [...] last cpu: 0
    [...] next cpu at -1: 0
    [...] next cpu at 0: 2
    [...] next zero at -1: 1
    [...] next zero at 0: 1
    [...] next and at -1: 0
    [...] next and at 0: 2
    [...] next wrap at -1,false: 0
    [...] next wrap at 0,false: 0
    [...] next wrap at -1,true: 2
    [...] next wrap at 0,true: 2
    [...] for each: 0
    [...] for each wrap: 0
    [...] for each and: 0

Sander Vanheule (2):
  cpumask: Fix invalid uniprocessor mask assumption
  cpumask: Add UP optimised for_each_*_cpu loops

 include/linux/cpumask.h | 42 +++++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 20 deletions(-)

-- 
2.36.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-06-03 15:14 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-06-02 21:04 [PATCH v1 0/2] cpumask: Fix invalid uniprocessor assumptions Sander Vanheule
2022-06-02 21:04 ` [PATCH v1 1/2] cpumask: Fix invalid uniprocessor mask assumption Sander Vanheule
2022-06-02 22:48   ` Yury Norov
2022-06-03 15:13     ` Sander Vanheule
2022-06-02 21:04 ` [PATCH v1 2/2] cpumask: Add UP optimised for_each_*_cpu loops Sander Vanheule

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.