* Re: [PATCH v2 1/1] cpumask: Don't waste memory for sysfs cpulist nodes
2022-09-22 20:41 ` Yury Norov
@ 2022-09-22 21:43 ` Phil Auld
2022-09-22 22:07 ` Yury Norov
2022-09-23 6:30 ` Greg Kroah-Hartman
2022-09-23 10:01 ` Andy Shevchenko
2 siblings, 1 reply; 13+ messages in thread
From: Phil Auld @ 2022-09-22 21:43 UTC (permalink / raw)
To: Yury Norov
Cc: Andy Shevchenko, linux-kernel, Rasmus Villemoes,
Petr Štetiar, Greg Kroah-Hartman
On Thu, Sep 22, 2022 at 01:41:40PM -0700 Yury Norov wrote:
> + Petr Štetiar <ynezz@true.cz>,
> + Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> On Thu, Sep 22, 2022 at 10:49:54PM +0300, Andy Shevchenko wrote:
> > Currently the approximation is used which wastes the more memory
> > the more CPUs are present on the system. Proposed change calculates
> > the exact maximum needed in the worst case:
> >
> > NR_CPUS old new
> > ------- --- ---
> > 1 .. 1170 4096 4096
> > 1171 .. 1860 4098 .. 6510 4096
> > ... ... ...
> > 2*4096 28672 19925
> > 4*4096 57344 43597
> > 8*4096 114688 92749
> > 16*4096 229376 191053
> > 32*4096 458752 403197
> > 64*4096 917504 861949
> > 128*4096 1835008 1779453
> > 256*4096 3670016 3670016
> >
> > Under the hood the reccurent formula is being used:
> > (5 - 0) * 2 +
> > (50 - 5) * 3 +
> > (500 - 50) * 4 +
> > (5000 - 500) * 5 +
> > ...
> > (X[i] - X[i-1]) * i
> >
> > which allows to count the exact maximum length in the worst case,
> > i.e. when each second CPU is being listed. For backward compatibility
> > for more than 1170 and less than 1861 CPUs the page size is preserved.
> >
> > For less than 1171 and more than 1 million CPUs the old is being used.
>
> 1861
>
> >
> > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > ---
> > v2: described better the advantage for 1171..1860 CPUs cases
> > include/linux/cpumask.h | 48 +++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 48 insertions(+)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index 1b442fb2001f..12cf0905ca74 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -1122,6 +1122,21 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > *
> > * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
> > *
> > + * for cpulist the reccurent formula is being used:
> > + * (5 - 0) * 2 +
> > + * (50 - 5) * 3 +
> > + * (500 - 50) * 4 +
> > + * (5000 - 500) * 5 +
> > + * ...
> > + * (X[i] - X[i-1]) * i
> > + *
> > + * which allows to count the exact maximum length in the worst case,
> > + * i.e. when each second CPU is being listed. For backward compatibility
> > + * for more than 1170 and less than 1861 CPUs the page size is preserved.
> > + *
> > + * For less than 1171 and more than 1 million CPUs the old is being used
> > + * as described below:
> > + *
> > * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
> > * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
> > * cover a worst-case of every other cpu being on one of two nodes for a
> > @@ -1132,6 +1147,39 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > */
> > #define CPUMAP_FILE_MAX_BYTES (((NR_CPUS * 9)/32 > PAGE_SIZE) \
> > ? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
> > +
> > +#define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> > +#define __CPULIST_FOR_100(x) (((x + 1) / 2 - 5) * 3)
> > +#define __CPULIST_FOR_1000(x) (((x + 1) / 2 - 50) * 4)
> > +#define __CPULIST_FOR_10000(x) (((x + 1) / 2 - 500) * 5)
> > +#define __CPULIST_FOR_100000(x) (((x + 1) / 2 - 5000) * 6)
> > +#define __CPULIST_FOR_1000000(x) (((x + 1) / 2 - 50000) * 7)
>
> The defs below will be nicer if you make it like this:
>
> #define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> #define __CPULIST_FOR_100(x) __CPULIST_FOR_10(10) + (((x + 1) / 2 - 5) * 3)
> #define __CPULIST_FOR_1000(x) __CPULIST_FOR_100(100) + (((x + 1) / 2 - 50) * 4)
> ...
>
>
>
> > +#if NR_CPUS < 1861
> > +#define CPULIST_FILE_MAX_BYTES PAGE_SIZE
>
> The comment says:
> for more than 1170 and less than 1861 CPUs the page size is preserved.
>
> Which doesn't look correct. Looks like it should be:
> for less than 1861 CPUs the page size is preserved.
>
> Or I miss something?
>
> > +#elif NR_CPUS < 10000
> > +#define CPULIST_FILE_MAX_BYTES \
> > + (__CPULIST_FOR_10(10) + \
> > + __CPULIST_FOR_100(100) + \
> > + __CPULIST_FOR_1000(1000) + \
> > + __CPULIST_FOR_10000(NR_CPUS))
> > +#elif NR_CPUS < 100000
> > +#define CPULIST_FILE_MAX_BYTES \
> > + (__CPULIST_FOR_10(10) + \
> > + __CPULIST_FOR_100(100) + \
> > + __CPULIST_FOR_1000(1000) + \
> > + __CPULIST_FOR_10000(10000) + \
> > + __CPULIST_FOR_100000(NR_CPUS))
> > +#elif NR_CPUS < 1000000
> > +#define CPULIST_FILE_MAX_BYTES \
> > + (__CPULIST_FOR_10(10) + \
> > + __CPULIST_FOR_100(100) + \
> > + __CPULIST_FOR_1000(1000) + \
> > + __CPULIST_FOR_10000(10000) + \
> > + __CPULIST_FOR_100000(100000) + \
> > + __CPULIST_FOR_1000000(NR_CPUS))
> > +#else
> > #define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
> > +#endif
> >
> > #endif /* __LINUX_CPUMASK_H */
> > --
> > 2.35.1
>
> I'm OK to take this in replace for Phil's version, but the commit that
> introduces CPULIST_FILE_MAX_BYTES is already in mainline: 7ee951acd31a8
> ("drivers/base: fix userspace break from using bin_attributes for cpumap
> and cpulist"). Can you rebase it on top of v6.0-rc6?
>
> Greg, since Andy's version is more precise, I'd like to send a pull
> request with it in -rc7. Can you drop Phil's patch so I'll go with
> this one?
>
This changes the other of the 2 macros and looks like it is already on
top of the fix to CPUMAP_FILE_MAX_BYTES.
It should be able to go right on top of a tree with that one in it, I think.
With the comment fixed up as you note above I'll git Reviewed-by:
and Tested-by: shortly.
This one is a refinement of 7ee951acd31a8 though and is not a critical as
the one Greg was talking about and Petr hit.
Cheers,
Phil
> Thanks,
> Yury
>
--
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v2 1/1] cpumask: Don't waste memory for sysfs cpulist nodes
2022-09-22 21:43 ` Phil Auld
@ 2022-09-22 22:07 ` Yury Norov
0 siblings, 0 replies; 13+ messages in thread
From: Yury Norov @ 2022-09-22 22:07 UTC (permalink / raw)
To: Phil Auld
Cc: Andy Shevchenko, linux-kernel, Rasmus Villemoes,
Petr Štetiar, Greg Kroah-Hartman
On Thu, Sep 22, 2022 at 05:43:38PM -0400, Phil Auld wrote:
> On Thu, Sep 22, 2022 at 01:41:40PM -0700 Yury Norov wrote:
> > + Petr Štetiar <ynezz@true.cz>,
> > + Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >
> > On Thu, Sep 22, 2022 at 10:49:54PM +0300, Andy Shevchenko wrote:
> > > Currently the approximation is used which wastes the more memory
> > > the more CPUs are present on the system. Proposed change calculates
> > > the exact maximum needed in the worst case:
> > >
> > > NR_CPUS old new
> > > ------- --- ---
> > > 1 .. 1170 4096 4096
> > > 1171 .. 1860 4098 .. 6510 4096
> > > ... ... ...
> > > 2*4096 28672 19925
> > > 4*4096 57344 43597
> > > 8*4096 114688 92749
> > > 16*4096 229376 191053
> > > 32*4096 458752 403197
> > > 64*4096 917504 861949
> > > 128*4096 1835008 1779453
> > > 256*4096 3670016 3670016
> > >
> > > Under the hood the reccurent formula is being used:
> > > (5 - 0) * 2 +
> > > (50 - 5) * 3 +
> > > (500 - 50) * 4 +
> > > (5000 - 500) * 5 +
> > > ...
> > > (X[i] - X[i-1]) * i
> > >
> > > which allows to count the exact maximum length in the worst case,
> > > i.e. when each second CPU is being listed. For backward compatibility
> > > for more than 1170 and less than 1861 CPUs the page size is preserved.
> > >
> > > For less than 1171 and more than 1 million CPUs the old is being used.
> >
> > 1861
> >
> > >
> > > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > ---
> > > v2: described better the advantage for 1171..1860 CPUs cases
> > > include/linux/cpumask.h | 48 +++++++++++++++++++++++++++++++++++++++++
> > > 1 file changed, 48 insertions(+)
> > >
> > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > > index 1b442fb2001f..12cf0905ca74 100644
> > > --- a/include/linux/cpumask.h
> > > +++ b/include/linux/cpumask.h
> > > @@ -1122,6 +1122,21 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > > *
> > > * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
> > > *
> > > + * for cpulist the reccurent formula is being used:
> > > + * (5 - 0) * 2 +
> > > + * (50 - 5) * 3 +
> > > + * (500 - 50) * 4 +
> > > + * (5000 - 500) * 5 +
> > > + * ...
> > > + * (X[i] - X[i-1]) * i
> > > + *
> > > + * which allows to count the exact maximum length in the worst case,
> > > + * i.e. when each second CPU is being listed. For backward compatibility
> > > + * for more than 1170 and less than 1861 CPUs the page size is preserved.
> > > + *
> > > + * For less than 1171 and more than 1 million CPUs the old is being used
> > > + * as described below:
> > > + *
> > > * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
> > > * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
> > > * cover a worst-case of every other cpu being on one of two nodes for a
> > > @@ -1132,6 +1147,39 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > > */
> > > #define CPUMAP_FILE_MAX_BYTES (((NR_CPUS * 9)/32 > PAGE_SIZE) \
> > > ? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
> > > +
> > > +#define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> > > +#define __CPULIST_FOR_100(x) (((x + 1) / 2 - 5) * 3)
> > > +#define __CPULIST_FOR_1000(x) (((x + 1) / 2 - 50) * 4)
> > > +#define __CPULIST_FOR_10000(x) (((x + 1) / 2 - 500) * 5)
> > > +#define __CPULIST_FOR_100000(x) (((x + 1) / 2 - 5000) * 6)
> > > +#define __CPULIST_FOR_1000000(x) (((x + 1) / 2 - 50000) * 7)
> >
> > The defs below will be nicer if you make it like this:
> >
> > #define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> > #define __CPULIST_FOR_100(x) __CPULIST_FOR_10(10) + (((x + 1) / 2 - 5) * 3)
> > #define __CPULIST_FOR_1000(x) __CPULIST_FOR_100(100) + (((x + 1) / 2 - 50) * 4)
> > ...
> >
> >
> >
> > > +#if NR_CPUS < 1861
> > > +#define CPULIST_FILE_MAX_BYTES PAGE_SIZE
> >
> > The comment says:
> > for more than 1170 and less than 1861 CPUs the page size is preserved.
> >
> > Which doesn't look correct. Looks like it should be:
> > for less than 1861 CPUs the page size is preserved.
> >
> > Or I miss something?
> >
> > > +#elif NR_CPUS < 10000
> > > +#define CPULIST_FILE_MAX_BYTES \
> > > + (__CPULIST_FOR_10(10) + \
> > > + __CPULIST_FOR_100(100) + \
> > > + __CPULIST_FOR_1000(1000) + \
> > > + __CPULIST_FOR_10000(NR_CPUS))
> > > +#elif NR_CPUS < 100000
> > > +#define CPULIST_FILE_MAX_BYTES \
> > > + (__CPULIST_FOR_10(10) + \
> > > + __CPULIST_FOR_100(100) + \
> > > + __CPULIST_FOR_1000(1000) + \
> > > + __CPULIST_FOR_10000(10000) + \
> > > + __CPULIST_FOR_100000(NR_CPUS))
> > > +#elif NR_CPUS < 1000000
> > > +#define CPULIST_FILE_MAX_BYTES \
> > > + (__CPULIST_FOR_10(10) + \
> > > + __CPULIST_FOR_100(100) + \
> > > + __CPULIST_FOR_1000(1000) + \
> > > + __CPULIST_FOR_10000(10000) + \
> > > + __CPULIST_FOR_100000(100000) + \
> > > + __CPULIST_FOR_1000000(NR_CPUS))
> > > +#else
> > > #define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
> > > +#endif
> > >
> > > #endif /* __LINUX_CPUMASK_H */
> > > --
> > > 2.35.1
> >
> > I'm OK to take this in replace for Phil's version, but the commit that
> > introduces CPULIST_FILE_MAX_BYTES is already in mainline: 7ee951acd31a8
> > ("drivers/base: fix userspace break from using bin_attributes for cpumap
> > and cpulist"). Can you rebase it on top of v6.0-rc6?
> >
> > Greg, since Andy's version is more precise, I'd like to send a pull
> > request with it in -rc7. Can you drop Phil's patch so I'll go with
> > this one?
> >
>
> This changes the other of the 2 macros and looks like it is already on
> top of the fix to CPUMAP_FILE_MAX_BYTES.
>
> It should be able to go right on top of a tree with that one in it, I think.
Indeed. (I reviewed it from a phone and missed that.)
> With the comment fixed up as you note above I'll git Reviewed-by:
> and Tested-by: shortly.
>
> This one is a refinement of 7ee951acd31a8 though and is not a critical as
> the one Greg was talking about and Petr hit.
OK with that. Let's see what Greg will say about how to handle it
better. I'm OK with both ways.
Thanks,
Yury
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/1] cpumask: Don't waste memory for sysfs cpulist nodes
2022-09-22 20:41 ` Yury Norov
2022-09-22 21:43 ` Phil Auld
@ 2022-09-23 6:30 ` Greg Kroah-Hartman
2022-09-23 15:23 ` Yury Norov
2022-09-23 10:01 ` Andy Shevchenko
2 siblings, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2022-09-23 6:30 UTC (permalink / raw)
To: Yury Norov
Cc: Andy Shevchenko, linux-kernel, Rasmus Villemoes, Phil Auld,
Petr Štetiar
On Thu, Sep 22, 2022 at 01:41:40PM -0700, Yury Norov wrote:
> + Petr Štetiar <ynezz@true.cz>,
> + Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> On Thu, Sep 22, 2022 at 10:49:54PM +0300, Andy Shevchenko wrote:
> > Currently the approximation is used which wastes the more memory
> > the more CPUs are present on the system. Proposed change calculates
> > the exact maximum needed in the worst case:
> >
> > NR_CPUS old new
> > ------- --- ---
> > 1 .. 1170 4096 4096
> > 1171 .. 1860 4098 .. 6510 4096
> > ... ... ...
> > 2*4096 28672 19925
> > 4*4096 57344 43597
> > 8*4096 114688 92749
> > 16*4096 229376 191053
> > 32*4096 458752 403197
> > 64*4096 917504 861949
> > 128*4096 1835008 1779453
> > 256*4096 3670016 3670016
> >
> > Under the hood the reccurent formula is being used:
> > (5 - 0) * 2 +
> > (50 - 5) * 3 +
> > (500 - 50) * 4 +
> > (5000 - 500) * 5 +
> > ...
> > (X[i] - X[i-1]) * i
> >
> > which allows to count the exact maximum length in the worst case,
> > i.e. when each second CPU is being listed. For backward compatibility
> > for more than 1170 and less than 1861 CPUs the page size is preserved.
> >
> > For less than 1171 and more than 1 million CPUs the old is being used.
>
> 1861
>
> >
> > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > ---
> > v2: described better the advantage for 1171..1860 CPUs cases
> > include/linux/cpumask.h | 48 +++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 48 insertions(+)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index 1b442fb2001f..12cf0905ca74 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -1122,6 +1122,21 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > *
> > * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
> > *
> > + * for cpulist the reccurent formula is being used:
> > + * (5 - 0) * 2 +
> > + * (50 - 5) * 3 +
> > + * (500 - 50) * 4 +
> > + * (5000 - 500) * 5 +
> > + * ...
> > + * (X[i] - X[i-1]) * i
> > + *
> > + * which allows to count the exact maximum length in the worst case,
> > + * i.e. when each second CPU is being listed. For backward compatibility
> > + * for more than 1170 and less than 1861 CPUs the page size is preserved.
> > + *
> > + * For less than 1171 and more than 1 million CPUs the old is being used
> > + * as described below:
> > + *
> > * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
> > * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
> > * cover a worst-case of every other cpu being on one of two nodes for a
> > @@ -1132,6 +1147,39 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > */
> > #define CPUMAP_FILE_MAX_BYTES (((NR_CPUS * 9)/32 > PAGE_SIZE) \
> > ? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
> > +
> > +#define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> > +#define __CPULIST_FOR_100(x) (((x + 1) / 2 - 5) * 3)
> > +#define __CPULIST_FOR_1000(x) (((x + 1) / 2 - 50) * 4)
> > +#define __CPULIST_FOR_10000(x) (((x + 1) / 2 - 500) * 5)
> > +#define __CPULIST_FOR_100000(x) (((x + 1) / 2 - 5000) * 6)
> > +#define __CPULIST_FOR_1000000(x) (((x + 1) / 2 - 50000) * 7)
>
> The defs below will be nicer if you make it like this:
>
> #define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> #define __CPULIST_FOR_100(x) __CPULIST_FOR_10(10) + (((x + 1) / 2 - 5) * 3)
> #define __CPULIST_FOR_1000(x) __CPULIST_FOR_100(100) + (((x + 1) / 2 - 50) * 4)
> ...
>
>
>
> > +#if NR_CPUS < 1861
> > +#define CPULIST_FILE_MAX_BYTES PAGE_SIZE
>
> The comment says:
> for more than 1170 and less than 1861 CPUs the page size is preserved.
>
> Which doesn't look correct. Looks like it should be:
> for less than 1861 CPUs the page size is preserved.
>
> Or I miss something?
>
> > +#elif NR_CPUS < 10000
> > +#define CPULIST_FILE_MAX_BYTES \
> > + (__CPULIST_FOR_10(10) + \
> > + __CPULIST_FOR_100(100) + \
> > + __CPULIST_FOR_1000(1000) + \
> > + __CPULIST_FOR_10000(NR_CPUS))
> > +#elif NR_CPUS < 100000
> > +#define CPULIST_FILE_MAX_BYTES \
> > + (__CPULIST_FOR_10(10) + \
> > + __CPULIST_FOR_100(100) + \
> > + __CPULIST_FOR_1000(1000) + \
> > + __CPULIST_FOR_10000(10000) + \
> > + __CPULIST_FOR_100000(NR_CPUS))
> > +#elif NR_CPUS < 1000000
> > +#define CPULIST_FILE_MAX_BYTES \
> > + (__CPULIST_FOR_10(10) + \
> > + __CPULIST_FOR_100(100) + \
> > + __CPULIST_FOR_1000(1000) + \
> > + __CPULIST_FOR_10000(10000) + \
> > + __CPULIST_FOR_100000(100000) + \
> > + __CPULIST_FOR_1000000(NR_CPUS))
> > +#else
> > #define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
> > +#endif
> >
> > #endif /* __LINUX_CPUMASK_H */
> > --
> > 2.35.1
>
> I'm OK to take this in replace for Phil's version, but the commit that
> introduces CPULIST_FILE_MAX_BYTES is already in mainline: 7ee951acd31a8
> ("drivers/base: fix userspace break from using bin_attributes for cpumap
> and cpulist"). Can you rebase it on top of v6.0-rc6?
>
> Greg, since Andy's version is more precise, I'd like to send a pull
> request with it in -rc7. Can you drop Phil's patch so I'll go with
> this one?
Let me get this fix to Linus now for 6.0-final as it is hitting people
right now. Making it "cleaner" after that is fine to go through your
tree as that's less of an issue, right?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v2 1/1] cpumask: Don't waste memory for sysfs cpulist nodes
2022-09-23 6:30 ` Greg Kroah-Hartman
@ 2022-09-23 15:23 ` Yury Norov
0 siblings, 0 replies; 13+ messages in thread
From: Yury Norov @ 2022-09-23 15:23 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Andy Shevchenko, linux-kernel, Rasmus Villemoes, Phil Auld,
Petr Štetiar
On Fri, Sep 23, 2022 at 08:30:45AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Sep 22, 2022 at 01:41:40PM -0700, Yury Norov wrote:
> > + Petr Štetiar <ynezz@true.cz>,
> > + Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >
> > On Thu, Sep 22, 2022 at 10:49:54PM +0300, Andy Shevchenko wrote:
> > > Currently the approximation is used which wastes the more memory
> > > the more CPUs are present on the system. Proposed change calculates
> > > the exact maximum needed in the worst case:
> > >
> > > NR_CPUS old new
> > > ------- --- ---
> > > 1 .. 1170 4096 4096
> > > 1171 .. 1860 4098 .. 6510 4096
> > > ... ... ...
> > > 2*4096 28672 19925
> > > 4*4096 57344 43597
> > > 8*4096 114688 92749
> > > 16*4096 229376 191053
> > > 32*4096 458752 403197
> > > 64*4096 917504 861949
> > > 128*4096 1835008 1779453
> > > 256*4096 3670016 3670016
> > >
> > > Under the hood the reccurent formula is being used:
> > > (5 - 0) * 2 +
> > > (50 - 5) * 3 +
> > > (500 - 50) * 4 +
> > > (5000 - 500) * 5 +
> > > ...
> > > (X[i] - X[i-1]) * i
> > >
> > > which allows to count the exact maximum length in the worst case,
> > > i.e. when each second CPU is being listed. For backward compatibility
> > > for more than 1170 and less than 1861 CPUs the page size is preserved.
> > >
> > > For less than 1171 and more than 1 million CPUs the old is being used.
> >
> > 1861
> >
> > >
> > > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > ---
> > > v2: described better the advantage for 1171..1860 CPUs cases
> > > include/linux/cpumask.h | 48 +++++++++++++++++++++++++++++++++++++++++
> > > 1 file changed, 48 insertions(+)
> > >
> > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > > index 1b442fb2001f..12cf0905ca74 100644
> > > --- a/include/linux/cpumask.h
> > > +++ b/include/linux/cpumask.h
> > > @@ -1122,6 +1122,21 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > > *
> > > * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
> > > *
> > > + * for cpulist the reccurent formula is being used:
> > > + * (5 - 0) * 2 +
> > > + * (50 - 5) * 3 +
> > > + * (500 - 50) * 4 +
> > > + * (5000 - 500) * 5 +
> > > + * ...
> > > + * (X[i] - X[i-1]) * i
> > > + *
> > > + * which allows to count the exact maximum length in the worst case,
> > > + * i.e. when each second CPU is being listed. For backward compatibility
> > > + * for more than 1170 and less than 1861 CPUs the page size is preserved.
> > > + *
> > > + * For less than 1171 and more than 1 million CPUs the old is being used
> > > + * as described below:
> > > + *
> > > * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
> > > * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
> > > * cover a worst-case of every other cpu being on one of two nodes for a
> > > @@ -1132,6 +1147,39 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > > */
> > > #define CPUMAP_FILE_MAX_BYTES (((NR_CPUS * 9)/32 > PAGE_SIZE) \
> > > ? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
> > > +
> > > +#define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> > > +#define __CPULIST_FOR_100(x) (((x + 1) / 2 - 5) * 3)
> > > +#define __CPULIST_FOR_1000(x) (((x + 1) / 2 - 50) * 4)
> > > +#define __CPULIST_FOR_10000(x) (((x + 1) / 2 - 500) * 5)
> > > +#define __CPULIST_FOR_100000(x) (((x + 1) / 2 - 5000) * 6)
> > > +#define __CPULIST_FOR_1000000(x) (((x + 1) / 2 - 50000) * 7)
> >
> > The defs below will be nicer if you make it like this:
> >
> > #define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> > #define __CPULIST_FOR_100(x) __CPULIST_FOR_10(10) + (((x + 1) / 2 - 5) * 3)
> > #define __CPULIST_FOR_1000(x) __CPULIST_FOR_100(100) + (((x + 1) / 2 - 50) * 4)
> > ...
> >
> >
> >
> > > +#if NR_CPUS < 1861
> > > +#define CPULIST_FILE_MAX_BYTES PAGE_SIZE
> >
> > The comment says:
> > for more than 1170 and less than 1861 CPUs the page size is preserved.
> >
> > Which doesn't look correct. Looks like it should be:
> > for less than 1861 CPUs the page size is preserved.
> >
> > Or I miss something?
> >
> > > +#elif NR_CPUS < 10000
> > > +#define CPULIST_FILE_MAX_BYTES \
> > > + (__CPULIST_FOR_10(10) + \
> > > + __CPULIST_FOR_100(100) + \
> > > + __CPULIST_FOR_1000(1000) + \
> > > + __CPULIST_FOR_10000(NR_CPUS))
> > > +#elif NR_CPUS < 100000
> > > +#define CPULIST_FILE_MAX_BYTES \
> > > + (__CPULIST_FOR_10(10) + \
> > > + __CPULIST_FOR_100(100) + \
> > > + __CPULIST_FOR_1000(1000) + \
> > > + __CPULIST_FOR_10000(10000) + \
> > > + __CPULIST_FOR_100000(NR_CPUS))
> > > +#elif NR_CPUS < 1000000
> > > +#define CPULIST_FILE_MAX_BYTES \
> > > + (__CPULIST_FOR_10(10) + \
> > > + __CPULIST_FOR_100(100) + \
> > > + __CPULIST_FOR_1000(1000) + \
> > > + __CPULIST_FOR_10000(10000) + \
> > > + __CPULIST_FOR_100000(100000) + \
> > > + __CPULIST_FOR_1000000(NR_CPUS))
> > > +#else
> > > #define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
> > > +#endif
> > >
> > > #endif /* __LINUX_CPUMASK_H */
> > > --
> > > 2.35.1
> >
> > I'm OK to take this in replace for Phil's version, but the commit that
> > introduces CPULIST_FILE_MAX_BYTES is already in mainline: 7ee951acd31a8
> > ("drivers/base: fix userspace break from using bin_attributes for cpumap
> > and cpulist"). Can you rebase it on top of v6.0-rc6?
> >
> > Greg, since Andy's version is more precise, I'd like to send a pull
> > request with it in -rc7. Can you drop Phil's patch so I'll go with
> > this one?
>
> Let me get this fix to Linus now for 6.0-final as it is hitting people
> right now. Making it "cleaner" after that is fine to go through your
> tree as that's less of an issue, right?
OK
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/1] cpumask: Don't waste memory for sysfs cpulist nodes
2022-09-22 20:41 ` Yury Norov
2022-09-22 21:43 ` Phil Auld
2022-09-23 6:30 ` Greg Kroah-Hartman
@ 2022-09-23 10:01 ` Andy Shevchenko
2 siblings, 0 replies; 13+ messages in thread
From: Andy Shevchenko @ 2022-09-23 10:01 UTC (permalink / raw)
To: Yury Norov
Cc: linux-kernel, Rasmus Villemoes, Phil Auld, Petr Štetiar,
Greg Kroah-Hartman
On Thu, Sep 22, 2022 at 01:41:40PM -0700, Yury Norov wrote:
> + Petr Štetiar <ynezz@true.cz>,
> + Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> On Thu, Sep 22, 2022 at 10:49:54PM +0300, Andy Shevchenko wrote:
> > Currently the approximation is used which wastes the more memory
> > the more CPUs are present on the system. Proposed change calculates
> > the exact maximum needed in the worst case:
> >
> > NR_CPUS old new
> > ------- --- ---
> > 1 .. 1170 4096 4096
> > 1171 .. 1860 4098 .. 6510 4096
> > ... ... ...
> > 2*4096 28672 19925
> > 4*4096 57344 43597
> > 8*4096 114688 92749
> > 16*4096 229376 191053
> > 32*4096 458752 403197
> > 64*4096 917504 861949
> > 128*4096 1835008 1779453
> > 256*4096 3670016 3670016
> >
> > Under the hood the reccurent formula is being used:
> > (5 - 0) * 2 +
> > (50 - 5) * 3 +
> > (500 - 50) * 4 +
> > (5000 - 500) * 5 +
> > ...
> > (X[i] - X[i-1]) * i
> >
> > which allows to count the exact maximum length in the worst case,
> > i.e. when each second CPU is being listed. For backward compatibility
> > for more than 1170 and less than 1861 CPUs the page size is preserved.
> >
> > For less than 1171 and more than 1 million CPUs the old is being used.
>
> 1861
No, this is correct for the PAGE_SIZE == 4096.
> > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > ---
> > v2: described better the advantage for 1171..1860 CPUs cases
> > include/linux/cpumask.h | 48 +++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 48 insertions(+)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index 1b442fb2001f..12cf0905ca74 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -1122,6 +1122,21 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > *
> > * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
> > *
> > + * for cpulist the reccurent formula is being used:
> > + * (5 - 0) * 2 +
> > + * (50 - 5) * 3 +
> > + * (500 - 50) * 4 +
> > + * (5000 - 500) * 5 +
> > + * ...
> > + * (X[i] - X[i-1]) * i
> > + *
> > + * which allows to count the exact maximum length in the worst case,
> > + * i.e. when each second CPU is being listed. For backward compatibility
> > + * for more than 1170 and less than 1861 CPUs the page size is preserved.
> > + *
> > + * For less than 1171 and more than 1 million CPUs the old is being used
> > + * as described below:
> > + *
> > * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
> > * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
> > * cover a worst-case of every other cpu being on one of two nodes for a
> > @@ -1132,6 +1147,39 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
> > */
> > #define CPUMAP_FILE_MAX_BYTES (((NR_CPUS * 9)/32 > PAGE_SIZE) \
> > ? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
> > +
> > +#define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> > +#define __CPULIST_FOR_100(x) (((x + 1) / 2 - 5) * 3)
> > +#define __CPULIST_FOR_1000(x) (((x + 1) / 2 - 50) * 4)
> > +#define __CPULIST_FOR_10000(x) (((x + 1) / 2 - 500) * 5)
> > +#define __CPULIST_FOR_100000(x) (((x + 1) / 2 - 5000) * 6)
> > +#define __CPULIST_FOR_1000000(x) (((x + 1) / 2 - 50000) * 7)
>
> The defs below will be nicer if you make it like this:
>
> #define __CPULIST_FOR_10(x) (((x + 1) / 2 - 0) * 2)
> #define __CPULIST_FOR_100(x) __CPULIST_FOR_10(10) + (((x + 1) / 2 - 5) * 3)
> #define __CPULIST_FOR_1000(x) __CPULIST_FOR_100(100) + (((x + 1) / 2 - 50) * 4)
> ...
Not big deal, but I found my way more readable.
> > +#if NR_CPUS < 1861
> > +#define CPULIST_FILE_MAX_BYTES PAGE_SIZE
>
> The comment says:
> for more than 1170 and less than 1861 CPUs the page size is preserved.
>
> Which doesn't look correct. Looks like it should be:
> for less than 1861 CPUs the page size is preserved.
>
> Or I miss something?
Yes, you missed that the current formula gives an overhead already at 1171,
while it's room up to 1860. All these numbers are for PAGE_SIZE == 4096. In any
case, I was thinking more about this and I need to revert to my (locally)
initial approach to count the real size and then do like the old formula does,
i.e. max(PAGE_SIZE, real size) at the end.
> > +#elif NR_CPUS < 10000
> > +#define CPULIST_FILE_MAX_BYTES \
> > + (__CPULIST_FOR_10(10) + \
> > + __CPULIST_FOR_100(100) + \
> > + __CPULIST_FOR_1000(1000) + \
> > + __CPULIST_FOR_10000(NR_CPUS))
> > +#elif NR_CPUS < 100000
> > +#define CPULIST_FILE_MAX_BYTES \
> > + (__CPULIST_FOR_10(10) + \
> > + __CPULIST_FOR_100(100) + \
> > + __CPULIST_FOR_1000(1000) + \
> > + __CPULIST_FOR_10000(10000) + \
> > + __CPULIST_FOR_100000(NR_CPUS))
> > +#elif NR_CPUS < 1000000
> > +#define CPULIST_FILE_MAX_BYTES \
> > + (__CPULIST_FOR_10(10) + \
> > + __CPULIST_FOR_100(100) + \
> > + __CPULIST_FOR_1000(1000) + \
> > + __CPULIST_FOR_10000(10000) + \
> > + __CPULIST_FOR_100000(100000) + \
> > + __CPULIST_FOR_1000000(NR_CPUS))
> > +#else
> > #define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
> > +#endif
> >
> > #endif /* __LINUX_CPUMASK_H */
> I'm OK to take this in replace for Phil's version, but the commit that
> introduces CPULIST_FILE_MAX_BYTES is already in mainline: 7ee951acd31a8
> ("drivers/base: fix userspace break from using bin_attributes for cpumap
> and cpulist"). Can you rebase it on top of v6.0-rc6?
>
> Greg, since Andy's version is more precise, I'd like to send a pull
> request with it in -rc7. Can you drop Phil's patch so I'll go with
> this one?
If it's already in mainline, then there is no way we can drop it. Also note,
everything which is in -next branches are usually not for rebase (and IIRC Greg
never rebases his trees). Hence, my patch is for Linux Next, i.e. v6.1-rc1.
In any case, please wait for v3.
I'll Cc it to the people you mentioned above, if you think it's a right thing
to do. (The Cc list is based on MAINTAINERS + author of the previous patch)
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 13+ messages in thread