public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 5/5] cpumask: reduce cpumask_size
@ 2010-06-25 13:05 Rusty Russell
  2010-06-28  3:12 ` KOSAKI Motohiro
  0 siblings, 1 reply; 4+ messages in thread
From: Rusty Russell @ 2010-06-25 13:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnd Bergmann, anton, KOSAKI Motohiro, Mike Travis,
	anton, KOSAKI Motohiro, Mike Travis

Now we're sure noone is using old cpumask operators, nor *cpumask, we can
allocate less bits safely.  This reduces the memory usage of off-stack
cpumasks when CONFIG_CPUMASK_OFFSTACK=y but we don't have NR_CPUS actual
cpus.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: anton@samba.org
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mike Travis <travis@sgi.com>
---
 include/linux/cpumask.h |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -1014,13 +1014,11 @@ static inline int cpulist_parse(const ch
 /**
  * cpumask_size - size to allocate for a 'struct cpumask' in bytes
  *
- * This will eventually be a runtime variable, depending on nr_cpu_ids.
+ * This can be a runtime variable, depending on nr_cpu_ids.
  */
 static inline size_t cpumask_size(void)
 {
-	/* FIXME: Once all cpumask assignments are eliminated, this
-	 * can be nr_cpumask_bits */
-	return BITS_TO_LONGS(NR_CPUS) * sizeof(long);
+	return BITS_TO_LONGS(nr_cpumask_bits) * sizeof(long);
 }
 
 /*


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 5/5] cpumask: reduce cpumask_size
  2010-06-25 13:05 [PATCH 5/5] cpumask: reduce cpumask_size Rusty Russell
@ 2010-06-28  3:12 ` KOSAKI Motohiro
  2010-06-28 10:27   ` Rusty Russell
  0 siblings, 1 reply; 4+ messages in thread
From: KOSAKI Motohiro @ 2010-06-28  3:12 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kosaki.motohiro, Ingo Molnar, linux-kernel, Arnd Bergmann, anton,
	Mike Travis

> Now we're sure noone is using old cpumask operators, nor *cpumask, we can
> allocate less bits safely.  This reduces the memory usage of off-stack
> cpumasks when CONFIG_CPUMASK_OFFSTACK=y but we don't have NR_CPUS actual
> cpus.

I have to say I'm sorry. Probably I broke your assumption.
If this patch applied, we reintroduce exposing nr_cpu_ids issue and
break libnuma again. I think following change is necessary too.

Or, Am I missing something?

==================================================================
diff --git a/kernel/compat.c b/kernel/compat.c
index 5adab05..5fbee3e 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -506,7 +506,8 @@ asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,

        ret = sched_getaffinity(pid, mask);
        if (ret == 0) {
-               size_t retlen = min_t(size_t, len, cpumask_size());
+               size_t retlen = min_t(size_t, len,
+                                     BITS_TO_LONGS(NR_CPUS) * sizeof(long));

                if (compat_put_bitmap(user_mask_ptr, cpumask_bits(mask), retlen * 8))
                        ret = -EFAULT;
diff --git a/kernel/sched.c b/kernel/sched.c
index 18faf4d..c14acad 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4823,7 +4823,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,

        ret = sched_getaffinity(pid, mask);
        if (ret == 0) {
-               size_t retlen = min_t(size_t, len, cpumask_size());
+               size_t retlen = min_t(size_t, len,
+                                     BITS_TO_LONGS(NR_CPUS) * sizeof(long));


> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: anton@samba.org
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Mike Travis <travis@sgi.com>
> ---
>  include/linux/cpumask.h |    6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -1014,13 +1014,11 @@ static inline int cpulist_parse(const ch
>  /**
>   * cpumask_size - size to allocate for a 'struct cpumask' in bytes
>   *
> - * This will eventually be a runtime variable, depending on nr_cpu_ids.
> + * This can be a runtime variable, depending on nr_cpu_ids.
>   */
>  static inline size_t cpumask_size(void)
>  {
> -	/* FIXME: Once all cpumask assignments are eliminated, this
> -	 * can be nr_cpumask_bits */
> -	return BITS_TO_LONGS(NR_CPUS) * sizeof(long);
> +	return BITS_TO_LONGS(nr_cpumask_bits) * sizeof(long);
>  }
>  
>  /*
> 




^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 5/5] cpumask: reduce cpumask_size
  2010-06-28  3:12 ` KOSAKI Motohiro
@ 2010-06-28 10:27   ` Rusty Russell
  2010-06-28 10:31     ` KOSAKI Motohiro
  0 siblings, 1 reply; 4+ messages in thread
From: Rusty Russell @ 2010-06-28 10:27 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Ingo Molnar, linux-kernel, Arnd Bergmann, anton, Mike Travis

On Mon, 28 Jun 2010 12:42:23 pm KOSAKI Motohiro wrote:
> > Now we're sure noone is using old cpumask operators, nor *cpumask, we can
> > allocate less bits safely.  This reduces the memory usage of off-stack
> > cpumasks when CONFIG_CPUMASK_OFFSTACK=y but we don't have NR_CPUS actual
> > cpus.
> 
> I have to say I'm sorry. Probably I broke your assumption.
> If this patch applied, we reintroduce exposing nr_cpu_ids issue and
> break libnuma again. I think following change is necessary too.
> 
> Or, Am I missing something?

I cc'd you because I remembered you being involved in that libnuma issue
and couldn't remember the details.

Unfortunately, this solution doesn't work:

> diff --git a/kernel/sched.c b/kernel/sched.c
> index 18faf4d..c14acad 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -4823,7 +4823,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
> 
>         ret = sched_getaffinity(pid, mask);
>         if (ret == 0) {
> -               size_t retlen = min_t(size_t, len, cpumask_size());
> +               size_t retlen = min_t(size_t, len,
> +                                     BITS_TO_LONGS(NR_CPUS) * sizeof(long));
> 

Since mask is a cpumask_var_t, only cpumask_size() is allocated.  We can't
copy NR_CPUS bits.

But I think it's OK, anyway.  libnuma is broken because it gets upset if the
number of cpus it reads from /sys/.../cpumap is more than the cpumask size
returned from sys_sched_getaffinity.

Currently, getaffinity returns cpumask_size() (ie. based on NR_CPUS), and
the printing routines use nr_cpumask_bits (ie. based on NR_CPUS for 
!CPUMASK_OFFSTACK, nr_cpu_ids for CPUMASK_OFFSTACK).

(libnuma is OK on CONFIG_CPUMASK_OFFSTACK=y because the sysfs output is
*shorter* than expected.  I checked the code).

With this patch, cpumask_size() becomes based on nr_cpumask_bits, so both
getaffinity and sysfs are using the same basis.

Do you agree?
Rusty.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 5/5] cpumask: reduce cpumask_size
  2010-06-28 10:27   ` Rusty Russell
@ 2010-06-28 10:31     ` KOSAKI Motohiro
  0 siblings, 0 replies; 4+ messages in thread
From: KOSAKI Motohiro @ 2010-06-28 10:31 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kosaki.motohiro, Ingo Molnar, linux-kernel, Arnd Bergmann, anton,
	Mike Travis

> On Mon, 28 Jun 2010 12:42:23 pm KOSAKI Motohiro wrote:
> > > Now we're sure noone is using old cpumask operators, nor *cpumask, we can
> > > allocate less bits safely.  This reduces the memory usage of off-stack
> > > cpumasks when CONFIG_CPUMASK_OFFSTACK=y but we don't have NR_CPUS actual
> > > cpus.
> > 
> > I have to say I'm sorry. Probably I broke your assumption.
> > If this patch applied, we reintroduce exposing nr_cpu_ids issue and
> > break libnuma again. I think following change is necessary too.
> > 
> > Or, Am I missing something?
> 
> I cc'd you because I remembered you being involved in that libnuma issue
> and couldn't remember the details.
> 
> Unfortunately, this solution doesn't work:
> 
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index 18faf4d..c14acad 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -4823,7 +4823,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
> > 
> >         ret = sched_getaffinity(pid, mask);
> >         if (ret == 0) {
> > -               size_t retlen = min_t(size_t, len, cpumask_size());
> > +               size_t retlen = min_t(size_t, len,
> > +                                     BITS_TO_LONGS(NR_CPUS) * sizeof(long));
> > 
> 
> Since mask is a cpumask_var_t, only cpumask_size() is allocated.  We can't
> copy NR_CPUS bits.

Ahh, yes. It's purely broken.


> But I think it's OK, anyway.  libnuma is broken because it gets upset if the
> number of cpus it reads from /sys/.../cpumap is more than the cpumask size
> returned from sys_sched_getaffinity.
> 
> Currently, getaffinity returns cpumask_size() (ie. based on NR_CPUS), and
> the printing routines use nr_cpumask_bits (ie. based on NR_CPUS for 
> !CPUMASK_OFFSTACK, nr_cpu_ids for CPUMASK_OFFSTACK).
> 
> (libnuma is OK on CONFIG_CPUMASK_OFFSTACK=y because the sysfs output is
> *shorter* than expected.  I checked the code).
> 
> With this patch, cpumask_size() becomes based on nr_cpumask_bits, so both
> getaffinity and sysfs are using the same basis.
> 
> Do you agree?

Sure. I agree I missed. Thank you for very kindful explanation!




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2010-06-28 10:31 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-25 13:05 [PATCH 5/5] cpumask: reduce cpumask_size Rusty Russell
2010-06-28  3:12 ` KOSAKI Motohiro
2010-06-28 10:27   ` Rusty Russell
2010-06-28 10:31     ` KOSAKI Motohiro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox