public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jack Steiner <steiner@sgi.com>
To: mingo@redhat.com, hpa@zytor.com, linux-kernel@vger.kernel.org,
	torvalds@linux-foundation.org, travis@sgi.com,
	peterz@infradead.org, drepper@redhat.com, rja@sgi.com,
	sharyath@in.ibm.com, akpm@linux-foundation.org,
	tglx@linutronix.de, kosaki.motohiro@jp.fujitsu.com,
	mingo@elte.hu
Cc: linux-tip-commits@vger.kernel.org
Subject: Re: [tip:sched/urgent] sched: sched_getaffinity(): Allow less than NR_CPUS length
Date: Mon, 15 Mar 2010 11:04:54 -0500	[thread overview]
Message-ID: <20100315160454.GA8211@sgi.com> (raw)
In-Reply-To: <tip-cd3d8031eb4311e516329aee03c79a08333141f1@git.kernel.org>

On Mon, Mar 15, 2010 at 07:43:02AM +0000, tip-bot for KOSAKI Motohiro wrote:
> Commit-ID:  cd3d8031eb4311e516329aee03c79a08333141f1
> Gitweb:     http://git.kernel.org/tip/cd3d8031eb4311e516329aee03c79a08333141f1
> Author:     KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> AuthorDate: Fri, 12 Mar 2010 16:15:36 +0900
> Committer:  Ingo Molnar <mingo@elte.hu>
> CommitDate: Mon, 15 Mar 2010 08:28:44 +0100

...
> IOW we hope they are not annoyed by this issue ...

The change looks ok but I can't reproduce the problem.

I'm running on a distro kernel that has NR_CPUS=4096.  Glibc has also has a
definition of __CPU_SETSIZE (I assume this change was made by the distro but
am not certain):

    sched.c
    	...
	#if defined _SCHED_H && !defined __cpu_set_t_defined
	# define __cpu_set_t_defined
	/* Size definition for CPU sets.  */
	# define __CPU_SETSIZE  4096
	# define __NCPUBITS     (8 * sizeof (__cpu_mask))

Your test program runs ok:

	% strace t
	...
	sched_getaffinity(0, 512,  { ffff, 0, 0, 0, 0, 0, 0, 0 }) = 64



Also note that we've run on IA64 systems with NR_CPUS=4096 for several years w/o
hitting any problems.

Bottom line. I don't think the change will affect us.


> 
> sched: sched_getaffinity(): Allow less than NR_CPUS length
> 
> [ Note, this commit changes the syscall ABI for > 1024 CPUs systems. ]
> 
> Recently, some distro decided to use NR_CPUS=4096 for mysterious reasons.
> Unfortunately, glibc sched interface has the following definition:
> 
> 	# define __CPU_SETSIZE  1024
> 	# define __NCPUBITS     (8 * sizeof (__cpu_mask))
> 	typedef unsigned long int __cpu_mask;
> 	typedef struct
> 	{
> 	  __cpu_mask __bits[__CPU_SETSIZE / __NCPUBITS];
> 	} cpu_set_t;
> 
> It mean, if NR_CPUS is bigger than 1024, cpu_set_t makes an
> ABI issue ...
> 
> More recently, Sharyathi Nagesh reported following test program makes
> misterious syscall failure:
> 
>  -----------------------------------------------------------------------
>  #define _GNU_SOURCE
>  #include<stdio.h>
>  #include<errno.h>
>  #include<sched.h>
> 
>  int main()
>  {
>      cpu_set_t set;
>      if (sched_getaffinity(0, sizeof(cpu_set_t), &set) < 0)
>          printf("\n Call is failing with:%d", errno);
>  }
>  -----------------------------------------------------------------------
> 
> Because the kernel assumes len argument of sched_getaffinity() is bigger
> than NR_CPUS. But now it is not correct.
> 
> Now we are faced with the following annoying dilemma, due to
> the limitations of the glibc interface built in years ago:
> 
>  (1) if we change glibc's __CPU_SETSIZE definition, we lost
>      binary compatibility of _all_ application.
> 
>  (2) if we don't change it, we also lost binary compatibility of
>      Sharyathi's use case.
> 
> Then, I would propse to change the rule of the len argument of
> sched_getaffinity().
> 
> Old:
> 	len should be bigger than NR_CPUS
> New:
> 	len should be bigger than maximum possible cpu id
> 
> This creates the following behavior:
> 
>  (A) In the real 4096 cpus machine, the above test program still
>      return -EINVAL.
> 
>  (B) NR_CPUS=4096 but the machine have less than 1024 cpus (almost
>      all machines in the world), the above can run successfully.
> 
> Fortunatelly, BIG SGI machine is mainly used for HPC use case. It means
> they can rebuild their programs.
> 
> IOW we hope they are not annoyed by this issue ...
> 
> Reported-by: Sharyathi Nagesh <sharyath@in.ibm.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Acked-by: Ulrich Drepper <drepper@redhat.com>
> Acked-by: Peter Zijlstra <peterz@infradead.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jack Steiner <steiner@sgi.com>
> Cc: Russ Anderson <rja@sgi.com>
> Cc: Mike Travis <travis@sgi.com>
> LKML-Reference: <20100312161316.9520.A69D9226@jp.fujitsu.com>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  kernel/sched.c |   10 +++++++---
>  1 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 9ab3cd7..6eaef3d 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -4902,7 +4902,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
>  	int ret;
>  	cpumask_var_t mask;
>  
> -	if (len < cpumask_size())
> +	if (len < nr_cpu_ids)
> +		return -EINVAL;
> +	if (len & (sizeof(unsigned long)-1))
>  		return -EINVAL;
>  
>  	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
> @@ -4910,10 +4912,12 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
>  
>  	ret = sched_getaffinity(pid, mask);
>  	if (ret == 0) {
> -		if (copy_to_user(user_mask_ptr, mask, cpumask_size()))
> +		int retlen = min(len, cpumask_size());
> +
> +		if (copy_to_user(user_mask_ptr, mask, retlen))
>  			ret = -EFAULT;
>  		else
> -			ret = cpumask_size();
> +			ret = retlen;
>  	}
>  	free_cpumask_var(mask);
>  

  reply	other threads:[~2010-03-15 16:05 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-12  7:15 [PATCH] sched: sched_getaffinity() allow less than NR_CPUS length KOSAKI Motohiro
2010-03-12 16:08 ` Ulrich Drepper
2010-03-15  7:43 ` [tip:sched/urgent] sched: sched_getaffinity(): Allow " tip-bot for KOSAKI Motohiro
2010-03-15 16:04   ` Jack Steiner [this message]
2010-03-15 16:17     ` Ulrich Drepper
2010-03-16 17:10   ` Ingo Molnar
2010-03-17  0:36     ` KOSAKI Motohiro
2010-03-17 10:03       ` [tip:sched/urgent] sched: Use proper type in sched_getaffinity() tip-bot for KOSAKI Motohiro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100315160454.GA8211@sgi.com \
    --to=steiner@sgi.com \
    --cc=akpm@linux-foundation.org \
    --cc=drepper@redhat.com \
    --cc=hpa@zytor.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rja@sgi.com \
    --cc=sharyath@in.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=travis@sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox