public inbox for linux-arch@vger.kernel.org
* Re: [PATCH] Fix argument checking in sched_setaffinity
       [not found]                       ` <20040906141142.663941fb.pj@sgi.com>
@ 2004-09-07 14:40                         ` Linus Torvalds
  2004-09-07 14:48                           ` Geert Uytterhoeven
                                             ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Linus Torvalds @ 2004-09-07 14:40 UTC (permalink / raw)
  To: Paul Jackson; +Cc: ak, Andrew Morton, Linux Arch list


[ Linux-kernel replaced with linux-arch, to see if there's any commentary 
  from arch people involved.. ]

On Mon, 6 Sep 2004, Paul Jackson wrote:

> Linus wrote:
> > I hate the "byte at a time" interface.
> > 
> > That said, I think the "long at a time" interface we have now for bitmaps 
> > ends up being a compatibility problem, where the compat layer has to worry 
> > about big-endian 32-bit "long" looking different from big-endian 64-bit
> > "long".
> 
> My first preference would be to get all the binary bitmap interfaces
> (affinity, mbind and mempolicy) "right":
> 
>     I think that means an array of 'u32'.  This parallels what I did for
>     the ascii format, where there was less need to remain compatible
>     (except that ascii is naturally big-endian, while the u32 array has
>     the low order word first):
> 
>       $ cat /sys/devices/system/node/node0/cpumap
>       00000000,00000000,00000000,000000ff

I agree. 
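
For reference, the ASCII layout quoted above can be reproduced with a few lines of userspace C. This is only a sketch of the format as it appears in the cpumap output, not the kernel's own formatting code; `format_mask_ascii` is a name made up for this example, and it assumes the mask is held as an array of u32 with the low-order word first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format `nwords` u32 words (stored low-order word first) as comma-separated
 * 8-digit hex groups, highest word first, matching the cpumap output quoted
 * above.  Returns the number of characters written (excluding the NUL), or
 * -1 if the buffer is too small.  Hypothetical helper, not a kernel API.
 */
static int format_mask_ascii(char *buf, size_t buflen,
			     const uint32_t *words, size_t nwords)
{
	size_t need = nwords * 9;	/* 8 hex digits plus ',' or NUL per word */
	size_t off = 0;

	assert(nwords > 0);
	if (buflen < need)
		return -1;

	/* Walk the array backwards so the high-order word is printed first. */
	for (size_t i = nwords; i-- > 0; )
		off += (size_t)snprintf(buf + off, buflen - off, "%08x%s",
					(unsigned int)words[i], i ? "," : "");
	return (int)off;
}
```

Passing the four words `{ 0xff, 0, 0, 0 }` yields the string shown in the quoted example, with CPUs 0-7 in the last group.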

>     No doubt Andi will veto this for mbind/mempolicy, because it breaks
>     the copies of libnuma he has in the field - a reasonable concern.

Now, it will _only_ break systems that are _both_ 64-bit _and_ big-endian. 
Little-endian or 32-bit boxes will never care.
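
A small userspace sketch of why only that combination is affected. It is not kernel code: the bytes are written in explicit big-endian order so it behaves the same on any host, and the helper names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Store one 64-bit "long" the way a big-endian 64-bit machine lays it out. */
static void store_be64(uint8_t *dst, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Read the idx-th 32-bit word from memory as a big-endian 32-bit reader would. */
static uint32_t load_be32(const uint8_t *src, size_t idx)
{
	const uint8_t *p = src + 4 * idx;

	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Write a bitmap with only `bit` set as a single big-endian 64-bit long,
 * then report which 32-bit word (0 or 1) a 32-bit reader finds it in.
 * Returns -1 if the bit is not found.
 */
static int word_holding_bit(unsigned int bit)
{
	uint8_t mem[8];

	assert(bit < 64);
	store_be64(mem, (uint64_t)1 << bit);
	for (size_t w = 0; w < 2; w++)
		if (load_be32(mem, w) != 0)
			return (int)w;
	return -1;
}
```

Here `word_holding_bit(0)` returns 1: CPU 0 lands in the second 32-bit word, whereas an array of u32 with the low-order word first expects it in word 0. On a little-endian layout the two views would coincide, which is why only 64-bit big-endian needs the translation.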

There aren't that many of those machines. I've got one right here (ppc64), 
but that particular one is guaranteed not to break if only because it's 
running a 32-bit user space.

So it's ppc64, sparc64, s390x and sh64. I suspect the breakage is 
basically zero, since not only aren't there _that_ many machines out 
there, the percentage of them that use setaffinity or mbind is likely not 
that high either.

		Linus


* Re: [PATCH] Fix argument checking in sched_setaffinity
  2004-09-07 14:40                         ` [PATCH] Fix argument checking in sched_setaffinity Linus Torvalds
@ 2004-09-07 14:48                           ` Geert Uytterhoeven
  2004-09-07 14:49                           ` Andi Kleen
                                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Geert Uytterhoeven @ 2004-09-07 14:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Paul Jackson, ak, Andrew Morton, Linux Arch list

On Tue, 7 Sep 2004, Linus Torvalds wrote:
> Now, it will _only_ break systems that are _both_ 64-bit _and_ big-endian.
> Little-endian or 32-bit boxes will never care.
>
> There aren't that many of those machines. I've got one right here (ppc64),
> but that particular one is guaranteed not to break if only because it's
> running a 32-bit user space.
>
> So it's ppc64, sparc64, s390x and sh64. I suspect the breakage is
> basically zero, since not only aren't there _that_ many machines out
> there, the percentage of them that use setaffinity or mbind is likely not
> that high either.

And mips64, and parisc...

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds


* Re: [PATCH] Fix argument checking in sched_setaffinity
  2004-09-07 14:40                         ` [PATCH] Fix argument checking in sched_setaffinity Linus Torvalds
  2004-09-07 14:48                           ` Geert Uytterhoeven
@ 2004-09-07 14:49                           ` Andi Kleen
  2004-09-07 21:44                             ` Ralf Baechle
  2004-09-08  0:26                             ` Anton Blanchard
  2004-09-07 14:50                           ` Matthew Wilcox
  2004-09-08  0:24                           ` Anton Blanchard
  3 siblings, 2 replies; 16+ messages in thread
From: Andi Kleen @ 2004-09-07 14:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Paul Jackson, ak, Andrew Morton, Linux Arch list

On Tue, Sep 07, 2004 at 07:40:53AM -0700, Linus Torvalds wrote:
> >     No doubt Andi will veto this for mbind/mempolicy, because it breaks
> >     the copies of libnuma he has in the field - a reasonable concern.
> 
> Now, it will _only_ break systems that are _both_ 64-bit _and_ big-endian. 
> Little-endian or 32-bit boxes will never care.
> 
> There aren't that many of those machines. I've got one right here (ppc64), 
> but that particular one is guaranteed not to break if only because it's 
> running a 32-bit user space.

ppc64 libnuma is not deployed - the current numactl releases
still don't have the system call numbers and nobody told me 
about hacking them in.

So for me breaking ppc64 would be no problem.

> 
> So it's ppc64, sparc64, s390x and sh64. I suspect the breakage is 
> basically zero, since not only aren't there _that_ many machines out 
> there, the percentage of them that use setaffinity or mbind is likely not 
> that high either.

For mbind() breaking existing ppc64 users is unlikely I agree.

For setaffinity I am not so sure. The system call has been around
for a long time and has also been used in standard utilities
in distributions for quite some time.

Also it is commonly used in real time applications. 

So in short I think changing mbind to u32 would be fine,
but I wouldn't do it for sched_setaffinity() 

-Andi


* Re: [PATCH] Fix argument checking in sched_setaffinity
  2004-09-07 14:40                         ` [PATCH] Fix argument checking in sched_setaffinity Linus Torvalds
  2004-09-07 14:48                           ` Geert Uytterhoeven
  2004-09-07 14:49                           ` Andi Kleen
@ 2004-09-07 14:50                           ` Matthew Wilcox
  2004-09-08  0:24                           ` Anton Blanchard
  3 siblings, 0 replies; 16+ messages in thread
From: Matthew Wilcox @ 2004-09-07 14:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Paul Jackson, ak, Andrew Morton, Linux Arch list

On Tue, Sep 07, 2004 at 07:40:53AM -0700, Linus Torvalds wrote:
> Now, it will _only_ break systems that are _both_ 64-bit _and_ big-endian. 
> Little-endian or 32-bit boxes will never care.
> 
> So it's ppc64, sparc64, s390x and sh64. I suspect the breakage is 

... and parisc64.  Not that we have 64 bit userland yet either.

> basically zero, since not only aren't there _that_ many machines out 
> there, the percentage of them that use setaffinity or mbind is likely not 
> that high either.

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain


* Re: [PATCH] Fix argument checking in sched_setaffinity
  2004-09-07 14:49                           ` Andi Kleen
@ 2004-09-07 21:44                             ` Ralf Baechle
  2004-09-07 22:55                               ` Paul Jackson
  2004-09-08  0:26                             ` Anton Blanchard
  1 sibling, 1 reply; 16+ messages in thread
From: Ralf Baechle @ 2004-09-07 21:44 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Linus Torvalds, Paul Jackson, ak, Andrew Morton, Linux Arch list

On Tue, Sep 07, 2004 at 04:49:36PM +0200, Andi Kleen wrote:

> ppc64 libnuma is not deployed - the current numactl releases
> still don't have the system call numbers and nobody told me 
> about hacking them in.
> 
> So for me breaking ppc64 would be no problem.

Same for mips64; to date the SGI IP27 (Origin 200 / 2000) is the only
supported NUMA system and it's not very widespread.  Embedded NUMA is
getting closer but as long as that's still NDA stuff I have no problem
with breaking ABIs.

> > So it's ppc64, sparc64, s390x and sh64. I suspect the breakage is 
> > basically zero, since not only aren't there _that_ many machines out 
> > there, the percentage of them that use setaffinity or mbind is likely not 
> > that high either.
> 
> For mbind() breaking existing ppc64 users is unlikely I agree.
> 
> > For setaffinity I am not so sure. The system call has been around
> > for a long time and has also been used in standard utilities
> > in distributions for quite some time.
> 
> Also it is commonly used in real time applications. 
> 
> So in short I think changing mbind to u32 would be fine,
> but I wouldn't do it for sched_setaffinity() 

Same situation here for mips64.  sched_setaffinity() is established and
being used in applications, including commercial products so changing is
not an option.  For mbind(2) the situation is the opposite; I just
noticed the syscall handler was actually pointing to sys_ni_syscall and
nobody has complained so far, so that's clearly an OK for changing the
interface :-)

  Ralf


* Re: [PATCH] Fix argument checking in sched_setaffinity
  2004-09-07 21:44                             ` Ralf Baechle
@ 2004-09-07 22:55                               ` Paul Jackson
  2004-09-08  6:58                                 ` Andi Kleen
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Jackson @ 2004-09-07 22:55 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: ak, torvalds, ak, akpm, linux-arch

Ralf wrote:
> Same situation here for mips64.  sched_setaffinity() is established and
> being used in applications, including commercial products so changing is
> not an option. 

Well ... not an option unless Linus makes it one.  It would be a win
in the long term, I think.  We could introduce a new system call, and
mark this one for eventual removal.

Meanwhile, if sched_setaffinity() is being used in products we can't
really test, then that seems to me to be one more good reason to back
out the API tweaking that Andi and Linus have been doing to it this
last week, and just leave it be, as it was a week ago.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: [PATCH] Fix argument checking in sched_setaffinity
  2004-09-07 14:40                         ` [PATCH] Fix argument checking in sched_setaffinity Linus Torvalds
                                             ` (2 preceding siblings ...)
  2004-09-07 14:50                           ` Matthew Wilcox
@ 2004-09-08  0:24                           ` Anton Blanchard
  2004-09-08  0:33                             ` [PATCH] [ppc64] compat_get_bitmap/compat_put_bitmap Anton Blanchard
  3 siblings, 1 reply; 16+ messages in thread
From: Anton Blanchard @ 2004-09-08  0:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Paul Jackson, ak, Andrew Morton, Linux Arch list

 
> Now, it will _only_ break systems that are _both_ 64-bit _and_ big-endian. 
> Little-endian or 32-bit boxes will never care.
> 
> There aren't that many of those machines. I've got one right here (ppc64), 
> but that particular one is guaranteed not to break if only because it's 
> running a 32-bit user space.
> 
> So it's ppc64, sparc64, s390x and sh64. I suspect the breakage is 
> basically zero, since not only aren't there _that_ many machines out 
> there, the percentage of them that use setaffinity or mbind is likely not 
> that high either.

Oh we've felt it all right, both the cpu affinity calls and the numa api 
calls that pass node bitmaps around. I'll resend what patches I
currently have for this.

Anton


* Re: [PATCH] Fix argument checking in sched_setaffinity
  2004-09-07 14:49                           ` Andi Kleen
  2004-09-07 21:44                             ` Ralf Baechle
@ 2004-09-08  0:26                             ` Anton Blanchard
  1 sibling, 0 replies; 16+ messages in thread
From: Anton Blanchard @ 2004-09-08  0:26 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Linus Torvalds, Paul Jackson, ak, Andrew Morton, Linux Arch list


Hi,

> For mbind() breaking existing ppc64 users is unlikely I agree.

Yes.

> For setaffinity I am not so sure. The system call has been around
> for a long time and has also been used in standard utilities
> in distributions for quite some time.

We have deployed applications using setaffinity, we need a very good
reason to change this.

> So in short I think changing mbind to u32 would be fine,
> but I wouldn't do it for sched_setaffinity() 

Agreed.

Anton


* [PATCH] [ppc64] compat_get_bitmap/compat_put_bitmap
  2004-09-08  0:24                           ` Anton Blanchard
@ 2004-09-08  0:33                             ` Anton Blanchard
  2004-09-08  0:40                               ` [PATCH] [ppc64] Fix compat cpu affinity on big endian 64bit Anton Blanchard
  0 siblings, 1 reply; 16+ messages in thread
From: Anton Blanchard @ 2004-09-08  0:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Paul Jackson, ak, Andrew Morton, Linux Arch list


Signed-off-by: Anton Blanchard <anton@samba.org>

diff -puN kernel/compat.c~compat_bitmap kernel/compat.c
--- gr_work/kernel/compat.c~compat_bitmap	2004-09-04 00:56:24.297280051 -0500
+++ gr_work-anton/kernel/compat.c	2004-09-04 00:56:24.313277511 -0500
@@ -590,3 +590,83 @@ long compat_clock_nanosleep(clockid_t wh
 
 /* timer_create is architecture specific because it needs sigevent conversion */
 
+long compat_get_bitmap(unsigned long *mask, compat_ulong_t __user *umask,
+		       unsigned long bitmap_size)
+{
+	int i, j;
+	unsigned long m;
+	compat_ulong_t um;
+	unsigned long nr_compat_longs;
+
+	/* align bitmap up to nearest compat_long_t boundary */
+	bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG);
+
+	if (verify_area(VERIFY_READ, umask, bitmap_size / 8))
+		return -EFAULT;
+
+	nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size);
+
+	for (i = 0; i < BITS_TO_LONGS(bitmap_size); i++) {
+		m = 0;
+
+		for (j = 0; j < sizeof(m)/sizeof(um); j++) {
+			/*
+			 * We don't want to read past the end of the userspace
+			 * bitmap. We must however ensure the end of the
+			 * kernel bitmap is zeroed.
+			 */
+			if (nr_compat_longs-- > 0) {
+				if (__get_user(um, umask))
+					return -EFAULT;
+			} else {
+				um = 0;
+			}
+
+			umask++;
+			m |= (long)um << (j * BITS_PER_COMPAT_LONG);
+		}
+		*mask++ = m;
+	}
+
+	return 0;
+}
+
+long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask,
+		       unsigned long bitmap_size)
+{
+	int i, j;
+	unsigned long m;
+	compat_ulong_t um;
+	unsigned long nr_compat_longs;
+
+	/* align bitmap up to nearest compat_long_t boundary */
+	bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG);
+
+	if (verify_area(VERIFY_WRITE, umask, bitmap_size / 8))
+		return -EFAULT;
+
+	nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size);
+
+	for (i = 0; i < BITS_TO_LONGS(bitmap_size); i++) {
+		m = *mask++;
+
+		for (j = 0; j < sizeof(m)/sizeof(um); j++) {
+			um = m;
+
+			/*
+			 * We don't want to write past the end of the userspace
+			 * bitmap.
+			 */
+			if (nr_compat_longs-- > 0) {
+				if (__put_user(um, umask))
+					return -EFAULT;
+			}
+
+			umask++;
+			m >>= 4*sizeof(um);
+			m >>= 4*sizeof(um);
+		}
+	}
+
+	return 0;
+}
diff -puN include/linux/compat.h~compat_bitmap include/linux/compat.h
--- gr_work/include/linux/compat.h~compat_bitmap	2004-09-04 00:56:24.302279257 -0500
+++ gr_work-anton/include/linux/compat.h	2004-09-04 00:56:24.314277352 -0500
@@ -132,5 +132,15 @@ asmlinkage long compat_sys_select(int n,
 		compat_ulong_t __user *outp, compat_ulong_t __user *exp,
 		struct compat_timeval __user *tvp);
 
+#define BITS_PER_COMPAT_LONG    (8*sizeof(compat_long_t))
+
+#define BITS_TO_COMPAT_LONGS(bits) \
+	(((bits)+BITS_PER_COMPAT_LONG-1)/BITS_PER_COMPAT_LONG)
+
+long compat_get_bitmap(unsigned long *mask, compat_ulong_t __user *umask,
+		       unsigned long bitmap_size);
+long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask,
+		       unsigned long bitmap_size);
+
 #endif /* CONFIG_COMPAT */
 #endif /* _LINUX_COMPAT_H */
_
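
The word-combining step in the two helpers above can be tried out in userspace. The following is a sketch only, assuming a 64-bit kernel long and a 32-bit compat long; it leaves out all the user-access checking, and the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Pack `nwords` 32-bit words (low-order word first) into 64-bit longs.
 * If nwords is odd, the upper half of the last long is zero-filled, which
 * mirrors the "end of the kernel bitmap is zeroed" rule in the patch.
 * `out` must have room for (nwords + 1) / 2 entries.
 */
static void pack_u32_to_u64(uint64_t *out, const uint32_t *in, size_t nwords)
{
	size_t nlongs = (nwords + 1) / 2;

	assert(out != NULL && in != NULL);
	for (size_t i = 0; i < nlongs; i++) {
		uint64_t m = 0;

		for (size_t j = 0; j < 2; j++) {
			size_t k = 2 * i + j;
			uint32_t um = (k < nwords) ? in[k] : 0;

			m |= (uint64_t)um << (32 * j);
		}
		out[i] = m;
	}
}

/* The reverse direction: split 64-bit longs back into `nwords` 32-bit words. */
static void unpack_u64_to_u32(uint32_t *out, const uint64_t *in, size_t nwords)
{
	assert(out != NULL && in != NULL);
	for (size_t k = 0; k < nwords; k++)
		out[k] = (uint32_t)(in[k / 2] >> (32 * (k % 2)));
}
```

Because the packing is done with shifts on values rather than by reinterpreting memory, the result does not depend on the host byte order, which is the point of routing compat bitmaps through helpers like these.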


* [PATCH] [ppc64] Fix compat cpu affinity on big endian 64bit
  2004-09-08  0:33                             ` [PATCH] [ppc64] compat_get_bitmap/compat_put_bitmap Anton Blanchard
@ 2004-09-08  0:40                               ` Anton Blanchard
  2004-09-08  0:43                                 ` [PATCH] [ppc64] Fix compat NUMA API " Anton Blanchard
  2004-09-08  5:22                                 ` [PATCH] [ppc64] Fix compat cpu affinity " Andrew Morton
  0 siblings, 2 replies; 16+ messages in thread
From: Anton Blanchard @ 2004-09-08  0:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Paul Jackson, ak, Andrew Morton, Linux Arch list


Add compat sched affinity code. We can argue about how 
USE_COMPAT_ULONG_CPUMASK works now that the non compat interface has changed.

The old non compat behaviour was to require a bitmap long enough in both
setaffinity and getaffinity; now it's only required in getaffinity. I
could do the same for the 32bit interfaces.

Signed-off-by: Anton Blanchard <anton@samba.org>

diff -puN kernel/compat.c~compat_sys_sched_affinity kernel/compat.c
--- gr_work/kernel/compat.c~compat_sys_sched_affinity	2004-09-06 04:19:06.327399153 -0500
+++ gr_work-anton/kernel/compat.c	2004-09-07 19:36:09.127584076 -0500
@@ -412,16 +412,43 @@ compat_sys_wait4(compat_pid_t pid, compa
 	}
 }
 
+/*
+ * for maximum compatibility, we allow programs to use a single (compat)
+ * unsigned long bitmask if all cpus will fit. If not, you have to have
+ * at least the kernel size available.
+ */
+#define USE_COMPAT_ULONG_CPUMASK (NR_CPUS <= BITS_PER_COMPAT_LONG)
+
 asmlinkage long compat_sys_sched_setaffinity(compat_pid_t pid, 
 					     unsigned int len,
 					     compat_ulong_t __user *user_mask_ptr)
 {
-	unsigned long kern_mask;
+	cpumask_t kern_mask;
 	mm_segment_t old_fs;
 	int ret;
 
-	if (get_user(kern_mask, user_mask_ptr))
-		return -EFAULT;
+	if (USE_COMPAT_ULONG_CPUMASK) {
+		compat_ulong_t user_mask;
+
+		if (len < sizeof(user_mask))
+			return -EINVAL;
+
+		if (get_user(user_mask, user_mask_ptr))
+			return -EFAULT;
+
+		cpus_addr(kern_mask)[0] = user_mask;
+	} else {
+		unsigned long *k;
+
+		if (len < sizeof(kern_mask))
+			return -EINVAL;
+
+		k = cpus_addr(kern_mask);
+		ret = compat_get_bitmap(k, user_mask_ptr,
+					sizeof(kern_mask) * 8);
+		if (ret)
+			return ret;
+	}
 
 	old_fs = get_fs();
 	set_fs(KERNEL_DS);
@@ -436,10 +463,14 @@ asmlinkage long compat_sys_sched_setaffi
 asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
 					     compat_ulong_t __user *user_mask_ptr)
 {
-	unsigned long kern_mask;
+	cpumask_t kern_mask;
 	mm_segment_t old_fs;
 	int ret;
 
+	if (len < (USE_COMPAT_ULONG_CPUMASK ? sizeof(compat_ulong_t)
+				: sizeof(kern_mask)))
+		return -EINVAL;
+
 	old_fs = get_fs();
 	set_fs(KERNEL_DS);
 	ret = sys_sched_getaffinity(pid,
@@ -447,10 +478,23 @@ asmlinkage long compat_sys_sched_getaffi
 				    (unsigned long __user *) &kern_mask);
 	set_fs(old_fs);
 
-	if (ret > 0) {
-		ret = sizeof(compat_ulong_t);
-		if (put_user(kern_mask, user_mask_ptr))
+	if (ret < 0)
+		return ret;
+
+	if (USE_COMPAT_ULONG_CPUMASK) {
+		if (put_user(cpus_addr(kern_mask)[0], user_mask_ptr))
 			return -EFAULT;
+		ret = sizeof(compat_ulong_t);
+	} else {
+		unsigned long *k;
+
+		k = cpus_addr(kern_mask);
+		ret = compat_put_bitmap(user_mask_ptr, k,
+					sizeof(kern_mask) * 8);
+		if (ret)
+			return ret;
+
+		ret = sizeof(kern_mask);
 	}
 
 	return ret;
_
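
The length rule the compat getaffinity path applies can be written out as a plain function. This is a sketch under assumptions: the sizes are passed in as parameters instead of coming from NR_CPUS, compat_ulong_t and cpumask_t, and `compat_min_len` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Minimum user buffer length, in bytes, accepted by the compat getaffinity
 * check in the patch above: a single compat long if every CPU fits in one,
 * otherwise the full kernel mask size.
 */
static size_t compat_min_len(unsigned int nr_cpus, size_t compat_long_bytes,
			     size_t kern_mask_bytes)
{
	assert(compat_long_bytes > 0);
	if (nr_cpus <= 8 * compat_long_bytes)
		return compat_long_bytes;
	return kern_mask_bytes;
}
```

With a 4-byte compat long, a 32-CPU build accepts a 4-byte buffer, while a 128-CPU build with a 16-byte cpumask_t requires all 16 bytes.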


* [PATCH] [ppc64] Fix compat NUMA API on big endian 64bit
  2004-09-08  0:40                               ` [PATCH] [ppc64] Fix compat cpu affinity on big endian 64bit Anton Blanchard
@ 2004-09-08  0:43                                 ` Anton Blanchard
  2004-09-08  5:22                                 ` [PATCH] [ppc64] Fix compat cpu affinity " Andrew Morton
  1 sibling, 0 replies; 16+ messages in thread
From: Anton Blanchard @ 2004-09-08  0:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Paul Jackson, ak, Andrew Morton, Linux Arch list


Switch the NUMA API to use compat_get_bitmap/compat_put_bitmap. In order
to use compat_alloc_userspace instead of set_fs tricks, we have to do a
few copies.

This is what we are currently using on ppc64, but we are willing to
entertain the idea of going to a 32bit bitmap, especially considering
how many hoops we have to go through to get it right in this patch.

Signed-off-by: Anton Blanchard <anton@samba.org>

diff -puN mm/mempolicy.c~numa_api mm/mempolicy.c
--- gr_work/mm/mempolicy.c~numa_api	2004-09-04 21:14:44.595414365 -0500
+++ gr_work-anton/mm/mempolicy.c	2004-09-05 09:19:10.475691107 -0500
@@ -525,20 +525,82 @@ asmlinkage long sys_get_mempolicy(int __
 }
 
 #ifdef CONFIG_COMPAT
-/* The other functions are compatible */
+
 asmlinkage long compat_get_mempolicy(int __user *policy,
-				  unsigned __user *nmask, unsigned  maxnode,
-				  unsigned addr, unsigned  flags)
+				     compat_ulong_t __user *nmask,
+				     compat_ulong_t maxnode,
+				     compat_ulong_t addr, compat_ulong_t flags)
 {
 	long err;
 	unsigned long __user *nm = NULL;
+	unsigned long nr_bits, alloc_size;
+	DECLARE_BITMAP(bm, MAX_NUMNODES);
+
+	nr_bits = min_t(unsigned long, maxnode-1, MAX_NUMNODES);
+	alloc_size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
+
 	if (nmask)
-		nm = compat_alloc_user_space(ALIGN(maxnode-1, 64) / 8);
-	err = sys_get_mempolicy(policy, nm, maxnode, addr, flags);
-	if (!err && copy_in_user(nmask, nm, ALIGN(maxnode-1, 32)/8))
-		err = -EFAULT;
+		nm = compat_alloc_user_space(alloc_size);
+
+	err = sys_get_mempolicy(policy, nm, nr_bits+1, addr, flags);
+
+	if (!err && nmask) {
+		err = copy_from_user(bm, nm, alloc_size);
+		/* ensure entire bitmap is zeroed */
+		err |= clear_user(nmask, ALIGN(maxnode-1, 8) / 8);
+		err |= compat_put_bitmap(nmask, bm, nr_bits);
+	}
+
 	return err;
 }
+
+asmlinkage long compat_set_mempolicy(int mode, compat_ulong_t __user *nmask,
+				     compat_ulong_t maxnode)
+{
+	long err = 0;
+	unsigned long __user *nm = NULL;
+	unsigned long nr_bits, alloc_size;
+	DECLARE_BITMAP(bm, MAX_NUMNODES);
+
+	nr_bits = min_t(unsigned long, maxnode-1, MAX_NUMNODES);
+	alloc_size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
+
+	if (nmask) {
+		err = compat_get_bitmap(bm, nmask, nr_bits);
+		nm = compat_alloc_user_space(alloc_size);
+		err |= copy_to_user(nm, bm, alloc_size);
+	}
+
+	if (err)
+		return -EFAULT;
+
+	return sys_set_mempolicy(mode, nm, nr_bits+1);
+}
+
+asmlinkage long compat_mbind(compat_ulong_t start, compat_ulong_t len,
+			     compat_ulong_t mode, compat_ulong_t __user *nmask,
+			     compat_ulong_t maxnode, compat_ulong_t flags)
+{
+	long err = 0;
+	unsigned long __user *nm = NULL;
+	unsigned long nr_bits, alloc_size;
+	DECLARE_BITMAP(bm, MAX_NUMNODES);
+
+	nr_bits = min_t(unsigned long, maxnode-1, MAX_NUMNODES);
+	alloc_size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
+
+	if (nmask) {
+		err = compat_get_bitmap(bm, nmask, nr_bits);
+		nm = compat_alloc_user_space(alloc_size);
+		err |= copy_to_user(nm, bm, alloc_size);
+	}
+
+	if (err)
+		return -EFAULT;
+
+	return sys_mbind(start, len, mode, nm, nr_bits+1, flags);
+}
+
 #endif
 
 /* Return effective policy for a VMA */
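
The size arithmetic shared by the three compat wrappers above can be checked in isolation. The sketch below assumes a 64-bit kernel long and takes MAX_NUMNODES as a parameter; the helper names are invented, and `align_up` stands in for the kernel's ALIGN() macro.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG 64UL	/* assumption: 64-bit kernel */

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* nr_bits as computed in the wrappers: maxnode - 1, capped at the node limit. */
static unsigned long clamp_nr_bits(unsigned long maxnode,
				   unsigned long max_numnodes)
{
	unsigned long n = maxnode - 1;

	return n < max_numnodes ? n : max_numnodes;
}

/* Bytes needed for the native (64-bit long) copy of an nr_bits bitmap. */
static unsigned long native_alloc_size(unsigned long nr_bits)
{
	return align_up(nr_bits, SKETCH_BITS_PER_LONG) / 8;
}
```

A 32-bit caller passing maxnode = 33 describes 32 bits, which occupy 4 bytes in its own view but need an 8-byte native buffer; that difference is why the wrappers copy through a temporary bitmap rather than passing the user pointer straight through.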


* Re: [PATCH] [ppc64] Fix compat cpu affinity on big endian 64bit
  2004-09-08  0:40                               ` [PATCH] [ppc64] Fix compat cpu affinity on big endian 64bit Anton Blanchard
  2004-09-08  0:43                                 ` [PATCH] [ppc64] Fix compat NUMA API " Anton Blanchard
@ 2004-09-08  5:22                                 ` Andrew Morton
  2004-09-08  5:34                                   ` Anton Blanchard
  1 sibling, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2004-09-08  5:22 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: torvalds, pj, ak, linux-arch

Anton Blanchard <anton@samba.org> wrote:
>
> Add compat sched affinity code. We can argue about how 
>  USE_COMPAT_ULONG_CPUMASK works now that the non compat interface has changed.
> 
>  The old non compat behaviour was to require a bitmap long enough in both
>  setaffinity and getaffinity; now it's only required in getaffinity. I
>  could do the same for the 32bit interfaces.
> 
>  Signed-off-by: Anton Blanchard <anton@samba.org>
> 
>  diff -puN kernel/compat.c~compat_sys_sched_affinity kernel/compat.c
>  --- gr_work/kernel/compat.c~compat_sys_sched_affinity	2004-09-06 04:19:06.327399153 -0500
>  +++ gr_work-anton/kernel/compat.c	2004-09-07 19:36:09.127584076 -0500
>  @@ -412,16 +412,43 @@ compat_sys_wait4(compat_pid_t pid, compa
>   	}
>   }
>   
>  +/*
>  + * for maximum compatibility, we allow programs to use a single (compat)
>  + * unsigned long bitmask if all cpus will fit. If not, you have to have
>  + * at least the kernel size available.
>  + */
>  +#define USE_COMPAT_ULONG_CPUMASK (NR_CPUS <= BITS_PER_COMPAT_LONG)

umm, wanna send along a patch which defines BITS_PER_COMPAT_LONG?


* Re: [PATCH] [ppc64] Fix compat cpu affinity on big endian 64bit
  2004-09-08  5:22                                 ` [PATCH] [ppc64] Fix compat cpu affinity " Andrew Morton
@ 2004-09-08  5:34                                   ` Anton Blanchard
  2004-09-08  5:43                                     ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Anton Blanchard @ 2004-09-08  5:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, pj, ak, linux-arch


> umm, wanna send along a patch which defines BITS_PER_COMPAT_LONG?

Yep, did you get the first in the series? It's hiding at the bottom.

Anton

--

Date: Wed, 8 Sep 2004 10:33:59 +1000
From: Anton Blanchard <anton@samba.org>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Paul Jackson <pj@sgi.com>, ak@muc.de,
	Andrew Morton <akpm@osdl.org>,
	Linux Arch list <linux-arch@vger.kernel.org>
Subject: [PATCH] [ppc64] compat_get_bitmap/compat_put_bitmap

Signed-off-by: Anton Blanchard <anton@samba.org>

diff -puN kernel/compat.c~compat_bitmap kernel/compat.c
--- gr_work/kernel/compat.c~compat_bitmap	2004-09-04 00:56:24.297280051 -0500
+++ gr_work-anton/kernel/compat.c	2004-09-04 00:56:24.313277511 -0500
@@ -590,3 +590,83 @@ long compat_clock_nanosleep(clockid_t wh
 
 /* timer_create is architecture specific because it needs sigevent conversion */
 
+long compat_get_bitmap(unsigned long *mask, compat_ulong_t __user *umask,
+		       unsigned long bitmap_size)
+{
+	int i, j;
+	unsigned long m;
+	compat_ulong_t um;
+	unsigned long nr_compat_longs;
+
+	/* align bitmap up to nearest compat_long_t boundary */
+	bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG);
+
+	if (verify_area(VERIFY_READ, umask, bitmap_size / 8))
+		return -EFAULT;
+
+	nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size);
+
+	for (i = 0; i < BITS_TO_LONGS(bitmap_size); i++) {
+		m = 0;
+
+		for (j = 0; j < sizeof(m)/sizeof(um); j++) {
+			/*
+			 * We don't want to read past the end of the userspace
+			 * bitmap. We must however ensure the end of the
+			 * kernel bitmap is zeroed.
+			 */
+			if (nr_compat_longs-- > 0) {
+				if (__get_user(um, umask))
+					return -EFAULT;
+			} else {
+				um = 0;
+			}
+
+			umask++;
+			m |= (long)um << (j * BITS_PER_COMPAT_LONG);
+		}
+		*mask++ = m;
+	}
+
+	return 0;
+}
+
+long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask,
+		       unsigned long bitmap_size)
+{
+	int i, j;
+	unsigned long m;
+	compat_ulong_t um;
+	unsigned long nr_compat_longs;
+
+	/* align bitmap up to nearest compat_long_t boundary */
+	bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG);
+
+	if (verify_area(VERIFY_WRITE, umask, bitmap_size / 8))
+		return -EFAULT;
+
+	nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size);
+
+	for (i = 0; i < BITS_TO_LONGS(bitmap_size); i++) {
+		m = *mask++;
+
+		for (j = 0; j < sizeof(m)/sizeof(um); j++) {
+			um = m;
+
+			/*
+			 * We don't want to write past the end of the userspace
+			 * bitmap.
+			 */
+			if (nr_compat_longs-- > 0) {
+				if (__put_user(um, umask))
+					return -EFAULT;
+			}
+
+			umask++;
+			m >>= 4*sizeof(um);
+			m >>= 4*sizeof(um);
+		}
+	}
+
+	return 0;
+}
diff -puN include/linux/compat.h~compat_bitmap include/linux/compat.h
--- gr_work/include/linux/compat.h~compat_bitmap	2004-09-04 00:56:24.302279257 -0500
+++ gr_work-anton/include/linux/compat.h	2004-09-04 00:56:24.314277352 -0500
@@ -132,5 +132,15 @@ asmlinkage long compat_sys_select(int n,
 		compat_ulong_t __user *outp, compat_ulong_t __user *exp,
 		struct compat_timeval __user *tvp);
 
+#define BITS_PER_COMPAT_LONG    (8*sizeof(compat_long_t))
+
+#define BITS_TO_COMPAT_LONGS(bits) \
+	(((bits)+BITS_PER_COMPAT_LONG-1)/BITS_PER_COMPAT_LONG)
+
+long compat_get_bitmap(unsigned long *mask, compat_ulong_t __user *umask,
+		       unsigned long bitmap_size);
+long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask,
+		       unsigned long bitmap_size);
+
 #endif /* CONFIG_COMPAT */
 #endif /* _LINUX_COMPAT_H */
_


* Re: [PATCH] [ppc64] Fix compat cpu affinity on big endian 64bit
  2004-09-08  5:34                                   ` Anton Blanchard
@ 2004-09-08  5:43                                     ` Andrew Morton
  0 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2004-09-08  5:43 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: torvalds, pj, ak, linux-arch

Anton Blanchard <anton@samba.org> wrote:
>
>  > umm, wanna send along a patch which defines BITS_PER_COMPAT_LONG?
> 
>  Yep, did you get the first in the series?

<looks>

Oh, so I did.


* Re: [PATCH] Fix argument checking in sched_setaffinity
  2004-09-07 22:55                               ` Paul Jackson
@ 2004-09-08  6:58                                 ` Andi Kleen
  2004-09-08  7:26                                   ` Paul Jackson
  0 siblings, 1 reply; 16+ messages in thread
From: Andi Kleen @ 2004-09-08  6:58 UTC (permalink / raw)
  To: Paul Jackson; +Cc: Ralf Baechle, ak, torvalds, ak, akpm, linux-arch

On Tue, Sep 07, 2004 at 03:55:30PM -0700, Paul Jackson wrote:
> Ralf wrote:
> > Same situation here for mips64.  sched_setaffinity() is established and
> > being used in applications, including commercial products so changing is
> > not an option. 
> 
> Well ... not an option unless Linus makes it one.  It would be a win
> in the long term, I think.  We could introduce a new system call, and
> mark this one for eventual removal.
> 
> Meanwhile, if sched_setaffinity() is being used in products we can't
> really test, then that seems to me to be one more good reason to back
> out the API tweaking that Andi and Linus have been doing to it this
> last week, and just leave it be, as it was a week ago.

The changes Linus and I made do not change anything for these
products. When they pass length == sizeof(cpumask_t) everything 
is ok. The only behaviour change is for cases that would
previously have returned -EINVAL.

-Andi


* Re: [PATCH] Fix argument checking in sched_setaffinity
  2004-09-08  6:58                                 ` Andi Kleen
@ 2004-09-08  7:26                                   ` Paul Jackson
  0 siblings, 0 replies; 16+ messages in thread
From: Paul Jackson @ 2004-09-08  7:26 UTC (permalink / raw)
  To: Andi Kleen; +Cc: ralf, torvalds, ak, akpm, linux-arch

> The only behaviour change is for cases that would
> previously have returned -EINVAL.

That's a change.  It has non-zero risk of breaking (or exposing an
existing, implicit breakage in) some user code, especially given that
user code has been encouraged to write loops testing for -EINVAL in
order to probe the mask size.  I didn't see enough benefit from any
of the variants of the last week to compensate for this risk.
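
The kind of probing loop meant here looks roughly like the sketch below. It is an illustration, not code taken from any particular application: the `getaff_fn` callback is a hypothetical stand-in for the raw getaffinity call (returning a value >= 0 on success, or -1 with errno set), and `fake_getaffinity` imitates a kernel whose mask is 16 bytes so the loop can be exercised without a real system call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the raw call: >= 0 on success, -1 plus errno on error. */
typedef long (*getaff_fn)(size_t len, void *mask);

/*
 * Double the buffer length until the call stops failing with EINVAL.
 * Returns the first accepted length, or 0 on any other error or when
 * max_len is exceeded.  This only works if short buffers keep being
 * rejected with EINVAL.
 */
static size_t probe_mask_len(getaff_fn call, void *buf, size_t max_len)
{
	assert(call != NULL && buf != NULL);
	for (size_t len = sizeof(unsigned long); len <= max_len; len *= 2) {
		if (call(len, buf) >= 0)
			return len;
		if (errno != EINVAL)
			return 0;
	}
	return 0;
}

/* Test double: behaves like a kernel whose cpumask is 16 bytes wide. */
static long fake_getaffinity(size_t len, void *mask)
{
	if (len < 16) {
		errno = EINVAL;
		return -1;
	}
	memset(mask, 0, 16);
	return 16;
}
```

A loop like this silently changes its answer if a short length that used to fail with EINVAL starts succeeding, which is the risk being described.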

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373
