public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: William Lee Irwin III <wli@holomorphy.com>
To: Paul Jackson <pj@sgi.com>
Cc: mikpe@csd.uu.se, nickpiggin@yahoo.com.au, rusty@rustcorp.com.au,
	linux-kernel@vger.kernel.org, akpm@osdl.org, ak@muc.de,
	ashok.raj@intel.com, hch@infradead.org, jbarnes@sgi.com,
	joe.korty@ccur.com, manfred@colorfullife.com,
	colpatch@us.ibm.com, Simon.Derr@bull.net
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation
Date: Fri, 4 Jun 2004 18:31:39 -0700	[thread overview]
Message-ID: <20040605013139.GM21007@holomorphy.com> (raw)
In-Reply-To: <20040604170542.576b4243.pj@sgi.com>

William Lee Irwin III wrote:
>> This is patently ridiculous. Make a compat_sched_getaffinity(), and
>> likewise for whatever else is copying unsigned long arrays to userspace.

On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> My mind reading skills are failing me.  At the risk of opening myself to
> further ridicule, which part of what I wrote is patently ridiculous,
> why, and how does that differ from whatever you had in mind when you
> recommended doing "likewise"?

Ridiculous == some bizarre for_each_cpu() loop doing put put_user() once
for every bit of a cpumask_t.


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> Putting your comments aside for a moment ...
> We have here a bit of suckage.  The kernel bitmaps/cpumasks are arrays
> of unsigned long, with the low order long in the low order array slot,
> and the bytes within the longs in natural byte-order for that arch. The
> sched_setaffinity/sched_getaffinity calls in the kernel copy this stuff
> directly to/from user space.  This doesn't work so well for 32 bit tasks
> on a 64 bit big-endian kernel.  [Begin off-topic alert] The glibc
> sched_setaffinity and sched_getaffinity calls forcibly truncate the size
> of masks to some constant hardcoded size -- you have to use
> __SYSCALL(__NR_set_mempolicy) and such to get the real syscall.  This
> doesn't work so well for kernels compiled with NR_CPUS larger than the
> hardcoded glibc size.  [End off-topic alert]  This also doesn't provide
> any help to other code needing to move binary masks across the
> kernel/user boundary, such as the perfctr kernel extension that Mikael
> Pettersson <mikpe@csd.uu.se> describes.

Sounds like a glibc bug. It should probably dynamically detect
sizeof(cpumask_t), except of course that the API it's stuck with for
all time won't allow for dynamic allocation of the things.

Except when I look at my glibc headers, it's 1024 bits. And they're not
particularly recent glibc versions. SGI may need to get that bumped up,
but I doubt many others do.


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> I presume that it is too late to change the low level format of masks
> that the sched_setaffinity/sched_getaffinity API support.  I'd be
> delighted to be wrong on this presumption.  So there is need for a
> compat variant of these calls, for use by 32 bit apps on 64 bit kernels.
>  My first reaction to Milton Miller's compat_sched_getaffinity patch
> that Anton reminded us of is similar to Andrew's.  I haven't had the
> intestinal fortitude to study the matter closer yet.   Before actually
> reading the code, I would expect that all it had to handle was the
> swapping of 32 bit halves of 64 bit longs on 64 bit big endian kernels,
> such as I described in my discussion of a mythical BIT32X() macro,
> earlier in this thread.  I would not expect it to have to make such a
> big deal of the special case of one word masks, as distinct from n word
> masks, though I agree that a 32 bit app should be able to use a single
> 32 bit word mask on a 64 bit kernel compiled with NR_CPUS <= 32.

I thought something more like this would work, but haven't tried it.
This wants a real copy_bitmap_to_user() helper unlike compat_set_fd_set().

Index: irqaction-2.6.7-rc2/fs/compat.c
===================================================================
--- irqaction-2.6.7-rc2.orig/fs/compat.c	2004-06-01 03:11:30.000000000 -0700
+++ irqaction-2.6.7-rc2/fs/compat.c	2004-06-04 10:28:44.190035000 -0700
@@ -40,6 +40,7 @@
 #include <linux/nfsd/nfsd.h>
 #include <linux/nfsd/syscall.h>
 #include <linux/personality.h>
+#include <linux/cpu.h>
 
 #include <net/sock.h>		/* siocdevprivate_ioctl */
 
@@ -1394,6 +1395,31 @@
 	return ret;
 }
 
+asmlinkage long compat_sched_getaffinity(compat_pid_t pid,
+				compat_uint_t len, compat_ulong_t __user *cpus)
+{
+	cpumask_t affinity;
+	int ret = 0;
+	task_t *task;
+
+	if (len < sizeof(cpumask_t))
+		return -EINVAL;
+	if (!access_ok(VERIFY_WRITE, cpus, sizeof(cpumask_t)))
+		return -EFAULT;
+	lock_cpu_hotplug();
+	read_lock(&tasklist_lock);
+	if ((task = pid ? find_task_by_pid(pid) : current))
+		cpus_and(affinity, task->cpus_allowed, cpu_possible_map);
+	else
+		ret = -ESRCH;
+	read_unlock(&tasklist_lock);
+	unlock_cpu_hotplug();
+	if (ret)
+		return ret;
+	compat_set_fd_set(NR_CPUS, cpus, cpus_addr(affinity));
+	return sizeof(cpumask_t);
+}
+
 #if defined(CONFIG_NFSD) || defined(CONFIG_NFSD_MODULE)
 /* Stuff for NFS server syscalls... */
 struct compat_nfsctl_svc {


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> A key question, since it seems the perfctr stuff Mikael Pettersson
> describes is on its way into the main stream kernel, is whether any
> other kernel binary bitmap/cpumask API should use the same format as
> used by the kernel sched_setaffinity and sched_getaffinity, or use a
> more easily portable format - say an array of 32 bit words rather than
> an array of unsigned longs.  One could make impassioned pleas either
> way.  Having one kernel represent the same type in two different binary
> formats is a bit of a botch.  But then again, arrays of 32 bit words are
> 'nicer'.  And in fact, we already _have_ two formats required, since 32
> bit apps on 64 bit end endian kernels necessarily see a different format
> than their kernel uses natively -- indeed they use a format that is
> essentially the same as perfctr is using now.

This is trivial. Just like we needed ASCII marshalling, we need endian-
correct 32/64-bit bitmap marshalling.


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> My vote, already cast when I slid the 32 bit chunk ascii format past
> y'all (it's amazing now, that I managed to do that ...) would be to
> export the array of 32 bit words format from the kernel, in all calls
> except the set/get affinity calls, where we have already cast the die
> otherwise.  I like what I understand Mikael is trying to do here.

The only case where this is distinguished at all from copy_to_user() is
64-bit bigendian with 32-bit userspace.


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> In any case, I'd hope that any big/little endian distinctions could be
> encapsulated in macros provided by include/linux/byteorder headers. I'd
> hope that whichever one or two formats the kernel exported were
> supported by conversion routines in bitmap.c and bitmap.h, and if
> useful, also made available via the cpumask_t API.  Once cpumask
> routines were available to convert the perfctr format, then that would
> be one less use of the infamous cpus_addr() macro.  We should minimize
> 'open coding' of the conversion routines outside of the bitmap routines,
> which means look for the opportunity to move codes from both perfctr and
> compat_sched_setaffinity into lib/bitmap.c.


Index: irqaction-2.6.7-rc2/include/asm-generic/cpumask_array.h
===================================================================
--- irqaction-2.6.7-rc2.orig/include/asm-generic/cpumask_array.h	2004-05-29 23:26:10.000000000 -0700
+++ irqaction-2.6.7-rc2/include/asm-generic/cpumask_array.h	2004-06-04 18:29:36.984743000 -0700
@@ -27,6 +27,8 @@
 #define first_cpu(map)		find_first_bit((map).mask, NR_CPUS)
 #define next_cpu(cpu, map)	find_next_bit((map).mask, NR_CPUS, cpu + 1)
 
+#define cpus_to_u32_array(d, s)	bitmap_to_u32_array(d, (s).mask, sizeof(cpumask_t))
+
 /* only ever use this for things that are _never_ used on large boxen */
 #define cpus_coerce(map)	((map).mask[0])
 #define cpus_promote(map)	({ cpumask_t __cpu_mask = CPU_MASK_NONE;\
Index: irqaction-2.6.7-rc2/include/asm-generic/cpumask_arith.h
===================================================================
--- irqaction-2.6.7-rc2.orig/include/asm-generic/cpumask_arith.h	2004-05-29 23:26:26.000000000 -0700
+++ irqaction-2.6.7-rc2/include/asm-generic/cpumask_arith.h	2004-06-04 18:29:41.238097000 -0700
@@ -38,6 +38,8 @@
 #define CPU_MASK_ALL	(~((cpumask_t)0) >> (8*sizeof(cpumask_t) - NR_CPUS))
 #define CPU_MASK_NONE	((cpumask_t)0)
 
+#define cpus_to_u32_array(d, s)	bitmap_to_u32_array(d, &(s), sizeof(cpumask_t))
+
 /* only ever use this for things that are _never_ used on large boxen */
 #define cpus_coerce(map)		((unsigned long)(map))
 #define cpus_promote(map)		({ map; })
Index: irqaction-2.6.7-rc2/include/asm-generic/cpumask_up.h
===================================================================
--- irqaction-2.6.7-rc2.orig/include/asm-generic/cpumask_up.h	2004-05-29 23:25:55.000000000 -0700
+++ irqaction-2.6.7-rc2/include/asm-generic/cpumask_up.h	2004-06-04 18:29:46.573286000 -0700
@@ -40,6 +40,8 @@
 #define first_cpu(map)			(cpus_coerce(map) ? 0 : 1)
 #define next_cpu(cpu, map)		1
 
+#define cpus_to_u32_array(d, s)	bitmap_to_u32_array(d, &(s), sizeof(cpumask_t))
+
 /* only ever use this for things that are _never_ used on large boxen */
 #define cpus_promote(map)						\
 	({								\

-- wli

  reply	other threads:[~2004-06-05  1:32 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-06-03 16:43 [PATCH] Bitmap and Cpumask Cleanup - Overview Paul Jackson
2004-06-03 17:05 ` [PATCH] cpumask 1/10 cpu_present_map real even on non-smp Paul Jackson
2004-06-03 17:09 ` [PATCH] cpumask 2/10 bitmap cleanup preparation for cpumask overhaul Paul Jackson
2004-06-03 17:09 ` [PATCH] cpumask 3/10 bitmap inlining and optimizations Paul Jackson
2004-06-03 17:09 ` [PATCH] cpumask 4/10 uninline find_next_bit on ia64 Paul Jackson
2004-06-03 17:10 ` [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation Paul Jackson
2004-06-04  0:07   ` Andrew Morton
2004-06-04  0:25     ` Andrew Morton
2004-06-04  2:58       ` Paul Jackson
2004-06-04  2:47     ` Paul Jackson
2004-06-04  2:54       ` David S. Miller
2004-06-04  5:02         ` Paul Jackson
2004-06-04  5:01           ` David S. Miller
2004-06-04  1:47   ` Rusty Russell
2004-06-04  2:02     ` Nick Piggin
2004-06-04  2:19       ` Rusty Russell
2004-06-04  5:18       ` Paul Jackson
2004-06-04  5:22         ` David S. Miller
2004-06-04  6:57           ` Paul Jackson
2004-06-04  9:31         ` Mikael Pettersson
2004-06-04  9:37           ` William Lee Irwin III
2004-06-04  9:46             ` Mikael Pettersson
2004-06-04  9:59               ` William Lee Irwin III
2004-06-04 11:16                 ` Mikael Pettersson
2004-06-04 11:27                   ` William Lee Irwin III
2004-06-04 11:32                     ` William Lee Irwin III
2004-06-04 16:23                       ` Paul Jackson
2004-06-04 16:28                         ` William Lee Irwin III
2004-06-04 17:47                           ` Paul Jackson
2004-06-04 18:12                             ` William Lee Irwin III
2004-06-04 18:20                               ` William Lee Irwin III
2004-06-04 18:27                               ` Andrew Morton
2004-06-04 18:38                                 ` William Lee Irwin III
2004-06-05  2:51                                   ` William Lee Irwin III
2004-06-05  3:29                                     ` William Lee Irwin III
2004-06-04 18:42                               ` Paul Jackson
2004-06-04 18:42                                 ` William Lee Irwin III
2004-06-05  6:48                                   ` Paul Jackson
2004-06-06  2:07                               ` Rusty Russell
2004-06-06 12:16                                 ` Paul Jackson
2004-06-06 12:13                                   ` William Lee Irwin III
2004-06-06 12:28                                     ` Paul Jackson
2004-06-06 12:36                                       ` William Lee Irwin III
2004-06-06 13:42                                         ` Paul Jackson
2004-06-06 23:20                                   ` Rusty Russell
2004-06-07  6:44                                     ` Paul Jackson
2004-06-04  9:41           ` Andrew Morton
2004-06-05  7:01             ` Paul Jackson
2004-06-04 16:03           ` Paul Jackson
2004-06-04 16:56             ` William Lee Irwin III
2004-06-04 17:29               ` Paul Jackson
2004-06-04 17:52                 ` William Lee Irwin III
2004-06-04 19:01                   ` Paul Jackson
2004-06-04 19:08               ` Anton Blanchard
2004-06-04 19:17                 ` William Lee Irwin III
2004-06-04 20:28                 ` Andrew Morton
2004-06-07  7:55                   ` Anton Blanchard
2004-06-05  7:28                 ` Paul Jackson
2004-06-06  8:07                   ` Paul Jackson
2004-06-06  8:16                     ` William Lee Irwin III
2004-06-05  0:05               ` Paul Jackson
2004-06-05  1:31                 ` William Lee Irwin III [this message]
2004-06-05  8:04                   ` Paul Jackson
2004-06-05  8:26                     ` William Lee Irwin III
2004-06-06  8:40                       ` Paul Jackson
2004-06-06 12:34                         ` Paul Jackson
2004-06-07 16:54                       ` fix up compat_sched_[get/set]affinity Joe Korty
2004-06-07 17:07                         ` William Lee Irwin III
2004-06-04  5:30       ` [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation Paul Jackson
2004-06-04  5:35         ` Nick Piggin
2004-06-04  5:40           ` Andrew Morton
2004-06-04  5:53             ` Nick Piggin
2004-06-04  6:47             ` Paul Jackson
2004-06-04  4:31     ` Paul Jackson
2004-06-04  8:19   ` William Lee Irwin III
2004-06-04  8:43     ` Keith Owens
2004-06-04  9:54       ` William Lee Irwin III
2004-06-04 17:08         ` Paul Jackson
2004-06-09 16:38         ` William Lee Irwin III
2004-06-04  9:14     ` Paul Jackson
2004-06-03 17:10 ` [PATCH] cpumask 6/10 remove 26 no longer used cpumask*.h files Paul Jackson
2004-06-03 17:10 ` [PATCH] cpumask 7/10 remove obsolete cpumask macro uses - i386 arch Paul Jackson
2004-06-03 17:10 ` [PATCH] cpumask 8/10 remove obsolete cpumask macro uses - other archs Paul Jackson
2004-06-03 17:11 ` [PATCH] cpumask 9/10 Remove no longer used obsolete macro emulation Paul Jackson
2004-06-03 17:11 ` [PATCH] cpumask 10/10 optimize various uses of new cpumasks Paul Jackson
2004-06-04  4:27   ` Rusty Russell
2004-06-04  4:40     ` Nick Piggin
2004-06-04  4:51     ` Paul Jackson
2004-06-09  0:09   ` PATCH] cpumask 11/10 comment, spacing tweaks Paul Jackson
  -- strict thread matches above, loose matches on Subject: below --
2004-06-06 15:07 [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation Mikael Pettersson
2004-06-06 16:44 ` William Lee Irwin III
2004-06-06 17:46   ` Paul Jackson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040605013139.GM21007@holomorphy.com \
    --to=wli@holomorphy.com \
    --cc=Simon.Derr@bull.net \
    --cc=ak@muc.de \
    --cc=akpm@osdl.org \
    --cc=ashok.raj@intel.com \
    --cc=colpatch@us.ibm.com \
    --cc=hch@infradead.org \
    --cc=jbarnes@sgi.com \
    --cc=joe.korty@ccur.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manfred@colorfullife.com \
    --cc=mikpe@csd.uu.se \
    --cc=nickpiggin@yahoo.com.au \
    --cc=pj@sgi.com \
    --cc=rusty@rustcorp.com.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox