All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andi Kleen <ak@suse.de>
To: colpatch@us.ibm.com
Cc: linux-kernel@vger.kernel.org, akpm@osdl.org, mbligh@aracnet.com
Subject: Re: NUMA API for Linux
Date: Thu, 15 Apr 2004 12:39:15 +0200	[thread overview]
Message-ID: <20040415123915.016523df.ak@suse.de> (raw)
In-Reply-To: <1081989517.1206.206.camel@arrakis>

On Wed, 14 Apr 2004 17:38:37 -0700
Matthew Dobson <colpatch@us.ibm.com> wrote:



> 1) Redefine the value of some of the MPOL_* flags

I don't want to merge the flags the and the mode argument. It's ugly.

> 2) Rename check_* to mpol_check_*

I really don't understand why you insist on renaming all my functions? 
I like the current naming, thank you.

> 3) Remove get_nodes().  This should be done in the same manner as
> sys_sched_setaffinity().  We shouldn't care about unused high bits.

I disagree on that. This would break programs that are first tested
on a small machine and then later run on a big machine (common case)

> 4) Create mpol_check_flags() to, well, check the flags.  As the number
> of flags and modes grows, it will be easier to do this check in its own
> function.
> 5) In the syscalls (sys_mbind() & sys_set_mempolicy()), change 'len' to
> a size_t, add __user to the declaration of 'nmask', change 'maxnode' to

unsigned long is the standard for system calls.  Check some others. 

> 'nmask_len', and condense 'flags' and 'mode' into 'flags'.  The
> motivation here is to make this syscall similar to
> sys_sched_setaffinity().  These calls are basically the memory
> equivalent of set/getaffinity, and should look & behave that way.  Also,
> dropping an argument leaves an opening for a pid argument, which I
> believe would be good.  We should allow processes (with appropriate
> permissions, of course) to mbind other processes.

Messing with other process' VM is a recipe for disaster. There 
used to be tons of exploitable races in /proc/pid/mem, I don't want to repeat that.
Adding pid to set_mem_policy would be a bit easier, but it would require
to add a lock to the task struct for this. Currently it is nice and lockless
because it relies on the fact that only the current process can change 
its own policy. I prefer to keep it lockless, because that keeps the memory 
allocation fast paths faster.

> 6) Change how end is calculated as follows:
> 	end = PAGE_ALIGN(start+len);
> 	start &= PAGE_MASK;
> Basically, this allows users to pass in a non-page aligned 'start', and
> makes sure we mbind all pages from the page containing 'start' to the
> page containing 'start'+'len'.

mprotect() does the EINVAL check on unalignment. I think it's better
to follow mprotect here.

-Andi

  reply	other threads:[~2004-04-15 10:42 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-04-07 21:24 NUMA API for Linux Matthew Dobson
2004-04-07 21:27 ` Andi Kleen
2004-04-07 21:41   ` Matthew Dobson
2004-04-07 21:45     ` Andi Kleen
2004-04-07 22:19       ` Matthew Dobson
2004-04-08  0:58       ` Matthew Dobson
2004-04-08  1:31         ` Andi Kleen
2004-04-08 18:36           ` Matthew Dobson
2004-04-09  1:09       ` Matthew Dobson
2004-04-09  5:29         ` Martin J. Bligh
2004-04-09 18:44           ` Matthew Dobson
2004-04-15  0:38   ` Matthew Dobson
2004-04-15 10:39     ` Andi Kleen [this message]
2004-04-15 11:48       ` Robin Holt
2004-04-15 18:32         ` Matthew Dobson
2004-04-15 19:44       ` Matthew Dobson
2004-04-07 21:35 ` Matthew Dobson
2004-04-07 21:51 ` Andrew Morton
2004-04-07 22:16   ` Andi Kleen
2004-04-07 22:34     ` Andrew Morton
2004-04-07 22:39     ` Martin J. Bligh
2004-04-07 22:33       ` Andi Kleen
2004-04-07 22:38   ` Martin J. Bligh
2004-04-07 22:38     ` Andi Kleen
2004-04-07 22:52       ` Andrew Morton
2004-04-07 23:09         ` Martin J. Bligh
2004-04-07 23:35         ` Andi Kleen
2004-04-07 23:56           ` Andrew Morton
2004-04-08  0:14             ` Andi Kleen
2004-04-08  0:26               ` Andrea Arcangeli
2004-04-08  0:51                 ` Andi Kleen
2004-04-08 16:15             ` Hugh Dickins
2004-04-08 17:05               ` Martin J. Bligh
2004-04-08 18:16                 ` Hugh Dickins
2004-04-08 19:25               ` Andrew Morton
2004-04-09  2:41                 ` Wim Coekaerts
2004-04-08  0:22           ` Andrea Arcangeli
     [not found] <1IsMQ-3vi-35@gated-at.bofh.it>
     [not found] ` <1IsMS-3vi-45@gated-at.bofh.it>
     [not found]   ` <1It5U-3J1-21@gated-at.bofh.it>
     [not found]     ` <1ItfE-3PL-3@gated-at.bofh.it>
     [not found]       ` <1ISQC-7Cv-5@gated-at.bofh.it>
2004-04-09  5:39         ` Andi Kleen
     [not found] <1IL3l-1dP-35@gated-at.bofh.it>
     [not found] ` <1IMik-2is-37@gated-at.bofh.it>
2004-04-08 19:20   ` Rajesh Venkatasubramanian
2004-04-08 19:48     ` Hugh Dickins
2004-04-08 19:57       ` Rajesh Venkatasubramanian
2004-04-08 19:52     ` Andrea Arcangeli
  -- strict thread matches above, loose matches on Subject: below --
2004-04-06 13:33 Andi Kleen
2004-04-06 23:35 ` Paul Jackson
2004-04-08 20:12 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040415123915.016523df.ak@suse.de \
    --to=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=colpatch@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mbligh@aracnet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.