From: Andi Kleen <ak@muc.de>
To: colpatch@us.ibm.com
Cc: linux-kernel@vger.kernel.org
Subject: Re: NUMA API for Linux
Date: Fri, 09 Apr 2004 07:39:37 +0200 [thread overview]
Message-ID: <m3d66hwvza.fsf@averell.firstfloor.org> (raw)
In-Reply-To: <1ISQC-7Cv-5@gated-at.bofh.it> (Matthew Dobson's message of "Fri, 09 Apr 2004 03:20:06 +0200")
Matthew Dobson <colpatch@us.ibm.com> writes:
>
> Instead of looking up a page's node number by
> page_zone(p)->zone_pgdat->node_id, you can get the same information much
> more efficiently by doing some bit-twidling on page->flags. Use
> page_nodenum(struct page *) from include/linux/mm.h.
That wasn't there when I wrote the code. I will do that change, thanks.
> So it looks like you *are* sharing policies, at least for VMA's in the
> range of a single mbind() call? This is a good start! ;) Looking
> further ahead, I'm a bit confused. It seems despite *sharing* VMA's
> belonging to a single mbind() call, you *copy* VMA's during dup_mmap(),
> copy_process(), split_vma(), and move_vma(). So in the majority of
> cases you duplicate policies instead of sharing them, but you *do* share
> them in some instances? Why the inconsistency?
Locking. policies get locked by their VM.
(the sharing you're advocating would require adding a new lock to
mempolicy)
>> +/* Change policy for a memory range */
>> +asmlinkage long sys_mbind(unsigned long start, unsigned long len,
>> + unsigned long mode,
>> + unsigned long *nmask, unsigned long maxnode,
>> + unsigned flags)
>
> What would you think about giving sys_mbind() a pid argument, like
> sys_sched_setaffinity()? It would make it much easier for sysadmins to
> use the API if they didn't have to rewrite applications to make these
> calls on their own. There's already a plethora of arguments, so one
> more might be overkill.... Just a thought.
It already has 6 arguments - 7 are not allowed. Also playing with a different
process' VM remotely is just a recipe for disaster IMHO (remember
all the security holes that used to be in /proc/pid/mem)
> Why not condense both the sys_mbind() & sys_set_mempolicy() into a
> single call? The functionality of these calls (and even their code) is
> very similar. The main difference is there is no need for looking up
> VMA's and doing locking in the sys_set_mempolicy() case. You overload
> the output side (sys_get_mempolicy()) to handle both whole process and
> memory range options, but you don't do the same on the input
> (sys_mbind() and sys_set_mempolicy()). Saving one syscall and having
> them behave more symmetrically would be a nice addition...
I think it's cleaner to have them separated.
>> +/* Retrieve NUMA policy */
>> +asmlinkage long sys_get_mempolicy(int *policy,
>> + unsigned long *nmask, unsigned long maxnode,
>> + unsigned long addr, unsigned long flags)
>
> I had a thought... Shouldn't all your user pointers be marked as such
> with __user? Ie:
I figure that the people who use the tools who need such ugly annotations
will add them themselves.
>
>> +{
>> + int err, pval;
>> + struct mm_struct *mm = current->mm;
>> + struct vm_area_struct *vma = NULL;
>> + struct mempolicy *pol = current->mempolicy;
>> +
>> + if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
>> + return -EINVAL;
>> + if (nmask != NULL && maxnode < numnodes)
>> + return -EINVAL;
>
> Did you mean: if (nmask == NULL || maxnode < numnodes) ?
No, the original test is correct.
> I think that BUG_ON should be BUG_ON(nid < 0), since it *shouldn't* be
> possible to get through that loop with nid >= MAX_NUMNODES. The only
> line that could set nid >= MAX_NUMNODES is nid = find_next_bit(...); and
> the very next line ensures that if nid is in fact >= MAX_NUMNODES it
> will be set to -1. It actually looks pretty redundant altogether,
> but...
Yes, it could be removed.
-Andi
next parent reply other threads:[~2004-04-09 5:39 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1IsMQ-3vi-35@gated-at.bofh.it>
[not found] ` <1IsMS-3vi-45@gated-at.bofh.it>
[not found] ` <1It5U-3J1-21@gated-at.bofh.it>
[not found] ` <1ItfE-3PL-3@gated-at.bofh.it>
[not found] ` <1ISQC-7Cv-5@gated-at.bofh.it>
2004-04-09 5:39 ` Andi Kleen [this message]
[not found] <1IL3l-1dP-35@gated-at.bofh.it>
[not found] ` <1IMik-2is-37@gated-at.bofh.it>
2004-04-08 19:20 ` NUMA API for Linux Rajesh Venkatasubramanian
2004-04-08 19:48 ` Hugh Dickins
2004-04-08 19:57 ` Rajesh Venkatasubramanian
2004-04-08 19:52 ` Andrea Arcangeli
2004-04-07 21:24 Matthew Dobson
2004-04-07 21:27 ` Andi Kleen
2004-04-07 21:41 ` Matthew Dobson
2004-04-07 21:45 ` Andi Kleen
2004-04-07 22:19 ` Matthew Dobson
2004-04-08 0:58 ` Matthew Dobson
2004-04-08 1:31 ` Andi Kleen
2004-04-08 18:36 ` Matthew Dobson
2004-04-09 1:09 ` Matthew Dobson
2004-04-09 5:29 ` Martin J. Bligh
2004-04-09 18:44 ` Matthew Dobson
2004-04-15 0:38 ` Matthew Dobson
2004-04-15 10:39 ` Andi Kleen
2004-04-15 11:48 ` Robin Holt
2004-04-15 18:32 ` Matthew Dobson
2004-04-15 19:44 ` Matthew Dobson
2004-04-07 21:35 ` Matthew Dobson
2004-04-07 21:51 ` Andrew Morton
2004-04-07 22:16 ` Andi Kleen
2004-04-07 22:34 ` Andrew Morton
2004-04-07 22:39 ` Martin J. Bligh
2004-04-07 22:33 ` Andi Kleen
2004-04-07 22:38 ` Martin J. Bligh
2004-04-07 22:38 ` Andi Kleen
2004-04-07 22:52 ` Andrew Morton
2004-04-07 23:09 ` Martin J. Bligh
2004-04-07 23:35 ` Andi Kleen
2004-04-07 23:56 ` Andrew Morton
2004-04-08 0:14 ` Andi Kleen
2004-04-08 0:26 ` Andrea Arcangeli
2004-04-08 0:51 ` Andi Kleen
2004-04-08 16:15 ` Hugh Dickins
2004-04-08 17:05 ` Martin J. Bligh
2004-04-08 18:16 ` Hugh Dickins
2004-04-08 19:25 ` Andrew Morton
2004-04-09 2:41 ` Wim Coekaerts
2004-04-08 0:22 ` Andrea Arcangeli
-- strict thread matches above, loose matches on Subject: below --
2004-04-06 13:33 Andi Kleen
2004-04-06 23:35 ` Paul Jackson
2004-04-08 20:12 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m3d66hwvza.fsf@averell.firstfloor.org \
--to=ak@muc.de \
--cc=colpatch@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.