From: Andi Kleen <ak@muc.de>
To: colpatch@us.ibm.com
Cc: linux-kernel@vger.kernel.org
Subject: Re: NUMA API for Linux
Date: Fri, 09 Apr 2004 07:39:37 +0200 [thread overview]
Message-ID: <m3d66hwvza.fsf@averell.firstfloor.org> (raw)
In-Reply-To: <1ISQC-7Cv-5@gated-at.bofh.it> (Matthew Dobson's message of "Fri, 09 Apr 2004 03:20:06 +0200")
Matthew Dobson <colpatch@us.ibm.com> writes:
>
> Instead of looking up a page's node number by
> page_zone(p)->zone_pgdat->node_id, you can get the same information much
> more efficiently by doing some bit-twidling on page->flags. Use
> page_nodenum(struct page *) from include/linux/mm.h.
That wasn't there when I wrote the code. I will do that change, thanks.
> So it looks like you *are* sharing policies, at least for VMA's in the
> range of a single mbind() call? This is a good start! ;) Looking
> further ahead, I'm a bit confused. It seems despite *sharing* VMA's
> belonging to a single mbind() call, you *copy* VMA's during dup_mmap(),
> copy_process(), split_vma(), and move_vma(). So in the majority of
> cases you duplicate policies instead of sharing them, but you *do* share
> them in some instances? Why the inconsistency?
Locking. policies get locked by their VM.
(the sharing you're advocating would require adding a new lock to
mempolicy)
>> +/* Change policy for a memory range */
>> +asmlinkage long sys_mbind(unsigned long start, unsigned long len,
>> + unsigned long mode,
>> + unsigned long *nmask, unsigned long maxnode,
>> + unsigned flags)
>
> What would you think about giving sys_mbind() a pid argument, like
> sys_sched_setaffinity()? It would make it much easier for sysadmins to
> use the API if they didn't have to rewrite applications to make these
> calls on their own. There's already a plethora of arguments, so one
> more might be overkill.... Just a thought.
It already has 6 arguments - 7 are not allowed. Also playing with a different
process' VM remotely is just a recipe for disaster IMHO (remember
all the security holes that used to be in /proc/pid/mem)
> Why not condense both the sys_mbind() & sys_set_mempolicy() into a
> single call? The functionality of these calls (and even their code) is
> very similar. The main difference is there is no need for looking up
> VMA's and doing locking in the sys_set_mempolicy() case. You overload
> the output side (sys_get_mempolicy()) to handle both whole process and
> memory range options, but you don't do the same on the input
> (sys_mbind() and sys_set_mempolicy()). Saving one syscall and having
> them behave more symmetrically would be a nice addition...
I think it's cleaner to have them separated.
>> +/* Retrieve NUMA policy */
>> +asmlinkage long sys_get_mempolicy(int *policy,
>> + unsigned long *nmask, unsigned long maxnode,
>> + unsigned long addr, unsigned long flags)
>
> I had a thought... Shouldn't all your user pointers be marked as such
> with __user? Ie:
I figure that the people who use the tools who need such ugly annotations
will add them themselves.
>
>> +{
>> + int err, pval;
>> + struct mm_struct *mm = current->mm;
>> + struct vm_area_struct *vma = NULL;
>> + struct mempolicy *pol = current->mempolicy;
>> +
>> + if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
>> + return -EINVAL;
>> + if (nmask != NULL && maxnode < numnodes)
>> + return -EINVAL;
>
> Did you mean: if (nmask == NULL || maxnode < numnodes) ?
No, the original test is correct.
> I think that BUG_ON should be BUG_ON(nid < 0), since it *shouldn't* be
> possible to get through that loop with nid >= MAX_NUMNODES. The only
> line that could set nid >= MAX_NUMNODES is nid = find_next_bit(...); and
> the very next line ensures that if nid is in fact >= MAX_NUMNODES it
> will be set to -1. It actually looks pretty redundant altogether,
> but...
Yes, it could be removed.
-Andi
next parent reply other threads:[~2004-04-09 5:39 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1IsMQ-3vi-35@gated-at.bofh.it>
[not found] ` <1IsMS-3vi-45@gated-at.bofh.it>
[not found] ` <1It5U-3J1-21@gated-at.bofh.it>
[not found] ` <1ItfE-3PL-3@gated-at.bofh.it>
[not found] ` <1ISQC-7Cv-5@gated-at.bofh.it>
2004-04-09 5:39 ` Andi Kleen [this message]
[not found] <1IL3l-1dP-35@gated-at.bofh.it>
[not found] ` <1IMik-2is-37@gated-at.bofh.it>
2004-04-08 19:20 ` NUMA API for Linux Rajesh Venkatasubramanian
2004-04-08 19:48 ` Hugh Dickins
2004-04-08 19:57 ` Rajesh Venkatasubramanian
2004-04-08 19:52 ` Andrea Arcangeli
2004-04-07 21:24 Matthew Dobson
2004-04-07 21:27 ` Andi Kleen
2004-04-07 21:41 ` Matthew Dobson
2004-04-07 21:45 ` Andi Kleen
2004-04-07 22:19 ` Matthew Dobson
2004-04-08 0:58 ` Matthew Dobson
2004-04-08 1:31 ` Andi Kleen
2004-04-08 18:36 ` Matthew Dobson
2004-04-09 1:09 ` Matthew Dobson
2004-04-09 5:29 ` Martin J. Bligh
2004-04-09 18:44 ` Matthew Dobson
2004-04-15 0:38 ` Matthew Dobson
2004-04-15 10:39 ` Andi Kleen
2004-04-15 11:48 ` Robin Holt
2004-04-15 18:32 ` Matthew Dobson
2004-04-15 19:44 ` Matthew Dobson
2004-04-07 21:35 ` Matthew Dobson
2004-04-07 21:51 ` Andrew Morton
2004-04-07 22:16 ` Andi Kleen
2004-04-07 22:34 ` Andrew Morton
2004-04-07 22:39 ` Martin J. Bligh
2004-04-07 22:33 ` Andi Kleen
2004-04-07 22:38 ` Martin J. Bligh
2004-04-07 22:38 ` Andi Kleen
2004-04-07 22:52 ` Andrew Morton
2004-04-07 23:09 ` Martin J. Bligh
2004-04-07 23:35 ` Andi Kleen
2004-04-07 23:56 ` Andrew Morton
2004-04-08 0:14 ` Andi Kleen
2004-04-08 0:26 ` Andrea Arcangeli
2004-04-08 0:51 ` Andi Kleen
2004-04-08 16:15 ` Hugh Dickins
2004-04-08 17:05 ` Martin J. Bligh
2004-04-08 18:16 ` Hugh Dickins
2004-04-08 19:25 ` Andrew Morton
2004-04-09 2:41 ` Wim Coekaerts
2004-04-08 0:22 ` Andrea Arcangeli
-- strict thread matches above, loose matches on Subject: below --
2004-04-06 13:33 Andi Kleen
2004-04-06 23:35 ` Paul Jackson
2004-04-08 20:12 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m3d66hwvza.fsf@averell.firstfloor.org \
--to=ak@muc.de \
--cc=colpatch@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox