From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>,
linux-mm@kvack.org, Eric Whitney <eric.whitney@hp.com>,
David Rientjes <rientjes@google.com>, Paul Jackson <pj@sgi.com>
Subject: Re: [NUMA] Fix memory policy refcounting
Date: Tue, 06 Nov 2007 15:08:11 -0500
Message-ID: <1194379691.5317.101.camel@localhost>
In-Reply-To: <Pine.LNX.4.64.0711061139230.30127@schroedinger.engr.sgi.com>
On Tue, 2007-11-06 at 11:43 -0800, Christoph Lameter wrote:
> On Tue, 6 Nov 2007, Lee Schermerhorn wrote:
>
> > We always seem to rathole on that subject. I just hoped to head that
> > off...
>
> Well fix this and the rathole will be gone.
I'll hold you to that! :-)
>
> > > What do you mean by in use? If a vma can potentially use a shared policy
> > > in a rbtree then it is in use right?
> >
> > Not really--not for shared policies. Again, another task is allowed to
> > remove or replace the shared policies at any time, regardless of the
> > number of tasks attached to the segment. We can't differentiate
> > between simple attachment and current use. We need the lookup-time
> > ref/unref to know that the policy is actually in use. We can still
> > replace it in the tree while it's "in use". This will remove the tree's
> > reference on the policy, but the policy won't be freed until the task
> > holding the extra ref drops it.
>
> Still unclear as to why we need lookup time ref/unref. A task can replace
> the shared policy at any time; you just need to update the refcounts. If
> you have a pointer to the policy in the vma then it's possible to do so.
A pointer in the vma won't work. Different tasks could apply policies
on different ranges and shared policy semantics dictate that all tasks
see the same policy for a particular offset in the region--modulo
set/get races. The only way we could keep a pointer in the vma would be
to split the vmas in every task that has the shared region attached
whenever any task changes the policy of a range of the region, so that
all tasks have the same set of vmas all pointing to the same set of
policies in the tree. I don't think we can be changing other tasks'
address spaces externally like this. And it still wouldn't work, I
think, for shared policy semantics--again, except maybe with some sort
of rcu mechanism. More below on what constitutes actual "use".
>
> > I suppose we could stick any replaced mempolicy on a list associated
> > with the segment and keep them there until all tasks detach from the
> > shared segment. Not too much of a memory leak, as long as a task
>
> Well you have the refcount on the policy? Why keep the mempolicy around?
A non-zero ref count is what keeps the policy around. It implies that
some structure has a pointer to the policy, or some task is actively
examining the policy and will drop the reference when finished with it.
[The latter is what's NOT happening now for shared policy.]
>
> > > AFAICT: If you take a reference on the shared policy for each
> > > vma then you can tell from the references that the policy is in use.
> >
> > See above. A vma reference does not constitute use for a shared policy.
>
> Why not? What does constitute "use" of a shared policy? A page that has
> used the policy?
Currently, when you look up the policy [based on offset] in the rbtree
under spin_lock, the lookup function does an mpol_get() before dropping
the lock. Now, you can use the policy to allocate a page or to report
via get_mempolicy(MPOL_F_ADDR) or show_numa_maps()/mpol_to_str(). When
you're finished with the policy, you mpol_free() to release the
reference. While you're holding this ref, another task that has the
shared region attached can replace/delete the policy, removing it from
the rbtree and dropping the rbtree's reference via mpol_free(). Now,
the only reference to the policy is any reference held by a task that
has looked it up, but not yet mpol_free()ed it. When the last task
holding such a reference releases it, we'll free it back to the kmem
cache.
This is the type of use that I can't infer from vma counts or even vma
pointer refs. I should be able to replace the vma pointer/ref at any
time when the shared policy changes, and mpol_free() the policy for each
such vma pointer/ref. That leaves no ref to hold the policy should it
be in use [as discussed above].
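
The lookup-time reference described above can be modeled in a few lines
of userspace C. This is only a sketch under stated assumptions, not
kernel code: a single slot stands in for the rbtree, a pthread mutex
stands in for the spin lock, and every name here (struct policy,
shared_slot, policy_put, slot_lookup, slot_replace, freed_count) is
hypothetical. policy_put() plays the role that mpol_free() plays in the
discussion, and slot_lookup() takes its reference before dropping the
lock, the way the shared policy lookup is described as doing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Instrumentation for the demo only; the demo is single threaded. */
static int freed_count;

struct policy {
    atomic_int refcnt;  /* one ref per holder: the slot, plus each active user */
    int mode;
};

/* Hypothetical stand-in for the per-segment rbtree: one policy, one lock. */
struct shared_slot {
    pthread_mutex_t lock;
    struct policy *pol;
};

static struct policy *policy_new(int mode)
{
    struct policy *p = malloc(sizeof(*p));
    if (!p)
        abort();
    atomic_init(&p->refcnt, 1);  /* initial ref, handed to the slot on install */
    p->mode = mode;
    return p;
}

/* Drop one reference; free the policy when the last one goes away. */
static void policy_put(struct policy *p)
{
    if (atomic_fetch_sub(&p->refcnt, 1) == 1) {
        free(p);
        freed_count++;
    }
}

/* Lookup takes a reference BEFORE dropping the lock. */
static struct policy *slot_lookup(struct shared_slot *s)
{
    struct policy *p;

    pthread_mutex_lock(&s->lock);
    p = s->pol;
    if (p)
        atomic_fetch_add(&p->refcnt, 1);
    pthread_mutex_unlock(&s->lock);
    return p;
}

/* Any attached task may replace the policy; only the slot's ref is dropped. */
static void slot_replace(struct shared_slot *s, struct policy *newp)
{
    struct policy *old;

    pthread_mutex_lock(&s->lock);
    old = s->pol;
    s->pol = newp;
    pthread_mutex_unlock(&s->lock);
    if (old)
        policy_put(old);
}

/* Replace a policy while another user still holds a lookup reference. */
static int demo_replace_while_in_use(void)
{
    struct shared_slot s;
    struct policy *held;
    int mode_seen;

    pthread_mutex_init(&s.lock, NULL);
    s.pol = NULL;
    freed_count = 0;

    slot_replace(&s, policy_new(1));  /* install policy 1 */
    held = slot_lookup(&s);           /* "task A" is now using policy 1 */
    slot_replace(&s, policy_new(2));  /* "task B" replaces it */
    assert(freed_count == 0);         /* policy 1 survives: A holds a ref */

    mode_seen = held->mode;           /* still safe to dereference */
    policy_put(held);                 /* A is done; policy 1 is freed now */
    assert(freed_count == 1);

    slot_replace(&s, NULL);           /* tear down: frees policy 2 */
    pthread_mutex_destroy(&s.lock);
    return mode_seen;
}
```

Calling demo_replace_while_in_use() returns 1 (the mode of the policy
that was replaced while still held) and leaves freed_count at 2. A
reference held only per vma pointer would have been dropped as part of
the replace, so in this model nothing would keep policy 1 alive across
the held->mode access.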
Lee