From: Christoph Lameter <clameter@sgi.com>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, ak@suse.de,
eric.whitney@hp.com, mel@skynet.ie
Subject: Re: [PATCH/RFC 4/4] Mem Policy: Fixup Fallback for Default Shmem Policy
Date: Wed, 24 Oct 2007 06:09:41 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0710240601590.24201@schroedinger.engr.sgi.com>
In-Reply-To: <1193160751.5859.93.camel@localhost>
On Tue, 23 Oct 2007, Lee Schermerhorn wrote:
> > I still think there must be a thinko here. The function seems to be
> > currently coded with the assumption that get_policy always returns a
> > policy. That policy may be the default policy??
>
> My assumption is that the get_policy vm_op should either return a
> [non-NULL] mempolicy corresponding to the specified address with the ref
> count elevated for the caller, or NULL. Never the default policy.
> Fallback will be handled by get_vma_policy().
Ok.
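To make sure we mean the same thing, here is a rough userspace sketch of that model as I read it. All names ending in _sketch, the example ops and the needs_unref out-parameter are made up for illustration; the structs are simplified stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel definitions. */
struct mempolicy {
	int refcnt;	/* the kernel uses an atomic counter */
	int mode;
};

struct vma_sketch;
typedef struct mempolicy *(*get_policy_fn)(struct vma_sketch *vma,
					   unsigned long addr);

struct vma_sketch {
	get_policy_fn get_policy;	/* optional op, may be NULL */
	struct mempolicy *vm_policy;	/* plain vma policy, may be NULL */
};

static struct mempolicy default_policy = { .refcnt = 1, .mode = 0 };
static struct mempolicy *task_policy;	/* NULL: task has no policy */

/* Example op: finds a shared policy and takes a ref for the caller. */
static struct mempolicy example_shared_policy = { .refcnt = 1, .mode = 1 };

static struct mempolicy *example_get_policy(struct vma_sketch *vma,
					    unsigned long addr)
{
	(void)vma;
	(void)addr;
	example_shared_policy.refcnt++;
	return &example_shared_policy;
}

/* Example op: nothing installed for this address, so return NULL. */
static struct mempolicy *example_get_policy_none(struct vma_sketch *vma,
						 unsigned long addr)
{
	(void)vma;
	(void)addr;
	return NULL;
}

/*
 * The op returns either a referenced policy or NULL, never the default.
 * All fallback (task policy, then system default) happens here.
 */
static struct mempolicy *get_vma_policy_sketch(struct vma_sketch *vma,
					       unsigned long addr,
					       int *needs_unref)
{
	struct mempolicy *pol = NULL;

	*needs_unref = 0;
	if (vma) {
		if (vma->get_policy) {
			pol = vma->get_policy(vma, addr);
			if (pol)
				*needs_unref = 1; /* op took a ref for us */
		} else {
			pol = vma->vm_policy;
		}
	}
	if (!pol)
		pol = task_policy;
	if (!pol)
		pol = &default_policy;
	return pol;
}

/* Drop a reference previously taken on behalf of the caller. */
static void mpol_put_sketch(struct mempolicy *pol)
{
	assert(pol->refcnt > 0);
	pol->refcnt--;
}
```

The point of the out-parameter is only to show that the caller has to know whether a reference was taken; how that is tracked in the real code is a separate question.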
> So, my "model" is: the get_policy() op must return a non-NULL policy
> with elevated reference count or NULL so that get_vma_policy() can
> depend on consistent behavior; and a NULL return from the get_policy()
> op means "fall back to surrounding context" just as for vma policy.
>
> I think this is "consistent" behavior, for some definition thereof.
I still have concerns about taking the refcount. The get_policy() method may
take a refcount if it can ensure that the object is not vanishing from
under us. But I would think that a refcount needs to be taken when the
possibility is created for a certain vma to reference a policy via
get_vma_policy and not when get_vma_policy itself runs.
> > I still have no idea what your warrant is for being sure that the object
> > continues to exist before increasing the policy refcount in
> > get_vma_policy()? What pins the shared policy before we get the refcount?
>
> For shmem shared policy, the rb-tree spin lock protects the policy while
> we take the reference. To be consistent with this, I require that the
> shm get_policy op does the same when falling back to vma policy for shm
> file systems that don't support get_policy() ops--only hugetlbfs at this
> time.
The rb tree lock is always taken when we run get_vma_policy()? You mean
you can take the lock while the get_policy is run? This will make
get_vma_policy even heavier?
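For reference, the pattern being described (take the reference while the lock that protects the lookup structure is still held) would look roughly like the following. This is a userspace sketch under loud assumptions: a small array stands in for the rb-tree, a C11 atomic_flag stands in for the spin lock, and every name ending in _sketch is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel definitions. */
struct mempolicy {
	atomic_int refcnt;
	int mode;
};

struct sp_node_sketch {
	unsigned long start, end;	/* covers [start, end) */
	struct mempolicy *policy;
};

struct shared_policy_sketch {
	atomic_flag lock;		/* stand-in for the rb-tree spin lock */
	struct sp_node_sketch nodes[4];	/* stand-in for the rb-tree */
	size_t nr;
};

static void sp_init_sketch(struct shared_policy_sketch *sp)
{
	atomic_flag_clear(&sp->lock);
	sp->nr = 0;
}

static void sp_lock_sketch(struct shared_policy_sketch *sp)
{
	while (atomic_flag_test_and_set(&sp->lock))
		;	/* spin until the flag was clear */
}

static void sp_unlock_sketch(struct shared_policy_sketch *sp)
{
	atomic_flag_clear(&sp->lock);
}

/* Install a policy for a range; the structure keeps the caller's ref. */
static int sp_insert_sketch(struct shared_policy_sketch *sp,
			    unsigned long start, unsigned long end,
			    struct mempolicy *pol)
{
	int ok = 0;

	sp_lock_sketch(sp);
	if (sp->nr < 4) {
		sp->nodes[sp->nr].start = start;
		sp->nodes[sp->nr].end = end;
		sp->nodes[sp->nr].policy = pol;
		sp->nr++;
		ok = 1;
	}
	sp_unlock_sketch(sp);
	return ok;
}

/*
 * Look up the policy covering idx and take a reference before the lock
 * is dropped, so the policy cannot be freed between lookup and ref.
 */
static struct mempolicy *sp_lookup_ref_sketch(struct shared_policy_sketch *sp,
					      unsigned long idx)
{
	struct mempolicy *pol = NULL;
	size_t i;

	sp_lock_sketch(sp);
	for (i = 0; i < sp->nr; i++) {
		if (idx >= sp->nodes[i].start && idx < sp->nodes[i].end) {
			pol = sp->nodes[i].policy;
			atomic_fetch_add(&pol->refcnt, 1);
			break;
		}
	}
	sp_unlock_sketch(sp);
	return pol;
}
```

Note that this costs a lock round trip plus an atomic increment on every lookup, which is the overhead I am asking about.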
> The current task's vma policies, although subject to change by other
> threads/tasks sharing the mm_struct, are protected by the mmap_sem()
> while we take the reference, as you've pointed out in other mail. Why
> take the extra ref? Back in June/July, we [you, Andi, myself] thought
> that this was required for allocating under bind policy with the custom
> zonelist because the allocation could sleep. Now, if we hold the
> mmap_sem over the allocation, we can probably dispense with the extra
> reference on [non-shared] vma policies as well.
Right.
> However, we still need to unref shared policies which one could consider
> a subclass of vma policies. With these recent patches and the prior
> mempolicy ref count patches, we could assume that all policies except
> the system default and the current task's mempolicy needed unref upon
> return from get_vma_policy(). If we don't take an extra ref on other
> task's mempolicy and non-shared vma policy, then we need to be able to
> differentiate truly shared policies when we're done with them so that we
> can unref them.
If you take the reference when a vma is established then you can avoid
dropping the refcount on the hot paths?
> How about a funky flag in the higher order policy bits, like the
> MPOL_CONTEXT flag in my cpuset-independent interleave patch, to indicate
> shmem-style shared policy. If the reasoning about mmap_sem above is
> correct, and we only need to hold refs on shmem shared policy, we can
> dispense with all of this extra reference counting and only unref the
> shared policies.
Maybe. Would need to be further fleshed out.
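As a starting point for that, the flag idea could be sketched like this in userspace C. The flag name, its bit position, the mode mask and the helper names are all assumptions made up for illustration; none of them come from the patches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: low bits hold the mode, one high bit marks sharing. */
#define MPOL_MODE_MASK_SKETCH	0x00ffu
#define MPOL_SHARED_SKETCH	0x8000u

/* Simplified stand-in; not the kernel definition. */
struct mempolicy {
	int refcnt;		/* the kernel uses an atomic counter */
	unsigned short policy;	/* mode plus optional flag bits */
};

/* Strip the flag bits to recover the plain policy mode. */
static unsigned int mpol_mode_sketch(const struct mempolicy *pol)
{
	return pol->policy & MPOL_MODE_MASK_SKETCH;
}

/* Only shmem-style shared policies carry a reference that must be dropped. */
static int mpol_needs_unref_sketch(const struct mempolicy *pol)
{
	return pol != NULL && (pol->policy & MPOL_SHARED_SKETCH) != 0;
}

/* Conditional put: a no-op for task, vma and default policies. */
static void mpol_cond_put_sketch(struct mempolicy *pol)
{
	if (!mpol_needs_unref_sketch(pol))
		return;
	assert(pol->refcnt > 0);
	pol->refcnt--;
}
```

With something like this the hot path pays one flag test for non-shared policies and only touches the refcount for shared ones.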