From: Miao Xie <miaox@cn.fujitsu.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Paul Menage <menage@google.com>, Paul Jackson <pj@usa.net>,
Andrew Morton <akpm@linux-foundation.org>,
mingo@elte.hu, linux-kernel@vger.kernel.org,
cl@linux-foundation.org, Derek Fults <dfults@sgi.com>
Subject: Re: [PATCH] cpuset: fix allocating page cache/slab object on the unallowed node when memory spread is set
Date: Thu, 12 Feb 2009 16:27:16 +0800
Message-ID: <4993DD64.2090705@cn.fujitsu.com>
In-Reply-To: <200902121255.43047.nickpiggin@yahoo.com.au>
On 2009-2-12 9:55, Nick Piggin wrote:
> On Thursday 12 February 2009 12:19:11 Paul Menage wrote:
>> On Wed, Feb 11, 2009 at 4:54 PM, Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>>> It would be possible, depending on timing, for the allocating thread to
>>> see either pre or post mems_allowed even if access was fully locked.
>> Right - seeing either the pre set or the post set is fine.
>>
>>> The only difference is that a partially changed mems_allowed could be
>>> seen. But what does this really mean? Some combination of the new and
>>> the old nodes. I don't think this is too much of a problem.
>> But if the old and new nodes are disjoint, that could lead to seeing no
>> nodes.
>
> Well we could structure updates as setting all new allowed nodes,
> then clearing newly disallowed ones.
But it still has another problem, such as:

Task1                                       Task2
get_page_from_freelist()                    while (1) {
{
    for_each_zone_zonelist_nodemask() {
                                                change Task1's mems_allowed
        if (!cpuset_zone_allowed_softwall())
            goto try_next_zone;
    try_next_zone:
        ...
    }
}                                           }

In the extreme case, if Task2 keeps changing mems_allowed so that every
zone is disallowed at the moment Task1 checks it, Task1 will be completely
unable to allocate memory. At the least, page allocation will be delayed.
Though the probability of this case is very low, we have to take it into
account.
Thanks!
Miao
>
>
>> Also, having the results of cpuset_zone_allowed() and
>> cpuset_current_mems_allowed change at random times over the course of
>> a call to alloc_pages() might cause interesting effects (e.g. we make
>> progress freeing pages from one set of nodes, and then call
>> get_page_from_freelist() on a different set of nodes).
>
> But again, is this really a problem? We're talking about a tiny
> possibility in a very uncommon case anyway when the cpuset is
> changing.
>
> If it can cause an outright error like OOM of course that's no
> good, but if it just requires us to go around the reclaim loop
> or allocate from another zone... I don't think that's so bad.
>
>
>>> This could work if we *really* need an atomic snapshot of mems_allowed.
>>> seqcount synchronisation would be an alternative too that could allow
>>> sleeping more easily than SRCU (OTOH if you don't need sleeping, then
>>> RCU should be faster than seqcount).
>>>
>>> But I'm not convinced we do need this to be atomic.
>> It's possible that I'm being overly-paranoid here. The decision to
>> make mems_allowed updates be purely pulled by the task itself predates
>> my involvement with cpusets code by a long time.
>
> It's not such a bad model, but the problem with it is that it needs
> to be carefully spread over the VM, and in fastpaths too. Now if it
> were something really critical, fine, but I'm hoping we can do
> without.
>
>
>
>
>