From: Andrew Morton <akpm@linux-foundation.org>
To: miaox@cn.fujitsu.com
Cc: Nick Piggin <npiggin@suse.de>,
David Rientjes <rientjes@google.com>,
Lee Schermerhorn <lee.schermerhorn@hp.com>,
Paul Menage <menage@google.com>,
Linux-Kernel <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH] [PATCH -mmotm] cpuset,mm: use seqlock to protect task->mempolicy and mems_allowed (v2) (was: Re: [PATCH V2 4/4] cpuset,mm: update task's mems_allowed lazily)
Date: Wed, 31 Mar 2010 12:42:01 -0700 [thread overview]
Message-ID: <20100331124201.8cb20a11.akpm@linux-foundation.org> (raw)
In-Reply-To: <4BAB6646.7040302@cn.fujitsu.com>
On Thu, 25 Mar 2010 21:33:58 +0800
Miao Xie <miaox@cn.fujitsu.com> wrote:
> on 2010-3-11 19:03, Nick Piggin wrote:
> >> Ok, I try to make a new patch by using seqlock.
> >
> > Well... I do think seqlocks would be a bit simpler because they don't
> > require this checking and synchronizing of this patch.
> Hi, Nick Piggin
>
> I have made a new patch which uses seqlock to protect mems_allowed and mempolicy.
> please review it.
That's an awfully big patch for a pretty small bug?
> Subject: [PATCH] [PATCH -mmotm] cpuset,mm: use seqlock to protect task->mempolicy and mems_allowed (v2)
>
> Before applying this patch, cpuset updates task->mems_allowed by setting all
> new bits in the nodemask first, and clearing all old unallowed bits later.
> But in the way, the allocator can see an empty nodemask, though it is infrequent.
>
> The problem is following:
> The size of nodemask_t is greater than the size of long integer, so loading
> and storing of nodemask_t are not atomic operations. If task->mems_allowed
> don't intersect with new_mask, such as the first word of the mask is empty
> and only the first word of new_mask is not empty. When the allocator
> loads a word of the mask before
>
> current->mems_allowed |= new_mask;
>
> and then loads another word of the mask after
>
> current->mems_allowed = new_mask;
>
> the allocator gets an empty nodemask.
Probably we could fix this via careful ordering of the updates,
barriers and perhaps some avoicance action at the reader side.
> Besides that, if the size of nodemask_t is less than the size of long integer,
> there is another problem. when the kernel allocater invokes the following function,
>
> struct zoneref *next_zones_zonelist(struct zoneref *z,
> enum zone_type highest_zoneidx,
> nodemask_t *nodes,
> struct zone **zone)
> {
> /*
> * Find the next suitable zone to use for the allocation.
> * Only filter based on nodemask if it's set
> */
> if (likely(nodes == NULL))
> ......
> else
> while (zonelist_zone_idx(z) > highest_zoneidx ||
> (z->zone && !zref_in_nodemask(z, nodes)))
> z++;
>
> *zone = zonelist_zone(z);
> return z;
> }
>
> if we change nodemask between two calls of zref_in_nodemask(), such as
> Task1 Task2
> zref_in_nodemask(z = node0's z, nodes = 1-2)
> zref_in_nodemask return 0
> nodes = 0
> zref_in_nodemask(z = node1's z, nodes = 0)
> zref_in_nodemask return 0
> z will overflow.
And maybe we can fix this by taking a copy into a local.
> when the kernel allocater accesses task->mempolicy, there is the same problem.
And maybe we can fix those in a similar way.
But it's all too much, and we'll just break it again in the future. So
yup, I guess locking is needed.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2010-03-31 19:42 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-08 10:10 [PATCH V2 4/4] cpuset,mm: update task's mems_allowed lazily Miao Xie
2010-03-08 21:46 ` David Rientjes
2010-03-09 7:25 ` Miao Xie
2010-03-11 8:15 ` Nick Piggin
2010-03-11 10:33 ` Miao Xie
2010-03-11 11:03 ` Nick Piggin
2010-03-25 10:23 ` Miao Xie
2010-03-25 12:56 ` Miao Xie
2010-03-25 13:33 ` [PATCH] [PATCH -mmotm] cpuset,mm: use seqlock to protect task->mempolicy and mems_allowed (v2) (was: Re: [PATCH V2 4/4] cpuset,mm: update task's mems_allowed lazily) Miao Xie
2010-03-28 5:30 ` Bob Liu
2010-03-31 19:42 ` Andrew Morton [this message]
2010-03-31 9:54 ` [PATCH V2 4/4] cpuset,mm: update task's mems_allowed lazily Miao Xie
2010-03-31 10:34 ` David Rientjes
2010-04-01 2:16 ` Miao Xie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100331124201.8cb20a11.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=lee.schermerhorn@hp.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=menage@google.com \
--cc=miaox@cn.fujitsu.com \
--cc=npiggin@suse.de \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).