From: Michal Nazarewicz <mina86@mina86.com>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Rik van Riel <riel@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Mel Gorman <mgorman@suse.de>,
Johannes Weiner <hannes@cmpxchg.org>,
Minchan Kim <minchan@kernel.org>,
Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
Zhang Yanfei <zhangyanfei@cn.fujitsu.com>,
Tang Chen <tangchen@cn.fujitsu.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
Wen Congyang <wency@cn.fujitsu.com>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Laura Abbott <lauraa@codeaurora.org>,
Heesub Shin <heesub.shin@samsung.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Ritesh Harjani <ritesh.list@gmail.com>,
t.stanislaws@samsung.com, Gioh Kim <gioh.kim@lge.com>,
Vlastimil Babka <vbabka@suse.cz>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [PATCH v4 1/4] mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype
Date: Mon, 27 Oct 2014 18:14:36 +0100 [thread overview]
Message-ID: <xa1toasxo2o3.fsf@mina86.com> (raw)
In-Reply-To: <1414051821-12769-2-git-send-email-iamjoonsoo.kim@lge.com>
On Thu, Oct 23 2014, Joonsoo Kim wrote:
> There are two paths to reach core free function of buddy allocator,
> __free_one_page(), one is free_one_page()->__free_one_page() and the
> other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
> Each paths has race condition causing serious problems. At first, this
> patch is focused on first type of freepath. And then, following patch
> will solve the problem in second type of freepath.
>
> In the first type of freepath, we got migratetype of freeing page without
> holding the zone lock, so it could be racy. There are two cases of this
> race.
>
> 1. pages are added to isolate buddy list after restoring orignal
> migratetype
>
> CPU1 CPU2
>
> get migratetype => return MIGRATE_ISOLATE
> call free_one_page() with MIGRATE_ISOLATE
>
> grab the zone lock
> unisolate pageblock
> release the zone lock
>
> grab the zone lock
> call __free_one_page() with MIGRATE_ISOLATE
> freepage go into isolate buddy list,
> although pageblock is already unisolated
>
> This may cause two problems. One is that we can't use this page anymore
> until next isolation attempt of this pageblock, because freepage is on
> isolate buddy list. The other is that freepage accouting could be wrong
> due to merging between different buddy list. Freepages on isolate buddy
> list aren't counted as freepage, but ones on normal buddy list are counted
> as freepage. If merge happens, buddy freepage on normal buddy list is
> inevitably moved to isolate buddy list without any consideration of
> freepage accouting so it could be incorrect.
>
> 2. pages are added to normal buddy list while pageblock is isolated.
> It is similar with above case.
>
> This also may cause two problems. One is that we can't keep these
> freepages from being allocated. Although this pageblock is isolated,
> freepage would be added to normal buddy list so that it could be
> allocated without any restriction. And the other problem is same as
> case 1, that it, incorrect freepage accouting.
>
> This race condition would be prevented by checking migratetype again
> with holding the zone lock. Because it is somewhat heavy operation
> and it isn't needed in common case, we want to avoid rechecking as much
> as possible. So this patch introduce new variable, nr_isolate_pageblock
> in struct zone to check if there is isolated pageblock.
> With this, we can avoid to re-check migratetype in common case and do
> it only if there is isolated pageblock or migratetype is MIGRATE_ISOLATE.
> This solve above mentioned problems.
>
> Changes from v3:
> Add one more check in free_one_page() that checks whether migratetype is
> MIGRATE_ISOLATE or not. Without this, abovementioned case 1 could happens.
>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
> ---
> include/linux/mmzone.h | 9 +++++++++
> include/linux/page-isolation.h | 8 ++++++++
> mm/page_alloc.c | 11 +++++++++--
> mm/page_isolation.c | 2 ++
> 4 files changed, 28 insertions(+), 2 deletions(-)
--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michał “mina86” Nazarewicz (o o)
ooo +--<mpn@google.com>--<xmpp:mina86@jabber.org>--ooO--(_)--Ooo--
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Michal Nazarewicz <mina86@mina86.com>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Rik van Riel <riel@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Mel Gorman <mgorman@suse.de>,
Johannes Weiner <hannes@cmpxchg.org>,
Minchan Kim <minchan@kernel.org>,
Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
Zhang Yanfei <zhangyanfei@cn.fujitsu.com>,
Tang Chen <tangchen@cn.fujitsu.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
Wen Congyang <wency@cn.fujitsu.com>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Laura Abbott <lauraa@codeaurora.org>,
Heesub Shin <heesub.shin@samsung.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Ritesh Harjani <ritesh.list@gmail.com>,
t.stanislaws@samsung.com, Gioh Kim <gioh.kim@lge.com>,
Vlastimil Babka <vbabka@suse.cz>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
stable@vger.kernel.org
Subject: Re: [PATCH v4 1/4] mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype
Date: Mon, 27 Oct 2014 18:14:36 +0100 [thread overview]
Message-ID: <xa1toasxo2o3.fsf@mina86.com> (raw)
In-Reply-To: <1414051821-12769-2-git-send-email-iamjoonsoo.kim@lge.com>
On Thu, Oct 23 2014, Joonsoo Kim wrote:
> There are two paths to reach core free function of buddy allocator,
> __free_one_page(), one is free_one_page()->__free_one_page() and the
> other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
> Each paths has race condition causing serious problems. At first, this
> patch is focused on first type of freepath. And then, following patch
> will solve the problem in second type of freepath.
>
> In the first type of freepath, we got migratetype of freeing page without
> holding the zone lock, so it could be racy. There are two cases of this
> race.
>
> 1. pages are added to isolate buddy list after restoring orignal
> migratetype
>
> CPU1 CPU2
>
> get migratetype => return MIGRATE_ISOLATE
> call free_one_page() with MIGRATE_ISOLATE
>
> grab the zone lock
> unisolate pageblock
> release the zone lock
>
> grab the zone lock
> call __free_one_page() with MIGRATE_ISOLATE
> freepage go into isolate buddy list,
> although pageblock is already unisolated
>
> This may cause two problems. One is that we can't use this page anymore
> until next isolation attempt of this pageblock, because freepage is on
> isolate buddy list. The other is that freepage accouting could be wrong
> due to merging between different buddy list. Freepages on isolate buddy
> list aren't counted as freepage, but ones on normal buddy list are counted
> as freepage. If merge happens, buddy freepage on normal buddy list is
> inevitably moved to isolate buddy list without any consideration of
> freepage accouting so it could be incorrect.
>
> 2. pages are added to normal buddy list while pageblock is isolated.
> It is similar with above case.
>
> This also may cause two problems. One is that we can't keep these
> freepages from being allocated. Although this pageblock is isolated,
> freepage would be added to normal buddy list so that it could be
> allocated without any restriction. And the other problem is same as
> case 1, that it, incorrect freepage accouting.
>
> This race condition would be prevented by checking migratetype again
> with holding the zone lock. Because it is somewhat heavy operation
> and it isn't needed in common case, we want to avoid rechecking as much
> as possible. So this patch introduce new variable, nr_isolate_pageblock
> in struct zone to check if there is isolated pageblock.
> With this, we can avoid to re-check migratetype in common case and do
> it only if there is isolated pageblock or migratetype is MIGRATE_ISOLATE.
> This solve above mentioned problems.
>
> Changes from v3:
> Add one more check in free_one_page() that checks whether migratetype is
> MIGRATE_ISOLATE or not. Without this, abovementioned case 1 could happens.
>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
> ---
> include/linux/mmzone.h | 9 +++++++++
> include/linux/page-isolation.h | 8 ++++++++
> mm/page_alloc.c | 11 +++++++++--
> mm/page_isolation.c | 2 ++
> 4 files changed, 28 insertions(+), 2 deletions(-)
--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michał “mina86” Nazarewicz (o o)
ooo +--<mpn@google.com>--<xmpp:mina86@jabber.org>--ooO--(_)--Ooo--
next prev parent reply other threads:[~2014-10-27 17:14 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-23 8:10 [PATCH v4 0/4] fix freepage count problems in memory isolation Joonsoo Kim
2014-10-23 8:10 ` Joonsoo Kim
2014-10-23 8:10 ` [PATCH v4 1/4] mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype Joonsoo Kim
2014-10-23 8:10 ` Joonsoo Kim
2014-10-27 10:33 ` Vlastimil Babka
2014-10-27 10:33 ` Vlastimil Babka
2014-10-28 7:22 ` Joonsoo Kim
2014-10-28 7:22 ` Joonsoo Kim
2014-10-27 17:14 ` Michal Nazarewicz [this message]
2014-10-27 17:14 ` Michal Nazarewicz
2014-10-23 8:10 ` [PATCH v4 2/4] mm/page_alloc: add freepage on isolate pageblock to correct buddy list Joonsoo Kim
2014-10-23 8:10 ` Joonsoo Kim
2014-10-27 10:34 ` Vlastimil Babka
2014-10-27 10:34 ` Vlastimil Babka
2014-10-27 17:14 ` Michal Nazarewicz
2014-10-27 17:14 ` Michal Nazarewicz
2014-10-23 8:10 ` [PATCH v4 3/4] mm/page_alloc: move migratetype recheck logic to __free_one_page() Joonsoo Kim
2014-10-23 8:10 ` Joonsoo Kim
2014-10-27 10:40 ` Vlastimil Babka
2014-10-27 10:40 ` Vlastimil Babka
2014-10-28 7:37 ` Joonsoo Kim
2014-10-28 7:37 ` Joonsoo Kim
2014-10-23 8:10 ` [PATCH v4 4/4] mm/page_alloc: restrict max order of merging on isolated pageblock Joonsoo Kim
2014-10-23 8:10 ` Joonsoo Kim
2014-10-24 2:27 ` [PATCH v4 0/4] fix freepage count problems in memory isolation Minchan Kim
2014-10-24 2:27 ` Minchan Kim
2014-10-24 5:36 ` Joonsoo Kim
2014-10-24 5:36 ` Joonsoo Kim
2014-10-24 8:26 ` Gioh Kim
2014-10-24 8:26 ` Gioh Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xa1toasxo2o3.fsf@mina86.com \
--to=mina86@mina86.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=b.zolnierkie@samsung.com \
--cc=gioh.kim@lge.com \
--cc=hannes@cmpxchg.org \
--cc=heesub.shin@samsung.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=isimatu.yasuaki@jp.fujitsu.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=lauraa@codeaurora.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=m.szyprowski@samsung.com \
--cc=mgorman@suse.de \
--cc=minchan@kernel.org \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
--cc=ritesh.list@gmail.com \
--cc=stable@vger.kernel.org \
--cc=t.stanislaws@samsung.com \
--cc=tangchen@cn.fujitsu.com \
--cc=vbabka@suse.cz \
--cc=wency@cn.fujitsu.com \
--cc=zhangyanfei@cn.fujitsu.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.