From: Mel Gorman <mgorman@suse.de>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
Xishi Qiu <qiuxishi@huawei.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] memory-hotplug: bug fix race between isolation and allocation
Date: Thu, 6 Sep 2012 10:24:24 +0100
Message-ID: <20120906092424.GP11266@suse.de>
In-Reply-To: <20120906044903.GA16150@bbox>
On Thu, Sep 06, 2012 at 01:49:03PM +0900, Minchan Kim wrote:
> > > __offline_isolated_pages
> > > /*
> > > * BUG_ON hit or offline page
> > > * which is used by someone
> > > */
> > > BUG_ON(!PageBuddy(page A));
> > >
> >
> > offline_page calling BUG_ON because someone allocated the page is
> > ridiculous. I did not spot where that check is but it should be changed. The
> > correct action is to retry the isolation.
>
> It is in __offline_isolated_pages.
>
> ..
> 	while (pfn < end_pfn) {
> 		if (!pfn_valid(pfn)) {
> 			pfn++;
> 			continue;
> 		}
> 		page = pfn_to_page(pfn);
> 		BUG_ON(page_count(page));
> 		BUG_ON(!PageBuddy(page)); <---- HERE
> 		order = page_order(page);
> 		...
>
> The comment on offline_isolated_pages says the following:
>
> We cannot do rollback at this point
>
> So if the comment is true, BUG_ON does make sense to me.
It's massive overkill. I see no reason why it cannot return EBUSY all the
way back up to offline_pages() and retry with the migration step. It would
both remove that BUG_ON and improve reliability of memory hot-remove.
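
A minimal userspace sketch of that control flow, not kernel code. It assumes a simplified page model, and every `model_*` name is hypothetical: the offline step checks all pages before changing any state, returns -EBUSY instead of hitting BUG_ON, and the caller reruns a stand-in migration step a bounded number of times.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a page; not the kernel's struct page. */
struct model_page {
	bool buddy;	/* true while the page sits free in the allocator */
	int count;	/* reference count; non-zero means someone uses it */
};

/*
 * Model of the offline step. Where the kernel code has
 * BUG_ON(!PageBuddy(page)), this returns -EBUSY without modifying
 * anything, so the caller is free to retry.
 */
static int model_offline_isolated_pages(struct model_page *pages, size_t nr)
{
	size_t i;

	assert(pages != NULL);

	/* First pass: verify every page before changing any state. */
	for (i = 0; i < nr; i++)
		if (pages[i].count != 0 || !pages[i].buddy)
			return -EBUSY;

	/* Second pass: all pages are free, take them out of the allocator. */
	for (i = 0; i < nr; i++)
		pages[i].buddy = false;

	return 0;
}

/* Stand-in for the migration step: frees every page that is in use. */
static void model_migrate_range(struct model_page *pages, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (pages[i].count != 0 || !pages[i].buddy) {
			pages[i].count = 0;
			pages[i].buddy = true;
		}
	}
}

/* Model of offline_pages(): on -EBUSY, migrate again and retry. */
static int model_offline_pages(struct model_page *pages, size_t nr,
			       int max_retries)
{
	int ret = model_offline_isolated_pages(pages, nr);

	while (ret == -EBUSY && max_retries-- > 0) {
		model_migrate_range(pages, nr);
		ret = model_offline_isolated_pages(pages, nr);
	}
	return ret;
}
```

The two-pass check is a design choice of this sketch: because nothing is modified until every page has been verified, there is no partial state to roll back when -EBUSY is returned.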
> But I don't see why we can't retry it as I look through the code.
> Anyway, It's another story which isn't related to this patch.
>
True.
> >
> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> >
> > At no point in the changelog do you actually say what the patch does :/
>
> Argh, I will do.
>
> >
> > > ---
> > > mm/page_isolation.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> > > index acf65a7..4699d1f 100644
> > > --- a/mm/page_isolation.c
> > > +++ b/mm/page_isolation.c
> > > @@ -196,8 +196,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
> > >  			continue;
> > >  		}
> > >  		page = pfn_to_page(pfn);
> > > -		if (PageBuddy(page))
> > > +		if (PageBuddy(page)) {
> > > +			if (get_page_migratetype(page) != MIGRATE_ISOLATE)
> > > +				break;
> > >  			pfn += 1 << page_order(page);
> > > +		}
> >
> > It is possible the page is moved to the MIGRATE_ISOLATE list between when
> > the page was freed to the buddy allocator and this check was made. The
> > page->index information is stale and the impact is that the hotplug
> > operation fails when it could have succeeded. That said, I think it is a
> > very unlikely race that will never happen in practice.
>
> I understand you mean move_freepages which I have missed. Right?
Yes.
> Then, I will fix it, too.
>
> >
> > More importantly, the effect of this patch is that EBUSY gets bubbled all
> > the way up and the hotplug operation fails. This is fine but as the page
> > is free at the time this problem is detected you also have the option
> > of moving the PageBuddy page to the MIGRATE_ISOLATE list at this time
> > if you take the zone lock. This will mean you need to change the name of
> > test_pages_isolated() of course.
>
> Sorry, I don't get your point. Could you elaborate?
You detect a PageBuddy page but it's on the wrong list. Instead of returning
and failing memory-hotremove, move the free page to the correct list at
the time it is detected.
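
As a rough illustration only, here is a userspace sketch of that idea. It assumes a simplified single-threaded model in which every free page is order-0, the zone lock is a plain flag, and all `sketch_*` names are hypothetical rather than kernel APIs. A free page found on the wrong list is moved to the isolate list under the lock instead of failing the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical migrate types; the values index nr_free[] below. */
enum sketch_migratetype { SKETCH_MOVABLE, SKETCH_ISOLATE };

/* Hypothetical model of an order-0 page. */
struct sketch_page {
	bool buddy;			/* free in the allocator */
	enum sketch_migratetype type;	/* free list the page is on */
};

/* Hypothetical model of a zone with per-list free page counts. */
struct sketch_zone {
	bool locked;			/* stands in for the zone lock */
	size_t nr_free[2];		/* indexed by sketch_migratetype */
};

static void sketch_zone_lock(struct sketch_zone *zone)
{
	assert(!zone->locked);
	zone->locked = true;
}

static void sketch_zone_unlock(struct sketch_zone *zone)
{
	assert(zone->locked);
	zone->locked = false;
}

/*
 * Returns true if every page in the range is free and now on the
 * isolate list. A free page on another list is moved, not treated as
 * a failure. Because the function modifies list state, a "test" name
 * no longer fits, hence "test_and_isolate".
 */
static bool sketch_test_and_isolate_range(struct sketch_zone *zone,
					  struct sketch_page *pages,
					  size_t nr)
{
	bool ok = true;
	size_t i;

	sketch_zone_lock(zone);
	for (i = 0; i < nr; i++) {
		if (!pages[i].buddy) {
			ok = false;	/* page is in use: still a failure */
			break;
		}
		if (pages[i].type != SKETCH_ISOLATE) {
			/* Free but on the wrong list: move it. */
			zone->nr_free[pages[i].type]--;
			zone->nr_free[SKETCH_ISOLATE]++;
			pages[i].type = SKETCH_ISOLATE;
		}
	}
	sketch_zone_unlock(zone);

	return ok;
}
```

A page that is actually in use still fails the range in this sketch; only free pages on the wrong list are repaired in place.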
> Is it related to this patch?
No, it's not important and was a suggestion on how it could be made
better. However, retrying hot-remove would be even better again. I'm not
suggesting it be done as part of this series.
--
Mel Gorman
SUSE Labs