From: Dave Hansen <dave@linux.vnet.ibm.com>
To: gerald.schaefer@de.ibm.com
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, schwidefsky@de.ibm.com,
heiko.carstens@de.ibm.com, kamezawa.hiroyu@jp.fujitsu.com,
y-goto@jp.fujitsu.com
Subject: Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
Date: Mon, 27 Oct 2008 10:25:59 -0700
Message-ID: <1225128359.12673.101.camel@nimitz>
In-Reply-To: <4905F114.3030406@de.ibm.com>
On Mon, 2008-10-27 at 17:49 +0100, Gerald Schaefer wrote:
> My last bugfix here (adding zone->lock) introduced a new problem: Using
> pfn_to_page(pfn) to get the zone after the for() loop is wrong. pfn then
> points to the first pfn after end_pfn, which may be in a different zone
> or not present at all. This may lead to an addressing exception in
> page_zone() or spin_lock_irqsave().

I'm not sure I follow.  Let's look at the code, pre-patch:

> 	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
> 		page = __first_valid_page(pfn, pageblock_nr_pages);
> 		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> 			break;
> 	}
> 	if (pfn < end_pfn)
> 		return -EBUSY;

We have two ways out of the loop:

1. 'page' is valid and not isolated, so we did a 'break'.
2. No page in the range hit (1), and we left the loop because the
   for() condition (pfn < end_pfn) no longer held.

So, when the condition you mentioned in your changelog above happens,
"pfn then points to the first pfn after end_pfn", we jump out at the
'return -EBUSY;'.  We never do pfn_to_page() in that case since we've
already returned.

Either 'page' is valid *OR* you return -EBUSY.  I don't think you need
to check both.

> Using the last valid page that was found inside the for() loop, instead
> of pfn_to_page(), should fix this.
> @@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
>  		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
>  			break;
>  	}
> -	if (pfn < end_pfn)
> +	if ((pfn < end_pfn) || !page)
>  		return -EBUSY;
>  	/* Check all pages are free or Marked as ISOLATED */
> -	zone = page_zone(pfn_to_page(pfn));
> +	zone = page_zone(page);
I think this patch fixes the bug, but for reasons other than what you
said. :)
The trouble here is that the 'pfn' could have been in the middle of a
hole somewhere, which __first_valid_page() worked around. Since you
saved off the result of __first_valid_page(), it ends up being OK with
your patch.
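
To make the hole case concrete, here is a rough userspace sketch, not the kernel code. Every toy_* name and the 8-pfn layout are made up for illustration, and toy_first_valid_page() only approximates what __first_valid_page() is described as doing above: scan forward from 'pfn' and hand back the first page that is actually present, or NULL if the whole block is a hole.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PFNS 8UL

/* Toy stand-in for struct page; it only records which pfn it describes. */
struct toy_page {
	unsigned long pfn;
};

/* Hypothetical layout: pfns 0 and 1 are a hole, pfns 2..7 are present. */
static const bool toy_valid[NR_PFNS] = {
	false, false, true, true, true, true, true, true
};
static struct toy_page toy_mem_map[NR_PFNS];

/* True only if the pfn is backed by memory in this toy layout. */
static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn < NR_PFNS && toy_valid[pfn];
}

/*
 * Blind lookup: indexes the map without asking whether the pfn is
 * present, so for a pfn inside a hole the result is not a usable page.
 */
static struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
	assert(pfn < NR_PFNS);
	return &toy_mem_map[pfn];
}

/* First present page in [pfn, pfn + nr_pages), or NULL if all are holes. */
static struct toy_page *toy_first_valid_page(unsigned long pfn,
					     unsigned long nr_pages)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++) {
		if (toy_pfn_valid(pfn + i)) {
			struct toy_page *page = toy_pfn_to_page(pfn + i);

			page->pfn = pfn + i;
			return page;
		}
	}
	return NULL;
}
```

With a block of 4 pages starting at pfn 0, toy_pfn_valid(0) is false, yet toy_first_valid_page(0, 4) returns the page for pfn 2. Keeping that returned pointer around avoids ever translating a pfn that sits inside the hole.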
Instead of using pfn_to_page() you could also have just called
__first_valid_page() again. But, that would have duplicated a bit of
work, even though not much in practice because the caches are still hot.
Technically, you wouldn't even need to check the return from
__first_valid_page() since you know it has a valid result because you
made the exact same call a moment before.
Anyway, can you remove the !page check, fix up the changelog and resend?
-- Dave