From: Andrew Morton <akpm@linux-foundation.org>
To: Jiang Liu <jiang.liu@huawei.com>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>,
Jianguo Wu <wujianguo@huawei.com>,
Chris Clayton <chris2553@googlemail.com>,
"Rafael J. Wysocki" <rjw@sisk.pl>, Mel Gorman <mgorman@suse.de>,
Minchan Kim <minchan@kernel.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Michal Hocko <mhocko@suse.cz>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: fix a regression with HIGHMEM introduced by changeset 7f1290f2f2a4d
Date: Tue, 6 Nov 2012 12:43:15 -0800
Message-ID: <20121106124315.79deb2bc.akpm@linux-foundation.org>
In-Reply-To: <1352165517-9732-1-git-send-email-jiang.liu@huawei.com>
On Tue, 6 Nov 2012 09:31:57 +0800
Jiang Liu <jiang.liu@huawei.com> wrote:
> Changeset 7f1290f2f2 tries to fix a issue when calculating
> zone->present_pages, but it causes a regression to 32bit systems with
> HIGHMEM. With that changeset, function reset_zone_present_pages()
> resets all zone->present_pages to zero, and fixup_zone_present_pages()
> is called to recalculate zone->present_pages when boot allocator frees
> core memory pages into buddy allocator. Because highmem pages are not
> freed by bootmem allocator, all highmem zones' present_pages becomes
> zero.
>
> Actually there's no need to recalculate present_pages for highmem zone
> because bootmem allocator never allocates pages from them. So fix the
> regression by skipping highmem in function reset_zone_present_pages()
> and fixup_zone_present_pages().
>
> ...
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6108,7 +6108,8 @@ void reset_zone_present_pages(void)
>  	for_each_node_state(nid, N_HIGH_MEMORY) {
>  		for (i = 0; i < MAX_NR_ZONES; i++) {
>  			z = NODE_DATA(nid)->node_zones + i;
> -			z->present_pages = 0;
> +			if (!is_highmem(z))
> +				z->present_pages = 0;
>  		}
>  	}
>  }
> @@ -6123,10 +6124,11 @@ void fixup_zone_present_pages(int nid, unsigned long start_pfn,
>
>  	for (i = 0; i < MAX_NR_ZONES; i++) {
>  		z = NODE_DATA(nid)->node_zones + i;
> +		if (is_highmem(z))
> +			continue;
> +
>  		zone_start_pfn = z->zone_start_pfn;
>  		zone_end_pfn = zone_start_pfn + z->spanned_pages;
> -
> -		/* if the two regions intersect */
>  		if (!(zone_start_pfn >= end_pfn || zone_end_pfn <= start_pfn))
>  			z->present_pages += min(end_pfn, zone_end_pfn) -
>  					max(start_pfn, zone_start_pfn);

This ... isn't very nice. It embeds within
reset_zone_present_pages() and fixup_zone_present_pages() knowledge
about their caller's state. Or, more specifically, it is embedding
knowledge about the overall state of the system when these functions
are called.

I mean, a function called "reset_zone_present_pages" should reset
->present_pages!

The fact that fixup_zone_present_pages() has multiple call sites makes
this all even more risky. And what are the interactions between this
and memory hotplug?

Can we find a cleaner fix?

Please tell us more about what's happening here. Is it the case that
reset_zone_present_pages() is being called *after* highmem has been
populated? If so, then fixup_zone_present_pages() should work
correctly for highmem? Or is it the case that highmem hasn't yet been
set up? IOW, what is the sequence of operations here?

Is the problem that we're *missing* a call to
fixup_zone_present_pages(), perhaps? If we call
fixup_zone_present_pages() after highmem has been populated,
fixup_zone_present_pages() should correctly fill in the highmem zone's
->present_pages?
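
To make the question about ordering concrete, here is a minimal userspace sketch in C of the reset/fixup pair as it behaves without the patch. It is not kernel code: struct zone_sketch, simulate_boot(), the helper names, and the zone sizes (896 MiB of lowmem and 128 MiB of highmem at 4 KiB pages) are all assumptions made up for illustration, and the real boot sequence may differ from the one modelled here. Only the overlap test and the min/max arithmetic are taken from the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zone: only the fields the patch touches. */
struct zone_sketch {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
	unsigned long present_pages;
};

enum { NR_ZONES = 2 };	/* index 0: lowmem, index 1: highmem (assumed layout) */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Unpatched reset: every zone, highmem included, is zeroed. */
static void reset_present_pages(struct zone_sketch *zones, size_t n)
{
	for (size_t i = 0; i < n; i++)
		zones[i].present_pages = 0;
}

/* Unpatched fixup: credit [start_pfn, end_pfn) to each zone it overlaps. */
static void fixup_present_pages(struct zone_sketch *zones, size_t n,
				unsigned long start_pfn, unsigned long end_pfn)
{
	assert(start_pfn <= end_pfn);

	for (size_t i = 0; i < n; i++) {
		unsigned long zs = zones[i].zone_start_pfn;
		unsigned long ze = zs + zones[i].spanned_pages;

		/* same intersection test as in the quoted hunk */
		if (!(zs >= end_pfn || ze <= start_pfn))
			zones[i].present_pages += min_ul(end_pfn, ze) -
						  max_ul(start_pfn, zs);
	}
}

/*
 * Returns the highmem zone's present_pages after a simulated boot.
 * fixup_highmem selects whether a second fixup call covers the highmem
 * range, which is the "missing call" possibility raised above.
 */
unsigned long simulate_boot(int fixup_highmem)
{
	struct zone_sketch zones[NR_ZONES] = {
		{ .zone_start_pfn = 0,      .spanned_pages = 229376, .present_pages = 229376 },
		{ .zone_start_pfn = 229376, .spanned_pages = 32768,  .present_pages = 32768 },
	};

	reset_present_pages(zones, NR_ZONES);

	/* assumption: the boot allocator only ever frees lowmem pages */
	fixup_present_pages(zones, NR_ZONES, 0, 229376);

	if (fixup_highmem)
		fixup_present_pages(zones, NR_ZONES, 229376, 229376 + 32768);

	return zones[1].present_pages;
}
```

Under these assumptions, simulate_boot(0) leaves the highmem zone at 0 pages, which is the reported regression, while simulate_boot(1) restores all 32768 pages without either function needing to know about highmem. Whether a suitable call site exists in the real boot path is exactly what the questions above are asking.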