From: vbabka@suse.cz (Vlastimil Babka)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCHv2] mm: Don't offset memmap for flatmem
Date: Thu, 29 Jan 2015 14:13:38 +0100
Message-ID: <54CA3202.8020609@suse.cz>
In-Reply-To: <20150126155617.GA2395@suse.de>
On 01/26/2015 04:56 PM, Mel Gorman wrote:
> On Fri, Jan 23, 2015 at 10:05:48AM +0100, Vlastimil Babka wrote:
>> On 01/23/2015 01:33 AM, Laura Abbott wrote:
>>> On 1/22/2015 4:20 PM, Andrew Morton wrote:
>>>>
>>>> I don't think v2 addressed Vlastimil's review comment?
>>>>
>>>
>>> We're still adding the offset to node_mem_map and then subtracting it from
>>> just mem_map. Did I miss another comment somewhere?
>>
>> Yes that was addressed, thanks. But I don't feel comfortable acking
>> it yet, as I have no idea if we are doing the right thing for
>> CONFIG_HAVE_MEMBLOCK_NODE_MAP && CONFIG_FLATMEM case here.
>>
>> Also putting the CONFIG_FLATMEM && !CONFIG_HAVE_MEMBLOCK_NODE_MAP
>> under the "if (page_to_pfn(mem_map) != pgdat->node_start_pfn)" will
>> probably do the right thing, but looks like a weird test for this
>> case here.
>>
>> I have no good suggestion though, so let's CC Mel who apparently
>> wrote the ARCH_PFN_OFFSET correction?
>>
>
> I don't recall introducing ARCH_PFN_OFFSET, are you sure it was me? I'm just
> back today after been offline a week so didn't review the patch but IIRC,
> ARCH_PFN_OFFSET deals with the case where physical memory does not start
> at 0. Without the offset, virtual _PAGE_OFFSET would not physical page 0.
> I don't recall it being related to the alignment of node 0 so if there
> are crashes due to misalignment of node 0 and the fix is ARCH_PFN_OFFSET
> related then I'm surprised.
You're right that ARCH_PFN_OFFSET wasn't added by you, but by commit
467bc461d2 which was a bugfix to your commit c713216dee, which did
introduce the mem_map correction code, and after which the code looked like:

	mem_map = NODE_DATA(0)->node_mem_map;
#ifdef CONFIG_ARCH_POPULATES_NODE_MAP
	if (page_to_pfn(mem_map) != pgdat->node_start_pfn)
		mem_map -= pgdat->node_start_pfn;
#endif /* CONFIG_ARCH_POPULATES_NODE_MAP */
It's from 2006 so I can't expect you to remember the details, but I had
some trouble finding out what this does. I assume it makes sure that
mem_map points to the struct page corresponding to pfn 0, because that's
what translations using mem_map expect.

But pgdat->node_mem_map points to the struct page corresponding to
pgdat->node_start_pfn, which might not be 0. So the code subtracts
node_start_pfn to fix that. This is OK, as the node_mem_map is allocated
(in this very function) with padding so that it covers a
MAX_ORDER_NR_PAGES-aligned area, and node_mem_map may point into the
middle of it.
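
To make that translation concrete, here is a minimal user-space sketch of
what I think the correction does. It is not kernel code: page_idx,
set_flatmem_mem_map() and the numbers used with it are made up for
illustration, and a struct page pointer is modelled as a plain signed index
so that stepping below the start of the map stays well defined in C.

```c
/*
 * Hypothetical model of the FLATMEM translation (2006 variant, before
 * ARCH_PFN_OFFSET was taken into account). A "struct page *" is
 * represented as an index in units of struct page.
 */
typedef long page_idx;

static page_idx mem_map;	/* models the global mem_map */

/* FLATMEM-style translations: both are relative to mem_map. */
static unsigned long page_to_pfn(page_idx page)
{
	return (unsigned long)(page - mem_map);
}

static page_idx pfn_to_page(unsigned long pfn)
{
	return mem_map + (long)pfn;
}

/*
 * Models the quoted correction: node_mem_map is the struct page for
 * node_start_pfn, so mem_map is moved back until it stands for pfn 0.
 */
static void set_flatmem_mem_map(page_idx node_mem_map,
				unsigned long node_start_pfn)
{
	mem_map = node_mem_map;
	if (page_to_pfn(mem_map) != node_start_pfn)
		mem_map -= (long)node_start_pfn;
}
```

With node_mem_map at index 1000 and node_start_pfn = 256, mem_map ends up
at 744, and page_to_pfn(1000) then yields 256 as the translations expect.
With node_start_pfn = 0 the condition is false and mem_map is left alone.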
Commit 467bc461d2 fixed this for the case where the first pfn is not 0,
but ARCH_PFN_OFFSET. So mem_map points to the struct page corresponding
to pfn=ARCH_PFN_OFFSET, which is OK. But I still have a few doubts:
1) The "if (page_to_pfn(mem_map) != pgdat->node_start_pfn)" test sort of
silently assumes that mem_map is allocated at the beginning of the node,
i.e. at pgdat->node_start_pfn. And the only reason for this condition
to be true is that we haven't yet corrected the page_to_pfn translation,
which uses mem_map. Is this assumption always OK to make? Shouldn't the
condition instead be about pgdat->node_start_pfn not being aligned?

2) The #ifdef guard is about CONFIG_ARCH_POPULATES_NODE_MAP, which is
nowadays called CONFIG_HAVE_MEMBLOCK_NODE_MAP. But shouldn't it be
#ifdef CONFIG_FLATMEM instead? After all, we are correcting the value of
mem_map based on the page_to_pfn variant used on FLATMEM. arm doesn't
define CONFIG_ARCH_POPULATES_NODE_MAP but apparently needs this correction.

3) The node_mem_map allocation code aligns the allocation to
MAX_ORDER_NR_PAGES, so the offset between the start of the allocated map
and where node_mem_map points to will be up to MAX_ORDER_NR_PAGES.
However, here we subtract (in the current kernel) (pgdat->node_start_pfn -
ARCH_PFN_OFFSET). That looks like another silent assumption, namely that
pgdat->node_start_pfn is always between ARCH_PFN_OFFSET and
ARCH_PFN_OFFSET + MAX_ORDER_NR_PAGES. If it were larger, the mem_map
correction would subtract too much and end up below what was allocated
for node_mem_map, no? The bug report behind this patch said that the first
2MB of memory was reserved using "no-map flag using DT". Unless this
somehow translates to ARCH_PFN_OFFSET at build time, we would underflow
mem_map, right? Maybe I'm just overly paranoid here and of course
ARCH_PFN_OFFSET is determined properly on arm...
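
The arithmetic behind doubt 3) can be sketched in user space as follows.
This is only my reading of the allocation code, not a copy of it: the
helper names map_padding() and mem_map_underflows() are hypothetical, the
value of MAX_ORDER_NR_PAGES is an assumption (1024 pages), and the sketch
assumes node_start_pfn >= arch_pfn_offset.

```c
#include <stdbool.h>

/* Assumed value for illustration only; must be a power of two. */
#define MAX_ORDER_NR_PAGES 1024UL

/*
 * Number of struct pages between the start of the allocated map and
 * the entry that node_mem_map points to, given that the map starts at
 * node_start_pfn rounded down to MAX_ORDER_NR_PAGES.
 */
static unsigned long map_padding(unsigned long node_start_pfn)
{
	unsigned long start = node_start_pfn & ~(MAX_ORDER_NR_PAGES - 1);

	return node_start_pfn - start;
}

/*
 * True if "mem_map -= (node_start_pfn - arch_pfn_offset)" would move
 * mem_map below the start of the allocated map.
 */
static bool mem_map_underflows(unsigned long node_start_pfn,
			       unsigned long arch_pfn_offset)
{
	unsigned long correction = node_start_pfn - arch_pfn_offset;

	return correction > map_padding(node_start_pfn);
}
```

With made-up numbers, arch_pfn_offset = 0x80000 and node_start_pfn =
0x80200 give a padding of 0x200 and a correction of 0x200, so mem_map lands
exactly on the start of the allocation. With node_start_pfn = 0x80600 the
padding is still 0x200 but the correction is 0x600, which is the underflow
I'm worried about.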
If anyone can confirm my doubts or point me to what I'm missing, thanks.
WARNING: multiple messages have this Message-ID (diff)
From: Vlastimil Babka <vbabka@suse.cz>
To: Mel Gorman <mgorman@suse.de>
Cc: Laura Abbott <lauraa@codeaurora.org>,
Andrew Morton <akpm@linux-foundation.org>,
Srinivas Kandagatla <srinivas.kandagatla@linaro.org>,
linux-arm-kernel@lists.infradead.org,
Russell King - ARM Linux <linux@arm.linux.org.uk>,
ssantosh@kernel.org, Kevin Hilman <khilman@linaro.org>,
Arnd Bergman <arnd@arndb.de>, Stephen Boyd <sboyd@codeaurora.org>,
linux-mm@kvack.org, Kumar Gala <galak@codeaurora.org>
Subject: Re: [PATCHv2] mm: Don't offset memmap for flatmem
Date: Thu, 29 Jan 2015 14:13:38 +0100
Message-ID: <54CA3202.8020609@suse.cz>
In-Reply-To: <20150126155617.GA2395@suse.de>