linuxppc-dev.lists.ozlabs.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au,
	npiggin@gmail.com, christophe.leroy@csgroup.eu,
	Tarun Sahu <tsahu@linux.ibm.com>
Cc: foraker1@llnl.gov
Subject: Re: [PATCH v2 2/2] powerpc/mm: Add memory_block_size as a kernel parameter
Date: Mon, 19 Jun 2023 18:28:56 +0200	[thread overview]
Message-ID: <bdb94911-ec9a-02ca-06fc-f850b6c815b2@redhat.com> (raw)
In-Reply-To: <875y7jifzc.fsf@linux.ibm.com>

On 19.06.23 18:17, Aneesh Kumar K.V wrote:
> David Hildenbrand <david@redhat.com> writes:
> 
>> On 09.06.23 08:08, Aneesh Kumar K.V wrote:
>>> Certain devices can possess non-standard memory capacities, not constrained
>>> to multiples of 1GB. Provide a kernel parameter so that we can map the
>>> device memory completely on memory hotplug.
>>
>> So, the unfortunate thing is that these devices would have worked out of
>> the box before the memory block size was increased from 256 MiB to 1 GiB
>> in these setups. Now, one has to fine-tune the memory block size. The
>> only other arch that I know, which supports setting the memory block
>> size, is x86 for special (large) UV systems -- and at least in the past
>> 128 MiB vs. 2 GiB memory blocks made a performance difference during
>> boot (maybe no longer today, who knows).
>>
>>
>> Obviously, less tunable and getting stuff simply working out of the box
>> is preferable.
>>
>> Two questions:
>>
>> 1) Isn't there a way to improve auto-detection to fallback to 256 MiB in
>> these setups, to avoid specifying these parameters?
> 
> The patch does try to detect as much as possible by looking at device tree
> nodes and the aperture window size. But there are still cases where we find
> a memory aperture of size X GB and the device driver hotplugs X.Y GB of memory.
> 

Okay, and I assume we can't detect that case easily.

Which interface is that device driver using to hotplug memory? It's 
quite surprising, I have to say ...
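
For what it's worth, the constraint I'd expect to bite here can be modeled 
roughly as follows. This is a userspace Python sketch, not kernel code; it 
just mirrors the spirit of the hotplug range check (check_hotplug_memory_range()), 
which rejects ranges that are not aligned to the memory block size:

```python
# Rough model of the memory-block alignment check applied on memory
# hotplug (in the spirit of check_hotplug_memory_range()); not kernel code.

MiB = 1 << 20
GiB = 1 << 30

def hotplug_allowed(start, size, block_size):
    """A range can only be hotplugged if both its start address and its
    size are multiples of the memory block size."""
    return start % block_size == 0 and size % block_size == 0

# A device exposing 1.5 GiB of memory at a 1 GiB-aligned address:
start, size = 4 * GiB, 1 * GiB + 512 * MiB

print(hotplug_allowed(start, size, 1 * GiB))    # False: fails with 1 GiB blocks
print(hotplug_allowed(start, size, 256 * MiB))  # True: works with 256 MiB blocks
```

So the same X.Y GB device range that worked with 256 MiB blocks simply 
cannot be represented once the block size is 1 GiB.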

>>
>> 2) Is the 256 MiB -> 1 GiB memory block size switch really worth it? On
>> x86-64, experiments (with direct map fragmentation) showed that the
>> effective performance boost is pretty insignificant, so I wonder how big
>> the 1 GiB direct map performance improvement is.
> 
> 
> Tarun is running some tests to evaluate the impact. We used to always use
> a 1 GiB mapping. This was later switched to use the memory block size to
> fix issues with memory unplug; commit af9d00e93a4f ("powerpc/mm/radix:
> Create separate mappings for hot-plugged memory") explains some of the
> details behind that change.
> 

IIUC, that commit (conditionally) increased the memory block size to 
avoid the splitting, correct? In doing so, it broke the device-driver use case.

> 
>>
>>
>> I guess the only real issue with 256 MiB memory blocks and 1 GiB direct
>> mapping is memory unplug of boot memory: when unplugging a 256 MiB
>> block, one would have to remap the 1 GiB range using 2 MiB ranges.
> 
>>
>> ... I was wondering what would happen if you simply leave the direct
>> mapping in this corner case in place instead of doing this remapping.
>> IOW, remove the memory but keep the direct map pointing at the removed
>> memory. Nobody should be touching it, or are there any cases where that
>> could hurt?
>>
>>
>> Or is there any other reason why we really want 1 GiB memory blocks
>> instead of defaulting to 256 MiB the way it used to be?
>>
> 
> The idea we are working towards is to keep the memory block size small

That would be preferable, yes ...

> but map the boot memory using 1G. An unplug request can split that 1G
> mapping later. We could look at the possibility of leaving that mapping
> without splitting. But not sure why we would want to do that if we can
> correctly split things. Right now there is no splitting support in powerpc.

If splitting over-complicates the matter (and it will even consume more 
memory for page tables), leaving the mapping in place might at least be 
worth looking into. Yes, splitting is cleaner.

I think there is also the option to fail memory offlining (and therefore 
unplug) if we have a 1 GiB mapping and don't want to split it. Hotplugged 
memory could always be unplugged again. aarch64 blocks any boot memory 
from getting unplugged.

But I guess that might break existing use cases (unplugging boot memory) 
on ppc64 that rely on ZONE_MOVABLE to make it work with guarantees, 
right? That could be optimized, but I'm not sure it's the best approach.


-- 
Cheers,

David / dhildenb



Thread overview: 9+ messages
2023-06-09  6:08 [PATCH v2 1/2] powerpc/mm: Cleanup memory block size probing Aneesh Kumar K.V
2023-06-09  6:08 ` [PATCH v2 2/2] powerpc/mm: Add memory_block_size as a kernel parameter Aneesh Kumar K.V
2023-06-13 20:06   ` Reza Arbab
2023-06-19 10:35   ` David Hildenbrand
2023-06-19 16:17     ` Aneesh Kumar K.V
2023-06-19 16:28       ` David Hildenbrand [this message]
2023-06-20 12:35     ` Michael Ellerman
2023-06-20 12:53       ` David Hildenbrand
2023-06-13 19:53 ` [PATCH v2 1/2] powerpc/mm: Cleanup memory block size probing Reza Arbab
