linuxppc-dev.lists.ozlabs.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com,
	christophe.leroy@csgroup.eu
Cc: foraker1@llnl.gov
Subject: Re: [PATCH v2 2/2] powerpc/mm: Add memory_block_size as a kernel parameter
Date: Tue, 20 Jun 2023 14:53:35 +0200
Message-ID: <2b902e78-be4f-cff4-0e8e-771d2981455d@redhat.com>
In-Reply-To: <87mt0upazj.fsf@mail.lhotse>

On 20.06.23 14:35, Michael Ellerman wrote:
> David Hildenbrand <david@redhat.com> writes:
>> On 09.06.23 08:08, Aneesh Kumar K.V wrote:
>>> Certain devices can possess non-standard memory capacities, not constrained
>>> to multiples of 1GB. Provide a kernel parameter so that we can map the
>>> device memory completely on memory hotplug.
>>
>> So, the unfortunate thing is that these devices would have worked out of
>> the box before the memory block size was increased from 256 MiB to 1 GiB
>> in these setups. Now, one has to fine-tune the memory block size. The
>> only other arch that I know, which supports setting the memory block
>> size, is x86 for special (large) UV systems -- and at least in the past
>> 128 MiB vs. 2 GiB memory blocks made a performance difference during
>> boot (maybe no longer today, who knows).
>>
>>
>> Obviously, less tunable and getting stuff simply working out of the box
>> is preferable.
>>
>> Two questions:
>>
>> 1) Isn't there a way to improve auto-detection to fallback to 256 MiB in
>> these setups, to avoid specifying these parameters?
>>
>> 2) Is the 256 MiB -> 1 GiB memory block size switch really worth it? On
>> x86-64, experiments (with direct map fragmentation) showed that the
>> effective performance boost is pretty insignificant, so I wonder how big
>> the 1 GiB direct map performance improvement is.
> 
> The other issue is simply the number of sysfs entries.
> 
> With 64TB of memory and a 256MB block size you end up with ~250,000
> directories in /sys/devices/system/memory.

Yes, and so far on other archs we only optimize for that on UV x86
systems (with a default of 2 GiB). And that was added before we started
to speed up memory device lookups significantly using a radix tree IIRC.
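
For reference, the directory count quoted above is just total memory
divided by the memory block size, with one sysfs directory per block. A
minimal sketch of that arithmetic, using only the sizes mentioned in this
thread; the helper name is made up for illustration, and rounding up for
a partial trailing block is an assumption:

```python
# Illustrative arithmetic only; sizes are the ones discussed in this thread.
MIB = 1 << 20
GIB = 1 << 30
TIB = 1 << 40


def memory_block_count(total_bytes: int, block_size_bytes: int) -> int:
    """Hypothetical helper: blocks (one sysfs directory each) covering total_bytes."""
    # Ceiling division: assume a partially covered trailing block still counts.
    return -(-total_bytes // block_size_bytes)


total = 64 * TIB
counts = {
    "256 MiB": memory_block_count(total, 256 * MIB),
    "1 GiB": memory_block_count(total, 1 * GIB),
    "2 GiB": memory_block_count(total, 2 * GIB),
}

for label, count in counts.items():
    print(f"{label:>8} block size: {count} directories")
```

With 64 TiB this gives 262,144 directories at 256 MiB, which matches the
"~250,000" figure, versus 65,536 at 1 GiB and 32,768 at 2 GiB.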

It's worth noting that there was a discussion on:

(a) not creating these device sysfs entries (when configured on the 
cmdline); often, nobody really ends up using them to online/offline 
memory blocks. Then, the only primary user is lsmem.

(b) exposing logical devices (e.g., a DIMM) that can only be
offlined/removed as a whole, instead of their individual memblocks (when 
configured on the cmdline). But for PPC64 that won't help.


But (a) gets more tricky if device drivers (and things like dax/kmem) 
rely on user-space memory onlining/offlining.
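
To make concrete what the per-block entries in (a) provide: each memory
block shows up as a memoryN directory with a "state" file, next to a
"block_size_bytes" file that holds the block size in hex. A rough sketch
of a reader for that layout follows; the function name is made up, and
this is not how lsmem or any driver is actually implemented, only an
illustration of the interface that would go away:

```python
from collections import Counter
from pathlib import Path


def summarize_memory_blocks(root="/sys/devices/system/memory"):
    """Hypothetical helper: return (block size in bytes, {state: block count})."""
    root = Path(root)
    # block_size_bytes is a hexadecimal string (no 0x prefix needed by int()).
    block_size = int((root / "block_size_bytes").read_text().strip(), 16)

    states = Counter()
    for entry in root.glob("memory[0-9]*"):
        state_file = entry / "state"
        if state_file.is_file():
            # Typically "online" or "offline"; count whatever is reported.
            states[state_file.read_text().strip()] += 1
    return block_size, dict(states)


if __name__ == "__main__":
    # Only meaningful on a system that exposes the memory block sysfs entries.
    if Path("/sys/devices/system/memory/block_size_bytes").exists():
        size, states = summarize_memory_blocks()
        print(f"block size: {size >> 20} MiB, states: {states}")
```

Anything in user space that walks these directories, or writes to the
state files to online/offline blocks, is what makes (a) harder to do.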

-- 
Cheers,

David / dhildenb


