linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: "Leonardo Brás" <leobras.c@gmail.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Scott Cheloha <cheloha@linux.ibm.com>,
	linux-kernel@vger.kernel.org, Nicholas Piggin <npiggin@gmail.com>,
	Paul Mackerras <paulus@samba.org>,
	Sandipan Das <sandipan@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Laurent Dufour <ldufour@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org, Mike Rapoport <rppt@kernel.org>
Subject: Re: [PATCH v2 1/3] powerpc/mm/hash: Avoid resizing-down HPT on first memory hotplug
Date: Wed, 9 Jun 2021 14:40:59 +1000	[thread overview]
Message-ID: <YMBGW3RQOzoQxBqy@yekko> (raw)
In-Reply-To: <648b382159009c5f4277d9b9c3f896142ea75d6c.camel@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3445 bytes --]

On Tue, Jun 08, 2021 at 09:52:10PM -0300, Leonardo Brás wrote:
> On Mon, 2021-06-07 at 15:02 +1000, David Gibson wrote:
> > On Fri, Apr 30, 2021 at 11:36:06AM -0300, Leonardo Bras wrote:
> > > Because hypervisors may need to create HPTs without knowing the
> > > guest
> > > page size, the smallest used page-size (4k) may be chosen,
> > > resulting in
> > > a HPT that is possibly bigger than needed.
> > > 
> > > On a guest with bigger page-sizes, the amount of entries for HTP
> > > may be
> > > too high, causing the guest to ask for a HPT resize-down on the
> > > first
> > > hotplug.
> > > 
> > > This becomes a problem when HPT resize-down fails, and causes the
> > > HPT resize to be performed on every LMB added, until HPT size is
> > > compatible to guest memory size, causing a major slowdown.
> > > 
> > > So, avoiding HPT resizing-down on hot-add significantly improves
> > > memory
> > > hotplug times.
> > > 
> > > As an example, hotplugging 256GB on a 129GB guest took 710s without
> > > this
> > > patch, and 21s after applied.
> > > 
> > > Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> > 
> > Sorry it's taken me so long to look at these
> > 
> > I don't love the extra statefulness that the 'shrinking' parameter
> > adds, but I can't see an elegant way to avoid it, so:
> > 
> > Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> 
> np, thanks for reviewing!

Actually... I take that back.  With the subsequent patches my
discomfort with the complexity of implementing the batching grew.

I think I can see a simpler way - although it wasn't as clear as I
thought it might be, without some deep history on this feature.

What's going on here is pretty hard to follow, because it starts in
arch-specific code (arch/powerpc/platforms/pseries/hotplug-memory.c)
where it processes the add/remove requests, then goes into generic
code __add_memory() which eventually emerges back in arch specific
code (hash__create_section_mapping()).

The HPT resizing calls are in the "inner" arch specific section,
whereas it's only the outer arch section that has the information to
batch properly.  The mutex and 'shrinking' parameter in Leonardo's
code are all about conveying information from the outer to inner
section.

Now, I think the reason I had the resize calls in the inner section
was to accomodate the notion that a) pHyp might support resizing in
future, and it could come in through a different path with its drmgr
thingy and/or b) bare metal hash architectures might want to implement
hash resizing, and this would make at least part of the path common.

Given the decreasing relevance of hash MMUs, I think we can now safely
say neither of these is ever going to happen.

Therefore, we can simplify things by moving the HPT resize calls into
the pseries LMB code, instead of create/remove_section_mapping.  Then
to do batching without extra complications we just need this logic for
all resizes (both add and remove):

	let new_hpt_order = expected HPT size for new mem size;

	if (new_hpt_order > current_hpt_order)
		resize to new_hpt_order

	add/remove memory

	if (new_hpt_order < current_hpt_order - 1)
		resize to new_hpt_order


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2021-06-09  5:19 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-30 14:36 [PATCH v2 0/3] powerpc/mm/hash: Time improvements for memory hot(un)plug Leonardo Bras
2021-04-30 14:36 ` [PATCH v2 1/3] powerpc/mm/hash: Avoid resizing-down HPT on first memory hotplug Leonardo Bras
2021-06-07  5:02   ` David Gibson
2021-06-09  0:52     ` Leonardo Brás
2021-06-09  4:40       ` David Gibson [this message]
2021-06-09  5:51         ` Leonardo Brás
2021-06-09  6:59           ` David Gibson
2021-04-30 14:36 ` [PATCH v2 2/3] powerpc/mm/hash: Avoid multiple HPT resize-ups on " Leonardo Bras
2021-06-07  5:10   ` David Gibson
2021-06-09  3:09     ` Leonardo Brás
2021-04-30 14:36 ` [PATCH v2 3/3] powerpc/mm/hash: Avoid multiple HPT resize-downs on memory hotunplug Leonardo Bras
2021-06-07  5:20   ` David Gibson
2021-06-09  5:30     ` Leonardo Brás
2021-06-09  6:08       ` David Gibson
2021-04-30 14:41 ` [PATCH v2 0/3] powerpc/mm/hash: Time improvements for memory hot(un)plug Leonardo Bras

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YMBGW3RQOzoQxBqy@yekko \
    --to=david@gibson.dropbear.id.au \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=cheloha@linux.ibm.com \
    --cc=david@redhat.com \
    --cc=ldufour@linux.ibm.com \
    --cc=leobras.c@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=nathanl@linux.ibm.com \
    --cc=npiggin@gmail.com \
    --cc=paulus@samba.org \
    --cc=rppt@kernel.org \
    --cc=sandipan@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).