All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Balbir Singh <balbirs@nvidia.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	"Christian König" <christian.koenig@amd.com>,
	"Kees Cook" <kees@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Bert Karwatzki" <spasswolf@web.de>,
	"Madhavan Srinivasan" <maddy@linux.ibm.com>,
	"Nicholas Piggin" <npiggin@gmail.com>
Subject: Re: [PATCH] arch/x86: memory_hotplug, do not bump up max_pfn for device private pages
Date: Tue, 1 Apr 2025 10:57:21 +0200	[thread overview]
Message-ID: <Z-uqcSYvRD6ZPPQs@gmail.com> (raw)
In-Reply-To: <20250401000752.249348-1-balbirs@nvidia.com>


* Balbir Singh <balbirs@nvidia.com> wrote:

>  arch/x86/mm/init_64.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index dce60767124f..cc60b57473a4 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -970,9 +970,18 @@ int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
>  	ret = __add_pages(nid, start_pfn, nr_pages, params);
>  	WARN_ON_ONCE(ret);
>  
> -	/* update max_pfn, max_low_pfn and high_memory */
> -	update_end_of_memory_vars(start_pfn << PAGE_SHIFT,
> -				  nr_pages << PAGE_SHIFT);
> +	/*
> +	 * add_pages() is called by memremap_pages() for adding device private
> +	 * pages. Do not bump up max_pfn in the device private path. max_pfn
> +	 * changes affect dma_addressing_limited. dma_addressing_limited
> +	 * returning true when max_pfn is the device's addressable memory,
> +	 * can force device drivers to use bounce buffers and impact their
> +	 * performance
> +	 */
> +	if (!params->pgmap)
> +		/* update max_pfn, max_low_pfn and high_memory */
> +		update_end_of_memory_vars(start_pfn << PAGE_SHIFT,
> +					  nr_pages << PAGE_SHIFT);

So given that device private pages are not supposed to be mapped 
directly, not including these PFNs in max_pfn absolutely sounds like 
the correct fix to me.

But wouldn't the abnormally high max_pfn also cause us to create a too 
large direct mapping to cover it, or does something save us there? Such 
an overly large mapping would increase kernel page table size rather 
substantially on non-gbpages systems, AFAICS.

Say we create a 16TB mapping on a 16GB system - 1024x larger: to map 16 
TB with largepages requires 8,388,608 largepage mappings (!), which 
with 8-byte page table entries takes up ~64MB of unswappable RAM. (!!)

Is my math off, or am I misunderstanding something here?

Anyway, I've applied your fix to tip:x86/urgent with a few edits to the 
comments and the changelog, but I've also expanded the Cc: list of the 
commit liberally, in hope of getting more reviews for this fix. :-)

Thanks,

	Ingo

  reply	other threads:[~2025-04-01  8:57 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-01  0:07 [PATCH] arch/x86: memory_hotplug, do not bump up max_pfn for device private pages Balbir Singh
2025-04-01  8:57 ` Ingo Molnar [this message]
2025-04-01  9:24   ` Balbir Singh
2025-04-01  9:08 ` [tip: x86/urgent] x86/mm/init: Handle the special case of device private pages in add_pages(), to not increase max_pfn and trigger dma_addressing_limited() bounce buffers tip-bot2 for Balbir Singh
2025-12-01 22:11 ` [PATCH] arch/x86: memory_hotplug, do not bump up max_pfn for device private pages dan.j.williams
2025-12-01 23:01   ` Gregory Price
2025-12-02 22:40   ` Balbir Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z-uqcSYvRD6ZPPQs@gmail.com \
    --to=mingo@kernel.org \
    --cc=alexander.deucher@amd.com \
    --cc=balbirs@nvidia.com \
    --cc=bhelgaas@google.com \
    --cc=christian.koenig@amd.com \
    --cc=kees@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=maddy@linux.ibm.com \
    --cc=npiggin@gmail.com \
    --cc=peterz@infradead.org \
    --cc=spasswolf@web.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.