From: Mike Rapoport <rppt@kernel.org>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Sauerwein, David" <dssauerw@amazon.de>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Ard Biesheuvel <ardb@kernel.org>,
Catalin Marinas <catalin.marinas@arm.com>,
David Hildenbrand <david@redhat.com>,
Marc Zyngier <maz@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Mike Rapoport <rppt@linux.ibm.com>, Will Deacon <will@kernel.org>,
kvmarm@lists.cs.columbia.edu,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH 1/3] mm: Introduce for_each_valid_pfn() and use it from reserve_bootmem_region()
Date: Thu, 3 Apr 2025 09:19:14 +0300
Message-ID: <Z-4oYlsAzZ6OQHCH@kernel.org>
In-Reply-To: <20250402201841.3245371-1-dwmw2@infradead.org>

On Wed, Apr 02, 2025 at 09:18:39PM +0100, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> Especially since commit 9092d4f7a1f8 ("memblock: update initialization
> of reserved pages"), the reserve_bootmem_region() function can spend a
> significant amount of time iterating over every 4KiB PFN in a range,
> calling pfn_valid() on each one, and ultimately doing absolutely nothing.
>
> On a platform used for virtualization, with large NOMAP regions that
> eventually get used for guest RAM, this leads to a significant increase
> in steal time experienced during kexec for a live update.
>
> Introduce for_each_valid_pfn() and use it from reserve_bootmem_region().
> This implementation is precisely the same naïve loop that the function
> used to have, but subsequent commits will provide optimised versions
> for FLATMEM and SPARSEMEM, and this version will remain for those
> architectures which provide their own pfn_valid() implementation,
> until/unless they also provide a matching for_each_valid_pfn().
>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
> include/linux/mmzone.h | 10 ++++++++++
> mm/mm_init.c           | 23 ++++++++++-------------
> 2 files changed, 20 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 25e80b2ca7f4..32ecb5cadbaf 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -2176,6 +2176,16 @@ void sparse_init(void);
> #define subsection_map_init(_pfn, _nr_pages) do {} while (0)
> #endif /* CONFIG_SPARSEMEM */
>
> +/*
> + * Fallback case for when the architecture provides its own pfn_valid() but
> + * not a corresponding for_each_valid_pfn().
> + */
> +#ifndef for_each_valid_pfn
> +#define for_each_valid_pfn(_pfn, _start_pfn, _end_pfn)			\
> +	for ((_pfn) = (_start_pfn); (_pfn) < (_end_pfn); (_pfn)++)	\
> +		if (pfn_valid(_pfn))
> +#endif
> +
> #endif /* !__GENERATING_BOUNDS.H */
> #endif /* !__ASSEMBLY__ */
> #endif /* _LINUX_MMZONE_H */
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index a38a1909b407..7c699bad42ad 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -777,22 +777,19 @@ static inline void init_deferred_page(unsigned long pfn, int nid)
> void __meminit reserve_bootmem_region(phys_addr_t start,
> 				      phys_addr_t end, int nid)
> {
> -	unsigned long start_pfn = PFN_DOWN(start);
> -	unsigned long end_pfn = PFN_UP(end);
> +	unsigned long pfn;
>
> -	for (; start_pfn < end_pfn; start_pfn++) {
> -		if (pfn_valid(start_pfn)) {
> -			struct page *page = pfn_to_page(start_pfn);
> +	for_each_valid_pfn (pfn, PFN_DOWN(start), PFN_UP(end)) {
> +		struct page *page = pfn_to_page(pfn);
>
> -			init_deferred_page(start_pfn, nid);
> +		init_deferred_page(pfn, nid);
>
> -			/*
> -			 * no need for atomic set_bit because the struct
> -			 * page is not visible yet so nobody should
> -			 * access it yet.
> -			 */
> -			__SetPageReserved(page);
> -		}
> +		/*
> +		 * no need for atomic set_bit because the struct
> +		 * page is not visible yet so nobody should
> +		 * access it yet.
> +		 */
> +		__SetPageReserved(page);
> 	}
> }
>
> --
> 2.49.0
>
--
Sincerely yours,
Mike.
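
For readers who want to try the fallback iterator outside the kernel, here is
a minimal user-space sketch of the same for/if macro shape as in the quoted
patch. The pfn_valid() below is a hypothetical stand-in with a made-up
validity map (PFNs 4..7 and 12..13), and count_valid() and first_valid() are
illustrative helpers that are not part of the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's pfn_valid(): in this toy
 * model only PFNs 4..7 and 12..13 are backed by a struct page.
 */
static bool pfn_valid(unsigned long pfn)
{
	return (pfn >= 4 && pfn < 8) || (pfn >= 12 && pfn < 14);
}

/* Same shape as the fallback macro in the patch: a for loop guarding an if. */
#define for_each_valid_pfn(_pfn, _start_pfn, _end_pfn)			\
	for ((_pfn) = (_start_pfn); (_pfn) < (_end_pfn); (_pfn)++)	\
		if (pfn_valid(_pfn))

/* Count the valid PFNs in [start, end). */
static unsigned long count_valid(unsigned long start, unsigned long end)
{
	unsigned long pfn, n = 0;

	for_each_valid_pfn(pfn, start, end)
		n++;
	return n;
}

/*
 * Return the first valid PFN in [start, end). The macro expands to a
 * single for loop, so break leaves the whole iteration. If nothing in
 * the range is valid the result equals end (assuming start <= end).
 */
static unsigned long first_valid(unsigned long start, unsigned long end)
{
	unsigned long pfn;

	for_each_valid_pfn(pfn, start, end)
		break;
	return pfn;
}
```

With this toy map, count_valid(0, 16) visits 4, 5, 6, 7, 12 and 13, and
first_valid(8, 12) falls through to 12. One thing to keep in mind with this
macro shape: because it ends in a bare if, an else written directly after an
unbraced use of it binds to the macro's hidden if, so callers are safer
bracing the body as reserve_bootmem_region() does.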