From: Mike Rapoport <rppt@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Dan Williams <dan.j.williams@intel.com>,
linux-cxl@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
Sudeep Holla <sudeep.holla@arm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Will Deacon <will@kernel.org>, Jia He <justin.he@arm.com>,
Mike Rapoport <rppt@linux.ibm.com>,
linuxarm@huawei.com, catalin.marinas@arm.com,
Anshuman.Khandual@arm.com,
Yuquan Wang <wangyuquan1236@phytium.com.cn>,
Oscar Salvador <osalvador@suse.de>,
Lorenzo Pieralisi <lpieralisi@kernel.org>,
James Morse <james.morse@arm.com>
Subject: Re: [RFC PATCH 8/8] HACK: mm: memory_hotplug: Drop memblock_phys_free() call in try_remove_memory()
Date: Mon, 3 Jun 2024 10:57:28 +0300 [thread overview]
Message-ID: <Zl13aA2ihYsLPcL6@kernel.org> (raw)
In-Reply-To: <b8654dcd-68e1-4823-9588-cfa401104741@redhat.com>
On Fri, May 31, 2024 at 09:49:32AM +0200, David Hildenbrand wrote:
> On 29.05.24 19:12, Jonathan Cameron wrote:
> > I'm not sure what this is balancing, but it if is necessary then the reserved
> > memblock approach can't be used to stash NUMA node assignments as after the
> > first add / remove cycle the entry is dropped so not available if memory is
> > re-added at the same HPA.
> >
> > This patch is here to hopefully spur comments on what this is there for!
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > mm/memory_hotplug.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 431b1f6753c0..3d8dd4749dfc 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -2284,7 +2284,7 @@ static int __ref try_remove_memory(u64 start, u64 size)
> > }
> > if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
> > - memblock_phys_free(start, size);
> > + // memblock_phys_free(start, size);
> > memblock_remove(start, size);
> > }
>
> memblock_phys_free() works on memblock.reserved, memblock_remove() works on
> memblock.memory.
>
> If you take a look at the doc at the top of memblock.c:
>
> memblock.memory: physical memory available to the system
> memblock.reserved: regions that were allocated [during boot]
>
>
> memblock.memory is supposed to be a superset of memblock.reserved. Your
No it's not.
memblock.reserved is more of "if there is memory, don't touch it".
Some regions in memblock.reserved are boot time allocations and they are indeed a
subset of memblock.memory, but some are reservations done by firmware (e.g.
reserved memory in DT) that just might not have a corresponding regions in
memblock.memory. It can happen for example, when the same firmware runs on
devices with different memory configuration, but still wants to preserve
some physical addresses.
> "hack" here indicates that you somehow would be relying on the opposite
> being true, which indicates that you are doing the wrong thing.
I'm not sure about that, I still have to digest the patches :)
> memblock_remove() indeed balances against memblock_add_node() for hotplugged
> memory [add_memory_resource()]. There seem to a case where we would succeed
> in hotunplugging memory that was part of "memblock.reserved".
>
> But how could that happen? I think the following way:
>
> Once the buddy is up and running, memory allocated during early boot is not
> freed back to memblock, but usually we simply go via something like
> free_reserved_page(), not memblock_free() [because the buddy took over]. So
> one could end up unplugging memory that still resides in memblock.reserved
> set.
>
> So with memblock_phys_free(), we are enforcing the invariant that
> memblock.memory is a superset of memblock.reserved.
>
> Likely, arm64 should store that node assignment elsewhere from where it can
> be queried. Or it should be using something like
> CONFIG_HAVE_MEMBLOCK_PHYS_MAP for these static windows.
>
> --
> Cheers,
>
> David / dhildenb
>
--
Sincerely yours,
Mike.
WARNING: multiple messages have this Message-ID (diff)
From: Mike Rapoport <rppt@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Dan Williams <dan.j.williams@intel.com>,
linux-cxl@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
Sudeep Holla <sudeep.holla@arm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Will Deacon <will@kernel.org>, Jia He <justin.he@arm.com>,
Mike Rapoport <rppt@linux.ibm.com>,
linuxarm@huawei.com, catalin.marinas@arm.com,
Anshuman.Khandual@arm.com,
Yuquan Wang <wangyuquan1236@phytium.com.cn>,
Oscar Salvador <osalvador@suse.de>,
Lorenzo Pieralisi <lpieralisi@kernel.org>,
James Morse <james.morse@arm.com>
Subject: Re: [RFC PATCH 8/8] HACK: mm: memory_hotplug: Drop memblock_phys_free() call in try_remove_memory()
Date: Mon, 3 Jun 2024 10:57:28 +0300 [thread overview]
Message-ID: <Zl13aA2ihYsLPcL6@kernel.org> (raw)
In-Reply-To: <b8654dcd-68e1-4823-9588-cfa401104741@redhat.com>
On Fri, May 31, 2024 at 09:49:32AM +0200, David Hildenbrand wrote:
> On 29.05.24 19:12, Jonathan Cameron wrote:
> > I'm not sure what this is balancing, but it if is necessary then the reserved
> > memblock approach can't be used to stash NUMA node assignments as after the
> > first add / remove cycle the entry is dropped so not available if memory is
> > re-added at the same HPA.
> >
> > This patch is here to hopefully spur comments on what this is there for!
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > mm/memory_hotplug.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 431b1f6753c0..3d8dd4749dfc 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -2284,7 +2284,7 @@ static int __ref try_remove_memory(u64 start, u64 size)
> > }
> > if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
> > - memblock_phys_free(start, size);
> > + // memblock_phys_free(start, size);
> > memblock_remove(start, size);
> > }
>
> memblock_phys_free() works on memblock.reserved, memblock_remove() works on
> memblock.memory.
>
> If you take a look at the doc at the top of memblock.c:
>
> memblock.memory: physical memory available to the system
> memblock.reserved: regions that were allocated [during boot]
>
>
> memblock.memory is supposed to be a superset of memblock.reserved. Your
No it's not.
memblock.reserved is more of "if there is memory, don't touch it".
Some regions in memblock.reserved are boot time allocations and they are indeed a
subset of memblock.memory, but some are reservations done by firmware (e.g.
reserved memory in DT) that just might not have a corresponding regions in
memblock.memory. It can happen for example, when the same firmware runs on
devices with different memory configuration, but still wants to preserve
some physical addresses.
> "hack" here indicates that you somehow would be relying on the opposite
> being true, which indicates that you are doing the wrong thing.
I'm not sure about that, I still have to digest the patches :)
> memblock_remove() indeed balances against memblock_add_node() for hotplugged
> memory [add_memory_resource()]. There seem to a case where we would succeed
> in hotunplugging memory that was part of "memblock.reserved".
>
> But how could that happen? I think the following way:
>
> Once the buddy is up and running, memory allocated during early boot is not
> freed back to memblock, but usually we simply go via something like
> free_reserved_page(), not memblock_free() [because the buddy took over]. So
> one could end up unplugging memory that still resides in memblock.reserved
> set.
>
> So with memblock_phys_free(), we are enforcing the invariant that
> memblock.memory is a superset of memblock.reserved.
>
> Likely, arm64 should store that node assignment elsewhere from where it can
> be queried. Or it should be using something like
> CONFIG_HAVE_MEMBLOCK_PHYS_MAP for these static windows.
>
> --
> Cheers,
>
> David / dhildenb
>
--
Sincerely yours,
Mike.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2024-06-03 7:59 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-29 17:12 [RFC PATCH 0/8] arm64/memblock: Handling of CXL Fixed Memory Windows Jonathan Cameron
2024-05-29 17:12 ` Jonathan Cameron
2024-05-29 17:12 ` [RFC PATCH 1/8] arm64: numa: Introduce a memory_add_physaddr_to_nid() Jonathan Cameron
2024-05-29 17:12 ` Jonathan Cameron
2024-08-01 7:50 ` Yuquan Wang
2024-05-29 17:12 ` [RFC PATCH 2/8] arm64: memblock: Introduce a generic phys_addr_to_target_node() Jonathan Cameron
2024-05-29 17:12 ` Jonathan Cameron
2024-08-01 7:52 ` Yuquan Wang
2024-05-29 17:12 ` [RFC PATCH 3/8] mm: memblock: Add a means to add to memblock.reserved Jonathan Cameron
2024-05-29 17:12 ` Jonathan Cameron
2024-08-01 7:53 ` Yuquan Wang
2024-05-29 17:12 ` [RFC PATCH 4/8] arch_numa: Avoid onlining empty NUMA nodes Jonathan Cameron
2024-05-29 17:12 ` Jonathan Cameron
2024-08-01 7:53 ` Yuquan Wang
2024-05-29 17:12 ` [RFC PATCH 5/8] arch_numa: Make numa_add_memblk() set nid for memblock.reserved regions Jonathan Cameron
2024-05-29 17:12 ` Jonathan Cameron
2024-08-01 7:54 ` Yuquan Wang
2024-05-29 17:12 ` [RFC PATCH 6/8] arm64: mm: numa_fill_memblks() to add a memblock.reserved region if match Jonathan Cameron
2024-05-29 17:12 ` Jonathan Cameron
2024-08-01 7:54 ` Yuquan Wang
2024-05-29 17:12 ` [RFC PATCH 7/8] acpi: srat: cxl: Skip zero length CXL fixed memory windows Jonathan Cameron
2024-05-29 17:12 ` Jonathan Cameron
2024-08-01 7:55 ` Yuquan Wang
2024-05-29 17:12 ` [RFC PATCH 8/8] HACK: mm: memory_hotplug: Drop memblock_phys_free() call in try_remove_memory() Jonathan Cameron
2024-05-29 17:12 ` Jonathan Cameron
2024-05-30 10:07 ` Oscar Salvador
2024-05-30 10:07 ` Oscar Salvador
2024-05-30 12:14 ` Jonathan Cameron
2024-05-30 12:14 ` Jonathan Cameron
2024-05-31 7:49 ` David Hildenbrand
2024-05-31 7:49 ` David Hildenbrand
2024-05-31 9:48 ` Jonathan Cameron
2024-05-31 9:48 ` Jonathan Cameron
2024-05-31 9:55 ` David Hildenbrand
2024-05-31 9:55 ` David Hildenbrand
2024-06-06 15:44 ` Mike Rapoport
2024-06-06 15:44 ` Mike Rapoport
2024-06-03 7:57 ` Mike Rapoport [this message]
2024-06-03 7:57 ` Mike Rapoport
2024-06-03 9:14 ` David Hildenbrand
2024-06-03 9:14 ` David Hildenbrand
2024-06-03 10:43 ` Mike Rapoport
2024-06-03 10:43 ` Mike Rapoport
2024-06-03 20:53 ` David Hildenbrand
2024-06-03 20:53 ` David Hildenbrand
2024-06-04 9:35 ` Mike Rapoport
2024-06-04 9:35 ` Mike Rapoport
2024-06-04 9:39 ` David Hildenbrand
2024-06-04 9:39 ` David Hildenbrand
2024-06-05 8:00 ` Mike Rapoport
2024-06-05 8:00 ` Mike Rapoport
2024-06-05 8:23 ` David Hildenbrand
2024-06-05 8:23 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zl13aA2ihYsLPcL6@kernel.org \
--to=rppt@kernel.org \
--cc=Anshuman.Khandual@arm.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=catalin.marinas@arm.com \
--cc=dan.j.williams@intel.com \
--cc=david@redhat.com \
--cc=james.morse@arm.com \
--cc=justin.he@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-cxl@vger.kernel.org \
--cc=linuxarm@huawei.com \
--cc=lpieralisi@kernel.org \
--cc=osalvador@suse.de \
--cc=rppt@linux.ibm.com \
--cc=sudeep.holla@arm.com \
--cc=wangyuquan1236@phytium.com.cn \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.