From: Mike Rapoport <rppt@kernel.org>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: linux-kernel@vger.kernel.org,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Andreas Larsson <andreas@gaisler.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>, Borislav Petkov <bp@alien8.de>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Heiko Carstens <hca@linux.ibm.com>,
	Huacai Chen <chenhuacai@kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Jiaxun Yang <jiaxun.yang@flygoat.com>,
	John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Rob Herring <robh@kernel.org>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vasily Gorbik <gor@linux.ibm.com>, Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org, loongarch@lists.linux.dev,
	linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	linux-sh@vger.kernel.org, sparclinux@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-cxl@vger.kernel.org,
	nvdimm@lists.linux.dev, devicetree@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: Re: [PATCH 06/17] x86/numa: simplify numa_distance allocation
Date: Mon, 22 Jul 2024 10:51:25 +0300
Message-ID: <Zp4PfVZKAg3djFOu@kernel.org>
In-Reply-To: <20240719172849.000019a0@Huawei.com>

On Fri, Jul 19, 2024 at 05:28:49PM +0100, Jonathan Cameron wrote:
> On Tue, 16 Jul 2024 14:13:35 +0300
> Mike Rapoport <rppt@kernel.org> wrote:
> 
> > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> > 
> > Allocation of numa_distance uses memblock_phys_alloc_range() to limit
> > allocation to be below the last mapped page.
> > 
> > But NUMA initializaition runs after the direct map is populated and
> 
> initialization (one too many 'i's)

Thanks.
 
> > there is also code in setup_arch() that adjusts memblock limit to
> > reflect how much memory is already mapped in the direct map.
> > 
> > Simplify the allocation of numa_distance and use plain memblock_alloc().
> > This makes the code clearer and ensures that when numa_distance is not
> > allocated it is always NULL.
> Doesn't this break the comment in numa_set_distance() kernel-doc?
> "
>  * If such table cannot be allocated, a warning is printed and further
>  * calls are ignored until the distance table is reset with
>  * numa_reset_distance().
> "
> 
> Superficially that looks to be to avoid repeatedly hitting the
> singleton bit at the top of numa_set_distance() as SRAT or similar
> parsing occurs.

I believe it's there to avoid allocation of numa_distance in the middle of
distance parsing (SLIT or DT numa-distance-map).

If the allocation fails for the first element in the table,
numa_distance and numa_distance_cnt remain zero and node_distance() falls
back to

	return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
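
To make the fallback concrete, here is a minimal user-space sketch of the
lookup, not the kernel code itself. The name node_distance_sketch() and the
negative-index guard are assumptions made for illustration; the two default
values are the generic ones from include/linux/topology.h.

```c
#include <stddef.h>

/* Generic defaults, as defined in include/linux/topology.h. */
#define LOCAL_DISTANCE  10
#define REMOTE_DISTANCE 20

/* Stand-ins for the kernel globals; NULL and 0 mean "no table allocated". */
static unsigned char *numa_distance;
static int numa_distance_cnt;

/*
 * Hypothetical model of the lookup: any node id outside the table
 * (which is every id when numa_distance_cnt is 0) gets the default
 * local/remote distance instead of a table entry.
 */
static int node_distance_sketch(int from, int to)
{
	if (from < 0 || to < 0 ||
	    from >= numa_distance_cnt || to >= numa_distance_cnt)
		return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;

	return numa_distance[from * numa_distance_cnt + to];
}
```

With no table allocated, every lookup takes the first branch, so NUMA still
comes up with usable, if coarse, distances.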

x86 differs here from arch_numa, which always tries to allocate
MAX_NUMNODES * MAX_NUMNODES entries for numa_distance and treats an
allocation failure as a failure to initialize NUMA.

I prefer the general approach x86 uses, i.e. if distance parsing fails in
some way, NUMA is still initialized, with possibly suboptimal distances
between nodes.

I'm going to restore that "singleton" behavior for now and will look into
making this all less cumbersome later.
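
For reference, the "singleton" behavior can be modeled roughly as below. This
is a user-space sketch under stated assumptions, not the x86 code: malloc()
stands in for the memblock allocator, and the *_sketch names, the
simulate_failure switch and the alloc_attempts counter are hypothetical and
exist only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Non-NULL marker meaning "allocation already failed, do not retry". */
#define DISTANCE_ALLOC_FAILED ((unsigned char *)1LU)

static unsigned char *distance_tbl;	/* NULL: table creation is enabled */
static int distance_cnt;		/* stays 0 while no table exists */
static int alloc_attempts;		/* hypothetical counter */
static int simulate_failure;		/* hypothetical failure switch */

static int alloc_distance_sketch(int cnt)
{
	unsigned char *tbl;

	alloc_attempts++;
	tbl = simulate_failure ? NULL : malloc((size_t)cnt * cnt);
	if (!tbl) {
		/* don't retry until explicitly reset */
		distance_tbl = DISTANCE_ALLOC_FAILED;
		return -1;
	}
	memset(tbl, 0, (size_t)cnt * cnt);
	distance_tbl = tbl;
	distance_cnt = cnt;
	return 0;
}

/* Returns 0 if the distance was stored, -1 if the call was ignored. */
static int set_distance_sketch(int from, int to, unsigned char d, int cnt)
{
	/* Allocation is attempted only while the pointer is still NULL. */
	if (!distance_tbl && alloc_distance_sketch(cnt) < 0)
		return -1;

	/* After a failure distance_cnt is 0, so every id is out of range. */
	if (from < 0 || to < 0 || from >= distance_cnt || to >= distance_cnt)
		return -1;

	distance_tbl[from * distance_cnt + to] = d;
	return 0;
}

static void reset_distance_sketch(void)
{
	/* The pointer may be the failure marker, so test the count. */
	if (distance_cnt)
		free(distance_tbl);
	distance_cnt = 0;
	distance_tbl = NULL;	/* enable table creation again */
}
```

In this model a failed allocation is attempted once; later calls are ignored
without touching the allocator until reset_distance_sketch() runs.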
 
> > Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > ---
> >  arch/x86/mm/numa.c | 12 +++---------
> >  1 file changed, 3 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > index 5e1dde26674b..ab2d4ecef786 100644
> > --- a/arch/x86/mm/numa.c
> > +++ b/arch/x86/mm/numa.c
> > @@ -319,8 +319,7 @@ void __init numa_reset_distance(void)
> >  {
> >  	size_t size = numa_distance_cnt * numa_distance_cnt * sizeof(numa_distance[0]);
> >  
> > -	/* numa_distance could be 1LU marking allocation failure, test cnt */
> > -	if (numa_distance_cnt)
> > +	if (numa_distance)
> >  		memblock_free(numa_distance, size);
> >  	numa_distance_cnt = 0;
> >  	numa_distance = NULL;	/* enable table creation */
> > @@ -331,7 +330,6 @@ static int __init numa_alloc_distance(void)
> >  	nodemask_t nodes_parsed;
> >  	size_t size;
> >  	int i, j, cnt = 0;
> > -	u64 phys;
> >  
> >  	/* size the new table and allocate it */
> >  	nodes_parsed = numa_nodes_parsed;
> > @@ -342,16 +340,12 @@ static int __init numa_alloc_distance(void)
> >  	cnt++;
> >  	size = cnt * cnt * sizeof(numa_distance[0]);
> >  
> > -	phys = memblock_phys_alloc_range(size, PAGE_SIZE, 0,
> > -					 PFN_PHYS(max_pfn_mapped));
> > -	if (!phys) {
> > +	numa_distance = memblock_alloc(size, PAGE_SIZE);
> > +	if (!numa_distance) {
> >  		pr_warn("Warning: can't allocate distance table!\n");
> > -		/* don't retry until explicitly reset */
> > -		numa_distance = (void *)1LU;
> >  		return -ENOMEM;
> >  	}
> >  
> > -	numa_distance = __va(phys);
> >  	numa_distance_cnt = cnt;
> >  
> >  	/* fill with the default distances */
> 
> 

-- 
Sincerely yours,
Mike.

