linuxppc-dev.lists.ozlabs.org archive mirror
From: Baoquan He <bhe@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: mmorana@amperecomputing.com,
	Catalin Marinas <catalin.marinas@arm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
	Paul Mackerras <paulus@samba.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	sparclinux@vger.kernel.org,
	Alexander Duyck <alexander.h.duyck@linux.intel.com>,
	linux-s390@vger.kernel.org, x86@kernel.org,
	Mike Rapoport <rppt@linux.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Ingo Molnar <mingo@redhat.com>,
	Hoan Tran <Hoan@os.amperecomputing.com>,
	Pavel Tatashin <pavel.tatashin@microsoft.com>,
	lho@amperecomputing.com, Vasily Gorbik <gor@linux.ibm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Will Deacon <will.deacon@arm.com>, Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-arm-kernel@lists.infradead.org,
	Oscar Salvador <osalvador@suse.de>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linuxppc-dev@lists.ozlabs.org,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
Date: Tue, 31 Mar 2020 22:03:32 +0800
Message-ID: <20200331140332.GA2129@MiWiFi-R3L-srv>
In-Reply-To: <20200331085513.GE30449@dhcp22.suse.cz>

Hi Michal,

On 03/31/20 at 10:55am, Michal Hocko wrote:
> On Tue 31-03-20 11:14:23, Mike Rapoport wrote:
> > Maybe I mis-read the code, but I don't see how this could happen. In the
> > HAVE_MEMBLOCK_NODE_MAP=y case, free_area_init_node() calls
> > calculate_node_totalpages() that ensures that node->node_zones are entirely
> > within the node because this is checked in zone_spanned_pages_in_node().
> 
> zone_spanned_pages_in_node does check the zone boundaries are within the
> node boundaries. But that doesn't really tell anything about other
> potential zones interleaving with the physical memory range.
> zone->spanned_pages simply gives the physical range for the zone
> including holes. Interleaving nodes are essentially a hole
> (__absent_pages_in_range is going to skip those).
> 
> That means that free_area_init_core simply goes over the whole
> physical zone range including holes, and that is why we need to check
> both for physical and logical holes (aka other nodes).
> 
> The life would be so much easier if the whole thing would simply iterate
> over memblocks...

Iterating over memblocks sounds like a great idea. I tried putting the
memblock iteration in the upper layer, memmap_init(), which is used for
boot memory only anyway. Do you think this is doable and OK? If yes, I can
work out a formal patch to make this simpler, as you said. The draft code
is below. Done this way, it reuses the existing code and involves little change.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 138a56c0f48f..558d421f294b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6007,14 +6007,6 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		 * function.  They do not exist on hotplugged memory.
 		 */
 		if (context == MEMMAP_EARLY) {
-			if (!early_pfn_valid(pfn)) {
-				pfn = next_pfn(pfn);
-				continue;
-			}
-			if (!early_pfn_in_nid(pfn, nid)) {
-				pfn++;
-				continue;
-			}
 			if (overlap_memmap_init(zone, &pfn))
 				continue;
 			if (defer_init(nid, pfn, end_pfn))
@@ -6130,9 +6122,17 @@ static void __meminit zone_init_free_lists(struct zone *zone)
 }
 
 void __meminit __weak memmap_init(unsigned long size, int nid,
-				  unsigned long zone, unsigned long start_pfn)
+				  unsigned long zone, unsigned long range_start_pfn)
 {
-	memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY, NULL);
+	unsigned long start_pfn, end_pfn;
+	unsigned long range_end_pfn = range_start_pfn + size;
+	int i;
+	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
+		start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
+		end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
+		if (end_pfn > start_pfn)
+			memmap_init_zone(end_pfn - start_pfn, nid, zone, start_pfn, MEMMAP_EARLY, NULL);
+	}
 }
 
 static int zone_batchsize(struct zone *zone)


