linux-acpi.vger.kernel.org archive mirror
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: Shiju Jose <shiju.jose@huawei.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"bp@alien8.de" <bp@alien8.de>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"dferguson@amperecomputing.com" <dferguson@amperecomputing.com>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"leo.duran@amd.com" <leo.duran@amd.com>,
	"Yazen.Ghannam@amd.com" <Yazen.Ghannam@amd.com>,
	"mchehab@kernel.org" <mchehab@kernel.org>,
	Linuxarm <linuxarm@huawei.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"jiaqiyan@google.com" <jiaqiyan@google.com>,
	"Jon.Grimm@amd.com" <Jon.Grimm@amd.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"naoya.horiguchi@nec.com" <naoya.horiguchi@nec.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"jthoughton@google.com" <jthoughton@google.com>,
	"somasundaram.a@hpe.com" <somasundaram.a@hpe.com>,
	"erdemaktas@google.com" <erdemaktas@google.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"duenwen@google.com" <duenwen@google.com>,
	"gthelen@google.com" <gthelen@google.com>,
	"wschwartz@amperecomputing.com" <wschwartz@amperecomputing.com>,
	"wbs@os.amperecomputing.com" <wbs@os.amperecomputing.com>,
	"nifan.cxl@gmail.com" <nifan.cxl@gmail.com>,
	tanxiaofei <tanxiaofei@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	Roberto Sassu <roberto.sassu@huawei.com>,
	"kangkang.shen@futurewei.com" <kangkang.shen@futurewei.com>,
	wanghuiqiang <wanghuiqiang@huawei.com>
Subject: Re: [PATCH v11 1/3] mm: Add support to retrieve physical address range of memory from the node ID
Date: Thu, 21 Aug 2025 10:06:55 +0100
Message-ID: <20250821100655.00003942@huawei.com>
In-Reply-To: <aKX_rk0DasbDgJrS@kernel.org>

On Wed, 20 Aug 2025 20:02:38 +0300
Mike Rapoport <rppt@kernel.org> wrote:

> On Wed, Aug 20, 2025 at 10:00:50AM +0000, Shiju Jose wrote:
> > >-----Original Message-----
> > >From: Jonathan Cameron <jonathan.cameron@huawei.com>
> > >Sent: 20 August 2025 09:54
> > >
> > >On Wed, 20 Aug 2025 10:34:13 +0300
> > >Mike Rapoport <rppt@kernel.org> wrote:
> > >  
> > >> On Tue, Aug 19, 2025 at 05:54:20PM +0100, Jonathan Cameron wrote:  
> > >> > On Tue, 12 Aug 2025 15:26:13 +0100
> > >> > <shiju.jose@huawei.com> wrote:
> > >> >  
> > >> > > From: Shiju Jose <shiju.jose@huawei.com>
> > >> > >
> > >> > > In the numa_memblks, a lookup facility is required to retrieve the
> > >> > > physical address range of memory in a NUMA node. ACPI RAS2 memory
> > >> > > features are among the use cases.
> > >> > >
> > >> > > Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> > >> > > Signed-off-by: Shiju Jose <shiju.jose@huawei.com>  
> > >> >
> > >> > Looks fine to me.  Mike, what do you think?  
> > >>
> > >> I still don't see why we can't use existing functions like
> > >> get_pfn_range_for_nid() or memblock_search_pfn_nid().
> > >>
> > >> Or even node_start_pfn() and node_spanned_pages().  
> > >
> > >Good point.  No reason anyone would scrub this on memory that hasn't been
> > >hotplugged yet, so no need to use numa-memblk to get the info.
> > >I guess I was thinking of the wrong hammer :)
> > >
> > >I'm not sure node_spanned_pages() works though, as we must not include
> > >ranges that might be on another node; otherwise we'd give a wrong
> > >impression of what was being scrubbed.  
> 
> If nodes are not interleaved node_spanned_pages() would work, even if there
> are holes inside the node, like e.g. e820-reserved memory.
> So with non-interleaved nodes node_start_pfn() and either
> node_spanned_pages() or node_end_pfn() will give the node extents and they
> are faster than get_pfn_range_for_nid().
> 
> If the nodes are interleaved, though, a single mem_base, mem_size are not
> enough for a node as there are a few contiguous ranges in that node, e.g.
> 
>   0              4G              8G             12G            16G
>   +-------------+ +-------------+ +-------------+ +-------------+
>   |    node 0   | |    node 1   | |    node 0   | |    node 1   |
>   +-------------+ +-------------+ +-------------+ +-------------+
> 
> I didn't look into the details of the RAS2 driver, but isn't it something
> it should handle?

The aim here is that a query made before a specific range has been set
returns data for at least one range that the scrub controller covers, and
nothing that it doesn't cover. So just presenting the first chunk for a
node is fine. There is plenty of info for userspace to figure things out
if it wants to trigger a scrub on 8-12G in your example, but until it does
we want to return 0-4G as the default range.

I hacked up some SRAT tables to give something like the above for testing.
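
As a rough illustration only (a userspace sketch, not code from the patch
or the RAS2 driver), the difference between the spanned extent of a node
and its first contiguous chunk can be modelled as below. The struct
mem_blk type, the example_blks table and both helper names are
hypothetical, and the table is assumed to be sorted by start address. It
mirrors the 16G interleaved layout in the diagram above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Hypothetical stand-in for one memory block: [start, end) owned by nid. */
struct mem_blk {
	uint64_t start;
	uint64_t end;
	int nid;
};

/* Layout from the diagram: nodes 0 and 1 interleaved in 4G chunks. */
static const struct mem_blk example_blks[] = {
	{  0 * GiB,  4 * GiB, 0 },
	{  4 * GiB,  8 * GiB, 1 },
	{  8 * GiB, 12 * GiB, 0 },
	{ 12 * GiB, 16 * GiB, 1 },
};
#define EXAMPLE_NR_BLKS (sizeof(example_blks) / sizeof(example_blks[0]))

/*
 * Spanned extent: lowest start to highest end of any block on nid.
 * With interleaved nodes this also covers memory owned by other nodes.
 */
static bool nid_spanned_range(const struct mem_blk *blks, size_t nr, int nid,
			      uint64_t *start, uint64_t *end)
{
	bool found = false;

	assert(start && end);
	for (size_t i = 0; i < nr; i++) {
		if (blks[i].nid != nid)
			continue;
		if (!found || blks[i].start < *start)
			*start = blks[i].start;
		if (!found || blks[i].end > *end)
			*end = blks[i].end;
		found = true;
	}
	return found;
}

/*
 * First contiguous chunk: the first block on nid, extended only while the
 * next block is on the same node and starts exactly where the range ends.
 * Assumes blks[] is sorted by start address.
 */
static bool nid_first_contig_range(const struct mem_blk *blks, size_t nr,
				   int nid, uint64_t *start, uint64_t *end)
{
	size_t i = 0;

	assert(start && end);
	while (i < nr && blks[i].nid != nid)
		i++;
	if (i == nr)
		return false;

	*start = blks[i].start;
	*end = blks[i].end;
	for (i++; i < nr && blks[i].nid == nid && blks[i].start == *end; i++)
		*end = blks[i].end;
	return true;
}
```

For node 0 in this table the first helper yields 0-12G, which includes
node 1's 4-8G, while the second yields 0-4G, i.e. the default range
described above.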
> 
> > >Should be able to use some combination of node_start_pfn() and maybe
> > >memblock_search_pfn_nid() to get it though (that also gets the nid we already
> > >know but meh, no real harm in that.)  
> > 
> > Thanks Mike and Jonathan.
> > 
> > The following approaches were tried as you suggested, instead of newly proposed
> > nid_get_mem_physaddr_range().
> > Methods 1 to 3 give the same result as nid_get_mem_physaddr_range(), but
> > Method 4 gives a different value for the size.  
> 
> I believe that's because on x86 the node 0 is really scrambled because of
> e820/efi reservations that never make it to memblock.

Fun question of whether we should take any notice of those.
It would depend on whether anyone's scrub firmware gets confused if we
scrub them and they aren't backed by memory.  If they are backed by memory,
we can rely on system constraints refusing to scrub that stuff at an
'unsafe' level, and if we set it higher than it otherwise would be, the
only possibility is that we see earlier error detections in those ranges
and have to deal with them.

Jonathan


>  
> > Please advise which method should be used for the RAS2?
> > 
> > Thanks,
> > Shiju
> >   
> 



