All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: <linux-pci@vger.kernel.org>, <linuxarm@huawei.com>,
	Ingo Molnar <mingo@kernel.org>, <linux-acpi@vger.kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>, <x86@kernel.org>
Subject: Re: [PATCH] x86: Fix an issue with invalid ACPI numa config
Date: Tue, 11 Dec 2018 09:47:11 +0000	[thread overview]
Message-ID: <20181211094711.00003f3d@huawei.com> (raw)
In-Reply-To: <20181210235637.GA99796@google.com>

On Mon, 10 Dec 2018 17:56:37 -0600
Bjorn Helgaas <helgaas@kernel.org> wrote:

> I don't see this in linux-next so I assume nobody has picked it up yet?
> numa.c isn't my file, but the fix is for something that went in via
> PCI, so I can take this patch if nobody else has already.

Great. I've been assuming it is a safe change to make, but not gotten
a great deal of feedback from the x86 side.

> 
> The changelog could all be wrapped to use more of the line.  Assume 80
> char width and allow for the 4 spaces added by git log.
> 
> On Thu, Nov 15, 2018 at 11:06:17AM +0000, Jonathan Cameron wrote:
> > The addition of support to read the numa node for a PCI  
> 
> s/numa/NUMA/
> 
> > card specified by _PXM resulted in Martin's system not
> > booting.   Looking at the ACPI tables it seems that there
> > are PXM entries for the root ports, but no SRAT table.  
> 
> s/PXM/_PXM/ (and below)
> 
> > The absence of SRAT table results in dummy_numa_init being  
> 
> dummy_numa_init() and similar for other functions below.
> 
> > called.  However, unlike on arm64, this doesn't then result
> > in numa_off being set.  When the PCI code later comes along
> > and calls acpi_get_node for any PCI card below the root port,
> > it navigates up the ACPI tree until it finds the PXM value in
> > the root port. This value is then passed to
> > acpi_map_pxm_to_node.  If numa_off is set this returns,
> > NUMA_NO_NODE (as it does on arm64), on x86 it instead tries  
> 
> This doesn't parse quite right.
> 
> > to allocate a numa node from the unused set without setting
> > up all the infrastructure that would normally accompany such
> > a call.  We have not identified exactly which driver is
> > causing the subsequent hang for Martin.  
> 
> It's not clear to me why dummy_numa_init() must be different between
> arm64 and x86.  This patch makes x86 more like arm64, which is good.
> Maybe a subsequent patch could go even further?

I agree there may be potential here for a bit of unification
at a later stage. There is an awful lot of handling in this
area that is 'similar' between the architectures.

V2 will fix the other things you have commented on.

Thanks,

Jonathan

> 
> > It is invalid under the ACPI spec to specify new
> > numa nodes using PXM if they have no presence in SRAT.
> > Thus the simplest fix is to set numa_off when it is off due
> > to an invalid SRAT (here not present at all).
> > 
> > I do not have easy access to appropriate x86 numa systems so
> > would appreciate some testing of this one!
> > 
> > Known problem boards setups:
> > 
> > AMD Ryzen Threadripper 2950X on ASROCK X399 TAICHI
> > MSI X399 SLI PLUS (probably - not confirmed yet)
> > 
> > The PCI patch has been reverted, so this fix is not critical.
> > 
> > Reported-by: Martin Hundeb?ll <martin@geanix.com>
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Fixes: bad7dcd94f39 ("ACPI/PCI: Pay attention to device-specific _PXM node values")
> > 
> > ---
> >  arch/x86/mm/numa.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > index 1308f5408bf7..ce1182f953ff 100644
> > --- a/arch/x86/mm/numa.c
> > +++ b/arch/x86/mm/numa.c
> > @@ -695,6 +695,8 @@ static int __init dummy_numa_init(void)
> >  	node_set(0, numa_nodes_parsed);
> >  	numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
> >  
> > +	numa_off = true;
> > +
> >  	return 0;
> >  }
> >  
> > -- 
> > 2.18.0
> > 
> >   



      reply	other threads:[~2018-12-11  9:47 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-15 11:06 [PATCH] x86: Fix an issue with invalid ACPI numa config Jonathan Cameron
2018-11-15 11:09 ` Jonathan Cameron
2018-11-20 12:01 ` Peter Zijlstra
2018-11-20 13:19   ` Jonathan Cameron
2018-12-03 10:15     ` Jonathan Cameron
2018-12-10 23:56 ` Bjorn Helgaas
2018-12-11  9:47   ` Jonathan Cameron [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181211094711.00003f3d@huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=helgaas@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.