From: Daniel J Blueman <daniel@numascale.com>
To: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	"x86@kernel.org" <x86@kernel.org>, Borislav Petkov <bp@suse.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Steffen Persvold <sp@numascale.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	kim.naru@amd.com,
	Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>,
	Myron Stowe <myron.stowe@redhat.com>
Subject: Re: [PATCH] Fix northbridge quirk to assign correct NUMA node
Date: Sun, 23 Mar 2014 22:30:17 +0800
Message-ID: <532EEFF9.4020706@numascale.com>
In-Reply-To: <532C73DA.7060008@amd.com>

On 03/22/2014 01:16 AM, Suravee Suthikulpanit wrote:
> On 3/20/2014 10:38 PM, Daniel J Blueman wrote:
>> On 21/03/2014 06:07, Bjorn Helgaas wrote:
>>> [+cc linux-pci, Myron, Suravee, Kim, Aravind]
>>>
>>> On Thu, Mar 13, 2014 at 5:43 AM, Daniel J Blueman
>>> <daniel@numascale.com> wrote:
>>>> For systems with multiple servers and routed fabric, all northbridges
>>>> get assigned to the first server. Fix this by also using the node
>>>> reported from the PCI bus. For single-fabric systems, the northbridges
>>>> are on PCI bus 0 by definition, which is on NUMA node 0 by definition,
>>>> so this is invariant on most systems.
>>>>
>>>> Tested on fam10h and fam15h single and multi-fabric systems and
>>>> candidate for stable.
>>
>>> I wish this had been cc'd to linux-pci.  We're talking about a related
>>> change by Suravee there.  In fact, we were hoping this quirk could be
>>> removed altogether.
>>
>> Noted.
>>
>>> I don't understand what this quirk is doing.  Normally we discover the
>>> NUMA node for a PCI host bridge via the ACPI _PXM method.  The way
>>> _PXM works is that every PCI device in the hierarchy below the bridge
>>> inherits the same node number as the host bridge.  I first thought
>>> this might be a workaround for a system that lacks _PXM, but I don't
>>> think that can be right, because you're only changing the node for a
>>> few devices, not the whole hierarchy.
>>>
>>> So I suspect the problem is more complicated, and maybe _PXM is
>>> insufficient to describe the topology?  Are there subtrees that should
>>> have nodes different from the host bridge?
>>
>> Yes; see below.
>>
>>> I know this patch is already in v3.14-rc7, but I'd still like to
>>> understand it so we can do the right thing with Suravee's patch.
>>
>> The _PXM method associates each northbridge with the first NUMA node,
>> 0 in single-fabric systems, and e.g. 4 for the second server in a
>> multi-fabric system with 2 dual-module Opterons (with 2 NUMA nodes
>> internally) etc., since the northbridges appear in the PCI tree,
>> under the host bridge, not above it [1].
> Daniel,
>
> That lspci looks interesting, what is the value returned from
> pci_bus_to_node() on your system for each fabric?

pci_bus_to_node() returns 0 for PCI domain 0000, 2 for PCI domain 0001, 4
for PCI domain 0002 and so on.
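
The pattern in those return values can be sketched as below. This is only
an illustration, assuming one PCI domain per server and the same number of
NUMA nodes on every server (2 for the values reported here). The helper
first_node_for_domain() and its parameters are hypothetical names, not a
kernel interface, and the sketch does not call pci_bus_to_node() itself.

```c
#include <assert.h>

/*
 * Hypothetical helper, not a kernel API: first NUMA node of the server
 * that owns a given PCI domain.
 *
 * Assumptions: PCI domains are numbered one per server starting from 0,
 * and every server contributes nodes_per_server NUMA nodes.
 */
int first_node_for_domain(int pci_domain, int nodes_per_server)
{
	assert(pci_domain >= 0 && nodes_per_server > 0);
	return pci_domain * nodes_per_server;
}
```

With nodes_per_server = 2 this gives 0, 2 and 4 for domains 0000, 0001 and
0002. With 4 nodes per server it gives 4 for the second server, as in the
quoted _PXM example.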

Our processor fabric interconnect has HyperTransport NodeId 2 on each
server, so it appears at bus 0, device 0x1a (NodeIds start from bus 0,
device 0x18, of course):
0000:00:1a.0 Host bridge: Device 1b47:0601 (rev 02)
0000:00:1a.1 Host bridge: Device 1b47:0602 (rev 02)
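
As a rough sketch of how the addresses above follow from the NodeId,
assuming HyperTransport nodes sit on bus 0 at device 0x18 + NodeId with at
most 8 nodes per fabric. The names AMD_NB_BASE_DEV, HT_MAX_NODES,
ht_node_to_pci_dev() and format_ht_node_addr() are made up for this
example and are not kernel symbols.

```c
#include <assert.h>
#include <stdio.h>

#define AMD_NB_BASE_DEV 0x18	/* assumed device number of NodeId 0 on bus 0 */
#define HT_MAX_NODES    8	/* assumed limit: devices 0x18..0x1f */

/* Hypothetical helper: PCI device number for a HyperTransport NodeId. */
unsigned int ht_node_to_pci_dev(unsigned int ht_node_id)
{
	assert(ht_node_id < HT_MAX_NODES);
	return AMD_NB_BASE_DEV + ht_node_id;
}

/*
 * Hypothetical helper: format the lspci-style address (domain:bus:dev.fn)
 * of function fn of a HyperTransport node. The bus is fixed at 0.
 * Returns the snprintf() result.
 */
int format_ht_node_addr(char *buf, size_t len, unsigned int domain,
			unsigned int ht_node_id, unsigned int fn)
{
	return snprintf(buf, len, "%04x:00:%02x.%x", domain,
			ht_node_to_pci_dev(ht_node_id), fn);
}
```

For domain 0, NodeId 2 and functions 0 and 1, this reproduces the two
addresses listed above.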

Thanks,
   Daniel
-- 
Daniel J Blueman
Principal Software Engineer, Numascale

