From: Daniel J Blueman <daniel@numascale-asia.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
H Peter Anvin <hpa@zytor.com>,
x86@kernel.org, linux-kernel@vger.kernel.org,
Andreas Herrmann <herrmann.der.user@gmail.com>,
Steffen Persvold <sp@numascale.com>
Subject: Re: [PATCH v3] Add support for AMD64 EDAC on multiple PCI domains
Date: Wed, 31 Oct 2012 13:23:36 +0800 [thread overview]
Message-ID: <5090B5D8.3000209@numascale-asia.com> (raw)
In-Reply-To: <20121029103217.GD4326@liondog.tnic>
On 29/10/2012 18:32, Borislav Petkov wrote:
> + Andreas.
>
> Dude, look at this boot log below:
>
> http://quora.org/2012/16-server-boot-2.txt
>
> That's 192 F10h's!
We were booting 384 a while back, but I'll let you know when we reach 4096!
> On Mon, Oct 29, 2012 at 04:54:59PM +0800, Daniel J Blueman wrote:
>>> A number of other callers lookup the PCI device based on index
>>> 0..amd_nb_num(), but we can't easily allocate contiguous northbridge IDs
>>> from the PCI device in the first place.
>>
>>> OTOH we can simplify this code by changing amd_get_node_id to generate a
>>> linear northbridge ID from the index of the matching entry in the
>>> northbridge array.
>>>
>>> I'll get a patch together to see if there are any snags.
>
> I suspected that after we have this nice approach, you guys would come
> with non-contiguous node numbers. Maan, can't you build your systems so
> that software people can have it easy at least for once??!
It depends on the definition of node, of course. The only change we're
considering is compliance with the Intel x2APIC spec, using the upper
16 bits of the APIC ID as the server ("cluster") ID, since Linux has
optimisations for this.
>> This really is a lot less intrusive [1] and boots well on top of
>> 3.7-rc3 on one of our 16-server/192-core/512GB systems [2].
>>
>> If you're happy with this simpler approach for now, I'll present
>> this and a separate patch cleaning up the inconsistent use of
>> unsigned and u8 node ID variables to u16?
>
> Sure, bring it on.
Yes, I've prepared a patch series and it tests out well.
>> diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
>> index b3341e9..b88fc7a 100644
>> --- a/arch/x86/include/asm/amd_nb.h
>> +++ b/arch/x86/include/asm/amd_nb.h
>> @@ -81,6 +81,18 @@ static inline struct amd_northbridge *node_to_amd_nb(int node)
>>  	return (node < amd_northbridges.num) ?
>>  			&amd_northbridges.nb[node] : NULL;
>>  }
>>
>> +static inline u8 get_node_id(struct pci_dev *pdev)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i != amd_nb_num(); i++)
>> +		if (pci_domain_nr(node_to_amd_nb(i)->misc->bus) == pci_domain_nr(pdev->bus) &&
>> +		    PCI_SLOT(node_to_amd_nb(i)->misc->devfn) == PCI_SLOT(pdev->devfn))
>> +			return i;
>
> Looks ok, can you send the whole patch please?
>
>> + BUG();
>
> I'm not sure about this - maybe WARN()? Are we absolutely sure we
> unconditionally should panic after not finding an NB descriptor?
It looks like the only way we could be looking up a non-existent NB
descriptor is if the array or variable in hand was corrupted. It may be
better to panic immediately than to leave elusive debugging for later.
I've tweaked this to warn and return the first Northbridge ID to avoid
further issues, but even that isn't ideal.
> Btw, this shouldn't happen on those CPUs:
>
> [ 39.279131] TSC synchronization [CPU#0 -> CPU#12]:
> [ 39.287223] Measured 22750019569 cycles TSC warp between CPUs, turning off TSC clock.
> [ 0.030000] tsc: Marking TSC unstable due to check_tsc_sync_source failed
>
> I guess TSCs are not starting at the same moment on all boards.
As these are physically separate servers (off-the-shelf servers in fact,
a key benefit of NumaConnect), the TSC clocks diverge. Later, I'll be
cooking up a patch series to keep them in sync, allowing fast TSC use.
> You definitely need ucode on those too:
>
> [ 113.392460] microcode: CPU0: patch_level=0x00000000
Good tip!
Thanks,
Daniel
--
Daniel J Blueman
Principal Software Engineer, Numascale Asia
Thread overview: 12+ messages
2012-10-25 8:32 [PATCH v3] Add support for AMD64 EDAC on multiple PCI domains Daniel J Blueman
2012-10-25 11:03 ` Borislav Petkov
2012-10-25 11:56 ` Ingo Molnar
2012-10-25 13:59 ` Multiple patch authors (was: Re: [PATCH v3] Add support for AMD64 EDAC on multiple PCI domains) Borislav Petkov
2012-10-25 14:32 ` Multiple patch authors H. Peter Anvin
2012-10-25 14:36 ` Borislav Petkov
2012-10-25 14:41 ` H. Peter Anvin
2012-10-25 15:23 ` Borislav Petkov
2012-10-29 6:17 ` [PATCH v3] Add support for AMD64 EDAC on multiple PCI domains Daniel J Blueman
2012-10-29 8:54 ` Daniel J Blueman
2012-10-29 10:32 ` Borislav Petkov
2012-10-31 5:23 ` Daniel J Blueman [this message]