From: Valentin Schneider <valentin.schneider@arm.com>
To: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	Sergei Trofimovich <slyfox@gentoo.org>,
	debian-ia64 <debian-ia64@lists.debian.org>
Subject: Re: [PATCH 0/1] sched/topology: NUMA distance deduplication
Date: Wed, 17 Mar 2021 20:04:07 +0000
Message-ID: <87zgz1pmx4.mognet@arm.com>
In-Reply-To: <cf4d7277-54a0-8bc7-60fb-9b2f6befb511@physik.fu-berlin.de>

On 17/03/21 20:47, John Paul Adrian Glaubitz wrote:
> Hello Valentin!
>
> On 3/17/21 8:36 PM, Valentin Schneider wrote:
>> I see ACPI in your boot logs, so I'm guessing you have a bogus SLIT table
>> (the ACPI table with node distances). You should be able to double check
>> this with something like:
>>
>> $ acpidump > acpi.dump
>> $ acpixtract -a acpi.dump
>> $ iasl -d *.dat
>> $ cat slit.dsl
>
> There does not seem to be a SLIT table in my firmware:
>
> root@glendronach:~# acpidump > acpi.dump
> root@glendronach:~# acpixtract -a acpi.dump
>
> Intel ACPI Component Architecture
> ACPI Binary Table Extraction Utility version 20200925
> Copyright (c) 2000 - 2020 Intel Corporation
>
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e91
>   SSDT -    3768 bytes written (0x00000EB8) - ssdt1.dat
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e00
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e91
>   SPCR -      80 bytes written (0x00000050) - spcr.dat
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e00
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e91
>   APIC -     200 bytes written (0x000000C8) - apic.dat
>   SSDT -    1110 bytes written (0x00000456) - ssdt2.dat
>   SSDT -     316 bytes written (0x0000013C) - ssdt3.dat
>   SPMI -      80 bytes written (0x00000050) - spmi.dat
>   DSDT -   58726 bytes written (0x0000E566) - dsdt.dat
>   SSDT -     312 bytes written (0x00000138) - ssdt4.dat
>   SSDT -    2150 bytes written (0x00000866) - ssdt5.dat
>   SSDT -     316 bytes written (0x0000013C) - ssdt6.dat
>   SSDT -    3768 bytes written (0x00000EB8) - ssdt7.dat
>   FACP -     244 bytes written (0x000000F4) - facp.dat
>   SSDT -    1203 bytes written (0x000004B3) - ssdt8.dat
>   CPEP -      52 bytes written (0x00000034) - cpep.dat
>   SSDT -     316 bytes written (0x0000013C) - ssdt9.dat
>   DBGP -      52 bytes written (0x00000034) - dbgp.dat
>   SSDT -    3768 bytes written (0x00000EB8) - ssdt10.dat
>   FACS -      64 bytes written (0x00000040) - facs.dat
> root@glendronach:~#
>
> root@glendronach:~# ls *.dsl *.dat
> apic.dat  cpep.dsl  dsdt.dat  facp.dsl  spcr.dat  spmi.dsl    ssdt1.dat  ssdt2.dsl  ssdt4.dat  ssdt5.dsl  ssdt7.dat  ssdt8.dsl
> apic.dsl  dbgp.dat  dsdt.dsl  facs.dat  spcr.dsl  ssdt10.dat  ssdt1.dsl  ssdt3.dat  ssdt4.dsl  ssdt6.dat  ssdt7.dsl  ssdt9.dat
> cpep.dat  dbgp.dsl  facp.dat  facs.dsl  spmi.dat  ssdt10.dsl  ssdt2.dat  ssdt3.dsl  ssdt5.dat  ssdt6.dsl  ssdt8.dat  ssdt9.dsl
> root@glendronach:~#
>

Huh, then this might be some initialization failure that leaves nr_node_ids
at MAX_NUMNODES, which must be 256 in your case (NODES_SHIFT=8). Devicetree
can provide node distances, but something tells me you're not using that :-)
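
To make the arithmetic concrete, here is a rough userspace sketch of that
guess. It is not the actual kernel code: it assumes nr_node_ids starts out at
MAX_NUMNODES (1 << NODES_SHIFT) and that some init step is expected to shrink
it to "highest possible node + 1". The possible[] array and both function
names are hypothetical stand-ins, and NODES_SHIFT=8 is only the value guessed
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: NODES_SHIFT=8, as guessed in this thread. */
#define NODES_SHIFT   8
#define MAX_NUMNODES  (1 << NODES_SHIFT)

/* Starts out at the compile-time maximum (256 here). */
static unsigned int nr_node_ids = MAX_NUMNODES;

/*
 * Hypothetical stand-in for a possible-node mask:
 * possible[n] is true if node n may exist.
 * Returns 0 if no node is marked possible.
 */
static unsigned int highest_possible_node(const bool possible[MAX_NUMNODES])
{
	unsigned int node, highest = 0;

	for (node = 0; node < MAX_NUMNODES; node++)
		if (possible[node])
			highest = node;
	return highest;
}

/* Shrink nr_node_ids to "highest possible node + 1". */
static void sketch_setup_nr_node_ids(const bool possible[MAX_NUMNODES])
{
	nr_node_ids = highest_possible_node(possible) + 1;
	assert(nr_node_ids <= MAX_NUMNODES);
}
```

If sketch_setup_nr_node_ids() is never called, nr_node_ids keeps its initial
value of 256, which would match a 256x256 distance matrix being dumped on a
machine that really has a single node. Calling it with only node 0 marked
possible yields nr_node_ids=1.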

>> a) Complain to your hardware vendor to have them fix the table and ship a
>>    firmware fix
>
> The hardware is probably too old for the vendor to care about fixing it.
>

Indeed, I only realized that after googling your machine.

>> b) Fix the ACPI table yourself - I've been told it's doable for *some* of
>>    them, but I've never done that myself
>> c) Compile your kernel with CONFIG_NUMA=n, as AFAICT you only actually have
>>    a single node
>> d) Ignore the warning
>>
>>
>> c) is clearly not ideal if you want to use a somewhat generic kernel image
>> on a wide host of machines; d) is also a bit yucky...
>
> Shouldn't the kernel be able to cope with quirky hardware? From what I remember in the past,
> ACPI tables used to be broken quite a lot and the kernel contained workarounds for such cases,
> didn't it?
>

Technically it *is* coping with it; it's just dumping the entire NUMA
distance matrix in the process... Let me see if I can figure out why your
system doesn't end up with nr_node_ids=1.

> Adrian
>
> --
>  .''`.  John Paul Adrian Glaubitz
> : :' :  Debian Developer - glaubitz@debian.org
> `. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
>   `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913
