* [PATCH] ia64: Ensure proper NUMA distance and possible map initialization
@ 2021-03-18 13:06 ` Valentin Schneider
0 siblings, 0 replies; 12+ messages in thread
From: Valentin Schneider @ 2021-03-18 13:06 UTC (permalink / raw)
To: linux-kernel, linux-ia64@vger.kernel.org, debian-ia64
Cc: John Paul Adrian Glaubitz, Peter Zijlstra (Intel), Ingo Molnar,
Vincent Guittot, Dietmar Eggemann, Sergei Trofimovich,
Anatoly Pugachev
John Paul reported a warning about bogus NUMA distance values spurred by
commit:
620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
In this case, the afflicted machine comes up with a reported 256 possible
nodes, all of which are 0 distance away from one another. This was
previously silently ignored, but is now caught by the aforementioned
commit.
The culprit is ia64's node_possible_map which remains unchanged from its
initialization value of NODE_MASK_ALL. In John's case, the machine doesn't
have any SRAT nor SLIT table, but AIUI the possible map remains untouched
regardless of what ACPI tables end up being parsed. Thus, !online &&
possible nodes remain with a bogus distance of 0 (distances \in [0, 9] are
"reserved and have no meaning" as per the ACPI spec).
Follow x86 / drivers/base/arch_numa's example and set the possible map to
the parsed map, which in this case seems to be the online map.
Link: http://lore.kernel.org/r/255d6b5d-194e-eb0e-ecdd-97477a534441@physik.fu-berlin.de
Fixes: 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
This might need an earlier Fixes: tag, but all of this is quite old and
dusty (the git blame rabbit hole leads me to ~2008/2007)
Alternatively, can we deprecate ia64 already?
---
arch/ia64/kernel/acpi.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index a5636524af76..e2af6b172200 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -446,7 +446,8 @@ void __init acpi_numa_fixup(void)
if (srat_num_cpus == 0) {
node_set_online(0);
node_cpuid[0].phys_id = hard_smp_processor_id();
- return;
+ slit_distance(0, 0) = LOCAL_DISTANCE;
+ goto out;
}
/*
@@ -489,7 +490,7 @@ void __init acpi_numa_fixup(void)
for (j = 0; j < MAX_NUMNODES; j++)
slit_distance(i, j) = i == j ?
LOCAL_DISTANCE : REMOTE_DISTANCE;
- return;
+ goto out;
}
memset(numa_slit, -1, sizeof(numa_slit));
@@ -514,6 +515,8 @@ void __init acpi_numa_fixup(void)
printk("\n");
}
#endif
+out:
+ node_possible_map = node_online_map;
}
#endif /* CONFIG_ACPI_NUMA */
--
2.25.1
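The failure mode described in the commit message can be reproduced in user space. The following is only a minimal sketch under stated assumptions, not the kernel code: `node_possible`, `node_online`, `distance_table`, `setup_no_srat()` and `count_distance_levels()` are hypothetical stand-ins for the kernel's node masks, the SLIT table and the deduplicating distance scan. It models a machine with no SRAT/SLIT, where only node 0 is online.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_NUMNODES    256  /* the 256 possible nodes from the report */
#define LOCAL_DISTANCE  10   /* values below 10 are reserved by ACPI */

/* Hypothetical stand-ins for node_possible_map, node_online_map, numa_slit. */
static bool node_possible[MAX_NUMNODES];
static bool node_online[MAX_NUMNODES];
static unsigned char distance_table[MAX_NUMNODES][MAX_NUMNODES];

/*
 * Count the distinct distances between possible nodes, using a "seen" set
 * for deduplication. Returns -1 as soon as a reserved distance is found.
 */
static int count_distance_levels(void)
{
    bool seen[256] = { false };
    int levels = 0;

    for (int i = 0; i < MAX_NUMNODES; i++) {
        if (!node_possible[i])
            continue;
        for (int j = 0; j < MAX_NUMNODES; j++) {
            if (!node_possible[j])
                continue;
            unsigned char d = distance_table[i][j];
            if (d < LOCAL_DISTANCE)
                return -1;      /* bogus distance, as in the warning */
            if (!seen[d]) {
                seen[d] = true;
                levels++;
            }
        }
    }
    return levels;
}

/* Model the "no SRAT" early exit, with or without the patch applied. */
static void setup_no_srat(bool apply_fix)
{
    memset(distance_table, 0, sizeof(distance_table));
    for (int i = 0; i < MAX_NUMNODES; i++) {
        node_possible[i] = true;    /* NODE_MASK_ALL initial value */
        node_online[i] = false;
    }
    node_online[0] = true;

    if (apply_fix) {
        distance_table[0][0] = LOCAL_DISTANCE;
        /* node_possible_map = node_online_map */
        memcpy(node_possible, node_online, sizeof(node_possible));
    }
}
```

In this model, `setup_no_srat(false)` followed by `count_distance_levels()` yields -1, because every possible node still has distance 0. After `setup_no_srat(true)` the scan only visits node 0 and finds a single valid distance level.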
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] ia64: Ensure proper NUMA distance and possible map initialization
2021-03-18 13:06 ` Valentin Schneider
@ 2021-03-19 14:47 ` John Paul Adrian Glaubitz
-1 siblings, 0 replies; 12+ messages in thread
From: John Paul Adrian Glaubitz @ 2021-03-19 14:47 UTC (permalink / raw)
To: Valentin Schneider, linux-kernel, linux-ia64@vger.kernel.org,
debian-ia64
Cc: Peter Zijlstra (Intel), Ingo Molnar, Vincent Guittot,
Dietmar Eggemann, Sergei Trofimovich, Anatoly Pugachev
Hi Valentin!
On 3/18/21 2:06 PM, Valentin Schneider wrote:
> John Paul reported a warning about bogus NUMA distance values spurred by
> commit:
>
> 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
>
> In this case, the afflicted machine comes up with a reported 256 possible
> nodes, all of which are 0 distance away from one another. This was
> previously silently ignored, but is now caught by the aforementioned
> commit.
>
> The culprit is ia64's node_possible_map which remains unchanged from its
> initialization value of NODE_MASK_ALL. In John's case, the machine doesn't
> have any SRAT nor SLIT table, but AIUI the possible map remains untouched
> regardless of what ACPI tables end up being parsed. Thus, !online &&
> possible nodes remain with a bogus distance of 0 (distances \in [0, 9] are
> "reserved and have no meaning" as per the ACPI spec).
>
> Follow x86 / drivers/base/arch_numa's example and set the possible map to
> the parsed map, which in this case seems to be the online map.
>
> Link: http://lore.kernel.org/r/255d6b5d-194e-eb0e-ecdd-97477a534441@physik.fu-berlin.de
> Fixes: 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> ---
> This might need an earlier Fixes: tag, but all of this is quite old and
> dusty (the git blame rabbit hole leads me to ~2008/2007)
>
> Alternatively, can we deprecate ia64 already?
> ---
> arch/ia64/kernel/acpi.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
> index a5636524af76..e2af6b172200 100644
> --- a/arch/ia64/kernel/acpi.c
> +++ b/arch/ia64/kernel/acpi.c
> @@ -446,7 +446,8 @@ void __init acpi_numa_fixup(void)
> if (srat_num_cpus == 0) {
> node_set_online(0);
> node_cpuid[0].phys_id = hard_smp_processor_id();
> - return;
> + slit_distance(0, 0) = LOCAL_DISTANCE;
> + goto out;
> }
>
> /*
> @@ -489,7 +490,7 @@ void __init acpi_numa_fixup(void)
> for (j = 0; j < MAX_NUMNODES; j++)
> slit_distance(i, j) = i == j ?
> LOCAL_DISTANCE : REMOTE_DISTANCE;
> - return;
> + goto out;
> }
>
> memset(numa_slit, -1, sizeof(numa_slit));
> @@ -514,6 +515,8 @@ void __init acpi_numa_fixup(void)
> printk("\n");
> }
> #endif
> +out:
> + node_possible_map = node_online_map;
> }
> #endif /* CONFIG_ACPI_NUMA */
>
>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Could you send this patch through Andrew Morton's tree? The ia64 port currently
has no maintainer, so we have to use an alternative tree.
@Sergei: Could you test/ack this patch as well?
Thanks,
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] ia64: Ensure proper NUMA distance and possible map initialization
2021-03-19 14:47 ` John Paul Adrian Glaubitz
@ 2021-03-19 19:10 ` Sergei Trofimovich
-1 siblings, 0 replies; 12+ messages in thread
From: Sergei Trofimovich @ 2021-03-19 19:10 UTC (permalink / raw)
To: John Paul Adrian Glaubitz
Cc: Valentin Schneider, linux-kernel, linux-ia64@vger.kernel.org,
debian-ia64, Peter Zijlstra (Intel), Ingo Molnar, Vincent Guittot,
Dietmar Eggemann, Anatoly Pugachev
On Fri, 19 Mar 2021 15:47:09 +0100
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:
> Hi Valentin!
>
> On 3/18/21 2:06 PM, Valentin Schneider wrote:
> > John Paul reported a warning about bogus NUMA distance values spurred by
> > commit:
> >
> > 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
> >
> > In this case, the afflicted machine comes up with a reported 256 possible
> > nodes, all of which are 0 distance away from one another. This was
> > previously silently ignored, but is now caught by the aforementioned
> > commit.
> >
> > The culprit is ia64's node_possible_map which remains unchanged from its
> > initialization value of NODE_MASK_ALL. In John's case, the machine doesn't
> > have any SRAT nor SLIT table, but AIUI the possible map remains untouched
> > regardless of what ACPI tables end up being parsed. Thus, !online &&
> > possible nodes remain with a bogus distance of 0 (distances \in [0, 9] are
> > "reserved and have no meaning" as per the ACPI spec).
> >
> > Follow x86 / drivers/base/arch_numa's example and set the possible map to
> > the parsed map, which in this case seems to be the online map.
> >
> > Link: http://lore.kernel.org/r/255d6b5d-194e-eb0e-ecdd-97477a534441@physik.fu-berlin.de
> > Fixes: 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
> > Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
> > Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> > ---
> > This might need an earlier Fixes: tag, but all of this is quite old and
> > dusty (the git blame rabbit hole leads me to ~2008/2007)
> >
> > Alternatively, can we deprecate ia64 already?
> > ---
> > arch/ia64/kernel/acpi.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
> > index a5636524af76..e2af6b172200 100644
> > --- a/arch/ia64/kernel/acpi.c
> > +++ b/arch/ia64/kernel/acpi.c
> > @@ -446,7 +446,8 @@ void __init acpi_numa_fixup(void)
> > if (srat_num_cpus == 0) {
> > node_set_online(0);
> > node_cpuid[0].phys_id = hard_smp_processor_id();
> > - return;
> > + slit_distance(0, 0) = LOCAL_DISTANCE;
> > + goto out;
> > }
> >
> > /*
> > @@ -489,7 +490,7 @@ void __init acpi_numa_fixup(void)
> > for (j = 0; j < MAX_NUMNODES; j++)
> > slit_distance(i, j) = i == j ?
> > LOCAL_DISTANCE : REMOTE_DISTANCE;
> > - return;
> > + goto out;
> > }
> >
> > memset(numa_slit, -1, sizeof(numa_slit));
> > @@ -514,6 +515,8 @@ void __init acpi_numa_fixup(void)
> > printk("\n");
> > }
> > #endif
> > +out:
> > + node_possible_map = node_online_map;
> > }
> > #endif /* CONFIG_ACPI_NUMA */
> >
> >
>
> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
>
> Could you send this patch through Andrew Morton's tree? The ia64 port currently
> has no maintainer, so we have to use an alternative tree.
>
> @Sergei: Could you test/ack this patch as well?
Booted successfully without problems on rx3600.
Tested-by: Sergei Trofimovich <slyfox@gentoo.org>
--
Sergei
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] ia64: Ensure proper NUMA distance and possible map initialization
2021-03-19 19:10 ` Sergei Trofimovich
@ 2021-03-20 19:02 ` John Paul Adrian Glaubitz
-1 siblings, 0 replies; 12+ messages in thread
From: John Paul Adrian Glaubitz @ 2021-03-20 19:02 UTC (permalink / raw)
To: Sergei Trofimovich
Cc: Valentin Schneider, linux-kernel, linux-ia64@vger.kernel.org,
debian-ia64, Peter Zijlstra (Intel), Ingo Molnar, Vincent Guittot,
Dietmar Eggemann, Anatoly Pugachev, Andrew Morton
On 3/19/21 8:10 PM, Sergei Trofimovich wrote:
> On Fri, 19 Mar 2021 15:47:09 +0100
> John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:
>
>> Hi Valentin!
>>
>> On 3/18/21 2:06 PM, Valentin Schneider wrote:
>>> John Paul reported a warning about bogus NUMA distance values spurred by
>>> commit:
>>>
>>> 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
>>>
>>> In this case, the afflicted machine comes up with a reported 256 possible
>>> nodes, all of which are 0 distance away from one another. This was
>>> previously silently ignored, but is now caught by the aforementioned
>>> commit.
>>>
>>> The culprit is ia64's node_possible_map which remains unchanged from its
>>> initialization value of NODE_MASK_ALL. In John's case, the machine doesn't
>>> have any SRAT nor SLIT table, but AIUI the possible map remains untouched
>>> regardless of what ACPI tables end up being parsed. Thus, !online &&
>>> possible nodes remain with a bogus distance of 0 (distances \in [0, 9] are
>>> "reserved and have no meaning" as per the ACPI spec).
>>>
>>> Follow x86 / drivers/base/arch_numa's example and set the possible map to
>>> the parsed map, which in this case seems to be the online map.
>>>
>>> Link: http://lore.kernel.org/r/255d6b5d-194e-eb0e-ecdd-97477a534441@physik.fu-berlin.de
>>> Fixes: 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
>>> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
>>> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
>>> ---
>>> This might need an earlier Fixes: tag, but all of this is quite old and
>>> dusty (the git blame rabbit hole leads me to ~2008/2007)
>>>
>>> Alternatively, can we deprecate ia64 already?
>>> ---
>>> arch/ia64/kernel/acpi.c | 7 +++++--
>>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
>>> index a5636524af76..e2af6b172200 100644
>>> --- a/arch/ia64/kernel/acpi.c
>>> +++ b/arch/ia64/kernel/acpi.c
>>> @@ -446,7 +446,8 @@ void __init acpi_numa_fixup(void)
>>> if (srat_num_cpus == 0) {
>>> node_set_online(0);
>>> node_cpuid[0].phys_id = hard_smp_processor_id();
>>> - return;
>>> + slit_distance(0, 0) = LOCAL_DISTANCE;
>>> + goto out;
>>> }
>>>
>>> /*
>>> @@ -489,7 +490,7 @@ void __init acpi_numa_fixup(void)
>>> for (j = 0; j < MAX_NUMNODES; j++)
>>> slit_distance(i, j) = i == j ?
>>> LOCAL_DISTANCE : REMOTE_DISTANCE;
>>> - return;
>>> + goto out;
>>> }
>>>
>>> memset(numa_slit, -1, sizeof(numa_slit));
>>> @@ -514,6 +515,8 @@ void __init acpi_numa_fixup(void)
>>> printk("\n");
>>> }
>>> #endif
>>> +out:
>>> + node_possible_map = node_online_map;
>>> }
>>> #endif /* CONFIG_ACPI_NUMA */
>>>
>>>
>>
>> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
>>
>> Could you send this patch through Andrew Morton's tree? The ia64 port currently
>> has no maintainer, so we have to use an alternative tree.
>>
>> @Sergei: Could you test/ack this patch as well?
>
> Booted successfully without problems on rx3600.
>
> Tested-by: Sergei Trofimovich <slyfox@gentoo.org>
Great, thanks!
@Andrew: Could you pick up this patch through your tree?
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] ia64: Ensure proper NUMA distance and possible map initialization
2021-03-18 13:06 ` Valentin Schneider
@ 2021-03-24 18:54 ` Andrew Morton
-1 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-03-24 18:54 UTC (permalink / raw)
To: Valentin Schneider
Cc: linux-kernel, linux-ia64@vger.kernel.org, debian-ia64,
John Paul Adrian Glaubitz, Peter Zijlstra (Intel), Ingo Molnar,
Vincent Guittot, Dietmar Eggemann, Sergei Trofimovich,
Anatoly Pugachev
On Thu, 18 Mar 2021 13:06:17 +0000 Valentin Schneider <valentin.schneider@arm.com> wrote:
> John Paul reported a warning about bogus NUMA distance values spurred by
> commit:
>
> 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
>
> In this case, the afflicted machine comes up with a reported 256 possible
> nodes, all of which are 0 distance away from one another. This was
> previously silently ignored, but is now caught by the aforementioned
> commit.
>
> The culprit is ia64's node_possible_map which remains unchanged from its
> initialization value of NODE_MASK_ALL. In John's case, the machine doesn't
> have any SRAT nor SLIT table, but AIUI the possible map remains untouched
> regardless of what ACPI tables end up being parsed. Thus, !online &&
> possible nodes remain with a bogus distance of 0 (distances \in [0, 9] are
> "reserved and have no meaning" as per the ACPI spec).
>
> Follow x86 / drivers/base/arch_numa's example and set the possible map to
> the parsed map, which in this case seems to be the online map.
>
> Link: http://lore.kernel.org/r/255d6b5d-194e-eb0e-ecdd-97477a534441@physik.fu-berlin.de
> Fixes: 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> ---
> This might need an earlier Fixes: tag, but all of this is quite old and
> dusty (the git blame rabbit hole leads me to ~2008/2007)
>
Thanks. Is this worth a cc:stable tag?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] ia64: Ensure proper NUMA distance and possible map initialization
2021-03-24 18:54 ` Andrew Morton
@ 2021-03-24 18:59 ` John Paul Adrian Glaubitz
-1 siblings, 0 replies; 12+ messages in thread
From: John Paul Adrian Glaubitz @ 2021-03-24 18:59 UTC (permalink / raw)
To: Andrew Morton, Valentin Schneider
Cc: linux-kernel, linux-ia64@vger.kernel.org, debian-ia64,
Peter Zijlstra (Intel), Ingo Molnar, Vincent Guittot,
Dietmar Eggemann, Sergei Trofimovich, Anatoly Pugachev
Hi!
On 3/24/21 7:54 PM, Andrew Morton wrote:
> On Thu, 18 Mar 2021 13:06:17 +0000 Valentin Schneider <valentin.schneider@arm.com> wrote:
>
>> John Paul reported a warning about bogus NUMA distance values spurred by
>> commit:
>>
>> 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
>>
>> In this case, the afflicted machine comes up with a reported 256 possible
>> nodes, all of which are 0 distance away from one another. This was
>> previously silently ignored, but is now caught by the aforementioned
>> commit.
>>
>> The culprit is ia64's node_possible_map which remains unchanged from its
>> initialization value of NODE_MASK_ALL. In John's case, the machine doesn't
>> have any SRAT nor SLIT table, but AIUI the possible map remains untouched
>> regardless of what ACPI tables end up being parsed. Thus, !online &&
>> possible nodes remain with a bogus distance of 0 (distances \in [0, 9] are
>> "reserved and have no meaning" as per the ACPI spec).
>>
>> Follow x86 / drivers/base/arch_numa's example and set the possible map to
>> the parsed map, which in this case seems to be the online map.
>>
>> Link: http://lore.kernel.org/r/255d6b5d-194e-eb0e-ecdd-97477a534441@physik.fu-berlin.de
>> Fixes: 620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the deduplicating sort")
>> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
>> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
>> ---
>> This might need an earlier Fixes: tag, but all of this is quite old and
>> dusty (the git blame rabbit hole leads me to ~2008/2007)
>>
>
> Thanks. Is this worth a cc:stable tag?
Looks like the regression was introduced in 5.12-rc1, so no need for backporting.
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2021-03-24 19:00 UTC | newest]
Thread overview: 12+ messages
2021-03-18 13:06 [PATCH] ia64: Ensure proper NUMA distance and possible map initialization Valentin Schneider
2021-03-18 13:06 ` Valentin Schneider
2021-03-19 14:47 ` John Paul Adrian Glaubitz
2021-03-19 14:47 ` John Paul Adrian Glaubitz
2021-03-19 19:10 ` Sergei Trofimovich
2021-03-19 19:10 ` Sergei Trofimovich
2021-03-20 19:02 ` John Paul Adrian Glaubitz
2021-03-20 19:02 ` John Paul Adrian Glaubitz
2021-03-24 18:54 ` Andrew Morton
2021-03-24 18:54 ` Andrew Morton
2021-03-24 18:59 ` John Paul Adrian Glaubitz
2021-03-24 18:59 ` John Paul Adrian Glaubitz