From: "Michal Suchánek" <msuchanek@suse.de>
To: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v04] powerpc/numa: Perform full re-add of CPU for PRRN/VPHN topology update
Date: Mon, 18 Feb 2019 11:49:17 +0100 [thread overview]
Message-ID: <20190218114917.5ab865c4@naga> (raw)
In-Reply-To: <d8b67f9d-77d0-2be3-15bc-c7f11bc519f7@linux.vnet.ibm.com>
On Thu, 14 Feb 2019 09:56:26 -0600
Michael Bringmann <mwb@linux.vnet.ibm.com> wrote:
Hello,
> To: linuxppc-dev@lists.ozlabs.org
> To: linux-kernel@vger.kernel.org
> Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Paul Mackerras <paulus@samba.org>
> Michael Ellerman <mpe@ellerman.id.au>
> Nathan Lynch <nathanl@linux.vnet.ibm.com>
> Corentin Labbe <clabbe@baylibre.com>
> Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
> Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Guenter Roeck <linux@roeck-us.net>
> Michael Bringmann <mwb@linux.vnet.ibm.com>
> "Oliver O'Halloran" <oohall@gmail.com>
> Russell Currey <ruscur@russell.cc>
> Haren Myneni <haren@us.ibm.com>
> Al Viro <viro@zeniv.linux.org.uk>
> Kees Cook <keescook@chromium.org>
> Nicholas Piggin <npiggin@gmail.com>
> Rob Herring <robh@kernel.org>
> Juliet Kim <minkim@us.ibm.com>
> Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
> Date: 2018-11-05 16:14:12 -0600
> Subject: [PATCH v04] powerpc/numa: Perform full re-add of CPU for PRRN/VPHN topology update
Looks like something went wrong with the e-mail headers.
Maybe you should re-send?
Thanks
Michal
>
> On pseries systems, performing changes to a partition's affinity
> can result in altering the nodes to which a CPU is assigned on the
> current system. For example, some systems are subject to resource
> balancing operations by the operator or control software. In such
> environments, system CPUs may be on nodes 1 and 3 at boot, and be
> moved to nodes 2, 3, and 5 for better performance.
>
> The current implementation attempts to recognize such changes within
> the powerpc-specific version of arch_update_cpu_topology to modify a
> range of system data structures directly. However, some scheduler
> data structures may be inaccessible, or the timing of a node change
> may still lead to corruption or error in other modules (e.g. user
> space) which do not receive notification of these changes.
>
> This patch modifies the PRRN/VPHN topology update worker function to
> recognize an affinity change for a CPU and to perform a full DLPAR
> remove and add of that CPU, instead of dynamically changing its node,
> to resolve this issue.
>
> [Based upon patch submission:
> Subject: [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration
> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Date: Tue Oct 30 05:43:36 AEDT 2018
> ]
>
> [Replace patch submission:
> Subject: [PATCH] powerpc/topology: Update numa mask when cpu node mapping changes
> From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Date: Wed Oct 10 15:24:46 AEDT 2018
> ]
>
> Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
> ---
> Changes in v04:
> -- Revise tests in topology_timer_fn to check vphn_enabled before prrn_enabled
> -- Remove unnecessary changes to numa_update_cpu_topology
> Changes in v03:
> -- Fixed under-scheduling of topo updates.
> Changes in v02:
> -- Reuse more of the previous implementation to reduce patch size
> -- Replace former calls to numa_update_cpu_topology(false) by
> topology_schedule_update
> -- Make sure that we report topology changes back through
> arch_update_cpu_topology
> -- Fix problem observed in powerpc next kernel with updating
> cpu_associativity_changes_mask in timer_topology_fn when both
> prrn_enabled and vphn_enabled, and many extra CPUs are possible,
> but not installed.
> -- Fix problem with updating cpu_associativity_changes_mask when
> VPHN associativity information does not arrive until after first
> call to update topology occurs.
> ---
> arch/powerpc/include/asm/topology.h | 7 +----
> arch/powerpc/kernel/rtasd.c | 2 +
> arch/powerpc/mm/numa.c | 47 +++++++++++++++++++++++------------
> 3 files changed, 34 insertions(+), 22 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index f85e2b01c3df..79505c371fd5 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -42,7 +42,7 @@ extern void __init dump_numa_cpu_topology(void);
>
> extern int sysfs_add_device_to_node(struct device *dev, int nid);
> extern void sysfs_remove_device_from_node(struct device *dev, int nid);
> -extern int numa_update_cpu_topology(bool cpus_locked);
> +extern void topology_schedule_update(void);
>
> static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node)
> {
> @@ -77,10 +77,7 @@ static inline void sysfs_remove_device_from_node(struct device *dev,
> {
> }
>
> -static inline int numa_update_cpu_topology(bool cpus_locked)
> -{
> - return 0;
> -}
> +static inline void topology_schedule_update(void) {}
>
> static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node) {}
>
> diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
> index 8a1746d755c9..b1828de7ab78 100644
> --- a/arch/powerpc/kernel/rtasd.c
> +++ b/arch/powerpc/kernel/rtasd.c
> @@ -285,7 +285,7 @@ static void handle_prrn_event(s32 scope)
> * the RTAS event.
> */
> pseries_devicetree_update(-scope);
> - numa_update_cpu_topology(false);
> + topology_schedule_update();
> }
>
> static void handle_rtas_event(const struct rtas_error_log *log)
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index b5d1c45c1475..eb63479f09d7 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1077,6 +1077,8 @@ static int prrn_enabled;
> static void reset_topology_timer(void);
> static int topology_timer_secs = 1;
> static int topology_inited;
> +static int topology_update_in_progress;
> +static int topology_changed;
>
> /*
> * Change polling interval for associativity changes.
> @@ -1297,9 +1299,9 @@ static int update_lookup_table(void *data)
> * Update the node maps and sysfs entries for each cpu whose home node
> * has changed. Returns 1 when the topology has changed, and 0 otherwise.
> *
> - * cpus_locked says whether we already hold cpu_hotplug_lock.
> + * readd_cpus: Also readd any CPUs that have changed affinity
> */
> -int numa_update_cpu_topology(bool cpus_locked)
> +static int numa_update_cpu_topology(bool readd_cpus)
> {
> unsigned int cpu, sibling, changed = 0;
> struct topology_update_data *updates, *ud;
> @@ -1307,7 +1309,8 @@ int numa_update_cpu_topology(bool cpus_locked)
> struct device *dev;
> int weight, new_nid, i = 0;
>
> - if (!prrn_enabled && !vphn_enabled && topology_inited)
> + if ((!prrn_enabled && !vphn_enabled && topology_inited) ||
> + topology_update_in_progress)
> return 0;
>
> weight = cpumask_weight(&cpu_associativity_changes_mask);
> @@ -1318,6 +1321,8 @@ int numa_update_cpu_topology(bool cpus_locked)
> if (!updates)
> return 0;
>
> + topology_update_in_progress = 1;
> +
> cpumask_clear(&updated_cpus);
>
> for_each_cpu(cpu, &cpu_associativity_changes_mask) {
> @@ -1339,16 +1344,21 @@ int numa_update_cpu_topology(bool cpus_locked)
>
> new_nid = find_and_online_cpu_nid(cpu);
>
> - if (new_nid == numa_cpu_lookup_table[cpu]) {
> + if ((new_nid == numa_cpu_lookup_table[cpu]) ||
> + !cpu_present(cpu)) {
> cpumask_andnot(&cpu_associativity_changes_mask,
> &cpu_associativity_changes_mask,
> cpu_sibling_mask(cpu));
> - dbg("Assoc chg gives same node %d for cpu%d\n",
> + if (cpu_present(cpu))
> + dbg("Assoc chg gives same node %d for cpu%d\n",
> new_nid, cpu);
> cpu = cpu_last_thread_sibling(cpu);
> continue;
> }
>
> + if (readd_cpus)
> + dlpar_cpu_readd(cpu);
> +
> for_each_cpu(sibling, cpu_sibling_mask(cpu)) {
> ud = &updates[i++];
> ud->next = &updates[i];
> @@ -1390,7 +1400,7 @@ int numa_update_cpu_topology(bool cpus_locked)
> if (!cpumask_weight(&updated_cpus))
> goto out;
>
> - if (cpus_locked)
> + if (!readd_cpus)
> stop_machine_cpuslocked(update_cpu_topology, &updates[0],
> &updated_cpus);
> else
> @@ -1401,9 +1411,9 @@ int numa_update_cpu_topology(bool cpus_locked)
> * offline CPUs. It is best to perform this update from the stop-
> * machine context.
> */
> - if (cpus_locked)
> + if (!readd_cpus)
> stop_machine_cpuslocked(update_lookup_table, &updates[0],
> - cpumask_of(raw_smp_processor_id()));
> + cpumask_of(raw_smp_processor_id()));
> else
> stop_machine(update_lookup_table, &updates[0],
> cpumask_of(raw_smp_processor_id()));
> @@ -1420,35 +1430,40 @@ int numa_update_cpu_topology(bool cpus_locked)
> }
>
> out:
> + topology_update_in_progress = 0;
> kfree(updates);
> return changed;
> }
>
> int arch_update_cpu_topology(void)
> {
> - return numa_update_cpu_topology(true);
> + return numa_update_cpu_topology(false);
> }
>
> static void topology_work_fn(struct work_struct *work)
> {
> - rebuild_sched_domains();
> + lock_device_hotplug();
> + if (numa_update_cpu_topology(true))
> + rebuild_sched_domains();
> + unlock_device_hotplug();
> }
> static DECLARE_WORK(topology_work, topology_work_fn);
>
> -static void topology_schedule_update(void)
> +void topology_schedule_update(void)
> {
> - schedule_work(&topology_work);
> + if (!topology_update_in_progress)
> + schedule_work(&topology_work);
> }
>
> static void topology_timer_fn(struct timer_list *unused)
> {
> - if (prrn_enabled && cpumask_weight(&cpu_associativity_changes_mask))
> - topology_schedule_update();
> - else if (vphn_enabled) {
> + if (vphn_enabled) {
> if (update_cpu_associativity_changes_mask() > 0)
> topology_schedule_update();
> reset_topology_timer();
> }
> + else if (prrn_enabled && cpumask_weight(&cpu_associativity_changes_mask))
> + topology_schedule_update();
> }
> static struct timer_list topology_timer;
>
> @@ -1553,7 +1568,7 @@ void __init shared_proc_topology_init(void)
> if (lppaca_shared_proc(get_lppaca())) {
> bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
> nr_cpumask_bits);
> - numa_update_cpu_topology(false);
> + topology_schedule_update();
> }
> }
>
>
Thread overview: 4+ messages
2019-02-14 15:56 [PATCH v04] powerpc/numa: Perform full re-add of CPU for PRRN/VPHN topology update Michael Bringmann
2019-02-18 10:49 ` Michal Suchánek [this message]
2019-02-18 14:15 ` Michal Suchánek
2019-02-18 17:09 ` Michael Bringmann