From: Nathan Lynch <nathanl@linux.ibm.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: tyreld@linux.ibm.com, ajd@linux.ibm.com, mmc@linux.vnet.ibm.com,
	cforno12@linux.vnet.ibm.com, drt@linux.vnet.ibm.com,
	brking@linux.ibm.com
Subject: [PATCH 13/29] powerpc/pseries/mobility: use stop_machine for join/suspend
Date: Thu, 29 Oct 2020 20:17:49 -0500
Message-ID: <20201030011805.1224603-14-nathanl@linux.ibm.com>
In-Reply-To: <20201030011805.1224603-1-nathanl@linux.ibm.com>

The partition suspend sequence as specified in the platform
architecture requires that all active processor threads call
H_JOIN, which:

- suspends the calling thread until it is the target of
  an H_PROD; or
- immediately returns H_CONTINUE, if the calling thread is the last to
  call H_JOIN. This thread is expected to call ibm,suspend-me to
  completely suspend the partition.

Upon returning from ibm,suspend-me the calling thread must wake all
others using H_PROD.
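
As an illustration only, the join/prod sequence described above can be
modelled in plain userspace C. This is a single-threaded toy, not
hypervisor or kernel code: model_partition, model_join(), model_prod()
and model_suspend_cycle() are hypothetical names invented for the
sketch, and a real H_JOIN blocks the calling thread inside the
hypervisor rather than returning a flag.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical number of active threads */

struct model_partition {
	bool joined[MODEL_NR_CPUS];	/* true while a thread is parked */
	int nr_joined;
};

/*
 * Models H_JOIN. Returns true for the last active caller (the
 * H_CONTINUE case); every earlier caller is recorded as parked.
 */
static bool model_join(struct model_partition *p, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS && !p->joined[cpu]);

	if (p->nr_joined == MODEL_NR_CPUS - 1)
		return true;

	p->joined[cpu] = true;
	p->nr_joined++;
	return false;
}

/* Models H_PROD. Returns false if the target was not parked. */
static bool model_prod(struct model_partition *p, int cpu)
{
	if (!p->joined[cpu])
		return false;

	p->joined[cpu] = false;
	p->nr_joined--;
	return true;
}

/* Runs one join/suspend/prod cycle; returns the thread that suspended. */
static int model_suspend_cycle(struct model_partition *p)
{
	int last = -1;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (model_join(p, cpu))
			last = cpu;
	}

	/* The 'last' thread would call ibm,suspend-me at this point. */

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu != last)
			model_prod(p, cpu);
	}

	return last;
}
```

Starting from a zero-initialized model_partition, one call to
model_suspend_cycle() parks every thread but the last, and leaves none
parked once the prods have been issued.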

rtas_ibm_suspend_me_unsafe() uses on_each_cpu() to implement this
protocol, but because of its synchronizing nature this is susceptible
to deadlock versus users of stop_machine() or other callers of
on_each_cpu().

Not only is stop_machine() intended for use cases like this, it
handles error propagation and allows us to keep the data shared
between CPUs minimal: a single atomic counter which ensures exactly
one CPU will wake the others from their joined states.
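
The "exactly one waker" property of a shared atomic counter can be
sketched in userspace with C11 atomics and POSIX threads standing in
for CPUs. This is an assumption-laden analogy rather than the kernel
code: demo_state, demo_join() and demo_count_wakers() are hypothetical
names, and atomic_fetch_add(...) == 0 plays the role that
atomic_inc_return(counter) == 1 plays in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define DEMO_THREADS 8	/* hypothetical stand-in for the online CPUs */

struct demo_state {
	atomic_int counter;	/* analogue of the patch's shared counter */
	atomic_int wakers;	/* number of threads that took the waker role */
};

static void *demo_join(void *arg)
{
	struct demo_state *s = arg;

	/*
	 * atomic_fetch_add() returns the previous value, so only one
	 * caller can ever observe 0, no matter how the threads race.
	 */
	if (atomic_fetch_add(&s->counter, 1) == 0)
		atomic_fetch_add(&s->wakers, 1);

	return NULL;
}

/* Starts the threads and reports how many of them became the waker. */
static int demo_count_wakers(void)
{
	struct demo_state s;
	pthread_t tid[DEMO_THREADS];
	int started = 0;
	int i;

	atomic_init(&s.counter, 0);
	atomic_init(&s.wakers, 0);

	for (i = 0; i < DEMO_THREADS; i++) {
		if (pthread_create(&tid[i], NULL, demo_join, &s) != 0)
			break;
		started++;
	}

	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	return atomic_load(&s.wakers);
}
```

However the threads are scheduled, demo_count_wakers() reports a
single waker, which is why no further shared state is needed to pick
the thread that issues the prods.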

Switch the migration code to use stop_machine() and a less complex
local implementation of the H_JOIN/ibm,suspend-me logic, which
carries additional benefits:

- more informative error reporting, appropriately ratelimited
- resets the lockup detector / watchdog on resume to prevent lockup
  warnings when the OS has been suspended for a time exceeding the
  threshold.

Fixes: 91dc182ca6e2 ("[PATCH] powerpc: special-case ibm,suspend-me RTAS call")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/mobility.c | 132 ++++++++++++++++++++--
 1 file changed, 125 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 1b8ae221b98a..44ca7d4e143d 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -12,9 +12,11 @@
 #include <linux/cpu.h>
 #include <linux/kernel.h>
 #include <linux/kobject.h>
+#include <linux/nmi.h>
 #include <linux/sched.h>
 #include <linux/smp.h>
 #include <linux/stat.h>
+#include <linux/stop_machine.h>
 #include <linux/completion.h>
 #include <linux/device.h>
 #include <linux/delay.h>
@@ -412,6 +414,128 @@ static int wait_for_vasi_session_suspending(u64 handle)
 	return ret;
 }
 
+static void prod_single(unsigned int target_cpu)
+{
+	long hvrc;
+	int hwid;
+
+	hwid = get_hard_smp_processor_id(target_cpu);
+	hvrc = plpar_hcall_norets(H_PROD, hwid);
+	if (hvrc == H_SUCCESS)
+		return;
+	pr_err_ratelimited("H_PROD of CPU %u (hwid %d) error: %ld\n",
+			   target_cpu, hwid, hvrc);
+}
+
+static void prod_others(void)
+{
+	unsigned int cpu;
+
+	for_each_online_cpu(cpu) {
+		if (cpu != smp_processor_id())
+			prod_single(cpu);
+	}
+}
+
+static u16 clamp_slb_size(void)
+{
+	u16 prev = mmu_slb_size;
+
+	slb_set_size(SLB_MIN_SIZE);
+
+	return prev;
+}
+
+static int do_suspend(void)
+{
+	u16 saved_slb_size;
+	int status;
+	int ret;
+
+	pr_info("calling ibm,suspend-me on CPU %i\n", smp_processor_id());
+
+	/*
+	 * The destination processor model may have fewer SLB entries
+	 * than the source. We reduce mmu_slb_size to a safe minimum
+	 * before suspending in order to minimize the possibility of
+	 * programming non-existent entries on the destination. If
+	 * suspend fails, we restore it before returning. On success
+	 * the OF reconfig path will update it from the new device
+	 * tree after resuming on the destination.
+	 */
+	saved_slb_size = clamp_slb_size();
+
+	ret = rtas_ibm_suspend_me(&status);
+	if (ret != 0) {
+		pr_err("ibm,suspend-me error: %d\n", status);
+		slb_set_size(saved_slb_size);
+	}
+
+	return ret;
+}
+
+static int do_join(void *arg)
+{
+	atomic_t *counter = arg;
+	long hvrc;
+	int ret;
+
+	/* Must ensure MSR.EE off for H_JOIN. */
+	hard_irq_disable();
+	hvrc = plpar_hcall_norets(H_JOIN);
+
+	switch (hvrc) {
+	case H_CONTINUE:
+		/*
+		 * All other CPUs are offline or in H_JOIN. This CPU
+		 * attempts the suspend.
+		 */
+		ret = do_suspend();
+		break;
+	case H_SUCCESS:
+		/*
+		 * The suspend is complete and this cpu has received a
+		 * prod.
+		 */
+		ret = 0;
+		break;
+	case H_BAD_MODE:
+	case H_HARDWARE:
+	default:
+		ret = -EIO;
+		pr_err_ratelimited("H_JOIN error %ld on CPU %i\n",
+				   hvrc, smp_processor_id());
+		break;
+	}
+
+	if (atomic_inc_return(counter) == 1) {
+		pr_info("CPU %u waking all threads\n", smp_processor_id());
+		prod_others();
+	}
+	/*
+	 * Execution may have been suspended for several seconds, so
+	 * reset the watchdog.
+	 */
+	touch_nmi_watchdog();
+	return ret;
+}
+
+static int pseries_migrate_partition(u64 handle)
+{
+	atomic_t counter = ATOMIC_INIT(0);
+	int ret;
+
+	ret = wait_for_vasi_session_suspending(handle);
+	if (ret)
+		goto out;
+
+	ret = stop_machine(do_join, &counter, cpu_online_mask);
+	if (ret == 0)
+		post_mobility_fixup();
+out:
+	return ret;
+}
+
 static ssize_t migration_store(struct class *class,
 			       struct class_attribute *attr, const char *buf,
 			       size_t count)
@@ -423,16 +547,10 @@ static ssize_t migration_store(struct class *class,
 	if (rc)
 		return rc;
 
-	rc = wait_for_vasi_session_suspending(streamid);
+	rc = pseries_migrate_partition(streamid);
 	if (rc)
 		return rc;
 
-	rc = rtas_ibm_suspend_me_unsafe(streamid);
-	if (rc)
-		return rc;
-
-	post_mobility_fixup();
-
 	return count;
 }
 
--
2.25.4