linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] more migration vs CPU hotplug fixes
@ 2019-07-18 19:29 Nathan Lynch
  2019-07-18 19:29 ` [PATCH 1/2] powerpc/rtas: use device model APIs and serialization during LPM Nathan Lynch
  2019-07-18 19:29 ` [PATCH 2/2] powerpc/rtas: allow rescheduling while changing cpu states Nathan Lynch
  0 siblings, 2 replies; 4+ messages in thread
From: Nathan Lynch @ 2019-07-18 19:29 UTC (permalink / raw)
  To: linuxppc-dev

Despite recent fixes, userspace-initiated CPU hotplug still can
destructively race with the migration code's CPU state manipulations
on the destination. Also, since such manipulations can consist of
mass-onlining and -offlining half or more of the CPUs in the system,
take care to reschedule when needed.

Nathan Lynch (2):
  powerpc/rtas: use device model APIs and serialization during LPM
  powerpc/rtas: allow rescheduling while changing cpu states

 arch/powerpc/kernel/rtas.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/2] powerpc/rtas: use device model APIs and serialization during LPM
  2019-07-18 19:29 [PATCH 0/2] more migration vs CPU hotplug fixes Nathan Lynch
@ 2019-07-18 19:29 ` Nathan Lynch
  2019-07-18 19:46   ` Nathan Lynch
  2019-07-18 19:29 ` [PATCH 2/2] powerpc/rtas: allow rescheduling while changing cpu states Nathan Lynch
  1 sibling, 1 reply; 4+ messages in thread
From: Nathan Lynch @ 2019-07-18 19:29 UTC (permalink / raw)
  To: linuxppc-dev

During LPAR migration, cpu hotplug and migration operations can
interleave like so:

cd /sys/devices/system/cpu/cpu7/ | drmgr -m -c pmig -p pre \
echo 0 > online                  | -s 0xd7a884f83d830e6d -t 19 \
echo 1 > online                  | -n -d 1 5
---------------------------------+-------------------------------------------
online_store() {                 |
  device_offline() {             |
    cpu_subsys_offline() {       |
      cpu_down(7);               |
    }                            |
    dev->offline = true;         |
  }                              | migration_store() {
}                                |   rtas_ibm_suspend_me() {
                                 |     rtas_online_cpus_mask() {
                                 |       cpu_up(7);
				 |     }
                                 |     cpu_hotplug_disable();
                                 |     on_each_cpu(rtas_percpu_suspend_me());
                                 |     cpu_hotplug_enable();
online_store() {                 |
  device_online() {              |
    cpu_subsys_online() {        |
      cpu_up(7);                 |
    }                            |
    dev->offline = false;        |
  }                              |     rtas_offline_cpus_mask() {
}                                |       rtas_cpu_state_change_mask() {
                                 |         cpu_down(7);
				 |       }
				 |     }
				 |   }
				 | }

This leaves cpu7 in a state where the driver core considers the cpu
device online, but in all other respects it is offline and
unused. Attempts to online the cpu via sysfs appear to succeed but the
driver core actually does not pass the request to the lower-level
cpuhp support code. This makes the cpu unusable until the system is
rebooted.

Instead of directly calling cpu_up/cpu_down, the migration code should
use the higher-level device core APIs to maintain consistent state and
serialize operations.

Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/kernel/rtas.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 9b4d2a2ffb4f..fbefd9ff6dab 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -875,15 +875,17 @@ static int rtas_cpu_state_change_mask(enum rtas_cpu_state state,
 		return 0;
 
 	for_each_cpu(cpu, cpus) {
+		struct device *dev = get_cpu_device(cpu);
+
 		switch (state) {
 		case DOWN:
-			cpuret = cpu_down(cpu);
+			cpuret = device_offline(dev);
 			break;
 		case UP:
-			cpuret = cpu_up(cpu);
+			cpuret = device_online(dev);
 			break;
 		}
-		if (cpuret) {
+		if (cpuret < 0) {
 			pr_debug("%s: cpu_%s for cpu#%d returned %d.\n",
 					__func__,
 					((state == UP) ? "up" : "down"),
@@ -972,6 +974,8 @@ int rtas_ibm_suspend_me(u64 handle)
 	data.token = rtas_token("ibm,suspend-me");
 	data.complete = &done;
 
+	lock_device_hotplug();
+
 	/* All present CPUs must be online */
 	cpumask_andnot(offline_mask, cpu_present_mask, cpu_online_mask);
 	cpuret = rtas_online_cpus_mask(offline_mask);
@@ -1011,6 +1015,7 @@ int rtas_ibm_suspend_me(u64 handle)
 				__func__);
 
 out:
+	unlock_device_hotplug();
 	free_cpumask_var(offline_mask);
 	return atomic_read(&data.error);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] powerpc/rtas: allow rescheduling while changing cpu states
  2019-07-18 19:29 [PATCH 0/2] more migration vs CPU hotplug fixes Nathan Lynch
  2019-07-18 19:29 ` [PATCH 1/2] powerpc/rtas: use device model APIs and serialization during LPM Nathan Lynch
@ 2019-07-18 19:29 ` Nathan Lynch
  1 sibling, 0 replies; 4+ messages in thread
From: Nathan Lynch @ 2019-07-18 19:29 UTC (permalink / raw)
  To: linuxppc-dev

rtas_cpu_state_change_mask() potentially operates on scores of cpus,
so explicitly allow rescheduling in the loop body.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/kernel/rtas.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index fbefd9ff6dab..396fb2f35c01 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -20,6 +20,7 @@
 #include <linux/capability.h>
 #include <linux/delay.h>
 #include <linux/cpu.h>
+#include <linux/sched.h>
 #include <linux/smp.h>
 #include <linux/completion.h>
 #include <linux/cpumask.h>
@@ -902,6 +903,7 @@ static int rtas_cpu_state_change_mask(enum rtas_cpu_state state,
 				cpumask_clear_cpu(cpu, cpus);
 			}
 		}
+		cond_resched();
 	}
 
 	return ret;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/2] powerpc/rtas: use device model APIs and serialization during LPM
  2019-07-18 19:29 ` [PATCH 1/2] powerpc/rtas: use device model APIs and serialization during LPM Nathan Lynch
@ 2019-07-18 19:46   ` Nathan Lynch
  0 siblings, 0 replies; 4+ messages in thread
From: Nathan Lynch @ 2019-07-18 19:46 UTC (permalink / raw)
  To: linuxppc-dev

Nathan Lynch <nathanl@linux.ibm.com> writes:

> During LPAR migration, cpu hotplug and migration operations can
> interleave like so:
>
> cd /sys/devices/system/cpu/cpu7/ | drmgr -m -c pmig -p pre \
> echo 0 > online                  | -s 0xd7a884f83d830e6d -t 19 \
> echo 1 > online                  | -n -d 1 5
> ---------------------------------+-------------------------------------------
> online_store() {                 |
>   device_offline() {             |
>     cpu_subsys_offline() {       |
>       cpu_down(7);               |
>     }                            |
>     dev->offline = true;         |
>   }                              | migration_store() {
> }                                |   rtas_ibm_suspend_me() {
>                                  |     rtas_online_cpus_mask() {
>                                  |       cpu_up(7);
> 				 |     }
>                                  |     cpu_hotplug_disable();
>                                  |     on_each_cpu(rtas_percpu_suspend_me());
>                                  |     cpu_hotplug_enable();
> online_store() {                 |
>   device_online() {              |
>     cpu_subsys_online() {        |
>       cpu_up(7);                 |
>     }                            |
>     dev->offline = false;        |
>   }                              |     rtas_offline_cpus_mask() {
> }                                |       rtas_cpu_state_change_mask() {
>                                  |         cpu_down(7);
> 				 |       }
> 				 |     }
> 				 |   }
> 				 | }

Actually I think this isn't a correct depiction of the race. I'll
rewrite and resend.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-07-18 19:52 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-07-18 19:29 [PATCH 0/2] more migration vs CPU hotplug fixes Nathan Lynch
2019-07-18 19:29 ` [PATCH 1/2] powerpc/rtas: use device model APIs and serialization during LPM Nathan Lynch
2019-07-18 19:46   ` Nathan Lynch
2019-07-18 19:29 ` [PATCH 2/2] powerpc/rtas: allow rescheduling while changing cpu states Nathan Lynch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).