* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
From: Edgecombe, Rick P @ 2020-10-29 23:19 UTC (permalink / raw)
To: rppt@kernel.org
Cc: david@redhat.com, peterz@infradead.org, catalin.marinas@arm.com,
dave.hansen@linux.intel.com, linux-mm@kvack.org, paulus@samba.org,
pavel@ucw.cz, hpa@zytor.com, sparclinux@vger.kernel.org,
cl@linux.com, will@kernel.org, linux-riscv@lists.infradead.org,
linux-s390@vger.kernel.org, x86@kernel.org, rppt@linux.ibm.com,
borntraeger@de.ibm.com, mingo@redhat.com, rientjes@google.com,
Brown, Len, aou@eecs.berkeley.edu, gor@linux.ibm.com,
linux-pm@vger.kernel.org, hca@linux.ibm.com, bp@alien8.de,
luto@kernel.org, paul.walmsley@sifive.com, kirill@shutemov.name,
tglx@linutronix.de, iamjoonsoo.kim@lge.com,
linux-arm-kernel@lists.infradead.org, rjw@rjwysocki.net,
linux-kernel@vger.kernel.org, penberg@kernel.org,
palmer@dabbelt.com, akpm@linux-foundation.org,
linuxppc-dev@lists.ozlabs.org, davem@davemloft.net
In-Reply-To: <20201029075416.GJ1428094@kernel.org>
On Thu, 2020-10-29 at 09:54 +0200, Mike Rapoport wrote:
> __kernel_map_pages() on arm64 will also bail out if rodata_full is
> false:
> void __kernel_map_pages(struct page *page, int numpages, int enable)
> {
> 	if (!debug_pagealloc_enabled() && !rodata_full)
> 		return;
>
> 	set_memory_valid((unsigned long)page_address(page), numpages,
> 			 enable);
> }
>
> So using set_direct_map() to map back pages removed from the direct map
> with __kernel_map_pages() seems safe to me.
Heh, one of us must have some simple boolean error in our head. I hope
it's not me! :) I'll try one more time.
__kernel_map_pages() will bail out if rodata_full is false **AND**
debug page alloc is off. So it will only bail under conditions where
there could be nothing unmapped on the direct map.
Equivalent logic would be:
	if (!(debug_pagealloc_enabled() || rodata_full))
		return;
Or:
	if (debug_pagealloc_enabled() || rodata_full)
		set_memory_valid(blah)
So if either is on, the existing code will try to re-map. But the
set_direct_map_*() functions will only work if rodata_full is on. So
switching hibernate to set_direct_map_*() will cause the remap to be
missed for the debug page alloc case, with !rodata_full.
It also breaks normal debug page alloc usage with !rodata_full for
similar reasons after patch 3. The pages would never get unmapped.
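
To make the boolean argument concrete, here is a small user-space sketch
(not kernel code) that enumerates the four combinations. The helper names
are hypothetical, and set_direct_map_acts() encodes the assumption stated
above that the arm64 set_direct_map_*() functions only do anything when
rodata_full is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as written in arm64 __kernel_map_pages(): bail out only
 * when BOTH debug_pagealloc and rodata_full are off. */
static bool bails_original(bool debug_pagealloc, bool rodata_full)
{
	return !debug_pagealloc && !rodata_full;
}

/* The same condition after applying De Morgan's law. */
static bool bails_demorgan(bool debug_pagealloc, bool rodata_full)
{
	return !(debug_pagealloc || rodata_full);
}

/* Assumption from the discussion: set_direct_map_*() on arm64 is a
 * no-op unless rodata_full is set, regardless of debug_pagealloc. */
static bool set_direct_map_acts(bool debug_pagealloc, bool rodata_full)
{
	(void)debug_pagealloc;
	return rodata_full;
}

/* True when __kernel_map_pages() would have remapped the page but a
 * switch to set_direct_map_*() would silently skip the remap. */
static bool remap_missed(bool debug_pagealloc, bool rodata_full)
{
	bool old_path_maps = !bails_original(debug_pagealloc, rodata_full);

	return old_path_maps &&
	       !set_direct_map_acts(debug_pagealloc, rodata_full);
}

/* Checks that both spellings of the bail-out condition agree. */
static void check_equivalence(void)
{
	for (int d = 0; d <= 1; d++)
		for (int r = 0; r <= 1; r++)
			assert(bails_original(d, r) == bails_demorgan(d, r));
}
```

Under these assumptions the only combination where remap_missed() is
true is debug_pagealloc on with rodata_full off, which is the case
described above.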
^ permalink raw reply
* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
From: Edgecombe, Rick P @ 2020-10-29 23:19 UTC (permalink / raw)
To: will@kernel.org, rppt@kernel.org
Cc: david@redhat.com, peterz@infradead.org, catalin.marinas@arm.com,
dave.hansen@linux.intel.com, linux-mm@kvack.org, paulus@samba.org,
pavel@ucw.cz, hpa@zytor.com, sparclinux@vger.kernel.org,
cl@linux.com, linux-riscv@lists.infradead.org,
linux-s390@vger.kernel.org, x86@kernel.org, rppt@linux.ibm.com,
borntraeger@de.ibm.com, mingo@redhat.com, rientjes@google.com,
Brown, Len, aou@eecs.berkeley.edu, gor@linux.ibm.com,
linux-pm@vger.kernel.org, hca@linux.ibm.com, bp@alien8.de,
luto@kernel.org, paul.walmsley@sifive.com, kirill@shutemov.name,
tglx@linutronix.de, iamjoonsoo.kim@lge.com,
linux-arm-kernel@lists.infradead.org, rjw@rjwysocki.net,
linux-kernel@vger.kernel.org, penberg@kernel.org,
palmer@dabbelt.com, akpm@linux-foundation.org,
linuxppc-dev@lists.ozlabs.org, davem@davemloft.net
In-Reply-To: <20201029081225.GK1428094@kernel.org>
On Thu, 2020-10-29 at 10:12 +0200, Mike Rapoport wrote:
> The goal of this series was primarily to separate dependencies and
> make it clearer what DEBUG_PAGEALLOC and what SET_DIRECT_MAP are. As
> it turned out, there is also some lack of consistency between
> architectures that implement either of these, so I tried to improve
> this as well.
>
> Honestly, I don't know if a thread can be paused at the time
> __vunmap()
> left invalid pages, but it could, there is an issue on arm64 with
> DEBUG_PAGEALLOC=n and this set fixes it.
Ah, ok. So from this and the other thread, this is about the logic in
arm's cpa for when it will try the un/map operations. I think the logic
actually works currently. And this series introduces a problem on ARM
similar to the one you are saying preexists. I put the details in the
other thread.
^ permalink raw reply
* Test Results: RE: [V2,15/18] io-mapping: Cleanup atomic iomap
From: snowpatch @ 2020-10-29 23:20 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222652.084086429@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2,14/18] mm/highmem: Remove the old kmap_atomic cruft
From: snowpatch @ 2020-10-29 23:22 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.992069499@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2,16/18] sched: highmem: Store local kmaps in task struct
From: snowpatch @ 2020-10-29 23:24 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222652.194349374@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2,13/18] xtensa/mm/highmem: Switch to generic kmap atomic
From: snowpatch @ 2020-10-29 23:26 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.885593433@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2,12/18] sparc/mm/highmem: Switch to generic kmap atomic
From: snowpatch @ 2020-10-29 23:29 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.790791701@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2,11/18] powerpc/mm/highmem: Switch to generic kmap atomic
From: snowpatch @ 2020-10-29 23:31 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.695446198@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2,10/18] nds32/mm/highmem: Switch to generic kmap atomic
From: snowpatch @ 2020-10-29 23:33 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.586549209@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2, 09/18] mips/mm/highmem: Switch to generic kmap atomic
From: snowpatch @ 2020-10-29 23:36 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.490984112@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2,08/18] microblaze/mm/highmem: Switch to generic kmap atomic
From: snowpatch @ 2020-10-29 23:38 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.395482379@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2,06/18] ARM: highmem: Switch to generic kmap atomic
From: snowpatch @ 2020-10-29 23:40 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.209698448@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Re: [patch V2 00/18] mm/highmem: Preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-10-29 23:41 UTC (permalink / raw)
To: Linus Torvalds
Cc: Juri Lelli, linux-xtensa, Peter Zijlstra,
Sebastian Andrzej Siewior, linux-mips, Ben Segall, Linux-MM,
Guo Ren, linux-sparc, Vincent Chen, Ingo Molnar, linux-arch,
Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
Russell King, linux-csky, Christoph Hellwig, David Airlie,
Mel Gorman, open list:SYNOPSYS ARC ARCHITECTURE, Ard Biesheuvel,
Paul McKenney, linuxppc-dev, Steven Rostedt, Greentime Hu,
Dietmar Eggemann, Linux ARM, Chris Zankel, Michal Simek,
Thomas Bogendoerfer, Nick Hu, Max Filippov, Vineet Gupta, LKML,
Arnd Bergmann, Daniel Vetter, Paul Mackerras, Andrew Morton,
Daniel Bristot de Oliveira, David S. Miller
In-Reply-To: <CAHk-=wiFxxGapdOyZHE-7LbFPk+jdfoqdeeJg0zWNQ86WvJGXg@mail.gmail.com>
On Thu, Oct 29 2020 at 16:11, Linus Torvalds wrote:
> On Thu, Oct 29, 2020 at 3:32 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> Though I wanted to share the current state of affairs before investigating
>> that further. If there is consensus in going forward with this, I'll have a
>> deeper look into this issue.
>
> Me likee. I think this looks like the right thing to do.
>
> I didn't actually apply the patches, but just from reading them it
> _looks_ to me like you do the migrate_disable() unconditionally, even
> if it's not a highmem page..
>
> That sounds like it might be a good thing for debugging, but not
> necessarily great in general.
>
> Or am I misreading things?
No, you're not misreading it, but doing it conditionally would be a
complete semantic disaster. kmap_atomic*() also disables preemption
and pagefaults unconditionally. If that weren't the case then every
caller would have to have conditionals like 'if (CONFIG_HIGHMEM)' or
worse 'if (PageHighMem(page))'.
Let's not go there.
Migrate disable is a less horrible plague than preempt and pagefault
disable even if the scheduler people disagree due to the lack of theory
backing that up :)
The charm of the new interface is that users can still rely on
per-CPU-ness independent of being on a highmem plagued system. For non
highmem systems the extra migrate disable/enable is really a minor
nuisance.
Thanks,
tglx
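
A rough user-space illustration of the point about unconditional
semantics, as a sketch only: mock_kmap(), mock_kunmap() and struct
mock_task are hypothetical stand-ins, not the interface from the series.
The mapping helper always bumps a migrate-disable depth, so a caller gets
the same "pinned to this CPU" guarantee whether or not the page is
highmem, and never needs an 'if (PageHighMem(page))' of its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task state; only what the sketch needs. */
struct mock_task {
	int migrate_disable_depth;	/* >0 means the task may not migrate */
	int temp_mappings;		/* live temporary (highmem) mappings */
};

static void mock_migrate_disable(struct mock_task *t)
{
	t->migrate_disable_depth++;
}

static void mock_migrate_enable(struct mock_task *t)
{
	assert(t->migrate_disable_depth > 0);
	t->migrate_disable_depth--;
}

/* Migration is disabled unconditionally; only the temporary mapping
 * itself depends on whether the page is highmem. */
static void mock_kmap(struct mock_task *t, bool page_is_highmem)
{
	mock_migrate_disable(t);
	if (page_is_highmem)
		t->temp_mappings++;
}

static void mock_kunmap(struct mock_task *t, bool page_is_highmem)
{
	if (page_is_highmem)
		t->temp_mappings--;
	mock_migrate_enable(t);
}

/* What callers rely on between map and unmap. */
static bool pinned_to_cpu(const struct mock_task *t)
{
	return t->migrate_disable_depth > 0;
}
```

In this model a lowmem page costs one increment and one decrement and
creates no mapping, yet pinned_to_cpu() still holds between the two
calls, which is the uniform behaviour argued for above.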
^ permalink raw reply
* Test Results: RE: [V2, 07/18] csky/mm/highmem: Switch to generic kmap atomic
From: snowpatch @ 2020-10-29 23:42 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.303553207@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2, 05/18] arc/mm/highmem: Use generic kmap atomic implementation
From: snowpatch @ 2020-10-29 23:44 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222651.114375025@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Test Results: RE: [V2,03/18] highmem: Provide generic variant of kmap_atomic*
From: snowpatch @ 2020-10-29 23:46 UTC (permalink / raw)
To: Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201029222650.910901973@linutronix.de>
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
Thanks for your contribution, unfortunately we've found some issues.
Your patch failed to apply to any branch.
^ permalink raw reply
* Re: [PATCH] powerpc: add support for TIF_NOTIFY_SIGNAL
From: Michael Ellerman @ 2020-10-30 0:48 UTC (permalink / raw)
To: Jens Axboe, linuxppc-dev
In-Reply-To: <7adea1eb-d193-9d31-6244-e8cd5b2084b2@kernel.dk>
Jens Axboe <axboe@kernel.dk> writes:
> Wire up TIF_NOTIFY_SIGNAL handling for powerpc.
>
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>
> 5.11 has support queued up for TIF_NOTIFY_SIGNAL, see this posting
> for details:
>
> https://lore.kernel.org/io-uring/20201026203230.386348-1-axboe@kernel.dk/
>
> As part of that work, I'm adding TIF_NOTIFY_SIGNAL support to all archs,
> as that will enable a set of cleanups once all of them support it. I'm
> happy carrying this patch if need be, or it can be funneled through the
> arch tree. Let me know.
Happy for you to take it along with the rest of the series.
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
cheers
> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
> index 46a210b03d2b..53115ae61495 100644
> --- a/arch/powerpc/include/asm/thread_info.h
> +++ b/arch/powerpc/include/asm/thread_info.h
> @@ -90,6 +90,7 @@ void arch_setup_new_exec(void);
> #define TIF_SYSCALL_TRACE 0 /* syscall trace active */
> #define TIF_SIGPENDING 1 /* signal pending */
> #define TIF_NEED_RESCHED 2 /* rescheduling necessary */
> +#define TIF_NOTIFY_SIGNAL 3 /* signal notifications exist */
> #define TIF_SYSCALL_EMU 4 /* syscall emulation active */
> #define TIF_RESTORE_TM 5 /* need to restore TM FP/VEC/VSX */
> #define TIF_PATCH_PENDING 6 /* pending live patching update */
> @@ -115,6 +116,7 @@ void arch_setup_new_exec(void);
> #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE)
> #define _TIF_SIGPENDING (1<<TIF_SIGPENDING)
> #define _TIF_NEED_RESCHED (1<<TIF_NEED_RESCHED)
> +#define _TIF_NOTIFY_SIGNAL (1<<TIF_NOTIFY_SIGNAL)
> #define _TIF_POLLING_NRFLAG (1<<TIF_POLLING_NRFLAG)
> #define _TIF_32BIT (1<<TIF_32BIT)
> #define _TIF_RESTORE_TM (1<<TIF_RESTORE_TM)
> @@ -136,7 +138,8 @@ void arch_setup_new_exec(void);
>
> #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
> _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
> - _TIF_RESTORE_TM | _TIF_PATCH_PENDING)
> + _TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
> + _TIF_NOTIFY_SIGNAL)
> #define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR)
>
> /* Bits in local_flags */
> diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
> index d2c356f37077..a8bb0aca1d02 100644
> --- a/arch/powerpc/kernel/signal.c
> +++ b/arch/powerpc/kernel/signal.c
> @@ -318,7 +318,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
> if (thread_info_flags & _TIF_PATCH_PENDING)
> klp_update_patch_state(current);
>
> - if (thread_info_flags & _TIF_SIGPENDING) {
> + if (thread_info_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL)) {
> BUG_ON(regs != current->thread.regs);
> do_signal(current);
> }
> --
> 2.29.0
>
> --
> Jens Axboe
^ permalink raw reply
* [PATCH 01/29] powerpc/rtas: move rtas_call_reentrant() out of pseries guards
From: Nathan Lynch @ 2020-10-30 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: tyreld, ajd, mmc, cforno12, drt, brking
In-Reply-To: <20201030011805.1224603-1-nathanl@linux.ibm.com>
rtas_call_reentrant() isn't platform-dependent; move it out of
CONFIG_PPC_PSERIES-guarded code.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
arch/powerpc/kernel/rtas.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 954f41676f69..b40fc892138b 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -897,6 +897,12 @@ int rtas_ibm_suspend_me(u64 handle)
return atomic_read(&data.error);
}
+#else /* CONFIG_PPC_PSERIES */
+int rtas_ibm_suspend_me(u64 handle)
+{
+ return -ENOSYS;
+}
+#endif
/**
* rtas_call_reentrant() - Used for reentrant rtas calls
@@ -948,13 +954,6 @@ int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...)
return ret;
}
-#else /* CONFIG_PPC_PSERIES */
-int rtas_ibm_suspend_me(u64 handle)
-{
- return -ENOSYS;
-}
-#endif
-
/**
* Find a specific pseries error log in an RTAS extended event log.
* @log: RTAS error/event log
--
2.25.4
^ permalink raw reply related
* [PATCH 02/29] powerpc/rtas: prevent suspend-related sys_rtas use on LE
From: Nathan Lynch @ 2020-10-30 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: tyreld, ajd, mmc, cforno12, drt, brking
In-Reply-To: <20201030011805.1224603-1-nathanl@linux.ibm.com>
While drmgr has had work in some areas to make its RTAS syscall
interactions endian-neutral, its code for performing partition
migration via the syscall has never worked on LE. While it is able to
complete ibm,suspend-me successfully, it crashes when attempting the
subsequent ibm,update-nodes call.
drmgr is the only known (or plausible) user of ibm,suspend-me,
ibm,update-nodes, and ibm,update-properties via the syscall, so allow
them only in big-endian configurations.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
arch/powerpc/kernel/rtas.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index b40fc892138b..132b2ae39009 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1049,9 +1049,11 @@ static struct rtas_filter rtas_filters[] __ro_after_init = {
{ "set-time-for-power-on", -1, -1, -1, -1, -1 },
{ "ibm,set-system-parameter", -1, 1, -1, -1, -1 },
{ "set-time-of-day", -1, -1, -1, -1, -1 },
+#ifdef CONFIG_CPU_BIG_ENDIAN
{ "ibm,suspend-me", -1, -1, -1, -1, -1 },
{ "ibm,update-nodes", -1, 0, -1, -1, -1, 4096 },
{ "ibm,update-properties", -1, 0, -1, -1, -1, 4096 },
+#endif
{ "ibm,physical-attestation", -1, 0, 1, -1, -1 },
};
--
2.25.4
^ permalink raw reply related
* [PATCH 03/29] powerpc/rtas: complete ibm,suspend-me status codes
From: Nathan Lynch @ 2020-10-30 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: tyreld, ajd, mmc, cforno12, drt, brking
In-Reply-To: <20201030011805.1224603-1-nathanl@linux.ibm.com>
We don't completely account for the possible return codes for
ibm,suspend-me. Add definitions for these.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
arch/powerpc/include/asm/rtas.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 55f9a154c95d..f060181a0d32 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -23,11 +23,16 @@
#define RTAS_RMOBUF_MAX (64 * 1024)
/* RTAS return status codes */
-#define RTAS_NOT_SUSPENDABLE -9004
#define RTAS_BUSY -2 /* RTAS Busy */
#define RTAS_EXTENDED_DELAY_MIN 9900
#define RTAS_EXTENDED_DELAY_MAX 9905
+/* statuses specific to ibm,suspend-me */
+#define RTAS_SUSPEND_ABORTED 9000 /* Suspension aborted */
+#define RTAS_NOT_SUSPENDABLE -9004 /* Partition not suspendable */
+#define RTAS_THREADS_ACTIVE -9005 /* Multiple processor threads active */
+#define RTAS_OUTSTANDING_COPROC -9006 /* Outstanding coprocessor operations */
+
/*
* In general to call RTAS use rtas_token("string") to lookup
* an RTAS token for the given string (e.g. "event-scan").
--
2.25.4
^ permalink raw reply related
* [PATCH 06/29] powerpc/rtas: add rtas_activate_firmware()
From: Nathan Lynch @ 2020-10-30 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: tyreld, ajd, mmc, cforno12, drt, brking
In-Reply-To: <20201030011805.1224603-1-nathanl@linux.ibm.com>
Provide a documented wrapper function for the ibm,activate-firmware
service, which must be called after a partition migration or
hibernation.
If the function is absent or the call fails, the OS will continue to
run normally with the current firmware, so there is no need to perform
any recovery. Just log it and continue.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
arch/powerpc/include/asm/rtas.h | 1 +
arch/powerpc/kernel/rtas.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 31 insertions(+)
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index b43165fc6c2a..fdefe6a974eb 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -247,6 +247,7 @@ extern void __noreturn rtas_restart(char *cmd);
extern void rtas_power_off(void);
extern void __noreturn rtas_halt(void);
extern void rtas_os_term(char *str);
+void rtas_activate_firmware(void);
extern int rtas_get_sensor(int sensor, int index, int *state);
extern int rtas_get_sensor_fast(int sensor, int index, int *state);
extern int rtas_get_power_level(int powerdomain, int *level);
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 70c570269d7b..58bbd69a233f 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -961,6 +961,36 @@ int rtas_ibm_suspend_me_unsafe(u64 handle)
}
#endif
+/**
+ * rtas_activate_firmware() - Activate a new version of firmware.
+ *
+ * Activate a new version of partition firmware. The OS must call this
+ * after resuming from a partition hibernation or migration in order
+ * to maintain the ability to perform live firmware updates. It's not
+ * catastrophic for this method to be absent or to fail; just log the
+ * condition in that case.
+ *
+ * Context: This function may sleep.
+ */
+void rtas_activate_firmware(void)
+{
+	int token;
+	int fwrc;
+
+	token = rtas_token("ibm,activate-firmware");
+	if (token == RTAS_UNKNOWN_SERVICE) {
+		pr_notice("ibm,activate-firmware method unavailable\n");
+		return;
+	}
+
+	do {
+		fwrc = rtas_call(token, 0, 1, NULL);
+	} while (rtas_busy_delay(fwrc));
+
+	if (fwrc)
+		pr_err("ibm,activate-firmware failed (%i)\n", fwrc);
+}
+
/**
* rtas_call_reentrant() - Used for reentrant rtas calls
* @token: Token for desired reentrant RTAS call
--
2.25.4
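
The call-until-not-busy loop in rtas_activate_firmware() can be exercised
outside the kernel with a mock, as a sketch only. Everything prefixed
mock_ is hypothetical; MOCK_RTAS_BUSY reuses the -2 value that rtas.h
gives RTAS_BUSY. Unlike the real rtas_busy_delay(), the stand-in here
neither sleeps nor handles the extended-delay statuses.

```c
#include <assert.h>

#define MOCK_RTAS_BUSY	(-2)	/* same value as RTAS_BUSY in rtas.h */

/* Hypothetical firmware model: answers "busy" a fixed number of times,
 * then returns final_status. */
struct mock_fw {
	int busy_replies_left;
	int final_status;
	int calls;
};

static int mock_rtas_call(struct mock_fw *fw)
{
	assert(fw);
	fw->calls++;
	if (fw->busy_replies_left > 0) {
		fw->busy_replies_left--;
		return MOCK_RTAS_BUSY;
	}
	return fw->final_status;
}

/* Simplified stand-in for rtas_busy_delay(): nonzero means "retry". */
static int mock_busy_delay(int status)
{
	return status == MOCK_RTAS_BUSY;
}

/* Same loop shape as the patch: repeat the call while firmware is busy,
 * then hand back the final status for the caller to log. */
static int mock_activate_firmware(struct mock_fw *fw)
{
	int fwrc;

	do {
		fwrc = mock_rtas_call(fw);
	} while (mock_busy_delay(fwrc));

	return fwrc;
}
```

With three busy replies queued, the mock is called four times before the
final status comes back; a non-busy failure is returned after one call.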
^ permalink raw reply related
* [PATCH 04/29] powerpc/rtas: rtas_ibm_suspend_me -> rtas_ibm_suspend_me_unsafe
From: Nathan Lynch @ 2020-10-30 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: tyreld, ajd, mmc, cforno12, drt, brking
In-Reply-To: <20201030011805.1224603-1-nathanl@linux.ibm.com>
The pseries partition suspend sequence requires that all active CPUs
call H_JOIN, which suspends all but one of them with interrupts
disabled. The "chosen" CPU is then to call ibm,suspend-me to complete
the suspend. Upon returning from ibm,suspend-me, the chosen CPU is to
use H_PROD to wake the joined CPUs.
Using on_each_cpu() for this, as rtas_ibm_suspend_me() does to
implement partition migration, is susceptible to deadlock with other
users of on_each_cpu() and with users of stop_machine APIs. The
callback passed to on_each_cpu() is not allowed to synchronize with
other CPUs in the way it is used here.
Complicating the fix is the fact that rtas_ibm_suspend_me() also
occupies the function name that should be used to provide a more
conventional wrapper for ibm,suspend-me. Rename rtas_ibm_suspend_me()
to rtas_ibm_suspend_me_unsafe() to free up the name and indicate that
it should not gain users.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
arch/powerpc/include/asm/rtas.h | 2 +-
arch/powerpc/kernel/rtas.c | 6 +++---
arch/powerpc/platforms/pseries/mobility.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index f060181a0d32..8436ed01567b 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -257,7 +257,7 @@ extern int rtas_set_indicator_fast(int indicator, int index, int new_value);
extern void rtas_progress(char *s, unsigned short hex);
extern int rtas_suspend_cpu(struct rtas_suspend_me_data *data);
extern int rtas_suspend_last_cpu(struct rtas_suspend_me_data *data);
-extern int rtas_ibm_suspend_me(u64 handle);
+int rtas_ibm_suspend_me_unsafe(u64 handle);
struct rtc_time;
extern time64_t rtas_get_boot_time(void);
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 132b2ae39009..33adefa84a42 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -843,7 +843,7 @@ static void rtas_percpu_suspend_me(void *info)
__rtas_suspend_cpu((struct rtas_suspend_me_data *)info, 1);
}
-int rtas_ibm_suspend_me(u64 handle)
+int rtas_ibm_suspend_me_unsafe(u64 handle)
{
long state;
long rc;
@@ -898,7 +898,7 @@ int rtas_ibm_suspend_me(u64 handle)
return atomic_read(&data.error);
}
#else /* CONFIG_PPC_PSERIES */
-int rtas_ibm_suspend_me(u64 handle)
+int rtas_ibm_suspend_me_unsafe(u64 handle)
{
return -ENOSYS;
}
@@ -1184,7 +1184,7 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
int rc = 0;
u64 handle = ((u64)be32_to_cpu(args.args[0]) << 32)
| be32_to_cpu(args.args[1]);
- rc = rtas_ibm_suspend_me(handle);
+ rc = rtas_ibm_suspend_me_unsafe(handle);
if (rc == -EAGAIN)
args.rets[0] = cpu_to_be32(RTAS_NOT_SUSPENDABLE);
else if (rc == -EIO)
diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index d6f4162478a5..b6de65cbfcd9 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -370,7 +370,7 @@ static ssize_t migration_store(struct class *class,
return rc;
do {
- rc = rtas_ibm_suspend_me(streamid);
+ rc = rtas_ibm_suspend_me_unsafe(streamid);
if (rc == -EAGAIN)
ssleep(1);
} while (rc == -EAGAIN);
--
2.25.4
^ permalink raw reply related
* [PATCH 00/29] partition suspend updates
From: Nathan Lynch @ 2020-10-30 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: tyreld, ajd, mmc, cforno12, drt, brking
This series aims to improve the pseries-specific partition migration
and hibernation implementation, part of which has been living in
kernel/rtas.c. Most of that code is eliminated or moved to
platforms/pseries, and the following major functional changes are
made:
- Use stop_machine() instead of on_each_cpu() to avoid deadlock in the
join/suspend sequence.
- Retry the join/suspend sequence on errors that are likely to be
transient. This is a mitigation for the fact that drivers currently
have no way to prepare for an impending partition suspension,
sometimes resulting in a virtual adapter being in a state which
causes the platform to fail the suspend call.
- Request cancellation of the migration via H_VASI_SIGNAL if Linux is
going to error out of the suspend attempt. This allows the
management console and other entities to promptly clean up their
operations instead of relying on long timeouts to fail the
migration.
- Little-endian users of ibm,suspend-me, ibm,update-nodes and
ibm,update-properties via sys_rtas are blocked when
CONFIG_PPC_RTAS_FILTERS is enabled.
- Legacy user space code (drmgr) historically has driven the migration
process by using sys_rtas to separately call ibm,suspend-me,
ibm,activate-firmware, and ibm,update-nodes/properties, in that
order. With these changes, when sys_rtas() dispatches
ibm,suspend-me, the kernel performs the device tree update and
firmware activation before returning. This is more reliable, and
drmgr does not seem bothered by it.
- If the H_VASI_STATE hcall is absent, the implementation proceeds
with the suspend instead of erroring out. This allows us to exercise
these code paths in QEMU.
Nathan Lynch (29):
powerpc/rtas: move rtas_call_reentrant() out of pseries guards
powerpc/rtas: prevent suspend-related sys_rtas use on LE
powerpc/rtas: complete ibm,suspend-me status codes
powerpc/rtas: rtas_ibm_suspend_me -> rtas_ibm_suspend_me_unsafe
powerpc/rtas: add rtas_ibm_suspend_me()
powerpc/rtas: add rtas_activate_firmware()
powerpc/hvcall: add token and codes for H_VASI_SIGNAL
powerpc/pseries/mobility: don't error on absence of ibm,update-nodes
powerpc/pseries/mobility: add missing break to default case
powerpc/pseries/mobility: error message improvements
powerpc/pseries/mobility: use rtas_activate_firmware() on resume
powerpc/pseries/mobility: extract VASI session polling logic
powerpc/pseries/mobility: use stop_machine for join/suspend
powerpc/pseries/mobility: signal suspend cancellation to platform
powerpc/pseries/mobility: retry partition suspend after error
powerpc/rtas: dispatch partition migration requests to pseries
powerpc/rtas: remove rtas_ibm_suspend_me_unsafe()
powerpc/pseries/hibernation: drop pseries_suspend_begin() from suspend
ops
powerpc/pseries/hibernation: pass stream id via function arguments
powerpc/pseries/hibernation: remove pseries_suspend_cpu()
powerpc/machdep: remove suspend_disable_cpu()
powerpc/rtas: remove rtas_suspend_cpu()
powerpc/pseries/hibernation: switch to rtas_ibm_suspend_me()
powerpc/rtas: remove unused rtas_suspend_last_cpu()
powerpc/pseries/hibernation: remove redundant cacheinfo update
powerpc/pseries/hibernation: perform post-suspend fixups later
powerpc/pseries/hibernation: remove prepare_late() callback
powerpc/rtas: remove unused rtas_suspend_me_data
powerpc/pseries/mobility: refactor node lookup during DT update
arch/powerpc/include/asm/hvcall.h | 9 +
arch/powerpc/include/asm/machdep.h | 1 -
arch/powerpc/include/asm/rtas-types.h | 8 -
arch/powerpc/include/asm/rtas.h | 13 +-
arch/powerpc/kernel/rtas.c | 245 ++++++---------
arch/powerpc/platforms/pseries/mobility.c | 361 ++++++++++++++++++----
arch/powerpc/platforms/pseries/suspend.c | 79 +----
7 files changed, 420 insertions(+), 296 deletions(-)
--
2.25.4
^ permalink raw reply
* [PATCH 05/29] powerpc/rtas: add rtas_ibm_suspend_me()
From: Nathan Lynch @ 2020-10-30 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: tyreld, ajd, mmc, cforno12, drt, brking
In-Reply-To: <20201030011805.1224603-1-nathanl@linux.ibm.com>
Now that the name is available, provide a simple wrapper for
ibm,suspend-me which returns both a Linux errno and optionally the
actual RTAS status to the caller.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
arch/powerpc/include/asm/rtas.h | 1 +
arch/powerpc/kernel/rtas.c | 57 +++++++++++++++++++++++++++++++++
2 files changed, 58 insertions(+)
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 8436ed01567b..b43165fc6c2a 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -258,6 +258,7 @@ extern void rtas_progress(char *s, unsigned short hex);
extern int rtas_suspend_cpu(struct rtas_suspend_me_data *data);
extern int rtas_suspend_last_cpu(struct rtas_suspend_me_data *data);
int rtas_ibm_suspend_me_unsafe(u64 handle);
+int rtas_ibm_suspend_me(int *fw_status);
struct rtc_time;
extern time64_t rtas_get_boot_time(void);
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 33adefa84a42..70c570269d7b 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -684,6 +684,63 @@ int rtas_set_indicator_fast(int indicator, int index, int new_value)
return rc;
}
+/**
+ * rtas_ibm_suspend_me() - Call ibm,suspend-me to suspend the LPAR.
+ *
+ * @fw_status: RTAS call status will be placed here if not NULL.
+ *
+ * rtas_ibm_suspend_me() should be called only on a CPU which has
+ * received H_CONTINUE from the H_JOIN hcall. All other active CPUs
+ * should be waiting to return from H_JOIN.
+ *
+ * rtas_ibm_suspend_me() may suspend execution of the OS
+ * indefinitely. Callers should take appropriate measures upon return, such as
+ * resetting watchdog facilities.
+ *
+ * Callers may choose to retry this call if @fw_status is
+ * %RTAS_THREADS_ACTIVE.
+ *
+ * Return:
+ * 0          - The partition has resumed from suspend, possibly after
+ *              migration to a different host.
+ * -ECANCELED - The operation was aborted.
+ * -EAGAIN    - There were other CPUs not in H_JOIN at the time of the call.
+ * -EBUSY     - Some other condition prevented the suspend from succeeding.
+ * -EIO       - Hardware/platform error.
+ */
+int rtas_ibm_suspend_me(int *fw_status)
+{
+	int fwrc;
+	int ret;
+
+	fwrc = rtas_call(rtas_token("ibm,suspend-me"), 0, 1, NULL);
+
+	switch (fwrc) {
+	case 0:
+		ret = 0;
+		break;
+	case RTAS_SUSPEND_ABORTED:
+		ret = -ECANCELED;
+		break;
+	case RTAS_THREADS_ACTIVE:
+		ret = -EAGAIN;
+		break;
+	case RTAS_NOT_SUSPENDABLE:
+	case RTAS_OUTSTANDING_COPROC:
+		ret = -EBUSY;
+		break;
+	case -1:
+	default:
+		ret = -EIO;
+		break;
+	}
+
+	if (fw_status)
+		*fw_status = fwrc;
+
+	return ret;
+}
+
void __noreturn rtas_restart(char *cmd)
{
if (rtas_flash_term_hook)
--
2.25.4
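
For readers who want to poke at the status-to-errno mapping without a
pseries machine, the switch above can be lifted into a user-space sketch.
The status values are copied from the rtas.h definitions added in patch
03; suspend_status_to_errno() and should_retry() are hypothetical helper
names, and the retry rule only reflects the kernel-doc remark that callers
may retry on RTAS_THREADS_ACTIVE.

```c
#include <assert.h>	/* for caller-side checks such as the one below */
#include <errno.h>

/* Values as defined for ibm,suspend-me in patch 03 of this series. */
#define RTAS_SUSPEND_ABORTED	9000
#define RTAS_NOT_SUSPENDABLE	(-9004)
#define RTAS_THREADS_ACTIVE	(-9005)
#define RTAS_OUTSTANDING_COPROC	(-9006)

/* Mirrors the switch in rtas_ibm_suspend_me(): translate the firmware
 * status into a negative Linux errno, or 0 on success. */
static int suspend_status_to_errno(int fwrc)
{
	switch (fwrc) {
	case 0:
		return 0;
	case RTAS_SUSPEND_ABORTED:
		return -ECANCELED;
	case RTAS_THREADS_ACTIVE:
		return -EAGAIN;
	case RTAS_NOT_SUSPENDABLE:
	case RTAS_OUTSTANDING_COPROC:
		return -EBUSY;
	default:	/* includes -1, the generic hardware error */
		return -EIO;
	}
}

/* Hypothetical caller policy: retry only when other threads were still
 * active at the time of the call. */
static int should_retry(int fw_status)
{
	return fw_status == RTAS_THREADS_ACTIVE;
}
```

A caller-side sanity check could be as simple as
assert(suspend_status_to_errno(RTAS_THREADS_ACTIVE) == -EAGAIN).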
^ permalink raw reply related
* [PATCH 07/29] powerpc/hvcall: add token and codes for H_VASI_SIGNAL
From: Nathan Lynch @ 2020-10-30 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: tyreld, ajd, mmc, cforno12, drt, brking
In-Reply-To: <20201030011805.1224603-1-nathanl@linux.ibm.com>
H_VASI_SIGNAL can be used by a partition to request cancellation of
its migration. To be used in future changes.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
arch/powerpc/include/asm/hvcall.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index c1fbccb04390..c98f5141e3fc 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -155,6 +155,14 @@
#define H_VASI_RESUMED 5
#define H_VASI_COMPLETED 6
+/* VASI signal codes. Only the Cancel code is valid for H_VASI_SIGNAL. */
+#define H_VASI_SIGNAL_CANCEL 1
+#define H_VASI_SIGNAL_ABORT 2
+#define H_VASI_SIGNAL_SUSPEND 3
+#define H_VASI_SIGNAL_COMPLETE 4
+#define H_VASI_SIGNAL_ENABLE 5
+#define H_VASI_SIGNAL_FAILOVER 6
+
/* Each control block has to be on a 4K boundary */
#define H_CB_ALIGNMENT 4096
@@ -261,6 +269,7 @@
#define H_ADD_CONN 0x284
#define H_DEL_CONN 0x288
#define H_JOIN 0x298
+#define H_VASI_SIGNAL 0x2A0
#define H_VASI_STATE 0x2A4
#define H_VIOCTL 0x2A8
#define H_ENABLE_CRQ 0x2B0
--
2.25.4
^ permalink raw reply related