* Re: Fwd: Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Thiago Jung Bauermann @ 2021-02-12 3:21 UTC
To: Lakshmi Ramasubramanian
Cc: Rob Herring, linux-integrity, linuxppc-dev, linux-arm-kernel,
Mimi Zohar
In-Reply-To: <e7f3ae2e-20bc-9901-fb8d-80a3163e7d5e@linux.microsoft.com>
Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
> On 2/11/21 6:11 PM, Thiago Jung Bauermann wrote:
>> Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
>>
>>> On 2/11/21 3:59 PM, Thiago Jung Bauermann wrote:
>>>> Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
>>>>
>>>>> On 2/11/21 9:42 AM, Lakshmi Ramasubramanian wrote:
>>>>>> Hi Rob,
>>>>>> [PATCH] powerpc: Rename kexec elfcorehdr_addr to elf_headers_mem
>>>>>> This change causes a build problem for the x86_64 architecture (please see the
>>>>>> mail from kernel test bot below) since arch/x86/include/asm/kexec.h uses
>>>>>> "elf_load_addr" for the ELF header buffer address and not
>>>>>> "elf_headers_mem".
>>>>>> struct kimage_arch {
>>>>>> ...
>>>>>> /* Core ELF header buffer */
>>>>>> void *elf_headers;
>>>>>> unsigned long elf_headers_sz;
>>>>>> unsigned long elf_load_addr;
>>>>>> };
>>>>>> I am thinking of limiting of_kexec_alloc_and_setup_fdt() to ARM64 and
>>>>>> PPC64 since they are the only ones using this function now.
>>>>>> #if defined(CONFIG_ARM64) && defined(CONFIG_PPC64)
>>>>> Sorry - I meant to say
>>>>> #if defined(CONFIG_ARM64) || defined(CONFIG_PPC64)
>>>>>
>>>> Does it build correctly if you rename elf_headers_mem to elf_load_addr?
>>>> Or the other way around, renaming x86's elf_load_addr to
>>>> elf_headers_mem. I don't really have a preference.
>>>
>>> Yes - changing arm64 and ppc from "elf_headers_mem" to "elf_load_addr" builds
>>> fine.
>>>
>>> But I am concerned about a few other architectures that also define "struct
>>> kimage_arch" such as "parisc", "arm" which do not have any ELF related fields.
>>> They would not build if the config defines CONFIG_KEXEC_FILE and
>>> CONFIG_OF_FLATTREE.
>>>
>>> Do you think that could be an issue?
>> That's a good point. But in practice, arm doesn't support
>> CONFIG_KEXEC_FILE. And while parisc does support CONFIG_KEXEC_FILE, as
>> far as I could determine it doesn't support CONFIG_OF.
>> So IMHO we don't need to worry about them. We'll cross that bridge if we
>> get there. If they ever implement KEXEC_FILE or OF_FLATTREE support,
>> then (again, IMHO) the natural solution would be for them to name the
>> ELF header member the same way the other arches do.
>> And since no other architecture defines struct kimage_arch, those are
>> the only ones we need to consider.
>>
>
> Sounds good Thiago.
>
> I'll rename arm64 and ppc kimage_arch ELF address field to match that defined
> for x86/x64.
>
> Also, will add "fdt_size" param to of_kexec_alloc_and_setup_fdt(). For now, I'll
> use 2*fdt_totalsize(initial_boot_params) for ppc.
>
> Will send the updated patches shortly.
Sounds good. There will be a small conflict with powerpc/next because of
the patch I mentioned, but it's simple to fix by whoever merges the
series.
--
Thiago Jung Bauermann
IBM Linux Technology Center
* Re: Fwd: Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Lakshmi Ramasubramanian @ 2021-02-12 2:28 UTC
To: Thiago Jung Bauermann
Cc: Rob Herring, linux-integrity, linuxppc-dev, linux-arm-kernel,
Mimi Zohar
In-Reply-To: <87eehmox08.fsf@manicouagan.localdomain>
On 2/11/21 6:11 PM, Thiago Jung Bauermann wrote:
>
> Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
>
>> On 2/11/21 3:59 PM, Thiago Jung Bauermann wrote:
>>> Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
>>>
>>>> On 2/11/21 9:42 AM, Lakshmi Ramasubramanian wrote:
>>>>> Hi Rob,
>>>>> [PATCH] powerpc: Rename kexec elfcorehdr_addr to elf_headers_mem
>>>>> This change causes a build problem for the x86_64 architecture (please see the
>>>>> mail from kernel test bot below) since arch/x86/include/asm/kexec.h uses
>>>>> "elf_load_addr" for the ELF header buffer address and not
>>>>> "elf_headers_mem".
>>>>> struct kimage_arch {
>>>>> ...
>>>>> /* Core ELF header buffer */
>>>>> void *elf_headers;
>>>>> unsigned long elf_headers_sz;
>>>>> unsigned long elf_load_addr;
>>>>> };
>>>>> I am thinking of limiting of_kexec_alloc_and_setup_fdt() to ARM64 and
>>>>> PPC64 since they are the only ones using this function now.
>>>>> #if defined(CONFIG_ARM64) && defined(CONFIG_PPC64)
>>>> Sorry - I meant to say
>>>> #if defined(CONFIG_ARM64) || defined(CONFIG_PPC64)
>>>>
>>> Does it build correctly if you rename elf_headers_mem to elf_load_addr?
>>> Or the other way around, renaming x86's elf_load_addr to
>>> elf_headers_mem. I don't really have a preference.
>>
>> Yes - changing arm64 and ppc from "elf_headers_mem" to "elf_load_addr" builds
>> fine.
>>
>> But I am concerned about a few other architectures that also define "struct
>> kimage_arch" such as "parisc", "arm" which do not have any ELF related fields.
>> They would not build if the config defines CONFIG_KEXEC_FILE and
>> CONFIG_OF_FLATTREE.
>>
>> Do you think that could be an issue?
>
> That's a good point. But in practice, arm doesn't support
> CONFIG_KEXEC_FILE. And while parisc does support CONFIG_KEXEC_FILE, as
> far as I could determine it doesn't support CONFIG_OF.
>
> So IMHO we don't need to worry about them. We'll cross that bridge if we
> get there. If they ever implement KEXEC_FILE or OF_FLATTREE support,
> then (again, IMHO) the natural solution would be for them to name the
> ELF header member the same way the other arches do.
>
> And since no other architecture defines struct kimage_arch, those are
> the only ones we need to consider.
>
Sounds good Thiago.
I'll rename arm64 and ppc kimage_arch ELF address field to match that
defined for x86/x64.
Also, will add "fdt_size" param to of_kexec_alloc_and_setup_fdt(). For
now, I'll use 2*fdt_totalsize(initial_boot_params) for ppc.
Will send the updated patches shortly.
-lakshmi
* Re: Fwd: Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Thiago Jung Bauermann @ 2021-02-12 2:11 UTC
To: Lakshmi Ramasubramanian
Cc: Rob Herring, linux-integrity, linuxppc-dev, linux-arm-kernel,
Mimi Zohar
In-Reply-To: <b4ebf962-4210-5d17-2149-6b152d587f95@linux.microsoft.com>
Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
> On 2/11/21 3:59 PM, Thiago Jung Bauermann wrote:
>> Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
>>
>>> On 2/11/21 9:42 AM, Lakshmi Ramasubramanian wrote:
>>>> Hi Rob,
>>>> [PATCH] powerpc: Rename kexec elfcorehdr_addr to elf_headers_mem
>>>> This change causes a build problem for the x86_64 architecture (please see the
>>>> mail from kernel test bot below) since arch/x86/include/asm/kexec.h uses
>>>> "elf_load_addr" for the ELF header buffer address and not
>>>> "elf_headers_mem".
>>>> struct kimage_arch {
>>>> ...
>>>> /* Core ELF header buffer */
>>>> void *elf_headers;
>>>> unsigned long elf_headers_sz;
>>>> unsigned long elf_load_addr;
>>>> };
>>>> I am thinking of limiting of_kexec_alloc_and_setup_fdt() to ARM64 and
>>>> PPC64 since they are the only ones using this function now.
>>>> #if defined(CONFIG_ARM64) && defined(CONFIG_PPC64)
>>> Sorry - I meant to say
>>> #if defined(CONFIG_ARM64) || defined(CONFIG_PPC64)
>>>
>> Does it build correctly if you rename elf_headers_mem to elf_load_addr?
>> Or the other way around, renaming x86's elf_load_addr to
>> elf_headers_mem. I don't really have a preference.
>
> Yes - changing arm64 and ppc from "elf_headers_mem" to "elf_load_addr" builds
> fine.
>
> But I am concerned about a few other architectures that also define "struct
> kimage_arch" such as "parisc", "arm" which do not have any ELF related fields.
> They would not build if the config defines CONFIG_KEXEC_FILE and
> CONFIG_OF_FLATTREE.
>
> Do you think that could be an issue?
That's a good point. But in practice, arm doesn't support
CONFIG_KEXEC_FILE. And while parisc does support CONFIG_KEXEC_FILE, as
far as I could determine it doesn't support CONFIG_OF.
So IMHO we don't need to worry about them. We'll cross that bridge if we
get there. If they ever implement KEXEC_FILE or OF_FLATTREE support,
then (again, IMHO) the natural solution would be for them to name the
ELF header member the same way the other arches do.
And since no other architecture defines struct kimage_arch, those are
the only ones we need to consider.
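
To illustrate why a shared member name is enough, here is a minimal sketch, not code from the series. The sketch_* struct names and the SKETCH_ELF_ADDR() macro are hypothetical, and the field lists are abbreviated. In a real build only one struct kimage_arch exists per architecture; two are shown side by side here only so the example compiles on its own. The common accessor works for either struct because both use the same member name:

```c
/* Hypothetical stand-in for the x86 layout quoted earlier in the thread. */
struct sketch_kimage_arch_x86 {
	void *elf_headers;
	unsigned long elf_headers_sz;
	unsigned long elf_load_addr;
};

/* Hypothetical stand-in for arm64/ppc after the proposed rename. */
struct sketch_kimage_arch_ppc {
	void *elf_headers;
	unsigned long elf_headers_sz;
	unsigned long elf_load_addr;	/* was elf_headers_mem */
};

/*
 * Common code only needs the member name to match. An arch struct
 * without this member would fail to compile at the point of use.
 */
#define SKETCH_ELF_ADDR(arch) ((arch)->elf_load_addr)
```

An architecture whose struct lacks the member would only break the build if the common file is actually compiled for it, which is why the Kconfig combination matters here.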
--
Thiago Jung Bauermann
IBM Linux Technology Center
* Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Thiago Jung Bauermann @ 2021-02-12 1:39 UTC
To: Lakshmi Ramasubramanian
Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
linux-arm-kernel, catalin.marinas, serge, devicetree,
pasha.tatashin, will, prsriva, hsinyi, allison, christophe.leroy,
mbrugger, balajib, dmitry.kasatkin, linux-kernel, james.morse,
gregkh, joe, linux-integrity, linuxppc-dev
In-Reply-To: <8a3aa3d2-2eba-549a-9970-a2b0fe3586c9@linux.microsoft.com>
Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
> On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
>> There's actually a complication that I just noticed and that needs to be
>> addressed. More below.
>>
>
> <...>
>
>>> +
>>> +/*
>>> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
>>> + *
>>> + * @image: kexec image being loaded.
>>> + * @initrd_load_addr: Address where the next initrd will be loaded.
>>> + * @initrd_len: Size of the next initrd, or 0 if there will be none.
>>> + * @cmdline: Command line for the next kernel, or NULL if there will
>>> + * be none.
>>> + *
>>> + * Return: fdt on success, or NULL on error.
>>> + */
>>> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
>>> + unsigned long initrd_load_addr,
>>> + unsigned long initrd_len,
>>> + const char *cmdline)
>>> +{
>>> + void *fdt;
>>> + int ret, chosen_node;
>>> + const void *prop;
>>> + unsigned long fdt_size;
>>> +
>>> + fdt_size = fdt_totalsize(initial_boot_params) +
>>> + (cmdline ? strlen(cmdline) : 0) +
>>> + FDT_EXTRA_SPACE;
>> Just adding 4 KB to initial_boot_params won't be enough for crash
>> kernels on ppc64. The current powerpc code doubles the size of
>> initial_boot_params (which is normally larger than 4 KB) and even that
>> isn't enough. A patch was added to powerpc/next today which uses a more
>> precise (but arch-specific) formula:
>> https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
>> So I believe we need a hook here where architectures can provide their
>> own specific calculation for the size of the fdt. Perhaps a weakly
>> defined function providing a default implementation which an
>> arch-specific file can override (a la arch_kexec_kernel_image_load())?
>> Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
>> function from the patch I linked above.
>>
>
> Do you think it'd be better to add "fdt_size" parameter to
> of_kexec_alloc_and_setup_fdt() so that the caller can provide the
> desired FDT buffer size?
Yes, that is actually simpler and better than my idea. :-)
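
For illustration only, a minimal userspace sketch of the parameter approach, not the actual patch. All sketch_* names and SKETCH_FDT_EXTRA_SPACE are hypothetical stand-ins, and the sizes are plain numbers rather than results of fdt_totalsize(). The caller computes the buffer size (the generic formula, or the doubling that the powerpc code currently uses) and passes it in:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for FDT_EXTRA_SPACE (4 KB, per the discussion). */
#define SKETCH_FDT_EXTRA_SPACE 4096UL

/* Generic sizing: current tree + command line + fixed slack. */
size_t sketch_default_fdt_size(size_t cur_fdt_size, const char *cmdline)
{
	return cur_fdt_size + (cmdline ? strlen(cmdline) : 0) +
	       SKETCH_FDT_EXTRA_SPACE;
}

/* powerpc-style sizing mentioned in the thread: double the current tree. */
size_t sketch_ppc_fdt_size(size_t cur_fdt_size)
{
	return 2 * cur_fdt_size;
}

/*
 * Stand-in for the setup function: the caller chooses the buffer size,
 * so the common code needs no arch-specific knowledge.
 * Returns a zeroed buffer, or NULL on failure.
 */
void *sketch_alloc_fdt(size_t fdt_size)
{
	if (fdt_size == 0)
		return NULL;
	return calloc(1, fdt_size);
}
```

With this shape, a more precise arch-specific formula can replace sketch_ppc_fdt_size() later without touching the common function.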
--
Thiago Jung Bauermann
IBM Linux Technology Center
* [RFC PATCH] powerpc/64s: introduce call_realmode to run C code in real-mode
From: Nicholas Piggin @ 2021-02-12 1:20 UTC
To: linuxppc-dev; +Cc: Nicholas Piggin
The regular kernel stack cannot be accessed in real mode in hash
guest kernels, which prevents the MMU from being disabled in general
C code. Provide a helper that can call a function pointer in real
mode using the emergency stack (accessible in real mode).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/asm-prototypes.h | 1 +
arch/powerpc/include/asm/book3s/64/mmu.h | 2 +
arch/powerpc/include/asm/thread_info.h | 16 ++++++++
arch/powerpc/kernel/irq.c | 16 --------
arch/powerpc/kernel/misc_64.S | 22 +++++++++++
arch/powerpc/mm/book3s64/pgtable.c | 48 +++++++++++++++++++++++
6 files changed, 89 insertions(+), 16 deletions(-)
diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index d0b832cbbec8..a973023c390a 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -126,6 +126,7 @@ extern s64 __ashldi3(s64, int);
extern s64 __ashrdi3(s64, int);
extern int __cmpdi2(s64, s64);
extern int __ucmpdi2(u64, u64);
+int __call_realmode(int (*fn)(void *arg), void *arg, void *sp);
/* tracing */
void _mcount(void);
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 995bbcdd0ef8..80b0d24415ac 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -274,5 +274,7 @@ static inline unsigned long get_user_vsid(mm_context_t *ctx,
return get_vsid(context, ea, ssize);
}
+int call_realmode(int (*fn)(void *arg), void *arg);
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_POWERPC_BOOK3S_64_MMU_H_ */
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 3d8a47af7a25..9279e472d51e 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -172,6 +172,22 @@ static inline bool test_thread_local_flags(unsigned int flags)
#define is_elf2_task() (0)
#endif
+static inline void check_stack_overflow(void)
+{
+ long sp;
+
+ if (!IS_ENABLED(CONFIG_DEBUG_STACKOVERFLOW))
+ return;
+
+ sp = current_stack_pointer & (THREAD_SIZE - 1);
+
+ /* check for stack overflow: is there less than 2KB free? */
+ if (unlikely(sp < 2048)) {
+ pr_err("do_IRQ: stack overflow: %ld\n", sp);
+ dump_stack();
+ }
+}
+
#endif /* !__ASSEMBLY__ */
#endif /* __KERNEL__ */
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 6b1eca53e36c..193b47b5b6a5 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -620,22 +620,6 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
return sum;
}
-static inline void check_stack_overflow(void)
-{
- long sp;
-
- if (!IS_ENABLED(CONFIG_DEBUG_STACKOVERFLOW))
- return;
-
- sp = current_stack_pointer & (THREAD_SIZE - 1);
-
- /* check for stack overflow: is there less than 2KB free? */
- if (unlikely(sp < 2048)) {
- pr_err("do_IRQ: stack overflow: %ld\n", sp);
- dump_stack();
- }
-}
-
void __do_irq(struct pt_regs *regs)
{
unsigned int irq;
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 070465825c21..5e911d0b0b16 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -27,6 +27,28 @@
.text
+#ifdef CONFIG_PPC_BOOK3S_64
+_GLOBAL(__call_realmode)
+ mflr r0
+ std r0,16(r1)
+ stdu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r5)
+ mr r1,r5
+ mtctr r3
+ mr r3,r4
+ mfmsr r4
+ xori r4,r4,(MSR_IR|MSR_DR)
+ mtmsrd r4
+ bctrl
+ mfmsr r4
+ xori r4,r4,(MSR_IR|MSR_DR)
+ mtmsrd r4
+ ld r1,0(r1)
+ ld r0,16(r1)
+ mtlr r0
+ blr
+
+#endif
+
_GLOBAL(call_do_softirq)
mflr r0
std r0,16(r1)
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 5b3a3bae21aa..aad0e2059305 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -474,6 +474,54 @@ int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
return true;
}
+/*
+ * Executing C code in real-mode in general Book3S-64 code can only be done
+ * via this function that switches the stack to one inside the real-mode-area,
+ * which may cover only a small first part of real memory on hash guest LPARs.
+ * fn must be NOKPROBES, must not access vmalloc or anything outside the RMA,
+ * probably shouldn't enable the MMU or interrupts, etc, and be very careful
+ * about calling other generic kernel or powerpc functions.
+ */
+int call_realmode(int (*fn)(void *arg), void *arg)
+{
+ unsigned long flags;
+ void *cursp, *emsp;
+ int ret;
+
+ if (WARN_ON_ONCE(!(mfmsr() & MSR_DR)))
+ return -EINVAL;
+ if (WARN_ON_ONCE(!(mfmsr() & MSR_IR)))
+ return -EINVAL;
+
+ /*
+ * The switch to emergency stack is only really required for HPT LPAR,
+ * but do it for all to help test coverage of tricky code.
+ */
+ cursp = (void *)(current_stack_pointer & ~(THREAD_SIZE - 1));
+ emsp = (void *)(local_paca->emergency_sp - THREAD_SIZE);
+
+ /*
+ * It's probably okay to go to real-mode and call directly in case we
+ * are already on the emergency stack, so allow it. But we may want to
+ * prevent callers from doing this in future though, so warn.
+ */
+ WARN_ON_ONCE(cursp == emsp);
+
+ check_stack_overflow();
+
+ local_irq_save(flags);
+ hard_irq_disable();
+
+ if (cursp == emsp)
+ ret = fn(arg);
+ else
+ ret = __call_realmode(fn, arg, emsp);
+
+ local_irq_restore(flags);
+
+ return ret;
+}
+
/*
* Does the CPU support tlbie?
*/
--
2.23.0
* Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Lakshmi Ramasubramanian @ 2021-02-12 1:17 UTC
To: Thiago Jung Bauermann
Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
linux-arm-kernel, catalin.marinas, serge, devicetree,
pasha.tatashin, will, prsriva, hsinyi, allison, christophe.leroy,
mbrugger, balajib, dmitry.kasatkin, linux-kernel, james.morse,
gregkh, joe, linux-integrity, linuxppc-dev
In-Reply-To: <87k0reozwh.fsf@manicouagan.localdomain>
On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
>
> There's actually a complication that I just noticed and that needs to be
> addressed. More below.
>
<...>
>> +
>> +/*
>> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
>> + *
>> + * @image: kexec image being loaded.
>> + * @initrd_load_addr: Address where the next initrd will be loaded.
>> + * @initrd_len: Size of the next initrd, or 0 if there will be none.
>> + * @cmdline: Command line for the next kernel, or NULL if there will
>> + * be none.
>> + *
>> + * Return: fdt on success, or NULL on error.
>> + */
>> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
>> + unsigned long initrd_load_addr,
>> + unsigned long initrd_len,
>> + const char *cmdline)
>> +{
>> + void *fdt;
>> + int ret, chosen_node;
>> + const void *prop;
>> + unsigned long fdt_size;
>> +
>> + fdt_size = fdt_totalsize(initial_boot_params) +
>> + (cmdline ? strlen(cmdline) : 0) +
>> + FDT_EXTRA_SPACE;
>
> Just adding 4 KB to initial_boot_params won't be enough for crash
> kernels on ppc64. The current powerpc code doubles the size of
> initial_boot_params (which is normally larger than 4 KB) and even that
> isn't enough. A patch was added to powerpc/next today which uses a more
> precise (but arch-specific) formula:
>
> https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
>
> So I believe we need a hook here where architectures can provide their
> own specific calculation for the size of the fdt. Perhaps a weakly
> defined function providing a default implementation which an
> arch-specific file can override (a la arch_kexec_kernel_image_load())?
>
> Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
> function from the patch I linked above.
>
Do you think it'd be better to add "fdt_size" parameter to
of_kexec_alloc_and_setup_fdt() so that the caller can provide the
desired FDT buffer size?
thanks,
-lakshmi
* Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Thiago Jung Bauermann @ 2021-02-12 1:09 UTC
To: Lakshmi Ramasubramanian
Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
linux-arm-kernel, catalin.marinas, serge, devicetree,
pasha.tatashin, will, prsriva, hsinyi, allison, christophe.leroy,
mbrugger, balajib, dmitry.kasatkin, linux-kernel, james.morse,
gregkh, joe, linux-integrity, linuxppc-dev
In-Reply-To: <20210209182200.30606-3-nramas@linux.microsoft.com>
There's actually a complication that I just noticed and that needs to be
addressed. More below.
Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
> From: Rob Herring <robh@kernel.org>
>
> Both arm64 and powerpc do essentially the same FDT /chosen setup for
> kexec. The differences are either omissions that arm64 should have
> or additional properties that will be ignored. The setup code can be
> combined and shared by both powerpc and arm64.
>
> The differences relative to the arm64 version:
> - If /chosen doesn't exist, it will be created (should never happen).
> - Any old dtb and initrd reserved memory will be released.
> - The new initrd and elfcorehdr are marked reserved.
> - "linux,booted-from-kexec" is set.
>
> The differences relative to the powerpc version:
> - "kaslr-seed" and "rng-seed" may be set.
> - "linux,elfcorehdr" is set.
> - Any existing "linux,usable-memory-range" is removed.
>
> Combine the code for setting up the /chosen node in the FDT and updating
> the memory reservation for kexec, for powerpc and arm64, in
> of_kexec_alloc_and_setup_fdt() and move it to "drivers/of/kexec.c".
>
> Signed-off-by: Rob Herring <robh@kernel.org>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> ---
> drivers/of/Makefile | 6 ++
> drivers/of/kexec.c | 258 ++++++++++++++++++++++++++++++++++++++++++++
> include/linux/of.h | 13 +++
> 3 files changed, 277 insertions(+)
> create mode 100644 drivers/of/kexec.c
>
> diff --git a/drivers/of/Makefile b/drivers/of/Makefile
> index 6e1e5212f058..c13b982084a3 100644
> --- a/drivers/of/Makefile
> +++ b/drivers/of/Makefile
> @@ -14,4 +14,10 @@ obj-$(CONFIG_OF_RESOLVE) += resolver.o
> obj-$(CONFIG_OF_OVERLAY) += overlay.o
> obj-$(CONFIG_OF_NUMA) += of_numa.o
>
> +ifdef CONFIG_KEXEC_FILE
> +ifdef CONFIG_OF_FLATTREE
> +obj-y += kexec.o
> +endif
> +endif
> +
> obj-$(CONFIG_OF_UNITTEST) += unittest-data/
> diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
> new file mode 100644
> index 000000000000..469e09613cdd
> --- /dev/null
> +++ b/drivers/of/kexec.c
> @@ -0,0 +1,258 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2020 Arm Limited
> + *
> + * Based on arch/arm64/kernel/machine_kexec_file.c:
> + * Copyright (C) 2018 Linaro Limited
> + *
> + * And arch/powerpc/kexec/file_load.c:
> + * Copyright (C) 2016 IBM Corporation
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/kexec.h>
> +#include <linux/libfdt.h>
> +#include <linux/of.h>
> +#include <linux/of_fdt.h>
> +#include <linux/random.h>
> +#include <linux/types.h>
> +
> +/* relevant device tree properties */
> +#define FDT_PROP_KEXEC_ELFHDR "linux,elfcorehdr"
> +#define FDT_PROP_MEM_RANGE "linux,usable-memory-range"
> +#define FDT_PROP_INITRD_START "linux,initrd-start"
> +#define FDT_PROP_INITRD_END "linux,initrd-end"
> +#define FDT_PROP_BOOTARGS "bootargs"
> +#define FDT_PROP_KASLR_SEED "kaslr-seed"
> +#define FDT_PROP_RNG_SEED "rng-seed"
> +#define RNG_SEED_SIZE 128
> +
> +/**
> + * fdt_find_and_del_mem_rsv - delete memory reservation with given address and size
> + *
> + * @fdt: Flattened device tree for the current kernel.
> + * @start: Starting address of the reserved memory.
> + * @size: Size of the reserved memory.
> + *
> + * Return: 0 on success, or negative errno on error.
> + */
> +static int fdt_find_and_del_mem_rsv(void *fdt, unsigned long start, unsigned long size)
> +{
> + int i, ret, num_rsvs = fdt_num_mem_rsv(fdt);
> +
> + for (i = 0; i < num_rsvs; i++) {
> + u64 rsv_start, rsv_size;
> +
> + ret = fdt_get_mem_rsv(fdt, i, &rsv_start, &rsv_size);
> + if (ret) {
> + pr_err("Malformed device tree.\n");
> + return -EINVAL;
> + }
> +
> + if (rsv_start == start && rsv_size == size) {
> + ret = fdt_del_mem_rsv(fdt, i);
> + if (ret) {
> + pr_err("Error deleting device tree reservation.\n");
> + return -EINVAL;
> + }
> +
> + return 0;
> + }
> + }
> +
> + return -ENOENT;
> +}
> +
> +/*
> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
> + *
> + * @image: kexec image being loaded.
> + * @initrd_load_addr: Address where the next initrd will be loaded.
> + * @initrd_len: Size of the next initrd, or 0 if there will be none.
> + * @cmdline: Command line for the next kernel, or NULL if there will
> + * be none.
> + *
> + * Return: fdt on success, or NULL on error.
> + */
> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
> + unsigned long initrd_load_addr,
> + unsigned long initrd_len,
> + const char *cmdline)
> +{
> + void *fdt;
> + int ret, chosen_node;
> + const void *prop;
> + unsigned long fdt_size;
> +
> + fdt_size = fdt_totalsize(initial_boot_params) +
> + (cmdline ? strlen(cmdline) : 0) +
> + FDT_EXTRA_SPACE;
Just adding 4 KB to initial_boot_params won't be enough for crash
kernels on ppc64. The current powerpc code doubles the size of
initial_boot_params (which is normally larger than 4 KB) and even that
isn't enough. A patch was added to powerpc/next today which uses a more
precise (but arch-specific) formula:
https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
So I believe we need a hook here where architectures can provide their
own specific calculation for the size of the fdt. Perhaps a weakly
defined function providing a default implementation which an
arch-specific file can override (a la arch_kexec_kernel_image_load())?
Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
function from the patch I linked above.
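
As a rough illustration of the weak-hook idea, here is a minimal sketch. It assumes the GCC/Clang weak attribute (the kernel spells it __weak) and uses hypothetical sketch_* names and a stand-in SKETCH_FDT_EXTRA_SPACE constant, none of which appear in the patch:

```c
#include <stddef.h>

/* Hypothetical stand-in for FDT_EXTRA_SPACE (4 KB). */
#define SKETCH_FDT_EXTRA_SPACE 4096UL

/*
 * Weak default: used when no other object file supplies a strong
 * definition of the same symbol. An arch file could override it simply
 * by defining a non-weak function with this name and signature.
 */
__attribute__((weak)) size_t sketch_arch_kexec_fdt_size(size_t cur_fdt_size,
							size_t cmdline_len)
{
	return cur_fdt_size + cmdline_len + SKETCH_FDT_EXTRA_SPACE;
}

/* Common code calls the hook and never needs to know who implements it. */
size_t sketch_fdt_buf_size(size_t cur_fdt_size, size_t cmdline_len)
{
	return sketch_arch_kexec_fdt_size(cur_fdt_size, cmdline_len);
}
```

On powerpc, the strong override would carry the arch-specific formula from the linked patch; every other architecture would silently get the default.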
--
Thiago Jung Bauermann
IBM Linux Technology Center
* Re: Fwd: Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Lakshmi Ramasubramanian @ 2021-02-12 1:09 UTC
To: Thiago Jung Bauermann
Cc: Rob Herring, linux-integrity, linuxppc-dev, linux-arm-kernel,
Mimi Zohar
In-Reply-To: <87mtwap35f.fsf@manicouagan.localdomain>
On 2/11/21 3:59 PM, Thiago Jung Bauermann wrote:
>
> Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
>
>> On 2/11/21 9:42 AM, Lakshmi Ramasubramanian wrote:
>>> Hi Rob,
>>> [PATCH] powerpc: Rename kexec elfcorehdr_addr to elf_headers_mem
>>> This change causes a build problem for the x86_64 architecture (please see the
>>> mail from kernel test bot below) since arch/x86/include/asm/kexec.h uses
>>> "elf_load_addr" for the ELF header buffer address and not
>>> "elf_headers_mem".
>>> struct kimage_arch {
>>> ...
>>> /* Core ELF header buffer */
>>> void *elf_headers;
>>> unsigned long elf_headers_sz;
>>> unsigned long elf_load_addr;
>>> };
>>> I am thinking of limiting of_kexec_alloc_and_setup_fdt() to ARM64 and
>>> PPC64 since they are the only ones using this function now.
>>> #if defined(CONFIG_ARM64) && defined(CONFIG_PPC64)
>> Sorry - I meant to say
>> #if defined(CONFIG_ARM64) || defined(CONFIG_PPC64)
>>
>
> Does it build correctly if you rename elf_headers_mem to elf_load_addr?
> Or the other way around, renaming x86's elf_load_addr to
> elf_headers_mem. I don't really have a preference.
Yes - changing arm64 and ppc from "elf_headers_mem" to "elf_load_addr"
builds fine.
But I am concerned about a few other architectures that also define
"struct kimage_arch" such as "parisc", "arm" which do not have any ELF
related fields. They would not build if the config defines
CONFIG_KEXEC_FILE and CONFIG_OF_FLATTREE.
Do you think that could be an issue?
thanks,
-lakshmi
>
> That would be better than adding an #if condition.
>
>>> void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
>>> unsigned long initrd_load_addr,
>>> unsigned long initrd_len,
>>> const char *cmdline)
>>> {
>>> ...
>>> }
>>> #endif /* defined(CONFIG_ARM64) && defined(CONFIG_PPC64) */
>>> Please let me know if you have any concerns.
>>> thanks,
>>> -lakshmi
>
* Re: [PATCH 5/6] powerpc/mm/64s/hash: Add real-mode change_memory_range() for hash LPAR
From: Nicholas Piggin @ 2021-02-12 0:36 UTC
To: linuxppc-dev, Michael Ellerman; +Cc: aneesh.kumar
In-Reply-To: <20210211135130.3474832-5-mpe@ellerman.id.au>
Excerpts from Michael Ellerman's message of February 11, 2021 11:51 pm:
> When we enabled STRICT_KERNEL_RWX we received some reports of boot
> failures when using the Hash MMU and running under phyp. The crashes
> are intermittent, and often exhibit as a completely unresponsive
> system, or possibly an oops.
>
> One example, which was caught in xmon:
>
> [ 14.068327][ T1] devtmpfs: mounted
> [ 14.069302][ T1] Freeing unused kernel memory: 5568K
> [ 14.142060][ T347] BUG: Unable to handle kernel instruction fetch
> [ 14.142063][ T1] Run /sbin/init as init process
> [ 14.142074][ T347] Faulting instruction address: 0xc000000000004400
> cpu 0x2: Vector: 400 (Instruction Access) at [c00000000c7475e0]
> pc: c000000000004400: exc_virt_0x4400_instruction_access+0x0/0x80
> lr: c0000000001862d4: update_rq_clock+0x44/0x110
> sp: c00000000c747880
> msr: 8000000040001031
> current = 0xc00000000c60d380
> paca = 0xc00000001ec9de80 irqmask: 0x03 irq_happened: 0x01
> pid = 347, comm = kworker/2:1
> ...
> enter ? for help
> [c00000000c747880] c0000000001862d4 update_rq_clock+0x44/0x110 (unreliable)
> [c00000000c7478f0] c000000000198794 update_blocked_averages+0xb4/0x6d0
> [c00000000c7479f0] c000000000198e40 update_nohz_stats+0x90/0xd0
> [c00000000c747a20] c0000000001a13b4 _nohz_idle_balance+0x164/0x390
> [c00000000c747b10] c0000000001a1af8 newidle_balance+0x478/0x610
> [c00000000c747be0] c0000000001a1d48 pick_next_task_fair+0x58/0x480
> [c00000000c747c40] c000000000eaab5c __schedule+0x12c/0x950
> [c00000000c747cd0] c000000000eab3e8 schedule+0x68/0x120
> [c00000000c747d00] c00000000016b730 worker_thread+0x130/0x640
> [c00000000c747da0] c000000000174d50 kthread+0x1a0/0x1b0
> [c00000000c747e10] c00000000000e0f0 ret_from_kernel_thread+0x5c/0x6c
>
> This shows that CPU 2, which was idle, woke up and then appears to
> randomly take an instruction fault on a completely valid area of
> kernel text.
>
> The cause turns out to be the call to hash__mark_rodata_ro(), late in
> boot. Due to the way we layout text and rodata, that function actually
> changes the permissions for all of text and rodata to read-only plus
> execute.
>
> To do the permission change we use a hypervisor call, H_PROTECT. On
> phyp that appears to be implemented by briefly removing the mapping of
> the kernel text, before putting it back with the updated permissions.
> If any other CPU is executing during that window, it will see spurious
> faults on the kernel text and/or data, leading to crashes.
>
> To fix it we use stop machine to collect all other CPUs, and then have
> them drop into real mode (MMU off), while we change the mapping. That
> way they are unaffected by the mapping temporarily disappearing.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
> arch/powerpc/mm/book3s64/hash_pgtable.c | 105 +++++++++++++++++++++++-
> 1 file changed, 104 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
> index 3663d3cdffac..01de985df2c4 100644
> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> @@ -8,6 +8,7 @@
> #include <linux/sched.h>
> #include <linux/mm_types.h>
> #include <linux/mm.h>
> +#include <linux/stop_machine.h>
>
> #include <asm/sections.h>
> #include <asm/mmu.h>
> @@ -400,6 +401,19 @@ EXPORT_SYMBOL_GPL(hash__has_transparent_hugepage);
> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
> #ifdef CONFIG_STRICT_KERNEL_RWX
> +
> +struct change_memory_parms {
> + unsigned long start, end, newpp;
> + unsigned int step, nr_cpus, master_cpu;
> + atomic_t cpu_counter;
> +};
> +
> +// We'd rather this was on the stack but it has to be in the RMO
> +static struct change_memory_parms chmem_parms;
> +
> +// And therefore we need a lock to protect it from concurrent use
> +static DEFINE_MUTEX(chmem_lock);
> +
> static void change_memory_range(unsigned long start, unsigned long end,
> unsigned int step, unsigned long newpp)
> {
> @@ -414,6 +428,73 @@ static void change_memory_range(unsigned long start, unsigned long end,
> mmu_kernel_ssize);
> }
>
> +static int notrace chmem_secondary_loop(struct change_memory_parms *parms)
> +{
> + unsigned long msr, tmp, flags;
> + int *p;
> +
> + p = &parms->cpu_counter.counter;
> +
> + local_irq_save(flags);
> + __hard_EE_RI_disable();
> +
> + asm volatile (
> + // Switch to real mode and leave interrupts off
> + "mfmsr %[msr] ;"
> + "li %[tmp], %[MSR_IR_DR] ;"
> + "andc %[tmp], %[msr], %[tmp] ;"
> + "mtmsrd %[tmp] ;"
> +
> + // Tell the master we are in real mode
> + "1: "
> + "lwarx %[tmp], 0, %[p] ;"
> + "addic %[tmp], %[tmp], -1 ;"
> + "stwcx. %[tmp], 0, %[p] ;"
> + "bne- 1b ;"
> +
> + // Spin until the counter goes to zero
> + "2: ;"
> + "lwz %[tmp], 0(%[p]) ;"
> + "cmpwi %[tmp], 0 ;"
> + "bne- 2b ;"
> +
> + // Switch back to virtual mode
> + "mtmsrd %[msr] ;"
Pity we don't have something that can switch to the emergency stack so
we can write this stuff in C.
How's something like this suit you?
---
arch/powerpc/kernel/misc_64.S | 22 +++++++++++++++++++++
arch/powerpc/kernel/process.c | 37 +++++++++++++++++++++++++++++++++++
2 files changed, 59 insertions(+)
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 070465825c21..5e911d0b0b16 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -27,6 +27,28 @@
.text
+#ifdef CONFIG_PPC_BOOK3S_64
+_GLOBAL(__call_realmode)
+ mflr r0
+ std r0,16(r1)
+ stdu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r5)
+ mr r1,r5
+ mtctr r3
+ mr r3,r4
+ mfmsr r4
+ xori r4,r4,(MSR_IR|MSR_DR)
+ mtmsrd r4
+ bctrl
+ mfmsr r4
+ xori r4,r4,(MSR_IR|MSR_DR)
+ mtmsrd r4
+ ld r1,0(r1)
+ ld r0,16(r1)
+ mtlr r0
+ blr
+
+#endif
+
_GLOBAL(call_do_softirq)
mflr r0
std r0,16(r1)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index a66f435dabbf..260d60f665a3 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2197,6 +2197,43 @@ void show_stack(struct task_struct *tsk, unsigned long *stack,
put_task_stack(tsk);
}
+#ifdef CONFIG_PPC_BOOK3S_64
+int __call_realmode(int (*fn)(void *arg), void *arg, void *sp);
+
+/* XXX: find a better place for this
+ * Executing C code in real-mode in general Book3S-64 code can only be done
+ * via this function that switches the stack to one inside the real-mode-area,
+ * which may cover only a small first part of real memory on hash guest LPARs.
+ * fn must be NOKPROBES, must not access vmalloc or anything outside the RMA,
+ * probably shouldn't enable the MMU or interrupts, etc, and be very careful
+ * about calling other generic kernel or powerpc functions.
+ */
+int call_realmode(int (*fn)(void *arg), void *arg)
+{
+ unsigned long flags;
+ void *cursp, *emsp;
+ int ret;
+
+ /* Stack switch is only really required for HPT LPAR, but do it for all to help test coverage of tricky code */
+ cursp = (void *)(current_stack_pointer & ~(THREAD_SIZE - 1));
+ emsp = (void *)(local_paca->emergency_sp - THREAD_SIZE);
+
+ /* XXX check_stack_overflow(); */
+
+ if (WARN_ON_ONCE(cursp == emsp))
+ return -EBUSY;
+
+ local_irq_save(flags);
+ hard_irq_disable();
+
+ ret = __call_realmode(fn, arg, emsp);
+
+ local_irq_restore(flags);
+
+ return ret;
+}
+#endif
+
#ifdef CONFIG_PPC64
/* Called with hard IRQs off */
void notrace __ppc64_runlatch_on(void)
--
2.23.0
^ permalink raw reply related
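The arithmetic that call_realmode() and __call_realmode() above rely on can be sketched in plain user-space C. This is only an illustration under stated assumptions: SKETCH_THREAD_SIZE, SKETCH_MSR_IR and SKETCH_MSR_DR are stand-in values picked for the example (the kernel takes the real ones from its own headers and configuration), and all helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_THREAD_SIZE 0x4000UL /* must be a power of two */
#define SKETCH_MSR_IR      0x20UL   /* instruction relocation bit */
#define SKETCH_MSR_DR      0x10UL   /* data relocation bit */

/* Round a stack pointer down to the base of its THREAD_SIZE-aligned stack,
 * as the patch does with current_stack_pointer & ~(THREAD_SIZE - 1). */
static uintptr_t stack_base(uintptr_t sp)
{
	return sp & ~(uintptr_t)(SKETCH_THREAD_SIZE - 1);
}

/* emergency_top plays the role of paca->emergency_sp, which points at the
 * top of the emergency stack; its base is therefore top - THREAD_SIZE. */
static bool on_emergency_stack(uintptr_t sp, uintptr_t emergency_top)
{
	return stack_base(sp) == emergency_top - SKETCH_THREAD_SIZE;
}

/* xor flips both translation bits at once, so applying it a second time
 * restores the original MSR value (the two xori/mtmsrd pairs in the asm). */
static uint64_t toggle_translation(uint64_t msr)
{
	return msr ^ (SKETCH_MSR_IR | SKETCH_MSR_DR);
}
```

Note that the xor form only switches translation off if both bits were set on entry, whereas the andc form in the quoted patch clears them unconditionally.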
* Re: [PATCH] powerpc/pci: Remove unimplemented prototypes
From: Michael Ellerman @ 2021-02-12 0:20 UTC (permalink / raw)
To: Oliver O'Halloran, linuxppc-dev
In-Reply-To: <20200902035138.1762531-1-oohall@gmail.com>
On Wed, 2 Sep 2020 13:51:38 +1000, Oliver O'Halloran wrote:
> The corresponding definitions were deleted in commit 3d5134ee8341
> ("[POWERPC] Rewrite IO allocation & mapping on powerpc64") which
> was merged a mere 13 years ago.
Applied to powerpc/next.
[1/1] powerpc/pci: Remove unimplemented prototypes
https://git.kernel.org/powerpc/c/b3abe590c80e0ba55b6fce48762232d90dbc37a5
cheers
^ permalink raw reply
* Re: [PATCH] powerpc: remove interrupt handler functions from the noinstr section
From: Michael Ellerman @ 2021-02-12 0:20 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev; +Cc: Stephen Rothwell
In-Reply-To: <20210211063636.236420-1-npiggin@gmail.com>
On Thu, 11 Feb 2021 16:36:36 +1000, Nicholas Piggin wrote:
> The allyesconfig ppc64 kernel fails to link with relocations unable to
> fit after commit 3a96570ffceb ("powerpc: convert interrupt handlers to
> use wrappers"), which is due to the interrupt handler functions being
> put into the .noinstr.text section, which the linker script places on
> the opposite side of the main .text section from the interrupt entry
> asm code which calls the handlers.
>
> [...]
Applied to powerpc/next.
[1/1] powerpc: remove interrupt handler functions from the noinstr section
https://git.kernel.org/powerpc/c/e4bb64c7a42e61bcb6f8b70279fc1f7805eaad3f
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/64s: syscall real mode entry use mtmsrd rather than rfid
From: Michael Ellerman @ 2021-02-12 0:20 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20210208063326.331502-1-npiggin@gmail.com>
On Mon, 8 Feb 2021 16:33:26 +1000, Nicholas Piggin wrote:
> Have the real mode system call entry handler branch to the kernel
> 0xc000... address and then use mtmsrd to enable the MMU, rather than use
> SRRs and rfid.
>
> Commit 8729c26e675c ("powerpc/64s/exception: Move real to virt switch
> into the common handler") implemented this style of real mode entry for
> other interrupt handlers, so this brings system calls into line with
> them, which is the main motivation for the change.
>
> [...]
Applied to powerpc/next.
[1/1] powerpc/64s: syscall real mode entry use mtmsrd rather than rfid
https://git.kernel.org/powerpc/c/14ad0e7d04f46865775fb010ccd96fb1cc83433a
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/64s: Remove EXSLB interrupt save area
From: Michael Ellerman @ 2021-02-12 0:20 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20210208063406.331655-1-npiggin@gmail.com>
On Mon, 8 Feb 2021 16:34:06 +1000, Nicholas Piggin wrote:
> SLB faults should not be taken while the PACA save areas are live, all
> memory accesses should be fetches from the kernel text, and access to
> PACA and the current stack, before C code is called or any other
> accesses are made.
>
> All of these have pinned SLBs so will not take a SLB fault. Therefore
> EXSLB is not required.
Applied to powerpc/next.
[1/1] powerpc/64s: Remove EXSLB interrupt save area
https://git.kernel.org/powerpc/c/ac7c5e9b08acdb54ef3525abcad24bdb3ed05551
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/64s: interrupt exit improve bounding of interrupt recursion
From: Michael Ellerman @ 2021-02-12 0:20 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev; +Cc: Athira Rajeev
In-Reply-To: <20210123061504.2076317-1-npiggin@gmail.com>
On Sat, 23 Jan 2021 16:15:04 +1000, Nicholas Piggin wrote:
> When replaying pending soft-masked interrupts when an interrupt returns
> to an irqs-enabled context, there is a special case required if this was
> an asynchronous interrupt to avoid unbounded interrupt recursion.
>
> This case was not tested for in the case the asynchronous interrupt hit
> in user context, because a subsequent nested interrupt would by definition
> hit in kernel mode, which then exits via the kernel path which does test
> this case.
>
> [...]
Applied to powerpc/next.
[1/1] powerpc/64s: interrupt exit improve bounding of interrupt recursion
https://git.kernel.org/powerpc/c/c0ef717305f51e29b5ce0c78a6bfe566b3283415
cheers
^ permalink raw reply
* Re: [PATCH 1/3] powerpc/83xx: Fix build error when CONFIG_PCI=n
From: Michael Ellerman @ 2021-02-12 0:20 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev
In-Reply-To: <20210210130804.3190952-1-mpe@ellerman.id.au>
On Thu, 11 Feb 2021 00:08:02 +1100, Michael Ellerman wrote:
> As reported by lkp:
>
> arch/powerpc/platforms/83xx/km83xx.c:183:19: error: 'mpc83xx_setup_pci' undeclared here (not in a function)
> 183 | .discover_phbs = mpc83xx_setup_pci,
> | ^~~~~~~~~~~~~~~~~
> | mpc83xx_setup_arch
>
> [...]
Applied to powerpc/next.
[1/3] powerpc/83xx: Fix build error when CONFIG_PCI=n
https://git.kernel.org/powerpc/c/5c47c44f157f408c862b144bbd1d1e161a521aa2
[2/3] powerpc/mm/64s: Fix no previous prototype warning
https://git.kernel.org/powerpc/c/2bb421a3d93601aa81bc39af7aac7280303e0761
[3/3] powerpc/amigaone: Make amigaone_discover_phbs() static
https://git.kernel.org/powerpc/c/f30520c64f290589e91461d7326b497c23e7f5fd
cheers
^ permalink raw reply
* Re: [PATCH V3] powerpc/perf: Adds support for programming of Thresholding in P10
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: Kajol Jain, mpe; +Cc: atrajeev, maddy, linuxppc-dev
In-Reply-To: <20210209095234.837356-1-kjain@linux.ibm.com>
On Tue, 9 Feb 2021 15:22:34 +0530, Kajol Jain wrote:
> Thresholding, a performance monitoring unit feature, can be
> used to identify marked instructions which take more than
> expected cycles between start event and end event.
> Threshold compare (thresh_cmp) bits are programmed in MMCRA
> register. In Power9, thresh_cmp bits were part of the
> event code. But in case of P10, thresh_cmp are not part of
> event code due to inclusion of MMCR3 bits.
>
> [...]
Applied to powerpc/next.
[1/1] powerpc/perf: Adds support for programming of Thresholding in P10
https://git.kernel.org/powerpc/c/82d2c16b350f72aa21ac2a6860c542aa4b43a51e
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/xive: Assign boolean values to a bool variable
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: paulus, Jiapeng Chong; +Cc: linuxppc-dev, linux-kernel, kvm-ppc
In-Reply-To: <1612680192-43116-1-git-send-email-jiapeng.chong@linux.alibaba.com>
On Sun, 7 Feb 2021 14:43:12 +0800, Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
>
> ./arch/powerpc/kvm/book3s_xive.c:1856:2-17: WARNING: Assignment of 0/1
> to bool variable.
>
> ./arch/powerpc/kvm/book3s_xive.c:1854:2-17: WARNING: Assignment of 0/1
> to bool variable.
Applied to powerpc/next.
[1/1] powerpc/xive: Assign boolean values to a bool variable
https://git.kernel.org/powerpc/c/c9df3f809cc98b196548864f52d3c4e280dd1970
cheers
^ permalink raw reply
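For readers unfamiliar with the coccicheck warning quoted above, the pattern it flags looks roughly like the following. This is a generic sketch, not the code from book3s_xive.c; the struct and function names are made up for the example.

```c
#include <stdbool.h>

/* Hypothetical state structure, used only for this illustration. */
struct sketch_state {
	bool enabled;
};

/* The form coccicheck warns about: an integer literal stored in a bool. */
static void set_enabled_int(struct sketch_state *s)
{
	s->enabled = 1;
}

/* The preferred form: same behaviour, but the intent is explicit. */
static void set_enabled_bool(struct sketch_state *s)
{
	s->enabled = true;
}
```

Both functions leave the field in the same state, so a change of this kind is purely a readability cleanup.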
* Re: [PATCH] powerpc/kexec_file: fix FDT size estimation for kdump kernel
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: Michael Ellerman, Hari Bathini
Cc: Pingfan Liu, Petr Tesarik, Mahesh J Salgaonkar, stable,
linuxppc-dev, Sourabh Jain, Dave Young, Thiago Jung Bauermann
In-Reply-To: <161243826811.119001.14083048209224609814.stgit@hbathini>
On Thu, 04 Feb 2021 17:01:10 +0530, Hari Bathini wrote:
> On systems with a large amount of memory, loading the kdump kernel through
> kexec_file_load syscall may fail with the below error:
>
> "Failed to update fdt with linux,drconf-usable-memory property"
>
> This happens because the size estimation for kdump kernel's FDT does
> not account for the additional space needed to setup usable memory
> properties. Fix it by accounting for the space needed to include
> linux,usable-memory & linux,drconf-usable-memory properties while
> estimating kdump kernel's FDT size.
Applied to powerpc/next.
[1/1] powerpc/kexec_file: fix FDT size estimation for kdump kernel
https://git.kernel.org/powerpc/c/2377c92e37fe97bc5b365f55cf60f56dfc4849f5
cheers
^ permalink raw reply
* Re: [PATCH v6 0/2] powerpc/32: Implement C syscall entry/exit (complement)
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: Paul Mackerras, msuchanek, Michael Ellerman, npiggin,
Benjamin Herrenschmidt, Christophe Leroy
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1612898425.git.christophe.leroy@csgroup.eu>
On Tue, 9 Feb 2021 19:29:26 +0000 (UTC), Christophe Leroy wrote:
> This series implements C syscall entry/exit for PPC32. It reuses
> the work already done for PPC64.
>
> This series is based on today's next-test (f538b53fd47a) where main patches from v5 are merged in.
>
> The first patch is important for performance.
>
> [...]
Applied to powerpc/next.
[1/3] powerpc/syscall: Do not check unsupported scv vector on PPC32
https://git.kernel.org/powerpc/c/b966f2279048ee9f30d83ef8568b99fa40917c54
[2/3] powerpc/32: Handle bookE debugging in C in syscall entry/exit
https://git.kernel.org/powerpc/c/d524dda719f06967db4d3ba519edf9267f84c155
[3/3] powerpc/syscall: Avoid storing 'current' in another pointer
https://git.kernel.org/powerpc/c/5b90b9661a3396e00f6e8bcbb617a0787fb683d0
cheers
^ permalink raw reply
* Re: [PATCH v2 1/3] powerpc/uaccess: get rid of small constant size cases in raw_copy_{to, from}_user()
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: Paul Mackerras, Michael Ellerman, Benjamin Herrenschmidt,
Christophe Leroy
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <99d4ccb58a20d8408d0e19874393655ad5b40822.1612879284.git.christophe.leroy@csgroup.eu>
On Tue, 9 Feb 2021 14:02:12 +0000 (UTC), Christophe Leroy wrote:
> Copied from commit 4b842e4e25b1 ("x86: get rid of small
> constant size cases in raw_copy_{to,from}_user()")
>
> Very few call sites where that would be triggered remain, and none
> of those is anywhere near hot enough to bother.
Applied to powerpc/next.
[1/3] powerpc/uaccess: get rid of small constant size cases in raw_copy_{to,from}_user()
https://git.kernel.org/powerpc/c/6b385d1d7c0a346758e35b128815afa25d4709ee
[2/3] powerpc/uaccess: Merge __put_user_size_allowed() into __put_user_size()
https://git.kernel.org/powerpc/c/95d019e0f9225954e33b6efcad315be9d548a4d7
[3/3] powerpc/uaccess: Merge raw_copy_to_user_allowed() into raw_copy_to_user()
https://git.kernel.org/powerpc/c/052f9d206f6c4b5b512b8c201d375f2dd194be35
cheers
^ permalink raw reply
* Re: [PATCH v5 00/22] powerpc/32: Implement C syscall entry/exit
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: Paul Mackerras, msuchanek, Michael Ellerman, npiggin,
Benjamin Herrenschmidt, Christophe Leroy
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1612796617.git.christophe.leroy@csgroup.eu>
On Mon, 8 Feb 2021 15:10:19 +0000 (UTC), Christophe Leroy wrote:
> This series implements C syscall entry/exit for PPC32. It reuses
> the work already done for PPC64.
>
> This series is based on today's merge-test (b6f72fc05389e3fc694bf5a5fa1bbd33f61879e0)
>
> In terms of performance we have the following number of cycles on an
> 8xx running null_syscall benchmark:
> - mainline: 296 cycles
> - after patch 4: 283 cycles
> - after patch 16: 304 cycles
> - after patch 17: 348 cycles
> - at the end of the series: 320 cycles
>
> [...]
Patches 1-15 and 21 applied to powerpc/next.
[01/22] powerpc/32s: Add missing call to kuep_lock on syscall entry
https://git.kernel.org/powerpc/c/57fdfbce89137ae85cd5cef48be168040a47dd13
[02/22] powerpc/32: Always enable data translation on syscall entry
https://git.kernel.org/powerpc/c/eca2411040c1ee15b8882c6427fb4eb5a48ada69
[03/22] powerpc/32: On syscall entry, enable instruction translation at the same time as data
https://git.kernel.org/powerpc/c/76249ddc27080b6b835a89cedcc4185b3b5a6b23
[04/22] powerpc/32: Reorder instructions to avoid using CTR in syscall entry
https://git.kernel.org/powerpc/c/2c59e5104821c5720e88bafa9e522f8bea9ce8fa
[05/22] powerpc/irq: Add helper to set regs->softe
https://git.kernel.org/powerpc/c/fb5608fd117a8b48752d2b5a7e70847c1ed33d33
[06/22] powerpc/irq: Rework helpers that manipulate MSR[EE/RI]
https://git.kernel.org/powerpc/c/08353779f2889305f64e04de3e46ed59ed60f859
[07/22] powerpc/irq: Add stub irq_soft_mask_return() for PPC32
https://git.kernel.org/powerpc/c/6650c4782d5788346a25a4f698880d124f2699a0
[08/22] powerpc/syscall: Rename syscall_64.c into interrupt.c
https://git.kernel.org/powerpc/c/ab1a517d55b01b54ba70f5d54f926f5ab4b18339
[09/22] powerpc/syscall: Make interrupt.c buildable on PPC32
https://git.kernel.org/powerpc/c/344bb20b159dd0996e521c0d4c131a6ae10c322a
[10/22] powerpc/syscall: Use is_compat_task()
https://git.kernel.org/powerpc/c/72b7a9e56b25babfe4c90bf3ce88285c7fb62ab9
[11/22] powerpc/syscall: Save r3 in regs->orig_r3
https://git.kernel.org/powerpc/c/8875f47b7681aa4e4484a9b612577b044725f839
[12/22] powerpc/syscall: Change condition to check MSR_RI
https://git.kernel.org/powerpc/c/c01b916658150e98f00a4981750c37a3224c8735
[13/22] powerpc/32: Always save non volatile GPRs at syscall entry
https://git.kernel.org/powerpc/c/fbcee2ebe8edbb6a93316f0a189ae7fcfaa7094f
[14/22] powerpc/syscall: implement system call entry/exit logic in C for PPC32
https://git.kernel.org/powerpc/c/6f76a01173ccaa363739f913394d4e138d92d718
[15/22] powerpc/32: Remove verification of MSR_PR on syscall in the ASM entry
https://git.kernel.org/powerpc/c/4d67facbcbdb3d9e3c9cb82e4ec47fc63d298dd8
[21/22] powerpc/32: Remove the counter in global_dbcr0
https://git.kernel.org/powerpc/c/eb595eca74067b78d36fb188b555e30f28686fc7
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: Paul Mackerras, Michael Ellerman, Benjamin Herrenschmidt,
Christophe Leroy
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <5ae4d545e3ac58e133d2599e0deb88843cb494fc.1612768623.git.christophe.leroy@csgroup.eu>
On Mon, 8 Feb 2021 07:17:40 +0000 (UTC), Christophe Leroy wrote:
> THREAD_ALIGN_SHIFT = THREAD_SHIFT + 1 = PAGE_SHIFT + 1
> Maximum PAGE_SHIFT is 18 for 256k pages so
> THREAD_ALIGN_SHIFT is 19 at the maximum.
>
> No need to clobber cr1, it can be preserved when moving r1
> into CR when we check stack overflow.
>
> [...]
Applied to powerpc/next.
[1/1] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
https://git.kernel.org/powerpc/c/3642eb21256a317ac14e9ed560242c6d20cf06d9
cheers
^ permalink raw reply
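The arithmetic in the quoted commit message can be checked with a small sketch. The first helper restates the formula from the message. The second is an assumption about why cr1 can be left alone: when the low 32 bits of a register are copied into CR with mtcrf, CR field 0 receives the four most significant bits and field 7 the four least significant, so a bit at position 19 or below never lands in field 1. Both helper names are hypothetical.

```c
/* THREAD_ALIGN_SHIFT as given in the quoted commit message. */
static unsigned int thread_align_shift(unsigned int page_shift)
{
	return page_shift + 1;
}

/* CR field (0..7) that holds bit `pos` of a 32-bit value, where pos 0 is
 * the least significant bit and field 0 holds the four most significant. */
static unsigned int cr_field_for_bit(unsigned int pos)
{
	return (31 - pos) / 4;
}
```

With the maximum PAGE_SHIFT of 18, thread_align_shift() gives 19 and cr_field_for_bit(19) gives 3, while field 1 only covers bit positions 24 to 27.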
* Re: [PATCH 1/3] spi: mpc52xx: Avoid using get_tbl()
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: Paul Mackerras, Michael Ellerman, broonie, Christophe Leroy,
Benjamin Herrenschmidt
Cc: linuxppc-dev, linux-kernel, linux-spi
In-Reply-To: <99bf008e2970de7f8ed3225cda69a6d06ae1a644.1612866360.git.christophe.leroy@csgroup.eu>
On Tue, 9 Feb 2021 10:26:21 +0000 (UTC), Christophe Leroy wrote:
> get_tbl() is confusing as it returns the content of the TBL register
> on PPC32 but the concatenation of TBL and TBU on PPC64.
>
> Use mftb() instead.
>
> This will allow the removal of get_tbl() in a following patch.
Applied to powerpc/next.
[1/3] spi: mpc52xx: Avoid using get_tbl()
https://git.kernel.org/powerpc/c/e10656114d32c659768e7ca8aebaaa6ac6e959ab
[2/3] powerpc/time: Avoid using get_tbl()
https://git.kernel.org/powerpc/c/55d68df623eb679cc91f61137f14751e7f369662
[3/3] powerpc/time: Remove get_tbl()
https://git.kernel.org/powerpc/c/132f94f133961d18af615cb3503368e59529e9a8
cheers
^ permalink raw reply
* Re: [PATCH 1/3] powerpc/mm: Enable compound page check for both THP and HugeTLB
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: Aneesh Kumar K.V, mpe, linuxppc-dev
In-Reply-To: <20210203045812.234439-1-aneesh.kumar@linux.ibm.com>
On Wed, 3 Feb 2021 10:28:10 +0530, Aneesh Kumar K.V wrote:
> THP config results in compound pages. Make sure the kernel enables
> the PageCompound() check with CONFIG_HUGETLB_PAGE disabled and
> CONFIG_TRANSPARENT_HUGEPAGE enabled.
>
> This makes sure we correctly flush the icache with THP pages.
> flush_dcache_icache_page() only matters for platforms that don't support
> COHERENT_ICACHE.
Applied to powerpc/next.
[1/3] powerpc/mm: Enable compound page check for both THP and HugeTLB
https://git.kernel.org/powerpc/c/c7ba2d636342093cfb842f47640e5b62192adfed
[2/3] powerpc/mm: Add PG_dcache_clean to indicate dcache clean state
https://git.kernel.org/powerpc/c/ec94b9b23d620d40ab2ced094a30c22bb8d69b9f
[3/3] powerpc/mm: Remove dcache flush from memory remove.
https://git.kernel.org/powerpc/c/2ac02e5ecec0cc2484d60a73b1bc6394aa2fad28
cheers
^ permalink raw reply
* Re: [PATCH kernel v3] powerpc/uaccess: Skip might_fault() when user access is enabled
From: Michael Ellerman @ 2021-02-12 0:19 UTC (permalink / raw)
To: Alexey Kardashevskiy, linuxppc-dev; +Cc: Jordan Niethe, Nicholas Piggin
In-Reply-To: <20210204121612.32721-1-aik@ozlabs.ru>
On Thu, 4 Feb 2021 23:16:12 +1100, Alexey Kardashevskiy wrote:
> The amount of code executed with enabled user space access (unlocked KUAP)
> should be minimal. However with CONFIG_PROVE_LOCKING or
> CONFIG_DEBUG_ATOMIC_SLEEP enabled, might_fault() may end up replaying
> interrupts which in turn may access the user space and forget to restore
> the KUAP state.
>
> The problem places are:
> 1. strncpy_from_user (and similar) which unlock KUAP and call
> unsafe_get_user -> __get_user_allowed -> __get_user_nocheck()
> with do_allow=false to skip KUAP as the caller took care of it.
> 2. __put_user_nocheck_goto() which is called with unlocked KUAP.
>
> [...]
Applied to powerpc/next.
[1/1] powerpc/uaccess: Avoid might_fault() when user access is enabled
https://git.kernel.org/powerpc/c/7d506ca97b665b95e698a53697dad99fae813c1a
cheers
^ permalink raw reply