LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Kirill A. Shutemov @ 2021-09-24  9:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tom Lendacky, Tianyu Lan, Andy Lutomirski, Thomas Gleixner, kexec,
	linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <YUzFj+yH79XRc3F3@zn.tnic>

On Thu, Sep 23, 2021 at 08:21:03PM +0200, Borislav Petkov wrote:
> On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:
> > Unless we find other way to guarantee RIP-relative access, we must use
> > fixup_pointer() to access any global variables.
> 
> Yah, I've asked compiler folks about any guarantees we have wrt
> rip-relative addresses but it doesn't look good. Worst case, we'd have
> to do the fixup_pointer() thing.
> 
> In the meantime, Tom and I did some more poking at this and here's a
> diff ontop.
> 
> The direction being that we'll stick both the AMD and Intel
> *cc_platform_has() call into cc_platform.c for which instrumentation
> will be disabled so no issues with that.
> 
> And that will keep all that querying all together in a single file.

And still do cc_platform_has() calls in __startup_64() codepath?

It's broken.

Intel detection in cc_platform_has() relies on boot_cpu_data.x86_vendor
which is not initialized until early_cpu_init() in setup_arch(). Given
that X86_VENDOR_INTEL is 0 it leads to false-positive.

I think opencode these two calls is the way forward. Maybe also move the
check from sme_encrypt_kernel() to __startup_64().

-- 
 Kirill A. Shutemov

^ permalink raw reply

* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Borislav Petkov @ 2021-09-24  9:51 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tianyu Lan, Andy Lutomirski, Kirill A. Shutemov, Thomas Gleixner,
	kexec, linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <20210924094132.gxyqp4z3qdk5w4j6@box.shutemov.name>

On Fri, Sep 24, 2021 at 12:41:32PM +0300, Kirill A. Shutemov wrote:
> On Thu, Sep 23, 2021 at 08:21:03PM +0200, Borislav Petkov wrote:
> > On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:
> > > Unless we find other way to guarantee RIP-relative access, we must use
> > > fixup_pointer() to access any global variables.
> > 
> > Yah, I've asked compiler folks about any guarantees we have wrt
> > rip-relative addresses but it doesn't look good. Worst case, we'd have
> > to do the fixup_pointer() thing.
> > 
> > In the meantime, Tom and I did some more poking at this and here's a
> > diff ontop.
> > 
> > The direction being that we'll stick both the AMD and Intel
> > *cc_platform_has() call into cc_platform.c for which instrumentation
> > will be disabled so no issues with that.
> > 
> > And that will keep all that querying all together in a single file.
> 
> And still do cc_platform_has() calls in __startup_64() codepath?
> 
> It's broken.
> 
> Intel detection in cc_platform_has() relies on boot_cpu_data.x86_vendor
> which is not initialized until early_cpu_init() in setup_arch(). Given
> that X86_VENDOR_INTEL is 0 it leads to false-positive.

Yeah, Tom, I had the same question yesterday.

/me looks in his direction.

:-)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply

* [PATCH v3 1/2] powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
From: Krzysztof Kozlowski @ 2021-09-24 10:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Michael Ellerman, Paul Mackerras,
	Rob Herring, Frank Rowand, Krzysztof Kozlowski, Christophe Leroy,
	linuxppc-dev, linux-kernel, devicetree

g5_phy_disable_cpu1() is used outside of platforms/powermac/feature.c,
so it should have a declaration to fix W=1 warning:

  arch/powerpc/platforms/powermac/feature.c:1533:6:
    error: no previous prototype for ‘g5_phy_disable_cpu1’ [-Werror=missing-prototypes]

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>

---

Changes since v2:
1. Put declaration in powermac/pmac.h

Changes since v1:
1. Drop declaration in powermac/smp.c
---
 arch/powerpc/platforms/powermac/pmac.h | 2 ++
 arch/powerpc/platforms/powermac/smp.c  | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/pmac.h b/arch/powerpc/platforms/powermac/pmac.h
index 0d715db434dc..be528e0e507f 100644
--- a/arch/powerpc/platforms/powermac/pmac.h
+++ b/arch/powerpc/platforms/powermac/pmac.h
@@ -14,6 +14,8 @@ struct rtc_time;
 
 extern int pmac_newworld;
 
+void g5_phy_disable_cpu1(void);
+
 extern long pmac_time_init(void);
 extern time64_t pmac_get_boot_time(void);
 extern void pmac_get_rtc_time(struct rtc_time *);
diff --git a/arch/powerpc/platforms/powermac/smp.c b/arch/powerpc/platforms/powermac/smp.c
index 3256a316e884..5d0626f432d5 100644
--- a/arch/powerpc/platforms/powermac/smp.c
+++ b/arch/powerpc/platforms/powermac/smp.c
@@ -875,8 +875,6 @@ static int smp_core99_cpu_online(unsigned int cpu)
 
 static void __init smp_core99_bringup_done(void)
 {
-	extern void g5_phy_disable_cpu1(void);
-
 	/* Close i2c bus if it was used for tb sync */
 	if (pmac_tb_clock_chip_host)
 		pmac_i2c_close(pmac_tb_clock_chip_host);
-- 
2.30.2


^ permalink raw reply related

* [PATCH v3 2/2] powerpc/powermac: constify device_node in of_irq_parse_oldworld()
From: Krzysztof Kozlowski @ 2021-09-24 10:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Michael Ellerman, Paul Mackerras,
	Rob Herring, Frank Rowand, Krzysztof Kozlowski, Christophe Leroy,
	linuxppc-dev, linux-kernel, devicetree
In-Reply-To: <20210924105653.46963-1-krzysztof.kozlowski@canonical.com>

The of_irq_parse_oldworld() does not modify passed device_node so make
it a pointer to const for safety.  Drop the extern while modifying the
line.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>

---

Changes since v1:
1. Drop extern.
---
 arch/powerpc/platforms/powermac/pic.c | 2 +-
 include/linux/of_irq.h                | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/pic.c b/arch/powerpc/platforms/powermac/pic.c
index 4921bccf0376..af5ca1f41bb1 100644
--- a/arch/powerpc/platforms/powermac/pic.c
+++ b/arch/powerpc/platforms/powermac/pic.c
@@ -384,7 +384,7 @@ static void __init pmac_pic_probe_oldstyle(void)
 #endif
 }
 
-int of_irq_parse_oldworld(struct device_node *device, int index,
+int of_irq_parse_oldworld(const struct device_node *device, int index,
 			struct of_phandle_args *out_irq)
 {
 	const u32 *ints = NULL;
diff --git a/include/linux/of_irq.h b/include/linux/of_irq.h
index aaf219bd0354..83fccd0c9bba 100644
--- a/include/linux/of_irq.h
+++ b/include/linux/of_irq.h
@@ -20,12 +20,12 @@ typedef int (*of_irq_init_cb_t)(struct device_node *, struct device_node *);
 #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
 extern unsigned int of_irq_workarounds;
 extern struct device_node *of_irq_dflt_pic;
-extern int of_irq_parse_oldworld(struct device_node *device, int index,
-			       struct of_phandle_args *out_irq);
+int of_irq_parse_oldworld(const struct device_node *device, int index,
+			  struct of_phandle_args *out_irq);
 #else /* CONFIG_PPC32 && CONFIG_PPC_PMAC */
 #define of_irq_workarounds (0)
 #define of_irq_dflt_pic (NULL)
-static inline int of_irq_parse_oldworld(struct device_node *device, int index,
+static inline int of_irq_parse_oldworld(const struct device_node *device, int index,
 				      struct of_phandle_args *out_irq)
 {
 	return -EINVAL;
-- 
2.30.2


^ permalink raw reply related

* Re: [PATCH AUTOSEL 5.10 01/26] ibmvnic: check failover_pending in login response
From: Sasha Levin @ 2021-09-24 11:47 UTC (permalink / raw)
  To: Pavel Machek
  Cc: linux-kernel, stable, netdev, drt, kuba, Sukadev Bhattiprolu,
	linuxppc-dev, David S . Miller
In-Reply-To: <20210923073347.GA2056@amd>

On Thu, Sep 23, 2021 at 09:33:47AM +0200, Pavel Machek wrote:
>Hi!
>
>Something went wrong with this series. I only see first 7 patches. I
>thought it might be local problem, but I only see 7 patches on lore...

Huh, yes, apparently git-send-email timed out. I'll resend. Thanks!


-- 
Thanks,
Sasha

^ permalink raw reply

* [PATCH v1 1/3] powerpc/code-patching: Improve verification of patchability
From: Christophe Leroy @ 2021-09-24 12:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel

Today, patch_instruction() assumes that is called exclusively on valid
addresses, and only checks that it is not called on an init address
after init section has been freed.

Improve verification by calling kernel_text_address() instead.

kernel_text_address() already includes a verification of
initmem release.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index f9a3019e37b4..20ec6648aacc 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -190,10 +190,9 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 int patch_instruction(u32 *addr, struct ppc_inst instr)
 {
 	/* Make sure we aren't patching a freed init section */
-	if (init_mem_is_free && init_section_contains(addr, 4)) {
-		pr_debug("Skipping init section patching addr: 0x%px\n", addr);
+	if (!kernel_text_address((unsigned long)addr))
 		return 0;
-	}
+
 	return do_patch_instruction(addr, instr);
 }
 NOKPROBE_SYMBOL(patch_instruction);
-- 
2.31.1


^ permalink raw reply related

* [PATCH v1 3/3] powerpc: Remove init_mem_is_free
From: Christophe Leroy @ 2021-09-24 12:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel
In-Reply-To: <a7a35416ddab4b42617e2c81fe5816c7f8d0c63f.1632487071.git.christophe.leroy@csgroup.eu>

init_mem_is_free is not used anymore. Remove it.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/sections.h | 2 --
 arch/powerpc/include/asm/setup.h    | 1 -
 arch/powerpc/mm/mem.c               | 2 --
 3 files changed, 5 deletions(-)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index dff55421f76b..454b8434bfc4 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -10,8 +10,6 @@
 
 #include <asm-generic/sections.h>
 
-extern bool init_mem_is_free;
-
 static inline int arch_is_kernel_initmem_freed(unsigned long addr)
 {
 	if (system_state < SYSTEM_RUNNING)
diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 6c1a7d217d1a..426a2d8d028f 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -9,7 +9,6 @@ extern void ppc_printk_progress(char *s, unsigned short hex);
 
 extern unsigned int rtas_data;
 extern unsigned long long memory_limit;
-extern bool init_mem_is_free;
 extern void *zalloc_maybe_bootmem(size_t size, gfp_t mask);
 
 struct device_node;
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index c3c4e31462ec..5b1eae6c0356 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -26,7 +26,6 @@
 #include <mm/mmu_decl.h>
 
 unsigned long long memory_limit;
-bool init_mem_is_free;
 
 unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
 EXPORT_SYMBOL(empty_zero_page);
@@ -312,7 +311,6 @@ void free_initmem(void)
 {
 	ppc_md.progress = ppc_printk_progress;
 	mark_initmem_nx();
-	init_mem_is_free = true;
 	free_initmem_default(POISON_FREE_INITMEM);
 }
 
-- 
2.31.1


^ permalink raw reply related

* [PATCH v1 2/3] powerpc: Stop using init_mem_is_free
From: Christophe Leroy @ 2021-09-24 12:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel
In-Reply-To: <a7a35416ddab4b42617e2c81fe5816c7f8d0c63f.1632487071.git.christophe.leroy@csgroup.eu>

Generic parts of the kernel for instance core_kernel_text()
use 'system_state' to check whether init memory has been
freed.

Do the same and stop using init_mem_is_free.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/sections.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 6e4af4492a14..dff55421f76b 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -14,7 +14,7 @@ extern bool init_mem_is_free;
 
 static inline int arch_is_kernel_initmem_freed(unsigned long addr)
 {
-	if (!init_mem_is_free)
+	if (system_state < SYSTEM_RUNNING)
 		return 0;
 
 	return addr >= (unsigned long)__init_begin &&
-- 
2.31.1


^ permalink raw reply related

* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Tom Lendacky @ 2021-09-24 13:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tianyu Lan, Andy Lutomirski, Kirill A. Shutemov, Thomas Gleixner,
	kexec, linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <YU2fsCblZVQpgMvC@zn.tnic>

On 9/24/21 4:51 AM, Borislav Petkov wrote:
> On Fri, Sep 24, 2021 at 12:41:32PM +0300, Kirill A. Shutemov wrote:
>> On Thu, Sep 23, 2021 at 08:21:03PM +0200, Borislav Petkov wrote:
>>> On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:
>>>> Unless we find other way to guarantee RIP-relative access, we must use
>>>> fixup_pointer() to access any global variables.
>>>
>>> Yah, I've asked compiler folks about any guarantees we have wrt
>>> rip-relative addresses but it doesn't look good. Worst case, we'd have
>>> to do the fixup_pointer() thing.
>>>
>>> In the meantime, Tom and I did some more poking at this and here's a
>>> diff ontop.
>>>
>>> The direction being that we'll stick both the AMD and Intel
>>> *cc_platform_has() call into cc_platform.c for which instrumentation
>>> will be disabled so no issues with that.
>>>
>>> And that will keep all that querying all together in a single file.
>>
>> And still do cc_platform_has() calls in __startup_64() codepath?
>>
>> It's broken.
>>
>> Intel detection in cc_platform_has() relies on boot_cpu_data.x86_vendor
>> which is not initialized until early_cpu_init() in setup_arch(). Given
>> that X86_VENDOR_INTEL is 0 it leads to false-positive.
> 
> Yeah, Tom, I had the same question yesterday.
> 
> /me looks in his direction.
> 

Yup, all the things we talked about.

But we also know that cc_platform_has() gets called a few other times 
before boot_cpu_data is initialized - so more false-positives. For 
cc_platform_has() to work properly, given the desire to consolidate the 
calls, IMO, Intel will have to come up with some early setting that can be 
enabled and checked in place of boot_cpu_data or else you live with 
false-positives.

Thanks,
Tom

> :-)
> 

^ permalink raw reply

* [PATCH 3/3] s390: Use generic version of arch_is_kernel_initmem_freed()
From: Christophe Leroy @ 2021-09-24 13:48 UTC (permalink / raw)
  To: Andrew Morton, arnd
  Cc: linux-arch, linux-s390, linux-kernel, linux-mm, Gerald Schaefer,
	linuxppc-dev
In-Reply-To: <0b55650058a5bf64f7d74781871a1ada2298c8b4.1632491308.git.christophe.leroy@csgroup.eu>

Generic version of arch_is_kernel_initmem_freed() now does the same
as s390 version.

Remove the s390 version.

Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/s390/include/asm/sections.h | 12 ------------
 arch/s390/mm/init.c              |  3 ---
 2 files changed, 15 deletions(-)

diff --git a/arch/s390/include/asm/sections.h b/arch/s390/include/asm/sections.h
index 85881dd48022..3fecaa4e8b74 100644
--- a/arch/s390/include/asm/sections.h
+++ b/arch/s390/include/asm/sections.h
@@ -2,20 +2,8 @@
 #ifndef _S390_SECTIONS_H
 #define _S390_SECTIONS_H
 
-#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed
-
 #include <asm-generic/sections.h>
 
-extern bool initmem_freed;
-
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	if (!initmem_freed)
-		return 0;
-	return addr >= (unsigned long)__init_begin &&
-	       addr < (unsigned long)__init_end;
-}
-
 /*
  * .boot.data section contains variables "shared" between the decompressor and
  * the decompressed kernel. The decompressor will store values in them, and
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index a04faf49001a..8c6f258a6183 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -58,8 +58,6 @@ unsigned long empty_zero_page, zero_page_mask;
 EXPORT_SYMBOL(empty_zero_page);
 EXPORT_SYMBOL(zero_page_mask);
 
-bool initmem_freed;
-
 static void __init setup_zero_pages(void)
 {
 	unsigned int order;
@@ -214,7 +212,6 @@ void __init mem_init(void)
 
 void free_initmem(void)
 {
-	initmem_freed = true;
 	__set_memory((unsigned long)_sinittext,
 		     (unsigned long)(_einittext - _sinittext) >> PAGE_SHIFT,
 		     SET_MEMORY_RW | SET_MEMORY_NX);
-- 
2.31.1


^ permalink raw reply related

* [PATCH 2/3] powerpc: Use generic version of arch_is_kernel_initmem_freed()
From: Christophe Leroy @ 2021-09-24 13:48 UTC (permalink / raw)
  To: Andrew Morton, arnd
  Cc: linux-arch, linux-s390, linux-kernel, linux-mm, linuxppc-dev
In-Reply-To: <0b55650058a5bf64f7d74781871a1ada2298c8b4.1632491308.git.christophe.leroy@csgroup.eu>

Generic version of arch_is_kernel_initmem_freed() now does the same
as powerpc version.

Remove the powerpc version.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/sections.h | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 6e4af4492a14..79cb7a25a5fb 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -6,21 +6,8 @@
 #include <linux/elf.h>
 #include <linux/uaccess.h>
 
-#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed
-
 #include <asm-generic/sections.h>
 
-extern bool init_mem_is_free;
-
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	if (!init_mem_is_free)
-		return 0;
-
-	return addr >= (unsigned long)__init_begin &&
-		addr < (unsigned long)__init_end;
-}
-
 extern char __head_end[];
 
 #ifdef __powerpc64__
-- 
2.31.1


^ permalink raw reply related

* [PATCH 1/3] mm: Make generic arch_is_kernel_initmem_freed() do what it says
From: Christophe Leroy @ 2021-09-24 13:48 UTC (permalink / raw)
  To: Andrew Morton, arnd
  Cc: linux-arch, linux-s390, linux-kernel, linux-mm, Gerald Schaefer,
	linuxppc-dev

Commit 7a5da02de8d6 ("locking/lockdep: check for freed initmem in
static_obj()") added arch_is_kernel_initmem_freed() which is supposed
to report whether an object is part of already freed init memory.

For the time being, the generic version of arch_is_kernel_initmem_freed()
always reports 'false', allthough free_initmem() is generically called
on all architectures.

Therefore, change the generic version of arch_is_kernel_initmem_freed()
to check whether free_initmem() has been called. If so, then check
if a given address falls into init memory.

In order to use function init_section_contains(), the fonction is
moved at the end of asm-generic/section.h

Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 include/asm-generic/sections.h | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
index d16302d3eb59..d1e5bb2c6b72 100644
--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -80,20 +80,6 @@ static inline int arch_is_kernel_data(unsigned long addr)
 }
 #endif
 
-/*
- * Check if an address is part of freed initmem. This is needed on architectures
- * with virt == phys kernel mapping, for code that wants to check if an address
- * is part of a static object within [_stext, _end]. After initmem is freed,
- * memory can be allocated from it, and such allocations would then have
- * addresses within the range [_stext, _end].
- */
-#ifndef arch_is_kernel_initmem_freed
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	return 0;
-}
-#endif
-
 /**
  * memory_contains - checks if an object is contained within a memory region
  * @begin: virtual address of the beginning of the memory region
@@ -172,4 +158,21 @@ static inline bool is_kernel_rodata(unsigned long addr)
 	       addr < (unsigned long)__end_rodata;
 }
 
+/*
+ * Check if an address is part of freed initmem. This is needed on architectures
+ * with virt == phys kernel mapping, for code that wants to check if an address
+ * is part of a static object within [_stext, _end]. After initmem is freed,
+ * memory can be allocated from it, and such allocations would then have
+ * addresses within the range [_stext, _end].
+ */
+#ifndef arch_is_kernel_initmem_freed
+static inline int arch_is_kernel_initmem_freed(unsigned long addr)
+{
+	if (system_state < SYSTEM_RUNNING)
+		return 0;
+
+	return init_section_contains((void *)addr, 1);
+}
+#endif
+
 #endif /* _ASM_GENERIC_SECTIONS_H_ */
-- 
2.31.1


^ permalink raw reply related

* Re: [PATCH v2 1/2] serial: 8250: SERIAL_8250_FSL should not default to y when compile-testing
From: Johan Hovold @ 2021-09-24 14:10 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: linux-kernel, Greg Kroah-Hartman, linuxppc-dev, Li Yang,
	Scott Wood, linux-serial, Shawn Guo, Jiri Slaby, linux-arm-kernel
In-Reply-To: <6421f256407262afd658ffa74ec9430581528a7d.1632467477.git.geert+renesas@glider.be>

On Fri, Sep 24, 2021 at 09:12:30AM +0200, Geert Uytterhoeven wrote:
> Commit b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
> added compile-test support to the Freescale 16550 driver.  However, as
> SERIAL_8250_FSL is an invisible symbol, merely enabling compile-testing
> now enables this driver.
> 
> Fix this by dropping the COMPILE_TEST default again, but making the
> SERIAL_8250_FSL symbol visible instead.
> 
> Fixes: b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> v2:
>   - Split in two parts.

I realised that we can do better than this. I've prepared a patch that
preserves the old behaviour of always enabling the option on platforms
that may need it while also not enabling it by default when compile
testing.

Note that SERIAL_8250_FSL only enables a workaround for an erratum in
the Freescale UARTs in the 8250 driver (leaving the later added ACPI
support aside) and we shouldn't make it easier to disable it by mistake.

> ---
>  drivers/tty/serial/8250/Kconfig | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
> index 808268edd2e82a45..0af96f3adab517f6 100644
> --- a/drivers/tty/serial/8250/Kconfig
> +++ b/drivers/tty/serial/8250/Kconfig
> @@ -361,9 +361,12 @@ config SERIAL_8250_BCM2835AUX
>  	  If unsure, say N.
>  
>  config SERIAL_8250_FSL
> -	bool
> +	bool "Freescale 16550-style UART support (8250 based driver)"
>  	depends on SERIAL_8250_CONSOLE
> -	default PPC || ARM || ARM64 || COMPILE_TEST
> +	default PPC || ARM || ARM64
> +	help
> +	  Selecting this option will add support for the 16550-style serial
> +	  port hardware found on Freescale SoCs.
>  
>  config SERIAL_8250_DW
>  	tristate "Support for Synopsys DesignWare 8250 quirks"

Johan

^ permalink raw reply

* [PATCH] powerpc: Activate CONFIG_STRICT_KERNEL_RWX by default
From: Christophe Leroy @ 2021-09-24 17:13 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel, Kees Cook

CONFIG_STRICT_KERNEL_RWX should be set by default on every
architectures (See https://github.com/KSPP/linux/issues/4)

On PPC32 we have to find a compromise between performance and/or
memory wasting and selection of strict_kernel_rwx, because it implies
either smaller memory chunks or larger alignment between RO memory
and RW memory.

For instance the 8xx maps memory with 8M pages. So either the limit
between RO and RW must be 8M aligned or it falls back or 512k pages
which implies more pressure on the TLB.

book3s/32 maps memory with BATs as much as possible. BATS can have
any power-of-two size between 128k and 256M but we have only 4 to 8
BATs so the alignment must be good enough to allow efficient use of
the BATs and avoid falling back on standard page mapping which would
kill performance.

So let's go one step forward and make it the default but still allow
users to unset it when wanted.

Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index ba5b66189358..79332f51185d 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -148,6 +148,7 @@ config PPC
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select ARCH_OPTIONAL_KERNEL_RWX		if ARCH_HAS_STRICT_KERNEL_RWX
+	select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_DEBUG_PAGEALLOC	if PPC32 || PPC_BOOK3S_64
-- 
2.31.1


^ permalink raw reply related

* Re: coherency issue observed after hotplug on POWER8
From: Naveen N. Rao @ 2021-09-24 17:17 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo, linuxppc-dev
In-Reply-To: <YUpIqytZqpohq4EM@mussarela>

Hi Cascardo,
Thanks for reporting this.


Thadeu Lima de Souza Cascardo wrote:
> Hi, there.
> 
> We have been investigating an issue we have observed on POWER8 POWERNV systems.
> When running the kernel selftests reuseport_bpf_cpu after a CPU hotplug, we see
> crashes, in different forms. [1]

Just to re-confirm: you are only seeing this on P8 powernv, and not in a 
P8 guest/LPAR? I haven't been able to reproduce this on a firestone -- 
can you share more details about your power8 machine?

Also, do you only see this with ubuntu kernels, or are you also able to 
reproduce this with the upstream tree?

> 
> I managed to get xmon on that trap, and did some debugging. [2] I tried to dump
> the BPF JIT code, and it looks different when dumped from CPU#0 and CPU#0x9f
> (the one that was hotplugged, offlined, then onlined).

Next time you reproduce this, can you try dumping the SLBs for the cpus 
(command 'u' in xmon)?

> 
> Here is my partial analysis [3]. Basically, the BPF JIT fills a page with
> invalid instructions (traps, in ppc64 case), and puts the BPF program in a
> random offset of the page. In the case of the hotplugged CPU, which was the one
> that compiled the program, the page had the expected contents (BPF program
> started at the offset used to run the program). On the other CPU (in many
> cases, CPU #0), the same memory address/page had different contents, with the
> program starting at a different offset.

From [3], I think fp->aux->jit_data can be NULL if there are subprogs.  
But, I find it interesting that you don't always see the correct 
bpf_func, as reported in comment #25. Can you also try dumping the full 
bpf_prog structure (prog/fp) from xmon?

> 
> Is this a case of a bug in the micro-architecture or the firmware when 
> doing the hotplug? Can someone chime in?

It's possible that something is going wrong when offlining the cpu. Can 
you try booting the kernel with 'powersave=off' and see if the problem 
goes away?

> 
> Notice that we can't reproduce the same issue on a POWER9 system.
> 
> Thanks.
> Cascardo.
> 
> [1] https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1927076
> [2] https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1927076/comments/29
> [3] https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1927076/comments/30
> 

- Naveen


^ permalink raw reply

* [Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5
From: bugzilla-daemon @ 2021-09-24 22:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <bug-213837-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=213837

--- Comment #10 from Erhard F. (erhard_f@mailbox.org) ---
Created attachment 298959
  --> https://bugzilla.kernel.org/attachment.cgi?id=298959&action=edit
kernel .config (5.15-rc2 + CONFIG_THREAD_SHIFT=15, PowerMac G5 11,2)

(In reply to mpe from comment #8)
> You could also try changing CONFIG_THREAD_SHIFT to 15, that might keep
> the system running a bit longer and give us some other clues.
The stack seems just large enough with CONFIG_THREAD_SHIFT=15 to not run into
this bug. I let the G5 build stuff via distcc in zram disk for a day without an
issue. With  CONFIG_THREAD_SHIFT=14 I hit the bug within minutes.

Just for completeness I'll upload the System.map and kernel .config with
CONFIG_THREAD_SHIFT=15.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

^ permalink raw reply

* [Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5
From: bugzilla-daemon @ 2021-09-24 22:17 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <bug-213837-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=213837

--- Comment #11 from Erhard F. (erhard_f@mailbox.org) ---
Created attachment 298961
  --> https://bugzilla.kernel.org/attachment.cgi?id=298961&action=edit
System.map (5.15-rc2 + patch + CONFIG_THREAD_SHIFT=15, PowerMac G5 11,2)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

^ permalink raw reply

* [Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5
From: bugzilla-daemon @ 2021-09-24 22:25 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <bug-213837-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=213837

--- Comment #12 from Erhard F. (erhard_f@mailbox.org) ---
Created attachment 298963
  --> https://bugzilla.kernel.org/attachment.cgi?id=298963&action=edit
dmesg (5.15-rc2 + patch, PowerMac G5 11,2) #1

Last stack trace with CONFIG_THREAD_SHIFT=14 however did reveal a bit more
data:

[...]
stack: c0000000023effb0: 00000000 28022284 00000000 00000000  ....(.".........
stack: c0000000023effc0: 00000000 00000c00 00003fff 94c95000  ..........?...P.
stack: c0000000023effd0: 00000000 42000000 ffffffff ffffffea  ....B...........
stack: c0000000023effe0: 00000000 00000000 00000000 00000000  ................
stack: c0000000023efff0: 00000000 00000000 00000000 00000000  ................
Kernel panic - not syncing: corrupted stack end detected inside scheduler
CPU: 1 PID: 2652 Comm: cc1plus Tainted: G        W        
5.15.0-rc2-PowerMacG5+ #4
Call Trace:
[c0000000023ef7f0] [c0000000005532d8] .dump_stack_lvl+0x98/0xe0 (unreliable)
[c0000000023ef880] [c000000000069538] .panic+0x14c/0x3e8
[c0000000023ef930] [c00000000081d5a0] .__schedule+0xc0/0x874
[c0000000023efa00] [c00000000081dea0] .preempt_schedule_common+0x28/0x48
[c0000000023efa80] [c00000000081deec] .__cond_resched+0x2c/0x50
[c0000000023efaf0] [c00000000029579c] .dput+0x40/0x218
[c0000000023efba0] [c000000000285204] .path_put+0x1c/0x34
[c0000000023efc20] [c00000000027eab8] .do_readlinkat+0xdc/0x124
[c0000000023efcf0] [c00000000027f310] .__se_sys_readlink+0x20/0x30
[c0000000023efd60] [c000000000022850] .system_call_exception+0x1ac/0x1e4
[c0000000023efe10] [c00000000000b4cc] system_call_common+0xec/0x250
--- interrupt: c00 at 0x3fff95335500
NIP:  00003fff95335500 LR: 00003fff95273f6c CTR: 0000000000000000
REGS: c0000000023efe80 TRAP: 0c00   Tainted: G        W         
(5.15.0-rc2-PowerMacG5+)
MSR:  900000000200f032 <SF,HV,VEC,EE,PR,FP,ME,IR,DR,RI>  CR: 28022284  XER:
00000000
IRQMASK: 0 
GPR00: 0000000000000055 00003fffd5fb3600 00003fff95424300 00003fffd5fb4040 
GPR04: 00003fffd5fb3b20 00000000000003ff 0000000062697473 000000002bcf8581 
GPR08: ffffffffd4307a80 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000000000000 00003fff95946430 00000000000003ff 00003fffd5fb4030 
GPR16: 00003fffd5fb3b20 0000000000000000 00003fffd5fb3700 0000000000000000 
GPR20: 000000000000002f 00003fffd5fb3b10 00003fff9593f7b8 00003fffd5fb4080 
GPR24: 0000000000000004 00003fffd5fb3710 00003fffd5fb3b20 00003fffd5fb4040 
GPR28: 00003fffd5fb4538 00003fffd5fb4084 00003fffd5fb4040 0000000000000000 
NIP [00003fff95335500] 0x3fff95335500
LR [00003fff95273f6c] 0x3fff95273f6c
--- interrupt: c00
Rebooting in 40 seconds..

And another one:
[...]
stack: c00000002cd77fb0: 00000000 28042822 00000000 00000000  ....(.("........
stack: c00000002cd77fc0: 00000000 00000c00 00003fff 9b2dd000  ..........?..-..
stack: c00000002cd77fd0: 00000000 42000000 00000000 00000000  ....B...........
stack: c00000002cd77fe0: 00000000 00000000 00000000 00000000  ................
stack: c00000002cd77ff0: 00000000 00000000 00000000 00000000  ................
Kernel panic - not syncing: corrupted stack end detected inside scheduler
CPU: 1 PID: 2713 Comm: cc1plus Tainted: G        W        
5.15.0-rc2-PowerMacG5+ #4
Call Trace:
[c00000002cd76dd0] [c0000000005532d8] .dump_stack_lvl+0x98/0xe0 (unreliable)
[c00000002cd76e60] [c000000000069538] .panic+0x14c/0x3e8
[c00000002cd76f10] [c00000000081d5a0] .__schedule+0xc0/0x874
[c00000002cd76fe0] [c00000000081dea0] .preempt_schedule_common+0x28/0x48
[c00000002cd77060] [c00000000081deec] .__cond_resched+0x2c/0x50
[c00000002cd770d0] [c000000000327848] .__ext4_handle_dirty_metadata+0x24/0x214
[c00000002cd771a0] [c000000000352a6c] .ext4_mb_mark_diskspace_used+0x3e0/0x41c
[c00000002cd77290] [c000000000355ae4] .ext4_mb_new_blocks+0x580/0xe10
[c00000002cd773b0] [c00000000033b2f0] .ext4_ind_map_blocks+0x63c/0xb28
[c00000002cd775a0] [c000000000342bf4] .ext4_map_blocks+0x37c/0x588
[c00000002cd77680] [c000000000342e64] ._ext4_get_block+0x64/0xec
[c00000002cd77730] [c0000000002c36a4] .__block_write_begin_int+0x188/0x4a4
[c00000002cd77850] [c0000000003480f8] .ext4_write_begin+0x2a8/0x3d0
[c00000002cd77970] [c0000000001c9170] .generic_perform_write+0xb8/0x1f4
[c00000002cd77a60] [c000000000333d68] .ext4_buffered_write_iter+0xb8/0x154
[c00000002cd77b00] [c000000000277a14] .new_sync_write+0x94/0xe8
[c00000002cd77c00] [c000000000278d6c] .vfs_write+0x13c/0x140
[c00000002cd77ca0] [c000000000278eb4] .ksys_write+0x78/0xc4
[c00000002cd77d60] [c000000000022850] .system_call_exception+0x1ac/0x1e4
[c00000002cd77e10] [c00000000000b4cc] system_call_common+0xec/0x250
--- interrupt: c00 at 0x3fff9b804b00
NIP:  00003fff9b804b00 LR: 00003fff9b780d04 CTR: 0000000000000000
REGS: c00000002cd77e80 TRAP: 0c00   Tainted: G        W         
(5.15.0-rc2-PowerMacG5+)
MSR:  900000000200d032 <SF,HV,VEC,EE,PR,ME,IR,DR,RI>  CR: 28042822  XER:
00000000
IRQMASK: 0 
GPR00: 0000000000000004 00003fffe21d3000 00003fff9b8f6300 0000000000000001 
GPR04: 0000000026979780 0000000000001000 0000000000000001 0000000000000036 
GPR08: 000000002697a780 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000000000000 00003fff9be17430 00000000100f1858 0000000010063a50 
GPR16: 00000000100639f8 0000000010063968 000000012e83eb28 0000000000000000 
GPR20: ffffffffffffffff 0000000011a5b890 0000000000000001 00000000114e5160 
GPR24: 00000000114e5170 0000000000000000 0000000026979780 0000000000001000 
GPR28: 0000000000001000 00003fff9b8f0920 0000000026979780 0000000000001000 
NIP [00003fff9b804b00] 0x3fff9b804b00
LR [00003fff9b780d04] 0x3fff9b780d04
--- interrupt: c00
Rebooting in 40 seconds..

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

^ permalink raw reply

* Re: [PATCH] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
From: Nathan Lynch @ 2021-09-25  0:10 UTC (permalink / raw)
  To: Michael Ellerman, Srikar Dronamraju; +Cc: linuxppc-dev, npiggin
In-Reply-To: <87pmsylli8.fsf@mpe.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> writes:
> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
>> * Michael Ellerman <mpe@ellerman.id.au> [2021-09-23 17:29:32]:
>>
>>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>> > Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
>>> >
>>> >> * Nathan Lynch <nathanl@linux.ibm.com> [2021-09-22 11:01:12]:
>>> >>
>>> >>> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> ...
>>> >> Or can I understand how debug_smp_processor_id() is useful if
>>> >> __smp_processor_id() is defined as raw_smp_processor_id()?
>>> 
>>> debug_smp_processor_id() is useful on powerpc, as well as other arches,
>>> because it checks that we're in a context where the processor id won't
>>> change out from under us.
>>> 
>>> eg. something like this is unsafe:
>>> 
>>>   int counts[NR_CPUS];
>>>   int tmp, cpu;
>>>   
>>>   cpu = smp_processor_id();
>>>   tmp = counts[cpu];
>>>   				<- preempted here and migrated to another CPU
>>>   counts[cpu] = tmp + 1;
>>> 
>>
>> If lets say the above call was replaced by raw_smp_processor_id(), how would
>> it avoid the preemption / migration to another CPU?
>>
>> Replacing it with raw_smp_processor_id() may avoid, the debug splat but the
>> underlying issue would still remain as is. No?
>
> Correct.
>
> Using raw_smp_processor_id() is generally the wrong answer. For this
> example the correct fix is to disable preemption around the code, eg:
>
>    int counts[NR_CPUS];
>    int tmp, cpu;
>    
>    preempt_disable()
>
>    cpu = smp_processor_id();
>    tmp = counts[cpu];
>    counts[cpu] = tmp + 1;
>
>    preempt_enable();
>
>
> For the original issue I think it is OK to use raw_smp_processor_id(),
> because we're already doing a racy check of whether another CPU has been
> preempted by the hypervisor.
>
>         if (!is_kvm_guest()) {
>                 int first_cpu = cpu_first_thread_sibling(smp_processor_id());
>
>                 if (cpu_first_thread_sibling(cpu) == first_cpu)
>                         return false;
>         }
>
> We could disable preemption around that, eg:
>
>         if (!is_kvm_guest()) {
>                 int first_cpu;
>                 bool is_sibling;
>
>                 preempt_disable();
>                 first_cpu = cpu_first_thread_sibling(smp_processor_id());
>                 is_sibling = (cpu_first_thread_sibling(cpu) == first_cpu)
>                 preempt_enable();
>
>                 // Can be preempted here
>
>                 if (is_sibling)
>                     return false;
>         }
>
> But before we return we could be preempted, and then is_sibling is no
> longer necessarily correct. So that doesn't really gain us anything.
>
> The only way to make that result stable is to disable preemption in the
> caller, but most callers don't want to AFAICS, because they know they're
> doing a racy check to begin with.

I'll add that one way I think about this is that when I choose
smp_processor_id(), I am making a claim about the context in which it is
used, and when I use raw_smp_processor_id() I am making a different
claim.

smp_processor_id() => I am sampling the CPU index in a critical section
where the result equals the CPU that executes the entire critical
section, and I am relying on that property for the section's
correctness. This claim is verified by debug_smp_processor_id() when
DEBUG_PREEMPT=y.

raw_smp_processor_id() => I am sampling the CPU index and using the
result in a way that is safe even if it differs from the CPU(s) on which
the surrounding code actually executes.

This framing doesn't cover all situations, but I've found it to be
generally useful.

^ permalink raw reply

* Re: [PATCH] powerpc: don't select KFENCE on platform PPC_FSL_BOOK3E
From: Liu Shixin @ 2021-09-25  3:10 UTC (permalink / raw)
  To: Christophe Leroy, Marco Elver, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <f8d12860-56d7-5697-7cba-3cac95bb0a1c@csgroup.eu>



On 2021/9/24 14:41, Christophe Leroy wrote:
>
>
> Le 24/09/2021 à 08:39, Liu Shixin a écrit :
>> On platform PPC_FSL_BOOK3E, all lowmem is managed by tlbcam. That means
>> we didn't really map the kfence pool with page granularity. Therefore,
>> if KFENCE is enabled, the system will hit the following panic:
>
> Could you please explain a bit more what the problem is ?
>
> KFENCE has been implemented with the same logic as DEBUG_PAGEALLOC.
>
> DEBUG_PAGEALLOC is enabled on FSL_BOOK3E.
>
> In MMU_setup(), __map_without_ltlbs is set to 1 when KFENCE is enabled.
>
> __map_without_ltlbs should disable the use of tlbcam.
>
>
> So what's wrong really ?
>
> Does DEBUG_PAGEALLOC work on FSL_BOOK3E ?
>
> Thanks
> Christophe
>
hi Christophe,

The phenomenon is that kernel panic in the kfence_protect_page function because
__kfence_pool is not mapped with page granularity.

The problem is that in the mapin_ram function, the return value(i.e base) of mmu_mapin_ram
is equal to top. As a result, no level-2 page table is created for [base, top]. It seems that
__map_without_ltlbs didn't diable the use of tlbcam.

I have tried to force page table for all lowmem, then this problem will go away
but the kfence_test failed, which could be explained by the fact that tlbcam is still used.

By the way, DEBUG_PAGEALLOC works well on FSL_BOOK3E without level-2 page table.

Thanks,
>>
>>      BUG: Kernel NULL pointer dereference on read at 0x00000000
>>      Faulting instruction address: 0xc01de598
>>      Oops: Kernel access of bad area, sig: 11 [#1]
>>      BE PAGE_SIZE=4K SMP NR_CPUS=4 MPC8544 DS
>>      Dumping ftrace buffer:
>>         (ftrace buffer empty)
>>      Modules linked in:
>>      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc3+ #298
>>      NIP:  c01de598 LR: c08ae9c4 CTR: 00000000
>>      REGS: c0b4bea0 TRAP: 0300   Not tainted  (5.12.0-rc3+)
>>      MSR:  00021000 <CE,ME>  CR: 24000228  XER: 20000000
>>      DEAR: 00000000 ESR: 00000000
>>      GPR00: c08ae9c4 c0b4bf60 c0ad64e0 ef720000 00021000 00000000 00000000 00000200
>>      GPR08: c0ad5000 00000000 00000000 00000004 00000000 008fbb30 00000000 00000000
>>      GPR16: 00000000 00000000 00000000 00000000 c0000000 00000000 00000000 00000000
>>      GPR24: c08ca004 c08ca004 c0b6a0e0 c0b60000 c0b58f00 c0850000 c08ca000 ef720000
>>      NIP [c01de598] kfence_protect+0x44/0x6c
>>      LR [c08ae9c4] kfence_init+0xfc/0x2a4
>>      Call Trace:
>>      [c0b4bf60] [efffe160] 0xefffe160 (unreliable)
>>      [c0b4bf70] [c08ae9c4] kfence_init+0xfc/0x2a4
>>      [c0b4bfb0] [c0894d3c] start_kernel+0x3bc/0x574
>>      [c0b4bff0] [c0000470] set_ivor+0x14c/0x188
>>      Instruction dump:
>>      7c0802a6 8109d594 546a653a 90010014 54630026 39200000 7d48502e 2c0a0000
>>      41820010 554a0026 5469b53a 7d295214 <81490000> 38831000 554a003c 91490000
>>      random: get_random_bytes called from print_oops_end_marker+0x40/0x78 with crng_init=0
>>      ---[ end trace 0000000000000000 ]---
>>
>> Signed-off-by: Liu Shixin <liushixin2@huawei.com>
>> ---
>>   arch/powerpc/Kconfig | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index d46db0bfb998..cffd57bcb5e4 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -185,7 +185,7 @@ config PPC
>>       select HAVE_ARCH_KASAN            if PPC32 && PPC_PAGE_SHIFT <= 14
>>       select HAVE_ARCH_KASAN_VMALLOC        if PPC32 && PPC_PAGE_SHIFT <= 14
>>       select HAVE_ARCH_KGDB
>> -    select HAVE_ARCH_KFENCE            if PPC32
>> +    select HAVE_ARCH_KFENCE            if PPC32 && !PPC_FSL_BOOK3E
>>       select HAVE_ARCH_MMAP_RND_BITS
>>       select HAVE_ARCH_MMAP_RND_COMPAT_BITS    if COMPAT
>>       select HAVE_ARCH_NVRAM_OPS
>>
> .
>


^ permalink raw reply

* ppc32 doesn't boot when linkaddr=0x00900000
From: cp @ 2021-09-25  9:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: torvalds, paulus

hi
I am new to this list. Hope this is the right place to ask.

I am working with a PPC405GP board, and as far as I understand, the
support for ppc40x platforms like Acadia and Walnut were dropped with
kernel 5.8.0, so this seems like a pretty straightforward question,
but extensive experiments from kernel 4.11 to kernel 5.7.19 haven't
shown a really clear, up-to-date answer.

In k4.11 .. k5.7.19, when the kernel size is bigger than 8 MB, the
final kernel doesn't boot but rather arch/powerpc/boot/main.c dies
before the first message from the kernel shows up.

Why?

Digging deeper I see the relation between the kernel size and link_addr

        # Round the size to next higher MB limit
        round_size=$(((strip_size + 0xfffff) & 0xfff00000))

        round_size=0x$(printf "%x" $round_size)
        link_addr=$(printf "%d" $link_address)

and this is where link_addr is involved

        text_start="-Ttext $link_address"

My kernels are compiled for cuboot, and the code that invokes "kentry"
is entirely located in arch/powerpc/boot/main.c

I instrumned that module, and this is what I see on the condole

The following is the same kernel, compiled with the same .config, but
with two link_addr values

A) with link_addr=0x0080.0000
image loaded from 0x00800000
SP=0x03eb1b80
kernel_size = 7411084 bytes
copying 256 bytes from kernel-image at 0x0080f000 to elfheader
elf_info.loadsize = 0x00700e68
elf_info.memsize  = 0x0074234c
allocating 7611212 bytes for the new kernel
copying ...
from = 0x0081f000
to = 0x00000000
size = 7343720
flush_cache, 32Mbyte flushed
cmdline: uboot bootargs overridden
cmdline=[console=ttyS0,115200 root=/dev/sda2 rootfstype=ext2 rw
init=/sbin/init ]
Finalizing device tree... flat tree at 0xf23b80
ft_addr=0xf23b80
my tp1: success
kernel booting ....
(it boots)

B) with link_addr=0x0080.0000
image loaded from 0x00900000
SP=0x03eb1b80
kernel_size = 7411084
copying 256 bytes from kernel-image at 0x0090f000 to elfheader
elf_info.loadsize = 0x00700e68
elf_info.memsize  = 0x0074234c
allocating 7611212 bytes for the new kernel
copying ...
from = 0x0091f000
to = 0x00000000
size = 7343720
flush_cache, 32Mbyte flushed
cmdline: uboot bootargs overridden
cmdline=[console=ttyS0,115200 root=/dev/sda2 rootfstype=ext2 rw
init=/sbin/init ]
Finalizing device tree... flat tree at 0x1023b80
ft_addr=0x1023b80
my tp2: success
my tp3: success
invalidate_cache 0x00000000+0x02000000
my tp4: (point of no return)
calling kentry()...
kernel booting ....
(it dies at this point, but without a debugger it's like watching
something fall into a black hole)

Any ideas?
I am lost ...

Carlo

^ permalink raw reply

* Re: ppc32 doesn't boot when linkaddr=0x00900000
From: Christophe Leroy @ 2021-09-25 12:24 UTC (permalink / raw)
  To: cp, linuxppc-dev; +Cc: paulus, torvalds
In-Reply-To: <CA+QBN9AFNSf3+U4iMhwZx7c69MLk-BtSbVODBEA97ObYWRczbQ@mail.gmail.com>

Hi,

Le 25/09/2021 à 11:49, cp a écrit :
> hi
> I am new to this list. Hope this is the right place to ask.
> 
> I am working with a PPC405GP board, and as far as I understand, the
> support for ppc40x platforms like Acadia and Walnut were dropped with
> kernel 5.8.0, so this seems like a pretty straightforward question,
> but extensive experiments from kernel 4.11 to kernel 5.7.19 haven't
> shown a really clear, up-to-date answer.
> 
> In k4.11 .. k5.7.19, when the kernel size is bigger than 8 MB, the
> final kernel doesn't boot but rather arch/powerpc/boot/main.c dies
> before the first message from the kernel shows up.
> 
> Why?

That's the fourth time I receive your email. Please give us time to 
analyse it, there is no point in sending it again and again unless you 
have additional information to provide.

Christophe

^ permalink raw reply

* [PATCH v2 1/3] powerpc/code-patching: Improve verification of patchability
From: Christophe Leroy @ 2021-09-25  8:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel

Today, patch_instruction() assumes that is called exclusively on valid
addresses, and only checks that it is not called on an init address
after init section has been freed.

Improve verification by calling kernel_text_address() instead.

kernel_text_address() already includes a verification of
initmem release.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index f9a3019e37b4..20ec6648aacc 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -190,10 +190,9 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 int patch_instruction(u32 *addr, struct ppc_inst instr)
 {
 	/* Make sure we aren't patching a freed init section */
-	if (init_mem_is_free && init_section_contains(addr, 4)) {
-		pr_debug("Skipping init section patching addr: 0x%px\n", addr);
+	if (!kernel_text_address((unsigned long)addr))
 		return 0;
-	}
+
 	return do_patch_instruction(addr, instr);
 }
 NOKPROBE_SYMBOL(patch_instruction);
-- 
2.31.1


^ permalink raw reply related

* [PATCH v2 2/3] powerpc: Stop using init_mem_is_free
From: Christophe Leroy @ 2021-09-25  8:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel
In-Reply-To: <a7a35416ddab4b42617e2c81fe5816c7f8d0c63f.1632558672.git.christophe.leroy@csgroup.eu>

Generic parts of the kernel for instance core_kernel_text()
use 'system_state' to check whether init memory has been
freed.

Do the same and stop using init_mem_is_free.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch_is_kernel_initmem_freed() is likely going to go away via the mm tree, in that case this patch will void.

v2: Also remove declaration of init_mem_is_free.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/sections.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 6e4af4492a14..454b8434bfc4 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -10,11 +10,9 @@
 
 #include <asm-generic/sections.h>
 
-extern bool init_mem_is_free;
-
 static inline int arch_is_kernel_initmem_freed(unsigned long addr)
 {
-	if (!init_mem_is_free)
+	if (system_state < SYSTEM_RUNNING)
 		return 0;
 
 	return addr >= (unsigned long)__init_begin &&
-- 
2.31.1


^ permalink raw reply related

* [PATCH v2 3/3] powerpc: Remove init_mem_is_free
From: Christophe Leroy @ 2021-09-25  8:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel
In-Reply-To: <a7a35416ddab4b42617e2c81fe5816c7f8d0c63f.1632558672.git.christophe.leroy@csgroup.eu>

init_mem_is_free is not used anymore. Remove it.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/setup.h | 1 -
 arch/powerpc/mm/mem.c            | 2 --
 2 files changed, 3 deletions(-)

diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 6c1a7d217d1a..426a2d8d028f 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -9,7 +9,6 @@ extern void ppc_printk_progress(char *s, unsigned short hex);
 
 extern unsigned int rtas_data;
 extern unsigned long long memory_limit;
-extern bool init_mem_is_free;
 extern void *zalloc_maybe_bootmem(size_t size, gfp_t mask);
 
 struct device_node;
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index c3c4e31462ec..5b1eae6c0356 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -26,7 +26,6 @@
 #include <mm/mmu_decl.h>
 
 unsigned long long memory_limit;
-bool init_mem_is_free;
 
 unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
 EXPORT_SYMBOL(empty_zero_page);
@@ -312,7 +311,6 @@ void free_initmem(void)
 {
 	ppc_md.progress = ppc_printk_progress;
 	mark_initmem_nx();
-	init_mem_is_free = true;
 	free_initmem_default(POISON_FREE_INITMEM);
 }
 
-- 
2.31.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox