LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 2/3] powerpc: Use generic version of arch_is_kernel_initmem_freed()
From: Christophe Leroy @ 2021-09-24 13:48 UTC (permalink / raw)
  To: Andrew Morton, arnd
  Cc: linux-arch, linux-s390, linux-kernel, linux-mm, linuxppc-dev
In-Reply-To: <0b55650058a5bf64f7d74781871a1ada2298c8b4.1632491308.git.christophe.leroy@csgroup.eu>

Generic version of arch_is_kernel_initmem_freed() now does the same
as powerpc version.

Remove the powerpc version.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/sections.h | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 6e4af4492a14..79cb7a25a5fb 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -6,21 +6,8 @@
 #include <linux/elf.h>
 #include <linux/uaccess.h>
 
-#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed
-
 #include <asm-generic/sections.h>
 
-extern bool init_mem_is_free;
-
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	if (!init_mem_is_free)
-		return 0;
-
-	return addr >= (unsigned long)__init_begin &&
-		addr < (unsigned long)__init_end;
-}
-
 extern char __head_end[];
 
 #ifdef __powerpc64__
-- 
2.31.1


^ permalink raw reply related

* [PATCH 3/3] s390: Use generic version of arch_is_kernel_initmem_freed()
From: Christophe Leroy @ 2021-09-24 13:48 UTC (permalink / raw)
  To: Andrew Morton, arnd
  Cc: linux-arch, linux-s390, linux-kernel, linux-mm, Gerald Schaefer,
	linuxppc-dev
In-Reply-To: <0b55650058a5bf64f7d74781871a1ada2298c8b4.1632491308.git.christophe.leroy@csgroup.eu>

Generic version of arch_is_kernel_initmem_freed() now does the same
as s390 version.

Remove the s390 version.

Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/s390/include/asm/sections.h | 12 ------------
 arch/s390/mm/init.c              |  3 ---
 2 files changed, 15 deletions(-)

diff --git a/arch/s390/include/asm/sections.h b/arch/s390/include/asm/sections.h
index 85881dd48022..3fecaa4e8b74 100644
--- a/arch/s390/include/asm/sections.h
+++ b/arch/s390/include/asm/sections.h
@@ -2,20 +2,8 @@
 #ifndef _S390_SECTIONS_H
 #define _S390_SECTIONS_H
 
-#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed
-
 #include <asm-generic/sections.h>
 
-extern bool initmem_freed;
-
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	if (!initmem_freed)
-		return 0;
-	return addr >= (unsigned long)__init_begin &&
-	       addr < (unsigned long)__init_end;
-}
-
 /*
  * .boot.data section contains variables "shared" between the decompressor and
  * the decompressed kernel. The decompressor will store values in them, and
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index a04faf49001a..8c6f258a6183 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -58,8 +58,6 @@ unsigned long empty_zero_page, zero_page_mask;
 EXPORT_SYMBOL(empty_zero_page);
 EXPORT_SYMBOL(zero_page_mask);
 
-bool initmem_freed;
-
 static void __init setup_zero_pages(void)
 {
 	unsigned int order;
@@ -214,7 +212,6 @@ void __init mem_init(void)
 
 void free_initmem(void)
 {
-	initmem_freed = true;
 	__set_memory((unsigned long)_sinittext,
 		     (unsigned long)(_einittext - _sinittext) >> PAGE_SHIFT,
 		     SET_MEMORY_RW | SET_MEMORY_NX);
-- 
2.31.1


^ permalink raw reply related

* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Tom Lendacky @ 2021-09-24 13:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tianyu Lan, Andy Lutomirski, Kirill A. Shutemov, Thomas Gleixner,
	kexec, linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <YU2fsCblZVQpgMvC@zn.tnic>

On 9/24/21 4:51 AM, Borislav Petkov wrote:
> On Fri, Sep 24, 2021 at 12:41:32PM +0300, Kirill A. Shutemov wrote:
>> On Thu, Sep 23, 2021 at 08:21:03PM +0200, Borislav Petkov wrote:
>>> On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:
>>>> Unless we find other way to guarantee RIP-relative access, we must use
>>>> fixup_pointer() to access any global variables.
>>>
>>> Yah, I've asked compiler folks about any guarantees we have wrt
>>> rip-relative addresses but it doesn't look good. Worst case, we'd have
>>> to do the fixup_pointer() thing.
>>>
>>> In the meantime, Tom and I did some more poking at this and here's a
>>> diff ontop.
>>>
>>> The direction being that we'll stick both the AMD and Intel
>>> *cc_platform_has() call into cc_platform.c for which instrumentation
>>> will be disabled so no issues with that.
>>>
>>> And that will keep all that querying all together in a single file.
>>
>> And still do cc_platform_has() calls in __startup_64() codepath?
>>
>> It's broken.
>>
>> Intel detection in cc_platform_has() relies on boot_cpu_data.x86_vendor
>> which is not initialized until early_cpu_init() in setup_arch(). Given
>> that X86_VENDOR_INTEL is 0 it leads to false-positive.
> 
> Yeah, Tom, I had the same question yesterday.
> 
> /me looks in his direction.
> 

Yup, all the things we talked about.

But we also know that cc_platform_has() gets called a few other times 
before boot_cpu_data is initialized - so more false-positives. For 
cc_platform_has() to work properly, given the desire to consolidate the 
calls, IMO, Intel will have to come up with some early setting that can be 
enabled and checked in place of boot_cpu_data or else you live with 
false-positives.

Thanks,
Tom

> :-)
> 

^ permalink raw reply

* [PATCH v1 2/3] powerpc: Stop using init_mem_is_free
From: Christophe Leroy @ 2021-09-24 12:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel
In-Reply-To: <a7a35416ddab4b42617e2c81fe5816c7f8d0c63f.1632487071.git.christophe.leroy@csgroup.eu>

Generic parts of the kernel for instance core_kernel_text()
use 'system_state' to check whether init memory has been
freed.

Do the same and stop using init_mem_is_free.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/sections.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 6e4af4492a14..dff55421f76b 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -14,7 +14,7 @@ extern bool init_mem_is_free;
 
 static inline int arch_is_kernel_initmem_freed(unsigned long addr)
 {
-	if (!init_mem_is_free)
+	if (system_state < SYSTEM_RUNNING)
 		return 0;
 
 	return addr >= (unsigned long)__init_begin &&
-- 
2.31.1


^ permalink raw reply related

* [PATCH v1 3/3] powerpc: Remove init_mem_is_free
From: Christophe Leroy @ 2021-09-24 12:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel
In-Reply-To: <a7a35416ddab4b42617e2c81fe5816c7f8d0c63f.1632487071.git.christophe.leroy@csgroup.eu>

init_mem_is_free is not used anymore. Remove it.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/sections.h | 2 --
 arch/powerpc/include/asm/setup.h    | 1 -
 arch/powerpc/mm/mem.c               | 2 --
 3 files changed, 5 deletions(-)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index dff55421f76b..454b8434bfc4 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -10,8 +10,6 @@
 
 #include <asm-generic/sections.h>
 
-extern bool init_mem_is_free;
-
 static inline int arch_is_kernel_initmem_freed(unsigned long addr)
 {
 	if (system_state < SYSTEM_RUNNING)
diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 6c1a7d217d1a..426a2d8d028f 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -9,7 +9,6 @@ extern void ppc_printk_progress(char *s, unsigned short hex);
 
 extern unsigned int rtas_data;
 extern unsigned long long memory_limit;
-extern bool init_mem_is_free;
 extern void *zalloc_maybe_bootmem(size_t size, gfp_t mask);
 
 struct device_node;
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index c3c4e31462ec..5b1eae6c0356 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -26,7 +26,6 @@
 #include <mm/mmu_decl.h>
 
 unsigned long long memory_limit;
-bool init_mem_is_free;
 
 unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
 EXPORT_SYMBOL(empty_zero_page);
@@ -312,7 +311,6 @@ void free_initmem(void)
 {
 	ppc_md.progress = ppc_printk_progress;
 	mark_initmem_nx();
-	init_mem_is_free = true;
 	free_initmem_default(POISON_FREE_INITMEM);
 }
 
-- 
2.31.1


^ permalink raw reply related

* [PATCH v1 1/3] powerpc/code-patching: Improve verification of patchability
From: Christophe Leroy @ 2021-09-24 12:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel

Today, patch_instruction() assumes that is called exclusively on valid
addresses, and only checks that it is not called on an init address
after init section has been freed.

Improve verification by calling kernel_text_address() instead.

kernel_text_address() already includes a verification of
initmem release.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index f9a3019e37b4..20ec6648aacc 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -190,10 +190,9 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 int patch_instruction(u32 *addr, struct ppc_inst instr)
 {
 	/* Make sure we aren't patching a freed init section */
-	if (init_mem_is_free && init_section_contains(addr, 4)) {
-		pr_debug("Skipping init section patching addr: 0x%px\n", addr);
+	if (!kernel_text_address((unsigned long)addr))
 		return 0;
-	}
+
 	return do_patch_instruction(addr, instr);
 }
 NOKPROBE_SYMBOL(patch_instruction);
-- 
2.31.1


^ permalink raw reply related

* Re: [PATCH AUTOSEL 5.10 01/26] ibmvnic: check failover_pending in login response
From: Sasha Levin @ 2021-09-24 11:47 UTC (permalink / raw)
  To: Pavel Machek
  Cc: linux-kernel, stable, netdev, drt, kuba, Sukadev Bhattiprolu,
	linuxppc-dev, David S . Miller
In-Reply-To: <20210923073347.GA2056@amd>

On Thu, Sep 23, 2021 at 09:33:47AM +0200, Pavel Machek wrote:
>Hi!
>
>Something went wrong with this series. I only see first 7 patches. I
>thought it might be local problem, but I only see 7 patches on lore...

Huh, yes, apparently git-send-email timed out. I'll resend. Thanks!


-- 
Thanks,
Sasha

^ permalink raw reply

* [PATCH v3 2/2] powerpc/powermac: constify device_node in of_irq_parse_oldworld()
From: Krzysztof Kozlowski @ 2021-09-24 10:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Michael Ellerman, Paul Mackerras,
	Rob Herring, Frank Rowand, Krzysztof Kozlowski, Christophe Leroy,
	linuxppc-dev, linux-kernel, devicetree
In-Reply-To: <20210924105653.46963-1-krzysztof.kozlowski@canonical.com>

The of_irq_parse_oldworld() does not modify passed device_node so make
it a pointer to const for safety.  Drop the extern while modifying the
line.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>

---

Changes since v1:
1. Drop extern.
---
 arch/powerpc/platforms/powermac/pic.c | 2 +-
 include/linux/of_irq.h                | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/pic.c b/arch/powerpc/platforms/powermac/pic.c
index 4921bccf0376..af5ca1f41bb1 100644
--- a/arch/powerpc/platforms/powermac/pic.c
+++ b/arch/powerpc/platforms/powermac/pic.c
@@ -384,7 +384,7 @@ static void __init pmac_pic_probe_oldstyle(void)
 #endif
 }
 
-int of_irq_parse_oldworld(struct device_node *device, int index,
+int of_irq_parse_oldworld(const struct device_node *device, int index,
 			struct of_phandle_args *out_irq)
 {
 	const u32 *ints = NULL;
diff --git a/include/linux/of_irq.h b/include/linux/of_irq.h
index aaf219bd0354..83fccd0c9bba 100644
--- a/include/linux/of_irq.h
+++ b/include/linux/of_irq.h
@@ -20,12 +20,12 @@ typedef int (*of_irq_init_cb_t)(struct device_node *, struct device_node *);
 #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
 extern unsigned int of_irq_workarounds;
 extern struct device_node *of_irq_dflt_pic;
-extern int of_irq_parse_oldworld(struct device_node *device, int index,
-			       struct of_phandle_args *out_irq);
+int of_irq_parse_oldworld(const struct device_node *device, int index,
+			  struct of_phandle_args *out_irq);
 #else /* CONFIG_PPC32 && CONFIG_PPC_PMAC */
 #define of_irq_workarounds (0)
 #define of_irq_dflt_pic (NULL)
-static inline int of_irq_parse_oldworld(struct device_node *device, int index,
+static inline int of_irq_parse_oldworld(const struct device_node *device, int index,
 				      struct of_phandle_args *out_irq)
 {
 	return -EINVAL;
-- 
2.30.2


^ permalink raw reply related

* [PATCH v3 1/2] powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
From: Krzysztof Kozlowski @ 2021-09-24 10:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Michael Ellerman, Paul Mackerras,
	Rob Herring, Frank Rowand, Krzysztof Kozlowski, Christophe Leroy,
	linuxppc-dev, linux-kernel, devicetree

g5_phy_disable_cpu1() is used outside of platforms/powermac/feature.c,
so it should have a declaration to fix W=1 warning:

  arch/powerpc/platforms/powermac/feature.c:1533:6:
    error: no previous prototype for ‘g5_phy_disable_cpu1’ [-Werror=missing-prototypes]

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>

---

Changes since v2:
1. Put declaration in powermac/pmac.h

Changes since v1:
1. Drop declaration in powermac/smp.c
---
 arch/powerpc/platforms/powermac/pmac.h | 2 ++
 arch/powerpc/platforms/powermac/smp.c  | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/pmac.h b/arch/powerpc/platforms/powermac/pmac.h
index 0d715db434dc..be528e0e507f 100644
--- a/arch/powerpc/platforms/powermac/pmac.h
+++ b/arch/powerpc/platforms/powermac/pmac.h
@@ -14,6 +14,8 @@ struct rtc_time;
 
 extern int pmac_newworld;
 
+void g5_phy_disable_cpu1(void);
+
 extern long pmac_time_init(void);
 extern time64_t pmac_get_boot_time(void);
 extern void pmac_get_rtc_time(struct rtc_time *);
diff --git a/arch/powerpc/platforms/powermac/smp.c b/arch/powerpc/platforms/powermac/smp.c
index 3256a316e884..5d0626f432d5 100644
--- a/arch/powerpc/platforms/powermac/smp.c
+++ b/arch/powerpc/platforms/powermac/smp.c
@@ -875,8 +875,6 @@ static int smp_core99_cpu_online(unsigned int cpu)
 
 static void __init smp_core99_bringup_done(void)
 {
-	extern void g5_phy_disable_cpu1(void);
-
 	/* Close i2c bus if it was used for tb sync */
 	if (pmac_tb_clock_chip_host)
 		pmac_i2c_close(pmac_tb_clock_chip_host);
-- 
2.30.2


^ permalink raw reply related

* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Borislav Petkov @ 2021-09-24  9:51 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tianyu Lan, Andy Lutomirski, Kirill A. Shutemov, Thomas Gleixner,
	kexec, linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <20210924094132.gxyqp4z3qdk5w4j6@box.shutemov.name>

On Fri, Sep 24, 2021 at 12:41:32PM +0300, Kirill A. Shutemov wrote:
> On Thu, Sep 23, 2021 at 08:21:03PM +0200, Borislav Petkov wrote:
> > On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:
> > > Unless we find other way to guarantee RIP-relative access, we must use
> > > fixup_pointer() to access any global variables.
> > 
> > Yah, I've asked compiler folks about any guarantees we have wrt
> > rip-relative addresses but it doesn't look good. Worst case, we'd have
> > to do the fixup_pointer() thing.
> > 
> > In the meantime, Tom and I did some more poking at this and here's a
> > diff ontop.
> > 
> > The direction being that we'll stick both the AMD and Intel
> > *cc_platform_has() call into cc_platform.c for which instrumentation
> > will be disabled so no issues with that.
> > 
> > And that will keep all that querying all together in a single file.
> 
> And still do cc_platform_has() calls in __startup_64() codepath?
> 
> It's broken.
> 
> Intel detection in cc_platform_has() relies on boot_cpu_data.x86_vendor
> which is not initialized until early_cpu_init() in setup_arch(). Given
> that X86_VENDOR_INTEL is 0 it leads to false-positive.

Yeah, Tom, I had the same question yesterday.

/me looks in his direction.

:-)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply

* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Kirill A. Shutemov @ 2021-09-24  9:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tom Lendacky, Tianyu Lan, Andy Lutomirski, Thomas Gleixner, kexec,
	linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <YUzFj+yH79XRc3F3@zn.tnic>

On Thu, Sep 23, 2021 at 08:21:03PM +0200, Borislav Petkov wrote:
> On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:
> > Unless we find other way to guarantee RIP-relative access, we must use
> > fixup_pointer() to access any global variables.
> 
> Yah, I've asked compiler folks about any guarantees we have wrt
> rip-relative addresses but it doesn't look good. Worst case, we'd have
> to do the fixup_pointer() thing.
> 
> In the meantime, Tom and I did some more poking at this and here's a
> diff ontop.
> 
> The direction being that we'll stick both the AMD and Intel
> *cc_platform_has() call into cc_platform.c for which instrumentation
> will be disabled so no issues with that.
> 
> And that will keep all that querying all together in a single file.

And still do cc_platform_has() calls in __startup_64() codepath?

It's broken.

Intel detection in cc_platform_has() relies on boot_cpu_data.x86_vendor
which is not initialized until early_cpu_init() in setup_arch(). Given
that X86_VENDOR_INTEL is 0 it leads to false-positive.

I think opencode these two calls is the way forward. Maybe also move the
check from sme_encrypt_kernel() to __startup_64().

-- 
 Kirill A. Shutemov

^ permalink raw reply

* [PATCH v2 2/2] serial: 8250: SERIAL_8250_FSL should depend on Freescale platforms
From: Geert Uytterhoeven @ 2021-09-24  7:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Johan Hovold, Shawn Guo, Li Yang,
	Scott Wood
  Cc: Geert Uytterhoeven, linuxppc-dev, linux-arm-kernel, linux-serial,
	linux-kernel
In-Reply-To: <6421f256407262afd658ffa74ec9430581528a7d.1632467477.git.geert+renesas@glider.be>

The Freescale 16550-style UART is only present on some Freescale SoCs.
Hence tighten the dependencies to prevent asking the user about this
driver, and possibly defaulting it to be enabled, when configuring a
kernel without appropriate Freescale SoC or ACPI support.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
v2:
  - Split in two parts.
---
 drivers/tty/serial/8250/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
index 0af96f3adab517f6..a2978b31144e94f2 100644
--- a/drivers/tty/serial/8250/Kconfig
+++ b/drivers/tty/serial/8250/Kconfig
@@ -363,7 +363,8 @@ config SERIAL_8250_BCM2835AUX
 config SERIAL_8250_FSL
 	bool "Freescale 16550-style UART support (8250 based driver)"
 	depends on SERIAL_8250_CONSOLE
-	default PPC || ARM || ARM64
+	depends on FSL_SOC || ARCH_LAYERSCAPE || SOC_LS1021A || (ARM64 && ACPI) || COMPILE_TEST
+	default FSL_SOC || ARCH_LAYERSCAPE || SOC_LS1021A || (ARM64 && ACPI)
 	help
 	  Selecting this option will add support for the 16550-style serial
 	  port hardware found on Freescale SoCs.
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 1/2] serial: 8250: SERIAL_8250_FSL should not default to y when compile-testing
From: Geert Uytterhoeven @ 2021-09-24  7:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Johan Hovold, Shawn Guo, Li Yang,
	Scott Wood
  Cc: Geert Uytterhoeven, linuxppc-dev, linux-arm-kernel, linux-serial,
	linux-kernel

Commit b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
added compile-test support to the Freescale 16550 driver.  However, as
SERIAL_8250_FSL is an invisible symbol, merely enabling compile-testing
now enables this driver.

Fix this by dropping the COMPILE_TEST default again, but making the
SERIAL_8250_FSL symbol visible instead.

Fixes: b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
v2:
  - Split in two parts.
---
 drivers/tty/serial/8250/Kconfig | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
index 808268edd2e82a45..0af96f3adab517f6 100644
--- a/drivers/tty/serial/8250/Kconfig
+++ b/drivers/tty/serial/8250/Kconfig
@@ -361,9 +361,12 @@ config SERIAL_8250_BCM2835AUX
 	  If unsure, say N.
 
 config SERIAL_8250_FSL
-	bool
+	bool "Freescale 16550-style UART support (8250 based driver)"
 	depends on SERIAL_8250_CONSOLE
-	default PPC || ARM || ARM64 || COMPILE_TEST
+	default PPC || ARM || ARM64
+	help
+	  Selecting this option will add support for the 16550-style serial
+	  port hardware found on Freescale SoCs.
 
 config SERIAL_8250_DW
 	tristate "Support for Synopsys DesignWare 8250 quirks"
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH v2 9/9] iommu/vt-d: Use pci core's DVSEC functionality
From: Christoph Hellwig @ 2021-09-24  6:42 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Andrew Donnellan, linux-pci, linuxppc-dev, linux-cxl, iommu,
	Bjorn Helgaas, David E. Box, Frederic Barrat, Kan Liang,
	David Woodhouse, Lu Baolu
In-Reply-To: <20210923172647.72738-10-ben.widawsky@intel.com>

On Thu, Sep 23, 2021 at 10:26:47AM -0700, Ben Widawsky wrote:
>   */
>  static int siov_find_pci_dvsec(struct pci_dev *pdev)
>  {
> +	return pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_INTEL, 5);
>  }

I hink the siov_find_pci_dvsec helper is pretty pointless now and can be
folded into its only caller.  And independent of that: this capability
really needs a symbolic name.  Especially for a vendor like Intel that
might have a few there should be a list of them somewhere.


^ permalink raw reply

* Re: [PATCH] powerpc: don't select KFENCE on platform PPC_FSL_BOOK3E
From: Christophe Leroy @ 2021-09-24  6:41 UTC (permalink / raw)
  To: Liu Shixin, Marco Elver, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20210924063927.1341241-1-liushixin2@huawei.com>



Le 24/09/2021 à 08:39, Liu Shixin a écrit :
> On platform PPC_FSL_BOOK3E, all lowmem is managed by tlbcam. That means
> we didn't really map the kfence pool with page granularity. Therefore,
> if KFENCE is enabled, the system will hit the following panic:

Could you please explain a bit more what the problem is ?

KFENCE has been implemented with the same logic as DEBUG_PAGEALLOC.

DEBUG_PAGEALLOC is enabled on FSL_BOOK3E.

In MMU_setup(), __map_without_ltlbs is set to 1 when KFENCE is enabled.

__map_without_ltlbs should disable the use of tlbcam.


So what's wrong really ?

Does DEBUG_PAGEALLOC work on FSL_BOOK3E ?

Thanks
Christophe

> 
>      BUG: Kernel NULL pointer dereference on read at 0x00000000
>      Faulting instruction address: 0xc01de598
>      Oops: Kernel access of bad area, sig: 11 [#1]
>      BE PAGE_SIZE=4K SMP NR_CPUS=4 MPC8544 DS
>      Dumping ftrace buffer:
>         (ftrace buffer empty)
>      Modules linked in:
>      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc3+ #298
>      NIP:  c01de598 LR: c08ae9c4 CTR: 00000000
>      REGS: c0b4bea0 TRAP: 0300   Not tainted  (5.12.0-rc3+)
>      MSR:  00021000 <CE,ME>  CR: 24000228  XER: 20000000
>      DEAR: 00000000 ESR: 00000000
>      GPR00: c08ae9c4 c0b4bf60 c0ad64e0 ef720000 00021000 00000000 00000000 00000200
>      GPR08: c0ad5000 00000000 00000000 00000004 00000000 008fbb30 00000000 00000000
>      GPR16: 00000000 00000000 00000000 00000000 c0000000 00000000 00000000 00000000
>      GPR24: c08ca004 c08ca004 c0b6a0e0 c0b60000 c0b58f00 c0850000 c08ca000 ef720000
>      NIP [c01de598] kfence_protect+0x44/0x6c
>      LR [c08ae9c4] kfence_init+0xfc/0x2a4
>      Call Trace:
>      [c0b4bf60] [efffe160] 0xefffe160 (unreliable)
>      [c0b4bf70] [c08ae9c4] kfence_init+0xfc/0x2a4
>      [c0b4bfb0] [c0894d3c] start_kernel+0x3bc/0x574
>      [c0b4bff0] [c0000470] set_ivor+0x14c/0x188
>      Instruction dump:
>      7c0802a6 8109d594 546a653a 90010014 54630026 39200000 7d48502e 2c0a0000
>      41820010 554a0026 5469b53a 7d295214 <81490000> 38831000 554a003c 91490000
>      random: get_random_bytes called from print_oops_end_marker+0x40/0x78 with crng_init=0
>      ---[ end trace 0000000000000000 ]---
> 
> Signed-off-by: Liu Shixin <liushixin2@huawei.com>
> ---
>   arch/powerpc/Kconfig | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index d46db0bfb998..cffd57bcb5e4 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -185,7 +185,7 @@ config PPC
>   	select HAVE_ARCH_KASAN			if PPC32 && PPC_PAGE_SHIFT <= 14
>   	select HAVE_ARCH_KASAN_VMALLOC		if PPC32 && PPC_PAGE_SHIFT <= 14
>   	select HAVE_ARCH_KGDB
> -	select HAVE_ARCH_KFENCE			if PPC32
> +	select HAVE_ARCH_KFENCE			if PPC32 && !PPC_FSL_BOOK3E
>   	select HAVE_ARCH_MMAP_RND_BITS
>   	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if COMPAT
>   	select HAVE_ARCH_NVRAM_OPS
> 

^ permalink raw reply

* [PATCH] powerpc: don't select KFENCE on platform PPC_FSL_BOOK3E
From: Liu Shixin @ 2021-09-24  6:39 UTC (permalink / raw)
  To: Marco Elver, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras
  Cc: Liu Shixin, linuxppc-dev, linux-kernel

On platform PPC_FSL_BOOK3E, all lowmem is managed by tlbcam. That means
we didn't really map the kfence pool with page granularity. Therefore,
if KFENCE is enabled, the system will hit the following panic:

    BUG: Kernel NULL pointer dereference on read at 0x00000000
    Faulting instruction address: 0xc01de598
    Oops: Kernel access of bad area, sig: 11 [#1]
    BE PAGE_SIZE=4K SMP NR_CPUS=4 MPC8544 DS
    Dumping ftrace buffer:
       (ftrace buffer empty)
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc3+ #298
    NIP:  c01de598 LR: c08ae9c4 CTR: 00000000
    REGS: c0b4bea0 TRAP: 0300   Not tainted  (5.12.0-rc3+)
    MSR:  00021000 <CE,ME>  CR: 24000228  XER: 20000000
    DEAR: 00000000 ESR: 00000000
    GPR00: c08ae9c4 c0b4bf60 c0ad64e0 ef720000 00021000 00000000 00000000 00000200
    GPR08: c0ad5000 00000000 00000000 00000004 00000000 008fbb30 00000000 00000000
    GPR16: 00000000 00000000 00000000 00000000 c0000000 00000000 00000000 00000000
    GPR24: c08ca004 c08ca004 c0b6a0e0 c0b60000 c0b58f00 c0850000 c08ca000 ef720000
    NIP [c01de598] kfence_protect+0x44/0x6c
    LR [c08ae9c4] kfence_init+0xfc/0x2a4
    Call Trace:
    [c0b4bf60] [efffe160] 0xefffe160 (unreliable)
    [c0b4bf70] [c08ae9c4] kfence_init+0xfc/0x2a4
    [c0b4bfb0] [c0894d3c] start_kernel+0x3bc/0x574
    [c0b4bff0] [c0000470] set_ivor+0x14c/0x188
    Instruction dump:
    7c0802a6 8109d594 546a653a 90010014 54630026 39200000 7d48502e 2c0a0000
    41820010 554a0026 5469b53a 7d295214 <81490000> 38831000 554a003c 91490000
    random: get_random_bytes called from print_oops_end_marker+0x40/0x78 with crng_init=0
    ---[ end trace 0000000000000000 ]---

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index d46db0bfb998..cffd57bcb5e4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -185,7 +185,7 @@ config PPC
 	select HAVE_ARCH_KASAN			if PPC32 && PPC_PAGE_SHIFT <= 14
 	select HAVE_ARCH_KASAN_VMALLOC		if PPC32 && PPC_PAGE_SHIFT <= 14
 	select HAVE_ARCH_KGDB
-	select HAVE_ARCH_KFENCE			if PPC32
+	select HAVE_ARCH_KFENCE			if PPC32 && !PPC_FSL_BOOK3E
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if COMPAT
 	select HAVE_ARCH_NVRAM_OPS
-- 
2.18.0.huawei.25


^ permalink raw reply related

* Re: [PATCH 3/3] memblock: cleanup memblock_free interface
From: Mike Rapoport @ 2021-09-24  5:32 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: linux-efi, kvm, linux-sh, linux-mips, linux-mm, sparclinux,
	linux-riscv, linux-s390, kasan-dev, xen-devel, linux-snps-arc,
	devicetree, linux-um, linux-arm-kernel, Linus Torvalds, linux-usb,
	linux-kernel, iommu, linux-alpha, Andrew Morton, linuxppc-dev,
	Mike Rapoport
In-Reply-To: <96e3da9f-70ff-e5c0-ef2e-cf0b636e5695@csgroup.eu>

On Thu, Sep 23, 2021 at 03:54:46PM +0200, Christophe Leroy wrote:
> 
> Le 23/09/2021 à 14:01, Mike Rapoport a écrit :
> > On Thu, Sep 23, 2021 at 11:47:48AM +0200, Christophe Leroy wrote:
> > > 
> > > 
> > > Le 23/09/2021 à 09:43, Mike Rapoport a écrit :
> > > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > > 
> > > > For ages memblock_free() interface dealt with physical addresses even
> > > > despite the existence of memblock_alloc_xx() functions that return a
> > > > virtual pointer.
> > > > 
> > > > Introduce memblock_phys_free() for freeing physical ranges and repurpose
> > > > memblock_free() to free virtual pointers to make the following pairing
> > > > abundantly clear:
> > > > 
> > > > 	int memblock_phys_free(phys_addr_t base, phys_addr_t size);
> > > > 	phys_addr_t memblock_phys_alloc(phys_addr_t base, phys_addr_t size);
> > > > 
> > > > 	void *memblock_alloc(phys_addr_t size, phys_addr_t align);
> > > > 	void memblock_free(void *ptr, size_t size);
> > > > 
> > > > Replace intermediate memblock_free_ptr() with memblock_free() and drop
> > > > unnecessary aliases memblock_free_early() and memblock_free_early_nid().
> > > > 
> > > > Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > > > ---
> > > 
> > > > diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
> > > > index 1a04e5bdf655..37826d8c4f74 100644
> > > > --- a/arch/s390/kernel/smp.c
> > > > +++ b/arch/s390/kernel/smp.c
> > > > @@ -723,7 +723,7 @@ void __init smp_save_dump_cpus(void)
> > > >    			/* Get the CPU registers */
> > > >    			smp_save_cpu_regs(sa, addr, is_boot_cpu, page);
> > > >    	}
> > > > -	memblock_free(page, PAGE_SIZE);
> > > > +	memblock_phys_free(page, PAGE_SIZE);
> > > >    	diag_amode31_ops.diag308_reset();
> > > >    	pcpu_set_smt(0);
> > > >    }
> > > > @@ -880,7 +880,7 @@ void __init smp_detect_cpus(void)
> > > >    	/* Add CPUs present at boot */
> > > >    	__smp_rescan_cpus(info, true);
> > > > -	memblock_free_early((unsigned long)info, sizeof(*info));
> > > > +	memblock_free(info, sizeof(*info));
> > > >    }
> > > >    /*
> > > 
> > > I'm a bit lost. IIUC memblock_free_early() and memblock_free() where
> > > identical.
> > 
> > Yes, they were, but all calls to memblock_free_early() were using
> > __pa(vaddr) because they had a virtual address at hand.
> 
> I'm still not following. In the above memblock_free_early() was taking
> (unsigned long)info . Was it a bug ? 

Not really because s390 has pa == va:

https://elixir.bootlin.com/linux/latest/source/arch/s390/include/asm/page.h#L169


-- 
Sincerely yours,
Mike.

^ permalink raw reply

* Re: [PATCH] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
From: Michael Ellerman @ 2021-09-24  3:07 UTC (permalink / raw)
  To: Srikar Dronamraju; +Cc: Nathan Lynch, linuxppc-dev, npiggin
In-Reply-To: <20210923180224.GD2004@linux.vnet.ibm.com>

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> * Michael Ellerman <mpe@ellerman.id.au> [2021-09-23 17:29:32]:
>
>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> > Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
>> >
>> >> * Nathan Lynch <nathanl@linux.ibm.com> [2021-09-22 11:01:12]:
>> >>
>> >>> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
...
>> >> Or can I understand how debug_smp_processor_id() is useful if
>> >> __smp_processor_id() is defined as raw_smp_processor_id()?
>> 
>> debug_smp_processor_id() is useful on powerpc, as well as other arches,
>> because it checks that we're in a context where the processor id won't
>> change out from under us.
>> 
>> eg. something like this is unsafe:
>> 
>>   int counts[NR_CPUS];
>>   int tmp, cpu;
>>   
>>   cpu = smp_processor_id();
>>   tmp = counts[cpu];
>>   				<- preempted here and migrated to another CPU
>>   counts[cpu] = tmp + 1;
>> 
>
> If lets say the above call was replaced by raw_smp_processor_id(), how would
> it avoid the preemption / migration to another CPU?
>
> Replacing it with raw_smp_processor_id() may avoid, the debug splat but the
> underlying issue would still remain as is. No?

Correct.

Using raw_smp_processor_id() is generally the wrong answer. For this
example the correct fix is to disable preemption around the code, eg:

   int counts[NR_CPUS];
   int tmp, cpu;
   
   preempt_disable()

   cpu = smp_processor_id();
   tmp = counts[cpu];
   counts[cpu] = tmp + 1;

   preempt_enable();


For the original issue I think it is OK to use raw_smp_processor_id(),
because we're already doing a racy check of whether another CPU has been
preempted by the hypervisor.

        if (!is_kvm_guest()) {
                int first_cpu = cpu_first_thread_sibling(smp_processor_id());

                if (cpu_first_thread_sibling(cpu) == first_cpu)
                        return false;
        }

We could disable preemption around that, eg:

        if (!is_kvm_guest()) {
                int first_cpu;
                bool is_sibling;

                preempt_disable();
                first_cpu = cpu_first_thread_sibling(smp_processor_id());
                is_sibling = (cpu_first_thread_sibling(cpu) == first_cpu)
                preempt_enable();

                // Can be preempted here

                if (is_sibling)
                    return false;
        }

But before we return we could be preempted, and then is_sibling is no
longer necessarily correct. So that doesn't really gain us anything.

The only way to make that result stable is to disable preemption in the
caller, but most callers don't want to AFAICS, because they know they're
doing a racy check to begin with.

cheers

^ permalink raw reply

* Re: [PATCH v2 6/9] PCI: Add pci_find_dvsec_capability to find designated VSEC
From: Liang, Kan @ 2021-09-23 21:37 UTC (permalink / raw)
  To: Ben Widawsky, linux-cxl
  Cc: Andrew Donnellan, linux-pci, Frederic Barrat, iommu,
	Bjorn Helgaas, David E . Box, Jonathan Cameron, Bjorn Helgaas,
	Dan Williams, linuxppc-dev, David Woodhouse, Lu Baolu
In-Reply-To: <20210923172647.72738-7-ben.widawsky@intel.com>



On 9/23/2021 1:26 PM, Ben Widawsky wrote:
> Add pci_find_dvsec_capability to locate a Designated Vendor-Specific
> Extended Capability with the specified DVSEC ID.
> 
> The Designated Vendor-Specific Extended Capability (DVSEC) allows one or
> more vendor specific capabilities that aren't tied to the vendor ID of
> the PCI component.
> 
> DVSEC is critical for both the Compute Express Link (CXL) driver as well
> as the driver for OpenCAPI coherent accelerator (OCXL).
> 
> Cc: David E. Box <david.e.box@linux.intel.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: linux-pci@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Andrew Donnellan <ajd@linux.ibm.com>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Applied the interface for the perf uncore driver as below. The interface 
works properly.

Tested-by: Kan Liang <kan.liang@linux.intel.com>


 From ebb69ba386dca91fb372522b13af9feb84adcbc0 Mon Sep 17 00:00:00 2001
From: Kan Liang <kan.liang@linux.intel.com>
Date: Thu, 23 Sep 2021 13:59:24 -0700
Subject: [PATCH] perf/x86/intel/uncore: Use pci core's DVSEC functionality

Apply standard interface pci_find_dvsec_capability for perf uncore
driver and remove unused macros.

Reduce maintenance burden of DVSEC query implementation.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
  arch/x86/events/intel/uncore_discovery.c | 41 
+++++++++++++++-----------------
  arch/x86/events/intel/uncore_discovery.h |  6 -----
  2 files changed, 19 insertions(+), 28 deletions(-)

diff --git a/arch/x86/events/intel/uncore_discovery.c 
b/arch/x86/events/intel/uncore_discovery.c
index 3049c64..f8ea092 100644
--- a/arch/x86/events/intel/uncore_discovery.c
+++ b/arch/x86/events/intel/uncore_discovery.c
@@ -21,7 +21,7 @@ static bool has_generic_discovery_table(void)
  		return false;

  	/* A discovery table device has the unique capability ID. */
-	dvsec = pci_find_next_ext_capability(dev, 0, UNCORE_EXT_CAP_ID_DISCOVERY);
+	dvsec = pci_find_next_ext_capability(dev, 0, PCI_EXT_CAP_ID_DVSEC);
  	pci_dev_put(dev);
  	if (dvsec)
  		return true;
@@ -260,7 +260,7 @@ static int parse_discovery_table(struct pci_dev 
*dev, int die,

  bool intel_uncore_has_discovery_tables(void)
  {
-	u32 device, val, entry_id, bar_offset;
+	u32 device, val, bar_offset;
  	int die, dvsec = 0, ret = true;
  	struct pci_dev *dev = NULL;
  	bool parsed = false;
@@ -275,27 +275,24 @@ bool intel_uncore_has_discovery_tables(void)
  	 * the discovery table devices.
  	 */
  	while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != 
NULL) {
-		while ((dvsec = pci_find_next_ext_capability(dev, dvsec, 
UNCORE_EXT_CAP_ID_DISCOVERY))) {
-			pci_read_config_dword(dev, dvsec + UNCORE_DISCOVERY_DVSEC_OFFSET, &val);
-			entry_id = val & UNCORE_DISCOVERY_DVSEC_ID_MASK;
-			if (entry_id != UNCORE_DISCOVERY_DVSEC_ID_PMON)
-				continue;
-
-			pci_read_config_dword(dev, dvsec + UNCORE_DISCOVERY_DVSEC2_OFFSET, 
&val);
-
-			if (val & ~UNCORE_DISCOVERY_DVSEC2_BIR_MASK) {
-				ret = false;
-				goto err;
-			}
-			bar_offset = UNCORE_DISCOVERY_BIR_BASE +
-				     (val & UNCORE_DISCOVERY_DVSEC2_BIR_MASK) * 
UNCORE_DISCOVERY_BIR_STEP;
-
-			die = get_device_die_id(dev);
-			if (die < 0)
-				continue;
-
-			parse_discovery_table(dev, die, bar_offset, &parsed);
+		dvsec = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_INTEL, 
UNCORE_DISCOVERY_DVSEC_ID_PMON);
+		if (!dvsec)
+			continue;
+
+		pci_read_config_dword(dev, dvsec + UNCORE_DISCOVERY_DVSEC2_OFFSET, &val);
+
+		if (val & ~UNCORE_DISCOVERY_DVSEC2_BIR_MASK) {
+			ret = false;
+			goto err;
  		}
+		bar_offset = UNCORE_DISCOVERY_BIR_BASE +
+			     (val & UNCORE_DISCOVERY_DVSEC2_BIR_MASK) * 
UNCORE_DISCOVERY_BIR_STEP;
+
+		die = get_device_die_id(dev);
+		if (die < 0)
+			continue;
+
+		parse_discovery_table(dev, die, bar_offset, &parsed);
  	}

  	/* None of the discovery tables are available */
diff --git a/arch/x86/events/intel/uncore_discovery.h 
b/arch/x86/events/intel/uncore_discovery.h
index 6d735611..84d56e5 100644
--- a/arch/x86/events/intel/uncore_discovery.h
+++ b/arch/x86/events/intel/uncore_discovery.h
@@ -2,12 +2,6 @@

  /* Generic device ID of a discovery table device */
  #define UNCORE_DISCOVERY_TABLE_DEVICE		0x09a7
-/* Capability ID for a discovery table device */
-#define UNCORE_EXT_CAP_ID_DISCOVERY		0x23
-/* First DVSEC offset */
-#define UNCORE_DISCOVERY_DVSEC_OFFSET		0x8
-/* Mask of the supported discovery entry type */
-#define UNCORE_DISCOVERY_DVSEC_ID_MASK		0xffff
  /* PMON discovery entry type ID */
  #define UNCORE_DISCOVERY_DVSEC_ID_PMON		0x1
  /* Second DVSEC offset */
-- 
2.7.4

Thanks,
Kan

> ---
>   drivers/pci/pci.c   | 32 ++++++++++++++++++++++++++++++++
>   include/linux/pci.h |  1 +
>   2 files changed, 33 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ce2ab62b64cf..94ac86ff28b0 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -732,6 +732,38 @@ u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap)
>   }
>   EXPORT_SYMBOL_GPL(pci_find_vsec_capability);
>   
> +/**
> + * pci_find_dvsec_capability - Find DVSEC for vendor
> + * @dev: PCI device to query
> + * @vendor: Vendor ID to match for the DVSEC
> + * @dvsec: Designated Vendor-specific capability ID
> + *
> + * If DVSEC has Vendor ID @vendor and DVSEC ID @dvsec return the capability
> + * offset in config space; otherwise return 0.
> + */
> +u16 pci_find_dvsec_capability(struct pci_dev *dev, u16 vendor, u16 dvsec)
> +{
> +	int pos;
> +
> +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DVSEC);
> +	if (!pos)
> +		return 0;
> +
> +	while (pos) {
> +		u16 v, id;
> +
> +		pci_read_config_word(dev, pos + PCI_DVSEC_HEADER1, &v);
> +		pci_read_config_word(dev, pos + PCI_DVSEC_HEADER2, &id);
> +		if (vendor == v && dvsec == id)
> +			return pos;
> +
> +		pos = pci_find_next_ext_capability(dev, pos, PCI_EXT_CAP_ID_DVSEC);
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(pci_find_dvsec_capability);
> +
>   /**
>    * pci_find_parent_resource - return resource region of parent bus of given
>    *			      region
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index cd8aa6fce204..c93ccfa4571b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1130,6 +1130,7 @@ u16 pci_find_ext_capability(struct pci_dev *dev, int cap);
>   u16 pci_find_next_ext_capability(struct pci_dev *dev, u16 pos, int cap);
>   struct pci_bus *pci_find_next_bus(const struct pci_bus *from);
>   u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap);
> +u16 pci_find_dvsec_capability(struct pci_dev *dev, u16 vendor, u16 dvsec);
>   
>   u64 pci_get_dsn(struct pci_dev *dev);
>   
> 

^ permalink raw reply related

* Re: [PATCH v1] powerpc/64s: Fix unrecoverable MCE crash
From: Ganesh @ 2021-09-23 18:22 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Mahesh Salgaonkar
In-Reply-To: <20210922020247.209409-1-npiggin@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3614 bytes --]

On 9/22/21 7:32 AM, Nicholas Piggin wrote:

> The machine check handler is not considered NMI on 64s. The early
> handler is the true NMI handler, and then it schedules the
> machine_check_exception handler to run when interrupts are enabled.
>
> This works fine except the case of an unrecoverable MCE, where the true
> NMI is taken when MSR[RI] is clear, it can not recover to schedule the
> next handler, so it calls machine_check_exception directly so something
> might be done about it.
>
> Calling an async handler from NMI context can result in irq state and
> other things getting corrupted. This can also trigger the BUG at
> arch/powerpc/include/asm/interrupt.h:168.
>
> Fix this by just making the 64s machine_check_exception handler an NMI
> like it is on other subarchs.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---

Hi Nick,

If I inject control memory access error in LPAR on top of this patch
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210906084303.183921-1-ganeshgr@linux.ibm.com/

I see the following warning trace

WARNING: CPU: 130 PID: 7122 at arch/powerpc/include/asm/interrupt.h:319 machine_check_exception+0x310/0x340
  Modules linked in:
  CPU: 130 PID: 7122 Comm: inj_access_err Kdump: loaded Tainted: G   M              5.15.0-rc2-cma-00054-g4a0d59fbaf71-dirty #22
  NIP:  c00000000002f980 LR: c00000000002f7e8 CTR: c000000000a31860
  REGS: c0000039fe51bb20 TRAP: 0700   Tainted: G   M               (5.15.0-rc2-cma-00054-g4a0d59fbaf71-dirty)
  MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 88000222  XER: 20040000
  CFAR: c00000000002f844 IRQMASK: 0
  GPR00: c00000000002f798 c0000039fe51bdc0 c0000000020d0000 0000000000000001
  GPR04: 0000000000000000 4000000000000002 4000000000000000 00000000000019af
  GPR08: 00000077e5ad0000 0000000000000000 c0000077ee16c700 0000000000000080
  GPR12: 0000000088000222 c0000077ee16c700 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR24: 0000000000000000 0000000000000000 c0000000020fecd8 0000000000000000
  GPR28: 0000000000000000 0000000000000001 0000000000000001 c0000039fe51be80
  NIP [c00000000002f980] machine_check_exception+0x310/0x340
  LR [c00000000002f7e8] machine_check_exception+0x178/0x340
  Call Trace:
  [c0000039fe51bdc0] [c00000000002f798] machine_check_exception+0x128/0x340 (unreliable)
  [c0000039fe51be10] [c0000000000086ec] machine_check_common+0x1ac/0x1b0
  --- interrupt: 200 at 0x10000968
  NIP:  0000000010000968 LR: 0000000010000958 CTR: 0000000000000000
  REGS: c0000039fe51be80 TRAP: 0200   Tainted: G   M               (5.15.0-rc2-cma-00054-g4a0d59fbaf71-dirty)
  MSR:  8000000002a0f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 22000824  XER: 00000000
  CFAR: 000000000000021c DAR: 00007fffb00c0000 DSISR: 02000008 IRQMASK: 0
  GPR00: 0000000022000824 00007fffc9647770 0000000010027f00 00007fffb00c0000
  GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR08: 0000000000000000 00007fffb00c0000 0000000000000001 0000000000000000
  GPR12: 0000000000000000 00007fffb015a330 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR24: 0000000000000000 0000000000000000 0000000000000000 000000001000085c
  GPR28: 00007fffc9647d18 0000000000000001 00000000100009b0 00007fffc9647770
  NIP [0000000010000968] 0x10000968
  LR [0000000010000958] 0x10000958
  --- interrupt: 200


[-- Attachment #2: Type: text/html, Size: 4152 bytes --]

^ permalink raw reply

* Re: [PATCH 3/3] memblock: cleanup memblock_free interface
From: Shahab Vahedi @ 2021-09-23 12:44 UTC (permalink / raw)
  To: Mike Rapoport, Linus Torvalds
  Cc: devicetree@vger.kernel.org, linux-efi@vger.kernel.org,
	Mike Rapoport, kvm@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-sh@vger.kernel.org, linux-um@lists.infradead.org,
	linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
	linux-mips@vger.kernel.org, linux-mm@kvack.org,
	iommu@lists.linux-foundation.org, linux-usb@vger.kernel.org,
	linux-alpha@vger.kernel.org, sparclinux@vger.kernel.org,
	xen-devel@lists.xenproject.org, Andrew Morton,
	linux-snps-arc@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
	linux-riscv@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <20210923074335.12583-4-rppt@kernel.org>

On 9/23/21 9:43 AM, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> For ages memblock_free() interface dealt with physical addresses even
> despite the existence of memblock_alloc_xx() functions that return a
> virtual pointer.
> 
> Introduce memblock_phys_free() for freeing physical ranges and repurpose
> memblock_free() to free virtual pointers to make the following pairing
> abundantly clear:
> 
> 	int memblock_phys_free(phys_addr_t base, phys_addr_t size);
> 	phys_addr_t memblock_phys_alloc(phys_addr_t base, phys_addr_t size);
> 
> 	void *memblock_alloc(phys_addr_t size, phys_addr_t align);
> 	void memblock_free(void *ptr, size_t size);
> 
> Replace intermediate memblock_free_ptr() with memblock_free() and drop
> unnecessary aliases memblock_free_early() and memblock_free_early_nid().
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>

arch/arc part: Reviewed-by: Shahab Vahedi <shahab@synopsys.com>


Thanks,
Shahab

^ permalink raw reply

* Re: [PATCH 0/3] memblock: cleanup memblock_free interface
From: Mike Rapoport @ 2021-09-23 19:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: devicetree, linux-efi, KVM list, linux-s390, Linux-sh list,
	linux-um, Linux Kernel Mailing List, kasan-dev,
	open list:BROADCOM NVRAM DRIVER, Linux-MM, iommu, linux-usb,
	Linux ARM, alpha, linux-sparc, xen-devel, Andrew Morton,
	open list:SYNOPSYS ARC ARCHITECTURE, linuxppc-dev, linux-riscv,
	Mike Rapoport
In-Reply-To: <CAHk-=wiJB8H5pZz-AKaSJ7ViRtdxQGJT7eOByp8DJx2OwZSYwA@mail.gmail.com>

Hi Linus,

On Thu, Sep 23, 2021 at 09:01:46AM -0700, Linus Torvalds wrote:
> On Thu, Sep 23, 2021 at 12:43 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> You need to be a LOT more careful.
> 
> From a trivial check - exactly because I looked at doing it with a
> script, and decided it's not so easy - I found cases like this:
> 
> -               memblock_free(__pa(paca_ptrs) + new_ptrs_size,
> +               memblock_free(paca_ptrs + new_ptrs_size,
> 
> which is COMPLETELY wrong.

I did use a coccinelle script that's slightly more robust that a sed you've
sent, but then I did a manual review, hence the two small patches with
fixes. Indeed I missed this one, so to be on the safe side I'll rename only
the obvious cases where coccinelle can be used reliably and leave all the
rest as it's now. If somebody cares enough they can update it later.
 
> And no, making the scripting just replace '__pa(x)' with '(void *)(x)'

These were actually manual and they are required for variables that
used as virtual addresses but have unsigned long type, like e.g.
initrd_start. So it's either __pa(x) or (void *).

-- 
Sincerely yours,
Mike.

^ permalink raw reply

* Re: [PATCH v4] lockdown,selinux: fix wrong subject in some SELinux lockdown checks
From: Paul Moore @ 2021-09-23 19:07 UTC (permalink / raw)
  To: Ondrej Mosnacek
  Cc: linux-efi, linux-pci, linux-cxl, Steffen Klassert, Herbert Xu,
	x86, James Morris, linux-acpi, Ingo Molnar, linux-serial,
	linux-pm, selinux, Steven Rostedt, Casey Schaufler, Dan Williams,
	netdev, Stephen Smalley, kexec, linux-kernel,
	linux-security-module, linux-fsdevel, bpf, linuxppc-dev,
	David S . Miller
In-Reply-To: <CAHC9VhQyejnmLn0NHQiWzikHs8ZdzAUdZ2WqNxgGM6xhJ4mvMQ@mail.gmail.com>

On Wed, Sep 15, 2021 at 10:59 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Mon, Sep 13, 2021 at 5:05 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > On Mon, Sep 13, 2021 at 10:02 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> > >
> > > Commit 59438b46471a ("security,lockdown,selinux: implement SELinux
> > > lockdown") added an implementation of the locked_down LSM hook to
> > > SELinux, with the aim to restrict which domains are allowed to perform
> > > operations that would breach lockdown.
> > >
> > > However, in several places the security_locked_down() hook is called in
> > > situations where the current task isn't doing any action that would
> > > directly breach lockdown, leading to SELinux checks that are basically
> > > bogus.
> > >
> > > To fix this, add an explicit struct cred pointer argument to
> > > security_lockdown() and define NULL as a special value to pass instead
> > > of current_cred() in such situations. LSMs that take the subject
> > > credentials into account can then fall back to some default or ignore
> > > such calls altogether. In the SELinux lockdown hook implementation, use
> > > SECINITSID_KERNEL in case the cred argument is NULL.
> > >
> > > Most of the callers are updated to pass current_cred() as the cred
> > > pointer, thus maintaining the same behavior. The following callers are
> > > modified to pass NULL as the cred pointer instead:
> > > 1. arch/powerpc/xmon/xmon.c
> > >      Seems to be some interactive debugging facility. It appears that
> > >      the lockdown hook is called from interrupt context here, so it
> > >      should be more appropriate to request a global lockdown decision.
> > > 2. fs/tracefs/inode.c:tracefs_create_file()
> > >      Here the call is used to prevent creating new tracefs entries when
> > >      the kernel is locked down. Assumes that locking down is one-way -
> > >      i.e. if the hook returns non-zero once, it will never return zero
> > >      again, thus no point in creating these files. Also, the hook is
> > >      often called by a module's init function when it is loaded by
> > >      userspace, where it doesn't make much sense to do a check against
> > >      the current task's creds, since the task itself doesn't actually
> > >      use the tracing functionality (i.e. doesn't breach lockdown), just
> > >      indirectly makes some new tracepoints available to whoever is
> > >      authorized to use them.
> > > 3. net/xfrm/xfrm_user.c:copy_to_user_*()
> > >      Here a cryptographic secret is redacted based on the value returned
> > >      from the hook. There are two possible actions that may lead here:
> > >      a) A netlink message XFRM_MSG_GETSA with NLM_F_DUMP set - here the
> > >         task context is relevant, since the dumped data is sent back to
> > >         the current task.
> > >      b) When adding/deleting/updating an SA via XFRM_MSG_xxxSA, the
> > >         dumped SA is broadcasted to tasks subscribed to XFRM events -
> > >         here the current task context is not relevant as it doesn't
> > >         represent the tasks that could potentially see the secret.
> > >      It doesn't seem worth it to try to keep using the current task's
> > >      context in the a) case, since the eventual data leak can be
> > >      circumvented anyway via b), plus there is no way for the task to
> > >      indicate that it doesn't care about the actual key value, so the
> > >      check could generate a lot of "false alert" denials with SELinux.
> > >      Thus, let's pass NULL instead of current_cred() here faute de
> > >      mieux.
> > >
> > > Improvements-suggested-by: Casey Schaufler <casey@schaufler-ca.com>
> > > Improvements-suggested-by: Paul Moore <paul@paul-moore.com>
> > > Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown")
> > > Acked-by: Dan Williams <dan.j.williams@intel.com>         [cxl]
> > > Acked-by: Steffen Klassert <steffen.klassert@secunet.com> [xfrm]
> > > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> > > ---
> > >
> > > v4:
> > > - rebase on top of TODO
> > > - fix rebase conflicts:
> > >   * drivers/cxl/pci.c
> > >     - trivial: the lockdown reason was corrected in mainline
> > >   * kernel/bpf/helpers.c, kernel/trace/bpf_trace.c
> > >     - trivial: LOCKDOWN_BPF_READ was renamed to LOCKDOWN_BPF_READ_KERNEL
> > >       in mainline
> > >   * kernel/power/hibernate.c
> > >     - trivial: !secretmem_active() was added to the condition in
> > >       hibernation_available()
> > > - cover new security_locked_down() call in kernel/bpf/helpers.c
> > >   (LOCKDOWN_BPF_WRITE_USER in BPF_FUNC_probe_write_user case)
> > >
> > > v3: https://lore.kernel.org/lkml/20210616085118.1141101-1-omosnace@redhat.com/
> > > - add the cred argument to security_locked_down() and adapt all callers
> > > - keep using current_cred() in BPF, as the hook calls have been shifted
> > >   to program load time (commit ff40e51043af ("bpf, lockdown, audit: Fix
> > >   buggy SELinux lockdown permission checks"))
> > > - in SELinux, don't ignore hook calls where cred == NULL, but use
> > >   SECINITSID_KERNEL as the subject instead
> > > - update explanations in the commit message
> > >
> > > v2: https://lore.kernel.org/lkml/20210517092006.803332-1-omosnace@redhat.com/
> > > - change to a single hook based on suggestions by Casey Schaufler
> > >
> > > v1: https://lore.kernel.org/lkml/20210507114048.138933-1-omosnace@redhat.com/
> >
> > The changes between v3 and v4 all seem sane to me, but I'm going to
> > let this sit for a few days in hopes that we can collect a few more
> > Reviewed-bys and ACKs.  If I don't see any objections I'll merge it
> > mid-week(ish) into selinux/stable-5.15 and plan on sending it to Linus
> > after it goes through a build/test cycle.
>
> Time's up, I just merged this into selinux/stable-5.15 and I'll send
> this to Linus once it passes testing.

... and it's back out of selinux/stable-5.15 in spectacular fashion.
I'll be following up with another SELinux patch today or tomorrow.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Borislav Petkov @ 2021-09-23 18:21 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tom Lendacky, Tianyu Lan, Andy Lutomirski, Thomas Gleixner, kexec,
	linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <20210922210558.itofvu3725dap5xx@box.shutemov.name>

On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:
> Unless we find other way to guarantee RIP-relative access, we must use
> fixup_pointer() to access any global variables.

Yah, I've asked compiler folks about any guarantees we have wrt
rip-relative addresses but it doesn't look good. Worst case, we'd have
to do the fixup_pointer() thing.

In the meantime, Tom and I did some more poking at this and here's a
diff ontop.

The direction being that we'll stick both the AMD and Intel
*cc_platform_has() call into cc_platform.c for which instrumentation
will be disabled so no issues with that.

And that will keep all that querying all together in a single file.

---
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index a73712b6ee0e..2d4f5c17d79c 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool amd_cc_platform_has(enum cc_attr attr);
 
 #define __bss_decrypted __section(".bss..decrypted")
 
@@ -74,7 +73,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; }
 
 static inline int __init
 early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; }
@@ -103,12 +101,6 @@ static inline u64 sme_get_me_mask(void)
 	return sme_me_mask;
 }
 
-#if defined(CONFIG_CPU_SUP_INTEL) && defined(CONFIG_ARCH_HAS_CC_PLATFORM)
-bool intel_cc_platform_has(enum cc_attr attr);
-#else
-static inline bool intel_cc_platform_has(enum cc_attr attr) { return false; }
-#endif
-
 #endif	/* __ASSEMBLY__ */
 
 #endif	/* __X86_MEM_ENCRYPT_H__ */
diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c
index da54a1805211..97ede7052f77 100644
--- a/arch/x86/kernel/cc_platform.c
+++ b/arch/x86/kernel/cc_platform.c
@@ -13,6 +13,52 @@
 
 #include <asm/processor.h>
 
+static bool intel_cc_platform_has(enum cc_attr attr)
+{
+#ifdef CONFIG_INTEL_TDX_GUEST
+	return false;
+#else
+	return false;
+#endif
+}
+
+/*
+ * SME and SEV are very similar but they are not the same, so there are
+ * times that the kernel will need to distinguish between SME and SEV. The
+ * cc_platform_has() function is used for this.  When a distinction isn't
+ * needed, the CC_ATTR_MEM_ENCRYPT attribute can be used.
+ *
+ * The trampoline code is a good example for this requirement.  Before
+ * paging is activated, SME will access all memory as decrypted, but SEV
+ * will access all memory as encrypted.  So, when APs are being brought
+ * up under SME the trampoline area cannot be encrypted, whereas under SEV
+ * the trampoline area must be encrypted.
+ */
+static bool amd_cc_platform_has(enum cc_attr attr)
+{
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	switch (attr) {
+	case CC_ATTR_MEM_ENCRYPT:
+		return sme_me_mask;
+
+	case CC_ATTR_HOST_MEM_ENCRYPT:
+		return sme_me_mask && !(sev_status & MSR_AMD64_SEV_ENABLED);
+
+	case CC_ATTR_GUEST_MEM_ENCRYPT:
+		return sev_status & MSR_AMD64_SEV_ENABLED;
+
+	case CC_ATTR_GUEST_STATE_ENCRYPT:
+		return sev_status & MSR_AMD64_SEV_ES_ENABLED;
+
+	default:
+		return false;
+	}
+#else
+	return false;
+#endif
+}
+
+
 bool cc_platform_has(enum cc_attr attr)
 {
 	if (sme_me_mask)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 53756ff12295..8321c43554a1 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -60,13 +60,6 @@ static u64 msr_test_ctrl_cache __ro_after_init;
  */
 static bool cpu_model_supports_sld __ro_after_init;
 
-#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
-bool intel_cc_platform_has(enum cc_attr attr)
-{
-	return false;
-}
-#endif
-
 /*
  * Processors which have self-snooping capability can handle conflicting
  * memory type across CPUs by snooping its own cache. However, there exists
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 9417d404ea92..23d54b810f08 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -361,38 +361,6 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size)
 	return early_set_memory_enc_dec(vaddr, size, true);
 }
 
-/*
- * SME and SEV are very similar but they are not the same, so there are
- * times that the kernel will need to distinguish between SME and SEV. The
- * cc_platform_has() function is used for this.  When a distinction isn't
- * needed, the CC_ATTR_MEM_ENCRYPT attribute can be used.
- *
- * The trampoline code is a good example for this requirement.  Before
- * paging is activated, SME will access all memory as decrypted, but SEV
- * will access all memory as encrypted.  So, when APs are being brought
- * up under SME the trampoline area cannot be encrypted, whereas under SEV
- * the trampoline area must be encrypted.
- */
-bool amd_cc_platform_has(enum cc_attr attr)
-{
-	switch (attr) {
-	case CC_ATTR_MEM_ENCRYPT:
-		return sme_me_mask;
-
-	case CC_ATTR_HOST_MEM_ENCRYPT:
-		return sme_me_mask && !(sev_status & MSR_AMD64_SEV_ENABLED);
-
-	case CC_ATTR_GUEST_MEM_ENCRYPT:
-		return sev_status & MSR_AMD64_SEV_ENABLED;
-
-	case CC_ATTR_GUEST_STATE_ENCRYPT:
-		return sev_status & MSR_AMD64_SEV_ES_ENABLED;
-
-	default:
-		return false;
-	}
-}
-
 /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
 bool force_dma_unencrypted(struct device *dev)
 {

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related

* Re: [PATCH] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
From: Srikar Dronamraju @ 2021-09-23 18:02 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Nathan Lynch, linuxppc-dev, npiggin
In-Reply-To: <874kabn40z.fsf@mpe.ellerman.id.au>

* Michael Ellerman <mpe@ellerman.id.au> [2021-09-23 17:29:32]:

> Nathan Lynch <nathanl@linux.ibm.com> writes:
> > Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> >
> >> * Nathan Lynch <nathanl@linux.ibm.com> [2021-09-22 11:01:12]:
> >>
> >>> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> >>> > * Nathan Lynch <nathanl@linux.ibm.com> [2021-09-20 22:12:13]:
> >>> >
> >>> >> vcpu_is_preempted() can be used outside of preempt-disabled critical
> >>> >> sections, yielding warnings such as:
> >>> >> 
> >>> >> BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/185
> >>> >> caller is rwsem_spin_on_owner+0x1cc/0x2d0
> >>> >> CPU: 1 PID: 185 Comm: systemd-udevd Not tainted 5.15.0-rc2+ #33
> >>> >> Call Trace:
> >>> >> [c000000012907ac0] [c000000000aa30a8] dump_stack_lvl+0xac/0x108 (unreliable)
> >>> >> [c000000012907b00] [c000000001371f70] check_preemption_disabled+0x150/0x160
> >>> >> [c000000012907b90] [c0000000001e0e8c] rwsem_spin_on_owner+0x1cc/0x2d0
> >>> >> [c000000012907be0] [c0000000001e1408] rwsem_down_write_slowpath+0x478/0x9a0
> >>> >> [c000000012907ca0] [c000000000576cf4] filename_create+0x94/0x1e0
> >>> >> [c000000012907d10] [c00000000057ac08] do_symlinkat+0x68/0x1a0
> >>> >> [c000000012907d70] [c00000000057ae18] sys_symlink+0x58/0x70
> >>> >> [c000000012907da0] [c00000000002e448] system_call_exception+0x198/0x3c0
> >>> >> [c000000012907e10] [c00000000000c54c] system_call_common+0xec/0x250
> >>> >> 
> >>> >> The result of vcpu_is_preempted() is always subject to invalidation by
> >>> >> events inside and outside of Linux; it's just a best guess at a point in
> >>> >> time. Use raw_smp_processor_id() to avoid such warnings.
> >>> >
> >>> > Typically smp_processor_id() and raw_smp_processor_id() except for the
> >>> > CONFIG_DEBUG_PREEMPT.
> >>> 
> >>> Sorry, I don't follow...
> >>
> >> I meant, Unless CONFIG_DEBUG_PREEMPT, smp_processor_id() is defined as
> >> raw_processor_id().
> >>
> >>> 
> >>> > In the CONFIG_DEBUG_PREEMPT case, smp_processor_id()
> >>> > is actually debug_smp_processor_id(), which does all the checks.
> >>> 
> >>> Yes, OK.
> >>> 
> >>> > I believe these checks in debug_smp_processor_id() are only valid for x86
> >>> > case (aka cases were they have __smp_processor_id() defined.)
> >>> 
> >>> Hmm, I am under the impression that the checks in
> >>> debug_smp_processor_id() are valid regardless of whether the arch
> >>> overrides __smp_processor_id().
> >>
> >> From include/linux/smp.h
> >>
> >> /*
> >>  * Allow the architecture to differentiate between a stable and unstable read.
> >>  * For example, x86 uses an IRQ-safe asm-volatile read for the unstable but a
> >>  * regular asm read for the stable.
> >>  */
> >> #ifndef __smp_processor_id
> >> #define __smp_processor_id(x) raw_smp_processor_id(x)
> >> #endif
> >>
> >> As far as I see, only x86 has a definition of __smp_processor_id.
> >> So for archs like Powerpc, __smp_processor_id(), is always
> >> defined as raw_smp_processor_id(). Right?
> >
> > Sure, yes.
> >
> >> I would think debug_smp_processor_id() would be useful if __smp_processor_id()
> >> is different from raw_smp_processor_id(). Do note debug_smp_processor_id() 
> >> calls raw_smp_processor_id().
> 
> Agree.
> 
> > I do not think the utility of debug_smp_processor_id() is related to
> > whether the arch defines __smp_processor_id().
> >
> >> Or can I understand how debug_smp_processor_id() is useful if
> >> __smp_processor_id() is defined as raw_smp_processor_id()?
> 
> debug_smp_processor_id() is useful on powerpc, as well as other arches,
> because it checks that we're in a context where the processor id won't
> change out from under us.
> 
> eg. something like this is unsafe:
> 
>   int counts[NR_CPUS];
>   int tmp, cpu;
>   
>   cpu = smp_processor_id();
>   tmp = counts[cpu];
>   				<- preempted here and migrated to another CPU
>   counts[cpu] = tmp + 1;
> 

If lets say the above call was replaced by raw_smp_processor_id(), how would
it avoid the preemption / migration to another CPU?

Replacing it with raw_smp_processor_id() may avoid, the debug splat but the
underlying issue would still remain as is. No?

> 
> > So, for powerpc with DEBUG_PREEMPT unset, a call to smp_procesor_id()
> > expands to __smp_processor_id() which expands to raw_smp_processor_id(),
> > avoiding the preempt safety checks. This is working as intended.
> >
> > For powerpc with DEBUG_PREEMPT=y, a call to smp_processor_id() expands
> > to the out of line call to debug_smp_processor_id(), which calls
> > raw_smp_processor_id() and performs the checks, warning if called in an
> > inappropriate context, as seen here. Also working as intended.
> >
> > AFAICT __smp_processor_id() is a performance/codegen-oriented hook, and
> > not really related to the debug facility. Please see 9ed7d75b2f09d
> > ("x86/percpu: Relax smp_processor_id()").
> 
> Yeah good find.
> 
> cheers

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox