LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] swiotlb: set IO TLB segment size via cmdline
From: Roman Skakun @ 2021-09-15 13:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Roman Skakun, linux-doc, Peter Zijlstra, Viresh Kumar,
	Linux Kernel Mailing List, Paul Mackerras, Will Deacon,
	Boris Ostrovsky, Marek Szyprowski, Stefano Stabellini,
	Jonathan Corbet, Christoph Hellwig, xen-devel, Paul E. McKenney,
	Konrad Rzeszutek Wilk, Muchun Song, Thomas Gleixner,
	Juergen Gross, Thomas Bogendoerfer, Andrii Anisov, linuxppc-dev,
	Randy Dunlap, linux-mips, iommu, Roman Skakun, Andrew Morton,
	Lu Baolu, Robin Murphy, Mike Rapoport, Maciej W. Rozycki
In-Reply-To: <7c04db79-7de1-93ff-0908-9bad60a287b9@suse.com>

Hi Jan,

Thanks for the answer.

>> From: Roman Skakun <roman_skakun@epam.com>
>>
>> It is possible when default IO TLB size is not
>> enough to fit a long buffers as described here [1].
>>
>> This patch makes a way to set this parameter
>> using cmdline instead of recompiling a kernel.
>>
>> [1] https://www.xilinx.com/support/answers/72694.html
>
>  I'm not convinced the swiotlb use describe there falls under "intended
>  use" - mapping a 1280x720 framebuffer in a single chunk?

I had the same issue while mapping DMA chuck ~4MB for gem fb when
using xen vdispl.
I got the next log:
[ 142.030421] rcar-fcp fea2f000.fcp: swiotlb buffer is full (sz:
3686400 bytes), total 32768 (slots), used 32 (slots)

It happened when I tried to map bounce buffer, which has a large size.
The default size if 128(IO_TLB_SEGSIZE) * 2048(IO_TLB_SHIFT) = 262144
bytes, but we requested 3686400 bytes.
When I change IO_TLB_SEGSIZE to 2048. (2048(IO_TLB_SEGSIZE)  *
2048(IO_TLB_SHIFT) = 4194304bytes).
It makes possible to retrieve a bounce buffer for requested size.
After changing this value, the problem is gone.

>  the bottom of this page is also confusing, as following "Then we can
>  confirm the modified swiotlb size in the boot log:" there is a log
>  fragment showing the same original size of 64Mb.

I suspect, this is a mistake in the article.
According to https://elixir.bootlin.com/linux/v5.14.4/source/kernel/dma/swiotlb.c#L214
and
https://elixir.bootlin.com/linux/v5.15-rc1/source/kernel/dma/swiotlb.c#L182
The IO_TLB_SEGSIZE is not used to calculate total size of swiotlb area.
This explains why we have the same total size before and after changing of
TLB segment size.

>  In order to be sure to catch all uses like this one (including ones
>  which make it upstream in parallel to yours), I think you will want
>  to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.

I don't understand your point. Can you clarify this?

>> +     /* get max IO TLB segment size */
>> +     if (isdigit(*str)) {
>> +             tmp = simple_strtoul(str, &str, 0);
>> +             if (tmp)
>> +                     io_tlb_seg_size = ALIGN(tmp, IO_TLB_SEGSIZE);
>
> From all I can tell io_tlb_seg_size wants to be a power of 2. Merely
> aligning to a multiple of IO_TLB_SEGSIZE isn't going to be enough.

Yes, right, thanks!

Cheers,
Roman.

вт, 14 сент. 2021 г. в 18:29, Jan Beulich <jbeulich@suse.com>:
>
> On 14.09.2021 17:10, Roman Skakun wrote:
> > From: Roman Skakun <roman_skakun@epam.com>
> >
> > It is possible when default IO TLB size is not
> > enough to fit a long buffers as described here [1].
> >
> > This patch makes a way to set this parameter
> > using cmdline instead of recompiling a kernel.
> >
> > [1] https://www.xilinx.com/support/answers/72694.html
>
> I'm not convinced the swiotlb use describe there falls under "intended
> use" - mapping a 1280x720 framebuffer in a single chunk? (As an aside,
> the bottom of this page is also confusing, as following "Then we can
> confirm the modified swiotlb size in the boot log:" there is a log
> fragment showing the same original size of 64Mb.
>
> > --- a/arch/mips/cavium-octeon/dma-octeon.c
> > +++ b/arch/mips/cavium-octeon/dma-octeon.c
> > @@ -237,7 +237,7 @@ void __init plat_swiotlb_setup(void)
> >               swiotlbsize = 64 * (1<<20);
> >  #endif
> >       swiotlb_nslabs = swiotlbsize >> IO_TLB_SHIFT;
> > -     swiotlb_nslabs = ALIGN(swiotlb_nslabs, IO_TLB_SEGSIZE);
> > +     swiotlb_nslabs = ALIGN(swiotlb_nslabs, swiotlb_io_seg_size());
>
> In order to be sure to catch all uses like this one (including ones
> which make it upstream in parallel to yours), I think you will want
> to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.
>
> > @@ -81,15 +86,30 @@ static unsigned int max_segment;
> >  static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
> >
> >  static int __init
> > -setup_io_tlb_npages(char *str)
> > +setup_io_tlb_params(char *str)
> >  {
> > +     unsigned long tmp;
> > +
> >       if (isdigit(*str)) {
> > -             /* avoid tail segment of size < IO_TLB_SEGSIZE */
> > -             default_nslabs =
> > -                     ALIGN(simple_strtoul(str, &str, 0), IO_TLB_SEGSIZE);
> > +             default_nslabs = simple_strtoul(str, &str, 0);
> >       }
> >       if (*str == ',')
> >               ++str;
> > +
> > +     /* get max IO TLB segment size */
> > +     if (isdigit(*str)) {
> > +             tmp = simple_strtoul(str, &str, 0);
> > +             if (tmp)
> > +                     io_tlb_seg_size = ALIGN(tmp, IO_TLB_SEGSIZE);
>
> From all I can tell io_tlb_seg_size wants to be a power of 2. Merely
> aligning to a multiple of IO_TLB_SEGSIZE isn't going to be enough.
>
> Jan
>


-- 
Best Regards, Roman.

^ permalink raw reply

* [PATCH] video: fbdev: use memset_io() instead of memset()
From: Christophe Leroy @ 2021-09-15 13:34 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Andrew Morton
  Cc: linux-fbdev, Finn Thain, Stan Johnson, linux-kernel, dri-devel,
	linuxppc-dev

While investigating a lockup at startup on Powerbook 3400C, it was
identified that the fbdev driver generates alignment exception at
startup:

	--- interrupt: 600 at memset+0x60/0xc0
	NIP:  c0021414 LR: c03fc49c CTR: 00007fff
	REGS: ca021c10 TRAP: 0600   Tainted: G        W          (5.14.2-pmac-00727-g12a41fa69492)
	MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 44008442  XER: 20000100
	DAR: cab80020 DSISR: 00017c07
	GPR00: 00000007 ca021cd0 c14412e0 cab80000 00000000 00100000 cab8001c 00000004
	GPR08: 00100000 00007fff 00000000 00000000 84008442 00000000 c0006fb4 00000000
	GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00100000
	GPR24: 00000000 81800000 00000320 c15fa400 c14d1878 00000000 c14d1800 c094e19c
	NIP [c0021414] memset+0x60/0xc0
	LR [c03fc49c] chipsfb_pci_init+0x160/0x580
	--- interrupt: 600
	[ca021cd0] [c03fc46c] chipsfb_pci_init+0x130/0x580 (unreliable)
	[ca021d20] [c03a3a70] pci_device_probe+0xf8/0x1b8
	[ca021d50] [c043d584] really_probe.part.0+0xac/0x388
	[ca021d70] [c043d914] __driver_probe_device+0xb4/0x170
	[ca021d90] [c043da18] driver_probe_device+0x48/0x144
	[ca021dc0] [c043e318] __driver_attach+0x11c/0x1c4
	[ca021de0] [c043ad30] bus_for_each_dev+0x88/0xf0
	[ca021e10] [c043c724] bus_add_driver+0x190/0x22c
	[ca021e40] [c043ee94] driver_register+0x9c/0x170
	[ca021e60] [c0006c28] do_one_initcall+0x54/0x1ec
	[ca021ed0] [c08246e4] kernel_init_freeable+0x1c0/0x270
	[ca021f10] [c0006fdc] kernel_init+0x28/0x11c
	[ca021f30] [c0017148] ret_from_kernel_thread+0x14/0x1c
	Instruction dump:
	7d4601a4 39490777 7d4701a4 39490888 7d4801a4 39490999 7d4901a4 39290aaa
	7d2a01a4 4c00012c 4bfffe88 0fe00000 <4bfffe80> 9421fff0 38210010 48001970

This is due to 'dcbz' instruction being used on non-cached memory.
'dcbz' instruction is used by memset() to zeroize a complete
cacheline at once, and memset() is not expected to be used on non
cached memory.

When performing a 'sparse' check on fbdev driver, it also appears
that the use of memset() is unexpected:

	drivers/video/fbdev/chipsfb.c:334:17: warning: incorrect type in argument 1 (different address spaces)
	drivers/video/fbdev/chipsfb.c:334:17:    expected void *
	drivers/video/fbdev/chipsfb.c:334:17:    got char [noderef] __iomem *screen_base
	drivers/video/fbdev/chipsfb.c:334:15: warning: memset with byte count of 1048576

Use fb_memset() instead of memset(). fb_memset() is defined as
memset_io() for powerpc.

Fixes: 8c8709334cec ("[PATCH] ppc32: Remove CONFIG_PMAC_PBOOK")
Reported-by: Stan Johnson <userm57@yahoo.com>
Cc: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 drivers/video/fbdev/chipsfb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/chipsfb.c b/drivers/video/fbdev/chipsfb.c
index 998067b701fa..393894af26f8 100644
--- a/drivers/video/fbdev/chipsfb.c
+++ b/drivers/video/fbdev/chipsfb.c
@@ -331,7 +331,7 @@ static const struct fb_var_screeninfo chipsfb_var = {
 
 static void init_chips(struct fb_info *p, unsigned long addr)
 {
-	memset(p->screen_base, 0, 0x100000);
+	fb_memset(p->screen_base, 0, 0x100000);
 
 	p->fix = chipsfb_fix;
 	p->fix.smem_start = addr;
-- 
2.31.1


^ permalink raw reply related

* [PATCH] serial: 8250: SERIAL_8250_FSL should not default to y when compile-testing
From: Geert Uytterhoeven @ 2021-09-15 12:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Johan Hovold, Shawn Guo, Li Yang,
	Scott Wood
  Cc: Geert Uytterhoeven, linuxppc-dev, linux-arm-kernel, linux-serial,
	linux-kernel

Commit b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
added compile-test support to the Freescale 16550 driver.  However, as
SERIAL_8250_FSL is an invisible symbol, merely enabling COMPILE_TEST now
enables this driver.

Fix this by making SERIAL_8250_FSL visible.  Tighten the dependencies to
prevent asking the user about this driver when configuring a kernel
without appropriate Freescale SoC or ACPI support.

Fixes: b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Yes, it's ugly, but I see no better solution. Do you?

 drivers/tty/serial/8250/Kconfig | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
index 808268edd2e82a45..a2978b31144e94f2 100644
--- a/drivers/tty/serial/8250/Kconfig
+++ b/drivers/tty/serial/8250/Kconfig
@@ -361,9 +361,13 @@ config SERIAL_8250_BCM2835AUX
 	  If unsure, say N.
 
 config SERIAL_8250_FSL
-	bool
+	bool "Freescale 16550-style UART support (8250 based driver)"
 	depends on SERIAL_8250_CONSOLE
-	default PPC || ARM || ARM64 || COMPILE_TEST
+	depends on FSL_SOC || ARCH_LAYERSCAPE || SOC_LS1021A || (ARM64 && ACPI) || COMPILE_TEST
+	default FSL_SOC || ARCH_LAYERSCAPE || SOC_LS1021A || (ARM64 && ACPI)
+	help
+	  Selecting this option will add support for the 16550-style serial
+	  port hardware found on Freescale SoCs.
 
 config SERIAL_8250_DW
 	tristate "Support for Synopsys DesignWare 8250 quirks"
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH] pci: Rename pcibios_add_device to match
From: Niklas Schnelle @ 2021-09-15 10:13 UTC (permalink / raw)
  To: Oliver O'Halloran, Michal Simek, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Gerald Schaefer,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	David S. Miller, Bjorn Helgaas, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin
  Cc: sparclinux, linux-s390, linuxppc-dev, linux-kernel, linux-pci
In-Reply-To: <20210913152709.48013-1-oohall@gmail.com>

On Tue, 2021-09-14 at 01:27 +1000, Oliver O'Halloran wrote:
> The general convention for pcibios_* hooks is that they're named after
> the corresponding pci_* function they provide a hook for. The exception
> is pcibios_add_device() which provides a hook for pci_device_add(). This
> has been irritating me for years so rename it.
> 
> Also, remove the export of the microblaze version. The only caller
> must be compiled as a built-in so there's no reason for the export.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/microblaze/pci/pci-common.c           | 3 +--
>  arch/powerpc/kernel/pci-common.c           | 2 +-
>  arch/powerpc/platforms/powernv/pci-sriov.c | 2 +-
>  arch/s390/pci/pci.c                        | 2 +-
>  arch/sparc/kernel/pci.c                    | 2 +-
>  arch/x86/pci/common.c                      | 2 +-
>  drivers/pci/pci.c                          | 4 ++--
>  drivers/pci/probe.c                        | 4 ++--
>  include/linux/pci.h                        | 2 +-
>  9 files changed, 11 insertions(+), 12 deletions(-)
> 
> 
.. snip ..
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index e7e6788d75a8..ded3321b7208 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -561,7 +561,7 @@ static void zpci_cleanup_bus_resources(struct zpci_dev *zdev)
>  	zdev->has_resources = 0;
>  }
>  
> -int pcibios_add_device(struct pci_dev *pdev)
> +int pcibios_device_add(struct pci_dev *pdev)
>  {
>  	struct zpci_dev *zdev = to_zpci(pdev);
>  	struct resource *res;
> 
.. snip ..
>


I agree with your assesment this is indeed confusing. Interestingly
pcibios_release_device() also doesn't follow the convention exactly as
it is called from pci_release_dev() but at least that isn't very
confusing.

So for the arch/s390/pci bit:

Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>


^ permalink raw reply

* Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
From: Borislav Petkov @ 2021-09-15 10:08 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	dri-devel, platform-driver-x86, Paul Mackerras, linux-s390,
	Andi Kleen, Joerg Roedel, x86, amd-gfx, Christoph Hellwig,
	linux-graphics-maintainer, Tom Lendacky, Tianyu Lan, kexec,
	linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <87lf3yk7g4.fsf@mpe.ellerman.id.au>

On Wed, Sep 15, 2021 at 10:28:59AM +1000, Michael Ellerman wrote:
> I don't love it, a new C file and an out-of-line call to then call back
> to a static inline that for most configuration will return false ... but
> whatever :)

Yeah, hch thinks it'll cause a big mess otherwise:

https://lore.kernel.org/lkml/YSScWvpXeVXw%2Fed5@infradead.org/

I guess less ifdeffery is nice too.

> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

Thx.

> Yeah, fixed in mainline today, thanks for trying to cross compile :)

Always!

:-)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply

* Re: [PATCH trivial v2] powerpc/powernv/dump: Fix typo in comment
From: Joel Stanley @ 2021-09-15  8:19 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: linuxppc-dev
In-Reply-To: <20210914143802.54325-1-hegdevasant@linux.vnet.ibm.com>

On Tue, 14 Sept 2021 at 14:38, Vasant Hegde
<hegdevasant@linux.vnet.ibm.com> wrote:
>
> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>

Reviewed-by: Joel Stanley <joel@jms.id.au>

> ---
>  arch/powerpc/platforms/powernv/opal-dump.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/powernv/opal-dump.c b/arch/powerpc/platforms/powernv/opal-dump.c
> index 00c5a59d82d9..717d1d30ade5 100644
> --- a/arch/powerpc/platforms/powernv/opal-dump.c
> +++ b/arch/powerpc/platforms/powernv/opal-dump.c
> @@ -419,7 +419,7 @@ void __init opal_platform_dump_init(void)
>         int rc;
>         int dump_irq;
>
> -       /* ELOG not supported by firmware */
> +       /* Dump not supported by firmware */
>         if (!opal_check_token(OPAL_DUMP_READ))
>                 return;
>
> --
> 2.31.1
>

^ permalink raw reply

* Re: [PATCH] powerpc/powernv/flash: Check OPAL flash calls exist before using
From: Vasant Hegde @ 2021-09-15  6:55 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev
In-Reply-To: <87ilz2jr1f.fsf@mpe.ellerman.id.au>

On 9/15/21 11:53 AM, Michael Ellerman wrote:
> Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes:
>> Currently only FSP based powernv systems supports firmware update
>> interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists
>> before initalising the flash driver.
>>
>> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/platforms/powernv/opal-flash.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/powernv/opal-flash.c b/arch/powerpc/platforms/powernv/opal-flash.c
>> index 7e7d38b17420..05490fc22fae 100644
>> --- a/arch/powerpc/platforms/powernv/opal-flash.c
>> +++ b/arch/powerpc/platforms/powernv/opal-flash.c
>> @@ -520,6 +520,10 @@ void __init opal_flash_update_init(void)
>>   {
>>   	int ret;
>>   
>> +	/* Firmware update is not supported by firmware */
>> +	if (!opal_check_token(OPAL_FLASH_VALIDATE))
>> +		return;
>> +
> 

Michael,

> That will mean the following files no longer appear on BMC systems:
> 
>    /sys/firmware/opal/image
>    /sys/firmware/opal/validate_flash
>    /sys/firmware/opal/manage_flash
>    /sys/firmware/opal/update_flash
> 
> Presumably those files don't actually work correctly, but are we sure
> their mere existence isn't used by anything at all?

That's correct. We never used these files/interfaces on BMC based systems.

> 
> We've had trouble in the past where removing sysfs files breaks tools
> unexpectedly, see smt_snooze_delay.

AFAIK only update_flash uses these interfaces on baremetal systems.
This change shouldn't break update_flash as these interfaces never used/worked
on BMC based powernv systems.

-Vasant


^ permalink raw reply

* Re: [PATCH] powerpc/powernv/flash: Check OPAL flash calls exist before using
From: Vasant Hegde @ 2021-09-15  6:55 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev
In-Reply-To: <87ilz2jr1f.fsf@mpe.ellerman.id.au>

On 9/15/21 11:53 AM, Michael Ellerman wrote:
> Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes:
>> Currently only FSP based powernv systems supports firmware update
>> interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists
>> before initalising the flash driver.
>>
>> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/platforms/powernv/opal-flash.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/powernv/opal-flash.c b/arch/powerpc/platforms/powernv/opal-flash.c
>> index 7e7d38b17420..05490fc22fae 100644
>> --- a/arch/powerpc/platforms/powernv/opal-flash.c
>> +++ b/arch/powerpc/platforms/powernv/opal-flash.c
>> @@ -520,6 +520,10 @@ void __init opal_flash_update_init(void)
>>   {
>>   	int ret;
>>   
>> +	/* Firmware update is not supported by firmware */
>> +	if (!opal_check_token(OPAL_FLASH_VALIDATE))
>> +		return;
>> +
> 

Michael,

> That will mean the following files no longer appear on BMC systems:
> 
>    /sys/firmware/opal/image
>    /sys/firmware/opal/validate_flash
>    /sys/firmware/opal/manage_flash
>    /sys/firmware/opal/update_flash
> 
> Presumably those files don't actually work correctly, but are we sure
> their mere existence isn't used by anything at all?

That's correct. We never used these files/interfaces on BMC based systems.

> 
> We've had trouble in the past where removing sysfs files breaks tools
> unexpectedly, see smt_snooze_delay.

AFAIK only update_flash uses these interfaces on baremetal systems.
This change shouldn't break update_flash as these interfaces never used/worked
on BMC based powernv systems.

-Vasant


^ permalink raw reply

* Re: [PATCH] powerpc/powernv/flash: Check OPAL flash calls exist before using
From: Michael Ellerman @ 2021-09-15  6:23 UTC (permalink / raw)
  To: Vasant Hegde, linuxppc-dev; +Cc: Vasant Hegde
In-Reply-To: <20210914101630.30613-1-hegdevasant@linux.vnet.ibm.com>

Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes:
> Currently only FSP based powernv systems supports firmware update
> interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists
> before initalising the flash driver.
>
> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/powernv/opal-flash.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/opal-flash.c b/arch/powerpc/platforms/powernv/opal-flash.c
> index 7e7d38b17420..05490fc22fae 100644
> --- a/arch/powerpc/platforms/powernv/opal-flash.c
> +++ b/arch/powerpc/platforms/powernv/opal-flash.c
> @@ -520,6 +520,10 @@ void __init opal_flash_update_init(void)
>  {
>  	int ret;
>  
> +	/* Firmware update is not supported by firmware */
> +	if (!opal_check_token(OPAL_FLASH_VALIDATE))
> +		return;
> +

That will mean the following files no longer appear on BMC systems:

  /sys/firmware/opal/image
  /sys/firmware/opal/validate_flash
  /sys/firmware/opal/manage_flash
  /sys/firmware/opal/update_flash

Presumably those files don't actually work correctly, but are we sure
their mere existence isn't used by anything at all?

We've had trouble in the past where removing sysfs files breaks tools
unexpectedly, see smt_snooze_delay.

cheers

^ permalink raw reply

* Re: [PATCH v6 4/4] powerpc/64s: Initialize and use a temporary mm for patching on Radix
From: Jordan Niethe @ 2021-09-15  4:24 UTC (permalink / raw)
  To: Christopher M. Riedl; +Cc: linuxppc-dev, linux-hardening
In-Reply-To: <20210911022904.30962-5-cmr@bluescreens.de>

On Sat, Sep 11, 2021 at 12:39 PM Christopher M. Riedl
<cmr@bluescreens.de> wrote:
>
> When code patching a STRICT_KERNEL_RWX kernel the page containing the
> address to be patched is temporarily mapped as writeable. Currently, a
> per-cpu vmalloc patch area is used for this purpose. While the patch
> area is per-cpu, the temporary page mapping is inserted into the kernel
> page tables for the duration of patching. The mapping is exposed to CPUs
> other than the patching CPU - this is undesirable from a hardening
> perspective. Use a temporary mm instead which keeps the mapping local to
> the CPU doing the patching.
>
> Use the `poking_init` init hook to prepare a temporary mm and patching
> address. Initialize the temporary mm by copying the init mm. Choose a
> randomized patching address inside the temporary mm userspace address
> space. The patching address is randomized between PAGE_SIZE and
> DEFAULT_MAP_WINDOW-PAGE_SIZE.
>
> Bits of entropy with 64K page size on BOOK3S_64:
>
>         bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)
>
>         PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
>         bits of entropy = log2(128TB / 64K)
>         bits of entropy = 31
>
> The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
> operates - by default the space above DEFAULT_MAP_WINDOW is not
> available. Currently the Hash MMU does not use a temporary mm so
> technically this upper limit isn't necessary; however, a larger
> randomization range does not further "harden" this overall approach and
> future work may introduce patching with a temporary mm on Hash as well.
>
> Randomization occurs only once during initialization at boot for each
> possible CPU in the system.
>
> Introduce two new functions, map_patch_mm() and unmap_patch_mm(), to
> respectively create and remove the temporary mapping with write
> permissions at patching_addr. Map the page with PAGE_KERNEL to set
> EAA[0] for the PTE which ignores the AMR (so no need to unlock/lock
> KUAP) according to PowerISA v3.0b Figure 35 on Radix.
>
> Based on x86 implementation:
>
> commit 4fc19708b165
> ("x86/alternatives: Initialize temporary mm for patching")
>
> and:
>
> commit b3fd8e83ada0
> ("x86/alternatives: Use temporary mm for text poking")
>
> Signed-off-by: Christopher M. Riedl <cmr@bluescreens.de>
>
> ---
>
> v6:  * Small clean-ups (naming, formatting, style, etc).
>      * Call stop_using_temporary_mm() before pte_unmap_unlock() after
>        patching.
>      * Replace BUG_ON()s in poking_init() w/ WARN_ON()s.
>
> v5:  * Only support Book3s64 Radix MMU for now.
>      * Use a per-cpu datastructure to hold the patching_addr and
>        patching_mm to avoid the need for a synchronization lock/mutex.
>
> v4:  * In the previous series this was two separate patches: one to init
>        the temporary mm in poking_init() (unused in powerpc at the time)
>        and the other to use it for patching (which removed all the
>        per-cpu vmalloc code). Now that we use poking_init() in the
>        existing per-cpu vmalloc approach, that separation doesn't work
>        as nicely anymore so I just merged the two patches into one.
>      * Preload the SLB entry and hash the page for the patching_addr
>        when using Hash on book3s64 to avoid taking an SLB and Hash fault
>        during patching. The previous implementation was a hack which
>        changed current->mm to allow the SLB and Hash fault handlers to
>        work with the temporary mm since both of those code-paths always
>        assume mm == current->mm.
>      * Also (hmm - seeing a trend here) with the book3s64 Hash MMU we
>        have to manage the mm->context.active_cpus counter and mm cpumask
>        since they determine (via mm_is_thread_local()) if the TLB flush
>        in pte_clear() is local or not - it should always be local when
>        we're using the temporary mm. On book3s64's Radix MMU we can
>        just call local_flush_tlb_mm().
>      * Use HPTE_USE_KERNEL_KEY on Hash to avoid costly lock/unlock of
>        KUAP.
> ---
>  arch/powerpc/lib/code-patching.c | 119 +++++++++++++++++++++++++++++--
>  1 file changed, 112 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index e802e42c2789..af8e2a02a9dd 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -11,6 +11,7 @@
>  #include <linux/cpuhotplug.h>
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
> +#include <linux/random.h>
>
>  #include <asm/tlbflush.h>
>  #include <asm/page.h>
> @@ -103,6 +104,7 @@ static inline void stop_using_temporary_mm(struct temp_mm *temp_mm)
>
>  static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>  static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
> +static DEFINE_PER_CPU(struct mm_struct *, cpu_patching_mm);
>
>  static int text_area_cpu_up(unsigned int cpu)
>  {
> @@ -126,8 +128,48 @@ static int text_area_cpu_down(unsigned int cpu)
>         return 0;
>  }
>
> +static __always_inline void __poking_init_temp_mm(void)
> +{
> +       int cpu;
> +       spinlock_t *ptl; /* for protecting pte table */
> +       pte_t *ptep;
> +       struct mm_struct *patching_mm;
> +       unsigned long patching_addr;
> +
> +       for_each_possible_cpu(cpu) {
> +               patching_mm = copy_init_mm();
> +               WARN_ON(!patching_mm);
> +               per_cpu(cpu_patching_mm, cpu) = patching_mm;
> +
> +               /*
> +                * Choose a randomized, page-aligned address from the range:
> +                * [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE] The lower
> +                * address bound is PAGE_SIZE to avoid the zero-page.  The
> +                * upper address bound is DEFAULT_MAP_WINDOW - PAGE_SIZE to
> +                * stay under DEFAULT_MAP_WINDOW with the Book3s64 Hash MMU.
> +                */
> +               patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK)
> +                               % (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));
> +               per_cpu(cpu_patching_addr, cpu) = patching_addr;
> +
> +               /*
> +                * PTE allocation uses GFP_KERNEL which means we need to
> +                * pre-allocate the PTE here because we cannot do the
> +                * allocation during patching when IRQs are disabled.
> +                */
> +               ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
> +               WARN_ON(!ptep);
> +               pte_unmap_unlock(ptep, ptl);
> +       }
> +}
> +
>  void __init poking_init(void)
>  {
> +       if (radix_enabled()) {
> +               __poking_init_temp_mm();
> +               return;
> +       }
> +
>         WARN_ON(cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>                 "powerpc/text_poke:online", text_area_cpu_up,
>                 text_area_cpu_down) < 0);
> @@ -197,30 +239,93 @@ static inline int unmap_patch_area(void)
>         return 0;
>  }
>
> +struct patch_mapping {
> +       spinlock_t *ptl; /* for protecting pte table */
> +       pte_t *ptep;
> +       struct temp_mm temp_mm;
> +};
> +
> +/*
> + * This can be called for kernel text or a module.
> + */
> +static int map_patch_mm(const void *addr, struct patch_mapping *patch_mapping)
> +{
> +       struct page *page;
> +       struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> +       unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> +
> +       if (is_vmalloc_or_module_addr(addr))
> +               page = vmalloc_to_page(addr);
> +       else
> +               page = virt_to_page(addr);
> +
> +       patch_mapping->ptep = get_locked_pte(patching_mm, patching_addr,
> +                                            &patch_mapping->ptl);
> +       if (unlikely(!patch_mapping->ptep)) {
> +               pr_warn("map patch: failed to allocate pte for patching\n");
> +               return -1;
> +       }
> +
> +       set_pte_at(patching_mm, patching_addr, patch_mapping->ptep,
> +                  pte_mkdirty(mk_pte(page, PAGE_KERNEL)));

I think because switch_mm_irqs_off() will  not necessarily have a
barrier so a ptesync would be needed.
A spurious fault here from __patch_instruction() would not be handled correctly.

> +
> +       init_temp_mm(&patch_mapping->temp_mm, patching_mm);
> +       start_using_temporary_mm(&patch_mapping->temp_mm);
> +
> +       return 0;
> +}
> +
> +static int unmap_patch_mm(struct patch_mapping *patch_mapping)
> +{
> +       struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> +       unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> +
> +       pte_clear(patching_mm, patching_addr, patch_mapping->ptep);
> +
> +       local_flush_tlb_mm(patching_mm);
> +       stop_using_temporary_mm(&patch_mapping->temp_mm);
> +
> +       pte_unmap_unlock(patch_mapping->ptep, patch_mapping->ptl);
> +
> +       return 0;
> +}
> +
>  static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
>  {
>         int err, rc = 0;
>         u32 *patch_addr = NULL;
>         unsigned long flags;
> +       struct patch_mapping patch_mapping;
>
>         /*
> -        * During early early boot patch_instruction is called
> -        * when text_poke_area is not ready, but we still need
> -        * to allow patching. We just do the plain old patching
> +        * During early early boot patch_instruction is called when the
> +        * patching_mm/text_poke_area is not ready, but we still need to allow
> +        * patching. We just do the plain old patching.
>          */
> -       if (!this_cpu_read(text_poke_area))
> -               return raw_patch_instruction(addr, instr);
> +       if (radix_enabled()) {
> +               if (!this_cpu_read(cpu_patching_mm))
> +                       return raw_patch_instruction(addr, instr);
> +       } else {
> +               if (!this_cpu_read(text_poke_area))
> +                       return raw_patch_instruction(addr, instr);
> +       }
>
>         local_irq_save(flags);
>
> -       err = map_patch_area(addr);
> +       if (radix_enabled())
> +               err = map_patch_mm(addr, &patch_mapping);
> +       else
> +               err = map_patch_area(addr);
>         if (err)
>                 goto out;
>
>         patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
>         rc = __patch_instruction(addr, instr, patch_addr);
>
> -       err = unmap_patch_area();
> +       if (radix_enabled())
> +               err = unmap_patch_mm(&patch_mapping);
> +       else
> +               err = unmap_patch_area();
>
>  out:
>         local_irq_restore(flags);
> --
> 2.32.0
>

^ permalink raw reply

* Re: [RESEND PATCH v4 1/4] drivers/nvdimm: Add nvdimm pmu structure
From: Dan Williams @ 2021-09-15  4:11 UTC (permalink / raw)
  To: kajoljain
  Cc: Linux NVDIMM, Santosh Sivaraj, maddy, Weiny, Ira, rnsastry,
	Peter Zijlstra, Linux Kernel Mailing List, atrajeev,
	Aneesh Kumar K.V, Vishal L Verma, Vaibhav Jain, Thomas Gleixner,
	linuxppc-dev
In-Reply-To: <CAPcyv4hE4rh5R+8zy3X4gDJeuPzQ0oQHmHbe_pppgWB2_RjfAg@mail.gmail.com>

On Tue, Sep 14, 2021 at 9:08 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Thu, Sep 9, 2021 at 12:56 AM kajoljain <kjain@linux.ibm.com> wrote:
> >
> >
> >
> > On 9/8/21 3:29 AM, Dan Williams wrote:
> > > Hi Kajol,
> > >
> > > Apologies for the delay in responding to this series, some comments below:
> >
> > Hi Dan,
> >     No issues, thanks for reviewing the patches.
> >
> > >
> > > On Thu, Sep 2, 2021 at 10:10 PM Kajol Jain <kjain@linux.ibm.com> wrote:
> > >>
> > >> A structure is added, called nvdimm_pmu, for performance
> > >> stats reporting support of nvdimm devices. It can be used to add
> > >> nvdimm pmu data such as supported events and pmu event functions
> > >> like event_init/add/read/del with cpu hotplug support.
> > >>
> > >> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > >> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> > >> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
> > >> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> > >> ---
> > >>  include/linux/nd.h | 43 +++++++++++++++++++++++++++++++++++++++++++
> > >>  1 file changed, 43 insertions(+)
> > >>
> > >> diff --git a/include/linux/nd.h b/include/linux/nd.h
> > >> index ee9ad76afbba..712499cf7335 100644
> > >> --- a/include/linux/nd.h
> > >> +++ b/include/linux/nd.h
> > >> @@ -8,6 +8,8 @@
> > >>  #include <linux/ndctl.h>
> > >>  #include <linux/device.h>
> > >>  #include <linux/badblocks.h>
> > >> +#include <linux/platform_device.h>
> > >> +#include <linux/perf_event.h>
> > >>
> > >>  enum nvdimm_event {
> > >>         NVDIMM_REVALIDATE_POISON,
> > >> @@ -23,6 +25,47 @@ enum nvdimm_claim_class {
> > >>         NVDIMM_CCLASS_UNKNOWN,
> > >>  };
> > >>
> > >> +/* Event attribute array index */
> > >> +#define NVDIMM_PMU_FORMAT_ATTR         0
> > >> +#define NVDIMM_PMU_EVENT_ATTR          1
> > >> +#define NVDIMM_PMU_CPUMASK_ATTR                2
> > >> +#define NVDIMM_PMU_NULL_ATTR           3
> > >> +
> > >> +/**
> > >> + * struct nvdimm_pmu - data structure for nvdimm perf driver
> > >> + *
> > >> + * @name: name of the nvdimm pmu device.
> > >> + * @pmu: pmu data structure for nvdimm performance stats.
> > >> + * @dev: nvdimm device pointer.
> > >> + * @functions(event_init/add/del/read): platform specific pmu functions.
> > >
> > > This is not valid kernel-doc:
> > >
> > > include/linux/nd.h:67: warning: Function parameter or member
> > > 'event_init' not described in 'nvdimm_pmu'
> > > include/linux/nd.h:67: warning: Function parameter or member 'add' not
> > > described in 'nvdimm_pmu'
> > > include/linux/nd.h:67: warning: Function parameter or member 'del' not
> > > described in 'nvdimm_pmu'
> > > include/linux/nd.h:67: warning: Function parameter or member 'read'
> > > not described in 'nvdimm_pmu'
> > >
> > > ...but I think rather than fixing those up 'struct nvdimm_pmu' should be pruned.
> > >
> > > It's not clear to me that it is worth the effort to describe these
> > > details to the nvdimm core which is just going to turn around and call
> > > the pmu core. I'd just as soon have the driver call the pmu core
> > > directly, optionally passing in attributes and callbacks that come
> > > from the nvdimm core and/or the nvdimm provider.
> >
> > The intend for adding these callbacks(event_init/add/del/read) is to give
> > flexibility to the nvdimm core to add some common checks/routines if required
> > in the future. Those checks can be common for all architecture with still having the
> > ability to call arch/platform specific driver code to use its own routines.
> >
> > But as you said, currently we don't have any common checks and it directly
> > calling platform specific code, so we can get rid of it.
> > Should we remove this part for now?
>
> Yes, lets go direct to the perf api for now and await the need for a
> common core wrapper to present itself.
>
> >
> >
> > >
> > > Otherwise it's also not clear which of these structure members are
> > > used at runtime vs purely used as temporary storage to pass parameters
> > > to the pmu core.
> > >
> > >> + * @attr_groups: data structure for events, formats and cpumask
> > >> + * @cpu: designated cpu for counter access.
> > >> + * @node: node for cpu hotplug notifier link.
> > >> + * @cpuhp_state: state for cpu hotplug notification.
> > >> + * @arch_cpumask: cpumask to get designated cpu for counter access.
> > >> + */
> > >> +struct nvdimm_pmu {
> > >> +       const char *name;
> > >> +       struct pmu pmu;
> > >> +       struct device *dev;
> > >> +       int (*event_init)(struct perf_event *event);
> > >> +       int  (*add)(struct perf_event *event, int flags);
> > >> +       void (*del)(struct perf_event *event, int flags);
> > >> +       void (*read)(struct perf_event *event);
> > >> +       /*
> > >> +        * Attribute groups for the nvdimm pmu. Index 0 used for
> > >> +        * format attribute, index 1 used for event attribute,
> > >> +        * index 2 used for cpusmask attribute and index 3 kept as NULL.
> > >> +        */
> > >> +       const struct attribute_group *attr_groups[4];
> > >
> > > Following from above, I'd rather this was organized as static
> > > attributes with an is_visible() helper for the groups for any dynamic
> > > aspects. That mirrors the behavior of nvdimm_create() and allows for
> > > device drivers to compose the attribute groups from a core set and /
> > > or a provider specific set.
> >
> > Since we don't have any common events right now, Can I use papr
> > attributes directly or should we create dummy events for common thing and
> > then merged it with papr event list.
>
> Just use papr events directly.

That is to say...I think if another implementation followed it should
try to match as many common event names as papr_scm picked, and
possibly extend with its own rather than start with a papr_scm
specific namespace for everything.

^ permalink raw reply

* Re: [RESEND PATCH v4 1/4] drivers/nvdimm: Add nvdimm pmu structure
From: Dan Williams @ 2021-09-15  4:08 UTC (permalink / raw)
  To: kajoljain
  Cc: Linux NVDIMM, Santosh Sivaraj, maddy, Weiny, Ira, rnsastry,
	Peter Zijlstra, Linux Kernel Mailing List, atrajeev,
	Aneesh Kumar K.V, Vishal L Verma, Vaibhav Jain, Thomas Gleixner,
	linuxppc-dev
In-Reply-To: <d7f8bf51-059f-4496-37c4-6516a703e209@linux.ibm.com>

On Thu, Sep 9, 2021 at 12:56 AM kajoljain <kjain@linux.ibm.com> wrote:
>
>
>
> On 9/8/21 3:29 AM, Dan Williams wrote:
> > Hi Kajol,
> >
> > Apologies for the delay in responding to this series, some comments below:
>
> Hi Dan,
>     No issues, thanks for reviewing the patches.
>
> >
> > On Thu, Sep 2, 2021 at 10:10 PM Kajol Jain <kjain@linux.ibm.com> wrote:
> >>
> >> A structure is added, called nvdimm_pmu, for performance
> >> stats reporting support of nvdimm devices. It can be used to add
> >> nvdimm pmu data such as supported events and pmu event functions
> >> like event_init/add/read/del with cpu hotplug support.
> >>
> >> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> >> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> >> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
> >> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> >> ---
> >>  include/linux/nd.h | 43 +++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 43 insertions(+)
> >>
> >> diff --git a/include/linux/nd.h b/include/linux/nd.h
> >> index ee9ad76afbba..712499cf7335 100644
> >> --- a/include/linux/nd.h
> >> +++ b/include/linux/nd.h
> >> @@ -8,6 +8,8 @@
> >>  #include <linux/ndctl.h>
> >>  #include <linux/device.h>
> >>  #include <linux/badblocks.h>
> >> +#include <linux/platform_device.h>
> >> +#include <linux/perf_event.h>
> >>
> >>  enum nvdimm_event {
> >>         NVDIMM_REVALIDATE_POISON,
> >> @@ -23,6 +25,47 @@ enum nvdimm_claim_class {
> >>         NVDIMM_CCLASS_UNKNOWN,
> >>  };
> >>
> >> +/* Event attribute array index */
> >> +#define NVDIMM_PMU_FORMAT_ATTR         0
> >> +#define NVDIMM_PMU_EVENT_ATTR          1
> >> +#define NVDIMM_PMU_CPUMASK_ATTR                2
> >> +#define NVDIMM_PMU_NULL_ATTR           3
> >> +
> >> +/**
> >> + * struct nvdimm_pmu - data structure for nvdimm perf driver
> >> + *
> >> + * @name: name of the nvdimm pmu device.
> >> + * @pmu: pmu data structure for nvdimm performance stats.
> >> + * @dev: nvdimm device pointer.
> >> + * @functions(event_init/add/del/read): platform specific pmu functions.
> >
> > This is not valid kernel-doc:
> >
> > include/linux/nd.h:67: warning: Function parameter or member
> > 'event_init' not described in 'nvdimm_pmu'
> > include/linux/nd.h:67: warning: Function parameter or member 'add' not
> > described in 'nvdimm_pmu'
> > include/linux/nd.h:67: warning: Function parameter or member 'del' not
> > described in 'nvdimm_pmu'
> > include/linux/nd.h:67: warning: Function parameter or member 'read'
> > not described in 'nvdimm_pmu'
> >
> > ...but I think rather than fixing those up 'struct nvdimm_pmu' should be pruned.
> >
> > It's not clear to me that it is worth the effort to describe these
> > details to the nvdimm core which is just going to turn around and call
> > the pmu core. I'd just as soon have the driver call the pmu core
> > directly, optionally passing in attributes and callbacks that come
> > from the nvdimm core and/or the nvdimm provider.
>
> The intend for adding these callbacks(event_init/add/del/read) is to give
> flexibility to the nvdimm core to add some common checks/routines if required
> in the future. Those checks can be common for all architecture with still having the
> ability to call arch/platform specific driver code to use its own routines.
>
> But as you said, currently we don't have any common checks and it directly
> calling platform specific code, so we can get rid of it.
> Should we remove this part for now?

Yes, lets go direct to the perf api for now and await the need for a
common core wrapper to present itself.

>
>
> >
> > Otherwise it's also not clear which of these structure members are
> > used at runtime vs purely used as temporary storage to pass parameters
> > to the pmu core.
> >
> >> + * @attr_groups: data structure for events, formats and cpumask
> >> + * @cpu: designated cpu for counter access.
> >> + * @node: node for cpu hotplug notifier link.
> >> + * @cpuhp_state: state for cpu hotplug notification.
> >> + * @arch_cpumask: cpumask to get designated cpu for counter access.
> >> + */
> >> +struct nvdimm_pmu {
> >> +       const char *name;
> >> +       struct pmu pmu;
> >> +       struct device *dev;
> >> +       int (*event_init)(struct perf_event *event);
> >> +       int  (*add)(struct perf_event *event, int flags);
> >> +       void (*del)(struct perf_event *event, int flags);
> >> +       void (*read)(struct perf_event *event);
> >> +       /*
> >> +        * Attribute groups for the nvdimm pmu. Index 0 used for
> >> +        * format attribute, index 1 used for event attribute,
> >> +        * index 2 used for cpusmask attribute and index 3 kept as NULL.
> >> +        */
> >> +       const struct attribute_group *attr_groups[4];
> >
> > Following from above, I'd rather this was organized as static
> > attributes with an is_visible() helper for the groups for any dynamic
> > aspects. That mirrors the behavior of nvdimm_create() and allows for
> > device drivers to compose the attribute groups from a core set and /
> > or a provider specific set.
>
> Since we don't have any common events right now, Can I use papr
> attributes directly or should we create dummy events for common thing and
> then merged it with papr event list.

Just use papr events directly.

^ permalink raw reply

* Re: [PATCH v2 07/29] ABI: sysfs-class-cxl: place "not in a guest" at description
From: Andrew Donnellan @ 2021-09-15  3:33 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Greg Kroah-Hartman, Linux Doc Mailing List
  Cc: Frederic Barrat, linuxppc-dev, linux-kernel, Jonathan Corbet
In-Reply-To: <88ce67c9eed1ae08af3d3992415032723184af9e.1631629496.git.mchehab+huawei@kernel.org>

On 15/9/21 12:32 am, Mauro Carvalho Chehab wrote:
> The What: field should have just the location of the ABI.
> Anything else should be inside the description.
> 
> This fixes its parsing by get_abi.pl script.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

Looks fine to me.

Acked-by: Andrew Donnellan <ajd@linux.ibm.com>

> ---
>   Documentation/ABI/testing/sysfs-class-cxl | 15 ++++++++++-----
>   1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl
> index 818f55970efb..3c77677e0ca7 100644
> --- a/Documentation/ABI/testing/sysfs-class-cxl
> +++ b/Documentation/ABI/testing/sysfs-class-cxl
> @@ -166,10 +166,11 @@ Description:    read only
>                   Decimal value of the Per Process MMIO space length.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:           /sys/class/cxl/<afu>m/pp_mmio_off (not in a guest)
> +What:           /sys/class/cxl/<afu>m/pp_mmio_off
>   Date:           September 2014
>   Contact:        linuxppc-dev@lists.ozlabs.org
>   Description:    read only
> +                (not in a guest)
>                   Decimal value of the Per Process MMIO space offset.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> @@ -190,28 +191,31 @@ Description:    read only
>                   Identifies the revision level of the PSL.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:           /sys/class/cxl/<card>/base_image (not in a guest)
> +What:           /sys/class/cxl/<card>/base_image
>   Date:           September 2014
>   Contact:        linuxppc-dev@lists.ozlabs.org
>   Description:    read only
> +                (not in a guest)
>                   Identifies the revision level of the base image for devices
>                   that support loadable PSLs. For FPGAs this field identifies
>                   the image contained in the on-adapter flash which is loaded
>                   during the initial program load.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:           /sys/class/cxl/<card>/image_loaded (not in a guest)
> +What:           /sys/class/cxl/<card>/image_loaded
>   Date:           September 2014
>   Contact:        linuxppc-dev@lists.ozlabs.org
>   Description:    read only
> +                (not in a guest)
>                   Will return "user" or "factory" depending on the image loaded
>                   onto the card.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:           /sys/class/cxl/<card>/load_image_on_perst (not in a guest)
> +What:           /sys/class/cxl/<card>/load_image_on_perst
>   Date:           December 2014
>   Contact:        linuxppc-dev@lists.ozlabs.org
>   Description:    read/write
> +                (not in a guest)
>                   Valid entries are "none", "user", and "factory".
>                   "none" means PERST will not cause image to be loaded to the
>                   card.  A power cycle is required to load the image.
> @@ -235,10 +239,11 @@ Description:    write only
>                   contexts on the card AFUs.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:		/sys/class/cxl/<card>/perst_reloads_same_image (not in a guest)
> +What:		/sys/class/cxl/<card>/perst_reloads_same_image
>   Date:		July 2015
>   Contact:	linuxppc-dev@lists.ozlabs.org
>   Description:	read/write
> +                (not in a guest)
>   		Trust that when an image is reloaded via PERST, it will not
>   		have changed.
>   
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply

* Re: [PATCH] swiotlb: set IO TLB segment size via cmdline
From: Stefano Stabellini @ 2021-09-15  1:51 UTC (permalink / raw)
  To: rm.skakun
  Cc: Roman Skakun, linux-doc, Peter Zijlstra, Viresh Kumar,
	linux-kernel, Paul Mackerras, Jan Beulich, Will Deacon,
	Boris Ostrovsky, Marek Szyprowski, Stefano Stabellini,
	Jonathan Corbet, hch, xen-devel, Paul E. McKenney,
	Konrad Rzeszutek Wilk, Muchun Song, Thomas Gleixner,
	Juergen Gross, Thomas Bogendoerfer, Andrii Anisov, linuxppc-dev,
	Randy Dunlap, linux-mips, iommu, Roman Skakun, Andrew Morton,
	Lu Baolu, Robin Murphy, Mike Rapoport, Maciej W. Rozycki
In-Reply-To: <20210914153046.GB815@lst.de>

On Tue, 14 Sep 2021, Christoph Hellwig wrote:
> On Tue, Sep 14, 2021 at 05:29:07PM +0200, Jan Beulich wrote:
> > I'm not convinced the swiotlb use describe there falls under "intended
> > use" - mapping a 1280x720 framebuffer in a single chunk? (As an aside,
> > the bottom of this page is also confusing, as following "Then we can
> > confirm the modified swiotlb size in the boot log:" there is a log
> > fragment showing the same original size of 64Mb.
> 
> It doesn't.  We also do not add hacks to the kernel for whacky out
> of tree modules.

Also, Option 1 listed in the webpage seems to be a lot better. Any
reason you can't do that? Because that option both solves the problem
and increases performance.

^ permalink raw reply

* Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
From: Michael Ellerman @ 2021-09-15  0:28 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	dri-devel, platform-driver-x86, Paul Mackerras, linux-s390,
	Andi Kleen, Joerg Roedel, x86, amd-gfx, Christoph Hellwig,
	linux-graphics-maintainer, Tom Lendacky, Tianyu Lan, kexec,
	linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <YUCOTIPPsJJpLO/d@zn.tnic>

Borislav Petkov <bp@alien8.de> writes:
> On Wed, Sep 08, 2021 at 05:58:35PM -0500, Tom Lendacky wrote:
>> Introduce a powerpc version of the cc_platform_has() function. This will
>> be used to replace the powerpc mem_encrypt_active() implementation, so
>> the implementation will initially only support the CC_ATTR_MEM_ENCRYPT
>> attribute.
>> 
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>>  arch/powerpc/platforms/pseries/Kconfig       |  1 +
>>  arch/powerpc/platforms/pseries/Makefile      |  2 ++
>>  arch/powerpc/platforms/pseries/cc_platform.c | 26 ++++++++++++++++++++
>>  3 files changed, 29 insertions(+)
>>  create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c
>
> Michael,
>
> can I get an ACK for the ppc bits to carry them through the tip tree
> pls?

Yeah.

I don't love it, a new C file and an out-of-line call to then call back
to a static inline that for most configuration will return false ... but
whatever :)

Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)


> Btw, on a related note, cross-compiling this throws the following error here:
>
> $ make CROSS_COMPILE=/home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux- V=1 ARCH=powerpc
>
> ...
>
> /home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc -Wp,-MD,arch/powerpc/boot/.crt0.o.d -D__ASSEMBLY__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -O2 -msoft-float -mno-altivec -mno-vsx -pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc -include ./include/linux/compiler_attributes.h -I./arch/powerpc/include -I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi -I./arch/powerpc/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -m32 -isystem /home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/../lib/gcc/powerpc64-linux/9.4.0/include -mbig-endian -nostdinc -c -o arch/powerpc/boot/crt0.o arch/powerpc/boot/crt0.S
> In file included from <command-line>:
> ././include/linux/compiler_attributes.h:62:5: warning: "__has_attribute" is not defined, evaluates to 0 [-Wundef]
>    62 | #if __has_attribute(__assume_aligned__)
>       |     ^~~~~~~~~~~~~~~
> ././include/linux/compiler_attributes.h:62:20: error: missing binary operator before token "("
>    62 | #if __has_attribute(__assume_aligned__)
>       |                    ^
> ././include/linux/compiler_attributes.h:88:5: warning: "__has_attribute" is not defined, evaluates to 0 [-Wundef]
>    88 | #if __has_attribute(__copy__)
>       |     ^~~~~~~~~~~~~~~
> ...
>
> Known issue?

Yeah, fixed in mainline today, thanks for trying to cross compile :)

cheers

^ permalink raw reply

* Re: [PATCH 1/1] powerpc: Drop superfluous pci_dev_is_added() calls
From: Michael Ellerman @ 2021-09-15  0:23 UTC (permalink / raw)
  To: Bjorn Helgaas, Niklas Schnelle
  Cc: linux-arch, linux-s390, linux-kernel, Oliver O'Halloran,
	linux-pci, Bjorn Helgaas, linuxppc-dev
In-Reply-To: <20210914193130.GA1447657@bjorn-Precision-5520>

Bjorn Helgaas <helgaas@kernel.org> writes:
> On Fri, Sep 10, 2021 at 04:19:40PM +0200, Niklas Schnelle wrote:
>> On powerpc, pci_dev_is_added() is called as part of SR-IOV fixups
>> that are done under pcibios_add_device() which in turn is only called in
>> pci_device_add() whih is called when a PCI device is scanned.
>> 
>> Now pci_dev_assign_added() is called in pci_bus_add_device() which is
>> only called after scanning the device. Thus pci_dev_is_added() is always
>> false and can be dropped.
>> 
>> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
>
> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
>
> This doesn't touch the PCI core, so maybe makes sense for you to take
> it, Michael?  But let me know if you think otherwise.

Yeah I'm happy to take it, thanks.

cheers

^ permalink raw reply

* [PATCH] swiotlb: set IO TLB segment size via cmdline
From: Roman Skakun @ 2021-09-14 15:10 UTC (permalink / raw)
  To: xen-devel, linux-mips, linuxppc-dev, linux-doc, linux-kernel,
	iommu
  Cc: Roman Skakun, Peter Zijlstra, Viresh Kumar, Paul Mackerras,
	Will Deacon, Boris Ostrovsky, Marek Szyprowski,
	Stefano Stabellini, Jonathan Corbet, Christoph Hellwig,
	Paul E. McKenney, Konrad Rzeszutek Wilk, Muchun Song,
	Thomas Gleixner, Juergen Gross, Thomas Bogendoerfer,
	Andrii Anisov, Randy Dunlap, Roman Skakun, Andrew Morton,
	Lu Baolu, Robin Murphy, Mike Rapoport, Maciej W. Rozycki

From: Roman Skakun <roman_skakun@epam.com>

It is possible when default IO TLB size is not
enough to fit a long buffers as described here [1].

This patch makes a way to set this parameter
using cmdline instead of recompiling a kernel.

[1] https://www.xilinx.com/support/answers/72694.html

Signed-off-by: Roman Skakun <roman_skakun@epam.com>
---
 .../admin-guide/kernel-parameters.txt         |  5 +-
 arch/mips/cavium-octeon/dma-octeon.c          |  2 +-
 arch/powerpc/platforms/pseries/svm.c          |  2 +-
 drivers/xen/swiotlb-xen.c                     |  7 +--
 include/linux/swiotlb.h                       |  1 +
 kernel/dma/swiotlb.c                          | 51 ++++++++++++++-----
 6 files changed, 48 insertions(+), 20 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91ba391f9b32..f842a523a485 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5558,8 +5558,9 @@
 			it if 0 is given (See Documentation/admin-guide/cgroup-v1/memory.rst)
 
 	swiotlb=	[ARM,IA-64,PPC,MIPS,X86]
-			Format: { <int> | force | noforce }
-			<int> -- Number of I/O TLB slabs
+			Format: { <slabs> [,<io_tlb_segment_size>] [,force | noforce]​ }
+			<slabs> -- Number of I/O TLB slabs
+			<io_tlb_segment_size> -- Max IO TLB segment size
 			force -- force using of bounce buffers even if they
 			         wouldn't be automatically used by the kernel
 			noforce -- Never use bounce buffers (for debugging)
diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
index df70308db0e6..446c73bc936e 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -237,7 +237,7 @@ void __init plat_swiotlb_setup(void)
 		swiotlbsize = 64 * (1<<20);
 #endif
 	swiotlb_nslabs = swiotlbsize >> IO_TLB_SHIFT;
-	swiotlb_nslabs = ALIGN(swiotlb_nslabs, IO_TLB_SEGSIZE);
+	swiotlb_nslabs = ALIGN(swiotlb_nslabs, swiotlb_io_seg_size());
 	swiotlbsize = swiotlb_nslabs << IO_TLB_SHIFT;
 
 	octeon_swiotlb = memblock_alloc_low(swiotlbsize, PAGE_SIZE);
diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 87f001b4c4e4..2a1f09c722ac 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -47,7 +47,7 @@ void __init svm_swiotlb_init(void)
 	unsigned long bytes, io_tlb_nslabs;
 
 	io_tlb_nslabs = (swiotlb_size_or_default() >> IO_TLB_SHIFT);
-	io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+	io_tlb_nslabs = ALIGN(io_tlb_nslabs, swiotlb_io_seg_size());
 
 	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 643fe440c46e..0fc9c6cb6815 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -110,12 +110,13 @@ static int xen_swiotlb_fixup(void *buf, unsigned long nslabs)
 	int dma_bits;
 	dma_addr_t dma_handle;
 	phys_addr_t p = virt_to_phys(buf);
+	unsigned long tlb_segment_size = swiotlb_io_seg_size();
 
-	dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
+	dma_bits = get_order(tlb_segment_size << IO_TLB_SHIFT) + PAGE_SHIFT;
 
 	i = 0;
 	do {
-		int slabs = min(nslabs - i, (unsigned long)IO_TLB_SEGSIZE);
+		int slabs = min(nslabs - i, (unsigned long)tlb_segment_size);
 
 		do {
 			rc = xen_create_contiguous_region(
@@ -153,7 +154,7 @@ static const char *xen_swiotlb_error(enum xen_swiotlb_err err)
 	return "";
 }
 
-#define DEFAULT_NSLABS		ALIGN(SZ_64M >> IO_TLB_SHIFT, IO_TLB_SEGSIZE)
+#define DEFAULT_NSLABS		ALIGN(SZ_64M >> IO_TLB_SHIFT, swiotlb_io_seg_size())
 
 int __ref xen_swiotlb_init(void)
 {
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index b0cb2a9973f4..35c3ffeda9fa 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -59,6 +59,7 @@ void swiotlb_sync_single_for_cpu(struct device *dev, phys_addr_t tlb_addr,
 		size_t size, enum dma_data_direction dir);
 dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys,
 		size_t size, enum dma_data_direction dir, unsigned long attrs);
+unsigned long swiotlb_io_seg_size(void);
 
 #ifdef CONFIG_SWIOTLB
 extern enum swiotlb_force swiotlb_force;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 87c40517e822..6b505206fc13 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -72,6 +72,11 @@ enum swiotlb_force swiotlb_force;
 
 struct io_tlb_mem io_tlb_default_mem;
 
+/*
+ * Maximum IO TLB segment size.
+ */
+static unsigned long io_tlb_seg_size = IO_TLB_SEGSIZE;
+
 /*
  * Max segment that we can provide which (if pages are contingous) will
  * not be bounced (unless SWIOTLB_FORCE is set).
@@ -81,15 +86,30 @@ static unsigned int max_segment;
 static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
 
 static int __init
-setup_io_tlb_npages(char *str)
+setup_io_tlb_params(char *str)
 {
+	unsigned long tmp;
+
 	if (isdigit(*str)) {
-		/* avoid tail segment of size < IO_TLB_SEGSIZE */
-		default_nslabs =
-			ALIGN(simple_strtoul(str, &str, 0), IO_TLB_SEGSIZE);
+		default_nslabs = simple_strtoul(str, &str, 0);
 	}
 	if (*str == ',')
 		++str;
+
+	/* get max IO TLB segment size */
+	if (isdigit(*str)) {
+		tmp = simple_strtoul(str, &str, 0);
+		if (tmp)
+			io_tlb_seg_size = ALIGN(tmp, IO_TLB_SEGSIZE);
+	}
+	if (*str == ',')
+		++str;
+
+	/* update io_tlb_nslabs after applying a new segment size and
+	 * avoid tail segment of size < IO TLB segment size
+	 */
+	default_nslabs = ALIGN(default_nslabs, io_tlb_seg_size);
+
 	if (!strcmp(str, "force"))
 		swiotlb_force = SWIOTLB_FORCE;
 	else if (!strcmp(str, "noforce"))
@@ -97,7 +117,7 @@ setup_io_tlb_npages(char *str)
 
 	return 0;
 }
-early_param("swiotlb", setup_io_tlb_npages);
+early_param("swiotlb", setup_io_tlb_params);
 
 unsigned int swiotlb_max_segment(void)
 {
@@ -118,6 +138,11 @@ unsigned long swiotlb_size_or_default(void)
 	return default_nslabs << IO_TLB_SHIFT;
 }
 
+unsigned long swiotlb_io_seg_size(void)
+{
+	return io_tlb_seg_size;
+}
+
 void __init swiotlb_adjust_size(unsigned long size)
 {
 	/*
@@ -128,7 +153,7 @@ void __init swiotlb_adjust_size(unsigned long size)
 	if (default_nslabs != IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT)
 		return;
 	size = ALIGN(size, IO_TLB_SIZE);
-	default_nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
+	default_nslabs = ALIGN(size >> IO_TLB_SHIFT, io_tlb_seg_size);
 	pr_info("SWIOTLB bounce buffer size adjusted to %luMB", size >> 20);
 }
 
@@ -147,7 +172,7 @@ void swiotlb_print_info(void)
 
 static inline unsigned long io_tlb_offset(unsigned long val)
 {
-	return val & (IO_TLB_SEGSIZE - 1);
+	return val & (io_tlb_seg_size - 1);
 }
 
 static inline unsigned long nr_slots(u64 val)
@@ -192,7 +217,7 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
 
 	spin_lock_init(&mem->lock);
 	for (i = 0; i < mem->nslabs; i++) {
-		mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
+		mem->slots[i].list = io_tlb_seg_size - io_tlb_offset(i);
 		mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
 		mem->slots[i].alloc_size = 0;
 	}
@@ -261,7 +286,7 @@ int
 swiotlb_late_init_with_default_size(size_t default_size)
 {
 	unsigned long nslabs =
-		ALIGN(default_size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
+		ALIGN(default_size >> IO_TLB_SHIFT, io_tlb_seg_size);
 	unsigned long bytes;
 	unsigned char *vstart = NULL;
 	unsigned int order;
@@ -522,7 +547,7 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
 			alloc_size - (offset + ((i - index) << IO_TLB_SHIFT));
 	}
 	for (i = index - 1;
-	     io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 &&
+	     io_tlb_offset(i) != io_tlb_seg_size - 1 &&
 	     mem->slots[i].list; i--)
 		mem->slots[i].list = ++count;
 
@@ -600,7 +625,7 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
 	 * with slots below and above the pool being returned.
 	 */
 	spin_lock_irqsave(&mem->lock, flags);
-	if (index + nslots < ALIGN(index + 1, IO_TLB_SEGSIZE))
+	if (index + nslots < ALIGN(index + 1, io_tlb_seg_size))
 		count = mem->slots[index + nslots].list;
 	else
 		count = 0;
@@ -620,7 +645,7 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
 	 * available (non zero)
 	 */
 	for (i = index - 1;
-	     io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 && mem->slots[i].list;
+	     io_tlb_offset(i) != io_tlb_seg_size - 1 && mem->slots[i].list;
 	     i--)
 		mem->slots[i].list = ++count;
 	mem->used -= nslots;
@@ -698,7 +723,7 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
 
 size_t swiotlb_max_mapping_size(struct device *dev)
 {
-	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE;
+	return ((size_t)IO_TLB_SIZE) * io_tlb_seg_size;
 }
 
 bool is_swiotlb_active(struct device *dev)
-- 
2.27.0


^ permalink raw reply related

* Re: [PATCH] pci: Rename pcibios_add_device to match
From: Bjorn Helgaas @ 2021-09-14 19:40 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: linux-s390, Michal Simek, linuxppc-dev, Thomas Gleixner,
	Vasily Gorbik, Niklas Schnelle, Heiko Carstens, x86, linux-pci,
	linux-kernel, sparclinux, Christian Borntraeger, Ingo Molnar,
	Paul Mackerras, Bjorn Helgaas, Borislav Petkov, Gerald Schaefer,
	H. Peter Anvin, David S. Miller
In-Reply-To: <20210913152709.48013-1-oohall@gmail.com>

On Tue, Sep 14, 2021 at 01:27:08AM +1000, Oliver O'Halloran wrote:
> The general convention for pcibios_* hooks is that they're named after
> the corresponding pci_* function they provide a hook for. The exception
> is pcibios_add_device() which provides a hook for pci_device_add(). This
> has been irritating me for years so rename it.
> 
> Also, remove the export of the microblaze version. The only caller
> must be compiled as a built-in so there's no reason for the export.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>

I fixed up the subject so it matches previous history and applied to
pci/enumeration for v5.16, thanks!

Stuff like this really annoys me, too.

> ---
>  arch/microblaze/pci/pci-common.c           | 3 +--
>  arch/powerpc/kernel/pci-common.c           | 2 +-
>  arch/powerpc/platforms/powernv/pci-sriov.c | 2 +-
>  arch/s390/pci/pci.c                        | 2 +-
>  arch/sparc/kernel/pci.c                    | 2 +-
>  arch/x86/pci/common.c                      | 2 +-
>  drivers/pci/pci.c                          | 4 ++--
>  drivers/pci/probe.c                        | 4 ++--
>  include/linux/pci.h                        | 2 +-
>  9 files changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
> index 557585f1be41..622a4867f9e9 100644
> --- a/arch/microblaze/pci/pci-common.c
> +++ b/arch/microblaze/pci/pci-common.c
> @@ -587,13 +587,12 @@ static void pcibios_fixup_resources(struct pci_dev *dev)
>  }
>  DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pcibios_fixup_resources);
>  
> -int pcibios_add_device(struct pci_dev *dev)
> +int pcibios_device_add(struct pci_dev *dev)
>  {
>  	dev->irq = of_irq_parse_and_map_pci(dev, 0, 0);
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(pcibios_add_device);
>  
>  /*
>   * Reparent resource children of pr that conflict with res
> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
> index c3573430919d..6749905932f4 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -1059,7 +1059,7 @@ void pcibios_bus_add_device(struct pci_dev *dev)
>  		ppc_md.pcibios_bus_add_device(dev);
>  }
>  
> -int pcibios_add_device(struct pci_dev *dev)
> +int pcibios_device_add(struct pci_dev *dev)
>  {
>  	struct irq_domain *d;
>  
> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
> index 28aac933a439..486c2937b159 100644
> --- a/arch/powerpc/platforms/powernv/pci-sriov.c
> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c
> @@ -54,7 +54,7 @@
>   * to "new_size", calculated above. Implementing this is a convoluted process
>   * which requires several hooks in the PCI core:
>   *
> - * 1. In pcibios_add_device() we call pnv_pci_ioda_fixup_iov().
> + * 1. In pcibios_device_add() we call pnv_pci_ioda_fixup_iov().
>   *
>   *    At this point the device has been probed and the device's BARs are sized,
>   *    but no resource allocations have been done. The SR-IOV BARs are sized
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index e7e6788d75a8..ded3321b7208 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -561,7 +561,7 @@ static void zpci_cleanup_bus_resources(struct zpci_dev *zdev)
>  	zdev->has_resources = 0;
>  }
>  
> -int pcibios_add_device(struct pci_dev *pdev)
> +int pcibios_device_add(struct pci_dev *pdev)
>  {
>  	struct zpci_dev *zdev = to_zpci(pdev);
>  	struct resource *res;
> diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
> index 9c2b720bfd20..31b0c1983286 100644
> --- a/arch/sparc/kernel/pci.c
> +++ b/arch/sparc/kernel/pci.c
> @@ -1010,7 +1010,7 @@ void pcibios_set_master(struct pci_dev *dev)
>  }
>  
>  #ifdef CONFIG_PCI_IOV
> -int pcibios_add_device(struct pci_dev *dev)
> +int pcibios_device_add(struct pci_dev *dev)
>  {
>  	struct pci_dev *pdev;
>  
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 3507f456fcd0..9e1e6b8d8876 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -632,7 +632,7 @@ static void set_dev_domain_options(struct pci_dev *pdev)
>  		pdev->hotplug_user_indicators = 1;
>  }
>  
> -int pcibios_add_device(struct pci_dev *dev)
> +int pcibios_device_add(struct pci_dev *dev)
>  {
>  	struct pci_setup_rom *rom;
>  	struct irq_domain *msidom;
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ce2ab62b64cf..c63598c1cdd8 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2091,14 +2091,14 @@ void pcim_pin_device(struct pci_dev *pdev)
>  EXPORT_SYMBOL(pcim_pin_device);
>  
>  /*
> - * pcibios_add_device - provide arch specific hooks when adding device dev
> + * pcibios_device_add - provide arch specific hooks when adding device dev
>   * @dev: the PCI device being added
>   *
>   * Permits the platform to provide architecture specific functionality when
>   * devices are added. This is the default implementation. Architecture
>   * implementations can override this.
>   */
> -int __weak pcibios_add_device(struct pci_dev *dev)
> +int __weak pcibios_device_add(struct pci_dev *dev)
>  {
>  	return 0;
>  }
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index d9fc02a71baa..2ba43b6adf31 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2450,7 +2450,7 @@ static struct irq_domain *pci_dev_msi_domain(struct pci_dev *dev)
>  	struct irq_domain *d;
>  
>  	/*
> -	 * If a domain has been set through the pcibios_add_device()
> +	 * If a domain has been set through the pcibios_device_add()
>  	 * callback, then this is the one (platform code knows best).
>  	 */
>  	d = dev_get_msi_domain(&dev->dev);
> @@ -2518,7 +2518,7 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
>  	list_add_tail(&dev->bus_list, &bus->devices);
>  	up_write(&pci_bus_sem);
>  
> -	ret = pcibios_add_device(dev);
> +	ret = pcibios_device_add(dev);
>  	WARN_ON(ret < 0);
>  
>  	/* Set up MSI IRQ domain */
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index cd8aa6fce204..7e0ce3a4d5a1 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -2126,7 +2126,7 @@ void pcibios_disable_device(struct pci_dev *dev);
>  void pcibios_set_master(struct pci_dev *dev);
>  int pcibios_set_pcie_reset_state(struct pci_dev *dev,
>  				 enum pcie_reset_state state);
> -int pcibios_add_device(struct pci_dev *dev);
> +int pcibios_device_add(struct pci_dev *dev);
>  void pcibios_release_device(struct pci_dev *dev);
>  #ifdef CONFIG_PCI
>  void pcibios_penalize_isa_irq(int irq, int active);
> -- 
> 2.31.1
> 

^ permalink raw reply

* Re: [PATCH 1/1] powerpc: Drop superfluous pci_dev_is_added() calls
From: Bjorn Helgaas @ 2021-09-14 19:31 UTC (permalink / raw)
  To: Niklas Schnelle
  Cc: linux-arch, linux-s390, linux-kernel, Oliver O'Halloran,
	linux-pci, Bjorn Helgaas, linuxppc-dev
In-Reply-To: <20210910141940.2598035-2-schnelle@linux.ibm.com>

On Fri, Sep 10, 2021 at 04:19:40PM +0200, Niklas Schnelle wrote:
> On powerpc, pci_dev_is_added() is called as part of SR-IOV fixups
> that are done under pcibios_add_device() which in turn is only called in
> pci_device_add() whih is called when a PCI device is scanned.
> 
> Now pci_dev_assign_added() is called in pci_bus_add_device() which is
> only called after scanning the device. Thus pci_dev_is_added() is always
> false and can be dropped.
> 
> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>

This doesn't touch the PCI core, so maybe makes sense for you to take
it, Michael?  But let me know if you think otherwise.

Thanks a lot for cleaning this up, Niklas.

> ---
>  arch/powerpc/platforms/powernv/pci-sriov.c | 6 ------
>  arch/powerpc/platforms/pseries/setup.c     | 3 +--
>  2 files changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
> index 28aac933a439..deddbb233fde 100644
> --- a/arch/powerpc/platforms/powernv/pci-sriov.c
> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c
> @@ -9,9 +9,6 @@
>  
>  #include "pci.h"
>  
> -/* for pci_dev_is_added() */
> -#include "../../../../drivers/pci/pci.h"
> -
>  /*
>   * The majority of the complexity in supporting SR-IOV on PowerNV comes from
>   * the need to put the MMIO space for each VF into a separate PE. Internally
> @@ -228,9 +225,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>  
>  void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
>  {
> -	if (WARN_ON(pci_dev_is_added(pdev)))
> -		return;
> -
>  	if (pdev->is_virtfn) {
>  		struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
>  
> diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
> index f79126f16258..2188054470c1 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -74,7 +74,6 @@
>  #include <asm/hvconsole.h>
>  
>  #include "pseries.h"
> -#include "../../../../drivers/pci/pci.h"
>  
>  DEFINE_STATIC_KEY_FALSE(shared_processor);
>  EXPORT_SYMBOL(shared_processor);
> @@ -750,7 +749,7 @@ static void pseries_pci_fixup_iov_resources(struct pci_dev *pdev)
>  	const int *indexes;
>  	struct device_node *dn = pci_device_to_OF_node(pdev);
>  
> -	if (!pdev->is_physfn || pci_dev_is_added(pdev))
> +	if (!pdev->is_physfn)
>  		return;
>  	/*Firmware must support open sriov otherwise dont configure*/
>  	indexes = of_get_property(dn, "ibm,open-sriov-vf-bar-info", NULL);
> -- 
> 2.25.1
> 

^ permalink raw reply

* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Borislav Petkov @ 2021-09-14 18:24 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tianyu Lan, Andy Lutomirski, Thomas Gleixner, kexec, linux-kernel,
	iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <367624d43d35d61d5c97a8b289d9ddae223636e9.1631141919.git.thomas.lendacky@amd.com>

On Wed, Sep 08, 2021 at 05:58:36PM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 18fe19916bc3..4b54a2377821 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -144,7 +144,7 @@ void __init sme_unmap_bootdata(char *real_mode_data)
>  	struct boot_params *boot_data;
>  	unsigned long cmdline_paddr;
>  
> -	if (!sme_active())
> +	if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
>  		return;
>  
>  	/* Get the command line address before unmapping the real_mode_data */
> @@ -164,7 +164,7 @@ void __init sme_map_bootdata(char *real_mode_data)
>  	struct boot_params *boot_data;
>  	unsigned long cmdline_paddr;
>  
> -	if (!sme_active())
> +	if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
>  		return;
>  
>  	__sme_early_map_unmap_mem(real_mode_data, sizeof(boot_params), true);
> @@ -377,11 +377,6 @@ bool sev_active(void)
>  {
>  	return sev_status & MSR_AMD64_SEV_ENABLED;
>  }
> -
> -bool sme_active(void)
> -{
> -	return sme_me_mask && !sev_active();
> -}
>  EXPORT_SYMBOL_GPL(sev_active);
>  
>  /* Needs to be called from non-instrumentable code */

You forgot this hunk:

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 5635ca9a1fbe..a3a2396362a5 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -364,8 +364,9 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size)
 /*
  * SME and SEV are very similar but they are not the same, so there are
  * times that the kernel will need to distinguish between SME and SEV. The
- * sme_active() and sev_active() functions are used for this.  When a
- * distinction isn't needed, the mem_encrypt_active() function can be used.
+ * PATTR_HOST_MEM_ENCRYPT and PATTR_GUEST_MEM_ENCRYPT flags to
+ * amd_prot_guest_has() are used for this. When a distinction isn't needed,
+ * the mem_encrypt_active() function can be used.
  *
  * The trampoline code is a good example for this requirement.  Before
  * paging is activated, SME will access all memory as decrypted, but SEV

because there's still a sme_active() mentioned there:

$ git grep sme_active
arch/x86/mm/mem_encrypt.c:367: * sme_active() and sev_active() functions are used for this.  When a

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related

* Re: [PATCH] powerpc: clean up UPD_CONSTR
From: Nathan Chancellor @ 2021-09-14 16:38 UTC (permalink / raw)
  To: Nick Desaulniers, Michael Ellerman
  Cc: Michal Hocko, David Hildenbrand, Peter Zijlstra, Boqun Feng,
	linuxppc-dev, Christopher M. Riedl, Paul Mackerras, kvm-ppc,
	Dan Williams, Will Deacon, linux-kernel, Daniel Axtens
In-Reply-To: <20210914161712.2463458-1-ndesaulniers@google.com>

On 9/14/2021 9:17 AM, Nick Desaulniers wrote:
> UPD_CONSTR was previously a preprocessor define for an old GCC 4.9 inline
> asm bug with m<> constraints.
> 
> Fixes: 6563139d90ad ("powerpc: remove GCC version check for UPD_CONSTR")
> Suggested-by: Nathan Chancellor <nathan@kernel.org>
> Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed-by: Nathan Chancellor <nathan@kernel.org>

> ---
>   arch/powerpc/include/asm/asm-const.h | 2 --
>   arch/powerpc/include/asm/atomic.h    | 8 ++++----
>   arch/powerpc/include/asm/io.h        | 4 ++--
>   arch/powerpc/include/asm/uaccess.h   | 6 +++---
>   arch/powerpc/kvm/powerpc.c           | 4 ++--
>   5 files changed, 11 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/asm-const.h b/arch/powerpc/include/asm/asm-const.h
> index dbfa5e1e3198..bfb3c3534877 100644
> --- a/arch/powerpc/include/asm/asm-const.h
> +++ b/arch/powerpc/include/asm/asm-const.h
> @@ -12,6 +12,4 @@
>   #  define ASM_CONST(x)		__ASM_CONST(x)
>   #endif
>   
> -#define UPD_CONSTR "<>"
> -
>   #endif /* _ASM_POWERPC_ASM_CONST_H */
> diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
> index 6a53ef178bfd..fd594fdbd84d 100644
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -27,14 +27,14 @@ static __inline__ int arch_atomic_read(const atomic_t *v)
>   {
>   	int t;
>   
> -	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
> +	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m<>"(v->counter));
>   
>   	return t;
>   }
>   
>   static __inline__ void arch_atomic_set(atomic_t *v, int i)
>   {
> -	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
> +	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m<>"(v->counter) : "r"(i));
>   }
>   
>   #define ATOMIC_OP(op, asm_op)						\
> @@ -320,14 +320,14 @@ static __inline__ s64 arch_atomic64_read(const atomic64_t *v)
>   {
>   	s64 t;
>   
> -	__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
> +	__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m<>"(v->counter));
>   
>   	return t;
>   }
>   
>   static __inline__ void arch_atomic64_set(atomic64_t *v, s64 i)
>   {
> -	__asm__ __volatile__("std%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
> +	__asm__ __volatile__("std%U0%X0 %1,%0" : "=m<>"(v->counter) : "r"(i));
>   }
>   
>   #define ATOMIC64_OP(op, asm_op)						\
> diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
> index f130783c8301..beba4979bff9 100644
> --- a/arch/powerpc/include/asm/io.h
> +++ b/arch/powerpc/include/asm/io.h
> @@ -122,7 +122,7 @@ static inline u##size name(const volatile u##size __iomem *addr)	\
>   {									\
>   	u##size ret;							\
>   	__asm__ __volatile__("sync;"#insn"%U1%X1 %0,%1;twi 0,%0,0;isync"\
> -		: "=r" (ret) : "m"UPD_CONSTR (*addr) : "memory");	\
> +		: "=r" (ret) : "m<>" (*addr) : "memory");	\
>   	return ret;							\
>   }
>   
> @@ -130,7 +130,7 @@ static inline u##size name(const volatile u##size __iomem *addr)	\
>   static inline void name(volatile u##size __iomem *addr, u##size val)	\
>   {									\
>   	__asm__ __volatile__("sync;"#insn"%U0%X0 %1,%0"			\
> -		: "=m"UPD_CONSTR (*addr) : "r" (val) : "memory");	\
> +		: "=m<>" (*addr) : "r" (val) : "memory");	\
>   	mmiowb_set_pending();						\
>   }
>   
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 22c79ab40006..63316100080c 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -86,7 +86,7 @@ __pu_failed:							\
>   		"1:	" op "%U1%X1 %0,%1	# put_user\n"	\
>   		EX_TABLE(1b, %l2)				\
>   		:						\
> -		: "r" (x), "m"UPD_CONSTR (*addr)		\
> +		: "r" (x), "m<>" (*addr)		\
>   		:						\
>   		: label)
>   
> @@ -143,7 +143,7 @@ do {								\
>   		"1:	"op"%U1%X1 %0, %1	# get_user\n"	\
>   		EX_TABLE(1b, %l2)				\
>   		: "=r" (x)					\
> -		: "m"UPD_CONSTR (*addr)				\
> +		: "m<>" (*addr)				\
>   		:						\
>   		: label)
>   
> @@ -200,7 +200,7 @@ __gus_failed:								\
>   		".previous\n"				\
>   		EX_TABLE(1b, 3b)			\
>   		: "=r" (err), "=r" (x)			\
> -		: "m"UPD_CONSTR (*addr), "i" (-EFAULT), "0" (err))
> +		: "m<>" (*addr), "i" (-EFAULT), "0" (err))
>   
>   #ifdef __powerpc64__
>   #define __get_user_asm2(x, addr, err)			\
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index b4e6f70b97b9..3fd037d36afb 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -1094,7 +1094,7 @@ static inline u64 sp_to_dp(u32 fprs)
>   
>   	preempt_disable();
>   	enable_kernel_fp();
> -	asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : "=m"UPD_CONSTR (fprd) : "m"UPD_CONSTR (fprs)
> +	asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : "=m<>" (fprd) : "m<>" (fprs)
>   	     : "fr0");
>   	preempt_enable();
>   	return fprd;
> @@ -1106,7 +1106,7 @@ static inline u32 dp_to_sp(u64 fprd)
>   
>   	preempt_disable();
>   	enable_kernel_fp();
> -	asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : "=m"UPD_CONSTR (fprs) : "m"UPD_CONSTR (fprd)
> +	asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : "=m<>" (fprs) : "m<>" (fprd)
>   	     : "fr0");
>   	preempt_enable();
>   	return fprs;
> 


^ permalink raw reply

* [PATCH] powerpc: clean up UPD_CONSTR
From: Nick Desaulniers @ 2021-09-14 16:17 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Michal Hocko, David Hildenbrand, Peter Zijlstra, Boqun Feng,
	Nick Desaulniers, linuxppc-dev, Christopher M. Riedl,
	Nathan Chancellor, Paul Mackerras, kvm-ppc, Dan Williams,
	Will Deacon, linux-kernel, Daniel Axtens

UPD_CONSTR was previously a preprocessor define for an old GCC 4.9 inline
asm bug with m<> constraints.

Fixes: 6563139d90ad ("powerpc: remove GCC version check for UPD_CONSTR")
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 arch/powerpc/include/asm/asm-const.h | 2 --
 arch/powerpc/include/asm/atomic.h    | 8 ++++----
 arch/powerpc/include/asm/io.h        | 4 ++--
 arch/powerpc/include/asm/uaccess.h   | 6 +++---
 arch/powerpc/kvm/powerpc.c           | 4 ++--
 5 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-const.h b/arch/powerpc/include/asm/asm-const.h
index dbfa5e1e3198..bfb3c3534877 100644
--- a/arch/powerpc/include/asm/asm-const.h
+++ b/arch/powerpc/include/asm/asm-const.h
@@ -12,6 +12,4 @@
 #  define ASM_CONST(x)		__ASM_CONST(x)
 #endif
 
-#define UPD_CONSTR "<>"
-
 #endif /* _ASM_POWERPC_ASM_CONST_H */
diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 6a53ef178bfd..fd594fdbd84d 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -27,14 +27,14 @@ static __inline__ int arch_atomic_read(const atomic_t *v)
 {
 	int t;
 
-	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
+	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m<>"(v->counter));
 
 	return t;
 }
 
 static __inline__ void arch_atomic_set(atomic_t *v, int i)
 {
-	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
+	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m<>"(v->counter) : "r"(i));
 }
 
 #define ATOMIC_OP(op, asm_op)						\
@@ -320,14 +320,14 @@ static __inline__ s64 arch_atomic64_read(const atomic64_t *v)
 {
 	s64 t;
 
-	__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
+	__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m<>"(v->counter));
 
 	return t;
 }
 
 static __inline__ void arch_atomic64_set(atomic64_t *v, s64 i)
 {
-	__asm__ __volatile__("std%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
+	__asm__ __volatile__("std%U0%X0 %1,%0" : "=m<>"(v->counter) : "r"(i));
 }
 
 #define ATOMIC64_OP(op, asm_op)						\
diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index f130783c8301..beba4979bff9 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -122,7 +122,7 @@ static inline u##size name(const volatile u##size __iomem *addr)	\
 {									\
 	u##size ret;							\
 	__asm__ __volatile__("sync;"#insn"%U1%X1 %0,%1;twi 0,%0,0;isync"\
-		: "=r" (ret) : "m"UPD_CONSTR (*addr) : "memory");	\
+		: "=r" (ret) : "m<>" (*addr) : "memory");	\
 	return ret;							\
 }
 
@@ -130,7 +130,7 @@ static inline u##size name(const volatile u##size __iomem *addr)	\
 static inline void name(volatile u##size __iomem *addr, u##size val)	\
 {									\
 	__asm__ __volatile__("sync;"#insn"%U0%X0 %1,%0"			\
-		: "=m"UPD_CONSTR (*addr) : "r" (val) : "memory");	\
+		: "=m<>" (*addr) : "r" (val) : "memory");	\
 	mmiowb_set_pending();						\
 }
 
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 22c79ab40006..63316100080c 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -86,7 +86,7 @@ __pu_failed:							\
 		"1:	" op "%U1%X1 %0,%1	# put_user\n"	\
 		EX_TABLE(1b, %l2)				\
 		:						\
-		: "r" (x), "m"UPD_CONSTR (*addr)		\
+		: "r" (x), "m<>" (*addr)		\
 		:						\
 		: label)
 
@@ -143,7 +143,7 @@ do {								\
 		"1:	"op"%U1%X1 %0, %1	# get_user\n"	\
 		EX_TABLE(1b, %l2)				\
 		: "=r" (x)					\
-		: "m"UPD_CONSTR (*addr)				\
+		: "m<>" (*addr)				\
 		:						\
 		: label)
 
@@ -200,7 +200,7 @@ __gus_failed:								\
 		".previous\n"				\
 		EX_TABLE(1b, 3b)			\
 		: "=r" (err), "=r" (x)			\
-		: "m"UPD_CONSTR (*addr), "i" (-EFAULT), "0" (err))
+		: "m<>" (*addr), "i" (-EFAULT), "0" (err))
 
 #ifdef __powerpc64__
 #define __get_user_asm2(x, addr, err)			\
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index b4e6f70b97b9..3fd037d36afb 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1094,7 +1094,7 @@ static inline u64 sp_to_dp(u32 fprs)
 
 	preempt_disable();
 	enable_kernel_fp();
-	asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : "=m"UPD_CONSTR (fprd) : "m"UPD_CONSTR (fprs)
+	asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : "=m<>" (fprd) : "m<>" (fprs)
 	     : "fr0");
 	preempt_enable();
 	return fprd;
@@ -1106,7 +1106,7 @@ static inline u32 dp_to_sp(u64 fprd)
 
 	preempt_disable();
 	enable_kernel_fp();
-	asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : "=m"UPD_CONSTR (fprs) : "m"UPD_CONSTR (fprd)
+	asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : "=m<>" (fprs) : "m<>" (fprd)
 	     : "fr0");
 	preempt_enable();
 	return fprs;
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related

* Re: [RFC PATCH 5/8] sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y
From: Ard Biesheuvel @ 2021-09-14 16:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Paul Mackerras, linux-riscv, Will Deacon,
	linux-s390, Arnd Bergmann, Russell King, Christian Borntraeger,
	Ingo Molnar, Catalin Marinas, Albert Ou, Kees Cook, Vasily Gorbik,
	Heiko Carstens, Keith Packard, Borislav Petkov, Andy Lutomirski,
	Paul Walmsley, Thomas Gleixner, Linux ARM,
	Linux Kernel Mailing List, Palmer Dabbelt, linuxppc-dev
In-Reply-To: <CAHk-=wiaVLChOjJ=7fdoQXKE4JHb98MjDtg8pPkA8EYfd5aj3g@mail.gmail.com>

On Tue, 14 Sept 2021 at 17:59, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Tue, Sep 14, 2021 at 8:53 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > task_cpu() takes a 'const struct task_struct *', whereas
> > task_thread_info() takes a 'struct task_struct *'.
>
> Oh, annoying, but that's easily fixed. Just make that
>
>    static inline struct thread_info *task_thread_info(struct
> task_struct *task) ..
>
> be a simple
>
>   #define task_thread_info(tsk) (&(tsk)->thread_info)
>
> instead. That actually then matches the !THREAD_INFO_IN_TASK case anyway.
>
> Make the commit comment be about how that fixes the type problem.
>
> Because while in many cases inline functions are superior to macros,
> it clearly isn't the case in this case.
>

Works for me.

^ permalink raw reply

* Re: [PATCH 2/2] kvm: rename KVM_MAX_VCPU_ID to, KVM_MAX_VCPU_IDS
From: Christian Zigotzky @ 2021-09-14 15:59 UTC (permalink / raw)
  To: Juergen Gross
  Cc: x86, Wanpeng Li, mad skateman, kvm, linux-doc, linux-kernel,
	Darren Stevens, Paul Mackerras, linux-kselftest, H. Peter Anvin,
	Shuah Khan, Jonathan Corbet, Joerg Roedel, Huacai Chen, ,
	Aleksandar Markovic, , Ingo Molnar, R.T.Dickinson, kvm-ppc,
	Borislav Petkov, Shuah Khan, Thomas Gleixner, Jim Mattson,
	Thomas Bogendoerfer, , linux-mips, Paolo Bonzini,
	Vitaly, Kuznetsov, linuxppc-dev

Hello Juergen,
Hello All,

Since the RC1 of kernel 5.13, -smp 2 and -smp 4 don't work with a 
virtual e5500 QEMU KVM-HV machine anymore. [1]
I see in the serial console, that the uImage doesn't load. I use the 
following QEMU command for booting:

qemu-system-ppc64 -M ppce500 -cpu e5500 -enable-kvm -m 1024 -kernel 
uImage -drive format=raw,file=MintPPC32-X5000.img,index=0,if=virtio 
-netdev user,id=mynet0 -device virtio-net,netdev=mynet0 -append "rw 
root=/dev/vda" -device virtio-vga -device virtio-mouse-pci -device 
virtio-keyboard-pci -device pci-ohci,id=newusb -device 
usb-audio,bus=newusb.0 -smp 4

The kernels boot without KVM-HV.

Summary for KVM-HV:

-smp 1 -> works
-smp 2 -> doesn't work
-smp 3 -> works
-smp 4 -> doesn't work

I used -smp 4 before the RC1 of kernel 5.13 because my FSL P5040 BookE 
machine [2] has 4 cores.

Does this patch solve this issue? [3]

Thanks,
Christian

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-May/229103.html
[2] http://wiki.amiga.org/index.php?title=X5000
[3] 
https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-September/234152.html

^ permalink raw reply

* Re: [RFC PATCH 5/8] sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y
From: Linus Torvalds @ 2021-09-14 15:58 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Peter Zijlstra, Paul Mackerras, linux-riscv, Will Deacon,
	linux-s390, Arnd Bergmann, Russell King, Christian Borntraeger,
	Ingo Molnar, Catalin Marinas, Albert Ou, Kees Cook, Vasily Gorbik,
	Heiko Carstens, Keith Packard, Borislav Petkov, Andy Lutomirski,
	Paul Walmsley, Thomas Gleixner, Linux ARM,
	Linux Kernel Mailing List, Palmer Dabbelt, linuxppc-dev
In-Reply-To: <CAMj1kXH_Q4a4Gsi0Xuw=YsV-b7Mu8TQndk3Ei-JFaRV=GSiqUQ@mail.gmail.com>

On Tue, Sep 14, 2021 at 8:53 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> task_cpu() takes a 'const struct task_struct *', whereas
> task_thread_info() takes a 'struct task_struct *'.

Oh, annoying, but that's easily fixed. Just make that

   static inline struct thread_info *task_thread_info(struct
task_struct *task) ..

be a simple

  #define task_thread_info(tsk) (&(tsk)->thread_info)

instead. That actually then matches the !THREAD_INFO_IN_TASK case anyway.

Make the commit comment be about how that fixes the type problem.

Because while in many cases inline functions are superior to macros,
it clearly isn't the case in this case.

              Linus

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox