LinuxPPC-Dev Archive on lore.kernel.org


* [PATCH 3/3] powerpc: Update NUMA Kconfig description & help text
From: Michael Ellerman @ 2020-11-24 12:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: rdunlap, srikar
In-Reply-To: <20201124120547.1940635-1-mpe@ellerman.id.au>

Update the NUMA Kconfig description to match other architectures, and
add some help text. Shamelessly borrowed from x86/arm64.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/Kconfig | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4d688b426353..7f4995b245a3 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -659,9 +659,15 @@ config IRQ_ALL_CPUS
 	  reported with SMP Power Macintoshes with this option enabled.
 
 config NUMA
-	bool "NUMA support"
+	bool "NUMA Memory Allocation and Scheduler Support"
 	depends on PPC64 && SMP
 	default y if PPC_PSERIES || PPC_POWERNV
+	help
+	  Enable NUMA (Non-Uniform Memory Access) support.
+
+	  The kernel will try to allocate memory used by a CPU on the
+	  local memory controller of the CPU and add some more
+	  NUMA awareness to the kernel.
 
 config NODES_SHIFT
 	int
-- 
2.25.1



* [PATCH 2/3] powerpc: Make NUMA default y for powernv
From: Michael Ellerman @ 2020-11-24 12:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: rdunlap, srikar
In-Reply-To: <20201124120547.1940635-1-mpe@ellerman.id.au>

Our NUMA option is default y for pseries, but not powernv. The bulk of
powernv systems are NUMA, so make NUMA default y for powernv also.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index a22db3db6b96..4d688b426353 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -661,7 +661,7 @@ config IRQ_ALL_CPUS
 config NUMA
 	bool "NUMA support"
 	depends on PPC64 && SMP
-	default y if SMP && PPC_PSERIES
+	default y if PPC_PSERIES || PPC_POWERNV
 
 config NODES_SHIFT
 	int
-- 
2.25.1



* Re: [PATCH V2 4/5] ocxl: Add mmu notifier
From: Jason Gunthorpe @ 2020-11-24 13:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linuxppc-dev, Christophe Lombard, fbarrat, ajd
In-Reply-To: <20201124091738.GA26078@infradead.org>

On Tue, Nov 24, 2020 at 09:17:38AM +0000, Christoph Hellwig wrote:

> > @@ -470,6 +487,26 @@ void ocxl_link_release(struct pci_dev *dev, void *link_handle)
> >  }
> >  EXPORT_SYMBOL_GPL(ocxl_link_release);
> >  
> > +static void invalidate_range(struct mmu_notifier *mn,
> > +			     struct mm_struct *mm,
> > +			     unsigned long start, unsigned long end)
> > +{
> > +	struct pe_data *pe_data = container_of(mn, struct pe_data, mmu_notifier);
> > +	struct ocxl_link *link = pe_data->link;
> > +	unsigned long addr, pid, page_size = PAGE_SIZE;

The page_size variable seems unnecessary

> > +
> > +	pid = mm->context.id;
> > +
> > +	spin_lock(&link->atsd_lock);
> > +	for (addr = start; addr < end; addr += page_size)
> > +		pnv_ocxl_tlb_invalidate(&link->arva, pid, addr);
> > +	spin_unlock(&link->atsd_lock);
> > +}
> > +
> > +static const struct mmu_notifier_ops ocxl_mmu_notifier_ops = {
> > +	.invalidate_range = invalidate_range,
> > +};
> > +
> >  static u64 calculate_cfg_state(bool kernel)
> >  {
> >  	u64 state;
> > @@ -526,6 +563,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
> >  	pe_data->mm = mm;
> >  	pe_data->xsl_err_cb = xsl_err_cb;
> >  	pe_data->xsl_err_data = xsl_err_data;
> > +	pe_data->link = link;
> > +	pe_data->mmu_notifier.ops = &ocxl_mmu_notifier_ops;
> >  
> >  	memset(pe, 0, sizeof(struct ocxl_process_element));
> >  	pe->config_state = cpu_to_be64(calculate_cfg_state(pidr == 0));
> > @@ -542,8 +581,16 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
> >  	 * by the nest MMU. If we have a kernel context, TLBIs are
> >  	 * already global.
> >  	 */
> > -	if (mm)
> > +	if (mm) {
> >  		mm_context_add_copro(mm);
> > +		if (link->arva) {
> > +			/* Use MMIO registers for the TLB Invalidate
> > +			 * operations.
> > +			 */
> > +			mmu_notifier_register(&pe_data->mmu_notifier, mm);

Every other place doing stuff like this is de-duplicating the
notifier. If you have multiple clients this will do multiple redundant
invalidations?

The notifier get/put API is designed to solve that problem, you'd get
a single notifier for the mm and then add the impacted arva's to some
list at the notifier.

Jason
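The de-duplication idea raised in the review above can be sketched in plain C. This is a userspace model only, not the kernel's mmu_notifier_get()/mmu_notifier_put() implementation: the struct layout, the integer mm and link identifiers, and the helper names notifier_get()/notifier_put() are all hypothetical. It shows one shared notifier per address space, with each link added to a list on that notifier, so an invalidation is delivered once per mm instead of once per client.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NOTIFIERS	8
#define MAX_LINKS	4

/* Userspace model: one notifier per address space, shared by all links. */
struct mm_notifier {
	unsigned long mm_id;	/* stand-in for the mm being watched */
	int refs;		/* number of users holding a reference */
	int links[MAX_LINKS];	/* stand-ins for the per-link "arva" */
	int nr_links;
};

static struct mm_notifier table[MAX_NOTIFIERS];

/* Return the notifier for mm_id, creating it on first use (hypothetical). */
struct mm_notifier *notifier_get(unsigned long mm_id, int link)
{
	struct mm_notifier *free_slot = NULL;
	struct mm_notifier *n = NULL;
	int i;

	for (i = 0; i < MAX_NOTIFIERS; i++) {
		if (table[i].refs && table[i].mm_id == mm_id) {
			n = &table[i];	/* already registered: reuse it */
			break;
		}
		if (!table[i].refs && !free_slot)
			free_slot = &table[i];
	}
	if (!n) {
		if (!free_slot)
			return NULL;
		n = free_slot;
		n->mm_id = mm_id;
		n->nr_links = 0;
	}
	if (n->nr_links == MAX_LINKS)
		return NULL;
	n->links[n->nr_links++] = link;
	n->refs++;
	return n;
}

/* Drop one reference and remove the link from the notifier's list. */
void notifier_put(struct mm_notifier *n, int link)
{
	int i;

	assert(n && n->refs > 0);
	for (i = 0; i < n->nr_links; i++) {
		if (n->links[i] == link) {
			n->links[i] = n->links[--n->nr_links];
			break;
		}
	}
	n->refs--;
}
```

With this shape, two clients attached to the same mm share a single notifier whose refs count is 2, and the invalidate callback would walk the links list rather than being registered twice.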


* [PATCH] tpm: ibmvtpm: fix error return code in tpm_ibmvtpm_probe()
From: Wang Hai @ 2020-11-24 13:52 UTC (permalink / raw)
  To: mpe, benh, paulus, peterhuewe, jarkko, jgg, stefanb, nayna
  Cc: linux-integrity, linuxppc-dev, linux-kernel

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: d8d74ea3c002 ("tpm: ibmvtpm: Wait for buffer to be set before proceeding")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
---
 drivers/char/tpm/tpm_ibmvtpm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
index 994385bf37c0..813eb2cac0ce 100644
--- a/drivers/char/tpm/tpm_ibmvtpm.c
+++ b/drivers/char/tpm/tpm_ibmvtpm.c
@@ -687,6 +687,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
 				ibmvtpm->rtce_buf != NULL,
 				HZ)) {
 		dev_err(dev, "CRQ response timed out\n");
+		rc = -ETIMEDOUT;
 		goto init_irq_cleanup;
 	}
 
-- 
2.17.1



* [PATCH] ASoC: fsl_xcvr: fix potential resource leak
From: Viorel Suman (OSS) @ 2020-11-24 14:19 UTC (permalink / raw)
  To: Timur Tabi, Nicolin Chen, Xiubo Li, Fabio Estevam, Shengjiu Wang,
	Liam Girdwood, Mark Brown, Jaroslav Kysela, Takashi Iwai,
	alsa-devel, linuxppc-dev, linux-kernel
  Cc: Viorel Suman

From: Viorel Suman <viorel.suman@nxp.com>

The "fw" variable must be released before returning.

Signed-off-by: Viorel Suman <viorel.suman@nxp.com>
---
 sound/soc/fsl/fsl_xcvr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/fsl_xcvr.c b/sound/soc/fsl/fsl_xcvr.c
index 2a28810d0e29..3d58c88ea603 100644
--- a/sound/soc/fsl/fsl_xcvr.c
+++ b/sound/soc/fsl/fsl_xcvr.c
@@ -706,6 +706,7 @@ static int fsl_xcvr_load_firmware(struct fsl_xcvr *xcvr)
 	/* RAM is 20KiB = 16KiB code + 4KiB data => max 10 pages 2KiB each */
 	if (rem > 16384) {
 		dev_err(dev, "FW size %d is bigger than 16KiB.\n", rem);
+		release_firmware(fw);
 		return -ENOMEM;
 	}
 
-- 
2.26.2
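A minimal userspace sketch of the leak pattern addressed here, assuming a buffer that is acquired before a size check. The helpers acquire_fw() and release_fw() and the live_buffers counter are hypothetical stand-ins for request_firmware()/release_firmware(); only the 16 KiB limit is taken from the patch context.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MAX_FW_SIZE 16384	/* 16 KiB code limit from the patch context */

int live_buffers;		/* buffers acquired but not yet released */

static void *acquire_fw(size_t size)
{
	void *p = malloc(size ? size : 1);

	if (p)
		live_buffers++;
	return p;
}

static void release_fw(void *p)
{
	assert(p != NULL && live_buffers > 0);
	free(p);
	live_buffers--;
}

int load_firmware_sketch(size_t size)
{
	void *fw = acquire_fw(size);

	if (!fw)
		return -ENOMEM;

	if (size > MAX_FW_SIZE) {
		release_fw(fw);	/* the fix: release before the early return */
		return -ENOMEM;
	}

	/* ... the image would be copied to the device here ... */

	release_fw(fw);
	return 0;
}
```

After both the success path and the oversize path, live_buffers is back to zero; dropping the release_fw() call in the oversize branch would leave it at one.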



* Re: [PATCH] tpm: ibmvtpm: fix error return code in tpm_ibmvtpm_probe()
From: Stefan Berger @ 2020-11-24 14:28 UTC (permalink / raw)
  To: Wang Hai, mpe, benh, paulus, peterhuewe, jarkko, jgg, nayna
  Cc: linux-integrity, linuxppc-dev, linux-kernel
In-Reply-To: <20201124135244.31932-1-wanghai38@huawei.com>

On 11/24/20 8:52 AM, Wang Hai wrote:
> Fix to return a negative error code from the error handling
> case instead of 0, as done elsewhere in this function.
>
> Fixes: d8d74ea3c002 ("tpm: ibmvtpm: Wait for buffer to be set before proceeding")
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: Wang Hai <wanghai38@huawei.com>
> ---
>   drivers/char/tpm/tpm_ibmvtpm.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
> index 994385bf37c0..813eb2cac0ce 100644
> --- a/drivers/char/tpm/tpm_ibmvtpm.c
> +++ b/drivers/char/tpm/tpm_ibmvtpm.c
> @@ -687,6 +687,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
>   				ibmvtpm->rtce_buf != NULL,
>   				HZ)) {
>   		dev_err(dev, "CRQ response timed out\n");
> +		rc = -ETIMEDOUT;
>   		goto init_irq_cleanup;
>   	}
>   

Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>



* eBPF on powerpc
From: Christophe Leroy @ 2020-11-24 15:00 UTC (permalink / raw)
  To: Naveen N. Rao, linuxppc-dev@lists.ozlabs.org

Hi Naveen,

A few years ago, you implemented eBPF on PPC64.

Is there any reason for implementing it for PPC64 only? Is there something that makes it impossible
to have eBPF for PPC32 as well?

Thanks
Christophe


* [PATCH v1 3/6] powerpc/8xx: Simplify INVALIDATE_ADJACENT_PAGES_CPU15
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>

We now have r11 available as a scratch register so
INVALIDATE_ADJACENT_PAGES_CPU15() can be simplified.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/head_8xx.S | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 775b4f4d011e..558c8e615ef9 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -180,14 +180,13 @@ SystemCall:
  */
 
 #ifdef CONFIG_8xx_CPU15
-#define INVALIDATE_ADJACENT_PAGES_CPU15(addr)	\
-	addi	addr, addr, PAGE_SIZE;	\
-	tlbie	addr;			\
-	addi	addr, addr, -(PAGE_SIZE << 1);	\
-	tlbie	addr;			\
-	addi	addr, addr, PAGE_SIZE
+#define INVALIDATE_ADJACENT_PAGES_CPU15(addr, tmp)	\
+	addi	tmp, addr, PAGE_SIZE;	\
+	tlbie	tmp;			\
+	addi	tmp, addr, -PAGE_SIZE;	\
+	tlbie	tmp
 #else
-#define INVALIDATE_ADJACENT_PAGES_CPU15(addr)
+#define INVALIDATE_ADJACENT_PAGES_CPU15(addr, tmp)
 #endif
 
 InstructionTLBMiss:
@@ -198,7 +197,7 @@ InstructionTLBMiss:
 	 * kernel page tables.
 	 */
 	mfspr	r10, SPRN_SRR0	/* Get effective address of fault */
-	INVALIDATE_ADJACENT_PAGES_CPU15(r10)
+	INVALIDATE_ADJACENT_PAGES_CPU15(r10, r11)
 	mtspr	SPRN_MD_EPN, r10
 #ifdef CONFIG_MODULES
 	mfcr	r11
-- 
2.25.0
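The equivalence of the old and new macro bodies can be checked with a small C model. This is only a sketch: tlbie() here just records the address it is given, and the 4 KiB PAGE_SIZE is an assumption for illustration. The old sequence modifies addr in place and has to restore it with a third addi; the new one computes both neighbours into a scratch register and never touches addr.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

struct tlb_log {
	uint32_t inv[2];	/* addresses passed to tlbie, in order */
	int n;
};

/* Model of tlbie: record the effective address being invalidated. */
static void tlbie(struct tlb_log *log, uint32_t ea)
{
	assert(log->n < 2);
	log->inv[log->n++] = ea;
}

/* Old macro: three addi plus two tlbie, all operating on addr itself. */
uint32_t invalidate_adjacent_old(struct tlb_log *log, uint32_t addr)
{
	addr += PAGE_SIZE;
	tlbie(log, addr);
	addr -= PAGE_SIZE << 1;
	tlbie(log, addr);
	addr += PAGE_SIZE;	/* restore the original value */
	return addr;
}

/* New macro: two addi into a scratch register plus two tlbie. */
uint32_t invalidate_adjacent_new(struct tlb_log *log, uint32_t addr)
{
	uint32_t tmp;

	tmp = addr + PAGE_SIZE;
	tlbie(log, tmp);
	tmp = addr - PAGE_SIZE;
	tlbie(log, tmp);
	return addr;		/* addr was never modified */
}
```

Both variants invalidate the page after and the page before the faulting address, in that order, and leave the address register unchanged; the new one needs one instruction fewer.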



* [PATCH v1 1/6] powerpc/8xx: DEBUG_PAGEALLOC doesn't require an ITLB miss exception handler
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

Since commit e611939fc8ec ("powerpc/mm: Ensure change_page_attr()
doesn't invalidate pinned TLBs"), pinned TLBs are no longer
invalidated by __kernel_map_pages() when CONFIG_DEBUG_PAGEALLOC is
selected.

Remove the dependency on CONFIG_DEBUG_PAGEALLOC.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/head_8xx.S | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index ee0bfebc375f..66ee62f30d36 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -47,8 +47,7 @@
  * - Either we have modules
  * - Or we have not pinned the first 8M
  */
-#if defined(CONFIG_MODULES) || !defined(CONFIG_PIN_TLB_TEXT) || \
-    defined(CONFIG_DEBUG_PAGEALLOC)
+#if defined(CONFIG_MODULES) || !defined(CONFIG_PIN_TLB_TEXT)
 #define ITLB_MISS_KERNEL	1
 #endif
 
-- 
2.25.0



* [PATCH v1 4/6] powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in ITLB miss exception
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>

In order to re-enable MMU earlier, ensure ITLB miss exception
cannot clobber SPRN_SPRG_SCRATCH0 and SPRN_SPRG_SCRATCH1.
Do so by using SPRN_SPRG_SCRATCH2 and SPRN_M_TW instead, like
the DTLB miss exception.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/head_8xx.S | 12 ++++++------
 arch/powerpc/perf/8xx-pmu.c    |  4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 558c8e615ef9..45239b06b6ce 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -190,8 +190,8 @@ SystemCall:
 #endif
 
 InstructionTLBMiss:
-	mtspr	SPRN_SPRG_SCRATCH0, r10
-	mtspr	SPRN_SPRG_SCRATCH1, r11
+	mtspr	SPRN_SPRG_SCRATCH2, r10
+	mtspr	SPRN_M_TW, r11
 
 	/* If we are faulting a kernel address, we have to use the
 	 * kernel page tables.
@@ -230,8 +230,8 @@ InstructionTLBMiss:
 	mtspr	SPRN_MI_RPN, r10	/* Update TLB entry */
 
 	/* Restore registers */
-0:	mfspr	r10, SPRN_SPRG_SCRATCH0
-	mfspr	r11, SPRN_SPRG_SCRATCH1
+0:	mfspr	r10, SPRN_SPRG_SCRATCH2
+	mfspr	r11, SPRN_M_TW
 	rfi
 	patch_site	0b, patch__itlbmiss_exit_1
 
@@ -240,8 +240,8 @@ InstructionTLBMiss:
 0:	lwz	r10, (itlb_miss_counter - PAGE_OFFSET)@l(0)
 	addi	r10, r10, 1
 	stw	r10, (itlb_miss_counter - PAGE_OFFSET)@l(0)
-	mfspr	r10, SPRN_SPRG_SCRATCH0
-	mfspr	r11, SPRN_SPRG_SCRATCH1
+	mfspr	r10, SPRN_SPRG_SCRATCH2
+	mfspr	r11, SPRN_M_TW
 	rfi
 #endif
 
diff --git a/arch/powerpc/perf/8xx-pmu.c b/arch/powerpc/perf/8xx-pmu.c
index e53c3c161257..02db58c7427a 100644
--- a/arch/powerpc/perf/8xx-pmu.c
+++ b/arch/powerpc/perf/8xx-pmu.c
@@ -165,9 +165,9 @@ static void mpc8xx_pmu_del(struct perf_event *event, int flags)
 		break;
 	case PERF_8xx_ID_ITLB_LOAD_MISS:
 		if (atomic_dec_return(&itlb_miss_ref) == 0) {
-			/* mfspr r10, SPRN_SPRG_SCRATCH0 */
+			/* mfspr r10, SPRN_SPRG_SCRATCH2 */
 			struct ppc_inst insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) |
-					    __PPC_SPR(SPRN_SPRG_SCRATCH0));
+					    __PPC_SPR(SPRN_SPRG_SCRATCH2));
 
 			patch_instruction_site(&patch__itlbmiss_exit_1, insn);
 		}
-- 
2.25.0



* [PATCH v1 2/6] powerpc/8xx: Always pin kernel text TLB
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>

There is no big point in not pinning kernel text anymore, as now
we can keep pinned TLB even with things like DEBUG_PAGEALLOC.

Remove CONFIG_PIN_TLB_TEXT, so that kernel text is always pinned.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/Kconfig               |  3 +--
 arch/powerpc/kernel/head_8xx.S     | 20 +++-----------------
 arch/powerpc/mm/nohash/8xx.c       |  3 +--
 arch/powerpc/platforms/8xx/Kconfig |  7 -------
 4 files changed, 5 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e9f13fe08492..bf088b5b0a89 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -795,8 +795,7 @@ config DATA_SHIFT_BOOL
 	bool "Set custom data alignment"
 	depends on ADVANCED_OPTIONS
 	depends on STRICT_KERNEL_RWX || DEBUG_PAGEALLOC
-	depends on PPC_BOOK3S_32 || (PPC_8xx && !PIN_TLB_DATA && \
-				     (!PIN_TLB_TEXT || !STRICT_KERNEL_RWX))
+	depends on PPC_BOOK3S_32 || (PPC_8xx && !PIN_TLB_DATA && !STRICT_KERNEL_RWX)
 	help
 	  This option allows you to set the kernel data alignment. When
 	  RAM is mapped by blocks, the alignment needs to fit the size and
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 66ee62f30d36..775b4f4d011e 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -42,15 +42,6 @@
 #endif
 .endm
 
-/*
- * We need an ITLB miss handler for kernel addresses if:
- * - Either we have modules
- * - Or we have not pinned the first 8M
- */
-#if defined(CONFIG_MODULES) || !defined(CONFIG_PIN_TLB_TEXT)
-#define ITLB_MISS_KERNEL	1
-#endif
-
 /*
  * Value for the bits that have fixed value in RPN entries.
  * Also used for tagging DAR for DTLBerror.
@@ -209,12 +200,12 @@ InstructionTLBMiss:
 	mfspr	r10, SPRN_SRR0	/* Get effective address of fault */
 	INVALIDATE_ADJACENT_PAGES_CPU15(r10)
 	mtspr	SPRN_MD_EPN, r10
-#ifdef ITLB_MISS_KERNEL
+#ifdef CONFIG_MODULES
 	mfcr	r11
 	compare_to_kernel_boundary r10, r10
 #endif
 	mfspr	r10, SPRN_M_TWB	/* Get level 1 table */
-#ifdef ITLB_MISS_KERNEL
+#ifdef CONFIG_MODULES
 	blt+	3f
 	rlwinm	r10, r10, 0, 20, 31
 	oris	r10, r10, (swapper_pg_dir - PAGE_OFFSET)@ha
@@ -618,10 +609,6 @@ start_here:
 	lis	r0, (MD_TWAM | MD_RSV4I)@h
 	mtspr	SPRN_MD_CTR, r0
 #endif
-#ifndef CONFIG_PIN_TLB_TEXT
-	li	r0, 0
-	mtspr	SPRN_MI_CTR, r0
-#endif
 #if !defined(CONFIG_PIN_TLB_DATA) && !defined(CONFIG_PIN_TLB_IMMR)
 	lis	r0, MD_TWAM@h
 	mtspr	SPRN_MD_CTR, r0
@@ -739,7 +726,6 @@ _GLOBAL(mmu_pin_tlb)
 	mtspr	SPRN_MD_CTR, r6
 	tlbia
 
-#ifdef CONFIG_PIN_TLB_TEXT
 	LOAD_REG_IMMEDIATE(r5, 28 << 8)
 	LOAD_REG_IMMEDIATE(r6, PAGE_OFFSET)
 	LOAD_REG_IMMEDIATE(r7, MI_SVALID | MI_PS8MEG | _PMD_ACCESSED)
@@ -760,7 +746,7 @@ _GLOBAL(mmu_pin_tlb)
 	bdnzt	lt, 2b
 	lis	r0, MI_RSV4I@h
 	mtspr	SPRN_MI_CTR, r0
-#endif
+
 	LOAD_REG_IMMEDIATE(r5, 28 << 8 | MD_TWAM)
 #ifdef CONFIG_PIN_TLB_DATA
 	LOAD_REG_IMMEDIATE(r6, PAGE_OFFSET)
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index 231ca95f9ffb..19a3eec1d8c5 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -186,8 +186,7 @@ void mmu_mark_initmem_nx(void)
 	mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, false);
 	mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
 
-	if (IS_ENABLED(CONFIG_PIN_TLB_TEXT))
-		mmu_pin_tlb(block_mapped_ram, false);
+	mmu_pin_tlb(block_mapped_ram, false);
 }
 
 #ifdef CONFIG_STRICT_KERNEL_RWX
diff --git a/arch/powerpc/platforms/8xx/Kconfig b/arch/powerpc/platforms/8xx/Kconfig
index cdda034733ff..1a8400bfbe82 100644
--- a/arch/powerpc/platforms/8xx/Kconfig
+++ b/arch/powerpc/platforms/8xx/Kconfig
@@ -202,13 +202,6 @@ config PIN_TLB_IMMR
 	  CONFIG_PIN_TLB_DATA is also selected, it will reduce
 	  CONFIG_PIN_TLB_DATA to 24 Mbytes.
 
-config PIN_TLB_TEXT
-	bool "Pinned TLB for TEXT"
-	depends on PIN_TLB
-	default y
-	help
-	  This pins kernel text with 8M pages.
-
 endmenu
 
 endmenu
-- 
2.25.0



* [PATCH v1 5/6] powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in DTLB miss exception
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>

Use SPRN_SPRG_SCRATCH2 in DTLB miss exception instead of DAR
in order to be similar to ITLB miss exception.

This also simplifies mpc8xx_pmu_del().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/head_8xx.S |  9 ++++-----
 arch/powerpc/perf/8xx-pmu.c    | 19 +++++++------------
 2 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 45239b06b6ce..35707e86c5f3 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -247,7 +247,7 @@ InstructionTLBMiss:
 
 	. = 0x1200
 DataStoreTLBMiss:
-	mtspr	SPRN_DAR, r10
+	mtspr	SPRN_SPRG_SCRATCH2, r10
 	mtspr	SPRN_M_TW, r11
 	mfcr	r11
 
@@ -286,11 +286,11 @@ DataStoreTLBMiss:
 	li	r11, RPN_PATTERN
 	rlwimi	r10, r11, 0, 24, 27	/* Set 24-27 */
 	mtspr	SPRN_MD_RPN, r10	/* Update TLB entry */
+	mtspr	SPRN_DAR, r11		/* Tag DAR */
 
 	/* Restore registers */
 
-0:	mfspr	r10, SPRN_DAR
-	mtspr	SPRN_DAR, r11	/* Tag DAR */
+0:	mfspr	r10, SPRN_SPRG_SCRATCH2
 	mfspr	r11, SPRN_M_TW
 	rfi
 	patch_site	0b, patch__dtlbmiss_exit_1
@@ -300,8 +300,7 @@ DataStoreTLBMiss:
 0:	lwz	r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0)
 	addi	r10, r10, 1
 	stw	r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0)
-	mfspr	r10, SPRN_DAR
-	mtspr	SPRN_DAR, r11	/* Tag DAR */
+	mfspr	r10, SPRN_SPRG_SCRATCH2
 	mfspr	r11, SPRN_M_TW
 	rfi
 #endif
diff --git a/arch/powerpc/perf/8xx-pmu.c b/arch/powerpc/perf/8xx-pmu.c
index 02db58c7427a..93004ee586a1 100644
--- a/arch/powerpc/perf/8xx-pmu.c
+++ b/arch/powerpc/perf/8xx-pmu.c
@@ -153,6 +153,11 @@ static void mpc8xx_pmu_read(struct perf_event *event)
 
 static void mpc8xx_pmu_del(struct perf_event *event, int flags)
 {
+	struct ppc_inst insn;
+
+	/* mfspr r10, SPRN_SPRG_SCRATCH2 */
+	insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) | __PPC_SPR(SPRN_SPRG_SCRATCH2));
+
 	mpc8xx_pmu_read(event);
 
 	/* If it was the last user, stop counting to avoid useles overhead */
@@ -164,22 +169,12 @@ static void mpc8xx_pmu_del(struct perf_event *event, int flags)
 			mtspr(SPRN_ICTRL, 7);
 		break;
 	case PERF_8xx_ID_ITLB_LOAD_MISS:
-		if (atomic_dec_return(&itlb_miss_ref) == 0) {
-			/* mfspr r10, SPRN_SPRG_SCRATCH2 */
-			struct ppc_inst insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) |
-					    __PPC_SPR(SPRN_SPRG_SCRATCH2));
-
+		if (atomic_dec_return(&itlb_miss_ref) == 0)
 			patch_instruction_site(&patch__itlbmiss_exit_1, insn);
-		}
 		break;
 	case PERF_8xx_ID_DTLB_LOAD_MISS:
-		if (atomic_dec_return(&dtlb_miss_ref) == 0) {
-			/* mfspr r10, SPRN_DAR */
-			struct ppc_inst insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) |
-					    __PPC_SPR(SPRN_DAR));
-
+		if (atomic_dec_return(&dtlb_miss_ref) == 0)
 			patch_instruction_site(&patch__dtlbmiss_exit_1, insn);
-		}
 		break;
 	}
 }
-- 
2.25.0



* [PATCH v1 6/6] powerpc/ppc-opcode: Add PPC_RAW_MFSPR()
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>

Add PPC_RAW_MFSPR() to replace open coding done in 8xx-pmu.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/ppc-opcode.h | 3 ++-
 arch/powerpc/perf/8xx-pmu.c           | 5 +----
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index a6e3700c4566..da6f300e9788 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -230,7 +230,6 @@
 #define PPC_INST_POPCNTB_MASK		0xfc0007fe
 #define PPC_INST_RFEBB			0x4c000124
 #define PPC_INST_RFID			0x4c000024
-#define PPC_INST_MFSPR			0x7c0002a6
 #define PPC_INST_MFSPR_DSCR		0x7c1102a6
 #define PPC_INST_MFSPR_DSCR_MASK	0xfc1ffffe
 #define PPC_INST_MTSPR_DSCR		0x7c1103a6
@@ -507,6 +506,8 @@
 
 #define PPC_RAW_NEG(d, a)		(0x7c0000d0 | ___PPC_RT(d) | ___PPC_RA(a))
 
+#define PPC_RAW_MFSPR(d, spr)		(0x7c0002a6 | ___PPC_RT(d) | __PPC_SPR(spr))
+
 /* Deal with instructions that older assemblers aren't aware of */
 #define	PPC_BCCTR_FLUSH		stringify_in_c(.long PPC_INST_BCCTR_FLUSH)
 #define	PPC_CP_ABORT		stringify_in_c(.long PPC_RAW_CP_ABORT)
diff --git a/arch/powerpc/perf/8xx-pmu.c b/arch/powerpc/perf/8xx-pmu.c
index 93004ee586a1..f970d1510d3d 100644
--- a/arch/powerpc/perf/8xx-pmu.c
+++ b/arch/powerpc/perf/8xx-pmu.c
@@ -153,10 +153,7 @@ static void mpc8xx_pmu_read(struct perf_event *event)
 
 static void mpc8xx_pmu_del(struct perf_event *event, int flags)
 {
-	struct ppc_inst insn;
-
-	/* mfspr r10, SPRN_SPRG_SCRATCH2 */
-	insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) | __PPC_SPR(SPRN_SPRG_SCRATCH2));
+	struct ppc_inst insn = ppc_inst(PPC_RAW_MFSPR(10, SPRN_SPRG_SCRATCH2));
 
 	mpc8xx_pmu_read(event);
 
-- 
2.25.0
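What the new macro computes can be reproduced in standalone C. The sketch below assumes the field helpers follow the usual Power ISA layout (RT in bits 21..25 of the word, the 10-bit SPR number stored with its two 5-bit halves swapped); the function names are illustrative, not kernel API, and the SPR number 0x112 used for SPRG2 in the usage note is likewise an assumption about what SPRN_SPRG_SCRATCH2 resolves to on 8xx.

```c
#include <assert.h>
#include <stdint.h>

/* RT field: target GPR, 5 bits starting at bit 21. */
static uint32_t ppc_rt(unsigned int reg)
{
	return (reg & 0x1fu) << 21;
}

/* SPR field: low 5 bits go to bit 16, high 5 bits go to bit 11. */
static uint32_t ppc_spr(unsigned int spr)
{
	return ((spr & 0x1fu) << 16) | (((spr >> 5) & 0x1fu) << 11);
}

/* Build the instruction word for "mfspr rt, spr". */
uint32_t raw_mfspr(unsigned int rt, unsigned int spr)
{
	assert(rt < 32 && spr < 1024);
	return 0x7c0002a6u | ppc_rt(rt) | ppc_spr(spr);
}
```

As a cross-check against the header quoted in the diff, raw_mfspr(0, 0x11) gives 0x7c1102a6, which is the PPC_INST_MFSPR_DSCR constant (DSCR being SPR 0x11). Under the SPRG2 assumption, raw_mfspr(10, 0x112) gives 0x7d5242a6.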



* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
From: Liang, Kan @ 2020-11-24 16:04 UTC (permalink / raw)
  To: Madhavan Srinivasan, Namhyung Kim, Michael Ellerman
  Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, Jiri Olsa, linux-kernel,
	Stephane Eranian, Paul Mackerras, Arnaldo Carvalho de Melo,
	linuxppc-dev, Ingo Molnar, Gabriel Marin
In-Reply-To: <9657dc9f-e1a9-eb7e-8ac2-a108416d5a10@linux.ibm.com>



On 11/24/2020 12:42 AM, Madhavan Srinivasan wrote:
> 
> On 11/24/20 10:21 AM, Namhyung Kim wrote:
>> Hello,
>>
>> On Mon, Nov 23, 2020 at 8:00 PM Michael Ellerman <mpe@ellerman.id.au> 
>> wrote:
>>> Namhyung Kim <namhyung@kernel.org> writes:
>>>> Hi Peter and Kan,
>>>>
>>>> (Adding PPC folks)
>>>>
>>>> On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org> 
>>>> wrote:
>>>>> Hello,
>>>>>
>>>>> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan 
>>>>> <kan.liang@linux.intel.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
>>>>>>> On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
>>>>>>>
>>>>>>>> - When the large PEBS was introduced (9c964efa4330), the 
>>>>>>>> sched_task() should
>>>>>>>> be invoked to flush the PEBS buffer in each context switch. 
>>>>>>>> However, The
>>>>>>>> perf_sched_events in account_event() is not updated accordingly. 
>>>>>>>> The
>>>>>>>> perf_event_task_sched_* never be invoked for a pure per-CPU 
>>>>>>>> context. Only
>>>>>>>> per-task event works.
>>>>>>>>      At that time, the perf_pmu_sched_task() is outside of
>>>>>>>> perf_event_context_sched_in/out. It means that perf has to double
>>>>>>>> perf_pmu_disable() for per-task event.
>>>>>>>> - The patch 1 tries to fix broken per-CPU events. The CPU 
>>>>>>>> context cannot be
>>>>>>>> retrieved from the task->perf_event_ctxp. So it has to be 
>>>>>>>> tracked in the
>>>>>>>> sched_cb_list. Yes, the code is very similar to the original 
>>>>>>>> codes, but it
>>>>>>>> is actually the new code for per-CPU events. The optimization 
>>>>>>>> for per-task
>>>>>>>> events is still kept.
>>>>>>>>     For the case, which has both a CPU context and a task 
>>>>>>>> context, yes, the
>>>>>>>> __perf_pmu_sched_task() in this patch is not invoked. Because the
>>>>>>>> sched_task() only need to be invoked once in a context switch. The
>>>>>>>> sched_task() will be eventually invoked in the task context.
>>>>>>> The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB 
>>>>>>> and
>>>>>>> only set that for large pebs. Are you sure the other users (Intel 
>>>>>>> LBR
>>>>>>> and PowerPC BHRB) don't need it?
>>>>>> I didn't set it for LBR, because the perf_sched_events is always 
>>>>>> enabled
>>>>>> for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
>>>>>> for LBR.
>>>>>>
>>>>>>          if (has_branch_stack(event))
>>>>>>                  inc = true;
>>>>>>
>>>>>>> If they indeed do not require the pmu::sched_task() callback for CPU
>>>>>>> events, then I still think the whole perf_sched_cb_{inc,dec}() 
>>>>>>> interface
>>>>>> No, LBR requires the pmu::sched_task() callback for CPU events.
>>>>>>
>>>>>> Now, The LBR registers have to be reset in sched in even for CPU 
>>>>>> events.
>>>>>>
>>>>>> To fix the shorter LBR callstack issue for CPU events, we also 
>>>>>> need to
>>>>>> save/restore LBRs in pmu::sched_task().
>>>>>> https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/ 
>>>>>>
>>>>>>
>>>>>>> is confusing at best.
>>>>>>>
>>>>>>> Can't we do something like this instead?
>>>>>>>
>>>>>> I think the below patch may have two issues.
>>>>>> - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as 
>>>>>> well) now.
>>>>>> - We may disable the large PEBS later if not all PEBS events support
>>>>>> large PEBS. The PMU need a way to notify the generic code to decrease
>>>>>> the nr_sched_task.
>>>>> Any updates on this?  I've reviewed and tested Kan's patches
>>>>> and they all look good.
>>>>>
>>>>> Maybe we can talk to PPC folks to confirm the BHRB case?
>>>> Can we move this forward?  I saw patch 3/3 also adds 
>>>> PERF_ATTACH_SCHED_CB
>>>> for PowerPC too.  But it'd be nice if ppc folks can confirm the change.
>>> Sorry I've read the whole thread, but I'm still not entirely sure I
>>> understand the question.
>> Thanks for your time and sorry about not being clear enough.
>>
>> We found per-cpu events are not calling pmu::sched_task()
>> on context switches.  So PERF_ATTACH_SCHED_CB was
>> added to indicate the core logic that it needs to invoke the
>> callback.
>>
>> The patch 3/3 added the flag to PPC (for BHRB) with other
>> changes (I think it should be split like in the patch 2/3) and
>> want to get ACKs from the PPC folks.
> 
> Sorry for delay.
> 
> I guess first it will be better to split the ppc change to a separate 
> patch,

Both PPC and X86 invoke perf_sched_cb_inc() directly. The patch
changes the parameters of perf_sched_cb_inc(), so I think we have to
update the PPC and X86 code together. Otherwise there will be a
compile error if someone applies only the perf_sched_cb_inc() change
but forgets to apply the changes to the PPC- or X86-specific code.

> 
> secondly, we are missing the changes needed in the power_pmu_bhrb_disable()
> 
> where perf_sched_cb_dec() needs the "state" to be included.
> 

Ah, right. The below patch should fix the issue.

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index bced502f64a1..6756d1602a67 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -391,13 +391,18 @@ static void power_pmu_bhrb_enable(struct perf_event *event)
  static void power_pmu_bhrb_disable(struct perf_event *event)
  {
  	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
+	int state = PERF_SCHED_CB_SW_IN;

  	if (!ppmu->bhrb_nr)
  		return;

  	WARN_ON_ONCE(!cpuhw->bhrb_users);
  	cpuhw->bhrb_users--;
-	perf_sched_cb_dec(event->ctx->pmu);
+
+	if (!(event->attach_state & PERF_ATTACH_TASK))
+		state |= PERF_SCHED_CB_CPU;
+
+	perf_sched_cb_dec(event->ctx->pmu, state);

  	if (!cpuhw->disabled && !cpuhw->bhrb_users) {
  		/* BHRB cannot be turned off when other



Thanks,
Kan
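The state computation in the patch above, and why the disable path has to pass the same state as the enable path, can be modelled in a few lines. This is a sketch under stated assumptions: the flag values are placeholders (the real ones come from the series under review), and the two counters are a hypothetical simplification of whatever accounting the generic code keeps.

```c
#include <assert.h>

/* Placeholder values for illustration only. */
#define PERF_ATTACH_TASK	0x04
#define PERF_SCHED_CB_CPU	0x1
#define PERF_SCHED_CB_SW_IN	0x2

int nr_cpu_users;	/* callbacks requested by per-CPU events */
int nr_sw_in_users;	/* callbacks wanted on switch-in */

/* Mirror of the logic in the patch: per-CPU events add the CPU flag. */
int bhrb_sched_cb_state(unsigned int attach_state)
{
	int state = PERF_SCHED_CB_SW_IN;

	if (!(attach_state & PERF_ATTACH_TASK))
		state |= PERF_SCHED_CB_CPU;
	return state;
}

/* Hypothetical accounting on enable. */
void sched_cb_inc(int state)
{
	if (state & PERF_SCHED_CB_CPU)
		nr_cpu_users++;
	if (state & PERF_SCHED_CB_SW_IN)
		nr_sw_in_users++;
}

/* Hypothetical accounting on disable; must see the same state as inc. */
void sched_cb_dec(int state)
{
	if (state & PERF_SCHED_CB_CPU) {
		assert(nr_cpu_users > 0);
		nr_cpu_users--;
	}
	if (state & PERF_SCHED_CB_SW_IN) {
		assert(nr_sw_in_users > 0);
		nr_sw_in_users--;
	}
}
```

If the disable path omitted the state, a per-CPU event would increment the CPU counter on enable and never decrement it, which is the imbalance the review comment points at.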


* eBPF on powerpc
From: Naveen N. Rao @ 2020-11-24 16:35 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <d69650b0-4024-5759-3ccb-ede5c0394500@csgroup.eu>

Hi Christophe,

Christophe Leroy wrote:
> Hi Naveen,
> 
> Few years ago, you implemented eBPF on PPC64.
> 
> Is there any reason for implementing it for PPC64 only ?

I focused on ppc64 since eBPF is a 64-bit VM and it was more 
straight-forward to target.

> Is there something that makes it impossible to have eBPF for PPC32 as 
> well ?

No, I just wasn't sure if it would be performant enough to warrant it.
Since then, however, there have been arm32 and 32-bit riscv JIT
implementations, and at least the arm32 JIT seems to be showing ~50%
better performance compared to the interpreter (*). So, it would be
worthwhile to add support for ppc32.

Note that there might be a few instructions which would be difficult to
support on 32-bit, but those can fall back to the interpreter, while
allowing other programs to be JIT'ed.

- Naveen

(*) 
http://lkml.kernel.org/r/CAGXu5jLYunVCJGCfHPebKDaoQ71hdMGq4HhdDxTYpBQw_HXUYQ@mail.gmail.com
(*) http://lkml.kernel.org/r/b63fae4b-cb74-1928-b210-80914f3c8995@fb.com
(*) http://lkml.kernel.org/r/20200305050207.4159-1-luke.r.nels@gmail.com
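To make the "64-bit VM on a 32-bit target" point in the reply above concrete, here is a sketch of how a single eBPF 64-bit add has to be split across a pair of 32-bit registers, in the spirit of an add-with-carry instruction pair. It is not taken from any existing JIT; struct reg_pair and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One 64-bit eBPF register held in two 32-bit machine registers. */
struct reg_pair {
	uint32_t lo;
	uint32_t hi;
};

uint64_t pair_to_u64(struct reg_pair p)
{
	return ((uint64_t)p.hi << 32) | p.lo;
}

struct reg_pair u64_to_pair(uint64_t v)
{
	struct reg_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };

	return p;
}

/* 64-bit add done with 32-bit operations only. */
struct reg_pair add64_on_32(struct reg_pair dst, struct reg_pair src)
{
	struct reg_pair out;
	uint32_t carry;

	out.lo = dst.lo + src.lo;		/* add the low words */
	carry = out.lo < dst.lo;		/* unsigned wrap means carry out */
	out.hi = dst.hi + src.hi + carry;	/* add the high words plus carry */

	/* Self-check against native 64-bit arithmetic. */
	assert(pair_to_u64(out) == pair_to_u64(dst) + pair_to_u64(src));
	return out;
}
```

Every 64-bit ALU operation needs a similar expansion on a 32-bit target, which is one reason it was not obvious up front whether a 32-bit JIT would outperform the interpreter by a useful margin.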


* [PATCH net 0/2] ibmvnic: Bug fixes for queue descriptor processing
From: Thomas Falcon @ 2020-11-24 17:26 UTC (permalink / raw)
  To: netdev
  Cc: cforno12, ljp, ricklind, dnbanerg, tlfalcon, drt, brking, sukadev,
	linuxppc-dev

This series resolves a few issues in the ibmvnic driver's
RX buffer and TX completion processing. The first patch
includes memory barriers to synchronize queue descriptor
reads. The second patch fixes a memory leak that could
occur if the device returns a TX completion with an error
code in the descriptor, in which case the respective socket
buffer and other relevant data structures may not be freed
or updated properly.

Thomas Falcon (2):
  ibmvnic: Ensure that SCRQ entry reads are correctly ordered
  ibmvnic: Fix TX completion error handling

 drivers/net/ethernet/ibm/ibmvnic.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

-- 
1.8.3.1


* [PATCH net 1/2] ibmvnic: Ensure that SCRQ entry reads are correctly ordered
From: Thomas Falcon @ 2020-11-24 17:26 UTC (permalink / raw)
  To: netdev
  Cc: cforno12, ljp, ricklind, dnbanerg, tlfalcon, drt, brking, sukadev,
	linuxppc-dev
In-Reply-To: <1606238776-30259-1-git-send-email-tlfalcon@linux.ibm.com>

Ensure that received Subordinate Command-Response Queue (SCRQ)
entries are properly read in order by the driver. These queues
are used in the ibmvnic device to process RX buffer and TX completion
descriptors. dma_rmb barriers have been added after checking for a
pending descriptor to ensure the correct descriptor entry is checked
and after reading the SCRQ descriptor to ensure the entire
descriptor is read before processing.

Fixes: 032c5e828 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 2aa40b2..489ed5e 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2403,6 +2403,8 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
 
 		if (!pending_scrq(adapter, adapter->rx_scrq[scrq_num]))
 			break;
+		/* ensure that we do not prematurely exit the polling loop */
+		dma_rmb();
 		next = ibmvnic_next_scrq(adapter, adapter->rx_scrq[scrq_num]);
 		rx_buff =
 		    (struct ibmvnic_rx_buff *)be64_to_cpu(next->
@@ -3098,6 +3100,9 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
 		unsigned int pool = scrq->pool_index;
 		int num_entries = 0;
 
+		/* ensure that the correct descriptor entry is read */
+		dma_rmb();
+
 		next = ibmvnic_next_scrq(adapter, scrq);
 		for (i = 0; i < next->tx_comp.num_comps; i++) {
 			if (next->tx_comp.rcs[i]) {
@@ -3498,6 +3503,9 @@ static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *adapter,
 	}
 	spin_unlock_irqrestore(&scrq->lock, flags);
 
+	/* ensure that the entire SCRQ descriptor is read */
+	dma_rmb();
+
 	return entry;
 }
 
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH net 2/2] ibmvnic: Fix TX completion error handling
From: Thomas Falcon @ 2020-11-24 17:26 UTC (permalink / raw)
  To: netdev
  Cc: cforno12, ljp, ricklind, dnbanerg, tlfalcon, drt, brking, sukadev,
	linuxppc-dev
In-Reply-To: <1606238776-30259-1-git-send-email-tlfalcon@linux.ibm.com>

TX completions received with an error return code are not
being processed properly. When an error code is seen, do not
proceed to the next completion before cleaning up the existing
entry's data structures.

Fixes: 032c5e828 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 489ed5e..7097bcb 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3105,11 +3105,9 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
 
 		next = ibmvnic_next_scrq(adapter, scrq);
 		for (i = 0; i < next->tx_comp.num_comps; i++) {
-			if (next->tx_comp.rcs[i]) {
+			if (next->tx_comp.rcs[i])
 				dev_err(dev, "tx error %x\n",
 					next->tx_comp.rcs[i]);
-				continue;
-			}
 			index = be32_to_cpu(next->tx_comp.correlators[i]);
 			if (index & IBMVNIC_TSO_POOL_MASK) {
 				tx_pool = &adapter->tso_pool[pool];
-- 
1.8.3.1
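
The effect of dropping the `continue` can be shown with a small model. This is a sketch under stated assumptions: `struct completion`, `struct pool`, and `complete_batch()` are hypothetical and far simpler than the driver's data structures. It compares a loop that skips cleanup when a completion carries an error code with one that always releases the buffer.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 4

/* Hypothetical completion record: a return code plus a buffer index. */
struct completion {
	int rc;              /* non-zero means the device reported an error */
	unsigned int index;  /* which in-flight buffer this completes */
};

/* Hypothetical pool: in_flight[i] is true while buffer i awaits completion. */
struct pool {
	bool in_flight[NUM_SLOTS];
};

/*
 * Process a batch of completions. When skip_on_error is true the loop
 * mimics the old code path (report, then "continue"); when false it
 * mimics the patched path and always releases the buffer.
 * Returns the number of buffers still marked in flight afterwards.
 */
static size_t complete_batch(struct pool *p, const struct completion *c,
			     size_t n, bool skip_on_error)
{
	size_t i, leaked = 0;

	for (i = 0; i < n; i++) {
		if (c[i].rc && skip_on_error)
			continue;	/* old behaviour: cleanup is skipped */
		if (c[i].index < NUM_SLOTS)
			p->in_flight[c[i].index] = false;	/* release */
	}

	for (i = 0; i < NUM_SLOTS; i++)
		if (p->in_flight[i])
			leaked++;
	return leaked;
}
```

With three buffers in flight and one errored completion, the skipping variant leaves one buffer behind, while the non-skipping variant releases all three.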


^ permalink raw reply related

* Re: [PATCH kernel v4 1/8] genirq/ipi: Simplify irq_reserve_ipi
From: Cédric Le Goater @ 2020-11-24 16:54 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linux-kernel
  Cc: linux-mips, Matt Redfearn, Qais Yousef, Marc Zyngier, x86,
	linux-gpio, Oliver O'Halloran, Frederic Barrat,
	Thomas Gleixner, Michal Suchánek, linuxppc-dev,
	linux-arm-kernel
In-Reply-To: <20201124061720.86766-2-aik@ozlabs.ru>

On 11/24/20 7:17 AM, Alexey Kardashevskiy wrote:
> __irq_domain_alloc_irqs() can already handle virq==-1 and free
> descriptors if it failed allocating hardware interrupts so let's skip
> this extra step.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

LGTM,

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Copying the MIPS folks since the IPI interface is only used under arch/mips.

C.
 
> ---
>  kernel/irq/ipi.c | 16 +++-------------
>  1 file changed, 3 insertions(+), 13 deletions(-)
> 
> diff --git a/kernel/irq/ipi.c b/kernel/irq/ipi.c
> index 43e3d1be622c..1b2807318ea9 100644
> --- a/kernel/irq/ipi.c
> +++ b/kernel/irq/ipi.c
> @@ -75,18 +75,12 @@ int irq_reserve_ipi(struct irq_domain *domain,
>  		}
>  	}
>  
> -	virq = irq_domain_alloc_descs(-1, nr_irqs, 0, NUMA_NO_NODE, NULL);
> -	if (virq <= 0) {
> -		pr_warn("Can't reserve IPI, failed to alloc descs\n");
> -		return -ENOMEM;
> -	}
> -
> -	virq = __irq_domain_alloc_irqs(domain, virq, nr_irqs, NUMA_NO_NODE,
> -				       (void *) dest, true, NULL);
> +	virq = __irq_domain_alloc_irqs(domain, -1, nr_irqs, NUMA_NO_NODE,
> +				       (void *) dest, false, NULL);
>  
>  	if (virq <= 0) {
>  		pr_warn("Can't reserve IPI, failed to alloc hw irqs\n");
> -		goto free_descs;
> +		return -EBUSY;
>  	}
>  
>  	for (i = 0; i < nr_irqs; i++) {
> @@ -96,10 +90,6 @@ int irq_reserve_ipi(struct irq_domain *domain,
>  		irq_set_status_flags(virq + i, IRQ_NO_BALANCING);
>  	}
>  	return virq;
> -
> -free_descs:
> -	irq_free_descs(virq, nr_irqs);
> -	return -EBUSY;
>  }
>  
>  /**
> 


^ permalink raw reply

* Re: eBPF on powerpc
From: Christophe Leroy @ 2020-11-24 18:45 UTC (permalink / raw)
  To: Naveen N. Rao, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1606234192.xvkulhfr3y.naveen@linux.ibm.com>



On 24/11/2020 at 17:35, Naveen N. Rao wrote:
> Hi Christophe,
> 
> Christophe Leroy wrote:
>> Hi Naveen,
>>
>> A few years ago, you implemented eBPF on PPC64.
>>
>> Is there any reason for implementing it for PPC64 only?
> 
> I focused on ppc64 since eBPF is a 64-bit VM and it was more straightforward to target.
> 
>> Is there something that makes it impossible to have eBPF for PPC32 as well?
> 
> No, I just wasn't sure if it would be performant enough to warrant it. Since then, however, there 
> have been arm32 and riscv 32-bit JIT implementations, and at least the arm32 JIT seems to be showing 
> ~50% better performance compared to the interpreter (*). So, it would be worthwhile to add support 
> for ppc32.

That's great.

I know next to nothing about eBPF. Is there any good documentation on it somewhere that 
would help me understand how it works and extend the 64-bit powerpc implementation to 32 bits?

> 
> Note that there might be a few instructions which would be difficult to support on 32-bit, but those 
> can fall back to the interpreter, while allowing other programs to be JIT'ed.
> 
> 
> - Naveen
> 
> (*) http://lkml.kernel.org/r/CAGXu5jLYunVCJGCfHPebKDaoQ71hdMGq4HhdDxTYpBQw_HXUYQ@mail.gmail.com
> (*) http://lkml.kernel.org/r/b63fae4b-cb74-1928-b210-80914f3c8995@fb.com
> (*) http://lkml.kernel.org/r/20200305050207.4159-1-luke.r.nels@gmail.com

Christophe

^ permalink raw reply

* Re: [PATCH 1/3] powerpc: Make NUMA depend on SMP
From: Randy Dunlap @ 2020-11-24 19:46 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: srikar
In-Reply-To: <20201124120547.1940635-1-mpe@ellerman.id.au>

On 11/24/20 4:05 AM, Michael Ellerman wrote:
> Our Kconfig allows NUMA to be enabled without SMP, but none of
> our defconfigs use that combination. This means it can easily be
> broken inadvertently by code changes, which has happened recently.
> 
> Although it's theoretically possible to have a machine with a single
> CPU and multiple memory nodes, I can't think of any real systems where
> that's the case. Even so if such a system exists, it can just run an
> SMP kernel anyway.
> 
> So to avoid the need to add extra #ifdefs and/or build breaks, make
> NUMA depend on SMP.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>

Thanks.

> ---
>  arch/powerpc/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e9f13fe08492..a22db3db6b96 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -660,7 +660,7 @@ config IRQ_ALL_CPUS
>  
>  config NUMA
>  	bool "NUMA support"
> -	depends on PPC64
> +	depends on PPC64 && SMP
>  	default y if SMP && PPC_PSERIES
>  
>  config NODES_SHIFT
> 


-- 
~Randy

^ permalink raw reply

* Re: [PATCH 3/3] powerpc: Update NUMA Kconfig description & help text
From: Randy Dunlap @ 2020-11-24 19:47 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: srikar
In-Reply-To: <20201124120547.1940635-3-mpe@ellerman.id.au>

On 11/24/20 4:05 AM, Michael Ellerman wrote:
> Update the NUMA Kconfig description to match other architectures, and
> add some help text. Shamelessly borrowed from x86/arm64.
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>

Thanks.

> ---
>  arch/powerpc/Kconfig | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 4d688b426353..7f4995b245a3 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -659,9 +659,15 @@ config IRQ_ALL_CPUS
>  	  reported with SMP Power Macintoshes with this option enabled.
>  
>  config NUMA
> -	bool "NUMA support"
> +	bool "NUMA Memory Allocation and Scheduler Support"
>  	depends on PPC64 && SMP
>  	default y if PPC_PSERIES || PPC_POWERNV
> +	help
> +	  Enable NUMA (Non-Uniform Memory Access) support.
> +
> +	  The kernel will try to allocate memory used by a CPU on the
> +	  local memory controller of the CPU and add some more
> +	  NUMA awareness to the kernel.
>  
>  config NODES_SHIFT
>  	int
> 


-- 
~Randy


^ permalink raw reply

* Re: eBPF on powerpc
From: Naveen N. Rao @ 2020-11-24 19:51 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <4d588481-0c8d-6adf-53f5-e7332ddca7c4@csgroup.eu>

Christophe Leroy wrote:
> 
> 
> On 24/11/2020 at 17:35, Naveen N. Rao wrote:
>> Hi Christophe,
>> 
>> Christophe Leroy wrote:
>>> Hi Naveen,
>>>
>>> A few years ago, you implemented eBPF on PPC64.
>>>
>>> Is there any reason for implementing it for PPC64 only?
>> 
>> I focused on ppc64 since eBPF is a 64-bit VM and it was more straightforward to target.
>> 
>>> Is there something that makes it impossible to have eBPF for PPC32 as well?
>> 
>> No, I just wasn't sure if it would be performant enough to warrant it. Since then, however, there 
>> have been arm32 and riscv 32-bit JIT implementations, and at least the arm32 JIT seems to be showing 
>> ~50% better performance compared to the interpreter (*). So, it would be worthwhile to add support 
>> for ppc32.
> 
> That's great.
> 
> I know next to nothing about eBPF. Is there any good documentation on it somewhere that 
> would help me understand how it works and extend the 64-bit powerpc implementation to 32 bits?

I don't think there was ever a formal spec written for the eBPF VM. Here 
are a few resources which should help, alongside the existing JIT 
implementations:
- BPF Kernel Internals:  
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/filter.rst#n604
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/bpf
- BPF and XDP Reference Guide: https://docs.cilium.io/en/stable/bpf/


- Naveen


^ permalink raw reply

* [PATCH v1 2/3] powerpc/32s: In add_hash_page(), calculate VSID later
From: Christophe Leroy @ 2020-11-24 19:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <6470ab99e58c84a5445af43ce4d1d772b0dc3e93.1606247495.git.christophe.leroy@csgroup.eu>

VSID is only needed by create_hpte(). When _PAGE_HASHPTE is
already set, add_hash_page() bails out without calling
create_hpte() and doesn't need the value of VSID.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/mm/book3s32/hash_low.S | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/mm/book3s32/hash_low.S b/arch/powerpc/mm/book3s32/hash_low.S
index f964fd34dad9..1366e8e4fc05 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -188,12 +188,6 @@ _GLOBAL(add_hash_page)
 	mflr	r0
 	stw	r0,4(r1)
 
-	/* Convert context and va to VSID */
-	mulli	r3,r3,897*16		/* multiply context by context skew */
-	rlwinm	r0,r4,4,28,31		/* get ESID (top 4 bits of va) */
-	mulli	r0,r0,0x111		/* multiply by ESID skew */
-	add	r3,r3,r0		/* note create_hpte trims to 24 bits */
-
 #ifdef CONFIG_SMP
 	lwz	r8,TASK_CPU(r2)		/* to go in mmu_hash_lock */
 	oris	r8,r8,12
@@ -257,6 +251,12 @@ _GLOBAL(add_hash_page)
 	stwcx.	r5,0,r8
 	bne-	1b
 
+	/* Convert context and va to VSID */
+	mulli	r3,r3,897*16		/* multiply context by context skew */
+	rlwinm	r0,r4,4,28,31		/* get ESID (top 4 bits of va) */
+	mulli	r0,r0,0x111		/* multiply by ESID skew */
+	add	r3,r3,r0		/* note create_hpte trims to 24 bits */
+
 	bl	create_hpte
 
 9:
-- 
2.25.0
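
For readers less familiar with the assembly, the four instructions moved by this patch can be restated in C. This is a sketch based only on the instruction comments in the diff: the function name is hypothetical, and the 24-bit trim is folded in here for illustration, whereas the comment in the diff says create_hpte performs it.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the moved instructions:
 *   mulli  r3,r3,897*16    context * context skew
 *   rlwinm r0,r4,4,28,31   ESID = top 4 bits of va
 *   mulli  r0,r0,0x111     ESID * ESID skew
 *   add    r3,r3,r0
 * The final mask stands in for the 24-bit trim done by create_hpte.
 */
static uint32_t vsid_from_context_va(uint32_t context, uint32_t va)
{
	uint32_t esid = va >> 28;	/* top 4 bits of the address */
	uint32_t vsid = context * (897u * 16u) + esid * 0x111u;

	return vsid & 0x00ffffffu;	/* keep the low 24 bits */
}
```

For example, context 1 with va 0xC0000000 gives ESID 0xC, so the result is 14352 + 12 * 273 = 17628 (0x44DC).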


^ permalink raw reply related

* [PATCH v1 1/3] powerpc/32s: Remove unused counters incremented by create_hpte()
From: Christophe Leroy @ 2020-11-24 19:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

primary_pteg_full and htab_hash_searches are not used.

Remove them.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/mm/book3s32/hash_low.S | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/arch/powerpc/mm/book3s32/hash_low.S b/arch/powerpc/mm/book3s32/hash_low.S
index 9a56ba4f68f2..f964fd34dad9 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -359,11 +359,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
 	beq+	10f			/* no PTE: go look for an empty slot */
 	tlbie	r4
 
-	lis	r4, (htab_hash_searches - PAGE_OFFSET)@ha
-	lwz	r6, (htab_hash_searches - PAGE_OFFSET)@l(r4)
-	addi	r6,r6,1			/* count how many searches we do */
-	stw	r6, (htab_hash_searches - PAGE_OFFSET)@l(r4)
-
 	/* Search the primary PTEG for a PTE whose 1st (d)word matches r5 */
 	mtctr	r0
 	addi	r4,r3,-HPTE_SIZE
@@ -393,12 +388,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
 	bdnzf	2,1b			/* loop while ctr != 0 && !cr0.eq */
 	beq+	.Lfound_empty
 
-	/* update counter of times that the primary PTEG is full */
-	lis	r4, (primary_pteg_full - PAGE_OFFSET)@ha
-	lwz	r6, (primary_pteg_full - PAGE_OFFSET)@l(r4)
-	addi	r6,r6,1
-	stw	r6, (primary_pteg_full - PAGE_OFFSET)@l(r4)
-
 	patch_site	0f, patch__hash_page_C
 	/* Search the secondary PTEG for an empty slot */
 	ori	r5,r5,PTE_H		/* set H (secondary hash) bit */
@@ -491,10 +480,6 @@ _ASM_NOKPROBE_SYMBOL(create_hpte)
 	.align	2
 next_slot:
 	.space	4
-primary_pteg_full:
-	.space	4
-htab_hash_searches:
-	.space	4
 	.previous
 
 /*
-- 
2.25.0


^ permalink raw reply related
