* [PATCH 3/3] powerpc: Update NUMA Kconfig description & help text
From: Michael Ellerman @ 2020-11-24 12:05 UTC (permalink / raw)
To: linuxppc-dev; +Cc: rdunlap, srikar
In-Reply-To: <20201124120547.1940635-1-mpe@ellerman.id.au>
Update the NUMA Kconfig description to match other architectures, and
add some help text. Shamelessly borrowed from x86/arm64.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/Kconfig | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4d688b426353..7f4995b245a3 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -659,9 +659,15 @@ config IRQ_ALL_CPUS
reported with SMP Power Macintoshes with this option enabled.
config NUMA
- bool "NUMA support"
+ bool "NUMA Memory Allocation and Scheduler Support"
depends on PPC64 && SMP
default y if PPC_PSERIES || PPC_POWERNV
+ help
+ Enable NUMA (Non-Uniform Memory Access) support.
+
+ The kernel will try to allocate memory used by a CPU on the
+ local memory controller of the CPU and add some more
+ NUMA awareness to the kernel.
config NODES_SHIFT
int
--
2.25.1
^ permalink raw reply related
* [PATCH 2/3] powerpc: Make NUMA default y for powernv
From: Michael Ellerman @ 2020-11-24 12:05 UTC (permalink / raw)
To: linuxppc-dev; +Cc: rdunlap, srikar
In-Reply-To: <20201124120547.1940635-1-mpe@ellerman.id.au>
Our NUMA option is default y for pseries, but not powernv. The bulk of
powernv systems are NUMA, so make NUMA default y for powernv also.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index a22db3db6b96..4d688b426353 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -661,7 +661,7 @@ config IRQ_ALL_CPUS
config NUMA
bool "NUMA support"
depends on PPC64 && SMP
- default y if SMP && PPC_PSERIES
+ default y if PPC_PSERIES || PPC_POWERNV
config NODES_SHIFT
int
--
2.25.1
^ permalink raw reply related
* Re: [PATCH V2 4/5] ocxl: Add mmu notifier
From: Jason Gunthorpe @ 2020-11-24 13:45 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linuxppc-dev, Christophe Lombard, fbarrat, ajd
In-Reply-To: <20201124091738.GA26078@infradead.org>
On Tue, Nov 24, 2020 at 09:17:38AM +0000, Christoph Hellwig wrote:
> > @@ -470,6 +487,26 @@ void ocxl_link_release(struct pci_dev *dev, void *link_handle)
> > }
> > EXPORT_SYMBOL_GPL(ocxl_link_release);
> >
> > +static void invalidate_range(struct mmu_notifier *mn,
> > + struct mm_struct *mm,
> > + unsigned long start, unsigned long end)
> > +{
> > + struct pe_data *pe_data = container_of(mn, struct pe_data, mmu_notifier);
> > + struct ocxl_link *link = pe_data->link;
> > + unsigned long addr, pid, page_size = PAGE_SIZE;
The page_size variable seems unnecessary
> > +
> > + pid = mm->context.id;
> > +
> > + spin_lock(&link->atsd_lock);
> > + for (addr = start; addr < end; addr += page_size)
> > + pnv_ocxl_tlb_invalidate(&link->arva, pid, addr);
> > + spin_unlock(&link->atsd_lock);
> > +}
> > +
> > +static const struct mmu_notifier_ops ocxl_mmu_notifier_ops = {
> > + .invalidate_range = invalidate_range,
> > +};
> > +
> > static u64 calculate_cfg_state(bool kernel)
> > {
> > u64 state;
> > @@ -526,6 +563,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
> > pe_data->mm = mm;
> > pe_data->xsl_err_cb = xsl_err_cb;
> > pe_data->xsl_err_data = xsl_err_data;
> > + pe_data->link = link;
> > + pe_data->mmu_notifier.ops = &ocxl_mmu_notifier_ops;
> >
> > memset(pe, 0, sizeof(struct ocxl_process_element));
> > pe->config_state = cpu_to_be64(calculate_cfg_state(pidr == 0));
> > @@ -542,8 +581,16 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
> > * by the nest MMU. If we have a kernel context, TLBIs are
> > * already global.
> > */
> > - if (mm)
> > + if (mm) {
> > mm_context_add_copro(mm);
> > + if (link->arva) {
> > + /* Use MMIO registers for the TLB Invalidate
> > + * operations.
> > + */
> > + mmu_notifier_register(&pe_data->mmu_notifier, mm);
Every other place doing stuff like this is de-duplicating the
notifier. If you have multiple clients this will do multiple redundant
invalidations?
The notifier get/put API is designed to solve that problem, you'd get
a single notifier for the mm and then add the impacted arva's to some
list at the notifier.
Jason
^ permalink raw reply
* [PATCH] tpm: ibmvtpm: fix error return code in tpm_ibmvtpm_probe()
From: Wang Hai @ 2020-11-24 13:52 UTC (permalink / raw)
To: mpe, benh, paulus, peterhuewe, jarkko, jgg, stefanb, nayna
Cc: linux-integrity, linuxppc-dev, linux-kernel
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: d8d74ea3c002 ("tpm: ibmvtpm: Wait for buffer to be set before proceeding")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
---
drivers/char/tpm/tpm_ibmvtpm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
index 994385bf37c0..813eb2cac0ce 100644
--- a/drivers/char/tpm/tpm_ibmvtpm.c
+++ b/drivers/char/tpm/tpm_ibmvtpm.c
@@ -687,6 +687,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
ibmvtpm->rtce_buf != NULL,
HZ)) {
dev_err(dev, "CRQ response timed out\n");
+ rc = -ETIMEDOUT;
goto init_irq_cleanup;
}
--
2.17.1
^ permalink raw reply related
* [PATCH] ASoC: fsl_xcvr: fix potential resource leak
From: Viorel Suman (OSS) @ 2020-11-24 14:19 UTC (permalink / raw)
To: Timur Tabi, Nicolin Chen, Xiubo Li, Fabio Estevam, Shengjiu Wang,
Liam Girdwood, Mark Brown, Jaroslav Kysela, Takashi Iwai,
alsa-devel, linuxppc-dev, linux-kernel
Cc: Viorel Suman
From: Viorel Suman <viorel.suman@nxp.com>
"fw" variable must be relased before return.
Signed-off-by: Viorel Suman <viorel.suman@nxp.com>
---
sound/soc/fsl/fsl_xcvr.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/soc/fsl/fsl_xcvr.c b/sound/soc/fsl/fsl_xcvr.c
index 2a28810d0e29..3d58c88ea603 100644
--- a/sound/soc/fsl/fsl_xcvr.c
+++ b/sound/soc/fsl/fsl_xcvr.c
@@ -706,6 +706,7 @@ static int fsl_xcvr_load_firmware(struct fsl_xcvr *xcvr)
/* RAM is 20KiB = 16KiB code + 4KiB data => max 10 pages 2KiB each */
if (rem > 16384) {
dev_err(dev, "FW size %d is bigger than 16KiB.\n", rem);
+ release_firmware(fw);
return -ENOMEM;
}
--
2.26.2
^ permalink raw reply related
* Re: [PATCH] tpm: ibmvtpm: fix error return code in tpm_ibmvtpm_probe()
From: Stefan Berger @ 2020-11-24 14:28 UTC (permalink / raw)
To: Wang Hai, mpe, benh, paulus, peterhuewe, jarkko, jgg, nayna
Cc: linux-integrity, linuxppc-dev, linux-kernel
In-Reply-To: <20201124135244.31932-1-wanghai38@huawei.com>
On 11/24/20 8:52 AM, Wang Hai wrote:
> Fix to return a negative error code from the error handling
> case instead of 0, as done elsewhere in this function.
>
> Fixes: d8d74ea3c002 ("tpm: ibmvtpm: Wait for buffer to be set before proceeding")
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: Wang Hai <wanghai38@huawei.com>
> ---
> drivers/char/tpm/tpm_ibmvtpm.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
> index 994385bf37c0..813eb2cac0ce 100644
> --- a/drivers/char/tpm/tpm_ibmvtpm.c
> +++ b/drivers/char/tpm/tpm_ibmvtpm.c
> @@ -687,6 +687,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
> ibmvtpm->rtce_buf != NULL,
> HZ)) {
> dev_err(dev, "CRQ response timed out\n");
> + rc = -ETIMEDOUT;
> goto init_irq_cleanup;
> }
>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
^ permalink raw reply
* eBPF on powerpc
From: Christophe Leroy @ 2020-11-24 15:00 UTC (permalink / raw)
To: Naveen N. Rao, linuxppc-dev@lists.ozlabs.org
Hi Naveen,
Few years ago, you implemented eBPF on PPC64.
Is there any reason for implementing it for PPC64 only ? Is there something that makes it impossible
to have eBPF for PPC32 as well ?
Thanks
Christophe
^ permalink raw reply
* [PATCH v1 3/6] powerpc/8xx: Simplify INVALIDATE_ADJACENT_PAGES_CPU15
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>
We now have r11 available as a scratch register so
INVALIDATE_ADJACENT_PAGES_CPU15() can be simplified.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_8xx.S | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 775b4f4d011e..558c8e615ef9 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -180,14 +180,13 @@ SystemCall:
*/
#ifdef CONFIG_8xx_CPU15
-#define INVALIDATE_ADJACENT_PAGES_CPU15(addr) \
- addi addr, addr, PAGE_SIZE; \
- tlbie addr; \
- addi addr, addr, -(PAGE_SIZE << 1); \
- tlbie addr; \
- addi addr, addr, PAGE_SIZE
+#define INVALIDATE_ADJACENT_PAGES_CPU15(addr, tmp) \
+ addi tmp, addr, PAGE_SIZE; \
+ tlbie tmp; \
+ addi tmp, addr, -PAGE_SIZE; \
+ tlbie tmp
#else
-#define INVALIDATE_ADJACENT_PAGES_CPU15(addr)
+#define INVALIDATE_ADJACENT_PAGES_CPU15(addr, tmp)
#endif
InstructionTLBMiss:
@@ -198,7 +197,7 @@ InstructionTLBMiss:
* kernel page tables.
*/
mfspr r10, SPRN_SRR0 /* Get effective address of fault */
- INVALIDATE_ADJACENT_PAGES_CPU15(r10)
+ INVALIDATE_ADJACENT_PAGES_CPU15(r10, r11)
mtspr SPRN_MD_EPN, r10
#ifdef CONFIG_MODULES
mfcr r11
--
2.25.0
^ permalink raw reply related
* [PATCH v1 1/6] powerpc/8xx: DEBUG_PAGEALLOC doesn't require an ITLB miss exception handler
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Since commit e611939fc8ec ("powerpc/mm: Ensure change_page_attr()
doesn't invalidate pinned TLBs"), pinned TLBs are not anymore
invalidated by __kernel_map_pages() when CONFIG_DEBUG_PAGEALLOC is
selected.
Remove the dependency on CONFIG_DEBUG_PAGEALLOC.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_8xx.S | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index ee0bfebc375f..66ee62f30d36 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -47,8 +47,7 @@
* - Either we have modules
* - Or we have not pinned the first 8M
*/
-#if defined(CONFIG_MODULES) || !defined(CONFIG_PIN_TLB_TEXT) || \
- defined(CONFIG_DEBUG_PAGEALLOC)
+#if defined(CONFIG_MODULES) || !defined(CONFIG_PIN_TLB_TEXT)
#define ITLB_MISS_KERNEL 1
#endif
--
2.25.0
^ permalink raw reply related
* [PATCH v1 4/6] powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in ITLB miss exception
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>
In order to re-enable MMU earlier, ensure ITLB miss exception
cannot clobber SPRN_SPRG_SCRATCH0 and SPRN_SPRG_SCRATCH1.
Do so by using SPRN_SPRG_SCRATCH2 and SPRN_M_TW instead, like
the DTLB miss exception.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_8xx.S | 12 ++++++------
arch/powerpc/perf/8xx-pmu.c | 4 ++--
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 558c8e615ef9..45239b06b6ce 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -190,8 +190,8 @@ SystemCall:
#endif
InstructionTLBMiss:
- mtspr SPRN_SPRG_SCRATCH0, r10
- mtspr SPRN_SPRG_SCRATCH1, r11
+ mtspr SPRN_SPRG_SCRATCH2, r10
+ mtspr SPRN_M_TW, r11
/* If we are faulting a kernel address, we have to use the
* kernel page tables.
@@ -230,8 +230,8 @@ InstructionTLBMiss:
mtspr SPRN_MI_RPN, r10 /* Update TLB entry */
/* Restore registers */
-0: mfspr r10, SPRN_SPRG_SCRATCH0
- mfspr r11, SPRN_SPRG_SCRATCH1
+0: mfspr r10, SPRN_SPRG_SCRATCH2
+ mfspr r11, SPRN_M_TW
rfi
patch_site 0b, patch__itlbmiss_exit_1
@@ -240,8 +240,8 @@ InstructionTLBMiss:
0: lwz r10, (itlb_miss_counter - PAGE_OFFSET)@l(0)
addi r10, r10, 1
stw r10, (itlb_miss_counter - PAGE_OFFSET)@l(0)
- mfspr r10, SPRN_SPRG_SCRATCH0
- mfspr r11, SPRN_SPRG_SCRATCH1
+ mfspr r10, SPRN_SPRG_SCRATCH2
+ mfspr r11, SPRN_M_TW
rfi
#endif
diff --git a/arch/powerpc/perf/8xx-pmu.c b/arch/powerpc/perf/8xx-pmu.c
index e53c3c161257..02db58c7427a 100644
--- a/arch/powerpc/perf/8xx-pmu.c
+++ b/arch/powerpc/perf/8xx-pmu.c
@@ -165,9 +165,9 @@ static void mpc8xx_pmu_del(struct perf_event *event, int flags)
break;
case PERF_8xx_ID_ITLB_LOAD_MISS:
if (atomic_dec_return(&itlb_miss_ref) == 0) {
- /* mfspr r10, SPRN_SPRG_SCRATCH0 */
+ /* mfspr r10, SPRN_SPRG_SCRATCH2 */
struct ppc_inst insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) |
- __PPC_SPR(SPRN_SPRG_SCRATCH0));
+ __PPC_SPR(SPRN_SPRG_SCRATCH2));
patch_instruction_site(&patch__itlbmiss_exit_1, insn);
}
--
2.25.0
^ permalink raw reply related
* [PATCH v1 2/6] powerpc/8xx: Always pin kernel text TLB
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>
There is no big poing in not pinning kernel text anymore, as now
we can keep pinned TLB even with things like DEBUG_PAGEALLOC.
Remove CONFIG_PIN_TLB_TEXT, making it always right.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/Kconfig | 3 +--
arch/powerpc/kernel/head_8xx.S | 20 +++-----------------
arch/powerpc/mm/nohash/8xx.c | 3 +--
arch/powerpc/platforms/8xx/Kconfig | 7 -------
4 files changed, 5 insertions(+), 28 deletions(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e9f13fe08492..bf088b5b0a89 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -795,8 +795,7 @@ config DATA_SHIFT_BOOL
bool "Set custom data alignment"
depends on ADVANCED_OPTIONS
depends on STRICT_KERNEL_RWX || DEBUG_PAGEALLOC
- depends on PPC_BOOK3S_32 || (PPC_8xx && !PIN_TLB_DATA && \
- (!PIN_TLB_TEXT || !STRICT_KERNEL_RWX))
+ depends on PPC_BOOK3S_32 || (PPC_8xx && !PIN_TLB_DATA && !STRICT_KERNEL_RWX)
help
This option allows you to set the kernel data alignment. When
RAM is mapped by blocks, the alignment needs to fit the size and
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 66ee62f30d36..775b4f4d011e 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -42,15 +42,6 @@
#endif
.endm
-/*
- * We need an ITLB miss handler for kernel addresses if:
- * - Either we have modules
- * - Or we have not pinned the first 8M
- */
-#if defined(CONFIG_MODULES) || !defined(CONFIG_PIN_TLB_TEXT)
-#define ITLB_MISS_KERNEL 1
-#endif
-
/*
* Value for the bits that have fixed value in RPN entries.
* Also used for tagging DAR for DTLBerror.
@@ -209,12 +200,12 @@ InstructionTLBMiss:
mfspr r10, SPRN_SRR0 /* Get effective address of fault */
INVALIDATE_ADJACENT_PAGES_CPU15(r10)
mtspr SPRN_MD_EPN, r10
-#ifdef ITLB_MISS_KERNEL
+#ifdef CONFIG_MODULES
mfcr r11
compare_to_kernel_boundary r10, r10
#endif
mfspr r10, SPRN_M_TWB /* Get level 1 table */
-#ifdef ITLB_MISS_KERNEL
+#ifdef CONFIG_MODULES
blt+ 3f
rlwinm r10, r10, 0, 20, 31
oris r10, r10, (swapper_pg_dir - PAGE_OFFSET)@ha
@@ -618,10 +609,6 @@ start_here:
lis r0, (MD_TWAM | MD_RSV4I)@h
mtspr SPRN_MD_CTR, r0
#endif
-#ifndef CONFIG_PIN_TLB_TEXT
- li r0, 0
- mtspr SPRN_MI_CTR, r0
-#endif
#if !defined(CONFIG_PIN_TLB_DATA) && !defined(CONFIG_PIN_TLB_IMMR)
lis r0, MD_TWAM@h
mtspr SPRN_MD_CTR, r0
@@ -739,7 +726,6 @@ _GLOBAL(mmu_pin_tlb)
mtspr SPRN_MD_CTR, r6
tlbia
-#ifdef CONFIG_PIN_TLB_TEXT
LOAD_REG_IMMEDIATE(r5, 28 << 8)
LOAD_REG_IMMEDIATE(r6, PAGE_OFFSET)
LOAD_REG_IMMEDIATE(r7, MI_SVALID | MI_PS8MEG | _PMD_ACCESSED)
@@ -760,7 +746,7 @@ _GLOBAL(mmu_pin_tlb)
bdnzt lt, 2b
lis r0, MI_RSV4I@h
mtspr SPRN_MI_CTR, r0
-#endif
+
LOAD_REG_IMMEDIATE(r5, 28 << 8 | MD_TWAM)
#ifdef CONFIG_PIN_TLB_DATA
LOAD_REG_IMMEDIATE(r6, PAGE_OFFSET)
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index 231ca95f9ffb..19a3eec1d8c5 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -186,8 +186,7 @@ void mmu_mark_initmem_nx(void)
mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, false);
mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
- if (IS_ENABLED(CONFIG_PIN_TLB_TEXT))
- mmu_pin_tlb(block_mapped_ram, false);
+ mmu_pin_tlb(block_mapped_ram, false);
}
#ifdef CONFIG_STRICT_KERNEL_RWX
diff --git a/arch/powerpc/platforms/8xx/Kconfig b/arch/powerpc/platforms/8xx/Kconfig
index cdda034733ff..1a8400bfbe82 100644
--- a/arch/powerpc/platforms/8xx/Kconfig
+++ b/arch/powerpc/platforms/8xx/Kconfig
@@ -202,13 +202,6 @@ config PIN_TLB_IMMR
CONFIG_PIN_TLB_DATA is also selected, it will reduce
CONFIG_PIN_TLB_DATA to 24 Mbytes.
-config PIN_TLB_TEXT
- bool "Pinned TLB for TEXT"
- depends on PIN_TLB
- default y
- help
- This pins kernel text with 8M pages.
-
endmenu
endmenu
--
2.25.0
^ permalink raw reply related
* [PATCH v1 5/6] powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in DTLB miss exception
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>
Use SPRN_SPRG_SCRATCH2 in DTLB miss exception instead of DAR
in order to be similar to ITLB miss exception.
This also simplifies mpc8xx_pmu_del()
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_8xx.S | 9 ++++-----
arch/powerpc/perf/8xx-pmu.c | 19 +++++++------------
2 files changed, 11 insertions(+), 17 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 45239b06b6ce..35707e86c5f3 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -247,7 +247,7 @@ InstructionTLBMiss:
. = 0x1200
DataStoreTLBMiss:
- mtspr SPRN_DAR, r10
+ mtspr SPRN_SPRG_SCRATCH2, r10
mtspr SPRN_M_TW, r11
mfcr r11
@@ -286,11 +286,11 @@ DataStoreTLBMiss:
li r11, RPN_PATTERN
rlwimi r10, r11, 0, 24, 27 /* Set 24-27 */
mtspr SPRN_MD_RPN, r10 /* Update TLB entry */
+ mtspr SPRN_DAR, r11 /* Tag DAR */
/* Restore registers */
-0: mfspr r10, SPRN_DAR
- mtspr SPRN_DAR, r11 /* Tag DAR */
+0: mfspr r10, SPRN_SPRG_SCRATCH2
mfspr r11, SPRN_M_TW
rfi
patch_site 0b, patch__dtlbmiss_exit_1
@@ -300,8 +300,7 @@ DataStoreTLBMiss:
0: lwz r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0)
addi r10, r10, 1
stw r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0)
- mfspr r10, SPRN_DAR
- mtspr SPRN_DAR, r11 /* Tag DAR */
+ mfspr r10, SPRN_SPRG_SCRATCH2
mfspr r11, SPRN_M_TW
rfi
#endif
diff --git a/arch/powerpc/perf/8xx-pmu.c b/arch/powerpc/perf/8xx-pmu.c
index 02db58c7427a..93004ee586a1 100644
--- a/arch/powerpc/perf/8xx-pmu.c
+++ b/arch/powerpc/perf/8xx-pmu.c
@@ -153,6 +153,11 @@ static void mpc8xx_pmu_read(struct perf_event *event)
static void mpc8xx_pmu_del(struct perf_event *event, int flags)
{
+ struct ppc_inst insn;
+
+ /* mfspr r10, SPRN_SPRG_SCRATCH2 */
+ insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) | __PPC_SPR(SPRN_SPRG_SCRATCH2));
+
mpc8xx_pmu_read(event);
/* If it was the last user, stop counting to avoid useles overhead */
@@ -164,22 +169,12 @@ static void mpc8xx_pmu_del(struct perf_event *event, int flags)
mtspr(SPRN_ICTRL, 7);
break;
case PERF_8xx_ID_ITLB_LOAD_MISS:
- if (atomic_dec_return(&itlb_miss_ref) == 0) {
- /* mfspr r10, SPRN_SPRG_SCRATCH2 */
- struct ppc_inst insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) |
- __PPC_SPR(SPRN_SPRG_SCRATCH2));
-
+ if (atomic_dec_return(&itlb_miss_ref) == 0)
patch_instruction_site(&patch__itlbmiss_exit_1, insn);
- }
break;
case PERF_8xx_ID_DTLB_LOAD_MISS:
- if (atomic_dec_return(&dtlb_miss_ref) == 0) {
- /* mfspr r10, SPRN_DAR */
- struct ppc_inst insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) |
- __PPC_SPR(SPRN_DAR));
-
+ if (atomic_dec_return(&dtlb_miss_ref) == 0)
patch_instruction_site(&patch__dtlbmiss_exit_1, insn);
- }
break;
}
}
--
2.25.0
^ permalink raw reply related
* [PATCH v1 6/6] powerpc/ppc-opcode: Add PPC_RAW_MFSPR()
From: Christophe Leroy @ 2020-11-24 15:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu>
Add PPC_RAW_MFSPR() to replace open coding done in 8xx-pmu.c
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/ppc-opcode.h | 3 ++-
arch/powerpc/perf/8xx-pmu.c | 5 +----
2 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index a6e3700c4566..da6f300e9788 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -230,7 +230,6 @@
#define PPC_INST_POPCNTB_MASK 0xfc0007fe
#define PPC_INST_RFEBB 0x4c000124
#define PPC_INST_RFID 0x4c000024
-#define PPC_INST_MFSPR 0x7c0002a6
#define PPC_INST_MFSPR_DSCR 0x7c1102a6
#define PPC_INST_MFSPR_DSCR_MASK 0xfc1ffffe
#define PPC_INST_MTSPR_DSCR 0x7c1103a6
@@ -507,6 +506,8 @@
#define PPC_RAW_NEG(d, a) (0x7c0000d0 | ___PPC_RT(d) | ___PPC_RA(a))
+#define PPC_RAW_MFSPR(d, spr) (0x7c0002a6 | ___PPC_RT(d) | __PPC_SPR(spr))
+
/* Deal with instructions that older assemblers aren't aware of */
#define PPC_BCCTR_FLUSH stringify_in_c(.long PPC_INST_BCCTR_FLUSH)
#define PPC_CP_ABORT stringify_in_c(.long PPC_RAW_CP_ABORT)
diff --git a/arch/powerpc/perf/8xx-pmu.c b/arch/powerpc/perf/8xx-pmu.c
index 93004ee586a1..f970d1510d3d 100644
--- a/arch/powerpc/perf/8xx-pmu.c
+++ b/arch/powerpc/perf/8xx-pmu.c
@@ -153,10 +153,7 @@ static void mpc8xx_pmu_read(struct perf_event *event)
static void mpc8xx_pmu_del(struct perf_event *event, int flags)
{
- struct ppc_inst insn;
-
- /* mfspr r10, SPRN_SPRG_SCRATCH2 */
- insn = ppc_inst(PPC_INST_MFSPR | __PPC_RS(R10) | __PPC_SPR(SPRN_SPRG_SCRATCH2));
+ struct ppc_inst insn = ppc_inst(PPC_RAW_MFSPR(10, SPRN_SPRG_SCRATCH2));
mpc8xx_pmu_read(event);
--
2.25.0
^ permalink raw reply related
* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
From: Liang, Kan @ 2020-11-24 16:04 UTC (permalink / raw)
To: Madhavan Srinivasan, Namhyung Kim, Michael Ellerman
Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, Jiri Olsa, linux-kernel,
Stephane Eranian, Paul Mackerras, Arnaldo Carvalho de Melo,
linuxppc-dev, Ingo Molnar, Gabriel Marin
In-Reply-To: <9657dc9f-e1a9-eb7e-8ac2-a108416d5a10@linux.ibm.com>
On 11/24/2020 12:42 AM, Madhavan Srinivasan wrote:
>
> On 11/24/20 10:21 AM, Namhyung Kim wrote:
>> Hello,
>>
>> On Mon, Nov 23, 2020 at 8:00 PM Michael Ellerman <mpe@ellerman.id.au>
>> wrote:
>>> Namhyung Kim <namhyung@kernel.org> writes:
>>>> Hi Peter and Kan,
>>>>
>>>> (Adding PPC folks)
>>>>
>>>> On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org>
>>>> wrote:
>>>>> Hello,
>>>>>
>>>>> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan
>>>>> <kan.liang@linux.intel.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
>>>>>>> On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
>>>>>>>
>>>>>>>> - When the large PEBS was introduced (9c964efa4330), the
>>>>>>>> sched_task() should
>>>>>>>> be invoked to flush the PEBS buffer in each context switch.
>>>>>>>> However, The
>>>>>>>> perf_sched_events in account_event() is not updated accordingly.
>>>>>>>> The
>>>>>>>> perf_event_task_sched_* never be invoked for a pure per-CPU
>>>>>>>> context. Only
>>>>>>>> per-task event works.
>>>>>>>> At that time, the perf_pmu_sched_task() is outside of
>>>>>>>> perf_event_context_sched_in/out. It means that perf has to double
>>>>>>>> perf_pmu_disable() for per-task event.
>>>>>>>> - The patch 1 tries to fix broken per-CPU events. The CPU
>>>>>>>> context cannot be
>>>>>>>> retrieved from the task->perf_event_ctxp. So it has to be
>>>>>>>> tracked in the
>>>>>>>> sched_cb_list. Yes, the code is very similar to the original
>>>>>>>> codes, but it
>>>>>>>> is actually the new code for per-CPU events. The optimization
>>>>>>>> for per-task
>>>>>>>> events is still kept.
>>>>>>>> For the case, which has both a CPU context and a task
>>>>>>>> context, yes, the
>>>>>>>> __perf_pmu_sched_task() in this patch is not invoked. Because the
>>>>>>>> sched_task() only need to be invoked once in a context switch. The
>>>>>>>> sched_task() will be eventually invoked in the task context.
>>>>>>> The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB
>>>>>>> and
>>>>>>> only set that for large pebs. Are you sure the other users (Intel
>>>>>>> LBR
>>>>>>> and PowerPC BHRB) don't need it?
>>>>>> I didn't set it for LBR, because the perf_sched_events is always
>>>>>> enabled
>>>>>> for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
>>>>>> for LBR.
>>>>>>
>>>>>> if (has_branch_stack(event))
>>>>>> inc = true;
>>>>>>
>>>>>>> If they indeed do not require the pmu::sched_task() callback for CPU
>>>>>>> events, then I still think the whole perf_sched_cb_{inc,dec}()
>>>>>>> interface
>>>>>> No, LBR requires the pmu::sched_task() callback for CPU events.
>>>>>>
>>>>>> Now, The LBR registers have to be reset in sched in even for CPU
>>>>>> events.
>>>>>>
>>>>>> To fix the shorter LBR callstack issue for CPU events, we also
>>>>>> need to
>>>>>> save/restore LBRs in pmu::sched_task().
>>>>>> https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/
>>>>>>
>>>>>>
>>>>>>> is confusing at best.
>>>>>>>
>>>>>>> Can't we do something like this instead?
>>>>>>>
>>>>>> I think the below patch may have two issues.
>>>>>> - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as
>>>>>> well) now.
>>>>>> - We may disable the large PEBS later if not all PEBS events support
>>>>>> large PEBS. The PMU need a way to notify the generic code to decrease
>>>>>> the nr_sched_task.
>>>>> Any updates on this? I've reviewed and tested Kan's patches
>>>>> and they all look good.
>>>>>
>>>>> Maybe we can talk to PPC folks to confirm the BHRB case?
>>>> Can we move this forward? I saw patch 3/3 also adds
>>>> PERF_ATTACH_SCHED_CB
>>>> for PowerPC too. But it'd be nice if ppc folks can confirm the change.
>>> Sorry I've read the whole thread, but I'm still not entirely sure I
>>> understand the question.
>> Thanks for your time and sorry about not being clear enough.
>>
>> We found per-cpu events are not calling pmu::sched_task()
>> on context switches. So PERF_ATTACH_SCHED_CB was
>> added to indicate the core logic that it needs to invoke the
>> callback.
>>
>> The patch 3/3 added the flag to PPC (for BHRB) with other
>> changes (I think it should be split like in the patch 2/3) and
>> want to get ACKs from the PPC folks.
>
> Sorry for delay.
>
> I guess first it will be better to split the ppc change to a separate
> patch,
Both PPC and X86 invokes the perf_sched_cb_inc() directly. The patch
changes the parameters of the perf_sched_cb_inc(). I think we have to
update the PPC and X86 codes together. Otherwise, there will be a
compile error, if someone may only applies the change for the
perf_sched_cb_inc() but forget to applies the changes in PPC or X86
specific codes.
>
> secondly, we are missing the changes needed in the power_pmu_bhrb_disable()
>
> where perf_sched_cb_dec() needs the "state" to be included.
>
Ah, right. The below patch should fix the issue.
diff --git a/arch/powerpc/perf/core-book3s.c
b/arch/powerpc/perf/core-book3s.c
index bced502f64a1..6756d1602a67 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -391,13 +391,18 @@ static void power_pmu_bhrb_enable(struct
perf_event *event)
static void power_pmu_bhrb_disable(struct perf_event *event)
{
struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
+ int state = PERF_SCHED_CB_SW_IN;
if (!ppmu->bhrb_nr)
return;
WARN_ON_ONCE(!cpuhw->bhrb_users);
cpuhw->bhrb_users--;
- perf_sched_cb_dec(event->ctx->pmu);
+
+ if (!(event->attach_state & PERF_ATTACH_TASK))
+ state |= PERF_SCHED_CB_CPU;
+
+ perf_sched_cb_dec(event->ctx->pmu, state);
if (!cpuhw->disabled && !cpuhw->bhrb_users) {
/* BHRB cannot be turned off when other
Thanks,
Kan
^ permalink raw reply related
* eBPF on powerpc
From: Naveen N. Rao @ 2020-11-24 16:35 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <d69650b0-4024-5759-3ccb-ede5c0394500@csgroup.eu>
Hi Christophe,
Christophe Leroy wrote:
> Hi Naveen,
>
> Few years ago, you implemented eBPF on PPC64.
>
> Is there any reason for implementing it for PPC64 only ?
I focused on ppc64 since eBPF is a 64-bit VM and it was more
straight-forward to target.
> Is there something that makes it impossible to have eBPF for PPC32 as
> well ?
No, I just wasn't sure if it would be performant enough to warrant it.
Since then however, there have been arm32 and riscv 32-bit JIT
implementations and atleast the arm32 JIT seems to be showing ~50%
better performance compared to the interpreter (*). So, it would be
worthwhile to add support for ppc32.
Note that there might be a few instructions which would be difficult to
support on 32-bit, but those can fallback to the interpreter, while
allowing other programs to be JIT'ed.
- Naveen
(*)
http://lkml.kernel.org/r/CAGXu5jLYunVCJGCfHPebKDaoQ71hdMGq4HhdDxTYpBQw_HXUYQ@mail.gmail.com
(*) http://lkml.kernel.org/r/b63fae4b-cb74-1928-b210-80914f3c8995@fb.com
(*) http://lkml.kernel.org/r/20200305050207.4159-1-luke.r.nels@gmail.com
^ permalink raw reply
* [PATCH net 0/2] ibmvnic: Bug fixes for queue descriptor processing
From: Thomas Falcon @ 2020-11-24 17:26 UTC (permalink / raw)
To: netdev
Cc: cforno12, ljp, ricklind, dnbanerg, tlfalcon, drt, brking, sukadev,
linuxppc-dev
This series resolves a few issues in the ibmvnic driver's
RX buffer and TX completion processing. The first patch
includes memory barriers to synchronize queue descriptor
reads. The second patch fixes a memory leak that could
occur if the device returns a TX completion with an error
code in the descriptor, in which case the respective socket
buffer and other relevant data structures may not be freed
or updated properly.
Thomas Falcon (2):
ibmvnic: Ensure that SCRQ entry reads are correctly ordered
ibmvnic: Fix TX completion error handling
drivers/net/ethernet/ibm/ibmvnic.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
--
1.8.3.1
^ permalink raw reply
* [PATCH net 1/2] ibmvnic: Ensure that SCRQ entry reads are correctly ordered
From: Thomas Falcon @ 2020-11-24 17:26 UTC (permalink / raw)
To: netdev
Cc: cforno12, ljp, ricklind, dnbanerg, tlfalcon, drt, brking, sukadev,
linuxppc-dev
In-Reply-To: <1606238776-30259-1-git-send-email-tlfalcon@linux.ibm.com>
Ensure that received Subordinate Command-Response Queue (SCRQ)
entries are properly read in order by the driver. These queues
are used in the ibmvnic device to process RX buffer and TX completion
descriptors. dma_rmb barriers have been added after checking for a
pending descriptor to ensure the correct descriptor entry is checked
and after reading the SCRQ descriptor to ensure the entire
descriptor is read before processing.
Fixes: 032c5e828 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmvnic.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 2aa40b2..489ed5e 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2403,6 +2403,8 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
if (!pending_scrq(adapter, adapter->rx_scrq[scrq_num]))
break;
+ /* ensure that we do not prematurely exit the polling loop */
+ dma_rmb();
next = ibmvnic_next_scrq(adapter, adapter->rx_scrq[scrq_num]);
rx_buff =
(struct ibmvnic_rx_buff *)be64_to_cpu(next->
@@ -3098,6 +3100,9 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
unsigned int pool = scrq->pool_index;
int num_entries = 0;
+ /* ensure that the correct descriptor entry is read */
+ dma_rmb();
+
next = ibmvnic_next_scrq(adapter, scrq);
for (i = 0; i < next->tx_comp.num_comps; i++) {
if (next->tx_comp.rcs[i]) {
@@ -3498,6 +3503,9 @@ static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *adapter,
}
spin_unlock_irqrestore(&scrq->lock, flags);
+ /* ensure that the entire SCRQ descriptor is read */
+ dma_rmb();
+
return entry;
}
--
1.8.3.1
^ permalink raw reply related
* [PATCH net 2/2] ibmvnic: Fix TX completion error handling
From: Thomas Falcon @ 2020-11-24 17:26 UTC (permalink / raw)
To: netdev
Cc: cforno12, ljp, ricklind, dnbanerg, tlfalcon, drt, brking, sukadev,
linuxppc-dev
In-Reply-To: <1606238776-30259-1-git-send-email-tlfalcon@linux.ibm.com>
TX completions received with an error return code are not
being processed properly. When an error code is seen, do not
proceed to the next completion before cleaning up the existing
entry's data structures.
Fixes: 032c5e828 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmvnic.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 489ed5e..7097bcb 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3105,11 +3105,9 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
next = ibmvnic_next_scrq(adapter, scrq);
for (i = 0; i < next->tx_comp.num_comps; i++) {
- if (next->tx_comp.rcs[i]) {
+ if (next->tx_comp.rcs[i])
dev_err(dev, "tx error %x\n",
next->tx_comp.rcs[i]);
- continue;
- }
index = be32_to_cpu(next->tx_comp.correlators[i]);
if (index & IBMVNIC_TSO_POOL_MASK) {
tx_pool = &adapter->tso_pool[pool];
--
1.8.3.1
^ permalink raw reply related
* Re: [PATCH kernel v4 1/8] genirq/ipi: Simplify irq_reserve_ipi
From: Cédric Le Goater @ 2020-11-24 16:54 UTC (permalink / raw)
To: Alexey Kardashevskiy, linux-kernel
Cc: linux-mips, Matt Redfearn, Qais Yousef, Marc Zyngier, x86,
linux-gpio, Oliver O'Halloran, Frederic Barrat,
Thomas Gleixner, Michal Suchánek, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20201124061720.86766-2-aik@ozlabs.ru>
On 11/24/20 7:17 AM, Alexey Kardashevskiy wrote:
> __irq_domain_alloc_irqs() can already handle virq==-1 and free
> descriptors if it failed allocating hardware interrupts so let's skip
> this extra step.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
LGTM,
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Copying the MIPS folks since the IPI interface is only used under arch/mips.
C.
> ---
> kernel/irq/ipi.c | 16 +++-------------
> 1 file changed, 3 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/irq/ipi.c b/kernel/irq/ipi.c
> index 43e3d1be622c..1b2807318ea9 100644
> --- a/kernel/irq/ipi.c
> +++ b/kernel/irq/ipi.c
> @@ -75,18 +75,12 @@ int irq_reserve_ipi(struct irq_domain *domain,
> }
> }
>
> - virq = irq_domain_alloc_descs(-1, nr_irqs, 0, NUMA_NO_NODE, NULL);
> - if (virq <= 0) {
> - pr_warn("Can't reserve IPI, failed to alloc descs\n");
> - return -ENOMEM;
> - }
> -
> - virq = __irq_domain_alloc_irqs(domain, virq, nr_irqs, NUMA_NO_NODE,
> - (void *) dest, true, NULL);
> + virq = __irq_domain_alloc_irqs(domain, -1, nr_irqs, NUMA_NO_NODE,
> + (void *) dest, false, NULL);
>
> if (virq <= 0) {
> pr_warn("Can't reserve IPI, failed to alloc hw irqs\n");
> - goto free_descs;
> + return -EBUSY;
> }
>
> for (i = 0; i < nr_irqs; i++) {
> @@ -96,10 +90,6 @@ int irq_reserve_ipi(struct irq_domain *domain,
> irq_set_status_flags(virq + i, IRQ_NO_BALANCING);
> }
> return virq;
> -
> -free_descs:
> - irq_free_descs(virq, nr_irqs);
> - return -EBUSY;
> }
>
> /**
>
^ permalink raw reply
* Re: eBPF on powerpc
From: Christophe Leroy @ 2020-11-24 18:45 UTC (permalink / raw)
To: Naveen N. Rao, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1606234192.xvkulhfr3y.naveen@linux.ibm.com>
Le 24/11/2020 à 17:35, Naveen N. Rao a écrit :
> Hi Christophe,
>
> Christophe Leroy wrote:
>> Hi Naveen,
>>
>> Few years ago, you implemented eBPF on PPC64.
>>
>> Is there any reason for implementing it for PPC64 only ?
>
> I focused on ppc64 since eBPF is a 64-bit VM and it was more straight-forward to target.
>
>> Is there something that makes it impossible to have eBPF for PPC32 as well ?
>
> No, I just wasn't sure if it would be performant enough to warrant it. Since then however, there
> have been arm32 and riscv 32-bit JIT implementations and atleast the arm32 JIT seems to be showing
> ~50% better performance compared to the interpreter (*). So, it would be worthwhile to add support
> for ppc32.
That's great.
I know close to nothing about eBPF. Is there any interesting documentation on it somewhere that
would allow me to easily understand how it works and allow me to extend the 64 bit powerpc to 32 bits ?
>
> Note that there might be a few instructions which would be difficult to support on 32-bit, but those
> can fallback to the interpreter, while allowing other programs to be JIT'ed.
>
>
> - Naveen
>
> (*) http://lkml.kernel.org/r/CAGXu5jLYunVCJGCfHPebKDaoQ71hdMGq4HhdDxTYpBQw_HXUYQ@mail.gmail.com
> (*) http://lkml.kernel.org/r/b63fae4b-cb74-1928-b210-80914f3c8995@fb.com
> (*) http://lkml.kernel.org/r/20200305050207.4159-1-luke.r.nels@gmail.com
Christophe
^ permalink raw reply
* Re: [PATCH 1/3] powerpc: Make NUMA depend on SMP
From: Randy Dunlap @ 2020-11-24 19:46 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev; +Cc: srikar
In-Reply-To: <20201124120547.1940635-1-mpe@ellerman.id.au>
On 11/24/20 4:05 AM, Michael Ellerman wrote:
> Our Kconfig allows NUMA to be enabled without SMP, but none of
> our defconfigs use that combination. This means it can easily be
> broken inadvertently by code changes, which has happened recently.
>
> Although it's theoretically possible to have a machine with a single
> CPU and multiple memory nodes, I can't think of any real systems where
> that's the case. Even so if such a system exists, it can just run an
> SMP kernel anyway.
>
> So to avoid the need to add extra #ifdefs and/or build breaks, make
> NUMA depend on SMP.
>
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Thanks.
> ---
> arch/powerpc/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e9f13fe08492..a22db3db6b96 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -660,7 +660,7 @@ config IRQ_ALL_CPUS
>
> config NUMA
> bool "NUMA support"
> - depends on PPC64
> + depends on PPC64 && SMP
> default y if SMP && PPC_PSERIES
>
> config NODES_SHIFT
>
--
~Randy
^ permalink raw reply
* Re: [PATCH 3/3] powerpc: Update NUMA Kconfig description & help text
From: Randy Dunlap @ 2020-11-24 19:47 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev; +Cc: srikar
In-Reply-To: <20201124120547.1940635-3-mpe@ellerman.id.au>
On 11/24/20 4:05 AM, Michael Ellerman wrote:
> Update the NUMA Kconfig description to match other architectures, and
> add some help text. Shamelessly borrowed from x86/arm64.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Thanks.
> ---
> arch/powerpc/Kconfig | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 4d688b426353..7f4995b245a3 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -659,9 +659,15 @@ config IRQ_ALL_CPUS
> reported with SMP Power Macintoshes with this option enabled.
>
> config NUMA
> - bool "NUMA support"
> + bool "NUMA Memory Allocation and Scheduler Support"
> depends on PPC64 && SMP
> default y if PPC_PSERIES || PPC_POWERNV
> + help
> + Enable NUMA (Non-Uniform Memory Access) support.
> +
> + The kernel will try to allocate memory used by a CPU on the
> + local memory controller of the CPU and add some more
> + NUMA awareness to the kernel.
>
> config NODES_SHIFT
> int
>
--
~Randy
^ permalink raw reply
* Re: eBPF on powerpc
From: Naveen N. Rao @ 2020-11-24 19:51 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <4d588481-0c8d-6adf-53f5-e7332ddca7c4@csgroup.eu>
Christophe Leroy wrote:
>
>
> Le 24/11/2020 à 17:35, Naveen N. Rao a écrit :
>> Hi Christophe,
>>
>> Christophe Leroy wrote:
>>> Hi Naveen,
>>>
>>> Few years ago, you implemented eBPF on PPC64.
>>>
>>> Is there any reason for implementing it for PPC64 only ?
>>
>> I focused on ppc64 since eBPF is a 64-bit VM and it was more straight-forward to target.
>>
>>> Is there something that makes it impossible to have eBPF for PPC32 as well ?
>>
>> No, I just wasn't sure if it would be performant enough to warrant it. Since then however, there
>> have been arm32 and riscv 32-bit JIT implementations and atleast the arm32 JIT seems to be showing
>> ~50% better performance compared to the interpreter (*). So, it would be worthwhile to add support
>> for ppc32.
>
> That's great.
>
> I know close to nothing about eBPF. Is there any interesting documentation on it somewhere that
> would allow me to easily understand how it works and allow me to extend the 64 bit powerpc to 32 bits ?
I don't think there was ever a formal spec written for the eBPF VM. Here
are a few resources which should help, alongside the existing JIT
implementations:
- BPF Kernel Internals:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/filter.rst#n604
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/bpf
- BPF and XDP Reference Guide: https://docs.cilium.io/en/stable/bpf/
- Naveen
^ permalink raw reply
* [PATCH v1 2/3] powerpc/32s: In add_hash_page(), calculate VSID later
From: Christophe Leroy @ 2020-11-24 19:51 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <6470ab99e58c84a5445af43ce4d1d772b0dc3e93.1606247495.git.christophe.leroy@csgroup.eu>
VSID is only for create_hpte(). When _PAGE_HASHPTE is
already set, add_hash_page() bails out without calling
create_hpte() and doesn't need the value of VSID.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/mm/book3s32/hash_low.S | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/mm/book3s32/hash_low.S b/arch/powerpc/mm/book3s32/hash_low.S
index f964fd34dad9..1366e8e4fc05 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -188,12 +188,6 @@ _GLOBAL(add_hash_page)
mflr r0
stw r0,4(r1)
- /* Convert context and va to VSID */
- mulli r3,r3,897*16 /* multiply context by context skew */
- rlwinm r0,r4,4,28,31 /* get ESID (top 4 bits of va) */
- mulli r0,r0,0x111 /* multiply by ESID skew */
- add r3,r3,r0 /* note create_hpte trims to 24 bits */
-
#ifdef CONFIG_SMP
lwz r8,TASK_CPU(r2) /* to go in mmu_hash_lock */
oris r8,r8,12
@@ -257,6 +251,12 @@ _GLOBAL(add_hash_page)
stwcx. r5,0,r8
bne- 1b
+ /* Convert context and va to VSID */
+ mulli r3,r3,897*16 /* multiply context by context skew */
+ rlwinm r0,r4,4,28,31 /* get ESID (top 4 bits of va) */
+ mulli r0,r0,0x111 /* multiply by ESID skew */
+ add r3,r3,r0 /* note create_hpte trims to 24 bits */
+
bl create_hpte
9:
--
2.25.0
^ permalink raw reply related
* [PATCH v1 1/3] powerpc/32s: Remove unused counters incremented by create_hpte()
From: Christophe Leroy @ 2020-11-24 19:51 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
primary_pteg_full and htab_hash_searches are not used.
Remove them.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/mm/book3s32/hash_low.S | 15 ---------------
1 file changed, 15 deletions(-)
diff --git a/arch/powerpc/mm/book3s32/hash_low.S b/arch/powerpc/mm/book3s32/hash_low.S
index 9a56ba4f68f2..f964fd34dad9 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -359,11 +359,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
beq+ 10f /* no PTE: go look for an empty slot */
tlbie r4
- lis r4, (htab_hash_searches - PAGE_OFFSET)@ha
- lwz r6, (htab_hash_searches - PAGE_OFFSET)@l(r4)
- addi r6,r6,1 /* count how many searches we do */
- stw r6, (htab_hash_searches - PAGE_OFFSET)@l(r4)
-
/* Search the primary PTEG for a PTE whose 1st (d)word matches r5 */
mtctr r0
addi r4,r3,-HPTE_SIZE
@@ -393,12 +388,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
bdnzf 2,1b /* loop while ctr != 0 && !cr0.eq */
beq+ .Lfound_empty
- /* update counter of times that the primary PTEG is full */
- lis r4, (primary_pteg_full - PAGE_OFFSET)@ha
- lwz r6, (primary_pteg_full - PAGE_OFFSET)@l(r4)
- addi r6,r6,1
- stw r6, (primary_pteg_full - PAGE_OFFSET)@l(r4)
-
patch_site 0f, patch__hash_page_C
/* Search the secondary PTEG for an empty slot */
ori r5,r5,PTE_H /* set H (secondary hash) bit */
@@ -491,10 +480,6 @@ _ASM_NOKPROBE_SYMBOL(create_hpte)
.align 2
next_slot:
.space 4
-primary_pteg_full:
- .space 4
-htab_hash_searches:
- .space 4
.previous
/*
--
2.25.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox