LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH linux-next] module: remove duplicate include in interrupt.c
From: Christophe Leroy @ 2021-08-16 11:57 UTC
  To: cgel.zte, mpe
  Cc: Lv Ruyi, linux-kernel, npiggin, paulus, linuxppc-dev, Zeal Robot
In-Reply-To: <20210816113453.126939-1-lv.ruyi@zte.com.cn>



On 16/08/2021 at 13:34, cgel.zte@gmail.com wrote:
> From: Lv Ruyi <lv.ruyi@zte.com.cn>
> 
> 'asm/interrupt.h' included in 'interrupt.c' is duplicated.

This patch has already been submitted at least half a dozen times.

You should maintain alphabetical order in the include list.

But please don't post it again; it is already in the pipeline, see
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/1624329437-84730-1-git-send-email-jiapeng.chong@linux.alibaba.com/

Next time, please check https://patchwork.ozlabs.org/project/linuxppc-dev/list/ before submitting
a new patch.

Thanks
Christophe

> 
> Reported-by: Zeal Robot <zealci@zte.com.cn>
> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
> ---
>   arch/powerpc/kernel/interrupt.c | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index 21bbd615ca41..8a936515e4e4 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -10,7 +10,6 @@
>   #include <asm/cputime.h>
>   #include <asm/interrupt.h>
>   #include <asm/hw_irq.h>
> -#include <asm/interrupt.h>
>   #include <asm/kprobes.h>
>   #include <asm/paca.h>
>   #include <asm/ptrace.h>
> 

* [PATCH linux-next] module: remove duplicate include in interrupt.c
From: cgel.zte @ 2021-08-16 11:34 UTC
  To: mpe; +Cc: Lv Ruyi, linux-kernel, paulus, npiggin, linuxppc-dev, Zeal Robot

From: Lv Ruyi <lv.ruyi@zte.com.cn>

'asm/interrupt.h' included in 'interrupt.c' is duplicated.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
---
 arch/powerpc/kernel/interrupt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 21bbd615ca41..8a936515e4e4 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -10,7 +10,6 @@
 #include <asm/cputime.h>
 #include <asm/interrupt.h>
 #include <asm/hw_irq.h>
-#include <asm/interrupt.h>
 #include <asm/kprobes.h>
 #include <asm/paca.h>
 #include <asm/ptrace.h>
-- 
2.25.1


* Re: [PATCH v2 1/2] sched/topology: Skip updating masks for non-online nodes
From: Srikar Dronamraju @ 2021-08-16 10:33 UTC
  To: Valentin Schneider, Peter Zijlstra
  Cc: Nathan Lynch, Gautham R Shenoy, Vincent Guittot, Rik van Riel,
	linuxppc-dev, Geetika Moolchandani, LKML, Dietmar Eggemann,
	Thomas Gleixner, Laurent Dufour, Mel Gorman, Ingo Molnar
In-Reply-To: <20210810114727.GB21942@linux.vnet.ibm.com>

> 
> Your version is much much better than mine.
> And I have verified that it works as expected.
> 
> 

Hey Peter/Valentin

Are we waiting for any more feedback/testing for this?


-- 
Thanks and Regards
Srikar Dronamraju

* Re: [PATCH 0/2] powerpc: mpc855_ads defconfig fixes
From: Christophe Leroy @ 2021-08-16  8:50 UTC
  To: Joel Stanley, Michael Ellerman, linuxppc-dev
In-Reply-To: <20210816083126.2294522-1-joel@jms.id.au>



On 16/08/2021 at 10:31, Joel Stanley wrote:
> The first was a build warning I noticed when testing something
> unrelated.
> 
> I took a moment to look into it, and came up with the second patch which
> updates the defconfig to make it easier to maintain in the future
> 
> It also fixes a regression where the MTD partition support dropped out
> of the config. Given noone noticed the regression since v4.20 was
> released, perhaps it could be left disabled?

Most likely nobody is using this board anymore, but it is good to have it for CI builds,
so we should keep that kind of config.


> 
> Joel Stanley (2):
>    powerpc/config: Fix IPV6 warning in mpc855_ads
>    powerpc/configs: Regenerate mpc885_ads_defconfig
> 
>   arch/powerpc/configs/mpc885_ads_defconfig | 49 +++++++++++------------
>   1 file changed, 23 insertions(+), 26 deletions(-)
> 

* Re: [PATCH 2/2] powerpc/configs: Regenerate mpc885_ads_defconfig
From: Christophe Leroy @ 2021-08-16  8:49 UTC
  To: Joel Stanley, Michael Ellerman, linuxppc-dev
In-Reply-To: <20210816083126.2294522-3-joel@jms.id.au>



On 16/08/2021 at 10:31, Joel Stanley wrote:
> Regenrate atop v5.14-rc6.

There are typos in this commit message (for example "Regenrate").

Do you mean you re-ran "make savedefconfig"?

> 
> The chagnes are mostly re-ordering, except for the following which fall
> out due to dependenacies:
> 
>   - CONFIG_DEBUG_KERNEL=y selected by EXPERT
> 
>   - CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 which is the default
>     setting
> 
> CONFIG_MTD_PHYSMAP_OF is not longer enabled, as it depends on
> MTD_PHYSMAP which is not enabled. This is a regression from commit
> 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c"),
> which added the extra dependency. Add CONFIG_MTD_PHYSMAP=y so this stays
> in the config.
> 
> Signed-off-by: Joel Stanley <joel@jms.id.au>
> ---
>   arch/powerpc/configs/mpc885_ads_defconfig | 47 +++++++++++------------
>   1 file changed, 23 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig
> index 5cd17adf903f..c74dc76b1d0d 100644
> --- a/arch/powerpc/configs/mpc885_ads_defconfig
> +++ b/arch/powerpc/configs/mpc885_ads_defconfig
> @@ -1,19 +1,30 @@
> -CONFIG_PPC_8xx=y
>   # CONFIG_SWAP is not set
>   CONFIG_SYSVIPC=y
>   CONFIG_NO_HZ=y
>   CONFIG_HIGH_RES_TIMERS=y
> +CONFIG_BPF_JIT=y
> +CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
>   CONFIG_LOG_BUF_SHIFT=14
>   CONFIG_EXPERT=y
>   # CONFIG_ELF_CORE is not set
>   # CONFIG_BASE_FULL is not set
>   # CONFIG_FUTEX is not set
> +CONFIG_PERF_EVENTS=y
>   # CONFIG_VM_EVENT_COUNTERS is not set
> -# CONFIG_BLK_DEV_BSG is not set
> -CONFIG_PARTITION_ADVANCED=y
> +CONFIG_PPC_8xx=y
> +CONFIG_8xx_GPIO=y
> +CONFIG_SMC_UCODE_PATCH=y
> +CONFIG_PIN_TLB=y
>   CONFIG_GEN_RTC=y
>   CONFIG_HZ_100=y
> +CONFIG_MATH_EMULATION=y
> +CONFIG_PPC_16K_PAGES=y
> +CONFIG_ADVANCED_OPTIONS=y
>   # CONFIG_SECCOMP is not set
> +CONFIG_STRICT_KERNEL_RWX=y
> +CONFIG_MODULES=y
> +# CONFIG_BLK_DEV_BSG is not set
> +CONFIG_PARTITION_ADVANCED=y
>   CONFIG_NET=y
>   CONFIG_PACKET=y
>   CONFIG_UNIX=y
> @@ -33,6 +44,7 @@ CONFIG_MTD_CFI_GEOMETRY=y
>   # CONFIG_MTD_CFI_I2 is not set
>   CONFIG_MTD_CFI_I4=y
>   CONFIG_MTD_CFI_AMDSTD=y
> +CONFIG_MTD_PHYSMAP=y
>   CONFIG_MTD_PHYSMAP_OF=y
>   # CONFIG_BLK_DEV is not set
>   CONFIG_NETDEVICES=y
> @@ -45,38 +57,25 @@ CONFIG_DAVICOM_PHY=y
>   # CONFIG_LEGACY_PTYS is not set
>   CONFIG_SERIAL_CPM=y
>   CONFIG_SERIAL_CPM_CONSOLE=y
> +CONFIG_SPI=y
> +CONFIG_SPI_FSL_SPI=y
>   # CONFIG_HWMON is not set
> +CONFIG_WATCHDOG=y
> +CONFIG_8xxx_WDT=y
>   # CONFIG_USB_SUPPORT is not set
>   # CONFIG_DNOTIFY is not set
>   CONFIG_TMPFS=y
>   CONFIG_CRAMFS=y
>   CONFIG_NFS_FS=y
>   CONFIG_ROOT_NFS=y
> +CONFIG_CRYPTO=y
> +CONFIG_CRYPTO_DEV_TALITOS=y
>   CONFIG_CRC32_SLICEBY4=y
>   CONFIG_DEBUG_INFO=y
>   CONFIG_MAGIC_SYSRQ=y
> -CONFIG_DETECT_HUNG_TASK=y
> -CONFIG_PPC_16K_PAGES=y
> -CONFIG_DEBUG_KERNEL=y
>   CONFIG_DEBUG_FS=y
> -CONFIG_PPC_PTDUMP=y
> -CONFIG_MODULES=y
> -CONFIG_SPI=y
> -CONFIG_SPI_FSL_SPI=y
> -CONFIG_CRYPTO=y
> -CONFIG_CRYPTO_DEV_TALITOS=y
> -CONFIG_8xx_GPIO=y
> -CONFIG_WATCHDOG=y
> -CONFIG_8xxx_WDT=y
> -CONFIG_SMC_UCODE_PATCH=y
> -CONFIG_ADVANCED_OPTIONS=y
> -CONFIG_PIN_TLB=y
> -CONFIG_PERF_EVENTS=y
> -CONFIG_MATH_EMULATION=y
> -CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
> -CONFIG_STRICT_KERNEL_RWX=y
> -CONFIG_BPF_JIT=y
>   CONFIG_DEBUG_VM_PGTABLE=y
> +CONFIG_DETECT_HUNG_TASK=y
>   CONFIG_BDI_SWITCH=y
>   CONFIG_PPC_EARLY_DEBUG=y
> -CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008
> +CONFIG_PPC_PTDUMP=y
> 

* Re: [PATCH 1/2] powerpc/config: Fix IPV6 warning in mpc855_ads
From: Christophe Leroy @ 2021-08-16  8:45 UTC
  To: Joel Stanley, Michael Ellerman, linuxppc-dev
In-Reply-To: <20210816083126.2294522-2-joel@jms.id.au>



On 16/08/2021 at 10:31, Joel Stanley wrote:
> When building this config there's a warning:
> 
>    79:warning: override: reassigning to symbol IPV6
> 
> Commit 9a1762a4a4ff ("powerpc/8xx: Update mpc885_ads_defconfig to
> improve CI") added CONFIG_IPV6=y, but left '# CONFIG_IPV6 is not set'
> in.
> 
> IPV6 is default y, so remove both to clean up the build.
> 
> Signed-off-by: Joel Stanley <joel@jms.id.au>

Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>

> ---
>   arch/powerpc/configs/mpc885_ads_defconfig | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig
> index d21f266cea9a..5cd17adf903f 100644
> --- a/arch/powerpc/configs/mpc885_ads_defconfig
> +++ b/arch/powerpc/configs/mpc885_ads_defconfig
> @@ -21,7 +21,6 @@ CONFIG_INET=y
>   CONFIG_IP_MULTICAST=y
>   CONFIG_IP_PNP=y
>   CONFIG_SYN_COOKIES=y
> -# CONFIG_IPV6 is not set
>   # CONFIG_FW_LOADER is not set
>   CONFIG_MTD=y
>   CONFIG_MTD_BLOCK=y
> @@ -76,7 +75,6 @@ CONFIG_PERF_EVENTS=y
>   CONFIG_MATH_EMULATION=y
>   CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
>   CONFIG_STRICT_KERNEL_RWX=y
> -CONFIG_IPV6=y
>   CONFIG_BPF_JIT=y
>   CONFIG_DEBUG_VM_PGTABLE=y
>   CONFIG_BDI_SWITCH=y
> 

* Re: [PATCH 3/3] powerpc/microwatt: CPU doesn't (yet) have speculation bugs
From: Christophe Leroy @ 2021-08-16  8:42 UTC
  To: Joel Stanley, Paul Mackerras, Michael Neuling, Anton Blanchard,
	Michael Ellerman, Nicholas Piggin, linuxppc-dev
  Cc: Daniel Axtens
In-Reply-To: <20210816082403.2293846-4-joel@jms.id.au>



On 16/08/2021 at 10:24, Joel Stanley wrote:
> Signed-off-by: Joel Stanley <joel@jms.id.au>
> ---
>   arch/powerpc/Kconfig | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 663766fbf505..d5af6667c206 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -279,6 +279,7 @@ config PPC_BARRIER_NOSPEC
>   	bool
>   	default y
>   	depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E
> +	depends on !PPC_MICROWATT

I'm not sure it is a good idea to disable it completely when PPC_MICROWATT is selected. Don't we want to
be able to build generic kernels that can run on any book3s/64?

Maybe you should change the default instead, something like:

	bool
	default y if !PPC_MICROWATT
	depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E

>   
>   config EARLY_PRINTK
>   	bool
> 

* Re: [PATCH v2 00/60] KVM: PPC: Book3S HV P9: entry/exit optimisations
From: Athira Rajeev @ 2021-08-16  8:41 UTC
  To: Nicholas Piggin; +Cc: linuxppc-dev, kvm-ppc
In-Reply-To: <20210811160134.904987-1-npiggin@gmail.com>



> On 11-Aug-2021, at 9:30 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> This reduces radix guest full entry/exit latency on POWER9 and POWER10
> by 2x.
> 
> Nested HV guests should see smaller improvements in their L1 entry/exit,
> but this is also combined with most L0 speedups also applying to nested
> entry. nginx localhost throughput test in a SMP nested guest is improved
> about 10% (in a direct guest it doesn't change much because it uses XIVE
> for IPIs) when L0 and L1 are patched.
> 
> It does this in several main ways:
> 
> - Rearrange code to optimise SPR accesses. Mainly, avoid scoreboard
>  stalls.
> 
> - Test SPR values to avoid mtSPRs where possible. mtSPRs are expensive.
> 
> - Reduce mftb. mftb is expensive.
> 
> - Demand fault certain facilities to avoid saving and/or restoring them
>  (at the cost of fault when they are used, but this is mitigated over
>  a number of entries, like the facilities when context switching 
>  processes). PM, TM, and EBB so far.
> 
> - Defer some sequences that are made just in case a guest is interrupted
>  in the middle of a critical section to the case where the guest is
>  scheduled on a different CPU, rather than every time (at the cost of
>  an extra IPI in this case). Namely the tlbsync sequence for radix with
>  GTSE, which is very expensive.
> 
> - Reduce locking, barriers, atomics related to the vcpus-per-vcore > 1
>  handling that the P9 path does not require.
> 
> Changes since v1:
> - Verified DPDES changes still work with msgsndp SMT emulation.
> - Fixed HMI handling bug.
> - Split softpatch handling fixes into smaller pieces.
> - Rebased with Fabiano's latest HV sanitising patches.
> - Fix TM demand faulting bug causing nested guest TM tests to TM Bad
>  Thing the host in rare cases.
> - Re-name new "pmu=" command line option to "pmu_override=" and update
>  documentation wording.

Hi Nick,

For the PMU related changes,

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>

Thanks
Athira

> - Add default=y config option rather than unconditionally removing the
>  L0 nested PMU workaround.
> - Remove unnecessary MSR[RI] updates in entry/exit. Down to about 4700
>  cycles now.
> - Another bugfix from Alexey's testing.
> 
> Changes since RFC:
> - Rebased with Fabiano's HV sanitising patches at the front.
> - Several demand faulting bug fixes mostly relating to nested guests.
> - Removed facility demand-faulting from L0 nested entry/exit handler.
>  Demand faulting is still done in the L1, but not the L0. The reason
>  is to reduce complexity (although it's only a small amount of
>  complexity), reduce demand faulting overhead that may require several
> 
> Fabiano Rosas (3):
>  KVM: PPC: Book3S HV Nested: Sanitise vcpu registers
>  KVM: PPC: Book3S HV Nested: Stop forwarding all HFUs to L1
>  KVM: PPC: Book3S HV Nested: save_hv_return_state does not require trap
>    argument
> 
> Nicholas Piggin (57):
>  KVM: PPC: Book3S HV: Initialise vcpu MSR with MSR_ME
>  KVM: PPC: Book3S HV: Remove TM emulation from POWER7/8 path
>  KVM: PPC: Book3S HV P9: Fixes for TM softpatch interrupt NIP
>  KVM: PPC: Book3S HV Nested: Fix TM softpatch HFAC interrupt emulation
>  KVM: PPC: Book3S HV Nested: Make nested HFSCR state accessible
>  KVM: PPC: Book3S HV Nested: Reflect guest PMU in-use to L0 when guest
>    SPRs are live
>  powerpc/64s: Remove WORT SPR from POWER9/10
>  KMV: PPC: Book3S HV P9: Use set_dec to set decrementer to host
>  KVM: PPC: Book3S HV P9: Use host timer accounting to avoid decrementer
>    read
>  KVM: PPC: Book3S HV P9: Use large decrementer for HDEC
>  KVM: PPC: Book3S HV P9: Reduce mftb per guest entry/exit
>  powerpc/time: add API for KVM to re-arm the host timer/decrementer
>  KVM: PPC: Book3S HV: POWER10 enable HAIL when running radix guests
>  powerpc/64s: Keep AMOR SPR a constant ~0 at runtime
>  KVM: PPC: Book3S HV: Don't always save PMU for guest capable of
>    nesting
>  powerpc/64s: Always set PMU control registers to frozen/disabled when
>    not in use
>  powerpc/64s: Implement PMU override command line option
>  KVM: PPC: Book3S HV P9: Implement PMU save/restore in C
>  KVM: PPC: Book3S HV P9: Factor PMU save/load into context switch
>    functions
>  KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse
>  KVM: PPC: Book3S HV P9: Factor out yield_count increment
>  KVM: PPC: Book3S HV: CTRL SPR does not require read-modify-write
>  KVM: PPC: Book3S HV P9: Move SPRG restore to restore_p9_host_os_sprs
>  KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save
>    host SPRs
>  KVM: PPC: Book3S HV P9: Improve mtmsrd scheduling by delaying MSR[EE]
>    disable
>  KVM: PPC: Book3S HV P9: Add kvmppc_stop_thread to match
>    kvmppc_start_thread
>  KVM: PPC: Book3S HV: Change dec_expires to be relative to guest
>    timebase
>  KVM: PPC: Book3S HV P9: Move TB updates
>  KVM: PPC: Book3S HV P9: Optimise timebase reads
>  KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls
>  KVM: PPC: Book3S HV P9: Only execute mtSPR if the value changed
>  KVM: PPC: Book3S HV P9: Juggle SPR switching around
>  KVM: PPC: Book3S HV P9: Move vcpu register save/restore into functions
>  KVM: PPC: Book3S HV P9: Move host OS save/restore functions to
>    built-in
>  KVM: PPC: Book3S HV P9: Move nested guest entry into its own function
>  KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low
>    level entry
>  KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit
>  KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible
>  KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors
>    that require it
>  KVM: PPC: Book3S HV P9: More SPR speed improvements
>  KVM: PPC: Book3S HV P9: Demand fault EBB facility registers
>  KVM: PPC: Book3S HV P9: Demand fault TM facility registers
>  KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host
>    SPRs
>  KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code
>  KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR
>    SPRs
>  KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed
>  KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit
>  KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry
>  KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry
>  KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving
>  KVM: PPC: Book3S HV P9: Avoid changing MSR[RI] in entry and exit
>  KVM: PPC: Book3S HV P9: Add unlikely annotation for !mmu_ready
>  KVM: PPC: Book3S HV P9: Avoid cpu_in_guest atomics on entry and exit
>  KVM: PPC: Book3S HV P9: Remove most of the vcore logic
>  KVM: PPC: Book3S HV P9: Tidy kvmppc_create_dtl_entry
>  KVM: PPC: Book3S HV P9: Stop using vc->dpdes
>  KVM: PPC: Book3S HV P9: Remove subcore HMI handling
> 
> .../admin-guide/kernel-parameters.txt         |   8 +
> arch/powerpc/include/asm/asm-prototypes.h     |   5 -
> arch/powerpc/include/asm/kvm_asm.h            |   1 +
> arch/powerpc/include/asm/kvm_book3s.h         |   6 +
> arch/powerpc/include/asm/kvm_book3s_64.h      |   6 +-
> arch/powerpc/include/asm/kvm_host.h           |   7 +-
> arch/powerpc/include/asm/kvm_ppc.h            |   1 +
> arch/powerpc/include/asm/pmc.h                |   7 +
> arch/powerpc/include/asm/reg.h                |   3 +-
> arch/powerpc/include/asm/switch_to.h          |   2 +
> arch/powerpc/include/asm/time.h               |  19 +-
> arch/powerpc/kernel/cpu_setup_power.c         |  12 +-
> arch/powerpc/kernel/dt_cpu_ftrs.c             |   8 +-
> arch/powerpc/kernel/process.c                 |  32 +
> arch/powerpc/kernel/time.c                    |  54 +-
> arch/powerpc/kvm/Kconfig                      |  15 +
> arch/powerpc/kvm/book3s_64_mmu_radix.c        |   4 +
> arch/powerpc/kvm/book3s_hv.c                  | 890 ++++++++++--------
> arch/powerpc/kvm/book3s_hv.h                  |  41 +
> arch/powerpc/kvm/book3s_hv_builtin.c          |   2 +
> arch/powerpc/kvm/book3s_hv_hmi.c              |   7 +-
> arch/powerpc/kvm/book3s_hv_interrupts.S       |  13 +-
> arch/powerpc/kvm/book3s_hv_nested.c           | 109 ++-
> arch/powerpc/kvm/book3s_hv_p9_entry.c         | 817 +++++++++++++---
> arch/powerpc/kvm/book3s_hv_ras.c              |  54 ++
> arch/powerpc/kvm/book3s_hv_rmhandlers.S       | 115 +--
> arch/powerpc/kvm/book3s_hv_tm.c               |  61 +-
> arch/powerpc/mm/book3s64/radix_pgtable.c      |  15 -
> arch/powerpc/perf/core-book3s.c               |  35 +
> arch/powerpc/platforms/powernv/idle.c         |  10 +-
> 30 files changed, 1589 insertions(+), 770 deletions(-)
> create mode 100644 arch/powerpc/kvm/book3s_hv.h
> 
> -- 
> 2.23.0
> 


* Re: [PATCH 2/3] powerpc: Fix undefined static key
From: Christophe Leroy @ 2021-08-16  8:39 UTC
  To: Joel Stanley, Paul Mackerras, Michael Neuling, Anton Blanchard,
	Michael Ellerman, Nicholas Piggin, linuxppc-dev
  Cc: Daniel Axtens
In-Reply-To: <20210816082403.2293846-3-joel@jms.id.au>



On 16/08/2021 at 10:24, Joel Stanley wrote:
> When CONFIG_PPC_BARRIER_NOSPEC=n, security.c is not built leading to a
> missing definition of uaccess_flush_key.
> 
>    LD      vmlinux.o
>    MODPOST vmlinux.symvers
>    MODINFO modules.builtin.modinfo
>    GEN     modules.builtin
>    LD      .tmp_vmlinux.kallsyms1
> powerpc64le-linux-gnu-ld: arch/powerpc/kernel/align.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
> powerpc64le-linux-gnu-ld: arch/powerpc/kernel/signal_64.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
> powerpc64le-linux-gnu-ld: arch/powerpc/kernel/process.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
> powerpc64le-linux-gnu-ld: arch/powerpc/kernel/traps.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
> powerpc64le-linux-gnu-ld: arch/powerpc/kernel/hw_breakpoint_constraints.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
> powerpc64le-linux-gnu-ld: arch/powerpc/kernel/ptrace/ptrace.o:(.toc+0x0): more undefined references to `uaccess_flush_key' follow
> make[1]: *** [Makefile:1176: vmlinux] Error 1
> 
> Hack one in to fix the build.
> 
> Signed-off-by: Joel Stanley <joel@jms.id.au>
> ---
>   arch/powerpc/include/asm/security_features.h | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h
> index 792eefaf230b..46ade7927a4c 100644
> --- a/arch/powerpc/include/asm/security_features.h
> +++ b/arch/powerpc/include/asm/security_features.h
> @@ -39,6 +39,9 @@ static inline bool security_ftr_enabled(u64 feature)
>   	return !!(powerpc_security_features & feature);
>   }
>   
> +#ifndef CONFIG_PPC_BARRIER_NOSPEC
> +DEFINE_STATIC_KEY_FALSE(uaccess_flush_key);
> +#endif

It will then be redefined in every file that includes asm/security_features.h.

You can't use a DEFINE_ macro in a header.

You have to fix the problem at its source.

The cleanest way I see is to modify asm/book3s/64/kup.h to use something like:

if (IS_ENABLED(CONFIG_PPC_BARRIER_NOSPEC) && static_branch_unlikely(&uaccess_flush_key))
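
To spell out the header point, here is a small userspace sketch. All the sketch_ names and the SKETCH_BARRIER_NOSPEC constant are made up for illustration, and a plain bool stands in for the static key, so this is not the kernel API. The object is declared in the header part, defined exactly once in a single C file, and the caller tests the compile-time switch first:

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for IS_ENABLED(CONFIG_PPC_BARRIER_NOSPEC):
 * a compile-time constant that is 0 when the option is off.
 */
#define SKETCH_BARRIER_NOSPEC 0

/* Header part: declaration only, safe to include from many files. */
extern bool sketch_uaccess_flush_key;

/*
 * C file part: the one and only definition. A plain bool stands in for
 * the static key here. Putting this line in a header would give every
 * file that includes it its own definition of the same symbol.
 */
bool sketch_uaccess_flush_key = true;

/*
 * Caller: the constant is tested first, so with the option off the key
 * is never read and the result is always false.
 */
static inline bool sketch_uaccess_flush_wanted(void)
{
	return SKETCH_BARRIER_NOSPEC && sketch_uaccess_flush_key;
}
```

Whether the real symbol can then be left undefined when the option is off depends on the compiler discarding the dead branch, which this sketch does not try to show.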



>   
>   // Features indicating support for Spectre/Meltdown mitigations
>   
> 

* [PATCH 2/2] powerpc/configs: Regenerate mpc885_ads_defconfig
From: Joel Stanley @ 2021-08-16  8:31 UTC
  To: Michael Ellerman, Christophe Leroy, linuxppc-dev
In-Reply-To: <20210816083126.2294522-1-joel@jms.id.au>

Regenrate atop v5.14-rc6.

The chagnes are mostly re-ordering, except for the following which fall
out due to dependenacies:

 - CONFIG_DEBUG_KERNEL=y selected by EXPERT

 - CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 which is the default
   setting

CONFIG_MTD_PHYSMAP_OF is not longer enabled, as it depends on
MTD_PHYSMAP which is not enabled. This is a regression from commit
642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c"),
which added the extra dependency. Add CONFIG_MTD_PHYSMAP=y so this stays
in the config.

Signed-off-by: Joel Stanley <joel@jms.id.au>
---
 arch/powerpc/configs/mpc885_ads_defconfig | 47 +++++++++++------------
 1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig
index 5cd17adf903f..c74dc76b1d0d 100644
--- a/arch/powerpc/configs/mpc885_ads_defconfig
+++ b/arch/powerpc/configs/mpc885_ads_defconfig
@@ -1,19 +1,30 @@
-CONFIG_PPC_8xx=y
 # CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
+CONFIG_BPF_JIT=y
+CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
 CONFIG_LOG_BUF_SHIFT=14
 CONFIG_EXPERT=y
 # CONFIG_ELF_CORE is not set
 # CONFIG_BASE_FULL is not set
 # CONFIG_FUTEX is not set
+CONFIG_PERF_EVENTS=y
 # CONFIG_VM_EVENT_COUNTERS is not set
-# CONFIG_BLK_DEV_BSG is not set
-CONFIG_PARTITION_ADVANCED=y
+CONFIG_PPC_8xx=y
+CONFIG_8xx_GPIO=y
+CONFIG_SMC_UCODE_PATCH=y
+CONFIG_PIN_TLB=y
 CONFIG_GEN_RTC=y
 CONFIG_HZ_100=y
+CONFIG_MATH_EMULATION=y
+CONFIG_PPC_16K_PAGES=y
+CONFIG_ADVANCED_OPTIONS=y
 # CONFIG_SECCOMP is not set
+CONFIG_STRICT_KERNEL_RWX=y
+CONFIG_MODULES=y
+# CONFIG_BLK_DEV_BSG is not set
+CONFIG_PARTITION_ADVANCED=y
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
@@ -33,6 +44,7 @@ CONFIG_MTD_CFI_GEOMETRY=y
 # CONFIG_MTD_CFI_I2 is not set
 CONFIG_MTD_CFI_I4=y
 CONFIG_MTD_CFI_AMDSTD=y
+CONFIG_MTD_PHYSMAP=y
 CONFIG_MTD_PHYSMAP_OF=y
 # CONFIG_BLK_DEV is not set
 CONFIG_NETDEVICES=y
@@ -45,38 +57,25 @@ CONFIG_DAVICOM_PHY=y
 # CONFIG_LEGACY_PTYS is not set
 CONFIG_SERIAL_CPM=y
 CONFIG_SERIAL_CPM_CONSOLE=y
+CONFIG_SPI=y
+CONFIG_SPI_FSL_SPI=y
 # CONFIG_HWMON is not set
+CONFIG_WATCHDOG=y
+CONFIG_8xxx_WDT=y
 # CONFIG_USB_SUPPORT is not set
 # CONFIG_DNOTIFY is not set
 CONFIG_TMPFS=y
 CONFIG_CRAMFS=y
 CONFIG_NFS_FS=y
 CONFIG_ROOT_NFS=y
+CONFIG_CRYPTO=y
+CONFIG_CRYPTO_DEV_TALITOS=y
 CONFIG_CRC32_SLICEBY4=y
 CONFIG_DEBUG_INFO=y
 CONFIG_MAGIC_SYSRQ=y
-CONFIG_DETECT_HUNG_TASK=y
-CONFIG_PPC_16K_PAGES=y
-CONFIG_DEBUG_KERNEL=y
 CONFIG_DEBUG_FS=y
-CONFIG_PPC_PTDUMP=y
-CONFIG_MODULES=y
-CONFIG_SPI=y
-CONFIG_SPI_FSL_SPI=y
-CONFIG_CRYPTO=y
-CONFIG_CRYPTO_DEV_TALITOS=y
-CONFIG_8xx_GPIO=y
-CONFIG_WATCHDOG=y
-CONFIG_8xxx_WDT=y
-CONFIG_SMC_UCODE_PATCH=y
-CONFIG_ADVANCED_OPTIONS=y
-CONFIG_PIN_TLB=y
-CONFIG_PERF_EVENTS=y
-CONFIG_MATH_EMULATION=y
-CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
-CONFIG_STRICT_KERNEL_RWX=y
-CONFIG_BPF_JIT=y
 CONFIG_DEBUG_VM_PGTABLE=y
+CONFIG_DETECT_HUNG_TASK=y
 CONFIG_BDI_SWITCH=y
 CONFIG_PPC_EARLY_DEBUG=y
-CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008
+CONFIG_PPC_PTDUMP=y
-- 
2.32.0


* [PATCH 1/2] powerpc/config: Fix IPV6 warning in mpc855_ads
From: Joel Stanley @ 2021-08-16  8:31 UTC
  To: Michael Ellerman, Christophe Leroy, linuxppc-dev
In-Reply-To: <20210816083126.2294522-1-joel@jms.id.au>

When building this config there's a warning:

  79:warning: override: reassigning to symbol IPV6

Commit 9a1762a4a4ff ("powerpc/8xx: Update mpc885_ads_defconfig to
improve CI") added CONFIG_IPV6=y, but left '# CONFIG_IPV6 is not set'
in.

IPV6 is default y, so remove both to clean up the build.

Signed-off-by: Joel Stanley <joel@jms.id.au>
---
 arch/powerpc/configs/mpc885_ads_defconfig | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig
index d21f266cea9a..5cd17adf903f 100644
--- a/arch/powerpc/configs/mpc885_ads_defconfig
+++ b/arch/powerpc/configs/mpc885_ads_defconfig
@@ -21,7 +21,6 @@ CONFIG_INET=y
 CONFIG_IP_MULTICAST=y
 CONFIG_IP_PNP=y
 CONFIG_SYN_COOKIES=y
-# CONFIG_IPV6 is not set
 # CONFIG_FW_LOADER is not set
 CONFIG_MTD=y
 CONFIG_MTD_BLOCK=y
@@ -76,7 +75,6 @@ CONFIG_PERF_EVENTS=y
 CONFIG_MATH_EMULATION=y
 CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
 CONFIG_STRICT_KERNEL_RWX=y
-CONFIG_IPV6=y
 CONFIG_BPF_JIT=y
 CONFIG_DEBUG_VM_PGTABLE=y
 CONFIG_BDI_SWITCH=y
-- 
2.32.0


* Re: [PATCH 1/3] powerpc/64s: Fix build when !PPC_BARRIER_NOSPEC
From: Christophe Leroy @ 2021-08-16  8:31 UTC
  To: Joel Stanley, Paul Mackerras, Michael Neuling, Anton Blanchard,
	Michael Ellerman, Nicholas Piggin, linuxppc-dev
  Cc: Daniel Axtens
In-Reply-To: <20210816082403.2293846-2-joel@jms.id.au>



On 16/08/2021 at 10:24, Joel Stanley wrote:
> When disabling PPC_BARRIER_NOSPEC the do_barrier_nospec_fixups_range
> definition is still included, as well as a stub in asm/setup.h:
> 
> ../arch/powerpc/lib/feature-fixups.c:502:6: error: redefinition of ‘do_barrier_nospec_fixu>
>    502 | void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_en>
>        |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from ../arch/powerpc/lib/feature-fixups.c:23:
> ../arch/powerpc/include/asm/setup.h:70:20: note: previous definition of ‘do_barrier_nospec>
>     70 | static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *>
>        |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> I assume the intent was to put the just do_barrier_nospec_fixups
> behind PPC_BARRIER_NOSPEC and let the compiler drop _range when there
> are no users. (There is a caller in module.c, but this is behind
> PPC_BARRIER_NOSPEC).

The compiler won't drop do_barrier_nospec_fixups_range() because it is not static.
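
A small userspace sketch of the difference, with hypothetical function names rather than the ones in feature-fixups.c:

```c
/*
 * External linkage: another translation unit could call this, so the
 * compiler has to emit it even if nothing in this file does.
 */
int sketch_fixups_range(int start, int end)
{
	return end - start;
}

/*
 * Internal linkage: only callers in this file can reach it. If the
 * caller below went away, the compiler would be free to discard it.
 */
static int sketch_fixups_range_local(int start, int end)
{
	return end - start;
}

/* The only user of the static helper in this sketch. */
int sketch_caller(void)
{
	return sketch_fixups_range_local(2, 5);
}
```

So removing the stub from the header does not by itself make the non-static definition go away when there are no users.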

> 
> This makes PPC_BOOK3S_64 match how the PPC_FSL_BOOK3E build works.
> 
> Signed-off-by: Joel Stanley <joel@jms.id.au>
> ---
>   arch/powerpc/include/asm/setup.h | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
> index 6c1a7d217d1a..71012284c044 100644
> --- a/arch/powerpc/include/asm/setup.h
> +++ b/arch/powerpc/include/asm/setup.h
> @@ -66,8 +66,6 @@ extern bool barrier_nospec_enabled;
>   
>   #ifdef CONFIG_PPC_BARRIER_NOSPEC
>   void do_barrier_nospec_fixups_range(bool enable, void *start, void *end);
> -#else
> -static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *end) { }
>   #endif
>   
>   #ifdef CONFIG_PPC_FSL_BOOK3E
> 

* [PATCH 0/2] powerpc: mpc855_ads defconfig fixes
From: Joel Stanley @ 2021-08-16  8:31 UTC
  To: Michael Ellerman, Christophe Leroy, linuxppc-dev

The first was a build warning I noticed when testing something
unrelated.

I took a moment to look into it, and came up with the second patch which
updates the defconfig to make it easier to maintain in the future

It also fixes a regression where the MTD partition support dropped out
of the config. Given noone noticed the regression since v4.20 was
released, perhaps it could be left disabled?

Joel Stanley (2):
  powerpc/config: Fix IPV6 warning in mpc855_ads
  powerpc/configs: Regenerate mpc885_ads_defconfig

 arch/powerpc/configs/mpc885_ads_defconfig | 49 +++++++++++------------
 1 file changed, 23 insertions(+), 26 deletions(-)

-- 
2.32.0


* [PATCH 3/3] powerpc/microwatt: CPU doesn't (yet) have speculation bugs
From: Joel Stanley @ 2021-08-16  8:24 UTC
  To: Paul Mackerras, Michael Neuling, Anton Blanchard,
	Michael Ellerman, Nicholas Piggin, linuxppc-dev
  Cc: Daniel Axtens
In-Reply-To: <20210816082403.2293846-1-joel@jms.id.au>

Signed-off-by: Joel Stanley <joel@jms.id.au>
---
 arch/powerpc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 663766fbf505..d5af6667c206 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -279,6 +279,7 @@ config PPC_BARRIER_NOSPEC
 	bool
 	default y
 	depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E
+	depends on !PPC_MICROWATT
 
 config EARLY_PRINTK
 	bool
-- 
2.32.0


* [PATCH 2/3] powerpc: Fix undefined static key
From: Joel Stanley @ 2021-08-16  8:24 UTC
  To: Paul Mackerras, Michael Neuling, Anton Blanchard,
	Michael Ellerman, Nicholas Piggin, linuxppc-dev
  Cc: Daniel Axtens
In-Reply-To: <20210816082403.2293846-1-joel@jms.id.au>

When CONFIG_PPC_BARRIER_NOSPEC=n, security.c is not built leading to a
missing definition of uaccess_flush_key.

  LD      vmlinux.o
  MODPOST vmlinux.symvers
  MODINFO modules.builtin.modinfo
  GEN     modules.builtin
  LD      .tmp_vmlinux.kallsyms1
powerpc64le-linux-gnu-ld: arch/powerpc/kernel/align.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
powerpc64le-linux-gnu-ld: arch/powerpc/kernel/signal_64.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
powerpc64le-linux-gnu-ld: arch/powerpc/kernel/process.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
powerpc64le-linux-gnu-ld: arch/powerpc/kernel/traps.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
powerpc64le-linux-gnu-ld: arch/powerpc/kernel/hw_breakpoint_constraints.o:(.toc+0x0): undefined reference to `uaccess_flush_key'
powerpc64le-linux-gnu-ld: arch/powerpc/kernel/ptrace/ptrace.o:(.toc+0x0): more undefined references to `uaccess_flush_key' follow
make[1]: *** [Makefile:1176: vmlinux] Error 1

Hack one in to fix the build.

Signed-off-by: Joel Stanley <joel@jms.id.au>
---
 arch/powerpc/include/asm/security_features.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h
index 792eefaf230b..46ade7927a4c 100644
--- a/arch/powerpc/include/asm/security_features.h
+++ b/arch/powerpc/include/asm/security_features.h
@@ -39,6 +39,9 @@ static inline bool security_ftr_enabled(u64 feature)
 	return !!(powerpc_security_features & feature);
 }
 
+#ifndef CONFIG_PPC_BARRIER_NOSPEC
+DEFINE_STATIC_KEY_FALSE(uaccess_flush_key);
+#endif
 
 // Features indicating support for Spectre/Meltdown mitigations
 
-- 
2.32.0


^ permalink raw reply related

* [PATCH 1/3] powerpc/64s: Fix build when !PPC_BARRIER_NOSPEC
From: Joel Stanley @ 2021-08-16  8:24 UTC (permalink / raw)
  To: Paul Mackerras, Michael Neuling, Anton Blanchard,
	Michael Ellerman, Nicholas Piggin, linuxppc-dev
  Cc: Daniel Axtens
In-Reply-To: <20210816082403.2293846-1-joel@jms.id.au>

When disabling PPC_BARRIER_NOSPEC the do_barrier_nospec_fixups_range
definition is still included, as well as a stub in asm/setup.h:

../arch/powerpc/lib/feature-fixups.c:502:6: error: redefinition of ‘do_barrier_nospec_fixu>
  502 | void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_en>
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../arch/powerpc/lib/feature-fixups.c:23:
../arch/powerpc/include/asm/setup.h:70:20: note: previous definition of ‘do_barrier_nospec>
   70 | static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *>
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I assume the intent was to put just do_barrier_nospec_fixups
behind PPC_BARRIER_NOSPEC and let the compiler drop _range when there
are no users. (There is a caller in module.c, but this is behind
PPC_BARRIER_NOSPEC).

This makes PPC_BOOK3S_64 match how the PPC_FSL_BOOK3E build works.

Signed-off-by: Joel Stanley <joel@jms.id.au>
---
 arch/powerpc/include/asm/setup.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 6c1a7d217d1a..71012284c044 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -66,8 +66,6 @@ extern bool barrier_nospec_enabled;
 
 #ifdef CONFIG_PPC_BARRIER_NOSPEC
 void do_barrier_nospec_fixups_range(bool enable, void *start, void *end);
-#else
-static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *end) { }
 #endif
 
 #ifdef CONFIG_PPC_FSL_BOOK3E
-- 
2.32.0


^ permalink raw reply related

* [PATCH 0/3] powerpc/64s: Fix PPC_BARRIER_NOSPEC=n
From: Joel Stanley @ 2021-08-16  8:24 UTC (permalink / raw)
  To: Paul Mackerras, Michael Neuling, Anton Blanchard,
	Michael Ellerman, Nicholas Piggin, linuxppc-dev
  Cc: Daniel Axtens

When disabling PPC_BARRIER_NOSPEC on Microwatt to see if it improved
boot time, I discovered the build was broken (first patch). This got
worse between when I first tried and now (second patch).

The third patch disables PPC_BARRIER_NOSPEC when building for Microwatt.
This one is optional, as it doesn't seem to change boot speed with the
current Microwatt design on an Arty.

Joel Stanley (3):
  powerpc/64s: Fix build when !PPC_BARRIER_NOSPEC
  powerpc: Fix undefined static key
  powerpc/microwatt: CPU doesn't (yet) have speculation bugs

 arch/powerpc/Kconfig                         | 1 +
 arch/powerpc/include/asm/security_features.h | 3 +++
 arch/powerpc/include/asm/setup.h             | 2 --
 3 files changed, 4 insertions(+), 2 deletions(-)

-- 
2.32.0


^ permalink raw reply

* [PATCH v1 4/4] powerpc/64s/interrupt: avoid saving CFAR in some asynchronous interrupts
From: Nicholas Piggin @ 2021-08-16  7:29 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210816072953.1165964-1-npiggin@gmail.com>

Reading the CFAR register is quite costly (~20 cycles on POWER9). It is
a good idea to have for most synchronous interrupts, but for async ones
it is much less important.

Doorbell, external, and decrementer interrupts are the important
asynchronous ones. HV interrupts can't skip CFAR if KVM HV is possible,
because it might be a guest exit that requires CFAR preserved. But for
now the important pseries interrupts can avoid loading CFAR.
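
As a rough illustration only, the per-interrupt choice described above can
be modelled in plain C. This is a user-space sketch, not kernel code: the
model_* names are hypothetical, the CPU is assumed to implement CFAR, and
cpu_is_hv stands in for the HV feature test done by the feature-fixup
sections.

```c
#include <stdbool.h>

/* What the entry code does with r10 for the CFAR slot (model only). */
enum model_cfar_action {
	MODEL_CFAR_READ, /* mfspr r10,SPRN_CFAR */
	MODEL_CFAR_ZERO, /* li r10,0 */
	MODEL_CFAR_SKIP, /* entry code neither reads nor stores CFAR */
};

/*
 * icfar and icfar_if_hvmode mirror the ICFAR and ICFAR_IF_HVMODE
 * per-interrupt settings; cpu_is_hv is an assumed boolean standing in
 * for the runtime feature check.
 */
static enum model_cfar_action
model_pick_cfar_action(bool icfar, bool icfar_if_hvmode, bool cpu_is_hv)
{
	if (icfar)
		return MODEL_CFAR_READ;
	if (icfar_if_hvmode)
		return cpu_is_hv ? MODEL_CFAR_READ : MODEL_CFAR_ZERO;
	return MODEL_CFAR_SKIP;
}
```

In this model an interrupt with ICFAR=0 and ICFAR_IF_HVMODE=1 still pays
for the read on an HV-capable host, which is the case where the interrupt
may be a guest exit.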

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 63 ++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 69a472c38f62..42badd7beaf0 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -111,6 +111,8 @@ name:
 #define IAREA		.L_IAREA_\name\()	/* PACA save area */
 #define IVIRT		.L_IVIRT_\name\()	/* Has virt mode entry point */
 #define IISIDE		.L_IISIDE_\name\()	/* Uses SRR0/1 not DAR/DSISR */
+#define ICFAR		.L_ICFAR_\name\()	/* Uses CFAR */
+#define ICFAR_IF_HVMODE	.L_ICFAR_IF_HVMODE_\name\() /* Uses CFAR if HV */
 #define IDAR		.L_IDAR_\name\()	/* Uses DAR (or SRR0) */
 #define IDSISR		.L_IDSISR_\name\()	/* Uses DSISR (or SRR1) */
 #define IBRANCH_TO_COMMON	.L_IBRANCH_TO_COMMON_\name\() /* ENTRY branch to common */
@@ -150,6 +152,12 @@ do_define_int n
 	.ifndef IISIDE
 		IISIDE=0
 	.endif
+	.ifndef ICFAR
+		ICFAR=1
+	.endif
+	.ifndef ICFAR_IF_HVMODE
+		ICFAR_IF_HVMODE=0
+	.endif
 	.ifndef IDAR
 		IDAR=0
 	.endif
@@ -287,9 +295,21 @@ BEGIN_FTR_SECTION
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	HMT_MEDIUM
 	std	r10,IAREA+EX_R10(r13)		/* save r10 - r12 */
+	.if ICFAR
 BEGIN_FTR_SECTION
 	mfspr	r10,SPRN_CFAR
 END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
+	.elseif ICFAR_IF_HVMODE
+BEGIN_FTR_SECTION
+  BEGIN_FTR_SECTION_NESTED(69)
+	mfspr	r10,SPRN_CFAR
+  END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 69)
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(69)
+	li	r10,0
+  END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 69)
+ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
+	.endif
 	.if \ool
 	.if !\virt
 	b	tramp_real_\name
@@ -305,9 +325,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 BEGIN_FTR_SECTION
 	std	r9,IAREA+EX_PPR(r13)
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
+	.if ICFAR || ICFAR_IF_HVMODE
 BEGIN_FTR_SECTION
 	std	r10,IAREA+EX_CFAR(r13)
 END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
+	.endif
 	INTERRUPT_TO_KERNEL
 	mfctr	r10
 	std	r10,IAREA+EX_CTR(r13)
@@ -559,7 +581,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	.endif
 
 BEGIN_FTR_SECTION
+	.if ICFAR || ICFAR_IF_HVMODE
 	ld	r10,IAREA+EX_CFAR(r13)
+	.else
+	li	r10,0
+	.endif
 	std	r10,ORIG_GPR3(r1)
 END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	ld	r10,IAREA+EX_CTR(r13)
@@ -1501,6 +1527,12 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
  *
  * If soft masked, the masked handler will note the pending interrupt for
  * replay, and clear MSR[EE] in the interrupted context.
+ *
+ * CFAR is not required because this is an asynchronous interrupt that in
+ * general won't have much bearing on the state of the CPU, with the possible
+ * exception of crash/debug IPIs, but those are generally moving to use SRESET
+ * IPIs. Unless this is an HV interrupt and KVM HV is possible, in which case
+ * it may be exiting the guest and need CFAR to be saved.
  */
 INT_DEFINE_BEGIN(hardware_interrupt)
 	IVEC=0x500
@@ -1508,6 +1540,10 @@ INT_DEFINE_BEGIN(hardware_interrupt)
 	IMASK=IRQS_DISABLED
 	IKVM_REAL=1
 	IKVM_VIRT=1
+	ICFAR=0
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+	ICFAR_IF_HVMODE=1
+#endif
 INT_DEFINE_END(hardware_interrupt)
 
 EXC_REAL_BEGIN(hardware_interrupt, 0x500, 0x100)
@@ -1726,6 +1762,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
  * If PPC_WATCHDOG is configured, the soft masked handler will actually set
  * things back up to run soft_nmi_interrupt as a regular interrupt handler
  * on the emergency stack.
+ *
+ * CFAR is not required because this is asynchronous (see hardware_interrupt).
+ * A watchdog interrupt may like to have CFAR, but usually the interesting
+ * branch is long gone by that point (e.g., infinite loop).
  */
 INT_DEFINE_BEGIN(decrementer)
 	IVEC=0x900
@@ -1733,6 +1773,7 @@ INT_DEFINE_BEGIN(decrementer)
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_REAL=1
 #endif
+	ICFAR=0
 INT_DEFINE_END(decrementer)
 
 EXC_REAL_BEGIN(decrementer, 0x900, 0x80)
@@ -1808,6 +1849,8 @@ EXC_COMMON_BEGIN(hdecrementer_common)
  * If soft masked, the masked handler will note the pending interrupt for
  * replay, leaving MSR[EE] enabled in the interrupted context because the
  * doorbells are edge triggered.
+ *
+ * CFAR is not required, similarly to hardware_interrupt.
  */
 INT_DEFINE_BEGIN(doorbell_super)
 	IVEC=0xa00
@@ -1815,6 +1858,7 @@ INT_DEFINE_BEGIN(doorbell_super)
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_REAL=1
 #endif
+	ICFAR=0
 INT_DEFINE_END(doorbell_super)
 
 EXC_REAL_BEGIN(doorbell_super, 0xa00, 0x100)
@@ -1866,6 +1910,7 @@ INT_DEFINE_BEGIN(system_call)
 	IVEC=0xc00
 	IKVM_REAL=1
 	IKVM_VIRT=1
+	ICFAR=0
 INT_DEFINE_END(system_call)
 
 .macro SYSTEM_CALL virt
@@ -2164,6 +2209,11 @@ EXC_COMMON_BEGIN(hmi_exception_common)
  * Interrupt 0xe80 - Directed Hypervisor Doorbell Interrupt.
  * This is an asynchronous interrupt in response to a msgsnd doorbell.
  * Similar to the 0xa00 doorbell but for host rather than guest.
+ *
+ * CFAR is not required (similar to doorbell_interrupt), unless KVM HV
+ * is enabled, in which case it may be a guest exit. Most PowerNV kernels
+ * include KVM support so it would be nice if this could be dynamically
+ * patched out if KVM was not currently running any guests.
  */
 INT_DEFINE_BEGIN(h_doorbell)
 	IVEC=0xe80
@@ -2171,6 +2221,9 @@ INT_DEFINE_BEGIN(h_doorbell)
 	IMASK=IRQS_DISABLED
 	IKVM_REAL=1
 	IKVM_VIRT=1
+#ifndef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+	ICFAR=0
+#endif
 INT_DEFINE_END(h_doorbell)
 
 EXC_REAL_BEGIN(h_doorbell, 0xe80, 0x20)
@@ -2194,6 +2247,9 @@ EXC_COMMON_BEGIN(h_doorbell_common)
  * Interrupt 0xea0 - Hypervisor Virtualization Interrupt.
  * This is an asynchronous interrupt in response to an "external exception".
  * Similar to 0x500 but for host only.
+ *
+ * Like h_doorbell, CFAR is only required for KVM HV because this can be
+ * a guest exit.
  */
 INT_DEFINE_BEGIN(h_virt_irq)
 	IVEC=0xea0
@@ -2201,6 +2257,9 @@ INT_DEFINE_BEGIN(h_virt_irq)
 	IMASK=IRQS_DISABLED
 	IKVM_REAL=1
 	IKVM_VIRT=1
+#ifndef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+	ICFAR=0
+#endif
 INT_DEFINE_END(h_virt_irq)
 
 EXC_REAL_BEGIN(h_virt_irq, 0xea0, 0x20)
@@ -2237,6 +2296,8 @@ EXC_VIRT_NONE(0x4ee0, 0x20)
  *
  * If soft masked, the masked handler will note the pending interrupt for
  * replay, and clear MSR[EE] in the interrupted context.
+ *
+ * CFAR is not used by perf interrupts so not required.
  */
 INT_DEFINE_BEGIN(performance_monitor)
 	IVEC=0xf00
@@ -2244,6 +2305,7 @@ INT_DEFINE_BEGIN(performance_monitor)
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_REAL=1
 #endif
+	ICFAR=0
 INT_DEFINE_END(performance_monitor)
 
 EXC_REAL_BEGIN(performance_monitor, 0xf00, 0x20)
@@ -2668,6 +2730,7 @@ EXC_VIRT_NONE(0x5800, 0x100)
 INT_DEFINE_BEGIN(soft_nmi)
 	IVEC=0x900
 	ISTACK=0
+	ICFAR=0
 INT_DEFINE_END(soft_nmi)
 
 /*
-- 
2.23.0


^ permalink raw reply related

* [PATCH v1 3/4] powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use
From: Nicholas Piggin @ 2021-08-16  7:29 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Athira Rajeev, Madhavan Srinivasan, Nicholas Piggin
In-Reply-To: <20210816072953.1165964-1-npiggin@gmail.com>

Enabling MSR[EE] in interrupt handlers while interrupts are still soft
masked allows PMIs to profile interrupt handlers to some degree, beyond
what SIAR latching allows.

When perf is not being used, this is almost useless work. It requires an
extra mtmsrd in the irq handler, and it also opens the door to masked
interrupts hitting and requiring replay, which is more expensive than
just taking them directly. This effect can be noticeable in high IRQ
workloads.

Avoid enabling MSR[EE] unless perf is currently in use. This saves about
60 cycles (or 8%) on a simple decrementer interrupt microbenchmark.
Replayed interrupts drop from 1.4% of interrupts to 0.003%.

This does prevent the soft-nmi interrupt being taken in these handlers,
but that's not too reliable anyway. The SMP watchdog will continue to be
the reliable way to catch lockups.
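
A minimal user-space sketch of the query/action split, for illustration
only. The model_* names and flag values are hypothetical; pmu_running
stands in for power_pmu_running() and msr_ee for the MSR[EE] bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values are illustrative, not the kernel's real bit assignments. */
#define MODEL_IRQ_HARD_DIS       0x01u
#define MODEL_IRQ_MUST_HARD_MASK 0x02u

struct model_cpu {
	unsigned int irq_happened; /* stand-in for paca->irq_happened */
	bool pmu_running;          /* stand-in for power_pmu_running() */
	bool msr_ee;               /* stand-in for MSR[EE] */
};

/* Pure query: is hard-enabling both useful and allowed right now? */
static bool model_may_hard_irq_enable(const struct model_cpu *cpu)
{
	if (!cpu->pmu_running)
		return false;
	if (cpu->irq_happened & MODEL_IRQ_MUST_HARD_MASK)
		return false;
	return true;
}

/* Side effect: only call this after the query above returned true. */
static void model_do_hard_irq_enable(struct model_cpu *cpu)
{
	assert(!cpu->msr_ee);
	assert(!(cpu->irq_happened & MODEL_IRQ_MUST_HARD_MASK));
	cpu->irq_happened &= ~MODEL_IRQ_HARD_DIS;
	cpu->msr_ee = true;
}

/* Shape of a handler after the split; returns the resulting EE state. */
static bool model_handler(struct model_cpu *cpu)
{
	if (model_may_hard_irq_enable(cpu)) {
		/* preparatory work (e.g. reprogramming a timer) goes here */
		model_do_hard_irq_enable(cpu);
	}
	return cpu->msr_ee;
}
```

Splitting the check from the enable lets a caller skip any preparatory
work that is needed only when interrupts are about to be hard enabled.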

Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/hw_irq.h | 47 +++++++++++++++++++++++++------
 arch/powerpc/kernel/dbell.c       |  3 +-
 arch/powerpc/kernel/irq.c         |  3 +-
 arch/powerpc/kernel/time.c        | 30 ++++++++++----------
 4 files changed, 57 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 2d5c0d3ccbb6..e6644509c7af 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -309,17 +309,46 @@ static inline bool lazy_irq_pending_nocheck(void)
 bool power_pmu_running(void);
 
 /*
- * This is called by asynchronous interrupts to conditionally
- * re-enable hard interrupts after having cleared the source
- * of the interrupt. They are kept disabled if there is a different
- * soft-masked interrupt pending that requires hard masking.
+ * This is called by asynchronous interrupts to check whether to
+ * conditionally re-enable hard interrupts after having cleared
+ * the source of the interrupt. They are kept disabled if there
+ * is a different soft-masked interrupt pending that requires hard
+ * masking.
  */
-static inline void may_hard_irq_enable(void)
+static inline bool may_hard_irq_enable(void)
 {
-	if (!(get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK)) {
-		get_paca()->irq_happened &= ~PACA_IRQ_HARD_DIS;
-		__hard_irq_enable();
-	}
+#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
+	BUG_ON(mfmsr() & MSR_EE);
+#endif
+#ifdef CONFIG_PERF_EVENTS
+	if (!power_pmu_running())
+		return false;
+
+	if (get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK)
+		return false;
+
+	return true;
+#else
+	return false;
+#endif
+}
+
+/*
+ * Do the hard enabling, only call this if may_hard_irq_enable is true.
+ */
+static inline void do_hard_irq_enable(void)
+{
+#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
+	WARN_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
+	WARN_ON(get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK);
+	BUG_ON(mfmsr() & MSR_EE);
+#endif
+	/*
+	 * This allows PMI interrupts (and watchdog soft-NMIs) through.
+	 * There is no other reason to enable this way.
+	 */
+	get_paca()->irq_happened &= ~PACA_IRQ_HARD_DIS;
+	__hard_irq_enable();
 }
 
 static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index 5545c9cd17c1..0edeb5e9fede 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -27,7 +27,8 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception)
 
 	ppc_msgsync();
 
-	may_hard_irq_enable();
+	if (may_hard_irq_enable())
+		do_hard_irq_enable();
 
 	kvmppc_clear_host_ipi(smp_processor_id());
 	__this_cpu_inc(irq_stat.doorbell_irqs);
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 551b653228c4..745becbcd1ad 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -739,7 +739,8 @@ void __do_irq(struct pt_regs *regs)
 	irq = ppc_md.get_irq();
 
 	/* We can hard enable interrupts now to allow perf interrupts */
-	may_hard_irq_enable();
+	if (may_hard_irq_enable())
+		do_hard_irq_enable();
 
 	/* And finally process it */
 	if (unlikely(!irq))
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index c487ba5a6e11..ac67ec57f129 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -567,22 +567,22 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
 		return;
 	}
 
-	/* Ensure a positive value is written to the decrementer, or else
-	 * some CPUs will continue to take decrementer exceptions. When the
-	 * PPC_WATCHDOG (decrementer based) is configured, keep this at most
-	 * 31 bits, which is about 4 seconds on most systems, which gives
-	 * the watchdog a chance of catching timer interrupt hard lockups.
-	 */
-	if (IS_ENABLED(CONFIG_PPC_WATCHDOG))
-		set_dec(0x7fffffff);
-	else
-		set_dec(decrementer_max);
-
-	/* Conditionally hard-enable interrupts now that the DEC has been
-	 * bumped to its maximum value
-	 */
-	may_hard_irq_enable();
+	/* Conditionally hard-enable interrupts. */
+	if (may_hard_irq_enable()) {
+		/* Ensure a positive value is written to the decrementer, or
+		 * else some CPUs will continue to take decrementer exceptions.
+		 * When the PPC_WATCHDOG (decrementer based) is configured,
+		 * keep this at most 31 bits, which is about 4 seconds on most
+		 * systems, which gives the watchdog a chance of catching timer
+		 * interrupt hard lockups.
+		 */
+		if (IS_ENABLED(CONFIG_PPC_WATCHDOG))
+			set_dec(0x7fffffff);
+		else
+			set_dec(decrementer_max);
 
+		do_hard_irq_enable();
+	}
 
 #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
 	if (atomic_read(&ppc_n_lost_interrupts) != 0)
-- 
2.23.0


^ permalink raw reply related

* [PATCH v1 2/4] powerpc/64s/perf: add power_pmu_running to query whether perf is being used
From: Nicholas Piggin @ 2021-08-16  7:29 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Athira Rajeev, Madhavan Srinivasan, Nicholas Piggin
In-Reply-To: <20210816072953.1165964-1-npiggin@gmail.com>

Interrupt handling code would like to know whether perf is enabled, to
know whether it should enable MSR[EE] to improve PMI coverage.

Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/hw_irq.h |  2 ++
 arch/powerpc/perf/core-book3s.c   | 13 +++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 21cc571ea9c2..2d5c0d3ccbb6 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -306,6 +306,8 @@ static inline bool lazy_irq_pending_nocheck(void)
 	return __lazy_irq_pending(local_paca->irq_happened);
 }
 
+bool power_pmu_running(void);
+
 /*
  * This is called by asynchronous interrupts to conditionally
  * re-enable hard interrupts after having cleared the source
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index bb0ee716de91..76114a9afb2b 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2380,6 +2380,19 @@ static void perf_event_interrupt(struct pt_regs *regs)
 	perf_sample_event_took(sched_clock() - start_clock);
 }
 
+bool power_pmu_running(void)
+{
+	struct cpu_hw_events *cpuhw;
+
+	/* Could this simply test local_paca->pmcregs_in_use? */
+
+	if (!ppmu)
+		return false;
+
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
+	return cpuhw->n_events;
+}
+
 static int power_pmu_prepare_cpu(unsigned int cpu)
 {
 	struct cpu_hw_events *cpuhw = &per_cpu(cpu_hw_events, cpu);
-- 
2.23.0


^ permalink raw reply related

* [PATCH v1 0/4] powerpc/64s: interrupt speedups
From: Nicholas Piggin @ 2021-08-16  7:29 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Here are a few stragglers. The first patch was submitted already but had
some bugs with unrecoverable exceptions on HPT (current->blah being
accessed before MSR[RI] was enabled). Those should be fixed now.

The others are generally for helping asynchronous interrupts, which are a bit
harder to measure well but important for IO and IPIs.

After this series, the SPR accesses of the interrupt handlers for radix
are becoming pretty optimal except for PPR which we could improve on,
and virt CPU accounting which is very costly -- we might disable that
by default unless someone comes up with a good reason to keep it.

Thanks,
Nick

Nicholas Piggin (4):
  powerpc/64: handle MSR EE and RI in interrupt entry wrapper
  powerpc/64s/perf: add power_pmu_running to query whether perf is being
    used
  powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless
    perf is in use
  powerpc/64s/interrupt: avoid saving CFAR in some asynchronous
    interrupts

 arch/powerpc/include/asm/hw_irq.h    | 49 ++++++++++++---
 arch/powerpc/include/asm/interrupt.h | 31 ++++++++--
 arch/powerpc/kernel/dbell.c          |  3 +-
 arch/powerpc/kernel/exceptions-64s.S | 93 +++++++++++++++++++---------
 arch/powerpc/kernel/fpu.S            |  5 ++
 arch/powerpc/kernel/irq.c            |  3 +-
 arch/powerpc/kernel/time.c           | 30 ++++-----
 arch/powerpc/kernel/vector.S         |  8 +++
 arch/powerpc/perf/core-book3s.c      | 13 ++++
 9 files changed, 175 insertions(+), 60 deletions(-)

-- 
2.23.0


^ permalink raw reply

* [PATCH v1 1/4] powerpc/64: handle MSR EE and RI in interrupt entry wrapper
From: Nicholas Piggin @ 2021-08-16  7:29 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210816072953.1165964-1-npiggin@gmail.com>

Similarly to the system call change in the previous patch, the mtmsrd to
enable RI can be combined with the mtmsrd to enable EE for interrupts
which enable the latter, which tends to be the important synchronous
interrupts (i.e., page faults).

Do this by enabling EE and RI together at the beginning of the entry
wrapper if PACA_IRQ_HARD_DIS is clear, and just enabling RI if it is set
(which means something wanted EE=0).

Asynchronous interrupts set PACA_IRQ_HARD_DIS, but synchronous ones
leave it unchanged, so by default they always get EE=1 unless they
interrupt a caller that has hard disabled. When the sync interrupt
later calls interrupt_cond_local_irq_enable(), that will not require
another mtmsrd because we already enabled here.
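
The decision described above can be sketched as a small user-space model,
for illustration only. The model_* names and the flag value are
hypothetical, and the two booleans stand in for MSR[EE] and MSR[RI].

```c
#include <stdbool.h>

#define MODEL_IRQ_HARD_DIS 0x01u /* illustrative value only */

struct model_msr {
	bool ee; /* external interrupts enabled */
	bool ri; /* recoverable interrupt */
};

/* Asynchronous entry sets HARD_DIS first so the common path keeps EE=0. */
static unsigned int model_async_entry_flags(unsigned int irq_happened)
{
	return irq_happened | MODEL_IRQ_HARD_DIS;
}

/*
 * Common entry: a single MSR update always sets RI, and also sets EE
 * when nothing asked for interrupts to stay hard disabled.
 */
static struct model_msr model_enter_prepare(unsigned int irq_happened)
{
	struct model_msr msr = { .ee = false, .ri = true };

	if (!(irq_happened & MODEL_IRQ_HARD_DIS))
		msr.ee = true;
	return msr;
}
```

A synchronous interrupt taken with the flag clear ends up with EE=1 and
RI=1 from one update, while an asynchronous one gets only RI=1.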

64e is conceptually unchanged, but it also sets MSR[EE]=1 now in the
interrupt wrapper for synchronous interrupts with the same code.

On 64s, this saves one mtmsrd L=1 for synchronous interrupts, which
saves about 20 cycles. For kernel-mode interrupts, both synchronous and
asynchronous, this saves an additional ~40 cycles due to the mtmsrd
being moved ahead of mfspr SPRN_AMR, which prevents a SPR scoreboard
stall (on POWER9).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 31 ++++++++++++++++++++++++----
 arch/powerpc/kernel/exceptions-64s.S | 30 ---------------------------
 arch/powerpc/kernel/fpu.S            |  5 +++++
 arch/powerpc/kernel/vector.S         |  8 +++++++
 4 files changed, 40 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 6b800d3e2681..e3228a911b35 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -148,9 +148,21 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
 #endif
 
 #ifdef CONFIG_PPC64
-	if (irq_soft_mask_set_return(IRQS_ALL_DISABLED) == IRQS_ENABLED)
+	bool trace_enable = false;
+
+	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS)) {
+		if (irq_soft_mask_set_return(IRQS_DISABLED) == IRQS_ENABLED)
+			trace_enable = true;
+	} else {
+		irq_soft_mask_set(IRQS_DISABLED);
+	}
+	/* If the interrupt was taken with HARD_DIS set, don't enable MSR[EE] */
+	if (local_paca->irq_happened & PACA_IRQ_HARD_DIS)
+		__hard_RI_enable();
+	else
+		__hard_irq_enable();
+	if (trace_enable)
 		trace_hardirqs_off();
-	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 
 	if (user_mode(regs)) {
 		CT_WARN_ON(ct_state() != CONTEXT_USER);
@@ -200,13 +212,20 @@ static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt
 
 static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
+#ifdef CONFIG_PPC64
+	/* Ensure interrupt_enter_prepare does not enable MSR[EE] */
+	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+#endif
+	interrupt_enter_prepare(regs, state);
 #ifdef CONFIG_PPC_BOOK3S_64
+	/*
+	 * MSR[RI] is only enabled after interrupt_enter_prepare, so this
+	 * thread flags access has to come afterward.
+	 */
 	if (cpu_has_feature(CPU_FTR_CTRL) &&
 	    !test_thread_local_flags(_TLF_RUNLATCH))
 		__ppc64_runlatch_on();
 #endif
-
-	interrupt_enter_prepare(regs, state);
 	irq_enter();
 }
 
@@ -273,6 +292,8 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
 	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
 		BUG_ON(!arch_irq_disabled_regs(regs) && !(regs->msr & MSR_EE));
 
+	__hard_RI_enable();
+
 	/* Don't do any per-CPU operations until interrupt state is fixed */
 
 	if (nmi_disables_ftrace(regs)) {
@@ -370,6 +391,8 @@ interrupt_handler long func(struct pt_regs *regs)			\
 {									\
 	long ret;							\
 									\
+	__hard_RI_enable();						\
+									\
 	ret = ____##func (regs);					\
 									\
 	return ret;							\
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 4aec59a77d4c..69a472c38f62 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -113,7 +113,6 @@ name:
 #define IISIDE		.L_IISIDE_\name\()	/* Uses SRR0/1 not DAR/DSISR */
 #define IDAR		.L_IDAR_\name\()	/* Uses DAR (or SRR0) */
 #define IDSISR		.L_IDSISR_\name\()	/* Uses DSISR (or SRR1) */
-#define ISET_RI		.L_ISET_RI_\name\()	/* Run common code w/ MSR[RI]=1 */
 #define IBRANCH_TO_COMMON	.L_IBRANCH_TO_COMMON_\name\() /* ENTRY branch to common */
 #define IREALMODE_COMMON	.L_IREALMODE_COMMON_\name\() /* Common runs in realmode */
 #define IMASK		.L_IMASK_\name\()	/* IRQ soft-mask bit */
@@ -157,9 +156,6 @@ do_define_int n
 	.ifndef IDSISR
 		IDSISR=0
 	.endif
-	.ifndef ISET_RI
-		ISET_RI=1
-	.endif
 	.ifndef IBRANCH_TO_COMMON
 		IBRANCH_TO_COMMON=1
 	.endif
@@ -512,11 +508,6 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real)
 	stb	r10,PACASRR_VALID(r13)
 	.endif
 
-	.if ISET_RI
-	li	r10,MSR_RI
-	mtmsrd	r10,1			/* Set MSR_RI */
-	.endif
-
 	.if ISTACK
 	.if IKUAP
 	kuap_save_amr_and_lock r9, r10, cr1, cr0
@@ -901,11 +892,6 @@ INT_DEFINE_BEGIN(system_reset)
 	IVEC=0x100
 	IAREA=PACA_EXNMI
 	IVIRT=0 /* no virt entry point */
-	/*
-	 * MSR_RI is not enabled, because PACA_EXNMI and nmi stack is
-	 * being used, so a nested NMI exception would corrupt it.
-	 */
-	ISET_RI=0
 	ISTACK=0
 	IKVM_REAL=1
 INT_DEFINE_END(system_reset)
@@ -986,8 +972,6 @@ EXC_COMMON_BEGIN(system_reset_common)
 	lhz	r10,PACA_IN_NMI(r13)
 	addi	r10,r10,1
 	sth	r10,PACA_IN_NMI(r13)
-	li	r10,MSR_RI
-	mtmsrd 	r10,1
 
 	mr	r10,r1
 	ld	r1,PACA_NMI_EMERG_SP(r13)
@@ -1061,12 +1045,6 @@ INT_DEFINE_BEGIN(machine_check_early)
 	IAREA=PACA_EXMC
 	IVIRT=0 /* no virt entry point */
 	IREALMODE_COMMON=1
-	/*
-	 * MSR_RI is not enabled, because PACA_EXMC is being used, so a
-	 * nested machine check corrupts it. machine_check_common enables
-	 * MSR_RI.
-	 */
-	ISET_RI=0
 	ISTACK=0
 	IDAR=1
 	IDSISR=1
@@ -1077,7 +1055,6 @@ INT_DEFINE_BEGIN(machine_check)
 	IVEC=0x200
 	IAREA=PACA_EXMC
 	IVIRT=0 /* no virt entry point */
-	ISET_RI=0
 	IDAR=1
 	IDSISR=1
 	IKVM_REAL=1
@@ -1147,9 +1124,6 @@ EXC_COMMON_BEGIN(machine_check_early_common)
 BEGIN_FTR_SECTION
 	bl	enable_machine_check
 END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
-	li	r10,MSR_RI
-	mtmsrd	r10,1
-
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	machine_check_early
 	std	r3,RESULT(r1)	/* Save result */
@@ -1237,10 +1211,6 @@ EXC_COMMON_BEGIN(machine_check_common)
 	 * save area: PACA_EXMC instead of PACA_EXGEN.
 	 */
 	GEN_COMMON machine_check
-
-	/* Enable MSR_RI when finished with PACA_EXMC */
-	li	r10,MSR_RI
-	mtmsrd 	r10,1
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	machine_check_exception
 	b	interrupt_return_srr
diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index 6010adcee16e..eabd578cb772 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -81,7 +81,12 @@ EXPORT_SYMBOL(store_fp_state)
  */
 _GLOBAL(load_up_fpu)
 	mfmsr	r5
+#ifdef CONFIG_PPC_BOOK3S_64
+	/* interrupt doesn't set MSR[RI] and HPT can fault on current access */
+	ori	r5,r5,MSR_FP|MSR_RI
+#else
 	ori	r5,r5,MSR_FP
+#endif
 #ifdef CONFIG_VSX
 BEGIN_FTR_SECTION
 	oris	r5,r5,MSR_VSX@h
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index fc120fac1910..ead2900d9bb0 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -47,6 +47,10 @@ EXPORT_SYMBOL(store_vr_state)
  */
 _GLOBAL(load_up_altivec)
 	mfmsr	r5			/* grab the current MSR */
+#ifdef CONFIG_PPC_BOOK3S_64
+	/* interrupt doesn't set MSR[RI] and HPT can fault on current access */
+	ori	r5,r5,MSR_RI
+#endif
 	oris	r5,r5,MSR_VEC@h
 	MTMSRD(r5)			/* enable use of AltiVec now */
 	isync
@@ -128,6 +132,10 @@ _GLOBAL(load_up_vsx)
 	andis.	r5,r12,MSR_VEC@h
 	beql+	load_up_altivec		/* skip if already loaded */
 
+	/* interrupt doesn't set MSR[RI] and HPT can fault on current access */
+	li	r5,MSR_RI
+	mtmsrd	r5,1
+
 	ld	r4,PACACURRENT(r13)
 	addi	r4,r4,THREAD		/* Get THREAD */
 	li	r6,1
-- 
2.23.0


^ permalink raw reply related

* Re: [RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE
From: Michael Ellerman @ 2021-08-16  7:03 UTC (permalink / raw)
  To: Aneesh Kumar K.V, Puvichakravarthy Ramachandran; +Cc: linuxppc-dev, npiggin
In-Reply-To: <c157f9c9-d340-24f7-1aa0-40bbd4e1386e@linux.ibm.com>

"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> On 8/12/21 6:19 PM, Michael Ellerman wrote:
>> "Puvichakravarthy Ramachandran" <puvichakravarthy@in.ibm.com> writes:
>>>> With shared mapping, even though we are unmapping a large range, the kernel
>>>> will force a TLB flush with ptl lock held to avoid the race mentioned in
>>>> commit 1cf35d47712d ("mm: split 'tlb_flush_mmu()' into tlb flushing and memory freeing parts")
>>>> This results in the kernel issuing a high number of TLB flushes even for a large
>>>> range. This can be improved by making sure the kernel switch to pid based flush if the
>>>> kernel is unmapping a 2M range.
>>>>
>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>>> ---
>>>>   arch/powerpc/mm/book3s64/radix_tlb.c | 8 ++++----
>>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c > b/arch/powerpc/mm/book3s64/radix_tlb.c
>>>> index aefc100d79a7..21d0f098e43b 100644
>>>> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
>>>> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
>>>> @@ -1106,7 +1106,7 @@ EXPORT_SYMBOL(radix__flush_tlb_kernel_range);
>>>>    * invalidating a full PID, so it has a far lower threshold to change > from
>>>>    * individual page flushes to full-pid flushes.
>>>>    */
>>>> -static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
>>>> +static unsigned long tlb_single_page_flush_ceiling __read_mostly = 32;
>>>>   static unsigned long tlb_local_single_page_flush_ceiling __read_mostly > = POWER9_TLB_SETS_RADIX * 2;
>>>>
>>>>   static inline void __radix__flush_tlb_range(struct mm_struct *mm,
>>>> @@ -1133,7 +1133,7 @@ static inline void __radix__flush_tlb_range(struct > mm_struct *mm,
>>>>        if (fullmm)
>>>>                flush_pid = true;
>>>>        else if (type == FLUSH_TYPE_GLOBAL)
>>>> -             flush_pid = nr_pages > tlb_single_page_flush_ceiling;
>>>> +             flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
>>>>        else
>>>>                flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
>>>
>>> Additional details on the test environment. This was tested on a 2 Node/8
>>> socket Power10 system.
>>> The LPAR had 105 cores and the LPAR spanned across all the sockets.
>>>
>>> # perf stat -I 1000 -a -e cycles,instructions -e
>>> "{cpu/config=0x030008,name=PM_EXEC_STALL/}" -e
>>> "{cpu/config=0x02E01C,name=PM_EXEC_STALL_TLBIE/}" ./tlbie -i 10 -c 1  -t 1
>>>   Rate of work: = 176
>>> #           time             counts unit events
>>>       1.029206442         4198594519      cycles
>>>       1.029206442         2458254252      instructions              # 0.59 insn per cycle
>>>       1.029206442         3004031488      PM_EXEC_STALL
>>>       1.029206442         1798186036      PM_EXEC_STALL_TLBIE
>>>   Rate of work: = 181
>>>       2.054288539         4183883450      cycles
>>>       2.054288539         2472178171      instructions              # 0.59 insn per cycle
>>>       2.054288539         3014609313      PM_EXEC_STALL
>>>       2.054288539         1797851642      PM_EXEC_STALL_TLBIE
>>>   Rate of work: = 180
>>>       3.078306883         4171250717      cycles
>>>       3.078306883         2468341094      instructions              # 0.59 insn per cycle
>>>       3.078306883         2993036205      PM_EXEC_STALL
>>>       3.078306883         1798181890      PM_EXEC_STALL_TLBIE
>>> .
>>> .
>>>
>>> # cat /sys/kernel/debug/powerpc/tlb_single_page_flush_ceiling
>>> 34
>>>
>>> # echo 32 > /sys/kernel/debug/powerpc/tlb_single_page_flush_ceiling
>>>
>>> # perf stat -I 1000 -a -e cycles,instructions -e
>>> "{cpu/config=0x030008,name=PM_EXEC_STALL/}" -e
>>> "{cpu/config=0x02E01C,name=PM_EXEC_STALL_TLBIE/}" ./tlbie -i 10 -c 1  -t 1
>>>   Rate of work: = 313
>>> #           time             counts unit events
>>>       1.030310506         4206071143      cycles
>>>       1.030310506         4314716958      instructions              # 1.03 insn per cycle
>>>       1.030310506         2157762167      PM_EXEC_STALL
>>>       1.030310506          110825573      PM_EXEC_STALL_TLBIE
>>>   Rate of work: = 322
>>>       2.056034068         4331745630      cycles
>>>       2.056034068         4531658304      instructions              # 1.05 insn per cycle
>>>       2.056034068         2288971361      PM_EXEC_STALL
>>>       2.056034068          111267927      PM_EXEC_STALL_TLBIE
>>>   Rate of work: = 321
>>>       3.081216434         4327050349      cycles
>>>       3.081216434         4379679508      instructions              # 1.01 insn per cycle
>>>       3.081216434         2252602550      PM_EXEC_STALL
>>>       3.081216434          110974887      PM_EXEC_STALL_TLBIE
>> 
>> 
>> What is the tlbie test actually doing?
>> 
>> Does it do anything to measure the cost of refilling after the full mm flush?
>
> That is essentially
>
> for ()
> {
>    shmat()
>    fillshm()
>    shmdt()
>
> }
>
> for a 256MB range. So it is not really a fair benchmark because it 
> doesn't take into account the impact of throwing away the full pid 
> translation. But even then, the TLBIE stalls are an important data point?

Choosing the ceiling is a trade-off, and this test only measures one
side of the trade-off.

It tells us that the actual time taken to execute the full flush is less
than doing 32 individual flushes, but that's not the full story.
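
To make the threshold change in the quoted diff concrete, here is a
minimal user-space model of the decision, not the kernel code itself.
The enum and the function want_pid_flush() are hypothetical stand-ins
for the logic in __radix__flush_tlb_range(); the ceilings are passed in
as parameters so nothing is assumed about their defaults.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's flush types. */
enum flush_type { FLUSH_TYPE_LOCAL, FLUSH_TYPE_GLOBAL };

/*
 * Model of the flush_pid decision from the quoted diff.
 * inclusive == false models "nr_pages >  ceiling" (before the patch),
 * inclusive == true  models "nr_pages >= ceiling" (after the patch).
 */
bool want_pid_flush(bool fullmm, enum flush_type type,
		    unsigned long nr_pages,
		    unsigned long global_ceiling, bool inclusive,
		    unsigned long local_ceiling)
{
	if (fullmm)
		return true;
	if (type == FLUSH_TYPE_GLOBAL)
		return inclusive ? nr_pages >= global_ceiling
				 : nr_pages > global_ceiling;
	/* The local case is untouched by the diff. */
	return nr_pages > local_ceiling;
}
```

With "> 33" a global flush is upgraded to a full-PID flush from 34
pages onwards; with ">= 32" it is upgraded from 32 pages onwards, so
only ranges of 32 and 33 pages change behaviour.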

To decide I think we need some numbers for some more "real" workloads,
to at least see that there's no change, or preferably some improvement.

Another interesting test might be to do the shmat/fillshm/shmdt, and
then chase some pointers to provoke TLB misses. Then we could work out
the relative cost of TLB misses vs the time to do the flush.
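
A rough sketch of what such a test could look like, under assumptions:
the tlbie test source is not part of this thread, so memset() stands in
for fillshm(), and build_chain(), chase() and timed_lap_after_flush()
are hypothetical names. The chain lives in a separate buffer that the
caller allocates; it is warmed up before shmdt() and walked again
afterwards, so the second walk pays for any translations the flush
threw away.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <time.h>

/* Link the first word of every page into one random cycle, so that a
 * walk touches each page exactly once per lap. Returns 0 on success. */
int build_chain(char *base, size_t page, size_t npages, unsigned seed)
{
	size_t *order = malloc(npages * sizeof(*order));
	size_t i;

	if (!order)
		return -1;
	for (i = 0; i < npages; i++)
		order[i] = i;
	srand(seed);
	for (i = npages; i > 1; i--) {		/* Fisher-Yates shuffle */
		size_t j = (size_t)rand() % i;
		size_t tmp = order[i - 1];

		order[i - 1] = order[j];
		order[j] = tmp;
	}
	for (i = 0; i < npages; i++)
		*(size_t *)(base + order[i] * page) = order[(i + 1) % npages];
	free(order);
	return 0;
}

/* Follow the chain for 'steps' hops and return the page index reached. */
size_t chase(const char *base, size_t page, size_t start, size_t steps)
{
	size_t cur = start;

	while (steps--)
		cur = *(const volatile size_t *)(base + cur * page);
	return cur;
}

static volatile size_t sink;	/* keeps the walks from being optimised out */

/* One shmat/fill/shmdt round, then a timed lap over the chain.
 * Returns the lap time in seconds, or -1.0 if the segment setup fails. */
double timed_lap_after_flush(size_t shm_bytes, const char *chain,
			     size_t page, size_t npages)
{
	struct timespec t0, t1;
	int id = shmget(IPC_PRIVATE, shm_bytes, IPC_CREAT | 0600);
	char *p;

	if (id < 0)
		return -1.0;
	p = shmat(id, NULL, 0);
	if (p == (char *)-1) {
		shmctl(id, IPC_RMID, NULL);
		return -1.0;
	}
	memset(p, 1, shm_bytes);		/* stand-in for fillshm() */
	sink = chase(chain, page, 0, npages);	/* warm the chain's translations */
	shmdt(p);				/* the unmap triggers the flush under test */
	shmctl(id, IPC_RMID, NULL);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	sink = chase(chain, page, 0, npages);	/* pays for any refills */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Running many rounds with the ceiling at its default and at 32, and
comparing the lap times alongside the TLBIE stall counters, would give
both sides of the trade-off. The chain buffer would need to be small
enough that its pages stay in the TLB when no full flush happens.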

cheers

^ permalink raw reply

* Re: [PATCH v2 2/2] powerpc/perf: Return regs->nip as instruction pointer value when SIAR is 0
From: Christophe Leroy @ 2021-08-16  6:56 UTC (permalink / raw)
  To: kajoljain, Michael Ellerman, linuxppc-dev
  Cc: Sukadev Bhattiprolu, atrajeev, maddy, rnsastry
In-Reply-To: <0068dbc4-fa4b-ce98-9e89-3f02f939720d@linux.ibm.com>



On 16/08/2021 at 08:44, kajoljain wrote:
> 
> 
> On 8/14/21 6:14 PM, Michael Ellerman wrote:
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> On 13/08/2021 at 10:24, Kajol Jain wrote:
>>>> In case of random sampling, there can be scenarios where SIAR does not
>>>> latch the sample address and reads as 0. Since the current code
>>>> returns the SIAR value directly, we could see multiple instruction
>>>> pointer values of 0 in the perf report.
>>
>> Can you please give more detail on that? What scenarios? On what CPUs?
>>
> 
> Hi Michael,
>      Sure I will update these details in my next patch-set.
> 
>>>> The patch resolves this issue by adding a ternary condition to return
>>>> regs->nip in case SIAR is 0.
>>>
>>> Your description seems rather similar to
>>> https://github.com/linuxppc/linux/commit/2ca13a4cc56c920a6c9fc8ee45d02bccacd7f46c
>>>
>>> Does it mean that the problem occurs on more than the power10 DD1 ?
>>>
>>> In that case, can the solution be common instead of doing something for power10 DD1 and something
>>> for others ?
>>
>> Agreed.
>>
>> This change would seem to make that P10 DD1 logic superfluous.
>>
>> Also we already have a fallback to regs->nip in the else case of the if,
>> so we should just use that rather than adding a ternary condition.
>>
>> eg.
>>
>> 	if (use_siar && siar_valid(regs) && siar)
>> 		return siar + perf_ip_adjust(regs);
>> 	else if (use_siar)
>> 		return 0;		// no valid instruction pointer
>> 	else
>> 		return regs->nip;
>>
>>
>> I'm also not sure why we have that return 0 case, I can't think of why
>> we'd ever want to do that rather than using nip. So maybe we should do
>> another patch to drop that case.
> 
> Yeah, makes sense. I will remove the return 0 case in my next version.
> 

This was added by commit 
https://github.com/linuxppc/linux/commit/e6878835ac4794f25385522d29c634b7bbb7cca9

Are we sure it was an error to add it and it can be removed ?
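
To make the question concrete, here is a small user-space model of the
two variants under discussion. It is only a sketch: struct ip_sample
and both function names are hypothetical, and the fields stand in for
use_siar, siar_valid(regs), the SIAR value, perf_ip_adjust(regs) and
regs->nip.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the inputs to perf_instruction_pointer(). */
struct ip_sample {
	bool use_siar;
	bool siar_valid;
	unsigned long siar;
	unsigned long adjust;
	unsigned long nip;
};

/* Michael's example as quoted: the "return 0" case is kept. */
unsigned long pick_ip_keep_zero(const struct ip_sample *s)
{
	if (s->use_siar && s->siar_valid && s->siar)
		return s->siar + s->adjust;
	else if (s->use_siar)
		return 0;		/* no valid instruction pointer */
	else
		return s->nip;
}

/* Same logic with the "return 0" case dropped: always fall back to nip. */
unsigned long pick_ip_drop_zero(const struct ip_sample *s)
{
	if (s->use_siar && s->siar_valid && s->siar)
		return s->siar + s->adjust;
	return s->nip;
}
```

The two only differ when use_siar is set and SIAR is either invalid or
0: the first reports 0, the second reports regs->nip.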

Christophe

^ permalink raw reply

* Re: [PATCH v2 2/2] powerpc/perf: Return regs->nip as instruction pointer value when SIAR is 0
From: kajoljain @ 2021-08-16  6:46 UTC (permalink / raw)
  To: Christophe Leroy, mpe, linuxppc-dev; +Cc: atrajeev, maddy, rnsastry
In-Reply-To: <c6110aa1-90e2-77aa-1ab5-355975037227@csgroup.eu>



On 8/13/21 3:04 PM, Christophe Leroy wrote:
> 
> 
> On 13/08/2021 at 10:24, Kajol Jain wrote:
>> In case of random sampling, there can be scenarios where SIAR does not
>> latch the sample address and reads as 0. Since the current code
>> returns the SIAR value directly, we could see multiple instruction
>> pointer values of 0 in the perf report.
>> The patch resolves this issue by adding a ternary condition to return
>> regs->nip in case SIAR is 0.
> 
> Your description seems rather similar to https://github.com/linuxppc/linux/commit/2ca13a4cc56c920a6c9fc8ee45d02bccacd7f46c
> 
> Does it mean that the problem occurs on more than the power10 DD1 ?
> 
> In that case, can the solution be common instead of doing something for power10 DD1 and something for others ?

Hi Christophe,
    Yes, it's better to have a common check. I will make these updates.

Thanks,
Kajol Jain

> 
>>
>> Fixes: 75382aa72f06 ("powerpc/perf: Move code to select SIAR or pt_regs
>> into perf_read_regs")
>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>> ---
>>   arch/powerpc/perf/core-book3s.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>> index 1b464aad29c4..aeecaaf6810f 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -2260,7 +2260,7 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
>>           else
>>               return regs->nip;
>>       } else if (use_siar && siar_valid(regs))
>> -        return siar + perf_ip_adjust(regs);
>> +        return siar ? siar + perf_ip_adjust(regs) : regs->nip;
>>       else if (use_siar)
>>           return 0;        // no valid instruction pointer
>>       else
>>

^ permalink raw reply

