LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v3 1/2] ASoC: dt-bindings: fsl_aud2htx: Add binding doc for aud2htx module
From: Mark Brown @ 2020-11-06 11:54 UTC (permalink / raw)
  To: Shengjiu Wang, perex, Xiubo.Lee, festevam, robh+dt, nicoleotsuka,
	timur, tiwai, devicetree, alsa-devel, lgirdwood
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1604281947-26874-1-git-send-email-shengjiu.wang@nxp.com>

On Mon, 2 Nov 2020 09:52:26 +0800, Shengjiu Wang wrote:
> AUD2HTX (Audio Subsystem TO HDMI TX Subsystem) is a new
> IP module found on i.MX8MP.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/2] ASoC: dt-bindings: fsl_aud2htx: Add binding doc for aud2htx module
      commit: 40f4c56d08f2a95f4f3b036987f171dde69ddd36
[2/2] ASoC: fsl_aud2htx: Add aud2htx module driver
      commit: 8a24c834c053ef1b0cdefbd9c5bcb487cbc5068f

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

^ permalink raw reply

* [PATCH] powerpc/64s: Remove RFI
From: Christophe Leroy @ 2020-11-06 11:36 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
  Cc: linuxppc-dev, linux-kernel

Last use of RFI on PPC64 was removed by
commit b8e90cb7bc04 ("powerpc/64: Convert the syscall exit path to
use RFI_TO_USER/KERNEL").

Remove the macro.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/ppc_asm.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 511786f0e40d..bedf3eb52ebc 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -495,7 +495,6 @@ END_FTR_SECTION_NESTED(CPU_FTR_CELL_TB_BUG, CPU_FTR_CELL_TB_BUG, 96)
 #endif
 
 #ifdef CONFIG_PPC_BOOK3S_64
-#define RFI		rfid
 #define MTMSRD(r)	mtmsrd	r
 #define MTMSR_EERI(reg)	mtmsrd	reg,1
 #else
-- 
2.25.0


^ permalink raw reply related

* Re: [PATCH 05/11 v3] kprobes/ftrace: Add recursion protection to the ftrace callback
From: Masami Hiramatsu @ 2020-11-06 10:25 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, James E.J. Bottomley, Guo Ren, linux-csky,
	H. Peter Anvin, Miroslav Benes, Ingo Molnar, linux-s390,
	Helge Deller, x86, Anil S Keshavamurthy, Christian Borntraeger,
	Naveen N. Rao, Petr Mladek, Vasily Gorbik, Heiko Carstens,
	Jiri Kosina, Borislav Petkov, Josh Poimboeuf, Thomas Gleixner,
	linux-parisc, linux-kernel, Masami Hiramatsu, Paul Mackerras,
	Andrew Morton, linuxppc-dev, David S. Miller
In-Reply-To: <20201106023546.944907560@goodmis.org>

On Thu, 05 Nov 2020 21:32:40 -0500
Steven Rostedt (VMware) <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> If a ftrace callback does not supply its own recursion protection and
> does not set the RECURSION_SAFE flag in its ftrace_ops, then ftrace will
> make a helper trampoline to do so before calling the callback instead of
> just calling the callback directly.
> 
> The default for ftrace_ops is going to change. It will expect that handlers
> provide their own recursion protection, unless its ftrace_ops states
> otherwise.
> 
> Link: https://lkml.kernel.org/r/20201028115613.140212174@goodmis.org
> 

Looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you!

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Guo Ren <guoren@kernel.org>
> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
> Cc: Helge Deller <deller@gmx.de>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: x86@kernel.org
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: linux-csky@vger.kernel.org
> Cc: linux-parisc@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-s390@vger.kernel.org
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
> 
> Changes since v2:
> 
>  - Move get_kprobe() into preempt disabled sections for various archs
> 
> 
>  arch/csky/kernel/probes/ftrace.c     | 12 ++++++++++--
>  arch/parisc/kernel/ftrace.c          | 16 +++++++++++++---
>  arch/powerpc/kernel/kprobes-ftrace.c | 11 ++++++++++-
>  arch/s390/kernel/ftrace.c            | 16 +++++++++++++---
>  arch/x86/kernel/kprobes/ftrace.c     | 12 ++++++++++--
>  5 files changed, 56 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
> index 5264763d05be..5eb2604fdf71 100644
> --- a/arch/csky/kernel/probes/ftrace.c
> +++ b/arch/csky/kernel/probes/ftrace.c
> @@ -13,16 +13,21 @@ int arch_check_ftrace_location(struct kprobe *p)
>  void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  			   struct ftrace_ops *ops, struct pt_regs *regs)
>  {
> +	int bit;
>  	bool lr_saver = false;
>  	struct kprobe *p;
>  	struct kprobe_ctlblk *kcb;
>  
> -	/* Preempt is disabled by ftrace */
> +	bit = ftrace_test_recursion_trylock();
> +	if (bit < 0)
> +		return;
> +
> +	preempt_disable_notrace();
>  	p = get_kprobe((kprobe_opcode_t *)ip);
>  	if (!p) {
>  		p = get_kprobe((kprobe_opcode_t *)(ip - MCOUNT_INSN_SIZE));
>  		if (unlikely(!p) || kprobe_disabled(p))
> -			return;
> +			goto out;
>  		lr_saver = true;
>  	}
>  
> @@ -56,6 +61,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  		 */
>  		__this_cpu_write(current_kprobe, NULL);
>  	}
> +out:
> +	preempt_enable_notrace();
> +	ftrace_test_recursion_unlock(bit);
>  }
>  NOKPROBE_SYMBOL(kprobe_ftrace_handler);
>  
> diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
> index 63e3ecb9da81..13d85042810a 100644
> --- a/arch/parisc/kernel/ftrace.c
> +++ b/arch/parisc/kernel/ftrace.c
> @@ -207,14 +207,21 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  			   struct ftrace_ops *ops, struct pt_regs *regs)
>  {
>  	struct kprobe_ctlblk *kcb;
> -	struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip);
> +	struct kprobe *p;
> +	int bit;
>  
> -	if (unlikely(!p) || kprobe_disabled(p))
> +	bit = ftrace_test_recursion_trylock();
> +	if (bit < 0)
>  		return;
>  
> +	preempt_disable_notrace();
> +	p = get_kprobe((kprobe_opcode_t *)ip);
> +	if (unlikely(!p) || kprobe_disabled(p))
> +		goto out;
> +
>  	if (kprobe_running()) {
>  		kprobes_inc_nmissed_count(p);
> -		return;
> +		goto out;
>  	}
>  
>  	__this_cpu_write(current_kprobe, p);
> @@ -235,6 +242,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  		}
>  	}
>  	__this_cpu_write(current_kprobe, NULL);
> +out:
> +	preempt_enable_notrace();
> +	ftrace_test_recursion_unlock(bit);
>  }
>  NOKPROBE_SYMBOL(kprobe_ftrace_handler);
>  
> diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
> index 972cb28174b2..5df8d50c65ae 100644
> --- a/arch/powerpc/kernel/kprobes-ftrace.c
> +++ b/arch/powerpc/kernel/kprobes-ftrace.c
> @@ -18,10 +18,16 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
>  {
>  	struct kprobe *p;
>  	struct kprobe_ctlblk *kcb;
> +	int bit;
>  
> +	bit = ftrace_test_recursion_trylock();
> +	if (bit < 0)
> +		return;
> +
> +	preempt_disable_notrace();
>  	p = get_kprobe((kprobe_opcode_t *)nip);
>  	if (unlikely(!p) || kprobe_disabled(p))
> -		return;
> +		goto out;
>  
>  	kcb = get_kprobe_ctlblk();
>  	if (kprobe_running()) {
> @@ -52,6 +58,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
>  		 */
>  		__this_cpu_write(current_kprobe, NULL);
>  	}
> +out:
> +	preempt_enable_notrace();
> +	ftrace_test_recursion_unlock(bit);
>  }
>  NOKPROBE_SYMBOL(kprobe_ftrace_handler);
>  
> diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
> index b388e87a08bf..8f31c726537a 100644
> --- a/arch/s390/kernel/ftrace.c
> +++ b/arch/s390/kernel/ftrace.c
> @@ -201,14 +201,21 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  		struct ftrace_ops *ops, struct pt_regs *regs)
>  {
>  	struct kprobe_ctlblk *kcb;
> -	struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip);
> +	struct kprobe *p;
> +	int bit;
>  
> -	if (unlikely(!p) || kprobe_disabled(p))
> +	bit = ftrace_test_recursion_trylock();
> +	if (bit < 0)
>  		return;
>  
> +	preempt_disable_notrace();
> +	p = get_kprobe((kprobe_opcode_t *)ip);
> +	if (unlikely(!p) || kprobe_disabled(p))
> +		goto out;
> +
>  	if (kprobe_running()) {
>  		kprobes_inc_nmissed_count(p);
> -		return;
> +		goto out;
>  	}
>  
>  	__this_cpu_write(current_kprobe, p);
> @@ -228,6 +235,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  		}
>  	}
>  	__this_cpu_write(current_kprobe, NULL);
> +out:
> +	preempt_enable_notrace();
> +	ftrace_test_recursion_unlock(bit);
>  }
>  NOKPROBE_SYMBOL(kprobe_ftrace_handler);
>  
> diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
> index 681a4b36e9bb..a40a6cdfcca3 100644
> --- a/arch/x86/kernel/kprobes/ftrace.c
> +++ b/arch/x86/kernel/kprobes/ftrace.c
> @@ -18,11 +18,16 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  {
>  	struct kprobe *p;
>  	struct kprobe_ctlblk *kcb;
> +	int bit;
>  
> -	/* Preempt is disabled by ftrace */
> +	bit = ftrace_test_recursion_trylock();
> +	if (bit < 0)
> +		return;
> +
> +	preempt_disable_notrace();
>  	p = get_kprobe((kprobe_opcode_t *)ip);
>  	if (unlikely(!p) || kprobe_disabled(p))
> -		return;
> +		goto out;
>  
>  	kcb = get_kprobe_ctlblk();
>  	if (kprobe_running()) {
> @@ -52,6 +57,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  		 */
>  		__this_cpu_write(current_kprobe, NULL);
>  	}
> +out:
> +	preempt_enable_notrace();
> +	ftrace_test_recursion_unlock(bit);
>  }
>  NOKPROBE_SYMBOL(kprobe_ftrace_handler);
>  
> -- 
> 2.28.0
> 
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH 00/36] Rid W=1 issues from TTY
From: Greg Kroah-Hartman @ 2020-11-06  9:54 UTC (permalink / raw)
  To: Lee Jones
  Cc: Robert Love, Nick Holloway, Russ Gorby, Kevin Wells, --,
	Andrew Morton, Laxman Dewangan, Paul Mackerras, David A. Hinds,
	linux-riscv, Jiri Slaby, linux-stm32, Bill Hawes, Roland Stigge,
	Rob Herring, Russell King, Michal Simek, Jonathan Hunter,
	Jan Dumon, Andy Gross, linux-serial, Sylvain Lemieux,
	Gerald Baeza, Sumit Semwal, Marko Kohtala, linux-media,
	Philipp Zabel, Alexandre Torgue, linux-arm-msm,
	Vladimir Zapolskiy, linaro-mm-sig, Stanislav Voronyi,
	C. Scott Ananian, Paul Walmsley, linux-tegra, Bjorn Andersson,
	Andrew J. Kroll, processes-Sapan Bhatia, Miloslav Trmac,
	Mike Hudson, Joseph Barrow, dri-devel, linux-kernel, paulkf,
	Filip Aben, Palmer Dabbelt, Maxime Coquelin, Thierry Reding,
	Colin Ian King, Jakub Jelinek, linuxppc-dev, Christian König,
	Russell King
In-Reply-To: <20201104193549.4026187-1-lee.jones@linaro.org>

On Wed, Nov 04, 2020 at 07:35:13PM +0000, Lee Jones wrote:
> This set is part of a larger effort attempting to clean-up W=1
> kernel builds, which are currently overwhelmingly riddled with
> niggly little warnings.

Many of these now applied, please update the series against my
tty-testing branch and resend the rest.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH v2 00/16] PCI: dwc: Another round of clean-ups
From: Marek Szyprowski @ 2020-11-06  9:17 UTC (permalink / raw)
  To: Rob Herring, Lorenzo Pieralisi
  Cc: Kunihiko Hayashi, Neil Armstrong, linux-pci, Binghui Wang,
	Bjorn Andersson, linux-tegra, Thierry Reding, linux-arm-kernel,
	Thomas Petazzoni, Jonathan Chocron, Shawn Guo, Jonathan Hunter,
	Fabio Estevam, Jerome Brunet, Jesper Nilsson, linux-samsung-soc,
	Minghuan Lian, Kevin Hilman, Pratyush Anand, Krzysztof Kozlowski,
	Kishon Vijay Abraham I, Kukjin Kim, Mingkai Hu, Xiaowei Song,
	NXP Linux Team, Richard Zhu, Martin Blumenstingl, linux-arm-msm,
	Sascha Hauer, Yue Wang, Murali Karicheri, Bjorn Helgaas,
	linux-amlogic, linux-omap, linux-arm-kernel, Roy Zang,
	Masahiro Yamada, Jingoo Han, Andy Gross, Vidya Sagar,
	Stanimir Varbanov, Pengutronix Kernel Team, Gustavo Pimentel,
	linuxppc-dev, Lucas Stach
In-Reply-To: <20201105211159.1814485-1-robh@kernel.org>

Hi Rob,

On 05.11.2020 22:11, Rob Herring wrote:
> Here's another batch of DWC PCI host refactoring. This series primarily
> moves more of the MSI, link up, and resource handling to the core
> code. Beyond a couple of minor fixes, new in this version is runtime
> detection of iATU regions instead of using DT properties.

iATU detection seems to work fine with 
https://lore.kernel.org/linux-samsung-soc/20201029134017.27400-1-m.szyprowski@samsung.com/ 
on top of your patchset on Samsung Exynos5433 SoC:

exynos-pcie 15700000.pcie: host bridge /soc@0/pcie@15700000 ranges:
exynos-pcie 15700000.pcie:       IO 0x000c001000..0x000c010fff -> 
0x0000000000
exynos-pcie 15700000.pcie:      MEM 0x000c011000..0x000ffffffe -> 
0x000c011000
exynos-pcie 15700000.pcie: iATU unroll: disabled
exynos-pcie 15700000.pcie: Detected iATU regions: 3 outbound, 5 inbound
exynos-pcie 15700000.pcie: Link up
exynos-pcie 15700000.pcie: PCI host bridge to bus 0000:00

Fell free to add my:

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

> No doubt I've probably broken something. Please test. I've run this thru
> kernelci and checked boards with DWC PCI which currently is just
> Layerscape boards (hint: add boards and/or enable PCI). A git branch is
> here[1].
>
> This is dependent on "PCI: dwc: Restore ATU memory resource setup to use
> last entry" which will be in v5.10-rc3.
>
> Rob
>
> [1] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git pci-more-dwc-cleanup
>
> Rob Herring (16):
>    PCI: dwc: Support multiple ATU memory regions
>    PCI: dwc/intel-gw: Move ATU offset out of driver match data
>    PCI: dwc: Move "dbi", "dbi2", and "addr_space" resource setup into
>      common code
>    PCI: dwc/intel-gw: Remove some unneeded function wrappers
>    PCI: dwc: Ensure all outbound ATU windows are reset
>    PCI: dwc/dra7xx: Use the common MSI irq_chip
>    PCI: dwc: Drop the .set_num_vectors() host op
>    PCI: dwc: Move MSI interrupt setup into DWC common code
>    PCI: dwc: Rework MSI initialization
>    PCI: dwc: Move link handling into common code
>    PCI: dwc: Move dw_pcie_msi_init() into core
>    PCI: dwc: Move dw_pcie_setup_rc() to DWC common code
>    PCI: dwc: Remove unnecessary wrappers around dw_pcie_host_init()
>    Revert "PCI: dwc/keystone: Drop duplicated 'num-viewport'"
>    PCI: dwc: Move inbound and outbound windows to common struct
>    PCI: dwc: Detect number of iATU windows
>
>   drivers/pci/controller/dwc/pci-dra7xx.c       | 141 +-----------------
>   drivers/pci/controller/dwc/pci-exynos.c       |  50 ++-----
>   drivers/pci/controller/dwc/pci-imx6.c         |  39 +----
>   drivers/pci/controller/dwc/pci-keystone.c     |  79 ++--------
>   .../pci/controller/dwc/pci-layerscape-ep.c    |  37 +----
>   drivers/pci/controller/dwc/pci-layerscape.c   |  67 +--------
>   drivers/pci/controller/dwc/pci-meson.c        |  53 ++-----
>   drivers/pci/controller/dwc/pcie-al.c          |  29 +---
>   drivers/pci/controller/dwc/pcie-armada8k.c    |  37 ++---
>   drivers/pci/controller/dwc/pcie-artpec6.c     |  76 +---------
>   .../pci/controller/dwc/pcie-designware-ep.c   |  58 +++----
>   .../pci/controller/dwc/pcie-designware-host.c | 139 ++++++++++-------
>   .../pci/controller/dwc/pcie-designware-plat.c |  70 +--------
>   drivers/pci/controller/dwc/pcie-designware.c  |  93 +++++++++++-
>   drivers/pci/controller/dwc/pcie-designware.h  |  24 +--
>   drivers/pci/controller/dwc/pcie-histb.c       |  37 ++---
>   drivers/pci/controller/dwc/pcie-intel-gw.c    |  67 ++-------
>   drivers/pci/controller/dwc/pcie-kirin.c       |  62 +-------
>   drivers/pci/controller/dwc/pcie-qcom.c        |  38 +----
>   drivers/pci/controller/dwc/pcie-spear13xx.c   |  62 +++-----
>   drivers/pci/controller/dwc/pcie-tegra194.c    |  41 +----
>   drivers/pci/controller/dwc/pcie-uniphier-ep.c |  38 +----
>   drivers/pci/controller/dwc/pcie-uniphier.c    |  51 +------
>   23 files changed, 356 insertions(+), 1032 deletions(-)
>
> --
> 2.25.1
>
Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply

* Re: [PATCH 03/18] powerpc: bad_page_fault, do_break get registers from regs
From: Christophe Leroy @ 2020-11-06  8:14 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20201105143431.1874789-4-npiggin@gmail.com>



Le 05/11/2020 à 15:34, Nicholas Piggin a écrit :
> This also moves the 32s DABR match to C.

Is there a real benefit doing this ?

> 
> Similar to the previous patch this makes interrupt handler function
> types more regular so they can be wrapped with the next patch.
> 
> bad_page_fault and do_break are not performance critical.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/bug.h             |  2 +-
>   arch/powerpc/include/asm/debug.h           |  3 +--
>   arch/powerpc/kernel/entry_32.S             | 14 ++++----------
>   arch/powerpc/kernel/exceptions-64e.S       |  3 +--
>   arch/powerpc/kernel/exceptions-64s.S       |  3 +--
>   arch/powerpc/kernel/head_8xx.S             |  5 ++---
>   arch/powerpc/kernel/process.c              |  7 +++----
>   arch/powerpc/kernel/traps.c                |  2 +-
>   arch/powerpc/mm/book3s64/hash_utils.c      |  4 ++--
>   arch/powerpc/mm/book3s64/slb.c             |  2 +-
>   arch/powerpc/mm/fault.c                    | 14 +++++++-------
>   arch/powerpc/platforms/8xx/machine_check.c |  2 +-
>   12 files changed, 25 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
> index 2fa0cf6c6011..4af6c3835eb2 100644
> --- a/arch/powerpc/include/asm/bug.h
> +++ b/arch/powerpc/include/asm/bug.h
> @@ -113,7 +113,7 @@
>   struct pt_regs;
>   extern long do_page_fault(struct pt_regs *);
>   extern long hash__do_page_fault(struct pt_regs *);
> -extern void bad_page_fault(struct pt_regs *, unsigned long, int);
> +extern void bad_page_fault(struct pt_regs *, int);

pointless extern

Christophe

>   extern void _exception(int, struct pt_regs *, int, unsigned long);
>   extern void _exception_pkey(struct pt_regs *, unsigned long, int);
>   extern void die(const char *, struct pt_regs *, long);
> diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
> index ec57daf87f40..0550eceab3ca 100644
> --- a/arch/powerpc/include/asm/debug.h
> +++ b/arch/powerpc/include/asm/debug.h
> @@ -52,8 +52,7 @@ extern void do_send_trap(struct pt_regs *regs, unsigned long address,
>   			 unsigned long error_code, int brkpt);
>   #else
>   
> -extern void do_break(struct pt_regs *regs, unsigned long address,
> -		     unsigned long error_code);
> +void do_break(struct pt_regs *regs);
>   #endif
>   
>   #endif /* _ASM_POWERPC_DEBUG_H */
> diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
> index 8cdc8bcde703..eb97df234a0c 100644
> --- a/arch/powerpc/kernel/entry_32.S
> +++ b/arch/powerpc/kernel/entry_32.S
> @@ -657,10 +657,6 @@ ppc_swapcontext:
>   	.globl	handle_page_fault
>   handle_page_fault:
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
> -#ifdef CONFIG_PPC_BOOK3S_32
> -	andis.  r0,r5,DSISR_DABRMATCH@h
> -	bne-    handle_dabr_fault
> -#endif
>   	bl	do_page_fault
>   	cmpwi	r3,0
>   	beq+	ret_from_except
> @@ -668,19 +664,17 @@ handle_page_fault:
>   	lwz	r0,_TRAP(r1)
>   	clrrwi	r0,r0,1
>   	stw	r0,_TRAP(r1)
> -	mr	r5,r3
> +	mr	r4,r3		/* err arg for bad_page_fault */
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	lwz	r4,_DAR(r1)
> +#ifdef CONFIG_PPC_BOOK3S_32
> +	blt	handle_dabr_fault
> +#endif
>   	bl	bad_page_fault
>   	b	ret_from_except_full
>   
>   #ifdef CONFIG_PPC_BOOK3S_32
>   	/* We have a data breakpoint exception - handle it */
>   handle_dabr_fault:
> -	SAVE_NVGPRS(r1)
> -	lwz	r0,_TRAP(r1)
> -	clrrwi	r0,r0,1
> -	stw	r0,_TRAP(r1)
>   	bl      do_break
>   	b	ret_from_except_full
>   #endif
> diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> index 25fa7d5a643c..dc728bb1c89a 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -1018,9 +1018,8 @@ storage_fault_common:
>   	bne-	1f
>   	b	ret_from_except_lite
>   1:	bl	save_nvgprs
> -	mr	r5,r3
> +	mr	r4,r3
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	ld	r4,_DAR(r1)
>   	bl	bad_page_fault
>   	b	ret_from_except
>   
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 1f34cfd1887c..e6558c4d3f81 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -2135,8 +2135,7 @@ EXC_COMMON_BEGIN(h_data_storage_common)
>   	GEN_COMMON h_data_storage
>   	addi    r3,r1,STACK_FRAME_OVERHEAD
>   BEGIN_MMU_FTR_SECTION
> -	ld	r4,_DAR(r1)
> -	li	r5,SIGSEGV
> +	li	r4,SIGSEGV
>   	bl      bad_page_fault
>   MMU_FTR_SECTION_ELSE
>   	bl      unknown_exception
> diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> index 0cd95b633e2b..13eda7154695 100644
> --- a/arch/powerpc/kernel/head_8xx.S
> +++ b/arch/powerpc/kernel/head_8xx.S
> @@ -408,10 +408,9 @@ do_databreakpoint:
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	mfspr	r4,SPRN_BAR
>   	stw	r4,_DAR(r11)
> -#ifdef CONFIG_VMAP_STACK
> -	lwz	r5,_DSISR(r11)
> -#else
> +#ifndef CONFIG_VMAP_STACK
>   	mfspr	r5,SPRN_DSISR
> +	stw	r5,_DSISR(r11)
>   #endif
>   	EXC_XFER_STD(0x1c00, do_break)
>   
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index d421a2c7f822..0bdd3ed653df 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -660,11 +660,10 @@ static void do_break_handler(struct pt_regs *regs)
>   	}
>   }
>   
> -void do_break (struct pt_regs *regs, unsigned long address,
> -		    unsigned long error_code)
> +void do_break(struct pt_regs *regs)
>   {
>   	current->thread.trap_nr = TRAP_HWBKPT;
> -	if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, error_code,
> +	if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, regs->dsisr,
>   			11, SIGSEGV) == NOTIFY_STOP)
>   		return;
>   
> @@ -682,7 +681,7 @@ void do_break (struct pt_regs *regs, unsigned long address,
>   		do_break_handler(regs);
>   
>   	/* Deliver the signal to userspace */
> -	force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address);
> +	force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)regs->dar);
>   }
>   #endif	/* CONFIG_PPC_ADV_DEBUG_REGS */
>   
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 5006dcbe1d9f..902fcbd1a778 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -1641,7 +1641,7 @@ void alignment_exception(struct pt_regs *regs)
>   	if (user_mode(regs))
>   		_exception(sig, regs, code, regs->dar);
>   	else
> -		bad_page_fault(regs, regs->dar, sig);
> +		bad_page_fault(regs, sig);
>   
>   bail:
>   	exception_exit(prev_state);
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 0f0bd4af4b2d..731518e7d56f 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -1537,7 +1537,7 @@ long do_hash_fault(struct pt_regs *regs)
>   	 * the access, or panic if there isn't a handler.
>   	 */
>   	if (unlikely(in_nmi())) {
> -		bad_page_fault(regs, ea, SIGSEGV);
> +		bad_page_fault(regs, SIGSEGV);
>   		return 0;
>   	}
>   
> @@ -1576,7 +1576,7 @@ long do_hash_fault(struct pt_regs *regs)
>   			else
>   				_exception(SIGBUS, regs, BUS_ADRERR, ea);
>   		} else {
> -			bad_page_fault(regs, ea, SIGBUS);
> +			bad_page_fault(regs, SIGBUS);
>   		}
>   		err = 0;
>   
> diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
> index cc34d50874c1..ae89ad516247 100644
> --- a/arch/powerpc/mm/book3s64/slb.c
> +++ b/arch/powerpc/mm/book3s64/slb.c
> @@ -898,7 +898,7 @@ void do_bad_slb_fault(struct pt_regs *regs)
>   		if (user_mode(regs))
>   			_exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
>   		else
> -			bad_page_fault(regs, regs->dar, SIGSEGV);
> +			bad_page_fault(regs, SIGSEGV);
>   	} else if (err == -EINVAL) {
>   		unrecoverable_exception(regs);
>   	} else {
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 390a296b16a3..49fbe564ea2b 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -375,7 +375,7 @@ static void sanity_check_fault(bool is_write, bool is_user,
>   #elif defined(CONFIG_PPC_BOOK3E_64)
>   #define page_fault_is_bad(__err)	((__err) & DSISR_BAD_FAULT_64S)
>   #else
> -#define page_fault_is_bad(__err)	((__err) & DSISR_BAD_FAULT_32S)
> +#define page_fault_is_bad(__err)	((__err) & DSISR_BAD_FAULT_32S | DSISR_DABRMATCH)
>   #endif
>   #endif
>   
> @@ -408,7 +408,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>   		return 0;
>   
>   	if (unlikely(page_fault_is_bad(error_code))) {
> -		if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && (error_code & DSISR_DABRMATCH))
> +		if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (error_code & DSISR_DABRMATCH))
>   			return -1;
>   
>   		if (is_user) {
> @@ -562,14 +562,14 @@ long do_page_fault(struct pt_regs *regs)
>   	/* 32 and 64e handle errors in their asm code */
>   	if (unlikely(err)) {
>   		if (err > 0) {
> -			bad_page_fault(regs, address, err);
> +			bad_page_fault(regs, err);
>   			err = 0;
>   		} else {
>   			/*
>   			 * do_break() may change NV GPRS while handling the
>   			 * breakpoint. Return -ve to caller to do that.
>   			 */
> -			do_break(regs, address, error_code);
> +			do_break(regs);
>   		}
>   	}
>   #endif
> @@ -591,14 +591,14 @@ long hash__do_page_fault(struct pt_regs *regs)
>   	err = __do_page_fault(regs, address, error_code);
>   	if (unlikely(err)) {
>   		if (err > 0) {
> -			bad_page_fault(regs, address, err);
> +			bad_page_fault(regs, err);
>   			err = 0;
>   		} else {
>   			/*
>   			 * do_break() may change NV GPRS while handling the
>   			 * breakpoint. Return -ve to caller to do that.
>   			 */
> -			do_break(regs, address, error_code);
> +			do_break(regs);
>   		}
>   	}
>   
> @@ -612,7 +612,7 @@ NOKPROBE_SYMBOL(hash__do_page_fault);
>    * It is called from the DSI and ISI handlers in head.S and from some
>    * of the procedures in traps.c.
>    */
> -void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
> +void bad_page_fault(struct pt_regs *regs, int sig)
>   {
>   	const struct exception_table_entry *entry;
>   	int is_write = page_fault_is_write(regs->dsisr);
> diff --git a/arch/powerpc/platforms/8xx/machine_check.c b/arch/powerpc/platforms/8xx/machine_check.c
> index 88dedf38eccd..656365975895 100644
> --- a/arch/powerpc/platforms/8xx/machine_check.c
> +++ b/arch/powerpc/platforms/8xx/machine_check.c
> @@ -26,7 +26,7 @@ int machine_check_8xx(struct pt_regs *regs)
>   	 * to deal with that than having a wart in the mcheck handler.
>   	 * -- BenH
>   	 */
> -	bad_page_fault(regs, regs->dar, SIGBUS);
> +	bad_page_fault(regs, SIGBUS);
>   	return 1;
>   #else
>   	return 0;
> 

^ permalink raw reply

* Re: [PATCH 02/18] powerpc: remove arguments from fault handler functions
From: Christophe Leroy @ 2020-11-06  7:59 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20201105143431.1874789-3-npiggin@gmail.com>



Le 05/11/2020 à 15:34, Nicholas Piggin a écrit :
> Make mm fault handlers all just take the pt_regs * argument and load
> DAR/DSISR from that. Make those that return a value return long.
> 
> This is done to make the function signatures match other handlers, which
> will help with a future patch to add wrappers. Explicit arguments could
> be added for performance but that would require more wrapper macro
> variants.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/asm-prototypes.h |  4 ++--
>   arch/powerpc/include/asm/bug.h            |  4 ++--
>   arch/powerpc/kernel/exceptions-64e.S      |  2 --
>   arch/powerpc/kernel/exceptions-64s.S      | 14 ++------------
>   arch/powerpc/kernel/head_40x.S            | 10 +++++-----
>   arch/powerpc/kernel/head_8xx.S            |  6 +++---
>   arch/powerpc/kernel/head_book3s_32.S      |  6 ++----
>   arch/powerpc/kernel/head_booke.h          |  4 +---
>   arch/powerpc/mm/book3s64/hash_utils.c     |  8 +++++---
>   arch/powerpc/mm/book3s64/slb.c            | 11 +++++++----
>   arch/powerpc/mm/fault.c                   | 16 +++++++++-------
>   11 files changed, 38 insertions(+), 47 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
> index d0b832cbbec8..22c9d08fa3a4 100644
> --- a/arch/powerpc/include/asm/asm-prototypes.h
> +++ b/arch/powerpc/include/asm/asm-prototypes.h
> @@ -82,8 +82,8 @@ void kernel_bad_stack(struct pt_regs *regs);
>   void system_reset_exception(struct pt_regs *regs);
>   void machine_check_exception(struct pt_regs *regs);
>   void emulation_assist_interrupt(struct pt_regs *regs);
> -long do_slb_fault(struct pt_regs *regs, unsigned long ea);
> -void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err);
> +long do_slb_fault(struct pt_regs *regs);
> +void do_bad_slb_fault(struct pt_regs *regs);
>   
>   /* signals, syscalls and interrupts */
>   long sys_swapcontext(struct ucontext __user *old_ctx,
> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
> index d714d83bbc7c..2fa0cf6c6011 100644
> --- a/arch/powerpc/include/asm/bug.h
> +++ b/arch/powerpc/include/asm/bug.h
> @@ -111,8 +111,8 @@
>   #ifndef __ASSEMBLY__
>   
>   struct pt_regs;
> -extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
> -extern int hash__do_page_fault(struct pt_regs *, unsigned long, unsigned long);
> +extern long do_page_fault(struct pt_regs *);
> +extern long hash__do_page_fault(struct pt_regs *);

extern is pointless

>   extern void bad_page_fault(struct pt_regs *, unsigned long, int);
>   extern void _exception(int, struct pt_regs *, int, unsigned long);
>   extern void _exception_pkey(struct pt_regs *, unsigned long, int);
> diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> index f579ce46eef2..25fa7d5a643c 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -1011,8 +1011,6 @@ storage_fault_common:
>   	std	r14,_DAR(r1)
>   	std	r15,_DSISR(r1)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	mr	r4,r14
> -	mr	r5,r15
>   	ld	r14,PACA_EXGEN+EX_R14(r13)
>   	ld	r15,PACA_EXGEN+EX_R15(r13)
>   	bl	do_page_fault
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index f830b893fe03..1f34cfd1887c 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1437,8 +1437,6 @@ EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
>   EXC_VIRT_END(data_access, 0x4300, 0x80)
>   EXC_COMMON_BEGIN(data_access_common)
>   	GEN_COMMON data_access
> -	ld	r4,_DAR(r1)
> -	ld	r5,_DSISR(r1)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   BEGIN_MMU_FTR_SECTION
>   	bl	do_hash_fault
> @@ -1491,10 +1489,9 @@ EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
>   EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
>   EXC_COMMON_BEGIN(data_access_slb_common)
>   	GEN_COMMON data_access_slb
> -	ld	r4,_DAR(r1)
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
>   BEGIN_MMU_FTR_SECTION
>   	/* HPT case, do SLB fault */
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	do_slb_fault
>   	cmpdi	r3,0
>   	bne-	1f
> @@ -1506,8 +1503,6 @@ MMU_FTR_SECTION_ELSE
>   ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
>   	std	r3,RESULT(r1)
>   	RECONCILE_IRQ_STATE(r10, r11)
> -	ld	r4,_DAR(r1)
> -	ld	r5,RESULT(r1)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	do_bad_slb_fault
>   	b	interrupt_return
> @@ -1542,8 +1537,6 @@ EXC_VIRT_BEGIN(instruction_access, 0x4400, 0x80)
>   EXC_VIRT_END(instruction_access, 0x4400, 0x80)
>   EXC_COMMON_BEGIN(instruction_access_common)
>   	GEN_COMMON instruction_access
> -	ld	r4,_DAR(r1)
> -	ld	r5,_DSISR(r1)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   BEGIN_MMU_FTR_SECTION
>   	bl	do_hash_fault
> @@ -1587,10 +1580,9 @@ EXC_VIRT_BEGIN(instruction_access_slb, 0x4480, 0x80)
>   EXC_VIRT_END(instruction_access_slb, 0x4480, 0x80)
>   EXC_COMMON_BEGIN(instruction_access_slb_common)
>   	GEN_COMMON instruction_access_slb
> -	ld	r4,_DAR(r1)
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
>   BEGIN_MMU_FTR_SECTION
>   	/* HPT case, do SLB fault */
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	do_slb_fault
>   	cmpdi	r3,0
>   	bne-	1f
> @@ -1602,8 +1594,6 @@ MMU_FTR_SECTION_ELSE
>   ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
>   	std	r3,RESULT(r1)
>   	RECONCILE_IRQ_STATE(r10, r11)
> -	ld	r4,_DAR(r1)
> -	ld	r5,RESULT(r1)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	do_bad_slb_fault
>   	b	interrupt_return
> diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
> index 44c9018aed1b..ea31f75e9692 100644
> --- a/arch/powerpc/kernel/head_40x.S
> +++ b/arch/powerpc/kernel/head_40x.S
> @@ -179,9 +179,9 @@ _ENTRY(saved_ksp_limit)
>    */
>   	START_EXCEPTION(0x0300,	DataStorage)
>   	EXCEPTION_PROLOG
> -	mfspr	r5, SPRN_ESR		/* Grab the ESR, save it, pass arg3 */
> +	mfspr	r5, SPRN_ESR		/* Grab the ESR, save it */
>   	stw	r5, _ESR(r11)
> -	mfspr	r4, SPRN_DEAR		/* Grab the DEAR, save it, pass arg2 */
> +	mfspr	r4, SPRN_DEAR		/* Grab the DEAR, save it */
>   	stw	r4, _DEAR(r11)
>   	EXC_XFER_LITE(0x300, handle_page_fault)
>   
> @@ -191,9 +191,9 @@ _ENTRY(saved_ksp_limit)
>    */
>   	START_EXCEPTION(0x0400, InstructionAccess)
>   	EXCEPTION_PROLOG
> -	mr	r4,r12			/* Pass SRR0 as arg2 */
> -	stw	r4, _DEAR(r11)
> -	li	r5,0			/* Pass zero as arg3 */
> +	li	r5,0
> +	stw	r5, _ESR(r11)		/* Zero ESR */
> +	stw	r12, _DEAR(r11)		/* SRR0 as DEAR */

I think we should avoid this, see below

>   	EXC_XFER_LITE(0x400, handle_page_fault)
>   
>   /* 0x0500 - External Interrupt Exception */
> diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> index 9f359d3fba74..0cd95b633e2b 100644
> --- a/arch/powerpc/kernel/head_8xx.S
> +++ b/arch/powerpc/kernel/head_8xx.S
> @@ -356,14 +356,14 @@ DataStoreTLBMiss:
>   	. = 0x1300
>   InstructionTLBError:
>   	EXCEPTION_PROLOG
> -	mr	r4,r12
>   	andis.	r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */

Could avoid this, see below

>   	andis.	r10,r9,SRR1_ISI_NOPT@h
>   	beq+	.Litlbie
> -	tlbie	r4
> +	tlbie	r12
>   	/* 0x400 is InstructionAccess exception, needed by bad_page_fault() */
>   .Litlbie:
> -	stw	r4, _DAR(r11)
> +	stw	r12, _DAR(r11)
> +	stw	r5, _DSISR(r11)

And this

>   	EXC_XFER_LITE(0x400, handle_page_fault)
>   
>   /* This is the data TLB error on the MPC8xx.  This could be due to
> diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
> index 5eb9eedac920..81c69769cec6 100644
> --- a/arch/powerpc/kernel/head_book3s_32.S
> +++ b/arch/powerpc/kernel/head_book3s_32.S
> @@ -369,9 +369,9 @@ BEGIN_MMU_FTR_SECTION
>   	bl	hash_page
>   END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
>   #endif	/* CONFIG_VMAP_STACK */
> -1:	mr	r4,r12
>   	andis.	r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
> -	stw	r4, _DAR(r11)
> +	stw	r5, _DSISR(r11)
> +	stw	r12, _DAR(r11)

And this including the andis.

>   	EXC_XFER_LITE(0x400, handle_page_fault)
>   
>   /* External interrupt */
> @@ -698,8 +698,6 @@ handle_page_fault_tramp_1:
>   #ifdef CONFIG_VMAP_STACK
>   	EXCEPTION_PROLOG_2 handle_dar_dsisr=1
>   #endif
> -	lwz	r4, _DAR(r11)
> -	lwz	r5, _DSISR(r11)
>   	/* fall through */
>   handle_page_fault_tramp_2:
>   	EXC_XFER_LITE(0x300, handle_page_fault)
> diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
> index 71c359d438b5..1da0c1d1b0a1 100644
> --- a/arch/powerpc/kernel/head_booke.h
> +++ b/arch/powerpc/kernel/head_booke.h
> @@ -477,9 +477,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
>   	NORMAL_EXCEPTION_PROLOG(INST_STORAGE);		      \
>   	mfspr	r5,SPRN_ESR;		/* Grab the ESR and save it */	      \
>   	stw	r5,_ESR(r11);						      \
> -	mr      r4,r12;                 /* Pass SRR0 as arg2 */		      \
> -	stw	r4, _DEAR(r11);						      \
> -	li      r5,0;                   /* Pass zero as arg3 */		      \
> +	stw	r12, _DEAR(r11);	/* Pass SRR0 as arg2 */		      \

And this

>   	EXC_XFER_LITE(0x0400, handle_page_fault)
>   
>   #define ALIGNMENT_EXCEPTION						      \
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index bfa1b1966218..0f0bd4af4b2d 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -1510,13 +1510,15 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
>   }
>   EXPORT_SYMBOL_GPL(hash_page);
>   
> -int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
> +long do_hash_fault(struct pt_regs *regs)
>   {
> +	unsigned long ea = regs->dar;
> +	unsigned long dsisr = regs->dsisr;
>   	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
>   	unsigned long flags = 0;
>   	struct mm_struct *mm;
>   	unsigned int region_id;
> -	int err;
> +	long err;
>   
>   	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
>   		goto _do_page_fault;
> @@ -1580,7 +1582,7 @@ int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
>   
>   	} else if (err) {
>   _do_page_fault:
> -		err = hash__do_page_fault(regs, ea, dsisr);
> +		err = hash__do_page_fault(regs);
>   	}
>   
>   	return err;
> diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
> index c30fcbfa0e32..cc34d50874c1 100644
> --- a/arch/powerpc/mm/book3s64/slb.c
> +++ b/arch/powerpc/mm/book3s64/slb.c
> @@ -837,8 +837,9 @@ static long slb_allocate_user(struct mm_struct *mm, unsigned long ea)
>   	return slb_insert_entry(ea, context, flags, ssize, false);
>   }
>   
> -long do_slb_fault(struct pt_regs *regs, unsigned long ea)
> +long do_slb_fault(struct pt_regs *regs)
>   {
> +	unsigned long ea = regs->dar;
>   	unsigned long id = get_region_id(ea);
>   
>   	/* IRQs are not reconciled here, so can't check irqs_disabled */
> @@ -889,13 +890,15 @@ long do_slb_fault(struct pt_regs *regs, unsigned long ea)
>   	}
>   }
>   
> -void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err)
> +void do_bad_slb_fault(struct pt_regs *regs)
>   {
> +	int err = regs->result;
> +
>   	if (err == -EFAULT) {
>   		if (user_mode(regs))
> -			_exception(SIGSEGV, regs, SEGV_BNDERR, ea);
> +			_exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
>   		else
> -			bad_page_fault(regs, ea, SIGSEGV);
> +			bad_page_fault(regs, regs->dar, SIGSEGV);
>   	} else if (err == -EINVAL) {
>   		unrecoverable_exception(regs);
>   	} else {
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index e65a49f246ef..390a296b16a3 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -549,11 +549,12 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>   }
>   NOKPROBE_SYMBOL(__do_page_fault);
>   
> -int do_page_fault(struct pt_regs *regs, unsigned long address,
> -		  unsigned long error_code)
> +long do_page_fault(struct pt_regs *regs)
>   {
>   	enum ctx_state prev_state = exception_enter();
> -	int err;
> +	unsigned long address = regs->dar;
> +	unsigned long error_code = regs->dsisr;
> +	long err;

By doing something more or less like this (need to be tuned for bookE as well):

+	int is_exec = TRAP(regs) == 0x400;
+	unsigned long address = is_exec ? regs->ssr0 : regs->dar;
+	unsigned long error_code = is_exec ? (regs->ssr1 & DSISR_SRR1_MATCH_32S) : regs->dsisr;

>   
>   	err = __do_page_fault(regs, address, error_code);
>   
> @@ -580,11 +581,12 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>   NOKPROBE_SYMBOL(do_page_fault);
>   
>   #ifdef CONFIG_PPC_BOOK3S_64
> -/* Same as do_page_fault but interrupt entry has already run in do_hash_fault */
> -int hash__do_page_fault(struct pt_regs *regs, unsigned long address,
> -		  unsigned long error_code)
> +/* Same as do_page_fault but no interrupt entry */
> +long hash__do_page_fault(struct pt_regs *regs)
>   {
> -	int err;
> +	unsigned long address = regs->dar;
> +	unsigned long error_code = regs->dsisr;
> +	long err;
>   
>   	err = __do_page_fault(regs, address, error_code);
>   	if (unlikely(err)) {
> 

There is probably also something we can simplify around get_and_save_dar_dsisr_on_stack() macro in 
head_32.h, no need to reload DAR, at least for 8xx. Maybe as a followup patch later.

Christophe

^ permalink raw reply

* Re: [PATCH 18/18] powerpc/64s: move power4 idle entirely to C
From: kernel test robot @ 2020-11-06  7:34 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: kbuild-all, Nicholas Piggin
In-Reply-To: <20201105143431.1874789-19-npiggin@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1626 bytes --]

Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.10-rc2 next-20201105]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-interrupt-wrappers/20201105-231909
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/561aae13dd3400b772b91d60681d55937e046cbc
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Nicholas-Piggin/powerpc-interrupt-wrappers/20201105-231909
        git checkout 561aae13dd3400b772b91d60681d55937e046cbc
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   {standard input}: Assembler messages:
>> {standard input}:662: Error: unsupported relocation against r13

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 71461 bytes --]

^ permalink raw reply

* Re: [PATCH v2] powerpc: topology.h: fix build when CONFIG_NUMA=n
From: Christophe Leroy @ 2020-11-06  6:58 UTC (permalink / raw)
  To: Scott Cheloha, linuxppc-dev
  Cc: Nathan Lynch, Laurent Dufour, kernel test robot
In-Reply-To: <20201105223040.3612663-1-cheloha@linux.ibm.com>



Le 05/11/2020 à 23:30, Scott Cheloha a écrit :
> Add a non-NUMA definition for of_drconf_to_nid_single() to topology.h
> so we have one even if powerpc/mm/numa.c is not compiled.  On a non-NUMA
> kernel the appropriate node id is always first_online_node.
> 
> Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Fixes: 72cdd117c449 ("pseries/hotplug-memory: hot-add: skip redundant LMB lookup")

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>

> ---
> v1: Initial patch.
> 
> v2: Incorporate suggested cleanups from Christophe Leroy.
> 
>   arch/powerpc/include/asm/topology.h | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index 8728590f514a..3beeb030cd78 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -6,6 +6,7 @@
>   
>   struct device;
>   struct device_node;
> +struct drmem_lmb;
>   
>   #ifdef CONFIG_NUMA
>   
> @@ -61,6 +62,9 @@ static inline int early_cpu_to_node(int cpu)
>   	 */
>   	return (nid < 0) ? 0 : nid;
>   }
> +
> +int of_drconf_to_nid_single(struct drmem_lmb *lmb);
> +
>   #else
>   
>   static inline int early_cpu_to_node(int cpu) { return 0; }
> @@ -84,10 +88,12 @@ static inline int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
>   	return 0;
>   }
>   
> -#endif /* CONFIG_NUMA */
> +static inline int of_drconf_to_nid_single(struct drmem_lmb *lmb)
> +{
> +	return first_online_node;
> +}
>   
> -struct drmem_lmb;
> -int of_drconf_to_nid_single(struct drmem_lmb *lmb);
> +#endif /* CONFIG_NUMA */
>   
>   #if defined(CONFIG_NUMA) && defined(CONFIG_PPC_SPLPAR)
>   extern int find_and_online_cpu_nid(int cpu);
> 

^ permalink raw reply

* Re: [PATCH v2] kernel/watchdog: Fix watchdog_allowed_mask not used warning
From: Christophe Leroy @ 2020-11-06  6:57 UTC (permalink / raw)
  To: Santosh Sivaraj, Linux Kernel, linuxppc-dev
  Cc: Petr Mladek, Thomas Gleixner, Andrew Morton
In-Reply-To: <20201106015025.1281561-1-santosh@fossix.org>



Le 06/11/2020 à 02:50, Santosh Sivaraj a écrit :
> Define watchdog_allowed_mask only when SOFTLOCKUP_DETECTOR is enabled.
> 
> Fixes: 7feeb9cd4f5b ("watchdog/sysctl: Clean up sysctl variable name space")
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Reviewed-by: Petr Mladek <pmladek@suse.com>
> Signed-off-by: Santosh Sivaraj <santosh@fossix.org>

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>

> ---
> v2:
> Added Petr's reviewed-by from [1] and add fixes tag as suggested by Christophe.
> 
> [1]: https://lkml.org/lkml/2020/8/20/1030
> 
>   kernel/watchdog.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 5abb5b22ad13..71109065bd8e 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -44,8 +44,6 @@ int __read_mostly soft_watchdog_user_enabled = 1;
>   int __read_mostly watchdog_thresh = 10;
>   static int __read_mostly nmi_watchdog_available;
>   
> -static struct cpumask watchdog_allowed_mask __read_mostly;
> -
>   struct cpumask watchdog_cpumask __read_mostly;
>   unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
>   
> @@ -162,6 +160,8 @@ static void lockup_detector_update_enable(void)
>   int __read_mostly sysctl_softlockup_all_cpu_backtrace;
>   #endif
>   
> +static struct cpumask watchdog_allowed_mask __read_mostly;
> +
>   /* Global variables, exported for sysctl */
>   unsigned int __read_mostly softlockup_panic =
>   			CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE;
> 

^ permalink raw reply

* [PATCH v3] powerpc/watchpoint: Workaround P10 DD1 issue with VSX-32 byte instructions
From: Ravi Bangoria @ 2020-11-06  4:56 UTC (permalink / raw)
  To: mpe
  Cc: christophe.leroy, ravi.bangoria, mikey, jniethe5, npiggin, maddy,
	paulus, naveen.n.rao, linuxppc-dev

POWER10 DD1 has an issue where it generates watchpoint exceptions when
it shouldn't. The conditions where this occur are:

 - octword op
 - ending address of DAWR range is less than starting address of op
 - those addresses need to be in the same or in two consecutive 512B
   blocks
 - 'op address + 64B' generates an address that has a carry into bit
   52 (crosses 2K boundary)

Handle such spurious exception by considering them as extraneous and
emulating/single-steeping instruction without generating an event.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
[Fixed build warning reported by kernel test robot]
Reported-by: kernel test robot <lkp@intel.com>
---
v2->v3:
 - v2: https://lore.kernel.org/r/20201022034039.330365-1-ravi.bangoria@linux.ibm.com
 - Drop first patch which introduced CPU_FTRS_POWER10_DD1. Instead use
   P1 DD1 PVR direclty in if condition. We can't set CPU_FTRS_POWER10_DD1
   inside guest as guest can be migrated to futur version of cpu.

Dependency: VSX-32 byte emulation support patches
  https://lore.kernel.org/r/20201011050908.72173-1-ravi.bangoria@linux.ibm.com

 arch/powerpc/kernel/hw_breakpoint.c | 68 ++++++++++++++++++++++++++++-
 1 file changed, 66 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 1f4a1efa0074..67297aea5d94 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -644,6 +644,11 @@ static bool is_larx_stcx_instr(int type)
 	return type == LARX || type == STCX;
 }
 
+static bool is_octword_vsx_instr(int type, int size)
+{
+	return ((type == LOAD_VSX || type == STORE_VSX) && size == 32);
+}
+
 /*
  * We've failed in reliably handling the hw-breakpoint. Unregister
  * it and throw a warning message to let the user know about it.
@@ -694,6 +699,58 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event **bp,
 	return true;
 }
 
+static void handle_p10dd1_spurious_exception(struct arch_hw_breakpoint **info,
+					     int *hit, unsigned long ea)
+{
+	int i;
+	unsigned long hw_end_addr;
+
+	/*
+	 * Handle spurious exception only when any bp_per_reg is set.
+	 * Otherwise this might be created by xmon and not actually a
+	 * spurious exception.
+	 */
+	for (i = 0; i < nr_wp_slots(); i++) {
+		if (!info[i])
+			continue;
+
+		hw_end_addr = ALIGN(info[i]->address + info[i]->len, HW_BREAKPOINT_SIZE);
+
+		/*
+		 * Ending address of DAWR range is less than starting
+		 * address of op.
+		 */
+		if ((hw_end_addr - 1) >= ea)
+			continue;
+
+		/*
+		 * Those addresses need to be in the same or in two
+		 * consecutive 512B blocks;
+		 */
+		if (((hw_end_addr - 1) >> 10) != (ea >> 10))
+			continue;
+
+		/*
+		 * 'op address + 64B' generates an address that has a
+		 * carry into bit 52 (crosses 2K boundary).
+		 */
+		if ((ea & 0x800) == ((ea + 64) & 0x800))
+			continue;
+
+		break;
+	}
+
+	if (i == nr_wp_slots())
+		return;
+
+	for (i = 0; i < nr_wp_slots(); i++) {
+		if (info[i]) {
+			hit[i] = 1;
+			info[i]->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
+		}
+	}
+}
+
 int hw_breakpoint_handler(struct die_args *args)
 {
 	bool err = false;
@@ -752,8 +809,15 @@ int hw_breakpoint_handler(struct die_args *args)
 		goto reset;
 
 	if (!nr_hit) {
-		rc = NOTIFY_DONE;
-		goto out;
+		/* Workaround for Power10 DD1 */
+		if (mfspr(SPRN_PVR) == 0x800100 &&
+		    !IS_ENABLED(CONFIG_PPC_8xx) &&
+		    is_octword_vsx_instr(type, size)) {
+			handle_p10dd1_spurious_exception(info, hit, ea);
+		} else {
+			rc = NOTIFY_DONE;
+			goto out;
+		}
 	}
 
 	/*
-- 
2.25.1


^ permalink raw reply related

* [PATCH] powerpc/64s: Remove MSR[ISF] bit
From: Nicholas Piggin @ 2020-11-06  4:53 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

No supported processor implements this mode. Setting the bit in
MSR values can be a bit confusing (and would prevent the bit from
ever being reused). Remove it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/reg.h | 5 +----
 arch/powerpc/kernel/entry_64.S | 2 +-
 arch/powerpc/kernel/head_64.S  | 3 +--
 arch/powerpc/kvm/book3s_pr.c   | 2 +-
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index f877a576b338..8885fbf4285b 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -29,7 +29,6 @@
 #include <asm/reg_8xx.h>
 
 #define MSR_SF_LG	63              /* Enable 64 bit mode */
-#define MSR_ISF_LG	61              /* Interrupt 64b mode valid on 630 */
 #define MSR_HV_LG 	60              /* Hypervisor state */
 #define MSR_TS_T_LG	34		/* Trans Mem state: Transactional */
 #define MSR_TS_S_LG	33		/* Trans Mem state: Suspended */
@@ -69,13 +68,11 @@
 
 #ifdef CONFIG_PPC64
 #define MSR_SF		__MASK(MSR_SF_LG)	/* Enable 64 bit mode */
-#define MSR_ISF		__MASK(MSR_ISF_LG)	/* Interrupt 64b mode valid on 630 */
 #define MSR_HV 		__MASK(MSR_HV_LG)	/* Hypervisor state */
 #define MSR_S		__MASK(MSR_S_LG)	/* Secure state */
 #else
 /* so tests for these bits fail on 32-bit */
 #define MSR_SF		0
-#define MSR_ISF		0
 #define MSR_HV		0
 #define MSR_S		0
 #endif
@@ -134,7 +131,7 @@
 #define MSR_64BIT	MSR_SF
 
 /* Server variant */
-#define __MSR		(MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_ISF |MSR_HV)
+#define __MSR		(MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_HV)
 #ifdef __BIG_ENDIAN__
 #define MSR_		__MSR
 #define MSR_IDLE	(MSR_ME | MSR_SF | MSR_HV)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 2f3846192ec7..479fb58844fa 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -967,7 +967,7 @@ _GLOBAL(enter_prom)
 	mtsrr1	r11
 	rfi
 #else /* CONFIG_PPC_BOOK3E */
-	LOAD_REG_IMMEDIATE(r12, MSR_SF | MSR_ISF | MSR_LE)
+	LOAD_REG_IMMEDIATE(r12, MSR_SF | MSR_LE)
 	andc	r11,r11,r12
 	mtsrr1	r11
 	RFI_TO_KERNEL
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 1510b2a56669..4e2591cb4bd1 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -865,8 +865,7 @@ enable_64b_mode:
 	oris	r11,r11,0x8000		/* CM bit set, we'll set ICM later */
 	mtmsr	r11
 #else /* CONFIG_PPC_BOOK3E */
-	li	r12,(MSR_64BIT | MSR_ISF)@highest
-	sldi	r12,r12,48
+	LOAD_REG_IMMEDIATE(r12, MSR_64BIT)
 	or	r11,r11,r12
 	mtmsrd	r11
 	isync
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index b1fefa63e125..913944dc3620 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -239,7 +239,7 @@ static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
 	smsr |= (guest_msr & vcpu->arch.guest_owned_ext);
 	/* 64-bit Process MSR values */
 #ifdef CONFIG_PPC_BOOK3S_64
-	smsr |= MSR_ISF | MSR_HV;
+	smsr |= MSR_HV;
 #endif
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	/*
-- 
2.23.0


^ permalink raw reply related

* Re: [PATCH] KVM: PPC: Book3S HV: XIVE: Fix possible oops when accessing ESB page
From: Michael Ellerman @ 2020-11-06  3:19 UTC (permalink / raw)
  To: Cédric Le Goater, Paul Mackerras
  Cc: kvm, Gustavo Romero, Greg Kurz, kvm-ppc, Cédric Le Goater,
	linuxppc-dev, David Gibson
In-Reply-To: <20201105134713.656160-1-clg@kaod.org>

Cédric Le Goater <clg@kaod.org> writes:
> When accessing the ESB page of a source interrupt, the fault handler
> will retrieve the page address from the XIVE interrupt 'xive_irq_data'
> structure. If the associated KVM XIVE interrupt is not valid, that is
> not allocated at the HW level for some reason, the fault handler will
> dereference a NULL pointer leading to the oops below :
>
>     WARNING: CPU: 40 PID: 59101 at arch/powerpc/kvm/book3s_xive_native.c:259 xive_native_esb_fault+0xe4/0x240 [kvm]
>     CPU: 40 PID: 59101 Comm: qemu-system-ppc Kdump: loaded Tainted: G        W        --------- -  - 4.18.0-240.el8.ppc64le #1
>     NIP:  c00800000e949fac LR: c00000000044b164 CTR: c00800000e949ec8
>     REGS: c000001f69617840 TRAP: 0700   Tainted: G        W        --------- -  -  (4.18.0-240.el8.ppc64le)
>     MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 44044282  XER: 00000000
>     CFAR: c00000000044b160 IRQMASK: 0
>     GPR00: c00000000044b164 c000001f69617ac0 c00800000e96e000 c000001f69617c10
>     GPR04: 05faa2b21e000080 0000000000000000 0000000000000005 ffffffffffffffff
>     GPR08: 0000000000000000 0000000000000001 0000000000000000 0000000000000001
>     GPR12: c00800000e949ec8 c000001ffffd3400 0000000000000000 0000000000000000
>     GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>     GPR20: 0000000000000000 0000000000000000 c000001f5c065160 c000000001c76f90
>     GPR24: c000001f06f20000 c000001f5c065100 0000000000000008 c000001f0eb98c78
>     GPR28: c000001dcab40000 c000001dcab403d8 c000001f69617c10 0000000000000011
>     NIP [c00800000e949fac] xive_native_esb_fault+0xe4/0x240 [kvm]
>     LR [c00000000044b164] __do_fault+0x64/0x220
>     Call Trace:
>     [c000001f69617ac0] [0000000137a5dc20] 0x137a5dc20 (unreliable)
>     [c000001f69617b50] [c00000000044b164] __do_fault+0x64/0x220
>     [c000001f69617b90] [c000000000453838] do_fault+0x218/0x930
>     [c000001f69617bf0] [c000000000456f50] __handle_mm_fault+0x350/0xdf0
>     [c000001f69617cd0] [c000000000457b1c] handle_mm_fault+0x12c/0x310
>     [c000001f69617d10] [c00000000007ef44] __do_page_fault+0x264/0xbb0
>     [c000001f69617df0] [c00000000007f8c8] do_page_fault+0x38/0xd0
>     [c000001f69617e30] [c00000000000a714] handle_page_fault+0x18/0x38
>     Instruction dump:
>     40c2fff0 7c2004ac 2fa90000 409e0118 73e90001 41820080 e8bd0008 7c2004ac
>     7ca90074 39400000 915c0000 7929d182 <0b090000> 2fa50000 419e0080 e89e0018
>     ---[ end trace 66c6ff034c53f64f ]---
>     xive-kvm: xive_native_esb_fault: accessing invalid ESB page for source 8 !
>
> Fix that by checking the validity of the KVM XIVE interrupt structure.
>
> Reported-by: Greg Kurz <groug@kaod.org>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Fixes ?

cheers

^ permalink raw reply

* Re: [PATCH kernel v2] irq: Add reference counting to IRQ mappings
From: Alexey Kardashevskiy @ 2020-11-06  3:06 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Rob Herring, Marc Zyngier, linux-kernel, Qian Cai,
	Cédric Le Goater, Frederic Barrat, Thomas Gleixner,
	Michal Suchánek, David Gibson
In-Reply-To: <20201029110141.94304-1-aik@ozlabs.ru>

Hi,

This one seems to be broken in the domain associating part so please 
ignore it, I'll post v3 soon. Thanks,


On 29/10/2020 22:01, Alexey Kardashevskiy wrote:
> PCI devices share 4 legacy INTx interrupts from the same PCI host bridge.
> Device drivers map/unmap hardware interrupts via irq_create_mapping()/
> irq_dispose_mapping(). The problem with that these interrupts are
> shared and when performing hot unplug, we need to unmap the interrupt
> only when the last device is released.
> 
> This reuses already existing irq_desc::kobj for this purpose.
> The refcounter is naturally 1 when the descriptor is allocated already;
> this adds kobject_get() in places where already existing mapped virq
> is returned.
> 
> This reorganizes irq_dispose_mapping() to release the kobj and let
> the release callback do the cleanup.
> 
> Quick grep shows no sign of irq reference counting in drivers. Drivers
> typically request mapping when probing and dispose it when removing;
> platforms tend to dispose only if setup failed and the rest seems
> calling one dispose per one mapping. Except (at least) PPC/pseries
> which needs https://lkml.org/lkml/2020/10/27/259
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> What is the easiest way to get irq-hierarchical hardware?
> I have a bunch of powerpc boxes (no good) but also a raspberry pi,
> a bunch of 32/64bit orange pi's, an "armada" arm box,
> thinkpads - is any of this good for the task?
> 
> 
> ---
> Changes:
> v2:
> * added more get/put, including irq_domain_associate/irq_domain_disassociate
> ---
>   kernel/irq/irqdesc.c   | 36 ++++++++++++++++++++-----------
>   kernel/irq/irqdomain.c | 49 +++++++++++++++++++++++++++++-------------
>   2 files changed, 58 insertions(+), 27 deletions(-)
> 
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 1a7723604399..bc8f62157ffa 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -419,20 +419,40 @@ static struct irq_desc *alloc_desc(int irq, int node, unsigned int flags,
>   	return NULL;
>   }
>   
> +static void delayed_free_desc(struct rcu_head *rhp);
>   static void irq_kobj_release(struct kobject *kobj)
>   {
>   	struct irq_desc *desc = container_of(kobj, struct irq_desc, kobj);
> +#ifdef CONFIG_IRQ_DOMAIN
> +	struct irq_domain *domain;
> +	unsigned int virq = desc->irq_data.irq;
>   
> -	free_masks(desc);
> -	free_percpu(desc->kstat_irqs);
> -	kfree(desc);
> +	domain = desc->irq_data.domain;
> +	if (domain) {
> +		if (irq_domain_is_hierarchy(domain)) {
> +			irq_domain_free_irqs(virq, 1);
> +		} else {
> +			irq_domain_disassociate(domain, virq);
> +			irq_free_desc(virq);
> +		}
> +	}
> +#endif
> +	/*
> +	 * We free the descriptor, masks and stat fields via RCU. That
> +	 * allows demultiplex interrupts to do rcu based management of
> +	 * the child interrupts.
> +	 * This also allows us to use rcu in kstat_irqs_usr().
> +	 */
> +	call_rcu(&desc->rcu, delayed_free_desc);
>   }
>   
>   static void delayed_free_desc(struct rcu_head *rhp)
>   {
>   	struct irq_desc *desc = container_of(rhp, struct irq_desc, rcu);
>   
> -	kobject_put(&desc->kobj);
> +	free_masks(desc);
> +	free_percpu(desc->kstat_irqs);
> +	kfree(desc);
>   }
>   
>   static void free_desc(unsigned int irq)
> @@ -453,14 +473,6 @@ static void free_desc(unsigned int irq)
>   	 */
>   	irq_sysfs_del(desc);
>   	delete_irq_desc(irq);
> -
> -	/*
> -	 * We free the descriptor, masks and stat fields via RCU. That
> -	 * allows demultiplex interrupts to do rcu based management of
> -	 * the child interrupts.
> -	 * This also allows us to use rcu in kstat_irqs_usr().
> -	 */
> -	call_rcu(&desc->rcu, delayed_free_desc);
>   }
>   
>   static int alloc_descs(unsigned int start, unsigned int cnt, int node,
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index cf8b374b892d..5fb060e077e3 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -487,6 +487,7 @@ static void irq_domain_set_mapping(struct irq_domain *domain,
>   
>   void irq_domain_disassociate(struct irq_domain *domain, unsigned int irq)
>   {
> +	struct irq_desc *desc = irq_to_desc(irq);
>   	struct irq_data *irq_data = irq_get_irq_data(irq);
>   	irq_hw_number_t hwirq;
>   
> @@ -514,11 +515,14 @@ void irq_domain_disassociate(struct irq_domain *domain, unsigned int irq)
>   
>   	/* Clear reverse map for this hwirq */
>   	irq_domain_clear_mapping(domain, hwirq);
> +
> +	kobject_put(&desc->kobj);
>   }
>   
>   int irq_domain_associate(struct irq_domain *domain, unsigned int virq,
>   			 irq_hw_number_t hwirq)
>   {
> +	struct irq_desc *desc = irq_to_desc(virq);
>   	struct irq_data *irq_data = irq_get_irq_data(virq);
>   	int ret;
>   
> @@ -530,6 +534,8 @@ int irq_domain_associate(struct irq_domain *domain, unsigned int virq,
>   	if (WARN(irq_data->domain, "error: virq%i is already associated", virq))
>   		return -EINVAL;
>   
> +	kobject_get(&desc->kobj);
> +
>   	mutex_lock(&irq_domain_mutex);
>   	irq_data->hwirq = hwirq;
>   	irq_data->domain = domain;
> @@ -548,6 +554,7 @@ int irq_domain_associate(struct irq_domain *domain, unsigned int virq,
>   			irq_data->domain = NULL;
>   			irq_data->hwirq = 0;
>   			mutex_unlock(&irq_domain_mutex);
> +			kobject_put(&desc->kobj);
>   			return ret;
>   		}
>   
> @@ -638,6 +645,7 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
>   {
>   	struct device_node *of_node;
>   	int virq;
> +	struct irq_desc *desc;
>   
>   	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
>   
> @@ -655,7 +663,9 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
>   	/* Check if mapping already exists */
>   	virq = irq_find_mapping(domain, hwirq);
>   	if (virq) {
> +		desc = irq_to_desc(virq);
>   		pr_debug("-> existing mapping on virq %d\n", virq);
> +		kobject_get(&desc->kobj);
>   		return virq;
>   	}
>   
> @@ -674,6 +684,7 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
>   	pr_debug("irq %lu on domain %s mapped to virtual irq %u\n",
>   		hwirq, of_node_full_name(of_node), virq);
>   
> +	desc = irq_to_desc(virq);
>   	return virq;
>   }
>   EXPORT_SYMBOL_GPL(irq_create_mapping);
> @@ -751,6 +762,7 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
>   	irq_hw_number_t hwirq;
>   	unsigned int type = IRQ_TYPE_NONE;
>   	int virq;
> +	struct irq_desc *desc;
>   
>   	if (fwspec->fwnode) {
>   		domain = irq_find_matching_fwspec(fwspec, DOMAIN_BUS_WIRED);
> @@ -787,8 +799,15 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
>   		 * current trigger type then we are done so return the
>   		 * interrupt number.
>   		 */
> -		if (type == IRQ_TYPE_NONE || type == irq_get_trigger_type(virq))
> +		if (type == IRQ_TYPE_NONE || type == irq_get_trigger_type(virq)) {
> +			desc = irq_to_desc(virq);
> +			kobject_get(&desc->kobj);
> +
> +			pr_err("___K___ (%u) %s %u: virq %d counter %d\n",
> +				smp_processor_id(),
> +			       __func__, __LINE__, virq, kref_read(&desc->kobj.kref));
>   			return virq;
> +		}
>   
>   		/*
>   		 * If the trigger type has not been set yet, then set
> @@ -800,6 +819,8 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
>   				return 0;
>   
>   			irqd_set_trigger_type(irq_data, type);
> +			desc = irq_to_desc(virq);
> +			kobject_get(&desc->kobj);
>   			return virq;
>   		}
>   
> @@ -852,22 +873,12 @@ EXPORT_SYMBOL_GPL(irq_create_of_mapping);
>    */
>   void irq_dispose_mapping(unsigned int virq)
>   {
> -	struct irq_data *irq_data = irq_get_irq_data(virq);
> -	struct irq_domain *domain;
> +	struct irq_desc *desc = irq_to_desc(virq);
>   
> -	if (!virq || !irq_data)
> +	if (!virq || !desc)
>   		return;
>   
> -	domain = irq_data->domain;
> -	if (WARN_ON(domain == NULL))
> -		return;
> -
> -	if (irq_domain_is_hierarchy(domain)) {
> -		irq_domain_free_irqs(virq, 1);
> -	} else {
> -		irq_domain_disassociate(domain, virq);
> -		irq_free_desc(virq);
> -	}
> +	kobject_put(&desc->kobj);
>   }
>   EXPORT_SYMBOL_GPL(irq_dispose_mapping);
>   
> @@ -1413,6 +1424,7 @@ int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
>   			    bool realloc, const struct irq_affinity_desc *affinity)
>   {
>   	int i, ret, virq;
> +	bool get_ref = false;
>   
>   	if (domain == NULL) {
>   		domain = irq_default_domain;
> @@ -1422,6 +1434,7 @@ int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
>   
>   	if (realloc && irq_base >= 0) {
>   		virq = irq_base;
> +		get_ref = true;
>   	} else {
>   		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node,
>   					      affinity);
> @@ -1453,8 +1466,14 @@ int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
>   		}
>   	}
>   	
> -	for (i = 0; i < nr_irqs; i++)
> +	for (i = 0; i < nr_irqs; i++) {
>   		irq_domain_insert_irq(virq + i);
> +		if (get_ref) {
> +			struct irq_desc *desc = irq_to_desc(virq + i);
> +
> +			kobject_get(&desc->kobj);
> +		}
> +	}
>   	mutex_unlock(&irq_domain_mutex);
>   
>   	return virq;
> 

-- 
Alexey

^ permalink raw reply

* Re: [PATCH v3 2/2] ASoC: fsl_aud2htx: Add aud2htx module driver
From: Shengjiu Wang @ 2020-11-06  2:51 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	alsa-devel, Timur Tabi, Xiubo Li, Fabio Estevam, Shengjiu Wang,
	Liam Girdwood, Takashi Iwai, Rob Herring, Mark Brown,
	linuxppc-dev, linux-kernel
In-Reply-To: <20201105013539.GA16459@Asurada-Nvidia>

On Thu, Nov 5, 2020 at 9:48 AM Nicolin Chen <nicoleotsuka@gmail.com> wrote:
>
> On Mon, Nov 02, 2020 at 09:52:27AM +0800, Shengjiu Wang wrote:
> > The AUD2HTX is a digital module that provides a bridge between
> > the Audio Subsystem and the HDMI RTX Subsystem. This module
> > includes intermediate storage to queue SDMA transactions prior
> > to being synchronized and passed to the HDMI RTX Subsystem over
> > the Audio Link.
> >
> > The AUD2HTX contains a DMA request routed to the SDMA module.
> > This DMA request is controlled based on the watermark level in
> > the 32-entry sample buffer.
> >
> > Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
>
> Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
>
> Despite some small comments inline.
>
> > +static int fsl_aud2htx_dai_probe(struct snd_soc_dai *cpu_dai)
> > +{
> > +     struct fsl_aud2htx *aud2htx = dev_get_drvdata(cpu_dai->dev);
> > +
> > +     /* DMA request when number of entries < WTMK_LOW */
> > +     regmap_update_bits(aud2htx->regmap, AUD2HTX_CTRL_EXT,
> > +                        AUD2HTX_CTRE_DT_MASK, 0);
> > +
> > +     /* Disable interrupts*/
> > +     regmap_update_bits(aud2htx->regmap, AUD2HTX_IRQ_MASK,
> > +                        AUD2HTX_WM_HIGH_IRQ_MASK |
> > +                        AUD2HTX_WM_LOW_IRQ_MASK |
> > +                        AUD2HTX_OVF_MASK,
> > +                        AUD2HTX_WM_HIGH_IRQ_MASK |
> > +                        AUD2HTX_WM_LOW_IRQ_MASK |
> > +                        AUD2HTX_OVF_MASK);
> > +
> > +     /* Configure watermark */
> > +     regmap_update_bits(aud2htx->regmap, AUD2HTX_CTRL_EXT,
> > +                        AUD2HTX_CTRE_WL_MASK,
> > +                        AUD2HTX_WTMK_LOW << AUD2HTX_CTRE_WL_SHIFT);
> > +     regmap_update_bits(aud2htx->regmap, AUD2HTX_CTRL_EXT,
> > +                        AUD2HTX_CTRE_WH_MASK,
> > +                        AUD2HTX_WTMK_HIGH << AUD2HTX_CTRE_WH_SHIFT);
>
> If there isn't a hard requirement from hardware, feels better to
> combine all the writes to AUD2HTX_CTRL_EXT into one single MMIO.

ok, will update it.

>
> > +static irqreturn_t fsl_aud2htx_isr(int irq, void *dev_id)
> > +{
> > +     return IRQ_HANDLED;
>
> Empty isr? Perhaps can drop the request_irq() at all?

I'd like to keep this for future enhancement, what do you think?

>
> > +static int fsl_aud2htx_probe(struct platform_device *pdev)
> > +{
> > +     struct fsl_aud2htx *aud2htx;
> > +     struct resource *res;
> > +     void __iomem *regs;
> > +     int ret, irq;
> > +
> > +     aud2htx = devm_kzalloc(&pdev->dev, sizeof(*aud2htx), GFP_KERNEL);
> > +     if (!aud2htx)
> > +             return -ENOMEM;
> > +
> > +     aud2htx->pdev = pdev;
> > +
> > +     res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +     regs = devm_ioremap_resource(&pdev->dev, res);
> > +     if (IS_ERR(regs)) {
> > +             dev_err(&pdev->dev, "failed ioremap\n");
> > +             return PTR_ERR(regs);
> > +     }
> > +
> > +     aud2htx->regmap = devm_regmap_init_mmio(&pdev->dev, regs,
> > +                                             &fsl_aud2htx_regmap_config);
> > +     if (IS_ERR(aud2htx->regmap)) {
> > +             dev_err(&pdev->dev, "failed to init regmap");
> > +             return PTR_ERR(aud2htx->regmap);
> > +     }
> > +
> > +     irq = platform_get_irq(pdev, 0);
> > +     if (irq < 0) {
> > +             dev_err(&pdev->dev, "no irq for node %s\n",
> > +                     dev_name(&pdev->dev));
>
> dev_err() already prints dev_name, so not necessary to print again.

ok, will update it

best regards
wang shengjiu

^ permalink raw reply

* [PATCH 11/11 v3] ftrace: Add recording of functions that caused recursion
From: Steven Rostedt @ 2020-11-06  2:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: Anton Vorontsov, linux-doc, Peter Zijlstra,
	Sebastian Andrzej Siewior, Kamalesh Babulal, James E.J. Bottomley,
	Guo Ren, H. Peter Anvin, live-patching, Miroslav Benes,
	Ingo Molnar, linux-s390, Joe Lawrence, Jonathan Corbet,
	Mauro Carvalho Chehab, Helge Deller, x86, linux-csky,
	Christian Borntraeger, Petr Mladek, Kees Cook, Vasily Gorbik,
	Heiko Carstens, Jiri Kosina, Borislav Petkov, Josh Poimboeuf,
	Thomas Gleixner, Tony Luck, linux-parisc, Masami Hiramatsu,
	Colin Cross, Paul Mackerras, Andrew Morton, linuxppc-dev
In-Reply-To: <20201106023235.367190737@goodmis.org>

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
"recursed_functions" all the functions that caused recursion while a
callback to the function tracer was running.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Guo Ren <guoren@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-csky@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: live-patching@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---

Changes since v2:

 - Use trace_recursion flags in current for protecting recursion of recursion recording
 - Make the recursion logic a little cleaner
 - Export GPL the recursion recording

 Documentation/trace/ftrace-uses.rst   |   6 +-
 arch/csky/kernel/probes/ftrace.c      |   2 +-
 arch/parisc/kernel/ftrace.c           |   2 +-
 arch/powerpc/kernel/kprobes-ftrace.c  |   2 +-
 arch/s390/kernel/ftrace.c             |   2 +-
 arch/x86/kernel/kprobes/ftrace.c      |   2 +-
 fs/pstore/ftrace.c                    |   2 +-
 include/linux/trace_recursion.h       |  29 +++-
 kernel/livepatch/patch.c              |   2 +-
 kernel/trace/Kconfig                  |  25 +++
 kernel/trace/Makefile                 |   1 +
 kernel/trace/ftrace.c                 |   4 +-
 kernel/trace/trace_event_perf.c       |   2 +-
 kernel/trace/trace_functions.c        |   2 +-
 kernel/trace/trace_output.c           |   6 +-
 kernel/trace/trace_output.h           |   1 +
 kernel/trace/trace_recursion_record.c | 236 ++++++++++++++++++++++++++
 17 files changed, 306 insertions(+), 20 deletions(-)
 create mode 100644 kernel/trace/trace_recursion_record.c

diff --git a/Documentation/trace/ftrace-uses.rst b/Documentation/trace/ftrace-uses.rst
index 86cd14b8e126..5981d5691745 100644
--- a/Documentation/trace/ftrace-uses.rst
+++ b/Documentation/trace/ftrace-uses.rst
@@ -118,7 +118,7 @@ can help in this regard. If you start your code with:
 
 	int bit;
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
 
@@ -130,7 +130,9 @@ The code in between will be safe to use, even if it ends up calling a
 function that the callback is tracing. Note, on success,
 ftrace_test_recursion_trylock() will disable preemption, and the
 ftrace_test_recursion_unlock() will enable it again (if it was previously
-enabled).
+enabled). The instruction pointer (ip) and its parent (parent_ip) is passed to
+ftrace_test_recursion_trylock() to record where the recursion happened
+(if CONFIG_FTRACE_RECORD_RECURSION is set).
 
 Alternatively, if the FTRACE_OPS_FL_RECURSION flag is set on the ftrace_ops
 (as explained below), then a helper trampoline will be used to test
diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 5eb2604fdf71..f30b179924ef 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -18,7 +18,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe *p;
 	struct kprobe_ctlblk *kcb;
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
 
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 13d85042810a..1c5d3732bda2 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -210,7 +210,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe *p;
 	int bit;
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
 
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 5df8d50c65ae..fdfee39938ea 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -20,7 +20,7 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 	struct kprobe_ctlblk *kcb;
 	int bit;
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(nip, parent_nip);
 	if (bit < 0)
 		return;
 
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index 8f31c726537a..657c1ab45408 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -204,7 +204,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe *p;
 	int bit;
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
 
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index a40a6cdfcca3..954d930a7127 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -20,7 +20,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe_ctlblk *kcb;
 	int bit;
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
 
diff --git a/fs/pstore/ftrace.c b/fs/pstore/ftrace.c
index 816210fc5d3a..adb0935eb062 100644
--- a/fs/pstore/ftrace.c
+++ b/fs/pstore/ftrace.c
@@ -41,7 +41,7 @@ static void notrace pstore_ftrace_call(unsigned long ip,
 	if (unlikely(oops_in_progress))
 		return;
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
 
diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h
index ac3d73484cb2..228cc56ed66e 100644
--- a/include/linux/trace_recursion.h
+++ b/include/linux/trace_recursion.h
@@ -91,6 +91,9 @@ enum {
 	 * not be correct. Allow for a single recursion to cover this case.
 	 */
 	TRACE_TRANSITION_BIT,
+
+	/* Used to prevent recursion recording from recursing. */
+	TRACE_RECORD_RECURSION_BIT,
 };
 
 #define trace_recursion_set(bit)	do { (current)->trace_recursion |= (1<<(bit)); } while (0)
@@ -142,7 +145,22 @@ static __always_inline int trace_get_context_bit(void)
 			pc & HARDIRQ_MASK ? TRACE_CTX_IRQ : TRACE_CTX_SOFTIRQ;
 }
 
-static __always_inline int trace_test_and_set_recursion(int start, int max)
+#ifdef CONFIG_FTRACE_RECORD_RECURSION
+extern void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip);
+# define do_ftrace_record_recursion(ip, pip)				\
+	do {								\
+		if (!trace_recursion_test(TRACE_RECORD_RECURSION_BIT)) { \
+			trace_recursion_set(TRACE_RECORD_RECURSION_BIT); \
+			ftrace_record_recursion(ip, pip);		\
+			trace_recursion_clear(TRACE_RECORD_RECURSION_BIT); \
+		}							\
+	} while (0)
+#else
+# define do_ftrace_record_recursion(ip, pip)	do { } while (0)
+#endif
+
+static __always_inline int trace_test_and_set_recursion(unsigned long ip, unsigned long pip,
+							int start, int max)
 {
 	unsigned int val = current->trace_recursion;
 	int bit;
@@ -158,8 +176,10 @@ static __always_inline int trace_test_and_set_recursion(int start, int max)
 		 * a switch between contexts. Allow for a single recursion.
 		 */
 		bit = TRACE_TRANSITION_BIT;
-		if (trace_recursion_test(bit))
+		if (trace_recursion_test(bit)) {
+			do_ftrace_record_recursion(ip, pip);
 			return -1;
+		}
 		trace_recursion_set(bit);
 		barrier();
 		return bit + 1;
@@ -199,9 +219,10 @@ static __always_inline void trace_clear_recursion(int bit)
  * Returns: -1 if a recursion happened.
  *           >= 0 if no recursion
  */
-static __always_inline int ftrace_test_recursion_trylock(void)
+static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
+							 unsigned long parent_ip)
 {
-	return trace_test_and_set_recursion(TRACE_FTRACE_START, TRACE_FTRACE_MAX);
+	return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
 }
 
 /**
diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
index 15480bf3ce88..875c5dbbdd33 100644
--- a/kernel/livepatch/patch.c
+++ b/kernel/livepatch/patch.c
@@ -49,7 +49,7 @@ static void notrace klp_ftrace_handler(unsigned long ip,
 
 	ops = container_of(fops, struct klp_ops, fops);
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (WARN_ON_ONCE(bit < 0))
 		return;
 	/*
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index a4020c0b4508..9b11c096d139 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -727,6 +727,31 @@ config TRACE_EVAL_MAP_FILE
 
 	If unsure, say N.
 
+config FTRACE_RECORD_RECURSION
+	bool "Record functions that recurse in function tracing"
+	depends on FUNCTION_TRACER
+	help
+	  All callbacks that attach to the function tracing have some sort
+	  of protection against recursion. Even though the protection exists,
+	  it adds overhead. This option will create a file in the tracefs
+	  file system called "recursed_functions" that will list the functions
+	  that triggered a recursion.
+
+	  This will add more overhead to cases that have recursion.
+
+	  If unsure, say N
+
+config FTRACE_RECORD_RECURSION_SIZE
+	int "Max number of recursed functions to record"
+	default	128
+	depends on FTRACE_RECORD_RECURSION
+	help
+	  This defines the limit of number of functions that can be
+	  listed in the "recursed_functions" file, that lists all
+	  the functions that caused a recursion to happen.
+	  This file can be reset, but the limit can not change in
+	  size at runtime.
+
 config GCOV_PROFILE_FTRACE
 	bool "Enable GCOV profiling on ftrace subsystem"
 	depends on GCOV_KERNEL
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index e153be351548..7e44cea89fdc 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -92,6 +92,7 @@ obj-$(CONFIG_DYNAMIC_EVENTS) += trace_dynevent.o
 obj-$(CONFIG_PROBE_EVENTS) += trace_probe.o
 obj-$(CONFIG_UPROBE_EVENTS) += trace_uprobe.o
 obj-$(CONFIG_BOOTTIME_TRACING) += trace_boot.o
+obj-$(CONFIG_FTRACE_RECORD_RECURSION) += trace_recursion_record.o
 
 obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 39f2bba89b76..03aad2b5cd5e 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6918,7 +6918,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
 	struct ftrace_ops *op;
 	int bit;
 
-	bit = trace_test_and_set_recursion(TRACE_LIST_START, TRACE_LIST_MAX);
+	bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_LIST_START, TRACE_LIST_MAX);
 	if (bit < 0)
 		return;
 
@@ -6993,7 +6993,7 @@ static void ftrace_ops_assist_func(unsigned long ip, unsigned long parent_ip,
 {
 	int bit;
 
-	bit = trace_test_and_set_recursion(TRACE_LIST_START, TRACE_LIST_MAX);
+	bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_LIST_START, TRACE_LIST_MAX);
 	if (bit < 0)
 		return;
 
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index a2b9fddb8148..1b202e28dfaa 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -447,7 +447,7 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
 	if ((unsigned long)ops->private != smp_processor_id())
 		return;
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
 
diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index 89c414ce1388..646eda6c44a5 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -141,7 +141,7 @@ function_trace_call(unsigned long ip, unsigned long parent_ip,
 	if (unlikely(!tr->function_enabled))
 		return;
 
-	bit = ftrace_test_recursion_trylock();
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
 
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 000e9dc224c6..92b1575ae0ca 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -353,8 +353,8 @@ static inline const char *kretprobed(const char *name)
 }
 #endif /* CONFIG_KRETPROBES */
 
-static void
-seq_print_sym(struct trace_seq *s, unsigned long address, bool offset)
+void
+trace_seq_print_sym(struct trace_seq *s, unsigned long address, bool offset)
 {
 #ifdef CONFIG_KALLSYMS
 	char str[KSYM_SYMBOL_LEN];
@@ -420,7 +420,7 @@ seq_print_ip_sym(struct trace_seq *s, unsigned long ip, unsigned long sym_flags)
 		goto out;
 	}
 
-	seq_print_sym(s, ip, sym_flags & TRACE_ITER_SYM_OFFSET);
+	trace_seq_print_sym(s, ip, sym_flags & TRACE_ITER_SYM_OFFSET);
 
 	if (sym_flags & TRACE_ITER_SYM_ADDR)
 		trace_seq_printf(s, " <" IP_FMT ">", ip);
diff --git a/kernel/trace/trace_output.h b/kernel/trace/trace_output.h
index 2f742b74e7e6..4c954636caf0 100644
--- a/kernel/trace/trace_output.h
+++ b/kernel/trace/trace_output.h
@@ -16,6 +16,7 @@ extern int
 seq_print_ip_sym(struct trace_seq *s, unsigned long ip,
 		unsigned long sym_flags);
 
+extern void trace_seq_print_sym(struct trace_seq *s, unsigned long address, bool offset);
 extern int trace_print_context(struct trace_iterator *iter);
 extern int trace_print_lat_context(struct trace_iterator *iter);
 
diff --git a/kernel/trace/trace_recursion_record.c b/kernel/trace/trace_recursion_record.c
new file mode 100644
index 000000000000..b2edac1fe156
--- /dev/null
+++ b/kernel/trace/trace_recursion_record.c
@@ -0,0 +1,236 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/seq_file.h>
+#include <linux/kallsyms.h>
+#include <linux/module.h>
+#include <linux/ftrace.h>
+#include <linux/fs.h>
+
+#include "trace_output.h"
+
+struct recursed_functions {
+	unsigned long		ip;
+	unsigned long		parent_ip;
+};
+
+static struct recursed_functions recursed_functions[CONFIG_FTRACE_RECORD_RECURSION_SIZE];
+static atomic_t nr_records;
+
+/*
+ * Cache the last found function. Yes, updates to this is racey, but
+ * so is memory cache ;-)
+ */
+static unsigned long cached_function;
+
+void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
+{
+	int index = 0;
+	int i;
+	unsigned long old;
+
+ again:
+	/* First check the last one recorded */
+	if (ip == cached_function)
+		return;
+
+	i = atomic_read(&nr_records);
+	/* nr_records is -1 when clearing records */
+	smp_mb__after_atomic();
+	if (i < 0)
+		return;
+
+	/*
+	 * If there's two writers and this writer comes in second,
+	 * the cmpxchg() below to update the ip will fail. Then this
+	 * writer will try again. It is possible that index will now
+	 * be greater than nr_records. This is because the writer
+	 * that succeeded has not updated the nr_records yet.
+	 * This writer could keep trying again until the other writer
+	 * updates nr_records. But if the other writer takes an
+	 * interrupt, and that interrupt locks up that CPU, we do
+	 * not want this CPU to lock up due to the recursion protection,
+	 * and have a bug report showing this CPU as the cause of
+	 * locking up the computer. To not lose this record, this
+	 * writer will simply use the next position to update the
+	 * recursed_functions, and it will update the nr_records
+	 * accordingly.
+	 */
+	if (index < i)
+		index = i;
+	if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
+		return;
+
+	for (i = index - 1; i >= 0; i--) {
+		if (recursed_functions[i].ip == ip) {
+			cached_function = ip;
+			return;
+		}
+	}
+
+	cached_function = ip;
+
+	/*
+	 * We only want to add a function if it hasn't been added before.
+	 * Add to the current location before incrementing the count.
+	 * If it fails to add, then increment the index (save in i)
+	 * and try again.
+	 */
+	old = cmpxchg(&recursed_functions[index].ip, 0, ip);
+	if (old != 0) {
+		/* Did something else already added this for us? */
+		if (old == ip)
+			return;
+		/* Try the next location (use i for the next index) */
+		index++;
+		goto again;
+	}
+
+	recursed_functions[index].parent_ip = parent_ip;
+
+	/*
+	 * It's still possible that we could race with the clearing
+	 *    CPU0                                    CPU1
+	 *    ----                                    ----
+	 *                                       ip = func
+	 *  nr_records = -1;
+	 *  recursed_functions[0] = 0;
+	 *                                       i = -1
+	 *                                       if (i < 0)
+	 *  nr_records = 0;
+	 *  (new recursion detected)
+	 *      recursed_functions[0] = func
+	 *                                            cmpxchg(recursed_functions[0],
+	 *                                                    func, 0)
+	 *
+	 * But the worse that could happen is that we get a zero in
+	 * the recursed_functions array, and it's likely that "func" will
+	 * be recorded again.
+	 */
+	i = atomic_read(&nr_records);
+	smp_mb__after_atomic();
+	if (i < 0)
+		cmpxchg(&recursed_functions[index].ip, ip, 0);
+	else if (i <= index)
+		atomic_cmpxchg(&nr_records, i, index + 1);
+}
+EXPORT_SYMBOL_GPL(ftrace_record_recursion);
+
+static DEFINE_MUTEX(recursed_function_lock);
+static struct trace_seq *tseq;
+
+static void *recursed_function_seq_start(struct seq_file *m, loff_t *pos)
+{
+	void *ret = NULL;
+	int index;
+
+	mutex_lock(&recursed_function_lock);
+	index = atomic_read(&nr_records);
+	if (*pos < index) {
+		ret = &recursed_functions[*pos];
+	}
+
+	tseq = kzalloc(sizeof(*tseq), GFP_KERNEL);
+	if (!tseq)
+		return ERR_PTR(-ENOMEM);
+
+	trace_seq_init(tseq);
+
+	return ret;
+}
+
+static void *recursed_function_seq_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	int index;
+	int p;
+
+	index = atomic_read(&nr_records);
+	p = ++(*pos);
+
+	return p < index ? &recursed_functions[p] : NULL;
+}
+
+static void recursed_function_seq_stop(struct seq_file *m, void *v)
+{
+	kfree(tseq);
+	mutex_unlock(&recursed_function_lock);
+}
+
+static int recursed_function_seq_show(struct seq_file *m, void *v)
+{
+	struct recursed_functions *record = v;
+	int ret = 0;
+
+	if (record) {
+		trace_seq_print_sym(tseq, record->parent_ip, true);
+		trace_seq_puts(tseq, ":\t");
+		trace_seq_print_sym(tseq, record->ip, true);
+		trace_seq_putc(tseq, '\n');
+		ret = trace_print_seq(m, tseq);
+	}
+
+	return ret;
+}
+
+static const struct seq_operations recursed_function_seq_ops = {
+	.start  = recursed_function_seq_start,
+	.next   = recursed_function_seq_next,
+	.stop   = recursed_function_seq_stop,
+	.show   = recursed_function_seq_show
+};
+
+static int recursed_function_open(struct inode *inode, struct file *file)
+{
+	int ret = 0;
+
+	mutex_lock(&recursed_function_lock);
+	/* If this file was opened for write, then erase contents */
+	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
+		/* disable updating records */
+		atomic_set(&nr_records, -1);
+		smp_mb__after_atomic();
+		memset(recursed_functions, 0, sizeof(recursed_functions));
+		smp_wmb();
+		/* enable them again */
+		atomic_set(&nr_records, 0);
+	}
+	if (file->f_mode & FMODE_READ)
+		ret = seq_open(file, &recursed_function_seq_ops);
+	mutex_unlock(&recursed_function_lock);
+
+	return ret;
+}
+
+static ssize_t recursed_function_write(struct file *file,
+				       const char __user *buffer,
+				       size_t count, loff_t *ppos)
+{
+	return count;
+}
+
+static int recursed_function_release(struct inode *inode, struct file *file)
+{
+	if (file->f_mode & FMODE_READ)
+		seq_release(inode, file);
+	return 0;
+}
+
+static const struct file_operations recursed_functions_fops = {
+	.open           = recursed_function_open,
+	.write		= recursed_function_write,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = recursed_function_release,
+};
+
+__init static int create_recursed_functions(void)
+{
+	struct dentry *dentry;
+
+	dentry = trace_create_file("recursed_functions", 0644, NULL, NULL,
+				   &recursed_functions_fops);
+	if (!dentry)
+		pr_warn("WARNING: Failed to create recursed_functions\n");
+	return 0;
+}
+
+fs_initcall(create_recursed_functions);
-- 
2.28.0



^ permalink raw reply related

* [PATCH 05/11 v3] kprobes/ftrace: Add recursion protection to the ftrace callback
From: Steven Rostedt @ 2020-11-06  2:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, James E.J. Bottomley, Guo Ren, linux-csky,
	H. Peter Anvin, Miroslav Benes, Ingo Molnar, linux-s390,
	Helge Deller, x86, Anil S Keshavamurthy, Christian Borntraeger,
	Naveen N. Rao, Petr Mladek, Vasily Gorbik, Heiko Carstens,
	Jiri Kosina, Borislav Petkov, Josh Poimboeuf, Thomas Gleixner,
	linux-parisc, Masami Hiramatsu, Paul Mackerras, Andrew Morton,
	linuxppc-dev, David S. Miller
In-Reply-To: <20201106023235.367190737@goodmis.org>

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

If a ftrace callback does not supply its own recursion protection and
does not set the RECURSION_SAFE flag in its ftrace_ops, then ftrace will
make a helper trampoline to do so before calling the callback instead of
just calling the callback directly.

The default for ftrace_ops is going to change. It will expect that handlers
provide their own recursion protection, unless its ftrace_ops states
otherwise.

Link: https://lkml.kernel.org/r/20201028115613.140212174@goodmis.org

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-csky@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---

Changes since v2:

 - Move get_kprobe() into preempt disabled sections for various archs


 arch/csky/kernel/probes/ftrace.c     | 12 ++++++++++--
 arch/parisc/kernel/ftrace.c          | 16 +++++++++++++---
 arch/powerpc/kernel/kprobes-ftrace.c | 11 ++++++++++-
 arch/s390/kernel/ftrace.c            | 16 +++++++++++++---
 arch/x86/kernel/kprobes/ftrace.c     | 12 ++++++++++--
 5 files changed, 56 insertions(+), 11 deletions(-)

diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 5264763d05be..5eb2604fdf71 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -13,16 +13,21 @@ int arch_check_ftrace_location(struct kprobe *p)
 void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 			   struct ftrace_ops *ops, struct pt_regs *regs)
 {
+	int bit;
 	bool lr_saver = false;
 	struct kprobe *p;
 	struct kprobe_ctlblk *kcb;
 
-	/* Preempt is disabled by ftrace */
+	bit = ftrace_test_recursion_trylock();
+	if (bit < 0)
+		return;
+
+	preempt_disable_notrace();
 	p = get_kprobe((kprobe_opcode_t *)ip);
 	if (!p) {
 		p = get_kprobe((kprobe_opcode_t *)(ip - MCOUNT_INSN_SIZE));
 		if (unlikely(!p) || kprobe_disabled(p))
-			return;
+			goto out;
 		lr_saver = true;
 	}
 
@@ -56,6 +61,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		 */
 		__this_cpu_write(current_kprobe, NULL);
 	}
+out:
+	preempt_enable_notrace();
+	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
 
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 63e3ecb9da81..13d85042810a 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -207,14 +207,21 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 			   struct ftrace_ops *ops, struct pt_regs *regs)
 {
 	struct kprobe_ctlblk *kcb;
-	struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip);
+	struct kprobe *p;
+	int bit;
 
-	if (unlikely(!p) || kprobe_disabled(p))
+	bit = ftrace_test_recursion_trylock();
+	if (bit < 0)
 		return;
 
+	preempt_disable_notrace();
+	p = get_kprobe((kprobe_opcode_t *)ip);
+	if (unlikely(!p) || kprobe_disabled(p))
+		goto out;
+
 	if (kprobe_running()) {
 		kprobes_inc_nmissed_count(p);
-		return;
+		goto out;
 	}
 
 	__this_cpu_write(current_kprobe, p);
@@ -235,6 +242,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		}
 	}
 	__this_cpu_write(current_kprobe, NULL);
+out:
+	preempt_enable_notrace();
+	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
 
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 972cb28174b2..5df8d50c65ae 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -18,10 +18,16 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 {
 	struct kprobe *p;
 	struct kprobe_ctlblk *kcb;
+	int bit;
 
+	bit = ftrace_test_recursion_trylock();
+	if (bit < 0)
+		return;
+
+	preempt_disable_notrace();
 	p = get_kprobe((kprobe_opcode_t *)nip);
 	if (unlikely(!p) || kprobe_disabled(p))
-		return;
+		goto out;
 
 	kcb = get_kprobe_ctlblk();
 	if (kprobe_running()) {
@@ -52,6 +58,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 		 */
 		__this_cpu_write(current_kprobe, NULL);
 	}
+out:
+	preempt_enable_notrace();
+	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
 
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index b388e87a08bf..8f31c726537a 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -201,14 +201,21 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		struct ftrace_ops *ops, struct pt_regs *regs)
 {
 	struct kprobe_ctlblk *kcb;
-	struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip);
+	struct kprobe *p;
+	int bit;
 
-	if (unlikely(!p) || kprobe_disabled(p))
+	bit = ftrace_test_recursion_trylock();
+	if (bit < 0)
 		return;
 
+	preempt_disable_notrace();
+	p = get_kprobe((kprobe_opcode_t *)ip);
+	if (unlikely(!p) || kprobe_disabled(p))
+		goto out;
+
 	if (kprobe_running()) {
 		kprobes_inc_nmissed_count(p);
-		return;
+		goto out;
 	}
 
 	__this_cpu_write(current_kprobe, p);
@@ -228,6 +235,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		}
 	}
 	__this_cpu_write(current_kprobe, NULL);
+out:
+	preempt_enable_notrace();
+	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
 
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 681a4b36e9bb..a40a6cdfcca3 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -18,11 +18,16 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 {
 	struct kprobe *p;
 	struct kprobe_ctlblk *kcb;
+	int bit;
 
-	/* Preempt is disabled by ftrace */
+	bit = ftrace_test_recursion_trylock();
+	if (bit < 0)
+		return;
+
+	preempt_disable_notrace();
 	p = get_kprobe((kprobe_opcode_t *)ip);
 	if (unlikely(!p) || kprobe_disabled(p))
-		return;
+		goto out;
 
 	kcb = get_kprobe_ctlblk();
 	if (kprobe_running()) {
@@ -52,6 +57,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		 */
 		__this_cpu_write(current_kprobe, NULL);
 	}
+out:
+	preempt_enable_notrace();
+	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
 
-- 
2.28.0



^ permalink raw reply related

* [PATCH v2] kernel/watchdog: Fix watchdog_allowed_mask not used warning
From: Santosh Sivaraj @ 2020-11-06  1:50 UTC (permalink / raw)
  To: Linux Kernel, linuxppc-dev
  Cc: Andrew Morton, Petr Mladek, Santosh Sivaraj, Thomas Gleixner

Define watchdog_allowed_mask only when SOFTLOCKUP_DETECTOR is enabled.

Fixes: 7feeb9cd4f5b ("watchdog/sysctl: Clean up sysctl variable name space")
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
---
v2:
Added Petr's reviewed-by from [1] and add fixes tag as suggested by Christophe.

[1]: https://lkml.org/lkml/2020/8/20/1030

 kernel/watchdog.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 5abb5b22ad13..71109065bd8e 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -44,8 +44,6 @@ int __read_mostly soft_watchdog_user_enabled = 1;
 int __read_mostly watchdog_thresh = 10;
 static int __read_mostly nmi_watchdog_available;
 
-static struct cpumask watchdog_allowed_mask __read_mostly;
-
 struct cpumask watchdog_cpumask __read_mostly;
 unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
 
@@ -162,6 +160,8 @@ static void lockup_detector_update_enable(void)
 int __read_mostly sysctl_softlockup_all_cpu_backtrace;
 #endif
 
+static struct cpumask watchdog_allowed_mask __read_mostly;
+
 /* Global variables, exported for sysctl */
 unsigned int __read_mostly softlockup_panic =
 			CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE;
-- 
2.26.2


^ permalink raw reply related

* [PATCH v2] powerpc: topology.h: fix build when CONFIG_NUMA=n
From: Scott Cheloha @ 2020-11-05 22:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nathan Lynch, Laurent Dufour, kernel test robot

Add a non-NUMA definition for of_drconf_to_nid_single() to topology.h
so we have one even if powerpc/mm/numa.c is not compiled.  On a non-NUMA
kernel the appropriate node id is always first_online_node.

Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 72cdd117c449 ("pseries/hotplug-memory: hot-add: skip redundant LMB lookup")
---
v1: Initial patch.

v2: Incorporate suggested cleanups from Christophe Leroy.

 arch/powerpc/include/asm/topology.h | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index 8728590f514a..3beeb030cd78 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -6,6 +6,7 @@
 
 struct device;
 struct device_node;
+struct drmem_lmb;
 
 #ifdef CONFIG_NUMA
 
@@ -61,6 +62,9 @@ static inline int early_cpu_to_node(int cpu)
 	 */
 	return (nid < 0) ? 0 : nid;
 }
+
+int of_drconf_to_nid_single(struct drmem_lmb *lmb);
+
 #else
 
 static inline int early_cpu_to_node(int cpu) { return 0; }
@@ -84,10 +88,12 @@ static inline int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
 	return 0;
 }
 
-#endif /* CONFIG_NUMA */
+static inline int of_drconf_to_nid_single(struct drmem_lmb *lmb)
+{
+	return first_online_node;
+}
 
-struct drmem_lmb;
-int of_drconf_to_nid_single(struct drmem_lmb *lmb);
+#endif /* CONFIG_NUMA */
 
 #if defined(CONFIG_NUMA) && defined(CONFIG_PPC_SPLPAR)
 extern int find_and_online_cpu_nid(int cpu);
-- 
2.28.0


^ permalink raw reply related

* Re: Kernel panic from malloc() on SUSE 15.1?
From: Carl Jacobsen @ 2020-11-05 19:44 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev
In-Reply-To: <87lffgt8b9.fsf@mpe.ellerman.id.au>

[-- Attachment #1: Type: text/plain, Size: 14126 bytes --]

On Thu, Nov 5, 2020 at 2:19 AM Michael Ellerman <mpe@ellerman.id.au> wrote:

> Carl Jacobsen <cjacobsen@storix.com> writes:
> > The panic (on a call to malloc from static linked libcrypto) looks like
> > this:
>
> What hardware is this on?
>

Thank you for looking into this.

The system that's panicking identifies like this:
    # uname -a
    Linux sl151pwr8 4.12.14-197.18-default #1 SMP Tue Sep 17 14:26:49 UTC
2019
    (d75059b) ppc64le ppc64le ppc64le GNU/Linux
    #
    # cat /etc/os-release
    NAME="SLES"
    VERSION="15-SP1"
    VERSION_ID="15.1"
    PRETTY_NAME="SUSE Linux Enterprise Server 15 SP1"
    ID="sles"
    ID_LIKE="suse"
    ANSI_COLOR="0;32"
    CPE_NAME="cpe:/o:suse:sles:15:sp1"

The system is an LPAR running under PowerVM vios version 2.2.3.4.
The underlying hardware is machine type-model 8284-22A.


> Can you try booting with ppc_tm=off on the kernel command line, and see
> if that changes anything?
>

Yes. Output is down below. Doesn't appear to change much, but I don't have
the background to interpret the registers.


> Can you put your compiled test program up somewhere we can download it
> and look at? Or post the disassembly?
>

Here's the source file:
    https://www.storix.com/download/support/misc/rand_test.c

Here's the resulting executable:
    https://www.storix.com/download/support/misc/rand_test

Executable is linked to libcrypto from openssl-1.1.1g, configured with:
    ./config no-shared no-dso no-threads -fPIC -ggdb3 -debug -static

Executable is built (on SUSE 12) with:
    gcc -ggdb3 -o rand_test rand_test.c libcrypto.a


And running the executable (on SUSE 15.1) through gdb goes like this:

    # gdb --args ./rand_test
    GNU gdb (GDB; SUSE Linux Enterprise 15) 8.3.1
    << snip intro text >>
    Reading symbols from ./rand_test...
    (gdb) b main
    Breakpoint 1 at 0x1000288c: file rand_test.c, line 6.
    (gdb) r
    Starting program: /tmp/ossl/rand_test

    Breakpoint 1, main (argc=1, argv=0x7ffffffff798) at rand_test.c:6
    6           int has_enough_data = RAND_status();
    (gdb) s
    RAND_status () at crypto/rand/rand_lib.c:958
    958         const RAND_METHOD *meth = RAND_get_rand_method();
    (gdb)
    RAND_get_rand_method () at crypto/rand/rand_lib.c:844
    844         const RAND_METHOD *tmp_meth = NULL;
    (gdb)
    846         if (!RUN_ONCE(&rand_init, do_rand_init))
    (gdb)
    CRYPTO_THREAD_run_once (once=0x102a7d88 <rand_init>,
init=0x10002f30 <do_rand_init_ossl_>) at crypto/threads_none.c:67
    67          if (*once != 0)
    (gdb)
    70          init();
    (gdb)
    do_rand_init_ossl_ () at crypto/rand/rand_lib.c:306
    306     DEFINE_RUN_ONCE_STATIC(do_rand_init)
    (gdb)
    do_rand_init () at crypto/rand/rand_lib.c:309
    309         rand_engine_lock = CRYPTO_THREAD_lock_new();
    (gdb)
    CRYPTO_THREAD_lock_new () at crypto/threads_none.c:24
    24          if ((lock = OPENSSL_zalloc(sizeof(unsigned int))) == NULL) {
    (gdb)
    CRYPTO_zalloc (num=4, file=0x1023a500 "crypto/threads_none.c", line=24)
at crypto/mem.c:230
    230         void *ret = CRYPTO_malloc(num, file, line);
    (gdb)
    CRYPTO_malloc (num=4, file=0x1023a500 "crypto/threads_none.c", line=24)
at crypto/mem.c:194
    194         void *ret = NULL;
    (gdb)
    197         if (malloc_impl != NULL && malloc_impl != CRYPTO_malloc)
    (gdb)
    200         if (num == 0)
    (gdb)
    204         if (allow_customize) {
    (gdb)
    210             allow_customize = 0;
    (gdb)
    222         ret = malloc(num);
    (gdb)
    Bad kernel stack pointer 7fffffffef20 at 700
    Oops: Bad kernel stack pointer, sig: 6 [#1]
    SMP NR_CPUS=2048
    NUMA
    pSeries
    Modules linked in: scsi_transport_iscsi af_packet xt_tcpudp
ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ip_set nfnetlink
ebtable_nat ebtable_broute br_netfilter bridge stp llc ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw
ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security
ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables
x_tables ibmveth(X) vmx_crypto gf128mul crct10dif_vpmsum rtc_generic btrfs
xor zstd_decompress zstd_compress xxhash raid6_pq sr_mod cdrom sd_mod
ibmvscsi(X) scsi_transport_srp crc32c_vpmsum sg dm_multipath dm_mod
scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
    Supported: Yes, External
    CPU: 4 PID: 3082 Comm: rand_test Tainted: G
4.12.14-197.18-default #1 SLE15-SP1
    task: c00000002e226100 task.stack: c0000000387c8000
    NIP: 0000000000000700 LR: 0000000010004acc CTR: 0000000000000000
    REGS: c00000001ebffd40 TRAP: 0300   Tainted: G
 (4.12.14-197.18-default)
    MSR: 8000000000001000 <SF,ME>
      CR: 44000844  XER: 20000000
    CFAR: 00000000000010f0 DAR: ffffffffffffb27a DSISR: 40000000 SOFTE: 0
    GPR00: 0000000020000000 00007fffffffef20 00000000102af788
fffffffffffffffd
    GPR04: 0000000000000020 0000000000000030 00000000102b0760
0000000000000001
    GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730
800000010280f033
    GPR12: 0000000000004000 00007fffb7ffa100 0000000000000000
0000000000000000
    GPR16: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
    GPR20: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
    GPR24: 0000000000000000 0000000000000000 0000000000000000
00007fffb7fef4b8
    GPR28: 00007fffb7ff0000 0000000000000000 0000000000000000
00007fffffffef20
    NIP [0000000000000700] 0x700
    LR [0000000010004acc] 0x10004acc
    Call Trace:
    Instruction dump:
    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000 7db243a6 7db142a6 f92d0080 7d20e2a6
    ---[ end trace 167d5d3b2e8a06e9 ]---

    Sending IPI to other CPUs
    IPI complete
    kexec: Starting switchover sequence.
    I'm in purgatory
     -> smp_release_cpus()
    spinning_secondaries = 0
     <- smp_release_cpus()
    Kernel panic - not syncing: Out of memory and no killable processes...

    CPU: 4 PID: 1 Comm: swapper/4 Not tainted 4.12.14-197.18-default #1
SLE15-SP1
    Call Trace:
    [c000000012457210] [c000000008a20140] dump_stack+0xb0/0xf0 (unreliable)
    [c000000012457250] [c000000008a1ccd4] panic+0x144/0x31c
    [c0000000124572e0] [c0000000082efcc0] out_of_memory+0x3f0/0x700
    [c000000012457380] [c0000000082f7ed4]
__alloc_pages_nodemask+0x1004/0x10b0
    [c000000012457570] [c00000000837f4d8] alloc_page_interleave+0x58/0x110
    [c0000000124575b0] [c0000000083800bc] alloc_pages_current+0x16c/0x1d0
    [c000000012457610] [c0000000082e8398] __page_cache_alloc+0xd8/0x150
    [c000000012457650] [c0000000082e8574] pagecache_get_page+0x164/0x440
    [c0000000124576b0] [c0000000082e8884]
grab_cache_page_write_begin+0x34/0x70
    [c0000000124576e0] [c00000000840ede8] simple_write_begin+0x48/0x190
    [c000000012457720] [c0000000082e7c7c] generic_perform_write+0xec/0x270
    [c0000000124577b0] [c0000000082ea2e0]
__generic_file_write_iter+0x250/0x2a0
    [c000000012457810] [c0000000082ea53c]
generic_file_write_iter+0x20c/0x2e0
    [c000000012457850] [c0000000083cc0e0] __vfs_write+0x120/0x1e0
    [c0000000124578e0] [c0000000083cdfc8] vfs_write+0xd8/0x220
    [c000000012457930] [c0000000083cfeec] SyS_write+0x6c/0x110
    [c000000012457980] [c000000008d154c4] xwrite+0x54/0xb8
    [c0000000124579c0] [c000000008d15574] do_copy+0x4c/0x17c
    [c0000000124579f0] [c000000008d15140] write_buffer+0x64/0x90
    [c000000012457a20] [c000000008d151d4] flush_buffer+0x68/0xf4
    [c000000012457a70] [c000000008d62268] unxz+0x210/0x398
    [c000000012457b10] [c000000008d15efc] unpack_to_rootfs+0x1f0/0x360
    [c000000012457bc0] [c000000008d16108] populate_rootfs+0x9c/0x188
    [c000000012457c40] [c00000000800f5d4] do_one_initcall+0x64/0x1d0
    [c000000012457d00] [c000000008d14474] kernel_init_freeable+0x294/0x388
    [c000000012457dc0] [c00000000801026c] kernel_init+0x2c/0x160
    [c000000012457e30] [c00000000800b560] ret_from_kernel_thread+0x5c/0x7c
    ------------[ cut here ]------------


Doing the same thing but with ppc_tm=off...
    # cat /proc/cmdline
    BOOT_IMAGE=/boot/vmlinux-4.12.14-197.18-default
root=UUID=0e795e37-3692-465a-a037-c2935a9fde7a mitigations=auto quiet
crashkernel=197M ppc_tm=off


Results in a panic at the same point, with a few registers changed:

    << snip down to panic at malloc >>
    (gdb)
    Bad kernel stack pointer 7fffffffef20 at 700
    Oops: Bad kernel stack pointer, sig: 6 [#1]
    SMP NR_CPUS=2048
    NUMA
    pSeries
    Modules linked in: scsi_transport_iscsi af_packet xt_tcpudp
ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ip_set nfnetlink
ebtable_nat ebtable_broute br_netfilter bridge stp llc ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw
ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security
ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables
x_tables ibmveth(X) vmx_crypto gf128mul crct10dif_vpmsum rtc_generic btrfs
xor zstd_decompress zstd_compress xxhash raid6_pq sr_mod cdrom sd_mod
ibmvscsi(X) scsi_transport_srp crc32c_vpmsum sg dm_multipath dm_mod
scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
    Supported: Yes, External
    CPU: 2 PID: 3079 Comm: rand_test Tainted: G
4.12.14-197.18-default #1 SLE15-SP1
    task: c00000002f6bcc00 task.stack: c0000000321fc000
    NIP: 0000000000000700 LR: 0000000010004acc CTR: 0000000000000000
    REGS: c00000001ec17d40 TRAP: 0300   Tainted: G
 (4.12.14-197.18-default)
    MSR: 8000000000001000 <SF,ME>
      CR: 44000844  XER: 20000000
    CFAR: 00000000000010f0 DAR: ffffffffffffb27a DSISR: 40000000 SOFTE: 0
    GPR00: 0000000020000000 00007fffffffef20 00000000102af788
fffffffffffffffd
    GPR04: 0000000000000020 0000000000000030 00000000102b0760
0000000000000001
    GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730
800000000280f033
    GPR12: 0000000000004000 00007fffb7ffa100 0000000000000000
0000000000000000
    GPR16: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
    GPR20: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
    GPR24: 0000000000000000 0000000000000000 0000000000000000
00007fffb7fef4b8
    GPR28: 00007fffb7ff0000 0000000000000000 0000000000000000
00007fffffffef20
    NIP [0000000000000700] 0x700
    LR [0000000010004acc] 0x10004acc
    Call Trace:
    Instruction dump:
    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000 7db243a6 7db142a6 f92d0080 7d20e2a6
    ---[ end trace 436f626dd098548c ]---

    Sending IPI to other CPUs
    IPI complete
    kexec: Starting switchover sequence.
    I'm in purgatory
     -> smp_release_cpus()
    spinning_secondaries = 0
     <- smp_release_cpus()
    Kernel panic - not syncing: Out of memory and no killable processes...

    CPU: 2 PID: 1 Comm: swapper/2 Not tainted 4.12.14-197.18-default #1
SLE15-SP1
    Call Trace:
    [c000000012457210] [c000000008a20140] dump_stack+0xb0/0xf0 (unreliable)
    [c000000012457250] [c000000008a1ccd4] panic+0x144/0x31c
    [c0000000124572e0] [c0000000082efcc0] out_of_memory+0x3f0/0x700
    [c000000012457380] [c0000000082f7ed4]
__alloc_pages_nodemask+0x1004/0x10b0
    [c000000012457570] [c00000000837f4d8] alloc_page_interleave+0x58/0x110
    [c0000000124575b0] [c0000000083800bc] alloc_pages_current+0x16c/0x1d0
    [c000000012457610] [c0000000082e8398] __page_cache_alloc+0xd8/0x150
    [c000000012457650] [c0000000082e8574] pagecache_get_page+0x164/0x440
    [c0000000124576b0] [c0000000082e8884]
grab_cache_page_write_begin+0x34/0x70
    [c0000000124576e0] [c00000000840ede8] simple_write_begin+0x48/0x190
    [c000000012457720] [c0000000082e7c7c] generic_perform_write+0xec/0x270
    [c0000000124577b0] [c0000000082ea2e0]
__generic_file_write_iter+0x250/0x2a0
    [c000000012457810] [c0000000082ea53c]
generic_file_write_iter+0x20c/0x2e0
    [c000000012457850] [c0000000083cc0e0] __vfs_write+0x120/0x1e0
    [c0000000124578e0] [c0000000083cdfc8] vfs_write+0xd8/0x220
    [c000000012457930] [c0000000083cfeec] SyS_write+0x6c/0x110
    [c000000012457980] [c000000008d154c4] xwrite+0x54/0xb8
    [c0000000124579c0] [c000000008d15574] do_copy+0x4c/0x17c
    [c0000000124579f0] [c000000008d15140] write_buffer+0x64/0x90
    [c000000012457a20] [c000000008d151d4] flush_buffer+0x68/0xf4
    [c000000012457a70] [c000000008d62268] unxz+0x210/0x398
    [c000000012457b10] [c000000008d15efc] unpack_to_rootfs+0x1f0/0x360
    [c000000012457bc0] [c000000008d16108] populate_rootfs+0x9c/0x188
    [c000000012457c40] [c00000000800f5d4] do_one_initcall+0x64/0x1d0
    [c000000012457d00] [c000000008d14474] kernel_init_freeable+0x294/0x388
    [c000000012457dc0] [c00000000801026c] kernel_init+0x2c/0x160
    [c000000012457e30] [c00000000800b560] ret_from_kernel_thread+0x5c/0x7c
    ------------[ cut here ]------------


Diffing the panic output looks like this (highlighting register changes?):

    74,75c79,80
    < CPU: 4 PID: 3082 Comm: rand_test Tainted: G
4.12.14-197.18-default #1 SLE15-SP1
    < task: c00000002e226100 task.stack: c0000000387c8000
    ---
    > CPU: 2 PID: 3079 Comm: rand_test Tainted: G
4.12.14-197.18-default #1 SLE15-SP1
    > task: c00000002f6bcc00 task.stack: c0000000321fc000
    77c82
    < REGS: c00000001ebffd40 TRAP: 0300   Tainted: G
 (4.12.14-197.18-default)
    ---
    > REGS: c00000001ec17d40 TRAP: 0300   Tainted: G
 (4.12.14-197.18-default)
    83c88
    < GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730
800000010280f033
    ---
    > GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730
800000000280f033
    95c100
    < ---[ end trace 167d5d3b2e8a06e9 ]---
    ---
    > ---[ end trace 436f626dd098548c ]---
    106c111
    < CPU: 4 PID: 1 Comm: swapper/4 Not tainted 4.12.14-197.18-default #1
SLE15-SP1
    ---
    > CPU: 2 PID: 1 Comm: swapper/2 Not tainted 4.12.14-197.18-default #1
SLE15-SP1

-- 
Carl Jacobsen
Storix, Inc.

[-- Attachment #2: Type: text/html, Size: 16954 bytes --]

^ permalink raw reply

* Re: [PATCH] KVM: PPC: Book3S HV: XIVE: Fix possible oops when accessing ESB page
From: Greg Kurz @ 2020-11-05 17:41 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: kvm, Gustavo Romero, kvm-ppc, Paul Mackerras, linuxppc-dev,
	David Gibson
In-Reply-To: <20201105134713.656160-1-clg@kaod.org>

On Thu, 5 Nov 2020 14:47:13 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> When accessing the ESB page of a source interrupt, the fault handler
> will retrieve the page address from the XIVE interrupt 'xive_irq_data'
> structure. If the associated KVM XIVE interrupt is not valid, that is
> not allocated at the HW level for some reason, the fault handler will
> dereference a NULL pointer leading to the oops below :
> 
>     WARNING: CPU: 40 PID: 59101 at arch/powerpc/kvm/book3s_xive_native.c:259 xive_native_esb_fault+0xe4/0x240 [kvm]
>     CPU: 40 PID: 59101 Comm: qemu-system-ppc Kdump: loaded Tainted: G        W        --------- -  - 4.18.0-240.el8.ppc64le #1
>     NIP:  c00800000e949fac LR: c00000000044b164 CTR: c00800000e949ec8
>     REGS: c000001f69617840 TRAP: 0700   Tainted: G        W        --------- -  -  (4.18.0-240.el8.ppc64le)
>     MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 44044282  XER: 00000000
>     CFAR: c00000000044b160 IRQMASK: 0
>     GPR00: c00000000044b164 c000001f69617ac0 c00800000e96e000 c000001f69617c10
>     GPR04: 05faa2b21e000080 0000000000000000 0000000000000005 ffffffffffffffff
>     GPR08: 0000000000000000 0000000000000001 0000000000000000 0000000000000001
>     GPR12: c00800000e949ec8 c000001ffffd3400 0000000000000000 0000000000000000
>     GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>     GPR20: 0000000000000000 0000000000000000 c000001f5c065160 c000000001c76f90
>     GPR24: c000001f06f20000 c000001f5c065100 0000000000000008 c000001f0eb98c78
>     GPR28: c000001dcab40000 c000001dcab403d8 c000001f69617c10 0000000000000011
>     NIP [c00800000e949fac] xive_native_esb_fault+0xe4/0x240 [kvm]
>     LR [c00000000044b164] __do_fault+0x64/0x220
>     Call Trace:
>     [c000001f69617ac0] [0000000137a5dc20] 0x137a5dc20 (unreliable)
>     [c000001f69617b50] [c00000000044b164] __do_fault+0x64/0x220
>     [c000001f69617b90] [c000000000453838] do_fault+0x218/0x930
>     [c000001f69617bf0] [c000000000456f50] __handle_mm_fault+0x350/0xdf0
>     [c000001f69617cd0] [c000000000457b1c] handle_mm_fault+0x12c/0x310
>     [c000001f69617d10] [c00000000007ef44] __do_page_fault+0x264/0xbb0
>     [c000001f69617df0] [c00000000007f8c8] do_page_fault+0x38/0xd0
>     [c000001f69617e30] [c00000000000a714] handle_page_fault+0x18/0x38
>     Instruction dump:
>     40c2fff0 7c2004ac 2fa90000 409e0118 73e90001 41820080 e8bd0008 7c2004ac
>     7ca90074 39400000 915c0000 7929d182 <0b090000> 2fa50000 419e0080 e89e0018
>     ---[ end trace 66c6ff034c53f64f ]---
>     xive-kvm: xive_native_esb_fault: accessing invalid ESB page for source 8 !
> 
> Fix that by checking the validity of the KVM XIVE interrupt structure.
> 
> Reported-by: Greg Kurz <groug@kaod.org>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Looks sane to me. QEMU still crashes on SIGBUS but no more oops at least.

Tested-by: Greg Kurz <groug@kaod.org>

>  arch/powerpc/kvm/book3s_xive_native.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> index d0c2db0e07fa..a59a94f02733 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -251,6 +251,13 @@ static vm_fault_t xive_native_esb_fault(struct vm_fault *vmf)
>  	}
>  
>  	state = &sb->irq_state[src];
> +
> +	/* Some sanity checking */
> +	if (!state->valid) {
> +		pr_devel("%s: source %lx invalid !\n", __func__, irq);
> +		return VM_FAULT_SIGBUS;
> +	}
> +
>  	kvmppc_xive_select_irq(state, &hw_num, &xd);
>  
>  	arch_spin_lock(&sb->lock);


^ permalink raw reply

* Re: [PATCH 01/18] powerpc/64s: move the last of the page fault handling logic to C
From: kernel test robot @ 2020-11-05 21:54 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: kbuild-all, Nicholas Piggin
In-Reply-To: <20201105143431.1874789-2-npiggin@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4810 bytes --]

Hi Nicholas,

I love your patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on v5.10-rc2 next-20201105]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-interrupt-wrappers/20201105-231909
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/7c4e3c4d325c8c43a43b8031b5327cb91b56db19
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Nicholas-Piggin/powerpc-interrupt-wrappers/20201105-231909
        git checkout 7c4e3c4d325c8c43a43b8031b5327cb91b56db19
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/powerpc/mm/book3s64/hash_utils.c:1513:5: warning: no previous prototype for 'do_hash_fault' [-Wmissing-prototypes]
    1513 | int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
         |     ^~~~~~~~~~~~~
   arch/powerpc/mm/book3s64/hash_utils.c:1886:6: warning: no previous prototype for 'hpte_insert_repeating' [-Wmissing-prototypes]
    1886 | long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
         |      ^~~~~~~~~~~~~~~~~~~~~

vim +/do_hash_fault +1513 arch/powerpc/mm/book3s64/hash_utils.c

  1512	
> 1513	int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
  1514	{
  1515		unsigned long access = _PAGE_PRESENT | _PAGE_READ;
  1516		unsigned long flags = 0;
  1517		struct mm_struct *mm;
  1518		unsigned int region_id;
  1519		int err;
  1520	
  1521		if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
  1522			goto _do_page_fault;
  1523	
  1524		/*
  1525		 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
  1526		 * don't call hash_page, just fail the fault. This is required to
  1527		 * prevent re-entrancy problems in the hash code, namely perf
  1528		 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
  1529		 * hash fault. See the comment in hash_preload().
  1530		 *
  1531		 * We come here as a result of a DSI at a point where we don't want
  1532		 * to call hash_page, such as when we are accessing memory (possibly
  1533		 * user memory) inside a PMU interrupt that occurred while interrupts
  1534		 * were soft-disabled.  We want to invoke the exception handler for
  1535		 * the access, or panic if there isn't a handler.
  1536		 */
  1537		if (unlikely(in_nmi())) {
  1538			bad_page_fault(regs, ea, SIGSEGV);
  1539			return 0;
  1540		}
  1541	
  1542		region_id = get_region_id(ea);
  1543		if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
  1544			mm = &init_mm;
  1545		else
  1546			mm = current->mm;
  1547	
  1548		if (dsisr & DSISR_NOHPTE)
  1549			flags |= HPTE_NOHPTE_UPDATE;
  1550	
  1551		if (dsisr & DSISR_ISSTORE)
  1552			access |= _PAGE_WRITE;
  1553		/*
  1554		 * We set _PAGE_PRIVILEGED only when
  1555		 * kernel mode access kernel space.
  1556		 *
  1557		 * _PAGE_PRIVILEGED is NOT set
  1558		 * 1) when kernel mode access user space
  1559		 * 2) user space access kernel space.
  1560		 */
  1561		access |= _PAGE_PRIVILEGED;
  1562		if (user_mode(regs) || (region_id == USER_REGION_ID))
  1563			access &= ~_PAGE_PRIVILEGED;
  1564	
  1565		if (regs->trap == 0x400)
  1566			access |= _PAGE_EXEC;
  1567	
  1568		err = hash_page_mm(mm, ea, access, regs->trap, flags);
  1569		if (unlikely(err < 0)) {
  1570			// failed to instert a hash PTE due to an hypervisor error
  1571			if (user_mode(regs)) {
  1572				if (IS_ENABLED(CONFIG_PPC_SUBPAGE_PROT) && err == -2)
  1573					_exception(SIGSEGV, regs, SEGV_ACCERR, ea);
  1574				else
  1575					_exception(SIGBUS, regs, BUS_ADRERR, ea);
  1576			} else {
  1577				bad_page_fault(regs, ea, SIGBUS);
  1578			}
  1579			err = 0;
  1580	
  1581		} else if (err) {
  1582	_do_page_fault:
  1583			err = hash__do_page_fault(regs, ea, dsisr);
  1584		}
  1585	
  1586		return err;
  1587	}
  1588	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 71468 bytes --]

^ permalink raw reply

* [PATCH v2 13/16] PCI: dwc: Remove unnecessary wrappers around dw_pcie_host_init()
From: Rob Herring @ 2020-11-05 21:11 UTC (permalink / raw)
  Cc: Kunihiko Hayashi, Neil Armstrong, linux-pci, Binghui Wang,
	Minghuan Lian, Jonathan Chocron, Fabio Estevam, Jerome Brunet,
	Jesper Nilsson, Lorenzo Pieralisi, Kevin Hilman, linux-arm-kernel,
	Murali Karicheri, NXP Linux Team, Xiaowei Song, Richard Zhu,
	Martin Blumenstingl, Sascha Hauer, Yue Wang, Bjorn Helgaas,
	linux-amlogic, Mingkai Hu, linux-arm-kernel, Roy Zang,
	Masahiro Yamada, linuxppc-dev, Pengutronix Kernel Team, Shawn Guo,
	Lucas Stach
In-Reply-To: <20201105211159.1814485-1-robh@kernel.org>

Many calls to dw_pcie_host_init() are in a wrapper function with
nothing else now. Let's remove the pointless extra layer.

Cc: Richard Zhu <hongxing.zhu@nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Minghuan Lian <minghuan.Lian@nxp.com>
Cc: Mingkai Hu <mingkai.hu@nxp.com>
Cc: Roy Zang <roy.zang@nxp.com>
Cc: Yue Wang <yue.wang@Amlogic.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Jonathan Chocron <jonnyc@amazon.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Xiaowei Song <songxiaowei@hisilicon.com>
Cc: Binghui Wang <wangbinghui@hisilicon.com>
Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-arm-kernel@axis.com
Signed-off-by: Rob Herring <robh@kernel.org>
---
 drivers/pci/controller/dwc/pci-imx6.c       | 22 ++---------------
 drivers/pci/controller/dwc/pci-keystone.c   | 19 +--------------
 drivers/pci/controller/dwc/pci-layerscape.c | 26 ++-------------------
 drivers/pci/controller/dwc/pci-meson.c      | 22 ++---------------
 drivers/pci/controller/dwc/pcie-al.c        | 20 ++--------------
 drivers/pci/controller/dwc/pcie-artpec6.c   | 23 +++---------------
 drivers/pci/controller/dwc/pcie-kirin.c     | 11 ++-------
 drivers/pci/controller/dwc/pcie-uniphier.c  | 23 +++---------------
 8 files changed, 17 insertions(+), 149 deletions(-)

diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index f9547bb2cf1b..0cf1333c0440 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -842,25 +842,6 @@ static const struct dw_pcie_host_ops imx6_pcie_host_ops = {
 	.host_init = imx6_pcie_host_init,
 };
 
-static int imx6_add_pcie_port(struct imx6_pcie *imx6_pcie,
-			      struct platform_device *pdev)
-{
-	struct dw_pcie *pci = imx6_pcie->pci;
-	struct pcie_port *pp = &pci->pp;
-	struct device *dev = &pdev->dev;
-	int ret;
-
-	pp->ops = &imx6_pcie_host_ops;
-
-	ret = dw_pcie_host_init(pp);
-	if (ret) {
-		dev_err(dev, "failed to initialize host\n");
-		return ret;
-	}
-
-	return 0;
-}
-
 static const struct dw_pcie_ops dw_pcie_ops = {
 	.start_link = imx6_pcie_start_link,
 };
@@ -1005,6 +986,7 @@ static int imx6_pcie_probe(struct platform_device *pdev)
 
 	pci->dev = dev;
 	pci->ops = &dw_pcie_ops;
+	pci->pp.ops = &imx6_pcie_host_ops;
 
 	imx6_pcie->pci = pci;
 	imx6_pcie->drvdata = of_device_get_match_data(dev);
@@ -1154,7 +1136,7 @@ static int imx6_pcie_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
-	ret = imx6_add_pcie_port(imx6_pcie, pdev);
+	ret = dw_pcie_host_init(&pci->pp);
 	if (ret < 0)
 		return ret;
 
diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
index 5a4bcc2b1ddb..719756160821 100644
--- a/drivers/pci/controller/dwc/pci-keystone.c
+++ b/drivers/pci/controller/dwc/pci-keystone.c
@@ -844,23 +844,6 @@ static irqreturn_t ks_pcie_err_irq_handler(int irq, void *priv)
 	return ks_pcie_handle_error_irq(ks_pcie);
 }
 
-static int __init ks_pcie_add_pcie_port(struct keystone_pcie *ks_pcie,
-					struct platform_device *pdev)
-{
-	struct dw_pcie *pci = ks_pcie->pci;
-	struct pcie_port *pp = &pci->pp;
-	struct device *dev = &pdev->dev;
-	int ret;
-
-	ret = dw_pcie_host_init(pp);
-	if (ret) {
-		dev_err(dev, "failed to initialize host\n");
-		return ret;
-	}
-
-	return 0;
-}
-
 static void ks_pcie_am654_write_dbi2(struct dw_pcie *pci, void __iomem *base,
 				     u32 reg, size_t size, u32 val)
 {
@@ -1255,7 +1238,7 @@ static int __init ks_pcie_probe(struct platform_device *pdev)
 		}
 
 		pci->pp.ops = host_ops;
-		ret = ks_pcie_add_pcie_port(ks_pcie, pdev);
+		ret = dw_pcie_host_init(&pci->pp);
 		if (ret < 0)
 			goto err_get_sync;
 		break;
diff --git a/drivers/pci/controller/dwc/pci-layerscape.c b/drivers/pci/controller/dwc/pci-layerscape.c
index 400ebbebd00f..44ad34cdc3bc 100644
--- a/drivers/pci/controller/dwc/pci-layerscape.c
+++ b/drivers/pci/controller/dwc/pci-layerscape.c
@@ -232,31 +232,12 @@ static const struct of_device_id ls_pcie_of_match[] = {
 	{ },
 };
 
-static int __init ls_add_pcie_port(struct ls_pcie *pcie)
-{
-	struct dw_pcie *pci = pcie->pci;
-	struct pcie_port *pp = &pci->pp;
-	struct device *dev = pci->dev;
-	int ret;
-
-	pp->ops = pcie->drvdata->ops;
-
-	ret = dw_pcie_host_init(pp);
-	if (ret) {
-		dev_err(dev, "failed to initialize host\n");
-		return ret;
-	}
-
-	return 0;
-}
-
 static int __init ls_pcie_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
 	struct dw_pcie *pci;
 	struct ls_pcie *pcie;
 	struct resource *dbi_base;
-	int ret;
 
 	pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
 	if (!pcie)
@@ -270,6 +251,7 @@ static int __init ls_pcie_probe(struct platform_device *pdev)
 
 	pci->dev = dev;
 	pci->ops = pcie->drvdata->dw_pcie_ops;
+	pci->pp.ops = pcie->drvdata->ops;
 
 	pcie->pci = pci;
 
@@ -285,11 +267,7 @@ static int __init ls_pcie_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, pcie);
 
-	ret = ls_add_pcie_port(pcie);
-	if (ret < 0)
-		return ret;
-
-	return 0;
+	return dw_pcie_host_init(&pci->pp);
 }
 
 static struct platform_driver ls_pcie_driver = {
diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c
index 04589f0decb2..686ded034f22 100644
--- a/drivers/pci/controller/dwc/pci-meson.c
+++ b/drivers/pci/controller/dwc/pci-meson.c
@@ -387,25 +387,6 @@ static const struct dw_pcie_host_ops meson_pcie_host_ops = {
 	.host_init = meson_pcie_host_init,
 };
 
-static int meson_add_pcie_port(struct meson_pcie *mp,
-			       struct platform_device *pdev)
-{
-	struct dw_pcie *pci = &mp->pci;
-	struct pcie_port *pp = &pci->pp;
-	struct device *dev = &pdev->dev;
-	int ret;
-
-	pp->ops = &meson_pcie_host_ops;
-
-	ret = dw_pcie_host_init(pp);
-	if (ret) {
-		dev_err(dev, "failed to initialize host\n");
-		return ret;
-	}
-
-	return 0;
-}
-
 static const struct dw_pcie_ops dw_pcie_ops = {
 	.link_up = meson_pcie_link_up,
 	.start_link = meson_pcie_start_link,
@@ -425,6 +406,7 @@ static int meson_pcie_probe(struct platform_device *pdev)
 	pci = &mp->pci;
 	pci->dev = dev;
 	pci->ops = &dw_pcie_ops;
+	pci->pp.ops = &meson_pcie_host_ops;
 	pci->num_lanes = 1;
 
 	mp->phy = devm_phy_get(dev, "pcie");
@@ -471,7 +453,7 @@ static int meson_pcie_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, mp);
 
-	ret = meson_add_pcie_port(mp, pdev);
+	ret = dw_pcie_host_init(&pci->pp);
 	if (ret < 0) {
 		dev_err(dev, "Add PCIe port failed, %d\n", ret);
 		goto err_phy;
diff --git a/drivers/pci/controller/dwc/pcie-al.c b/drivers/pci/controller/dwc/pcie-al.c
index d06866921187..7ac8a37d9ce0 100644
--- a/drivers/pci/controller/dwc/pcie-al.c
+++ b/drivers/pci/controller/dwc/pcie-al.c
@@ -322,23 +322,6 @@ static const struct dw_pcie_host_ops al_pcie_host_ops = {
 	.host_init = al_pcie_host_init,
 };
 
-static int al_add_pcie_port(struct pcie_port *pp,
-			    struct platform_device *pdev)
-{
-	struct device *dev = &pdev->dev;
-	int ret;
-
-	pp->ops = &al_pcie_host_ops;
-
-	ret = dw_pcie_host_init(pp);
-	if (ret) {
-		dev_err(dev, "failed to initialize host\n");
-		return ret;
-	}
-
-	return 0;
-}
-
 static const struct dw_pcie_ops dw_pcie_ops = {
 };
 
@@ -360,6 +343,7 @@ static int al_pcie_probe(struct platform_device *pdev)
 
 	pci->dev = dev;
 	pci->ops = &dw_pcie_ops;
+	pci->pp.ops = &al_pcie_host_ops;
 
 	al_pcie->pci = pci;
 	al_pcie->dev = dev;
@@ -384,7 +368,7 @@ static int al_pcie_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, al_pcie);
 
-	return al_add_pcie_port(&pci->pp, pdev);
+	return dw_pcie_host_init(&pci->pp);
 }
 
 static const struct of_device_id al_pcie_of_match[] = {
diff --git a/drivers/pci/controller/dwc/pcie-artpec6.c b/drivers/pci/controller/dwc/pcie-artpec6.c
index fcba9915a606..597c282f586c 100644
--- a/drivers/pci/controller/dwc/pcie-artpec6.c
+++ b/drivers/pci/controller/dwc/pcie-artpec6.c
@@ -336,25 +336,6 @@ static const struct dw_pcie_host_ops artpec6_pcie_host_ops = {
 	.host_init = artpec6_pcie_host_init,
 };
 
-static int artpec6_add_pcie_port(struct artpec6_pcie *artpec6_pcie,
-				 struct platform_device *pdev)
-{
-	struct dw_pcie *pci = artpec6_pcie->pci;
-	struct pcie_port *pp = &pci->pp;
-	struct device *dev = pci->dev;
-	int ret;
-
-	pp->ops = &artpec6_pcie_host_ops;
-
-	ret = dw_pcie_host_init(pp);
-	if (ret) {
-		dev_err(dev, "failed to initialize host\n");
-		return ret;
-	}
-
-	return 0;
-}
-
 static void artpec6_pcie_ep_init(struct dw_pcie_ep *ep)
 {
 	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
@@ -445,7 +426,9 @@ static int artpec6_pcie_probe(struct platform_device *pdev)
 		if (!IS_ENABLED(CONFIG_PCIE_ARTPEC6_HOST))
 			return -ENODEV;
 
-		ret = artpec6_add_pcie_port(artpec6_pcie, pdev);
+		pci->pp.ops = &artpec6_pcie_host_ops;
+
+		ret = dw_pcie_host_init(&pci->pp);
 		if (ret < 0)
 			return ret;
 		break;
diff --git a/drivers/pci/controller/dwc/pcie-kirin.c b/drivers/pci/controller/dwc/pcie-kirin.c
index ac4bbdaf5324..026fd1e42a55 100644
--- a/drivers/pci/controller/dwc/pcie-kirin.c
+++ b/drivers/pci/controller/dwc/pcie-kirin.c
@@ -419,14 +419,6 @@ static const struct dw_pcie_host_ops kirin_pcie_host_ops = {
 	.host_init = kirin_pcie_host_init,
 };
 
-static int kirin_add_pcie_port(struct dw_pcie *pci,
-			       struct platform_device *pdev)
-{
-	pci->pp.ops = &kirin_pcie_host_ops;
-
-	return dw_pcie_host_init(&pci->pp);
-}
-
 static int kirin_pcie_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
@@ -449,6 +441,7 @@ static int kirin_pcie_probe(struct platform_device *pdev)
 
 	pci->dev = dev;
 	pci->ops = &kirin_dw_pcie_ops;
+	pci->pp.ops = &kirin_pcie_host_ops;
 	kirin_pcie->pci = pci;
 
 	ret = kirin_pcie_get_clk(kirin_pcie, pdev);
@@ -474,7 +467,7 @@ static int kirin_pcie_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, kirin_pcie);
 
-	return kirin_add_pcie_port(pci, pdev);
+	return dw_pcie_host_init(&pci->pp);
 }
 
 static const struct of_device_id kirin_pcie_match[] = {
diff --git a/drivers/pci/controller/dwc/pcie-uniphier.c b/drivers/pci/controller/dwc/pcie-uniphier.c
index 2457e9dd098d..7e8bad326770 100644
--- a/drivers/pci/controller/dwc/pcie-uniphier.c
+++ b/drivers/pci/controller/dwc/pcie-uniphier.c
@@ -321,25 +321,6 @@ static const struct dw_pcie_host_ops uniphier_pcie_host_ops = {
 	.host_init = uniphier_pcie_host_init,
 };
 
-static int uniphier_add_pcie_port(struct uniphier_pcie_priv *priv,
-				  struct platform_device *pdev)
-{
-	struct dw_pcie *pci = &priv->pci;
-	struct pcie_port *pp = &pci->pp;
-	struct device *dev = &pdev->dev;
-	int ret;
-
-	pp->ops = &uniphier_pcie_host_ops;
-
-	ret = dw_pcie_host_init(pp);
-	if (ret) {
-		dev_err(dev, "Failed to initialize host (%d)\n", ret);
-		return ret;
-	}
-
-	return 0;
-}
-
 static int uniphier_pcie_host_enable(struct uniphier_pcie_priv *priv)
 {
 	int ret;
@@ -415,7 +396,9 @@ static int uniphier_pcie_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
-	return uniphier_add_pcie_port(priv, pdev);
+	priv->pci.pp.ops = &uniphier_pcie_host_ops;
+
+	return dw_pcie_host_init(&priv->pci.pp);
 }
 
 static const struct of_device_id uniphier_pcie_match[] = {
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 12/16] PCI: dwc: Move dw_pcie_setup_rc() to DWC common code
From: Rob Herring @ 2020-11-05 21:11 UTC (permalink / raw)
  Cc: Kunihiko Hayashi, Neil Armstrong, linux-pci, Binghui Wang,
	Bjorn Andersson, Minghuan Lian, linux-arm-kernel,
	Thomas Petazzoni, Fabio Estevam, Jerome Brunet, Jesper Nilsson,
	Lorenzo Pieralisi, Kevin Hilman, Pratyush Anand,
	Krzysztof Kozlowski, Kishon Vijay Abraham I, Murali Karicheri,
	NXP Linux Team, Xiaowei Song, Richard Zhu, Martin Blumenstingl,
	linux-arm-msm, Sascha Hauer, linuxppc-dev, Yue Wang,
	linux-samsung-soc, Bjorn Helgaas, linux-amlogic, linux-omap,
	Mingkai Hu, linux-arm-kernel, Roy Zang, Masahiro Yamada,
	Gustavo Pimentel, Andy Gross, Stanimir Varbanov, Kukjin Kim,
	Pengutronix Kernel Team, Jingoo Han, Shawn Guo, Lucas Stach
In-Reply-To: <20201105211159.1814485-1-robh@kernel.org>

All RC complex drivers must call dw_pcie_setup_rc(). The ordering of the
call shouldn't be too important other than being after any RC resets.

There's a few calls of dw_pcie_setup_rc() left as drivers implementing
suspend/resume need it.

Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Richard Zhu <hongxing.zhu@nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Minghuan Lian <minghuan.Lian@nxp.com>
Cc: Mingkai Hu <mingkai.hu@nxp.com>
Cc: Roy Zang <roy.zang@nxp.com>
Cc: Yue Wang <yue.wang@Amlogic.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: Xiaowei Song <songxiaowei@hisilicon.com>
Cc: Binghui Wang <wangbinghui@hisilicon.com>
Cc: Andy Gross <agross@kernel.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Stanimir Varbanov <svarbanov@mm-sol.com>
Cc: Pratyush Anand <pratyush.anand@gmail.com>
Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-omap@vger.kernel.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-arm-kernel@axis.com
Cc: linux-arm-msm@vger.kernel.org
Acked-by: Jingoo Han <jingoohan1@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
 drivers/pci/controller/dwc/pci-dra7xx.c           | 1 -
 drivers/pci/controller/dwc/pci-exynos.c           | 1 -
 drivers/pci/controller/dwc/pci-imx6.c             | 1 -
 drivers/pci/controller/dwc/pci-keystone.c         | 2 --
 drivers/pci/controller/dwc/pci-layerscape.c       | 2 --
 drivers/pci/controller/dwc/pci-meson.c            | 2 --
 drivers/pci/controller/dwc/pcie-armada8k.c        | 2 --
 drivers/pci/controller/dwc/pcie-artpec6.c         | 1 -
 drivers/pci/controller/dwc/pcie-designware-host.c | 1 +
 drivers/pci/controller/dwc/pcie-designware-plat.c | 8 --------
 drivers/pci/controller/dwc/pcie-histb.c           | 3 ---
 drivers/pci/controller/dwc/pcie-kirin.c           | 2 --
 drivers/pci/controller/dwc/pcie-qcom.c            | 1 -
 drivers/pci/controller/dwc/pcie-spear13xx.c       | 2 --
 drivers/pci/controller/dwc/pcie-uniphier.c        | 2 --
 15 files changed, 1 insertion(+), 30 deletions(-)

diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
index 72a5a2bf933b..b105af63854a 100644
--- a/drivers/pci/controller/dwc/pci-dra7xx.c
+++ b/drivers/pci/controller/dwc/pci-dra7xx.c
@@ -181,7 +181,6 @@ static int dra7xx_pcie_host_init(struct pcie_port *pp)
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
 	struct dra7xx_pcie *dra7xx = to_dra7xx_pcie(pci);
 
-	dw_pcie_setup_rc(pp);
 	dra7xx_pcie_enable_interrupts(dra7xx);
 
 	return 0;
diff --git a/drivers/pci/controller/dwc/pci-exynos.c b/drivers/pci/controller/dwc/pci-exynos.c
index 3939fe22e8a2..5c10a5432896 100644
--- a/drivers/pci/controller/dwc/pci-exynos.c
+++ b/drivers/pci/controller/dwc/pci-exynos.c
@@ -372,7 +372,6 @@ static int exynos_pcie_host_init(struct pcie_port *pp)
 	phy_init(ep->phy);
 
 	exynos_pcie_deassert_core_reset(ep);
-	dw_pcie_setup_rc(pp);
 	exynos_pcie_assert_reset(ep);
 
 	exynos_pcie_enable_interrupts(ep);
diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index 9e30fbf4efbe..f9547bb2cf1b 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -834,7 +834,6 @@ static int imx6_pcie_host_init(struct pcie_port *pp)
 	imx6_pcie_init_phy(imx6_pcie);
 	imx6_pcie_deassert_core_reset(imx6_pcie);
 	imx6_setup_phy_mpll(imx6_pcie);
-	dw_pcie_setup_rc(pp);
 
 	return 0;
 }
diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
index 90b222b020a3..5a4bcc2b1ddb 100644
--- a/drivers/pci/controller/dwc/pci-keystone.c
+++ b/drivers/pci/controller/dwc/pci-keystone.c
@@ -807,8 +807,6 @@ static int __init ks_pcie_host_init(struct pcie_port *pp)
 	if (ret)
 		return ret;
 
-	dw_pcie_setup_rc(pp);
-
 	ks_pcie_stop_link(pci);
 	ks_pcie_setup_rc_app_regs(ks_pcie);
 	writew(PCI_IO_RANGE_TYPE_32 | (PCI_IO_RANGE_TYPE_32 << 8),
diff --git a/drivers/pci/controller/dwc/pci-layerscape.c b/drivers/pci/controller/dwc/pci-layerscape.c
index 0d84986c4c16..400ebbebd00f 100644
--- a/drivers/pci/controller/dwc/pci-layerscape.c
+++ b/drivers/pci/controller/dwc/pci-layerscape.c
@@ -136,8 +136,6 @@ static int ls_pcie_host_init(struct pcie_port *pp)
 
 	ls_pcie_drop_msg_tlp(pcie);
 
-	dw_pcie_setup_rc(pp);
-
 	return 0;
 }
 
diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c
index 2df0adcf0bf2..04589f0decb2 100644
--- a/drivers/pci/controller/dwc/pci-meson.c
+++ b/drivers/pci/controller/dwc/pci-meson.c
@@ -380,8 +380,6 @@ static int meson_pcie_host_init(struct pcie_port *pp)
 	meson_set_max_payload(mp, MAX_PAYLOAD_SIZE);
 	meson_set_max_rd_req_size(mp, MAX_READ_REQ_SIZE);
 
-	dw_pcie_setup_rc(pp);
-
 	return 0;
 }
 
diff --git a/drivers/pci/controller/dwc/pcie-armada8k.c b/drivers/pci/controller/dwc/pcie-armada8k.c
index dd2926bbb901..4e2552dcf982 100644
--- a/drivers/pci/controller/dwc/pcie-armada8k.c
+++ b/drivers/pci/controller/dwc/pcie-armada8k.c
@@ -171,8 +171,6 @@ static int armada8k_pcie_host_init(struct pcie_port *pp)
 	u32 reg;
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
 
-	dw_pcie_setup_rc(pp);
-
 	if (!dw_pcie_link_up(pci)) {
 		/* Disable LTSSM state machine to enable configuration */
 		reg = dw_pcie_readl_dbi(pci, PCIE_GLOBAL_CONTROL_REG);
diff --git a/drivers/pci/controller/dwc/pcie-artpec6.c b/drivers/pci/controller/dwc/pcie-artpec6.c
index 7ee8f3c83f8f..fcba9915a606 100644
--- a/drivers/pci/controller/dwc/pcie-artpec6.c
+++ b/drivers/pci/controller/dwc/pcie-artpec6.c
@@ -328,7 +328,6 @@ static int artpec6_pcie_host_init(struct pcie_port *pp)
 	artpec6_pcie_init_phy(artpec6_pcie);
 	artpec6_pcie_deassert_core_reset(artpec6_pcie);
 	artpec6_pcie_wait_for_phy(artpec6_pcie);
-	dw_pcie_setup_rc(pp);
 
 	return 0;
 }
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index ebea2c814448..f2b0a15ad72b 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -422,6 +422,7 @@ int dw_pcie_host_init(struct pcie_port *pp)
 			goto err_free_msi;
 	}
 
+	dw_pcie_setup_rc(pp);
 	dw_pcie_msi_init(pp);
 
 	if (!dw_pcie_link_up(pci) && pci->ops->start_link) {
diff --git a/drivers/pci/controller/dwc/pcie-designware-plat.c b/drivers/pci/controller/dwc/pcie-designware-plat.c
index dec24e595c3e..9b397c807261 100644
--- a/drivers/pci/controller/dwc/pcie-designware-plat.c
+++ b/drivers/pci/controller/dwc/pcie-designware-plat.c
@@ -33,15 +33,7 @@ struct dw_plat_pcie_of_data {
 
 static const struct of_device_id dw_plat_pcie_of_match[];
 
-static int dw_plat_pcie_host_init(struct pcie_port *pp)
-{
-	dw_pcie_setup_rc(pp);
-
-	return 0;
-}
-
 static const struct dw_pcie_host_ops dw_plat_pcie_host_ops = {
-	.host_init = dw_plat_pcie_host_init,
 };
 
 static int dw_plat_pcie_establish_link(struct dw_pcie *pci)
diff --git a/drivers/pci/controller/dwc/pcie-histb.c b/drivers/pci/controller/dwc/pcie-histb.c
index 210777c793ea..86f9d16c50d7 100644
--- a/drivers/pci/controller/dwc/pcie-histb.c
+++ b/drivers/pci/controller/dwc/pcie-histb.c
@@ -196,9 +196,6 @@ static int histb_pcie_host_init(struct pcie_port *pp)
 	regval |= PCIE_WM_RC;
 	histb_pcie_writel(hipcie, PCIE_SYS_CTRL0, regval);
 
-	/* setup root complex */
-	dw_pcie_setup_rc(pp);
-
 	return 0;
 }
 
diff --git a/drivers/pci/controller/dwc/pcie-kirin.c b/drivers/pci/controller/dwc/pcie-kirin.c
index f84ac1b36b2c..ac4bbdaf5324 100644
--- a/drivers/pci/controller/dwc/pcie-kirin.c
+++ b/drivers/pci/controller/dwc/pcie-kirin.c
@@ -405,8 +405,6 @@ static int kirin_pcie_host_init(struct pcie_port *pp)
 {
 	pp->bridge->ops = &kirin_pci_ops;
 
-	dw_pcie_setup_rc(pp);
-
 	return 0;
 }
 
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index eb107179d544..e49791c4f846 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -1280,7 +1280,6 @@ static int qcom_pcie_host_init(struct pcie_port *pp)
 			goto err_disable_phy;
 	}
 
-	dw_pcie_setup_rc(pp);
 	qcom_ep_reset_deassert(pcie);
 
 	return 0;
diff --git a/drivers/pci/controller/dwc/pcie-spear13xx.c b/drivers/pci/controller/dwc/pcie-spear13xx.c
index 31475e4493a7..1a9e353bef55 100644
--- a/drivers/pci/controller/dwc/pcie-spear13xx.c
+++ b/drivers/pci/controller/dwc/pcie-spear13xx.c
@@ -130,8 +130,6 @@ static int spear13xx_pcie_host_init(struct pcie_port *pp)
 
 	spear13xx_pcie->app_base = pci->dbi_base + 0x2000;
 
-	dw_pcie_setup_rc(pp);
-
 	/*
 	 * this controller support only 128 bytes read size, however its
 	 * default value in capability register is 512 bytes. So force
diff --git a/drivers/pci/controller/dwc/pcie-uniphier.c b/drivers/pci/controller/dwc/pcie-uniphier.c
index e6616408a29c..2457e9dd098d 100644
--- a/drivers/pci/controller/dwc/pcie-uniphier.c
+++ b/drivers/pci/controller/dwc/pcie-uniphier.c
@@ -314,8 +314,6 @@ static int uniphier_pcie_host_init(struct pcie_port *pp)
 
 	uniphier_pcie_irq_enable(priv);
 
-	dw_pcie_setup_rc(pp);
-
 	return 0;
 }
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 09/16] PCI: dwc: Rework MSI initialization
From: Rob Herring @ 2020-11-05 21:11 UTC (permalink / raw)
  Cc: Roy Zang, Lorenzo Pieralisi, Jingoo Han, linux-pci, Minghuan Lian,
	Murali Karicheri, Mingkai Hu, Gustavo Pimentel, Bjorn Helgaas,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <20201105211159.1814485-1-robh@kernel.org>

There are 3 possible MSI implementations for the DWC host. The first is
using the built-in DWC MSI controller. The 2nd is a custom MSI
controller as part of the PCI host (keystone only). The 3rd is an
external MSI controller (typically GICv3 ITS). Currently, the last 2
are distinguished with a .msi_host_init() hook with the 3rd option using
an empty function. However we can detect the 3rd case with the presence
of 'msi-parent' or 'msi-map' properties, so let's do that instead and
remove the empty functions.

Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Minghuan Lian <minghuan.Lian@nxp.com>
Cc: Mingkai Hu <mingkai.hu@nxp.com>
Cc: Roy Zang <roy.zang@nxp.com>
Cc: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: linuxppc-dev@lists.ozlabs.org
Acked-by: Jingoo Han <jingoohan1@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
 drivers/pci/controller/dwc/pci-keystone.c     |  9 -------
 drivers/pci/controller/dwc/pci-layerscape.c   | 25 -------------------
 .../pci/controller/dwc/pcie-designware-host.c | 20 +++++++++------
 drivers/pci/controller/dwc/pcie-designware.h  |  1 +
 drivers/pci/controller/dwc/pcie-intel-gw.c    |  9 -------
 5 files changed, 13 insertions(+), 51 deletions(-)

diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
index 9cf14f13798b..784385ae6074 100644
--- a/drivers/pci/controller/dwc/pci-keystone.c
+++ b/drivers/pci/controller/dwc/pci-keystone.c
@@ -272,14 +272,6 @@ static void ks_pcie_handle_legacy_irq(struct keystone_pcie *ks_pcie,
 	ks_pcie_app_writel(ks_pcie, IRQ_EOI, offset);
 }
 
-/*
- * Dummy function so that DW core doesn't configure MSI
- */
-static int ks_pcie_am654_msi_host_init(struct pcie_port *pp)
-{
-	return 0;
-}
-
 static void ks_pcie_enable_error_irq(struct keystone_pcie *ks_pcie)
 {
 	ks_pcie_app_writel(ks_pcie, ERR_IRQ_ENABLE_SET, ERR_IRQ_ALL);
@@ -854,7 +846,6 @@ static const struct dw_pcie_host_ops ks_pcie_host_ops = {
 
 static const struct dw_pcie_host_ops ks_pcie_am654_host_ops = {
 	.host_init = ks_pcie_host_init,
-	.msi_host_init = ks_pcie_am654_msi_host_init,
 };
 
 static irqreturn_t ks_pcie_err_irq_handler(int irq, void *priv)
diff --git a/drivers/pci/controller/dwc/pci-layerscape.c b/drivers/pci/controller/dwc/pci-layerscape.c
index 53e56d54c482..0d84986c4c16 100644
--- a/drivers/pci/controller/dwc/pci-layerscape.c
+++ b/drivers/pci/controller/dwc/pci-layerscape.c
@@ -168,37 +168,12 @@ static int ls1021_pcie_host_init(struct pcie_port *pp)
 	return ls_pcie_host_init(pp);
 }
 
-static int ls_pcie_msi_host_init(struct pcie_port *pp)
-{
-	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
-	struct device *dev = pci->dev;
-	struct device_node *np = dev->of_node;
-	struct device_node *msi_node;
-
-	/*
-	 * The MSI domain is set by the generic of_msi_configure().  This
-	 * .msi_host_init() function keeps us from doing the default MSI
-	 * domain setup in dw_pcie_host_init() and also enforces the
-	 * requirement that "msi-parent" exists.
-	 */
-	msi_node = of_parse_phandle(np, "msi-parent", 0);
-	if (!msi_node) {
-		dev_err(dev, "failed to find msi-parent\n");
-		return -EINVAL;
-	}
-
-	of_node_put(msi_node);
-	return 0;
-}
-
 static const struct dw_pcie_host_ops ls1021_pcie_host_ops = {
 	.host_init = ls1021_pcie_host_init,
-	.msi_host_init = ls_pcie_msi_host_init,
 };
 
 static const struct dw_pcie_host_ops ls_pcie_host_ops = {
 	.host_init = ls_pcie_host_init,
-	.msi_host_init = ls_pcie_msi_host_init,
 };
 
 static const struct dw_pcie_ops dw_ls1021_pcie_ops = {
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 95deef0eaadf..9b952639d020 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -365,6 +365,10 @@ int dw_pcie_host_init(struct pcie_port *pp)
 		pci->link_gen = of_pci_get_max_link_speed(np);
 
 	if (pci_msi_enabled()) {
+		pp->has_msi_ctrl = !(pp->ops->msi_host_init ||
+				     of_property_read_bool(np, "msi-parent") ||
+				     of_property_read_bool(np, "msi-map"));
+
 		if (!pp->num_vectors) {
 			pp->num_vectors = MSI_DEF_NUM_VECTORS;
 		} else if (pp->num_vectors > MAX_MSI_IRQS) {
@@ -372,7 +376,11 @@ int dw_pcie_host_init(struct pcie_port *pp)
 			return -EINVAL;
 		}
 
-		if (!pp->ops->msi_host_init) {
+		if (pp->ops->msi_host_init) {
+			ret = pp->ops->msi_host_init(pp);
+			if (ret < 0)
+				return ret;
+		} else if (pp->has_msi_ctrl) {
 			if (!pp->msi_irq) {
 				pp->msi_irq = platform_get_irq_byname_optional(pdev, "msi");
 				if (pp->msi_irq < 0) {
@@ -402,10 +410,6 @@ int dw_pcie_host_init(struct pcie_port *pp)
 				pp->msi_data = 0;
 				goto err_free_msi;
 			}
-		} else {
-			ret = pp->ops->msi_host_init(pp);
-			if (ret < 0)
-				return ret;
 		}
 	}
 
@@ -426,7 +430,7 @@ int dw_pcie_host_init(struct pcie_port *pp)
 		return 0;
 
 err_free_msi:
-	if (pci_msi_enabled() && !pp->ops->msi_host_init)
+	if (pp->has_msi_ctrl)
 		dw_pcie_free_msi(pp);
 	return ret;
 }
@@ -436,7 +440,7 @@ void dw_pcie_host_deinit(struct pcie_port *pp)
 {
 	pci_stop_root_bus(pp->bridge->bus);
 	pci_remove_root_bus(pp->bridge->bus);
-	if (pci_msi_enabled() && !pp->ops->msi_host_init)
+	if (pp->has_msi_ctrl)
 		dw_pcie_free_msi(pp);
 }
 EXPORT_SYMBOL_GPL(dw_pcie_host_deinit);
@@ -544,7 +548,7 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
 
 	dw_pcie_setup(pci);
 
-	if (pci_msi_enabled() && !pp->ops->msi_host_init) {
+	if (pp->has_msi_ctrl) {
 		num_ctrls = pp->num_vectors / MAX_MSI_IRQS_PER_CTRL;
 
 		/* Initialize IRQ Status array */
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 96382dcb2859..5d374bab10d1 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -175,6 +175,7 @@ struct dw_pcie_host_ops {
 };
 
 struct pcie_port {
+	bool			has_msi_ctrl:1;
 	u64			cfg0_base;
 	void __iomem		*va_cfg0_base;
 	u32			cfg0_size;
diff --git a/drivers/pci/controller/dwc/pcie-intel-gw.c b/drivers/pci/controller/dwc/pcie-intel-gw.c
index c562eb7454b1..292b9de86532 100644
--- a/drivers/pci/controller/dwc/pcie-intel-gw.c
+++ b/drivers/pci/controller/dwc/pcie-intel-gw.c
@@ -385,14 +385,6 @@ static int intel_pcie_rc_init(struct pcie_port *pp)
 	return intel_pcie_host_setup(lpp);
 }
 
-/*
- * Dummy function so that DW core doesn't configure MSI
- */
-static int intel_pcie_msi_init(struct pcie_port *pp)
-{
-	return 0;
-}
-
 static u64 intel_pcie_cpu_addr(struct dw_pcie *pcie, u64 cpu_addr)
 {
 	return cpu_addr + BUS_IATU_OFFSET;
@@ -404,7 +396,6 @@ static const struct dw_pcie_ops intel_pcie_ops = {
 
 static const struct dw_pcie_host_ops intel_pcie_dw_ops = {
 	.host_init =		intel_pcie_rc_init,
-	.msi_host_init =	intel_pcie_msi_init,
 };
 
 static const struct intel_pcie_soc pcie_data = {
-- 
2.25.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox