LinuxPPC-Dev Archive on lore.kernel.org

* [PATCH v1] powerpc/64s: Fix unrecoverable MCE crash
From: Nicholas Piggin @ 2021-09-22  2:02 UTC
  To: linuxppc-dev; +Cc: Mahesh Salgaonkar, Ganesh Goudar, Nicholas Piggin

The machine check handler is not considered NMI on 64s. The early
handler is the true NMI handler, and then it schedules the
machine_check_exception handler to run when interrupts are enabled.

This works fine except in the case of an unrecoverable MCE, where the true
NMI is taken while MSR[RI] is clear. It cannot recover to schedule the
next handler, so it calls machine_check_exception directly so that
something might be done about it.

Calling an async handler from NMI context can result in irq state and
other things getting corrupted. This can also trigger the BUG at
arch/powerpc/include/asm/interrupt.h:168.

Fix this by just making the 64s machine_check_exception handler an NMI
like it is on other subarchs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h |  4 ----
 arch/powerpc/kernel/traps.c          | 23 +++++++----------------
 2 files changed, 7 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 6b800d3e2681..b32ed910a8cf 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -524,11 +524,7 @@ static __always_inline long ____##func(struct pt_regs *regs)
 /* Interrupt handlers */
 /* kernel/traps.c */
 DECLARE_INTERRUPT_HANDLER_NMI(system_reset_exception);
-#ifdef CONFIG_PPC_BOOK3S_64
-DECLARE_INTERRUPT_HANDLER_ASYNC(machine_check_exception);
-#else
 DECLARE_INTERRUPT_HANDLER_NMI(machine_check_exception);
-#endif
 DECLARE_INTERRUPT_HANDLER(SMIException);
 DECLARE_INTERRUPT_HANDLER(handle_hmi_exception);
 DECLARE_INTERRUPT_HANDLER(unknown_exception);
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index aac8c0412ff9..b21450c655d2 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -790,24 +790,19 @@ void die_mce(const char *str, struct pt_regs *regs, long err)
 	 * do_exit() checks for in_interrupt() and panics in that case, so
 	 * exit the irq/nmi before calling die.
 	 */
-	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64))
-		irq_exit();
-	else
-		nmi_exit();
+	nmi_exit();
 	die(str, regs, err);
 }
 
 /*
- * BOOK3S_64 does not call this handler as a non-maskable interrupt
- * (it uses its own early real-mode handler to handle the MCE proper
- * and then raises irq_work to call this handler when interrupts are
- * enabled).
+ * BOOK3S_64 does not call this handler as a non-maskable interrupt (it uses
+ * its own early real-mode handler to handle the MCE proper and then raises
+ * irq_work to call this handler when interrupts are enabled), except in the
+ * case of unrecoverable_mce. If unrecoverable_mce was a separate NMI handler,
+ * then this could be ASYNC on 64s. However it should all work okay as an NMI
+ * handler (and it is NMI on other platforms) so just make it an NMI.
  */
-#ifdef CONFIG_PPC_BOOK3S_64
-DEFINE_INTERRUPT_HANDLER_ASYNC(machine_check_exception)
-#else
 DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
-#endif
 {
 	int recover = 0;
 
@@ -842,11 +837,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
 	if (regs_is_unrecoverable(regs))
 		die_mce("Unrecoverable Machine check", regs, SIGBUS);
 
-#ifdef CONFIG_PPC_BOOK3S_64
-	return;
-#else
 	return 0;
-#endif
 }
 
 DEFINE_INTERRUPT_HANDLER(SMIException) /* async? */
-- 
2.23.0


* Re: [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code
From: kernel test robot @ 2021-09-20 21:50 UTC
  To: Nathan Lynch, linuxppc-dev
  Cc: danielhb413, tyreld, ldufour, kbuild-all, aneesh.kumar
In-Reply-To: <20210920135504.1792219-4-nathanl@linux.ibm.com>

Hi Nathan,

Thank you for the patch! There is still something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master v5.15-rc2 next-20210920]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nathan-Lynch/CPU-DLPAR-hotplug-for-v5-16/20210920-215907
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/72ea4c8a5398a4a72da34051a66f260ab0154f57
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Nathan-Lynch/CPU-DLPAR-hotplug-for-v5-16/20210920-215907
        git checkout 72ea4c8a5398a4a72da34051a66f260ab0154f57
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/pseries/hotplug-cpu.c: In function 'dlpar_cpu':
>> arch/powerpc/platforms/pseries/hotplug-cpu.c:746:13: error: variable 'count' set but not used [-Werror=unused-but-set-variable]
     746 |         u32 count, drc_index;
         |             ^~~~~
   cc1: all warnings being treated as errors


vim +/count +746 arch/powerpc/platforms/pseries/hotplug-cpu.c

ac71380071d19d Nathan Fontenot         2015-12-16  743  
ac71380071d19d Nathan Fontenot         2015-12-16  744  int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
ac71380071d19d Nathan Fontenot         2015-12-16  745  {
ac71380071d19d Nathan Fontenot         2015-12-16 @746  	u32 count, drc_index;
ac71380071d19d Nathan Fontenot         2015-12-16  747  	int rc;
ac71380071d19d Nathan Fontenot         2015-12-16  748  
ac71380071d19d Nathan Fontenot         2015-12-16  749  	count = hp_elog->_drc_u.drc_count;
ac71380071d19d Nathan Fontenot         2015-12-16  750  	drc_index = hp_elog->_drc_u.drc_index;
ac71380071d19d Nathan Fontenot         2015-12-16  751  
ac71380071d19d Nathan Fontenot         2015-12-16  752  	lock_device_hotplug();
ac71380071d19d Nathan Fontenot         2015-12-16  753  
ac71380071d19d Nathan Fontenot         2015-12-16  754  	switch (hp_elog->action) {
ac71380071d19d Nathan Fontenot         2015-12-16  755  	case PSERIES_HP_ELOG_ACTION_REMOVE:
72ea4c8a5398a4 Nathan Lynch            2021-09-20  756  		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX) {
ac71380071d19d Nathan Fontenot         2015-12-16  757  			rc = dlpar_cpu_remove_by_index(drc_index);
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  758  			/*
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  759  			 * Setting the isolation state of an UNISOLATED/CONFIGURED
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  760  			 * device to UNISOLATE is a no-op, but the hypervisor can
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  761  			 * use it as a hint that the CPU removal failed.
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  762  			 */
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  763  			if (rc)
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  764  				dlpar_unisolate_drc(drc_index);
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  765  		}
ac71380071d19d Nathan Fontenot         2015-12-16  766  		else
ac71380071d19d Nathan Fontenot         2015-12-16  767  			rc = -EINVAL;
ac71380071d19d Nathan Fontenot         2015-12-16  768  		break;
90edf184b9b727 Nathan Fontenot         2015-12-16  769  	case PSERIES_HP_ELOG_ACTION_ADD:
72ea4c8a5398a4 Nathan Lynch            2021-09-20  770  		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX)
90edf184b9b727 Nathan Fontenot         2015-12-16  771  			rc = dlpar_cpu_add(drc_index);
90edf184b9b727 Nathan Fontenot         2015-12-16  772  		else
90edf184b9b727 Nathan Fontenot         2015-12-16  773  			rc = -EINVAL;
90edf184b9b727 Nathan Fontenot         2015-12-16  774  		break;
ac71380071d19d Nathan Fontenot         2015-12-16  775  	default:
ac71380071d19d Nathan Fontenot         2015-12-16  776  		pr_err("Invalid action (%d) specified\n", hp_elog->action);
ac71380071d19d Nathan Fontenot         2015-12-16  777  		rc = -EINVAL;
ac71380071d19d Nathan Fontenot         2015-12-16  778  		break;
ac71380071d19d Nathan Fontenot         2015-12-16  779  	}
ac71380071d19d Nathan Fontenot         2015-12-16  780  
ac71380071d19d Nathan Fontenot         2015-12-16  781  	unlock_device_hotplug();
ac71380071d19d Nathan Fontenot         2015-12-16  782  	return rc;
ac71380071d19d Nathan Fontenot         2015-12-16  783  }
ac71380071d19d Nathan Fontenot         2015-12-16  784  
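
Illustrative sketch (not part of the posted report): the error above is GCC's
-Wunused-but-set-variable, made fatal here because all warnings are treated as
errors. The userspace fragment below shows the same pattern and one possible
fix, which is to drop the variable that is written but never read. The struct
and function names are hypothetical stand-ins, not the kernel's definitions,
and the actual fix chosen for the patch may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields dlpar_cpu() reads. */
struct hp_elog_sketch {
	uint32_t drc_count;
	uint32_t drc_index;
};

/*
 * Before (triggers -Wunused-but-set-variable, because 'count' is
 * assigned but its value is never consumed):
 *
 *	uint32_t count, drc_index;
 *
 *	count = elog->drc_count;
 *	drc_index = elog->drc_index;
 *	return drc_index;
 */

/* After: only the value that is actually used is kept. */
uint32_t sketch_pick_drc_index(const struct hp_elog_sketch *elog)
{
	uint32_t drc_index;

	assert(elog != NULL);
	drc_index = elog->drc_index;

	return drc_index;
}
```

Building the "before" form with `gcc -Wall -Werror` should reproduce the
diagnostic; the "after" form compiles cleanly because no dead store remains.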

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 74041 bytes --]

* Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver
From: Corentin Labbe @ 2021-09-22  6:04 UTC
  To: Emmanuel Gil Peyrot
  Cc: devicetree, Herbert Xu, Ash Logan, linux-kernel, Rob Herring,
	Paul Mackerras, linux-crypto, linuxppc-dev, David S. Miller,
	Jonathan Neuschäfer
In-Reply-To: <20210921213930.10366-2-linkmauve@linkmauve.fr>

On Tue, Sep 21, 2021 at 11:39:27PM +0200, Emmanuel Gil Peyrot wrote:
> This engine implements AES in CBC mode, using 128-bit keys only.  It is
> present on both the Wii and the Wii U, and is apparently identical in
> both consoles.
> 
> The hardware is capable of firing an interrupt when the operation is
> done, but this driver currently uses a busy loop, I’m not too sure
> whether it would be preferable to switch, nor how to achieve that.
> 
> It also supports a mode where no operation is done, and thus could be
> used as a DMA copy engine, but I don’t know how to expose that to the
> kernel or whether it would even be useful.
> 
> In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> speedup.
> 
> This driver was written based on reversed documentation, see:
> https://wiibrew.org/wiki/Hardware/AES
> 
> Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
> Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>  # on Wii U

[...]

> +static int
> +do_crypt(const void *src, void *dst, u32 len, u32 flags)
> +{
> +	u32 blocks = ((len >> 4) - 1) & AES_CTRL_BLOCK;
> +	u32 status;
> +	u32 counter = OP_TIMEOUT;
> +	u32 i;
> +
> +	/* Flush out all of src, we can’t know whether any of it is in cache */
> +	for (i = 0; i < len; i += 32)
> +		__asm__("dcbf 0, %0" : : "r" (src + i));
> +	__asm__("sync" : : : "memory");
> +
> +	/* Set the addresses for DMA */
> +	iowrite32be(virt_to_phys((void *)src), base + AES_SRC);
> +	iowrite32be(virt_to_phys(dst), base + AES_DEST);

Hello,

Since you are doing DMA operations, I think you should use the DMA API and
call dma_map_xxx(). This would avoid the need for __asm__ and virt_to_phys().

Regards

* Re: [PATCH] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
From: Michael Ellerman @ 2021-09-22  6:32 UTC
  To: Nathan Lynch, linuxppc-dev; +Cc: srikar, npiggin
In-Reply-To: <20210921031213.2029824-1-nathanl@linux.ibm.com>

Nathan Lynch <nathanl@linux.ibm.com> writes:
> vcpu_is_preempted() can be used outside of preempt-disabled critical
> sections, yielding warnings such as:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/185
> caller is rwsem_spin_on_owner+0x1cc/0x2d0
> CPU: 1 PID: 185 Comm: systemd-udevd Not tainted 5.15.0-rc2+ #33
> Call Trace:
> [c000000012907ac0] [c000000000aa30a8] dump_stack_lvl+0xac/0x108 (unreliable)
> [c000000012907b00] [c000000001371f70] check_preemption_disabled+0x150/0x160
> [c000000012907b90] [c0000000001e0e8c] rwsem_spin_on_owner+0x1cc/0x2d0
> [c000000012907be0] [c0000000001e1408] rwsem_down_write_slowpath+0x478/0x9a0
> [c000000012907ca0] [c000000000576cf4] filename_create+0x94/0x1e0
> [c000000012907d10] [c00000000057ac08] do_symlinkat+0x68/0x1a0
> [c000000012907d70] [c00000000057ae18] sys_symlink+0x58/0x70
> [c000000012907da0] [c00000000002e448] system_call_exception+0x198/0x3c0
> [c000000012907e10] [c00000000000c54c] system_call_common+0xec/0x250
>
> The result of vcpu_is_preempted() is always subject to invalidation by
> events inside and outside of Linux; it's just a best guess at a point in
> time. Use raw_smp_processor_id() to avoid such warnings.
>
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> Fixes: ca3f969dcb11 ("powerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted()")
> ---
>  arch/powerpc/include/asm/paravirt.h | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
> index bcb7b5f917be..e429aca566de 100644
> --- a/arch/powerpc/include/asm/paravirt.h
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -97,7 +97,14 @@ static inline bool vcpu_is_preempted(int cpu)
>  
>  #ifdef CONFIG_PPC_SPLPAR
>  	if (!is_kvm_guest()) {
> -		int first_cpu = cpu_first_thread_sibling(smp_processor_id());
> +		int first_cpu;
> +
> +		/*
> +		 * This is only a guess at best, and this function may be
> +		 * called with preemption enabled. Using raw_smp_processor_id()
> +		 * does not damage accuracy.
> +		 */
> +		first_cpu = cpu_first_thread_sibling(raw_smp_processor_id());

This change seems good, except I think the comment needs to be a lot
more explicit about what it's doing and why.

A casual reader is going to be confused about vcpu preemption vs
"preemption", which are basically unrelated yet use the same word.

It's not clear how raw_smp_processor_id() is related to (Linux)
preemption, unless you know that smp_processor_id() is the alternative
and it contains a preemption check.

And "this is only a guess" is not clear on what *this* is. You're
referring to the result of the whole function, but that's not obvious.

>  		/*
>  		 * Preemption can only happen at core granularity. This CPU
                   ^^^^^^^^^^
                   Means something different to "preemption" above.

I know you didn't write that comment, and maybe we need to rewrite some
of those existing comments to make it clear they're not talking about
Linux preemption.

cheers
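
Illustrative sketch (not part of the posted message): the distinction drawn
above, that smp_processor_id() carries a Linux-preemption check while
raw_smp_processor_id() does not, can be modelled in userspace. Everything
below is a hypothetical model. None of the names are kernel symbols, and the
real debug check has further exemptions (for example when interrupts are
disabled) that are left out here.

```c
#include <assert.h>

/* Hypothetical model state; these are not kernel variables. */
static int model_preempt_count;   /* > 0 means Linux preemption is disabled */
static int model_current_cpu = 1; /* CPU the task happens to be running on */
static int model_warnings;        /* number of debug splats "printed" so far */

/* Models raw_smp_processor_id(): read the CPU number, no checking at all. */
int model_raw_cpu_id(void)
{
	return model_current_cpu;
}

/*
 * Models the debug flavour of smp_processor_id(): same return value, but it
 * complains when the caller could be migrated to another CPU right after the
 * read, i.e. when preemption is enabled.
 */
int model_checked_cpu_id(void)
{
	assert(model_preempt_count >= 0);
	if (model_preempt_count == 0)
		model_warnings++;
	return model_current_cpu;
}

/* Number of warnings raised so far by the checked variant. */
int model_warning_count(void)
{
	return model_warnings;
}
```

In this model both accessors return the same CPU number; only the checked one
records a warning when called with preemption enabled, which mirrors why
switching to the raw variant silences the splat without changing the result.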

* Re: [PATCH] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
From: Srikar Dronamraju @ 2021-09-22  7:57 UTC
  To: Nathan Lynch; +Cc: linuxppc-dev, npiggin
In-Reply-To: <20210921031213.2029824-1-nathanl@linux.ibm.com>

* Nathan Lynch <nathanl@linux.ibm.com> [2021-09-20 22:12:13]:

> vcpu_is_preempted() can be used outside of preempt-disabled critical
> sections, yielding warnings such as:
> 
> BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/185
> caller is rwsem_spin_on_owner+0x1cc/0x2d0
> CPU: 1 PID: 185 Comm: systemd-udevd Not tainted 5.15.0-rc2+ #33
> Call Trace:
> [c000000012907ac0] [c000000000aa30a8] dump_stack_lvl+0xac/0x108 (unreliable)
> [c000000012907b00] [c000000001371f70] check_preemption_disabled+0x150/0x160
> [c000000012907b90] [c0000000001e0e8c] rwsem_spin_on_owner+0x1cc/0x2d0
> [c000000012907be0] [c0000000001e1408] rwsem_down_write_slowpath+0x478/0x9a0
> [c000000012907ca0] [c000000000576cf4] filename_create+0x94/0x1e0
> [c000000012907d10] [c00000000057ac08] do_symlinkat+0x68/0x1a0
> [c000000012907d70] [c00000000057ae18] sys_symlink+0x58/0x70
> [c000000012907da0] [c00000000002e448] system_call_exception+0x198/0x3c0
> [c000000012907e10] [c00000000000c54c] system_call_common+0xec/0x250
> 
> The result of vcpu_is_preempted() is always subject to invalidation by
> events inside and outside of Linux; it's just a best guess at a point in
> time. Use raw_smp_processor_id() to avoid such warnings.

Typically smp_processor_id() and raw_smp_processor_id() are the same, except
in the CONFIG_DEBUG_PREEMPT case. With CONFIG_DEBUG_PREEMPT, smp_processor_id()
is actually debug_smp_processor_id(), which does all the checks.

I believe these checks in debug_smp_processor_id() are only valid for the x86
case (that is, cases where __smp_processor_id() is defined),
i.e. x86 has different implementations of __smp_processor_id() for the stable
and unstable cases.

> 
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> Fixes: ca3f969dcb11 ("powerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted()")
> ---
>  arch/powerpc/include/asm/paravirt.h | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
> index bcb7b5f917be..e429aca566de 100644
> --- a/arch/powerpc/include/asm/paravirt.h
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -97,7 +97,14 @@ static inline bool vcpu_is_preempted(int cpu)
> 
>  #ifdef CONFIG_PPC_SPLPAR
>  	if (!is_kvm_guest()) {
> -		int first_cpu = cpu_first_thread_sibling(smp_processor_id());
> +		int first_cpu;
> +
> +		/*
> +		 * This is only a guess at best, and this function may be
> +		 * called with preemption enabled. Using raw_smp_processor_id()
> +		 * does not damage accuracy.
> +		 */
> +		first_cpu = cpu_first_thread_sibling(raw_smp_processor_id());
> 
>  		/*
>  		 * Preemption can only happen at core granularity. This CPU
> -- 
> 2.31.1
> 

How about something like the below?

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 510519e8a1eb..8c669e8ceb73 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -256,12 +256,14 @@ static inline int get_boot_cpu_id(void)
  */
 #ifndef __smp_processor_id
 #define __smp_processor_id(x) raw_smp_processor_id(x)
-#endif
-
+#else
 #ifdef CONFIG_DEBUG_PREEMPT
   extern unsigned int debug_smp_processor_id(void);
 # define smp_processor_id() debug_smp_processor_id()
-#else
+#endif
+#endif
+
+#ifndef smp_processor_id
 # define smp_processor_id() __smp_processor_id()
 #endif
 

-- 
Thanks and Regards
Srikar Dronamraju

* Re: [PATCH 5/7] PCI: Add pci_find_dvsec_capability to find designated VSEC
From: Frederic Barrat @ 2021-09-22  9:33 UTC
  To: Ben Widawsky, linux-cxl, linux-pci
  Cc: Alison Schofield, Andrew Donnellan, Ira Weiny, Vishal Verma,
	David E . Box, Jonathan Cameron, Bjorn Helgaas, Dan Williams,
	linuxppc-dev
In-Reply-To: <20210921220459.2437386-6-ben.widawsky@intel.com>



On 22/09/2021 00:04, Ben Widawsky wrote:
> Add pci_find_dvsec_capability to locate a Designated Vendor-Specific
> Extended Capability with the specified DVSEC ID.
> 
> The Designated Vendor-Specific Extended Capability (DVSEC) allows one or
> more vendor specific capabilities that aren't tied to the vendor ID of
> the PCI component.
> 
> DVSEC is critical for both the Compute Express Link (CXL) driver as well
> as the driver for OpenCAPI coherent accelerator (OCXL).
> 
> Cc: David E. Box <david.e.box@linux.intel.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: linux-pci@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Frederic Barrat <fbarrat@linux.ibm.com>
> Cc: Andrew Donnellan <ajd@linux.ibm.com>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---


LGTM
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>


>   drivers/pci/pci.c   | 32 ++++++++++++++++++++++++++++++++
>   include/linux/pci.h |  1 +
>   2 files changed, 33 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ce2ab62b64cf..94ac86ff28b0 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -732,6 +732,38 @@ u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap)
>   }
>   EXPORT_SYMBOL_GPL(pci_find_vsec_capability);
>   
> +/**
> + * pci_find_dvsec_capability - Find DVSEC for vendor
> + * @dev: PCI device to query
> + * @vendor: Vendor ID to match for the DVSEC
> + * @dvsec: Designated Vendor-specific capability ID
> + *
> + * If DVSEC has Vendor ID @vendor and DVSEC ID @dvsec return the capability
> + * offset in config space; otherwise return 0.
> + */
> +u16 pci_find_dvsec_capability(struct pci_dev *dev, u16 vendor, u16 dvsec)
> +{
> +	int pos;
> +
> +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DVSEC);
> +	if (!pos)
> +		return 0;
> +
> +	while (pos) {
> +		u16 v, id;
> +
> +		pci_read_config_word(dev, pos + PCI_DVSEC_HEADER1, &v);
> +		pci_read_config_word(dev, pos + PCI_DVSEC_HEADER2, &id);
> +		if (vendor == v && dvsec == id)
> +			return pos;
> +
> +		pos = pci_find_next_ext_capability(dev, pos, PCI_EXT_CAP_ID_DVSEC);
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(pci_find_dvsec_capability);
> +
>   /**
>    * pci_find_parent_resource - return resource region of parent bus of given
>    *			      region
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index cd8aa6fce204..c93ccfa4571b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1130,6 +1130,7 @@ u16 pci_find_ext_capability(struct pci_dev *dev, int cap);
>   u16 pci_find_next_ext_capability(struct pci_dev *dev, u16 pos, int cap);
>   struct pci_bus *pci_find_next_bus(const struct pci_bus *from);
>   u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap);
> +u16 pci_find_dvsec_capability(struct pci_dev *dev, u16 vendor, u16 dvsec);
>   
>   u64 pci_get_dsn(struct pci_dev *dev);
>   
> 
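
Illustrative sketch (not part of the posted message): the lookup in the quoted
patch walks the chain of PCIe extended capabilities and compares the vendor ID
and DVSEC ID of each DVSEC entry. The userspace model below performs the same
walk over a plain byte array standing in for config space. The function and
macro names are hypothetical; the offsets and field layout follow the PCIe
extended capability format as I understand it (capability ID in bits 15:0 and
next pointer in bits 31:20 of the header, DVSEC vendor ID at +4, DVSEC ID
at +8).

```c
#include <assert.h>
#include <stdint.h>

#define CFG_SIZE          4096  /* PCIe extended config space, in bytes */
#define EXT_CAP_START     0x100 /* offset of the first extended capability */
#define EXT_CAP_ID_DVSEC  0x23  /* extended capability ID for DVSEC */
#define DVSEC_HEADER1     0x4   /* vendor ID lives in bits 15:0 */
#define DVSEC_HEADER2     0x8   /* DVSEC ID lives in bits 15:0 */

/* Config space is little-endian; assemble values byte by byte. */
static uint32_t cfg_read32(const uint8_t *cfg, unsigned int pos)
{
	return (uint32_t)cfg[pos] | ((uint32_t)cfg[pos + 1] << 8) |
	       ((uint32_t)cfg[pos + 2] << 16) | ((uint32_t)cfg[pos + 3] << 24);
}

static void cfg_write32(uint8_t *cfg, unsigned int pos, uint32_t val)
{
	for (int i = 0; i < 4; i++)
		cfg[pos + i] = (uint8_t)(val >> (8 * i));
}

/* Helper for experiments: lay down a DVSEC entry at pos, chained to next. */
void sketch_put_dvsec(uint8_t *cfg, unsigned int pos, unsigned int next,
		      uint16_t vendor, uint16_t dvsec_id)
{
	assert(pos >= EXT_CAP_START && pos + 12 <= CFG_SIZE);
	cfg_write32(cfg, pos,
		    EXT_CAP_ID_DVSEC | (1u << 16) | ((uint32_t)next << 20));
	cfg_write32(cfg, pos + DVSEC_HEADER1, vendor);
	cfg_write32(cfg, pos + DVSEC_HEADER2, dvsec_id);
}

/* Return the offset of the matching DVSEC entry, or 0 if there is none. */
uint16_t sketch_find_dvsec(const uint8_t *cfg, uint16_t vendor,
			   uint16_t dvsec_id)
{
	unsigned int pos = EXT_CAP_START;
	int ttl = (CFG_SIZE - EXT_CAP_START) / 8; /* bound the walk */

	while (pos >= EXT_CAP_START && pos + 12 <= CFG_SIZE && ttl-- > 0) {
		uint32_t hdr = cfg_read32(cfg, pos);

		if (hdr == 0) /* empty header: no (more) capabilities */
			break;
		if ((hdr & 0xffff) == EXT_CAP_ID_DVSEC &&
		    (cfg_read32(cfg, pos + DVSEC_HEADER1) & 0xffff) == vendor &&
		    (cfg_read32(cfg, pos + DVSEC_HEADER2) & 0xffff) == dvsec_id)
			return (uint16_t)pos;
		pos = (hdr >> 20) & 0xffc; /* next pointer, dword aligned */
	}
	return 0;
}
```

The walk is bounded by a hop counter so that a malformed, looping capability
chain cannot spin forever. That bound is a choice made for this sketch and is
not something stated in the patch itself.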

* Re: [PATCH 7/7] ocxl: Use pci core's DVSEC functionality
From: Frederic Barrat @ 2021-09-22  9:38 UTC
  To: Dan Williams, Ben Widawsky
  Cc: Alison Schofield, Andrew Donnellan, Linux PCI, linuxppc-dev,
	linux-cxl, Vishal Verma, Jonathan Cameron, Ira Weiny
In-Reply-To: <CAPcyv4h4QHAQF+ogMvOXrkdyR5Jceo8yp7TQNN+836=v0QwdDw@mail.gmail.com>



On 22/09/2021 02:44, Dan Williams wrote:
> On Tue, Sep 21, 2021 at 3:05 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>>
>> Reduce maintenance burden of DVSEC query implementation by using the
>> centralized PCI core implementation.
>>
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Cc: Frederic Barrat <fbarrat@linux.ibm.com>
>> Cc: Andrew Donnellan <ajd@linux.ibm.com>
>> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>> ---
>>   drivers/misc/ocxl/config.c | 13 +------------
>>   1 file changed, 1 insertion(+), 12 deletions(-)
>>
>> diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
>> index a68738f38252..e401a51596b9 100644
>> --- a/drivers/misc/ocxl/config.c
>> +++ b/drivers/misc/ocxl/config.c
>> @@ -33,18 +33,7 @@
>>
>>   static int find_dvsec(struct pci_dev *dev, int dvsec_id)
>>   {
>> -       int vsec = 0;
>> -       u16 vendor, id;
>> -
>> -       while ((vsec = pci_find_next_ext_capability(dev, vsec,
>> -                                                   OCXL_EXT_CAP_ID_DVSEC))) {
>> -               pci_read_config_word(dev, vsec + OCXL_DVSEC_VENDOR_OFFSET,
>> -                               &vendor);
>> -               pci_read_config_word(dev, vsec + OCXL_DVSEC_ID_OFFSET, &id);
>> -               if (vendor == PCI_VENDOR_ID_IBM && id == dvsec_id)
>> -                       return vsec;
>> -       }
>> -       return 0;
>> +       return pci_find_dvsec_capability(dev, PCI_VENDOR_ID_IBM, dvsec_id);
>>   }


That looks fine, thanks for spotting it. You can add this for the next 
revision:
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>



> 
> What about:
> 
> arch/powerpc/platforms/powernv/ocxl.c::find_dvsec_from_pos()
> 
> ...?  With that converted the redundant definitions below:
> 
> OCXL_EXT_CAP_ID_DVSEC
> OCXL_DVSEC_VENDOR_OFFSET
> OCXL_DVSEC_ID_OFFSET
> 
> ...can be cleaned up in favor of the core definitions.


That would be great. Are you guys willing to do it? If not, I could have 
a follow-on patch, if I don't forget :-)

Thanks,

   Fred


* Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver
From: Ard Biesheuvel @ 2021-09-22 10:10 UTC
  To: Emmanuel Gil Peyrot
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Herbert Xu, Ash Logan, Linux Kernel Mailing List, Rob Herring,
	Paul Mackerras, Linux Crypto Mailing List,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), David S. Miller,
	Jonathan Neuschäfer
In-Reply-To: <20210921213930.10366-2-linkmauve@linkmauve.fr>

On Tue, 21 Sept 2021 at 23:49, Emmanuel Gil Peyrot
<linkmauve@linkmauve.fr> wrote:
>
> This engine implements AES in CBC mode, using 128-bit keys only.  It is
> present on both the Wii and the Wii U, and is apparently identical in
> both consoles.
>
> The hardware is capable of firing an interrupt when the operation is
> done, but this driver currently uses a busy loop, I’m not too sure
> whether it would be preferable to switch, nor how to achieve that.
>
> It also supports a mode where no operation is done, and thus could be
> used as a DMA copy engine, but I don’t know how to expose that to the
> kernel or whether it would even be useful.
>
> In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> speedup.
>
> This driver was written based on reversed documentation, see:
> https://wiibrew.org/wiki/Hardware/AES
>
> Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
> Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>  # on Wii U

This is redundant - everybody should test the code they submit.

...
> +       /* TODO: figure out how to use interrupts here, this will probably
> +        * lower throughput but let the CPU do other things while the AES
> +        * engine is doing its work. */

So is it worthwhile like this? How much faster is it to use this
accelerator rather than the CPU?

> +       do {
> +               status = ioread32be(base + AES_CTRL);
> +               cpu_relax();
> +       } while ((status & AES_CTRL_EXEC) && --counter);
> +
> +       /* Do we ever get called with dst ≠ src?  If so we have to invalidate
> +        * dst in addition to the earlier flush of src. */
> +       if (unlikely(dst != src)) {
> +               for (i = 0; i < len; i += 32)
> +                       __asm__("dcbi 0, %0" : : "r" (dst + i));
> +               __asm__("sync" : : : "memory");
> +       }
> +
> +       return counter ? 0 : 1;
> +}
> +
> +static void
> +nintendo_aes_crypt(const void *src, void *dst, u32 len, u8 *iv, int dir,
> +                  bool firstchunk)
> +{
> +       u32 flags = 0;
> +       unsigned long iflags;
> +       int ret;
> +
> +       flags |= AES_CTRL_EXEC_INIT /* | AES_CTRL_IRQ */ | AES_CTRL_ENA;
> +
> +       if (dir == AES_DIR_DECRYPT)
> +               flags |= AES_CTRL_DEC;
> +
> +       if (!firstchunk)
> +               flags |= AES_CTRL_IV;
> +
> +       /* Start the critical section */
> +       spin_lock_irqsave(&lock, iflags);
> +
> +       if (firstchunk)
> +               writefield(AES_IV, iv);
> +
> +       ret = do_crypt(src, dst, len, flags);
> +       BUG_ON(ret);
> +
> +       spin_unlock_irqrestore(&lock, iflags);
> +}
> +
> +static int nintendo_setkey_skcipher(struct crypto_skcipher *tfm, const u8 *key,
> +                                   unsigned int len)
> +{
> +       /* The hardware only supports AES-128 */
> +       if (len != AES_KEYSIZE_128)
> +               return -EINVAL;
> +
> +       writefield(AES_KEY, key);
> +       return 0;
> +}
> +
> +static int nintendo_skcipher_crypt(struct skcipher_request *req, int dir)
> +{
> +       struct skcipher_walk walk;
> +       unsigned int nbytes;
> +       int err;
> +       char ivbuf[AES_BLOCK_SIZE];
> +       unsigned int ivsize;
> +
> +       bool firstchunk = true;
> +
> +       /* Reset the engine */
> +       iowrite32be(0, base + AES_CTRL);
> +
> +       err = skcipher_walk_virt(&walk, req, false);
> +       ivsize = min(sizeof(ivbuf), walk.ivsize);
> +
> +       while ((nbytes = walk.nbytes) != 0) {
> +               unsigned int chunkbytes = round_down(nbytes, AES_BLOCK_SIZE);
> +               unsigned int ret = nbytes % AES_BLOCK_SIZE;
> +
> +               if (walk.total == chunkbytes && dir == AES_DIR_DECRYPT) {
> +                       /* If this is the last chunk and we're decrypting, take
> +                        * note of the IV (which is the last ciphertext block)
> +                        */
> +                       memcpy(ivbuf, walk.src.virt.addr + walk.total - ivsize,
> +                              ivsize);
> +               }
> +
> +               nintendo_aes_crypt(walk.src.virt.addr, walk.dst.virt.addr,
> +                                  chunkbytes, walk.iv, dir, firstchunk);
> +
> +               if (walk.total == chunkbytes && dir == AES_DIR_ENCRYPT) {
> +                       /* If this is the last chunk and we're encrypting, take
> +                        * note of the IV (which is the last ciphertext block)
> +                        */
> +                       memcpy(walk.iv,
> +                              walk.dst.virt.addr + walk.total - ivsize,
> +                              ivsize);
> +               } else if (walk.total == chunkbytes && dir == AES_DIR_DECRYPT) {
> +                       memcpy(walk.iv, ivbuf, ivsize);
> +               }
> +
> +               err = skcipher_walk_done(&walk, ret);
> +               firstchunk = false;
> +       }
> +
> +       return err;
> +}
> +
> +static int nintendo_cbc_encrypt(struct skcipher_request *req)
> +{
> +       return nintendo_skcipher_crypt(req, AES_DIR_ENCRYPT);
> +}
> +
> +static int nintendo_cbc_decrypt(struct skcipher_request *req)
> +{
> +       return nintendo_skcipher_crypt(req, AES_DIR_DECRYPT);
> +}
> +
> +static struct skcipher_alg nintendo_alg = {
> +       .base.cra_name          = "cbc(aes)",
> +       .base.cra_driver_name   = "cbc-aes-nintendo",
> +       .base.cra_priority      = 400,
> +       .base.cra_flags         = CRYPTO_ALG_KERN_DRIVER_ONLY,
> +       .base.cra_blocksize     = AES_BLOCK_SIZE,
> +       .base.cra_alignmask     = 15,
> +       .base.cra_module        = THIS_MODULE,
> +       .setkey                 = nintendo_setkey_skcipher,
> +       .encrypt                = nintendo_cbc_encrypt,
> +       .decrypt                = nintendo_cbc_decrypt,
> +       .min_keysize            = AES_KEYSIZE_128,
> +       .max_keysize            = AES_KEYSIZE_128,
> +       .ivsize                 = AES_BLOCK_SIZE,
> +};
> +
> +static int nintendo_aes_remove(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +
> +       crypto_unregister_skcipher(&nintendo_alg);
> +       devm_iounmap(dev, base);
> +       base = NULL;
> +
> +       return 0;
> +}
> +
> +static int nintendo_aes_probe(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct resource *res;
> +       int ret;
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       base = devm_ioremap_resource(dev, res);
> +       if (IS_ERR(base))
> +               return PTR_ERR(base);
> +
> +       spin_lock_init(&lock);
> +
> +       ret = crypto_register_skcipher(&nintendo_alg);
> +       if (ret)
> +               goto eiomap;
> +
> +       dev_notice(dev, "Nintendo Wii and Wii U AES engine enabled\n");
> +       return 0;
> +
> + eiomap:
> +       devm_iounmap(dev, base);
> +
> +       dev_err(dev, "Nintendo Wii and Wii U AES initialization failed\n");
> +       return ret;
> +}
> +
> +static const struct of_device_id nintendo_aes_of_match[] = {
> +       { .compatible = "nintendo,hollywood-aes", },
> +       { .compatible = "nintendo,latte-aes", },
> +       {/* sentinel */},
> +};
> +MODULE_DEVICE_TABLE(of, nintendo_aes_of_match);
> +
> +static struct platform_driver nintendo_aes_driver = {
> +       .driver = {
> +               .name = "nintendo-aes",
> +               .of_match_table = nintendo_aes_of_match,
> +       },
> +       .probe = nintendo_aes_probe,
> +       .remove = nintendo_aes_remove,
> +};
> +
> +module_platform_driver(nintendo_aes_driver);
> +
> +MODULE_AUTHOR("Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>");
> +MODULE_DESCRIPTION("Nintendo Wii and Wii U Hardware AES driver");
> +MODULE_LICENSE("GPL");
> --
> 2.33.0
>

* Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver
From: Emmanuel Gil Peyrot @ 2021-09-22 10:43 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Herbert Xu, Ash Logan, Emmanuel Gil Peyrot,
	Linux Kernel Mailing List, Rob Herring, Paul Mackerras,
	Linux Crypto Mailing List,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), David S. Miller,
	Jonathan Neuschäfer
In-Reply-To: <CAMj1kXF6RpaAsN2zUgkO0NW7gMwwhXMHEEM-wpQXxeNJbGJ79A@mail.gmail.com>

On Wed, Sep 22, 2021 at 12:10:41PM +0200, Ard Biesheuvel wrote:
> On Tue, 21 Sept 2021 at 23:49, Emmanuel Gil Peyrot
> <linkmauve@linkmauve.fr> wrote:
> >
> > This engine implements AES in CBC mode, using 128-bit keys only.  It is
> > present on both the Wii and the Wii U, and is apparently identical in
> > both consoles.
> >
> > The hardware is capable of firing an interrupt when the operation is
> > done, but this driver currently uses a busy loop, I’m not too sure
> > whether it would be preferable to switch, nor how to achieve that.
> >
> > It also supports a mode where no operation is done, and thus could be
> > used as a DMA copy engine, but I don’t know how to expose that to the
> > kernel or whether it would even be useful.
> >
> > In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> > aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> > speedup.
> >
> > This driver was written based on reversed documentation, see:
> > https://wiibrew.org/wiki/Hardware/AES
> >
> > Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
> > Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>  # on Wii U
> 
> This is redundant - everybody should test the code they submit.

Indeed, except for the comment, as I haven’t been able to test on the
Wii just yet and that’s kind of a call for doing exactly that. :)

> 
> ...
> > +       /* TODO: figure out how to use interrupts here, this will probably
> > +        * lower throughput but let the CPU do other things while the AES
> > +        * engine is doing its work. */
> 
> So is it worthwhile like this? How much faster is it to use this
> accelerator rather than the CPU?

As I mentioned above, on my hardware it reaches 80.7 MiB/s using this
busy loop instead of 30.9 MiB/s using aes-generic, measured using
`cryptsetup benchmark --cipher=aes --key-size=128`.  I expect the
difference would be even more pronounced on the Wii, with its CPU being
clocked lower.

I will give a try at using the interrupt, but I fully expect a lower
throughput alongside a lower CPU usage (for large requests).

> 
> > +       do {
> > +               status = ioread32be(base + AES_CTRL);
> > +               cpu_relax();
> > +       } while ((status & AES_CTRL_EXEC) && --counter);
> > +
> > +       /* Do we ever get called with dst ≠ src?  If so we have to invalidate
> > +        * dst in addition to the earlier flush of src. */
> > +       if (unlikely(dst != src)) {
> > +               for (i = 0; i < len; i += 32)
> > +                       __asm__("dcbi 0, %0" : : "r" (dst + i));
> > +               __asm__("sync" : : : "memory");
> > +       }
> > +
> > +       return counter ? 0 : 1;
> > +}
> > +
> > +static void
> > +nintendo_aes_crypt(const void *src, void *dst, u32 len, u8 *iv, int dir,
> > +                  bool firstchunk)
> > +{
> > +       u32 flags = 0;
> > +       unsigned long iflags;
> > +       int ret;
> > +
> > +       flags |= AES_CTRL_EXEC_INIT /* | AES_CTRL_IRQ */ | AES_CTRL_ENA;
> > +
> > +       if (dir == AES_DIR_DECRYPT)
> > +               flags |= AES_CTRL_DEC;
> > +
> > +       if (!firstchunk)
> > +               flags |= AES_CTRL_IV;
> > +
> > +       /* Start the critical section */
> > +       spin_lock_irqsave(&lock, iflags);
> > +
> > +       if (firstchunk)
> > +               writefield(AES_IV, iv);
> > +
> > +       ret = do_crypt(src, dst, len, flags);
> > +       BUG_ON(ret);
> > +
> > +       spin_unlock_irqrestore(&lock, iflags);
> > +}
> > +
> > +static int nintendo_setkey_skcipher(struct crypto_skcipher *tfm, const u8 *key,
> > +                                   unsigned int len)
> > +{
> > +       /* The hardware only supports AES-128 */
> > +       if (len != AES_KEYSIZE_128)
> > +               return -EINVAL;
> > +
> > +       writefield(AES_KEY, key);
> > +       return 0;
> > +}
> > +
> > +static int nintendo_skcipher_crypt(struct skcipher_request *req, int dir)
> > +{
> > +       struct skcipher_walk walk;
> > +       unsigned int nbytes;
> > +       int err;
> > +       char ivbuf[AES_BLOCK_SIZE];
> > +       unsigned int ivsize;
> > +
> > +       bool firstchunk = true;
> > +
> > +       /* Reset the engine */
> > +       iowrite32be(0, base + AES_CTRL);
> > +
> > +       err = skcipher_walk_virt(&walk, req, false);
> > +       ivsize = min(sizeof(ivbuf), walk.ivsize);
> > +
> > +       while ((nbytes = walk.nbytes) != 0) {
> > +               unsigned int chunkbytes = round_down(nbytes, AES_BLOCK_SIZE);
> > +               unsigned int ret = nbytes % AES_BLOCK_SIZE;
> > +
> > +               if (walk.total == chunkbytes && dir == AES_DIR_DECRYPT) {
> > +                       /* If this is the last chunk and we're decrypting, take
> > +                        * note of the IV (which is the last ciphertext block)
> > +                        */
> > +                       memcpy(ivbuf, walk.src.virt.addr + walk.total - ivsize,
> > +                              ivsize);
> > +               }
> > +
> > +               nintendo_aes_crypt(walk.src.virt.addr, walk.dst.virt.addr,
> > +                                  chunkbytes, walk.iv, dir, firstchunk);
> > +
> > +               if (walk.total == chunkbytes && dir == AES_DIR_ENCRYPT) {
> > +                       /* If this is the last chunk and we're encrypting, take
> > +                        * note of the IV (which is the last ciphertext block)
> > +                        */
> > +                       memcpy(walk.iv,
> > +                              walk.dst.virt.addr + walk.total - ivsize,
> > +                              ivsize);
> > +               } else if (walk.total == chunkbytes && dir == AES_DIR_DECRYPT) {
> > +                       memcpy(walk.iv, ivbuf, ivsize);
> > +               }
> > +
> > +               err = skcipher_walk_done(&walk, ret);
> > +               firstchunk = false;
> > +       }
> > +
> > +       return err;
> > +}
> > +
> > +static int nintendo_cbc_encrypt(struct skcipher_request *req)
> > +{
> > +       return nintendo_skcipher_crypt(req, AES_DIR_ENCRYPT);
> > +}
> > +
> > +static int nintendo_cbc_decrypt(struct skcipher_request *req)
> > +{
> > +       return nintendo_skcipher_crypt(req, AES_DIR_DECRYPT);
> > +}
> > +
> > +static struct skcipher_alg nintendo_alg = {
> > +       .base.cra_name          = "cbc(aes)",
> > +       .base.cra_driver_name   = "cbc-aes-nintendo",
> > +       .base.cra_priority      = 400,
> > +       .base.cra_flags         = CRYPTO_ALG_KERN_DRIVER_ONLY,
> > +       .base.cra_blocksize     = AES_BLOCK_SIZE,
> > +       .base.cra_alignmask     = 15,
> > +       .base.cra_module        = THIS_MODULE,
> > +       .setkey                 = nintendo_setkey_skcipher,
> > +       .encrypt                = nintendo_cbc_encrypt,
> > +       .decrypt                = nintendo_cbc_decrypt,
> > +       .min_keysize            = AES_KEYSIZE_128,
> > +       .max_keysize            = AES_KEYSIZE_128,
> > +       .ivsize                 = AES_BLOCK_SIZE,
> > +};
> > +
> > +static int nintendo_aes_remove(struct platform_device *pdev)
> > +{
> > +       struct device *dev = &pdev->dev;
> > +
> > +       crypto_unregister_skcipher(&nintendo_alg);
> > +       devm_iounmap(dev, base);
> > +       base = NULL;
> > +
> > +       return 0;
> > +}
> > +
> > +static int nintendo_aes_probe(struct platform_device *pdev)
> > +{
> > +       struct device *dev = &pdev->dev;
> > +       struct resource *res;
> > +       int ret;
> > +
> > +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +       base = devm_ioremap_resource(dev, res);
> > +       if (IS_ERR(base))
> > +               return PTR_ERR(base);
> > +
> > +       spin_lock_init(&lock);
> > +
> > +       ret = crypto_register_skcipher(&nintendo_alg);
> > +       if (ret)
> > +               goto eiomap;
> > +
> > +       dev_notice(dev, "Nintendo Wii and Wii U AES engine enabled\n");
> > +       return 0;
> > +
> > + eiomap:
> > +       devm_iounmap(dev, base);
> > +
> > +       dev_err(dev, "Nintendo Wii and Wii U AES initialization failed\n");
> > +       return ret;
> > +}
> > +
> > +static const struct of_device_id nintendo_aes_of_match[] = {
> > +       { .compatible = "nintendo,hollywood-aes", },
> > +       { .compatible = "nintendo,latte-aes", },
> > +       {/* sentinel */},
> > +};
> > +MODULE_DEVICE_TABLE(of, nintendo_aes_of_match);
> > +
> > +static struct platform_driver nintendo_aes_driver = {
> > +       .driver = {
> > +               .name = "nintendo-aes",
> > +               .of_match_table = nintendo_aes_of_match,
> > +       },
> > +       .probe = nintendo_aes_probe,
> > +       .remove = nintendo_aes_remove,
> > +};
> > +
> > +module_platform_driver(nintendo_aes_driver);
> > +
> > +MODULE_AUTHOR("Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>");
> > +MODULE_DESCRIPTION("Nintendo Wii and Wii U Hardware AES driver");
> > +MODULE_LICENSE("GPL");
> > --
> > 2.33.0
> >

-- 
Emmanuel Gil Peyrot

* Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver
From: Ard Biesheuvel @ 2021-09-22 10:55 UTC (permalink / raw)
  To: Emmanuel Gil Peyrot
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Herbert Xu, Ash Logan, Linux Kernel Mailing List, Rob Herring,
	Paul Mackerras, Linux Crypto Mailing List,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), David S. Miller,
	Jonathan Neuschäfer
In-Reply-To: <20210922104302.22pgaoy2vspranqj@luna>

On Wed, 22 Sept 2021 at 12:43, Emmanuel Gil Peyrot
<linkmauve@linkmauve.fr> wrote:
>
> On Wed, Sep 22, 2021 at 12:10:41PM +0200, Ard Biesheuvel wrote:
> > On Tue, 21 Sept 2021 at 23:49, Emmanuel Gil Peyrot
> > <linkmauve@linkmauve.fr> wrote:
> > >
> > > This engine implements AES in CBC mode, using 128-bit keys only.  It is
> > > present on both the Wii and the Wii U, and is apparently identical in
> > > both consoles.
> > >
> > > The hardware is capable of firing an interrupt when the operation is
> > > done, but this driver currently uses a busy loop, I’m not too sure
> > > whether it would be preferable to switch, nor how to achieve that.
> > >
> > > It also supports a mode where no operation is done, and thus could be
> > > used as a DMA copy engine, but I don’t know how to expose that to the
> > > kernel or whether it would even be useful.
> > >
> > > In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> > > aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> > > speedup.
> > >
> > > This driver was written based on reversed documentation, see:
> > > https://wiibrew.org/wiki/Hardware/AES
> > >
> > > Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
> > > Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>  # on Wii U
> >
> > This is redundant - everybody should test the code they submit.
>
> Indeed, except for the comment, as I haven’t been able to test on the
> Wii just yet and that’s kind of a call for doing exactly that. :)
>
> >
> > ...
> > > +       /* TODO: figure out how to use interrupts here, this will probably
> > > +        * lower throughput but let the CPU do other things while the AES
> > > +        * engine is doing its work. */
> >
> > So is it worthwhile like this? How much faster is it to use this
> > accelerator rather than the CPU?
>
> As I mentioned above, on my hardware it reaches 80.7 MiB/s using this
> busy loop instead of 30.9 MiB/s using aes-generic, measured using
> `cryptsetup benchmark --cipher=aes --key-size=128`.  I expect the
> difference would be even more pronounced on the Wii, with its CPU being
> clocked lower.
>

Ah apologies for not spotting that. This is a nice speedup.

> I will give a try at using the interrupt, but I fully expect a lower
> throughput alongside a lower CPU usage (for large requests).
>

You should consider latency as well. Is it really necessary to disable
interrupts as well? A scheduling blackout of ~1ms (for the worst case
of 64k of input @ 80 MB/s) may be tolerable but keeping interrupts
disabled for that long is probably not a great idea. (Just make sure
you use spin_lock_bh() to prevent deadlocks that could occur if your
code is called from softirq context)
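
The ~1 ms figure can be checked with simple arithmetic. What follows is
only a minimal sketch: the helper name is hypothetical and not part of the
driver, and the inputs (a 64 KiB request, roughly 80 MiB/s) are the numbers
quoted in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the driver): estimate how long the
 * busy loop keeps the CPU occupied for one request, in microseconds.
 *
 * bytes:      request size in bytes
 * mib_per_s:  measured engine throughput in MiB/s (must be non-zero)
 */
static uint64_t busy_wait_us(uint64_t bytes, uint64_t mib_per_s)
{
	uint64_t bytes_per_s = mib_per_s * 1024 * 1024;

	/* Round up so that a partial microsecond still counts. */
	return (bytes * 1000000 + bytes_per_s - 1) / bytes_per_s;
}
```

With bytes = 64 * 1024 and mib_per_s = 80 this comes out just under 800 us,
which is in line with the rough 1 ms worst case mentioned above.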

But using the interrupt is obviously preferred. What's wrong with it?

Btw the crypto API does not permit AES-128 only - you will need to add
a fallback for other key sizes as well.

* Re: [PATCH] powerpc/code-patching: Return error on patch_branch() out-of-range failure
From: Naveen N. Rao @ 2021-09-22 12:06 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Christophe Leroy, Michael Ellerman,
	Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <4940b03de220d1dfe2c6b47a41e60925497ce125.1630657331.git.christophe.leroy@csgroup.eu>

Christophe Leroy wrote:
> Do not silently ignore a failure of create_branch() in
> patch_branch(). Return -ERANGE.
> 
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  arch/powerpc/lib/code-patching.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>


> 
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index f9a3019e37b4..0bc9cc0416b8 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -202,7 +202,9 @@ int patch_branch(u32 *addr, unsigned long target, int flags)
>  {
>  	struct ppc_inst instr;
>  
> -	create_branch(&instr, addr, target, flags);
> +	if (create_branch(&instr, addr, target, flags))
> +		return -ERANGE;
> +
>  	return patch_instruction(addr, instr);
>  }
>  
> -- 
> 2.25.0
> 
> 
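
For readers unfamiliar with why create_branch() can fail at all: a relative
PowerPC branch only encodes a signed 26-bit, word-aligned displacement. The
following is a simplified userspace model of the fixed logic, not the kernel
code itself; all function names are hypothetical, and only plain relative
branches are modelled (no link or absolute flags).

```c
#include <errno.h>
#include <stdint.h>

/* Simplified userspace model; names are hypothetical, not the kernel's. */

/* An I-form branch carries a signed 26-bit, word-aligned displacement. */
static int offset_in_branch_range(long offset)
{
	return offset >= -0x2000000 && offset <= 0x1fffffc && !(offset & 0x3);
}

/* Encode a relative "b" instruction; returns non-zero if out of range. */
static int model_create_branch(uint32_t *instr, unsigned long addr,
			       unsigned long target)
{
	long offset = (long)(target - addr);

	*instr = 0;
	if (!offset_in_branch_range(offset))
		return 1;

	*instr = 0x48000000 | ((uint32_t)offset & 0x03fffffc);
	return 0;
}

/* Mirror of the fixed logic: report the failure instead of patching. */
static int model_patch_branch(uint32_t *slot, unsigned long addr,
			      unsigned long target)
{
	uint32_t instr;

	if (model_create_branch(&instr, addr, target))
		return -ERANGE;

	*slot = instr;	/* stands in for patch_instruction() */
	return 0;
}
```

In this model, a target 32 MiB or more ahead of addr leaves the slot
untouched and returns -ERANGE, which is the behaviour the patch introduces.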

* [RESEND PATCH 2/2] powerpc/powermac: constify device_node in of_irq_parse_oldworld()
From: Krzysztof Kozlowski @ 2021-09-22  8:44 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rob Herring, Frank Rowand, Krzysztof Kozlowski, linuxppc-dev,
	linux-kernel, devicetree
In-Reply-To: <20210922084415.18269-1-krzysztof.kozlowski@canonical.com>

of_irq_parse_oldworld() does not modify the passed device_node, so make
it a pointer to const for safety.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
---
 arch/powerpc/platforms/powermac/pic.c | 2 +-
 include/linux/of_irq.h                | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/pic.c b/arch/powerpc/platforms/powermac/pic.c
index 4921bccf0376..af5ca1f41bb1 100644
--- a/arch/powerpc/platforms/powermac/pic.c
+++ b/arch/powerpc/platforms/powermac/pic.c
@@ -384,7 +384,7 @@ static void __init pmac_pic_probe_oldstyle(void)
 #endif
 }
 
-int of_irq_parse_oldworld(struct device_node *device, int index,
+int of_irq_parse_oldworld(const struct device_node *device, int index,
 			struct of_phandle_args *out_irq)
 {
 	const u32 *ints = NULL;
diff --git a/include/linux/of_irq.h b/include/linux/of_irq.h
index aaf219bd0354..6074fdf51f0c 100644
--- a/include/linux/of_irq.h
+++ b/include/linux/of_irq.h
@@ -20,12 +20,12 @@ typedef int (*of_irq_init_cb_t)(struct device_node *, struct device_node *);
 #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
 extern unsigned int of_irq_workarounds;
 extern struct device_node *of_irq_dflt_pic;
-extern int of_irq_parse_oldworld(struct device_node *device, int index,
+extern int of_irq_parse_oldworld(const struct device_node *device, int index,
 			       struct of_phandle_args *out_irq);
 #else /* CONFIG_PPC32 && CONFIG_PPC_PMAC */
 #define of_irq_workarounds (0)
 #define of_irq_dflt_pic (NULL)
-static inline int of_irq_parse_oldworld(struct device_node *device, int index,
+static inline int of_irq_parse_oldworld(const struct device_node *device, int index,
 				      struct of_phandle_args *out_irq)
 {
 	return -EINVAL;
-- 
2.30.2


* [RESEND PATCH 1/2] powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
From: Krzysztof Kozlowski @ 2021-09-22  8:44 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rob Herring, Frank Rowand, Krzysztof Kozlowski, linuxppc-dev,
	linux-kernel, devicetree

g5_phy_disable_cpu1() is used outside of platforms/powermac/feature.c,
so it should have a declaration to fix W=1 warning:

  arch/powerpc/platforms/powermac/feature.c:1533:6:
    error: no previous prototype for ‘g5_phy_disable_cpu1’ [-Werror=missing-prototypes]

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
---
 arch/powerpc/include/asm/pmac_feature.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/include/asm/pmac_feature.h b/arch/powerpc/include/asm/pmac_feature.h
index e08e829261b6..7703e5bf1203 100644
--- a/arch/powerpc/include/asm/pmac_feature.h
+++ b/arch/powerpc/include/asm/pmac_feature.h
@@ -143,6 +143,10 @@
  */
 struct device_node;
 
+#ifdef CONFIG_PPC64
+void g5_phy_disable_cpu1(void);
+#endif /* CONFIG_PPC64 */
+
 static inline long pmac_call_feature(int selector, struct device_node* node,
 					long param, long value)
 {
-- 
2.30.2


* [PATCH] powerpc/breakpoint: Cleanup
From: Christophe Leroy @ 2021-09-22 13:37 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

cache_op_size() does exactly the same as l1_dcache_bytes().

Remove it.

MSR_64BIT already exists, so there is no need to enclose the check
in #ifdef __powerpc64__.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/hw_breakpoint_constraints.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint_constraints.c b/arch/powerpc/kernel/hw_breakpoint_constraints.c
index 675d1f66ab72..42b967e3d85c 100644
--- a/arch/powerpc/kernel/hw_breakpoint_constraints.c
+++ b/arch/powerpc/kernel/hw_breakpoint_constraints.c
@@ -127,15 +127,6 @@ bool wp_check_constraints(struct pt_regs *regs, struct ppc_inst instr,
 	return false;
 }
 
-static int cache_op_size(void)
-{
-#ifdef __powerpc64__
-	return ppc64_caches.l1d.block_size;
-#else
-	return L1_CACHE_BYTES;
-#endif
-}
-
 void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
 			 int *type, int *size, unsigned long *ea)
 {
@@ -147,14 +138,14 @@ void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
 	analyse_instr(&op, regs, *instr);
 	*type = GETTYPE(op.type);
 	*ea = op.ea;
-#ifdef __powerpc64__
+
 	if (!(regs->msr & MSR_64BIT))
 		*ea &= 0xffffffffUL;
-#endif
+
 
 	*size = GETSIZE(op.type);
 	if (*type == CACHEOP) {
-		*size = cache_op_size();
+		*size = l1_dcache_bytes();
 		*ea &= ~(*size - 1);
 	} else if (*type == LOAD_VMX || *type == STORE_VMX) {
 		*ea &= ~(*size - 1);
-- 
2.31.1
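
As an illustration of the two address adjustments that wp_get_instr_detail()
performs, here is a small userspace sketch. The function names are
hypothetical, and the block size is passed in rather than read from
l1_dcache_bytes(); it is assumed to be a power of two.

```c
#include <stdint.h>

/*
 * Userspace illustration; function names are hypothetical. block_size
 * must be a power of two, as cache block sizes are.
 */

/* Round an effective address down to the start of its cache block. */
static uint64_t align_to_cache_block(uint64_t ea, uint64_t block_size)
{
	return ea & ~(block_size - 1);
}

/* In 32-bit mode only the low 32 bits of the address are meaningful. */
static uint64_t truncate_ea_32bit(uint64_t ea, int is_64bit_mode)
{
	return is_64bit_mode ? ea : (ea & 0xffffffffULL);
}
```

The second helper corresponds to the MSR_64BIT check: when the bit is clear,
the upper half of the effective address is discarded.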


* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Tom Lendacky @ 2021-09-22 13:40 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tianyu Lan, Borislav Petkov, Andy Lutomirski, Thomas Gleixner,
	kexec, linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <20210921215830.vqxd75r4eyau6cxy@box.shutemov.name>

On 9/21/21 4:58 PM, Kirill A. Shutemov wrote:
> On Tue, Sep 21, 2021 at 04:43:59PM -0500, Tom Lendacky wrote:
>> On 9/21/21 4:34 PM, Kirill A. Shutemov wrote:
>>> On Tue, Sep 21, 2021 at 11:27:17PM +0200, Borislav Petkov wrote:
>>>> On Wed, Sep 22, 2021 at 12:20:59AM +0300, Kirill A. Shutemov wrote:
>>>>> I still believe calling cc_platform_has() from __startup_64() is totally
>>>>> broken as it lacks proper wrapping while accessing global variables.
>>>>
>>>> Well, one of the issues on the AMD side was using boot_cpu_data too
>>>> early and the Intel side uses it too. Can you replace those checks with
>>>> is_tdx_guest() or whatever was the helper's name which would check
>>>> whether the kernel is running as a TDX guest, and see if that helps?
>>>
>>> There's no need for the Intel check this early. Only AMD needs it. Maybe
>>> just open-code them?
>>
>> Any way you can put a gzipped/bzipped copy of your vmlinux file somewhere I
>> can grab it from and take a look at it?
> 
> You can find broken vmlinux and bzImage here:
> 
> https://drive.google.com/drive/folders/1n74vUQHOGebnF70Im32qLFY8iS3wvjIs?usp=sharing
> 
> Let me know when I can remove it.

Looking at everything, it is all RIP relative addressing, so those
accesses should be fine. Your image has the intel_cc_platform_has()
function, does it work if you remove that call? Because I think it may be
the early call into that function which looks like it has instrumentation
that uses %gs in __sanitizer_cov_trace_pc and %gs is not setup properly
yet. And since boot_cpu_data.x86_vendor will likely be zero this early it
will match X86_VENDOR_INTEL and call into that function.
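
To illustrate the last point: X86_VENDOR_INTEL is 0 in the kernel, so a
vendor field that has not been filled in yet compares equal to it. The
sketch below is a userspace model under that assumption; the struct and
the dispatch function are hypothetical stand-ins, not the real
boot_cpu_data or cc_platform_has().

```c
/*
 * Userspace model; the struct and function are hypothetical stand-ins.
 * The vendor constants mirror the kernel's values, where Intel is 0.
 */
#define X86_VENDOR_INTEL	0
#define X86_VENDOR_AMD		2

struct cpuinfo_model {
	unsigned char x86_vendor;
};

/* Static storage is zero-initialised, like CPU data before setup runs. */
static struct cpuinfo_model early_cpu_data;

/* Returns 1 if dispatch would take the Intel path, 2 for AMD, else 0. */
static int vendor_dispatch(const struct cpuinfo_model *c)
{
	switch (c->x86_vendor) {
	case X86_VENDOR_INTEL:
		return 1;
	case X86_VENDOR_AMD:
		return 2;
	default:
		return 0;
	}
}
```

Before the vendor has been detected, vendor_dispatch(&early_cpu_data) takes
the Intel path, regardless of the actual CPU.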

ffffffff8124f880 <intel_cc_platform_has>:
ffffffff8124f880:       e8 bb 64 06 00          callq  ffffffff812b5d40 <__fentry__>
ffffffff8124f885:       e8 36 ca 42 00          callq  ffffffff8167c2c0 <__sanitizer_cov_trace_pc>
ffffffff8124f88a:       31 c0                   xor    %eax,%eax
ffffffff8124f88c:       c3                      retq


ffffffff8167c2c0 <__sanitizer_cov_trace_pc>:
ffffffff8167c2c0:       65 8b 05 39 ad 9a 7e    mov    %gs:0x7e9aad39(%rip),%eax        # 27000 <__preempt_count>
ffffffff8167c2c7:       89 c6                   mov    %eax,%esi
ffffffff8167c2c9:       48 8b 0c 24             mov    (%rsp),%rcx
ffffffff8167c2cd:       81 e6 00 01 00 00       and    $0x100,%esi
ffffffff8167c2d3:       65 48 8b 14 25 40 70    mov    %gs:0x27040,%rdx

Thanks,
Tom

> 

* Re: [RESEND PATCH 1/2] powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
From: Christophe Leroy @ 2021-09-22 13:52 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Rob Herring, Frank Rowand, linuxppc-dev,
	linux-kernel, devicetree
In-Reply-To: <20210922084415.18269-1-krzysztof.kozlowski@canonical.com>



On 22/09/2021 at 10:44, Krzysztof Kozlowski wrote:
> g5_phy_disable_cpu1() is used outside of platforms/powermac/feature.c,
> so it should have a declaration to fix W=1 warning:
> 
>    arch/powerpc/platforms/powermac/feature.c:1533:6:
>      error: no previous prototype for ‘g5_phy_disable_cpu1’ [-Werror=missing-prototypes]


While you are at it, can you clean it up completely, that is, remove the
declaration in arch/powerpc/platforms/powermac/smp.c?


> 
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
> ---
>   arch/powerpc/include/asm/pmac_feature.h | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/pmac_feature.h b/arch/powerpc/include/asm/pmac_feature.h
> index e08e829261b6..7703e5bf1203 100644
> --- a/arch/powerpc/include/asm/pmac_feature.h
> +++ b/arch/powerpc/include/asm/pmac_feature.h
> @@ -143,6 +143,10 @@
>    */
>   struct device_node;
>   
> +#ifdef CONFIG_PPC64
> +void g5_phy_disable_cpu1(void);
> +#endif /* CONFIG_PPC64 */
> +
>   static inline long pmac_call_feature(int selector, struct device_node* node,
>   					long param, long value)
>   {
> 

^ permalink raw reply

* Re: [RESEND PATCH 2/2] powerpc/powermac: constify device_node in of_irq_parse_oldworld()
From: Christophe Leroy @ 2021-09-22 13:55 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Rob Herring, Frank Rowand, linuxppc-dev,
	linux-kernel, devicetree
In-Reply-To: <20210922084415.18269-2-krzysztof.kozlowski@canonical.com>



On 22/09/2021 at 10:44, Krzysztof Kozlowski wrote:
> The of_irq_parse_oldworld() does not modify passed device_node so make
> it a pointer to const for safety.

AFAICS this patch is unrelated to the previous one, so you should send them
out separately instead of as a series.

> 
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
> ---
>   arch/powerpc/platforms/powermac/pic.c | 2 +-
>   include/linux/of_irq.h                | 4 ++--
>   2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powermac/pic.c b/arch/powerpc/platforms/powermac/pic.c
> index 4921bccf0376..af5ca1f41bb1 100644
> --- a/arch/powerpc/platforms/powermac/pic.c
> +++ b/arch/powerpc/platforms/powermac/pic.c
> @@ -384,7 +384,7 @@ static void __init pmac_pic_probe_oldstyle(void)
>   #endif
>   }
>   
> -int of_irq_parse_oldworld(struct device_node *device, int index,
> +int of_irq_parse_oldworld(const struct device_node *device, int index,
>   			struct of_phandle_args *out_irq)
>   {
>   	const u32 *ints = NULL;
> diff --git a/include/linux/of_irq.h b/include/linux/of_irq.h
> index aaf219bd0354..6074fdf51f0c 100644
> --- a/include/linux/of_irq.h
> +++ b/include/linux/of_irq.h
> @@ -20,12 +20,12 @@ typedef int (*of_irq_init_cb_t)(struct device_node *, struct device_node *);
>   #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
>   extern unsigned int of_irq_workarounds;
>   extern struct device_node *of_irq_dflt_pic;
> -extern int of_irq_parse_oldworld(struct device_node *device, int index,
> +extern int of_irq_parse_oldworld(const struct device_node *device, int index,
>   			       struct of_phandle_args *out_irq);

Please remove 'extern' which is useless for prototypes.

>   #else /* CONFIG_PPC32 && CONFIG_PPC_PMAC */
>   #define of_irq_workarounds (0)
>   #define of_irq_dflt_pic (NULL)
> -static inline int of_irq_parse_oldworld(struct device_node *device, int index,
> +static inline int of_irq_parse_oldworld(const struct device_node *device, int index,
>   				      struct of_phandle_args *out_irq)
>   {
>   	return -EINVAL;
> 
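
A small sketch of both review points, using hypothetical demo_* names rather
than the real OF API. It assumes nothing beyond standard C: a function
declaration has external linkage by default, so 'extern' adds nothing, and a
pointer-to-const parameter lets the compiler reject accidental writes to the
node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct device_node. */
struct demo_node {
	const char *name;
	int nr_irqs;
};

/* These two declarations are equivalent: function declarations have
 * external linkage by default, so 'extern' is redundant here. */
extern int demo_irq_count(const struct demo_node *node);
int demo_irq_count(const struct demo_node *node);

/* The const qualifier documents that the node is only read; writing
 * 'node->nr_irqs = 0;' in here would fail to compile. */
int demo_irq_count(const struct demo_node *node)
{
	assert(node != NULL);
	return node->nr_irqs;
}
```

Callers holding a const pointer can now pass it without a cast, which is the
practical benefit of constifying the parameter.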

^ permalink raw reply

* Re: [RESEND PATCH 1/2] powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
From: Krzysztof Kozlowski @ 2021-09-22 14:10 UTC (permalink / raw)
  To: Christophe Leroy, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Rob Herring, Frank Rowand, linuxppc-dev,
	linux-kernel, devicetree
In-Reply-To: <ee9fc44e-daab-10e6-f293-fb45b43ff5b1@csgroup.eu>

On 22/09/2021 15:52, Christophe Leroy wrote:
> 
> 
> On 22/09/2021 at 10:44, Krzysztof Kozlowski wrote:
>> g5_phy_disable_cpu1() is used outside of platforms/powermac/feature.c,
>> so it should have a declaration to fix W=1 warning:
>>
>>    arch/powerpc/platforms/powermac/feature.c:1533:6:
>>      error: no previous prototype for ‘g5_phy_disable_cpu1’ [-Werror=missing-prototypes]
> 
> 
> While you are at it, can you clean it up completely, that is, remove the
> declaration in arch/powerpc/platforms/powermac/smp.c?
> 

Sure, I'll send a v2. Thanks for pointing this out.


Best regards,
Krzysztof

^ permalink raw reply

* Re: [RESEND PATCH 2/2] powerpc/powermac: constify device_node in of_irq_parse_oldworld()
From: Krzysztof Kozlowski @ 2021-09-22 14:12 UTC (permalink / raw)
  To: Christophe Leroy, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Rob Herring, Frank Rowand, linuxppc-dev,
	linux-kernel, devicetree
In-Reply-To: <a33f0978-b617-6a07-7240-ec011f894680@csgroup.eu>

On 22/09/2021 15:55, Christophe Leroy wrote:
> 
> 
> On 22/09/2021 at 10:44, Krzysztof Kozlowski wrote:
>> The of_irq_parse_oldworld() does not modify passed device_node so make
>> it a pointer to const for safety.
> 
> AFAICS this patch is unrelated to the previous one, so you should send them
> out separately instead of as a series.

The relation is that it's a series of bugfixes. Although they can be applied
independently, having a series is actually very useful: you run "b4 am"
on one message ID and get everything. The same goes for patchwork, if you
use that one.

> 
>>
>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
>> ---
>>   arch/powerpc/platforms/powermac/pic.c | 2 +-
>>   include/linux/of_irq.h                | 4 ++--
>>   2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/powermac/pic.c b/arch/powerpc/platforms/powermac/pic.c
>> index 4921bccf0376..af5ca1f41bb1 100644
>> --- a/arch/powerpc/platforms/powermac/pic.c
>> +++ b/arch/powerpc/platforms/powermac/pic.c
>> @@ -384,7 +384,7 @@ static void __init pmac_pic_probe_oldstyle(void)
>>   #endif
>>   }
>>   
>> -int of_irq_parse_oldworld(struct device_node *device, int index,
>> +int of_irq_parse_oldworld(const struct device_node *device, int index,
>>   			struct of_phandle_args *out_irq)
>>   {
>>   	const u32 *ints = NULL;
>> diff --git a/include/linux/of_irq.h b/include/linux/of_irq.h
>> index aaf219bd0354..6074fdf51f0c 100644
>> --- a/include/linux/of_irq.h
>> +++ b/include/linux/of_irq.h
>> @@ -20,12 +20,12 @@ typedef int (*of_irq_init_cb_t)(struct device_node *, struct device_node *);
>>   #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
>>   extern unsigned int of_irq_workarounds;
>>   extern struct device_node *of_irq_dflt_pic;
>> -extern int of_irq_parse_oldworld(struct device_node *device, int index,
>> +extern int of_irq_parse_oldworld(const struct device_node *device, int index,
>>   			       struct of_phandle_args *out_irq);
> 
> Please remove 'extern' which is useless for prototypes.

OK


Best regards,
Krzysztof

^ permalink raw reply

* Re: [PATCH 0/5] KVM: rseq: Fix and a test for a KVM+rseq bug
From: Paolo Bonzini @ 2021-09-22 14:12 UTC (permalink / raw)
  To: Sean Christopherson, Russell King, Catalin Marinas, Will Deacon,
	Guo Ren, Thomas Bogendoerfer, Michael Ellerman, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Oleg Nesterov,
	Steven Rostedt, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
	Andy Lutomirski, Mathieu Desnoyers, Paul E. McKenney, Boqun Feng,
	Shuah Khan
  Cc: linux-s390, kvm, linux-kernel, linux-csky, linux-mips,
	Peter Foley, Paul Mackerras, linux-kselftest, Ben Gardon,
	Shakeel Butt, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20210818001210.4073390-1-seanjc@google.com>

On 18/08/21 02:12, Sean Christopherson wrote:
> Patch 1 fixes a KVM+rseq bug where KVM's handling of TIF_NOTIFY_RESUME,
> e.g. for task migration, clears the flag without informing rseq and leads
> to stale data in userspace's rseq struct.
> 
> Patch 2 is a cleanup to try and make future bugs less likely.  It's also
> a baby step towards moving and renaming tracehook_notify_resume() since
> it has nothing to do with tracing.  It kills me to not do the move/rename
> as part of this series, but having a dedicated series/discussion seems
> more appropriate given the sheer number of architectures that call
> tracehook_notify_resume() and the lack of an obvious home for the code.
> 
> Patch 3 is a fix/cleanup to stop overriding x86's unistd_{32,64}.h when
> the include path (intentionally) omits tools' uapi headers.  KVM's
> selftests do exactly that so that they can pick up the uapi headers from
> the installed kernel headers, and still use various tools/ headers that
> mirror kernel code, e.g. linux/types.h.  This allows the new test in
> patch 4 to reference __NR_rseq without having to manually define it.
> 
> Patch 4 is a regression test for the KVM+rseq bug.
> 
> Patch 5 is a cleanup made possible by patch 3.
> 
> 
> Sean Christopherson (5):
>    KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM
>      guest
>    entry: rseq: Call rseq_handle_notify_resume() in
>      tracehook_notify_resume()
>    tools: Move x86 syscall number fallbacks to .../uapi/
>    KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration
>      bugs
>    KVM: selftests: Remove __NR_userfaultfd syscall fallback
> 
>   arch/arm/kernel/signal.c                      |   1 -
>   arch/arm64/kernel/signal.c                    |   1 -
>   arch/csky/kernel/signal.c                     |   4 +-
>   arch/mips/kernel/signal.c                     |   4 +-
>   arch/powerpc/kernel/signal.c                  |   4 +-
>   arch/s390/kernel/signal.c                     |   1 -
>   include/linux/tracehook.h                     |   2 +
>   kernel/entry/common.c                         |   4 +-
>   kernel/rseq.c                                 |   4 +-
>   .../x86/include/{ => uapi}/asm/unistd_32.h    |   0
>   .../x86/include/{ => uapi}/asm/unistd_64.h    |   3 -
>   tools/testing/selftests/kvm/.gitignore        |   1 +
>   tools/testing/selftests/kvm/Makefile          |   3 +
>   tools/testing/selftests/kvm/rseq_test.c       | 131 ++++++++++++++++++
>   14 files changed, 143 insertions(+), 20 deletions(-)
>   rename tools/arch/x86/include/{ => uapi}/asm/unistd_32.h (100%)
>   rename tools/arch/x86/include/{ => uapi}/asm/unistd_64.h (83%)
>   create mode 100644 tools/testing/selftests/kvm/rseq_test.c
> 

Queued v3, thanks.  I'll send it in a separate pull request to Linus 
since it touches stuff outside my usual turf.

Thanks,

Paolo


^ permalink raw reply

* Re: [PATCH v2 01/16] ASoC: eureka-tlv320: Update to modern clocking terminology
From: Mark Brown @ 2021-09-22 14:21 UTC (permalink / raw)
  To: Fabio Estevam, Liam Girdwood, Mark Brown, Shengjiu Wang,
	Nicolin Chen, Xiubo Li
  Cc: alsa-devel, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20210921213542.31688-1-broonie@kernel.org>

On Tue, 21 Sep 2021 22:35:27 +0100, Mark Brown wrote:
> As part of moving to remove the old style defines for the bus clocks update
> the eureka-tlv320 driver to use more modern terminology for clocking.
> 
> 

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[01/16] ASoC: eureka-tlv320: Update to modern clocking terminology
        commit: 4348be6330a18b123fa82494df9f5a134feecb7f
[02/16] ASoC: fsl-asoc-card: Update to modern clocking terminology
        commit: 8fcfd3493426c229f4f28bc5757dd3359e02cee8
[03/16] ASoC: fsl-audmix: Update to modern clocking terminology
        commit: 2757b340b25dd2cb3afc748d48c1dff6c9689f80
[04/16] ASoC: fsl-esai: Update to modern clocking terminology
        commit: e0b64fa34c7f444908549c32dd68f81ac436299e
[05/16] ASoC: fsl-mqs: Update to modern clocking terminology
        commit: a51da9dc9b3a844460a355cd10d0db4320f4d726
[06/16] ASoC: fsl_sai: Update to modern clocking terminology
        commit: 361284a4eb598eaf28e8458c542f214d3689b134
[07/16] ASoC: fsl_ssi: Update to modern clocking terminology
        commit: 89efbdaaa444d63346bf1bdf3b58dfb421de91f1
[08/16] ASoC: imx-audmix: Update to modern clocking terminology
        commit: bf101022487091032fd8102c835b1157b8283c43
[09/16] ASoC: imx-card: Update to modern clocking terminology
        commit: d689e280121abf1cdf0d37734b0b306098a774ed
[10/16] ASoC: imx-es8328: Update to modern clocking terminology
        commit: 56b69e4e4bc24c732b68ff6df54be83226a3b4e6
[11/16] ASoC: imx-hdmi: Update to modern clocking terminology
        commit: a90f847ad2f1c8575f6a7980e5ee9937d1a5eeb4
[12/16] ASoC: imx-rpmsg: Update to modern clocking terminology
        commit: caa0a6075a6e9239e49690a40a131496398602ab
[13/16] ASoC: imx-sgtl5000: Update to modern clocking terminology
        commit: 419099b4c3318a3c486f9f65b015760e71d53f0a
[14/16] ASoC: mpc8610_hpcd: Update to modern clocking terminology
        commit: 8a7f299b857b81a10566fe19c585fae4d1c1f8ef
[15/16] ASoC: pl1022_ds: Update to modern clocking terminology
        commit: fcd444bf6a29a22e529510de07c72555b7e46224
[16/16] ASoC: pl1022_rdk: Update to modern clocking terminology
        commit: 39e178a4cc7d042cd6353e73f3024d87e79a86ca

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

^ permalink raw reply

* Re: [PATCH 01/16] ASoC: eureka-tlv320: Update to modern clocking terminology
From: Mark Brown @ 2021-09-22 14:21 UTC (permalink / raw)
  To: Fabio Estevam, Liam Girdwood, Mark Brown, Shengjiu Wang,
	Nicolin Chen, Xiubo Li
  Cc: alsa-devel, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20210921211040.11624-1-broonie@kernel.org>

On Tue, 21 Sep 2021 22:10:25 +0100, Mark Brown wrote:
> As part of moving to remove the old style defines for the bus clocks update
> the eureka-tlv320 driver to use more modern terminology for clocking.
> 
> 

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[01/16] ASoC: eureka-tlv320: Update to modern clocking terminology
        commit: 4348be6330a18b123fa82494df9f5a134feecb7f
[02/16] ASoC: fsl-asoc-card: Update to modern clocking terminology
        commit: 8fcfd3493426c229f4f28bc5757dd3359e02cee8
[03/16] ASoC: fsl-audmix: Update to modern clocking terminology
        commit: 2757b340b25dd2cb3afc748d48c1dff6c9689f80
[04/16] ASoC: fsl-esai: Update to modern clocking terminology
        commit: e0b64fa34c7f444908549c32dd68f81ac436299e
[05/16] ASoC: fsl-mqs: Update to modern clocking terminology
        commit: a51da9dc9b3a844460a355cd10d0db4320f4d726
[06/16] ASoC: fsl_sai: Update to modern clocking terminology
        commit: 361284a4eb598eaf28e8458c542f214d3689b134
[07/16] ASoC: fsl_ssi: Update to modern clocking terminology
        commit: 89efbdaaa444d63346bf1bdf3b58dfb421de91f1
[08/16] ASoC: imx-audmix: Update to modern clocking terminology
        commit: bf101022487091032fd8102c835b1157b8283c43
[09/16] ASoC: imx-card: Update to modern clocking terminology
        commit: d689e280121abf1cdf0d37734b0b306098a774ed
[10/16] ASoC: imx-es8328: Update to modern clocking terminology
        commit: 56b69e4e4bc24c732b68ff6df54be83226a3b4e6
[11/16] ASoC: imx-hdmi: Update to modern clocking terminology
        commit: a90f847ad2f1c8575f6a7980e5ee9937d1a5eeb4
[12/16] ASoC: imx-rpmsg: Update to modern clocking terminology
        commit: caa0a6075a6e9239e49690a40a131496398602ab
[13/16] ASoC: imx-sgtl5000: Update to modern clocking terminology
        commit: 419099b4c3318a3c486f9f65b015760e71d53f0a
[14/16] ASoC: mpc8610_hpcd: Update to modern clocking terminology
        commit: 8a7f299b857b81a10566fe19c585fae4d1c1f8ef
[15/16] ASoC: pl1022_ds: Update to modern clocking terminology
        commit: fcd444bf6a29a22e529510de07c72555b7e46224
[16/16] ASoC: pl1022_rdk: Update to modern clocking terminology
        commit: 39e178a4cc7d042cd6353e73f3024d87e79a86ca

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

^ permalink raw reply

* Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
From: Kirill A. Shutemov @ 2021-09-22 14:30 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	Peter Zijlstra, Dave Hansen, dri-devel, platform-driver-x86,
	Will Deacon, linux-s390, Andi Kleen, Joerg Roedel, x86, amd-gfx,
	Christoph Hellwig, Ingo Molnar, linux-graphics-maintainer,
	Tianyu Lan, Borislav Petkov, Andy Lutomirski, Thomas Gleixner,
	kexec, linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <01891f59-7ec3-cf62-a8fc-79f79ca76587@amd.com>

On Wed, Sep 22, 2021 at 08:40:43AM -0500, Tom Lendacky wrote:
> On 9/21/21 4:58 PM, Kirill A. Shutemov wrote:
> > On Tue, Sep 21, 2021 at 04:43:59PM -0500, Tom Lendacky wrote:
> > > On 9/21/21 4:34 PM, Kirill A. Shutemov wrote:
> > > > On Tue, Sep 21, 2021 at 11:27:17PM +0200, Borislav Petkov wrote:
> > > > > On Wed, Sep 22, 2021 at 12:20:59AM +0300, Kirill A. Shutemov wrote:
> > > > > > I still believe calling cc_platform_has() from __startup_64() is totally
> > > > > > broken as it lacks proper wrapping while accessing global variables.
> > > > > 
> > > > > Well, one of the issues on the AMD side was using boot_cpu_data too
> > > > > early and the Intel side uses it too. Can you replace those checks with
> > > > > is_tdx_guest() or whatever was the helper's name which would check
> > > > > > whether the kernel is running as a TDX guest, and see if that helps?
> > > > 
> > > > > There's no need for the Intel check this early. Only AMD needs it. Maybe
> > > > > just open-code them?
> > > 
> > > Any way you can put a gzipped/bzipped copy of your vmlinux file somewhere I
> > > can grab it from and take a look at it?
> > 
> > You can find broken vmlinux and bzImage here:
> > 
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdrive.google.com%2Fdrive%2Ffolders%2F1n74vUQHOGebnF70Im32qLFY8iS3wvjIs%3Fusp%3Dsharing&data=04%7C01%7Cthomas.lendacky%40amd.com%7C1c7adf380cbe4c1a6bb708d97d4af6ff%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637678583935705530%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=gA30x%2Bfu97tUx0p2UqI8HgjiL8bxDbK1GqgJBbUrUE4%3D&reserved=0
> > 
> > Let me know when I can remove it.
> 
> Looking at everything, it is all RIP relative addressing, so those
> accesses should be fine.

Not fine: it is waiting to blow up with a random build environment change.

> Your image has the intel_cc_platform_has()
> function, does it work if you remove that call? Because I think it may be
> the early call into that function which looks like it has instrumentation
> that uses %gs in __sanitizer_cov_trace_pc and %gs is not setup properly
> yet. And since boot_cpu_data.x86_vendor will likely be zero this early it
> will match X86_VENDOR_INTEL and call into that function.

Right, removing the call to intel_cc_platform_has() or moving it to
cc_platform.c fixes the issue.
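
To illustrate the hazard Tom describes, here is a minimal userspace sketch,
not the kernel code. The demo_* names are hypothetical, and the vendor values
are assumptions chosen to match the thread's statement that a zeroed
x86_vendor matches X86_VENDOR_INTEL. A zero-initialised CPU descriptor takes
the Intel branch until CPU identification fills in the real vendor:

```c
#include <assert.h>

/* Assumed values: per the thread, a zeroed x86_vendor matches
 * X86_VENDOR_INTEL. The AMD value is likewise an assumption. */
#define DEMO_VENDOR_INTEL 0
#define DEMO_VENDOR_AMD   2

struct demo_cpuinfo {
	int x86_vendor;
};

/* Static storage is zero-initialised, like boot_cpu_data before the
 * CPU has been identified. */
static struct demo_cpuinfo demo_boot_cpu_data;

enum demo_path { DEMO_PATH_NONE, DEMO_PATH_INTEL, DEMO_PATH_AMD };

/* Hypothetical vendor dispatch in the style of cc_platform_has(). */
enum demo_path demo_cc_dispatch(void)
{
	switch (demo_boot_cpu_data.x86_vendor) {
	case DEMO_VENDOR_INTEL:
		return DEMO_PATH_INTEL;	/* also taken while vendor is unset */
	case DEMO_VENDOR_AMD:
		return DEMO_PATH_AMD;
	default:
		return DEMO_PATH_NONE;
	}
}

/* Stand-in for CPU identification running later in boot. */
void demo_identify_amd(void)
{
	demo_boot_cpu_data.x86_vendor = DEMO_VENDOR_AMD;
}
```

Before demo_identify_amd() runs, demo_cc_dispatch() returns DEMO_PATH_INTEL
even on what will turn out to be an AMD machine, which is how an instrumented
Intel helper can end up being called too early.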

-- 
 Kirill A. Shutemov

^ permalink raw reply

* [PATCH v1] powerpc/64/interrupt: Reconcile soft-mask state in NMI and fix false BUG
From: Nicholas Piggin @ 2021-09-22 14:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

If an NMI hits early in an interrupt handler before the irq soft-mask
state is reconciled, that can cause a false-positive BUG with a
CONFIG_PPC_IRQ_SOFT_MASK_DEBUG assertion.

Remove that assertion and instead handle the case where regs->msr has
EE clear: mark regs->softe as disabled so the irq state looks correct
to NMI handlers, the same way it is fixed up when the interrupted
context was implicitly soft-masked.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index b32ed910a8cf..b76ab848aa0d 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -265,13 +265,16 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
 	local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
 	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 
-	if (is_implicit_soft_masked(regs)) {
-		// Adjust regs->softe soft implicit soft-mask, so
-		// arch_irq_disabled_regs(regs) behaves as expected.
+	if (!(regs->msr & MSR_EE) || is_implicit_soft_masked(regs)) {
+		/*
+		 * Adjust regs->softe to be soft-masked if it had not been
+		 * reconciled (e.g., interrupt entry with MSR[EE]=0 but softe
+		 * not yet set disabled), or if it was in an implicit soft
+		 * masked state. This makes arch_irq_disabled_regs(regs)
+		 * behave as expected.
+		 */
 		regs->softe = IRQS_ALL_DISABLED;
 	}
-	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
-		BUG_ON(!arch_irq_disabled_regs(regs) && !(regs->msr & MSR_EE));
 
 	/* Don't do any per-CPU operations until interrupt state is fixed */
 
-- 
2.23.0
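
The effect of the hunk above can be modelled in a few lines of userspace C.
This is a sketch under stated assumptions, not the kernel code: the demo_*
names and constant values are hypothetical, and the implicit soft-mask test
is reduced to a boolean field.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MSR_EE            0x8000UL	/* illustrative bit value */
#define DEMO_IRQS_ENABLED      0
#define DEMO_IRQS_ALL_DISABLED 1	/* illustrative, not the kernel's value */

struct demo_regs {
	unsigned long msr;
	int softe;
	bool implicit_soft_masked;	/* stands in for is_implicit_soft_masked() */
};

/* Mirrors the new condition: treat the interrupted context as
 * soft-masked if MSR[EE] was clear or it was implicitly soft-masked. */
void demo_nmi_enter_fixup(struct demo_regs *regs)
{
	if (!(regs->msr & DEMO_MSR_EE) || regs->implicit_soft_masked)
		regs->softe = DEMO_IRQS_ALL_DISABLED;
}

/* Simplified stand-in for arch_irq_disabled_regs(). */
bool demo_irq_disabled_regs(const struct demo_regs *regs)
{
	return regs->softe != DEMO_IRQS_ENABLED;
}
```

A context interrupted with EE clear but softe still marked enabled (the case
that used to trip the BUG_ON) reads as disabled after the fixup, while a
context with EE set and no implicit soft-mask is left untouched.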


^ permalink raw reply related

* [PATCH v3 0/6] powerpc/64s: interrupt speedups
From: Nicholas Piggin @ 2021-09-22 14:54 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Here are a few stragglers. The first patch was submitted already but had
some bugs with unrecoverable exceptions on HPT (current->blah being
accessed before MSR[RI] was enabled). Those should be fixed now.

The others generally help asynchronous interrupts, which are a bit
harder to measure well but important for IO and IPIs.

After this series, the SPR accesses of the interrupt handlers for radix
are becoming pretty optimal except for PPR which we could improve on,
and virt CPU accounting which is very costly -- we might disable that
by default unless someone comes up with a good reason to keep it.

Since v1:
- Compile fixes for 64e.
- Fixed a SOFT_MASK_DEBUG false positive.
- Improve function name and comments explaining why patch 2 does not
  need to hard enable when PMU is enabled via sysfs.

Since v2:
- Split first patch into patch 1 and 2, improve on the changelogs.
- More compile fixes.
- Fixed several review comments from Daniel.
- Added patch 5.

Thanks,
Nick

Nicholas Piggin (6):
  powerpc/64/interrupt: make normal synchronous interrupts enable
    MSR[EE] if possible
  powerpc/64s/interrupt: handle MSR EE and RI in interrupt entry wrapper
  powerpc/64s/perf: add power_pmu_wants_prompt_pmi to say whether perf
    wants PMIs to be soft-NMI
  powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless
    perf is in use
  powerpc/64/interrupt: reduce expensive debug tests
  powerpc/64s/interrupt: avoid saving CFAR in some asynchronous
    interrupts

 arch/powerpc/include/asm/hw_irq.h    |  59 +++++++++++++---
 arch/powerpc/include/asm/interrupt.h |  58 ++++++++++++---
 arch/powerpc/kernel/dbell.c          |   3 +-
 arch/powerpc/kernel/exceptions-64s.S | 101 ++++++++++++++++++---------
 arch/powerpc/kernel/fpu.S            |   5 ++
 arch/powerpc/kernel/irq.c            |   3 +-
 arch/powerpc/kernel/time.c           |  31 ++++----
 arch/powerpc/kernel/vector.S         |  10 +++
 arch/powerpc/perf/core-book3s.c      |  31 ++++++++
 9 files changed, 232 insertions(+), 69 deletions(-)

-- 
2.23.0


^ permalink raw reply
