LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC PATCH v3 4/4] ASoC: fsl_asrc_dma: Fix data copying speed issue with EDMA
From: Nicolin Chen @ 2020-06-12  8:10 UTC (permalink / raw)
  To: Shengjiu Wang
  Cc: alsa-devel, lars, timur, Xiubo.Lee, linux-kernel, linuxppc-dev,
	lgirdwood, tiwai, broonie, perex, festevam
In-Reply-To: <424ed6c249bafcbe30791c9de0352821c5ea67e2.1591947428.git.shengjiu.wang@nxp.com>

On Fri, Jun 12, 2020 at 03:37:51PM +0800, Shengjiu Wang wrote:
> With EDMA, there is two dma channels can be used for dev_to_dev,
> one is from ASRC, one is from another peripheral (ESAI or SAI).
> 
> If we select the dma channel of ASRC, there is an issue for ideal
> ratio case, the speed of copy data is faster than sample
> frequency, because ASRC output data is very fast in ideal ratio
> mode.
> 
> So it is reasonable to use the dma channel of Back-End peripheral.
> then copying speed of DMA is controlled by data consumption
> speed in the peripheral FIFO,
> 
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>

Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply

* [PATCH] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed
From: Cédric Le Goater @ 2020-06-12  7:02 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Oliver O'Halloran, linuxppc-dev, Cédric Le Goater

When a passthrough IO adapter is removed from a pseries machine using
hash MMU and the XIVE interrupt mode, the POWER hypervisor, pHyp,
expects the guest OS to have cleared all page table entries related to
the adapter. If some are still present, the RTAS call which isolates
the PCI slot returns error 9001 "valid outstanding translations" and
the removal of the IO adapter fails.

INTx interrupt numbers need special care because Linux maps the
interrupts automatically in the Linux interrupt number space. For this
purpose, record the logical interrupt number of the INTx at the PHB
level and clear these interrupts when the PCI bus is removed. This
will also clear all the page table entries of the ESB pages when using
XIVE.

Cc: "Oliver O'Halloran" <oohall@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 This deprecates patch :
 
 http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200429075122.1216388-3-clg@kaod.org/

 Thanks,

 arch/powerpc/include/asm/pci-bridge.h |  4 +++
 arch/powerpc/kernel/pci-common.c      | 45 +++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index b92e81b256e5..9960dd249079 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -48,6 +48,8 @@ struct pci_controller_ops {
 
 /*
  * Structure of a PCI controller (host bridge)
+ *
+ * @intx: legacy INTx mappings
  */
 struct pci_controller {
 	struct pci_bus *bus;
@@ -127,6 +129,8 @@ struct pci_controller {
 
 	void *private_data;
 	struct npu *npu;
+
+	unsigned int intx[PCI_NUM_INTX];
 };
 
 /* These are used for config access before all the PCI probing
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index be108616a721..8c442627f465 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -353,6 +353,49 @@ struct pci_controller *pci_find_controller_for_domain(int domain_nr)
 	return NULL;
 }
 
+static void pci_intx_register(struct pci_dev *pdev, int virq)
+{
+	struct pci_controller *phb = pci_bus_to_host(pdev->bus);
+	int i;
+
+	for (i = 0; i < PCI_NUM_INTX; i++) {
+		/*
+		 * Look for an empty or an equivalent slot, as INTx
+		 * interrupts can be shared between adapters
+		 */
+		if (phb->intx[i] == virq || !phb->intx[i]) {
+			phb->intx[i] = virq;
+			break;
+		}
+	}
+
+	if (i == PCI_NUM_INTX)
+		pr_err("PCI:%s INTx all mapped\n", pci_name(pdev));
+}
+
+/*
+ * Clearing the mapped INTx interrupts will also clear the underlying
+ * mappings of the ESB pages of the interrupts when under XIVE. It is
+ * a requirement of PowerVM to clear all memory mappings before
+ * removing a PHB.
+ */
+static void pci_intx_dispose(struct pci_bus *bus)
+{
+	struct pci_controller *phb = pci_bus_to_host(bus);
+	int i;
+
+	pr_debug("PCI: Clearing INTx for PHB %04x:%02x...\n",
+		 pci_domain_nr(bus), bus->number);
+	for (i = 0; i < PCI_NUM_INTX; i++)
+		irq_dispose_mapping(phb->intx[i]);
+}
+
+void pcibios_remove_bus(struct pci_bus *bus)
+{
+	pci_intx_dispose(bus);
+}
+EXPORT_SYMBOL_GPL(pcibios_remove_bus);
+
 /*
  * Reads the interrupt pin to determine if interrupt is use by card.
  * If the interrupt is used, then gets the interrupt line from the
@@ -401,6 +444,8 @@ static int pci_read_irq_line(struct pci_dev *pci_dev)
 
 	pci_dev->irq = virq;
 
+	/* Record all INTx mappings for later removal of a PHB */
+	pci_intx_register(pci_dev, virq);
 	return 0;
 }
 
-- 
2.25.4


^ permalink raw reply related

* Re: [PATCH v2] All arch: remove system call sys_sysctl
From: Xiaoming Ni @ 2020-06-12  9:48 UTC (permalink / raw)
  To: Eric W. Biederman, Rich Felker
  Cc: linux-sh, catalin.marinas, paulus, ak, paulburton, geert,
	mattst88, brgerst, acme, cyphar, viro, luto, tglx, surenb, rth,
	young.liuyang, linux-parisc, rdunlap, linux-kernel, mcgrof,
	linux-fsdevel, akpm, mark.rutland, linux-ia64, linux-xtensa,
	jongk, linux, James.Bottomley, jcmvbkbc, linux-s390, ysato,
	deller, yzaikin, mszeredi, gor, linux-alpha, linux-m68k,
	linux-arm-kernel, chris, tony.luck, linux-api, zhouyanjie,
	minchan, sargun, alexander.shishkin, heiko.carstens,
	alex.huangjianhui, will, krzk, borntraeger, vbabka, samitolvanen,
	flameeyes, ravi.bangoria, elver, keescook, arnd, bp, christian,
	tsbogend, jiri, martin.petersen, yamada.masahiro, oleg,
	sudeep.holla, olof, shawnguo, davem, bauerman, fenghua.yu, peterz,
	dhowells, hpa, sparclinux, jolsa, svens, x86, linux, mingo,
	naveen.n.rao, paulmck, sfr, npiggin, namhyung, dvyukov, axboe,
	monstr, haolee.swjtu, linux-mips, ink, linuxppc-dev
In-Reply-To: <87k10dqx5g.fsf@x220.int.ebiederm.org>

On 2020/6/12 2:23, Eric W. Biederman wrote:
> Rich Felker <dalias@libc.org> writes:
> 
>> On Thu, Jun 11, 2020 at 12:01:11PM -0500, Eric W. Biederman wrote:
>>> Rich Felker <dalias@libc.org> writes:
>>>
>>>> On Thu, Jun 11, 2020 at 06:43:00AM -0500, Eric W. Biederman wrote:
>>>>> Xiaoming Ni <nixiaoming@huawei.com> writes:
>>>>>
>>>>>> Since the commit 61a47c1ad3a4dc ("sysctl: Remove the sysctl system call"),
>>>>>> sys_sysctl is actually unavailable: any input can only return an error.
>>>>>>
>>>>>> We have been warning about people using the sysctl system call for years
>>>>>> and believe there are no more users.  Even if there are users of this
>>>>>> interface if they have not complained or fixed their code by now they
>>>>>> probably are not going to, so there is no point in warning them any
>>>>>> longer.
>>>>>>
>>>>>> So completely remove sys_sysctl on all architectures.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
>>>>>>
>>>>>> changes in v2:
>>>>>>    According to Kees Cook's suggestion, completely remove sys_sysctl on all arch
>>>>>>    According to Eric W. Biederman's suggestion, update the commit log
>>>>>>
>>>>>> V1: https://lore.kernel.org/lkml/1591683605-8585-1-git-send-email-nixiaoming@huawei.com/
>>>>>>    Delete the code of sys_sysctl and return -ENOSYS directly at the function entry
>>>>>> ---
>>>>>>   include/uapi/linux/sysctl.h                        |  15 --
>>>>> [snip]
>>>>>
>>>>>> diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
>>>>>> index 27c1ed2..84b44c3 100644
>>>>>> --- a/include/uapi/linux/sysctl.h
>>>>>> +++ b/include/uapi/linux/sysctl.h
>>>>>> @@ -27,21 +27,6 @@
>>>>>>   #include <linux/types.h>
>>>>>>   #include <linux/compiler.h>
>>>>>>   
>>>>>> -#define CTL_MAXNAME 10		/* how many path components do we allow in a
>>>>>> -				   call to sysctl?   In other words, what is
>>>>>> -				   the largest acceptable value for the nlen
>>>>>> -				   member of a struct __sysctl_args to have? */
>>>>>> -
>>>>>> -struct __sysctl_args {
>>>>>> -	int __user *name;
>>>>>> -	int nlen;
>>>>>> -	void __user *oldval;
>>>>>> -	size_t __user *oldlenp;
>>>>>> -	void __user *newval;
>>>>>> -	size_t newlen;
>>>>>> -	unsigned long __unused[4];
>>>>>> -};
>>>>>> -
>>>>>>   /* Define sysctl names first */
>>>>>>   
>>>>>>   /* Top-level names: */
>>>>> [snip]
>>>>>
>>>>> The uapi header change does not make sense.  The entire point of the
>>>>> header is to allow userspace programs to be able to call sys_sysctl.
>>>>> It either needs to all stay or all go.
>>>>>
>>>>> As the concern with the uapi header is about userspace programs being
>>>>> able to compile please leave the header for now.
>>>>>
>>>>> We should leave auditing userspace and seeing if userspace code will
>>>>> still compile if we remove this header for a separate patch.  The
>>>>> concerns and justifications for the uapi header are completely different
>>>>> then for the removing the sys_sysctl implementation.
>>>>>
>>>>> Otherwise
>>>>> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>>>
>>>> The UAPI header should be kept because it's defining an API not just
>>>> for the kernel the headers are supplied with, but for all past
>>>> kernels. In particular programs needing a failsafe CSPRNG source that
>>>> works on old kernels may (do) use this as a fallback only if modern
>>>> syscalls are missing. Removing the syscall is no problem since it
>>>> won't be used, but if you remove the types/macros from the UAPI
>>>> headers, they'll have to copy that into their own sources.
>>>
>>> May we assume you know of a least one piece of userspace that will fail
>>> to compile if this header file is removed?
>>
>> I know at least one piece of software is using SYS_sysctl for a
>> fallback CSPRNG source. I'm not 100% sure that they're using the
>> kernel headers; they might have copied it already. I'm also not sure
>> how many there are.
>>
>> Regardless, I think the principle stands. There's no need to remove
>> definitions that are essentially maintenance-free now that the
>> interface is no longer available in new kernels, and doing so
>> contributes to the myth that you're supposed to use kernel headers
>> matching runtime kernel rather than it always being safe to use latest
>> headers.
> 
> If there is no one using the definitions removing them saves people
> having to remember what they are there for.
> 
> The big rule is don't break userspace.  The goal is to allow people to
> upgrade their kernel without needing to worry about userspace breaking,
> and to be able to downgrade to the extent possible to help in tracking
> bugs.
> 
> Not being able to compile userspace seems like a pretty clear cut case.
> Although there are some fuzzy edges given the history of the kernel
> headers.  Things like your libc requiring kernel headers to be processed
> before they can be used.  I think there are still some kernel headers
> that have that restriction when used with glibc as glibc uses different
> sizes for types like dev_t.
> 
> The bottom line is we can't do it casually so that any work in the
> direction of removing from or deleting uapi headers needs to be it's own
> separate patch.
> 
> Given how much effort it can be to show that userspace is not using
> something I don't expect us to be mucking with the uapi headers any time
> soon.
> 
> Eric
> 

Thanks everyone for your guidance, I will delete the update of uapi file 
in v3 version.

But here I am still a bit confused: how to modify include/uapi?

Before commit 61a47c1ad3a4dc ("sysctl: Remove the sysctl system call"),
most of the enumeration variables defined in include/uapi/linux/sysctl.h
were used in kernel/sysctl_binary.c,
After commit 61a47c1ad3a4dc ("sysctl: Remove the sysctl system call"),
the code for enumerating variables in include/uapi/linux/sysctl.h cannot
be found in the current git repository

 From the management of a single git repository, we can immediately 
delete include/uapi/linux/sysctl.h for the reason of deleting unused 
code. But from the complex cooperation of linux/libc/ltp/man/xxxx, it 
may take a long time to modify uapi.

Is there any example for the update of uapi? How to control the rhythm?
How to update uapi?

Thanks
Xiaoming Ni


^ permalink raw reply

* Re: [PATCH v5 1/4] riscv: Move kernel mapping to vmalloc zone
From: Alex Ghiti @ 2020-06-12 12:30 UTC (permalink / raw)
  To: Atish Patra
  Cc: Albert Ou, Anup Patel, linux-kernel@vger.kernel.org List,
	Atish Patra, Paul Mackerras, Zong Li, Paul Walmsley,
	Palmer Dabbelt, linux-riscv, linuxppc-dev
In-Reply-To: <CAOnJCULMXX1OyLsFZMWj2tMftsCY1rqmN1QxYH+Y7Z-3a5Ap7g@mail.gmail.com>

Hi Atish,

Le 6/11/20 à 5:34 PM, Atish Patra a écrit :
> On Sun, Jun 7, 2020 at 1:01 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
>> This is a preparatory patch for relocatable kernel.
>>
>> The kernel used to be linked at PAGE_OFFSET address and used to be loaded
>> physically at the beginning of the main memory. Therefore, we could use
>> the linear mapping for the kernel mapping.
>>
>> But the relocated kernel base address will be different from PAGE_OFFSET
>> and since in the linear mapping, two different virtual addresses cannot
>> point to the same physical address, the kernel mapping needs to lie outside
>> the linear mapping.
>>
>> In addition, because modules and BPF must be close to the kernel (inside
>> +-2GB window), the kernel is placed at the end of the vmalloc zone minus
>> 2GB, which leaves room for modules and BPF. The kernel could not be
>> placed at the beginning of the vmalloc zone since other vmalloc
>> allocations from the kernel could get all the +-2GB window around the
>> kernel which would prevent new modules and BPF programs to be loaded.
>>
>> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
>> Reviewed-by: Zong Li <zong.li@sifive.com>
>> ---
>>   arch/riscv/boot/loader.lds.S     |  3 +-
>>   arch/riscv/include/asm/page.h    | 10 +++++-
>>   arch/riscv/include/asm/pgtable.h | 38 ++++++++++++++-------
>>   arch/riscv/kernel/head.S         |  3 +-
>>   arch/riscv/kernel/module.c       |  4 +--
>>   arch/riscv/kernel/vmlinux.lds.S  |  3 +-
>>   arch/riscv/mm/init.c             | 58 +++++++++++++++++++++++++-------
>>   arch/riscv/mm/physaddr.c         |  2 +-
>>   8 files changed, 88 insertions(+), 33 deletions(-)
>>
>> diff --git a/arch/riscv/boot/loader.lds.S b/arch/riscv/boot/loader.lds.S
>> index 47a5003c2e28..62d94696a19c 100644
>> --- a/arch/riscv/boot/loader.lds.S
>> +++ b/arch/riscv/boot/loader.lds.S
>> @@ -1,13 +1,14 @@
>>   /* SPDX-License-Identifier: GPL-2.0 */
>>
>>   #include <asm/page.h>
>> +#include <asm/pgtable.h>
>>
>>   OUTPUT_ARCH(riscv)
>>   ENTRY(_start)
>>
>>   SECTIONS
>>   {
>> -       . = PAGE_OFFSET;
>> +       . = KERNEL_LINK_ADDR;
>>
>>          .payload : {
>>                  *(.payload)
>> diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
>> index 2d50f76efe48..48bb09b6a9b7 100644
>> --- a/arch/riscv/include/asm/page.h
>> +++ b/arch/riscv/include/asm/page.h
>> @@ -90,18 +90,26 @@ typedef struct page *pgtable_t;
>>
>>   #ifdef CONFIG_MMU
>>   extern unsigned long va_pa_offset;
>> +extern unsigned long va_kernel_pa_offset;
>>   extern unsigned long pfn_base;
>>   #define ARCH_PFN_OFFSET                (pfn_base)
>>   #else
>>   #define va_pa_offset           0
>> +#define va_kernel_pa_offset    0
>>   #define ARCH_PFN_OFFSET                (PAGE_OFFSET >> PAGE_SHIFT)
>>   #endif /* CONFIG_MMU */
>>
>>   extern unsigned long max_low_pfn;
>>   extern unsigned long min_low_pfn;
>> +extern unsigned long kernel_virt_addr;
>>
>>   #define __pa_to_va_nodebug(x)  ((void *)((unsigned long) (x) + va_pa_offset))
>> -#define __va_to_pa_nodebug(x)  ((unsigned long)(x) - va_pa_offset)
>> +#define linear_mapping_va_to_pa(x)     ((unsigned long)(x) - va_pa_offset)
>> +#define kernel_mapping_va_to_pa(x)     \
>> +       ((unsigned long)(x) - va_kernel_pa_offset)
>> +#define __va_to_pa_nodebug(x)          \
>> +       (((x) >= PAGE_OFFSET) ?         \
>> +               linear_mapping_va_to_pa(x) : kernel_mapping_va_to_pa(x))
>>
>>   #ifdef CONFIG_DEBUG_VIRTUAL
>>   extern phys_addr_t __virt_to_phys(unsigned long x);
>> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>> index 35b60035b6b0..94ef3b49dfb6 100644
>> --- a/arch/riscv/include/asm/pgtable.h
>> +++ b/arch/riscv/include/asm/pgtable.h
>> @@ -11,23 +11,29 @@
>>
>>   #include <asm/pgtable-bits.h>
>>
>> -#ifndef __ASSEMBLY__
>> -
>> -/* Page Upper Directory not used in RISC-V */
>> -#include <asm-generic/pgtable-nopud.h>
>> -#include <asm/page.h>
>> -#include <asm/tlbflush.h>
>> -#include <linux/mm_types.h>
>> -
>> -#ifdef CONFIG_MMU
>> +#ifndef CONFIG_MMU
>> +#define KERNEL_VIRT_ADDR       PAGE_OFFSET
>> +#define KERNEL_LINK_ADDR       PAGE_OFFSET
>> +#else
>> +/*
>> + * Leave 2GB for modules and BPF that must lie within a 2GB range around
>> + * the kernel.
>> + */
>> +#define KERNEL_VIRT_ADDR       (VMALLOC_END - SZ_2G + 1)
>> +#define KERNEL_LINK_ADDR       KERNEL_VIRT_ADDR
>>
>>   #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
>>   #define VMALLOC_END      (PAGE_OFFSET - 1)
>>   #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
>>
>>   #define BPF_JIT_REGION_SIZE    (SZ_128M)
>> -#define BPF_JIT_REGION_START   (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
>> -#define BPF_JIT_REGION_END     (VMALLOC_END)
>> +#define BPF_JIT_REGION_START   PFN_ALIGN((unsigned long)&_end)
>> +#define BPF_JIT_REGION_END     (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
>> +
> As these mappings have changed a few times in recent months including
> this one, I think it would be
> better to have virtual memory layout documentation in RISC-V similar
> to other architectures.
>
> If you can include the page table layout for 3/4 level page tables in
> the same document, that would be really helpful.
>

Yes, I'll do that in a separate commit.

Thanks,

Alex


>> +#ifdef CONFIG_64BIT
>> +#define VMALLOC_MODULE_START   BPF_JIT_REGION_END
>> +#define VMALLOC_MODULE_END     (((unsigned long)&_start & PAGE_MASK) + SZ_2G)
>> +#endif
>>
>>   /*
>>    * Roughly size the vmemmap space to be large enough to fit enough
>> @@ -57,9 +63,16 @@
>>   #define FIXADDR_SIZE     PGDIR_SIZE
>>   #endif
>>   #define FIXADDR_START    (FIXADDR_TOP - FIXADDR_SIZE)
>> -
>>   #endif
>>
>> +#ifndef __ASSEMBLY__
>> +
>> +/* Page Upper Directory not used in RISC-V */
>> +#include <asm-generic/pgtable-nopud.h>
>> +#include <asm/page.h>
>> +#include <asm/tlbflush.h>
>> +#include <linux/mm_types.h>
>> +
>>   #ifdef CONFIG_64BIT
>>   #include <asm/pgtable-64.h>
>>   #else
>> @@ -483,6 +496,7 @@ static inline void __kernel_map_pages(struct page *page, int numpages, int enabl
>>
>>   #define kern_addr_valid(addr)   (1) /* FIXME */
>>
>> +extern char _start[];
>>   extern void *dtb_early_va;
>>   void setup_bootmem(void);
>>   void paging_init(void);
>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>> index 98a406474e7d..8f5bb7731327 100644
>> --- a/arch/riscv/kernel/head.S
>> +++ b/arch/riscv/kernel/head.S
>> @@ -49,7 +49,8 @@ ENTRY(_start)
>>   #ifdef CONFIG_MMU
>>   relocate:
>>          /* Relocate return address */
>> -       li a1, PAGE_OFFSET
>> +       la a1, kernel_virt_addr
>> +       REG_L a1, 0(a1)
>>          la a2, _start
>>          sub a1, a1, a2
>>          add ra, ra, a1
>> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
>> index 8bbe5dbe1341..1a8fbe05accf 100644
>> --- a/arch/riscv/kernel/module.c
>> +++ b/arch/riscv/kernel/module.c
>> @@ -392,12 +392,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>>   }
>>
>>   #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
>> -#define VMALLOC_MODULE_START \
>> -        max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>>   void *module_alloc(unsigned long size)
>>   {
>>          return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
>> -                                   VMALLOC_END, GFP_KERNEL,
>> +                                   VMALLOC_MODULE_END, GFP_KERNEL,
>>                                      PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
>>                                      __builtin_return_address(0));
>>   }
>> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
>> index 0339b6bbe11a..a9abde62909f 100644
>> --- a/arch/riscv/kernel/vmlinux.lds.S
>> +++ b/arch/riscv/kernel/vmlinux.lds.S
>> @@ -4,7 +4,8 @@
>>    * Copyright (C) 2017 SiFive
>>    */
>>
>> -#define LOAD_OFFSET PAGE_OFFSET
>> +#include <asm/pgtable.h>
>> +#define LOAD_OFFSET KERNEL_LINK_ADDR
>>   #include <asm/vmlinux.lds.h>
>>   #include <asm/page.h>
>>   #include <asm/cache.h>
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index 736de6c8739f..71da78914645 100644
>> --- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -22,6 +22,9 @@
>>
>>   #include "../kernel/head.h"
>>
>> +unsigned long kernel_virt_addr = KERNEL_VIRT_ADDR;
>> +EXPORT_SYMBOL(kernel_virt_addr);
>> +
>>   unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
>>                                                          __page_aligned_bss;
>>   EXPORT_SYMBOL(empty_zero_page);
>> @@ -178,8 +181,12 @@ void __init setup_bootmem(void)
>>   }
>>
>>   #ifdef CONFIG_MMU
>> +/* Offset between linear mapping virtual address and kernel load address */
>>   unsigned long va_pa_offset;
>>   EXPORT_SYMBOL(va_pa_offset);
>> +/* Offset between kernel mapping virtual address and kernel load address */
>> +unsigned long va_kernel_pa_offset;
>> +EXPORT_SYMBOL(va_kernel_pa_offset);
>>   unsigned long pfn_base;
>>   EXPORT_SYMBOL(pfn_base);
>>
>> @@ -271,7 +278,7 @@ static phys_addr_t __init alloc_pmd(uintptr_t va)
>>          if (mmu_enabled)
>>                  return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
>>
>> -       pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT;
>> +       pmd_num = (va - kernel_virt_addr) >> PGDIR_SHIFT;
>>          BUG_ON(pmd_num >= NUM_EARLY_PMDS);
>>          return (uintptr_t)&early_pmd[pmd_num * PTRS_PER_PMD];
>>   }
>> @@ -372,14 +379,30 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
>>   #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
>>   #endif
>>
>> +static uintptr_t load_pa, load_sz;
>> +
>> +static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size)
>> +{
>> +       uintptr_t va, end_va;
>> +
>> +       end_va = kernel_virt_addr + load_sz;
>> +       for (va = kernel_virt_addr; va < end_va; va += map_size)
>> +               create_pgd_mapping(pgdir, va,
>> +                                  load_pa + (va - kernel_virt_addr),
>> +                                  map_size, PAGE_KERNEL_EXEC);
>> +}
>> +
>>   asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>   {
>>          uintptr_t va, end_va;
>> -       uintptr_t load_pa = (uintptr_t)(&_start);
>> -       uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
>>          uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
>>
>> +       load_pa = (uintptr_t)(&_start);
>> +       load_sz = (uintptr_t)(&_end) - load_pa;
>> +
>>          va_pa_offset = PAGE_OFFSET - load_pa;
>> +       va_kernel_pa_offset = kernel_virt_addr - load_pa;
>> +
>>          pfn_base = PFN_DOWN(load_pa);
>>
>>          /*
>> @@ -402,26 +425,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>          create_pmd_mapping(fixmap_pmd, FIXADDR_START,
>>                             (uintptr_t)fixmap_pte, PMD_SIZE, PAGE_TABLE);
>>          /* Setup trampoline PGD and PMD */
>> -       create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>> +       create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>                             (uintptr_t)trampoline_pmd, PGDIR_SIZE, PAGE_TABLE);
>> -       create_pmd_mapping(trampoline_pmd, PAGE_OFFSET,
>> +       create_pmd_mapping(trampoline_pmd, kernel_virt_addr,
>>                             load_pa, PMD_SIZE, PAGE_KERNEL_EXEC);
>>   #else
>>          /* Setup trampoline PGD */
>> -       create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>> +       create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>                             load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
>>   #endif
>>
>>          /*
>> -        * Setup early PGD covering entire kernel which will allows
>> +        * Setup early PGD covering entire kernel which will allow
>>           * us to reach paging_init(). We map all memory banks later
>>           * in setup_vm_final() below.
>>           */
>> -       end_va = PAGE_OFFSET + load_sz;
>> -       for (va = PAGE_OFFSET; va < end_va; va += map_size)
>> -               create_pgd_mapping(early_pg_dir, va,
>> -                                  load_pa + (va - PAGE_OFFSET),
>> -                                  map_size, PAGE_KERNEL_EXEC);
>> +       create_kernel_page_table(early_pg_dir, map_size);
>>
>>          /* Create fixed mapping for early FDT parsing */
>>          end_va = __fix_to_virt(FIX_FDT) + FIX_FDT_SIZE;
>> @@ -441,6 +460,7 @@ static void __init setup_vm_final(void)
>>          uintptr_t va, map_size;
>>          phys_addr_t pa, start, end;
>>          struct memblock_region *reg;
>> +       static struct vm_struct vm_kernel = { 0 };
>>
>>          /* Set mmu_enabled flag */
>>          mmu_enabled = true;
>> @@ -467,10 +487,22 @@ static void __init setup_vm_final(void)
>>                  for (pa = start; pa < end; pa += map_size) {
>>                          va = (uintptr_t)__va(pa);
>>                          create_pgd_mapping(swapper_pg_dir, va, pa,
>> -                                          map_size, PAGE_KERNEL_EXEC);
>> +                                          map_size, PAGE_KERNEL);
>>                  }
>>          }
>>
>> +       /* Map the kernel */
>> +       create_kernel_page_table(swapper_pg_dir, PMD_SIZE);
>> +
>> +       /* Reserve the vmalloc area occupied by the kernel */
>> +       vm_kernel.addr = (void *)kernel_virt_addr;
>> +       vm_kernel.phys_addr = load_pa;
>> +       vm_kernel.size = (load_sz + PMD_SIZE - 1) & ~(PMD_SIZE - 1);
>> +       vm_kernel.flags = VM_MAP | VM_NO_GUARD;
>> +       vm_kernel.caller = __builtin_return_address(0);
>> +
>> +       vm_area_add_early(&vm_kernel);
>> +
>>          /* Clear fixmap PTE and PMD mappings */
>>          clear_fixmap(FIX_PTE);
>>          clear_fixmap(FIX_PMD);
>> diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
>> index e8e4dcd39fed..35703d5ef5fd 100644
>> --- a/arch/riscv/mm/physaddr.c
>> +++ b/arch/riscv/mm/physaddr.c
>> @@ -23,7 +23,7 @@ EXPORT_SYMBOL(__virt_to_phys);
>>
>>   phys_addr_t __phys_addr_symbol(unsigned long x)
>>   {
>> -       unsigned long kernel_start = (unsigned long)PAGE_OFFSET;
>> +       unsigned long kernel_start = (unsigned long)kernel_virt_addr;
>>          unsigned long kernel_end = (unsigned long)_end;
>>
>>          /*
>> --
>> 2.20.1
>>
>>
>

^ permalink raw reply

* Re: [PATCH kernel] powerpc/xive: Ignore kmemleak false positives
From: Michael Ellerman @ 2020-06-12 12:43 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev; +Cc: Alexey Kardashevskiy, Paul Mackerras
In-Reply-To: <20200612043303.84894-1-aik@ozlabs.ru>

Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> xive_native_provision_pages() allocates memory and passes the pointer to
> OPAL so kmemleak cannot find the pointer usage in the kernel memory and
> produces a false positive report (below) (even if the kernel did scan
> OPAL memory, it is unable to deal with __pa() addresses anyway).
>
> This silences the warning.
>
> unreferenced object 0xc000200350c40000 (size 65536):
>   comm "qemu-system-ppc", pid 2725, jiffies 4294946414 (age 70776.530s)
>   hex dump (first 32 bytes):
>     02 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00  ....P...........
>     01 00 08 07 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<0000000081ff046c>] xive_native_alloc_vp_block+0x120/0x250
>     [<00000000d555d524>] kvmppc_xive_compute_vp_id+0x248/0x350 [kvm]
>     [<00000000d69b9c9f>] kvmppc_xive_connect_vcpu+0xc0/0x520 [kvm]
>     [<000000006acbc81c>] kvm_arch_vcpu_ioctl+0x308/0x580 [kvm]
>     [<0000000089c69580>] kvm_vcpu_ioctl+0x19c/0xae0 [kvm]
>     [<00000000902ae91e>] ksys_ioctl+0x184/0x1b0
>     [<00000000f3e68bd7>] sys_ioctl+0x48/0xb0
>     [<0000000001b2c127>] system_call_exception+0x124/0x1f0
>     [<00000000d2b2ee40>] system_call_common+0xe8/0x214
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>
> Does kmemleak actually check the OPAL memory?

No it shouldn't.

The memory used by OPAL should all be reserved in the device tree. That
means we never give it to any of the Linux memory allocators, and
therefore kmemleak will never see an allocation from those areas and add
that area to its list of areas to scan.

At least that's my understanding of how kmemleak works.

> Because if it did, we would still have a warning as kmemleak does not
> trace __pa() addresses anyway.

Right.

I think this patch is an OK solution.

It's kind of odd that we donate pages and don't keep track of them. But
they're used by xive until it's reset, and we don't do that until we
kexec, at which point we don't need to know about them anyway.

cheers

^ permalink raw reply

* Re: PowerPC KVM-PR issue
From: Christian Zigotzky @ 2020-06-12 13:01 UTC (permalink / raw)
  To: linuxppc-dev, npiggin, kvm-ppc@vger.kernel.org
  Cc: Darren Stevens, R.T.Dickinson, Christian Zigotzky
In-Reply-To: <54082b17-31bb-f529-2e9e-b84c5a5aa9ec@xenosoft.de>

On 11 June 2020 at 04:47 pm, Christian Zigotzky wrote:
> On 10 June 2020 at 01:23 pm, Christian Zigotzky wrote:
>> On 10 June 2020 at 11:06 am, Christian Zigotzky wrote:
>>> On 10 June 2020 at 00:18 am, Christian Zigotzky wrote:
>>>> Hello,
>>>>
>>>> KVM-PR doesn't work anymore on my Nemo board [1]. I figured out 
>>>> that the Git kernels and the kernel 5.7 are affected.
>>>>
>>>> Error message: Fienix kernel: kvmppc_exit_pr_progint: emulation at 
>>>> 700 failed (00000000)
>>>>
>>>> I can boot virtual QEMU PowerPC machines with KVM-PR with the 
>>>> kernel 5.6 without any problems on my Nemo board.
>>>>
>>>> I tested it with QEMU 2.5.0 and QEMU 5.0.0 today.
>>>>
>>>> Could you please check KVM-PR on your PowerPC machine?
>>>>
>>>> Thanks,
>>>> Christian
>>>>
>>>> [1] https://en.wikipedia.org/wiki/AmigaOne_X1000
>>>
>>> I figured out that the PowerPC updates 5.7-1 [1] are responsible for 
>>> the KVM-PR issue. Please test KVM-PR on your PowerPC machines and 
>>> check the PowerPC updates 5.7-1 [1].
>>>
>>> Thanks
>>>
>>> [1] 
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d38c07afc356ddebaa3ed8ecb3f553340e05c969
>>>
>>>
>> I tested the latest Git kernel with Mac-on-Linux/KVM-PR today. 
>> Unfortunately I can't use KVM-PR with MoL anymore because of this 
>> issue (see screenshots [1]). Please check the PowerPC updates 5.7-1.
>>
>> Thanks
>>
>> [1]
>> - 
>> https://i.pinimg.com/originals/0c/b3/64/0cb364a40241fa2b7f297d4272bbb8b7.png
>> - 
>> https://i.pinimg.com/originals/9a/61/d1/9a61d170b1c9f514f7a78a3014ffd18f.png
>>
> Hi All,
>
> I bisected today because of the KVM-PR issue.
>
> Result:
>
> 9600f261acaaabd476d7833cec2dd20f2919f1a0 is the first bad commit
> commit 9600f261acaaabd476d7833cec2dd20f2919f1a0
> Author: Nicholas Piggin <npiggin@gmail.com>
> Date:   Wed Feb 26 03:35:21 2020 +1000
>
>     powerpc/64s/exception: Move KVM test to common code
>
>     This allows more code to be moved out of unrelocated regions. The
>     system call KVMTEST is changed to be open-coded and remain in the
>     tramp area to avoid having to move it to entry_64.S. The custom 
> nature
>     of the system call entry code means the hcall case can be made more
>     streamlined than regular interrupt handlers.
>
>     mpe: Incorporate fix from Nick:
>
>     Moving KVM test to the common entry code missed the case of HMI and
>     MCE, which do not do __GEN_COMMON_ENTRY (because they don't want to
>     switch to virt mode).
>
>     This means a MCE or HMI exception that is taken while KVM is 
> running a
>     guest context will not be switched out of that context, and KVM won't
>     be notified. Found by running sigfuz in guest with patched host on
>     POWER9 DD2.3, which causes some TM related HMI interrupts (which are
>     expected and supposed to be handled by KVM).
>
>     This fix adds a __GEN_REALMODE_COMMON_ENTRY for those handlers to add
>     the KVM test. This makes them look a little more like other handlers
>     that all use __GEN_COMMON_ENTRY.
>
>     Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>     Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>     Link: 
> https://lore.kernel.org/r/20200225173541.1549955-13-npiggin@gmail.com
>
> :040000 040000 ec21cec22d165f8696d69532734cb2985d532cb0 
> 87dd49a9cd7202ec79350e8ca26cea01f1dbd93d M    arch
>
> -----
>
> The following commit is the problem: powerpc/64s/exception: Move KVM 
> test to common code [1]
>
> These changes were included in the PowerPC updates 5.7-1. [2]
>
> Another test:
>
> git checkout d38c07afc356ddebaa3ed8ecb3f553340e05c969 (PowerPC updates 
> 5.7-1 [2] ) -> KVM-PR doesn't work.
>
> After that: git revert d38c07afc356ddebaa3ed8ecb3f553340e05c969 -m 1 
> -> KVM-PR works.
>
> Could you please check the first bad commit? [1]
>
> Thanks,
> Christian
>
>
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9600f261acaaabd476d7833cec2dd20f2919f1a0
> [2] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d38c07afc356ddebaa3ed8ecb3f553340e05c969

Hi All,

I tried to revert the __GEN_REALMODE_COMMON_ENTRY fix for the latest Git 
kernel and for the stable kernel 5.7.2 but without any success. There 
was lot of restructuring work during the kernel 5.7 development time in 
the PowerPC area so it isn't possible reactivate the old code. That 
means we have lost the whole KVM-PR support. I also reported this issue 
to Alexander Graf two days ago. He wrote: "Howdy :). It looks pretty 
broken. Have you ever made a bisect to see where the problem comes from?"

Please check the KVM-PR code.

Thanks,
Christian



^ permalink raw reply

* Re: [PATCH v3 20/41] powerpc/book3s64/kuap/kuep: Make KUAP and KUEP a subfeature of PPC_MEM_KEYS
From: kernel test robot @ 2020-06-12 13:36 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe
  Cc: Aneesh Kumar K.V, kbuild-all, bauerman, linuxram
In-Reply-To: <20200610095204.608183-21-aneesh.kumar@linux.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 4583 bytes --]

Hi "Aneesh,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on next-20200611]
[cannot apply to v5.7]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Aneesh-Kumar-K-V/Kernel-userspace-access-execution-prevention-with-hash-translation/20200610-191943
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-r003-20200612 (attached as .config)
compiler: powerpc64le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

arch/powerpc/mm/book3s64/pkeys.c: In function 'setup_kuep':
>> arch/powerpc/mm/book3s64/pkeys.c:207:28: error: 'boot_cpuid' undeclared (first use in this function)
207 |  if (smp_processor_id() == boot_cpuid) {
|                            ^~~~~~~~~~
arch/powerpc/mm/book3s64/pkeys.c:207:28: note: each undeclared identifier is reported only once for each function it appears in
arch/powerpc/mm/book3s64/pkeys.c: In function 'setup_kuap':
arch/powerpc/mm/book3s64/pkeys.c:228:28: error: 'boot_cpuid' undeclared (first use in this function)
228 |  if (smp_processor_id() == boot_cpuid) {
|                            ^~~~~~~~~~

vim +/boot_cpuid +207 arch/powerpc/mm/book3s64/pkeys.c

92e3da3cf193fd arch/powerpc/mm/pkeys.c          Ram Pai          2018-01-18  200  
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  201  #ifdef CONFIG_PPC_KUEP
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  202  void __init setup_kuep(bool disabled)
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  203  {
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  204  	if (disabled || !early_radix_enabled())
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  205  		return;
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  206  
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10 @207  	if (smp_processor_id() == boot_cpuid) {
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  208  		pr_info("Activating Kernel Userspace Execution Prevention\n");
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  209  		cur_cpu_spec->mmu_features |= MMU_FTR_KUEP;
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  210  	}
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  211  
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  212  	/*
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  213  	 * Radix always uses key0 of the IAMR to determine if an access is
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  214  	 * allowed. We set bit 0 (IBM bit 1) of key0, to prevent instruction
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  215  	 * fetch.
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  216  	 */
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  217  	mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  218  	isync();
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  219  }
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  220  #endif
82b6d6aaa3e29c arch/powerpc/mm/book3s64/pkeys.c Aneesh Kumar K.V 2020-06-10  221  

:::::: The code at line 207 was first introduced by commit
:::::: 82b6d6aaa3e29c1f61639eaf61333b3f84b34c4d powerpc/book3s64/kuep: Move KUEP related function outside radix

:::::: TO: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
:::::: CC: 0day robot <lkp@intel.com>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24649 bytes --]

^ permalink raw reply

* Re: [RFC PATCH v3 0/4] Reuse the dma channel if available in Back-End
From: Mark Brown @ 2020-06-12 13:59 UTC (permalink / raw)
  To: nicoleotsuka, Shengjiu Wang, timur, tiwai, festevam, lars,
	Xiubo.Lee, alsa-devel, linux-kernel, perex, lgirdwood,
	linuxppc-dev
In-Reply-To: <cover.1591947428.git.shengjiu.wang@nxp.com>

On Fri, 12 Jun 2020 15:37:47 +0800, Shengjiu Wang wrote:
> Reuse the dma channel if available in Back-End
> 
> Shengjiu Wang (4):
>   ASoC: soc-card: export snd_soc_lookup_component_nolocked
>   ASoC: dmaengine_pcm: export soc_component_to_pcm
>   ASoC: fsl_asrc_dma: Reuse the dma channel if available in Back-End
>   ASoC: fsl_asrc_dma: Fix data copying speed issue with EDMA
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/4] ASoC: soc-card: export snd_soc_lookup_component_nolocked
      commit: 6fbea6b6a838f9aa941fe53a3637fd8d8aab1eba
[2/4] ASoC: dmaengine_pcm: export soc_component_to_pcm
      commit: a9a21e1eafc94b79502cab8272b392f7f63ef7bb
[3/4] ASoC: fsl_asrc_dma: Reuse the dma channel if available in Back-End
      commit: 706e2c8811585f42612b6cff218ab3adbe63a4ee
[4/4] ASoC: fsl_asrc_dma: Fix data copying speed issue with EDMA
      commit: b287a6d9723c601dd947f1c27d4cc0192e384a5a

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

^ permalink raw reply

* [PATCH V2] powerpc/pseries/svm: Drop unused align argument in alloc_shared_lppaca() function
From: Satheesh Rajendran @ 2020-06-12 14:29 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Sukadev Bhattiprolu, Ram Pai, linux-kernel, Satheesh Rajendran,
	Laurent Dufour, Thiago Jung Bauermann

Argument "align" in alloc_shared_lppaca() was unused inside the
function. Let's drop it and update code comment for page alignment.

Cc: linux-kernel@vger.kernel.org
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
---

V2:
Added reviewed by Thiago.
Dropped align argument as per Michael suggest.
Modified commit msg.

V1: http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200609113909.17236-1-sathnaga@linux.vnet.ibm.com/
---
 arch/powerpc/kernel/paca.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 8d96169c597e..a174d64d9b4d 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -57,8 +57,8 @@ static void *__init alloc_paca_data(unsigned long size, unsigned long align,
 
 #define LPPACA_SIZE 0x400
 
-static void *__init alloc_shared_lppaca(unsigned long size, unsigned long align,
-					unsigned long limit, int cpu)
+static void *__init alloc_shared_lppaca(unsigned long size, unsigned long limit,
+					int cpu)
 {
 	size_t shared_lppaca_total_size = PAGE_ALIGN(nr_cpu_ids * LPPACA_SIZE);
 	static unsigned long shared_lppaca_size;
@@ -68,6 +68,12 @@ static void *__init alloc_shared_lppaca(unsigned long size, unsigned long align,
 	if (!shared_lppaca) {
 		memblock_set_bottom_up(true);
 
+		/* See Documentation/powerpc/ultravisor.rst for mode details
+		 *
+		 * UV/HV data share is in PAGE granularity, In order to
+		 * minimize the number of pages shared and maximize the
+		 * use of a page, let's use page align.
+		 */
 		shared_lppaca =
 			memblock_alloc_try_nid(shared_lppaca_total_size,
 					       PAGE_SIZE, MEMBLOCK_LOW_LIMIT,
@@ -122,7 +128,7 @@ static struct lppaca * __init new_lppaca(int cpu, unsigned long limit)
 		return NULL;
 
 	if (is_secure_guest())
-		lp = alloc_shared_lppaca(LPPACA_SIZE, 0x400, limit, cpu);
+		lp = alloc_shared_lppaca(LPPACA_SIZE, limit, cpu);
 	else
 		lp = alloc_paca_data(LPPACA_SIZE, 0x400, limit, cpu);
 
-- 
2.26.2


^ permalink raw reply related

* [PATCH v2] tty: serial: cpm_uart: Fix behaviour for non existing GPIOs
From: Christophe Leroy @ 2020-06-12 18:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Linus Walleij
  Cc: linuxppc-dev, linux-kernel, linux-serial

devm_gpiod_get_index() doesn't return NULL but -ENOENT when the
requested GPIO doesn't exist,  leading to the following messages:

[    2.742468] gpiod_direction_input: invalid GPIO (errorpointer)
[    2.748147] can't set direction for gpio #2: -2
[    2.753081] gpiod_direction_input: invalid GPIO (errorpointer)
[    2.758724] can't set direction for gpio #3: -2
[    2.763666] gpiod_direction_output: invalid GPIO (errorpointer)
[    2.769394] can't set direction for gpio #4: -2
[    2.774341] gpiod_direction_input: invalid GPIO (errorpointer)
[    2.779981] can't set direction for gpio #5: -2
[    2.784545] ff000a20.serial: ttyCPM1 at MMIO 0xfff00a20 (irq = 39, base_baud = 8250000) is a CPM UART

Use devm_gpiod_get_index_optional() instead.

At the same time, handle the error case and properly exit
with an error.

Fixes: 97cbaf2c829b ("tty: serial: cpm_uart: Convert to use GPIO descriptors")
Cc: stable@vger.kernel.org
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v2: Using devm_gpiod_get_index_optional() and exiting if error
---
 drivers/tty/serial/cpm_uart/cpm_uart_core.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
index a04f74d2e854..4df47d02b34b 100644
--- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
+++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
@@ -1215,7 +1215,12 @@ static int cpm_uart_init_port(struct device_node *np,
 
 		pinfo->gpios[i] = NULL;
 
-		gpiod = devm_gpiod_get_index(dev, NULL, i, GPIOD_ASIS);
+		gpiod = devm_gpiod_get_index_optional(dev, NULL, i, GPIOD_ASIS);
+
+		if (IS_ERR(gpiod)) {
+			ret = PTR_ERR(gpiod);
+			goto out_irq;
+		}
 
 		if (gpiod) {
 			if (i == GPIO_RTS || i == GPIO_DTR)
@@ -1237,6 +1242,8 @@ static int cpm_uart_init_port(struct device_node *np,
 
 	return cpm_uart_request_port(&pinfo->port);
 
+out_irq:
+	irq_dispose_mapping(pinfo->port.irq);
 out_pram:
 	cpm_uart_unmap_pram(pinfo, pram);
 out_mem:
-- 
2.25.0


^ permalink raw reply related

* [PATCH net] ibmvnic: Harden device login requests
From: Thomas Falcon @ 2020-06-12 18:31 UTC (permalink / raw)
  To: netdev; +Cc: Thomas Falcon, linuxppc-dev, danymadden

The VNIC driver's "login" command sequence is the final step
in the driver's initialization process with device firmware,
confirming the available device queue resources to be utilized
by the driver. Under high system load, firmware may not respond
to the request in a timely manner or may abort the request. In
such cases, the driver should reattempt the login command
sequence. In case of a device error, the number of retries
is bounded.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 197dc5b2c090..1cb2b7f3b2cb 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -841,13 +841,14 @@ static int ibmvnic_login(struct net_device *netdev)
 {
 	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
 	unsigned long timeout = msecs_to_jiffies(30000);
+	int retries = 10;
 	int retry_count = 0;
 	bool retry;
 	int rc;
 
 	do {
 		retry = false;
-		if (retry_count > IBMVNIC_MAX_QUEUES) {
+		if (retry_count > retries) {
 			netdev_warn(netdev, "Login attempts exceeded\n");
 			return -1;
 		}
@@ -862,11 +863,23 @@ static int ibmvnic_login(struct net_device *netdev)
 
 		if (!wait_for_completion_timeout(&adapter->init_done,
 						 timeout)) {
-			netdev_warn(netdev, "Login timed out\n");
-			return -1;
+			netdev_warn(netdev, "Login timed out, retrying...\n");
+			retry = true;
+			adapter->init_done_rc = 0;
+			retry_count++;
+			continue;
 		}
 
-		if (adapter->init_done_rc == PARTIALSUCCESS) {
+		if (adapter->init_done_rc == ABORTED) {
+			netdev_warn(netdev, "Login aborted, retrying...\n");
+			retry = true;
+			adapter->init_done_rc = 0;
+			retry_count++;
+			/* FW or device may be busy, so
+			 * wait a bit before retrying login
+			 */
+			msleep(500);
+		} else if (adapter->init_done_rc == PARTIALSUCCESS) {
 			retry_count++;
 			release_sub_crqs(adapter, 1);
 
-- 
2.18.1


^ permalink raw reply related

* [PATCH net] ibmvnic: Flush existing work items before device removal
From: Thomas Falcon @ 2020-06-12 18:34 UTC (permalink / raw)
  To: netdev; +Cc: Thomas Falcon, linuxppc-dev, danymadden

Ensure that all scheduled work items have completed before continuing
with device removal and after further event scheduling has been
halted. This patch fixes a bug where a scheduled driver reset event
is processed following device removal.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 1cb2b7f3b2cb..a66fa75976d3 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -5197,6 +5197,9 @@ static int ibmvnic_remove(struct vio_dev *dev)
 	adapter->state = VNIC_REMOVING;
 	spin_unlock_irqrestore(&adapter->state_lock, flags);
 
+	flush_work(&adapter->ibmvnic_reset);
+	flush_delayed_work(&adapter->ibmvnic_delayed_reset);
+
 	rtnl_lock();
 	unregister_netdevice(netdev);
 
-- 
2.18.1


^ permalink raw reply related

* Re: [PATCH] ASoC: fsl_ssi: Fix bclk calculation for mono channel
From: Nicolin Chen @ 2020-06-12 19:29 UTC (permalink / raw)
  To: Shengjiu Wang
  Cc: alsa-devel, timur, Xiubo.Lee, linuxppc-dev, tiwai, perex, broonie,
	festevam, linux-kernel
In-Reply-To: <1591690768-1691-1-git-send-email-shengjiu.wang@nxp.com>

On Tue, Jun 09, 2020 at 04:19:28PM +0800, Shengjiu Wang wrote:
> For mono channel, ssi will switch to normal mode. In normal
> mode, the Word Length Control bits control the word length
> divider in clock generator, which is different with I2S master
> mode, the word length is fixed to 32bit.
> 
> So we refine the famula for mono channel, otherwise there
> will be sound issue for S24_LE.
> 
> Fixes: b0a7043d5c2c ("ASoC: fsl_ssi: Caculate bit clock rate using slot number and width")
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> ---
>  sound/soc/fsl/fsl_ssi.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index bad89b0d129e..e347776590f7 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -695,6 +695,11 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream *substream,
>  	/* Generate bit clock based on the slot number and slot width */
>  	freq = slots * slot_width * params_rate(hw_params);
>  
> +	/* The slot_width is not fixed to 32 for normal mode */
> +	if (params_channels(hw_params) == 1)

This function has a local variable that you can reuse here:
	unsigned int slots = params_channels(hw_params);

> +		freq = (slots <= 1 ? 2 : slots) * params_width(hw_params) *
> +		       params_rate(hw_params);

We have a small section of slots and slot_width calculation
at the top of this function where we can squash these in.

^ permalink raw reply

* Re: [PATCH net] ibmvnic: Harden device login requests
From: David Miller @ 2020-06-12 21:10 UTC (permalink / raw)
  To: tlfalcon; +Cc: netdev, linuxppc-dev, danymadden
In-Reply-To: <1591986699-19484-1-git-send-email-tlfalcon@linux.ibm.com>

From: Thomas Falcon <tlfalcon@linux.ibm.com>
Date: Fri, 12 Jun 2020 13:31:39 -0500

> @@ -841,13 +841,14 @@ static int ibmvnic_login(struct net_device *netdev)
>  {
>  	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
>  	unsigned long timeout = msecs_to_jiffies(30000);
> +	int retries = 10;
>  	int retry_count = 0;
>  	bool retry;
>  	int rc;

Reverse christmas tree, please.

^ permalink raw reply

* Re: [PATCH net] ibmvnic: Flush existing work items before device removal
From: David Miller @ 2020-06-12 21:11 UTC (permalink / raw)
  To: tlfalcon; +Cc: netdev, linuxppc-dev, danymadden
In-Reply-To: <1591986881-19624-1-git-send-email-tlfalcon@linux.ibm.com>

From: Thomas Falcon <tlfalcon@linux.ibm.com>
Date: Fri, 12 Jun 2020 13:34:41 -0500

> Ensure that all scheduled work items have completed before continuing
> with device removal and after further event scheduling has been
> halted. This patch fixes a bug where a scheduled driver reset event
> is processed following device removal.
> 
> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
From: Nick Desaulniers @ 2020-06-12 21:33 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christophe Leroy, Michael Ellerman, LKML, Nicholas Piggin,
	clang-built-linux, Paul Mackerras, linuxppc-dev
In-Reply-To: <20200611235256.GL31009@gate.crashing.org>

On Thu, Jun 11, 2020 at 4:53 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Thu, Jun 11, 2020 at 03:43:55PM -0700, Nick Desaulniers wrote:
> > Segher, Cristophe, I suspect Clang is missing support for the %L and %U
> > output templates [1].
>
> The arch/powerpc kernel first used the %U output modifier in 0c176fa80fdf
> (from 2016), and %L in b8b572e1015f (2008).  include/asm-ppc (and ppc64)
> have had %U since 2005 (1da177e4c3f4), and %L as well (0c541b4406a6).

Thanks for all the references.  So it looks like we should have failed
sooner, if we didn't support those. Hmm...

> > Can you please point me to documentation/unit tests/source for
> > these so that I can figure out what they should be doing, and look into
> > implementing them in Clang?
>
> The PowerPC part of
> https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints
> (sorry, no anchor) documents %U.

I thought those were constraints, not output templates?  Oh,
    The asm statement must also use %U<opno> as a placeholder for the
    “update” flag in the corresponding load or store instruction.
got it.

>
> Traditionally the source code is the documentation for this.  The code
> here starts with the comment
>       /* Write second word of DImode or DFmode reference.  Works on register
>          or non-indexed memory only.  */
> (which is very out-of-date itself, it works fine for e.g. TImode as well,
> but alas).
>
> Unit tests are completely unsuitable for most compiler things like this.

What? No, surely one may write tests for output operands.  Grepping
for `%L` in gcc/ was less fun than I was hoping.

>
> The source code is gcc/config/rs6000/rs6000.c, easiest is to search for
> 'L' (with those quotes).  Function print_operand.
>
> HtH,

Yes, perfect, thank you so much!  So it looks like LLVM does not yet
handle %L properly for memory operands.
https://bugs.llvm.org/show_bug.cgi?id=46186#c4
It's neat to see how this is implemented in GCC (and how many aren't
implemented in LLVM, yikes :( ).  For reference, this is implemented
in PPCAsmPrinter::PrintAsmOperand() and
PPCAsmPrinter::PrintAsmMemoryOperand() in
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp.  GCC switches first on the
modifier characters, then the operand type. LLVM dispatches on operand
type, then modifier.  When I was looking into LLVM's AsmPrinter class,
I was surprised to see it's basically an assembler that just has
complex logic to just do a bunch of prints, so it makes sense to see
that pattern in GCC literally calling printf.  Not drastically
different than my first toy compiler
https://nickdesaulniers.github.io/blog/2015/05/25/interpreter-compiler-jit/
(looking back at that post now knowing what relocations are, I feel I
should probably add a note that that's a problem that's being solved
there.  Didn't know it at the time).

Some things I don't understand from PPC parlance is the "mode"
(preinc, predec, premodify) and small data operands?

IIUC the bug report correctly, it looks like LLVM is failing for the
__put_user_asm2_goto case for -m32.  A simple reproducer:
https://godbolt.org/z/jBBF9b

void foo(long long in, long long* out) {
asm volatile(
  "stw%X1 %0, %1\n\t"
  "stw%X1 %L0, %L1"
  ::"r"(in), "m"(*out));
}
prints (in GCC):
foo:
  stw 3, 0(5)
  stw 4, 4(5)
  blr
(first time looking at ppc assembler, seems constants and registers
are not as easy to distinguish,
https://developer.ibm.com/technologies/linux/articles/l-ppc/ say "Get
used to it." LOL, ok).
so that's "store word from register 3 into dereference of register 5
plus 0, then store word from register 4 into dereference of register 5
plus 4?"  Guessing the ppc32 abi is ILP32 putting long long's into two
separate registers?
Seems easy to implement in LLVM (short of those modes/small data operands).
https://reviews.llvm.org/D81767
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply

* Re: [PATCH kernel] KVM: PPC: Fix nested guest RC bits update
From: Michael Ellerman @ 2020-06-12 22:49 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev
  Cc: Aneesh Kumar K.V, kvm-ppc, David Gibson
In-Reply-To: <20200611030559.75257-1-aik@ozlabs.ru>

On Thu, 11 Jun 2020 13:05:59 +1000, Alexey Kardashevskiy wrote:
> Before commit 6cdf30375f82 ("powerpc/kvm/book3s: Use kvm helpers
> to walk shadow or secondary table") we called __find_linux_pte() with
> a page table pointer from a kvm_nested_guest struct but
> now we rely on kvmhv_find_nested() which takes an L1 LPID and returns
> a kvm_nested_guest pointer, however we pass a L0 LPID there and
> the L2 guest hangs.
> 
> [...]

Applied to powerpc/fixes.

[1/1] KVM: PPC: Fix nested guest RC bits update
      https://git.kernel.org/powerpc/c/e881bfaf5a5f409390973e076333281465f2b0d9

cheers

^ permalink raw reply

* [PATCH v2 00/12] x86: tag application address space for devices
From: Fenghua Yu @ 2020-06-13  0:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
	Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
	Ravi V Shankar
  Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev

Typical hardware devices require a driver stack to translate application
buffers to hardware addresses, and a kernel-user transition to notify the
hardware of new work. What if both the translation and transition overhead
could be eliminated? This is what Shared Virtual Address (SVA) and ENQCMD
enabled hardware like Data Streaming Accelerator (DSA) aims to achieve.
Applications map portals in their local-address-space and directly submit
work to them using a new instruction.

This series enables ENQCMD and associated management of the new MSR
(MSR_IA32_PASID). This new MSR allows an application address space to be
associated with what the PCIe spec calls a Process Address Space ID (PASID).
This PASID tag is carried along with all requests between applications and
devices and allows devices to interact with the process address space.

SVA and ENQCMD enabled device drivers need this series. The phase 2 DSA
patches with SVA and ENQCMD support was released on the top of this series:
https://lore.kernel.org/patchwork/cover/1244060/

This series only provides simple and basic support for ENQCMD and the MSR:
1. Clean up type definitions (patch 1-3). These patches can be in a
   separate series.
   - Define "pasid" as "unsigned int" consistently (patch 1 and 2).
   - Define "flags" as "unsigned int"
2. Explain different various technical terms used in the series (patch 4).
3. Enumerate support for ENQCMD in the processor (patch 5).
4. Handle FPU PASID state and the MSR during context switch (patches 6-7).
5. Define "pasid" in mm_struct (patch 8).
5. Clear PASID state for new mm and forked and cloned thread (patch 9-10).
6. Allocate and free PASID for a process (patch 11).
7. Fix up the PASID MSR in #GP handler when one thread in a process
   executes ENQCMD for the first time (patches 12).

This patch series and the DSA phase 2 series are in
https://github.com/intel/idxd-driver/tree/idxd-stage2

References:
1. Detailed information on the ENQCMD/ENQCMDS instructions and the
IA32_PASID MSR can be found in Intel Architecture Instruction Set
Extensions and Future Features Programming Reference:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

2. Detailed information on DSA can be found in DSA specification:
https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

Chang log:
v2:
- Add patches 1-3 to define "pasid" and "flags" as "unsigned int"
  consistently (Thomas)
  (these 3 patches could be in a separate patch set)
- Add patch 8 to move "pasid" to generic mm_struct (Christoph).
  Jean-Philippe Brucker released a virtually same patch. Upstream only
  needs one of the two.
- Add patch 9 to initialize PASID in a new mm.
- Plus other changes described in each patch (Thomas)

Ashok Raj (1):
  docs: x86: Add documentation for SVA (Shared Virtual Addressing)

Fenghua Yu (10):
  iommu: Change type of pasid to unsigned int
  ocxl: Change type of pasid to unsigned int
  iommu/vt-d: Change flags type to unsigned int in binding mm
  x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
  x86/msr-index: Define IA32_PASID MSR
  mm: Define pasid in mm
  fork: Clear PASID for new mm
  x86/process: Clear PASID state for a newly forked/cloned thread
  x86/mmu: Allocate/free PASID
  x86/traps: Fix up invalid PASID

Yu-cheng Yu (1):
  x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature

 Documentation/x86/index.rst            |   1 +
 Documentation/x86/sva.rst              | 287 +++++++++++++++++++++++++
 arch/x86/include/asm/cpufeatures.h     |   1 +
 arch/x86/include/asm/fpu/types.h       |  10 +
 arch/x86/include/asm/fpu/xstate.h      |   2 +-
 arch/x86/include/asm/iommu.h           |   3 +
 arch/x86/include/asm/mmu_context.h     |  14 ++
 arch/x86/include/asm/msr-index.h       |   3 +
 arch/x86/kernel/cpu/cpuid-deps.c       |   1 +
 arch/x86/kernel/fpu/xstate.c           |   4 +
 arch/x86/kernel/process.c              |  18 ++
 arch/x86/kernel/traps.c                |  23 ++
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.c |   5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |   2 +-
 drivers/iommu/amd/amd_iommu.h          |  13 +-
 drivers/iommu/amd/amd_iommu_types.h    |  12 +-
 drivers/iommu/amd/init.c               |   4 +-
 drivers/iommu/amd/iommu.c              |  41 ++--
 drivers/iommu/amd/iommu_v2.c           |  22 +-
 drivers/iommu/intel/debugfs.c          |   2 +-
 drivers/iommu/intel/dmar.c             |  13 +-
 drivers/iommu/intel/intel-pasid.h      |  21 +-
 drivers/iommu/intel/iommu.c            |   4 +-
 drivers/iommu/intel/pasid.c            |  36 ++--
 drivers/iommu/intel/svm.c              | 159 ++++++++++++--
 drivers/iommu/iommu.c                  |   2 +-
 drivers/misc/ocxl/config.c             |   3 +-
 drivers/misc/ocxl/link.c               |   6 +-
 drivers/misc/ocxl/ocxl_internal.h      |   6 +-
 drivers/misc/ocxl/pasid.c              |   2 +-
 drivers/misc/ocxl/trace.h              |  20 +-
 drivers/misc/uacce/uacce.c             |   2 +-
 include/linux/amd-iommu.h              |   9 +-
 include/linux/intel-iommu.h            |  20 +-
 include/linux/intel-svm.h              |   2 +-
 include/linux/iommu.h                  |   8 +-
 include/linux/mm_types.h               |   6 +
 include/linux/uacce.h                  |   2 +-
 include/misc/ocxl.h                    |   6 +-
 kernel/fork.c                          |   8 +
 40 files changed, 656 insertions(+), 147 deletions(-)
 create mode 100644 Documentation/x86/sva.rst

-- 
2.19.1


^ permalink raw reply

* [PATCH v2 02/12] ocxl: Change type of pasid to unsigned int
From: Fenghua Yu @ 2020-06-13  0:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
	Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
	Ravi V Shankar
  Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>

PASID is defined as "int" although it's a 20-bit value and shouldn't be
negative int. To be consistent with type defined in iommu, define PASID
as "unsigned int".

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Create this new patch to define PASID as "unsigned int" consistently in
  ocxl (Thomas)

 drivers/misc/ocxl/config.c        |  3 ++-
 drivers/misc/ocxl/link.c          |  6 +++---
 drivers/misc/ocxl/ocxl_internal.h |  6 +++---
 drivers/misc/ocxl/pasid.c         |  2 +-
 drivers/misc/ocxl/trace.h         | 20 ++++++++++----------
 include/misc/ocxl.h               |  6 +++---
 6 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index c8e19bfb5ef9..22d034caed3d 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -806,7 +806,8 @@ int ocxl_config_set_TL(struct pci_dev *dev, int tl_dvsec)
 }
 EXPORT_SYMBOL_GPL(ocxl_config_set_TL);
 
-int ocxl_config_terminate_pasid(struct pci_dev *dev, int afu_control, int pasid)
+int ocxl_config_terminate_pasid(struct pci_dev *dev, int afu_control,
+				unsigned int pasid)
 {
 	u32 val;
 	unsigned long timeout;
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index 58d111afd9f6..931f6ae022db 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -492,7 +492,7 @@ static u64 calculate_cfg_state(bool kernel)
 	return state;
 }
 
-int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
+int ocxl_link_add_pe(void *link_handle, unsigned int pasid, u32 pidr, u32 tidr,
 		u64 amr, struct mm_struct *mm,
 		void (*xsl_err_cb)(void *data, u64 addr, u64 dsisr),
 		void *xsl_err_data)
@@ -572,7 +572,7 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
 }
 EXPORT_SYMBOL_GPL(ocxl_link_add_pe);
 
-int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid)
+int ocxl_link_update_pe(void *link_handle, unsigned int pasid, __u16 tid)
 {
 	struct ocxl_link *link = (struct ocxl_link *) link_handle;
 	struct spa *spa = link->spa;
@@ -608,7 +608,7 @@ int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid)
 	return rc;
 }
 
-int ocxl_link_remove_pe(void *link_handle, int pasid)
+int ocxl_link_remove_pe(void *link_handle, unsigned int pasid)
 {
 	struct ocxl_link *link = (struct ocxl_link *) link_handle;
 	struct spa *spa = link->spa;
diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
index 345bf843a38e..3ca982ba7472 100644
--- a/drivers/misc/ocxl/ocxl_internal.h
+++ b/drivers/misc/ocxl/ocxl_internal.h
@@ -41,7 +41,7 @@ struct ocxl_afu {
 	struct ocxl_afu_config config;
 	int pasid_base;
 	int pasid_count; /* opened contexts */
-	int pasid_max; /* maximum number of contexts */
+	unsigned int pasid_max; /* maximum number of contexts */
 	int actag_base;
 	int actag_enabled;
 	struct mutex contexts_lock;
@@ -69,7 +69,7 @@ struct ocxl_xsl_error {
 
 struct ocxl_context {
 	struct ocxl_afu *afu;
-	int pasid;
+	unsigned int pasid;
 	struct mutex status_mutex;
 	enum ocxl_context_status status;
 	struct address_space *mapping;
@@ -128,7 +128,7 @@ int ocxl_config_check_afu_index(struct pci_dev *dev,
  * pasid: the PASID for the AFU context
  * tid: the new thread id for the process element
  */
-int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid);
+int ocxl_link_update_pe(void *link_handle, unsigned int pasid, __u16 tid);
 
 int ocxl_context_mmap(struct ocxl_context *ctx,
 			struct vm_area_struct *vma);
diff --git a/drivers/misc/ocxl/pasid.c b/drivers/misc/ocxl/pasid.c
index d14cb56e6920..a151fc8f0bec 100644
--- a/drivers/misc/ocxl/pasid.c
+++ b/drivers/misc/ocxl/pasid.c
@@ -80,7 +80,7 @@ static void range_free(struct list_head *head, u32 start, u32 size,
 
 int ocxl_pasid_afu_alloc(struct ocxl_fn *fn, u32 size)
 {
-	int max_pasid;
+	unsigned int max_pasid;
 
 	if (fn->config.max_pasid_log < 0)
 		return -ENOSPC;
diff --git a/drivers/misc/ocxl/trace.h b/drivers/misc/ocxl/trace.h
index 17e21cb2addd..019e2fc63b1d 100644
--- a/drivers/misc/ocxl/trace.h
+++ b/drivers/misc/ocxl/trace.h
@@ -9,13 +9,13 @@
 #include <linux/tracepoint.h>
 
 DECLARE_EVENT_CLASS(ocxl_context,
-	TP_PROTO(pid_t pid, void *spa, int pasid, u32 pidr, u32 tidr),
+	TP_PROTO(pid_t pid, void *spa, unsigned int pasid, u32 pidr, u32 tidr),
 	TP_ARGS(pid, spa, pasid, pidr, tidr),
 
 	TP_STRUCT__entry(
 		__field(pid_t, pid)
 		__field(void*, spa)
-		__field(int, pasid)
+		__field(unsigned int, pasid)
 		__field(u32, pidr)
 		__field(u32, tidr)
 	),
@@ -38,21 +38,21 @@ DECLARE_EVENT_CLASS(ocxl_context,
 );
 
 DEFINE_EVENT(ocxl_context, ocxl_context_add,
-	TP_PROTO(pid_t pid, void *spa, int pasid, u32 pidr, u32 tidr),
+	TP_PROTO(pid_t pid, void *spa, unsigned int pasid, u32 pidr, u32 tidr),
 	TP_ARGS(pid, spa, pasid, pidr, tidr)
 );
 
 DEFINE_EVENT(ocxl_context, ocxl_context_remove,
-	TP_PROTO(pid_t pid, void *spa, int pasid, u32 pidr, u32 tidr),
+	TP_PROTO(pid_t pid, void *spa, unsigned int pasid, u32 pidr, u32 tidr),
 	TP_ARGS(pid, spa, pasid, pidr, tidr)
 );
 
 TRACE_EVENT(ocxl_terminate_pasid,
-	TP_PROTO(int pasid, int rc),
+	TP_PROTO(unsigned int pasid, int rc),
 	TP_ARGS(pasid, rc),
 
 	TP_STRUCT__entry(
-		__field(int, pasid)
+		__field(unsigned int, pasid)
 		__field(int, rc)
 	),
 
@@ -107,11 +107,11 @@ DEFINE_EVENT(ocxl_fault_handler, ocxl_fault_ack,
 );
 
 TRACE_EVENT(ocxl_afu_irq_alloc,
-	TP_PROTO(int pasid, int irq_id, unsigned int virq, int hw_irq),
+	TP_PROTO(unsigned int pasid, int irq_id, unsigned int virq, int hw_irq),
 	TP_ARGS(pasid, irq_id, virq, hw_irq),
 
 	TP_STRUCT__entry(
-		__field(int, pasid)
+		__field(unsigned int, pasid)
 		__field(int, irq_id)
 		__field(unsigned int, virq)
 		__field(int, hw_irq)
@@ -133,11 +133,11 @@ TRACE_EVENT(ocxl_afu_irq_alloc,
 );
 
 TRACE_EVENT(ocxl_afu_irq_free,
-	TP_PROTO(int pasid, int irq_id),
+	TP_PROTO(unsigned int pasid, int irq_id),
 	TP_ARGS(pasid, irq_id),
 
 	TP_STRUCT__entry(
-		__field(int, pasid)
+		__field(unsigned int, pasid)
 		__field(int, irq_id)
 	),
 
diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
index 06dd5839e438..5eca04c8da97 100644
--- a/include/misc/ocxl.h
+++ b/include/misc/ocxl.h
@@ -429,7 +429,7 @@ int ocxl_config_set_TL(struct pci_dev *dev, int tl_dvsec);
  * desired AFU. It can be found in the AFU configuration
  */
 int ocxl_config_terminate_pasid(struct pci_dev *dev,
-				int afu_control_offset, int pasid);
+				int afu_control_offset, unsigned int pasid);
 
 /*
  * Read the configuration space of a function and fill in a
@@ -466,7 +466,7 @@ void ocxl_link_release(struct pci_dev *dev, void *link_handle);
  * 'xsl_err_data' is an argument passed to the above callback, if
  * defined
  */
-int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
+int ocxl_link_add_pe(void *link_handle, unsigned int pasid, u32 pidr, u32 tidr,
 		u64 amr, struct mm_struct *mm,
 		void (*xsl_err_cb)(void *data, u64 addr, u64 dsisr),
 		void *xsl_err_data);
@@ -474,7 +474,7 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
 /*
  * Remove a Process Element from the Shared Process Area for a link
  */
-int ocxl_link_remove_pe(void *link_handle, int pasid);
+int ocxl_link_remove_pe(void *link_handle, unsigned int pasid);
 
 /*
  * Allocate an AFU interrupt associated to the link.
-- 
2.19.1


^ permalink raw reply related

* [PATCH v2 01/12] iommu: Change type of pasid to unsigned int
From: Fenghua Yu @ 2020-06-13  0:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
	Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
	Ravi V Shankar
  Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>

PASID is defined as a few different types in iommu including "int",
"u32", and "unsigned int". To be consistent and to match with ioasid's
type, define PASID and its variations (e.g. max PASID) as "unsigned int".

No PASID type change in uapi.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Create this new patch to define PASID as "unsigned int" consistently in
  iommu (Thomas)

 drivers/gpu/drm/amd/amdkfd/kfd_iommu.c |  5 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |  2 +-
 drivers/iommu/amd/amd_iommu.h          | 13 ++++----
 drivers/iommu/amd/amd_iommu_types.h    | 12 ++++----
 drivers/iommu/amd/init.c               |  4 +--
 drivers/iommu/amd/iommu.c              | 41 ++++++++++++++------------
 drivers/iommu/amd/iommu_v2.c           | 22 +++++++-------
 drivers/iommu/intel/debugfs.c          |  2 +-
 drivers/iommu/intel/dmar.c             | 13 ++++----
 drivers/iommu/intel/intel-pasid.h      | 21 ++++++-------
 drivers/iommu/intel/iommu.c            |  4 +--
 drivers/iommu/intel/pasid.c            | 36 +++++++++++-----------
 drivers/iommu/intel/svm.c              | 12 ++++----
 drivers/iommu/iommu.c                  |  2 +-
 drivers/misc/uacce/uacce.c             |  2 +-
 include/linux/amd-iommu.h              |  9 +++---
 include/linux/intel-iommu.h            | 18 +++++------
 include/linux/intel-svm.h              |  2 +-
 include/linux/iommu.h                  |  8 ++---
 include/linux/uacce.h                  |  2 +-
 20 files changed, 121 insertions(+), 109 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
index 7c8786b9eb0a..703d23deca76 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
@@ -139,7 +139,8 @@ void kfd_iommu_unbind_process(struct kfd_process *p)
 }
 
 /* Callback for process shutdown invoked by the IOMMU driver */
-static void iommu_pasid_shutdown_callback(struct pci_dev *pdev, int pasid)
+static void iommu_pasid_shutdown_callback(struct pci_dev *pdev,
+					  unsigned int pasid)
 {
 	struct kfd_dev *dev = kfd_device_by_pci_dev(pdev);
 	struct kfd_process *p;
@@ -185,7 +186,7 @@ static void iommu_pasid_shutdown_callback(struct pci_dev *pdev, int pasid)
 }
 
 /* This function called by IOMMU driver on PPR failure */
-static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
+static int iommu_invalid_ppr_cb(struct pci_dev *pdev, unsigned int pasid,
 		unsigned long address, u16 flags)
 {
 	struct kfd_dev *dev;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index f0587d94294d..3c7d1f774afe 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -714,7 +714,7 @@ struct kfd_process {
 	/* We want to receive a notification when the mm_struct is destroyed */
 	struct mmu_notifier mmu_notifier;
 
-	uint16_t pasid;
+	unsigned int pasid;
 	unsigned int doorbell_index;
 
 	/*
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index f892992c8744..0914b5b6f879 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -45,12 +45,13 @@ extern int amd_iommu_register_ppr_notifier(struct notifier_block *nb);
 extern int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb);
 extern void amd_iommu_domain_direct_map(struct iommu_domain *dom);
 extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
-extern int amd_iommu_flush_page(struct iommu_domain *dom, int pasid,
+extern int amd_iommu_flush_page(struct iommu_domain *dom, unsigned int pasid,
 				u64 address);
-extern int amd_iommu_flush_tlb(struct iommu_domain *dom, int pasid);
-extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
-				     unsigned long cr3);
-extern int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, int pasid);
+extern int amd_iommu_flush_tlb(struct iommu_domain *dom, unsigned int pasid);
+extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom,
+				     unsigned int pasid, unsigned long cr3);
+extern int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom,
+				       unsigned int pasid);
 extern struct iommu_domain *amd_iommu_get_v2_domain(struct pci_dev *pdev);
 
 #ifdef CONFIG_IRQ_REMAP
@@ -66,7 +67,7 @@ static inline int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
 #define PPR_INVALID			0x1
 #define PPR_FAILURE			0xf
 
-extern int amd_iommu_complete_ppr(struct pci_dev *pdev, int pasid,
+extern int amd_iommu_complete_ppr(struct pci_dev *pdev, unsigned int pasid,
 				  int status, int tag);
 
 static inline bool is_rd890_iommu(struct pci_dev *pdev)
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 30a5d412255a..d395931b2232 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -443,11 +443,11 @@ extern struct kmem_cache *amd_iommu_irq_cache;
  * incoming PPR faults around.
  */
 struct amd_iommu_fault {
-	u64 address;    /* IO virtual address of the fault*/
-	u32 pasid;      /* Address space identifier */
-	u16 device_id;  /* Originating PCI device id */
-	u16 tag;        /* PPR tag */
-	u16 flags;      /* Fault flags */
+	u64 address;		/* IO virtual address of the fault*/
+	unsigned int pasid;	/* Address space identifier */
+	u16 device_id;		/* Originating PCI device id */
+	u16 tag;		/* PPR tag */
+	u16 flags;		/* Fault flags */
 
 };
 
@@ -745,7 +745,7 @@ extern unsigned long *amd_iommu_pd_alloc_bitmap;
 extern bool amd_iommu_unmap_flush;
 
 /* Smallest max PASID supported by any IOMMU in the system */
-extern u32 amd_iommu_max_pasid;
+extern unsigned int amd_iommu_max_pasid;
 
 extern bool amd_iommu_v2_present;
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 6ebd4825e320..fbf5dd1d3eab 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -172,7 +172,7 @@ static int amd_iommus_present;
 bool amd_iommu_np_cache __read_mostly;
 bool amd_iommu_iotlb_sup __read_mostly = true;
 
-u32 amd_iommu_max_pasid __read_mostly = ~0;
+unsigned int amd_iommu_max_pasid __read_mostly = ~0;
 
 bool amd_iommu_v2_present __read_mostly;
 static bool amd_iommu_pc_present __read_mostly;
@@ -1750,7 +1750,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
 
 	if (iommu_feature(iommu, FEATURE_GT)) {
 		int glxval;
-		u32 max_pasid;
+		unsigned int max_pasid;
 		u64 pasmax;
 
 		pasmax = iommu->features & FEATURE_PASID_MASK;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 74cca1757172..32999adbbd46 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -502,7 +502,8 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id,
 static void iommu_print_event(struct amd_iommu *iommu, void *__evt)
 {
 	struct device *dev = iommu->iommu.dev;
-	int type, devid, pasid, flags, tag;
+	int type, devid, flags, tag;
+	unsigned int pasid;
 	volatile u32 *event = __evt;
 	int count = 0;
 	u64 address;
@@ -898,8 +899,8 @@ static void build_inv_iotlb_pages(struct iommu_cmd *cmd, u16 devid, int qdep,
 		cmd->data[2] |= CMD_INV_IOMMU_PAGES_SIZE_MASK;
 }
 
-static void build_inv_iommu_pasid(struct iommu_cmd *cmd, u16 domid, int pasid,
-				  u64 address, bool size)
+static void build_inv_iommu_pasid(struct iommu_cmd *cmd, u16 domid,
+				  unsigned int pasid, u64 address, bool size)
 {
 	memset(cmd, 0, sizeof(*cmd));
 
@@ -916,8 +917,9 @@ static void build_inv_iommu_pasid(struct iommu_cmd *cmd, u16 domid, int pasid,
 	CMD_SET_TYPE(cmd, CMD_INV_IOMMU_PAGES);
 }
 
-static void build_inv_iotlb_pasid(struct iommu_cmd *cmd, u16 devid, int pasid,
-				  int qdep, u64 address, bool size)
+static void build_inv_iotlb_pasid(struct iommu_cmd *cmd, u16 devid,
+				  unsigned int pasid, int qdep, u64 address,
+				  bool size)
 {
 	memset(cmd, 0, sizeof(*cmd));
 
@@ -936,8 +938,8 @@ static void build_inv_iotlb_pasid(struct iommu_cmd *cmd, u16 devid, int pasid,
 	CMD_SET_TYPE(cmd, CMD_INV_IOTLB_PAGES);
 }
 
-static void build_complete_ppr(struct iommu_cmd *cmd, u16 devid, int pasid,
-			       int status, int tag, bool gn)
+static void build_complete_ppr(struct iommu_cmd *cmd, u16 devid,
+			       unsigned int pasid, int status, int tag, bool gn)
 {
 	memset(cmd, 0, sizeof(*cmd));
 
@@ -2772,7 +2774,7 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
 }
 EXPORT_SYMBOL(amd_iommu_domain_enable_v2);
 
-static int __flush_pasid(struct protection_domain *domain, int pasid,
+static int __flush_pasid(struct protection_domain *domain, unsigned int pasid,
 			 u64 address, bool size)
 {
 	struct iommu_dev_data *dev_data;
@@ -2833,13 +2835,13 @@ static int __flush_pasid(struct protection_domain *domain, int pasid,
 	return ret;
 }
 
-static int __amd_iommu_flush_page(struct protection_domain *domain, int pasid,
-				  u64 address)
+static int __amd_iommu_flush_page(struct protection_domain *domain,
+				  unsigned int pasid, u64 address)
 {
 	return __flush_pasid(domain, pasid, address, false);
 }
 
-int amd_iommu_flush_page(struct iommu_domain *dom, int pasid,
+int amd_iommu_flush_page(struct iommu_domain *dom, unsigned int pasid,
 			 u64 address)
 {
 	struct protection_domain *domain = to_pdomain(dom);
@@ -2854,13 +2856,14 @@ int amd_iommu_flush_page(struct iommu_domain *dom, int pasid,
 }
 EXPORT_SYMBOL(amd_iommu_flush_page);
 
-static int __amd_iommu_flush_tlb(struct protection_domain *domain, int pasid)
+static int __amd_iommu_flush_tlb(struct protection_domain *domain,
+				 unsigned int pasid)
 {
 	return __flush_pasid(domain, pasid, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
 			     true);
 }
 
-int amd_iommu_flush_tlb(struct iommu_domain *dom, int pasid)
+int amd_iommu_flush_tlb(struct iommu_domain *dom, unsigned int pasid)
 {
 	struct protection_domain *domain = to_pdomain(dom);
 	unsigned long flags;
@@ -2874,7 +2877,7 @@ int amd_iommu_flush_tlb(struct iommu_domain *dom, int pasid)
 }
 EXPORT_SYMBOL(amd_iommu_flush_tlb);
 
-static u64 *__get_gcr3_pte(u64 *root, int level, int pasid, bool alloc)
+static u64 *__get_gcr3_pte(u64 *root, int level, unsigned int pasid, bool alloc)
 {
 	int index;
 	u64 *pte;
@@ -2906,7 +2909,7 @@ static u64 *__get_gcr3_pte(u64 *root, int level, int pasid, bool alloc)
 	return pte;
 }
 
-static int __set_gcr3(struct protection_domain *domain, int pasid,
+static int __set_gcr3(struct protection_domain *domain, unsigned int pasid,
 		      unsigned long cr3)
 {
 	struct domain_pgtable pgtable;
@@ -2925,7 +2928,7 @@ static int __set_gcr3(struct protection_domain *domain, int pasid,
 	return __amd_iommu_flush_tlb(domain, pasid);
 }
 
-static int __clear_gcr3(struct protection_domain *domain, int pasid)
+static int __clear_gcr3(struct protection_domain *domain, unsigned int pasid)
 {
 	struct domain_pgtable pgtable;
 	u64 *pte;
@@ -2943,7 +2946,7 @@ static int __clear_gcr3(struct protection_domain *domain, int pasid)
 	return __amd_iommu_flush_tlb(domain, pasid);
 }
 
-int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
+int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, unsigned int pasid,
 			      unsigned long cr3)
 {
 	struct protection_domain *domain = to_pdomain(dom);
@@ -2958,7 +2961,7 @@ int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
 }
 EXPORT_SYMBOL(amd_iommu_domain_set_gcr3);
 
-int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, int pasid)
+int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, unsigned int pasid)
 {
 	struct protection_domain *domain = to_pdomain(dom);
 	unsigned long flags;
@@ -2972,7 +2975,7 @@ int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, int pasid)
 }
 EXPORT_SYMBOL(amd_iommu_domain_clear_gcr3);
 
-int amd_iommu_complete_ppr(struct pci_dev *pdev, int pasid,
+int amd_iommu_complete_ppr(struct pci_dev *pdev, unsigned int pasid,
 			   int status, int tag)
 {
 	struct iommu_dev_data *dev_data;
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index e4b025c5637c..21dd53054acb 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -40,7 +40,7 @@ struct pasid_state {
 	struct mmu_notifier mn;                 /* mmu_notifier handle */
 	struct pri_queue pri[PRI_QUEUE_SIZE];	/* PRI tag states */
 	struct device_state *device_state;	/* Link to our device_state */
-	int pasid;				/* PASID index */
+	unsigned int pasid;			/* PASID index */
 	bool invalid;				/* Used during setup and
 						   teardown of the pasid */
 	spinlock_t lock;			/* Protect pri_queues and
@@ -70,7 +70,7 @@ struct fault {
 	struct mm_struct *mm;
 	u64 address;
 	u16 devid;
-	u16 pasid;
+	unsigned int pasid;
 	u16 tag;
 	u16 finish;
 	u16 flags;
@@ -150,7 +150,8 @@ static void put_device_state(struct device_state *dev_state)
 
 /* Must be called under dev_state->lock */
 static struct pasid_state **__get_pasid_state_ptr(struct device_state *dev_state,
-						  int pasid, bool alloc)
+						  unsigned int pasid,
+						  bool alloc)
 {
 	struct pasid_state **root, **ptr;
 	int level, index;
@@ -184,7 +185,7 @@ static struct pasid_state **__get_pasid_state_ptr(struct device_state *dev_state
 
 static int set_pasid_state(struct device_state *dev_state,
 			   struct pasid_state *pasid_state,
-			   int pasid)
+			   unsigned int pasid)
 {
 	struct pasid_state **ptr;
 	unsigned long flags;
@@ -211,7 +212,8 @@ static int set_pasid_state(struct device_state *dev_state,
 	return ret;
 }
 
-static void clear_pasid_state(struct device_state *dev_state, int pasid)
+static void clear_pasid_state(struct device_state *dev_state,
+			      unsigned int pasid)
 {
 	struct pasid_state **ptr;
 	unsigned long flags;
@@ -229,7 +231,7 @@ static void clear_pasid_state(struct device_state *dev_state, int pasid)
 }
 
 static struct pasid_state *get_pasid_state(struct device_state *dev_state,
-					   int pasid)
+					   unsigned int pasid)
 {
 	struct pasid_state **ptr, *ret = NULL;
 	unsigned long flags;
@@ -594,7 +596,7 @@ static struct notifier_block ppr_nb = {
 	.notifier_call = ppr_notifier,
 };
 
-int amd_iommu_bind_pasid(struct pci_dev *pdev, int pasid,
+int amd_iommu_bind_pasid(struct pci_dev *pdev, unsigned int pasid,
 			 struct task_struct *task)
 {
 	struct pasid_state *pasid_state;
@@ -615,7 +617,7 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, int pasid,
 		return -EINVAL;
 
 	ret = -EINVAL;
-	if (pasid < 0 || pasid >= dev_state->max_pasids)
+	if (pasid >= dev_state->max_pasids)
 		goto out;
 
 	ret = -ENOMEM;
@@ -679,7 +681,7 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, int pasid,
 }
 EXPORT_SYMBOL(amd_iommu_bind_pasid);
 
-void amd_iommu_unbind_pasid(struct pci_dev *pdev, int pasid)
+void amd_iommu_unbind_pasid(struct pci_dev *pdev, unsigned int pasid)
 {
 	struct pasid_state *pasid_state;
 	struct device_state *dev_state;
@@ -695,7 +697,7 @@ void amd_iommu_unbind_pasid(struct pci_dev *pdev, int pasid)
 	if (dev_state == NULL)
 		return;
 
-	if (pasid < 0 || pasid >= dev_state->max_pasids)
+	if (pasid >= dev_state->max_pasids)
 		goto out;
 
 	pasid_state = get_pasid_state(dev_state, pasid);
diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index cf1ebb98e418..f3b99ed8ee92 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -20,7 +20,7 @@
 struct tbl_walk {
 	u16 bus;
 	u16 devfn;
-	u32 pasid;
+	unsigned int pasid;
 	struct root_entry *rt_entry;
 	struct context_entry *ctx_entry;
 	struct pasid_entry *pasid_tbl_entry;
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index cc46dff98fa0..805c02584df5 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1395,8 +1395,8 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 }
 
 /* PASID-based IOTLB invalidation */
-void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
-		     unsigned long npages, bool ih)
+void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, unsigned int pasid,
+		     u64 addr, unsigned long npages, bool ih)
 {
 	struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
 
@@ -1437,7 +1437,7 @@ void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
 
 /* PASID-based device IOTLB Invalidate */
 void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
-			      u32 pasid,  u16 qdep, u64 addr,
+			      unsigned int pasid,  u16 qdep, u64 addr,
 			      unsigned int size_order, u64 granu)
 {
 	unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size_order - 1);
@@ -1465,7 +1465,7 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 }
 
 void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did,
-			  u64 granu, int pasid)
+			  u64 granu, unsigned int pasid)
 {
 	struct qi_desc desc = {.qw1 = 0, .qw2 = 0, .qw3 = 0};
 
@@ -1779,7 +1779,7 @@ void dmar_msi_read(int irq, struct msi_msg *msg)
 }
 
 static int dmar_fault_do_one(struct intel_iommu *iommu, int type,
-		u8 fault_reason, int pasid, u16 source_id,
+		u8 fault_reason, unsigned int pasid, u16 source_id,
 		unsigned long long addr)
 {
 	const char *reason;
@@ -1826,10 +1826,11 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
 	while (1) {
 		/* Disable printing, simply clear the fault when ratelimited */
 		bool ratelimited = !__ratelimit(&rs);
+		unsigned int pasid;
 		u8 fault_reason;
 		u16 source_id;
 		u64 guest_addr;
-		int type, pasid;
+		int type;
 		u32 data;
 		bool pasid_present;
 
diff --git a/drivers/iommu/intel/intel-pasid.h b/drivers/iommu/intel/intel-pasid.h
index c5318d40e0fa..fe9e9ac3b97e 100644
--- a/drivers/iommu/intel/intel-pasid.h
+++ b/drivers/iommu/intel/intel-pasid.h
@@ -72,7 +72,7 @@ struct pasid_entry {
 struct pasid_table {
 	void			*table;		/* pasid table pointer */
 	int			order;		/* page order of pasid table */
-	int			max_pasid;	/* max pasid */
+	unsigned int		max_pasid;	/* max pasid */
 	struct list_head	dev;		/* device list */
 };
 
@@ -98,30 +98,31 @@ static inline bool pasid_pte_is_present(struct pasid_entry *pte)
 	return READ_ONCE(pte->val[0]) & PASID_PTE_PRESENT;
 }
 
-extern u32 intel_pasid_max_id;
+extern unsigned int intel_pasid_max_id;
 int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp);
-void intel_pasid_free_id(int pasid);
-void *intel_pasid_lookup_id(int pasid);
+void intel_pasid_free_id(unsigned int pasid);
+void *intel_pasid_lookup_id(unsigned int pasid);
 int intel_pasid_alloc_table(struct device *dev);
 void intel_pasid_free_table(struct device *dev);
 struct pasid_table *intel_pasid_get_table(struct device *dev);
 int intel_pasid_get_dev_max_id(struct device *dev);
-struct pasid_entry *intel_pasid_get_entry(struct device *dev, int pasid);
+struct pasid_entry *intel_pasid_get_entry(struct device *dev,
+					  unsigned int pasid);
 int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 				  struct device *dev, pgd_t *pgd,
-				  int pasid, u16 did, int flags);
+				  unsigned int pasid, u16 did, int flags);
 int intel_pasid_setup_second_level(struct intel_iommu *iommu,
 				   struct dmar_domain *domain,
-				   struct device *dev, int pasid);
+				   struct device *dev, unsigned int pasid);
 int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
 				   struct dmar_domain *domain,
-				   struct device *dev, int pasid);
+				   struct device *dev, unsigned int pasid);
 int intel_pasid_setup_nested(struct intel_iommu *iommu,
-			     struct device *dev, pgd_t *pgd, int pasid,
+			     struct device *dev, pgd_t *pgd, unsigned int pasid,
 			     struct iommu_gpasid_bind_data_vtd *pasid_data,
 			     struct dmar_domain *domain, int addr_width);
 void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
-				 struct device *dev, int pasid,
+				 struct device *dev, unsigned int pasid,
 				 bool fault_ignore);
 int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid);
 void vcmd_free_pasid(struct intel_iommu *iommu, unsigned int pasid);
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9129663a7406..5a6f85d645c6 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2469,7 +2469,7 @@ dmar_search_domain_by_dev_info(int segment, int bus, int devfn)
 static int domain_setup_first_level(struct intel_iommu *iommu,
 				    struct dmar_domain *domain,
 				    struct device *dev,
-				    int pasid)
+				    unsigned int pasid)
 {
 	int flags = PASID_FLAG_SUPERVISOR_MODE;
 	struct dma_pte *pgd = domain->pgd;
@@ -5147,7 +5147,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 		return -ENODEV;
 
 	if (domain->default_pasid <= 0) {
-		int pasid;
+		unsigned int pasid;
 
 		/* No private data needed for the default pasid */
 		pasid = ioasid_alloc(NULL, PASID_MIN,
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index c81f0f17c6ba..8a98b68c1c87 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -25,7 +25,7 @@
  * Intel IOMMU system wide PASID name space:
  */
 static DEFINE_SPINLOCK(pasid_lock);
-u32 intel_pasid_max_id = PASID_MAX;
+unsigned int intel_pasid_max_id = PASID_MAX;
 
 int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid)
 {
@@ -146,7 +146,7 @@ int intel_pasid_alloc_table(struct device *dev)
 	struct pasid_table *pasid_table;
 	struct pasid_table_opaque data;
 	struct page *pages;
-	int max_pasid = 0;
+	unsigned int max_pasid = 0;
 	int ret, order;
 	int size;
 
@@ -168,7 +168,7 @@ int intel_pasid_alloc_table(struct device *dev)
 	INIT_LIST_HEAD(&pasid_table->dev);
 
 	if (info->pasid_supported)
-		max_pasid = min_t(int, pci_max_pasids(to_pci_dev(dev)),
+		max_pasid = min_t(unsigned int, pci_max_pasids(to_pci_dev(dev)),
 				  intel_pasid_max_id);
 
 	size = max_pasid >> (PASID_PDE_SHIFT - 3);
@@ -242,7 +242,8 @@ int intel_pasid_get_dev_max_id(struct device *dev)
 	return info->pasid_table->max_pasid;
 }
 
-struct pasid_entry *intel_pasid_get_entry(struct device *dev, int pasid)
+struct pasid_entry *intel_pasid_get_entry(struct device *dev,
+					  unsigned int pasid)
 {
 	struct device_domain_info *info;
 	struct pasid_table *pasid_table;
@@ -251,8 +252,7 @@ struct pasid_entry *intel_pasid_get_entry(struct device *dev, int pasid)
 	int dir_index, index;
 
 	pasid_table = intel_pasid_get_table(dev);
-	if (WARN_ON(!pasid_table || pasid < 0 ||
-		    pasid >= intel_pasid_get_dev_max_id(dev)))
+	if (WARN_ON(!pasid_table || pasid >= intel_pasid_get_dev_max_id(dev)))
 		return NULL;
 
 	dir = pasid_table->table;
@@ -305,7 +305,8 @@ static inline void pasid_clear_entry_with_fpd(struct pasid_entry *pe)
 }
 
 static void
-intel_pasid_clear_entry(struct device *dev, int pasid, bool fault_ignore)
+intel_pasid_clear_entry(struct device *dev, unsigned int pasid,
+			bool fault_ignore)
 {
 	struct pasid_entry *pe;
 
@@ -444,7 +445,7 @@ pasid_set_eafe(struct pasid_entry *pe)
 
 static void
 pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
-				    u16 did, int pasid)
+				    u16 did, unsigned int pasid)
 {
 	struct qi_desc desc;
 
@@ -458,7 +459,8 @@ pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
 }
 
 static void
-iotlb_invalidation_with_pasid(struct intel_iommu *iommu, u16 did, u32 pasid)
+iotlb_invalidation_with_pasid(struct intel_iommu *iommu, u16 did,
+			      unsigned int pasid)
 {
 	struct qi_desc desc;
 
@@ -473,7 +475,7 @@ iotlb_invalidation_with_pasid(struct intel_iommu *iommu, u16 did, u32 pasid)
 
 static void
 devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
-			       struct device *dev, int pasid)
+			       struct device *dev, unsigned int pasid)
 {
 	struct device_domain_info *info;
 	u16 sid, qdep, pfsid;
@@ -490,7 +492,7 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
 }
 
 void intel_pasid_tear_down_entry(struct intel_iommu *iommu, struct device *dev,
-				 int pasid, bool fault_ignore)
+				 unsigned int pasid, bool fault_ignore)
 {
 	struct pasid_entry *pte;
 	u16 did;
@@ -514,8 +516,8 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu, struct device *dev,
 }
 
 static void pasid_flush_caches(struct intel_iommu *iommu,
-				struct pasid_entry *pte,
-				int pasid, u16 did)
+			       struct pasid_entry *pte,
+			       unsigned int pasid, u16 did)
 {
 	if (!ecap_coherent(iommu->ecap))
 		clflush_cache_range(pte, sizeof(*pte));
@@ -534,7 +536,7 @@ static void pasid_flush_caches(struct intel_iommu *iommu,
  */
 int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 				  struct device *dev, pgd_t *pgd,
-				  int pasid, u16 did, int flags)
+				  unsigned int pasid, u16 did, int flags)
 {
 	struct pasid_entry *pte;
 
@@ -607,7 +609,7 @@ static inline int iommu_skip_agaw(struct dmar_domain *domain,
  */
 int intel_pasid_setup_second_level(struct intel_iommu *iommu,
 				   struct dmar_domain *domain,
-				   struct device *dev, int pasid)
+				   struct device *dev, unsigned int pasid)
 {
 	struct pasid_entry *pte;
 	struct dma_pte *pgd;
@@ -665,7 +667,7 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
  */
 int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
 				   struct dmar_domain *domain,
-				   struct device *dev, int pasid)
+				   struct device *dev, unsigned int pasid)
 {
 	u16 did = FLPT_DEFAULT_DID;
 	struct pasid_entry *pte;
@@ -751,7 +753,7 @@ intel_pasid_setup_bind_data(struct intel_iommu *iommu, struct pasid_entry *pte,
  * @addr_width: Address width of the first level (guest)
  */
 int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
-			     pgd_t *gpgd, int pasid,
+			     pgd_t *gpgd, unsigned int pasid,
 			     struct iommu_gpasid_bind_data_vtd *pasid_data,
 			     struct dmar_domain *domain, int addr_width)
 {
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 6c87c807a0ab..b5618341b4b1 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -23,7 +23,7 @@
 #include "intel-pasid.h"
 
 static irqreturn_t prq_event_thread(int irq, void *d);
-static void intel_svm_drain_prq(struct device *dev, int pasid);
+static void intel_svm_drain_prq(struct device *dev, unsigned int pasid);
 
 #define PRQ_ORDER 0
 
@@ -371,7 +371,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 	return ret;
 }
 
-int intel_svm_unbind_gpasid(struct device *dev, int pasid)
+int intel_svm_unbind_gpasid(struct device *dev, unsigned int pasid)
 {
 	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
 	struct intel_svm_dev *sdev;
@@ -601,7 +601,7 @@ intel_svm_bind_mm(struct device *dev, int flags, struct svm_dev_ops *ops,
 }
 
 /* Caller must hold pasid_mutex */
-static int intel_svm_unbind_mm(struct device *dev, int pasid)
+static int intel_svm_unbind_mm(struct device *dev, unsigned int pasid)
 {
 	struct intel_svm_dev *sdev;
 	struct intel_iommu *iommu;
@@ -728,7 +728,7 @@ static bool is_canonical_address(u64 addr)
  * described in VT-d spec CH7.10 to drain all page requests and page
  * responses pending in the hardware.
  */
-static void intel_svm_drain_prq(struct device *dev, int pasid)
+static void intel_svm_drain_prq(struct device *dev, unsigned int pasid)
 {
 	struct device_domain_info *info;
 	struct dmar_domain *domain;
@@ -988,10 +988,10 @@ void intel_svm_unbind(struct iommu_sva *sva)
 	mutex_unlock(&pasid_mutex);
 }
 
-int intel_svm_get_pasid(struct iommu_sva *sva)
+unsigned int intel_svm_get_pasid(struct iommu_sva *sva)
 {
 	struct intel_svm_dev *sdev;
-	int pasid;
+	unsigned int pasid;
 
 	mutex_lock(&pasid_mutex);
 	sdev = to_intel_svm_dev(sva);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d43120eb1dc5..aba136ac377d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2828,7 +2828,7 @@ void iommu_sva_unbind_device(struct iommu_sva *handle)
 }
 EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
 
-int iommu_sva_get_pasid(struct iommu_sva *handle)
+unsigned int iommu_sva_get_pasid(struct iommu_sva *handle)
 {
 	const struct iommu_ops *ops = handle->dev->bus->iommu_ops;
 
diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index 107028e77ca3..8a3686e689dc 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -92,7 +92,7 @@ static long uacce_fops_compat_ioctl(struct file *filep,
 
 static int uacce_bind_queue(struct uacce_device *uacce, struct uacce_queue *q)
 {
-	int pasid;
+	unsigned int pasid;
 	struct iommu_sva *handle;
 
 	if (!(uacce->flags & UACCE_DEV_SVA))
diff --git a/include/linux/amd-iommu.h b/include/linux/amd-iommu.h
index 21e950e4ab62..25387ab5a4f5 100644
--- a/include/linux/amd-iommu.h
+++ b/include/linux/amd-iommu.h
@@ -76,7 +76,7 @@ extern void amd_iommu_free_device(struct pci_dev *pdev);
  *
  * The function returns 0 on success or a negative value on error.
  */
-extern int amd_iommu_bind_pasid(struct pci_dev *pdev, int pasid,
+extern int amd_iommu_bind_pasid(struct pci_dev *pdev, unsigned int pasid,
 				struct task_struct *task);
 
 /**
@@ -88,7 +88,7 @@ extern int amd_iommu_bind_pasid(struct pci_dev *pdev, int pasid,
  * When this function returns the device is no longer using the PASID
  * and the PASID is no longer bound to its task.
  */
-extern void amd_iommu_unbind_pasid(struct pci_dev *pdev, int pasid);
+extern void amd_iommu_unbind_pasid(struct pci_dev *pdev, unsigned int pasid);
 
 /**
  * amd_iommu_set_invalid_ppr_cb() - Register a call-back for failed
@@ -114,7 +114,7 @@ extern void amd_iommu_unbind_pasid(struct pci_dev *pdev, int pasid);
 #define AMD_IOMMU_INV_PRI_RSP_FAIL	2
 
 typedef int (*amd_iommu_invalid_ppr_cb)(struct pci_dev *pdev,
-					int pasid,
+					unsigned int pasid,
 					unsigned long address,
 					u16);
 
@@ -166,7 +166,8 @@ extern int amd_iommu_device_info(struct pci_dev *pdev,
  * @cb: The call-back function
  */
 
-typedef void (*amd_iommu_invalidate_ctx)(struct pci_dev *pdev, int pasid);
+typedef void (*amd_iommu_invalidate_ctx)(struct pci_dev *pdev,
+					 unsigned int pasid);
 
 extern int amd_iommu_set_invalidate_ctx_cb(struct pci_dev *pdev,
 					   amd_iommu_invalidate_ctx cb);
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 4100bd224f5c..44fa8879f829 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -549,7 +549,7 @@ struct dmar_domain {
 					   2 == 1GiB, 3 == 512GiB, 4 == 1TiB */
 	u64		max_addr;	/* maximum mapped address */
 
-	int		default_pasid;	/*
+	unsigned int	default_pasid;	/*
 					 * The default pasid used for non-SVM
 					 * traffic on mediated devices.
 					 */
@@ -699,14 +699,14 @@ extern void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
 extern void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 			u16 qdep, u64 addr, unsigned mask);
 
-void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
-		     unsigned long npages, bool ih);
+void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, unsigned int pasid,
+		     u64 addr, unsigned long npages, bool ih);
 
 void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
-			      u32 pasid, u16 qdep, u64 addr,
+			      unsigned int pasid, u16 qdep, u64 addr,
 			      unsigned int size_order, u64 granu);
 void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu,
-			  int pasid);
+			  unsigned int pasid);
 
 int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
 		   unsigned int count, unsigned long options);
@@ -734,11 +734,11 @@ extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
 int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 			  struct iommu_gpasid_bind_data *data);
-int intel_svm_unbind_gpasid(struct device *dev, int pasid);
+int intel_svm_unbind_gpasid(struct device *dev, unsigned int pasid);
 struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
 				 void *drvdata);
 void intel_svm_unbind(struct iommu_sva *handle);
-int intel_svm_get_pasid(struct iommu_sva *handle);
+unsigned int intel_svm_get_pasid(struct iommu_sva *handle);
 struct svm_dev_ops;
 
 struct intel_svm_dev {
@@ -747,7 +747,7 @@ struct intel_svm_dev {
 	struct device *dev;
 	struct svm_dev_ops *ops;
 	struct iommu_sva sva;
-	int pasid;
+	unsigned int pasid;
 	int users;
 	u16 did;
 	u16 dev_iotlb:1;
@@ -760,7 +760,7 @@ struct intel_svm {
 
 	struct intel_iommu *iommu;
 	int flags;
-	int pasid;
+	unsigned int pasid;
 	int gpasid; /* In case that guest PASID is different from host PASID */
 	struct list_head devs;
 	struct list_head list;
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index c9e7e601950d..7ec24f19b7d3 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -11,7 +11,7 @@
 struct device;
 
 struct svm_dev_ops {
-	void (*fault_cb)(struct device *dev, int pasid, u64 address,
+	void (*fault_cb)(struct device *dev, unsigned int pasid, u64 address,
 			 void *private, int rwxp, int response);
 };
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5f0b7859d2eb..813961145679 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -292,7 +292,7 @@ struct iommu_ops {
 	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
 				      void *drvdata);
 	void (*sva_unbind)(struct iommu_sva *handle);
-	int (*sva_get_pasid)(struct iommu_sva *handle);
+	unsigned int (*sva_get_pasid)(struct iommu_sva *handle);
 
 	int (*page_response)(struct device *dev,
 			     struct iommu_fault_event *evt,
@@ -302,7 +302,7 @@ struct iommu_ops {
 	int (*sva_bind_gpasid)(struct iommu_domain *domain,
 			struct device *dev, struct iommu_gpasid_bind_data *data);
 
-	int (*sva_unbind_gpasid)(struct device *dev, int pasid);
+	int (*sva_unbind_gpasid)(struct device *dev, unsigned int pasid);
 
 	int (*def_domain_type)(struct device *dev);
 
@@ -656,7 +656,7 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
 					struct mm_struct *mm,
 					void *drvdata);
 void iommu_sva_unbind_device(struct iommu_sva *handle);
-int iommu_sva_get_pasid(struct iommu_sva *handle);
+unsigned int iommu_sva_get_pasid(struct iommu_sva *handle);
 
 #else /* CONFIG_IOMMU_API */
 
@@ -1049,7 +1049,7 @@ static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
 {
 }
 
-static inline int iommu_sva_get_pasid(struct iommu_sva *handle)
+static inline unsigned int iommu_sva_get_pasid(struct iommu_sva *handle)
 {
 	return IOMMU_PASID_INVALID;
 }
diff --git a/include/linux/uacce.h b/include/linux/uacce.h
index 454c2f6672d7..667d7e691c0d 100644
--- a/include/linux/uacce.h
+++ b/include/linux/uacce.h
@@ -81,7 +81,7 @@ struct uacce_queue {
 	struct list_head list;
 	struct uacce_qfile_region *qfrs[UACCE_MAX_REGION];
 	enum uacce_q_state state;
-	int pasid;
+	unsigned int pasid;
 	struct iommu_sva *handle;
 };
 
-- 
2.19.1


^ permalink raw reply related

* [PATCH v2 03/12] iommu/vt-d: Change flags type to unsigned int in binding mm
From: Fenghua Yu @ 2020-06-13  0:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
	Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
	Ravi V Shankar
  Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>

"flags" passed to intel_svm_bind_mm() is a bit mask and should be
defined as "unsigned int" instead of "int".

Change its type to "unsigned int".

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Add this new patch per Thomas' comment.

 drivers/iommu/intel/svm.c   | 7 ++++---
 include/linux/intel-iommu.h | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index b5618341b4b1..4e775e12ae52 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -427,7 +427,8 @@ int intel_svm_unbind_gpasid(struct device *dev, unsigned int pasid)
 
 /* Caller must hold pasid_mutex, mm reference */
 static int
-intel_svm_bind_mm(struct device *dev, int flags, struct svm_dev_ops *ops,
+intel_svm_bind_mm(struct device *dev, unsigned int flags,
+		  struct svm_dev_ops *ops,
 		  struct mm_struct *mm, struct intel_svm_dev **sd)
 {
 	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
@@ -954,7 +955,7 @@ intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
 {
 	struct iommu_sva *sva = ERR_PTR(-EINVAL);
 	struct intel_svm_dev *sdev = NULL;
-	int flags = 0;
+	unsigned int flags = 0;
 	int ret;
 
 	/*
@@ -963,7 +964,7 @@ intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
 	 * and intel_svm etc.
 	 */
 	if (drvdata)
-		flags = *(int *)drvdata;
+		flags = *(unsigned int *)drvdata;
 	mutex_lock(&pasid_mutex);
 	ret = intel_svm_bind_mm(dev, flags, NULL, mm, &sdev);
 	if (ret)
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 44fa8879f829..9abc30cf10fc 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -759,7 +759,7 @@ struct intel_svm {
 	struct mm_struct *mm;
 
 	struct intel_iommu *iommu;
-	int flags;
+	unsigned int flags;
 	unsigned int pasid;
 	int gpasid; /* In case that guest PASID is different from host PASID */
 	struct list_head devs;
-- 
2.19.1


^ permalink raw reply related

* [PATCH v2 05/12] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
From: Fenghua Yu @ 2020-06-13  0:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
	Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
	Ravi V Shankar
  Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>

Work submission instruction comes in two flavors. ENQCMD can be called
both in ring 3 and ring 0 and always uses the contents of PASID MSR when
shipping the command to the device. ENQCMDS allows a kernel driver to
submit commands on behalf of a user process. The driver supplies the
PASID value in ENQCMDS. There isn't any usage of ENQCMD in the kernel
as of now.

The CPU feature flag is shown as "enqcmd" in /proc/cpuinfo.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Re-write commit message (Thomas)

 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 02dabc9e77b0..4469618c410f 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -351,6 +351,7 @@
 #define X86_FEATURE_CLDEMOTE		(16*32+25) /* CLDEMOTE instruction */
 #define X86_FEATURE_MOVDIRI		(16*32+27) /* MOVDIRI instruction */
 #define X86_FEATURE_MOVDIR64B		(16*32+28) /* MOVDIR64B instruction */
+#define X86_FEATURE_ENQCMD		(16*32+29) /* ENQCMD and ENQCMDS instructions */
 
 /* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
 #define X86_FEATURE_OVERFLOW_RECOV	(17*32+ 0) /* MCA overflow recovery support */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 3cbe24ca80ab..3a02707c1f4d 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_CQM_MBM_TOTAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
 	{}
 };
 
-- 
2.19.1


^ permalink raw reply related

* [PATCH v2 04/12] docs: x86: Add documentation for SVA (Shared Virtual Addressing)
From: Fenghua Yu @ 2020-06-13  0:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
	Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
	Ravi V Shankar
  Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>

From: Ashok Raj <ashok.raj@intel.com>

ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
features are a complicated stack with lots of interconnected pieces.
This documentation provides a big picture overview for all of the
features.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Fix the doc format and add the doc in toctree (Thomas)
- Modify the doc for better description (Thomas, Tony, Dave)

 Documentation/x86/index.rst |   1 +
 Documentation/x86/sva.rst   | 287 ++++++++++++++++++++++++++++++++++++
 2 files changed, 288 insertions(+)
 create mode 100644 Documentation/x86/sva.rst

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 265d9e9a093b..e5d5ff096685 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -30,3 +30,4 @@ x86-specific Documentation
    usb-legacy-support
    i386/index
    x86_64/index
+   sva
diff --git a/Documentation/x86/sva.rst b/Documentation/x86/sva.rst
new file mode 100644
index 000000000000..1e52208c7dda
--- /dev/null
+++ b/Documentation/x86/sva.rst
@@ -0,0 +1,287 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================================
+Shared Virtual Addressing (SVA) with ENQCMD
+===========================================
+
+Background
+==========
+
+Shared Virtual Addressing (SVA) allows the processor and device to use the
+same virtual addresses avoiding the need for software to translate virtual
+addresses to physical addresses. SVA is what PCIe calls Shared Virtual
+Memory (SVM)
+
+In addition to the convenience of using application virtual addresses
+by the device, it also doesn't require pinning pages for DMA.
+PCIe Address Translation Services (ATS) along with Page Request Interface
+(PRI) allow devices to function much the same way as the CPU handling
+application page-faults. For more information please refer to PCIe
+specification Chapter 10: ATS Specification.
+
+Use of SVA requires IOMMU support in the platform. IOMMU also is required
+to support PCIe features ATS and PRI. ATS allows devices to cache
+translations for the virtual address. IOMMU driver uses the mmu_notifier()
+support to keep the device tlb cache and the CPU cache in sync. PRI allows
+the device to request paging the virtual address before using if they are
+not paged in the CPU page tables.
+
+
+Shared Hardware Workqueues
+==========================
+
+Unlike Single Root I/O Virtualization (SRIOV), Scalable IOV (SIOV) permits
+the use of Shared Work Queues (SWQ) by both applications and Virtual
+Machines (VM's). This allows better hardware utilization vs. hard
+partitioning resources that could result in under utilization. In order to
+allow the hardware to distinguish the context for which work is being
+executed in the hardware by SWQ interface, SIOV uses Process Address Space
+ID (PASID), which is a 20bit number defined by the PCIe SIG.
+
+PASID value is encoded in all transactions from the device. This allows the
+IOMMU to track I/O on a per-PASID granularity in addition to using the PCIe
+Resource Identifier (RID) which is the Bus/Device/Function.
+
+
+ENQCMD
+======
+
+ENQCMD is a new instruction on Intel platforms that atomically submits a
+work descriptor to a device. The descriptor includes the operation to be
+performed, virtual addresses of all parameters, virtual address of a completion
+record, and the PASID (process address space ID) of the current process.
+
+ENQCMD works with non-posted semantics and carries a status back if the
+command was accepted by hardware. This allows the submitter to know if the
+submission needs to be retried or other device specific mechanisms to
+implement implement fairness or ensure forward progress can be made.
+
+ENQCMD is the glue that ensures applications can directly submit commands
+to the hardware and also permit hardware to be aware of application context
+to perform I/O operations via use of PASID.
+
+Process Address Space Tagging
+=============================
+
+A new thread scoped MSR (IA32_PASID) provides the connection between
+user processes and the rest of the hardware. When an application first
+accesses an SVA capable device this MSR is initialized with a newly
+allocated PASID. The driver for the device calls an IOMMU specific api
+that sets up the routing for DMA and page-requests.
+
+For example, the Intel Data Streaming Accelerator (DSA) uses
+intel_svm_bind_mm(), which will do the following.
+
+- Allocate the PASID, and program the process page-table (cr3) in the PASID
+  context entries.
+- Register for mmu_notifier() to track any page-table invalidations to keep
+  the device tlb in sync. For example, when a page-table entry is invalidated,
+  IOMMU propagates the invalidation to device tlb. This will force any
+  future access by the device to this virtual address to participate in
+  ATS. If the IOMMU responds with proper response that a page is not
+  present, the device would request the page to be paged in via the PCIe PRI
+  protocol before performing I/O.
+
+This MSR is managed with the XSAVE feature set as "supervisor state" to
+ensure the MSR is updated during context switch.
+
+PASID Management
+================
+
+The kernel must allocate a PASID on behalf of each process and program it
+into the new MSR to communicate the process identity to platform hardware.
+ENQCMD uses the PASID stored in this MSR to tag requests from this process.
+When a user submits a work descriptor to a device using the ENQCMD
+instruction, the PASID field in the descriptor is auto-filled with the
+value from MSR_IA32_PASID. Requests for DMA from the device are also tagged
+with the same PASID. The platform IOMMU uses the PASID in the transaction to
+perform address translation. The IOMMU api's setup the corresponding PASID
+entry in IOMMU with the process address used by the CPU (for e.g cr3 in x86).
+
+The MSR must be configured on each logical CPU before any application
+thread can interact with a device. Threads that belong to the same
+process share the same page tables, thus the same MSR value.
+
+PASID is cleared when a process is created. The PASID allocation and MSR
+programming may occur long after a process and its threads have been created.
+One thread must call bind() to allocate the PASID for the process. If a
+thread uses ENQCMD without the MSR first being populated, it will cause #GP.
+The kernel will fix up the #GP by writing the process-wide PASID into the
+thread that took the #GP. A single process PASID can be used simultaneously
+with multiple devices since they all share the same address space.
+
+New threads could inherit the MSR value from the parent. But this would
+involve additional state management for those threads which may never use
+ENQCMD. Clearing the MSR at thread creation permits all threads to have a
+consistent behavior; the PASID is only programmed when the thread calls
+the bind() api (intel_svm_bind_mm()()), or when a thread calls ENQCMD for
+the first time.
+
+PASID Lifecycle Management
+==========================
+
+Only processes that access SVA capable devices need to have a PASID
+allocated. This allocation happens when a process first opens an SVA
+capable device (subsequent opens of the same, or other devices will
+share the same PASID).
+
+Although the PASID is allocated to the process by opening a device,
+it is not active in any of the threads of that process. Activation is
+done lazily when a thread tries to submit a work descriptor to a device
+using the ENQCMD.
+
+That first access will trigger a #GP fault because the IA32_PASID MSR
+has not been initialized with the PASID value assigned to the process
+when the device was opened. The Linux #GP handler notes that a PASID as
+been allocated for the process, and so initializes the IA32_PASID MSR
+and returns so that the ENQCMD instruction is re-executed.
+
+On fork(2) or exec(2) the PASID is removed from the process as it no
+longer has the same address space that it had when the device was opened.
+
+On clone(2) the new task shares the same address space, so will be
+able to use the PASID allocated to the process. The IA32_PASID is not
+preemptively initialized as the kernel does not know whether this thread
+is going to access the device.
+
+On exit(2) the PASID is freed. The device driver ensures that any pending
+operations queued to the device are either completed or aborted before
+allowing the PASID to be re-allocated.
+
+Relationships
+=============
+
+ * Each process has many threads, but only one PASID
+ * Devices have a limited number (~10's to 1000's) of hardware
+   workqueues and each portal maps down to a single workqueue.
+   The device driver manages allocating hardware workqueues.
+ * A single mmap() maps a single hardware workqueue as a "portal"
+ * For each device with which a process interacts, there must be
+   one or more mmap()'d portals.
+ * Many threads within a process can share a single portal to access
+   a single device.
+ * Multiple processes can separately mmap() the same portal, in
+   which case they still share one device hardware workqueue.
+ * The single process-wide PASID is used by all threads to interact
+   with all devices.  There is not, for instance, a PASID for each
+   thread or each thread<->device pair.
+
+FAQ
+===
+
+* What is SVA/SVM?
+
+Shared Virtual Addressing (SVA) permits I/O hardware and the processor to
+work in the same address space. In short, sharing the address space. Some
+call it Shared Virtual Memory (SVM), but Linux community wanted to avoid
+it with Posix Shared Memory and Secure Virtual Machines which were terms
+already in circulation.
+
+* What is a PASID?
+
+A Process Address Space ID (PASID) is a PCIe-defined TLP Prefix. A PASID is
+a 20 bit number allocated and managed by the OS. PASID is included in all
+transactions between the platform and the device.
+
+* How are shared work queues different?
+
+Traditionally to allow user space applications interact with hardware,
+there is a separate instance required per process. For example, consider
+doorbells as a mechanism of informing hardware about work to process. Each
+doorbell is required to be spaced 4k (or page-size) apart for process
+isolation. This requires hardware to provision that space and reserve in
+MMIO. This doesn't scale as the number of threads becomes quite large. The
+hardware also manages the queue depth for Shared Work Queues (SWQ), and
+consumers don't need to track queue depth. If there is no space to accept
+a command, the device will return an error indicating retry. Also
+submitting a command to an MMIO address that can't accept ENQCMD will
+return retry in response. In the new DMWr PCIe terminology, devices need to
+support DMWr completer capability. In addition it requires all switch ports
+to support DMWr routing and must be enabled by the PCIe subsystem, much
+like how PCIe Atomics() are managed for instance.
+
+SWQ allows hardware to provision just a single address in the device. When
+used with ENQCMD to submit work, the device can distinguish the process
+submitting the work since it will include the PASID assigned to that
+process. This decreases the pressure of hardware requiring to support
+hardware to scale to a large number of processes.
+
+* Is this the same as a user space device driver?
+
+Communicating with the device via the shared work queue is much simpler
+than a full blown user space driver. The kernel driver does all the
+initialization of the hardware. User space only needs to worry about
+submitting work and processing completions.
+
+* Is this the same as SR-IOV?
+
+Single Root I/O Virtualization (SR-IOV) focuses on providing independent
+hardware interfaces for virtualizing hardware. Hence its required to be
+almost fully functional interface to software supporting the traditional
+BAR's, space for interrupts via MSI-x, its own register layout.
+Virtual Functions (VFs) are assisted by the Physical Function (PF)
+driver.
+
+Scalable I/O Virtualization builds on the PASID concept to create device
+instances for virtualization. SIOV requires host software to assist in
+creating virtual devices, each virtual device is represented by a PASID
+along with the BDF of the device.  This allows device hardware to optimize
+device resource creation and can grow dynamically on demand. SR-IOV creation
+and management is very static in nature. Consult references below for more
+details.
+
+* Why not just create a virtual function for each app?
+
+Creating PCIe SRIOV type virtual functions (VF) are expensive. They create
+duplicated hardware for PCI config space requirements, Interrupts such as
+MSIx for instance. Resources such as interrupts have to be hard partitioned
+between VF's at creation time, and cannot scale dynamically on demand. The
+VF's are not completely independent from the Physical function (PF). Most
+VF's require some communication and assistance from the PF driver. SIOV
+creates a software defined device. Where all the configuration and control
+aspects are mediated via the slow path. The work submission and completion
+happen without any mediation.
+
+* Does this support virtualization?
+
+ENQCMD can be used from within a guest VM. In these cases the VMM helps
+with setting up a translation table to translate from Guest PASID to Host
+PASID. Please consult the ENQCMD instruction set reference for more
+details.
+
+* Does memory need to be pinned?
+
+When devices support SVA, along with platform hardware such as IOMMU
+supporting such devices, there is no need to pin memory for DMA purposes.
+Devices that support SVA also support other PCIe features that remove the
+pinning requirement for memory.
+
+Device TLB support - Device requests the IOMMU to lookup an address before
+use via Address Translation Service (ATS) requests.  If the mapping exists
+but there is no page allocated by the OS, IOMMU hardware returns that no
+mapping exists.
+
+Device requests that virtual address to be mapped via Page Request
+Interface (PRI). Once the OS has successfully completed  the mapping, it
+returns the response back to the device. The device continues again to
+request for a translation and continues.
+
+IOMMU works with the OS in managing consistency of page-tables with the
+device. When removing pages, it interacts with the device to remove any
+device-tlb that might have been cached before removing the mappings from
+the OS.
+
+References
+==========
+
+VT-D:
+https://01.org/blogs/ashokraj/2018/recent-enhancements-intel-virtualization-technology-directed-i/o-intel-vt-d
+
+SIOV:
+https://01.org/blogs/2019/assignable-interfaces-intel-scalable-i/o-virtualization-linux
+
+ENQCMD in ISE:
+https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
+
+DSA spec:
+https://software.intel.com/sites/default/files/341204-intel-data-streaming-accelerator-spec.pdf
-- 
2.19.1


^ permalink raw reply related

* [PATCH v2 07/12] x86/msr-index: Define IA32_PASID MSR
From: Fenghua Yu @ 2020-06-13  0:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
	Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
	Ravi V Shankar
  Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>

The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier
(PASID), a 20-bit value. Bit 31 must be set to indicate the value
programmed in the MSR is valid. Hardware uses PASID to identify process
address space and direct responses to the right address space.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Change "identify process" to "identify process address space" in the
  commit message (Thomas)

 arch/x86/include/asm/msr-index.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index e8370e64a155..e5f699ff1dd6 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -237,6 +237,9 @@
 #define MSR_IA32_LASTINTFROMIP		0x000001dd
 #define MSR_IA32_LASTINTTOIP		0x000001de
 
+#define MSR_IA32_PASID			0x00000d93
+#define MSR_IA32_PASID_VALID		BIT_ULL(31)
+
 /* DEBUGCTLMSR bits (others vary by model): */
 #define DEBUGCTLMSR_LBR			(1UL <<  0) /* last branch recording */
 #define DEBUGCTLMSR_BTF_SHIFT		1
-- 
2.19.1


^ permalink raw reply related

* [PATCH v2 08/12] mm: Define pasid in mm
From: Fenghua Yu @ 2020-06-13  0:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
	Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
	Ravi V Shankar
  Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>

PASID is shared by all threads in a process. So the logical place to keep
track of it is in the "mm". Both ARM and X86 need to use the PASID in the
"mm".

Suggested-by: Christoph Hellwig <hch@infradeed.org>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- This new patch moves "pasid" from x86 specific mm_context_t to generic
  struct mm_struct per Christopher's comment: https://lore.kernel.org/linux-iommu/20200414170252.714402-1-jean-philippe@linaro.org/T/#mb57110ffe1aaa24750eeea4f93b611f0d1913911
- Jean-Philippe Brucker released a virtually same patch. I still put this
  patch in the series for better review. The upstream kernel only needs one
  of the two patches eventually.
https://lore.kernel.org/linux-iommu/20200519175502.2504091-2-jean-philippe@linaro.org/
- Change CONFIG_IOASID to CONFIG_PCI_PASID (Ashok)

 include/linux/mm_types.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 64ede5f150dc..5778db3aa42d 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -538,6 +538,10 @@ struct mm_struct {
 		atomic_long_t hugetlb_usage;
 #endif
 		struct work_struct async_put_work;
+
+#ifdef CONFIG_PCI_PASID
+		unsigned int pasid;
+#endif
 	} __randomize_layout;
 
 	/*
-- 
2.19.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox