LinuxPPC-Dev Archive on lore.kernel.org
* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Anshuman Khandual @ 2018-07-31  6:39 UTC
  To: Christoph Hellwig, Michael S. Tsirkin
  Cc: virtualization, linux-kernel, linuxppc-dev, aik, robh, joe,
	elfring, david, jasowang, benh, mpe, linuxram, haren, paulus,
	srikar
In-Reply-To: <20180730093027.GC26245@infradead.org>

On 07/30/2018 03:00 PM, Christoph Hellwig wrote:
>>> +
>>> +	if (xen_domain())
>>> +		goto skip_override;
>>> +
>>> +	if (virtio_has_iommu_quirk(dev))
>>> +		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
>>> +
>>> + skip_override:
>>> +
>>
>> I prefer normal if scoping as opposed to goto spaghetti pls.
>> Better yet move vring_use_dma_api here and use it.
>> Less of a chance something will break.
> 
> I agree about avoiding pointless gotos here, but we can do things
> perfectly well without either gotos or a confusing helper here
> if we structure it right. E.g.:
> 
> 	// suitably detailed comment here
> 	if (!xen_domain() &&
> 	    !virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
> 		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);

I had already updated this patch to call vring_use_dma_api() as a
helper, as Michael suggested, but yes, we can use the condition above
with a comment block instead. I will change the patch accordingly.

> 
> and while we're at it - modifying dma ops for the parent looks very
> dangerous.  I don't think we can do that, as it could break iommu
> setup interactions.  IFF we set a specific dma map ops it has to be
> on the virtio device itself, of which we have full control.

I understand your concern. At present the virtio core calls the parent's
DMA ops callbacks when the device has the VIRTIO_F_IOMMU_PLATFORM flag
set. Most likely those DMA ops are the architecture-specific ones that
can actually configure the IOMMU, and most probably all devices and
their parents share the same DMA ops callbacks. IIUC, as long as the
entire system has a single DMA ops structure, it should be okay, but I
may be missing other implications. I tried changing the virtio core so
that it always calls the device's DMA ops instead of its parent's DMA
ops. That hit the WARN_ON below for devices without the IOMMU flag, and
hit both the WARN_ON and the BUG_ON for devices with the IOMMU flag.

static inline void *dma_alloc_attrs(struct device *dev, size_t size,
                                       dma_addr_t *dma_handle, gfp_t flag,
                                       unsigned long attrs)
{
        const struct dma_map_ops *ops = get_dma_ops(dev);
        void *cpu_addr;

        BUG_ON(!ops);
        WARN_ON_ONCE(dev && !dev->coherent_dma_mask);

--------

It seems the virtio device's own DMA ops and coherent_dma_mask were
never set correctly, on the assumption that the virtio core would always
call the parent's DMA ops. We may have to change the virtio device init
to fix this. Any thoughts?


* phandle_cache vs of_detach_node (was Re: [PATCH] powerpc/mobility: Fix node detach/rename problem)
From: Michael Ellerman @ 2018-07-31  6:34 UTC
  To: Michael Bringmann, linuxppc-dev, Rob Herring, Frank Rowand,
	devicetree
In-Reply-To: <ffaa7eba-5236-e42d-c901-79b045adfb94@linux.vnet.ibm.com>

Hi Rob/Frank,

I think we might have a problem with the phandle_cache not interacting
well with of_detach_node():

Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
> See below.
>
> On 07/30/2018 01:31 AM, Michael Ellerman wrote:
>> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>> 
>>> During LPAR migration, the content of the device tree/sysfs may
>>> be updated including deletion and replacement of nodes in the
>>> tree.  When nodes are added to the internal node structures, they
>>> are appended in FIFO order to a list of nodes maintained by the
>>> OF code APIs.
>> 
>> That hasn't been true for several years. The data structure is an n-ary
>> tree. What kernel version are you working on?
>
> Sorry for an error in my description.  I oversimplified based on the
> name of a search iterator.  Let me try to provide a better explanation
> of the problem, here.
>
> This is the problem.  The PPC mobility code receives RTAS requests to
> delete nodes with platform-/hardware-specific attributes when restarting
> the kernel after a migration.  My example is for migration between a
> P8 Alpine and a P8 Brazos.   Nodes to be deleted may include 'ibm,random-v1',
> 'ibm,compression-v1', 'ibm,platform-facilities', 'ibm,sym-encryption-v1',
> or others.
>
> The mobility.c code calls 'of_detach_node' for the nodes and their children.
> This makes calls to detach the properties and to try to remove the associated
> sysfs/kernfs files.
>
> New copies of the same nodes are then provided by the PHYP, local
> copies are built, and a pointer to the 'struct device_node' is passed to
> of_attach_node.  Before the call to of_attach_node, the phandle is initialized
> to 0 when the data structure is allocated.  During the call, of_attach_node
> calls __of_attach_node, which pulls the actual name and phandle from the
> just-created sub-properties named something like 'name' and 'ibm,phandle'.
>
> This is all fine for the first migration.  The problem occurs with the
> second and subsequent migrations when the PHYP on the new system wants to
> replace the same set of nodes again, referenced with the same names and
> phandle values.
>
>> 
>>> When nodes are removed from the device tree, they
>>> are marked OF_DETACHED, but not actually deleted from the system
>>> to allow for pointers cached elsewhere in the kernel.  The order
>>> and content of the entries in the list of nodes is not altered,
>>> though.
>> 
>> Something is going wrong if this is actually happening.
>> 
>> When the node is detached it should be *detached* from the tree of all
>> nodes, so it should not be discoverable other than by having an existing
>> pointer to it.
> On the second and subsequent migrations, the PHYP tells the system
> to again delete the nodes 'ibm,platform-facilities', 'ibm,random-v1',
> 'ibm,compression-v1', 'ibm,sym-encryption-v1'.  It specifies these
> nodes by its known set of phandle values -- the same handles used
> by the PHYP on the source system are known on the target system.
> The mobility.c code calls of_find_node_by_phandle() with these values
> and ends up locating the first instance of each node that was added
> during the original boot, instead of the second instance of each node
> created after the first migration.  The detach during the second
> migration fails with errors like,
>
> [ 4565.030704] WARNING: CPU: 3 PID: 4787 at drivers/of/dynamic.c:252 __of_detach_node+0x8/0xa0
> [ 4565.030708] Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag lockd grace fscache sunrpc xts vmx_crypto sg pseries_rng binfmt_misc ip_tables xfs libcrc32c sd_mod ibmveth ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
> [ 4565.030733] CPU: 3 PID: 4787 Comm: drmgr Tainted: G        W         4.18.0-rc1-wi107836-v05-120+ #201
> [ 4565.030737] NIP:  c0000000007c1ea8 LR: c0000000007c1fb4 CTR: 0000000000655170
> [ 4565.030741] REGS: c0000003f302b690 TRAP: 0700   Tainted: G        W          (4.18.0-rc1-wi107836-v05-120+)
> [ 4565.030745] MSR:  800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22288822  XER: 0000000a
> [ 4565.030757] CFAR: c0000000007c1fb0 IRQMASK: 1
> [ 4565.030757] GPR00: c0000000007c1fa4 c0000003f302b910 c00000000114bf00 c0000003ffff8e68
> [ 4565.030757] GPR04: 0000000000000001 ffffffffffffffff 800000c008e0b4b8 ffffffffffffffff
> [ 4565.030757] GPR08: 0000000000000000 0000000000000001 0000000080000003 0000000000002843
> [ 4565.030757] GPR12: 0000000000008800 c00000001ec9ae00 0000000040000000 0000000000000000
> [ 4565.030757] GPR16: 0000000000000000 0000000000000008 0000000000000000 00000000f6ffffff
> [ 4565.030757] GPR20: 0000000000000007 0000000000000000 c0000003e9f1f034 0000000000000001
> [ 4565.030757] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 4565.030757] GPR28: c000000001549d28 c000000001134828 c0000003ffff8e68 c0000003f302b930
> [ 4565.030804] NIP [c0000000007c1ea8] __of_detach_node+0x8/0xa0
> [ 4565.030808] LR [c0000000007c1fb4] of_detach_node+0x74/0xd0
> [ 4565.030811] Call Trace:
> [ 4565.030815] [c0000003f302b910] [c0000000007c1fa4] of_detach_node+0x64/0xd0 (unreliable)
> [ 4565.030821] [c0000003f302b980] [c0000000000c33c4] dlpar_detach_node+0xb4/0x150
> [ 4565.030826] [c0000003f302ba10] [c0000000000c3ffc] delete_dt_node+0x3c/0x80
> [ 4565.030831] [c0000003f302ba40] [c0000000000c4380] pseries_devicetree_update+0x150/0x4f0
> [ 4565.030836] [c0000003f302bb70] [c0000000000c479c] post_mobility_fixup+0x7c/0xf0
> [ 4565.030841] [c0000003f302bbe0] [c0000000000c4908] migration_store+0xf8/0x130
> [ 4565.030847] [c0000003f302bc70] [c000000000998160] kobj_attr_store+0x30/0x60
> [ 4565.030852] [c0000003f302bc90] [c000000000412f14] sysfs_kf_write+0x64/0xa0
> [ 4565.030857] [c0000003f302bcb0] [c000000000411cac] kernfs_fop_write+0x16c/0x240
> [ 4565.030862] [c0000003f302bd00] [c000000000355f20] __vfs_write+0x40/0x220
> [ 4565.030867] [c0000003f302bd90] [c000000000356358] vfs_write+0xc8/0x240
> [ 4565.030872] [c0000003f302bde0] [c0000000003566cc] ksys_write+0x5c/0x100
> [ 4565.030880] [c0000003f302be30] [c00000000000b288] system_call+0x5c/0x70
> [ 4565.030884] Instruction dump:
> [ 4565.030887] 38210070 38600000 e8010010 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8
> [ 4565.030895] 7c0803a6 4e800020 e9230098 7929f7e2 <0b090000> 2f890000 4cde0020 e9030040
> [ 4565.030903] ---[ end trace 5bd54cb1df9d2976 ]---
>
> The mobility.c code continues on during the second migration, accepts the
> definitions of the new nodes from the PHYP and ends up renaming the new
> properties e.g.
>
> [ 4565.827296] Duplicate name in base, renamed to "ibm,platform-facilities#1"
>
> I don't see any check like 'of_node_check_flag(np, OF_DETACHED)' within
> of_find_node_by_phandle to skip nodes that are detached, but still present
> due to caching or use count considerations.  Another possibility to consider
> is that of_find_node_by_phandle also uses something called 'phandle_cache'
> which may have outdated data as of_detach_node() does not have access to
> that cache for the 'OF_DETACHED' nodes.

Yes the phandle_cache looks like it might be the problem.

I saw of_free_phandle_cache() being called as late_initcall, but didn't
realise that's only if MODULES is disabled.

So I don't see anything that invalidates the phandle_cache when a node
is removed.

The right solution would be for __of_detach_node() to invalidate the
phandle_cache for the node being detached. That's slightly complicated
by the phandle_cache being static inside base.c.

To test the theory that it's the phandle_cache causing the problems can
you try this patch:

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 848f549164cd..60e219132e24 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1098,6 +1098,9 @@ struct device_node *of_find_node_by_phandle(phandle handle)
 		if (phandle_cache[masked_handle] &&
 		    handle == phandle_cache[masked_handle]->phandle)
 			np = phandle_cache[masked_handle];
+
+		if (np && of_node_check_flag(np, OF_DETACHED))
+			np = NULL;
 	}
 
 	if (!np) {

cheers


* [PATCH v5 11/11] hugetlb: Introduce generic version of huge_ptep_get
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

ia64, mips, parisc, powerpc, sh, sparc, x86 architectures use the
same version of huge_ptep_get, so move this generic implementation into
asm-generic/hugetlb.h.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb-3level.h | 1 +
 arch/arm64/include/asm/hugetlb.h      | 1 +
 arch/ia64/include/asm/hugetlb.h       | 5 -----
 arch/mips/include/asm/hugetlb.h       | 5 -----
 arch/parisc/include/asm/hugetlb.h     | 5 -----
 arch/powerpc/include/asm/hugetlb.h    | 5 -----
 arch/sh/include/asm/hugetlb.h         | 5 -----
 arch/sparc/include/asm/hugetlb.h      | 5 -----
 arch/x86/include/asm/hugetlb.h        | 5 -----
 include/asm-generic/hugetlb.h         | 7 +++++++
 10 files changed, 9 insertions(+), 35 deletions(-)

diff --git a/arch/arm/include/asm/hugetlb-3level.h b/arch/arm/include/asm/hugetlb-3level.h
index 54e4b097b1f5..0d9f3918fa7e 100644
--- a/arch/arm/include/asm/hugetlb-3level.h
+++ b/arch/arm/include/asm/hugetlb-3level.h
@@ -29,6 +29,7 @@
  * ptes.
  * (The valid bit is automatically cleared by set_pte_at for PROT_NONE ptes).
  */
+#define __HAVE_ARCH_HUGE_PTEP_GET
 static inline pte_t huge_ptep_get(pte_t *ptep)
 {
 	pte_t retval = *ptep;
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 80887abcef7f..fb6609875455 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -20,6 +20,7 @@
 
 #include <asm/page.h>
 
+#define __HAVE_ARCH_HUGE_PTEP_GET
 static inline pte_t huge_ptep_get(pte_t *ptep)
 {
 	return READ_ONCE(*ptep);
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index e9b42750fdf5..36cc0396b214 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -27,11 +27,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline pte_t huge_ptep_get(pte_t *ptep)
-{
-	return *ptep;
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 120adc3b2ffd..425bb6fc3bda 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -82,11 +82,6 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	return changed;
 }
 
-static inline pte_t huge_ptep_get(pte_t *ptep)
-{
-	return *ptep;
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index 165b4e5a6f32..7cb595dcb7d7 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -48,11 +48,6 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep,
 					     pte_t pte, int dirty);
 
-static inline pte_t huge_ptep_get(pte_t *ptep)
-{
-	return *ptep;
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 658bf7136a3c..33a2d9e3ea9e 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -142,11 +142,6 @@ extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 				      unsigned long addr, pte_t *ptep,
 				      pte_t pte, int dirty);
 
-static inline pte_t huge_ptep_get(pte_t *ptep)
-{
-	return *ptep;
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index c87195ae0cfa..6f025fe18146 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -32,11 +32,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline pte_t huge_ptep_get(pte_t *ptep)
-{
-	return *ptep;
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 	clear_bit(PG_dcache_clean, &page->flags);
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 028a1465fbe7..3963f80d1cb3 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -53,11 +53,6 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	return changed;
 }
 
-static inline pte_t huge_ptep_get(pte_t *ptep)
-{
-	return *ptep;
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 574d42eb081e..7469d321f072 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -13,11 +13,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 	return 0;
 }
 
-static inline pte_t huge_ptep_get(pte_t *ptep)
-{
-	return *ptep;
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index f3c99a03ee83..71d7b77eea50 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -119,4 +119,11 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 }
 #endif
 
+#ifndef __HAVE_ARCH_HUGE_PTEP_GET
+static inline pte_t huge_ptep_get(pte_t *ptep)
+{
+	return *ptep;
+}
+#endif
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2


* [PATCH v5 10/11] hugetlb: Introduce generic version of huge_ptep_set_access_flags
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

arm, ia64, sh, x86 architectures use the same version
of huge_ptep_set_access_flags, so move this generic implementation
into asm-generic/hugetlb.h.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb-3level.h | 7 -------
 arch/arm64/include/asm/hugetlb.h      | 1 +
 arch/ia64/include/asm/hugetlb.h       | 7 -------
 arch/mips/include/asm/hugetlb.h       | 1 +
 arch/parisc/include/asm/hugetlb.h     | 1 +
 arch/powerpc/include/asm/hugetlb.h    | 1 +
 arch/sh/include/asm/hugetlb.h         | 7 -------
 arch/sparc/include/asm/hugetlb.h      | 1 +
 arch/x86/include/asm/hugetlb.h        | 7 -------
 include/asm-generic/hugetlb.h         | 9 +++++++++
 10 files changed, 14 insertions(+), 28 deletions(-)

diff --git a/arch/arm/include/asm/hugetlb-3level.h b/arch/arm/include/asm/hugetlb-3level.h
index 8247cd6a2ac6..54e4b097b1f5 100644
--- a/arch/arm/include/asm/hugetlb-3level.h
+++ b/arch/arm/include/asm/hugetlb-3level.h
@@ -37,11 +37,4 @@ static inline pte_t huge_ptep_get(pte_t *ptep)
 	return retval;
 }
 
-static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
-					     unsigned long addr, pte_t *ptep,
-					     pte_t pte, int dirty)
-{
-	return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
-}
-
 #endif /* _ASM_ARM_HUGETLB_3LEVEL_H */
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index f4f69ae5466e..80887abcef7f 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -42,6 +42,7 @@ extern pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 			    pte_t *ptep, pte_t pte);
+#define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
 extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 				      unsigned long addr, pte_t *ptep,
 				      pte_t pte, int dirty);
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index 49d1f7949f3a..e9b42750fdf5 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -27,13 +27,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
-					     unsigned long addr, pte_t *ptep,
-					     pte_t pte, int dirty)
-{
-	return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
-}
-
 static inline pte_t huge_ptep_get(pte_t *ptep)
 {
 	return *ptep;
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 3dcf5debf8c4..120adc3b2ffd 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -63,6 +63,7 @@ static inline int huge_pte_none(pte_t pte)
 	return !val || (val == (unsigned long)invalid_pte_table);
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr,
 					     pte_t *ptep, pte_t pte,
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index 9c3950ca2974..165b4e5a6f32 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -43,6 +43,7 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep);
 
+#define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
 int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep,
 					     pte_t pte, int dirty);
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 69c14ecac133..658bf7136a3c 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -137,6 +137,7 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 	flush_hugetlb_page(vma, addr);
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
 extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 				      unsigned long addr, pte_t *ptep,
 				      pte_t pte, int dirty);
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index 8df4004977b9..c87195ae0cfa 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -32,13 +32,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
-					     unsigned long addr, pte_t *ptep,
-					     pte_t pte, int dirty)
-{
-	return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
-}
-
 static inline pte_t huge_ptep_get(pte_t *ptep)
 {
 	return *ptep;
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index c41754a113f3..028a1465fbe7 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -40,6 +40,7 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	set_huge_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep,
 					     pte_t pte, int dirty)
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index a3f781f7a264..574d42eb081e 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -13,13 +13,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 	return 0;
 }
 
-static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
-					     unsigned long addr, pte_t *ptep,
-					     pte_t pte, int dirty)
-{
-	return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
-}
-
 static inline pte_t huge_ptep_get(pte_t *ptep)
 {
 	return *ptep;
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 9b9039845278..f3c99a03ee83 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -110,4 +110,13 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 }
 #endif
 
+#ifndef __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
+static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+		unsigned long addr, pte_t *ptep,
+		pte_t pte, int dirty)
+{
+	return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
+}
+#endif
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2


* [PATCH v5 09/11] hugetlb: Introduce generic version of huge_ptep_set_wrprotect
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

arm, ia64, mips, sh, x86 architectures use the same version
of huge_ptep_set_wrprotect, so move this generic implementation into
asm-generic/hugetlb.h.
Note: powerpc uses the same version as the above architectures in two
places, book3s/32 and nohash/32, but the modification was not
straightforward and hence has not been done.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb-3level.h        | 6 ------
 arch/arm64/include/asm/hugetlb.h             | 1 +
 arch/ia64/include/asm/hugetlb.h              | 6 ------
 arch/mips/include/asm/hugetlb.h              | 6 ------
 arch/parisc/include/asm/hugetlb.h            | 1 +
 arch/powerpc/include/asm/book3s/32/pgtable.h | 2 ++
 arch/powerpc/include/asm/book3s/64/pgtable.h | 1 +
 arch/powerpc/include/asm/nohash/32/pgtable.h | 2 ++
 arch/powerpc/include/asm/nohash/64/pgtable.h | 1 +
 arch/sh/include/asm/hugetlb.h                | 6 ------
 arch/sparc/include/asm/hugetlb.h             | 1 +
 arch/x86/include/asm/hugetlb.h               | 6 ------
 include/asm-generic/hugetlb.h                | 8 ++++++++
 13 files changed, 17 insertions(+), 30 deletions(-)

diff --git a/arch/arm/include/asm/hugetlb-3level.h b/arch/arm/include/asm/hugetlb-3level.h
index b897541520ef..8247cd6a2ac6 100644
--- a/arch/arm/include/asm/hugetlb-3level.h
+++ b/arch/arm/include/asm/hugetlb-3level.h
@@ -37,12 +37,6 @@ static inline pte_t huge_ptep_get(pte_t *ptep)
 	return retval;
 }
 
-static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
-					   unsigned long addr, pte_t *ptep)
-{
-	ptep_set_wrprotect(mm, addr, ptep);
-}
-
 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep,
 					     pte_t pte, int dirty)
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 3e7f6e69b28d..f4f69ae5466e 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -48,6 +48,7 @@ extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 				     unsigned long addr, pte_t *ptep);
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
 extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
 				    unsigned long addr, pte_t *ptep);
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index cbe296271030..49d1f7949f3a 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -27,12 +27,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
-					   unsigned long addr, pte_t *ptep)
-{
-	ptep_set_wrprotect(mm, addr, ptep);
-}
-
 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep,
 					     pte_t pte, int dirty)
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 6ff2531cfb1d..3dcf5debf8c4 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -63,12 +63,6 @@ static inline int huge_pte_none(pte_t pte)
 	return !val || (val == (unsigned long)invalid_pte_table);
 }
 
-static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
-					   unsigned long addr, pte_t *ptep)
-{
-	ptep_set_wrprotect(mm, addr, ptep);
-}
-
 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr,
 					     pte_t *ptep, pte_t pte,
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index fb7e0fd858a3..9c3950ca2974 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -39,6 +39,7 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
 void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep);
 
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 02f5acd7ccc4..d2cd1d0226e9 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -228,6 +228,8 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 {
 	pte_update(ptep, (_PAGE_RW | _PAGE_HWWRITE), _PAGE_RO);
 }
+
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 42aafba7a308..7d957f7c47cd 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -451,6 +451,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 		pte_update(mm, addr, ptep, 0, _PAGE_PRIVILEGED, 0);
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 7c46a98cc7f4..f39e200d9591 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -249,6 +249,8 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 {
 	pte_update(ptep, (_PAGE_RW | _PAGE_HWWRITE), _PAGE_RO);
 }
+
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index dd0c7236208f..69fbf7e9b4db 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -238,6 +238,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 	pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index f1bbd255ee43..8df4004977b9 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -32,12 +32,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
-					   unsigned long addr, pte_t *ptep)
-{
-	ptep_set_wrprotect(mm, addr, ptep);
-}
-
 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep,
 					     pte_t pte, int dirty)
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 2101ea217f33..c41754a113f3 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -32,6 +32,7 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 59c056adb3c9..a3f781f7a264 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -13,12 +13,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 	return 0;
 }
 
-static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
-					   unsigned long addr, pte_t *ptep)
-{
-	ptep_set_wrprotect(mm, addr, ptep);
-}
-
 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep,
 					     pte_t pte, int dirty)
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 6c0c8b0c71e0..9b9039845278 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -102,4 +102,12 @@ static inline int prepare_hugepage_range(struct file *file,
 }
 #endif
 
+#ifndef __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
+static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
+		unsigned long addr, pte_t *ptep)
+{
+	ptep_set_wrprotect(mm, addr, ptep);
+}
+#endif
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2


* [PATCH v5 08/11] hugetlb: Introduce generic version of prepare_hugepage_range
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC (permalink / raw)
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

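The arm, arm64, powerpc, sparc and x86 architectures use the same
version of prepare_hugepage_range, so move this common implementation
into asm-generic/hugetlb.h. ia64, mips, parisc and sh keep their own
version and define __HAVE_ARCH_PREPARE_HUGEPAGE_RANGE.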

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb.h     | 11 -----------
 arch/arm64/include/asm/hugetlb.h   | 11 -----------
 arch/ia64/include/asm/hugetlb.h    |  1 +
 arch/mips/include/asm/hugetlb.h    |  1 +
 arch/parisc/include/asm/hugetlb.h  |  1 +
 arch/powerpc/include/asm/hugetlb.h | 15 ---------------
 arch/sh/include/asm/hugetlb.h      |  1 +
 arch/sparc/include/asm/hugetlb.h   | 16 ----------------
 arch/x86/include/asm/hugetlb.h     | 15 ---------------
 include/asm-generic/hugetlb.h      | 15 +++++++++++++++
 10 files changed, 19 insertions(+), 68 deletions(-)
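
Illustration only: the generic prepare_hugepage_range() accepts a range
when both addr and len are multiples of the huge page size. Below is a
minimal user-space sketch of that check. The sketch_* names are
hypothetical stand-ins for hstate_file()/huge_page_mask(), and the mask
is assumed to be ~(size - 1) for a power-of-two huge page size.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for huge_page_mask(h). Assumes huge_page_size
 * is a power of two, so the mask clears the in-page offset bits.
 */
static inline unsigned long sketch_huge_page_mask(unsigned long huge_page_size)
{
	return ~(huge_page_size - 1);
}

/*
 * Mirrors the two checks of the generic helper: reject a length or an
 * address that is not a multiple of the huge page size.
 */
static inline int sketch_prepare_hugepage_range(unsigned long addr,
						unsigned long len,
						unsigned long huge_page_size)
{
	unsigned long mask = sketch_huge_page_mask(huge_page_size);

	if (len & ~mask)
		return -EINVAL;
	if (addr & ~mask)
		return -EINVAL;

	return 0;
}
```

With a 2 MiB huge page size (0x200000), addr 0x40000000 with len
0x400000 passes, while an addr of 0x40001000 or a len of 0x1000 is
rejected with -EINVAL.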

diff --git a/arch/arm/include/asm/hugetlb.h b/arch/arm/include/asm/hugetlb.h
index 9ca14227eeb7..3fcef21ff2c2 100644
--- a/arch/arm/include/asm/hugetlb.h
+++ b/arch/arm/include/asm/hugetlb.h
@@ -33,17 +33,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 	return 0;
 }
 
-static inline int prepare_hugepage_range(struct file *file,
-					 unsigned long addr, unsigned long len)
-{
-	struct hstate *h = hstate_file(file);
-	if (len & ~huge_page_mask(h))
-		return -EINVAL;
-	if (addr & ~huge_page_mask(h))
-		return -EINVAL;
-	return 0;
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 	clear_bit(PG_dcache_clean, &page->flags);
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 1fd64ebf0cd7..3e7f6e69b28d 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -31,17 +31,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 	return 0;
 }
 
-static inline int prepare_hugepage_range(struct file *file,
-					 unsigned long addr, unsigned long len)
-{
-	struct hstate *h = hstate_file(file);
-	if (len & ~huge_page_mask(h))
-		return -EINVAL;
-	if (addr & ~huge_page_mask(h))
-		return -EINVAL;
-	return 0;
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 	clear_bit(PG_dcache_clean, &page->flags);
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index 82fe3d7a38d9..cbe296271030 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -9,6 +9,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 			    unsigned long end, unsigned long floor,
 			    unsigned long ceiling);
 
+#define __HAVE_ARCH_PREPARE_HUGEPAGE_RANGE
 int prepare_hugepage_range(struct file *file,
 			unsigned long addr, unsigned long len);
 
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index b3d6bb53ee6e..6ff2531cfb1d 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -18,6 +18,7 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 	return 0;
 }
 
+#define __HAVE_ARCH_PREPARE_HUGEPAGE_RANGE
 static inline int prepare_hugepage_range(struct file *file,
 					 unsigned long addr,
 					 unsigned long len)
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index 5a102d7251e4..fb7e0fd858a3 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -22,6 +22,7 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
  * If the arch doesn't supply something else, assume that hugepage
  * size aligned regions are ok without further preparation.
  */
+#define __HAVE_ARCH_PREPARE_HUGEPAGE_RANGE
 static inline int prepare_hugepage_range(struct file *file,
 			unsigned long addr, unsigned long len)
 {
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 7123599089c6..69c14ecac133 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -117,21 +117,6 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 			    unsigned long end, unsigned long floor,
 			    unsigned long ceiling);
 
-/*
- * If the arch doesn't supply something else, assume that hugepage
- * size aligned regions are ok without further preparation.
- */
-static inline int prepare_hugepage_range(struct file *file,
-			unsigned long addr, unsigned long len)
-{
-	struct hstate *h = hstate_file(file);
-	if (len & ~huge_page_mask(h))
-		return -EINVAL;
-	if (addr & ~huge_page_mask(h))
-		return -EINVAL;
-	return 0;
-}
-
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index 54f65094efe6..f1bbd255ee43 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -15,6 +15,7 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
  * If the arch doesn't supply something else, assume that hugepage
  * size aligned regions are ok without further preparation.
  */
+#define __HAVE_ARCH_PREPARE_HUGEPAGE_RANGE
 static inline int prepare_hugepage_range(struct file *file,
 			unsigned long addr, unsigned long len)
 {
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index f661362376e0..2101ea217f33 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -26,22 +26,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 	return 0;
 }
 
-/*
- * If the arch doesn't supply something else, assume that hugepage
- * size aligned regions are ok without further preparation.
- */
-static inline int prepare_hugepage_range(struct file *file,
-			unsigned long addr, unsigned long len)
-{
-	struct hstate *h = hstate_file(file);
-
-	if (len & ~huge_page_mask(h))
-		return -EINVAL;
-	if (addr & ~huge_page_mask(h))
-		return -EINVAL;
-	return 0;
-}
-
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 3cd3a2c9840e..59c056adb3c9 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -13,21 +13,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 	return 0;
 }
 
-/*
- * If the arch doesn't supply something else, assume that hugepage
- * size aligned regions are ok without further preparation.
- */
-static inline int prepare_hugepage_range(struct file *file,
-			unsigned long addr, unsigned long len)
-{
-	struct hstate *h = hstate_file(file);
-	if (len & ~huge_page_mask(h))
-		return -EINVAL;
-	if (addr & ~huge_page_mask(h))
-		return -EINVAL;
-	return 0;
-}
-
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index cd9697672b79..6c0c8b0c71e0 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -87,4 +87,19 @@ static inline pte_t huge_pte_wrprotect(pte_t pte)
 }
 #endif
 
+#ifndef __HAVE_ARCH_PREPARE_HUGEPAGE_RANGE
+static inline int prepare_hugepage_range(struct file *file,
+		unsigned long addr, unsigned long len)
+{
+	struct hstate *h = hstate_file(file);
+
+	if (len & ~huge_page_mask(h))
+		return -EINVAL;
+	if (addr & ~huge_page_mask(h))
+		return -EINVAL;
+
+	return 0;
+}
+#endif
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2


* [PATCH v5 07/11] hugetlb: Introduce generic version of huge_pte_wrprotect
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC (permalink / raw)
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

The arm, arm64, ia64, mips, parisc, powerpc, sh, sparc and x86
architectures use the same version of huge_pte_wrprotect, so move
this common implementation into asm-generic/hugetlb.h.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb.h     | 5 -----
 arch/arm64/include/asm/hugetlb.h   | 5 -----
 arch/ia64/include/asm/hugetlb.h    | 5 -----
 arch/mips/include/asm/hugetlb.h    | 5 -----
 arch/parisc/include/asm/hugetlb.h  | 5 -----
 arch/powerpc/include/asm/hugetlb.h | 5 -----
 arch/sh/include/asm/hugetlb.h      | 5 -----
 arch/sparc/include/asm/hugetlb.h   | 5 -----
 arch/x86/include/asm/hugetlb.h     | 5 -----
 include/asm-generic/hugetlb.h      | 7 +++++++
 10 files changed, 7 insertions(+), 45 deletions(-)

diff --git a/arch/arm/include/asm/hugetlb.h b/arch/arm/include/asm/hugetlb.h
index c821b550d6a4..9ca14227eeb7 100644
--- a/arch/arm/include/asm/hugetlb.h
+++ b/arch/arm/include/asm/hugetlb.h
@@ -44,11 +44,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline pte_t huge_pte_wrprotect(pte_t pte)
-{
-	return pte_wrprotect(pte);
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 	clear_bit(PG_dcache_clean, &page->flags);
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 49247c6f94db..1fd64ebf0cd7 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -42,11 +42,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline pte_t huge_pte_wrprotect(pte_t pte)
-{
-	return pte_wrprotect(pte);
-}
-
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
 	clear_bit(PG_dcache_clean, &page->flags);
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index bf573500b3c4..82fe3d7a38d9 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -26,11 +26,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline pte_t huge_pte_wrprotect(pte_t pte)
-{
-	return pte_wrprotect(pte);
-}
-
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 1c9c4531376c..b3d6bb53ee6e 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -62,11 +62,6 @@ static inline int huge_pte_none(pte_t pte)
 	return !val || (val == (unsigned long)invalid_pte_table);
 }
 
-static inline pte_t huge_pte_wrprotect(pte_t pte)
-{
-	return pte_wrprotect(pte);
-}
-
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index c09d8c74553c..5a102d7251e4 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -38,11 +38,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline pte_t huge_pte_wrprotect(pte_t pte)
-{
-	return pte_wrprotect(pte);
-}
-
 void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep);
 
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 3562d46585ba..7123599089c6 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -152,11 +152,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 	flush_hugetlb_page(vma, addr);
 }
 
-static inline pte_t huge_pte_wrprotect(pte_t pte)
-{
-	return pte_wrprotect(pte);
-}
-
 extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 				      unsigned long addr, pte_t *ptep,
 				      pte_t pte, int dirty);
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index a9f8266f33cf..54f65094efe6 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -31,11 +31,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline pte_t huge_pte_wrprotect(pte_t pte)
-{
-	return pte_wrprotect(pte);
-}
-
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 11115bbd712e..f661362376e0 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -48,11 +48,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline pte_t huge_pte_wrprotect(pte_t pte)
-{
-	return pte_wrprotect(pte);
-}
-
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 42d872054791..3cd3a2c9840e 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -28,11 +28,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline pte_t huge_pte_wrprotect(pte_t pte)
-{
-	return pte_wrprotect(pte);
-}
-
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 2fc3d68424e9..cd9697672b79 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -80,4 +80,11 @@ static inline int huge_pte_none(pte_t pte)
 }
 #endif
 
+#ifndef __HAVE_ARCH_HUGE_PTE_WRPROTECT
+static inline pte_t huge_pte_wrprotect(pte_t pte)
+{
+	return pte_wrprotect(pte);
+}
+#endif
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2


* [PATCH v5 06/11] hugetlb: Introduce generic version of huge_pte_none
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC (permalink / raw)
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

The arm, arm64, ia64, parisc, powerpc, sh, sparc and x86 architectures
use the same version of huge_pte_none, so move this common
implementation into asm-generic/hugetlb.h. mips keeps its own version
and defines __HAVE_ARCH_HUGE_PTE_NONE.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb.h     | 5 -----
 arch/arm64/include/asm/hugetlb.h   | 5 -----
 arch/ia64/include/asm/hugetlb.h    | 5 -----
 arch/mips/include/asm/hugetlb.h    | 1 +
 arch/parisc/include/asm/hugetlb.h  | 5 -----
 arch/powerpc/include/asm/hugetlb.h | 5 -----
 arch/sh/include/asm/hugetlb.h      | 5 -----
 arch/sparc/include/asm/hugetlb.h   | 5 -----
 arch/x86/include/asm/hugetlb.h     | 5 -----
 include/asm-generic/hugetlb.h      | 7 +++++++
 10 files changed, 8 insertions(+), 40 deletions(-)
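
Illustration only: the series relies on a guard macro so that an
architecture can keep its own helper while everyone else picks up the
generic one. Below is a minimal user-space sketch of that pattern. All
sketch_* and SKETCH_* names are hypothetical, and the "arch" override
is a simplified variant of the mips helper (it only ignores a global
bit, without the invalid_pte_table comparison).

```c
/* Hypothetical PTE type and base helper, standing in for pte_t/pte_none(). */
typedef struct { unsigned long val; } sketch_pte_t;

static inline sketch_pte_t sketch_mk_pte(unsigned long val)
{
	sketch_pte_t pte = { val };

	return pte;
}

static inline int sketch_pte_none(sketch_pte_t pte)
{
	return pte.val == 0;
}

/* --- what an arch header would contain when it overrides the helper --- */
#define SKETCH_PAGE_GLOBAL 0x1UL
#define SKETCH_HAVE_ARCH_HUGE_PTE_NONE
static inline int sketch_huge_pte_none(sketch_pte_t pte)
{
	/* An entry holding only the global bit still counts as empty. */
	return (pte.val & ~SKETCH_PAGE_GLOBAL) == 0;
}

/* --- what the generic header would contain --- */
#ifndef SKETCH_HAVE_ARCH_HUGE_PTE_NONE
static inline int sketch_huge_pte_none(sketch_pte_t pte)
{
	/* Fallback used only when no arch version was declared above. */
	return sketch_pte_none(pte);
}
#endif
```

Because the guard macro is defined before the generic part is seen, only
the arch version is compiled. Removing the define and the arch function
makes the generic fallback take over without touching any caller.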

diff --git a/arch/arm/include/asm/hugetlb.h b/arch/arm/include/asm/hugetlb.h
index 537660891f9f..c821b550d6a4 100644
--- a/arch/arm/include/asm/hugetlb.h
+++ b/arch/arm/include/asm/hugetlb.h
@@ -44,11 +44,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline int huge_pte_none(pte_t pte)
-{
-	return pte_none(pte);
-}
-
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
 	return pte_wrprotect(pte);
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 4c8dd488554d..49247c6f94db 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -42,11 +42,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline int huge_pte_none(pte_t pte)
-{
-	return pte_none(pte);
-}
-
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
 	return pte_wrprotect(pte);
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index 41b5f6adeee4..bf573500b3c4 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -26,11 +26,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline int huge_pte_none(pte_t pte)
-{
-	return pte_none(pte);
-}
-
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
 	return pte_wrprotect(pte);
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 7df1f116a3cc..1c9c4531376c 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -55,6 +55,7 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 	flush_tlb_page(vma, addr & huge_page_mask(hstate_vma(vma)));
 }
 
+#define __HAVE_ARCH_HUGE_PTE_NONE
 static inline int huge_pte_none(pte_t pte)
 {
 	unsigned long val = pte_val(pte) & ~_PAGE_GLOBAL;
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index 9afff26747a1..c09d8c74553c 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -38,11 +38,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline int huge_pte_none(pte_t pte)
-{
-	return pte_none(pte);
-}
-
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
 	return pte_wrprotect(pte);
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 0b02856aa85b..3562d46585ba 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -152,11 +152,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 	flush_hugetlb_page(vma, addr);
 }
 
-static inline int huge_pte_none(pte_t pte)
-{
-	return pte_none(pte);
-}
-
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
 	return pte_wrprotect(pte);
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index 9abf9c86b769..a9f8266f33cf 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -31,11 +31,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline int huge_pte_none(pte_t pte)
-{
-	return pte_none(pte);
-}
-
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
 	return pte_wrprotect(pte);
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 651a9593fcee..11115bbd712e 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -48,11 +48,6 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 }
 
-static inline int huge_pte_none(pte_t pte)
-{
-	return pte_none(pte);
-}
-
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
 	return pte_wrprotect(pte);
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index fd59673e7a0a..42d872054791 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -28,11 +28,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline int huge_pte_none(pte_t pte)
-{
-	return pte_none(pte);
-}
-
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
 	return pte_wrprotect(pte);
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index ffa63fd8388d..2fc3d68424e9 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -73,4 +73,11 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 }
 #endif
 
+#ifndef __HAVE_ARCH_HUGE_PTE_NONE
+static inline int huge_pte_none(pte_t pte)
+{
+	return pte_none(pte);
+}
+#endif
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2


* [PATCH v5 05/11] hugetlb: Introduce generic version of huge_ptep_clear_flush
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC (permalink / raw)
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

The arm and x86 architectures use the same version of
huge_ptep_clear_flush, so move this common implementation into
asm-generic/hugetlb.h. The other architectures keep their own version
and define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb-3level.h | 6 ------
 arch/arm64/include/asm/hugetlb.h      | 1 +
 arch/ia64/include/asm/hugetlb.h       | 1 +
 arch/mips/include/asm/hugetlb.h       | 1 +
 arch/parisc/include/asm/hugetlb.h     | 1 +
 arch/powerpc/include/asm/hugetlb.h    | 1 +
 arch/sh/include/asm/hugetlb.h         | 1 +
 arch/sparc/include/asm/hugetlb.h      | 1 +
 arch/x86/include/asm/hugetlb.h        | 6 ------
 include/asm-generic/hugetlb.h         | 8 ++++++++
 10 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/arch/arm/include/asm/hugetlb-3level.h b/arch/arm/include/asm/hugetlb-3level.h
index ad36e84b819a..b897541520ef 100644
--- a/arch/arm/include/asm/hugetlb-3level.h
+++ b/arch/arm/include/asm/hugetlb-3level.h
@@ -37,12 +37,6 @@ static inline pte_t huge_ptep_get(pte_t *ptep)
 	return retval;
 }
 
-static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
-					 unsigned long addr, pte_t *ptep)
-{
-	ptep_clear_flush(vma, addr, ptep);
-}
-
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 6ae0bcafe162..4c8dd488554d 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -71,6 +71,7 @@ extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 				     unsigned long addr, pte_t *ptep);
 extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
 				    unsigned long addr, pte_t *ptep);
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
 				  unsigned long addr, pte_t *ptep);
 #define __HAVE_ARCH_HUGE_PTE_CLEAR
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index 6719c74da0de..41b5f6adeee4 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -20,6 +20,7 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 		REGION_NUMBER((addr)+(len)-1) == RGN_HPAGE);
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 0959cc5a41fa..7df1f116a3cc 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -48,6 +48,7 @@ static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 	return pte;
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index 6e281e1bb336..9afff26747a1 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -32,6 +32,7 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 970101cf9c82..0b02856aa85b 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -143,6 +143,7 @@ static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 #endif
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index 08ee6c00b5e9..9abf9c86b769 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -25,6 +25,7 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 944e3a4bfaff..651a9593fcee 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -42,6 +42,7 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index e9e7fef867ad..fd59673e7a0a 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -28,12 +28,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
-					 unsigned long addr, pte_t *ptep)
-{
-	ptep_clear_flush(vma, addr, ptep);
-}
-
 static inline int huge_pte_none(pte_t pte)
 {
 	return pte_none(pte);
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 0f6f151780dd..ffa63fd8388d 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -65,4 +65,12 @@ static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 }
 #endif
 
+#ifndef __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
+static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
+		unsigned long addr, pte_t *ptep)
+{
+	ptep_clear_flush(vma, addr, ptep);
+}
+#endif
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2


* [PATCH v5 04/11] hugetlb: Introduce generic version of huge_ptep_get_and_clear
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC (permalink / raw)
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

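The arm, ia64, sh and x86 architectures use the same version of
huge_ptep_get_and_clear, so move this common implementation into
asm-generic/hugetlb.h. arm64, mips, parisc, powerpc and sparc keep
their own version and define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR.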

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb-3level.h | 6 ------
 arch/arm64/include/asm/hugetlb.h      | 1 +
 arch/ia64/include/asm/hugetlb.h       | 6 ------
 arch/mips/include/asm/hugetlb.h       | 1 +
 arch/parisc/include/asm/hugetlb.h     | 1 +
 arch/powerpc/include/asm/hugetlb.h    | 1 +
 arch/sh/include/asm/hugetlb.h         | 6 ------
 arch/sparc/include/asm/hugetlb.h      | 1 +
 arch/x86/include/asm/hugetlb.h        | 6 ------
 include/asm-generic/hugetlb.h         | 8 ++++++++
 10 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/arch/arm/include/asm/hugetlb-3level.h b/arch/arm/include/asm/hugetlb-3level.h
index 398fb06e8207..ad36e84b819a 100644
--- a/arch/arm/include/asm/hugetlb-3level.h
+++ b/arch/arm/include/asm/hugetlb-3level.h
@@ -49,12 +49,6 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	ptep_set_wrprotect(mm, addr, ptep);
 }
 
-static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-					    unsigned long addr, pte_t *ptep)
-{
-	return ptep_get_and_clear(mm, addr, ptep);
-}
-
 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep,
 					     pte_t pte, int dirty)
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 874661a1dff1..6ae0bcafe162 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -66,6 +66,7 @@ extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 				      unsigned long addr, pte_t *ptep,
 				      pte_t pte, int dirty);
+#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 				     unsigned long addr, pte_t *ptep);
 extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index a235d6f60fb3..6719c74da0de 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -20,12 +20,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 		REGION_NUMBER((addr)+(len)-1) == RGN_HPAGE);
 }
 
-static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-					    unsigned long addr, pte_t *ptep)
-{
-	return ptep_get_and_clear(mm, addr, ptep);
-}
-
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 8ea439041d5d..0959cc5a41fa 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -36,6 +36,7 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index 77c8adbac7c3..6e281e1bb336 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -8,6 +8,7 @@
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte);
 
+#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep);
 
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 0794b53439d4..970101cf9c82 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -132,6 +132,7 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
+#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index bc552e37c1c9..08ee6c00b5e9 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -25,12 +25,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-					    unsigned long addr, pte_t *ptep)
-{
-	return ptep_get_and_clear(mm, addr, ptep);
-}
-
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 16b0c53ea6c9..944e3a4bfaff 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -16,6 +16,7 @@ extern struct pud_huge_patch_entry __pud_huge_patch, __pud_huge_patch_end;
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte);
 
+#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep);
 
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 8db9a761964d..e9e7fef867ad 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -28,12 +28,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-					    unsigned long addr, pte_t *ptep)
-{
-	return ptep_get_and_clear(mm, addr, ptep);
-}
-
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index ee010b756246..0f6f151780dd 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -57,4 +57,12 @@ static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 }
 #endif
 
+#ifndef __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
+static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+		unsigned long addr, pte_t *ptep)
+{
+	return ptep_get_and_clear(mm, addr, ptep);
+}
+#endif
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2

* [PATCH v5 03/11] hugetlb: Introduce generic version of set_huge_pte_at
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC (permalink / raw)
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

The arm, ia64, mips, powerpc, sh and x86 architectures all use the
same version of set_huge_pte_at, so move this generic
implementation into asm-generic/hugetlb.h.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb-3level.h | 6 ------
 arch/arm64/include/asm/hugetlb.h      | 1 +
 arch/ia64/include/asm/hugetlb.h       | 6 ------
 arch/mips/include/asm/hugetlb.h       | 6 ------
 arch/parisc/include/asm/hugetlb.h     | 1 +
 arch/powerpc/include/asm/hugetlb.h    | 6 ------
 arch/sh/include/asm/hugetlb.h         | 6 ------
 arch/sparc/include/asm/hugetlb.h      | 1 +
 arch/x86/include/asm/hugetlb.h        | 6 ------
 include/asm-generic/hugetlb.h         | 8 +++++++-
 10 files changed, 10 insertions(+), 37 deletions(-)

diff --git a/arch/arm/include/asm/hugetlb-3level.h b/arch/arm/include/asm/hugetlb-3level.h
index d4014fbe5ea3..398fb06e8207 100644
--- a/arch/arm/include/asm/hugetlb-3level.h
+++ b/arch/arm/include/asm/hugetlb-3level.h
@@ -37,12 +37,6 @@ static inline pte_t huge_ptep_get(pte_t *ptep)
 	return retval;
 }
 
-static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-				   pte_t *ptep, pte_t pte)
-{
-	set_pte_at(mm, addr, ptep, pte);
-}
-
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 4af1a800a900..874661a1dff1 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -60,6 +60,7 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 extern pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
 				struct page *page, int writable);
 #define arch_make_huge_pte arch_make_huge_pte
+#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 			    pte_t *ptep, pte_t pte);
 extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index afe9fa4d969b..a235d6f60fb3 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -20,12 +20,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 		REGION_NUMBER((addr)+(len)-1) == RGN_HPAGE);
 }
 
-static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-				   pte_t *ptep, pte_t pte)
-{
-	set_pte_at(mm, addr, ptep, pte);
-}
-
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 53764050243e..8ea439041d5d 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -36,12 +36,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-				   pte_t *ptep, pte_t pte)
-{
-	set_pte_at(mm, addr, ptep, pte);
-}
-
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index 28c23b68d38d..77c8adbac7c3 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -4,6 +4,7 @@
 
 #include <asm/page.h>
 
+#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte);
 
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index a7d5c739df9b..0794b53439d4 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -132,12 +132,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-				   pte_t *ptep, pte_t pte)
-{
-	set_pte_at(mm, addr, ptep, pte);
-}
-
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index f6a51b609409..bc552e37c1c9 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -25,12 +25,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-				   pte_t *ptep, pte_t pte)
-{
-	set_pte_at(mm, addr, ptep, pte);
-}
-
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 59d89b52ccb7..16b0c53ea6c9 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -12,6 +12,7 @@ struct pud_huge_patch_entry {
 extern struct pud_huge_patch_entry __pud_huge_patch, __pud_huge_patch_end;
 #endif
 
+#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte);
 
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 398da3b3414c..8db9a761964d 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -28,12 +28,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-				   pte_t *ptep, pte_t pte)
-{
-	set_pte_at(mm, addr, ptep, pte);
-}
-
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
 {
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index c697ca9dda18..ee010b756246 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -47,8 +47,14 @@ static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
 {
 	free_pgd_range(tlb, addr, end, floor, ceiling);
 }
+#endif
 
-
+#ifndef __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
+static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+		pte_t *ptep, pte_t pte)
+{
+	set_pte_at(mm, addr, ptep, pte);
+}
 #endif
 
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2

* [PATCH v5 02/11] hugetlb: Introduce generic version of hugetlb_free_pgd_range
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC (permalink / raw)
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

The arm, arm64, mips, parisc, sh and x86 architectures all use the
same version of hugetlb_free_pgd_range, so move this generic
implementation into asm-generic/hugetlb.h.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm/include/asm/hugetlb.h     |  9 ---------
 arch/arm64/include/asm/hugetlb.h   | 10 ----------
 arch/ia64/include/asm/hugetlb.h    |  5 +++--
 arch/mips/include/asm/hugetlb.h    | 13 ++-----------
 arch/parisc/include/asm/hugetlb.h  | 12 ++----------
 arch/powerpc/include/asm/hugetlb.h |  4 +++-
 arch/sh/include/asm/hugetlb.h      | 12 ++----------
 arch/sparc/include/asm/hugetlb.h   |  4 +++-
 arch/x86/include/asm/hugetlb.h     |  8 --------
 include/asm-generic/hugetlb.h      | 11 +++++++++++
 10 files changed, 26 insertions(+), 62 deletions(-)

diff --git a/arch/arm/include/asm/hugetlb.h b/arch/arm/include/asm/hugetlb.h
index 7d26f6c4f0f5..537660891f9f 100644
--- a/arch/arm/include/asm/hugetlb.h
+++ b/arch/arm/include/asm/hugetlb.h
@@ -27,15 +27,6 @@
 
 #include <asm/hugetlb-3level.h>
 
-static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
-					  unsigned long addr, unsigned long end,
-					  unsigned long floor,
-					  unsigned long ceiling)
-{
-	free_pgd_range(tlb, addr, end, floor, ceiling);
-}
-
-
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					 unsigned long addr, unsigned long len)
 {
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 3fcf14663dfa..4af1a800a900 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -25,16 +25,6 @@ static inline pte_t huge_ptep_get(pte_t *ptep)
 	return READ_ONCE(*ptep);
 }
 
-
-
-static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
-					  unsigned long addr, unsigned long end,
-					  unsigned long floor,
-					  unsigned long ceiling)
-{
-	free_pgd_range(tlb, addr, end, floor, ceiling);
-}
-
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					 unsigned long addr, unsigned long len)
 {
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index 74d2a5540aaf..afe9fa4d969b 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -3,9 +3,8 @@
 #define _ASM_IA64_HUGETLB_H
 
 #include <asm/page.h>
-#include <asm-generic/hugetlb.h>
-
 
+#define __HAVE_ARCH_HUGETLB_FREE_PGD_RANGE
 void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 			    unsigned long end, unsigned long floor,
 			    unsigned long ceiling);
@@ -70,4 +69,6 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
 
+#include <asm-generic/hugetlb.h>
+
 #endif /* _ASM_IA64_HUGETLB_H */
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 982bc0685330..53764050243e 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -10,8 +10,6 @@
 #define __ASM_HUGETLB_H
 
 #include <asm/page.h>
-#include <asm-generic/hugetlb.h>
-
 
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					 unsigned long addr,
@@ -38,15 +36,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
-					  unsigned long addr,
-					  unsigned long end,
-					  unsigned long floor,
-					  unsigned long ceiling)
-{
-	free_pgd_range(tlb, addr, end, floor, ceiling);
-}
-
 static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 				   pte_t *ptep, pte_t pte)
 {
@@ -114,4 +103,6 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
 
+#include <asm-generic/hugetlb.h>
+
 #endif /* __ASM_HUGETLB_H */
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index 58e0f4620426..28c23b68d38d 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -3,8 +3,6 @@
 #define _ASM_PARISC64_HUGETLB_H
 
 #include <asm/page.h>
-#include <asm-generic/hugetlb.h>
-
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte);
@@ -32,14 +30,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
-					  unsigned long addr, unsigned long end,
-					  unsigned long floor,
-					  unsigned long ceiling)
-{
-	free_pgd_range(tlb, addr, end, floor, ceiling);
-}
-
 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 					 unsigned long addr, pte_t *ptep)
 {
@@ -71,4 +61,6 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
 
+#include <asm-generic/hugetlb.h>
+
 #endif /* _ASM_PARISC64_HUGETLB_H */
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 3225eb6402cc..a7d5c739df9b 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -4,7 +4,6 @@
 
 #ifdef CONFIG_HUGETLB_PAGE
 #include <asm/page.h>
-#include <asm-generic/hugetlb.h>
 
 extern struct kmem_cache *hugepte_cache;
 
@@ -113,6 +112,7 @@ static inline void flush_hugetlb_page(struct vm_area_struct *vma,
 void flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
 #endif
 
+#define __HAVE_ARCH_HUGETLB_FREE_PGD_RANGE
 void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 			    unsigned long end, unsigned long floor,
 			    unsigned long ceiling);
@@ -179,6 +179,8 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
 
+#include <asm-generic/hugetlb.h>
+
 #else /* ! CONFIG_HUGETLB_PAGE */
 static inline void flush_hugetlb_page(struct vm_area_struct *vma,
 				      unsigned long vmaddr)
diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h
index 735939c0f513..f6a51b609409 100644
--- a/arch/sh/include/asm/hugetlb.h
+++ b/arch/sh/include/asm/hugetlb.h
@@ -4,8 +4,6 @@
 
 #include <asm/cacheflush.h>
 #include <asm/page.h>
-#include <asm-generic/hugetlb.h>
-
 
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					 unsigned long addr,
@@ -27,14 +25,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
-					  unsigned long addr, unsigned long end,
-					  unsigned long floor,
-					  unsigned long ceiling)
-{
-	free_pgd_range(tlb, addr, end, floor, ceiling);
-}
-
 static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 				   pte_t *ptep, pte_t pte)
 {
@@ -85,4 +75,6 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 	clear_bit(PG_dcache_clean, &page->flags);
 }
 
+#include <asm-generic/hugetlb.h>
+
 #endif /* _ASM_SH_HUGETLB_H */
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 300557c66698..59d89b52ccb7 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -3,7 +3,6 @@
 #define _ASM_SPARC64_HUGETLB_H
 
 #include <asm/page.h>
-#include <asm-generic/hugetlb.h>
 
 #ifdef CONFIG_HUGETLB_PAGE
 struct pud_huge_patch_entry {
@@ -84,8 +83,11 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
 
+#define __HAVE_ARCH_HUGETLB_FREE_PGD_RANGE
 void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 			    unsigned long end, unsigned long floor,
 			    unsigned long ceiling);
 
+#include <asm-generic/hugetlb.h>
+
 #endif /* _ASM_SPARC64_HUGETLB_H */
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 5ed826da5e07..398da3b3414c 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -28,14 +28,6 @@ static inline int prepare_hugepage_range(struct file *file,
 	return 0;
 }
 
-static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
-					  unsigned long addr, unsigned long end,
-					  unsigned long floor,
-					  unsigned long ceiling)
-{
-	free_pgd_range(tlb, addr, end, floor, ceiling);
-}
-
 static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 				   pte_t *ptep, pte_t pte)
 {
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 3da7cff52360..c697ca9dda18 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -40,4 +40,15 @@ static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 }
 #endif
 
+#ifndef __HAVE_ARCH_HUGETLB_FREE_PGD_RANGE
+static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
+		unsigned long addr, unsigned long end,
+		unsigned long floor, unsigned long ceiling)
+{
+	free_pgd_range(tlb, addr, end, floor, ceiling);
+}
+
+
+#endif
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
-- 
2.16.2

* [PATCH v5 01/11] hugetlb: Harmonize hugetlb.h arch specific defines with pgtable.h
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC (permalink / raw)
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

asm-generic/hugetlb.h provides generic implementations of
hugetlb-related functions. Use __HAVE_ARCH_HUGE* defines so that
arch-specific implementations of hugetlb functions are consistent with
the pgtable.h scheme.
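
For illustration only, here is a minimal userspace sketch of the two
guard styles involved. Every demo_* and DEMO_* name is hypothetical and
does not exist in the kernel; the return values exist only so the
selected version can be observed.

```c
/*
 * Style A (what huge_pte_clear used before this patch): the arch
 * announces its override by defining the function name as a macro
 * of itself.
 */
static inline int demo_clear_a(int *slot)
{
	*slot = 0;
	return 1;	/* 1: the "arch" version ran */
}
#define demo_clear_a demo_clear_a

#ifndef demo_clear_a
static inline int demo_clear_a(int *slot)
{
	*slot = 0;
	return 0;	/* 0: the "generic" version ran */
}
#endif

/*
 * Style B (the pgtable.h-like scheme): the arch defines a separate
 * marker macro next to its implementation.
 */
#define DEMO_HAVE_ARCH_CLEAR_B
static inline int demo_clear_b(int *slot)
{
	*slot = 0;
	return 1;	/* 1: the "arch" version ran */
}

#ifndef DEMO_HAVE_ARCH_CLEAR_B
static inline int demo_clear_b(int *slot)
{
	*slot = 0;
	return 0;	/* 0: the "generic" version ran */
}
#endif
```

Both styles select the same function at compile time; the second one
simply keeps the marker name distinct from the function name, as
asm-generic/pgtable.h does.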

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm64/include/asm/hugetlb.h | 2 +-
 include/asm-generic/hugetlb.h    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index e73f68569624..3fcf14663dfa 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -81,9 +81,9 @@ extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
 				    unsigned long addr, pte_t *ptep);
 extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
 				  unsigned long addr, pte_t *ptep);
+#define __HAVE_ARCH_HUGE_PTE_CLEAR
 extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 			   pte_t *ptep, unsigned long sz);
-#define huge_pte_clear huge_pte_clear
 extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
 				 pte_t *ptep, pte_t pte, unsigned long sz);
 #define set_huge_swap_pte_at set_huge_swap_pte_at
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 9d0cde8ab716..3da7cff52360 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -32,7 +32,7 @@ static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot)
 	return pte_modify(pte, newprot);
 }
 
-#ifndef huge_pte_clear
+#ifndef __HAVE_ARCH_HUGE_PTE_CLEAR
 static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 		    pte_t *ptep, unsigned long sz)
 {
-- 
2.16.2

* [PATCH v5 00/11] hugetlb: Factorize hugetlb architecture primitives
From: Alexandre Ghiti @ 2018-07-31  6:01 UTC (permalink / raw)
  To: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
  Cc: Alexandre Ghiti

[CC linux-mm for inclusion in -mm tree] 

To reduce copy/paste of functions across architectures, and thereby
make the riscv hugetlb port (and future ports) simpler and smaller, this
patchset factors out the numerous hugetlb primitives that are
defined across all the architectures.

Except for prepare_hugepage_range, this patchset moves the versions that
are just pass-throughs to standard pte primitives into
asm-generic/hugetlb.h, using the same #ifdef semantics that can be
found in asm-generic/pgtable.h, i.e. __HAVE_ARCH_***.
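
As a rough sketch of how that scheme plays out for one architecture
(userspace C, with hypothetical demo_*/DEMO_* names standing in for
pte_t and the real primitives): the arch part overrides one primitive
and silently inherits the other. Note that the generic part has to be
seen after the arch defines, which appears to be why patch 02 moves the
#include <asm-generic/hugetlb.h> to the end of several arch headers.

```c
/* Hypothetical stand-in for pte_t. */
typedef unsigned long demo_pte_t;

/* "Arch header" part: this arch overrides only the setter. */
#define DEMO_HAVE_ARCH_SET_HUGE_PTE_AT
static inline void demo_set_huge_pte_at(demo_pte_t *ptep, demo_pte_t pte)
{
	/* Arch-specific twist for the demo: always set a marker bit. */
	*ptep = pte | 0x1UL;
}

/* "Generic header" part: must come after the arch defines above. */
#ifndef DEMO_HAVE_ARCH_SET_HUGE_PTE_AT
static inline void demo_set_huge_pte_at(demo_pte_t *ptep, demo_pte_t pte)
{
	/* Plain pass-through, skipped here because the arch overrides it. */
	*ptep = pte;
}
#endif

#ifndef DEMO_HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
static inline demo_pte_t demo_huge_ptep_get_and_clear(demo_pte_t *ptep)
{
	/* No arch marker was defined, so this generic version is used. */
	demo_pte_t old = *ptep;

	*ptep = 0;
	return old;
}
#endif
```

A new port that needs nothing special would define no marker at all and
pick up every generic version unchanged.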

The s390 architecture has not been tackled in this series since it does
not use asm-generic/hugetlb.h at all.
powerpc could be factored a bit more (cf. huge_ptep_set_wrprotect).

This patchset compiles successfully on all addressed architectures
(except for parisc, where the failure does not come from this series).

Tested-by: Helge Deller <deller@gmx.de> # parisc
Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts

Changelog:

v5:
  As suggested by Mike Kravetz, there is no need to move the #include
  <asm-generic/hugetlb.h> for the arm and x86 architectures; let it live
  at the top of the file.

v4:
  Fix powerpc build error due to misplacing the #include
  <asm-generic/hugetlb.h> outside of #ifdef CONFIG_HUGETLB_PAGE, as
  pointed out by Christophe Leroy.

v1, v2, v3:
  Same version, just problems with the email provider and misuse of the
  --batch-size option of git send-email.

Alexandre Ghiti (11):
  hugetlb: Harmonize hugetlb.h arch specific defines with pgtable.h
  hugetlb: Introduce generic version of hugetlb_free_pgd_range
  hugetlb: Introduce generic version of set_huge_pte_at
  hugetlb: Introduce generic version of huge_ptep_get_and_clear
  hugetlb: Introduce generic version of huge_ptep_clear_flush
  hugetlb: Introduce generic version of huge_pte_none
  hugetlb: Introduce generic version of huge_pte_wrprotect
  hugetlb: Introduce generic version of prepare_hugepage_range
  hugetlb: Introduce generic version of huge_ptep_set_wrprotect
  hugetlb: Introduce generic version of huge_ptep_set_access_flags
  hugetlb: Introduce generic version of huge_ptep_get

 arch/arm/include/asm/hugetlb-3level.h        | 32 +---------
 arch/arm/include/asm/hugetlb.h               | 30 ----------
 arch/arm64/include/asm/hugetlb.h             | 39 +++---------
 arch/ia64/include/asm/hugetlb.h              | 47 ++-------------
 arch/mips/include/asm/hugetlb.h              | 40 +++----------
 arch/parisc/include/asm/hugetlb.h            | 33 +++--------
 arch/powerpc/include/asm/book3s/32/pgtable.h |  2 +
 arch/powerpc/include/asm/book3s/64/pgtable.h |  1 +
 arch/powerpc/include/asm/hugetlb.h           | 43 ++------------
 arch/powerpc/include/asm/nohash/32/pgtable.h |  2 +
 arch/powerpc/include/asm/nohash/64/pgtable.h |  1 +
 arch/sh/include/asm/hugetlb.h                | 54 ++---------------
 arch/sparc/include/asm/hugetlb.h             | 40 +++----------
 arch/x86/include/asm/hugetlb.h               | 69 ----------------------
 include/asm-generic/hugetlb.h                | 88 +++++++++++++++++++++++++++-
 15 files changed, 139 insertions(+), 382 deletions(-)

-- 
2.16.2

* Re: [GIT PULL 0/5] perf/urgent fixes
From: Ingo Molnar @ 2018-07-31  5:51 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
	Alexander Shishkin, Breno Leitao, Daniel Borkmann, Dan Williams,
	David Ahern, Hendrik Brueckner, Jiri Olsa, Josh Poimboeuf,
	linuxppc-dev, Michael Ellerman, Mika Penttilä, Namhyung Kim,
	Peter Zijlstra, Prashant Bhole, Ravi Bangoria, Stephane Eranian,
	Thomas Gleixner, Thomas Richter, Tony Luck, Vince Weaver,
	Wang Nan, Arnaldo Carvalho de Melo
In-Reply-To: <20180730205031.12853-1-acme@kernel.org>


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling, just to get the build without warnings
> and finishing successfully in all my test environments,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 7f635ff187ab6be0b350b3ec06791e376af238ab:
> 
>   perf/core: Fix crash when using HW tracing kernel filters (2018-07-25 11:46:22 +0200)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-urgent-for-mingo-4.18-20180730
> 
> for you to fetch changes up to 44fe619b1418ff4e9d2f9518a940fbe2fb686a08:
> 
>   perf tools: Fix the build on the alpine:edge distro (2018-07-30 13:15:03 -0300)
> 
> ----------------------------------------------------------------
> perf/urgent fixes: (Arnaldo Carvalho de Melo)
> 
> - Update the tools copy of several files, including perf_event.h,
>   powerpc's asm/unistd.h (new io_pgetevents syscall), bpf.h and
>   x86's memcpy_64.s (used in 'perf bench mem'), silencing the
>   respective warnings during the perf tools build.
> 
> - Fix the build on the alpine:edge distro.
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (5):
>       tools headers uapi: Update tools's copy of linux/perf_event.h
>       tools headers powerpc: Update asm/unistd.h copy to pick new
>       tools headers uapi: Refresh linux/bpf.h copy
>       tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy'
>       perf tools: Fix the build on the alpine:edge distro
> 
>  tools/arch/powerpc/include/uapi/asm/unistd.h |   1 +
>  tools/arch/x86/include/asm/mcsafe_test.h     |  13 ++++
>  tools/arch/x86/lib/memcpy_64.S               | 112 +++++++++++++--------------
>  tools/include/uapi/linux/bpf.h               |  28 +++++--
>  tools/include/uapi/linux/perf_event.h        |   2 +
>  tools/perf/arch/x86/util/pmu.c               |   1 +
>  tools/perf/arch/x86/util/tsc.c               |   1 +
>  tools/perf/bench/Build                       |   1 +
>  tools/perf/bench/mem-memcpy-x86-64-asm.S     |   1 +
>  tools/perf/bench/mem-memcpy-x86-64-lib.c     |  24 ++++++
>  tools/perf/perf.h                            |   1 +
>  tools/perf/util/header.h                     |   1 +
>  tools/perf/util/namespaces.h                 |   1 +
>  13 files changed, 124 insertions(+), 63 deletions(-)
>  create mode 100644 tools/arch/x86/include/asm/mcsafe_test.h
>  create mode 100644 tools/perf/bench/mem-memcpy-x86-64-lib.c

Pulled, thanks a lot Arnaldo!

	Ingo

* Re: [PATCH 1/2] selftests/powerpc: Skip earlier in alignment_handler test
From: Andrew Donnellan @ 2018-07-31  5:29 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: mikey
In-Reply-To: <20180731044104.13728-1-mpe@ellerman.id.au>

On 31/07/18 14:41, Michael Ellerman wrote:
> Currently the alignment_handler test prints "Can't open /dev/fb0"
> about 80 times per run, which is a little annoying.
> 
> Refactor it to check earlier if it can open /dev/fb0 and skip if not,
> this results in each test printing something like:
> 
>    test: test_alignment_handler_vsx_206
>    tags: git_version:v4.18-rc3-134-gfb21a48904aa
>    [SKIP] Test skipped on line 291
>    skip: test_alignment_handler_vsx_206
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

This seems sensible.

Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

> ---
>   .../powerpc/alignment/alignment_handler.c          | 40 +++++++++++++++++++---
>   1 file changed, 35 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
> index 0f2698f9fd6d..0eddd16af49f 100644
> --- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
> +++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
> @@ -40,6 +40,7 @@
>   #include <sys/stat.h>
>   #include <fcntl.h>
>   #include <unistd.h>
> +#include <stdbool.h>
>   #include <stdio.h>
>   #include <stdlib.h>
>   #include <string.h>
> @@ -191,7 +192,7 @@ int test_memcmp(void *s1, void *s2, int n, int offset, char *test_name)
>    */
>   int do_test(char *test_name, void (*test_func)(char *, char *))
>   {
> -	int offset, width, fd, rc = 0, r;
> +	int offset, width, fd, rc, r;
>   	void *mem0, *mem1, *ci0, *ci1;
>   
>   	printf("\tDoing %s:\t", test_name);
> @@ -199,8 +200,8 @@ int do_test(char *test_name, void (*test_func)(char *, char *))
>   	fd = open("/dev/fb0", O_RDWR);
>   	if (fd < 0) {
>   		printf("\n");
> -		perror("Can't open /dev/fb0");
> -		SKIP_IF(1);
> +		perror("Can't open /dev/fb0 now?");
> +		return 1;
>   	}
>   
>   	ci0 = mmap(NULL, bufsize, PROT_WRITE, MAP_SHARED,
> @@ -226,6 +227,7 @@ int do_test(char *test_name, void (*test_func)(char *, char *))
>   		return rc;
>   	}
>   
> +	rc = 0;
>   	/* offset = 0 no alignment fault, so skip */
>   	for (offset = 1; offset < 16; offset++) {
>   		width = 16; /* vsx == 16 bytes */
> @@ -244,32 +246,50 @@ int do_test(char *test_name, void (*test_func)(char *, char *))
>   		r |= test_memcpy(mem1, mem0, width, offset, test_func);
>   		if (r && !debug) {
>   			printf("FAILED: Got signal");
> +			rc = 1;
>   			break;
>   		}
>   
>   		r |= test_memcmp(mem1, ci1, width, offset, test_name);
> -		rc |= r;
>   		if (r && !debug) {
>   			printf("FAILED: Wrong Data");
> +			rc = 1;
>   			break;
>   		}
>   	}
> -	if (!r)
> +
> +	if (rc == 0)
>   		printf("PASSED");
> +
>   	printf("\n");
>   
>   	munmap(ci0, bufsize);
>   	munmap(ci1, bufsize);
>   	free(mem0);
>   	free(mem1);
> +	close(fd);

Good catch :D

>   
>   	return rc;
>   }
>   
> +static bool can_open_fb0(void)
> +{
> +	int fd;
> +
> +	fd = open("/dev/fb0", O_RDWR);
> +	if (fd < 0)
> +		return false;
> +
> +	close(fd);
> +	return true;
> +}
> +
>   int test_alignment_handler_vsx_206(void)
>   {
>   	int rc = 0;
>   
> +	SKIP_IF(!can_open_fb0());
> +
>   	printf("VSX: 2.06B\n");
>   	LOAD_VSX_XFORM_TEST(lxvd2x);
>   	LOAD_VSX_XFORM_TEST(lxvw4x);
> @@ -285,6 +305,8 @@ int test_alignment_handler_vsx_207(void)
>   {
>   	int rc = 0;
>   
> +	SKIP_IF(!can_open_fb0());
> +
>   	printf("VSX: 2.07B\n");
>   	LOAD_VSX_XFORM_TEST(lxsspx);
>   	LOAD_VSX_XFORM_TEST(lxsiwax);
> @@ -298,6 +320,8 @@ int test_alignment_handler_vsx_300(void)
>   {
>   	int rc = 0;
>   
> +	SKIP_IF(!can_open_fb0());
> +
>   	SKIP_IF(!have_hwcap2(PPC_FEATURE2_ARCH_3_00));
>   	printf("VSX: 3.00B\n");
>   	LOAD_VMX_DFORM_TEST(lxsd);
> @@ -328,6 +352,8 @@ int test_alignment_handler_integer(void)
>   {
>   	int rc = 0;
>   
> +	SKIP_IF(!can_open_fb0());
> +
>   	printf("Integer\n");
>   	LOAD_DFORM_TEST(lbz);
>   	LOAD_DFORM_TEST(lbzu);
> @@ -383,6 +409,8 @@ int test_alignment_handler_vmx(void)
>   {
>   	int rc = 0;
>   
> +	SKIP_IF(!can_open_fb0());
> +
>   	printf("VMX\n");
>   	LOAD_VMX_XFORM_TEST(lvx);
>   
> @@ -408,6 +436,8 @@ int test_alignment_handler_fp(void)
>   {
>   	int rc = 0;
>   
> +	SKIP_IF(!can_open_fb0());
> +
>   	printf("Floating point\n");
>   	LOAD_FLOAT_DFORM_TEST(lfd);
>   	LOAD_FLOAT_XFORM_TEST(lfdx);
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited

* [PATCH 2/2] selftests/powerpc: Add more version checks to alignment_handler test
From: Michael Ellerman @ 2018-07-31  4:41 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, andrew.donnellan
In-Reply-To: <20180731044104.13728-1-mpe@ellerman.id.au>

The alignment_handler is documented to only work on Power8/Power9, but
we can make it run on older CPUs by guarding more of the tests with
feature checks.
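
As a rough illustration of what such a feature check amounts to, here is a
minimal user-space sketch that reads the auxiliary vector with glibc's
getauxval(). The helper names caps_have() and sketch_have_hwcap() are made up
for this note; this is not claimed to be the selftests' own have_hwcap()
implementation, and the feature mask is assumed to come from <asm/cputable.h>
on powerpc.

```c
#include <stdbool.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP (glibc) */

/* Hypothetical helper: true only if every bit in 'ftr' is set in 'caps'. */
static bool caps_have(unsigned long caps, unsigned long ftr)
{
	return (caps & ftr) == ftr;
}

/*
 * Hypothetical stand-in for a hwcap check: ask the auxiliary vector
 * which CPU features the kernel advertises to this process.
 */
static bool sketch_have_hwcap(unsigned long ftr)
{
	return caps_have(getauxval(AT_HWCAP), ftr);
}
```

A test would then bail out early with something like
SKIP_IF(!sketch_have_hwcap(mask)) before touching any instruction that the CPU
may not implement.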

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 .../powerpc/alignment/alignment_handler.c          | 76 +++++++++++++++++++---
 1 file changed, 68 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
index 0eddd16af49f..bcecc44627af 100644
--- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
+++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
@@ -49,6 +49,8 @@
 #include <setjmp.h>
 #include <signal.h>
 
+#include <asm/cputable.h>
+
 #include "utils.h"
 
 int bufsize;
@@ -289,6 +291,7 @@ int test_alignment_handler_vsx_206(void)
 	int rc = 0;
 
 	SKIP_IF(!can_open_fb0());
+	SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_06));
 
 	printf("VSX: 2.06B\n");
 	LOAD_VSX_XFORM_TEST(lxvd2x);
@@ -306,6 +309,7 @@ int test_alignment_handler_vsx_207(void)
 	int rc = 0;
 
 	SKIP_IF(!can_open_fb0());
+	SKIP_IF(!have_hwcap2(PPC_FEATURE2_ARCH_2_07));
 
 	printf("VSX: 2.07B\n");
 	LOAD_VSX_XFORM_TEST(lxsspx);
@@ -380,7 +384,6 @@ int test_alignment_handler_integer(void)
 	LOAD_DFORM_TEST(ldu);
 	LOAD_XFORM_TEST(ldx);
 	LOAD_XFORM_TEST(ldux);
-	LOAD_XFORM_TEST(ldbrx);
 	LOAD_DFORM_TEST(lmw);
 	STORE_DFORM_TEST(stb);
 	STORE_XFORM_TEST(stbx);
@@ -400,8 +403,28 @@ int test_alignment_handler_integer(void)
 	STORE_XFORM_TEST(stdx);
 	STORE_DFORM_TEST(stdu);
 	STORE_XFORM_TEST(stdux);
-	STORE_XFORM_TEST(stdbrx);
 	STORE_DFORM_TEST(stmw);
+
+	if (have_hwcap(PPC_FEATURE_ARCH_2_06)) {
+		LOAD_XFORM_TEST(ldbrx);
+		STORE_XFORM_TEST(stdbrx);
+	}
+
+	return rc;
+}
+
+int test_alignment_handler_integer_206(void)
+{
+	int rc = 0;
+
+	SKIP_IF(!can_open_fb0());
+	SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_06));
+
+	printf("Integer: 2.06\n");
+
+	LOAD_XFORM_TEST(ldbrx);
+	STORE_XFORM_TEST(stdbrx);
+
 	return rc;
 }
 
@@ -410,6 +433,7 @@ int test_alignment_handler_vmx(void)
 	int rc = 0;
 
 	SKIP_IF(!can_open_fb0());
+	SKIP_IF(!have_hwcap(PPC_FEATURE_HAS_ALTIVEC));
 
 	printf("VMX\n");
 	LOAD_VMX_XFORM_TEST(lvx);
@@ -441,20 +465,14 @@ int test_alignment_handler_fp(void)
 	printf("Floating point\n");
 	LOAD_FLOAT_DFORM_TEST(lfd);
 	LOAD_FLOAT_XFORM_TEST(lfdx);
-	LOAD_FLOAT_DFORM_TEST(lfdp);
-	LOAD_FLOAT_XFORM_TEST(lfdpx);
 	LOAD_FLOAT_DFORM_TEST(lfdu);
 	LOAD_FLOAT_XFORM_TEST(lfdux);
 	LOAD_FLOAT_DFORM_TEST(lfs);
 	LOAD_FLOAT_XFORM_TEST(lfsx);
 	LOAD_FLOAT_DFORM_TEST(lfsu);
 	LOAD_FLOAT_XFORM_TEST(lfsux);
-	LOAD_FLOAT_XFORM_TEST(lfiwzx);
-	LOAD_FLOAT_XFORM_TEST(lfiwax);
 	STORE_FLOAT_DFORM_TEST(stfd);
 	STORE_FLOAT_XFORM_TEST(stfdx);
-	STORE_FLOAT_DFORM_TEST(stfdp);
-	STORE_FLOAT_XFORM_TEST(stfdpx);
 	STORE_FLOAT_DFORM_TEST(stfdu);
 	STORE_FLOAT_XFORM_TEST(stfdux);
 	STORE_FLOAT_DFORM_TEST(stfs);
@@ -466,6 +484,42 @@ int test_alignment_handler_fp(void)
 	return rc;
 }
 
+int test_alignment_handler_fp_205(void)
+{
+	int rc = 0;
+
+	SKIP_IF(!can_open_fb0());
+	SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_05));
+
+	printf("Floating point: 2.05\n");
+
+	LOAD_FLOAT_DFORM_TEST(lfdp);
+	LOAD_FLOAT_XFORM_TEST(lfdpx);
+	LOAD_FLOAT_XFORM_TEST(lfiwax);
+	STORE_FLOAT_DFORM_TEST(stfdp);
+	STORE_FLOAT_XFORM_TEST(stfdpx);
+
+	if (have_hwcap(PPC_FEATURE_ARCH_2_06)) {
+		LOAD_FLOAT_XFORM_TEST(lfiwzx);
+	}
+
+	return rc;
+}
+
+int test_alignment_handler_fp_206(void)
+{
+	int rc = 0;
+
+	SKIP_IF(!can_open_fb0());
+	SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_06));
+
+	printf("Floating point: 2.06\n");
+
+	LOAD_FLOAT_XFORM_TEST(lfiwzx);
+
+	return rc;
+}
+
 void usage(char *prog)
 {
 	printf("Usage: %s [options]\n", prog);
@@ -513,9 +567,15 @@ int main(int argc, char *argv[])
 			   "test_alignment_handler_vsx_300");
 	rc |= test_harness(test_alignment_handler_integer,
 			   "test_alignment_handler_integer");
+	rc |= test_harness(test_alignment_handler_integer_206,
+			   "test_alignment_handler_integer_206");
 	rc |= test_harness(test_alignment_handler_vmx,
 			   "test_alignment_handler_vmx");
 	rc |= test_harness(test_alignment_handler_fp,
 			   "test_alignment_handler_fp");
+	rc |= test_harness(test_alignment_handler_fp_205,
+			   "test_alignment_handler_fp_205");
+	rc |= test_harness(test_alignment_handler_fp_206,
+			   "test_alignment_handler_fp_206");
 	return rc;
 }
-- 
2.14.1

* [PATCH 1/2] selftests/powerpc: Skip earlier in alignment_handler test
From: Michael Ellerman @ 2018-07-31  4:41 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, andrew.donnellan

Currently the alignment_handler test prints "Can't open /dev/fb0"
about 80 times per run, which is a little annoying.

Refactor it to check earlier whether it can open /dev/fb0 and skip if
not. This results in each test printing something like:

  test: test_alignment_handler_vsx_206
  tags: git_version:v4.18-rc3-134-gfb21a48904aa
  [SKIP] Test skipped on line 291
  skip: test_alignment_handler_vsx_206
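
The "[SKIP] Test skipped on line 291" message above comes from the skip check
that now sits at the top of each test function. Roughly, a skip macro of this
kind can be sketched as below. SKETCH_SKIP_IF, SKETCH_SKIP_RC and its value
are placeholders for illustration only, not the definitions from the
selftests' utils.h.

```c
#include <stdbool.h>
#include <stdio.h>

/* Placeholder return code that a harness could interpret as "skipped". */
#define SKETCH_SKIP_RC 99

/* Return early from the enclosing test function when 'cond' holds. */
#define SKETCH_SKIP_IF(cond)						\
	do {								\
		if (cond) {						\
			fprintf(stderr,					\
				"[SKIP] Test skipped on line %d\n",	\
				__LINE__);				\
			return SKETCH_SKIP_RC;				\
		}							\
	} while (0)

/* Hypothetical test body: probe the prerequisite once, up front. */
static int sketch_test(bool prerequisite_ok)
{
	SKETCH_SKIP_IF(!prerequisite_ok);

	/* ... the real test cases would run here ... */
	return 0;
}
```

Because the macro returns from whichever function contains it, placing it in
the top-level test function (rather than in a helper called once per
instruction) yields a single skip message per test.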

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 .../powerpc/alignment/alignment_handler.c          | 40 +++++++++++++++++++---
 1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
index 0f2698f9fd6d..0eddd16af49f 100644
--- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
+++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
@@ -40,6 +40,7 @@
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <unistd.h>
+#include <stdbool.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -191,7 +192,7 @@ int test_memcmp(void *s1, void *s2, int n, int offset, char *test_name)
  */
 int do_test(char *test_name, void (*test_func)(char *, char *))
 {
-	int offset, width, fd, rc = 0, r;
+	int offset, width, fd, rc, r;
 	void *mem0, *mem1, *ci0, *ci1;
 
 	printf("\tDoing %s:\t", test_name);
@@ -199,8 +200,8 @@ int do_test(char *test_name, void (*test_func)(char *, char *))
 	fd = open("/dev/fb0", O_RDWR);
 	if (fd < 0) {
 		printf("\n");
-		perror("Can't open /dev/fb0");
-		SKIP_IF(1);
+		perror("Can't open /dev/fb0 now?");
+		return 1;
 	}
 
 	ci0 = mmap(NULL, bufsize, PROT_WRITE, MAP_SHARED,
@@ -226,6 +227,7 @@ int do_test(char *test_name, void (*test_func)(char *, char *))
 		return rc;
 	}
 
+	rc = 0;
 	/* offset = 0 no alignment fault, so skip */
 	for (offset = 1; offset < 16; offset++) {
 		width = 16; /* vsx == 16 bytes */
@@ -244,32 +246,50 @@ int do_test(char *test_name, void (*test_func)(char *, char *))
 		r |= test_memcpy(mem1, mem0, width, offset, test_func);
 		if (r && !debug) {
 			printf("FAILED: Got signal");
+			rc = 1;
 			break;
 		}
 
 		r |= test_memcmp(mem1, ci1, width, offset, test_name);
-		rc |= r;
 		if (r && !debug) {
 			printf("FAILED: Wrong Data");
+			rc = 1;
 			break;
 		}
 	}
-	if (!r)
+
+	if (rc == 0)
 		printf("PASSED");
+
 	printf("\n");
 
 	munmap(ci0, bufsize);
 	munmap(ci1, bufsize);
 	free(mem0);
 	free(mem1);
+	close(fd);
 
 	return rc;
 }
 
+static bool can_open_fb0(void)
+{
+	int fd;
+
+	fd = open("/dev/fb0", O_RDWR);
+	if (fd < 0)
+		return false;
+
+	close(fd);
+	return true;
+}
+
 int test_alignment_handler_vsx_206(void)
 {
 	int rc = 0;
 
+	SKIP_IF(!can_open_fb0());
+
 	printf("VSX: 2.06B\n");
 	LOAD_VSX_XFORM_TEST(lxvd2x);
 	LOAD_VSX_XFORM_TEST(lxvw4x);
@@ -285,6 +305,8 @@ int test_alignment_handler_vsx_207(void)
 {
 	int rc = 0;
 
+	SKIP_IF(!can_open_fb0());
+
 	printf("VSX: 2.07B\n");
 	LOAD_VSX_XFORM_TEST(lxsspx);
 	LOAD_VSX_XFORM_TEST(lxsiwax);
@@ -298,6 +320,8 @@ int test_alignment_handler_vsx_300(void)
 {
 	int rc = 0;
 
+	SKIP_IF(!can_open_fb0());
+
 	SKIP_IF(!have_hwcap2(PPC_FEATURE2_ARCH_3_00));
 	printf("VSX: 3.00B\n");
 	LOAD_VMX_DFORM_TEST(lxsd);
@@ -328,6 +352,8 @@ int test_alignment_handler_integer(void)
 {
 	int rc = 0;
 
+	SKIP_IF(!can_open_fb0());
+
 	printf("Integer\n");
 	LOAD_DFORM_TEST(lbz);
 	LOAD_DFORM_TEST(lbzu);
@@ -383,6 +409,8 @@ int test_alignment_handler_vmx(void)
 {
 	int rc = 0;
 
+	SKIP_IF(!can_open_fb0());
+
 	printf("VMX\n");
 	LOAD_VMX_XFORM_TEST(lvx);
 
@@ -408,6 +436,8 @@ int test_alignment_handler_fp(void)
 {
 	int rc = 0;
 
+	SKIP_IF(!can_open_fb0());
+
 	printf("Floating point\n");
 	LOAD_FLOAT_DFORM_TEST(lfd);
 	LOAD_FLOAT_XFORM_TEST(lfdx);
-- 
2.14.1

* Re: [RFC PATCH kernel 0/5] powerpc/P9/vfio: Pass through NVIDIA Tesla V100
From: Alexey Kardashevskiy @ 2018-07-31  4:03 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Benjamin Herrenschmidt, linuxppc-dev, David Gibson, kvm-ppc,
	Ram Pai, kvm, Alistair Popple
In-Reply-To: <20180730102957.47fa1191@t450s.home>



On 31/07/2018 02:29, Alex Williamson wrote:
> On Mon, 30 Jul 2018 18:58:49 +1000
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>> On 11/07/2018 19:26, Alexey Kardashevskiy wrote:
>>> On Tue, 10 Jul 2018 16:37:15 -0600
>>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>>   
>>>> On Tue, 10 Jul 2018 14:10:20 +1000
>>>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>>  
>>>>> On Thu, 7 Jun 2018 23:03:23 -0600
>>>>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>>>>     
>>>>>> On Fri, 8 Jun 2018 14:14:23 +1000
>>>>>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>>>>       
>>>>>>> On 8/6/18 1:44 pm, Alex Williamson wrote:        
>>>>>>>> On Fri, 8 Jun 2018 13:08:54 +1000
>>>>>>>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>>>>>>           
>>>>>>>>> On 8/6/18 8:15 am, Alex Williamson wrote:          
>>>>>>>>>> On Fri, 08 Jun 2018 07:54:02 +1000
>>>>>>>>>> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>>>>>>>>>>             
>>>>>>>>>>> On Thu, 2018-06-07 at 11:04 -0600, Alex Williamson wrote:            
>>>>>>>>>>>>
>>>>>>>>>>>> Can we back up and discuss whether the IOMMU grouping of NVLink
>>>>>>>>>>>> connected devices makes sense?  AIUI we have a PCI view of these
>>>>>>>>>>>> devices and from that perspective they're isolated.  That's the view of
>>>>>>>>>>>> the device used to generate the grouping.  However, not visible to us,
>>>>>>>>>>>> these devices are interconnected via NVLink.  What isolation properties
>>>>>>>>>>>> does NVLink provide given that its entire purpose for existing seems to
>>>>>>>>>>>> be to provide a high performance link for p2p between devices?              
>>>>>>>>>>>
>>>>>>>>>>> Not entire. On POWER chips, we also have an nvlink between the device
>>>>>>>>>>> and the CPU which is running significantly faster than PCIe.
>>>>>>>>>>>
>>>>>>>>>>> But yes, there are cross-links and those should probably be accounted
>>>>>>>>>>> for in the grouping.            
>>>>>>>>>>
>>>>>>>>>> Then after we fix the grouping, can we just let the host driver manage
>>>>>>>>>> this coherent memory range and expose vGPUs to guests?  The use case of
>>>>>>>>>> assigning all 6 GPUs to one VM seems pretty limited.  (Might need to
>>>>>>>>>> convince NVIDIA to support more than a single vGPU per VM though)            
>>>>>>>>>
>>>>>>>>> These are physical GPUs, not virtual sriov-alike things they are
>>>>>>>>> implementing as well elsewhere.          
>>>>>>>>
>>>>>>>> vGPUs as implemented on M- and P-series Teslas aren't SR-IOV like
>>>>>>>> either.  That's why we have mdev devices now to implement software
>>>>>>>> defined devices.  I don't have first hand experience with V-series, but
>>>>>>>> I would absolutely expect a PCIe-based Tesla V100 to support vGPU.          
>>>>>>>
>>>>>>> So assuming V100 can do vGPU, you are suggesting ditching this patchset and
>>>>>>> using mediated vGPUs instead, correct?        
>>>>>>
>>>>>> If it turns out that our PCIe-only-based IOMMU grouping doesn't
>>>>>> account for lack of isolation on the NVLink side and we correct that,
>>>>>> limiting assignment to sets of 3 interconnected GPUs, is that still a
>>>>>> useful feature?  OTOH, it's entirely an NVIDIA proprietary decision
>>>>>> whether they choose to support vGPU on these GPUs or whether they can
>>>>>> be convinced to support multiple vGPUs per VM.
>>>>>>       
>>>>>>>>> My current understanding is that every P9 chip in that box has some NVLink2
>>>>>>>>> logic on it so each P9 is directly connected to 3 GPUs via PCIe and
>>>>>>>>> 2xNVLink2, and GPUs in that big group are interconnected by NVLink2 links
>>>>>>>>> as well.
>>>>>>>>>
>>>>>>>>> From small bits of information I have it seems that a GPU can perfectly
>>>>>>>>> work alone and if the NVIDIA driver does not see these interconnects
>>>>>>>>> (because we do not pass the rest of the big 3xGPU group to this guest), it
>>>>>>>>> continues with a single GPU. There is an "nvidia-smi -r" big reset hammer
>>>>>>>>> which simply refuses to work until all 3 GPUs are passed so there is some
>>>>>>>>> distinction between passing 1 or 3 GPUs, and I am trying (as we speak) to
>>>>>>>>> get a confirmation from NVIDIA that it is ok to pass just a single GPU.
>>>>>>>>>
>>>>>>>>> So we will either have 6 groups (one per GPU) or 2 groups (one per
>>>>>>>>> interconnected group).          
>>>>>>>>
>>>>>>>> I'm not gaining much confidence that we can rely on isolation between
>>>>>>>> NVLink connected GPUs, it sounds like you're simply expecting that
>>>>>>>> proprietary code from NVIDIA on a proprietary interconnect from NVIDIA
>>>>>>>> is going to play nice and nobody will figure out how to do bad things
>>>>>>>> because... obfuscation?  Thanks,          
>>>>>>>
>>>>>>> Well, we already believe that a proprietary firmware of a sriov-capable
>>>>>>> adapter like Mellanox ConnectX is not doing bad things, how is this
>>>>>>> different in principle?        
>>>>>>
>>>>>> It seems like the scope and hierarchy are different.  Here we're
>>>>>> talking about exposing big discrete devices, which are peers of one
>>>>>> another (and have history of being reverse engineered), to userspace
>>>>>> drivers.  Once handed to userspace, each of those devices needs to be
>>>>>> considered untrusted.  In the case of SR-IOV, we typically have a
>>>>>> trusted host driver for the PF managing untrusted VFs.  We do rely on
>>>>>> some sanity in the hardware/firmware in isolating the VFs from each
>>>>>> other and from the PF, but we also often have source code for Linux
>>>>>> drivers for these devices and sometimes even datasheets.  Here we have
>>>>>> neither of those and perhaps we won't know the extent of the lack of
>>>>>> isolation between these devices until nouveau (best case) or some
>>>>>> exploit (worst case) exposes it.  IOMMU grouping always assumes a lack
>>>>>> of isolation between devices unless the hardware provides some
>>>>>> indication that isolation exists, for example ACS on PCIe.  If NVIDIA
>>>>>> wants to expose isolation on NVLink, perhaps they need to document
>>>>>> enough of it that the host kernel can manipulate and test for isolation,
>>>>>> perhaps even enabling virtualization of the NVLink interconnect
>>>>>> interface such that the host can prevent GPUs from interfering with
>>>>>> each other.  Thanks,      
>>>>>
>>>>>
>>>>> So far I got this from NVIDIA:
>>>>>
>>>>> 1. An NVLink2 state can be controlled via MMIO registers; there is a
>>>>> "NVLINK ISOLATION ON MULTI-TENANT SYSTEMS" spec (my copy is
>>>>> "confidential" though) from NVIDIA with the MMIO addresses to block if
>>>>> we want to disable certain links. In order for NVLink to work it needs to
>>>>> be enabled on both sides, so by filtering certain MMIO ranges we can
>>>>> isolate a GPU.
>>>>
>>>> Where are these MMIO registers, on the bridge or on the endpoint device?  
>>>
>>> The endpoint GPU device.
>>>   
>>>> I'm wondering when you say block MMIO if these are ranges on the device
>>>> that we disallow mmap to and all the overlapping PAGE_SIZE issues that
>>>> come with that or if this should essentially be device specific
>>>> enable_acs and acs_enabled quirks, and maybe also potentially used by
>>>> Logan's disable acs series to allow GPUs to be linked and have grouping
>>>> to match.  
>>>
>>> An update, I confused P100 and V100, P100 would need filtering but
>>> ours is V100 and it has a couple of registers which we can use to
>>> disable particular links and once disabled, the link cannot be
>>> re-enabled till the next secondary bus reset.
>>>
>>>   
>>>>> 2. We can and should also prohibit the GPU firmware update, this is
>>>>> done via MMIO as well. The protocol is not open but at least register
>>>>> ranges might be in order to filter these accesses, and there is no
>>>>> plan to change this.    
>>>>
>>>> I assume this MMIO is on the endpoint and has all the PAGE_SIZE joys
>>>> along with it.  
>>>
>>> Yes, however NVIDIA says there is no performance critical stuff with
>>> this 64K page.
>>>   
>>>> Also, there are certainly use cases of updating
>>>> firmware for an assigned device, we don't want to impose a policy, but
>>>> we should figure out the right place for that policy to be specified by
>>>> the admin.  
>>>
>>> May be but NVIDIA is talking about some "out-of-band" command to the GPU
>>> to enable firmware update so firmware update is not really supported.
>>>
>>>   
>>>>> 3. DMA traffic over the NVLink2 link can be of 2 types: UT=1 for
>>>>> PCI-style DMA via our usual TCE tables (one per NVLink2 link),
>>>>> and UT=0 for direct host memory access. UT stands for "use
>>>>> translation" and this is a part of the NVLink2 protocol. Only UT=1 is
>>>>> possible over the PCIe link.
>>>>> This UT=0 traffic uses host physical addresses returned by a nest MMU (a
>>>>> piece of NVIDIA logic on a POWER9 chip). The nest MMU takes an LPID
>>>>> (guest id), an mmu context id (guest userspace mm id) and a virtual
>>>>> address, and translates them to a host physical address; that result is
>>>>> used for UT=0 DMA. This is called "ATS" although it is not PCIe ATS afaict.
>>>>> NVIDIA says that the hardware is designed in a way that it can only do
>>>>> DMA UT=0 to addresses which ATS translated to, and there is no way to
>>>>> override this behavior and this is what guarantees the isolation.    
>>>>
>>>> I'm kinda lost here, maybe we can compare it to PCIe ATS where an
>>>> endpoint requests a translation of an IOVA to physical address, the
>>>> IOMMU returns a lookup based on PCIe requester ID, and there's an
>>>> invalidation protocol to keep things coherent.  
>>>
>>> Yes there is. The current approach is to have an MMU notifier in
>>> the kernel which tells an NPU (IBM piece of logic between GPU/NVlink2
>>> and NVIDIA nest MMU) to invalidate translations and that in turn pokes
>>> the GPU till that confirms that it invalidated tlbs and there is no
>>> ongoing DMA.
>>>   
>>>> In the case above, who provides a guest id and mmu context id?   
>>>
>>> We (powerpc/powernv platform) configure the NPU to bind a specific
>>> bus:dev:fn to an LPID (== guest id), and the MMU context id comes from
>>> the guest. The nest MMU knows where the partition table is, and this
>>> table contains all the pointers needed for the translation.
>>>
>>>   
>>>> Additional software
>>>> somewhere?  Is the virtual address an IOVA or a process virtual
>>>> address?   
>>>
>>> A guest kernel or a guest userspace virtual address.
>>>   
>>>> Do we assume some sort of invalidation protocol as well?  
>>>
>>> I am little confused, is this question about the same invalidation
>>> protocol as above or different?
>>>
>>>   
>>>>> So isolation can be achieved if I do not miss something.
>>>>>
>>>>> How do we want this to be documented to proceed? I assume if I post
>>>>> patches filtering MMIOs, this won't do it, right? If just 1..3 are
>>>>> documented, will we take this t&c or we need a GPU API spec (which is
>>>>> not going to happen anyway)?    
>>>>
>>>> "t&c"? I think we need what we're actually interacting with to be well
>>>> documented, but that could be _thorough_ comments in the code, enough
>>>> to understand the theory of operation, as far as I'm concerned.  A pdf
>>>> lost on a corporate webserver isn't necessarily an improvement over
>>>> that, but there needs to be sufficient detail to understand what we're
>>>> touching such that we can maintain, adapt, and improve the code over
>>>> time.  Only item #3 above appears POWER specific, so I'd hope that #1
>>>> is done in the PCI subsystem, #2 might be a QEMU option (maybe kernel
>>>> vfio-pci, but I'm not sure that's necessary), and I don't know where #3
>>>> goes.  Thanks,  
>>>
>>> Ok, understood. Thanks!  
>>
>> After some local discussions, it was pointed out that force disabling
>> nvlinks won't bring us much: for an nvlink to work, both sides need to
>> enable it, so malicious guests cannot penetrate good ones (or a host)
>> unless a good guest enabled the link, and that won't happen with a
>> well-behaving guest. And if two guests became malicious, they can still
>> only harm each other, which they could also do in other ways such as the
>> network. This is different from PCIe: once a PCIe link is unavoidably
>> enabled, a well-behaving device cannot firewall itself from peers, as it
>> is up to the upstream bridge(s) to decide the routing; with nvlink2, a
>> GPU still has the means to protect itself, just like a guest can run
>> "firewalld" for the network.
>>
>> Although it would be a nice feature to have an extra barrier between
>> GPUs, is inability to block the links in hypervisor still a blocker for
>> V100 pass through?
> 
> How is the NVLink configured by the guest, is it 'on'/'off' or are
> specific routes configured? 

The GPU-GPU links do not need to be blocked; they need to be enabled
(== trained) by a driver in the guest. There are no routes between GPUs
in the NVLink fabric. These are direct links with a switch on each side,
and both switches need to be on for a link to work.

For the GPU-CPU links, the GPU side is the same kind of switch, and the
CPU NVLink state is controlled via the emulated PCI bridges which I pass
through together with the GPU.


> If the former, then isn't a non-malicious
> guest still susceptible to a malicious guest?

A non-malicious guest needs to turn its switch on for a link to a GPU
which belongs to a malicious guest.
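
To make the argument concrete, here is a toy model of one direct link with an
enable switch at each end. The struct and function names are invented for
illustration and do not correspond to any NVIDIA or kernel interface; the only
assumption carried over from the description above is that a link works only
when both ends have switched it on.

```c
#include <stdbool.h>

/* Hypothetical model: one direct link with an enable switch at each end. */
struct sketch_link {
	bool end_a_on;	/* switch controlled by the driver owning GPU A */
	bool end_b_on;	/* switch controlled by the driver owning GPU B */
};

/* The link only carries traffic when both ends have enabled it. */
static bool sketch_link_up(const struct sketch_link *l)
{
	return l->end_a_on && l->end_b_on;
}
```

Under this model a malicious guest flipping only its own switch leaves the
link down; the well-behaving side would have to opt in as well.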

> If the latter, how is
> routing configured by the guest given that the guest view of the
> topology doesn't match physical hardware?  Are these routes
> deconfigured by device reset?  Are they part of the save/restore
> state?  Thanks,





-- 
Alexey

* Re: [RFC 1/4] virtio: Define virtio_direct_dma_ops structure
From: Anshuman Khandual @ 2018-07-31  4:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: virtualization, linux-kernel, linuxppc-dev, aik, robh, joe,
	elfring, david, jasowang, benh, mpe, mst, linuxram, haren, paulus,
	srikar
In-Reply-To: <20180730092419.GA26245@infradead.org>

On 07/30/2018 02:54 PM, Christoph Hellwig wrote:
>> +/*
>> + * Virtio direct mapping DMA API operations structure
>> + *
>> + * This defines DMA API structure for all virtio devices which would not
>> + * either bring in their own DMA OPS from architecture or they would not
>> + * like to use architecture specific IOMMU based DMA OPS because QEMU
>> + * expects GPA instead of an IOVA in absence of VIRTIO_F_IOMMU_PLATFORM.
>> + */
>> +dma_addr_t virtio_direct_map_page(struct device *dev, struct page *page,
>> +			    unsigned long offset, size_t size,
>> +			    enum dma_data_direction dir,
>> +			    unsigned long attrs)
> 
> All these functions should probably be marked static.

Sure.

> 
>> +void virtio_direct_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>> +			size_t size, enum dma_data_direction dir,
>> +			unsigned long attrs)
>> +{
>> +}
> 
> No need to implement no-op callbacks in struct dma_map_ops.

Okay.
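
The point about no-op callbacks can be pictured with a user-space analogue:
assuming the caller checks an operation pointer before using it, leaving the
member unset behaves the same as supplying an empty function. The structure
and names below are invented for illustration and are not the kernel's
struct dma_map_ops.

```c
#include <stddef.h>

/* Hypothetical ops table; 'unmap' is optional and may be left NULL. */
struct sketch_ops {
	unsigned long (*map)(unsigned long addr);
	void (*unmap)(unsigned long handle);
};

static unsigned long sketch_direct_map(unsigned long addr)
{
	return addr;	/* direct mapping: the handle equals the address */
}

/* Caller-side wrapper: a missing callback is simply skipped. */
static void sketch_unmap(const struct sketch_ops *ops, unsigned long handle)
{
	if (ops->unmap)
		ops->unmap(handle);
}

static const struct sketch_ops sketch_direct_ops = {
	.map = sketch_direct_map,
	/* .unmap intentionally omitted, so it is NULL */
};
```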

> 
>> +
>> +int virtio_direct_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
>> +{
>> +	return 0;
>> +}
> 
> Including this one.
> 
>> +void *virtio_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>> +		gfp_t gfp, unsigned long attrs)
>> +{
>> +	void *queue = alloc_pages_exact(PAGE_ALIGN(size), gfp);
>> +
>> +	if (queue) {
>> +		phys_addr_t phys_addr = virt_to_phys(queue);
>> +		*dma_handle = (dma_addr_t)phys_addr;
>> +
>> +		if (WARN_ON_ONCE(*dma_handle != phys_addr)) {
>> +			free_pages_exact(queue, PAGE_ALIGN(size));
>> +			return NULL;
>> +		}
>> +	}
>> +	return queue;
> 
> queue is a very odd name in a generic memory allocator.

Will change it to addr.

> 
>> +void virtio_direct_free(struct device *dev, size_t size, void *vaddr,
>> +		dma_addr_t dma_addr, unsigned long attrs)
>> +{
>> +	free_pages_exact(vaddr, PAGE_ALIGN(size));
>> +}
>> +
>> +const struct dma_map_ops virtio_direct_dma_ops = {
>> +	.alloc			= virtio_direct_alloc,
>> +	.free			= virtio_direct_free,
>> +	.map_page		= virtio_direct_map_page,
>> +	.unmap_page		= virtio_direct_unmap_page,
>> +	.mapping_error		= virtio_direct_mapping_error,
>> +};
> 
> This is missing a dma_map_sg implementation.  In general this is
> mandatory for dma_ops.  So either you implement it or explain in
> a comment why you think you can skip it.

Hmm. IIUC the virtio core has never used dma_map_sg(). Am I missing
something here? The only reference to dma_map_sg() is inside a comment.

$ git grep dma_map_sg drivers/virtio/
drivers/virtio/virtio_ring.c:    * We can't use dma_map_sg, because we don't use scatterlists in

> 
>> +EXPORT_SYMBOL(virtio_direct_dma_ops);
> 
> EXPORT_SYMBOL_GPL like all virtio symbols, please.

I am planning to drop EXPORT_SYMBOL from virtio_direct_dma_ops structure.

* Re: [PATCH] powerpc: Add a checkpatch wrapper with our preferred settings
From: Russell Currey @ 2018-07-31  3:19 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev
In-Reply-To: <20180724140346.10575-1-mpe@ellerman.id.au>

On Wed, 2018-07-25 at 00:03 +1000, Michael Ellerman wrote:
> This makes it easy to run checkpatch with settings that we have
> agreed
> on (bwhahahahah).
> 
> Usage is eg:
> 
>   $ ./arch/powerpc/tools/checkpatch.sh -g origin/master..
> 
> To check all commits since origin/master.
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Reviewed-by: Russell Currey <ruscur@russell.cc>

* Re: [PATCH] powerpc/mobility: Fix node detach/rename problem
From: Tyrel Datwyler @ 2018-07-31  1:46 UTC (permalink / raw)
  To: Michael Bringmann, linuxppc-dev
  Cc: Nathan Fontenot, Thomas Falcon, John Allen
In-Reply-To: <9cd25a93-6c71-728c-e9bf-a16f80ef5655@linux.vnet.ibm.com>

On 07/30/2018 05:59 PM, Tyrel Datwyler wrote:
> On 07/29/2018 06:11 AM, Michael Bringmann wrote:
>> During LPAR migration, the content of the device tree/sysfs may
>> be updated including deletion and replacement of nodes in the
>> tree.  When nodes are added to the internal node structures, they
>> are appended in FIFO order to a list of nodes maintained by the
>> OF code APIs.  When nodes are removed from the device tree, they
>> are marked OF_DETACHED, but not actually deleted from the system
>> to allow for pointers cached elsewhere in the kernel.  The order
>> and content of the entries in the list of nodes is not altered,
>> though.
>>
>> During LPAR migration some common nodes are deleted and re-added
>> e.g. "ibm,platform-facilities".  If a node is re-added to the OF
>> node lists, the of_attach_node function checks to make sure that
>> the name + ibm,phandle of the to-be-added data is unique.  As the
>> previous copy of a re-added node is not modified beyond the addition
>> of a bit flag, the code (1) finds the old copy, (2) prints a WARNING
>> notice to the console, (3) renames the to-be-added node to avoid
>> filename collisions within a directory, and (4) adds entries to
>> the sysfs/kernfs.
> 
> So, this patch actually just band-aids over the real problem. This is a long-standing problem with several PFO drivers leaking references. The issue here is that, during the device tree update that follows a migration, the ibm,platform-facilities node and friends below it are always deleted and re-added on the destination LPAR, and the leaked references then prevent the device nodes from ever actually being properly cleaned up after detach. This leads to the issue you are observing.
> 
> As an example, a quick look at nx-842-pseries.c reveals several issues.
> 
> static int __init nx842_pseries_init(void)
> {
>         struct nx842_devdata *new_devdata;
>         int ret;
> 
>         if (!of_find_compatible_node(NULL, NULL, "ibm,compression"))
>                 return -ENODEV;
> 
> This call to of_find_compatible_node() results in a node returned with refcount incremented and therefore immediately leaked.
> 
> Further, the reconfig notifier logic makes the assumption that it only needs to deal with node updates, but as I outlined above the post migration device tree update always results in PFO nodes and properties being deleted and re-added.
> 
> /**
>  * nx842_OF_notifier - Process updates to OF properties for the device
>  *
>  * @np: notifier block
>  * @action: notifier action
>  * @update: struct pSeries_reconfig_prop_update pointer if action is
>  *      PSERIES_UPDATE_PROPERTY
>  *
>  * Returns:
>  *      NOTIFY_OK on success
>  *      NOTIFY_BAD encoded with error number on failure, use
>  *              notifier_to_errno() to decode this value
>  */
> static int nx842_OF_notifier(struct notifier_block *np, unsigned long action,
>                              void *data)
> {
>         struct of_reconfig_data *upd = data;
>         struct nx842_devdata *local_devdata;
>         struct device_node *node = NULL;
> 
>         rcu_read_lock();
>         local_devdata = rcu_dereference(devdata);
>         if (local_devdata)
>                 node = local_devdata->dev->of_node;
> 
>         if (local_devdata &&
>                         action == OF_RECONFIG_UPDATE_PROPERTY &&
>                         !strcmp(upd->dn->name, node->name)) {
>                 rcu_read_unlock();
>                 nx842_OF_upd(upd->prop);
>         } else
>                 rcu_read_unlock();
> 
>         return NOTIFY_OK;
> }
> 
> I expect to find the same problems in pseries-rng.c and nx.c.

So, in actuality the main root of the problem is really in the vio core. A node reference for each PFO device under "ibm,platform-facilities" is taken during vio_register_device_node(). We need a reconfig notifier to call vio_unregister_device() for each PFO device on detach, and vio_bus_scan_register_devices("ibm,platform-facilities") on attach. This will make sure the PFO vio devices are released such that vio_dev_release() gets called to put the node reference that was taken at original registration time.

/* vio_dev refcount hit 0 */
static void vio_dev_release(struct device *dev)
{
        struct iommu_table *tbl = get_iommu_table_base(dev);

        if (tbl)
                iommu_tce_table_put(tbl);
        of_node_put(dev->of_node);
        kfree(to_vio_dev(dev));
}
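
The get/put pairing described above can be illustrated with a small user-space
model. All names here are hypothetical stand-ins (this is not the vio or OF
code); the sketch only shows that a reference taken at registration time is
dropped solely on the release path, so a device that is never unregistered
keeps its node pinned.

```c
#include <stddef.h>

/* Hypothetical reference-counted node, standing in for a device node. */
struct sketch_node {
	int refcount;
};

static struct sketch_node *sketch_node_get(struct sketch_node *n)
{
	n->refcount++;
	return n;
}

static void sketch_node_put(struct sketch_node *n)
{
	n->refcount--;
}

/* Hypothetical device that pins its node for as long as it exists. */
struct sketch_dev {
	struct sketch_node *node;
};

/* Registration takes a reference on the node... */
static void sketch_dev_register(struct sketch_dev *d, struct sketch_node *n)
{
	d->node = sketch_node_get(n);
}

/* ...and only the release path drops it again. */
static void sketch_dev_release(struct sketch_dev *d)
{
	sketch_node_put(d->node);
	d->node = NULL;
}
```

If sketch_dev_release() is never reached for a detached node, its count never
returns to the pre-registration value, which is the kind of leak described
above.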

-Tyrel

> 
> -Tyrel
> 
>>
>> This patch fixes the 'migration add' problem by changing the
>> stored 'phandle' of the OF_DETACHed node to 0 (reserved value for
>> of_find_node_by_phandle), so that subsequent re-add operations,
>> such as those during migration, do not find the detached node,
>> do not observe duplicate names, do not rename them,  and the
>> extra WARNING notices are removed from the console output.
>>
>> In addition, it erases the 'name' field of the OF_DETACHED node,
>> to prevent any future calls to of_find_node_by_name() or
>> of_find_node_by_path() from matching this node.
>>
>> Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/platforms/pseries/dlpar.c |    3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
>> index 2de0f0d..9d82c28 100644
>> --- a/arch/powerpc/platforms/pseries/dlpar.c
>> +++ b/arch/powerpc/platforms/pseries/dlpar.c
>> @@ -274,6 +274,9 @@ int dlpar_detach_node(struct device_node *dn)
>>  	if (rc)
>>  		return rc;
>>
>> +	dn->phandle = 0;
>> +	memset(dn->name, 0, strlen(dn->name));
>> +
>>  	return 0;
>>  }
>>
>>
> 


* [PATCH] powerpc/4xx: Fix error return path in ppc4xx_msi_probe()
From: Guenter Roeck @ 2018-07-31  1:44 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, linux-kernel, Robin Murphy, Christoph Hellwig,
	Guenter Roeck

An arbitrary error in ppc4xx_msi_probe() quite likely results in a
crash similar to the following, seen after dma_alloc_coherent()
returned an error.

Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc001bff0
Oops: Kernel access of bad area, sig: 11 [#1]
BE Canyonlands
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Tainted: G        W
4.18.0-rc6-00010-gff33d1030a6c #1
NIP:  c001bff0 LR: c001c418 CTR: c01faa7c
REGS: cf82db40 TRAP: 0300   Tainted: G        W
(4.18.0-rc6-00010-gff33d1030a6c)
MSR:  00029000 <CE,EE,ME>  CR: 28002024  XER: 00000000
DEAR: 00000000 ESR: 00000000
GPR00: c001c418 cf82dbf0 cf828000 cf8de400 00000000 00000000 000000c4 000000c4
GPR08: c0481ea4 00000000 00000000 000000c4 22002024 00000000 c00025e8 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c0492380 0000004a
GPR24: 00029000 0000000c 00000000 cf8de410 c0494d60 c0494d60 cf8bebc0 00000001
NIP [c001bff0] ppc4xx_of_msi_remove+0x48/0xa0
LR [c001c418] ppc4xx_msi_probe+0x294/0x3b8
Call Trace:
[cf82dbf0] [00029000] 0x29000 (unreliable)
[cf82dc10] [c001c418] ppc4xx_msi_probe+0x294/0x3b8
[cf82dc70] [c0209fbc] platform_drv_probe+0x40/0x9c
[cf82dc90] [c0208240] driver_probe_device+0x2a8/0x350
[cf82dcc0] [c0206204] bus_for_each_drv+0x60/0xac
[cf82dcf0] [c0207e88] __device_attach+0xe8/0x160
[cf82dd20] [c02071e0] bus_probe_device+0xa0/0xbc
[cf82dd40] [c02050c8] device_add+0x404/0x5c4
[cf82dd90] [c0288978] of_platform_device_create_pdata+0x88/0xd8
[cf82ddb0] [c0288b70] of_platform_bus_create+0x134/0x220
[cf82de10] [c0288bcc] of_platform_bus_create+0x190/0x220
[cf82de70] [c0288cf4] of_platform_bus_probe+0x98/0xec
[cf82de90] [c0449650] __machine_initcall_canyonlands_ppc460ex_device_probe+0x38/0x54
[cf82dea0] [c0002404] do_one_initcall+0x40/0x188
[cf82df00] [c043daec] kernel_init_freeable+0x130/0x1d0
[cf82df30] [c0002600] kernel_init+0x18/0x104
[cf82df40] [c000c23c] ret_from_kernel_thread+0x14/0x1c
Instruction dump:
90010024 813d0024 2f890000 83c30058 41bd0014 48000038 813d0024 7f89f800
409d002c 813e000c 57ea103a 3bff0001 <7c69502e> 2f830000 419effe0 4803b26d
---[ end trace 8cf551077ecfc42a ]---

Fix it up. Specifically,

- Return valid error codes from ppc4xx_setup_pcieh_hw(), have it clean
  up after itself, and only access hardware after all possible error
  conditions have been handled.
- Use devm_kzalloc() instead of kzalloc() in ppc4xx_msi_probe()

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 arch/powerpc/platforms/4xx/msi.c | 51 +++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/platforms/4xx/msi.c b/arch/powerpc/platforms/4xx/msi.c
index 81b2cbce7df8..7c324eff2f22 100644
--- a/arch/powerpc/platforms/4xx/msi.c
+++ b/arch/powerpc/platforms/4xx/msi.c
@@ -146,13 +146,19 @@ static int ppc4xx_setup_pcieh_hw(struct platform_device *dev,
 	const u32 *sdr_addr;
 	dma_addr_t msi_phys;
 	void *msi_virt;
+	int err;
 
 	sdr_addr = of_get_property(dev->dev.of_node, "sdr-base", NULL);
 	if (!sdr_addr)
-		return -1;
+		return -EINVAL;
 
-	mtdcri(SDR0, *sdr_addr, upper_32_bits(res.start));	/*HIGH addr */
-	mtdcri(SDR0, *sdr_addr + 1, lower_32_bits(res.start));	/* Low addr */
+	msi_data = of_get_property(dev->dev.of_node, "msi-data", NULL);
+	if (!msi_data)
+		return -EINVAL;
+
+	msi_mask = of_get_property(dev->dev.of_node, "msi-mask", NULL);
+	if (!msi_mask)
+		return -EINVAL;
 
 	msi->msi_dev = of_find_node_by_name(NULL, "ppc4xx-msi");
 	if (!msi->msi_dev)
@@ -160,30 +166,30 @@ static int ppc4xx_setup_pcieh_hw(struct platform_device *dev,
 
 	msi->msi_regs = of_iomap(msi->msi_dev, 0);
 	if (!msi->msi_regs) {
-		dev_err(&dev->dev, "of_iomap problem failed\n");
-		return -ENOMEM;
+		dev_err(&dev->dev, "of_iomap failed\n");
+		err = -ENOMEM;
+		goto node_put;
 	}
 	dev_dbg(&dev->dev, "PCIE-MSI: msi register mapped 0x%x 0x%x\n",
 		(u32) (msi->msi_regs + PEIH_TERMADH), (u32) (msi->msi_regs));
 
 	msi_virt = dma_alloc_coherent(&dev->dev, 64, &msi_phys, GFP_KERNEL);
-	if (!msi_virt)
-		return -ENOMEM;
+	if (!msi_virt) {
+		err = -ENOMEM;
+		goto iounmap;
+	}
 	msi->msi_addr_hi = upper_32_bits(msi_phys);
 	msi->msi_addr_lo = lower_32_bits(msi_phys & 0xffffffff);
 	dev_dbg(&dev->dev, "PCIE-MSI: msi address high 0x%x, low 0x%x\n",
 		msi->msi_addr_hi, msi->msi_addr_lo);
 
+	mtdcri(SDR0, *sdr_addr, upper_32_bits(res.start));	/*HIGH addr */
+	mtdcri(SDR0, *sdr_addr + 1, lower_32_bits(res.start));	/* Low addr */
+
 	/* Progam the Interrupt handler Termination addr registers */
 	out_be32(msi->msi_regs + PEIH_TERMADH, msi->msi_addr_hi);
 	out_be32(msi->msi_regs + PEIH_TERMADL, msi->msi_addr_lo);
 
-	msi_data = of_get_property(dev->dev.of_node, "msi-data", NULL);
-	if (!msi_data)
-		return -1;
-	msi_mask = of_get_property(dev->dev.of_node, "msi-mask", NULL);
-	if (!msi_mask)
-		return -1;
 	/* Program MSI Expected data and Mask bits */
 	out_be32(msi->msi_regs + PEIH_MSIED, *msi_data);
 	out_be32(msi->msi_regs + PEIH_MSIMK, *msi_mask);
@@ -191,6 +197,12 @@ static int ppc4xx_setup_pcieh_hw(struct platform_device *dev,
 	dma_free_coherent(&dev->dev, 64, msi_virt, msi_phys);
 
 	return 0;
+
+iounmap:
+	iounmap(msi->msi_regs);
+node_put:
+	of_node_put(msi->msi_dev);
+	return err;
 }
 
 static int ppc4xx_of_msi_remove(struct platform_device *dev)
@@ -209,7 +221,6 @@ static int ppc4xx_of_msi_remove(struct platform_device *dev)
 		msi_bitmap_free(&msi->bitmap);
 	iounmap(msi->msi_regs);
 	of_node_put(msi->msi_dev);
-	kfree(msi);
 
 	return 0;
 }
@@ -223,18 +234,16 @@ static int ppc4xx_msi_probe(struct platform_device *dev)
 
 	dev_dbg(&dev->dev, "PCIE-MSI: Setting up MSI support...\n");
 
-	msi = kzalloc(sizeof(*msi), GFP_KERNEL);
-	if (!msi) {
-		dev_err(&dev->dev, "No memory for MSI structure\n");
+	msi = devm_kzalloc(&dev->dev, sizeof(*msi), GFP_KERNEL);
+	if (!msi)
 		return -ENOMEM;
-	}
 	dev->dev.platform_data = msi;
 
 	/* Get MSI ranges */
 	err = of_address_to_resource(dev->dev.of_node, 0, &res);
 	if (err) {
 		dev_err(&dev->dev, "%pOF resource error!\n", dev->dev.of_node);
-		goto error_out;
+		return err;
 	}
 
 	msi_irqs = of_irq_count(dev->dev.of_node);
@@ -243,7 +252,7 @@ static int ppc4xx_msi_probe(struct platform_device *dev)
 
 	err = ppc4xx_setup_pcieh_hw(dev, res, msi);
 	if (err)
-		goto error_out;
+		return err;
 
 	err = ppc4xx_msi_init_allocator(dev, msi);
 	if (err) {
@@ -256,7 +265,7 @@ static int ppc4xx_msi_probe(struct platform_device *dev)
 		phb->controller_ops.setup_msi_irqs = ppc4xx_setup_msi_irqs;
 		phb->controller_ops.teardown_msi_irqs = ppc4xx_teardown_msi_irqs;
 	}
-	return err;
+	return 0;
 
 error_out:
 	ppc4xx_of_msi_remove(dev);
-- 
2.7.4


* Re: [PATCH 0/6] rapidio: move Kconfig menu definition to subsystem
From: Randy Dunlap @ 2018-07-31  1:08 UTC (permalink / raw)
  To: Alexei Colin, Alexandre Bounine, Andrew Morton
  Cc: John Paul Walters, Catalin Marinas, Russell King, Arnd Bergmann,
	Will Deacon, Ralf Baechle, Paul Burton, Alexander Sverdlin,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Peter Anvin, Matt Porter, x86, linuxppc-dev, linux-mips,
	linux-arm-kernel, linux-kernel
In-Reply-To: <20180730225035.28365-1-acolin@isi.edu>

On 07/30/2018 03:50 PM, Alexei Colin wrote:
> The top-level Kconfig entry for RapidIO subsystem is currently
> duplicated in several architecture-specific Kconfig files. This set of
> patches does two things:
> 
> 1. Move the Kconfig menu definition into the RapidIO subsystem and
> remove the duplicate definitions from arch Kconfig files.
> 
> 2. Enable RapidIO Kconfig menu entry for arm and arm64 architectures,
> where it was not enabled before. I tested that subsystem and drivers
> build successfully for both architectures, and tested that the modules
> load on a custom arm64 Qemu model.
> 
> For all architectures, RapidIO menu should be offered when either:
> (1) The platform has a PCI bus (which host a RapidIO module on the bus).
> (2) The platform has a RapidIO IP block (connected to a system bus, e.g.
> AXI on ARM). In this case, 'select HAS_RAPIDIO' should be added to the
> 'config ARCH_*' menu entry for the SoCs that offer the IP block.
> 
> Prior to this patchset, different architectures used different criteria:
> * powerpc: (1) and (2)
> * mips: (1) and (2) after recent commit into next that added (2):
>   https://www.linux-mips.org/archives/linux-mips/2018-07/msg00596.html
>   fc5d988878942e9b42a4de5204bdd452f3f1ce47
>   491ec1553e0075f345fbe476a93775eabcbc40b6
> * x86: (1)
> * arm,arm64: none (RapidIO menus never offered)
> 
> Responses to feedback from prior submission (thanks for the reviews!):
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593347.html
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593349.html
> 
> Changelog:
>   * Moved Kconfig entry into RapidIO subsystem instead of duplicating
> 
> In the current patchset, I took the approach of adding '|| PCI' to the
> depends in the subsystem. I did try the alternative approach mentioned
> in the reviews for v1 of this patch, where the subsystem Kconfig does
> not add a '|| PCI' and each per-architecture Kconfig has to add a
> 'select HAS_RAPIDIO if PCI' and SoCs with IP blocks have to also add
> 'select HAS_RAPIDIO'. This works too but requires each architecture's
> Kconfig to add the line for RapidIO (whereas current approach does not
> require that involvement) and also may create a false impression that
> the dependency on PCI is strict.
> 
> We appreciate the suggestion for also selecting the RapidIO subsystem for
> compilation with COMPILE_TEST, but hope to address it in a separate
> patchset, localized to the subsystem, since it will need to change
> depends on all drivers, not just on the top level, and since this
> patch now spans multiple architectures.
> 
> 
> Alexei Colin (6):
>   rapidio: define top Kconfig menu in driver subtree
>   x86: factor out RapidIO Kconfig menu
>   powerpc: factor out RapidIO Kconfig menu entry
>   mips: factor out RapidIO Kconfig entry
>   arm: enable RapidIO menu in Kconfig
>   arm64: enable RapidIO menu in Kconfig
> 
>  arch/arm/Kconfig        |  2 ++
>  arch/arm64/Kconfig      |  2 ++
>  arch/mips/Kconfig       | 11 -----------
>  arch/powerpc/Kconfig    | 13 +------------
>  arch/x86/Kconfig        |  8 --------
>  drivers/rapidio/Kconfig | 15 +++++++++++++++
>  6 files changed, 20 insertions(+), 31 deletions(-)
> 

LGTM.

Acked-by: Randy Dunlap <rdunlap@infradead.org> # for the series

thanks,
-- 
~Randy


* Re: [PATCH] powerpc/mobility: Fix node detach/rename problem
From: Tyrel Datwyler @ 2018-07-31  0:59 UTC (permalink / raw)
  To: Michael Bringmann, linuxppc-dev
  Cc: Nathan Fontenot, Thomas Falcon, John Allen, mpe
In-Reply-To: <c2fba52a-baac-25fb-c26b-c84b25c3178c@linux.vnet.ibm.com>

On 07/29/2018 06:11 AM, Michael Bringmann wrote:
> During LPAR migration, the content of the device tree/sysfs may
> be updated including deletion and replacement of nodes in the
> tree.  When nodes are added to the internal node structures, they
> are appended in FIFO order to a list of nodes maintained by the
> OF code APIs.  When nodes are removed from the device tree, they
> are marked OF_DETACHED, but not actually deleted from the system
> to allow for pointers cached elsewhere in the kernel.  The order
> and content of the entries in the list of nodes is not altered,
> though.
> 
> During LPAR migration some common nodes are deleted and re-added
> e.g. "ibm,platform-facilities".  If a node is re-added to the OF
> node lists, the of_attach_node function checks to make sure that
> the name + ibm,phandle of the to-be-added data is unique.  As the
> previous copy of a re-added node is not modified beyond the addition
> of a bit flag, the code (1) finds the old copy, (2) prints a WARNING
> notice to the console, (3) renames the to-be-added node to avoid
> filename collisions within a directory, and (4) adds entries to
> the sysfs/kernfs.

So, this patch actually just band-aids over the real problem. This is a long-standing problem with several PFO drivers leaking references. The issue here is that, during the device tree update that follows a migration, the ibm,platform-facilities node and its children are always deleted and re-added on the destination LPAR, and the leaked references prevent the device nodes from ever being properly cleaned up after detach. This leads to the issue you are observing.

As an example, a quick look at nx-842-pseries.c reveals several issues.

static int __init nx842_pseries_init(void)
{
        struct nx842_devdata *new_devdata;
        int ret;

        if (!of_find_compatible_node(NULL, NULL, "ibm,compression"))
                return -ENODEV;

This call to of_find_compatible_node() results in a node returned with refcount incremented and therefore immediately leaked.
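
A minimal userspace sketch of that leak and one way to avoid it. The toy_* names are hypothetical stand-ins, not kernel APIs, and a plain integer replaces the real refcount. In the driver itself, the equivalent change would presumably be to call of_node_put() on the node returned by of_find_compatible_node() once the existence check is done.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct device_node; only what the example needs. */
struct toy_node {
	const char *compatible;
	int refcount;
};

/* One-entry "device tree"; the tree itself holds the initial reference. */
static struct toy_node toy_tree[] = {
	{ "ibm,compression", 1 },
};

/* Like of_find_compatible_node(): a match comes back with refcount raised. */
static struct toy_node *toy_find_compatible(const char *compat)
{
	size_t i;

	for (i = 0; i < sizeof(toy_tree) / sizeof(toy_tree[0]); i++) {
		if (!strcmp(toy_tree[i].compatible, compat)) {
			toy_tree[i].refcount++;
			return &toy_tree[i];
		}
	}
	return NULL;
}

/* Like of_node_put(): drop one reference; NULL is ignored. */
static void toy_node_put(struct toy_node *dn)
{
	if (dn)
		dn->refcount--;
}

/* Existence check in the style quoted above: the reference is never dropped. */
static int toy_init_leaky(void)
{
	if (!toy_find_compatible("ibm,compression"))
		return -ENODEV;
	return 0;
}

/* Same check, but the reference is released once the answer is known. */
static int toy_init_fixed(void)
{
	struct toy_node *dn = toy_find_compatible("ibm,compression");

	if (!dn)
		return -ENODEV;
	toy_node_put(dn);
	return 0;
}
```

After toy_init_leaky() the node's count stays one higher than before the call, so it can never drop to zero; toy_init_fixed() leaves the count unchanged.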

Further, the reconfig notifier logic makes the assumption that it only needs to deal with node updates, but as I outlined above the post migration device tree update always results in PFO nodes and properties being deleted and re-added.

/**
 * nx842_OF_notifier - Process updates to OF properties for the device
 *
 * @np: notifier block
 * @action: notifier action
 * @update: struct pSeries_reconfig_prop_update pointer if action is
 *      PSERIES_UPDATE_PROPERTY
 *
 * Returns:
 *      NOTIFY_OK on success
 *      NOTIFY_BAD encoded with error number on failure, use
 *              notifier_to_errno() to decode this value
 */
static int nx842_OF_notifier(struct notifier_block *np, unsigned long action,
                             void *data)
{
        struct of_reconfig_data *upd = data;
        struct nx842_devdata *local_devdata;
        struct device_node *node = NULL;

        rcu_read_lock();
        local_devdata = rcu_dereference(devdata);
        if (local_devdata)
                node = local_devdata->dev->of_node;

        if (local_devdata &&
                        action == OF_RECONFIG_UPDATE_PROPERTY &&
                        !strcmp(upd->dn->name, node->name)) {
                rcu_read_unlock();
                nx842_OF_upd(upd->prop);
        } else
                rcu_read_unlock();

        return NOTIFY_OK;
}

I expect to find the same problems in pseries-rng.c and nx.c.

-Tyrel

> 
> This patch fixes the 'migration add' problem by changing the
> stored 'phandle' of the OF_DETACHed node to 0 (reserved value for
> of_find_node_by_phandle), so that subsequent re-add operations,
> such as those during migration, do not find the detached node,
> do not observe duplicate names, do not rename them,  and the
> extra WARNING notices are removed from the console output.
> 
> In addition, it erases the 'name' field of the OF_DETACHED node,
> to prevent any future calls to of_find_node_by_name() or
> of_find_node_by_path() from matching this node.
> 
> Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/dlpar.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
> index 2de0f0d..9d82c28 100644
> --- a/arch/powerpc/platforms/pseries/dlpar.c
> +++ b/arch/powerpc/platforms/pseries/dlpar.c
> @@ -274,6 +274,9 @@ int dlpar_detach_node(struct device_node *dn)
>  	if (rc)
>  		return rc;
> 
> +	dn->phandle = 0;
> +	memset(dn->name, 0, strlen(dn->name));
> +
>  	return 0;
>  }
> 
> 

