linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
@ 2016-01-10  0:54 Hugh Dickins
  2016-01-10  0:59 ` [PATCH next] mm: make swapoff more robust against soft dirty Hugh Dickins
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Hugh Dickins @ 2016-01-10  0:54 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: Andrew Morton, Michael Ellerman, Aneesh Kumar K.V,
	Cyrill Gorcunov, Martin Schwidefsky, linuxppc-dev, linux-mm

Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
_PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
recognized.

(I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
attempt to solve this problem.)

It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and
and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
should be made more robust; but nevertheless, fix up the powerpc ifdefs
as x86_64 and s390 (which met the same problem) have them, defining the
bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.

Signed-off-by: Hugh Dickins <hughd@google.com>
---

 arch/powerpc/include/asm/book3s/64/hash.h    |    5 +++++
 arch/powerpc/include/asm/book3s/64/pgtable.h |    9 ++++++---
 2 files changed, 11 insertions(+), 3 deletions(-)

--- 4.4-next/arch/powerpc/include/asm/book3s/64/hash.h	2016-01-06 11:54:01.377508976 -0800
+++ linux/arch/powerpc/include/asm/book3s/64/hash.h	2016-01-09 13:54:24.410893347 -0800
@@ -33,7 +33,12 @@
 #define _PAGE_F_GIX_SHIFT	12
 #define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
 #define _PAGE_SPECIAL		0x10000 /* software: special page */
+
+#ifdef CONFIG_MEM_SOFT_DIRTY
 #define _PAGE_SOFT_DIRTY	0x20000 /* software: software dirty tracking */
+#else
+#define _PAGE_SOFT_DIRTY	0x00000
+#endif
 
 /*
  * We need to differentiate between explicit huge page and THP huge
--- 4.4-next/arch/powerpc/include/asm/book3s/64/pgtable.h	2016-01-06 11:54:01.377508976 -0800
+++ linux/arch/powerpc/include/asm/book3s/64/pgtable.h	2016-01-09 13:54:24.410893347 -0800
@@ -162,8 +162,13 @@ static inline void pgd_set(pgd_t *pgdp,
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val((pte)) })
 #define __swp_entry_to_pte(x)		__pte((x).val)
 
-#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
+#ifdef CONFIG_MEM_SOFT_DIRTY
 #define _PAGE_SWP_SOFT_DIRTY   (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
+#else
+#define _PAGE_SWP_SOFT_DIRTY	0UL
+#endif /* CONFIG_MEM_SOFT_DIRTY */
+
+#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
 {
 	return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY);
@@ -176,8 +181,6 @@ static inline pte_t pte_swp_clear_soft_d
 {
 	return __pte(pte_val(pte) & ~_PAGE_SWP_SOFT_DIRTY);
 }
-#else
-#define _PAGE_SWP_SOFT_DIRTY	0
 #endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
 
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH next] mm: make swapoff more robust against soft dirty
  2016-01-10  0:54 [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff Hugh Dickins
@ 2016-01-10  0:59 ` Hugh Dickins
  2016-01-10 14:09   ` Cyrill Gorcunov
  2016-01-11  5:39   ` Aneesh Kumar K.V
  2016-01-10 14:07 ` [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff Cyrill Gorcunov
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 11+ messages in thread
From: Hugh Dickins @ 2016-01-10  0:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Cyrill Gorcunov, Laurent Dufour, Michael Ellerman,
	Aneesh Kumar K.V, Martin Schwidefsky, linuxppc-dev, linux-mm

Both s390 and powerpc have hit the issue of swapoff hanging, when
CONFIG_HAVE_ARCH_SOFT_DIRTY and CONFIG_MEM_SOFT_DIRTY ifdefs were
not quite as x86_64 had them.  I think it would be much clearer if
HAVE_ARCH_SOFT_DIRTY was just a Kconfig option set by architectures
to determine whether the MEM_SOFT_DIRTY option should be offered,
and the actual code depend upon CONFIG_MEM_SOFT_DIRTY alone.

But won't embark on that change myself: instead make swapoff more
robust, by using pte_swp_clear_soft_dirty() on each pte it encounters,
without an explicit #ifdef CONFIG_MEM_SOFT_DIRTY.  That being a no-op,
whether the bit in question is defined as 0 or the asm-generic fallback
is used, unless soft dirty is fully turned on.

Why "maybe" in maybe_same_pte()?  Rename it pte_same_as_swp().

Signed-off-by: Hugh Dickins <hughd@google.com>
---

 mm/swapfile.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

--- 4.4-next/mm/swapfile.c	2016-01-06 11:54:46.327006983 -0800
+++ linux/mm/swapfile.c	2016-01-09 13:39:19.632872694 -0800
@@ -1109,19 +1109,9 @@ unsigned int count_swap_pages(int type,
 }
 #endif /* CONFIG_HIBERNATION */
 
-static inline int maybe_same_pte(pte_t pte, pte_t swp_pte)
+static inline int pte_same_as_swp(pte_t pte, pte_t swp_pte)
 {
-#ifdef CONFIG_MEM_SOFT_DIRTY
-	/*
-	 * When pte keeps soft dirty bit the pte generated
-	 * from swap entry does not has it, still it's same
-	 * pte from logical point of view.
-	 */
-	pte_t swp_pte_dirty = pte_swp_mksoft_dirty(swp_pte);
-	return pte_same(pte, swp_pte) || pte_same(pte, swp_pte_dirty);
-#else
-	return pte_same(pte, swp_pte);
-#endif
+	return pte_same(pte_swp_clear_soft_dirty(pte), swp_pte);
 }
 
 /*
@@ -1150,7 +1140,7 @@ static int unuse_pte(struct vm_area_stru
 	}
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
-	if (unlikely(!maybe_same_pte(*pte, swp_entry_to_pte(entry)))) {
+	if (unlikely(!pte_same_as_swp(*pte, swp_entry_to_pte(entry)))) {
 		mem_cgroup_cancel_charge(page, memcg, false);
 		ret = 0;
 		goto out;
@@ -1208,7 +1198,7 @@ static int unuse_pte_range(struct vm_are
 		 * swapoff spends a _lot_ of time in this loop!
 		 * Test inline before going to call unuse_pte.
 		 */
-		if (unlikely(maybe_same_pte(*pte, swp_pte))) {
+		if (unlikely(pte_same_as_swp(*pte, swp_pte))) {
 			pte_unmap(pte);
 			ret = unuse_pte(vma, pmd, addr, entry, page);
 			if (ret)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  2016-01-10  0:54 [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff Hugh Dickins
  2016-01-10  0:59 ` [PATCH next] mm: make swapoff more robust against soft dirty Hugh Dickins
@ 2016-01-10 14:07 ` Cyrill Gorcunov
  2016-01-11  5:43 ` Aneesh Kumar K.V
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Cyrill Gorcunov @ 2016-01-10 14:07 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Laurent Dufour, Andrew Morton, Michael Ellerman, Aneesh Kumar K.V,
	Martin Schwidefsky, linuxppc-dev, linux-mm

On Sat, Jan 09, 2016 at 04:54:59PM -0800, Hugh Dickins wrote:
> Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
> but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
> _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
> discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
> recognized.
> 
> (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
> CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
> attempt to solve this problem.)
> 
> It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and
> and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
> should be made more robust; but nevertheless, fix up the powerpc ifdefs
> as x86_64 and s390 (which met the same problem) have them, defining the
> bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>

Thank you, Hugh!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH next] mm: make swapoff more robust against soft dirty
  2016-01-10  0:59 ` [PATCH next] mm: make swapoff more robust against soft dirty Hugh Dickins
@ 2016-01-10 14:09   ` Cyrill Gorcunov
  2016-01-11  5:39   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 11+ messages in thread
From: Cyrill Gorcunov @ 2016-01-10 14:09 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Laurent Dufour, Michael Ellerman, Aneesh Kumar K.V,
	Martin Schwidefsky, linuxppc-dev, linux-mm

On Sat, Jan 09, 2016 at 04:59:42PM -0800, Hugh Dickins wrote:
> Both s390 and powerpc have hit the issue of swapoff hanging, when
> CONFIG_HAVE_ARCH_SOFT_DIRTY and CONFIG_MEM_SOFT_DIRTY ifdefs were
> not quite as x86_64 had them.  I think it would be much clearer if
> HAVE_ARCH_SOFT_DIRTY was just a Kconfig option set by architectures
> to determine whether the MEM_SOFT_DIRTY option should be offered,
> and the actual code depend upon CONFIG_MEM_SOFT_DIRTY alone.
> 
> But won't embark on that change myself: instead make swapoff more
> robust, by using pte_swp_clear_soft_dirty() on each pte it encounters,
> without an explicit #ifdef CONFIG_MEM_SOFT_DIRTY.  That being a no-op,
> whether the bit in question is defined as 0 or the asm-generic fallback
> is used, unless soft dirty is fully turned on.
> 
> Why "maybe" in maybe_same_pte()?  Rename it pte_same_as_swp().
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>

Thanks a lot, Hugh!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH next] mm: make swapoff more robust against soft dirty
  2016-01-10  0:59 ` [PATCH next] mm: make swapoff more robust against soft dirty Hugh Dickins
  2016-01-10 14:09   ` Cyrill Gorcunov
@ 2016-01-11  5:39   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 11+ messages in thread
From: Aneesh Kumar K.V @ 2016-01-11  5:39 UTC (permalink / raw)
  To: Hugh Dickins, Andrew Morton
  Cc: Cyrill Gorcunov, Laurent Dufour, Michael Ellerman,
	Martin Schwidefsky, linuxppc-dev, linux-mm

Hugh Dickins <hughd@google.com> writes:

> Both s390 and powerpc have hit the issue of swapoff hanging, when
> CONFIG_HAVE_ARCH_SOFT_DIRTY and CONFIG_MEM_SOFT_DIRTY ifdefs were
> not quite as x86_64 had them.  I think it would be much clearer if
> HAVE_ARCH_SOFT_DIRTY was just a Kconfig option set by architectures
> to determine whether the MEM_SOFT_DIRTY option should be offered,
> and the actual code depend upon CONFIG_MEM_SOFT_DIRTY alone.
>
> But won't embark on that change myself: instead make swapoff more
> robust, by using pte_swp_clear_soft_dirty() on each pte it encounters,
> without an explicit #ifdef CONFIG_MEM_SOFT_DIRTY.  That being a no-op,
> whether the bit in question is defined as 0 or the asm-generic fallback
> is used, unless soft dirty is fully turned on.
>
> Why "maybe" in maybe_same_pte()?  Rename it pte_same_as_swp().
>

Ok this also explains, the _PAGE_PTE issue on powerpc you mentioned in the other
email.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>
>  mm/swapfile.c |   18 ++++--------------
>  1 file changed, 4 insertions(+), 14 deletions(-)
>
> --- 4.4-next/mm/swapfile.c	2016-01-06 11:54:46.327006983 -0800
> +++ linux/mm/swapfile.c	2016-01-09 13:39:19.632872694 -0800
> @@ -1109,19 +1109,9 @@ unsigned int count_swap_pages(int type,
>  }
>  #endif /* CONFIG_HIBERNATION */
>
> -static inline int maybe_same_pte(pte_t pte, pte_t swp_pte)
> +static inline int pte_same_as_swp(pte_t pte, pte_t swp_pte)
>  {
> -#ifdef CONFIG_MEM_SOFT_DIRTY
> -	/*
> -	 * When pte keeps soft dirty bit the pte generated
> -	 * from swap entry does not has it, still it's same
> -	 * pte from logical point of view.
> -	 */
> -	pte_t swp_pte_dirty = pte_swp_mksoft_dirty(swp_pte);
> -	return pte_same(pte, swp_pte) || pte_same(pte, swp_pte_dirty);
> -#else
> -	return pte_same(pte, swp_pte);
> -#endif
> +	return pte_same(pte_swp_clear_soft_dirty(pte), swp_pte);
>  }
>
>  /*
> @@ -1150,7 +1140,7 @@ static int unuse_pte(struct vm_area_stru
>  	}
>
>  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> -	if (unlikely(!maybe_same_pte(*pte, swp_entry_to_pte(entry)))) {
> +	if (unlikely(!pte_same_as_swp(*pte, swp_entry_to_pte(entry)))) {
>  		mem_cgroup_cancel_charge(page, memcg, false);
>  		ret = 0;
>  		goto out;
> @@ -1208,7 +1198,7 @@ static int unuse_pte_range(struct vm_are
>  		 * swapoff spends a _lot_ of time in this loop!
>  		 * Test inline before going to call unuse_pte.
>  		 */
> -		if (unlikely(maybe_same_pte(*pte, swp_pte))) {
> +		if (unlikely(pte_same_as_swp(*pte, swp_pte))) {
>  			pte_unmap(pte);
>  			ret = unuse_pte(vma, pmd, addr, entry, page);
>  			if (ret)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  2016-01-10  0:54 [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff Hugh Dickins
  2016-01-10  0:59 ` [PATCH next] mm: make swapoff more robust against soft dirty Hugh Dickins
  2016-01-10 14:07 ` [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff Cyrill Gorcunov
@ 2016-01-11  5:43 ` Aneesh Kumar K.V
  2016-01-11  6:05   ` Hugh Dickins
  2016-01-11 16:04 ` Laurent Dufour
  2016-01-12 12:32 ` [next] " Michael Ellerman
  4 siblings, 1 reply; 11+ messages in thread
From: Aneesh Kumar K.V @ 2016-01-11  5:43 UTC (permalink / raw)
  To: Hugh Dickins, Laurent Dufour
  Cc: Andrew Morton, Michael Ellerman, Cyrill Gorcunov,
	Martin Schwidefsky, linuxppc-dev, linux-mm

Hugh Dickins <hughd@google.com> writes:

> Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
> but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
> _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
> discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
> recognized.
>
> (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
> CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
> attempt to solve this problem.)
>
> It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and
> and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
> should be made more robust; but nevertheless, fix up the powerpc ifdefs
> as x86_64 and s390 (which met the same problem) have them, defining the
> bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.

Do we need this patch, if we make the maybe_same_pte() more robust. The
#ifdef with pte bits is always a confusing one and IMHO, we should avoid
that if we can ?

>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>
>  arch/powerpc/include/asm/book3s/64/hash.h    |    5 +++++
>  arch/powerpc/include/asm/book3s/64/pgtable.h |    9 ++++++---
>  2 files changed, 11 insertions(+), 3 deletions(-)
>
> --- 4.4-next/arch/powerpc/include/asm/book3s/64/hash.h	2016-01-06 11:54:01.377508976 -0800
> +++ linux/arch/powerpc/include/asm/book3s/64/hash.h	2016-01-09 13:54:24.410893347 -0800
> @@ -33,7 +33,12 @@
>  #define _PAGE_F_GIX_SHIFT	12
>  #define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
>  #define _PAGE_SPECIAL		0x10000 /* software: special page */
> +
> +#ifdef CONFIG_MEM_SOFT_DIRTY
>  #define _PAGE_SOFT_DIRTY	0x20000 /* software: software dirty tracking */
> +#else
> +#define _PAGE_SOFT_DIRTY	0x00000
> +#endif
>
>  /*
>   * We need to differentiate between explicit huge page and THP huge
> --- 4.4-next/arch/powerpc/include/asm/book3s/64/pgtable.h	2016-01-06 11:54:01.377508976 -0800
> +++ linux/arch/powerpc/include/asm/book3s/64/pgtable.h	2016-01-09 13:54:24.410893347 -0800
> @@ -162,8 +162,13 @@ static inline void pgd_set(pgd_t *pgdp,
>  #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val((pte)) })
>  #define __swp_entry_to_pte(x)		__pte((x).val)
>
> -#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
> +#ifdef CONFIG_MEM_SOFT_DIRTY
>  #define _PAGE_SWP_SOFT_DIRTY   (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
> +#else
> +#define _PAGE_SWP_SOFT_DIRTY	0UL
> +#endif /* CONFIG_MEM_SOFT_DIRTY */
> +
> +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>  static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
>  {
>  	return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY);
> @@ -176,8 +181,6 @@ static inline pte_t pte_swp_clear_soft_d
>  {
>  	return __pte(pte_val(pte) & ~_PAGE_SWP_SOFT_DIRTY);
>  }
> -#else
> -#define _PAGE_SWP_SOFT_DIRTY	0
>  #endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
>
>  void pgtable_cache_add(unsigned shift, void (*ctor)(void *));

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  2016-01-11  5:43 ` Aneesh Kumar K.V
@ 2016-01-11  6:05   ` Hugh Dickins
  2016-01-11  6:31     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 11+ messages in thread
From: Hugh Dickins @ 2016-01-11  6:05 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Hugh Dickins, Laurent Dufour, Andrew Morton, Michael Ellerman,
	Cyrill Gorcunov, Martin Schwidefsky, linuxppc-dev, linux-mm

On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> Hugh Dickins <hughd@google.com> writes:
> 
> > Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
> > but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
> > _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
> > discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
> > recognized.
> >
> > (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
> > CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
> > attempt to solve this problem.)
> >
> > It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and
> > and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
> > should be made more robust; but nevertheless, fix up the powerpc ifdefs
> > as x86_64 and s390 (which met the same problem) have them, defining the
> > bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.
> 
> Do we need this patch, if we make the maybe_same_pte() more robust. The
> #ifdef with pte bits is always a confusing one and IMHO, we should avoid
> that if we can ?

If maybe_same_pte() were more robust (as in the pte_same_as_swp() patch),
this patch here becomes an optimization rather than a correctness patch:
without this patch here, pte_same_as_swp() will perform an unnecessary 
transformation (masking out _PAGE_SWP_SOFT_DIRTY) from every one of the
millions of ptes it has to examine, on configs where it couldn't be set.
Or perhaps the processor gets that all nicely lined up without any actual
delay, I don't know.

I've already agreed that the way SOFT_DIRTY is currently config'ed is
too confusing; but until that's improved, I strongly recommend that you
follow the same way of handling this as x86_64 and s390 are doing - going
off and doing it differently is liable to lead to error, as we have seen.

So I recommend using the patch below too, whether or not you care for
the optimization.

Hugh

> 
> >
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> > ---
> >
> >  arch/powerpc/include/asm/book3s/64/hash.h    |    5 +++++
> >  arch/powerpc/include/asm/book3s/64/pgtable.h |    9 ++++++---
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > --- 4.4-next/arch/powerpc/include/asm/book3s/64/hash.h	2016-01-06 11:54:01.377508976 -0800
> > +++ linux/arch/powerpc/include/asm/book3s/64/hash.h	2016-01-09 13:54:24.410893347 -0800
> > @@ -33,7 +33,12 @@
> >  #define _PAGE_F_GIX_SHIFT	12
> >  #define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
> >  #define _PAGE_SPECIAL		0x10000 /* software: special page */
> > +
> > +#ifdef CONFIG_MEM_SOFT_DIRTY
> >  #define _PAGE_SOFT_DIRTY	0x20000 /* software: software dirty tracking */
> > +#else
> > +#define _PAGE_SOFT_DIRTY	0x00000
> > +#endif
> >
> >  /*
> >   * We need to differentiate between explicit huge page and THP huge
> > --- 4.4-next/arch/powerpc/include/asm/book3s/64/pgtable.h	2016-01-06 11:54:01.377508976 -0800
> > +++ linux/arch/powerpc/include/asm/book3s/64/pgtable.h	2016-01-09 13:54:24.410893347 -0800
> > @@ -162,8 +162,13 @@ static inline void pgd_set(pgd_t *pgdp,
> >  #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val((pte)) })
> >  #define __swp_entry_to_pte(x)		__pte((x).val)
> >
> > -#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
> > +#ifdef CONFIG_MEM_SOFT_DIRTY
> >  #define _PAGE_SWP_SOFT_DIRTY   (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
> > +#else
> > +#define _PAGE_SWP_SOFT_DIRTY	0UL
> > +#endif /* CONFIG_MEM_SOFT_DIRTY */
> > +
> > +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
> >  static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
> >  {
> >  	return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY);
> > @@ -176,8 +181,6 @@ static inline pte_t pte_swp_clear_soft_d
> >  {
> >  	return __pte(pte_val(pte) & ~_PAGE_SWP_SOFT_DIRTY);
> >  }
> > -#else
> > -#define _PAGE_SWP_SOFT_DIRTY	0
> >  #endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
> >
> >  void pgtable_cache_add(unsigned shift, void (*ctor)(void *));

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  2016-01-11  6:05   ` Hugh Dickins
@ 2016-01-11  6:31     ` Aneesh Kumar K.V
  2016-01-11  7:33       ` Hugh Dickins
  0 siblings, 1 reply; 11+ messages in thread
From: Aneesh Kumar K.V @ 2016-01-11  6:31 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Hugh Dickins, Laurent Dufour, Andrew Morton, Michael Ellerman,
	Cyrill Gorcunov, Martin Schwidefsky, linuxppc-dev, linux-mm

Hugh Dickins <hughd@google.com> writes:

> On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
>> Hugh Dickins <hughd@google.com> writes:
>> 
>> > Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
>> > but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
>> > _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
>> > discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
>> > recognized.
>> >
>> > (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
>> > CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
>> > attempt to solve this problem.)
>> >
>> > It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and
>> > and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
>> > should be made more robust; but nevertheless, fix up the powerpc ifdefs
>> > as x86_64 and s390 (which met the same problem) have them, defining the
>> > bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.
>> 
>> Do we need this patch, if we make the maybe_same_pte() more robust. The
>> #ifdef with pte bits is always a confusing one and IMHO, we should avoid
>> that if we can ?
>
> If maybe_same_pte() were more robust (as in the pte_same_as_swp() patch),
> this patch here becomes an optimization rather than a correctness patch:
> without this patch here, pte_same_as_swp() will perform an unnecessary 
> transformation (masking out _PAGE_SWP_SOFT_DIRTY) from every one of the
> millions of ptes it has to examine, on configs where it couldn't be set.
> Or perhaps the processor gets that all nicely lined up without any actual
> delay, I don't know.

But we have
#ifndef CONFIG_HAVE_ARCH_SOFT_DIRTY
static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
{
	return pte;
}
#endif 

If we fix the CONFIG_HAVE_ARCH_SOFT_DIRTY correctly, we can do the same
optmization without the #ifdef of pte bits right ?

>
> I've already agreed that the way SOFT_DIRTY is currently config'ed is
> too confusing; but until that's improved, I strongly recommend that you
> follow the same way of handling this as x86_64 and s390 are doing - going
> off and doing it differently is liable to lead to error, as we have seen.
>
> So I recommend using the patch below too, whether or not you care for
> the optimization.
>
> Hugh


-aneesh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  2016-01-11  6:31     ` Aneesh Kumar K.V
@ 2016-01-11  7:33       ` Hugh Dickins
  0 siblings, 0 replies; 11+ messages in thread
From: Hugh Dickins @ 2016-01-11  7:33 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Hugh Dickins, Laurent Dufour, Andrew Morton, Michael Ellerman,
	Cyrill Gorcunov, Martin Schwidefsky, linuxppc-dev, linux-mm

On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> Hugh Dickins <hughd@google.com> writes:
> > On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> >> Hugh Dickins <hughd@google.com> writes:
> >> 
> >> > Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
> >> > but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
> >> > _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
> >> > discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
> >> > recognized.
> >> >
> >> > (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
> >> > CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
> >> > attempt to solve this problem.)
> >> >
> >> > It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and
> >> > and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
> >> > should be made more robust; but nevertheless, fix up the powerpc ifdefs
> >> > as x86_64 and s390 (which met the same problem) have them, defining the
> >> > bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.
> >> 
> >> Do we need this patch, if we make the maybe_same_pte() more robust. The
> >> #ifdef with pte bits is always a confusing one and IMHO, we should avoid
> >> that if we can ?
> >
> > If maybe_same_pte() were more robust (as in the pte_same_as_swp() patch),
> > this patch here becomes an optimization rather than a correctness patch:
> > without this patch here, pte_same_as_swp() will perform an unnecessary 
> > transformation (masking out _PAGE_SWP_SOFT_DIRTY) from every one of the
> > millions of ptes it has to examine, on configs where it couldn't be set.
> > Or perhaps the processor gets that all nicely lined up without any actual
> > delay, I don't know.
> 
> But we have
> #ifndef CONFIG_HAVE_ARCH_SOFT_DIRTY
> static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
> {
> 	return pte;
> }
> #endif 
> 
> If we fix the CONFIG_HAVE_ARCH_SOFT_DIRTY correctly, we can do the same
> optmization without the #ifdef of pte bits right ?

I'm not sure that I understand you (I'll have to look at your patch),
but suspect you're not optimizing the CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_MEM_SOFT_DIRTY not set case.

Which would not be the end of the world, but...
 
> >
> > I've already agreed that the way SOFT_DIRTY is currently config'ed is
> > too confusing; but until that's improved, I strongly recommend that you
> > follow the same way of handling this as x86_64 and s390 are doing - going
> > off and doing it differently is liable to lead to error, as we have seen.

... as before, I don't think that doing it differently is a good idea.

Hugh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  2016-01-10  0:54 [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff Hugh Dickins
                   ` (2 preceding siblings ...)
  2016-01-11  5:43 ` Aneesh Kumar K.V
@ 2016-01-11 16:04 ` Laurent Dufour
  2016-01-12 12:32 ` [next] " Michael Ellerman
  4 siblings, 0 replies; 11+ messages in thread
From: Laurent Dufour @ 2016-01-11 16:04 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Michael Ellerman, Aneesh Kumar K.V,
	Cyrill Gorcunov, Martin Schwidefsky, linuxppc-dev, linux-mm

On 10/01/2016 01:54, Hugh Dickins wrote:
> Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
> but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
> _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
> discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
> recognized.
> 
> (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
> CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
> attempt to solve this problem.)
> 
> It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and
> and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
> should be made more robust; but nevertheless, fix up the powerpc ifdefs
> as x86_64 and s390 (which met the same problem) have them, defining the
> bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>

Acked-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>

Thanks, Hugh!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  2016-01-10  0:54 [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff Hugh Dickins
                   ` (3 preceding siblings ...)
  2016-01-11 16:04 ` Laurent Dufour
@ 2016-01-12 12:32 ` Michael Ellerman
  4 siblings, 0 replies; 11+ messages in thread
From: Michael Ellerman @ 2016-01-12 12:32 UTC (permalink / raw)
  To: Hugh Dickins, Laurent Dufour
  Cc: Cyrill Gorcunov, linux-mm, Aneesh Kumar K.V, Martin Schwidefsky,
	Andrew Morton, linuxppc-dev

On Sun, 2016-10-01 at 00:54:59 UTC, Hugh Dickins wrote:
> Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
> but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
> _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
> discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
> recognized.
> 
> (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
> CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
> attempt to solve this problem.)
> 
> It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and
> and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
> should be made more robust; but nevertheless, fix up the powerpc ifdefs
> as x86_64 and s390 (which met the same problem) have them, defining the
> bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
> Acked-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2f10f1a7884e97a68e52c4b6f7

cheers

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-01-12 12:32 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-01-10  0:54 [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff Hugh Dickins
2016-01-10  0:59 ` [PATCH next] mm: make swapoff more robust against soft dirty Hugh Dickins
2016-01-10 14:09   ` Cyrill Gorcunov
2016-01-11  5:39   ` Aneesh Kumar K.V
2016-01-10 14:07 ` [PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff Cyrill Gorcunov
2016-01-11  5:43 ` Aneesh Kumar K.V
2016-01-11  6:05   ` Hugh Dickins
2016-01-11  6:31     ` Aneesh Kumar K.V
2016-01-11  7:33       ` Hugh Dickins
2016-01-11 16:04 ` Laurent Dufour
2016-01-12 12:32 ` [next] " Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).