linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
@ 2016-01-10  0:50 Hugh Dickins
  2016-01-11  4:56 ` Aneesh Kumar K.V
  0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2016-01-10  0:50 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Andrew Morton, Michael Ellerman, Laurent Dufour, linuxppc-dev,
	linux-mm

Swapoff after swapping hangs on the G5.  That's because the _PAGE_PTE
bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
cannot be recognized.

I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
present entries.

But if that's wrong, a reasonable alternative would be to
#define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
#define __swp_entry_to_pte(x)	__pte((x).val | _PAGE_PTE)

Signed-off-by: Hugh Dickins <hughd@google.com>
---

 arch/powerpc/mm/pgtable.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- 4.4-next/arch/powerpc/mm/pgtable.c	2016-01-06 11:54:01.477512251 -0800
+++ linux/arch/powerpc/mm/pgtable.c	2016-01-09 13:51:15.793485717 -0800
@@ -180,9 +180,10 @@ void set_pte_at(struct mm_struct *mm, un
 	VM_WARN_ON((pte_val(*ptep) & (_PAGE_PRESENT | _PAGE_USER)) ==
 		(_PAGE_PRESENT | _PAGE_USER));
 	/*
-	 * Add the pte bit when tryint set a pte
+	 * Add the pte bit when setting a pte (not a swap entry)
 	 */
-	pte = __pte(pte_val(pte) | _PAGE_PTE);
+	if (pte_val(pte) & _PAGE_PRESENT)
+		pte = __pte(pte_val(pte) | _PAGE_PTE);
 
 	/* Note: mm->context.id might not yet have been assigned as
 	 * this context might not have been activated yet when this


* Re: [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
  2016-01-10  0:50 [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff Hugh Dickins
@ 2016-01-11  4:56 ` Aneesh Kumar K.V
  2016-01-11  5:45   ` Hugh Dickins
  2016-01-11  5:55   ` Aneesh Kumar K.V
  0 siblings, 2 replies; 5+ messages in thread
From: Aneesh Kumar K.V @ 2016-01-11  4:56 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Michael Ellerman, Laurent Dufour, linuxppc-dev,
	linux-mm

Hugh Dickins <hughd@google.com> writes:

> Swapoff after swapping hangs on the G5.  That's because the _PAGE_PTE
> bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
> cannot be recognized.
>
> I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
> this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
> present entries.

One of the reasons we added _PAGE_PTE is to enable HUGETLB migration. So
we want migration ptes to have _PAGE_PTE set.

>
> But if that's wrong, a reasonable alternative would be to
> #define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
> #define __swp_entry_to_pte(x)	__pte((x).val | _PAGE_PTE)
>

We do clear the _PAGE_PTE bit when converting swp_entry_t to type and
offset. Can you share the stack trace for the hang? That would help me
understand this better.

> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>
>  arch/powerpc/mm/pgtable.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> --- 4.4-next/arch/powerpc/mm/pgtable.c	2016-01-06 11:54:01.477512251 -0800
> +++ linux/arch/powerpc/mm/pgtable.c	2016-01-09 13:51:15.793485717 -0800
> @@ -180,9 +180,10 @@ void set_pte_at(struct mm_struct *mm, un
>  	VM_WARN_ON((pte_val(*ptep) & (_PAGE_PRESENT | _PAGE_USER)) ==
>  		(_PAGE_PRESENT | _PAGE_USER));
>  	/*
> -	 * Add the pte bit when tryint set a pte
> +	 * Add the pte bit when setting a pte (not a swap entry)
>  	 */
> -	pte = __pte(pte_val(pte) | _PAGE_PTE);
> +	if (pte_val(pte) & _PAGE_PRESENT)
> +		pte = __pte(pte_val(pte) | _PAGE_PTE);
>
>  	/* Note: mm->context.id might not yet have been assigned as
>  	 * this context might not have been activated yet when this

-aneesh


* Re: [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
  2016-01-11  4:56 ` Aneesh Kumar K.V
@ 2016-01-11  5:45   ` Hugh Dickins
  2016-01-11  5:55   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 5+ messages in thread
From: Hugh Dickins @ 2016-01-11  5:45 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Hugh Dickins, Andrew Morton, Michael Ellerman, Laurent Dufour,
	linuxppc-dev, linux-mm

On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> Hugh Dickins <hughd@google.com> writes:
> 
> > Swapoff after swapping hangs on the G5.  That's because the _PAGE_PTE
> > bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
> > cannot be recognized.
> >
> > I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
> > this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
> > present entries.
> 
> One of the reasons we added _PAGE_PTE is to enable HUGETLB migration. So
> we want migration ptes to have _PAGE_PTE set.

Okay, I won't pretend to understand the role of _PAGE_PTE in that;
but if it helps you to have _PAGE_PTE set in (swap and) migration entries,
that's very easily done with the alternative I suggested for pgtable.h:

-#define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val((pte)) })
-#define __swp_entry_to_pte(x)		__pte((x).val)
+#define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
+#define __swp_entry_to_pte(x)	__pte((x).val | _PAGE_PTE)

I did test that variant (with set_pte_at() restored to how you have it);
but not understanding _PAGE_PTE, I thought it odd to have in a swap entry.
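
To make that variant concrete, here is a rough userspace model of the
round trip (not kernel code).  The model_ types, the MODEL_PAGE_PTE name
and its bit value are made up for the sketch, and it assumes the swap
entry encoding never uses that bit itself:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: every name and bit value here is an assumption. */
typedef struct { uint64_t val; } model_swp_entry_t;
typedef struct { uint64_t pte; } model_pte_t;

#define MODEL_PAGE_PTE 0x1ULL	/* stand-in for _PAGE_PTE, value assumed */

/* Models the proposed __swp_entry_to_pte(): OR the bit in on the way in. */
static model_pte_t model_swp_entry_to_pte(model_swp_entry_t e)
{
	return (model_pte_t){ e.val | MODEL_PAGE_PTE };
}

/* Models the proposed __pte_to_swp_entry(): mask the bit off on the way out. */
static model_swp_entry_t model_pte_to_swp_entry(model_pte_t p)
{
	return (model_swp_entry_t){ p.pte & ~MODEL_PAGE_PTE };
}

/* Returns 1 if the pte carries the bit and the entry comes back unchanged. */
static int model_round_trip_ok(uint64_t raw)
{
	model_swp_entry_t in = { raw };
	model_pte_t pte;
	model_swp_entry_t out;

	/* Precondition of the sketch: the entry itself never uses the bit. */
	assert((raw & MODEL_PAGE_PTE) == 0);

	pte = model_swp_entry_to_pte(in);
	out = model_pte_to_swp_entry(pte);
	return (pte.pte & MODEL_PAGE_PTE) != 0 && out.val == in.val;
}
```

Calling model_round_trip_ok() with any value that has the bit clear
should return 1: the pte in the table carries the bit, while the
swp_entry_t handed back to the caller does not.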

> 
> >
> > But if that's wrong, a reasonable alternative would be to
> > #define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
> > #define __swp_entry_to_pte(x)	__pte((x).val | _PAGE_PTE)
> >
> 
> We do clear the _PAGE_PTE bit when converting swp_entry_t to type and
> offset. Can you share the stack trace for the hang? That would help me
> understand this better.

The stack trace can be anywhere below try_to_unuse() in mm/swapfile.c,
since swapoff is circling around and around that function, reading from
each used swap block into a page, then trying to find where that page
belongs, looking at every non-file pte of every mm that ever swapped.

The code to look at is unuse_pte_range(), which at the top does
	pte_t swp_pte = swp_entry_to_pte(entry);
to get the form it hopes to find in the page table; then scans doing
		if (unlikely(maybe_same_pte(*pte, swp_pte))) {
on each pte slot.  Ignoring the MEM_SOFT_DIRTY complication (which
had its own independent bug) maybe_same_pte() just does pte_same().
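
A rough userspace model of that scan, to show why it never matches.
This is only a sketch: the model_ functions, MODEL_PAGE_PTE and its
value are invented here, and pte_same() is modelled as plain equality,
whereas the real one may mask some bits first:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_PTE 0x1ULL	/* stand-in for _PAGE_PTE, value assumed */

/* Models set_pte_at() before any fix: the bit is ORed in unconditionally. */
static uint64_t model_set_pte(uint64_t pte)
{
	return pte | MODEL_PAGE_PTE;
}

/* Models swp_entry_to_pte(); add_bit selects the variant that ORs the bit in. */
static uint64_t model_swp_entry_to_pte(uint64_t entry, int add_bit)
{
	return add_bit ? (entry | MODEL_PAGE_PTE) : entry;
}

/*
 * Models the unuse_pte_range() scan: compare every slot against the
 * expected swap pte.  Returns the matching index, or -1 if none matches.
 */
static long model_find_swap_pte(const uint64_t *table, size_t n,
				uint64_t swp_pte)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < n; i++)
		if (table[i] == swp_pte)	/* pte_same(), simplified */
			return (long)i;
	return -1;
}
```

With a slot stored through model_set_pte(), a search for the pte built
without the bit finds nothing, which is the situation swapoff keeps
circling in; a search for the pte built with the bit finds the slot.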

Hugh


* Re: [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
  2016-01-11  4:56 ` Aneesh Kumar K.V
  2016-01-11  5:45   ` Hugh Dickins
@ 2016-01-11  5:55   ` Aneesh Kumar K.V
  2016-01-11  6:09     ` Hugh Dickins
  1 sibling, 1 reply; 5+ messages in thread
From: Aneesh Kumar K.V @ 2016-01-11  5:55 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Michael Ellerman, Laurent Dufour, linuxppc-dev,
	linux-mm

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> Hugh Dickins <hughd@google.com> writes:
>
>> Swapoff after swapping hangs on the G5.  That's because the _PAGE_PTE
>> bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
>> cannot be recognized.
>>
>> I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
>> this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
>> present entries.
>
> One of the reasons we added _PAGE_PTE is to enable HUGETLB migration. So
> we want migration ptes to have _PAGE_PTE set.
>
>>
>> But if that's wrong, a reasonable alternative would be to
>> #define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
>> #define __swp_entry_to_pte(x)	__pte((x).val | _PAGE_PTE)
>>

Your other email w.r.t. soft dirty bits explained this. What I missed was
the fact that the core kernel expects swp_entry_t to be in an arch-neutral
format.  The confusing part was "arch_entry":

static inline pte_t swp_entry_to_pte(swp_entry_t entry)
{
	swp_entry_t arch_entry;
.....
}
	
IMHO we should use the alternative you suggested above. I can write a
patch with additional comments around that if you want me to do that.
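
As a rough illustration of the layering (a userspace sketch, not the
kernel code): the core entry keeps type and offset in its own format,
and only the conversion to and from a pte knows about the arch layout
and the pte bit.  All the names, shifts and masks below are invented
for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Every layout below is an assumption made up for this sketch. */
#define MODEL_PAGE_PTE		0x1ULL	/* stand-in for _PAGE_PTE */
#define MODEL_CORE_SHIFT	58	/* core format: type in the top bits */
#define MODEL_ARCH_TYPE_SHIFT	1	/* arch format: type above the pte bit */
#define MODEL_ARCH_OFF_SHIFT	8	/* arch format: offset above the type */
#define MODEL_TYPE_MASK		0x1fULL

typedef struct { uint64_t val; } core_swp_entry_t;	/* what core mm sees */

static core_swp_entry_t core_entry(unsigned type, uint64_t offset)
{
	assert(type <= MODEL_TYPE_MASK);
	return (core_swp_entry_t){ ((uint64_t)type << MODEL_CORE_SHIFT) | offset };
}

static unsigned core_type(core_swp_entry_t e)
{
	return (unsigned)(e.val >> MODEL_CORE_SHIFT);
}

static uint64_t core_offset(core_swp_entry_t e)
{
	return e.val & ((1ULL << MODEL_CORE_SHIFT) - 1);
}

/* core entry -> arch layout -> pte; the pte bit is added last. */
static uint64_t model_entry_to_pte(core_swp_entry_t e)
{
	uint64_t arch = ((uint64_t)core_type(e) << MODEL_ARCH_TYPE_SHIFT) |
			(core_offset(e) << MODEL_ARCH_OFF_SHIFT);

	return arch | MODEL_PAGE_PTE;
}

/* pte -> arch layout (pte bit stripped) -> core entry. */
static core_swp_entry_t model_pte_to_entry(uint64_t pte)
{
	uint64_t arch = pte & ~MODEL_PAGE_PTE;
	unsigned type = (unsigned)((arch >> MODEL_ARCH_TYPE_SHIFT) & MODEL_TYPE_MASK);

	return core_entry(type, arch >> MODEL_ARCH_OFF_SHIFT);
}
```

In this model the core code only ever compares or decodes
core_swp_entry_t values, so where the arch puts its pte bit stays
hidden behind the two conversion helpers.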

-aneesh


* Re: [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
  2016-01-11  5:55   ` Aneesh Kumar K.V
@ 2016-01-11  6:09     ` Hugh Dickins
  0 siblings, 0 replies; 5+ messages in thread
From: Hugh Dickins @ 2016-01-11  6:09 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Hugh Dickins, Andrew Morton, Michael Ellerman, Laurent Dufour,
	linuxppc-dev, linux-mm

On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> > Hugh Dickins <hughd@google.com> writes:
> >
> >> Swapoff after swapping hangs on the G5.  That's because the _PAGE_PTE
> >> bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
> >> cannot be recognized.
> >>
> >> I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
> >> this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
> >> present entries.
> >
> > One of the reasons we added _PAGE_PTE is to enable HUGETLB migration. So
> > we want migration ptes to have _PAGE_PTE set.
> >
> >>
> >> But if that's wrong, a reasonable alternative would be to
> >> #define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
> >> #define __swp_entry_to_pte(x)	__pte((x).val | _PAGE_PTE)
> >>
> 
> Your other email w.r.t. soft dirty bits explained this. What I missed was
> the fact that the core kernel expects swp_entry_t to be in an arch-neutral
> format.  The confusing part was "arch_entry":
> 
> static inline pte_t swp_entry_to_pte(swp_entry_t entry)
> {
> 	swp_entry_t arch_entry;
> .....
> }
> 	
> IMHO we should use the alternative you suggested above. I can write a
> patch with additional comments around that if you want me to do that.

Sure, please go ahead - thanks.

Hugh
