* [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
From: Hugh Dickins @ 2016-01-10 0:50 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: Andrew Morton, Michael Ellerman, Laurent Dufour, linuxppc-dev,
linux-mm

Swapoff after swapping hangs on the G5. That's because the _PAGE_PTE
bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
cannot be recognized.

I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
present entries.

But if that's wrong, a reasonable alternative would be to
  #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
  #define __swp_entry_to_pte(x) __pte((x).val | _PAGE_PTE)

Signed-off-by: Hugh Dickins <hughd@google.com>
---
arch/powerpc/mm/pgtable.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- 4.4-next/arch/powerpc/mm/pgtable.c 2016-01-06 11:54:01.477512251 -0800
+++ linux/arch/powerpc/mm/pgtable.c 2016-01-09 13:51:15.793485717 -0800
@@ -180,9 +180,10 @@ void set_pte_at(struct mm_struct *mm, un
 	VM_WARN_ON((pte_val(*ptep) & (_PAGE_PRESENT | _PAGE_USER)) ==
 		(_PAGE_PRESENT | _PAGE_USER));
 	/*
-	 * Add the pte bit when tryint set a pte
+	 * Add the pte bit when setting a pte (not a swap entry)
 	 */
-	pte = __pte(pte_val(pte) | _PAGE_PTE);
+	if (pte_val(pte) & _PAGE_PRESENT)
+		pte = __pte(pte_val(pte) | _PAGE_PTE);
 
 	/* Note: mm->context.id might not yet have been assigned as
 	 * this context might not have been activated yet when this
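
For readers following the hunk above, here is a minimal userspace sketch of
the behaviour the patch aims for: the marker bit is added only to present
entries, so a non-present (swap) pte is stored unchanged. The p3_ names and
both bit positions are assumptions made up for illustration; they are not
the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pte_t. */
typedef struct { uint64_t val; } p3_pte_t;

/* Assumed bit positions, chosen only for this sketch. */
#define P3_PAGE_PRESENT (1ULL << 0)
#define P3_PAGE_PTE     (1ULL << 62)

/*
 * Models the patched step in set_pte_at(): add the marker bit only
 * when the entry is present; leave swap-style entries untouched.
 */
static p3_pte_t p3_prepare_pte(p3_pte_t pte)
{
	if (pte.val & P3_PAGE_PRESENT)
		pte.val |= P3_PAGE_PTE;
	return pte;
}
```

With this shape, a value with the present bit clear comes back bit-for-bit
identical, which is what a later pte_same() comparison against a freshly
built swap pte relies on.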
* Re: [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
From: Aneesh Kumar K.V @ 2016-01-11 4:56 UTC (permalink / raw)
To: Hugh Dickins
Cc: Andrew Morton, Michael Ellerman, Laurent Dufour, linuxppc-dev,
linux-mm
Hugh Dickins <hughd@google.com> writes:
> Swapoff after swapping hangs on the G5. That's because the _PAGE_PTE
> bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
> cannot be recognized.
>
> I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
> this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
> present entries.

One of the reasons we added _PAGE_PTE is to enable HUGETLB migration. So
we want migration ptes to have _PAGE_PTE set.

>
> But if that's wrong, a reasonable alternative would be to
> #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
> #define __swp_entry_to_pte(x) __pte((x).val | _PAGE_PTE)
>

We do clear the _PAGE_PTE bit when converting swp_entry_t to type and
offset. Can you share the stack trace for the hang? That will help me
understand this better.

> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>
> arch/powerpc/mm/pgtable.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> --- 4.4-next/arch/powerpc/mm/pgtable.c 2016-01-06 11:54:01.477512251 -0800
> +++ linux/arch/powerpc/mm/pgtable.c 2016-01-09 13:51:15.793485717 -0800
> @@ -180,9 +180,10 @@ void set_pte_at(struct mm_struct *mm, un
> VM_WARN_ON((pte_val(*ptep) & (_PAGE_PRESENT | _PAGE_USER)) ==
> (_PAGE_PRESENT | _PAGE_USER));
> /*
> - * Add the pte bit when tryint set a pte
> + * Add the pte bit when setting a pte (not a swap entry)
> */
> - pte = __pte(pte_val(pte) | _PAGE_PTE);
> + if (pte_val(pte) & _PAGE_PRESENT)
> + pte = __pte(pte_val(pte) | _PAGE_PTE);
>
> /* Note: mm->context.id might not yet have been assigned as
> * this context might not have been activated yet when this
-aneesh
* Re: [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
From: Hugh Dickins @ 2016-01-11 5:45 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: Hugh Dickins, Andrew Morton, Michael Ellerman, Laurent Dufour,
linuxppc-dev, linux-mm
On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> Hugh Dickins <hughd@google.com> writes:
>
> > Swapoff after swapping hangs on the G5. That's because the _PAGE_PTE
> > bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
> > cannot be recognized.
> >
> > I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
> > this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
> > present entries.
>
> One of the reasons we added _PAGE_PTE is to enable HUGETLB migration. So
> we want migration ptes to have _PAGE_PTE set.

Okay, I won't pretend to understand the role of _PAGE_PTE in that;
but if it helps you to have _PAGE_PTE set in (swap and) migration entries,
that's very easily done with the alternative I suggested for pgtable.h:

-#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val((pte)) })
-#define __swp_entry_to_pte(x) __pte((x).val)
+#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
+#define __swp_entry_to_pte(x) __pte((x).val | _PAGE_PTE)

I did test that variant (with set_pte_at() restored to how you have it);
but not understanding _PAGE_PTE, I thought it odd to have in a swap entry.

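
A minimal userspace sketch of why the pgtable.h variant above round-trips
cleanly, even when set_pte_at() ORs the marker bit into every entry. All
rt_ names and the bit position are assumptions for illustration; only the
mask-on-read, set-on-write shape is taken from the two macros above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's pte_t and swp_entry_t. */
typedef struct { uint64_t val; } rt_pte_t;
typedef struct { uint64_t val; } rt_swp_entry_t;

/* Assumed bit position, chosen only for this sketch. */
#define RT_PAGE_PTE (1ULL << 62)

/* Same shape as the proposed macros: strip the bit going to a swap
 * entry, add it going back to a pte. */
#define rt_pte_to_swp_entry(pte) \
	((rt_swp_entry_t){ (pte).val & ~RT_PAGE_PTE })
#define rt_swp_entry_to_pte(x) \
	((rt_pte_t){ (x).val | RT_PAGE_PTE })

/* Models the unpatched set_pte_at(): the marker bit is always added. */
static rt_pte_t rt_set_pte(rt_pte_t pte)
{
	return (rt_pte_t){ pte.val | RT_PAGE_PTE };
}

/*
 * True when a swap entry survives the trip through the page table:
 * the stored pte equals the pte swapoff would look for, and decoding
 * the stored pte gives back the original arch-neutral entry.
 */
static bool rt_round_trip_ok(uint64_t raw)
{
	rt_swp_entry_t entry = { raw & ~RT_PAGE_PTE };
	rt_pte_t stored = rt_set_pte(rt_swp_entry_to_pte(entry));
	rt_pte_t wanted = rt_swp_entry_to_pte(entry);

	return stored.val == wanted.val &&
	       rt_pte_to_swp_entry(stored).val == entry.val;
}
```

Because the bit is already set by the conversion, the unconditional OR in
the modelled set_pte_at() becomes a no-op for swap entries, so both sides of
the comparison agree.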
>
> >
> > But if that's wrong, a reasonable alternative would be to
> > #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
> > #define __swp_entry_to_pte(x) __pte((x).val | _PAGE_PTE)
> >
>
> We do clear the _PAGE_PTE bit when converting swp_entry_t to type and
> offset. Can you share the stack trace for the hang? That will help me
> understand this better.

The stack trace can be anywhere below try_to_unuse() in mm/swapfile.c,
since swapoff is circling around and around that function, reading from
each used swap block into a page, then trying to find where that page
belongs, looking at every non-file pte of every mm that ever swapped.

The code to look at is unuse_pte_range(), which at the top does
	pte_t swp_pte = swp_entry_to_pte(entry)
to get the form it hopes to find in the page table; then scans doing
	if (unlikely(maybe_same_pte(*pte, swp_pte))) {
on each pte slot. Ignoring the MEM_SOFT_DIRTY complication (which
had its own independent bug) maybe_same_pte() just does pte_same().

Hugh
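
The scan described above can be modelled in userspace to show why it never
finds its target before the fix. This is only a sketch: the demo_ names, the
four-slot table and the bit position are all assumptions, and the real
unuse_pte_range() does considerably more than compare values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's pte_t and swp_entry_t. */
typedef struct { uint64_t val; } demo_pte_t;
typedef struct { uint64_t val; } demo_swp_entry_t;

/* Assumed bit position, chosen only for this sketch. */
#define DEMO_PAGE_PTE (1ULL << 62)

static bool demo_pte_same(demo_pte_t a, demo_pte_t b)
{
	return a.val == b.val;
}

/* Pre-fix conversion: the swap entry bits are used as the pte as-is. */
static demo_pte_t demo_swp_entry_to_pte(demo_swp_entry_t e)
{
	return (demo_pte_t){ e.val };
}

/* Pre-fix set_pte_at(): the marker bit is ORed into whatever is stored. */
static void demo_set_pte_at(demo_pte_t *slot, demo_pte_t pte)
{
	slot->val = pte.val | DEMO_PAGE_PTE;
}

/*
 * Build the expected pte once, then compare it with every slot.
 * Returns the index of the matching slot, or -1 if nothing matches.
 */
static long demo_unuse_scan(const demo_pte_t *table, size_t n,
			    demo_swp_entry_t entry)
{
	demo_pte_t swp_pte = demo_swp_entry_to_pte(entry);

	for (size_t i = 0; i < n; i++)
		if (demo_pte_same(table[i], swp_pte))
			return (long)i;
	return -1;
}

/* Store one swap pte through demo_set_pte_at(), then scan for it. */
static long demo_swapout_then_scan(demo_swp_entry_t entry)
{
	demo_pte_t table[4] = { {0}, {0}, {0}, {0} };

	/* The entry must be non-zero and must not use the marker bit. */
	assert(entry.val != 0 && (entry.val & DEMO_PAGE_PTE) == 0);
	demo_set_pte_at(&table[2], demo_swp_entry_to_pte(entry));
	return demo_unuse_scan(table, 4, entry);
}
```

In this model the stored value always carries the extra bit while the value
being searched for never does, so the scan returns -1 every time. That is
consistent with swapoff circling in try_to_unuse() without making progress.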
* Re: [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
From: Aneesh Kumar K.V @ 2016-01-11 5:55 UTC (permalink / raw)
To: Hugh Dickins
Cc: Andrew Morton, Michael Ellerman, Laurent Dufour, linuxppc-dev,
linux-mm
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> Hugh Dickins <hughd@google.com> writes:
>
>> Swapoff after swapping hangs on the G5. That's because the _PAGE_PTE
>> bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
>> cannot be recognized.
>>
>> I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
>> this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
>> present entries.
>
> One of the reasons we added _PAGE_PTE is to enable HUGETLB migration. So
> we want migration ptes to have _PAGE_PTE set.
>
>>
>> But if that's wrong, a reasonable alternative would be to
>> #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
>> #define __swp_entry_to_pte(x) __pte((x).val | _PAGE_PTE)
>>

Your other email w.r.t. soft dirty bits explained this. What I missed was
the fact that the core kernel expects swp_entry_t to be in an arch-neutral
format. The confusing part was "arch_entry":

static inline pte_t swp_entry_to_pte(swp_entry_t entry)
{
	swp_entry_t arch_entry;
	.....
}

IMHO we should use the alternative you suggested above. I can write a
patch with additional comments around that if you want me to do that.

-aneesh
* Re: [PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff
From: Hugh Dickins @ 2016-01-11 6:09 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: Hugh Dickins, Andrew Morton, Michael Ellerman, Laurent Dufour,
linuxppc-dev, linux-mm
On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> > Hugh Dickins <hughd@google.com> writes:
> >
> >> Swapoff after swapping hangs on the G5. That's because the _PAGE_PTE
> >> bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
> >> cannot be recognized.
> >>
> >> I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
> >> this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
> >> present entries.
> >
> > One of the reasons we added _PAGE_PTE is to enable HUGETLB migration. So
> > we want migration ptes to have _PAGE_PTE set.
> >
> >>
> >> But if that's wrong, a reasonable alternative would be to
> >> #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
> >> #define __swp_entry_to_pte(x) __pte((x).val | _PAGE_PTE)
> >>
>
> Your other email w.r.t. soft dirty bits explained this. What I missed was
> the fact that the core kernel expects swp_entry_t to be in an arch-neutral
> format. The confusing part was "arch_entry":
>
> static inline pte_t swp_entry_to_pte(swp_entry_t entry)
> {
> swp_entry_t arch_entry;
> .....
> }
>
> IMHO we should use the alternative you suggested above. I can write a
> patch with additional comments around that if you want me to do that.
Sure, please go ahead - thanks.
Hugh