* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
[not found] ` <530D9F50.1080400@de.ibm.com>
@ 2014-02-26 14:50 ` Oleg Nesterov
2014-02-26 15:06 ` Christian Borntraeger
0 siblings, 1 reply; 18+ messages in thread
From: Oleg Nesterov @ 2014-02-26 14:50 UTC (permalink / raw)
To: Christian Borntraeger
Cc: akpm, linux-kernel, viro, schwidefsky, rientjes, riel, peterz,
pbonzini, mingo, mgorman, kirill.shutemov, heiko.carstens, hannes,
gerald.schaefer, ebiederm, aarcange, athorlton
On 02/26, Christian Borntraeger wrote:
>
> On 26/02/14 00:53, akpm@linux-foundation.org wrote:
> > Subject: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
> > To: athorlton@sgi.com,aarcange@redhat.com,borntraeger@de.ibm.com,ebiederm@xmission.com,gerald.schaefer@de.ibm.com,hannes@cmpxchg.org,heiko.carstens@de.ibm.com,kirill.shutemov@linux.intel.com,mgorman@suse.de,mingo@kernel.org,oleg@redhat.com,pbonzini@redhat.com,peterz@infradead.org,riel@redhat.com,rientjes@google.com,schwidefsky@de.ibm.com,viro@zeniv.linux.org.uk
> > From: akpm@linux-foundation.org
> > Date: Tue, 25 Feb 2014 15:53:13 -0800
> >
> >
> > The patch titled
> > Subject: mm: revert "thp: make MADV_HUGEPAGE check for mm->def_flags"
> > has been added to the -mm tree. Its filename is
> > mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch
> >
> > This patch should soon appear at
> > http://ozlabs.org/~akpm/mmots/broken-out/mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch
> > and later at
> > http://ozlabs.org/~akpm/mmotm/broken-out/mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch
>
>
> NAK.
>
> Since 2012 qemu does call "qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);" for all kvm pages.
> (commit ad0b5321f1f797274603ebbe20108b0750baee94 Call MADV_HUGEPAGE for guest RAM allocations) so this
> breaks any recent kvm guest on s390.
Well, I can't really discuss the changes in arch/s390.
But perhaps qemu can be changed to avoid MADV_HUGEPAGE on s390 ?
Otherwise I'd suggest the change below.
Oleg.
--- x/mm/huge_memory.c
+++ x/mm/huge_memory.c
@@ -1968,8 +1968,6 @@ out:
int hugepage_madvise(struct vm_area_struct *vma,
unsigned long *vm_flags, int advice)
{
- struct mm_struct *mm = vma->vm_mm;
-
switch (advice) {
case MADV_HUGEPAGE:
/*
@@ -1977,8 +1975,16 @@ int hugepage_madvise(struct vm_area_stru
*/
if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
return -EINVAL;
- if (mm->def_flags & VM_NOHUGEPAGE)
+
+/*
+ * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
+ * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
+ */
+#ifdef CONFIG_S390
+ if (vma->vm_mm->def_flags & VM_NOHUGEPAGE)
return -EINVAL;
+#endif
+
*vm_flags &= ~VM_NOHUGEPAGE;
*vm_flags |= VM_HUGEPAGE;
/*
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 14:50 ` + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree Oleg Nesterov
@ 2014-02-26 15:06 ` Christian Borntraeger
2014-02-26 15:22 ` Kirill A. Shutemov
2014-02-26 15:31 ` Oleg Nesterov
0 siblings, 2 replies; 18+ messages in thread
From: Christian Borntraeger @ 2014-02-26 15:06 UTC (permalink / raw)
To: Oleg Nesterov
Cc: akpm, linux-kernel, viro, schwidefsky, rientjes, riel, peterz,
pbonzini, mingo, mgorman, kirill.shutemov, heiko.carstens, hannes,
gerald.schaefer, ebiederm, aarcange, athorlton
On 26/02/14 15:50, Oleg Nesterov wrote:
[...]
>> NAK.
>>
>> Since 2012 qemu does call "qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);" for all kvm pages.
>> (commit ad0b5321f1f797274603ebbe20108b0750baee94 Call MADV_HUGEPAGE for guest RAM allocations) so this
>> breaks any recent kvm guest on s390.
>
> Well, I can't really discuss the changes in arch/s390.
>
> But perhaps qemu can be changed to avoid MADV_HUGEPAGE on s390 ?
> Otherwise I'd suggest the change below.
>
> Oleg.
>
>
> --- x/mm/huge_memory.c
> +++ x/mm/huge_memory.c
> @@ -1968,8 +1968,6 @@ out:
> int hugepage_madvise(struct vm_area_struct *vma,
> unsigned long *vm_flags, int advice)
> {
> - struct mm_struct *mm = vma->vm_mm;
> -
> switch (advice) {
> case MADV_HUGEPAGE:
> /*
> @@ -1977,8 +1975,16 @@ int hugepage_madvise(struct vm_area_stru
> */
> if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> return -EINVAL;
> - if (mm->def_flags & VM_NOHUGEPAGE)
> +
> +/*
> + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> + */
> +#ifdef CONFIG_S390
> + if (vma->vm_mm->def_flags & VM_NOHUGEPAGE)
> return -EINVAL;
> +#endif
> +
Ifdefs are ugly but might be the only way of not breaking existing userspace.
If we come up with a solution for THP in KVM host processes on s390, we can
then remove that wart. We could even limit that hack to KVM only processes
to retain Alex' prctl capability by checking mm_has_pgste (defined in
arch/s390/include/asm/pgtable.h should be included via linux/mm.h)
> +
> +/*
> + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> + */
> +#ifdef CONFIG_S390
> + if ((vma->vm_mm->def_flags & VM_NOHUGEPAGE) && mm_has_pgste(vma->vm_mm))
> return -EINVAL;
> +#endif
> +
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 15:06 ` Christian Borntraeger
@ 2014-02-26 15:22 ` Kirill A. Shutemov
2014-02-26 15:31 ` Oleg Nesterov
1 sibling, 0 replies; 18+ messages in thread
From: Kirill A. Shutemov @ 2014-02-26 15:22 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Oleg Nesterov, akpm, linux-kernel, viro, schwidefsky, rientjes,
riel, peterz, pbonzini, mingo, mgorman, kirill.shutemov,
heiko.carstens, hannes, gerald.schaefer, ebiederm, aarcange,
athorlton
Christian Borntraeger wrote:
> On 26/02/14 15:50, Oleg Nesterov wrote:
> [...]
>
> >> NAK.
> >>
> >> Since 2012 qemu does call "qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);" for all kvm pages.
> >> (commit ad0b5321f1f797274603ebbe20108b0750baee94 Call MADV_HUGEPAGE for guest RAM allocations) so this
> >> breaks any recent kvm guest on s390.
> >
> > Well, I can't really discuss the changes in arch/s390.
> >
> > But perhaps qemu can be changed to avoid MADV_HUGEPAGE on s390 ?
> > Otherwise I'd suggest the change below.
> >
> > Oleg.
> >
> >
> > --- x/mm/huge_memory.c
> > +++ x/mm/huge_memory.c
> > @@ -1968,8 +1968,6 @@ out:
> > int hugepage_madvise(struct vm_area_struct *vma,
> > unsigned long *vm_flags, int advice)
> > {
> > - struct mm_struct *mm = vma->vm_mm;
> > -
> > switch (advice) {
> > case MADV_HUGEPAGE:
> > /*
> > @@ -1977,8 +1975,16 @@ int hugepage_madvise(struct vm_area_stru
> > */
> > if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> > return -EINVAL;
> > - if (mm->def_flags & VM_NOHUGEPAGE)
> > +
> > +/*
> > + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> > + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> > + */
> > +#ifdef CONFIG_S390
> > + if (vma->vm_mm->def_flags & VM_NOHUGEPAGE)
> > return -EINVAL;
> > +#endif
> > +
>
> Ifdefs are ugly
IS_ENABLED(CONFIG_S390) should help with this.
--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 15:06 ` Christian Borntraeger
2014-02-26 15:22 ` Kirill A. Shutemov
@ 2014-02-26 15:31 ` Oleg Nesterov
2014-02-26 16:55 ` Gerald Schaefer
2014-02-26 16:57 ` Peter Zijlstra
1 sibling, 2 replies; 18+ messages in thread
From: Oleg Nesterov @ 2014-02-26 15:31 UTC (permalink / raw)
To: Christian Borntraeger
Cc: akpm, linux-kernel, viro, schwidefsky, rientjes, riel, peterz,
pbonzini, mingo, mgorman, kirill.shutemov, heiko.carstens, hannes,
gerald.schaefer, ebiederm, aarcange, athorlton
On 02/26, Christian Borntraeger wrote:
>
> On 26/02/14 15:50, Oleg Nesterov wrote:
> >
> > But perhaps qemu can be changed to avoid MADV_HUGEPAGE on s390 ?
> > Otherwise I'd suggest the change below.
> >
> > Oleg.
> >
> >
> > --- x/mm/huge_memory.c
> > +++ x/mm/huge_memory.c
> > @@ -1968,8 +1968,6 @@ out:
> > int hugepage_madvise(struct vm_area_struct *vma,
> > unsigned long *vm_flags, int advice)
> > {
> > - struct mm_struct *mm = vma->vm_mm;
> > -
> > switch (advice) {
> > case MADV_HUGEPAGE:
> > /*
> > @@ -1977,8 +1975,16 @@ int hugepage_madvise(struct vm_area_stru
> > */
> > if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> > return -EINVAL;
> > - if (mm->def_flags & VM_NOHUGEPAGE)
> > +
> > +/*
> > + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> > + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> > + */
> > +#ifdef CONFIG_S390
> > + if (vma->vm_mm->def_flags & VM_NOHUGEPAGE)
> > return -EINVAL;
> > +#endif
> > +
>
> Ifdefs are ugly but might be the only way of not breaking existing userspace.
Yes, agreed. but note that this code is already s390-specific, nobody
else puts VM_NOHUGEPAGE into ->def_flags (until this series of course).
> If we come up with a solution for THP in KVM host processes on s390, we can
> then remove that wart. We could even limit that hack to KVM only processes
> to retain Alex' prctl capability by checking mm_has_pgste
Yes, yes, I looked at mm->context.has_pgste too ;) But I wasn't sure
we can rely on it. Thanks.
> > +/*
> > + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> > + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> > + */
> > +#ifdef CONFIG_S390
> > + if ((vma->vm_mm->def_flags & VM_NOHUGEPAGE) && mm_has_pgste(vma->vm_mm))
> > return -EINVAL;
Hmm... but why do we need to check VM_NOHUGEPAGE then? Can't the
change below work?
Oleg.
--- x/arch/s390/mm/pgtable.c
+++ x/arch/s390/mm/pgtable.c
@@ -1084,7 +1084,6 @@ static inline void thp_split_mm(struct m
vma->vm_flags &= ~VM_HUGEPAGE;
vma->vm_flags |= VM_NOHUGEPAGE;
}
- mm->def_flags |= VM_NOHUGEPAGE;
}
#else
static inline void thp_split_mm(struct mm_struct *mm)
--- x/mm/huge_memory.c
+++ x/mm/huge_memory.c
@@ -1968,8 +1968,6 @@ out:
int hugepage_madvise(struct vm_area_struct *vma,
unsigned long *vm_flags, int advice)
{
- struct mm_struct *mm = vma->vm_mm;
-
switch (advice) {
case MADV_HUGEPAGE:
/*
@@ -1977,8 +1975,12 @@ int hugepage_madvise(struct vm_area_stru
*/
if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
return -EINVAL;
- if (mm->def_flags & VM_NOHUGEPAGE)
+
+#ifdef CONFIG_S390
+ if (mm_has_pgste(vma->vm_mm))
return -EINVAL;
+#endif
+
*vm_flags &= ~VM_NOHUGEPAGE;
*vm_flags |= VM_HUGEPAGE;
/*
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 15:31 ` Oleg Nesterov
@ 2014-02-26 16:55 ` Gerald Schaefer
2014-02-26 16:57 ` Peter Zijlstra
1 sibling, 0 replies; 18+ messages in thread
From: Gerald Schaefer @ 2014-02-26 16:55 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Christian Borntraeger, akpm, linux-kernel, viro, schwidefsky,
rientjes, riel, peterz, pbonzini, mingo, mgorman, kirill.shutemov,
heiko.carstens, hannes, ebiederm, aarcange, athorlton
On Wed, 26 Feb 2014 16:31:44 +0100
Oleg Nesterov <oleg@redhat.com> wrote:
> On 02/26, Christian Borntraeger wrote:
> >
> > On 26/02/14 15:50, Oleg Nesterov wrote:
> > >
> > > But perhaps qemu can be changed to avoid MADV_HUGEPAGE on s390 ?
> > > Otherwise I'd suggest the change below.
> > >
> > > Oleg.
> > >
> > >
> > > --- x/mm/huge_memory.c
> > > +++ x/mm/huge_memory.c
> > > @@ -1968,8 +1968,6 @@ out:
> > > int hugepage_madvise(struct vm_area_struct *vma,
> > > unsigned long *vm_flags, int advice)
> > > {
> > > - struct mm_struct *mm = vma->vm_mm;
> > > -
> > > switch (advice) {
> > > case MADV_HUGEPAGE:
> > > /*
> > > @@ -1977,8 +1975,16 @@ int hugepage_madvise(struct vm_area_stru
> > > */
> > > if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> > > return -EINVAL;
> > > - if (mm->def_flags & VM_NOHUGEPAGE)
> > > +
> > > +/*
> > > + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> > > + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> > > + */
> > > +#ifdef CONFIG_S390
> > > + if (vma->vm_mm->def_flags & VM_NOHUGEPAGE)
> > > return -EINVAL;
> > > +#endif
> > > +
> >
> > Ifdefs are ugly but might be the only way of not breaking existing userspace.
>
> Yes, agreed. but note that this code is already s390-specific, nobody
> else puts VM_NOHUGEPAGE into ->def_flags (until this series of course).
>
> > If we come up with a solution for THP in KVM host processes on s390, we can
> > then remove that wart. We could even limit that hack to KVM only processes
> > to retain Alex' prctl capability by checking mm_has_pgste
>
> Yes, yes, I looked at mm->context.has_pgste too ;) But I wasn't sure
> we can rely on it. Thanks.
>
> > > +/*
> > > + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> > > + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> > > + */
> > > +#ifdef CONFIG_S390
> > > + if ((vma->vm_mm->def_flags & VM_NOHUGEPAGE) && mm_has_pgste(vma->vm_mm))
> > > return -EINVAL;
>
> Hmm... but why do we need to check VM_NOHUGEPAGE then? Can't the
> change below work?
Yes, good point, that should do the trick.
>
> Oleg.
>
> --- x/arch/s390/mm/pgtable.c
> +++ x/arch/s390/mm/pgtable.c
> @@ -1084,7 +1084,6 @@ static inline void thp_split_mm(struct m
> vma->vm_flags &= ~VM_HUGEPAGE;
> vma->vm_flags |= VM_NOHUGEPAGE;
> }
> - mm->def_flags |= VM_NOHUGEPAGE;
> }
> #else
> static inline void thp_split_mm(struct mm_struct *mm)
> --- x/mm/huge_memory.c
> +++ x/mm/huge_memory.c
> @@ -1968,8 +1968,6 @@ out:
> int hugepage_madvise(struct vm_area_struct *vma,
> unsigned long *vm_flags, int advice)
> {
> - struct mm_struct *mm = vma->vm_mm;
> -
> switch (advice) {
> case MADV_HUGEPAGE:
> /*
> @@ -1977,8 +1975,12 @@ int hugepage_madvise(struct vm_area_stru
> */
> if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> return -EINVAL;
> - if (mm->def_flags & VM_NOHUGEPAGE)
> +
> +#ifdef CONFIG_S390
> + if (mm_has_pgste(vma->vm_mm))
> return -EINVAL;
> +#endif
> +
> *vm_flags &= ~VM_NOHUGEPAGE;
> *vm_flags |= VM_HUGEPAGE;
> /*
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 15:31 ` Oleg Nesterov
2014-02-26 16:55 ` Gerald Schaefer
@ 2014-02-26 16:57 ` Peter Zijlstra
2014-02-26 17:22 ` Alex Thorlton
2014-02-26 18:08 ` Oleg Nesterov
1 sibling, 2 replies; 18+ messages in thread
From: Peter Zijlstra @ 2014-02-26 16:57 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Christian Borntraeger, akpm, linux-kernel, viro, schwidefsky,
rientjes, riel, pbonzini, mingo, mgorman, kirill.shutemov,
heiko.carstens, hannes, gerald.schaefer, ebiederm, aarcange,
athorlton
On Wed, Feb 26, 2014 at 04:31:44PM +0100, Oleg Nesterov wrote:
> --- x/arch/s390/mm/pgtable.c
> +++ x/arch/s390/mm/pgtable.c
> @@ -1084,7 +1084,6 @@ static inline void thp_split_mm(struct m
> vma->vm_flags &= ~VM_HUGEPAGE;
> vma->vm_flags |= VM_NOHUGEPAGE;
> }
> - mm->def_flags |= VM_NOHUGEPAGE;
> }
> #else
> static inline void thp_split_mm(struct mm_struct *mm)
> --- x/mm/huge_memory.c
> +++ x/mm/huge_memory.c
> @@ -1968,8 +1968,6 @@ out:
> int hugepage_madvise(struct vm_area_struct *vma,
> unsigned long *vm_flags, int advice)
> {
> - struct mm_struct *mm = vma->vm_mm;
> -
> switch (advice) {
> case MADV_HUGEPAGE:
> /*
> @@ -1977,8 +1975,12 @@ int hugepage_madvise(struct vm_area_stru
> */
> if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> return -EINVAL;
> - if (mm->def_flags & VM_NOHUGEPAGE)
> +
> +#ifdef CONFIG_S390
Do we want a comment here, explaining why s390 is special again?
> + if (mm_has_pgste(vma->vm_mm))
> return -EINVAL;
> +#endif
> +
> *vm_flags &= ~VM_NOHUGEPAGE;
> *vm_flags |= VM_HUGEPAGE;
> /*
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 16:57 ` Peter Zijlstra
@ 2014-02-26 17:22 ` Alex Thorlton
2014-02-26 18:06 ` Oleg Nesterov
2014-02-26 18:08 ` Oleg Nesterov
1 sibling, 1 reply; 18+ messages in thread
From: Alex Thorlton @ 2014-02-26 17:22 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Oleg Nesterov, Christian Borntraeger, akpm, linux-kernel, viro,
schwidefsky, rientjes, riel, pbonzini, mingo, mgorman,
kirill.shutemov, heiko.carstens, hannes, gerald.schaefer,
ebiederm, aarcange
On Wed, Feb 26, 2014 at 05:57:59PM +0100, Peter Zijlstra wrote:
> On Wed, Feb 26, 2014 at 04:31:44PM +0100, Oleg Nesterov wrote:
> Do we want a comment here, explaining why s390 is special again?
Here's what I've got, with everybody's suggestions spun together:
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 82166bf..7f01491 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1968,8 +1968,6 @@ out:
int hugepage_madvise(struct vm_area_struct *vma,
unsigned long *vm_flags, int advice)
{
- struct mm_struct *mm = vma->vm_mm;
-
switch (advice) {
case MADV_HUGEPAGE:
/*
@@ -1977,8 +1975,16 @@ int hugepage_madvise(struct vm_area_struct *vma,
*/
if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
return -EINVAL;
- if (mm->def_flags & VM_NOHUGEPAGE)
+
+/*
+ * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
+ * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
+ */
+#ifdef CONFIG_S390
+ if (mm_has_pgste(vma->vm_mm))
return -EINVAL;
+#endif
+
*vm_flags &= ~VM_NOHUGEPAGE;
*vm_flags |= VM_HUGEPAGE;
/*
I've compiled and tested and it works ok on my machines (I don't have an
s390 to test on though :). Is everybody okay with this solution?
BTW, Kirill, I looked at using IS_ENABLED to clean up the ifdef, but it
won't work here, since mm_has_pgste isn't defined unless CONFIG_S390
is defined. i.e.: This line:
if (IS_ENABLED(CONFIG_S390) && mm_has_pgste(vma->vm_mm))
won't compile.
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 17:22 ` Alex Thorlton
@ 2014-02-26 18:06 ` Oleg Nesterov
2014-02-26 19:05 ` Gerald Schaefer
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Oleg Nesterov @ 2014-02-26 18:06 UTC (permalink / raw)
To: Alex Thorlton
Cc: Peter Zijlstra, Christian Borntraeger, akpm, linux-kernel, viro,
schwidefsky, rientjes, riel, pbonzini, mingo, mgorman,
kirill.shutemov, heiko.carstens, hannes, gerald.schaefer,
ebiederm, aarcange
On 02/26, Alex Thorlton wrote:
>
> + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> + */
> +#ifdef CONFIG_S390
> + if (mm_has_pgste(vma->vm_mm))
> return -EINVAL;
> +#endif
The comment is not really right...
And personally I think that
@@ -504,6 +504,9 @@ static int gmap_connect_pgtable(unsigned long address, unsigned long segment,
if (!pmd_present(*pmd) &&
__pte_alloc(mm, vma, pmd, vmaddr))
return -ENOMEM;
+ /* large pmds cannot yet be handled */
+ if (pmd_large(*pmd))
+ return -EFAULT;
change still makes sense, so that we can simply revert this s390-
specific hack in hugepage_madvise().
I'd suggest the patch below on top of your changes, but I won't argue.
It would be nice to also change thp_split_mm() to not not play with
mm->def_flags, but I am not sure if we can do this.
Oleg.
---
Subject: [PATCH] s390: make sure MADV_HUGEPAGE fails after s390_enable_sie()
As Christian pointed out, the recent 'Revert "thp: make MADV_HUGEPAGE
check for mm->def_flags"' breaks qemu, it does QEMU_MADV_HUGEPAGE for
all kvm pages but this doesn't work after s390_enable_sie/thp_split_mm.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a4310a5..0e08d92 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1970,11 +1970,22 @@ int hugepage_madvise(struct vm_area_struct *vma,
{
switch (advice) {
case MADV_HUGEPAGE:
+#ifdef CONFIG_S390
+ /*
+ * MADV_HUGEPAGE is broken after s390_enable_sie(), qemu
+ * blindly does madvise(MADV_HUGEPAGE) for for all kvm pages
+ * and expects it must fail on s390. Avoid a possible SIGSEGV
+ * until qemu is changed.
+ */
+ if (mm_has_pgste(vma->vm_mm))
+ return -EINVAL;
+#endif
/*
* Be somewhat over-protective like KSM for now!
*/
if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
return -EINVAL;
+
*vm_flags &= ~VM_NOHUGEPAGE;
*vm_flags |= VM_HUGEPAGE;
/*
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 16:57 ` Peter Zijlstra
2014-02-26 17:22 ` Alex Thorlton
@ 2014-02-26 18:08 ` Oleg Nesterov
1 sibling, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2014-02-26 18:08 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Christian Borntraeger, akpm, linux-kernel, viro, schwidefsky,
rientjes, riel, pbonzini, mingo, mgorman, kirill.shutemov,
heiko.carstens, hannes, gerald.schaefer, ebiederm, aarcange,
athorlton
On 02/26, Peter Zijlstra wrote:
>
> On Wed, Feb 26, 2014 at 04:31:44PM +0100, Oleg Nesterov wrote:
> > /*
> > @@ -1977,8 +1975,12 @@ int hugepage_madvise(struct vm_area_stru
> > */
> > if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> > return -EINVAL;
> > - if (mm->def_flags & VM_NOHUGEPAGE)
> > +
> > +#ifdef CONFIG_S390
>
> Do we want a comment here, explaining why s390 is special again?
Yes, yes, sure. Especially because this is (hopefully) the temporary
hack.
Oleg.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 18:06 ` Oleg Nesterov
@ 2014-02-26 19:05 ` Gerald Schaefer
2014-02-27 16:45 ` Oleg Nesterov
2014-02-26 19:27 ` Christian Borntraeger
2014-02-26 20:41 ` Paolo Bonzini
2 siblings, 1 reply; 18+ messages in thread
From: Gerald Schaefer @ 2014-02-26 19:05 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Alex Thorlton, Peter Zijlstra, Christian Borntraeger, akpm,
linux-kernel, viro, schwidefsky, rientjes, riel, pbonzini, mingo,
mgorman, kirill.shutemov, heiko.carstens, hannes, ebiederm,
aarcange
On Wed, 26 Feb 2014 19:06:03 +0100
Oleg Nesterov <oleg@redhat.com> wrote:
> On 02/26, Alex Thorlton wrote:
> >
> > + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> > + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> > + */
> > +#ifdef CONFIG_S390
> > + if (mm_has_pgste(vma->vm_mm))
> > return -EINVAL;
> > +#endif
>
> The comment is not really right...
>
> And personally I think that
>
> @@ -504,6 +504,9 @@ static int gmap_connect_pgtable(unsigned long address, unsigned long segment,
> if (!pmd_present(*pmd) &&
> __pte_alloc(mm, vma, pmd, vmaddr))
> return -ENOMEM;
> + /* large pmds cannot yet be handled */
> + if (pmd_large(*pmd))
> + return -EFAULT;
>
> change still makes sense, so that we can simply revert this s390-
> specific hack in hugepage_madvise().
Yes, agreed.
>
> I'd suggest the patch below on top of your changes, but I won't argue.
>
> It would be nice to also change thp_split_mm() to not not play with
> mm->def_flags, but I am not sure if we can do this.
Hmm, I'm also wondering about this. Basically, we only need VM_NOHUGEPAGE
in vma->vm_flags, which is done for all existing vmas in thp_split_mm().
But if there should be new vmas created afterwards, it would still be
necessary to also have VM_NOHUGEPAGE in mm->def_flags, because the
vm_flags for new vmas will be set via OR of mm->def_flags, e.g. in
do_brk() and do_mmap_pgoff().
I guess the question is if new vmas can be created for the qemu/kvm host
process?
Anyway, this would then have to be a separate patch, to keep the
"revertability" of this hack.
>
> Oleg.
> ---
>
> Subject: [PATCH] s390: make sure MADV_HUGEPAGE fails after s390_enable_sie()
>
> As Christian pointed out, the recent 'Revert "thp: make MADV_HUGEPAGE
> check for mm->def_flags"' breaks qemu, it does QEMU_MADV_HUGEPAGE for
> all kvm pages but this doesn't work after s390_enable_sie/thp_split_mm.
>
> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a4310a5..0e08d92 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1970,11 +1970,22 @@ int hugepage_madvise(struct vm_area_struct *vma,
> {
> switch (advice) {
> case MADV_HUGEPAGE:
> +#ifdef CONFIG_S390
> + /*
> + * MADV_HUGEPAGE is broken after s390_enable_sie(), qemu
> + * blindly does madvise(MADV_HUGEPAGE) for for all kvm pages
> + * and expects it must fail on s390. Avoid a possible SIGSEGV
> + * until qemu is changed.
> + */
> + if (mm_has_pgste(vma->vm_mm))
> + return -EINVAL;
> +#endif
> /*
> * Be somewhat over-protective like KSM for now!
> */
> if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> return -EINVAL;
> +
> *vm_flags &= ~VM_NOHUGEPAGE;
> *vm_flags |= VM_HUGEPAGE;
> /*
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 18:06 ` Oleg Nesterov
2014-02-26 19:05 ` Gerald Schaefer
@ 2014-02-26 19:27 ` Christian Borntraeger
2014-02-26 19:39 ` Alex Thorlton
2014-02-26 20:41 ` Paolo Bonzini
2 siblings, 1 reply; 18+ messages in thread
From: Christian Borntraeger @ 2014-02-26 19:27 UTC (permalink / raw)
To: Oleg Nesterov, Alex Thorlton
Cc: Peter Zijlstra, akpm, linux-kernel, viro, schwidefsky, rientjes,
riel, pbonzini, mingo, mgorman, kirill.shutemov, heiko.carstens,
hannes, gerald.schaefer, ebiederm, aarcange
On 26/02/14 19:06, Oleg Nesterov wrote:
> On 02/26, Alex Thorlton wrote:
>>
>> + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
>> + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
>> + */
>> +#ifdef CONFIG_S390
>> + if (mm_has_pgste(vma->vm_mm))
>> return -EINVAL;
>> +#endif
>
> The comment is not really right...
>
> And personally I think that
>
> @@ -504,6 +504,9 @@ static int gmap_connect_pgtable(unsigned long address, unsigned long segment,
> if (!pmd_present(*pmd) &&
> __pte_alloc(mm, vma, pmd, vmaddr))
> return -ENOMEM;
> + /* large pmds cannot yet be handled */
> + if (pmd_large(*pmd))
> + return -EFAULT;
>
> change still makes sense, so that we can simply revert this s390-
> specific hack in hugepage_madvise().
Yes, it still makes sense to cover existing THPs here.
> I'd suggest the patch below on top of your changes, but I won't argue.
>
> It would be nice to also change thp_split_mm() to not not play with
> mm->def_flags, but I am not sure if we can do this.
>
> Oleg.
> ---
>
> Subject: [PATCH] s390: make sure MADV_HUGEPAGE fails after s390_enable_sie()
>
> As Christian pointed out, the recent 'Revert "thp: make MADV_HUGEPAGE
> check for mm->def_flags"' breaks qemu, it does QEMU_MADV_HUGEPAGE for
> all kvm pages but this doesn't work after s390_enable_sie/thp_split_mm.
>
> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a4310a5..0e08d92 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1970,11 +1970,22 @@ int hugepage_madvise(struct vm_area_struct *vma,
> {
> switch (advice) {
> case MADV_HUGEPAGE:
> +#ifdef CONFIG_S390
> + /*
> + * MADV_HUGEPAGE is broken after s390_enable_sie(), qemu
> + * blindly does madvise(MADV_HUGEPAGE) for for all kvm pages
> + * and expects it must fail on s390. Avoid a possible SIGSEGV
> + * until qemu is changed.
I prefer:
* until kvm/s390 can handle large pages in the host.
Otherwise qemu has to be changed again, if we get THP working for kvm.
> + */
> + if (mm_has_pgste(vma->vm_mm))
> + return -EINVAL;
> +#endif
> /*
> * Be somewhat over-protective like KSM for now!
> */
> if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> return -EINVAL;
> +
Unrelated white space?
> *vm_flags &= ~VM_NOHUGEPAGE;
> *vm_flags |= VM_HUGEPAGE;
> /*
>
With the comment and white space change:
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Thanks for the quick patch
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 19:27 ` Christian Borntraeger
@ 2014-02-26 19:39 ` Alex Thorlton
2014-02-26 23:24 ` Andrew Morton
0 siblings, 1 reply; 18+ messages in thread
From: Alex Thorlton @ 2014-02-26 19:39 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Oleg Nesterov, Peter Zijlstra, akpm, linux-kernel, viro,
schwidefsky, rientjes, riel, pbonzini, mingo, mgorman,
kirill.shutemov, heiko.carstens, hannes, gerald.schaefer,
ebiederm, aarcange
On Wed, Feb 26, 2014 at 08:27:36PM +0100, Christian Borntraeger wrote:
> On 26/02/14 19:06, Oleg Nesterov wrote:
> > On 02/26, Alex Thorlton wrote:
> >>
> >> + * MADV_HUGEPAGE after PRCTL_THP_DISABLE is broken on s390 because
> >> + * qemu blindly does madvise(MADV_HUGEPAGE) after s390_enable_sie().
> >> + */
> >> +#ifdef CONFIG_S390
> >> + if (mm_has_pgste(vma->vm_mm))
> >> return -EINVAL;
> >> +#endif
> >
> > The comment is not really right...
> >
> > And personally I think that
> >
> > @@ -504,6 +504,9 @@ static int gmap_connect_pgtable(unsigned long address, unsigned long segment,
> > if (!pmd_present(*pmd) &&
> > __pte_alloc(mm, vma, pmd, vmaddr))
> > return -ENOMEM;
> > + /* large pmds cannot yet be handled */
> > + if (pmd_large(*pmd))
> > + return -EFAULT;
> >
> > change still makes sense, so that we can simply revert this s390-
> > specific hack in hugepage_madvise().
>
> Yes, it still makes sense to cover existing THPs here.
Yes, it does. I just snipped the chunk of the original patch that I
actually changed in my last e-mail. I didn't intend to actually remove
the above portion in the final patch - sorry for the confusion!
>
>
> > I'd suggest the patch below on top of your changes, but I won't argue.
I like this approach, and your updated comment as well. This should
probably go in as [PATCH 2/4] in my series. Do I need to spin a v5
with this new patch included in the series, or will Andrew pull this in?
> >
> > It would be nice to also change thp_split_mm() to not not play with
> > mm->def_flags, but I am not sure if we can do this.
> >
> > Oleg.
> > ---
> >
> > Subject: [PATCH] s390: make sure MADV_HUGEPAGE fails after s390_enable_sie()
> >
> > As Christian pointed out, the recent 'Revert "thp: make MADV_HUGEPAGE
> > check for mm->def_flags"' breaks qemu, it does QEMU_MADV_HUGEPAGE for
> > all kvm pages but this doesn't work after s390_enable_sie/thp_split_mm.
> >
> > Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> > Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index a4310a5..0e08d92 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1970,11 +1970,22 @@ int hugepage_madvise(struct vm_area_struct *vma,
> > {
> > switch (advice) {
> > case MADV_HUGEPAGE:
> > +#ifdef CONFIG_S390
> > + /*
> > + * MADV_HUGEPAGE is broken after s390_enable_sie(), qemu
> > + * blindly does madvise(MADV_HUGEPAGE) for for all kvm pages
> > + * and expects it must fail on s390. Avoid a possible SIGSEGV
> > + * until qemu is changed.
>
> I prefer:
> * until kvm/s390 can handle large pages in the host.
>
> Otherwise qemu has to be changed again, if we get THP working for kvm.
>
> > + */
> > + if (mm_has_pgste(vma->vm_mm))
> > + return -EINVAL;
> > +#endif
> > /*
> > * Be somewhat over-protective like KSM for now!
> > */
> > if (*vm_flags & (VM_HUGEPAGE | VM_NO_THP))
> > return -EINVAL;
> > +
>
> Unrelated white space?
> > *vm_flags &= ~VM_NOHUGEPAGE;
> > *vm_flags |= VM_HUGEPAGE;
> > /*
> >
>
>
>
> With the comment and white space change:
>
> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
>
> Thanks for the quick patch
Yes, thank you all for the quick responses, and for looking over these
patches!
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 18:06 ` Oleg Nesterov
2014-02-26 19:05 ` Gerald Schaefer
2014-02-26 19:27 ` Christian Borntraeger
@ 2014-02-26 20:41 ` Paolo Bonzini
2014-02-27 16:34 ` Oleg Nesterov
2 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2014-02-26 20:41 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Alex Thorlton, Peter Zijlstra, Christian Borntraeger, akpm,
linux-kernel, viro, schwidefsky, rientjes, riel, mingo, mgorman,
kirill shutemov, heiko carstens, hannes, gerald schaefer,
ebiederm, aarcange
> +#ifdef CONFIG_S390
> + /*
> + * MADV_HUGEPAGE is broken after s390_enable_sie(), qemu
> + * blindly does madvise(MADV_HUGEPAGE) for for all kvm pages
> + * and expects it must fail on s390. Avoid a possible SIGSEGV
> + * until qemu is changed.
> + */
> + if (mm_has_pgste(vma->vm_mm))
> + return -EINVAL;
> +#endif
The comment is not quite true. QEMU doesn't expect that the madvise fails.
It simply expects that the madvise doesn't cause SIGSEGVs or later
malfunctioning, because (quoting tha man page) madvise "does not influence
the semantics of the application".
There's nothing to fix in QEMU, in fact I believe this should return 0
rather than -EINVAL. It's perfectly legal for the kernel to ignore an madvise,
and this is what should happen here.
Paolo
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 19:39 ` Alex Thorlton
@ 2014-02-26 23:24 ` Andrew Morton
2014-02-27 0:01 ` Alex Thorlton
0 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2014-02-26 23:24 UTC (permalink / raw)
To: Alex Thorlton
Cc: Christian Borntraeger, Oleg Nesterov, Peter Zijlstra,
linux-kernel, viro, schwidefsky, rientjes, riel, pbonzini, mingo,
mgorman, kirill.shutemov, heiko.carstens, hannes, gerald.schaefer,
ebiederm, aarcange
On Wed, 26 Feb 2014 13:39:17 -0600 Alex Thorlton <athorlton@sgi.com> wrote:
> >
> >
> > > I'd suggest the patch below on top of your changes, but I won't argue.
>
> I like this approach, and your updated comment as well. This should
> probably go in as [PATCH 2/4] in my series. Do I need to spin a v5
> with this new patch included in the series, or will Andrew pull this in?
Paolo's comments on Oleg's patch remain unaddressed, so please take a
look at that and send out the final version?
An incremental patch would be preferred, please. I'll later fold that
into mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch to
avoid any bisection holes.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 23:24 ` Andrew Morton
@ 2014-02-27 0:01 ` Alex Thorlton
2014-02-27 17:26 ` Alex Thorlton
0 siblings, 1 reply; 18+ messages in thread
From: Alex Thorlton @ 2014-02-27 0:01 UTC (permalink / raw)
To: Andrew Morton
Cc: Christian Borntraeger, Oleg Nesterov, Peter Zijlstra,
linux-kernel, viro, schwidefsky, rientjes, riel, pbonzini, mingo,
mgorman, kirill.shutemov, heiko.carstens, hannes, gerald.schaefer,
ebiederm, aarcange
On Wed, Feb 26, 2014 at 03:24:34PM -0800, Andrew Morton wrote:
> On Wed, 26 Feb 2014 13:39:17 -0600 Alex Thorlton <athorlton@sgi.com> wrote:
>
> > >
> > >
> > > > I'd suggest the patch below on top of your changes, but I won't argue.
> >
> > I like this approach, and your updated comment as well. This should
> > probably go in as [PATCH 2/4] in my series. Do I need to spin a v5
> > with this new patch included in the series, or will Andrew pull this in?
>
> Paolo's comments on Oleg's patch remain unaddressed, so please take a
> look at that and send out the final version?
Got it. I'll get that to you tonight/tomorrow morning.
> An incremental patch would be preferred, please. I'll later fold that
> into mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch to
> avoid any bisection holes.
Understood. I'll keep them separate.
Thanks, Andrew!
- Alex
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 20:41 ` Paolo Bonzini
@ 2014-02-27 16:34 ` Oleg Nesterov
0 siblings, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2014-02-27 16:34 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Alex Thorlton, Peter Zijlstra, Christian Borntraeger, akpm,
linux-kernel, viro, schwidefsky, rientjes, riel, mingo, mgorman,
kirill shutemov, heiko carstens, hannes, gerald schaefer,
ebiederm, aarcange
On 02/26, Paolo Bonzini wrote:
>
> > +#ifdef CONFIG_S390
> > + /*
> > + * MADV_HUGEPAGE is broken after s390_enable_sie(), qemu
> > + * blindly does madvise(MADV_HUGEPAGE) for for all kvm pages
> > + * and expects it must fail on s390. Avoid a possible SIGSEGV
> > + * until qemu is changed.
> > + */
> > + if (mm_has_pgste(vma->vm_mm))
> > + return -EINVAL;
> > +#endif
>
> The comment is not quite true. QEMU doesn't expect that the madvise fails.
Yes, sorry. I didn't mean "it expects -EINVAL".
> It simply expects that the madvise doesn't cause SIGSEGVs or later
> malfunctioning, because (quoting tha man page) madvise "does not influence
> the semantics of the application".
Yes, I understand. But currently this means "MADV_HUGEPAGE should not
actually work", this is what I tried to say.
> There's nothing to fix in QEMU,
I was going to argue, but this is probably true too.
In short: I agree with any comment ;)
Oleg.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-26 19:05 ` Gerald Schaefer
@ 2014-02-27 16:45 ` Oleg Nesterov
0 siblings, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2014-02-27 16:45 UTC (permalink / raw)
To: Gerald Schaefer
Cc: Alex Thorlton, Peter Zijlstra, Christian Borntraeger, akpm,
linux-kernel, viro, schwidefsky, rientjes, riel, pbonzini, mingo,
mgorman, kirill.shutemov, heiko.carstens, hannes, ebiederm,
aarcange
On 02/26, Gerald Schaefer wrote:
>
> On Wed, 26 Feb 2014 19:06:03 +0100
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > It would be nice to also change thp_split_mm() to not not play with
> > mm->def_flags, but I am not sure if we can do this.
>
> Hmm, I'm also wondering about this. Basically, we only need VM_NOHUGEPAGE
> in vma->vm_flags, which is done for all existing vmas in thp_split_mm().
> But if there should be new vmas created afterwards, it would still be
> necessary to also have VM_NOHUGEPAGE in mm->def_flags, because the
> vm_flags for new vmas will be set via OR of mm->def_flags, e.g. in
> do_brk() and do_mmap_pgoff().
Yes, exactly, this was my concern.
And while I know nothing about s390, it seems to me that huge pages should
be forbidden for any vma if ->has_pgste was set.
> Anyway, this would then have to be a separate patch, to keep the
> "revertability" of this hack.
Agreed. Thanks!
Oleg.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree
2014-02-27 0:01 ` Alex Thorlton
@ 2014-02-27 17:26 ` Alex Thorlton
0 siblings, 0 replies; 18+ messages in thread
From: Alex Thorlton @ 2014-02-27 17:26 UTC (permalink / raw)
To: Andrew Morton
Cc: Christian Borntraeger, Oleg Nesterov, Peter Zijlstra,
linux-kernel, viro, schwidefsky, rientjes, riel, pbonzini, mingo,
mgorman, kirill.shutemov, heiko.carstens, hannes, gerald.schaefer,
ebiederm, aarcange
On Wed, Feb 26, 2014 at 06:01:53PM -0600, Alex Thorlton wrote:
> On Wed, Feb 26, 2014 at 03:24:34PM -0800, Andrew Morton wrote:
> > On Wed, 26 Feb 2014 13:39:17 -0600 Alex Thorlton <athorlton@sgi.com> wrote:
> >
> > > >
> > > >
> > > > > I'd suggest the patch below on top of your changes, but I won't argue.
> > >
> > > I like this approach, and your updated comment as well. This should
> > > probably go in as [PATCH 2/4] in my series. Do I need to spin a v5
> > > with this new patch included in the series, or will Andrew pull this in?
> >
> > Paolo's comments on Oleg's patch remain unaddressed, so please take a
> > look at that and send out the final version?
>
> Got it. I'll get that to you tonight/tomorrow morning.
Just kicked out another version of the patch that should cover all
comments/suggestions. Let me know if you need anything else from me!
- Alex
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2014-02-27 17:26 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <530d2ce9.eikv0ULecNwxF4I5%akpm@linux-foundation.org>
[not found] ` <530D9F50.1080400@de.ibm.com>
2014-02-26 14:50 ` + mm-revert-thp-make-madv_hugepage-check-for-mm-def_flags.patch added to -mm tree Oleg Nesterov
2014-02-26 15:06 ` Christian Borntraeger
2014-02-26 15:22 ` Kirill A. Shutemov
2014-02-26 15:31 ` Oleg Nesterov
2014-02-26 16:55 ` Gerald Schaefer
2014-02-26 16:57 ` Peter Zijlstra
2014-02-26 17:22 ` Alex Thorlton
2014-02-26 18:06 ` Oleg Nesterov
2014-02-26 19:05 ` Gerald Schaefer
2014-02-27 16:45 ` Oleg Nesterov
2014-02-26 19:27 ` Christian Borntraeger
2014-02-26 19:39 ` Alex Thorlton
2014-02-26 23:24 ` Andrew Morton
2014-02-27 0:01 ` Alex Thorlton
2014-02-27 17:26 ` Alex Thorlton
2014-02-26 20:41 ` Paolo Bonzini
2014-02-27 16:34 ` Oleg Nesterov
2014-02-26 18:08 ` Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox