* [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
[not found] <20220805110329.80540-1-david@redhat.com>
@ 2022-08-05 11:03 ` David Hildenbrand
2022-08-05 18:14 ` Peter Xu
0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2022-08-05 11:03 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, David Hildenbrand, Andrew Morton, Mike Kravetz,
Muchun Song, Peter Xu, Peter Feiner, Kirill A . Shutemov, stable
Staring at hugetlb_wp(), one might wonder where all the logic for shared
mappings is when stumbling over a write-protected page in a shared
mapping. In fact, there is none, and so far we thought we could get
away with that because e.g., mprotect() should always do the right thing
and map all pages directly writable.
Looks like we were wrong:
--------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>
#define HUGETLB_SIZE (2 * 1024 * 1024u)
static void clear_softdirty(void)
{
int fd = open("/proc/self/clear_refs", O_WRONLY);
const char *ctrl = "4";
int ret;
if (fd < 0) {
fprintf(stderr, "open(clear_refs) failed\n");
exit(1);
}
ret = write(fd, ctrl, strlen(ctrl));
if (ret != strlen(ctrl)) {
fprintf(stderr, "write(clear_refs) failed\n");
exit(1);
}
close(fd);
}
int main(int argc, char **argv)
{
char *map;
int fd;
fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT);
if (!fd) {
fprintf(stderr, "open() failed\n");
return -errno;
}
if (ftruncate(fd, HUGETLB_SIZE)) {
fprintf(stderr, "ftruncate() failed\n");
return -errno;
}
map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
if (map == MAP_FAILED) {
fprintf(stderr, "mmap() failed\n");
return -errno;
}
*map = 0;
if (mprotect(map, HUGETLB_SIZE, PROT_READ)) {
fprintf(stderr, "mmprotect() failed\n");
return -errno;
}
clear_softdirty();
if (mprotect(map, HUGETLB_SIZE, PROT_READ|PROT_WRITE)) {
fprintf(stderr, "mmprotect() failed\n");
return -errno;
}
*map = 0;
return 0;
}
--------------------------------------------------------------------------
Above test fails with SIGBUS when there is only a single free hugetlb page.
# echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
# ./test
Bus error (core dumped)
And worse, with sufficient free hugetlb pages it will map an anonymous page
into a shared mapping, for example, messing up accounting during unmap
and breaking MAP_SHARED semantics:
# echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
# ./test
# cat /proc/meminfo | grep HugePages_
HugePages_Total: 2
HugePages_Free: 1
HugePages_Rsvd: 18446744073709551615
HugePages_Surp: 0
Reason in this particular case is that vma_wants_writenotify() will
return "true", removing VM_SHARED in vma_set_page_prot() to map pages
write-protected. Let's teach vma_wants_writenotify() that hugetlb does not
support write-notify, including softdirty tracking.
Fixes: 64e455079e1b ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/mmap.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/mm/mmap.c b/mm/mmap.c
index 61e6135c54ef..462a6b0344ac 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
return 0;
+ /*
+ * Hugetlb does not require/support writenotify; especially, it does not
+ * support softdirty tracking.
+ */
+ if (is_vm_hugetlb_page(vma))
+ return 0;
+
/* The backer wishes to know when pages are first written to? */
if (vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite))
return 1;
--
2.35.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 11:03 ` [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify David Hildenbrand
@ 2022-08-05 18:14 ` Peter Xu
2022-08-05 18:22 ` David Hildenbrand
2022-08-05 18:23 ` Mike Kravetz
0 siblings, 2 replies; 14+ messages in thread
From: Peter Xu @ 2022-08-05 18:14 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-kernel, linux-mm, Andrew Morton, Mike Kravetz, Muchun Song,
Peter Feiner, Kirill A . Shutemov, stable
On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 61e6135c54ef..462a6b0344ac 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
> return 0;
>
> + /*
> + * Hugetlb does not require/support writenotify; especially, it does not
> + * support softdirty tracking.
> + */
> + if (is_vm_hugetlb_page(vma))
> + return 0;
I'm kind of confused here.. you seems to be fixing up soft-dirty for
hugetlb but here it's explicitly forbidden.
Could you explain a bit more on why this patch is needed if (assume
there'll be a working) patch 2 being provided?
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 18:14 ` Peter Xu
@ 2022-08-05 18:22 ` David Hildenbrand
2022-08-05 18:23 ` Mike Kravetz
1 sibling, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2022-08-05 18:22 UTC (permalink / raw)
To: Peter Xu
Cc: linux-kernel, linux-mm, Andrew Morton, Mike Kravetz, Muchun Song,
Peter Feiner, Kirill A . Shutemov, stable
On 05.08.22 20:14, Peter Xu wrote:
> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 61e6135c54ef..462a6b0344ac 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
>> return 0;
>>
>> + /*
>> + * Hugetlb does not require/support writenotify; especially, it does not
>> + * support softdirty tracking.
>> + */
>> + if (is_vm_hugetlb_page(vma))
>> + return 0;
>
> I'm kind of confused here.. you seems to be fixing up soft-dirty for
> hugetlb but here it's explicitly forbidden.
>
> Could you explain a bit more on why this patch is needed if (assume
> there'll be a working) patch 2 being provided?
I want something simple to backport. And even with patch #2 in place, as
long as we don't support softdirty tracking, it doesn't make sense to
enable it here as of now.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 18:14 ` Peter Xu
2022-08-05 18:22 ` David Hildenbrand
@ 2022-08-05 18:23 ` Mike Kravetz
2022-08-05 18:25 ` David Hildenbrand
1 sibling, 1 reply; 14+ messages in thread
From: Mike Kravetz @ 2022-08-05 18:23 UTC (permalink / raw)
To: Peter Xu
Cc: David Hildenbrand, linux-kernel, linux-mm, Andrew Morton,
Muchun Song, Peter Feiner, Kirill A . Shutemov, stable
On 08/05/22 14:14, Peter Xu wrote:
> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 61e6135c54ef..462a6b0344ac 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
> > if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
> > return 0;
> >
> > + /*
> > + * Hugetlb does not require/support writenotify; especially, it does not
> > + * support softdirty tracking.
> > + */
> > + if (is_vm_hugetlb_page(vma))
> > + return 0;
>
> I'm kind of confused here.. you seems to be fixing up soft-dirty for
> hugetlb but here it's explicitly forbidden.
>
> Could you explain a bit more on why this patch is needed if (assume
> there'll be a working) patch 2 being provided?
>
No comments on the patch, but ...
Since it required little thought, I ran the test program on next-20220802 and
was surprised that the issue did not recreate. Even added a simple printk
to make sure we were getting into vma_wants_writenotify with a hugetlb vma.
We were.
I can easily recreate with an older Fedora kernel.
Will be trying to understand why this is the case.
--
Mike Kravetz
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 18:23 ` Mike Kravetz
@ 2022-08-05 18:25 ` David Hildenbrand
2022-08-05 18:33 ` Mike Kravetz
0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2022-08-05 18:25 UTC (permalink / raw)
To: Mike Kravetz, Peter Xu
Cc: linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner,
Kirill A . Shutemov, stable
On 05.08.22 20:23, Mike Kravetz wrote:
> On 08/05/22 14:14, Peter Xu wrote:
>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
>>> diff --git a/mm/mmap.c b/mm/mmap.c
>>> index 61e6135c54ef..462a6b0344ac 100644
>>> --- a/mm/mmap.c
>>> +++ b/mm/mmap.c
>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
>>> return 0;
>>>
>>> + /*
>>> + * Hugetlb does not require/support writenotify; especially, it does not
>>> + * support softdirty tracking.
>>> + */
>>> + if (is_vm_hugetlb_page(vma))
>>> + return 0;
>>
>> I'm kind of confused here.. you seems to be fixing up soft-dirty for
>> hugetlb but here it's explicitly forbidden.
>>
>> Could you explain a bit more on why this patch is needed if (assume
>> there'll be a working) patch 2 being provided?
>>
>
> No comments on the patch, but ...
>
> Since it required little thought, I ran the test program on next-20220802 and
> was surprised that the issue did not recreate. Even added a simple printk
> to make sure we were getting into vma_wants_writenotify with a hugetlb vma.
> We were.
... does your config have CONFIG_MEM_SOFT_DIRTY enabled?
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 18:25 ` David Hildenbrand
@ 2022-08-05 18:33 ` Mike Kravetz
2022-08-05 18:57 ` David Hildenbrand
0 siblings, 1 reply; 14+ messages in thread
From: Mike Kravetz @ 2022-08-05 18:33 UTC (permalink / raw)
To: David Hildenbrand
Cc: Peter Xu, linux-kernel, linux-mm, Andrew Morton, Muchun Song,
Peter Feiner, Kirill A . Shutemov, stable
On 08/05/22 20:25, David Hildenbrand wrote:
> On 05.08.22 20:23, Mike Kravetz wrote:
> > On 08/05/22 14:14, Peter Xu wrote:
> >> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
> >>> diff --git a/mm/mmap.c b/mm/mmap.c
> >>> index 61e6135c54ef..462a6b0344ac 100644
> >>> --- a/mm/mmap.c
> >>> +++ b/mm/mmap.c
> >>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
> >>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
> >>> return 0;
> >>>
> >>> + /*
> >>> + * Hugetlb does not require/support writenotify; especially, it does not
> >>> + * support softdirty tracking.
> >>> + */
> >>> + if (is_vm_hugetlb_page(vma))
> >>> + return 0;
> >>
> >> I'm kind of confused here.. you seems to be fixing up soft-dirty for
> >> hugetlb but here it's explicitly forbidden.
> >>
> >> Could you explain a bit more on why this patch is needed if (assume
> >> there'll be a working) patch 2 being provided?
> >>
> >
> > No comments on the patch, but ...
> >
> > Since it required little thought, I ran the test program on next-20220802 and
> > was surprised that the issue did not recreate. Even added a simple printk
> > to make sure we were getting into vma_wants_writenotify with a hugetlb vma.
> > We were.
>
>
> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled?
>
No, Duh!
FYI - Some time back, I started looking at adding soft dirty support for
hugetlb mappings. I did not finish that work. But, I seem to recall
places where code was operating on hugetlb mappings when perhaps it should
not.
Perhaps, it would also be good to just disable soft dirty for hugetlb at
the source?
--
Mike Kravetz
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 18:33 ` Mike Kravetz
@ 2022-08-05 18:57 ` David Hildenbrand
2022-08-05 20:48 ` Mike Kravetz
0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2022-08-05 18:57 UTC (permalink / raw)
To: Mike Kravetz
Cc: Peter Xu, linux-kernel, linux-mm, Andrew Morton, Muchun Song,
Peter Feiner, Kirill A . Shutemov, stable
On 05.08.22 20:33, Mike Kravetz wrote:
> On 08/05/22 20:25, David Hildenbrand wrote:
>> On 05.08.22 20:23, Mike Kravetz wrote:
>>> On 08/05/22 14:14, Peter Xu wrote:
>>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
>>>>> diff --git a/mm/mmap.c b/mm/mmap.c
>>>>> index 61e6135c54ef..462a6b0344ac 100644
>>>>> --- a/mm/mmap.c
>>>>> +++ b/mm/mmap.c
>>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
>>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
>>>>> return 0;
>>>>>
>>>>> + /*
>>>>> + * Hugetlb does not require/support writenotify; especially, it does not
>>>>> + * support softdirty tracking.
>>>>> + */
>>>>> + if (is_vm_hugetlb_page(vma))
>>>>> + return 0;
>>>>
>>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for
>>>> hugetlb but here it's explicitly forbidden.
>>>>
>>>> Could you explain a bit more on why this patch is needed if (assume
>>>> there'll be a working) patch 2 being provided?
>>>>
>>>
>>> No comments on the patch, but ...
>>>
>>> Since it required little thought, I ran the test program on next-20220802 and
>>> was surprised that the issue did not recreate. Even added a simple printk
>>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma.
>>> We were.
>>
>>
>> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled?
>>
>
> No, Duh!
>
> FYI - Some time back, I started looking at adding soft dirty support for
> hugetlb mappings. I did not finish that work. But, I seem to recall
> places where code was operating on hugetlb mappings when perhaps it should
> not.
>
> Perhaps, it would also be good to just disable soft dirty for hugetlb at
> the source?
I thought about that as well. But I came to the conclusion that without
patch #2, hugetlb VMAs cannot possibly support write-notify, so there is
no need to bother in vma_wants_writenotify() at all.
The "root" would be places where we clear VM_SOFTDIRTY. That should only
be fs/proc/task_mmu.c:clear_refs_write() IIRC.
So I don't particularly care, I consider this patch a bit cleaner and
more generic, but I can adjust clear_refs_write() instead of there is a
preference.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 18:57 ` David Hildenbrand
@ 2022-08-05 20:48 ` Mike Kravetz
2022-08-05 23:13 ` Peter Xu
0 siblings, 1 reply; 14+ messages in thread
From: Mike Kravetz @ 2022-08-05 20:48 UTC (permalink / raw)
To: David Hildenbrand
Cc: Peter Xu, linux-kernel, linux-mm, Andrew Morton, Muchun Song,
Peter Feiner, Kirill A . Shutemov, stable
On 08/05/22 20:57, David Hildenbrand wrote:
> On 05.08.22 20:33, Mike Kravetz wrote:
> > On 08/05/22 20:25, David Hildenbrand wrote:
> >> On 05.08.22 20:23, Mike Kravetz wrote:
> >>> On 08/05/22 14:14, Peter Xu wrote:
> >>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
> >>>>> diff --git a/mm/mmap.c b/mm/mmap.c
> >>>>> index 61e6135c54ef..462a6b0344ac 100644
> >>>>> --- a/mm/mmap.c
> >>>>> +++ b/mm/mmap.c
> >>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
> >>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
> >>>>> return 0;
> >>>>>
> >>>>> + /*
> >>>>> + * Hugetlb does not require/support writenotify; especially, it does not
> >>>>> + * support softdirty tracking.
> >>>>> + */
> >>>>> + if (is_vm_hugetlb_page(vma))
> >>>>> + return 0;
> >>>>
> >>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for
> >>>> hugetlb but here it's explicitly forbidden.
> >>>>
> >>>> Could you explain a bit more on why this patch is needed if (assume
> >>>> there'll be a working) patch 2 being provided?
> >>>>
> >>>
> >>> No comments on the patch, but ...
> >>>
> >>> Since it required little thought, I ran the test program on next-20220802 and
> >>> was surprised that the issue did not recreate. Even added a simple printk
> >>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma.
> >>> We were.
> >>
> >>
> >> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled?
> >>
> >
> > No, Duh!
> >
> > FYI - Some time back, I started looking at adding soft dirty support for
> > hugetlb mappings. I did not finish that work. But, I seem to recall
> > places where code was operating on hugetlb mappings when perhaps it should
> > not.
> >
> > Perhaps, it would also be good to just disable soft dirty for hugetlb at
> > the source?
>
> I thought about that as well. But I came to the conclusion that without
> patch #2, hugetlb VMAs cannot possibly support write-notify, so there is
> no need to bother in vma_wants_writenotify() at all.
>
> The "root" would be places where we clear VM_SOFTDIRTY. That should only
> be fs/proc/task_mmu.c:clear_refs_write() IIRC.
>
> So I don't particularly care, I consider this patch a bit cleaner and
> more generic, but I can adjust clear_refs_write() instead of there is a
> preference.
>
After a closer look, I agree that this may be the simplest/cleanest way to
proceed. I was going to suggest that you note hugetlb does not support
softdirty, but see you did in the comment.
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
--
Mike Kravetz
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 20:48 ` Mike Kravetz
@ 2022-08-05 23:13 ` Peter Xu
2022-08-05 23:33 ` Mike Kravetz
2022-08-08 16:36 ` David Hildenbrand
0 siblings, 2 replies; 14+ messages in thread
From: Peter Xu @ 2022-08-05 23:13 UTC (permalink / raw)
To: Mike Kravetz
Cc: David Hildenbrand, linux-kernel, linux-mm, Andrew Morton,
Muchun Song, Peter Feiner, Kirill A . Shutemov, stable
On Fri, Aug 05, 2022 at 01:48:35PM -0700, Mike Kravetz wrote:
> On 08/05/22 20:57, David Hildenbrand wrote:
> > On 05.08.22 20:33, Mike Kravetz wrote:
> > > On 08/05/22 20:25, David Hildenbrand wrote:
> > >> On 05.08.22 20:23, Mike Kravetz wrote:
> > >>> On 08/05/22 14:14, Peter Xu wrote:
> > >>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
> > >>>>> diff --git a/mm/mmap.c b/mm/mmap.c
> > >>>>> index 61e6135c54ef..462a6b0344ac 100644
> > >>>>> --- a/mm/mmap.c
> > >>>>> +++ b/mm/mmap.c
> > >>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
> > >>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
> > >>>>> return 0;
> > >>>>>
> > >>>>> + /*
> > >>>>> + * Hugetlb does not require/support writenotify; especially, it does not
> > >>>>> + * support softdirty tracking.
> > >>>>> + */
> > >>>>> + if (is_vm_hugetlb_page(vma))
> > >>>>> + return 0;
> > >>>>
> > >>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for
> > >>>> hugetlb but here it's explicitly forbidden.
> > >>>>
> > >>>> Could you explain a bit more on why this patch is needed if (assume
> > >>>> there'll be a working) patch 2 being provided?
> > >>>>
> > >>>
> > >>> No comments on the patch, but ...
> > >>>
> > >>> Since it required little thought, I ran the test program on next-20220802 and
> > >>> was surprised that the issue did not recreate. Even added a simple printk
> > >>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma.
> > >>> We were.
> > >>
> > >>
> > >> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled?
> > >>
> > >
> > > No, Duh!
> > >
> > > FYI - Some time back, I started looking at adding soft dirty support for
> > > hugetlb mappings. I did not finish that work. But, I seem to recall
> > > places where code was operating on hugetlb mappings when perhaps it should
> > > not.
> > >
> > > Perhaps, it would also be good to just disable soft dirty for hugetlb at
> > > the source?
> >
> > I thought about that as well. But I came to the conclusion that without
> > patch #2, hugetlb VMAs cannot possibly support write-notify, so there is
> > no need to bother in vma_wants_writenotify() at all.
> >
> > The "root" would be places where we clear VM_SOFTDIRTY. That should only
> > be fs/proc/task_mmu.c:clear_refs_write() IIRC.
> >
> > So I don't particularly care, I consider this patch a bit cleaner and
> > more generic, but I can adjust clear_refs_write() instead of there is a
> > preference.
> >
>
> After a closer look, I agree that this may be the simplest/cleanest way to
> proceed. I was going to suggest that you note hugetlb does not support
> softdirty, but see you did in the comment.
>
> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Filtering out hugetlbfs in vma_wants_writenotify() is still a bit hard to
follow to me, since it's not clear why hugetlbfs never wants writenotify.
If it's only about soft-dirty, we could have added the hugetlbfs check into
vma_soft_dirty_enabled(), then I think it'll achieve the same thing and
much clearer - with the soft-dirty check constantly returning false for it,
hugetlbfs shared vmas should have vma_wants_writenotify() naturally return
0 already.
For the long term - shouldn't we just enable soft-dirty for hugetlbfs? I
remember Mike used to have that in todo. Since we've got patch 2 already,
I feel like that's really much close (is the only missing piece the clear
refs write part? or maybe some more that I didn't notice).
Then patch 1 (or IMHO equivalant check in vma_soft_dirty_enabled(), but
maybe in stable trees we don't have vma_soft_dirty_enabled then it's
exactly patch 1) can be a stable-only backport just to avoid the bug from
triggering.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 23:13 ` Peter Xu
@ 2022-08-05 23:33 ` Mike Kravetz
2022-08-08 16:10 ` Peter Xu
2022-08-08 16:36 ` David Hildenbrand
1 sibling, 1 reply; 14+ messages in thread
From: Mike Kravetz @ 2022-08-05 23:33 UTC (permalink / raw)
To: Peter Xu
Cc: David Hildenbrand, linux-kernel, linux-mm, Andrew Morton,
Muchun Song, Peter Feiner, Kirill A . Shutemov, stable
On 08/05/22 19:13, Peter Xu wrote:
> On Fri, Aug 05, 2022 at 01:48:35PM -0700, Mike Kravetz wrote:
> > On 08/05/22 20:57, David Hildenbrand wrote:
> > > On 05.08.22 20:33, Mike Kravetz wrote:
> > > > On 08/05/22 20:25, David Hildenbrand wrote:
> > > >> On 05.08.22 20:23, Mike Kravetz wrote:
> > > >>> On 08/05/22 14:14, Peter Xu wrote:
> > > >>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
> > > >>>>> diff --git a/mm/mmap.c b/mm/mmap.c
> > > >>>>> index 61e6135c54ef..462a6b0344ac 100644
> > > >>>>> --- a/mm/mmap.c
> > > >>>>> +++ b/mm/mmap.c
> > > >>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
> > > >>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
> > > >>>>> return 0;
> > > >>>>>
> > > >>>>> + /*
> > > >>>>> + * Hugetlb does not require/support writenotify; especially, it does not
> > > >>>>> + * support softdirty tracking.
> > > >>>>> + */
> > > >>>>> + if (is_vm_hugetlb_page(vma))
> > > >>>>> + return 0;
> > > >>>>
> > > >>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for
> > > >>>> hugetlb but here it's explicitly forbidden.
> > > >>>>
> > > >>>> Could you explain a bit more on why this patch is needed if (assume
> > > >>>> there'll be a working) patch 2 being provided?
> > > >>>>
> > > >>>
> > > >>> No comments on the patch, but ...
> > > >>>
> > > >>> Since it required little thought, I ran the test program on next-20220802 and
> > > >>> was surprised that the issue did not recreate. Even added a simple printk
> > > >>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma.
> > > >>> We were.
> > > >>
> > > >>
> > > >> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled?
> > > >>
> > > >
> > > > No, Duh!
> > > >
> > > > FYI - Some time back, I started looking at adding soft dirty support for
> > > > hugetlb mappings. I did not finish that work. But, I seem to recall
> > > > places where code was operating on hugetlb mappings when perhaps it should
> > > > not.
> > > >
> > > > Perhaps, it would also be good to just disable soft dirty for hugetlb at
> > > > the source?
> > >
> > > I thought about that as well. But I came to the conclusion that without
> > > patch #2, hugetlb VMAs cannot possibly support write-notify, so there is
> > > no need to bother in vma_wants_writenotify() at all.
> > >
> > > The "root" would be places where we clear VM_SOFTDIRTY. That should only
> > > be fs/proc/task_mmu.c:clear_refs_write() IIRC.
> > >
> > > So I don't particularly care, I consider this patch a bit cleaner and
> > > more generic, but I can adjust clear_refs_write() instead of there is a
> > > preference.
> > >
> >
> > After a closer look, I agree that this may be the simplest/cleanest way to
> > proceed. I was going to suggest that you note hugetlb does not support
> > softdirty, but see you did in the comment.
> >
> > Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
>
> Filtering out hugetlbfs in vma_wants_writenotify() is still a bit hard to
> follow to me, since it's not clear why hugetlbfs never wants writenotify.
>
> If it's only about soft-dirty, we could have added the hugetlbfs check into
> vma_soft_dirty_enabled(), then I think it'll achieve the same thing and
> much clearer - with the soft-dirty check constantly returning false for it,
> hugetlbfs shared vmas should have vma_wants_writenotify() naturally return
> 0 already.
>
> For the long term - shouldn't we just enable soft-dirty for hugetlbfs? I
> remember Mike used to have that in todo. Since we've got patch 2 already,
> I feel like that's really much close (is the only missing piece the clear
> refs write part? or maybe some more that I didn't notice).
>
> Then patch 1 (or IMHO equivalant check in vma_soft_dirty_enabled(), but
> maybe in stable trees we don't have vma_soft_dirty_enabled then it's
> exactly patch 1) can be a stable-only backport just to avoid the bug from
> triggering.
It looks like vma_soft_dirty_enabled is recent and not in any stable
trees (or even 5.19).
Yes, I did start working on hugetlb softdirty support in the past.
https://lore.kernel.org/lkml/20210211000322.159437-1-mike.kravetz@oracle.com/
Unfortunately, it got preempted by other things. I will try to move it up
the priority list.
--
Mike Kravetz
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 23:33 ` Mike Kravetz
@ 2022-08-08 16:10 ` Peter Xu
0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2022-08-08 16:10 UTC (permalink / raw)
To: Mike Kravetz
Cc: David Hildenbrand, linux-kernel, linux-mm, Andrew Morton,
Muchun Song, Peter Feiner, Kirill A . Shutemov, stable
On Fri, Aug 05, 2022 at 04:33:56PM -0700, Mike Kravetz wrote:
> It looks like vma_soft_dirty_enabled is recent and not in any stable
> trees (or even 5.19).
>
> Yes, I did start working on hugetlb softdirty support in the past.
> https://lore.kernel.org/lkml/20210211000322.159437-1-mike.kravetz@oracle.com/
>
> Unfortunately, it got preempted by other things. I will try to move it up
> the priority list.
Thanks, Mike.
It'll also makes sense to forbid it if it may take time to finish, so we
don't need to push ourselves.
--
Peter Xu
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-05 23:13 ` Peter Xu
2022-08-05 23:33 ` Mike Kravetz
@ 2022-08-08 16:36 ` David Hildenbrand
2022-08-08 19:28 ` Peter Xu
1 sibling, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2022-08-08 16:36 UTC (permalink / raw)
To: Peter Xu, Mike Kravetz
Cc: linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner,
Kirill A . Shutemov, stable
On 06.08.22 01:13, Peter Xu wrote:
> On Fri, Aug 05, 2022 at 01:48:35PM -0700, Mike Kravetz wrote:
>> On 08/05/22 20:57, David Hildenbrand wrote:
>>> On 05.08.22 20:33, Mike Kravetz wrote:
>>>> On 08/05/22 20:25, David Hildenbrand wrote:
>>>>> On 05.08.22 20:23, Mike Kravetz wrote:
>>>>>> On 08/05/22 14:14, Peter Xu wrote:
>>>>>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote:
>>>>>>>> diff --git a/mm/mmap.c b/mm/mmap.c
>>>>>>>> index 61e6135c54ef..462a6b0344ac 100644
>>>>>>>> --- a/mm/mmap.c
>>>>>>>> +++ b/mm/mmap.c
>>>>>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
>>>>>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
>>>>>>>> return 0;
>>>>>>>>
>>>>>>>> + /*
>>>>>>>> + * Hugetlb does not require/support writenotify; especially, it does not
>>>>>>>> + * support softdirty tracking.
>>>>>>>> + */
>>>>>>>> + if (is_vm_hugetlb_page(vma))
>>>>>>>> + return 0;
>>>>>>>
>>>>>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for
>>>>>>> hugetlb but here it's explicitly forbidden.
>>>>>>>
>>>>>>> Could you explain a bit more on why this patch is needed if (assume
>>>>>>> there'll be a working) patch 2 being provided?
>>>>>>>
>>>>>>
>>>>>> No comments on the patch, but ...
>>>>>>
>>>>>> Since it required little thought, I ran the test program on next-20220802 and
>>>>>> was surprised that the issue did not recreate. Even added a simple printk
>>>>>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma.
>>>>>> We were.
>>>>>
>>>>>
>>>>> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled?
>>>>>
>>>>
>>>> No, Duh!
>>>>
>>>> FYI - Some time back, I started looking at adding soft dirty support for
>>>> hugetlb mappings. I did not finish that work. But, I seem to recall
>>>> places where code was operating on hugetlb mappings when perhaps it should
>>>> not.
>>>>
>>>> Perhaps, it would also be good to just disable soft dirty for hugetlb at
>>>> the source?
>>>
>>> I thought about that as well. But I came to the conclusion that without
>>> patch #2, hugetlb VMAs cannot possibly support write-notify, so there is
>>> no need to bother in vma_wants_writenotify() at all.
>>>
>>> The "root" would be places where we clear VM_SOFTDIRTY. That should only
>>> be fs/proc/task_mmu.c:clear_refs_write() IIRC.
>>>
>>> So I don't particularly care, I consider this patch a bit cleaner and
>>> more generic, but I can adjust clear_refs_write() instead of there is a
>>> preference.
>>>
>>
>> After a closer look, I agree that this may be the simplest/cleanest way to
>> proceed. I was going to suggest that you note hugetlb does not support
>> softdirty, but see you did in the comment.
>>
>> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
>
> Filtering out hugetlbfs in vma_wants_writenotify() is still a bit hard to
> follow to me, since it's not clear why hugetlbfs never wants writenotify.
Well, because the write-fault handler as is cannot deal with
write-faults in shared mappings. It cannot possibly work or ever have
worked.
>
> If it's only about soft-dirty, we could have added the hugetlbfs check into
> vma_soft_dirty_enabled(), then I think it'll achieve the same thing and
> much clearer - with the soft-dirty check constantly returning false for it,
> hugetlbfs shared vmas should have vma_wants_writenotify() naturally return
> 0 already.
I considered that an option, but went with this approach for simplicity
and because I don't see a need to check for hugetlb in other code paths
(especially, in the !hugetlb mprotect variant).
I mean, with patch #2 we can remove it anytime we do support softdirty
tracking -- or if there is need for write-notify we can move it into the
softdirty check only.
>
> For the long term - shouldn't we just enable soft-dirty for hugetlbfs? I
> remember Mike used to have that in todo. Since we've got patch 2 already,
> I feel like that's really much close (is the only missing piece the clear
> refs write part? or maybe some more that I didn't notice).
My gut feeling is that there isn't too much missing to have it working.
Define a PTE bit, implement hugetlb variant of clearing, and set it when
setting the PTE dirty.
Thanks!
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-08 16:36 ` David Hildenbrand
@ 2022-08-08 19:28 ` Peter Xu
2022-08-10 9:29 ` David Hildenbrand
0 siblings, 1 reply; 14+ messages in thread
From: Peter Xu @ 2022-08-08 19:28 UTC (permalink / raw)
To: David Hildenbrand
Cc: Mike Kravetz, linux-kernel, linux-mm, Andrew Morton, Muchun Song,
Peter Feiner, Kirill A . Shutemov, stable
On Mon, Aug 08, 2022 at 06:36:58PM +0200, David Hildenbrand wrote:
> Well, because the write-fault handler as is cannot deal with
> write-faults in shared mappings. It cannot possibly work or ever have
> worked.
Trivially - maybe drop the word "require" in "Hugetlb does not
require/support writenotify"?
--
Peter Xu
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify
2022-08-08 19:28 ` Peter Xu
@ 2022-08-10 9:29 ` David Hildenbrand
0 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2022-08-10 9:29 UTC (permalink / raw)
To: Peter Xu
Cc: Mike Kravetz, linux-kernel, linux-mm, Andrew Morton, Muchun Song,
Peter Feiner, Kirill A . Shutemov, stable
On 08.08.22 21:28, Peter Xu wrote:
> On Mon, Aug 08, 2022 at 06:36:58PM +0200, David Hildenbrand wrote:
>> Well, because the write-fault handler as is cannot deal with
>> write-faults in shared mappings. It cannot possibly work or ever have
>> worked.
>
> Trivially - maybe drop the word "require" in "Hugetlb does not
> require/support writenotify"?
>
Sure, can do.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2022-08-10 9:29 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20220805110329.80540-1-david@redhat.com>
2022-08-05 11:03 ` [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify David Hildenbrand
2022-08-05 18:14 ` Peter Xu
2022-08-05 18:22 ` David Hildenbrand
2022-08-05 18:23 ` Mike Kravetz
2022-08-05 18:25 ` David Hildenbrand
2022-08-05 18:33 ` Mike Kravetz
2022-08-05 18:57 ` David Hildenbrand
2022-08-05 20:48 ` Mike Kravetz
2022-08-05 23:13 ` Peter Xu
2022-08-05 23:33 ` Mike Kravetz
2022-08-08 16:10 ` Peter Xu
2022-08-08 16:36 ` David Hildenbrand
2022-08-08 19:28 ` Peter Xu
2022-08-10 9:29 ` David Hildenbrand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).