* [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify [not found] <20220805110329.80540-1-david@redhat.com> @ 2022-08-05 11:03 ` David Hildenbrand 2022-08-05 18:14 ` Peter Xu 0 siblings, 1 reply; 14+ messages in thread From: David Hildenbrand @ 2022-08-05 11:03 UTC (permalink / raw) To: linux-kernel Cc: linux-mm, David Hildenbrand, Andrew Morton, Mike Kravetz, Muchun Song, Peter Xu, Peter Feiner, Kirill A . Shutemov, stable Staring at hugetlb_wp(), one might wonder where all the logic for shared mappings is when stumbling over a write-protected page in a shared mapping. In fact, there is none, and so far we thought we could get away with that because e.g., mprotect() should always do the right thing and map all pages directly writable. Looks like we were wrong: -------------------------------------------------------------------------- #include <stdio.h> #include <stdlib.h> #include <string.h> #include <fcntl.h> #include <unistd.h> #include <errno.h> #include <sys/mman.h> #define HUGETLB_SIZE (2 * 1024 * 1024u) static void clear_softdirty(void) { int fd = open("/proc/self/clear_refs", O_WRONLY); const char *ctrl = "4"; int ret; if (fd < 0) { fprintf(stderr, "open(clear_refs) failed\n"); exit(1); } ret = write(fd, ctrl, strlen(ctrl)); if (ret != strlen(ctrl)) { fprintf(stderr, "write(clear_refs) failed\n"); exit(1); } close(fd); } int main(int argc, char **argv) { char *map; int fd; fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT); if (!fd) { fprintf(stderr, "open() failed\n"); return -errno; } if (ftruncate(fd, HUGETLB_SIZE)) { fprintf(stderr, "ftruncate() failed\n"); return -errno; } map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); if (map == MAP_FAILED) { fprintf(stderr, "mmap() failed\n"); return -errno; } *map = 0; if (mprotect(map, HUGETLB_SIZE, PROT_READ)) { fprintf(stderr, "mmprotect() failed\n"); return -errno; } clear_softdirty(); if (mprotect(map, HUGETLB_SIZE, PROT_READ|PROT_WRITE)) { fprintf(stderr, "mmprotect() failed\n"); return -errno; } *map = 0; return 0; } -------------------------------------------------------------------------- Above test fails with SIGBUS when there is only a single free hugetlb page. # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages # ./test Bus error (core dumped) And worse, with sufficient free hugetlb pages it will map an anonymous page into a shared mapping, for example, messing up accounting during unmap and breaking MAP_SHARED semantics: # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages # ./test # cat /proc/meminfo | grep HugePages_ HugePages_Total: 2 HugePages_Free: 1 HugePages_Rsvd: 18446744073709551615 HugePages_Surp: 0 Reason in this particular case is that vma_wants_writenotify() will return "true", removing VM_SHARED in vma_set_page_prot() to map pages write-protected. Let's teach vma_wants_writenotify() that hugetlb does not support write-notify, including softdirty tracking. Fixes: 64e455079e1b ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared") Cc: <stable@vger.kernel.org> # v3.18+ Signed-off-by: David Hildenbrand <david@redhat.com> --- mm/mmap.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/mm/mmap.c b/mm/mmap.c index 61e6135c54ef..462a6b0344ac 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) return 0; + /* + * Hugetlb does not require/support writenotify; especially, it does not + * support softdirty tracking. + */ + if (is_vm_hugetlb_page(vma)) + return 0; + /* The backer wishes to know when pages are first written to? */ if (vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite)) return 1; -- 2.35.3 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 11:03 ` [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify David Hildenbrand @ 2022-08-05 18:14 ` Peter Xu 2022-08-05 18:22 ` David Hildenbrand 2022-08-05 18:23 ` Mike Kravetz 0 siblings, 2 replies; 14+ messages in thread From: Peter Xu @ 2022-08-05 18:14 UTC (permalink / raw) To: David Hildenbrand Cc: linux-kernel, linux-mm, Andrew Morton, Mike Kravetz, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: > diff --git a/mm/mmap.c b/mm/mmap.c > index 61e6135c54ef..462a6b0344ac 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) > if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) > return 0; > > + /* > + * Hugetlb does not require/support writenotify; especially, it does not > + * support softdirty tracking. > + */ > + if (is_vm_hugetlb_page(vma)) > + return 0; I'm kind of confused here.. you seems to be fixing up soft-dirty for hugetlb but here it's explicitly forbidden. Could you explain a bit more on why this patch is needed if (assume there'll be a working) patch 2 being provided? Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 18:14 ` Peter Xu @ 2022-08-05 18:22 ` David Hildenbrand 2022-08-05 18:23 ` Mike Kravetz 1 sibling, 0 replies; 14+ messages in thread From: David Hildenbrand @ 2022-08-05 18:22 UTC (permalink / raw) To: Peter Xu Cc: linux-kernel, linux-mm, Andrew Morton, Mike Kravetz, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On 05.08.22 20:14, Peter Xu wrote: > On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: >> diff --git a/mm/mmap.c b/mm/mmap.c >> index 61e6135c54ef..462a6b0344ac 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) >> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) >> return 0; >> >> + /* >> + * Hugetlb does not require/support writenotify; especially, it does not >> + * support softdirty tracking. >> + */ >> + if (is_vm_hugetlb_page(vma)) >> + return 0; > > I'm kind of confused here.. you seems to be fixing up soft-dirty for > hugetlb but here it's explicitly forbidden. > > Could you explain a bit more on why this patch is needed if (assume > there'll be a working) patch 2 being provided? I want something simple to backport. And even with patch #2 in place, as long as we don't support softdirty tracking, it doesn't make sense to enable it here as of now. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 18:14 ` Peter Xu 2022-08-05 18:22 ` David Hildenbrand @ 2022-08-05 18:23 ` Mike Kravetz 2022-08-05 18:25 ` David Hildenbrand 1 sibling, 1 reply; 14+ messages in thread From: Mike Kravetz @ 2022-08-05 18:23 UTC (permalink / raw) To: Peter Xu Cc: David Hildenbrand, linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On 08/05/22 14:14, Peter Xu wrote: > On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: > > diff --git a/mm/mmap.c b/mm/mmap.c > > index 61e6135c54ef..462a6b0344ac 100644 > > --- a/mm/mmap.c > > +++ b/mm/mmap.c > > @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) > > if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) > > return 0; > > > > + /* > > + * Hugetlb does not require/support writenotify; especially, it does not > > + * support softdirty tracking. > > + */ > > + if (is_vm_hugetlb_page(vma)) > > + return 0; > > I'm kind of confused here.. you seems to be fixing up soft-dirty for > hugetlb but here it's explicitly forbidden. > > Could you explain a bit more on why this patch is needed if (assume > there'll be a working) patch 2 being provided? > No comments on the patch, but ... Since it required little thought, I ran the test program on next-20220802 and was surprised that the issue did not recreate. Even added a simple printk to make sure we were getting into vma_wants_writenotify with a hugetlb vma. We were. I can easily recreate with an older Fedora kernel. Will be trying to understand why this is the case. -- Mike Kravetz ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 18:23 ` Mike Kravetz @ 2022-08-05 18:25 ` David Hildenbrand 2022-08-05 18:33 ` Mike Kravetz 0 siblings, 1 reply; 14+ messages in thread From: David Hildenbrand @ 2022-08-05 18:25 UTC (permalink / raw) To: Mike Kravetz, Peter Xu Cc: linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On 05.08.22 20:23, Mike Kravetz wrote: > On 08/05/22 14:14, Peter Xu wrote: >> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: >>> diff --git a/mm/mmap.c b/mm/mmap.c >>> index 61e6135c54ef..462a6b0344ac 100644 >>> --- a/mm/mmap.c >>> +++ b/mm/mmap.c >>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) >>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) >>> return 0; >>> >>> + /* >>> + * Hugetlb does not require/support writenotify; especially, it does not >>> + * support softdirty tracking. >>> + */ >>> + if (is_vm_hugetlb_page(vma)) >>> + return 0; >> >> I'm kind of confused here.. you seems to be fixing up soft-dirty for >> hugetlb but here it's explicitly forbidden. >> >> Could you explain a bit more on why this patch is needed if (assume >> there'll be a working) patch 2 being provided? >> > > No comments on the patch, but ... > > Since it required little thought, I ran the test program on next-20220802 and > was surprised that the issue did not recreate. Even added a simple printk > to make sure we were getting into vma_wants_writenotify with a hugetlb vma. > We were. ... does your config have CONFIG_MEM_SOFT_DIRTY enabled? -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 18:25 ` David Hildenbrand @ 2022-08-05 18:33 ` Mike Kravetz 2022-08-05 18:57 ` David Hildenbrand 0 siblings, 1 reply; 14+ messages in thread From: Mike Kravetz @ 2022-08-05 18:33 UTC (permalink / raw) To: David Hildenbrand Cc: Peter Xu, linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On 08/05/22 20:25, David Hildenbrand wrote: > On 05.08.22 20:23, Mike Kravetz wrote: > > On 08/05/22 14:14, Peter Xu wrote: > >> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: > >>> diff --git a/mm/mmap.c b/mm/mmap.c > >>> index 61e6135c54ef..462a6b0344ac 100644 > >>> --- a/mm/mmap.c > >>> +++ b/mm/mmap.c > >>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) > >>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) > >>> return 0; > >>> > >>> + /* > >>> + * Hugetlb does not require/support writenotify; especially, it does not > >>> + * support softdirty tracking. > >>> + */ > >>> + if (is_vm_hugetlb_page(vma)) > >>> + return 0; > >> > >> I'm kind of confused here.. you seems to be fixing up soft-dirty for > >> hugetlb but here it's explicitly forbidden. > >> > >> Could you explain a bit more on why this patch is needed if (assume > >> there'll be a working) patch 2 being provided? > >> > > > > No comments on the patch, but ... > > > > Since it required little thought, I ran the test program on next-20220802 and > > was surprised that the issue did not recreate. Even added a simple printk > > to make sure we were getting into vma_wants_writenotify with a hugetlb vma. > > We were. > > > ... does your config have CONFIG_MEM_SOFT_DIRTY enabled? > No, Duh! FYI - Some time back, I started looking at adding soft dirty support for hugetlb mappings. I did not finish that work. But, I seem to recall places where code was operating on hugetlb mappings when perhaps it should not. Perhaps, it would also be good to just disable soft dirty for hugetlb at the source? -- Mike Kravetz ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 18:33 ` Mike Kravetz @ 2022-08-05 18:57 ` David Hildenbrand 2022-08-05 20:48 ` Mike Kravetz 0 siblings, 1 reply; 14+ messages in thread From: David Hildenbrand @ 2022-08-05 18:57 UTC (permalink / raw) To: Mike Kravetz Cc: Peter Xu, linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On 05.08.22 20:33, Mike Kravetz wrote: > On 08/05/22 20:25, David Hildenbrand wrote: >> On 05.08.22 20:23, Mike Kravetz wrote: >>> On 08/05/22 14:14, Peter Xu wrote: >>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: >>>>> diff --git a/mm/mmap.c b/mm/mmap.c >>>>> index 61e6135c54ef..462a6b0344ac 100644 >>>>> --- a/mm/mmap.c >>>>> +++ b/mm/mmap.c >>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) >>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) >>>>> return 0; >>>>> >>>>> + /* >>>>> + * Hugetlb does not require/support writenotify; especially, it does not >>>>> + * support softdirty tracking. >>>>> + */ >>>>> + if (is_vm_hugetlb_page(vma)) >>>>> + return 0; >>>> >>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for >>>> hugetlb but here it's explicitly forbidden. >>>> >>>> Could you explain a bit more on why this patch is needed if (assume >>>> there'll be a working) patch 2 being provided? >>>> >>> >>> No comments on the patch, but ... >>> >>> Since it required little thought, I ran the test program on next-20220802 and >>> was surprised that the issue did not recreate. Even added a simple printk >>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma. >>> We were. >> >> >> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled? >> > > No, Duh! > > FYI - Some time back, I started looking at adding soft dirty support for > hugetlb mappings. I did not finish that work. But, I seem to recall > places where code was operating on hugetlb mappings when perhaps it should > not. > > Perhaps, it would also be good to just disable soft dirty for hugetlb at > the source? I thought about that as well. But I came to the conclusion that without patch #2, hugetlb VMAs cannot possibly support write-notify, so there is no need to bother in vma_wants_writenotify() at all. The "root" would be places where we clear VM_SOFTDIRTY. That should only be fs/proc/task_mmu.c:clear_refs_write() IIRC. So I don't particularly care, I consider this patch a bit cleaner and more generic, but I can adjust clear_refs_write() instead of there is a preference. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 18:57 ` David Hildenbrand @ 2022-08-05 20:48 ` Mike Kravetz 2022-08-05 23:13 ` Peter Xu 0 siblings, 1 reply; 14+ messages in thread From: Mike Kravetz @ 2022-08-05 20:48 UTC (permalink / raw) To: David Hildenbrand Cc: Peter Xu, linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On 08/05/22 20:57, David Hildenbrand wrote: > On 05.08.22 20:33, Mike Kravetz wrote: > > On 08/05/22 20:25, David Hildenbrand wrote: > >> On 05.08.22 20:23, Mike Kravetz wrote: > >>> On 08/05/22 14:14, Peter Xu wrote: > >>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: > >>>>> diff --git a/mm/mmap.c b/mm/mmap.c > >>>>> index 61e6135c54ef..462a6b0344ac 100644 > >>>>> --- a/mm/mmap.c > >>>>> +++ b/mm/mmap.c > >>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) > >>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) > >>>>> return 0; > >>>>> > >>>>> + /* > >>>>> + * Hugetlb does not require/support writenotify; especially, it does not > >>>>> + * support softdirty tracking. > >>>>> + */ > >>>>> + if (is_vm_hugetlb_page(vma)) > >>>>> + return 0; > >>>> > >>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for > >>>> hugetlb but here it's explicitly forbidden. > >>>> > >>>> Could you explain a bit more on why this patch is needed if (assume > >>>> there'll be a working) patch 2 being provided? > >>>> > >>> > >>> No comments on the patch, but ... > >>> > >>> Since it required little thought, I ran the test program on next-20220802 and > >>> was surprised that the issue did not recreate. Even added a simple printk > >>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma. > >>> We were. > >> > >> > >> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled? > >> > > > > No, Duh! > > > > FYI - Some time back, I started looking at adding soft dirty support for > > hugetlb mappings. I did not finish that work. But, I seem to recall > > places where code was operating on hugetlb mappings when perhaps it should > > not. > > > > Perhaps, it would also be good to just disable soft dirty for hugetlb at > > the source? > > I thought about that as well. But I came to the conclusion that without > patch #2, hugetlb VMAs cannot possibly support write-notify, so there is > no need to bother in vma_wants_writenotify() at all. > > The "root" would be places where we clear VM_SOFTDIRTY. That should only > be fs/proc/task_mmu.c:clear_refs_write() IIRC. > > So I don't particularly care, I consider this patch a bit cleaner and > more generic, but I can adjust clear_refs_write() instead of there is a > preference. > After a closer look, I agree that this may be the simplest/cleanest way to proceed. I was going to suggest that you note hugetlb does not support softdirty, but see you did in the comment. Acked-by: Mike Kravetz <mike.kravetz@oracle.com> -- Mike Kravetz ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 20:48 ` Mike Kravetz @ 2022-08-05 23:13 ` Peter Xu 2022-08-05 23:33 ` Mike Kravetz 2022-08-08 16:36 ` David Hildenbrand 0 siblings, 2 replies; 14+ messages in thread From: Peter Xu @ 2022-08-05 23:13 UTC (permalink / raw) To: Mike Kravetz Cc: David Hildenbrand, linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On Fri, Aug 05, 2022 at 01:48:35PM -0700, Mike Kravetz wrote: > On 08/05/22 20:57, David Hildenbrand wrote: > > On 05.08.22 20:33, Mike Kravetz wrote: > > > On 08/05/22 20:25, David Hildenbrand wrote: > > >> On 05.08.22 20:23, Mike Kravetz wrote: > > >>> On 08/05/22 14:14, Peter Xu wrote: > > >>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: > > >>>>> diff --git a/mm/mmap.c b/mm/mmap.c > > >>>>> index 61e6135c54ef..462a6b0344ac 100644 > > >>>>> --- a/mm/mmap.c > > >>>>> +++ b/mm/mmap.c > > >>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) > > >>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) > > >>>>> return 0; > > >>>>> > > >>>>> + /* > > >>>>> + * Hugetlb does not require/support writenotify; especially, it does not > > >>>>> + * support softdirty tracking. > > >>>>> + */ > > >>>>> + if (is_vm_hugetlb_page(vma)) > > >>>>> + return 0; > > >>>> > > >>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for > > >>>> hugetlb but here it's explicitly forbidden. > > >>>> > > >>>> Could you explain a bit more on why this patch is needed if (assume > > >>>> there'll be a working) patch 2 being provided? > > >>>> > > >>> > > >>> No comments on the patch, but ... > > >>> > > >>> Since it required little thought, I ran the test program on next-20220802 and > > >>> was surprised that the issue did not recreate. Even added a simple printk > > >>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma. > > >>> We were. > > >> > > >> > > >> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled? > > >> > > > > > > No, Duh! > > > > > > FYI - Some time back, I started looking at adding soft dirty support for > > > hugetlb mappings. I did not finish that work. But, I seem to recall > > > places where code was operating on hugetlb mappings when perhaps it should > > > not. > > > > > > Perhaps, it would also be good to just disable soft dirty for hugetlb at > > > the source? > > > > I thought about that as well. But I came to the conclusion that without > > patch #2, hugetlb VMAs cannot possibly support write-notify, so there is > > no need to bother in vma_wants_writenotify() at all. > > > > The "root" would be places where we clear VM_SOFTDIRTY. That should only > > be fs/proc/task_mmu.c:clear_refs_write() IIRC. > > > > So I don't particularly care, I consider this patch a bit cleaner and > > more generic, but I can adjust clear_refs_write() instead of there is a > > preference. > > > > After a closer look, I agree that this may be the simplest/cleanest way to > proceed. I was going to suggest that you note hugetlb does not support > softdirty, but see you did in the comment. > > Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Filtering out hugetlbfs in vma_wants_writenotify() is still a bit hard to follow to me, since it's not clear why hugetlbfs never wants writenotify. If it's only about soft-dirty, we could have added the hugetlbfs check into vma_soft_dirty_enabled(), then I think it'll achieve the same thing and much clearer - with the soft-dirty check constantly returning false for it, hugetlbfs shared vmas should have vma_wants_writenotify() naturally return 0 already. For the long term - shouldn't we just enable soft-dirty for hugetlbfs? I remember Mike used to have that in todo. Since we've got patch 2 already, I feel like that's really much close (is the only missing piece the clear refs write part? or maybe some more that I didn't notice). Then patch 1 (or IMHO equivalant check in vma_soft_dirty_enabled(), but maybe in stable trees we don't have vma_soft_dirty_enabled then it's exactly patch 1) can be a stable-only backport just to avoid the bug from triggering. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 23:13 ` Peter Xu @ 2022-08-05 23:33 ` Mike Kravetz 2022-08-08 16:10 ` Peter Xu 2022-08-08 16:36 ` David Hildenbrand 1 sibling, 1 reply; 14+ messages in thread From: Mike Kravetz @ 2022-08-05 23:33 UTC (permalink / raw) To: Peter Xu Cc: David Hildenbrand, linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On 08/05/22 19:13, Peter Xu wrote: > On Fri, Aug 05, 2022 at 01:48:35PM -0700, Mike Kravetz wrote: > > On 08/05/22 20:57, David Hildenbrand wrote: > > > On 05.08.22 20:33, Mike Kravetz wrote: > > > > On 08/05/22 20:25, David Hildenbrand wrote: > > > >> On 05.08.22 20:23, Mike Kravetz wrote: > > > >>> On 08/05/22 14:14, Peter Xu wrote: > > > >>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: > > > >>>>> diff --git a/mm/mmap.c b/mm/mmap.c > > > >>>>> index 61e6135c54ef..462a6b0344ac 100644 > > > >>>>> --- a/mm/mmap.c > > > >>>>> +++ b/mm/mmap.c > > > >>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) > > > >>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) > > > >>>>> return 0; > > > >>>>> > > > >>>>> + /* > > > >>>>> + * Hugetlb does not require/support writenotify; especially, it does not > > > >>>>> + * support softdirty tracking. > > > >>>>> + */ > > > >>>>> + if (is_vm_hugetlb_page(vma)) > > > >>>>> + return 0; > > > >>>> > > > >>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for > > > >>>> hugetlb but here it's explicitly forbidden. > > > >>>> > > > >>>> Could you explain a bit more on why this patch is needed if (assume > > > >>>> there'll be a working) patch 2 being provided? > > > >>>> > > > >>> > > > >>> No comments on the patch, but ... > > > >>> > > > >>> Since it required little thought, I ran the test program on next-20220802 and > > > >>> was surprised that the issue did not recreate. Even added a simple printk > > > >>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma. > > > >>> We were. > > > >> > > > >> > > > >> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled? > > > >> > > > > > > > > No, Duh! > > > > > > > > FYI - Some time back, I started looking at adding soft dirty support for > > > > hugetlb mappings. I did not finish that work. But, I seem to recall > > > > places where code was operating on hugetlb mappings when perhaps it should > > > > not. > > > > > > > > Perhaps, it would also be good to just disable soft dirty for hugetlb at > > > > the source? > > > > > > I thought about that as well. But I came to the conclusion that without > > > patch #2, hugetlb VMAs cannot possibly support write-notify, so there is > > > no need to bother in vma_wants_writenotify() at all. > > > > > > The "root" would be places where we clear VM_SOFTDIRTY. That should only > > > be fs/proc/task_mmu.c:clear_refs_write() IIRC. > > > > > > So I don't particularly care, I consider this patch a bit cleaner and > > > more generic, but I can adjust clear_refs_write() instead of there is a > > > preference. > > > > > > > After a closer look, I agree that this may be the simplest/cleanest way to > > proceed. I was going to suggest that you note hugetlb does not support > > softdirty, but see you did in the comment. > > > > Acked-by: Mike Kravetz <mike.kravetz@oracle.com> > > Filtering out hugetlbfs in vma_wants_writenotify() is still a bit hard to > follow to me, since it's not clear why hugetlbfs never wants writenotify. > > If it's only about soft-dirty, we could have added the hugetlbfs check into > vma_soft_dirty_enabled(), then I think it'll achieve the same thing and > much clearer - with the soft-dirty check constantly returning false for it, > hugetlbfs shared vmas should have vma_wants_writenotify() naturally return > 0 already. > > For the long term - shouldn't we just enable soft-dirty for hugetlbfs? I > remember Mike used to have that in todo. Since we've got patch 2 already, > I feel like that's really much close (is the only missing piece the clear > refs write part? or maybe some more that I didn't notice). > > Then patch 1 (or IMHO equivalant check in vma_soft_dirty_enabled(), but > maybe in stable trees we don't have vma_soft_dirty_enabled then it's > exactly patch 1) can be a stable-only backport just to avoid the bug from > triggering. It looks like vma_soft_dirty_enabled is recent and not in any stable trees (or even 5.19). Yes, I did start working on hugetlb softdirty support in the past. https://lore.kernel.org/lkml/20210211000322.159437-1-mike.kravetz@oracle.com/ Unfortunately, it got preempted by other things. I will try to move it up the priority list. -- Mike Kravetz ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 23:33 ` Mike Kravetz @ 2022-08-08 16:10 ` Peter Xu 0 siblings, 0 replies; 14+ messages in thread From: Peter Xu @ 2022-08-08 16:10 UTC (permalink / raw) To: Mike Kravetz Cc: David Hildenbrand, linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On Fri, Aug 05, 2022 at 04:33:56PM -0700, Mike Kravetz wrote: > It looks like vma_soft_dirty_enabled is recent and not in any stable > trees (or even 5.19). > > Yes, I did start working on hugetlb softdirty support in the past. > https://lore.kernel.org/lkml/20210211000322.159437-1-mike.kravetz@oracle.com/ > > Unfortunately, it got preempted by other things. I will try to move it up > the priority list. Thanks, Mike. It'll also makes sense to forbid it if it may take time to finish, so we don't need to push ourselves. -- Peter Xu ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-05 23:13 ` Peter Xu 2022-08-05 23:33 ` Mike Kravetz @ 2022-08-08 16:36 ` David Hildenbrand 2022-08-08 19:28 ` Peter Xu 1 sibling, 1 reply; 14+ messages in thread From: David Hildenbrand @ 2022-08-08 16:36 UTC (permalink / raw) To: Peter Xu, Mike Kravetz Cc: linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On 06.08.22 01:13, Peter Xu wrote: > On Fri, Aug 05, 2022 at 01:48:35PM -0700, Mike Kravetz wrote: >> On 08/05/22 20:57, David Hildenbrand wrote: >>> On 05.08.22 20:33, Mike Kravetz wrote: >>>> On 08/05/22 20:25, David Hildenbrand wrote: >>>>> On 05.08.22 20:23, Mike Kravetz wrote: >>>>>> On 08/05/22 14:14, Peter Xu wrote: >>>>>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: >>>>>>>> diff --git a/mm/mmap.c b/mm/mmap.c >>>>>>>> index 61e6135c54ef..462a6b0344ac 100644 >>>>>>>> --- a/mm/mmap.c >>>>>>>> +++ b/mm/mmap.c >>>>>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) >>>>>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) >>>>>>>> return 0; >>>>>>>> >>>>>>>> + /* >>>>>>>> + * Hugetlb does not require/support writenotify; especially, it does not >>>>>>>> + * support softdirty tracking. >>>>>>>> + */ >>>>>>>> + if (is_vm_hugetlb_page(vma)) >>>>>>>> + return 0; >>>>>>> >>>>>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for >>>>>>> hugetlb but here it's explicitly forbidden. >>>>>>> >>>>>>> Could you explain a bit more on why this patch is needed if (assume >>>>>>> there'll be a working) patch 2 being provided? >>>>>>> >>>>>> >>>>>> No comments on the patch, but ... >>>>>> >>>>>> Since it required little thought, I ran the test program on next-20220802 and >>>>>> was surprised that the issue did not recreate. Even added a simple printk >>>>>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma. >>>>>> We were. >>>>> >>>>> >>>>> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled? >>>>> >>>> >>>> No, Duh! >>>> >>>> FYI - Some time back, I started looking at adding soft dirty support for >>>> hugetlb mappings. I did not finish that work. But, I seem to recall >>>> places where code was operating on hugetlb mappings when perhaps it should >>>> not. >>>> >>>> Perhaps, it would also be good to just disable soft dirty for hugetlb at >>>> the source? >>> >>> I thought about that as well. But I came to the conclusion that without >>> patch #2, hugetlb VMAs cannot possibly support write-notify, so there is >>> no need to bother in vma_wants_writenotify() at all. >>> >>> The "root" would be places where we clear VM_SOFTDIRTY. That should only >>> be fs/proc/task_mmu.c:clear_refs_write() IIRC. >>> >>> So I don't particularly care, I consider this patch a bit cleaner and >>> more generic, but I can adjust clear_refs_write() instead of there is a >>> preference. >>> >> >> After a closer look, I agree that this may be the simplest/cleanest way to >> proceed. I was going to suggest that you note hugetlb does not support >> softdirty, but see you did in the comment. >> >> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> > > Filtering out hugetlbfs in vma_wants_writenotify() is still a bit hard to > follow to me, since it's not clear why hugetlbfs never wants writenotify. Well, because the write-fault handler as is cannot deal with write-faults in shared mappings. It cannot possibly work or ever have worked. > > If it's only about soft-dirty, we could have added the hugetlbfs check into > vma_soft_dirty_enabled(), then I think it'll achieve the same thing and > much clearer - with the soft-dirty check constantly returning false for it, > hugetlbfs shared vmas should have vma_wants_writenotify() naturally return > 0 already. I considered that an option, but went with this approach for simplicity and because I don't see a need to check for hugetlb in other code paths (especially, in the !hugetlb mprotect variant). I mean, with patch #2 we can remove it anytime we do support softdirty tracking -- or if there is need for write-notify we can move it into the softdirty check only. > > For the long term - shouldn't we just enable soft-dirty for hugetlbfs? I > remember Mike used to have that in todo. Since we've got patch 2 already, > I feel like that's really much close (is the only missing piece the clear > refs write part? or maybe some more that I didn't notice). My gut feeling is that there isn't too much missing to have it working. Define a PTE bit, implement hugetlb variant of clearing, and set it when setting the PTE dirty. Thanks! -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-08 16:36 ` David Hildenbrand @ 2022-08-08 19:28 ` Peter Xu 2022-08-10 9:29 ` David Hildenbrand 0 siblings, 1 reply; 14+ messages in thread From: Peter Xu @ 2022-08-08 19:28 UTC (permalink / raw) To: David Hildenbrand Cc: Mike Kravetz, linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On Mon, Aug 08, 2022 at 06:36:58PM +0200, David Hildenbrand wrote: > Well, because the write-fault handler as is cannot deal with > write-faults in shared mappings. It cannot possibly work or ever have > worked. Trivially - maybe drop the word "require" in "Hugetlb does not require/support writenotify"? -- Peter Xu ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify 2022-08-08 19:28 ` Peter Xu @ 2022-08-10 9:29 ` David Hildenbrand 0 siblings, 0 replies; 14+ messages in thread From: David Hildenbrand @ 2022-08-10 9:29 UTC (permalink / raw) To: Peter Xu Cc: Mike Kravetz, linux-kernel, linux-mm, Andrew Morton, Muchun Song, Peter Feiner, Kirill A . Shutemov, stable On 08.08.22 21:28, Peter Xu wrote: > On Mon, Aug 08, 2022 at 06:36:58PM +0200, David Hildenbrand wrote: >> Well, because the write-fault handler as is cannot deal with >> write-faults in shared mappings. It cannot possibly work or ever have >> worked. > > Trivially - maybe drop the word "require" in "Hugetlb does not > require/support writenotify"? > Sure, can do. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2022-08-10 9:29 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20220805110329.80540-1-david@redhat.com>
2022-08-05 11:03 ` [PATCH v1 1/2] mm/hugetlb: fix hugetlb not supporting write-notify David Hildenbrand
2022-08-05 18:14 ` Peter Xu
2022-08-05 18:22 ` David Hildenbrand
2022-08-05 18:23 ` Mike Kravetz
2022-08-05 18:25 ` David Hildenbrand
2022-08-05 18:33 ` Mike Kravetz
2022-08-05 18:57 ` David Hildenbrand
2022-08-05 20:48 ` Mike Kravetz
2022-08-05 23:13 ` Peter Xu
2022-08-05 23:33 ` Mike Kravetz
2022-08-08 16:10 ` Peter Xu
2022-08-08 16:36 ` David Hildenbrand
2022-08-08 19:28 ` Peter Xu
2022-08-10 9:29 ` David Hildenbrand
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).