* + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
@ 2025-05-07 21:55 Andrew Morton
2025-05-08 14:16 ` Peter Xu
0 siblings, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2025-05-07 21:55 UTC (permalink / raw)
To: mm-commits, wade.farnsworth, peterx, jhubbard, jgg, david,
c.briere, artem.k, p.antoniou, akpm
The patch titled
Subject: Fix zero copy I/O on __get_user_pages allocated pages
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Pantelis Antoniou <p.antoniou@partner.samsung.com>
Subject: Fix zero copy I/O on __get_user_pages allocated pages
Date: Wed, 7 May 2025 10:41:05 -0500
Recent updates to network filesystems enabled zero-copy operations, which
require pinning user-space pages.
This does not work for pages that were allocated via __get_user_pages() and
then mapped to user space via remap_pfn_range().
remap_pfn_range_internal() turns on the VM_IO | VM_PFNMAP vma bits.
VM_PFNMAP in particular marks the pages as not having a struct page
associated with them, which is not the case for __get_user_pages() pages.
This in turn makes any attempt to lock a page fail, breaking I/O from
that address range.
This patch addresses it by special-casing pages in those VMAs and not
calling vm_normal_page() for them.
Link: https://lkml.kernel.org/r/20250507154105.763088-2-p.antoniou@partner.samsung.com
Signed-off-by: Pantelis Antoniou <p.antoniou@partner.samsung.com>
Cc: Artem Krupotkin <artem.k@samsung.com>
Cc: Charles Briere <c.briere@samsung.com>
Cc: Wade Farnsworth <wade.farnsworth@siemens.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/gup.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
--- a/mm/gup.c~fix-zero-copy-i-o-on-__get_user_pages-allocated-pages
+++ a/mm/gup.c
@@ -833,6 +833,20 @@ static inline bool can_follow_write_pte(
return !userfaultfd_pte_wp(vma, pte);
}
+static struct page *gup_normal_page(struct vm_area_struct *vma,
+ unsigned long address, pte_t pte)
+{
+ unsigned long pfn;
+
+ if (vma->vm_flags & (VM_MIXEDMAP | VM_PFNMAP)) {
+ pfn = pte_pfn(pte);
+ if (!pfn_valid(pfn) || is_zero_pfn(pfn) || pfn > highest_memmap_pfn)
+ return NULL;
+ return pfn_to_page(pfn);
+ }
+ return vm_normal_page(vma, address, pte);
+}
+
static struct page *follow_page_pte(struct vm_area_struct *vma,
unsigned long address, pmd_t *pmd, unsigned int flags,
struct dev_pagemap **pgmap)
@@ -858,7 +872,9 @@ static struct page *follow_page_pte(stru
if (pte_protnone(pte) && !gup_can_follow_protnone(vma, flags))
goto no_page;
- page = vm_normal_page(vma, address, pte);
+ page = gup_normal_page(vma, address, pte);
+ if (page && (vma->vm_flags & (VM_MIXEDMAP | VM_PFNMAP)))
+ (void)follow_pfn_pte(vma, address, ptep, flags);
/*
* We only care about anon pages in can_follow_write_pte() and don't
@@ -1130,7 +1146,7 @@ static int get_gate_page(struct mm_struc
*vma = get_gate_vma(mm);
if (!page)
goto out;
- *page = vm_normal_page(*vma, address, entry);
+ *page = gup_normal_page(*vma, address, entry);
if (!*page) {
if ((gup_flags & FOLL_DUMP) || !is_zero_pfn(pte_pfn(entry)))
goto unmap;
@@ -1271,8 +1287,6 @@ static int check_vma_flags(struct vm_are
int foreign = (gup_flags & FOLL_REMOTE);
bool vma_anon = vma_is_anonymous(vma);
- if (vm_flags & (VM_IO | VM_PFNMAP))
- return -EFAULT;
if ((gup_flags & FOLL_ANON) && !vma_anon)
return -EFAULT;
_
Patches currently in -mm which might be from p.antoniou@partner.samsung.com are
fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-07 21:55 + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch Andrew Morton
@ 2025-05-08 14:16 ` Peter Xu
2025-05-08 14:36 ` Pantelis Antoniou
0 siblings, 1 reply; 32+ messages in thread
From: Peter Xu @ 2025-05-08 14:16 UTC (permalink / raw)
To: Andrew Morton
Cc: mm-commits, wade.farnsworth, jhubbard, jgg, david, c.briere, artem.k, p.antoniou, David Howells

Hi, Pantelis,

[Cc David Howells]

On Wed, May 07, 2025 at 02:55:54PM -0700, Andrew Morton wrote:
> The patch titled
> Subject: Fix zero copy I/O on __get_user_pages allocated pages
> has been added to the -mm mm-hotfixes-unstable branch.

[...]

> @@ -1130,7 +1146,7 @@ static int get_gate_page(struct mm_struc
> *vma = get_gate_vma(mm);
> if (!page)
> goto out;
> - *page = vm_normal_page(*vma, address, entry);
> + *page = gup_normal_page(*vma, address, entry);

Is this really needed? IIUC the iter code would only use in either UBUF or
IOVEC ones.

> if (!*page) {
> if ((gup_flags & FOLL_DUMP) || !is_zero_pfn(pte_pfn(entry)))
> goto unmap;
> @@ -1271,8 +1287,6 @@ static int check_vma_flags(struct vm_are
> int foreign = (gup_flags & FOLL_REMOTE);
> bool vma_anon = vma_is_anonymous(vma);
>
> - if (vm_flags & (VM_IO | VM_PFNMAP))
> - return -EFAULT;

Is there any justification that this won't break some existing GUP users
that may rely on properly failing at pfnmaps?

IIUC netfs isn't the first one that wants to GUP on top of pfnmaps, KVM does
it for years and so far it was processed in a standalone path:

hva_to_pfn:
        else if (vma->vm_flags & (VM_IO | VM_PFNMAP)) {
                r = hva_to_pfn_remapped(vma, kfp, &pfn);

That started with supporting real pfnmaps (with no page struct), but pfnmap
with page structs can also happen afaict, and kvm processes that too by
checking page==NULL ultimately, e.g. in kvm_release_faultin_page().

The other thing is above only processed pte level of pfnmap, and just to
mention pmd/pud may need attention too because we're gradually supporting
huge mappings even for pfns. I didn't check whether it's possible as of
now, though. Maybe it's not an immediate concern.

In general, I'm uncertain about whether this is the right way to go so far.
To me it might be less intrusive if we follow what kvm does for now, or
maybe we also at least want to enrich the justification part in the commit
log.

> if ((gup_flags & FOLL_ANON) && !vma_anon)
> return -EFAULT;

[...]

--
Peter Xu
^ permalink raw reply [flat|nested] 32+ messages in thread
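The two behaviours contrasted in this review (GUP rejecting VM_IO | VM_PFNMAP vmas up front, versus KVM routing such vmas to a separate path) can be sketched in user space. This is a loose model and not kernel code: the MODEL_* flag values, the `patched` switch and the function names are assumptions made for illustration, and the dispatch function only imitates the shape of the hva_to_pfn() snippet quoted above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_* bits; values are illustrative. */
#define MODEL_VM_PFNMAP 0x1u
#define MODEL_VM_IO     0x2u

/*
 * Model of the early check in check_vma_flags().
 * patched == false: the check is present, special mappings get -EFAULT.
 * patched == true:  the check is removed, as in the patch under review.
 */
static int model_check_vma_flags(unsigned int vm_flags, bool patched)
{
	if (!patched && (vm_flags & (MODEL_VM_IO | MODEL_VM_PFNMAP)))
		return -EFAULT;
	return 0;
}

enum model_path { MODEL_PATH_GUP, MODEL_PATH_REMAPPED };

/*
 * Model of a KVM-style dispatch: special mappings never go through the
 * struct-page based path, they are resolved by a dedicated pfn lookup.
 */
static enum model_path model_pick_path(unsigned int vm_flags)
{
	if (vm_flags & (MODEL_VM_IO | MODEL_VM_PFNMAP))
		return MODEL_PATH_REMAPPED;
	return MODEL_PATH_GUP;
}
```

In this model, removing the check changes the result for every caller that passes a special mapping, which is the compatibility concern raised above; the dispatch variant leaves the generic check untouched and confines the special handling to the caller that needs it.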
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 14:16 ` Peter Xu
@ 2025-05-08 14:36 ` Pantelis Antoniou
2025-05-08 15:08 ` Peter Xu
0 siblings, 1 reply; 32+ messages in thread
From: Pantelis Antoniou @ 2025-05-08 14:36 UTC (permalink / raw)
To: Peter Xu
Cc: Andrew Morton, mm-commits, wade.farnsworth, jhubbard, jgg, david, c.briere, artem.k, David Howells

On Thu, 8 May 2025 10:16:31 -0400
Peter Xu <peterx@redhat.com> wrote:

Hi Peter,

> Hi, Pantelis,
>
> [Cc David Howells]
>
> On Wed, May 07, 2025 at 02:55:54PM -0700, Andrew Morton wrote:

[...]

> > - *page = vm_normal_page(*vma, address, entry);
> > + *page = gup_normal_page(*vma, address, entry);
>
> Is this really needed? IIUC the iter code would only use in either
> UBUF or IOVEC ones.
>

I think you're right, for our platforms the gate check never passes.
However using the same gup_normal_page() method could be clearer in
this context.

> > @@ -1271,8 +1287,6 @@ static int check_vma_flags(struct vm_are
> > int foreign = (gup_flags & FOLL_REMOTE);
> > bool vma_anon = vma_is_anonymous(vma);
> >
> > - if (vm_flags & (VM_IO | VM_PFNMAP))
> > - return -EFAULT;
>
> Is there's any justification that this won't break some existing GUP
> users that may rely on properly failing at pfnmaps?
>
> IIUC netfs isn't the first one that wants to GUP on top of pfnmaps,
> KVM does it for years and so far it was processed in a standalone
> path:
>
> hva_to_pfn:
>         else if (vma->vm_flags & (VM_IO | VM_PFNMAP)) {
>                 r = hva_to_pfn_remapped(vma, kfp, &pfn);
>
> That started with supporting real pfnmaps (with no page struct), but
> pfnmap with page structs can also happen afaict, and kvm processes
> that too by checking page==NULL ultimately, e.g. in
> kvm_release_faultin_page().
>

I see. The problem is that we're not the owners of the code in netfslib,
and it is considerably more intrusive to fix things there.

This is a hotfix for a userspace regression. I sort of agree that having
different handling for these areas in netfslib would be ideal.

Or perhaps changing semantics by having an extra VM_* bit that would
mark that VMA as actually having a backing page struct.
Dunno, things could get considerably complex fast.

> The other thing is above only processed pte level of pfnmap, and just
> to mention pmd/pud may need attention too because we're gradually
> supporting huge mappings even for pfns. I didn't check whether it's
> possible as of now, though. Maybe it's not an immediate concern.
>

You are absolutely right, eventually it will be a concern in the future.

> In general, I'm uncertain about whether this is the right way to go so
> far. To me it might be less intrusive if we follow what kvm does for
> now, or maybe we also at least want to enrich the justification part
> in the commit log.
>

Again, this is a hotfix. An actual fix might be something that addresses
both KVM and netfslib concerns, but that would be something much larger
than a 20 line patch.

[...]

Regards

-- Pantelis
^ permalink raw reply [flat|nested] 32+ messages in thread
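The difference between the existing lookup and the helper added by the patch can also be modelled in user space. This is a simplified sketch under stated assumptions, not the kernel implementation: `struct model_pte`, the MODEL_* flag values and both function names are hypothetical, and the model assumes the pte in a special vma is marked special (as remap_pfn_range() does on architectures that support it), so the existing lookup yields no struct page there.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_* bits; values are illustrative. */
#define MODEL_VM_PFNMAP   0x1u
#define MODEL_VM_MIXEDMAP 0x4u

/* Hypothetical summary of what the pte's pfn refers to. */
struct model_pte {
	bool has_memmap;   /* stands in for pfn_valid() and pfn <= highest_memmap_pfn */
	bool is_zero_page; /* stands in for is_zero_pfn() */
};

/* Returns true when the lookup would yield a struct page, false for NULL. */
static bool model_vm_normal_page(unsigned int vm_flags,
				 const struct model_pte *pte)
{
	/* Assumption: ptes in these vmas are special, so no page is returned. */
	if (vm_flags & (MODEL_VM_PFNMAP | MODEL_VM_MIXEDMAP))
		return false;
	return pte->has_memmap && !pte->is_zero_page;
}

/* Model of the patch's helper: trust the memmap even in special vmas. */
static bool model_gup_normal_page(unsigned int vm_flags,
				  const struct model_pte *pte)
{
	if (vm_flags & (MODEL_VM_PFNMAP | MODEL_VM_MIXEDMAP))
		return pte->has_memmap && !pte->is_zero_page;
	return model_vm_normal_page(vm_flags, pte);
}
```

Under these assumptions the two lookups disagree only for special vmas whose pfn happens to have a memmap entry, which is the case the patch targets; a pfn without a memmap entry (for example device MMIO) still yields no page in both.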
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 14:36 ` Pantelis Antoniou
@ 2025-05-08 15:08 ` Peter Xu
2025-05-08 15:10 ` David Hildenbrand
2025-05-08 15:17 ` Pantelis Antoniou
0 siblings, 2 replies; 32+ messages in thread
From: Peter Xu @ 2025-05-08 15:08 UTC (permalink / raw)
To: Pantelis Antoniou
Cc: Andrew Morton, mm-commits, wade.farnsworth, jhubbard, jgg, david, c.briere, artem.k, David Howells

On Thu, May 08, 2025 at 05:36:12PM +0300, Pantelis Antoniou wrote:
> On Thu, 8 May 2025 10:16:31 -0400
> Peter Xu <peterx@redhat.com> wrote:
>
> Hi Peter,

Hi, Pantelis,

[...]

> I see. The problem is that we're not the owners of the code in netfslib,
> and it is considerably more intrusive to fix things there.
>
> This is a hotfix for a userspace regression. I sort of agree that having
> different handling for these areas in netfslib would be ideal.

Do you mean this used to work in older kernels? Some more info on the
regression would be more than welcomed if so.. If it fixes a kernel
regression, we may want a Fixes for whatever patch at last.

Or do you mean it's a regression caused by userspace change?

Thanks,

--
Peter Xu
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 15:08 ` Peter Xu
@ 2025-05-08 15:10 ` David Hildenbrand
2025-05-08 15:27 ` Pantelis Antoniou
2025-05-08 15:17 ` Pantelis Antoniou
1 sibling, 1 reply; 32+ messages in thread
From: David Hildenbrand @ 2025-05-08 15:10 UTC (permalink / raw)
To: Peter Xu, Pantelis Antoniou
Cc: Andrew Morton, mm-commits, wade.farnsworth, jhubbard, jgg, c.briere, artem.k, David Howells

On 08.05.25 17:08, Peter Xu wrote:
> On Thu, May 08, 2025 at 05:36:12PM +0300, Pantelis Antoniou wrote:

[...]

>> This is a hotfix for a userspace regression. I sort of agree that having
>> different handling for these areas in netfslib would be ideal.
>
> Do you mean this used to work in older kernels? Some more info on the
> regression would be more than welcomed if so.. If it fixes a kernel
> regression, we may want a Fixes for whatever patch at last.

To be precise: Whoever decided to use remap_pfn_range() essentially
decided that GUP cannot possibly work.

So is the regression introduced by a conversion to remap_pfn_range() in
some code, or because suddenly someone relies on GUP for these things?

--
Cheers,

David / dhildenb
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 15:10 ` David Hildenbrand
@ 2025-05-08 15:27 ` Pantelis Antoniou
2025-05-08 15:40 ` David Hildenbrand
0 siblings, 1 reply; 32+ messages in thread
From: Pantelis Antoniou @ 2025-05-08 15:27 UTC (permalink / raw)
To: David Hildenbrand
Cc: Peter Xu, Andrew Morton, mm-commits, wade.farnsworth, jhubbard, jgg, c.briere, artem.k, David Howells

On Thu, 8 May 2025 17:10:10 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 08.05.25 17:08, Peter Xu wrote:

[...]

> To be precise: Whoever decided to use remap_pfn_range() essentially
> decided that GUP cannot possibly work.
>
> So is the regression introduced by a conversion to remap_pfn_range()
> in some code, or because suddenly someone relies on GUP for these
> things?
>

I don't think there was a deliberate decision here, but there was no
conversion to remap_pfn_range(), the code (in DRM) was always there.

The regression occurred when netfslib started using GUP for I/O and
when filesystems switched to it we hit this case.

Regards

-- Pantelis
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 15:27 ` Pantelis Antoniou
@ 2025-05-08 15:40 ` David Hildenbrand
2025-05-08 15:48 ` Pantelis Antoniou
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: David Hildenbrand @ 2025-05-08 15:40 UTC (permalink / raw)
To: Pantelis Antoniou
Cc: Peter Xu, Andrew Morton, mm-commits, wade.farnsworth, jhubbard, jgg, c.briere, artem.k, David Howells

On 08.05.25 17:27, Pantelis Antoniou wrote:
> On Thu, 8 May 2025 17:10:10 +0200
> David Hildenbrand <david@redhat.com> wrote:

[...]

> I don't think there was a deliberate decision here, but there was no
> conversion to remap_pfn_range(), the code (in DRM) was always there.
>
> The regression occurred when netfslib started using GUP for I/O and
> when filesystems switched to it we hit this case.

Okay, so GUP and DRM always worked that way. They are essentially
incompatible at this point due to VM_PFNMAP.

So netfslib requesting something that is impossible is the problem .. or
rather filesystems switching to that and not realizing the problem.

Hmmm

--
Cheers,

David / dhildenb
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 15:40 ` David Hildenbrand
@ 2025-05-08 15:48 ` Pantelis Antoniou
2025-05-08 16:25 ` Pantelis Antoniou
2025-05-08 17:35 ` Jason Gunthorpe
2 siblings, 0 replies; 32+ messages in thread
From: Pantelis Antoniou @ 2025-05-08 15:48 UTC (permalink / raw)
To: David Hildenbrand
Cc: Peter Xu, Andrew Morton, mm-commits, wade.farnsworth, jhubbard, jgg, c.briere, artem.k, David Howells

On Thu, 8 May 2025 17:40:15 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 08.05.25 17:27, Pantelis Antoniou wrote:

[...]

> Okay, so GUP and DRM always worked that way. They are essentially
> incompatible at this point due to VM_PFNMAP.
>
> So netfslib requesting something that is impossible is the problem ..
> or rather filesystems switching to that and not realizing the problem.
>

All of the statements above are true.

> Hmmm
>

Indeed.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch 2025-05-08 15:40 ` David Hildenbrand 2025-05-08 15:48 ` Pantelis Antoniou @ 2025-05-08 16:25 ` Pantelis Antoniou 2025-05-08 17:35 ` Jason Gunthorpe 2 siblings, 0 replies; 32+ messages in thread From: Pantelis Antoniou @ 2025-05-08 16:25 UTC (permalink / raw) To: David Hildenbrand Cc: Peter Xu, Andrew Morton, mm-commits, wade.farnsworth, jhubbard, jgg, c.briere, artem.k, David Howells [-- Attachment #1: Type: text/plain, Size: 3641 bytes --] On Thu, 8 May 2025 17:40:15 +0200 David Hildenbrand <david@redhat.com> wrote: > On 08.05.25 17:27, Pantelis Antoniou wrote: > > On Thu, 8 May 2025 17:10:10 +0200 > > David Hildenbrand <david@redhat.com> wrote: > > > >> On 08.05.25 17:08, Peter Xu wrote: > >>> On Thu, May 08, 2025 at 05:36:12PM +0300, Pantelis Antoniou wrote: > >>>> On Thu, 8 May 2025 10:16:31 -0400 > >>>> Peter Xu <peterx@redhat.com> wrote: > >>>> > >>>> Hi Peter, > >>> > >>> Hi, Pantelis, > >>> > >>> [...] > >>> > >>>>>> @@ -1271,8 +1287,6 @@ static int check_vma_flags(struct vm_are > >>>>>> int foreign = (gup_flags & FOLL_REMOTE); > >>>>>> bool vma_anon = vma_is_anonymous(vma); > >>>>>> > >>>>>> - if (vm_flags & (VM_IO | VM_PFNMAP)) > >>>>>> - return -EFAULT; > >>>>> > >>>>> Is there any justification that this won't break some existing > >>>>> GUP users that may rely on properly failing at pfnmaps?
> >>>>> > >>>>> IIUC netfs isn't the first one that wants to GUP on top of > >>>>> pfnmaps, KVM does it for years and so far it was processed in a > >>>>> standalone path: > >>>>> > >>>>> hva_to_pfn: > >>>>> else if (vma->vm_flags & (VM_IO | VM_PFNMAP)) { > >>>>> r = hva_to_pfn_remapped(vma, kfp, &pfn); > >>>>> > >>>>> That started with supporting real pfnmaps (with no page struct), > >>>>> but pfnmap with page structs can also happen afaict, and kvm > >>>>> processes that too by checking page==NULL ultimately, e.g. in > >>>>> kvm_release_faultin_page(). > >>>>> > >>>> > >>>> I see. The problem is that we're not the owners of the code in > >>>> netfslib, and it is considerably more intrusive to fix things > >>>> there. > >>>> > >>>> This is a hotfix for a userspace regression. I sort of agree that > >>>> having different handling for these areas in netfslib would be > >>>> ideal. > >>> > >>> Do you mean this used to work in older kernels? Some more info on > >>> the regression would be more than welcomed if so.. If it fixes a > >>> kernel regression, we may want a Fixes for whatever patch at last. > >> > >> To be precise: Whoever decided to use remap_pfn_range() essentially > >> decided that GUP cannot possibly work. > >> > >> So is the regression introduced by a conversion to > >> remap_pfn_range() in some code, or because suddenly someone relies > >> on GUP for these things? > >> > > > > I don't think there was a deliberate decision here, but there was no > > conversion to remap_pfn_range(), the code (in DRM) was always there. > > > > The regression occurred when netfslib started using GUP for I/O and > > when filesystems switched to it we hit this case. > > Okay, so GUP and DRM always worked that way. They are essentially > incompatible at this point due to VM_PFNMAP. > > So netfslib requesting something that is impossible is the problem .. > or rather filesystems switching to that and not realizing the problem. 
> > Hmmm > In the interest of getting everyone on the same page, here's a buildroot patch that reproduces a (simplified) environment for triggering the bug that this patch fixes. Regards -- Pantelis [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Modify-aarch64-virt-x86_64-qemu-targets-to-exhibit-a.patch --] [-- Type: text/x-patch, Size: 22146 bytes --] From fa33c0501ec4b5e84e9359d56b0da54bfa5af728 Mon Sep 17 00:00:00 2001 From: Pantelis Antoniou <p.antoniou@partner.samsung.com> Date: Tue, 1 Apr 2025 17:28:20 +0300 Subject: [PATCH] Modify aarch64-virt/x86_64 qemu targets to exhibit a vmbug When 9p writes directly from a device mmap vma the write fails. To use, check this out and: $ make qemu_aarch64_virt_defconfig # arm64 or $ make qemu_x86_64_defconfig # x86_64 $ make $ ./output/images/start-qemu.sh < login as root, no password > qemu# modprobe vmbug-module qemu# vmbug <OK> qemu# vmbug -w /mnt/home/vmbug.bin <FAILS> qemu# vmbug -b -w /mnt/home/vmbug.bin <OK> Signed-off-by: Pantelis Antoniou <p.antoniou@partner.samsung.com> --- board/qemu/aarch64-virt/linux.config | 8 +- board/qemu/aarch64-virt/readme.txt | 2 +- board/qemu/post-build.sh | 6 + board/qemu/x86_64/linux.config | 14 +- board/qemu/x86_64/readme.txt | 2 +- configs/qemu_aarch64_virt_defconfig | 12 +- configs/qemu_x86_64_defconfig | 13 +- linux/linux.hash | 3 + package/Config.in | 2 + package/vmbug-module/Config.in | 5 + package/vmbug-module/Makefile | 10 ++ package/vmbug-module/vmbug-module.c | 136 ++++++++++++++++++ package/vmbug-module/vmbug-module.mk | 14 ++ package/vmbug/Config.in | 5 + package/vmbug/Makefile | 25 ++++ package/vmbug/vmbug.c | 202 +++++++++++++++++++++++++++ package/vmbug/vmbug.mk | 21 +++ 17 files changed, 463 insertions(+), 17 deletions(-) create mode 100755 board/qemu/post-build.sh create mode 100644 package/vmbug-module/Config.in create mode 100644 package/vmbug-module/Makefile create mode 100644 package/vmbug-module/vmbug-module.c create
mode 100644 package/vmbug-module/vmbug-module.mk create mode 100644 package/vmbug/Config.in create mode 100644 package/vmbug/Makefile create mode 100644 package/vmbug/vmbug.c create mode 100644 package/vmbug/vmbug.mk diff --git a/board/qemu/aarch64-virt/linux.config b/board/qemu/aarch64-virt/linux.config index 971b9fcf86..0fcb3fbaea 100644 --- a/board/qemu/aarch64-virt/linux.config +++ b/board/qemu/aarch64-virt/linux.config @@ -13,6 +13,7 @@ CONFIG_PROFILING=y CONFIG_ARCH_VEXPRESS=y CONFIG_COMPAT=y CONFIG_ACPI=y +# CONFIG_GCC_PLUGINS is not set CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_BLK_DEV_BSGLIB=y @@ -21,14 +22,14 @@ CONFIG_TRANSPARENT_HUGEPAGE=y CONFIG_NET=y CONFIG_PACKET=y CONFIG_PACKET_DIAG=y -CONFIG_UNIX=y CONFIG_NET_KEY=y -CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_ADVANCED_ROUTER=y CONFIG_BRIDGE=m CONFIG_NET_SCHED=y CONFIG_VSOCKETS=y +CONFIG_NET_9P=y +CONFIG_NET_9P_VIRTIO=y CONFIG_PCI=y CONFIG_PCI_HOST_GENERIC=y CONFIG_DEVTMPFS=y @@ -74,3 +75,6 @@ CONFIG_VIRTIO_FS=y CONFIG_OVERLAY_FS=y CONFIG_TMPFS=y CONFIG_TMPFS_POSIX_ACL=y +CONFIG_9P_FS=y +CONFIG_9P_FS_POSIX_ACL=y +CONFIG_CRYPTO_CRC32C=y diff --git a/board/qemu/aarch64-virt/readme.txt b/board/qemu/aarch64-virt/readme.txt index db35a3a7a8..3fac0a296c 100644 --- a/board/qemu/aarch64-virt/readme.txt +++ b/board/qemu/aarch64-virt/readme.txt @@ -1,5 +1,5 @@ Run the emulation with: - qemu-system-aarch64 -M virt -cpu cortex-a53 -nographic -smp 1 -kernel output/images/Image -append "rootwait root=/dev/vda console=ttyAMA0" -netdev user,id=eth0 -device virtio-net-device,netdev=eth0 -drive file=output/images/rootfs.ext4,if=none,format=raw,id=hd0 -device virtio-blk-device,drive=hd0 # qemu_aarch64_virt_defconfig + qemu-system-aarch64 -M virt -cpu cortex-a53 -nographic -smp 1 -kernel output/images/Image -append "rootwait root=/dev/vda console=ttyAMA0" -netdev user,id=eth0 -device virtio-net-device,netdev=eth0 -drive file=output/images/rootfs.ext4,if=none,format=raw,id=hd0 -device virtio-blk-device,drive=hd0 
-virtfs local,path="${HOME}",mount_tag=host0,security_model=mapped,id=host0 # qemu_aarch64_virt_defconfig The login prompt will appear in the terminal that started Qemu. diff --git a/board/qemu/post-build.sh b/board/qemu/post-build.sh new file mode 100755 index 0000000000..61648a1daf --- /dev/null +++ b/board/qemu/post-build.sh @@ -0,0 +1,6 @@ +#!/bin/sh +BOARD_DIR="$(dirname "$0")" + +mkdir -p "${TARGET_DIR}/mnt/home" + +echo "host0 /mnt/home 9p trans=virtio,version=9p2000.L 0 1" >>"${TARGET_DIR}/etc/fstab" diff --git a/board/qemu/x86_64/linux.config b/board/qemu/x86_64/linux.config index e1d2ce01b0..9d4a97a6b5 100644 --- a/board/qemu/x86_64/linux.config +++ b/board/qemu/x86_64/linux.config @@ -1,15 +1,16 @@ CONFIG_SYSVIPC=y CONFIG_CGROUPS=y -CONFIG_MODULES=y -CONFIG_MODULE_UNLOAD=y CONFIG_SMP=y CONFIG_HYPERVISOR_GUEST=y CONFIG_PARAVIRT=y +# CONFIG_GCC_PLUGINS is not set +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y CONFIG_NET=y CONFIG_PACKET=y -CONFIG_UNIX=y -CONFIG_INET=y # CONFIG_WIRELESS is not set +CONFIG_NET_9P=y +CONFIG_NET_9P_VIRTIO=y CONFIG_PCI=y CONFIG_DEVTMPFS=y CONFIG_DEVTMPFS_MOUNT=y @@ -30,8 +31,8 @@ CONFIG_VIRTIO_CONSOLE=y CONFIG_HW_RANDOM_VIRTIO=m CONFIG_DRM=y CONFIG_DRM_QXL=y -CONFIG_DRM_BOCHS=y CONFIG_DRM_VIRTIO_GPU=y +CONFIG_DRM_BOCHS=y CONFIG_SOUND=y CONFIG_SND=y CONFIG_SND_HDA_INTEL=y @@ -47,7 +48,8 @@ CONFIG_VIRTIO_INPUT=y CONFIG_VIRTIO_MMIO=y CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y CONFIG_EXT4_FS=y -CONFIG_AUTOFS4_FS=y CONFIG_TMPFS=y CONFIG_TMPFS_POSIX_ACL=y +CONFIG_9P_FS=y +CONFIG_9P_FS_POSIX_ACL=y CONFIG_UNWINDER_FRAME_POINTER=y diff --git a/board/qemu/x86_64/readme.txt b/board/qemu/x86_64/readme.txt index 2b2ae3be20..a0a7fb6ab7 100644 --- a/board/qemu/x86_64/readme.txt +++ b/board/qemu/x86_64/readme.txt @@ -1,6 +1,6 @@ Run the emulation with: - qemu-system-x86_64 -M pc -kernel output/images/bzImage -drive file=output/images/rootfs.ext2,if=virtio,format=raw -append "rootwait root=/dev/vda console=tty1 console=ttyS0" -serial stdio -net 
nic,model=virtio -net user # qemu_x86_64_defconfig + qemu-system-x86_64 -M pc -kernel output/images/bzImage -drive file=output/images/rootfs.ext2,if=virtio,format=raw -append "rootwait root=/dev/vda console=tty1 console=ttyS0" -serial stdio -net nic,model=virtio -net user -virtfs local,path="${HOME}",mount_tag=host0,security_model=mapped,id=host0 # qemu_x86_64_defconfig Optionally add -smp N to emulate a SMP system with N CPUs. diff --git a/configs/qemu_aarch64_virt_defconfig b/configs/qemu_aarch64_virt_defconfig index fb9db3f0fc..cb4811eba0 100644 --- a/configs/qemu_aarch64_virt_defconfig +++ b/configs/qemu_aarch64_virt_defconfig @@ -1,18 +1,24 @@ BR2_aarch64=y -BR2_PACKAGE_HOST_LINUX_HEADERS_CUSTOM_6_12=y +BR2_PACKAGE_HOST_LINUX_HEADERS_CUSTOM_6_13=y BR2_GLOBAL_PATCH_DIR="board/qemu/patches" BR2_DOWNLOAD_FORCE_CHECK_HASHES=y BR2_SYSTEM_DHCP="eth0" +BR2_ROOTFS_POST_BUILD_SCRIPT="board/qemu/post-build.sh" BR2_ROOTFS_POST_IMAGE_SCRIPT="board/qemu/post-image.sh" BR2_ROOTFS_POST_SCRIPT_ARGS="$(BR2_DEFCONFIG)" BR2_LINUX_KERNEL=y -BR2_LINUX_KERNEL_CUSTOM_VERSION=y -BR2_LINUX_KERNEL_CUSTOM_VERSION_VALUE="6.12.9" +BR2_LINUX_KERNEL_CUSTOM_GIT=y +BR2_LINUX_KERNEL_CUSTOM_REPO_URL="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git" +BR2_LINUX_KERNEL_CUSTOM_REPO_VERSION="e48e99b6edf41c69c5528aa7ffb2daf3c59ee105" +BR2_LINUX_KERNEL_CUSTOM_REPO_GIT_SUBMODULES=y BR2_LINUX_KERNEL_USE_CUSTOM_CONFIG=y BR2_LINUX_KERNEL_CUSTOM_CONFIG_FILE="board/qemu/aarch64-virt/linux.config" BR2_LINUX_KERNEL_NEEDS_HOST_OPENSSL=y +BR2_PACKAGE_VMBUG=y +BR2_PACKAGE_VMBUG_MODULE=y BR2_TARGET_ROOTFS_EXT2=y BR2_TARGET_ROOTFS_EXT2_4=y # BR2_TARGET_ROOTFS_TAR is not set BR2_PACKAGE_HOST_QEMU=y BR2_PACKAGE_HOST_QEMU_SYSTEM_MODE=y +BR2_PACKAGE_HOST_QEMU_VIRTFS=y diff --git a/configs/qemu_x86_64_defconfig b/configs/qemu_x86_64_defconfig index 7c7fc374d9..3ef05dd2a2 100644 --- a/configs/qemu_x86_64_defconfig +++ b/configs/qemu_x86_64_defconfig @@ -1,18 +1,23 @@ BR2_x86_64=y 
-BR2_PACKAGE_HOST_LINUX_HEADERS_CUSTOM_6_12=y +BR2_PACKAGE_HOST_LINUX_HEADERS_CUSTOM_6_13=y BR2_GLOBAL_PATCH_DIR="board/qemu/patches" BR2_DOWNLOAD_FORCE_CHECK_HASHES=y BR2_SYSTEM_DHCP="eth0" -BR2_ROOTFS_POST_BUILD_SCRIPT="board/qemu/x86_64/post-build.sh" +BR2_ROOTFS_POST_BUILD_SCRIPT="board/qemu/x86_64/post-build.sh board/qemu/post-build.sh" BR2_ROOTFS_POST_IMAGE_SCRIPT="board/qemu/post-image.sh" BR2_ROOTFS_POST_SCRIPT_ARGS="$(BR2_DEFCONFIG)" BR2_LINUX_KERNEL=y -BR2_LINUX_KERNEL_CUSTOM_VERSION=y -BR2_LINUX_KERNEL_CUSTOM_VERSION_VALUE="6.12.9" +BR2_LINUX_KERNEL_CUSTOM_GIT=y +BR2_LINUX_KERNEL_CUSTOM_REPO_URL="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git" +BR2_LINUX_KERNEL_CUSTOM_REPO_VERSION="e48e99b6edf41c69c5528aa7ffb2daf3c59ee105" +BR2_LINUX_KERNEL_CUSTOM_REPO_GIT_SUBMODULES=y BR2_LINUX_KERNEL_USE_CUSTOM_CONFIG=y BR2_LINUX_KERNEL_CUSTOM_CONFIG_FILE="board/qemu/x86_64/linux.config" BR2_LINUX_KERNEL_NEEDS_HOST_LIBELF=y +BR2_PACKAGE_VMBUG=y +BR2_PACKAGE_VMBUG_MODULE=y BR2_TARGET_ROOTFS_EXT2=y # BR2_TARGET_ROOTFS_TAR is not set BR2_PACKAGE_HOST_QEMU=y BR2_PACKAGE_HOST_QEMU_SYSTEM_MODE=y +BR2_PACKAGE_HOST_QEMU_VIRTFS=y diff --git a/linux/linux.hash b/linux/linux.hash index 10aaed1d3f..77e071d12d 100644 --- a/linux/linux.hash +++ b/linux/linux.hash @@ -15,3 +15,6 @@ sha256 b5539243f187e3d478d76d44ae13aab83952c94b885ad889df6fa9997e16a441 linux- sha256 fb5a425bd3b3cd6071a3a9aff9909a859e7c1158d54d32e07658398cd67eb6a0 COPYING sha256 f6b78c087c3ebdf0f3c13415070dd480a3f35d8fc76f3d02180a407c1c812f79 LICENSES/preferred/GPL-2.0 sha256 8e378ab93586eb55135d3bc119cce787f7324f48394777d00c34fa3d0be3303f LICENSES/exceptions/Linux-syscall-note + +# extra +sha256 d1847aa07dfcc0674d1a9ff7f40a3f49b023ab6613daf1184b4be6c4c649f4fc linux-e48e99b6edf41c69c5528aa7ffb2daf3c59ee105-git4.tar.gz diff --git a/package/Config.in b/package/Config.in index 4b7e474cac..56114cc259 100644 --- a/package/Config.in +++ b/package/Config.in @@ -162,6 +162,8 @@ menu "Debugging, profiling 
and benchmark" source "package/valgrind/Config.in" source "package/vmtouch/Config.in" source "package/whetstone/Config.in" + source "package/vmbug/Config.in" + source "package/vmbug-module/Config.in" endmenu menu "Development tools" diff --git a/package/vmbug-module/Config.in b/package/vmbug-module/Config.in new file mode 100644 index 0000000000..2b8881e23d --- /dev/null +++ b/package/vmbug-module/Config.in @@ -0,0 +1,5 @@ +config BR2_PACKAGE_VMBUG_MODULE + bool "vmbug module" + depends on BR2_LINUX_KERNEL + help + Linux Kernel Module for VMBUG demo. diff --git a/package/vmbug-module/Makefile b/package/vmbug-module/Makefile new file mode 100644 index 0000000000..260b640891 --- /dev/null +++ b/package/vmbug-module/Makefile @@ -0,0 +1,10 @@ +obj-m += $(addsuffix .o, vmbug-module) +ccflags-y := -DDEBUG -g -std=gnu99 -Wno-declaration-after-statement + +.PHONY: all clean + +all: + $(MAKE) -C '$(LINUX_DIR)' M='$(PWD)' modules + +clean: + $(MAKE) -C '$(LINUX_DIR)' M='$(PWD)' clean diff --git a/package/vmbug-module/vmbug-module.c b/package/vmbug-module/vmbug-module.c new file mode 100644 index 0000000000..0409275339 --- /dev/null +++ b/package/vmbug-module/vmbug-module.c @@ -0,0 +1,136 @@ +// vmbug kernel module +#include <linux/io.h> +#include <linux/miscdevice.h> +#include <linux/mm.h> +#include <linux/gfp.h> +#include <linux/module.h> + +#define DRIVER_NAME "vmbug" + +struct vmbug_drvdata { + struct miscdevice misc; + void *page_buffer; +}; + +static struct vmbug_drvdata *vmbug_global_data; + +static inline struct vmbug_drvdata *to_vmbug_drvdata(struct file *filp) +{ + return container_of(filp->private_data, struct vmbug_drvdata, misc); +} + +static ssize_t vmbug_read(struct file *filp, char __user *ptr, size_t len, loff_t *off) +{ + struct vmbug_drvdata *vmbug = to_vmbug_drvdata(filp); + + return simple_read_from_buffer(ptr, len, off, vmbug->page_buffer, PAGE_SIZE); +} + +static ssize_t vmbug_write(struct file *filp, const char __user *ptr, size_t len, loff_t *off) +{ 
+ struct vmbug_drvdata *vmbug = to_vmbug_drvdata(filp); + + return simple_write_to_buffer(vmbug->page_buffer, PAGE_SIZE, off, ptr, len); +} + +static int vmbug_mmap(struct file *filp, struct vm_area_struct *vma) +{ + struct vmbug_drvdata *vmbug = to_vmbug_drvdata(filp); + struct device *dev = vmbug->misc.this_device; + pgprot_t prot; + unsigned long user_addr, pfn; + void *kern_addr; + phys_addr_t phys_addr; + int ret; + + kern_addr = vmbug->page_buffer; + user_addr = vma->vm_start; + phys_addr = virt_to_phys(kern_addr); + pfn = phys_addr >> PAGE_SHIFT; + prot = vma->vm_page_prot; + + ret = remap_pfn_range(vma, user_addr, pfn, PAGE_SIZE, prot); + if (ret) { + dev_err(dev, "remap_pfn_range() failed\n"); + return ret; + } + + return 0; +} + +static const struct file_operations vmbug_fops = { + .owner = THIS_MODULE, + .read = vmbug_read, + .write = vmbug_write, + .mmap = vmbug_mmap, + .llseek = generic_file_llseek, +}; + +static int __init vmbug_init(void) +{ + struct vmbug_drvdata *vmbug; + int ret; + + vmbug = kmalloc(sizeof(*vmbug_global_data), GFP_KERNEL | __GFP_ZERO); + if (!vmbug) { + printk(KERN_ERR "failed to allocate memory for '%s'\n", DRIVER_NAME); + ret = -ENOMEM; + goto err_mem; + } + + vmbug->page_buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 0); + if (!vmbug->page_buffer) { + printk(KERN_ERR "failed to allocate page for '%s'\n", DRIVER_NAME); + ret = -ENOMEM; + goto err_page; + } + + SetPageReserved(virt_to_page(vmbug->page_buffer)); + + vmbug->misc.name = DRIVER_NAME; + vmbug->misc.minor = MISC_DYNAMIC_MINOR; + vmbug->misc.fops = &vmbug_fops; + vmbug->misc.mode = 0600; + + ret = misc_register(&vmbug->misc); + if (ret) { + printk(KERN_ERR "failed to register misc device '%s': %d\n", + DRIVER_NAME, ret); + goto err_misc; + } + + vmbug_global_data = vmbug; + + printk(KERN_INFO "%s: ready\n", DRIVER_NAME); + + return 0; + +err_misc: + free_pages((unsigned long)vmbug->page_buffer, 0); +err_page: + kfree(vmbug); +err_mem: + return ret; +} + 
+static void __exit vmbug_exit(void) +{ + struct vmbug_drvdata *vmbug; + + vmbug = vmbug_global_data; + if (!vmbug) + return; + vmbug_global_data = NULL; + + misc_deregister(&vmbug->misc); + + ClearPageReserved(virt_to_page(vmbug->page_buffer)); + + free_pages((unsigned long)vmbug->page_buffer, 0); + kfree(vmbug); +} + +module_init(vmbug_init) +module_exit(vmbug_exit) + +MODULE_LICENSE("GPL"); diff --git a/package/vmbug-module/vmbug-module.mk b/package/vmbug-module/vmbug-module.mk new file mode 100644 index 0000000000..22f625f3d7 --- /dev/null +++ b/package/vmbug-module/vmbug-module.mk @@ -0,0 +1,14 @@ +################################################################################ +# +# vmbug-module +# +################################################################################ + +VMBUG_MODULE_LINUX_LICENSE = GPL-2.0+ + +define VMBUG_MODULE_EXTRACT_CMDS + cp $(VMBUG_MODULE_PKGDIR)/vmbug-module.c $(VMBUG_MODULE_PKGDIR)/Makefile $(@D) +endef + +$(eval $(kernel-module)) +$(eval $(generic-package)) diff --git a/package/vmbug/Config.in b/package/vmbug/Config.in new file mode 100644 index 0000000000..5372f62591 --- /dev/null +++ b/package/vmbug/Config.in @@ -0,0 +1,5 @@ +config BR2_PACKAGE_VMBUG + bool "VMBUG userspace program" + depends on BR2_LINUX_KERNEL + help + Linux userspace VMbug program. 
diff --git a/package/vmbug/Makefile b/package/vmbug/Makefile new file mode 100644 index 0000000000..05b5075448 --- /dev/null +++ b/package/vmbug/Makefile @@ -0,0 +1,25 @@ +.PHONY: all clean install help mrproper + +.SUFFIXES: + +CFLAGS=-Wall -O2 -g + +all: vmbug + +vmbug: vmbug.o + $(CC) vmbug.o -o $@ $(LDFLAGS) + +%.o: %.c + $(CC) $(CFLAGS) -c $< -o $@ + +clean: + -rm -f vmbug *.o *~ + +install: + cp vmbug $(DESTDIR)/bin/vmbug + +help: + @echo "Available targets : all, install, clean, mrproper" + +mrproper: clean + rm -rf vmbug diff --git a/package/vmbug/vmbug.c b/package/vmbug/vmbug.c new file mode 100644 index 0000000000..70fdb2a8d2 --- /dev/null +++ b/package/vmbug/vmbug.c @@ -0,0 +1,202 @@ +#include <stdio.h> +#include <unistd.h> +#include <errno.h> +#include <string.h> +#include <stdlib.h> +#include <fcntl.h> +#include <alloca.h> +#include <sys/mman.h> +#include <stdbool.h> + +static int safe_open(const char *pathname, int flags, mode_t mode) +{ + int fd; + + fd = open(pathname, flags, mode); + if (fd == -1) { + perror("open failed"); + exit(EXIT_FAILURE); + } + return fd; +} + +static void safe_close(int fd) +{ + int ret; + + ret = close(fd); + if (ret == -1) { + perror("close failed"); + exit(EXIT_FAILURE); + } +} + +static void safe_write(int fd, const void *ptr, size_t size) +{ + ssize_t wrn; + + while (size) { + do { + wrn = write(fd, ptr, size); + } while (wrn == -1 && errno == EAGAIN); + if (wrn == -1 || (size_t)wrn > size) { + perror("write failed"); + exit(EXIT_FAILURE); + } + ptr += wrn; + size -= wrn; + } +} + +static void safe_read(int fd, void *ptr, size_t size) +{ + ssize_t rdn; + + while (size) { + do { + rdn = read(fd, ptr, size); + } while (rdn == -1 && errno == EAGAIN); + if (rdn == -1 || (size_t)rdn > size) { + perror("read failed"); + exit(EXIT_FAILURE); + } + ptr += rdn; + size -= rdn; + } +} + +static void safe_lseek(int fd, off_t off, int whence) +{ + off_t ret; + + ret = lseek(fd, off, whence); + if (ret == (off_t)-1) { + perror("lseek 
failed"); + exit(EXIT_FAILURE); + } +} + +static void *safe_mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset) +{ + void *ret; + + ret = mmap(addr, length, prot, flags, fd, offset); + if (ret == MAP_FAILED) { + perror("mmap failed"); + exit(EXIT_FAILURE); + } + return ret; +} + +#define SIZE 4096 + +int main(int argc, char *argv[]) +{ + int opt, fd, wfd; + char *buf; + const char *test_pattern, *devname, *write_file; + size_t test_pattern_sz, pagesz; + void *addr; + FILE *help_fp; + bool help_ok, use_buffer; + + /* allocate a single page buffer */ + pagesz = (size_t)getpagesize(); + buf = alloca(pagesz); + + /* default test pattern */ + test_pattern = "testing"; + test_pattern_sz = strlen(test_pattern); + devname = "/dev/vmbug"; + write_file = "vmbug.bin"; + use_buffer = false; + while ((opt = getopt(argc, argv, "d:t:w:bh?")) != -1) { + switch (opt) { + case 'd': + devname = optarg; + break; + case 't': + test_pattern = optarg; + test_pattern_sz = strlen(test_pattern); + if (test_pattern_sz > (size_t)pagesz) { + fprintf(stderr, "test-pattern too big size %zu (max = %zu)\n", test_pattern_sz, pagesz); + exit(EXIT_FAILURE); + } + break; + case 'w': + write_file = optarg; + break; + case 'b': + use_buffer = true; + break; + case '?': + case 'h': + default: /* '?' */ + help_ok = opt == '?' || opt == 'h'; + help_fp = help_ok ? stdout : stderr; + fprintf(help_fp, "Usage: %s [-t test-pattern] [-d device] [-w write-file] [-b]\n", argv[0]); + fprintf(help_fp, "Trigger user vm iterator bug\n"); + fprintf(help_fp, "\n"); + fprintf(help_fp, " -t Test pattern to use (default \"testing\")\n"); + fprintf(help_fp, " -d Device to use (default \"/dev/vmbug\")\n"); + fprintf(help_fp, " -w Write to file (default \"vmbug.bin\")\n"); + fprintf(help_fp, " -b Use user space buffer instead of direct write\n"); + exit(help_ok ? 
EXIT_SUCCESS : EXIT_FAILURE); + } + } + + printf("opening %s device\n", devname); + fd = safe_open(devname, O_RDWR, 0); + + safe_write(fd, test_pattern, test_pattern_sz); + safe_lseek(fd, 0, SEEK_SET); + + safe_read(fd, buf, test_pattern_sz); + + printf("readback of write '%.*s' is '%.*s'\n", + (int)test_pattern_sz, test_pattern, (int)test_pattern_sz, buf); + + if (memcmp(test_pattern, buf, test_pattern_sz)) { + errno = EINVAL; + perror("readback from device using read differs"); + exit(EXIT_FAILURE); + } + + printf("readback for read/write OK\n"); + memset(buf, 0, pagesz); + + printf("mmaping %s device (size = %zu)\n", devname, pagesz); + addr = safe_mmap(NULL, pagesz, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); + + printf("readback of mmap '%.*s' is '%.*s'\n", + (int)test_pattern_sz, test_pattern, (int)test_pattern_sz, (char *)addr); + + if (memcmp(test_pattern, addr, test_pattern_sz)) { + errno = EINVAL; + perror("readback from device using mmap differs"); + exit(EXIT_FAILURE); + } + + printf("readback for mmap OK\n"); + + /* open write file */ + printf("opening file %s for write\n", write_file); + wfd = safe_open(write_file, O_WRONLY | O_CREAT, 0660); + + if (!use_buffer) { + printf("direct mmap write to file\n"); + safe_write(wfd, addr, pagesz); + } else { + printf("copy to buffer and then write to file\n"); + memcpy(buf, addr, pagesz); + safe_write(wfd, buf, pagesz); + } + printf("write OK\n"); + + printf("closing file %s for write\n", write_file); + safe_close(wfd); + + safe_close(fd); + + return 0; +} diff --git a/package/vmbug/vmbug.mk b/package/vmbug/vmbug.mk new file mode 100644 index 0000000000..4906f2dc03 --- /dev/null +++ b/package/vmbug/vmbug.mk @@ -0,0 +1,21 @@ +################################################################################ +# +# vmbug +# +################################################################################ + +VMBUG_LICENSE = GPL-2.0+ + +define VMBUG_EXTRACT_CMDS + cp $(VMBUG_PKGDIR)/vmbug.c $(VMBUG_PKGDIR)/Makefile
$(@D) +endef + +define VMBUG_BUILD_CMDS + $(MAKE) $(TARGET_CONFIGURE_OPTS) -C $(@D) all +endef + +define VMBUG_INSTALL_TARGET_CMDS + $(INSTALL) -D -m 0755 $(@D)/vmbug $(TARGET_DIR)/usr/bin +endef + +$(eval $(generic-package)) -- 2.25.1 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch 2025-05-08 15:40 ` David Hildenbrand 2025-05-08 15:48 ` Pantelis Antoniou 2025-05-08 16:25 ` Pantelis Antoniou @ 2025-05-08 17:35 ` Jason Gunthorpe 2025-05-08 17:47 ` Pantelis Antoniou 2 siblings, 1 reply; 32+ messages in thread From: Jason Gunthorpe @ 2025-05-08 17:35 UTC (permalink / raw) To: David Hildenbrand Cc: Pantelis Antoniou, Peter Xu, Andrew Morton, mm-commits, wade.farnsworth, jhubbard, c.briere, artem.k, David Howells On Thu, May 08, 2025 at 05:40:15PM +0200, David Hildenbrand wrote: > > I don't think there was a deliberate decision here, but there was no > > conversion to remap_pfn_range(), the code (in DRM) was always there. > > > > The regression occurred when netfslib started using GUP for I/O and > > when filesystems switched to it we hit this case. > > Okay, so GUP and DRM always worked that way. They are essentially > incompatible at this point due to VM_PFNMAP. > > So netfslib requesting something that is impossible is the problem .. or > rather filesystems switching to that and not realizing the problem. > > Hmmm This patch definitely doesn't look very good as is. We *certainly* should not even be trying to touch the struct page of a VM_PFNMAP *at all*. By definition that is forbidden. It looks to me like vm_normal_page() already supports MIXEDMAP, so probably the better hotfix is to have DRM use MIXEDMAP if it is installing PFNs that it is willing to be used as struct page. But who knows if DRM can do that on arches that don't have PTE_SPECIAL.. Jason ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch 2025-05-08 17:35 ` Jason Gunthorpe @ 2025-05-08 17:47 ` Pantelis Antoniou 2025-05-08 18:01 ` Jason Gunthorpe 2025-05-08 18:02 ` David Hildenbrand 0 siblings, 2 replies; 32+ messages in thread From: Pantelis Antoniou @ 2025-05-08 17:47 UTC (permalink / raw) To: Jason Gunthorpe Cc: David Hildenbrand, Peter Xu, Andrew Morton, mm-commits, wade.farnsworth, jhubbard, c.briere, artem.k, David Howells On Thu, 8 May 2025 14:35:35 -0300 Jason Gunthorpe <jgg@ziepe.ca> wrote: Hi Jason, > On Thu, May 08, 2025 at 05:40:15PM +0200, David Hildenbrand wrote: > > > I don't think there was a deliberate decision here, but there was > > > no conversion to remap_pfn_range(), the code (in DRM) was always > > > there. > > > > > > The regression occurred when netfslib started using GUP for I/O > > > and when filesystems switched to it we hit this case. > > > > Okay, so GUP and DRM always worked that way. They are essentially > > incompatible at this point due to VM_PFNMAP. > > > > So netfslib requesting something that is impossible is the problem > > .. or rather filesystems switching to that and not realizing the > > problem. > > > > Hmmm > > This patch definitely doesn't look very good as is. > No argument praising its beauty from me. What is the right solution then? > We *certainly* should not even be trying to touch the struct page of a > VM_PFNMAP *at all*. By definition that is forbidden. > > It looks to me like vm_normal_page() already supports MIXEDMAP, so > probably the better hotfix is to have DRM use MIXEDMAP if it is > installing PFNs that it is willing to be used as struct page.
> > But who knows if DRM can do that on arches that don't have > PTE_SPECIAL.. > The question from me is why a __get_free_pages() area that is passed to remap_pfn_range() gets the PFNMAP bit set. Even if DRM sets the MIXEDMAP bit PFNMAP will still be set. > Jason > Regards -- Pantelis ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch 2025-05-08 17:47 ` Pantelis Antoniou @ 2025-05-08 18:01 ` Jason Gunthorpe 2025-05-08 18:02 ` David Hildenbrand 1 sibling, 0 replies; 32+ messages in thread From: Jason Gunthorpe @ 2025-05-08 18:01 UTC (permalink / raw) To: Pantelis Antoniou Cc: David Hildenbrand, Peter Xu, Andrew Morton, mm-commits, wade.farnsworth, jhubbard, c.briere, artem.k, David Howells On Thu, May 08, 2025 at 08:47:11PM +0300, Pantelis Antoniou wrote: > > But who knows if DRM can do that on arches that don't have > > PTE_SPECIAL.. > > > > The question from me is why a __get_free_pages() area that is passed > to remap_pfn_range() gets the PFNMAP bit set. > > Even if DRM sets the MIXEDMAP bit PFNMAP will still be set. remap_pfn_range() is the wrong way to install a PFN that has a working struct page. It sets the special bit, and it is unambiguously wrong to try to convert a special PTE to a struct page. The special bit means, without any doubt, that the PTE's address must never be resolved to a struct page, even if it has one. That is the very meaning of the special bit. I see some DRM drivers using MIXEDMAP and remap_pfn_range() together; that seems to just be creating a confusing mess. Having VM_PFNMAP|VM_MIXEDMAP set together is a nonsensical combination. Having struct page backed memory in a VMA marked with MIXEDMAP and with the PTE set as special is pointless. Jason ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch 2025-05-08 17:47 ` Pantelis Antoniou 2025-05-08 18:01 ` Jason Gunthorpe @ 2025-05-08 18:02 ` David Hildenbrand 2025-05-08 18:11 ` Pantelis Antoniou 1 sibling, 1 reply; 32+ messages in thread From: David Hildenbrand @ 2025-05-08 18:02 UTC (permalink / raw) To: Pantelis Antoniou, Jason Gunthorpe Cc: Peter Xu, Andrew Morton, mm-commits, wade.farnsworth, jhubbard, c.briere, artem.k, David Howells On 08.05.25 19:47, Pantelis Antoniou wrote: > On Thu, 8 May 2025 14:35:35 -0300 > Jason Gunthorpe <jgg@ziepe.ca> wrote: > > Hi Jason, > >> On Thu, May 08, 2025 at 05:40:15PM +0200, David Hildenbrand wrote: >>>> I don't think there was a deliberate decision here, but there was >>>> no conversion to remap_pfn_range(), the code (in DRM) was always >>>> there. >>>> >>>> The regression occurred when netfslib started using GUP for I/O >>>> and when filesystems switched to it we hit this case. >>> >>> Okay, so GUP and DRM always worked that way. They are essentially >>> incompatible at this point due to VM_PFNMAP. >>> >>> So netfslib requesting something that is impossible is the problem >>> .. or rather filesystems switching to that and not realizing the >>> problem. >>> >>> Hmmm >> >> This patch definitely doesn't look very good as is. >> > > No argument praising its beauty from me. > What is the right solution then? > >> We *certainly* should not even be trying to touch the struct page of a >> VM_PFNMAP *at all*. By definition that is forbidden.
>> >> It looks to me like vm_normal_page() already supports MIXEDMAP, so >> probably the better hotfix is to have DRM use MIXEDMAP if it is >> installing PFNs that it is willing to be used as struct page. >> >> But who knows if DRM can do that on arches that don't have >> PTE_SPECIAL.. >> > > The question from me is why a __get_free_pages() area that is passed > to remap_pfn_range() gets the PFNMAP bit set. remap *PFN* range is the wrong interface. You literally tell the system "map a PFN range and ignore any struct page" instead of "please map this refcounted page". -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 18:02 ` David Hildenbrand
@ 2025-05-08 18:11 ` Pantelis Antoniou
2025-05-08 18:26 ` David Hildenbrand
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Pantelis Antoniou @ 2025-05-08 18:11 UTC (permalink / raw)
To: David Hildenbrand
Cc: Jason Gunthorpe, Peter Xu, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On Thu, 8 May 2025 20:02:38 +0200
David Hildenbrand <david@redhat.com> wrote:

Hi David,

> On 08.05.25 19:47, Pantelis Antoniou wrote:
> > On Thu, 8 May 2025 14:35:35 -0300
> > Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > Hi Jason,
> >
> >> On Thu, May 08, 2025 at 05:40:15PM +0200, David Hildenbrand wrote:
> >>>> I don't think there was a deliberate decision here, but there was
> >>>> no conversion to remap_pfn_range(), the code (in DRM) was always
> >>>> there.
> >>>>
> >>>> The regression occurred when netfslib started using GUP for I/O
> >>>> and when filesystems switched to it we hit this case.
> >>>
> >>> Okay, so GUP and DRM always worked that way. They are essentially
> >>> incompatible at this point due to VM_PFNMAP.
> >>>
> >>> So netfslib requesting something that is impossible is the problem
> >>> .. or rather filesystems switching to that and not realizing the
> >>> problem.
> >>>
> >>> Hmmm
> >>
> >> This patch definitely doesn't look very good as is.
> >>
> >
> > No argument praising its beauty from me.
> > What is the right solution then?
> >
> >> We *certainly* should not be even trying to touch the struct page
> >> of a VM_PFNMAP *at all*. By definition that is forbidden.
> >>
> >> It looks to me like vm_normal_page() already supports MIXEDMAP, so
> >> probably the better hotfix is to have DRM use MIXEDMAP if it is
> >> installing PFNs that it is willing to be used as struct page.
> >>
> >> But who knows if DRM can do that on arches that don't have
> >> PTE_SPECIAL..
> >>
> >
> > The question from me is why a __get_free_pages() area that is passed
> > to remap_pfn_range() gets the PFNMAP bit set.
>
> remap *PFN* range is the wrong interface. You literally tell the
> system "map a PFN range and ignore any struct page" instead of
> "please map this refcounted page".
>

I agree, but it's not my code that's doing it.
This has been going on for more than a decade at this point.

Can we get a plan on how to go about fixing these issues correctly?

1. Drivers/subsystems (DRM in this case) are doing remap_pfn_range() to
map system memory with a page attached to user space.
Up until recently this was OK, since no-one tried to pin the pages for
any reason. It doesn't seem like this is the right way to do it.
What is the right way?

2. DRM in particular has no standardized way to handle mapping system
memory Buffer Objects (BOs) to userspace. Each driver is free to do its
own thing and does so. What is the right way to handle this case?

3. While we go about fixing it, this has caused a pretty significant
userspace regression, where the address space that those BOs reside in
cannot be used for I/O when a network filesystem is involved. I think
it's a matter of time before regular filesystems start using the same
method of pinning and doing I/O instead of using the filecache on fast
memory mediums.

Regards

-- Pantelis
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 18:11 ` Pantelis Antoniou
@ 2025-05-08 18:26 ` David Hildenbrand
2025-05-08 18:47 ` Peter Xu
2025-05-08 19:11 ` Jason Gunthorpe
2 siblings, 0 replies; 32+ messages in thread
From: David Hildenbrand @ 2025-05-08 18:26 UTC (permalink / raw)
To: Pantelis Antoniou
Cc: Jason Gunthorpe, Peter Xu, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells,
	Lorenzo Stoakes

> I agree, but it's not my code that's doing it.

I know. And it's not GUP that's broken.

Taking a ref on a page and returning a page when explicitly told not to
even look up a page (including if there is no page, or if the page
contains unrelated garbage -- e.g., from a memory hole during boot etc)
cannot possibly work.

And that's what GUP is all about.

> This has been going on for more than a decade at this point.
>
> Can we get a plan on how to go about fixing these issues correctly?
>
> 1. Drivers/subsystems (DRM in this case) are doing remap_pfn_range() to
> map system memory with a page attached to user space.
> Up until recently this was OK, since no-one tried to pin the pages for
> any reason. It doesn't seem like this is the right way to do it.
> What is the right way?

I recently had a similar discussion with Lorenzo. (CC)

Worth looking at

commit 8e553520596bbd5ce832e26e9d721e6a0c797b8b
Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Mon Mar 31 13:56:08 2025 +0100

    intel_th: avoid using deprecated page->mapping, index fields

    The struct page->mapping, index fields are deprecated and soon to be
    only available as part of a folio.

    It is likely the intel_th code which sets page->mapping, index is was
    implemented out of concern that some aspect of the page fault logic
    may encounter unexpected problems should they not.

    However, the appropriate interface for inserting kernel-allocated
    memory is vm_insert_page() in a VM_MIXEDMAP. By using the helper
    function vmf_insert_mixed() we can do this with minimal churn in the
    existing fault handler.
    ...

Take a look at how kernel/trace/ring_buffer.c and io_uring/memmap.c use
vm_insert_pages().

It can be done lazily during page faults using vmf_insert_mixed().

>
> 2. DRM in particular has no standardized way to handle mapping system
> memory Buffer Objects (BOs) to userspace. Each driver is free to do its
> own thing and does so. What is the right way to handle this case?

Probably this should all be properly refactored to map kernel
allocations as refcounted into user page tables.

>
> 3. While we go about fixing it, this has caused a pretty significant
> userspace regression, where the address space that those BOs reside in
> cannot be used for I/O when a network filesystem is involved. I think
> it's a matter of time before regular filesystems start using the same
> method of pinning and doing I/O instead of using the filecache on fast
> memory mediums.

I am surprised that whoever did that change didn't realize that this
simply doesn't work and never did work earlier :(

--
Cheers,

David / dhildenb
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 18:11 ` Pantelis Antoniou
2025-05-08 18:26 ` David Hildenbrand
@ 2025-05-08 18:47 ` Peter Xu
2025-05-08 19:04 ` David Hildenbrand
2025-05-08 19:11 ` Jason Gunthorpe
2 siblings, 1 reply; 32+ messages in thread
From: Peter Xu @ 2025-05-08 18:47 UTC (permalink / raw)
To: Pantelis Antoniou
Cc: David Hildenbrand, Jason Gunthorpe, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On Thu, May 08, 2025 at 09:11:36PM +0300, Pantelis Antoniou wrote:
> This has been going on for more than a decade at this point.

I wasn't aware how long, but I need to confess I am also aware of such in
virt context where drivers map these pages in PFNMAPs.. So KVM also has
similar cases happening in corner case setups.

>
> Can we get a plan on how to go about fixing these issues correctly?
>
> 1. Drivers/subsystems (DRM in this case) are doing remap_pfn_range() to
> map system memory with a page attached to user space.
> Up until recently this was OK, since no-one tried to pin the pages for
> any reason. It doesn't seem like this is the right way to do it.
> What is the right way?
>
> 2. DRM in particular has no standardized way to handle mapping system
> memory Buffer Objects (BOs) to userspace. Each driver is free to do its
> own thing and does so. What is the right way to handle this case?
>
> 3. While we go about fixing it, this has caused a pretty significant
> userspace regression, where the address space that those BOs reside in
> cannot be used for I/O when a network filesystem is involved. I think
> it's a matter of time before regular filesystems start using the same
> method of pinning and doing I/O instead of using the filecache on fast
> memory mediums.

I mentioned it elsewhere, but _if_ fixing all the drivers isn't possible in
the near future.. we could still have chance to not mess with GUP (in which
case PFNMAP is also working like that for so many years, likely what the
drivers do on abusing pages in PFNMAPs...).

That is supporting such special pages in iov_iter.  I attached such patch
below, not saying that we should merge it, but IMHO it's much better than
fiddling with gup here, and so far that's the best I can think if (and only
if it works for 9pfs's current zerocopy case).. I only smoked it, but I
didn't verify it.

PS: I was also thinking maybe if 9pfs is the only affected so far, I wonder
if we could try to fallback to cached RW when necessary, IIUC that's still
working, right (and is 9pfs a production-level fs)?  But I think that's not
as good as below if supporting that isn't extremely hard.

Thanks,

=======8<======
From cd4aa467e4b653b5bd8496b5123d65d06e2e3263 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Thu, 8 May 2025 14:35:24 -0400
Subject: [PATCH] iov_iter: Supports special PFNMAP for user buffer when pfn_valid()

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/mm.h |  1 +
 lib/iov_iter.c     | 60 +++++++++++++++++++++++++++++++++++++++++++++-
 mm/gup.c           | 16 +++++++++++++
 3 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 38e16c984b9a..f79fd14599fa 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1459,6 +1459,7 @@ static inline void put_page(struct page *page)
  */
 #define GUP_PIN_COUNTING_BIAS (1U << 10)
 
+int pin_user_page(struct page *page);
 void unpin_user_page(struct page *page);
 void unpin_folio(struct folio *folio);
 void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d9e19fb2dcf3..2eb24b2793f0 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1809,6 +1809,64 @@ static ssize_t iov_iter_extract_kvec_pages(struct iov_iter *i,
 	return size;
 }
 
+/**
+ * iov_iter_pin_user_pages() - pin user pages for iov iter ops
+ *
+ * @start:	starting user address
+ * @nr_pages:	number of pages from start to pin
+ * @gup_flags:	flags modifying pin behaviour
+ * @pages:	array that receives pointers to the pages pinned.
+ *		Should be at least nr_pages long.
+ *
+ * Almost a wrapper for pin_user_pages_fast(), but also supports PFNMAPs
+ * where in extremely rare cases there's actually struct page available
+ * (e.g. device drivers playing trick with PFNMAP by injecting allocated
+ * RAM pages).
+ */
+static inline int
+iov_iter_pin_user_pages(unsigned long start, int nr_pages,
+			unsigned int gup_flags, struct page **pages)
+{
+	struct follow_pfnmap_args args;
+	struct vm_area_struct *vma;
+	struct mm_struct *mm;
+	int res, ret;
+
+	res = pin_user_pages_fast(start, nr_pages, gup_flags, pages);
+
+	/* Normally, GUP should really work already.. */
+	if (likely(res > 0))
+		return res;
+
+	/*
+	 * This is to take care of an extremely rare case: retry in case if
+	 * it's a PFNMAP that has struct page backed.
+	 *
+	 * So far it does the minimum we need in the failure path.  It
+	 * assumes the PFNMAP entries must exist in the pgtables already,
+	 * and it resolves one PFN at a time.
+	 */
+	mm = current->mm;
+	mmap_read_lock(mm);
+	vma = vma_lookup(current->mm, start);
+	if (!vma)
+		goto out;
+
+	args.vma = vma;
+	args.address = start;
+
+	ret = follow_pfnmap_start(&args);
+	if (ret)
+		goto out;
+	/* Did we find a special page under VM_PFNMAP? */
+	if (pfn_valid(args.pfn) && pin_user_page(pfn_to_page(args.pfn)))
+		res = 1;
+	follow_pfnmap_end(&args);
+out:
+	mmap_read_unlock(mm);
+	return res;
+}
+
 /*
  * Extract a list of contiguous pages from a user iterator and get a pin on
  * each of them.  This should only be used if the iterator is user-backed
@@ -1846,7 +1904,7 @@ static ssize_t iov_iter_extract_user_pages(struct iov_iter *i,
 	maxpages = want_pages_array(pages, maxsize, offset, maxpages);
 	if (!maxpages)
 		return -ENOMEM;
-	res = pin_user_pages_fast(addr, maxpages, gup_flags, *pages);
+	res = iov_iter_pin_user_pages(addr, maxpages, gup_flags, *pages);
 	if (unlikely(res <= 0))
 		return res;
 	maxsize = min_t(size_t, maxsize, res * PAGE_SIZE - offset);
diff --git a/mm/gup.c b/mm/gup.c
index d3aac58862c0..eede221ac89a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -178,6 +178,22 @@ int __must_check try_grab_folio(struct folio *folio, int refs,
 	return 0;
 }
 
+/**
+ * pin_user_page() - dma-pinned a page
+ * @page: pointer to page to be pinned
+ *
+ * NOTE!  One should normally use pin_user_pages*() API instead.  This
+ * should be only useful in extremely special cases, like struct page under
+ * VM_PFNMAP.
+ *
+ * Returns: 0 if success, negative if pin failed
+ */
+int pin_user_page(struct page *page)
+{
+	return try_grab_folio(page_folio(page), 1, FOLL_PIN);
+}
+EXPORT_SYMBOL(pin_user_page);
+
 /**
  * unpin_user_page() - release a dma-pinned page
  * @page: pointer to page to be released
--
2.49.0

--
Peter Xu
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 18:47 ` Peter Xu
@ 2025-05-08 19:04 ` David Hildenbrand
2025-05-08 19:06 ` Jason Gunthorpe
2025-05-08 19:08 ` Peter Xu
0 siblings, 2 replies; 32+ messages in thread
From: David Hildenbrand @ 2025-05-08 19:04 UTC (permalink / raw)
To: Peter Xu, Pantelis Antoniou
Cc: Jason Gunthorpe, Andrew Morton, mm-commits, wade.farnsworth,
	jhubbard, c.briere, artem.k, David Howells

> +	/* Did we find a special page under VM_PFNMAP? */
> +	if (pfn_valid(args.pfn) && pin_user_page(pfn_to_page(args.pfn)))
> +		res = 1;
>

It's doing the same wrong thing at a different place.

--
Cheers,

David / dhildenb
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 19:04 ` David Hildenbrand
@ 2025-05-08 19:06 ` Jason Gunthorpe
2025-05-08 19:08 ` Peter Xu
1 sibling, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2025-05-08 19:06 UTC (permalink / raw)
To: David Hildenbrand
Cc: Peter Xu, Pantelis Antoniou, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On Thu, May 08, 2025 at 09:04:11PM +0200, David Hildenbrand wrote:
> > +	/* Did we find a special page under VM_PFNMAP? */
> > +	if (pfn_valid(args.pfn) && pin_user_page(pfn_to_page(args.pfn)))
> > +		res = 1;
> >
>
> It's doing the same wrong thing at a different place.

+1 this is a DRM problem, it must be fixed in DRM drivers.

Jason
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 19:04 ` David Hildenbrand
2025-05-08 19:06 ` Jason Gunthorpe
@ 2025-05-08 19:08 ` Peter Xu
2025-05-08 19:12 ` Jason Gunthorpe
2025-05-08 19:14 ` David Hildenbrand
1 sibling, 2 replies; 32+ messages in thread
From: Peter Xu @ 2025-05-08 19:08 UTC (permalink / raw)
To: David Hildenbrand
Cc: Pantelis Antoniou, Jason Gunthorpe, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On Thu, May 08, 2025 at 09:04:11PM +0200, David Hildenbrand wrote:
> It's doing the same wrong thing at a different place.

As I mentioned, I believe KVM has this wrong thing working so far.. and it
doesn't block us from going right ultimately.  It's a matter of time.

--
Peter Xu
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 19:08 ` Peter Xu
@ 2025-05-08 19:12 ` Jason Gunthorpe
2025-05-08 19:16 ` David Hildenbrand
2025-05-08 19:39 ` Peter Xu
2025-05-08 19:14 ` David Hildenbrand
1 sibling, 2 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2025-05-08 19:12 UTC (permalink / raw)
To: Peter Xu
Cc: David Hildenbrand, Pantelis Antoniou, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On Thu, May 08, 2025 at 03:08:19PM -0400, Peter Xu wrote:
> On Thu, May 08, 2025 at 09:04:11PM +0200, David Hildenbrand wrote:
> > It's doing the same wrong thing at a different place.
>
> As I mentioned, I believe KVM has this wrong thing working so far.. and it
> doesn't block us from going right ultimately.  It's a matter of time.

AFAIK KVM is doing this wonky thing using mmu notifiers, it doesn't
take page references on pte special pages..

Jason
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 19:12 ` Jason Gunthorpe
@ 2025-05-08 19:16 ` David Hildenbrand
2025-05-08 19:39 ` Peter Xu
1 sibling, 0 replies; 32+ messages in thread
From: David Hildenbrand @ 2025-05-08 19:16 UTC (permalink / raw)
To: Jason Gunthorpe, Peter Xu
Cc: Pantelis Antoniou, Andrew Morton, mm-commits, wade.farnsworth,
	jhubbard, c.briere, artem.k, David Howells

On 08.05.25 21:12, Jason Gunthorpe wrote:
> On Thu, May 08, 2025 at 03:08:19PM -0400, Peter Xu wrote:
>> On Thu, May 08, 2025 at 09:04:11PM +0200, David Hildenbrand wrote:
>>> It's doing the same wrong thing at a different place.
>>
>> As I mentioned, I believe KVM has this wrong thing working so far.. and it
>> doesn't block us from going right ultimately.  It's a matter of time.
>
> AFAIK KVM is doing this wonky thing using mmu notifiers, it doesn't
> take page references on pte special pages..

Ah, and vfio also doesn't grab a ref in that case IIUC.

--
Cheers,

David / dhildenb
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 19:12 ` Jason Gunthorpe
2025-05-08 19:16 ` David Hildenbrand
@ 2025-05-08 19:39 ` Peter Xu
1 sibling, 0 replies; 32+ messages in thread
From: Peter Xu @ 2025-05-08 19:39 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: David Hildenbrand, Pantelis Antoniou, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On Thu, May 08, 2025 at 04:12:15PM -0300, Jason Gunthorpe wrote:
> AFAIK KVM is doing this wonky thing using mmu notifiers, it doesn't
> take page references on pte special pages..

I checked the latest, I think you're right at least on the latest master
branch.. IIUC its behavior on refcounting changed only last year after
Sean's 3dd48ecfac7f ("KVM: Provide refcounted page as output field in
struct kvm_follow_pfn").

To me, taking the refcount has one tiny little "benefit" of avoiding the
UAF you mentioned in the other email.  But I agree the whole thing is still
pretty wonky.

--
Peter Xu
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 19:08 ` Peter Xu
2025-05-08 19:12 ` Jason Gunthorpe
@ 2025-05-08 19:14 ` David Hildenbrand
2025-05-08 19:19 ` Jason Gunthorpe
1 sibling, 1 reply; 32+ messages in thread
From: David Hildenbrand @ 2025-05-08 19:14 UTC (permalink / raw)
To: Peter Xu
Cc: Pantelis Antoniou, Jason Gunthorpe, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On 08.05.25 21:08, Peter Xu wrote:
> On Thu, May 08, 2025 at 09:04:11PM +0200, David Hildenbrand wrote:
>> It's doing the same wrong thing at a different place.
>
> As I mentioned, I believe KVM has this wrong thing working so far.. and it
> doesn't block us from going right ultimately.  It's a matter of time.

Yes, KVM has it wrong and vfio probably as well. And they are usually not
dealing with actual kernel allocations, but rather with MMIO ranges.

Sorry, no more hacks.

--
Cheers,

David / dhildenb
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 19:14 ` David Hildenbrand
@ 2025-05-08 19:19 ` Jason Gunthorpe
2025-05-08 19:34 ` David Hildenbrand
0 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2025-05-08 19:19 UTC (permalink / raw)
To: David Hildenbrand
Cc: Peter Xu, Pantelis Antoniou, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On Thu, May 08, 2025 at 09:14:38PM +0200, David Hildenbrand wrote:
> On 08.05.25 21:08, Peter Xu wrote:
> > On Thu, May 08, 2025 at 09:04:11PM +0200, David Hildenbrand wrote:
> > > It's doing the same wrong thing at a different place.
> >
> > As I mentioned, I believe KVM has this wrong thing working so far.. and it
> > doesn't block us from going right ultimately.  It's a matter of time.
>
> Yes, KVM has it wrong and vfio probably as well. And they are usually not
> dealing with actual kernel allocations, but rather with MMIO ranges.

vfio also doesn't take references on the things it pulls out of the
VMA. The vfio bug is different, it lets you take a pte special
phys_addr_t and reference it through the IOMMU without any
refcounting. So when the VMA is destroyed and the page free'd by the
GPU driver we just UAF it from VFIO through the iommu page
table. Woops.

What we are talking about here is very different from both kvm and
vfio, this is ignoring pte special and accessing the struct page
refcount anyhow. I certainly don't know of anything that is doing
that, though I didn't know about the old netfs stuff :\

Jason
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 19:19 ` Jason Gunthorpe
@ 2025-05-08 19:34 ` David Hildenbrand
2025-05-09 16:30 ` Pantelis Antoniou
0 siblings, 1 reply; 32+ messages in thread
From: David Hildenbrand @ 2025-05-08 19:34 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Peter Xu, Pantelis Antoniou, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On 08.05.25 21:19, Jason Gunthorpe wrote:
> On Thu, May 08, 2025 at 09:14:38PM +0200, David Hildenbrand wrote:
>> On 08.05.25 21:08, Peter Xu wrote:
>>> On Thu, May 08, 2025 at 09:04:11PM +0200, David Hildenbrand wrote:
>>>> It's doing the same wrong thing at a different place.
>>>
>>> As I mentioned, I believe KVM has this wrong thing working so far.. and it
>>> doesn't block us from going right ultimately.  It's a matter of time.
>>
>> Yes, KVM has it wrong and vfio probably as well. And they are usually not
>> dealing with actual kernel allocations, but rather with MMIO ranges.
>
> vfio also doesn't take references on the things it pulls out of the
> VMA. The vfio bug is different, it lets you take a pte special
> phys_addr_t and reference it through the IOMMU without any
> refcounting. So when the VMA is destroyed and the page free'd by the
> GPU driver we just UAF it from VFIO through the iommu page
> table. Woops.

Right. What is_invalid_reserved_pfn() does is check that if it has a
"struct page", that that one must be marked PG_reserved.

That PG_reserved check is a nasty check for MMIO pages or memory holes
part of a present memory section.

So at least it will reject anything that is just an ordinary kernel
allocation (!reserved).

--
Cheers,

David / dhildenb
^ permalink raw reply [flat|nested] 32+ messages in thread
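
The check David describes can be sketched as a tiny userspace model. This is an illustration under stated assumptions, not the vfio code: `toy_pfn` and `toy_is_invalid_reserved_pfn()` are hypothetical names, with `has_struct_page` standing in for a `pfn_valid()` result and `page_reserved` standing in for `PG_reserved`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a PFN; not a kernel type. */
struct toy_pfn {
	bool has_struct_page;	/* models the result of pfn_valid() */
	bool page_reserved;	/* models PG_reserved on that struct page */
};

/*
 * Model of the check as described above: a PFN passes only if it has no
 * struct page at all, or if its struct page is marked reserved.
 */
static bool toy_is_invalid_reserved_pfn(const struct toy_pfn *pfn)
{
	assert(pfn != NULL);
	if (pfn->has_struct_page)
		return pfn->page_reserved;
	return true;
}
```

Under this model an MMIO-like PFN (no struct page) and a reserved page both pass, while an ordinary kernel allocation (struct page present, not reserved) is rejected, which is the case the thread is concerned with.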
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 19:34 ` David Hildenbrand
@ 2025-05-09 16:30 ` Pantelis Antoniou
2025-05-09 17:11 ` John Hubbard
0 siblings, 1 reply; 32+ messages in thread
From: Pantelis Antoniou @ 2025-05-09 16:30 UTC (permalink / raw)
To: David Hildenbrand
Cc: Jason Gunthorpe, Peter Xu, Andrew Morton, mm-commits,
	wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On Thu, 8 May 2025 21:34:28 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 08.05.25 21:19, Jason Gunthorpe wrote:
> > On Thu, May 08, 2025 at 09:14:38PM +0200, David Hildenbrand wrote:
> >> On 08.05.25 21:08, Peter Xu wrote:
> >>> On Thu, May 08, 2025 at 09:04:11PM +0200, David Hildenbrand wrote:
> >>>> It's doing the same wrong thing at a different place.
> >>>
> >>> As I mentioned, I believe KVM has this wrong thing working so
> >>> far.. and it doesn't block us from going right ultimately.  It's
> >>> a matter of time.
> >>
> >> Yes, KVM has it wrong and vfio probably as well. And they are
> >> usually not dealing with actual kernel allocations, but rather
> >> with MMIO ranges.
> >
> > vfio also doesn't take references on the things it pulls out of the
> > VMA. The vfio bug is different, it lets you take a pte special
> > phys_addr_t and reference it through the IOMMU without any
> > refcounting. So when the VMA is destroyed and the page free'd by the
> > GPU driver we just UAF it from VFIO through the iommu page
> > table. Woops.
>
> Right. What is_invalid_reserved_pfn() does is check that if it has a
> "struct page", that that one must be marked PG_reserved.
>
> That PG_reserved check is a nasty check for MMIO pages or memory
> holes part of a present memory section.
>
> So at least it will reject anything that is just an ordinary kernel
> allocation (!reserved).
>

So what's the plan now?

DRM seems to be the first place to be fixed, however am I wrong in
thinking that most of the uses of remap_pfn_range() are wrong in the
context of system page backed memory?

Should we start with an implementation of something like remap_range()
which does not set PFNMAP bit. Its use is wrong in that context IMO.

And then move to DRM proper and replace the call to remap_pfn_range()
with it and see how far we go.

What do you think?

Regards

-- Pantelis
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-09 16:30 ` Pantelis Antoniou
@ 2025-05-09 17:11 ` John Hubbard
2025-05-09 17:33 ` Jason Gunthorpe
0 siblings, 1 reply; 32+ messages in thread
From: John Hubbard @ 2025-05-09 17:11 UTC (permalink / raw)
To: Pantelis Antoniou, David Hildenbrand
Cc: Jason Gunthorpe, Peter Xu, Andrew Morton, mm-commits,
	wade.farnsworth, c.briere, artem.k, David Howells

On 5/9/25 9:30 AM, Pantelis Antoniou wrote:
> On Thu, 8 May 2025 21:34:28 +0200
> David Hildenbrand <david@redhat.com> wrote:
...
> So what's the plan now?
>
> DRM seems to be the first place to be fixed, however am I wrong in
> thinking that most of the uses of remap_pfn_range() are wrong in the
> context of system page backed memory?
>
> Should we start with an implementation of something like remap_range()
> which does not set PFNMAP bit. Its use is wrong in that context IMO.
>
> And then move to DRM proper and replace the call to remap_pfn_range()
> with it and see how far we go.
>

That sounds like the right approach to me. Because the way we get into
these problems is mostly due to the lack of a clear example, and so
providing something correct to call is the way out.

thanks,
--
John Hubbard
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-09 17:11 ` John Hubbard
@ 2025-05-09 17:33 ` Jason Gunthorpe
2025-05-09 17:50 ` Pantelis Antoniou
0 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2025-05-09 17:33 UTC (permalink / raw)
To: John Hubbard
Cc: Pantelis Antoniou, David Hildenbrand, Peter Xu, Andrew Morton,
	mm-commits, wade.farnsworth, c.briere, artem.k, David Howells

On Fri, May 09, 2025 at 10:11:01AM -0700, John Hubbard wrote:
> On 5/9/25 9:30 AM, Pantelis Antoniou wrote:
> > On Thu, 8 May 2025 21:34:28 +0200
> > David Hildenbrand <david@redhat.com> wrote:
> ...
> > So what's the plan now?
> >
> > DRM seems to be the first place to be fixed, however am I wrong in
> > thinking that most of the uses of remap_pfn_range() are wrong in the
> > context of system page backed memory?
> >
> > Should we start with an implementation of something like remap_range()
> > which does not set PFNMAP bit. Its use is wrong in that context IMO.
> >
> > And then move to DRM proper and replace the call to remap_pfn_range()
> > with it and see how far we go.
> >
> That sounds like the right approach to me. Because the way we get into
> these problems is mostly due to the lack of a clear example, and so
> providing something correct to call is the way out.

Thing is if you want to just install struct page memory you don't need
MIXEDMAP or a special call, just insert the pages in the normal way.

The issue here seems to be that the DRM caller wants to mix and match
struct page and non-struct page memory, so I think you want an
entirely different API for managing effectively a scatter of two
different memory types and computing what the proper VMA flags should
be based on what was given.

OR DRM is actually using remap_pfn specifically because it does not want
3rd parties taking the page refcount because that destroys its
lifetime model..

Jason
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-09 17:33 ` Jason Gunthorpe
@ 2025-05-09 17:50 ` Pantelis Antoniou
2025-05-09 18:39 ` Jason Gunthorpe
0 siblings, 1 reply; 32+ messages in thread
From: Pantelis Antoniou @ 2025-05-09 17:50 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: John Hubbard, David Hildenbrand, Peter Xu, Andrew Morton,
	mm-commits, wade.farnsworth, c.briere, artem.k, David Howells

On Fri, 9 May 2025 14:33:14 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> On Fri, May 09, 2025 at 10:11:01AM -0700, John Hubbard wrote:
> > On 5/9/25 9:30 AM, Pantelis Antoniou wrote:
> > > On Thu, 8 May 2025 21:34:28 +0200
> > > David Hildenbrand <david@redhat.com> wrote:
> > ...
> > > So what's the plan now?
> > >
> > > DRM seems to be the first place to be fixed, however am I wrong in
> > > thinking that most of the uses of remap_pfn_range() are wrong in
> > > the context of system page backed memory?
> > >
> > > Should we start with an implementation of something like
> > > remap_range() which does not set PFNMAP bit. Its use is wrong in
> > > that context IMO.
> > >
> > > And then move to DRM proper and replace the call to
> > > remap_pfn_range() with it and see how far we go.
> > >
> > That sounds like the right approach to me. Because the way we get
> > into these problems is mostly due to the lack of a clear example,
> > and so providing something correct to call is the way out.
>
> Thing is if you want to just install struct page memory you don't need
> MIXEDMAP or a special call, just insert the pages in the normal way.
>
> The issue here seems to be that the DRM caller wants to mix and match
> struct page and non-struct page memory, so I think you want an
> entirely different API for managing effectively a scatter of two
> different memory types and computing what the proper VMA flags should
> be based on what was given.
>
> OR DRM is actually using remap_pfn specifically because it does not want
> 3rd parties taking the page refcount because that destroys its
> lifetime model..
>

To be frank our driver does not explicitly call remap_pfn_range(), I
just saw that there is use of it in other drivers and is found in many
tutorials for writing drivers that share memory with user-space.

However our driver is dependent on the drm_gem_mmap_obj() method where
all paths end up setting the PFNMAP bit, and the remap_pfn_range() method
is the easier way we could reproduce the bug in a simplified test-case.

A quick grep for vm_iomap_memory and remap_pfn_range in drivers/

$ git grep -e 'vm_iomap_memory\|remap_pfn_range' drivers/ | wc -l
92

I have no idea how many of those are operating on system memory pages.

From my understanding DRM takes full control of the lifecycle of the
buffer objects in question, so I don't think that the page refcount is
applicable here.

Turning that bit off could expose the pages to the swapper which I guess
would be pretty bad (maybe, not my particular area of expertise).

Do we have any DRM maintainers available to chime in about the page
lifecycle?

> Jason
>

Regards

-- Pantelis
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-09 17:50 ` Pantelis Antoniou
@ 2025-05-09 18:39 ` Jason Gunthorpe
0 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2025-05-09 18:39 UTC (permalink / raw)
To: Pantelis Antoniou
Cc: John Hubbard, David Hildenbrand, Peter Xu, Andrew Morton,
mm-commits, wade.farnsworth, c.briere, artem.k, David Howells

On Fri, May 09, 2025 at 08:50:22PM +0300, Pantelis Antoniou wrote:

> However our driver is dependent on the drm_gem_mmap_obj() method where
> all paths end up setting the PFNMAP bit, and the remap_pfn_range() method
> is the easier way we could reproduce the bug in a simplified test-case.

Not all paths, the obj->funcs->mmap() does not.

How do the pfns get installed in the flow you are looking at?

It seems to me that if the driver knows it is using CPU memory it
should follow:

  GEM objects can either provide a fault handler in their vm_op

And then in the fault handler use the normal vmf_insert_.* stuff and
never set any special VMA flags.

Otherwise it sounds like:

  or mmap the buffer memory synchronously after calling drm_gem_mmap_obj.

Means the driver called something like remap_pfn..

> Turning that bit off could expose the pages to the swapper which I guess
> would be pretty bad (maybe, not my particular area of expertise).

I think that's different..

Jason

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 18:11 ` Pantelis Antoniou
2025-05-08 18:26 ` David Hildenbrand
2025-05-08 18:47 ` Peter Xu
@ 2025-05-08 19:11 ` Jason Gunthorpe
2 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2025-05-08 19:11 UTC (permalink / raw)
To: Pantelis Antoniou
Cc: David Hildenbrand, Peter Xu, Andrew Morton, mm-commits,
wade.farnsworth, jhubbard, c.briere, artem.k, David Howells

On Thu, May 08, 2025 at 09:11:36PM +0300, Pantelis Antoniou wrote:

> 3. While we go about fixing it, this has caused a pretty significant
> userspace regression, where the address space that those BOs reside
> cannot be used for I/O when a network filesystem is involved. I think
> it's a matter of time when regular filesystems start using the same
> method of pinning and doing I/O instead of using the filecache on fast
> memory mediums.

Regular file systems already use GUP on O_DIRECT paths and already
didn't work basically forever. It seems like a kernel bug in the net
filesystems to have done something different in their O_DIRECT for so
long :\

Anyhow, the fixes must come from DRM using the mm properly, not by
hacking up the mm to ignore the well defined API rules we have. I
don't know enough about DRM to say exactly what that means, but that
is where you should be focusing your attention to fix it.

Somehow I suspect the number of places actually using O_DIRECT from a
network filesystem to a DRM buffer is going to be pretty small since
it never worked on a normal filesystem. Meaning this isn't some
general common code that is feeding generic files into GPUs, but
something custom built to only use a network that happened to stumble
onto this kernel bug and abuse it.

Do you know differently?

Jason

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch
2025-05-08 15:08 ` Peter Xu
2025-05-08 15:10 ` David Hildenbrand
@ 2025-05-08 15:17 ` Pantelis Antoniou
1 sibling, 0 replies; 32+ messages in thread
From: Pantelis Antoniou @ 2025-05-08 15:17 UTC (permalink / raw)
To: Peter Xu
Cc: Andrew Morton, mm-commits, wade.farnsworth, jhubbard, jgg, david,
c.briere, artem.k, David Howells

On Thu, 8 May 2025 11:08:14 -0400
Peter Xu <peterx@redhat.com> wrote:

> On Thu, May 08, 2025 at 05:36:12PM +0300, Pantelis Antoniou wrote:
> > On Thu, 8 May 2025 10:16:31 -0400
> > Peter Xu <peterx@redhat.com> wrote:
> >
> > Hi Peter,
>
> Hi, Pantelis,
>
> [...]
>
> > > > @@ -1271,8 +1287,6 @@ static int check_vma_flags(struct vm_are
> > > >  	int foreign = (gup_flags & FOLL_REMOTE);
> > > >  	bool vma_anon = vma_is_anonymous(vma);
> > > >
> > > > -	if (vm_flags & (VM_IO | VM_PFNMAP))
> > > > -		return -EFAULT;
> > >
> > > Is there any justification that this won't break some existing
> > > GUP users that may rely on properly failing at pfnmaps?
> > >
> > > IIUC netfs isn't the first one that wants to GUP on top of
> > > pfnmaps, KVM does it for years and so far it was processed in a
> > > standalone path:
> > >
> > > hva_to_pfn:
> > > 	else if (vma->vm_flags & (VM_IO | VM_PFNMAP)) {
> > > 		r = hva_to_pfn_remapped(vma, kfp, &pfn);
> > >
> > > That started with supporting real pfnmaps (with no page struct),
> > > but pfnmap with page structs can also happen afaict, and kvm
> > > processes that too by checking page==NULL ultimately, e.g. in
> > > kvm_release_faultin_page().
> > >
> >
> > I see. The problem is that we're not the owners of the code in
> > netfslib, and it is considerably more intrusive to fix things there.
> >
> > This is a hotfix for a userspace regression. I sort of agree that
> > having different handling for these areas in netfslib would be
> > ideal.
>
> Do you mean this used to work in older kernels? Some more info on the
> regression would be more than welcomed if so.. If it fixes a kernel
> regression, we may want a Fixes for whatever patch at last.
>

Yes, it used to work in older kernels, before filesystems like 9p
switched to using the accessors.

The problem is that there is not a single patch that I can point to as
the culprit. It took a long time to figure out but the timeline was:

1. Before any netfslib and 9pfs changes, I/O from remap_pfn_range()
ranges works.
2. netfslib accessors are merged in mainline. Userspace still works.
3. 9pfs picks up netfslib accessors, things break.

I doubt any kernel CI would have a test-case for it, because it is
quite esoteric.

We do have a relatively simple buildroot patch that we can share, that
exhibits the problem, and that contains a kernel, a kernel module
and a user space program that performs the I/O.

> Or do you mean it's a regression caused by userspace change?
>

No userspace changes.

> Thanks,
>

Regards

-- Pantelis

^ permalink raw reply [flat|nested] 32+ messages in thread
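As an aside for readers of this archive, the effect of the two lines
quoted in the hunk above can be illustrated with a small userspace
model. This is only a sketch and not kernel code: the function name
model_check_vma_flags() and its `patched` switch are hypothetical, the
flag values are assumed to mirror include/linux/mm.h, and the real
check_vma_flags() performs further checks that are not modelled. The
full patch may also change behaviour elsewhere; only the quoted removal
is represented.

```c
#include <assert.h>   /* for the usage checks shown after the block */
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag bits; assumed to mirror include/linux/mm.h. */
#define VM_PFNMAP 0x00000400UL  /* mapping may have no struct page behind it */
#define VM_IO     0x00004000UL  /* memory-mapped I/O or similar */

/*
 * Hypothetical userspace model of the test quoted in the hunk.
 * patched == false: GUP refuses VMAs carrying VM_IO or VM_PFNMAP.
 * patched == true:  the two quoted lines are gone, so such VMAs
 *                   are no longer rejected at this point.
 */
static int model_check_vma_flags(unsigned long vm_flags, bool patched)
{
	if (!patched && (vm_flags & (VM_IO | VM_PFNMAP)))
		return -EFAULT;

	return 0; /* remaining checks of the real function are not modelled */
}
```

Under these assumptions, a VMA set up by remap_pfn_range() (which the
patch description says turns on VM_IO | VM_PFNMAP) yields -EFAULT from
model_check_vma_flags(VM_IO | VM_PFNMAP, false) and 0 once `patched` is
true. That is the behaviour change Peter Xu is asking about: existing
GUP callers that rely on the failure would stop seeing it.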
end of thread, other threads:[~2025-05-09 18:39 UTC | newest]

Thread overview: 32+ messages
2025-05-07 21:55 + fix-zero-copy-i-o-on-__get_user_pages-allocated-pages.patch added to mm-hotfixes-unstable branch Andrew Morton
2025-05-08 14:16 ` Peter Xu
2025-05-08 14:36 ` Pantelis Antoniou
2025-05-08 15:08 ` Peter Xu
2025-05-08 15:10 ` David Hildenbrand
2025-05-08 15:27 ` Pantelis Antoniou
2025-05-08 15:40 ` David Hildenbrand
2025-05-08 15:48 ` Pantelis Antoniou
2025-05-08 16:25 ` Pantelis Antoniou
2025-05-08 17:35 ` Jason Gunthorpe
2025-05-08 17:47 ` Pantelis Antoniou
2025-05-08 18:01 ` Jason Gunthorpe
2025-05-08 18:02 ` David Hildenbrand
2025-05-08 18:11 ` Pantelis Antoniou
2025-05-08 18:26 ` David Hildenbrand
2025-05-08 18:47 ` Peter Xu
2025-05-08 19:04 ` David Hildenbrand
2025-05-08 19:06 ` Jason Gunthorpe
2025-05-08 19:08 ` Peter Xu
2025-05-08 19:12 ` Jason Gunthorpe
2025-05-08 19:16 ` David Hildenbrand
2025-05-08 19:39 ` Peter Xu
2025-05-08 19:14 ` David Hildenbrand
2025-05-08 19:19 ` Jason Gunthorpe
2025-05-08 19:34 ` David Hildenbrand
2025-05-09 16:30 ` Pantelis Antoniou
2025-05-09 17:11 ` John Hubbard
2025-05-09 17:33 ` Jason Gunthorpe
2025-05-09 17:50 ` Pantelis Antoniou
2025-05-09 18:39 ` Jason Gunthorpe
2025-05-08 19:11 ` Jason Gunthorpe
2025-05-08 15:17 ` Pantelis Antoniou