All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Gavin Shan <gshan@redhat.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	x86@kernel.org, Ingo Molnar <mingo@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Alistair Popple <apopple@nvidia.com>,
	kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Sean Christopherson <seanjc@google.com>,
	Oscar Salvador <osalvador@suse.de>,
	Borislav Petkov <bp@alien8.de>, Zi Yan <ziy@nvidia.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yan Zhao <yan.y.zhao@intel.com>, Will Deacon <will@kernel.org>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH v2 06/19] mm/pagewalk: Check pfnmap for folio_walk_start()
Date: Thu, 29 Aug 2024 14:45:16 -0400	[thread overview]
Message-ID: <ZtDBvAbcT-QI65Vt@x1n> (raw)
In-Reply-To: <2123f339-2487-4b1c-abb1-313e9a012242@redhat.com>

On Thu, Aug 29, 2024 at 08:35:49AM +0200, David Hildenbrand wrote:
> Fortunately, we did an excellent job at documenting vm_normal_page():
> 
>  * There are 2 broad cases. Firstly, an architecture may define a pte_special()
>  * pte bit, in which case this function is trivial. Secondly, an architecture
>  * may not have a spare pte bit, which requires a more complicated scheme,
>  * described below.
>  *
>  * A raw VM_PFNMAP mapping (ie. one that is not COWed) is always considered a
>  * special mapping (even if there are underlying and valid "struct pages").
>  * COWed pages of a VM_PFNMAP are always normal.
>  *
>  * The way we recognize COWed pages within VM_PFNMAP mappings is through the
>  * rules set up by "remap_pfn_range()": the vma will have the VM_PFNMAP bit
>  * set, and the vm_pgoff will point to the first PFN mapped: thus every special
>  * mapping will always honor the rule
>  *
>  *	pfn_of_page == vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT)
>  *
>  * And for normal mappings this is false.
>  *
> 
> remap_pfn_range_notrack() will currently handle that for us:
> 
> if (is_cow_mapping(vma->vm_flags)) {
> 	if (addr != vma->vm_start || end != vma->vm_end)
> 		return -EINVAL;
> }
> 
> Even if [1] would succeed, the is_cow_mapping() check will return NULL and it will
> all work as expected, even without pte_special().

IMHO referencing vm_pgoff is ambiguous, and could be wrong, if without a
clear contract.

For example, consider when the driver setup a MAP_PRIVATE + VM_PFNMAP vma,
vm_pgoff to be not the "base PFN" but some random value, then for a COWed
page it's possible the calculation accidentally satisfies "pfn ==
vma->vm_pgoff + off".  Then it could wrongly return NULL rather than the
COWed anonymous page here.  This is extremely unlikely, but just to show
why it's wrong to reference it at all.

> 
> Because VM_PFNMAP is easy: in a !COW mapping, everything is special.

Yes it's safe for vfio-pci, as vfio-pci doesn't have private mappings.  But
still, I don't think it's clear enough now on how VM_PFNMAP should be
mapped.

-- 
Peter Xu


  reply	other threads:[~2024-08-29 18:45 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-26 20:43 [PATCH v2 00/19] mm: Support huge pfnmaps Peter Xu
2024-08-26 20:43 ` [PATCH v2 01/19] mm: Introduce ARCH_SUPPORTS_HUGE_PFNMAP and special bits to pmd/pud Peter Xu
2024-08-26 20:43 ` [PATCH v2 02/19] mm: Drop is_huge_zero_pud() Peter Xu
2024-08-26 20:43 ` [PATCH v2 03/19] mm: Mark special bits for huge pfn mappings when inject Peter Xu
2024-08-28 15:31   ` David Hildenbrand
2024-08-26 20:43 ` [PATCH v2 04/19] mm: Allow THP orders for PFNMAPs Peter Xu
2024-08-28 15:31   ` David Hildenbrand
2024-08-26 20:43 ` [PATCH v2 05/19] mm/gup: Detect huge pfnmap entries in gup-fast Peter Xu
2024-08-26 20:43 ` [PATCH v2 06/19] mm/pagewalk: Check pfnmap for folio_walk_start() Peter Xu
2024-08-28  7:44   ` David Hildenbrand
2024-08-28 14:24     ` Peter Xu
2024-08-28 15:30       ` David Hildenbrand
2024-08-28 19:45         ` Peter Xu
2024-08-28 23:46           ` Jason Gunthorpe
2024-08-29  6:35             ` David Hildenbrand
2024-08-29 18:45               ` Peter Xu [this message]
2024-08-29 15:10           ` David Hildenbrand
2024-08-29 18:49             ` Peter Xu
2024-08-26 20:43 ` [PATCH v2 07/19] mm/fork: Accept huge pfnmap entries Peter Xu
2024-08-29 15:10   ` David Hildenbrand
2024-08-29 18:26     ` Peter Xu
2024-08-29 19:44       ` David Hildenbrand
2024-08-29 20:01         ` Peter Xu
2024-09-02  7:58   ` Yan Zhao
2024-09-03 21:23     ` Peter Xu
2024-09-09 22:25       ` Andrew Morton
2024-09-09 22:43         ` Peter Xu
2024-09-09 23:15           ` Andrew Morton
2024-09-10  0:08             ` Peter Xu
2024-09-10  2:52               ` Yan Zhao
2024-09-10 12:16                 ` Peter Xu
2024-09-11  2:16                   ` Yan Zhao
2024-09-11 14:34                     ` Peter Xu
2024-08-26 20:43 ` [PATCH v2 08/19] mm: Always define pxx_pgprot() Peter Xu
2024-08-26 20:43 ` [PATCH v2 09/19] mm: New follow_pfnmap API Peter Xu
2024-08-26 20:43 ` [PATCH v2 10/19] KVM: Use " Peter Xu
2024-08-26 20:43 ` [PATCH v2 11/19] s390/pci_mmio: " Peter Xu
2024-08-26 20:43 ` [PATCH v2 12/19] mm/x86/pat: Use the new " Peter Xu
2024-08-26 20:43 ` [PATCH v2 13/19] vfio: " Peter Xu
2024-08-26 20:43 ` [PATCH v2 14/19] acrn: " Peter Xu
2024-08-26 20:43 ` [PATCH v2 15/19] mm/access_process_vm: " Peter Xu
2024-08-26 20:43 ` [PATCH v2 16/19] mm: Remove follow_pte() Peter Xu
2024-09-01  4:33   ` Yu Zhao
2024-09-01 13:39     ` David Hildenbrand
2024-08-26 20:43 ` [PATCH v2 17/19] mm/x86: Support large pfn mappings Peter Xu
2024-08-26 20:43 ` [PATCH v2 18/19] mm/arm64: " Peter Xu
2025-03-19 22:22   ` Keith Busch
2025-03-19 22:46     ` Peter Xu
2025-03-19 22:53       ` Keith Busch
2024-08-26 20:43 ` [PATCH v2 19/19] vfio/pci: Implement huge_fault support Peter Xu
2024-08-27 21:00   ` kernel test robot
2024-08-27 22:36 ` [PATCH v2 00/19] mm: Support huge pfnmaps Jiaqi Yan
2024-08-27 22:57   ` Peter Xu
2024-08-28  0:42     ` Jiaqi Yan
2024-08-28  0:46       ` Jiaqi Yan
2024-08-28 14:24       ` Jason Gunthorpe
2024-08-28 16:10         ` Jiaqi Yan
2024-08-28 23:49           ` Jason Gunthorpe
2024-08-29 19:21             ` Jiaqi Yan
2024-09-04 15:52               ` Jason Gunthorpe
2024-09-04 16:38                 ` Jiaqi Yan
2024-09-04 16:43                   ` Jason Gunthorpe
2024-09-04 16:58                     ` Jiaqi Yan
2024-09-04 17:00                       ` Jason Gunthorpe
2024-09-04 17:07                         ` Jiaqi Yan
2024-09-09  3:56                           ` Ankit Agrawal
2024-08-28 14:41       ` Peter Xu
2024-08-28 16:23         ` Jiaqi Yan
2024-09-09  4:03 ` Ankit Agrawal
2024-09-09 15:03   ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZtDBvAbcT-QI65Vt@x1n \
    --to=peterx@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=apopple@nvidia.com \
    --cc=axelrasmussen@google.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=gshan@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@redhat.com \
    --cc=osalvador@suse.de \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=wangkefeng.wang@huawei.com \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=yan.y.zhao@intel.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.