From: Alex Williamson <alex.williamson@redhat.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: Peter Xu <peterx@redhat.com>, <kvm@vger.kernel.org>,
<ajones@ventanamicro.com>, <kevin.tian@intel.com>,
<jgg@nvidia.com>
Subject: Re: [PATCH 2/2] vfio/pci: Use unmap_mapping_range()
Date: Tue, 28 May 2024 12:42:51 -0600 [thread overview]
Message-ID: <20240528124251.3c3dcfe4.alex.williamson@redhat.com> (raw)
In-Reply-To: <Zk/xlxpsDTYvCSUK@yzhao56-desk.sh.intel.com>
On Fri, 24 May 2024 09:47:03 +0800
Yan Zhao <yan.y.zhao@intel.com> wrote:
> On Thu, May 23, 2024 at 08:49:03PM -0400, Peter Xu wrote:
> > Hi, Yan,
> >
> > On Fri, May 24, 2024 at 08:39:37AM +0800, Yan Zhao wrote:
> > > On Thu, May 23, 2024 at 01:56:27PM -0600, Alex Williamson wrote:
> > > > With the vfio device fd tied to the address space of the pseudo fs
> > > > inode, we can use the mm to track all vmas that might be mmap'ing
> > > > device BARs, which removes our vma_list and all the complicated lock
> > > > ordering necessary to manually zap each related vma.
> > > >
> > > > Note that we can no longer store the pfn in vm_pgoff if we want to use
> > > > unmap_mapping_range() to zap a selective portion of the device fd
> > > > corresponding to BAR mappings.
> > > >
> > > > This also converts our mmap fault handler to use vmf_insert_pfn()
> > > Looks vmf_insert_pfn() does not call memtype_reserve() to reserve memory type
> > > for the PFN on x86 as what's done in io_remap_pfn_range().
> > >
> > > Instead, it just calls lookup_memtype() and determine the final prot based on
> > > the result from this lookup, which might not prevent others from reserving the
> > > PFN to other memory types.
> >
> > I didn't worry too much on others reserving the same pfn range, as that
> > should be the mmio region for this device, and this device should be owned
> > by vfio driver.
> >
> > However I share the same question, see:
> >
> > https://lore.kernel.org/r/20240523223745.395337-2-peterx@redhat.com
> >
> > So far I think it's not a major issue as VFIO always use UC- mem type, and
> > that's also the default. But I do also feel like there's something we can
> Right, but I feel that it may lead to inconsistency in reserved mem type if VFIO
> (or the variant driver) opts to use WC for certain BAR as mem type in future.
> Not sure if it will be true though :)
Does Kevin's comment[1] satisfy your concern? vfio_pci_core_mmap()
needs to make sure the PCI BAR region is requested before the mmap,
which is tracked via the barmap. Therefore the barmap is always setup
via pci_iomap() which will call memtype_reserve() with UC- attribute.
If there are any additional comments required to make this more clear
or outline steps for WC support in the future, please provide
suggestions. Thanks,
Alex
[1]https://lore.kernel.org/all/BN9PR11MB52764E958E6481A112649B5D8CF52@BN9PR11MB5276.namprd11.prod.outlook.com/
> > > Does that matter?
> > > > because we no longer have a vma_list to avoid the concurrency problem
> > > > with io_remap_pfn_range(). The goal is to eventually use the vm_ops
> > > > huge_fault handler to avoid the additional faulting overhead, but
> > > > vmf_insert_pfn_{pmd,pud}() need to learn about pfnmaps first.
> > > >
> > > > Also, Jason notes that a race exists between unmap_mapping_range() and
> > > > the fops mmap callback if we were to call io_remap_pfn_range() to
> > > > populate the vma on mmap. Specifically, mmap_region() does call_mmap()
> > > > before it does vma_link_file() which gives a window where the vma is
> > > > populated but invisible to unmap_mapping_range().
> > > >
> > >
> >
> > --
> > Peter Xu
> >
>
next prev parent reply other threads:[~2024-05-28 18:42 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-23 19:56 [PATCH 0/2] vfio/pci: vfio device address space mapping Alex Williamson
2024-05-23 19:56 ` [PATCH 1/2] vfio: Create vfio_fs_type with inode per device Alex Williamson
2024-05-24 13:24 ` Jason Gunthorpe
2024-05-29 23:59 ` Tian, Kevin
2024-05-23 19:56 ` [PATCH 2/2] vfio/pci: Use unmap_mapping_range() Alex Williamson
2024-05-24 0:39 ` Yan Zhao
2024-05-24 0:49 ` Peter Xu
2024-05-24 1:47 ` Yan Zhao
2024-05-28 18:42 ` Alex Williamson [this message]
2024-05-29 2:29 ` Yan Zhao
2024-05-29 3:12 ` Alex Williamson
2024-05-29 6:34 ` Yan Zhao
2024-05-29 16:50 ` Alex Williamson
2024-05-30 7:46 ` Yan Zhao
2024-05-24 8:40 ` Tian, Kevin
2024-05-24 13:22 ` Jason Gunthorpe
2024-05-24 23:15 ` Peter Xu
2024-05-24 13:42 ` Jason Gunthorpe
2024-05-30 0:09 ` Tian, Kevin
2024-05-30 2:22 ` Alex Williamson
2024-05-30 2:47 ` Tian, Kevin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240528124251.3c3dcfe4.alex.williamson@redhat.com \
--to=alex.williamson@redhat.com \
--cc=ajones@ventanamicro.com \
--cc=jgg@nvidia.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=peterx@redhat.com \
--cc=yan.y.zhao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox