public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: Peter Xu <peterx@redhat.com>, <kvm@vger.kernel.org>,
	<ajones@ventanamicro.com>, <kevin.tian@intel.com>,
	<jgg@nvidia.com>
Subject: Re: [PATCH 2/2] vfio/pci: Use unmap_mapping_range()
Date: Wed, 29 May 2024 10:50:12 -0600	[thread overview]
Message-ID: <20240529105012.39756143.alex.williamson@redhat.com> (raw)
In-Reply-To: <ZlbMb9F4+vNwTUDf@yzhao56-desk.sh.intel.com>

On Wed, 29 May 2024 14:34:23 +0800
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Tue, May 28, 2024 at 09:12:00PM -0600, Alex Williamson wrote:
> > On Wed, 29 May 2024 10:29:33 +0800
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >   
> > > On Tue, May 28, 2024 at 12:42:51PM -0600, Alex Williamson wrote:  
> > > > On Fri, 24 May 2024 09:47:03 +0800
> > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > >     
> > > > > On Thu, May 23, 2024 at 08:49:03PM -0400, Peter Xu wrote:    
> > > > > > Hi, Yan,
> > > > > > 
> > > > > > On Fri, May 24, 2024 at 08:39:37AM +0800, Yan Zhao wrote:      
> > > > > > > On Thu, May 23, 2024 at 01:56:27PM -0600, Alex Williamson wrote:      
> > > > > > > > With the vfio device fd tied to the address space of the pseudo fs
> > > > > > > > inode, we can use the mm to track all vmas that might be mmap'ing
> > > > > > > > device BARs, which removes our vma_list and all the complicated lock
> > > > > > > > ordering necessary to manually zap each related vma.
> > > > > > > > 
> > > > > > > > Note that we can no longer store the pfn in vm_pgoff if we want to use
> > > > > > > > unmap_mapping_range() to zap a selective portion of the device fd
> > > > > > > > corresponding to BAR mappings.
> > > > > > > > 
> > > > > > > > This also converts our mmap fault handler to use vmf_insert_pfn()      
> > > > > > > Looks vmf_insert_pfn() does not call memtype_reserve() to reserve memory type
> > > > > > > for the PFN on x86 as what's done in io_remap_pfn_range().
> > > > > > > 
> > > > > > > Instead, it just calls lookup_memtype() and determine the final prot based on
> > > > > > > the result from this lookup, which might not prevent others from reserving the
> > > > > > > PFN to other memory types.      
> > > > > > 
> > > > > > I didn't worry too much on others reserving the same pfn range, as that
> > > > > > should be the mmio region for this device, and this device should be owned
> > > > > > by vfio driver.
> > > > > > 
> > > > > > However I share the same question, see:
> > > > > > 
> > > > > > https://lore.kernel.org/r/20240523223745.395337-2-peterx@redhat.com
> > > > > > 
> > > > > > So far I think it's not a major issue as VFIO always use UC- mem type, and
> > > > > > that's also the default.  But I do also feel like there's something we can      
> > > > > Right, but I feel that it may lead to inconsistency in reserved mem type if VFIO
> > > > > (or the variant driver) opts to use WC for certain BAR as mem type in future.
> > > > > Not sure if it will be true though :)    
> > > > 
> > > > Does Kevin's comment[1] satisfy your concern?  vfio_pci_core_mmap()
> > > > needs to make sure the PCI BAR region is requested before the mmap,
> > > > which is tracked via the barmap.  Therefore the barmap is always setup
> > > > via pci_iomap() which will call memtype_reserve() with UC- attribute.    
> > > Just a question out of curiosity.
> > > Is this a must to call pci_iomap() in vfio_pci_core_mmap()?
> > > I don't see it or ioremap*() is called before nvgrace_gpu_mmap().  
> > 
> > nvgrace-gpu is exposing a non-PCI coherent memory region as a BAR, so
> > it doesn't request the PCI BAR region and is on it's own for read/write
> > access as well.  To mmap an actual PCI BAR it's required to request the  
> Thanks for explanation!
> So, if mmap happens before read/write, where is page memtype reserved?

For nvgrace-gpu?  The device for this variant driver only exists on ARM
platforms.  memtype_reserve() only exists on x86.  Thanks,

Alex


  reply	other threads:[~2024-05-29 16:50 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-23 19:56 [PATCH 0/2] vfio/pci: vfio device address space mapping Alex Williamson
2024-05-23 19:56 ` [PATCH 1/2] vfio: Create vfio_fs_type with inode per device Alex Williamson
2024-05-24 13:24   ` Jason Gunthorpe
2024-05-29 23:59   ` Tian, Kevin
2024-05-23 19:56 ` [PATCH 2/2] vfio/pci: Use unmap_mapping_range() Alex Williamson
2024-05-24  0:39   ` Yan Zhao
2024-05-24  0:49     ` Peter Xu
2024-05-24  1:47       ` Yan Zhao
2024-05-28 18:42         ` Alex Williamson
2024-05-29  2:29           ` Yan Zhao
2024-05-29  3:12             ` Alex Williamson
2024-05-29  6:34               ` Yan Zhao
2024-05-29 16:50                 ` Alex Williamson [this message]
2024-05-30  7:46                   ` Yan Zhao
2024-05-24  8:40       ` Tian, Kevin
2024-05-24 13:22         ` Jason Gunthorpe
2024-05-24 23:15           ` Peter Xu
2024-05-24 13:42   ` Jason Gunthorpe
2024-05-30  0:09   ` Tian, Kevin
2024-05-30  2:22     ` Alex Williamson
2024-05-30  2:47       ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240529105012.39756143.alex.williamson@redhat.com \
    --to=alex.williamson@redhat.com \
    --cc=ajones@ventanamicro.com \
    --cc=jgg@nvidia.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=peterx@redhat.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox