public inbox for linux-kernel@vger.kernel.org
From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	iommu@lists.linux-foundation.org, Joerg Roedel <joro@8bytes.org>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.com>,
	Christoph Hellwig <hch@infradead.org>,
	Yi Liu <yi.l.liu@intel.com>, Raj Ashok <ashok.raj@intel.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	wangzhou1@hisilicon.com, zhangfei.gao@linaro.org,
	vkoul@kernel.org, jacob.jun.pan@linux.intel.com,
	David Woodhouse <dwmw2@infradead.org>
Subject: Re: [PATCH v4 1/2] iommu/sva: Tighten SVA bind API with explicit flags
Date: Tue, 11 May 2021 09:14:52 -0700
Message-ID: <20210511091452.721e9a03@jacob-builder>
In-Reply-To: <20210511114848.GK1002214@nvidia.com>

Hi Jason,

On Tue, 11 May 2021 08:48:48 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, May 10, 2021 at 08:31:45PM -0700, Jacob Pan wrote:
> > Hi Jason,
> > 
> > On Mon, 10 May 2021 20:37:49 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > On Mon, May 10, 2021 at 06:25:07AM -0700, Jacob Pan wrote:
> > >   
> > > > +/*
> > > > + * The IOMMU_SVA_BIND_SUPERVISOR flag requests a PASID which can be
> > > > used only
> > > > + * for access to kernel addresses. No IOTLB flushes are
> > > > automatically done
> > > > + * for kernel mappings; it is valid only for access to the kernel's
> > > > static
> > > > + * 1:1 mapping of physical memory — not to vmalloc or even module
> > > > mappings.
> > > > + * A future API addition may permit the use of such ranges, by
> > > > means of an
> > > > + * explicit IOTLB flush call (akin to the DMA API's unmap method).
> > > > + *
> > > > + * It is unlikely that we will ever hook into
> > > > flush_tlb_kernel_range() to
> > > > + * do such IOTLB flushes automatically.
> > > > + */
> > > > +#define IOMMU_SVA_BIND_SUPERVISOR       BIT(0)    
> > > 
> > > Huh? That isn't really SVA, can you call it something saner please?
> > >   
> > This is shared kernel virtual address, I am following the SVA lib naming
> > since this is where the flag will be used. Why this is not SVA? Kernel
> > virtual address is still virtual address. Is it due to direct map?  
> 
> As the above explains it doesn't actually synchronize the kernel's
> address space it just shoves the direct map into the IOMMU.
> 
There is no duplicate of the kernel direct map in the IOMMU.

> I suppose a different IOMMU implementation might point the PASID directly
> at the kernel's page table and avoid those limitations - but since
> that isn't portable it seems irrelevant.
> 
This is what we are doing here. We allocate a supervisor PASID and put
the kernel page table (init_mm pgd) in this PASID entry.

> Since the only thing it really maps is the direct map I would just
> call it direct_map, or all physical or something.
> 
Good idea. It makes clear to callers that they must use only direct map
memory for DMA.

> How does this interact with the DMA APIs?
The DMA API would use RID2PASID (PASID 0), so the two are separated by PASID.
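
To illustrate that separation, here is a minimal userspace sketch of a
PASID table in which requests without a PASID fall back to RID2PASID,
while a supervisor PASID points at the kernel's own page table. All the
names in it (struct pasid_entry, enum pt_kind, setup_pasid_table(),
lookup(), MAX_PASID, the stand-in pgd variables) are hypothetical and
chosen for illustration only; this is not the VT-d entry layout or the
driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; NOT the VT-d PASID entry layout. */
enum pt_kind {
	PT_NONE = 0,	/* entry not present */
	PT_IOMMU_OWN,	/* table maintained through DMA API map/unmap */
	PT_CPU_SHARED	/* CPU page table shared with the IOMMU */
};

struct pasid_entry {
	enum pt_kind kind;	/* which kind of table the entry points at */
	const void *pgd;	/* root of that table */
};

#define MAX_PASID	8u	/* tiny table, for the sketch only */
#define RID2PASID	0u	/* PASID applied to requests without a PASID */

static struct pasid_entry pasid_table[MAX_PASID];

/* Stand-ins for the DMA API domain's table and for init_mm's pgd. */
static int dma_api_pgd;
static int init_mm_pgd;

/* Install both entries; refuse a supervisor PASID that would clobber
 * RID2PASID or fall outside the table. Returns 0 on success, -1 otherwise. */
static int setup_pasid_table(uint32_t supervisor_pasid)
{
	if (supervisor_pasid == RID2PASID || supervisor_pasid >= MAX_PASID)
		return -1;

	pasid_table[RID2PASID] =
		(struct pasid_entry){ PT_IOMMU_OWN, &dma_api_pgd };
	pasid_table[supervisor_pasid] =
		(struct pasid_entry){ PT_CPU_SHARED, &init_mm_pgd };
	return 0;
}

/* Resolve a request to its entry; requests without a PASID use RID2PASID. */
static const struct pasid_entry *lookup(int has_pasid, uint32_t pasid)
{
	uint32_t idx = has_pasid ? pasid : RID2PASID;

	if (idx >= MAX_PASID || pasid_table[idx].kind == PT_NONE)
		return NULL;
	return &pasid_table[idx];
}
```

After setup_pasid_table(1), lookup(0, 0) resolves to the DMA API stand-in
and lookup(1, 1) to the init_mm stand-in, so the two paths never share an
entry in this model.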

> How do you get CPU cache
> flushing/etc into PASID operations that don't trigger IOMMU updates?
> 
Sorry, I am not following. This is used for direct map only.

> Honestly, I'm not convinced we should have "kernel SVA" at all.. Why
> does IDXD use normal DMA on the RID for kernel controlled accesses?
> 
Using SVA simplifies work submission: there is no need to do map/unmap.
Just bind the PASID with init_mm, then submit work directly, either with
ENQCMDS (the supervisor version of ENQCMD) to a shared workqueue or by
putting the supervisor PASID in the descriptor for a dedicated workqueue.
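
The difference between the two submission paths can be sketched in
userspace C as below. This is only a toy model under stated assumptions:
struct toy_desc, toy_dma_map(), toy_dma_unmap(), the prep_*/complete_*
helpers and TOY_IOVA_OFFSET are all hypothetical names, the descriptor
is not the IDXD layout, and no real DMA API or ENQCMDS call is made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor; real IDXD descriptors are laid out differently. */
struct toy_desc {
	int	 has_pasid;	/* 0 models a request identified by RID only */
	uint32_t pasid;		/* meaningful only when has_pasid is set */
	int	 priv;		/* 1 = supervisor (privileged) request */
	uint64_t src_addr;	/* address the device would read from */
};

/* Count the toy map/unmap calls so the two paths can be compared. */
static unsigned int map_calls, unmap_calls;

/* The fake I/O virtual address is the CPU address plus a fixed offset. */
#define TOY_IOVA_OFFSET	0x1000u

static uint64_t toy_dma_map(const void *buf)
{
	map_calls++;
	return (uint64_t)(uintptr_t)buf + TOY_IOVA_OFFSET;
}

static void toy_dma_unmap(uint64_t iova)
{
	(void)iova;
	unmap_calls++;
}

/* Stand-in for a buffer living in the kernel direct map. */
static char toy_buf[64];

/* DMA-API style path: map before submission, unmap on completion. */
static struct toy_desc prep_with_dma_api(const void *buf)
{
	struct toy_desc d = { 0, 0, 0, toy_dma_map(buf) };
	return d;
}

static int complete_with_dma_api(struct toy_desc d)
{
	toy_dma_unmap(d.src_addr);
	return 0;
}

/* Supervisor PASID path: the descriptor carries the kernel virtual
 * address as-is, so there is no per-request map or unmap step. */
static struct toy_desc prep_with_supervisor_pasid(const void *buf,
						  uint32_t pasid)
{
	struct toy_desc d = { 1, pasid, 1, (uint64_t)(uintptr_t)buf };
	return d;
}
```

In this model the supervisor path leaves map_calls and unmap_calls
untouched, which is the simplification described above; the buffer must
still come from the direct map.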

> > > Is it really a PASID that always has all of physical memory mapped
> > > into it? Sounds dangerous. What is it for?  
> > 
> > Yes. It is to bind DMA request w/ PASID with init_mm/init_top_pgt. Per
> > PCIe spec PASID TLP prefix has "Privileged Mode Requested" bit. VT-d
> > supports this with "Privileged-mode-Requested (PR) flag (to distinguish
> > user versus supervisor access)". Each PASID entry has a SRE (Supervisor
> > Request Enable) bit.  
> 
> The PR flag is only needed if the underlying IOMMU is directly
> processing the CPU page tables. For cases where the IOMMU is using its
> own page table format and has its own copies the PR flag shouldn't be
> used.
> 
We are doing the former. There are no IOMMU page tables for the direct
map.
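
As a rough illustration of how the PR flag and the SRE bit relate, the
gating can be modelled as below. This is a userspace sketch with
hypothetical names (struct toy_pasid_entry, struct toy_request,
request_allowed()); it models only the one check discussed here, a
supervisor request being blocked when SRE is clear, and leaves out the
page-level permission checks done during the table walk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified PASID entry: only the bit discussed here. */
struct toy_pasid_entry {
	bool present;
	bool sre;	/* Supervisor Request Enable */
};

/* Hypothetical request: only the privilege flag from the PASID prefix. */
struct toy_request {
	bool pr;	/* Privileged-mode-Requested */
};

/* Entry-level gate only; per-page user/supervisor checks are not modelled. */
static bool request_allowed(const struct toy_pasid_entry *e,
			    const struct toy_request *r)
{
	if (!e->present)
		return false;
	if (r->pr && !e->sre)
		return false;	/* supervisor request, SRE clear: blocked */
	return true;
}
```

So a supervisor PASID entry needs SRE set before a device request carrying
PR can reach the kernel page table through it.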

> Jason


Thanks,

Jacob
