iommu.lists.linux-foundation.org archive mirror
From: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
To: Alex Williamson
	<alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: "iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org"
	<iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
Subject: Re: [RFC PATCH 1/3] iommu: introduce IOMMU_DOMAIN_HYP domain type for hypervisor allocation
Date: Tue, 1 Jul 2014 19:04:26 +0100
Message-ID: <20140701180426.GX28164@arm.com>
In-Reply-To: <1404236553.3225.93.camel-85EaTFmN5p//9pzu0YdTqQ@public.gmane.org>

Hi Alex,

Thanks for having a look.

On Tue, Jul 01, 2014 at 06:42:33PM +0100, Alex Williamson wrote:
> On Tue, 2014-07-01 at 17:10 +0100, Will Deacon wrote:
> > Some IOMMUs, such as the ARM SMMU, support two stages of translation.
> > The idea behind such a scheme is to allow a guest operating system to
> > use the IOMMU for DMA mappings in the first stage of translation, with
> > the hypervisor then installing mappings in the second stage to provide
> > isolation of the DMA to the physical range assigned to that virtual
> > machine.
> > 
> > In order to allow IOMMU domains to be allocated for second-stage
> > translation, this patch extends iommu_domain_alloc (and the associated
> > ->domain_init callback on struct iommu) to take a type parameter
> > indicating the intended purpose for the domain. The only supported types
> > at present are IOMMU_DOMAIN_DMA (i.e. what we have at the moment) and
> > IOMMU_DOMAIN_HYP, which instructs the backend driver to allocate and
> > initialise a second-stage domain, if possible.
> > 
> > All IOMMU drivers are updated to take the new type parameter, but it is
> > ignored at present. All callers of iommu_domain_alloc are also updated
> > to pass IOMMU_DOMAIN_DMA as the type parameter, apart from
> > kvm_iommu_map_guest, which passes the new IOMMU_DOMAIN_HYP flag.
> > 
> > Finally, a new IOMMU capability, IOMMU_CAP_HYP_MAPPING, is added so that
> > it is possible to check whether or not a domain is able to make use of
> > nested translation.
> 
> Why is this necessarily related to HYPervisor use?  It seems like this
> new domain type is effectively just a normal domain that supports some
> sort of fault handler that can call out to attempt to create missing
> mappings.

Not quite. The idea of this domain is that it provides isolation for a
guest, so I'd actually expect these domains to contain pinned mappings most
of the time (handling guest faults in the hypervisor is pretty error-prone).

Perhaps if I explain how the ARM SMMU works, that might help (and if it
doesn't, please reply saying so :). The ARM SMMU supports two stages of
translation:

  Stage-1: Guest VA (VA) -> Guest PA (IPA, or intermediate physical address)
  Stage-2: IPA -> Host Physical Address (PA)

These can be glued together to form nested translation, where an incoming VA
is translated through both stages to get a PA. Page table walks triggered at
stage-1 expect to see IPAs for the table addresses.
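
As a rough illustration of the two stages above, here is a toy C model of
nested translation. It is only a sketch: every name in it (toy_map,
toy_stage, toy_nested_translate, XLATE_FAULT) is made up for this example,
each stage is a flat array of 4K page mappings rather than a real page
table, and it does not model the stage-2 translation of the stage-1 table
walks themselves.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12
#define TOY_PAGE_MASK	((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)
#define XLATE_FAULT	UINT64_MAX	/* hypothetical "no mapping" marker */

/* One page-granular mapping: input page frame -> output page frame. */
struct toy_map {
	uint64_t in_pfn;
	uint64_t out_pfn;
};

/* A single translation stage, modelled as a flat array of mappings. */
struct toy_stage {
	const struct toy_map *maps;
	size_t nr_maps;
};

/* Translate one address through one stage, preserving the page offset. */
static uint64_t toy_stage_translate(const struct toy_stage *s, uint64_t addr)
{
	size_t i;

	for (i = 0; i < s->nr_maps; i++) {
		if (s->maps[i].in_pfn == (addr >> TOY_PAGE_SHIFT))
			return (s->maps[i].out_pfn << TOY_PAGE_SHIFT) |
			       (addr & TOY_PAGE_MASK);
	}

	return XLATE_FAULT;
}

/* Nested translation: VA -> IPA at stage-1, then IPA -> PA at stage-2. */
static uint64_t toy_nested_translate(const struct toy_stage *s1,
				     const struct toy_stage *s2, uint64_t va)
{
	uint64_t ipa = toy_stage_translate(s1, va);

	if (ipa == XLATE_FAULT)
		return XLATE_FAULT;

	return toy_stage_translate(s2, ipa);
}
```

In this model the guest would own the stage-1 array and the hypervisor the
stage-2 array, so a guest mapping can only ever reach PAs that stage-2
allows.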

An important thing to note here is that the hardware is configured
differently at each stage; the page table formats themselves are slightly
different (e.g. restricted permissions at stage-2) and certain hardware
contexts are only capable of stage-2 translation.

The way this is supposed to work is that the KVM host would install the VFIO
DMA mapping (ok, now I see why you don't like the name) at stage-2. This
allows the SMMU driver to allocate a corresponding stage-1 context for the
mapping and expose that directly to the guest as part of a virtual, stage-1-only
SMMU. Then the guest can install its own SMMU mappings at stage-1 for
contiguous DMA (in the guest VA space) without any knowledge of the hypervisor
mapping.

To do this successfully, we need to communicate the intention of the mapping
to the SMMU driver (i.e. stage-1 vs stage-2) at domain initialisation time.
I could just add ARM SMMU-specific properties there, but I thought this
might potentially be useful to others.
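
To make the "intention at domain initialisation time" point concrete, here
is a hedged sketch of how a backend might pick a stage from the type
parameter. This is not the actual arm-smmu code: toy_smmu_domain,
toy_smmu_stage, toy_smmu_domain_init and the hw_has_stage2 argument are
hypothetical, and mapping IOMMU_DOMAIN_DMA to stage-1 is an assumption made
purely for illustration. Only the two IOMMU_DOMAIN_* values come from the
patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Domain types as defined in the patch under discussion. */
#define IOMMU_DOMAIN_DMA	0x0
#define IOMMU_DOMAIN_HYP	0x1

/* Hypothetical driver-private state for this sketch. */
enum toy_smmu_stage {
	TOY_SMMU_STAGE_1,
	TOY_SMMU_STAGE_2,
};

struct toy_smmu_domain {
	enum toy_smmu_stage stage;
};

/*
 * Choose the translation stage once, when the domain is initialised.
 * Assumption: plain DMA domains use stage-1, HYP domains need stage-2.
 */
static int toy_smmu_domain_init(struct toy_smmu_domain *dom, int type,
				bool hw_has_stage2)
{
	switch (type) {
	case IOMMU_DOMAIN_DMA:
		dom->stage = TOY_SMMU_STAGE_1;
		return 0;
	case IOMMU_DOMAIN_HYP:
		if (!hw_has_stage2)
			return -ENODEV;	/* cannot provide a second stage */
		dom->stage = TOY_SMMU_STAGE_2;
		return 0;
	default:
		return -EINVAL;		/* unknown domain type */
	}
}
```

The stage has to be fixed here rather than at map time because, as noted
above, the hardware context and page table format differ between stages.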

> IOMMUs supporting PCI PRI (Page Request Interface) could
> potentially make use of something like that on bare metal or under
> hypervisor control.  If that's true, then could this be some sort of
> iommu_domain_set_l2_handler() that happens after the domain is
> allocated?

I'm not sure that's what I was aiming for... see above.

> For this patch, I don't understand why legacy KVM assignment would
> allocate a HYP domain while VFIO would use a DMA domain.  It seems like
> you're just counting on x86 never making the distinction between the
> two.

That's true, but I was also trying to indicate the intention of the mapping
so that other IOMMUs could potentially make use of the flags.

> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -49,6 +49,10 @@ struct iommu_domain_geometry {
> >  	bool force_aperture;       /* DMA only allowed in mappable range? */
> >  };
> >  
> > +/* iommu domain types */
> > +#define IOMMU_DOMAIN_DMA	0x0
> > +#define IOMMU_DOMAIN_HYP	0x1
> > +
> >  struct iommu_domain {
> >  	struct iommu_ops *ops;
> >  	void *priv;
> > @@ -59,6 +63,7 @@ struct iommu_domain {
> >  
> >  #define IOMMU_CAP_CACHE_COHERENCY	0x1
> >  #define IOMMU_CAP_INTR_REMAP		0x2	/* isolates device intrs */
> > +#define IOMMU_CAP_HYP_MAPPING		0x3	/* isolates guest DMA */
> 
> This makes no sense, it's exactly what we do with a "DMA" domain.  I
> think the code needs to focus on what is really different about this
> domain, not what is the expected use case.  Thanks,

The use-case is certainly relevant, though. I can do device passthrough
with a stage-1 mapping for example, but you wouldn't then be able to
instantiate a virtual SMMU interface in the guest.

I could rename these IOMMU_CAP_STAGE{1,2}, but that then sounds very
ARM-specific. What do you think?

Will
