Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Yu Zhang <yu.c.zhang@linux.intel.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org, Peter Xu <peterx@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <rth@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
Date: Wed, 19 Dec 2018 10:57:17 +0800	[thread overview]
Message-ID: <20181219025717.6m72hq73p2haexkv@linux.intel.com> (raw)
In-Reply-To: <20181218155536.2b35a037@Igors-MacBook-Pro.local>

On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote:
> On Tue, 18 Dec 2018 17:27:23 +0800
> Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
> 
> > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote:
> > > On Wed, 12 Dec 2018 21:05:38 +0800
> > > Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
> > > 
> > > > Currently, vIOMMU is using the value of IOVA address width, instead of
> > > > the host address width(HAW) to calculate the number of reserved bits in
> > > > data structures such as root entries, context entries, and entries of
> > > > DMA paging structures etc.
> > > > 
> > > > However values of IOVA address width and of the HAW may not equal. For
> > > > example, a 48-bit IOVA can only be mapped to host addresses no wider than
> > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may result
> > > > in an invalid IOVA being accepted.
> > > > 
> > > > To fix this, a new field - haw_bits is introduced in struct IntelIOMMUState,
> > > > whose value is initialized based on the maximum physical address set to
> > > > guest CPU.
> > > 
> > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed
> > > > to clarify.
> > > > 
> > > > Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> > > > Reviewed-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > [...]
> > > 
> > > > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
> > > >  static void vtd_init(IntelIOMMUState *s)
> > > >  {
> > > >      X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
> > > > +    CPUState *cs = first_cpu;
> > > > +    X86CPU *cpu = X86_CPU(cs);
> > > >  
> > > >      memset(s->csr, 0, DMAR_REG_SIZE);
> > > >      memset(s->wmask, 0, DMAR_REG_SIZE);
> > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s)
> > > >      s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND |
> > > >               VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS |
> > > >               VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits);
> > > > -    if (s->aw_bits == VTD_HOST_AW_48BIT) {
> > > > +    if (s->aw_bits == VTD_AW_48BIT) {
> > > >          s->cap |= VTD_CAP_SAGAW_48bit;
> > > >      }
> > > >      s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
> > > > +    s->haw_bits = cpu->phys_bits;
> > > Is it possible to avoid accessing CPU fields directly or cpu altogether
> > > and set phys_bits when iommu is created?
> > 
> > Thanks for your comments, Igor.
> > 
> > Well, I guess you prefer not to query the CPU capabilities while deciding
> > the vIOMMU features. But to me, they are not that irrelevant.:)
> > 
> > Here the hardware address width in vt-d, and the one in cpuid.MAXPHYSADDR
> > are referring to the same concept. In VM, both are the maximum guest physical
> > address width. If we do not check the CPU field here, we will still have to
> > check the CPU field in other places such as build_dmar_q35(), and reset the
> > s->haw_bits again.
> > 
> > Is this explanation convincing enough? :)
> current build_dmar_q35() doesn't do it, it's all new code in this series that
> contains not acceptable direct access from one device (iommu) to another (cpu).   
> Proper way would be for the owner of iommu to fish limits from somewhere and set
> values during iommu creation.

Well, current build_dmar_q35() doesn't do it, because it is using the incorrect value. :)
According to the spec, the host address width is the maximum physical address width,
yet current implementation is using the DMA address width. For me, this is not only
wrong, but also unsecure. For this point, I think we all agree this need to be fixed.

As to how to fix it - should we query the cpu fields, I still do not understand why
this is not acceptable. :)

I had thought of other approaches before, yet I did not choose:

1> Introduce a new parameter, say, "x-haw-bits" which is used for iommu to limit its
physical address width(similar to the "x-aw-bits" for IOVA). But what should we check
this parameter or not? What if this parameter is set to sth. different than the "phys-bits"
or not?

2> Another choice I had thought of is, to query the physical iommu. I abandoned this
idea because my understanding is that vIOMMU is not a passthrued device, it is emulated.

So Igor, may I ask why you think checking against the cpu fields so not acceptable? :)

> 
> > > 
> > > Perhaps Eduardo
> > >  can suggest better approach, since he's more familiar with phys_bits topic
> > 
> > @Eduardo, any comments? Thanks!
> > 
> > > 
> > > >      /*
> > > >       * Rsvd field masks for spte
> > > >       */
> > > >      vtd_paging_entry_rsvd_field[0] = ~0ULL;
> > > > -    vtd_paging_entry_rsvd_field[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[5] = VTD_SPTE_LPAGE_L1_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[6] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[7] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[8] = VTD_SPTE_LPAGE_L4_RSVD_MASK(s->aw_bits);
> > > > +    vtd_paging_entry_rsvd_field[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[5] = VTD_SPTE_LPAGE_L1_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[6] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[7] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[8] = VTD_SPTE_LPAGE_L4_RSVD_MASK(s->haw_bits);
> > > >  
> > > >      if (x86_iommu->intr_supported) {
> > > >          s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
> > > > @@ -3261,10 +3268,10 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
> > > >      }
> > > >  
> > > >      /* Currently only address widths supported are 39 and 48 bits */
> > > > -    if ((s->aw_bits != VTD_HOST_AW_39BIT) &&
> > > > -        (s->aw_bits != VTD_HOST_AW_48BIT)) {
> > > > +    if ((s->aw_bits != VTD_AW_39BIT) &&
> > > > +        (s->aw_bits != VTD_AW_48BIT)) {
> > > >          error_setg(errp, "Supported values for x-aw-bits are: %d, %d",
> > > > -                   VTD_HOST_AW_39BIT, VTD_HOST_AW_48BIT);
> > > > +                   VTD_AW_39BIT, VTD_AW_48BIT);
> > > >          return false;
> > > >      }
> > > >  
> > > > diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> > > > index ed4e758..820451c 100644
> > > > --- a/include/hw/i386/intel_iommu.h
> > > > +++ b/include/hw/i386/intel_iommu.h
> > > > @@ -47,9 +47,9 @@
> > > >  #define VTD_SID_TO_DEVFN(sid)       ((sid) & 0xff)
> > > >  
> > > >  #define DMAR_REG_SIZE               0x230
> > > > -#define VTD_HOST_AW_39BIT           39
> > > > -#define VTD_HOST_AW_48BIT           48
> > > > -#define VTD_HOST_ADDRESS_WIDTH      VTD_HOST_AW_39BIT
> > > > +#define VTD_AW_39BIT                39
> > > > +#define VTD_AW_48BIT                48
> > > > +#define VTD_ADDRESS_WIDTH           VTD_AW_39BIT
> > > >  #define VTD_HAW_MASK(aw)            ((1ULL << (aw)) - 1)
> > > >  
> > > >  #define DMAR_REPORT_F_INTR          (1)
> > > > @@ -244,7 +244,8 @@ struct IntelIOMMUState {
> > > >      bool intr_eime;                 /* Extended interrupt mode enabled */
> > > >      OnOffAuto intr_eim;             /* Toggle for EIM cabability */
> > > >      bool buggy_eim;                 /* Force buggy EIM unless eim=off */
> > > > -    uint8_t aw_bits;                /* Host/IOVA address width (in bits) */
> > > > +    uint8_t aw_bits;                /* IOVA address width (in bits) */
> > > > +    uint8_t haw_bits;               /* Hardware address width (in bits) */
> > > >  
> > > >      /*
> > > >       * Protects IOMMU states in general.  Currently it protects the
> > > 
> > > 
> > 
> > B.R.
> > Yu
> 
> 

B.R.
Yu

next prev parent reply	other threads:[~2018-12-19  3:14 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-12 13:05 [Qemu-devel] [PATCH v3 0/2] intel-iommu: add support for 5-level virtual IOMMU Yu Zhang
2018-12-12 13:05 ` [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width Yu Zhang
2018-12-17 13:17   ` Igor Mammedov
2018-12-18  9:27     ` Yu Zhang
2018-12-18 14:23       ` Michael S. Tsirkin
2018-12-18 14:55       ` Igor Mammedov
2018-12-18 14:58         ` Michael S. Tsirkin
2018-12-19  3:03           ` Yu Zhang
2018-12-19  3:12             ` Michael S. Tsirkin
2018-12-19  6:28               ` Yu Zhang
2018-12-19 15:30                 ` Michael S. Tsirkin
2018-12-19  2:57         ` Yu Zhang [this message]
2018-12-19 10:40           ` Igor Mammedov
2018-12-19 16:47             ` Michael S. Tsirkin
2018-12-20  5:59               ` Yu Zhang
2018-12-20 21:18             ` Eduardo Habkost
2018-12-21 14:13               ` Igor Mammedov
2018-12-21 16:09                 ` Yu Zhang
2018-12-21 17:04                   ` Michael S. Tsirkin
2018-12-21 17:37                     ` Yu Zhang
2018-12-21 19:02                       ` Michael S. Tsirkin
2018-12-21 20:01                         ` Eduardo Habkost
2018-12-22  1:11                         ` Yu Zhang
2018-12-25 16:56                           ` Michael S. Tsirkin
2018-12-26  5:30                             ` Yu Zhang
2018-12-27 15:14                               ` Eduardo Habkost
2018-12-28  2:32                                 ` Yu Zhang
2018-12-29  1:29                                   ` Eduardo Habkost
2019-01-15  7:13                                     ` Yu Zhang
2019-01-18  7:10                                       ` Yu Zhang
2018-12-27 14:54                 ` Eduardo Habkost
2018-12-28 11:42                   ` Igor Mammedov
2018-12-20 20:58       ` Eduardo Habkost
2018-12-12 13:05 ` [Qemu-devel] [PATCH v3 2/2] intel-iommu: extend VTD emulation to allow 57-bit " Yu Zhang
2018-12-17 13:29   ` Igor Mammedov
2018-12-18  9:47     ` Yu Zhang
2018-12-18 10:01       ` Yu Zhang
2018-12-18 12:43         ` Michael S. Tsirkin
2018-12-18 13:45           ` Yu Zhang
2018-12-18 14:49             ` Michael S. Tsirkin
2018-12-19  3:40               ` Yu Zhang
2018-12-19  4:35                 ` Michael S. Tsirkin
2018-12-19  5:57                   ` Yu Zhang
2018-12-19 15:23                     ` Michael S. Tsirkin
2018-12-20  5:49                       ` Yu Zhang
2018-12-20 18:28                         ` Michael S. Tsirkin
2018-12-21 16:19                           ` Yu Zhang
2018-12-21 17:15                             ` Michael S. Tsirkin
2018-12-21 17:34                               ` Yu Zhang
2018-12-21 18:10                                 ` Michael S. Tsirkin
2018-12-22  0:41                                   ` Yu Zhang
2018-12-25 17:00                                     ` Michael S. Tsirkin
2018-12-26  5:58                                       ` Yu Zhang
2018-12-25  1:59                                 ` Tian, Kevin
2018-12-14  9:17 ` [Qemu-devel] [PATCH v3 0/2] intel-iommu: add support for 5-level virtual IOMMU Yu Zhang
2019-01-15  4:02 ` Michael S. Tsirkin
2019-01-15  7:27   ` Yu Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181219025717.6m72hq73p2haexkv@linux.intel.com \
    --to=yu.c.zhang@linux.intel.com \
    --cc=ehabkost@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).