From: "Michael S. Tsirkin" <mst@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Tian Kevin <kevin.tian@intel.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Jintack Lim <jintack@cs.columbia.edu>,
	Jason Wang <jasowang@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3 01/12] intel-iommu: send PSI always even if across PDEs
Date: Fri, 18 May 2018 00:00:38 +0300	[thread overview]
Message-ID: <20180517235838-mutt-send-email-mst@kernel.org>
In-Reply-To: <20180517085927.24925-2-peterx@redhat.com>

On Thu, May 17, 2018 at 04:59:16PM +0800, Peter Xu wrote:
> During IOVA page table walking, there is a special case when the PSI
> covers one whole PDE (Page Directory Entry, which contains 512 Page
> Table Entries) or more.  Previously we skipped that entry and did not
> notify the IOMMU notifiers.  This is not correct: we should send an
> UNMAP notification to the registered UNMAP notifiers in this case.
> 
> For UNMAP-only notifiers, this can leave IOTLB entries cached in the
> devices even after they have become invalid.  For MAP/UNMAP notifiers
> like vfio-pci, this will cause stale page mappings.
> 
> This special case doesn't trigger often, but it is easily triggered by
> nested device assignment, since in that case we may map the whole L2
> guest RAM region into the device's IOVA address space (several GBs at
> least), which is far bigger than what a normal kernel driver maps for
> the device (typically tens of MBs).
> 
> Without this patch applied to L1 QEMU, nested device assignment to L2
> guests will dump some errors like:
> 
> qemu-system-x86_64: VFIO_MAP_DMA: -17
> qemu-system-x86_64: vfio_dma_map(0x557305420c30, 0xad000, 0x1000,
>                     0x7f89a920d000) = -17 (File exists)
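
To put those sizes in perspective, here is a minimal stand-alone sketch
(not QEMU code; the constants and names are illustrative only) of how
much IOVA space a single PDE covers with 4KiB pages, and how many whole
PDEs a multi-GiB nested-assignment unmap therefore spans:

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    const uint64_t page_size    = 4096;                     /* 4 KiB leaf page size */
    const uint64_t ptes_per_pde = 512;                      /* 9 address bits per level */
    const uint64_t pde_span     = page_size * ptes_per_pde; /* 2 MiB per PDE */
    const uint64_t l2_guest_ram = 4ULL << 30;               /* assumed 4 GiB of L2 guest RAM */

    printf("one PDE covers %" PRIu64 " MiB of IOVA space\n", pde_span >> 20);
    printf("unmapping %" PRIu64 " GiB crosses %" PRIu64 " whole PDEs\n",
           l2_guest_ram >> 30, l2_guest_ram / pde_span);
    return 0;
}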

Could you add a description of the security implications, if any, and
their severity?

> Acked-by: Jason Wang <jasowang@redhat.com>
> [peterx: rewrite the commit message]
> Signed-off-by: Peter Xu <peterx@redhat.com>

Cc stable here? For any other patches as well?
> ---
>  hw/i386/intel_iommu.c | 42 ++++++++++++++++++++++++++++++------------
>  1 file changed, 30 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index fb31de9416..b359efd6f9 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -722,6 +722,15 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova, bool is_write,
>  
>  typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
>  
> +static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level,
> +                             vtd_page_walk_hook hook_fn, void *private)
> +{
> +    assert(hook_fn);
> +    trace_vtd_page_walk_one(level, entry->iova, entry->translated_addr,
> +                            entry->addr_mask, entry->perm);
> +    return hook_fn(entry, private);
> +}
> +
>  /**
>   * vtd_page_walk_level - walk over specific level for IOVA range
>   *
> @@ -781,28 +790,37 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
>           */
>          entry_valid = read_cur | write_cur;
>  
> +        entry.target_as = &address_space_memory;
> +        entry.iova = iova & subpage_mask;
> +        entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
> +        entry.addr_mask = ~subpage_mask;
> +
>          if (vtd_is_last_slpte(slpte, level)) {
> -            entry.target_as = &address_space_memory;
> -            entry.iova = iova & subpage_mask;
>              /* NOTE: this is only meaningful if entry_valid == true */
>              entry.translated_addr = vtd_get_slpte_addr(slpte, aw);
> -            entry.addr_mask = ~subpage_mask;
> -            entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
>              if (!entry_valid && !notify_unmap) {
>                  trace_vtd_page_walk_skip_perm(iova, iova_next);
>                  goto next;
>              }
> -            trace_vtd_page_walk_one(level, entry.iova, entry.translated_addr,
> -                                    entry.addr_mask, entry.perm);
> -            if (hook_fn) {
> -                ret = hook_fn(&entry, private);
> -                if (ret < 0) {
> -                    return ret;
> -                }
> +            ret = vtd_page_walk_one(&entry, level, hook_fn, private);
> +            if (ret < 0) {
> +                return ret;
>              }
>          } else {
>              if (!entry_valid) {
> -                trace_vtd_page_walk_skip_perm(iova, iova_next);
> +                if (notify_unmap) {
> +                    /*
> +                     * The whole entry is invalid; unmap it all.
> +                     * Translated address is meaningless, zero it.
> +                     */
> +                    entry.translated_addr = 0x0;
> +                    ret = vtd_page_walk_one(&entry, level, hook_fn, private);
> +                    if (ret < 0) {
> +                        return ret;
> +                    }
> +                } else {
> +                    trace_vtd_page_walk_skip_perm(iova, iova_next);
> +                }
>                  goto next;
>              }
>              ret = vtd_page_walk_level(vtd_get_slpte_addr(slpte, aw), iova,
> -- 
> 2.17.0
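
For readers less familiar with the notifier side, below is a minimal
stand-alone sketch of a callback in the spirit of vtd_page_walk_hook.
The struct and enum are simplified stand-ins for QEMU's IOMMUTLBEntry
and IOMMUAccessFlags (the real definitions live in QEMU's memory API),
and the hook body is hypothetical; it only illustrates that, with the
change above, a fully invalid non-leaf entry now reaches the hook as an
UNMAP (perm == IOMMU_NONE) covering the whole PDE range:

#include <stdio.h>
#include <inttypes.h>

typedef enum {
    IOMMU_NONE = 0,
    IOMMU_RO   = 1,
    IOMMU_WO   = 2,
    IOMMU_RW   = 3,
} IOMMUAccessFlagsSketch;

typedef struct {
    uint64_t iova;               /* start of the affected IOVA range      */
    uint64_t translated_addr;    /* zeroed for UNMAP notifications        */
    uint64_t addr_mask;          /* range size minus one (e.g. 2 MiB - 1) */
    IOMMUAccessFlagsSketch perm; /* IOMMU_NONE means unmap                */
} IOMMUTLBEntrySketch;

/* A hook in the spirit of vtd_page_walk_hook: unmap on IOMMU_NONE. */
static int example_hook(IOMMUTLBEntrySketch *entry, void *private)
{
    (void)private;
    if (entry->perm == IOMMU_NONE) {
        printf("UNMAP iova 0x%" PRIx64 " size 0x%" PRIx64 "\n",
               entry->iova, entry->addr_mask + 1);
        /* a real vfio listener would issue an unmap on the container here */
    } else {
        printf("MAP   iova 0x%" PRIx64 " -> 0x%" PRIx64 "\n",
               entry->iova, entry->translated_addr);
    }
    return 0;
}

int main(void)
{
    /* What the fixed walk delivers for one fully invalid 2 MiB PDE. */
    IOMMUTLBEntrySketch e = {
        .iova            = 0x40000000,
        .translated_addr = 0,
        .addr_mask       = (2ULL << 20) - 1,
        .perm            = IOMMU_NONE,
    };
    return example_hook(&e, NULL);
}

In the real vfio case that UNMAP branch would typically end up as a
VFIO_IOMMU_UNMAP_DMA ioctl on the container, which is what removes the
stale mapping and avoids the -17 (EEXIST) error quoted in the commit
message above.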

Thread overview: 27+ messages
2018-05-17  8:59 [Qemu-devel] [PATCH v3 00/12] intel-iommu: nested vIOMMU, cleanups, bug fixes Peter Xu
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 01/12] intel-iommu: send PSI always even if across PDEs Peter Xu
2018-05-17 21:00   ` Michael S. Tsirkin [this message]
2018-05-18  8:23   ` Auger Eric
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 02/12] intel-iommu: remove IntelIOMMUNotifierNode Peter Xu
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 03/12] intel-iommu: add iommu lock Peter Xu
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 04/12] intel-iommu: only do page walk for MAP notifiers Peter Xu
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 05/12] intel-iommu: introduce vtd_page_walk_info Peter Xu
2018-05-18  8:23   ` Auger Eric
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 06/12] intel-iommu: pass in address space when page walk Peter Xu
2018-05-18  8:23   ` Auger Eric
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 07/12] intel-iommu: trace domain id during " Peter Xu
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 08/12] util: implement simple iova tree Peter Xu
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 09/12] intel-iommu: maintain per-device iova ranges Peter Xu
2018-05-17  9:46   ` Peter Xu
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 10/12] intel-iommu: simplify page walk logic Peter Xu
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 11/12] intel-iommu: new vtd_sync_shadow_page_table_range Peter Xu
2018-05-17  8:59 ` [Qemu-devel] [PATCH v3 12/12] intel-iommu: new sync_shadow_page_table Peter Xu
2018-05-17 21:06   ` Michael S. Tsirkin
2018-05-18  6:22     ` Peter Xu
2018-05-17 19:49 ` [Qemu-devel] [PATCH v3 00/12] intel-iommu: nested vIOMMU, cleanups, bug fixes Jintack Lim
2018-05-18  6:26   ` Peter Xu
2018-05-18  6:28     ` Peter Xu
2018-05-17 21:04 ` Michael S. Tsirkin
2018-05-18  6:34   ` Peter Xu
2018-05-17 21:08 ` Michael S. Tsirkin
2018-05-18  6:30   ` Peter Xu
