* [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix
@ 2015-07-03 16:25 Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 01/16] x86/hvm: make sure emulation is retried if domain is shutting down Paul Durrant
                   ` (16 more replies)
  0 siblings, 17 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant

This patch series re-works much of the code involved in emulation of port
and memory mapped I/O for HVM guests.

The code has become very convoluted and, at least by inspection, it
appears that certain emulations will malfunction.

The series is broken down into 16 patches (which are also available in
my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
on the emulation33 branch).

Previous changelog
------------------

v4:
 - Removed previous patch (make sure translated MMIO reads or
   writes fall within a page) and rebased rest of series.
 - Address Jan's comments on patch #1

v3:
 - Addressed comments from Jan
 - Re-ordered series to bring a couple of more trivial patches to the
   front
 - Backport to XenServer (4.5) now passing automated tests
 - Tested on unstable with QEMU upstream and trad, with and without
   HAP (to force shadow emulation)

v2:
 - Removed bogus assertion from patch #15
 - Re-worked patch #17 after basic testing of back-port onto XenServer


Changelog (now per-patch)
-------------------------

0001-x86-hvm-make-sure-emulation-is-retried-if-domain-is-.patch

v6: Added Andrew's reviewed-by

v5: New patch to fix an issue on staging reported by Don Slutz


0002-x86-hvm-remove-multiple-open-coded-chunking-loops.patch

v6: Addressed Andrew's comments

v5: Addressed further comments from Jan


0003-x86-hvm-change-hvm_mmio_read_t-and-hvm_mmio_write_t-.patch

v6: Added Andrew's reviewed-by

v5: New patch to tidy up types


0004-x86-hvm-restrict-port-numbers-to-uint16_t-and-sizes-.patch

v6: Added Andrew's reviewed-by

v5: New patch to tidy up more types


0005-x86-hvm-unify-internal-portio-and-mmio-intercepts.patch

v6: Added Andrew's reviewed-by and made the modification requested
    by Roger

v5: Addressed further comments from Jan and simplified implementation
    by passing ioreq_t to accept() function


0006-x86-hvm-add-length-to-mmio-check-op.patch

v6: Added Andrew's reviewed-by

v5: Simplified by leaving mmio_check() implementation alone and
    calling to check last byte if first-byte check passes


0007-x86-hvm-unify-dpci-portio-intercept-with-standard-po.patch

v6: Added Andrew's reviewed-by

v5: Addressed further comments from Jan


0008-x86-hvm-unify-stdvga-mmio-intercept-with-standard-mm.patch

v6: Added Andrew's reviewed-by

v5: Fixed semantic problems pointed out by Jan


0009-x86-hvm-limit-reps-to-avoid-the-need-to-handle-retry.patch

v6: Added comment requested by Andrew

v5: Addressed further comments from Jan


0010-x86-hvm-only-call-hvm_io_assist-from-hvm_wait_for_io.patch

v6: Added Andrew's reviewed-by

v5: Added Jan's acked-by


0011-x86-hvm-split-I-O-completion-handling-from-state-mod.patch

v6: Added Andrew's reviewed-by

v5: Confirmed call to msix_write_completion() is in the correct place.


0012-x86-hvm-remove-HVMIO_dispatched-I-O-state.patch

v6: Added Andrew's reviewed-by

v5: Added some extra comments to the commit


0013-x86-hvm-remove-hvm_io_state-enumeration.patch

v6: Added Andrew's reviewed-by

v5: Added Jan's acked-by


0014-x86-hvm-use-ioreq_t-to-track-in-flight-state.patch

v6: Added Andrew's reviewed-by

v5: Added missing hunk with call to handle_pio()


0015-x86-hvm-always-re-emulate-I-O-from-a-buffer.patch

v6: Added Andrew's reviewed-by

v5: Added Jan's acked-by


0016-x86-hvm-track-large-memory-mapped-accesses-by-buffer.patch

v6: Added Andrew's reviewed-by

v5: Fixed to cache up to three distinct I/O emulations per instruction

Testing
-------

The series has been back-ported to staging-4.5 and then dropped onto the
XenServer (Dundee) patch queue. All automated branch-safety tests pass.

The series as-is has been manually tested with a Windows 7 (32-bit) VM
using upstream QEMU.


* [PATCH v6 01/16] x86/hvm: make sure emulation is retried if domain is shutting down
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

The addition of commit 2df1aa01 "x86/hvm: remove hvm_io_pending() check
in hvmemul_do_io()" causes a problem in migration because I/O that was
caught by the test of vcpu_start_shutdown_deferral() in
hvm_send_assist_req() is now considered completed rather than requiring
a retry.

This patch fixes the problem by having hvm_send_assist_req() return
X86EMUL_RETRY rather than X86EMUL_OKAY if the
vcpu_start_shutdown_deferral() test fails and then making sure that
the emulation state is reset if the domain is found to be shutting
down.
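
For illustration, the resulting flow in hvmemul_do_io() can be sketched
as below (condensed from the hunks that follow; not additional code):

    rc = hvm_send_assist_req(s, &p);
    if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
        vio->io_state = HVMIO_none;        /* reset emulation state */
    else if ( data_is_addr || dir == IOREQ_WRITE )
        rc = X86EMUL_OKAY;                 /* no response data needed */

When vcpu_start_shutdown_deferral() fails, hvm_send_assist_req() now
returns X86EMUL_RETRY and the domain is shutting down, so the state is
reset and the I/O is retried rather than being reported as complete.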

Reported-by: Don Slutz <don.slutz@gmail.com>
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |    2 +-
 xen/arch/x86/hvm/hvm.c     |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index fe5661d..8b60843 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -183,7 +183,7 @@ static int hvmemul_do_io(
         else
         {
             rc = hvm_send_assist_req(s, &p);
-            if ( rc != X86EMUL_RETRY )
+            if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
                 vio->io_state = HVMIO_none;
             else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 535d622..f0cf064 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2613,7 +2613,7 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
 
     ASSERT(s);
     if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
-        return X86EMUL_OKAY;
+        return X86EMUL_RETRY;
 
     list_for_each_entry ( sv,
                           &s->ioreq_vcpu_list,
-- 
1.7.10.4


* [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 01/16] x86/hvm: make sure emulation is retried if domain is shutting down Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-03 17:17   ` Andrew Cooper
  2015-07-08 15:52   ` Jan Beulich
  2015-07-03 16:25 ` [PATCH v6 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
                   ` (14 subsequent siblings)
  16 siblings, 2 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

...in hvmemul_read/write()

Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access() functions
to reduce code duplication.

NOTE: This patch also introduces a change in 'chunking' around a page
      boundary. Previously (for example) an 8 byte access at the last
      byte of a page would get carried out as 8 single-byte accesses.
      It will now be carried out as a single-byte access, followed by
      a 4-byte access, a 2-byte access and then another single-byte
      access.
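
      As an illustration of the new chunk selection (an editorial
      sketch based on hvmemul_phys_mmio_access() below, assuming
      fls() is Xen's 1-based most-significant-bit helper and
      sizeof(long) == 8):

          /* 8-byte access starting at the last byte of a page: the
           * linear layer issues the 1-byte head first, leaving 7
           * bytes on the next page.
           */
          unsigned int size = 7;
          unsigned int chunk = 1u << (fls(size) - 1); /* fls(7) == 3 -> chunk == 4 */
          if ( chunk > sizeof(long) )
              chunk = sizeof(long);                   /* no effect for chunk == 4 */
          /* After each completed chunk, halve until it fits the
           * remainder, giving a 4-byte, then a 2-byte, then a
           * 1-byte access.
           */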

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |  223 +++++++++++++++++++++++---------------------
 1 file changed, 116 insertions(+), 107 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8b60843..b823d84 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -539,6 +539,117 @@ static int hvmemul_virtual_to_linear(
     return X86EMUL_EXCEPTION;
 }
 
+static int hvmemul_phys_mmio_access(
+    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer)
+{
+    unsigned long one_rep = 1;
+    unsigned int chunk;
+    int rc;
+
+    /* Accesses must fall within a page. */
+    BUG_ON((gpa & ~PAGE_MASK) + size > PAGE_SIZE);
+
+    /*
+     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
+     * accesses larger than sizeof(long), so choose the highest power
+     * of 2 not exceeding sizeof(long) as the 'chunk' size.
+     */
+    ASSERT(size != 0);
+    chunk = 1u << (fls(size) - 1);
+    if ( chunk > sizeof (long) )
+        chunk = sizeof (long);
+
+    for ( ;; )
+    {
+        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                    buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        /* Advance to the next chunk. */
+        gpa += chunk;
+        buffer += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        /*
+         * If the chunk now exceeds the remaining size, choose the next
+         * lowest power of 2 that will fit.
+         */
+        while ( chunk > size )
+            chunk >>= 1;
+    }
+
+    return rc;
+}
+
+static int hvmemul_linear_mmio_access(
+    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
+{
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    unsigned long offset = gla & ~PAGE_MASK;
+    unsigned int chunk;
+    paddr_t gpa;
+    unsigned long one_rep = 1;
+    int rc;
+
+    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
+
+    if ( known_gpfn )
+        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
+    else
+    {
+        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                    hvmemul_ctxt);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+    }
+
+    for ( ;; )
+    {
+        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        gla += chunk;
+        buffer += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        ASSERT((gla & ~PAGE_MASK) == 0);
+        chunk = min_t(unsigned int, size, PAGE_SIZE);
+        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                    hvmemul_ctxt);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+    }
+
+    return rc;
+}
+
+static inline int hvmemul_linear_mmio_read(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_READ, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
+static inline int hvmemul_linear_mmio_write(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_WRITE, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
 static int __hvmemul_read(
     enum x86_segment seg,
     unsigned long offset,
@@ -549,51 +660,19 @@ static int __hvmemul_read(
 {
     struct vcpu *curr = current;
     unsigned long addr, reps = 1;
-    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
     uint32_t pfec = PFEC_page_present;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    paddr_t gpa;
     int rc;
 
     rc = hvmemul_virtual_to_linear(
         seg, offset, bytes, &reps, access_type, hvmemul_ctxt, &addr);
     if ( rc != X86EMUL_OKAY )
         return rc;
-    off = addr & (PAGE_SIZE - 1);
-    /*
-     * We only need to handle sizes actual instruction operands can have. All
-     * such sizes are either powers of 2 or the sum of two powers of 2. Thus
-     * picking as initial chunk size the largest power of 2 not greater than
-     * the total size will always result in only power-of-2 size requests
-     * issued to hvmemul_do_mmio() (hvmemul_do_io() rejects non-powers-of-2).
-     */
-    while ( chunk & (chunk - 1) )
-        chunk &= chunk - 1;
-    if ( off + bytes > PAGE_SIZE )
-        while ( off & (chunk - 1) )
-            chunk >>= 1;
-
     if ( ((access_type != hvm_access_insn_fetch
            ? vio->mmio_access.read_access
            : vio->mmio_access.insn_fetch)) &&
          (vio->mmio_gva == (addr & PAGE_MASK)) )
-    {
-        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
-        while ( (off + chunk) <= PAGE_SIZE )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                return rc;
-            addr += chunk;
-            off += chunk;
-            gpa += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-        }
-    }
+        return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, hvmemul_ctxt, 1);
 
     if ( (seg != x86_seg_none) &&
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
@@ -612,30 +691,8 @@ static int __hvmemul_read(
     case HVMCOPY_bad_gfn_to_mfn:
         if ( access_type == hvm_access_insn_fetch )
             return X86EMUL_UNHANDLEABLE;
-        rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                    hvmemul_ctxt);
-        while ( rc == X86EMUL_OKAY )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                break;
-            addr += chunk;
-            off += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-            if ( off < PAGE_SIZE )
-                gpa += chunk;
-            else
-            {
-                rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                            hvmemul_ctxt);
-                off = 0;
-            }
-        }
-        return rc;
+
+        return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);
     case HVMCOPY_gfn_paged_out:
     case HVMCOPY_gfn_shared:
         return X86EMUL_RETRY;
@@ -701,43 +758,18 @@ static int hvmemul_write(
         container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
     struct vcpu *curr = current;
     unsigned long addr, reps = 1;
-    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
     uint32_t pfec = PFEC_page_present | PFEC_write_access;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    paddr_t gpa;
     int rc;
 
     rc = hvmemul_virtual_to_linear(
         seg, offset, bytes, &reps, hvm_access_write, hvmemul_ctxt, &addr);
     if ( rc != X86EMUL_OKAY )
         return rc;
-    off = addr & (PAGE_SIZE - 1);
-    /* See the respective comment in __hvmemul_read(). */
-    while ( chunk & (chunk - 1) )
-        chunk &= chunk - 1;
-    if ( off + bytes > PAGE_SIZE )
-        while ( off & (chunk - 1) )
-            chunk >>= 1;
 
     if ( vio->mmio_access.write_access &&
          (vio->mmio_gva == (addr & PAGE_MASK)) )
-    {
-        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
-        while ( (off + chunk) <= PAGE_SIZE )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                return rc;
-            addr += chunk;
-            off += chunk;
-            gpa += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-        }
-    }
+        return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 1);
 
     if ( (seg != x86_seg_none) &&
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
@@ -752,30 +784,7 @@ static int hvmemul_write(
     case HVMCOPY_bad_gva_to_gfn:
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
-        rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                    hvmemul_ctxt);
-        while ( rc == X86EMUL_OKAY )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                break;
-            addr += chunk;
-            off += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-            if ( off < PAGE_SIZE )
-                gpa += chunk;
-            else
-            {
-                rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                            hvmemul_ctxt);
-                off = 0;
-            }
-        }
-        return rc;
+        return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);
     case HVMCOPY_gfn_paged_out:
     case HVMCOPY_gfn_shared:
         return X86EMUL_RETRY;
-- 
1.7.10.4


* [PATCH v6 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument...
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 01/16] x86/hvm: make sure emulation is retried if domain is shutting down Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

...from unsigned long to unsigned int

A 64-bit length is not necessary, 32 bits is enough.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hpet.c                   |    4 ++--
 xen/arch/x86/hvm/vioapic.c                |    4 ++--
 xen/arch/x86/hvm/vlapic.c                 |   18 +++++++++---------
 xen/arch/x86/hvm/vmsi.c                   |    4 ++--
 xen/drivers/passthrough/amd/iommu_guest.c |    8 ++++----
 xen/include/asm-x86/hvm/io.h              |    6 +++---
 6 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 9585ca8..30ac5dd 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -166,7 +166,7 @@ static inline int hpet_check_access_length(
 }
 
 static int hpet_read(
-    struct vcpu *v, unsigned long addr, unsigned long length,
+    struct vcpu *v, unsigned long addr, unsigned int length,
     unsigned long *pval)
 {
     HPETState *h = vcpu_vhpet(v);
@@ -301,7 +301,7 @@ static inline uint64_t hpet_fixup_reg(
 
 static int hpet_write(
     struct vcpu *v, unsigned long addr,
-    unsigned long length, unsigned long val)
+    unsigned int length, unsigned long val)
 {
     HPETState *h = vcpu_vhpet(v);
     uint64_t old_val, new_val;
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index cbbef9f..5c8d890 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -94,7 +94,7 @@ static uint32_t vioapic_read_indirect(const struct hvm_hw_vioapic *vioapic)
 
 static int vioapic_read(
     struct vcpu *v, unsigned long addr,
-    unsigned long length, unsigned long *pval)
+    unsigned int length, unsigned long *pval)
 {
     const struct hvm_hw_vioapic *vioapic = domain_vioapic(v->domain);
     uint32_t result;
@@ -215,7 +215,7 @@ static void vioapic_write_indirect(
 
 static int vioapic_write(
     struct vcpu *v, unsigned long addr,
-    unsigned long length, unsigned long val)
+    unsigned int length, unsigned long val)
 {
     struct hvm_hw_vioapic *vioapic = domain_vioapic(v->domain);
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 32e649e..0235cb3 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -581,7 +581,7 @@ static uint32_t vlapic_read_aligned(struct vlapic *vlapic, unsigned int offset)
 
 static int vlapic_read(
     struct vcpu *v, unsigned long address,
-    unsigned long len, unsigned long *pval)
+    unsigned int len, unsigned long *pval)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int offset = address - vlapic_base_address(vlapic);
@@ -611,12 +611,12 @@ static int vlapic_read(
         break;
 
     default:
-        gdprintk(XENLOG_ERR, "Local APIC read with len=%#lx, "
+        gdprintk(XENLOG_ERR, "Local APIC read with len=%#x, "
                  "should be 4 instead.\n", len);
         goto exit_and_crash;
     }
 
-    HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "offset %#x with length %#lx, "
+    HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "offset %#x with length %#x, "
                 "and the result is %#x", offset, len, result);
 
  out:
@@ -624,7 +624,7 @@ static int vlapic_read(
     return X86EMUL_OKAY;
 
  unaligned_exit_and_crash:
-    gdprintk(XENLOG_ERR, "Unaligned LAPIC read len=%#lx at offset=%#x.\n",
+    gdprintk(XENLOG_ERR, "Unaligned LAPIC read len=%#x at offset=%#x.\n",
              len, offset);
  exit_and_crash:
     domain_crash(v->domain);
@@ -819,7 +819,7 @@ static int vlapic_reg_write(struct vcpu *v,
 }
 
 static int vlapic_write(struct vcpu *v, unsigned long address,
-                        unsigned long len, unsigned long val)
+                        unsigned int len, unsigned long val)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int offset = address - vlapic_base_address(vlapic);
@@ -827,7 +827,7 @@ static int vlapic_write(struct vcpu *v, unsigned long address,
 
     if ( offset != APIC_EOI )
         HVM_DBG_LOG(DBG_LEVEL_VLAPIC,
-                    "offset %#x with length %#lx, and value is %#lx",
+                    "offset %#x with length %#x, and value is %#lx",
                     offset, len, val);
 
     /*
@@ -854,11 +854,11 @@ static int vlapic_write(struct vcpu *v, unsigned long address,
             break;
 
         default:
-            gprintk(XENLOG_ERR, "LAPIC write with len %lu\n", len);
+            gprintk(XENLOG_ERR, "LAPIC write with len %u\n", len);
             goto exit_and_crash;
         }
 
-        gdprintk(XENLOG_INFO, "Notice: LAPIC write with len %lu\n", len);
+        gdprintk(XENLOG_INFO, "Notice: LAPIC write with len %u\n", len);
         offset &= ~3;
     }
     else if ( unlikely(offset & 3) )
@@ -867,7 +867,7 @@ static int vlapic_write(struct vcpu *v, unsigned long address,
     return vlapic_reg_write(v, offset, val);
 
  unaligned_exit_and_crash:
-    gprintk(XENLOG_ERR, "Unaligned LAPIC write: len=%lu offset=%#x.\n",
+    gprintk(XENLOG_ERR, "Unaligned LAPIC write: len=%u offset=%#x.\n",
             len, offset);
  exit_and_crash:
     domain_crash(v->domain);
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index f89233d..6eeae0a 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -202,7 +202,7 @@ static struct msi_desc *msixtbl_addr_to_desc(
 
 static int msixtbl_read(
     struct vcpu *v, unsigned long address,
-    unsigned long len, unsigned long *pval)
+    unsigned int len, unsigned long *pval)
 {
     unsigned long offset;
     struct msixtbl_entry *entry;
@@ -258,7 +258,7 @@ out:
 }
 
 static int msixtbl_write(struct vcpu *v, unsigned long address,
-                         unsigned long len, unsigned long val)
+                         unsigned int len, unsigned long val)
 {
     unsigned long offset;
     struct msixtbl_entry *entry;
diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index 7b0c102..a832fdb 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -682,7 +682,7 @@ static uint64_t iommu_mmio_read64(struct guest_iommu *iommu,
 }
 
 static int guest_iommu_mmio_read(struct vcpu *v, unsigned long addr,
-                                 unsigned long len, unsigned long *pval)
+                                 unsigned int len, unsigned long *pval)
 {
     struct guest_iommu *iommu = vcpu_iommu(v);
     unsigned long offset;
@@ -695,7 +695,7 @@ static int guest_iommu_mmio_read(struct vcpu *v, unsigned long addr,
     if ( unlikely((offset & (len - 1 )) || (len > 8)) )
     {
         AMD_IOMMU_DEBUG("iommu mmio read access is not aligned:"
-                        " offset = %lx, len = %lx\n", offset, len);
+                        " offset = %lx, len = %x\n", offset, len);
         return X86EMUL_UNHANDLEABLE;
     }
 
@@ -772,7 +772,7 @@ static void guest_iommu_mmio_write64(struct guest_iommu *iommu,
 }
 
 static int guest_iommu_mmio_write(struct vcpu *v, unsigned long addr,
-                                  unsigned long len, unsigned long val)
+                                  unsigned int len, unsigned long val)
 {
     struct guest_iommu *iommu = vcpu_iommu(v);
     unsigned long offset;
@@ -785,7 +785,7 @@ static int guest_iommu_mmio_write(struct vcpu *v, unsigned long addr,
     if ( unlikely((offset & (len - 1)) || (len > 8)) )
     {
         AMD_IOMMU_DEBUG("iommu mmio write access is not aligned:"
-                        " offset = %lx, len = %lx\n", offset, len);
+                        " offset = %lx, len = %x\n", offset, len);
         return X86EMUL_UNHANDLEABLE;
     }
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index f2aaec5..d1d79dc 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -32,11 +32,11 @@
 
 typedef int (*hvm_mmio_read_t)(struct vcpu *v,
                                unsigned long addr,
-                               unsigned long length,
+                               unsigned int length,
                                unsigned long *val);
 typedef int (*hvm_mmio_write_t)(struct vcpu *v,
                                 unsigned long addr,
-                                unsigned long length,
+                                unsigned int length,
                                 unsigned long val);
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
@@ -111,7 +111,7 @@ static inline void relocate_portio_handler(
 
 static inline void register_buffered_io_handler(
     struct domain *d, unsigned long addr,
-    unsigned long size, mmio_action_t action)
+    unsigned int size, mmio_action_t action)
 {
     register_io_handler(d, addr, size, action, HVM_BUFFERED_IO);
 }
-- 
1.7.10.4


* [PATCH v6 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (2 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-08 15:57   ` Jan Beulich
  2015-07-03 16:25 ` [PATCH v6 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

Building on the previous patch, this patch restricts portio port numbers
to uint16_t in registration/relocate calls and portio_action_t. It also
changes portio sizes to unsigned int which then allows the io_handler
size field to reduce to an unsigned int.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c       |    6 +++---
 xen/arch/x86/hvm/i8254.c     |    8 ++++----
 xen/arch/x86/hvm/intercept.c |    4 ++--
 xen/arch/x86/hvm/pmtimer.c   |    4 ++--
 xen/arch/x86/hvm/rtc.c       |    2 +-
 xen/arch/x86/hvm/stdvga.c    |    2 +-
 xen/arch/x86/hvm/vpic.c      |    4 ++--
 xen/include/asm-x86/hvm/io.h |   20 ++++++++++----------
 8 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f0cf064..2bd7d45 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -559,7 +559,7 @@ static int hvm_add_ioreq_gmfn(
 }
 
 static int hvm_print_line(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct domain *cd = current->domain;
     char c = *val;
@@ -585,7 +585,7 @@ static int hvm_print_line(
 }
 
 static int hvm_access_cf8(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct domain *d = current->domain;
 
@@ -597,7 +597,7 @@ static int hvm_access_cf8(
 }
 
 static int handle_pvh_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct domain *currd = current->domain;
 
diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c
index 36a0a53..5a47580 100644
--- a/xen/arch/x86/hvm/i8254.c
+++ b/xen/arch/x86/hvm/i8254.c
@@ -50,9 +50,9 @@
 #define RW_STATE_WORD1 4
 
 static int handle_pit_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
 static int handle_speaker_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
 
 #define get_guest_time(v) \
    (is_hvm_vcpu(v) ? hvm_get_guest_time(v) : (u64)get_s_time())
@@ -479,7 +479,7 @@ void pit_deinit(struct domain *d)
 
 /* the intercept action for PIT DM retval:0--not handled; 1--handled */  
 static int handle_pit_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct PITState *vpit = vcpu_vpit(current);
 
@@ -522,7 +522,7 @@ static uint32_t speaker_ioport_read(
 }
 
 static int handle_speaker_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, uint32_t bytes, uint32_t *val)
 {
     struct PITState *vpit = vcpu_vpit(current);
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index cc44733..52879ff 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -369,7 +369,7 @@ int hvm_io_intercept(ioreq_t *p, int type)
 }
 
 void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned long size,
+    struct domain *d, unsigned long addr, unsigned int size,
     void *action, int type)
 {
     struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
@@ -386,7 +386,7 @@ void register_io_handler(
 
 void relocate_io_handler(
     struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size, int type)
+    unsigned int size, int type)
 {
     struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
     int i;
diff --git a/xen/arch/x86/hvm/pmtimer.c b/xen/arch/x86/hvm/pmtimer.c
index 6ad2797..8b531d9 100644
--- a/xen/arch/x86/hvm/pmtimer.c
+++ b/xen/arch/x86/hvm/pmtimer.c
@@ -142,7 +142,7 @@ static void pmt_timer_callback(void *opaque)
 
 /* Handle port I/O to the PM1a_STS and PM1a_EN registers */
 static int handle_evt_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct vcpu *v = current;
     PMTState *s = &v->domain->arch.hvm_domain.pl_time.vpmt;
@@ -205,7 +205,7 @@ static int handle_evt_io(
 
 /* Handle port I/O to the TMR_VAL register */
 static int handle_pmt_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct vcpu *v = current;
     PMTState *s = &v->domain->arch.hvm_domain.pl_time.vpmt;
diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c
index 3448971..fde120b 100644
--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -697,7 +697,7 @@ static uint32_t rtc_ioport_read(RTCState *s, uint32_t addr)
 }
 
 static int handle_rtc_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct RTCState *vrtc = vcpu_vrtc(current);
 
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 13d1029..98308cd 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -173,7 +173,7 @@ static void stdvga_out(uint32_t port, uint32_t bytes, uint32_t val)
 }
 
 static int stdvga_intercept_pio(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
 
diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 8eea061..909b9d0 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -324,7 +324,7 @@ static uint32_t vpic_ioport_read(struct hvm_hw_vpic *vpic, uint32_t addr)
 }
 
 static int vpic_intercept_pic_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct hvm_hw_vpic *vpic;
 
@@ -346,7 +346,7 @@ static int vpic_intercept_pic_io(
 }
 
 static int vpic_intercept_elcr_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct hvm_hw_vpic *vpic;
     uint32_t data;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index d1d79dc..ca6958b 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -41,12 +41,12 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
 typedef int (*portio_action_t)(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
 typedef int (*mmio_action_t)(ioreq_t *);
 struct io_handler {
     int                 type;
+    unsigned int        size;
     unsigned long       addr;
-    unsigned long       size;
     union {
         portio_action_t portio;
         mmio_action_t   mmio;
@@ -75,11 +75,11 @@ extern const struct hvm_mmio_ops iommu_mmio_ops;
 
 int hvm_io_intercept(ioreq_t *p, int type);
 void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned long size,
+    struct domain *d, unsigned long addr, unsigned int size,
     void *action, int type);
 void relocate_io_handler(
     struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size, int type);
+    unsigned int size, int type);
 
 static inline int hvm_portio_intercept(ioreq_t *p)
 {
@@ -96,17 +96,17 @@ int hvm_mmio_intercept(ioreq_t *p);
 int hvm_buffered_io_send(ioreq_t *p);
 
 static inline void register_portio_handler(
-    struct domain *d, unsigned long addr,
-    unsigned long size, portio_action_t action)
+    struct domain *d, uint16_t port, unsigned int size,
+    portio_action_t action)
 {
-    register_io_handler(d, addr, size, action, HVM_PORTIO);
+    register_io_handler(d, port, size, action, HVM_PORTIO);
 }
 
 static inline void relocate_portio_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size)
+    struct domain *d, uint16_t old_port, uint16_t new_port,
+    unsigned int size)
 {
-    relocate_io_handler(d, old_addr, new_addr, size, HVM_PORTIO);
+    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
 }
 
 static inline void register_buffered_io_handler(
-- 
1.7.10.4


* [PATCH v6 05/16] x86/hvm: unify internal portio and mmio intercepts
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (3 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-08 16:11   ` Jan Beulich
  2015-07-08 16:28   ` Jan Beulich
  2015-07-03 16:25 ` [PATCH v6 06/16] x86/hvm: add length to mmio check op Paul Durrant
                   ` (11 subsequent siblings)
  16 siblings, 2 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

The implementation of mmio and portio intercepts is unnecessarily different.
This leads to much code duplication. This patch unifies much of the
intercept handling, leaving only distinct handlers for stdvga mmio and dpci
portio. Subsequent patches will unify those handlers.
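
For illustration, under the unified scheme internal handlers register
into a single per-domain table (condensed from the hunks below; not
additional code):

    /* MMIO handlers now register an hvm_mmio_ops structure... */
    register_mmio_handler(d, &hpet_mmio_ops);

    /* ...and port I/O handlers register a port range plus action. */
    register_portio_handler(d, 0xe9, 1, hvm_print_line);

    /* hvm_io_intercept() then walks one io_handler[] array, picking
     * mmio_ops or portio_ops according to the ioreq type.
     */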

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c                |   11 +-
 xen/arch/x86/hvm/hpet.c                   |    4 +-
 xen/arch/x86/hvm/hvm.c                    |   10 +-
 xen/arch/x86/hvm/intercept.c              |  486 +++++++++++++----------------
 xen/arch/x86/hvm/stdvga.c                 |   30 +-
 xen/arch/x86/hvm/vioapic.c                |    4 +-
 xen/arch/x86/hvm/vlapic.c                 |    5 +-
 xen/arch/x86/hvm/vmsi.c                   |   10 +-
 xen/drivers/passthrough/amd/iommu_guest.c |   30 +-
 xen/include/asm-x86/hvm/domain.h          |    1 +
 xen/include/asm-x86/hvm/io.h              |  118 ++++---
 11 files changed, 342 insertions(+), 367 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index b823d84..20bf372 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -142,16 +142,7 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
     }
 
-    if ( is_mmio )
-    {
-        rc = hvm_mmio_intercept(&p);
-        if ( rc == X86EMUL_UNHANDLEABLE )
-            rc = hvm_buffered_io_intercept(&p);
-    }
-    else
-    {
-        rc = hvm_portio_intercept(&p);
-    }
+    rc = hvm_io_intercept(&p);
 
     switch ( rc )
     {
diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 30ac5dd..732504a 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -504,7 +504,7 @@ static int hpet_range(struct vcpu *v, unsigned long addr)
              (addr < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) );
 }
 
-const struct hvm_mmio_ops hpet_mmio_ops = {
+static const struct hvm_mmio_ops hpet_mmio_ops = {
     .check = hpet_range,
     .read  = hpet_read,
     .write = hpet_write
@@ -659,6 +659,8 @@ void hpet_init(struct domain *d)
         h->hpet.comparator64[i] = ~0ULL;
         h->pt[i].source = PTSRC_isa;
     }
+
+    register_mmio_handler(d, &hpet_mmio_ops);
 }
 
 void hpet_deinit(struct domain *d)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 2bd7d45..676f294 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1455,9 +1455,6 @@ int hvm_domain_initialise(struct domain *d)
     spin_lock_init(&d->arch.hvm_domain.irq_lock);
     spin_lock_init(&d->arch.hvm_domain.uc_lock);
 
-    INIT_LIST_HEAD(&d->arch.hvm_domain.msixtbl_list);
-    spin_lock_init(&d->arch.hvm_domain.msixtbl_list_lock);
-
     hvm_init_cacheattr_region_list(d);
 
     rc = paging_enable(d, PG_refcounts|PG_translate|PG_external);
@@ -1465,11 +1462,12 @@ int hvm_domain_initialise(struct domain *d)
         goto fail0;
 
     d->arch.hvm_domain.params = xzalloc_array(uint64_t, HVM_NR_PARAMS);
-    d->arch.hvm_domain.io_handler = xmalloc(struct hvm_io_handler);
+    d->arch.hvm_domain.io_handler = xzalloc_array(struct hvm_io_handler,
+                                                  NR_IO_HANDLERS);
     rc = -ENOMEM;
     if ( !d->arch.hvm_domain.params || !d->arch.hvm_domain.io_handler )
         goto fail1;
-    d->arch.hvm_domain.io_handler->num_slot = 0;
+    d->arch.hvm_domain.io_handler_count = 0;
 
     /* Set the default IO Bitmap. */
     if ( is_hardware_domain(d) )
@@ -1506,6 +1504,8 @@ int hvm_domain_initialise(struct domain *d)
 
     rtc_init(d);
 
+    msixtbl_init(d);
+
     register_portio_handler(d, 0xe9, 1, hvm_print_line);
     register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 52879ff..6344c65 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -32,205 +32,94 @@
 #include <xen/event.h>
 #include <xen/iommu.h>
 
-static const struct hvm_mmio_ops *const
-hvm_mmio_handlers[HVM_MMIO_HANDLER_NR] =
+static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
+                              const ioreq_t *p)
 {
-    &hpet_mmio_ops,
-    &vlapic_mmio_ops,
-    &vioapic_mmio_ops,
-    &msixtbl_mmio_ops,
-    &iommu_mmio_ops
-};
+    BUG_ON(handler->type != IOREQ_TYPE_COPY);
+
+    return handler->mmio.ops->check(current, p->addr);
+}
 
-static int hvm_mmio_access(struct vcpu *v,
-                           ioreq_t *p,
-                           hvm_mmio_read_t read,
-                           hvm_mmio_write_t write)
+static int hvm_mmio_read(const struct hvm_io_handler *handler,
+                         uint64_t addr,
+                         uint32_t size,
+                         uint64_t *data)
 {
-    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
-    unsigned long data;
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
+    BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-    if ( !p->data_is_ptr )
-    {
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-                rc = read(v, p->addr, p->size, &data);
-            p->data = data;
-        }
-        else /* p->dir == IOREQ_WRITE */
-            rc = write(v, p->addr, p->size, p->data);
-        return rc;
-    }
+    return handler->mmio.ops->read(current, addr, size, data);
+}
 
-    if ( p->dir == IOREQ_READ )
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-            {
-                rc = read(v, p->addr + step * i, p->size, &data);
-                if ( rc != X86EMUL_OKAY )
-                    break;
-            }
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
+static int hvm_mmio_write(const struct hvm_io_handler *handler,
+                          uint64_t addr,
+                          uint32_t size,
+                          uint64_t data)
+{
+    BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-        if ( rc == X86EMUL_RETRY )
-        {
-            vio->mmio_retry = 1;
-            vio->mmio_large_read_bytes = p->size;
-            memcpy(vio->mmio_large_read, &data, p->size);
-        }
-    }
-    else
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY )
-                break;
-            rc = write(v, p->addr + step * i, p->size, data);
-            if ( rc != X86EMUL_OKAY )
-                break;
-        }
+    return handler->mmio.ops->write(current, addr, size, data);
+}
 
-        if ( rc == X86EMUL_RETRY )
-            vio->mmio_retry = 1;
-    }
+static const struct hvm_io_ops mmio_ops = {
+    .accept = hvm_mmio_accept,
+    .read = hvm_mmio_read,
+    .write = hvm_mmio_write
+};
 
-    if ( i != 0 )
-    {
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
+static bool_t hvm_portio_accept(const struct hvm_io_handler *handler,
+                                const ioreq_t *p)
+{
+    BUG_ON(handler->type != IOREQ_TYPE_PIO);
 
-    return rc;
+    return (p->addr >= handler->portio.start) &&
+           ((p->addr + p->size) <= handler->portio.end);
 }
 
-bool_t hvm_mmio_internal(paddr_t gpa)
+static int hvm_portio_read(const struct hvm_io_handler *handler,
+                           uint64_t addr,
+                           uint32_t size,
+                           uint64_t *data)
 {
-    struct vcpu *curr = current;
-    unsigned int i;
+    uint32_t val = ~0u;
+    int rc;
+
+    BUG_ON(handler->type != IOREQ_TYPE_PIO);
 
-    for ( i = 0; i < HVM_MMIO_HANDLER_NR; ++i )
-        if ( hvm_mmio_handlers[i]->check(curr, gpa) )
-            return 1;
+    rc = handler->portio.action(IOREQ_READ, addr, size, &val);
+    *data = val;
 
-    return 0;
+    return rc;
 }
 
-int hvm_mmio_intercept(ioreq_t *p)
+static int hvm_portio_write(const struct hvm_io_handler *handler,
+                            uint64_t addr,
+                            uint32_t size,
+                            uint64_t data)
 {
-    struct vcpu *v = current;
-    int i;
-
-    for ( i = 0; i < HVM_MMIO_HANDLER_NR; i++ )
-    {
-        hvm_mmio_check_t check = hvm_mmio_handlers[i]->check;
+    uint32_t val = data;
 
-        if ( check(v, p->addr) )
-        {
-            if ( unlikely(p->count > 1) &&
-                 !check(v, unlikely(p->df)
-                        ? p->addr - (p->count - 1L) * p->size
-                        : p->addr + (p->count - 1L) * p->size) )
-                p->count = 1;
-
-            return hvm_mmio_access(
-                v, p,
-                hvm_mmio_handlers[i]->read,
-                hvm_mmio_handlers[i]->write);
-        }
-    }
+    BUG_ON(handler->type != IOREQ_TYPE_PIO);
 
-    return X86EMUL_UNHANDLEABLE;
+    return handler->portio.action(IOREQ_WRITE, addr, size, &val);
 }
 
-static int process_portio_intercept(portio_action_t action, ioreq_t *p)
+static const struct hvm_io_ops portio_ops = {
+    .accept = hvm_portio_accept,
+    .read = hvm_portio_read,
+    .write = hvm_portio_write
+};
+
+int hvm_process_io_intercept(const struct hvm_io_handler *handler,
+                             ioreq_t *p)
 {
     struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    const struct hvm_io_ops *ops =
+        (p->type == IOREQ_TYPE_COPY) ?
+        &mmio_ops :
+        &portio_ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data;
-
-    if ( !p->data_is_ptr )
-    {
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-                rc = action(IOREQ_READ, p->addr, p->size, &data);
-            p->data = data;
-        }
-        else
-        {
-            data = p->data;
-            rc = action(IOREQ_WRITE, p->addr, p->size, &data);
-        }
-        return rc;
-    }
+    uint64_t data;
+    uint64_t addr;
 
     if ( p->dir == IOREQ_READ )
     {
@@ -246,31 +135,40 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
             }
             else
             {
-                rc = action(IOREQ_READ, p->addr, p->size, &data);
+                addr = (p->type == IOREQ_TYPE_COPY) ?
+                       p->addr + step * i :
+                       p->addr;
+                rc = ops->read(handler, addr, p->size, &data);
                 if ( rc != X86EMUL_OKAY )
                     break;
             }
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
+
+            if ( p->data_is_ptr )
             {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
+                switch ( hvm_copy_to_guest_phys(p->data + step * i,
+                                                &data, p->size) )
+                {
+                case HVMCOPY_okay:
+                    break;
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
+                    rc = X86EMUL_RETRY;
+                    break;
+                case HVMCOPY_bad_gfn_to_mfn:
+                    /* Drop the write as real hardware would. */
+                    continue;
+                case HVMCOPY_bad_gva_to_gfn:
+                    ASSERT_UNREACHABLE();
+                    /* fall through */
+                default:
+                    rc = X86EMUL_UNHANDLEABLE;
+                    break;
+                }
+                if ( rc != X86EMUL_OKAY )
+                    break;
             }
-            if ( rc != X86EMUL_OKAY)
-                break;
+            else
+                p->data = data;
         }
 
         if ( rc == X86EMUL_RETRY )
@@ -284,29 +182,37 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
     {
         for ( i = 0; i < p->count; i++ )
         {
-            data = 0;
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
+            if ( p->data_is_ptr )
             {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
+                switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
+                                                  p->size) )
+                {
+                case HVMCOPY_okay:
+                    break;
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
+                    rc = X86EMUL_RETRY;
+                    break;
+                case HVMCOPY_bad_gfn_to_mfn:
+                    data = ~0;
+                    break;
+                case HVMCOPY_bad_gva_to_gfn:
+                    ASSERT_UNREACHABLE();
+                    /* fall through */
+                default:
+                    rc = X86EMUL_UNHANDLEABLE;
+                    break;
+                }
+                if ( rc != X86EMUL_OKAY )
+                    break;
             }
-            if ( rc != X86EMUL_OKAY )
-                break;
-            rc = action(IOREQ_WRITE, p->addr, p->size, &data);
+            else
+                data = p->data;
+
+            addr = (p->type == IOREQ_TYPE_COPY) ?
+                   p->addr + step * i :
+                   p->addr;
+            rc = ops->write(handler, addr, p->size, data);
             if ( rc != X86EMUL_OKAY )
                 break;
         }
@@ -324,78 +230,122 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
     return rc;
 }
 
-/*
- * Check if the request is handled inside xen
- * return value: 0 --not handled; 1 --handled
- */
-int hvm_io_intercept(ioreq_t *p, int type)
+const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
+{
+    struct domain *curr_d = current->domain;
+    const struct hvm_io_ops *ops =
+        (p->type == IOREQ_TYPE_COPY) ?
+        &mmio_ops :
+        &portio_ops;
+    unsigned int i;
+
+    BUG_ON((p->type != IOREQ_TYPE_PIO) &&
+           (p->type != IOREQ_TYPE_COPY));
+
+    for ( i = 0; i < curr_d->arch.hvm_domain.io_handler_count; i++ )
+    {
+        const struct hvm_io_handler *handler =
+            &curr_d->arch.hvm_domain.io_handler[i];
+
+        if ( handler->type != p->type )
+            continue;
+
+        if ( ops->accept(handler, p) )
+            return handler;
+    }
+
+    return NULL;
+}
+
+int hvm_io_intercept(ioreq_t *p)
 {
-    struct vcpu *v = current;
-    struct hvm_io_handler *handler = v->domain->arch.hvm_domain.io_handler;
-    int i;
-    unsigned long addr, size;
+    const struct hvm_io_handler *handler;
 
-    if ( type == HVM_PORTIO )
+    if ( p->type == IOREQ_TYPE_PIO )
     {
         int rc = dpci_ioport_intercept(p);
         if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
             return rc;
     }
-
-    for ( i = 0; i < handler->num_slot; i++ )
+    else if ( p->type == IOREQ_TYPE_COPY )
     {
-        if ( type != handler->hdl_list[i].type )
-            continue;
-        addr = handler->hdl_list[i].addr;
-        size = handler->hdl_list[i].size;
-        if ( (p->addr >= addr) &&
-             ((p->addr + p->size) <= (addr + size)) )
-        {
-            if ( type == HVM_PORTIO )
-                return process_portio_intercept(
-                    handler->hdl_list[i].action.portio, p);
+        int rc = stdvga_intercept_mmio(p);
+        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
+            return rc;
+    }
 
-            if ( unlikely(p->count > 1) &&
-                 (unlikely(p->df)
-                  ? p->addr - (p->count - 1L) * p->size < addr
-                  : p->addr + p->count * 1L * p->size - 1 >= addr + size) )
-                p->count = 1;
+    handler = hvm_find_io_handler(p);
 
-            return handler->hdl_list[i].action.mmio(p);
-        }
+    if ( handler == NULL )
+        return X86EMUL_UNHANDLEABLE;
+
+    return hvm_process_io_intercept(handler, p);
+}
+
+struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
+{
+    unsigned int i = d->arch.hvm_domain.io_handler_count++;
+
+    if ( i == NR_IO_HANDLERS )
+    {
+        domain_crash(d);
+        return NULL;
     }
 
-    return X86EMUL_UNHANDLEABLE;
+    return &d->arch.hvm_domain.io_handler[i];
 }
 
-void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned int size,
-    void *action, int type)
+void register_mmio_handler(struct domain *d,
+                           const struct hvm_mmio_ops *ops)
 {
-    struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
-    int num = handler->num_slot;
+    struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
-    BUG_ON(num >= MAX_IO_HANDLER);
+    handler->type = IOREQ_TYPE_COPY;
+    handler->mmio.ops = ops;
+}
 
-    handler->hdl_list[num].addr = addr;
-    handler->hdl_list[num].size = size;
-    handler->hdl_list[num].action.ptr = action;
-    handler->hdl_list[num].type = type;
-    handler->num_slot++;
+void register_portio_handler(struct domain *d, uint16_t port,
+                             unsigned int size, portio_action_t action)
+{
+    struct hvm_io_handler *handler = hvm_next_io_handler(d);
+
+    handler->type = IOREQ_TYPE_PIO;
+    handler->portio.start = port;
+    handler->portio.end = port + size;
+    handler->portio.action = action;
 }
 
-void relocate_io_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned int size, int type)
+void relocate_portio_handler(struct domain *d, uint16_t old_port,
+                             uint16_t new_port, unsigned int size)
 {
-    struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
-    int i;
-
-    for ( i = 0; i < handler->num_slot; i++ )
-        if ( (handler->hdl_list[i].addr == old_addr) &&
-             (handler->hdl_list[i].size == size) &&
-             (handler->hdl_list[i].type == type) )
-            handler->hdl_list[i].addr = new_addr;
+    unsigned int i;
+
+    for ( i = 0; i < d->arch.hvm_domain.io_handler_count; i++ )
+    {
+        struct hvm_io_handler *handler =
+            &d->arch.hvm_domain.io_handler[i];
+
+        if ( handler->type != IOREQ_TYPE_PIO )
+            continue;
+
+        if ( (handler->portio.start == old_port) &&
+             (handler->portio.end == old_port + size) )
+        {
+            handler->portio.start = new_port;
+            handler->portio.end = new_port + size;
+            break;
+        }
+    }
+}
+
+bool_t hvm_mmio_internal(paddr_t gpa)
+{
+    ioreq_t p = {
+        .type = IOREQ_TYPE_COPY,
+        .addr = gpa
+    };
+
+    return (hvm_find_io_handler(&p) != NULL);
 }
 
 /*
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 98308cd..30760fa 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -547,12 +547,27 @@ static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
     return 1;
 }
 
-static int stdvga_intercept_mmio(ioreq_t *p)
+int stdvga_intercept_mmio(ioreq_t *p)
 {
     struct domain *d = current->domain;
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
+    uint64_t start, end, count = p->count, size = p->size;
     int buf = 0, rc;
 
+    if ( p->df )
+    {
+        start = (p->addr - (count - 1) * size);
+        end = p->addr + size;
+    }
+    else
+    {
+        start = p->addr;
+        end = p->addr + count * size;
+    }
+
+    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
+        return X86EMUL_UNHANDLEABLE;
+
     if ( p->size > 8 )
     {
         gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
@@ -619,9 +634,6 @@ void stdvga_init(struct domain *d)
         register_portio_handler(d, 0x3c4, 2, stdvga_intercept_pio);
         /* Graphics registers. */
         register_portio_handler(d, 0x3ce, 2, stdvga_intercept_pio);
-        /* MMIO. */
-        register_buffered_io_handler(
-            d, VGA_MEM_BASE, VGA_MEM_SIZE, stdvga_intercept_mmio);
     }
 }
 
@@ -638,3 +650,13 @@ void stdvga_deinit(struct domain *d)
         s->vram_page[i] = NULL;
     }
 }
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 5c8d890..9de2ff3 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -250,7 +250,7 @@ static int vioapic_range(struct vcpu *v, unsigned long addr)
              (addr < vioapic->base_address + VIOAPIC_MEM_LENGTH)));
 }
 
-const struct hvm_mmio_ops vioapic_mmio_ops = {
+static const struct hvm_mmio_ops vioapic_mmio_ops = {
     .check = vioapic_range,
     .read = vioapic_read,
     .write = vioapic_write
@@ -456,6 +456,8 @@ int vioapic_init(struct domain *d)
     d->arch.hvm_domain.vioapic->domain = d;
     vioapic_reset(d);
 
+    register_mmio_handler(d, &vioapic_mmio_ops);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 0235cb3..5472f79 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -977,7 +977,7 @@ static int vlapic_range(struct vcpu *v, unsigned long addr)
            (offset < PAGE_SIZE);
 }
 
-const struct hvm_mmio_ops vlapic_mmio_ops = {
+static const struct hvm_mmio_ops vlapic_mmio_ops = {
     .check = vlapic_range,
     .read = vlapic_read,
     .write = vlapic_write
@@ -1443,6 +1443,9 @@ int vlapic_init(struct vcpu *v)
                  vlapic_init_sipi_action,
                  (unsigned long)v);
 
+    if ( v->vcpu_id == 0 )
+        register_mmio_handler(v->domain, &vlapic_mmio_ops);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 6eeae0a..66545a4 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -344,7 +344,7 @@ static int msixtbl_range(struct vcpu *v, unsigned long addr)
     return !!desc;
 }
 
-const struct hvm_mmio_ops msixtbl_mmio_ops = {
+static const struct hvm_mmio_ops msixtbl_mmio_ops = {
     .check = msixtbl_range,
     .read = msixtbl_read,
     .write = msixtbl_write
@@ -481,6 +481,14 @@ found:
     spin_unlock_irq(&irq_desc->lock);
 }
 
+void msixtbl_init(struct domain *d)
+{
+    INIT_LIST_HEAD(&d->arch.hvm_domain.msixtbl_list);
+    spin_lock_init(&d->arch.hvm_domain.msixtbl_list_lock);
+
+    register_mmio_handler(d, &msixtbl_mmio_ops);
+}
+
 void msixtbl_pt_cleanup(struct domain *d)
 {
     struct msixtbl_entry *entry, *temp;
diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index a832fdb..5908bf3 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -868,6 +868,20 @@ static void guest_iommu_reg_init(struct guest_iommu *iommu)
     iommu->reg_ext_feature.hi = upper;
 }
 
+static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr)
+{
+    struct guest_iommu *iommu = vcpu_iommu(v);
+
+    return iommu && addr >= iommu->mmio_base &&
+           addr < iommu->mmio_base + IOMMU_MMIO_SIZE;
+}
+
+static const struct hvm_mmio_ops iommu_mmio_ops = {
+    .check = guest_iommu_mmio_range,
+    .read = guest_iommu_mmio_read,
+    .write = guest_iommu_mmio_write
+};
+
 /* Domain specific initialization */
 int guest_iommu_init(struct domain* d)
 {
@@ -894,6 +908,8 @@ int guest_iommu_init(struct domain* d)
 
     spin_lock_init(&iommu->lock);
 
+    register_mmio_handler(d, &iommu_mmio_ops);
+
     return 0;
 }
 
@@ -910,17 +926,3 @@ void guest_iommu_destroy(struct domain *d)
 
     domain_hvm_iommu(d)->arch.g_iommu = NULL;
 }
-
-static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr)
-{
-    struct guest_iommu *iommu = vcpu_iommu(v);
-
-    return iommu && addr >= iommu->mmio_base &&
-           addr < iommu->mmio_base + IOMMU_MMIO_SIZE;
-}
-
-const struct hvm_mmio_ops iommu_mmio_ops = {
-    .check = guest_iommu_mmio_range,
-    .read = guest_iommu_mmio_read,
-    .write = guest_iommu_mmio_write
-};
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index ad68fcf..b612755 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -94,6 +94,7 @@ struct hvm_domain {
     struct pl_time         pl_time;
 
     struct hvm_io_handler *io_handler;
+    unsigned int          io_handler_count;
 
     /* Lock protects access to irq, vpic and vioapic. */
     spinlock_t             irq_lock;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index ca6958b..461a134 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -25,10 +25,7 @@
 #include <public/hvm/ioreq.h>
 #include <public/event_channel.h>
 
-#define MAX_IO_HANDLER             16
-
-#define HVM_PORTIO                  0
-#define HVM_BUFFERED_IO             2
+#define NR_IO_HANDLERS 32
 
 typedef int (*hvm_mmio_read_t)(struct vcpu *v,
                                unsigned long addr,
@@ -40,82 +37,67 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
                                 unsigned long val);
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
+struct hvm_mmio_ops {
+    hvm_mmio_check_t check;
+    hvm_mmio_read_t  read;
+    hvm_mmio_write_t write;
+};
+
 typedef int (*portio_action_t)(
     int dir, uint16_t port, unsigned int bytes, uint32_t *val);
-typedef int (*mmio_action_t)(ioreq_t *);
-struct io_handler {
-    int                 type;
-    unsigned int        size;
-    unsigned long       addr;
-    union {
-        portio_action_t portio;
-        mmio_action_t   mmio;
-        void           *ptr;
-    } action;
-};
 
 struct hvm_io_handler {
-    int     num_slot;
-    struct  io_handler hdl_list[MAX_IO_HANDLER];
+    union {
+        struct {
+            const struct hvm_mmio_ops *ops;
+        } mmio;
+        struct {
+            unsigned int start, end;
+            portio_action_t action;
+        } portio;
+    };
+    uint8_t type;
 };
 
-struct hvm_mmio_ops {
-    hvm_mmio_check_t check;
-    hvm_mmio_read_t  read;
-    hvm_mmio_write_t write;
+typedef int (*hvm_io_read_t)(const struct hvm_io_handler *,
+                             uint64_t addr,
+                             uint32_t size,
+                             uint64_t *data);
+typedef int (*hvm_io_write_t)(const struct hvm_io_handler *,
+                              uint64_t addr,
+                              uint32_t size,
+                              uint64_t data);
+typedef bool_t (*hvm_io_accept_t)(const struct hvm_io_handler *,
+                                  const ioreq_t *p);
+struct hvm_io_ops {
+    hvm_io_accept_t accept;
+    hvm_io_read_t   read;
+    hvm_io_write_t  write;
 };
 
-extern const struct hvm_mmio_ops hpet_mmio_ops;
-extern const struct hvm_mmio_ops vlapic_mmio_ops;
-extern const struct hvm_mmio_ops vioapic_mmio_ops;
-extern const struct hvm_mmio_ops msixtbl_mmio_ops;
-extern const struct hvm_mmio_ops iommu_mmio_ops;
+int hvm_process_io_intercept(const struct hvm_io_handler *handler,
+                             ioreq_t *p);
 
-#define HVM_MMIO_HANDLER_NR 5
+const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p);
 
-int hvm_io_intercept(ioreq_t *p, int type);
-void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned int size,
-    void *action, int type);
-void relocate_io_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned int size, int type);
+int hvm_io_intercept(ioreq_t *p);
 
-static inline int hvm_portio_intercept(ioreq_t *p)
-{
-    return hvm_io_intercept(p, HVM_PORTIO);
-}
-
-static inline int hvm_buffered_io_intercept(ioreq_t *p)
-{
-    return hvm_io_intercept(p, HVM_BUFFERED_IO);
-}
+struct hvm_io_handler *hvm_next_io_handler(struct domain *d);
 
 bool_t hvm_mmio_internal(paddr_t gpa);
-int hvm_mmio_intercept(ioreq_t *p);
-int hvm_buffered_io_send(ioreq_t *p);
 
-static inline void register_portio_handler(
+void register_mmio_handler(struct domain *d,
+                           const struct hvm_mmio_ops *ops);
+
+void register_portio_handler(
     struct domain *d, uint16_t port, unsigned int size,
-    portio_action_t action)
-{
-    register_io_handler(d, port, size, action, HVM_PORTIO);
-}
+    portio_action_t action);
 
-static inline void relocate_portio_handler(
+void relocate_portio_handler(
     struct domain *d, uint16_t old_port, uint16_t new_port,
-    unsigned int size)
-{
-    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
-}
-
-static inline void register_buffered_io_handler(
-    struct domain *d, unsigned long addr,
-    unsigned int size, mmio_action_t action)
-{
-    register_io_handler(d, addr, size, action, HVM_BUFFERED_IO);
-}
+    unsigned int size);
 
+int hvm_buffered_io_send(ioreq_t *p);
 void send_timeoffset_req(unsigned long timeoff);
 void send_invalidate_req(void);
 int handle_mmio(void);
@@ -127,6 +109,7 @@ void hvm_io_assist(ioreq_t *p);
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
                   const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
+void msixtbl_init(struct domain *d);
 
 struct hvm_hw_stdvga {
     uint8_t sr_index;
@@ -141,8 +124,19 @@ struct hvm_hw_stdvga {
 };
 
 void stdvga_init(struct domain *d);
+int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
 #endif /* __ASM_X86_HVM_IO_H__ */
 
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 06/16] x86/hvm: add length to mmio check op
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (4 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

When memory mapped I/O is range checked by internal handlers, the length
of the access should be taken into account.
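
As an illustration (not part of the patch; the values are made up), a
4-byte, 3-rep forwards copy starting at 0xfee00ff8 touches bytes
0xfee00ff8..0xfee01003, so checking only the first byte could accept an
access that runs beyond the range a handler actually covers. With the
helpers added below, both ends are checked:

    ioreq_t p = {
        .type = IOREQ_TYPE_COPY,
        .addr = 0xfee00ff8,
        .size = 4,
        .count = 3,
        .df = 0,
    };
    paddr_t first = hvm_mmio_first_byte(&p); /* 0xfee00ff8 */
    paddr_t last = hvm_mmio_last_byte(&p);   /* 0xfee01003 */
    /* accept only if ->check() passes for both first and last */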

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/intercept.c |   23 ++++++++++++++++++++---
 xen/include/asm-x86/hvm/io.h |   16 ++++++++++++++++
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 6344c65..ccef38e 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -35,9 +35,20 @@
 static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
                               const ioreq_t *p)
 {
+    paddr_t first = hvm_mmio_first_byte(p);
+    paddr_t last = hvm_mmio_last_byte(p);
+
     BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-    return handler->mmio.ops->check(current, p->addr);
+    if ( !handler->mmio.ops->check(current, first) )
+        return 0;
+
+    /* Make sure the handler will accept the whole access */
+    if ( p->size > 1 &&
+         !handler->mmio.ops->check(current, last) )
+        domain_crash(current->domain);
+
+    return 1;
 }
 
 static int hvm_mmio_read(const struct hvm_io_handler *handler,
@@ -112,7 +123,8 @@ static const struct hvm_io_ops portio_ops = {
 int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                              ioreq_t *p)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     const struct hvm_io_ops *ops =
         (p->type == IOREQ_TYPE_COPY) ?
         &mmio_ops :
@@ -223,6 +235,9 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
 
     if ( i != 0 )
     {
+        if ( rc == X86EMUL_UNHANDLEABLE )
+            domain_crash(curr->domain);
+
         p->count = i;
         rc = X86EMUL_OKAY;
     }
@@ -342,7 +357,9 @@ bool_t hvm_mmio_internal(paddr_t gpa)
 {
     ioreq_t p = {
         .type = IOREQ_TYPE_COPY,
-        .addr = gpa
+        .addr = gpa,
+        .count = 1,
+        .size = 1,
     };
 
     return (hvm_find_io_handler(&p) != NULL);
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 461a134..4fc2336 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -43,6 +43,22 @@ struct hvm_mmio_ops {
     hvm_mmio_write_t write;
 };
 
+static inline paddr_t hvm_mmio_first_byte(const ioreq_t *p)
+{
+    return p->df ?
+           p->addr - (p->count - 1ul) * p->size :
+           p->addr;
+}
+
+static inline paddr_t hvm_mmio_last_byte(const ioreq_t *p)
+{
+    unsigned long count = p->count;
+
+    return p->df ?
+           p->addr + p->size - 1 :
+           p->addr + (count * p->size) - 1;
+}
+
 typedef int (*portio_action_t)(
     int dir, uint16_t port, unsigned int bytes, uint32_t *val);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (5 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 06/16] x86/hvm: add length to mmio check op Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-08 16:29   ` Jan Beulich
  2015-07-03 16:25 ` [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

This patch re-works the dpci portio intercepts so that they can be unified
with standard portio handling, thereby removing a substantial amount of
code duplication.
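
As a hedged sketch of the resulting pattern (the example_* names below
are hypothetical; register_dpci_portio_handler() in the diff follows the
same shape), a unified portio-style handler is now just an hvm_io_ops
instance attached to a slot obtained from hvm_next_io_handler():

    static const struct hvm_io_ops example_ops = {
        .accept = example_accept, /* claim the ioreq, e.g. port range match */
        .read   = example_read,   /* per-rep read */
        .write  = example_write,  /* per-rep write */
    };

    void register_example_handler(struct domain *d)
    {
        struct hvm_io_handler *handler = hvm_next_io_handler(d);

        handler->type = IOREQ_TYPE_PIO;
        handler->ops = &example_ops;
    }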

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c         |    2 +
 xen/arch/x86/hvm/intercept.c   |   20 +---
 xen/arch/x86/hvm/io.c          |  220 ++++++++++++----------------------------
 xen/include/asm-x86/hvm/io.h   |    4 +
 xen/include/asm-x86/hvm/vcpu.h |    2 +
 xen/include/xen/iommu.h        |    1 -
 6 files changed, 78 insertions(+), 171 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 676f294..c25b001 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1483,6 +1483,8 @@ int hvm_domain_initialise(struct domain *d)
     else
         d->arch.hvm_domain.io_bitmap = hvm_io_bitmap;
 
+    register_dpci_portio_handler(d);
+
     if ( is_pvh_domain(d) )
     {
         register_portio_handler(d, 0, 0x10003, handle_pvh_io);
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index ccef38e..91968c4 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -125,10 +125,7 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
 {
     struct vcpu *curr = current;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    const struct hvm_io_ops *ops =
-        (p->type == IOREQ_TYPE_COPY) ?
-        &mmio_ops :
-        &portio_ops;
+    const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
     uint64_t data;
     uint64_t addr;
@@ -248,10 +245,6 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
 const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
 {
     struct domain *curr_d = current->domain;
-    const struct hvm_io_ops *ops =
-        (p->type == IOREQ_TYPE_COPY) ?
-        &mmio_ops :
-        &portio_ops;
     unsigned int i;
 
     BUG_ON((p->type != IOREQ_TYPE_PIO) &&
@@ -261,6 +254,7 @@ const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
     {
         const struct hvm_io_handler *handler =
             &curr_d->arch.hvm_domain.io_handler[i];
+        const struct hvm_io_ops *ops = handler->ops;
 
         if ( handler->type != p->type )
             continue;
@@ -276,13 +270,7 @@ int hvm_io_intercept(ioreq_t *p)
 {
     const struct hvm_io_handler *handler;
 
-    if ( p->type == IOREQ_TYPE_PIO )
-    {
-        int rc = dpci_ioport_intercept(p);
-        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-            return rc;
-    }
-    else if ( p->type == IOREQ_TYPE_COPY )
+    if ( p->type == IOREQ_TYPE_COPY )
     {
         int rc = stdvga_intercept_mmio(p);
         if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
@@ -316,6 +304,7 @@ void register_mmio_handler(struct domain *d,
     struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
     handler->type = IOREQ_TYPE_COPY;
+    handler->ops = &mmio_ops;
     handler->mmio.ops = ops;
 }
 
@@ -325,6 +314,7 @@ void register_portio_handler(struct domain *d, uint16_t port,
     struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
     handler->type = IOREQ_TYPE_PIO;
+    handler->ops = &portio_ops;
     handler->portio.start = port;
     handler->portio.end = port + size;
     handler->portio.action = action;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index c0964ec..2c88ddb 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -208,185 +208,95 @@ void hvm_io_assist(ioreq_t *p)
     }
 }
 
-static int dpci_ioport_read(uint32_t mport, ioreq_t *p)
+static bool_t dpci_portio_accept(const struct hvm_io_handler *handler,
+                                 const ioreq_t *p)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data = 0;
+    struct vcpu *curr = current;
+    struct hvm_iommu *hd = domain_hvm_iommu(curr->domain);
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+    struct g2m_ioport *g2m_ioport;
+    unsigned int start, end;
 
-    for ( i = 0; i < p->count; i++ )
+    list_for_each_entry( g2m_ioport, &hd->arch.g2m_ioport_list, list )
     {
-        if ( vio->mmio_retrying )
-        {
-            if ( vio->mmio_large_read_bytes != p->size )
-                return X86EMUL_UNHANDLEABLE;
-            memcpy(&data, vio->mmio_large_read, p->size);
-            vio->mmio_large_read_bytes = 0;
-            vio->mmio_retrying = 0;
-        }
-        else switch ( p->size )
+        start = g2m_ioport->gport;
+        end = start + g2m_ioport->np;
+        if ( (p->addr >= start) && (p->addr + p->size <= end) )
         {
-        case 1:
-            data = inb(mport);
-            break;
-        case 2:
-            data = inw(mport);
-            break;
-        case 4:
-            data = inl(mport);
-            break;
-        default:
-            BUG();
+            vio->g2m_ioport = g2m_ioport;
+            return 1;
         }
-
-        if ( p->data_is_ptr )
-        {
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
-        else
-            p->data = data;
     }
 
-    if ( rc == X86EMUL_RETRY )
-    {
-        vio->mmio_retry = 1;
-        vio->mmio_large_read_bytes = p->size;
-        memcpy(vio->mmio_large_read, &data, p->size);
-    }
-
-    if ( i != 0 )
-    {
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
-
-    return rc;
+    return 0;
 }
 
-static int dpci_ioport_write(uint32_t mport, ioreq_t *p)
+static int dpci_portio_read(const struct hvm_io_handler *handler,
+                            uint64_t addr,
+                            uint32_t size,
+                            uint64_t *data)
 {
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data;
-
-    for ( i = 0; i < p->count; i++ )
-    {
-        data = p->data;
-        if ( p->data_is_ptr )
-        {
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
-
-        switch ( p->size )
-        {
-        case 1:
-            outb(data, mport);
-            break;
-        case 2:
-            outw(data, mport);
-            break;
-        case 4:
-            outl(data, mport);
-            break;
-        default:
-            BUG();
-        }
-    }
-
-    if ( rc == X86EMUL_RETRY )
-        current->arch.hvm_vcpu.hvm_io.mmio_retry = 1;
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    const struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    unsigned int mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
-    if ( i != 0 )
+    switch ( size )
     {
-        p->count = i;
-        rc = X86EMUL_OKAY;
+    case 1:
+        *data = inb(mport);
+        break;
+    case 2:
+        *data = inw(mport);
+        break;
+    case 4:
+        *data = inl(mport);
+        break;
+    default:
+        BUG();
     }
 
-    return rc;
+    return X86EMUL_OKAY;
 }
 
-int dpci_ioport_intercept(ioreq_t *p)
+static int dpci_portio_write(const struct hvm_io_handler *handler,
+                             uint64_t addr,
+                             uint32_t size,
+                             uint64_t data)
 {
-    struct domain *d = current->domain;
-    struct hvm_iommu *hd = domain_hvm_iommu(d);
-    struct g2m_ioport *g2m_ioport;
-    unsigned int mport, gport = p->addr;
-    unsigned int s = 0, e = 0;
-    int rc;
-
-    list_for_each_entry( g2m_ioport, &hd->arch.g2m_ioport_list, list )
-    {
-        s = g2m_ioport->gport;
-        e = s + g2m_ioport->np;
-        if ( (gport >= s) && (gport < e) )
-            goto found;
-    }
-
-    return X86EMUL_UNHANDLEABLE;
-
- found:
-    mport = (gport - s) + g2m_ioport->mport;
-
-    if ( !ioports_access_permitted(d, mport, mport + p->size - 1) ) 
-    {
-        gdprintk(XENLOG_ERR, "Error: access to gport=%#x denied!\n",
-                 (uint32_t)p->addr);
-        return X86EMUL_UNHANDLEABLE;
-    }
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    const struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    unsigned int mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
-    switch ( p->dir )
+    switch ( size )
     {
-    case IOREQ_READ:
-        rc = dpci_ioport_read(mport, p);
+    case 1:
+        outb(data, mport);
         break;
-    case IOREQ_WRITE:
-        rc = dpci_ioport_write(mport, p);
+    case 2:
+        outw(data, mport);
+        break;
+    case 4:
+        outl(data, mport);
         break;
     default:
-        gdprintk(XENLOG_ERR, "Error: couldn't handle p->dir = %d", p->dir);
-        rc = X86EMUL_UNHANDLEABLE;
+        BUG();
     }
 
-    return rc;
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops dpci_portio_ops = {
+    .accept = dpci_portio_accept,
+    .read = dpci_portio_read,
+    .write = dpci_portio_write
+};
+
+void register_dpci_portio_handler(struct domain *d)
+{
+    struct hvm_io_handler *handler = hvm_next_io_handler(d);
+
+    handler->type = IOREQ_TYPE_PIO;
+    handler->ops = &dpci_portio_ops;
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 4fc2336..9503e6e 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -72,6 +72,7 @@ struct hvm_io_handler {
             portio_action_t action;
         } portio;
     };
+    const struct hvm_io_ops *ops;
     uint8_t type;
 };
 
@@ -144,6 +145,9 @@ int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
+
+void register_dpci_portio_handler(struct domain *d);
+
 #endif /* __ASM_X86_HVM_IO_H__ */
 
 
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 3d8f4dc..b15c1e6 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -77,6 +77,8 @@ struct hvm_vcpu_io {
     bool_t mmio_retry, mmio_retrying;
 
     unsigned long msix_unmask_address;
+
+    const struct g2m_ioport *g2m_ioport;
 };
 
 #define VMCX_EADDR    (~0ULL)
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index b30bf41..1d00696 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -93,7 +93,6 @@ void pt_pci_init(void);
 
 struct pirq;
 int hvm_do_IRQ_dpci(struct domain *, struct pirq *);
-int dpci_ioport_intercept(ioreq_t *p);
 int pt_irq_create_bind(struct domain *, xen_domctl_bind_pt_irq_t *);
 int pt_irq_destroy_bind(struct domain *, xen_domctl_bind_pt_irq_t *);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (6 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-08 16:17   ` Jan Beulich
                     ` (2 more replies)
  2015-07-03 16:25 ` [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
                   ` (8 subsequent siblings)
  16 siblings, 3 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

It's clear from the following check in hvmemul_rep_movs:

    if ( sp2mt == p2m_mmio_direct || dp2mt == p2m_mmio_direct ||
         (sp2mt == p2m_mmio_dm && dp2mt == p2m_mmio_dm) )
        return X86EMUL_UNHANDLEABLE;

that mmio <-> mmio copy is not handled. This means the code in the
stdvga mmio intercept that explicitly handles mmio <-> mmio copy when
hvm_copy_to/from_guest_phys() fails is never going to be executed.

This patch therefore adds a check in hvmemul_do_io_addr() to make sure
mmio <-> mmio is disallowed and then registers standard mmio intercept ops
in stdvga_init().

With this patch all mmio and portio handled within Xen now goes through
hvm_process_io_intercept().
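
Condensed from the intercept.c hunks below (a sketch, not additional
code): once accept() has claimed an ioreq it may be holding per-handler
state (for stdvga, s->lock), so every path that found a handler must now
invoke the new complete() op afterwards:

    handler = hvm_find_io_handler(p);      /* ops->accept() may take a lock */
    if ( handler == NULL )
        return X86EMUL_UNHANDLEABLE;

    rc = hvm_process_io_intercept(handler, p);

    if ( handler->ops->complete != NULL )
        handler->ops->complete(handler);   /* stdvga: spin_unlock(&s->lock) */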

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c   |    9 ++
 xen/arch/x86/hvm/intercept.c |   30 ++++--
 xen/arch/x86/hvm/stdvga.c    |  209 +++++++++++++++---------------------------
 xen/include/asm-x86/hvm/io.h |   10 +-
 4 files changed, 110 insertions(+), 148 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 20bf372..1afd626 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -266,6 +266,15 @@ static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
         return X86EMUL_RETRY;
     }
 
+    /* This code should not be reached if the gmfn is not RAM */
+    if ( p2m_is_mmio(p2mt) )
+    {
+        domain_crash(curr_d);
+
+        put_page(*page);
+        return X86EMUL_UNHANDLEABLE;
+    }
+
     return X86EMUL_OKAY;
 }
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 91968c4..78e2f57 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -269,20 +269,21 @@ const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
 int hvm_io_intercept(ioreq_t *p)
 {
     const struct hvm_io_handler *handler;
-
-    if ( p->type == IOREQ_TYPE_COPY )
-    {
-        int rc = stdvga_intercept_mmio(p);
-        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-            return rc;
-    }
+    const struct hvm_io_ops *ops;
+    int rc;
 
     handler = hvm_find_io_handler(p);
 
     if ( handler == NULL )
         return X86EMUL_UNHANDLEABLE;
 
-    return hvm_process_io_intercept(handler, p);
+    rc = hvm_process_io_intercept(handler, p);
+
+    ops = handler->ops;
+    if ( ops->complete != NULL )
+        ops->complete(handler);
+
+    return rc;
 }
 
 struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
@@ -345,6 +346,8 @@ void relocate_portio_handler(struct domain *d, uint16_t old_port,
 
 bool_t hvm_mmio_internal(paddr_t gpa)
 {
+    const struct hvm_io_handler *handler;
+    const struct hvm_io_ops *ops;
     ioreq_t p = {
         .type = IOREQ_TYPE_COPY,
         .addr = gpa,
@@ -352,7 +355,16 @@ bool_t hvm_mmio_internal(paddr_t gpa)
         .size = 1,
     };
 
-    return (hvm_find_io_handler(&p) != NULL);
+    handler = hvm_find_io_handler(&p);
+
+    if ( handler == NULL )
+        return 0;
+
+    ops = handler->ops;
+    if ( ops->complete != NULL )
+        ops->complete(handler);
+
+    return 1;
 }
 
 /*
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 30760fa..b17e400 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -148,7 +148,7 @@ static int stdvga_outb(uint64_t addr, uint8_t val)
     }
     else if ( prev_stdvga && !s->stdvga )
     {
-        gdprintk(XENLOG_INFO, "leaving stdvga\n");
+        gdprintk(XENLOG_INFO, "leaving stdvga mode\n");
     }
 
     return rc;
@@ -275,9 +275,10 @@ static uint8_t stdvga_mem_readb(uint64_t addr)
     return ret;
 }
 
-static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size)
+static int stdvga_mem_read(const struct hvm_io_handler *handler,
+                           uint64_t addr, uint32_t size, uint64_t *p_data)
 {
-    uint64_t data = 0;
+    uint64_t data = ~0ul;
 
     switch ( size )
     {
@@ -309,11 +310,12 @@ static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size)
         break;
 
     default:
-        gdprintk(XENLOG_WARNING, "invalid io size: %"PRId64"\n", size);
+        gdprintk(XENLOG_WARNING, "invalid io size: %u\n", size);
         break;
     }
 
-    return data;
+    *p_data = data;
+    return X86EMUL_OKAY;
 }
 
 static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
@@ -424,8 +426,22 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
     }
 }
 
-static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
+static int stdvga_mem_write(const struct hvm_io_handler *handler,
+                            uint64_t addr, uint32_t size,
+                            uint64_t data)
 {
+    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
+    ioreq_t p = { .type = IOREQ_TYPE_COPY,
+                  .addr = addr,
+                  .size = size,
+                  .count = 1,
+                  .dir = IOREQ_WRITE,
+                  .data = data,
+    };
+
+    if ( !s->cache )
+        goto done;
+
     /* Intercept mmio write */
     switch ( size )
     {
@@ -457,156 +473,72 @@ static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
         break;
 
     default:
-        gdprintk(XENLOG_WARNING, "invalid io size: %"PRId64"\n", size);
+        gdprintk(XENLOG_WARNING, "invalid io size: %u\n", size);
         break;
     }
-}
-
-static uint32_t read_data;
-
-static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
-{
-    int i;
-    uint64_t addr = p->addr;
-    p2m_type_t p2mt;
-    struct domain *d = current->domain;
-    int step = p->df ? -p->size : p->size;
-
-    if ( p->data_is_ptr )
-    {
-        uint64_t data = p->data, tmp;
 
-        if ( p->dir == IOREQ_READ )
-        {
-            for ( i = 0; i < p->count; i++ ) 
-            {
-                tmp = stdvga_mem_read(addr, p->size);
-                if ( hvm_copy_to_guest_phys(data, &tmp, p->size) !=
-                     HVMCOPY_okay )
-                {
-                    struct page_info *dp = get_page_from_gfn(
-                            d, data >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
-                    /*
-                     * The only case we handle is vga_mem <-> vga_mem.
-                     * Anything else disables caching and leaves it to qemu-dm.
-                     */
-                    if ( (p2mt != p2m_mmio_dm) || (data < VGA_MEM_BASE) ||
-                         ((data + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-                    {
-                        if ( dp )
-                            put_page(dp);
-                        return 0;
-                    }
-                    ASSERT(!dp);
-                    stdvga_mem_write(data, tmp, p->size);
-                }
-                data += step;
-                addr += step;
-            }
-        }
-        else
-        {
-            for ( i = 0; i < p->count; i++ )
-            {
-                if ( hvm_copy_from_guest_phys(&tmp, data, p->size) !=
-                     HVMCOPY_okay )
-                {
-                    struct page_info *dp = get_page_from_gfn(
-                        d, data >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
-                    if ( (p2mt != p2m_mmio_dm) || (data < VGA_MEM_BASE) ||
-                         ((data + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-                    {
-                        if ( dp )
-                            put_page(dp);
-                        return 0;
-                    }
-                    ASSERT(!dp);
-                    tmp = stdvga_mem_read(data, p->size);
-                }
-                stdvga_mem_write(addr, tmp, p->size);
-                data += step;
-                addr += step;
-            }
-        }
-    }
-    else if ( p->dir == IOREQ_WRITE )
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            stdvga_mem_write(addr, p->data, p->size);
-            addr += step;
-        }
-    }
-    else
-    {
-        ASSERT(p->count == 1);
-        p->data = stdvga_mem_read(addr, p->size);
-    }
+ done:
+    if ( hvm_buffered_io_send(&p) )
+        return X86EMUL_OKAY;
 
-    read_data = p->data;
-    return 1;
+    return X86EMUL_UNHANDLEABLE;
 }
 
-int stdvga_intercept_mmio(ioreq_t *p)
+static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
+                                const ioreq_t *p)
 {
-    struct domain *d = current->domain;
-    struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
-    uint64_t start, end, count = p->count, size = p->size;
-    int buf = 0, rc;
-
-    if ( p->df )
-    {
-        start = (p->addr - (count - 1) * size);
-        end = p->addr + size;
-    }
-    else
-    {
-        start = p->addr;
-        end = p->addr + count * size;
-    }
-
-    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-        return X86EMUL_UNHANDLEABLE;
-
-    if ( p->size > 8 )
-    {
-        gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
-        return X86EMUL_UNHANDLEABLE;
-    }
+    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
 
     spin_lock(&s->lock);
 
-    if ( s->stdvga && s->cache )
+    if ( !s->stdvga ||
+         (p->addr < VGA_MEM_BASE) ||
+         ((p->addr + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
+        goto reject;
+
+    if ( p->dir == IOREQ_WRITE && p->count > 1 )
     {
-        switch ( p->type )
+        /*
+         * We cannot return X86EMUL_UNHANDLEABLE on anything other than the
+         * first cycle of an I/O. So, since we cannot guarantee to always be
+         * able to buffer writes, we have to reject any multi-cycle I/O and,
+         * since we are rejecting an I/O, we must invalidate the cache.
+         * Single-cycle write transactions are accepted even if the cache is
+         * not active since we can assert, when in stdvga mode, that writes
+         * to VRAM have no side effect and thus we can try to buffer them.
+         */
+        if ( s->cache )
         {
-        case IOREQ_TYPE_COPY:
-            buf = mmio_move(s, p);
-            if ( !buf )
-                s->cache = 0;
-            break;
-        default:
-            gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d "
-                     "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
-                     "isptr:%d dir:%d df:%d\n",
-                     p->type, (int)p->addr, (int)p->data, (int)p->size,
-                     (int)p->count, p->state,
-                     p->data_is_ptr, p->dir, p->df);
+            gdprintk(XENLOG_INFO, "leaving caching mode\n");
             s->cache = 0;
         }
+
+        goto reject;
     }
-    else
-    {
-        buf = (p->dir == IOREQ_WRITE);
-    }
+    else if ( p->dir == IOREQ_READ && !s->cache )
+        goto reject;
 
-    rc = (buf && hvm_buffered_io_send(p));
+    return 1;
 
+ reject:
     spin_unlock(&s->lock);
+    return 0;
+}
 
-    return rc ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
+static void stdvga_mem_complete(const struct hvm_io_handler *handler)
+{
+    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
+
+    spin_unlock(&s->lock);
 }
 
+static const struct hvm_io_ops stdvga_mem_ops = {
+    .accept = stdvga_mem_accept,
+    .read = stdvga_mem_read,
+    .write = stdvga_mem_write,
+    .complete = stdvga_mem_complete
+};
+
 void stdvga_init(struct domain *d)
 {
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
@@ -630,10 +562,17 @@ void stdvga_init(struct domain *d)
 
     if ( i == ARRAY_SIZE(s->vram_page) )
     {
+        struct hvm_io_handler *handler;
+
         /* Sequencer registers. */
         register_portio_handler(d, 0x3c4, 2, stdvga_intercept_pio);
         /* Graphics registers. */
         register_portio_handler(d, 0x3ce, 2, stdvga_intercept_pio);
+
+        /* VGA memory */
+        handler = hvm_next_io_handler(d);
+        handler->type = IOREQ_TYPE_COPY;
+        handler->ops = &stdvga_mem_ops;
     }
 }
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 9503e6e..8394959 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -86,10 +86,13 @@ typedef int (*hvm_io_write_t)(const struct hvm_io_handler *,
                               uint64_t data);
 typedef bool_t (*hvm_io_accept_t)(const struct hvm_io_handler *,
                                   const ioreq_t *p);
+typedef void (*hvm_io_complete_t)(const struct hvm_io_handler *);
+
 struct hvm_io_ops {
-    hvm_io_accept_t accept;
-    hvm_io_read_t   read;
-    hvm_io_write_t  write;
+    hvm_io_accept_t   accept;
+    hvm_io_read_t     read;
+    hvm_io_write_t    write;
+    hvm_io_complete_t complete;
 };
 
 int hvm_process_io_intercept(const struct hvm_io_handler *handler,
@@ -141,7 +144,6 @@ struct hvm_hw_stdvga {
 };
 
 void stdvga_init(struct domain *d);
-int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (7 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-03 17:18   ` Andrew Cooper
  2015-07-09 10:05   ` Jan Beulich
  2015-07-03 16:25 ` [PATCH v6 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
                   ` (7 subsequent siblings)
  16 siblings, 2 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

By limiting hvmemul_do_io_addr() to reps falling within the page on which
a reference has already been taken, we can guarantee that calls to
hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out
or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic (added
by c/s 82ed8716b "fix direct PCI port I/O emulation retry and error
handling") from the intercept code and simplify it significantly.

Normally hvmemul_do_io_addr() will only reference a single page at a time.
It will, however, take an extra page reference for I/O spanning a page
boundary.

It is still important to know, upon returning from x86_emulate(), whether
the number of reps was reduced, so the mmio_retry flag is retained for that
purpose.
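
A hypothetical worked example of the clamping (values made up): a
forwards (df clear) rep access at gpa 0x10ff8 with size 2 and *reps 16
gives page_off = 0xff8, so

    count = min_t(unsigned long, 16, (PAGE_SIZE - 0xff8) / 2); /* = 4 */

and only 4 reps are issued this time round; mmio_retry is then set
because 4 < 16, so the remaining reps are handled when emulation is
retried.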

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c     |   76 ++++++++++++++++++++++++++--------------
 xen/arch/x86/hvm/hvm.c         |    4 +++
 xen/arch/x86/hvm/intercept.c   |   57 +++++++-----------------------
 xen/include/asm-x86/hvm/vcpu.h |    2 +-
 4 files changed, 66 insertions(+), 73 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 1afd626..c80c488 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -51,7 +51,7 @@ static void hvmtrace_io_assist(const ioreq_t *p)
 }
 
 static int hvmemul_do_io(
-    bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
+    bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
     struct vcpu *curr = current;
@@ -60,6 +60,7 @@ static int hvmemul_do_io(
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
         .addr = addr,
         .size = size,
+        .count = reps,
         .dir = dir,
         .df = df,
         .data = data,
@@ -125,15 +126,6 @@ static int hvmemul_do_io(
         HVMIO_dispatched : HVMIO_awaiting_completion;
     vio->io_size = size;
 
-    /*
-     * When retrying a repeated string instruction, force exit to guest after
-     * completion of the retried iteration to allow handling of interrupts.
-     */
-    if ( vio->mmio_retrying )
-        *reps = 1;
-
-    p.count = *reps;
-
     if ( dir == IOREQ_WRITE )
     {
         if ( !data_is_addr )
@@ -147,17 +139,9 @@ static int hvmemul_do_io(
     switch ( rc )
     {
     case X86EMUL_OKAY:
-    case X86EMUL_RETRY:
-        *reps = p.count;
         p.state = STATE_IORESP_READY;
-        if ( !vio->mmio_retry )
-        {
-            hvm_io_assist(&p);
-            vio->io_state = HVMIO_none;
-        }
-        else
-            /* Defer hvm_io_assist() invocation to hvm_do_resume(). */
-            vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+        hvm_io_assist(&p);
+        vio->io_state = HVMIO_none;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -235,7 +219,7 @@ static int hvmemul_do_io_buffer(
 
     BUG_ON(buffer == NULL);
 
-    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
+    rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
                        (uintptr_t)buffer);
     if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
         memset(buffer, 0xff, size);
@@ -287,17 +271,56 @@ static int hvmemul_do_io_addr(
     bool_t is_mmio, paddr_t addr, unsigned long *reps,
     unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
 {
-    struct page_info *ram_page;
+    struct vcpu *v = current;
+    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
+    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
+    struct page_info *ram_page[2];
+    int nr_pages = 0;
+    unsigned long count;
     int rc;
 
-    rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page);
+    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
     if ( rc != X86EMUL_OKAY )
-        return rc;
+        goto out;
 
-    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
+    nr_pages++;
+
+    /* Determine how many reps will fit within this page */
+    count = min_t(unsigned long,
+                  *reps,
+                  df ?
+                  (page_off + size - 1) / size :
+                  (PAGE_SIZE - page_off) / size);
+
+    if ( count == 0 )
+    {
+        /*
+         * This access must span two pages, so grab a reference to
+         * the next page and do a single rep.
+         * It is safe to assume multiple pages are physically
+         * contiguous at this point as hvmemul_linear_to_phys() will
+         * ensure this is the case.
+         */
+        rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
+                                  &ram_page[nr_pages]);
+        if ( rc != X86EMUL_OKAY )
+            goto out;
+
+        nr_pages++;
+        count = 1;
+    }
+
+    rc = hvmemul_do_io(is_mmio, addr, count, size, dir, df, 1,
                        ram_gpa);
+    if ( rc == X86EMUL_OKAY )
+    {
+        v->arch.hvm_vcpu.hvm_io.mmio_retry = (count < *reps);
+        *reps = count;
+    }
 
-    hvmemul_release_page(ram_page);
+ out:
+    while ( --nr_pages >= 0 )
+        hvmemul_release_page(ram_page[nr_pages]);
 
     return rc;
 }
@@ -1547,7 +1570,6 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     }
 
     hvmemul_ctxt->exn_pending = 0;
-    vio->mmio_retrying = vio->mmio_retry;
     vio->mmio_retry = 0;
 
     if ( cpu_has_vmx )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c25b001..810e579 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -440,6 +440,7 @@ static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 
 void hvm_do_resume(struct vcpu *v)
 {
+    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     struct domain *d = v->domain;
     struct hvm_ioreq_server *s;
 
@@ -468,6 +469,9 @@ void hvm_do_resume(struct vcpu *v)
         }
     }
 
+    if ( vio->mmio_retry )
+        (void)handle_mmio();
+
     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 ) 
     {
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 78e2f57..20bb4ea 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -123,8 +123,6 @@ static const struct hvm_io_ops portio_ops = {
 int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                              ioreq_t *p)
 {
-    struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
     uint64_t data;
@@ -134,23 +132,12 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
     {
         for ( i = 0; i < p->count; i++ )
         {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-            {
-                addr = (p->type == IOREQ_TYPE_COPY) ?
-                       p->addr + step * i :
-                       p->addr;
-                rc = ops->read(handler, addr, p->size, &data);
-                if ( rc != X86EMUL_OKAY )
-                    break;
-            }
+            addr = (p->type == IOREQ_TYPE_COPY) ?
+                   p->addr + step * i :
+                   p->addr;
+            rc = ops->read(handler, addr, p->size, &data);
+            if ( rc != X86EMUL_OKAY )
+                break;
 
             if ( p->data_is_ptr )
             {
@@ -159,14 +146,12 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                 {
                 case HVMCOPY_okay:
                     break;
-                case HVMCOPY_gfn_paged_out:
-                case HVMCOPY_gfn_shared:
-                    rc = X86EMUL_RETRY;
-                    break;
                 case HVMCOPY_bad_gfn_to_mfn:
                     /* Drop the write as real hardware would. */
                     continue;
                 case HVMCOPY_bad_gva_to_gfn:
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
@@ -179,13 +164,6 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
             else
                 p->data = data;
         }
-
-        if ( rc == X86EMUL_RETRY )
-        {
-            vio->mmio_retry = 1;
-            vio->mmio_large_read_bytes = p->size;
-            memcpy(vio->mmio_large_read, &data, p->size);
-        }
     }
     else /* p->dir == IOREQ_WRITE */
     {
@@ -198,14 +176,12 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                 {
                 case HVMCOPY_okay:
                     break;
-                case HVMCOPY_gfn_paged_out:
-                case HVMCOPY_gfn_shared:
-                    rc = X86EMUL_RETRY;
-                    break;
                 case HVMCOPY_bad_gfn_to_mfn:
                     data = ~0;
                     break;
                 case HVMCOPY_bad_gva_to_gfn:
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
@@ -225,19 +201,10 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
             if ( rc != X86EMUL_OKAY )
                 break;
         }
-
-        if ( rc == X86EMUL_RETRY )
-            vio->mmio_retry = 1;
     }
 
-    if ( i != 0 )
-    {
-        if ( rc == X86EMUL_UNHANDLEABLE )
-            domain_crash(curr->domain);
-
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
+    if ( i != 0 && rc == X86EMUL_UNHANDLEABLE )
+        domain_crash(current->domain);
 
     return rc;
 }
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index b15c1e6..2ed0606 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -74,7 +74,7 @@ struct hvm_vcpu_io {
      * For string instruction emulation we need to be able to signal a
      * necessary retry through other than function return codes.
      */
-    bool_t mmio_retry, mmio_retrying;
+    bool_t mmio_retry;
 
     unsigned long msix_unmask_address;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (8 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 11/16] x86/hvm: split I/O completion handling from state model Paul Durrant
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser

By removing the calls in hvmemul_do_io() (which is replaced by a single
assignment) and hvm_complete_assist_req() (which is replaced by a call to
hvm_process_io_intercept() with a suitable set of ops), hvm_io_assist()
can be moved into hvm.c and made static (and hence becomes a candidate
for inlining).

The calls to msix_write_completion() and vcpu_end_shutdown_deferral()
are also made unconditionally because the ioreq state will always be
STATE_IOREQ_NONE at the end of hvm_io_assist(), so the test was
pointless. These calls are also only relevant when the emulation has
been handled externally, which is now always the case.
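
For clarity (condensed from the emulate.c hunk below, not new code), the
"single assignment" referred to above is the latching of the result for
internally handled I/O, replacing the old call to hvm_io_assist():

    case X86EMUL_OKAY:
        vio->io_data = p.data;
        vio->io_state = HVMIO_none;
        break;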

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c    |   34 ++++++++++++++++++---
 xen/arch/x86/hvm/hvm.c        |   67 ++++++++++++++++++++++-------------------
 xen/arch/x86/hvm/io.c         |   39 ------------------------
 xen/include/asm-x86/hvm/hvm.h |    1 -
 xen/include/asm-x86/hvm/io.h  |    1 -
 5 files changed, 66 insertions(+), 76 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index c80c488..d1775ac 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -50,6 +50,32 @@ static void hvmtrace_io_assist(const ioreq_t *p)
     trace_var(event, 0/*!cycles*/, size, buffer);
 }
 
+static int null_read(const struct hvm_io_handler *io_handler,
+                     uint64_t addr,
+                     uint32_t size,
+                     uint64_t *data)
+{
+    *data = ~0ul;
+    return X86EMUL_OKAY;
+}
+
+static int null_write(const struct hvm_io_handler *handler,
+                      uint64_t addr,
+                      uint32_t size,
+                      uint64_t data)
+{
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops null_ops = {
+    .read = null_read,
+    .write = null_write
+};
+
+static const struct hvm_io_handler null_handler = {
+    .ops = &null_ops
+};
+
 static int hvmemul_do_io(
     bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
@@ -139,8 +165,7 @@ static int hvmemul_do_io(
     switch ( rc )
     {
     case X86EMUL_OKAY:
-        p.state = STATE_IORESP_READY;
-        hvm_io_assist(&p);
+        vio->io_data = p.data;
         vio->io_state = HVMIO_none;
         break;
     case X86EMUL_UNHANDLEABLE:
@@ -151,8 +176,9 @@ static int hvmemul_do_io(
         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
         {
-            hvm_complete_assist_req(&p);
-            rc = X86EMUL_OKAY;
+            rc = hvm_process_io_intercept(&null_handler, &p);
+            if ( rc == X86EMUL_OKAY )
+                vio->io_data = p.data;
             vio->io_state = HVMIO_none;
         }
         else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 810e579..c1a58c2 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -411,6 +411,42 @@ bool_t hvm_io_pending(struct vcpu *v)
     return 0;
 }
 
+static void hvm_io_assist(ioreq_t *p)
+{
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+    enum hvm_io_state io_state;
+
+    p->state = STATE_IOREQ_NONE;
+
+    io_state = vio->io_state;
+    vio->io_state = HVMIO_none;
+
+    switch ( io_state )
+    {
+    case HVMIO_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        break;
+    case HVMIO_handle_mmio_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        (void)handle_mmio();
+        break;
+    case HVMIO_handle_pio_awaiting_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)p->data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
+        break;
+    default:
+        break;
+    }
+
+    msix_write_completion(curr);
+    vcpu_end_shutdown_deferral(curr);
+}
+
 static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 {
     /* NB. Optimised for common case (p->state == STATE_IOREQ_NONE). */
@@ -2664,37 +2700,6 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
     return X86EMUL_UNHANDLEABLE;
 }
 
-void hvm_complete_assist_req(ioreq_t *p)
-{
-    switch ( p->type )
-    {
-    case IOREQ_TYPE_PCI_CONFIG:
-        ASSERT_UNREACHABLE();
-        break;
-    case IOREQ_TYPE_COPY:
-    case IOREQ_TYPE_PIO:
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( !p->data_is_ptr )
-                p->data = ~0ul;
-            else
-            {
-                int i, step = p->df ? -p->size : p->size;
-                uint32_t data = ~0;
-
-                for ( i = 0; i < p->count; i++ )
-                    hvm_copy_to_guest_phys(p->data + step * i, &data,
-                                           p->size);
-            }
-        }
-        /* FALLTHRU */
-    default:
-        p->state = STATE_IORESP_READY;
-        hvm_io_assist(p);
-        break;
-    }
-}
-
 void hvm_broadcast_assist_req(ioreq_t *p)
 {
     struct domain *d = current->domain;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 2c88ddb..fe099d8 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -169,45 +169,6 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
     return 1;
 }
 
-void hvm_io_assist(ioreq_t *p)
-{
-    struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    enum hvm_io_state io_state;
-
-    p->state = STATE_IOREQ_NONE;
-
-    io_state = vio->io_state;
-    vio->io_state = HVMIO_none;
-
-    switch ( io_state )
-    {
-    case HVMIO_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        break;
-    case HVMIO_handle_mmio_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        (void)handle_mmio();
-        break;
-    case HVMIO_handle_pio_awaiting_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)p->data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        break;
-    default:
-        break;
-    }
-
-    if ( p->state == STATE_IOREQ_NONE )
-    {
-        msix_write_completion(curr);
-        vcpu_end_shutdown_deferral(curr);
-    }
-}
-
 static bool_t dpci_portio_accept(const struct hvm_io_handler *handler,
                                  const ioreq_t *p)
 {
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 57f9605..9792e15 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -232,7 +232,6 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
                                                  ioreq_t *p);
 int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *p);
 void hvm_broadcast_assist_req(ioreq_t *p);
-void hvm_complete_assist_req(ioreq_t *p);
 
 void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat);
 int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat);
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 8394959..8ab63a1 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -125,7 +125,6 @@ int handle_mmio_with_translation(unsigned long gva, unsigned long gpfn,
                                  struct npfec);
 int handle_pio(uint16_t port, unsigned int size, int dir);
 void hvm_interrupt_post(struct vcpu *v, int vector, int type);
-void hvm_io_assist(ioreq_t *p);
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
                   const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 11/16] x86/hvm: split I/O completion handling from state model
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (9 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-09 10:09   ` Jan Beulich
  2015-07-03 16:25 ` [PATCH v6 12/16] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

The state of in-flight I/O and how its completion will be handled are
logically separate and conflating the two makes the code unnecessarily
confusing.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c            |   44 +++++++++++++++++++++++++++----------
 xen/arch/x86/hvm/io.c             |    6 ++---
 xen/arch/x86/hvm/vmx/realmode.c   |   27 +++++++++++++++++------
 xen/include/asm-x86/hvm/vcpu.h    |   16 +++++++++-----
 xen/include/asm-x86/hvm/vmx/vmx.h |    1 +
 5 files changed, 67 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c1a58c2..98d1536 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -59,6 +59,7 @@
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/event.h>
+#include <asm/hvm/vmx/vmx.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <public/sched.h>
@@ -428,17 +429,6 @@ static void hvm_io_assist(ioreq_t *p)
         vio->io_state = HVMIO_completed;
         vio->io_data = p->data;
         break;
-    case HVMIO_handle_mmio_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        (void)handle_mmio();
-        break;
-    case HVMIO_handle_pio_awaiting_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)p->data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        break;
     default:
         break;
     }
@@ -479,6 +469,7 @@ void hvm_do_resume(struct vcpu *v)
     struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     struct domain *d = v->domain;
     struct hvm_ioreq_server *s;
+    enum hvm_io_completion io_completion;
 
     check_wakeup_from_wait();
 
@@ -505,8 +496,37 @@ void hvm_do_resume(struct vcpu *v)
         }
     }
 
-    if ( vio->mmio_retry )
+    io_completion = vio->io_completion;
+    vio->io_completion = HVMIO_no_completion;
+
+    switch ( io_completion )
+    {
+    case HVMIO_no_completion:
+        break;
+    case HVMIO_mmio_completion:
         (void)handle_mmio();
+        break;
+    case HVMIO_pio_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
+        vio->io_state = HVMIO_none;
+        break;
+    case HVMIO_realmode_completion:
+    {
+        struct hvm_emulate_ctxt ctxt;
+
+        hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+        vmx_realmode_emulate_one(&ctxt);
+        hvm_emulate_writeback(&ctxt);
+
+        break;
+    }
+    default:
+        ASSERT_UNREACHABLE();
+        break;
+    }
 
     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 ) 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index fe099d8..221d05e 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -92,8 +92,8 @@ int handle_mmio(void)
 
     if ( rc != X86EMUL_RETRY )
         vio->io_state = HVMIO_none;
-    if ( vio->io_state == HVMIO_awaiting_completion )
-        vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+        vio->io_completion = HVMIO_mmio_completion;
     else
         vio->mmio_access = (struct npfec){};
 
@@ -158,7 +158,7 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
             return 0;
         /* Completion in hvm_io_assist() with no re-emulation required. */
         ASSERT(dir == IOREQ_READ);
-        vio->io_state = HVMIO_handle_pio_awaiting_completion;
+        vio->io_completion = HVMIO_pio_completion;
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index fe8b4a0..76ff9a5 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -101,15 +101,19 @@ static void realmode_deliver_exception(
     }
 }
 
-static void realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
+void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     int rc;
 
     perfc_incr(realmode_emulations);
 
     rc = hvm_emulate_one(hvmemul_ctxt);
 
+    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+        vio->io_completion = HVMIO_realmode_completion;
+
     if ( rc == X86EMUL_UNHANDLEABLE )
     {
         gdprintk(XENLOG_ERR, "Failed to emulate insn.\n");
@@ -177,9 +181,6 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
     hvm_emulate_prepare(&hvmemul_ctxt, regs);
 
-    if ( vio->io_state == HVMIO_completed )
-        realmode_emulate_one(&hvmemul_ctxt);
-
     /* Only deliver interrupts into emulated real mode. */
     if ( !(curr->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) &&
          (intr_info & INTR_INFO_VALID_MASK) )
@@ -190,8 +191,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
     curr->arch.hvm_vmx.vmx_emulate = 1;
     while ( curr->arch.hvm_vmx.vmx_emulate &&
-            !softirq_pending(smp_processor_id()) &&
-            (vio->io_state == HVMIO_none) )
+            !softirq_pending(smp_processor_id()) )
     {
         /*
          * Check for pending interrupts only every 16 instructions, because
@@ -203,7 +203,10 @@ void vmx_realmode(struct cpu_user_regs *regs)
              hvm_local_events_need_delivery(curr) )
             break;
 
-        realmode_emulate_one(&hvmemul_ctxt);
+        vmx_realmode_emulate_one(&hvmemul_ctxt);
+
+        if ( vio->io_state != HVMIO_none || vio->mmio_retry )
+            break;
 
         /* Stop emulating unless our segment state is not safe */
         if ( curr->arch.hvm_vmx.vmx_realmode )
@@ -245,3 +248,13 @@ void vmx_realmode(struct cpu_user_regs *regs)
     if ( intr_info & INTR_INFO_VALID_MASK )
         __vmwrite(VM_ENTRY_INTR_INFO, intr_info);
 }
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 2ed0606..dbdee14 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -34,11 +34,16 @@ enum hvm_io_state {
     HVMIO_none = 0,
     HVMIO_dispatched,
     HVMIO_awaiting_completion,
-    HVMIO_handle_mmio_awaiting_completion,
-    HVMIO_handle_pio_awaiting_completion,
     HVMIO_completed
 };
 
+enum hvm_io_completion {
+    HVMIO_no_completion,
+    HVMIO_mmio_completion,
+    HVMIO_pio_completion,
+    HVMIO_realmode_completion
+};
+
 struct hvm_vcpu_asid {
     uint64_t generation;
     uint32_t asid;
@@ -46,9 +51,10 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    enum hvm_io_state   io_state;
-    unsigned long       io_data;
-    int                 io_size;
+    enum hvm_io_state      io_state;
+    unsigned long          io_data;
+    unsigned int           io_size;
+    enum hvm_io_completion io_completion;
 
     /*
      * HVM emulation:
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 35f804a..c5f3d24 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -94,6 +94,7 @@ void vmx_asm_do_vmentry(void);
 void vmx_intr_assist(void);
 void noreturn vmx_do_resume(struct vcpu *);
 void vmx_vlapic_msr_changed(struct vcpu *v);
+void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt);
 void vmx_realmode(struct cpu_user_regs *regs);
 void vmx_update_debug_state(struct vcpu *v);
 void vmx_update_exception_bitmap(struct vcpu *v);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 12/16] x86/hvm: remove HVMIO_dispatched I/O state
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (10 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 11/16] x86/hvm: split I/O completion handling from state model Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-09 10:13   ` Jan Beulich
  2015-07-03 16:25 ` [PATCH v6 13/16] x86/hvm: remove hvm_io_state enumeration Paul Durrant
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

By removing the HVMIO_dispatched state and making all pending emulations
(i.e. all those not handled by the hypervisor) use HVMIO_awaiting_completion,
various code-paths can be simplified.

The completion case for HVMIO_dispatched can also be trivially removed
from hvmemul_do_io() as it was already unreachable. This is because that
state was only ever used for writes, or for I/O to/from a guest page, and
hvmemul_do_io() is never called to complete such I/O.

NOTE: There is one subtlety in handle_pio()... The only case in which
      handle_pio() got a return code of X86EMUL_RETRY back from
      hvmemul_do_pio_buffer() and found io_state was not
      HVMIO_awaiting_completion was when the domain was shutting down.
      This is because all writes normally yield a return of
      X86EMUL_OKAY and all reads put io_state into
      HVMIO_awaiting_completion. Hence the io_state check there is
      replaced with a check of the domain's is_shutting_down flag.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c      |   12 +++---------
 xen/arch/x86/hvm/hvm.c          |   12 +++---------
 xen/arch/x86/hvm/io.c           |   14 +++++++-------
 xen/arch/x86/hvm/vmx/realmode.c |    2 +-
 xen/include/asm-x86/hvm/vcpu.h  |   10 +++++++++-
 5 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index d1775ac..c694bb1 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -137,20 +137,14 @@ static int hvmemul_do_io(
         if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
-    case HVMIO_dispatched:
-        /* May have to wait for previous cycle of a multi-write to complete. */
-        if ( is_mmio && !data_is_addr && (dir == IOREQ_WRITE) &&
-             (addr == (vio->mmio_large_write_pa +
-                       vio->mmio_large_write_bytes)) )
-            return X86EMUL_RETRY;
-        /* fallthrough */
     default:
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = (data_is_addr || dir == IOREQ_WRITE) ?
-        HVMIO_dispatched : HVMIO_awaiting_completion;
+    vio->io_state = HVMIO_awaiting_completion;
     vio->io_size = size;
+    vio->io_dir = dir;
+    vio->io_data_is_addr = data_is_addr;
 
     if ( dir == IOREQ_WRITE )
     {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 98d1536..4aa7ed2 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -416,22 +416,16 @@ static void hvm_io_assist(ioreq_t *p)
 {
     struct vcpu *curr = current;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    enum hvm_io_state io_state;
 
     p->state = STATE_IOREQ_NONE;
 
-    io_state = vio->io_state;
-    vio->io_state = HVMIO_none;
-
-    switch ( io_state )
+    if ( hvm_vcpu_io_need_completion(vio) )
     {
-    case HVMIO_awaiting_completion:
         vio->io_state = HVMIO_completed;
         vio->io_data = p->data;
-        break;
-    default:
-        break;
     }
+    else
+        vio->io_state = HVMIO_none;
 
     msix_write_completion(curr);
     vcpu_end_shutdown_deferral(curr);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 221d05e..3b51d59 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -90,9 +90,7 @@ int handle_mmio(void)
 
     rc = hvm_emulate_one(&ctxt);
 
-    if ( rc != X86EMUL_RETRY )
-        vio->io_state = HVMIO_none;
-    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+    if ( hvm_vcpu_io_need_completion(vio) || vio->mmio_retry )
         vio->io_completion = HVMIO_mmio_completion;
     else
         vio->mmio_access = (struct npfec){};
@@ -142,6 +140,9 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
 
     rc = hvmemul_do_pio_buffer(port, size, dir, &data);
 
+    if ( hvm_vcpu_io_need_completion(vio) )
+        vio->io_completion = HVMIO_pio_completion;
+
     switch ( rc )
     {
     case X86EMUL_OKAY:
@@ -154,11 +155,10 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
         }
         break;
     case X86EMUL_RETRY:
-        if ( vio->io_state != HVMIO_awaiting_completion )
+        /* We should not advance RIP/EIP if the domain is shutting down */
+        if ( curr->domain->is_shutting_down )
             return 0;
-        /* Completion in hvm_io_assist() with no re-emulation required. */
-        ASSERT(dir == IOREQ_READ);
-        vio->io_completion = HVMIO_pio_completion;
+
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 76ff9a5..deb53ae 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -111,7 +111,7 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 
     rc = hvm_emulate_one(hvmemul_ctxt);
 
-    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+    if ( hvm_vcpu_io_need_completion(vio) || vio->mmio_retry )
         vio->io_completion = HVMIO_realmode_completion;
 
     if ( rc == X86EMUL_UNHANDLEABLE )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index dbdee14..de8cdd7 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -32,7 +32,6 @@
 
 enum hvm_io_state {
     HVMIO_none = 0,
-    HVMIO_dispatched,
     HVMIO_awaiting_completion,
     HVMIO_completed
 };
@@ -55,6 +54,8 @@ struct hvm_vcpu_io {
     unsigned long          io_data;
     unsigned int           io_size;
     enum hvm_io_completion io_completion;
+    uint8_t                io_dir;
+    uint8_t                io_data_is_addr;
 
     /*
      * HVM emulation:
@@ -87,6 +88,13 @@ struct hvm_vcpu_io {
     const struct g2m_ioport *g2m_ioport;
 };
 
+static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
+{
+    return (vio->io_state == HVMIO_awaiting_completion) &&
+           !vio->io_data_is_addr &&
+           (vio->io_dir == IOREQ_READ);
+}
+
 #define VMCX_EADDR    (~0ULL)
 
 struct nestedvcpu {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 13/16] x86/hvm: remove hvm_io_state enumeration
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (11 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 12/16] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 14/16] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser

Emulation request status is already covered by STATE_IOREQ_XXX values so
just use those. The mapping is:

HVMIO_none                -> STATE_IOREQ_NONE
HVMIO_awaiting_completion -> STATE_IOREQ_READY
HVMIO_completed           -> STATE_IORESP_READY
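
For reference, those states come from the existing public ioreq
interface (xen/include/public/hvm/ioreq.h), which defines roughly:

    #define STATE_IOREQ_NONE          0
    #define STATE_IOREQ_READY         1
    #define STATE_IOREQ_INPROCESS     2
    #define STATE_IORESP_READY        3

which is why io_state can simply become a uint8_t holding these values.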

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c       |   14 +++++++-------
 xen/arch/x86/hvm/hvm.c           |    6 +++---
 xen/arch/x86/hvm/svm/nestedsvm.c |    2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h   |   10 ++--------
 5 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index c694bb1..8c083c2 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -130,10 +130,10 @@ static int hvmemul_do_io(
 
     switch ( vio->io_state )
     {
-    case HVMIO_none:
+    case STATE_IOREQ_NONE:
         break;
-    case HVMIO_completed:
-        vio->io_state = HVMIO_none;
+    case STATE_IORESP_READY:
+        vio->io_state = STATE_IOREQ_NONE;
         if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
@@ -141,7 +141,7 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = HVMIO_awaiting_completion;
+    vio->io_state = STATE_IOREQ_READY;
     vio->io_size = size;
     vio->io_dir = dir;
     vio->io_data_is_addr = data_is_addr;
@@ -160,7 +160,7 @@ static int hvmemul_do_io(
     {
     case X86EMUL_OKAY:
         vio->io_data = p.data;
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -173,13 +173,13 @@ static int hvmemul_do_io(
             rc = hvm_process_io_intercept(&null_handler, &p);
             if ( rc == X86EMUL_OKAY )
                 vio->io_data = p.data;
-            vio->io_state = HVMIO_none;
+            vio->io_state = STATE_IOREQ_NONE;
         }
         else
         {
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
-                vio->io_state = HVMIO_none;
+                vio->io_state = STATE_IOREQ_NONE;
             else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
         }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4aa7ed2..b3b0339 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p)
 
     if ( hvm_vcpu_io_need_completion(vio) )
     {
-        vio->io_state = HVMIO_completed;
+        vio->io_state = STATE_IORESP_READY;
         vio->io_data = p->data;
     }
     else
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
 
     msix_write_completion(curr);
     vcpu_end_shutdown_deferral(curr);
@@ -505,7 +505,7 @@ void hvm_do_resume(struct vcpu *v)
             guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
         else
             memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
         break;
     case HVMIO_realmode_completion:
     {
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index be5797a..8b165c6 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1231,7 +1231,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
          * Delay the injection because this would result in delivering
          * an interrupt *within* the execution of an instruction.
          */
-        if ( v->arch.hvm_vcpu.hvm_io.io_state != HVMIO_none )
+        if ( v->arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE )
             return hvm_intblk_shadow;
 
         if ( !nv->nv_vmexit_pending && n2vmcb->exitintinfo.bytes != 0 ) {
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index deb53ae..25533dc 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -205,7 +205,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
         vmx_realmode_emulate_one(&hvmemul_ctxt);
 
-        if ( vio->io_state != HVMIO_none || vio->mmio_retry )
+        if ( vio->io_state != STATE_IOREQ_NONE || vio->mmio_retry )
             break;
 
         /* Stop emulating unless our segment state is not safe */
@@ -219,7 +219,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     }
 
     /* Need to emulate next time if we've started an IO operation */
-    if ( vio->io_state != HVMIO_none )
+    if ( vio->io_state != STATE_IOREQ_NONE )
         curr->arch.hvm_vmx.vmx_emulate = 1;
 
     if ( !curr->arch.hvm_vmx.vmx_emulate && !curr->arch.hvm_vmx.vmx_realmode )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index de8cdd7..aa1eda5 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -30,12 +30,6 @@
 #include <asm/hvm/svm/nestedsvm.h>
 #include <asm/mtrr.h>
 
-enum hvm_io_state {
-    HVMIO_none = 0,
-    HVMIO_awaiting_completion,
-    HVMIO_completed
-};
-
 enum hvm_io_completion {
     HVMIO_no_completion,
     HVMIO_mmio_completion,
@@ -50,7 +44,7 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    enum hvm_io_state      io_state;
+    uint8_t                io_state;
     unsigned long          io_data;
     unsigned int           io_size;
     enum hvm_io_completion io_completion;
@@ -90,7 +84,7 @@ struct hvm_vcpu_io {
 
 static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
 {
-    return (vio->io_state == HVMIO_awaiting_completion) &&
+    return (vio->io_state == STATE_IOREQ_READY) &&
            !vio->io_data_is_addr &&
            (vio->io_dir == IOREQ_READ);
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 14/16] x86/hvm: use ioreq_t to track in-flight state
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (12 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 13/16] x86/hvm: remove hvm_io_state enumeration Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 15/16] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

Use an ioreq_t rather than open coded state, size, dir and data fields
in struct hvm_vcpu_io. This also allows PIO completion to be handled
similarly to MMIO completion by re-issuing the handle_pio() call.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c       |   35 +++++++++++++++++++++--------------
 xen/arch/x86/hvm/hvm.c           |   13 +++++--------
 xen/arch/x86/hvm/svm/nestedsvm.c |    2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h   |   12 ++++--------
 5 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8c083c2..d45f448 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -91,6 +91,7 @@ static int hvmemul_do_io(
         .df = df,
         .data = data,
         .data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */
+        .state = STATE_IOREQ_READY,
     };
     void *p_data = (void *)data;
     int rc;
@@ -128,12 +129,24 @@ static int hvmemul_do_io(
         }
     }
 
-    switch ( vio->io_state )
+    switch ( vio->io_req.state )
     {
     case STATE_IOREQ_NONE:
         break;
     case STATE_IORESP_READY:
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
+        p = vio->io_req;
+
+        /* Verify the emulation request has been correctly re-issued */
+        if ( (p.type != is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO) ||
+             (p.addr != addr) ||
+             (p.size != size) ||
+             (p.count != reps) ||
+             (p.dir != dir) ||
+             (p.df != df) ||
+             (p.data_is_ptr != data_is_addr) )
+            domain_crash(curr->domain);
+
         if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
@@ -141,11 +154,6 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = STATE_IOREQ_READY;
-    vio->io_size = size;
-    vio->io_dir = dir;
-    vio->io_data_is_addr = data_is_addr;
-
     if ( dir == IOREQ_WRITE )
     {
         if ( !data_is_addr )
@@ -154,13 +162,14 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
     }
 
+    vio->io_req = p;
+
     rc = hvm_io_intercept(&p);
 
     switch ( rc )
     {
     case X86EMUL_OKAY:
-        vio->io_data = p.data;
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -171,15 +180,13 @@ static int hvmemul_do_io(
         if ( !s )
         {
             rc = hvm_process_io_intercept(&null_handler, &p);
-            if ( rc == X86EMUL_OKAY )
-                vio->io_data = p.data;
-            vio->io_state = STATE_IOREQ_NONE;
+            vio->io_req.state = STATE_IOREQ_NONE;
         }
         else
         {
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
-                vio->io_state = STATE_IOREQ_NONE;
+                vio->io_req.state = STATE_IOREQ_NONE;
             else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
         }
@@ -198,7 +205,7 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
 
         if ( !data_is_addr )
-            memcpy(p_data, &vio->io_data, size);
+            memcpy(p_data, &p.data, size);
     }
 
     if ( is_mmio && !data_is_addr )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index b3b0339..0c56f4b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p)
 
     if ( hvm_vcpu_io_need_completion(vio) )
     {
-        vio->io_state = STATE_IORESP_READY;
-        vio->io_data = p->data;
+        vio->io_req.state = STATE_IORESP_READY;
+        vio->io_req.data = p->data;
     }
     else
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
 
     msix_write_completion(curr);
     vcpu_end_shutdown_deferral(curr);
@@ -501,11 +501,8 @@ void hvm_do_resume(struct vcpu *v)
         (void)handle_mmio();
         break;
     case HVMIO_pio_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
-        vio->io_state = STATE_IOREQ_NONE;
+        (void)handle_pio(vio->io_req.addr, vio->io_req.size,
+                         vio->io_req.dir);
         break;
     case HVMIO_realmode_completion:
     {
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index 8b165c6..78667a2 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1231,7 +1231,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
          * Delay the injection because this would result in delivering
          * an interrupt *within* the execution of an instruction.
          */
-        if ( v->arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE )
+        if ( v->arch.hvm_vcpu.hvm_io.io_req.state != STATE_IOREQ_NONE )
             return hvm_intblk_shadow;
 
         if ( !nv->nv_vmexit_pending && n2vmcb->exitintinfo.bytes != 0 ) {
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 25533dc..e83a61f 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -205,7 +205,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
         vmx_realmode_emulate_one(&hvmemul_ctxt);
 
-        if ( vio->io_state != STATE_IOREQ_NONE || vio->mmio_retry )
+        if ( vio->io_req.state != STATE_IOREQ_NONE || vio->mmio_retry )
             break;
 
         /* Stop emulating unless our segment state is not safe */
@@ -219,7 +219,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     }
 
     /* Need to emulate next time if we've started an IO operation */
-    if ( vio->io_state != STATE_IOREQ_NONE )
+    if ( vio->io_req.state != STATE_IOREQ_NONE )
         curr->arch.hvm_vmx.vmx_emulate = 1;
 
     if ( !curr->arch.hvm_vmx.vmx_emulate && !curr->arch.hvm_vmx.vmx_realmode )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index aa1eda5..6502c98 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -44,12 +44,8 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    uint8_t                io_state;
-    unsigned long          io_data;
-    unsigned int           io_size;
+    ioreq_t                io_req;
     enum hvm_io_completion io_completion;
-    uint8_t                io_dir;
-    uint8_t                io_data_is_addr;
 
     /*
      * HVM emulation:
@@ -84,9 +80,9 @@ struct hvm_vcpu_io {
 
 static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
 {
-    return (vio->io_state == STATE_IOREQ_READY) &&
-           !vio->io_data_is_addr &&
-           (vio->io_dir == IOREQ_READ);
+    return (vio->io_req.state == STATE_IOREQ_READY) &&
+           !vio->io_req.data_is_ptr &&
+           (vio->io_req.dir == IOREQ_READ);
 }
 
 #define VMCX_EADDR    (~0ULL)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 15/16] x86/hvm: always re-emulate I/O from a buffer
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (13 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 14/16] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-03 16:25 ` [PATCH v6 16/16] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
  2015-07-08 15:44 ` [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Jan Beulich
  16 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser

If memory mapped I/O is 'chunked' then the I/O must be re-emulated;
otherwise only the first chunk will be processed. This patch makes
sure all I/O from a buffer is re-emulated regardless of whether it
is a read or a write.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c     |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h |    3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index d45f448..84c4290 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -147,7 +147,7 @@ static int hvmemul_do_io(
              (p.data_is_ptr != data_is_addr) )
             domain_crash(curr->domain);
 
-        if ( data_is_addr || dir == IOREQ_WRITE )
+        if ( data_is_addr )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
     default:
@@ -187,7 +187,7 @@ static int hvmemul_do_io(
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
                 vio->io_req.state = STATE_IOREQ_NONE;
-            else if ( data_is_addr || dir == IOREQ_WRITE )
+            else if ( data_is_addr )
                 rc = X86EMUL_OKAY;
         }
         break;
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 6502c98..c19c039 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -81,8 +81,7 @@ struct hvm_vcpu_io {
 static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
 {
     return (vio->io_req.state == STATE_IOREQ_READY) &&
-           !vio->io_req.data_is_ptr &&
-           (vio->io_req.dir == IOREQ_READ);
+           !vio->io_req.data_is_ptr;
 }
 
 #define VMCX_EADDR    (~0ULL)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v6 16/16] x86/hvm: track large memory mapped accesses by buffer offset
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (14 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 15/16] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
@ 2015-07-03 16:25 ` Paul Durrant
  2015-07-09 10:33   ` Jan Beulich
  2015-07-09 10:34   ` Jan Beulich
  2015-07-08 15:44 ` [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Jan Beulich
  16 siblings, 2 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-03 16:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

The code in hvmemul_do_io() that tracks large reads or writes, to avoid
re-issue of component I/O, is defeated by accesses across a page boundary
because it tracks by physical address (the two parts of a page-crossing
access generally map to discontiguous physical addresses, so the second
part fails the continuation check). The code is also only relevant to
memory mapped I/O to or from a buffer.

This patch re-factors the code and moves it into hvmemul_phys_mmio_access(),
where it is relevant, and tracks progress by buffer offset rather than by
address. Separate I/O emulations (of which there may be up to three per
instruction) are distinguished by linear address.
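
As an illustration of why the buffer offset is the robust key, here is a
small standalone sketch (not Xen code; the page size and addresses are
hypothetical) of how a page-crossing access is walked: the per-page
pieces land at whatever physical addresses translation yields, but the
buffer offsets advance contiguously and so remain a reliable record of
how much of the access has already been carried out:

    /* Standalone sketch (not Xen code): a page-crossing MMIO access is
     * walked one page at a time, with the buffer offset, rather than
     * any address, recording how far the access has progressed. */
    #include <stdio.h>

    #define PAGE_SIZE 4096ul

    int main(void)
    {
        unsigned long gla = 0x10ff0;        /* hypothetical linear address */
        unsigned int size = 32;             /* e.g. an m256 operand */
        unsigned int buffer_offset = 0;
        unsigned long room = PAGE_SIZE - (gla & (PAGE_SIZE - 1));
        unsigned int chunk = size < room ? size : room;

        while ( size )
        {
            printf("emulate %u bytes at gla %#lx -> buffer[%u..%u)\n",
                   chunk, gla, buffer_offset, buffer_offset + chunk);
            gla += chunk;
            buffer_offset += chunk;
            size -= chunk;
            chunk = size < PAGE_SIZE ? size : PAGE_SIZE;
        }
        return 0;
    }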

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c     |  122 ++++++++++++++++++++--------------------
 xen/include/asm-x86/hvm/vcpu.h |   25 +++++---
 2 files changed, 78 insertions(+), 69 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 84c4290..49afaeb 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -106,29 +106,6 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    if ( is_mmio && !data_is_addr )
-    {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-                return X86EMUL_OKAY;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-            {
-                memcpy(p_data, &vio->mmio_large_read[addr - pa],
-                       size);
-                return X86EMUL_OKAY;
-            }
-        }
-    }
-
     switch ( vio->io_req.state )
     {
     case STATE_IOREQ_NONE:
@@ -208,33 +185,6 @@ static int hvmemul_do_io(
             memcpy(p_data, &p.data, size);
     }
 
-    if ( is_mmio && !data_is_addr )
-    {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_write_pa = addr;
-            if ( addr == (pa + bytes) )
-                vio->mmio_large_write_bytes += size;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_read_pa = addr;
-            if ( (addr == (pa + bytes)) &&
-                 ((bytes + size) <= sizeof(vio->mmio_large_read)) )
-            {
-                memcpy(&vio->mmio_large_read[bytes], p_data, size);
-                vio->mmio_large_read_bytes += size;
-            }
-        }
-    }
-
     return X86EMUL_OKAY;
 }
 
@@ -590,11 +540,12 @@ static int hvmemul_virtual_to_linear(
 }
 
 static int hvmemul_phys_mmio_access(
-    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer)
+    struct hvm_mmio_cache *cache, paddr_t gpa, unsigned int size, uint8_t dir,
+    uint8_t *buffer, unsigned int offset)
 {
     unsigned long one_rep = 1;
     unsigned int chunk;
-    int rc;
+    int rc = X86EMUL_OKAY;
 
     /* Accesses must fall within a page. */
     BUG_ON((gpa & ~PAGE_MASK) + size > PAGE_SIZE);
@@ -611,14 +562,33 @@ static int hvmemul_phys_mmio_access(
 
     for ( ;; )
     {
-        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
-                                    buffer);
-        if ( rc != X86EMUL_OKAY )
-            break;
+        /* Have we already done this chunk? */
+        if ( offset < cache->size )
+        {
+            ASSERT((offset + chunk) <= cache->size);
+
+            if ( dir == IOREQ_READ )
+                memcpy(&buffer[offset], &cache->buffer[offset], chunk);
+            else if ( memcmp(&buffer[offset], &cache->buffer[offset], chunk) != 0 )
+                domain_crash(current->domain);
+        }
+        else
+        {
+            ASSERT(offset == cache->size);
+
+            rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                        &buffer[offset]);
+            if ( rc != X86EMUL_OKAY )
+                break;
+
+            /* Note that we have now done this chunk. */
+            memcpy(&cache->buffer[offset], &buffer[offset], chunk);
+            cache->size += chunk;
+        }
 
         /* Advance to the next chunk. */
         gpa += chunk;
-        buffer += chunk;
+        offset += chunk;
         size -= chunk;
 
         if ( size == 0 )
@@ -635,13 +605,41 @@ static int hvmemul_phys_mmio_access(
     return rc;
 }
 
+static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
+    struct hvm_vcpu_io *vio, unsigned long gla, uint8_t dir)
+{
+    unsigned int i;
+    struct hvm_mmio_cache *cache;
+
+    for ( i = 0; i < vio->mmio_cache_count; i ++ )
+    {
+        cache = &vio->mmio_cache[i];
+
+        if ( gla == cache->gla &&
+             dir == cache->dir )
+            return cache;
+    }
+
+    i = vio->mmio_cache_count++;
+    BUG_ON(i == ARRAY_SIZE(vio->mmio_cache));
+
+    cache = &vio->mmio_cache[i];
+    memset(cache, 0, sizeof (*cache));
+
+    cache->gla = gla;
+    cache->dir = dir;
+
+    return cache;
+}
+
 static int hvmemul_linear_mmio_access(
-    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
+    unsigned long gla, unsigned int size, uint8_t dir, void *buffer,
     uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
 {
     struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
     unsigned long offset = gla & ~PAGE_MASK;
-    unsigned int chunk;
+    struct hvm_mmio_cache *cache = hvmemul_find_mmio_cache(vio, gla, dir);
+    unsigned int chunk, buffer_offset = 0;
     paddr_t gpa;
     unsigned long one_rep = 1;
     int rc;
@@ -660,12 +658,12 @@ static int hvmemul_linear_mmio_access(
 
     for ( ;; )
     {
-        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
+        rc = hvmemul_phys_mmio_access(cache, gpa, chunk, dir, buffer, buffer_offset);
         if ( rc != X86EMUL_OKAY )
             break;
 
         gla += chunk;
-        buffer += chunk;
+        buffer_offset += chunk;
         size -= chunk;
 
         if ( size == 0 )
@@ -1612,7 +1610,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
         rc = X86EMUL_RETRY;
     if ( rc != X86EMUL_RETRY )
     {
-        vio->mmio_large_read_bytes = vio->mmio_large_write_bytes = 0;
+        vio->mmio_cache_count = 0;
         vio->mmio_insn_bytes = 0;
     }
     else
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index c19c039..a2ef70a 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -42,6 +42,17 @@ struct hvm_vcpu_asid {
     uint32_t asid;
 };
 
+/*
+ * We may read or write up to m256 as a number of device-model
+ * transactions.
+ */
+struct hvm_mmio_cache {
+    unsigned long size;
+    uint8_t buffer[32];
+    unsigned long gla;
+    uint8_t dir;
+};
+
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
     ioreq_t                io_req;
@@ -57,13 +68,13 @@ struct hvm_vcpu_io {
     unsigned long       mmio_gva;
     unsigned long       mmio_gpfn;
 
-    /* We may read up to m256 as a number of device-model transactions. */
-    paddr_t mmio_large_read_pa;
-    uint8_t mmio_large_read[32];
-    unsigned int mmio_large_read_bytes;
-    /* We may write up to m256 as a number of device-model transactions. */
-    unsigned int mmio_large_write_bytes;
-    paddr_t mmio_large_write_pa;
+    /*
+     * We may need to handle up to 3 distinct memory accesses per
+     * instruction.
+     */
+    struct hvm_mmio_cache mmio_cache[3];
+    unsigned int mmio_cache_count;
+
     /* For retries we shouldn't re-fetch the instruction. */
     unsigned int mmio_insn_bytes;
     unsigned char mmio_insn[16];
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-03 16:25 ` [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
@ 2015-07-03 17:17   ` Andrew Cooper
  2015-07-08 15:52   ` Jan Beulich
  1 sibling, 0 replies; 53+ messages in thread
From: Andrew Cooper @ 2015-07-03 17:17 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 03/07/15 17:25, Paul Durrant wrote:
> ...in hvmemul_read/write()
>
> Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access() functions
> to reduce code duplication.
>
> NOTE: This patch also introduces a change in 'chunking' around a page
>       boundary. Previously (for example) an 8 byte access at the last
>       byte of a page would get carried out as 8 single-byte accesses.
>       It will now be carried out as a single-byte access, followed by
>       a 4-byte access, a 2-byte access and then another single-byte
>       access.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
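
For anyone wanting to see the new splitting concretely, a small
standalone sketch (not the Xen code itself; the helper and constants are
made up for illustration) reproduces the 1-, 4-, 2-, 1-byte sequence
described in the NOTE above:

    #include <stdio.h>

    #define PAGE_SIZE 4096u

    /* Largest power of two not exceeding n (n > 0). */
    static unsigned int pow2_le(unsigned int n)
    {
        unsigned int p = 1;
        while ( p * 2 <= n )
            p *= 2;
        return p;
    }

    int main(void)
    {
        unsigned int addr = PAGE_SIZE - 1;  /* last byte of a page */
        unsigned int size = 8;              /* 8-byte operand */

        while ( size )
        {
            /* Each chunk is a power of two, no wider than a long, and
             * never crosses a page boundary. */
            unsigned int room = PAGE_SIZE - (addr % PAGE_SIZE);
            unsigned int chunk = pow2_le(size < room ? size : room);

            if ( chunk > sizeof(long) )
                chunk = sizeof(long);

            printf("%u-byte access at offset %#x\n", chunk, addr);
            addr += chunk;
            size -= chunk;
        }
        return 0;
    }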

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-03 16:25 ` [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
@ 2015-07-03 17:18   ` Andrew Cooper
  2015-07-09 10:05   ` Jan Beulich
  1 sibling, 0 replies; 53+ messages in thread
From: Andrew Cooper @ 2015-07-03 17:18 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 03/07/15 17:25, Paul Durrant wrote:
> By limiting hvmemul_do_io_addr() to reps falling within the page on which
> a reference has already been taken, we can guarantee that calls to
> hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out
> or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic (added
> by c/s 82ed8716b "fix direct PCI port I/O emulation retry and error
> handling") from the intercept code and simplify it significantly.
>
> Normally hvmemul_do_io_addr() will only reference a single page at a time.
> It will, however, take an extra page reference for I/O spanning a page
> boundary.
>
> It is still important to know, upon returning from x86_emulate(), whether
> the number of reps was reduced so the mmio_retry flag is retained for that
> purpose.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix
  2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (15 preceding siblings ...)
  2015-07-03 16:25 ` [PATCH v6 16/16] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
@ 2015-07-08 15:44 ` Jan Beulich
  2015-07-09 11:31   ` Paul Durrant
  16 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2015-07-08 15:44 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> Changelog (now per-patch)
> -------------------------

That's quite nice, but still not really helpful when not put in the
individual patches - one still needs to keep this one and the patch
open to follow things incrementally.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-03 16:25 ` [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
  2015-07-03 17:17   ` Andrew Cooper
@ 2015-07-08 15:52   ` Jan Beulich
  2015-07-08 15:57     ` Paul Durrant
  1 sibling, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2015-07-08 15:52 UTC (permalink / raw)
  To: Andrew Cooper, Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> +static int hvmemul_linear_mmio_access(
> +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
> +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
> +{
> +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> +    unsigned long offset = gla & ~PAGE_MASK;
> +    unsigned int chunk;
> +    paddr_t gpa;
> +    unsigned long one_rep = 1;
> +    int rc;
> +
> +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
> +
> +    if ( known_gpfn )
> +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
> +    else
> +    {
> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> +                                    hvmemul_ctxt);
> +        if ( rc != X86EMUL_OKAY )
> +            return rc;
> +    }
> +
> +    for ( ;; )
> +    {
> +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
> +        if ( rc != X86EMUL_OKAY )
> +            break;
> +
> +        gla += chunk;
> +        buffer += chunk;
> +        size -= chunk;
> +
> +        if ( size == 0 )
> +            break;
> +
> +        ASSERT((gla & ~PAGE_MASK) == 0);

Does this really matter for the code below?

> +        chunk = min_t(unsigned int, size, PAGE_SIZE);

Iirc Andrew had asked for this, but I still don't see why: "size" is the
width of an instruction operand, and hence won't even come close
to PAGE_SIZE.

> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> +                                    hvmemul_ctxt);

The fact that &one_rep is being passed here also makes clear that
we can't be dealing with an <operand-width> * <reps> situation.
And if we were, "size" couldn't validly be "unsigned int".

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int
  2015-07-03 16:25 ` [PATCH v6 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
@ 2015-07-08 15:57   ` Jan Beulich
  2015-07-08 15:59     ` Paul Durrant
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2015-07-08 15:57 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> Building on the previous patch, this patch restricts portio port numbers
> to uint16_t in registration/relocate calls and portio_action_t. It also
> changes portio sizes to unsigned int which then allows the io_handler
> size field to reduce to an unsigned int.

Changing size's type is fine; changing port numbers to uint16_t will -
afaict - generate worse code, and hence doesn't seem warranted.
I wouldn't mind changing from uint32_t to unsigned int though.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-08 15:52   ` Jan Beulich
@ 2015-07-08 15:57     ` Paul Durrant
  2015-07-08 16:18       ` Jan Beulich
  2015-07-08 16:43       ` Andrew Cooper
  0 siblings, 2 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-08 15:57 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: xen-devel@lists.xenproject.org, Keir (Xen.org)

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Jan Beulich
> Sent: 08 July 2015 16:53
> To: Andrew Cooper; Paul Durrant
> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [Xen-devel] [PATCH v6 02/16] x86/hvm: remove multiple open
> coded 'chunking' loops
> 
> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> > +static int hvmemul_linear_mmio_access(
> > +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
> > +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
> known_gpfn)
> > +{
> > +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> > +    unsigned long offset = gla & ~PAGE_MASK;
> > +    unsigned int chunk;
> > +    paddr_t gpa;
> > +    unsigned long one_rep = 1;
> > +    int rc;
> > +
> > +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
> > +
> > +    if ( known_gpfn )
> > +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
> > +    else
> > +    {
> > +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> > +                                    hvmemul_ctxt);
> > +        if ( rc != X86EMUL_OKAY )
> > +            return rc;
> > +    }
> > +
> > +    for ( ;; )
> > +    {
> > +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
> > +        if ( rc != X86EMUL_OKAY )
> > +            break;
> > +
> > +        gla += chunk;
> > +        buffer += chunk;
> > +        size -= chunk;
> > +
> > +        if ( size == 0 )
> > +            break;
> > +
> > +        ASSERT((gla & ~PAGE_MASK) == 0);
> 
> Does this really matter for the code below?
> 
> > +        chunk = min_t(unsigned int, size, PAGE_SIZE);
> 
> Iirc Andrew had asked for this, but I still don't see why: "size" is the
> width of an instruction operand, and hence won't even come close
> to PAGE_SIZE.
> 

I'll let Andrew answer that one. I've already ping-ponged once on this.

I guess it makes it safer if someone were to try to use this function for another purpose in future?

  Paul

> > +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> > +                                    hvmemul_ctxt);
> 
> The fact that &one_rep is being passed here also makes clear that
> we can't be dealing with an <operand-width> * <reps> situation.
> And if we were, "size" couldn't validly be "unsigned int".
> 
> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH v6 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int
  2015-07-08 15:57   ` Jan Beulich
@ 2015-07-08 15:59     ` Paul Durrant
  0 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-08 15:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Keir (Xen.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 08 July 2015 16:57
> To: Paul Durrant
> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [PATCH v6 04/16] x86/hvm: restrict port numbers to uint16_t
> and sizes to unsigned int
> 
> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> > Building on the previous patch, this patch restricts portio port numbers
> > to uint16_t in registration/relocate calls and portio_action_t. It also
> > changes portio sizes to unsigned int which then allows the io_handler
> > size field to reduce to an unsigned int.
> 
> Changing size's type is fine; changing port numbers to uint16_t will -
> afaict - generate worse code, and hence doesn't seem warranted.
> I wouldn't mind changing from uint32_t to unsigned int though.
> 

<sigh> More yak-shaving then.

  Paul

> Jan


* Re: [PATCH v6 05/16] x86/hvm: unify internal portio and mmio intercepts
  2015-07-03 16:25 ` [PATCH v6 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
@ 2015-07-08 16:11   ` Jan Beulich
  2015-07-08 16:28   ` Jan Beulich
  1 sibling, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-08 16:11 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> @@ -1465,11 +1462,12 @@ int hvm_domain_initialise(struct domain *d)
>          goto fail0;
>  
>      d->arch.hvm_domain.params = xzalloc_array(uint64_t, HVM_NR_PARAMS);
> -    d->arch.hvm_domain.io_handler = xmalloc(struct hvm_io_handler);
> +    d->arch.hvm_domain.io_handler = xzalloc_array(struct hvm_io_handler,
> +                                                  NR_IO_HANDLERS);
>      rc = -ENOMEM;
>      if ( !d->arch.hvm_domain.params || !d->arch.hvm_domain.io_handler )
>          goto fail1;
> -    d->arch.hvm_domain.io_handler->num_slot = 0;
> +    d->arch.hvm_domain.io_handler_count = 0;

I don't think this is needed - the field starts out as zero anyway.

> --- a/xen/arch/x86/hvm/intercept.c
> +++ b/xen/arch/x86/hvm/intercept.c
> @@ -32,205 +32,94 @@
>  #include <xen/event.h>
>  #include <xen/iommu.h>
>  
> -static const struct hvm_mmio_ops *const
> -hvm_mmio_handlers[HVM_MMIO_HANDLER_NR] =
> +static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
> +                              const ioreq_t *p)
>  {
> -    &hpet_mmio_ops,
> -    &vlapic_mmio_ops,
> -    &vioapic_mmio_ops,
> -    &msixtbl_mmio_ops,
> -    &iommu_mmio_ops
> -};
> +    BUG_ON(handler->type != IOREQ_TYPE_COPY);
> +
> +    return handler->mmio.ops->check(current, p->addr);
> +}
>  
> -static int hvm_mmio_access(struct vcpu *v,
> -                           ioreq_t *p,
> -                           hvm_mmio_read_t read,
> -                           hvm_mmio_write_t write)
> +static int hvm_mmio_read(const struct hvm_io_handler *handler,
> +                         uint64_t addr,
> +                         uint32_t size,
> +                         uint64_t *data)

These should really go on one line (provided this doesn't make the
line longer than 80 chars).

> @@ -324,78 +230,122 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
>      return rc;
>  }
>  
> -/*
> - * Check if the request is handled inside xen
> - * return value: 0 --not handled; 1 --handled
> - */
> -int hvm_io_intercept(ioreq_t *p, int type)
> +const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
> +{
> +    struct domain *curr_d = current->domain;
> +    const struct hvm_io_ops *ops =
> +        (p->type == IOREQ_TYPE_COPY) ?
> +        &mmio_ops :
> +        &portio_ops;

Similarly (and there were other cases earlier on) I don't see why this
need to be four lines when two suffice:

    const struct hvm_io_ops *ops = p->type == IOREQ_TYPE_COPY ?
                                   &mmio_ops : &portio_ops;

(arguably the ? is better placed on the second line, even if not fully
in line with our general operator placement).

> +bool_t hvm_mmio_internal(paddr_t gpa)
> +{
> +    ioreq_t p = {
> +        .type = IOREQ_TYPE_COPY,
> +        .addr = gpa
> +    };
> +
> +    return (hvm_find_io_handler(&p) != NULL);

Pointless parentheses.

> --- a/xen/arch/x86/hvm/stdvga.c
> +++ b/xen/arch/x86/hvm/stdvga.c
> @@ -547,12 +547,27 @@ static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
>      return 1;
>  }
>  
> -static int stdvga_intercept_mmio(ioreq_t *p)
> +int stdvga_intercept_mmio(ioreq_t *p)
>  {
>      struct domain *d = current->domain;
>      struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
> +    uint64_t start, end, count = p->count, size = p->size;
>      int buf = 0, rc;
>  
> +    if ( p->df )
> +    {
> +        start = (p->addr - (count - 1) * size);
> +        end = p->addr + size;
> +    }
> +    else
> +    {
> +        start = p->addr;
> +        end = p->addr + count * size;
> +    }

As I stumble here not for the first time, I now finally need to make
the remark: I find it quite odd for p->count and p->size to be latched
into local (and even 64-bit) variables, yet the heavier used p->addr
not being treated this same way.

Also I suppose the if and else branches should be swapped (with
the condition inverted), or unlikely() be added to the condition, as
I think the vast majority of operations will be with EFLAGS.DF clear.
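I.e. something along these lines (untested sketch only, reusing the
names from the hunk above):

    uint64_t addr = p->addr, count = p->count, size = p->size;
    uint64_t start, end;

    if ( likely(!p->df) )
    {
        start = addr;
        end = addr + count * size;
    }
    else
    {
        start = addr - (count - 1) * size;
        end = addr + size;
    }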

> +    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
> +        return X86EMUL_UNHANDLEABLE;
> +
>      if ( p->size > 8 )

Nor do I see why this wouldn't then use the local variable.

Jan


* Re: [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-03 16:25 ` [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
@ 2015-07-08 16:17   ` Jan Beulich
  2015-07-09  9:40     ` Paul Durrant
  2015-07-09  8:53   ` Jan Beulich
  2015-07-09  9:17   ` Jan Beulich
  2 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2015-07-08 16:17 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> -int stdvga_intercept_mmio(ioreq_t *p)
> +static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
> +                                const ioreq_t *p)
>  {
> -    struct domain *d = current->domain;
> -    struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
> -    uint64_t start, end, count = p->count, size = p->size;
> -    int buf = 0, rc;
> -
> -    if ( p->df )
> -    {
> -        start = (p->addr - (count - 1) * size);
> -        end = p->addr + size;
> -    }
> -    else
> -    {
> -        start = p->addr;
> -        end = p->addr + count * size;
> -    }
> -
> -    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
> -        return X86EMUL_UNHANDLEABLE;
> -
> -    if ( p->size > 8 )
> -    {
> -        gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
> -        return X86EMUL_UNHANDLEABLE;
> -    }
> +    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
>  
>      spin_lock(&s->lock);
>  
> -    if ( s->stdvga && s->cache )
> +    if ( !s->stdvga ||
> +         (p->addr < VGA_MEM_BASE) ||
> +         ((p->addr + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
> +        goto reject;

I came here just to check whether you make use of the earlier
added hvm_mmio_{first,last}_byte(), but found that you're ignoring
p->df now altogether. That can't be right?
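I.e. I'd have expected something along the lines of (sketch only,
assuming the helpers take the ioreq_t as introduced earlier in the
series):

    if ( !s->stdvga ||
         (hvm_mmio_first_byte(p) < VGA_MEM_BASE) ||
         (hvm_mmio_last_byte(p) >= VGA_MEM_BASE + VGA_MEM_SIZE) )
        goto reject;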

Jan


* Re: [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-08 15:57     ` Paul Durrant
@ 2015-07-08 16:18       ` Jan Beulich
  2015-07-08 16:43       ` Andrew Cooper
  1 sibling, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-08 16:18 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, Keir(Xen.org), xen-devel@lists.xenproject.org

>>> On 08.07.15 at 17:57, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>> bounces@lists.xen.org] On Behalf Of Jan Beulich
>> Sent: 08 July 2015 16:53
>> To: Andrew Cooper; Paul Durrant
>> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
>> Subject: Re: [Xen-devel] [PATCH v6 02/16] x86/hvm: remove multiple open
>> coded 'chunking' loops
>> 
>> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
>> > +static int hvmemul_linear_mmio_access(
>> > +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
>> > +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
>> known_gpfn)
>> > +{
>> > +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
>> > +    unsigned long offset = gla & ~PAGE_MASK;
>> > +    unsigned int chunk;
>> > +    paddr_t gpa;
>> > +    unsigned long one_rep = 1;
>> > +    int rc;
>> > +
>> > +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
>> > +
>> > +    if ( known_gpfn )
>> > +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
>> > +    else
>> > +    {
>> > +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
>> > +                                    hvmemul_ctxt);
>> > +        if ( rc != X86EMUL_OKAY )
>> > +            return rc;
>> > +    }
>> > +
>> > +    for ( ;; )
>> > +    {
>> > +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
>> > +        if ( rc != X86EMUL_OKAY )
>> > +            break;
>> > +
>> > +        gla += chunk;
>> > +        buffer += chunk;
>> > +        size -= chunk;
>> > +
>> > +        if ( size == 0 )
>> > +            break;
>> > +
>> > +        ASSERT((gla & ~PAGE_MASK) == 0);
>> 
>> Does this really matter for the code below?
>> 
>> > +        chunk = min_t(unsigned int, size, PAGE_SIZE);
>> 
>> Iirc Andrew had asked for this, but I still don't see why: "size" is the
>> width of an instruction operand, and hence won't even come close
>> to PAGE_SIZE.
>> 
> 
> I'll let Andrew answer that one. I've already ping-ponged once on this.
> 
> I guess it makes it safer if someone were to try to use this function for 
> another purpose in future?

I can't see such future use, and hence can't see why we should make
this slower than it needs to be.

Jan


* Re: [PATCH v6 05/16] x86/hvm: unify internal portio and mmio intercepts
  2015-07-03 16:25 ` [PATCH v6 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
  2015-07-08 16:11   ` Jan Beulich
@ 2015-07-08 16:28   ` Jan Beulich
  1 sibling, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-08 16:28 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> -void relocate_io_handler(
> -    struct domain *d, unsigned long old_addr, unsigned long new_addr,
> -    unsigned int size, int type)
> +void relocate_portio_handler(struct domain *d, uint16_t old_port,
> +                             uint16_t new_port, unsigned int size)
>  {
> -    struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
> -    int i;
> -
> -    for ( i = 0; i < handler->num_slot; i++ )
> -        if ( (handler->hdl_list[i].addr == old_addr) &&
> -             (handler->hdl_list[i].size == size) &&
> -             (handler->hdl_list[i].type == type) )
> -            handler->hdl_list[i].addr = new_addr;
> +    unsigned int i;
> +
> +    for ( i = 0; i < d->arch.hvm_domain.io_handler_count; i++ )
> +    {
> +        struct hvm_io_handler *handler =
> +            &d->arch.hvm_domain.io_handler[i];
> +
> +        if ( handler->type != IOREQ_TYPE_PIO )
> +            continue;
> +
> +        if ( (handler->portio.start == old_port) &&
> +             (handler->portio.end = old_port + size) )
> +        {
> +            handler->portio.start = new_port;
> +            handler->portio.end = new_port + size;
> +            break;
> +        }
> +    }
> +}

I'm afraid you're introducing a race with the consumer of this data
here: Previously the update consisted of a single write, i.e. the
consumer would see either the old or the new data. After the
change it may see partially updated state.

Jan


* Re: [PATCH v6 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept
  2015-07-03 16:25 ` [PATCH v6 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
@ 2015-07-08 16:29   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-08 16:29 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/intercept.c
> +++ b/xen/arch/x86/hvm/intercept.c
> @@ -125,10 +125,7 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
>  {
>      struct vcpu *curr = current;
>      struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
> -    const struct hvm_io_ops *ops =
> -        (p->type == IOREQ_TYPE_COPY) ?
> -        &mmio_ops :
> -        &portio_ops;
> +    const struct hvm_io_ops *ops = handler->ops;

Ah, I see these odd many-line constructs go away again here. In that case
you needn't bother fiddling with them in the earlier patch(es). But those
that survive till the end of the series I'd prefer to see shortened.

Jan


* Re: [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-08 15:57     ` Paul Durrant
  2015-07-08 16:18       ` Jan Beulich
@ 2015-07-08 16:43       ` Andrew Cooper
  2015-07-09  6:53         ` Jan Beulich
  1 sibling, 1 reply; 53+ messages in thread
From: Andrew Cooper @ 2015-07-08 16:43 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Keir (Xen.org)

On 08/07/2015 16:57, Paul Durrant wrote:
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>> bounces@lists.xen.org] On Behalf Of Jan Beulich
>> Sent: 08 July 2015 16:53
>> To: Andrew Cooper; Paul Durrant
>> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
>> Subject: Re: [Xen-devel] [PATCH v6 02/16] x86/hvm: remove multiple open
>> coded 'chunking' loops
>>
>>>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
>>> +static int hvmemul_linear_mmio_access(
>>> +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
>>> +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
>> known_gpfn)
>>> +{
>>> +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
>>> +    unsigned long offset = gla & ~PAGE_MASK;
>>> +    unsigned int chunk;
>>> +    paddr_t gpa;
>>> +    unsigned long one_rep = 1;
>>> +    int rc;
>>> +
>>> +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
>>> +
>>> +    if ( known_gpfn )
>>> +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
>>> +    else
>>> +    {
>>> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
>>> +                                    hvmemul_ctxt);
>>> +        if ( rc != X86EMUL_OKAY )
>>> +            return rc;
>>> +    }
>>> +
>>> +    for ( ;; )
>>> +    {
>>> +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
>>> +        if ( rc != X86EMUL_OKAY )
>>> +            break;
>>> +
>>> +        gla += chunk;
>>> +        buffer += chunk;
>>> +        size -= chunk;
>>> +
>>> +        if ( size == 0 )
>>> +            break;
>>> +
>>> +        ASSERT((gla & ~PAGE_MASK) == 0);
>> Does this really matter for the code below?
>>
>>> +        chunk = min_t(unsigned int, size, PAGE_SIZE);
>> Iirc Andrew had asked for this, but I still don't see why: "size" is the
>> width of an instruction operand, and hence won't even come close
>> to PAGE_SIZE.

The original version of the code asserted that size was less than
PAGE_SIZE around here.  This is not true in the general case, given a
for loop like this, and it can in principle be hit if we ever got into a
position of emulating an xsave instruction to an MMIO region.

This specific example is not as far fetched as it seems.  The VM
introspection people are looking to emulate more and more instructions,
while the GVT-g people are working on the mmio_write_dm side of things which
causes real RAM to be treated as MMIO from Xen's point of view.

If we had some blanket sanity checks for size at the top of the
emulation calltree it would be less of an issue, but we don't and I have
a nagging feeling that assumptions like this are going to bite us in an
XSA-kind-of-way.

~Andrew


* Re: [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-08 16:43       ` Andrew Cooper
@ 2015-07-09  6:53         ` Jan Beulich
  2015-07-09  8:15           ` Paul Durrant
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2015-07-09  6:53 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel@lists.xenproject.org, Paul Durrant, Keir(Xen.org)

>>> On 08.07.15 at 18:43, <andrew.cooper3@citrix.com> wrote:
> On 08/07/2015 16:57, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>>> bounces@lists.xen.org] On Behalf Of Jan Beulich
>>> Sent: 08 July 2015 16:53
>>> To: Andrew Cooper; Paul Durrant
>>> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
>>> Subject: Re: [Xen-devel] [PATCH v6 02/16] x86/hvm: remove multiple open
>>> coded 'chunking' loops
>>>
>>>>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
>>>> +static int hvmemul_linear_mmio_access(
>>>> +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
>>>> +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
>>> known_gpfn)
>>>> +{
>>>> +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
>>>> +    unsigned long offset = gla & ~PAGE_MASK;
>>>> +    unsigned int chunk;
>>>> +    paddr_t gpa;
>>>> +    unsigned long one_rep = 1;
>>>> +    int rc;
>>>> +
>>>> +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
>>>> +
>>>> +    if ( known_gpfn )
>>>> +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
>>>> +    else
>>>> +    {
>>>> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
>>>> +                                    hvmemul_ctxt);
>>>> +        if ( rc != X86EMUL_OKAY )
>>>> +            return rc;
>>>> +    }
>>>> +
>>>> +    for ( ;; )
>>>> +    {
>>>> +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
>>>> +        if ( rc != X86EMUL_OKAY )
>>>> +            break;
>>>> +
>>>> +        gla += chunk;
>>>> +        buffer += chunk;
>>>> +        size -= chunk;
>>>> +
>>>> +        if ( size == 0 )
>>>> +            break;
>>>> +
>>>> +        ASSERT((gla & ~PAGE_MASK) == 0);
>>> Does this really matter for the code below?
>>>
>>>> +        chunk = min_t(unsigned int, size, PAGE_SIZE);
>>> Iirc Andrew had asked for this, but I still don't see why: "size" is the
>>> width of an instruction operand, and hence won't even come close
>>> to PAGE_SIZE.
> 
> The original version of the code asserted that size was less than
> PAGE_SIZE around here.  This is not true in the general case, given a
> for loop like this, and it can in principle be hit if we ever got into a
> position of emulating an xsave instruction to an MMIO region.

That's a hypothetical future extension to XSAVE, isn't it? I.e. only
when AVX-1024 eventually arrives ...

> This specific example is not as far fetched as it seems.  The VM
> introspection people are looking to emulate more and more instructions,
> while the GVT-g people are working on the mmio_write_dm side of things which
> causes real RAM to be treated as MMIO from Xen's point of view.

... and when we really think we need to support XSAVEing to MMIO
or MMIO-like memory (which I doubt we will).

> If we had some blanket sanity checks for size at the top of the
> emulation calltree it would be less of an issue, but we don't and I have
> a nagging feeling that assumptions like this are going to bite us in an
> XSA-kind-of-way.

Hence the ASSERT() against PAGE_SIZE that originally was there.

Jan


* Re: [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-09  6:53         ` Jan Beulich
@ 2015-07-09  8:15           ` Paul Durrant
  2015-07-09  9:19             ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-09  8:15 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: xen-devel@lists.xenproject.org, Keir (Xen.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 07:54
> To: Andrew Cooper
> Cc: Paul Durrant; xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [Xen-devel] [PATCH v6 02/16] x86/hvm: remove multiple open
> coded 'chunking' loops
> 
> >>> On 08.07.15 at 18:43, <andrew.cooper3@citrix.com> wrote:
> > On 08/07/2015 16:57, Paul Durrant wrote:
> >>> -----Original Message-----
> >>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> >>> bounces@lists.xen.org] On Behalf Of Jan Beulich
> >>> Sent: 08 July 2015 16:53
> >>> To: Andrew Cooper; Paul Durrant
> >>> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
> >>> Subject: Re: [Xen-devel] [PATCH v6 02/16] x86/hvm: remove multiple
> open
> >>> coded 'chunking' loops
> >>>
> >>>>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> >>>> +static int hvmemul_linear_mmio_access(
> >>>> +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
> >>>> +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
> >>> known_gpfn)
> >>>> +{
> >>>> +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> >>>> +    unsigned long offset = gla & ~PAGE_MASK;
> >>>> +    unsigned int chunk;
> >>>> +    paddr_t gpa;
> >>>> +    unsigned long one_rep = 1;
> >>>> +    int rc;
> >>>> +
> >>>> +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
> >>>> +
> >>>> +    if ( known_gpfn )
> >>>> +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
> >>>> +    else
> >>>> +    {
> >>>> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> >>>> +                                    hvmemul_ctxt);
> >>>> +        if ( rc != X86EMUL_OKAY )
> >>>> +            return rc;
> >>>> +    }
> >>>> +
> >>>> +    for ( ;; )
> >>>> +    {
> >>>> +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
> >>>> +        if ( rc != X86EMUL_OKAY )
> >>>> +            break;
> >>>> +
> >>>> +        gla += chunk;
> >>>> +        buffer += chunk;
> >>>> +        size -= chunk;
> >>>> +
> >>>> +        if ( size == 0 )
> >>>> +            break;
> >>>> +
> >>>> +        ASSERT((gla & ~PAGE_MASK) == 0);
> >>> Does this really matter for the code below?
> >>>
> >>>> +        chunk = min_t(unsigned int, size, PAGE_SIZE);
> >>> Iirc Andrew had asked for this, but I still don't see why: "size" is the
> >>> width of an instruction operand, and hence won't even come close
> >>> to PAGE_SIZE.
> >
> > The original version of the code asserted that size was less than
> > PAGE_SIZE around here.  This is not true in the general case, given a
> > for loop like this and can in principle be hit if we ever got into a
> > position of emulating an xsave instruction to an MMIO region.
> 
> That's a hypothetical future extension to XSAVE, isn't it? I.e. only
> when AVX-1024 eventually arrives ...
> 
> > This specific example is not as far fetched as it seems.  The VM
> > instrospection people are looking to emulate more and more instructions,
> > while the GVT-g are working on the mmio_write_dm side of things which
> > causes real RAM to be treated as MMIO from Xens point of view.
> 
> ... and when we really think we need to support XSAVEing to MMIO
> or MMIO-like memory (which I doubt we will).
> 
> > If we had some blanket sanity checks for size at the top of the
> > emulation calltree it would be less of an issue, but we don't and I have
> > a nagging feeing that assumptions like this are going to bite us in an
> > XSA-kind-of-way.
> 
> Hence the ASSERT() against PAGE_SIZE that originally was there.
> 

Is there any harm in leaving the function as-is so that it can cope with size > PAGE_SIZE in future?

  Paul

> Jan


* Re: [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-03 16:25 ` [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
  2015-07-08 16:17   ` Jan Beulich
@ 2015-07-09  8:53   ` Jan Beulich
  2015-07-09  9:00     ` Paul Durrant
  2015-07-09  9:17   ` Jan Beulich
  2 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2015-07-09  8:53 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -266,6 +266,15 @@ static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
>          return X86EMUL_RETRY;
>      }
>  
> +    /* This code should not be reached if the gmfn is not RAM */
> +    if ( p2m_is_mmio(p2mt) )
> +    {
> +        domain_crash(curr_d);
> +
> +        put_page(*page);
> +        return X86EMUL_UNHANDLEABLE;
> +    }
> +
>      return X86EMUL_OKAY;
>  }

Does this change really belong here, not in an earlier patch?

> @@ -424,8 +426,22 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
>      }
>  }
>  
> -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
> +static int stdvga_mem_write(const struct hvm_io_handler *handler,
> +                            uint64_t addr, uint32_t size,
> +                            uint64_t data)
>  {
> +    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
> +    ioreq_t p = { .type = IOREQ_TYPE_COPY,
> +                  .addr = addr,
> +                  .size = size,
> +                  .count = 1,
> +                  .dir = IOREQ_WRITE,
> +                  .data = data,
> +    };

Indentation.

> +    if ( !s->cache )
> +        goto done;

Why is this not being dealt with by stdvga_mem_accept()?

> -    if ( s->stdvga && s->cache )
> +    if ( !s->stdvga ||
> +         (p->addr < VGA_MEM_BASE) ||
> +         ((p->addr + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
> +        goto reject;
> +
> +    if ( p->dir == IOREQ_WRITE && p->count > 1 )
>      {
> -        switch ( p->type )
> +        /*
> +         * We cannot return X86EMUL_UNHANDLEABLE on anything other then the
> +         * first cycle of an I/O. So, since we cannot guarantee to always be
> +         * able to buffer writes, we have to reject any multi-cycle I/O and,
> +         * since we are rejecting an I/O, we must invalidate the cache.
> +         * Single-cycle write transactions are accepted even if the cache is
> +         * not active since we can assert, when in stdvga mode, that writes
> +         * to VRAM have no side effect and thus we can try to buffer them.
> +         */
> +        if ( s->cache )

In this comment, does "buffer" refer to being able to successfully
call hvm_buffered_io_send()? To me it reads more like referring to
the internal hypervisor buffering done here, which I can't see why
we couldn't always guarantee to be able to do.

>          {
> -        case IOREQ_TYPE_COPY:
> -            buf = mmio_move(s, p);
> -            if ( !buf )
> -                s->cache = 0;
> -            break;
> -        default:
> -            gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d "
> -                     "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
> -                     "isptr:%d dir:%d df:%d\n",
> -                     p->type, (int)p->addr, (int)p->data, (int)p->size,
> -                     (int)p->count, p->state,
> -                     p->data_is_ptr, p->dir, p->df);
> +            gdprintk(XENLOG_INFO, "leaving caching mode\n");
>              s->cache = 0;
>          }
> +
> +        goto reject;
>      }
> -    else
> -    {
> -        buf = (p->dir == IOREQ_WRITE);
> -    }
> +    else if ( p->dir == IOREQ_READ && !s->cache )
> +        goto reject;
>  
> -    rc = (buf && hvm_buffered_io_send(p));
> +    return 1;

Perhaps worth a comment that the lock is intentionally not being
dropped here (which seems pretty fragile, but I don't see an
alternative).
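E.g. something as simple as (wording just a suggestion, and assuming it
is indeed the corresponding completion handler which drops the lock):

    /*
     * Note: s->lock is deliberately not dropped on the accept path; it
     * is released again once handling of this I/O has completed.
     */
    return 1;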

Jan


* Re: [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-09  8:53   ` Jan Beulich
@ 2015-07-09  9:00     ` Paul Durrant
  2015-07-09  9:21       ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-09  9:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Keir (Xen.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 09:53
> To: Paul Durrant
> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with
> standard mmio intercept
> 
> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -266,6 +266,15 @@ static int hvmemul_acquire_page(unsigned long
> gmfn, struct page_info **page)
> >          return X86EMUL_RETRY;
> >      }
> >
> > +    /* This code should not be reached if the gmfn is not RAM */
> > +    if ( p2m_is_mmio(p2mt) )
> > +    {
> > +        domain_crash(curr_d);
> > +
> > +        put_page(*page);
> > +        return X86EMUL_UNHANDLEABLE;
> > +    }
> > +
> >      return X86EMUL_OKAY;
> >  }
> 
> Does this change really belong here, not in an earlier patch?
> 

It could be, but it's always been in this patch because this is the one where I apparently remove code that handled MMIO <-> MMIO copying, so I think it's most illustrative to leave it here.

> > @@ -424,8 +426,22 @@ static void stdvga_mem_writeb(uint64_t addr,
> uint32_t val)
> >      }
> >  }
> >
> > -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
> > +static int stdvga_mem_write(const struct hvm_io_handler *handler,
> > +                            uint64_t addr, uint32_t size,
> > +                            uint64_t data)
> >  {
> > +    struct hvm_hw_stdvga *s = &current->domain-
> >arch.hvm_domain.stdvga;
> > +    ioreq_t p = { .type = IOREQ_TYPE_COPY,
> > +                  .addr = addr,
> > +                  .size = size,
> > +                  .count = 1,
> > +                  .dir = IOREQ_WRITE,
> > +                  .data = data,
> > +    };
> 
> Indentation.
> 
> > +    if ( !s->cache )
> > +        goto done;
> 
> Why is this not being dealt with by stdvga_mem_accept()?
> 

Because, as the big lock comment below states:

"Single-cycle write transactions are accepted even if the cache is
not active since we can assert, when in stdvga mode, that writes
to VRAM have no side effect and thus we can try to buffer them"

The code prior to this patch did this, as you pointed out in an earlier review where I had not preserved this behaviour.

> > -    if ( s->stdvga && s->cache )
> > +    if ( !s->stdvga ||
> > +         (p->addr < VGA_MEM_BASE) ||
> > +         ((p->addr + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
> > +        goto reject;
> > +
> > +    if ( p->dir == IOREQ_WRITE && p->count > 1 )
> >      {
> > -        switch ( p->type )
> > +        /*
> > +         * We cannot return X86EMUL_UNHANDLEABLE on anything other
> then the
> > +         * first cycle of an I/O. So, since we cannot guarantee to always be
> > +         * able to buffer writes, we have to reject any multi-cycle I/O and,
> > +         * since we are rejecting an I/O, we must invalidate the cache.
> > +         * Single-cycle write transactions are accepted even if the cache is
> > +         * not active since we can assert, when in stdvga mode, that writes
> > +         * to VRAM have no side effect and thus we can try to buffer them.
> > +         */
> > +        if ( s->cache )
> 
> In this comment, does "buffer" refer to being able to successfully
> call hvm_buffered_io_send()?

Yes.

> To me it reads more like referring to
> the internal hypervisor buffering done here, which I can't see why
> we couldn't always guarantee to be able to do.
> 
> >          {
> > -        case IOREQ_TYPE_COPY:
> > -            buf = mmio_move(s, p);
> > -            if ( !buf )
> > -                s->cache = 0;
> > -            break;
> > -        default:
> > -            gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d
> "
> > -                     "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
> > -                     "isptr:%d dir:%d df:%d\n",
> > -                     p->type, (int)p->addr, (int)p->data, (int)p->size,
> > -                     (int)p->count, p->state,
> > -                     p->data_is_ptr, p->dir, p->df);
> > +            gdprintk(XENLOG_INFO, "leaving caching mode\n");
> >              s->cache = 0;
> >          }
> > +
> > +        goto reject;
> >      }
> > -    else
> > -    {
> > -        buf = (p->dir == IOREQ_WRITE);
> > -    }
> > +    else if ( p->dir == IOREQ_READ && !s->cache )
> > +        goto reject;
> >
> > -    rc = (buf && hvm_buffered_io_send(p));
> > +    return 1;
> 
> Perhaps worth a comment that the lock is intentionally not being
> dropped here (which seems pretty fragile, but I don't see an
> alternative).

Ok.

  Paul

> 
> Jan


* Re: [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-03 16:25 ` [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
  2015-07-08 16:17   ` Jan Beulich
  2015-07-09  8:53   ` Jan Beulich
@ 2015-07-09  9:17   ` Jan Beulich
  2 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-09  9:17 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> It's clear from the following check in hvmemul_rep_movs:
> 
>     if ( sp2mt == p2m_mmio_direct || dp2mt == p2m_mmio_direct ||
>          (sp2mt == p2m_mmio_dm && dp2mt == p2m_mmio_dm) )
>         return X86EMUL_UNHANDLEABLE;
> 
> that mmio <-> mmio copy is not handled. This means the code in the
> stdvga mmio intercept that explicitly handles mmio <-> mmio copy when
> hvm_copy_to/from_guest_phys() fails is never going to be executed.

And in the end we have to realize that this is a regression introduced
by (afaict) 3bb4f99d8c ("x86 hvm: Replace old MMIO emulator with
x86_emulate()-based harness"). Whether there are any meaningful
use cases (i.e. whether addressing this would be needed/wanted)
isn't clear though - the only way I can trigger this (without creating a
DOS VM) is via leaving grub's gfx mode.

Jan


* Re: [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-09  8:15           ` Paul Durrant
@ 2015-07-09  9:19             ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-09  9:19 UTC (permalink / raw)
  To: Andrew Cooper, Paul Durrant; +Cc: xen-devel@lists.xenproject.org, Keir(Xen.org)

>>> On 09.07.15 at 10:15, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 09 July 2015 07:54
>> To: Andrew Cooper
>> Cc: Paul Durrant; xen-devel@lists.xenproject.org; Keir (Xen.org)
>> Subject: Re: [Xen-devel] [PATCH v6 02/16] x86/hvm: remove multiple open
>> coded 'chunking' loops
>> 
>> >>> On 08.07.15 at 18:43, <andrew.cooper3@citrix.com> wrote:
>> > On 08/07/2015 16:57, Paul Durrant wrote:
>> >>> -----Original Message-----
>> >>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>> >>> bounces@lists.xen.org] On Behalf Of Jan Beulich
>> >>> Sent: 08 July 2015 16:53
>> >>> To: Andrew Cooper; Paul Durrant
>> >>> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
>> >>> Subject: Re: [Xen-devel] [PATCH v6 02/16] x86/hvm: remove multiple
>> open
>> >>> coded 'chunking' loops
>> >>>
>> >>>>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
>> >>>> +static int hvmemul_linear_mmio_access(
>> >>>> +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
>> >>>> +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
>> >>> known_gpfn)
>> >>>> +{
>> >>>> +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
>> >>>> +    unsigned long offset = gla & ~PAGE_MASK;
>> >>>> +    unsigned int chunk;
>> >>>> +    paddr_t gpa;
>> >>>> +    unsigned long one_rep = 1;
>> >>>> +    int rc;
>> >>>> +
>> >>>> +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
>> >>>> +
>> >>>> +    if ( known_gpfn )
>> >>>> +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
>> >>>> +    else
>> >>>> +    {
>> >>>> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
>> >>>> +                                    hvmemul_ctxt);
>> >>>> +        if ( rc != X86EMUL_OKAY )
>> >>>> +            return rc;
>> >>>> +    }
>> >>>> +
>> >>>> +    for ( ;; )
>> >>>> +    {
>> >>>> +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
>> >>>> +        if ( rc != X86EMUL_OKAY )
>> >>>> +            break;
>> >>>> +
>> >>>> +        gla += chunk;
>> >>>> +        buffer += chunk;
>> >>>> +        size -= chunk;
>> >>>> +
>> >>>> +        if ( size == 0 )
>> >>>> +            break;
>> >>>> +
>> >>>> +        ASSERT((gla & ~PAGE_MASK) == 0);
>> >>> Does this really matter for the code below?
>> >>>
>> >>>> +        chunk = min_t(unsigned int, size, PAGE_SIZE);
>> >>> Iirc Andrew had asked for this, but I still don't see why: "size" is the
>> >>> width of an instruction operand, and hence won't even come close
>> >>> to PAGE_SIZE.
>> >
>> > The original version of the code asserted that size was less than
>> > PAGE_SIZE around here.  This is not true in the general case, given a
>> > for loop like this, and it can in principle be hit if we ever got into a
>> > position of emulating an xsave instruction to an MMIO region.
>> 
>> That's a hypothetical future extension to XSAVE, isn't it? I.e. only
>> when AVX-1024 eventually arrives ...
>> 
>> > This specific example is not as far fetched as it seems.  The VM
>> > introspection people are looking to emulate more and more instructions,
>> > while the GVT-g people are working on the mmio_write_dm side of things which
>> > causes real RAM to be treated as MMIO from Xen's point of view.
>> 
>> ... and when we really think we need to support XSAVEing to MMIO
>> or MMIO-like memory (which I doubt we will).
>> 
>> > If we had some blanket sanity checks for size at the top of the
>> > emulation calltree it would be less of an issue, but we don't and I have
>> > a nagging feeling that assumptions like this are going to bite us in an
>> > XSA-kind-of-way.
>> 
>> Hence the ASSERT() against PAGE_SIZE that originally was there.
>> 
> 
> Is there any harm in leaving the function as-is so that it can cope with size 
> > PAGE_SIZE in future?

Well, not a big problem, but I generally prefer any unneeded
conditionals to be removed. Considering that you and Andrew
agree with one another, I guess I'll give in.

Jan


* Re: [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-09  9:00     ` Paul Durrant
@ 2015-07-09  9:21       ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-09  9:21 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel@lists.xenproject.org, Keir(Xen.org)

>>> On 09.07.15 at 11:00, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 09 July 2015 09:53
>> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
>> > --- a/xen/arch/x86/hvm/emulate.c
>> > +++ b/xen/arch/x86/hvm/emulate.c
>> > @@ -266,6 +266,15 @@ static int hvmemul_acquire_page(unsigned long
>> gmfn, struct page_info **page)
>> >          return X86EMUL_RETRY;
>> >      }
>> >
>> > +    /* This code should not be reached if the gmfn is not RAM */
>> > +    if ( p2m_is_mmio(p2mt) )
>> > +    {
>> > +        domain_crash(curr_d);
>> > +
>> > +        put_page(*page);
>> > +        return X86EMUL_UNHANDLEABLE;
>> > +    }
>> > +
>> >      return X86EMUL_OKAY;
>> >  }
>> 
>> Does this change really belong here, not in an earlier patch?
>> 
> 
> It could be, but it's always been in this patch because this is the one 
> where I apparently remove code that handled MMIO <-> MMIO copying so I think 
> it's most illustrative to leave it here.

Okay then.

>> > +    if ( !s->cache )
>> > +        goto done;
>> 
>> Why is this not being dealt with by stdvga_mem_accept()?
>> 
> 
> Because, as the big lock comment below states:
> 
> "Single-cycle write transactions are accepted even if the cache is
> not active since we can assert, when in stdvga mode, that writes
> to VRAM have no side effect and thus we can try to buffer them"

Ah, right, thanks for connecting the two together for me.

Jan


* Re: [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-08 16:17   ` Jan Beulich
@ 2015-07-09  9:40     ` Paul Durrant
  0 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-09  9:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Keir (Xen.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 08 July 2015 17:17
> To: Paul Durrant
> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with
> standard mmio intercept
> 
> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> > -int stdvga_intercept_mmio(ioreq_t *p)
> > +static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
> > +                                const ioreq_t *p)
> >  {
> > -    struct domain *d = current->domain;
> > -    struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
> > -    uint64_t start, end, count = p->count, size = p->size;
> > -    int buf = 0, rc;
> > -
> > -    if ( p->df )
> > -    {
> > -        start = (p->addr - (count - 1) * size);
> > -        end = p->addr + size;
> > -    }
> > -    else
> > -    {
> > -        start = p->addr;
> > -        end = p->addr + count * size;
> > -    }
> > -
> > -    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE +
> VGA_MEM_SIZE)) )
> > -        return X86EMUL_UNHANDLEABLE;
> > -
> > -    if ( p->size > 8 )
> > -    {
> > -        gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
> > -        return X86EMUL_UNHANDLEABLE;
> > -    }
> > +    struct hvm_hw_stdvga *s = &current->domain-
> >arch.hvm_domain.stdvga;
> >
> >      spin_lock(&s->lock);
> >
> > -    if ( s->stdvga && s->cache )
> > +    if ( !s->stdvga ||
> > +         (p->addr < VGA_MEM_BASE) ||
> > +         ((p->addr + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
> > +        goto reject;
> 
> I came here just to check whether you make use of the earlier
> added hvm_mmio_{first,last}_byte(), but found that you're ignoring
> p->df now altogether. That can't be right?
> 

Yes, that's a mistake. I was forgetting that we still handle reads with count > 1.

  Paul

> Jan


* Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-03 16:25 ` [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
  2015-07-03 17:18   ` Andrew Cooper
@ 2015-07-09 10:05   ` Jan Beulich
  2015-07-09 11:11     ` Paul Durrant
  1 sibling, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2015-07-09 10:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, Keir Fraser, xen-devel

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> @@ -287,17 +271,56 @@ static int hvmemul_do_io_addr(
>      bool_t is_mmio, paddr_t addr, unsigned long *reps,
>      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
>  {
> -    struct page_info *ram_page;
> +    struct vcpu *v = current;
> +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
> +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
> +    struct page_info *ram_page[2];
> +    int nr_pages = 0;

unsigned int

> +    unsigned long count;
>      int rc;
>  
> -    rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page);
> +    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
>      if ( rc != X86EMUL_OKAY )
> -        return rc;
> +        goto out;
>  
> -    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
> +    nr_pages++;
> +
> +    /* Detemine how many reps will fit within this page */

"Determine" and missing stop.

> +    count = min_t(unsigned long,
> +                  *reps,
> +                  df ?
> +                  (page_off + size - 1) / size :

Looks like this won't be correct for the page crossing case (where
the code below really expects to see zero). I think you simply need
to mask with (PAGE_SIZE - 1) again here before dividing.
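I.e. (untested) the df case would become

    ((page_off + size - 1) & (PAGE_SIZE - 1)) / size

which then yields zero when the first element itself crosses into the
next page, as the code below expects.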

> @@ -468,6 +469,9 @@ void hvm_do_resume(struct vcpu *v)
>          }
>      }
>  
> +    if ( vio->mmio_retry )
> +        (void)handle_mmio();

Pointless cast.

Jan


* Re: [PATCH v6 11/16] x86/hvm: split I/O completion handling from state model
  2015-07-03 16:25 ` [PATCH v6 11/16] x86/hvm: split I/O completion handling from state model Paul Durrant
@ 2015-07-09 10:09   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-09 10:09 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> @@ -46,9 +51,10 @@ struct hvm_vcpu_asid {
>  
>  struct hvm_vcpu_io {
>      /* I/O request in flight to device model. */
> -    enum hvm_io_state   io_state;
> -    unsigned long       io_data;
> -    int                 io_size;
> +    enum hvm_io_state      io_state;
> +    unsigned long          io_data;
> +    unsigned int           io_size;
> +    enum hvm_io_completion io_completion;

The subsequent member being a 32-bit one, the ordering should be
changed to not leave two 32-bit holes (e.g. the two enums first).
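E.g. (sketch only, assuming the enums are 32 bits wide here):

    enum hvm_io_state      io_state;
    enum hvm_io_completion io_completion;
    unsigned long          io_data;
    unsigned int           io_size;

would let the following 32-bit member pack against io_size without any
padding.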

Jan


* Re: [PATCH v6 12/16] x86/hvm: remove HVMIO_dispatched I/O state
  2015-07-03 16:25 ` [PATCH v6 12/16] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
@ 2015-07-09 10:13   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-09 10:13 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> @@ -55,6 +54,8 @@ struct hvm_vcpu_io {
>      unsigned long          io_data;
>      unsigned int           io_size;
>      enum hvm_io_completion io_completion;
> +    uint8_t                io_dir;
> +    uint8_t                io_data_is_addr;

Oh, you're adding more fields here. Maybe not worth changing the
earlier patch then (albeit it would look cleaner).

Jan


* Re: [PATCH v6 16/16] x86/hvm: track large memory mapped accesses by buffer offset
  2015-07-03 16:25 ` [PATCH v6 16/16] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
@ 2015-07-09 10:33   ` Jan Beulich
  2015-07-09 10:36     ` Paul Durrant
  2015-07-09 10:34   ` Jan Beulich
  1 sibling, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2015-07-09 10:33 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> @@ -635,13 +605,41 @@ static int hvmemul_phys_mmio_access(
>      return rc;
>  }
>  
> +static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
> +    struct hvm_vcpu_io *vio, unsigned long gla, uint8_t dir)
> +{
> +    unsigned int i;
> +    struct hvm_mmio_cache *cache;
> +
> +    for ( i = 0; i < vio->mmio_cache_count; i ++ )
> +    {
> +        cache = &vio->mmio_cache[i];
> +
> +        if ( gla == cache->gla &&
> +             dir == cache->dir )
> +            return cache;
> +    }
> +
> +    i = vio->mmio_cache_count++;
> +    BUG_ON(i == ARRAY_SIZE(vio->mmio_cache));
> +
> +    cache = &vio->mmio_cache[i];
> +    memset(cache, 0, sizeof (*cache));
> +
> +    cache->gla = gla;
> +    cache->dir = dir;
> +
> +    return cache;
> +}

There's still a weakness here (but that also applies if you used physical
addresses): Multiple reads to the same address may not return the
same result. I don't think this needs to be addressed here, but adding
a comment clarifying that this case isn't handled correctly would help
future readers more easily understand the state of affairs.

As to the BUG_ON() - I think this would better be a domain_crash(),
just in case we overlooked an exotic instruction accessing more than
3 memory locations.
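E.g. (sketch only; the caller would then of course need to cope with a
NULL return):

    i = vio->mmio_cache_count;
    if ( i == ARRAY_SIZE(vio->mmio_cache) )
    {
        domain_crash(current->domain);
        return NULL;
    }
    vio->mmio_cache_count = i + 1;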

Jan


* Re: [PATCH v6 16/16] x86/hvm: track large memory mapped accesses by buffer offset
  2015-07-03 16:25 ` [PATCH v6 16/16] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
  2015-07-09 10:33   ` Jan Beulich
@ 2015-07-09 10:34   ` Jan Beulich
  1 sibling, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-09 10:34 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Keir Fraser

>>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:

(sorry for sending this in pieces - I started this reply and then another)

> --- a/xen/include/asm-x86/hvm/vcpu.h
> +++ b/xen/include/asm-x86/hvm/vcpu.h
> @@ -42,6 +42,17 @@ struct hvm_vcpu_asid {
>      uint32_t asid;
>  };
>  
> +/*
> + * We may read or write up to m256 as a number of device-model
> + * transactions.
> + */
> +struct hvm_mmio_cache {
> +    unsigned long size;
> +    uint8_t buffer[32];
> +    unsigned long gla;
> +    uint8_t dir;
> +};

"size" can be unsigned int (or even less, but that's not worth it) and
hence the structure size can be shrunk by putting it next to "dir".
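E.g. (sketch only):

    struct hvm_mmio_cache {
        unsigned long gla;
        unsigned int  size;
        uint8_t       dir;
        uint8_t       buffer[32];
    };

which should shrink the structure from 56 to 48 bytes on x86-64.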

Jan


* Re: [PATCH v6 16/16] x86/hvm: track large memory mapped accesses by buffer offset
  2015-07-09 10:33   ` Jan Beulich
@ 2015-07-09 10:36     ` Paul Durrant
  0 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-09 10:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Keir (Xen.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 11:33
> To: Paul Durrant
> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [PATCH v6 16/16] x86/hvm: track large memory mapped
> accesses by buffer offset
> 
> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> > @@ -635,13 +605,41 @@ static int hvmemul_phys_mmio_access(
> >      return rc;
> >  }
> >
> > +static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
> > +    struct hvm_vcpu_io *vio, unsigned long gla, uint8_t dir)
> > +{
> > +    unsigned int i;
> > +    struct hvm_mmio_cache *cache;
> > +
> > +    for ( i = 0; i < vio->mmio_cache_count; i ++ )
> > +    {
> > +        cache = &vio->mmio_cache[i];
> > +
> > +        if ( gla == cache->gla &&
> > +             dir == cache->dir )
> > +            return cache;
> > +    }
> > +
> > +    i = vio->mmio_cache_count++;
> > +    BUG_ON(i == ARRAY_SIZE(vio->mmio_cache));
> > +
> > +    cache = &vio->mmio_cache[i];
> > +    memset(cache, 0, sizeof (*cache));
> > +
> > +    cache->gla = gla;
> > +    cache->dir = dir;
> > +
> > +    return cache;
> > +}
> 
> There's still a weakness here (but that also applies if you used physical
> addresses): Multiple reads to the same address may not return the
> same result. I don't think this needs to be addressed here, but adding
> a comment clarifying that this case isn't handled correctly would help
> future readers more easily understand the state of affairs.
> 

Yes, the way that multicycle I/O is handled cannot cope with the same instruction reading/writing the same place twice. I will add a comment noting that as you suggest.

> As to the BUG_ON() - I think this would better be a domain_crash(),
> just in case we overlooked an exotic instruction accessing more than
> 3 memory locations.
> 

Sure.

  Paul

> Jan


* Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-09 10:05   ` Jan Beulich
@ 2015-07-09 11:11     ` Paul Durrant
  2015-07-09 12:04       ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-09 11:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Keir (Xen.org), xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 11:05
> To: Paul Durrant
> Cc: Andrew Cooper; xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to
> handle retry
> 
> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> > @@ -287,17 +271,56 @@ static int hvmemul_do_io_addr(
> >      bool_t is_mmio, paddr_t addr, unsigned long *reps,
> >      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
> >  {
> > -    struct page_info *ram_page;
> > +    struct vcpu *v = current;
> > +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
> > +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
> > +    struct page_info *ram_page[2];
> > +    int nr_pages = 0;
> 
> unsigned int
> 

No, it's intentionally signed because the unwind code at the end of the function is:

    while ( --nr_pages >= 0 )
        hvmemul_release_page(ram_page[nr_pages]);

I.e. the loop terminates when nr_pages gets to -1.

> > +    unsigned long count;
> >      int rc;
> >
> > -    rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page);
> > +    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
> >      if ( rc != X86EMUL_OKAY )
> > -        return rc;
> > +        goto out;
> >
> > -    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
> > +    nr_pages++;
> > +
> > +    /* Detemine how many reps will fit within this page */
> 
> "Determine" and missing stop.
> 
> > +    count = min_t(unsigned long,
> > +                  *reps,
> > +                  df ?
> > +                  (page_off + size - 1) / size :
> 
> Looks like this won't be correct for the page crossing case (where
> the code below really expects to see zero). I think you simply need
> to mask with (PAGE_SIZE - 1) again here before dividing.

Yep.
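
(A quick standalone check of that arithmetic - not the Xen code itself,
just the df arm with and without the extra mask, on a hard-coded
page-crossing example:)

#include <stdio.h>

#define PAGE_SIZE 4096u

int main(void)
{
    unsigned int page_off = PAGE_SIZE - 1; /* access starts at the last byte... */
    unsigned int size = 4;                 /* ...so it crosses into the next page */

    unsigned int unmasked = (page_off + size - 1) / size;
    unsigned int masked   = ((page_off + size - 1) & (PAGE_SIZE - 1)) / size;

    printf("unmasked=%u masked=%u\n", unmasked, masked); /* prints 1024 vs 0 */
    return 0;
}

With the mask the page-crossing case yields the zero count the code below
expects; without it the quotient is nonsense.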

> 
> > @@ -468,6 +469,9 @@ void hvm_do_resume(struct vcpu *v)
> >          }
> >      }
> >
> > +    if ( vio->mmio_retry )
> > +        (void)handle_mmio();
> 
> Pointless cast.
> 

Ok.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix
  2015-07-08 15:44 ` [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Jan Beulich
@ 2015-07-09 11:31   ` Paul Durrant
  2015-07-09 11:43     ` David Vrabel
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-09 11:31 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 08 July 2015 16:44
> To: Paul Durrant
> Cc: xen-devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v6 00/16] x86/hvm: I/O emulation cleanup
> and fix
> 
> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> > Changelog (now per-patch)
> > -------------------------
> 
> That's quite nice, but still not really helpful when not put in the
> individual patches - one still needs to keep this one and the patch
> open to follow things incrementally.
> 

The problem is that (at least my version of) git format-patch --notes refuses to put notes after the ---, but I'll resort to hand editing the patches before I post them.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix
  2015-07-09 11:31   ` Paul Durrant
@ 2015-07-09 11:43     ` David Vrabel
  0 siblings, 0 replies; 53+ messages in thread
From: David Vrabel @ 2015-07-09 11:43 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich; +Cc: xen-devel@lists.xenproject.org

On 09/07/15 12:31, Paul Durrant wrote:
> 
> The problem is that (at least my version of) git format-patch --notes
> refuses to put notes after the ---, but I'll resort to hand editing
> the patches before I post them.

You can add the changes to the commit message itself; git
format-patch/send-email then does the right thing.

I find it easiest to add the per-commit change notes when git commit
--amend'ing or during the rebase when squashing patches together.
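
E.g. (purely an illustration of the layout, not an actual patch):

  x86/hvm: limit reps to avoid the need to handle retry

  <patch description>

  v6: <changes since v5>
  v5: <changes since v4>

  Signed-off-by: Paul Durrant <paul.durrant@citrix.com>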

David

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-09 11:11     ` Paul Durrant
@ 2015-07-09 12:04       ` Jan Beulich
  2015-07-09 12:50         ` Paul Durrant
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2015-07-09 12:04 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, Keir(Xen.org), xen-devel@lists.xenproject.org

>>> On 09.07.15 at 13:11, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 09 July 2015 11:05
>> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
>> > @@ -287,17 +271,56 @@ static int hvmemul_do_io_addr(
>> >      bool_t is_mmio, paddr_t addr, unsigned long *reps,
>> >      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
>> >  {
>> > -    struct page_info *ram_page;
>> > +    struct vcpu *v = current;
>> > +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
>> > +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
>> > +    struct page_info *ram_page[2];
>> > +    int nr_pages = 0;
>> 
>> unsigned int
> 
> No, it's intentionally signed because the unwind code at the end of the 
> function is:
> 
>     while ( --nr_pages >= 0 )
>         hvmemul_release_page(ram_page[nr_pages]);
> 
> I.e. the loop terminates when nr_pages gets to -1.

     while ( nr_pages-- )
         hvmemul_release_page(ram_page[nr_pages]);

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-09 12:04       ` Jan Beulich
@ 2015-07-09 12:50         ` Paul Durrant
  2015-07-09 13:38           ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-09 12:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Keir (Xen.org), xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 13:04
> To: Paul Durrant
> Cc: Andrew Cooper; xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: RE: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to
> handle retry
> 
> >>> On 09.07.15 at 13:11, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 09 July 2015 11:05
> >> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> >> > @@ -287,17 +271,56 @@ static int hvmemul_do_io_addr(
> >> >      bool_t is_mmio, paddr_t addr, unsigned long *reps,
> >> >      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
> >> >  {
> >> > -    struct page_info *ram_page;
> >> > +    struct vcpu *v = current;
> >> > +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
> >> > +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
> >> > +    struct page_info *ram_page[2];
> >> > +    int nr_pages = 0;
> >>
> >> unsigned int
> >
> > No, it's intentionally signed because the unwind code at the end of the
> > function is:
> >
> >     while ( --nr_pages >= 0 )
> >         hvmemul_release_page(ram_page[nr_pages]);
> >
> > I.e. the loop terminates when nr_pages gets to -1.
> 
>      while ( nr_pages-- )
>          hvmemul_release_page(ram_page[nr_pages]);
> 

But that wouldn't be correct, since the loop would try to release ram_page[1] and ram_page[0] in the case where only ram_page[0] had been acquired. It would have to be:

      while ( nr_pages )
          hvmemul_release_page(ram_page[--nr_pages]);

which I'll do, since you so dislike signed ints.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-09 12:50         ` Paul Durrant
@ 2015-07-09 13:38           ` Jan Beulich
  2015-07-09 13:42             ` Paul Durrant
  2015-07-09 14:00             ` Paul Durrant
  0 siblings, 2 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-09 13:38 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, Keir(Xen.org), xen-devel@lists.xenproject.org

>>> On 09.07.15 at 14:50, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 09 July 2015 13:04
>> >>> On 09.07.15 at 13:11, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 09 July 2015 11:05
>> >> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
>> >> > @@ -287,17 +271,56 @@ static int hvmemul_do_io_addr(
>> >> >      bool_t is_mmio, paddr_t addr, unsigned long *reps,
>> >> >      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
>> >> >  {
>> >> > -    struct page_info *ram_page;
>> >> > +    struct vcpu *v = current;
>> >> > +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
>> >> > +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
>> >> > +    struct page_info *ram_page[2];
>> >> > +    int nr_pages = 0;
>> >>
>> >> unsigned int
>> >
>> > No, it's intentionally signed because the unwind code at the end of the
>> > function is:
>> >
>> >     while ( --nr_pages >= 0 )
>> >         hvmemul_release_page(ram_page[nr_pages]);
>> >
>> > I.e. the loop terminates when nr_pages gets to -1.
>> 
>>      while ( nr_pages-- )
>>          hvmemul_release_page(ram_page[nr_pages]);
>> 
> 
> But that wouldn't be correct, since the loop would try to release 
> ram_page[1] and ram_page[0] in the case where only ram_page[0] had been 
> acquired. It would have to be:
> 
>       while ( nr_pages )
>           hvmemul_release_page(ram_page[--nr_pages]);

Unless you need the value of nr_pages after the loop, I can't see
the difference between your and my proposal - the decrement in
both cases happens between determining whether to continue the
loop and accessing the intended array element.

> which I'll do, since you so dislike signed ints.

Sounds like you're not understanding why. Just look at the
generated code when using signed vs unsigned int as array
index.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-09 13:38           ` Jan Beulich
@ 2015-07-09 13:42             ` Paul Durrant
  2015-07-09 14:00             ` Paul Durrant
  1 sibling, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2015-07-09 13:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Keir (Xen.org), xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 14:38
> To: Paul Durrant
> Cc: Andrew Cooper; xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: RE: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to
> handle retry
> 
> >>> On 09.07.15 at 14:50, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 09 July 2015 13:04
> >> >>> On 09.07.15 at 13:11, <Paul.Durrant@citrix.com> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: 09 July 2015 11:05
> >> >> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> >> >> > @@ -287,17 +271,56 @@ static int hvmemul_do_io_addr(
> >> >> >      bool_t is_mmio, paddr_t addr, unsigned long *reps,
> >> >> >      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
> >> >> >  {
> >> >> > -    struct page_info *ram_page;
> >> >> > +    struct vcpu *v = current;
> >> >> > +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
> >> >> > +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
> >> >> > +    struct page_info *ram_page[2];
> >> >> > +    int nr_pages = 0;
> >> >>
> >> >> unsigned int
> >> >
> >> > No, it's intentionally signed because the unwind code at the end of the
> >> > function is:
> >> >
> >> >     while ( --nr_pages >= 0 )
> >> >         hvmemul_release_page(ram_page[nr_pages]);
> >> >
> >> > I.e. the loop terminates when nr_pages gets to -1.
> >>
> >>      while ( nr_pages-- )
> >>          hvmemul_release_page(ram_page[nr_pages]);
> >>
> >
> > But that wouldn't be correct, since the loop would try to release
> > ram_page[1] and ram_page[0] in the case where only ram_page[0] had
> been
> > acquired. It would have to be:
> >
> >       while ( nr_pages )
> >           hvmemul_release_page(ram_page[--nr_pages]);
> 
> Unless you need the value of nr_pages after the loop, I can't see
> the difference between your and my proposal - the decrement in
> both cases happens between determining whether to continue the
> loop and accessing the intended array element.

D'oh. Of course it does.

> 
> > which I'll do, since you so dislike signed ints.
> 
> Sounds like you're not understanding why. Just look at the
> generated code when using signed vs unsigned int as array
> index.

Ok, I'll take a look because I would not have thought it would make a huge difference.

  Paul

> 
> Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-09 13:38           ` Jan Beulich
  2015-07-09 13:42             ` Paul Durrant
@ 2015-07-09 14:00             ` Paul Durrant
  2015-07-09 14:19               ` Jan Beulich
  1 sibling, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2015-07-09 14:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Keir (Xen.org), xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Paul Durrant
> Sent: 09 July 2015 14:43
> To: 'Jan Beulich'
> Cc: Andrew Cooper; xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: RE: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to
> handle retry
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 09 July 2015 14:38
> > To: Paul Durrant
> > Cc: Andrew Cooper; xen-devel@lists.xenproject.org; Keir (Xen.org)
> > Subject: RE: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to
> > handle retry
> >
> > >>> On 09.07.15 at 14:50, <Paul.Durrant@citrix.com> wrote:
> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> Sent: 09 July 2015 13:04
> > >> >>> On 09.07.15 at 13:11, <Paul.Durrant@citrix.com> wrote:
> > >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> >> Sent: 09 July 2015 11:05
> > >> >> >>> On 03.07.15 at 18:25, <paul.durrant@citrix.com> wrote:
> > >> >> > @@ -287,17 +271,56 @@ static int hvmemul_do_io_addr(
> > >> >> >      bool_t is_mmio, paddr_t addr, unsigned long *reps,
> > >> >> >      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
> > >> >> >  {
> > >> >> > -    struct page_info *ram_page;
> > >> >> > +    struct vcpu *v = current;
> > >> >> > +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
> > >> >> > +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
> > >> >> > +    struct page_info *ram_page[2];
> > >> >> > +    int nr_pages = 0;
> > >> >>
> > >> >> unsigned int
> > >> >
> > >> > No, it's intentionally signed because the unwind code at the end of
> the
> > >> > function is:
> > >> >
> > >> >     while ( --nr_pages >= 0 )
> > >> >         hvmemul_release_page(ram_page[nr_pages]);
> > >> >
> > >> > I.e. the loop terminates when nr_pages gets to -1.
> > >>
> > >>      while ( nr_pages-- )
> > >>          hvmemul_release_page(ram_page[nr_pages]);
> > >>
> > >
> > > But that wouldn't be correct, since the loop would try to release
> > > ram_page[1] and ram_page[0] in the case where only ram_page[0] had
> > been
> > > acquired. It would have to be:
> > >
> > >       while ( nr_pages )
> > >           hvmemul_release_page(ram_page[--nr_pages]);
> >
> > Unless you need the value of nr_pages after the loop, I can't see
> > the difference between your and my proposal - the decrement in
> > both cases happens between determining whether to continue the
> > loop and accessing the intended array element.
> 
> D'oh. Of course it does.
> 
> >
> > > which I'll do, since you so dislike signed ints.
> >
> > Sounds like you're not understanding why. Just look at the
> > generated code when using signed vs unsigned int as array
> > index.
> 
> Ok, I'll take a look because I would not have thought it would make a huge
> difference.
> 

So.... Two test programs:

test1.c:
-------

#include <stdio.h>

int main(void)
{
  int i = 10;
  int a[10];

  while ( --i >= 0 ) {
    a[i] = i;
    printf("%d\n", i);
  }

  return 0;
}

test2.c:
-------

#include <stdio.h>

int main(void)
{
  unsigned int i = 10;
  int a[10];

  while ( i ) {
    a[--i] = i;
    printf("%d\n", i);
  }

  return 0;
}

I compiled them with just 'make test1' and 'make test2', i.e. using gcc's default options (on a Debian wheezy box). The relevant bits of the objdump output are:

test1
-----

000000000040050c <main>:
  40050c:	55                   	push   %rbp
  40050d:	48 89 e5             	mov    %rsp,%rbp
  400510:	48 83 ec 30          	sub    $0x30,%rsp
  400514:	c7 45 fc 0a 00 00 00 	movl   $0xa,-0x4(%rbp)
  40051b:	eb 20                	jmp    40053d <main+0x31>
  40051d:	8b 45 fc             	mov    -0x4(%rbp),%eax
  400520:	48 98                	cltq   
  400522:	8b 55 fc             	mov    -0x4(%rbp),%edx
  400525:	89 54 85 d0          	mov    %edx,-0x30(%rbp,%rax,4)
  400529:	8b 45 fc             	mov    -0x4(%rbp),%eax
  40052c:	89 c6                	mov    %eax,%esi
  40052e:	bf fc 05 40 00       	mov    $0x4005fc,%edi
  400533:	b8 00 00 00 00       	mov    $0x0,%eax
  400538:	e8 a3 fe ff ff       	callq  4003e0 <printf@plt>
  40053d:	83 6d fc 01          	subl   $0x1,-0x4(%rbp)
  400541:	83 7d fc 00          	cmpl   $0x0,-0x4(%rbp)
  400545:	79 d6                	jns    40051d <main+0x11>
  400547:	b8 00 00 00 00       	mov    $0x0,%eax
  40054c:	c9                   	leaveq 
  40054d:	c3                   	retq   
  40054e:	90                   	nop
  40054f:	90                   	nop

test2
-----

000000000040050c <main>:
  40050c:	55                   	push   %rbp
  40050d:	48 89 e5             	mov    %rsp,%rbp
  400510:	48 83 ec 30          	sub    $0x30,%rsp
  400514:	c7 45 fc 0a 00 00 00 	movl   $0xa,-0x4(%rbp)
  40051b:	eb 22                	jmp    40053f <main+0x33>
  40051d:	83 6d fc 01          	subl   $0x1,-0x4(%rbp)
  400521:	8b 55 fc             	mov    -0x4(%rbp),%edx
  400524:	8b 45 fc             	mov    -0x4(%rbp),%eax
  400527:	89 54 85 d0          	mov    %edx,-0x30(%rbp,%rax,4)
  40052b:	8b 45 fc             	mov    -0x4(%rbp),%eax
  40052e:	89 c6                	mov    %eax,%esi
  400530:	bf fc 05 40 00       	mov    $0x4005fc,%edi
  400535:	b8 00 00 00 00       	mov    $0x0,%eax
  40053a:	e8 a1 fe ff ff       	callq  4003e0 <printf@plt>
  40053f:	83 7d fc 00          	cmpl   $0x0,-0x4(%rbp)
  400543:	75 d8                	jne    40051d <main+0x11>
  400545:	b8 00 00 00 00       	mov    $0x0,%eax
  40054a:	c9                   	leaveq 
  40054b:	c3                   	retq   
  40054c:	90                   	nop
  40054d:	90                   	nop
  40054e:	90                   	nop
  40054f:	90                   	nop

So, the only significant difference I can see is that using a signed value causes an extra cltq instruction to be used. Or maybe I need some specific level of optimization for there to be a big difference?

  Paul

>   Paul
> 
> >
> > Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-09 14:00             ` Paul Durrant
@ 2015-07-09 14:19               ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2015-07-09 14:19 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, Keir(Xen.org), xen-devel@lists.xenproject.org

>>> On 09.07.15 at 16:00, <Paul.Durrant@citrix.com> wrote:
> So, the only significant difference I can see is that using a signed value 
> causes an extra cltq instruction to be used. Or maybe I need some specific 
> level of optimization for there to be a big difference?

No, this is the difference I'm trying to point out (and if the index
register wasn't %rax, it would be a movslq or whatever its AT&T
name is).

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2015-07-09 14:19 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-03 16:25 [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
2015-07-03 16:25 ` [PATCH v6 01/16] x86/hvm: make sure emulation is retried if domain is shutting down Paul Durrant
2015-07-03 16:25 ` [PATCH v6 02/16] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
2015-07-03 17:17   ` Andrew Cooper
2015-07-08 15:52   ` Jan Beulich
2015-07-08 15:57     ` Paul Durrant
2015-07-08 16:18       ` Jan Beulich
2015-07-08 16:43       ` Andrew Cooper
2015-07-09  6:53         ` Jan Beulich
2015-07-09  8:15           ` Paul Durrant
2015-07-09  9:19             ` Jan Beulich
2015-07-03 16:25 ` [PATCH v6 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
2015-07-03 16:25 ` [PATCH v6 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
2015-07-08 15:57   ` Jan Beulich
2015-07-08 15:59     ` Paul Durrant
2015-07-03 16:25 ` [PATCH v6 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
2015-07-08 16:11   ` Jan Beulich
2015-07-08 16:28   ` Jan Beulich
2015-07-03 16:25 ` [PATCH v6 06/16] x86/hvm: add length to mmio check op Paul Durrant
2015-07-03 16:25 ` [PATCH v6 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
2015-07-08 16:29   ` Jan Beulich
2015-07-03 16:25 ` [PATCH v6 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
2015-07-08 16:17   ` Jan Beulich
2015-07-09  9:40     ` Paul Durrant
2015-07-09  8:53   ` Jan Beulich
2015-07-09  9:00     ` Paul Durrant
2015-07-09  9:21       ` Jan Beulich
2015-07-09  9:17   ` Jan Beulich
2015-07-03 16:25 ` [PATCH v6 09/16] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
2015-07-03 17:18   ` Andrew Cooper
2015-07-09 10:05   ` Jan Beulich
2015-07-09 11:11     ` Paul Durrant
2015-07-09 12:04       ` Jan Beulich
2015-07-09 12:50         ` Paul Durrant
2015-07-09 13:38           ` Jan Beulich
2015-07-09 13:42             ` Paul Durrant
2015-07-09 14:00             ` Paul Durrant
2015-07-09 14:19               ` Jan Beulich
2015-07-03 16:25 ` [PATCH v6 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
2015-07-03 16:25 ` [PATCH v6 11/16] x86/hvm: split I/O completion handling from state model Paul Durrant
2015-07-09 10:09   ` Jan Beulich
2015-07-03 16:25 ` [PATCH v6 12/16] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
2015-07-09 10:13   ` Jan Beulich
2015-07-03 16:25 ` [PATCH v6 13/16] x86/hvm: remove hvm_io_state enumeration Paul Durrant
2015-07-03 16:25 ` [PATCH v6 14/16] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
2015-07-03 16:25 ` [PATCH v6 15/16] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
2015-07-03 16:25 ` [PATCH v6 16/16] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
2015-07-09 10:33   ` Jan Beulich
2015-07-09 10:36     ` Paul Durrant
2015-07-09 10:34   ` Jan Beulich
2015-07-08 15:44 ` [PATCH v6 00/16] x86/hvm: I/O emulation cleanup and fix Jan Beulich
2015-07-09 11:31   ` Paul Durrant
2015-07-09 11:43     ` David Vrabel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).