All of lore.kernel.org
 help / color / mirror / Atom feed
* RE: [PATCH] Calculate correct instruction length for data-fault VM exits on VT-x systems
@ 2006-04-28  9:19 Petersson, Mats
  2006-04-28  9:24 ` Keir Fraser
  0 siblings, 1 reply; 19+ messages in thread
From: Petersson, Mats @ 2006-04-28  9:19 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Khoa Huynh, xen-devel

 

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com 
> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of 
> Keir Fraser
> Sent: 28 April 2006 10:15
> To: Petersson, Mats
> Cc: Khoa Huynh; xen-devel
> Subject: Re: [Xen-devel] [PATCH] Calculate correct 
> instruction length for data-fault VM exits on VT-x systems
> 
> 
> On 28 Apr 2006, at 10:02, Petersson, Mats wrote:
> 
> > I'll look at your previous suggestion of merging the MMIO emulation 
> > into x86_emulate later on today. We probably do need to sum up the 
> > length and pass it back to the caller - as that code 
> doesn't know how 
> > to update the correct field of the different processor 
> architectures 
> > (vmcb vs. vmcs vs. stack-frame for Para-virtual machine). But it 
> > shouldn't be particularly hard to achieve this.
> 
> The emulator uses and updates the eip field of the passed-in 
> regs structure. We may want to change this interface in 
> future by having the caller explicitly pass in a buffer 
> containing the instruction, and the number of valid bytes in 
> the buffer.  Or add a 'fetch_insn_byte' 
> callback hook to the emulator interface.

I think passing a buffer is the best choice here. And I suppose we can
always stuff vmc[bs]->rip into regs->eip and pull it back out again when
we get back - using a wrapper function may be the easiest way to achieve
this (at least short term). 

We will of course also need to get the communication with QEMU done in
some way.

I haven't spent any time looking at it so far... 

--
Mats
> 
>   -- Keir
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread
* RE: [PATCH] Calculate correct instruction length for data-fault VM exits on VT-x systems
@ 2006-05-02 12:36 Petersson, Mats
  0 siblings, 0 replies; 19+ messages in thread
From: Petersson, Mats @ 2006-05-02 12:36 UTC (permalink / raw)
  To: Keir Fraser, Khoa Huynh; +Cc: xen-devel

> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] 
> Sent: 29 April 2006 08:22
> To: Khoa Huynh
> Cc: Petersson, Mats; xen-devel
> Subject: Re: [Xen-devel] [PATCH] Calculate correct 
> instruction length for data-fault VM exits on VT-x systems
> 
> 
> On 28 Apr 2006, at 19:10, Khoa Huynh wrote:
> 
> > For these instructions, on Intel VT-x, the instruction 
> length is valid 
> > in VMCS.  On AMD, there is a simple look-up function which 
> determines 
> > the length of the instruction which is passed in as a parameter.
> > We are good here.
> 
> The Intel code only uses the instr-len field because it 
> happens to be handy. Going to a look-up function in a 
> separate file when you *know* at compile time what the 
> instruction length must be is stupid, imo. We should only 
> have to do that if the instruction needs some decoding for us 
> to know its length (perhaps because of prefix bytes or 
> effective address suffixes) and we are not otherwise going to 
> be decoding the instruction as part of emulation.

We do have to forward the RIP to next instruction, and we don't know the
prefix and other things, so I don't think we can improve on the current
setup [although I noticed some time ago that you replaced the call to
calculate instrlen on HLT with a constant one, which I suppose is fine,
since IF there is a prefix (unlikely) to HLT, then we just re-execute it
without prefix, which doesn't make a whole lot of difference [Except we
MAY end up waiting for another interrupt, technically speaking, which
would probably not be a good thing - of course, I've never seen any code
with a prefix to HLT - but it's perfectly allowed to do redundant and
useless prefixes on any instruction, for example to change the alignment
of the next instruction]. 
> 
> > I guess we can have a wrapper that takes as input the guest 
> > instruction pointer (rip), fetches the whole MAX_INST_LEN (15-byte) 
> > buffer starting at rip (make sure that we don't cross a page 
> > boundary), and then passes that to the emulator.  The 
> emulator would 
> > decode, emulate, and would include in its return the updated guest 
> > instruction pointer (rip) and instruction length.  This 
> info will be 
> > stuffed back into vmcs/vmcb/stack as appropriate.  Is this more or 
> > less what you have in mind ?
> 
> Yes, exactly. It gets a bit trickier though -- we'll have to 
> fill the buffer with up to 15 bytes. If we fail to get all of 
> 15 bytes (perhaps because the instruction straddles a page 
> boundary and the second page has been evicted since the 
> instruction faulted) then we ought to tell the emulator how 
> many bytes we actually copied and it shoudl return error if 
> it ends up going off the end of teh instruction buffer. 
> Alternatively we could re-enter the guest immediately if we 
> cannot read
> 15 bytes from the EIP -- but that'll cause an infinite loop 
> if the instruction itself doesn't straddle the page boundary 
> as it won't trigger paging in the guest. But I/O instructions 
> are unlikely to be in paged memory.

Yes - at the moment, we do not cope well with instructions that straddle
a page-boundary if the next page isn't present in memory. 

On the other hand, I've been looking at just using the existing hooks
(in the ops structure) to read instruction bytes and I think that would
work just fine. 

However, I haven't been looking through this entire thread to see what
the conclusion is, so this may all be moot... 

--
Mats
> 
>   -- Keir
> 
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread
* RE: [PATCH] Calculate correct instruction length for data-fault VM exits on VT-x systems
@ 2006-04-28  9:02 Petersson, Mats
  2006-04-28  9:14 ` Keir Fraser
  0 siblings, 1 reply; 19+ messages in thread
From: Petersson, Mats @ 2006-04-28  9:02 UTC (permalink / raw)
  To: Keir Fraser, Khoa Huynh; +Cc: xen-devel

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com 
> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of 
> Keir Fraser
> Sent: 28 April 2006 07:03
> To: Khoa Huynh
> Cc: xen-devel
> Subject: Re: [Xen-devel] [PATCH] Calculate correct 
> instruction length for data-fault VM exits on VT-x systems
> 
> 
> On 28 Apr 2006, at 02:52, Khoa Huynh wrote:
> 
> > It should be noted that VMX only uses this instrlen 
> function when the 
> > hypervisor needs the instruction-length info and that info is 
> > undefined in VMCS, e.g., for MMIO instructions.  In other 
> cases where 
> > the instruction-length field is valid in VMCS, the hypervisor 
> > continues to get that info from VMCS (via vmread operation).
> 
> I don't believe we need the instruction-length at all, and I 
> suspect that the decoder could be removed from hvm/svm 
> entirely. There are two broad categories of instruction I'm 
> thinking of:
>   1. Instructions with their own VMEXIT reason code tend to 
> be really simple so we know their length anyway and, if not, 
> the instr-length field should be valid
>   2. For mmio instructions, the emulator can work out the 
> length for itself and increment eip appropriately. There's no 
> need to know the instruction length in advance of invoking 
> the emulator.
> 
> I guess there may be one or two instructions, particularly on 
> AMD, where we aren't feeding the instruction to the mmio 
> emulator and the instruction isn't fixed length, so perhaps 
> we'll need a small decoder in hvm/svm for those. But even if 
> so, it could be much simpler than what is there right now.

Yes, this is correct. There is a specific routine that takes as an
argument which instruction(s) we're looking for, and calculates it's
length, for this purpose [since we do know which instructions we are
looking for]. 

I'll look at your previous suggestion of merging the MMIO emulation into
x86_emulate later on today. We probably do need to sum up the length and
pass it back to the caller - as that code doesn't know how to update the
correct field of the different processor architectures (vmcb vs. vmcs
vs. stack-frame for Para-virtual machine). But it shouldn't be
particularly hard to achieve this. 

--
Mats

^ permalink raw reply	[flat|nested] 19+ messages in thread
* [PATCH] Calculate correct instruction length for data-fault VM exits on VT-x systems
@ 2006-04-28  1:52 Khoa Huynh
  2006-04-28  2:41 ` Anthony Liguori
  2006-04-28  6:03 ` Keir Fraser
  0 siblings, 2 replies; 19+ messages in thread
From: Khoa Huynh @ 2006-04-28  1:52 UTC (permalink / raw)
  To: xen-devel

[-- Attachment #1: Type: text/plain, Size: 2526 bytes --]

On VT-x systems, according to Intel VMX specifications,
the instruction-length information in VMCS on VM exits
is not always valid.  The instruction-length field in
VMCS is ONLY valid in the follwing cases: when the VM
exit is caused by the execution of instructions that
cause the VM exit unconditionally or based on the
execution-control bitmap, a software exception (INT3
or INT0), or a task switch.

For VM exits caused by data faults (hardware exceptions),
the instruction-length field in VMCS is actually undefined.
In these cases, the hypervisor can derive the correct
instruction length by fetching bytes based on the guest
instruction pointer and decoding those bytes.  There is
already a function to do this in the SVM sub-directory.
This function should be moved up one level to HVM
sub-directory, so both VMX and SVM can use it.

It should be noted that VMX only uses this instrlen
function when the hypervisor needs the instruction-length
info and that info is undefined in VMCS, e.g., for MMIO
instructions.  In other cases where the instruction-length
field is valid in VMCS, the hypervisor continues to get
that info from VMCS (via vmread operation).

I came across this problem in my effort to get Windows
NT booting on Xen.

There are TWO patches attached below:

* instrlen1.patch effectively moves the instrlen.c file
  from xen/arch/x86/hvm/svm sub-directory up one level to
  xen/arch/x86/hvm sub-directory and makes minor changes
  to instrlen.c so that it will work at its new location.

* instrlen2.patch makes additional changes to VMX code
  so the hypervisor can use the instrlen function correctly
  in all modes in cases where the instruction-length field is
  undefined and read from VMCS in cases where it is defined.

I must acknowledge that most of the code in the first patch
(instrlen1.patch) does not come from me since the primary
prupose of this patch is to move the instrlen.c file from
one location to another in the tree (it also makes some
minor changes).  The second patch (instrlen2.patch) is
more meaty :-)

These two patches should apply cleanly to the latest
xen-unstable tree (hg tip = 9866).

I have tested these patches successfully on two systems
using a variety of guest OSes (e.g. WinXP, Win2003 Server).

Signed-off-by: Khoa Huynh <khoa@us.ibm.com>

(See attached file: instrlen1.patch)(See attached file: instrlen2.patch)

Regards,
Khoa
_________________________________________
Khoa Huynh, Ph.D.
IBM Linux Technology Center
(512) 838-4903; T/L 678-4903; khoa@us.ibm.com

[-- Attachment #2: instrlen1.patch --]
[-- Type: application/octet-stream, Size: 32964 bytes --]

diff -r 4e1b8be54311 xen/arch/x86/hvm/instrlen.c
--- /dev/null	Thu Apr 27 12:38:21 2006
+++ b/xen/arch/x86/hvm/instrlen.c	Thu Apr 27 19:50:04 2006
@@ -0,0 +1,482 @@
+/*
+ * instrlen.c - calculates the instruction length for all operating modes
+ *
+ * Travis Betak, travis.betak@amd.com 
+ * Copyright (c) 2005,2006 AMD
+ * Copyright (c) 1006 IBM
+ * Copyright (c) 2005 Keir Fraser
+ * 
+ * Khoa Huynh (khoa@us.ibm.com):
+ *      This file was moved from xen/arch/x86/hvm/svm directory up to
+ *      this directory because the instrlen function is being used by
+ *      both VMX and SVM.  VMX only uses this function when the
+ *      hypervisor needs the instruction-length info and this info
+ *      is not available (i.e., undefined) in VMCS.
+ */
+
+/*
+ * TODO: the way in which we use hvm_instrlen is very inefficient as is now
+ * stands.  It will be worth while to return the actual instruction buffer
+ * along with the instruction length since one of the reasons we are getting
+ * the instruction length is to know how many instruction bytes we need to
+ * fetch.
+ */
+
+#include <xen/config.h>
+#include <xen/types.h>
+#include <xen/lib.h>
+#include <xen/mm.h>
+#include <asm/regs.h>
+#define DPRINTF DPRINTK
+#include <asm-x86/x86_emulate.h>
+
+/* read from guest memory */
+extern int inst_copy_from_guest(unsigned char *buf, unsigned long eip,
+        int length);
+
+/*
+ * Opcode effective-address decode tables.
+ * Note that we only emulate instructions that have at least one memory
+ * operand (excluding implicit stack references). We assume that stack
+ * references and instruction fetches will never occur in special memory
+ * areas that require emulation. So, for example, 'mov <imm>,<reg>' need
+ * not be handled.
+ */
+
+/* Operand sizes: 8-bit operands or specified/overridden size. */
+#define ByteOp      (1<<0) /* 8-bit operands. */
+/* Destination operand type. */
+#define ImplicitOps (1<<1) /* Implicit in opcode. No generic decode. */
+#define DstReg      (2<<1) /* Register operand. */
+#define DstMem      (3<<1) /* Memory operand. */
+#define DstMask     (3<<1)
+/* Source operand type. */
+#define SrcNone     (0<<3) /* No source operand. */
+#define SrcImplicit (0<<3) /* Source operand is implicit in the opcode. */
+#define SrcReg      (1<<3) /* Register operand. */
+#define SrcMem      (2<<3) /* Memory operand. */
+#define SrcMem16    (3<<3) /* Memory operand (16-bit). */
+#define SrcMem32    (4<<3) /* Memory operand (32-bit). */
+#define SrcImm      (5<<3) /* Immediate operand. */
+#define SrcImmByte  (6<<3) /* 8-bit sign-extended immediate operand. */
+#define SrcMask     (7<<3)
+/* Generic ModRM decode. */
+#define ModRM       (1<<6)
+/* Destination is only written; never read. */
+#define Mov         (1<<7)
+
+static uint8_t opcode_table[256] = {
+    /* 0x00 - 0x07 */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
+    0, 0, 0, 0,
+    /* 0x08 - 0x0F */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
+    0, 0, 0, 0,
+    /* 0x10 - 0x17 */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
+    0, 0, 0, 0,
+    /* 0x18 - 0x1F */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
+    0, 0, 0, 0,
+    /* 0x20 - 0x27 */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
+    0, 0, 0, 0,
+    /* 0x28 - 0x2F */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
+    0, 0, 0, 0,
+    /* 0x30 - 0x37 */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
+    0, 0, 0, 0,
+    /* 0x38 - 0x3F */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
+    0, 0, 0, 0,
+    /* 0x40 - 0x4F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x50 - 0x5F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x60 - 0x6F */
+    0, 0, 0, DstReg|SrcMem32|ModRM|Mov /* movsxd (x86/64) */,
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x70 - 0x7F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x80 - 0x87 */
+    ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImm|ModRM,
+    ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImmByte|ModRM,
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    /* 0x88 - 0x8F */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
+    0, 0, 0, DstMem|SrcNone|ModRM|Mov,
+    /* 0x90 - 0x9F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0xA0 - 0xA7 */
+    ByteOp|DstReg|SrcMem|Mov, DstReg|SrcMem|Mov,
+    ByteOp|DstMem|SrcReg|Mov, DstMem|SrcReg|Mov,
+    ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
+    ByteOp|ImplicitOps, ImplicitOps,
+    /* 0xA8 - 0xAF */
+    0, 0, ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
+    ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
+    ByteOp|ImplicitOps, ImplicitOps,
+    /* 0xB0 - 0xBF */
+    SrcImmByte, SrcImmByte, SrcImmByte, SrcImmByte, 
+    SrcImmByte, SrcImmByte, SrcImmByte, SrcImmByte, 
+    0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0xC0 - 0xC7 */
+    ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImmByte|ModRM, 0, 0,
+    0, 0, ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImm|ModRM,
+    /* 0xC8 - 0xCF */
+    0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0xD0 - 0xD7 */
+    ByteOp|DstMem|SrcImplicit|ModRM, DstMem|SrcImplicit|ModRM, 
+    ByteOp|DstMem|SrcImplicit|ModRM, DstMem|SrcImplicit|ModRM, 
+    0, 0, 0, 0,
+    /* 0xD8 - 0xDF */
+    0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0xE0 - 0xEF */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0xF0 - 0xF7 */
+    0, 0, 0, 0,
+    0, 0, ByteOp|DstMem|SrcNone|ModRM, DstMem|SrcNone|ModRM,
+    /* 0xF8 - 0xFF */
+    0, 0, 0, 0,
+    0, 0, ByteOp|DstMem|SrcNone|ModRM, DstMem|SrcNone|ModRM
+};
+
+static uint8_t twobyte_table[256] = {
+    /* 0x00 - 0x0F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM, 0, 0,
+    /* 0x10 - 0x1F */
+    0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x20 - 0x2F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x30 - 0x3F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x40 - 0x47 */
+    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
+    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
+    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
+    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
+    /* 0x48 - 0x4F */
+    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
+    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
+    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
+    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
+    /* 0x50 - 0x5F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x60 - 0x6F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x70 - 0x7F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x80 - 0x8F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0x90 - 0x9F */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0xA0 - 0xA7 */
+    0, 0, 0, DstMem|SrcReg|ModRM, 0, 0, 0, 0, 
+    /* 0xA8 - 0xAF */
+    0, 0, 0, DstMem|SrcReg|ModRM, 0, 0, 0, 0,
+    /* 0xB0 - 0xB7 */
+    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM, 0, DstMem|SrcReg|ModRM,
+    0, 0, ByteOp|DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem16|ModRM|Mov,
+    /* 0xB8 - 0xBF */
+    0, 0, DstMem|SrcImmByte|ModRM, DstMem|SrcReg|ModRM,
+    0, 0, ByteOp|DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem16|ModRM|Mov,
+    /* 0xC0 - 0xCF */
+    0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0xD0 - 0xDF */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0xE0 - 0xEF */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+    /* 0xF0 - 0xFF */
+    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
+};
+
+/* 
+ * insn_fetch - fetch the next 1 to 4 bytes from instruction stream 
+ * 
+ * @_type:   u8, u16, u32, s8, s16, or s32
+ * @_size:   1, 2, or 4 bytes
+ * @_eip:    address to fetch from guest memory
+ * @_length: updated! increments the current instruction length counter by _size
+ *
+ * INTERNAL this is used internally by hvm_instrlen to fetch the next byte,
+ * word, or dword from guest memory at location _eip.  we currently use a local
+ * unsigned long as the storage buffer since the most bytes we're gonna get
+ * is limited to 4.
+ */
+#define insn_fetch(_type, _size, _eip, _length) \
+({  unsigned long _x; \
+        if ((rc = inst_copy_from_guest((unsigned char *)(&(_x)), \
+                (unsigned long)(_eip), _size)) \
+                    != _size) \
+        goto done; \
+    (_eip) += (_size); \
+    (_length) += (_size); \
+    (_type)_x; \
+})
+
+
+/**
+ * hvm_instrlen - returns the current instructions length
+ *
+ * @regs: guest register state
+ * @mode: guest operating mode
+ *
+ * EXTERNAL this routine calculates the length of the current instruction
+ * pointed to by eip.  The guest state is _not_ changed by this routine.
+ */
+int hvm_instrlen(struct cpu_user_regs *regs, int mode)
+{
+    uint8_t b, d, twobyte = 0, rex_prefix = 0;
+    uint8_t modrm, modrm_mod = 0, modrm_reg = 0, modrm_rm = 0;
+    unsigned int op_bytes, ad_bytes, lock_prefix = 0, rep_prefix = 0, i;
+    int rc = 0;
+    int length = 0;
+    unsigned int tmp;
+
+    /* Shadow copy of register state. Committed on successful emulation. */
+    struct cpu_user_regs _regs = *regs;
+
+    /* include CS for 16-bit modes */
+    if (mode == X86EMUL_MODE_REAL || mode == X86EMUL_MODE_PROT16)
+        _regs.eip += (_regs.cs << 4);
+
+    switch ( mode )
+    {
+    case X86EMUL_MODE_REAL:
+    case X86EMUL_MODE_PROT16:
+        op_bytes = ad_bytes = 2;
+        break;
+    case X86EMUL_MODE_PROT32:
+        op_bytes = ad_bytes = 4;
+        break;
+#ifdef __x86_64__
+    case X86EMUL_MODE_PROT64:
+        op_bytes = 4;
+        ad_bytes = 8;
+        break;
+#endif
+    default:
+        printf("hvm_instrlen: invalid mode!\n");
+        return -1;
+    }
+
+    /* Legacy prefixes. */
+    for ( i = 0; i < 8; i++ )
+    {
+        switch ( b = insn_fetch(uint8_t, 1, _regs.eip, length) )
+        {
+        case 0x66: /* operand-size override */
+            op_bytes ^= 6;      /* switch between 2/4 bytes */
+            break;
+        case 0x67: /* address-size override */
+            if ( mode == X86EMUL_MODE_PROT64 )
+                ad_bytes ^= 12; /* switch between 4/8 bytes */
+            else
+                ad_bytes ^= 6;  /* switch between 2/4 bytes */
+            break;
+        case 0x2e: /* CS override */
+        case 0x3e: /* DS override */
+        case 0x26: /* ES override */
+        case 0x64: /* FS override */
+        case 0x65: /* GS override */
+        case 0x36: /* SS override */
+            break;
+        case 0xf0: /* LOCK */
+            lock_prefix = 1;
+            break;
+        case 0xf3: /* REP/REPE/REPZ */
+            rep_prefix = 1;
+            break;
+        case 0xf2: /* REPNE/REPNZ */
+            break;
+        default:
+            goto done_prefixes;
+        }
+    }
+done_prefixes:
+
+    /* Note quite the same as 80386 real mode, but hopefully good enough. */
+    if ( (mode == X86EMUL_MODE_REAL) && (ad_bytes != 2) ) {
+        printf("sonofabitch!! we don't support 32-bit addresses in realmode\n");
+        goto cannot_emulate;
+    }
+
+    /* REX prefix. */
+    if ( (mode == X86EMUL_MODE_PROT64) && ((b & 0xf0) == 0x40) )
+    {
+        rex_prefix = b;
+        if ( b & 8 )
+            op_bytes = 8;          /* REX.W */
+        modrm_reg = (b & 4) << 1;  /* REX.R */
+        /* REX.B and REX.X do not need to be decoded. */
+        b = insn_fetch(uint8_t, 1, _regs.eip, length);
+    }
+
+    /* Opcode byte(s). */
+    d = opcode_table[b];
+    if ( d == 0 )
+    {
+        /* Two-byte opcode? */
+        if ( b == 0x0f )
+        {
+            twobyte = 1;
+            b = insn_fetch(uint8_t, 1, _regs.eip, length);
+            d = twobyte_table[b];
+        }
+
+        /* Unrecognised? */
+        if ( d == 0 )
+            goto cannot_emulate;
+    }
+
+    /* ModRM and SIB bytes. */
+    if ( d & ModRM )
+    {
+        modrm = insn_fetch(uint8_t, 1, _regs.eip, length);
+        modrm_mod |= (modrm & 0xc0) >> 6;
+        modrm_reg |= (modrm & 0x38) >> 3;
+        modrm_rm  |= (modrm & 0x07);
+
+        if ( modrm_mod == 3 )
+        {
+            DPRINTF("Cannot parse ModRM.mod == 3.\n");
+            goto cannot_emulate;
+        }
+
+        if ( ad_bytes == 2 )
+        {
+            /* 16-bit ModR/M decode. */
+            switch ( modrm_mod )
+            {
+            case 0:
+                if ( modrm_rm == 6 ) 
+                {
+                    length += 2;
+                    _regs.eip += 2; /* skip disp16 */
+                }
+                break;
+            case 1:
+                length += 1;
+                _regs.eip += 1; /* skip disp8 */
+                break;
+            case 2:
+                length += 2;
+                _regs.eip += 2; /* skip disp16 */
+                break;
+            }
+        }
+        else
+        {
+            /* 32/64-bit ModR/M decode. */
+            switch ( modrm_mod )
+            {
+            case 0:
+                if ( (modrm_rm == 4) && 
+                     (((insn_fetch(uint8_t, 1, _regs.eip, length)) & 7) 
+                        == 5) )
+                {
+                    length += 4;
+                    _regs.eip += 4; /* skip disp32 specified by SIB.base */
+                }
+                else if ( modrm_rm == 5 )
+                {
+                    length += 4;
+                    _regs.eip += 4; /* skip disp32 */
+                }
+                break;
+            case 1:
+                if ( modrm_rm == 4 )
+                {
+                    insn_fetch(uint8_t, 1, _regs.eip, length);
+                }
+                length += 1;
+                _regs.eip += 1; /* skip disp8 */
+                break;
+            case 2:
+                if ( modrm_rm == 4 )
+                {
+                    insn_fetch(uint8_t, 1, _regs.eip, length);
+                }
+                length += 4;
+                _regs.eip += 4; /* skip disp32 */
+                break;
+            }
+        }
+    }
+
+    /* Decode and fetch the destination operand: register or memory. */
+    switch ( d & DstMask )
+    {
+    case ImplicitOps:
+        /* Special instructions do their own operand decoding. */
+        goto done;
+    }
+
+    /* Decode and fetch the source operand: register, memory or immediate. */
+    switch ( d & SrcMask )
+    {
+    case SrcImm:
+        tmp = (d & ByteOp) ? 1 : op_bytes;
+        if ( tmp == 8 ) tmp = 4;
+        /* NB. Immediates are sign-extended as necessary. */
+        switch ( tmp )
+        {
+        case 1: insn_fetch(int8_t,  1, _regs.eip, length); break;
+        case 2: insn_fetch(int16_t, 2, _regs.eip, length); break;
+        case 4: insn_fetch(int32_t, 4, _regs.eip, length); break;
+        }
+        break;
+    case SrcImmByte:
+        insn_fetch(int8_t,  1, _regs.eip, length);
+        break;
+    }
+
+    if ( twobyte )
+        goto done;
+
+    switch ( b )
+    {
+    case 0xa0 ... 0xa1: /* mov */
+        length += ad_bytes;
+        _regs.eip += ad_bytes; /* skip src displacement */
+        break;
+    case 0xa2 ... 0xa3: /* mov */
+        length += ad_bytes;
+        _regs.eip += ad_bytes; /* skip dst displacement */
+        break;
+    case 0xf6 ... 0xf7: /* Grp3 */
+        switch ( modrm_reg )
+        {
+        case 0 ... 1: /* test */
+            /* Special case in Grp3: test has an immediate source operand. */
+            tmp = (d & ByteOp) ? 1 : op_bytes;
+            if ( tmp == 8 ) tmp = 4;
+            switch ( tmp )
+            {
+            case 1: insn_fetch(int8_t,  1, _regs.eip, length); break;
+            case 2: insn_fetch(int16_t, 2, _regs.eip, length); break;
+            case 4: insn_fetch(int32_t, 4, _regs.eip, length); break;
+            }
+            goto done;
+	}
+        break;
+    }
+
+done:
+    return length;
+
+cannot_emulate:
+    printf("hvm_instrlen: Cannot emulate %02x at address %lx (eip %lx, mode %d)\n", b, (unsigned long)_regs.eip, (unsigned long)regs->eip, mode);
+    return -1;
+}
diff -r 4e1b8be54311 xen/arch/x86/hvm/svm/instrlen.c
--- a/xen/arch/x86/hvm/svm/instrlen.c	Thu Apr 27 12:38:21 2006
+++ /dev/null	Thu Apr 27 19:50:04 2006
@@ -1,479 +0,0 @@
-/*
- * instrlen.c - calculates the instruction length for all operating modes
- * 
- * Travis Betak, travis.betak@amd.com
- * Copyright (c) 2005,2006 AMD
- * Copyright (c) 2005 Keir Fraser
- *
- * Essentially a very, very stripped version of Keir Fraser's work in
- * x86_emulate.c.  Used for MMIO.
- */
-
-/*
- * TODO: the way in which we use svm_instrlen is very inefficient as is now
- * stands.  It will be worth while to return the actual instruction buffer
- * along with the instruction length since one of the reasons we are getting
- * the instruction length is to know how many instruction bytes we need to
- * fetch.
- */
-
-#include <xen/config.h>
-#include <xen/types.h>
-#include <xen/lib.h>
-#include <xen/mm.h>
-#include <asm/regs.h>
-#define DPRINTF DPRINTK
-#include <asm-x86/x86_emulate.h>
-
-/* read from guest memory */
-extern int inst_copy_from_guest(unsigned char *buf, unsigned long eip,
-        int length);
-extern void svm_dump_inst(unsigned long eip);
-
-/*
- * Opcode effective-address decode tables.
- * Note that we only emulate instructions that have at least one memory
- * operand (excluding implicit stack references). We assume that stack
- * references and instruction fetches will never occur in special memory
- * areas that require emulation. So, for example, 'mov <imm>,<reg>' need
- * not be handled.
- */
-
-/* Operand sizes: 8-bit operands or specified/overridden size. */
-#define ByteOp      (1<<0) /* 8-bit operands. */
-/* Destination operand type. */
-#define ImplicitOps (1<<1) /* Implicit in opcode. No generic decode. */
-#define DstReg      (2<<1) /* Register operand. */
-#define DstMem      (3<<1) /* Memory operand. */
-#define DstMask     (3<<1)
-/* Source operand type. */
-#define SrcNone     (0<<3) /* No source operand. */
-#define SrcImplicit (0<<3) /* Source operand is implicit in the opcode. */
-#define SrcReg      (1<<3) /* Register operand. */
-#define SrcMem      (2<<3) /* Memory operand. */
-#define SrcMem16    (3<<3) /* Memory operand (16-bit). */
-#define SrcMem32    (4<<3) /* Memory operand (32-bit). */
-#define SrcImm      (5<<3) /* Immediate operand. */
-#define SrcImmByte  (6<<3) /* 8-bit sign-extended immediate operand. */
-#define SrcMask     (7<<3)
-/* Generic ModRM decode. */
-#define ModRM       (1<<6)
-/* Destination is only written; never read. */
-#define Mov         (1<<7)
-
-static uint8_t opcode_table[256] = {
-    /* 0x00 - 0x07 */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
-    0, 0, 0, 0,
-    /* 0x08 - 0x0F */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
-    0, 0, 0, 0,
-    /* 0x10 - 0x17 */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
-    0, 0, 0, 0,
-    /* 0x18 - 0x1F */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
-    0, 0, 0, 0,
-    /* 0x20 - 0x27 */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
-    0, 0, 0, 0,
-    /* 0x28 - 0x2F */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
-    0, 0, 0, 0,
-    /* 0x30 - 0x37 */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
-    0, 0, 0, 0,
-    /* 0x38 - 0x3F */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
-    0, 0, 0, 0,
-    /* 0x40 - 0x4F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x50 - 0x5F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x60 - 0x6F */
-    0, 0, 0, DstReg|SrcMem32|ModRM|Mov /* movsxd (x86/64) */,
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x70 - 0x7F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x80 - 0x87 */
-    ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImm|ModRM,
-    ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImmByte|ModRM,
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    /* 0x88 - 0x8F */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
-    ByteOp|DstReg|SrcMem|ModRM, DstReg|SrcMem|ModRM,
-    0, 0, 0, DstMem|SrcNone|ModRM|Mov,
-    /* 0x90 - 0x9F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0xA0 - 0xA7 */
-    ByteOp|DstReg|SrcMem|Mov, DstReg|SrcMem|Mov,
-    ByteOp|DstMem|SrcReg|Mov, DstMem|SrcReg|Mov,
-    ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
-    ByteOp|ImplicitOps, ImplicitOps,
-    /* 0xA8 - 0xAF */
-    0, 0, ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
-    ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
-    ByteOp|ImplicitOps, ImplicitOps,
-    /* 0xB0 - 0xBF */
-    SrcImmByte, SrcImmByte, SrcImmByte, SrcImmByte, 
-    SrcImmByte, SrcImmByte, SrcImmByte, SrcImmByte, 
-    0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0xC0 - 0xC7 */
-    ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImmByte|ModRM, 0, 0,
-    0, 0, ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImm|ModRM,
-    /* 0xC8 - 0xCF */
-    0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0xD0 - 0xD7 */
-    ByteOp|DstMem|SrcImplicit|ModRM, DstMem|SrcImplicit|ModRM, 
-    ByteOp|DstMem|SrcImplicit|ModRM, DstMem|SrcImplicit|ModRM, 
-    0, 0, 0, 0,
-    /* 0xD8 - 0xDF */
-    0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0xE0 - 0xEF */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0xF0 - 0xF7 */
-    0, 0, 0, 0,
-    0, 0, ByteOp|DstMem|SrcNone|ModRM, DstMem|SrcNone|ModRM,
-    /* 0xF8 - 0xFF */
-    0, 0, 0, 0,
-    0, 0, ByteOp|DstMem|SrcNone|ModRM, DstMem|SrcNone|ModRM
-};
-
-static uint8_t twobyte_table[256] = {
-    /* 0x00 - 0x0F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM, 0, 0,
-    /* 0x10 - 0x1F */
-    0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x20 - 0x2F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x30 - 0x3F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x40 - 0x47 */
-    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
-    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
-    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
-    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
-    /* 0x48 - 0x4F */
-    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
-    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
-    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
-    DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
-    /* 0x50 - 0x5F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x60 - 0x6F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x70 - 0x7F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x80 - 0x8F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0x90 - 0x9F */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0xA0 - 0xA7 */
-    0, 0, 0, DstMem|SrcReg|ModRM, 0, 0, 0, 0, 
-    /* 0xA8 - 0xAF */
-    0, 0, 0, DstMem|SrcReg|ModRM, 0, 0, 0, 0,
-    /* 0xB0 - 0xB7 */
-    ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM, 0, DstMem|SrcReg|ModRM,
-    0, 0, ByteOp|DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem16|ModRM|Mov,
-    /* 0xB8 - 0xBF */
-    0, 0, DstMem|SrcImmByte|ModRM, DstMem|SrcReg|ModRM,
-    0, 0, ByteOp|DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem16|ModRM|Mov,
-    /* 0xC0 - 0xCF */
-    0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0xD0 - 0xDF */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0xE0 - 0xEF */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-    /* 0xF0 - 0xFF */
-    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
-};
-
-/* 
- * insn_fetch - fetch the next 1 to 4 bytes from instruction stream 
- * 
- * @_type:   u8, u16, u32, s8, s16, or s32
- * @_size:   1, 2, or 4 bytes
- * @_eip:    address to fetch from guest memory
- * @_length: updated! increments the current instruction length counter by _size
- *
- * INTERNAL this is used internally by svm_instrlen to fetch the next byte,
- * word, or dword from guest memory at location _eip.  we currently use a local
- * unsigned long as the storage buffer since the most bytes we're gonna get
- * is limited to 4.
- */
-#define insn_fetch(_type, _size, _eip, _length) \
-({  unsigned long _x; \
-        if ((rc = inst_copy_from_guest((unsigned char *)(&(_x)), \
-                (unsigned long)(_eip), _size)) \
-                    != _size) \
-        goto done; \
-    (_eip) += (_size); \
-    (_length) += (_size); \
-    (_type)_x; \
-})
-
-
-/**
- * svn_instrlen - returns the current instructions length
- *
- * @regs: guest register state
- * @mode: guest operating mode
- *
- * EXTERNAL this routine calculates the length of the current instruction
- * pointed to by eip.  The guest state is _not_ changed by this routine.
- */
-int svm_instrlen(struct cpu_user_regs *regs, int mode)
-{
-    uint8_t b, d, twobyte = 0, rex_prefix = 0;
-    uint8_t modrm, modrm_mod = 0, modrm_reg = 0, modrm_rm = 0;
-    unsigned int op_bytes, ad_bytes, lock_prefix = 0, rep_prefix = 0, i;
-    int rc = 0;
-    int length = 0;
-    unsigned int tmp;
-
-    /* Shadow copy of register state. Committed on successful emulation. */
-    struct cpu_user_regs _regs = *regs;
-
-    /* include CS for 16-bit modes */
-    if (mode == X86EMUL_MODE_REAL || mode == X86EMUL_MODE_PROT16)
-        _regs.eip += (_regs.cs << 4);
-
-    switch ( mode )
-    {
-    case X86EMUL_MODE_REAL:
-    case X86EMUL_MODE_PROT16:
-        op_bytes = ad_bytes = 2;
-        break;
-    case X86EMUL_MODE_PROT32:
-        op_bytes = ad_bytes = 4;
-        break;
-#ifdef __x86_64__
-    case X86EMUL_MODE_PROT64:
-        op_bytes = 4;
-        ad_bytes = 8;
-        break;
-#endif
-    default:
-        return -1;
-    }
-
-    /* Legacy prefixes. */
-    for ( i = 0; i < 8; i++ )
-    {
-        switch ( b = insn_fetch(uint8_t, 1, _regs.eip, length) )
-        {
-        case 0x66: /* operand-size override */
-            op_bytes ^= 6;      /* switch between 2/4 bytes */
-            break;
-        case 0x67: /* address-size override */
-            if ( mode == X86EMUL_MODE_PROT64 )
-                ad_bytes ^= 12; /* switch between 4/8 bytes */
-            else
-                ad_bytes ^= 6;  /* switch between 2/4 bytes */
-            break;
-        case 0x2e: /* CS override */
-        case 0x3e: /* DS override */
-        case 0x26: /* ES override */
-        case 0x64: /* FS override */
-        case 0x65: /* GS override */
-        case 0x36: /* SS override */
-            break;
-        case 0xf0: /* LOCK */
-            lock_prefix = 1;
-            break;
-        case 0xf3: /* REP/REPE/REPZ */
-            rep_prefix = 1;
-            break;
-        case 0xf2: /* REPNE/REPNZ */
-            break;
-        default:
-            goto done_prefixes;
-        }
-    }
-done_prefixes:
-
-    /* Note quite the same as 80386 real mode, but hopefully good enough. */
-    if ( (mode == X86EMUL_MODE_REAL) && (ad_bytes != 2) ) {
-        printf("sonofabitch!! we don't support 32-bit addresses in realmode\n");
-        goto cannot_emulate;
-    }
-
-    /* REX prefix. */
-    if ( (mode == X86EMUL_MODE_PROT64) && ((b & 0xf0) == 0x40) )
-    {
-        rex_prefix = b;
-        if ( b & 8 )
-            op_bytes = 8;          /* REX.W */
-        modrm_reg = (b & 4) << 1;  /* REX.R */
-        /* REX.B and REX.X do not need to be decoded. */
-        b = insn_fetch(uint8_t, 1, _regs.eip, length);
-    }
-
-    /* Opcode byte(s). */
-    d = opcode_table[b];
-    if ( d == 0 )
-    {
-        /* Two-byte opcode? */
-        if ( b == 0x0f )
-        {
-            twobyte = 1;
-            b = insn_fetch(uint8_t, 1, _regs.eip, length);
-            d = twobyte_table[b];
-        }
-
-        /* Unrecognised? */
-        if ( d == 0 )
-            goto cannot_emulate;
-    }
-
-    /* ModRM and SIB bytes. */
-    if ( d & ModRM )
-    {
-        modrm = insn_fetch(uint8_t, 1, _regs.eip, length);
-        modrm_mod |= (modrm & 0xc0) >> 6;
-        modrm_reg |= (modrm & 0x38) >> 3;
-        modrm_rm  |= (modrm & 0x07);
-
-        if ( modrm_mod == 3 )
-        {
-            DPRINTF("Cannot parse ModRM.mod == 3.\n");
-            goto cannot_emulate;
-        }
-
-        if ( ad_bytes == 2 )
-        {
-            /* 16-bit ModR/M decode. */
-            switch ( modrm_mod )
-            {
-            case 0:
-                if ( modrm_rm == 6 ) 
-                {
-                    length += 2;
-                    _regs.eip += 2; /* skip disp16 */
-                }
-                break;
-            case 1:
-                length += 1;
-                _regs.eip += 1; /* skip disp8 */
-                break;
-            case 2:
-                length += 2;
-                _regs.eip += 2; /* skip disp16 */
-                break;
-            }
-        }
-        else
-        {
-            /* 32/64-bit ModR/M decode. */
-            switch ( modrm_mod )
-            {
-            case 0:
-                if ( (modrm_rm == 4) && 
-                     (((insn_fetch(uint8_t, 1, _regs.eip, length)) & 7) 
-                        == 5) )
-                {
-                    length += 4;
-                    _regs.eip += 4; /* skip disp32 specified by SIB.base */
-                }
-                else if ( modrm_rm == 5 )
-                {
-                    length += 4;
-                    _regs.eip += 4; /* skip disp32 */
-                }
-                break;
-            case 1:
-                if ( modrm_rm == 4 )
-                {
-                    insn_fetch(uint8_t, 1, _regs.eip, length);
-                }
-                length += 1;
-                _regs.eip += 1; /* skip disp8 */
-                break;
-            case 2:
-                if ( modrm_rm == 4 )
-                {
-                    insn_fetch(uint8_t, 1, _regs.eip, length);
-                }
-                length += 4;
-                _regs.eip += 4; /* skip disp32 */
-                break;
-            }
-        }
-    }
-
-    /* Decode and fetch the destination operand: register or memory. */
-    switch ( d & DstMask )
-    {
-    case ImplicitOps:
-        /* Special instructions do their own operand decoding. */
-        goto done;
-    }
-
-    /* Decode and fetch the source operand: register, memory or immediate. */
-    switch ( d & SrcMask )
-    {
-    case SrcImm:
-        tmp = (d & ByteOp) ? 1 : op_bytes;
-        if ( tmp == 8 ) tmp = 4;
-        /* NB. Immediates are sign-extended as necessary. */
-        switch ( tmp )
-        {
-        case 1: insn_fetch(int8_t,  1, _regs.eip, length); break;
-        case 2: insn_fetch(int16_t, 2, _regs.eip, length); break;
-        case 4: insn_fetch(int32_t, 4, _regs.eip, length); break;
-        }
-        break;
-    case SrcImmByte:
-        insn_fetch(int8_t,  1, _regs.eip, length);
-        break;
-    }
-
-    if ( twobyte )
-        goto done;
-
-    switch ( b )
-    {
-    case 0xa0 ... 0xa1: /* mov */
-        length += ad_bytes;
-        _regs.eip += ad_bytes; /* skip src displacement */
-        break;
-    case 0xa2 ... 0xa3: /* mov */
-        length += ad_bytes;
-        _regs.eip += ad_bytes; /* skip dst displacement */
-        break;
-    case 0xf6 ... 0xf7: /* Grp3 */
-        switch ( modrm_reg )
-        {
-        case 0 ... 1: /* test */
-            /* Special case in Grp3: test has an immediate source operand. */
-            tmp = (d & ByteOp) ? 1 : op_bytes;
-            if ( tmp == 8 ) tmp = 4;
-            switch ( tmp )
-            {
-            case 1: insn_fetch(int8_t,  1, _regs.eip, length); break;
-            case 2: insn_fetch(int16_t, 2, _regs.eip, length); break;
-            case 4: insn_fetch(int32_t, 4, _regs.eip, length); break;
-            }
-            goto done;
-	}
-        break;
-    }
-
-done:
-    return length;
-
-cannot_emulate:
-    DPRINTF("Cannot emulate %02x at address %lx (eip %lx, mode %d)\n",
-            b, (unsigned long)_regs.eip, (unsigned long)regs->eip, mode);
-    svm_dump_inst(_regs.eip);
-    return -1;
-}

[-- Attachment #3: instrlen2.patch --]
[-- Type: application/octet-stream, Size: 8841 bytes --]

diff -r 4e1b8be54311 xen/arch/x86/hvm/Makefile
--- a/xen/arch/x86/hvm/Makefile	Thu Apr 27 12:38:21 2006
+++ b/xen/arch/x86/hvm/Makefile	Thu Apr 27 20:02:11 2006
@@ -6,6 +6,7 @@
 obj-y += i8259.o
 obj-y += intercept.o
 obj-y += io.o
+obj-y += instrlen.o
 obj-y += platform.o
 obj-y += vioapic.o
 obj-y += vlapic.o
diff -r 4e1b8be54311 xen/arch/x86/hvm/svm/Makefile
--- a/xen/arch/x86/hvm/svm/Makefile	Thu Apr 27 12:38:21 2006
+++ b/xen/arch/x86/hvm/svm/Makefile	Thu Apr 27 20:02:11 2006
@@ -2,7 +2,6 @@
 subdir-$(x86_64) += x86_64
 
 obj-y += emulate.o
-obj-y += instrlen.o
 obj-y += intr.o
 obj-y += svm.o
 obj-y += vmcb.o
diff -r 4e1b8be54311 xen/arch/x86/hvm/svm/svm.c
--- a/xen/arch/x86/hvm/svm/svm.c	Thu Apr 27 12:38:21 2006
+++ b/xen/arch/x86/hvm/svm/svm.c	Thu Apr 27 20:02:11 2006
@@ -70,7 +70,7 @@
 extern asmlinkage void do_IRQ(struct cpu_user_regs *);
 extern void send_pio_req(struct cpu_user_regs *regs, unsigned long port,
        unsigned long count, int size, long value, int dir, int pvalid);
-extern int svm_instrlen(struct cpu_user_regs *regs, int mode);
+extern int hvm_instrlen(struct cpu_user_regs *regs, int mode);
 extern void svm_dump_inst(unsigned long eip);
 extern int svm_dbg_on;
 void svm_manual_event_injection32(struct vcpu *v, struct cpu_user_regs *regs, 
@@ -409,7 +409,7 @@
         mode = vmcb->cs.attributes.fields.l ? 8 : 4;
     else
         mode = (eflags & X86_EFLAGS_VM) || !(cr0 & X86_CR0_PE) ? 2 : 4;
-    return svm_instrlen(guest_cpu_user_regs(), mode);
+    return hvm_instrlen(guest_cpu_user_regs(), mode);
 }
 
 unsigned long svm_get_ctrl_reg(struct vcpu *v, unsigned int num)
diff -r 4e1b8be54311 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Apr 27 12:38:21 2006
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Apr 27 20:02:11 2006
@@ -50,6 +50,7 @@
 static unsigned long trace_values[NR_CPUS][4];
 #define TRACE_VMEXIT(index,value) trace_values[smp_processor_id()][index]=value
 
+extern int hvm_instrlen(struct cpu_user_regs *regs, int mode);
 static void vmx_ctxt_switch_from(struct vcpu *v);
 static void vmx_ctxt_switch_to(struct vcpu *v);
 
@@ -560,11 +561,20 @@
 
 int vmx_instruction_length(struct vcpu *v)
 {
-    unsigned long inst_len;
-
-    if (__vmread(VM_EXIT_INSTRUCTION_LEN, &inst_len))
-    	return 0;
-    return inst_len;
+    struct vmx_context c;
+    unsigned long cr0, rflags, mode;
+    
+    /* check which operating mode the guest is running */
+    if( VMX_LONG_GUEST(v) ) {
+        __vmread(GUEST_CS_AR_BYTES, &c.cs_arbytes.bytes);
+        mode = c.cs_arbytes.fields.reserved1 ? 8 : 4;
+    }
+    else {
+        __vmread(GUEST_RFLAGS, &rflags);
+        __vmread_vcpu(v, GUEST_CR0, &cr0);
+        mode = (rflags & X86_EFLAGS_VM) || !(cr0 & X86_CR0_PE) ? 2 : 4;
+    }
+    return hvm_instrlen(guest_cpu_user_regs(), mode);
 }
 
 unsigned long vmx_get_ctrl_reg(struct vcpu *v, unsigned int num)
@@ -713,9 +723,9 @@
 /*
  * Not all cases receive valid value in the VM-exit instruction length field.
  */
-#define __get_instruction_length(len) \
+#define __get_instruction_length_from_VMCS(len) \
     __vmread(VM_EXIT_INSTRUCTION_LEN, &(len)); \
-     if ((len) < 1 || (len) > 15) \
+     if ((len) < 1 || (len) > MAX_INST_LEN) \
         __hvm_bug(&regs);
 
 static void inline __update_guest_eip(unsigned long inst_len)
@@ -762,7 +772,7 @@
         /* No support for APIC */
         if (!hvm_apic_support(v->domain) && gpa >= 0xFEC00000) { 
             u32 inst_len;
-            __vmread(VM_EXIT_INSTRUCTION_LEN, &(inst_len));
+            inst_len = hvm_instruction_length(v);
             __update_guest_eip(inst_len);
             return 1;
         }
@@ -773,8 +783,9 @@
 
     result = shadow_fault(va, regs);
     TRACE_VMEXIT (2,result);
-#if 0
-    if ( !result )
+
+#if 0 /* keep for debugging */    
+if ( !result )
     {
         __vmread(GUEST_RIP, &eip);
         printk("vmx pgfault to guest va=%lx eip=%lx\n", va, eip);
@@ -2180,11 +2191,11 @@
         break;
     case EXIT_REASON_CPUID:
         vmx_vmexit_do_cpuid(&regs);
-        __get_instruction_length(inst_len);
+        __get_instruction_length_from_VMCS(inst_len);
         __update_guest_eip(inst_len);
         break;
     case EXIT_REASON_HLT:
-        __get_instruction_length(inst_len);
+        __get_instruction_length_from_VMCS(inst_len);
         __update_guest_eip(inst_len);
         vmx_vmexit_do_hlt();
         break;
@@ -2194,13 +2205,13 @@
 
         __vmread(EXIT_QUALIFICATION, &va);
         vmx_vmexit_do_invlpg(va);
-        __get_instruction_length(inst_len);
+        __get_instruction_length_from_VMCS(inst_len);
         __update_guest_eip(inst_len);
         break;
     }
 #if 0 /* keep this for debugging */
     case EXIT_REASON_VMCALL:
-        __get_instruction_length(inst_len);
+        __get_instruction_length_from_VMCS(inst_len);
         __vmread(GUEST_RIP, &eip);
         __vmread(EXIT_QUALIFICATION, &exit_qualification);
 
@@ -2211,7 +2222,7 @@
     case EXIT_REASON_CR_ACCESS:
     {
         __vmread(GUEST_RIP, &eip);
-        __get_instruction_length(inst_len);
+        __get_instruction_length_from_VMCS(inst_len);
         __vmread(EXIT_QUALIFICATION, &exit_qualification);
 
         HVM_DBG_LOG(DBG_LEVEL_1, "eip = %lx, inst_len =%lx, exit_qualification = %lx",
@@ -2225,24 +2236,24 @@
     case EXIT_REASON_DR_ACCESS:
         __vmread(EXIT_QUALIFICATION, &exit_qualification);
         vmx_dr_access(exit_qualification, &regs);
-        __get_instruction_length(inst_len);
+        __get_instruction_length_from_VMCS(inst_len);
         __update_guest_eip(inst_len);
         break;
     case EXIT_REASON_IO_INSTRUCTION:
         __vmread(EXIT_QUALIFICATION, &exit_qualification);
-        __get_instruction_length(inst_len);
+        __get_instruction_length_from_VMCS(inst_len);
         vmx_io_instruction(&regs, exit_qualification, inst_len);
         TRACE_VMEXIT(4,exit_qualification);
         break;
     case EXIT_REASON_MSR_READ:
-        __get_instruction_length(inst_len);
+        __get_instruction_length_from_VMCS(inst_len);
         vmx_do_msr_read(&regs);
         __update_guest_eip(inst_len);
         break;
     case EXIT_REASON_MSR_WRITE:
         __vmread(GUEST_RIP, &eip);
         vmx_do_msr_write(&regs);
-        __get_instruction_length(inst_len);
+        __get_instruction_length_from_VMCS(inst_len);
         __update_guest_eip(inst_len);
         break;
     case EXIT_REASON_MWAIT_INSTRUCTION:
@@ -2258,8 +2269,8 @@
     case EXIT_REASON_VMWRITE:
     case EXIT_REASON_VMOFF:
     case EXIT_REASON_VMON:
-        /* Report invalid opcode exception when a VMX guest tries to execute 
-            any of the VMX instructions */
+        /* Report invalid opcode exception when a VMX guest tries to execute
+           any of the VMX instructions */
         vmx_inject_exception(v, TRAP_invalid_op, VMX_DELIVER_NO_ERROR_CODE);
         break;
 
diff -r 4e1b8be54311 xen/include/asm-x86/hvm/vmx/vmcs.h
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h	Thu Apr 27 12:38:21 2006
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h	Thu Apr 27 20:02:11 2006
@@ -79,6 +79,53 @@
     void                    *io_bitmap_a, *io_bitmap_b;
     struct timer            hlt_timer;  /* hlt ins emulation wakeup timer */
 };
+/*
+ * VMX context 
+ */
+typedef struct vmx_context {
+    uint32_t  eip;        /* execution pointer */
+    uint32_t  esp;        /* stack pointer */
+    uint32_t  eflags;     /* flags register */
+    uint32_t  cr0;
+    uint32_t  cr3;        /* page table directory */
+    uint32_t  cr4;
+    uint32_t  idtr_limit; /* idt */
+    uint32_t  idtr_base;
+    uint32_t  gdtr_limit; /* gdt */
+    uint32_t  gdtr_base;
+    uint32_t  cs_sel;     /* cs selector */
+    uint32_t  cs_limit;
+    uint32_t  cs_base;
+    union vmcs_arbytes cs_arbytes;
+    uint32_t  ds_sel;     /* ds selector */
+    uint32_t  ds_limit;
+    uint32_t  ds_base;
+    union vmcs_arbytes ds_arbytes;
+    uint32_t  es_sel;     /* es selector */
+    uint32_t  es_limit;
+    uint32_t  es_base;
+    union vmcs_arbytes es_arbytes;
+    uint32_t  ss_sel;     /* ss selector */
+    uint32_t  ss_limit;
+    uint32_t  ss_base;
+    union vmcs_arbytes ss_arbytes;
+    uint32_t  fs_sel;     /* fs selector */
+    uint32_t  fs_limit;
+    uint32_t  fs_base;
+    union vmcs_arbytes fs_arbytes;
+    uint32_t  gs_sel;     /* gs selector */
+    uint32_t  gs_limit;
+    uint32_t  gs_base;
+    union vmcs_arbytes gs_arbytes;
+    uint32_t  tr_sel;     /* task selector */
+    uint32_t  tr_limit;
+    uint32_t  tr_base;
+    union vmcs_arbytes tr_arbytes;
+    uint32_t  ldtr_sel;   /* ldtr selector */
+    uint32_t  ldtr_limit;
+    uint32_t  ldtr_base;
+    union vmcs_arbytes ldtr_arbytes;
+} vmx_context_t;
 
 #define vmx_schedule_tail(next)         \
     (next)->thread.arch_vmx.arch_vmx_schedule_tail((next))

[-- Attachment #4: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2006-05-02 12:36 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-04-28  9:19 [PATCH] Calculate correct instruction length for data-fault VM exits on VT-x systems Petersson, Mats
2006-04-28  9:24 ` Keir Fraser
2006-04-29  1:20   ` Leendert van Doorn
2006-04-28 20:02     ` Anthony Liguori
2006-04-29  8:00     ` Keir Fraser
2006-04-29 14:54       ` Leendert van Doorn
2006-04-29 10:39         ` Keir Fraser
2006-04-29 23:24           ` Leendert van Doorn
2006-04-29 18:54             ` Keir Fraser
2006-04-30  1:37               ` Leendert van Doorn
2006-04-29 19:46                 ` Keir Fraser
  -- strict thread matches above, loose matches on Subject: below --
2006-05-02 12:36 Petersson, Mats
2006-04-28  9:02 Petersson, Mats
2006-04-28  9:14 ` Keir Fraser
2006-04-28  1:52 Khoa Huynh
2006-04-28  2:41 ` Anthony Liguori
2006-04-28  6:03 ` Keir Fraser
2006-04-28 18:10   ` Khoa Huynh
2006-04-29  7:21     ` Keir Fraser

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.