From mboxrd@z Thu Jan  1 00:00:00 1970
From: Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH -v2] kvm: Emulate MOVBE
Date: Sun, 14 Apr 2013 23:02:18 +0200
Message-ID: <20130414210218.GF20547@pd.tnic>
References: <20130409234602.GI5077@pd.tnic>
 <20130410112942.07dfc167@slackpad>
 <20130410100845.GB17919@redhat.com>
 <20130410123901.46b65169@slackpad>
 <20130410121639.GE17919@redhat.com>
 <20130411001815.GA17544@pd.tnic>
 <20130414084303.GE17919@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: Andre Przywara <andre@andrep.de>, kvm@vger.kernel.org,
	=?utf-8?B?SsO2cmcgUsO2ZGVs?= <joro@8bytes.org>,
	"H. Peter Anvin" <hpa@zytor.com>, x86-ml <x86@kernel.org>
To: Gleb Natapov <gleb@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail.skyhub.de ([78.46.96.112]:55308 "EHLO mail.skyhub.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753730Ab3DNVCc (ORCPT <rfc822;kvm@vger.kernel.org>);
	Sun, 14 Apr 2013 17:02:32 -0400
Content-Disposition: inline
In-Reply-To: <20130414084303.GE17919@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Sun, Apr 14, 2013 at 11:43:03AM +0300, Gleb Natapov wrote:
> > +#define EmulateOnUD ((u64)1 << 46)  /* emulate if unsupported by the host */
> >  
> Just rename VendorSpecific to EmulateOnUD. The meaning is the same.

done.

> > +static int em_movbe(struct x86_emulate_ctxt *ctxt)
> > +{
> > +	switch (ctxt->op_bytes) {
> > +	case 2:
> > +		*(u16 *)ctxt->dst.valptr = swab16(*(u16 *)ctxt->src.valptr);
> What's wrong with: ctxt->dst.val = swab16((u16)ctxt->src.val)?

This doesn't work for the 16-bit swab because of the following
functionality of MOVBE:

"When the operand size is 16 bits, the upper word of the destination
register remains unchanged."

Now here's what gcc produces here:

	movzwl	112(%rdi), %eax	# ctxt_5(D)->src.D.27823.val, tmp83
	rolw	$8, %ax	#, tmp83
	movzwl	%ax, %eax	# tmp83, tmp84
	movq	%rax, 240(%rdi)	# tmp84,

See the zero-extension happening twice (second one should not be needed,
even) and it is actually writing the whole 64-bit value in %rax back
which effectively destroys the upper word?

However, this, a bit uglier version works:

	*(u16 *)&ctxt->dst.val = swab16((u16)ctxt->src.val);

	movzwl  112(%rdi), %eax # ctxt_5(D)->src.D.27823.val, tmp82
	rolw    $8, %ax #, tmp82
	movw    %ax, 240(%rdi)  # tmp82, MEM[(u16 *)ctxt_5(D) + 240B]

Whichever version we take, I'll slap a comment explaining why we're
doing it differently for 16-bit.

> > +	case 4:
> > +		*(u32 *)ctxt->dst.valptr = swab32(*(u32 *)ctxt->src.valptr);
> > +
> > +		/*
> > +		 * clear upper dword for 32-bit operand size in 64-bit mode.
> > +		 */
> > +		if (ctxt->mode == X86EMUL_MODE_PROT64)
> > +			*((u32 *)ctxt->dst.valptr + 1) = 0x0;
> It should be clean without this this.

Right, this should probably work because for 32-bit, we don't care
about the zero-extended upper dword and in 64-bit mode we want to do it
anyway.

> > +		break;
> > +	case 8:
> > +		*(u64 *)ctxt->dst.valptr = swab64(*(u64 *)ctxt->src.valptr);

Ditto for this one.

> > +static const struct opcode threebyte_table[] = {
> > +	[0xf0] = I(DstReg | SrcMem | ModRM | Mov | EmulateOnUD, em_movbe),
> > +	[0xf1] = I(DstMem | SrcReg | ModRM | Mov | EmulateOnUD, em_movbe),
> > +};
> > +
> Populate the whole table explicitly please. You can use X16(N) to make it
> short. The encoding is also wrong since 0F 38 F0/F1 instruction decoding
> depend on a prefix.

... on the absence of a prefix, actually.

> Same opcode is decoded as CRC32 with f2 prefix. We have Prefix
> mechanism for such instructions, but it currently assumes that 66 and
> f2/f3 prefixes are mutually exclusive, but in this case they are not,
> ugh.

Yeah, I'm lazy and wanted to leave the honors to the poor soul who
implements the whole 0F_38h opcode map. :-)

Ok, I probably could do something similar to pfx_vmovntpx with the GP
macro for the F0/F1 opcodes. I'll give it a try when I get a chance
soonish.

> > +
> > +		if (ctxt->b == 0x38)
> > +			opcode = threebyte_table[insn_fetch(u8, ctxt)];
> ctxt->b = insn_fetch(u8, ctxt);
> opcode = threebyte_table[ctxt->b];
> 
> Otherwise OpImplicit type of decoding will not work for three byte
> instructions. Also should rename twobyte into opbytes and set it to 1,
> 2 or 3. I prefer that three byte instruction decoding will be sent as a
> separate patch.

Ok.

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--