From mboxrd@z Thu Jan 1 00:00:00 1970 From: Gleb Natapov Subject: Re: [PATCH 1/2] kvm-unit-tests: Add a func to run instruction in emulator Date: Wed, 19 Jun 2013 15:26:48 +0300 Message-ID: <20130619122648.GA18149@redhat.com> References: <20130618124517.GA16557@redhat.com> <20130618154754.GE21032@redhat.com> <20130618160951.GA22555@redhat.com> <20130618164458.GB22555@redhat.com> <20130619093146.GI5832@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Paolo Bonzini , kvm To: =?utf-8?B?5p2O5pil5aWHIDxBcnRodXIgQ2h1bnFpIExpPg==?= Return-path: Received: from mx1.redhat.com ([209.132.183.28]:7016 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934479Ab3FSM0w convert rfc822-to-8bit (ORCPT ); Wed, 19 Jun 2013 08:26:52 -0400 Content-Disposition: inline In-Reply-To: Sender: kvm-owner@vger.kernel.org List-ID: On Wed, Jun 19, 2013 at 08:18:29PM +0800, =E6=9D=8E=E6=98=A5=E5=A5=87 <= Arthur Chunqi Li> wrote: > On Wed, Jun 19, 2013 at 5:31 PM, Gleb Natapov wrote= : > > On Wed, Jun 19, 2013 at 09:26:59AM +0800, =E6=9D=8E=E6=98=A5=E5=A5=87= wrote: > >> On Wed, Jun 19, 2013 at 12:44 AM, Gleb Natapov w= rote: > >> > Send code in a form of a patch. > >> > > >> > On Wed, Jun 19, 2013 at 12:14:13AM +0800, =E6=9D=8E=E6=98=A5=E5=A5= =87 wrote: > >> >> extern u8 insn_page[], insn_page_end[]; > >> >> extern u8 test_insn[], test_insn_end[]; > >> >> extern u8 alt_insn_page[]; > >> >> > >> >> asm( > >> >> ".align 4096\n\t" > >> >> ".global insn_page\n\t" > >> >> ".global insn_page_end\n\t" > >> >> ".global test_insn\n\t" > >> >> ".global test_insn_end\n\t" > >> >> "insn_page:\n\t" > >> >> > >> >> "ret \n\t" > >> >> > >> >> "push %rax; push %rbx\n\t" > >> >> "push %rcx; push %rdx\n\t" > >> >> "push %rsi; push %rdi\n\t" > >> >> "push %rbp\n\t" > >> >> "push %r8; push %r9\n\t" > >> >> "push %r10; push %r11\n\t" > >> >> "push %r12; push %r13\n\t" > >> >> "push %r14; push %r15\n\t" > >> >> "pushf\n\t" > >> >> > >> >> "push 136+save \n\t" > >> >> "popf \n\t" > >> >> "mov 0+save, %rax \n\t" > >> >> "mov 8+save, %rbx \n\t" > >> >> "mov 16+save, %rcx \n\t" > >> >> "mov 24+save, %rdx \n\t" > >> >> "mov 32+save, %rsi \n\t" > >> >> "mov 40+save, %rdi \n\t" > >> >> "mov 56+save, %rbp \n\t" > >> >> "mov 64+save, %r8 \n\t" > >> >> "mov 72+save, %r9 \n\t" > >> >> "mov 80+save, %r10 \n\t" > >> >> "mov 88+save, %r11 \n\t" > >> >> "mov 96+save, %r12 \n\t" > >> >> "mov 104+save, %r13 \n\t" > >> >> "mov 112+save, %r14 \n\t" > >> >> "mov 120+save, %r15 \n\t" > >> >> > >> >> "test_insn:\n\t" > >> >> "in (%dx),%al\n\t" > >> >> ". =3D . + 31\n\t" > >> >> "test_insn_end:\n\t" > >> >> > >> >> "pushf \n\t" > >> >> "pop 136+save \n\t" > >> >> "mov %rax, 0+save \n\t" > >> >> "mov %rbx, 8+save \n\t" > >> >> "mov %rcx, 16+save \n\t" > >> >> "mov %rdx, 24+save \n\t" > >> >> "mov %rsi, 32+save \n\t" > >> >> "mov %rdi, 40+save \n\t" > >> >> "mov %rbp, 56+save \n\t" > >> >> "mov %r8, 64+save \n\t" > >> >> "mov %r9, 72+save \n\t" > >> >> "mov %r10, 80+save \n\t" > >> >> "mov %r11, 88+save \n\t" > >> >> "mov %r12, 96+save \n\t" > >> >> "mov %r13, 104+save \n\t" > >> >> "mov %r14, 112+save \n\t" > >> >> "mov %r15, 120+save \n\t" > >> >> "popf \n\t" > >> >> "pop %r15; pop %r14 \n\t" > >> >> "pop %r13; pop %r12 \n\t" > >> >> "pop %r11; pop %r10 \n\t" > >> >> "pop %r9; pop %r8 \n\t" > >> >> "pop %rbp \n\t" > >> >> "pop %rdi; pop %rsi \n\t" > >> >> "pop %rdx; pop %rcx \n\t" > >> >> "pop %rbx; pop %rax \n\t" > >> >> > >> >> "ret\n\t" > >> >> "save:\n\t" > >> >> ". =3D . + 256\n\t" > >> >> ".align 4096\n\t" > >> >> "alt_insn_page:\n\t" > >> >> ". =3D . + 4096\n\t" > >> >> ); > >> >> > >> >> > >> >> static void mk_insn_page(uint8_t *alt_insn_page, > >> >> uint8_t *alt_insn, int alt_insn_length) > >> >> { > >> >> int i, emul_offset; > >> >> for (i=3D1; i >> >> test_insn[i] =3D 0x90; // nop > >> > Why? Gcc should pad it with nops. > >> > > >> >> emul_offset =3D test_insn - insn_page; > >> >> for (i=3D0; i >> >> alt_insn_page[i+emul_offset] =3D alt_insn[i]; > >> >> } > >> >> > >> >> static void trap_emulator(uint64_t *mem, uint8_t* alt_insn, int= alt_insn_length) > >> >> { > >> >> ulong *cr3 =3D (ulong *)read_cr3(); > >> >> int save_offset =3D (u8 *)(&save) - insn_page; > >> >> > >> >> memset(alt_insn_page, 0x90, 4096); > >> > alt_insn_page should contains the same instruction as insn_page = except > >> > between test_insn and test_insn_end. I do not know how you expec= t it to > >> > work otherwise. > >> In my oponion, only codes between test_insn and test_insn_end in > >> alt_insn_page need to be set, insn_page will be executed in the gu= est, > >> and when trapping into emulator OS will load alt_insn_page (becaus= e of > >> invlpg(insn_page)), then return to guest with executing insn_page > >> (from TLB). > > While before trap the code will likely be executed from insn_page, > > but after the trap it is very optimistic to assume that tlb cache > > will still contain this virtual address since host will execute qui= te a > > lot of code and can even schedule in the middle, so the TLB will no= t > > contain the address and your test will crash. Even the code before = test > > instruction can be executed from alt_insn_page if guest is schedule= d out > > after invlpg() and before it executes every instruction until trapp= ing > > one. In your case the test will crash too instead of yielding false= positive. > > > >> I don't know if this is right, but I use this trick in my > >> previous patch and it runs well. > > Your previous patches always had c3 (ret) after tested instruction = on > > alt_insn_page. > > > >> I use "trace-cmd record -e kvm" t= o > >> trace it and found instructions in alt_insn_page are not executed,= so > >> I suppose that alt_insn_page is not loaded to the right place. > > Do you see "in" instruction emulated? Anyway current code is incorr= ect > > since current install_page() implementation cannot handle large pag= es > > and the code is backed up by large pages. You can fix install_page(= ) to > > check for that and break large page into small one before installin= g a > > page. > Here I have two questions. > 1. There's another function called "install_large_page", can it be > used to our occasion? I found that this function is not used at all. It is used when initial page tables are created. See lib/x86/vm.c:setup_mmu_range() > 2. Why will current version runs well? Do pages allocated dynamically > are automatically aligned to 2MB (large page size)? >=20 No, they are 4K pages. > Arthur > > > >> > >> Arthur > >> > > >> >> save =3D inregs; > >> >> mk_insn_page(alt_insn_page, alt_insn, alt_insn_length); > >> >> // Load the code TLB with insn_page, but point the page tab= les at > >> >> // alt_insn_page (and keep the data TLB clear, for AMD deco= de assist). > >> >> // This will make the CPU trap on the insn_page instruction= but the > >> >> // hypervisor will see alt_insn_page. > >> >> //install_page(cr3, virt_to_phys(insn_page), insn_page); > >> >> invlpg(insn_page); > >> >> // Load code TLB > >> >> asm volatile("call *%0" : : "r"(insn_page)); > >> >> install_page(cr3, virt_to_phys(alt_insn_page), insn_page); > >> >> // Trap, let hypervisor emulate at alt_insn_page > >> >> asm volatile("call *%0": : "r"(insn_page+1)); > >> >> > >> >> outregs =3D *((struct regs *)(&alt_insn_page[save_offset]))= ; > >> >> } > >> >> > >> >> static void test_movabs(uint64_t *mem) > >> >> { > >> >> // mov $0xc3c3c3c3c3c3c3c3, %rcx > >> >> uint8_t alt_insn[] =3D {0x48, 0xb9, 0xc3, 0xc3, 0xc3, > >> >> 0xc3, 0xc3, 0xc3, 0xc3, 0xc3}; > >> >> inregs =3D (struct regs){ 0 }; > >> >> trap_emulator(mem, alt_insn, 10); > >> >> report("64-bit mov imm2", outregs.rcx =3D=3D 0xc3c3c3c3c3c3= c3c3); > >> >> } > >> >> > >> >> On Wed, Jun 19, 2013 at 12:09 AM, Gleb Natapov wrote: > >> >> > On Tue, Jun 18, 2013 at 11:56:24PM +0800, =E6=9D=8E=E6=98=A5=E5= =A5=87 wrote: > >> >> >> On Tue, Jun 18, 2013 at 11:47 PM, Gleb Natapov wrote: > >> >> >> > On Tue, Jun 18, 2013 at 10:28:59PM +0800, =C3=8A??=C3=8A?=E2= =80=A2=C3=82=E2=80=A2? wrote: > >> >> >> >> On Tue, Jun 18, 2013 at 8:45 PM, Gleb Natapov wrote: > >> >> >> >> > On Thu, Jun 13, 2013 at 05:30:03PM +0800, =C3=8A=C3=B9=C3= =A9=C3=8A=C3=B2=E2=80=A2=C3=82=E2=80=A2=C3=A1 wrote: > >> >> >> >> >> Hi Gleb, > >> >> >> >> >> I'm trying to solve these problems in the past days an= d meet many > >> >> >> >> >> difficulties. You want to save all the general registe= rs in calling > >> >> >> >> >> insn_page, so registers should be saved to (save) in i= nsn_page. > >> >> >> >> >> Because all the instructions should be generated outsi= de and copy to > >> >> >> >> >> insn_page, and the instructions generated outside is R= IP-relative, so > >> >> >> >> >> inside insn_page (save) will be wrong pointed with RIP= -relative code. > >> >> >> >> >> > >> >> >> >> > They do not have to be generated outside. You can write= code into > >> >> >> >> > insn_page directly. Something like this outside of any = functions: > >> >> >> >> > > >> >> >> >> > asm(".align 4096\n\t" > >> >> >> >> > ".global insn_page\n\t" > >> >> >> >> > ".global insn_page_end\n\t" > >> >> >> >> > ".global test_insn\n\t" > >> >> >> >> > ".global test_insn_end\n\t" > >> >> >> >> > "insn_page:" > >> >> >> >> > "mov %%rax, outregs \n\t" > >> >> >> >> > ... > >> >> >> >> > "test_insn:\n\t" > >> >> >> >> > "in (%ds), %al\n\t" > >> >> >> >> > ". =3D . + 31\n\t" > >> >> >> >> > "test_insn_end:\n\t" > >> >> >> >> > "mov outregs, %%rax\n\t" > >> >> >> >> > ... > >> >> >> >> > "ret\n\t" > >> >> >> >> > ".align 4096\n\t" > >> >> >> >> > "insn_page_end:\n\t"); > >> >> >> >> > > >> >> >> >> > Now you copy that into alt_insn_page, put instruction y= ou want to test > >> >> >> >> > into test_insn offset and remap alt_insn_page into "ins= n_page" virtual address. > >> >> >> >> I used such codes: > >> >> >> >> > >> >> >> >> invlpg((void *)virt_to_phys(insn_page)); > >> >> >> > virt_to_phys? > >> >> >> This is a mistake, I changed it to "invlpg(insn_page)" but t= he result > >> >> >> is the same. > >> >> >> > > >> >> >> >> asm volatile("call *%0" : : "r"(insn_page)); > >> >> >> >> install_page(cr3, virt_to_phys(alt_insn_page), insn_page)= ; > >> >> >> >> asm volatile("call *%0": : "r"(insn_page+1)); > >> >> >> > +1? > >> >> >> Here I put "ret" on the first byte of insn_page, so the firs= t call of > >> >> >> "insn_page" can just return, and the second call of "insn_pa= ge+1=E2=80=9C will > >> >> >> directly call the second byte, which is the real content of = insn_page. > >> >> > Send the code. > >> >> > > >> >> > -- > >> >> > Gleb. > >> >> > >> >> > >> >> > >> >> -- > >> >> Arthur Chunqi Li > >> >> Department of Computer Science > >> >> School of EECS > >> >> Peking University > >> >> Beijing, China > >> > > >> > -- > >> > Gleb. > >> > >> > >> > >> -- > >> Arthur Chunqi Li > >> Department of Computer Science > >> School of EECS > >> Peking University > >> Beijing, China > > > > -- > > Gleb. >=20 >=20 >=20 > -- > Arthur Chunqi Li > Department of Computer Science > School of EECS > Peking University > Beijing, China -- Gleb.