* [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-04 15:55 Alexander Graf
2010-02-04 15:55 ` [PATCH 01/18] KVM: PPC: Add QPR registers Alexander Graf
From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw)
To: kvm-ppc; +Cc: kvm
In an effort to make KVM on PPC useful for userspace clients other than
Qemu, I figured it would be a nice idea to implement virtualization of the
Gekko CPU.
The Gekko is the CPU used in the GameCube; a slightly more modern
incarnation of it lives on in the Wii today.
Using this patch set and a modified version of Dolphin, I was able to
virtualize simple GameCube demos on a 970MP system.
As always, while getting this to run I stumbled across several broken
parts and fixed them as they came up. So expect some bug fixes in this
patch set too.
Alexander Graf (18):
KVM: PPC: Add QPR registers
KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs
KVM: PPC: Teach MMIO Signedness
KVM: PPC: Add AGAIN type for emulation return
KVM: PPC: Add hidden flag for paired singles
KVM: PPC: Add Gekko SPRs
KVM: PPC: Combine extension interrupt handlers
KVM: PPC: Preload FPU when possible
KVM: PPC: Fix typo in book3s_32 debug code
KVM: PPC: Implement mtsr instruction emulation
KVM: PPC: Make software load/store return eaddr
KVM: PPC: Make ext giveup non-static
KVM: PPC: Add helpers to call FPU instructions
KVM: PPC: Fix error in BAT assignment
KVM: PPC: Add helpers to modify ppc fields
KVM: PPC: Enable program interrupt to do MMIO
KVM: PPC: Reserve a chunk of memory for opcodes
KVM: PPC: Implement Paired Single emulation
arch/powerpc/include/asm/kvm.h | 7 +
arch/powerpc/include/asm/kvm_asm.h | 1 +
arch/powerpc/include/asm/kvm_book3s.h | 8 +-
arch/powerpc/include/asm/kvm_fpu.h | 45 +
arch/powerpc/include/asm/kvm_host.h | 6 +
arch/powerpc/include/asm/kvm_ppc.h | 43 +-
arch/powerpc/include/asm/reg.h | 10 +
arch/powerpc/kernel/ppc_ksyms.c | 2 +
arch/powerpc/kvm/Makefile | 2 +
arch/powerpc/kvm/book3s.c | 132 +++-
arch/powerpc/kvm/book3s_32_mmu.c | 2 +-
arch/powerpc/kvm/book3s_64_emulate.c | 94 ++-
arch/powerpc/kvm/book3s_paired_singles.c | 1356 ++++++++++++++++++++++++++++++
arch/powerpc/kvm/emulate.c | 18 +-
arch/powerpc/kvm/fpu.S | 77 ++
arch/powerpc/kvm/powerpc.c | 56 ++-
16 files changed, 1821 insertions(+), 38 deletions(-)
create mode 100644 arch/powerpc/include/asm/kvm_fpu.h
create mode 100644 arch/powerpc/kvm/book3s_paired_singles.c
create mode 100644 arch/powerpc/kvm/fpu.S
^ permalink raw reply [flat|nested] 53+ messages in thread* [PATCH 01/18] KVM: PPC: Add QPR registers 2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 07/18] KVM: PPC: Combine extension interrupt handlers Alexander Graf ` (8 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc; +Cc: kvm The Gekko has GPRs, SPRs and FPRs like normal PowerPC cores, but it also has QPRs, which are basically single-precision-only FPU registers that get used when in paired single mode. The following patches depend on them being around, so let's add the definitions early. Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/powerpc/include/asm/kvm_host.h | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 715aa6b..2ed954e 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -186,6 +186,11 @@ struct kvm_vcpu_arch { u64 vsr[32]; #endif +#ifdef CONFIG_PPC_BOOK3S + /* For Gekko paired singles */ + u32 qpr[32]; +#endif + ulong pc; ulong ctr; ulong lr; -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
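For orientation, the paired-single programming model splits each register into ps0 (kept in the normal FPR as a double) and ps1 (kept in the new qpr[] array as raw single-precision bits). A minimal userspace stand-in for that layout — the struct and helper names here are illustrative, not the kernel's:

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* Illustrative vcpu state: ps0 lives in the regular FPR (as a
 * double-precision bit pattern), ps1 lives in the corresponding QPR
 * as raw 32-bit single-precision bits, matching the new u32 qpr[32]. */
struct ps_vcpu {
    uint64_t fpr[32]; /* ps0, double-precision bit pattern */
    uint32_t qpr[32]; /* ps1, single-precision bit pattern */
};

/* Store a float's bit pattern into a QPR. */
static void ps1_set(struct ps_vcpu *v, int reg, float val)
{
    memcpy(&v->qpr[reg], &val, sizeof(uint32_t));
}

/* Read a QPR back as a float. */
static float ps1_get(const struct ps_vcpu *v, int reg)
{
    float f;
    memcpy(&f, &v->qpr[reg], sizeof(uint32_t));
    return f;
}
```

This is why a plain u32 array is enough: ps1 never needs more than single precision.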
* [PATCH 07/18] KVM: PPC: Combine extension interrupt handlers 2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf 2010-02-04 15:55 ` [PATCH 01/18] KVM: PPC: Add QPR registers Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 08/18] KVM: PPC: Preload FPU when possible Alexander Graf ` (7 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc; +Cc: kvm When we for example get an Altivec interrupt, but our guest doesn't support altivec, we need to inject a program interrupt, not an altivec interrupt. The same goes for paired singles. When an altivec interrupt arrives, we're pretty sure we need to emulate the instruction because it's a paired single operation. So let's make all the ext handlers aware that they need to jump to the program interrupt handler when an extension interrupt arrives that was not supposed to arrive for the guest CPU. Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/powerpc/kvm/book3s.c | 55 ++++++++++++++++++++++++++++++++++++++++---- 1 files changed, 50 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 96f7be4..6bdf7f2 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -36,6 +36,8 @@ /* #define DEBUG_EXT */ static void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr); +static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr, + ulong msr); struct kvm_stats_debugfs_item debugfs_entries[] = { { "exits", VCPU_STAT(sum_exits) }, @@ -628,6 +630,30 @@ static void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr) kvmppc_recalc_shadow_msr(vcpu); } +static int kvmppc_check_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr) +{ + ulong srr0 = vcpu->arch.pc; + int ret; + + /* Need to do paired single emulation? 
*/ + if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE)) + return EMULATE_DONE; + + /* Read out the instruction */ + ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &vcpu->arch.last_inst, false); + if (ret == -ENOENT) { + vcpu->arch.msr = kvmppc_set_field(vcpu->arch.msr, 33, 33, 1); + vcpu->arch.msr = kvmppc_set_field(vcpu->arch.msr, 34, 36, 0); + vcpu->arch.msr = kvmppc_set_field(vcpu->arch.msr, 42, 47, 0); + kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_INST_STORAGE); + } else if(ret == EMULATE_DONE) { + /* Need to emulate */ + return EMULATE_FAIL; + } + + return EMULATE_AGAIN; +} + /* Handle external providers (FPU, Altivec, VSX) */ static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr, ulong msr) @@ -772,6 +798,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, enum emulation_result er; ulong flags; +program_interrupt: flags = vcpu->arch.shadow_srr1 & 0x1f0000ull; if (vcpu->arch.msr & MSR_PR) { @@ -815,14 +842,32 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, r = RESUME_GUEST; break; case BOOK3S_INTERRUPT_FP_UNAVAIL: - r = kvmppc_handle_ext(vcpu, exit_nr, MSR_FP); - break; case BOOK3S_INTERRUPT_ALTIVEC: - r = kvmppc_handle_ext(vcpu, exit_nr, MSR_VEC); - break; case BOOK3S_INTERRUPT_VSX: - r = kvmppc_handle_ext(vcpu, exit_nr, MSR_VSX); + { + int ext_msr = 0; + + switch (exit_nr) { + case BOOK3S_INTERRUPT_FP_UNAVAIL: ext_msr = MSR_FP; break; + case BOOK3S_INTERRUPT_ALTIVEC: ext_msr = MSR_VEC; break; + case BOOK3S_INTERRUPT_VSX: ext_msr = MSR_VSX; break; + } + + switch (kvmppc_check_ext(vcpu, exit_nr)) { + case EMULATE_DONE: + /* everything ok - let's enable the ext */ + r = kvmppc_handle_ext(vcpu, exit_nr, ext_msr); + break; + case EMULATE_FAIL: + /* we need to emulate this instruction */ + goto program_interrupt; + break; + default: + /* nothing to worry about - go again */ + break; + } break; + } case BOOK3S_INTERRUPT_MACHINE_CHECK: case BOOK3S_INTERRUPT_TRACE: kvmppc_book3s_queue_irqprio(vcpu, 
exit_nr); -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
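The handler above builds the ISI-style flags with kvmppc_set_field(msr, 33, 33, 1) and friends; the helper itself lands in a later patch of this series ("Add helpers to modify ppc fields"). As a sketch of the convention those calls rely on — IBM MSB-0 bit numbering over a 64-bit value — here is one plausible implementation, assuming the real helper behaves the same way:

```c
#include <stdint.h>
#include <assert.h>

/* Sketch of a set_field helper using IBM (MSB-0) bit numbering on a
 * 64-bit value: bit 0 is the most significant bit, bit 63 the least.
 * So set_field(x, 33, 33, 1) sets the bit that is 1ULL << 30 in the
 * usual LSB-0 view. Illustrative only; the series adds the real one. */
static uint64_t set_field(uint64_t val, int msb, int lsb, uint64_t field)
{
    int width = lsb - msb + 1;            /* field width in bits */
    int shift = 63 - lsb;                 /* distance from the LSB end */
    uint64_t mask = (width == 64) ? ~0ULL
                                  : ((1ULL << width) - 1) << shift;
    return (val & ~mask) | ((field << shift) & mask);
}
```

With this convention, clearing bits 42-47 means masking out six bits starting 16 positions above the least-significant bit.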
* [PATCH 08/18] KVM: PPC: Preload FPU when possible 2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf 2010-02-04 15:55 ` [PATCH 01/18] KVM: PPC: Add QPR registers Alexander Graf 2010-02-04 15:55 ` [PATCH 07/18] KVM: PPC: Combine extension interrupt handlers Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 09/18] KVM: PPC: Fix typo in book3s_32 debug code Alexander Graf ` (6 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc; +Cc: kvm There are some situations when we're pretty sure the guest will use the FPU soon. So we can save the churn of going into the guest, finding out it does want to use the FPU and going out again. This patch adds preloading of the FPU when it's reasonable. Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/powerpc/kvm/book3s.c | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 6bdf7f2..07f8b42 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -137,6 +137,10 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 msr) kvmppc_mmu_flush_segments(vcpu); kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc); } + + /* Preload FPU if it's enabled */ + if (vcpu->arch.msr & MSR_FP) + kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP); } void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags) @@ -1194,6 +1198,10 @@ int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) /* XXX we get called with irq disabled - change that! */ local_irq_enable(); + /* Preload FPU if it's enabled */ + if (vcpu->arch.msr & MSR_FP) + kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP); + ret = __kvmppc_vcpu_entry(kvm_run, vcpu); local_irq_disable(); -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 09/18] KVM: PPC: Fix typo in book3s_32 debug code 2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf ` (2 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 08/18] KVM: PPC: Preload FPU when possible Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 10/18] KVM: PPC: Implement mtsr instruction emulation Alexander Graf ` (5 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc; +Cc: kvm There's a typo in the debug ifdef of the book3s_32 mmu emulation. While trying to debug something I stumbled across that and wanted to save anyone after me (or myself later) from having to debug that again. So let's fix the ifdef. Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/powerpc/kvm/book3s_32_mmu.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c index faf99f2..1483a9b 100644 --- a/arch/powerpc/kvm/book3s_32_mmu.c +++ b/arch/powerpc/kvm/book3s_32_mmu.c @@ -37,7 +37,7 @@ #define dprintk(X...) do { } while(0) #endif -#ifdef DEBUG_PTE +#ifdef DEBUG_MMU_PTE #define dprintk_pte(X...) printk(KERN_INFO X) #else #define dprintk_pte(X...) do { } while(0) -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 10/18] KVM: PPC: Implement mtsr instruction emulation 2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf ` (3 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 09/18] KVM: PPC: Fix typo in book3s_32 debug code Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 11/18] KVM: PPC: Make software load/store return eaddr Alexander Graf ` (4 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc; +Cc: kvm The Book3S_32 specification allows for two instructions to modify segment registers: mtsrin and mtsr. Most normal operating systems use mtsrin, because it allows defining which segment to change using a register. But since I was trying to run an embedded guest, it turned out to be using mtsr with hardcoded values. So let's also emulate mtsr. It's a valid instruction after all. Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/powerpc/kvm/book3s_64_emulate.c | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c index bb4a7c1..e4e7ec3 100644 --- a/arch/powerpc/kvm/book3s_64_emulate.c +++ b/arch/powerpc/kvm/book3s_64_emulate.c @@ -28,6 +28,7 @@ #define OP_31_XOP_MFMSR 83 #define OP_31_XOP_MTMSR 146 #define OP_31_XOP_MTMSRD 178 +#define OP_31_XOP_MTSR 210 #define OP_31_XOP_MTSRIN 242 #define OP_31_XOP_TLBIEL 274 #define OP_31_XOP_TLBIE 306 @@ -101,6 +102,11 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, } break; } + case OP_31_XOP_MTSR: + vcpu->arch.mmu.mtsrin(vcpu, + (inst >> 16) & 0xf, + kvmppc_get_gpr(vcpu, get_rs(inst))); + break; case OP_31_XOP_MTSRIN: vcpu->arch.mmu.mtsrin(vcpu, (kvmppc_get_gpr(vcpu, get_rb(inst)) >> 28) & 0xf, -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
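The mtsr encoding carries the segment register number directly in the instruction word, which is why the patch can pull it out with `(inst >> 16) & 0xf` instead of reading a register as mtsrin does. A small standalone sketch of that decode (field helper names are illustrative):

```c
#include <stdint.h>
#include <assert.h>

/* Decode the fields of an mtsr instruction word (primary opcode 31,
 * extended opcode 210). In the usual PowerPC MSB-0 layout RS sits in
 * bits 6-10 and SR in bits 12-15, i.e. shifts 21 and 16 from the LSB
 * end of the 32-bit word. */
static int mtsr_rs(uint32_t inst)  { return (inst >> 21) & 0x1f; }
static int mtsr_sr(uint32_t inst)  { return (inst >> 16) & 0x0f; }
static int mtsr_xop(uint32_t inst) { return (inst >> 1) & 0x3ff; }

/* Build an example instruction word for testing the decode. */
static uint32_t mtsr_encode(int rs, int sr)
{
    return (31u << 26) | ((uint32_t)rs << 21) | ((uint32_t)sr << 16)
         | (210u << 1);
}
```

The emulation can then reuse the existing mtsrin MMU callback, passing the decoded SR number where mtsrin would pass the register-supplied one.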
* [PATCH 11/18] KVM: PPC: Make software load/store return eaddr 2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf ` (4 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 10/18] KVM: PPC: Implement mtsr instruction emulation Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 13/18] KVM: PPC: Add helpers to call FPU instructions Alexander Graf ` (3 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc; +Cc: kvm The Book3S KVM implementation contains some helper functions to load and store data from and to virtual addresses. Unfortunately, this helper used to keep the physical address it so nicely found out for us to itself. So let's change that and make it return the physical address it resolved. Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/powerpc/include/asm/kvm_book3s.h | 4 +- arch/powerpc/kvm/book3s.c | 41 ++++++++++++++++++++------------- arch/powerpc/kvm/book3s_64_emulate.c | 11 +++++---- 3 files changed, 33 insertions(+), 23 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index d28ee83..8463976 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -115,8 +115,8 @@ extern int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte); extern int kvmppc_mmu_map_segment(struct kvm_vcpu *vcpu, ulong eaddr); extern void kvmppc_mmu_flush_segments(struct kvm_vcpu *vcpu); extern struct kvmppc_pte *kvmppc_mmu_find_pte(struct kvm_vcpu *vcpu, u64 ea, bool data); -extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr, bool data); -extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr); +extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, bool data); +extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, bool data); 
extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec); extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat, bool upper, u32 val); diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 07f8b42..e8dccc6 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -439,55 +439,64 @@ err: return kvmppc_bad_hva(); } -int kvmppc_st(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr) +int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, + bool data) { struct kvmppc_pte pte; - hva_t hva = eaddr; + hva_t hva = *eaddr; vcpu->stat.st++; - if (kvmppc_xlate(vcpu, eaddr, false, &pte)) - goto err; + if (kvmppc_xlate(vcpu, *eaddr, data, &pte)) + goto nopte; + + *eaddr = pte.raddr; hva = kvmppc_pte_to_hva(vcpu, &pte, false); if (kvm_is_error_hva(hva)) - goto err; + goto mmio; if (copy_to_user((void __user *)hva, ptr, size)) { printk(KERN_INFO "kvmppc_st at 0x%lx failed\n", hva); - goto err; + goto mmio; } - return 0; + return EMULATE_DONE; -err: +nopte: return -ENOENT; +mmio: + return EMULATE_DO_MMIO; } -int kvmppc_ld(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr, +int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, bool data) { struct kvmppc_pte pte; - hva_t hva = eaddr; + hva_t hva = *eaddr; vcpu->stat.ld++; - if (kvmppc_xlate(vcpu, eaddr, data, &pte)) - goto err; + if (kvmppc_xlate(vcpu, *eaddr, data, &pte)) + goto nopte; + + *eaddr = pte.raddr; hva = kvmppc_pte_to_hva(vcpu, &pte, true); if (kvm_is_error_hva(hva)) - goto err; + goto mmio; if (copy_from_user(ptr, (void __user *)hva, size)) { printk(KERN_INFO "kvmppc_ld at 0x%lx failed\n", hva); - goto err; + goto mmio; } - return 0; + return EMULATE_DONE; -err: +nopte: return -ENOENT; +mmio: + return EMULATE_DO_MMIO; } static int kvmppc_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn) diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c index e4e7ec3..a93aa47 100644 
--- a/arch/powerpc/kvm/book3s_64_emulate.c +++ b/arch/powerpc/kvm/book3s_64_emulate.c @@ -169,7 +169,7 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, { ulong rb = kvmppc_get_gpr(vcpu, get_rb(inst)); ulong ra = 0; - ulong addr; + ulong addr, vaddr; u32 zeros[8] = { 0, 0, 0, 0, 0, 0, 0, 0 }; if (get_ra(inst)) @@ -178,15 +178,16 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, addr = (ra + rb) & ~31ULL; if (!(vcpu->arch.msr & MSR_SF)) addr &= 0xffffffff; + vaddr = addr; - if (kvmppc_st(vcpu, addr, 32, zeros)) { - vcpu->arch.dear = addr; - vcpu->arch.fault_dear = addr; + if (kvmppc_st(vcpu, &addr, 32, zeros, true)) { + vcpu->arch.dear = vaddr; + vcpu->arch.fault_dear = vaddr; to_book3s(vcpu)->dsisr = DSISR_PROTFAULT | DSISR_ISSTORE; kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_DATA_STORAGE); - kvmppc_mmu_pte_flush(vcpu, addr, ~0xFFFULL); + kvmppc_mmu_pte_flush(vcpu, vaddr, ~0xFFFULL); } break; -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
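With this change a caller of kvmppc_ld/kvmppc_st has to branch on three outcomes: -ENOENT (no PTE, so a storage interrupt gets injected), EMULATE_DO_MMIO (the address resolved but is not RAM-backed), and EMULATE_DONE — and on success *eaddr now comes back rewritten to the resolved address. A toy model of that contract, with the translation stubbed out and the constants chosen for illustration only:

```c
#include <stdint.h>
#include <assert.h>

/* Illustrative return codes mirroring the patch's convention. */
enum { TOY_EMULATE_DONE = 0, TOY_EMULATE_DO_MMIO = 1, TOY_ENOENT = -2 };

/* Stub translation: address 0 has no mapping, odd translated addresses
 * are MMIO, even ones are RAM-backed. On anything but "no mapping",
 * *eaddr is rewritten to the resolved address, as the patched
 * kvmppc_ld/kvmppc_st now do. */
static int toy_ld(uint64_t *eaddr)
{
    if (*eaddr == 0)
        return TOY_ENOENT;          /* no PTE: caller injects a fault */
    *eaddr += 0x1000;               /* pretend translation offset */
    if (*eaddr & 1)
        return TOY_EMULATE_DO_MMIO; /* resolved, but needs MMIO */
    return TOY_EMULATE_DONE;
}
```

This is exactly the shape the dcbz emulation above exploits: it keeps the original virtual address around for the fault path while the helper hands back the resolved one.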
* [PATCH 13/18] KVM: PPC: Add helpers to call FPU instructions 2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf ` (5 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 11/18] KVM: PPC: Make software load/store return eaddr Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 17/18] KVM: PPC: Reserve a chunk of memory for opcodes Alexander Graf ` (2 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc; +Cc: kvm To emulate paired single instructions, we need to be able to call FPU operations from within the kernel. Since we don't want gcc to spill arbitrary FPU code everywhere, we tell it to use a soft fpu. Since we know we can really call the FPU in safe areas, let's also add some calls that we can later use to actually execute real world FPU operations on the host's FPU. Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/powerpc/include/asm/kvm_fpu.h | 45 +++++++++++++++++++++ arch/powerpc/kernel/ppc_ksyms.c | 2 + arch/powerpc/kvm/Makefile | 1 + arch/powerpc/kvm/fpu.S | 77 ++++++++++++++++++++++++++++++++++++ 4 files changed, 125 insertions(+), 0 deletions(-) create mode 100644 arch/powerpc/include/asm/kvm_fpu.h create mode 100644 arch/powerpc/kvm/fpu.S diff --git a/arch/powerpc/include/asm/kvm_fpu.h b/arch/powerpc/include/asm/kvm_fpu.h new file mode 100644 index 0000000..2e42eb7 --- /dev/null +++ b/arch/powerpc/include/asm/kvm_fpu.h @@ -0,0 +1,45 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + * + * Copyright Novell Inc. 2010 + * + * Authors: Alexander Graf <agraf@suse.de> + */ + +#ifndef __ASM_KVM_FPU_H__ +#define __ASM_KVM_FPU_H__ + +#include <linux/types.h> + +extern void fp_fres(struct thread_struct *t, u32 *dst, u32 *src1); +extern void fp_frsqrte(struct thread_struct *t, u32 *dst, u32 *src1); +extern void fp_fsqrts(struct thread_struct *t, u32 *dst, u32 *src1); + +extern void fp_fadds(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2); +extern void fp_fdivs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2); +extern void fp_fmuls(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2); +extern void fp_fsubs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2); + +extern void fp_fmadds(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2, + u32 *src3); +extern void fp_fmsubs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2, + u32 *src3); +extern void fp_fnmadds(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2, + u32 *src3); +extern void fp_fnmsubs(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2, + u32 *src3); +extern void fp_fsel(struct thread_struct *t, u32 *dst, u32 *src1, u32 *src2, + u32 *src3); + +#endif diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c index ab3e392..58fdb3a 100644 --- a/arch/powerpc/kernel/ppc_ksyms.c +++ b/arch/powerpc/kernel/ppc_ksyms.c @@ -101,6 +101,8 @@ EXPORT_SYMBOL(pci_dram_offset); EXPORT_SYMBOL(start_thread); EXPORT_SYMBOL(kernel_thread); +EXPORT_SYMBOL_GPL(cvt_df); +EXPORT_SYMBOL_GPL(cvt_fd); EXPORT_SYMBOL(giveup_fpu); #ifdef CONFIG_ALTIVEC EXPORT_SYMBOL(giveup_altivec); diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile index 56484d6..e575cfd 100644 --- a/arch/powerpc/kvm/Makefile +++ b/arch/powerpc/kvm/Makefile @@ 
-40,6 +40,7 @@ kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs) kvm-book3s_64-objs := \ $(common-objs-y) \ + fpu.o \ book3s.o \ book3s_64_emulate.o \ book3s_64_interrupts.o \ diff --git a/arch/powerpc/kvm/fpu.S b/arch/powerpc/kvm/fpu.S new file mode 100644 index 0000000..50575ac --- /dev/null +++ b/arch/powerpc/kvm/fpu.S @@ -0,0 +1,77 @@ +/* + * FPU helper code to use FPU operations from inside the kernel + * + * Copyright (C) 2010 Alexander Graf (agraf@suse.de) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + */ + +#include <asm/reg.h> +#include <asm/page.h> +#include <asm/mmu.h> +#include <asm/pgtable.h> +#include <asm/cputable.h> +#include <asm/cache.h> +#include <asm/thread_info.h> +#include <asm/ppc_asm.h> +#include <asm/asm-offsets.h> + +#define FPS_ONE_IN(name) \ +_GLOBAL(fp_ ## name); \ + lfd 0,THREAD_FPSCR(r3); /* load up fpscr value */ \ + MTFSF_L(0); \ + lfs 0,0(r5); \ + \ + name 0,0; \ + \ + stfs 0,0(r4); \ + mffs 0; \ + stfd 0,THREAD_FPSCR(r3); /* save new fpscr value */ \ + blr + +#define FPS_TWO_IN(name) \ +_GLOBAL(fp_ ## name); \ + lfd 0,THREAD_FPSCR(r3); /* load up fpscr value */ \ + MTFSF_L(0); \ + lfs 0,0(r5); \ + lfs 1,0(r6); \ + \ + name 0,0,1; \ + \ + stfs 0,0(r4); \ + mffs 0; \ + stfd 0,THREAD_FPSCR(r3); /* save new fpscr value */ \ + blr + +#define FPS_THREE_IN(name) \ +_GLOBAL(fp_ ## name); \ + lfd 0,THREAD_FPSCR(r3); /* load up fpscr value */ \ + MTFSF_L(0); \ + lfs 0,0(r5); \ + lfs 1,0(r6); \ + lfs 2,0(r7); \ + \ + name 0,0,1,2; \ + \ + stfs 0,0(r4); \ + mffs 0; \ + stfd 0,THREAD_FPSCR(r3); /* save new fpscr value */ \ + blr + +FPS_ONE_IN(fres) +FPS_ONE_IN(frsqrte) +FPS_ONE_IN(fsqrts) +FPS_TWO_IN(fadds) +FPS_TWO_IN(fdivs) +FPS_TWO_IN(fmuls) +FPS_TWO_IN(fsubs) +FPS_THREE_IN(fmadds) +FPS_THREE_IN(fmsubs) +FPS_THREE_IN(fnmadds) 
+FPS_THREE_IN(fnmsubs) +FPS_THREE_IN(fsel) + -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
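Each FPS_* macro above expands to a small assembly routine with a uniform C-callable signature: the thread struct (for the FPSCR image) first, then pointers to the raw single-precision destination and source words. A plain-C stand-in for the two-operand shape, to illustrate the calling convention only — the real routines run the operation on the host FPU and round-trip the FPSCR:

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* Stand-in for the FPSCR slot of struct thread_struct. */
struct toy_thread { uint64_t fpscr; };

/* C model of the FPS_TWO_IN(fadds) convention: operands are pointers
 * to raw 32-bit single-precision bit patterns, and the routine would
 * load/store the FPSCR image in the thread struct around the op. */
static void toy_fadds(struct toy_thread *t, uint32_t *dst,
                      uint32_t *src1, uint32_t *src2)
{
    float a, b, r;
    (void)t;            /* real version: lfd/MTFSF before, mffs/stfd after */
    memcpy(&a, src1, 4);
    memcpy(&b, src2, 4);
    r = a + b;
    memcpy(dst, &r, 4);
}
```

Passing operands by pointer rather than by value is what lets the callers feed FPR/QPR storage to the helpers directly.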
* [PATCH 17/18] KVM: PPC: Reserve a chunk of memory for opcodes 2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf ` (6 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 13/18] KVM: PPC: Add helpers to call FPU instructions Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 18/18] KVM: PPC: Implement Paired Single emulation Alexander Graf [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc; +Cc: kvm With paired singles we have a nifty instruction execution engine. That engine takes safe and properly cleared FPU opcodes and executes them directly on the hardware. Since we can't run off the stack and modifying .bss isn't future-proof either, the best method seemed to be to vmalloc an executable chunk of memory. This chunk will be used by the following patch. Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/powerpc/include/asm/kvm_book3s.h | 1 + arch/powerpc/include/asm/kvm_ppc.h | 4 ++++ arch/powerpc/kvm/book3s.c | 14 +++++++++++++- 3 files changed, 18 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index fd43210..f74d1db 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -144,5 +144,6 @@ static inline ulong dsisr(void) extern void kvm_return_point(void); #define INS_DCBZ 0x7c0007ec +#define INS_BLR 0x4e800020 #endif /* __ASM_KVM_BOOK3S_H__ */ diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index c7fcdd7..5c85504 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -103,6 +103,10 @@ extern void kvmppc_booke_exit(void); extern void kvmppc_core_destroy_mmu(struct kvm_vcpu *vcpu); +/* 16*NR_CPUS bytes filled with "blr" instructions. 
We use this to enable + code to execute arbitrary (checked!) opcodes. */ +extern u32 *kvmppc_call_stack; + /* * Cuts out inst bits with ordering according to spec. * That means the leftmost bit is zero. All given bits are included. diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index f842d1d..272cb37 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -35,6 +35,8 @@ /* #define EXIT_DEBUG_SIMPLE */ /* #define DEBUG_EXT */ +u32 *kvmppc_call_stack; + static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr, ulong msr); @@ -1249,7 +1251,17 @@ int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) static int kvmppc_book3s_init(void) { - return kvm_init(NULL, sizeof(struct kvmppc_vcpu_book3s), THIS_MODULE); + int r, i; + + r = kvm_init(NULL, sizeof(struct kvmppc_vcpu_book3s), THIS_MODULE); + + /* Prepare call blob we can use to execute single instructions */ + kvmppc_call_stack = __vmalloc(NR_CPUS * 2 * sizeof(u32), + GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL_EXEC); + for (i = 0; i < (NR_CPUS * 2); i++) + kvmppc_call_stack[i] = INS_BLR; + + return r; } static void kvmppc_book3s_exit(void) -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
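The blob's layout is two u32 slots per CPU: slot 0 receives the opcode to execute, slot 1 stays a blr so control returns to the caller. A userspace sketch of the indexing and fill (INS_BLR taken from the patch, NR_CPUS chosen for illustration):

```c
#include <stdint.h>
#include <assert.h>

#define INS_BLR 0x4e800020u  /* "blr" opcode, from the patch */
#define NR_CPUS 4            /* illustrative */

/* Initialize the call blob: every slot starts out as blr. */
static void fill_call_stack(uint32_t *stack)
{
    for (int i = 0; i < NR_CPUS * 2; i++)
        stack[i] = INS_BLR;
}

/* Plant one instruction into this CPU's slot; slot+1 remains blr, so
 * branching to the returned address executes inst and comes back. */
static uint32_t *plant_inst(uint32_t *stack, int cpu, uint32_t inst)
{
    uint32_t *slot = &stack[cpu * 2];
    slot[0] = inst;
    return slot;
}
```

In the kernel the buffer additionally needs PAGE_KERNEL_EXEC protection and an icache flush after each write, which the following patch's call_fpu_inst takes care of.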
* [PATCH 18/18] KVM: PPC: Implement Paired Single emulation 2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf ` (7 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 17/18] KVM: PPC: Reserve a chunk of memory for opcodes Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf [not found] ` <1265298925-31954-19-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> 9 siblings, 1 reply; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc; +Cc: kvm The one big thing about the Gekko is paired singles. Paired singles are an extension to the instruction set that adds 32 single precision floating point registers (qprs), some SPRs to modify the behavior of paired single operations, and instructions to deal with qprs to the instruction set. Unfortunately, it also changes the semantics of existing operations that affect single values in FPRs. In most cases they get mirrored to the corresponding QPR. Because of that we need to emulate all FPU operations and all the new paired single operations too. In order to achieve that, we take the guest's instruction, rip out the parameters, put in our own and execute the very same instruction, but also fix up the QPR values along the way. That way we can execute paired single FPU operations without implementing a soft fpu. 
Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/powerpc/include/asm/kvm_book3s.h | 1 + arch/powerpc/kvm/Makefile | 1 + arch/powerpc/kvm/book3s_64_emulate.c | 3 + arch/powerpc/kvm/book3s_paired_singles.c | 1356 ++++++++++++++++++++++++++++++ 4 files changed, 1361 insertions(+), 0 deletions(-) create mode 100644 arch/powerpc/kvm/book3s_paired_singles.c diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index f74d1db..e32a749 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -121,6 +121,7 @@ extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec) extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat, bool upper, u32 val); extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr); +extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu); extern u32 kvmppc_trampoline_lowmem; extern u32 kvmppc_trampoline_enter; diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile index e575cfd..eba721e 100644 --- a/arch/powerpc/kvm/Makefile +++ b/arch/powerpc/kvm/Makefile @@ -41,6 +41,7 @@ kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs) kvm-book3s_64-objs := \ $(common-objs-y) \ fpu.o \ + book3s_paired_singles.o \ book3s.o \ book3s_64_emulate.o \ book3s_64_interrupts.o \ diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c index 1d1b952..c989214 100644 --- a/arch/powerpc/kvm/book3s_64_emulate.c +++ b/arch/powerpc/kvm/book3s_64_emulate.c @@ -200,6 +200,9 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, emulated = EMULATE_FAIL; } + if (emulated == EMULATE_FAIL) + emulated = kvmppc_emulate_paired_single(run, vcpu); + return emulated; } diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c new file mode 100644 index 0000000..cb258a3 --- /dev/null +++ b/arch/powerpc/kvm/book3s_paired_singles.c @@ -0,0 
+1,1356 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + * + * Copyright Novell Inc 2010 + * + * Authors: Alexander Graf <agraf@suse.de> + */ + +#include <asm/kvm.h> +#include <asm/kvm_ppc.h> +#include <asm/disassemble.h> +#include <asm/kvm_book3s.h> +#include <asm/kvm_fpu.h> +#include <asm/reg.h> +#include <asm/cacheflush.h> +#include <linux/vmalloc.h> + +/* #define DEBUG */ + +#ifdef DEBUG +#define dprintk printk +#else +#define dprintk(...) 
do { } while(0); +#endif + +#define OP_LFS 48 +#define OP_LFSU 49 +#define OP_LFD 50 +#define OP_LFDU 51 +#define OP_STFS 52 +#define OP_STFSU 53 +#define OP_STFD 54 +#define OP_STFDU 55 +#define OP_PSQ_L 56 +#define OP_PSQ_LU 57 +#define OP_PSQ_ST 60 +#define OP_PSQ_STU 61 + +#define OP_31_LFSX 535 +#define OP_31_LFSUX 567 +#define OP_31_LFDX 599 +#define OP_31_LFDUX 631 +#define OP_31_STFSX 663 +#define OP_31_STFSUX 695 +#define OP_31_STFX 727 +#define OP_31_STFUX 759 +#define OP_31_LWIZX 887 +#define OP_31_STFIWX 983 + +#define OP_59_FADDS 21 +#define OP_59_FSUBS 20 +#define OP_59_FSQRTS 22 +#define OP_59_FDIVS 18 +#define OP_59_FRES 24 +#define OP_59_FMULS 25 +#define OP_59_FRSQRTES 26 +#define OP_59_FMSUBS 28 +#define OP_59_FMADDS 29 +#define OP_59_FNMSUBS 30 +#define OP_59_FNMADDS 31 +#define OP_59_FCFIDS 846 +#define OP_59_FCFIDUS 974 + +#define OP_63_FCMPU 0 +#define OP_63_FCPSGN 8 +#define OP_63_FRSP 12 +#define OP_63_FCTIW 14 +#define OP_63_FCTIWZ 15 +#define OP_63_FDIV 18 +#define OP_63_FADD 21 +#define OP_63_FSQRT 22 +#define OP_63_FSEL 23 +#define OP_63_FRE 24 +#define OP_63_FMUL 25 +#define OP_63_FRSQRTE 26 +#define OP_63_FMSUB 28 +#define OP_63_FMADD 29 +#define OP_63_FNMSUB 30 +#define OP_63_FNMADD 31 +#define OP_63_FCMPO 32 +#define OP_63_MTFSB1 38 +#define OP_63_FSUB 20 +#define OP_63_FNEG 40 +#define OP_63_MCRFS 64 +#define OP_63_MTFSB0 70 +#define OP_63_FMR 72 +#define OP_63_MTFSFI 134 +#define OP_63_FCTIWU 142 +#define OP_63_FCTIWUZ 143 +#define OP_63_FNABS 136 +#define OP_63_FTDIV 128 +#define OP_63_FTSQRT 160 +#define OP_63_FABS 264 +#define OP_63_FRIN 392 +#define OP_63_FRIZ 424 +#define OP_63_FRIP 456 +#define OP_63_FRIM 488 +#define OP_63_MFFS 583 +#define OP_63_MTFSF 711 +#define OP_63_FCTID 814 +#define OP_63_FCTIDZ 815 +#define OP_63_FCFID 846 +#define OP_63_FCTIDU 942 +#define OP_63_FCTIDUZ 943 +#define OP_63_FCFIDU 974 + +#define OP_4X_PS_CMPU0 0 +#define OP_4X_PSQ_LX 6 +#define OP_4XW_PSQ_STX 7 +#define OP_4A_PS_SUM0 10 +#define 
OP_4A_PS_SUM1 11 +#define OP_4A_PS_MULS0 12 +#define OP_4A_PS_MULS1 13 +#define OP_4A_PS_MADDS0 14 +#define OP_4A_PS_MADDS1 15 +#define OP_4A_PS_DIV 18 +#define OP_4A_PS_SUB 20 +#define OP_4A_PS_ADD 21 +#define OP_4A_PS_SEL 23 +#define OP_4A_PS_RES 24 +#define OP_4A_PS_MUL 25 +#define OP_4A_PS_RSQRTE 26 +#define OP_4A_PS_MSUB 28 +#define OP_4A_PS_MADD 29 +#define OP_4A_PS_NMSUB 30 +#define OP_4A_PS_NMADD 31 +#define OP_4X_PS_CMPO0 32 +#define OP_4X_PSQ_LUX 38 +#define OP_4XW_PSQ_STUX 39 +#define OP_4X_PS_NEG 40 +#define OP_4X_PS_CMPU1 64 +#define OP_4X_PS_MR 72 +#define OP_4X_PS_CMPO1 96 +#define OP_4X_PS_NABS 136 +#define OP_4X_PS_ABS 264 +#define OP_4X_PS_MERGE00 528 +#define OP_4X_PS_MERGE01 560 +#define OP_4X_PS_MERGE10 592 +#define OP_4X_PS_MERGE11 624 + +#define SCALAR_NONE 0 +#define SCALAR_HIGH (1 << 0) +#define SCALAR_LOW (1 << 1) +#define SCALAR_NO_PS0 (1 << 2) +#define SCALAR_NO_PS1 (1 << 3) + +#define GQR_ST_TYPE_MASK 0x00000007 +#define GQR_ST_TYPE_SHIFT 0 +#define GQR_ST_SCALE_MASK 0x00003f00 +#define GQR_ST_SCALE_SHIFT 8 +#define GQR_LD_TYPE_MASK 0x00070000 +#define GQR_LD_TYPE_SHIFT 16 +#define GQR_LD_SCALE_MASK 0x3f000000 +#define GQR_LD_SCALE_SHIFT 24 + +#define GQR_QUANTIZE_FLOAT 0 +#define GQR_QUANTIZE_U8 4 +#define GQR_QUANTIZE_U16 5 +#define GQR_QUANTIZE_S8 6 +#define GQR_QUANTIZE_S16 7 + +#define FPU_LS_SINGLE 0 +#define FPU_LS_DOUBLE 1 +#define FPU_LS_SINGLE_LOW 2 + +static void call_fpu_inst(u32 inst, u64 *out, u64 *in1, u64 *in2, u64 *in3, + u32 *cr, u32 *fpscr) +{ + u32 cr_val = 0; + u32 *call_stack; + u64 inout[5] = { 0, 0, 0, 0, 0 }; + + if (fpscr) + inout[0] = *fpscr; + if (in1) + inout[1] = *in1; + if (in2) + inout[2] = *in2; + if (in3) + inout[3] = *in3; + if (cr) + cr_val = *cr; + + dprintk(KERN_INFO "FPU Emulator 0x%x ( 0x%llx, 0x%llx, 0x%llx )", inst, + inout[1], inout[2], inout[3]); + + call_stack = &kvmppc_call_stack[(smp_processor_id() * 2)]; + call_stack[0] = inst; + /* call_stack[1] is INS_BLR */ + + 
flush_icache_range((ulong)call_stack, (ulong)&call_stack[1]); + + dprintk(KERN_INFO "FPU call stack -> 0x%p (%x %x)\n", + call_stack, call_stack[0], call_stack[1]); + + __asm__ volatile ( + "lfd 0, (0*8)(%[inout]) ;" + "mtfsf 255, 0, 1, 0 ;" + "lfd 1, (1*8)(%[inout]) ;" + "lfd 2, (2*8)(%[inout]) ;" + "lfd 3, (3*8)(%[inout]) ;" + "mtctr %[call_addr] ;" + "mtcr %[cr_in] ;" + "bctrl ;" + "mffs 1 ;" + "stfd 1, (0*8)(%[inout]) ;" + "stfd 4, (4*8)(%[inout]) ;" + "mfcr %[cr_out] ;" + : [cr_out]"=r"(cr_val), + "=m"(inout[0]), + "=m"(inout[4]) + : [cr_in]"r"(cr_val), + [inout]"b"(inout), + "m"(inout[0]), + "m"(inout[1]), + "m"(inout[2]), + "m"(inout[3]), + [call_addr]"r"(call_stack) + : "cc", "lr", "ctr"); + + dprintk(KERN_INFO "FPU Emulator result = %llx\n", inout[4]); + + if (fpscr) + *fpscr = inout[0]; + if (out) + *out = inout[4]; + if (cr) + *cr = cr_val; +} + +static void kvmppc_inject_pf(struct kvm_vcpu *vcpu, ulong eaddr, bool is_store) +{ + u64 dsisr; + + vcpu->arch.msr = kvmppc_set_field(vcpu->arch.msr, 33, 36, 0); + vcpu->arch.msr = kvmppc_set_field(vcpu->arch.msr, 42, 47, 0); + vcpu->arch.dear = eaddr; + /* Page Fault */ + dsisr = kvmppc_set_field(0, 33, 33, 1); + if (is_store) + to_book3s(vcpu)->dsisr = kvmppc_set_field(dsisr, 38, 38, 1); + kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_DATA_STORAGE); +} + +static int kvmppc_emulate_fpr_load(struct kvm_run *run, struct kvm_vcpu *vcpu, + int rs, ulong addr, int ls_type) +{ + int emulated = EMULATE_FAIL; + struct thread_struct t; + int r; + char tmp[8]; + int len = sizeof(u32); + + if (ls_type == FPU_LS_DOUBLE) + len = sizeof(u64); + + t.fpscr.val = vcpu->arch.fpscr; + + /* read from memory */ + r = kvmppc_ld(vcpu, &addr, len, tmp, true); + vcpu->arch.paddr_accessed = addr; + + if (r < 0) { + kvmppc_inject_pf(vcpu, addr, false); + goto done_load; + } else if (r == EMULATE_DO_MMIO) { + emulated = kvmppc_handle_load(run, vcpu, REG_FPR | rs, len, 1); + goto done_load; + } + + emulated = EMULATE_DONE; + + /* put 
in registers */ + switch (ls_type) { + case FPU_LS_SINGLE: + cvt_fd((float*)tmp, (double*)&vcpu->arch.fpr[rs], &t); + vcpu->arch.qpr[rs] = *((u32*)tmp); + break; + case FPU_LS_DOUBLE: + vcpu->arch.fpr[rs] = *((u64*)tmp); + break; + } + + dprintk(KERN_INFO "KVM: FPR_LD [0x%llx] at 0x%lx (%d)\n", *(u64*)tmp, + addr, len); + +done_load: + return emulated; +} + +static int kvmppc_emulate_fpr_store(struct kvm_run *run, struct kvm_vcpu *vcpu, + int rs, ulong addr, int ls_type) +{ + int emulated = EMULATE_FAIL; + struct thread_struct t; + int r; + char tmp[8]; + u64 val; + int len; + + t.fpscr.val = vcpu->arch.fpscr; + + switch (ls_type) { + case FPU_LS_SINGLE: + cvt_df((double*)&vcpu->arch.fpr[rs], (float*)tmp, &t); + val = *((u32*)tmp); + len = sizeof(u32); + break; + case FPU_LS_SINGLE_LOW: + *((u32*)tmp) = vcpu->arch.fpr[rs]; + val = vcpu->arch.fpr[rs] & 0xffffffff; + len = sizeof(u32); + break; + case FPU_LS_DOUBLE: + *((u64*)tmp) = vcpu->arch.fpr[rs]; + val = vcpu->arch.fpr[rs]; + len = sizeof(u64); + break; + default: + val = 0; + len = 0; + } + + r = kvmppc_st(vcpu, &addr, len, tmp, true); + vcpu->arch.paddr_accessed = addr; + if (r < 0) { + kvmppc_inject_pf(vcpu, addr, true); + } else if (r == EMULATE_DO_MMIO) { + emulated = kvmppc_handle_store(run, vcpu, val, len, 1); + } else { + emulated = EMULATE_DONE; + } + + dprintk(KERN_INFO "KVM: FPR_ST [0x%llx] at 0x%lx (%d)\n", + val, addr, len); + + return emulated; +} + +static int kvmppc_emulate_psq_load(struct kvm_run *run, struct kvm_vcpu *vcpu, + int rs, ulong addr, bool w, int i) +{ + int emulated = EMULATE_FAIL; + struct thread_struct t; + int r; + float one = 1.0; + u32 tmp[2]; + + t.fpscr.val = vcpu->arch.fpscr; + + /* read from memory */ + if (w) { + r = kvmppc_ld(vcpu, &addr, sizeof(u32), tmp, true); + memcpy(&tmp[1], &one, sizeof(u32)); + } else { + r = kvmppc_ld(vcpu, &addr, sizeof(u32) * 2, tmp, true); + } + vcpu->arch.paddr_accessed = addr; + if (r < 0) { + kvmppc_inject_pf(vcpu, addr, false); + goto 
done_load; + } else if ((r == EMULATE_DO_MMIO) && w) { + emulated = kvmppc_handle_load(run, vcpu, REG_FPR | rs, 4, 1); + vcpu->arch.qpr[rs] = tmp[1]; + goto done_load; + } else if (r == EMULATE_DO_MMIO) { + emulated = kvmppc_handle_load(run, vcpu, REG_FQPR | rs, 8, 1); + goto done_load; + } + + emulated = EMULATE_DONE; + + /* put in registers */ + cvt_fd((float*)&tmp[0], (double*)&vcpu->arch.fpr[rs], &t); + vcpu->arch.qpr[rs] = tmp[1]; + + dprintk(KERN_INFO "KVM: PSQ_LD [0x%x, 0x%x] at 0x%lx (%d)\n", tmp[0], + tmp[1], addr, w ? 4 : 8); + +done_load: + return emulated; +} + +static int kvmppc_emulate_psq_store(struct kvm_run *run, struct kvm_vcpu *vcpu, + int rs, ulong addr, bool w, int i) +{ + int emulated = EMULATE_FAIL; + struct thread_struct t; + int r; + u32 tmp[2]; + int len = w ? sizeof(u32) : sizeof(u64); + + t.fpscr.val = vcpu->arch.fpscr; + + cvt_df((double*)&vcpu->arch.fpr[rs], (float*)&tmp[0], &t); + tmp[1] = vcpu->arch.qpr[rs]; + + r = kvmppc_st(vcpu, &addr, len, tmp, true); + vcpu->arch.paddr_accessed = addr; + if (r < 0) { + kvmppc_inject_pf(vcpu, addr, true); + } else if ((r == EMULATE_DO_MMIO) && w) { + emulated = kvmppc_handle_store(run, vcpu, tmp[0], 4, 1); + } else if (r == EMULATE_DO_MMIO) { + u64 val = ((u64)tmp[0] << 32) | tmp[1]; + emulated = kvmppc_handle_store(run, vcpu, val, 8, 1); + } else { + emulated = EMULATE_DONE; + } + + dprintk(KERN_INFO "KVM: PSQ_ST [0x%x, 0x%x] at 0x%lx (%d)\n", + tmp[0], tmp[1], addr, len); + + return emulated; +} + +/* + * Cuts out inst bits with ordering according to spec. + * That means the leftmost bit is zero. All given bits are included. + */ +static inline u32 inst_get_field(u32 inst, int msb, int lsb) +{ + return kvmppc_get_field(inst, msb + 32, lsb + 32); +} + +/* + * Replaces inst bits with ordering according to spec. 
+ */ +static inline u32 inst_set_field(u32 inst, int msb, int lsb, int value) +{ + return kvmppc_set_field(inst, msb + 32, lsb + 32, value); +} + +#define FPU_HAS_FRT (1 << 0) +#define FPU_HAS_FRA (1 << 1) +#define FPU_HAS_FRB (1 << 2) +#define FPU_HAS_FRC (1 << 3) +#define FPU_SYNC_QPR (1 << 4) + +static void kvmppc_do_fpu_inst(struct kvm_vcpu *vcpu, u32 inst, int flags) +{ + int frt = inst_get_field(inst, 6, 10); + + u64 *fpr_a = NULL; + u64 *fpr_b = NULL; + u64 *fpr_c = NULL; + u64 *fpr_t = NULL; + + if (flags & FPU_HAS_FRA) { + int fra = inst_get_field(inst, 11, 15); + fpr_a = &vcpu->arch.fpr[fra]; + inst = inst_set_field(inst, 11, 15, 1); + } + if (flags & FPU_HAS_FRB) { + int frb = inst_get_field(inst, 16, 20); + fpr_b = &vcpu->arch.fpr[frb]; + inst = inst_set_field(inst, 16, 20, 2); + } + if (flags & FPU_HAS_FRC) { + int frc = inst_get_field(inst, 21, 25); + fpr_c = &vcpu->arch.fpr[frc]; + inst = inst_set_field(inst, 21, 25, 3); + } + if (flags & FPU_HAS_FRT) { + fpr_t = &vcpu->arch.fpr[frt]; + inst = inst_set_field(inst, 6, 10, 4); + } + + call_fpu_inst(inst, fpr_t, fpr_a, fpr_b, fpr_c, + &get_paca()->shadow_vcpu.cr, + &vcpu->arch.fpscr); + + barrier(); + + if (flags & FPU_SYNC_QPR) { + struct thread_struct t; + + t.fpscr.val = vcpu->arch.fpscr; + cvt_df((double*)fpr_t, (float*)&vcpu->arch.qpr[frt], &t); + } +} + +bool kvmppc_inst_is_paired_single(struct kvm_vcpu *vcpu, u32 inst) +{ + if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE)) + return false; + + switch (get_op(inst)) { + case OP_PSQ_L: + case OP_PSQ_LU: + case OP_PSQ_ST: + case OP_PSQ_STU: + case OP_LFS: + case OP_LFSU: + case OP_LFD: + case OP_LFDU: + case OP_STFS: + case OP_STFSU: + case OP_STFD: + case OP_STFDU: + return true; + case 4: + /* X form */ + switch (inst_get_field(inst, 21, 30)) { + case OP_4X_PS_CMPU0: + case OP_4X_PSQ_LX: + case OP_4X_PS_CMPO0: + case OP_4X_PSQ_LUX: + case OP_4X_PS_NEG: + case OP_4X_PS_CMPU1: + case OP_4X_PS_MR: + case OP_4X_PS_CMPO1: + case OP_4X_PS_NABS: + 
case OP_4X_PS_ABS: + case OP_4X_PS_MERGE00: + case OP_4X_PS_MERGE01: + case OP_4X_PS_MERGE10: + case OP_4X_PS_MERGE11: + return true; + } + /* XW form */ + switch (inst_get_field(inst, 25, 30)) { + case OP_4XW_PSQ_STX: + case OP_4XW_PSQ_STUX: + return true; + } + /* A form */ + switch (inst_get_field(inst, 26, 30)) { + case OP_4A_PS_SUM1: + case OP_4A_PS_SUM0: + case OP_4A_PS_MULS0: + case OP_4A_PS_MULS1: + case OP_4A_PS_MADDS0: + case OP_4A_PS_MADDS1: + case OP_4A_PS_DIV: + case OP_4A_PS_SUB: + case OP_4A_PS_ADD: + case OP_4A_PS_SEL: + case OP_4A_PS_RES: + case OP_4A_PS_MUL: + case OP_4A_PS_RSQRTE: + case OP_4A_PS_MSUB: + case OP_4A_PS_MADD: + case OP_4A_PS_NMSUB: + case OP_4A_PS_NMADD: + return true; + } + break; + case 59: + switch (inst_get_field(inst, 21, 30)) { + case OP_59_FADDS: + case OP_59_FSUBS: + case OP_59_FDIVS: + case OP_59_FRES: + case OP_59_FRSQRTES: + return true; + } + switch (inst_get_field(inst, 26, 30)) { + case OP_59_FMULS: + case OP_59_FMSUBS: + case OP_59_FMADDS: + case OP_59_FNMSUBS: + case OP_59_FNMADDS: + return true; + } + break; + case 63: + switch (inst_get_field(inst, 21, 30)) { + case OP_63_MTFSB0: + case OP_63_MTFSB1: + case OP_63_MTFSF: + case OP_63_MTFSFI: + case OP_63_MCRFS: + case OP_63_MFFS: + case OP_63_FCMPU: + case OP_63_FCMPO: + case OP_63_FNEG: + case OP_63_FMR: + case OP_63_FABS: + case OP_63_FRSP: + case OP_63_FDIV: + case OP_63_FADD: + case OP_63_FSUB: + case OP_63_FCTIW: + case OP_63_FCTIWZ: + case OP_63_FRSQRTE: + case OP_63_FCPSGN: + return true; + } + switch (inst_get_field(inst, 26, 30)) { + case OP_63_FMUL: + case OP_63_FSEL: + case OP_63_FMSUB: + case OP_63_FMADD: + case OP_63_FNMSUB: + case OP_63_FNMADD: + return true; + } + break; + case 31: + switch (inst_get_field(inst, 21, 30)) { + case OP_31_LFSX: + case OP_31_LFSUX: + case OP_31_LFDX: + case OP_31_LFDUX: + case OP_31_STFSX: + case OP_31_STFSUX: + case OP_31_STFX: + case OP_31_STFUX: + case OP_31_STFIWX: + return true; + } + break; + } + + return false; +} 
+ +static int get_d_signext(u32 inst) +{ + int d = inst & 0x8ff; + + if (d & 0x800) + return -(d & 0x7ff); + + return (d & 0x7ff); +} + +static int kvmppc_ps_three_in(struct kvm_vcpu *vcpu, bool rc, + int reg_out, int reg_in1, int reg_in2, + int reg_in3, int scalar, + void (*func)(struct thread_struct *t, + u32 *dst, u32 *src1, + u32 *src2, u32 *src3)) +{ + u32 *qpr = vcpu->arch.qpr; + u64 *fpr = vcpu->arch.fpr; + u32 ps0_out; + u32 ps0_in1, ps0_in2, ps0_in3; + u32 ps1_in1, ps1_in2, ps1_in3; + struct thread_struct t; + t.fpscr.val = vcpu->arch.fpscr; + + /* RC */ + WARN_ON(rc); + + /* PS0 */ + cvt_df((double*)&fpr[reg_in1], (float*)&ps0_in1, &t); + cvt_df((double*)&fpr[reg_in2], (float*)&ps0_in2, &t); + cvt_df((double*)&fpr[reg_in3], (float*)&ps0_in3, &t); + + if (scalar & SCALAR_LOW) + ps0_in2 = qpr[reg_in2]; + + func(&t, &ps0_out, &ps0_in1, &ps0_in2, &ps0_in3); + + dprintk(KERN_INFO "PS3 ps0 -> f(0x%x, 0x%x, 0x%x) = 0x%x\n", + ps0_in1, ps0_in2, ps0_in3, ps0_out); + + if (!(scalar & SCALAR_NO_PS0)) + cvt_fd((float*)&ps0_out, (double*)&fpr[reg_out], &t); + + /* PS1 */ + ps1_in1 = qpr[reg_in1]; + ps1_in2 = qpr[reg_in2]; + ps1_in3 = qpr[reg_in3]; + + if (scalar & SCALAR_HIGH) + ps1_in2 = ps0_in2; + + if (!(scalar & SCALAR_NO_PS1)) + func(&t, &qpr[reg_out], &ps1_in1, &ps1_in2, &ps1_in3); + + dprintk(KERN_INFO "PS3 ps1 -> f(0x%x, 0x%x, 0x%x) = 0x%x\n", + ps1_in1, ps1_in2, ps1_in3, qpr[reg_out]); + + return EMULATE_DONE; +} + +static int kvmppc_ps_two_in(struct kvm_vcpu *vcpu, bool rc, + int reg_out, int reg_in1, int reg_in2, + int scalar, + void (*func)(struct thread_struct *t, + u32 *dst, u32 *src1, + u32 *src2)) +{ + u32 *qpr = vcpu->arch.qpr; + u64 *fpr = vcpu->arch.fpr; + u32 ps0_out; + u32 ps0_in1, ps0_in2; + u32 ps1_out; + u32 ps1_in1, ps1_in2; + struct thread_struct t; + t.fpscr.val = vcpu->arch.fpscr; + + /* RC */ + WARN_ON(rc); + + /* PS0 */ + cvt_df((double*)&fpr[reg_in1], (float*)&ps0_in1, &t); + + if (scalar & SCALAR_LOW) + ps0_in2 = qpr[reg_in2]; + else + 
cvt_df((double*)&fpr[reg_in2], (float*)&ps0_in2, &t); + + func(&t, &ps0_out, &ps0_in1, &ps0_in2); + + if (!(scalar & SCALAR_NO_PS0)) { + dprintk(KERN_INFO "PS2 ps0 -> f(0x%x, 0x%x) = 0x%x\n", + ps0_in1, ps0_in2, ps0_out); + + cvt_fd((float*)&ps0_out, (double*)&fpr[reg_out], &t); + } + + /* PS1 */ + ps1_in1 = qpr[reg_in1]; + ps1_in2 = qpr[reg_in2]; + + if (scalar & SCALAR_HIGH) + ps1_in2 = ps0_in2; + + func(&t, &ps1_out, &ps1_in1, &ps1_in2); + + if (!(scalar & SCALAR_NO_PS1)) { + qpr[reg_out] = ps1_out; + + dprintk(KERN_INFO "PS2 ps1 -> f(0x%x, 0x%x) = 0x%x\n", + ps1_in1, ps1_in2, qpr[reg_out]); + } + + return EMULATE_DONE; +} + +static int kvmppc_ps_one_in(struct kvm_vcpu *vcpu, bool rc, + int reg_out, int reg_in, + void (*func)(struct thread_struct *t, + u32 *dst, u32 *src1)) +{ + u32 *qpr = vcpu->arch.qpr; + u64 *fpr = vcpu->arch.fpr; + u32 ps0_out, ps0_in; + u32 ps1_in; + struct thread_struct t; + t.fpscr.val = vcpu->arch.fpscr; + + /* RC */ + WARN_ON(rc); + + /* PS0 */ + cvt_df((double*)&fpr[reg_in], (float*)&ps0_in, &t); + func(&t, &ps0_out, &ps0_in); + + dprintk(KERN_INFO "PS1 ps0 -> f(0x%x) = 0x%x\n", + ps0_in, ps0_out); + + cvt_fd((float*)&ps0_out, (double*)&fpr[reg_out], &t); + + /* PS1 */ + ps1_in = qpr[reg_in]; + func(&t, &qpr[reg_out], &ps1_in); + + dprintk(KERN_INFO "PS1 ps1 -> f(0x%x) = 0x%x\n", + ps1_in, qpr[reg_out]); + + return EMULATE_DONE; +} + +int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu) +{ + u32 inst = vcpu->arch.last_inst; + enum emulation_result emulated = EMULATE_DONE; + + int ax_rd = inst_get_field(inst, 6, 10); + int ax_ra = inst_get_field(inst, 11, 15); + int ax_rb = inst_get_field(inst, 16, 20); + int ax_rc = inst_get_field(inst, 21, 25); + short full_d = inst_get_field(inst, 16, 31); + + bool rcomp = (inst & 1) ? 
true : false; + struct thread_struct t; +#ifdef DEBUG + int i; +#endif + + t.fpscr.val = vcpu->arch.fpscr; + + if (!kvmppc_inst_is_paired_single(vcpu, inst)) + return EMULATE_FAIL; + + if (!(vcpu->arch.msr & MSR_FP)) { + kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL); + return EMULATE_AGAIN; + } + + kvmppc_giveup_ext(vcpu, MSR_FP); + preempt_disable(); + enable_kernel_fp(); + /* Do we need to clear FE0 / FE1 here? Don't think so. */ + +#ifdef DEBUG + for (i = 0; i < ARRAY_SIZE(vcpu->arch.fpr); i++) { + u32 f; + cvt_df((double*)&vcpu->arch.fpr[i], (float*)&f, &t); + dprintk(KERN_INFO "FPR[%d] = 0x%x / 0x%llx QPR[%d] = 0x%x\n", + i, f, vcpu->arch.fpr[i], i, vcpu->arch.qpr[i]); + } +#endif + + switch (get_op(inst)) { + case OP_PSQ_L: + { + ulong addr = ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0; + bool w = inst_get_field(inst, 16, 16) ? true : false; + int i = inst_get_field(inst, 17, 19); + + addr += get_d_signext(inst); + emulated = kvmppc_emulate_psq_load(run, vcpu, ax_rd, addr, w, i); + break; + } + case OP_PSQ_LU: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra); + bool w = inst_get_field(inst, 16, 16) ? true : false; + int i = inst_get_field(inst, 17, 19); + + addr += get_d_signext(inst); + emulated = kvmppc_emulate_psq_load(run, vcpu, ax_rd, addr, w, i); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case OP_PSQ_ST: + { + ulong addr = ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0; + bool w = inst_get_field(inst, 16, 16) ? true : false; + int i = inst_get_field(inst, 17, 19); + + addr += get_d_signext(inst); + emulated = kvmppc_emulate_psq_store(run, vcpu, ax_rd, addr, w, i); + break; + } + case OP_PSQ_STU: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra); + bool w = inst_get_field(inst, 16, 16) ? 
true : false; + int i = inst_get_field(inst, 17, 19); + + addr += get_d_signext(inst); + emulated = kvmppc_emulate_psq_store(run, vcpu, ax_rd, addr, w, i); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case 4: + /* X form */ + switch (inst_get_field(inst, 21, 30)) { + case OP_4X_PS_CMPU0: + /* XXX */ + emulated = EMULATE_FAIL; + break; + case OP_4X_PSQ_LX: + { + ulong addr = ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0; + bool w = inst_get_field(inst, 21, 21) ? true : false; + int i = inst_get_field(inst, 22, 24); + + addr += kvmppc_get_gpr(vcpu, ax_rb); + emulated = kvmppc_emulate_psq_load(run, vcpu, ax_rd, addr, w, i); + break; + } + case OP_4X_PS_CMPO0: + /* XXX */ + emulated = EMULATE_FAIL; + break; + case OP_4X_PSQ_LUX: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra); + bool w = inst_get_field(inst, 21, 21) ? true : false; + int i = inst_get_field(inst, 22, 24); + + addr += kvmppc_get_gpr(vcpu, ax_rb); + emulated = kvmppc_emulate_psq_load(run, vcpu, ax_rd, addr, w, i); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case OP_4X_PS_NEG: + vcpu->arch.fpr[ax_rd] = vcpu->arch.fpr[ax_rb]; + vcpu->arch.fpr[ax_rd] ^= 0x8000000000000000ULL; + vcpu->arch.qpr[ax_rd] = vcpu->arch.qpr[ax_rb]; + vcpu->arch.qpr[ax_rd] ^= 0x80000000; + break; + case OP_4X_PS_CMPU1: + /* XXX */ + emulated = EMULATE_FAIL; + break; + case OP_4X_PS_MR: + WARN_ON(rcomp); + vcpu->arch.fpr[ax_rd] = vcpu->arch.fpr[ax_rb]; + vcpu->arch.qpr[ax_rd] = vcpu->arch.qpr[ax_rb]; + break; + case OP_4X_PS_CMPO1: + /* XXX */ + emulated = EMULATE_FAIL; + break; + case OP_4X_PS_NABS: + WARN_ON(rcomp); + vcpu->arch.fpr[ax_rd] = vcpu->arch.fpr[ax_rb]; + vcpu->arch.fpr[ax_rd] |= 0x8000000000000000ULL; + vcpu->arch.qpr[ax_rd] = vcpu->arch.qpr[ax_rb]; + vcpu->arch.qpr[ax_rd] |= 0x80000000; + break; + case OP_4X_PS_ABS: + WARN_ON(rcomp); + vcpu->arch.fpr[ax_rd] = vcpu->arch.fpr[ax_rb]; + vcpu->arch.fpr[ax_rd] &= ~0x8000000000000000ULL; + 
vcpu->arch.qpr[ax_rd] = vcpu->arch.qpr[ax_rb]; + vcpu->arch.qpr[ax_rd] &= ~0x80000000; + break; + case OP_4X_PS_MERGE00: + WARN_ON(rcomp); + vcpu->arch.fpr[ax_rd] = vcpu->arch.fpr[ax_ra]; + /* vcpu->arch.qpr[ax_rd] = vcpu->arch.fpr[ax_rb]; */ + cvt_df((double*)&vcpu->arch.fpr[ax_rb], + (float*)&vcpu->arch.qpr[ax_rd], &t); + break; + case OP_4X_PS_MERGE01: + WARN_ON(rcomp); + vcpu->arch.fpr[ax_rd] = vcpu->arch.fpr[ax_ra]; + vcpu->arch.qpr[ax_rd] = vcpu->arch.qpr[ax_rb]; + break; + case OP_4X_PS_MERGE10: + WARN_ON(rcomp); + /* vcpu->arch.fpr[ax_rd] = vcpu->arch.qpr[ax_ra]; */ + cvt_fd((float*)&vcpu->arch.qpr[ax_ra], + (double*)&vcpu->arch.fpr[ax_rd], &t); + /* vcpu->arch.qpr[ax_rd] = vcpu->arch.fpr[ax_rb]; */ + cvt_df((double*)&vcpu->arch.fpr[ax_rb], + (float*)&vcpu->arch.qpr[ax_rd], &t); + break; + case OP_4X_PS_MERGE11: + WARN_ON(rcomp); + /* vcpu->arch.fpr[ax_rd] = vcpu->arch.qpr[ax_ra]; */ + cvt_fd((float*)&vcpu->arch.qpr[ax_ra], + (double*)&vcpu->arch.fpr[ax_rd], &t); + vcpu->arch.qpr[ax_rd] = vcpu->arch.qpr[ax_rb]; + break; + } + /* XW form */ + switch (inst_get_field(inst, 25, 30)) { + case OP_4XW_PSQ_STX: + { + ulong addr = ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0; + bool w = inst_get_field(inst, 21, 21) ? true : false; + int i = inst_get_field(inst, 22, 24); + + addr += kvmppc_get_gpr(vcpu, ax_rb); + emulated = kvmppc_emulate_psq_store(run, vcpu, ax_rd, addr, w, i); + break; + } + case OP_4XW_PSQ_STUX: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra); + bool w = inst_get_field(inst, 21, 21) ? 
true : false; + int i = inst_get_field(inst, 22, 24); + + addr += kvmppc_get_gpr(vcpu, ax_rb); + emulated = kvmppc_emulate_psq_store(run, vcpu, ax_rd, addr, w, i); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + } + /* A form */ + switch (inst_get_field(inst, 26, 30)) { + case OP_4A_PS_SUM1: + emulated = kvmppc_ps_two_in(vcpu, rcomp, ax_rd, + ax_rb, ax_ra, SCALAR_NO_PS0 | SCALAR_HIGH, fp_fadds); + vcpu->arch.fpr[ax_rd] = vcpu->arch.fpr[ax_rc]; + break; + case OP_4A_PS_SUM0: + emulated = kvmppc_ps_two_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rb, SCALAR_NO_PS1 | SCALAR_LOW, fp_fadds); + vcpu->arch.qpr[ax_rd] = vcpu->arch.qpr[ax_rc]; + break; + case OP_4A_PS_MULS0: + emulated = kvmppc_ps_two_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, SCALAR_HIGH, fp_fmuls); + break; + case OP_4A_PS_MULS1: + emulated = kvmppc_ps_two_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, SCALAR_LOW, fp_fmuls); + break; + case OP_4A_PS_MADDS0: + emulated = kvmppc_ps_three_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, ax_rb, SCALAR_HIGH, fp_fmadds); + break; + case OP_4A_PS_MADDS1: + emulated = kvmppc_ps_three_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, ax_rb, SCALAR_LOW, fp_fmadds); + break; + case OP_4A_PS_DIV: + emulated = kvmppc_ps_two_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rb, SCALAR_NONE, fp_fdivs); + break; + case OP_4A_PS_SUB: + emulated = kvmppc_ps_two_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rb, SCALAR_NONE, fp_fsubs); + break; + case OP_4A_PS_ADD: + emulated = kvmppc_ps_two_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rb, SCALAR_NONE, fp_fadds); + break; + case OP_4A_PS_SEL: + emulated = kvmppc_ps_three_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, ax_rb, SCALAR_NONE, fp_fsel); + break; + case OP_4A_PS_RES: + emulated = kvmppc_ps_one_in(vcpu, rcomp, ax_rd, + ax_rb, fp_fres); + break; + case OP_4A_PS_MUL: + emulated = kvmppc_ps_two_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, SCALAR_NONE, fp_fmuls); + break; + case OP_4A_PS_RSQRTE: + emulated = kvmppc_ps_one_in(vcpu, rcomp, ax_rd, + ax_rb, fp_frsqrte); + break; 
+ case OP_4A_PS_MSUB: + emulated = kvmppc_ps_three_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, ax_rb, SCALAR_NONE, fp_fmsubs); + break; + case OP_4A_PS_MADD: + emulated = kvmppc_ps_three_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, ax_rb, SCALAR_NONE, fp_fmadds); + break; + case OP_4A_PS_NMSUB: + emulated = kvmppc_ps_three_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, ax_rb, SCALAR_NONE, fp_fnmsubs); + break; + case OP_4A_PS_NMADD: + emulated = kvmppc_ps_three_in(vcpu, rcomp, ax_rd, + ax_ra, ax_rc, ax_rb, SCALAR_NONE, fp_fnmadds); + break; + } + break; + + /* Real FPU operations */ + + case OP_LFS: + { + ulong addr = (ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0) + full_d; + + emulated = kvmppc_emulate_fpr_load(run, vcpu, ax_rd, addr, + FPU_LS_SINGLE); + break; + } + case OP_LFSU: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra) + full_d; + + emulated = kvmppc_emulate_fpr_load(run, vcpu, ax_rd, addr, + FPU_LS_SINGLE); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case OP_LFD: + { + ulong addr = (ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0) + full_d; + + emulated = kvmppc_emulate_fpr_load(run, vcpu, ax_rd, addr, + FPU_LS_DOUBLE); + break; + } + case OP_LFDU: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra) + full_d; + + emulated = kvmppc_emulate_fpr_load(run, vcpu, ax_rd, addr, + FPU_LS_DOUBLE); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case OP_STFS: + { + ulong addr = (ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0) + full_d; + + emulated = kvmppc_emulate_fpr_store(run, vcpu, ax_rd, addr, + FPU_LS_SINGLE); + break; + } + case OP_STFSU: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra) + full_d; + + emulated = kvmppc_emulate_fpr_store(run, vcpu, ax_rd, addr, + FPU_LS_SINGLE); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case OP_STFD: + { + ulong addr = (ax_ra ? 
kvmppc_get_gpr(vcpu, ax_ra) : 0) + full_d; + + emulated = kvmppc_emulate_fpr_store(run, vcpu, ax_rd, addr, + FPU_LS_DOUBLE); + break; + } + case OP_STFDU: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra) + full_d; + + emulated = kvmppc_emulate_fpr_store(run, vcpu, ax_rd, addr, + FPU_LS_DOUBLE); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case 31: + switch (inst_get_field(inst, 21, 30)) { + case OP_31_LFSX: + { + ulong addr = ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0; + + addr += kvmppc_get_gpr(vcpu, ax_rb); + emulated = kvmppc_emulate_fpr_load(run, vcpu, ax_rd, + addr, FPU_LS_SINGLE); + break; + } + case OP_31_LFSUX: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra) + + kvmppc_get_gpr(vcpu, ax_rb); + + emulated = kvmppc_emulate_fpr_load(run, vcpu, ax_rd, + addr, FPU_LS_SINGLE); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case OP_31_LFDX: + { + ulong addr = (ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0) + + kvmppc_get_gpr(vcpu, ax_rb); + + emulated = kvmppc_emulate_fpr_load(run, vcpu, ax_rd, + addr, FPU_LS_DOUBLE); + break; + } + case OP_31_LFDUX: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra) + + kvmppc_get_gpr(vcpu, ax_rb); + + emulated = kvmppc_emulate_fpr_load(run, vcpu, ax_rd, + addr, FPU_LS_DOUBLE); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case OP_31_STFSX: + { + ulong addr = (ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0) + + kvmppc_get_gpr(vcpu, ax_rb); + + emulated = kvmppc_emulate_fpr_store(run, vcpu, ax_rd, + addr, FPU_LS_SINGLE); + break; + } + case OP_31_STFSUX: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra) + + kvmppc_get_gpr(vcpu, ax_rb); + + emulated = kvmppc_emulate_fpr_store(run, vcpu, ax_rd, + addr, FPU_LS_SINGLE); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case OP_31_STFX: + { + ulong addr = (ax_ra ? 
kvmppc_get_gpr(vcpu, ax_ra) : 0) + + kvmppc_get_gpr(vcpu, ax_rb); + + emulated = kvmppc_emulate_fpr_store(run, vcpu, ax_rd, + addr, FPU_LS_DOUBLE); + break; + } + case OP_31_STFUX: + { + ulong addr = kvmppc_get_gpr(vcpu, ax_ra) + + kvmppc_get_gpr(vcpu, ax_rb); + + emulated = kvmppc_emulate_fpr_store(run, vcpu, ax_rd, + addr, FPU_LS_DOUBLE); + + if (emulated == EMULATE_DONE) + kvmppc_set_gpr(vcpu, ax_ra, addr); + break; + } + case OP_31_STFIWX: + { + ulong addr = (ax_ra ? kvmppc_get_gpr(vcpu, ax_ra) : 0) + + kvmppc_get_gpr(vcpu, ax_rb); + + emulated = kvmppc_emulate_fpr_store(run, vcpu, ax_rd, + addr, + FPU_LS_SINGLE_LOW); + break; + } + break; + } + break; + case 59: + switch (inst_get_field(inst, 21, 30)) { + case OP_59_FADDS: + case OP_59_FSUBS: + case OP_59_FDIVS: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRA | + FPU_HAS_FRB | FPU_HAS_FRT | + FPU_SYNC_QPR); + break; + case OP_59_FRES: + case OP_59_FRSQRTES: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRT | + FPU_HAS_FRB | FPU_SYNC_QPR); + break; + } + switch (inst_get_field(inst, 26, 30)) { + case OP_59_FMULS: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRA | + FPU_HAS_FRC | FPU_HAS_FRT | + FPU_SYNC_QPR); + break; + case OP_59_FMSUBS: + case OP_59_FMADDS: + case OP_59_FNMSUBS: + case OP_59_FNMADDS: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRA | + FPU_HAS_FRB | FPU_HAS_FRC | + FPU_HAS_FRT | FPU_SYNC_QPR); + break; + } + break; + case 63: + switch (inst_get_field(inst, 21, 30)) { + case OP_63_MTFSB0: + case OP_63_MTFSB1: + case OP_63_MCRFS: + case OP_63_MTFSFI: + kvmppc_do_fpu_inst(vcpu, inst, 0); + break; + case OP_63_MFFS: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRT); + break; + case OP_63_MTFSF: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRB); + break; + case OP_63_FCMPU: + case OP_63_FCMPO: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRA | + FPU_HAS_FRB); + break; + case OP_63_FNEG: + case OP_63_FMR: + case OP_63_FABS: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRT | + FPU_HAS_FRB); + break; + case OP_63_FCPSGN: + case 
OP_63_FDIV: + case OP_63_FADD: + case OP_63_FSUB: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRA | + FPU_HAS_FRB | FPU_HAS_FRT); + break; + case OP_63_FCTIW: + case OP_63_FCTIWZ: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRB | + FPU_HAS_FRT); + break; + case OP_63_FRSP: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRB | + FPU_HAS_FRT | FPU_SYNC_QPR); + break; + case OP_63_FRSQRTE: + { + float f, one = 1.0f; + + cvt_df((double*)&vcpu->arch.fpr[ax_rb], &f, &t); + /* f = sqrt(f) */ + fp_fsqrts(&t, (u32*)&f, (u32*)&f); + /* f = 1.0f / f */ + fp_fdivs(&t, (u32*)&f, (u32*)&one, (u32*)&f); + cvt_fd(&f, (double*)&vcpu->arch.fpr[ax_rd], &t); + break; + } + } + switch (inst_get_field(inst, 26, 30)) { + case OP_63_FMUL: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRA | + FPU_HAS_FRC | FPU_HAS_FRT); + break; + case OP_63_FSEL: + case OP_63_FMSUB: + case OP_63_FMADD: + case OP_63_FNMSUB: + case OP_63_FNMADD: + kvmppc_do_fpu_inst(vcpu, inst, FPU_HAS_FRA | + FPU_HAS_FRB | FPU_HAS_FRC | + FPU_HAS_FRT); + break; + } + break; + } + +#ifdef DEBUG + for (i = 0; i < ARRAY_SIZE(vcpu->arch.fpr); i++) { + u32 f; + cvt_df((double*)&vcpu->arch.fpr[i], (float*)&f, &t); + dprintk(KERN_INFO "FPR[%d] = 0x%x\n", i, f); + } +#endif + + preempt_enable(); + + return emulated; +} -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
[parent not found: <1265298925-31954-19-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>]
* Re: [PATCH 18/18] KVM: PPC: Implement Paired Single emulation
  [not found] ` <1265298925-31954-19-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
@ 2010-02-07 12:50   ` Avi Kivity
  2010-02-07 15:57     ` Alexander Graf
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-02-07 12:50 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA

On 02/04/2010 05:55 PM, Alexander Graf wrote:
> The one big thing about the Gekko is paired singles.
>
> Paired singles are an extension to the instruction set that adds 32 single
> precision floating point registers (qprs), some SPRs to modify the behavior
> of paired single operations, and instructions to deal with qprs to the
> instruction set.
>
> Unfortunately, it also changes semantics of existing operations that affect
> single values in FPRs. In most cases they get mirrored to the corresponding
> QPR.
>
> Thanks to that we need to emulate all FPU operations and all the new paired
> single operations too.
>
> In order to achieve that, we take the guest's instruction, rip out the
> parameters, put in our own and execute the very same instruction, but also
> fix up the QPR values along the way.
>
> That way we can execute paired single FPU operations without implementing a
> soft fpu.
>

A little frightening.  How many instructions are there?  Maybe we can
just have an array of all of them followed by a return instruction, so
we don't jit code.
> static void call_fpu_inst(u32 inst, u64 *out, u64 *in1, u64 *in2, u64 *in3,
> +			  u32 *cr, u32 *fpscr)
> +{
> +	u32 cr_val = 0;
> +	u32 *call_stack;
> +	u64 inout[5] = { 0, 0, 0, 0, 0 };
> +
> +	if (fpscr)
> +		inout[0] = *fpscr;
> +	if (in1)
> +		inout[1] = *in1;
> +	if (in2)
> +		inout[2] = *in2;
> +	if (in3)
> +		inout[3] = *in3;
> +	if (cr)
> +		cr_val = *cr;
> +
> +	dprintk(KERN_INFO "FPU Emulator 0x%x ( 0x%llx, 0x%llx, 0x%llx )", inst,
> +		inout[1], inout[2], inout[3]);
> +
> +	call_stack = &kvmppc_call_stack[(smp_processor_id() * 2)];
> +	call_stack[0] = inst;
> +	/* call_stack[1] is INS_BLR */
> +

Would be easier on the cache to do this per-cpu?

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 53+ messages in thread
* Re: [PATCH 18/18] KVM: PPC: Implement Paired Single emulation
  2010-02-07 12:50   ` Avi Kivity
@ 2010-02-07 15:57     ` Alexander Graf
  2010-02-07 16:18       ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Graf @ 2010-02-07 15:57 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 07.02.2010, at 13:50, Avi Kivity <avi@redhat.com> wrote:

> On 02/04/2010 05:55 PM, Alexander Graf wrote:
>> The one big thing about the Gekko is paired singles.
>>
>> Paired singles are an extension to the instruction set that adds 32 single
>> precision floating point registers (QPRs), some SPRs to modify the behavior
>> of paired single operations, and instructions that operate on the QPRs.
>>
>> Unfortunately, it also changes the semantics of existing operations that
>> affect single values in FPRs. In most cases they get mirrored to the
>> corresponding QPR.
>>
>> Because of that, we need to emulate all FPU operations and all the new
>> paired single operations too.
>>
>> To achieve that, we take the guest's instruction, rip out the parameters,
>> put in our own, and execute the very same instruction, but also fix up the
>> QPR values along the way.
>>
>> That way we can execute paired single FPU operations without implementing
>> a soft FPU.
>
> A little frightening. How many instructions are there? Maybe we can just
> have an array of all of them followed by a return instruction, so we
> don't jit code.

All the instructions are in the list; most can have the Rc (record) bit
set to update CR, and IIRC there were a couple with immediate values.

But maybe you're right. I probably could just always set Rc and either
ignore the result or use it. I could maybe find alternatives to the
instructions that take immediates. Let me check this on the bus trip
back from Brussels.

^ permalink raw reply	[flat|nested] 53+ messages in thread
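The "always set Rc" idea above is easy to sketch. In the PowerPC instruction encoding the record (Rc) bit is the least-significant bit of the 32-bit instruction word (bit 31 in IBM numbering), so forcing it on is a single OR. The helper names below are illustrative, not from the patch:

```c
#include <stdint.h>

/* Force the Rc (record) bit on: for FPU ops this makes the hardware
 * also update CR1, and the emulator can then either copy that result
 * into the guest's CR or simply discard it. */
static inline uint32_t force_rc(uint32_t inst)
{
	return inst | 1;	/* Rc is the LSB of the instruction word */
}

static inline int has_rc(uint32_t inst)
{
	return inst & 1;
}
```

With this trick the table of callable instructions only needs one entry per opcode, not one per Rc variant.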
>> static void call_fpu_inst(u32 inst, u64 *out, u64 *in1, u64 *in2, u64 *in3,
>> +			  u32 *cr, u32 *fpscr)
>> +{
>> +	u32 cr_val = 0;
>> +	u32 *call_stack;
>> +	u64 inout[5] = { 0, 0, 0, 0, 0 };
>> +
>> +	if (fpscr)
>> +		inout[0] = *fpscr;
>> +	if (in1)
>> +		inout[1] = *in1;
>> +	if (in2)
>> +		inout[2] = *in2;
>> +	if (in3)
>> +		inout[3] = *in3;
>> +	if (cr)
>> +		cr_val = *cr;
>> +
>> +	dprintk(KERN_INFO "FPU Emulator 0x%x ( 0x%llx, 0x%llx, 0x%llx )", inst,
>> +		inout[1], inout[2], inout[3]);
>> +
>> +	call_stack = &kvmppc_call_stack[(smp_processor_id() * 2)];
>> +	call_stack[0] = inst;
>> +	/* call_stack[1] is INS_BLR */
>> +
>
> Would be easier on the cache to do this per-cpu?

It is per-cpu. Or do you mean to actually use the PER_CPU definition?
Is that guaranteed to be executable?

Alex

^ permalink raw reply	[flat|nested] 53+ messages in thread
* Re: [PATCH 18/18] KVM: PPC: Implement Paired Single emulation
  2010-02-07 15:57     ` Alexander Graf
@ 2010-02-07 16:18       ` Avi Kivity
  0 siblings, 0 replies; 53+ messages in thread
From: Avi Kivity @ 2010-02-07 16:18 UTC (permalink / raw)
To: Alexander Graf; +Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 02/07/2010 05:57 PM, Alexander Graf wrote:
>>> +	dprintk(KERN_INFO "FPU Emulator 0x%x ( 0x%llx, 0x%llx, 0x%llx )", inst,
>>> +		inout[1], inout[2], inout[3]);
>>> +
>>> +	call_stack = &kvmppc_call_stack[(smp_processor_id() * 2)];
>>> +	call_stack[0] = inst;
>>> +	/* call_stack[1] is INS_BLR */
>>> +
>>
>> Would be easier on the cache to do this per-cpu?
>
> It is per-cpu. Or do you mean to actually use the PER_CPU definition?
> Is that guaranteed to be executable?

I meant a per-cpu vmalloc area, but it should be enough to have a
per-cpu cache line.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 53+ messages in thread
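The mechanism under discussion gives each CPU a two-word slot: the emulator writes the guest's FPU instruction into word 0, word 1 is a pre-set `blr`, and the thunk branches to the slot with the operands marshalled into one array. A minimal userspace sketch of just the slot setup and argument marshalling follows; `MAX_CPUS` is a placeholder, and the actual branch-and-execute step is omitted since it needs an executable mapping, an icache flush, and a PPC host:

```c
#include <stdint.h>
#include <stddef.h>

#define INS_BLR 0x4e800020u	/* PowerPC "blr" encoding */
enum { MAX_CPUS = 4 };		/* placeholder; the kernel sizes this per-CPU */

/* Two instruction words per CPU: [guest instruction, blr]. */
static uint32_t call_stack[MAX_CPUS * 2];

static void prepare_slot(int cpu, uint32_t inst)
{
	uint32_t *slot = &call_stack[cpu * 2];

	slot[0] = inst;
	slot[1] = INS_BLR;
	/* a real implementation must flush the icache line here */
}

/* Mirror of the patch's marshalling: optional inputs are packed into
 * one array so a single asm thunk can address them uniformly. */
static void pack_inout(uint64_t inout[5], const uint64_t *fpscr,
		       const uint64_t *in1, const uint64_t *in2,
		       const uint64_t *in3)
{
	for (int i = 0; i < 5; i++)
		inout[i] = 0;
	if (fpscr)
		inout[0] = *fpscr;
	if (in1)
		inout[1] = *in1;
	if (in2)
		inout[2] = *in2;
	if (in3)
		inout[3] = *in3;
}
```

Because each CPU indexes its own slot, concurrent vcpus never overwrite each other's instruction word; Avi's point is that making each slot cache-line sized (or putting the array in a per-cpu area) additionally avoids false sharing between CPUs.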
[parent not found: <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>]
* [PATCH 02/18] KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> @ 2010-02-04 15:55 ` Alexander Graf [not found] ` <1265298925-31954-3-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> 2010-02-04 15:55 ` [PATCH 03/18] KVM: PPC: Teach MMIO Signedness Alexander Graf ` (8 subsequent siblings) 9 siblings, 1 reply; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc-u79uwXL29TY76Z2rM5mHXA; +Cc: kvm-u79uwXL29TY76Z2rM5mHXA Right now MMIO access can only happen for GPRs and is at most 32 bit wide. That's actually enough for almost all types of hardware out there. Unfortunately, the guest I was using used FPU writes to MMIO regions, so it ended up writing 64 bit MMIOs using FPRs and QPRs. So let's add code to handle those odd cases too. Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org> --- arch/powerpc/include/asm/kvm.h | 7 +++++++ arch/powerpc/include/asm/kvm_ppc.h | 2 +- arch/powerpc/kvm/powerpc.c | 24 ++++++++++++++++++++++-- 3 files changed, 30 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 81f3b0b..548376c 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -77,4 +77,11 @@ struct kvm_debug_exit_arch { struct kvm_guest_debug_arch { }; +#define REG_MASK 0x001f +#define REG_EXT_MASK 0xffe0 +#define REG_GPR 0x0000 +#define REG_FPR 0x0020 +#define REG_QPR 0x0040 +#define REG_FQPR 0x0060 + #endif /* __LINUX_KVM_POWERPC_H */ diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index e264282..c011170 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -49,7 +49,7 @@ extern int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu, unsigned int rt, unsigned int bytes, int is_bigendian); extern int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu 
*vcpu, - u32 val, unsigned int bytes, int is_bigendian); + u64 val, unsigned int bytes, int is_bigendian); extern int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 51aedd7..98d5e6d 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -277,7 +277,7 @@ static void kvmppc_complete_dcr_load(struct kvm_vcpu *vcpu, static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu, struct kvm_run *run) { - ulong gpr; + u64 gpr; if (run->mmio.len > sizeof(gpr)) { printk(KERN_ERR "bad MMIO length: %d\n", run->mmio.len); @@ -286,6 +286,7 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu, if (vcpu->arch.mmio_is_bigendian) { switch (run->mmio.len) { + case 8: gpr = *(u64 *)run->mmio.data; break; case 4: gpr = *(u32 *)run->mmio.data; break; case 2: gpr = *(u16 *)run->mmio.data; break; case 1: gpr = *(u8 *)run->mmio.data; break; @@ -300,6 +301,24 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu, } kvmppc_set_gpr(vcpu, vcpu->arch.io_gpr, gpr); + + switch (vcpu->arch.io_gpr & REG_EXT_MASK) { + case REG_GPR: + kvmppc_set_gpr(vcpu, vcpu->arch.io_gpr, gpr); + break; + case REG_FPR: + vcpu->arch.fpr[vcpu->arch.io_gpr & REG_MASK] = gpr; + break; + case REG_QPR: + vcpu->arch.qpr[vcpu->arch.io_gpr & REG_MASK] = gpr; + break; + case REG_FQPR: + vcpu->arch.fpr[vcpu->arch.io_gpr & REG_MASK] = gpr; + vcpu->arch.qpr[vcpu->arch.io_gpr & REG_MASK] = gpr; + break; + default: + BUG(); + } } int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu, @@ -323,7 +342,7 @@ int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu, } int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu, - u32 val, unsigned int bytes, int is_bigendian) + u64 val, unsigned int bytes, int is_bigendian) { void *data = run->mmio.data; @@ -341,6 +360,7 @@ int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu, /* Store the value at the 
lowest bytes in 'data'. */ if (is_bigendian) { switch (bytes) { + case 8: *(u64 *)data = val; break; case 4: *(u32 *)data = val; break; case 2: *(u16 *)data = val; break; case 1: *(u8 *)data = val; break; -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
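The patch's `io_gpr` encoding packs the destination register file into the bits above the low five bits that hold the register number, which is what the `REG_EXT_MASK`/`REG_MASK` switch decodes. A small standalone sketch of that decode (helper names are mine, the constants are from the patch):

```c
#include <stdint.h>

#define REG_MASK	0x001f
#define REG_EXT_MASK	0xffe0
#define REG_GPR		0x0000
#define REG_FPR		0x0020
#define REG_QPR		0x0040
#define REG_FQPR	0x0060

/* Which register file the MMIO result should land in. */
static inline unsigned reg_class(unsigned io_gpr)
{
	return io_gpr & REG_EXT_MASK;
}

/* Which register within that file (0..31). */
static inline unsigned reg_num(unsigned io_gpr)
{
	return io_gpr & REG_MASK;
}
```

So e.g. `REG_FPR | 5` means "complete this load into fpr[5]", and `REG_FQPR` means the value is written to both `fpr[n]` and `qpr[n]`, matching the paired-single mirroring semantics.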
[parent not found: <1265298925-31954-3-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>]
* Re: [PATCH 02/18] KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs [not found] ` <1265298925-31954-3-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> @ 2010-02-07 12:29 ` Avi Kivity [not found] ` <4B6EB229.8090502-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Avi Kivity @ 2010-02-07 12:29 UTC (permalink / raw) To: Alexander Graf; +Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA On 02/04/2010 05:55 PM, Alexander Graf wrote: > Right now MMIO access can only happen for GPRs and is at most 32 bit wide. > That's actually enough for almost all types of hardware out there. > > Unfortunately, the guest I was using used FPU writes to MMIO regions, so > it ended up writing 64 bit MMIOs using FPRs and QPRs. > > So let's add code to handle those odd cases too. > > Signed-off-by: Alexander Graf<agraf-l3A5Bk7waGM@public.gmane.org> > --- > arch/powerpc/include/asm/kvm.h | 7 +++++++ > arch/powerpc/include/asm/kvm_ppc.h | 2 +- > arch/powerpc/kvm/powerpc.c | 24 ++++++++++++++++++++++-- > 3 files changed, 30 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h > index 81f3b0b..548376c 100644 > --- a/arch/powerpc/include/asm/kvm.h > +++ b/arch/powerpc/include/asm/kvm.h > @@ -77,4 +77,11 @@ struct kvm_debug_exit_arch { > struct kvm_guest_debug_arch { > }; > > +#define REG_MASK 0x001f > +#define REG_EXT_MASK 0xffe0 > +#define REG_GPR 0x0000 > +#define REG_FPR 0x0020 > +#define REG_QPR 0x0040 > +#define REG_FQPR 0x0060 > These names seem too generic to belong in asm/kvm.h - some application could use the same names. Please add a KVM_ prefix. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <4B6EB229.8090502-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 02/18] KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs [not found] ` <4B6EB229.8090502-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2010-02-07 15:51 ` Alexander Graf 0 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-07 15:51 UTC (permalink / raw) To: Avi Kivity Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Am 07.02.2010 um 13:29 schrieb Avi Kivity <avi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>: > On 02/04/2010 05:55 PM, Alexander Graf wrote: >> Right now MMIO access can only happen for GPRs and is at most 32 >> bit wide. >> That's actually enough for almost all types of hardware out there. >> >> Unfortunately, the guest I was using used FPU writes to MMIO >> regions, so >> it ended up writing 64 bit MMIOs using FPRs and QPRs. >> >> So let's add code to handle those odd cases too. >> >> Signed-off-by: Alexander Graf<agraf-l3A5Bk7waGM@public.gmane.org> >> --- >> arch/powerpc/include/asm/kvm.h | 7 +++++++ >> arch/powerpc/include/asm/kvm_ppc.h | 2 +- >> arch/powerpc/kvm/powerpc.c | 24 ++++++++++++++++++++++-- >> 3 files changed, 30 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/ >> asm/kvm.h >> index 81f3b0b..548376c 100644 >> --- a/arch/powerpc/include/asm/kvm.h >> +++ b/arch/powerpc/include/asm/kvm.h >> @@ -77,4 +77,11 @@ struct kvm_debug_exit_arch { >> struct kvm_guest_debug_arch { >> }; >> >> +#define REG_MASK 0x001f >> +#define REG_EXT_MASK 0xffe0 >> +#define REG_GPR 0x0000 >> +#define REG_FPR 0x0020 >> +#define REG_QPR 0x0040 >> +#define REG_FQPR 0x0060 >> > > These names seem too generic to belong in asm/kvm.h - some > application could use the same names. Please add a KVM_ prefix. Yes, will do. Alex ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH 03/18] KVM: PPC: Teach MMIO Signedness [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> 2010-02-04 15:55 ` [PATCH 02/18] KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf [not found] ` <1265298925-31954-4-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> 2010-02-04 15:55 ` [PATCH 04/18] KVM: PPC: Add AGAIN type for emulation return Alexander Graf ` (7 subsequent siblings) 9 siblings, 1 reply; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc-u79uwXL29TY76Z2rM5mHXA; +Cc: kvm-u79uwXL29TY76Z2rM5mHXA The guest I was trying to get to run uses the LHA and LHAU instructions. Those instructions basically do a load, but also sign extend the result. Since we need to fill our registers by hand when doing MMIO, we also need to sign extend manually. This patch implements sign extended MMIO and the LHA(U) instructions. Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org> --- arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/include/asm/kvm_ppc.h | 3 +++ arch/powerpc/kvm/emulate.c | 14 ++++++++++++++ arch/powerpc/kvm/powerpc.c | 32 ++++++++++++++++++++++++++++++++ 4 files changed, 50 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 2ed954e..4dd98fa 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -268,6 +268,7 @@ struct kvm_vcpu_arch { u8 io_gpr; /* GPR used as IO source/target */ u8 mmio_is_bigendian; + u8 mmio_sign_extend; u8 dcr_needed; u8 dcr_is_write; diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index c011170..a288dd2 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -48,6 +48,9 @@ extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu); extern int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu, unsigned 
int rt, unsigned int bytes, int is_bigendian); +extern int kvmppc_handle_loads(struct kvm_run *run, struct kvm_vcpu *vcpu, + unsigned int rt, unsigned int bytes, + int is_bigendian); extern int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu, u64 val, unsigned int bytes, int is_bigendian); diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c index b905623..ef2ff59 100644 --- a/arch/powerpc/kvm/emulate.c +++ b/arch/powerpc/kvm/emulate.c @@ -62,6 +62,8 @@ #define OP_STBU 39 #define OP_LHZ 40 #define OP_LHZU 41 +#define OP_LHA 42 +#define OP_LHAU 43 #define OP_STH 44 #define OP_STHU 45 @@ -450,6 +452,18 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu) kvmppc_set_gpr(vcpu, ra, vcpu->arch.paddr_accessed); break; + case OP_LHA: + rt = get_rt(inst); + emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1); + break; + + case OP_LHAU: + ra = get_ra(inst); + rt = get_rt(inst); + emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1); + kvmppc_set_gpr(vcpu, ra, vcpu->arch.paddr_accessed); + break; + case OP_STH: rs = get_rs(inst); emulated = kvmppc_handle_store(run, vcpu, diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 98d5e6d..a235369 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -300,6 +300,25 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu, } } + if (vcpu->arch.mmio_sign_extend) { + switch (run->mmio.len) { +#ifdef CONFIG_PPC64 + case 4: + if (gpr & 0x80000000) + gpr |= 0xffffffff00000000ULL; + break; +#endif + case 2: + if (gpr & 0x8000) + gpr |= 0xffffffffffff0000ULL; + break; + case 1: + if (gpr & 0x80) + gpr |= 0xffffffffffffff00ULL; + break; + } + } + kvmppc_set_gpr(vcpu, vcpu->arch.io_gpr, gpr); switch (vcpu->arch.io_gpr & REG_EXT_MASK) { @@ -337,10 +356,23 @@ int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu, vcpu->arch.mmio_is_bigendian = is_bigendian; vcpu->mmio_needed = 1; vcpu->mmio_is_write = 0; + 
vcpu->arch.mmio_sign_extend = 0; return EMULATE_DO_MMIO; } +/* Same as above, but sign extends */ +int kvmppc_handle_loads(struct kvm_run *run, struct kvm_vcpu *vcpu, + unsigned int rt, unsigned int bytes, int is_bigendian) +{ + int r; + + r = kvmppc_handle_load(run, vcpu, rt, bytes, is_bigendian); + vcpu->arch.mmio_sign_extend = 1; + + return r; +} + int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu, u64 val, unsigned int bytes, int is_bigendian) { -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
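The sign-extension step added by this patch can be pulled out into a pure function for illustration (mask values as in the patch; the `CONFIG_PPC64` guard is dropped here since the function always works on a 64-bit value):

```c
#include <stdint.h>

/* Mask-based sign extension as in the patch: widen a 1-, 2- or 4-byte
 * MMIO result (assumed zero-extended on entry) to a signed 64-bit
 * register value. */
static uint64_t mmio_sign_extend(uint64_t gpr, unsigned len)
{
	switch (len) {
	case 4:
		if (gpr & 0x80000000ull)
			gpr |= 0xffffffff00000000ull;
		break;
	case 2:
		if (gpr & 0x8000)
			gpr |= 0xffffffffffff0000ull;
		break;
	case 1:
		if (gpr & 0x80)
			gpr |= 0xffffffffffffff00ull;
		break;
	}
	return gpr;
}
```

This is exactly what LHA/LHAU need: the 2-byte case turns a halfword like 0x8000 into the 64-bit value -32768 before it is written back to the GPR.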
[parent not found: <1265298925-31954-4-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>]
* Re: [PATCH 03/18] KVM: PPC: Teach MMIO Signedness
  [not found]     ` <1265298925-31954-4-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
@ 2010-02-07 12:32       ` Avi Kivity
  [not found]         ` <4B6EB2D7.1030500-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-02-07 12:32 UTC (permalink / raw)
To: Alexander Graf; +Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA

On 02/04/2010 05:55 PM, Alexander Graf wrote:
> The guest I was trying to get to run uses the LHA and LHAU instructions.
> Those instructions basically do a load, but also sign extend the result.
>
> Since we need to fill our registers by hand when doing MMIO, we also need
> to sign extend manually.
>
> This patch implements sign extended MMIO and the LHA(U) instructions.
>
> @@ -300,6 +300,25 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
>  	}
>  }
>
> +	if (vcpu->arch.mmio_sign_extend) {
> +		switch (run->mmio.len) {
> +#ifdef CONFIG_PPC64
> +		case 4:
> +			if (gpr & 0x80000000)
> +				gpr |= 0xffffffff00000000ULL;
> +			break;

Wouldn't

  gpr = (s64)(gpr << 32) >> 32;

work? Not sure if >> is guaranteed to sign extend.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 53+ messages in thread
[parent not found: <4B6EB2D7.1030500-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 03/18] KVM: PPC: Teach MMIO Signedness
  [not found]         ` <4B6EB2D7.1030500-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2010-02-07 15:51           ` Alexander Graf
  [not found]             ` <3CEF000F-1751-4E65-A08A-C71B2CE8DAEE-l3A5Bk7waGM@public.gmane.org>
  2010-02-07 16:27           ` Anthony Liguori
  1 sibling, 1 reply; 53+ messages in thread
From: Alexander Graf @ 2010-02-07 15:51 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
    kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 07.02.2010, at 13:32, Avi Kivity <avi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

> On 02/04/2010 05:55 PM, Alexander Graf wrote:
>> The guest I was trying to get to run uses the LHA and LHAU instructions.
>> Those instructions basically do a load, but also sign extend the result.
>>
>> Since we need to fill our registers by hand when doing MMIO, we also need
>> to sign extend manually.
>>
>> This patch implements sign extended MMIO and the LHA(U) instructions.
>>
>> @@ -300,6 +300,25 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
>>  	}
>>  }
>>
>> +	if (vcpu->arch.mmio_sign_extend) {
>> +		switch (run->mmio.len) {
>> +#ifdef CONFIG_PPC64
>> +		case 4:
>> +			if (gpr & 0x80000000)
>> +				gpr |= 0xffffffff00000000ULL;
>> +			break;
>
> Wouldn't
>
>   gpr = (s64)(gpr << 32) >> 32;
>
> work? Not sure if >> is guaranteed to sign extend.

Not sure either. The code as is is rather obvious, IMHO, so I wouldn't
want to replace it with anything that's even remotely magical.

Alex

^ permalink raw reply	[flat|nested] 53+ messages in thread
[parent not found: <3CEF000F-1751-4E65-A08A-C71B2CE8DAEE-l3A5Bk7waGM@public.gmane.org>]
* Re: [PATCH 03/18] KVM: PPC: Teach MMIO Signedness
  [not found]             ` <3CEF000F-1751-4E65-A08A-C71B2CE8DAEE-l3A5Bk7waGM@public.gmane.org>
@ 2010-02-07 16:15               ` Avi Kivity
  0 siblings, 0 replies; 53+ messages in thread
From: Avi Kivity @ 2010-02-07 16:15 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
    kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 02/07/2010 05:51 PM, Alexander Graf wrote:
>>> +	if (vcpu->arch.mmio_sign_extend) {
>>> +		switch (run->mmio.len) {
>>> +#ifdef CONFIG_PPC64
>>> +		case 4:
>>> +			if (gpr & 0x80000000)
>>> +				gpr |= 0xffffffff00000000ULL;
>>> +			break;
>>
>> Wouldn't
>>
>>   gpr = (s64)(gpr << 32) >> 32;
>>
>> work? Not sure if >> is guaranteed to sign extend.
>
> Not sure either. The code as is is rather obvious imho, so I wouldn't
> want to replace it with anything that's even remotely magical.

That's fair.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 53+ messages in thread
* Re: [PATCH 03/18] KVM: PPC: Teach MMIO Signedness
  [not found]         ` <4B6EB2D7.1030500-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2010-02-07 15:51           ` Alexander Graf
@ 2010-02-07 16:27           ` Anthony Liguori
  2010-02-07 21:35             ` Alexander Graf
  1 sibling, 1 reply; 53+ messages in thread
From: Anthony Liguori @ 2010-02-07 16:27 UTC (permalink / raw)
To: Avi Kivity
Cc: Alexander Graf, kvm-ppc-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA

On 02/07/2010 06:32 AM, Avi Kivity wrote:
> On 02/04/2010 05:55 PM, Alexander Graf wrote:
>> The guest I was trying to get to run uses the LHA and LHAU instructions.
>> Those instructions basically do a load, but also sign extend the result.
>>
>> Since we need to fill our registers by hand when doing MMIO, we also need
>> to sign extend manually.
>>
>> This patch implements sign extended MMIO and the LHA(U) instructions.
>>
>> @@ -300,6 +300,25 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
>>  	}
>>  }
>>
>> +	if (vcpu->arch.mmio_sign_extend) {
>> +		switch (run->mmio.len) {
>> +#ifdef CONFIG_PPC64
>> +		case 4:
>> +			if (gpr & 0x80000000)
>> +				gpr |= 0xffffffff00000000ULL;
>> +			break;
>
> Wouldn't
>
>   gpr = (s64)(gpr << 32) >> 32;
>
> work? Not sure if >> is guaranteed to sign extend.

It's technically implementation dependent but I don't know of an
implementation that doesn't sign extend.

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 53+ messages in thread
* Re: [PATCH 03/18] KVM: PPC: Teach MMIO Signedness
  2010-02-07 16:27           ` Anthony Liguori
@ 2010-02-07 21:35             ` Alexander Graf
  [not found]               ` <1CA08386-21CA-4B4F-A1E6-56C4DE584BA6-l3A5Bk7waGM@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Graf @ 2010-02-07 21:35 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Avi Kivity, kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 07.02.2010, at 17:27, Anthony Liguori <anthony@codemonkey.ws> wrote:

> On 02/07/2010 06:32 AM, Avi Kivity wrote:
>> On 02/04/2010 05:55 PM, Alexander Graf wrote:
>>> The guest I was trying to get to run uses the LHA and LHAU instructions.
>>> Those instructions basically do a load, but also sign extend the result.
>>>
>>> Since we need to fill our registers by hand when doing MMIO, we also need
>>> to sign extend manually.
>>>
>>> This patch implements sign extended MMIO and the LHA(U) instructions.
>>>
>>> @@ -300,6 +300,25 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
>>>  	}
>>>  }
>>>
>>> +	if (vcpu->arch.mmio_sign_extend) {
>>> +		switch (run->mmio.len) {
>>> +#ifdef CONFIG_PPC64
>>> +		case 4:
>>> +			if (gpr & 0x80000000)
>>> +				gpr |= 0xffffffff00000000ULL;
>>> +			break;
>>
>> Wouldn't
>>
>>   gpr = (s64)(gpr << 32) >> 32;
>>
>> work? Not sure if >> is guaranteed to sign extend.
>
> It's technically implementation dependent but I don't know of an
> implementation that doesn't sign extend.

Hrm, would

  gpr = (s64)(s32)gpr;

work? :)

Alex

^ permalink raw reply	[flat|nested] 53+ messages in thread
[parent not found: <1CA08386-21CA-4B4F-A1E6-56C4DE584BA6-l3A5Bk7waGM@public.gmane.org>]
* Re: [PATCH 03/18] KVM: PPC: Teach MMIO Signedness
  [not found]               ` <1CA08386-21CA-4B4F-A1E6-56C4DE584BA6-l3A5Bk7waGM@public.gmane.org>
@ 2010-02-07 22:13                 ` Anthony Liguori
  0 siblings, 0 replies; 53+ messages in thread
From: Anthony Liguori @ 2010-02-07 22:13 UTC (permalink / raw)
To: Alexander Graf
Cc: Avi Kivity, kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
    kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 02/07/2010 03:35 PM, Alexander Graf wrote:
>> It's technically implementation dependent but I don't know of an
>> implementation that doesn't sign extend.
>
> Hrm, would
>
>   gpr = (s64)(s32)gpr;
>
> work? :)

Yes. Integer promotion does guarantee sign extension.

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 53+ messages in thread
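The distinction the thread settles on is worth making concrete: right-shifting a negative signed value is implementation-defined in C, while converting to a narrower signed type and back up is the portable way to sign extend. A small check that the cast chain agrees with the patch's mask approach (function names are mine):

```c
#include <stdint.h>

/* The cast chain from the thread: narrowing to int32_t and widening
 * back to int64_t sign extends by the integer conversion rules. */
static uint64_t sext32_cast(uint64_t gpr)
{
	return (uint64_t)(int64_t)(int32_t)(uint32_t)gpr;
}

/* The patch's mask approach, with an explicit clear of the upper bits
 * so both functions are comparable for arbitrary inputs (in the patch
 * the upper bits are already zero after an MMIO load). */
static uint64_t sext32_mask(uint64_t gpr)
{
	gpr &= 0xffffffffull;
	if (gpr & 0x80000000ull)
		gpr |= 0xffffffff00000000ull;
	return gpr;
}
```

Both produce e.g. 0xffffffff80000000 for an input of 0x80000000, so the choice between them is purely one of readability, which is the point Alexander makes about avoiding "remotely magical" code.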
* [PATCH 04/18] KVM: PPC: Add AGAIN type for emulation return [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> 2010-02-04 15:55 ` [PATCH 02/18] KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs Alexander Graf 2010-02-04 15:55 ` [PATCH 03/18] KVM: PPC: Teach MMIO Signedness Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 05/18] KVM: PPC: Add hidden flag for paired singles Alexander Graf ` (6 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc-u79uwXL29TY76Z2rM5mHXA; +Cc: kvm-u79uwXL29TY76Z2rM5mHXA Emulation of an instruction can have different outcomes. It can succeed, fail, require MMIO, do funky BookE stuff - or it can just realize something's odd and will be fixed the next time around. Exactly that is what EMULATE_AGAIN means. Using that flag we can now tell the caller that nothing happened, but we still want to go back to the guest and see what happens next time we come around. Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org> --- arch/powerpc/include/asm/kvm_ppc.h | 1 + arch/powerpc/kvm/book3s.c | 3 +++ arch/powerpc/kvm/emulate.c | 4 +++- 3 files changed, 7 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index a288dd2..0761218 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -37,6 +37,7 @@ enum emulation_result { EMULATE_DO_MMIO, /* kvm_run filled with MMIO request */ EMULATE_DO_DCR, /* kvm_run filled with DCR request */ EMULATE_FAIL, /* can't emulate this instruction */ + EMULATE_AGAIN, /* something went wrong. 
go again */ }; extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 9a271f0..1e5e0fc 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -788,6 +788,9 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, case EMULATE_DONE: r = RESUME_GUEST_NV; break; + case EMULATE_AGAIN: + r = RESUME_GUEST; + break; case EMULATE_FAIL: printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n", __func__, vcpu->arch.pc, vcpu->arch.last_inst); diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c index ef2ff59..c3bab7f 100644 --- a/arch/powerpc/kvm/emulate.c +++ b/arch/powerpc/kvm/emulate.c @@ -486,7 +486,9 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu) if (emulated == EMULATE_FAIL) { emulated = kvmppc_core_emulate_op(run, vcpu, inst, &advance); - if (emulated == EMULATE_FAIL) { + if (emulated == EMULATE_AGAIN) { + advance = 0; + } else if (emulated == EMULATE_FAIL) { advance = 0; printk(KERN_ERR "Couldn't emulate instruction 0x%08x " "(op %d xop %d)\n", inst, get_op(inst), get_xop(inst)); -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
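The contract EMULATE_AGAIN introduces can be summarized as a pure function: from the exit handler's point of view it behaves like success (re-enter the guest), but from the emulator's point of view it behaves like failure (the PC must not advance, so the same instruction traps again). The enum values below are from the patch; the RESUME_* encodings are illustrative placeholders, not the kernel's actual values:

```c
/* Emulation results as in kvm_ppc.h after this patch. */
enum emulation_result {
	EMULATE_DONE,		/* no further processing */
	EMULATE_DO_MMIO,	/* kvm_run filled with MMIO request */
	EMULATE_DO_DCR,		/* kvm_run filled with DCR request */
	EMULATE_FAIL,		/* can't emulate this instruction */
	EMULATE_AGAIN,		/* something went wrong; go again */
};

enum { RESUME_GUEST, RESUME_GUEST_NV, RESUME_HOST };	/* placeholders */

/* book3s.c: AGAIN resumes the guest, like DONE (which additionally
 * skips reloading nonvolatile state). */
static int resume_action(enum emulation_result er)
{
	switch (er) {
	case EMULATE_DONE:	return RESUME_GUEST_NV;
	case EMULATE_AGAIN:	return RESUME_GUEST;
	default:		return RESUME_HOST;
	}
}

/* emulate.c: AGAIN must not advance the PC, like FAIL. */
static int should_advance_pc(enum emulation_result er)
{
	return er != EMULATE_AGAIN && er != EMULATE_FAIL;
}
```

That asymmetry is the whole point of the new value: nothing happened, so we retry the same instruction on the next trap instead of reporting an emulation failure.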
* [PATCH 05/18] KVM: PPC: Add hidden flag for paired singles
  [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
    (2 preceding siblings ...)
  2010-02-04 15:55 ` [PATCH 04/18] KVM: PPC: Add AGAIN type for emulation return Alexander Graf
@ 2010-02-04 15:55 ` Alexander Graf
  2010-02-04 15:55 ` [PATCH 06/18] KVM: PPC: Add Gekko SPRs Alexander Graf
  ` (5 subsequent siblings)
  9 siblings, 0 replies; 53+ messages in thread
From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw)
To: kvm-ppc-u79uwXL29TY76Z2rM5mHXA; +Cc: kvm-u79uwXL29TY76Z2rM5mHXA

The Gekko implements an extension called paired singles. When the guest
wants to use that extension, we need to make sure we're not running the
host FPU, because all FPU instructions need to be emulated to accommodate
the additional operations that occur.

This patch adds an hflag to track whether we're in paired single mode.

Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
 arch/powerpc/include/asm/kvm_asm.h |    1 +
 arch/powerpc/kvm/book3s.c          |    4 ++++
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index aadf2dd..7238c04 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -88,6 +88,7 @@

 #define BOOK3S_HFLAG_DCBZ32			0x1
 #define BOOK3S_HFLAG_SLB			0x2
+#define BOOK3S_HFLAG_PAIRED_SINGLE		0x4

 #define RESUME_FLAG_NV          (1<<0)  /* Reload guest nonvolatile state? */
 #define RESUME_FLAG_HOST        (1<<1)  /* Resume host? */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 1e5e0fc..96f7be4 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -638,6 +638,10 @@ static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 	u64 *thread_fpr = (u64*)t->fpr;
 	int i;

+	/* When we have paired singles, we emulate in software */
+	if (vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE)
+		return RESUME_GUEST;
+
 	if (!(vcpu->arch.msr & msr)) {
 		kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
 		return RESUME_GUEST;
--
1.6.0.2

^ permalink raw reply related	[flat|nested] 53+ messages in thread
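The hflag is a plain host-side bitmask check: when paired-single mode is on, the FPU-unavailable handler bails out early so the real FPU is never handed to the guest and every FPU instruction keeps trapping into the emulator. A minimal sketch of that predicate (flag values from the patch, helper name is mine):

```c
#include <stdint.h>

#define BOOK3S_HFLAG_DCBZ32		0x1
#define BOOK3S_HFLAG_SLB		0x2
#define BOOK3S_HFLAG_PAIRED_SINGLE	0x4

/* When this returns true, kvmppc_handle_ext() must not load the host
 * FPU for the guest; the instruction is emulated in software instead. */
static inline int uses_soft_fpu(uint32_t hflags)
{
	return (hflags & BOOK3S_HFLAG_PAIRED_SINGLE) != 0;
}
```

Keeping this as a flag rather than re-deriving it from HID2 on every trap makes the hot exit path a single AND instead of an SPR decode.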
* [PATCH 06/18] KVM: PPC: Add Gekko SPRs [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> ` (3 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 05/18] KVM: PPC: Add hidden flag for paired singles Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 12/18] KVM: PPC: Make ext giveup non-static Alexander Graf ` (4 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc-u79uwXL29TY76Z2rM5mHXA; +Cc: kvm-u79uwXL29TY76Z2rM5mHXA The Gekko has some SPR values that differ from other PPC core values and also some additional ones. Let's add support for them in our mfspr/mtspr emulator. Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org> --- arch/powerpc/include/asm/kvm_book3s.h | 1 + arch/powerpc/include/asm/reg.h | 10 +++++ arch/powerpc/kvm/book3s_64_emulate.c | 70 +++++++++++++++++++++++++++++++++ 3 files changed, 81 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index db7db0a..d28ee83 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -82,6 +82,7 @@ struct kvmppc_vcpu_book3s { struct kvmppc_bat ibat[8]; struct kvmppc_bat dbat[8]; u64 hid[6]; + u64 gqr[8]; int slb_nr; u64 sdr1; u64 dsisr; diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 5572e86..8a69a39 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -293,10 +293,12 @@ #define HID1_ABE (1<<10) /* 7450 Address Broadcast Enable */ #define HID1_PS (1<<16) /* 750FX PLL selection */ #define SPRN_HID2 0x3F8 /* Hardware Implementation Register 2 */ +#define SPRN_HID2_GEKKO 0x398 /* Gekko HID2 Register */ #define SPRN_IABR 0x3F2 /* Instruction Address Breakpoint Register */ #define SPRN_IABR2 0x3FA /* 83xx */ #define SPRN_IBCR 0x135 /* 83xx Insn Breakpoint Control Reg */ #define 
SPRN_HID4 0x3F4 /* 970 HID4 */ +#define SPRN_HID4_GEKKO 0x3F3 /* Gekko HID4 */ #define SPRN_HID5 0x3F6 /* 970 HID5 */ #define SPRN_HID6 0x3F9 /* BE HID 6 */ #define HID6_LB (0x0F<<12) /* Concurrent Large Page Modes */ @@ -465,6 +467,14 @@ #define SPRN_VRSAVE 0x100 /* Vector Register Save Register */ #define SPRN_XER 0x001 /* Fixed Point Exception Register */ +#define SPRN_MMCR0_GEKKO 0x3B8 /* Gekko Monitor Mode Control Register 0 */ +#define SPRN_MMCR1_GEKKO 0x3BC /* Gekko Monitor Mode Control Register 1 */ +#define SPRN_PMC1_GEKKO 0x3B9 /* Gekko Performance Monitor Control 1 */ +#define SPRN_PMC2_GEKKO 0x3BA /* Gekko Performance Monitor Control 2 */ +#define SPRN_PMC3_GEKKO 0x3BD /* Gekko Performance Monitor Control 3 */ +#define SPRN_PMC4_GEKKO 0x3BE /* Gekko Performance Monitor Control 4 */ +#define SPRN_WPAR_GEKKO 0x399 /* Gekko Write Pipe Address Register */ + #define SPRN_SCOMC 0x114 /* SCOM Access Control */ #define SPRN_SCOMD 0x115 /* SCOM Access DATA */ diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c index 2b0ee7e..bb4a7c1 100644 --- a/arch/powerpc/kvm/book3s_64_emulate.c +++ b/arch/powerpc/kvm/book3s_64_emulate.c @@ -42,6 +42,15 @@ /* DCBZ is actually 1014, but we patch it to 1010 so we get a trap */ #define OP_31_XOP_DCBZ 1010 +#define SPRN_GQR0 912 +#define SPRN_GQR1 913 +#define SPRN_GQR2 914 +#define SPRN_GQR3 915 +#define SPRN_GQR4 916 +#define SPRN_GQR5 917 +#define SPRN_GQR6 918 +#define SPRN_GQR7 919 + int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, unsigned int inst, int *advance) { @@ -268,7 +277,29 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs) case SPRN_HID2: to_book3s(vcpu)->hid[2] = spr_val; break; + case SPRN_HID2_GEKKO: + to_book3s(vcpu)->hid[2] = spr_val; + /* HID2.PSE controls paired single on gekko */ + switch (vcpu->arch.pvr) { + case 0x00080200: /* lonestar 2.0 */ + case 0x00088202: /* lonestar 2.2 */ + case 0x70000100: /* gekko 1.0 */ + case 
0x00080100: /* gekko 2.0 */ + case 0x00083203: /* gekko 2.3a */ + case 0x00083213: /* gekko 2.3b */ + case 0x00083204: /* gekko 2.4 */ + case 0x00083214: /* gekko 2.4e (8SE) - retail HW2 */ + if (spr_val & (1 << 29)) { /* HID2.PSE */ + vcpu->arch.hflags |= BOOK3S_HFLAG_PAIRED_SINGLE; + kvmppc_giveup_ext(vcpu, MSR_FP); + } else { + vcpu->arch.hflags &= ~BOOK3S_HFLAG_PAIRED_SINGLE; + } + break; + } + break; case SPRN_HID4: + case SPRN_HID4_GEKKO: to_book3s(vcpu)->hid[4] = spr_val; break; case SPRN_HID5: @@ -278,12 +309,30 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs) (mfmsr() & MSR_HV)) vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32; break; + case SPRN_GQR0: + case SPRN_GQR1: + case SPRN_GQR2: + case SPRN_GQR3: + case SPRN_GQR4: + case SPRN_GQR5: + case SPRN_GQR6: + case SPRN_GQR7: + to_book3s(vcpu)->gqr[sprn - SPRN_GQR0] = spr_val; + break; case SPRN_ICTC: case SPRN_THRM1: case SPRN_THRM2: case SPRN_THRM3: case SPRN_CTRLF: case SPRN_CTRLT: + case SPRN_L2CR: + case SPRN_MMCR0_GEKKO: + case SPRN_MMCR1_GEKKO: + case SPRN_PMC1_GEKKO: + case SPRN_PMC2_GEKKO: + case SPRN_PMC3_GEKKO: + case SPRN_PMC4_GEKKO: + case SPRN_WPAR_GEKKO: break; default: printk(KERN_INFO "KVM: invalid SPR write: %d\n", sprn); @@ -320,19 +369,40 @@ int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt) kvmppc_set_gpr(vcpu, rt, to_book3s(vcpu)->hid[1]); break; case SPRN_HID2: + case SPRN_HID2_GEKKO: kvmppc_set_gpr(vcpu, rt, to_book3s(vcpu)->hid[2]); break; case SPRN_HID4: + case SPRN_HID4_GEKKO: kvmppc_set_gpr(vcpu, rt, to_book3s(vcpu)->hid[4]); break; case SPRN_HID5: kvmppc_set_gpr(vcpu, rt, to_book3s(vcpu)->hid[5]); break; + case SPRN_GQR0: + case SPRN_GQR1: + case SPRN_GQR2: + case SPRN_GQR3: + case SPRN_GQR4: + case SPRN_GQR5: + case SPRN_GQR6: + case SPRN_GQR7: + kvmppc_set_gpr(vcpu, rt, + to_book3s(vcpu)->gqr[sprn - SPRN_GQR0]); + break; case SPRN_THRM1: case SPRN_THRM2: case SPRN_THRM3: case SPRN_CTRLF: case SPRN_CTRLT: + case SPRN_L2CR: + case 
SPRN_MMCR0_GEKKO: + case SPRN_MMCR1_GEKKO: + case SPRN_PMC1_GEKKO: + case SPRN_PMC2_GEKKO: + case SPRN_PMC3_GEKKO: + case SPRN_PMC4_GEKKO: + case SPRN_WPAR_GEKKO: kvmppc_set_gpr(vcpu, rt, 0); break; default: -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 12/18] KVM: PPC: Make ext giveup non-static [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> ` (4 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 06/18] KVM: PPC: Add Gekko SPRs Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 14/18] KVM: PPC: Fix error in BAT assignment Alexander Graf ` (3 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc-u79uwXL29TY76Z2rM5mHXA; +Cc: kvm-u79uwXL29TY76Z2rM5mHXA We need to call the ext giveup handlers from code outside of book3s.c. So let's make it non-static. Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org> --- arch/powerpc/include/asm/kvm_book3s.h | 1 + arch/powerpc/kvm/book3s.c | 3 +-- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index 8463976..fd43210 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -120,6 +120,7 @@ extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, b extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec); extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat, bool upper, u32 val); +extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr); extern u32 kvmppc_trampoline_lowmem; extern u32 kvmppc_trampoline_enter; diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index e8dccc6..99e9e07 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -35,7 +35,6 @@ /* #define EXIT_DEBUG_SIMPLE */ /* #define DEBUG_EXT */ -static void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr); static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr, ulong msr); @@ -597,7 +596,7 @@ static inline int get_fpr_index(int i) } /* Give up external provider (FPU, Altivec, VSX) */ -static 
void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr) +void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr) { struct thread_struct *t = &current->thread; u64 *vcpu_fpr = vcpu->arch.fpr; -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 14/18] KVM: PPC: Fix error in BAT assignment [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> ` (5 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 12/18] KVM: PPC: Make ext giveup non-static Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 15/18] KVM: PPC: Add helpers to modify ppc fields Alexander Graf ` (2 subsequent siblings) 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc-u79uwXL29TY76Z2rM5mHXA; +Cc: kvm-u79uwXL29TY76Z2rM5mHXA BATs didn't work. Well, they did, but only up to BAT3. As soon as we came to BAT4 the offset calculation was screwed up and we ended up overwriting BAT0-3. Fortunately, Linux hasn't been using BAT4+. It's still a good idea to write correct code though. Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org> --- arch/powerpc/kvm/book3s_64_emulate.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c index a93aa47..1d1b952 100644 --- a/arch/powerpc/kvm/book3s_64_emulate.c +++ b/arch/powerpc/kvm/book3s_64_emulate.c @@ -233,13 +233,13 @@ static void kvmppc_write_bat(struct kvm_vcpu *vcpu, int sprn, u32 val) bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT0U) / 2]; break; case SPRN_IBAT4U ... SPRN_IBAT7L: - bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT4U) / 2]; + bat = &vcpu_book3s->ibat[4 + ((sprn - SPRN_IBAT4U) / 2)]; break; case SPRN_DBAT0U ... SPRN_DBAT3L: bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT0U) / 2]; break; case SPRN_DBAT4U ... SPRN_DBAT7L: - bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT4U) / 2]; + bat = &vcpu_book3s->dbat[4 + ((sprn - SPRN_DBAT4U) / 2)]; break; default: BUG(); -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 15/18] KVM: PPC: Add helpers to modify ppc fields [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> ` (6 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 14/18] KVM: PPC: Fix error in BAT assignment Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-04 15:55 ` [PATCH 16/18] KVM: PPC: Enable program interrupt to do MMIO Alexander Graf 2010-02-07 12:54 ` [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Avi Kivity 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc-u79uwXL29TY76Z2rM5mHXA; +Cc: kvm-u79uwXL29TY76Z2rM5mHXA The PowerPC specification always lists bits from MSB to LSB. That is really confusing when you're trying to write C code, because it fits in pretty badly with the normal (1 << xx) schemes. So I came up with some nice wrappers that allow to get and set fields in a u64 with bit numbers exactly as given in the spec. That makes the code in KVM and the spec easier comparable. Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org> --- arch/powerpc/include/asm/kvm_ppc.h | 33 +++++++++++++++++++++++++++++++++ 1 files changed, 33 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 0761218..c7fcdd7 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -103,6 +103,39 @@ extern void kvmppc_booke_exit(void); extern void kvmppc_core_destroy_mmu(struct kvm_vcpu *vcpu); +/* + * Cuts out inst bits with ordering according to spec. + * That means the leftmost bit is zero. All given bits are included. + */ +static inline u32 kvmppc_get_field(u64 inst, int msb, int lsb) +{ + u32 r; + u32 mask; + + BUG_ON(msb > lsb); + + mask = (1 << (lsb - msb + 1)) - 1; + r = (inst >> (63 - lsb)) & mask; + + return r; +} + +/* + * Replaces inst bits with ordering according to spec. 
+ */ +static inline u32 kvmppc_set_field(u64 inst, int msb, int lsb, int value) +{ + u32 r; + u32 mask; + + BUG_ON(msb > lsb); + + mask = ((1 << (lsb - msb + 1)) - 1) << (63 - lsb); + r = (inst & ~mask) | ((value << (63 - lsb)) & mask); + + return r; +} + #ifdef CONFIG_PPC_BOOK3S /* We assume we're always acting on the current vcpu */ -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 16/18] KVM: PPC: Enable program interrupt to do MMIO [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> ` (7 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 15/18] KVM: PPC: Add helpers to modify ppc fields Alexander Graf @ 2010-02-04 15:55 ` Alexander Graf 2010-02-07 12:54 ` [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Avi Kivity 9 siblings, 0 replies; 53+ messages in thread From: Alexander Graf @ 2010-02-04 15:55 UTC (permalink / raw) To: kvm-ppc-u79uwXL29TY76Z2rM5mHXA; +Cc: kvm-u79uwXL29TY76Z2rM5mHXA When we get a program interrupt we usually don't expect it to perform an MMIO operation. But why not? When we emulate paired singles, we can end up loading or storing to an MMIO address - and the handling of those happens in the program interrupt handler. So let's teach the program interrupt handler how to deal with EMULATE_MMIO. Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org> --- arch/powerpc/kvm/book3s.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 99e9e07..f842d1d 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -840,6 +840,10 @@ program_interrupt: kvmppc_core_queue_program(vcpu, flags); r = RESUME_GUEST; break; + case EMULATE_DO_MMIO: + run->exit_reason = KVM_EXIT_MMIO; + r = RESUME_HOST_NV; + break; default: BUG(); } -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests [not found] ` <1265298925-31954-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org> ` (8 preceding siblings ...) 2010-02-04 15:55 ` [PATCH 16/18] KVM: PPC: Enable program interrupt to do MMIO Alexander Graf @ 2010-02-07 12:54 ` Avi Kivity [not found] ` <4B6EB7F6.10304-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 9 siblings, 1 reply; 53+ messages in thread From: Avi Kivity @ 2010-02-07 12:54 UTC (permalink / raw) To: Alexander Graf; +Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA On 02/04/2010 05:55 PM, Alexander Graf wrote: > In an effort to get KVM on PPC more useful for other userspace users than > Qemu, I figured it'd be a nice idea to implement virtualization of the > Gekko CPU. > > The Gekko is the CPU used in the GameCube. In a slightly more modern > fashion it lives on in the Wii today. > > Using this patch set and a modified version of Dolphin, I was able to > virtualize simple GameCube demos on a 970MP system. > > As always, while getting this to run I stumbled across several broken > parts and fixed them as they came up. So expect some bug fixes in this > patch set too. > This is halfway into emulation rather than virtualization. What does performance look like when running fpu intensive applications? I might have missed it, but I didn't see the KVM_CAP and save/restore support for this. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests [not found] ` <4B6EB7F6.10304-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2010-02-07 15:49 ` Alexander Graf 2010-02-07 16:22 ` Avi Kivity 0 siblings, 1 reply; 53+ messages in thread From: Alexander Graf @ 2010-02-07 15:49 UTC (permalink / raw) To: Avi Kivity Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 07.02.2010 at 13:54, Avi Kivity <avi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On 02/04/2010 05:55 PM, Alexander Graf wrote: >> In an effort to get KVM on PPC more useful for other userspace >> users than >> Qemu, I figured it'd be a nice idea to implement virtualization of >> the >> Gekko CPU. >> >> The Gekko is the CPU used in the GameCube. In a slightly more modern >> fashion it lives on in the Wii today. >> >> Using this patch set and a modified version of Dolphin, I was able to >> virtualize simple GameCube demos on a 970MP system. >> >> As always, while getting this to run I stumbled across several broken >> parts and fixed them as they came up. So expect some bug fixes in >> this >> patch set too. >> > > This is halfway into emulation rather than virtualization. What > does performance look like when running fpu intensive applications? It is for the FPU. It is not for whatever runs on the CPU. I haven't benchmarked things so far. The only two choices I have to get this running are in-kernel emulation and userspace emulation. According to how x86 deals with things I suppose full state transition to userspace and continuing emulation there isn't considered a good idea. So I went with in-kernel. > > I might have missed it, but I didn't see the KVM_CAP and save/ > restore support for this. Ah, cap again. Right. Mind if I send a patch on top of the set? As far as save/restore goes, the ioctl to get/set fprs isn't even implemented (yet)! We're still really far off full state migration to/from userspace.
Alex
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests 2010-02-07 15:49 ` Alexander Graf @ 2010-02-07 16:22 ` Avi Kivity 2010-02-07 22:02 ` Alexander Graf 0 siblings, 1 reply; 53+ messages in thread From: Avi Kivity @ 2010-02-07 16:22 UTC (permalink / raw) To: Alexander Graf; +Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org On 02/07/2010 05:49 PM, Alexander Graf wrote: > Am 07.02.2010 um 13:54 schrieb Avi Kivity <avi@redhat.com>: > >> On 02/04/2010 05:55 PM, Alexander Graf wrote: >>> In an effort to get KVM on PPC more useful for other userspace users >>> than >>> Qemu, I figured it'd be a nice idea to implement virtualization of the >>> Gekko CPU. >>> >>> The Gekko is the CPU used in the GameCube. In a slightly more modern >>> fashion it lives on in the Wii today. >>> >>> Using this patch set and a modified version of Dolphin, I was able to >>> virtualize simple GameCube demos on a 970MP system. >>> >>> As always, while getting this to run I stumbled across several broken >>> parts and fixed them as they came up. So expect some bug fixes in this >>> patch set too. >>> >> >> This is halfway into emulation rather than virtualization. What does >> performance look like when running fpu intensive applications? > > It is for the FPU. It is not for whatever runs on the CPU. > > I haven't benchmarked things so far, > > The only two choices I have to get this running is in-kernel emulation > or userspace emulation. According to how x86 deals with things I > suppose full state transition to userspace and continuing emulation > there isn't considered a good idea. So I went with in-kernel. It's not a good idea for the kernel either, if it happens all the time. If a typical Gekko application uses the fpu and the emulated instructions intensively, performance will suck badly (as in: qemu/tcg will be faster). -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests 2010-02-07 16:22 ` Avi Kivity @ 2010-02-07 22:02 ` Alexander Graf 2010-02-08 8:53 ` Avi Kivity 0 siblings, 1 reply; 53+ messages in thread From: Alexander Graf @ 2010-02-07 22:02 UTC (permalink / raw) To: Avi Kivity; +Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org Avi Kivity wrote: > On 02/07/2010 05:49 PM, Alexander Graf wrote: >> Am 07.02.2010 um 13:54 schrieb Avi Kivity <avi@redhat.com>: >> >>> On 02/04/2010 05:55 PM, Alexander Graf wrote: >>>> In an effort to get KVM on PPC more useful for other userspace >>>> users than >>>> Qemu, I figured it'd be a nice idea to implement virtualization of the >>>> Gekko CPU. >>>> >>>> The Gekko is the CPU used in the GameCube. In a slightly more modern >>>> fashion it lives on in the Wii today. >>>> >>>> Using this patch set and a modified version of Dolphin, I was able to >>>> virtualize simple GameCube demos on a 970MP system. >>>> >>>> As always, while getting this to run I stumbled across several broken >>>> parts and fixed them as they came up. So expect some bug fixes in this >>>> patch set too. >>>> >>> >>> This is halfway into emulation rather than virtualization. What >>> does performance look like when running fpu intensive applications? >> >> It is for the FPU. It is not for whatever runs on the CPU. >> >> I haven't benchmarked things so far, >> >> The only two choices I have to get this running is in-kernel >> emulation or userspace emulation. According to how x86 deals with >> things I suppose full state transition to userspace and continuing >> emulation there isn't considered a good idea. So I went with in-kernel. > > It's not a good idea for the kernel either, if it happens all the > time. If a typical Gekko application uses the fpu and the emulated > instructions intensively, performance will suck badly (as in: qemu/tcg > will be faster). > Yeah, I haven't really gotten far enough to run full-blown guests yet. 
So far I'm on demos and they look pretty good. But as far as intercept speed goes - I just tried running this little piece of code in kvmctl: .global _start _start: li r3, 42 mtsprg 0, r3 mfsprg r4, 0 b _start and measured the amount of exits I get on my test machine: processor : 0 cpu : PPC970MP, altivec supported clock : 2500.000000MHz revision : 1.1 (pvr 0044 0101) ---> exits 1811108 I have no idea how we manage to get that many exits, but apparently we are. So I'm less concerned about the speed of the FPU rerouting at the moment. If it really gets unusably slow, I'd rather binary patch the guest on the fly in KVM according to rules set by the userspace client. But we'll get there when it turns out to be too slow. For now I'd rather like to have something working at all and then improve speed :-). Alex ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests 2010-02-07 22:02 ` Alexander Graf @ 2010-02-08 8:53 ` Avi Kivity [not found] ` <4B6FD118.2090207-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2010-02-09 11:00 ` Alexander Graf 0 siblings, 2 replies; 53+ messages in thread From: Avi Kivity @ 2010-02-08 8:53 UTC (permalink / raw) To: Alexander Graf; +Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org On 02/08/2010 12:02 AM, Alexander Graf wrote: > >> It's not a good idea for the kernel either, if it happens all the >> time. If a typical Gekko application uses the fpu and the emulated >> instructions intensively, performance will suck badly (as in: qemu/tcg >> will be faster). >> >> > Yeah, I haven't really gotten far enough to run full-blown guests yet. > So far I'm on demos and they look pretty good. > > But as far as intercept speed goes - I just tried running this little > piece of code in kvmctl: > > .global _start > _start: > li r3, 42 > mtsprg 0, r3 > mfsprg r4, 0 > b _start > > and measured the amount of exits I get on my test machine: > > processor : 0 > cpu : PPC970MP, altivec supported > clock : 2500.000000MHz > revision : 1.1 (pvr 0044 0101) > > ---> > > exits 1811108 > > I have no idea how we manage to get that many exits, but apparently we > are. So I'm less concerned about the speed of the FPU rerouting at the > moment. > That's pretty impressive (never saw x86 with this exit rate) but it's more than 1000 times slower than the hardware, assuming 1 fpu IPC (and the processor can probably do more). An fpu intensive application will slow to a crawl. > If it really gets unusably slow, I'd rather binary patch the guest on > the fly in KVM according to rules set by the userspace client. Is that even possible? Do those register-pair instructions and registers map 1:1 to 970 instructions and registers? > But we'll > get there when it turns out to be too slow. For now I'd rather like to > have something working at all and then improve speed :-). 
> Well, I want to see the light at the end of the tunnel first. Adding code is easy, ripping it out later not so much. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests [not found] ` <4B6FD118.2090207-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2010-02-08 10:58 ` Alexander Graf [not found] ` <87CEECB5-107A-46EB-89F5-1E1F92AC22AA-l3A5Bk7waGM@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Alexander Graf @ 2010-02-08 10:58 UTC (permalink / raw) To: Avi Kivity Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 08.02.2010, at 09:53, Avi Kivity wrote: > On 02/08/2010 12:02 AM, Alexander Graf wrote: >> >>> It's not a good idea for the kernel either, if it happens all the >>> time. If a typical Gekko application uses the fpu and the emulated >>> instructions intensively, performance will suck badly (as in: qemu/tcg >>> will be faster). >>> >>> >> Yeah, I haven't really gotten far enough to run full-blown guests yet. >> So far I'm on demos and they look pretty good. >> >> But as far as intercept speed goes - I just tried running this little >> piece of code in kvmctl: >> >> .global _start >> _start: >> li r3, 42 >> mtsprg 0, r3 >> mfsprg r4, 0 >> b _start >> >> and measured the amount of exits I get on my test machine: >> >> processor : 0 >> cpu : PPC970MP, altivec supported >> clock : 2500.000000MHz >> revision : 1.1 (pvr 0044 0101) >> >> ---> >> >> exits 1811108 >> >> I have no idea how we manage to get that many exits, but apparently we >> are. So I'm less concerned about the speed of the FPU rerouting at the >> moment. >> > > That's pretty impressive (never saw x86 with this exit rate) but it's more than 1000 times slower than the hardware, assuming 1 fpu IPC (and the processor can probably do more). An fpu intensive application will slow to a crawl. True. > >> If it really gets unusably slow, I'd rather binary patch the guest on >> the fly in KVM according to rules set by the userspace client. > > Is that even possible? Do those register-pair instructions and registers map 1:1 to 970 instructions and registers? 
Almost. Basically all I need to do is execute 2 FPU instructions instead of one for single instructions and paired single special instructions. So if I could patch the instruction to jump to some shared memory page, it'd become fast. At least as long as I figure out how to make sure we run with FP=0 in normal code, but with FP=1 in the special page ;). > >> But we'll >> get there when it turns out to be too slow. For now I'd rather like to >> have something working at all and then improve speed :-). >> > > Well, I want to see the light at the end of the tunnel first. Adding code is easy, ripping it out later not so much. Hum, so you suggest I get some real application running properly first so we can evaluate if it's fast enough? Alex-- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests [not found] ` <87CEECB5-107A-46EB-89F5-1E1F92AC22AA-l3A5Bk7waGM@public.gmane.org> @ 2010-02-08 11:09 ` Avi Kivity [not found] ` <4B6FF0E6.6060309-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Avi Kivity @ 2010-02-08 11:09 UTC (permalink / raw) To: Alexander Graf Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 02/08/2010 12:58 PM, Alexander Graf wrote: >>> If it really gets unusably slow, I'd rather binary patch the guest on >>> the fly in KVM according to rules set by the userspace client. >>> >> Is that even possible? Do those register-pair instructions and registers map 1:1 to 970 instructions and registers? >> > Almost. Basically all I need to do is execute 2 FPU instructions instead of one for single instructions and paired single special instructions. So if I could patch the instruction to jump to some shared memory page, it'd become fast. At least as long as I figure out how to make sure we run with FP=0 in normal code, but with FP=1 in the special page ;). > How do you locate a free virtual address to poke your shared memory page into? What if the guest kernel instantiates it later? Aren't direct jumps limited in their offset? What if an exception happens in the shared memory page? Patching is hard, let's go shopping. >>> But we'll >>> get there when it turns out to be too slow. For now I'd rather like to >>> have something working at all and then improve speed :-). >>> >>> >> Well, I want to see the light at the end of the tunnel first. Adding code is easy, ripping it out later not so much. >> > Hum, so you suggest I get some real application running properly first so we can evaluate if it's fast enough? > Yes, a real application typical for whatever use case you envision for Gekko emulation (can you shed a few words on that please). 
-- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests [not found] ` <4B6FF0E6.6060309-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2010-02-08 11:30 ` Alexander Graf [not found] ` <A5BC5A7E-D45B-4BAF-804A-B364810F50DA-l3A5Bk7waGM@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Alexander Graf @ 2010-02-08 11:30 UTC (permalink / raw) To: Avi Kivity Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 08.02.2010, at 12:09, Avi Kivity wrote: > On 02/08/2010 12:58 PM, Alexander Graf wrote: >>>> If it really gets unusably slow, I'd rather binary patch the guest on >>>> the fly in KVM according to rules set by the userspace client. >>>> >>> Is that even possible? Do those register-pair instructions and registers map 1:1 to 970 instructions and registers? >>> >> Almost. Basically all I need to do is execute 2 FPU instructions instead of one for single instructions and paired single special instructions. So if I could patch the instruction to jump to some shared memory page, it'd become fast. At least as long as I figure out how to make sure we run with FP=0 in normal code, but with FP=1 in the special page ;). >> > > How do you locate a free virtual address to poke your shared memory page into? Most applications use the same virtual memory layout. I'm 100% confident we can make the region not collide with those. For applications actually doing memory mapping themselves, there is Linux, where we know the layout too, and one or two special applications. But I think it's feasible. At least for 99.9% of the cases. > What if the guest kernel instantiates it later? Well, then we're screwed and need to fall back to trapping and emulating like my patch does now. I guess we could blacklist those guests. > Aren't direct jumps limited in their offset? Yes. We can do an absolute branch do negative addresses, effectively jumping to 0xffffffff - x whereas x is 15 bits & ~3 IIRC. 
That's definitely enough for at least a shared page for registers and a jump table :-). > What if an exception happens in the shared memory page? Well, then the guest kernel needs to be gracious. I'm fairly sure it is ;-). It doesn't make sense to examine on which ip an interrupt occured. > Patching is hard, let's go shopping. Yay :) > >>>> But we'll >>>> get there when it turns out to be too slow. For now I'd rather like to >>>> have something working at all and then improve speed :-). >>>> >>>> >>> Well, I want to see the light at the end of the tunnel first. Adding code is easy, ripping it out later not so much. >>> >> Hum, so you suggest I get some real application running properly first so we can evaluate if it's fast enough? >> > > Yes, a real application typical for whatever use case you envision for Gekko emulation (can you shed a few words on that please). I did mention Dolphin, right? http://www.dolphin-emu.com/ Basically I envision that this is the easiest way to do PR for KVM on PPC. Releasing this properly will instantly raise awareness and thus potentially increase our user base by a lot. IMHO it'd also help KVM in general, keeping it in the news. Alex-- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests [not found] ` <A5BC5A7E-D45B-4BAF-804A-B364810F50DA-l3A5Bk7waGM@public.gmane.org> @ 2010-02-08 12:03 ` Avi Kivity [not found] ` <4B6FFD85.6090100-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Avi Kivity @ 2010-02-08 12:03 UTC (permalink / raw) To: Alexander Graf Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 02/08/2010 01:30 PM, Alexander Graf wrote: >>> Hum, so you suggest I get some real application running properly first so we can evaluate if it's fast enough? >>> >>> >> Yes, a real application typical for whatever use case you envision for Gekko emulation (can you shed a few words on that please). >> > > I did mention Dolphin, right? > Must have missed it. > http://www.dolphin-emu.com/ > > Basically I envision that this is the easiest way to do PR for KVM on PPC. Releasing this properly will instantly raise awareness and thus potentially increase our user base by a lot. IMHO it'd also help KVM in general, keeping it in the news. > To me it seems the intersection of gamers with ppc desktop linux owners would be rather small. I'm not opposed to merging special use cases (esp. as you're doing all of the work AND are responsible for maintainance), but I would like to be sure that it doesn't end up unusable. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests [not found] ` <4B6FFD85.6090100-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2010-02-08 12:05 ` Alexander Graf [not found] ` <939C8633-1B2C-4888-B1C1-357DF1C56CE6-l3A5Bk7waGM@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Alexander Graf @ 2010-02-08 12:05 UTC (permalink / raw) To: Avi Kivity Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 08.02.2010, at 13:03, Avi Kivity wrote: > On 02/08/2010 01:30 PM, Alexander Graf wrote: >>>> Hum, so you suggest I get some real application running properly first so we can evaluate if it's fast enough? >>>> >>>> >>> Yes, a real application typical for whatever use case you envision for Gekko emulation (can you shed a few words on that please). >>> >> >> I did mention Dolphin, right? >> > > Must have missed it. > >> http://www.dolphin-emu.com/ >> >> Basically I envision that this is the easiest way to do PR for KVM on PPC. Releasing this properly will instantly raise awareness and thus potentially increase our user base by a lot. IMHO it'd also help KVM in general, keeping it in the news. >> > > To me it seems the intersection of gamers with ppc desktop linux owners would be rather small. There are no ppc desktop linux owners left. Well - almost none. There's only servers and gamers. > I'm not opposed to merging special use cases (esp. as you're doing all of the work AND are responsible for maintainance), but I would like to be sure that it doesn't end up unusable. Yep :-). I'll try and see how far I can get on getting something real running. Then I get a feeling for how fast this whole approach is. Alex-- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-08 12:15 ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-02-08 12:15 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 02/08/2010 02:05 PM, Alexander Graf wrote:
>>> Basically I envision that this is the easiest way to do PR for KVM on PPC. Releasing this properly will instantly raise awareness and thus potentially increase our user base by a lot. IMHO it'd also help KVM in general, keeping it in the news.
>>
>> To me it seems the intersection of gamers with ppc desktop linux owners would be rather small.
>
> There are no ppc desktop linux owners left. Well - almost none. There's only servers and gamers.

... so what's the use case? Server owners won't run games, and console owners don't need kvm to run games. Unless you propose to run Wii games on the PlayStation, or something.

--
error compiling committee.c: too many arguments to function
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-08 12:31 ` Alexander Graf
  0 siblings, 0 replies; 53+ messages in thread
From: Alexander Graf @ 2010-02-08 12:31 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 08.02.2010, at 13:15, Avi Kivity wrote:

> ... so what's the use case? server owners won't run games, and console owners don't need kvm to run games. Unless you propose to run Wii games on PlayStation, or something.

It's an experiment to verify a theory I've had for quite a while, based on this talk: http://www.heise.de/fastbin/eventmanager/file/?media_id=466

The general idea is that the only difference between proprietary software and open software, for a company developing open source, is that the latter needs less marketing - simply because people outside will already be aware of the products.

Take Windows Server as an example. The reason people use it is that they know how to use Windows at home. They carry a brand they know, usability they know, and expectations they have over to the server world.

I think the same thing could apply to KVM. If there were a UI as easy to use and as user-focused as VirtualBox, awareness of KVM would rise, because people would use it on their workstations and would thus expect the same piece of software on their servers.

Since I'm no UI programmer, and I figured creating a usable UI would take way too long for me to spend time on, I decided to go with something where I could make a difference myself. And that's PPC. If you're using KVM on your game console and it works well, why not use it on your server?

Also, as mentioned earlier, I've seen different levels of awareness of the stuff I've done so far. People were extremely fascinated by, and eager to see, OS X running on KVM. For nesting, the audience was smaller, so the news it generated was way less. When it comes to commodity hardware (game consoles), I think the audience is a lot bigger again, thus generating more traction.

The same thing happened with PPC, btw. Since Apple stopped shipping PPC-based Macs, people are way less aware of it and way less interested in it. I'm pretty sure server sales are still the same as they used to be, but because there's no commodity hardware generating awareness, people believe it to be dead.

So much for the theory. We'll see if my point gets proven soon enough, I guess :-).

Alex
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-09 11:00 ` Alexander Graf
  1 sibling, 1 reply; 53+ messages in thread
From: Alexander Graf @ 2010-02-09 11:00 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

Avi Kivity wrote:
> On 02/08/2010 12:02 AM, Alexander Graf wrote:
>>> It's not a good idea for the kernel either, if it happens all the
>>> time. If a typical Gekko application uses the fpu and the emulated
>>> instructions intensively, performance will suck badly (as in: qemu/tcg
>>> will be faster).
>>
>> Yeah, I haven't really gotten far enough to run full-blown guests yet.
>> So far I'm on demos and they look pretty good.
>>
>> But as far as intercept speed goes - I just tried running this little
>> piece of code in kvmctl:
>>
>> .global _start
>> _start:
>>     li r3, 42
>>     mtsprg 0, r3
>>     mfsprg r4, 0
>>     b _start
>>
>> and measured the number of exits I get on my test machine:
>>
>> processor : 0
>> cpu       : PPC970MP, altivec supported
>> clock     : 2500.000000MHz
>> revision  : 1.1 (pvr 0044 0101)
>>
>> --->
>>
>> exits 1811108
>>
>> I have no idea how we manage to get that many exits, but apparently we
>> do. So I'm less concerned about the speed of the FPU rerouting at the
>> moment.
>
> That's pretty impressive (never saw x86 with this exit rate) but it's
> more than 1000 times slower than the hardware, assuming 1 fpu IPC (and
> the processor can probably do more). An fpu intensive application
> will slow to a crawl.

Measuring a typical Gekko application, I get about 200k-250k fpu
(incl. paired singles) instructions per second.

Alex
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-09 11:06 ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-02-09 11:06 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 02/09/2010 01:00 PM, Alexander Graf wrote:
>> That's pretty impressive (never saw x86 with this exit rate) but it's
>> more than 1000 times slower than the hardware, assuming 1 fpu IPC (and
>> the processor can probably do more). An fpu intensive application
>> will slow to a crawl.
>
> Measuring a typical Gekko application, I get about 200k-250k fpu
> (incl. paired singles) instructions per second.

Virtualized, yes? What's the rate on bare metal?

--
error compiling committee.c: too many arguments to function
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-09 11:13 ` Alexander Graf
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Graf @ 2010-02-09 11:13 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

Avi Kivity wrote:
> Virtualized, yes? What's the rate on bare metal?

Emulated. I can't measure anything on bare metal.

Alex
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-09 12:27 ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-02-09 12:27 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 02/09/2010 01:13 PM, Alexander Graf wrote:
>> Virtualized, yes? What's the rate on bare metal?
>
> Emulated. I can't measure anything on bare metal.

Well, then, the rate may be low due to virtualization overhead. Any way to compare absolute performance?

--
error compiling committee.c: too many arguments to function
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-17 14:56 ` Alexander Graf
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Graf @ 2010-02-17 14:56 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 09.02.2010, at 13:27, Avi Kivity wrote:

> Well, then, the rate may be low due to virtualization overhead. Any way to compare absolute performance?

So I changed the code according to your input, making all FPU calls explicit and getting rid of all binary patching.

On the PowerStation, again, I'm running this code (simplified to the important instructions) using kvmctl:

        li r2, 0x1234
        std r2, 0(r1)
        lfd f3, 0(r1)
        lfd f4, 0(r1)
do_mul:
        fmul f0, f3, f4
        b do_mul

With the following kvm_stat output:

    dec                2236        53
    exits          60797802   1171403
    ext_intr            379         4
    halt_wakeup           0         0
    inst_emu       60795247   1171344
    ld             60795132   1171348

So I'm getting 1171403 fmul operations per second. And that's even with non-optimized instruction fetching. Not bad.

Alex
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-17 16:03 ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-02-17 16:03 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 02/17/2010 04:56 PM, Alexander Graf wrote:
> So I changed the code according to your input, making all FPU calls explicit and getting rid of all binary patching.
> [...]
> So I'm getting 1171403 fmul operations per second. And that's even with non-optimized instruction fetching. Not bad.

It's a large number, but won't real hardware be three orders of magnitude faster?

--
error compiling committee.c: too many arguments to function
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-17 16:23 ` Alexander Graf
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Graf @ 2010-02-17 16:23 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 17.02.2010, at 17:03, Avi Kivity wrote:

> It's a large number, but won't real hardware be three orders of magnitude faster?

Yes, it would. But we don't have to care. The only thing we need to worry about is being fast enough to emulate the FPU instructions actually used in normal guests, so that the guest runs at full speed. And 1000k > 250k, so apparently we can do that, leaving some spare cycles for non-fpu instructions.

The kernel on my PS3 is still compiling. Let's see how fast I get there.

Alex
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-17 16:34 ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-02-17 16:34 UTC (permalink / raw)
To: Alexander Graf; +Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 02/17/2010 06:23 PM, Alexander Graf wrote:
> Yes, it would. But we don't have to care. The only thing we need to worry about is being fast enough to emulate the FPU instructions actually used in normal guests, so that the guest runs at full speed. And 1000k > 250k, so apparently we can do that, leaving some spare cycles for non-fpu instructions.

I'm sure 250k isn't representative of a floating point intensive program (but maybe there aren't fpu intensive applications on that cpu).

--
error compiling committee.c: too many arguments to function
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-17 18:07 ` Alexander Graf
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Graf @ 2010-02-17 18:07 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 17.02.2010, at 17:34, Avi Kivity wrote:

> I'm sure 250k isn't representative of a floating point intensive program (but maybe there aren't fpu intensive applications on that cpu).

Now you made me check how fast the real hw is. I get about 65,000,000 fmul operations per second on it.

So we're 65x slower on a PowerStation. And that's for a tight FPU-only loop. I'm still not convinced we're running into major problems.

Alex
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-18 7:40 ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-02-18 7:40 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 02/17/2010 08:07 PM, Alexander Graf wrote:
> Now you made me check how fast the real hw is. I get about 65,000,000 fmul operations per second on it.

That's surprisingly low.

> So we're 65x slower on a PowerStation. And that's for a tight FPU-only loop. I'm still not convinced we're running into major problems.

Well, it's up to you. I just hope we don't end up underperforming due to this.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
@ 2010-02-18 8:04 ` Avi Kivity
  0 siblings, 0 replies; 53+ messages in thread
From: Avi Kivity @ 2010-02-18 8:04 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On 02/18/2010 09:40 AM, Avi Kivity wrote:
>> Now you made me check how fast the real hw is. I get about 65,000,000
>> fmul operations per second on it.
>
> That's surprisingly low.

I get 3.7 Gflops on my home machine (1G loops, 4 fmuls and 4 fadds, all independent, in 2.15 seconds; otherwise I can't saturate the pipeline).

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Thread overview: 53+ messages
2010-02-04 15:55 [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Alexander Graf
2010-02-04 15:55 ` [PATCH 01/18] KVM: PPC: Add QPR registers Alexander Graf
2010-02-04 15:55 ` [PATCH 07/18] KVM: PPC: Combine extension interrupt handlers Alexander Graf
2010-02-04 15:55 ` [PATCH 08/18] KVM: PPC: Preload FPU when possible Alexander Graf
2010-02-04 15:55 ` [PATCH 09/18] KVM: PPC: Fix typo in book3s_32 debug code Alexander Graf
2010-02-04 15:55 ` [PATCH 10/18] KVM: PPC: Implement mtsr instruction emulation Alexander Graf
2010-02-04 15:55 ` [PATCH 11/18] KVM: PPC: Make software load/store return eaddr Alexander Graf
2010-02-04 15:55 ` [PATCH 13/18] KVM: PPC: Add helpers to call FPU instructions Alexander Graf
2010-02-04 15:55 ` [PATCH 17/18] KVM: PPC: Reserve a chunk of memory for opcodes Alexander Graf
2010-02-04 15:55 ` [PATCH 18/18] KVM: PPC: Implement Paired Single emulation Alexander Graf
2010-02-07 12:50   ` Avi Kivity
2010-02-07 15:57     ` Alexander Graf
2010-02-07 16:18       ` Avi Kivity
2010-02-04 15:55 ` [PATCH 02/18] KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs Alexander Graf
2010-02-07 12:29   ` Avi Kivity
2010-02-07 15:51     ` Alexander Graf
2010-02-04 15:55 ` [PATCH 03/18] KVM: PPC: Teach MMIO Signedness Alexander Graf
2010-02-07 12:32   ` Avi Kivity
2010-02-07 15:51     ` Alexander Graf
2010-02-07 16:15       ` Avi Kivity
2010-02-07 16:27         ` Anthony Liguori
2010-02-07 21:35           ` Alexander Graf
2010-02-07 22:13             ` Anthony Liguori
2010-02-04 15:55 ` [PATCH 04/18] KVM: PPC: Add AGAIN type for emulation return Alexander Graf
2010-02-04 15:55 ` [PATCH 05/18] KVM: PPC: Add hidden flag for paired singles Alexander Graf
2010-02-04 15:55 ` [PATCH 06/18] KVM: PPC: Add Gekko SPRs Alexander Graf
2010-02-04 15:55 ` [PATCH 12/18] KVM: PPC: Make ext giveup non-static Alexander Graf
2010-02-04 15:55 ` [PATCH 14/18] KVM: PPC: Fix error in BAT assignment Alexander Graf
2010-02-04 15:55 ` [PATCH 15/18] KVM: PPC: Add helpers to modify ppc fields Alexander Graf
2010-02-04 15:55 ` [PATCH 16/18] KVM: PPC: Enable program interrupt to do MMIO Alexander Graf
2010-02-07 12:54 ` [PATCH 00/18] KVM: PPC: Virtualize Gekko guests Avi Kivity
2010-02-07 15:49   ` Alexander Graf
2010-02-07 16:22     ` Avi Kivity
2010-02-07 22:02       ` Alexander Graf
2010-02-08  8:53         ` Avi Kivity
2010-02-08 10:58           ` Alexander Graf
2010-02-08 11:09             ` Avi Kivity
2010-02-08 11:30               ` Alexander Graf
2010-02-08 12:03                 ` Avi Kivity
2010-02-08 12:05                   ` Alexander Graf
2010-02-08 12:15                     ` Avi Kivity
2010-02-08 12:31                       ` Alexander Graf
2010-02-09 11:00         ` Alexander Graf
2010-02-09 11:06           ` Avi Kivity
2010-02-09 11:13             ` Alexander Graf
2010-02-09 12:27               ` Avi Kivity
2010-02-17 14:56                 ` Alexander Graf
2010-02-17 16:03                   ` Avi Kivity
2010-02-17 16:23                     ` Alexander Graf
2010-02-17 16:34                       ` Avi Kivity
2010-02-17 18:07                         ` Alexander Graf
2010-02-18  7:40                           ` Avi Kivity
2010-02-18  8:04                             ` Avi Kivity