From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christian Ehrhardt Date: Tue, 22 Jan 2008 13:25:01 +0000 Subject: Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface Message-Id: <4795EEAD.1090908@linux.vnet.ibm.com> MIME-Version: 1 Content-Type: multipart/mixed; boundary="------------080501090508090105070302" List-Id: References: <1199751688.5741.40.camel@basalt> In-Reply-To: <1199751688.5741.40.camel@basalt> To: kvm-ppc@vger.kernel.org This is a multi-part message in MIME format. --------------080501090508090105070302 Content-Type: text/plain; charset="iso-8859-1"; format="flowed" Content-Transfer-Encoding: quoted-printable It looks like we have two different needs we can solve with the approach Ho= llis suggested by implementing #2 (arch specific) and #3 (generic map this)= interfaces. Use cases - for #2 (set/get TLB elements including architecture specific in= formation): - Dump TLB, comparable to vcpu_dump we will need a tlb_dump for debugging a= nd error reporting. - Save/Restore/Migrate - even not being a need right now this might become = a need for us all end so we will need a get/set function that is able to co= mpletely save/restore the guest TLB Use Cases for #3 (insert a single mapping ) - simplifying shared userspace code by using only one common interface wher= ever possible that does not need to know which hardware it adresses --- #2 - "get/set TLB" I attached a prototype implementation of an architecture dependent mass KVM= _GET_TLB & KVM_SET_TLB which get and set the full status the guest & shadow= tlb structs assigned to a vcpu. I intend to eventually have different calls for guest and shadow structure = because e.g. we actually will never want to set shadow tlb from userspace -= this prototype is still in development and any comments are welcome (I'll = resend it once it is changed to different getter/setter for guest and shado= w tlb). @Zhang Wei - the interface I have in mind for this arch specific implementa= tion is an array of tlbe[MAX_TLB_SIZE] get-/setting all elements of either = guest or shadow tlb at once. Every architecture can define it's own tlbe st= ructure in the kernel headers and can implement special needs in their back= end for these ioctls (KVM_GET_GUEST_TLB, KVM_GET_SHADOW_TLB, KVM_SET_GUEST_= TLB, but no KVM_SET_SHADOW_TLB). --- #3 - "map this" For #3 we need some discussion how the data structure of a shared interface= might look like. For ppc440 it would be the { virtual addr, real addr, siz= e, flags } as you see the tlbe definition that is moved in my patch. I want= ed to ask what more/less needs e500 has for a "map this" interface to inser= t a single tlb-entry. We need also to clarify what flags we all need to def= ine a shared set of bits (if possible to share that flags) in the flags var= iable we can use that every architecture can map to it's own meaning in the= backend to this ioctl. When I know the needs of e500 for the "map this" interface (and any other n= eeds brought up that might affect future hardware we want to be compatible = with) I intend to implement a prototype of this interface for ppc440 and th= en iterate it until we are happy with it for ppc440, e500 and whatever hw w= e have in mind for the future. Zhang Wei wrote: > Hi, Hollis, >=20 >> Hi Zhang, the administrative tools have a need to access the guest's TLB >> state, for example to create the mapping for the initial code to execute >> in. This requires a kernel interface to be exported to userspace, but as >> you noted, the TLBs are very different between 440 and e500. >> >> Here are a couple ideas: [...] >> #2* Have a per-core interface, e.g. KVM_440_SET_TLB_ENTRY and >> KVM_e500_SET_TLB_ENTRY. These would communicate via completely >> different structures. For example, the 440 structure=20 >> would look >> something like { index, TID, word 0, word 1, word 2 }. We'd >> probably want to sanity check each ioctl to make sure it matches >> the vcpu type. Not only that, but we would need core-specific >> code in userspace to actually make the call. >> #3* Try to abstract it so that we have a higher-level "map this" >> interface. This could look something like { virtual addr, real >> addr, size, flags }. It's up to core-specific code to select in >> which TLB to create the mapping, which entry to clobber, >> interpret the meaning of "flags", etc. I guess the big downside >> here is that if we want to dump guest state for debugging >> purposes we'd still need the core-specific interface. >> >> We could also try supporting both #2 and #3, so that=20 >> "generic" code like >> qemu could use the abstracted (and shared) "map this" call,=20 >> while tools >> with more specific needs could still get at the underlying details. >> >=20 > Agree with you! > In my porting implementation, I used #3. I think #3 should be more > flexible. > Maybe we'll add the other TLB support to it in the future. We should > define a good data structure about TLB content, which will include 440 > and E500 TLB meaning. >=20 > And If we take out TLB operations, rest codes are more generic. >=20 > Cheers! > Wei. And Avi Kivity wrote: > Hollis Blanchard wrote: =20 > [...] > An unabstracted interface is needed for save/restore/migrate, which may=20 > or may not be important for your target. I do recommend having getters=20 > and setters for all architectural state. > --=20 Gr=FCsse / regards,=20 Christian Ehrhardt IBM Linux Technology Center, Open Virtualization --------------080501090508090105070302 Content-Type: text/plain; name="ioctl-tlb-userspace" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="ioctl-tlb-userspace" diff --git a/libkvm/libkvm-powerpc.c b/libkvm/libkvm-powerpc.c --- a/libkvm/libkvm-powerpc.c +++ b/libkvm/libkvm-powerpc.c @@ -25,6 +25,7 @@ #include "kvm-powerpc.h" #include #include +#include int handle_dcr(struct kvm_run *run, kvm_context_t kvm) { @@ -110,3 +111,55 @@ int kvm_arch_run(struct kvm_run *run, kv } return ret; } + +int kvm_get_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb) +{ + int r = 0; + + r = ioctl(kvm->vcpu_fd[vcpu], KVM_GET_TLB, tlb); + if (r) { + fprintf(stderr,"%s failed (r='%d')\n",__func__,r); + } + + return r; +} + +int kvm_set_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb) +{ + int r = 0; + + r = ioctl(kvm->vcpu_fd[vcpu], KVM_SET_TLB, tlb); + if (r) { + fprintf(stderr,"%s - failed (r='%d')\n",__func__,r); + } + + return r; +} + +static void print_tlb(struct tlbe tlb[PPC44x_TLB_SIZE], char* name, char* flag) +{ + int i; + + fprintf(stderr,"%s:\n",name); + fprintf(stderr,"| %2s | %8s | %8s | %8s | %8s |\n","nr"," tid ", + "word0","word1","word2"); + for (i = 0; i < PPC44x_TLB_SIZE; i++) { + if (tlb[i].word0 & PPC44x_TLB_VALID) + fprintf(stderr,"%s%2d | %08X | %08X | %08X | %08X |\n", + flag,i, + tlb[i].tid, + tlb[i].word0, + tlb[i].word1, + tlb[i].word2); + } + fflush(stdout); +} + +void kvm_dump_tlb(kvm_context_t kvm, int vcpu) +{ + struct kvm_tlb tlb; + kvm_get_tlb(kvm, vcpu, &tlb); + + print_tlb(tlb.guest_tlb,"Guest TLB","G"); + print_tlb(tlb.shadow_tlb,"Shadow TLB","S"); +} diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h --- a/libkvm/libkvm.h +++ b/libkvm/libkvm.h @@ -556,4 +556,10 @@ int kvm_enable_vapic(kvm_context_t kvm, #endif +#if defined(__powerpc__) +int kvm_get_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb); +int kvm_set_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb); +void kvm_dump_tlb(kvm_context_t kvm, int vcpu); #endif + +#endif diff --git a/qemu/hw/ppc440_boards_kvm.c b/qemu/hw/ppc440_boards_kvm.c --- a/qemu/hw/ppc440_boards_kvm.c +++ b/qemu/hw/ppc440_boards_kvm.c @@ -124,7 +124,6 @@ void ppc440_init(CPUState *env, ppc_dcr_register(env, DCRN_CPR0_CFGDATA, cpr0, dcr_read_cpr0, NULL); } - static void bambooKVM_init(ram_addr_t ram_size, int vga_ram_size, const char *boot_device, DisplayState *ds, const char *kernel_filename, @@ -220,7 +219,9 @@ static void bambooKVM_init(ram_addr_t ra } } - /* XXX insert TLB entries */ + /* insert initial TLB entry state */ + kvm_ppc440_tlb_init(env); + /* set up initial stack pointer */ env->gpr[1] = (16<<20) - 8; env->gpr[4] = initrd_base; env->gpr[5] = initrd_size; diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c --- a/qemu/qemu-kvm-powerpc.c +++ b/qemu/qemu-kvm-powerpc.c @@ -204,4 +204,27 @@ int handle_powerpc_dcr_write(uint32_t dc return 0; /* XXX ignore failed DCR ops */ } +void kvm_ppc440_tlb_init(CPUState *env) +{ + static struct kvm_tlb tlb; + struct tlbe *gtlbe = &(tlb.guest_tlb[0]); + + /* get/modify/write to preserve the shadow tlb state */ + kvm_get_tlb(kvm_context, env->cpu_index, &tlb); + + gtlbe->tid = 0; + gtlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID; + gtlbe->word1 = 0; + gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR; + + gtlbe++; + gtlbe->tid = 0; + gtlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID; + gtlbe->word1 = 0xef600000; + gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR + | PPC44x_TLB_I | PPC44x_TLB_G; + + kvm_set_tlb(kvm_context, env->cpu_index, &tlb); +} + #endif diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h --- a/qemu/qemu-kvm.h +++ b/qemu/qemu-kvm.h @@ -51,6 +51,7 @@ int handle_tpr_access(void *opaque, int #ifdef TARGET_PPC int handle_powerpc_dcr_read(uint32_t dcrn, uint32_t *data); int handle_powerpc_dcr_write(uint32_t dcrn, uint32_t data); +void kvm_ppc440_tlb_init(CPUState *env); #endif #define ALIGN(x, y) (((x)+(y)-1) & ~((y)-1)) --------------080501090508090105070302 Content-Type: text/plain; name="ioctl-tlb" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="ioctl-tlb" diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -26,6 +26,7 @@ #include #include #include +#include #include "44x_tlb.h" @@ -578,24 +579,9 @@ int kvm_arch_vcpu_init(struct kvm_vcpu * /* Initial guest state: 16MB mapping 0 -> 0, PC = 0, MSR = 0, R1 = 16MB */ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) { - struct tlbe *tlbe = &vcpu->arch.guest_tlb[0]; - - tlbe->tid = 0; - tlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID; - tlbe->word1 = 0; - tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR; - - tlbe++; - tlbe->tid = 0; - tlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID; - tlbe->word1 = 0xef600000; - tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR - | PPC44x_TLB_I | PPC44x_TLB_G; - vcpu->arch.pc = 0; vcpu->arch.guest_msr = 0; vcpu->arch.shadow_msr = MSR_PR|MSR_EE|MSR_IS|MSR_DS; - vcpu->arch.gpr[1] = (16<<20) - 8; /* -8 for the callee-save LR slot */ /* Eye-catching number so we know if the guest takes an interrupt * before it's programmed its own IVPR. */ @@ -820,6 +806,30 @@ long kvm_arch_vcpu_ioctl(struct file *fi long r; switch (ioctl) { + case KVM_GET_TLB: { + struct kvm_vcpu *vcpu = filp->private_data; + struct kvm_tlb __user *utlb = (struct kvm_tlb __user *)arg; + + r = copy_to_user(&(utlb->guest_tlb), &(vcpu->arch.guest_tlb), sizeof(utlb->guest_tlb)); + if (r) + return r; + r = copy_to_user(&(utlb->shadow_tlb), &(vcpu->arch.shadow_tlb), sizeof(utlb->shadow_tlb)); + if (r) + return r; + break; + } + case KVM_SET_TLB: { + struct kvm_vcpu *vcpu = filp->private_data; + struct kvm_tlb __user *utlb = (struct kvm_tlb __user *)arg; + + r = copy_from_user(&(vcpu->arch.guest_tlb), &(utlb->guest_tlb), sizeof(utlb->guest_tlb)); + if (r) + return r; + r = copy_from_user(&(vcpu->arch.shadow_tlb), &(utlb->shadow_tlb), sizeof(utlb->shadow_tlb)); + if (r) + return r; + break; + } default: r = -EINVAL; } diff --git a/include/asm-powerpc/kvm.h b/include/asm-powerpc/kvm.h --- a/include/asm-powerpc/kvm.h +++ b/include/asm-powerpc/kvm.h @@ -21,6 +21,7 @@ #define __POWERPC_KVM_H__ #include +#include struct kvm_regs { __u32 pc; @@ -53,4 +54,16 @@ struct kvm_fpu { struct kvm_fpu { }; +struct tlbe { + __u32 tid; /* Only the low 8 bits are used. */ + __u32 word0; + __u32 word1; + __u32 word2; +}; + +struct kvm_tlb { + struct tlbe guest_tlb[PPC44x_TLB_SIZE]; + struct tlbe shadow_tlb[PPC44x_TLB_SIZE]; +}; + #endif /* __POWERPC_KVM_H__ */ diff --git a/include/asm-powerpc/kvm_host.h b/include/asm-powerpc/kvm_host.h --- a/include/asm-powerpc/kvm_host.h +++ b/include/asm-powerpc/kvm_host.h @@ -39,13 +39,6 @@ struct kvm_vcpu_stat { u32 mmio_exits; u32 signal_exits; u32 light_exits; -}; - -struct tlbe { - u32 tid; /* Only the low 8 bits are used. */ - u32 word0; - u32 word1; - u32 word2; }; struct kvm_arch { @@ -127,11 +120,6 @@ struct kvm_vcpu_arch { int pending_dec; }; -struct kvm_tlb { - struct tlbe guest_tlb[PPC44x_TLB_SIZE]; - struct tlbe shadow_tlb[PPC44x_TLB_SIZE]; -}; - enum emulation_result { EMULATE_DONE, /* no further processing */ EMULATE_DO_MMIO, /* kvm_run filled with mmio request */ diff --git a/include/linux/kvm.h b/include/linux/kvm.h --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -289,4 +289,7 @@ struct kvm_vapic_addr { /* Available with KVM_CAP_VAPIC */ #define KVM_SET_VAPIC_ADDR _IOW(KVMIO, 0x93, struct kvm_vapic_addr) +#define KVM_GET_TLB _IOR(KVMIO, 0x94, struct kvm_tlb) +#define KVM_SET_TLB _IOW(KVMIO, 0x95, struct kvm_tlb) + #endif --------------080501090508090105070302 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ --------------080501090508090105070302 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ kvm-ppc-devel mailing list kvm-ppc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel --------------080501090508090105070302--