Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
To: kvm-ppc@vger.kernel.org
Subject: Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
Date: Tue, 22 Jan 2008 13:25:01 +0000	[thread overview]
Message-ID: <4795EEAD.1090908@linux.vnet.ibm.com> (raw)
In-Reply-To: <1199751688.5741.40.camel@basalt>

[-- Attachment #1: Type: text/plain, Size: 4937 bytes --]

It looks like we have two different needs we can solve with the approach Hollis suggested by implementing #2 (arch specific) and #3 (generic map this) interfaces.

Use cases - for #2 (set/get TLB elements including architecture specific information):
- Dump TLB, comparable to vcpu_dump we will need a tlb_dump for debugging and error reporting.
- Save/Restore/Migrate - even not being a need right now this might become a need for us all end so we will need a get/set function that is able to completely save/restore the guest TLB

Use Cases for #3 (insert a single mapping )
- simplifying shared userspace code by using only one common interface wherever possible that does not need to know which hardware it adresses

---

#2 - "get/set TLB"
I attached a prototype implementation of an architecture dependent mass KVM_GET_TLB & KVM_SET_TLB which get and set the full status the guest & shadow tlb structs assigned to a vcpu.
I intend to eventually have different calls for guest and shadow structure because e.g. we actually will never want to set shadow tlb from userspace - this prototype is still in development and any comments are welcome (I'll resend it once it is changed to different getter/setter for guest and shadow tlb).
@Zhang Wei - the interface I have in mind for this arch specific implementation is an array of tlbe[MAX_TLB_SIZE] get-/setting all elements of either guest or shadow tlb at once. Every architecture can define it's own tlbe structure in the kernel headers and can implement special needs in their backend for these ioctls (KVM_GET_GUEST_TLB, KVM_GET_SHADOW_TLB, KVM_SET_GUEST_TLB, but no KVM_SET_SHADOW_TLB).

---

#3 - "map this"
For #3 we need some discussion how the data structure of a shared interface might look like. For ppc440 it would be the { virtual addr, real addr, size, flags } as you see the tlbe definition that is moved in my patch. I wanted to ask what more/less needs e500 has for a "map this" interface to insert a single tlb-entry. We need also to clarify what flags we all need to define a shared set of bits (if possible to share that flags) in the flags variable we can use that every architecture can map to it's own meaning in the backend to this ioctl.
When I know the needs of e500 for the "map this" interface (and any other needs brought up that might affect future hardware we want to be compatible with) I intend to implement a prototype of this interface for ppc440 and then iterate it until we are happy with it for ppc440, e500 and whatever hw we have in mind for the future.


Zhang Wei wrote:
> Hi, Hollis,
> 
>> Hi Zhang, the administrative tools have a need to access the guest's TLB
>> state, for example to create the mapping for the initial code to execute
>> in. This requires a kernel interface to be exported to userspace, but as
>> you noted, the TLBs are very different between 440 and e500.
>>
>> Here are a couple ideas:
[...]
>>      #2* Have a per-core interface, e.g. KVM_440_SET_TLB_ENTRY and
>>         KVM_e500_SET_TLB_ENTRY. These would communicate via completely
>>         different structures. For example, the 440 structure 
>> would look
>>         something like { index, TID, word 0, word 1, word 2 }. We'd
>>         probably want to sanity check each ioctl to make sure it matches
>>         the vcpu type. Not only that, but we would need core-specific
>>         code in userspace to actually make the call.
>>      #3* Try to abstract it so that we have a higher-level "map this"
>>         interface. This could look something like { virtual addr, real
>>         addr, size, flags }. It's up to core-specific code to select in
>>         which TLB to create the mapping, which entry to clobber,
>>         interpret the meaning of "flags", etc. I guess the big downside
>>         here is that if we want to dump guest state for debugging
>>         purposes we'd still need the core-specific interface.
>>
>> We could also try supporting both #2 and #3, so that 
>> "generic" code like
>> qemu could use the abstracted (and shared) "map this" call, 
>> while tools
>> with more specific needs could still get at the underlying details.
>>
> 
> Agree with you!
> In my porting implementation, I used #3. I think #3 should be more
> flexible.
> Maybe we'll add the other TLB support to it in the future. We should
> define a good data structure about TLB content, which will include 440
> and E500 TLB meaning.
> 
> And If we take out TLB operations, rest codes are more generic.
> 
> Cheers!
> Wei.

And Avi Kivity wrote:
> Hollis Blanchard wrote:   
> [...]
> An unabstracted interface is needed for save/restore/migrate, which may 
> or may not be important for your target.  I do recommend having getters 
> and setters for all architectural state.
>

-- 

Grüsse / regards, 
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization



[-- Attachment #2: ioctl-tlb-userspace --]
[-- Type: text/plain, Size: 3975 bytes --]

diff --git a/libkvm/libkvm-powerpc.c b/libkvm/libkvm-powerpc.c
--- a/libkvm/libkvm-powerpc.c
+++ b/libkvm/libkvm-powerpc.c
@@ -25,6 +25,7 @@
 #include "kvm-powerpc.h"
 #include <errno.h>
 #include <stdio.h>
+#include <sys/ioctl.h>
 
 int handle_dcr(struct kvm_run *run,  kvm_context_t kvm)
 {
@@ -110,3 +111,55 @@ int kvm_arch_run(struct kvm_run *run, kv
 	}
 	return ret;
 }
+
+int kvm_get_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb)
+{
+	int r = 0;
+
+	r = ioctl(kvm->vcpu_fd[vcpu], KVM_GET_TLB, tlb);
+	if (r) {
+		fprintf(stderr,"%s failed (r='%d')\n",__func__,r);
+	}
+
+	return r;
+}
+
+int kvm_set_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb)
+{
+	int r = 0;
+
+	r = ioctl(kvm->vcpu_fd[vcpu], KVM_SET_TLB, tlb);
+	if (r) {
+		fprintf(stderr,"%s - failed (r='%d')\n",__func__,r);
+	}
+
+	return r;
+}
+
+static void print_tlb(struct tlbe tlb[PPC44x_TLB_SIZE], char* name, char* flag)
+{
+	int i;
+
+	fprintf(stderr,"%s:\n",name);
+	fprintf(stderr,"| %2s | %8s | %8s | %8s | %8s |\n","nr"," tid ",
+						"word0","word1","word2");
+	for (i = 0; i < PPC44x_TLB_SIZE; i++) {
+		if (tlb[i].word0 & PPC44x_TLB_VALID)
+			fprintf(stderr,"%s%2d | %08X | %08X | %08X | %08X |\n",
+				flag,i,
+				tlb[i].tid,
+				tlb[i].word0,
+				tlb[i].word1,
+				tlb[i].word2);
+	}
+	fflush(stdout);
+}
+
+void kvm_dump_tlb(kvm_context_t kvm, int vcpu)
+{
+	struct kvm_tlb tlb;
+	kvm_get_tlb(kvm, vcpu, &tlb);
+
+	print_tlb(tlb.guest_tlb,"Guest TLB","G");
+	print_tlb(tlb.shadow_tlb,"Shadow TLB","S");
+}
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -556,4 +556,10 @@ int kvm_enable_vapic(kvm_context_t kvm, 
 
 #endif
 
+#if defined(__powerpc__)
+int kvm_get_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb);
+int kvm_set_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb);
+void kvm_dump_tlb(kvm_context_t kvm, int vcpu);
 #endif
+
+#endif
diff --git a/qemu/hw/ppc440_boards_kvm.c b/qemu/hw/ppc440_boards_kvm.c
--- a/qemu/hw/ppc440_boards_kvm.c
+++ b/qemu/hw/ppc440_boards_kvm.c
@@ -124,7 +124,6 @@ void ppc440_init(CPUState *env,
 	ppc_dcr_register(env, DCRN_CPR0_CFGDATA, cpr0, dcr_read_cpr0, NULL);
 }
 
-
 static void bambooKVM_init(ram_addr_t ram_size, int vga_ram_size, 
 			const char *boot_device, DisplayState *ds,  
 			const char *kernel_filename,
@@ -220,7 +219,9 @@ static void bambooKVM_init(ram_addr_t ra
 		}
 	}
 
-	/* XXX insert TLB entries */
+	/* insert initial TLB entry state */
+	kvm_ppc440_tlb_init(env);
+	/* set up initial stack pointer */
 	env->gpr[1] = (16<<20) - 8;
 	env->gpr[4] = initrd_base;
 	env->gpr[5] = initrd_size;
diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -204,4 +204,27 @@ int handle_powerpc_dcr_write(uint32_t dc
     return 0; /* XXX ignore failed DCR ops */
 }
 
+void kvm_ppc440_tlb_init(CPUState *env)
+{
+    static struct kvm_tlb tlb;
+    struct tlbe *gtlbe = &(tlb.guest_tlb[0]);
+
+    /* get/modify/write to preserve the shadow tlb state */
+    kvm_get_tlb(kvm_context, env->cpu_index, &tlb);
+
+    gtlbe->tid = 0;
+    gtlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID;
+    gtlbe->word1 = 0;
+    gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR;
+
+    gtlbe++;
+    gtlbe->tid = 0;
+    gtlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID;
+    gtlbe->word1 = 0xef600000;
+    gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR
+                   | PPC44x_TLB_I | PPC44x_TLB_G;
+
+    kvm_set_tlb(kvm_context, env->cpu_index, &tlb);
+}
+
 #endif
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -51,6 +51,7 @@ int handle_tpr_access(void *opaque, int 
 #ifdef TARGET_PPC
 int handle_powerpc_dcr_read(uint32_t dcrn, uint32_t *data);
 int handle_powerpc_dcr_write(uint32_t dcrn, uint32_t data);
+void kvm_ppc440_tlb_init(CPUState *env);
 #endif
 
 #define ALIGN(x, y)  (((x)+(y)-1) & ~((y)-1))

[-- Attachment #3: ioctl-tlb --]
[-- Type: text/plain, Size: 3789 bytes --]

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -26,6 +26,7 @@
 #include <linux/vmalloc.h>
 #include <asm/cputable.h>
 #include <asm/cacheflush.h>
+#include <asm/uaccess.h>
 
 #include "44x_tlb.h"
 
@@ -578,24 +579,9 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *
 /* Initial guest state: 16MB mapping 0 -> 0, PC = 0, MSR = 0, R1 = 16MB */
 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
-	struct tlbe *tlbe = &vcpu->arch.guest_tlb[0];
-
-	tlbe->tid = 0;
-	tlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID;
-	tlbe->word1 = 0;
-	tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR;
-
-	tlbe++;
-	tlbe->tid = 0;
-	tlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID;
-	tlbe->word1 = 0xef600000;
-	tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR
-	              | PPC44x_TLB_I | PPC44x_TLB_G;
-
 	vcpu->arch.pc = 0;
 	vcpu->arch.guest_msr = 0;
 	vcpu->arch.shadow_msr = MSR_PR|MSR_EE|MSR_IS|MSR_DS;
-	vcpu->arch.gpr[1] = (16<<20) - 8; /* -8 for the callee-save LR slot */
 
 	/* Eye-catching number so we know if the guest takes an interrupt
 	 * before it's programmed its own IVPR. */
@@ -820,6 +806,30 @@ long kvm_arch_vcpu_ioctl(struct file *fi
 	long r;
 
 	switch (ioctl) {
+	case KVM_GET_TLB: {
+		struct kvm_vcpu *vcpu = filp->private_data;
+		struct kvm_tlb __user *utlb = (struct kvm_tlb __user *)arg;
+ 
+		r = copy_to_user(&(utlb->guest_tlb), &(vcpu->arch.guest_tlb), sizeof(utlb->guest_tlb));
+		if (r)
+			return r;
+		r = copy_to_user(&(utlb->shadow_tlb), &(vcpu->arch.shadow_tlb), sizeof(utlb->shadow_tlb));
+ 		if (r)
+ 			return r;
+		break;
+	}
+	case KVM_SET_TLB: {
+		struct kvm_vcpu *vcpu = filp->private_data;
+		struct kvm_tlb __user *utlb = (struct kvm_tlb __user *)arg;
+ 
+		r = copy_from_user(&(vcpu->arch.guest_tlb), &(utlb->guest_tlb), sizeof(utlb->guest_tlb));
+		if (r)
+			return r;
+		r = copy_from_user(&(vcpu->arch.shadow_tlb), &(utlb->shadow_tlb), sizeof(utlb->shadow_tlb));
+		if (r)
+			return r;
+ 		break;
+	}
 	default:
 		r = -EINVAL;
 	}
diff --git a/include/asm-powerpc/kvm.h b/include/asm-powerpc/kvm.h
--- a/include/asm-powerpc/kvm.h
+++ b/include/asm-powerpc/kvm.h
@@ -21,6 +21,7 @@
 #define __POWERPC_KVM_H__
 
 #include <asm/types.h>
+#include <asm/mmu-44x.h>
 
 struct kvm_regs {
 	__u32 pc;
@@ -53,4 +54,16 @@ struct kvm_fpu {
 struct kvm_fpu {
 };
 
+struct tlbe {
+	__u32 tid; /* Only the low 8 bits are used. */
+	__u32 word0;
+	__u32 word1;
+	__u32 word2;
+};
+
+struct kvm_tlb {
+	struct tlbe guest_tlb[PPC44x_TLB_SIZE];
+	struct tlbe shadow_tlb[PPC44x_TLB_SIZE];
+};
+
 #endif /* __POWERPC_KVM_H__ */
diff --git a/include/asm-powerpc/kvm_host.h b/include/asm-powerpc/kvm_host.h
--- a/include/asm-powerpc/kvm_host.h
+++ b/include/asm-powerpc/kvm_host.h
@@ -39,13 +39,6 @@ struct kvm_vcpu_stat {
 	u32 mmio_exits;
 	u32 signal_exits;
 	u32 light_exits;
-};
-
-struct tlbe {
-	u32 tid; /* Only the low 8 bits are used. */
-	u32 word0;
-	u32 word1;
-	u32 word2;
 };
 
 struct kvm_arch {
@@ -127,11 +120,6 @@ struct kvm_vcpu_arch {
 	int pending_dec;
 };
 
-struct kvm_tlb {
-	struct tlbe guest_tlb[PPC44x_TLB_SIZE];
-	struct tlbe shadow_tlb[PPC44x_TLB_SIZE];
-};
-
 enum emulation_result {
 	EMULATE_DONE,         /* no further processing */
 	EMULATE_DO_MMIO,      /* kvm_run filled with mmio request */
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -289,4 +289,7 @@ struct kvm_vapic_addr {
 /* Available with KVM_CAP_VAPIC */
 #define KVM_SET_VAPIC_ADDR        _IOW(KVMIO,  0x93, struct kvm_vapic_addr)
 
+#define KVM_GET_TLB               _IOR(KVMIO,  0x94, struct kvm_tlb)
+#define KVM_SET_TLB               _IOW(KVMIO,  0x95, struct kvm_tlb)
+
 #endif

[-- Attachment #4: Type: text/plain, Size: 228 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/

[-- Attachment #5: Type: text/plain, Size: 170 bytes --]

_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel

next prev parent reply	other threads:[~2008-01-22 13:25 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-08  0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
2008-01-08  7:12 ` Zhang Wei
2008-01-08 14:03 ` Avi Kivity
2008-01-22 13:25 ` Christian Ehrhardt [this message]
2008-01-23 13:38 ` Christian Ehrhardt
2008-01-23 21:05 ` Hollis Blanchard
2008-01-24  7:23 ` Christian Ehrhardt
2008-01-24  7:37 ` Hollis Blanchard
2008-01-24  7:58 ` Zhang Wei
2008-01-24  8:01 ` Zhang Wei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4795EEAD.1090908@linux.vnet.ibm.com \
    --to=ehrhardt@linux.vnet.ibm.com \
    --cc=kvm-ppc@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.