From mboxrd@z Thu Jan  1 00:00:00 1970
From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Date: Tue, 22 Jan 2008 13:25:01 +0000
Subject: Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
Message-Id: <4795EEAD.1090908@linux.vnet.ibm.com>
MIME-Version: 1
Content-Type: multipart/mixed; boundary="------------080501090508090105070302"
List-Id: <kvm-ppc.vger.kernel.org>
References: <1199751688.5741.40.camel@basalt>
In-Reply-To: <1199751688.5741.40.camel@basalt>
To: kvm-ppc@vger.kernel.org

This is a multi-part message in MIME format.
--------------080501090508090105070302
Content-Type: text/plain; charset="iso-8859-1"; format="flowed"
Content-Transfer-Encoding: quoted-printable

It looks like we have two different needs we can solve with the approach Ho=
llis suggested by implementing #2 (arch specific) and #3 (generic map this)=
 interfaces.

Use cases - for #2 (set/get TLB elements including architecture specific in=
formation):
- Dump TLB, comparable to vcpu_dump we will need a tlb_dump for debugging a=
nd error reporting.
- Save/Restore/Migrate - even not being a need right now this might become =
a need for us all end so we will need a get/set function that is able to co=
mpletely save/restore the guest TLB

Use Cases for #3 (insert a single mapping )
- simplifying shared userspace code by using only one common interface wher=
ever possible that does not need to know which hardware it adresses

---

#2 - "get/set TLB"
I attached a prototype implementation of an architecture dependent mass KVM=
_GET_TLB & KVM_SET_TLB which get and set the full status the guest & shadow=
 tlb structs assigned to a vcpu.
I intend to eventually have different calls for guest and shadow structure =
because e.g. we actually will never want to set shadow tlb from userspace -=
 this prototype is still in development and any comments are welcome (I'll =
resend it once it is changed to different getter/setter for guest and shado=
w tlb).
@Zhang Wei - the interface I have in mind for this arch specific implementa=
tion is an array of tlbe[MAX_TLB_SIZE] get-/setting all elements of either =
guest or shadow tlb at once. Every architecture can define it's own tlbe st=
ructure in the kernel headers and can implement special needs in their back=
end for these ioctls (KVM_GET_GUEST_TLB, KVM_GET_SHADOW_TLB, KVM_SET_GUEST_=
TLB, but no KVM_SET_SHADOW_TLB).

---

#3 - "map this"
For #3 we need some discussion how the data structure of a shared interface=
 might look like. For ppc440 it would be the { virtual addr, real addr, siz=
e, flags } as you see the tlbe definition that is moved in my patch. I want=
ed to ask what more/less needs e500 has for a "map this" interface to inser=
t a single tlb-entry. We need also to clarify what flags we all need to def=
ine a shared set of bits (if possible to share that flags) in the flags var=
iable we can use that every architecture can map to it's own meaning in the=
 backend to this ioctl.
When I know the needs of e500 for the "map this" interface (and any other n=
eeds brought up that might affect future hardware we want to be compatible =
with) I intend to implement a prototype of this interface for ppc440 and th=
en iterate it until we are happy with it for ppc440, e500 and whatever hw w=
e have in mind for the future.


Zhang Wei wrote:
> Hi, Hollis,
>=20
>> Hi Zhang, the administrative tools have a need to access the guest's TLB
>> state, for example to create the mapping for the initial code to execute
>> in. This requires a kernel interface to be exported to userspace, but as
>> you noted, the TLBs are very different between 440 and e500.
>>
>> Here are a couple ideas:
[...]
>>      #2* Have a per-core interface, e.g. KVM_440_SET_TLB_ENTRY and
>>         KVM_e500_SET_TLB_ENTRY. These would communicate via completely
>>         different structures. For example, the 440 structure=20
>> would look
>>         something like { index, TID, word 0, word 1, word 2 }. We'd
>>         probably want to sanity check each ioctl to make sure it matches
>>         the vcpu type. Not only that, but we would need core-specific
>>         code in userspace to actually make the call.
>>      #3* Try to abstract it so that we have a higher-level "map this"
>>         interface. This could look something like { virtual addr, real
>>         addr, size, flags }. It's up to core-specific code to select in
>>         which TLB to create the mapping, which entry to clobber,
>>         interpret the meaning of "flags", etc. I guess the big downside
>>         here is that if we want to dump guest state for debugging
>>         purposes we'd still need the core-specific interface.
>>
>> We could also try supporting both #2 and #3, so that=20
>> "generic" code like
>> qemu could use the abstracted (and shared) "map this" call,=20
>> while tools
>> with more specific needs could still get at the underlying details.
>>
>=20
> Agree with you!
> In my porting implementation, I used #3. I think #3 should be more
> flexible.
> Maybe we'll add the other TLB support to it in the future. We should
> define a good data structure about TLB content, which will include 440
> and E500 TLB meaning.
>=20
> And If we take out TLB operations, rest codes are more generic.
>=20
> Cheers!
> Wei.

And Avi Kivity wrote:
> Hollis Blanchard wrote:  =20
> [...]
> An unabstracted interface is needed for save/restore/migrate, which may=20
> or may not be important for your target.  I do recommend having getters=20
> and setters for all architectural state.
>

--=20

Gr=FCsse / regards,=20
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization


--------------080501090508090105070302
Content-Type: text/plain;
 name="ioctl-tlb-userspace"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="ioctl-tlb-userspace"

diff --git a/libkvm/libkvm-powerpc.c b/libkvm/libkvm-powerpc.c
--- a/libkvm/libkvm-powerpc.c
+++ b/libkvm/libkvm-powerpc.c
@@ -25,6 +25,7 @@
 #include "kvm-powerpc.h"
 #include <errno.h>
 #include <stdio.h>
+#include <sys/ioctl.h>
 
 int handle_dcr(struct kvm_run *run,  kvm_context_t kvm)
 {
@@ -110,3 +111,55 @@ int kvm_arch_run(struct kvm_run *run, kv
 	}
 	return ret;
 }
+
+int kvm_get_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb)
+{
+	int r = 0;
+
+	r = ioctl(kvm->vcpu_fd[vcpu], KVM_GET_TLB, tlb);
+	if (r) {
+		fprintf(stderr,"%s failed (r='%d')\n",__func__,r);
+	}
+
+	return r;
+}
+
+int kvm_set_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb)
+{
+	int r = 0;
+
+	r = ioctl(kvm->vcpu_fd[vcpu], KVM_SET_TLB, tlb);
+	if (r) {
+		fprintf(stderr,"%s - failed (r='%d')\n",__func__,r);
+	}
+
+	return r;
+}
+
+static void print_tlb(struct tlbe tlb[PPC44x_TLB_SIZE], char* name, char* flag)
+{
+	int i;
+
+	fprintf(stderr,"%s:\n",name);
+	fprintf(stderr,"| %2s | %8s | %8s | %8s | %8s |\n","nr"," tid ",
+						"word0","word1","word2");
+	for (i = 0; i < PPC44x_TLB_SIZE; i++) {
+		if (tlb[i].word0 & PPC44x_TLB_VALID)
+			fprintf(stderr,"%s%2d | %08X | %08X | %08X | %08X |\n",
+				flag,i,
+				tlb[i].tid,
+				tlb[i].word0,
+				tlb[i].word1,
+				tlb[i].word2);
+	}
+	fflush(stdout);
+}
+
+void kvm_dump_tlb(kvm_context_t kvm, int vcpu)
+{
+	struct kvm_tlb tlb;
+	kvm_get_tlb(kvm, vcpu, &tlb);
+
+	print_tlb(tlb.guest_tlb,"Guest TLB","G");
+	print_tlb(tlb.shadow_tlb,"Shadow TLB","S");
+}
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -556,4 +556,10 @@ int kvm_enable_vapic(kvm_context_t kvm, 
 
 #endif
 
+#if defined(__powerpc__)
+int kvm_get_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb);
+int kvm_set_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb);
+void kvm_dump_tlb(kvm_context_t kvm, int vcpu);
 #endif
+
+#endif
diff --git a/qemu/hw/ppc440_boards_kvm.c b/qemu/hw/ppc440_boards_kvm.c
--- a/qemu/hw/ppc440_boards_kvm.c
+++ b/qemu/hw/ppc440_boards_kvm.c
@@ -124,7 +124,6 @@ void ppc440_init(CPUState *env,
 	ppc_dcr_register(env, DCRN_CPR0_CFGDATA, cpr0, dcr_read_cpr0, NULL);
 }
 
-
 static void bambooKVM_init(ram_addr_t ram_size, int vga_ram_size, 
 			const char *boot_device, DisplayState *ds,  
 			const char *kernel_filename,
@@ -220,7 +219,9 @@ static void bambooKVM_init(ram_addr_t ra
 		}
 	}
 
-	/* XXX insert TLB entries */
+	/* insert initial TLB entry state */
+	kvm_ppc440_tlb_init(env);
+	/* set up initial stack pointer */
 	env->gpr[1] = (16<<20) - 8;
 	env->gpr[4] = initrd_base;
 	env->gpr[5] = initrd_size;
diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -204,4 +204,27 @@ int handle_powerpc_dcr_write(uint32_t dc
     return 0; /* XXX ignore failed DCR ops */
 }
 
+void kvm_ppc440_tlb_init(CPUState *env)
+{
+    static struct kvm_tlb tlb;
+    struct tlbe *gtlbe = &(tlb.guest_tlb[0]);
+
+    /* get/modify/write to preserve the shadow tlb state */
+    kvm_get_tlb(kvm_context, env->cpu_index, &tlb);
+
+    gtlbe->tid = 0;
+    gtlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID;
+    gtlbe->word1 = 0;
+    gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR;
+
+    gtlbe++;
+    gtlbe->tid = 0;
+    gtlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID;
+    gtlbe->word1 = 0xef600000;
+    gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR
+                   | PPC44x_TLB_I | PPC44x_TLB_G;
+
+    kvm_set_tlb(kvm_context, env->cpu_index, &tlb);
+}
+
 #endif
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -51,6 +51,7 @@ int handle_tpr_access(void *opaque, int 
 #ifdef TARGET_PPC
 int handle_powerpc_dcr_read(uint32_t dcrn, uint32_t *data);
 int handle_powerpc_dcr_write(uint32_t dcrn, uint32_t data);
+void kvm_ppc440_tlb_init(CPUState *env);
 #endif
 
 #define ALIGN(x, y)  (((x)+(y)-1) & ~((y)-1))

--------------080501090508090105070302
Content-Type: text/plain;
 name="ioctl-tlb"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="ioctl-tlb"

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -26,6 +26,7 @@
 #include <linux/vmalloc.h>
 #include <asm/cputable.h>
 #include <asm/cacheflush.h>
+#include <asm/uaccess.h>
 
 #include "44x_tlb.h"
 
@@ -578,24 +579,9 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *
 /* Initial guest state: 16MB mapping 0 -> 0, PC = 0, MSR = 0, R1 = 16MB */
 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
-	struct tlbe *tlbe = &vcpu->arch.guest_tlb[0];
-
-	tlbe->tid = 0;
-	tlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID;
-	tlbe->word1 = 0;
-	tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR;
-
-	tlbe++;
-	tlbe->tid = 0;
-	tlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID;
-	tlbe->word1 = 0xef600000;
-	tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR
-	              | PPC44x_TLB_I | PPC44x_TLB_G;
-
 	vcpu->arch.pc = 0;
 	vcpu->arch.guest_msr = 0;
 	vcpu->arch.shadow_msr = MSR_PR|MSR_EE|MSR_IS|MSR_DS;
-	vcpu->arch.gpr[1] = (16<<20) - 8; /* -8 for the callee-save LR slot */
 
 	/* Eye-catching number so we know if the guest takes an interrupt
 	 * before it's programmed its own IVPR. */
@@ -820,6 +806,30 @@ long kvm_arch_vcpu_ioctl(struct file *fi
 	long r;
 
 	switch (ioctl) {
+	case KVM_GET_TLB: {
+		struct kvm_vcpu *vcpu = filp->private_data;
+		struct kvm_tlb __user *utlb = (struct kvm_tlb __user *)arg;
+ 
+		r = copy_to_user(&(utlb->guest_tlb), &(vcpu->arch.guest_tlb), sizeof(utlb->guest_tlb));
+		if (r)
+			return r;
+		r = copy_to_user(&(utlb->shadow_tlb), &(vcpu->arch.shadow_tlb), sizeof(utlb->shadow_tlb));
+ 		if (r)
+ 			return r;
+		break;
+	}
+	case KVM_SET_TLB: {
+		struct kvm_vcpu *vcpu = filp->private_data;
+		struct kvm_tlb __user *utlb = (struct kvm_tlb __user *)arg;
+ 
+		r = copy_from_user(&(vcpu->arch.guest_tlb), &(utlb->guest_tlb), sizeof(utlb->guest_tlb));
+		if (r)
+			return r;
+		r = copy_from_user(&(vcpu->arch.shadow_tlb), &(utlb->shadow_tlb), sizeof(utlb->shadow_tlb));
+		if (r)
+			return r;
+ 		break;
+	}
 	default:
 		r = -EINVAL;
 	}
diff --git a/include/asm-powerpc/kvm.h b/include/asm-powerpc/kvm.h
--- a/include/asm-powerpc/kvm.h
+++ b/include/asm-powerpc/kvm.h
@@ -21,6 +21,7 @@
 #define __POWERPC_KVM_H__
 
 #include <asm/types.h>
+#include <asm/mmu-44x.h>
 
 struct kvm_regs {
 	__u32 pc;
@@ -53,4 +54,16 @@ struct kvm_fpu {
 struct kvm_fpu {
 };
 
+struct tlbe {
+	__u32 tid; /* Only the low 8 bits are used. */
+	__u32 word0;
+	__u32 word1;
+	__u32 word2;
+};
+
+struct kvm_tlb {
+	struct tlbe guest_tlb[PPC44x_TLB_SIZE];
+	struct tlbe shadow_tlb[PPC44x_TLB_SIZE];
+};
+
 #endif /* __POWERPC_KVM_H__ */
diff --git a/include/asm-powerpc/kvm_host.h b/include/asm-powerpc/kvm_host.h
--- a/include/asm-powerpc/kvm_host.h
+++ b/include/asm-powerpc/kvm_host.h
@@ -39,13 +39,6 @@ struct kvm_vcpu_stat {
 	u32 mmio_exits;
 	u32 signal_exits;
 	u32 light_exits;
-};
-
-struct tlbe {
-	u32 tid; /* Only the low 8 bits are used. */
-	u32 word0;
-	u32 word1;
-	u32 word2;
 };
 
 struct kvm_arch {
@@ -127,11 +120,6 @@ struct kvm_vcpu_arch {
 	int pending_dec;
 };
 
-struct kvm_tlb {
-	struct tlbe guest_tlb[PPC44x_TLB_SIZE];
-	struct tlbe shadow_tlb[PPC44x_TLB_SIZE];
-};
-
 enum emulation_result {
 	EMULATE_DONE,         /* no further processing */
 	EMULATE_DO_MMIO,      /* kvm_run filled with mmio request */
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -289,4 +289,7 @@ struct kvm_vapic_addr {
 /* Available with KVM_CAP_VAPIC */
 #define KVM_SET_VAPIC_ADDR        _IOW(KVMIO,  0x93, struct kvm_vapic_addr)
 
+#define KVM_GET_TLB               _IOR(KVMIO,  0x94, struct kvm_tlb)
+#define KVM_SET_TLB               _IOW(KVMIO,  0x95, struct kvm_tlb)
+
 #endif

--------------080501090508090105070302
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
--------------080501090508090105070302
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel

--------------080501090508090105070302--