* Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
2008-01-08 0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
@ 2008-01-08 7:12 ` Zhang Wei
2008-01-08 14:03 ` Avi Kivity
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Zhang Wei @ 2008-01-08 7:12 UTC (permalink / raw)
To: kvm-ppc
Hi, Hollis,
>
> Hi Zhang, the administrative tools have a need to access the
> guest's TLB
> state, for example to create the mapping for the initial code
> to execute
> in. This requires a kernel interface to be exported to
> userspace, but as
> you noted, the TLBs are very different between 440 and e500.
>
> Here are a couple ideas:
> * Specify an initial guest state to avoid the need for this
> interface. For example, we could say all initial registers are
> 0, and there is a TLB entry mapping virtual 0 to (guest)
> physical 0.
> * Have a per-core interface, e.g. KVM_440_SET_TLB_ENTRY and
> KVM_e500_SET_TLB_ENTRY. These would communicate via completely
> different structures. For example, the 440 structure
> would look
> something like { index, TID, word 0, word 1, word 2 }. We'd
> probably want to sanity check each ioctl to make sure
> it matches
> the vcpu type. Not only that, but we would need core-specific
> code in userspace to actually make the call.
> * Try to abstract it so that we have a higher-level "map this"
> interface. This could look something like { virtual addr, real
> addr, size, flags }. It's up to core-specific code to
> select in
> which TLB to create the mapping, which entry to clobber,
> interpret the meaning of "flags", etc. I guess the
> big downside
> here is that if we want to dump guest state for debugging
> purposes we'd still need the core-specific interface.
>
> We could also try supporting both #2 and #3, so that
> "generic" code like
> qemu could use the abstracted (and shared) "map this" call,
> while tools
> with more specific needs could still get at the underlying details. I
> guess #1 and #3 is like the normal userspace/kernel
> interface, where you
> don't ordinarily need to specify register details, but the ptrace
> interface is available when needed.
>
> Thoughts?
Agree with you!
In my porting implementation, I used #3. I think #3 should be more
flexible.
Maybe we'll add the other TLB support to it in the future. We should
define a good data structure about TLB content, which will include 440
and E500 TLB meaning.
And If we take out TLB operations, rest codes are more generic.
Cheers!
Wei.
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
2008-01-08 0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
2008-01-08 7:12 ` Zhang Wei
@ 2008-01-08 14:03 ` Avi Kivity
2008-01-22 13:25 ` Christian Ehrhardt
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2008-01-08 14:03 UTC (permalink / raw)
To: kvm-ppc
Hollis Blanchard wrote:
> * Have a per-core interface, e.g. KVM_440_SET_TLB_ENTRY and
> KVM_e500_SET_TLB_ENTRY. These would communicate via completely
> different structures. For example, the 440 structure would look
> something like { index, TID, word 0, word 1, word 2 }. We'd
> probably want to sanity check each ioctl to make sure it matches
> the vcpu type. Not only that, but we would need core-specific
> code in userspace to actually make the call.
> * Try to abstract it so that we have a higher-level "map this"
> interface. This could look something like { virtual addr, real
> addr, size, flags }. It's up to core-specific code to select in
> which TLB to create the mapping, which entry to clobber,
> interpret the meaning of "flags", etc. I guess the big downside
> here is that if we want to dump guest state for debugging
> purposes we'd still need the core-specific interface.
>
> We could also try supporting both #2 and #3, so that "generic" code like
> qemu could use the abstracted (and shared) "map this" call, while tools
> with more specific needs could still get at the underlying details. I
> guess #1 and #3 is like the normal userspace/kernel interface, where you
> don't ordinarily need to specify register details, but the ptrace
> interface is available when needed.
>
An unabstracted interface is needed for save/restore/migrate, which may
or may not be important for your target. I do recommend having getters
and setters for all architectural state.
--
error compiling committee.c: too many arguments to function
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
2008-01-08 0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
2008-01-08 7:12 ` Zhang Wei
2008-01-08 14:03 ` Avi Kivity
@ 2008-01-22 13:25 ` Christian Ehrhardt
2008-01-23 13:38 ` Christian Ehrhardt
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Christian Ehrhardt @ 2008-01-22 13:25 UTC (permalink / raw)
To: kvm-ppc
[-- Attachment #1: Type: text/plain, Size: 4937 bytes --]
It looks like we have two different needs we can solve with the approach Hollis suggested by implementing #2 (arch specific) and #3 (generic map this) interfaces.
Use cases - for #2 (set/get TLB elements including architecture specific information):
- Dump TLB, comparable to vcpu_dump we will need a tlb_dump for debugging and error reporting.
- Save/Restore/Migrate - even not being a need right now this might become a need for us all end so we will need a get/set function that is able to completely save/restore the guest TLB
Use Cases for #3 (insert a single mapping )
- simplifying shared userspace code by using only one common interface wherever possible that does not need to know which hardware it adresses
---
#2 - "get/set TLB"
I attached a prototype implementation of an architecture dependent mass KVM_GET_TLB & KVM_SET_TLB which get and set the full status the guest & shadow tlb structs assigned to a vcpu.
I intend to eventually have different calls for guest and shadow structure because e.g. we actually will never want to set shadow tlb from userspace - this prototype is still in development and any comments are welcome (I'll resend it once it is changed to different getter/setter for guest and shadow tlb).
@Zhang Wei - the interface I have in mind for this arch specific implementation is an array of tlbe[MAX_TLB_SIZE] get-/setting all elements of either guest or shadow tlb at once. Every architecture can define it's own tlbe structure in the kernel headers and can implement special needs in their backend for these ioctls (KVM_GET_GUEST_TLB, KVM_GET_SHADOW_TLB, KVM_SET_GUEST_TLB, but no KVM_SET_SHADOW_TLB).
---
#3 - "map this"
For #3 we need some discussion how the data structure of a shared interface might look like. For ppc440 it would be the { virtual addr, real addr, size, flags } as you see the tlbe definition that is moved in my patch. I wanted to ask what more/less needs e500 has for a "map this" interface to insert a single tlb-entry. We need also to clarify what flags we all need to define a shared set of bits (if possible to share that flags) in the flags variable we can use that every architecture can map to it's own meaning in the backend to this ioctl.
When I know the needs of e500 for the "map this" interface (and any other needs brought up that might affect future hardware we want to be compatible with) I intend to implement a prototype of this interface for ppc440 and then iterate it until we are happy with it for ppc440, e500 and whatever hw we have in mind for the future.
Zhang Wei wrote:
> Hi, Hollis,
>
>> Hi Zhang, the administrative tools have a need to access the guest's TLB
>> state, for example to create the mapping for the initial code to execute
>> in. This requires a kernel interface to be exported to userspace, but as
>> you noted, the TLBs are very different between 440 and e500.
>>
>> Here are a couple ideas:
[...]
>> #2* Have a per-core interface, e.g. KVM_440_SET_TLB_ENTRY and
>> KVM_e500_SET_TLB_ENTRY. These would communicate via completely
>> different structures. For example, the 440 structure
>> would look
>> something like { index, TID, word 0, word 1, word 2 }. We'd
>> probably want to sanity check each ioctl to make sure it matches
>> the vcpu type. Not only that, but we would need core-specific
>> code in userspace to actually make the call.
>> #3* Try to abstract it so that we have a higher-level "map this"
>> interface. This could look something like { virtual addr, real
>> addr, size, flags }. It's up to core-specific code to select in
>> which TLB to create the mapping, which entry to clobber,
>> interpret the meaning of "flags", etc. I guess the big downside
>> here is that if we want to dump guest state for debugging
>> purposes we'd still need the core-specific interface.
>>
>> We could also try supporting both #2 and #3, so that
>> "generic" code like
>> qemu could use the abstracted (and shared) "map this" call,
>> while tools
>> with more specific needs could still get at the underlying details.
>>
>
> Agree with you!
> In my porting implementation, I used #3. I think #3 should be more
> flexible.
> Maybe we'll add the other TLB support to it in the future. We should
> define a good data structure about TLB content, which will include 440
> and E500 TLB meaning.
>
> And If we take out TLB operations, rest codes are more generic.
>
> Cheers!
> Wei.
And Avi Kivity wrote:
> Hollis Blanchard wrote:
> [...]
> An unabstracted interface is needed for save/restore/migrate, which may
> or may not be important for your target. I do recommend having getters
> and setters for all architectural state.
>
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
[-- Attachment #2: ioctl-tlb-userspace --]
[-- Type: text/plain, Size: 3975 bytes --]
diff --git a/libkvm/libkvm-powerpc.c b/libkvm/libkvm-powerpc.c
--- a/libkvm/libkvm-powerpc.c
+++ b/libkvm/libkvm-powerpc.c
@@ -25,6 +25,7 @@
#include "kvm-powerpc.h"
#include <errno.h>
#include <stdio.h>
+#include <sys/ioctl.h>
int handle_dcr(struct kvm_run *run, kvm_context_t kvm)
{
@@ -110,3 +111,55 @@ int kvm_arch_run(struct kvm_run *run, kv
}
return ret;
}
+
+int kvm_get_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb)
+{
+ int r = 0;
+
+ r = ioctl(kvm->vcpu_fd[vcpu], KVM_GET_TLB, tlb);
+ if (r) {
+ fprintf(stderr,"%s failed (r='%d')\n",__func__,r);
+ }
+
+ return r;
+}
+
+int kvm_set_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb)
+{
+ int r = 0;
+
+ r = ioctl(kvm->vcpu_fd[vcpu], KVM_SET_TLB, tlb);
+ if (r) {
+ fprintf(stderr,"%s - failed (r='%d')\n",__func__,r);
+ }
+
+ return r;
+}
+
+static void print_tlb(struct tlbe tlb[PPC44x_TLB_SIZE], char* name, char* flag)
+{
+ int i;
+
+ fprintf(stderr,"%s:\n",name);
+ fprintf(stderr,"| %2s | %8s | %8s | %8s | %8s |\n","nr"," tid ",
+ "word0","word1","word2");
+ for (i = 0; i < PPC44x_TLB_SIZE; i++) {
+ if (tlb[i].word0 & PPC44x_TLB_VALID)
+ fprintf(stderr,"%s%2d | %08X | %08X | %08X | %08X |\n",
+ flag,i,
+ tlb[i].tid,
+ tlb[i].word0,
+ tlb[i].word1,
+ tlb[i].word2);
+ }
+ fflush(stdout);
+}
+
+void kvm_dump_tlb(kvm_context_t kvm, int vcpu)
+{
+ struct kvm_tlb tlb;
+ kvm_get_tlb(kvm, vcpu, &tlb);
+
+ print_tlb(tlb.guest_tlb,"Guest TLB","G");
+ print_tlb(tlb.shadow_tlb,"Shadow TLB","S");
+}
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -556,4 +556,10 @@ int kvm_enable_vapic(kvm_context_t kvm,
#endif
+#if defined(__powerpc__)
+int kvm_get_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb);
+int kvm_set_tlb(kvm_context_t kvm, int vcpu, struct kvm_tlb *tlb);
+void kvm_dump_tlb(kvm_context_t kvm, int vcpu);
#endif
+
+#endif
diff --git a/qemu/hw/ppc440_boards_kvm.c b/qemu/hw/ppc440_boards_kvm.c
--- a/qemu/hw/ppc440_boards_kvm.c
+++ b/qemu/hw/ppc440_boards_kvm.c
@@ -124,7 +124,6 @@ void ppc440_init(CPUState *env,
ppc_dcr_register(env, DCRN_CPR0_CFGDATA, cpr0, dcr_read_cpr0, NULL);
}
-
static void bambooKVM_init(ram_addr_t ram_size, int vga_ram_size,
const char *boot_device, DisplayState *ds,
const char *kernel_filename,
@@ -220,7 +219,9 @@ static void bambooKVM_init(ram_addr_t ra
}
}
- /* XXX insert TLB entries */
+ /* insert initial TLB entry state */
+ kvm_ppc440_tlb_init(env);
+ /* set up initial stack pointer */
env->gpr[1] = (16<<20) - 8;
env->gpr[4] = initrd_base;
env->gpr[5] = initrd_size;
diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -204,4 +204,27 @@ int handle_powerpc_dcr_write(uint32_t dc
return 0; /* XXX ignore failed DCR ops */
}
+void kvm_ppc440_tlb_init(CPUState *env)
+{
+ static struct kvm_tlb tlb;
+ struct tlbe *gtlbe = &(tlb.guest_tlb[0]);
+
+ /* get/modify/write to preserve the shadow tlb state */
+ kvm_get_tlb(kvm_context, env->cpu_index, &tlb);
+
+ gtlbe->tid = 0;
+ gtlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID;
+ gtlbe->word1 = 0;
+ gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR;
+
+ gtlbe++;
+ gtlbe->tid = 0;
+ gtlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID;
+ gtlbe->word1 = 0xef600000;
+ gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR
+ | PPC44x_TLB_I | PPC44x_TLB_G;
+
+ kvm_set_tlb(kvm_context, env->cpu_index, &tlb);
+}
+
#endif
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -51,6 +51,7 @@ int handle_tpr_access(void *opaque, int
#ifdef TARGET_PPC
int handle_powerpc_dcr_read(uint32_t dcrn, uint32_t *data);
int handle_powerpc_dcr_write(uint32_t dcrn, uint32_t data);
+void kvm_ppc440_tlb_init(CPUState *env);
#endif
#define ALIGN(x, y) (((x)+(y)-1) & ~((y)-1))
[-- Attachment #3: ioctl-tlb --]
[-- Type: text/plain, Size: 3789 bytes --]
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -26,6 +26,7 @@
#include <linux/vmalloc.h>
#include <asm/cputable.h>
#include <asm/cacheflush.h>
+#include <asm/uaccess.h>
#include "44x_tlb.h"
@@ -578,24 +579,9 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *
/* Initial guest state: 16MB mapping 0 -> 0, PC = 0, MSR = 0, R1 = 16MB */
int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
{
- struct tlbe *tlbe = &vcpu->arch.guest_tlb[0];
-
- tlbe->tid = 0;
- tlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID;
- tlbe->word1 = 0;
- tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR;
-
- tlbe++;
- tlbe->tid = 0;
- tlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID;
- tlbe->word1 = 0xef600000;
- tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR
- | PPC44x_TLB_I | PPC44x_TLB_G;
-
vcpu->arch.pc = 0;
vcpu->arch.guest_msr = 0;
vcpu->arch.shadow_msr = MSR_PR|MSR_EE|MSR_IS|MSR_DS;
- vcpu->arch.gpr[1] = (16<<20) - 8; /* -8 for the callee-save LR slot */
/* Eye-catching number so we know if the guest takes an interrupt
* before it's programmed its own IVPR. */
@@ -820,6 +806,30 @@ long kvm_arch_vcpu_ioctl(struct file *fi
long r;
switch (ioctl) {
+ case KVM_GET_TLB: {
+ struct kvm_vcpu *vcpu = filp->private_data;
+ struct kvm_tlb __user *utlb = (struct kvm_tlb __user *)arg;
+
+ r = copy_to_user(&(utlb->guest_tlb), &(vcpu->arch.guest_tlb), sizeof(utlb->guest_tlb));
+ if (r)
+ return r;
+ r = copy_to_user(&(utlb->shadow_tlb), &(vcpu->arch.shadow_tlb), sizeof(utlb->shadow_tlb));
+ if (r)
+ return r;
+ break;
+ }
+ case KVM_SET_TLB: {
+ struct kvm_vcpu *vcpu = filp->private_data;
+ struct kvm_tlb __user *utlb = (struct kvm_tlb __user *)arg;
+
+ r = copy_from_user(&(vcpu->arch.guest_tlb), &(utlb->guest_tlb), sizeof(utlb->guest_tlb));
+ if (r)
+ return r;
+ r = copy_from_user(&(vcpu->arch.shadow_tlb), &(utlb->shadow_tlb), sizeof(utlb->shadow_tlb));
+ if (r)
+ return r;
+ break;
+ }
default:
r = -EINVAL;
}
diff --git a/include/asm-powerpc/kvm.h b/include/asm-powerpc/kvm.h
--- a/include/asm-powerpc/kvm.h
+++ b/include/asm-powerpc/kvm.h
@@ -21,6 +21,7 @@
#define __POWERPC_KVM_H__
#include <asm/types.h>
+#include <asm/mmu-44x.h>
struct kvm_regs {
__u32 pc;
@@ -53,4 +54,16 @@ struct kvm_fpu {
struct kvm_fpu {
};
+struct tlbe {
+ __u32 tid; /* Only the low 8 bits are used. */
+ __u32 word0;
+ __u32 word1;
+ __u32 word2;
+};
+
+struct kvm_tlb {
+ struct tlbe guest_tlb[PPC44x_TLB_SIZE];
+ struct tlbe shadow_tlb[PPC44x_TLB_SIZE];
+};
+
#endif /* __POWERPC_KVM_H__ */
diff --git a/include/asm-powerpc/kvm_host.h b/include/asm-powerpc/kvm_host.h
--- a/include/asm-powerpc/kvm_host.h
+++ b/include/asm-powerpc/kvm_host.h
@@ -39,13 +39,6 @@ struct kvm_vcpu_stat {
u32 mmio_exits;
u32 signal_exits;
u32 light_exits;
-};
-
-struct tlbe {
- u32 tid; /* Only the low 8 bits are used. */
- u32 word0;
- u32 word1;
- u32 word2;
};
struct kvm_arch {
@@ -127,11 +120,6 @@ struct kvm_vcpu_arch {
int pending_dec;
};
-struct kvm_tlb {
- struct tlbe guest_tlb[PPC44x_TLB_SIZE];
- struct tlbe shadow_tlb[PPC44x_TLB_SIZE];
-};
-
enum emulation_result {
EMULATE_DONE, /* no further processing */
EMULATE_DO_MMIO, /* kvm_run filled with mmio request */
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -289,4 +289,7 @@ struct kvm_vapic_addr {
/* Available with KVM_CAP_VAPIC */
#define KVM_SET_VAPIC_ADDR _IOW(KVMIO, 0x93, struct kvm_vapic_addr)
+#define KVM_GET_TLB _IOR(KVMIO, 0x94, struct kvm_tlb)
+#define KVM_SET_TLB _IOW(KVMIO, 0x95, struct kvm_tlb)
+
#endif
[-- Attachment #4: Type: text/plain, Size: 228 bytes --]
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
[-- Attachment #5: Type: text/plain, Size: 170 bytes --]
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
2008-01-08 0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
` (2 preceding siblings ...)
2008-01-22 13:25 ` Christian Ehrhardt
@ 2008-01-23 13:38 ` Christian Ehrhardt
2008-01-23 21:05 ` Hollis Blanchard
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Christian Ehrhardt @ 2008-01-23 13:38 UTC (permalink / raw)
To: kvm-ppc
[-- Attachment #1: Type: text/plain, Size: 6174 bytes --]
Christian Ehrhardt wrote:
This mail now has the reworked implementation of the patch for ppc440 in the #2 style attached where the guest tlb can be get/set completely and the shadow tlb can be read from userspace (now separate ioctls for guest/shadow).
Actually this should work for most if not all use cases we have. Especially our current use case of setting up a proper initial state is very arch dependent anyway and if we really can save userspace code for shared usage depends on the fact if we will be able to create a shared bit format for flags (I start to put a common "map this" call down on my prio list).
I just downloaded the "PowerPC e500 Core Family Reference Manual" and will read a bit to see where the differences are in relation to this interface, but atm everything works for me because the #2 style interface in this patch is sufficient for the only use case we have atm.
@Zhang Wei: I would really like to see your code for the solution in the "map this" style you have mentioned that you have written for e500.
> It looks like we have two different needs we can solve with the approach
> Hollis suggested by implementing #2 (arch specific) and #3 (generic map
> this) interfaces.
>
> Use cases - for #2 (set/get TLB elements including architecture specific
> information):
> - Dump TLB, comparable to vcpu_dump we will need a tlb_dump for
> debugging and error reporting.
> - Save/Restore/Migrate - even not being a need right now this might
> become a need for us all end so we will need a get/set function that is
> able to completely save/restore the guest TLB
>
> Use Cases for #3 (insert a single mapping )
> - simplifying shared userspace code by using only one common interface
> wherever possible that does not need to know which hardware it adresses
>
> ---
>
> #2 - "get/set TLB"
> I attached a prototype implementation of an architecture dependent mass
> KVM_GET_TLB & KVM_SET_TLB which get and set the full status the guest &
> shadow tlb structs assigned to a vcpu.
> I intend to eventually have different calls for guest and shadow
> structure because e.g. we actually will never want to set shadow tlb
> from userspace - this prototype is still in development and any comments
> are welcome (I'll resend it once it is changed to different
> getter/setter for guest and shadow tlb).
> @Zhang Wei - the interface I have in mind for this arch specific
> implementation is an array of tlbe[MAX_TLB_SIZE] get-/setting all
> elements of either guest or shadow tlb at once. Every architecture can
> define it's own tlbe structure in the kernel headers and can implement
> special needs in their backend for these ioctls (KVM_GET_GUEST_TLB,
> KVM_GET_SHADOW_TLB, KVM_SET_GUEST_TLB, but no KVM_SET_SHADOW_TLB).
>
> ---
>
> #3 - "map this"
> For #3 we need some discussion how the data structure of a shared
> interface might look like. For ppc440 it would be the { virtual addr,
> real addr, size, flags } as you see the tlbe definition that is moved in
> my patch. I wanted to ask what more/less needs e500 has for a "map this"
> interface to insert a single tlb-entry. We need also to clarify what
> flags we all need to define a shared set of bits (if possible to share
> that flags) in the flags variable we can use that every architecture can
> map to it's own meaning in the backend to this ioctl.
> When I know the needs of e500 for the "map this" interface (and any
> other needs brought up that might affect future hardware we want to be
> compatible with) I intend to implement a prototype of this interface for
> ppc440 and then iterate it until we are happy with it for ppc440, e500
> and whatever hw we have in mind for the future.
>
>
> Zhang Wei wrote:
>> Hi, Hollis,
>>
>>> Hi Zhang, the administrative tools have a need to access the guest's TLB
>>> state, for example to create the mapping for the initial code to execute
>>> in. This requires a kernel interface to be exported to userspace, but as
>>> you noted, the TLBs are very different between 440 and e500.
>>>
>>> Here are a couple ideas:
> [...]
>>> #2* Have a per-core interface, e.g. KVM_440_SET_TLB_ENTRY and
>>> KVM_e500_SET_TLB_ENTRY. These would communicate via completely
>>> different structures. For example, the 440 structure would look
>>> something like { index, TID, word 0, word 1, word 2 }. We'd
>>> probably want to sanity check each ioctl to make sure it matches
>>> the vcpu type. Not only that, but we would need core-specific
>>> code in userspace to actually make the call.
>>> #3* Try to abstract it so that we have a higher-level "map this"
>>> interface. This could look something like { virtual addr, real
>>> addr, size, flags }. It's up to core-specific code to select in
>>> which TLB to create the mapping, which entry to clobber,
>>> interpret the meaning of "flags", etc. I guess the big downside
>>> here is that if we want to dump guest state for debugging
>>> purposes we'd still need the core-specific interface.
>>>
>>> We could also try supporting both #2 and #3, so that "generic" code like
>>> qemu could use the abstracted (and shared) "map this" call, while tools
>>> with more specific needs could still get at the underlying details.
>>>
>>
>> Agree with you!
>> In my porting implementation, I used #3. I think #3 should be more
>> flexible.
>> Maybe we'll add the other TLB support to it in the future. We should
>> define a good data structure about TLB content, which will include 440
>> and E500 TLB meaning.
>>
>> And If we take out TLB operations, rest codes are more generic.
>>
>> Cheers!
>> Wei.
>
> And Avi Kivity wrote:
>> Hollis Blanchard wrote: [...]
>> An unabstracted interface is needed for save/restore/migrate, which
>> may or may not be important for your target. I do recommend having
>> getters and setters for all architectural state.
>>
>
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
[-- Attachment #2: ioctl-tlb-getset --]
[-- Type: text/plain, Size: 3604 bytes --]
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -26,6 +26,7 @@
#include <linux/vmalloc.h>
#include <asm/cputable.h>
#include <asm/cacheflush.h>
+#include <asm/uaccess.h>
#include "44x_tlb.h"
@@ -578,24 +579,9 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *
/* Initial guest state: 16MB mapping 0 -> 0, PC = 0, MSR = 0, R1 = 16MB */
int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
{
- struct tlbe *tlbe = &vcpu->arch.guest_tlb[0];
-
- tlbe->tid = 0;
- tlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID;
- tlbe->word1 = 0;
- tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR;
-
- tlbe++;
- tlbe->tid = 0;
- tlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID;
- tlbe->word1 = 0xef600000;
- tlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR
- | PPC44x_TLB_I | PPC44x_TLB_G;
-
vcpu->arch.pc = 0;
vcpu->arch.guest_msr = 0;
vcpu->arch.shadow_msr = MSR_PR|MSR_EE|MSR_IS|MSR_DS;
- vcpu->arch.gpr[1] = (16<<20) - 8; /* -8 for the callee-save LR slot */
/* Eye-catching number so we know if the guest takes an interrupt
* before it's programmed its own IVPR. */
@@ -820,6 +806,33 @@ long kvm_arch_vcpu_ioctl(struct file *fi
long r;
switch (ioctl) {
+ case KVM_GET_GUEST_TLB: {
+ struct kvm_vcpu *vcpu = filp->private_data;
+ struct kvm_single_tlb __user *utlb = (struct kvm_single_tlb __user *)arg;
+
+ r = copy_to_user(&(utlb->tlb), &(vcpu->arch.guest_tlb), sizeof(struct kvm_single_tlb));
+ if (r)
+ return r;
+ break;
+ }
+ case KVM_SET_GUEST_TLB: {
+ struct kvm_vcpu *vcpu = filp->private_data;
+ struct kvm_single_tlb __user *utlb = (struct kvm_single_tlb __user *)arg;
+
+ r = copy_from_user(&(vcpu->arch.guest_tlb), &(utlb->tlb), sizeof(struct kvm_single_tlb));
+ if (r)
+ return r;
+ break;
+ }
+ case KVM_GET_SHADOW_TLB: {
+ struct kvm_vcpu *vcpu = filp->private_data;
+ struct kvm_single_tlb __user *utlb = (struct kvm_single_tlb __user *)arg;
+
+ r = copy_to_user(&(utlb->tlb), &(vcpu->arch.shadow_tlb), sizeof(struct kvm_single_tlb));
+ if (r)
+ return r;
+ break;
+ }
default:
r = -EINVAL;
}
diff --git a/include/asm-powerpc/kvm.h b/include/asm-powerpc/kvm.h
--- a/include/asm-powerpc/kvm.h
+++ b/include/asm-powerpc/kvm.h
@@ -21,6 +21,7 @@
#define __POWERPC_KVM_H__
#include <asm/types.h>
+#include <asm/mmu-44x.h>
struct kvm_regs {
__u32 pc;
@@ -53,4 +54,14 @@ struct kvm_fpu {
struct kvm_fpu {
};
+struct tlbe {
+ __u32 tid; /* Only the low 8 bits are used. */
+ __u32 word0;
+ __u32 word1;
+ __u32 word2;
+};
+
+struct kvm_single_tlb {
+ struct tlbe tlb[PPC44x_TLB_SIZE];
+};
#endif /* __POWERPC_KVM_H__ */
diff --git a/include/asm-powerpc/kvm_host.h b/include/asm-powerpc/kvm_host.h
--- a/include/asm-powerpc/kvm_host.h
+++ b/include/asm-powerpc/kvm_host.h
@@ -39,13 +39,6 @@ struct kvm_vcpu_stat {
u32 mmio_exits;
u32 signal_exits;
u32 light_exits;
-};
-
-struct tlbe {
- u32 tid; /* Only the low 8 bits are used. */
- u32 word0;
- u32 word1;
- u32 word2;
};
struct kvm_arch {
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -289,4 +289,8 @@ struct kvm_vapic_addr {
/* Available with KVM_CAP_VAPIC */
#define KVM_SET_VAPIC_ADDR _IOW(KVMIO, 0x93, struct kvm_vapic_addr)
+#define KVM_GET_GUEST_TLB _IOR(KVMIO, 0x94, struct kvm_single_tlb)
+#define KVM_SET_GUEST_TLB _IOW(KVMIO, 0x95, struct kvm_single_tlb)
+#define KVM_GET_SHADOW_TLB _IOR(KVMIO, 0x96, struct kvm_single_tlb)
+
#endif
[-- Attachment #3: ioctl-tlb-getset-userspace --]
[-- Type: text/plain, Size: 4225 bytes --]
diff --git a/libkvm/libkvm-powerpc.c b/libkvm/libkvm-powerpc.c
--- a/libkvm/libkvm-powerpc.c
+++ b/libkvm/libkvm-powerpc.c
@@ -25,6 +25,7 @@
#include "kvm-powerpc.h"
#include <errno.h>
#include <stdio.h>
+#include <sys/ioctl.h>
int handle_dcr(struct kvm_run *run, kvm_context_t kvm)
{
@@ -110,3 +111,69 @@ int kvm_arch_run(struct kvm_run *run, kv
}
return ret;
}
+
+int kvm_get_guest_tlb(kvm_context_t kvm, int vcpu, struct kvm_single_tlb *tlb)
+{
+ int r = 0;
+
+ r = ioctl(kvm->vcpu_fd[vcpu], KVM_GET_GUEST_TLB, tlb);
+ if (r) {
+ fprintf(stderr,"%s failed (r='%d')\n",__func__,r);
+ }
+
+ return r;
+}
+
+int kvm_get_shadow_tlb(kvm_context_t kvm, int vcpu, struct kvm_single_tlb *tlb)
+{
+ int r = 0;
+
+ r = ioctl(kvm->vcpu_fd[vcpu], KVM_GET_SHADOW_TLB, tlb);
+ if (r) {
+ fprintf(stderr,"%s failed (r='%d')\n",__func__,r);
+ }
+
+ return r;
+}
+
+int kvm_set_guest_tlb(kvm_context_t kvm, int vcpu, struct kvm_single_tlb *tlb)
+{
+ int r = 0;
+
+ r = ioctl(kvm->vcpu_fd[vcpu], KVM_SET_GUEST_TLB, tlb);
+ if (r) {
+ fprintf(stderr,"%s - failed (r='%d')\n",__func__,r);
+ }
+
+ return r;
+}
+
+static void print_tlb(struct kvm_single_tlb *tlb, char* name, char* flag)
+{
+ int i;
+
+ fprintf(stderr,"%s:\n",name);
+ fprintf(stderr,"| %2s | %8s | %8s | %8s | %8s |\n","nr"," tid ",
+ "word0","word1","word2");
+ for (i = 0; i < PPC44x_TLB_SIZE; i++) {
+ if (tlb->tlb[i].word0 & PPC44x_TLB_VALID)
+ fprintf(stderr,"%s%2d | %08X | %08X | %08X | %08X |\n",
+ flag,i,
+ tlb->tlb[i].tid,
+ tlb->tlb[i].word0,
+ tlb->tlb[i].word1,
+ tlb->tlb[i].word2);
+ }
+ fflush(stdout);
+}
+
+void kvm_dump_tlb(kvm_context_t kvm, int vcpu)
+{
+ struct kvm_single_tlb tlb;
+
+ kvm_get_guest_tlb(kvm, vcpu, &tlb);
+ print_tlb(&tlb,"Guest TLB","G");
+
+ kvm_get_shadow_tlb(kvm, vcpu, &tlb);
+ print_tlb(&tlb,"Shadow TLB","S");
+}
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -556,4 +556,10 @@ int kvm_enable_vapic(kvm_context_t kvm,
#endif
+#if defined(__powerpc__)
+int kvm_get_guest_tlb(kvm_context_t kvm, int vcpu, struct kvm_single_tlb *tlb);
+int kvm_set_guest_tlb(kvm_context_t kvm, int vcpu, struct kvm_single_tlb *tlb);
+void kvm_dump_tlb(kvm_context_t kvm, int vcpu);
#endif
+
+#endif
diff --git a/qemu/hw/ppc440_boards_kvm.c b/qemu/hw/ppc440_boards_kvm.c
--- a/qemu/hw/ppc440_boards_kvm.c
+++ b/qemu/hw/ppc440_boards_kvm.c
@@ -124,7 +124,6 @@ void ppc440_init(CPUState *env,
ppc_dcr_register(env, DCRN_CPR0_CFGDATA, cpr0, dcr_read_cpr0, NULL);
}
-
static void bambooKVM_init(ram_addr_t ram_size, int vga_ram_size,
const char *boot_device, DisplayState *ds,
const char *kernel_filename,
@@ -220,7 +219,9 @@ static void bambooKVM_init(ram_addr_t ra
}
}
- /* XXX insert TLB entries */
+ /* insert initial TLB entry state */
+ kvm_ppc440_tlb_init(env);
+ /* set up initial stack pointer */
env->gpr[1] = (16<<20) - 8;
env->gpr[4] = initrd_base;
env->gpr[5] = initrd_size;
diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -204,4 +204,24 @@ int handle_powerpc_dcr_write(uint32_t dc
return 0; /* XXX ignore failed DCR ops */
}
+void kvm_ppc440_tlb_init(CPUState *env)
+{
+ static struct kvm_single_tlb tlb;
+ struct tlbe *gtlbe = (tlb.tlb);
+
+ gtlbe->tid = 0;
+ gtlbe->word0 = PPC44x_TLB_16M | PPC44x_TLB_VALID;
+ gtlbe->word1 = 0;
+ gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR;
+
+ gtlbe++;
+ gtlbe->tid = 0;
+ gtlbe->word0 = 0xef600000 | PPC44x_TLB_4K | PPC44x_TLB_VALID;
+ gtlbe->word1 = 0xef600000;
+ gtlbe->word2 = PPC44x_TLB_SX | PPC44x_TLB_SW | PPC44x_TLB_SR
+ | PPC44x_TLB_I | PPC44x_TLB_G;
+
+ kvm_set_guest_tlb(kvm_context, env->cpu_index, &tlb);
+}
+
#endif
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -51,6 +51,7 @@ int handle_tpr_access(void *opaque, int
#ifdef TARGET_PPC
int handle_powerpc_dcr_read(uint32_t dcrn, uint32_t *data);
int handle_powerpc_dcr_write(uint32_t dcrn, uint32_t data);
+void kvm_ppc440_tlb_init(CPUState *env);
#endif
#define ALIGN(x, y) (((x)+(y)-1) & ~((y)-1))
[-- Attachment #4: Type: text/plain, Size: 228 bytes --]
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
[-- Attachment #5: Type: text/plain, Size: 170 bytes --]
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
2008-01-08 0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
` (3 preceding siblings ...)
2008-01-23 13:38 ` Christian Ehrhardt
@ 2008-01-23 21:05 ` Hollis Blanchard
2008-01-24 7:23 ` Christian Ehrhardt
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Hollis Blanchard @ 2008-01-23 21:05 UTC (permalink / raw)
To: kvm-ppc
On Wed, 2008-01-23 at 14:38 +0100, Christian Ehrhardt wrote:
> Christian Ehrhardt wrote:
>
> This mail now has the reworked implementation of the patch for ppc440
> in the #2 style attached where the guest tlb can be get/set completely
> and the shadow tlb can be read from userspace (now separate ioctls for
> guest/shadow).
I don't think the host userspace should ever need to directly manipulate
the shadow TLB state. IIRC x86 just has a "sync shadow MMU state"
command (not sure what it does or when you use it though).
I also really don't like accessing the whole TLB as a giant array when
single-entry commands would work just fine.
> Actually this should work for most if not all use cases we have.
> Especially our current use case of setting up a proper initial state
> is very arch dependent anyway and if we really can save userspace code
> for shared usage depends on the fact if we will be able to create a
> shared bit format for flags (I start to put a common "map this" call
> down on my prio list).
"Flags" could just indicate IO vs RAM mappings. That would be nice and
generic. If IO flag is set, the host would map with
cache-inhibited/guarded. If not, it's a RAM mapping, so don't.
--
Hollis Blanchard
IBM Linux Technology Center
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
2008-01-08 0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
` (4 preceding siblings ...)
2008-01-23 21:05 ` Hollis Blanchard
@ 2008-01-24 7:23 ` Christian Ehrhardt
2008-01-24 7:37 ` Hollis Blanchard
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Christian Ehrhardt @ 2008-01-24 7:23 UTC (permalink / raw)
To: kvm-ppc
Hollis Blanchard wrote:
> On Wed, 2008-01-23 at 14:38 +0100, Christian Ehrhardt wrote:
>> Christian Ehrhardt wrote:
>>
>> This mail now has the reworked implementation of the patch for ppc440
>> in the #2 style attached where the guest tlb can be get/set completely
>> and the shadow tlb can be read from userspace (now separate ioctls for
>> guest/shadow).
>
> I don't think the host userspace should ever need to directly manipulate
> the shadow TLB state. IIRC x86 just has a "sync shadow MMU state"
> command (not sure what it does or when you use it though).
I agree and the patch has only
KVM_GET_GUEST_TLB
KVM_SET_GUEST_TLB
KVM_GET_SHADOW_TLB
=>no manipulation to the shadow TLB possible.
> I also really don't like accessing the whole TLB as a giant array when
> single-entry commands would work just fine.
Well I think it is just easier to do it with the full array. Its easy to change that to a 1xtlbe+1xindex interface that can be iterated by userspace, but I don't see any benefit atm.
Code where we set up the TLB are not performance relevant so a big array as well as iterating with a lot of calls are both not hurting us. If we think of the get/set as interface used for dump tlb and save/restore we will need to access the full array anyway.
So if you want it as single tlbe get/set is there a specific argument ? The argument "do it because I like it that way" is also accepted if stated explicitly ;-)
>> Actually this should work for most if not all use cases we have.
>> Especially our current use case of setting up a proper initial state
>> is very arch dependent anyway and if we really can save userspace code
>> for shared usage depends on the fact if we will be able to create a
>> shared bit format for flags (I start to put a common "map this" call
>> down on my prio list).
>
> "Flags" could just indicate IO vs RAM mappings. That would be nice and
> generic. If IO flag is set, the host would map with
> cache-inhibited/guarded. If not, it's a RAM mapping, so don't.
I would hope it is so an easy flagging. Don't you think we need all that rwx@problem/kernel state (well both have that) and e.g. e500 needs something more than we are because they are supporting the M flag ("Memory Coherency required"). Well maybe all becomes clear when we see Zhang Wei's code and can compare it to our needs.
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
2008-01-08 0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
` (5 preceding siblings ...)
2008-01-24 7:23 ` Christian Ehrhardt
@ 2008-01-24 7:37 ` Hollis Blanchard
2008-01-24 7:58 ` Zhang Wei
2008-01-24 8:01 ` Zhang Wei
8 siblings, 0 replies; 10+ messages in thread
From: Hollis Blanchard @ 2008-01-24 7:37 UTC (permalink / raw)
To: kvm-ppc
On Thu, 2008-01-24 at 08:23 +0100, Christian Ehrhardt wrote:
> Hollis Blanchard wrote:
> > On Wed, 2008-01-23 at 14:38 +0100, Christian Ehrhardt wrote:
> >> Christian Ehrhardt wrote:
> >>
> >> This mail now has the reworked implementation of the patch for ppc440
> >> in the #2 style attached where the guest tlb can be get/set completely
> >> and the shadow tlb can be read from userspace (now separate ioctls for
> >> guest/shadow).
> >
> > I don't think the host userspace should ever need to directly manipulate
> > the shadow TLB state. IIRC x86 just has a "sync shadow MMU state"
> > command (not sure what it does or when you use it though).
>
> I agree and the patch has only
> KVM_GET_GUEST_TLB
> KVM_SET_GUEST_TLB
> KVM_GET_SHADOW_TLB
> =>no manipulation to the shadow TLB possible.
>
> > I also really don't like accessing the whole TLB as a giant array when
> > single-entry commands would work just fine.
>
> Well I think it is just easier to do it with the full array. Its easy
> to change that to a 1xtlbe+1xindex interface that can be iterated by
> userspace, but I don't see any benefit atm.
> Code where we set up the TLB are not performance relevant so a big
> array as well as iterating with a lot of calls are both not hurting
> us. If we think of the get/set as interface used for dump tlb and
> save/restore we will need to access the full array anyway.
> So if you want it as single tlbe get/set is there a specific
> argument ? The argument "do it because I like it that way" is also
> accepted if stated explicitly ;-)
I like it that way. :)
Also, we will need a "TLB specifier" argument for e500, to differentiate
between TLB0 and TLB1.
> >> Actually this should work for most if not all use cases we have.
> >> Especially our current use case of setting up a proper initial state
> >> is very arch dependent anyway and if we really can save userspace code
> >> for shared usage depends on the fact if we will be able to create a
> >> shared bit format for flags (I start to put a common "map this" call
> >> down on my prio list).
> >
> > "Flags" could just indicate IO vs RAM mappings. That would be nice and
> > generic. If IO flag is set, the host would map with
> > cache-inhibited/guarded. If not, it's a RAM mapping, so don't.
>
> I would hope it is so an easy flagging. Don't you think we need all
> that rwx@problem/kernel state (well both have that) and e.g. e500
> needs something more than we are because they are supporting the M
> flag ("Memory Coherency required"). Well maybe all becomes clear when
> we see Zhang Wei's code and can compare it to our needs.
Yeah, you're right... if it were just WIMG I don't think we'd have a
problem, but we also have permissions to worry about. In addition, not
all processors support non-executable mappings. I'm starting to think a
little bit of processor-specific code in userspace wouldn't be such a
bad thing, and maybe we should just give up on this "generic" idea
entirely.
--
Hollis Blanchard
IBM Linux Technology Center
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
2008-01-08 0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
` (6 preceding siblings ...)
2008-01-24 7:37 ` Hollis Blanchard
@ 2008-01-24 7:58 ` Zhang Wei
2008-01-24 8:01 ` Zhang Wei
8 siblings, 0 replies; 10+ messages in thread
From: Zhang Wei @ 2008-01-24 7:58 UTC (permalink / raw)
To: kvm-ppc
Hi, Christian,
Sorry about a little delay about the email. I'm working for refactoring Hollis' codes and study your codes.
I agree with your user level tlb functions mostly. And I have some general TLB definitions codes and make them against Hollis' codes. Hope it could be reference to you. :)
And an idea: Do you like to add a GET_TLB_SIZE ioctl commands?
Cheers!
Wei.
> -----Original Message-----
> From: Christian Ehrhardt [mailto:ehrhardt@linux.vnet.ibm.com]
> Sent: Tuesday, January 22, 2008 9:25 PM
> To: Zhang Wei
> Cc: Hollis Blanchard; kvm-ppc-devel; Christian Ehrhardt
> Subject: Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
>
> It looks like we have two different needs we can solve with
> the approach Hollis suggested by implementing #2 (arch
> specific) and #3 (generic map this) interfaces.
>
> Use cases - for #2 (set/get TLB elements including
> architecture specific information):
> - Dump TLB, comparable to vcpu_dump we will need a tlb_dump
> for debugging and error reporting.
> - Save/Restore/Migrate - even not being a need right now this
> might become a need for us all end so we will need a get/set
> function that is able to completely save/restore the guest TLB
>
> Use Cases for #3 (insert a single mapping )
> - simplifying shared userspace code by using only one common
> interface wherever possible that does not need to know which
> hardware it adresses
>
> ---
>
> #2 - "get/set TLB"
> I attached a prototype implementation of an architecture
> dependent mass KVM_GET_TLB & KVM_SET_TLB which get and set
> the full status the guest & shadow tlb structs assigned to a vcpu.
> I intend to eventually have different calls for guest and
> shadow structure because e.g. we actually will never want to
> set shadow tlb from userspace - this prototype is still in
> development and any comments are welcome (I'll resend it once
> it is changed to different getter/setter for guest and shadow tlb).
> @Zhang Wei - the interface I have in mind for this arch
> specific implementation is an array of tlbe[MAX_TLB_SIZE]
> get-/setting all elements of either guest or shadow tlb at
> once. Every architecture can define it's own tlbe structure
> in the kernel headers and can implement special needs in
> their backend for these ioctls (KVM_GET_GUEST_TLB,
> KVM_GET_SHADOW_TLB, KVM_SET_GUEST_TLB, but no KVM_SET_SHADOW_TLB).
>
> ---
>
> #3 - "map this"
> For #3 we need some discussion how the data structure of a
> shared interface might look like. For ppc440 it would be the
> { virtual addr, real addr, size, flags } as you see the tlbe
> definition that is moved in my patch. I wanted to ask what
> more/less needs e500 has for a "map this" interface to insert
> a single tlb-entry. We need also to clarify what flags we all
> need to define a shared set of bits (if possible to share
> that flags) in the flags variable we can use that every
> architecture can map to it's own meaning in the backend to this ioctl.
> When I know the needs of e500 for the "map this" interface
> (and any other needs brought up that might affect future
> hardware we want to be compatible with) I intend to implement
> a prototype of this interface for ppc440 and then iterate it
> until we are happy with it for ppc440, e500 and whatever hw
> we have in mind for the future.
>
>
> Zhang Wei wrote:
> > Hi, Hollis,
> >
> >> Hi Zhang, the administrative tools have a need to access
> the guest's TLB
> >> state, for example to create the mapping for the initial
> code to execute
> >> in. This requires a kernel interface to be exported to
> userspace, but as
> >> you noted, the TLBs are very different between 440 and e500.
> >>
> >> Here are a couple ideas:
> [...]
> >> #2* Have a per-core interface, e.g. KVM_440_SET_TLB_ENTRY and
> >> KVM_e500_SET_TLB_ENTRY. These would communicate
> via completely
> >> different structures. For example, the 440 structure
> >> would look
> >> something like { index, TID, word 0, word 1, word 2 }. We'd
> >> probably want to sanity check each ioctl to make
> sure it matches
> >> the vcpu type. Not only that, but we would need
> core-specific
> >> code in userspace to actually make the call.
> >> #3* Try to abstract it so that we have a higher-level
> "map this"
> >> interface. This could look something like {
> virtual addr, real
> >> addr, size, flags }. It's up to core-specific code
> to select in
> >> which TLB to create the mapping, which entry to clobber,
> >> interpret the meaning of "flags", etc. I guess the
> big downside
> >> here is that if we want to dump guest state for debugging
> >> purposes we'd still need the core-specific interface.
> >>
> >> We could also try supporting both #2 and #3, so that
> >> "generic" code like
> >> qemu could use the abstracted (and shared) "map this" call,
> >> while tools
> >> with more specific needs could still get at the underlying details.
> >>
> >
> > Agree with you!
> > In my porting implementation, I used #3. I think #3 should be more
> > flexible.
> > Maybe we'll add the other TLB support to it in the future. We should
> > define a good data structure about TLB content, which will
> include 440
> > and E500 TLB meaning.
> >
> > And If we take out TLB operations, rest codes are more generic.
> >
> > Cheers!
> > Wei.
>
> And Avi Kivity wrote:
> > Hollis Blanchard wrote:
> > [...]
> > An unabstracted interface is needed for
> save/restore/migrate, which may
> > or may not be important for your target. I do recommend
> having getters
> > and setters for all architectural state.
> >
>
> --
>
> Grüsse / regards,
> Christian Ehrhardt
> IBM Linux Technology Center, Open Virtualization
>
>
>
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [kvm-ppc-devel] KVM kernel/userspace TLB interface
2008-01-08 0:21 [kvm-ppc-devel] KVM kernel/userspace TLB interface Hollis Blanchard
` (7 preceding siblings ...)
2008-01-24 7:58 ` Zhang Wei
@ 2008-01-24 8:01 ` Zhang Wei
8 siblings, 0 replies; 10+ messages in thread
From: Zhang Wei @ 2008-01-24 8:01 UTC (permalink / raw)
To: kvm-ppc
> -----Original Message-----
> From: Hollis Blanchard [mailto:hollisb@us.ibm.com]
>
> On Wed, 2008-01-23 at 14:38 +0100, Christian Ehrhardt wrote:
> > Christian Ehrhardt wrote:
> >
> > This mail now has the reworked implementation of the patch
> for ppc440
> > in the #2 style attached where the guest tlb can be get/set
> completely
> > and the shadow tlb can be read from userspace (now separate
> ioctls for
> > guest/shadow).
>
> I don't think the host userspace should ever need to directly
> manipulate
> the shadow TLB state. IIRC x86 just has a "sync shadow MMU state"
> command (not sure what it does or when you use it though).
>
> I also really don't like accessing the whole TLB as a giant array when
> single-entry commands would work just fine.
Agree with Hollis, Accessing guest tlbs is enough. Just let the shadow
tlb be shadow. ;)
Cheers!
Wei.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel
^ permalink raw reply [flat|nested] 10+ messages in thread