* Userspace MSR handling
From: Ed Swierk
Date: 2009-05-22 20:11 UTC
To: kvm

I'm experimenting with Gerd's excellent work on integrating Xenner
into Qemu (http://git.et.redhat.com/?p=qemu-kraxel.git). I'm using it
to boot a FreeBSD guest that uses the Xen paravirtual network drivers.
Decoupling the Xen PV guest support from the hypervisor really
simplifies deployment.

The current implementation doesn't yet support KVM, as KVM has to
handle a Xen-specific MSR in order to map hypercall pages into the
guest physical address space. A recent thread on this list discussed
the issue but didn't come to a resolution.

Does it make sense to implement a generic mechanism for handling MSRs
in userspace? I imagine a mechanism analogous to PIO, adding a
KVM_EXIT_MSR code and a msr type in the kvm_run struct.

I'm happy to take a stab at implementing this if no one else is
already working on it.

--Ed
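For illustration, a minimal sketch of what such a kvm_run extension
might look like, on the model of the existing KVM_EXIT_IO handling.
Everything here -- the exit-code value, struct name, and field layout --
is an assumption invented for the example, not an actual KVM interface:

	/* Hypothetical sketch only -- not real KVM API. */
	#include <stdint.h>

	#define KVM_EXIT_MSR_HYPOTHETICAL 100	/* assumed exit reason */

	struct kvm_msr_exit {			/* would sit in kvm_run's union */
		uint8_t  direction;		/* 0 = rdmsr, 1 = wrmsr */
		uint32_t index;			/* MSR number, e.g. 0x40000000 */
		uint64_t data;			/* value written by the guest, or
						   filled in by userspace before
						   re-entering KVM_RUN on rdmsr */
	};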
* Re: Userspace MSR handling
From: Alexander Graf
Date: 2009-05-23 8:57 UTC
To: Ed Swierk; +Cc: kvm@vger.kernel.org

On 22.05.2009, at 22:11, Ed Swierk <eswierk@aristanetworks.com> wrote:

> Does it make sense to implement a generic mechanism for handling MSRs
> in userspace? I imagine a mechanism analogous to PIO, adding a
> KVM_EXIT_MSR code and a msr type in the kvm_run struct.
>
> I'm happy to take a stab at implementing this if no one else is
> already working on it.

I think it's a great idea.
I was thinking of doing something similar for ppc's HIDs/SPRs too, so a
userspace app can complement the kernel's vcpu support.

Also by falling back to userspace all those MSR read/write patches I
send wouldn't have to go in-kernel anymore :)

Alex
* Re: Userspace MSR handling
From: Avi Kivity
Date: 2009-05-24 12:07 UTC
To: Alexander Graf; +Cc: Ed Swierk, kvm@vger.kernel.org

Alexander Graf wrote:
> On 22.05.2009, at 22:11, Ed Swierk <eswierk@aristanetworks.com> wrote:
>> Does it make sense to implement a generic mechanism for handling MSRs
>> in userspace? I imagine a mechanism analogous to PIO, adding a
>> KVM_EXIT_MSR code and a msr type in the kvm_run struct.
>
> I think it's a great idea.
> I was thinking of doing something similar for ppc's HIDs/SPRs too, so
> a userspace app can complement the kernel's vcpu support.
>
> Also by falling back to userspace all those MSR read/write patches I
> send wouldn't have to go in-kernel anymore :)

I'm wary of this. It spreads the burden of implementing the cpu
emulation across the kernel/user boundary. We don't really notice with
qemu as userspace, because we have a cpu emulator on both sides, but
consider an alternative userspace that only emulates devices and has no
cpu emulation support. We want to support that scenario well.

Moreover, your patches only stub out those MSRs. As soon as you
implement the more interesting bits, you'll find yourself back in the
kernel.

I agree however that the Xen hypercall page protocol has no business in
kvm.ko. But can't we implement it in emu? Xenner conveniently places a
ring 0 stub in the guest; we could trap the MSR there and emulate it
entirely in the guest.

-- 
error compiling committee.c: too many arguments to function
* Re: Userspace MSR handling
From: Alexander Graf
Date: 2009-05-24 16:15 UTC
To: Avi Kivity; +Cc: Ed Swierk, kvm@vger.kernel.org

On 24.05.2009, at 14:07, Avi Kivity <avi@redhat.com> wrote:

> I'm wary of this. It spreads the burden of implementing the cpu
> emulation across the kernel/user boundary. We don't really notice
> with qemu as userspace, because we have a cpu emulator on both sides,
> but consider an alternative userspace that only emulates devices and
> has no cpu emulation support. We want to support that scenario well.
>
> Moreover, your patches only stub out those MSRs. As soon as you
> implement the more interesting bits, you'll find yourself back in
> the kernel.

Agreed. The one thing that always makes my life hard is the default
policy on what to do for unknown MSRs. So if I could (by having a
userspace fallback) either #GP or do nothing, I'd be able to mimic
qemu's behavior more closely depending on what I need.

I definitely wouldn't see those approaches conflicting, but rather
complementing each other. If your kvm-using userspace app needs to act
on a user-defined msr, you wouldn't want to have to contact Red Hat to
implement an ioctl for rhel5 just for this msr, would you? So imho,
instead of #GP'ing, falling back to userspace would be great.

Alex

> I agree however that the Xen hypercall page protocol has no business
> in kvm.ko. But can't we implement it in emu? Xenner conveniently
> places a ring 0 stub in the guest; we could trap the MSR there and
> emulate it entirely in the guest.
* Re: Userspace MSR handling
From: Avi Kivity
Date: 2009-05-26 11:31 UTC
To: Alexander Graf; +Cc: Ed Swierk, kvm@vger.kernel.org

Alexander Graf wrote:
>>> I think it's a great idea.
>>> I was thinking of doing something similar for ppc's HIDs/SPRs too,
>>> so a userspace app can complement the kernel's vcpu support.
>>>
>>> Also by falling back to userspace all those MSR read/write patches I
>>> send wouldn't have to go in-kernel anymore :)
>>
>> I'm wary of this. It spreads the burden of implementing the cpu
>> emulation across the kernel/user boundary. We don't really notice
>> with qemu as userspace, because we have a cpu emulator on both sides,
>> but consider an alternative userspace that only emulates devices and
>> has no cpu emulation support. We want to support that scenario well.
>>
>> Moreover, your patches only stub out those MSRs. As soon as you
>> implement the more interesting bits, you'll find yourself back in the
>> kernel.
>
> Agreed. The one thing that always makes my life hard is the default
> policy on what to do for unknown MSRs. So if I could (by having a
> userspace fallback) either #GP or do nothing, I'd be able to mimic
> qemu's behavior more closely depending on what I need.

I'm not interested in mimicking qemu, I'm interested in mimicking a
real cpu. kvm is not part of qemu. Many (most?) msrs cannot be
emulated in userspace.

> I definitely wouldn't see those approaches conflicting, but rather
> complementing each other. If your kvm-using userspace app needs to act
> on a user-defined msr, you wouldn't want to have to contact Red Hat to
> implement an ioctl for rhel5 just for this msr, would you?

An msr is a cpu resource. I don't see how you can define a new cpu
resource without changing the cpu implementation, which is in the
kernel. If they want to communicate with userspace, let them use mmio
or pio.

Look at the Xen interface. You write to one cpu's MSR, and the cpu
writes a page in guest memory. It doesn't fit; a much better interface
would have been mmio. I don't want to break layering just for Xen, so
I'm trying to find an alternative.

-- 
error compiling committee.c: too many arguments to function
* Re: Userspace MSR handling
From: Gerd Hoffmann
Date: 2009-05-25 11:03 UTC
To: Avi Kivity; +Cc: Alexander Graf, Ed Swierk, kvm@vger.kernel.org

On 05/24/09 14:07, Avi Kivity wrote:
> I agree however that the Xen hypercall page protocol has no business
> in kvm.ko. But can't we implement it in emu? Xenner conveniently
> places a ring 0 stub in the guest; we could trap the MSR there and
> emulate it entirely in the guest.

No. The case where handling the msr writes is needed is the pv-on-hvm
driver support. For pv kernels it could be handled by emu if it were
needed, but pv kernels don't need the msr stuff in the first place.
There should be a longish mail about that in the list archive ...

cheers,
  Gerd
* Re: Userspace MSR handling
From: Avi Kivity
Date: 2009-05-25 11:20 UTC
To: Gerd Hoffmann; +Cc: Alexander Graf, Ed Swierk, kvm@vger.kernel.org

Gerd Hoffmann wrote:
> No. The case where handling the msr writes is needed is the pv-on-hvm
> driver support. For pv kernels it could be handled by emu if it were
> needed, but pv kernels don't need the msr stuff in the first place.
> There should be a longish mail about that in the list archive ...

Yes, I forgot. Device drivers have no business writing to cpu model
specific registers. I hate to bring that fugliness to kvm but I do want
to support Xen guests.

It should have been implemented as mmio. Maybe implement an ioctl that
converts rdmsr/wrmsr to equivalent mmios?

	struct kvm_msr_mmio {
		__u32 msr;
		__u32 nr;
		__u64 mmio;
		__u32 flags;
		__u32 pad[3];
	};

In any case it should reject the standard msr ranges to prevent Alex
from implementing cpu emulation in userspace.

-- 
error compiling committee.c: too many arguments to function
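For concreteness, a sketch of how userspace might use such an ioctl to
route the Xen hypercall-page MSR to an MMIO address. The ioctl name,
its number, the exact field meanings, and the addresses are all
assumptions invented for the example; no such ioctl exists:

	#include <stdint.h>
	#include <sys/ioctl.h>

	struct kvm_msr_mmio {		/* Avi's proposed struct */
		uint32_t msr;		/* first MSR of the range */
		uint32_t nr;		/* number of consecutive MSRs (assumed) */
		uint64_t mmio;		/* guest-physical address they map to */
		uint32_t flags;
		uint32_t pad[3];
	};
	#define KVM_SET_MSR_MMIO 0x4020aeff	/* made-up ioctl number */

	int register_xen_msr(int vm_fd)
	{
		struct kvm_msr_mmio map = {
			.msr  = 0x40000000,	/* Xen hypercall-page MSR */
			.nr   = 1,
			.mmio = 0xfee01000ULL,	/* arbitrary unused guest address */
		};
		/* After this, a wrmsr to 0x40000000 in the guest would surface
		   in userspace as an ordinary KVM_EXIT_MMIO write. */
		return ioctl(vm_fd, KVM_SET_MSR_MMIO, &map);
	}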
* Re: Userspace MSR handling
From: Gerd Hoffmann
Date: 2009-05-25 11:29 UTC
To: Avi Kivity; +Cc: Alexander Graf, Ed Swierk, kvm@vger.kernel.org

On 05/25/09 13:20, Avi Kivity wrote:
> It should have been implemented as mmio. Maybe implement an ioctl that
> converts rdmsr/wrmsr to equivalent mmios?
>
> 	struct kvm_msr_mmio {
> 		__u32 msr;
> 		__u32 nr;
> 		__u64 mmio;
> 		__u32 flags;
> 		__u32 pad[3];
> 	};

Funny way to tunnel msr access through the existing interface. Should
work though, I think.

> In any case it should reject the standard msr ranges to prevent Alex
> from implementing cpu emulation in userspace.

;)

cheers,
  Gerd
* Re: Userspace MSR handling
From: Avi Kivity
Date: 2009-05-25 11:31 UTC
To: Gerd Hoffmann; +Cc: Alexander Graf, Ed Swierk, kvm@vger.kernel.org

Gerd Hoffmann wrote:
> Funny way to tunnel msr access through the existing interface. Should
> work though, I think.

I'm not happy with it.

-- 
error compiling committee.c: too many arguments to function
* Re: Userspace MSR handling
From: Ed Swierk
Date: 2009-05-27 16:12 UTC
To: Avi Kivity; +Cc: Gerd Hoffmann, Alexander Graf, kvm@vger.kernel.org

On Mon, May 25, 2009 at 4:20 AM, Avi Kivity <avi@redhat.com> wrote:
> Device drivers have no business writing to cpu model specific
> registers. I hate to bring that fugliness to kvm but I do want to
> support Xen guests.
>
> It should have been implemented as mmio. Maybe implement an ioctl that
> converts rdmsr/wrmsr to equivalent mmios?

Converting MSRs to IO sounds fine, but a generic mechanism, with a new
ioctl type and all the bookkeeping for a dynamically-sized list of
MSR-to-MMIO mappings, seems like overkill given the puny scope of the
problem. All the Xen HVM guest needs is a single, arbitrary MSR that
when written generates an MMIO or PIO write handled by userspace. If
this requirement is unique and we don't expect to find other guests
that similarly abuse MSRs, could we get away with a less flexible but
simpler mechanism?

What I have in mind is choosing an unused legacy IO port range, say,
0x28-0x2f, and implementing a KVM-specific MSR, say, MSR_KVM_IO_28,
that maps rdmsr/wrmsr to a pair of inl/outl operations on these ports.
Either MMIO or PIO would work, but I'm assuming it's safer to grab
currently-unused IO ports than particular memory addresses.

That odor you smell is the aroma of hardcoded goop, but I'm trying to
find a solution that doesn't burden KVM with a big chunk of code to
solve a one-off problem.

--Ed
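A sketch of what the kernel side of Ed's idea could look like: the
64-bit MSR value is forwarded as two 32-bit port writes, which
userspace then sees as ordinary PIO exits. The MSR index, the port
split, and the helper name are all assumptions for illustration:

	#include <stdint.h>

	/* Hypothetical wrmsr handler -- MSR_KVM_IO_28 is not a real KVM
	   MSR, and emulate_outl() is an assumed helper that injects a
	   port write into the normal KVM_EXIT_IO path. */
	#define MSR_KVM_IO_28 0x4b564d80u	/* made-up index */

	static void msr_kvm_io_28_write(uint64_t data)
	{
		emulate_outl(0x28, (uint32_t)data);		/* low half  */
		emulate_outl(0x2c, (uint32_t)(data >> 32));	/* high half */
	}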
* Re: Userspace MSR handling
From: Avi Kivity
Date: 2009-05-27 16:28 UTC
To: Ed Swierk; +Cc: Gerd Hoffmann, Alexander Graf, kvm@vger.kernel.org

Ed Swierk wrote:
> Converting MSRs to IO sounds fine, but a generic mechanism, with a new
> ioctl type and all the bookkeeping for a dynamically-sized list of
> MSR-to-MMIO mappings, seems like overkill given the puny scope of the
> problem. All the Xen HVM guest needs is a single, arbitrary MSR that
> when written generates an MMIO or PIO write handled by userspace. If
> this requirement is unique and we don't expect to find other guests
> that similarly abuse MSRs, could we get away with a less flexible but
> simpler mechanism?

I agree, it's stupid.

> What I have in mind is choosing an unused legacy IO port range, say,
> 0x28-0x2f, and implementing a KVM-specific MSR, say, MSR_KVM_IO_28,
> that maps rdmsr/wrmsr to a pair of inl/outl operations on these ports.
> Either MMIO or PIO would work, but I'm assuming it's safer to grab
> currently-unused IO ports than particular memory addresses.

It's just as bad.

> That odor you smell is the aroma of hardcoded goop, but I'm trying to
> find a solution that doesn't burden KVM with a big chunk of code to
> solve a one-off problem.

Will it actually solve the problem?

- can all hypercalls that can be issued with
  pv-on-hvm-on-kvm-with-a-side-order-of-fries be satisfied from
  userspace?
- what about connecting the guest driver to xen netback one day? we
  don't want to go through userspace for that.

We can consider catering to Xen and implementing that MSR in the
kernel, if it's truly one off.

-- 
error compiling committee.c: too many arguments to function
* Re: Userspace MSR handling
From: Ed Swierk
Date: 2009-05-27 17:09 UTC
To: Avi Kivity; +Cc: Gerd Hoffmann, Alexander Graf, kvm@vger.kernel.org

On Wed, May 27, 2009 at 9:28 AM, Avi Kivity <avi@redhat.com> wrote:
> Will it actually solve the problem?
>
> - can all hypercalls that can be issued with
>   pv-on-hvm-on-kvm-with-a-side-order-of-fries be satisfied from
>   userspace?
> - what about connecting the guest driver to xen netback one day? we
>   don't want to go through userspace for that.

In Gerd's current implementation, the code in the hypercall page (which
the guest maps in using that pesky MSR) handles all hypercalls either
internally or by invoking userspace (via another magic IO port). I'm
too ignorant of Xen to claim that my proposal solves the problem
completely, but after hacking in support for delegating the magic MSR
to userspace, I got an unmodified FreeBSD disk image to boot in
PV-on-HVM-on-KVM+Qemu and use Xen PV network devices.

> We can consider catering to Xen and implementing that MSR in the
> kernel, if it's truly one off.

One way or another, the MSR somehow has to map in a chunk of data
supplied by userspace. Are you suggesting an alternative to the PIO
hack?

--Ed
* Re: Userspace MSR handling
From: Gerd Hoffmann
Date: 2009-05-27 19:16 UTC
To: Ed Swierk; +Cc: Avi Kivity, Alexander Graf, kvm@vger.kernel.org

On 05/27/09 19:09, Ed Swierk wrote:
> On Wed, May 27, 2009 at 9:28 AM, Avi Kivity <avi@redhat.com> wrote:
>> Will it actually solve the problem?
>>
>> - can all hypercalls that can be issued with
>>   pv-on-hvm-on-kvm-with-a-side-order-of-fries be satisfied from
>>   userspace?

Yes.

>> - what about connecting the guest driver to xen netback one day? we
>>   don't want to go through userspace for that.

You can't, without emulating tons of xen stuff in-kernel.

Current situation:
 * Guest does xen hypercalls. We can handle that just fine.
 * Host userspace (backends) calls libxen*, where the xen hypercall
   calls are hidden. We can redirect the library calls via LD_PRELOAD
   (standalone xenner) or function pointers (qemuified xenner) and do
   something else instead.

Trying to use the in-kernel xen netback driver adds this problem:
 * Host kernel does xen hypercalls. Ouch. We have to emulate them
   in-kernel (otherwise using in-kernel netback would be a quite
   pointless exercise).

> One way or another, the MSR somehow has to map in a chunk of data
> supplied by userspace. Are you suggesting an alternative to the PIO
> hack?

Well, the "chunk of data" is on disk anyway:
$libdir/xenner/hvm{32,64}.bin

So a possible plan of attack could be "ln -s $libdir/xenner
/lib/firmware", let kvm.ko grab it if needed using
request_firmware("xenner/hvm${bits}.bin"), and a few lines of kernel
code handling the wrmsr. Logic is just this:

	void xenner_wrmsr(uint64_t val, int longmode)
	{
		uint32_t page = val & ~PAGE_MASK;
		uint64_t paddr = val & PAGE_MASK;
		uint8_t *blob = longmode ? hvm64 : hvm32;
		cpu_physical_memory_write(paddr, blob + page * PAGE_SIZE,
					  PAGE_SIZE);
	}

Well, you'll have to sprinkle in blob loading and caching and some
error checking. But even with that it is probably hard to beat in
actual code size. Additional plus is we get away without a new ioctl
then.

Comments?

cheers,
  Gerd
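For context, the value this handler decodes is what a Xen-style guest
writes to the hypercall MSR: a page-aligned guest-physical destination
address with the blob page index in the low bits. A guest-side sketch
follows; on real Xen the MSR index is discovered via CPUID leaf
0x40000002, so the 0x40000000 below is typical but assumed:

	#include <stdint.h>

	#define XEN_HYPERCALL_MSR 0x40000000u	/* assumed; real guests read
						   it from CPUID 0x40000002 */

	static inline void wrmsrl(uint32_t msr, uint64_t val)
	{
		asm volatile("wrmsr" :: "c"(msr),
			     "a"((uint32_t)val), "d"((uint32_t)(val >> 32)));
	}

	static void map_hypercall_page(uint64_t page_gpa, unsigned int index)
	{
		/* Ask the hypervisor to copy blob page 'index' into the
		   guest-physical page at 'page_gpa'. */
		wrmsrl(XEN_HYPERCALL_MSR, page_gpa | index);
	}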
* Re: Userspace MSR handling
From: Ed Swierk
Date: 2009-05-27 23:00 UTC
To: Gerd Hoffmann; +Cc: Avi Kivity, Alexander Graf, kvm@vger.kernel.org

On Wed, 2009-05-27 at 21:16 +0200, Gerd Hoffmann wrote:
> Well, the "chunk of data" is on disk anyway:
> $libdir/xenner/hvm{32,64}.bin
>
> So a possible plan of attack could be "ln -s $libdir/xenner
> /lib/firmware", let kvm.ko grab it if needed using
> request_firmware("xenner/hvm${bits}.bin"), and a few lines of kernel
> code handling the wrmsr.
>
> Well, you'll have to sprinkle in blob loading and caching and some
> error checking. But even with that it is probably hard to beat in
> actual code size. Additional plus is we get away without a new ioctl
> then.
>
> Comments?

I like it. Here's a first attempt. One obvious improvement would be to
cache the reference to the firmware blob to avoid re-reading it on
every wrmsr.

---
diff -BurN kvm-kmod-2.6.30-rc6/include/asm-x86/kvm_para.h kvm-kmod-2.6.30-rc6.new/include/asm-x86/kvm_para.h
--- kvm-kmod-2.6.30-rc6/include/asm-x86/kvm_para.h	2009-05-21 02:10:14.000000000 -0700
+++ kvm-kmod-2.6.30-rc6.new/include/asm-x86/kvm_para.h	2009-05-27 14:44:42.252004038 -0700
@@ -56,6 +56,7 @@
 
 #define MSR_KVM_WALL_CLOCK  0x11
 #define MSR_KVM_SYSTEM_TIME 0x12
+#define MSR_KVM_LOAD_XENNER_FIRMWARE 0x40000000
 
 #define KVM_MAX_MMU_OP_BATCH           32
diff -BurN kvm-kmod-2.6.30-rc6/include/linux/kvm_host.h kvm-kmod-2.6.30-rc6.new/include/linux/kvm_host.h
--- kvm-kmod-2.6.30-rc6/include/linux/kvm_host.h	2009-05-21 02:10:14.000000000 -0700
+++ kvm-kmod-2.6.30-rc6.new/include/linux/kvm_host.h	2009-05-27 14:16:47.839529841 -0700
@@ -192,6 +192,7 @@
 	unsigned long mmu_notifier_seq;
 	long mmu_notifier_count;
 #endif
+	struct device *kvm_dev;
 };
 
 /* The guest did something we don't support. */
diff -BurN kvm-kmod-2.6.30-rc6/x86/kvm_main.c kvm-kmod-2.6.30-rc6.new/x86/kvm_main.c
--- kvm-kmod-2.6.30-rc6/x86/kvm_main.c	2009-05-21 02:10:18.000000000 -0700
+++ kvm-kmod-2.6.30-rc6.new/x86/kvm_main.c	2009-05-27 15:22:43.463251834 -0700
@@ -816,6 +816,8 @@
 };
 #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */
 
+static struct miscdevice kvm_dev;
+
 static struct kvm *kvm_create_vm(void)
 {
 	struct kvm *kvm = kvm_arch_create_vm();
@@ -869,6 +871,7 @@
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
 	kvm_coalesced_mmio_init(kvm);
 #endif
+	kvm->kvm_dev = kvm_dev.this_device;
 out:
 	return kvm;
 }
diff -BurN kvm-kmod-2.6.30-rc6/x86/x86.c kvm-kmod-2.6.30-rc6.new/x86/x86.c
--- kvm-kmod-2.6.30-rc6/x86/x86.c	2009-05-21 02:10:18.000000000 -0700
+++ kvm-kmod-2.6.30-rc6.new/x86/x86.c	2009-05-27 15:17:42.798002879 -0700
@@ -77,6 +77,7 @@
 #include <linux/iommu.h>
 #include <linux/intel-iommu.h>
 #include <linux/cpufreq.h>
+#include <linux/firmware.h>
 
 #include <asm/uaccess.h>
 #include <asm/msr.h>
@@ -846,6 +847,22 @@
 		kvm_request_guest_time_update(vcpu);
 		break;
 	}
+	case MSR_KVM_LOAD_XENNER_FIRMWARE: {
+		const char *fw_name = (vcpu->arch.shadow_efer & EFER_LME
+				       ? "xenner/hvm64.bin"
+				       : "xenner/hvm32.bin");
+		const struct firmware *firmware;
+		uint32_t page = data & ~PAGE_MASK;
+		uint64_t paddr = data & PAGE_MASK;
+		if (request_firmware(&firmware, fw_name, vcpu->kvm->kvm_dev))
+			return 1;
+		printk(KERN_INFO "kvm: loading %s page %d to %llx\n",
+		       fw_name, page, paddr);
+		kvm_write_guest(vcpu->kvm, paddr,
+				firmware->data + page * PAGE_SIZE, PAGE_SIZE);
+		release_firmware(firmware);
+		break;
+	}
 	default:
 		pr_unimpl(vcpu, "unhandled wrmsr: 0x%x data %llx\n", msr, data);
 		return 1;
* Re: Userspace MSR handling
From: Avi Kivity
Date: 2009-05-28 8:53 UTC
To: Gerd Hoffmann; +Cc: Ed Swierk, Alexander Graf, kvm@vger.kernel.org

Gerd Hoffmann wrote:
>>> - what about connecting the guest driver to xen netback one day? we
>>>   don't want to go through userspace for that.
>
> You can't, without emulating tons of xen stuff in-kernel.
>
> Current situation:
>  * Guest does xen hypercalls. We can handle that just fine.
>  * Host userspace (backends) calls libxen*, where the xen hypercall
>    calls are hidden. We can redirect the library calls via LD_PRELOAD
>    (standalone xenner) or function pointers (qemuified xenner) and do
>    something else instead.
>
> Trying to use the in-kernel xen netback driver adds this problem:
>  * Host kernel does xen hypercalls. Ouch. We have to emulate them
>    in-kernel (otherwise using in-kernel netback would be a quite
>    pointless exercise).

Or do the standard function pointer trick. Event channel notifications
change to eventfd_signal, grant table ops change to copy_to_user().

> Well, the "chunk of data" is on disk anyway:
> $libdir/xenner/hvm{32,64}.bin
>
> So a possible plan of attack could be "ln -s $libdir/xenner
> /lib/firmware", let kvm.ko grab it if needed using
> request_firmware("xenner/hvm${bits}.bin"), and a few lines of kernel
> code handling the wrmsr.
>
> Well, you'll have to sprinkle in blob loading and caching and some
> error checking. But even with that it is probably hard to beat in
> actual code size.

This ties all guests to one hypercall page implementation installed in
one root-only place.

> Additional plus is we get away without a new ioctl then.

Minimizing the number of ioctls is an important non-goal. If you
replace request_firmware with an ioctl that defines the location and
size of the hypercall page in host userspace, this would work well.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
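A sketch of the ioctl Avi suggests here: userspace hands the kernel the
location and size of the hypercall blob, instead of kvm.ko calling
request_firmware. Struct and field names are assumptions invented for
the example (the interface KVM eventually merged, KVM_XEN_HVM_CONFIG,
is along similar lines but differs in detail):

	#include <stdint.h>

	/* Hypothetical ioctl argument -- not the interface that was merged. */
	struct kvm_xen_msr_config {
		uint32_t msr;		/* MSR index that triggers the copy */
		uint32_t pad;
		uint64_t blob_addr_32;	/* userspace address of 32-bit blob */
		uint64_t blob_addr_64;	/* userspace address of 64-bit blob */
		uint8_t  blob_size_32;	/* blob sizes, in pages */
		uint8_t  blob_size_64;
		uint8_t  pad2[30];
	};

	/* On wrmsr, the kernel would copy_from_user() the requested blob
	   page and kvm_write_guest() it to the guest-physical address in
	   the MSR value, with no hypercall-page policy baked into kvm.ko. */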
* Re: Userspace MSR handling
From: Gerd Hoffmann
Date: 2009-05-29 9:47 UTC
To: Avi Kivity; +Cc: Ed Swierk, Alexander Graf, kvm@vger.kernel.org

On 05/28/09 10:53, Avi Kivity wrote:
> Gerd Hoffmann wrote:
>> Trying to use the in-kernel xen netback driver adds this problem:
>>  * Host kernel does xen hypercalls. Ouch. We have to emulate them
>>    in-kernel (otherwise using in-kernel netback would be a quite
>>    pointless exercise).
>
> Or do the standard function pointer trick. Event channel notifications
> change to eventfd_signal, grant table ops change to copy_to_user().

grant table ops include mapping pages of the guest (aka domU) into the
host (aka dom0) address space, filling the pointer into some struct
(bio for blkback, skb for netback), and sending it down the road for
processing. netback and blkback do quite some lowlevel xen memory
management to get that done, including m2p and p2m table updates due to
direct paging mode. It isn't as easy as s/hypercall/get_user_pages/ ...

cheers,
  Gerd
* Re: Userspace MSR handling
From: Avi Kivity
Date: 2009-05-31 8:21 UTC
To: Gerd Hoffmann; +Cc: Ed Swierk, Alexander Graf, kvm@vger.kernel.org

Gerd Hoffmann wrote:
>> Or do the standard function pointer trick. Event channel
>> notifications change to eventfd_signal, grant table ops change to
>> copy_to_user().
>
> grant table ops include mapping pages of the guest (aka domU) into the
> host (aka dom0) address space, filling the pointer into some struct
> (bio for blkback, skb for netback), and sending it down the road for
> processing. netback and blkback do quite some lowlevel xen memory
> management to get that done, including m2p and p2m table updates due
> to direct paging mode. It isn't as easy as
> s/hypercall/get_user_pages/ ...

So we need to plug it in at a slightly higher level. Conceptually we
can provide the same interfaces as xen, so it's just a matter of
abstracting it nicely.

-- 
error compiling committee.c: too many arguments to function
* Re: Userspace MSR handling
From: Gerd Hoffmann
Date: 2009-05-25 11:16 UTC
To: Ed Swierk; +Cc: kvm

On 05/22/09 22:11, Ed Swierk wrote:
> Does it make sense to implement a generic mechanism for handling MSRs
> in userspace?

I see no other way to handle the xen pv msr writes.

> I imagine a mechanism analogous to PIO, adding a
> KVM_EXIT_MSR code and a msr type in the kvm_run struct.

Sounds sensible to me. It probably must be off by default for backward
compatibility, then enabled by ioctl. I think it would be best to
enable it for specific msrs. We probably also want a way for userspace
to figure out whether some specific msr is implemented in-kernel, to
handle the case of an msr emulation moving from userspace to
kernelspace gracefully.

> I'm happy to take a stab at implementing this if no one else is
> already working on it.

It is somewhere on my todo list, but I haven't looked at it (yet).
Feel free to go ahead ;)

cheers,
  Gerd
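A sketch of the per-MSR opt-in Gerd describes: userspace enables
forwarding for individual MSRs, which stay in-kernel by default. The
ioctl name, flags, and capability are all assumptions for illustration:

	#include <stdint.h>

	/* Hypothetical interface -- none of these names exist in KVM. */
	#define KVM_MSR_EXIT_READ	(1u << 0)
	#define KVM_MSR_EXIT_WRITE	(1u << 1)

	struct kvm_userspace_msr {
		uint32_t index;		/* MSR to forward, e.g. 0x40000000 */
		uint32_t flags;		/* which accesses should exit */
	};

	/* Usage: probe a capability such as KVM_CAP_USERSPACE_MSR (assumed)
	   to learn whether the kernel already handles a given MSR itself;
	   if not, issue a KVM_ENABLE_USERSPACE_MSR ioctl (assumed) with the
	   struct above, so guest accesses surface as KVM_EXIT_MSR exits
	   instead of raising #GP. */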