* Where is the entry of hypercalls in kvm?
From: Balachandar @ 2010-06-26 1:06 UTC
To: kernelnewbies, kvm

Hello,

I am trying to understand the virtio mechanism in Linux. I read that the
kick function notifies the host side about newly published buffers. I am
looking especially at virtio_net. Once a packet is ready for transmission,
the kick function is called here
<http://lxr.linux.no/#linux+v2.6.34/drivers/net/virtio_net.c#L588>.
From there I traced the call, and I think it goes to this
<http://lxr.linux.no/#linux+v2.6.34/drivers/virtio/virtio_pci.c#L185>.

Where does it go from there? Which code contains the backend driver of
virtio? Where is the code in the hypervisor that this kick goes to?

Thanks,
Bala
* Re: Where is the entry of hypercalls in kvm?
From: Peter Teoh @ 2010-06-30 8:17 UTC
To: Balachandar; +Cc: kernelnewbies, kvm

Your question is answered here:

http://www.spinics.net/lists/kvm/msg37526.html

And check out this paper:

http://ozlabs.org/~rusty/virtio-spec/virtio-paper.pdf

The general concept to remember is that QEMU and KVM just execute the
input as a binary stream. They do not know what "functions" they are
executing, so the binary stream can be any OS (Windows, Linux, etc.).
QEMU just sets up the basic-block translation mechanism and then
executes the code block by block. Each block is by definition demarcated
by a branch, jump, etc. If there is any privileged instruction within
the block (e.g. writing MSRs or loading the LDT register), a transition
is made from the guest in QEMU into KVM to update the VMCB/VMCS
information (these terms are from the AMD and Intel manuals).

I have not seen any ioctl calls in QEMU, but I suspect that ultimately
it drops to a VMRUN instruction (on AMD; Intel calls it VMLAUNCH or
VMRESUME) inside KVM, which can be found here:

arch/x86/kvm/

The AMD-specific virtualization is done in svm.c, and the Intel-specific
part in vmx.c.

Copying the remark in vmx.c:

/*
 * The exit handlers return 1 if the exit was handled fully and guest execution
 * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
 * to be done to userspace and return 0.
 */
static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
        [EXIT_REASON_EXCEPTION_

After reading the Intel manual, you will understand that "exit" here
refers to the special set of privileged Intel instructions which, upon
being executed by the guest OS, immediately cause a VMEXIT condition.
These are handled by the above handlers in kvm.ko.

To know the entry point INTO the guest OS (i.e. when the guest code is
first run), you must first understand that these VMX operations form a
state machine (three of them: VMLAUNCH, VMRESUME and VMEXIT). Once
inside the VMRESUME state, the guest has no way to access any of the
host's resources; they become accessible only after a VMEXIT is
triggered.

All key APIs are defined here (for Intel; this is KVM-specific, Xen has
another mechanism):

static struct kvm_x86_ops vmx_x86_ops = {
        .cpu_has_kvm_support = cpu_has_kvm_support,
        .disabled_by_bios = vmx_disabled_by_bios,
        .hardware_setup = hardware_setup,
        .hardware_unsetup = hardware_unsetup,
        ...
        .run = vmx_vcpu_run,
        .handle_exit = vmx_handle_exit,
        .skip_emulated_instruction = skip_emulated_instruction,
        .set_interrupt_shadow = vmx_set_interrupt_shadow,

and vmx_vcpu_run() is the answer to your question, I suppose?

Perhaps another summary resource:

http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWAR05015_WinHEC05.ppt

As for virtio_net, it is implemented in drivers/net/virtio_net.c. I am
not sure what your question is.

On Sat, Jun 26, 2010 at 9:06 AM, Balachandar <bala1486@gmail.com> wrote:
> Hello,
> I am trying to understand the virtio mechanism in linux. I read that the
> kick function will notify the host side about the newly published buffers. I
> am looking especially at virtio_net. Once a packet is ready for transmission
> the kick function is called here. From here i traced the call and i think it
> goes to this. From here where does it go? Which code contains the backend
> driver of virtio. Where is the code in the hypervisor which this kick will
> go to? Thank you...
>
> Thanks,
> Bala

--
Regards,
Peter Teoh
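The return-value convention described in the vmx.c comment quoted above (1 means "handled, resume the guest"; 0 means "fill in the request and go back to userspace") can be sketched in plain user-space C. This is only an illustration under stated assumptions, not kernel code: the exit reasons, `struct vcpu`, the handler names and `run_until_userspace()` are all hypothetical stand-ins, and the "guest" is just a scripted list of exits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exit reasons; the real EXIT_REASON_* values are defined by
 * the kernel's VMX headers and are not reproduced here. */
enum exit_reason { EXIT_CPUID, EXIT_IO, EXIT_MAX };

struct vcpu {
    enum exit_reason pending;  /* reason for the most recent exit */
    int userspace_request;     /* stand-in for the kvm_run parameter */
    int handled_in_kernel;     /* counts exits resolved without userspace */
};

/* Fully handled here: return 1 so the guest may resume. */
static int handle_cpuid(struct vcpu *v)
{
    v->handled_in_kernel++;
    return 1;
}

/* Needs device emulation: record what userspace must do and return 0. */
static int handle_io(struct vcpu *v)
{
    v->userspace_request = EXIT_IO;
    return 0;
}

/* Dispatch table indexed by exit reason, in the style of the quoted array. */
static int (*exit_handlers[EXIT_MAX])(struct vcpu *) = {
    [EXIT_CPUID] = handle_cpuid,
    [EXIT_IO]    = handle_io,
};

/* Feed scripted exits to the handlers until one asks for userspace.
 * Returns how many exits were consumed. */
static size_t run_until_userspace(struct vcpu *v,
                                  const enum exit_reason *script, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        v->pending = script[i];
        assert(v->pending < EXIT_MAX && exit_handlers[v->pending] != NULL);
        if (exit_handlers[v->pending](v) == 0)
            return i + 1;      /* drop back to userspace */
    }
    return n;
}
```

With the script CPUID, CPUID, IO, CPUID, the loop resolves the first two exits internally and stops at the I/O exit, leaving the fourth exit for the next entry into the guest.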
* Re: Where is the entry of hypercalls in kvm?
From: Alexander Graf @ 2010-06-30 8:56 UTC
To: Peter Teoh; +Cc: Balachandar, kernelnewbies, kvm

On 30.06.2010, at 10:17, Peter Teoh wrote:

> The general concept to remember is that QEMU and KVM just execute the
> input as binary stream....it does not know what "functions" it is
> executing...so the binary stream can be any OS (windows / Linux
> etc)....QEMU just setup the basic block (call basic blocks
> translation) mechanism, and then execute it block by block. Each
> block by definition is demarcated by a branch/jump etc. Within the
> block if there is any privilege instruction, (eg, write MSR registers,
> load LDT registers etc), then a transition will be made from guest in
> QEMU into KVM to update the VMCB/VMCS information. (these terms are
> from Intel/AMD manual).

Eh, no.

There are two modes of operation:

1) TCG
2) KVM

In mode 1, qemu goes through target-xxx/translate.c and converts the
basic blocks you were talking about above to native machine code on the
host system using tcg (see the tcg directory). No KVM is involved;
everything happens in user mode.

In mode 2, qemu executes _everything_ by calling KVM. No guest code is
interpreted, looked at or otherwise touched in qemu. Whenever the guest
CPU runs, it runs because qemu called ioctl(KVM_RUN) on its kvm vcpu fd.

> I have not seen any IOCTL calls in QEMU,

See kvm*.c and target-xxx/kvm.c

> [...]
> And after reading the Intel manual, u will understand that "exit" here
> actually refers to the special set of privilege intel instructions,
> which upon being executed by the guest OS, will immediately caused and
> VMEXIT condition, and these are handled by the above handler in
> kvm.ko.

in kvm-xxx.ko (kvm-intel.ko or kvm-amd.ko) for x86.

Also, please don't top post :)

Alex
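The "mode 2" description above, where the guest only runs while userspace sits in the run ioctl, can be sketched as a loop. In the real API this is `ioctl(vcpu_fd, KVM_RUN, 0)`, with the exit reason (for example `KVM_EXIT_IO` or `KVM_EXIT_HLT`) reported through the shared `struct kvm_run`. To stay runnable without /dev/kvm, the sketch below replaces all of that with hypothetical stand-ins: `fake_vcpu_run()`, `struct fake_run` and the `FAKE_EXIT_*` values are assumptions made for illustration and are not QEMU or kernel names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum fake_exit { FAKE_EXIT_IO, FAKE_EXIT_HLT };

/* Stand-in for the shared struct kvm_run: why the guest stopped and,
 * for port I/O, which port and value were involved. */
struct fake_run {
    enum fake_exit exit_reason;
    uint16_t port;
    uint16_t value;
};

struct fake_vcpu {
    const struct fake_run *script;  /* exits the pretend guest will produce */
    size_t len;
    size_t pos;
    struct fake_run run;            /* filled in on every exit */
};

/* Stand-in for ioctl(vcpu_fd, KVM_RUN, 0): "runs the guest" until the next
 * exit that userspace has to look at. Returns -1 when the script is done. */
static int fake_vcpu_run(struct fake_vcpu *v)
{
    if (v->pos >= v->len)
        return -1;
    v->run = v->script[v->pos++];
    return 0;
}

/* Userspace vcpu loop: re-enter the guest until it halts, emulating each
 * port write in between. Returns the number of I/O exits handled. */
static int vcpu_loop(struct fake_vcpu *v, uint16_t *last_port,
                     uint16_t *last_value)
{
    int io_exits = 0;

    assert(last_port != NULL && last_value != NULL);
    while (fake_vcpu_run(v) == 0) {
        switch (v->run.exit_reason) {
        case FAKE_EXIT_IO:
            /* Device emulation would happen here. */
            *last_port = v->run.port;
            *last_value = v->run.value;
            io_exits++;
            break;
        case FAKE_EXIT_HLT:
            return io_exits;
        }
    }
    return io_exits;
}
```

The point of the structure is that userspace never inspects guest instructions; it only reacts to the exit reason handed back after each run call.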
* Re: Where is the entry of hypercalls in kvm?
From: Anthony Liguori @ 2010-06-30 15:10 UTC
To: Alexander Graf; +Cc: Peter Teoh, Balachandar, kernelnewbies, kvm

On 06/30/2010 03:56 AM, Alexander Graf wrote:
> There are two modes of operation:
>
> 1) TCG
> 2) KVM
>
> In mode 1, qemu goes through target-xxx/translate.c and converts the
> basic blocks you were talking about above to native machine code on the
> host system using tcg (see the tcg directory). No KVM is involved,
> everything happens in user mode.
>
> In mode 2, qemu executes _everything_ by calling KVM. There is no guest
> code interpreted, looked at or whatever in qemu.

Only because there is a mini-x86 interpreter in the kernel. That lets
KVM expose an idealized interface to qemu that requires no instruction
interpretation.

More to the point of the original question, virtio is typically
implemented on top of an emulated PCI device. The kick operation is
implemented as a write to a PCI IO region that's mapped to PIO. If you
look at hw/virtio-pci.c, you'll see the entry points.

Regards,

Anthony Liguori

> Whenever the guest CPU runs, it runs because qemu called ioctl(KVM_RUN)
> on its kvm vcpu fd.
> [...]
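The kick path described above, a guest write to a port I/O register that the host-side device model decodes, can be sketched as follows. The two register offsets are the legacy virtio PCI layout from linux/virtio_pci.h; everything else (`struct fake_virtio_dev`, `fake_ioport_write()`, `fake_guest_kick()`, `MAX_QUEUES`) is a hypothetical stand-in and not the code in hw/virtio-pci.c or drivers/virtio/virtio_pci.c. In a real guest the write traps out of the guest and reaches the device model via KVM; here it is an ordinary function call.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets within the legacy virtio PCI I/O region (linux/virtio_pci.h). */
#define VIRTIO_PCI_QUEUE_SEL    14
#define VIRTIO_PCI_QUEUE_NOTIFY 16

#define MAX_QUEUES 4  /* arbitrary limit for this sketch */

struct fake_virtio_dev {
    uint16_t queue_sel;           /* queue selected for configuration */
    unsigned kicks[MAX_QUEUES];   /* notifications received per queue */
};

/* Host side: decode a write to the device's I/O region. addr is the offset
 * inside the region, val the value the guest wrote. */
static void fake_ioport_write(struct fake_virtio_dev *dev, uint32_t addr,
                              uint32_t val)
{
    assert(dev);
    switch (addr) {
    case VIRTIO_PCI_QUEUE_SEL:
        if (val < MAX_QUEUES)
            dev->queue_sel = (uint16_t)val;
        break;
    case VIRTIO_PCI_QUEUE_NOTIFY:
        /* The written value is the index of the queue being kicked;
         * a real backend would now process that queue's buffers. */
        if (val < MAX_QUEUES)
            dev->kicks[val]++;
        break;
    default:
        break;  /* other registers are ignored in this sketch */
    }
}

/* Guest side: a kick is a write of the queue index to the notify register. */
static void fake_guest_kick(struct fake_virtio_dev *dev, uint16_t queue_index)
{
    fake_ioport_write(dev, VIRTIO_PCI_QUEUE_NOTIFY, queue_index);
}
```

Kicking queue 1 twice and queue 0 once leaves `kicks[1] == 2` and `kicks[0] == 1`, which is all the host needs to know which virtqueue to service.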
* Re: Where is the entry of hypercalls in kvm?
From: Peter Teoh @ 2010-06-30 16:36 UTC
To: Anthony Liguori; +Cc: Alexander Graf, Balachandar, kernelnewbies, kvm

Just to clarify further.

On Wed, Jun 30, 2010 at 11:10 PM, Anthony Liguori <anthony@codemonkey.ws> wrote:
> On 06/30/2010 03:56 AM, Alexander Graf wrote:
>> In mode 2, qemu executes _everything_ by calling KVM. There is no guest
>> code interpreted, looked at or whatever in qemu.
>
> Only because there is a mini-x86 interpreter in the kernel. That lets KVM
> expose an idealized interface to qemu that requires no instruction
> interpretation.

From the ioctl call in QEMU, the following will be called:

int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
{
        int r;
        sigset_t sigsaved;

and then subsequently:

        r = emulate_instruction(vcpu, vcpu->arch.mmio_fault_cr2, 0,
                                EMULTYPE_NO_DECODE);

which will interpret the x86 instruction stream, correct? So does it
distinguish between ring 0 and ring 3 instructions?

> More to the point of the original question, virtio is typically implemented
> on top of an emulated PCI device. The kick operation is implemented as a
> write to a PCI IO region that's mapped to PIO. If you look at
> hw/virtio-pci.c, you'll see the entry points.
>
> Regards,
>
> Anthony Liguori

--
Regards,
Peter Teoh
* Re: Where is the entry of hypercalls in kvm?
From: Peter Teoh @ 2010-06-30 16:28 UTC
To: Alexander Graf; +Cc: Balachandar, kernelnewbies, kvm

Thank you Alex for the reply, very glad to know you!

On Wed, Jun 30, 2010 at 4:56 PM, Alexander Graf <agraf@suse.de> wrote:
> Eh, no.
>
> There are two modes of operation:
>
> 1) TCG
> 2) KVM

Now I am clear: translate-all.c and kvm-all.c are the two main files in
QEMU for these modes. Thanks for that!

> In mode 1, qemu goes through target-xxx/translate.c and converts the
> basic blocks you were talking about above to native machine code on the
> host system using tcg (see the tcg directory). No KVM is involved,
> everything happens in user mode.
>
> In mode 2, qemu executes _everything_ by calling KVM. There is no guest
> code interpreted, looked at or whatever in qemu. Whenever the guest CPU
> runs, it runs because qemu called ioctl(KVM_RUN) on its kvm vcpu fd.

Now I don't understand. Guest code usually has two parts, one running in
ring 3 and another in ring 0, so if we were running everything in KVM,
wouldn't that pose a security risk? As far as I know, VMware uses ring 1
to run ALL the guest code and transitions to ring 0 whenever a
privileged instruction is encountered. So what is the equivalent
mechanism in qemu? The key issue I am facing here is basically the
"privileged insn": shouldn't only these execute in the kvm module, which
runs in ring 0, while the rest is best run at a lower privilege level?

> in kvm-xxx.ko for x86.
>
> Also, please don't top post :)
>
> Alex

Thanks again.

--
Regards,
Peter Teoh
* Re: Where is the entry of hypercalls in kvm?
From: Alexander Graf @ 2010-06-30 16:32 UTC
To: Peter Teoh; +Cc: Balachandar, kernelnewbies, kvm

On 30.06.2010, at 18:28, Peter Teoh wrote:

> Now I don't understand.....guest codes usually have two parts --> one
> running in ring3, and another in ring0, so if we were running
> everything in KVM, won't it posed a security risks? as far as I
> know, VMware use ring1 to run ALL the guest codes, and transition to
> ring0 whenever privilege instructions is encountered. so what is the
> equivalent mechanism in qemu? Key issue I am facing with here is
> basically "privilege insn", -----> only these should be executing in
> kvm module, which is running in ring0, and the rest is best to be at
> lower level?

Modern x86 CPUs give you a fake ring 0 mode where privileged
instructions can either be trapped or act on shadow CPU state that gets
swapped with the host state. See the description of the Secure Virtual
Machine (AMD) or VT-x (Intel) frameworks in their respective CPU
architecture manuals.

Alex
* Re: Where is the entry of hypercalls in kvm?
From: Anthony Liguori @ 2010-06-30 16:34 UTC
To: Peter Teoh; +Cc: Alexander Graf, Balachandar, kernelnewbies, kvm

On 06/30/2010 11:28 AM, Peter Teoh wrote:
> Now I don't understand.....guest codes usually have two parts --> one
> running in ring3, and another in ring0, so if we were running
> everything in KVM, won't it posed a security risks? as far as I
> know, VMware use ring1 to run ALL the guest codes, and transition to
> ring0 whenever privilege instructions is encountered.

This is not quite accurate anymore. VT and SVM introduce what's often
called compressed ring 0 mode. In this new mode, you can execute code in
ring 0, 1, 2, or 3 but trap any operations that are potentially
sensitive (like IO operations). The act of trapping these events results
in a transition from compressed ring 0 to normal ring 0. This transition
is called a vmexit.

KVM enables compressed ring 0 mode and runs all guest code in that mode.
It directly handles all vmexits and decides to pass a subset of those
exits down to qemu for further handling. Typically, this subset includes
anything that requires device emulation.

Regards,

Anthony Liguori

> so what is the
> equivalent mechanism in qemu? Key issue I am facing with here is
> basically "privilege insn", -----> only these should be executing in
> kvm module, which is running in ring0, and the rest is best to be at
> lower level?
> [...]
* Re: Where is the entry of hypercalls in kvm?
From: Balachandar @ 2010-06-30 14:59 UTC
To: Peter Teoh; +Cc: kernelnewbies, kvm

On Wed, Jun 30, 2010 at 4:17 AM, Peter Teoh <htmldeveloper@gmail.com> wrote:
> [...]
> and vmx_vcpu_run() is the the answer to your question.....i supposed?
>
> Perhaps another summary resource:
>
> http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWAR05015_WinHEC05.ppt
>
> As for virtio_net.....it is implemented in
> drivers/net/virtio_net.c......not sure what is your question?

Thank you for your elaborate answer. My question is: what code in
qemu-kvm is called when the kick function is called in virtio_net? The
kick function does an ioport write, and this is trapped by the
hypervisor into kvm. Then kvm calls some function in qemu-kvm userspace
for I/O emulation. So for this particular case, virtio_net, what is the
function in qemu-kvm that is called when a kick is encountered in the
guest?

Thanks,
Bala
* Re: Where is the entry of hypercalls in kvm? 2010-06-30 14:59 ` Balachandar @ 2010-06-30 15:02 ` Balachandar 2010-06-30 15:12 ` Anthony Liguori 0 siblings, 1 reply; 11+ messages in thread From: Balachandar @ 2010-06-30 15:02 UTC (permalink / raw) To: Peter Teoh; +Cc: kernelnewbies, kvm

On Wed, Jun 30, 2010 at 10:59 AM, Balachandar <bala1486@gmail.com> wrote:
> On Wed, Jun 30, 2010 at 4:17 AM, Peter Teoh <htmldeveloper@gmail.com> wrote:
>> [...]
>> As for virtio_net.....it is implemented in
>> drivers/net/virtio_net.c......not sure what is your question?
>>
> Thank you for your elaborate answer. My question is what is the code
> in qemu-kvm that is called when kick function is called in virtio_net?
> The kick function does some ioport write and this will be trapped by
> the hypervisor into kvm. Then kvm will call some function in qemu-kvm
> userspace for io emulation. So for this particular case virtio_net
> what is the function in qemu-kvm that will be called when kick is
> encountered in the guest?
>

I already got the answer from Alexander. If anyone is looking the function is virtio_net_write in hw/virtio_pci.c
* Re: Where is the entry of hypercalls in kvm? 2010-06-30 15:02 ` Balachandar @ 2010-06-30 15:12 ` Anthony Liguori 0 siblings, 0 replies; 11+ messages in thread From: Anthony Liguori @ 2010-06-30 15:12 UTC (permalink / raw) To: Balachandar; +Cc: Peter Teoh, kernelnewbies, kvm

On 06/30/2010 10:02 AM, Balachandar wrote:
> On Wed, Jun 30, 2010 at 10:59 AM, Balachandar<bala1486@gmail.com> wrote:
>> [...]
>> So for this particular case virtio_net
>> what is the function in qemu-kvm that will be called when kick is
>> encountered in the guest?
>>
> I already got the answer from Alexander. If anyone is looking the
> function is virtio_net_write in hw/virtio_pci.c
>

virtio_ioport_write() in hw/virtio_pci.c. It eventually goes to virtio_net_handle_tx, virtio_net_handle_rx, or virtio_net_handle_ctrl depending on which queue is being notified.

Regards,

Anthony Liguori
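A minimal sketch of the dispatch described above, where a single port-write handler forwards a queue notification to a per-queue callback. Everything prefixed `toy_` or `TOY_` is hypothetical and is not QEMU's code: the notify offset of 16, the limit of three queues, and the queue order (0 = rx, 1 = tx, 2 = ctrl) are assumptions made for the illustration. The only point it shows is that the value written by the guest selects which handler runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_QUEUE_NOTIFY 16  /* assumed offset of the notify register */
#define TOY_MAX_QUEUES    3  /* assumed: rx, tx, ctrl */

struct toy_vdev;
typedef void (*toy_handler)(struct toy_vdev *vdev);

/* Hypothetical device state: one callback per queue, plus counters so
 * the effect of each notification can be observed. */
struct toy_vdev {
    toy_handler handle_output[TOY_MAX_QUEUES];
    int rx_kicks;
    int tx_kicks;
    int ctrl_kicks;
};

static void toy_handle_rx(struct toy_vdev *vdev)   { vdev->rx_kicks++; }
static void toy_handle_tx(struct toy_vdev *vdev)   { vdev->tx_kicks++; }
static void toy_handle_ctrl(struct toy_vdev *vdev) { vdev->ctrl_kicks++; }

/* Register the per-queue callbacks (queue order is an assumption). */
static void toy_net_init(struct toy_vdev *vdev)
{
    assert(vdev != NULL);
    vdev->handle_output[0] = toy_handle_rx;
    vdev->handle_output[1] = toy_handle_tx;
    vdev->handle_output[2] = toy_handle_ctrl;
    vdev->rx_kicks = 0;
    vdev->tx_kicks = 0;
    vdev->ctrl_kicks = 0;
}

/* Port-write entry point: a write to the notify register carries the
 * index of the queue being kicked. Returns 1 if a handler ran, 0 for
 * any other register or an out-of-range queue index. */
static int toy_ioport_write(struct toy_vdev *vdev, uint32_t offset, uint32_t val)
{
    assert(vdev != NULL);
    if (offset != TOY_QUEUE_NOTIFY)
        return 0;
    if (val >= TOY_MAX_QUEUES || vdev->handle_output[val] == NULL)
        return 0;
    vdev->handle_output[val](vdev);
    return 1;
}
```

Under these assumptions, a guest kick on the transmit queue corresponds to `toy_ioport_write(&vdev, TOY_QUEUE_NOTIFY, 1)`, which runs only the tx callback. Storing the callbacks in an array indexed by queue number keeps the port-write handler independent of the device type.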
end of thread, other threads:[~2010-06-30 16:36 UTC | newest]

Thread overview: 11+ messages
2010-06-26 1:06 Where is the entry of hypercalls in kvm? Balachandar
2010-06-30 8:17 ` Peter Teoh
2010-06-30 8:56 ` Alexander Graf
2010-06-30 15:10 ` Anthony Liguori
2010-06-30 16:36 ` Peter Teoh
2010-06-30 16:28 ` Peter Teoh
2010-06-30 16:32 ` Alexander Graf
2010-06-30 16:34 ` Anthony Liguori
2010-06-30 14:59 ` Balachandar
2010-06-30 15:02 ` Balachandar
2010-06-30 15:12 ` Anthony Liguori