kvm.vger.kernel.org archive mirror
* (unknown), 
@ 2014-04-13 21:01 Marcus White
  2014-04-15  0:59 ` Marcus White
  0 siblings, 1 reply; 35+ messages in thread
From: Marcus White @ 2014-04-13 21:01 UTC (permalink / raw)
  To: kvm

Hello,
I have some basic questions regarding KVM and would appreciate any help :)

I have been reading about the KVM architecture and, as I understand
it, a guest shows up as a regular process on the host itself.

I have some questions around that:

1.  Are the guest processes implemented as a control group within the
overall VM process itself? Is the VM a kernel process or a user
process?

2. Is there a way for me to dedicate specific CPUs to a guest, so that
those CPUs are not used for any work on the host itself? Pinning just
makes sure a vCPU always runs on the same physical CPU; I am looking
for something more than that.

3. If the host is compiled as a non-preemptible kernel, kernel threads
run to completion until they give up the CPU themselves. I am trying
to understand what that would mean in the context of KVM and guest
VMs. If the VM is a user process, it means nothing, but I wasn't sure,
as per (1).

Cheers!
M

^ permalink raw reply	[flat|nested] 35+ messages in thread
* (no subject)
@ 2022-01-14 10:54 Li RongQing
  2022-01-14 10:55 ` Paolo Bonzini
  0 siblings, 1 reply; 35+ messages in thread
From: Li RongQing @ 2022-01-14 10:54 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, wanpengli, jmattson, tglx, bp, x86,
	kvm, joro, peterz

After paravirtualized TLB shootdown support was added, steal_time.preempted
holds not only KVM_VCPU_PREEMPTED but also KVM_VCPU_FLUSH_TLB,
so kvm_vcpu_is_preempted() should test only the KVM_VCPU_PREEMPTED bit.
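
For reference, a minimal C sketch (illustrative only, not part of the
patch) of what the fixed assembly computes:

    /* Illustrative C equivalent of the fixed assembly: return only the
     * preempted bit; KVM_VCPU_FLUSH_TLB may also be set in
     * steal_time.preempted and must be ignored. */
    static bool vcpu_is_preempted_sketch(u8 preempted)
    {
            return preempted & KVM_VCPU_PREEMPTED;
    }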

Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
Changes from v1:
- clear the rest of RAX, as suggested by Sean and Peter
- drop the Fixes tag, since there is no issue in practice

 arch/x86/kernel/kvm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index b061d17..45c9ce8d 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -1025,8 +1025,8 @@ asm(
 ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
 "__raw_callee_save___kvm_vcpu_is_preempted:"
 "movq	__per_cpu_offset(,%rdi,8), %rax;"
-"cmpb	$0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
-"setne	%al;"
+"movb	" __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax), %al;"
+"and	$" __stringify(KVM_VCPU_PREEMPTED) ", %rax;"
 "ret;"
 ".size __raw_callee_save___kvm_vcpu_is_preempted, .-__raw_callee_save___kvm_vcpu_is_preempted;"
 ".popsection");
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread
[parent not found: <E1hUrZM-0007qA-Q8@sslproxy01.your-server.de>]
* (unknown)
@ 2019-03-19 14:41 Maxim Levitsky
  2019-03-20 11:03 ` Felipe Franciosi
  0 siblings, 1 reply; 35+ messages in thread
From: Maxim Levitsky @ 2019-03-19 14:41 UTC (permalink / raw)
  To: linux-nvme
  Cc: Maxim Levitsky, linux-kernel, kvm, Jens Axboe, Alex Williamson,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, Kirti Wankhede,
	David S . Miller, Mauro Carvalho Chehab, Greg Kroah-Hartman,
	Wolfram Sang, Nicolas Ferre, Paul E . McKenney , Paolo Bonzini,
	Liang Cunming, Liu Changpeng, Fam Zheng, Amnon Ilan, John 

Date: Tue, 19 Mar 2019 14:45:45 +0200
Subject: [PATCH 0/9] RFC: NVME VFIO mediated device

Hi everyone!

In this patch series I would like to introduce my take on the problem of
virtualizing storage as fast as possible, with an emphasis on low latency.

I implemented a kernel VFIO-based mediated device that allows the user to
pass through a partition and/or a whole namespace to a guest.

The idea behind this driver is based on the paper you can find at
https://www.usenix.org/conference/atc18/presentation/peng,
although note that I started the development independently, prior to
reading this paper.

In addition, the implementation is not based on the code used in the
paper, as the source was not available to me at that time.

***Key points about the implementation:***

* A polling kernel thread is used. Polling stops after a predefined
idle timeout (1/2 sec by default).
A fully interrupt-driven mode is planned; a proof of concept already
shows promising results. (A minimal sketch of the polling loop follows
this list.)

* The guest sees a standard NVMe device - this allows running guests
with unmodified drivers, for example Windows guests.

* The NVMe device is shared between host and guest.
That means that even a single namespace can be split between host 
and guest based on different partitions.

* Simple configuration
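
As referenced above, a minimal sketch of the idle-timeout polling loop.
All names here (mdev_poll_ctx, mdev_poll_queues()) are my illustrations,
not the actual identifiers used in the series:

    /* Sketch: poll the IO queues, and go to sleep once no IO has been
     * seen for the configured timeout. */
    struct mdev_poll_ctx {                  /* hypothetical context */
            unsigned long idle_timeout;     /* in jiffies */
    };

    /* hypothetical: returns true if any IO was processed this pass */
    static bool mdev_poll_queues(struct mdev_poll_ctx *ctx);

    static int mdev_poll_thread(void *data)
    {
            struct mdev_poll_ctx *ctx = data;
            unsigned long deadline = jiffies + ctx->idle_timeout;

            while (!kthread_should_stop()) {
                    if (mdev_poll_queues(ctx)) {
                            /* IO seen: rearm the idle timer */
                            deadline = jiffies + ctx->idle_timeout;
                    } else if (time_after(jiffies, deadline)) {
                            /* idle too long: sleep until new IO wakes us */
                            schedule_timeout_interruptible(MAX_SCHEDULE_TIMEOUT);
                            deadline = jiffies + ctx->idle_timeout;
                    }
                    cond_resched();
            }
            return 0;
    }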

*** Performance ***

Performance was tested on an Intel DC P3700 with a Xeon E5-2620 v2;
both latency and throughput are very similar to SPDK.

Soon I will test this on a better server and NVMe device and provide
more formal performance numbers.

Latency numbers:
~80us - SPDK with the fio plugin on the host
~84us - nvme driver on the host
~87us - mdev-nvme + nvme driver in the guest

Throughput followed a similar pattern.

* Configuration example
  $ modprobe nvme mdev_queues=4
  $ modprobe nvme-mdev

  $ UUID=$(uuidgen)
  $ DEVICE='device pci address'
  $ echo $UUID > /sys/bus/pci/devices/$DEVICE/mdev_supported_types/nvme-2Q_V1/create
  $ echo n1p3 > /sys/bus/mdev/devices/$UUID/namespaces/add_namespace #attach host namespace 1 partition 3
  $ echo 11 > /sys/bus/mdev/devices/$UUID/settings/iothread_cpu #pin the io thread to cpu 11

  Afterward boot qemu with
  -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$UUID
  
  Zero configuration on the guest.
  
*** FAQ ***

* Why do this in the kernel? Why is this better than SPDK?

  -> It reuses the existing nvme kernel driver on the host, and needs no
     new drivers in the guest.

  -> It shares the NVMe device between host and guest.
     Even in fully virtualized configurations,
     some partitions of an NVMe device can be used by guests as block devices
     while others are passed through with nvme-mdev, striking a balance
     between the features of full IO stack emulation and performance.

  -> nvme-mdev is a bit faster because the in-kernel driver
     can send interrupts to the guest directly, without a context
     switch that can be expensive due to the Meltdown mitigation.

  -> It is able to use interrupts to get reasonable performance.
     This is only implemented as a proof of concept and not included
     in the patches, but the interrupt-driven mode shows reasonable
     performance.

  -> It is a framework that can later be used to support NVMe devices
     with more of the IO virtualization built in
     (an IOMMU with PASID support, coupled with a device that supports it).

* Why attach directly to the nvme-pci driver instead of using block-layer IO?
  -> Direct attachment allows for better performance, but I will
     check the possibility of using block IO, especially for the fabrics drivers.
  
*** Implementation notes ***

*  All guest memory is mapped into the physical NVMe device,
   but not 1:1 as vfio-pci would do.
   This allows very efficient DMA.
   To support this, patch 2 adds the ability for an mdev device to listen
   for the guest's memory-map events.
   Any such memory is immediately pinned and then DMA mapped;
   a rough sketch of this flow follows these notes.
   (Support for fabric drivers, where this is not possible, exists too;
    in that case the fabric driver does its own DMA mapping.)

*  The nvme core driver is modified to announce the appearance
   and disappearance of NVMe controllers and namespaces,
   to which the nvme-mdev driver subscribes.

*  The nvme-pci driver is modified to expose a raw interface for attaching
   to and submitting/polling the IO queues.
   This lets the mdev driver submit and poll for IO very efficiently.
   By default, one host queue is used per mediated device.
   (Support for other fabric-based host drivers is planned.)

*  nvme-mdev doesn't assume the presence of KVM, so any VFIO user, including
   SPDK or a qemu running with TCG, can use this virtual device.
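
As referenced in the first note above, a rough sketch of the
pin-then-DMA-map flow. The names here (mdev_region, mdev_map_guest_page())
are my illustrations, not the identifiers used in the series:

    struct mdev_region {                /* hypothetical */
            unsigned long useraddr;     /* userspace address of guest RAM */
            dma_addr_t dma_addr;
            struct page *page;
    };

    /* Sketch: on a guest memory-map event, pin the backing page right
     * away and map it for DMA by the physical NVMe controller. */
    static int mdev_map_guest_page(struct mdev_region *r, struct device *hwdev)
    {
            struct page *page;

            /* pin the guest page backing this region */
            if (get_user_pages_fast(r->useraddr, 1, FOLL_WRITE, &page) != 1)
                    return -EFAULT;

            /* map the pinned page for the device's DMA */
            r->dma_addr = dma_map_page(hwdev, page, 0, PAGE_SIZE,
                                       DMA_BIDIRECTIONAL);
            if (dma_mapping_error(hwdev, r->dma_addr)) {
                    put_page(page);
                    return -ENOMEM;
            }
            r->page = page;
            return 0;
    }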

*** Testing ***

The device was tested with stock QEMU 3.0 on the host;
the host ran a 5.0 kernel with nvme-mdev added, on the following hardware:
 * QEMU nvme virtual device (with a nested guest)
 * Intel DC P3700 on a Xeon E5-2620 v2 server
 * Samsung SM981 (in a Thunderbolt enclosure, with my laptop)
 * the Lenovo NVMe device found in my laptop

The guest was tested with kernels 4.16, 4.18, 4.20 and
the same custom compiled 5.0 kernel.
A Windows 10 guest was tested too, with both Microsoft's inbox driver and
the open source community NVMe driver
(https://lists.openfabrics.org/pipermail/nvmewin/2016-December/001420.html).

Testing was mostly done on x86_64, but a 32-bit host/guest combination
was lightly tested too.

In addition, the virtual device was tested with a nested guest,
by passing the virtual device through to it
using PCI passthrough, the qemu userspace nvme driver, and SPDK.

PS: I used to contribute to the kernel as a hobby using the
    maximlevitsky@gmail.com address

Maxim Levitsky (9):
  vfio/mdev: add .request callback
  nvme/core: add some more values from the spec
  nvme/core: add NVME_CTRL_SUSPENDED controller state
  nvme/pci: use the NVME_CTRL_SUSPENDED state
  nvme/pci: add known admin effects to augument admin effects log page
  nvme/pci: init shadow doorbell after each reset
  nvme/core: add mdev interfaces
  nvme/core: add nvme-mdev core driver
  nvme/pci: implement the mdev external queue allocation interface

 MAINTAINERS                   |   5 +
 drivers/nvme/Kconfig          |   1 +
 drivers/nvme/Makefile         |   1 +
 drivers/nvme/host/core.c      | 149 +++++-
 drivers/nvme/host/nvme.h      |  55 ++-
 drivers/nvme/host/pci.c       | 385 ++++++++++++++-
 drivers/nvme/mdev/Kconfig     |  16 +
 drivers/nvme/mdev/Makefile    |   5 +
 drivers/nvme/mdev/adm.c       | 873 ++++++++++++++++++++++++++++++++++
 drivers/nvme/mdev/events.c    | 142 ++++++
 drivers/nvme/mdev/host.c      | 491 +++++++++++++++++++
 drivers/nvme/mdev/instance.c  | 802 +++++++++++++++++++++++++++++++
 drivers/nvme/mdev/io.c        | 563 ++++++++++++++++++++++
 drivers/nvme/mdev/irq.c       | 264 ++++++++++
 drivers/nvme/mdev/mdev.h      |  56 +++
 drivers/nvme/mdev/mmio.c      | 591 +++++++++++++++++++++++
 drivers/nvme/mdev/pci.c       | 247 ++++++++++
 drivers/nvme/mdev/priv.h      | 700 +++++++++++++++++++++++++++
 drivers/nvme/mdev/udata.c     | 390 +++++++++++++++
 drivers/nvme/mdev/vcq.c       | 207 ++++++++
 drivers/nvme/mdev/vctrl.c     | 514 ++++++++++++++++++++
 drivers/nvme/mdev/viommu.c    | 322 +++++++++++++
 drivers/nvme/mdev/vns.c       | 356 ++++++++++++++
 drivers/nvme/mdev/vsq.c       | 178 +++++++
 drivers/vfio/mdev/vfio_mdev.c |  11 +
 include/linux/mdev.h          |   4 +
 include/linux/nvme.h          |  88 +++-
 27 files changed, 7375 insertions(+), 41 deletions(-)
 create mode 100644 drivers/nvme/mdev/Kconfig
 create mode 100644 drivers/nvme/mdev/Makefile
 create mode 100644 drivers/nvme/mdev/adm.c
 create mode 100644 drivers/nvme/mdev/events.c
 create mode 100644 drivers/nvme/mdev/host.c
 create mode 100644 drivers/nvme/mdev/instance.c
 create mode 100644 drivers/nvme/mdev/io.c
 create mode 100644 drivers/nvme/mdev/irq.c
 create mode 100644 drivers/nvme/mdev/mdev.h
 create mode 100644 drivers/nvme/mdev/mmio.c
 create mode 100644 drivers/nvme/mdev/pci.c
 create mode 100644 drivers/nvme/mdev/priv.h
 create mode 100644 drivers/nvme/mdev/udata.c
 create mode 100644 drivers/nvme/mdev/vcq.c
 create mode 100644 drivers/nvme/mdev/vctrl.c
 create mode 100644 drivers/nvme/mdev/viommu.c
 create mode 100644 drivers/nvme/mdev/vns.c
 create mode 100644 drivers/nvme/mdev/vsq.c

-- 
2.17.2

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re:
@ 2018-02-05  5:28 Fahama Vaserman
  0 siblings, 0 replies; 35+ messages in thread
From: Fahama Vaserman @ 2018-02-05  5:28 UTC (permalink / raw)
  To: info; +Cc: info

Can I confide in you?

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re:
@ 2017-11-13 14:44 Amos Kalonzo
  0 siblings, 0 replies; 35+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:44 UTC (permalink / raw)


Attn:

I am wondering why you haven't responded to my email for some days now,
in reference to my client's contract balance payment of (11.7M USD).
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 35+ messages in thread
[parent not found: <CAMj-D2DO_CfvD77izsGfggoKP45HSC9aD6auUPAYC9Yeq_aX7w@mail.gmail.com>]
* RE:
@ 2017-02-23 15:09 Qin's Yanjun
  0 siblings, 0 replies; 35+ messages in thread
From: Qin's Yanjun @ 2017-02-23 15:09 UTC (permalink / raw)



How are you today, and your family? I require your attention and honest
co-operation about some issues which I would really like to discuss with
you. Looking forward to reading from you soon.

Qin's


______________________________

Sky Silk, http://aknet.kz

^ permalink raw reply	[flat|nested] 35+ messages in thread
[parent not found: <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D@MGCCCMAIL2010-5.mgccc.cc.ms.us>]
* Re:
@ 2013-06-28 10:14 emirates
  0 siblings, 0 replies; 35+ messages in thread
From: emirates @ 2013-06-28 10:14 UTC (permalink / raw)
  To: info

Did You Receive Our Last Notification?(Reply Via fly.emiratesairline@5d6d.cn)


^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re:
@ 2013-06-28 10:12 emirates
  0 siblings, 0 replies; 35+ messages in thread
From: emirates @ 2013-06-28 10:12 UTC (permalink / raw)
  To: info

Did You Receive Our Last Notification?(Reply Via fly.emiratesairline@5d6d.cn)


^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re:
@ 2013-06-27 21:21 emirates
  0 siblings, 0 replies; 35+ messages in thread
From: emirates @ 2013-06-27 21:21 UTC (permalink / raw)
  To: info


Did You Receive Our Last Notification!!


^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re:.
@ 2011-10-29 21:27 Young Chang
  0 siblings, 0 replies; 35+ messages in thread
From: Young Chang @ 2011-10-29 21:27 UTC (permalink / raw)


May I ask if you would be eligible to pursue a Business Proposal of
$19.7m with me, if you don't mind? Let me know if you are interested.




----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.


^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re:
@ 2010-12-14  3:03 Irish Online News
  0 siblings, 0 replies; 35+ messages in thread
From: Irish Online News @ 2010-12-14  3:03 UTC (permalink / raw)





You've earned 750,000 GBP. Send Necessary Information: Name, Age, Country


^ permalink raw reply	[flat|nested] 35+ messages in thread
[parent not found: <20090427104117.GB29082@redhat.com>]
* "-vga std" causes guest OS to crash (on disk io?) on a 1440x900 latpop.
@ 2009-03-08 22:05 Dylan Reid
  2009-03-14  5:09 ` Dylan
  0 siblings, 1 reply; 35+ messages in thread
From: Dylan Reid @ 2009-03-08 22:05 UTC (permalink / raw)
  To: kvm

When I run the guest without the -vga option, everything is fine; the
resolution isn't what I would like, but it works remarkably well.

If I add "-vga std" to the command I use to run the guest, it will
crash when it attempts to start graphics. This also happens if I
attempt to boot from an .iso with the "-vga std" option. If I attempt
to install Ubuntu 8.10 for x86, I get errors about not being able to
read startup files in the /etc directory.
This also happens if I use the -no-kvm switch.

reidd@heman:/usr/appliances$ uname -a
Linux heman 2.6.27-11-server #1 SMP Thu Jan 29 20:19:41 UTC 2009 i686 GNU/Linux

This is running on an AMD Turion X2 in a Toshiba Satellite P305D.

I am using kvm-84, but I also tested kvm-83 with the same results.

Any ideas on why this would happen?

Thanks,

Dylan

^ permalink raw reply	[flat|nested] 35+ messages in thread
* (unknown), 
@ 2009-02-25  0:50 Josh Borke
  2009-02-25  0:58 ` Atsushi SAKAI
  0 siblings, 1 reply; 35+ messages in thread
From: Josh Borke @ 2009-02-25  0:50 UTC (permalink / raw)
  To: kvm

subscribe kvm

^ permalink raw reply	[flat|nested] 35+ messages in thread
* (unknown)
@ 2009-01-10 21:53 Ekin Meroğlu
  2009-11-07 15:59 ` Bulent Abali
  0 siblings, 1 reply; 35+ messages in thread
From: Ekin Meroğlu @ 2009-01-10 21:53 UTC (permalink / raw)
  To: kvm

subscribe kvm

^ permalink raw reply	[flat|nested] 35+ messages in thread
* (unknown)
@ 2008-07-28 21:27 Mohammed Gamal
  2008-07-28 21:29 ` Mohammed Gamal
  0 siblings, 1 reply; 35+ messages in thread
From: Mohammed Gamal @ 2008-07-28 21:27 UTC (permalink / raw)
  To: kvm; +Cc: avi, guillaume.thouvenin

Cc: laurent.vivier@bull.net, riel@surriel.com
Subject: [RFC][PATCH] VMX: Add and enhance VMentry failure detection mechanism

This patch is *not* meant to be merged. It fixes the random crashes
with gfxboot, which no longer crashes at random instructions.

It mainly does two things:
1- It handles all possible exit reasons before handling VMX failures
2- It handles vmentry failures while avoiding external interrupts

However, while this patch allows booting FreeDOS with HIMEM with no
problems, it does still occasionally crash with gfxboot at RIP 6e29.
Looking at the gfxboot code, the instruction causing the crash is as
follows:
00006e10 <switch_to_pm_20>:
    6e10:	66 b8 20 00          	mov    $0x20,%ax
    6e14:	8e d8                	mov    %eax,%ds
    6e16:	8c d0                	mov    %ss,%eax
    6e18:	81 e4 ff ff 00 00    	and    $0xffff,%esp
    6e1e:	c1 e0 04             	shl    $0x4,%eax
    6e21:	01 c4                	add    %eax,%esp
    6e23:	66 b8 08 00          	mov    $0x8,%ax
    6e27:	8e d0                	mov    %eax,%ss
    6e29:	8e c0                	mov    %eax,%es
    6e2b:	8e e0                	mov    %eax,%fs
    6e2d:	8e e8                	mov    %eax,%gs
    6e2f:	58                   	pop    %eax
    6e30:	66 9d                	popfw  
    6e32:	66 c3                	retw   

So apparently, to fix the problem, we need to add further guest state
checks - namely for ES, FS and GS - to invalid_guest_state().
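
A rough sketch (my illustration, not part of the patch) of what such an
extended check could look like, reusing the patch's RPL-consistency idea;
the function name is hypothetical:

    /* Sketch: extend the SS-vs-CS RPL check to the other data segments.
     * The GUEST_*_SELECTOR reads mirror those already used in the patch. */
    static bool guest_data_segs_consistent(void)
    {
            u16 cs = vmcs_read16(GUEST_CS_SELECTOR);

            /* each data segment's RPL must match the RPL of CS */
            return ((vmcs_read16(GUEST_SS_SELECTOR) & 0x03) == (cs & 0x03)) &&
                   ((vmcs_read16(GUEST_ES_SELECTOR) & 0x03) == (cs & 0x03)) &&
                   ((vmcs_read16(GUEST_FS_SELECTOR) & 0x03) == (cs & 0x03)) &&
                   ((vmcs_read16(GUEST_GS_SELECTOR) & 0x03) == (cs & 0x03));
    }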

Now enough talk, here is the patch

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>

---
 arch/x86/kvm/vmx.c         |  116 +++++++++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/vmx.h         |    3 +
 include/asm-x86/kvm_host.h |    1 +
 3 files changed, 112 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c4510fe..b438f94 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1316,7 +1316,8 @@ static void enter_pmode(struct kvm_vcpu *vcpu)
 	fix_pmode_dataseg(VCPU_SREG_GS, &vcpu->arch.rmode.gs);
 	fix_pmode_dataseg(VCPU_SREG_FS, &vcpu->arch.rmode.fs);
 
-	vmcs_write16(GUEST_SS_SELECTOR, 0);
+	if (vcpu->arch.rmode_failed)
+		vmcs_write16(GUEST_SS_SELECTOR, 0);
 	vmcs_write32(GUEST_SS_AR_BYTES, 0x93);
 
 	vmcs_write16(GUEST_CS_SELECTOR,
@@ -2708,6 +2709,93 @@ static int handle_nmi_window(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	return 1;
 }
 
+static int invalid_guest_state(struct kvm_vcpu *vcpu,
+		struct kvm_run *kvm_run, u32 failure_reason)
+{
+	u16 ss, cs;
+	u8 opcodes[4];
+	unsigned long rip = kvm_rip_read(vcpu);
+	unsigned long rip_linear;
+
+	ss = vmcs_read16(GUEST_SS_SELECTOR);
+	cs = vmcs_read16(GUEST_CS_SELECTOR);
+
+	if ((ss & 0x03) != (cs & 0x03)) { 
+		int err;
+		rip_linear = rip + vmx_get_segment_base(vcpu, VCPU_SREG_CS);
+		emulator_read_std(rip_linear, (void *)opcodes, 4, vcpu);
+		err = emulate_instruction(vcpu, kvm_run, 0, 0, 0);
+		switch (err) {
+			case EMULATE_DONE:
+				return 1;
+			case EMULATE_DO_MMIO:
+				printk(KERN_INFO "mmio?\n");
+				return 0;
+			default:
+				/* HACK: If we can not emulate the instruction
+				 * we write a sane value on SS to pass sanity
+				 * checks. The good thing to do is to emulate the
+				 * instruction */
+				kvm_report_emulation_failure(vcpu, "vmentry failure");
+				printk(KERN_INFO "   => Quit real mode emulation\n");
+				vcpu->arch.rmode_failed = 1;
+				vmcs_write16(GUEST_SS_SELECTOR, 0);
+				return 1;
+		}
+	}
+
+	kvm_run->exit_reason = KVM_EXIT_UNKNOWN;
+	kvm_run->hw.hardware_exit_reason = failure_reason;
+	printk(KERN_INFO "Failed to handle invalid guest state\n");
+	return 0;
+}
+
+/*
+ * Should be replaced with exit handlers for each individual case
+ */
+static int handle_vmentry_failure(struct kvm_vcpu *vcpu,
+				  struct kvm_run *kvm_run,
+				  u32 failure_reason)
+{
+	unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
+	switch (failure_reason) {
+		case EXIT_REASON_INVALID_GUEST_STATE:
+			return invalid_guest_state(vcpu, kvm_run, failure_reason);
+		case EXIT_REASON_MSR_LOADING:
+			printk("VMentry failure caused by MSR entry %ld loading.\n",
+					exit_qualification);
+			printk("  ... Not handled\n");
+			break;
+		case EXIT_REASON_MACHINE_CHECK:
+			printk("VMentry failure caused by machine check.\n");
+			printk("  ... Not handled\n");
+			break;
+		default:
+			printk("reason not known yet!\n");
+			break;
+	}
+	return 0;
+}
+
+static int handle_invalid_guest_state(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+{
+	int rc;
+	u32 exit_reason = vmcs_read32(VM_EXIT_REASON);
+
+	/*
+ 	 * Disable interrupts to avoid occasional vmexits while
+ 	 * handling vmentry failures
+ 	 */ 
+	spin_lock_irq(&vmx_vpid_lock);
+	if(exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)
+		exit_reason &= ~VMX_EXIT_REASONS_FAILED_VMENTRY;
+
+	rc = invalid_guest_state(vcpu, kvm_run, exit_reason);
+	spin_unlock_irq(&vmx_vpid_lock);
+
+	return rc;
+}
+
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
  * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
@@ -2733,6 +2821,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu,
 	[EXIT_REASON_WBINVD]                  = handle_wbinvd,
 	[EXIT_REASON_TASK_SWITCH]             = handle_task_switch,
 	[EXIT_REASON_EPT_VIOLATION]	      = handle_ept_violation,
+	[EXIT_REASON_INVALID_GUEST_STATE]     = handle_invalid_guest_state,
 };
 
 static const int kvm_vmx_max_exit_handlers =
@@ -2758,21 +2847,32 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 		ept_load_pdptrs(vcpu);
 	}
 
-	if (unlikely(vmx->fail)) {
-		kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
-		kvm_run->fail_entry.hardware_entry_failure_reason
-			= vmcs_read32(VM_INSTRUCTION_ERROR);
-		return 0;
-	}
-
 	if ((vectoring_info & VECTORING_INFO_VALID_MASK) &&
 			(exit_reason != EXIT_REASON_EXCEPTION_NMI &&
 			exit_reason != EXIT_REASON_EPT_VIOLATION))
 		printk(KERN_WARNING "%s: unexpected, valid vectoring info and "
 		       "exit reason is 0x%x\n", __func__, exit_reason);
+
+	/*
+ 	 * Instead of using handle_vmentry_failure(), just clear
+ 	 * the vmentry failure bit and leave it to the exit handlers
+ 	 * to deal with the specific exit reason. 
+ 	 * The exit handlers other than invalid guest state handler 
+ 	 * will be added later.
+  	 */
+	if ((exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
+		exit_reason &= ~VMX_EXIT_REASONS_FAILED_VMENTRY;
+
+
+ 	/* Handle all possible exits first, handle failure later. */ 
 	if (exit_reason < kvm_vmx_max_exit_handlers
 	    && kvm_vmx_exit_handlers[exit_reason])
 		return kvm_vmx_exit_handlers[exit_reason](vcpu, kvm_run);
+	else if(unlikely(vmx->fail)) {
+		kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		kvm_run->fail_entry.hardware_entry_failure_reason
+			= vmcs_read32(VM_INSTRUCTION_ERROR);
+	}
 	else {
 		kvm_run->exit_reason = KVM_EXIT_UNKNOWN;
 		kvm_run->hw.hardware_exit_reason = exit_reason;
diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h
index 0c22e5f..cf8b771 100644
--- a/arch/x86/kvm/vmx.h
+++ b/arch/x86/kvm/vmx.h
@@ -239,7 +239,10 @@ enum vmcs_field {
 #define EXIT_REASON_IO_INSTRUCTION      30
 #define EXIT_REASON_MSR_READ            31
 #define EXIT_REASON_MSR_WRITE           32
+#define EXIT_REASON_INVALID_GUEST_STATE 33
+#define EXIT_REASON_MSR_LOADING         34
 #define EXIT_REASON_MWAIT_INSTRUCTION   36
+#define EXIT_REASON_MACHINE_CHECK       41
 #define EXIT_REASON_TPR_BELOW_THRESHOLD 43
 #define EXIT_REASON_APIC_ACCESS         44
 #define EXIT_REASON_EPT_VIOLATION       48
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 0b6b996..422d7c2 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -294,6 +294,7 @@ struct kvm_vcpu_arch {
 		} tr, es, ds, fs, gs;
 	} rmode;
 	int halt_request; /* real mode on Intel only */
+	int rmode_failed;
 
 	int cpuid_nent;
 	struct kvm_cpuid_entry2 cpuid_entries[KVM_MAX_CPUID_ENTRIES];
 

^ permalink raw reply related	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2022-01-14 17:17 UTC | newest]

Thread overview: 35+ messages
-- links below jump to the message on this page --
2014-04-13 21:01 (unknown), Marcus White
2014-04-15  0:59 ` Marcus White
2014-04-16 21:17   ` Re: Marcelo Tosatti
2014-04-17 21:33     ` Re: Marcus White
2014-04-21 21:49       ` Re: Marcelo Tosatti
  -- strict thread matches above, loose matches on Subject: below --
2022-01-14 10:54 Li RongQing
2022-01-14 10:55 ` Paolo Bonzini
2022-01-14 17:13   ` Re: Sean Christopherson
2022-01-14 17:17     ` Re: Paolo Bonzini
     [not found] <E1hUrZM-0007qA-Q8@sslproxy01.your-server.de>
2019-05-29 19:54 ` Re: Alex Williamson
2019-03-19 14:41 (unknown) Maxim Levitsky
2019-03-20 11:03 ` Felipe Franciosi
2019-03-20 19:08   ` Re: Maxim Levitsky
2019-03-21 16:12     ` Re: Stefan Hajnoczi
2019-03-21 16:21       ` Re: Keith Busch
2019-03-21 16:41         ` Re: Felipe Franciosi
2019-03-21 17:04           ` Re: Maxim Levitsky
2019-03-22  7:54             ` Re: Felipe Franciosi
2019-03-22 10:32               ` Re: Maxim Levitsky
2019-03-22 15:30               ` Re: Keith Busch
2019-03-25 15:44                 ` Re: Felipe Franciosi
2018-02-05  5:28 Re: Fahama Vaserman
2017-11-13 14:44 Re: Amos Kalonzo
     [not found] <CAMj-D2DO_CfvD77izsGfggoKP45HSC9aD6auUPAYC9Yeq_aX7w@mail.gmail.com>
2017-05-04 16:44 ` Re: gengdongjiu
2017-02-23 15:09 Qin's Yanjun
     [not found] <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D@MGCCCMAIL2010-5.mgccc.cc.ms.us>
2015-11-24 13:21 ` RE: Amis, Ryann
2013-06-28 10:14 emirates
2013-06-28 10:12 Re: emirates
2013-06-27 21:21 Re: emirates
2011-10-29 21:27 Re: Young Chang
2010-12-14  3:03 Re: Irish Online News
     [not found] <20090427104117.GB29082@redhat.com>
2009-04-27 13:16 ` Re: Sheng Yang
2009-03-08 22:05 "-vga std" causes guest OS to crash (on disk io?) on a 1440x900 laptop Dylan Reid
2009-03-14  5:09 ` Dylan
2009-02-25  0:50 (unknown), Josh Borke
2009-02-25  0:58 ` Atsushi SAKAI
2009-01-10 21:53 (unknown) Ekin Meroğlu
2009-11-07 15:59 ` Bulent Abali
2009-11-07 16:36   ` Neil Aggarwal
2008-07-28 21:27 (unknown) Mohammed Gamal
2008-07-28 21:29 ` Mohammed Gamal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).