linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] KVM: x86: Improve the handling of debug exceptions during instruction emulation
@ 2025-09-10  2:49 Hou Wenlong
  2025-09-10  2:49 ` [PATCH 1/7] KVM: x86: Set guest DR6 by kvm_queue_exception_p() in " Hou Wenlong
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Hou Wenlong @ 2025-09-10  2:49 UTC (permalink / raw)
  To: kvm
  Cc: Lai Jiangshan, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Shuah Khan, linux-kernel, linux-kselftest

During my testing, I found that guest debugging with 'DR6.BD' does not
work in instruction emulation, as the current code only considers the
guest's DR7. Upon reviewing the code, I also observed that the checks
for the userspace guest debugging feature and the guest's own debugging
feature are repeated in different places during instruction
emulation, but the overall logic is the same. If guest debugging
is enabled, it needs to exit to userspace; otherwise, a #DB
exception needs to be injected into the guest. Therefore, as
suggested by Jiangshan Lai, some cleanup has been done for #DB
handling in instruction emulation in this patchset. A new
function named 'kvm_inject_emulated_db()' is introduced to
consolidate all the checking logic. Moreover, I hope we can make
the #DB interception path use the same function as well.

Additionally, when I looked into the single-step #DB handling in
instruction emulation, I noticed that the interrupt shadow is toggled,
but it is not considered in the single-step #DB injection. This
oversight causes VM entry to fail on VMX (due to pending debug
exceptions checking) or breaks the 'MOV SS' suppressed #DB. For the
latter, I have kept the behavior for now in my patchset, as I need some
suggestions.

Hou Wenlong (7):
  KVM: x86: Set guest DR6 by kvm_queue_exception_p() in instruction
    emulation
  KVM: x86: Check guest debug in DR access instruction emulation
  KVM: x86: Only check effective code breakpoint in emulation
  KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the
    kvm_inject_emulated_db()
  KVM: VMX: Set 'BS' bit in pending debug exceptions during instruction
    emulation
  KVM: selftests: Verify guest debug DR7.GD checking during instruction
    emulation
  KVM: selftests: Verify 'BS' bit checking in pending debug exception
    during VM entry

 arch/x86/include/asm/kvm-x86-ops.h            |   1 +
 arch/x86/include/asm/kvm_host.h               |   1 +
 arch/x86/kvm/emulate.c                        |  14 +--
 arch/x86/kvm/kvm_emulate.h                    |   7 +-
 arch/x86/kvm/vmx/main.c                       |   9 ++
 arch/x86/kvm/vmx/vmx.c                        |  14 ++-
 arch/x86/kvm/vmx/x86_ops.h                    |   1 +
 arch/x86/kvm/x86.c                            | 109 +++++++++++-------
 arch/x86/kvm/x86.h                            |   7 ++
 .../selftests/kvm/include/x86/processor.h     |   3 +-
 tools/testing/selftests/kvm/x86/debug_regs.c  |  64 +++++++++-
 11 files changed, 167 insertions(+), 63 deletions(-)


base-commit: ecbcc2461839e848970468b44db32282e5059925
--
2.31.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/7] KVM: x86: Set guest DR6 by kvm_queue_exception_p() in instruction emulation
  2025-09-10  2:49 [PATCH 0/7] KVM: x86: Improve the handling of debug exceptions during instruction emulation Hou Wenlong
@ 2025-09-10  2:49 ` Hou Wenlong
  2025-09-10  2:49 ` [PATCH 2/7] KVM: x86: Check guest debug in DR access " Hou Wenlong
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Hou Wenlong @ 2025-09-10  2:49 UTC (permalink / raw)
  To: kvm
  Cc: Lai Jiangshan, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

Record DR6 in emulate_db() and use kvm_queue_exception_p() to set DR6
instead of directly using kvm_set_dr6() in emulation, which keeps the
handling of DR6 during #DB injection consistent with other code paths.

No functional change intended.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/emulate.c     | 14 ++++----------
 arch/x86/kvm/kvm_emulate.h |  6 +++++-
 arch/x86/kvm/x86.c         |  3 +++
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 542d3664afa3..18e3a732d106 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -593,8 +593,9 @@ static int emulate_exception(struct x86_emulate_ctxt *ctxt, int vec,
 	return X86EMUL_PROPAGATE_FAULT;
 }
 
-static int emulate_db(struct x86_emulate_ctxt *ctxt)
+static int emulate_db(struct x86_emulate_ctxt *ctxt, unsigned long dr6)
 {
+	ctxt->exception.dr6 = dr6;
 	return emulate_exception(ctxt, DB_VECTOR, 0, false);
 }
 
@@ -3857,15 +3858,8 @@ static int check_dr_read(struct x86_emulate_ctxt *ctxt)
 	if ((cr4 & X86_CR4_DE) && (dr == 4 || dr == 5))
 		return emulate_ud(ctxt);
 
-	if (ctxt->ops->get_dr(ctxt, 7) & DR7_GD) {
-		ulong dr6;
-
-		dr6 = ctxt->ops->get_dr(ctxt, 6);
-		dr6 &= ~DR_TRAP_BITS;
-		dr6 |= DR6_BD | DR6_ACTIVE_LOW;
-		ctxt->ops->set_dr(ctxt, 6, dr6);
-		return emulate_db(ctxt);
-	}
+	if (ctxt->ops->get_dr(ctxt, 7) & DR7_GD)
+		return emulate_db(ctxt, DR6_BD);
 
 	return X86EMUL_CONTINUE;
 }
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 7b5ddb787a25..a6fad7b938e3 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -24,7 +24,11 @@ struct x86_exception {
 	bool error_code_valid;
 	u16 error_code;
 	bool nested_page_fault;
-	u64 address; /* cr2 or nested page fault gpa */
+	union {
+		u64 address; /* cr2 or nested page fault gpa */
+		unsigned long dr6;
+		u64 payload;
+	};
 	u8 async_page_fault;
 	unsigned long exit_qualification;
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7ba2cdfdac44..b2e8322aeca7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8612,6 +8612,8 @@ static void inject_emulated_exception(struct kvm_vcpu *vcpu)
 
 	if (ctxt->exception.vector == PF_VECTOR)
 		kvm_inject_emulated_page_fault(vcpu, &ctxt->exception);
+	else if (ctxt->exception.vector == DB_VECTOR)
+		kvm_queue_exception_p(vcpu, DB_VECTOR, ctxt->exception.dr6);
 	else if (ctxt->exception.error_code_valid)
 		kvm_queue_exception_e(vcpu, ctxt->exception.vector,
 				      ctxt->exception.error_code);
@@ -8656,6 +8658,7 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
 	ctxt->interruptibility = 0;
 	ctxt->have_exception = false;
 	ctxt->exception.vector = -1;
+	ctxt->exception.payload = 0;
 	ctxt->perm_ok = false;
 
 	init_decode_cache(ctxt);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/7] KVM: x86: Check guest debug in DR access instruction emulation
  2025-09-10  2:49 [PATCH 0/7] KVM: x86: Improve the handling of debug exceptions during instruction emulation Hou Wenlong
  2025-09-10  2:49 ` [PATCH 1/7] KVM: x86: Set guest DR6 by kvm_queue_exception_p() in " Hou Wenlong
@ 2025-09-10  2:49 ` Hou Wenlong
  2025-12-05 17:51   ` Sean Christopherson
  2025-09-10  2:49 ` [PATCH 3/7] KVM: x86: Only check effective code breakpoint in emulation Hou Wenlong
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Hou Wenlong @ 2025-09-10  2:49 UTC (permalink / raw)
  To: kvm
  Cc: Lai Jiangshan, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

When a DR access instruction is emulated by the x86 instruction
emulator, only the guest DR7.GD is checked. Since the instruction
emulation path has already performed some guest debug checks, add a
guest debug check in the DR access instruction emulation to improve the
guest debug logic in the instruction emulator.

Suggested-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/emulate.c     |  2 +-
 arch/x86/kvm/kvm_emulate.h |  1 +
 arch/x86/kvm/x86.c         | 51 +++++++++++++++++++++++++++++++++-----
 arch/x86/kvm/x86.h         |  7 ++++++
 4 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 18e3a732d106..87d98ffd7d2d 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3858,7 +3858,7 @@ static int check_dr_read(struct x86_emulate_ctxt *ctxt)
 	if ((cr4 & X86_CR4_DE) && (dr == 4 || dr == 5))
 		return emulate_ud(ctxt);
 
-	if (ctxt->ops->get_dr(ctxt, 7) & DR7_GD)
+	if (ctxt->ops->get_eff_dr(ctxt, 7) & DR7_GD)
 		return emulate_db(ctxt, DR6_BD);
 
 	return X86EMUL_CONTINUE;
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index a6fad7b938e3..b971b2947094 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -216,6 +216,7 @@ struct x86_emulate_ops {
 	int (*set_cr)(struct x86_emulate_ctxt *ctxt, int cr, ulong val);
 	int (*cpl)(struct x86_emulate_ctxt *ctxt);
 	ulong (*get_dr)(struct x86_emulate_ctxt *ctxt, int dr);
+	ulong (*get_eff_dr)(struct x86_emulate_ctxt *ctxt, int dr);
 	int (*set_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong value);
 	int (*set_msr_with_filter)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 data);
 	int (*get_msr_with_filter)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 *pdata);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b2e8322aeca7..cf289d04b104 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1571,6 +1571,22 @@ unsigned long kvm_get_dr(struct kvm_vcpu *vcpu, int dr)
 }
 EXPORT_SYMBOL_GPL(kvm_get_dr);
 
+static unsigned long kvm_get_eff_dr(struct kvm_vcpu *vcpu, int dr)
+{
+	size_t size = ARRAY_SIZE(vcpu->arch.eff_db);
+
+	switch (dr) {
+	case 0 ... 3:
+		return vcpu->arch.eff_db[array_index_nospec(dr, size)];
+	case 4:
+	case 6:
+		return vcpu->arch.dr6;
+	case 5:
+	default: /* 7 */
+		return kvm_get_eff_dr7(vcpu);
+	}
+}
+
 int kvm_emulate_rdpmc(struct kvm_vcpu *vcpu)
 {
 	u32 pmc = kvm_rcx_read(vcpu);
@@ -8207,6 +8223,11 @@ static unsigned long emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr)
 	return kvm_get_dr(emul_to_vcpu(ctxt), dr);
 }
 
+static unsigned long emulator_get_eff_dr(struct x86_emulate_ctxt *ctxt, int dr)
+{
+	return kvm_get_eff_dr(emul_to_vcpu(ctxt), dr);
+}
+
 static int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr,
 			   unsigned long value)
 {
@@ -8563,6 +8584,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.set_cr              = emulator_set_cr,
 	.cpl                 = emulator_get_cpl,
 	.get_dr              = emulator_get_dr,
+	.get_eff_dr          = emulator_get_eff_dr,
 	.set_dr              = emulator_set_dr,
 	.set_msr_with_filter = emulator_set_msr_with_filter,
 	.get_msr_with_filter = emulator_get_msr_with_filter,
@@ -8606,19 +8628,38 @@ static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
 	}
 }
 
-static void inject_emulated_exception(struct kvm_vcpu *vcpu)
+static int kvm_inject_emulated_db(struct kvm_vcpu *vcpu, unsigned long dr6)
+{
+	struct kvm_run *kvm_run = vcpu->run;
+
+	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) {
+		kvm_run->debug.arch.dr6 = dr6 | DR6_ACTIVE_LOW;
+		kvm_run->debug.arch.pc = kvm_get_linear_rip(vcpu);
+		kvm_run->debug.arch.exception = DB_VECTOR;
+		kvm_run->exit_reason = KVM_EXIT_DEBUG;
+		return 0;
+	}
+
+	kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);
+	return 1;
+}
+
+static int inject_emulated_exception(struct kvm_vcpu *vcpu)
 {
+	int r = 1;
 	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
 
 	if (ctxt->exception.vector == PF_VECTOR)
 		kvm_inject_emulated_page_fault(vcpu, &ctxt->exception);
 	else if (ctxt->exception.vector == DB_VECTOR)
-		kvm_queue_exception_p(vcpu, DB_VECTOR, ctxt->exception.dr6);
+		r = kvm_inject_emulated_db(vcpu, ctxt->exception.dr6);
 	else if (ctxt->exception.error_code_valid)
 		kvm_queue_exception_e(vcpu, ctxt->exception.vector,
 				      ctxt->exception.error_code);
 	else
 		kvm_queue_exception(vcpu, ctxt->exception.vector);
+
+	return r;
 }
 
 static struct x86_emulate_ctxt *alloc_emulate_ctxt(struct kvm_vcpu *vcpu)
@@ -9098,8 +9139,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 				 */
 				WARN_ON_ONCE(ctxt->exception.vector == UD_VECTOR ||
 					     exception_type(ctxt->exception.vector) == EXCPT_TRAP);
-				inject_emulated_exception(vcpu);
-				return 1;
+				return inject_emulated_exception(vcpu);
 			}
 			return handle_emulation_failure(vcpu, emulation_type);
 		}
@@ -9190,8 +9230,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	if (ctxt->have_exception) {
 		WARN_ON_ONCE(vcpu->mmio_needed && !vcpu->mmio_is_write);
 		vcpu->mmio_needed = false;
-		r = 1;
-		inject_emulated_exception(vcpu);
+		r = inject_emulated_exception(vcpu);
 	} else if (vcpu->arch.pio.count) {
 		if (!vcpu->arch.pio.in) {
 			/* FIXME: return into emulator if single-stepping.  */
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index eb3088684e8a..7ad3b9645ea3 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -593,6 +593,13 @@ static inline bool kvm_dr6_valid(u64 data)
 	return !(data >> 32);
 }
 
+static inline unsigned long kvm_get_eff_dr7(struct kvm_vcpu *vcpu)
+{
+	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP)
+		return vcpu->arch.guest_debug_dr7;
+	return vcpu->arch.dr7;
+}
+
 /*
  * Trigger machine check on the host. We assume all the MSRs are already set up
  * by the CPU and that we still run on the same CPU as the MCE occurred on.
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/7] KVM: x86: Only check effective code breakpoint in emulation
  2025-09-10  2:49 [PATCH 0/7] KVM: x86: Improve the handling of debug exceptions during instruction emulation Hou Wenlong
  2025-09-10  2:49 ` [PATCH 1/7] KVM: x86: Set guest DR6 by kvm_queue_exception_p() in " Hou Wenlong
  2025-09-10  2:49 ` [PATCH 2/7] KVM: x86: Check guest debug in DR access " Hou Wenlong
@ 2025-09-10  2:49 ` Hou Wenlong
  2025-09-10  2:49 ` [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db() Hou Wenlong
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Hou Wenlong @ 2025-09-10  2:49 UTC (permalink / raw)
  To: kvm
  Cc: Lai Jiangshan, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

When guest debug is enabled, the effective breakpoints are controlled by
guest debug rather than by the guest itself.  Therefore, only check the
code breakpoints of guest debug in emulation if guest debug is enabled,
in order to maintain consistency with hardware behavior.

Fixes: 4a1e10d5b5d8 ("KVM: x86: handle hardware breakpoints during emulation")
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/x86.c | 35 ++++++++++-------------------------
 1 file changed, 10 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cf289d04b104..5af652916a19 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8947,6 +8947,9 @@ EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction);
 
 static bool kvm_is_code_breakpoint_inhibited(struct kvm_vcpu *vcpu)
 {
+	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP)
+		return false;
+
 	if (kvm_get_rflags(vcpu) & X86_EFLAGS_RF)
 		return true;
 
@@ -8963,6 +8966,8 @@ static bool kvm_is_code_breakpoint_inhibited(struct kvm_vcpu *vcpu)
 static bool kvm_vcpu_check_code_breakpoint(struct kvm_vcpu *vcpu,
 					   int emulation_type, int *r)
 {
+	unsigned long dr7 = kvm_get_eff_dr7(vcpu);
+
 	WARN_ON_ONCE(emulation_type & EMULTYPE_NO_DECODE);
 
 	/*
@@ -8983,34 +8988,14 @@ static bool kvm_vcpu_check_code_breakpoint(struct kvm_vcpu *vcpu,
 			      EMULTYPE_TRAP_UD | EMULTYPE_VMWARE_GP | EMULTYPE_PF))
 		return false;
 
-	if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) &&
-	    (vcpu->arch.guest_debug_dr7 & DR7_BP_EN_MASK)) {
-		struct kvm_run *kvm_run = vcpu->run;
-		unsigned long eip = kvm_get_linear_rip(vcpu);
-		u32 dr6 = kvm_vcpu_check_hw_bp(eip, 0,
-					   vcpu->arch.guest_debug_dr7,
-					   vcpu->arch.eff_db);
-
-		if (dr6 != 0) {
-			kvm_run->debug.arch.dr6 = dr6 | DR6_ACTIVE_LOW;
-			kvm_run->debug.arch.pc = eip;
-			kvm_run->debug.arch.exception = DB_VECTOR;
-			kvm_run->exit_reason = KVM_EXIT_DEBUG;
-			*r = 0;
-			return true;
-		}
-	}
-
-	if (unlikely(vcpu->arch.dr7 & DR7_BP_EN_MASK) &&
+	if (unlikely(dr7 & DR7_BP_EN_MASK) &&
 	    !kvm_is_code_breakpoint_inhibited(vcpu)) {
 		unsigned long eip = kvm_get_linear_rip(vcpu);
-		u32 dr6 = kvm_vcpu_check_hw_bp(eip, 0,
-					   vcpu->arch.dr7,
-					   vcpu->arch.db);
+		u32 dr6 = kvm_vcpu_check_hw_bp(eip, 0, dr7,
+					       vcpu->arch.eff_db);
 
-		if (dr6 != 0) {
-			kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);
-			*r = 1;
+		if (dr6) {
+			*r = kvm_inject_emulated_db(vcpu, dr6);
 			return true;
 		}
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db()
  2025-09-10  2:49 [PATCH 0/7] KVM: x86: Improve the handling of debug exceptions during instruction emulation Hou Wenlong
                   ` (2 preceding siblings ...)
  2025-09-10  2:49 ` [PATCH 3/7] KVM: x86: Only check effective code breakpoint in emulation Hou Wenlong
@ 2025-09-10  2:49 ` Hou Wenlong
  2025-12-05 17:58   ` Sean Christopherson
  2025-09-10  2:49 ` [PATCH 5/7] KVM: VMX: Set 'BS' bit in pending debug exceptions during instruction emulation Hou Wenlong
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Hou Wenlong @ 2025-09-10  2:49 UTC (permalink / raw)
  To: kvm
  Cc: Lai Jiangshan, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

Use kvm_inject_emulated_db() in kvm_vcpu_do_singlestep() to consolidate
'KVM_GUESTDBG_SINGLESTEP' check into kvm_inject_emulated_db() during
emulation.

No functional change intended.

Suggested-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/x86.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5af652916a19..83960214d5d8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8632,7 +8632,10 @@ static int kvm_inject_emulated_db(struct kvm_vcpu *vcpu, unsigned long dr6)
 {
 	struct kvm_run *kvm_run = vcpu->run;
 
-	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) {
+	/* Data breakpoints are not supported in emulation for now. */
+	WARN_ON((dr6 & DR6_BS) && (dr6 & DR_TRAP_BITS));
+
+	if (vcpu->guest_debug & (KVM_GUESTDBG_USE_HW_BP | KVM_GUESTDBG_SINGLESTEP)) {
 		kvm_run->debug.arch.dr6 = dr6 | DR6_ACTIVE_LOW;
 		kvm_run->debug.arch.pc = kvm_get_linear_rip(vcpu);
 		kvm_run->debug.arch.exception = DB_VECTOR;
@@ -8907,17 +8910,7 @@ static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7,
 
 static int kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu)
 {
-	struct kvm_run *kvm_run = vcpu->run;
-
-	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
-		kvm_run->debug.arch.dr6 = DR6_BS | DR6_ACTIVE_LOW;
-		kvm_run->debug.arch.pc = kvm_get_linear_rip(vcpu);
-		kvm_run->debug.arch.exception = DB_VECTOR;
-		kvm_run->exit_reason = KVM_EXIT_DEBUG;
-		return 0;
-	}
-	kvm_queue_exception_p(vcpu, DB_VECTOR, DR6_BS);
-	return 1;
+	return kvm_inject_emulated_db(vcpu, DR6_BS);
 }
 
 int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/7] KVM: VMX: Set 'BS' bit in pending debug exceptions during instruction emulation
  2025-09-10  2:49 [PATCH 0/7] KVM: x86: Improve the handling of debug exceptions during instruction emulation Hou Wenlong
                   ` (3 preceding siblings ...)
  2025-09-10  2:49 ` [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db() Hou Wenlong
@ 2025-09-10  2:49 ` Hou Wenlong
  2025-12-05 18:20   ` Sean Christopherson
  2025-09-10  2:49 ` [PATCH 6/7] KVM: selftests: Verify guest debug DR7.GD checking " Hou Wenlong
  2025-09-10  2:49 ` [PATCH 7/7] KVM: selftests: Verify 'BS' bit checking in pending debug exception during VM entry Hou Wenlong
  6 siblings, 1 reply; 20+ messages in thread
From: Hou Wenlong @ 2025-09-10  2:49 UTC (permalink / raw)
  To: kvm
  Cc: Lai Jiangshan, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

If 'STI' or 'MOV SS' with 'X86_EFLAGS_TF' set is emulated by the
emulator (e.g., using the 'force emulation' prefix), the check for
pending debug exceptions during VM entry would fail, as #UD clears the
pending debug exceptions. Therefore, set the 'BS' bit in such situations
to make instruction emulation more robust.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/vmx/main.c            |  9 +++++++++
 arch/x86/kvm/vmx/vmx.c             | 14 +++++++++-----
 arch/x86/kvm/vmx/x86_ops.h         |  1 +
 arch/x86/kvm/x86.c                 |  7 +++++--
 6 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 18a5c3119e1a..3a0ab1683f17 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -50,6 +50,7 @@ KVM_X86_OP(get_gdt)
 KVM_X86_OP(set_gdt)
 KVM_X86_OP(sync_dirty_debug_regs)
 KVM_X86_OP(set_dr7)
+KVM_X86_OP_OPTIONAL(set_pending_dbg)
 KVM_X86_OP(cache_reg)
 KVM_X86_OP(get_rflags)
 KVM_X86_OP(set_rflags)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0d3cc0fc27af..a36ca751ee2e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1765,6 +1765,7 @@ struct kvm_x86_ops {
 	void (*set_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
 	void (*sync_dirty_debug_regs)(struct kvm_vcpu *vcpu);
 	void (*set_dr7)(struct kvm_vcpu *vcpu, unsigned long value);
+	void (*set_pending_dbg)(struct kvm_vcpu *vcpu);
 	void (*cache_reg)(struct kvm_vcpu *vcpu, enum kvm_reg reg);
 	unsigned long (*get_rflags)(struct kvm_vcpu *vcpu);
 	void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index dbab1c15b0cd..23adff73f90b 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -465,6 +465,14 @@ static void vt_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
 	vmx_set_dr7(vcpu, val);
 }
 
+static void vt_set_pending_dbg(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_set_pending_dbg(vcpu);
+}
+
 static void vt_sync_dirty_debug_regs(struct kvm_vcpu *vcpu)
 {
 	/*
@@ -906,6 +914,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.get_gdt = vt_op(get_gdt),
 	.set_gdt = vt_op(set_gdt),
 	.set_dr7 = vt_op(set_dr7),
+	.set_pending_dbg = vt_op(set_pending_dbg),
 	.sync_dirty_debug_regs = vt_op(sync_dirty_debug_regs),
 	.cache_reg = vt_op(cache_reg),
 	.get_rflags = vt_op(get_rflags),
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 227b45430ad8..e861a0edb3f4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5243,11 +5243,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 			 */
 			if (is_icebp(intr_info))
 				WARN_ON(!skip_emulated_instruction(vcpu));
-			else if ((vmx_get_rflags(vcpu) & X86_EFLAGS_TF) &&
-				 (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
-				  (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS)))
-				vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
-					    vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS) | DR6_BS);
+			vmx_set_pending_dbg(vcpu);
 
 			kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);
 			return 1;
@@ -5554,6 +5550,14 @@ void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
 	vmcs_writel(GUEST_DR7, val);
 }
 
+void vmx_set_pending_dbg(struct kvm_vcpu *vcpu)
+{
+	if ((vmx_get_rflags(vcpu) & X86_EFLAGS_TF) &&
+	    vmx_get_interrupt_shadow(vcpu))
+		vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
+			    vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS) | DR6_BS);
+}
+
 static int handle_tpr_below_threshold(struct kvm_vcpu *vcpu)
 {
 	kvm_apic_update_ppr(vcpu);
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 2b3424f638db..2913648cfe4f 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -75,6 +75,7 @@ void vmx_get_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
 void vmx_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
 void vmx_set_dr6(struct kvm_vcpu *vcpu, unsigned long val);
 void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val);
+void vmx_set_pending_dbg(struct kvm_vcpu *vcpu);
 void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu);
 void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg);
 unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 83960214d5d8..464e9649cb54 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9250,10 +9250,13 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 			if (ctxt->is_branch)
 				kvm_pmu_branch_retired(vcpu);
 			kvm_rip_write(vcpu, ctxt->eip);
-			if (r && (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)))
+			__kvm_set_rflags(vcpu, ctxt->eflags);
+			if (r && (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP))) {
 				r = kvm_vcpu_do_singlestep(vcpu);
+				if (r)
+					kvm_x86_call(set_pending_dbg)(vcpu);
+			}
 			kvm_x86_call(update_emulated_instruction)(vcpu);
-			__kvm_set_rflags(vcpu, ctxt->eflags);
 		}
 
 		/*
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 6/7] KVM: selftests: Verify guest debug DR7.GD checking during instruction emulation
  2025-09-10  2:49 [PATCH 0/7] KVM: x86: Improve the handling of debug exceptions during instruction emulation Hou Wenlong
                   ` (4 preceding siblings ...)
  2025-09-10  2:49 ` [PATCH 5/7] KVM: VMX: Set 'BS' bit in pending debug exceptions during instruction emulation Hou Wenlong
@ 2025-09-10  2:49 ` Hou Wenlong
  2025-12-05 18:21   ` Sean Christopherson
  2025-09-10  2:49 ` [PATCH 7/7] KVM: selftests: Verify 'BS' bit checking in pending debug exception during VM entry Hou Wenlong
  6 siblings, 1 reply; 20+ messages in thread
From: Hou Wenlong @ 2025-09-10  2:49 UTC (permalink / raw)
  To: kvm
  Cc: Lai Jiangshan, Sean Christopherson, Paolo Bonzini, Shuah Khan,
	linux-kselftest, linux-kernel

Similar to the global disable test case in x86's debug_regs test, use
'KVM_FEP' to trigger instruction emulation in order to verify the guest
debug DR7.GD checking during instruction emulation.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 tools/testing/selftests/kvm/x86/debug_regs.c | 25 +++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/x86/debug_regs.c b/tools/testing/selftests/kvm/x86/debug_regs.c
index 2d814c1d1dc4..ba80b77c2869 100644
--- a/tools/testing/selftests/kvm/x86/debug_regs.c
+++ b/tools/testing/selftests/kvm/x86/debug_regs.c
@@ -19,6 +19,7 @@
 uint32_t guest_value;
 
 extern unsigned char sw_bp, hw_bp, write_data, ss_start, bd_start;
+extern unsigned char fep_bd_start;
 
 static void guest_code(void)
 {
@@ -64,6 +65,12 @@ static void guest_code(void)
 
 	/* DR6.BD test */
 	asm volatile("bd_start: mov %%dr0, %%rax" : : : "rax");
+
+	if (is_forced_emulation_enabled) {
+		/* DR6.BD test for emulation */
+		asm volatile(KVM_FEP "fep_bd_start: mov %%dr0, %%rax" : : : "rax");
+	}
+
 	GUEST_DONE();
 }
 
@@ -185,7 +192,7 @@ int main(void)
 			    target_dr6);
 	}
 
-	/* Finally test global disable */
+	/* test global disable */
 	memset(&debug, 0, sizeof(debug));
 	debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP;
 	debug.arch.debugreg[7] = 0x400 | DR7_GD;
@@ -202,6 +209,22 @@ int main(void)
 			    run->debug.arch.pc, target_rip, run->debug.arch.dr6,
 			    target_dr6);
 
+	/* test global disable in emulation */
+	if (is_forced_emulation_enabled) {
+		/* Skip the 3-bytes "mov dr0" */
+		vcpu_skip_insn(vcpu, 3);
+		vcpu_run(vcpu);
+		TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG &&
+			    run->debug.arch.exception == DB_VECTOR &&
+			    run->debug.arch.pc == CAST_TO_RIP(fep_bd_start) &&
+			    run->debug.arch.dr6 == target_dr6,
+			    "DR7.GD: exit %d exception %d rip 0x%llx "
+			    "(should be 0x%llx) dr6 0x%llx (should be 0x%llx)",
+			    run->exit_reason, run->debug.arch.exception,
+			    run->debug.arch.pc, target_rip, run->debug.arch.dr6,
+			    target_dr6);
+	}
+
 	/* Disable all debug controls, run to the end */
 	memset(&debug, 0, sizeof(debug));
 	vcpu_guest_debug_set(vcpu, &debug);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 7/7] KVM: selftests: Verify 'BS' bit checking in pending debug exception during VM entry
  2025-09-10  2:49 [PATCH 0/7] KVM: x86: Improve the handling of debug exceptions during instruction emulation Hou Wenlong
                   ` (5 preceding siblings ...)
  2025-09-10  2:49 ` [PATCH 6/7] KVM: selftests: Verify guest debug DR7.GD checking " Hou Wenlong
@ 2025-09-10  2:49 ` Hou Wenlong
  2025-12-05 18:23   ` Sean Christopherson
  6 siblings, 1 reply; 20+ messages in thread
From: Hou Wenlong @ 2025-09-10  2:49 UTC (permalink / raw)
  To: kvm
  Cc: Lai Jiangshan, Sean Christopherson, Paolo Bonzini, Shuah Khan,
	linux-kselftest, linux-kernel

In the x86's debug_regs test, add a test case to cover the scenario where
single-step with STI in VMX sets the 'BS' bit in pending debug
exceptions for #DB interception and instruction emulation in both cases.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 .../selftests/kvm/include/x86/processor.h     |  3 +-
 tools/testing/selftests/kvm/x86/debug_regs.c  | 41 +++++++++++++++++--
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86/processor.h b/tools/testing/selftests/kvm/include/x86/processor.h
index 488d516c4f6f..f5827cca813e 100644
--- a/tools/testing/selftests/kvm/include/x86/processor.h
+++ b/tools/testing/selftests/kvm/include/x86/processor.h
@@ -34,7 +34,8 @@ extern uint64_t guest_tsc_khz;
 
 #define NMI_VECTOR		0x02
 
-#define X86_EFLAGS_FIXED	 (1u << 1)
+#define X86_EFLAGS_FIXED	(1u << 1)
+#define X86_EFLAGS_TF		(1u << 8)
 
 #define X86_CR4_VME		(1ul << 0)
 #define X86_CR4_PVI		(1ul << 1)
diff --git a/tools/testing/selftests/kvm/x86/debug_regs.c b/tools/testing/selftests/kvm/x86/debug_regs.c
index ba80b77c2869..60dea0116b21 100644
--- a/tools/testing/selftests/kvm/x86/debug_regs.c
+++ b/tools/testing/selftests/kvm/x86/debug_regs.c
@@ -15,11 +15,31 @@
 
 #define IRQ_VECTOR 0xAA
 
+#define  CAST_TO_RIP(v)  ((unsigned long long)&(v))
+
 /* For testing data access debug BP */
 uint32_t guest_value;
 
 extern unsigned char sw_bp, hw_bp, write_data, ss_start, bd_start;
-extern unsigned char fep_bd_start;
+extern unsigned char fep_bd_start, fep_sti_start, fep_sti_end;
+
+static void guest_db_handler(struct ex_regs *regs)
+{
+	static int count;
+	unsigned long target_rips[2] = {
+		CAST_TO_RIP(fep_sti_start),
+		CAST_TO_RIP(fep_sti_end),
+	};
+
+	__GUEST_ASSERT(regs->rip == target_rips[count], "STI: unexpected rip 0x%lx (should be 0x%lx)",
+		       regs->rip, target_rips[count]);
+	regs->rflags &= ~X86_EFLAGS_TF;
+	count++;
+}
+
+static void guest_irq_handler(struct ex_regs *regs)
+{
+}
 
 static void guest_code(void)
 {
@@ -69,13 +89,25 @@ static void guest_code(void)
 	if (is_forced_emulation_enabled) {
 		/* DR6.BD test for emulation */
 		asm volatile(KVM_FEP "fep_bd_start: mov %%dr0, %%rax" : : : "rax");
+
+		/* pending debug exceptions for emulation */
+		asm volatile("pushf\n\t"
+			     "orq $" __stringify(X86_EFLAGS_TF) ", (%rsp)\n\t"
+			     "popf\n\t"
+			     "sti\n\t"
+			     "fep_sti_start:"
+			     "cli\n\t"
+			     "pushf\n\t"
+			     "orq $" __stringify(X86_EFLAGS_TF) ", (%rsp)\n\t"
+			     "popf\n\t"
+			     KVM_FEP "sti\n\t"
+			     "fep_sti_end:"
+			     "cli\n\t");
 	}
 
 	GUEST_DONE();
 }
 
-#define  CAST_TO_RIP(v)  ((unsigned long long)&(v))
-
 static void vcpu_skip_insn(struct kvm_vcpu *vcpu, int insn_len)
 {
 	struct kvm_regs regs;
@@ -110,6 +142,9 @@ int main(void)
 	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
 	run = vcpu->run;
 
+	vm_install_exception_handler(vm, DB_VECTOR, guest_db_handler);
+	vm_install_exception_handler(vm, IRQ_VECTOR, guest_irq_handler);
+
 	/* Test software BPs - int3 */
 	memset(&debug, 0, sizeof(debug));
 	debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] KVM: x86: Check guest debug in DR access instruction emulation
  2025-09-10  2:49 ` [PATCH 2/7] KVM: x86: Check guest debug in DR access " Hou Wenlong
@ 2025-12-05 17:51   ` Sean Christopherson
  0 siblings, 0 replies; 20+ messages in thread
From: Sean Christopherson @ 2025-12-05 17:51 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Wed, Sep 10, 2025, Hou Wenlong wrote:
> @@ -8606,19 +8628,38 @@ static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
>  	}
>  }
>  
> -static void inject_emulated_exception(struct kvm_vcpu *vcpu)
> +static int kvm_inject_emulated_db(struct kvm_vcpu *vcpu, unsigned long dr6)
> +{
> +	struct kvm_run *kvm_run = vcpu->run;
> +
> +	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) {
> +		kvm_run->debug.arch.dr6 = dr6 | DR6_ACTIVE_LOW;
> +		kvm_run->debug.arch.pc = kvm_get_linear_rip(vcpu);
> +		kvm_run->debug.arch.exception = DB_VECTOR;
> +		kvm_run->exit_reason = KVM_EXIT_DEBUG;
> +		return 0;
> +	}
> +
> +	kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);
> +	return 1;
> +}
> +
> +static int inject_emulated_exception(struct kvm_vcpu *vcpu)
>  {
> +	int r = 1;
>  	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
>  
>  	if (ctxt->exception.vector == PF_VECTOR)
>  		kvm_inject_emulated_page_fault(vcpu, &ctxt->exception);
>  	else if (ctxt->exception.vector == DB_VECTOR)
> -		kvm_queue_exception_p(vcpu, DB_VECTOR, ctxt->exception.dr6);
> +		r = kvm_inject_emulated_db(vcpu, ctxt->exception.dr6);
>  	else if (ctxt->exception.error_code_valid)
>  		kvm_queue_exception_e(vcpu, ctxt->exception.vector,
>  				      ctxt->exception.error_code);
>  	else
>  		kvm_queue_exception(vcpu, ctxt->exception.vector);
> +
> +	return r;

Hmm, I think I'd rather make the DB_VECTOR case an early termination, and keep
the rest largely as-is.  And while you're modifying this code, maybe add a patch
to capture "struct x86_exception" locally instead of the context?  E.g. to end
up with:

static int inject_emulated_exception(struct kvm_vcpu *vcpu)
{
	struct x86_exception *ex = &vcpu->arch.emulate_ctxt->exception;

	if (ex->vector == DB_VECTOR)
		return kvm_inject_emulated_db(vcpu, ex->dr6);

	if (ex->vector == PF_VECTOR)
		kvm_inject_emulated_page_fault(vcpu, ex);
	else if (ex->error_code_valid)
		kvm_queue_exception_e(vcpu, ex->vector, ex->error_code);
	else
		kvm_queue_exception(vcpu, ex->vector);
	return 1;
}

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db()
  2025-09-10  2:49 ` [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db() Hou Wenlong
@ 2025-12-05 17:58   ` Sean Christopherson
  2025-12-11 14:05     ` Hou Wenlong
  0 siblings, 1 reply; 20+ messages in thread
From: Sean Christopherson @ 2025-12-05 17:58 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Wed, Sep 10, 2025, Hou Wenlong wrote:
> Use kvm_inject_emulated_db() in kvm_vcpu_do_singlestep() to consolidate
> 'KVM_GUESTDBG_SINGLESTEP' check into kvm_inject_emulated_db() during
> emulation.
> 
> No functional change intended.
> 
> Suggested-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  arch/x86/kvm/x86.c | 17 +++++------------
>  1 file changed, 5 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5af652916a19..83960214d5d8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8632,7 +8632,10 @@ static int kvm_inject_emulated_db(struct kvm_vcpu *vcpu, unsigned long dr6)
>  {
>  	struct kvm_run *kvm_run = vcpu->run;
>  
> -	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) {
> +	/* Data breakpoints are not supported in emulation for now. */
> +	WARN_ON((dr6 & DR6_BS) && (dr6 & DR_TRAP_BITS));

If we keep this, it should be a WARN_ON_ONCE().  We've had at least one case where
a sanity check in the emulator caused major problems because a WARN_ON() spammed
the kernel log to the point where it overloaded things :-)

But I think the WARN will be subject to false positives.  KVM doesn't emulate data
#DBs, but it does emulate code #DBs, and fault-like code #DBs can be coincident
with trap-like single-step #DBs.  Ah, but kvm_vcpu_check_code_breakpoint() doesn't
account for RFLAGS.TF.  That should probably be addressed in this series, especially
since it's consolidating KVM_GUESTDBG_SINGLESTEP handling.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/7] KVM: VMX: Set 'BS' bit in pending debug exceptions during instruction emulation
  2025-09-10  2:49 ` [PATCH 5/7] KVM: VMX: Set 'BS' bit in pending debug exceptions during instruction emulation Hou Wenlong
@ 2025-12-05 18:20   ` Sean Christopherson
  2025-12-11 14:01     ` Hou Wenlong
  0 siblings, 1 reply; 20+ messages in thread
From: Sean Christopherson @ 2025-12-05 18:20 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Wed, Sep 10, 2025, Hou Wenlong wrote:
> If 'STI' or 'MOV SS' with 'X86_EFLAGS_TF' set is emulated by the
> emulator (e.g., using the 'force emulation' prefix), the check for
> pending debug exceptions during VM entry would fail,

s/fail/VM-Fail, and please elaborate on what exactly fails.  I've had a lot (too
much) of exposure to the consistency check, but I still have to look up the details
every time.

> as #UD clears the pending debug exceptions. Therefore, set the 'BS' bit in
> such situations to make instruction emulation more robust.
> 
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  1 +
>  arch/x86/include/asm/kvm_host.h    |  1 +
>  arch/x86/kvm/vmx/main.c            |  9 +++++++++
>  arch/x86/kvm/vmx/vmx.c             | 14 +++++++++-----
>  arch/x86/kvm/vmx/x86_ops.h         |  1 +
>  arch/x86/kvm/x86.c                 |  7 +++++--
>  6 files changed, 26 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 18a5c3119e1a..3a0ab1683f17 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -50,6 +50,7 @@ KVM_X86_OP(get_gdt)
>  KVM_X86_OP(set_gdt)
>  KVM_X86_OP(sync_dirty_debug_regs)
>  KVM_X86_OP(set_dr7)
> +KVM_X86_OP_OPTIONAL(set_pending_dbg)
>  KVM_X86_OP(cache_reg)
>  KVM_X86_OP(get_rflags)
>  KVM_X86_OP(set_rflags)
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 0d3cc0fc27af..a36ca751ee2e 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1765,6 +1765,7 @@ struct kvm_x86_ops {
>  	void (*set_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
>  	void (*sync_dirty_debug_regs)(struct kvm_vcpu *vcpu);
>  	void (*set_dr7)(struct kvm_vcpu *vcpu, unsigned long value);
> +	void (*set_pending_dbg)(struct kvm_vcpu *vcpu);
>  	void (*cache_reg)(struct kvm_vcpu *vcpu, enum kvm_reg reg);
>  	unsigned long (*get_rflags)(struct kvm_vcpu *vcpu);
>  	void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags);
> diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
> index dbab1c15b0cd..23adff73f90b 100644
> --- a/arch/x86/kvm/vmx/main.c
> +++ b/arch/x86/kvm/vmx/main.c
> @@ -465,6 +465,14 @@ static void vt_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
>  	vmx_set_dr7(vcpu, val);
>  }
>  
> +static void vt_set_pending_dbg(struct kvm_vcpu *vcpu)
> +{
> +	if (is_td_vcpu(vcpu))

WARN_ON_ONCE()?

> +		return;
> +
> +	vmx_set_pending_dbg(vcpu);
> +}
> +
>  static void vt_sync_dirty_debug_regs(struct kvm_vcpu *vcpu)
>  {
>  	/*
> @@ -906,6 +914,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
>  	.get_gdt = vt_op(get_gdt),
>  	.set_gdt = vt_op(set_gdt),
>  	.set_dr7 = vt_op(set_dr7),
> +	.set_pending_dbg = vt_op(set_pending_dbg),
>  	.sync_dirty_debug_regs = vt_op(sync_dirty_debug_regs),
>  	.cache_reg = vt_op(cache_reg),
>  	.get_rflags = vt_op(get_rflags),
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 227b45430ad8..e861a0edb3f4 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5243,11 +5243,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
>  			 */
>  			if (is_icebp(intr_info))
>  				WARN_ON(!skip_emulated_instruction(vcpu));
> -			else if ((vmx_get_rflags(vcpu) & X86_EFLAGS_TF) &&
> -				 (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
> -				  (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS)))
> -				vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
> -					    vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS) | DR6_BS);
> +			vmx_set_pending_dbg(vcpu);

This looks wrong.  Per Table 19-2. Debug Exception Conditions, INT1 doesn't
set DR6.BS.  Oooh, the helper is _conditionally_ setting DR6_BS.  But that still
_looks_ wrong, and it makes the comment confusing.

>  
>  			kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);
>  			return 1;
> @@ -5554,6 +5550,14 @@ void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
>  	vmcs_writel(GUEST_DR7, val);
>  }
>  
> +void vmx_set_pending_dbg(struct kvm_vcpu *vcpu)

Related to above, this is a confusing name.  In no small part because of the rather
insane complexity related to pending debug exceptions being visible to software
via the VMCS.  E.g. I initially read this as "set a pending #DB", not "set the
VMCS field with the same name based on RFLAGS.TF and whether or not the vCPU is
in an interrupt shadow".

Maybe refresh_pending_dbg_excpetions()?  And then the above case would be:

			if (is_icebp(intr_info))
				WARN_ON(!skip_emulated_instruction(vcpu));
			else
				vmx_refresh_pending_dbg_exceptions(vcpu);

> +{
> +	if ((vmx_get_rflags(vcpu) & X86_EFLAGS_TF) &&
> +	    vmx_get_interrupt_shadow(vcpu))
> +		vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
> +			    vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS) | DR6_BS);
> +}
> +
>  static int handle_tpr_below_threshold(struct kvm_vcpu *vcpu)
>  {
>  	kvm_apic_update_ppr(vcpu);
> diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
> index 2b3424f638db..2913648cfe4f 100644
> --- a/arch/x86/kvm/vmx/x86_ops.h
> +++ b/arch/x86/kvm/vmx/x86_ops.h
> @@ -75,6 +75,7 @@ void vmx_get_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
>  void vmx_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
>  void vmx_set_dr6(struct kvm_vcpu *vcpu, unsigned long val);
>  void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val);
> +void vmx_set_pending_dbg(struct kvm_vcpu *vcpu);
>  void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu);
>  void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg);
>  unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 83960214d5d8..464e9649cb54 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9250,10 +9250,13 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  			if (ctxt->is_branch)
>  				kvm_pmu_branch_retired(vcpu);
>  			kvm_rip_write(vcpu, ctxt->eip);
> -			if (r && (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)))
> +			__kvm_set_rflags(vcpu, ctxt->eflags);
> +			if (r && (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP))) {
>  				r = kvm_vcpu_do_singlestep(vcpu);
> +				if (r)
> +					kvm_x86_call(set_pending_dbg)(vcpu);

Why not handle this in kvm_vcpu_do_singlestep()?  Ah, because the call from
kvm_skip_emulated_instruction() can never occur in an interrupt shadow.  But
that's a _really_ subtle detail, and more imporantly the call is benign in that
case.

So unless there's a good reason to do otherwise, I vote to move the call into
kvm_vcpu_do_singlestep().

> +			}
>  			kvm_x86_call(update_emulated_instruction)(vcpu);
> -			__kvm_set_rflags(vcpu, ctxt->eflags);

Please move the relocation of the call to __kvm_set_rflags() to its own patch.
I vaguely recall running into problems related to the state of RFLAGS in the
emulator versus those in the vCPU.  I don't _think_ there's a problem here, but
if there is, I want the change to be fully isolated.

>  		}
>  
>  		/*
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/7] KVM: selftests: Verify guest debug DR7.GD checking during instruction emulation
  2025-09-10  2:49 ` [PATCH 6/7] KVM: selftests: Verify guest debug DR7.GD checking " Hou Wenlong
@ 2025-12-05 18:21   ` Sean Christopherson
  0 siblings, 0 replies; 20+ messages in thread
From: Sean Christopherson @ 2025-12-05 18:21 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Shuah Khan, linux-kselftest,
	linux-kernel

On Wed, Sep 10, 2025, Hou Wenlong wrote:
> Similar to the global disable test case in x86's debug_regs test, use
> 'KVM_FEP' to trigger instruction emulation in order to verify the guest
> debug DR7.GD checking during instruction emulation.
> 
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  tools/testing/selftests/kvm/x86/debug_regs.c | 25 +++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/kvm/x86/debug_regs.c b/tools/testing/selftests/kvm/x86/debug_regs.c
> index 2d814c1d1dc4..ba80b77c2869 100644
> --- a/tools/testing/selftests/kvm/x86/debug_regs.c
> +++ b/tools/testing/selftests/kvm/x86/debug_regs.c
> @@ -19,6 +19,7 @@
>  uint32_t guest_value;
>  
>  extern unsigned char sw_bp, hw_bp, write_data, ss_start, bd_start;
> +extern unsigned char fep_bd_start;
>  
>  static void guest_code(void)
>  {
> @@ -64,6 +65,12 @@ static void guest_code(void)
>  
>  	/* DR6.BD test */
>  	asm volatile("bd_start: mov %%dr0, %%rax" : : : "rax");
> +
> +	if (is_forced_emulation_enabled) {
> +		/* DR6.BD test for emulation */

Put the comment above the if-statement, that way there's no need for curly braces.
Or just drop it entirely; unless the comments more verbose, I don't think it adds
much value.

> +		asm volatile(KVM_FEP "fep_bd_start: mov %%dr0, %%rax" : : : "rax");
> +	}
> +
>  	GUEST_DONE();
>  }
>  
> @@ -185,7 +192,7 @@ int main(void)
>  			    target_dr6);
>  	}
>  
> -	/* Finally test global disable */
> +	/* test global disable */
>  	memset(&debug, 0, sizeof(debug));
>  	debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP;
>  	debug.arch.debugreg[7] = 0x400 | DR7_GD;
> @@ -202,6 +209,22 @@ int main(void)
>  			    run->debug.arch.pc, target_rip, run->debug.arch.dr6,
>  			    target_dr6);
>  
> +	/* test global disable in emulation */
> +	if (is_forced_emulation_enabled) {
> +		/* Skip the 3-bytes "mov dr0" */
> +		vcpu_skip_insn(vcpu, 3);
> +		vcpu_run(vcpu);
> +		TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG &&
> +			    run->debug.arch.exception == DB_VECTOR &&
> +			    run->debug.arch.pc == CAST_TO_RIP(fep_bd_start) &&
> +			    run->debug.arch.dr6 == target_dr6,
> +			    "DR7.GD: exit %d exception %d rip 0x%llx "
> +			    "(should be 0x%llx) dr6 0x%llx (should be 0x%llx)",
> +			    run->exit_reason, run->debug.arch.exception,
> +			    run->debug.arch.pc, target_rip, run->debug.arch.dr6,
> +			    target_dr6);
> +	}
> +
>  	/* Disable all debug controls, run to the end */
>  	memset(&debug, 0, sizeof(debug));
>  	vcpu_guest_debug_set(vcpu, &debug);
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 7/7] KVM: selftests: Verify 'BS' bit checking in pending debug exception during VM entry
  2025-09-10  2:49 ` [PATCH 7/7] KVM: selftests: Verify 'BS' bit checking in pending debug exception during VM entry Hou Wenlong
@ 2025-12-05 18:23   ` Sean Christopherson
  2025-12-11 13:21     ` Hou Wenlong
  0 siblings, 1 reply; 20+ messages in thread
From: Sean Christopherson @ 2025-12-05 18:23 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Shuah Khan, linux-kselftest,
	linux-kernel

On Wed, Sep 10, 2025, Hou Wenlong wrote:
>  #define IRQ_VECTOR 0xAA
>  
> +#define  CAST_TO_RIP(v)  ((unsigned long long)&(v))
> +
>  /* For testing data access debug BP */
>  uint32_t guest_value;
>  
>  extern unsigned char sw_bp, hw_bp, write_data, ss_start, bd_start;
> -extern unsigned char fep_bd_start;
> +extern unsigned char fep_bd_start, fep_sti_start, fep_sti_end;
> +
> +static void guest_db_handler(struct ex_regs *regs)
> +{
> +	static int count;
> +	unsigned long target_rips[2] = {
> +		CAST_TO_RIP(fep_sti_start),
> +		CAST_TO_RIP(fep_sti_end),
> +	};
> +
> +	__GUEST_ASSERT(regs->rip == target_rips[count], "STI: unexpected rip 0x%lx (should be 0x%lx)",
> +		       regs->rip, target_rips[count]);
> +	regs->rflags &= ~X86_EFLAGS_TF;
> +	count++;
> +}
> +
> +static void guest_irq_handler(struct ex_regs *regs)
> +{
> +}
>  
>  static void guest_code(void)
>  {
> @@ -69,13 +89,25 @@ static void guest_code(void)
>  	if (is_forced_emulation_enabled) {
>  		/* DR6.BD test for emulation */
>  		asm volatile(KVM_FEP "fep_bd_start: mov %%dr0, %%rax" : : : "rax");
> +
> +		/* pending debug exceptions for emulation */
> +		asm volatile("pushf\n\t"
> +			     "orq $" __stringify(X86_EFLAGS_TF) ", (%rsp)\n\t"
> +			     "popf\n\t"
> +			     "sti\n\t"
> +			     "fep_sti_start:"
> +			     "cli\n\t"
> +			     "pushf\n\t"
> +			     "orq $" __stringify(X86_EFLAGS_TF) ", (%rsp)\n\t"
> +			     "popf\n\t"
> +			     KVM_FEP "sti\n\t"
> +			     "fep_sti_end:"
> +			     "cli\n\t");
>  	}
>  
>  	GUEST_DONE();
>  }
>  
> -#define  CAST_TO_RIP(v)  ((unsigned long long)&(v))
> -
>  static void vcpu_skip_insn(struct kvm_vcpu *vcpu, int insn_len)
>  {
>  	struct kvm_regs regs;
> @@ -110,6 +142,9 @@ int main(void)
>  	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
>  	run = vcpu->run;
>  
> +	vm_install_exception_handler(vm, DB_VECTOR, guest_db_handler);
> +	vm_install_exception_handler(vm, IRQ_VECTOR, guest_irq_handler);

But the IRQ should never be taken thanks to the CLI in the STI shadow.  I.e.
installing a dummy handler could mask failures, no?

> +
>  	/* Test software BPs - int3 */
>  	memset(&debug, 0, sizeof(debug));
>  	debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 7/7] KVM: selftests: Verify 'BS' bit checking in pending debug exception during VM entry
  2025-12-05 18:23   ` Sean Christopherson
@ 2025-12-11 13:21     ` Hou Wenlong
  0 siblings, 0 replies; 20+ messages in thread
From: Hou Wenlong @ 2025-12-11 13:21 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Shuah Khan, linux-kselftest,
	linux-kernel

On Fri, Dec 05, 2025 at 10:23:42AM -0800, Sean Christopherson wrote:
> On Wed, Sep 10, 2025, Hou Wenlong wrote:
> >  #define IRQ_VECTOR 0xAA
> >  
> > +#define  CAST_TO_RIP(v)  ((unsigned long long)&(v))
> > +
> >  /* For testing data access debug BP */
> >  uint32_t guest_value;
> >  
> >  extern unsigned char sw_bp, hw_bp, write_data, ss_start, bd_start;
> > -extern unsigned char fep_bd_start;
> > +extern unsigned char fep_bd_start, fep_sti_start, fep_sti_end;
> > +
> > +static void guest_db_handler(struct ex_regs *regs)
> > +{
> > +	static int count;
> > +	unsigned long target_rips[2] = {
> > +		CAST_TO_RIP(fep_sti_start),
> > +		CAST_TO_RIP(fep_sti_end),
> > +	};
> > +
> > +	__GUEST_ASSERT(regs->rip == target_rips[count], "STI: unexpected rip 0x%lx (should be 0x%lx)",
> > +		       regs->rip, target_rips[count]);
> > +	regs->rflags &= ~X86_EFLAGS_TF;
> > +	count++;
> > +}
> > +
> > +static void guest_irq_handler(struct ex_regs *regs)
> > +{
> > +}
> >  
> >  static void guest_code(void)
> >  {
> > @@ -69,13 +89,25 @@ static void guest_code(void)
> >  	if (is_forced_emulation_enabled) {
> >  		/* DR6.BD test for emulation */
> >  		asm volatile(KVM_FEP "fep_bd_start: mov %%dr0, %%rax" : : : "rax");
> > +
> > +		/* pending debug exceptions for emulation */
> > +		asm volatile("pushf\n\t"
> > +			     "orq $" __stringify(X86_EFLAGS_TF) ", (%rsp)\n\t"
> > +			     "popf\n\t"
> > +			     "sti\n\t"
> > +			     "fep_sti_start:"
> > +			     "cli\n\t"
> > +			     "pushf\n\t"
> > +			     "orq $" __stringify(X86_EFLAGS_TF) ", (%rsp)\n\t"
> > +			     "popf\n\t"
> > +			     KVM_FEP "sti\n\t"
> > +			     "fep_sti_end:"
> > +			     "cli\n\t");
> >  	}
> >  
> >  	GUEST_DONE();
> >  }
> >  
> > -#define  CAST_TO_RIP(v)  ((unsigned long long)&(v))
> > -
> >  static void vcpu_skip_insn(struct kvm_vcpu *vcpu, int insn_len)
> >  {
> >  	struct kvm_regs regs;
> > @@ -110,6 +142,9 @@ int main(void)
> >  	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
> >  	run = vcpu->run;
> >  
> > +	vm_install_exception_handler(vm, DB_VECTOR, guest_db_handler);
> > +	vm_install_exception_handler(vm, IRQ_VECTOR, guest_irq_handler);
> 
> But the IRQ should never be taken thanks to the CLI in the STI shadow.  I.e.
> installing a dummy handler could mask failures, no?
>
Yes, this also breaks the testcase regarding KVM_GUESTDBG_BLOCKIRQ.
Sorry, I forgot why I added this, as you said there should be no IRQ
delivered due to the STI shadow. :(
I'll remove it in the next version.
 
Thanks!

> > +
> >  	/* Test software BPs - int3 */
> >  	memset(&debug, 0, sizeof(debug));
> >  	debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
> > -- 
> > 2.31.1
> > 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/7] KVM: VMX: Set 'BS' bit in pending debug exceptions during instruction emulation
  2025-12-05 18:20   ` Sean Christopherson
@ 2025-12-11 14:01     ` Hou Wenlong
  0 siblings, 0 replies; 20+ messages in thread
From: Hou Wenlong @ 2025-12-11 14:01 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Fri, Dec 05, 2025 at 10:20:06AM -0800, Sean Christopherson wrote:
> On Wed, Sep 10, 2025, Hou Wenlong wrote:
> > If 'STI' or 'MOV SS' with 'X86_EFLAGS_TF' set is emulated by the
> > emulator (e.g., using the 'force emulation' prefix), the check for
> > pending debug exceptions during VM entry would fail,
> 
> s/fail/VM-Fail, and please elaborate on what exactly fails.  I've had a lot (too
> much) of exposure to the consistency check, but I still have to look up the details
> every time.
>
Get it.
 
> > as #UD clears the pending debug exceptions. Therefore, set the 'BS' bit in
> > such situations to make instruction emulation more robust.
> > 
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > ---
> >  arch/x86/include/asm/kvm-x86-ops.h |  1 +
> >  arch/x86/include/asm/kvm_host.h    |  1 +
> >  arch/x86/kvm/vmx/main.c            |  9 +++++++++
> >  arch/x86/kvm/vmx/vmx.c             | 14 +++++++++-----
> >  arch/x86/kvm/vmx/x86_ops.h         |  1 +
> >  arch/x86/kvm/x86.c                 |  7 +++++--
> >  6 files changed, 26 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> > index 18a5c3119e1a..3a0ab1683f17 100644
> > --- a/arch/x86/include/asm/kvm-x86-ops.h
> > +++ b/arch/x86/include/asm/kvm-x86-ops.h
> > @@ -50,6 +50,7 @@ KVM_X86_OP(get_gdt)
> >  KVM_X86_OP(set_gdt)
> >  KVM_X86_OP(sync_dirty_debug_regs)
> >  KVM_X86_OP(set_dr7)
> > +KVM_X86_OP_OPTIONAL(set_pending_dbg)
> >  KVM_X86_OP(cache_reg)
> >  KVM_X86_OP(get_rflags)
> >  KVM_X86_OP(set_rflags)
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 0d3cc0fc27af..a36ca751ee2e 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1765,6 +1765,7 @@ struct kvm_x86_ops {
> >  	void (*set_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
> >  	void (*sync_dirty_debug_regs)(struct kvm_vcpu *vcpu);
> >  	void (*set_dr7)(struct kvm_vcpu *vcpu, unsigned long value);
> > +	void (*set_pending_dbg)(struct kvm_vcpu *vcpu);
> >  	void (*cache_reg)(struct kvm_vcpu *vcpu, enum kvm_reg reg);
> >  	unsigned long (*get_rflags)(struct kvm_vcpu *vcpu);
> >  	void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags);
> > diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
> > index dbab1c15b0cd..23adff73f90b 100644
> > --- a/arch/x86/kvm/vmx/main.c
> > +++ b/arch/x86/kvm/vmx/main.c
> > @@ -465,6 +465,14 @@ static void vt_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
> >  	vmx_set_dr7(vcpu, val);
> >  }
> >  
> > +static void vt_set_pending_dbg(struct kvm_vcpu *vcpu)
> > +{
> > +	if (is_td_vcpu(vcpu))
> 
> WARN_ON_ONCE()?
> 
> > +		return;
> > +
> > +	vmx_set_pending_dbg(vcpu);
> > +}
> > +
> >  static void vt_sync_dirty_debug_regs(struct kvm_vcpu *vcpu)
> >  {
> >  	/*
> > @@ -906,6 +914,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
> >  	.get_gdt = vt_op(get_gdt),
> >  	.set_gdt = vt_op(set_gdt),
> >  	.set_dr7 = vt_op(set_dr7),
> > +	.set_pending_dbg = vt_op(set_pending_dbg),
> >  	.sync_dirty_debug_regs = vt_op(sync_dirty_debug_regs),
> >  	.cache_reg = vt_op(cache_reg),
> >  	.get_rflags = vt_op(get_rflags),
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 227b45430ad8..e861a0edb3f4 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -5243,11 +5243,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
> >  			 */
> >  			if (is_icebp(intr_info))
> >  				WARN_ON(!skip_emulated_instruction(vcpu));
> > -			else if ((vmx_get_rflags(vcpu) & X86_EFLAGS_TF) &&
> > -				 (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
> > -				  (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS)))
> > -				vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
> > -					    vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS) | DR6_BS);
> > +			vmx_set_pending_dbg(vcpu);
> 
> This looks wrong.  Per Table 19-2. Debug Exception Conditions, INT1 doesn't
> set DR6.BS.  Oooh, the helper is _conditionally_ setting DR6_BS.  But that still
> _looks_ wrong, and it makes the comment confusing.
> 
> >  
> >  			kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);
> >  			return 1;
> > @@ -5554,6 +5550,14 @@ void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
> >  	vmcs_writel(GUEST_DR7, val);
> >  }
> >  
> > +void vmx_set_pending_dbg(struct kvm_vcpu *vcpu)
> 
> Related to above, this is a confusing name.  In no small part because of the rather
> insane complexity related to pending debug exceptions being visible to software
> via the VMCS.  E.g. I initially read this as "set a pending #DB", not "set the
> VMCS field with the same name based on RFLAGS.TF and whether or not the vCPU is
> in an interrupt shadow".
> 
> Maybe refresh_pending_dbg_excpetions()?  And then the above case would be:
> 
> 			if (is_icebp(intr_info))
> 				WARN_ON(!skip_emulated_instruction(vcpu));
> 			else
> 				vmx_refresh_pending_dbg_exceptions(vcpu);
>
Agree, this makes the code clearly.

> > +{
> > +	if ((vmx_get_rflags(vcpu) & X86_EFLAGS_TF) &&
> > +	    vmx_get_interrupt_shadow(vcpu))
> > +		vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
> > +			    vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS) | DR6_BS);
> > +}
> > +
> >  static int handle_tpr_below_threshold(struct kvm_vcpu *vcpu)
> >  {
> >  	kvm_apic_update_ppr(vcpu);
> > diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
> > index 2b3424f638db..2913648cfe4f 100644
> > --- a/arch/x86/kvm/vmx/x86_ops.h
> > +++ b/arch/x86/kvm/vmx/x86_ops.h
> > @@ -75,6 +75,7 @@ void vmx_get_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
> >  void vmx_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
> >  void vmx_set_dr6(struct kvm_vcpu *vcpu, unsigned long val);
> >  void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val);
> > +void vmx_set_pending_dbg(struct kvm_vcpu *vcpu);
> >  void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu);
> >  void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg);
> >  unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu);
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 83960214d5d8..464e9649cb54 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -9250,10 +9250,13 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> >  			if (ctxt->is_branch)
> >  				kvm_pmu_branch_retired(vcpu);
> >  			kvm_rip_write(vcpu, ctxt->eip);
> > -			if (r && (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)))
> > +			__kvm_set_rflags(vcpu, ctxt->eflags);
> > +			if (r && (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP))) {
> >  				r = kvm_vcpu_do_singlestep(vcpu);
> > +				if (r)
> > +					kvm_x86_call(set_pending_dbg)(vcpu);
> 
> Why not handle this in kvm_vcpu_do_singlestep()?  Ah, because the call from
> kvm_skip_emulated_instruction() can never occur in an interrupt shadow.  But
> that's a _really_ subtle detail, and more imporantly the call is benign in that
> case.
>
Yes, my initial thought is that kvm_skip_emulated_instruction() clears
the interrupt shadow, so I didn't add one more function call for it.
However, I'll move it into kvm_vcpu_do_singlestep()

> So unless there's a good reason to do otherwise, I vote to move the call into
> kvm_vcpu_do_singlestep().
> 
> > +			}
> >  			kvm_x86_call(update_emulated_instruction)(vcpu);
> > -			__kvm_set_rflags(vcpu, ctxt->eflags);
> 
> Please move the relocation of the call to __kvm_set_rflags() to its own patch.
> I vaguely recall running into problems related to the state of RFLAGS in the
> emulator versus those in the vCPU.  I don't _think_ there's a problem here, but
> if there is, I want the change to be fully isolated.
> 
> >  		}

Get it. I moved it up to keep it after kvm_rip_write() so that matches that the singlestep
trap is generated after instruction execution.


> >  
> >  		/*
> > -- 
> > 2.31.1
> > 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db()
  2025-12-05 17:58   ` Sean Christopherson
@ 2025-12-11 14:05     ` Hou Wenlong
  2025-12-11 17:19       ` Sean Christopherson
  0 siblings, 1 reply; 20+ messages in thread
From: Hou Wenlong @ 2025-12-11 14:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Fri, Dec 05, 2025 at 09:58:04AM -0800, Sean Christopherson wrote:
> On Wed, Sep 10, 2025, Hou Wenlong wrote:
> > Use kvm_inject_emulated_db() in kvm_vcpu_do_singlestep() to consolidate
> > 'KVM_GUESTDBG_SINGLESTEP' check into kvm_inject_emulated_db() during
> > emulation.
> > 
> > No functional change intended.
> > 
> > Suggested-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > ---
> >  arch/x86/kvm/x86.c | 17 +++++------------
> >  1 file changed, 5 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 5af652916a19..83960214d5d8 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -8632,7 +8632,10 @@ static int kvm_inject_emulated_db(struct kvm_vcpu *vcpu, unsigned long dr6)
> >  {
> >  	struct kvm_run *kvm_run = vcpu->run;
> >  
> > -	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) {
> > +	/* Data breakpoints are not supported in emulation for now. */
> > +	WARN_ON((dr6 & DR6_BS) && (dr6 & DR_TRAP_BITS));
> 
> If we keep this, it should be a WARN_ON_ONCE().  We've had at least one case where
> a sanity check in the emulator caused major problems because a WARN_ON() spammed
> the kernel log to the point where it overloaded things :-)
>
I'll drop it.

> But I think the WARN will be subject to false positives.  KVM doesn't emulate data
> #DBs, but it does emulate code #DBs, and fault-like code #DBs can be coincident
> with trap-like single-step #DBs.  Ah, but kvm_vcpu_check_code_breakpoint() doesn't
> account for RFLAGS.TF.  That should probably be addressed in this series, especially
> since it's consolidating KVM_GUESTDBG_SINGLESTEP handling.
Sorry, I didn't follow it, how fault-like code #DBs can be coincident
with trap-like single-step #DBs, could you provide an example?

Thanks!


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db()
  2025-12-11 14:05     ` Hou Wenlong
@ 2025-12-11 17:19       ` Sean Christopherson
  2025-12-12  9:46         ` Hou Wenlong
  0 siblings, 1 reply; 20+ messages in thread
From: Sean Christopherson @ 2025-12-11 17:19 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Thu, Dec 11, 2025, Hou Wenlong wrote:
> On Fri, Dec 05, 2025 at 09:58:04AM -0800, Sean Christopherson wrote:
> > But I think the WARN will be subject to false positives.  KVM doesn't emulate data
> > #DBs, but it does emulate code #DBs, and fault-like code #DBs can be coincident
> > with trap-like single-step #DBs.  Ah, but kvm_vcpu_check_code_breakpoint() doesn't
> > account for RFLAGS.TF.  That should probably be addressed in this series, especially
> > since it's consolidating KVM_GUESTDBG_SINGLESTEP handling.
>
> Sorry, I didn't follow it, how fault-like code #DBs can be coincident
> with trap-like single-step #DBs, could you provide an example?

Ya, here's a KUT testcase that applies on top of
https://lore.kernel.org/all/20251126191736.907963-1-seanjc@google.com.

---
 x86/debug.c | 43 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/x86/debug.c b/x86/debug.c
index 8177575c..313d854e 100644
--- a/x86/debug.c
+++ b/x86/debug.c
@@ -92,6 +92,7 @@ typedef unsigned long (*db_test_fn)(void);
 typedef void (*db_report_fn)(unsigned long, const char *);
 
 static unsigned long singlestep_with_movss_blocking_and_dr7_gd(void);
+static unsigned long singlestep_with_code_db(void);
 static unsigned long singlestep_with_sti_hlt(void);
 
 static void __run_single_step_db_test(db_test_fn test, db_report_fn report_fn)
@@ -106,11 +107,12 @@ static void __run_single_step_db_test(db_test_fn test, db_report_fn report_fn)
 	report_fn(start, "");
 
 	/*
-	 * MOV DR #GPs at CPL>0, don't try to run the DR7.GD test in usermode.
-	 * Likewise for HLT.
+	 * MOV DR #GPs at CPL>0, don't try to run the DR7.GD or code #DB tests
+	 * in usermode. Likewise for HLT.
 	 */
-	if (test == singlestep_with_movss_blocking_and_dr7_gd
-	    || test == singlestep_with_sti_hlt)
+	if (test == singlestep_with_movss_blocking_and_dr7_gd ||
+	    test == singlestep_with_code_db ||
+	    test == singlestep_with_sti_hlt)
 		return;
 
 	n = 0;
@@ -163,6 +165,38 @@ static noinline unsigned long singlestep_basic(void)
 	return start;
 }
 
+static void report_singlestep_with_code_db(unsigned long start, const char *usermode)
+{
+	report(n == 3 &&
+	       dr6[0] == (DR6_ACTIVE_LOW | DR6_BS | DR6_TRAP2) && db_addr[0] == start &&
+	       is_single_step_db(dr6[1]) && db_addr[1] == start + 1 &&
+	       is_single_step_db(dr6[2]) && db_addr[2] == start + 1 + 1,
+	       "%sSingle-step + code #DB test", usermode);
+}
+
+static noinline unsigned long singlestep_with_code_db(void)
+{
+	unsigned long start;
+
+	asm volatile (
+		"lea 1f(%%rip), %0\n\t"
+		"mov %0, %%dr2\n\t"
+		"mov $" xstr(DR7_FIXED_1 | DR7_EXECUTE_DRx(2) | DR7_GLOBAL_ENABLE_DR2) ", %0\n\t"
+		"mov %0, %%dr7\n\t"
+		"pushf\n\t"
+		"pop %%rax\n\t"
+		"or $(1<<8),%%rax\n\t"
+		"push %%rax\n\t"
+		"popf\n\t"
+		"and $~(1<<8),%%rax\n\t"
+		"1:push %%rax\n\t"
+		"popf\n\t"
+		"lea 1b(%%rip), %0\n\t"
+		: "=r" (start) : : "rax"
+	);
+	return start;
+}
+
 static void report_singlestep_emulated_instructions(unsigned long start,
 						    const char *usermode)
 {
@@ -517,6 +551,7 @@ int main(int ac, char **av)
 	       n, db_addr[0], dr6[0]);
 
 	run_ss_db_test(singlestep_basic);
+	run_ss_db_test(singlestep_with_code_db);
 	run_ss_db_test(singlestep_emulated_instructions);
 	run_ss_db_test(singlestep_with_sti_blocking);
 	run_ss_db_test(singlestep_with_movss_blocking);

base-commit: 23071a886edbe303fb964c5c386750b0b458dbfb
--

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db()
  2025-12-11 17:19       ` Sean Christopherson
@ 2025-12-12  9:46         ` Hou Wenlong
  2025-12-12 17:53           ` Sean Christopherson
  0 siblings, 1 reply; 20+ messages in thread
From: Hou Wenlong @ 2025-12-12  9:46 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Thu, Dec 11, 2025 at 09:19:39AM -0800, Sean Christopherson wrote:
> On Thu, Dec 11, 2025, Hou Wenlong wrote:
> > On Fri, Dec 05, 2025 at 09:58:04AM -0800, Sean Christopherson wrote:
> > > But I think the WARN will be subject to false positives.  KVM doesn't emulate data
> > > #DBs, but it does emulate code #DBs, and fault-like code #DBs can be coincident
> > > with trap-like single-step #DBs.  Ah, but kvm_vcpu_check_code_breakpoint() doesn't
> > > account for RFLAGS.TF.  That should probably be addressed in this series, especially
> > > since it's consolidating KVM_GUESTDBG_SINGLESTEP handling.
> >
> > Sorry, I didn't follow it, how fault-like code #DBs can be coincident
> > with trap-like single-step #DBs, could you provide an example?
> 
> Ya, here's a KUT testcase that applies on top of
> https://lore.kernel.org/all/20251126191736.907963-1-seanjc@google.com.
>
Thanks for your testcase; it really changed my perspective.
 
> ---
>  x86/debug.c | 43 +++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 39 insertions(+), 4 deletions(-)
> 
> diff --git a/x86/debug.c b/x86/debug.c
> index 8177575c..313d854e 100644
> --- a/x86/debug.c
> +++ b/x86/debug.c
> @@ -92,6 +92,7 @@ typedef unsigned long (*db_test_fn)(void);
>  typedef void (*db_report_fn)(unsigned long, const char *);
>  
>  static unsigned long singlestep_with_movss_blocking_and_dr7_gd(void);
> +static unsigned long singlestep_with_code_db(void);
>  static unsigned long singlestep_with_sti_hlt(void);
>  
>  static void __run_single_step_db_test(db_test_fn test, db_report_fn report_fn)
> @@ -106,11 +107,12 @@ static void __run_single_step_db_test(db_test_fn test, db_report_fn report_fn)
>  	report_fn(start, "");
>  
>  	/*
> -	 * MOV DR #GPs at CPL>0, don't try to run the DR7.GD test in usermode.
> -	 * Likewise for HLT.
> +	 * MOV DR #GPs at CPL>0, don't try to run the DR7.GD or code #DB tests
> +	 * in usermode. Likewise for HLT.
>  	 */
> -	if (test == singlestep_with_movss_blocking_and_dr7_gd
> -	    || test == singlestep_with_sti_hlt)
> +	if (test == singlestep_with_movss_blocking_and_dr7_gd ||
> +	    test == singlestep_with_code_db ||
> +	    test == singlestep_with_sti_hlt)
>  		return;
>  
>  	n = 0;
> @@ -163,6 +165,38 @@ static noinline unsigned long singlestep_basic(void)
>  	return start;
>  }
>  
> +static void report_singlestep_with_code_db(unsigned long start, const char *usermode)
> +{
> +	report(n == 3 &&
> +	       dr6[0] == (DR6_ACTIVE_LOW | DR6_BS | DR6_TRAP2) && db_addr[0] == start &&
> +	       is_single_step_db(dr6[1]) && db_addr[1] == start + 1 &&
> +	       is_single_step_db(dr6[2]) && db_addr[2] == start + 1 + 1,
> +	       "%sSingle-step + code #DB test", usermode);
> +}
> +
> +static noinline unsigned long singlestep_with_code_db(void)
> +{
> +	unsigned long start;
> +
> +	asm volatile (
> +		"lea 1f(%%rip), %0\n\t"
> +		"mov %0, %%dr2\n\t"
> +		"mov $" xstr(DR7_FIXED_1 | DR7_EXECUTE_DRx(2) | DR7_GLOBAL_ENABLE_DR2) ", %0\n\t"
> +		"mov %0, %%dr7\n\t"
> +		"pushf\n\t"
> +		"pop %%rax\n\t"
> +		"or $(1<<8),%%rax\n\t"
> +		"push %%rax\n\t"
> +		"popf\n\t"
> +		"and $~(1<<8),%%rax\n\t"
In my previous understanding, I thought there would be two #DBs
generated at the instruction boundary. First, the single-step trap #DB
would be handled, and then, when resuming to start the new instruction,
it would check for the code breakpoint and generate a code fault #DB.
However, it turns out that the check for the code breakpoint happened
before the instruction boundary. I also see in the kernel hardware
breakpoint handler that it notes that code breakpoints and single-step
can be detected together. Is this due to instruction prefetch?

If we want to emulate the hardware behavior in the emulator, does that
mean we need to check for code breakpoints in kvm_vcpu_do_single_step()
and set the DR_TRAP_BITS along with the DR6_BS bit?

Thanks!

> +		"1:push %%rax\n\t"
> +		"popf\n\t"
> +		"lea 1b(%%rip), %0\n\t"
> +		: "=r" (start) : : "rax"
> +	);
> +	return start;
> +}
> +
>  static void report_singlestep_emulated_instructions(unsigned long start,
>  						    const char *usermode)
>  {
> @@ -517,6 +551,7 @@ int main(int ac, char **av)
>  	       n, db_addr[0], dr6[0]);
>  
>  	run_ss_db_test(singlestep_basic);
> +	run_ss_db_test(singlestep_with_code_db);
>  	run_ss_db_test(singlestep_emulated_instructions);
>  	run_ss_db_test(singlestep_with_sti_blocking);
>  	run_ss_db_test(singlestep_with_movss_blocking);
> 
> base-commit: 23071a886edbe303fb964c5c386750b0b458dbfb
> --

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db()
  2025-12-12  9:46         ` Hou Wenlong
@ 2025-12-12 17:53           ` Sean Christopherson
  2025-12-13 16:15             ` Hou Wenlong
  0 siblings, 1 reply; 20+ messages in thread
From: Sean Christopherson @ 2025-12-12 17:53 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Fri, Dec 12, 2025, Hou Wenlong wrote:
> On Thu, Dec 11, 2025 at 09:19:39AM -0800, Sean Christopherson wrote:
> > On Thu, Dec 11, 2025, Hou Wenlong wrote:
> > +static noinline unsigned long singlestep_with_code_db(void)
> > +{
> > +	unsigned long start;
> > +
> > +	asm volatile (
> > +		"lea 1f(%%rip), %0\n\t"
> > +		"mov %0, %%dr2\n\t"
> > +		"mov $" xstr(DR7_FIXED_1 | DR7_EXECUTE_DRx(2) | DR7_GLOBAL_ENABLE_DR2) ", %0\n\t"
> > +		"mov %0, %%dr7\n\t"
> > +		"pushf\n\t"
> > +		"pop %%rax\n\t"
> > +		"or $(1<<8),%%rax\n\t"
> > +		"push %%rax\n\t"
> > +		"popf\n\t"
> > +		"and $~(1<<8),%%rax\n\t"
> In my previous understanding, I thought there would be two #DBs
> generated at the instruction boundary. First, the single-step trap #DB
> would be handled, and then, when resuming to start the new instruction,
> it would check for the code breakpoint and generate a code fault #DB.
> However, it turns out that the check for the code breakpoint happened
> before the instruction boundary. 

Yeah, that's what I was trying to explain by describing code breakpoint as fault-like.

> I also see in the kernel hardware breakpoint handler that it notes that code
> breakpoints and single-step can be detected together. Is this due to
> instruction prefetch?

Nope, it's just how #DBs work, everything pending gets smushed together.  Note,
data #DBs can also be coincident.  E.g. it's entirely possible that you could
observe a code breakpoint, a data breakpoint, and a single-step breakpoint in a
single #DB.

> If we want to emulate the hardware behavior in the emulator, does that
> mean we need to check for code breakpoints in kvm_vcpu_do_single_step()
> and set the DR_TRAP_BITS along with the DR6_BS bit?

Hmm, ya, I think so?  I don't think the CPU will fetch and merge the imminent
code #DB with the injected single-step #DB.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db()
  2025-12-12 17:53           ` Sean Christopherson
@ 2025-12-13 16:15             ` Hou Wenlong
  0 siblings, 0 replies; 20+ messages in thread
From: Hou Wenlong @ 2025-12-13 16:15 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Lai Jiangshan, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Fri, Dec 12, 2025 at 09:53:20AM -0800, Sean Christopherson wrote:
> On Fri, Dec 12, 2025, Hou Wenlong wrote:
> > On Thu, Dec 11, 2025 at 09:19:39AM -0800, Sean Christopherson wrote:
> > > On Thu, Dec 11, 2025, Hou Wenlong wrote:
> > > +static noinline unsigned long singlestep_with_code_db(void)
> > > +{
> > > +	unsigned long start;
> > > +
> > > +	asm volatile (
> > > +		"lea 1f(%%rip), %0\n\t"
> > > +		"mov %0, %%dr2\n\t"
> > > +		"mov $" xstr(DR7_FIXED_1 | DR7_EXECUTE_DRx(2) | DR7_GLOBAL_ENABLE_DR2) ", %0\n\t"
> > > +		"mov %0, %%dr7\n\t"
> > > +		"pushf\n\t"
> > > +		"pop %%rax\n\t"
> > > +		"or $(1<<8),%%rax\n\t"
> > > +		"push %%rax\n\t"
> > > +		"popf\n\t"
> > > +		"and $~(1<<8),%%rax\n\t"
> > In my previous understanding, I thought there would be two #DBs
> > generated at the instruction boundary. First, the single-step trap #DB
> > would be handled, and then, when resuming to start the new instruction,
> > it would check for the code breakpoint and generate a code fault #DB.
> > However, it turns out that the check for the code breakpoint happened
> > before the instruction boundary. 
> 
> Yeah, that's what I was trying to explain by describing code breakpoint as fault-like.
> 
> > I also see in the kernel hardware breakpoint handler that it notes that code
> > breakpoints and single-step can be detected together. Is this due to
> > instruction prefetch?
> 
> Nope, it's just how #DBs work, everything pending gets smushed together.  Note,
> data #DBs can also be coincident.  E.g. it's entirely possible that you could
> observe a code breakpoint, a data breakpoint, and a single-step breakpoint in a
> single #DB.
> 
> > If we want to emulate the hardware behavior in the emulator, does that
> > mean we need to check for code breakpoints in kvm_vcpu_do_single_step()
> > and set the DR_TRAP_BITS along with the DR6_BS bit?
> 
> Hmm, ya, I think so?  I don't think the CPU will fetch and merge the imminent
> code #DB with the injected single-step #DB.
Um, I have one more question: what do you mean when you say that
kvm_vcpu_check_code_breakpoint() doesn't account for RFLAGS.TF?

Thanks!

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2025-12-13 16:15 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-10  2:49 [PATCH 0/7] KVM: x86: Improve the handling of debug exceptions during instruction emulation Hou Wenlong
2025-09-10  2:49 ` [PATCH 1/7] KVM: x86: Set guest DR6 by kvm_queue_exception_p() in " Hou Wenlong
2025-09-10  2:49 ` [PATCH 2/7] KVM: x86: Check guest debug in DR access " Hou Wenlong
2025-12-05 17:51   ` Sean Christopherson
2025-09-10  2:49 ` [PATCH 3/7] KVM: x86: Only check effective code breakpoint in emulation Hou Wenlong
2025-09-10  2:49 ` [PATCH 4/7] KVM: x86: Consolidate KVM_GUESTDBG_SINGLESTEP check into the kvm_inject_emulated_db() Hou Wenlong
2025-12-05 17:58   ` Sean Christopherson
2025-12-11 14:05     ` Hou Wenlong
2025-12-11 17:19       ` Sean Christopherson
2025-12-12  9:46         ` Hou Wenlong
2025-12-12 17:53           ` Sean Christopherson
2025-12-13 16:15             ` Hou Wenlong
2025-09-10  2:49 ` [PATCH 5/7] KVM: VMX: Set 'BS' bit in pending debug exceptions during instruction emulation Hou Wenlong
2025-12-05 18:20   ` Sean Christopherson
2025-12-11 14:01     ` Hou Wenlong
2025-09-10  2:49 ` [PATCH 6/7] KVM: selftests: Verify guest debug DR7.GD checking " Hou Wenlong
2025-12-05 18:21   ` Sean Christopherson
2025-09-10  2:49 ` [PATCH 7/7] KVM: selftests: Verify 'BS' bit checking in pending debug exception during VM entry Hou Wenlong
2025-12-05 18:23   ` Sean Christopherson
2025-12-11 13:21     ` Hou Wenlong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).