From: Aidan Khoury <aidan@aktech.ai>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Aidan Khoury <aidan@revers.engineering>,
	Nick Peterson <everdox@gmail.com>, Aidan Khoury <aidan@aktech.ai>
Subject: [PATCH v1 0/1] KVM: x86: Merge pending debug causes when vectoring #DB
Date: Wed,  7 Jan 2026 19:57:23 -0400
Message-ID: <20260107235724.28101-1-aidan@aktech.ai>

This is a single patch that fixes incorrect guest DR6 contents when KVM
vectors #DB on Intel VMX in the presence of deferred debug causes recorded
in VMCS.GUEST_PENDING_DBG_EXCEPTIONS.

See Intel SDM Vol. 3C, 27.3.1.5 "Checks on Guest Non-Register State"
and 27.7.3 "Delivery of Pending Debug Exceptions after VM Entry".

Intel VMX records deferred debug exception causes in the VMCS field
GUEST_PENDING_DBG_EXCEPTIONS (B0-B3, enabled breakpoint, BS, RTM). This
state is used when debug exceptions are suppressed by an interrupt
shadow (e.g. MOV SS/POP SS or STI); the deferred causes are later
combined with other debug causes when #DB is ultimately delivered.
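
For reference, the field layout described above can be written down as
plain C constants. This is only a sketch: the SK_ names and the helper
are hypothetical and are not the identifiers used by KVM or by this
patch; the bit positions follow the SDM description of the pending
debug exceptions field.

```
#include <stdint.h>

/* Hypothetical names; bit positions per the SDM pending debug exceptions field. */
#define SK_PENDING_DBG_B0_B3		0xfULL		/* bits 3:0, B0-B3 */
#define SK_PENDING_DBG_ENABLED_BP	(1ULL << 12)	/* enabled breakpoint */
#define SK_PENDING_DBG_BS		(1ULL << 14)	/* single-step */
#define SK_PENDING_DBG_RTM		(1ULL << 16)	/* RTM region */

/* Keep only the architecturally defined bits of a raw field value. */
static inline uint64_t sk_pending_dbg_defined_bits(uint64_t field)
{
	return field & (SK_PENDING_DBG_B0_B3 | SK_PENDING_DBG_ENABLED_BP |
			SK_PENDING_DBG_BS | SK_PENDING_DBG_RTM);
}
```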

KVM may vector an in-kernel #DB after a VM-exit and/or instruction
emulation. A concrete example is a guest that:

  - programs a data breakpoint (B0) on an operand used by MOV SS
  - enables single-step (RFLAGS.TF)
  - executes MOV SS (data breakpoint triggers, #DB is suppressed)
  - executes an instruction that VM-exits and is emulated (e.g. CPUID),
    or executes ICEBP/INT1 which is intercepted as #DB

On bare metal, guest DR6 reports the combined reasons (e.g. BS+B0). In
KVM/VMX, the deferred breakpoint cause is recorded in
GUEST_PENDING_DBG_EXCEPTIONS while KVM generates a #DB for single-step,
but the queued #DB payload delivered to guest DR6 can omit the pending
causes. This results in guest DR6 missing B0-B3 even though the CPU
would report them.

Fix this by merging pending causes from GUEST_PENDING_DBG_EXCEPTIONS into
the #DB payload when delivering the exception payload to guest state.
The merge is performed in kvm_deliver_exception_payload() so it applies
to both the normal injection path and the !guest_mode path where the
payload may be consumed immediately by kvm_multiple_exception().
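
As a rough illustration of the merge (not the code in the patch), the
sketch below models the #DB payload as a DR6-style bitmask and ORs in
the deferred causes. The sk_ names are hypothetical, and merging only
B0-B3 and BS is an assumption made to keep the example small.

```
#include <stdint.h>

#define SK_DR6_B0_B3	0xfULL		/* breakpoint condition bits B0-B3 */
#define SK_DR6_BS	(1ULL << 14)	/* single-step */

/*
 * Hypothetical helper: combine the payload KVM queued for #DB with the
 * causes the CPU deferred. Bits outside B0-B3/BS (e.g. the
 * enabled-breakpoint flag in bit 12) are dropped, since they are not
 * DR6 cause bits in this simplified model.
 */
static uint64_t sk_merge_pending_db(uint64_t payload, uint64_t pending)
{
	return payload | (pending & (SK_DR6_B0_B3 | SK_DR6_BS));
}
```

With a queued single-step payload (BS) and a deferred B0 cause, the
merged value carries both bits, which is the BS+B0 result described for
bare metal.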

To keep x86 core code vendor-agnostic, add an optional x86 op
get_pending_dbg_exceptions() that returns the relevant pending-debug
bits for the active vendor. VMX implements the hook by reading
VMCS.GUEST_PENDING_DBG_EXCEPTIONS and masking to architecturally defined
bits; other vendors return 0.
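
The shape of such an optional hook can be sketched with plain function
pointers. KVM uses its own machinery for x86 ops, so everything here
(struct sk_vendor_ops, the sk_ helpers, the fake vCPU pointer) is a
hypothetical stand-in meant only to show the "unset hook returns 0"
behavior.

```
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vendor op table; a NULL hook means "not implemented". */
struct sk_vendor_ops {
	uint64_t (*get_pending_dbg_exceptions)(void *vcpu);
};

/* Common code: call the hook if the vendor provides one, else report 0. */
static uint64_t sk_call_get_pending(const struct sk_vendor_ops *ops, void *vcpu)
{
	if (!ops->get_pending_dbg_exceptions)
		return 0;
	return ops->get_pending_dbg_exceptions(vcpu);
}

/*
 * Toy VMX-like implementation: "vcpu" points at a fake copy of the
 * pending debug exceptions field, masked to B0-B3, bit 12, BS and RTM.
 */
static uint64_t sk_vmx_get_pending(void *vcpu)
{
	const uint64_t *fake_field = vcpu;

	return *fake_field & 0x1500fULL;
}
```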

After this change, guests observe all accumulated #DB causes in DR6 when
#DB is vectored, matching bare-metal behavior.

Tested on an Intel host with KVM/VMX enabled by running the small repro
asm below inside a Windows guest. On bare metal, DR6 reports both BS and
B0. Under KVM/VMX without this patch, DR6 may report BS but omit B0.

A minimal reproducer in the guest:
  - use ptrace (or VEH + SetThreadContext on Windows) to handle #DB traps
    and manage the debug registers
  - program DR0 and DR7 for an enabled data breakpoint on the memory
    operand used by MOV SS
  - set RFLAGS.TF to enable single-step
  - execute MOV SS from that memory operand (breakpoint hit, delivery
    suppressed)
  - execute CPUID (emulated) or ICEBP/INT1 (intercepted)
  - in the #DB handler, read DR6 to observe the reported causes

```
__asm__ __volatile__(
	"pushq %%rbx\n"
	"pushfq\n"
	"orl $0x100, (%%rsp)\n"    /* set TF in saved RFLAGS on stack */
	"popfq\n"
	"movw (%0), %%ss\n"        /* load SS from probe page */
	"cpuid\n"                  /* intercepted instruction that forces a VM-exit (CPUID here) */
	"popq %%rbx\n"             /* #DB fires here with DR6 missing B0 under KVM/VMX pre-patch */
	: : "r"(&ss_probe) : "rax", "rcx", "rdx", "memory"
);
```
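
The DR0/DR7 setup step from the list above is not part of the asm. A
DR7 value for it could be computed roughly as follows; the sk_ names
are hypothetical, and the choice of a read/write breakpoint of length 2
(matching the 16-bit load performed by MOV SS) is an assumption. The
resulting value would still have to be installed via ptrace or
SetThreadContext.

```
#include <stdint.h>

/* DR7 R/Wn encodings. */
enum sk_bp_rw { SK_BP_EXEC = 0, SK_BP_WRITE = 1, SK_BP_IO = 2, SK_BP_RW = 3 };
/* DR7 LENn encodings. */
enum sk_bp_len { SK_BP_LEN_1 = 0, SK_BP_LEN_2 = 1, SK_BP_LEN_8 = 2, SK_BP_LEN_4 = 3 };

/* Build a DR7 value that locally enables breakpoint "idx" (0-3). */
static uint64_t sk_dr7_enable(unsigned int idx, enum sk_bp_rw rw,
			      enum sk_bp_len len)
{
	uint64_t dr7 = 0;

	dr7 |= 1ULL << (idx * 2);			/* Ln: local enable */
	dr7 |= (uint64_t)rw << (16 + idx * 4);		/* R/Wn: access type */
	dr7 |= (uint64_t)len << (18 + idx * 4);		/* LENn: operand size */
	return dr7;
}
```

For DR0 with a 2-byte read/write breakpoint this yields 0x70001.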

Reproducer PoC: https://github.com/ajkhoury/kvm-guest-anomalies

Aidan Khoury (1):
  KVM: x86: Merge pending debug causes when vectoring #DB

 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/vmx/main.c            |  9 +++++++++
 arch/x86/kvm/vmx/vmx.c             | 16 +++++++++++-----
 arch/x86/kvm/vmx/x86_ops.h         |  1 +
 arch/x86/kvm/x86.c                 | 12 ++++++++++++
 6 files changed, 35 insertions(+), 5 deletions(-)

-- 
2.43.0



Thread overview: 5+ messages
2026-01-07 23:57 Aidan Khoury [this message]
2026-01-07 23:57 ` [PATCH v1 1/1] KVM: x86: Merge pending debug causes when vectoring #DB Aidan Khoury
2026-05-14  1:08   ` Sean Christopherson
2026-05-15  0:50     ` Sean Christopherson
2026-05-15  1:00       ` Sean Christopherson
