From: Sean Christopherson <seanjc@google.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org,
Lai Jiangshan <laijs@linux.alibaba.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: X86: Don't reset dr6 unconditionally when the vcpu being scheduled out
Date: Mon, 9 Aug 2021 16:54:03 +0000
Message-ID: <YRFdq8sNuXYpgemU@google.com>
In-Reply-To: <20210808232919.862835-1-jiangshanlai@gmail.com>
On Mon, Aug 09, 2021, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
>
> The commit efdab992813fb ("KVM: x86: fix escape of guest dr6 to the host")
> fixed a bug. It did a great job and reset dr6 unconditionally when the
> vcpu being scheduled out since the linux kernel only activates breakpoints
> in rescheduling (perf events).
>
> But writing to debug registers is slow, and it can be shown in perf results
> sometimes even neither the host nor the guest activate breakpoints.
>
> It'd be better to reset it conditionally and this patch moves the code of
> resetting dr6 to the path of VM-exit where we can check the related
> conditions. And the comment is also updated.
>
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
> And even in the future, the linux kernel might activate breakpoint
> in interrupts (rather than in rescheduling only), the host would
> not be confused by the stale dr6 after this patch. The possible future
> author who would implement it wouldn't need to care about how the kvm
> manages debug registers since it sticks to the host's expectations.

Eh, I don't think this is a valid argument. KGDB already manipulates breakpoints
in NMI context, and activating breakpoints from interrupt context would absolutely
require a full audit of the kernel.

> Another solution is changing breakpoint.c and make the linux kernel
> always reset dr6 in activating breakpoints. So that dr6 is allowed
> to be stale when host breakpoints is not enabled and we don't need
> to reset dr6 in this case. But it would be no harm to reset it even in
> both places and killing stale states is good in programming.

Hmm, other than further penalizing guests that enable debug breakpoints.

What about adding an arch.switch_db_regs flag to note that DR6 is loaded with a
guest value and keeping the reset code in kvm_arch_vcpu_put()?

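As a rough illustration of the flag idea, here is a standalone userspace model,
not a kernel patch. Everything prefixed demo_/DEMO_ is hypothetical, including
the flag name and its bit value; the real code would use set_debugreg() and a
new KVM_DEBUGREG_* bit. It only shows that the slow DR6 write on the put path
is skipped unless a guest value was actually loaded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag; not an existing KVM_DEBUGREG_* bit. */
#define DEMO_DEBUGREG_DR6_GUEST_LOADED (1u << 3)

/* Stand-ins for the hardware DR6 and a count of (slow) writes to it. */
static uint64_t demo_hw_dr6;
static unsigned int demo_dr6_writes;

/* Minimal stand-in for the relevant parts of vcpu->arch. */
struct demo_vcpu {
	unsigned int switch_db_regs;
	uint64_t dr6;
};

static void demo_set_dr6(uint64_t val)
{
	demo_hw_dr6 = val;
	demo_dr6_writes++;
}

/* Models the vmentry path: loading the guest DR6 marks hardware as dirty. */
static void demo_load_guest_dr6(struct demo_vcpu *vcpu)
{
	assert(vcpu);
	demo_set_dr6(vcpu->dr6);
	vcpu->switch_db_regs |= DEMO_DEBUGREG_DR6_GUEST_LOADED;
}

/* Models kvm_arch_vcpu_put(): reset DR6 only if a guest value was loaded. */
static void demo_vcpu_put(struct demo_vcpu *vcpu)
{
	assert(vcpu);
	if (vcpu->switch_db_regs & DEMO_DEBUGREG_DR6_GUEST_LOADED) {
		demo_set_dr6(0);
		vcpu->switch_db_regs &= ~DEMO_DEBUGREG_DR6_GUEST_LOADED;
	}
}
```

In this model, a put without a preceding guest DR6 load performs no write at
all, and a put after a load performs exactly one write and clears the flag.
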
> arch/x86/kvm/x86.c | 29 +++++++++++++++++++++--------
> 1 file changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4116567f3d44..39b5dad2dd19 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4310,12 +4310,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>
> static_call(kvm_x86_vcpu_put)(vcpu);
> vcpu->arch.last_host_tsc = rdtsc();
> - /*
> - * If userspace has set any breakpoints or watchpoints, dr6 is restored
> - * on every vmexit, but if not, we might have a stale dr6 from the
> - * guest. do_debug expects dr6 to be cleared after it runs, do the same.
> - */
> - set_debugreg(0, 6);
> }
>
> static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> @@ -9375,6 +9369,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> fastpath_t exit_fastpath;
>
> bool req_immediate_exit = false;
> + bool reset_dr6 = false;
>
> /* Forbid vmenter if vcpu dirty ring is soft-full */
> if (unlikely(vcpu->kvm->dirty_ring_size &&
> @@ -9601,6 +9596,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> set_debugreg(vcpu->arch.eff_db[3], 3);
> set_debugreg(vcpu->arch.dr6, 6);

Not directly related to this patch, but why does KVM_DEBUGREG_RELOAD exist?
Commit ae561edeb421 ("KVM: x86: DR0-DR3 are not clear on reset") added it to
ensure DR0-3 are fresh when they're modified through non-standard paths, but I
don't see any reason why the new values _must_ be loaded into hardware. eff_db
needs to be updated, but I don't see why hardware DRs need to be updated unless
hardware breakpoints are active or DR exiting is disabled, and in those cases
updating hardware is handled by KVM_DEBUGREG_WONT_EXIT and KVM_DEBUGREG_BP_ENABLED.

Am I missing something?

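To make the question concrete, the condition I have in mind looks roughly like
the standalone sketch below. The DEMO_* names and bit values are placeholders
for illustration only (check kvm_host.h for the real KVM_DEBUGREG_* bits), and
demo_hw_drs_needed() is a hypothetical helper, not an existing KVM function.

```c
#include <assert.h>
#include <stdbool.h>

/* Demo-only bit values standing in for the KVM_DEBUGREG_* flags. */
#define DEMO_DEBUGREG_BP_ENABLED (1u << 0)
#define DEMO_DEBUGREG_WONT_EXIT  (1u << 1)
#define DEMO_DEBUGREG_RELOAD     (1u << 2)

/*
 * True if the hardware DRs must hold guest values before entering the guest:
 * either guest breakpoints are enabled or DR accesses won't exit. A pending
 * "reload" on its own would only require refreshing the software copy.
 */
static bool demo_hw_drs_needed(unsigned int switch_db_regs)
{
	return (switch_db_regs &
		(DEMO_DEBUGREG_BP_ENABLED | DEMO_DEBUGREG_WONT_EXIT)) != 0;
}
```

Under that reading, a vCPU with only the reload bit set would not need its
hardware DRs touched on entry.
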
> vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> + reset_dr6 = true;
> } else if (unlikely(hw_breakpoint_active())) {
> set_debugreg(0, 7);
> }
> @@ -9631,17 +9627,34 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> kvm_update_dr0123(vcpu);
> kvm_update_dr7(vcpu);
> vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> + reset_dr6 = true;
> }
>
> /*
> * If the guest has used debug registers, at least dr7
> * will be disabled while returning to the host.
> + *
> + * If we have active breakpoints in the host, restore the old state.
> + *
> * If we don't have active breakpoints in the host, we don't
> - * care about the messed up debug address registers. But if
> - * we have some of them active, restore the old state.
> + * care about the messed up debug address registers but dr6
> + * which is expected to be cleared normally. Otherwise we might
> + * leak a stale dr6 from the guest and confuse the host since
> + * neither the host reset dr6 on activating next breakpoints nor
> + * the hardware clear it on dilivering #DB. The Intel SDM says:
> + *
> + * Certain debug exceptions may clear bits 0-3. The remaining
> + * contents of the DR6 register are never cleared by the
> + * processor. To avoid confusion in identifying debug
> + * exceptions, debug handlers should clear the register before
> + * returning to the interrupted task.
> + *
> + * Keep it simple and live up to expectations: clear DR6 immediately.
> */
> if (hw_breakpoint_active())
> hw_breakpoint_restore();
> + else if (unlikely(reset_dr6))
> + set_debugreg(DR6_RESERVED, 6);
>
> vcpu->arch.last_vmentry_cpu = vcpu->cpu;
> vcpu->arch.last_guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> --
> 2.19.1.6.gb485710b
>