From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	jmattson@google.com, stable@vger.kernel.org
Subject: Re: [PATCH] KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off
Date: Thu, 28 Jan 2021 09:56:54 -0800
Message-ID: <YBL65uIZggTjGO7F@google.com>
In-Reply-To: <20210128170800.1783502-1-pbonzini@redhat.com>

On Thu, Jan 28, 2021, Paolo Bonzini wrote:
> Userspace that does not know about KVM_GET_MSR_FEATURE_INDEX_LIST will
> generally use the default value for MSR_IA32_ARCH_CAPABILITIES.
> When this happens and the host has tsx=on, it is possible to end up
> with virtual machines that have HLE and RTM disabled, but TSX_CTRL
> disabled.

This wording is confusing the heck out of me.  I think what you're saying is
"but TSX disabled in the guest via TSX_CTRL".  I read "but TSX_CTRL disabled" as
saying that TSX_CTRL itself was disabled/unsupported.

> If the fleet is then switched to tsx=off, kvm_get_arch_capabilities()
> will clear the ARCH_CAP_TSX_CTRL_MSR bit and it will not be possible
> to use the tsx=off as migration destinations, even though the guests
> indeed do not have TSX enabled.
> 
> When tsx=off is used, however, we know that guests will not have
> HLE and RTM (or if userspace sets bogus CPUID data, we do not
> expect HLE and RTM to work in guests).  Therefore we can keep
> TSX_CTRL_RTM_DISABLE set for the entire life of the guests and
> save MSR reads and writes on KVM_RUN and in the user return
> notifiers.
> 
> Cc: stable@vger.kernel.org
> Fixes: cbbaa2727aa3 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 12 +++++++++++-
>  arch/x86/kvm/x86.c     |  2 +-
>  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index cc60b1fc3ee7..80491a729408 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6863,8 +6863,18 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  			 * No need to pass TSX_CTRL_CPUID_CLEAR through, so
>  			 * let's avoid changing CPUID bits under the host
>  			 * kernel's feet.
> +			 *
> +			 * If the host disabled RTM, we may still need TSX_CTRL
> +			 * to be supported in the guest; for example the guest
> +			 * could have been created on a tsx=on host with hle=0,
> +			 * rtm=0, tsx_ctrl=1 and later migrate to a tsx=off host.
> +			 * In that case however do not change the value on the host,
> +			 * so that TSX remains always disabled.

Oof, can you reword this to clarify what "the value" refers to?  The previous
paragraph talks about TSX_CTRL_CPUID_CLEAR, and the obvious "value" in the code
is also TSX_CTRL_CPUID_CLEAR, and so I thought the comment was saying "don't
change the value of CPUID_CLEAR", which is nonsensical because that's what the
RTM-enabled case does...

>  			 */
> -			vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
> +			if (boot_cpu_has(X86_FEATURE_RTM))
> +				vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
> +			else
> +				vmx->guest_uret_msrs[j].mask = 0;

IMO, this is an unnecessarily confusing way to "remove" the user return MSR.
Changing the ordering to do a 'continue' would also provide a separate chunk of
code for the new comment.  And maybe replace the switch with an if-statement to
avoid a 'continue' buried in a switch?

		vmx->guest_uret_msrs[j].slot = i;
		vmx->guest_uret_msrs[j].data = 0;
		if (index == MSR_IA32_TSX_CTRL) {
			/* Fancy new comment here. */
			if (!boot_cpu_has(X86_FEATURE_RTM))
				continue;

			/*
			 * No need to pass TSX_CTRL_CPUID_CLEAR through, so
			 * let's avoid changing CPUID bits under the host
			 * kernel's feet.
			 */
			vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
		} else {
			vmx->guest_uret_msrs[j].mask = -1ull;
		}
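
For anyone following along, the reason "mask = 0" acts as a removal is that the
value loaded into hardware for a user return MSR is a merge of guest and host
bits.  A minimal standalone sketch, assuming the merge done when the MSR is
loaded is "(value & mask) | (host & ~mask)"; uret_msr_hw_value() is a made-up
name for illustration, not a kernel function:

```c
#include <stdint.h>

/* Bit layout of MSR_IA32_TSX_CTRL. */
#define TSX_CTRL_RTM_DISABLE	(1ULL << 0)
#define TSX_CTRL_CPUID_CLEAR	(1ULL << 1)

/*
 * Hypothetical helper: compute the value that would reach hardware for a
 * user return MSR.  Bits set in @mask are taken from the guest's value,
 * all other bits keep the host's value.
 */
static uint64_t uret_msr_hw_value(uint64_t guest, uint64_t host, uint64_t mask)
{
	return (guest & mask) | (host & ~mask);
}
```

With mask == 0 the result is always the host's value, i.e. the MSR is never
effectively switched, whereas mask == ~TSX_CTRL_CPUID_CLEAR lets the guest
control everything except CPUID_CLEAR.  Skipping the entry with a 'continue'
should get the same end result without relying on a zero mask.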

>  			break;
>  		default:
>  			vmx->guest_uret_msrs[j].mask = -1ull;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 76bce832cade..15733013b266 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1401,7 +1401,7 @@ static u64 kvm_get_arch_capabilities(void)

This comment needs to be rewritten; it reflects the old behavior of exposing
the feature iff RTM/TSX is supported by the host.

>  	 *	  This lets the guest use VERW to clear CPU buffers.
>  	 */
>  	if (!boot_cpu_has(X86_FEATURE_RTM))
> -		data &= ~(ARCH_CAP_TAA_NO | ARCH_CAP_TSX_CTRL_MSR);
> +		data &= ~ARCH_CAP_TAA_NO;
>  	else if (!boot_cpu_has_bug(X86_BUG_TAA))
>  		data |= ARCH_CAP_TAA_NO;
>  
> -- 
> 2.26.2
> 
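
To spell out the behavior that a rewritten comment would need to describe, here
is a rough standalone model of the hunk above with the patch applied.  It is
only a sketch: arch_caps_tsx_fixup() and its bool parameters are hypothetical
stand-ins for the boot_cpu_has(X86_FEATURE_RTM) and
boot_cpu_has_bug(X86_BUG_TAA) checks, and the rest of
kvm_get_arch_capabilities() is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits of MSR_IA32_ARCH_CAPABILITIES touched by this hunk. */
#define ARCH_CAP_TSX_CTRL_MSR	(1ULL << 7)
#define ARCH_CAP_TAA_NO		(1ULL << 8)

/*
 * Hypothetical model of the TSX/TAA tail of kvm_get_arch_capabilities()
 * after the patch: ARCH_CAP_TSX_CTRL_MSR is left alone, only TAA_NO is
 * adjusted based on the host's RTM support and TAA bug status.
 */
static uint64_t arch_caps_tsx_fixup(uint64_t data, bool host_has_rtm,
				    bool host_has_taa_bug)
{
	if (!host_has_rtm)
		data &= ~ARCH_CAP_TAA_NO;	/* TSX_CTRL_MSR is no longer cleared */
	else if (!host_has_taa_bug)
		data |= ARCH_CAP_TAA_NO;

	return data;
}
```

On a tsx=off host (no RTM), a guest whose ARCH_CAPABILITIES has TSX_CTRL_MSR
set keeps that bit, which is what makes such hosts usable as migration
destinations.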


