public inbox for linux-coco@lists.linux.dev
 help / color / mirror / Atom feed
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "seanjc@google.com" <seanjc@google.com>
Cc: "tglx@kernel.org" <tglx@kernel.org>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"dmaluka@chromium.org" <dmaluka@chromium.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"binbin.wu@linux.intel.com" <binbin.wu@linux.intel.com>,
	"bp@alien8.de" <bp@alien8.de>, "kas@kernel.org" <kas@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"Yamahata, Isaku" <isaku.yamahata@intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>
Subject: Re: [PATCH] KVM: TDX: Fix APIC MSR ranges in tdx_has_emulated_msr()
Date: Fri, 3 Apr 2026 18:40:37 +0000	[thread overview]
Message-ID: <401db97a254b356ad8539a5d637d68ee826179a5.camel@intel.com> (raw)
In-Reply-To: <ac_rNT69TdocrMYc@google.com>

On Fri, 2026-04-03 at 09:30 -0700, Sean Christopherson wrote:
> On Thu, Mar 19, 2026, Rick P Edgecombe wrote:
> > On Thu, 2026-03-19 at 15:40 +0800, Binbin Wu wrote:
> > > tdx_has_emulated_msr() is used by KVM to decide whether to emulate a MSR access from the
> > > TDVMCALL or just return the error code.
> > > 
> > > During an off-list discussion, Rick noted that #VE reduction could change the behavior of
> > > accessing an MSR (e.g., from #VE to #GP or to be virtualized by the TDX module) without
> > > KVM knowing.Because KVM lacks the full context to perfectly decide if an MSR should be
> > > emulated, the question was raised: Can we just delete tdx_has_emulated_msr() entirely?
> > > 
> > > However, these native type x2apic MSRs are a special case. Since the TDX module owns the
> > > APICv page, KVM cannot emulate these MSRs. If we remove tdx_has_emulated_msr(), a guest
> > > directly issuing TDVMCALLs for these native type x2apic MSRs will trigger a silent failure,
> > > even though this is the guest's fault.
> > > 
> > > It comes down to a tradeoff. Should we prioritize code simplicity by dropping the function,
> > > or keep it to explicitly catch this misbehaving guest corner case?
> > 
> > I think from KVM's perspective it doesn't want to help the guest behave
> > correctly.
> 
> Uh, yes KVM does does.  KVM is responsible for emulating the APIC timer, isn't it?

Yea totally. We need to emulate the interface accurately. But we are kind of
making up the contract after the fact. If the guest performs the wrong type of
MSR write, should we make the contract that the VMM should help it catch it's
mistake?

> 
> > So we can ignore that I think. But it does really care to not define
> > any specific guest ABI that it has to maintain. So tdx_has_emulated_msr() has
> > some value there. And even more, it wants to not allow the guest to hurt the
> > host.
> > 
> > On the latter point, another problem with deleting tdx_has_emulated_msr() is the
> > current code path skips the checks done in the other MSR paths. So we would need
> > to call some appropriate higher up MSR helper to protect the host? And that
> > wades into the CPUID bit consistency issues.
> > 
> > So maybe... could we do a more limited version of the deletion where we allow
> > all the APIC MSRs through? We'd have to check that it won't cause problems.
> 
> What?  No.  KVM can't get actually read/write most (all?) MSRs, allowing access
> is far worse than returning an error, as for all intents and purposes KVM will
> silently drop writes, and return garbage on reads.
> 
> > Failing that, we should maybe just explicitly list the ones TDX supports rather
> > than the current way we define the APIC ones. As you mention below, it's not
> > correct in other ways too so it could be more robust.
> 
> No?  Don't we just want to allow access to MSRs that aren't accelerated?  What
> the TDX-Module supports is largely irrelevant, I think.

Not sure if I might be missing the point here. As above, we don't have enough
info to know which MSRs are accelerated. If the guest enabled #VE reduction, it
changes which ones are accelerated and the VMM is not notified. I think the
below is a sane limitation, but doesn't lets KVM perfectly notify the guest when
it screws up.

So the line would be to block MSRs that can never be emulated.

BTW, I've been treating this secret contract change as an arch mistake to at
least not build on. It's a whole subject though... Let me know if you are
interested in the details.

> 
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index 1e47c194af53..28e87630870b 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -2116,23 +2116,26 @@ bool tdx_has_emulated_msr(u32 index)
>         case MSR_IA32_MC0_CTL2 ... MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) - 1:
>                 /* MSR_IA32_MCx_{CTL, STATUS, ADDR, MISC, CTL2} */
>         case MSR_KVM_POLL_CONTROL:
> +       /*
> +        * x2APIC registers that are virtualized by the CPU can't be
> +        * emulated, KVM doesn't have access to the virtual APIC page.
> +        */
> +       case X2APIC_MSR(APIC_ID):
> +       case X2APIC_MSR(APIC_LVR):
> +       case X2APIC_MSR(APIC_LDR):
> +       case X2APIC_MSR(APIC_SPIV):
> +       case X2APIC_MSR(APIC_ESR):
> +       case X2APIC_MSR(APIC_ICR):
> +       case X2APIC_MSR(APIC_LVTT):
> +       case X2APIC_MSR(APIC_LVTTHMR):
> +       case X2APIC_MSR(APIC_LVTPC):
> +       case X2APIC_MSR(APIC_LVT0):
> +       case X2APIC_MSR(APIC_LVT1):
> +       case X2APIC_MSR(APIC_LVTERR):
> +       case X2APIC_MSR(APIC_TMICT):
> +       case X2APIC_MSR(APIC_TMCCT):
> +       case X2APIC_MSR(APIC_TDCR):
>                 return true;
> -       case APIC_BASE_MSR ... APIC_BASE_MSR + 0xff:
> -               /*
> -                * x2APIC registers that are virtualized by the CPU can't be
> -                * emulated, KVM doesn't have access to the virtual APIC page.
> -                */
> -               switch (index) {
> -               case X2APIC_MSR(APIC_TASKPRI):
> -               case X2APIC_MSR(APIC_PROCPRI):
> -               case X2APIC_MSR(APIC_EOI):
> -               case X2APIC_MSR(APIC_ISR) ... X2APIC_MSR(APIC_ISR + APIC_ISR_NR):
> -               case X2APIC_MSR(APIC_TMR) ... X2APIC_MSR(APIC_TMR + APIC_ISR_NR):
> -               case X2APIC_MSR(APIC_IRR) ... X2APIC_MSR(APIC_IRR + APIC_ISR_NR):
> -                       return false;
> -               default:
> -                       return true;
> -               }
>         default:
>                 return false;
>         }


  reply	other threads:[~2026-04-03 18:40 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-18 19:01 [PATCH] KVM: TDX: Fix APIC MSR ranges in tdx_has_emulated_msr() Dmytro Maluka
2026-03-18 19:42 ` Dave Hansen
2026-03-18 20:30   ` Dmytro Maluka
2026-03-19  1:14   ` Binbin Wu
2026-03-19  1:48     ` Dave Hansen
2026-03-19  7:40       ` Binbin Wu
2026-03-19 19:33         ` Edgecombe, Rick P
2026-04-03 16:30           ` Sean Christopherson
2026-04-03 18:40             ` Edgecombe, Rick P [this message]
2026-04-03 19:07               ` Sean Christopherson
2026-04-03 19:30                 ` Edgecombe, Rick P
2026-04-03 23:07                   ` Sean Christopherson
2026-04-04  0:11                     ` Edgecombe, Rick P
2026-04-06 23:07                       ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=401db97a254b356ad8539a5d637d68ee826179a5.camel@intel.com \
    --to=rick.p.edgecombe@intel.com \
    --cc=binbin.wu@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dmaluka@chromium.org \
    --cc=hpa@zytor.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox