From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A8AD261B9C for ; Wed, 11 Mar 2026 23:01:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773270088; cv=none; b=MPql/biBSd7O0E/ZJuGVQ9bvCM6fXqGsJC5AnrEx8AvLnZGSFuhSOoQxkuEA+D7wgRuAelEGWTUoAjbpW2TX1hMMsw9JXyW1bdm+i7UdDTwe8BbbBwjA3XXIYrLUtxR/eZDnQN/oFCw9iqY31Ab+BTnZ8UYhgcUiw6nSzbkoNSI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773270088; c=relaxed/simple; bh=3Q8PRJqoFKJMuk9HxEZ3ShPLPeo3AyXzaVBXBwKfcQk=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=rCkKUjs8hjDU+aOs608fGNmwZafwd4VQf199TQ0xXmSTFfTRz1bFyC9On56A4duqXD4vleG+RIrkewLdMJocKfBJbTCm67qqZ//3bo7kH8BqiK5ObWOSgmxSf28A76hcIIuPdTYIiPMlH18INS2XaWDOKzBrSqRnfRuS7ZdWmIc= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=k1+qqOxv; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="k1+qqOxv" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-2ae3badc00dso4075665ad.3 for ; Wed, 11 Mar 2026 16:01:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1773270087; x=1773874887; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=sTA0pRoWq5d8vPn220gRJGgTusWTwZ3V4pV9zfY5GSM=; b=k1+qqOxvdIEEdZm4w6lA0qQJMShrSAqjg9n0/FCF8zx2LAT5+pl4m3MxHb1gQ8G401 q2/yhuOrrg+uHM5LZKDmTjnLP80RJUVcdqwriAnimPJuxC/b6g9cDzIBqe0hoIplF64p jm3Zk6BVud4LPbFx5BW/v0L9HJ9pZU94Mw15cEN2nlV/S+MV0h4SSeUnnNULzvhVjyZS 2ZKi/KaKma7k4L+62NxGmeUwqBKe07koMwwRwFXUrlSNmyv/vyn0UDhlY8Omfy90+TeG JxyLT3D4dis+EptQduJZZ1q3XZhE5KGAmtCJeLEcESUnOOqW96hlRmGZ6VakWmf2vnYz eAmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1773270087; x=1773874887; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=sTA0pRoWq5d8vPn220gRJGgTusWTwZ3V4pV9zfY5GSM=; b=bw7DHPX6MhtnuxVoHcYVd+XaIm10NNHnvnggUEbaVgOKxmnPgHN7FoR4fwa5CQlxd6 +IXq9zlJbpuYDi3T7eWs4ow/1WiUhLDsuYefQbfZWb0aLSQMJ0D3FFaZ/datM28Zs2Xi 2xEkkyH33HnrdMY2DjEFEZIvwNavMLQBm4fZFMhXTjXecfStFVQogsJTav5rHBybT7vi yZ0qaPkwfkw8tgvzsFQ4KipT6mT5wqFj2gMjb++luHoxuFPiG37W3AoGN/RFDzEOAN1n zxqJZlZbEf9Y7vWxG2DAS6YY18Jd0BUA5GVcE8CVcBOxMiKeevKzckgCTEh92icwkg8R l1xw== X-Forwarded-Encrypted: i=1; AJvYcCW/ff781a244F0R0XbUlDoDQapa8d3NCA/SYdICQB/6VHcnW9jqKtFWIJaSRA9jcmZFp3nobhVHKPWAnDM=@vger.kernel.org X-Gm-Message-State: AOJu0YwIqCivt/2ocmhqFuGGAboQMcUVRDcTj6NlG1aOV2sj7zdUVoKR uVtUwzZfSD6sz/FvdrDlt9k5lAelpec9nb36Z96bDuAPhQ2j/z3uXvVe50Y9HGNQkez9B1toedO 9YRHmOQ== X-Received: from plai11.prod.google.com ([2002:a17:902:c94b:b0:2ae:420c:c02a]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:3887:b0:2ae:69d3:5b9f with SMTP id d9443c01a7336-2aeae928e9dmr50073645ad.52.1773270086494; Wed, 11 Mar 2026 16:01:26 -0700 (PDT) Date: Wed, 11 Mar 2026 16:01:25 -0700 In-Reply-To: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260306210900.1933788-2-yosry@kernel.org> Message-ID: Subject: Re: [PATCH v2 1/6] KVM: SVM: Use maxphyaddr in emulator RAX check for VMRUN/VMLOAD/VMSAVE From: Sean Christopherson To: Yosry Ahmed Cc: Jim Mattson , Paolo Bonzini , kvm@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable On Wed, Mar 11, 2026, Yosry Ahmed wrote: > > > > Sean, I intend to send a new version today with 2 main diffs: > > > > - Use cpuid_maxphyaddr() here instead of kvm_host.maxphyaddr. > > > > - Use a common helper for checking RAX for SVM instructions for bot= h > > > > the emulator and gp_interception() (see response on patch 4). > > > > > > > > Holler if you want me to wait for further feedback. > > > > > > I just realized I cannot just do cpuid_maxphyaddr(ctxt->vcpu) in > > > check_svme_pa(), because vcpu is defined as a void pointer in > > > x86_emulate_ctxt. Looking at commit c9b8b07cded5 ("KVM: x86: > > > Dynamically allocate per-vCPU emulation context"), I cannot tell why. > > > > To prevent dereferencing the vcpu object in emulator code. It's kinda = silly > > because common KVM is tightly coupled to the emulator, but we try to co= ntain > > what the emulator can do. > > > > > I was going to move emul_to_vcpu() to arch/x86/kvm/kvm_emulate.h, but > > > maybe we should just make this a normal struct kvm_vpcu pointer and > > > drop emul_to_vcpu() completely? > > > > Heh, talk about scope creep, that'll open a much bigger can of worms an= d subsequent > > discussion. >=20 > Ack. >=20 > > Honestly, why bother keeping check_svme_pa()? Can't we just do the che= cks in > > svm_check_intercept()? E.g. vmx_check_intercept() already "injects" #U= D for RDTSCP. >=20 > Hmm svm_check_intercept() isn't semantically the right place AFAICT, Actually, I think it is. Per the APM: Generally, instruction intercepts are checked after simple exceptions (su= ch as #GP=E2=80=94when CPL is incorrect=E2=80=94or #UD) have been checked, but = before exceptions related to memory accesses (such as page faults) and exceptions based on specific operand values. This #GP is operand specific. > and more importantly, it's only called if the instruction is executed > in guest mode (i.e. in L2+). Hold up, we're getting ahead of ourselves. The only legitimate reason the emulator is at all aware of VMSAVE, VMLOAD, = and VMRUN is to deal with #GP due to the RAX check, because hardware checks the= GPA against the host's physical address space. See commit 82a11e9c6fa2 ("KVM: = SVM: Add emulation support for #GP triggered by SVM instructions"). The emulator "support" was originally added by commit 01de8b09e606 ("KVM: S= VM: Add intercept checks for SVM instructions"), but AFAICT, for all intents an= d purposes that was dead code when it was added, because the emulator doesn't actually _emulate_ the instructions. I assume if they aren't intercepted, = and KVM is full on emulating instead of just decoding, they end up at EMULATION= _FAILED and get a #UD or something. Outside of forced emulation or code stream rewriting, KVM should _never_ fu= lly emulate any of the SVM instructions except VMMCALL (and that is a super spe= cial case). KVM does need to _decode_ the instruction, and it needs to get the pre-intercept exception checks correct so that KVM correctly injects e.g. #= GP instead of synthesizing a #VMEXIT for the CPL check, but KVM doesn't need t= o do *all* of the checks. Note, for L2, the SVME check is meaningless, as EFER.SVME has to be set for= L2 to be active, i.e. it's L1's responsibility to handle that check. Back to the physical address thing, KVM _already_ handles that check in the= #GP path, it's just wrong too: /* All SVM instructions expect page aligned RAX */ if (svm->vmcb->save.rax & ~PAGE_MASK) goto reinject; So I think what we want is to (a) fix the RAX check in gp_interception() (b) drop the RAX check in the emulator (c) add a CPL check in the emulator (because the intercepted #GP could ha= ve been due to L2 executing at CPL>0, not due to a bad-but-good RAX). diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index c8e292e9a24d..74df977a38ca 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -3867,18 +3867,10 @@ static int check_svme(struct x86_emulate_ctxt *ctxt= ) if (!(efer & EFER_SVME)) return emulate_ud(ctxt); =20 - return X86EMUL_CONTINUE; -} - -static int check_svme_pa(struct x86_emulate_ctxt *ctxt) -{ - u64 rax =3D reg_read(ctxt, VCPU_REGS_RAX); - - /* Valid physical address? */ - if (rax & 0xffff000000000000ULL) + if (ctxt->ops->cpl(ctxt)) return emulate_gp(ctxt, 0); =20 - return check_svme(ctxt); + return X86EMUL_CONTINUE; } =20 static int check_rdtsc(struct x86_emulate_ctxt *ctxt) @@ -3984,10 +3976,10 @@ static const struct opcode group7_rm2[] =3D { }; =20 static const struct opcode group7_rm3[] =3D { - DIP(SrcNone | Prot | Priv, vmrun, check_svme_= pa), + DIP(SrcNone | Prot | Priv, vmrun, check_svme)= , II(SrcNone | Prot | EmulateOnUD, em_hypercall, vmmcall), - DIP(SrcNone | Prot | Priv, vmload, check_svme_= pa), - DIP(SrcNone | Prot | Priv, vmsave, check_svme_= pa), + DIP(SrcNone | Prot | Priv, vmload, check_svme)= , + DIP(SrcNone | Prot | Priv, vmsave, check_svme)= , DIP(SrcNone | Prot | Priv, stgi, check_svme)= , DIP(SrcNone | Prot | Priv, clgi, check_svme)= , DIP(SrcNone | Prot | Priv, skinit, check_svme)= , diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index e6691c044913..e1223c07593b 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -2294,7 +2294,7 @@ static int gp_interception(struct kvm_vcpu *vcpu) EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE); } else { /* All SVM instructions expect page aligned RAX */ - if (svm->vmcb->save.rax & ~PAGE_MASK) + if (!page_address_valid(vcpu, svm->vmcb->save.rax)) goto reinject; =20 return emulate_svm_instr(vcpu, opcode);