From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 14 Aug 2024 12:34:12 -0700
In-Reply-To: <96293a7d-0347-458e-9776-d11f55894d34@redhat.com>
X-Mailing-List: kvm@vger.kernel.org
References: <20240809190319.1710470-1-seanjc@google.com>
 <20240809190319.1710470-4-seanjc@google.com>
 <96293a7d-0347-458e-9776-d11f55894d34@redhat.com>
Subject: Re: [PATCH 03/22] KVM: x86/mmu: Trigger unprotect logic only on write-protection page faults
From: Sean Christopherson
To: Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Gonda,
 Michael Roth, Vishal Annapurve, Ackerly Tng
Content-Type: text/plain; charset="us-ascii"

On Wed, Aug 14, 2024, Paolo Bonzini wrote:
> On 8/9/24 21:03, Sean Christopherson wrote:
> > Trigger KVM's various "unprotect gfn" paths if and only if the page fault
> > was a write to a write-protected gfn.  To do so, add a new page fault
> > return code, RET_PF_WRITE_PROTECTED, to explicitly and precisely track
> > such page faults.
> >
> > If a page fault requires emulation for any MMIO (or any reason besides
> > write-protection), trying to unprotect the gfn is pointless and risks
> > putting the vCPU into an infinite loop.  E.g. KVM will put the vCPU into
> > an infinite loop if the vCPU manages to trigger MMIO on a page table walk.
> >
> > Fixes: 147277540bbc ("kvm: svm: Add support for additional SVM NPF error codes")
> > Cc: stable@vger.kernel.org
>
> Do we really want Cc: stable@ for all these patches?  Most of them are of
> the "if it hurts, don't do it" kind;

True.  I was thinking that the VMX PFERR_GUEST_{FINAL,PAGE}_MASK bug in
particular was stable-worthy, but until TDX comes along, it's only relevant if
a guest puts PDPTRs in an MMIO region.  And in that case, the guest is likely
hosed anyway; the only difference is whether it gets stuck or killed.

I'll drop the stable@ tags unless someone objects.

> as long as there are no infinite loops in a non-killable region, I prefer not
> to complicate our lives with cherry picks of unknown quality.

Yeah, the RET_PF_WRITE_PROTECTED one in particular has high potential for a bad
cherry-pick.

> That said, this patch could be interesting for 6.11 because of the effect on
> prefaulting (see below).
>
> > Signed-off-by: Sean Christopherson
> > ---
> >  arch/x86/kvm/mmu/mmu.c          | 78 +++++++++++++++++++--------------
> >  arch/x86/kvm/mmu/mmu_internal.h |  3 ++
> >  arch/x86/kvm/mmu/mmutrace.h     |  1 +
> >  arch/x86/kvm/mmu/paging_tmpl.h  |  2 +-
> >  arch/x86/kvm/mmu/tdp_mmu.c      |  6 +--
> >  5 files changed, 53 insertions(+), 37 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 901be9e420a4..e3aa04c498ea 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -2914,10 +2914,8 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
> >  		trace_kvm_mmu_set_spte(level, gfn, sptep);
> >  	}
> > -	if (wrprot) {
> > -		if (write_fault)
> > -			ret = RET_PF_EMULATE;
> > -	}
> > +	if (wrprot && write_fault)
> > +		ret = RET_PF_WRITE_PROTECTED;
> >  	if (flush)
> >  		kvm_flush_remote_tlbs_gfn(vcpu->kvm, gfn, level);
> > @@ -4549,7 +4547,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> >  		return RET_PF_RETRY;
> >  	if (page_fault_handle_page_track(vcpu, fault))
> > -		return RET_PF_EMULATE;
> > +		return RET_PF_WRITE_PROTECTED;
> >  	r = fast_page_fault(vcpu, fault);
> >  	if (r != RET_PF_INVALID)
> > @@ -4642,7 +4640,7 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
> >  	int r;
> >  	if (page_fault_handle_page_track(vcpu, fault))
> > -		return RET_PF_EMULATE;
> > +		return RET_PF_WRITE_PROTECTED;
> >  	r = fast_page_fault(vcpu, fault);
> >  	if (r != RET_PF_INVALID)
> > @@ -4726,6 +4724,9 @@ static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> >  	case RET_PF_EMULATE:
> >  		return -ENOENT;
> > +	case RET_PF_WRITE_PROTECTED:
> > +		return -EPERM;
>
> Shouldn't this be a "return 0"?  Even if kvm_mmu_do_page_fault() cannot
> fully unprotect the page, it was nevertheless prefaulted as much as
> possible.

Hmm, I hadn't thought about it from that perspective.
Ah, right, and the early check in page_fault_handle_page_track() only handles PRESENT faults, so KVM will at least install a read-only mapping. So yeah, agreed this should return 0.
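
For illustration, here is a rough, standalone sketch of the prefault mapping
being agreed on above.  It is not the actual kvm_tdp_map_page() code: the enum
is a hypothetical stand-in for the RET_PF_* values in mmu_internal.h, and
prefault_result() is a made-up helper name.  Only the RET_PF_EMULATE ->
-ENOENT case and the proposed RET_PF_WRITE_PROTECTED -> 0 case come from this
thread; the success and fallback cases are assumptions.

```c
#include <errno.h>

/*
 * Hypothetical stand-ins for KVM's page fault return codes.  The real
 * definitions live in arch/x86/kvm/mmu/mmu_internal.h and may differ.
 */
enum ret_pf {
	RET_PF_CONTINUE = 0,
	RET_PF_RETRY,
	RET_PF_EMULATE,
	RET_PF_WRITE_PROTECTED,
	RET_PF_INVALID,
	RET_PF_FIXED,
	RET_PF_SPURIOUS,
};

/* Illustrative helper: translate a fault result into a prefault result. */
int prefault_result(enum ret_pf r)
{
	switch (r) {
	case RET_PF_FIXED:		/* assumed: mapping installed */
	case RET_PF_SPURIOUS:		/* assumed: mapping already present */
		return 0;
	case RET_PF_WRITE_PROTECTED:
		/*
		 * Proposed in this thread: a read-only mapping was still
		 * installed, so the page was prefaulted as far as possible.
		 */
		return 0;
	case RET_PF_EMULATE:
		return -ENOENT;		/* from the quoted diff context */
	default:
		return -EIO;		/* assumed fallback for other codes */
	}
}
```

With this mapping, a prefault that hits a write-protected gfn reports success
rather than -EPERM, while a fault that needs emulation still fails with
-ENOENT.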