Date: Wed, 14 Aug 2024 07:49:31 -0700
X-Mailing-List: kvmarm@lists.linux.dev
References: <20240809205158.1340255-1-amoorthy@google.com> <20240809205158.1340255-4-amoorthy@google.com>
Subject: Re: [PATCH v2 3/3] KVM: arm64: Perform memory fault exits when stage-2 handler EFAULTs
From: Sean Christopherson
To: "Aneesh Kumar K.V"
Cc: Anish Moorthy, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev, jthoughton@google.com, rananta@google.com

On Wed, Aug 14, 2024, Aneesh Kumar K.V wrote:
> Sean Christopherson writes:
>
> > On Mon, Aug 12, 2024, Aneesh Kumar K.V wrote:
> >> Anish Moorthy writes:
> >>
> >> > Right now userspace just gets a bare EFAULT when the stage-2 fault
> >> > handler fails to fault in the relevant page. Set up a
> >> > KVM_EXIT_MEMORY_FAULT whenever this happens, which at the very least
> >> > eases debugging and might also let userspace decide on/take some
> >> > specific action other than crashing the VM.
> >> >
> >> > In some cases, user_mem_abort() EFAULTs before the size of the fault is
> >> > calculated: return 0 in these cases to indicate that the fault is of
> >> > unknown size.
> >> >
> >>
> >> VMMs are now converting private memory to shared or vice-versa on vcpu
> >> exit due to memory fault. This change will require VMM track each page's
> >> private/shared state so that they can now handle an exit fault on a
> >> shared memory where the fault happened due to reasons other than
> >> conversion.
> >
> > I don't see how filling kvm_run.memory_fault in more locations changes anything.
> > The userspace exits are inherently racy, e.g. userspace may have already converted
> > the page to the appropriate state, thus making KVM's exit spurious. So either
> > the VMM already tracks state, or the VMM blindly converts to shared/private.
> >
>
> I might be missing some details here. The change is adding exit_reason =
> KVM_EXIT_MEMORY_FAULT to code path which would earlier result in VMM
> panics?
>
> For ex:
>
> @@ -1473,6 +1475,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>          if (unlikely(!vma)) {
>                  kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
>                  mmap_read_unlock(current->mm);
> +                kvm_prepare_memory_fault_exit(vcpu, fault_ipa, 0,
> +                                              write_fault, exec_fault, false);
>                  return -EFAULT;
>          }
>
>
> VMMs handle this with code as below
>
> static bool handle_memoryfault(struct kvm_cpu *vcpu)
> {
>         ....
>         return true;
> }
>
> bool kvm_cpu__handle_exit(struct kvm_cpu *vcpu)
> {
>         switch (vcpu->kvm_run->exit_reason) {
>         ...
>         case KVM_EXIT_MEMORY_FAULT:
>                 return handle_memoryfault(vcpu);
>         }
>
>         return false;
> }
>
> and the caller did
>
>         ret = kvm_cpu__handle_exit(cpu);
>         if (!ret)
>                 goto panic_kvm;
>         break;
>
>
> This change will break those VMMs isn't? ie, we will not panic after
> this change?

If the VMM unconditionally resumes the guest on errno=EFAULT, that's a VMM bug.
handle_memoryfault() needs to have some amount of checking to verify that it can
actually resolve the fault that was reported, given the gfn and metadata. In
practice, that means panicking on any gfn that's not associated with a memslot
that has KVM_MEM_GUEST_MEMFD, because prior to this series, it's impossible for
userspace to resolve any faults besides implicit shared<=>private conversions.
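
A minimal sketch of the kind of check described above, purely for illustration. Everything here is an assumption rather than code from any real VMM: struct vmm_memslot, struct mem_fault, find_slot() and can_resolve_memory_fault() are hypothetical names, SLOT_FLAG_GUEST_MEMFD is a local stand-in for KVM_MEM_GUEST_MEMFD (a real VMM would use the definition from <linux/kvm.h>), and the lookup is keyed on the reported guest physical address instead of a gfn. It does not attempt the conversion itself, only the decision of whether the exit is one userspace could possibly resolve.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the KVM_MEM_GUEST_MEMFD memslot flag (assumption:
 * a real VMM would include <linux/kvm.h> and use the real macro). */
#define SLOT_FLAG_GUEST_MEMFD (1u << 2)

/* Hypothetical VMM-side record of one memslot it registered with KVM. */
struct vmm_memslot {
    uint64_t gpa_start;
    uint64_t size;
    uint32_t flags;
};

/* Hypothetical copy of the fields reported via kvm_run.memory_fault. */
struct mem_fault {
    uint64_t gpa;
    uint64_t size; /* may be 0, i.e. "unknown size", per the patch */
};

/* Linear search for the memslot that contains gpa, or NULL if none does. */
static const struct vmm_memslot *find_slot(const struct vmm_memslot *slots,
                                           size_t nr_slots, uint64_t gpa)
{
    assert(slots != NULL || nr_slots == 0);

    for (size_t i = 0; i < nr_slots; i++) {
        if (gpa >= slots[i].gpa_start &&
            gpa - slots[i].gpa_start < slots[i].size)
            return &slots[i];
    }
    return NULL;
}

/*
 * Returns true only when the fault could be an implicit shared<=>private
 * conversion, i.e. the address falls in a guest_memfd-backed memslot.
 * Any other fault is one the VMM cannot resolve, so the caller should
 * take its panic path instead of resuming the guest.
 */
static bool can_resolve_memory_fault(const struct vmm_memslot *slots,
                                     size_t nr_slots,
                                     const struct mem_fault *fault)
{
    const struct vmm_memslot *slot = find_slot(slots, nr_slots, fault->gpa);

    if (!slot)
        return false;

    return (slot->flags & SLOT_FLAG_GUEST_MEMFD) != 0;
}
```

With something along these lines, handle_memoryfault() would return false (and the caller would reach panic_kvm) for the !vma case in the diff quoted above, since that fault is not in a guest_memfd-backed memslot.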