Message-ID: <4e5c2904-f628-4391-853e-37b7f0e132e8@amazon.com>
Date: Fri, 26 Jul 2024 17:50:15 +0100
From: Nikita Kalyazin
Reply-To: kalyazin@amazon.com
Subject: Re: [RFC PATCH 14/18] KVM: Add asynchronous userfaults, KVM_READ_USERFAULT
To: James Houghton
CC: Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu, Sean Christopherson, Shuah Khan, Peter Xu, Axel Rasmussen, David Matlack, Paolo Bonzini
In-Reply-To: <20240710234222.2333120-15-jthoughton@google.com>
References: <20240710234222.2333120-1-jthoughton@google.com> <20240710234222.2333120-15-jthoughton@google.com>

Hi James,

On 11/07/2024 00:42, James Houghton wrote:
> It is possible that KVM wants to access a userfault-enabled GFN in a
> path where it is difficult to return out to userspace with the fault
> information. For these cases, add a mechanism for KVM to wait for a GFN
> to not be userfault-enabled.

In this patch series, an asynchronous notification mechanism is used
only in cases "where it is difficult to return out to userspace with the
fault information". However, we (AWS) have a use case where we would
like to be notified asynchronously about _all_ faults.

Firecracker can restore a VM from a memory snapshot where the guest
memory is supplied via userfaultfd by a process separate from the VMM
itself [1]. While it looks technically possible for the VMM process to
handle exits by forwarding the faults to the other process, that would
require building a complex userspace protocol on top and would likely
introduce extra latency on the critical path. This also implies that a
KVM API (KVM_READ_USERFAULT) is not suitable, because KVM checks that
the ioctls are performed specifically by the VMM process [2]:

	if (kvm->mm != current->mm || kvm->vm_dead)
		return -EIO;

> The implementation of this mechanism is certain to change before KVM
> Userfault could possibly be merged.

How do you envision resolving faults in userspace? Copying the page in
(provided that userspace mapping of guest_memfd is supported [3]) and
clearing KVM_MEMORY_ATTRIBUTE_USERFAULT alone do not look sufficient to
resolve the fault, because an attempt to copy the page directly in
userspace will itself trigger a fault and may lead to a deadlock in the
case where the original fault was caused by the VMM. An interface
similar to UFFDIO_COPY is needed that would allocate a page, copy the
content in and update page tables.

[1] Firecracker snapshot restore via UserfaultFD:
https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/handling-page-faults-on-snapshot-resume.md
[2] KVM ioctl check for the address space:
https://elixir.bootlin.com/linux/v6.10.1/source/virt/kvm/kvm_main.c#L5083
[3] mmap() of guest_memfd:
https://lore.kernel.org/kvm/489d1494-626c-40d9-89ec-4afc4cd0624b@redhat.com/T/#mc944a6fdcd20a35f654c2be99f9c91a117c1bed4

Thanks,
Nikita
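
For reference, the UFFDIO_COPY behaviour mentioned above can be
illustrated with a minimal sketch of how a userfaultfd handler resolves
a missing-page fault today. This is existing userfaultfd usage, not a
proposed KVM Userfault interface. The helper names page_base() and
resolve_fault() are hypothetical and exist only for this illustration;
the sketch assumes a Linux system with <linux/userfaultfd.h> and does no
retry on EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/* Hypothetical helper: round a faulting address down to its page start. */
static uint64_t page_base(uint64_t addr, uint64_t page_size)
{
	/* page_size must be a non-zero power of two. */
	assert(page_size && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}

/*
 * Hypothetical helper: resolve one missing-page fault reported on `uffd`
 * by installing a copy of `src_page` at the faulting page.
 * Returns 0 on success, or -1 with errno set by ioctl().
 */
static int resolve_fault(int uffd, uint64_t fault_addr,
			 const void *src_page, uint64_t page_size)
{
	struct uffdio_copy copy = {
		.dst  = page_base(fault_addr, page_size),
		.src  = (uint64_t)(uintptr_t)src_page,
		.len  = page_size,
		.mode = 0,	/* 0: wake the faulting thread after the copy */
	};

	/*
	 * The kernel allocates the page, copies the data and updates the
	 * page tables in one step, so the handler never touches (and never
	 * faults on) the destination address itself.
	 */
	if (ioctl(uffd, UFFDIO_COPY, &copy) == -1)
		return -1;
	return 0;
}
```

A handler would typically obtain fault_addr from the
arg.pagefault.address field of a struct uffd_msg read from the
userfaultfd descriptor, and src_page from the snapshot backing the guest
memory.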