From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sean Christopherson
Date: Wed, 1 Nov 2023 14:55:46 -0700
Subject: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
In-Reply-To:
References: <20231027182217.3615211-1-seanjc@google.com> <20231027182217.3615211-17-seanjc@google.com>
Message-ID:
List-Id:
To: kvm-riscv@lists.infradead.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Wed, Nov 01, 2023, Fuad Tabba wrote:
> > > > @@ -1034,6 +1034,9 @@ static void kvm_destroy_dirty_bitmap(struct kvm_memory_slot *memslot)
> > > >  /* This does not remove the slot from struct kvm_memslots data structures */
> > > >  static void kvm_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
> > > >  {
> > > > +	if (slot->flags & KVM_MEM_PRIVATE)
> > > > +		kvm_gmem_unbind(slot);
> > > > +
> > > >
> > > Should this be called after kvm_arch_free_memslot()? Arch-specific code
> > > might need some of the data before the unbinding, something I thought
> > > might be necessary at one point for the pKVM port when deleting a
> > > memslot, but realized later that kvm_invalidate_memslot() ->
> > > kvm_arch_guest_memory_reclaimed() was the more logical place for it.
> > > Also, since that seems to be the pattern for arch-specific handlers in
> > > KVM.
> >
> > Maybe? But only if we care about symmetry between the allocation and free paths.
> > I really don't think kvm_arch_free_memslot() should be doing anything beyond a
> > "pure" free. E.g. kvm_arch_free_memslot() is also called after moving a memslot,
> > which hopefully we never actually have to allow for guest_memfd, but any code in
> > kvm_arch_free_memslot() would bring about "what if" questions regarding memslot
> > movement. I.e. the API is intended to be a "free arch metadata associated with
> > the memslot".
> >
> > Out of curiosity, what does pKVM need to do at kvm_arch_guest_memory_reclaimed()?
>
> It's about the host reclaiming ownership of guest memory when tearing
> down a protected guest. In pKVM, we currently teardown the guest and
> reclaim its memory when kvm_arch_destroy_vm() is called. The problem
> with guestmem is that kvm_gmem_unbind() could get called before that
> happens, after which the host might try to access the unbound guest
> memory. Since the host hasn't reclaimed ownership of the guest memory
> from hyp, hilarity ensues (it crashes).
>
> Initially, I hooked reclaim guest memory to kvm_free_memslot(), but
> then I needed to move the unbind later in the function. I realized
> later that kvm_arch_guest_memory_reclaimed() gets called earlier (at
> the right time), and is more aptly named.

Aha! I suspected that might be the case.

TDX and SNP also need to solve the same problem of "reclaiming" memory before it
can be safely accessed by the host. The plan is to add an arch hook (or two?)
into guest_memfd that is invoked when memory is freed from guest_memfd.

Hooking kvm_arch_guest_memory_reclaimed() isn't completely correct as deleting a
memslot doesn't *guarantee* that guest memory is actually reclaimed (which reminds
me, we need to figure out a better name for that thing before introducing
kvm_arch_gmem_invalidate()).

The effective false positives aren't fatal for the current usage because the hook
is used only for x86 SEV guests to flush caches. An unnecessary flush can cause
performance issues, but it doesn't affect correctness. For TDX and SNP, and IIUC
pKVM, false positives are fatal because KVM could assign memory back to the host
that is still owned by guest_memfd.

E.g. a misbehaving userspace could prematurely delete a memslot. And the more
fun example is intrahost migration, where the plan is to allow pointing multiple
guest_memfd files at a single guest_memfd inode:
https://lore.kernel.org/all/cover.1691446946.git.ackerleytng@google.com

There was a lot of discussion for this, but it's scattered all over the place.
The TL;DR is that the inode will represent physical memory, and a file will
represent a given "struct kvm" instance's view of that memory. And so the memory
isn't reclaimed until the inode is truncated/punched.

I _think_ this reflects the most recent plan from the guest_memfd side:
https://lore.kernel.org/all/1233d749211c08d51f9ca5d427938d47f008af1f.1689893403.git.isaku.yamahata@intel.com
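
The inode/file split discussed in this message can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: every name in
it (model_inode, model_file, model_unbind(), model_punch_hole(), and so on) is
hypothetical and invented for illustration, none of it is KVM or guest_memfd
code, and the real arch hook had not been finalized at the time of this thread.
It shows only the ownership rule: unbinding a memslot does not return pages to
the host, while truncating/punching the inode does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are real KVM symbols. */

enum page_owner { OWNER_HOST, OWNER_GUEST };

#define MODEL_NR_PAGES 4

/* Models the guest_memfd inode: it represents the physical memory. */
struct model_inode {
	enum page_owner owner[MODEL_NR_PAGES];
	int nr_files;		/* number of per-VM files pointing at this inode */
};

/* Models a guest_memfd file: one "struct kvm" instance's view of the memory. */
struct model_file {
	struct model_inode *inode;
	bool bound;		/* currently bound to a memslot? */
};

void model_inode_init(struct model_inode *inode)
{
	for (size_t i = 0; i < MODEL_NR_PAGES; i++)
		inode->owner[i] = OWNER_GUEST;
	inode->nr_files = 0;
}

void model_file_open(struct model_file *file, struct model_inode *inode)
{
	file->inode = inode;
	file->bound = true;
	inode->nr_files++;
}

/* Deleting a memslot only drops the binding; page ownership is unchanged. */
void model_unbind(struct model_file *file)
{
	file->bound = false;
}

/*
 * Truncating/punching the inode is where a (hypothetical) arch hook would
 * hand pages in [start, end) back to the host.
 */
void model_punch_hole(struct model_inode *inode, size_t start, size_t end)
{
	for (size_t i = start; i < end && i < MODEL_NR_PAGES; i++)
		inode->owner[i] = OWNER_HOST;
}

/* The host may touch a page only after it has been reclaimed. */
bool model_host_may_access(const struct model_inode *inode, size_t page)
{
	return page < MODEL_NR_PAGES && inode->owner[page] == OWNER_HOST;
}
```

In this model, two files can share one inode (the intrahost migration case),
and calling model_unbind() on either of them leaves model_host_may_access()
false for every page. Only model_punch_hole() flips ownership, which mirrors
the point that a reclaim hook keyed on memslot deletion would produce false
positives.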
Arch-specific ode > > > might need some of the data before the unbinding, something I thought > > > might be necessary at one point for the pKVM port when deleting a > > > memslot, but realized later that kvm_invalidate_memslot() -> > > > kvm_arch_guest_memory_reclaimed() was the more logical place for it. > > > Also, since that seems to be the pattern for arch-specific handlers in > > > KVM. > > > > Maybe? But only if we can about symmetry between the allocation and free paths > > I really don't think kvm_arch_free_memslot() should be doing anything beyond a > > "pure" free. E.g. kvm_arch_free_memslot() is also called after moving a memslot, > > which hopefully we never actually have to allow for guest_memfd, but any code in > > kvm_arch_free_memslot() would bring about "what if" questions regarding memslot > > movement. I.e. the API is intended to be a "free arch metadata associated with > > the memslot". > > > > Out of curiosity, what does pKVM need to do at kvm_arch_guest_memory_reclaimed()? > > It's about the host reclaiming ownership of guest memory when tearing > down a protected guest. In pKVM, we currently teardown the guest and > reclaim its memory when kvm_arch_destroy_vm() is called. The problem > with guestmem is that kvm_gmem_unbind() could get called before that > happens, after which the host might try to access the unbound guest > memory. Since the host hasn't reclaimed ownership of the guest memory > from hyp, hilarity ensues (it crashes). > > Initially, I hooked reclaim guest memory to kvm_free_memslot(), but > then I needed to move the unbind later in the function. I realized > later that kvm_arch_guest_memory_reclaimed() gets called earlier (at > the right time), and is more aptly named. Aha! I suspected that might be the case. TDX and SNP also need to solve the same problem of "reclaiming" memory before it can be safely accessed by the host. The plan is to add an arch hook (or two?) 
into guest_memfd that is invoked when memory is freed from guest_memfd. Hooking kvm_arch_guest_memory_reclaimed() isn't completely correct as deleting a memslot doesn't *guarantee* that guest memory is actually reclaimed (which reminds me, we need to figure out a better name for that thing before introducing kvm_arch_gmem_invalidate()). The effective false positives aren't fatal for the current usage because the hook is used only for x86 SEV guests to flush caches. An unnecessary flush can cause performance issues, but it doesn't affect correctness. For TDX and SNP, and IIUC pKVM, false positives are fatal because KVM could assign memory back to the host that is still owned by guest_memfd. E.g. a misbehaving userspace could prematurely delete a memslot. And the more fun example is intrahost migration, where the plan is to allow pointing multiple guest_memfd files at a single guest_memfd inode: https://lore.kernel.org/all/cover.1691446946.git.ackerleytng@google.com There was a lot of discussion for this, but it's scattered all over the place. The TL;DR is is that the inode will represent physical memory, and a file will represent a given "struct kvm" instance's view of that memory. And so the memory isn't reclaimed until the inode is truncated/punched. I _think_ this reflects the most recent plan from the guest_memfd side: https://lore.kernel.org/all/1233d749211c08d51f9ca5d427938d47f008af1f.1689893403.git.isaku.yamahata@intel.com _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
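
FWIW, the inode-vs-file split above can be sketched as a userspace toy
model.  This is not kernel code and not the actual guest_memfd
implementation: every toy_* name is made up for illustration, and
toy_arch_gmem_invalidate() is only a hypothetical stand-in for the arch
hook being discussed.  The assumption baked in is the one from the TL;DR:
unbinding a memslot drops a view, and only truncating/punching the inode
hands memory back to the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 8

/* Stand-in for the hyp/TDX/SNP notion of who owns a page. */
enum toy_owner { TOY_OWNER_HOST, TOY_OWNER_GUEST };

/* Hypothetical stand-in for a guest_memfd inode: the physical memory. */
struct toy_gmem_inode {
	enum toy_owner owner[TOY_NR_PAGES];
	bool allocated[TOY_NR_PAGES];
};

/* Hypothetical stand-in for a guest_memfd file: one VM's view of the inode. */
struct toy_gmem_file {
	struct toy_gmem_inode *inode;
	bool bound;	/* true while a memslot points at this file */
};

/* Hypothetical arch hook: runs only when pages actually leave the inode. */
void toy_arch_gmem_invalidate(struct toy_gmem_inode *inode,
			      size_t start, size_t end)
{
	for (size_t i = start; i < end; i++)
		inode->owner[i] = TOY_OWNER_HOST;
}

/* Allocate pages [start, end) and hand ownership to the guest. */
void toy_gmem_populate(struct toy_gmem_inode *inode, size_t start, size_t end)
{
	assert(start <= end && end <= TOY_NR_PAGES);
	for (size_t i = start; i < end; i++) {
		inode->allocated[i] = true;
		inode->owner[i] = TOY_OWNER_GUEST;
	}
}

/* Memslot creation: point a file (a VM's view) at the inode. */
void toy_gmem_bind(struct toy_gmem_file *file, struct toy_gmem_inode *inode)
{
	file->inode = inode;
	file->bound = true;
}

/* Memslot deletion: drops the binding, reclaims nothing. */
void toy_gmem_unbind(struct toy_gmem_file *file)
{
	file->bound = false;
}

/* Truncate/punch hole on the inode: the only path that reclaims memory. */
void toy_gmem_punch_hole(struct toy_gmem_inode *inode, size_t start, size_t end)
{
	assert(start <= end && end <= TOY_NR_PAGES);
	toy_arch_gmem_invalidate(inode, start, end);
	for (size_t i = start; i < end; i++)
		inode->allocated[i] = false;
}

/* The host may touch a page only once ownership is back with the host. */
bool toy_host_can_access(const struct toy_gmem_inode *inode, size_t idx)
{
	return inode->owner[idx] == TOY_OWNER_HOST;
}
```

In the intrahost migration case, two toy_gmem_file instances would be
bound to the same toy_gmem_inode.  Calling toy_gmem_unbind() on the source
leaves toy_host_can_access() false for the populated pages, i.e. treating
memslot deletion as "reclaimed" would be a false positive.  Only
toy_gmem_punch_hole() invokes the hook and flips ownership back.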