Date: Fri, 18 Aug 2023 11:15:28 -0700
Subject: Re: [PATCH 7/8] KVM: gmem: Avoid race with kvm_gmem_release and mmu notifier
From: Sean Christopherson
To: isaku.yamahata@intel.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com,
    Michael Roth, Paolo Bonzini, erdemaktas@google.com, Sagi Shahar,
    David Matlack, Kai Huang, Zhi Wang, chen.bo@intel.com,
    linux-coco@lists.linux.dev, Chao Peng, Ackerley Tng, Vishal Annapurve,
    Yuan Yao, Jarkko Sakkinen, Xu Yilun, Quentin Perret, wei.w.wang@intel.com,
    Fuad Tabba
X-Mailing-List: kvm@vger.kernel.org

On Tue, Aug 15, 2023, isaku.yamahata@intel.com wrote:
> From: Isaku Yamahata
>
> Add slots_lock around kvm_flush_shadow_all().  kvm_gmem_release() via
> fput() and kvm_mmu_notifier_release() via mmput() can be called
> simultaneously on process exit because vhost, /dev/vhost_{net, vsock}, can
> delay the call to release the mmu_notifier, kvm_mmu_notifier_release(), by
> its kernel thread.
> Vhost uses get_task_mm() and mmput() from its kernel thread to access
> process memory, and mmput() can defer the release until after the file is
> closed.
>
> kvm_flush_shadow_all() and kvm_gmem_release() can be called simultaneously.

KVM shouldn't reclaim memory on file release, it should instead do that when
the inode is "evicted":

https://lore.kernel.org/all/ZLGiEfJZTyl7M8mS@google.com

> With TDX KVM, HKID releasing by kvm_flush_shadow_all() and private memory
> releasing by kvm_gmem_release() can race.  Add slots_lock to
> kvm_mmu_notifier_release().

No, the right answer is to not release the HKID until the VM is destroyed.
gmem holds a reference to its associated kvm instance, and so that will
naturally ensure all memory encrypted with the HKID is freed before the HKID
is released.  kvm_flush_shadow_all() should only tear down page tables, it
shouldn't be freeing guest_memfd memory.

Then patches 6-8 go away.