From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <3999aadf-92a8-43f9-8d9d-84aa47e7d1ae@linux.intel.com>
Date: Fri, 31 May 2024 09:22:51 +0800
Precedence: bulk
X-Mailing-List: linux-coco@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH v15 09/20] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
To: Sean Christopherson, Paolo Bonzini
Cc: Michael Roth, kvm@vger.kernel.org, linux-coco@lists.linux.dev,
 linux-mm@kvack.org, linux-crypto@vger.kernel.org, x86@kernel.org,
 linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com,
 jroedel@suse.de, thomas.lendacky@amd.com, hpa@zytor.com, ardb@kernel.org,
 vkuznets@redhat.com, jmattson@google.com, luto@kernel.org,
 dave.hansen@linux.intel.com, slp@redhat.com, pgonda@google.com,
 peterz@infradead.org, srinivas.pandruvada@linux.intel.com,
 rientjes@google.com, dovmurik@linux.ibm.com, tobin@ibm.com, bp@alien8.de,
 vbabka@suse.cz, kirill@shutemov.name, ak@linux.intel.com, tony.luck@intel.com,
 sathyanarayanan.kuppuswamy@linux.intel.com, alpergun@google.com,
 jarkko@kernel.org, ashish.kalra@amd.com, nikunj.dadhania@amd.com,
 pankaj.gupta@amd.com, liam.merwick@oracle.com, Brijesh Singh, Isaku Yamahata
References: <20240501085210.2213060-1-michael.roth@amd.com>
 <20240501085210.2213060-10-michael.roth@amd.com>
 <84e8460d-f8e7-46d7-a274-90ea7aec2203@linux.intel.com>
 <7d6a4320-89f5-48ce-95ff-54b00e7e9597@linux.intel.com>
 <7da9c4a3-8597-44aa-a7ad-cc2bd2a85024@linux.intel.com>
Content-Language: en-US
From: Binbin Wu
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 5/30/2024 4:02 AM, Sean Christopherson wrote:
> On Tue, May 28, 2024, Paolo Bonzini wrote:
>> On Mon, May 27, 2024 at 2:26 PM Binbin Wu wrote:
>>>> It seems like TDX should be able to do something similar by limiting the
>>>> size of each KVM_HC_MAP_GPA_RANGE to TDX_MAP_GPA_MAX_LEN, and then
>>>> returning TDG_VP_VMCALL_RETRY to guest if the original size was greater
>>>> than TDX_MAP_GPA_MAX_LEN. But at that point you're effectively done with
>>>> the entire request and can return to guest, so it actually seems a little
>>>> more straightforward than the SNP case above. E.g. TDX has a 1:1 mapping
>>>> between TDG_VP_VMCALL_MAP_GPA and KVM_HC_MAP_GPA_RANGE events. (And even
>>>> similar names :))
>>>>
>>>> So doesn't seem like there's a good reason to expose any of these
>>>> throttling details to userspace,
>> I think userspace should never be worried about throttling. I would
>> say it's up to the guest to split the GPA into multiple ranges,
> I agree in principle, but in practice I can understand not wanting to split up
> the conversion in the guest due to the additional overhead of the world switches.
>
>> but that's not how arch/x86/coco/tdx/tdx.c is implemented so instead we can
>> do the split in KVM instead. It can be a module parameter or VM attribute,
>> establishing the size that will be processed in a single TDVMCALL.
> Is it just interrupts that are problematic for conversions? I assume so, because
> I can't think of anything else where telling the guest to retry would be appropriate
> and useful.

The concern was lockup detection in the guest.

>
> If so, KVM shouldn't need to unconditionally restrict the size for a single
> TDVMCALL, KVM just needs to ensure interrupts are handled soonish. To do that,
> KVM could use a much smaller chunk size, e.g. 64KiB (completely made up number),
> and keep processing the TDVMCALL as long as there is no interrupt pending.
> Hopefully that would obviate the need for a tunable.

Thanks for the suggestion.
This way, interrupts can be injected into the guest in time, so lockup
detection should not be a problem.

As for the chunk size: if it is too small, it will increase the cost of
kernel/userspace context switches.
Maybe 2MB?
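
To make the chunking idea concrete, here is a rough userspace sketch of the
loop Sean describes: convert one chunk at a time, keep going while no
interrupt is pending, and otherwise stop and ask the guest to retry from
where we left off. This is not KVM code. map_gpa_chunked(), struct
map_gpa_result, irq_pending_fn and irq_after_n() are all made-up names for
illustration, the actual conversion is reduced to a comment, and CHUNK_SIZE
is just the 2MB value floated above, not a settled number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2MB chunk, the value floated in this thread. */
#define CHUNK_SIZE (2ULL << 20)

/* Hypothetical stand-in for "is an interrupt pending for this vCPU?". */
typedef bool (*irq_pending_fn)(void *ctx);

struct map_gpa_result {
	uint64_t next_gpa;   /* first GPA that has not been converted yet */
	bool retry;          /* true: tell the guest to retry from next_gpa */
	unsigned int chunks; /* number of chunks processed in this call */
};

/*
 * Process [gpa, gpa + size) in chunk-sized pieces. At least one chunk is
 * always processed before the interrupt check, so every call makes forward
 * progress. Overflow of gpa + size is not handled in this sketch.
 */
static struct map_gpa_result map_gpa_chunked(uint64_t gpa, uint64_t size,
					     uint64_t chunk,
					     irq_pending_fn irq_pending,
					     void *ctx)
{
	struct map_gpa_result r = { .next_gpa = gpa, .retry = false, .chunks = 0 };
	uint64_t end = gpa + size;

	assert(chunk != 0);

	while (r.next_gpa < end) {
		uint64_t len = end - r.next_gpa;

		if (len > chunk)
			len = chunk;

		/* Placeholder: convert [r.next_gpa, r.next_gpa + len) here. */
		r.next_gpa += len;
		r.chunks++;

		/* Only bail out if there is still work left to retry. */
		if (r.next_gpa < end && irq_pending(ctx)) {
			r.retry = true;
			break;
		}
	}
	return r;
}

/* Demo interrupt source: reports "pending" once *countdown reaches zero. */
static bool irq_after_n(void *ctx)
{
	unsigned int *countdown = ctx;

	if (*countdown == 0)
		return true;
	(*countdown)--;
	return false;
}
```

With no interrupt pending, the whole range is handled in one call and
retry stays false. If an interrupt shows up partway through, the call
returns early with retry set and next_gpa pointing at the first
unconverted address. For a sense of scale, assuming each chunk costs one
round trip to userspace: converting 1GiB takes 16384 chunks at 64KiB but
512 chunks at 2MB.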