From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 048D012B60 for ; Tue, 19 Dec 2023 08:09:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Idb1D+qC" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1702973380; x=1734509380; h=date:from:to:cc:subject:message-id:references: content-transfer-encoding:in-reply-to:mime-version; bh=cUAPnRTs5Ph6Mbyb8erhopCOheHAaFf6ZJXdC4smsJc=; b=Idb1D+qCZKYXE2RIBuKhou5MAPYTrbQsy4BL8IZwgUqka5aq6vib6FWw KlkYWh5/UogJTbN5Idcl4SeHLvHWCIEGAXJi5g2Rx2zu9kuVY2OBHM/zL kKTOCAEFIH7t6l763TlQtLQ/iV/cHPWAy6iO7MCqQXUVNdNxTb3/ayeN0 15xfRFk706nveVWHL8xGGMNejrDbJNm6aJ4Z13QywPRF1kMWOGBF2A6hS Xj2NwHD/726X4lIEER0SFA73Jfh428fgnE9SC/svFKQjFfXRYqMpQ2ioq CIqiJ1pa3pYG03KicTNK3gO0JDNbH1X6aEY+4SGGiWdPyKxy9f/0PHJhq A==; X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="2851506" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="2851506" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orvoesa104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2023 00:09:34 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="949084577" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="949084577" Received: from fmsmsx602.amr.corp.intel.com ([10.18.126.82]) by orsmga005.jf.intel.com with ESMTP/TLS/AES256-GCM-SHA384; 19 Dec 2023 00:09:34 -0800 Received: from fmsmsx610.amr.corp.intel.com (10.18.126.90) by fmsmsx602.amr.corp.intel.com (10.18.126.82) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Tue, 19 Dec 2023 00:09:33 -0800 Received: from fmsmsx610.amr.corp.intel.com (10.18.126.90) by fmsmsx610.amr.corp.intel.com (10.18.126.90) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Tue, 19 Dec 2023 00:09:33 -0800 Received: from fmsedg601.ED.cps.intel.com (10.1.192.135) by fmsmsx610.amr.corp.intel.com (10.18.126.90) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35 via Frontend Transport; Tue, 19 Dec 2023 00:09:33 -0800 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (104.47.59.168) by edgegateway.intel.com (192.55.55.70) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.35; Tue, 19 Dec 2023 00:09:32 -0800 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=jQzngpflCiJFcbeJ9znzy/fDq2Km5pRt/J3v6MESB2t4/omN0uG7pQnK6p9F0cF8SIfnbSMQ1I+HYkoHBPBSTa7w0HDWCtWv2GlVof0Zouks+DtclPPZ2GKds2mlf9mAOYDsW/jSEV1mrCOE8PdZIZoLvA2LdBnlPDqGnLdt4lUzCIOo4/B6YJqLgEHATz/p5KXt1EMAdipNMERgtmdRVRtODW2mTdAS8wwNzc1VzjWAggUj4W09cq1RGUDdMGTz4SlOWDUYUfWHxKfZcDq1XyhVotN/XfNfA2hDNE/Ul6lLKjdiE7gGgzVduE6Xxb0Jym2z0LtC0vvwMhzt8ismJw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=x2nI1l2xo6vAwxom6q14qFdUjmnE7RfaiRLcjNjyvmw=; b=T0fO5elvIGdLy8rwntffJOnUYBGwqCrQIoiRM92HTuUB5BVF8N9DTRv1jUoI4Up1iu6CBP9ojiTYxcs9UHk8Ujt2ynHd6R8BwF7qxjRnqePwfhxB/NfTcc0TgAP6j+VbFcuJ2m/hjExeAVzDFz/4AvqSxYk/Kunf4z/2/XWAWLBoCZ1EiyoZFoADehXwMPMXeRSWvdBSnUArCQBe5BVjQVoj3z4qx1vQynbLR6YMnz7F2Oo9EvIDrIpmU1wmyag6NWkwiDhAOdeeUrcFQGU/FYEEScktV9IEtCtZYANeSOa75MeowU6LtgRCPbj1nIgSj8GRK57OYgl1EEzrhxeGJA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=intel.com; Received: from CH3PR11MB8660.namprd11.prod.outlook.com (2603:10b6:610:1ce::13) by CY8PR11MB7136.namprd11.prod.outlook.com (2603:10b6:930:60::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7091.38; Tue, 19 Dec 2023 08:09:31 +0000 Received: from CH3PR11MB8660.namprd11.prod.outlook.com ([fe80::66ec:5c08:f169:6038]) by CH3PR11MB8660.namprd11.prod.outlook.com ([fe80::66ec:5c08:f169:6038%3]) with mapi id 15.20.7091.034; Tue, 19 Dec 2023 08:09:30 +0000 Date: Tue, 19 Dec 2023 16:09:20 +0800 From: Chao Gao To: Jim Mattson CC: Sean Christopherson , Tao Su , , , , , , , , Subject: Re: [PATCH 1/2] x86: KVM: Limit guest physical bits when 5-level EPT is unsupported Message-ID: References: <20231218140543.870234-1-tao1.su@linux.intel.com> <20231218140543.870234-2-tao1.su@linux.intel.com> Content-Type: text/plain; charset="utf-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-ClientProxiedBy: SGAP274CA0021.SGPP274.PROD.OUTLOOK.COM (2603:1096:4:b6::33) To CH3PR11MB8660.namprd11.prod.outlook.com (2603:10b6:610:1ce::13) Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH3PR11MB8660:EE_|CY8PR11MB7136:EE_ X-MS-Office365-Filtering-Correlation-Id: 21a6fb9a-9fbb-43ce-23c2-08dc0069d3bb X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: CL9VudN6kDy9ztzjytJjoOPNBWtGiDShjydvNl7s2kMuPeCCedFYKyU3DM4S/wBlwungA6BQd03azSOGAo53DJhRDqzPQmsAmCHMwFhMxApJkuTB5DBgrcZVtwA7O0TtZYTBurz7/Lvb9IzNlYotweK0RJo5dN22VxonDgADGt2Ew1Ri1XzWY2Pt2YATYGIMQ4yQ8cL2Da4bW6UJE1zSnzxSVJDATd0QGUbpNcptRC8KI+kXMrzNOXjnKJnjTtb82zpGXV0JcwOjfrEYyGwgk2YecMfS6lQ3r7FOrgJgYfnoGGcH+GYPEIYCzIEewO0WXOS7Ed97dxKqGPdIwT60qDgK7mJLxlj7cE7aSBFroIfP3Y293gAvuFI41uIUeZ2GMwseCgyseVp+isYSPli14utOk76feAVY6SUDa/jM5DGzqLWsGrJFq1demij8SoHuDKU4fpNLRoA/sEgljVumDNVal9ABBZjRx8PcfHpKVtV3bR4GIvIwggMgk1Lr9RoL39LoE36Pz5YQ28hLacUqBxfiOY43b7DHQYyLrJD6uTFlPpjEs9/nIluFuu//a5F6 X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:CH3PR11MB8660.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230031)(7916004)(346002)(39860400002)(396003)(136003)(366004)(376002)(230922051799003)(1800799012)(451199024)(186009)(64100799003)(82960400001)(2906002)(6666004)(33716001)(86362001)(6506007)(6512007)(9686003)(6486002)(478600001)(83380400001)(26005)(38100700002)(41300700001)(5660300002)(316002)(6916009)(54906003)(66556008)(66476007)(66946007)(4326008)(8936002)(8676002)(44832011);DIR:OUT;SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?dkx2MUE3cmpGYVpvUkNtVjVMUmJyVk1oMjM2SXZFSFhpZVFyeFRoWVZVZlo3?= =?utf-8?B?Ni9aNGZlbGkxMzd6bzBwQk5xV2RjUWhyMjBkL2liVmlVUFg4Z3A0TVBLeEhM?= =?utf-8?B?OS9jT1crR3BUdXNVNExGdlFhNDY3ajdndzE4UUw2Z0Nhc2U5MXIxcHNwdlBw?= =?utf-8?B?MGczM2tBOU45d1JMeTJrT2xyRjMvc1BpS1hEVnRhb3JTS3ErY25NcUcweW5s?= =?utf-8?B?SUNFemd3UExJRFlPNGdvMVphVWRITEFlUVlBTVU1bFh3RVduVndkTlhKZGVq?= =?utf-8?B?M2JaSUw3OHMwS2hyWXpla21mczdwZERhNHFFKzRmbHZKWW13ZFFMb1ZUZ05w?= =?utf-8?B?bmtRVDhOcW01eUNWY0lONVZ1VEdQVi9KNGhRTGI3WVc3QitLYWFwY1UyQUtU?= =?utf-8?B?NGdnanVBbmdoczhZSFdnL0dLS1FSV1I4V2dRRC9SdVJWRDNSblplVXdHMnRR?= =?utf-8?B?S2MwNGJHVFNyNVJFN3lCcjk4elA3TnRnOSswR05tT3hncVZHMVQwMlhJUDVl?= =?utf-8?B?dWpVUStaeERHRnkwOURzR1dIN0ZCY3NQY0kraDFkYWUvM3RHYkhFd2dBemRJ?= =?utf-8?B?ZC96SXBQdlRrcFRlRGZwZUlLT2VQT1VRQThmT0kzYVBvMmlNTm5zVkNiVWFi?= =?utf-8?B?aTJnNW9rSmE1SUZ0VEI5Y09zRDdBN2dJMlJMUVE5VnFtSTE5OFk5UTQ4bHBS?= =?utf-8?B?TllrYURiTzFreWYyZ3FlakYyOUgrd245ZWk2S2d1aDFHemphSEVYcmZ3UnBW?= =?utf-8?B?WmZzMHFVckhZNkRSOVVUVHdBendPdDhqeUdsOFVhRnJ3V3QrNXBHZklMNGcy?= =?utf-8?B?NytiWnRJdjdvRHJyK2tDM2dpcldTd1NZT0t6Q0NkaEkwSTZJRXhUakU0VHE3?= =?utf-8?B?eUZiRkVOZDQyMk9sc3d0NXlnbzRWZjNCQWZSY1BYS3g5Y21POHUvZ21PRXh1?= =?utf-8?B?KzFncHlPREhpOFlhODdFcjNRbFBlbnltSXFxSVVJR1hIMFNqK0lMbkNVbWtM?= =?utf-8?B?R1JNQUxLQWwzc0lESENGU3luQkxGQzJXVjFxcklaRjhhNmFWV210aDBFVUtp?= =?utf-8?B?c2pYVVQyMjZxMXNVNFhMOHdCdzFOSG1DSndFZUFLd3hrS2c0aU5kWTZMNWh5?= =?utf-8?B?RW0ybHRjNUxxWG1ndTV1a1dwem4yRFhPdk93RERJL2NVUzJmNkhkRmlsNGlT?= =?utf-8?B?VWY4ZGkrNHdWeTMwMlV6UzdHQk5kSGRPV0V1ZUhsb21VVTM5YXVIdkZiNmJV?= =?utf-8?B?SmpWU0ZCYzVaa1NyeFh6UFVLdWgrTExYaWM0TXUxb1pvRWFzRGwrOUdEeTZ2?= =?utf-8?B?Y1NGYlh0U2ZTVDRkWGVrckR0ZXNIUmtMQk9oMFNqYnJoNndNK2dzQ0lVaUda?= =?utf-8?B?Z2VGSFRpZDNwS1JNckwxTUJka0xLdEdQK0pzUkRyWGE2VkozZzhIU21mQVFp?= =?utf-8?B?djFpVGh2RCtZSlVTdGRDQThoY3h2ZXpLbTNvSTlkRFo1UjY1eHRkeXIyUGZ2?= =?utf-8?B?a0JhclJHYVd3YTIvNG95VXdoT3NBd0R2Q0gyTmRjaXhXSmg3OXBNaVhnbGRK?= =?utf-8?B?MERKMHZDLytEclZFS3VmUFpURGpKV0h2TTJRdVZIM3laQkZ5Z3FFMzVTRS9F?= =?utf-8?B?cW44SGZ0NHR6ajJlRzN1eFN4WjR4ZHYrOEx1V1pxS3lhdHJUcHUyVVEzZGRj?= =?utf-8?B?NHBzamRTejQ2MXB3ay9OUzlBdk1qR3AxeFlzRkpGeEpPNVUxSlhZR3oyRGk5?= =?utf-8?B?enlSUWJ5Mmgrc09nU3lRb1pONHdWejFNbjNrNzdyVjRlZTZQVzBmMzB2d29Q?= =?utf-8?B?UXhJSUYvQlczSEU3bVE4TzVQVHdTR0ZaRHlibVRacFVnaHByOUxhYk5BbFFr?= =?utf-8?B?NWdRWW9FYUVSTGFoMXJuTm8zanJLMWU4ZVR5K2dCNDdHbmcrNnNzbzJGU290?= =?utf-8?B?Rml0UXdEbkJjbGUva1hQUUpRSEd3TW5Bdjd0ZFBMcFlXU2ZTZllmV2ZUdzVm?= =?utf-8?B?WEZTeTFQaGhqQ0R3VDNUcVNVUi9iQWk0Mmh6QWZjYmNtUXBYN2VQQjJ1Y00x?= =?utf-8?B?Q2VBNUhqY3ZrQ0I4a0R4Z3BqUmkyc2VIT1hzK3NHT2l0bzgyMmhoU1RtWDVo?= =?utf-8?Q?v8xFkHIOAjdeEUAEmJgiNdHtd?= X-MS-Exchange-CrossTenant-Network-Message-Id: 21a6fb9a-9fbb-43ce-23c2-08dc0069d3bb X-MS-Exchange-CrossTenant-AuthSource: CH3PR11MB8660.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 19 Dec 2023 08:09:30.5570 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 8ZNp6yDcZbyxJ/qiQjWCRIOxW8p+n3jHii6/QpKggg8rW6JxFdeSDUjGfSgMXzyYf+mrThquZcIY3ZfTUugGpA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY8PR11MB7136 X-OriginatorOrg: intel.com On Mon, Dec 18, 2023 at 07:40:11PM -0800, Jim Mattson wrote: >On Mon, Dec 18, 2023 at 6:51 PM Chao Gao wrote: >> >> On Mon, Dec 18, 2023 at 07:13:27AM -0800, Sean Christopherson wrote: >> >On Mon, Dec 18, 2023, Tao Su wrote: >> >> When host doesn't support 5-level EPT, bits 51:48 of the guest physical >> >> address must all be zero, otherwise an EPT violation always occurs and >> >> current handler can't resolve this if the gpa is in RAM region. Hence, >> >> instruction will keep being executed repeatedly, which causes infinite >> >> EPT violation. >> >> >> >> Six KVM selftests are timeout due to this issue: >> >> kvm:access_tracking_perf_test >> >> kvm:demand_paging_test >> >> kvm:dirty_log_test >> >> kvm:dirty_log_perf_test >> >> kvm:kvm_page_table_test >> >> kvm:memslot_modification_stress_test >> >> >> >> The above selftests add a RAM region close to max_gfn, if host has 52 >> >> physical bits but doesn't support 5-level EPT, these will trigger infinite >> >> EPT violation when access the RAM region. >> >> >> >> Since current Intel CPUID doesn't report max guest physical bits like AMD, >> >> introduce kvm_mmu_tdp_maxphyaddr() to limit guest physical bits when tdp is >> >> enabled and report the max guest physical bits which is smaller than host. >> >> >> >> When guest physical bits is smaller than host, some GPA are illegal from >> >> guest's perspective, but are still legal from hardware's perspective, >> >> which should be trapped to inject #PF. Current KVM already has a parameter >> >> allow_smaller_maxphyaddr to support the case when guest.MAXPHYADDR < >> >> host.MAXPHYADDR, which is disabled by default when EPT is enabled, user >> >> can enable it when loading kvm-intel module. When allow_smaller_maxphyaddr >> >> is enabled and guest accesses an illegal address from guest's perspective, >> >> KVM will utilize EPT violation and emulate the instruction to inject #PF >> >> and determine #PF error code. >> > >> >No, fix the selftests, it's not KVM's responsibility to advertise the correct >> >guest.MAXPHYADDR. >> >> In this case, host.MAXPHYADDR is 52 and EPT supports 4-level only thus can >> translate up to 48 bits of GPA. > >There are a number of issues that KVM does not handle when >guest.MAXPHYADDR < host.MAXPHYADDR. For starters, KVM doesn't raise a >#GP in PAE mode when one of the address bits above guest.MAXPHYADDR is >set in one of the PDPTRs. These are long-standing issues I believe. Note: current KVM ABI doesn't enforce guest.MAXPHYADDR = host.MAXPHYADDR regardless of "allow_smaller_maxphyaddr". > >Honestly, I think KVM should just disable EPT if the EPT tables can't >support the CPU's physical address width. Yes, it is an option. But I prefer to allow admin to override this (i.e., admin still can enable EPT via module parameter) because those issues are not new and disabling EPT doesn't prevent QEMU from launching guests w/ smaller MAXPHYADDR. > >> Here nothing visible to selftests or QEMU indicates that guest.MAXPHYADDR = 52 >> is invalid/incorrect. how can we say selftests are at fault and we should fix >> them? > >In this case, the CPU is at fault, and you should complain to the CPU vendor. Yeah, I agree with you and will check with related team inside Intel. My point was just this isn't a selftest issue because not all information is disclosed to the tests. And I am afraid KVM as L1 VMM may run into this situation, i.e., only 4-level EPT is supported but MAXPHYADDR is 52. So, KVM needs a fix anyway.