Date: Mon, 3 Apr 2023 11:44:23 +0300
From: Mike Rapoport
To: Dragan Stancevic
Cc: Kyungsan Kim, dan.j.williams@intel.com,
    lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, linux-cxl@vger.kernel.org,
    a.manzanares@samsung.com, viacheslav.dubeyko@bytedance.com,
    ying.huang@intel.com, nil-migration@lists.linux.dev
Subject: Re: FW: [LSF/MM/BPF TOPIC] SMDK inspired MM changes for CXL
References: <641b7b2117d02_1b98bb294cb@dwillia2-xfh.jf.intel.com.notmuch>
    <20230323105105.145783-1-ks0204.kim@samsung.com>
    <362a9e19-fea5-e45a-3c22-3aa47e851aea@stancevic.com>
In-Reply-To: <362a9e19-fea5-e45a-3c22-3aa47e851aea@stancevic.com>

Hi Dragan,

On Thu, Mar 30, 2023 at 05:03:24PM -0500, Dragan Stancevic wrote:
> On 3/26/23 02:21, Mike Rapoport wrote:
> > Hi,
> >
> > [..]
> >
> > > One problem we experienced occurred in the combination of
> > > hot-remove and kernelspace allocation usecases.
> > > ZONE_NORMAL allows kernel context allocation, but it does not
> > > allow hot-remove because the kernel resides there all the time.
> > > ZONE_MOVABLE allows hot-remove due to the page migration, but it
> > > only allows userspace allocation.
> > > Alternatively, we allocated a kernel context out of ZONE_MOVABLE
> > > by adding the GFP_MOVABLE flag.
> > > In that case, oopses and system hangs occasionally occurred
> > > because ZONE_MOVABLE can be swapped.
> > > We resolved the issue using ZONE_EXMEM by allowing a selective
> > > choice between the two usecases.
> > > As you well know, among heterogeneous DRAM devices, CXL DRAM is
> > > the first PCIe-based device, which allows hot-pluggability,
> > > different RAS, and extended connectivity.
> > > So we thought that adding a new zone and managing the new
> > > features separately could be a graceful approach.
> >
> > This still does not describe the use cases that require having
> > kernel allocations on CXL.mem.
> >
> > I believe it's important to start with an explanation of *why* it is
> > important to have kernel allocations on removable devices.
>
> Hi Mike,
>
> not speaking for Kyungsan here, but I am starting to tackle hypervisor
> clustering and VM migration over cxl.mem [1].
>
> And in my mind, at least one reason I can think of for having kernel
> allocations from cxl.mem devices is where you have multiple VH
> connections sharing the memory [2]. Where for example you have a user
> space application stored in cxl.mem, and then you want the metadata
> about this process/application that the kernel keeps on one hypervisor
> be "passed on" to another hypervisor. So basically the same way
> processors in a single hypervisor cooperate on memory, you extend that
> across processors that span over physical hypervisors. If that makes
> sense...

Let me reiterate to make sure I understand your example.
If we focus on the VM usecase, your suggestion is to store the VM's
memory and the associated KVM structures on a CXL.mem device shared by
several nodes.

Even putting aside the aspect of keeping KVM structures on presumably
slower memory, what will ZONE_EXMEM provide that cannot be accomplished
by having the CXL memory in a memoryless node and using that node to
allocate VM metadata?

> [1] A high-level explanation is at http://nil-migration.org
> [2] Compute Express Link Specification r3.0, v1.0 8/1/22, Page 51,
>     figure 1-4, black color scheme circle(3) and bars.
>
> --
> Peace can only come as a natural consequence
> of universal enlightenment -Dr. Nikola Tesla
>

-- 
Sincerely yours,
Mike.
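
For readers following the thread, the zone trade-off quoted above can be
written down as a toy model. This is only a sketch: the Zone class and
the usable_zones() helper are hypothetical names invented for
illustration, not kernel interfaces, and the two properties simply
restate the claims in the quoted text (ZONE_NORMAL takes kernel
allocations but cannot be hot-removed, ZONE_MOVABLE is the opposite).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Toy description of a memory zone (hypothetical, not a kernel type)."""
    name: str
    kernel_alloc: bool  # may hold kernel-context allocations
    hot_remove: bool    # memory can be hot-removed


# Properties as stated in the quoted text; an assumption of this model.
ZONE_NORMAL = Zone("ZONE_NORMAL", kernel_alloc=True, hot_remove=False)
ZONE_MOVABLE = Zone("ZONE_MOVABLE", kernel_alloc=False, hot_remove=True)


def usable_zones(zones, need_kernel_alloc, need_hot_remove):
    """Return the names of zones that satisfy every stated requirement."""
    return [
        z.name
        for z in zones
        if (z.kernel_alloc or not need_kernel_alloc)
        and (z.hot_remove or not need_hot_remove)
    ]


zones = [ZONE_NORMAL, ZONE_MOVABLE]
print(usable_zones(zones, need_kernel_alloc=True, need_hot_remove=False))
print(usable_zones(zones, need_kernel_alloc=False, need_hot_remove=True))
print(usable_zones(zones, need_kernel_alloc=True, need_hot_remove=True))
```

Under these assumptions the last query returns an empty list: neither
existing zone offers kernel allocations and hot-remove together, which
is the gap the ZONE_EXMEM proposal is aimed at. The model says nothing
about whether a memoryless node could cover the same need, which is the
open question in the reply.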