Date: Thu, 25 Jan 2024 20:19:08 +0000
From: Matthew Wilcox
To: David Rientjes
Cc: John Hubbard, Zi Yan, Bharata B Rao, Dave Jiang, "Aneesh Kumar K.V",
    "Huang, Ying", Alistair Popple, Christoph Lameter, Andrew Morton,
    Linus Torvalds, Dave Hansen, Mel Gorman, Jon Grimm, Gregory Price,
    Brian Morris, Wei Xu, Johannes Weiner, linux-mm@kvack.org
Subject: Re: [RFC] Memory tiering kernel alignment
References: <75f21150-1e12-4f4b-e578-e170e4fea18b@google.com>
    <2b29dd3d-bb2c-6a8c-94d2-d5c2e035516a@google.com>
In-Reply-To: <2b29dd3d-bb2c-6a8c-94d2-d5c2e035516a@google.com>

On Thu, Jan 25, 2024 at 12:04:37PM -0800, David Rientjes wrote:
> On Thu, 25 Jan 2024, Matthew Wilcox wrote:
> > On Thu, Jan 25, 2024 at 10:26:19AM -0800, David Rientjes wrote:
> > > There is a lot of excitement around upcoming CXL type 3 memory expansion
> > > devices and their cost savings potential. As the industry starts to
> > > adopt this technology, one of the key components in strategic planning is
> > > how the upstream Linux kernel will support various tiered configurations
> > > to meet various user needs. I think it goes without saying that this is
> > > quite interesting to cloud providers as well as other hyperscalers :)
> >
> > I'm not excited. I'm disappointed that people are falling for this scam.
> > CXL is the ATM of this decade. The protocol is not fit for the purpose
> > of accessing remote memory, adding 10ns just for an encode/decode cycle.
> > Hands up everybody who's excited about memory latency increasing by 17%.
>
> Right, I don't think that anybody is claiming that we can leverage locally
> attached CXL memory as though it was DRAM on the same or remote socket
> and that there won't be a noticeable impact to application performance
> while the memory is still across the device.
>
> It does offer several cost savings benefits for offloading of cold memory,
> though, if locally attached and I think the support for that use case is
> inevitable -- in fact, Linux has some sophisticated support for the
> locally attached use case already.
>
> > Then there are the lies from the vendors who want you to buy switches.
> > Not one of them is willing to guarantee you the worst case latency
> > through their switches.
>
> I should have prefaced this thread by saying "locally attached CXL memory
> expansion", because that's the primary focus of many of the folks on this
> email thread :)

That's a huge relief. I was not looking forward to the patches to add
support for pooling (etc).

Using CXL as cold-data-storage makes a certain amount of sense, although
I'm not really sure why it offers an advantage over NAND. It's faster
than NAND, but you still want to bring it back locally before operating
on it.
NAND is denser, and consumes less power while idle. NAND comes with a
DMA controller to move the data instead of relying on the CPU to move
the data around. And of course moving the data first to CXL and then to
swap means that it's got to go over the memory bus multiple times,
unless you're building a swap device which attaches to the other end of
the CXL bus ...
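
For anyone who wants to check the arithmetic, here is a rough
back-of-envelope sketch of two numbers from this thread: the baseline
latency implied by "10ns" being a "17%" increase (roughly 59 ns), and how
many times a page is moved on a direct versus a two-stage demotion path.
The function names and the "one hop equals one transfer" model are
illustrative assumptions, not measurements or kernel behaviour.

```python
# Back-of-envelope only. The names and the hop-counting model below are
# illustrative assumptions, not taken from the kernel or the CXL spec.

def implied_baseline_ns(added_ns: float, increase_pct: float) -> float:
    """Baseline latency implied by 'added_ns is an increase_pct % increase'."""
    return added_ns / (increase_pct / 100.0)


def bus_transfers(path: list[str]) -> int:
    """Count hops between tiers; assume each hop moves the page once."""
    return max(len(path) - 1, 0)


if __name__ == "__main__":
    base = implied_baseline_ns(10.0, 17.0)
    print(f"implied baseline: {base:.1f} ns, with CXL: {base + 10.0:.1f} ns")
    print("DRAM -> swap transfers:", bus_transfers(["DRAM", "swap"]))
    print("DRAM -> CXL -> swap transfers:",
          bus_transfers(["DRAM", "CXL", "swap"]))
```

Under this simple model the two-stage path moves every cold page twice,
which is the extra traffic being objected to above; a swap device sitting
on the far side of the CXL link would change the count.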