Date: Wed, 25 Feb 2026 18:58:21 -0500
From: Gregory Price
To: Matthew Brost
Cc: Alistair Popple, lsf-pc@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org, damon@lists.linux.dev,
	kernel-team@meta.com, gregkh@linuxfoundation.org, rafael@kernel.org,
	dakr@kernel.org, dave@stgolabs.net, jonathan.cameron@huawei.com,
	dave.jiang@intel.com, alison.schofield@intel.com,
	vishal.l.verma@intel.com, ira.weiny@intel.com,
	dan.j.williams@intel.com, longman@redhat.com,
	akpm@linux-foundation.org, david@kernel.org,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
	rppt@kernel.org, surenb@google.com, mhocko@suse.com,
	osalvador@suse.de, ziy@nvidia.com, joshua.hahnjy@gmail.com,
	rakie.kim@sk.com, byungchul@sk.com, ying.huang@linux.alibaba.com,
	axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
	yury.norov@gmail.com, linux@rasmusvillemoes.dk, mhiramat@kernel.org,
	mathieu.desnoyers@efficios.com, tj@kernel.org, hannes@cmpxchg.org,
	mkoutny@suse.com, jackmanb@google.com, sj@kernel.org,
	baolin.wang@linux.alibaba.com, npache@redhat.com,
	ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org,
	lance.yang@linux.dev, muchun.song@linux.dev, xu.xin16@zte.com.cn,
	chengming.zhou@linux.dev, jannh@google.com, linmiaohe@huawei.com,
	nao.horiguchi@gmail.com, pfalcato@suse.de, rientjes@google.com,
	shakeel.butt@linux.dev, riel@surriel.com, harry.yoo@oracle.com,
	cl@gentwo.org, roman.gushchin@linux.dev, chrisl@kernel.org,
	kasong@tencent.com, shikemeng@huaweicloud.com, nphamcs@gmail.com,
	bhe@redhat.com, zhengqi.arch@bytedance.com, terry.bowman@amd.com
Subject: Re: [LSF/MM/BPF TOPIC][RFC PATCH v4 00/27] Private Memory Nodes (w/ Compressed RAM)
References: <20260222084842.1824063-1-gourry@gourry.net>
X-Mailing-List: linux-trace-kernel@vger.kernel.org

On Wed, Feb 25, 2026 at 02:21:54PM -0800, Matthew Brost wrote:
> On Tue, Feb 24, 2026 at 10:17:38AM -0500, Gregory Price wrote:
> >
> > The idea that drm/ is going to switch to private nodes is outside the
> > realm of reality, but part of that is because of years of infrastructure
> > built on the assumption that re-using mm/ is infeasible.
>
> I was about to chime in with essentially the same comment about DRM.
> Switching over to core-managed MM is a massive shift and is likely
> infeasible, or so extreme that we’d end up throwing away any the
> existing driver and starting from scratch. At least for Xe, our MM code
> is baked into all meaningful components of the driver. It’s also a
> unified driver that has to work on iGPU, dGPU over PCIe, dGPU over a
> coherent bus once we get there, devices with GPU pagefaults, and devices
> without GPU pagefaults. It also has to support both 3D and compute
> user-space stacks, etc. So requirements of what it needs to support is
> quite large.
>
> IIRC, Christian once mentioned that AMD was exploring using NUMA and
> udma-buf rather than DRM GEMs for MM on coherent-bus devices. I would
> think AMDGPU has nearly all the same requirements as Xe, aside from
> supporting both 3D and compute stacks, since AMDKFD currently handles
> compute.
> It might be worth getting Christian’s input on this RFC as he
> likely has better insight then myself on DRM's future here.
>

I also think the usage patterns don't quite match (today).

GPUs seem to care very much about specific size allocations,
contiguity, how users get swapped in/out, how reclaim occurs,
specific shutdown procedures - etc.

A private node service just wants to be the arbiter of who can access
the memory, but it may not really care to have extremely deep control
over the actual management of said memory.

Maybe there is a world where GPUs trend in that direction, but it's
certainly not where they are today.

But trying to generalize DRM's infrastructure seems bad. At best we
end up with two mm/ implementations - not good at all.

I do think this fundamentally changes how NUMA gets used by userspace,
but I think userspace should stop reasoning about nodes for memory
placement beyond simple cpu-socket-dram mappings.

(using mm/mempolicy.c just makes your code less portable by design)

---

As a side note, this infrastructure is not just limited to devices,
and I probably should have pointed this out in the cover.

We could create service-dedicated memory pools directly from DRAM.

Something I was exploring this week: Private-CMA

Hack off a chunk of DRAM at boot, hand it to a driver to hotplug as a
private node in ZONE_NORMAL with MIGRATE_CMA, and add that node as a
valid demotion target.

You get:
1) A node of general purpose memory full of (reasonably) cold data
2) Tracked by CMA
3) The CMA is dedicated to a single service
4) And the memory can be pinned for DMA

Right now CMA is somewhat of a free-for-all and if you have multiple
CMA users you can end up in situations where even CMA fragments.

Splitting up users might be nice - but you need some kind of
delimiting mechanism for that. A node seems just about right.

~Gregory
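
The fragmentation point about multiple CMA users can be illustrated
with a toy userspace model. This is only a sketch, not kernel code and
not anything from the series: the allocator is a plain first-fit bitmap,
it ignores page migration entirely (so it behaves as if every allocated
page were pinned), and every name in it (struct pool, pool_alloc,
churn_single_pages, and so on) is hypothetical. It compares one 64-page
pool shared by two services against the same 64 pages split into two
32-page pools, one per service.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PAGES 64

/* Toy contiguous allocator: one flag per page, first-fit placement. */
struct pool {
	bool used[MAX_PAGES];
	size_t npages;
};

static void pool_init(struct pool *p, size_t npages)
{
	assert(npages <= MAX_PAGES);
	memset(p->used, 0, sizeof(p->used));
	p->npages = npages;
}

/* Returns the first page index of a free run of `count` pages, or -1. */
static long pool_alloc(struct pool *p, size_t count)
{
	size_t run = 0;

	if (count == 0)
		return -1;
	for (size_t i = 0; i < p->npages; i++) {
		run = p->used[i] ? 0 : run + 1;
		if (run == count) {
			size_t start = i + 1 - count;

			for (size_t j = start; j <= i; j++)
				p->used[j] = true;
			return (long)start;
		}
	}
	return -1;
}

static void pool_free(struct pool *p, size_t start, size_t count)
{
	assert(start + count <= p->npages);
	for (size_t j = start; j < start + count; j++)
		p->used[j] = false;
}

/* Hypothetical "service A": fill the pool with single pages, then
 * free every other allocation, leaving only one-page holes. */
static void churn_single_pages(struct pool *p)
{
	long got[MAX_PAGES];
	size_t n = 0;

	while (n < MAX_PAGES && (got[n] = pool_alloc(p, 1)) >= 0)
		n++;
	for (size_t i = 0; i < n; i += 2)
		pool_free(p, (size_t)got[i], 1);
}

/* One pool shared by both services: "service B" asks for 16
 * contiguous pages after service A has churned the pool. */
static long shared_pool_large_alloc(void)
{
	struct pool shared;

	pool_init(&shared, 64);
	churn_single_pages(&shared);
	return pool_alloc(&shared, 16);
}

/* The same 64 pages split into two 32-page pools, one per service. */
static long split_pool_large_alloc(void)
{
	struct pool a, b;

	pool_init(&a, 32);
	pool_init(&b, 32);
	churn_single_pages(&a);
	return pool_alloc(&b, 16);
}
```

In this model shared_pool_large_alloc() returns -1: half of the shared
pool is free, but no free run is longer than one page.
split_pool_large_alloc() returns 0, because service A's churn never
touches service B's pool. The per-service pool boundary stands in for
the node boundary in the Private-CMA idea above.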