X-Mailing-List: linux-trace-kernel@vger.kernel.org
Date: Tue, 16 Jun 2026 11:57:42 +0000
Cc: "Balbir Singh", "Matthew Wilcox"
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC][RFC PATCH v4 00/27] Private Memory Nodes (w/ Compressed RAM)
From: "Brendan Jackman"
To: "Vlastimil Babka (SUSE)", "Gregory Price", "David Hildenbrand (Arm)"
References: <9f1815b0-896b-44ab-9e6d-9316d8f11033@kernel.org>
In-Reply-To: <9f1815b0-896b-44ab-9e6d-9316d8f11033@kernel.org>

On Mon Jun 15, 2026 at 2:38 PM UTC, Vlastimil Babka (SUSE) wrote:
> On 6/12/26 17:29, Gregory Price wrote:
>> On Wed, Jun 10, 2026 at 04:12:52PM -0400, Gregory Price wrote:
>>> On Wed, Jun 10, 2026 at 08:59:59PM +0200, David Hildenbrand (Arm) wrote:
>>> > >
>>> > > I understand this question in two ways:
>>> > >
>>> > > 1) Can we disallow PAGE allocation and limit this to FOLIO allocation
>>> >
>>> > Yes. Can we only allow folios to be allocated from private memory nodes. So let
>>> > me reply to that one below.
>>> >
>>> ... snip ...
>>> >
>>> > At LSF/MM we talked about how GFP flags are bad and how deriving stuff from the
>>> > context might be better. I think there was also talk about how the memalloc_*
>>> > interface might be a better way forward.
>>> > Maybe we would start giving the
>>> > allocator more context ("we are allocating a folio").
>>> >
>>> > The following is incomplete (esp. hugetlb stuff I assume), just as some idea:
>>> >
>>>
>>> I will still probably send the next RFC version tomorrow or friday,
>>> as I want to get some eyes on the __GFP_PRIVATE-less pattern.
>>>
>>> Also, I made a new `anondax` driver which enables userland testing
>>> of this functionality without any specialty hardware.
>>>
>>
>> (apologies for the length of this email: this will all be covered in
>> the coming cover letter, but I just wanted to share a bit of a preview)
>>
>> ===
>>
>> Just another small update - I am planning to post the RFC today once i
>> get some mild cleanup done. It will be based on the dax atomic hotplug
>>
>> https://lore.kernel.org/linux-mm/20260605211911.2160954-1-gourry@gourry.net/
>>
>> But a couple specific details regarding the memalloc pieces that i've
>> learned the past couple of days playing with it.
>>
>> 1) memalloc_folio is required to ensure non-folio allocations don't land
>> on the private node, even if it happens within a memalloc_private
>> context. Since memalloc_folio may be useful in contexts outside of
>> private nodes, I kept this as a separate flag.
>>
>> If we think there will *never* be additional users of memalloc_folio,
>> then we could fold _folio into _private to save the flag for now and
>> add it back when we actually need it.
>>
>> 2) memalloc_private is needed to unlock private nodes, but in the
>> original NOFALLBACK-only design, you also needed __GFP_THISNODE.
>>
>> This is *highly* restrictive. I found when playing with mbind that
>> MPOL_BIND + __GFP_THISNODE generates a WARN (valid WARN, it normally
>> implies a bug).
>>
>> That leads me to #3
>
> I think the memalloc approach is dangerous due to unexpected nesting.
> There might be nested page allocations in page allocation itself (due to some
> debugging option). But also interrupts do not change what "current" points
> to. Suddenly those could start requesting folios and/or private nodes and be
> surprised, I'm afraid.

Minor side-note: couldn't we just define it such that the allocator
ignores the context when not in_task() (and warn if you try to enter the
context while not currently in_task())?

(Don't think this would change the conclusion very much, e.g. doesn't
help with the nesting issues. Mostly curious in case I'm missing a
detail here.)

> The memalloc scopes only work well when they restrict the context wrt
> reclaim, and allocations in IRQ have to be already restricted heavily
> (atomic) so further memalloc restrictions don't do anything in practice. But
> to make them change other aspects of the allocations like this won't work.