Date: Tue, 17 Mar 2026 10:55:26 -0700
X-Mailing-List: virtualization@lists.linux.dev
Subject: Re: [PATCH v2] mm/mempolicy: track page allocations per mempolicy
To: "Huang, Ying"
Cc: "Vlastimil Babka (SUSE)", linux-mm@kvack.org,
 akpm@linux-foundation.org, mhocko@suse.com, apopple@nvidia.com,
 axelrasmussen@google.com, byungchul@sk.com, cgroups@vger.kernel.org,
 david@kernel.org, eperezma@redhat.com, gourry@gourry.net,
 jasowang@redhat.com, hannes@cmpxchg.org, joshua.hahnjy@gmail.com,
 Liam.Howlett@oracle.com, linux-kernel@vger.kernel.org,
 lorenzo.stoakes@oracle.com, matthew.brost@intel.com, mst@redhat.com,
 rppt@kernel.org, muchun.song@linux.dev, zhengqi.arch@bytedance.com,
 rakie.kim@sk.com, roman.gushchin@linux.dev, shakeel.butt@linux.dev,
 surenb@google.com, virtualization@lists.linux.dev, weixugc@google.com,
 xuanzhuo@linux.alibaba.com, yuanchu@google.com, ziy@nvidia.com,
 kernel-team@meta.com
References: <20260307045520.247998-1-jp.kobryn@linux.dev>
 <3a42463b-9ddd-4d64-b64c-6c2e6e4fc75d@kernel.org>
 <343bbd5b-67a0-46c4-8ec4-69158bf26b3f@linux.dev>
 <874imkpba1.fsf@DESKTOP-5N7EMDA>
 <60f71f4c-71d9-4751-8c6b-10179b98bef0@kernel.org>
 <87sea0o55p.fsf@DESKTOP-5N7EMDA>
 <0d66401f-9874-4047-971b-632723b0b7ee@linux.dev>
 <87a4w7x8d0.fsf@DESKTOP-5N7EMDA>
From: "JP Kobryn (Meta)"
In-Reply-To: <87a4w7x8d0.fsf@DESKTOP-5N7EMDA>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 3/16/26 11:44 PM, Huang, Ying wrote:
> "JP Kobryn (Meta)" writes:
>
>> On 3/15/26 7:54 PM, Huang, Ying wrote:
>>> "JP Kobryn (Meta)" writes:
>>>
>>>> On 3/13/26 12:34 AM, Vlastimil Babka (SUSE) wrote:
>>>>> On 3/13/26 07:14, JP Kobryn (Meta) wrote:
>>>>>> On 3/12/26 10:07 PM, Huang, Ying wrote:
>>>>>>> "JP Kobryn (Meta)" writes:
>>>>>>>
>>>>>>>> On 3/12/26 6:40 AM, Vlastimil Babka (SUSE) wrote:
>>>>>>>>
>>>>>>>> How about I change from per-policy hit/miss/foreign triplets to a single
>>>>>>>> aggregated policy triplet (i.e. just 3 new counters which account for
>>>>>>>> all policies)? They would follow the same hit/miss/foreign semantics
>>>>>>>> already proposed (visible in quoted text above). This would still
>>>>>>>> provide the otherwise missing signal of whether policy-driven
>>>>>>>> allocations to a node are intentional or fallback.
>>>>>>>>
>>>>>>>> Note that I am also planning on moving the stats off of the memcg so the
>>>>>>>> 3 new counters will be global per-node in response to similar feedback.
>>>>>>>
>>>>>>> Emm, what's the difference between these newly added counters and the
>>>>>>> existing numa_hit/miss/foreign counters?
>>>>>>
>>>>>> The existing counters don't account for node masks in the policies that
>>>>>> make use of them. An allocation can land on a node in the mask and still
>>>>>> be considered a miss because it wasn't the preferred node.
>>>>> That sounds like we could just a new counter e.g. numa_hit_preferred and
>>>>> adjust definitions accordingly? Or some other variant that fills the gap?
>>>>
>>>> It's an interesting thought. Looking into these existing counters more,
>>>> the in-kernel direct node allocations, which don't fall under any
>>>> mempolicy, are also included in these stats. One good example might be
>>>> include/linux/skbuff.h, where __dev_alloc_pages() calls
>>>> alloc_pages_node_noprof(NUMA_NO_NODE, ...) which eventually reaches
>>>> zone_statistics() and increments the stats.
>>> IIUC, the default memory policy is used here, that is, MPOL_LOCAL.
>>
>> I'm not seeing that. zone_statistics() is eventually reached.
>> alloc_pages_mpol() is not.
>
> Yes. The page isn't allocated through alloc_pages_mpol(). For example,
> if we want to allocate pages for the kernel instead of user space
> applications. However, IMHO, the equivalent memory policy is
> MPOL_LOCAL, that is, allocate from local node firstly, then fallback to
> other nodes. I don't think that alloc_pages_mpol() is so special.

Sure. My response was based on how you said, "the default memory policy
is used here". I took that literally. I agree on the behavioral
equivalence, but the important point is that no mempolicy is set. In the
v3 patch which was recently sent out, I'm using that aspect to
distinguish between allocations with a user-defined mempolicy and
allocations without one.
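
To make the gap discussed in this thread concrete, here is a minimal
userspace sketch. It is not kernel code and is not taken from the v3
patch. It models the existing accounting as described above (a hit only
when the allocation lands on the preferred node, otherwise a miss on the
node that served it and a foreign on the preferred node), and adds one
purely hypothetical counter for a miss that still landed inside a policy
nodemask. The names struct node_stats, account_alloc(), has_policy and
in_mask_fallback are all illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 4

/* Illustrative per-node counters; not the kernel's data structures. */
struct node_stats {
	unsigned long numa_hit;         /* landed on the preferred node */
	unsigned long numa_miss;        /* landed here, preferred elsewhere */
	unsigned long numa_foreign;     /* preferred here, landed elsewhere */
	unsigned long in_mask_fallback; /* hypothetical: miss inside the policy mask */
};

static struct node_stats stats[MAX_NODES];

/*
 * Account one allocation.
 *
 * preferred:  node the allocator tried first
 * actual:     node that actually served the allocation
 * has_policy: true if a user-defined mempolicy applied (assumption: the
 *             caller knows this, e.g. the allocation went through a
 *             policy-aware path)
 * nodemask:   bitmask of nodes allowed by that policy; ignored when
 *             has_policy is false
 */
static void account_alloc(int preferred, int actual, bool has_policy,
			  unsigned int nodemask)
{
	assert(preferred >= 0 && preferred < MAX_NODES);
	assert(actual >= 0 && actual < MAX_NODES);

	if (actual == preferred) {
		stats[actual].numa_hit++;
		return;
	}

	/* Existing semantics: the nodemask plays no role here. */
	stats[actual].numa_miss++;
	stats[preferred].numa_foreign++;

	/* Hypothetical extra signal: the miss was still within the mask. */
	if (has_policy && (nodemask & (1u << actual)))
		stats[actual].in_mask_fallback++;
}
```

With a policy mask of {0,1} and preferred node 0, an allocation served
by node 1 counts as numa_miss on node 1 and numa_foreign on node 0, even
though node 1 is in the mask. An allocation made without any mempolicy
that falls back from node 0 to node 2 produces the same miss/foreign
pair, so the existing counters alone cannot tell the two cases apart.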