From: "Huang, Ying"
To: Gregory Price
Cc: Johannes Weiner, Andrew Morton, David Hildenbrand, Zi Yan, Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park, Alistair Popple, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Neha Gholkar
Subject: Re: [PATCH] mm: mempolicy: fix automatic numa balancing for shmem
In-Reply-To: (Gregory Price's message of "Tue, 30 Jun 2026 11:29:36 -0400")
References: <20260629163337.1264881-1-hannes@cmpxchg.org> <87h5mkz33h.fsf@DESKTOP-5N7EMDA>
Date: Wed, 01 Jul 2026 19:03:32 +0800
Message-ID: <877bnfynsr.fsf@DESKTOP-5N7EMDA>
X-Mailing-List: linux-kernel@vger.kernel.org

Gregory Price writes:

> On Tue, Jun 30, 2026 at 07:20:50PM +0800, Huang, Ying wrote:
>> Gregory Price writes:
>>
>> [snip]
>>
>> > Demotions don't care about mempolicy, so opting shmem out of NUMA
>> > balancing and mbind'ing on a tiered system is just full sadness.
>> >
>> > This is all just more evidence that demotion needs to be completely
>> > redone, it's creating a mess of undefined behavior for memory placement.
>>
>> It's hard to respect mempolicy during demotion in the current
>> implementation.  Do you have any ideas on how to improve this?
>>
>
> I think it's feasible we could respect per-vma mempolicies, but not
> per-task.
> That would at least make this particular interaction less
> painful and mbind() would do what you'd expect.  It is a bit racy,
> but with MPOL_MF_MOVE_ALL the user can get what they actually want.

Yes.  Per-vma mempolicy support is possible.

> I think task-wide mempolicy is problematic and generally a bad idea
> on tiered systems, maybe it's ok if we simply document task policies
> are not respected on tiered systems?

Anyway, it's convenient to use numactl to manage mempolicy.  Is it
possible to enable NUMA_BALANCING_MEMORY_TIERING for non-default VMAs?
If we don't enable NUMA_BALANCING_NORMAL, the overhead should be OK
because the page table entries are changed to PROT_NONE only for pages
on the slow tier.

Additionally, we may need to consider cpusets.

---
Best Regards,
Huang, Ying
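
For reference, the two modes named above are bits of the
kernel.numa_balancing sysctl (1 for NUMA_BALANCING_NORMAL, 2 for
NUMA_BALANCING_MEMORY_TIERING).  Below is a minimal userspace sketch of
the "tiering only" configuration being discussed.  The helper names
tiering_only() and read_numa_balancing_mode() are hypothetical and exist
only for illustration; this is not kernel code and not part of the patch.

```c
#include <stdio.h>

/* Bit values of the kernel.numa_balancing sysctl. */
#define NUMA_BALANCING_DISABLED        0x0
#define NUMA_BALANCING_NORMAL          0x1
#define NUMA_BALANCING_MEMORY_TIERING  0x2

/*
 * Hypothetical helper: returns 1 when memory tiering mode is enabled
 * without normal NUMA balancing, 0 otherwise.
 */
int tiering_only(int mode)
{
	return (mode & NUMA_BALANCING_MEMORY_TIERING) &&
	       !(mode & NUMA_BALANCING_NORMAL);
}

/*
 * Hypothetical helper: reads the current sysctl value.
 * Returns -1 if the file is missing or cannot be parsed.
 */
int read_numa_balancing_mode(void)
{
	FILE *f = fopen("/proc/sys/kernel/numa_balancing", "r");
	int mode = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &mode) != 1)
		mode = -1;
	fclose(f);
	return mode;
}
```

Calling tiering_only(read_numa_balancing_mode()) on a running system
reports whether it is in the tiering-only configuration; a return value
of -1 from the reader (no such sysctl) is treated as "not tiering only".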