X-Received: by 2002:a05:600c:3f14:b0:486:f8d6:5dea with SMTP id 5b1f17b1804b1-486f8d65df5mr32233575e9.19.1773908656221;
        Thu, 19 Mar 2026 01:24:16 -0700 (PDT)
Received: from localhost (109-81-88-11.rct.o2.cz. [109.81.88.11])
        by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-486f8bd5f7csm52663255e9.0.2026.03.19.01.24.15
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Thu, 19 Mar 2026 01:24:15 -0700 (PDT)
Date: Thu, 19 Mar 2026 09:24:14 +0100
From: Michal Hocko
To: Daniil Tatianin
Cc: Andrew Morton, Johannes Weiner, Roman Gushchin, Shakeel Butt,
        Muchun Song, David Hildenbrand, Lorenzo Stoakes,
        "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
        Suren Baghdasaryan, Axel Rasmussen, Yuanchu Xie, Wei Xu,
        Brendan Jackman, Zi Yan, cgroups@vger.kernel.org,
        linux-mm@kvack.org, linux-kernel@vger.kernel.org,
        yc-core@yandex-team.ru
Subject: Re: [PATCH] mm: add memory.compact_unevictable_allowed cgroup attribute
Message-ID:
References: <20260317121736.f73a828de2a989d1a07efea1@linux-foundation.org>
        <3db237d0-1ee8-44b7-a356-f3015173f7c2@yandex-team.ru>
        <7ca9876c-f3fa-441c-9a21-ae0ee5523318@yandex-team.ru>
        <73322279-c6f8-4319-827b-938c20c96b9b@yandex-team.ru>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:

On Wed 18-03-26 17:03:53, Daniil Tatianin wrote:
>
> On 3/18/26 2:47 PM, Michal Hocko wrote:
> > On Wed 18-03-26 13:08:31, Daniil Tatianin wrote:
> > > On 3/18/26 1:01 PM, Michal Hocko wrote:
> > > > On Wed 18-03-26 12:25:17, Daniil Tatianin wrote:
> > > > > On 3/18/26 12:20 PM, Michal Hocko wrote:
> > > > [...]
> > > > > > Shouldn't those use mlock?
> > > > > Absolutely, mlock is required to mark a folio as unevictable. Note that
> > > > > unevictable folios are still perfectly eligible for compaction.
> > > > > This new property makes it so a cgroup can say whether its
> > > > > unevictable pages should be compacted (same as the global
> > > > > compact_unevictable_allowed sysctl).
> > > > If the mlock is already used then why do we need a per memcg control as
> > > > well? Do we have different classes of mlocked pages some with acceptable
> > > > compaction while others without?
> > OK, I have misread the intention and this is exactly focused at mlock
> > rather than general protection of all memcg charged memory. Now
> >
> > > The way it works is mlock(2) only prevents pages from being evicted
> > > from the page cache by setting unevictable | mlocked flags on the
> > > page. Such pages, however, are still allowed for compaction by
> > > default, unless /proc/sys/vm/compact_unevictable_allowed is set to 0.
> > > That property essentially "promotes" ALL such (unevictable) pages to a
> > > new synthetic tier by making compaction skip them. The per-cgroup
> > > property works similarly, however, it allows the scope to be much
> > > smaller: from a global setting that promotes literally ALL unevictable
> > > (mlocked) pages to this tier, to only promoting pages belonging to the
> > > cgroup that has memory.compact_unevictable_allowed as 0.
> > This is clear but what is not really clear to me is whether this is
> > worth having as mlock workloads are already quite specific, the amount
> > of mlocked memory shouldn't really consume huge portion of the memory so
> > you still need to have a solid usecase where such a micro management
> > really is worth it. In other words why a global
> > compact_unevictable_allowed is not sufficient.
>
> In my opinion both mlocked memory and non-compactible memory have the
> right to co-exist on the same host without a global switch that turns
> one into the other. I agree that it's not a super common thing, but I
> still think it can be beneficial.
>
> Some examples include, but are not limited to: security, so that
> sensitive data is never swapped to disk yet we have no problem if it
> gets compacted and the actual physical page gets replaced; and
> performance for some apps, so that we can e.g. memlock a large binary
> in memory to keep it in page cache and improve startup time, but again
> don't care much if the actual backing pages are replaced via
> compaction.
>
> On the other hand, some critically important/real time applications do
> need protection from compaction as well on top of the regular mlock, so
> that they have predictable latency and response time, which can really
> fluctuate during heavy compaction. Both of these cases can coexist on
> the same physical machine.

This is a very weak justification for adding a user API. NAK to this.

-- 
Michal Hocko
SUSE Labs