From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from stravinsky.debian.org (stravinsky.debian.org [82.195.75.108]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 600CB332ED3 for ; Thu, 5 Mar 2026 11:48:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=82.195.75.108 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772711327; cv=none; b=Srg3/58m7KuVq025pqV7w87Y8hLZtpkloKpMcXr63WbXWcOkcL94GZRO7+46EI+Iw/wxS+q7RFgzMCLEdRg4RXOwm9wSJkXmv+W77VCj1yCZvpXV/InINd80idHjFRQ8WNR0bXUa/SO3pJ7IsR3F2OTDkplKPFUaA3wpzJh8sjg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772711327; c=relaxed/simple; bh=b5r0nQ4FRo1YumohP3T5ov4YDO/x5zfwKfrdHDxNhj0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=SaS8vikbXxo3mlVf2VUPwmShD89pIbmlUsisxq3sDLDe3b8cxE/UUVuwfBt9X5f+9QMmqNNGCRAk1uw0YAR79R+mhk/Edw9Hwy39hgenfse221dcfWqiMc00EjPyUPuG1XEWIabUXgYNUtxdDdkPywNdqlqy9yryITxOGKhgACI= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org; spf=none smtp.mailfrom=debian.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b=Hq0rTFw2; arc=none smtp.client-ip=82.195.75.108 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=debian.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b="Hq0rTFw2" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debian.org; s=smtpauto.stravinsky; h=X-Debian-User:In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=zQkinhDuPvsI+Mc9zYJbcWtkNsBII7yHHxY2ORHu0zM=; b=Hq0rTFw21/sh5uhsqYfzcxFS2d iluItr+FEpx0MlWdcYcTIWwlo+DZWTLIamUpmati9WRVLwJvUhczqblmDxJ1fBjCOBgZ3K/+CGsfv a1GDllI5Kl2ANRQ4puUiZgIActvll7SaQN1PxXT6YRz+8KyJXE9ko9N0q+4BEvIRFESF71bKZToT2 EuXKajdspI1cNFsEK//TQgDM+V2nj7TSw8dgDuAXQ3RzeeOG3d5PbBJVWlIVzefFWx+ag9OqDrL60 YsZkVo7KhoQwBSWuANsg/38M6R0ybAXU6rQ62E5xZM+GWQQSoJv1nws62XQ6sHpxuKFrfGD7BpAlm +ViILwOA==; Received: from authenticated user by stravinsky.debian.org with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94.2) (envelope-from ) id 1vy7Be-00GhhC-DY; Thu, 05 Mar 2026 11:48:14 +0000 Date: Thu, 5 Mar 2026 03:48:07 -0800 From: Breno Leitao To: "Lorenzo Stoakes (Oracle)" Cc: Andrew Morton , David Hildenbrand , Lorenzo Stoakes , Zi Yan , Baolin Wang , "Liam R. Howlett" , Nico Pache , Ryan Roberts , Dev Jain , Barry Song , Lance Yang , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Brendan Jackman , Johannes Weiner , linux-mm@kvack.org, linux-kernel@vger.kernel.org, usamaarif642@gmail.com, kas@kernel.org, kernel-team@meta.com Subject: Re: [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Message-ID: References: <20260304-thp_logs-v1-0-59038218a253@debian.org> <20260304-thp_logs-v1-1-59038218a253@debian.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Debian-User: leitao On Wed, Mar 04, 2026 at 04:40:22PM +0000, Lorenzo Stoakes (Oracle) wrote: > On Wed, Mar 04, 2026 at 02:22:33AM -0800, Breno Leitao wrote: > > Writing "never" (or any other value) multiple times to > > /sys/kernel/mm/transparent_hugepage/hugepages-*/enabled calls > > start_stop_khugepaged() each time, even when nothing actually changed. > > This causes set_recommended_min_free_kbytes() to run unconditionally, > > which is unnecessary and floods the printk buffer with "raising > > min_free_kbytes" messages. Example: > > > > # for i in $(seq 100); do > > # echo never > /sys/kernel/mm/transparent_hugepage/enabled > > # done > > > > # dmesg | grep "min_free_kbytes is not updated" | wc -l > > 100 > > > > Use test_and_set_bit()/test_and_clear_bit() instead of the plain > > variants to detect whether any bit actually flipped, and skip the > > start_stop_khugepaged() call entirely when the configuration is > > unchanged. > > > > With this patch, redoing the same operation becomes a no-op. > > > > Signed-off-by: Breno Leitao > > General concept is sensible, but let's improve this code please. Ack! Thanks for the suggestions. > > spin_unlock(&huge_anon_orders_lock); > > } else > > ret = -EINVAL; > > > > - if (ret > 0) { > > + if (ret > 0 && changed) { > > int err; > > > > err = start_stop_khugepaged(); > > There's a caveat here as mentioned in reply to Kiryl - I'm concerned users > might rely on the set recommended min kbytes even when things don't change. > > Not sure how likely that is, but it's a user-visible change in how this behaves. Is there any motivation that users are retouching /sys/kernel/mm/transparent_hugepage just to trigger set_recommended_min_free_kbytes() ? That seems weird, but, I will keep it in the change. > From cb2c4c8bf183ef0d10068cfd12c12d19cb17a241 Mon Sep 17 00:00:00 2001 > From: "Lorenzo Stoakes (Oracle)" > Date: Wed, 4 Mar 2026 16:37:20 +0000 > Subject: [PATCH] idea > > Signed-off-by: Lorenzo Stoakes (Oracle) Thanks for the idea. Let me hack on top of it, and propose a v2. > --- > mm/huge_memory.c | 74 ++++++++++++++++++++++++++++++------------------ > 1 file changed, 46 insertions(+), 28 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 0df1f4a17430..97dabbeb9112 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -515,46 +515,64 @@ static ssize_t anon_enabled_show(struct kobject *kobj, > return sysfs_emit(buf, "%s\n", output); > } > > +enum huge_mode { > + HUGE_ALWAYS, > + HUGE_INHERIT, > + HUGE_MADVISE, > + HUGE_NUM_MODES, > + HUGE_NEVER, > +}; > + > +static bool change_anon_orders(int order, enum huge_mode mode) > +{ > + static unsigned long *orders[] = { > + &huge_anon_orders_always, > + &huge_anon_orders_inherit, > + &huge_anon_orders_madvise, > + }; > + bool changed = false; > + int i; > + > + spin_lock(&huge_anon_orders_lock); > + for (i = 0; i < HUGE_NUM_MODES; i++) { > + if (i == mode) > + changed |= !test_and_set_bit(order, orders[mode]); > + else > + changed |= test_and_clear_bit(order, orders[mode]); I suppose we want s/mode/i in the test_and_{clear,set}_bit() here: if (i == mode) // set for mode changed |= !test_and_set_bit(order, orders[i]); else // clear for !mode changed |= test_and_clear_bit(order, orders[i]); For two reasons: * you want to unset "i" when i != mode. * you would have an OOB when accessing orders[HUGE_NEVER == 4] > static ssize_t anon_enabled_store(struct kobject *kobj, > struct kobj_attribute *attr, > const char *buf, size_t count) > { > int order = to_thpsize(kobj)->order; > ssize_t ret = count; > + bool changed; > + > + if (sysfs_streq(buf, "always")) > + changed = change_anon_orders(order, HUGE_ALWAYS); > + else if (sysfs_streq(buf, "inherit")) > + changed = change_anon_orders(order, HUGE_INHERIT); > + else if (sysfs_streq(buf, "madvise")) > + changed = change_anon_orders(order, HUGE_MADVISE); > + else if (sysfs_streq(buf, "never")) > + changed = change_anon_orders(order, HUGE_NEVER); > + else > + return -EINVAL; I think we can simplify anon_enabled_store() even more, by leveraging sysfs_match_string(). Something like: static const char *const anon_mode_strings[] = { [HUGE_ALWAYS] = "always", [HUGE_INHERIT] = "inherit", [HUGE_MADVISE] = "madvise", [HUGE_NEVER] = "never", NULL, }; and then static ssize_t anon_enabled_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { int order = to_thpsize(kobj)->order; int mode; mode = sysfs_match_string(enabled_mode_strings, buf); if (mode < 0) return mode; if (change_anon_orders(order, mode)) { int err = start_stop_khugepaged(); if (err) return err; } else { /* Users expect this even if unchanged. TODO: Put in header... */ //set_recommended_min_free_kbytes(); } return count; } Anyway, I like this approach, thanks!. Let me hack a v2 based on it. --breno