From: Sohil Mehta <sohil.mehta@intel.com>
To: Breno Leitao <leitao@debian.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Zi Yan <ziy@nvidia.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Nico Pache <npache@redhat.com>,
"Ryan Roberts" <ryan.roberts@arm.com>,
Dev Jain <dev.jain@arm.com>, Barry Song <baohua@kernel.org>,
Lance Yang <lance.yang@linux.dev>,
Vlastimil Babka <vbabka@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Mike Rapoport <rppt@kernel.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <usamaarif642@gmail.com>,
<kas@kernel.org>, <kernel-team@meta.com>,
"Lorenzo Stoakes (Oracle)" <ljs@kernel.org>,
Wei Yang <richard.weiyang@gmail.com>,
"Accardi, Kristen C" <kristen.c.accardi@intel.com>
Subject: Re: [PATCH v6 3/4] mm: huge_memory: refactor enabled_store() with change_enabled()
Date: Mon, 16 Mar 2026 16:26:30 -0700 [thread overview]
Message-ID: <a503cbfc-e7d9-40f0-99ef-d640a0b5d78e@intel.com> (raw)
In-Reply-To: <abfVLky1zanZyEx8@gmail.com>
On 3/16/2026 3:12 AM, Breno Leitao wrote:
>> I am not a mm expert and typically do not follow the mm list. Is there
>> an issue with the usage of non-atomic variants here? The commit message
>> says this uses the same pattern as set_anon_enabled_mode().
>>
>> However, set_anon_enabled_mode() has a spinlock=>huge_anon_orders_lock
>> protecting the access. But, transparent_hugepage_flags seems to be
>> unprotected in that regard.
>
> I don't think that the atomic vs non-atomic will not help much, given
> this is a compoud operation. Independently if this is atomic or not, it
> is racy with anyone changing these fields (transparent_hugepage_flags).
> In other words, Atomic ops make each individual bit flip safe, but
> set_global_enabled_mode() and defrag_store() need to flip multiple bits
> as a group. With atomic ops, two concurrent writers can still interleave
> and leave the flags in an invalid state.
You are right it is a compound operation. So, there is an existing issue
with two concurrent writers which can leave the flags in an invalid state.
But, I was wondering if there is a slightly different issue now due to
the non-atomic part. Some updates could be lost depending on the timing
of the operation.
For example,
CPU1 does: CPU2 does:
set_global_enabled_mode("always") defrag_store("always")
___test_and_set_bit():
// Trying to set bit 1
old = *p
// reads flags, sees defrag bit=0
set_bit(DEFRAG_DIRECT_FLAG)
// atomic: sets bit 3
*p = old | TRANSPARENT_HUGEPAGE_FLAG;
// writes back old value with bit 1 set
// DEFRAG_DIRECT_FLAG is lost (reverted to 0)
IIUC, this issue didn't exist before. I think it might be safer to use
the atomic test_and_set_bit() to be compatible with the older code.
Though, I'll leave it up to you as I don't have expertise here.
Overall, as you mentioned below, protecting transparent_hugepage_flags
with a spinlock seems like a better, long-term solution to me as well.
>
> That said, Although I don't think this patch is making it worse, I think
> the is a racy issue here that we can make better.
>
> My suggestion is to move the rest of the helpers (defrag_store()) to use
> sysfs_match_string(), and then create a thp_flags_lock spinlock to
> protect operations against transparent_hugepage_flags. Any concern about
> this approach?
>
next prev parent reply other threads:[~2026-03-16 23:26 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-11 10:17 [PATCH v6 0/4] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-11 10:17 ` [PATCH v6 1/4] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
2026-03-11 10:17 ` [PATCH v6 2/4] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
2026-03-11 11:44 ` Lance Yang
2026-03-23 9:00 ` David Hildenbrand (Arm)
2026-03-11 10:17 ` [PATCH v6 3/4] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
2026-03-11 12:20 ` Lance Yang
2026-03-13 22:31 ` Sohil Mehta
2026-03-16 10:12 ` Breno Leitao
2026-03-16 23:26 ` Sohil Mehta [this message]
2026-03-17 9:37 ` Lorenzo Stoakes (Oracle)
2026-03-17 11:23 ` Breno Leitao
2026-03-17 11:25 ` Lorenzo Stoakes (Oracle)
2026-03-17 11:48 ` Breno Leitao
2026-03-11 10:17 ` [PATCH v6 4/4] mm: ratelimit min_free_kbytes adjustment messages Breno Leitao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a503cbfc-e7d9-40f0-99ef-d640a0b5d78e@intel.com \
--to=sohil.mehta@intel.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@kernel.org \
--cc=dev.jain@arm.com \
--cc=hannes@cmpxchg.org \
--cc=jackmanb@google.com \
--cc=kas@kernel.org \
--cc=kernel-team@meta.com \
--cc=kristen.c.accardi@intel.com \
--cc=lance.yang@linux.dev \
--cc=leitao@debian.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ljs@kernel.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@suse.com \
--cc=npache@redhat.com \
--cc=richard.weiyang@gmail.com \
--cc=rppt@kernel.org \
--cc=ryan.roberts@arm.com \
--cc=surenb@google.com \
--cc=usamaarif642@gmail.com \
--cc=vbabka@kernel.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.