linux-mm.kvack.org archive mirror
* [PATCH] mempolicy: Clarify what RECLAIM_ZONE means
@ 2025-07-25 17:35 Joshua Hahn
  2025-07-25 21:44 ` SeongJae Park
  2025-07-28  1:44 ` Huang, Ying
  0 siblings, 2 replies; 9+ messages in thread
From: Joshua Hahn @ 2025-07-25 17:35 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Johannes Weiner
  Cc: Zi Yan, Matthew Brost, Rakie Kim, Byungchul Park, Gregory Price,
	Ying Huang, Alistair Popple, linux-kernel, linux-mm, kernel-team

The zone_reclaim_mode API controls reclaim behavior when a node runs out of
memory. Contrary to its user-facing name, it is internally referred to as
"node_reclaim_mode". This is slightly confusing but there is not much we can
do given that it has already been exposed to userspace (since at least 2.6).

However, what we can do is make sure the internal description of what the
bits inside zone_reclaim_mode do aligns with what they do in practice.
Setting RECLAIM_ZONE does indeed run shrink_inactive_list, but a more holistic
description would be to explain that zone reclaim modulates whether page
allocation (and khugepaged collapsing) prefers reclaiming & attempting to
allocate locally or should fall back to the next node in the zonelist.

Change the description to clarify what zone reclaim entails.

Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
 include/uapi/linux/mempolicy.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 1f9bb10d1a47..24083809d920 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -69,7 +69,7 @@ enum {
  * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
  * ABI.  New bits are OK, but existing bits can never change.
  */
-#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
+#define RECLAIM_ZONE	(1<<0)	/* Prefer reclaiming & allocating locally */
 #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
 #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
 

base-commit: 25fae0b93d1d7ddb25958bcb90c3c0e5e0e202bd
-- 
2.47.3



* Re: [PATCH] mempolicy: Clarify what RECLAIM_ZONE means
  2025-07-25 17:35 [PATCH] mempolicy: Clarify what RECLAIM_ZONE means Joshua Hahn
@ 2025-07-25 21:44 ` SeongJae Park
  2025-07-26  1:24   ` Joshua Hahn
  2025-07-28  1:44 ` Huang, Ying
  1 sibling, 1 reply; 9+ messages in thread
From: SeongJae Park @ 2025-07-25 21:44 UTC (permalink / raw)
  To: Joshua Hahn
  Cc: SeongJae Park, Andrew Morton, David Hildenbrand, Johannes Weiner,
	Zi Yan, Matthew Brost, Rakie Kim, Byungchul Park, Gregory Price,
	Ying Huang, Alistair Popple, linux-kernel, linux-mm, kernel-team

Hi Joshua,

On Fri, 25 Jul 2025 10:35:45 -0700 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:

> The zone_reclaim_mode API controls reclaim behavior when a node runs out of
> memory. Contrary to its user-facing name, it is internally referred to as
> "node_reclaim_mode". This is slightly confusing but there is not much we can
> do given that it has already been exposed to userspace (since at least 2.6).
> 
> However, what we can do is to make sure the internal description of what the
> bits inside zone_reclaim_mode aligns with what it does in practice.
> Setting RECLAIM_ZONE does indeed run shrink_inactive_list, but a more holistic
> description would be to explain that zone reclaim modulates whether page
> allocation (and khugepaged collapsing) prefers reclaiming & attempting to
> allocate locally or should fall back to the next node in the zonelist.
> 
> Change the description to clarify what zone reclaim entails.
> 
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
>  include/uapi/linux/mempolicy.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> index 1f9bb10d1a47..24083809d920 100644
> --- a/include/uapi/linux/mempolicy.h
> +++ b/include/uapi/linux/mempolicy.h
> @@ -69,7 +69,7 @@ enum {
>   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>   * ABI.  New bits are OK, but existing bits can never change.
>   */
> -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
> +#define RECLAIM_ZONE	(1<<0)	/* Prefer reclaiming & allocating locally */
>  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
>  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */

I agree the new comment is more holistic.  It explains the general
zone_reclaim_mode behavior (how the system works when the mode is turned on by
setting any of the rightmost three bits) well.  But I think the old
description is for one specific mode (when the rightmost bit is set), and
that place is appropriate for that purpose.

What about keeping the old comment but adding the holistic description to the
multi-line comment block above it?

The behavior is also well described in the zone_reclaim_mode section of
Documentation/admin-guide/sysctl/vm.rst, in my opinion.  Maybe adding a
reference to that document for readers who are curious about more details
could also be useful?


Thanks,
SJ

[...]



* Re: [PATCH] mempolicy: Clarify what RECLAIM_ZONE means
  2025-07-25 21:44 ` SeongJae Park
@ 2025-07-26  1:24   ` Joshua Hahn
  0 siblings, 0 replies; 9+ messages in thread
From: Joshua Hahn @ 2025-07-26  1:24 UTC (permalink / raw)
  To: SeongJae Park
  Cc: Andrew Morton, David Hildenbrand, Johannes Weiner, Zi Yan,
	Matthew Brost, Rakie Kim, Byungchul Park, Gregory Price,
	Ying Huang, Alistair Popple, linux-kernel, linux-mm, kernel-team

On Fri, 25 Jul 2025 14:44:26 -0700 SeongJae Park <sj@kernel.org> wrote:

> Hi Joshua,
> 
> On Fri, 25 Jul 2025 10:35:45 -0700 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> 
> > The zone_reclaim_mode API controls reclaim behavior when a node runs out of
> > memory. Contrary to its user-facing name, it is internally referred to as
> > "node_reclaim_mode". This is slightly confusing but there is not much we can
> > do given that it has already been exposed to userspace (since at least 2.6).
> > 
> > However, what we can do is to make sure the internal description of what the
> > bits inside zone_reclaim_mode aligns with what it does in practice.
> > Setting RECLAIM_ZONE does indeed run shrink_inactive_list, but a more holistic
> > description would be to explain that zone reclaim modulates whether page
> > allocation (and khugepaged collapsing) prefers reclaiming & attempting to
> > allocate locally or should fall back to the next node in the zonelist.
> > 
> > Change the description to clarify what zone reclaim entails.
> > 
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> > ---
> >  include/uapi/linux/mempolicy.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> > index 1f9bb10d1a47..24083809d920 100644
> > --- a/include/uapi/linux/mempolicy.h
> > +++ b/include/uapi/linux/mempolicy.h
> > @@ -69,7 +69,7 @@ enum {
> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >   * ABI.  New bits are OK, but existing bits can never change.
> >   */
> > -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
> > +#define RECLAIM_ZONE	(1<<0)	/* Prefer reclaiming & allocating locally */
> >  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
> >  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
> 
> I agree the new comment is more holistic.  It explains general
> zone_reclaim_mode behavior (how the system works if the mode is turned on by
> having any of rightmost three bits is set) well.  But, I think the old
> description is for the specific mode of it (when the rightmost bit is set), and
> the place is appropriate for that purpose.
> 
> What about keeping the old comment but adding the holistic description on the
> upper multi-lines comments block?

Hi SJ,

Thank you for your kind review as always :-)
On second thought, I think you may be right. To be completely honest, the reason
I submitted this patch is because I was looking into zone_reclaim and got
a bit confused, and thought there was a possibility that others might be
confused as well. It might only have been confusing for me, though ;)

> And the behavior is also well described in zone_reclaim_mode section of
> Documentation/admin-guide/sysctl/vm.rst document in my opinion.  Maybe putting
> a reference to the doc together for readers who curious about more details
> could also be useful?

Yes, I think this is a very good point. The comment block above the #defines
was added because in the past, RECLAIM_ZONE was actually removed by a developer
because there were no explicit users (although this has changed since).
Perhaps pointing users to the admin-guide can help explain more about the
context of the first bit, as well as explain what I am trying to do with the
comment change.

> 
> Thanks,
> SJ
> 
> [...]

Thanks again SJ! I hope you enjoy your weekend :-)
Joshua

Sent using hkml (https://github.com/sjp38/hackermail)



* Re: [PATCH] mempolicy: Clarify what RECLAIM_ZONE means
  2025-07-25 17:35 [PATCH] mempolicy: Clarify what RECLAIM_ZONE means Joshua Hahn
  2025-07-25 21:44 ` SeongJae Park
@ 2025-07-28  1:44 ` Huang, Ying
  2025-07-28 14:51   ` Joshua Hahn
  1 sibling, 1 reply; 9+ messages in thread
From: Huang, Ying @ 2025-07-28  1:44 UTC (permalink / raw)
  To: Joshua Hahn
  Cc: Andrew Morton, David Hildenbrand, Johannes Weiner, Zi Yan,
	Matthew Brost, Rakie Kim, Byungchul Park, Gregory Price,
	Alistair Popple, linux-kernel, linux-mm, kernel-team

Hi, Joshua,

Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> The zone_reclaim_mode API controls reclaim behavior when a node runs out of
> memory. Contrary to its user-facing name, it is internally referred to as
> "node_reclaim_mode". This is slightly confusing but there is not much we can
> do given that it has already been exposed to userspace (since at least 2.6).
>
> However, what we can do is to make sure the internal description of what the
> bits inside zone_reclaim_mode aligns with what it does in practice.
> Setting RECLAIM_ZONE does indeed run shrink_inactive_list, but a more holistic
> description would be to explain that zone reclaim modulates whether page
> allocation (and khugepaged collapsing) prefers reclaiming & attempting to
> allocate locally or should fall back to the next node in the zonelist.
>
> Change the description to clarify what zone reclaim entails.
>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
>  include/uapi/linux/mempolicy.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> index 1f9bb10d1a47..24083809d920 100644
> --- a/include/uapi/linux/mempolicy.h
> +++ b/include/uapi/linux/mempolicy.h
> @@ -69,7 +69,7 @@ enum {
>   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>   * ABI.  New bits are OK, but existing bits can never change.
>   */
> -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
> +#define RECLAIM_ZONE	(1<<0)	/* Prefer reclaiming & allocating locally */
>  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
>  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
>  
>
> base-commit: 25fae0b93d1d7ddb25958bcb90c3c0e5e0e202bd

Please consider the documentation of zone_reclaim_mode in
Documentation/admin-guide/sysctl/vm.rst too.

And, IIUC, RECLAIM_ZONE doesn't mean "locally" exactly.  It's legal to
bind to some node other than "local node".

---
Best Regards,
Huang, Ying



* Re: [PATCH] mempolicy: Clarify what RECLAIM_ZONE means
  2025-07-28  1:44 ` Huang, Ying
@ 2025-07-28 14:51   ` Joshua Hahn
  2025-07-29  0:58     ` Huang, Ying
  0 siblings, 1 reply; 9+ messages in thread
From: Joshua Hahn @ 2025-07-28 14:51 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Andrew Morton, David Hildenbrand, Johannes Weiner, Zi Yan,
	Matthew Brost, Rakie Kim, Byungchul Park, Gregory Price,
	Alistair Popple, linux-kernel, linux-mm, kernel-team

On Mon, 28 Jul 2025 09:44:06 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:

> Hi, Joshua,
> 
> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> 
> > The zone_reclaim_mode API controls reclaim behavior when a node runs out of
> > memory. Contrary to its user-facing name, it is internally referred to as
> > "node_reclaim_mode". This is slightly confusing but there is not much we can
> > do given that it has already been exposed to userspace (since at least 2.6).
> >
> > However, what we can do is to make sure the internal description of what the
> > bits inside zone_reclaim_mode aligns with what it does in practice.
> > Setting RECLAIM_ZONE does indeed run shrink_inactive_list, but a more holistic
> > description would be to explain that zone reclaim modulates whether page
> > allocation (and khugepaged collapsing) prefers reclaiming & attempting to
> > allocate locally or should fall back to the next node in the zonelist.
> >
> > Change the description to clarify what zone reclaim entails.
> >
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> > ---
> >  include/uapi/linux/mempolicy.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> > index 1f9bb10d1a47..24083809d920 100644
> > --- a/include/uapi/linux/mempolicy.h
> > +++ b/include/uapi/linux/mempolicy.h
> > @@ -69,7 +69,7 @@ enum {
> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >   * ABI.  New bits are OK, but existing bits can never change.
> >   */
> > -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
> > +#define RECLAIM_ZONE	(1<<0)	/* Prefer reclaiming & allocating locally */
> >  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
> >  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
> >  
> >
> > base-commit: 25fae0b93d1d7ddb25958bcb90c3c0e5e0e202bd

Hi Ying, thanks for your review, as always!

> Please consider the document of zone_reclaim_mode in
> Documentation/admin-guide/sysctl/vm.rst too.

Yes, will do. Along with SJ's comment, I think that the information in the
admin-guide should be sufficient to explain what these bits do, so
I think my patch is not really necessary.

> And, IIUC, RECLAIM_ZONE doesn't mean "locally" exactly.  It's legal to
> bind to some node other than "local node".

You are correct, it seems you can also reclaim on non-local nodes once you
go further down in the zonelist. I think my intent with the new comment was just
to indicate a preference to reclaim and allocate on the *current* node, as
opposed to falling back to the next node in the zonelist.

With that said, I think your comment along with SJ's feedback has gotten me
to understand that we probably don't need this change :-)

Thank you, and have a great day!
Joshua

> ---
> Best Regards,
> Huang, Ying

Sent using hkml (https://github.com/sjp38/hackermail)



* Re: [PATCH] mempolicy: Clarify what RECLAIM_ZONE means
  2025-07-28 14:51   ` Joshua Hahn
@ 2025-07-29  0:58     ` Huang, Ying
  2025-07-30 20:19       ` Joshua Hahn
  0 siblings, 1 reply; 9+ messages in thread
From: Huang, Ying @ 2025-07-29  0:58 UTC (permalink / raw)
  To: Joshua Hahn
  Cc: Andrew Morton, David Hildenbrand, Johannes Weiner, Zi Yan,
	Matthew Brost, Rakie Kim, Byungchul Park, Gregory Price,
	Alistair Popple, linux-kernel, linux-mm, kernel-team, Dave Hansen

Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> On Mon, 28 Jul 2025 09:44:06 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>
>> Hi, Joshua,
>> 
>> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> 
>> > The zone_reclaim_mode API controls reclaim behavior when a node runs out of
>> > memory. Contrary to its user-facing name, it is internally referred to as
>> > "node_reclaim_mode". This is slightly confusing but there is not much we can
>> > do given that it has already been exposed to userspace (since at least 2.6).
>> >
>> > However, what we can do is to make sure the internal description of what the
>> > bits inside zone_reclaim_mode aligns with what it does in practice.
>> > Setting RECLAIM_ZONE does indeed run shrink_inactive_list, but a more holistic
>> > description would be to explain that zone reclaim modulates whether page
>> > allocation (and khugepaged collapsing) prefers reclaiming & attempting to
>> > allocate locally or should fall back to the next node in the zonelist.
>> >
>> > Change the description to clarify what zone reclaim entails.
>> >
>> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> > ---
>> >  include/uapi/linux/mempolicy.h | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> > index 1f9bb10d1a47..24083809d920 100644
>> > --- a/include/uapi/linux/mempolicy.h
>> > +++ b/include/uapi/linux/mempolicy.h
>> > @@ -69,7 +69,7 @@ enum {
>> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> >   * ABI.  New bits are OK, but existing bits can never change.
>> >   */
>> > -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
>> > +#define RECLAIM_ZONE	(1<<0)	/* Prefer reclaiming & allocating locally */
>> >  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
>> >  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
>> >  
>> >
>> > base-commit: 25fae0b93d1d7ddb25958bcb90c3c0e5e0e202bd
>
> Hi Ying, thanks for your review, as always!
>
>> Please consider the document of zone_reclaim_mode in
>> Documentation/admin-guide/sysctl/vm.rst too.
>
> Yes, will do. Along with SJ's comment, I think that the information in the
> admin-guide should be sufficient enough to explain what these bits do, so
> I think my patch is not very necessary.
>
>> And, IIUC, RECLAIM_ZONE doesn't mean "locally" exactly.  It's legal to
>> bind to some node other than "local node".
>
> You are correct, it seems you can also reclaim on non-local nodes once you
> go further down in the zonelist. I think my intent with the new comment was just
> to indicate a preference to reclaim and allocate on the *current* node, as
> opposed to falling back to the next node in the zonelist.
>
> With that said, I think your comment along with SJ's feedback have gotten me
> to understand that we probably don't need this change :-)

TBH, I think it's good to make some changes to the comments, because
IMHO the original comments are bound to specific implementation
details.  More general wording may be better for the user space API
description.

---
Best Regards,
Huang, Ying



* Re: [PATCH] mempolicy: Clarify what RECLAIM_ZONE means
  2025-07-29  0:58     ` Huang, Ying
@ 2025-07-30 20:19       ` Joshua Hahn
  2025-07-31  1:48         ` Huang, Ying
  0 siblings, 1 reply; 9+ messages in thread
From: Joshua Hahn @ 2025-07-30 20:19 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Andrew Morton, David Hildenbrand, Johannes Weiner, Zi Yan,
	Matthew Brost, Rakie Kim, Byungchul Park, Gregory Price,
	Alistair Popple, linux-kernel, linux-mm, kernel-team, Dave Hansen

On Tue, 29 Jul 2025 08:58:49 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:

> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> 
> > On Mon, 28 Jul 2025 09:44:06 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
> >
> >> Hi, Joshua,
> >> 
> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> >> 
> >> > The zone_reclaim_mode API controls reclaim behavior when a node runs out of
> >> > memory. Contrary to its user-facing name, it is internally referred to as
> >> > "node_reclaim_mode". This is slightly confusing but there is not much we can
> >> > do given that it has already been exposed to userspace (since at least 2.6).
> >> >
> >> > However, what we can do is to make sure the internal description of what the
> >> > bits inside zone_reclaim_mode aligns with what it does in practice.
> >> > Setting RECLAIM_ZONE does indeed run shrink_inactive_list, but a more holistic
> >> > description would be to explain that zone reclaim modulates whether page
> >> > allocation (and khugepaged collapsing) prefers reclaiming & attempting to
> >> > allocate locally or should fall back to the next node in the zonelist.
> >> >
> >> > Change the description to clarify what zone reclaim entails.
> >> >
> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> >> > ---
> >> >  include/uapi/linux/mempolicy.h | 2 +-
> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >
> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> >> > index 1f9bb10d1a47..24083809d920 100644
> >> > --- a/include/uapi/linux/mempolicy.h
> >> > +++ b/include/uapi/linux/mempolicy.h
> >> > @@ -69,7 +69,7 @@ enum {
> >> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> >   * ABI.  New bits are OK, but existing bits can never change.
> >> >   */
> >> > -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
> >> > +#define RECLAIM_ZONE	(1<<0)	/* Prefer reclaiming & allocating locally */
> >> >  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
> >> >  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
> >> >  
> >> >
> >> > base-commit: 25fae0b93d1d7ddb25958bcb90c3c0e5e0e202bd
> >
> > Hi Ying, thanks for your review, as always!
> >
> >> Please consider the document of zone_reclaim_mode in
> >> Documentation/admin-guide/sysctl/vm.rst too.
> >
> > Yes, will do. Along with SJ's comment, I think that the information in the
> > admin-guide should be sufficient enough to explain what these bits do, so
> > I think my patch is not very necessary.
> >
> >> And, IIUC, RECLAIM_ZONE doesn't mean "locally" exactly.  It's legal to
> >> bind to some node other than "local node".
> >
> > You are correct, it seems you can also reclaim on non-local nodes once you
> > go further down in the zonelist. I think my intent with the new comment was just
> > to indicate a preference to reclaim and allocate on the *current* node, as
> > opposed to falling back to the next node in the zonelist.
> >
> > With that said, I think your comment along with SJ's feedback have gotten me
> > to understand that we probably don't need this change :-)
> 
> TBH, I think that it's good to make some change to the comments.
> Because IMHO, the original comments are bound to some specific
> implementation details.  Some more general words may be better for the
> user space API description.

Hi Ying, sorry for the late reply.

I think that is a good point. Then maybe in that case, we can take SJ's
suggestion and keep information about both the implementation detail (i.e. that
it will run shrink_inactive_list on the zone), and that it will prefer this
over allocating on the next node as a general description of what happens?

On that note, one thing that I felt was slightly undercaptured in
Documentation/admin-guide is what "zone reclaim" actually means. What it does
is of course well captured by its name, but it misses the nuance of preferring
reclaim over fallback allocation.

Actually the whole motivation behind all of this conversation is because I saw
zone reclaim preventing allocation into a second node in a 2-NUMA node system
and was a bit confused until I understood what the implication of having
zone reclaim was.

Anyways, I can probably spin the patch to include information about what
zone reclaim is, in the comment block above the bits.

But please feel free to correct me if you feel that the descriptions available
in the mempolicy.h uapi file and Documentation/admin-guide are already
enough.

Thanks for the review as always, Ying. Have a great day!
Joshua

> ---
> Best Regards,
> Huang, Ying
> 

Sent using hkml (https://github.com/sjp38/hackermail)



* Re: [PATCH] mempolicy: Clarify what RECLAIM_ZONE means
  2025-07-30 20:19       ` Joshua Hahn
@ 2025-07-31  1:48         ` Huang, Ying
  2025-07-31 18:45           ` SeongJae Park
  0 siblings, 1 reply; 9+ messages in thread
From: Huang, Ying @ 2025-07-31  1:48 UTC (permalink / raw)
  To: Joshua Hahn
  Cc: Andrew Morton, David Hildenbrand, Johannes Weiner, Zi Yan,
	Matthew Brost, Rakie Kim, Byungchul Park, Gregory Price,
	Alistair Popple, linux-kernel, linux-mm, kernel-team, Dave Hansen

Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> On Tue, 29 Jul 2025 08:58:49 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>
>> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> 
>> > On Mon, 28 Jul 2025 09:44:06 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>> >
>> >> Hi, Joshua,
>> >> 
>> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> >> 
>> >> > The zone_reclaim_mode API controls reclaim behavior when a node runs out of
>> >> > memory. Contrary to its user-facing name, it is internally referred to as
>> >> > "node_reclaim_mode". This is slightly confusing but there is not much we can
>> >> > do given that it has already been exposed to userspace (since at least 2.6).
>> >> >
>> >> > However, what we can do is to make sure the internal description of what the
>> >> > bits inside zone_reclaim_mode aligns with what it does in practice.
>> >> > Setting RECLAIM_ZONE does indeed run shrink_inactive_list, but a more holistic
>> >> > description would be to explain that zone reclaim modulates whether page
>> >> > allocation (and khugepaged collapsing) prefers reclaiming & attempting to
>> >> > allocate locally or should fall back to the next node in the zonelist.
>> >> >
>> >> > Change the description to clarify what zone reclaim entails.
>> >> >
>> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> >> > ---
>> >> >  include/uapi/linux/mempolicy.h | 2 +-
>> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> >> > index 1f9bb10d1a47..24083809d920 100644
>> >> > --- a/include/uapi/linux/mempolicy.h
>> >> > +++ b/include/uapi/linux/mempolicy.h
>> >> > @@ -69,7 +69,7 @@ enum {
>> >> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> >> >   * ABI.  New bits are OK, but existing bits can never change.
>> >> >   */
>> >> > -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
>> >> > +#define RECLAIM_ZONE	(1<<0)	/* Prefer reclaiming & allocating locally */
>> >> >  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
>> >> >  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
>> >> >  
>> >> >
>> >> > base-commit: 25fae0b93d1d7ddb25958bcb90c3c0e5e0e202bd
>> >
>> > Hi Ying, thanks for your review, as always!
>> >
>> >> Please consider the document of zone_reclaim_mode in
>> >> Documentation/admin-guide/sysctl/vm.rst too.
>> >
>> > Yes, will do. Along with SJ's comment, I think that the information in the
>> > admin-guide should be sufficient enough to explain what these bits do, so
>> > I think my patch is not very necessary.
>> >
>> >> And, IIUC, RECLAIM_ZONE doesn't mean "locally" exactly.  It's legal to
>> >> bind to some node other than "local node".
>> >
>> > You are correct, it seems you can also reclaim on non-local nodes once you
>> > go further down in the zonelist. I think my intent with the new comment was just
>> > to indicate a preference to reclaim and allocate on the *current* node, as
>> > opposed to falling back to the next node in the zonelist.
>> >
>> > With that said, I think your comment along with SJ's feedback have gotten me
>> > to understand that we probably don't need this change :-)
>> 
>> TBH, I think that it's good to make some change to the comments.
>> Because IMHO, the original comments are bound to some specific
>> implementation details.  Some more general words may be better for the
>> user space API description.
>
> Hi Ying, sorry for the late reply.
>
> I think that is a good point. Then maybe in that case, we can take SJ's comment
> and leave information about both the implementation detail (i.e. that it will
> perform shrink inactive_list on the zone), and that it will prefer this over
> allocating on the next node as a general description of what happens?

Yes.  Something like this, or

Try to reclaim in the current node/zone before allocating on the fallback.

> On that note, one thing that I felt was slightly undercaptured in
> Documentation/admin-guide is what "zone reclaim" actually means. What it does
> is of course well captured by its name, but it misses the nuance of preferring
> reclaim over fallback allocation.
>
> Actually the whole motivation behind all of this conversation is because I saw
> zone reclaim preventing allocation into a second node in a 2-NUMA node system
> and was a bit confused until I understood what the implication of having
> zone reclaim was.

Yes.  It's good to improve the document.  If it confused you, it
may confuse others too.

> Anyways, I can probably spin the patch to include information about what
> zone reclaim is, in the comment block above the bits.
>
> But please feel free to correct me if you feel that the descriptions available
> in both the mempolicy.h uapi file or the Documentation/admin-guide is already
> enough.

Thanks for doing this.

---
Best Regards,
Huang, Ying



* Re: [PATCH] mempolicy: Clarify what RECLAIM_ZONE means
  2025-07-31  1:48         ` Huang, Ying
@ 2025-07-31 18:45           ` SeongJae Park
  0 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2025-07-31 18:45 UTC (permalink / raw)
  To: Huang, Ying
  Cc: SeongJae Park, Joshua Hahn, Andrew Morton, David Hildenbrand,
	Johannes Weiner, Zi Yan, Matthew Brost, Rakie Kim, Byungchul Park,
	Gregory Price, Alistair Popple, linux-kernel, linux-mm,
	kernel-team, Dave Hansen

On Thu, 31 Jul 2025 09:48:54 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:

> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
[...]
> > On that note, one thing that I felt was slightly undercaptured in
> > Documentation/admin-guide is what "zone reclaim" actually means. What it does
> > is of course well captured by its name, but it misses the nuance of preferring
> > reclaim over fallback allocation.
> >
> > Actually the whole motivation behind all of this conversation is because I saw
> > zone reclaim preventing allocation into a second node in a 2-NUMA node system
> > and was a bit confused until I understood what the implication of having
> > zone reclaim was.
> 
> Yes.  It's good to improve the document.  If it confused you, it
> may confuse others too.

+1


Thanks,
SJ

[...]



Thread overview: 9+ messages
2025-07-25 17:35 [PATCH] mempolicy: Clarify what RECLAIM_ZONE means Joshua Hahn
2025-07-25 21:44 ` SeongJae Park
2025-07-26  1:24   ` Joshua Hahn
2025-07-28  1:44 ` Huang, Ying
2025-07-28 14:51   ` Joshua Hahn
2025-07-29  0:58     ` Huang, Ying
2025-07-30 20:19       ` Joshua Hahn
2025-07-31  1:48         ` Huang, Ying
2025-07-31 18:45           ` SeongJae Park
