linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] mempolicy: Clarify what zone reclaim means
@ 2025-07-31 21:07 Joshua Hahn
  2025-07-31 22:41 ` SeongJae Park
  2025-08-01  0:59 ` Huang, Ying
  0 siblings, 2 replies; 10+ messages in thread
From: Joshua Hahn @ 2025-07-31 21:07 UTC (permalink / raw)
  To: Andrew Morton, SeongJae Park, Ying Huang
  Cc: David Hildenbrand, Zi Yan, Johannes Weiner, Matthew Brost,
	Rakie Kim, Byungchul Park, Gregory Price, Alistair Popple,
	linux-kernel, linux-mm, kernel-team

The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
memory. Contrary to its user-facing name, it is internally referred to as
"node_reclaim_mode".

This can be confusing. But because we cannot change the name of the API since
it has been in place since at least 2.6, let's try to be more explicit about
what the behavior of this API is. 

Change the description to clarify what zone reclaim entails, and be explicit
about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
past already [1] [2].

[1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
[2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/

Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
 include/uapi/linux/mempolicy.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 1f9bb10d1a47..6c9c9385ff89 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -66,10 +66,16 @@ enum {
 #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
 
 /*
+ * Enabling zone reclaim means the page allocator will attempt to fulfill
+ * the allocation request on the current node by triggering reclaim and
+ * trying to shrink the current node.
+ * Fallback allocations on the next candidates in the zonelist are considered
+ * zone when reclaim fails to free up enough memory in the current node/zone.
+ *
  * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
  * ABI.  New bits are OK, but existing bits can never change.
  */
-#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
+#define RECLAIM_ZONE	(1<<0)	/* Enable zone reclaim */
 #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
 #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
 

base-commit: 260f6f4fda93c8485c8037865c941b42b9cba5d2
-- 
2.47.3

* Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
  2025-07-31 21:07 [PATCH v2] mempolicy: Clarify what zone reclaim means Joshua Hahn
@ 2025-07-31 22:41 ` SeongJae Park
  2025-08-01  9:04   ` David Hildenbrand
  2025-08-01  0:59 ` Huang, Ying
  1 sibling, 1 reply; 10+ messages in thread
From: SeongJae Park @ 2025-07-31 22:41 UTC (permalink / raw)
  To: Joshua Hahn
  Cc: SeongJae Park, Andrew Morton, Ying Huang, David Hildenbrand,
	Zi Yan, Johannes Weiner, Matthew Brost, Rakie Kim, Byungchul Park,
	Gregory Price, Alistair Popple, linux-kernel, linux-mm,
	kernel-team

On Thu, 31 Jul 2025 14:07:37 -0700 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:

> The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> memory. Contrary to its user-facing name, it is internally referred to as
> "node_reclaim_mode".
> 
> This can be confusing. But because we cannot change the name of the API since
> it has been in place since at least 2.6, let's try to be more explicit about
> what the behavior of this API is. 
> 
> Change the description to clarify what zone reclaim entails, and be explicit
> about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> past already [1] [2].
> 
> [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> 
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
>  include/uapi/linux/mempolicy.h | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> index 1f9bb10d1a47..6c9c9385ff89 100644
> --- a/include/uapi/linux/mempolicy.h
> +++ b/include/uapi/linux/mempolicy.h
> @@ -66,10 +66,16 @@ enum {
>  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>  
>  /*
> + * Enabling zone reclaim means the page allocator will attempt to fulfill
> + * the allocation request on the current node by triggering reclaim and
> + * trying to shrink the current node.
> + * Fallback allocations on the next candidates in the zonelist are considered
> + * zone when reclaim fails to free up enough memory in the current node/zone.

s/zone when reclaim fails/when reclaim fails/ ?

> + *
>   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>   * ABI.  New bits are OK, but existing bits can never change.
>   */
> -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
> +#define RECLAIM_ZONE	(1<<0)	/* Enable zone reclaim */
>  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
>  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
>  
> 
> base-commit: 260f6f4fda93c8485c8037865c941b42b9cba5d2
> -- 
> 2.47.3
> 

Other than the above trivial thing,

Acked-by: SeongJae Park <sj@kernel.org>


Thanks,
SJ

* Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
  2025-07-31 21:07 [PATCH v2] mempolicy: Clarify what zone reclaim means Joshua Hahn
  2025-07-31 22:41 ` SeongJae Park
@ 2025-08-01  0:59 ` Huang, Ying
  2025-08-01 14:48   ` Joshua Hahn
  1 sibling, 1 reply; 10+ messages in thread
From: Huang, Ying @ 2025-08-01  0:59 UTC (permalink / raw)
  To: Joshua Hahn
  Cc: Andrew Morton, SeongJae Park, David Hildenbrand, Zi Yan,
	Johannes Weiner, Matthew Brost, Rakie Kim, Byungchul Park,
	Gregory Price, Alistair Popple, linux-kernel, linux-mm,
	kernel-team

Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> memory. Contrary to its user-facing name, it is internally referred to as
> "node_reclaim_mode".
>
> This can be confusing. But because we cannot change the name of the API since
> it has been in place since at least 2.6, let's try to be more explicit about
> what the behavior of this API is. 
>
> Change the description to clarify what zone reclaim entails, and be explicit
> about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> past already [1] [2].
>
> [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
>  include/uapi/linux/mempolicy.h | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> index 1f9bb10d1a47..6c9c9385ff89 100644
> --- a/include/uapi/linux/mempolicy.h
> +++ b/include/uapi/linux/mempolicy.h
> @@ -66,10 +66,16 @@ enum {
>  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>  
>  /*
> + * Enabling zone reclaim means the page allocator will attempt to fulfill
> + * the allocation request on the current node by triggering reclaim and
> + * trying to shrink the current node.
> + * Fallback allocations on the next candidates in the zonelist are considered
> + * zone when reclaim fails to free up enough memory in the current node/zone.
> + *
>   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>   * ABI.  New bits are OK, but existing bits can never change.

As far as I know, sysctl isn't considered kernel ABI now.  So, change
this line too?

>   */
> -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
> +#define RECLAIM_ZONE	(1<<0)	/* Enable zone reclaim */
>  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
>  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
>  
>
> base-commit: 260f6f4fda93c8485c8037865c941b42b9cba5d2

---
Best Regards,
Huang, Ying

* Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
  2025-07-31 22:41 ` SeongJae Park
@ 2025-08-01  9:04   ` David Hildenbrand
  2025-08-01 14:50     ` Joshua Hahn
  0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2025-08-01  9:04 UTC (permalink / raw)
  To: SeongJae Park, Joshua Hahn
  Cc: Andrew Morton, Ying Huang, Zi Yan, Johannes Weiner, Matthew Brost,
	Rakie Kim, Byungchul Park, Gregory Price, Alistair Popple,
	linux-kernel, linux-mm, kernel-team

On 01.08.25 00:41, SeongJae Park wrote:
> On Thu, 31 Jul 2025 14:07:37 -0700 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> 
>> The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
>> memory. Contrary to its user-facing name, it is internally referred to as
>> "node_reclaim_mode".
>>
>> This can be confusing. But because we cannot change the name of the API since
>> it has been in place since at least 2.6, let's try to be more explicit about
>> what the behavior of this API is.
>>
>> Change the description to clarify what zone reclaim entails, and be explicit
>> about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
>> past already [1] [2].
>>
>> [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
>> [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
>>
>> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> ---
>>   include/uapi/linux/mempolicy.h | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> index 1f9bb10d1a47..6c9c9385ff89 100644
>> --- a/include/uapi/linux/mempolicy.h
>> +++ b/include/uapi/linux/mempolicy.h
>> @@ -66,10 +66,16 @@ enum {
>>   #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>>   
>>   /*
>> + * Enabling zone reclaim means the page allocator will attempt to fulfill
>> + * the allocation request on the current node by triggering reclaim and
>> + * trying to shrink the current node.
>> + * Fallback allocations on the next candidates in the zonelist are considered
>> + * zone when reclaim fails to free up enough memory in the current node/zone.
> 
> s/zone when reclaim fails/when reclaim fails/ ?

Agreed, that confused me as well.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb


* Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
  2025-08-01  0:59 ` Huang, Ying
@ 2025-08-01 14:48   ` Joshua Hahn
  2025-08-04  1:24     ` Huang, Ying
  0 siblings, 1 reply; 10+ messages in thread
From: Joshua Hahn @ 2025-08-01 14:48 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Andrew Morton, SeongJae Park, David Hildenbrand, Zi Yan,
	Johannes Weiner, Matthew Brost, Rakie Kim, Byungchul Park,
	Gregory Price, Alistair Popple, linux-kernel, linux-mm,
	kernel-team

On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:

> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> 
> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> > memory. Contrary to its user-facing name, it is internally referred to as
> > "node_reclaim_mode".
> >
> > This can be confusing. But because we cannot change the name of the API since
> > it has been in place since at least 2.6, let's try to be more explicit about
> > what the behavior of this API is. 
> >
> > Change the description to clarify what zone reclaim entails, and be explicit
> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> > past already [1] [2].
> >
> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> > ---
> >  include/uapi/linux/mempolicy.h | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> > index 1f9bb10d1a47..6c9c9385ff89 100644
> > --- a/include/uapi/linux/mempolicy.h
> > +++ b/include/uapi/linux/mempolicy.h
> > @@ -66,10 +66,16 @@ enum {
> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> >  
> >  /*
> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
> > + * the allocation request on the current node by triggering reclaim and
> > + * trying to shrink the current node.
> > + * Fallback allocations on the next candidates in the zonelist are considered
> > + * zone when reclaim fails to free up enough memory in the current node/zone.
> > + *
> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >   * ABI.  New bits are OK, but existing bits can never change.
> 
> As far as I know, sysctl isn't considered kernel ABI now.  So, change
> this line too?

Hi Ying, 

Thank you for reviewing this patch!

I didn't know that sysctl isn't considered a kernel ABI. If I understand your
suggestion correctly, I can rephrase the comment block above to something like this?

- * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
- * ABI. New bits are OK, but existing bits can never change.
+ * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
+ * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
+ * can never change.

Thanks again for your review Ying, I hope you have a good day : -)
Joshua

Sent using hkml (https://github.com/sjp38/hackermail)

* Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
  2025-08-01  9:04   ` David Hildenbrand
@ 2025-08-01 14:50     ` Joshua Hahn
  0 siblings, 0 replies; 10+ messages in thread
From: Joshua Hahn @ 2025-08-01 14:50 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: SeongJae Park, Andrew Morton, Ying Huang, Zi Yan, Johannes Weiner,
	Matthew Brost, Rakie Kim, Byungchul Park, Gregory Price,
	Alistair Popple, linux-kernel, linux-mm, kernel-team

On Fri, 1 Aug 2025 11:04:00 +0200 David Hildenbrand <david@redhat.com> wrote:

> On 01.08.25 00:41, SeongJae Park wrote:
> > On Thu, 31 Jul 2025 14:07:37 -0700 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> > 
> >> The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> >> memory. Contrary to its user-facing name, it is internally referred to as
> >> "node_reclaim_mode".
> >>
> >> This can be confusing. But because we cannot change the name of the API since
> >> it has been in place since at least 2.6, let's try to be more explicit about
> >> what the behavior of this API is.
> >>
> >> Change the description to clarify what zone reclaim entails, and be explicit
> >> about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> >> past already [1] [2].
> >>
> >> [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> >> [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >>
> >> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> >> ---
> >>   include/uapi/linux/mempolicy.h | 8 +++++++-
> >>   1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> >> index 1f9bb10d1a47..6c9c9385ff89 100644
> >> --- a/include/uapi/linux/mempolicy.h
> >> +++ b/include/uapi/linux/mempolicy.h
> >> @@ -66,10 +66,16 @@ enum {
> >>   #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> >>   
> >>   /*
> >> + * Enabling zone reclaim means the page allocator will attempt to fulfill
> >> + * the allocation request on the current node by triggering reclaim and
> >> + * trying to shrink the current node.
> >> + * Fallback allocations on the next candidates in the zonelist are considered
> >> + * zone when reclaim fails to free up enough memory in the current node/zone.
> > 
> > s/zone when reclaim fails/when reclaim fails/ ?
> 
> Agreed, that confused me as well.

Hi David, hi SJ!

Thank you both for catching this, I definitely missed this before sending the
patch out. Will fix in the next version!

> Acked-by: David Hildenbrand <david@redhat.com>

And thank you for your Ack : -) Have a great day!
Joshua

Sent using hkml (https://github.com/sjp38/hackermail)

* Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
  2025-08-01 14:48   ` Joshua Hahn
@ 2025-08-04  1:24     ` Huang, Ying
  2025-08-04 14:41       ` Joshua Hahn
  0 siblings, 1 reply; 10+ messages in thread
From: Huang, Ying @ 2025-08-04  1:24 UTC (permalink / raw)
  To: Joshua Hahn
  Cc: Andrew Morton, SeongJae Park, David Hildenbrand, Zi Yan,
	Johannes Weiner, Matthew Brost, Rakie Kim, Byungchul Park,
	Gregory Price, Alistair Popple, linux-kernel, linux-mm,
	kernel-team

Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>
>> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> 
>> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
>> > memory. Contrary to its user-facing name, it is internally referred to as
>> > "node_reclaim_mode".
>> >
>> > This can be confusing. But because we cannot change the name of the API since
>> > it has been in place since at least 2.6, let's try to be more explicit about
>> > what the behavior of this API is. 
>> >
>> > Change the description to clarify what zone reclaim entails, and be explicit
>> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
>> > past already [1] [2].
>> >
>> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
>> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
>> >
>> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> > ---
>> >  include/uapi/linux/mempolicy.h | 8 +++++++-
>> >  1 file changed, 7 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> > index 1f9bb10d1a47..6c9c9385ff89 100644
>> > --- a/include/uapi/linux/mempolicy.h
>> > +++ b/include/uapi/linux/mempolicy.h
>> > @@ -66,10 +66,16 @@ enum {
>> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>> >  
>> >  /*
>> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
>> > + * the allocation request on the current node by triggering reclaim and
>> > + * trying to shrink the current node.
>> > + * Fallback allocations on the next candidates in the zonelist are considered
>> > + * zone when reclaim fails to free up enough memory in the current node/zone.
>> > + *
>> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> >   * ABI.  New bits are OK, but existing bits can never change.
>> 
>> As far as I know, sysctl isn't considered kernel ABI now.  So, change
>> this line too?
>
> Hi Ying, 
>
> Thank you for reviewing this patch!
>
> I didn't know that sysctl isn't considered a kernel ABI. If I understand your
> suggestion correctly, I can rephrase the comment block above to something like this?
>
> - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> - * ABI. New bits are OK, but existing bits can never change.
> + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
> + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
> + * can never change.

Because it's not an ABI, I think that we could avoid saying "never".

> Thanks again for your review Ying, I hope you have a good day : -)

Welcome!  You too!

With some trivial tweak, please feel free to add my

Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>

in the future version.

---
Best Regards,
Huang, Ying

* Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
  2025-08-04  1:24     ` Huang, Ying
@ 2025-08-04 14:41       ` Joshua Hahn
  2025-08-05  1:27         ` Huang, Ying
  0 siblings, 1 reply; 10+ messages in thread
From: Joshua Hahn @ 2025-08-04 14:41 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Andrew Morton, SeongJae Park, David Hildenbrand, Zi Yan,
	Johannes Weiner, Matthew Brost, Rakie Kim, Byungchul Park,
	Gregory Price, Alistair Popple, linux-kernel, linux-mm,
	kernel-team

On Mon, 04 Aug 2025 09:24:31 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:

> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> 
> > On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
> >
> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> >> 
> >> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> >> > memory. Contrary to its user-facing name, it is internally referred to as
> >> > "node_reclaim_mode".
> >> >
> >> > This can be confusing. But because we cannot change the name of the API since
> >> > it has been in place since at least 2.6, let's try to be more explicit about
> >> > what the behavior of this API is. 
> >> >
> >> > Change the description to clarify what zone reclaim entails, and be explicit
> >> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> >> > past already [1] [2].
> >> >
> >> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> >> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >> >
> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> >> > ---
> >> >  include/uapi/linux/mempolicy.h | 8 +++++++-
> >> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> >> > index 1f9bb10d1a47..6c9c9385ff89 100644
> >> > --- a/include/uapi/linux/mempolicy.h
> >> > +++ b/include/uapi/linux/mempolicy.h
> >> > @@ -66,10 +66,16 @@ enum {
> >> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> >> >  
> >> >  /*
> >> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
> >> > + * the allocation request on the current node by triggering reclaim and
> >> > + * trying to shrink the current node.
> >> > + * Fallback allocations on the next candidates in the zonelist are considered
> >> > + * zone when reclaim fails to free up enough memory in the current node/zone.
> >> > + *
> >> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> >   * ABI.  New bits are OK, but existing bits can never change.
> >> 
> >> As far as I know, sysctl isn't considered kernel ABI now.  So, change
> >> this line too?
> >
> > Hi Ying, 
> >
> > Thank you for reviewing this patch!
> >
> > I didn't know that sysctl isn't considered a kernel ABI. If I understand your
> > suggestion correctly, I can rephrase the comment block above to something like this?
> >
> > - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> > - * ABI. New bits are OK, but existing bits can never change.
> > + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
> > + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
> > + * can never change.

Hi Ying,

> Because it's not an ABI, I think that we could avoid saying "never".

My personal opinion is that we should keep this warning, since there has
already been an example before where a developer tried to remove this bit [1],
and this broke some behavior for userspace configurations. However, if I
understand your comment correctly, you are suggesting that we should change
the wording to not include "never", since sysctls are no longer an ABI (and
therefore we should be OK to change what the values mean?)

If that is the case, then I can send in another patch since I think the goals
are a bit different for the two patches. With that said, I think we should
keep the warning just to avoid any breakages in userspace, even if sysctl
might not be considered an ABI anymore (also I must have missed this, I didn't
know this at all!)

> > Thanks again for your review Ying, I hope you have a good day : -)
> 
> Welcome!  You too!
> 
> With some trivial tweak, please feel free to add my
> 
> Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
> 
> in the future version.

Thank you for your review Ying! Since there is a question remaining about what
to do with the "never" statement, I will wait to send out a v3 with your
review : -) 

Have a great day!
Joshua

[1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/

Sent using hkml (https://github.com/sjp38/hackermail)

* Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
  2025-08-04 14:41       ` Joshua Hahn
@ 2025-08-05  1:27         ` Huang, Ying
  2025-08-05 20:03           ` Joshua Hahn
  0 siblings, 1 reply; 10+ messages in thread
From: Huang, Ying @ 2025-08-05  1:27 UTC (permalink / raw)
  To: Joshua Hahn
  Cc: Andrew Morton, SeongJae Park, David Hildenbrand, Zi Yan,
	Johannes Weiner, Matthew Brost, Rakie Kim, Byungchul Park,
	Gregory Price, Alistair Popple, linux-kernel, linux-mm,
	kernel-team, Dave Hansen

Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> On Mon, 04 Aug 2025 09:24:31 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>
>> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> 
>> > On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>> >
>> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> >> 
>> >> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
>> >> > memory. Contrary to its user-facing name, it is internally referred to as
>> >> > "node_reclaim_mode".
>> >> >
>> >> > This can be confusing. But because we cannot change the name of the API since
>> >> > it has been in place since at least 2.6, let's try to be more explicit about
>> >> > what the behavior of this API is. 
>> >> >
>> >> > Change the description to clarify what zone reclaim entails, and be explicit
>> >> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
>> >> > past already [1] [2].
>> >> >
>> >> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
>> >> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
>> >> >
>> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> >> > ---
>> >> >  include/uapi/linux/mempolicy.h | 8 +++++++-
>> >> >  1 file changed, 7 insertions(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> >> > index 1f9bb10d1a47..6c9c9385ff89 100644
>> >> > --- a/include/uapi/linux/mempolicy.h
>> >> > +++ b/include/uapi/linux/mempolicy.h
>> >> > @@ -66,10 +66,16 @@ enum {
>> >> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>> >> >  
>> >> >  /*
>> >> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
>> >> > + * the allocation request on the current node by triggering reclaim and
>> >> > + * trying to shrink the current node.
>> >> > + * Fallback allocations on the next candidates in the zonelist are considered
>> >> > + * zone when reclaim fails to free up enough memory in the current node/zone.
>> >> > + *
>> >> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> >> >   * ABI.  New bits are OK, but existing bits can never change.
>> >> 
>> >> As far as I know, sysctl isn't considered kernel ABI now.  So, change
>> >> this line too?
>> >
>> > Hi Ying, 
>> >
>> > Thank you for reviewing this patch!
>> >
>> > I didn't know that sysctl isn't considered a kernel ABI. If I understand your
>> > suggestion correctly, I can rephrase the comment block above to something like this?
>> >
>> > - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> > - * ABI. New bits are OK, but existing bits can never change.
>> > + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
>> > + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
>> > + * can never change.
>
> Hi Ying,
>
>> Because it's not an ABI, I think that we could avoid saying "never".
>
> My personal opinion is that we should keep this warning, since there has
> already been an example before where a developer tried to remove this bit [1],
> and this broke some behavior for userspace configurations. However, if I
> understand your comment correctly, you are suggesting that we should change
> the wording to not include "never", since sysctls are no longer an ABI (and
> therefore we should be OK to change what the values mean?)
>
> If that is the case, then I can send in another patch since I think the goals
> are a bit different for the two patches. With that said, I think we should
> keep the warning just to avoid any breakages in userspace, even if sysctl
> might not be considered an ABI anymore (also I must have missed this, I didn't
> know this at all!)

Sorry for the confusion.  I agree that we shouldn't change the sysctl
interface in most cases.  I just thought that we could soften the
wording a little?  For example,

New bits are OK, but existing bits shouldn't be changed.

I think that it's still clear that we don't want to change the existing
bits.

However, my English is poor.  So, my suggestion may not make sense.

>> > Thanks again for your review Ying, I hope you have a good day : -)
>> 
>> Welcome!  You too!
>> 
>> With some trivial tweak, please feel free to add my
>> 
>> Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
>> 
>> in the future version.
>
> Thank you for your review Ying! Since there is a question remaining about what
> to do with the "never" statement, I will wait to send out a v3 with your
> review : -) 

---
Best Regards,
Huang, Ying

* Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
  2025-08-05  1:27         ` Huang, Ying
@ 2025-08-05 20:03           ` Joshua Hahn
  0 siblings, 0 replies; 10+ messages in thread
From: Joshua Hahn @ 2025-08-05 20:03 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Andrew Morton, SeongJae Park, David Hildenbrand, Zi Yan,
	Johannes Weiner, Matthew Brost, Rakie Kim, Byungchul Park,
	Gregory Price, Alistair Popple, linux-kernel, linux-mm,
	kernel-team, Dave Hansen

On Tue, 05 Aug 2025 09:27:30 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:

> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> 
> > On Mon, 04 Aug 2025 09:24:31 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
> >
> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> >> 
> >> > On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
> >> >
> >> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> >> >> 
> >> >> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> >> >> > memory. Contrary to its user-facing name, it is internally referred to as
> >> >> > "node_reclaim_mode".
> >> >> >
> >> >> > This can be confusing. But because we cannot change the name of the API since
> >> >> > it has been in place since at least 2.6, let's try to be more explicit about
> >> >> > what the behavior of this API is. 
> >> >> >
> >> >> > Change the description to clarify what zone reclaim entails, and be explicit
> >> >> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> >> >> > past already [1] [2].
> >> >> >
> >> >> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> >> >> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >> >> >
> >> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> >> >> > ---
> >> >> >  include/uapi/linux/mempolicy.h | 8 +++++++-
> >> >> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >> >> >
> >> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> >> >> > index 1f9bb10d1a47..6c9c9385ff89 100644
> >> >> > --- a/include/uapi/linux/mempolicy.h
> >> >> > +++ b/include/uapi/linux/mempolicy.h
> >> >> > @@ -66,10 +66,16 @@ enum {
> >> >> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> >> >> >  
> >> >> >  /*
> >> >> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
> >> >> > + * the allocation request on the current node by triggering reclaim and
> >> >> > + * trying to shrink the current node.
> >> >> > + * Fallback allocations on the next candidates in the zonelist are considered
> >> >> > + * zone when reclaim fails to free up enough memory in the current node/zone.
> >> >> > + *
> >> >> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> >> >   * ABI.  New bits are OK, but existing bits can never change.
> >> >> 
> >> >> As far as I know, sysctl isn't considered kernel ABI now.  So, change
> >> >> this line too?
> >> >
> >> > Hi Ying, 
> >> >
> >> > Thank you for reviewing this patch!
> >> >
> >> > I didn't know that sysctl isn't considered a kernel ABI. If I understand your
> >> > suggestion correctly, I can rephrase the comment block above to something like this?
> >> >
> >> > - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> > - * ABI. New bits are OK, but existing bits can never change.
> >> > + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
> >> > + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
> >> > + * can never change.
> >
> > Hi Ying,
> >
> >> Because it's not an ABI, I think that we could avoid saying "never".
> >
> > My personal opinion is that we should keep this warning, since there has
> > already been an example before where a developer tried to remove this bit [1],
> > and this broke some behavior for userspace configurations. However, if I
> > understand your comment correctly, you are suggesting that we should change
> > the wording to not include "never", since sysctls are no longer an ABI (and
> > therefore we should be OK to change what the values mean?)
> >
> > If that is the case, then I can send in another patch since I think the goals
> > are a bit different for the two patches. With that said, I think we should
> > keep the warning just to avoid any breakages in userspace, even if sysctl
> > might not be considered an ABI anymore (also I must have missed this, I didn't
> > know this at all!)
> 
> Sorry for the confusion.  I agree that we shouldn't change the sysctl
> interface in most cases.  I just thought that we could soften the
> wording a little?  For example,
> 
> New bits are OK, but existing bits shouldn't be changed.
> 
> I think that it's still clear that we don't want to change the existing
> bits.
> 
> However, my English is poor.  So, my suggestion may not make sense.

Hi Ying, thank you again for the response!

No worries at all, it was my misunderstanding : -) This suggestion makes sense,
and I think it's small enough & relevant to the code block, so I'll fold
this change into my patch as well. I'll send out the next version shortly!

Have a great day!
Joshua

Sent using hkml (https://github.com/sjp38/hackermail)

Thread overview: 10+ messages
2025-07-31 21:07 [PATCH v2] mempolicy: Clarify what zone reclaim means Joshua Hahn
2025-07-31 22:41 ` SeongJae Park
2025-08-01  9:04   ` David Hildenbrand
2025-08-01 14:50     ` Joshua Hahn
2025-08-01  0:59 ` Huang, Ying
2025-08-01 14:48   ` Joshua Hahn
2025-08-04  1:24     ` Huang, Ying
2025-08-04 14:41       ` Joshua Hahn
2025-08-05  1:27         ` Huang, Ying
2025-08-05 20:03           ` Joshua Hahn
