From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Raghavendra K T <raghavendra.kt@amd.com>
Cc: <AneeshKumar.KizhakeVeetil@arm.com>, <Michael.Day@amd.com>,
<akpm@linux-foundation.org>, <bharata@amd.com>,
<dave.hansen@intel.com>, <david@redhat.com>,
<dongjoo.linux.dev@gmail.com>, <feng.tang@intel.com>,
<gourry@gourry.net>, <hannes@cmpxchg.org>, <honggyu.kim@sk.com>,
<hughd@google.com>, <jhubbard@nvidia.com>, <jon.grimm@amd.com>,
<k.shutemov@gmail.com>, <kbusch@meta.com>,
<kmanaouil.dev@gmail.com>, <leesuyeon0506@gmail.com>,
<leillc@google.com>, <liam.howlett@oracle.com>,
<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
<mgorman@techsingularity.net>, <mingo@redhat.com>,
<nadav.amit@gmail.com>, <nphamcs@gmail.com>,
<peterz@infradead.org>, <riel@surriel.com>, <rientjes@google.com>,
<rppt@kernel.org>, <santosh.shukla@amd.com>, <shivankg@amd.com>,
<shy828301@gmail.com>, <sj@kernel.org>, <vbabka@suse.cz>,
<weixugc@google.com>, <willy@infradead.org>,
<ying.huang@linux.alibaba.com>, <ziy@nvidia.com>,
<dave@stgolabs.net>, <yuanchu@google.com>, <kinseyho@google.com>,
<hdanton@sina.com>, <harry.yoo@oracle.com>
Subject: Re: [RFC PATCH V3 07/17] mm: Add throttling of mm scanning using scan_period
Date: Thu, 2 Oct 2025 17:24:15 +0100
Message-ID: <20251002172415.00003140@huawei.com>
In-Reply-To: <20250814153307.1553061-8-raghavendra.kt@amd.com>

On Thu, 14 Aug 2025 15:32:57 +0000
Raghavendra K T <raghavendra.kt@amd.com> wrote:
> Before this patch, scanning of tasks' mm is done continuously and also
> at the same rate.
>
> Improve that by adding a throttling logic:
> 1) If there were useful pages found during last scan and current scan,
> decrease the scan_period (to increase scan rate) by TUNE_PERCENT (15%).
>
> 2) If there were no useful pages found in last scan, and there are
> candidate migration pages in the current scan decrease the scan_period
> aggressively by 2 power SCAN_CHANGE_SCALE (2^3 = 8 now).
Explain why those values were chosen.
>
> Vice versa is done for the reverse case.
> Scan period is clamped between MIN (600ms) and MAX (5sec).
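To make the question about the constants concrete, here is a minimal user-space sketch of the four cases, assuming TUNE_PERCENT = 15 and CHANGE_SCALE = 3 as stated in the commit message (their #defines are not visible in the quoted hunk). The function and macro names are hypothetical stand-ins, and the decay and ignore-threshold steps are left out. With the 600ms..5000ms clamp, a shift by 3 saturates from almost any starting period.

```c
#include <assert.h>

/* Limits from the quoted #defines; tuning values from the commit message. */
#define SCAN_PERIOD_MIN_MS 600U
#define SCAN_PERIOD_MAX_MS 5000U
#define TUNE_PERCENT       15U	/* assumed value of SCAN_PERIOD_TUNE_PERCENT */
#define CHANGE_SCALE       3U	/* assumed value of SCAN_PERIOD_CHANGE_SCALE */

/* Hypothetical helper: keep the period inside [MIN, MAX]. */
static unsigned int clamp_period(unsigned int period_ms)
{
	if (period_ms < SCAN_PERIOD_MIN_MS)
		return SCAN_PERIOD_MIN_MS;
	if (period_ms > SCAN_PERIOD_MAX_MS)
		return SCAN_PERIOD_MAX_MS;
	return period_ms;
}

/*
 * prev: useful pages found in the last scan (X in the patch comment)
 * cur:  useful pages found in the current scan (Y in the patch comment)
 */
static unsigned int tune_scan_period(unsigned int period_ms,
				     unsigned long prev, unsigned long cur)
{
	if (!prev && !cur)		/* case 1: slow increase */
		period_ms = (100 + TUNE_PERCENT) * period_ms / 100;
	else if (prev && cur)		/* case 4: slow decrease */
		period_ms = (100 - TUNE_PERCENT) * period_ms / 100;
	else if (prev && !cur)		/* case 3: fast increase */
		period_ms <<= CHANGE_SCALE;
	else				/* case 2: fast decrease */
		period_ms >>= CHANGE_SCALE;

	return clamp_period(period_ms);
}

/* Worked values starting from the 2s default. */
static void check_examples(void)
{
	assert(tune_scan_period(2000, 0, 0) == 2300);
	assert(tune_scan_period(2000, 1, 1) == 1700);
	assert(tune_scan_period(2000, 1, 0) == 5000);	/* 16000 clamped */
	assert(tune_scan_period(2000, 0, 1) == 600);	/* 250 clamped */
}
```

In this sketch the fast path only avoids the clamp when it starts from one extreme (600 << 3 = 4800, 5000 >> 3 = 625), so in practice it behaves like "jump to MIN" or "jump to MAX". If that is the intent, saying so in the commit message would answer the question.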
>
> Signed-off-by: Raghavendra K T <raghavendra.kt@amd.com>
Various minor comments inline.
Thanks,
Jonathan
> ---
> mm/kscand.c | 110 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 109 insertions(+), 1 deletion(-)
>
> diff --git a/mm/kscand.c b/mm/kscand.c
> index 5cd2764114df..843069048c61 100644
> --- a/mm/kscand.c
> +++ b/mm/kscand.c
> @@ -20,6 +20,7 @@
> #include <linux/string.h>
> #include <linux/delay.h>
> #include <linux/cleanup.h>
> +#include <linux/minmax.h>
>
> #include <asm/pgalloc.h>
> #include "internal.h"
> @@ -33,6 +34,16 @@ static DEFINE_MUTEX(kscand_mutex);
> #define KSCAND_SCAN_SIZE (1 * 1024 * 1024 * 1024UL)
> static unsigned long kscand_scan_size __read_mostly = KSCAND_SCAN_SIZE;
>
> +/*
> + * Scan period for each mm.
> + * Min: 600ms default: 2sec Max: 5sec
> + */
> +#define KSCAND_SCAN_PERIOD_MAX 5000U
> +#define KSCAND_SCAN_PERIOD_MIN 600U
> +#define KSCAND_SCAN_PERIOD 2000U
> +
> +static unsigned int kscand_mm_scan_period_ms __read_mostly = KSCAND_SCAN_PERIOD;
> +
> /* How long to pause between two scan cycles */
> static unsigned int kscand_scan_sleep_ms __read_mostly = 20;
>
> @@ -78,6 +89,11 @@ static struct kmem_cache *kmigrated_slot_cache __read_mostly;
> /* Per mm information collected to control VMA scanning */
> struct kscand_mm_slot {
> struct mm_slot slot;
> + /* Unit: ms. Determines how aften mm scan should happen. */
Name it scan_period_ms and you can drop the unit comment; the unit is then
conveyed everywhere it is used. You already do that for the global, so why
not here as well?
> + unsigned int scan_period;
> + unsigned long next_scan;
> + /* Tracks how many useful pages obtained for migration in the last scan */
> + unsigned long scan_delta;
> long address;
> bool is_scanned;
> };
> @@ -715,13 +731,92 @@ static void kmigrated_migrate_folio(void)
> }
> }
> +/* Maintains stability of scan_period by decaying last time accessed pages */
> +#define SCAN_DECAY_SHIFT 4
> +/*
> + * X : Number of useful pages in the last scan.
> + * Y : Number of useful pages found in current scan.
> + * Tuning scan_period:
> + * Initial scan_period is 2s.
> + * case 1: (X = 0, Y = 0)
> + * Increase scan_period by SCAN_PERIOD_TUNE_PERCENT.
> + * case 2: (X = 0, Y > 0)
> + * Decrease scan_period by (2 << SCAN_PERIOD_CHANGE_SCALE).
> + * case 3: (X > 0, Y = 0 )
> + * Increase scan_period by (2 << SCAN_PERIOD_CHANGE_SCALE).
> + * case 4: (X > 0, Y > 0)
> + * Decrease scan_period by SCAN_PERIOD_TUNE_PERCENT.
> + */
> +static inline void kscand_update_mmslot_info(struct kscand_mm_slot *mm_slot,
> + unsigned long total)
As below. I'd make total a local variable for now.
> +{
> + unsigned int scan_period;
> + unsigned long now;
> + unsigned long old_scan_delta;
Might as well combine these two lines.
> +
> + scan_period = mm_slot->scan_period;
> + old_scan_delta = mm_slot->scan_delta;
> +
> + /* decay old value */
> + total = (old_scan_delta >> SCAN_DECAY_SHIFT) + total;
> +
> + /* XXX: Hack to get rid of continuously failing/unmigrateable pages */
> + if (total < KSCAND_IGNORE_SCAN_THR)
> + total = 0;
> +
> + /*
> + * case 1: old_scan_delta and new delta are similar, (slow) TUNE_PERCENT used.
> + * case 2: old_scan_delta and new delta are different. (fast) CHANGE_SCALE used.
> + * TBD:
> + * 1. Further tune scan_period based on delta between last and current scan delta.
> + * 2. Optimize calculation
> + */
> + if (!old_scan_delta && !total) {
> + scan_period = (100 + SCAN_PERIOD_TUNE_PERCENT) * scan_period;
> + scan_period /= 100;
> + } else if (old_scan_delta && total) {
> + scan_period = (100 - SCAN_PERIOD_TUNE_PERCENT) * scan_period;
> + scan_period /= 100;
> + } else if (old_scan_delta && !total) {
> + scan_period = scan_period << SCAN_PERIOD_CHANGE_SCALE;
> + } else {
> + scan_period = scan_period >> SCAN_PERIOD_CHANGE_SCALE;
> + }
> +
> + scan_period = clamp(scan_period, KSCAND_SCAN_PERIOD_MIN, KSCAND_SCAN_PERIOD_MAX);
> +
> + now = jiffies;
> + mm_slot->next_scan = now + msecs_to_jiffies(scan_period);
> + mm_slot->scan_period = scan_period;
> + mm_slot->scan_delta = total;
> +}
> +
> static unsigned long kscand_scan_mm_slot(void)
> {
> bool next_mm = false;
> bool update_mmslot_info = false;
>
> + unsigned int mm_slot_scan_period;
> + unsigned long now;
> + unsigned long mm_slot_next_scan;
> unsigned long vma_scanned_size = 0;
> unsigned long address;
> + unsigned long total = 0;
Given this never changes in this patch, I'd drop the parameter
from kscand_update_mmslot_info() and bring it back when you have code
that passes in a non-zero value.
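As far as I can tell from this patch alone, total == 0 together with scan_delta starting at 0 means only the "no useful pages" branch can run, so the period just ratchets up to the maximum. A rough user-space sketch of that, assuming TUNE_PERCENT is 15 as the commit message says (the names below are hypothetical, not from the patch):

```c
#include <assert.h>

#define SCAN_PERIOD_MIN_MS 600U
#define SCAN_PERIOD_MAX_MS 5000U
#define TUNE_PERCENT       15U	/* assumed value of SCAN_PERIOD_TUNE_PERCENT */

/*
 * Count how many updates it takes for the period to saturate at the
 * maximum when every update takes the "old delta == 0 and total == 0"
 * branch, i.e. a 15% increase followed by the clamp.
 */
static unsigned int updates_until_max(unsigned int period_ms)
{
	unsigned int updates = 0;

	while (period_ms < SCAN_PERIOD_MAX_MS) {
		period_ms = (100 + TUNE_PERCENT) * period_ms / 100;
		if (period_ms < SCAN_PERIOD_MIN_MS)
			period_ms = SCAN_PERIOD_MIN_MS;
		if (period_ms > SCAN_PERIOD_MAX_MS)
			period_ms = SCAN_PERIOD_MAX_MS;
		updates++;
	}
	return updates;
}
```

Starting from the 2000ms default the sequence is 2300, 2645, 3041, 3497, 4021, 4624, 5000, so seven updates in this sketch. The other three branches are effectively dead code until a later patch passes in a real count.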
>
> struct mm_slot *slot;
> struct mm_struct *mm;
> @@ -746,6 +841,8 @@ static unsigned long kscand_scan_mm_slot(void)
>
> mm = slot->mm;
> mm_slot->is_scanned = true;
> + mm_slot_next_scan = mm_slot->next_scan;
> + mm_slot_scan_period = mm_slot->scan_period;
> spin_unlock(&kscand_mm_lock);
>
> if (unlikely(!mmap_read_trylock(mm)))
> @@ -756,6 +853,11 @@ static unsigned long kscand_scan_mm_slot(void)
> goto outerloop;
> }
>
> + now = jiffies;
> +
> + if (mm_slot_next_scan && time_before(now, mm_slot_next_scan))
If now is only used once, it seems better to pass jiffies directly in the
call and drop the local variable.
> + goto outerloop;
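For reference, a small user-space sketch of what the check above does, with the tick counter passed in directly. ticks_before() is a hypothetical stand-in for the kernel's time_before(), which compares via a signed difference so that it keeps working when jiffies wraps. The cast assumes the usual two's complement conversion.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Stand-in for time_before(a, b): true if a is earlier than b.
 * The signed difference keeps the result correct across wraparound,
 * provided the two values are less than half the counter range apart.
 */
static bool ticks_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Mirrors the quoted condition: next_scan == 0 means "not scheduled yet",
 * so the mm is scanned immediately.
 */
static bool should_skip_scan(unsigned long now, unsigned long next_scan)
{
	return next_scan && ticks_before(now, next_scan);
}
```

One thing the sketch makes visible: next_scan == 0 doubles as a sentinel, so a deadline that happens to wrap to exactly 0 is treated as "scan now". That looks harmless, but a short comment in the code might be worth it.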
> +
> VMA_ITERATOR(vmi, mm, address);
>
> kmigrated_mm_slot = kmigrated_get_mm_slot(mm, false);
> @@ -786,8 +888,10 @@ static unsigned long kscand_scan_mm_slot(void)
>
> update_mmslot_info = true;
>
> - if (update_mmslot_info)
> + if (update_mmslot_info) {
> mm_slot->address = address;
> + kscand_update_mmslot_info(mm_slot, total);
> + }
>
> outerloop:
> /* exit_mmap will destroy ptes after this */
> @@ -889,6 +993,10 @@ void __kscand_enter(struct mm_struct *mm)
> return;
>
> kscand_slot->address = 0;
> + kscand_slot->scan_period = kscand_mm_scan_period_ms;
> + kscand_slot->next_scan = 0;
> + kscand_slot->scan_delta = 0;
> +
> slot = &kscand_slot->slot;
>
> spin_lock(&kscand_mm_lock);