* [RFC PATCH v3 0/2] mm: vmscan: retry folios written back while isolated
@ 2024-12-04  4:01 Chen Ridong
  2024-12-04  4:01 ` [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper Chen Ridong
  2024-12-04  4:01 ` [RFC PATCH v3 2/2] mm: vmscan: retry folios written back while isolated Chen Ridong
  0 siblings, 2 replies; 8+ messages in thread
From: Chen Ridong @ 2024-12-04  4:01 UTC
  To: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
	ryan.roberts, baohua, 21cnbao, wangkefeng.wang
  Cc: linux-mm, linux-kernel, chenridong, wangweiyang2, xieym_ict

From: Chen Ridong <chenridong@huawei.com>

Page reclaim isolates a batch of folios from the tail of one of the LRU
lists and works on those folios one by one.  For a suitable swap-backed
folio, if the swap device is async, it queues that folio for writeback.
After page reclaim finishes an entire batch, it puts the folios it
queued for writeback back at the head of the original LRU list.

In the meantime, page writeback flushes the queued folios, also in
batches.  Its batching logic is independent of that of page reclaim.
For each folio it writes back, page writeback calls
folio_rotate_reclaimable(), which tries to rotate the folio to the tail.

folio_rotate_reclaimable() only works for a folio after page reclaim has
put it back.  If an async swap device is fast enough, page writeback can
finish with that folio while page reclaim is still working on the rest
of the batch containing it.  In this case, that folio remains at the
head, and page reclaim will not retry it until its scan, which starts
from the tail, works its way back to the head.

This issue has been fixed for multi-gen LRU with commit 359a5e1416ca ("mm:
multi-gen LRU: retry folios written back while isolated"). Fix this issue
in the same way for the active/inactive LRU.
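
As a rough illustration of the retry decision, the checks made by
find_folios_written_back() in patch 1/2 can be pictured as a four-way
classification of each folio left on the isolated list. The sketch below
is user-space C, not kernel code: struct toy_folio, enum toy_verdict and
toy_classify() are hypothetical names, and the booleans merely stand in
for the folio flag tests used by the real helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state the helper inspects. */
struct toy_folio {
	bool evictable;
	bool reclaim;    /* rotation is still pending at writeback end */
	bool dirty;
	bool writeback;
	bool active;
	bool referenced;
	bool mapped;
	bool locked;
};

enum toy_verdict {
	TOY_PUTBACK_UNEVICTABLE, /* hand straight back to the LRU */
	TOY_WAIT_FOR_WRITEBACK,  /* I/O in flight; completion will rotate it */
	TOY_REJECTED,            /* not reclaimable now, or retries disabled */
	TOY_RETRY,               /* written back while isolated: try again */
};

/* Same order of checks as the helper; skip_retry disables the retry path. */
static enum toy_verdict toy_classify(const struct toy_folio *f, bool skip_retry)
{
	assert(f);

	if (!f->evictable)
		return TOY_PUTBACK_UNEVICTABLE;
	if (f->reclaim && (f->dirty || f->writeback))
		return TOY_WAIT_FOR_WRITEBACK;
	if (skip_retry || f->active || f->referenced || f->mapped ||
	    f->locked || f->dirty || f->writeback)
		return TOY_REJECTED;
	return TOY_RETRY;
}
```

Only folios that reach TOY_RETRY would be handed to reclaim a second
time; setting skip_retry on the second pass bounds the loop to one retry.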

---
v3:
 - fix this issue in the same way as multi-gen LRU.

v2:
 - detect folios whose writeback has completed and move them to the tail
   of the LRU, as suggested by Barry Song.
[2] https://lore.kernel.org/linux-kernel/CAGsJ_4zqL8ZHNRZ44o_CC69kE7DBVXvbZfvmQxMGiFqRxqHQdA@mail.gmail.com/

v1:
[1] https://lore.kernel.org/linux-kernel/20241010081802.290893-1-chenridong@huaweicloud.com/

Chen Ridong (2):
  mm: vmascan: add find_folios_written_back() helper
  mm: vmscan: retry folios written back while isolated

 mm/vmscan.c | 108 ++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 76 insertions(+), 32 deletions(-)

-- 
2.34.1




* [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper
  2024-12-04  4:01 [RFC PATCH v3 0/2] mm: vmscan: retry folios written back while isolated Chen Ridong
@ 2024-12-04  4:01 ` Chen Ridong
  2024-12-04 10:37   ` Barry Song
                     ` (2 more replies)
  2024-12-04  4:01 ` [RFC PATCH v3 2/2] mm: vmscan: retry folios written back while isolated Chen Ridong
  1 sibling, 3 replies; 8+ messages in thread
From: Chen Ridong @ 2024-12-04  4:01 UTC
  To: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
	ryan.roberts, baohua, 21cnbao, wangkefeng.wang
  Cc: linux-mm, linux-kernel, chenridong, wangweiyang2, xieym_ict

From: Chen Ridong <chenridong@huawei.com>

Add a find_folios_written_back() helper, which shrink_inactive_list()
will call in a subsequent patch.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
 mm/vmscan.c | 73 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 43 insertions(+), 30 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 76378bc257e3..af1ff76f83e7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -283,6 +283,48 @@ static void set_task_reclaim_state(struct task_struct *task,
 	task->reclaim_state = rs;
 }
 
+/**
+ * find_folios_written_back - Find and move the written back folios to a new list.
+ * @list: list of isolated folios to scan
+ * @clean: list that receives the written back folios
+ * @skip: if true, do not move written back folios to @clean
+ */
+static inline void find_folios_written_back(struct list_head *list,
+		struct list_head *clean, bool skip)
+{
+	struct folio *folio;
+	struct folio *next;
+
+	list_for_each_entry_safe_reverse(folio, next, list, lru) {
+		if (!folio_evictable(folio)) {
+			list_del(&folio->lru);
+			folio_putback_lru(folio);
+			continue;
+		}
+
+		if (folio_test_reclaim(folio) &&
+		    (folio_test_dirty(folio) || folio_test_writeback(folio))) {
+			/* restore LRU_REFS_FLAGS cleared by isolate_folio() */
+			if (lru_gen_enabled() && folio_test_workingset(folio))
+				folio_set_referenced(folio);
+			continue;
+		}
+
+		if (skip || folio_test_active(folio) || folio_test_referenced(folio) ||
+		    folio_mapped(folio) || folio_test_locked(folio) ||
+		    folio_test_dirty(folio) || folio_test_writeback(folio)) {
+			/* don't add rejected folios to the oldest generation */
+			if (lru_gen_enabled())
+				set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
+					      BIT(PG_active));
+			continue;
+		}
+
+		/* retry folios that may have missed folio_rotate_reclaimable() */
+		list_move(&folio->lru, clean);
+	}
+}
+
 /*
  * flush_reclaim_state(): add pages reclaimed outside of LRU-based reclaim to
  * scan_control->nr_reclaimed.
@@ -4567,8 +4609,6 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
 	int reclaimed;
 	LIST_HEAD(list);
 	LIST_HEAD(clean);
-	struct folio *folio;
-	struct folio *next;
 	enum vm_event_item item;
 	struct reclaim_stat stat;
 	struct lru_gen_mm_walk *walk;
@@ -4597,34 +4637,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
 			scanned, reclaimed, &stat, sc->priority,
 			type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
 
-	list_for_each_entry_safe_reverse(folio, next, &list, lru) {
-		if (!folio_evictable(folio)) {
-			list_del(&folio->lru);
-			folio_putback_lru(folio);
-			continue;
-		}
-
-		if (folio_test_reclaim(folio) &&
-		    (folio_test_dirty(folio) || folio_test_writeback(folio))) {
-			/* restore LRU_REFS_FLAGS cleared by isolate_folio() */
-			if (folio_test_workingset(folio))
-				folio_set_referenced(folio);
-			continue;
-		}
-
-		if (skip_retry || folio_test_active(folio) || folio_test_referenced(folio) ||
-		    folio_mapped(folio) || folio_test_locked(folio) ||
-		    folio_test_dirty(folio) || folio_test_writeback(folio)) {
-			/* don't add rejected folios to the oldest generation */
-			set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
-				      BIT(PG_active));
-			continue;
-		}
-
-		/* retry folios that may have missed folio_rotate_reclaimable() */
-		list_move(&folio->lru, &clean);
-	}
-
+	find_folios_written_back(&list, &clean, skip_retry);
 	spin_lock_irq(&lruvec->lru_lock);
 
 	move_folios_to_lru(lruvec, &list);
-- 
2.34.1




* [RFC PATCH v3 2/2] mm: vmscan: retry folios written back while isolated
  2024-12-04  4:01 [RFC PATCH v3 0/2] mm: vmscan: retry folios written back while isolated Chen Ridong
  2024-12-04  4:01 ` [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper Chen Ridong
@ 2024-12-04  4:01 ` Chen Ridong
  2024-12-04 10:45   ` Barry Song
  1 sibling, 1 reply; 8+ messages in thread
From: Chen Ridong @ 2024-12-04  4:01 UTC
  To: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
	ryan.roberts, baohua, 21cnbao, wangkefeng.wang
  Cc: linux-mm, linux-kernel, chenridong, wangweiyang2, xieym_ict

From: Chen Ridong <chenridong@huawei.com>

An issue was found with the following test steps:
1. Compile with CONFIG_TRANSPARENT_HUGEPAGE=y, CONFIG_LRU_GEN_ENABLED=n.
2. Mount memcg v1, create a memcg named test_memcg, and set
   limit_in_bytes=2.1G, memsw.limit_in_bytes=3G.
3. Create a 1G swap file and use it as swap.
4. Allocate 2.2G of anonymous memory in test_memcg.

It was found that:

cat memory.usage_in_bytes
2144940032
cat memory.memsw.usage_in_bytes
2255056896

free -h
              total        used        free
Mem:           31Gi       2.1Gi        27Gi
Swap:         1.0Gi       618Mi       405Mi

As shown above, test_memcg is charged for only about 100M of swap
(memsw.usage_in_bytes minus usage_in_bytes), yet more than 600M of swap
is in use. Roughly 500M of swap may therefore be wasted, because other
memcgs cannot use it.
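
The arithmetic behind these figures can be checked with a short
user-space sketch. It assumes, as the paragraph above does, that the
swap charged to a memcg v1 group is approximately memsw.usage_in_bytes
minus usage_in_bytes, and that the "Mi" unit printed by free -h is MiB.
The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Approximate swap charged to the group: memory+swap usage minus memory usage. */
static uint64_t memcg_swap_bytes(uint64_t memsw_usage, uint64_t mem_usage)
{
	assert(memsw_usage >= mem_usage);
	return memsw_usage - mem_usage;
}

/* Swap in use that is not charged to the group, in MiB (rounded down). */
static uint64_t unaccounted_swap_mib(uint64_t swap_used_mib,
				     uint64_t memsw_usage, uint64_t mem_usage)
{
	uint64_t charged_mib = memcg_swap_bytes(memsw_usage, mem_usage) / MIB;

	assert(swap_used_mib >= charged_mib);
	return swap_used_mib - charged_mib;
}
```

With the values shown above, memcg_swap_bytes(2255056896, 2144940032)
gives 110116864 bytes (about 105 MiB), and
unaccounted_swap_mib(618, 2255056896, 2144940032) gives 513 MiB.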

It can be explained as follows:
1. shrink_inactive_list() isolates folios from the LRU, scanning from
   tail to head. For simplicity, assume it isolates only folioN.

   inactive lru: folio1<->folio2<->folio3...<->folioN-1
   isolated list: folioN

2. In shrink_folio_list(), if folioN is a THP (2M), it may be split and
   added to the swap cache folio by folio. After a folio is added to the
   swap cache, I/O is submitted to write it back to swap, which is
   asynchronous. When shrink_folio_list() finishes, the isolated folios
   are moved back to the head of the inactive LRU. The inactive LRU may
   then look like this, with 512 folios moved to its head:

   folioN512<->folioN511<->...folioN1<->folio1<->folio2...<->folioN-1

   I/O is submitted from folioN1 to folioN512, and each later folio is
   added to the head of 'ret_folios' in shrink_folio_list(). As a
   result, the order is folioN512->folioN511->...->folioN1.

3. When writeback I/O for a folio completes, the folio may be rotated to
   the tail of the LRU. Assuming folioN1, folioN2, ..., folioN512
   complete in the order their I/O was submitted, they are rotated to
   the tail of the LRU in that order (folioN1<->...folioN511<->folioN512),
   so these folios, now at the tail, can be reclaimed as soon as
   possible.

   folio1<->folio2<->...<->folioN-1<->folioN1<->...folioN511<->folioN512

4. However, shrink_folio_list() and folio writeback run asynchronously.
   If a THP is split, shrink_folio_list() loops at least 512 times, so
   writeback of some folios can complete before shrink_folio_list()
   finishes, and those folios then fail to rotate to the tail of the
   LRU. The LRU may look like this:

   folioN50<->folioN49<->...folioN1<->folio1<->folio2...<->folioN-1<->
   folioN51<->folioN52<->...folioN511<->folioN512

   Although folios N1-N50 have finished writeback, they are still at the
   head of the LRU. Their writeback completed while shrink_folio_list()
   was still looping, so the folio_rotate_reclaimable() call in
   folio_end_writeback() could not move them to the tail: at that point
   they were on 'folio_list', not on the LRU.
   Because isolation scans the LRU from tail to head, these folios are
   unlikely to be scanned again soon.
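
The four steps above can be replayed with a small user-space toy model.
This is only a sketch under stated assumptions: the LRU is a plain array
(index 0 is the head), folios are integer ids, and struct toy_lru,
toy_lru_add_head(), toy_rotate_to_tail() and toy_reclaim_round() are
hypothetical names rather than kernel interfaces. The ids 1..n used for
the split folios must not already be present on the toy LRU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_LRU_MAX 1024

/* Toy LRU: ids[0] is the head, ids[len - 1] is the tail (scanned first). */
struct toy_lru {
	int ids[TOY_LRU_MAX];
	size_t len;
};

static void toy_lru_add_head(struct toy_lru *lru, int id)
{
	assert(lru->len < TOY_LRU_MAX);
	memmove(&lru->ids[1], &lru->ids[0], lru->len * sizeof(lru->ids[0]));
	lru->ids[0] = id;
	lru->len++;
}

/* Rotation attempt at writeback completion: a no-op unless id is on the LRU. */
static bool toy_rotate_to_tail(struct toy_lru *lru, int id)
{
	for (size_t i = 0; i < lru->len; i++) {
		if (lru->ids[i] != id)
			continue;
		memmove(&lru->ids[i], &lru->ids[i + 1],
			(lru->len - i - 1) * sizeof(lru->ids[0]));
		lru->ids[lru->len - 1] = id;
		return true;
	}
	return false; /* still isolated: the rotation is lost */
}

/*
 * One reclaim round: n split folios (ids 1..n) are isolated, writeback of the
 * first `early` completes before putback, the batch is put back at the head,
 * and the remaining writebacks complete. Returns the number of lost rotations.
 */
static size_t toy_reclaim_round(struct toy_lru *lru, int n, int early)
{
	size_t lost = 0;

	for (int id = 1; id <= early; id++) {    /* completes while isolated */
		if (!toy_rotate_to_tail(lru, id))
			lost++;
	}
	for (int id = 1; id <= n; id++)          /* putback: later ids nearer the head */
		toy_lru_add_head(lru, id);
	for (int id = early + 1; id <= n; id++) { /* completes after putback */
		if (!toy_rotate_to_tail(lru, id))
			lost++;
	}
	return lost;
}
```

Starting from a toy LRU holding two older folios (ids 1001 and 1002),
toy_reclaim_round(&lru, 4, 2) leaves the order 2, 1, 1001, 1002, 3, 4:
the two folios written back early stay at the head, which is the shape
described in step 4.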

For multi-gen LRU (CONFIG_LRU_GEN_ENABLED), this issue was fixed by
commit 359a5e1416ca ("mm: multi-gen LRU: retry folios written back while
isolated"). Fix it in the same way for the active/inactive LRU.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
 mm/vmscan.c | 35 +++++++++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index af1ff76f83e7..1f0d194f8b2f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1949,6 +1949,25 @@ static int current_may_throttle(void)
 	return !(current->flags & PF_LOCAL_THROTTLE);
 }
 
+static inline void acc_reclaimed_stat(struct reclaim_stat *stat,
+		struct reclaim_stat *curr)
+{
+	int i;
+
+	stat->nr_dirty += curr->nr_dirty;
+	stat->nr_unqueued_dirty += curr->nr_unqueued_dirty;
+	stat->nr_congested += curr->nr_congested;
+	stat->nr_writeback += curr->nr_writeback;
+	stat->nr_immediate += curr->nr_immediate;
+	stat->nr_pageout += curr->nr_pageout;
+	stat->nr_ref_keep += curr->nr_ref_keep;
+	stat->nr_unmap_fail += curr->nr_unmap_fail;
+	stat->nr_lazyfree_fail += curr->nr_lazyfree_fail;
+	stat->nr_demoted += curr->nr_demoted;
+	for (i = 0; i < ANON_AND_FILE; i++)
+		stat->nr_activate[i] += curr->nr_activate[i];
+}
+
 /*
  * shrink_inactive_list() is a helper for shrink_node().  It returns the number
  * of reclaimed pages
@@ -1958,14 +1977,16 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 		enum lru_list lru)
 {
 	LIST_HEAD(folio_list);
+	LIST_HEAD(clean_list);
 	unsigned long nr_scanned;
 	unsigned int nr_reclaimed = 0;
 	unsigned long nr_taken;
-	struct reclaim_stat stat;
+	struct reclaim_stat stat, curr;
 	bool file = is_file_lru(lru);
 	enum vm_event_item item;
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 	bool stalled = false;
+	bool skip_retry = false;
 
 	while (unlikely(too_many_isolated(pgdat, file, sc))) {
 		if (stalled)
@@ -1999,10 +2020,20 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 	if (nr_taken == 0)
 		return 0;
 
-	nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);
+	memset(&stat, 0, sizeof(stat));
+retry:
+	nr_reclaimed += shrink_folio_list(&folio_list, pgdat, sc, &curr, false);
+	find_folios_written_back(&folio_list, &clean_list, skip_retry);
+	acc_reclaimed_stat(&stat, &curr);
 
 	spin_lock_irq(&lruvec->lru_lock);
 	move_folios_to_lru(lruvec, &folio_list);
+	if (!list_empty(&clean_list)) {
+		list_splice_init(&clean_list, &folio_list);
+		skip_retry = true;
+		spin_unlock_irq(&lruvec->lru_lock);
+		goto retry;
+	}
 
 	__mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
 					stat.nr_demoted);
-- 
2.34.1




* Re: [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper
  2024-12-04  4:01 ` [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper Chen Ridong
@ 2024-12-04 10:37   ` Barry Song
  2024-12-04 10:38   ` kernel test robot
  2024-12-04 14:03   ` kernel test robot
  2 siblings, 0 replies; 8+ messages in thread
From: Barry Song @ 2024-12-04 10:37 UTC
  To: Chen Ridong
  Cc: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
	ryan.roberts, wangkefeng.wang, linux-mm, linux-kernel, chenridong,
	wangweiyang2, xieym_ict

On Wed, Dec 4, 2024 at 5:11 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>
> From: Chen Ridong <chenridong@huawei.com>
>
> Add a find_folios_written_back() helper, which shrink_inactive_list()
> will call in a subsequent patch.

This is not about adding a helper but rather extracting a function that
can be used by both lru_gen and the traditional active/inactive LRU.
Making it a separate patch may not be ideal, as it isn’t an external
function that warrants special attention. Combining patch 1 and patch 2
into a single patch creates a more cohesive and logical flow, making it
easier to review.

>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> ---
>  mm/vmscan.c | 73 +++++++++++++++++++++++++++++++----------------------
>  1 file changed, 43 insertions(+), 30 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 76378bc257e3..af1ff76f83e7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -283,6 +283,48 @@ static void set_task_reclaim_state(struct task_struct *task,
>         task->reclaim_state = rs;
>  }
>
> +/**
> + * find_folios_written_back - Find and move the written back folios to a new list.
> + * @list: list of isolated folios to scan
> + * @clean: list that receives the written back folios
> + * @skip: if true, do not move written back folios to @clean
> + */
> +static inline void find_folios_written_back(struct list_head *list,
> +               struct list_head *clean, bool skip)
> +{
> +       struct folio *folio;
> +       struct folio *next;
> +
> +       list_for_each_entry_safe_reverse(folio, next, list, lru) {
> +               if (!folio_evictable(folio)) {
> +                       list_del(&folio->lru);
> +                       folio_putback_lru(folio);
> +                       continue;
> +               }
> +
> +               if (folio_test_reclaim(folio) &&
> +                   (folio_test_dirty(folio) || folio_test_writeback(folio))) {
> +                       /* restore LRU_REFS_FLAGS cleared by isolate_folio() */
> +                       if (lru_gen_enabled() && folio_test_workingset(folio))
> +                               folio_set_referenced(folio);
> +                       continue;
> +               }
> +
> +               if (skip || folio_test_active(folio) || folio_test_referenced(folio) ||
> +                   folio_mapped(folio) || folio_test_locked(folio) ||
> +                   folio_test_dirty(folio) || folio_test_writeback(folio)) {
> +                       /* don't add rejected folios to the oldest generation */
> +                       if (lru_gen_enabled())
> +                               set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
> +                                             BIT(PG_active));
> +                       continue;
> +               }
> +
> +               /* retry folios that may have missed folio_rotate_reclaimable() */
> +               list_move(&folio->lru, clean);
> +       }
> +}
> +
>  /*
>   * flush_reclaim_state(): add pages reclaimed outside of LRU-based reclaim to
>   * scan_control->nr_reclaimed.
> @@ -4567,8 +4609,6 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
>         int reclaimed;
>         LIST_HEAD(list);
>         LIST_HEAD(clean);
> -       struct folio *folio;
> -       struct folio *next;
>         enum vm_event_item item;
>         struct reclaim_stat stat;
>         struct lru_gen_mm_walk *walk;
> @@ -4597,34 +4637,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
>                         scanned, reclaimed, &stat, sc->priority,
>                         type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
>
> -       list_for_each_entry_safe_reverse(folio, next, &list, lru) {
> -               if (!folio_evictable(folio)) {
> -                       list_del(&folio->lru);
> -                       folio_putback_lru(folio);
> -                       continue;
> -               }
> -
> -               if (folio_test_reclaim(folio) &&
> -                   (folio_test_dirty(folio) || folio_test_writeback(folio))) {
> -                       /* restore LRU_REFS_FLAGS cleared by isolate_folio() */
> -                       if (folio_test_workingset(folio))
> -                               folio_set_referenced(folio);
> -                       continue;
> -               }
> -
> -               if (skip_retry || folio_test_active(folio) || folio_test_referenced(folio) ||
> -                   folio_mapped(folio) || folio_test_locked(folio) ||
> -                   folio_test_dirty(folio) || folio_test_writeback(folio)) {
> -                       /* don't add rejected folios to the oldest generation */
> -                       set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
> -                                     BIT(PG_active));
> -                       continue;
> -               }
> -
> -               /* retry folios that may have missed folio_rotate_reclaimable() */
> -               list_move(&folio->lru, &clean);
> -       }
> -
> +       find_folios_written_back(&list, &clean, skip_retry);
>         spin_lock_irq(&lruvec->lru_lock);
>
>         move_folios_to_lru(lruvec, &list);
> --
> 2.34.1
>

Thanks
Barry



* Re: [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper
  2024-12-04  4:01 ` [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper Chen Ridong
  2024-12-04 10:37   ` Barry Song
@ 2024-12-04 10:38   ` kernel test robot
  2024-12-04 14:03   ` kernel test robot
  2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2024-12-04 10:38 UTC
  To: Chen Ridong; +Cc: llvm, oe-kbuild-all

Hi Chen,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Chen-Ridong/mm-vmascan-add-find_folios_written_back-helper/20241204-123817
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20241204040158.2768519-2-chenridong%40huaweicloud.com
patch subject: [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper
config: powerpc-mpc885_ads_defconfig (https://download.01.org/0day-ci/archive/20241204/202412041806.UwnFZEnn-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 592c0fe55f6d9a811028b5f3507be91458ab2713)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241204/202412041806.UwnFZEnn-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412041806.UwnFZEnn-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from mm/vmscan.c:15:
   In file included from include/linux/mm.h:2223:
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   In file included from mm/vmscan.c:30:
   include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      47 |         __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
         |                                    ~~~~~~~~~~~ ^ ~~~
   include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      49 |                                 NR_ZONE_LRU_BASE + lru, nr_pages);
         |                                 ~~~~~~~~~~~~~~~~ ^ ~~~
>> mm/vmscan.c:318:50: error: use of undeclared identifier 'LRU_REFS_FLAGS'
     318 |                                 set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
         |                                                                              ^
   mm/vmscan.c:451:51: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     451 |                         size += zone_page_state(zone, NR_ZONE_LRU_BASE + lru);
         |                                                       ~~~~~~~~~~~~~~~~ ^ ~~~
   mm/vmscan.c:1783:4: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    1783 |                         __count_zid_vm_events(PGSCAN_SKIP, zid, nr_skipped[zid]);
         |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
   mm/vmscan.c:2289:51: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
    2289 |         inactive = lruvec_page_state(lruvec, NR_LRU_BASE + inactive_lru);
         |                                              ~~~~~~~~~~~ ^ ~~~~~~~~~~~~
   mm/vmscan.c:2290:49: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
    2290 |         active = lruvec_page_state(lruvec, NR_LRU_BASE + active_lru);
         |                                            ~~~~~~~~~~~ ^ ~~~~~~~~~~
   mm/vmscan.c:6294:3: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    6294 |                 __count_zid_vm_events(ALLOCSTALL, sc->reclaim_idx, 1);
         |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
   8 warnings and 1 error generated.


vim +/LRU_REFS_FLAGS +318 mm/vmscan.c

   285	
   286	/**
   287	 * find_folios_written_back - Find and move the written back folios to a new list.
   288	 * @list: list of isolated folios to scan
   289	 * @clean: list that receives the written back folios
   290	 * @skip: if true, do not move written back folios to @clean
   291	 */
   292	static inline void find_folios_written_back(struct list_head *list,
   293			struct list_head *clean, bool skip)
   294	{
   295		struct folio *folio;
   296		struct folio *next;
   297	
   298		list_for_each_entry_safe_reverse(folio, next, list, lru) {
   299			if (!folio_evictable(folio)) {
   300				list_del(&folio->lru);
   301				folio_putback_lru(folio);
   302				continue;
   303			}
   304	
   305			if (folio_test_reclaim(folio) &&
   306			    (folio_test_dirty(folio) || folio_test_writeback(folio))) {
   307				/* restore LRU_REFS_FLAGS cleared by isolate_folio() */
   308				if (lru_gen_enabled() && folio_test_workingset(folio))
   309					folio_set_referenced(folio);
   310				continue;
   311			}
   312	
   313			if (skip || folio_test_active(folio) || folio_test_referenced(folio) ||
   314			    folio_mapped(folio) || folio_test_locked(folio) ||
   315			    folio_test_dirty(folio) || folio_test_writeback(folio)) {
   316				/* don't add rejected folios to the oldest generation */
   317				if (lru_gen_enabled())
 > 318					set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
   319						      BIT(PG_active));
   320				continue;
   321			}
   322	
   323			/* retry folios that may have missed folio_rotate_reclaimable() */
   324			list_move(&folio->lru, clean);
   325		}
   326	}
   327	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [RFC PATCH v3 2/2] mm: vmscan: retry folios written back while isolated
  2024-12-04  4:01 ` [RFC PATCH v3 2/2] mm: vmscan: retry folios written back while isolated Chen Ridong
@ 2024-12-04 10:45   ` Barry Song
  2024-12-05  2:06     ` chenridong
  0 siblings, 1 reply; 8+ messages in thread
From: Barry Song @ 2024-12-04 10:45 UTC
  To: Chen Ridong
  Cc: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
	ryan.roberts, wangkefeng.wang, linux-mm, linux-kernel, chenridong,
	wangweiyang2, xieym_ict

On Wed, Dec 4, 2024 at 5:11 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>
> From: Chen Ridong <chenridong@huawei.com>
>
> An issue was found with the following test steps:
> 1. Compile with CONFIG_TRANSPARENT_HUGEPAGE=y, CONFIG_LRU_GEN_ENABLED=n.
> 2. Mount memcg v1, create a memcg named test_memcg, and set
>    limit_in_bytes=2.1G, memsw.limit_in_bytes=3G.
> 3. Create a 1G swap file and use it as swap.
> 4. Allocate 2.2G of anonymous memory in test_memcg.
>
> It was found that:
>
> cat memory.usage_in_bytes
> 2144940032
> cat memory.memsw.usage_in_bytes
> 2255056896
>
> free -h
>               total        used        free
> Mem:           31Gi       2.1Gi        27Gi
> Swap:         1.0Gi       618Mi       405Mi
>
> As shown above, test_memcg is charged for only about 100M of swap
> (memsw.usage_in_bytes minus usage_in_bytes), yet more than 600M of swap
> is in use. Roughly 500M of swap may therefore be wasted, because other
> memcgs cannot use it.
>
> It can be explained as follows:
> 1. shrink_inactive_list() isolates folios from the LRU, scanning from
>    tail to head. For simplicity, assume it isolates only folioN.
>
>    inactive lru: folio1<->folio2<->folio3...<->folioN-1
>    isolated list: folioN
>
> 2. In shrink_folio_list(), if folioN is a THP (2M), it may be split and
>    added to the swap cache folio by folio. After a folio is added to the
>    swap cache, I/O is submitted to write it back to swap, which is
>    asynchronous. When shrink_folio_list() finishes, the isolated folios
>    are moved back to the head of the inactive LRU. The inactive LRU may
>    then look like this, with 512 folios moved to its head:
>
>    folioN512<->folioN511<->...folioN1<->folio1<->folio2...<->folioN-1
>
>    I/O is submitted from folioN1 to folioN512, and each later folio is
>    added to the head of 'ret_folios' in shrink_folio_list(). As a
>    result, the order is folioN512->folioN511->...->folioN1.
>
> 3. When writeback I/O for a folio completes, the folio may be rotated to
>    the tail of the LRU. Assuming folioN1, folioN2, ..., folioN512
>    complete in the order their I/O was submitted, they are rotated to
>    the tail of the LRU in that order (folioN1<->...folioN511<->folioN512),
>    so these folios, now at the tail, can be reclaimed as soon as
>    possible.
>
>    folio1<->folio2<->...<->folioN-1<->folioN1<->...folioN511<->folioN512
>
> 4. However, shrink_folio_list() and folio writeback run asynchronously.
>    If a THP is split, shrink_folio_list() loops at least 512 times, so
>    writeback of some folios can complete before shrink_folio_list()
>    finishes, and those folios then fail to rotate to the tail of the
>    LRU. The LRU may look like this:
>
>    folioN50<->folioN49<->...folioN1<->folio1<->folio2...<->folioN-1<->
>    folioN51<->folioN52<->...folioN511<->folioN512
>
>    Although folios N1-N50 have finished writeback, they are still at the
>    head of the LRU. Their writeback completed while shrink_folio_list()
>    was still looping, so the folio_rotate_reclaimable() call in
>    folio_end_writeback() could not move them to the tail: at that point
>    they were on 'folio_list', not on the LRU.
>    Because isolation scans the LRU from tail to head, these folios are
>    unlikely to be scanned again soon.

I don’t think it’s necessary to focus so much on large folios. This
issue affects both small and large folios alike. Splitting large
folios simply lengthens the list, which increases the chances of
missing rotation. It’s enough to note that commit 359a5e1416ca
fixed this issue in mglru, but the same problem exists in the
active/inactive LRU. As a result, we’re extracting the function in
patch 1 to make it usable for both LRUs and applying the same fix
to the active/inactive LRU. Mentioning that THP splitting can
worsen the issue (since it makes the list longer) is sufficient;
it’s not the main point.

It’s better to have a single patch and refine the changelog to focus on
the core and essential problem, avoiding too many unrelated details.
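
To make the "longer list, more missed rotations" point concrete, here is
a deliberately crude user-space timing model. Everything in it is an
assumption for illustration only: folio i (1-based) is submitted at time
i * t_proc, its writeback completes t_io later, and the whole batch is
put back on the LRU at n * t_proc. The function name is hypothetical and
the time units are arbitrary.

```c
#include <assert.h>

/*
 * Count the folios in a batch of n whose writeback ends before the batch is
 * put back, i.e. whose rotation to the LRU tail would be missed.
 */
static unsigned int toy_missed_rotations(unsigned int n, unsigned int t_proc,
					 unsigned int t_io)
{
	unsigned int missed = 0;

	assert(t_proc > 0);

	for (unsigned int i = 1; i <= n; i++) {
		if (i * t_proc + t_io < n * t_proc)
			missed++;
	}
	return missed;
}
```

With t_proc = 1 and t_io = 10, a batch of 8 folios misses no rotations,
a batch of 32 misses 21, and a batch of 512 (one split THP) misses 501.
Under these assumptions the problem exists for any batch longer than the
I/O latency, and splitting only makes the batch longer.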

>
> For multi-gen LRU (CONFIG_LRU_GEN_ENABLED), this issue was fixed by
> commit 359a5e1416ca ("mm: multi-gen LRU: retry folios written back while
> isolated"). Fix it in the same way for the active/inactive LRU.
>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> ---
>  mm/vmscan.c | 35 +++++++++++++++++++++++++++++++++--
>  1 file changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index af1ff76f83e7..1f0d194f8b2f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1949,6 +1949,25 @@ static int current_may_throttle(void)
>         return !(current->flags & PF_LOCAL_THROTTLE);
>  }
>
> +static inline void acc_reclaimed_stat(struct reclaim_stat *stat,
> +               struct reclaim_stat *curr)
> +{
> +       int i;
> +
> +       stat->nr_dirty += curr->nr_dirty;
> +       stat->nr_unqueued_dirty += curr->nr_unqueued_dirty;
> +       stat->nr_congested += curr->nr_congested;
> +       stat->nr_writeback += curr->nr_writeback;
> +       stat->nr_immediate += curr->nr_immediate;
> +       stat->nr_pageout += curr->nr_pageout;
> +       stat->nr_ref_keep += curr->nr_ref_keep;
> +       stat->nr_unmap_fail += curr->nr_unmap_fail;
> +       stat->nr_lazyfree_fail += curr->nr_lazyfree_fail;
> +       stat->nr_demoted += curr->nr_demoted;
> +       for (i = 0; i < ANON_AND_FILE; i++)
> +               stat->nr_activate[i] += curr->nr_activate[i];
> +}
> +
>  /*
>   * shrink_inactive_list() is a helper for shrink_node().  It returns the number
>   * of reclaimed pages
> @@ -1958,14 +1977,16 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
>                 enum lru_list lru)
>  {
>         LIST_HEAD(folio_list);
> +       LIST_HEAD(clean_list);
>         unsigned long nr_scanned;
>         unsigned int nr_reclaimed = 0;
>         unsigned long nr_taken;
> -       struct reclaim_stat stat;
> +       struct reclaim_stat stat, curr;
>         bool file = is_file_lru(lru);
>         enum vm_event_item item;
>         struct pglist_data *pgdat = lruvec_pgdat(lruvec);
>         bool stalled = false;
> +       bool skip_retry = false;
>
>         while (unlikely(too_many_isolated(pgdat, file, sc))) {
>                 if (stalled)
> @@ -1999,10 +2020,20 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
>         if (nr_taken == 0)
>                 return 0;
>
> -       nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);
> +       memset(&stat, 0, sizeof(stat));
> +retry:
> +       nr_reclaimed += shrink_folio_list(&folio_list, pgdat, sc, &curr, false);
> +       find_folios_written_back(&folio_list, &clean_list, skip_retry);
> +       acc_reclaimed_stat(&stat, &curr);
>
>         spin_lock_irq(&lruvec->lru_lock);
>         move_folios_to_lru(lruvec, &folio_list);
> +       if (!list_empty(&clean_list)) {
> +               list_splice_init(&clean_list, &folio_list);
> +               skip_retry = true;
> +               spin_unlock_irq(&lruvec->lru_lock);
> +               goto retry;
> +       }
>
>         __mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
>                                         stat.nr_demoted);
> --
> 2.34.1
>

Thanks
Barry
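
Because the retry path in the quoted hunk calls shrink_folio_list() more
than once, the per-pass statistics have to be summed before they are
reported. A minimal userspace sketch of that accumulation, assuming a
cut-down hypothetical struct toy_reclaim_stat (not the kernel's struct
reclaim_stat); every counter, including the per-type array, is added with
+= so the first pass is not lost.

```c
#include <assert.h>

#define TOY_ANON_AND_FILE 2	/* hypothetical: one slot each for anon and file */

/* Hypothetical subset of per-pass reclaim counters. */
struct toy_reclaim_stat {
	unsigned int nr_dirty;
	unsigned int nr_writeback;
	unsigned int nr_activate[TOY_ANON_AND_FILE];
};

/* Add the counters of one pass (curr) into the running total. */
static void toy_acc_stat(struct toy_reclaim_stat *total,
			 const struct toy_reclaim_stat *curr)
{
	int i;

	total->nr_dirty += curr->nr_dirty;
	total->nr_writeback += curr->nr_writeback;
	for (i = 0; i < TOY_ANON_AND_FILE; i++)
		total->nr_activate[i] += curr->nr_activate[i];
}

/* Usage: a first pass followed by one retry pass. */
void toy_acc_demo(void)
{
	struct toy_reclaim_stat total = { 0 };
	struct toy_reclaim_stat first = {
		.nr_dirty = 3, .nr_writeback = 2, .nr_activate = { 4, 1 },
	};
	struct toy_reclaim_stat retry = {
		.nr_dirty = 1, .nr_writeback = 0, .nr_activate = { 2, 0 },
	};

	toy_acc_stat(&total, &first);
	toy_acc_stat(&total, &retry);

	assert(total.nr_dirty == 4);
	assert(total.nr_writeback == 2);
	assert(total.nr_activate[0] == 6);
	assert(total.nr_activate[1] == 1);
}
```

The total must start zeroed, which is what the memset() before the retry
label does in the quoted hunk.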



* Re: [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper
  2024-12-04  4:01 ` [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper Chen Ridong
  2024-12-04 10:37   ` Barry Song
  2024-12-04 10:38   ` kernel test robot
@ 2024-12-04 14:03   ` kernel test robot
  2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2024-12-04 14:03 UTC (permalink / raw)
  To: Chen Ridong; +Cc: oe-kbuild-all

Hi Chen,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Chen-Ridong/mm-vmascan-add-find_folios_written_back-helper/20241204-123817
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20241204040158.2768519-2-chenridong%40huaweicloud.com
patch subject: [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper
config: i386-buildonly-randconfig-004 (https://download.01.org/0day-ci/archive/20241204/202412042108.Mu208bDJ-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241204/202412042108.Mu208bDJ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412042108.Mu208bDJ-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from include/linux/thread_info.h:27,
                    from include/linux/spinlock.h:60,
                    from include/linux/mmzone.h:8,
                    from include/linux/gfp.h:7,
                    from include/linux/mm.h:7,
                    from mm/vmscan.c:15:
   mm/vmscan.c: In function 'find_folios_written_back':
>> mm/vmscan.c:318:78: error: 'LRU_REFS_FLAGS' undeclared (first use in this function); did you mean 'LRU_REFS_MASK'?
     318 |                                 set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
         |                                                                              ^~~~~~~~~~~~~~
   include/linux/bitops.h:330:40: note: in definition of macro 'set_mask_bits'
     330 |         const typeof(*(ptr)) mask__ = (mask), bits__ = (bits);  \
         |                                        ^~~~
   mm/vmscan.c:318:78: note: each undeclared identifier is reported only once for each function it appears in
     318 |                                 set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
         |                                                                              ^~~~~~~~~~~~~~
   include/linux/bitops.h:330:40: note: in definition of macro 'set_mask_bits'
     330 |         const typeof(*(ptr)) mask__ = (mask), bits__ = (bits);  \
         |                                        ^~~~


vim +318 mm/vmscan.c

   285	
   286	/**
   287	 * find_folios_written_back - Find and move the written back folios to a new list.
   288	 * @list: folios list
   289	 * @clean: the written back folios list
   290	 * @skip: whether to skip moving the written back folios to the clean list.
   291	 */
   292	static inline void find_folios_written_back(struct list_head *list,
   293			struct list_head *clean, bool skip)
   294	{
   295		struct folio *folio;
   296		struct folio *next;
   297	
   298		list_for_each_entry_safe_reverse(folio, next, list, lru) {
   299			if (!folio_evictable(folio)) {
   300				list_del(&folio->lru);
   301				folio_putback_lru(folio);
   302				continue;
   303			}
   304	
   305			if (folio_test_reclaim(folio) &&
   306			    (folio_test_dirty(folio) || folio_test_writeback(folio))) {
   307				/* restore LRU_REFS_FLAGS cleared by isolate_folio() */
   308				if (lru_gen_enabled() && folio_test_workingset(folio))
   309					folio_set_referenced(folio);
   310				continue;
   311			}
   312	
   313			if (skip || folio_test_active(folio) || folio_test_referenced(folio) ||
   314			    folio_mapped(folio) || folio_test_locked(folio) ||
   315			    folio_test_dirty(folio) || folio_test_writeback(folio)) {
   316				/* don't add rejected folios to the oldest generation */
   317				if (lru_gen_enabled())
 > 318					set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
   319						      BIT(PG_active));
   320				continue;
   321			}
   322	
   323			/* retry folios that may have missed folio_rotate_reclaimable() */
   324			list_move(&folio->lru, clean);
   325		}
   326	}
   327	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [RFC PATCH v3 2/2] mm: vmscan: retry folios written back while isolated
  2024-12-04 10:45   ` Barry Song
@ 2024-12-05  2:06     ` chenridong
  0 siblings, 0 replies; 8+ messages in thread
From: chenridong @ 2024-12-05  2:06 UTC (permalink / raw)
  To: Barry Song, Chen Ridong
  Cc: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
	ryan.roberts, wangkefeng.wang, linux-mm, linux-kernel,
	wangweiyang2, xieym_ict



On 2024/12/4 18:45, Barry Song wrote:
> On Wed, Dec 4, 2024 at 5:11 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>>
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> An issue was found with the following testing steps:
>> 1. Compile with CONFIG_TRANSPARENT_HUGEPAGE=y, CONFIG_LRU_GEN_ENABLED=n.
>> 2. Mount memcg v1, create a memcg named test_memcg, and set
>>    limit_in_bytes=2.1G, memsw.limit_in_bytes=3G.
>> 3. Use a file as swap, and create a 1G swap.
>> 4. Allocate 2.2G anon memory in test_memcg.
>>
>> It was found that:
>>
>> cat memory.usage_in_bytes
>> 2144940032
>> cat memory.memsw.usage_in_bytes
>> 2255056896
>>
>> free -h
>>               total        used        free
>> Mem:           31Gi       2.1Gi        27Gi
>> Swap:         1.0Gi       618Mi       405Mi
>>
>> As shown above, test_memcg used about 100M of swap, but 600M+ of swap
>> was in use overall, which means that about 500M may be wasted because
>> other memcgs cannot use that swap space.
>>
>> It can be explained as follows:
>> 1. When entering shrink_inactive_list, it isolates folios from the lru,
>>    from tail to head. For simplicity, suppose it takes only folioN.
>>
>>    inactive lru: folio1<->folio2<->folio3...<->folioN-1
>>    isolated list: folioN
>>
>> 2. In the shrink_page_list function, if folioN is a THP (2M), it may be
>>    split and added to the swap cache folio by folio. After a folio is
>>    added to the swap cache, io is submitted to write it back to swap,
>>    which is asynchronous. When shrink_page_list finishes, the isolated
>>    folios are moved back to the head of the inactive lru. The inactive
>>    lru may then look like this, with 512 folios moved to its head:
>>
>>    folioN512<->folioN511<->...folioN1<->folio1<->folio2...<->folioN-1
>>
>>    Io was submitted from folioN1 to folioN512, and each later folio was
>>    added to the head of 'ret_folios' in the shrink_page_list function.
>>    As a result, the order is folioN512->folioN511->...->folioN1.
>>
>> 3. When writeback io for a folio completes, the folio may be rotated to
>>    the tail of the lru, one folio at a time. Assume folioN1, folioN2,
>>    ..., folioN512 complete in order (the order in which io was
>>    submitted) and are rotated to the tail of the LRU in that order
>>    (folioN1<->...folioN511<->folioN512). Those folios at the tail of
>>    the lru will then be reclaimed as soon as possible.
>>
>>    folio1<->folio2<->...<->folioN-1<->folioN1<->...folioN511<->folioN512
>>
>> 4. However, shrink_page_list and folio writeback are asynchronous. If a
>>    THP is split, shrink_page_list loops at least 512 times, so writeback
>>    of some folios may complete before shrink_page_list finishes, and
>>    those folios then fail to rotate to the tail of the lru. The lru may
>>    look like this:
>>
>>    folioN50<->folioN49<->...folioN1<->folio1<->folio2...<->folioN-1<->
>>    folioN51<->folioN52<->...folioN511<->folioN512
>>
>>    Although writeback of those folios (N1-N50) has finished, they are
>>    still at the head of the lru. This is because their writeback ended
>>    while reclaim was still looping in shrink_folio_list(), so
>>    folio_end_writeback()'s folio_rotate_reclaimable() could not move
>>    these folios, which were not on the LRU but still on 'folio_list',
>>    to the tail of the LRU.
>>    When isolating folios from the lru, reclaim scans from tail to head,
>>    so it is unlikely to reach those folios again soon.
> 
> I don’t think it’s necessary to focus so much on large folios. This
> issue affects both small and large folios alike. Splitting large
> folios simply lengthens the list, which increases the chances of
> missing rotation. It’s enough to note that commit 359a5e1416ca
> fixed this issue in mglru, but the same problem exists in the
> active/inactive LRU. As a result, we’re extracting the function in
> patch 1 to make it usable for both LRUs and applying the same fix
> to the active/inactive LRU. Mentioning that THP splitting can
> worsen the issue (since it makes the list longer) is sufficient;
> it’s not the main point.
> 
> It’s better to have a single patch and refine the changelog to focus on
> the core and essential problem, avoiding too many unrelated details.
> 

Thank you, will update.

Best regards,
Ridong

>>
>> This issue is fixed when CONFIG_LRU_GEN_ENABLED is enabled with the
>> commit 359a5e1416ca ("mm: multi-gen LRU: retry folios written back while
>> isolated"). This issue should be fixed for active/inactive lru in the
>> same way.
>>
>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>> ---
>>  mm/vmscan.c | 35 +++++++++++++++++++++++++++++++++--
>>  1 file changed, 33 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index af1ff76f83e7..1f0d194f8b2f 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1949,6 +1949,25 @@ static int current_may_throttle(void)
>>         return !(current->flags & PF_LOCAL_THROTTLE);
>>  }
>>
>> +static inline void acc_reclaimed_stat(struct reclaim_stat *stat,
>> +               struct reclaim_stat *curr)
>> +{
>> +       int i;
>> +
>> +       stat->nr_dirty += curr->nr_dirty;
>> +       stat->nr_unqueued_dirty += curr->nr_unqueued_dirty;
>> +       stat->nr_congested += curr->nr_congested;
>> +       stat->nr_writeback += curr->nr_writeback;
>> +       stat->nr_immediate += curr->nr_immediate;
>> +       stat->nr_pageout += curr->nr_pageout;
>> +       stat->nr_ref_keep += curr->nr_ref_keep;
>> +       stat->nr_unmap_fail += curr->nr_unmap_fail;
>> +       stat->nr_lazyfree_fail += curr->nr_lazyfree_fail;
>> +       stat->nr_demoted += curr->nr_demoted;
>> +       for (i = 0; i < ANON_AND_FILE; i++)
>> +               stat->nr_activate[i] += curr->nr_activate[i];
>> +}
>> +
>>  /*
>>   * shrink_inactive_list() is a helper for shrink_node().  It returns the number
>>   * of reclaimed pages
>> @@ -1958,14 +1977,16 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
>>                 enum lru_list lru)
>>  {
>>         LIST_HEAD(folio_list);
>> +       LIST_HEAD(clean_list);
>>         unsigned long nr_scanned;
>>         unsigned int nr_reclaimed = 0;
>>         unsigned long nr_taken;
>> -       struct reclaim_stat stat;
>> +       struct reclaim_stat stat, curr;
>>         bool file = is_file_lru(lru);
>>         enum vm_event_item item;
>>         struct pglist_data *pgdat = lruvec_pgdat(lruvec);
>>         bool stalled = false;
>> +       bool skip_retry = false;
>>
>>         while (unlikely(too_many_isolated(pgdat, file, sc))) {
>>                 if (stalled)
>> @@ -1999,10 +2020,20 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
>>         if (nr_taken == 0)
>>                 return 0;
>>
>> -       nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);
>> +       memset(&stat, 0, sizeof(stat));
>> +retry:
>> +       nr_reclaimed += shrink_folio_list(&folio_list, pgdat, sc, &curr, false);
>> +       find_folios_written_back(&folio_list, &clean_list, skip_retry);
>> +       acc_reclaimed_stat(&stat, &curr);
>>
>>         spin_lock_irq(&lruvec->lru_lock);
>>         move_folios_to_lru(lruvec, &folio_list);
>> +       if (!list_empty(&clean_list)) {
>> +               list_splice_init(&clean_list, &folio_list);
>> +               skip_retry = true;
>> +               spin_unlock_irq(&lruvec->lru_lock);
>> +               goto retry;
>> +       }
>>
>>         __mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
>>                                         stat.nr_demoted);
>> --
>> 2.34.1
>>
> 
> Thanks
> Barry



end of thread, other threads:[~2024-12-05  2:06 UTC | newest]

Thread overview: 8+ messages
2024-12-04  4:01 [RFC PATCH v3 0/2] mm: vmscan: retry folios written back while isolated Chen Ridong
2024-12-04  4:01 ` [RFC PATCH v3 1/2] mm: vmascan: add find_folios_written_back() helper Chen Ridong
2024-12-04 10:37   ` Barry Song
2024-12-04 10:38   ` kernel test robot
2024-12-04 14:03   ` kernel test robot
2024-12-04  4:01 ` [RFC PATCH v3 2/2] mm: vmscan: retry folios written back while isolated Chen Ridong
2024-12-04 10:45   ` Barry Song
2024-12-05  2:06     ` chenridong
