From: Hugh Dickins <hughd@google.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>,
Shakeel Butt <shakeel.butt@linux.dev>,
akpm@linux-foundation.org, hannes@cmpxchg.org, david@redhat.com,
mhocko@kernel.org, zhengqi.arch@bytedance.com,
lorenzo.stoakes@oracle.com, willy@infradead.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: vmscan: remove folio_test_private() check in pageout()
Date: Wed, 17 Sep 2025 00:49:43 -0700 (PDT) [thread overview]
Message-ID: <1111883c-974f-e4da-a38f-bb3d337185ad@google.com> (raw)
In-Reply-To: <7eace9f6-e257-498d-ac10-ae86b5713e3a@linux.alibaba.com>
[-- Attachment #1: Type: text/plain, Size: 4700 bytes --]
On Wed, 17 Sep 2025, Baolin Wang wrote:
> On 2025/9/16 15:18, Baolin Wang wrote:
...
> >
> > Additionally, I'm still struggling to understand this case where a folio is
> > dirty but has a NULL mapping, but I might understand that ext3 journaling
> > might do this from the comments in truncate_cleanup_folio().
> >
> > But I still doubt whether this case exists because the refcount check in
> > is_page_cache_freeable() considers the pagecache. This means if this dirty
> > folio's mapping is NULL, the following check would return false. If it
> > returns true, it means that even if we release the private data here, the
> > orphaned folio's refcount still doesn't meet the requirements for being
> > reclaimed. Please correct me if I missed anything.
> >
> > static inline int is_page_cache_freeable(struct folio *folio)
> > {
> > /*
> > * A freeable page cache folio is referenced only by the caller
> > * that isolated the folio, the page cache and optional filesystem
> > * private data at folio->private.
> > */
> > return folio_ref_count(folio) - folio_test_private(folio) ==
> > 1 + folio_nr_pages(folio);
> > }
> >
Good point, yes, it's surprising that that such a folio could pass
that check and reach the code you're proposing to delete.
(Though a racing scanner of physical memory could raise the refcount
momentarily, causing the folio to look like a page cache freeable.)
>
> I continued to dig into the historical commits, where the private check was
> introduced in 2005 by commit ce91b575332b ("orphaned pagecache memleak fix"),
> as the commit message mentioned, it was to address the issue where reiserfs
> pagecache may be truncated while still pinned:
Yes, I had been doing the same research, coming to that same 2.6.12 commit,
one of the last to go in before the birth of git.
>
> "
> Chris found that with data journaling a reiserfs pagecache may be truncate
> while still pinned. The truncation removes the page->mapping, but the page is
> still listed in the VM queues because it still has buffers. Then during the
> journaling process, a buffer is marked dirty and that sets the PG_dirty
> bitflag as well (in mark_buffer_dirty). After that the page is leaked because
> it's both dirty and without a mapping.
>
> So we must allow pages without mapping and dirty to reach the PagePrivate
> check. The page->mapping will be checked again right after the PagePrivate
> check.
> "
>
> In 2008, commit a2b345642f530 ("Fix dirty page accounting leak with ext3
> data=journal") seems to be dealing with a similar issue, where the page
> becomes dirty after truncation, and provides a very useful call stack:
> truncate_complete_page()
> cancel_dirty_page() // PG_dirty cleared, decr. dirty pages
> do_invalidatepage()
> ext3_invalidatepage()
> journal_invalidatepage()
> journal_unmap_buffer()
> __dispose_buffer()
> __journal_unfile_buffer()
> __journal_temp_unlink_buffer()
> mark_buffer_dirty(); // PG_dirty set, incr. dirty pages
>
> In this fix, we forcefully clear the page's dirty flag during truncation (in
> truncate_complete_page()).
But missed that one.
>
> However, I am still unsure how the reiserfs case is checked through
> is_page_cache_freeable() (if the pagecache is truncated, then the pagecache
> refcount would be decreased). Fortunately, reiserfs was removed in 2024 by
> commit fb6f20ecb121 ("reiserfs: The last commit").
I did find a single report of the "pageout: orphaned page" message
(where Andrew claims the message as his forgotten temporary debugging):
https://lore.kernel.org/all/20061002170353.GA26816@king.bitgnome.net/
From 2006 on 2.6.18: and indeed it was on reiserfs - maybe reiserfs
had some extra refcounting on these pages, which caused them to pass
the is_page_cache_freeable() check (but would they actually be freeable,
or leaked? TBH I haven't tried to work that out, nor care very much).
Where does this leave us? I think it says that your deletion of that
block from pageout() is acceptable now, with reiserfs gone to history.
Though somehow I would prefer, like that ext3 fix, that we would just
clear dirty on such a folio (to avoid "Bad page state" later if it is
freeable), not go to pageout(), but proceed to the folio_needs_release()
block like for clean folios.
But whatever: you've persuaded me! I withdraw my objection to your patch.
Thanks,
Hugh
next prev parent reply other threads:[~2025-09-17 7:50 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-12 3:45 [PATCH 0/2] some cleanups for pageout() Baolin Wang
2025-09-12 3:45 ` [PATCH 1/2] mm: vmscan: remove folio_test_private() check in pageout() Baolin Wang
2025-09-12 8:24 ` David Hildenbrand
2025-09-12 8:24 ` David Hildenbrand
2025-09-12 8:31 ` Baolin Wang
2025-09-12 8:57 ` David Hildenbrand
2025-09-12 9:03 ` Baolin Wang
2025-09-12 8:57 ` David Hildenbrand
2025-09-12 15:21 ` Matthew Wilcox
2025-09-13 3:04 ` Baolin Wang
2025-09-15 20:47 ` Matthew Wilcox
2025-09-18 2:45 ` Baolin Wang
2025-09-12 16:13 ` Shakeel Butt
2025-09-13 3:24 ` Baolin Wang
2025-09-15 20:00 ` Shakeel Butt
2025-09-16 4:00 ` Hugh Dickins
2025-09-16 7:18 ` Baolin Wang
2025-09-17 3:50 ` Baolin Wang
2025-09-17 7:49 ` Hugh Dickins [this message]
2025-09-18 2:22 ` Baolin Wang
2025-09-12 3:45 ` [PATCH 2/2] mm: vmscan: simplify the folio refcount " Baolin Wang
2025-09-12 8:58 ` David Hildenbrand
2025-09-12 16:16 ` Shakeel Butt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1111883c-974f-e4da-a38f-bb3d337185ad@google.com \
--to=hughd@google.com \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@kernel.org \
--cc=shakeel.butt@linux.dev \
--cc=willy@infradead.org \
--cc=zhengqi.arch@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.