From: Hugh Dickins <hughd@google.com>
To: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
akpm@linux-foundation.org
Subject: Re: [PATCH v8.1 00/31] Memory Folios
Date: Fri, 30 Apr 2021 11:47:54 -0700 (PDT)
Message-ID: <alpine.LSU.2.11.2104301141320.16885@eggly.anvils>
In-Reply-To: <20210430180740.2707166-1-willy@infradead.org>

Adding Linus to the Cc (of this one only): he surely has an interest.

On Fri, 30 Apr 2021, Matthew Wilcox (Oracle) wrote:
> Managing memory in 4KiB pages is a serious overhead. Many benchmarks
> benefit from a larger "page size". As an example, an earlier iteration
> of this idea which used compound pages (and wasn't particularly tuned)
> got a 7% performance boost when compiling the kernel.
>
> Using compound pages or THPs exposes a serious weakness in our type
> system. Functions are often unprepared for compound pages to be passed
> to them, and may only act on PAGE_SIZE chunks. Even functions which are
> aware of compound pages may expect a head page, and do the wrong thing
> if passed a tail page.
>
> There have been efforts to label function parameters as 'head' instead
> of 'page' to indicate that the function expects a head page, but this
> leaves us with runtime assertions instead of using the compiler to prove
> that nobody has mistakenly passed a tail page. Calling a struct page
> 'head' is also inaccurate as they will work perfectly well on base pages.
>
> We also waste a lot of instructions ensuring that we're not looking at
> a tail page. Almost every call to PageFoo() contains one or more hidden
> calls to compound_head(). This also happens for get_page(), put_page()
> and many more functions. There does not appear to be a way to tell gcc
> that it can cache the result of compound_head(), nor is there a way to
> tell it that compound_head() is idempotent.
>
> This series introduces the 'struct folio' as a replacement for
> head-or-base pages. This initial set reduces the kernel size by
> approximately 6kB by removing conversions from tail pages to head pages.
> The real purpose of this series is adding infrastructure to enable
> further use of the folio.
>
> The medium-term goal is to convert all filesystems and some device
> drivers to work in terms of folios. This series contains a lot of
> explicit conversions, but it's important to realise it's removing a lot
> of implicit conversions in some relatively hot paths. There will be very
> few conversions from folios when this work is completed; filesystems,
> the page cache, the LRU and so on will generally only deal with folios.
>
> The text size reduces by between 6kB (a config based on Oracle UEK)
> and 1.2kB (allnoconfig). Performance seems almost unaffected based
> on kernbench.
>
> Current tree at:
> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio
>
> (contains another ~120 patches on top of this batch, not all of which are
> in good shape for submission)
>
> v8.1:
> - Rebase on next-20210430
> - You need https://lore.kernel.org/linux-mm/20210430145549.2662354-1-willy@infradead.org/ first
> - Big renaming (thanks to peterz):
>   - PageFoo() becomes folio_foo()
>   - SetFolioFoo() becomes folio_set_foo()
>   - ClearFolioFoo() becomes folio_clear_foo()
>   - __SetFolioFoo() becomes __folio_set_foo()
>   - __ClearFolioFoo() becomes __folio_clear_foo()
>   - TestSetPageFoo() becomes folio_test_set_foo()
>   - TestClearPageFoo() becomes folio_test_clear_foo()
>   - PageHuge() is now folio_hugetlb()
>   - put_folio() becomes folio_put()
>   - get_folio() becomes folio_get()
>   - put_folio_testzero() becomes folio_put_testzero()
>   - set_folio_count() becomes folio_set_count()
>   - attach_folio_private() becomes folio_attach_private()
>   - detach_folio_private() becomes folio_detach_private()
>   - lock_folio() becomes folio_lock()
>   - unlock_folio() becomes folio_unlock()
>   - trylock_folio() becomes folio_trylock()
>   - __lock_folio_or_retry becomes __folio_lock_or_retry()
>   - __lock_folio_async() becomes __folio_lock_async()
>   - wake_up_folio_bit() becomes folio_wake_bit()
>   - wake_up_folio() becomes folio_wake()
>   - wait_on_folio_bit() becomes folio_wait_bit()
>   - wait_for_stable_folio() becomes folio_wait_stable()
>   - wait_on_folio() becomes folio_wait()
>   - wait_on_folio_locked() becomes folio_wait_locked()
>   - wait_on_folio_writeback() becomes folio_wait_writeback()
>   - end_folio_writeback() becomes folio_end_writeback()
>   - add_folio_wait_queue() becomes folio_add_wait_queue()
> - Add folio_young() and folio_idle() family of functions
> - Move page_folio() to page-flags.h and use _compound_head()
> - Make page_folio() const-preserving
> - Add folio_page() to get the nth page from a folio
> - Improve struct folio kernel-doc
> - Convert folio flag tests to return bool instead of int
> - Eliminate set_folio_private()
> - folio_get_private() is the equivalent of page_private() (as folio_private()
>   is now a test for whether the private flag is set on the folio)
> - Move folio_rotate_reclaimable() into this patchset
> - Add page-flags.h to the kernel-doc
> - Add netfs.h to the kernel-doc
> - Add a family of folio_lock_lruvec() wrappers
> - Add a family of folio_relock_lruvec() wrappers
>
> v7:
> https://lore.kernel.org/linux-mm/20210409185105.188284-1-willy@infradead.org/
>
> Matthew Wilcox (Oracle) (31):
> mm: Introduce struct folio
> mm: Add folio_pgdat and folio_zone
> mm/vmstat: Add functions to account folio statistics
> mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
> mm: Add folio reference count functions
> mm: Add folio_put
> mm: Add folio_get
> mm: Add folio flag manipulation functions
> mm: Add folio_young() and folio_idle()
> mm: Handle per-folio private data
> mm/filemap: Add folio_index, folio_file_page and folio_contains
> mm/filemap: Add folio_next_index
> mm/filemap: Add folio_offset and folio_file_offset
> mm/util: Add folio_mapping and folio_file_mapping
> mm: Add folio_mapcount
> mm/memcg: Add folio wrappers for various functions
> mm/filemap: Add folio_unlock
> mm/filemap: Add folio_lock
> mm/filemap: Add folio_lock_killable
> mm/filemap: Add __folio_lock_async
> mm/filemap: Add __folio_lock_or_retry
> mm/filemap: Add folio_wait_locked
> mm/swap: Add folio_rotate_reclaimable
> mm/filemap: Add folio_end_writeback
> mm/writeback: Add folio_wait_writeback
> mm/writeback: Add folio_wait_stable
> mm/filemap: Add folio_wait_bit
> mm/filemap: Add folio_wake_bit
> mm/filemap: Convert page wait queues to be folios
> mm/filemap: Add folio private_2 functions
> fs/netfs: Add folio fscache functions
>
> Documentation/core-api/mm-api.rst | 4 +
> Documentation/filesystems/netfs_library.rst | 2 +
> fs/afs/write.c | 9 +-
> fs/cachefiles/rdwr.c | 16 +-
> fs/io_uring.c | 2 +-
> include/linux/memcontrol.h | 58 ++++
> include/linux/mm.h | 173 ++++++++++--
> include/linux/mm_types.h | 71 +++++
> include/linux/mmdebug.h | 20 ++
> include/linux/netfs.h | 77 +++--
> include/linux/page-flags.h | 222 +++++++++++----
> include/linux/page_idle.h | 99 ++++---
> include/linux/page_ref.h | 88 +++++-
> include/linux/pagemap.h | 276 +++++++++++++-----
> include/linux/swap.h | 7 +-
> include/linux/vmstat.h | 107 +++++++
> mm/Makefile | 2 +-
> mm/filemap.c | 295 ++++++++++----------
> mm/folio-compat.c | 37 +++
> mm/internal.h | 1 +
> mm/memory.c | 8 +-
> mm/page-writeback.c | 72 +++--
> mm/page_io.c | 4 +-
> mm/swap.c | 18 +-
> mm/swapfile.c | 8 +-
> mm/util.c | 30 +-
> 26 files changed, 1247 insertions(+), 459 deletions(-)
> create mode 100644 mm/folio-compat.c
>
> --
> 2.30.2