* [PATCH v8 01/27] mm: Introduce struct folio
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm; +Cc: Matthew Wilcox (Oracle), Jeff Layton
A struct folio is a new abstraction to replace the venerable struct page.
A function which takes a struct folio argument declares that it will
operate on the entire (possibly compound) page, not just PAGE_SIZE bytes.
In return, the caller guarantees that the pointer it is passing does
not point to a tail page.
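To illustrate the contract (a hypothetical sketch, not part of this patch;
folio_zero() is an invented name), a function taking a struct folio covers
every page of the compound page, and a caller holding an arbitrary page
converts at the boundary with page_folio():

	/* Zero an entire folio, whatever its order. */
	static void folio_zero(struct folio *folio)
	{
		unsigned long i;

		for (i = 0; i < folio_nr_pages(folio); i++)
			clear_highpage(folio_page(folio, i));
	}

	/* A caller with a (possibly tail) struct page converts first: */
	folio_zero(page_folio(page));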
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
Documentation/core-api/mm-api.rst | 1 +
include/linux/mm.h | 74 +++++++++++++++++++++++++++++++
include/linux/mm_types.h | 60 +++++++++++++++++++++++++
include/linux/page-flags.h | 27 +++++++++++
4 files changed, 162 insertions(+)
diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index 34f46df91a8b..cbf5858dadac 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -95,5 +95,6 @@ More Memory Management Functions
.. kernel-doc:: mm/mempolicy.c
.. kernel-doc:: include/linux/mm_types.h
:internal:
+.. kernel-doc:: include/linux/page-flags.h
.. kernel-doc:: include/linux/mm.h
:internal:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2327f99b121f..b29c86824e6b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -950,6 +950,20 @@ static inline unsigned int compound_order(struct page *page)
return page[1].compound_order;
}
+/**
+ * folio_order - The allocation order of a folio.
+ * @folio: The folio.
+ *
+ * A folio is composed of 2^order pages. See get_order() for the definition
+ * of order.
+ *
+ * Return: The order of the folio.
+ */
+static inline unsigned int folio_order(struct folio *folio)
+{
+ return compound_order(&folio->page);
+}
+
static inline bool hpage_pincount_available(struct page *page)
{
/*
@@ -1595,6 +1609,65 @@ static inline void set_page_links(struct page *page, enum zone_type zone,
#endif
}
+/**
+ * folio_nr_pages - The number of pages in the folio.
+ * @folio: The folio.
+ *
+ * Return: A number which is a power of two.
+ */
+static inline unsigned long folio_nr_pages(struct folio *folio)
+{
+ return compound_nr(&folio->page);
+}
+
+/**
+ * folio_next - Move to the next physical folio.
+ * @folio: The folio we're currently operating on.
+ *
+ * If you have physically contiguous memory which may span more than
+ * one folio (eg a &struct bio_vec), use this function to move from one
+ * folio to the next. Do not use it if the memory is only virtually
+ * contiguous as the folios are almost certainly not adjacent to each
+ * other. This is the folio equivalent to writing ``page++``.
+ *
+ * Context: We assume that the folios are refcounted and/or locked at a
+ * higher level and do not adjust the reference counts.
+ * Return: The next struct folio.
+ */
+static inline struct folio *folio_next(struct folio *folio)
+{
+ return (struct folio *)folio_page(folio, folio_nr_pages(folio));
+}
+
+/**
+ * folio_shift - The number of bits covered by this folio.
+ * @folio: The folio.
+ *
+ * A folio contains a number of bytes which is a power-of-two in size.
+ * This function tells you which power of two that is.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split. It is not necessary for the folio to be locked.
+ * Return: The base-2 logarithm of the size of this folio.
+ */
+static inline unsigned int folio_shift(struct folio *folio)
+{
+ return PAGE_SHIFT + folio_order(folio);
+}
+
+/**
+ * folio_size - The number of bytes in a folio.
+ * @folio: The folio.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split. It is not necessary for the folio to be locked.
+ * Return: The number of bytes in this folio.
+ */
+static inline size_t folio_size(struct folio *folio)
+{
+ return PAGE_SIZE << folio_order(folio);
+}
+
/*
* Some inline functions in vmstat.h depend on page_zone()
*/
@@ -1699,6 +1772,7 @@ extern void pagefault_out_of_memory(void);
#define offset_in_page(p) ((unsigned long)(p) & ~PAGE_MASK)
#define offset_in_thp(page, p) ((unsigned long)(p) & (thp_size(page) - 1))
+#define offset_in_folio(folio, p) ((unsigned long)(p) & (folio_size(folio) - 1))
/*
* Flags passed to show_mem() and show_free_areas() to suppress output in
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5aacc1c10a45..276e358c75d3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -224,6 +224,66 @@ struct page {
#endif
} _struct_page_alignment;
+/**
+ * struct folio - Represents a contiguous set of bytes.
+ * @flags: Identical to the page flags.
+ * @lru: Least Recently Used list; tracks how recently this folio was used.
+ * @mapping: The file this page belongs to, or refers to the anon_vma for
+ * anonymous pages.
+ * @index: Offset within the file, in units of pages. For anonymous pages,
+ * this is the index from the beginning of the mmap.
+ * @private: Filesystem per-folio data (see folio_attach_private()).
+ * Used for swp_entry_t if folio_swapcache().
+ * @_mapcount: Do not access this member directly. Use folio_mapcount() to
+ * find out how many times this folio is mapped by userspace.
+ * @_refcount: Do not access this member directly. Use folio_ref_count()
+ * to find how many references there are to this folio.
+ * @memcg_data: Memory Control Group data.
+ *
+ * A folio is a physically, virtually and logically contiguous set
+ * of bytes. It is a power-of-two in size, and it is aligned to that
+ * same power-of-two. It is at least as large as %PAGE_SIZE. If it is
+ * in the page cache, it is at a file offset which is a multiple of that
+ * power-of-two. It may be mapped into userspace at an address which is
+ * at an arbitrary page offset, but its kernel virtual address is aligned
+ * to its size.
+ */
+struct folio {
+ /* private: don't document the anon union */
+ union {
+ struct {
+ /* public: */
+ unsigned long flags;
+ struct list_head lru;
+ struct address_space *mapping;
+ pgoff_t index;
+ unsigned long private;
+ atomic_t _mapcount;
+ atomic_t _refcount;
+#ifdef CONFIG_MEMCG
+ unsigned long memcg_data;
+#endif
+ /* private: the union with struct page is transitional */
+ };
+ struct page page;
+ };
+};
+
+static_assert(sizeof(struct page) == sizeof(struct folio));
+#define FOLIO_MATCH(pg, fl) \
+ static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
+FOLIO_MATCH(flags, flags);
+FOLIO_MATCH(lru, lru);
+FOLIO_MATCH(compound_head, lru);
+FOLIO_MATCH(index, index);
+FOLIO_MATCH(private, private);
+FOLIO_MATCH(_mapcount, _mapcount);
+FOLIO_MATCH(_refcount, _refcount);
+#ifdef CONFIG_MEMCG
+FOLIO_MATCH(memcg_data, memcg_data);
+#endif
+#undef FOLIO_MATCH
+
static inline atomic_t *compound_mapcount_ptr(struct page *page)
{
return &page[1].compound_mapcount;
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index d8e26243db25..e069aa8b11b7 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -188,6 +188,33 @@ static inline unsigned long _compound_head(const struct page *page)
#define compound_head(page) ((typeof(page))_compound_head(page))
+/**
+ * page_folio - Converts from page to folio.
+ * @p: The page.
+ *
+ * Every page is part of a folio. This function cannot be called on a
+ * NULL pointer.
+ *
+ * Context: Neither a reference nor a lock is required on @p. If the caller
+ * does not hold a reference, this call may race with a folio split, so
+ * it should re-check the folio still contains this page after gaining
+ * a reference on the folio.
+ * Return: The folio which contains this page.
+ */
+#define page_folio(p) (_Generic((p), \
+ const struct page *: (const struct folio *)_compound_head(p), \
+ struct page *: (struct folio *)_compound_head(p)))
+
+/**
+ * folio_page - Return a page from a folio.
+ * @folio: The folio.
+ * @n: The page number to return.
+ *
+ * @n is relative to the start of the folio. It should be between
+ * 0 and folio_nr_pages(@folio) - 1, but this is not checked.
+ */
+#define folio_page(folio, n) nth_page(&(folio)->page, n)
+
static __always_inline int PageTail(struct page *page)
{
return READ_ONCE(page->compound_head) & 1;
--
2.30.2
* [PATCH v8 02/27] mm: Add folio_pgdat and folio_zone
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Zi Yan, Christoph Hellwig, Jeff Layton
These are just convenience wrappers for callers with folios; pgdat and
zone can be reached from tail pages as well as head pages.
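A hypothetical usage sketch (not part of the patch; folio_is_zone_movable()
is an invented name): the wrappers avoid spelling out &folio->page at every
call site:

	static bool folio_is_zone_movable(struct folio *folio)
	{
		return zone_idx(folio_zone(folio)) == ZONE_MOVABLE;
	}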
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/mm.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b29c86824e6b..a55c2c0628b6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1560,6 +1560,16 @@ static inline pg_data_t *page_pgdat(const struct page *page)
return NODE_DATA(page_to_nid(page));
}
+static inline struct zone *folio_zone(const struct folio *folio)
+{
+ return page_zone(&folio->page);
+}
+
+static inline pg_data_t *folio_pgdat(const struct folio *folio)
+{
+ return page_pgdat(&folio->page);
+}
+
#ifdef SECTION_IN_PAGE_FLAGS
static inline void set_page_section(struct page *page, unsigned long section)
{
--
2.30.2
* [PATCH v8 03/27] mm/vmstat: Add functions to account folio statistics
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
Allow page counters to be more readily modified by callers which have
a folio. Name these wrappers with 'stat' instead of 'state' as requested
by Linus here:
https://lore.kernel.org/linux-mm/CAHk-=wj847SudR-kt+46fT3+xFFgiwpgThvm7DJWGdi4cVrbnQ@mail.gmail.com/
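As a hedged before/after sketch (account_folio_lru() is a hypothetical
caller; NR_FILE_PAGES is a real node_stat_item), a caller which previously
combined page_pgdat() with thp_nr_pages() can now write:

	static void account_folio_lru(struct folio *folio)
	{
		/* was: __mod_node_page_state(page_pgdat(page),
		 *		NR_FILE_PAGES, thp_nr_pages(page)); */
		__node_stat_add_folio(folio, NR_FILE_PAGES);
	}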
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/vmstat.h | 107 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 107 insertions(+)
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 3299cd69e4ca..d287d7c31b8f 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -402,6 +402,78 @@ static inline void drain_zonestat(struct zone *zone,
struct per_cpu_pageset *pset) { }
#endif /* CONFIG_SMP */
+static inline void __zone_stat_mod_folio(struct folio *folio,
+ enum zone_stat_item item, long nr)
+{
+ __mod_zone_page_state(folio_zone(folio), item, nr);
+}
+
+static inline void __zone_stat_add_folio(struct folio *folio,
+ enum zone_stat_item item)
+{
+ __mod_zone_page_state(folio_zone(folio), item, folio_nr_pages(folio));
+}
+
+static inline void __zone_stat_sub_folio(struct folio *folio,
+ enum zone_stat_item item)
+{
+ __mod_zone_page_state(folio_zone(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void zone_stat_mod_folio(struct folio *folio,
+ enum zone_stat_item item, long nr)
+{
+ mod_zone_page_state(folio_zone(folio), item, nr);
+}
+
+static inline void zone_stat_add_folio(struct folio *folio,
+ enum zone_stat_item item)
+{
+ mod_zone_page_state(folio_zone(folio), item, folio_nr_pages(folio));
+}
+
+static inline void zone_stat_sub_folio(struct folio *folio,
+ enum zone_stat_item item)
+{
+ mod_zone_page_state(folio_zone(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void __node_stat_mod_folio(struct folio *folio,
+ enum node_stat_item item, long nr)
+{
+ __mod_node_page_state(folio_pgdat(folio), item, nr);
+}
+
+static inline void __node_stat_add_folio(struct folio *folio,
+ enum node_stat_item item)
+{
+ __mod_node_page_state(folio_pgdat(folio), item, folio_nr_pages(folio));
+}
+
+static inline void __node_stat_sub_folio(struct folio *folio,
+ enum node_stat_item item)
+{
+ __mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void node_stat_mod_folio(struct folio *folio,
+ enum node_stat_item item, long nr)
+{
+ mod_node_page_state(folio_pgdat(folio), item, nr);
+}
+
+static inline void node_stat_add_folio(struct folio *folio,
+ enum node_stat_item item)
+{
+ mod_node_page_state(folio_pgdat(folio), item, folio_nr_pages(folio));
+}
+
+static inline void node_stat_sub_folio(struct folio *folio,
+ enum node_stat_item item)
+{
+ mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
+}
+
static inline void __mod_zone_freepage_state(struct zone *zone, int nr_pages,
int migratetype)
{
@@ -530,6 +602,24 @@ static inline void __dec_lruvec_page_state(struct page *page,
__mod_lruvec_page_state(page, idx, -1);
}
+static inline void __lruvec_stat_mod_folio(struct folio *folio,
+ enum node_stat_item idx, int val)
+{
+ __mod_lruvec_page_state(&folio->page, idx, val);
+}
+
+static inline void __lruvec_stat_add_folio(struct folio *folio,
+ enum node_stat_item idx)
+{
+ __lruvec_stat_mod_folio(folio, idx, folio_nr_pages(folio));
+}
+
+static inline void __lruvec_stat_sub_folio(struct folio *folio,
+ enum node_stat_item idx)
+{
+ __lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
+}
+
static inline void inc_lruvec_page_state(struct page *page,
enum node_stat_item idx)
{
@@ -542,4 +632,21 @@ static inline void dec_lruvec_page_state(struct page *page,
mod_lruvec_page_state(page, idx, -1);
}
+static inline void lruvec_stat_mod_folio(struct folio *folio,
+ enum node_stat_item idx, int val)
+{
+ mod_lruvec_page_state(&folio->page, idx, val);
+}
+
+static inline void lruvec_stat_add_folio(struct folio *folio,
+ enum node_stat_item idx)
+{
+ lruvec_stat_mod_folio(folio, idx, folio_nr_pages(folio));
+}
+
+static inline void lruvec_stat_sub_folio(struct folio *folio,
+ enum node_stat_item idx)
+{
+ lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
+}
#endif /* _LINUX_VMSTAT_H */
--
2.30.2
* [PATCH v8 04/27] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Zi Yan, Christoph Hellwig, Jeff Layton
These are the folio equivalents of VM_BUG_ON_PAGE and VM_WARN_ON_ONCE_PAGE.
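A minimal usage sketch (expect_order0() is a hypothetical caller, not part
of this patch): with CONFIG_DEBUG_VM the folio is dumped before BUG(), and
without it the check compiles away entirely:

	static void expect_order0(struct folio *folio)
	{
		/* e.g. a caller which cannot handle multi-page folios yet */
		VM_BUG_ON_FOLIO(folio_nr_pages(folio) != 1, folio);
	}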
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/mmdebug.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
index 1935d4c72d10..d7285f8148a3 100644
--- a/include/linux/mmdebug.h
+++ b/include/linux/mmdebug.h
@@ -22,6 +22,13 @@ void dump_mm(const struct mm_struct *mm);
BUG(); \
} \
} while (0)
+#define VM_BUG_ON_FOLIO(cond, folio) \
+ do { \
+ if (unlikely(cond)) { \
+ dump_page(&folio->page, "VM_BUG_ON_FOLIO(" __stringify(cond)")");\
+ BUG(); \
+ } \
+ } while (0)
#define VM_BUG_ON_VMA(cond, vma) \
do { \
if (unlikely(cond)) { \
@@ -47,6 +54,17 @@ void dump_mm(const struct mm_struct *mm);
} \
unlikely(__ret_warn_once); \
})
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio) ({ \
+ static bool __section(".data.once") __warned; \
+ int __ret_warn_once = !!(cond); \
+ \
+ if (unlikely(__ret_warn_once && !__warned)) { \
+ dump_page(&folio->page, "VM_WARN_ON_ONCE_FOLIO(" __stringify(cond)")");\
+ __warned = true; \
+ WARN_ON(1); \
+ } \
+ unlikely(__ret_warn_once); \
+})
#define VM_WARN_ON(cond) (void)WARN_ON(cond)
#define VM_WARN_ON_ONCE(cond) (void)WARN_ON_ONCE(cond)
@@ -55,11 +73,13 @@ void dump_mm(const struct mm_struct *mm);
#else
#define VM_BUG_ON(cond) BUILD_BUG_ON_INVALID(cond)
#define VM_BUG_ON_PAGE(cond, page) VM_BUG_ON(cond)
+#define VM_BUG_ON_FOLIO(cond, folio) VM_BUG_ON(cond)
#define VM_BUG_ON_VMA(cond, vma) VM_BUG_ON(cond)
#define VM_BUG_ON_MM(cond, mm) VM_BUG_ON(cond)
#define VM_WARN_ON(cond) BUILD_BUG_ON_INVALID(cond)
#define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
#define VM_WARN_ON_ONCE_PAGE(cond, page) BUILD_BUG_ON_INVALID(cond)
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio) BUILD_BUG_ON_INVALID(cond)
#define VM_WARN_ONCE(cond, format...) BUILD_BUG_ON_INVALID(cond)
#define VM_WARN(cond, format...) BUILD_BUG_ON_INVALID(cond)
#endif
--
2.30.2
* [PATCH v8 05/27] mm: Add folio reference count functions
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
These functions mirror their page reference counterparts.
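As one hedged example of how these compose (folio_get_unless_zero() is an
illustrative name, not added by this patch), the classic speculative
reference pattern translates directly:

	static inline bool folio_get_unless_zero(struct folio *folio)
	{
		/* mirrors get_page_unless_zero() for pages */
		return folio_ref_add_unless(folio, 1, 0);
	}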
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
Documentation/core-api/mm-api.rst | 1 +
include/linux/page_ref.h | 88 ++++++++++++++++++++++++++++++-
2 files changed, 88 insertions(+), 1 deletion(-)
diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index cbf5858dadac..88545b2dce98 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -98,3 +98,4 @@ More Memory Management Functions
.. kernel-doc:: include/linux/page-flags.h
.. kernel-doc:: include/linux/mm.h
:internal:
+.. kernel-doc:: include/linux/page_ref.h
diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 7ad46f45df39..85816b2c0496 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -67,9 +67,31 @@ static inline int page_ref_count(const struct page *page)
return atomic_read(&page->_refcount);
}
+/**
+ * folio_ref_count - The reference count on this folio.
+ * @folio: The folio.
+ *
+ * The refcount is usually incremented by calls to folio_get() and
+ * decremented by calls to folio_put(). Some typical users of the
+ * folio refcount:
+ *
+ * - Each reference from a page table
+ * - The page cache
+ * - Filesystem private data
+ * - The LRU list
+ * - Pipes
+ * - Direct IO which references this folio in the process address space
+ *
+ * Return: The number of references to this folio.
+ */
+static inline int folio_ref_count(const struct folio *folio)
+{
+ return page_ref_count(&folio->page);
+}
+
static inline int page_count(const struct page *page)
{
- return atomic_read(&compound_head(page)->_refcount);
+ return folio_ref_count(page_folio(page));
}
static inline void set_page_count(struct page *page, int v)
@@ -79,6 +101,11 @@ static inline void set_page_count(struct page *page, int v)
__page_ref_set(page, v);
}
+static inline void folio_set_count(struct folio *folio, int v)
+{
+ set_page_count(&folio->page, v);
+}
+
/*
* Setup the page count before being freed into the page allocator for
* the first time (boot or memory hotplug)
@@ -95,6 +122,11 @@ static inline void page_ref_add(struct page *page, int nr)
__page_ref_mod(page, nr);
}
+static inline void folio_ref_add(struct folio *folio, int nr)
+{
+ page_ref_add(&folio->page, nr);
+}
+
static inline void page_ref_sub(struct page *page, int nr)
{
atomic_sub(nr, &page->_refcount);
@@ -102,6 +134,11 @@ static inline void page_ref_sub(struct page *page, int nr)
__page_ref_mod(page, -nr);
}
+static inline void folio_ref_sub(struct folio *folio, int nr)
+{
+ page_ref_sub(&folio->page, nr);
+}
+
static inline int page_ref_sub_return(struct page *page, int nr)
{
int ret = atomic_sub_return(nr, &page->_refcount);
@@ -111,6 +148,11 @@ static inline int page_ref_sub_return(struct page *page, int nr)
return ret;
}
+static inline int folio_ref_sub_return(struct folio *folio, int nr)
+{
+ return page_ref_sub_return(&folio->page, nr);
+}
+
static inline void page_ref_inc(struct page *page)
{
atomic_inc(&page->_refcount);
@@ -118,6 +160,11 @@ static inline void page_ref_inc(struct page *page)
__page_ref_mod(page, 1);
}
+static inline void folio_ref_inc(struct folio *folio)
+{
+ page_ref_inc(&folio->page);
+}
+
static inline void page_ref_dec(struct page *page)
{
atomic_dec(&page->_refcount);
@@ -125,6 +172,11 @@ static inline void page_ref_dec(struct page *page)
__page_ref_mod(page, -1);
}
+static inline void folio_ref_dec(struct folio *folio)
+{
+ page_ref_dec(&folio->page);
+}
+
static inline int page_ref_sub_and_test(struct page *page, int nr)
{
int ret = atomic_sub_and_test(nr, &page->_refcount);
@@ -134,6 +186,11 @@ static inline int page_ref_sub_and_test(struct page *page, int nr)
return ret;
}
+static inline int folio_ref_sub_and_test(struct folio *folio, int nr)
+{
+ return page_ref_sub_and_test(&folio->page, nr);
+}
+
static inline int page_ref_inc_return(struct page *page)
{
int ret = atomic_inc_return(&page->_refcount);
@@ -143,6 +200,11 @@ static inline int page_ref_inc_return(struct page *page)
return ret;
}
+static inline int folio_ref_inc_return(struct folio *folio)
+{
+ return page_ref_inc_return(&folio->page);
+}
+
static inline int page_ref_dec_and_test(struct page *page)
{
int ret = atomic_dec_and_test(&page->_refcount);
@@ -152,6 +214,11 @@ static inline int page_ref_dec_and_test(struct page *page)
return ret;
}
+static inline int folio_ref_dec_and_test(struct folio *folio)
+{
+ return page_ref_dec_and_test(&folio->page);
+}
+
static inline int page_ref_dec_return(struct page *page)
{
int ret = atomic_dec_return(&page->_refcount);
@@ -161,6 +228,11 @@ static inline int page_ref_dec_return(struct page *page)
return ret;
}
+static inline int folio_ref_dec_return(struct folio *folio)
+{
+ return page_ref_dec_return(&folio->page);
+}
+
static inline int page_ref_add_unless(struct page *page, int nr, int u)
{
int ret = atomic_add_unless(&page->_refcount, nr, u);
@@ -170,6 +242,11 @@ static inline int page_ref_add_unless(struct page *page, int nr, int u)
return ret;
}
+static inline int folio_ref_add_unless(struct folio *folio, int nr, int u)
+{
+ return page_ref_add_unless(&folio->page, nr, u);
+}
+
static inline int page_ref_freeze(struct page *page, int count)
{
int ret = likely(atomic_cmpxchg(&page->_refcount, count, 0) == count);
@@ -179,6 +256,11 @@ static inline int page_ref_freeze(struct page *page, int count)
return ret;
}
+static inline int folio_ref_freeze(struct folio *folio, int count)
+{
+ return page_ref_freeze(&folio->page, count);
+}
+
static inline void page_ref_unfreeze(struct page *page, int count)
{
VM_BUG_ON_PAGE(page_count(page) != 0, page);
@@ -189,4 +271,8 @@ static inline void page_ref_unfreeze(struct page *page, int count)
__page_ref_unfreeze(page, count);
}
+static inline void folio_ref_unfreeze(struct folio *folio, int count)
+{
+ page_ref_unfreeze(&folio->page, count);
+}
#endif
--
2.30.2
* [PATCH v8 06/27] mm: Add folio_put
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Zi Yan, Christoph Hellwig, Jeff Layton
If we know we have a folio, we can call folio_put() instead of put_page()
and save the overhead of calling compound_head(). It also skips the
devmap checks.
This commit looks like it should be a no-op, but actually saves 1312 bytes
of text with the distro-derived config that I'm testing. Some functions
grow a little while others shrink. I presume the compiler is making
different inlining decisions.
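A small usage sketch (release_folios() is a hypothetical helper, not part
of the patch): releasing a batch of folios goes straight to the refcount
without any per-call compound_head():

	static void release_folios(struct folio **folios, unsigned int nr)
	{
		while (nr--)
			folio_put(folios[nr]);
	}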
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/mm.h | 33 ++++++++++++++++++++++++++++-----
1 file changed, 28 insertions(+), 5 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a55c2c0628b6..610948f0cb43 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -751,6 +751,11 @@ static inline int put_page_testzero(struct page *page)
return page_ref_dec_and_test(page);
}
+static inline int folio_put_testzero(struct folio *folio)
+{
+ return put_page_testzero(&folio->page);
+}
+
/*
* Try to grab a ref unless the page has a refcount of zero, return false if
* that is the case.
@@ -1242,9 +1247,28 @@ static inline __must_check bool try_get_page(struct page *page)
return true;
}
+/**
+ * folio_put - Decrement the reference count on a folio.
+ * @folio: The folio.
+ *
+ * If the folio's reference count reaches zero, the memory will be
+ * released back to the page allocator and may be used by another
+ * allocation immediately. Do not access the memory or the struct folio
+ * after calling folio_put() unless you can be sure that it wasn't the
+ * last reference.
+ *
+ * Context: May be called in process or interrupt context, but not in NMI
+ * context. May be called while holding a spinlock.
+ */
+static inline void folio_put(struct folio *folio)
+{
+ if (folio_put_testzero(folio))
+ __put_page(&folio->page);
+}
+
static inline void put_page(struct page *page)
{
- page = compound_head(page);
+ struct folio *folio = page_folio(page);
/*
* For devmap managed pages we need to catch refcount transition from
@@ -1252,13 +1276,12 @@ static inline void put_page(struct page *page)
* need to inform the device driver through callback. See
* include/linux/memremap.h and HMM for details.
*/
- if (page_is_devmap_managed(page)) {
- put_devmap_managed_page(page);
+ if (page_is_devmap_managed(&folio->page)) {
+ put_devmap_managed_page(&folio->page);
return;
}
- if (put_page_testzero(page))
- __put_page(page);
+ folio_put(folio);
}
/*
--
2.30.2
* [PATCH v8 07/27] mm: Add folio_get
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Zi Yan, Christoph Hellwig, Jeff Layton
If we know we have a folio, we can call folio_get() instead
of get_page() and save the overhead of calling compound_head().
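A brief sketch (stash_folio() is a hypothetical name): taking an extra
reference before publishing a folio elsewhere, on the condition folio_get()
documents, namely that the caller already holds a reference:

	static void stash_folio(struct folio *folio, struct folio **slot)
	{
		folio_get(folio);	/* caller already holds a reference */
		*slot = folio;
	}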
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/mm.h | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 610948f0cb43..b133734a7530 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1219,18 +1219,26 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
}
/* 127: arbitrary random number, small enough to assemble well */
-#define page_ref_zero_or_close_to_overflow(page) \
- ((unsigned int) page_ref_count(page) + 127u <= 127u)
+#define folio_ref_zero_or_close_to_overflow(folio) \
+ ((unsigned int) folio_ref_count(folio) + 127u <= 127u)
+
+/**
+ * folio_get - Increment the reference count on a folio.
+ * @folio: The folio.
+ *
+ * Context: May be called in any context, as long as you know that
+ * you have a refcount on the folio. If you do not already have one,
+ * try_grab_page() may be the right interface for you to use.
+ */
+static inline void folio_get(struct folio *folio)
+{
+ VM_BUG_ON_FOLIO(folio_ref_zero_or_close_to_overflow(folio), folio);
+ folio_ref_inc(folio);
+}
static inline void get_page(struct page *page)
{
- page = compound_head(page);
- /*
- * Getting a normal page or the head of a compound page
- * requires to already have an elevated page->_refcount.
- */
- VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page);
- page_ref_inc(page);
+ folio_get(page_folio(page));
}
bool __must_check try_grab_page(struct page *page, unsigned int flags);
--
2.30.2
* [PATCH v8 08/27] mm: Add folio flag manipulation functions
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
These new functions are the folio analogues of the various PageFlags
functions. If CONFIG_DEBUG_VM_PGFLAGS is enabled, we check the folio
is not a tail page at every invocation. This will also catch the
PagePoisoned case as a poisoned page has every bit set, which would
include PageTail.
This saves 1727 bytes of text with the distro-derived config that
I'm testing due to removing a double call to compound_head() in
PageSwapCache().
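For example (expansion sketched by hand, abbreviated to three of the
generated functions), PAGEFLAG(Dirty, dirty, PF_HEAD) elsewhere in this
file now additionally produces:

	static __always_inline bool folio_dirty(struct folio *folio)
	{ return test_bit(PG_dirty, folio_flags(folio, FOLIO_PF_HEAD)); }
	static __always_inline void folio_set_dirty(struct folio *folio)
	{ set_bit(PG_dirty, folio_flags(folio, FOLIO_PF_HEAD)); }
	static __always_inline void folio_clear_dirty(struct folio *folio)
	{ clear_bit(PG_dirty, folio_flags(folio, FOLIO_PF_HEAD)); }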
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/page-flags.h | 195 ++++++++++++++++++++++++++-----------
1 file changed, 140 insertions(+), 55 deletions(-)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e069aa8b11b7..3f721431485e 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -140,6 +140,8 @@ enum pageflags {
#endif
__NR_PAGEFLAGS,
+ PG_readahead = PG_reclaim,
+
/* Filesystems */
PG_checked = PG_owner_priv_1,
@@ -239,6 +241,15 @@ static inline void page_init_poison(struct page *page, size_t size)
}
#endif
+static unsigned long *folio_flags(struct folio *folio, unsigned n)
+{
+ struct page *page = &folio->page;
+
+ VM_BUG_ON_PGFLAGS(PageTail(page), page);
+ VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
+ return &page[n].flags;
+}
+
/*
* Page flags policies wrt compound pages
*
@@ -283,34 +294,56 @@ static inline void page_init_poison(struct page *page, size_t size)
VM_BUG_ON_PGFLAGS(!PageHead(page), page); \
PF_POISONED_CHECK(&page[1]); })
+/* Which page is the flag stored in */
+#define FOLIO_PF_ANY 0
+#define FOLIO_PF_HEAD 0
+#define FOLIO_PF_ONLY_HEAD 0
+#define FOLIO_PF_NO_TAIL 0
+#define FOLIO_PF_NO_COMPOUND 0
+#define FOLIO_PF_SECOND 1
+
/*
* Macros to create function definitions for page flags
*/
#define TESTPAGEFLAG(uname, lname, policy) \
+static __always_inline bool folio_##lname(struct folio *folio) \
+ { return test_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
static __always_inline int Page##uname(struct page *page) \
{ return test_bit(PG_##lname, &policy(page, 0)->flags); }
#define SETPAGEFLAG(uname, lname, policy) \
+static __always_inline void folio_set_##lname(struct folio *folio) \
+ { set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
static __always_inline void SetPage##uname(struct page *page) \
{ set_bit(PG_##lname, &policy(page, 1)->flags); }
#define CLEARPAGEFLAG(uname, lname, policy) \
+static __always_inline void folio_clear_##lname(struct folio *folio) \
+ { clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
static __always_inline void ClearPage##uname(struct page *page) \
{ clear_bit(PG_##lname, &policy(page, 1)->flags); }
#define __SETPAGEFLAG(uname, lname, policy) \
+static __always_inline void __folio_set_##lname(struct folio *folio) \
+ { __set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
static __always_inline void __SetPage##uname(struct page *page) \
{ __set_bit(PG_##lname, &policy(page, 1)->flags); }
#define __CLEARPAGEFLAG(uname, lname, policy) \
+static __always_inline void __folio_clear_##lname(struct folio *folio) \
+ { __clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
static __always_inline void __ClearPage##uname(struct page *page) \
{ __clear_bit(PG_##lname, &policy(page, 1)->flags); }
#define TESTSETFLAG(uname, lname, policy) \
+static __always_inline bool folio_test_set_##lname(struct folio *folio) \
+ { return test_and_set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
static __always_inline int TestSetPage##uname(struct page *page) \
{ return test_and_set_bit(PG_##lname, &policy(page, 1)->flags); }
#define TESTCLEARFLAG(uname, lname, policy) \
+static __always_inline bool folio_test_clear_##lname(struct folio *folio) \
+{ return test_and_clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
static __always_inline int TestClearPage##uname(struct page *page) \
{ return test_and_clear_bit(PG_##lname, &policy(page, 1)->flags); }
@@ -328,29 +361,35 @@ static __always_inline int TestClearPage##uname(struct page *page) \
TESTSETFLAG(uname, lname, policy) \
TESTCLEARFLAG(uname, lname, policy)
-#define TESTPAGEFLAG_FALSE(uname) \
+#define TESTPAGEFLAG_FALSE(uname, lname) \
+static inline bool folio_##lname(const struct folio *folio) { return 0; } \
static inline int Page##uname(const struct page *page) { return 0; }
-#define SETPAGEFLAG_NOOP(uname) \
+#define SETPAGEFLAG_NOOP(uname, lname) \
+static inline void folio_set_##lname(struct folio *folio) { } \
static inline void SetPage##uname(struct page *page) { }
-#define CLEARPAGEFLAG_NOOP(uname) \
+#define CLEARPAGEFLAG_NOOP(uname, lname) \
+static inline void folio_clear_##lname(struct folio *folio) { } \
static inline void ClearPage##uname(struct page *page) { }
-#define __CLEARPAGEFLAG_NOOP(uname) \
+#define __CLEARPAGEFLAG_NOOP(uname, lname) \
+static inline void __folio_clear_##lname(struct folio *folio) { } \
static inline void __ClearPage##uname(struct page *page) { }
-#define TESTSETFLAG_FALSE(uname) \
+#define TESTSETFLAG_FALSE(uname, lname) \
+static inline bool folio_test_set_##lname(struct folio *folio) { return 0; } \
static inline int TestSetPage##uname(struct page *page) { return 0; }
-#define TESTCLEARFLAG_FALSE(uname) \
+#define TESTCLEARFLAG_FALSE(uname, lname) \
+static inline bool folio_test_clear_##lname(struct folio *folio) { return 0; } \
static inline int TestClearPage##uname(struct page *page) { return 0; }
-#define PAGEFLAG_FALSE(uname) TESTPAGEFLAG_FALSE(uname) \
- SETPAGEFLAG_NOOP(uname) CLEARPAGEFLAG_NOOP(uname)
+#define PAGEFLAG_FALSE(uname, lname) TESTPAGEFLAG_FALSE(uname, lname) \
+ SETPAGEFLAG_NOOP(uname, lname) CLEARPAGEFLAG_NOOP(uname, lname)
-#define TESTSCFLAG_FALSE(uname) \
- TESTSETFLAG_FALSE(uname) TESTCLEARFLAG_FALSE(uname)
+#define TESTSCFLAG_FALSE(uname, lname) \
+ TESTSETFLAG_FALSE(uname, lname) TESTCLEARFLAG_FALSE(uname, lname)
__PAGEFLAG(Locked, locked, PF_NO_TAIL)
PAGEFLAG(Waiters, waiters, PF_ONLY_HEAD) __CLEARPAGEFLAG(Waiters, waiters, PF_ONLY_HEAD)
@@ -406,8 +445,8 @@ PAGEFLAG(MappedToDisk, mappedtodisk, PF_NO_TAIL)
/* PG_readahead is only used for reads; PG_reclaim is only for writes */
PAGEFLAG(Reclaim, reclaim, PF_NO_TAIL)
TESTCLEARFLAG(Reclaim, reclaim, PF_NO_TAIL)
-PAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
- TESTCLEARFLAG(Readahead, reclaim, PF_NO_COMPOUND)
+PAGEFLAG(Readahead, readahead, PF_NO_COMPOUND)
+ TESTCLEARFLAG(Readahead, readahead, PF_NO_COMPOUND)
#ifdef CONFIG_HIGHMEM
/*
@@ -416,22 +455,25 @@ PAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
*/
#define PageHighMem(__p) is_highmem_idx(page_zonenum(__p))
#else
-PAGEFLAG_FALSE(HighMem)
+PAGEFLAG_FALSE(HighMem, highmem)
#endif
#ifdef CONFIG_SWAP
-static __always_inline int PageSwapCache(struct page *page)
+static __always_inline bool folio_swapcache(struct folio *folio)
{
-#ifdef CONFIG_THP_SWAP
- page = compound_head(page);
-#endif
- return PageSwapBacked(page) && test_bit(PG_swapcache, &page->flags);
+ return folio_swapbacked(folio) &&
+ test_bit(PG_swapcache, folio_flags(folio, 0));
+}
+static __always_inline bool PageSwapCache(struct page *page)
+{
+ return folio_swapcache(page_folio(page));
}
+
SETPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
#else
-PAGEFLAG_FALSE(SwapCache)
+PAGEFLAG_FALSE(SwapCache, swapcache)
#endif
PAGEFLAG(Unevictable, unevictable, PF_HEAD)
@@ -443,14 +485,14 @@ PAGEFLAG(Mlocked, mlocked, PF_NO_TAIL)
__CLEARPAGEFLAG(Mlocked, mlocked, PF_NO_TAIL)
TESTSCFLAG(Mlocked, mlocked, PF_NO_TAIL)
#else
-PAGEFLAG_FALSE(Mlocked) __CLEARPAGEFLAG_NOOP(Mlocked)
- TESTSCFLAG_FALSE(Mlocked)
+PAGEFLAG_FALSE(Mlocked, mlocked) __CLEARPAGEFLAG_NOOP(Mlocked, mlocked)
+ TESTSCFLAG_FALSE(Mlocked, mlocked)
#endif
#ifdef CONFIG_ARCH_USES_PG_UNCACHED
PAGEFLAG(Uncached, uncached, PF_NO_COMPOUND)
#else
-PAGEFLAG_FALSE(Uncached)
+PAGEFLAG_FALSE(Uncached, uncached)
#endif
#ifdef CONFIG_MEMORY_FAILURE
@@ -459,7 +501,7 @@ TESTSCFLAG(HWPoison, hwpoison, PF_ANY)
#define __PG_HWPOISON (1UL << PG_hwpoison)
extern bool take_page_off_buddy(struct page *page);
#else
-PAGEFLAG_FALSE(HWPoison)
+PAGEFLAG_FALSE(HWPoison, hwpoison)
#define __PG_HWPOISON 0
#endif
@@ -505,10 +547,14 @@ static __always_inline int PageMappingFlags(struct page *page)
return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) != 0;
}
-static __always_inline int PageAnon(struct page *page)
+static __always_inline bool folio_anon(struct folio *folio)
+{
+ return ((unsigned long)folio->mapping & PAGE_MAPPING_ANON) != 0;
+}
+
+static __always_inline bool PageAnon(struct page *page)
{
- page = compound_head(page);
- return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
+ return folio_anon(page_folio(page));
}
static __always_inline int __PageMovable(struct page *page)
@@ -524,30 +570,32 @@ static __always_inline int __PageMovable(struct page *page)
* is found in VM_MERGEABLE vmas. It's a PageAnon page, pointing not to any
* anon_vma, but to that page's node of the stable tree.
*/
-static __always_inline int PageKsm(struct page *page)
+static __always_inline bool folio_ksm(struct folio *folio)
{
- page = compound_head(page);
- return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
+ return ((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS) ==
PAGE_MAPPING_KSM;
}
+
+static __always_inline bool PageKsm(struct page *page)
+{
+ return folio_ksm(page_folio(page));
+}
#else
-TESTPAGEFLAG_FALSE(Ksm)
+TESTPAGEFLAG_FALSE(Ksm, ksm)
#endif
u64 stable_page_flags(struct page *page);
-static inline int PageUptodate(struct page *page)
+static inline bool folio_uptodate(struct folio *folio)
{
- int ret;
- page = compound_head(page);
- ret = test_bit(PG_uptodate, &(page)->flags);
+ bool ret = test_bit(PG_uptodate, folio_flags(folio, 0));
/*
- * Must ensure that the data we read out of the page is loaded
- * _after_ we've loaded page->flags to check for PageUptodate.
- * We can skip the barrier if the page is not uptodate, because
+ * Must ensure that the data we read out of the folio is loaded
+ * _after_ we've loaded folio->flags to check the uptodate bit.
+ * We can skip the barrier if the folio is not uptodate, because
* we wouldn't be reading anything from it.
*
- * See SetPageUptodate() for the other side of the story.
+ * See folio_set_uptodate() for the other side of the story.
*/
if (ret)
smp_rmb();
@@ -555,23 +603,36 @@ static inline int PageUptodate(struct page *page)
return ret;
}
-static __always_inline void __SetPageUptodate(struct page *page)
+static inline int PageUptodate(struct page *page)
+{
+ return folio_uptodate(page_folio(page));
+}
+
+static __always_inline void __folio_set_uptodate(struct folio *folio)
{
- VM_BUG_ON_PAGE(PageTail(page), page);
smp_wmb();
- __set_bit(PG_uptodate, &page->flags);
+ __set_bit(PG_uptodate, folio_flags(folio, 0));
}
-static __always_inline void SetPageUptodate(struct page *page)
+static __always_inline void folio_set_uptodate(struct folio *folio)
{
- VM_BUG_ON_PAGE(PageTail(page), page);
/*
* Memory barrier must be issued before setting the PG_uptodate bit,
- * so that all previous stores issued in order to bring the page
- * uptodate are actually visible before PageUptodate becomes true.
+ * so that all previous stores issued in order to bring the folio
+ * uptodate are actually visible before folio_uptodate becomes true.
*/
smp_wmb();
- set_bit(PG_uptodate, &page->flags);
+ set_bit(PG_uptodate, folio_flags(folio, 0));
+}
+
+static __always_inline void __SetPageUptodate(struct page *page)
+{
+ __folio_set_uptodate((struct folio *)page);
+}
+
+static __always_inline void SetPageUptodate(struct page *page)
+{
+ folio_set_uptodate((struct folio *)page);
}
CLEARPAGEFLAG(Uptodate, uptodate, PF_NO_TAIL)
@@ -596,6 +657,17 @@ static inline void set_page_writeback_keepwrite(struct page *page)
__PAGEFLAG(Head, head, PF_ANY) CLEARPAGEFLAG(Head, head, PF_ANY)
+/* Whether there are one or multiple pages in a folio */
+static inline bool folio_single(struct folio *folio)
+{
+ return !folio_head(folio);
+}
+
+static inline bool folio_multi(struct folio *folio)
+{
+ return folio_head(folio);
+}
+
static __always_inline void set_compound_head(struct page *page, struct page *head)
{
WRITE_ONCE(page->compound_head, (unsigned long)head + 1);
@@ -619,12 +691,15 @@ static inline void ClearPageCompound(struct page *page)
#ifdef CONFIG_HUGETLB_PAGE
int PageHuge(struct page *page);
int PageHeadHuge(struct page *page);
+static inline bool folio_hugetlb(struct folio *folio)
+{
+ return PageHeadHuge(&folio->page);
+}
#else
-TESTPAGEFLAG_FALSE(Huge)
-TESTPAGEFLAG_FALSE(HeadHuge)
+TESTPAGEFLAG_FALSE(Huge, hugetlb)
+TESTPAGEFLAG_FALSE(HeadHuge, headhuge)
#endif
-
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
/*
* PageHuge() only returns true for hugetlbfs pages, but not for
@@ -640,6 +715,11 @@ static inline int PageTransHuge(struct page *page)
return PageHead(page);
}
+static inline bool folio_transhuge(struct folio *folio)
+{
+ return folio_head(folio);
+}
+
/*
* PageTransCompound returns true for both transparent huge pages
* and hugetlbfs pages, so it should only be called when it's known
@@ -713,12 +793,12 @@ static inline int PageTransTail(struct page *page)
PAGEFLAG(DoubleMap, double_map, PF_SECOND)
TESTSCFLAG(DoubleMap, double_map, PF_SECOND)
#else
-TESTPAGEFLAG_FALSE(TransHuge)
-TESTPAGEFLAG_FALSE(TransCompound)
-TESTPAGEFLAG_FALSE(TransCompoundMap)
-TESTPAGEFLAG_FALSE(TransTail)
-PAGEFLAG_FALSE(DoubleMap)
- TESTSCFLAG_FALSE(DoubleMap)
+TESTPAGEFLAG_FALSE(TransHuge, transhuge)
+TESTPAGEFLAG_FALSE(TransCompound, transcompound)
+TESTPAGEFLAG_FALSE(TransCompoundMap, transcompoundmap)
+TESTPAGEFLAG_FALSE(TransTail, transtail)
+PAGEFLAG_FALSE(DoubleMap, double_map)
+ TESTSCFLAG_FALSE(DoubleMap, double_map)
#endif
/*
@@ -871,6 +951,11 @@ static inline int page_has_private(struct page *page)
return !!(page->flags & PAGE_FLAGS_PRIVATE);
}
+static inline bool folio_has_private(struct folio *folio)
+{
+ return page_has_private(&folio->page);
+}
+
#undef PF_ANY
#undef PF_HEAD
#undef PF_ONLY_HEAD
--
2.30.2
* [PATCH v8 09/27] mm: Add folio_young() and folio_idle()
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm; +Cc: Matthew Wilcox (Oracle)
Idle page tracking is handled through page_ext on 32-bit. Add folio
equivalents for 32-bit and move all the page compatibility parts to
common code.
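A hedged usage sketch (note_folio_access() is a hypothetical caller): with
the folio variants, reclaim-style code can update the young/idle state of a
whole folio in one place, regardless of whether the bits live in page flags
(64-bit) or in page_ext (32-bit):

	static void note_folio_access(struct folio *folio)
	{
		folio_set_young(folio);
		folio_clear_idle(folio);
	}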
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/page_idle.h | 99 +++++++++++++++++++--------------------
1 file changed, 49 insertions(+), 50 deletions(-)
diff --git a/include/linux/page_idle.h b/include/linux/page_idle.h
index 1e894d34bdce..9693d1a93781 100644
--- a/include/linux/page_idle.h
+++ b/include/linux/page_idle.h
@@ -8,46 +8,16 @@
#ifdef CONFIG_IDLE_PAGE_TRACKING
-#ifdef CONFIG_64BIT
-static inline bool page_is_young(struct page *page)
-{
- return PageYoung(page);
-}
-
-static inline void set_page_young(struct page *page)
-{
- SetPageYoung(page);
-}
-
-static inline bool test_and_clear_page_young(struct page *page)
-{
- return TestClearPageYoung(page);
-}
-
-static inline bool page_is_idle(struct page *page)
-{
- return PageIdle(page);
-}
-
-static inline void set_page_idle(struct page *page)
-{
- SetPageIdle(page);
-}
-
-static inline void clear_page_idle(struct page *page)
-{
- ClearPageIdle(page);
-}
-#else /* !CONFIG_64BIT */
+#ifndef CONFIG_64BIT
/*
* If there is not enough space to store Idle and Young bits in page flags, use
* page ext flags instead.
*/
extern struct page_ext_operations page_idle_ops;
-static inline bool page_is_young(struct page *page)
+static inline bool folio_young(struct folio *folio)
{
- struct page_ext *page_ext = lookup_page_ext(page);
+ struct page_ext *page_ext = lookup_page_ext(&folio->page);
if (unlikely(!page_ext))
return false;
@@ -55,9 +25,9 @@ static inline bool page_is_young(struct page *page)
return test_bit(PAGE_EXT_YOUNG, &page_ext->flags);
}
-static inline void set_page_young(struct page *page)
+static inline void folio_set_young(struct folio *folio)
{
- struct page_ext *page_ext = lookup_page_ext(page);
+ struct page_ext *page_ext = lookup_page_ext(&folio->page);
if (unlikely(!page_ext))
return;
@@ -65,9 +35,9 @@ static inline void set_page_young(struct page *page)
set_bit(PAGE_EXT_YOUNG, &page_ext->flags);
}
-static inline bool test_and_clear_page_young(struct page *page)
+static inline bool folio_test_clear_young(struct folio *folio)
{
- struct page_ext *page_ext = lookup_page_ext(page);
+ struct page_ext *page_ext = lookup_page_ext(&folio->page);
if (unlikely(!page_ext))
return false;
@@ -75,9 +45,9 @@ static inline bool test_and_clear_page_young(struct page *page)
return test_and_clear_bit(PAGE_EXT_YOUNG, &page_ext->flags);
}
-static inline bool page_is_idle(struct page *page)
+static inline bool folio_idle(struct folio *folio)
{
- struct page_ext *page_ext = lookup_page_ext(page);
+ struct page_ext *page_ext = lookup_page_ext(&folio->page);
if (unlikely(!page_ext))
return false;
@@ -85,9 +55,9 @@ static inline bool page_is_idle(struct page *page)
return test_bit(PAGE_EXT_IDLE, &page_ext->flags);
}
-static inline void set_page_idle(struct page *page)
+static inline void folio_set_idle(struct folio *folio)
{
- struct page_ext *page_ext = lookup_page_ext(page);
+ struct page_ext *page_ext = lookup_page_ext(&folio->page);
if (unlikely(!page_ext))
return;
@@ -95,46 +65,75 @@ static inline void set_page_idle(struct page *page)
set_bit(PAGE_EXT_IDLE, &page_ext->flags);
}
-static inline void clear_page_idle(struct page *page)
+static inline void folio_clear_idle(struct folio *folio)
{
- struct page_ext *page_ext = lookup_page_ext(page);
+ struct page_ext *page_ext = lookup_page_ext(&folio->page);
if (unlikely(!page_ext))
return;
clear_bit(PAGE_EXT_IDLE, &page_ext->flags);
}
-#endif /* CONFIG_64BIT */
+#endif /* !CONFIG_64BIT */
#else /* !CONFIG_IDLE_PAGE_TRACKING */
-static inline bool page_is_young(struct page *page)
+static inline bool folio_young(struct folio *folio)
{
return false;
}
-static inline void set_page_young(struct page *page)
+static inline void folio_set_young(struct folio *folio)
{
}
-static inline bool test_and_clear_page_young(struct page *page)
+static inline bool folio_test_clear_young(struct folio *folio)
{
return false;
}
-static inline bool page_is_idle(struct page *page)
+static inline bool folio_idle(struct folio *folio)
{
return false;
}
-static inline void set_page_idle(struct page *page)
+static inline void folio_set_idle(struct folio *folio)
{
}
-static inline void clear_page_idle(struct page *page)
+static inline void folio_clear_idle(struct folio *folio)
{
}
#endif /* CONFIG_IDLE_PAGE_TRACKING */
+static inline bool page_is_young(struct page *page)
+{
+ return folio_young(page_folio(page));
+}
+
+static inline void set_page_young(struct page *page)
+{
+ folio_set_young(page_folio(page));
+}
+
+static inline bool test_and_clear_page_young(struct page *page)
+{
+ return folio_test_clear_young(page_folio(page));
+}
+
+static inline bool page_is_idle(struct page *page)
+{
+ return folio_idle(page_folio(page));
+}
+
+static inline void set_page_idle(struct page *page)
+{
+ folio_set_idle(page_folio(page));
+}
+
+static inline void clear_page_idle(struct page *page)
+{
+ folio_clear_idle(page_folio(page));
+}
#endif /* _LINUX_MM_PAGE_IDLE_H */
--
2.30.2
* [PATCH v8 10/27] mm: Handle per-folio private data
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
Add folio_get_private() and folio_set_private() which mirror page_private()
and set_page_private() -- ie folio private data is the same as page
private data. The only difference is that these return a void *
instead of an unsigned long, which matches the majority of users.
Turn attach_page_private() into folio_attach_private() and reimplement
attach_page_private() as a wrapper. No filesystem which uses page private
data currently supports compound pages, so we're free to define the rules.
attach_page_private() may only be called on a head page; if you want
to add private data to a tail page, you can call set_page_private()
directly (and shouldn't increment the page refcount! That should be
done when adding private data to the head page / folio).
This saves 597 bytes of text with the distro-derived config that I'm
testing due to removing the calls to compound_head() in get_page()
& put_page().
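A usage sketch under stated assumptions (myfs_* and struct myfs_info are
invented for illustration): a filesystem pairs the attach and detach, and
frees its data only after the folio reference taken by
folio_attach_private() has been dropped:

	static void myfs_init_folio(struct folio *folio, struct myfs_info *info)
	{
		folio_attach_private(folio, info);	/* takes a folio ref */
	}

	static void myfs_free_folio(struct folio *folio)
	{
		struct myfs_info *info = folio_detach_private(folio);

		kfree(info);	/* detach dropped the folio ref */
	}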
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/mm_types.h | 11 +++++++++
include/linux/pagemap.h | 48 ++++++++++++++++++++++++----------------
2 files changed, 40 insertions(+), 19 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 276e358c75d3..111c304b7d13 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -302,6 +302,12 @@ static inline atomic_t *compound_pincount_ptr(struct page *page)
#define PAGE_FRAG_CACHE_MAX_SIZE __ALIGN_MASK(32768, ~PAGE_MASK)
#define PAGE_FRAG_CACHE_MAX_ORDER get_order(PAGE_FRAG_CACHE_MAX_SIZE)
+/*
+ * page_private can be used on tail pages. However, PagePrivate is only
+ * checked by the VM on the head page. So page_private on the tail pages
+ * should be used for data that's ancillary to the head page (eg attaching
+ * buffer heads to tail pages after attaching buffer heads to the head page)
+ */
#define page_private(page) ((page)->private)
static inline void set_page_private(struct page *page, unsigned long private)
@@ -309,6 +315,11 @@ static inline void set_page_private(struct page *page, unsigned long private)
page->private = private;
}
+static inline void *folio_get_private(struct folio *folio)
+{
+ return (void *)folio->private;
+}
+
struct page_frag_cache {
void * va;
#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index a4bd41128bf3..c58587225fe5 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -260,42 +260,52 @@ static inline int page_cache_add_speculative(struct page *page, int count)
}
/**
- * attach_page_private - Attach private data to a page.
- * @page: Page to attach data to.
- * @data: Data to attach to page.
+ * folio_attach_private - Attach private data to a folio.
+ * @folio: Folio to attach data to.
+ * @data: Data to attach to folio.
*
- * Attaching private data to a page increments the page's reference count.
- * The data must be detached before the page will be freed.
+ * Attaching private data to a folio increments the folio's reference count.
+ * The data must be detached before the folio will be freed.
*/
-static inline void attach_page_private(struct page *page, void *data)
+static inline void folio_attach_private(struct folio *folio, void *data)
{
- get_page(page);
- set_page_private(page, (unsigned long)data);
- SetPagePrivate(page);
+ folio_get(folio);
+ folio->private = (unsigned long)data;
+ folio_set_private(folio);
}
/**
- * detach_page_private - Detach private data from a page.
- * @page: Page to detach data from.
+ * folio_detach_private - Detach private data from a folio.
+ * @folio: Folio to detach data from.
*
- * Removes the data that was previously attached to the page and decrements
+ * Removes the data that was previously attached to the folio and decrements
* the refcount on the page.
*
- * Return: Data that was attached to the page.
+ * Return: Data that was attached to the folio.
*/
-static inline void *detach_page_private(struct page *page)
+static inline void *folio_detach_private(struct folio *folio)
{
- void *data = (void *)page_private(page);
+ void *data = folio_get_private(folio);
- if (!PagePrivate(page))
+ if (!folio_private(folio))
return NULL;
- ClearPagePrivate(page);
- set_page_private(page, 0);
- put_page(page);
+ folio_clear_private(folio);
+ folio->private = 0;
+ folio_put(folio);
return data;
}
+static inline void attach_page_private(struct page *page, void *data)
+{
+ folio_attach_private(page_folio(page), data);
+}
+
+static inline void *detach_page_private(struct page *page)
+{
+ return folio_detach_private(page_folio(page));
+}
+
#ifdef CONFIG_NUMA
extern struct page *__page_cache_alloc(gfp_t gfp);
#else
--
2.30.2
* [PATCH v8 11/27] mm/filemap: Add folio_index, folio_file_page and folio_contains
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
folio_index() is the equivalent of page_index() for folios.
folio_file_page() is the equivalent of find_subpage().
folio_contains() is the equivalent of thp_contains().
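As a sketch, a fault path that has already looked up and locked a folio
might use these like so (fs_fault_page() is a made-up name):

/* Return the precise page backing @index, or NULL if the folio moved. */
static struct page *fs_fault_page(struct folio *folio, pgoff_t index)
{
        if (!folio_contains(folio, index))
                return NULL;
        return folio_file_page(folio, index);
}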
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/pagemap.h | 53 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 53 insertions(+)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c58587225fe5..58c9b98604d0 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -462,6 +462,59 @@ static inline bool thp_contains(struct page *head, pgoff_t index)
return page_index(head) == (index & ~(thp_nr_pages(head) - 1UL));
}
+#define swapcache_index(folio) __page_file_index(&(folio)->page)
+
+/**
+ * folio_index - File index of a folio.
+ * @folio: The folio.
+ *
+ * For a folio which is either in the page cache or the swap cache,
+ * return its index within the address_space it belongs to. If you know
+ * the folio is definitely in the page cache, you can look at the folio's
+ * index directly.
+ *
+ * Return: The index (offset in units of pages) of a folio in its file.
+ */
+static inline pgoff_t folio_index(struct folio *folio)
+{
+ if (unlikely(folio_swapcache(folio)))
+ return swapcache_index(folio);
+ return folio->index;
+}
+
+/**
+ * folio_file_page - The page for a particular index.
+ * @folio: The folio which contains this index.
+ * @index: The index we want to look up.
+ *
+ * Sometimes after looking up a folio in the page cache, we need to
+ * obtain the specific page for an index (eg a page fault).
+ *
+ * Return: The page containing the file data for this index.
+ */
+static inline struct page *folio_file_page(struct folio *folio, pgoff_t index)
+{
+ return nth_page(&folio->page, index & (folio_nr_pages(folio) - 1));
+}
+
+/**
+ * folio_contains - Does this folio contain this index?
+ * @folio: The folio.
+ * @index: The page index within the file.
+ *
+ * Context: The caller should have the folio locked in order to prevent
+ * (eg) shmem from moving the folio between the page cache and swap cache
+ * and changing its index in the middle of the operation.
+ * Return: true or false.
+ */
+static inline bool folio_contains(struct folio *folio, pgoff_t index)
+{
+ /* HugeTLBfs indexes the page cache in units of hpage_size */
+ if (folio_hugetlb(folio))
+ return folio->index == index;
+ return index - folio_index(folio) < folio_nr_pages(folio);
+}
+
/*
* Given the page we found in the page cache, return the page corresponding
* to this index in the file
--
2.30.2
* [PATCH v8 12/27] mm/filemap: Add folio_next_index
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
This helper returns the page index of the next folio in the file (ie
the end of this folio, plus one).
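A sketch of the intended use, advancing through a file by whole folios
rather than single pages; fs_get_locked_folio() is hypothetical, and
folio_unlock()/folio_put() come from elsewhere in this series:

pgoff_t index = start;

while (index <= end) {
        struct folio *folio = fs_get_locked_folio(mapping, index);

        if (!folio)
                break;
        /* ... operate on all of the folio's pages at once ... */
        index = folio_next_index(folio);
        folio_unlock(folio);
        folio_put(folio);
}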
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/pagemap.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 58c9b98604d0..5301131cc5b3 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -482,6 +482,17 @@ static inline pgoff_t folio_index(struct folio *folio)
return folio->index;
}
+/**
+ * folio_next_index - Get the index of the next folio.
+ * @folio: The current folio.
+ *
+ * Return: The index of the folio which follows this folio in the file.
+ */
+static inline pgoff_t folio_next_index(struct folio *folio)
+{
+ return folio->index + folio_nr_pages(folio);
+}
+
/**
* folio_file_page - The page for a particular index.
* @folio: The folio which contains this index.
--
2.30.2
* [PATCH v8 13/27] mm/filemap: Add folio_offset and folio_file_offset
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
These are just wrappers around their page counterparts.
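As a minimal sketch, the byte range that a folio spans in its file
falls out directly (fs_folio_range() is a made-up helper):

static void fs_folio_range(struct folio *folio, loff_t *start, loff_t *end)
{
        *start = folio_offset(folio);   /* folio->index << PAGE_SHIFT */
        *end = *start + ((loff_t)folio_nr_pages(folio) << PAGE_SHIFT) - 1;
}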
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/pagemap.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 5301131cc5b3..3d4ca4fd7b96 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -634,6 +634,16 @@ static inline loff_t page_file_offset(struct page *page)
return ((loff_t)page_index(page)) << PAGE_SHIFT;
}
+static inline loff_t folio_offset(struct folio *folio)
+{
+ return page_offset(&folio->page);
+}
+
+static inline loff_t folio_file_offset(struct folio *folio)
+{
+ return page_file_offset(&folio->page);
+}
+
extern pgoff_t linear_hugepage_index(struct vm_area_struct *vma,
unsigned long address);
--
2.30.2
* [PATCH v8 14/27] mm/util: Add folio_mapping and folio_file_mapping
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
These are the folio equivalent of page_mapping() and page_file_mapping().
Add an out-of-line page_mapping() wrapper around folio_mapping()
in order to prevent the page_folio() call from bloating every caller
of page_mapping(). Adjust page_file_mapping() and page_mapping_file()
to use folios internally. Rename __page_file_mapping() to
swapcache_mapping() and change it to take a folio.
This ends up saving 186 bytes of text overall. folio_mapping() is
45 bytes shorter than page_mapping() was, but the new page_mapping()
wrapper is 30 bytes. The major reduction is a few bytes less in dozens
of nfs functions (which call page_file_mapping()). Most of these appear
to be a slight change in gcc's register allocation decisions, which allow:
48 8b 56 08 mov 0x8(%rsi),%rdx
48 8d 42 ff lea -0x1(%rdx),%rax
83 e2 01 and $0x1,%edx
48 0f 44 c6 cmove %rsi,%rax
to become:
48 8b 46 08 mov 0x8(%rsi),%rax
48 8d 78 ff lea -0x1(%rax),%rdi
a8 01 test $0x1,%al
48 0f 44 fe cmove %rsi,%rdi
for a reduction of a single byte. Once the NFS client is converted to
use folios, this entire sequence will disappear.
Also add folio_mapping() documentation.
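A sketch of when each helper applies; fs_pick_mapping() is invented:

static struct address_space *fs_pick_mapping(struct folio *folio)
{
        /*
         * folio_mapping() returns the swap address_space for swap cache
         * folios; folio_file_mapping() returns the swap file's own
         * mapping, which a filesystem like NFS needs for swap I/O.
         */
        if (folio_swapcache(folio))
                return folio_file_mapping(folio);
        return folio_mapping(folio);
}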
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
Documentation/core-api/mm-api.rst | 2 ++
include/linux/mm.h | 14 -------------
include/linux/pagemap.h | 35 +++++++++++++++++++++++++++++--
include/linux/swap.h | 6 ++++++
mm/Makefile | 2 +-
mm/folio-compat.c | 13 ++++++++++++
mm/swapfile.c | 8 +++----
mm/util.c | 30 +++++++++++++++-----------
8 files changed, 77 insertions(+), 33 deletions(-)
create mode 100644 mm/folio-compat.c
diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index 88545b2dce98..e7e0f8a70794 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -99,3 +99,5 @@ More Memory Management Functions
.. kernel-doc:: include/linux/mm.h
:internal:
.. kernel-doc:: include/linux/page_ref.h
+.. kernel-doc:: mm/util.c
+ :functions: folio_mapping
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b133734a7530..fb779dca5ee8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1749,19 +1749,6 @@ void page_address_init(void);
extern void *page_rmapping(struct page *page);
extern struct anon_vma *page_anon_vma(struct page *page);
-extern struct address_space *page_mapping(struct page *page);
-
-extern struct address_space *__page_file_mapping(struct page *);
-
-static inline
-struct address_space *page_file_mapping(struct page *page)
-{
- if (unlikely(PageSwapCache(page)))
- return __page_file_mapping(page);
-
- return page->mapping;
-}
-
extern pgoff_t __page_file_index(struct page *page);
/*
@@ -1776,7 +1763,6 @@ static inline pgoff_t page_index(struct page *page)
}
bool page_mapped(struct page *page);
-struct address_space *page_mapping(struct page *page);
/*
* Return true only if the page has been allocated with
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 3d4ca4fd7b96..b6bd60bbdee2 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -162,14 +162,45 @@ static inline void filemap_nr_thps_dec(struct address_space *mapping)
void release_pages(struct page **pages, int nr);
+struct address_space *page_mapping(struct page *);
+struct address_space *folio_mapping(struct folio *);
+struct address_space *swapcache_mapping(struct folio *);
+
+/**
+ * folio_file_mapping - Find the mapping this folio belongs to.
+ * @folio: The folio.
+ *
+ * For folios which are in the page cache, return the mapping that this
+ * page belongs to. Folios in the swap cache return the mapping of the
+ * swap file or swap device where the data is stored. This is different
+ * from the mapping returned by folio_mapping(). The only reason to
+ * use it is if, like NFS, you return 0 from ->swap_activate.
+ *
+ * Do not call this for folios which aren't in the page cache or swap cache.
+ */
+static inline struct address_space *folio_file_mapping(struct folio *folio)
+{
+ if (unlikely(folio_swapcache(folio)))
+ return swapcache_mapping(folio);
+
+ return folio->mapping;
+}
+
+static inline struct address_space *page_file_mapping(struct page *page)
+{
+ return folio_file_mapping(page_folio(page));
+}
+
/*
* For file cache pages, return the address_space, otherwise return NULL
*/
static inline struct address_space *page_mapping_file(struct page *page)
{
- if (unlikely(PageSwapCache(page)))
+ struct folio *folio = page_folio(page);
+
+ if (unlikely(folio_swapcache(folio)))
return NULL;
- return page_mapping(page);
+ return folio_mapping(folio);
}
/*
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 144727041e78..20766342845b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -314,6 +314,12 @@ struct vma_swap_readahead {
#endif
};
+static inline swp_entry_t folio_swap_entry(struct folio *folio)
+{
+ swp_entry_t entry = { .val = page_private(&folio->page) };
+ return entry;
+}
+
/* linux/mm/workingset.c */
void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
diff --git a/mm/Makefile b/mm/Makefile
index a9ad6122d468..434c2a46b6c5 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -46,7 +46,7 @@ mmu-$(CONFIG_MMU) += process_vm_access.o
endif
obj-y := filemap.o mempool.o oom_kill.o fadvise.o \
- maccess.o page-writeback.o \
+ maccess.o page-writeback.o folio-compat.o \
readahead.o swap.o truncate.o vmscan.o shmem.o \
util.o mmzone.o vmstat.o backing-dev.o \
mm_init.o percpu.o slab_common.o \
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
new file mode 100644
index 000000000000..5e107aa30a62
--- /dev/null
+++ b/mm/folio-compat.c
@@ -0,0 +1,13 @@
+/*
+ * Compatibility functions which bloat the callers too much to make inline.
+ * All of the callers of these functions should be converted to use folios
+ * eventually.
+ */
+
+#include <linux/pagemap.h>
+
+struct address_space *page_mapping(struct page *page)
+{
+ return folio_mapping(page_folio(page));
+}
+EXPORT_SYMBOL(page_mapping);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 149e77454e3c..d0ee24239a83 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3533,13 +3533,13 @@ struct swap_info_struct *page_swap_info(struct page *page)
}
/*
- * out-of-line __page_file_ methods to avoid include hell.
+ * out-of-line methods to avoid include hell.
*/
-struct address_space *__page_file_mapping(struct page *page)
+struct address_space *swapcache_mapping(struct folio *folio)
{
- return page_swap_info(page)->swap_file->f_mapping;
+ return page_swap_info(&folio->page)->swap_file->f_mapping;
}
-EXPORT_SYMBOL_GPL(__page_file_mapping);
+EXPORT_SYMBOL_GPL(swapcache_mapping);
pgoff_t __page_file_index(struct page *page)
{
diff --git a/mm/util.c b/mm/util.c
index a8bf17f18a81..afd99591cb81 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -686,30 +686,36 @@ struct anon_vma *page_anon_vma(struct page *page)
return __page_rmapping(page);
}
-struct address_space *page_mapping(struct page *page)
+/**
+ * folio_mapping - Find the mapping where this folio is stored.
+ * @folio: The folio.
+ *
+ * For folios which are in the page cache, return the mapping that this
+ * page belongs to. Folios in the swap cache return the swap mapping
+ * this page is stored in (which is different from the mapping for the
+ * swap file or swap device where the data is stored).
+ *
+ * You can call this for folios which aren't in the swap cache or page
+ * cache and it will return NULL.
+ */
+struct address_space *folio_mapping(struct folio *folio)
{
struct address_space *mapping;
- page = compound_head(page);
-
/* This happens if someone calls flush_dcache_page on slab page */
- if (unlikely(PageSlab(page)))
+ if (unlikely(folio_slab(folio)))
return NULL;
- if (unlikely(PageSwapCache(page))) {
- swp_entry_t entry;
-
- entry.val = page_private(page);
- return swap_address_space(entry);
- }
+ if (unlikely(folio_swapcache(folio)))
+ return swap_address_space(folio_swap_entry(folio));
- mapping = page->mapping;
+ mapping = folio->mapping;
if ((unsigned long)mapping & PAGE_MAPPING_ANON)
return NULL;
return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
}
-EXPORT_SYMBOL(page_mapping);
+EXPORT_SYMBOL(folio_mapping);
/* Slow path of page_mapcount() for compound pages */
int __page_mapcount(struct page *page)
--
2.30.2
* [PATCH v8 15/27] mm: Add folio_mapcount
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
This is the folio equivalent of page_mapcount().
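For illustration (fs_folio_is_mapped() is a made-up name):

/* True if any page of the folio is mapped into some address space. */
static bool fs_folio_is_mapped(struct folio *folio)
{
        return folio_mapcount(folio) > 0;
}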
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/mm.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index fb779dca5ee8..bca3e2518e5e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -883,6 +883,22 @@ static inline int page_mapcount(struct page *page)
return atomic_read(&page->_mapcount) + 1;
}
+/**
+ * folio_mapcount - The number of mappings of this folio.
+ * @folio: The folio.
+ *
+ * The result includes the number of times any of the pages in the
+ * folio are mapped to userspace.
+ *
+ * Return: The number of page table entries which refer to this folio.
+ */
+static inline int folio_mapcount(struct folio *folio)
+{
+ if (unlikely(folio_multi(folio)))
+ return __page_mapcount(&folio->page);
+ return atomic_read(&folio->_mapcount) + 1;
+}
+
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
int total_mapcount(struct page *page);
int page_trans_huge_mapcount(struct page *page, int *total_mapcount);
--
2.30.2
* [PATCH v8 16/27] mm/memcg: Add folio wrappers for various functions
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
Add new wrapper functions folio_memcg(), lock_folio_memcg(),
unlock_folio_memcg(), mem_cgroup_folio_lruvec() and
count_memcg_folio_event(), along with folio versions of the lruvec
locking and relocking helpers.
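A sketch of the accounting wrapper in use; the choice of PGMAJFAULT as
the event is purely illustrative:

static void fs_account_major_fault(struct folio *folio)
{
        /* Counts folio_nr_pages() events against the folio's memcg. */
        count_memcg_folio_event(folio, PGMAJFAULT);
}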
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/memcontrol.h | 58 ++++++++++++++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c193be760709..b45b505be0ec 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -456,6 +456,11 @@ static inline struct mem_cgroup *page_memcg(struct page *page)
return __page_memcg(page);
}
+static inline struct mem_cgroup *folio_memcg(struct folio *folio)
+{
+ return page_memcg(&folio->page);
+}
+
/*
* page_memcg_rcu - locklessly get the memory cgroup associated with a page
* @page: a pointer to the page struct
@@ -1058,6 +1063,15 @@ static inline void count_memcg_page_event(struct page *page,
count_memcg_events(memcg, idx, 1);
}
+static inline void count_memcg_folio_event(struct folio *folio,
+ enum vm_event_item idx)
+{
+ struct mem_cgroup *memcg = folio_memcg(folio);
+
+ if (memcg)
+ count_memcg_events(memcg, idx, folio_nr_pages(folio));
+}
+
static inline void count_memcg_event_mm(struct mm_struct *mm,
enum vm_event_item idx)
{
@@ -1477,6 +1491,22 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
}
#endif /* CONFIG_MEMCG */
+static inline void lock_folio_memcg(struct folio *folio)
+{
+ lock_page_memcg(&folio->page);
+}
+
+static inline void unlock_folio_memcg(struct folio *folio)
+{
+ unlock_page_memcg(&folio->page);
+}
+
+static inline struct lruvec *mem_cgroup_folio_lruvec(struct folio *folio,
+ struct pglist_data *pgdat)
+{
+ return mem_cgroup_page_lruvec(&folio->page, pgdat);
+}
+
static inline void __inc_lruvec_kmem_state(void *p, enum node_stat_item idx)
{
__mod_lruvec_kmem_state(p, idx, 1);
@@ -1544,6 +1574,34 @@ static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
return lock_page_lruvec_irqsave(page, flags);
}
+static inline struct lruvec *folio_lock_lruvec(struct folio *folio)
+{
+ return lock_page_lruvec(&folio->page);
+}
+
+static inline struct lruvec *folio_lock_lruvec_irq(struct folio *folio)
+{
+ return lock_page_lruvec_irq(&folio->page);
+}
+
+static inline struct lruvec *folio_lock_lruvec_irqsave(struct folio *folio,
+ unsigned long *flagsp)
+{
+ return lock_page_lruvec_irqsave(&folio->page, flagsp);
+}
+
+static inline struct lruvec *folio_relock_lruvec_irq(struct folio *folio,
+ struct lruvec *locked_lruvec)
+{
+ return relock_page_lruvec_irq(&folio->page, locked_lruvec);
+}
+
+static inline struct lruvec *folio_relock_lruvec_irqsave(struct folio *folio,
+ struct lruvec *locked_lruvec, unsigned long *flagsp)
+{
+ return relock_page_lruvec_irqsave(&folio->page, locked_lruvec, flagsp);
+}
+
#ifdef CONFIG_CGROUP_WRITEBACK
struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
--
2.30.2
* [PATCH v8 17/27] mm/filemap: Add folio_unlock
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
Convert unlock_page() to call folio_unlock(). By using a folio we
avoid a call to compound_head(). This shortens the function from 39
bytes to 25 and removes 4 instructions on x86-64. Because we still
have unlock_page(), it's a net increase of 24 bytes of text for the
kernel as a whole, but any path that uses folio_unlock() will execute
4 fewer instructions.
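A sketch of a typical caller: a read completion path that owns the
folio lock; folio_mark_uptodate() is assumed from earlier in this
series:

static void fs_read_end(struct folio *folio, int err)
{
        if (!err)
                folio_mark_uptodate(folio);
        /* Drops PG_locked and wakes any waiters on the folio. */
        folio_unlock(folio);
}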
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/pagemap.h | 3 ++-
mm/filemap.c | 27 ++++++++++-----------------
mm/folio-compat.c | 6 ++++++
3 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index b6bd60bbdee2..9126c0db3a60 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -719,7 +719,8 @@ extern int __lock_page_killable(struct page *page);
extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
unsigned int flags);
-extern void unlock_page(struct page *page);
+void unlock_page(struct page *page);
+void folio_unlock(struct folio *folio);
/*
* Return true if the page was successfully locked
diff --git a/mm/filemap.c b/mm/filemap.c
index 66f7e9fdfbc4..090b303bcd45 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1435,29 +1435,22 @@ static inline bool clear_bit_unlock_is_negative_byte(long nr, volatile void *mem
#endif
/**
- * unlock_page - unlock a locked page
- * @page: the page
+ * folio_unlock - Unlock a locked folio.
+ * @folio: The folio.
*
- * Unlocks the page and wakes up sleepers in wait_on_page_locked().
- * Also wakes sleepers in wait_on_page_writeback() because the wakeup
- * mechanism between PageLocked pages and PageWriteback pages is shared.
- * But that's OK - sleepers in wait_on_page_writeback() just go back to sleep.
+ * Unlocks the folio and wakes up any thread sleeping on the page lock.
*
- * Note that this depends on PG_waiters being the sign bit in the byte
- * that contains PG_locked - thus the BUILD_BUG_ON(). That allows us to
- * clear the PG_locked bit and test PG_waiters at the same time fairly
- * portably (architectures that do LL/SC can test any bit, while x86 can
- * test the sign bit).
+ * Context: May be called from interrupt or process context. May not be
+ * called from NMI context.
*/
-void unlock_page(struct page *page)
+void folio_unlock(struct folio *folio)
{
BUILD_BUG_ON(PG_waiters != 7);
- page = compound_head(page);
- VM_BUG_ON_PAGE(!PageLocked(page), page);
- if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags))
- wake_up_page_bit(page, PG_locked);
+ VM_BUG_ON_FOLIO(!folio_locked(folio), folio);
+ if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio, 0)))
+ wake_up_page_bit(&folio->page, PG_locked);
}
-EXPORT_SYMBOL(unlock_page);
+EXPORT_SYMBOL(folio_unlock);
/**
* end_page_private_2 - Clear PG_private_2 and release any waiters
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 5e107aa30a62..91b3d00a92f7 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -11,3 +11,9 @@ struct address_space *page_mapping(struct page *page)
return folio_mapping(page_folio(page));
}
EXPORT_SYMBOL(page_mapping);
+
+void unlock_page(struct page *page)
+{
+ return folio_unlock(page_folio(page));
+}
+EXPORT_SYMBOL(unlock_page);
--
2.30.2
* [PATCH v8 18/27] mm/filemap: Add folio_lock
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
This is like lock_page() but for use by callers who know they have a folio.
Convert __lock_page() to be __folio_lock(). This saves one call to
compound_head() per contended call to lock_page().
Saves 362 bytes of text; mostly from improved register allocation and
inlining decisions. __folio_lock is 59 bytes while __lock_page was 79.
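The canonical locking pattern for callers that already have a folio,
as a sketch:

static void fs_with_folio_locked(struct folio *folio)
{
        folio_lock(folio);      /* sleeps in __folio_lock() if contended */
        /* ... the folio cannot be truncated out from under us here ... */
        folio_unlock(folio);
}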
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/pagemap.h | 24 +++++++++++++++++++-----
mm/filemap.c | 29 +++++++++++++++--------------
2 files changed, 34 insertions(+), 19 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 9126c0db3a60..ff81be103539 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -714,7 +714,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
return true;
}
-extern void __lock_page(struct page *page);
+void __folio_lock(struct folio *folio);
extern int __lock_page_killable(struct page *page);
extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
@@ -722,13 +722,24 @@ extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
void unlock_page(struct page *page);
void folio_unlock(struct folio *folio);
+static inline bool folio_trylock(struct folio *folio)
+{
+ return likely(!test_and_set_bit_lock(PG_locked, folio_flags(folio, 0)));
+}
+
/*
* Return true if the page was successfully locked
*/
static inline int trylock_page(struct page *page)
{
- page = compound_head(page);
- return (likely(!test_and_set_bit_lock(PG_locked, &page->flags)));
+ return folio_trylock(page_folio(page));
+}
+
+static inline void folio_lock(struct folio *folio)
+{
+ might_sleep();
+ if (!folio_trylock(folio))
+ __folio_lock(folio);
}
/*
@@ -736,9 +747,12 @@ static inline int trylock_page(struct page *page)
*/
static inline void lock_page(struct page *page)
{
+ struct folio *folio;
might_sleep();
- if (!trylock_page(page))
- __lock_page(page);
+
+ folio = page_folio(page);
+ if (!folio_trylock(folio))
+ __folio_lock(folio);
}
/*
diff --git a/mm/filemap.c b/mm/filemap.c
index 090b303bcd45..6935b068856f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1187,7 +1187,7 @@ static void wake_up_page(struct page *page, int bit)
*/
enum behavior {
EXCLUSIVE, /* Hold ref to page and take the bit when woken, like
- * __lock_page() waiting on then setting PG_locked.
+ * __folio_lock() waiting on then setting PG_locked.
*/
SHARED, /* Hold ref to page and check the bit when woken, like
* wait_on_page_writeback() waiting on PG_writeback.
@@ -1576,17 +1576,16 @@ void page_endio(struct page *page, bool is_write, int err)
EXPORT_SYMBOL_GPL(page_endio);
/**
- * __lock_page - get a lock on the page, assuming we need to sleep to get it
- * @__page: the page to lock
+ * __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.
+ * @folio: The folio to lock
*/
-void __lock_page(struct page *__page)
+void __folio_lock(struct folio *folio)
{
- struct page *page = compound_head(__page);
- wait_queue_head_t *q = page_waitqueue(page);
- wait_on_page_bit_common(q, page, PG_locked, TASK_UNINTERRUPTIBLE,
+ wait_queue_head_t *q = page_waitqueue(&folio->page);
+ wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_UNINTERRUPTIBLE,
EXCLUSIVE);
}
-EXPORT_SYMBOL(__lock_page);
+EXPORT_SYMBOL(__folio_lock);
int __lock_page_killable(struct page *__page)
{
@@ -1661,10 +1660,10 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
return 0;
}
} else {
- __lock_page(page);
+ __folio_lock(page_folio(page));
}
- return 1;
+ return 1;
}
/**
@@ -2815,7 +2814,9 @@ loff_t mapping_seek_hole_data(struct address_space *mapping, loff_t start,
static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
struct file **fpin)
{
- if (trylock_page(page))
+ struct folio *folio = page_folio(page);
+
+ if (folio_trylock(folio))
return 1;
/*
@@ -2828,7 +2829,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
if (vmf->flags & FAULT_FLAG_KILLABLE) {
- if (__lock_page_killable(page)) {
+ if (__lock_page_killable(&folio->page)) {
/*
* We didn't have the right flags to drop the mmap_lock,
* but all fault_handlers only check for fatal signals
@@ -2840,11 +2841,11 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
return 0;
}
} else
- __lock_page(page);
+ __folio_lock(folio);
+
return 1;
}
-
/*
* Synchronous readahead happens when we don't even find a page in the page
* cache at all. We don't want to perform IO under the mmap sem, so if we have
--
2.30.2
* [PATCH v8 19/27] mm/filemap: Add folio_lock_killable
From: Matthew Wilcox (Oracle) @ 2021-04-30 17:22 UTC
To: linux-mm, linux-fsdevel, akpm
Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton
This is like lock_page_killable() but for use by callers who
know they have a folio. Convert __lock_page_killable() to be
__folio_lock_killable(). This saves one call to compound_head() per
contended call to lock_page_killable().
__folio_lock_killable() is 20 bytes smaller than __lock_page_killable()
was. lock_page_maybe_drop_mmap() shrinks by 68 bytes and
__lock_page_or_retry() shrinks by 66 bytes. That's a total of 154 bytes
of text saved.
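A sketch of a killable lock attempt, as a syscall path might use it;
fs_op() stands in for whatever work needs the folio lock:

static int fs_lock_and_op(struct folio *folio)
{
        int err = folio_lock_killable(folio);

        if (err)
                return err;     /* -EINTR: a fatal signal arrived */
        fs_op(folio);
        folio_unlock(folio);
        return 0;
}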
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
include/linux/pagemap.h | 15 ++++++++++-----
mm/filemap.c | 17 +++++++++--------
2 files changed, 19 insertions(+), 13 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ff81be103539..332731ee541a 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -715,7 +715,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
}
void __folio_lock(struct folio *folio);
-extern int __lock_page_killable(struct page *page);
+int __folio_lock_killable(struct folio *folio);
extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
unsigned int flags);
@@ -755,6 +755,14 @@ static inline void lock_page(struct page *page)
__folio_lock(folio);
}
+static inline int folio_lock_killable(struct folio *folio)
+{
+ might_sleep();
+ if (!folio_trylock(folio))
+ return __folio_lock_killable(folio);
+ return 0;
+}
+
/*
* lock_page_killable is like lock_page but can be interrupted by fatal
* signals. It returns 0 if it locked the page and -EINTR if it was
@@ -762,10 +770,7 @@ static inline void lock_page(struct page *page)
* killed while waiting.
*/
static inline int lock_page_killable(struct page *page)
{
- might_sleep();
- if (!trylock_page(page))
- return __lock_page_killable(page);
- return 0;
+ return folio_lock_killable(page_folio(page));
}
/*
diff --git a/mm/filemap.c b/mm/filemap.c
index 6935b068856f..27a86d53dd89 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1587,14 +1587,13 @@ void __folio_lock(struct folio *folio)
}
EXPORT_SYMBOL(__folio_lock);
-int __lock_page_killable(struct page *__page)
+int __folio_lock_killable(struct folio *folio)
{
- struct page *page = compound_head(__page);
- wait_queue_head_t *q = page_waitqueue(page);
- return wait_on_page_bit_common(q, page, PG_locked, TASK_KILLABLE,
+ wait_queue_head_t *q = page_waitqueue(&folio->page);
+ return wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_KILLABLE,
EXCLUSIVE);
}
-EXPORT_SYMBOL_GPL(__lock_page_killable);
+EXPORT_SYMBOL_GPL(__folio_lock_killable);
int __lock_page_async(struct page *page, struct wait_page_queue *wait)
{
@@ -1636,6 +1635,8 @@ int __lock_page_async(struct page *page, struct wait_page_queue *wait)
int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
unsigned int flags)
{
+ struct folio *folio = page_folio(page);
+
if (fault_flag_allow_retry_first(flags)) {
/*
* CAUTION! In this case, mmap_lock is not released
@@ -1654,13 +1655,13 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
if (flags & FAULT_FLAG_KILLABLE) {
int ret;
- ret = __lock_page_killable(page);
+ ret = __folio_lock_killable(folio);
if (ret) {
mmap_read_unlock(mm);
return 0;
}
} else {
- __folio_lock(page_folio(page));
+ __folio_lock(folio);
}
return 1;
@@ -2829,7 +2830,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
if (vmf->flags & FAULT_FLAG_KILLABLE) {
- if (__lock_page_killable(&folio->page)) {
+ if (__folio_lock_killable(folio)) {
/*
* We didn't have the right flags to drop the mmap_lock,
* but all fault_handlers only check for fatal signals
--
2.30.2
* Re: [PATCH v8 00/27] Memory Folios
From: Matthew Wilcox @ 2021-04-30 17:44 UTC
To: linux-mm, linux-fsdevel, akpm
On Fri, Apr 30, 2021 at 06:22:08PM +0100, Matthew Wilcox (Oracle) wrote:
Argh. This is the wrong version. My apologies. I'll send a v8.1.