* [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter
@ 2026-06-17 17:25 Jane Chu
2026-06-17 17:25 ` [PATCH v2 01/11] mm/memory-failure: make is_raw_hwpoison_page_in_hugepage() general purpose Jane Chu
` (11 more replies)
0 siblings, 12 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
changes in v2:
- new patches 1-4: add hwpoison handling to filemap_read() so that
hugetlbfs_read_iter() can be replaced with generic_file_read_iter(),
as suggested by Matthew [2];
- new patch 5: convert the hugetlb fault handler's vmf->pgoff to
PAGE_SIZE granularity, matching the convention used by the rest of mm
fault handling, as suggested by Matthew [2];
- patch 6: fixed a bug in v1 pointed out by Usama Arif, also by syzbot;
- patch 8: did not pick the Acked-by from Oscar (for 5/6 in v1) due to
updates to the patch;
- patch 11: add VM_WARN_ON in hugetlb_unreserve_pages(), per Oscar;
v1:
This series stems from a discussion with David. [1]
The series makes a small cleanup to a few hugetlb interfaces used
outside the subsystem by standardizing them on base-page indices.
Hopefully this makes the interface semantics a bit more coherent with
the rest of mm, while the internal hugetlb code continues to use hugepage
indices where that remains the more natural fit.
[1] https://lore.kernel.org/linux-mm/9ec9edd1-0f4c-4da2-ae78-0e7b251a9e25@kernel.org/
[2] https://lore.kernel.org/linux-mm/aeZwAz6PcdlqSnJ2@casper.infradead.org/
Jane Chu (11):
mm/memory-failure: make is_raw_hwpoison_page_in_hugepage() general
purpose
mm: factor out adjust_range_hwpoison() from hugetlbfs
mm/filemap: add hwpoison handling to filemap_read()
hugetlbfs,filemap: replace hugetlbfs_read_iter() with
generic_file_read_iter()
hugetlb: Convert the vmf->pgoff to PAGE_SIZE granularity
hugetlb: make hugetlb_fault_mutex_hash() to take PAGE_SIZE index
hugetlb: replace filemap_lock_hugetlb_folio with filemap_lock_folio
hugetlb: make hugetlb_add_to_page_cache() to take PAGE_SIZE
granularity index
hugetlb: remove the hugetlb_linear_page_index() helper
hugetlb: drop vma_hugecache_offset() in favor of linear_page_index()
hugetlb: make hugetlb_[un]reserve_pages() to take PAGE granularity
index
Documentation/mm/hugetlbfs_reserv.rst | 19 ++--
fs/hugetlbfs/inode.c | 155 ++++----------------------
include/linux/fs.h | 2 +
include/linux/hugetlb.h | 36 +-----
mm/filemap.c | 62 ++++++++++-
mm/hugetlb.c | 87 ++++++++-------
mm/memfd.c | 25 ++---
mm/memory-failure.c | 12 +-
mm/userfaultfd.c | 6 +-
9 files changed, 164 insertions(+), 240 deletions(-)
--
2.43.5
* [PATCH v2 01/11] mm/memory-failure: make is_raw_hwpoison_page_in_hugepage() general purpose
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 17:25 ` [PATCH v2 02/11] mm: factor out adjust_range_hwpoison() from hugetlbfs Jane Chu
` (10 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
Generalize is_raw_hwpoison_page_in_hugepage() so that it can check
whether a given raw page within any kind of folio is hardware-poisoned.
To that end, replace folio_test_hwpoison() with
folio_contain_hwpoisoned_page(), and rename the function to
is_raw_hwpoison_page_in_folio().
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
fs/hugetlbfs/inode.c | 4 ++--
include/linux/hugetlb.h | 4 ++--
mm/memory-failure.c | 12 ++++++++++--
3 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 78d61bf2bd9b..66520f7c53c6 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -198,7 +198,7 @@ static size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
struct page *page = folio_page(folio, offset / PAGE_SIZE);
size_t safe_bytes;
- if (is_raw_hwpoison_page_in_hugepage(page))
+ if (is_raw_hwpoison_page_in_folio(page))
return 0;
/* Safe to read the remaining bytes in this page. */
safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE);
@@ -206,7 +206,7 @@ static size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
/* Check each remaining page as long as we are not done yet. */
for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page++)
- if (is_raw_hwpoison_page_in_hugepage(page))
+ if (is_raw_hwpoison_page_in_folio(page))
break;
return min(safe_bytes, bytes);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5957bc25efa8..a9846f043712 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1079,9 +1079,9 @@ void hugetlb_unregister_node(struct node *node);
#endif
/*
- * Check if a given raw @page in a hugepage is HWPOISON.
+ * Check if a given raw @page is HWPOISON in a folio of any kind
*/
-bool is_raw_hwpoison_page_in_hugepage(struct page *page);
+bool is_raw_hwpoison_page_in_folio(struct page *page);
static inline unsigned long huge_page_mask_align(struct file *file)
{
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ee42d4361309..40129e0b8213 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1834,14 +1834,21 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
return (struct llist_head *)&folio->_hugetlb_hwpoison;
}
-bool is_raw_hwpoison_page_in_hugepage(struct page *page)
+/**
+ * is_raw_hwpoison_page_in_folio - answers the question whether a given
+ * page is indeed hwpoisoned.
+ * @page: given page, maybe base page, part of a large folio or hugetlb.
+ *
+ * Return: true if @page is the raw hwpoisoned page; else, false.
+ */
+bool is_raw_hwpoison_page_in_folio(struct page *page)
{
struct llist_head *raw_hwp_head;
struct raw_hwp_page *p;
struct folio *folio = page_folio(page);
bool ret = false;
- if (!folio_test_hwpoison(folio))
+ if (!folio_contain_hwpoisoned_page(folio))
return false;
if (!folio_test_hugetlb(folio))
@@ -1868,6 +1875,7 @@ bool is_raw_hwpoison_page_in_hugepage(struct page *page)
return ret;
}
+EXPORT_SYMBOL_GPL(is_raw_hwpoison_page_in_folio);
static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
{
--
2.43.5
* [PATCH v2 02/11] mm: factor out adjust_range_hwpoison() from hugetlbfs
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
2026-06-17 17:25 ` [PATCH v2 01/11] mm/memory-failure: make is_raw_hwpoison_page_in_hugepage() general purpose Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 17:25 ` [PATCH v2 03/11] mm/filemap: add hwpoison handling to filemap_read() Jane Chu
` (9 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
The functionality and implementation of adjust_range_hwpoison() are
generic, so factor the function out of hugetlbfs and make it available
for generic use.
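For readers who want to follow the arithmetic outside the kernel, here is a
minimal userspace sketch of the same range calculation. It is only a model:
struct model_folio, model_page_poisoned() and model_adjust_range() are
hypothetical names, the per-page flag array stands in for
is_raw_hwpoison_page_in_folio(), and a 4 KiB base page is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB base pages */

/* Hypothetical stand-in for a folio: one poison flag per base page. */
struct model_folio {
	size_t nr_pages;
	const bool *poisoned;
};

/* Stand-in for is_raw_hwpoison_page_in_folio(); not the kernel helper. */
static bool model_page_poisoned(const struct model_folio *f, size_t idx)
{
	assert(idx < f->nr_pages);
	return f->poisoned[idx];
}

/*
 * Return how many of @bytes, starting @offset bytes into the folio, can be
 * read before touching the first poisoned page. The caller must keep
 * offset + bytes within the folio.
 */
size_t model_adjust_range(const struct model_folio *f, size_t offset,
			  size_t bytes)
{
	size_t idx = offset / MODEL_PAGE_SIZE;
	size_t safe_bytes;

	if (model_page_poisoned(f, idx))
		return 0;

	/* The rest of the first page is safe to read. */
	safe_bytes = MODEL_PAGE_SIZE - (offset % MODEL_PAGE_SIZE);
	idx++;

	/* Extend page by page until the request is covered or poison is hit. */
	for (; safe_bytes < bytes; safe_bytes += MODEL_PAGE_SIZE, idx++)
		if (model_page_poisoned(f, idx))
			break;

	return safe_bytes < bytes ? safe_bytes : bytes;
}
```

In this model a request that starts on a poisoned page yields 0, and a
request that crosses into one is truncated at the page boundary just before
it.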
[1] https://lore.kernel.org/linux-mm/aeZwAz6PcdlqSnJ2@casper.infradead.org/
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
fs/hugetlbfs/inode.c | 25 -------------------------
include/linux/fs.h | 2 ++
include/linux/hugetlb.h | 5 -----
mm/filemap.c | 31 +++++++++++++++++++++++++++++++
4 files changed, 33 insertions(+), 30 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 66520f7c53c6..f1f8c3f7388f 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -187,31 +187,6 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
return mm_get_unmapped_area_vmflags(file, addr0, len, pgoff, flags, 0);
}
-/*
- * Someone wants to read @bytes from a HWPOISON hugetlb @folio from @offset.
- * Returns the maximum number of bytes one can read without touching the 1st raw
- * HWPOISON page.
- */
-static size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
- size_t bytes)
-{
- struct page *page = folio_page(folio, offset / PAGE_SIZE);
- size_t safe_bytes;
-
- if (is_raw_hwpoison_page_in_folio(page))
- return 0;
- /* Safe to read the remaining bytes in this page. */
- safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE);
- page++;
-
- /* Check each remaining page as long as we are not done yet. */
- for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page++)
- if (is_raw_hwpoison_page_in_folio(page))
- break;
-
- return min(safe_bytes, bytes);
-}
-
/*
* Support for read() - Find the page attached to f_mapping and copy out the
* data. This provides functionality similar to filemap_read().
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 11559c513dfb..3876d5beda58 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3052,6 +3052,8 @@ int generic_write_checks_count(struct kiocb *iocb, loff_t *count);
extern int generic_write_check_limits(struct file *file, loff_t pos,
loff_t *count);
extern int generic_file_rw_checks(struct file *file_in, struct file *file_out);
+bool is_raw_hwpoison_page_in_folio(struct page *page);
+size_t adjust_range_hwpoison(struct folio *folio, size_t offset, size_t bytes);
ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *to,
ssize_t already_read);
extern ssize_t generic_file_read_iter(struct kiocb *, struct iov_iter *);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index a9846f043712..218284e80451 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1078,11 +1078,6 @@ void hugetlb_register_node(struct node *node);
void hugetlb_unregister_node(struct node *node);
#endif
-/*
- * Check if a given raw @page is HWPOISON in a folio of any kind
- */
-bool is_raw_hwpoison_page_in_folio(struct page *page);
-
static inline unsigned long huge_page_mask_align(struct file *file)
{
return PAGE_MASK & ~huge_page_mask(hstate_file(file));
diff --git a/mm/filemap.c b/mm/filemap.c
index 4e636647100c..a27ce4ad6247 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2753,6 +2753,37 @@ static void filemap_end_dropbehind_read(struct folio *folio)
}
}
+/**
+ * adjust_range_hwpoison - adjust clean readable range to avoid hwpoison.
+ * @folio: folio that contains hwpoison(s).
+ * @offset: bytes into the folio where subsequent read starts.
+ * @bytes: number of bytes the caller wishes to read.
+ *
+ * Return: adjusted total number of bytes starting off @offset that can be
+ * safely read from the @folio.
+ */
+size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
+ size_t bytes)
+{
+ struct page *page = folio_page(folio, offset / PAGE_SIZE);
+ size_t safe_bytes;
+
+ if (is_raw_hwpoison_page_in_folio(page))
+ return 0;
+
+ /* Safe to read the remaining bytes in this page. */
+ safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE);
+ page++;
+
+ /* Check each remaining page as long as we are not done yet. */
+ for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page++)
+ if (is_raw_hwpoison_page_in_folio(page))
+ break;
+
+ return min(safe_bytes, bytes);
+}
+EXPORT_SYMBOL_GPL(adjust_range_hwpoison);
+
/**
* filemap_read - Read data from the page cache.
* @iocb: The iocb to read.
--
2.43.5
* [PATCH v2 03/11] mm/filemap: add hwpoison handling to filemap_read()
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
2026-06-17 17:25 ` [PATCH v2 01/11] mm/memory-failure: make is_raw_hwpoison_page_in_hugepage() general purpose Jane Chu
2026-06-17 17:25 ` [PATCH v2 02/11] mm: factor out adjust_range_hwpoison() from hugetlbfs Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 17:25 ` [PATCH v2 04/11] hugetlbfs,filemap: replace hugetlbfs_read_iter() with generic_file_read_iter() Jane Chu
` (8 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
Add hwpoison handling to filemap_read() so that .read_iter() can make a
best-effort copy of data from clean pages without risking an MCE when
the page cache contains hwpoisoned pages.
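The behaviour a caller should observe can be sketched in userspace as
follows. This is a simplified model, not the kernel loop: struct model_file,
model_safe_bytes() and model_read() are hypothetical, the file is a flat
buffer with one poison flag per 4 KiB page, and the sketch assumes that a
read stopping short of a poisoned page returns the short count while a read
starting on one returns -EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB base pages */

/* Hypothetical model of a cached file: flat data plus per-page poison flags. */
struct model_file {
	const unsigned char *data;
	const bool *poisoned;	/* one entry per page of @data */
	size_t size;		/* file size in bytes */
};

/* Bytes readable from @pos before the first poisoned page, capped at @want. */
static size_t model_safe_bytes(const struct model_file *f, size_t pos,
			       size_t want)
{
	size_t safe = 0;

	while (safe < want) {
		size_t cur = pos + safe;

		if (f->poisoned[cur / MODEL_PAGE_SIZE])
			break;
		safe += MODEL_PAGE_SIZE - (cur % MODEL_PAGE_SIZE);
	}
	return safe < want ? safe : want;
}

/*
 * Copy up to @len bytes at *@pos into @buf. Returns the number of bytes
 * copied, 0 at end of file, or -EIO when the very first page is poisoned.
 */
long model_read(const struct model_file *f, size_t *pos, void *buf, size_t len)
{
	size_t want, safe;

	assert(f && pos && buf);
	if (*pos >= f->size)
		return 0;

	want = f->size - *pos;
	if (want > len)
		want = len;

	safe = model_safe_bytes(f, *pos, want);
	if (safe == 0)
		return -EIO;

	memcpy(buf, f->data + *pos, safe);
	*pos += safe;
	return (long)safe;
}
```

A caller looping on model_read() therefore sees a short read ending at the
page boundary before the poison, followed by -EIO on the next call.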
[1] https://lore.kernel.org/linux-mm/aeZwAz6PcdlqSnJ2@casper.infradead.org/
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
mm/filemap.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index a27ce4ad6247..df8543573570 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2475,6 +2475,8 @@ static void filemap_get_read_batch(struct address_space *mapping,
if (!folio_batch_add(fbatch, folio))
break;
+ if (folio_contain_hwpoisoned_page(folio))
+ break;
if (!folio_test_uptodate(folio))
break;
if (folio_test_readahead(folio))
@@ -2871,6 +2873,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
size_t offset = iocb->ki_pos & (fsize - 1);
size_t bytes = min_t(loff_t, end_offset - iocb->ki_pos,
fsize - offset);
+ size_t adjusted;
size_t copied;
if (end_offset < folio_pos(folio))
@@ -2885,13 +2888,22 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
if (writably_mapped)
flush_dcache_folio(folio);
- copied = copy_folio_to_iter(folio, offset, bytes, iter);
+ adjusted = bytes;
+ if (folio_contain_hwpoisoned_page(folio)) {
+ adjusted = adjust_range_hwpoison(folio, offset, bytes);
+ if (adjusted == 0) {
+ error = -EIO;
+ break;
+ }
+ }
+
+ copied = copy_folio_to_iter(folio, offset, adjusted, iter);
already_read += copied;
iocb->ki_pos += copied;
last_pos = iocb->ki_pos;
- if (copied < bytes) {
+ if (copied < adjusted) {
error = -EFAULT;
break;
}
--
2.43.5
* [PATCH v2 04/11] hugetlbfs,filemap: replace hugetlbfs_read_iter() with generic_file_read_iter()
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
` (2 preceding siblings ...)
2026-06-17 17:25 ` [PATCH v2 03/11] mm/filemap: add hwpoison handling to filemap_read() Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 20:07 ` Matthew Wilcox
2026-06-17 17:25 ` [PATCH v2 05/11] hugetlb: Convert the vmf->pgoff to PAGE_SIZE granularity Jane Chu
` (7 subsequent siblings)
11 siblings, 1 reply; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
Replace hugetlbfs_read_iter() with generic_file_read_iter(), and teach
filemap_get_pages() to use the hugetlb page size when calculating
'last_index'.
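To illustrate why the rounding granule matters, here is a small userspace
sketch of the 'last_index' arithmetic. The names model_round_up() and
model_last_index() are hypothetical, PAGE_SHIFT is assumed to be 12, and the
granule must be a power of two, as it is for both the minimum folio size and
the huge page size.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Round @x up to a multiple of @align; @align must be a power of two. */
static uint64_t model_round_up(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * PAGE_SIZE index of the first folio beyond the end of a read of @count
 * bytes at @pos, when folios are at least @min_folio_bytes large.
 */
uint64_t model_last_index(uint64_t pos, uint64_t count,
			  uint64_t min_folio_bytes)
{
	return model_round_up(pos + count, min_folio_bytes) >> MODEL_PAGE_SHIFT;
}
```

With a 2 MiB granule, a 100-byte read at offset 0 gives a last index of 512
rather than 1, so the batch lookup covers the whole huge folio.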
[1] https://lore.kernel.org/linux-mm/aeZwAz6PcdlqSnJ2@casper.infradead.org/
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
fs/hugetlbfs/inode.c | 84 +-------------------------------------------
mm/filemap.c | 15 ++++++--
2 files changed, 14 insertions(+), 85 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index f1f8c3f7388f..1c25485c91b9 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -187,88 +187,6 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
return mm_get_unmapped_area_vmflags(file, addr0, len, pgoff, flags, 0);
}
-/*
- * Support for read() - Find the page attached to f_mapping and copy out the
- * data. This provides functionality similar to filemap_read().
- */
-static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
-{
- struct file *file = iocb->ki_filp;
- struct hstate *h = hstate_file(file);
- struct address_space *mapping = file->f_mapping;
- struct inode *inode = mapping->host;
- unsigned long index = iocb->ki_pos >> huge_page_shift(h);
- unsigned long offset = iocb->ki_pos & ~huge_page_mask(h);
- unsigned long end_index;
- loff_t isize;
- ssize_t retval = 0;
-
- while (iov_iter_count(to)) {
- struct folio *folio;
- size_t nr, copied, want;
-
- /* nr is the maximum number of bytes to copy from this page */
- nr = huge_page_size(h);
- isize = i_size_read(inode);
- if (!isize)
- break;
- end_index = (isize - 1) >> huge_page_shift(h);
- if (index > end_index)
- break;
- if (index == end_index) {
- nr = ((isize - 1) & ~huge_page_mask(h)) + 1;
- if (nr <= offset)
- break;
- }
- nr = nr - offset;
-
- /* Find the folio */
- folio = filemap_lock_hugetlb_folio(h, mapping, index);
- if (IS_ERR(folio)) {
- /*
- * We have a HOLE, zero out the user-buffer for the
- * length of the hole or request.
- */
- copied = iov_iter_zero(nr, to);
- } else {
- folio_unlock(folio);
-
- if (!folio_test_hwpoison(folio))
- want = nr;
- else {
- /*
- * Adjust how many bytes safe to read without
- * touching the 1st raw HWPOISON page after
- * offset.
- */
- want = adjust_range_hwpoison(folio, offset, nr);
- if (want == 0) {
- folio_put(folio);
- retval = -EIO;
- break;
- }
- }
-
- /*
- * We have the folio, copy it to user space buffer.
- */
- copied = copy_folio_to_iter(folio, offset, want, to);
- folio_put(folio);
- }
- offset += copied;
- retval += copied;
- if (copied != nr && iov_iter_count(to)) {
- if (!retval)
- retval = -EFAULT;
- break;
- }
- index += offset >> huge_page_shift(h);
- offset &= ~huge_page_mask(h);
- }
- iocb->ki_pos = ((loff_t)index << huge_page_shift(h)) + offset;
- return retval;
-}
-
static int hugetlbfs_write_begin(const struct kiocb *iocb,
struct address_space *mapping,
loff_t pos, unsigned len,
@@ -1181,7 +1099,7 @@ static void init_once(void *foo)
}
static const struct file_operations hugetlbfs_file_operations = {
- .read_iter = hugetlbfs_read_iter,
+ .read_iter = generic_file_read_iter,
.mmap = hugetlbfs_file_mmap,
.fsync = noop_fsync,
.get_unmapped_area = hugetlb_get_unmapped_area,
diff --git a/mm/filemap.c b/mm/filemap.c
index df8543573570..eb03b31791fc 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2672,20 +2672,30 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
{
struct file *filp = iocb->ki_filp;
struct address_space *mapping = filp->f_mapping;
+ bool is_hugetlbfs = is_file_hugepages(filp);
pgoff_t index = iocb->ki_pos >> PAGE_SHIFT;
pgoff_t last_index;
struct folio *folio;
unsigned int flags;
+ size_t min_folio_bytes;
int err = 0;
/* "last_index" is the index of the folio beyond the end of the read */
- last_index = round_up(iocb->ki_pos + count,
- mapping_min_folio_nrbytes(mapping)) >> PAGE_SHIFT;
+ if (is_hugetlbfs)
+ min_folio_bytes = huge_page_size(hstate_file(filp));
+ else
+ min_folio_bytes = mapping_min_folio_nrbytes(mapping);
+ last_index = round_up(iocb->ki_pos + count, min_folio_bytes) >> PAGE_SHIFT;
+
retry:
if (fatal_signal_pending(current))
return -EINTR;
filemap_get_read_batch(mapping, index, last_index - 1, fbatch);
+
+ if (is_hugetlbfs)
+ goto done;
+
if (!folio_batch_count(fbatch)) {
DEFINE_READAHEAD(ractl, filp, &filp->f_ra, mapping, index);
@@ -2724,6 +2734,7 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
goto err;
}
+done:
trace_mm_filemap_get_pages(mapping, index, last_index - 1);
return 0;
err:
--
2.43.5
* [PATCH v2 05/11] hugetlb: Convert the vmf->pgoff to PAGE_SIZE granularity
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
` (3 preceding siblings ...)
2026-06-17 17:25 ` [PATCH v2 04/11] hugetlbfs,filemap: replace hugetlbfs_read_iter() with generic_file_read_iter() Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 17:25 ` [PATCH v2 06/11] hugetlb: make hugetlb_fault_mutex_hash() to take PAGE_SIZE index Jane Chu
` (6 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
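Everywhere else in MM, the page fault vmf->pgoff is in PAGE_SIZE
granularity; only hugetlbfs uses huge page size granularity. The
exception is unnecessary, so convert hugetlb's vmf->pgoff to PAGE_SIZE
granularity and shift it down where a huge page index is still needed.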
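The relationship between the two granularities can be checked with a short
userspace sketch. It mirrors the arithmetic of linear_page_index() and
vma_hugecache_offset() on a hypothetical struct model_vma; PAGE_SHIFT is
assumed to be 12, and vm_start and vm_pgoff are assumed to be huge page
aligned, as they are for hugetlb mappings.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Hypothetical subset of a VMA: just what the index math needs. */
struct model_vma {
	uint64_t vm_start;	/* first mapped address */
	uint64_t vm_pgoff;	/* file offset of vm_start, in PAGE_SIZE units */
};

/* Same arithmetic as linear_page_index(): a PAGE_SIZE granular index. */
uint64_t model_linear_page_index(const struct model_vma *vma, uint64_t addr)
{
	assert(addr >= vma->vm_start);
	return ((addr - vma->vm_start) >> MODEL_PAGE_SHIFT) + vma->vm_pgoff;
}

/* Same arithmetic as vma_hugecache_offset(): a huge page granular index. */
uint64_t model_hugecache_offset(const struct model_vma *vma, uint64_t addr,
				unsigned int order)
{
	unsigned int huge_shift = order + MODEL_PAGE_SHIFT;

	assert(addr >= vma->vm_start);
	return ((addr - vma->vm_start) >> huge_shift) +
	       (vma->vm_pgoff >> order);
}
```

Under those alignment assumptions, shifting the PAGE_SIZE index right by the
huge page order yields the huge page index, which is the conversion the
patch applies at each remaining huge-index user.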
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
mm/hugetlb.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4b80b167cc9c..3255f6b762c9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5654,6 +5654,8 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_fault *vmf,
unsigned long reason)
{
u32 hash;
+ struct hstate *h = hstate_vma(vmf->vma);
+ pgoff_t idx = vmf->pgoff >> huge_page_order(h);
/*
* vma_lock and hugetlb_fault_mutex must be dropped before handling
@@ -5661,7 +5663,7 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_fault *vmf,
* userfault, any vma operation should be careful from here.
*/
hugetlb_vma_unlock_read(vmf->vma);
- hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
+ hash = hugetlb_fault_mutex_hash(mapping, idx);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
return handle_userfault(vmf, reason);
}
@@ -5686,7 +5688,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, unsigned
static vm_fault_t hugetlb_no_page(struct address_space *mapping,
struct vm_fault *vmf)
{
- u32 hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
+ u32 hash;
bool new_folio, new_anon_folio = false;
struct vm_area_struct *vma = vmf->vma;
struct mm_struct *mm = vma->vm_mm;
@@ -5696,6 +5698,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
struct folio *folio;
unsigned long size;
pte_t new_pte;
+ pgoff_t idx = vmf->pgoff >> huge_page_order(h);
/*
* Currently, we are forced to kill the process in the event the
@@ -5714,9 +5717,9 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
* before we get page_table_lock.
*/
new_folio = false;
- folio = filemap_lock_hugetlb_folio(h, mapping, vmf->pgoff);
+ folio = filemap_lock_hugetlb_folio(h, mapping, idx);
if (IS_ERR(folio)) {
- size = i_size_read(mapping->host) >> huge_page_shift(h);
+ size = i_size_read(mapping->host) >> PAGE_SHIFT;
if (vmf->pgoff >= size)
goto out;
/* Check for page in userfault range */
@@ -5778,8 +5781,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
new_folio = true;
if (vma->vm_flags & VM_MAYSHARE) {
- int err = hugetlb_add_to_page_cache(folio, mapping,
- vmf->pgoff);
+ int err = hugetlb_add_to_page_cache(folio, mapping, idx);
if (err) {
/*
* err can't be -EEXIST which implies someone
@@ -5894,6 +5896,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
if (unlikely(ret & VM_FAULT_RETRY))
vma_end_read(vma);
+ hash = hugetlb_fault_mutex_hash(mapping, idx);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
return ret;
@@ -5947,8 +5950,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
.address = address & huge_page_mask(h),
.real_address = address,
.flags = flags,
- .pgoff = vma_hugecache_offset(h, vma,
- address & huge_page_mask(h)),
+ .pgoff = linear_page_index(vma, address),
/* TODO: Track hugetlb faults using vm_fault */
/*
@@ -5963,7 +5965,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
* the same page in the page cache.
*/
mapping = vma->vm_file->f_mapping;
- hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff);
+ hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff >> huge_page_order(h));
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/*
--
2.43.5
* [PATCH v2 06/11] hugetlb: make hugetlb_fault_mutex_hash() to take PAGE_SIZE index
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
` (4 preceding siblings ...)
2026-06-17 17:25 ` [PATCH v2 05/11] hugetlb: Convert the vmf->pgoff to PAGE_SIZE granularity Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 17:25 ` [PATCH v2 07/11] hugetlb: replace filemap_lock_hugetlb_folio with filemap_lock_folio Jane Chu
` (5 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
Make hugetlb_fault_mutex_hash() take a PAGE_SIZE-based index.
This makes the helper interface consistent with filemap_get_folio()
and linear_page_index(), while preserving the same lock selection for
a given hugetlb file offset.
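A rough userspace sketch of why lock selection is preserved: the helper now
shifts the PAGE_SIZE index down internally, so every base-page index inside
one huge page produces the key the old huge-index caller passed. Everything
below is hypothetical, including model_mix(), which merely stands in for
jhash2(), and the table size of 64.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NUM_MUTEXES 64u	/* assumption: power-of-two table size */

/* Stand-in mixer for illustration only; the kernel uses jhash2(). */
static uint32_t model_mix(uint64_t mapping, uint64_t key)
{
	uint64_t x = mapping ^ (key * 0x9E3779B97F4A7C15ull);

	x ^= x >> 32;
	return (uint32_t)x;
}

/* Old interface: the caller passes a huge page granular index. */
uint32_t model_hash_old(uint64_t mapping, uint64_t huge_idx)
{
	return model_mix(mapping, huge_idx) & (MODEL_NUM_MUTEXES - 1);
}

/* New interface: the caller passes a PAGE_SIZE index; the shift is internal. */
uint32_t model_hash_new(uint64_t mapping, uint64_t page_index,
			unsigned int order)
{
	assert(order < 64);
	return model_mix(mapping, page_index >> order) &
	       (MODEL_NUM_MUTEXES - 1);
}
```

Callers that previously converted to a huge page index before hashing can
pass folio->index or linear_page_index() results directly and still land on
the same mutex slot.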
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
fs/hugetlbfs/inode.c | 9 ++++-----
include/linux/hugetlb.h | 2 +-
mm/hugetlb.c | 23 ++++++++++++-----------
mm/memfd.c | 9 +++++----
mm/userfaultfd.c | 6 +++---
5 files changed, 25 insertions(+), 24 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 1c25485c91b9..02cb265a580e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -448,7 +448,7 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
struct address_space *mapping = &inode->i_data;
const pgoff_t end = lend >> PAGE_SHIFT;
struct folio_batch fbatch;
- pgoff_t next, index;
+ pgoff_t next;
int i, freed = 0;
bool truncate_op = (lend == LLONG_MAX);
@@ -459,15 +459,14 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
struct folio *folio = fbatch.folios[i];
u32 hash = 0;
- index = folio->index >> huge_page_order(h);
- hash = hugetlb_fault_mutex_hash(mapping, index);
+ hash = hugetlb_fault_mutex_hash(mapping, folio->index);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/*
* Remove folio that was part of folio_batch.
*/
remove_inode_single_folio(h, inode, mapping, folio,
- index, truncate_op);
+ folio->index, truncate_op);
freed++;
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
@@ -664,7 +663,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
addr = index * hpage_size;
/* mutex taken here, fault path and hole punch */
- hash = hugetlb_fault_mutex_hash(mapping, index);
+ hash = hugetlb_fault_mutex_hash(mapping, index << huge_page_order(h));
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/* See if already present in mapping to avoid alloc/free */
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 218284e80451..cae5cdd3ea00 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -159,7 +159,7 @@ void folio_putback_hugetlb(struct folio *folio);
void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason);
void hugetlb_fix_reserve_counts(struct inode *inode);
extern struct mutex *hugetlb_fault_mutex_table;
-u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx);
+u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t index);
pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long addr, pud_t *pud);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3255f6b762c9..ecd1d1322fda 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5505,7 +5505,7 @@ static vm_fault_t hugetlb_wp(struct vm_fault *vmf)
*/
if (cow_from_owner) {
struct address_space *mapping = vma->vm_file->f_mapping;
- pgoff_t idx;
+ pgoff_t index;
u32 hash;
folio_put(old_folio);
@@ -5518,8 +5518,8 @@ static vm_fault_t hugetlb_wp(struct vm_fault *vmf)
*
* Reacquire both after unmap operation.
*/
- idx = vma_hugecache_offset(h, vma, vmf->address);
- hash = hugetlb_fault_mutex_hash(mapping, idx);
+ index = linear_page_index(vma, vmf->address);
+ hash = hugetlb_fault_mutex_hash(mapping, index);
hugetlb_vma_unlock_read(vma);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
@@ -5654,8 +5654,6 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_fault *vmf,
unsigned long reason)
{
u32 hash;
- struct hstate *h = hstate_vma(vmf->vma);
- pgoff_t idx = vmf->pgoff >> huge_page_order(h);
/*
* vma_lock and hugetlb_fault_mutex must be dropped before handling
@@ -5663,7 +5661,7 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_fault *vmf,
* userfault, any vma operation should be careful from here.
*/
hugetlb_vma_unlock_read(vmf->vma);
- hash = hugetlb_fault_mutex_hash(mapping, idx);
+ hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
return handle_userfault(vmf, reason);
}
@@ -5896,7 +5894,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
if (unlikely(ret & VM_FAULT_RETRY))
vma_end_read(vma);
- hash = hugetlb_fault_mutex_hash(mapping, idx);
+ hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
return ret;
@@ -5913,13 +5911,16 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
}
#ifdef CONFIG_SMP
-u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx)
+u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t index)
{
unsigned long key[2];
+ struct hstate *h;
u32 hash;
key[0] = (unsigned long) mapping;
- key[1] = idx;
+
+ h = hstate_inode(mapping->host);
+ key[1] = index >> huge_page_order(h);
hash = jhash2((u32 *)&key, sizeof(key)/(sizeof(u32)), 0);
@@ -5930,7 +5931,7 @@ u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx)
* For uniprocessor systems we always use a single mutex, so just
* return 0 and avoid the hashing overhead.
*/
-u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx)
+u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t index)
{
return 0;
}
@@ -5965,7 +5966,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
* the same page in the page cache.
*/
mapping = vma->vm_file->f_mapping;
- hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff >> huge_page_order(h));
+ hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/*
diff --git a/mm/memfd.c b/mm/memfd.c
index abe13b291ddc..b0ec0b12b98d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -64,7 +64,7 @@ static void memfd_tag_pins(struct xa_state *xas)
* (memfd_pin_folios()) cannot find a folio in the page cache at a given
* index in the mapping.
*/
-struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
+struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t index)
{
#ifdef CONFIG_HUGETLB_PAGE
struct folio *folio;
@@ -79,12 +79,13 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
*/
struct inode *inode = file_inode(memfd);
struct hstate *h = hstate_file(memfd);
+ pgoff_t idx;
int err = -ENOMEM;
long nr_resv;
gfp_mask = htlb_alloc_mask(h);
gfp_mask &= ~(__GFP_HIGHMEM | __GFP_MOVABLE);
- idx >>= huge_page_order(h);
+ idx = index >> huge_page_order(h);
nr_resv = hugetlb_reserve_pages(inode, idx, idx + 1, NULL, EMPTY_VMA_FLAGS);
if (nr_resv < 0)
@@ -116,7 +117,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
* races with concurrent allocations, as required by all other
* callers of hugetlb_add_to_page_cache().
*/
- hash = hugetlb_fault_mutex_hash(memfd->f_mapping, idx);
+ hash = hugetlb_fault_mutex_hash(memfd->f_mapping, index);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
err = hugetlb_add_to_page_cache(folio,
@@ -140,7 +141,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
return ERR_PTR(err);
}
#endif
- return shmem_read_folio(memfd->f_mapping, idx);
+ return shmem_read_folio(memfd->f_mapping, index);
}
/*
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 180bad42fc79..95fb94b697a4 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -707,7 +707,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
long copied;
struct folio *folio;
unsigned long vma_hpagesize;
- pgoff_t idx;
+ pgoff_t index;
u32 hash;
struct address_space *mapping;
@@ -776,9 +776,9 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
* in the case of shared pmds. fault mutex prevents
* races with other faulting threads.
*/
- idx = hugetlb_linear_page_index(dst_vma, dst_addr);
+ index = linear_page_index(dst_vma, dst_addr);
mapping = dst_vma->vm_file->f_mapping;
- hash = hugetlb_fault_mutex_hash(mapping, idx);
+ hash = hugetlb_fault_mutex_hash(mapping, index);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
hugetlb_vma_lock_read(dst_vma);
--
2.43.5
* [PATCH v2 07/11] hugetlb: replace filemap_lock_hugetlb_folio with filemap_lock_folio
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
` (5 preceding siblings ...)
2026-06-17 17:25 ` [PATCH v2 06/11] hugetlb: make hugetlb_fault_mutex_hash() to take PAGE_SIZE index Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 17:25 ` [PATCH v2 08/11] hugetlb: make hugetlb_add_to_page_cache() to take PAGE_SIZE granularity index Jane Chu
` (4 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
filemap_lock_hugetlb_folio() is a thin wrapper that converts a hugepage
index to a PAGE_SIZE index and calls filemap_lock_folio(). The wrapper is
redundant, so replace it with the generic filemap_lock_folio().
Suggested-by: David Hildenbrand <david@kernel.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
fs/hugetlbfs/inode.c | 3 +--
include/linux/hugetlb.h | 12 ------------
mm/hugetlb.c | 4 ++--
3 files changed, 3 insertions(+), 16 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 02cb265a580e..6c883478f7e7 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -518,10 +518,9 @@ static void hugetlbfs_zero_partial_page(struct hstate *h,
loff_t start,
loff_t end)
{
- pgoff_t idx = start >> huge_page_shift(h);
struct folio *folio;
- folio = filemap_lock_hugetlb_folio(h, mapping, idx);
+ folio = filemap_lock_folio(mapping, start >> PAGE_SHIFT);
if (IS_ERR(folio))
return;
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index cae5cdd3ea00..e78d0f706681 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -824,12 +824,6 @@ static inline unsigned int blocks_per_huge_page(struct hstate *h)
return huge_page_size(h) / 512;
}
-static inline struct folio *filemap_lock_hugetlb_folio(struct hstate *h,
- struct address_space *mapping, pgoff_t idx)
-{
- return filemap_lock_folio(mapping, idx << huge_page_order(h));
-}
-
#include <asm/hugetlb.h>
#ifndef is_hugepage_only_range
@@ -1096,12 +1090,6 @@ static inline struct hugepage_subpool *hugetlb_folio_subpool(struct folio *folio
return NULL;
}
-static inline struct folio *filemap_lock_hugetlb_folio(struct hstate *h,
- struct address_space *mapping, pgoff_t idx)
-{
- return NULL;
-}
-
static inline int isolate_or_dissolve_huge_folio(struct folio *folio,
struct list_head *list)
{
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ecd1d1322fda..5484e78fe72e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5715,7 +5715,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
* before we get page_table_lock.
*/
new_folio = false;
- folio = filemap_lock_hugetlb_folio(h, mapping, idx);
+ folio = filemap_lock_folio(mapping, vmf->pgoff);
if (IS_ERR(folio)) {
size = i_size_read(mapping->host) >> PAGE_SHIFT;
if (vmf->pgoff >= size)
@@ -6201,7 +6201,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
if (is_continue) {
ret = -EFAULT;
- folio = filemap_lock_hugetlb_folio(h, mapping, idx);
+ folio = filemap_lock_folio(mapping, idx << huge_page_order(h));
if (IS_ERR(folio))
goto out;
folio_in_pagecache = true;
--
2.43.5
* [PATCH v2 08/11] hugetlb: make hugetlb_add_to_page_cache() to take PAGE_SIZE granularity index
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
` (6 preceding siblings ...)
2026-06-17 17:25 ` [PATCH v2 07/11] hugetlb: replace filemap_lock_hugetlb_folio with filemap_lock_folio Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 17:25 ` [PATCH v2 09/11] hugetlb: remove the hugetlb_linear_page_index() helper Jane Chu
` (3 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
hugetlb_add_to_page_cache() is partly a wrapper around the generic
__filemap_add_folio(), which takes a PAGE_SIZE granularity index.
Make hugetlb_add_to_page_cache() consistent by having it take a
PAGE_SIZE granularity index as well.
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
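A minimal sketch of the unit conversion that callers now perform before
calling hugetlb_add_to_page_cache(). It is a userspace model, not kernel
code: HUGE_ORDER is a hypothetical stand-in for huge_page_order(h), assuming
2 MiB huge pages on 4 KiB base pages, and the helper names are illustrative.

```c
#include <assert.h>

typedef unsigned long pgoff_t;

/* Assumption: stand-in for huge_page_order(h), 2 MiB huge / 4 KiB base. */
#define HUGE_ORDER 9U

/* Hugepage-unit index -> PAGE_SIZE-unit index of its first base page. */
static inline pgoff_t hpage_to_page_index(pgoff_t idx)
{
	return idx << HUGE_ORDER;
}

/* PAGE_SIZE-unit index -> hugepage-unit index (rounds down). */
static inline pgoff_t page_to_hpage_index(pgoff_t index)
{
	return index >> HUGE_ORDER;
}

/* The round trip is lossless when starting from a hugepage-unit index. */
static inline void check_round_trip(pgoff_t idx)
{
	assert(page_to_hpage_index(hpage_to_page_index(idx)) == idx);
}
```

In the fallocate loop of this patch, idx plays the role of the hugepage-unit
index and index is the result of the first helper.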
fs/hugetlbfs/inode.c | 13 ++++++++-----
mm/hugetlb.c | 18 +++++++++---------
mm/memfd.c | 2 +-
3 files changed, 18 insertions(+), 15 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 6c883478f7e7..0b49a79efb08 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -599,7 +599,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
struct mm_struct *mm = current->mm;
loff_t hpage_size = huge_page_size(h);
unsigned long hpage_shift = huge_page_shift(h);
- pgoff_t start, index, end;
+ pgoff_t start, idx, end;
int error;
u32 hash;
@@ -639,7 +639,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
vm_flags_init(&pseudo_vma, VM_HUGETLB | VM_MAYSHARE | VM_SHARED);
pseudo_vma.vm_file = file;
- for (index = start; index < end; index++) {
+ for (idx = start; idx < end; idx++) {
+ pgoff_t index = idx << huge_page_order(h);
+
/*
* This is supposed to be the vaddr where the page is being
* faulted in, but we have no vaddr here.
@@ -659,14 +661,14 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
}
/* addr is the offset within the file (zero based) */
- addr = index * hpage_size;
+ addr = idx * hpage_size;
/* mutex taken here, fault path and hole punch */
- hash = hugetlb_fault_mutex_hash(mapping, index << huge_page_order(h));
+ hash = hugetlb_fault_mutex_hash(mapping, index);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/* See if already present in mapping to avoid alloc/free */
- folio = filemap_get_folio(mapping, index << huge_page_order(h));
+ folio = filemap_get_folio(mapping, index);
if (!IS_ERR(folio)) {
folio_put(folio);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
@@ -690,6 +692,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
folio_zero_user(folio, addr);
__folio_mark_uptodate(folio);
error = hugetlb_add_to_page_cache(folio, mapping, index);
+
if (unlikely(error)) {
restore_reserve_on_error(h, &pseudo_vma, addr, folio);
folio_put(folio);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5484e78fe72e..b41e7b8df094 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5621,15 +5621,14 @@ bool hugetlbfs_pagecache_present(struct hstate *h,
}
int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
- pgoff_t idx)
+ pgoff_t index)
{
struct inode *inode = mapping->host;
struct hstate *h = hstate_inode(inode);
int err;
- idx <<= huge_page_order(h);
__folio_set_locked(folio);
- err = __filemap_add_folio(mapping, folio, idx, GFP_KERNEL, NULL);
+ err = __filemap_add_folio(mapping, folio, index, GFP_KERNEL, NULL);
if (unlikely(err)) {
__folio_clear_locked(folio);
@@ -5696,7 +5695,6 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
struct folio *folio;
unsigned long size;
pte_t new_pte;
- pgoff_t idx = vmf->pgoff >> huge_page_order(h);
/*
* Currently, we are forced to kill the process in the event the
@@ -5779,7 +5777,8 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
new_folio = true;
if (vma->vm_flags & VM_MAYSHARE) {
- int err = hugetlb_add_to_page_cache(folio, mapping, idx);
+ int err = hugetlb_add_to_page_cache(folio, mapping, vmf->pgoff);
+
if (err) {
/*
* err can't be -EEXIST which implies someone
@@ -6170,7 +6169,8 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
bool wp_enabled = (flags & MFILL_ATOMIC_WP);
struct hstate *h = hstate_vma(dst_vma);
struct address_space *mapping = dst_vma->vm_file->f_mapping;
- pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr);
+ pgoff_t index = linear_page_index(dst_vma, dst_addr);
+
unsigned long size = huge_page_size(h);
int vm_shared = dst_vma->vm_flags & VM_SHARED;
pte_t _dst_pte;
@@ -6201,7 +6201,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
if (is_continue) {
ret = -EFAULT;
- folio = filemap_lock_folio(mapping, idx << huge_page_order(h));
+ folio = filemap_lock_folio(mapping, index);
if (IS_ERR(folio))
goto out;
folio_in_pagecache = true;
@@ -6297,7 +6297,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
/* Add shared, newly allocated pages to the page cache. */
if (vm_shared && !is_continue) {
ret = -EFAULT;
- if (idx >= (i_size_read(mapping->host) >> huge_page_shift(h)))
+ if (index >= (i_size_read(mapping->host) >> PAGE_SHIFT))
goto out_release_nounlock;
/*
@@ -6306,7 +6306,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
* hugetlb_fault_mutex_table that here must be hold by
* the caller.
*/
- ret = hugetlb_add_to_page_cache(folio, mapping, idx);
+ ret = hugetlb_add_to_page_cache(folio, mapping, index);
if (ret)
goto out_release_nounlock;
folio_in_pagecache = true;
diff --git a/mm/memfd.c b/mm/memfd.c
index b0ec0b12b98d..0b5e8f111b39 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -122,7 +122,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t index)
err = hugetlb_add_to_page_cache(folio,
memfd->f_mapping,
- idx);
+ index);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
--
2.43.5
* [PATCH v2 09/11] hugetlb: remove the hugetlb_linear_page_index() helper
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
` (7 preceding siblings ...)
2026-06-17 17:25 ` [PATCH v2 08/11] hugetlb: make hugetlb_add_to_page_cache() to take PAGE_SIZE granularity index Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 17:25 ` [PATCH v2 10/11] hugetlb: drop vma_hugecache_offset() in favor of linear_page_index() Jane Chu
` (2 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
hugetlb_linear_page_index() no longer has any callers, so remove it.
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
include/linux/hugetlb.h | 17 -----------------
1 file changed, 17 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index e78d0f706681..e5a459a6e4b2 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -787,23 +787,6 @@ static inline unsigned huge_page_shift(struct hstate *h)
return h->order + PAGE_SHIFT;
}
-/**
- * hugetlb_linear_page_index() - linear_page_index() but in hugetlb
- * page size granularity.
- * @vma: the hugetlb VMA
- * @address: the virtual address within the VMA
- *
- * Return: the page offset within the mapping in huge page units.
- */
-static inline pgoff_t hugetlb_linear_page_index(struct vm_area_struct *vma,
- unsigned long address)
-{
- struct hstate *h = hstate_vma(vma);
-
- return ((address - vma->vm_start) >> huge_page_shift(h)) +
- (vma->vm_pgoff >> huge_page_order(h));
-}
-
static inline bool order_is_gigantic(unsigned int order)
{
return order > MAX_PAGE_ORDER;
--
2.43.5
* [PATCH v2 10/11] hugetlb: drop vma_hugecache_offset() in favor of linear_page_index()
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
` (8 preceding siblings ...)
2026-06-17 17:25 ` [PATCH v2 09/11] hugetlb: remove the hugetlb_linear_page_index() helper Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 17:25 ` [PATCH v2 11/11] hugetlb: make hugetlb_[un]reserve_pages() to take PAGE granularity index Jane Chu
2026-06-17 18:28 ` [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Mike Rapoport
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
vma_hugecache_offset() converts a hugetlb VMA address into a mapping
offset in hugepage units. While the helper is small, its name is not very
clear, and the resulting code is harder to follow than using the common MM
helper directly.
Use linear_page_index() instead, with an explicit conversion from
PAGE_SIZE units to hugepage units at each call site, and remove
vma_hugecache_offset().
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
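A minimal sketch of why the replacement gives the same hugepage-unit offset,
under the assumption that vm_pgoff is huge page aligned. It is a userspace
model: struct vma_model, the *_model helpers, PAGE_SHIFT = 12 and
HUGE_ORDER = 9 are illustrative stand-ins, not the kernel definitions.

```c
#include <assert.h>

/* Assumptions: 4 KiB base pages and 2 MiB huge pages. */
#define PAGE_SHIFT 12U
#define HUGE_ORDER 9U

/* Illustrative subset of struct vm_area_struct. */
struct vma_model {
	unsigned long vm_start;	/* first virtual address of the VMA */
	unsigned long vm_pgoff;	/* file offset in PAGE_SIZE units */
};

/* Model of linear_page_index(): offset in PAGE_SIZE units. */
static inline unsigned long
linear_page_index_model(const struct vma_model *vma, unsigned long addr)
{
	return ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}

/* Model of the removed vma_hugecache_offset(): hugepage units. */
static inline unsigned long
old_hugecache_offset_model(const struct vma_model *vma, unsigned long addr)
{
	return ((addr - vma->vm_start) >> (PAGE_SHIFT + HUGE_ORDER)) +
	       (vma->vm_pgoff >> HUGE_ORDER);
}

/* Model of the open-coded replacement used at each call site. */
static inline unsigned long
new_hugecache_offset_model(const struct vma_model *vma, unsigned long addr)
{
	return linear_page_index_model(vma, addr) >> HUGE_ORDER;
}
```

Because vm_pgoff is a multiple of 1 << HUGE_ORDER in this model, shifting the
sum right by HUGE_ORDER equals the sum of the two separately shifted terms.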
mm/hugetlb.c | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b41e7b8df094..a677ea774143 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1001,17 +1001,6 @@ static long region_count(struct resv_map *resv, long f, long t)
return chg;
}
-/*
- * Convert the address within this vma to the page offset within
- * the mapping, huge page units here.
- */
-static pgoff_t vma_hugecache_offset(struct hstate *h,
- struct vm_area_struct *vma, unsigned long address)
-{
- return ((address - vma->vm_start) >> huge_page_shift(h)) +
- (vma->vm_pgoff >> huge_page_order(h));
-}
-
/*
* Flags for MAP_PRIVATE reservations. These are stored in the bottom
* bits of the reservation map pointer, which are always clear due to
@@ -2437,7 +2426,9 @@ static long __vma_reservation_common(struct hstate *h,
if (!resv)
return 1;
- idx = vma_hugecache_offset(h, vma, addr);
+ idx = linear_page_index(vma, addr);
+ idx >>= huge_page_order(h);
+
switch (mode) {
case VMA_NEEDS_RESV:
ret = region_chg(resv, idx, idx + 1, &dummy_out_regions_needed);
@@ -4693,8 +4684,10 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
if (!resv || !is_vma_resv_set(vma, HPAGE_RESV_OWNER))
return;
- start = vma_hugecache_offset(h, vma, vma->vm_start);
- end = vma_hugecache_offset(h, vma, vma->vm_end);
+ start = linear_page_index(vma, vma->vm_start);
+ start >>= huge_page_order(h);
+ end = linear_page_index(vma, vma->vm_end);
+ end >>= huge_page_order(h);
reserve = (end - start) - region_count(resv, start, end);
hugetlb_cgroup_uncharge_counter(resv, start, end);
--
2.43.5
* [PATCH v2 11/11] hugetlb: make hugetlb_[un]reserve_pages() to take PAGE granularity index
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
` (9 preceding siblings ...)
2026-06-17 17:25 ` [PATCH v2 10/11] hugetlb: drop vma_hugecache_offset() in favor of linear_page_index() Jane Chu
@ 2026-06-17 17:25 ` Jane Chu
2026-06-17 18:28 ` [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Mike Rapoport
11 siblings, 0 replies; 14+ messages in thread
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
To: akpm
Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
hugetlb_reserve_pages() and hugetlb_unreserve_pages() have two callers,
one of which is outside hugetlb. Make both functions take a PAGE_SIZE
granularity index to be consistent with the rest of MM.
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
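A minimal sketch of the range handling described above: the base-page range
[from_idx, to_idx) must be huge page aligned and is then converted to
hugepage units. It is a userspace model; HUGE_ORDER stands in for
huge_page_order(h) assuming 2 MiB huge pages on 4 KiB base pages, and
hpage_aligned() and hpages_in_range() are hypothetical helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: stand-in for huge_page_order(h). */
#define HUGE_ORDER 9U

/* True if a PAGE_SIZE-unit index falls on a huge page boundary. */
static inline bool hpage_aligned(long index)
{
	return ((unsigned long)index & ((1UL << HUGE_ORDER) - 1)) == 0;
}

/*
 * Number of huge pages covered by the base-page range [from_idx, to_idx),
 * or -1 if the range is negative or not huge page aligned.
 */
static inline long hpages_in_range(long from_idx, long to_idx)
{
	if (from_idx > to_idx)
		return -1;
	if (!hpage_aligned(from_idx) || !hpage_aligned(to_idx))
		return -1;
	return (to_idx >> HUGE_ORDER) - (from_idx >> HUGE_ORDER);
}
```

Under this model a sentinel end value such as LONG_MAX is not huge page
aligned, so hpage_aligned(LONG_MAX) is false.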
Documentation/mm/hugetlbfs_reserv.rst | 19 ++++++++++---------
fs/hugetlbfs/inode.c | 25 +++++++++++--------------
mm/hugetlb.c | 23 ++++++++++++++++++-----
mm/memfd.c | 20 ++++++--------------
4 files changed, 45 insertions(+), 42 deletions(-)
diff --git a/Documentation/mm/hugetlbfs_reserv.rst b/Documentation/mm/hugetlbfs_reserv.rst
index a49115db18c7..880e9ccd5b57 100644
--- a/Documentation/mm/hugetlbfs_reserv.rst
+++ b/Documentation/mm/hugetlbfs_reserv.rst
@@ -112,11 +112,12 @@ flag was specified in either the shmget() or mmap() call. If NORESERVE
was specified, then this routine returns immediately as no reservations
are desired.
-The arguments 'from' and 'to' are huge page indices into the mapping or
-underlying file. For shmget(), 'from' is always 0 and 'to' corresponds to
-the length of the segment/mapping. For mmap(), the offset argument could
-be used to specify the offset into the underlying file. In such a case,
-the 'from' and 'to' arguments have been adjusted by this offset.
+The arguments 'from' and 'to' are base page indices into the mapping or
+underlying file that must be huge page aligned. For shmget(),
+'from' is always 0 and 'to' corresponds to the length of the segment/mapping.
+For mmap(), the offset argument could be used to specify the offset into
+the underlying file. In such a case, the 'from' and 'to' arguments have been
+adjusted by this offset.
One of the big differences between PRIVATE and SHARED mappings is the way
in which reservations are represented in the reservation map.
@@ -136,10 +137,10 @@ to indicate this VMA owns the reservations.
The reservation map is consulted to determine how many huge page reservations
are needed for the current mapping/segment. For private mappings, this is
-always the value (to - from). However, for shared mappings it is possible that
-some reservations may already exist within the range (to - from). See the
-section :ref:`Reservation Map Modifications <resv_map_modifications>`
-for details on how this is accomplished.
+always the number of huge pages covered by the range [from, to).
+However, for shared mappings it is possible that some reservations may already
+exist within the range [from, to). See the section :ref:`Reservation Map
+Modifications <resv_map_modifications>` for details on how this is accomplished.
The mapping may be associated with a subpool. If so, the subpool is consulted
to ensure there is sufficient space for the mapping. It is possible that the
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 0b49a79efb08..fe1ebfd604dc 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -150,10 +150,8 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
if (inode->i_flags & S_PRIVATE)
vma_flags_set(&vma_flags, VMA_NORESERVE_BIT);
- if (hugetlb_reserve_pages(inode,
- vma->vm_pgoff >> huge_page_order(h),
- len >> huge_page_shift(h), vma,
- vma_flags) < 0)
+ if (hugetlb_reserve_pages(inode, vma->vm_pgoff, len >> PAGE_SHIFT,
+ vma, vma_flags) < 0)
goto out;
ret = 0;
@@ -389,7 +387,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
*/
static void remove_inode_single_folio(struct hstate *h, struct inode *inode,
struct address_space *mapping, struct folio *folio,
- pgoff_t index, bool truncate_op)
+ pgoff_t idx, bool truncate_op)
{
/*
* If folio is mapped, it was faulted in after being
@@ -401,7 +399,7 @@ static void remove_inode_single_folio(struct hstate *h, struct inode *inode,
*/
folio_lock(folio);
if (unlikely(folio_mapped(folio)))
- hugetlb_unmap_file_folio(h, mapping, folio, index);
+ hugetlb_unmap_file_folio(h, mapping, folio, idx);
/*
* We must remove the folio from page cache before removing
@@ -413,8 +411,10 @@ static void remove_inode_single_folio(struct hstate *h, struct inode *inode,
VM_BUG_ON_FOLIO(folio_test_hugetlb_restore_reserve(folio), folio);
hugetlb_delete_from_page_cache(folio);
if (!truncate_op) {
- if (unlikely(hugetlb_unreserve_pages(inode, index,
- index + 1, 1)))
+ pgoff_t index = idx << huge_page_order(h);
+ pgoff_t next = index + pages_per_huge_page(h);
+
+ if (unlikely(hugetlb_unreserve_pages(inode, index, next, 1)))
hugetlb_fix_reserve_counts(inode);
}
@@ -476,9 +476,8 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
}
if (truncate_op)
- (void)hugetlb_unreserve_pages(inode,
- lstart >> huge_page_shift(h),
- LONG_MAX, freed);
+ (void)hugetlb_unreserve_pages(inode, lstart >> PAGE_SHIFT,
+ LONG_MAX, freed);
}
static void hugetlbfs_evict_inode(struct inode *inode)
@@ -1429,9 +1428,7 @@ struct file *hugetlb_file_setup(const char *name, size_t size,
inode->i_size = size;
clear_nlink(inode);
- if (hugetlb_reserve_pages(inode, 0,
- size >> huge_page_shift(hstate_inode(inode)), NULL,
- acctflag) < 0)
+ if (hugetlb_reserve_pages(inode, 0, size >> PAGE_SHIFT, NULL, acctflag) < 0)
file = ERR_PTR(-ENOMEM);
else
file = alloc_file_pseudo(inode, mnt, name, O_RDWR,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a677ea774143..302f9cf9ef6b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6528,7 +6528,7 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
*/
long hugetlb_reserve_pages(struct inode *inode,
- long from, long to,
+ long from_idx, long to_idx,
struct vm_area_struct *vma,
vma_flags_t vma_flags)
{
@@ -6538,14 +6538,21 @@ long hugetlb_reserve_pages(struct inode *inode,
struct resv_map *resv_map;
struct hugetlb_cgroup *h_cg = NULL;
long gbl_reserve, regions_needed = 0;
+ long from, to;
int err;
+ VM_WARN_ON(!IS_ALIGNED(from_idx, 1UL << huge_page_order(h)));
+ VM_WARN_ON(!IS_ALIGNED(to_idx, 1UL << huge_page_order(h)));
+
/* This should never happen */
- if (from > to) {
+ if (from_idx > to_idx) {
VM_WARN(1, "%s called with a negative range\n", __func__);
return -EINVAL;
}
+ from = from_idx >> huge_page_order(h);
+ to = to_idx >> huge_page_order(h);
+
/*
* vma specific semaphore used for pmd sharing and fault/truncation
* synchronization
@@ -6715,14 +6722,20 @@ long hugetlb_reserve_pages(struct inode *inode,
return err;
}
-long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
- long freed)
+long hugetlb_unreserve_pages(struct inode *inode, long start_idx,
+ long end_idx, long freed)
{
struct hstate *h = hstate_inode(inode);
struct resv_map *resv_map = inode_resv_map(inode);
long chg = 0;
struct hugepage_subpool *spool = subpool_inode(inode);
- long gbl_reserve;
+ long gbl_reserve, start, end;
+
+ VM_WARN_ON(!IS_ALIGNED(start_idx, 1UL << huge_page_order(h)));
+ VM_WARN_ON(!IS_ALIGNED(end_idx, 1UL << huge_page_order(h)));
+
+ start = start_idx >> huge_page_order(h);
+ end = end_idx >> huge_page_order(h);
/*
* Since this routine can be called in the evict inode path for all
diff --git a/mm/memfd.c b/mm/memfd.c
index 0b5e8f111b39..24fefb1d2761 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -79,22 +79,19 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t index)
*/
struct inode *inode = file_inode(memfd);
struct hstate *h = hstate_file(memfd);
- pgoff_t idx;
+ pgoff_t next;
int err = -ENOMEM;
long nr_resv;
gfp_mask = htlb_alloc_mask(h);
gfp_mask &= ~(__GFP_HIGHMEM | __GFP_MOVABLE);
- idx = index >> huge_page_order(h);
+ next = index + pages_per_huge_page(h);
- nr_resv = hugetlb_reserve_pages(inode, idx, idx + 1, NULL, EMPTY_VMA_FLAGS);
+ nr_resv = hugetlb_reserve_pages(inode, index, next, NULL, EMPTY_VMA_FLAGS);
if (nr_resv < 0)
return ERR_PTR(nr_resv);
- folio = alloc_hugetlb_folio_reserve(h,
- numa_node_id(),
- NULL,
- gfp_mask);
+ folio = alloc_hugetlb_folio_reserve(h, numa_node_id(), NULL, gfp_mask);
if (folio) {
u32 hash;
@@ -119,13 +116,8 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t index)
*/
hash = hugetlb_fault_mutex_hash(memfd->f_mapping, index);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
-
- err = hugetlb_add_to_page_cache(folio,
- memfd->f_mapping,
- index);
-
+ err = hugetlb_add_to_page_cache(folio, memfd->f_mapping, index);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-
if (err) {
folio_put(folio);
goto err_unresv;
@@ -137,7 +129,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t index)
}
err_unresv:
if (nr_resv > 0)
- hugetlb_unreserve_pages(inode, idx, idx + 1, 0);
+ hugetlb_unreserve_pages(inode, index, next, 0);
return ERR_PTR(err);
}
#endif
--
2.43.5
* Re: [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
` (10 preceding siblings ...)
2026-06-17 17:25 ` [PATCH v2 11/11] hugetlb: make hugetlb_[un]reserve_pages() to take PAGE granularity index Jane Chu
@ 2026-06-17 18:28 ` Mike Rapoport
11 siblings, 0 replies; 14+ messages in thread
From: Mike Rapoport @ 2026-06-17 18:28 UTC (permalink / raw)
To: Jane Chu
Cc: akpm, willy, jack, viro, brauner, muchun.song, osalvador, david,
hughd, baolin.wang, linmiaohe, nao.horiguchi, lorenzo, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
Hi Jane,
On Wed, Jun 17, 2026 at 11:25:21AM -0600, Jane Chu wrote:
> changes in v2:
> - new patches 1-4: add hwpoison handling to filemap_read(),
> thus replace hugetlbfs_read_iter() with generic_file_read_iter(),
> suggested by Matthew [2];
> - new patch 5: convert hugetlb fault handler's vmf->pgoff to PAGE_SIZE
> granularity like the rest of mm fault handling convention, suggested
> by Matthew [2];
> - patch 6: fixed a bug in v1 pointed out by Usama Arif, also by syzbot;
> - patch 8: did not pick the Acked-by from Oscar (for 5/6 in v1) due to
> updates to the patch;
> - patch 11: add VM_WARN_ON in hugetlb_unreserve_pages(), per Oscar;
It seems that cow, hugetlb, GUP and HMM selftests trigger these WARN_ONs:
https://github.com/linux-mm/linux-mm/actions/runs/27707843062/job/81960927740
> v1:
> This series stems from a discussion with David. [1]
> The series makes a small cleanup to a few hugetlb interfaces used
> outside the subsystem by standardizing them on base-page indices.
> Hopefully this makes the interface semantics a bit more coherent with
> the rest of mm, while the internal hugetlb code continue to use hugepage
> indices where that remains the more natural fit.
>
> [1] https://lore.kernel.org/linux-mm/9ec9edd1-0f4c-4da2-ae78-0e7b251a9e25@kernel.org/
> [2] https://lore.kernel.org/linux-mm/aeZwAz6PcdlqSnJ2@casper.infradead.org/
--
Sincerely yours,
Mike.
* Re: [PATCH v2 04/11] hugetlbfs,filemap: replace hugetlbfs_read_iter() with generic_file_read_iter()
2026-06-17 17:25 ` [PATCH v2 04/11] hugetlbfs,filemap: replace hugetlbfs_read_iter() with generic_file_read_iter() Jane Chu
@ 2026-06-17 20:07 ` Matthew Wilcox
0 siblings, 0 replies; 14+ messages in thread
From: Matthew Wilcox @ 2026-06-17 20:07 UTC (permalink / raw)
To: Jane Chu
Cc: akpm, jack, viro, brauner, muchun.song, osalvador, david, hughd,
baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
On Wed, Jun 17, 2026 at 11:25:25AM -0600, Jane Chu wrote:
> +++ b/mm/filemap.c
> @@ -2672,20 +2672,30 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
> {
> struct file *filp = iocb->ki_filp;
> struct address_space *mapping = filp->f_mapping;
> + bool is_hugetlbfs = is_file_hugepages(filp);
> pgoff_t index = iocb->ki_pos >> PAGE_SHIFT;
> pgoff_t last_index;
> struct folio *folio;
> unsigned int flags;
> + size_t min_folio_bytes;
> int err = 0;
>
> /* "last_index" is the index of the folio beyond the end of the read */
> - last_index = round_up(iocb->ki_pos + count,
> - mapping_min_folio_nrbytes(mapping)) >> PAGE_SHIFT;
> + if (is_hugetlbfs)
> + min_folio_bytes = huge_page_size(hstate_file(filp));
> + else
> + min_folio_bytes = mapping_min_folio_nrbytes(mapping);
> + last_index = round_up(iocb->ki_pos + count, min_folio_bytes) >> PAGE_SHIFT;
I don't love this. Is there a way we can get mapping_min_folio_nrbytes()
to give us the right number for hugetlbfs? I don't see why it wouldn't
be possible ...
> filemap_get_read_batch(mapping, index, last_index - 1, fbatch);
> +
> + if (is_hugetlbfs)
> + goto done;
We don't actually need this, do we? For hugetlbfs, I don't think we
can get 0 folios in the batch, and then we won't find a folio with
readahead set, and they're always uptodate ... so we're just skipping a
few tests with this?
Thread overview: 14+ messages
2026-06-17 17:25 [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Jane Chu
2026-06-17 17:25 ` [PATCH v2 01/11] mm/memory-failure: make is_raw_hwpoison_page_in_hugepage() general purpose Jane Chu
2026-06-17 17:25 ` [PATCH v2 02/11] mm: factor out adjust_range_hwpoison() from hugetlbfs Jane Chu
2026-06-17 17:25 ` [PATCH v2 03/11] mm/filemap: add hwpoison handling to filemap_read() Jane Chu
2026-06-17 17:25 ` [PATCH v2 04/11] hugetlbfs,filemap: replace hugetlbfs_read_iter() with generic_file_read_iter() Jane Chu
2026-06-17 20:07 ` Matthew Wilcox
2026-06-17 17:25 ` [PATCH v2 05/11] hugetlb: Convert the vmf->pgoff to PAGE_SIZE granularity Jane Chu
2026-06-17 17:25 ` [PATCH v2 06/11] hugetlb: make hugetlb_fault_mutex_hash() to take PAGE_SIZE index Jane Chu
2026-06-17 17:25 ` [PATCH v2 07/11] hugetlb: replace filemap_lock_hugetlb_folio with filemap_lock_folio Jane Chu
2026-06-17 17:25 ` [PATCH v2 08/11] hugetlb: make hugetlb_add_to_page_cache() to take PAGE_SIZE granularity index Jane Chu
2026-06-17 17:25 ` [PATCH v2 09/11] hugetlb: remove the hugetlb_linear_page_index() helper Jane Chu
2026-06-17 17:25 ` [PATCH v2 10/11] hugetlb: drop vma_hugecache_offset() in favor of linear_page_index() Jane Chu
2026-06-17 17:25 ` [PATCH v2 11/11] hugetlb: make hugetlb_[un]reserve_pages() to take PAGE granularity index Jane Chu
2026-06-17 18:28 ` [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter Mike Rapoport