From: Mike Kravetz <mike.kravetz@oracle.com>
To: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
akpm@linux-foundation.org, songmuchun@bytedance.com,
willy@infradead.org, ackerleytng@google.com,
vannapurve@google.com, erdemaktas@google.com,
stable@vger.kernel.org
Subject: Re: [PATCH] mm/hugetlb: revert use of page_cache_next_miss()
Date: Fri, 5 May 2023 14:58:06 -0700 [thread overview]
Message-ID: <20230505215806.GA3497@monkey> (raw)
In-Reply-To: <20230505185301.534259-1-sidhartha.kumar@oracle.com>
On 05/05/23 11:53, Sidhartha Kumar wrote:
> As reported by Ackerley[1], the use of page_cache_next_miss() in
> hugetlbfs_fallocate() introduces a bug where a second fallocate() call to
> same offset fails with -EEXIST. Revert this change and go back to the
> previous method of using get from the page cache and then dropping the
> reference on success.
>
> hugetlbfs_pagecache_present() was also refactored to use
> page_cache_next_miss(), revert the usage there as well.
>
> User visible impacts include hugetlb fallocate incorrectly returning
> EEXIST if pages are already present in the file. In addition, hugetlb
> pages will not be included in core dumps if they need to be brought in via
> GUP. userfaultfd UFFDIO_COPY also uses this code and will not notice pages
> already present in the cache. It may try to allocate a new page and
> potentially return ENOMEM as opposed to EEXIST.
>
> Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
Small nit and a question for people more familiar with stable backports.
d0ce0e47b323 added the usage of page_cache_next_miss to hugetlb fallocate.
91a2fb956ad99 added the usage to hugetlbfs_pagecache_present. Both are
in v6.3 and d0ce0e47b323 (referenced here) comes later. So, I 'think' it
is OK to fix both instances with a single patch and reference the commit
where both are present. Or, should there be two patches which is more
technically correct?
> Cc: <stable@vger.kernel.org> #v6.3+
> Reported-by: Ackerley Tng <ackerleytng@google.com>
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
>
> [1] https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com/
> ---
> This patch is meant to fix stable v6.3.1 as safe as possible by doing a
> simple revert.
>
> Patch page cache: fix page_cache_next/prev_miss off by one by Mike is a
> potential fix that will allow the use of page_cache_next_miss() and is
> awaiting review.
>
> Patch Fix fallocate error in hugetlbfs when fallocating again by Ackerley
> is another fix but introduces a new function and is also awaiting review.
>
> fs/hugetlbfs/inode.c | 8 +++-----
> mm/hugetlb.c | 11 +++++------
> 2 files changed, 8 insertions(+), 11 deletions(-)
IMO, this is safest and simplest way of fixing v6.3. My proposed changes to
page_cache_next/prev_miss have the potential to impact readahead, so really
need review/testing by someone more familiar with that. If a fix is
urgently needed, I would suggest using this for backport and then either
use my patch or expand Ackerley's proposal to move forward.
As a backport to stable,
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
--
Mike Kravetz
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 9062da6da5675..6d6cd8f26d76d 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
> */
> struct folio *folio;
> unsigned long addr;
> - bool present;
>
> cond_resched();
>
> @@ -845,10 +844,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
> mutex_lock(&hugetlb_fault_mutex_table[hash]);
>
> /* See if already present in mapping to avoid alloc/free */
> - rcu_read_lock();
> - present = page_cache_next_miss(mapping, index, 1) != index;
> - rcu_read_unlock();
> - if (present) {
> + folio = filemap_get_folio(mapping, index);
> + if (folio) {
> + folio_put(folio);
> mutex_unlock(&hugetlb_fault_mutex_table[hash]);
> hugetlb_drop_vma_policy(&pseudo_vma);
> continue;
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 245038a9fe4ea..29ab27d2a3ef5 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5666,13 +5666,12 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
> {
> struct address_space *mapping = vma->vm_file->f_mapping;
> pgoff_t idx = vma_hugecache_offset(h, vma, address);
> - bool present;
> -
> - rcu_read_lock();
> - present = page_cache_next_miss(mapping, idx, 1) != idx;
> - rcu_read_unlock();
> + struct folio *folio;
>
> - return present;
> + folio = filemap_get_folio(mapping, idx);
> + if (folio)
> + folio_put(folio);
> + return folio != NULL;
> }
>
> int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
> --
> 2.40.0
>
next prev parent reply other threads:[~2023-05-05 21:58 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-04 23:38 [PATCH 0/1] fix page_cache_next/prev_miss off by one error Mike Kravetz
2023-05-04 23:38 ` [PATCH 1/1] page cache: fix page_cache_next/prev_miss off by one Mike Kravetz
2023-05-05 18:53 ` [PATCH] mm/hugetlb: revert use of page_cache_next_miss() Sidhartha Kumar
2023-05-05 21:58 ` Mike Kravetz [this message]
2023-05-16 23:12 ` Mike Kravetz
2023-05-23 5:00 ` kernel test robot
2023-05-23 6:14 ` Sidhartha Kumar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230505215806.GA3497@monkey \
--to=mike.kravetz@oracle.com \
--cc=ackerleytng@google.com \
--cc=akpm@linux-foundation.org \
--cc=erdemaktas@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=sidhartha.kumar@oracle.com \
--cc=songmuchun@bytedance.com \
--cc=stable@vger.kernel.org \
--cc=vannapurve@google.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).