From: Mike Kravetz <mike.kravetz@oracle.com>
To: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
akpm@linux-foundation.org, songmuchun@bytedance.com,
willy@infradead.org, ackerleytng@google.com,
vannapurve@google.com, erdemaktas@google.com,
stable@vger.kernel.org
Subject: Re: [PATCH] mm/hugetlb: revert use of page_cache_next_miss()
Date: Fri, 5 May 2023 14:58:06 -0700 [thread overview]
Message-ID: <20230505215806.GA3497@monkey> (raw)
In-Reply-To: <20230505185301.534259-1-sidhartha.kumar@oracle.com>
On 05/05/23 11:53, Sidhartha Kumar wrote:
> As reported by Ackerley[1], the use of page_cache_next_miss() in
> hugetlbfs_fallocate() introduces a bug where a second fallocate() call to
> same offset fails with -EEXIST. Revert this change and go back to the
> previous method of using get from the page cache and then dropping the
> reference on success.
>
> hugetlbfs_pagecache_present() was also refactored to use
> page_cache_next_miss(), revert the usage there as well.
>
> User visible impacts include hugetlb fallocate incorrectly returning
> EEXIST if pages are already present in the file. In addition, hugetlb
> pages will not be included in core dumps if they need to be brought in via
> GUP. userfaultfd UFFDIO_COPY also uses this code and will not notice pages
> already present in the cache. It may try to allocate a new page and
> potentially return ENOMEM as opposed to EEXIST.
>
> Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
Small nit and a question for people more familiar with stable backports.
d0ce0e47b323 added the usage of page_cache_next_miss to hugetlb fallocate.
91a2fb956ad99 added the usage to hugetlbfs_pagecache_present. Both are
in v6.3 and d0ce0e47b323 (referenced here) comes later. So, I 'think' it
is OK to fix both instances with a single patch and reference the commit
where both are present. Or, should there be two patches which is more
technically correct?
> Cc: <stable@vger.kernel.org> #v6.3+
> Reported-by: Ackerley Tng <ackerleytng@google.com>
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
>
> [1] https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com/
> ---
> This patch is meant to fix stable v6.3.1 as safe as possible by doing a
> simple revert.
>
> Patch page cache: fix page_cache_next/prev_miss off by one by Mike is a
> potential fix that will allow the use of page_cache_next_miss() and is
> awaiting review.
>
> Patch Fix fallocate error in hugetlbfs when fallocating again by Ackerley
> is another fix but introduces a new function and is also awaiting review.
>
> fs/hugetlbfs/inode.c | 8 +++-----
> mm/hugetlb.c | 11 +++++------
> 2 files changed, 8 insertions(+), 11 deletions(-)
IMO, this is safest and simplest way of fixing v6.3. My proposed changes to
page_cache_next/prev_miss have the potential to impact readahead, so really
need review/testing by someone more familiar with that. If a fix is
urgently needed, I would suggest using this for backport and then either
use my patch or expand Ackerley's proposal to move forward.
As a backport to stable,
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
--
Mike Kravetz
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 9062da6da5675..6d6cd8f26d76d 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
> */
> struct folio *folio;
> unsigned long addr;
> - bool present;
>
> cond_resched();
>
> @@ -845,10 +844,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
> mutex_lock(&hugetlb_fault_mutex_table[hash]);
>
> /* See if already present in mapping to avoid alloc/free */
> - rcu_read_lock();
> - present = page_cache_next_miss(mapping, index, 1) != index;
> - rcu_read_unlock();
> - if (present) {
> + folio = filemap_get_folio(mapping, index);
> + if (folio) {
> + folio_put(folio);
> mutex_unlock(&hugetlb_fault_mutex_table[hash]);
> hugetlb_drop_vma_policy(&pseudo_vma);
> continue;
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 245038a9fe4ea..29ab27d2a3ef5 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5666,13 +5666,12 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
> {
> struct address_space *mapping = vma->vm_file->f_mapping;
> pgoff_t idx = vma_hugecache_offset(h, vma, address);
> - bool present;
> -
> - rcu_read_lock();
> - present = page_cache_next_miss(mapping, idx, 1) != idx;
> - rcu_read_unlock();
> + struct folio *folio;
>
> - return present;
> + folio = filemap_get_folio(mapping, idx);
> + if (folio)
> + folio_put(folio);
> + return folio != NULL;
> }
>
> int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
> --
> 2.40.0
>
next prev parent reply other threads:[~2023-05-05 21:59 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-04 23:38 [PATCH 0/1] fix page_cache_next/prev_miss off by one error Mike Kravetz
2023-05-04 23:38 ` [PATCH 1/1] page cache: fix page_cache_next/prev_miss off by one Mike Kravetz
2023-05-05 18:53 ` [PATCH] mm/hugetlb: revert use of page_cache_next_miss() Sidhartha Kumar
2023-05-05 21:58 ` Mike Kravetz [this message]
2023-05-16 23:12 ` Mike Kravetz
2023-05-23 5:00 ` kernel test robot
2023-05-23 6:14 ` Sidhartha Kumar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230505215806.GA3497@monkey \
--to=mike.kravetz@oracle.com \
--cc=ackerleytng@google.com \
--cc=akpm@linux-foundation.org \
--cc=erdemaktas@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=sidhartha.kumar@oracle.com \
--cc=songmuchun@bytedance.com \
--cc=stable@vger.kernel.org \
--cc=vannapurve@google.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.