* [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one"
@ 2023-06-21 21:24 Mike Kravetz
2023-06-21 21:24 ` [PATCH 2/2] hugetlb: revert use of page_cache_next_miss() Mike Kravetz
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Mike Kravetz @ 2023-06-21 21:24 UTC (permalink / raw)
To: linux-kernel, linux-mm, linux-fsdevel
Cc: Matthew Wilcox, Ackerley Tng, Sidhartha Kumar, Muchun Song,
Vishal Annapurve, Erdem Aktas, Greg Kroah-Hartman, Andrew Morton,
Mike Kravetz, kernel test robot
This reverts commit 9425c591e06a9ab27a145ba655fb50532cf0bcc9
The reverted commit fixed up routines primarily used by readahead code
such that they could also be used by hugetlb. Unfortunately, this
caused a performance regression as pointed out by the Closes: tag.
The hugetlb code which uses page_cache_next_miss will be addressed in
a subsequent patch.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202306211346.1e9ff03e-oliver.sang@intel.com
Fixes: 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
mm/filemap.c | 26 ++++++++++----------------
1 file changed, 10 insertions(+), 16 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 3b73101f9f86..9e44a49bbd74 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1728,9 +1728,7 @@ bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
*
* Return: The index of the gap if found, otherwise an index outside the
* range specified (in which case 'return - index >= max_scan' will be true).
- * In the rare case of index wrap-around, 0 will be returned. 0 will also
- * be returned if index == 0 and there is a gap at the index. We can not
- * wrap-around if passed index == 0.
+ * In the rare case of index wrap-around, 0 will be returned.
*/
pgoff_t page_cache_next_miss(struct address_space *mapping,
pgoff_t index, unsigned long max_scan)
@@ -1740,13 +1738,12 @@ pgoff_t page_cache_next_miss(struct address_space *mapping,
while (max_scan--) {
void *entry = xas_next(&xas);
if (!entry || xa_is_value(entry))
- return xas.xa_index;
- if (xas.xa_index == 0 && index != 0)
- return xas.xa_index;
+ break;
+ if (xas.xa_index == 0)
+ break;
}
- /* No gaps in range and no wrap-around, return index beyond range */
- return xas.xa_index + 1;
+ return xas.xa_index;
}
EXPORT_SYMBOL(page_cache_next_miss);
@@ -1767,9 +1764,7 @@ EXPORT_SYMBOL(page_cache_next_miss);
*
* Return: The index of the gap if found, otherwise an index outside the
* range specified (in which case 'index - return >= max_scan' will be true).
- * In the rare case of wrap-around, ULONG_MAX will be returned. ULONG_MAX
- * will also be returned if index == ULONG_MAX and there is a gap at the
- * index. We can not wrap-around if passed index == ULONG_MAX.
+ * In the rare case of wrap-around, ULONG_MAX will be returned.
*/
pgoff_t page_cache_prev_miss(struct address_space *mapping,
pgoff_t index, unsigned long max_scan)
@@ -1779,13 +1774,12 @@ pgoff_t page_cache_prev_miss(struct address_space *mapping,
while (max_scan--) {
void *entry = xas_prev(&xas);
if (!entry || xa_is_value(entry))
- return xas.xa_index;
- if (xas.xa_index == ULONG_MAX && index != ULONG_MAX)
- return xas.xa_index;
+ break;
+ if (xas.xa_index == ULONG_MAX)
+ break;
}
- /* No gaps in range and no wrap-around, return index beyond range */
- return xas.xa_index - 1;
+ return xas.xa_index;
}
EXPORT_SYMBOL(page_cache_prev_miss);
--
2.41.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/2] hugetlb: revert use of page_cache_next_miss()
2023-06-21 21:24 [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one" Mike Kravetz
@ 2023-06-21 21:24 ` Mike Kravetz
2023-06-21 22:19 ` Sidhartha Kumar
2023-06-21 22:15 ` [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one" Sidhartha Kumar
2023-06-21 22:18 ` Andrew Morton
2 siblings, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2023-06-21 21:24 UTC (permalink / raw)
To: linux-kernel, linux-mm, linux-fsdevel
Cc: Matthew Wilcox, Ackerley Tng, Sidhartha Kumar, Muchun Song,
Vishal Annapurve, Erdem Aktas, Greg Kroah-Hartman, Andrew Morton,
Mike Kravetz
Ackerley Tng reported an issue with hugetlbfs fallocate as noted in the
Closes tag. The issue showed up after the conversion of hugetlb page
cache lookup code to use page_cache_next_miss. User visible effects are:
- hugetlbfs fallocate incorrectly returns -EEXIST if pages are presnet
in the file.
- hugetlb pages will not be included in core dumps if they need to be
brought in via GUP.
- userfaultfd UFFDIO_COPY will not notice pages already present in the
cache. It may try to allocate a new page and potentially return
ENOMEM as opposed to EEXIST.
Revert the use page_cache_next_miss() in hugetlb code.
IMPORTANT NOTE FOR STABLE BACKPORTS:
This patch will apply cleanly to v6.3. However, due to the change of
filemap_get_folio() return values, it will not function correctly. This
patch must be modified for stable backports.
Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
Reported-by: Ackerley Tng <ackerleytng@google.com>
Closes: https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
fs/hugetlbfs/inode.c | 8 +++-----
mm/hugetlb.c | 11 +++++------
2 files changed, 8 insertions(+), 11 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 90361a922cec..7b17ccfa039d 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
*/
struct folio *folio;
unsigned long addr;
- bool present;
cond_resched();
@@ -842,10 +841,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/* See if already present in mapping to avoid alloc/free */
- rcu_read_lock();
- present = page_cache_next_miss(mapping, index, 1) != index;
- rcu_read_unlock();
- if (present) {
+ folio = filemap_get_folio(mapping, index);
+ if (!IS_ERR(folio)) {
+ folio_put(folio);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
continue;
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d76574425da3..cb9077b96b43 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5728,13 +5728,12 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
{
struct address_space *mapping = vma->vm_file->f_mapping;
pgoff_t idx = vma_hugecache_offset(h, vma, address);
- bool present;
-
- rcu_read_lock();
- present = page_cache_next_miss(mapping, idx, 1) != idx;
- rcu_read_unlock();
+ struct folio *folio;
- return present;
+ folio = filemap_get_folio(mapping, idx);
+ if (!IS_ERR(folio))
+ folio_put(folio);
+ return folio != NULL;
}
int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
--
2.41.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] hugetlb: revert use of page_cache_next_miss()
2023-06-21 21:24 ` [PATCH 2/2] hugetlb: revert use of page_cache_next_miss() Mike Kravetz
@ 2023-06-21 22:19 ` Sidhartha Kumar
2023-06-21 22:39 ` Andrew Morton
0 siblings, 1 reply; 11+ messages in thread
From: Sidhartha Kumar @ 2023-06-21 22:19 UTC (permalink / raw)
To: Mike Kravetz, linux-kernel, linux-mm, linux-fsdevel
Cc: Matthew Wilcox, Ackerley Tng, Muchun Song, Vishal Annapurve,
Erdem Aktas, Greg Kroah-Hartman, Andrew Morton
On 6/21/23 2:24 PM, Mike Kravetz wrote:
> Ackerley Tng reported an issue with hugetlbfs fallocate as noted in the
> Closes tag. The issue showed up after the conversion of hugetlb page
> cache lookup code to use page_cache_next_miss. User visible effects are:
> - hugetlbfs fallocate incorrectly returns -EEXIST if pages are presnet
> in the file.
> - hugetlb pages will not be included in core dumps if they need to be
> brought in via GUP.
> - userfaultfd UFFDIO_COPY will not notice pages already present in the
> cache. It may try to allocate a new page and potentially return
> ENOMEM as opposed to EEXIST.
>
> Revert the use page_cache_next_miss() in hugetlb code.
>
> IMPORTANT NOTE FOR STABLE BACKPORTS:
> This patch will apply cleanly to v6.3. However, due to the change of
> filemap_get_folio() return values, it will not function correctly. This
> patch must be modified for stable backports.
This patch I sent previously can be used for the 6.3 backport:
https://lore.kernel.org/lkml/b5bd2b39-7e1e-148f-7462-9565773f6d41@oracle.com/T/#me37b56ca89368dc8dda2a33d39f681337788d13c
>
> Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
> Reported-by: Ackerley Tng <ackerleytng@google.com>
> Closes: https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
> fs/hugetlbfs/inode.c | 8 +++-----
> mm/hugetlb.c | 11 +++++------
> 2 files changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 90361a922cec..7b17ccfa039d 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
> */
> struct folio *folio;
> unsigned long addr;
> - bool present;
>
> cond_resched();
>
> @@ -842,10 +841,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
> mutex_lock(&hugetlb_fault_mutex_table[hash]);
>
> /* See if already present in mapping to avoid alloc/free */
> - rcu_read_lock();
> - present = page_cache_next_miss(mapping, index, 1) != index;
> - rcu_read_unlock();
> - if (present) {
> + folio = filemap_get_folio(mapping, index);
> + if (!IS_ERR(folio)) {
> + folio_put(folio);
> mutex_unlock(&hugetlb_fault_mutex_table[hash]);
> continue;
> }
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index d76574425da3..cb9077b96b43 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5728,13 +5728,12 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
> {
> struct address_space *mapping = vma->vm_file->f_mapping;
> pgoff_t idx = vma_hugecache_offset(h, vma, address);
> - bool present;
> -
> - rcu_read_lock();
> - present = page_cache_next_miss(mapping, idx, 1) != idx;
> - rcu_read_unlock();
> + struct folio *folio;
>
> - return present;
> + folio = filemap_get_folio(mapping, idx);
> + if (!IS_ERR(folio))
> + folio_put(folio);
> + return folio != NULL;
> }
>
> int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] hugetlb: revert use of page_cache_next_miss()
2023-06-21 22:19 ` Sidhartha Kumar
@ 2023-06-21 22:39 ` Andrew Morton
2023-06-21 22:46 ` Mike Kravetz
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2023-06-21 22:39 UTC (permalink / raw)
To: Sidhartha Kumar
Cc: Mike Kravetz, linux-kernel, linux-mm, linux-fsdevel,
Matthew Wilcox, Ackerley Tng, Muchun Song, Vishal Annapurve,
Erdem Aktas, Greg Kroah-Hartman
On Wed, 21 Jun 2023 15:19:58 -0700 Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
> > IMPORTANT NOTE FOR STABLE BACKPORTS:
> > This patch will apply cleanly to v6.3. However, due to the change of
> > filemap_get_folio() return values, it will not function correctly. This
> > patch must be modified for stable backports.
>
> This patch I sent previously can be used for the 6.3 backport:
>
> https://lore.kernel.org/lkml/b5bd2b39-7e1e-148f-7462-9565773f6d41@oracle.com/T/#me37b56ca89368dc8dda2a33d39f681337788d13c
Are we suggesting that this be backported? If so, I'll add the cc:stable.
Because -stable maintainers have been asked not to backport MM patches to
which we didn't add the cc:stable.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] hugetlb: revert use of page_cache_next_miss()
2023-06-21 22:39 ` Andrew Morton
@ 2023-06-21 22:46 ` Mike Kravetz
2023-06-21 22:52 ` Andrew Morton
0 siblings, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2023-06-21 22:46 UTC (permalink / raw)
To: Andrew Morton
Cc: Sidhartha Kumar, linux-kernel, linux-mm, linux-fsdevel,
Matthew Wilcox, Ackerley Tng, Muchun Song, Vishal Annapurve,
Erdem Aktas, Greg Kroah-Hartman
On 06/21/23 15:39, Andrew Morton wrote:
> On Wed, 21 Jun 2023 15:19:58 -0700 Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
>
> > > IMPORTANT NOTE FOR STABLE BACKPORTS:
> > > This patch will apply cleanly to v6.3. However, due to the change of
> > > filemap_get_folio() return values, it will not function correctly. This
> > > patch must be modified for stable backports.
> >
> > This patch I sent previously can be used for the 6.3 backport:
> >
> > https://lore.kernel.org/lkml/b5bd2b39-7e1e-148f-7462-9565773f6d41@oracle.com/T/#me37b56ca89368dc8dda2a33d39f681337788d13c
>
> Are we suggesting that this be backported? If so, I'll add the cc:stable.
>
> Because -stable maintainers have been asked not to backport MM patches to
> which we didn't add the cc:stable.
Yes, we need to get a fix into 6.3 as well.
The 'issue' with a backport is noted in the IMPORTANT NOTE above.
My concern is that adding cc:stable will have it automatically picked up
and this would make things worse than they are in 6.3.
My 'plan' was to not add the cc:stable, but manually create a patch for
6.3 once this goes upstream. Honestly, I am not sure what is the best
way to deal with this. I could also try to catch the email about the
automatic backport and say 'no, use this new patch instead'.
--
Mike Kravetz
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] hugetlb: revert use of page_cache_next_miss()
2023-06-21 22:46 ` Mike Kravetz
@ 2023-06-21 22:52 ` Andrew Morton
2023-06-21 23:02 ` Mike Kravetz
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2023-06-21 22:52 UTC (permalink / raw)
To: Mike Kravetz
Cc: Sidhartha Kumar, linux-kernel, linux-mm, linux-fsdevel,
Matthew Wilcox, Ackerley Tng, Muchun Song, Vishal Annapurve,
Erdem Aktas, Greg Kroah-Hartman
On Wed, 21 Jun 2023 15:46:57 -0700 Mike Kravetz <mike.kravetz@oracle.com> wrote:
> On 06/21/23 15:39, Andrew Morton wrote:
> > On Wed, 21 Jun 2023 15:19:58 -0700 Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
> >
> > > > IMPORTANT NOTE FOR STABLE BACKPORTS:
> > > > This patch will apply cleanly to v6.3. However, due to the change of
> > > > filemap_get_folio() return values, it will not function correctly. This
> > > > patch must be modified for stable backports.
> > >
> > > This patch I sent previously can be used for the 6.3 backport:
> > >
> > > https://lore.kernel.org/lkml/b5bd2b39-7e1e-148f-7462-9565773f6d41@oracle.com/T/#me37b56ca89368dc8dda2a33d39f681337788d13c
> >
> > Are we suggesting that this be backported? If so, I'll add the cc:stable.
> >
> > Because -stable maintainers have been asked not to backport MM patches to
> > which we didn't add the cc:stable.
>
> Yes, we need to get a fix into 6.3 as well.
>
> The 'issue' with a backport is noted in the IMPORTANT NOTE above.
>
> My concern is that adding cc:stable will have it automatically picked up
> and this would make things worse than they are in 6.3.
>
> My 'plan' was to not add the cc:stable, but manually create a patch for
> 6.3 once this goes upstream. Honestly, I am not sure what is the best
> way to deal with this. I could also try to catch the email about the
> automatic backport and say 'no, use this new patch instead'.
OK, how about I leave it without cc:stable, so you can send the 6.3
version at a time of your choosing?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] hugetlb: revert use of page_cache_next_miss()
2023-06-21 22:52 ` Andrew Morton
@ 2023-06-21 23:02 ` Mike Kravetz
0 siblings, 0 replies; 11+ messages in thread
From: Mike Kravetz @ 2023-06-21 23:02 UTC (permalink / raw)
To: Andrew Morton
Cc: Sidhartha Kumar, linux-kernel, linux-mm, linux-fsdevel,
Matthew Wilcox, Ackerley Tng, Muchun Song, Vishal Annapurve,
Erdem Aktas, Greg Kroah-Hartman
On 06/21/23 15:52, Andrew Morton wrote:
> On Wed, 21 Jun 2023 15:46:57 -0700 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> > On 06/21/23 15:39, Andrew Morton wrote:
> > > On Wed, 21 Jun 2023 15:19:58 -0700 Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
> > >
> > > > > IMPORTANT NOTE FOR STABLE BACKPORTS:
> > > > > This patch will apply cleanly to v6.3. However, due to the change of
> > > > > filemap_get_folio() return values, it will not function correctly. This
> > > > > patch must be modified for stable backports.
> > > >
> > > > This patch I sent previously can be used for the 6.3 backport:
> > > >
> > > > https://lore.kernel.org/lkml/b5bd2b39-7e1e-148f-7462-9565773f6d41@oracle.com/T/#me37b56ca89368dc8dda2a33d39f681337788d13c
> > >
> > > Are we suggesting that this be backported? If so, I'll add the cc:stable.
> > >
> > > Because -stable maintainers have been asked not to backport MM patches to
> > > which we didn't add the cc:stable.
> >
> > Yes, we need to get a fix into 6.3 as well.
> >
> > The 'issue' with a backport is noted in the IMPORTANT NOTE above.
> >
> > My concern is that adding cc:stable will have it automatically picked up
> > and this would make things worse than they are in 6.3.
> >
> > My 'plan' was to not add the cc:stable, but manually create a patch for
> > 6.3 once this goes upstream. Honestly, I am not sure what is the best
> > way to deal with this. I could also try to catch the email about the
> > automatic backport and say 'no, use this new patch instead'.
>
> OK, how about I leave it without cc:stable, so you can send the 6.3
> version at a time of your choosing?
Perfect
--
Mike Kravetz
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one"
2023-06-21 21:24 [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one" Mike Kravetz
2023-06-21 21:24 ` [PATCH 2/2] hugetlb: revert use of page_cache_next_miss() Mike Kravetz
@ 2023-06-21 22:15 ` Sidhartha Kumar
2023-06-21 22:18 ` Andrew Morton
2 siblings, 0 replies; 11+ messages in thread
From: Sidhartha Kumar @ 2023-06-21 22:15 UTC (permalink / raw)
To: Mike Kravetz, linux-kernel, linux-mm, linux-fsdevel
Cc: Matthew Wilcox, Ackerley Tng, Muchun Song, Vishal Annapurve,
Erdem Aktas, Greg Kroah-Hartman, Andrew Morton, kernel test robot
On 6/21/23 2:24 PM, Mike Kravetz wrote:
> This reverts commit 9425c591e06a9ab27a145ba655fb50532cf0bcc9
>
> The reverted commit fixed up routines primarily used by readahead code
> such that they could also be used by hugetlb. Unfortunately, this
> caused a performance regression as pointed out by the Closes: tag.
>
> The hugetlb code which uses page_cache_next_miss will be addressed in
> a subsequent patch.
>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202306211346.1e9ff03e-oliver.sang@intel.com
> Fixes: 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
> mm/filemap.c | 26 ++++++++++----------------
> 1 file changed, 10 insertions(+), 16 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 3b73101f9f86..9e44a49bbd74 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1728,9 +1728,7 @@ bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
> *
> * Return: The index of the gap if found, otherwise an index outside the
> * range specified (in which case 'return - index >= max_scan' will be true).
> - * In the rare case of index wrap-around, 0 will be returned. 0 will also
> - * be returned if index == 0 and there is a gap at the index. We can not
> - * wrap-around if passed index == 0.
> + * In the rare case of index wrap-around, 0 will be returned.
> */
> pgoff_t page_cache_next_miss(struct address_space *mapping,
> pgoff_t index, unsigned long max_scan)
> @@ -1740,13 +1738,12 @@ pgoff_t page_cache_next_miss(struct address_space *mapping,
> while (max_scan--) {
> void *entry = xas_next(&xas);
> if (!entry || xa_is_value(entry))
> - return xas.xa_index;
> - if (xas.xa_index == 0 && index != 0)
> - return xas.xa_index;
> + break;
> + if (xas.xa_index == 0)
> + break;
> }
>
> - /* No gaps in range and no wrap-around, return index beyond range */
> - return xas.xa_index + 1;
> + return xas.xa_index;
> }
> EXPORT_SYMBOL(page_cache_next_miss);
>
> @@ -1767,9 +1764,7 @@ EXPORT_SYMBOL(page_cache_next_miss);
> *
> * Return: The index of the gap if found, otherwise an index outside the
> * range specified (in which case 'index - return >= max_scan' will be true).
> - * In the rare case of wrap-around, ULONG_MAX will be returned. ULONG_MAX
> - * will also be returned if index == ULONG_MAX and there is a gap at the
> - * index. We can not wrap-around if passed index == ULONG_MAX.
> + * In the rare case of wrap-around, ULONG_MAX will be returned.
> */
> pgoff_t page_cache_prev_miss(struct address_space *mapping,
> pgoff_t index, unsigned long max_scan)
> @@ -1779,13 +1774,12 @@ pgoff_t page_cache_prev_miss(struct address_space *mapping,
> while (max_scan--) {
> void *entry = xas_prev(&xas);
> if (!entry || xa_is_value(entry))
> - return xas.xa_index;
> - if (xas.xa_index == ULONG_MAX && index != ULONG_MAX)
> - return xas.xa_index;
> + break;
> + if (xas.xa_index == ULONG_MAX)
> + break;
> }
>
> - /* No gaps in range and no wrap-around, return index beyond range */
> - return xas.xa_index - 1;
> + return xas.xa_index;
> }
> EXPORT_SYMBOL(page_cache_prev_miss);
>
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one"
2023-06-21 21:24 [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one" Mike Kravetz
2023-06-21 21:24 ` [PATCH 2/2] hugetlb: revert use of page_cache_next_miss() Mike Kravetz
2023-06-21 22:15 ` [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one" Sidhartha Kumar
@ 2023-06-21 22:18 ` Andrew Morton
2023-06-21 23:01 ` Mike Kravetz
2 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2023-06-21 22:18 UTC (permalink / raw)
To: Mike Kravetz
Cc: linux-kernel, linux-mm, linux-fsdevel, Matthew Wilcox,
Ackerley Tng, Sidhartha Kumar, Muchun Song, Vishal Annapurve,
Erdem Aktas, Greg Kroah-Hartman, kernel test robot
On Wed, 21 Jun 2023 14:24:02 -0700 Mike Kravetz <mike.kravetz@oracle.com> wrote:
> This reverts commit 9425c591e06a9ab27a145ba655fb50532cf0bcc9
>
> The reverted commit fixed up routines primarily used by readahead code
> such that they could also be used by hugetlb. Unfortunately, this
> caused a performance regression as pointed out by the Closes: tag.
>
> The hugetlb code which uses page_cache_next_miss will be addressed in
> a subsequent patch.
Often these throughput changes are caused by rather random
alignment/layout changes and the code change itself was innocent.
Do we have an explanation for this regression, or was it a surprise?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one"
2023-06-21 22:18 ` Andrew Morton
@ 2023-06-21 23:01 ` Mike Kravetz
0 siblings, 0 replies; 11+ messages in thread
From: Mike Kravetz @ 2023-06-21 23:01 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, linux-fsdevel, Matthew Wilcox,
Ackerley Tng, Sidhartha Kumar, Muchun Song, Vishal Annapurve,
Erdem Aktas, Greg Kroah-Hartman, kernel test robot
On 06/21/23 15:18, Andrew Morton wrote:
> On Wed, 21 Jun 2023 14:24:02 -0700 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> > This reverts commit 9425c591e06a9ab27a145ba655fb50532cf0bcc9
> >
> > The reverted commit fixed up routines primarily used by readahead code
> > such that they could also be used by hugetlb. Unfortunately, this
> > caused a performance regression as pointed out by the Closes: tag.
> >
> > The hugetlb code which uses page_cache_next_miss will be addressed in
> > a subsequent patch.
>
> Often these throughput changes are caused by rather random
> alignment/layout changes and the code change itself was innocent.
>
> Do we have an explanation for this regression, or was it a surprise?
It was not a total surprise. As mentioned, the primary user of this
interface is the readahead code. The code in question is in
ondemand_readahead.
rcu_read_lock();
start = page_cache_next_miss(ractl->mapping, index + 1,
max_pages);
rcu_read_unlock();
if (!start || start - index > max_pages)
return;
With the reverted changes, we will take that quick return when there are
no gaps in the range. Previously we did not.
I am of the belief that page_cache_next_miss behavior did not match the
function description. Matthew suggested page_cache_next_miss use in hugetlb
code and I can only guess that he also though it behaved as documented.
I do not know the readahead code well enough to know exactly what is
expected. readahead certainly performs worse with my proposed changes.
Since we can easily 'fix' hugetlb code in another way, let's do that and
leave the readahead code alone unless someone more knowledgable can
provide insight.
--
Mike Kravetz
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one"
@ 2023-07-24 12:15 Thomas Backlund
0 siblings, 0 replies; 11+ messages in thread
From: Thomas Backlund @ 2023-07-24 12:15 UTC (permalink / raw)
To: Mike Kravetz, linux-kernel, linux-mm, linux-fsdevel
Cc: Matthew Wilcox, Ackerley Tng, Sidhartha Kumar, Muchun Song,
Vishal Annapurve, Erdem Aktas, Greg Kroah-Hartman, Andrew Morton,
kernel test robot
Den 2023-06-22 kl. 00:24, skrev Mike Kravetz:
> This reverts commit 9425c591e06a9ab27a145ba655fb50532cf0bcc9
>
> The reverted commit fixed up routines primarily used by readahead code
> such that they could also be used by hugetlb. Unfortunately, this
> caused a performance regression as pointed out by the Closes: tag.
>
> The hugetlb code which uses page_cache_next_miss will be addressed in
> a subsequent patch.
>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202306211346.1e9ff03e-oliver.sang@intel.com
> Fixes: 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Should not this one be submitted to 6.4 stable branch too ?
git describe --contains 9425c591e06a
v6.4-rc7~29^2~1
The other one (hugetlb: revert use of page_cache_next_miss()) of this
patch series landed in 6.4.2
Or am I missing something ?
--
Thomas
> ---
> mm/filemap.c | 26 ++++++++++----------------
> 1 file changed, 10 insertions(+), 16 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 3b73101f9f86..9e44a49bbd74 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1728,9 +1728,7 @@ bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
> *
> * Return: The index of the gap if found, otherwise an index outside the
> * range specified (in which case 'return - index >= max_scan' will be true).
> - * In the rare case of index wrap-around, 0 will be returned. 0 will also
> - * be returned if index == 0 and there is a gap at the index. We can not
> - * wrap-around if passed index == 0.
> + * In the rare case of index wrap-around, 0 will be returned.
> */
> pgoff_t page_cache_next_miss(struct address_space *mapping,
> pgoff_t index, unsigned long max_scan)
> @@ -1740,13 +1738,12 @@ pgoff_t page_cache_next_miss(struct address_space *mapping,
> while (max_scan--) {
> void *entry = xas_next(&xas);
> if (!entry || xa_is_value(entry))
> - return xas.xa_index;
> - if (xas.xa_index == 0 && index != 0)
> - return xas.xa_index;
> + break;
> + if (xas.xa_index == 0)
> + break;
> }
>
> - /* No gaps in range and no wrap-around, return index beyond range */
> - return xas.xa_index + 1;
> + return xas.xa_index;
> }
> EXPORT_SYMBOL(page_cache_next_miss);
>
> @@ -1767,9 +1764,7 @@ EXPORT_SYMBOL(page_cache_next_miss);
> *
> * Return: The index of the gap if found, otherwise an index outside the
> * range specified (in which case 'index - return >= max_scan' will be true).
> - * In the rare case of wrap-around, ULONG_MAX will be returned. ULONG_MAX
> - * will also be returned if index == ULONG_MAX and there is a gap at the
> - * index. We can not wrap-around if passed index == ULONG_MAX.
> + * In the rare case of wrap-around, ULONG_MAX will be returned.
> */
> pgoff_t page_cache_prev_miss(struct address_space *mapping,
> pgoff_t index, unsigned long max_scan)
> @@ -1779,13 +1774,12 @@ pgoff_t page_cache_prev_miss(struct address_space *mapping,
> while (max_scan--) {
> void *entry = xas_prev(&xas);
> if (!entry || xa_is_value(entry))
> - return xas.xa_index;
> - if (xas.xa_index == ULONG_MAX && index != ULONG_MAX)
> - return xas.xa_index;
> + break;
> + if (xas.xa_index == ULONG_MAX)
> + break;
> }
>
> - /* No gaps in range and no wrap-around, return index beyond range */
> - return xas.xa_index - 1;
> + return xas.xa_index;
> }
> EXPORT_SYMBOL(page_cache_prev_miss);
>
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-07-24 12:15 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-21 21:24 [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one" Mike Kravetz
2023-06-21 21:24 ` [PATCH 2/2] hugetlb: revert use of page_cache_next_miss() Mike Kravetz
2023-06-21 22:19 ` Sidhartha Kumar
2023-06-21 22:39 ` Andrew Morton
2023-06-21 22:46 ` Mike Kravetz
2023-06-21 22:52 ` Andrew Morton
2023-06-21 23:02 ` Mike Kravetz
2023-06-21 22:15 ` [PATCH 1/2] Revert "page cache: fix page_cache_next/prev_miss off by one" Sidhartha Kumar
2023-06-21 22:18 ` Andrew Morton
2023-06-21 23:01 ` Mike Kravetz
-- strict thread matches above, loose matches on Subject: below --
2023-07-24 12:15 Thomas Backlund
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).