From: Kent Overstreet <kent.overstreet@gmail.com>
To: Matthew Wilcox <willy@infradead.org>, Yu Kuai <yukuai3@huawei.com>
Cc: akpm@linux-foundation.org, axboe@kernel.dk,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, yi.zhang@huawei.com
Subject: Re: [PATCH -next] mm/filemap: fix that first page is not mark accessed in filemap_read()
Date: Fri, 10 Jun 2022 13:47:02 -0400
Message-ID: <c5f97e2f-8a48-2906-91a2-1d84629b3641@gmail.com>
In-Reply-To: <YqNW8cYn9gM7Txg6@casper.infradead.org>

On 6/10/22 10:36, Matthew Wilcox wrote:
> On Fri, Jun 10, 2022 at 03:34:11PM +0100, Matthew Wilcox wrote:
>> On Mon, Jun 06, 2022 at 09:10:03AM +0800, Yu Kuai wrote:
>>> On 2022/06/03 2:30, Matthew Wilcox wrote:
>>>> On Thu, Jun 02, 2022 at 04:21:29PM +0800, Yu Kuai wrote:
>>>>> In filemap_read(), 'ra->prev_pos' is set to 'iocb->ki_pos + copied',
>>>>> while it should be 'iocb->ki_pos'.
>>>>
>>>> Can you walk me through your reasoning which leads you to believe that
>>>> it should be ki_pos instead of ki_pos + copied? As I understand it,
>>>> prev_pos is the end of the previous read, not the beginning of the
>>>> previous read.
>>>
>>> Hi, Matthew
>>>
>>> The main reason is the following check in filemap_read():
>>>
>>> 	if (iocb->ki_pos >> PAGE_SHIFT !=	-> current page
>>> 	    ra->prev_pos >> PAGE_SHIFT)		-> previous page
>>> 		folio_mark_accessed(fbatch.folios[0]);
>>>
>>> Which means if current page is the same as previous page, don't mark
>>> page accessed. However, prev_pos is set to 'ki_pos + copied' during last
>>> read, which will cause 'prev_pos >> PAGE_SHIFT' to be current page
>>> instead of previous page.
>>>
>>> I was thinking that if prev_pos is set to the beginning of the previous
>>> read, 'prev_pos >> PAGE_SHIFT' will be the previous page as expected.
>>> Setting it to the end of the previous read is ok; however, I think the
>>> calculation of the previous page should then be
>>> '(prev_pos - 1) >> PAGE_SHIFT' instead.
>>
>> OK, I think Kent broke this in 723ef24b9b37 ("mm/filemap/c: break
>> generic_file_buffered_read up into multiple functions"). Before:
>>
>> -	prev_index = ra->prev_pos >> PAGE_SHIFT;
>> -	prev_offset = ra->prev_pos & (PAGE_SIZE-1);
>> ...
>> -	if (prev_index != index || offset != prev_offset)
>> -		mark_page_accessed(page);
>>
>> After:
>>
>> +	if (iocb->ki_pos >> PAGE_SHIFT != ra->prev_pos >> PAGE_SHIFT)
>> +		mark_page_accessed(page);
>>
>> So surely this should have been:
>>
>> +	if (iocb->ki_pos != ra->prev_pos)
>> +		mark_page_accessed(page);
>>
>> Kent, do you recall why you changed it the way you did?
>
> Oh, and if this is the right diagnosis, then this is the fix for the
> current tree:
>
> +++ b/mm/filemap.c
> @@ -2673,8 +2673,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>  		 * When a sequential read accesses a page several times, only
>  		 * mark it as accessed the first time.
>  		 */
> -		if (iocb->ki_pos >> PAGE_SHIFT !=
> -		    ra->prev_pos >> PAGE_SHIFT)
> +		if (iocb->ki_pos != ra->prev_pos)
>  			folio_mark_accessed(fbatch.folios[0]);
>
>  		for (i = 0; i < folio_batch_count(&fbatch); i++) {
>
>
I think this is the fix we want. Yu basically had the right idea and
already had the off-by-one fix, but this version should be clearer.
Patch below.

Yu, can you confirm the fix?
-- >8 --
Subject: [PATCH] filemap: Fix off by one error when marking folios accessed

In filemap_read() we mark pages accessed as we read them - but we don't
want to do so redundantly, if the previous read already did so.

But there was an off by one error: we want to check if the current page
was the same as the last page we read from, but the last page we read
from was (ra->prev_pos - 1) >> PAGE_SHIFT.

Reported-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

diff --git a/mm/filemap.c b/mm/filemap.c
index 9daeaab360..8d5c8043cb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2704,7 +2704,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 		 * mark it as accessed the first time.
 		 */
 		if (iocb->ki_pos >> PAGE_SHIFT !=
-		    ra->prev_pos >> PAGE_SHIFT)
+		    (ra->prev_pos - 1) >> PAGE_SHIFT)
 			folio_mark_accessed(fbatch.folios[0]);
 
 		for (i = 0; i < folio_batch_count(&fbatch); i++) {