From: Guixin Liu <kanie@linux.alibaba.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [RFC PATCH] mm/filemap.c: fix the timing of asignment of prev_pos
Date: Thu, 18 Aug 2022 11:13:24 +0800 [thread overview]
Message-ID: <ba2b9f11-be74-fa32-9ee6-e1794bdc09a0@linux.alibaba.com> (raw)
In-Reply-To: <Yv0IccKJ6Spk/zH4@casper.infradead.org>
在 2022/8/17 23:25, Matthew Wilcox 写道:
> On Wed, Aug 17, 2022 at 09:51:57PM +0800, Guixin Liu wrote:
>> The prev_pos should be assigned before the iocb->ki_pos is incremented,
>> so that the prev_pos is the exact location of the last visit.
>>
>> Fixes: 06c0444290cec ("mm/filemap.c: generic_file_buffered_read() now
>> uses find_get_pages_contig")
>> Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
>>
>> ---
>> Hi guys,
>> When I`m running repetitive 4k read io which has same offset,
>> I find that access to folio_mark_accessed is inevitable in the
>> read process, the reason is that the prev_pos is assigned after the
>> iocb->ki_pos is incremented, so that the prev_pos is always not equal
>> to the position currently visited.
>> Is this a bug that needs fixing?
> I think you've misunderstood the purpose of 'prev_pos'. But this has
> been the source of bugs, so let's go through it in detail.
>
> In general, we want to mark a folio as accessed each time we read from
> it. So if we do this:
>
> read(fd, buf, 1024 * 1024);
>
> we want to mark each folio as having been accessed.
>
> But if we're doing lots of short reads, we don't want to mark a folio as
> being accessed multiple times (if you dive into the implementation,
> you'll see the first time, the 'referenced' flag is set and the second
> time, the folio is moved to the active list, so it matters how often
> we call mark_accessed). IOW:
>
> for (i = 0; i < 1024 * 1024; i++)
> read(fd, buf, 1);
>
> should do the same amount of accessed/referenced/activation as the single
> read above.
>
> So when we store ki_pos in prev_pos, we don't want to know "Where did
> the previous read start?" We want to know "Where did the previous read
> end". That's why when we test it, we check whether prev_pos - 1 is in
> the same folio as the offset we're looking at:
>
> if (!pos_same_folio(iocb->ki_pos, ra->prev_pos - 1,
> fbatch.folios[0]))
> folio_mark_accessed(fbatch.folios[0]);
>
> I'm not super-proud of this code, and accept that it's confusing.
> But I don't think the patch below is right. If you could share
> your actual test and show what's going wrong, I'm interested.
>
> I think what you're saying is that this loop:
>
> for (i = 0; i < 1000; i++)
> pread(fd, buf, 4096, 1024 * 1024);
>
> results in the folio at offset 1MB being marked as accessed more than
> once. If so, then I think that's the algorithm behaving as designed.
> Whether that's desirable is a different question; when I touched this
> code last, I was trying to restore the previous behaviour which was
> inadvertently broken. I'm not taking a position on what the right
> behaviour is for such code.
>
My thanks for your detailed description, I am wrong about this, I test
not on the newest code, My fault.
The 5ccc944dce3d actually solved this problem.
Best regards,
Guixin Liu
>> mm/filemap.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 660490c..68fd987 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -2703,8 +2703,8 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>> copied = copy_folio_to_iter(folio, offset, bytes, iter);
>>
>> already_read += copied;
>> - iocb->ki_pos += copied;
>> ra->prev_pos = iocb->ki_pos;
>> + iocb->ki_pos += copied;
>>
>> if (copied < bytes) {
>> error = -EFAULT;
>> --
>> 1.8.3.1
>>
prev parent reply other threads:[~2022-08-18 3:14 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-17 13:51 [RFC PATCH] mm/filemap.c: fix the timing of asignment of prev_pos Guixin Liu
2022-08-17 15:16 ` Andrew Morton
2022-08-17 15:30 ` Matthew Wilcox
2022-08-17 15:25 ` Matthew Wilcox
2022-08-18 3:13 ` Guixin Liu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ba2b9f11-be74-fa32-9ee6-e1794bdc09a0@linux.alibaba.com \
--to=kanie@linux.alibaba.com \
--cc=akpm@linux-foundation.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).