From: "Yin, Fengwei" <fengwei.yin@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Yujie Liu <yujie.liu@intel.com>,
Oliver Sang <oliver.sang@intel.com>, <oe-lkp@lists.linux.dev>,
<lkp@intel.com>, <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>,
Guo Xuenan <guoxuenan@huawei.com>,
<linux-fsdevel@vger.kernel.org>, <ying.huang@intel.com>,
<feng.tang@intel.com>
Subject: Re: [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression
Date: Sun, 10 Mar 2024 14:40:00 +0800
Message-ID: <561465df-1370-4519-abe3-3998bd78233f@intel.com>
In-Reply-To: <20240307092308.u54fjngivmx23ty3@quack3>
On 3/7/2024 5:23 PM, Jan Kara wrote:
> Thanks for testing! This is an interesting result and certainly unexpected
> for me. The readahead code allocates naturally aligned pages so based on
> the distribution of allocations it seems that before commit ab4443fe3ca6
> readahead window was at least 32 pages (128KB) aligned and so we allocated
> order 5 pages. After the commit, the readahead window somehow ended up only
> aligned to 20 modulo 32. To follow natural alignment and fill 128KB
> readahead window we allocated order 2 page (got us to offset 24 modulo 32),
> then order 3 page (got us to offset 0 modulo 32), order 4 page (larger
> would not fit in 128KB readahead window now), and order 2 page to finish
> filling the readahead window.
>
> Now I'm not 100% sure why the readahead window alignment changed with
> different rounding when placing readahead mark - probably that's some
> artifact when readahead window is tiny in the beginning before we scale it
> up (I'll verify by tracing whether everything ends up looking correctly
> with the current code). So I don't expect this is a problem in ab4443fe3ca6
> as such but it exposes the issue that readahead page insertion code should
> perhaps strive to achieve better readahead window alignment with logical
> file offset even at the cost of occasionally performing somewhat shorter
> readahead. I'll look into this once I dig out of the huge heap of email
> after vacation...
Hi Jan,
I was also curious about this behavior and added some logging to
understand it. Here are the differences with and without ab4443fe3ca6:
- with ab4443fe3ca6:
  You are right about the folio order, as the readahead window is 0x20
  pages. The folio order sequence is order 2, order 4, order 3, order 2.
  The difference is that only the first order 2 folio is ever marked as
  readahead, so the max order is boosted to 4 in page_cache_ra_order().
  The code path always hits
	if (index == expected || index == (ra->start + ra->size))
  in ondemand_readahead().

  If I just change round_down() to round_up() in ra_alloc_folio(),
  the predominant folio order is restored to 5.
- without ab4443fe3ca6:
  At the beginning, the folio order sequence is the same: 2, 4, 3, 2.
  But besides the first order 2 folio, the order 4 folio is also marked
  as readahead, so the order can be boosted to 5.
  Also, not only the path
	if (index == expected || index == (ra->start + ra->size))
  is hit, but also
	if (folio) {
  can be hit (I didn't check the other paths, as this test is a
  sequential read).

  The order goes back and forth between 5 and 2, 4, 3, 2 for a while
  and then stabilizes at 5.
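
To make the two cases above easier to follow, here is a small user-space
sketch (not kernel code) of how the rounding of the readahead mark decides
which folio gets the readahead flag, and how the order of the marked folio
limits the next boost. The helper names (rnd_down, rnd_up, folio_marked,
boost_order) are made up for illustration. The boost rule (add 2, cap at a
maximum order, then shrink until the folio fits the window) is written from
my recollection of page_cache_ra_order() around the time of this thread and
should be treated as an assumption, as should the mark value and the maximum
order used in the usage note.

```c
#include <assert.h>
#include <stdbool.h>

/* Power-of-two rounding; align must be a power of two. */
static unsigned long rnd_down(unsigned long x, unsigned long align)
{
	return x & ~(align - 1);
}

static unsigned long rnd_up(unsigned long x, unsigned long align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Would a folio of the given order starting at page `index` be marked
 * for readahead, if the mark is first rounded to the folio size?
 */
static bool folio_marked(unsigned long index, unsigned int order,
			 unsigned long mark, bool use_round_up)
{
	unsigned long align;

	assert(order < 8 * sizeof(unsigned long));
	align = 1UL << order;
	return index == (use_round_up ? rnd_up(mark, align)
				      : rnd_down(mark, align));
}

/*
 * Assumed boost rule: raise the order by 2, cap it at max_order, and
 * make sure a single folio is not larger than the readahead window.
 */
static unsigned int boost_order(unsigned int order, unsigned int max_order,
				unsigned long ra_size)
{
	if (order < max_order) {
		order += 2;
		if (order > max_order)
			order = max_order;
		while ((1UL << order) > ra_size)
			order--;
	}
	return order;
}
```

With a hypothetical mark at page 20 and an order 4 folio at page 32,
folio_marked(32, 4, 20, false) is false while folio_marked(32, 4, 20, true)
is true, so only the round_up() variant flags the order 4 folio. With a
32-page window and an assumed maximum order of 9, boost_order(2, 9, 32)
gives 4 and boost_order(4, 9, 32) gives 5, which matches the orders seen in
the logs.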
I don't fully understand the whole picture yet and will dig deeper. The
above is just what the logs showed.
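
For reference, the arithmetic in Jan's explanation (a 32-page window that
starts at offset 20 modulo 32) can be reproduced with the following
user-space sketch. It is not the kernel implementation: fill_window() is a
hypothetical helper that greedily picks the largest naturally aligned folio
that still fits in the window.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill a window of nr_pages pages starting at page index `start` with
 * naturally aligned folios of at most max_order, recording the order
 * chosen for each folio. Returns the number of folios recorded (at
 * most `cap`).
 */
static size_t fill_window(unsigned long start, unsigned long nr_pages,
			  unsigned int max_order,
			  unsigned int *orders, size_t cap)
{
	unsigned long index = start;
	unsigned long end = start + nr_pages;	/* one past the last page */
	size_t n = 0;

	assert(orders != NULL);

	while (index < end && n < cap) {
		unsigned int order = max_order;

		/* Natural alignment: index must be a multiple of 1 << order. */
		while (order > 0 && (index & ((1UL << order) - 1)))
			order--;
		/* The folio must not extend past the end of the window. */
		while (order > 0 && index + (1UL << order) > end)
			order--;

		orders[n++] = order;
		index += 1UL << order;
	}
	return n;
}
```

Calling fill_window(20, 32, 5, orders, 8) records orders 2, 3, 4, 2 (pages
20, 24, 32 and 48), while a window that starts on a 32-page boundary, for
example fill_window(32, 32, 5, orders, 8), is covered by a single order 5
folio.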
Hi Matthew,
I noticed one thing: while the readahead folio order is being ramped up,
readahead tries several times to allocate folios and add them to the
page cache, but fails because a folio already in the page cache covers
the requested index. Once the folio order is correct, this no longer
happens. I suppose this is expected.
Regards
Yin, Fengwei