From: Christoph Hellwig <hch@lst.de>
To: Jens Axboe <axboe@kernel.dk>,
	Matthew Wilcox <willy@infradead.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>, Gao Xiang <xiang@kernel.org>,
	Chao Yu <chao@kernel.org>,
	linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-mm@kvack.org
Subject: [PATCH 1/5] mm: add PSI accounting around ->read_folio and ->readahead calls
Date: Thu, 15 Sep 2022 10:41:56 +0100
Message-ID: <20220915094200.139713-2-hch@lst.de>
In-Reply-To: <20220915094200.139713-1-hch@lst.de>

PSI tries to account for the cost of bringing back in pages discarded by
the MM LRU management.  Currently the prime place for that is hooked into
the bio submission path, which is a rather bad place:

 - it does not actually account I/O for non-block file systems, of which
   we have many
 - it adds overhead and a layering violation to the block layer

Add the accounting to the two places in the core MM code that read
pages into an address space by calling into ->read_folio and ->readahead,
so that entire file system operations are covered.  This broadens the
coverage and allows removing the accounting in the block layer going
forward.

As psi_memstall_enter can deal with nested calls, this will not lead to
double accounting even while the bio annotations are still present.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
---
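
Note on nesting: the sketch below is a simplified userspace model of why
an outer annotation around ->read_folio and an inner one in the bio path
should only produce a single stall period.  It is not the kernel
implementation.  The names model_task, model_memstall_enter,
model_memstall_leave, model_submit_bio, model_read_folio and model_demo
are hypothetical and exist only for this illustration; the one thing
assumed about the real helpers is that the flags word records whether
the task was already in a memstall section on entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task state that PSI tracks. */
struct model_task {
	bool in_memstall;	/* task is currently inside a memstall section */
	unsigned int stalls;	/* number of stall periods actually started */
};

static struct model_task current_task;

/* Model of an enter call: remember whether we were already stalled. */
static void model_memstall_enter(unsigned long *flags)
{
	*flags = current_task.in_memstall;
	if (*flags)
		return;		/* nested call: the outer section accounts */
	current_task.in_memstall = true;
	current_task.stalls++;
}

/* Model of a leave call: only the outermost caller ends the stall. */
static void model_memstall_leave(unsigned long *flags)
{
	if (*flags)
		return;
	current_task.in_memstall = false;
}

/* Stand-in for the bio path, which still carries its own annotation. */
static void model_submit_bio(bool workingset)
{
	unsigned long pflags = 0;

	if (workingset)
		model_memstall_enter(&pflags);
	/* ... the I/O would be submitted here ... */
	if (workingset)
		model_memstall_leave(&pflags);
}

/* Stand-in for the MM-level wrapper around the ->read_folio call. */
static void model_read_folio(bool workingset)
{
	unsigned long pflags = 0;

	if (workingset)
		model_memstall_enter(&pflags);
	model_submit_bio(workingset);	/* inner annotation nests here */
	if (workingset)
		model_memstall_leave(&pflags);
}

/* Runs one workingset read and one ordinary read; returns stall count. */
unsigned int model_demo(void)
{
	current_task.in_memstall = false;
	current_task.stalls = 0;

	model_read_folio(true);		/* refault: two nested annotations */
	assert(!current_task.in_memstall);
	model_read_folio(false);	/* ordinary read: no accounting */

	return current_task.stalls;
}
```

In this model the workingset read starts exactly one stall period even
though both layers annotate it, and the non-workingset read starts none.
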
 include/linux/pagemap.h |  2 ++
 mm/filemap.c            |  7 +++++++
 mm/readahead.c          | 22 ++++++++++++++++++----
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 0178b2040ea38..201dc7281640b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1173,6 +1173,8 @@ struct readahead_control {
 	pgoff_t _index;
 	unsigned int _nr_pages;
 	unsigned int _batch_count;
+	bool _workingset;
+	unsigned long _pflags;
 };

 #define DEFINE_READAHEAD(ractl, f, r, m, i)				\
diff --git a/mm/filemap.c b/mm/filemap.c
index 15800334147b3..c943d1b90cc26 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2382,6 +2382,8 @@ static void filemap_get_read_batch(struct address_space *mapping,
 static int filemap_read_folio(struct file *file, filler_t filler,
 		struct folio *folio)
 {
+	bool workingset = folio_test_workingset(folio);
+	unsigned long pflags;
 	int error;

 	/*
@@ -2390,8 +2392,13 @@ static int filemap_read_folio(struct file *file, filler_t filler,
 	 * fails.
 	 */
 	folio_clear_error(folio);
+
 	/* Start the actual read. The read will unlock the page. */
+	if (unlikely(workingset))
+		psi_memstall_enter(&pflags);
 	error = filler(file, folio);
+	if (unlikely(workingset))
+		psi_memstall_leave(&pflags);
 	if (error)
 		return error;

diff --git a/mm/readahead.c b/mm/readahead.c
index fdcd28cbd92de..b10f0cf81d804 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -122,6 +122,7 @@
 #include <linux/task_io_accounting_ops.h>
 #include <linux/pagevec.h>
 #include <linux/pagemap.h>
+#include <linux/psi.h>
 #include <linux/syscalls.h>
 #include <linux/file.h>
 #include <linux/mm_inline.h>
@@ -152,6 +153,8 @@ static void read_pages(struct readahead_control *rac)
 	if (!readahead_count(rac))
 		return;

+	if (unlikely(rac->_workingset))
+		psi_memstall_enter(&rac->_pflags);
 	blk_start_plug(&plug);

 	if (aops->readahead) {
@@ -179,6 +182,9 @@ static void read_pages(struct readahead_control *rac)
 	}

 	blk_finish_plug(&plug);
+	if (unlikely(rac->_workingset))
+		psi_memstall_leave(&rac->_pflags);
+	rac->_workingset = false;

 	BUG_ON(readahead_count(rac));
 }
@@ -252,6 +258,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 		}
 		if (i == nr_to_read - lookahead_size)
 			folio_set_readahead(folio);
+		ractl->_workingset |= folio_test_workingset(folio);
 		ractl->_nr_pages++;
 	}

@@ -480,11 +487,14 @@ static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index,
 	if (index == mark)
 		folio_set_readahead(folio);
 	err = filemap_add_folio(ractl->mapping, folio, index, gfp);
-	if (err)
+	if (err) {
 		folio_put(folio);
-	else
-		ractl->_nr_pages += 1UL << order;
-	return err;
+		return err;
+	}
+
+	ractl->_nr_pages += 1UL << order;
+	ractl->_workingset |= folio_test_workingset(folio);
+	return 0;
 }

 void page_cache_ra_order(struct readahead_control *ractl,
@@ -826,6 +836,10 @@ void readahead_expand(struct readahead_control *ractl,
 			put_page(page);
 			return;
 		}
+		if (unlikely(PageWorkingset(page)) && !ractl->_workingset) {
+			ractl->_workingset = true;
+			psi_memstall_enter(&ractl->_pflags);
+		}
 		ractl->_nr_pages++;
 		if (ra) {
 			ra->size++;
--
2.30.2