From: Christoph Hellwig <hch@lst.de>
To: Trond Myklebust <trondmy@kernel.org>, Anna Schumaker <anna@kernel.org>
Cc: linux-nfs@vger.kernel.org
Subject: [PATCH 7/7] nfs: don't reuse partially completed requests in nfs_lock_and_join_requests
Date: Mon, 1 Jul 2024 07:26:54 +0200
Message-ID: <20240701052707.1246254-8-hch@lst.de>
In-Reply-To: <20240701052707.1246254-1-hch@lst.de>

When NFS requests are split into sub-requests, nfs_inode_remove_request
calls nfs_page_group_sync_on_bit to set PG_REMOVE on the sub-request and
only completes the head request once PG_REMOVE is set on all requests.
This means that when nfs_lock_and_join_requests sees a PG_REMOVE bit, I/O
on the request is in progress and has partially completed. If such a
request is returned to nfs_try_to_update_request, it could be extended
with the newly dirtied region and I/O for the combined range will be
re-scheduled, leading to extra I/O.
Change the logic to instead restart the search for a request when any
PG_REMOVE bit is set, as the completion handler will remove the request
as soon as it can take the page group lock. This not only avoids
extending the I/O but also does the right thing for the callers that
want to cancel or flush the request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/nfs/write.c | 49 ++++++++++++++++++++-----------------------------
1 file changed, 20 insertions(+), 29 deletions(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 2c089444303982..4dffdc5aadb2e2 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -144,31 +144,6 @@ static void nfs_io_completion_put(struct nfs_io_completion *ioc)
kref_put(&ioc->refcount, nfs_io_completion_release);
}
-static void
-nfs_page_set_inode_ref(struct nfs_page *req, struct inode *inode)
-{
- if (!test_and_set_bit(PG_INODE_REF, &req->wb_flags)) {
- kref_get(&req->wb_kref);
- atomic_long_inc(&NFS_I(inode)->nrequests);
- }
-}
-
-static int
-nfs_cancel_remove_inode(struct nfs_page *req, struct inode *inode)
-{
- int ret;
-
- if (!test_bit(PG_REMOVE, &req->wb_flags))
- return 0;
- ret = nfs_page_group_lock(req);
- if (ret)
- return ret;
- if (test_and_clear_bit(PG_REMOVE, &req->wb_flags))
- nfs_page_set_inode_ref(req, inode);
- nfs_page_group_unlock(req);
- return 0;
-}
-
/**
* nfs_folio_find_head_request - find head request associated with a folio
* @folio: pointer to folio
@@ -564,6 +539,7 @@ static struct nfs_page *nfs_lock_and_join_requests(struct folio *folio)
struct inode *inode = folio->mapping->host;
struct nfs_page *head, *subreq;
struct nfs_commit_info cinfo;
+ bool removed;
int ret;
/*
@@ -588,18 +564,18 @@ static struct nfs_page *nfs_lock_and_join_requests(struct folio *folio)
goto retry;
}
- ret = nfs_cancel_remove_inode(head, inode);
- if (ret < 0)
- goto out_unlock;
-
ret = nfs_page_group_lock(head);
if (ret < 0)
goto out_unlock;
+ removed = test_bit(PG_REMOVE, &head->wb_flags);
+
/* lock each request in the page group */
for (subreq = head->wb_this_page;
subreq != head;
subreq = subreq->wb_this_page) {
+ if (test_bit(PG_REMOVE, &subreq->wb_flags))
+ removed = true;
ret = nfs_page_group_lock_subreq(head, subreq);
if (ret < 0)
goto out_unlock;
@@ -607,6 +583,21 @@ static struct nfs_page *nfs_lock_and_join_requests(struct folio *folio)
nfs_page_group_unlock(head);
+ /*
+ * If PG_REMOVE is set on any request, I/O on that request has
+ * completed, but some requests were still under I/O at the time
+ * we locked the head request.
+ *
+ * In that case the above wait for all requests means that all I/O
+ * has now finished, and we can restart from a clean slate. Let the
+ * old requests go away and start from scratch instead.
+ */
+ if (removed) {
+ nfs_unroll_locks(head, head);
+ nfs_unlock_and_release_request(head);
+ goto retry;
+ }
+
nfs_init_cinfo_from_inode(&cinfo, inode);
nfs_join_page_group(head, &cinfo, inode);
return head;
--
2.43.0
Thread overview: 19+ messages
2024-07-01 5:26 NFS buffered write cleanup Christoph Hellwig
2024-07-01 5:26 ` [PATCH 1/7] nfs: remove dead code for the old swap over NFS implementation Christoph Hellwig
2024-07-02 7:37 ` Sagi Grimberg
2024-07-01 5:26 ` [PATCH 2/7] nfs: remove nfs_folio_private_request Christoph Hellwig
2024-07-02 7:38 ` Sagi Grimberg
2024-07-01 5:26 ` [PATCH 3/7] nfs: simplify nfs_folio_find_and_lock_request Christoph Hellwig
2024-07-02 7:54 ` Sagi Grimberg
2024-07-03 4:19 ` Christoph Hellwig
2024-07-01 5:26 ` [PATCH 4/7] nfs: fold nfs_folio_find_and_lock_request into nfs_lock_and_join_requests Christoph Hellwig
2024-07-02 7:57 ` Sagi Grimberg
2024-07-03 4:20 ` Christoph Hellwig
2024-07-01 5:26 ` [PATCH 5/7] nfs: fold nfs_page_group_lock_subrequests " Christoph Hellwig
2024-07-02 7:59 ` Sagi Grimberg
2024-07-01 5:26 ` [PATCH 6/7] nfs: move nfs_wait_on_request to write.c Christoph Hellwig
2024-07-02 7:59 ` Sagi Grimberg
2024-07-01 5:26 ` Christoph Hellwig [this message]
2024-07-02 8:07 ` [PATCH 7/7] nfs: don't reuse partially completed requests in nfs_lock_and_join_requests Sagi Grimberg
2024-07-03 4:25 ` Christoph Hellwig
2024-07-05 5:35 ` NFS buffered write cleanup Christoph Hellwig