From: Christian Schoenebeck <qemu_oss@crudebyte.com>
To: qemu-devel@nongnu.org, Greg Kurz <groug@kaod.org>
Subject: Re: [PATCH 2/2] 9pfs: reduce latency of v9fs_reclaim_fd()
Date: Thu, 06 Mar 2025 16:11:58 +0100
Message-ID: <1869760.rOlR1cq0hG@silver>
In-Reply-To: <20250306100756.28e6bea2@bahia>
On Thursday, March 6, 2025 10:07:56 AM CET Greg Kurz wrote:
> On Tue, 4 Mar 2025 16:16:05 +0100
> Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>
> > This function calls v9fs_co_close() and v9fs_co_closedir() in a loop. Each
> > one of the calls adds two thread hops (between main thread and a fs driver
> > background thread). Each thread hop adds latency, which sums up in
> > function's loop to a significant duration.
> >
> > Reduce overall latency by open coding what v9fs_co_close() and
> > v9fs_co_closedir() do, executing those and the loop itself altogether in
> > only one background thread block, hence reducing the total amount of
> > thread hops to only two.
> >
>
> I'm afraid it is a bit trickier but I might have found a subtle bug
> in the current code :-)
>
> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > ---
> > hw/9pfs/9p.c | 26 +++++++++++++++++---------
> > 1 file changed, 17 insertions(+), 9 deletions(-)
> >
> > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > index 4f9c2dde9c..4b52b1a22a 100644
> > --- a/hw/9pfs/9p.c
> > +++ b/hw/9pfs/9p.c
> > @@ -434,6 +434,7 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu)
> > V9fsFidState *f;
> > GHashTableIter iter;
> > gpointer fid;
> > + int err;
> >
> > /* prevent multiple coroutines running this function simultaniously */
> > if (s->reclaiming) {
> > @@ -446,10 +447,10 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu)
> > QSLIST_HEAD(, V9fsFidState) reclaim_list =
> > QSLIST_HEAD_INITIALIZER(reclaim_list);
> >
> > + /* Pick FIDs to be closed, collect them on reclaim_list. */
> > while (g_hash_table_iter_next(&iter, &fid, (gpointer *) &f)) {
> > /*
> > - * Unlink fids cannot be reclaimed. Check
> > - * for them and skip them. Also skip fids
> > + * Unlinked fids cannot be reclaimed, skip those, and also skip fids
> > * currently being operated on.
> > */
> > if (f->ref || f->flags & FID_NON_RECLAIMABLE) {
> > @@ -499,17 +500,24 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu)
> > }
> > }
> > /*
> > - * Now close the fid in reclaim list. Free them if they
> > - * are already clunked.
> > + * Close the picked FIDs altogether on a background I/O driver thread. Do
> > + * this all at once to keep latency (i.e. amount of thread hops between main
> > + * thread <-> fs driver background thread) as low as possible.
> > */
> > + v9fs_co_run_in_worker({
> > + QSLIST_FOREACH(f, &reclaim_list, reclaim_next) {
> > + err = (f->fid_type == P9_FID_DIR) ?
> > + s->ops->closedir(&s->ctx, &f->fs_reclaim) :
> > + s->ops->close(&s->ctx, &f->fs_reclaim);
> > + if (!err) {
> > + total_open_fd--;
>
> Hmm... total_open_fd is supposed to be accessed by the main thread only.
> Maybe turn total_open_fd into an atomic ? Or have this block to return
> the count of closed fds and propagate that to total_open_fd in the main
> thread ?
Right! I'll change that to the latter, as it involves far fewer changes.
> > + }
> > + }
> > + });
> > + /* Free the closed FIDs. */
> > while (!QSLIST_EMPTY(&reclaim_list)) {
> > f = QSLIST_FIRST(&reclaim_list);
> > QSLIST_REMOVE(&reclaim_list, f, V9fsFidState, reclaim_next);
> > - if (f->fid_type == P9_FID_FILE) {
> > - v9fs_co_close(pdu, &f->fs_reclaim);
> > - } else if (f->fid_type == P9_FID_DIR) {
> > - v9fs_co_closedir(pdu, &f->fs_reclaim);
> > - }
>
> And now for the subtle bug in the existing code:
>
> v9fs_co_close() and v9fs_co_closedir() honor TFLUSH with the following
> check that skips the actual closing :
>
> if (v9fs_request_cancelled(pdu)) {
> return -EINTR;
> }
>
> If the client cancels the request that needs fd reclaim, some
> fds will be cleared from the fids without being closed. This
> causes fd leaks and probably ruins the overall reclaim logic.
>
> This is kinda fixed by the current patch since all the fds
> in the reclaim list are closed altogether atomically in the
> same v9fs_co_run_in_worker().
Huh, that makes this 2nd patch indeed more than just an optimization.
I'll send a v2 and CC qemu-stable on the 2nd patch as well.
Good catches, thanks Greg!
/Christian
Thread overview: 6+ messages (newest: 2025-03-06 15:12 UTC)
2025-03-04 15:17 [PATCH 0/2] 9pfs: v9fs_reclaim_fd() fixes Christian Schoenebeck
2025-03-04 15:15 ` [PATCH 1/2] 9pfs: fix concurrent v9fs_reclaim_fd() calls Christian Schoenebeck
2025-03-06 7:38 ` Greg Kurz
2025-03-04 15:16 ` [PATCH 2/2] 9pfs: reduce latency of v9fs_reclaim_fd() Christian Schoenebeck
2025-03-06 9:07 ` Greg Kurz
2025-03-06 15:11 ` Christian Schoenebeck [this message]