* [PATCH v2 0/2] 9pfs: v9fs_reclaim_fd() fixes @ 2025-03-07 9:24 Christian Schoenebeck 2025-03-07 9:22 ` [PATCH v2 1/2] 9pfs: fix concurrent v9fs_reclaim_fd() calls Christian Schoenebeck ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Christian Schoenebeck @ 2025-03-07 9:24 UTC (permalink / raw) To: qemu-devel; +Cc: Greg Kurz, qemu-stable Three fixes for 9p server's v9fs_reclaim_fd() function: * Patch 1 fixes a concurrency issue. * Patch 2 fixes a file descriptor leak and optimizes overall latency. On a test machine with ~800,000 FIDs, this reduced execution duration of v9fs_reclaim_fd() from 30ms to 1ms. V2: - Patch 2: Decrement global variable total_open_fd on main thread, not on fs driver background thread. - Patch 2: Update commit log about file descriptor leak being fixed. Christian Schoenebeck (2): 9pfs: fix concurrent v9fs_reclaim_fd() calls 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() hw/9pfs/9p.c | 39 ++++++++++++++++++++++++++++++--------- hw/9pfs/9p.h | 1 + 2 files changed, 31 insertions(+), 9 deletions(-) -- 2.39.5 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 1/2] 9pfs: fix concurrent v9fs_reclaim_fd() calls 2025-03-07 9:24 [PATCH v2 0/2] 9pfs: v9fs_reclaim_fd() fixes Christian Schoenebeck @ 2025-03-07 9:22 ` Christian Schoenebeck 2025-03-07 9:23 ` [PATCH v2 2/2] 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() Christian Schoenebeck 2025-03-18 19:00 ` [PATCH v2 0/2] 9pfs: v9fs_reclaim_fd() fixes Christian Schoenebeck 2 siblings, 0 replies; 7+ messages in thread From: Christian Schoenebeck @ 2025-03-07 9:22 UTC (permalink / raw) To: qemu-devel; +Cc: Greg Kurz, qemu-stable Even though this function is serialized so that it is always called from the main thread, v9fs_reclaim_fd() dispatches the coroutine to a worker thread in between via its v9fs_co_*() calls, hence leading to the situation where v9fs_reclaim_fd() is effectively executed multiple times simultaneously, which renders its LRU algorithm useless and causes high latency. Fix this by adding a simple boolean variable to ensure this function is only called once at a time. No synchronization is needed for this boolean variable as this function is only entered and exited on the main thread. 
Fixes: 7a46274529c ('hw/9pfs: Add file descriptor reclaim support') Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> --- hw/9pfs/9p.c | 10 ++++++++++ hw/9pfs/9p.h | 1 + 2 files changed, 11 insertions(+) diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 7cad2bce62..4f9c2dde9c 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -435,6 +435,12 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) GHashTableIter iter; gpointer fid; + /* prevent multiple coroutines running this function simultaniously */ + if (s->reclaiming) { + return; + } + s->reclaiming = true; + g_hash_table_iter_init(&iter, s->fids); QSLIST_HEAD(, V9fsFidState) reclaim_list = @@ -510,6 +516,8 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) */ put_fid(pdu, f); } + + s->reclaiming = false; } /* @@ -4324,6 +4332,8 @@ int v9fs_device_realize_common(V9fsState *s, const V9fsTransport *t, s->ctx.fst = &fse->fst; fsdev_throttle_init(s->ctx.fst); + s->reclaiming = false; + rc = 0; out: if (rc) { diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h index 5e041e1f60..259ad32ed1 100644 --- a/hw/9pfs/9p.h +++ b/hw/9pfs/9p.h @@ -362,6 +362,7 @@ struct V9fsState { uint64_t qp_ndevices; /* Amount of entries in qpd_table. */ uint16_t qp_affix_next; uint64_t qp_fullpath_next; + bool reclaiming; }; /* 9p2000.L open flags */ -- 2.39.5 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 2/2] 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() 2025-03-07 9:24 [PATCH v2 0/2] 9pfs: v9fs_reclaim_fd() fixes Christian Schoenebeck 2025-03-07 9:22 ` [PATCH v2 1/2] 9pfs: fix concurrent v9fs_reclaim_fd() calls Christian Schoenebeck @ 2025-03-07 9:23 ` Christian Schoenebeck 2025-03-07 9:47 ` Christian Schoenebeck 2025-03-10 9:54 ` Greg Kurz 2025-03-18 19:00 ` [PATCH v2 0/2] 9pfs: v9fs_reclaim_fd() fixes Christian Schoenebeck 2 siblings, 2 replies; 7+ messages in thread From: Christian Schoenebeck @ 2025-03-07 9:23 UTC (permalink / raw) To: qemu-devel; +Cc: Greg Kurz, qemu-stable This patch fixes two different bugs in v9fs_reclaim_fd(): 1. Reduce latency: This function calls v9fs_co_close() and v9fs_co_closedir() in a loop. Each one of the calls adds two thread hops (between main thread and a fs driver background thread). Each thread hop adds latency, which sums up in function's loop to a significant duration. Reduce overall latency by open coding what v9fs_co_close() and v9fs_co_closedir() do, executing those and the loop itself altogether in only one background thread block, hence reducing the total amount of thread hops to only two. 2. Fix file descriptor leak: The existing code called v9fs_co_close() and v9fs_co_closedir() to close file descriptors. Both functions check right at the beginning if the 9p request was cancelled: if (v9fs_request_cancelled(pdu)) { return -EINTR; } So if client sent a 'Tflush' message, v9fs_co_close() / v9fs_co_closedir() returned without having closed the file descriptor and v9fs_reclaim_fd() subsequently freed the FID without its file descriptor being closed, hence leaking those file descriptors. This 2nd bug is fixed by this patch as well by open coding v9fs_co_close() and v9fs_co_closedir() inside of v9fs_reclaim_fd() and not performing the v9fs_request_cancelled(pdu) check there. 
Fixes: 7a46274529c ('hw/9pfs: Add file descriptor reclaim support') Fixes: bccacf6c792 ('hw/9pfs: Implement TFLUSH operation') Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> --- hw/9pfs/9p.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 4f9c2dde9c..80b190ff5b 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -434,6 +434,8 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) V9fsFidState *f; GHashTableIter iter; gpointer fid; + int err; + int nclosed = 0; /* prevent multiple coroutines running this function simultaniously */ if (s->reclaiming) { @@ -446,10 +448,10 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) QSLIST_HEAD(, V9fsFidState) reclaim_list = QSLIST_HEAD_INITIALIZER(reclaim_list); + /* Pick FIDs to be closed, collect them on reclaim_list. */ while (g_hash_table_iter_next(&iter, &fid, (gpointer *) &f)) { /* - * Unlink fids cannot be reclaimed. Check - * for them and skip them. Also skip fids + * Unlinked fids cannot be reclaimed, skip those, and also skip fids * currently being operated on. */ if (f->ref || f->flags & FID_NON_RECLAIMABLE) { @@ -499,17 +501,26 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) } } /* - * Now close the fid in reclaim list. Free them if they - * are already clunked. + * Close the picked FIDs altogether on a background I/O driver thread. Do + * this all at once to keep latency (i.e. amount of thread hops between main + * thread <-> fs driver background thread) as low as possible. */ + v9fs_co_run_in_worker({ + QSLIST_FOREACH(f, &reclaim_list, reclaim_next) { + err = (f->fid_type == P9_FID_DIR) ? + s->ops->closedir(&s->ctx, &f->fs_reclaim) : + s->ops->close(&s->ctx, &f->fs_reclaim); + if (!err) { + /* total_open_fd must only be mutated on main thread */ + nclosed++; + } + } + }); + total_open_fd -= nclosed; + /* Free the closed FIDs. 
*/ while (!QSLIST_EMPTY(&reclaim_list)) { f = QSLIST_FIRST(&reclaim_list); QSLIST_REMOVE(&reclaim_list, f, V9fsFidState, reclaim_next); - if (f->fid_type == P9_FID_FILE) { - v9fs_co_close(pdu, &f->fs_reclaim); - } else if (f->fid_type == P9_FID_DIR) { - v9fs_co_closedir(pdu, &f->fs_reclaim); - } /* * Now drop the fid reference, free it * if clunked. -- 2.39.5 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 2/2] 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() 2025-03-07 9:23 ` [PATCH v2 2/2] 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() Christian Schoenebeck @ 2025-03-07 9:47 ` Christian Schoenebeck 2025-03-10 9:57 ` Greg Kurz 2025-03-10 9:54 ` Greg Kurz 1 sibling, 1 reply; 7+ messages in thread From: Christian Schoenebeck @ 2025-03-07 9:47 UTC (permalink / raw) To: qemu-devel; +Cc: Greg Kurz, qemu-stable On Friday, March 7, 2025 10:23:02 AM CET Christian Schoenebeck wrote: > This patch fixes two different bugs in v9fs_reclaim_fd(): > > 1. Reduce latency: > > This function calls v9fs_co_close() and v9fs_co_closedir() in a loop. Each > one of the calls adds two thread hops (between main thread and a fs driver > background thread). Each thread hop adds latency, which sums up in > function's loop to a significant duration. > > Reduce overall latency by open coding what v9fs_co_close() and > v9fs_co_closedir() do, executing those and the loop itself altogether in > only one background thread block, hence reducing the total amount of > thread hops to only two. > > 2. Fix file descriptor leak: > > The existing code called v9fs_co_close() and v9fs_co_closedir() to close > file descriptors. Both functions check right at the beginning if the 9p > request was cancelled: > > if (v9fs_request_cancelled(pdu)) { > return -EINTR; > } > > So if client sent a 'Tflush' message, v9fs_co_close() / v9fs_co_closedir() > returned without having closed the file descriptor and v9fs_reclaim_fd() > subsequently freed the FID without its file descriptor being closed, hence > leaking those file descriptors. > > This 2nd bug is fixed by this patch as well by open coding v9fs_co_close() > and v9fs_co_closedir() inside of v9fs_reclaim_fd() and not performing the > v9fs_request_cancelled(pdu) check there. 
> > Fixes: 7a46274529c ('hw/9pfs: Add file descriptor reclaim support') > Fixes: bccacf6c792 ('hw/9pfs: Implement TFLUSH operation') > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> > --- > hw/9pfs/9p.c | 29 ++++++++++++++++++++--------- > 1 file changed, 20 insertions(+), 9 deletions(-) > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c > index 4f9c2dde9c..80b190ff5b 100644 > --- a/hw/9pfs/9p.c > +++ b/hw/9pfs/9p.c > @@ -434,6 +434,8 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > V9fsFidState *f; > GHashTableIter iter; > gpointer fid; > + int err; > + int nclosed = 0; > > /* prevent multiple coroutines running this function simultaniously */ > if (s->reclaiming) { > @@ -446,10 +448,10 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > QSLIST_HEAD(, V9fsFidState) reclaim_list = > QSLIST_HEAD_INITIALIZER(reclaim_list); > > + /* Pick FIDs to be closed, collect them on reclaim_list. */ > while (g_hash_table_iter_next(&iter, &fid, (gpointer *) &f)) { > /* > - * Unlink fids cannot be reclaimed. Check > - * for them and skip them. Also skip fids > + * Unlinked fids cannot be reclaimed, skip those, and also skip fids > * currently being operated on. > */ > if (f->ref || f->flags & FID_NON_RECLAIMABLE) { > @@ -499,17 +501,26 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > } > } > /* > - * Now close the fid in reclaim list. Free them if they > - * are already clunked. > + * Close the picked FIDs altogether on a background I/O driver thread. Do > + * this all at once to keep latency (i.e. amount of thread hops between main > + * thread <-> fs driver background thread) as low as possible. > */ > + v9fs_co_run_in_worker({ > + QSLIST_FOREACH(f, &reclaim_list, reclaim_next) { > + err = (f->fid_type == P9_FID_DIR) ? 
> + s->ops->closedir(&s->ctx, &f->fs_reclaim) : > + s->ops->close(&s->ctx, &f->fs_reclaim); > + if (!err) { > + /* total_open_fd must only be mutated on main thread */ > + nclosed++; > + } > + } > + }); > + total_open_fd -= nclosed; So here is another thing: looking at 'man 2 close' I would say that decrementing 'total_open_fd' conditionally based on what close() returned is wrong. The man page suggests that the return value of close() should only be used for diagnostic purposes, as an error on close() often indicates just an error on a previous write(), and hence the return value should only be used for catching data loss related to writes. So this should probably be changed here, as well as in v9fs_co_close() / v9fs_co_closedir(), as part of a separate patch though, so I haven't addressed it here yet. Does this make sense? /Christian > + /* Free the closed FIDs. */ > while (!QSLIST_EMPTY(&reclaim_list)) { > f = QSLIST_FIRST(&reclaim_list); > QSLIST_REMOVE(&reclaim_list, f, V9fsFidState, reclaim_next); > - if (f->fid_type == P9_FID_FILE) { > - v9fs_co_close(pdu, &f->fs_reclaim); > - } else if (f->fid_type == P9_FID_DIR) { > - v9fs_co_closedir(pdu, &f->fs_reclaim); > - } > /* > * Now drop the fid reference, free it > * if clunked. > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 2/2] 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() 2025-03-07 9:47 ` Christian Schoenebeck @ 2025-03-10 9:57 ` Greg Kurz 0 siblings, 0 replies; 7+ messages in thread From: Greg Kurz @ 2025-03-10 9:57 UTC (permalink / raw) To: Christian Schoenebeck; +Cc: qemu-devel, qemu-stable On Fri, 07 Mar 2025 10:47:33 +0100 Christian Schoenebeck <qemu_oss@crudebyte.com> wrote: > On Friday, March 7, 2025 10:23:02 AM CET Christian Schoenebeck wrote: > > This patch fixes two different bugs in v9fs_reclaim_fd(): > > > > 1. Reduce latency: > > > > This function calls v9fs_co_close() and v9fs_co_closedir() in a loop. Each > > one of the calls adds two thread hops (between main thread and a fs driver > > background thread). Each thread hop adds latency, which sums up in > > function's loop to a significant duration. > > > > Reduce overall latency by open coding what v9fs_co_close() and > > v9fs_co_closedir() do, executing those and the loop itself altogether in > > only one background thread block, hence reducing the total amount of > > thread hops to only two. > > > > 2. Fix file descriptor leak: > > > > The existing code called v9fs_co_close() and v9fs_co_closedir() to close > > file descriptors. Both functions check right at the beginning if the 9p > > request was cancelled: > > > > if (v9fs_request_cancelled(pdu)) { > > return -EINTR; > > } > > > > So if client sent a 'Tflush' message, v9fs_co_close() / v9fs_co_closedir() > > returned without having closed the file descriptor and v9fs_reclaim_fd() > > subsequently freed the FID without its file descriptor being closed, hence > > leaking those file descriptors. > > > > This 2nd bug is fixed by this patch as well by open coding v9fs_co_close() > > and v9fs_co_closedir() inside of v9fs_reclaim_fd() and not performing the > > v9fs_request_cancelled(pdu) check there. 
> > > > Fixes: 7a46274529c ('hw/9pfs: Add file descriptor reclaim support') > > Fixes: bccacf6c792 ('hw/9pfs: Implement TFLUSH operation') > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> > > --- > > hw/9pfs/9p.c | 29 ++++++++++++++++++++--------- > > 1 file changed, 20 insertions(+), 9 deletions(-) > > > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c > > index 4f9c2dde9c..80b190ff5b 100644 > > --- a/hw/9pfs/9p.c > > +++ b/hw/9pfs/9p.c > > @@ -434,6 +434,8 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > > V9fsFidState *f; > > GHashTableIter iter; > > gpointer fid; > > + int err; > > + int nclosed = 0; > > > > /* prevent multiple coroutines running this function simultaniously */ > > if (s->reclaiming) { > > @@ -446,10 +448,10 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > > QSLIST_HEAD(, V9fsFidState) reclaim_list = > > QSLIST_HEAD_INITIALIZER(reclaim_list); > > > > + /* Pick FIDs to be closed, collect them on reclaim_list. */ > > while (g_hash_table_iter_next(&iter, &fid, (gpointer *) &f)) { > > /* > > - * Unlink fids cannot be reclaimed. Check > > - * for them and skip them. Also skip fids > > + * Unlinked fids cannot be reclaimed, skip those, and also skip fids > > * currently being operated on. > > */ > > if (f->ref || f->flags & FID_NON_RECLAIMABLE) { > > @@ -499,17 +501,26 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > > } > > } > > /* > > - * Now close the fid in reclaim list. Free them if they > > - * are already clunked. > > + * Close the picked FIDs altogether on a background I/O driver thread. Do > > + * this all at once to keep latency (i.e. amount of thread hops between main > > + * thread <-> fs driver background thread) as low as possible. > > */ > > + v9fs_co_run_in_worker({ > > + QSLIST_FOREACH(f, &reclaim_list, reclaim_next) { > > + err = (f->fid_type == P9_FID_DIR) ? 
> > + s->ops->closedir(&s->ctx, &f->fs_reclaim) : > > + s->ops->close(&s->ctx, &f->fs_reclaim); > > + if (!err) { > > + /* total_open_fd must only be mutated on main thread */ > > + nclosed++; > > + } > > + } > > + }); > > + total_open_fd -= nclosed; > > So here is another thing: looking at 'man 2 close' I would say that > decrementing 'total_open_fd' conditionally based on what close() returned is > wrong. The man page suggest that the return value of close() should only be > used for diagnostic purposes, as an error on close() often indicates just an > error on a previous write() and hence the return value should only be used for > catching a data loss related to writes. > > So this should probably changed here, as well as in v9fs_co_close() / > v9fs_co_closedir(), part of a separate patch though, so I haven't addressed it > here yet. > > Does this make sense? > The handling of the return value of `close()` seems to be non-trivial indeed. It makes sense to address this in some other series. Cheers, -- Greg > /Christian > > > + /* Free the closed FIDs. */ > > while (!QSLIST_EMPTY(&reclaim_list)) { > > f = QSLIST_FIRST(&reclaim_list); > > QSLIST_REMOVE(&reclaim_list, f, V9fsFidState, reclaim_next); > > - if (f->fid_type == P9_FID_FILE) { > > - v9fs_co_close(pdu, &f->fs_reclaim); > > - } else if (f->fid_type == P9_FID_DIR) { > > - v9fs_co_closedir(pdu, &f->fs_reclaim); > > - } > > /* > > * Now drop the fid reference, free it > > * if clunked. > > > > -- Greg ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 2/2] 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() 2025-03-07 9:23 ` [PATCH v2 2/2] 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() Christian Schoenebeck 2025-03-07 9:47 ` Christian Schoenebeck @ 2025-03-10 9:54 ` Greg Kurz 1 sibling, 0 replies; 7+ messages in thread From: Greg Kurz @ 2025-03-10 9:54 UTC (permalink / raw) To: Christian Schoenebeck; +Cc: qemu-devel, qemu-stable On Fri, 7 Mar 2025 10:23:02 +0100 Christian Schoenebeck <qemu_oss@crudebyte.com> wrote: > This patch fixes two different bugs in v9fs_reclaim_fd(): > > 1. Reduce latency: > > This function calls v9fs_co_close() and v9fs_co_closedir() in a loop. Each > one of the calls adds two thread hops (between main thread and a fs driver > background thread). Each thread hop adds latency, which sums up in > function's loop to a significant duration. > > Reduce overall latency by open coding what v9fs_co_close() and > v9fs_co_closedir() do, executing those and the loop itself altogether in > only one background thread block, hence reducing the total amount of > thread hops to only two. > > 2. Fix file descriptor leak: > > The existing code called v9fs_co_close() and v9fs_co_closedir() to close > file descriptors. Both functions check right at the beginning if the 9p > request was cancelled: > > if (v9fs_request_cancelled(pdu)) { > return -EINTR; > } > > So if client sent a 'Tflush' message, v9fs_co_close() / v9fs_co_closedir() > returned without having closed the file descriptor and v9fs_reclaim_fd() > subsequently freed the FID without its file descriptor being closed, hence > leaking those file descriptors. > > This 2nd bug is fixed by this patch as well by open coding v9fs_co_close() > and v9fs_co_closedir() inside of v9fs_reclaim_fd() and not performing the > v9fs_request_cancelled(pdu) check there. 
> > Fixes: 7a46274529c ('hw/9pfs: Add file descriptor reclaim support') > Fixes: bccacf6c792 ('hw/9pfs: Implement TFLUSH operation') > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> > --- Reviewed-by: Greg Kurz <groug@kaod.org> > hw/9pfs/9p.c | 29 ++++++++++++++++++++--------- > 1 file changed, 20 insertions(+), 9 deletions(-) > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c > index 4f9c2dde9c..80b190ff5b 100644 > --- a/hw/9pfs/9p.c > +++ b/hw/9pfs/9p.c > @@ -434,6 +434,8 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > V9fsFidState *f; > GHashTableIter iter; > gpointer fid; > + int err; > + int nclosed = 0; > > /* prevent multiple coroutines running this function simultaniously */ > if (s->reclaiming) { > @@ -446,10 +448,10 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > QSLIST_HEAD(, V9fsFidState) reclaim_list = > QSLIST_HEAD_INITIALIZER(reclaim_list); > > + /* Pick FIDs to be closed, collect them on reclaim_list. */ > while (g_hash_table_iter_next(&iter, &fid, (gpointer *) &f)) { > /* > - * Unlink fids cannot be reclaimed. Check > - * for them and skip them. Also skip fids > + * Unlinked fids cannot be reclaimed, skip those, and also skip fids > * currently being operated on. > */ > if (f->ref || f->flags & FID_NON_RECLAIMABLE) { > @@ -499,17 +501,26 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > } > } > /* > - * Now close the fid in reclaim list. Free them if they > - * are already clunked. > + * Close the picked FIDs altogether on a background I/O driver thread. Do > + * this all at once to keep latency (i.e. amount of thread hops between main > + * thread <-> fs driver background thread) as low as possible. > */ > + v9fs_co_run_in_worker({ > + QSLIST_FOREACH(f, &reclaim_list, reclaim_next) { > + err = (f->fid_type == P9_FID_DIR) ? 
> + s->ops->closedir(&s->ctx, &f->fs_reclaim) : > + s->ops->close(&s->ctx, &f->fs_reclaim); > + if (!err) { > + /* total_open_fd must only be mutated on main thread */ > + nclosed++; > + } > + } > + }); > + total_open_fd -= nclosed; > + /* Free the closed FIDs. */ > while (!QSLIST_EMPTY(&reclaim_list)) { > f = QSLIST_FIRST(&reclaim_list); > QSLIST_REMOVE(&reclaim_list, f, V9fsFidState, reclaim_next); > - if (f->fid_type == P9_FID_FILE) { > - v9fs_co_close(pdu, &f->fs_reclaim); > - } else if (f->fid_type == P9_FID_DIR) { > - v9fs_co_closedir(pdu, &f->fs_reclaim); > - } > /* > * Now drop the fid reference, free it > * if clunked. -- Greg ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 0/2] 9pfs: v9fs_reclaim_fd() fixes 2025-03-07 9:24 [PATCH v2 0/2] 9pfs: v9fs_reclaim_fd() fixes Christian Schoenebeck 2025-03-07 9:22 ` [PATCH v2 1/2] 9pfs: fix concurrent v9fs_reclaim_fd() calls Christian Schoenebeck 2025-03-07 9:23 ` [PATCH v2 2/2] 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() Christian Schoenebeck @ 2025-03-18 19:00 ` Christian Schoenebeck 2 siblings, 0 replies; 7+ messages in thread From: Christian Schoenebeck @ 2025-03-18 19:00 UTC (permalink / raw) To: qemu-devel; +Cc: Greg Kurz, qemu-stable, Christian Schoenebeck On Friday, March 7, 2025 10:24:12 AM CET Christian Schoenebeck wrote: > Three fixes for 9p server's v9fs_reclaim_fd() function: > > * Patch 1 fixes a concurrency issue. > > * Patch 2 fixes a file descriptor leak and optimizes overall latency. On a test > machine with ~800,000 FIDs, this reduced execution duration of > v9fs_reclaim_fd() from 30ms to 1ms. > > V2: > - Patch 2: Decrement global variable total_open_fd on main thread, not > on fs driver background thread. > - Patch 2: Update commit log about file descriptor leak being fixed. > > Christian Schoenebeck (2): > 9pfs: fix concurrent v9fs_reclaim_fd() calls > 9pfs: fix FD leak and reduce latency of v9fs_reclaim_fd() > > hw/9pfs/9p.c | 39 ++++++++++++++++++++++++++++++--------- > hw/9pfs/9p.h | 1 + > 2 files changed, 31 insertions(+), 9 deletions(-) > > Queued on 9p.next: https://github.com/cschoenebeck/qemu/commits/9p.next Thanks! /Christian ^ permalink raw reply [flat|nested] 7+ messages in thread