From: Josef Bacik <josef@toxicpanda.com>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: miklos@szeredi.hu, linux-fsdevel@vger.kernel.org,
bernd.schubert@fastmail.fm, jefflexu@linux.alibaba.com,
laoar.shao@gmail.com, kernel-team@meta.com
Subject: Re: [PATCH v3 1/2] fuse: add optional kernel-enforced timeout for requests
Date: Thu, 8 Aug 2024 16:50:06 -0400
Message-ID: <20240808205006.GA625513@perftesting>
In-Reply-To: <20240808190110.3188039-2-joannelkoong@gmail.com>

On Thu, Aug 08, 2024 at 12:01:09PM -0700, Joanne Koong wrote:
> There are situations where fuse servers can become unresponsive or take
> too long to reply to a request. Currently there is no upper bound on
> how long a request may take, which may be frustrating to users who get
> stuck waiting for a request to complete.
>
> This commit adds a timeout option (in seconds) for requests. If the
> timeout elapses before the server replies to the request, the request
> will fail with -ETIME.
>
> There are 3 possibilities for a request that times out:
> a) The request times out before the request has been sent to userspace
> b) The request times out after the request has been sent to userspace
> and before it receives a reply from the server
> c) The request times out after the request has been sent to userspace
> and the server replies while the kernel is timing out the request
>
> While a request timeout is being handled, there may be other handlers
> running at the same time if:
> a) the kernel is forwarding the request to the server
> b) the kernel is processing the server's reply to the request
> c) the request is being re-sent
> d) the connection is aborting
> e) the device is getting released
>
> Proper synchronization must be added to ensure that the request is
> handled correctly in all of these cases. To this effect, there is a new
> FR_FINISHING bit added to the request flags, which is set atomically by
> either the timeout handler (see fuse_request_timeout()) which is invoked
> after the request timeout elapses or set by the request reply handler
> (see dev_do_write()), whichever gets there first. If the reply handler
> and the timeout handler are executing simultaneously and the reply handler
> sets FR_FINISHING before the timeout handler, then the request will be
> handled as if the timeout did not elapse. If the timeout handler sets
> FR_FINISHING before the reply handler, then the request will fail with
> -ETIME and the request will be cleaned up.
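
(Illustration only, not code from the patch: a minimal userspace sketch of the "whichever gets there first" arbitration described above. It assumes C11 atomics as a stand-in for the kernel's test_and_set_bit(); struct model_req, MODEL_FR_FINISHING and the model_* helpers are hypothetical names invented for this sketch.)

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
enum { MODEL_FR_FINISHING = 1u << 0 };

struct model_req {
	atomic_uint flags;
	int error;
};

/* Userspace analogue of test_and_set_bit(): returns the previous bit value. */
static bool model_test_and_set(struct model_req *req, unsigned int bit)
{
	assert(bit != 0);
	return (atomic_fetch_or(&req->flags, bit) & bit) != 0;
}

/* Timeout path: fails the request only if it claimed FINISHING first. */
static bool model_timeout(struct model_req *req)
{
	if (model_test_and_set(req, MODEL_FR_FINISHING))
		return false;	/* the reply handler already owns the request */
	req->error = -ETIME;
	return true;
}

/* Reply path: delivers the reply only if it claimed FINISHING first. */
static bool model_reply(struct model_req *req, int error)
{
	if (model_test_and_set(req, MODEL_FR_FINISHING))
		return false;	/* the timeout handler already owns the request */
	req->error = error;
	return true;
}
```

Whichever of model_reply() or model_timeout() runs first wins, and the loser backs off without touching the result. The sketch leaves out the locking and list handling that the real handlers need.
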
>
> Currently, this is the refcount lifecycle of a request:
>
> Synchronous request is created:
> fuse_simple_request -> allocates request, sets refcount to 1
> __fuse_request_send -> acquires refcount
> queues request and waits for reply...
> fuse_simple_request -> drops refcount
>
> Background request is created:
> fuse_simple_background -> allocates request, sets refcount to 1
>
> Request is replied to:
> fuse_dev_do_write
> fuse_request_end -> drops refcount on request
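
(Illustration only, not code from the patch: a minimal sketch of the refcount lifecycle listed above, using a plain integer in place of refcount_t. struct rc_req and the rc_* helpers are hypothetical names invented for this sketch, and the comments map each step to the function named in the commit message.)

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct fuse_req; only the refcount is modeled. */
struct rc_req {
	int count;
	bool freed;
};

static void rc_alloc(struct rc_req *req)
{
	req->count = 1;		/* allocation sets the refcount to 1 */
	req->freed = false;
}

static void rc_get(struct rc_req *req)
{
	assert(req->count > 0);	/* taking a reference on a freed request is a bug */
	req->count++;
}

static void rc_put(struct rc_req *req)
{
	assert(req->count > 0);
	if (--req->count == 0)
		req->freed = true;	/* last reference dropped */
}

/* Synchronous path: the extra reference keeps the request alive past the reply. */
static void rc_sync_request(struct rc_req *req)
{
	rc_alloc(req);		/* fuse_simple_request */
	rc_get(req);		/* __fuse_request_send */
	rc_put(req);		/* fuse_request_end, once the reply arrives */
	assert(!req->freed);	/* the caller can still read the result */
	rc_put(req);		/* fuse_simple_request drops its reference */
}
```

In this model a background request holds only the allocation reference, so the rc_put() standing in for fuse_request_end is the final one. That is why a timeout handler needs a reference of its own.
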
>
> Proper acquires on the request reference must be added to ensure that the
> timeout handler does not drop the last refcount on the request while
> other handlers may be operating on the request. Please note that the
> timeout handler may get invoked at any phase of the request's
> lifetime (eg before the request has been forwarded to userspace, etc).
>
> It is always guaranteed that there is a refcount on the request when the
> timeout handler is executing. The timeout handler will be either
> deactivated by the reply/abort/release handlers, or if the timeout
> handler is concurrently executing on another CPU, the reply/abort/release
> handlers will wait for the timeout handler to finish executing first before
> it drops the final refcount on the request.
>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
> fs/fuse/dev.c | 197 +++++++++++++++++++++++++++++++++++++++++++++--
> fs/fuse/fuse_i.h | 14 ++++
> fs/fuse/inode.c | 7 ++
> 3 files changed, 210 insertions(+), 8 deletions(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 9eb191b5c4de..bcb9ff2156c0 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -31,6 +31,8 @@ MODULE_ALIAS("devname:fuse");
>
> static struct kmem_cache *fuse_req_cachep;
>
> +static void fuse_request_timeout(struct timer_list *timer);
> +
> static struct fuse_dev *fuse_get_dev(struct file *file)
> {
> /*
> @@ -48,6 +50,8 @@ static void fuse_request_init(struct fuse_mount *fm, struct fuse_req *req)
> refcount_set(&req->count, 1);
> __set_bit(FR_PENDING, &req->flags);
> req->fm = fm;
> + if (fm->fc->req_timeout)
> + timer_setup(&req->timer, fuse_request_timeout, 0);
> }
>
> static struct fuse_req *fuse_request_alloc(struct fuse_mount *fm, gfp_t flags)
> @@ -277,7 +281,7 @@ static void flush_bg_queue(struct fuse_conn *fc)
> * the 'end' callback is called if given, else the reference to the
> * request is released
> */
> -void fuse_request_end(struct fuse_req *req)
> +static void do_fuse_request_end(struct fuse_req *req)
> {
> struct fuse_mount *fm = req->fm;
> struct fuse_conn *fc = fm->fc;
> @@ -296,8 +300,6 @@ void fuse_request_end(struct fuse_req *req)
> list_del_init(&req->intr_entry);
> spin_unlock(&fiq->lock);
> }
> - WARN_ON(test_bit(FR_PENDING, &req->flags));
> - WARN_ON(test_bit(FR_SENT, &req->flags));
> if (test_bit(FR_BACKGROUND, &req->flags)) {
> spin_lock(&fc->bg_lock);
> clear_bit(FR_BACKGROUND, &req->flags);
> @@ -329,8 +331,104 @@ void fuse_request_end(struct fuse_req *req)
> put_request:
> fuse_put_request(req);
> }
> +
> +void fuse_request_end(struct fuse_req *req)
> +{
> + WARN_ON(test_bit(FR_PENDING, &req->flags));
> + WARN_ON(test_bit(FR_SENT, &req->flags));
> +
> + if (req->timer.function)
> + timer_delete_sync(&req->timer);

This becomes just timer_delete_sync();

> +
> + do_fuse_request_end(req);
> +}
> EXPORT_SYMBOL_GPL(fuse_request_end);
>
> +static void timeout_inflight_req(struct fuse_req *req)
> +{
> + struct fuse_conn *fc = req->fm->fc;
> + struct fuse_iqueue *fiq = &fc->iq;
> + struct fuse_pqueue *fpq;
> +
> + spin_lock(&fiq->lock);
> + fpq = req->fpq;
> + spin_unlock(&fiq->lock);
> +
> + /*
> + * If fpq has not been set yet, then the request is aborting (which
> + * clears FR_PENDING flag) before dev_do_read (which sets req->fpq)
> + * has been called. Let the abort handler handle this request.
> + */
> + if (!fpq)
> + return;
> +
> + spin_lock(&fpq->lock);
> + if (!fpq->connected || req->out.h.error == -ECONNABORTED) {
> + /*
> + * Connection is being aborted or the fuse_dev is being released.
> + * The abort / release will clean up the request
> + */
> + spin_unlock(&fpq->lock);
> + return;
> + }
> +
> + if (!test_bit(FR_PRIVATE, &req->flags))
> + list_del_init(&req->list);
> +
> + spin_unlock(&fpq->lock);
> +
> + req->out.h.error = -ETIME;
> +
> + do_fuse_request_end(req);
> +}
> +
> +static void timeout_pending_req(struct fuse_req *req)
> +{
> + struct fuse_conn *fc = req->fm->fc;
> + struct fuse_iqueue *fiq = &fc->iq;
> + bool background = test_bit(FR_BACKGROUND, &req->flags);
> +
> + if (background)
> + spin_lock(&fc->bg_lock);
> + spin_lock(&fiq->lock);
> +
> + if (!test_bit(FR_PENDING, &req->flags)) {
> + spin_unlock(&fiq->lock);
> + if (background)
> + spin_unlock(&fc->bg_lock);
> + timeout_inflight_req(req);
> + return;
> + }
> +
> + if (!test_bit(FR_PRIVATE, &req->flags))
> + list_del_init(&req->list);
> +
> + spin_unlock(&fiq->lock);
> + if (background)
> + spin_unlock(&fc->bg_lock);
> +
> + req->out.h.error = -ETIME;
> +
> + do_fuse_request_end(req);
> +}
> +
> +static void fuse_request_timeout(struct timer_list *timer)
> +{
> + struct fuse_req *req = container_of(timer, struct fuse_req, timer);
> +
> + /*
> + * Request reply is being finished by the kernel right now.
> + * No need to time out the request.
> + */
> + if (test_and_set_bit(FR_FINISHING, &req->flags))
> + return;
> +
> + if (test_bit(FR_PENDING, &req->flags))
> + timeout_pending_req(req);
> + else
> + timeout_inflight_req(req);
> +}
> +
> static int queue_interrupt(struct fuse_req *req)
> {
> struct fuse_iqueue *fiq = &req->fm->fc->iq;
> @@ -393,6 +491,11 @@ static void request_wait_answer(struct fuse_req *req)
> if (test_bit(FR_PENDING, &req->flags)) {
> list_del(&req->list);
> spin_unlock(&fiq->lock);
> + if (req->timer.function) {
> + bool timed_out = !timer_delete_sync(&req->timer);
> + if (timed_out)
> + return;
> + }

This can just be

	if (!timer_delete_sync(&req->timer))
		return;

> __fuse_put_request(req);
> req->out.h.error = -EINTR;
> return;
> @@ -409,7 +512,8 @@ static void request_wait_answer(struct fuse_req *req)
>
> static void __fuse_request_send(struct fuse_req *req)
> {
> - struct fuse_iqueue *fiq = &req->fm->fc->iq;
> + struct fuse_conn *fc = req->fm->fc;
> + struct fuse_iqueue *fiq = &fc->iq;
>
> BUG_ON(test_bit(FR_BACKGROUND, &req->flags));
> spin_lock(&fiq->lock);
> @@ -421,6 +525,10 @@ static void __fuse_request_send(struct fuse_req *req)
> /* acquire extra reference, since request is still needed
> after fuse_request_end() */
> __fuse_get_request(req);
> + if (req->timer.function) {
> + req->timer.expires = jiffies + fc->req_timeout;
> + add_timer(&req->timer);
> + }

This can just be

	if (req->timer.function)
		mod_timer(&req->timer, jiffies + fc->req_timeout);

> queue_request_and_unlock(fiq, req);
>
> request_wait_answer(req);
> @@ -539,6 +647,10 @@ static bool fuse_request_queue_background(struct fuse_req *req)
> if (fc->num_background == fc->max_background)
> fc->blocked = 1;
> list_add_tail(&req->list, &fc->bg_queue);
> + if (req->timer.function) {
> + req->timer.expires = jiffies + fc->req_timeout;
> + add_timer(&req->timer);
> + }

Same comment as above.

> flush_bg_queue(fc);
> queued = true;
> }
> @@ -594,6 +706,10 @@ static int fuse_simple_notify_reply(struct fuse_mount *fm,
>
> spin_lock(&fiq->lock);
> if (fiq->connected) {
> + if (req->timer.function) {
> + req->timer.expires = jiffies + fm->fc->req_timeout;
> + add_timer(&req->timer);
> + }

Here as well.

> queue_request_and_unlock(fiq, req);
> } else {
> err = -ENODEV;
> @@ -1268,8 +1384,26 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
> req = list_entry(fiq->pending.next, struct fuse_req, list);
> clear_bit(FR_PENDING, &req->flags);
> list_del_init(&req->list);
> + /* Acquire a reference in case the timeout handler starts executing */
> + __fuse_get_request(req);
> + req->fpq = fpq;
> spin_unlock(&fiq->lock);
>
> + if (req->timer.function) {
> + /*
> + * Temporarily disable the timer on the request to avoid race
> + * conditions between this code and the timeout handler.
> + *
> + * The timer is readded at the end of this function.
> + */
> + bool timed_out = !timer_delete_sync(&req->timer);
> + if (timed_out) {

This can also just be

	if (!timer_delete_sync(&req->timer)) {

> + WARN_ON(!test_bit(FR_FINISHED, &req->flags));
> + fuse_put_request(req);
> + goto restart;
> + }
> + }
> +
> args = req->args;
> reqsize = req->in.h.len;
>
> @@ -1280,6 +1414,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
> if (args->opcode == FUSE_SETXATTR)
> req->out.h.error = -E2BIG;
> fuse_request_end(req);
> + fuse_put_request(req);
> goto restart;
> }
> spin_lock(&fpq->lock);
> @@ -1316,13 +1451,18 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
> }
> hash = fuse_req_hash(req->in.h.unique);
> list_move_tail(&req->list, &fpq->processing[hash]);
> - __fuse_get_request(req);
> set_bit(FR_SENT, &req->flags);
> +
> + /* re-arm the original timer */
> + if (req->timer.function)
> + add_timer(&req->timer);

This will not change anything if the timer was already armed. Do you want
mod_timer_pending() here, or maybe just mod_timer()?

Thanks,

Josef