From: Bernd Schubert <bernd.schubert@fastmail.fm>
To: Joanne Koong <joannelkoong@gmail.com>,
	miklos@szeredi.hu, linux-fsdevel@vger.kernel.org
Cc: josef@toxicpanda.com, laoar.shao@gmail.com, kernel-team@meta.com
Subject: Re: [PATCH v2 1/2] fuse: add optional kernel-enforced timeout for requests
Date: Mon, 5 Aug 2024 15:26:37 +0200
Message-ID: <fc1ed986-fcd6-4a52-aed3-f3f61f2513a7@fastmail.fm>
In-Reply-To: <CAJnrk1Yf68HbGUuDv6zwfqkarMBsaHi1DJPdA0Fg5EyXvWbtFA@mail.gmail.com>



On 8/5/24 06:52, Joanne Koong wrote:
> On Mon, Jul 29, 2024 at 5:28 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>>
>> There are situations where fuse servers can become unresponsive or take
>> too long to reply to a request. Currently there is no upper bound on
>> how long a request may take, which may be frustrating to users who get
>> stuck waiting for a request to complete.
>>
>> This commit adds a timeout option (in seconds) for requests. If the
>> timeout elapses before the server replies to the request, the request
>> will fail with -ETIME.
>>
>> There are 3 possibilities for a request that times out:
>> a) The request times out before the request has been sent to userspace
>> b) The request times out after the request has been sent to userspace
>> and before it receives a reply from the server
>> c) The request times out after the request has been sent to userspace
>> and the server replies while the kernel is timing out the request
>>
>> While a request timeout is being handled, there may be other handlers
>> running at the same time if:
>> a) the kernel is forwarding the request to the server
>> b) the kernel is processing the server's reply to the request
>> c) the request is being re-sent
>> d) the connection is aborting
>> e) the device is getting released
>>
>> Proper synchronization must be added to ensure that the request is
>> handled correctly in all of these cases. To this effect, there is a new
>> FR_FINISHING bit added to the request flags, which is set atomically by
>> either the timeout handler (see fuse_request_timeout()) which is invoked
>> after the request timeout elapses or set by the request reply handler
>> (see dev_do_write()), whichever gets there first. If the reply handler
>> and the timeout handler are executing simultaneously and the reply handler
>> sets FR_FINISHING before the timeout handler, then the request will be
>> handled as if the timeout did not elapse. If the timeout handler sets
>> FR_FINISHING before the reply handler, then the request will fail with
>> -ETIME and the request will be cleaned up.
>>
>> Currently, this is the refcount lifecycle of a request:
>>
>> Synchronous request is created:
>> fuse_simple_request -> allocates request, sets refcount to 1
>>    __fuse_request_send -> acquires refcount
>>      queues request and waits for reply...
>> fuse_simple_request -> drops refcount
>>
>> Background request is created:
>> fuse_simple_background -> allocates request, sets refcount to 1
>>
>> Request is replied to:
>> fuse_dev_do_write
>>    fuse_request_end -> drops refcount on request
>>
>> Proper acquires on the request reference must be added to ensure that the
>> timeout handler does not drop the last refcount on the request while
>> other handlers may be operating on the request. Please note that the
>> timeout handler may get invoked at any phase of the request's
>> lifetime (eg before the request has been forwarded to userspace, etc).
>>
>> It is always guaranteed that there is a refcount on the request when the
>> timeout handler is executing. The timeout handler will be either
>> deactivated by the reply/abort/release handlers, or if the timeout
>> handler is concurrently executing on another CPU, the reply/abort/release
>> handlers will wait for the timeout handler to finish executing first before
>> it drops the final refcount on the request.
>>
>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
>> ---
>>   fs/fuse/dev.c    | 187 +++++++++++++++++++++++++++++++++++++++++++++--
>>   fs/fuse/fuse_i.h |  14 ++++
>>   fs/fuse/inode.c  |   7 ++
>>   3 files changed, 200 insertions(+), 8 deletions(-)
>>
>> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
>> index 9eb191b5c4de..9992bc5f4469 100644
>> --- a/fs/fuse/dev.c
>> +++ b/fs/fuse/dev.c
>> @@ -31,6 +31,8 @@ MODULE_ALIAS("devname:fuse");
>>
>>   static struct kmem_cache *fuse_req_cachep;
>>
>> +static void fuse_request_timeout(struct timer_list *timer);
>> +
>>   static struct fuse_dev *fuse_get_dev(struct file *file)
>>   {
>>          /*
>> @@ -48,6 +50,8 @@ static void fuse_request_init(struct fuse_mount *fm, struct fuse_req *req)
>>          refcount_set(&req->count, 1);
>>          __set_bit(FR_PENDING, &req->flags);
>>          req->fm = fm;
>> +       if (fm->fc->req_timeout)
>> +               timer_setup(&req->timer, fuse_request_timeout, 0);
>>   }
>>
>>   static struct fuse_req *fuse_request_alloc(struct fuse_mount *fm, gfp_t flags)
>> @@ -277,12 +281,15 @@ static void flush_bg_queue(struct fuse_conn *fc)
>>    * the 'end' callback is called if given, else the reference to the
>>    * request is released
>>    */
>> -void fuse_request_end(struct fuse_req *req)
>> +static void do_fuse_request_end(struct fuse_req *req, bool from_timer_callback)
>>   {
>>          struct fuse_mount *fm = req->fm;
>>          struct fuse_conn *fc = fm->fc;
>>          struct fuse_iqueue *fiq = &fc->iq;
>>
>> +       if (from_timer_callback)
>> +               req->out.h.error = -ETIME;
>> +
>>          if (test_and_set_bit(FR_FINISHED, &req->flags))
>>                  goto put_request;
>>
>> @@ -296,8 +303,6 @@ void fuse_request_end(struct fuse_req *req)
>>                  list_del_init(&req->intr_entry);
>>                  spin_unlock(&fiq->lock);
>>          }
>> -       WARN_ON(test_bit(FR_PENDING, &req->flags));
>> -       WARN_ON(test_bit(FR_SENT, &req->flags));
>>          if (test_bit(FR_BACKGROUND, &req->flags)) {
>>                  spin_lock(&fc->bg_lock);
>>                  clear_bit(FR_BACKGROUND, &req->flags);
>> @@ -324,13 +329,105 @@ void fuse_request_end(struct fuse_req *req)
>>                  wake_up(&req->waitq);
>>          }
>>
>> +       if (!from_timer_callback && req->timer.function)
>> +               timer_delete_sync(&req->timer);
>> +
>>          if (test_bit(FR_ASYNC, &req->flags))
>>                  req->args->end(fm, req->args, req->out.h.error);
>>   put_request:
>>          fuse_put_request(req);
>>   }
>> +
>> +void fuse_request_end(struct fuse_req *req)
>> +{
>> +       WARN_ON(test_bit(FR_PENDING, &req->flags));
>> +       WARN_ON(test_bit(FR_SENT, &req->flags));
>> +
>> +       do_fuse_request_end(req, false);
>> +}
>>   EXPORT_SYMBOL_GPL(fuse_request_end);
>>
>> +static void timeout_inflight_req(struct fuse_req *req)
>> +{
>> +       struct fuse_conn *fc = req->fm->fc;
>> +       struct fuse_iqueue *fiq = &fc->iq;
>> +       struct fuse_pqueue *fpq;
>> +
>> +       spin_lock(&fiq->lock);
>> +       fpq = req->fpq;
>> +       spin_unlock(&fiq->lock);
>> +
>> +       /*
>> +        * If fpq has not been set yet, then the request is aborting (which
>> +        * clears FR_PENDING flag) before dev_do_read (which sets req->fpq)
>> +        * has been called. Let the abort handler handle this request.
>> +        */
>> +       if (!fpq)
>> +               return;
>> +
>> +       spin_lock(&fpq->lock);
>> +       if (!fpq->connected || req->out.h.error == -ECONNABORTED) {
>> +               /*
>> +                * Connection is being aborted or the fuse_dev is being released.
>> +                * The abort / release will clean up the request
>> +                */
>> +               spin_unlock(&fpq->lock);
>> +               return;
>> +       }
>> +
>> +       if (!test_bit(FR_PRIVATE, &req->flags))
>> +               list_del_init(&req->list);
>> +
>> +       spin_unlock(&fpq->lock);
>> +
>> +       do_fuse_request_end(req, true);
>> +}
>> +
>> +static void timeout_pending_req(struct fuse_req *req)
>> +{
>> +       struct fuse_conn *fc = req->fm->fc;
>> +       struct fuse_iqueue *fiq = &fc->iq;
>> +       bool background = test_bit(FR_BACKGROUND, &req->flags);
>> +
>> +       if (background)
>> +               spin_lock(&fc->bg_lock);
>> +       spin_lock(&fiq->lock);
>> +
>> +       if (!test_bit(FR_PENDING, &req->flags)) {
>> +               spin_unlock(&fiq->lock);
>> +               if (background)
>> +                       spin_unlock(&fc->bg_lock);
>> +               timeout_inflight_req(req);
>> +               return;
>> +       }
>> +
>> +       if (!test_bit(FR_PRIVATE, &req->flags))
>> +               list_del_init(&req->list);
>> +
>> +       spin_unlock(&fiq->lock);
>> +       if (background)
>> +               spin_unlock(&fc->bg_lock);
>> +
>> +       do_fuse_request_end(req, true);
>> +}
>> +
>> +static void fuse_request_timeout(struct timer_list *timer)
>> +{
>> +       struct fuse_req *req = container_of(timer, struct fuse_req, timer);
>> +
>> +       /*
>> +        * Request reply is being finished by the kernel right now.
>> +        * No need to time out the request.
>> +        */
>> +       if (test_and_set_bit(FR_FINISHING, &req->flags))
>> +               return;
>> +
>> +       if (test_bit(FR_PENDING, &req->flags))
>> +               timeout_pending_req(req);
>> +       else
>> +               timeout_inflight_req(req);
>> +}
>> +
>>   static int queue_interrupt(struct fuse_req *req)
>>   {
>>          struct fuse_iqueue *fiq = &req->fm->fc->iq;
>> @@ -409,7 +506,8 @@ static void request_wait_answer(struct fuse_req *req)
>>
>>   static void __fuse_request_send(struct fuse_req *req)
>>   {
>> -       struct fuse_iqueue *fiq = &req->fm->fc->iq;
>> +       struct fuse_conn *fc = req->fm->fc;
>> +       struct fuse_iqueue *fiq = &fc->iq;
>>
>>          BUG_ON(test_bit(FR_BACKGROUND, &req->flags));
>>          spin_lock(&fiq->lock);
>> @@ -421,6 +519,10 @@ static void __fuse_request_send(struct fuse_req *req)
>>                  /* acquire extra reference, since request is still needed
>>                     after fuse_request_end() */
>>                  __fuse_get_request(req);
>> +               if (req->timer.function) {
>> +                       req->timer.expires = jiffies + fc->req_timeout;
>> +                       add_timer(&req->timer);
>> +               }
>>                  queue_request_and_unlock(fiq, req);
>>
>>                  request_wait_answer(req);
>> @@ -539,6 +641,10 @@ static bool fuse_request_queue_background(struct fuse_req *req)
>>                  if (fc->num_background == fc->max_background)
>>                          fc->blocked = 1;
>>                  list_add_tail(&req->list, &fc->bg_queue);
>> +               if (req->timer.function) {
>> +                       req->timer.expires = jiffies + fc->req_timeout;
>> +                       add_timer(&req->timer);
>> +               }
>>                  flush_bg_queue(fc);
>>                  queued = true;
>>          }
>> @@ -1268,6 +1374,9 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
>>          req = list_entry(fiq->pending.next, struct fuse_req, list);
>>          clear_bit(FR_PENDING, &req->flags);
>>          list_del_init(&req->list);
>> +       /* Acquire a reference in case the timeout handler starts executing */
>> +       __fuse_get_request(req);
>> +       req->fpq = fpq;
>>          spin_unlock(&fiq->lock);
>>
>>          args = req->args;
>> @@ -1280,6 +1389,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
>>                  if (args->opcode == FUSE_SETXATTR)
>>                          req->out.h.error = -E2BIG;
>>                  fuse_request_end(req);
>> +               fuse_put_request(req);
>>                  goto restart;
> 
> While rereading through fuse_dev_do_read, I just realized we also need
> to handle the race condition for the error edge cases (here and in the
> "goto out_end;"), since the timeout handler could have finished
> executing by the time we hit the error edge case. We need to
> test_and_set_bit(FR_FINISHING) so that either the timeout_handler or
> dev_do_read cleans up the request, but not both. I'll fix this for v3.
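
To make the "either one cleans up, but not both" rule concrete, here is a minimal userspace sketch of that arbitration. It is not the patch's code: struct demo_req, DEMO_FINISHING and the demo_* helpers are hypothetical names, and C11 atomic_fetch_or() stands in for the kernel's test_and_set_bit() on FR_FINISHING.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the FR_FINISHING bit in req->flags. */
#define DEMO_FINISHING (1u << 0)

struct demo_req {
	atomic_uint flags;	/* models req->flags */
	int error;		/* models req->out.h.error */
	int cleanups;		/* number of times the request was cleaned up */
};

static void demo_req_init(struct demo_req *req)
{
	atomic_init(&req->flags, 0);
	req->error = 0;
	req->cleanups = 0;
}

/*
 * Atomically set the bit and report whether this caller was the first
 * to set it. Only the first caller owns the cleanup.
 */
static bool demo_claim(struct demo_req *req)
{
	unsigned int old = atomic_fetch_or(&req->flags, DEMO_FINISHING);

	return !(old & DEMO_FINISHING);
}

static void demo_cleanup(struct demo_req *req, int err)
{
	/* Invariant the claim is meant to guarantee: at most one cleanup. */
	assert(req->cleanups == 0);
	req->error = err;
	req->cleanups++;
}

/* Models the timeout handler. */
static void demo_timeout_path(struct demo_req *req)
{
	if (demo_claim(req))
		demo_cleanup(req, -ETIME);
}

/* Models an error edge case in the read path. */
static void demo_read_error_path(struct demo_req *req, int err)
{
	if (demo_claim(req))
		demo_cleanup(req, err);
}
```

Whichever path calls demo_claim() first performs the cleanup and decides the error code; the other path returns without touching the request.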

I know it would change the semantics a bit, but wouldn't it be much
easier and less racy if fuse_dev_do_read() deleted the timer when it
takes a request from fiq->pending and re-armed it (with a fresh timeout)
before it returns the request?

Untested:

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 9992bc5f4469..444f667e2f43 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1379,6 +1379,15 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
         req->fpq = fpq;
         spin_unlock(&fiq->lock);

+       if (req->timer.function) {
+               /* request gets handled, remove the previous timeout */
+               timer_delete_sync(&req->timer);
+               if (test_bit(FR_FINISHED, &req->flags)) {
+                       fuse_put_request(req);
+                       goto restart;
+               }
+       }
+
         args = req->args;
         reqsize = req->in.h.len;

@@ -1433,24 +1442,10 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
         if (test_bit(FR_INTERRUPTED, &req->flags))
                 queue_interrupt(req);

-       /*
-        * Check if the timeout handler is running / ran. If it did, we need to
-        * remove the request from any lists in case the timeout handler finished
-        * before dev_do_read moved the request to the processing list.
-        *
-        * Check FR_SENT to distinguish whether the timeout or the write handler
-        * is finishing the request. However, there can be the case where the
-        * timeout handler and resend handler are running concurrently, so we
-        * need to also check the FR_PENDING bit.
-        */
-       if (test_bit(FR_FINISHING, &req->flags) &&
-           (test_bit(FR_SENT, &req->flags) || test_bit(FR_PENDING, &req->flags))) {
-               spin_lock(&fpq->lock);
-               if (!test_bit(FR_PRIVATE, &req->flags))
-                       list_del_init(&req->list);
-               spin_unlock(&fpq->lock);
-               fuse_put_request(req);
-               return -ETIME;
+       if (req->timer.function) {
+               /* re-arm the request */
+               req->timer.expires = jiffies + fc->req_timeout;
+               add_timer(&req->timer);
         }

         fuse_put_request(req);
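
The intended flow of the diff above can be modelled in a few lines of userspace C. This is only a single-threaded sketch with hypothetical names (struct demo_req and the demo_* helpers): the timer is a plain flag plus a deadline, so the waiting that timer_delete_sync() does for a concurrently running handler is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a request with a timeout timer. */
struct demo_req {
	bool timer_armed;	/* models a pending req->timer */
	unsigned long expires;	/* models req->timer.expires */
	bool finished;		/* models FR_FINISHED */
	int error;		/* models req->out.h.error */
};

/* Models setting timer.expires and calling add_timer(). */
static void demo_arm(struct demo_req *req, unsigned long now,
		     unsigned long timeout)
{
	/* The sketch assumes the timer is never armed twice. */
	assert(!req->timer_armed);
	req->expires = now + timeout;
	req->timer_armed = true;
}

/* Models the timer firing: the request is ended with -ETIME. */
static void demo_tick(struct demo_req *req, unsigned long now)
{
	if (!req->timer_armed || now < req->expires)
		return;
	req->timer_armed = false;
	req->finished = true;
	req->error = -ETIME;
}

/*
 * Models the start of the read path: delete the timer, then check
 * whether it already fired. Returns false if the request has timed
 * out, in which case the caller would drop its reference and restart.
 */
static bool demo_begin_read(struct demo_req *req)
{
	req->timer_armed = false;
	return !req->finished;
}

/* Models the end of the read path: re-arm with a fresh timeout. */
static void demo_end_read(struct demo_req *req, unsigned long now,
			  unsigned long timeout)
{
	demo_arm(req, now, timeout);
}
```

Between demo_begin_read() and demo_end_read() the timer cannot fire in this model, which is the point of the proposal: the read path no longer needs to check for a concurrent timeout while it copies the request out. The cost is the semantic change mentioned above, since the timeout restarts once the request has been handed to the server.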

Thanks,
Bernd
