linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bernd Schubert <bernd.schubert@fastmail.fm>
To: Joanne Koong <joannelkoong@gmail.com>,
	Miklos Szeredi <miklos@szeredi.hu>
Cc: linux-fsdevel@vger.kernel.org, josef@toxicpanda.com,
	jefflexu@linux.alibaba.com, laoar.shao@gmail.com,
	kernel-team@meta.com
Subject: Re: [PATCH v6 1/2] fuse: add optional kernel-enforced timeout for requests
Date: Wed, 4 Sep 2024 00:37:58 +0200	[thread overview]
Message-ID: <02b45c36-b64c-4b7c-9148-55cbd06cc07b@fastmail.fm> (raw)
In-Reply-To: <CAJnrk1ZSEk+GuC1kvNS_Cu9u7UsoFW+vd2xOsrbL5i_GNAoEkQ@mail.gmail.com>



On 9/3/24 19:25, Joanne Koong wrote:
> On Mon, Sep 2, 2024 at 3:38 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>> On Fri, 30 Aug 2024 at 18:27, Joanne Koong <joannelkoong@gmail.com> wrote:
>>>
>>> There are situations where fuse servers can become unresponsive or
>>> stuck, for example if the server is in a deadlock. Currently, there's
>>> no good way to detect if a server is stuck and needs to be killed
>>> manually.
>>>
>>> This commit adds an option for enforcing a timeout (in seconds) on
>>> requests where if the timeout elapses without a reply from the server,
>>> the connection will be automatically aborted.
>>
>> Okay.
>>
>> I'm not sure what the overhead (scheduling and memory) of timers, but
>> starting one for each request seems excessive.
>>
>> Can we make the timeout per-connection instead of per request?
>>
>> I.e. When the first request is sent, the timer is started. When a
>> reply is received but there are still outstanding requests, the timer
>> is reset.  When the last reply is received, the timer is stopped.
>>
>> This should handle the frozen server case just as well.  It may not
>> perfectly handle the case when the server is still alive but for some
>> reason one or more requests get stuck, while others are still being
>> processed.   The latter case is unlikely to be an issue in practice,
>> IMO.
> 
> In that case, if the timeout is per-connection instead of per-request
> and we're not stringent about some requests getting stuck, maybe it
> makes more sense to just do this in userspace (libfuse) then? That
> seems pretty simple with having a watchdog thread that periodically
> (according to whatever specified timeout) checks if the number of
> requests serviced is increasing when
> /sys/fs/fuse/connections/*/waiting is non-zero.
> 
> If there are multiple server threads (eg libfuse's fuse_loop_mt
> interface) and say, all of them are deadlocked except for 1 that is
> actively servicing requests, then this wouldn't catch that case, but
> even if this per-connection timeout was enforced in the kernel
> instead, it wouldn't catch that case either.
> 
> So maybe this logic should just be moved to libfuse then? For this
> we'd need to pass the connection's device id (fc->dev) to userspace
> which i don't think we currently do, but that seems pretty simple. The
> one downside I see is that this doesn't let sysadmins enforce an
> automatic system-wide "max timeout" against malicious fuse servers but
> if we are having the timeout be per-connection instead of per-request,
> then a malicious server could still be malicious anyways (eg
> deadlocking all threads except for 1).
> 
> Curious to hear what your and Bernrd's thoughts on this are.


I have question here, does it need to be an exact timeout or could it be
an interval/epoch? Let's say you timeout based on epoch lists? Example

1) epoch-a starts, requests are added to epoch-a list.
2) epoch-b starts, epoch-a list should get empty
3) epoch-c starts, epoch-b list should get empty, kill the connection if
epoch-a list is not empty (epoch-c list should not be needed, as epoch-a
list can be used, once confirmed it is empty.)


Here timeout would be epoch-a + epoch-b, i.e.
max-timeout <= 2 * epoch-time.
We could have more epochs/list-heads to make it more fine grained.


From my point of view that should be a rather cheap, as it just
adding/removing requests from list and checking for timeout if a list is
empty. With the caveat that it is not precise anymore.

That could be implemented in kernel and/or libfuse?


Thanks,
Bernd

  reply	other threads:[~2024-09-03 22:38 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-30 16:26 [PATCH v6 0/2] fuse: add timeout option for requests Joanne Koong
2024-08-30 16:26 ` [PATCH v6 1/2] fuse: add optional kernel-enforced timeout " Joanne Koong
2024-09-02 10:37   ` Miklos Szeredi
2024-09-02 10:50     ` Bernd Schubert
2024-09-02 11:11       ` Miklos Szeredi
2024-09-03 17:25     ` Joanne Koong
2024-09-03 22:37       ` Bernd Schubert [this message]
2024-09-04 17:23         ` Joanne Koong
2024-09-17 22:00           ` Bernd Schubert
2024-09-18  7:29             ` Miklos Szeredi
2024-09-18  9:12               ` Bernd Schubert
2024-09-27 19:36     ` Joanne Koong
2024-09-28  8:43       ` Bernd Schubert
2024-10-01 17:03         ` Joanne Koong
2024-10-01 17:12           ` Joanne Koong
2024-10-07 18:39           ` Joanne Koong
2024-10-07 20:02             ` Bernd Schubert
2024-10-08 16:26               ` Joanne Koong
2024-10-08 19:00                 ` Joanne Koong
2024-08-30 16:26 ` [PATCH v6 2/2] fuse: add default_request_timeout and max_request_timeout sysctls Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=02b45c36-b64c-4b7c-9148-55cbd06cc07b@fastmail.fm \
    --to=bernd.schubert@fastmail.fm \
    --cc=jefflexu@linux.alibaba.com \
    --cc=joannelkoong@gmail.com \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@meta.com \
    --cc=laoar.shao@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).