Re: [PATCH v6 1/2] fuse: add optional kernel-enforced timeout for requests

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Bernd Schubert <bernd.schubert@fastmail.fm>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	linux-fsdevel@vger.kernel.org, josef@toxicpanda.com,
	jefflexu@linux.alibaba.com, laoar.shao@gmail.com,
	kernel-team@meta.com
Subject: Re: [PATCH v6 1/2] fuse: add optional kernel-enforced timeout for requests
Date: Mon, 7 Oct 2024 22:02:01 +0200	[thread overview]
Message-ID: <ebc29d73-ad5a-4fba-b892-1cea7f1b44d0@fastmail.fm> (raw)
In-Reply-To: <CAJnrk1btbP-jDVttuh-skyAQyHR80to+u55g7BANzqW2af_+Qw@mail.gmail.com>



On 10/7/24 20:39, Joanne Koong wrote:
> On Tue, Oct 1, 2024 at 10:03 AM Joanne Koong <joannelkoong@gmail.com> wrote:
>>
>> On Sat, Sep 28, 2024 at 1:43 AM Bernd Schubert
>> <bernd.schubert@fastmail.fm> wrote:
>>>
>>> Hi Joanne,
>>>
>>> On 9/27/24 21:36, Joanne Koong wrote:
>>>> On Mon, Sep 2, 2024 at 3:38 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>>
>>>>> On Fri, 30 Aug 2024 at 18:27, Joanne Koong <joannelkoong@gmail.com> wrote:
>>>>>>
>>>>>> There are situations where fuse servers can become unresponsive or
>>>>>> stuck, for example if the server is in a deadlock. Currently, there's
>>>>>> no good way to detect if a server is stuck and needs to be killed
>>>>>> manually.
>>>>>>
>>>>>> This commit adds an option for enforcing a timeout (in seconds) on
>>>>>> requests where if the timeout elapses without a reply from the server,
>>>>>> the connection will be automatically aborted.
>>>>>
>>>>> Okay.
>>>>>
>>>>> I'm not sure what the overhead (scheduling and memory) of timers, but
>>>>> starting one for each request seems excessive.
>>>>
>>>> I ran some benchmarks on this using the passthrough_ll server and saw
>>>> roughly a 1.5% drop in throughput (from ~775 MiB/s to ~765 MiB/s):
>>>> fio --name randwrite --ioengine=sync --thread --invalidate=1
>>>> --runtime=300 --ramp_time=10 --rw=randwrite --size=1G --numjobs=4
>>>> --bs=4k --alloc-size 98304 --allrandrepeat=1 --randseed=12345
>>>> --group_reporting=1 --directory=/root/fuse_mount
>>>>
>>>> Instead of attaching a timer to each request, I think we can instead
>>>> do the following:
>>>> * add a "start_time" field to each request tracking (in jiffies) when
>>>> the request was started
>>>> * add a new list to the connection that all requests get enqueued
>>>> onto. When the request is completed, it gets dequeued from this list
>>>> * have a timer for the connection that fires off every 10 seconds or
>>>> so. When this timer is fired, it checks if "jiffies > req->start_time
>>>> + fc->req_timeout" against the head of the list to check if the
>>>> timeout has expired and we need to abort the request. We only need to
>>>> check against the head of the list because we know every other request
>>>> after this was started later in time. I think we could even just use
>>>> the fc->lock for this instead of needing a separate lock. In the worst
>>>> case, this grants a 10 second upper bound on the timeout a user
>>>> requests (eg if the user requests 2 minutes, in the worst case the
>>>> timeout would trigger at 2 minutes and 10 seconds).
>>>>
>>>> Also, now that we're aborting the connection entirely on a timeout
>>>> instead of just aborting the request, maybe it makes sense to change
>>>> the timeout granularity to minutes instead of seconds. I'm envisioning
>>>> that this timeout mechanism will mostly be used as a safeguard against
>>>> malicious or buggy servers with a high timeout configured (eg 10
>>>> minutes), and minutes seems like a nicer interface for users than them
>>>> having to convert that to seconds.
>>>>
>>>> Let me know if I've missed anything with this approach but if not,
>>>> then I'll submit v7 with this change.
>>>
>>>
>>> sounds great to me. Just, could we do this per fuse_dev to avoid a
>>> single lock for all cores?
>>>
>>
>> Will do! thanks for the suggestion - in that case, I'll add its own
>> spinlock for it too then.
> 
> I realized while working on v7 that we can't do this per fuse device
> because the request is only associated with a device once it's read in
> by the server (eg fuse_dev_do_read).
> 
> I ran some rough preliminary benchmarks with
> ./libfuse/build/example/passthrough_ll  -o writeback -o max_threads=4
> -o source=~/fstests ~/fuse_mount
> and
> fio --name randwrite --ioengine=sync --thread --invalidate=1
> --runtime=300 --ramp_time=10 --rw=randwrite --size=1G --numjobs=4
> --bs=4k --alloc-size 98304 --allrandrepeat=1 --randseed=12345
> --group_reporting=1 --directory=fuse_mount
> 
> and didn't see any noticeable difference in throughput (~37 MiB/sec on
> my system) with vs without the timeout.
> 


That is not much, isn't your limit the backend? I wonder what would happen
with 25GB/s I'm testing with. Wouldn't it make sense for this to test with 
sequential large IO? And possibly with the passthrough-hp branch that
bypasses IO? And a NUMA system probably would be helpful as well. 
I.e. to test the effect on the kernel side without having an IO limited
system?


With the io-uring interface requests stay in queues from the in-coming CPU
so easier to achieve there, although I wonder for your use-case if it
wouldn't be sufficient to start the timer only when the request is on
the way to fuse-server? One disadvantage I see is that virtiofs would need
to be specially handled.


Thanks,
Bernd

next prev parent reply	other threads:[~2024-10-07 20:02 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-30 16:26 [PATCH v6 0/2] fuse: add timeout option for requests Joanne Koong
2024-08-30 16:26 ` [PATCH v6 1/2] fuse: add optional kernel-enforced timeout " Joanne Koong
2024-09-02 10:37   ` Miklos Szeredi
2024-09-02 10:50     ` Bernd Schubert
2024-09-02 11:11       ` Miklos Szeredi
2024-09-03 17:25     ` Joanne Koong
2024-09-03 22:37       ` Bernd Schubert
2024-09-04 17:23         ` Joanne Koong
2024-09-17 22:00           ` Bernd Schubert
2024-09-18  7:29             ` Miklos Szeredi
2024-09-18  9:12               ` Bernd Schubert
2024-09-27 19:36     ` Joanne Koong
2024-09-28  8:43       ` Bernd Schubert
2024-10-01 17:03         ` Joanne Koong
2024-10-01 17:12           ` Joanne Koong
2024-10-07 18:39           ` Joanne Koong
2024-10-07 20:02             ` Bernd Schubert [this message]
2024-10-08 16:26               ` Joanne Koong
2024-10-08 19:00                 ` Joanne Koong
2024-08-30 16:26 ` [PATCH v6 2/2] fuse: add default_request_timeout and max_request_timeout sysctls Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ebc29d73-ad5a-4fba-b892-1cea7f1b44d0@fastmail.fm \
    --to=bernd.schubert@fastmail.fm \
    --cc=jefflexu@linux.alibaba.com \
    --cc=joannelkoong@gmail.com \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@meta.com \
    --cc=laoar.shao@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).