From: Bernd Schubert <bernd.schubert@fastmail.fm>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
linux-fsdevel@vger.kernel.org, josef@toxicpanda.com,
jefflexu@linux.alibaba.com, laoar.shao@gmail.com,
kernel-team@meta.com
Subject: Re: [PATCH v6 1/2] fuse: add optional kernel-enforced timeout for requests
Date: Mon, 7 Oct 2024 22:02:01 +0200 [thread overview]
Message-ID: <ebc29d73-ad5a-4fba-b892-1cea7f1b44d0@fastmail.fm> (raw)
In-Reply-To: <CAJnrk1btbP-jDVttuh-skyAQyHR80to+u55g7BANzqW2af_+Qw@mail.gmail.com>
On 10/7/24 20:39, Joanne Koong wrote:
> On Tue, Oct 1, 2024 at 10:03 AM Joanne Koong <joannelkoong@gmail.com> wrote:
>>
>> On Sat, Sep 28, 2024 at 1:43 AM Bernd Schubert
>> <bernd.schubert@fastmail.fm> wrote:
>>>
>>> Hi Joanne,
>>>
>>> On 9/27/24 21:36, Joanne Koong wrote:
>>>> On Mon, Sep 2, 2024 at 3:38 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>>
>>>>> On Fri, 30 Aug 2024 at 18:27, Joanne Koong <joannelkoong@gmail.com> wrote:
>>>>>>
>>>>>> There are situations where fuse servers can become unresponsive or
>>>>>> stuck, for example if the server is in a deadlock. Currently, there's
>>>>>> no good way to detect if a server is stuck and needs to be killed
>>>>>> manually.
>>>>>>
>>>>>> This commit adds an option for enforcing a timeout (in seconds) on
>>>>>> requests where if the timeout elapses without a reply from the server,
>>>>>> the connection will be automatically aborted.
>>>>>
>>>>> Okay.
>>>>>
>>>>> I'm not sure what the overhead (scheduling and memory) of timers, but
>>>>> starting one for each request seems excessive.
>>>>
>>>> I ran some benchmarks on this using the passthrough_ll server and saw
>>>> roughly a 1.5% drop in throughput (from ~775 MiB/s to ~765 MiB/s):
>>>> fio --name randwrite --ioengine=sync --thread --invalidate=1
>>>> --runtime=300 --ramp_time=10 --rw=randwrite --size=1G --numjobs=4
>>>> --bs=4k --alloc-size 98304 --allrandrepeat=1 --randseed=12345
>>>> --group_reporting=1 --directory=/root/fuse_mount
>>>>
>>>> Instead of attaching a timer to each request, I think we can instead
>>>> do the following:
>>>> * add a "start_time" field to each request tracking (in jiffies) when
>>>> the request was started
>>>> * add a new list to the connection that all requests get enqueued
>>>> onto. When the request is completed, it gets dequeued from this list
>>>> * have a timer for the connection that fires off every 10 seconds or
>>>> so. When this timer is fired, it checks if "jiffies > req->start_time
>>>> + fc->req_timeout" against the head of the list to check if the
>>>> timeout has expired and we need to abort the request. We only need to
>>>> check against the head of the list because we know every other request
>>>> after this was started later in time. I think we could even just use
>>>> the fc->lock for this instead of needing a separate lock. In the worst
>>>> case, this grants a 10 second upper bound on the timeout a user
>>>> requests (eg if the user requests 2 minutes, in the worst case the
>>>> timeout would trigger at 2 minutes and 10 seconds).
>>>>
>>>> Also, now that we're aborting the connection entirely on a timeout
>>>> instead of just aborting the request, maybe it makes sense to change
>>>> the timeout granularity to minutes instead of seconds. I'm envisioning
>>>> that this timeout mechanism will mostly be used as a safeguard against
>>>> malicious or buggy servers with a high timeout configured (eg 10
>>>> minutes), and minutes seems like a nicer interface for users than them
>>>> having to convert that to seconds.
>>>>
>>>> Let me know if I've missed anything with this approach but if not,
>>>> then I'll submit v7 with this change.
>>>
>>>
>>> sounds great to me. Just, could we do this per fuse_dev to avoid a
>>> single lock for all cores?
>>>
>>
>> Will do! thanks for the suggestion - in that case, I'll add its own
>> spinlock for it too then.
>
> I realized while working on v7 that we can't do this per fuse device
> because the request is only associated with a device once it's read in
> by the server (eg fuse_dev_do_read).
>
> I ran some rough preliminary benchmarks with
> ./libfuse/build/example/passthrough_ll -o writeback -o max_threads=4
> -o source=~/fstests ~/fuse_mount
> and
> fio --name randwrite --ioengine=sync --thread --invalidate=1
> --runtime=300 --ramp_time=10 --rw=randwrite --size=1G --numjobs=4
> --bs=4k --alloc-size 98304 --allrandrepeat=1 --randseed=12345
> --group_reporting=1 --directory=fuse_mount
>
> and didn't see any noticeable difference in throughput (~37 MiB/sec on
> my system) with vs without the timeout.
>
That is not much, isn't your limit the backend? I wonder what would happen
with 25GB/s I'm testing with. Wouldn't it make sense for this to test with
sequential large IO? And possibly with the passthrough-hp branch that
bypasses IO? And a NUMA system probably would be helpful as well.
I.e. to test the effect on the kernel side without having an IO limited
system?
With the io-uring interface requests stay in queues from the in-coming CPU
so easier to achieve there, although I wonder for your use-case if it
wouldn't be sufficient to start the timer only when the request is on
the way to fuse-server? One disadvantage I see is that virtiofs would need
to be specially handled.
Thanks,
Bernd
next prev parent reply other threads:[~2024-10-07 20:02 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-30 16:26 [PATCH v6 0/2] fuse: add timeout option for requests Joanne Koong
2024-08-30 16:26 ` [PATCH v6 1/2] fuse: add optional kernel-enforced timeout " Joanne Koong
2024-09-02 10:37 ` Miklos Szeredi
2024-09-02 10:50 ` Bernd Schubert
2024-09-02 11:11 ` Miklos Szeredi
2024-09-03 17:25 ` Joanne Koong
2024-09-03 22:37 ` Bernd Schubert
2024-09-04 17:23 ` Joanne Koong
2024-09-17 22:00 ` Bernd Schubert
2024-09-18 7:29 ` Miklos Szeredi
2024-09-18 9:12 ` Bernd Schubert
2024-09-27 19:36 ` Joanne Koong
2024-09-28 8:43 ` Bernd Schubert
2024-10-01 17:03 ` Joanne Koong
2024-10-01 17:12 ` Joanne Koong
2024-10-07 18:39 ` Joanne Koong
2024-10-07 20:02 ` Bernd Schubert [this message]
2024-10-08 16:26 ` Joanne Koong
2024-10-08 19:00 ` Joanne Koong
2024-08-30 16:26 ` [PATCH v6 2/2] fuse: add default_request_timeout and max_request_timeout sysctls Joanne Koong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ebc29d73-ad5a-4fba-b892-1cea7f1b44d0@fastmail.fm \
--to=bernd.schubert@fastmail.fm \
--cc=jefflexu@linux.alibaba.com \
--cc=joannelkoong@gmail.com \
--cc=josef@toxicpanda.com \
--cc=kernel-team@meta.com \
--cc=laoar.shao@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=miklos@szeredi.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).