linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joanne Koong <joannelkoong@gmail.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: miklos@szeredi.hu, linux-fsdevel@vger.kernel.org,
	josef@toxicpanda.com,  bernd.schubert@fastmail.fm,
	kernel-team@meta.com
Subject: Re: [PATCH v2 0/2] fuse: add timeout option for requests
Date: Wed, 31 Jul 2024 10:52:08 -0700	[thread overview]
Message-ID: <CAJnrk1ZGR_a6=GHExrAeQN339++R_rcFqtiRrQ0AS4btr4WDLQ@mail.gmail.com> (raw)
In-Reply-To: <CALOAHbCWQOw6Hj6+zEiivRtfd4haqO+Q8KZQj2OPpsJ3M2=3AA@mail.gmail.com>

On Tue, Jul 30, 2024 at 7:14 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Wed, Jul 31, 2024 at 2:16 AM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Mon, Jul 29, 2024 at 11:00 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > On Tue, Jul 30, 2024 at 8:28 AM Joanne Koong <joannelkoong@gmail.com> wrote:
> > > >
> > > > There are situations where fuse servers can become unresponsive or take
> > > > too long to reply to a request. Currently there is no upper bound on
> > > > how long a request may take, which may be frustrating to users who get
> > > > stuck waiting for a request to complete.
> > > >
> > > > This patchset adds a timeout option for requests and two dynamically
> > > > configurable fuse sysctls "default_request_timeout" and "max_request_timeout"
> > > > for controlling/enforcing timeout behavior system-wide.
> > > >
> > > > Existing fuse servers will not be affected unless they explicitly opt into the
> > > > timeout.
> > > >
> > > > v1: https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/
> > > > Changes from v1:
> > > > - Add timeout for background requests
> > > > - Handle resend race condition
> > > > - Add sysctls
> > > >
> > > > Joanne Koong (2):
> > > >   fuse: add optional kernel-enforced timeout for requests
> > > >   fuse: add default_request_timeout and max_request_timeout sysctls
> > > >
> > > >  Documentation/admin-guide/sysctl/fs.rst |  17 +++
> > > >  fs/fuse/Makefile                        |   2 +-
> > > >  fs/fuse/dev.c                           | 187 +++++++++++++++++++++++-
> > > >  fs/fuse/fuse_i.h                        |  30 ++++
> > > >  fs/fuse/inode.c                         |  24 +++
> > > >  fs/fuse/sysctl.c                        |  42 ++++++
> > > >  6 files changed, 293 insertions(+), 9 deletions(-)
> > > >  create mode 100644 fs/fuse/sysctl.c
> > > >
> > > > --
> > > > 2.43.0
> > > >
> > >
> > > Hello Joanne,
> > >
> > > Thanks for your update.
> > >
> > > I have tested your patches using my test case, which is similar to the
> > > hello-fuse [0] example, with an additional change as follows:
> > >
> > > @@ -125,6 +125,8 @@ static int hello_read(const char *path, char *buf,
> > > size_t size, off_t offset,
> > >         } else
> > >                 size = 0;
> > >
> > > +       // TO trigger timeout
> > > +       sleep(60);
> > >         return size;
> > >  }
> > >
> > > [0] https://github.com/libfuse/libfuse/blob/master/example/hello.c
> > >
> > > However, it triggered a crash with the following setup:
> > >
> > > 1. Set FUSE timeout:
> > >   sysctl -w fs.fuse.default_request_timeout=10
> > >   sysctl -w fs.fuse.max_request_timeout = 20
> > >
> > > 2. Start FUSE daemon:
> > >   ./hello /tmp/fuse
> > >
> > > 3. Read from FUSE:
> > >   cat /tmp/fuse/hello
> > >
> > > 4. Kill the process within 10 seconds (to avoid the timeout being triggered).
> > >    Then the crash will be triggered.
> >
> > Hi Yafang,
> >
> > Thanks for trying this out on your use case!
> >
> > How consistently are you able to repro this?
>
> It triggers the crash every time.
>
> > I tried reproing using
> > your instructions above but I'm not able to get the crash.
>
> Please note that it is the `cat /tmp/fuse/hello` process that was
> killed, not the fuse daemon.
> The crash seems to occur when the fuse daemon wakes up after
> sleep(60). Please ensure that the fuse daemon can be woken up.
>

I'm still not able to trigger the crash by killing the `cat
/tmp/fuse/hello` process. This is how I'm repro-ing

1) Add sleep to test code in
https://github.com/libfuse/libfuse/blob/master/example/hello.c
@@ -125,6 +126,9 @@ static int hello_read(const char *path, char *buf,
size_t size, off_t offset,
        } else
                size = 0;

+       sleep(60);
+       printf("hello_read woke up from sleep\n");
+
        return size;
 }

2)  Set fuse timeout to 10 seconds
sysctl -w fs.fuse.default_request_timeout=10

3) Start fuse daemon
./example/hello ./tmp/fuse

4) Read from fuse
cat /tmp/fuse/hello

5) Get pid of cat process
top -b | grep cat

6) Kill cat process (within 10 seconds)
 sudo kill -9 <cat-pid>

7) Wait 60 seconds for fuse's read request to complete

From what it sounds like, this is exactly what you are doing as well?

I added some kernel-side logs and I'm seeing that the read request is
timing out after ~10 seconds and handled by the timeout handler
successfully.

On the fuse daemon side, these are the logs I'm seeing from the above repro:
./example/hello /tmp/fuse -f -d

FUSE library version: 3.17.0
nullpath_ok: 0
unique: 2, opcode: INIT (26), nodeid: 0, insize: 104, pid: 0
INIT: 7.40
flags=0x73fffffb
max_readahead=0x00020000
   INIT: 7.40
   flags=0x4040f039
   max_readahead=0x00020000
   max_write=0x00100000
   max_background=0
   congestion_threshold=0
   time_gran=1
   unique: 2, success, outsize: 80
unique: 4, opcode: LOOKUP (1), nodeid: 1, insize: 46, pid: 673
LOOKUP /hello
getattr[NULL] /hello
   NODEID: 2
   unique: 4, success, outsize: 144
unique: 6, opcode: OPEN (14), nodeid: 2, insize: 48, pid: 673
open flags: 0x8000 /hello
   open[0] flags: 0x8000 /hello
   unique: 6, success, outsize: 32
unique: 8, opcode: READ (15), nodeid: 2, insize: 80, pid: 673
read[0] 4096 bytes from 0 flags: 0x8000
unique: 10, opcode: FLUSH (25), nodeid: 2, insize: 64, pid: 673
   unique: 10, error: -38 (Function not implemented), outsize: 16
unique: 11, opcode: INTERRUPT (36), nodeid: 0, insize: 48, pid: 0
FUSE_INTERRUPT: reply to kernel to disable interrupt
   unique: 11, error: -38 (Function not implemented), outsize: 16

unique: 12, opcode: RELEASE (18), nodeid: 2, insize: 64, pid: 0
   unique: 12, success, outsize: 16

hello_read woke up from sleep
   read[0] 13 bytes from 0
   unique: 8, success, outsize: 29


Are these the debug logs you are seeing from the daemon side as well?

Thanks,
Joanne
> >
> > From the crash logs you provided below, it looks like what's happening
> > is that if the process gets killed, the timer isn't getting deleted.
> > I'll look more into what happens in fuse when a process is killed and
> > get back to you on this.
>
> Thanks
>
> --
> Regards
> Yafang

  reply	other threads:[~2024-07-31 17:52 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-30  0:23 [PATCH v2 0/2] fuse: add timeout option for requests Joanne Koong
2024-07-30  0:23 ` [PATCH v2 1/2] fuse: add optional kernel-enforced timeout " Joanne Koong
2024-08-04 22:46   ` Bernd Schubert
2024-08-05  4:45     ` Joanne Koong
2024-08-05 13:05       ` Bernd Schubert
2024-08-05  4:52   ` Joanne Koong
2024-08-05 13:26     ` Bernd Schubert
2024-08-05 22:10       ` Joanne Koong
2024-08-06 15:43         ` Bernd Schubert
2024-08-06 17:08           ` Joanne Koong
2024-08-05  7:32   ` Jingbo Xu
2024-08-05 22:53     ` Joanne Koong
2024-08-06  2:45       ` Jingbo Xu
2024-08-06 16:43         ` Joanne Koong
2024-08-06 15:50       ` Bernd Schubert
2024-07-30  0:23 ` [PATCH v2 2/2] fuse: add default_request_timeout and max_request_timeout sysctls Joanne Koong
2024-07-30  7:49   ` kernel test robot
2024-07-30  9:14   ` kernel test robot
2024-08-05  7:38   ` Jingbo Xu
2024-08-06  1:26     ` Joanne Koong
2024-07-30  5:59 ` [PATCH v2 0/2] fuse: add timeout option for requests Yafang Shao
2024-07-30 18:16   ` Joanne Koong
2024-07-31  2:13     ` Yafang Shao
2024-07-31 17:52       ` Joanne Koong [this message]
2024-07-31 18:46         ` Joanne Koong
2024-08-01  2:47           ` Yafang Shao
2024-08-02 19:05             ` Joanne Koong
2024-08-04  7:46               ` Yafang Shao
2024-08-05  5:05                 ` Joanne Koong
2024-08-06 16:23                   ` Joanne Koong
2024-08-06 17:11                     ` Bernd Schubert
2024-08-06 18:26                       ` Joanne Koong
2024-08-06 18:37                         ` Joanne Koong
2024-08-06 20:08                           ` Bernd Schubert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJnrk1ZGR_a6=GHExrAeQN339++R_rcFqtiRrQ0AS4btr4WDLQ@mail.gmail.com' \
    --to=joannelkoong@gmail.com \
    --cc=bernd.schubert@fastmail.fm \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@meta.com \
    --cc=laoar.shao@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).