From: Yafang Shao <laoar.shao@gmail.com>
To: miklos@szeredi.hu
Cc: linux-fsdevel@vger.kernel.org, Yafang Shao <laoar.shao@gmail.com>
Subject: [RFC PATCH 2/2] fuse: Enhance each fuse connection with timeout support
Date: Wed, 24 Jul 2024 15:11:56 +0800
Message-ID: <20240724071156.97188-3-laoar.shao@gmail.com>
In-Reply-To: <20240724071156.97188-1-laoar.shao@gmail.com>
When the HDFS server behind fuse.hdfs runs into trouble, the fuse.hdfs
daemon, which forwards requests to the HDFS server, can block
indefinitely. Every access to the fuse.hdfs mount then hangs as well.
The current workaround is to abort the fuse connection by hand, which
cannot recover from a stuck connection automatically. This patch adds a
per-connection timeout: a request that receives no reply within the
timeout fails with -EAGAIN instead of waiting forever.
The timeout is configurable by the user so that it can be tuned to the
workload. A value of 0 keeps the existing behavior of waiting without a
limit.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
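Note for readers, not part of the commit message: the sketch below is a
userspace model of the return-value handling this patch adds to
request_wait_answer(). It assumes fc->timeout is expressed in seconds
(inferred from the multiplication by HZ) and relies on the documented
behavior of wait_event_interruptible_timeout() and
wait_event_killable_timeout(): a positive return means the condition
became true, 0 means the timeout elapsed, and a negative value means a
signal arrived. The names wait_outcome, timeout_to_ticks() and
classify_wait() are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Hypothetical labels for this sketch; these are not kernel names. */
enum wait_outcome {
	WAIT_FINISHED,    /* FR_FINISHED was set before the timeout expired */
	WAIT_TIMED_OUT,   /* timeout elapsed; the patch sets -EAGAIN here */
	WAIT_INTERRUPTED, /* a signal arrived; the existing interrupt path runs */
};

/*
 * Mirror of "(long)fc->timeout * HZ": convert a timeout in seconds
 * (an assumption about the unit) into scheduler ticks.
 */
static long timeout_to_ticks(unsigned int timeout_secs, long hz)
{
	assert(hz > 0);
	return (long)timeout_secs * hz;
}

/*
 * Classify the return value of a wait_event_*_timeout() call the same
 * way the patched request_wait_answer() does.
 */
static enum wait_outcome classify_wait(long ret)
{
	if (ret > 0)
		return WAIT_FINISHED;
	if (ret == 0)
		return WAIT_TIMED_OUT;
	return WAIT_INTERRUPTED;
}
```

For example, with a 30 second timeout and HZ=250, timeout_to_ticks(30, 250)
yields 7500 ticks, and classify_wait(0) reports the timed-out case that the
patch turns into -EAGAIN.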
fs/fuse/dev.c | 57 +++++++++++++++++++++++++++++++++++++++++-------
fs/fuse/fuse_i.h | 2 ++
2 files changed, 51 insertions(+), 8 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 9eb191b5c4de..ff9c55bcfb3d 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -369,10 +369,27 @@ static void request_wait_answer(struct fuse_req *req)
if (!fc->no_interrupt) {
/* Any signal may interrupt this */
- err = wait_event_interruptible(req->waitq,
- test_bit(FR_FINISHED, &req->flags));
- if (!err)
- return;
+ if (!fc->timeout) {
+ err = wait_event_interruptible(req->waitq,
+ test_bit(FR_FINISHED, &req->flags));
+ if (!err)
+ return;
+ } else {
+ err = wait_event_interruptible_timeout(req->waitq,
+ test_bit(FR_FINISHED, &req->flags),
+ (long)fc->timeout * HZ);
+ if (err > 0)
+ return;
+
+ /* timeout */
+ if (!err) {
+ req->out.h.error = -EAGAIN;
+ set_bit(FR_TIMEOUT, &req->flags);
+ /* matches barrier in fuse_dev_do_write() */
+ smp_mb__after_atomic();
+ return;
+ }
+ }
set_bit(FR_INTERRUPTED, &req->flags);
/* matches barrier in fuse_dev_do_read() */
@@ -383,10 +400,27 @@ static void request_wait_answer(struct fuse_req *req)
if (!test_bit(FR_FORCE, &req->flags)) {
/* Only fatal signals may interrupt this */
- err = wait_event_killable(req->waitq,
- test_bit(FR_FINISHED, &req->flags));
- if (!err)
- return;
+ if (!fc->timeout) {
+ err = wait_event_killable(req->waitq,
+ test_bit(FR_FINISHED, &req->flags));
+ if (!err)
+ return;
+ } else {
+ err = wait_event_killable_timeout(req->waitq,
+ test_bit(FR_FINISHED, &req->flags),
+ (long)fc->timeout * HZ);
+ if (err > 0)
+ return;
+
+ /* timeout */
+ if (!err) {
+ req->out.h.error = -EAGAIN;
+ set_bit(FR_TIMEOUT, &req->flags);
+ /* matches barrier in fuse_dev_do_write() */
+ smp_mb__after_atomic();
+ return;
+ }
+ }
spin_lock(&fiq->lock);
/* Request is not yet in userspace, bail out */
@@ -1951,6 +1985,13 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
goto copy_finish;
}
+ /* matches barrier in request_wait_answer() */
+ smp_mb__after_atomic();
+ if (test_and_clear_bit(FR_TIMEOUT, &req->flags)) {
+ spin_unlock(&fpq->lock);
+ goto copy_finish;
+ }
+
/* Is it an interrupt reply ID? */
if (oh.unique & FUSE_INT_REQ_BIT) {
__fuse_get_request(req);
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 367601bf7285..c1467eb8c2e9 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -375,6 +375,7 @@ struct fuse_io_priv {
* FR_FINISHED: request is finished
* FR_PRIVATE: request is on private list
* FR_ASYNC: request is asynchronous
+ * FR_TIMEOUT: request has timed out
*/
enum fuse_req_flag {
FR_ISREPLY,
@@ -389,6 +390,7 @@ enum fuse_req_flag {
FR_FINISHED,
FR_PRIVATE,
FR_ASYNC,
+ FR_TIMEOUT,
};
/**
--
2.43.5