* [PATCH v8 0/3] fuse: add kernel-enforced request timeout option
@ 2024-10-11 19:13 Joanne Koong
2024-10-11 19:13 ` [PATCH v8 1/3] fs_parser: add fsparam_u16 helper Joanne Koong
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Joanne Koong @ 2024-10-11 19:13 UTC (permalink / raw)
To: miklos, linux-fsdevel
Cc: josef, bernd.schubert, jefflexu, laoar.shao, kernel-team
There are situations where fuse servers can become unresponsive or
stuck, for example if the server is in a deadlock. Currently, there's
no good way to detect if a server is stuck and needs to be killed
manually.
This patchset adds a timeout option where if the server does not reply to a
request by the time the timeout elapses, the connection will be aborted.
This patchset also adds two dynamically configurable fuse sysctls
"default_request_timeout" and "max_request_timeout" for controlling/enforcing
timeout behavior system-wide.
Existing systems running fuse servers will not be affected unless they
explicitly opt into the timeout.
v7:
https://lore.kernel.org/linux-fsdevel/20241007184258.2837492-1-joannelkoong@gmail.com/
Changes from v7 -> v8:
* Use existing lists for checking expirations (Miklos)
v6:
https://lore.kernel.org/linux-fsdevel/20240830162649.3849586-1-joannelkoong@gmail.com/
Changes from v6 -> v7:
- Make timer per-connection instead of per-request (Miklos)
- Make default granularity of time minutes instead of seconds
- Removed the reviewed-bys since the interface of this has changed (now
minutes, instead of seconds)
v5:
https://lore.kernel.org/linux-fsdevel/20240826203234.4079338-1-joannelkoong@gmail.com/
Changes from v5 -> v6:
- Gate sysctl.o behind CONFIG_SYSCTL in makefile (kernel test robot)
- Reword/clarify last sentence in cover letter (Miklos)
v4:
https://lore.kernel.org/linux-fsdevel/20240813232241.2369855-1-joannelkoong@gmail.com/
Changes from v4 -> v5:
- Change timeout behavior from aborting request to aborting connection
(Miklos)
- Clarify wording for sysctl documentation (Jingbo)
v3:
https://lore.kernel.org/linux-fsdevel/20240808190110.3188039-1-joannelkoong@gmail.com/
Changes from v3 -> v4:
- Fix wording on some comments to make it more clear
- Use simpler logic for timer (eg remove extra if checks, use mod timer API)
(Josef)
- Sanity-check should be on FR_FINISHING not FR_FINISHED (Jingbo)
- Fix comment for "processing queue", add req->fpq = NULL safeguard (Bernd)
v2:
https://lore.kernel.org/linux-fsdevel/20240730002348.3431931-1-joannelkoong@gmail.com/
Changes from v2 -> v3:
- Disarm / rearm timer in dev_do_read to handle race conditions (Bernrd)
- Disarm timer in error handling for fatal interrupt (Yafang)
- Clean up do_fuse_request_end (Jingbo)
- Add timer for notify retrieve requests
- Fix kernel test robot errors for #define no-op functions
v1:
https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/
Changes from v1 -> v2:
- Add timeout for background requests
- Handle resend race condition
- Add sysctls
Joanne Koong (3):
fs_parser: add fsparam_u16 helper
fuse: add optional kernel-enforced timeout for requests
fuse: add default_request_timeout and max_request_timeout sysctls
Documentation/admin-guide/sysctl/fs.rst | 27 +++++++++
fs/fs_parser.c | 14 +++++
fs/fuse/dev.c | 80 +++++++++++++++++++++++++
fs/fuse/fuse_i.h | 31 ++++++++++
fs/fuse/inode.c | 33 ++++++++++
fs/fuse/sysctl.c | 20 +++++++
include/linux/fs_parser.h | 9 ++-
7 files changed, 211 insertions(+), 3 deletions(-)
--
2.43.5
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v8 1/3] fs_parser: add fsparam_u16 helper
2024-10-11 19:13 [PATCH v8 0/3] fuse: add kernel-enforced request timeout option Joanne Koong
@ 2024-10-11 19:13 ` Joanne Koong
2024-10-29 20:58 ` Bernd Schubert
2024-10-11 19:13 ` [PATCH v8 2/3] fuse: add optional kernel-enforced timeout for requests Joanne Koong
` (2 subsequent siblings)
3 siblings, 1 reply; 12+ messages in thread
From: Joanne Koong @ 2024-10-11 19:13 UTC (permalink / raw)
To: miklos, linux-fsdevel
Cc: josef, bernd.schubert, jefflexu, laoar.shao, kernel-team
Add a fsparam helper for unsigned 16 bit values.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
fs/fs_parser.c | 14 ++++++++++++++
include/linux/fs_parser.h | 9 ++++++---
2 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/fs/fs_parser.c b/fs/fs_parser.c
index 24727ec34e5a..0e06f9618c89 100644
--- a/fs/fs_parser.c
+++ b/fs/fs_parser.c
@@ -210,6 +210,20 @@ int fs_param_is_bool(struct p_log *log, const struct fs_parameter_spec *p,
}
EXPORT_SYMBOL(fs_param_is_bool);
+int fs_param_is_u16(struct p_log *log, const struct fs_parameter_spec *p,
+ struct fs_parameter *param, struct fs_parse_result *result)
+{
+ int base = (unsigned long)p->data;
+ if (param->type != fs_value_is_string)
+ return fs_param_bad_value(log, param);
+ if (!*param->string && (p->flags & fs_param_can_be_empty))
+ return 0;
+ if (kstrtou16(param->string, base, &result->uint_16) < 0)
+ return fs_param_bad_value(log, param);
+ return 0;
+}
+EXPORT_SYMBOL(fs_param_is_u16);
+
int fs_param_is_u32(struct p_log *log, const struct fs_parameter_spec *p,
struct fs_parameter *param, struct fs_parse_result *result)
{
diff --git a/include/linux/fs_parser.h b/include/linux/fs_parser.h
index 6cf713a7e6c6..1c940756300c 100644
--- a/include/linux/fs_parser.h
+++ b/include/linux/fs_parser.h
@@ -26,9 +26,10 @@ typedef int fs_param_type(struct p_log *,
/*
* The type of parameter expected.
*/
-fs_param_type fs_param_is_bool, fs_param_is_u32, fs_param_is_s32, fs_param_is_u64,
- fs_param_is_enum, fs_param_is_string, fs_param_is_blob, fs_param_is_blockdev,
- fs_param_is_path, fs_param_is_fd, fs_param_is_uid, fs_param_is_gid;
+fs_param_type fs_param_is_bool, fs_param_is_u16, fs_param_is_u32, fs_param_is_s32,
+ fs_param_is_u64, fs_param_is_enum, fs_param_is_string, fs_param_is_blob,
+ fs_param_is_blockdev, fs_param_is_path, fs_param_is_fd, fs_param_is_uid,
+ fs_param_is_gid;
/*
* Specification of the type of value a parameter wants.
@@ -55,6 +56,7 @@ struct fs_parse_result {
union {
bool boolean; /* For spec_bool */
int int_32; /* For spec_s32/spec_enum */
+ u16 uint_16; /* For spec_u16 */
unsigned int uint_32; /* For spec_u32{,_octal,_hex}/spec_enum */
u64 uint_64; /* For spec_u64 */
kuid_t uid;
@@ -119,6 +121,7 @@ static inline bool fs_validate_description(const char *name,
#define fsparam_flag_no(NAME, OPT) \
__fsparam(NULL, NAME, OPT, fs_param_neg_with_no, NULL)
#define fsparam_bool(NAME, OPT) __fsparam(fs_param_is_bool, NAME, OPT, 0, NULL)
+#define fsparam_u16(NAME, OPT) __fsparam(fs_param_is_u16, NAME, OPT, 0, NULL)
#define fsparam_u32(NAME, OPT) __fsparam(fs_param_is_u32, NAME, OPT, 0, NULL)
#define fsparam_u32oct(NAME, OPT) \
__fsparam(fs_param_is_u32, NAME, OPT, 0, (void *)8)
--
2.43.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v8 2/3] fuse: add optional kernel-enforced timeout for requests
2024-10-11 19:13 [PATCH v8 0/3] fuse: add kernel-enforced request timeout option Joanne Koong
2024-10-11 19:13 ` [PATCH v8 1/3] fs_parser: add fsparam_u16 helper Joanne Koong
@ 2024-10-11 19:13 ` Joanne Koong
2024-10-29 19:17 ` Bernd Schubert
2024-10-11 19:13 ` [PATCH v8 3/3] fuse: add default_request_timeout and max_request_timeout sysctls Joanne Koong
2024-10-24 16:19 ` [PATCH v8 0/3] fuse: add kernel-enforced request timeout option Joanne Koong
3 siblings, 1 reply; 12+ messages in thread
From: Joanne Koong @ 2024-10-11 19:13 UTC (permalink / raw)
To: miklos, linux-fsdevel
Cc: josef, bernd.schubert, jefflexu, laoar.shao, kernel-team
There are situations where fuse servers can become unresponsive or
stuck, for example if the server is deadlocked. Currently, there's no
good way to detect if a server is stuck and needs to be killed manually.
This commit adds an option for enforcing a timeout (in minutes) for
requests where if the timeout elapses without the server responding to
the request, the connection will be automatically aborted.
Please note that these timeouts are not 100% precise. The request may
take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the requested max
timeout due to how it's internally implemented.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
fs/fuse/dev.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/fuse_i.h | 21 +++++++++++++
fs/fuse/inode.c | 21 +++++++++++++
3 files changed, 122 insertions(+)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 1f64ae6d7a69..054bfa2a26ed 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -45,6 +45,82 @@ static struct fuse_dev *fuse_get_dev(struct file *file)
return READ_ONCE(file->private_data);
}
+static bool request_expired(struct fuse_conn *fc, struct fuse_req *req)
+{
+ return jiffies > req->create_time + fc->timeout.req_timeout;
+}
+
+/*
+ * Check if any requests aren't being completed by the specified request
+ * timeout. To do so, we:
+ * - check the fiq pending list
+ * - check the bg queue
+ * - check the fpq io and processing lists
+ *
+ * To make this fast, we only check against the head request on each list since
+ * these are generally queued in order of creation time (eg newer requests get
+ * queued to the tail). We might miss a few edge cases (eg requests transitioning
+ * between lists, re-sent requests at the head of the pending list having a
+ * later creation time than other requests on that list, etc.) but that is fine
+ * since if the request never gets fulfilled, it will eventually be caught.
+ */
+void fuse_check_timeout(struct timer_list *timer)
+{
+ struct fuse_conn *fc = container_of(timer, struct fuse_conn, timeout.timer);
+ struct fuse_iqueue *fiq = &fc->iq;
+ struct fuse_req *req;
+ struct fuse_dev *fud;
+ struct fuse_pqueue *fpq;
+ bool expired = false;
+ int i;
+
+ spin_lock(&fiq->lock);
+ req = list_first_entry_or_null(&fiq->pending, struct fuse_req, list);
+ if (req)
+ expired = request_expired(fc, req);
+ spin_unlock(&fiq->lock);
+ if (expired)
+ goto abort_conn;
+
+ spin_lock(&fc->bg_lock);
+ req = list_first_entry_or_null(&fc->bg_queue, struct fuse_req, list);
+ if (req)
+ expired = request_expired(fc, req);
+ spin_unlock(&fc->bg_lock);
+ if (expired)
+ goto abort_conn;
+
+ spin_lock(&fc->lock);
+ if (!fc->connected) {
+ spin_unlock(&fc->lock);
+ return;
+ }
+ list_for_each_entry(fud, &fc->devices, entry) {
+ fpq = &fud->pq;
+ spin_lock(&fpq->lock);
+ req = list_first_entry_or_null(&fpq->io, struct fuse_req, list);
+ if (req && request_expired(fc, req))
+ goto fpq_abort;
+
+ for (i = 0; i < FUSE_PQ_HASH_SIZE; i++) {
+ req = list_first_entry_or_null(&fpq->processing[i], struct fuse_req, list);
+ if (req && request_expired(fc, req))
+ goto fpq_abort;
+ }
+ spin_unlock(&fpq->lock);
+ }
+ spin_unlock(&fc->lock);
+
+ mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
+ return;
+
+fpq_abort:
+ spin_unlock(&fpq->lock);
+ spin_unlock(&fc->lock);
+abort_conn:
+ fuse_abort_conn(fc);
+}
+
static void fuse_request_init(struct fuse_mount *fm, struct fuse_req *req)
{
INIT_LIST_HEAD(&req->list);
@@ -53,6 +129,7 @@ static void fuse_request_init(struct fuse_mount *fm, struct fuse_req *req)
refcount_set(&req->count, 1);
__set_bit(FR_PENDING, &req->flags);
req->fm = fm;
+ req->create_time = jiffies;
}
static struct fuse_req *fuse_request_alloc(struct fuse_mount *fm, gfp_t flags)
@@ -2296,6 +2373,9 @@ void fuse_abort_conn(struct fuse_conn *fc)
spin_unlock(&fc->lock);
end_requests(&to_end);
+
+ if (fc->timeout.req_timeout)
+ timer_delete(&fc->timeout.timer);
} else {
spin_unlock(&fc->lock);
}
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 7ff00bae4a84..ef4558c2c44e 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -435,6 +435,9 @@ struct fuse_req {
/** fuse_mount this request belongs to */
struct fuse_mount *fm;
+
+ /** When (in jiffies) the request was created */
+ unsigned long create_time;
};
struct fuse_iqueue;
@@ -525,6 +528,16 @@ struct fuse_pqueue {
struct list_head io;
};
+/* Frequency (in seconds) of request timeout checks, if opted into */
+#define FUSE_TIMEOUT_TIMER_FREQ 60 * HZ
+
+struct fuse_timeout {
+ struct timer_list timer;
+
+ /* Request timeout (in jiffies). 0 = no timeout */
+ unsigned long req_timeout;
+};
+
/**
* Fuse device instance
*/
@@ -571,6 +584,8 @@ struct fuse_fs_context {
enum fuse_dax_mode dax_mode;
unsigned int max_read;
unsigned int blksize;
+ /* Request timeout (in minutes). 0 = no timeout (infinite wait) */
+ unsigned int req_timeout;
const char *subtype;
/* DAX device, may be NULL */
@@ -914,6 +929,9 @@ struct fuse_conn {
/** IDR for backing files ids */
struct idr backing_files_map;
#endif
+
+ /** Only used if the connection enforces request timeouts */
+ struct fuse_timeout timeout;
};
/*
@@ -1175,6 +1193,9 @@ void fuse_request_end(struct fuse_req *req);
void fuse_abort_conn(struct fuse_conn *fc);
void fuse_wait_aborted(struct fuse_conn *fc);
+/* Check if any requests timed out */
+void fuse_check_timeout(struct timer_list *timer);
+
/**
* Invalidate inode attributes
*/
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index f1779ff3f8d1..a78aac76b942 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -735,6 +735,7 @@ enum {
OPT_ALLOW_OTHER,
OPT_MAX_READ,
OPT_BLKSIZE,
+ OPT_REQUEST_TIMEOUT,
OPT_ERR
};
@@ -749,6 +750,7 @@ static const struct fs_parameter_spec fuse_fs_parameters[] = {
fsparam_u32 ("max_read", OPT_MAX_READ),
fsparam_u32 ("blksize", OPT_BLKSIZE),
fsparam_string ("subtype", OPT_SUBTYPE),
+ fsparam_u16 ("request_timeout", OPT_REQUEST_TIMEOUT),
{}
};
@@ -844,6 +846,10 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
ctx->blksize = result.uint_32;
break;
+ case OPT_REQUEST_TIMEOUT:
+ ctx->req_timeout = result.uint_16;
+ break;
+
default:
return -EINVAL;
}
@@ -973,6 +979,8 @@ void fuse_conn_put(struct fuse_conn *fc)
if (IS_ENABLED(CONFIG_FUSE_DAX))
fuse_dax_conn_free(fc);
+ if (fc->timeout.req_timeout)
+ timer_shutdown_sync(&fc->timeout.timer);
if (fiq->ops->release)
fiq->ops->release(fiq);
put_pid_ns(fc->pid_ns);
@@ -1691,6 +1699,18 @@ int fuse_init_fs_context_submount(struct fs_context *fsc)
}
EXPORT_SYMBOL_GPL(fuse_init_fs_context_submount);
+static void fuse_init_fc_timeout(struct fuse_conn *fc, struct fuse_fs_context *ctx)
+{
+ if (ctx->req_timeout) {
+ if (check_mul_overflow(ctx->req_timeout * 60, HZ, &fc->timeout.req_timeout))
+ fc->timeout.req_timeout = U32_MAX;
+ timer_setup(&fc->timeout.timer, fuse_check_timeout, 0);
+ mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
+ } else {
+ fc->timeout.req_timeout = 0;
+ }
+}
+
int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
{
struct fuse_dev *fud = NULL;
@@ -1753,6 +1773,7 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
fc->destroy = ctx->destroy;
fc->no_control = ctx->no_control;
fc->no_force_umount = ctx->no_force_umount;
+ fuse_init_fc_timeout(fc, ctx);
err = -ENOMEM;
root = fuse_get_root_inode(sb, ctx->rootmode);
--
2.43.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v8 3/3] fuse: add default_request_timeout and max_request_timeout sysctls
2024-10-11 19:13 [PATCH v8 0/3] fuse: add kernel-enforced request timeout option Joanne Koong
2024-10-11 19:13 ` [PATCH v8 1/3] fs_parser: add fsparam_u16 helper Joanne Koong
2024-10-11 19:13 ` [PATCH v8 2/3] fuse: add optional kernel-enforced timeout for requests Joanne Koong
@ 2024-10-11 19:13 ` Joanne Koong
2024-10-29 19:40 ` Bernd Schubert
2024-10-24 16:19 ` [PATCH v8 0/3] fuse: add kernel-enforced request timeout option Joanne Koong
3 siblings, 1 reply; 12+ messages in thread
From: Joanne Koong @ 2024-10-11 19:13 UTC (permalink / raw)
To: miklos, linux-fsdevel
Cc: josef, bernd.schubert, jefflexu, laoar.shao, kernel-team
Introduce two new sysctls, "default_request_timeout" and
"max_request_timeout". These control how long (in minutes) a server can
take to reply to a request. If the server does not reply by the timeout,
then the connection will be aborted.
"default_request_timeout" sets the default timeout if no timeout is
specified by the fuse server on mount. 0 (default) indicates no default
timeout should be enforced. If the server did specify a timeout, then
default_request_timeout will be ignored.
"max_request_timeout" sets the max amount of time the server may take to
reply to a request. 0 (default) indicates no maximum timeout. If
max_request_timeout is set and the fuse server attempts to set a
timeout greater than max_request_timeout, the system will use
max_request_timeout as the timeout. Similarly, if default_request_timeout
is greater than max_request_timeout, the system will use
max_request_timeout as the timeout. If the server does not request a
timeout and default_request_timeout is set to 0 but max_request_timeout
is set, then the timeout will be max_request_timeout.
Please note that these timeouts are not 100% precise. The request may
take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the set max timeout
due to how it's internally implemented.
$ sysctl -a | grep fuse.default_request_timeout
fs.fuse.default_request_timeout = 0
$ echo 65536 | sudo tee /proc/sys/fs/fuse/default_request_timeout
tee: /proc/sys/fs/fuse/default_request_timeout: Invalid argument
$ echo 65535 | sudo tee /proc/sys/fs/fuse/default_request_timeout
65535
$ sysctl -a | grep fuse.default_request_timeout
fs.fuse.default_request_timeout = 65535
$ echo 0 | sudo tee /proc/sys/fs/fuse/default_request_timeout
0
$ sysctl -a | grep fuse.default_request_timeout
fs.fuse.default_request_timeout = 0
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
Documentation/admin-guide/sysctl/fs.rst | 27 +++++++++++++++++++++++++
fs/fuse/fuse_i.h | 10 +++++++++
fs/fuse/inode.c | 16 +++++++++++++--
fs/fuse/sysctl.c | 20 ++++++++++++++++++
4 files changed, 71 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
index fa25d7e718b3..790a34291467 100644
--- a/Documentation/admin-guide/sysctl/fs.rst
+++ b/Documentation/admin-guide/sysctl/fs.rst
@@ -342,3 +342,30 @@ filesystems:
``/proc/sys/fs/fuse/max_pages_limit`` is a read/write file for
setting/getting the maximum number of pages that can be used for servicing
requests in FUSE.
+
+``/proc/sys/fs/fuse/default_request_timeout`` is a read/write file for
+setting/getting the default timeout (in minutes) for a fuse server to
+reply to a kernel-issued request in the event where the server did not
+specify a timeout at mount. If the server set a timeout,
+then default_request_timeout will be ignored. The default
+"default_request_timeout" is set to 0. 0 indicates a no-op (eg
+requests will not have a default request timeout set if no timeout was
+specified by the server).
+
+``/proc/sys/fs/fuse/max_request_timeout`` is a read/write file for
+setting/getting the maximum timeout (in minutes) for a fuse server to
+reply to a kernel-issued request. A value greater than 0 automatically opts
+the server into a timeout that will be at most "max_request_timeout", even if
+the server did not specify a timeout and default_request_timeout is set to 0.
+If max_request_timeout is greater than 0 and the server set a timeout greater
+than max_request_timeout or default_request_timeout is set to a value greater
+than max_request_timeout, the system will use max_request_timeout as the
+timeout. 0 indicates a no-op (eg requests will not have an upper bound on the
+timeout and if the server did not request a timeout and default_request_timeout
+was not set, there will be no timeout).
+
+Please note that for the timeout options, if the server does not respond to
+the request by the time the timeout elapses, then the connection to the fuse
+server will be aborted. Please also note that the timeouts are not 100%
+precise (eg you may set 10 minutes but the timeout may kick in after 11
+minutes).
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index ef4558c2c44e..28d9230f4fcb 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -46,6 +46,16 @@
/** Maximum of max_pages received in init_out */
extern unsigned int fuse_max_pages_limit;
+/*
+ * Default timeout (in minutes) for the server to reply to a request
+ * before the connection is aborted, if no timeout was specified on mount.
+ */
+extern unsigned int fuse_default_req_timeout;
+/*
+ * Max timeout (in minutes) for the server to reply to a request before
+ * the connection is aborted.
+ */
+extern unsigned int fuse_max_req_timeout;
/** List of active connections */
extern struct list_head fuse_conn_list;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index a78aac76b942..d97dde59eac3 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -36,6 +36,9 @@ DEFINE_MUTEX(fuse_mutex);
static int set_global_limit(const char *val, const struct kernel_param *kp);
unsigned int fuse_max_pages_limit = 256;
+/* default is no timeout */
+unsigned int fuse_default_req_timeout = 0;
+unsigned int fuse_max_req_timeout = 0;
unsigned max_user_bgreq;
module_param_call(max_user_bgreq, set_global_limit, param_get_uint,
@@ -1701,8 +1704,17 @@ EXPORT_SYMBOL_GPL(fuse_init_fs_context_submount);
static void fuse_init_fc_timeout(struct fuse_conn *fc, struct fuse_fs_context *ctx)
{
- if (ctx->req_timeout) {
- if (check_mul_overflow(ctx->req_timeout * 60, HZ, &fc->timeout.req_timeout))
+ unsigned int timeout = ctx->req_timeout ?: fuse_default_req_timeout;
+
+ if (fuse_max_req_timeout) {
+ if (!timeout)
+ timeout = fuse_max_req_timeout;
+ else
+ timeout = min(timeout, fuse_max_req_timeout);
+ }
+
+ if (timeout) {
+ if (check_mul_overflow(timeout * 60, HZ, &fc->timeout.req_timeout))
fc->timeout.req_timeout = U32_MAX;
timer_setup(&fc->timeout.timer, fuse_check_timeout, 0);
mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
diff --git a/fs/fuse/sysctl.c b/fs/fuse/sysctl.c
index b272bb333005..e70b5269c16d 100644
--- a/fs/fuse/sysctl.c
+++ b/fs/fuse/sysctl.c
@@ -13,6 +13,8 @@ static struct ctl_table_header *fuse_table_header;
/* Bound by fuse_init_out max_pages, which is a u16 */
static unsigned int sysctl_fuse_max_pages_limit = 65535;
+static unsigned int sysctl_fuse_max_req_timeout_limit = U16_MAX;
+
static struct ctl_table fuse_sysctl_table[] = {
{
.procname = "max_pages_limit",
@@ -23,6 +25,24 @@ static struct ctl_table fuse_sysctl_table[] = {
.extra1 = SYSCTL_ONE,
.extra2 = &sysctl_fuse_max_pages_limit,
},
+ {
+ .procname = "default_request_timeout",
+ .data = &fuse_default_req_timeout,
+ .maxlen = sizeof(fuse_default_req_timeout),
+ .mode = 0644,
+ .proc_handler = proc_douintvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = &sysctl_fuse_max_req_timeout_limit,
+ },
+ {
+ .procname = "max_request_timeout",
+ .data = &fuse_max_req_timeout,
+ .maxlen = sizeof(fuse_max_req_timeout),
+ .mode = 0644,
+ .proc_handler = proc_douintvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = &sysctl_fuse_max_req_timeout_limit,
+ },
};
int fuse_sysctl_register(void)
--
2.43.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v8 0/3] fuse: add kernel-enforced request timeout option
2024-10-11 19:13 [PATCH v8 0/3] fuse: add kernel-enforced request timeout option Joanne Koong
` (2 preceding siblings ...)
2024-10-11 19:13 ` [PATCH v8 3/3] fuse: add default_request_timeout and max_request_timeout sysctls Joanne Koong
@ 2024-10-24 16:19 ` Joanne Koong
3 siblings, 0 replies; 12+ messages in thread
From: Joanne Koong @ 2024-10-24 16:19 UTC (permalink / raw)
To: miklos, linux-fsdevel
Cc: josef, bernd.schubert, jefflexu, laoar.shao, kernel-team
On Fri, Oct 11, 2024 at 12:14 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> There are situations where fuse servers can become unresponsive or
> stuck, for example if the server is in a deadlock. Currently, there's
> no good way to detect if a server is stuck and needs to be killed
> manually.
>
> This patchset adds a timeout option where if the server does not reply to a
> request by the time the timeout elapses, the connection will be aborted.
> This patchset also adds two dynamically configurable fuse sysctls
> "default_request_timeout" and "max_request_timeout" for controlling/enforcing
> timeout behavior system-wide.
>
> Existing systems running fuse servers will not be affected unless they
> explicitly opt into the timeout.
>
> v7:
> https://lore.kernel.org/linux-fsdevel/20241007184258.2837492-1-joannelkoong@gmail.com/
> Changes from v7 -> v8:
> * Use existing lists for checking expirations (Miklos)
>
> v6:
> https://lore.kernel.org/linux-fsdevel/20240830162649.3849586-1-joannelkoong@gmail.com/
> Changes from v6 -> v7:
> - Make timer per-connection instead of per-request (Miklos)
> - Make default granularity of time minutes instead of seconds
> - Removed the reviewed-bys since the interface of this has changed (now
> minutes, instead of seconds)
>
> v5:
> https://lore.kernel.org/linux-fsdevel/20240826203234.4079338-1-joannelkoong@gmail.com/
> Changes from v5 -> v6:
> - Gate sysctl.o behind CONFIG_SYSCTL in makefile (kernel test robot)
> - Reword/clarify last sentence in cover letter (Miklos)
>
> v4:
> https://lore.kernel.org/linux-fsdevel/20240813232241.2369855-1-joannelkoong@gmail.com/
> Changes from v4 -> v5:
> - Change timeout behavior from aborting request to aborting connection
> (Miklos)
> - Clarify wording for sysctl documentation (Jingbo)
>
> v3:
> https://lore.kernel.org/linux-fsdevel/20240808190110.3188039-1-joannelkoong@gmail.com/
> Changes from v3 -> v4:
> - Fix wording on some comments to make it more clear
> - Use simpler logic for timer (eg remove extra if checks, use mod timer API)
> (Josef)
> - Sanity-check should be on FR_FINISHING not FR_FINISHED (Jingbo)
> - Fix comment for "processing queue", add req->fpq = NULL safeguard (Bernd)
>
> v2:
> https://lore.kernel.org/linux-fsdevel/20240730002348.3431931-1-joannelkoong@gmail.com/
> Changes from v2 -> v3:
> - Disarm / rearm timer in dev_do_read to handle race conditions (Bernrd)
> - Disarm timer in error handling for fatal interrupt (Yafang)
> - Clean up do_fuse_request_end (Jingbo)
> - Add timer for notify retrieve requests
> - Fix kernel test robot errors for #define no-op functions
>
> v1:
> https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/
> Changes from v1 -> v2:
> - Add timeout for background requests
> - Handle resend race condition
> - Add sysctls
>
> Joanne Koong (3):
> fs_parser: add fsparam_u16 helper
> fuse: add optional kernel-enforced timeout for requests
> fuse: add default_request_timeout and max_request_timeout sysctls
>
> Documentation/admin-guide/sysctl/fs.rst | 27 +++++++++
> fs/fs_parser.c | 14 +++++
> fs/fuse/dev.c | 80 +++++++++++++++++++++++++
> fs/fuse/fuse_i.h | 31 ++++++++++
> fs/fuse/inode.c | 33 ++++++++++
> fs/fuse/sysctl.c | 20 +++++++
> include/linux/fs_parser.h | 9 ++-
> 7 files changed, 211 insertions(+), 3 deletions(-)
>
Just checking in on this patchset - any comments or thoughts?
Thanks,
Joanne
> --
> 2.43.5
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v8 2/3] fuse: add optional kernel-enforced timeout for requests
2024-10-11 19:13 ` [PATCH v8 2/3] fuse: add optional kernel-enforced timeout for requests Joanne Koong
@ 2024-10-29 19:17 ` Bernd Schubert
2024-10-30 17:21 ` Joanne Koong
0 siblings, 1 reply; 12+ messages in thread
From: Bernd Schubert @ 2024-10-29 19:17 UTC (permalink / raw)
To: Joanne Koong, miklos, linux-fsdevel
Cc: josef, jefflexu, laoar.shao, kernel-team
On 10/11/24 21:13, Joanne Koong wrote:
> There are situations where fuse servers can become unresponsive or
> stuck, for example if the server is deadlocked. Currently, there's no
> good way to detect if a server is stuck and needs to be killed manually.
>
> This commit adds an option for enforcing a timeout (in minutes) for
> requests where if the timeout elapses without the server responding to
> the request, the connection will be automatically aborted.
>
> Please note that these timeouts are not 100% precise. The request may
> take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the requested max
> timeout due to how it's internally implemented.
>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
> fs/fuse/dev.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
> fs/fuse/fuse_i.h | 21 +++++++++++++
> fs/fuse/inode.c | 21 +++++++++++++
> 3 files changed, 122 insertions(+)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 1f64ae6d7a69..054bfa2a26ed 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -45,6 +45,82 @@ static struct fuse_dev *fuse_get_dev(struct file *file)
> return READ_ONCE(file->private_data);
> }
>
> +static bool request_expired(struct fuse_conn *fc, struct fuse_req *req)
> +{
> + return jiffies > req->create_time + fc->timeout.req_timeout;
> +}
> +
> +/*
> + * Check if any requests aren't being completed by the specified request
> + * timeout. To do so, we:
> + * - check the fiq pending list
> + * - check the bg queue
> + * - check the fpq io and processing lists
> + *
> + * To make this fast, we only check against the head request on each list since
> + * these are generally queued in order of creation time (eg newer requests get
> + * queued to the tail). We might miss a few edge cases (eg requests transitioning
> + * between lists, re-sent requests at the head of the pending list having a
> + * later creation time than other requests on that list, etc.) but that is fine
> + * since if the request never gets fulfilled, it will eventually be caught.
> + */
> +void fuse_check_timeout(struct timer_list *timer)
> +{
> + struct fuse_conn *fc = container_of(timer, struct fuse_conn, timeout.timer);
> + struct fuse_iqueue *fiq = &fc->iq;
> + struct fuse_req *req;
> + struct fuse_dev *fud;
> + struct fuse_pqueue *fpq;
> + bool expired = false;
> + int i;
> +
> + spin_lock(&fiq->lock);
> + req = list_first_entry_or_null(&fiq->pending, struct fuse_req, list);
> + if (req)
> + expired = request_expired(fc, req);
> + spin_unlock(&fiq->lock);
> + if (expired)
> + goto abort_conn;
> +
> + spin_lock(&fc->bg_lock);
> + req = list_first_entry_or_null(&fc->bg_queue, struct fuse_req, list);
> + if (req)
> + expired = request_expired(fc, req);
> + spin_unlock(&fc->bg_lock);
> + if (expired)
> + goto abort_conn;
> +
> + spin_lock(&fc->lock);
> + if (!fc->connected) {
> + spin_unlock(&fc->lock);
> + return;
> + }
> + list_for_each_entry(fud, &fc->devices, entry) {
> + fpq = &fud->pq;
> + spin_lock(&fpq->lock);
> + req = list_first_entry_or_null(&fpq->io, struct fuse_req, list);
> + if (req && request_expired(fc, req))
> + goto fpq_abort;
> +
> + for (i = 0; i < FUSE_PQ_HASH_SIZE; i++) {
> + req = list_first_entry_or_null(&fpq->processing[i], struct fuse_req, list);
> + if (req && request_expired(fc, req))
> + goto fpq_abort;
> + }
> + spin_unlock(&fpq->lock);
> + }
> + spin_unlock(&fc->lock);
I really don't have a strong opinion on that - I wonder if it wouldn't
be better for this part to have an extra timeout list per fud or pq as
previously. That would slightly increases memory usage and overhead per
request as a second list is needed, but would reduce these 1/min cpu
spikes as only one list per fud would need to be checked. But then, it
would be easy to change that later, if timeout checks turn out to be a
problem.
> +
> + mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
> + return;
> +
> +fpq_abort:
> + spin_unlock(&fpq->lock);
> + spin_unlock(&fc->lock);
> +abort_conn:
> + fuse_abort_conn(fc);
> +}
> +
> static void fuse_request_init(struct fuse_mount *fm, struct fuse_req *req)
> {
> INIT_LIST_HEAD(&req->list);
> @@ -53,6 +129,7 @@ static void fuse_request_init(struct fuse_mount *fm, struct fuse_req *req)
> refcount_set(&req->count, 1);
> __set_bit(FR_PENDING, &req->flags);
> req->fm = fm;
> + req->create_time = jiffies;
> }
>
> static struct fuse_req *fuse_request_alloc(struct fuse_mount *fm, gfp_t flags)
> @@ -2296,6 +2373,9 @@ void fuse_abort_conn(struct fuse_conn *fc)
> spin_unlock(&fc->lock);
>
> end_requests(&to_end);
> +
> + if (fc->timeout.req_timeout)
> + timer_delete(&fc->timeout.timer);
> } else {
> spin_unlock(&fc->lock);
> }
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index 7ff00bae4a84..ef4558c2c44e 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -435,6 +435,9 @@ struct fuse_req {
>
> /** fuse_mount this request belongs to */
> struct fuse_mount *fm;
> +
> + /** When (in jiffies) the request was created */
> + unsigned long create_time;
> };
>
> struct fuse_iqueue;
> @@ -525,6 +528,16 @@ struct fuse_pqueue {
> struct list_head io;
> };
>
> +/* Frequency (in seconds) of request timeout checks, if opted into */
> +#define FUSE_TIMEOUT_TIMER_FREQ 60 * HZ
> +
> +struct fuse_timeout {
> + struct timer_list timer;
> +
> + /* Request timeout (in jiffies). 0 = no timeout */
> + unsigned long req_timeout;
> +};
> +
> /**
> * Fuse device instance
> */
> @@ -571,6 +584,8 @@ struct fuse_fs_context {
> enum fuse_dax_mode dax_mode;
> unsigned int max_read;
> unsigned int blksize;
> + /* Request timeout (in minutes). 0 = no timeout (infinite wait) */
> + unsigned int req_timeout;
> const char *subtype;
>
> /* DAX device, may be NULL */
> @@ -914,6 +929,9 @@ struct fuse_conn {
> /** IDR for backing files ids */
> struct idr backing_files_map;
> #endif
> +
> + /** Only used if the connection enforces request timeouts */
> + struct fuse_timeout timeout;
> };
>
> /*
> @@ -1175,6 +1193,9 @@ void fuse_request_end(struct fuse_req *req);
> void fuse_abort_conn(struct fuse_conn *fc);
> void fuse_wait_aborted(struct fuse_conn *fc);
>
> +/* Check if any requests timed out */
> +void fuse_check_timeout(struct timer_list *timer);
> +
> /**
> * Invalidate inode attributes
> */
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index f1779ff3f8d1..a78aac76b942 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -735,6 +735,7 @@ enum {
> OPT_ALLOW_OTHER,
> OPT_MAX_READ,
> OPT_BLKSIZE,
> + OPT_REQUEST_TIMEOUT,
> OPT_ERR
> };
>
> @@ -749,6 +750,7 @@ static const struct fs_parameter_spec fuse_fs_parameters[] = {
> fsparam_u32 ("max_read", OPT_MAX_READ),
> fsparam_u32 ("blksize", OPT_BLKSIZE),
> fsparam_string ("subtype", OPT_SUBTYPE),
> + fsparam_u16 ("request_timeout", OPT_REQUEST_TIMEOUT),
> {}
> };
>
> @@ -844,6 +846,10 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
> ctx->blksize = result.uint_32;
> break;
>
> + case OPT_REQUEST_TIMEOUT:
> + ctx->req_timeout = result.uint_16;
> + break;
> +
> default:
> return -EINVAL;
> }
> @@ -973,6 +979,8 @@ void fuse_conn_put(struct fuse_conn *fc)
>
> if (IS_ENABLED(CONFIG_FUSE_DAX))
> fuse_dax_conn_free(fc);
> + if (fc->timeout.req_timeout)
> + timer_shutdown_sync(&fc->timeout.timer);
> if (fiq->ops->release)
> fiq->ops->release(fiq);
> put_pid_ns(fc->pid_ns);
> @@ -1691,6 +1699,18 @@ int fuse_init_fs_context_submount(struct fs_context *fsc)
> }
> EXPORT_SYMBOL_GPL(fuse_init_fs_context_submount);
>
> +static void fuse_init_fc_timeout(struct fuse_conn *fc, struct fuse_fs_context *ctx)
> +{
> + if (ctx->req_timeout) {
> + if (check_mul_overflow(ctx->req_timeout * 60, HZ, &fc->timeout.req_timeout))
> + fc->timeout.req_timeout = U32_MAX;
ULONG_MAX?
> + timer_setup(&fc->timeout.timer, fuse_check_timeout, 0);
> + mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
> + } else {
> + fc->timeout.req_timeout = 0;
> + }
> +}
> +
> int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
> {
> struct fuse_dev *fud = NULL;
> @@ -1753,6 +1773,7 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
> fc->destroy = ctx->destroy;
> fc->no_control = ctx->no_control;
> fc->no_force_umount = ctx->no_force_umount;
> + fuse_init_fc_timeout(fc, ctx);
>
> err = -ENOMEM;
> root = fuse_get_root_inode(sb, ctx->rootmode);
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v8 3/3] fuse: add default_request_timeout and max_request_timeout sysctls
2024-10-11 19:13 ` [PATCH v8 3/3] fuse: add default_request_timeout and max_request_timeout sysctls Joanne Koong
@ 2024-10-29 19:40 ` Bernd Schubert
2024-10-30 17:32 ` Joanne Koong
0 siblings, 1 reply; 12+ messages in thread
From: Bernd Schubert @ 2024-10-29 19:40 UTC (permalink / raw)
To: Joanne Koong, miklos, linux-fsdevel
Cc: josef, jefflexu, laoar.shao, kernel-team
On 10/11/24 21:13, Joanne Koong wrote:
> Introduce two new sysctls, "default_request_timeout" and
> "max_request_timeout". These control how long (in minutes) a server can
> take to reply to a request. If the server does not reply by the timeout,
> then the connection will be aborted.
>
> "default_request_timeout" sets the default timeout if no timeout is
> specified by the fuse server on mount. 0 (default) indicates no default
> timeout should be enforced. If the server did specify a timeout, then
> default_request_timeout will be ignored.
>
> "max_request_timeout" sets the max amount of time the server may take to
> reply to a request. 0 (default) indicates no maximum timeout. If
> max_request_timeout is set and the fuse server attempts to set a
> timeout greater than max_request_timeout, the system will use
> max_request_timeout as the timeout. Similarly, if default_request_timeout
> is greater than max_request_timeout, the system will use
> max_request_timeout as the timeout. If the server does not request a
> timeout and default_request_timeout is set to 0 but max_request_timeout
> is set, then the timeout will be max_request_timeout.
>
> Please note that these timeouts are not 100% precise. The request may
> take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the set max timeout
> due to how it's internally implemented.
>
> $ sysctl -a | grep fuse.default_request_timeout
> fs.fuse.default_request_timeout = 0
>
> $ echo 65536 | sudo tee /proc/sys/fs/fuse/default_request_timeout
> tee: /proc/sys/fs/fuse/default_request_timeout: Invalid argument
>
> $ echo 65535 | sudo tee /proc/sys/fs/fuse/default_request_timeout
> 65535
>
> $ sysctl -a | grep fuse.default_request_timeout
> fs.fuse.default_request_timeout = 65535
>
> $ echo 0 | sudo tee /proc/sys/fs/fuse/default_request_timeout
> 0
>
> $ sysctl -a | grep fuse.default_request_timeout
> fs.fuse.default_request_timeout = 0
>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
> Documentation/admin-guide/sysctl/fs.rst | 27 +++++++++++++++++++++++++
> fs/fuse/fuse_i.h | 10 +++++++++
> fs/fuse/inode.c | 16 +++++++++++++--
> fs/fuse/sysctl.c | 20 ++++++++++++++++++
> 4 files changed, 71 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
> index fa25d7e718b3..790a34291467 100644
> --- a/Documentation/admin-guide/sysctl/fs.rst
> +++ b/Documentation/admin-guide/sysctl/fs.rst
> @@ -342,3 +342,30 @@ filesystems:
> ``/proc/sys/fs/fuse/max_pages_limit`` is a read/write file for
> setting/getting the maximum number of pages that can be used for servicing
> requests in FUSE.
> +
> +``/proc/sys/fs/fuse/default_request_timeout`` is a read/write file for
> +setting/getting the default timeout (in minutes) for a fuse server to
> +reply to a kernel-issued request in the event where the server did not
> +specify a timeout at mount. If the server set a timeout,
> +then default_request_timeout will be ignored. The default
> +"default_request_timeout" is set to 0. 0 indicates a no-op (eg
> +requests will not have a default request timeout set if no timeout was
> +specified by the server).
> +
> +``/proc/sys/fs/fuse/max_request_timeout`` is a read/write file for
> +setting/getting the maximum timeout (in minutes) for a fuse server to
> +reply to a kernel-issued request. A value greater than 0 automatically opts
> +the server into a timeout that will be at most "max_request_timeout", even if
> +the server did not specify a timeout and default_request_timeout is set to 0.
> +If max_request_timeout is greater than 0 and the server set a timeout greater
> +than max_request_timeout or default_request_timeout is set to a value greater
> +than max_request_timeout, the system will use max_request_timeout as the
> +timeout. 0 indicates a no-op (eg requests will not have an upper bound on the
> +timeout and if the server did not request a timeout and default_request_timeout
> +was not set, there will be no timeout).
> +
> +Please note that for the timeout options, if the server does not respond to
> +the request by the time the timeout elapses, then the connection to the fuse
> +server will be aborted. Please also note that the timeouts are not 100%
> +precise (eg you may set 10 minutes but the timeout may kick in after 11
> +minutes).
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index ef4558c2c44e..28d9230f4fcb 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -46,6 +46,16 @@
>
> /** Maximum of max_pages received in init_out */
> extern unsigned int fuse_max_pages_limit;
> +/*
> + * Default timeout (in minutes) for the server to reply to a request
> + * before the connection is aborted, if no timeout was specified on mount.
> + */
> +extern unsigned int fuse_default_req_timeout;
> +/*
> + * Max timeout (in minutes) for the server to reply to a request before
> + * the connection is aborted.
> + */
> +extern unsigned int fuse_max_req_timeout;
>
> /** List of active connections */
> extern struct list_head fuse_conn_list;
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index a78aac76b942..d97dde59eac3 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -36,6 +36,9 @@ DEFINE_MUTEX(fuse_mutex);
> static int set_global_limit(const char *val, const struct kernel_param *kp);
>
> unsigned int fuse_max_pages_limit = 256;
> +/* default is no timeout */
> +unsigned int fuse_default_req_timeout = 0;
> +unsigned int fuse_max_req_timeout = 0;
>
> unsigned max_user_bgreq;
> module_param_call(max_user_bgreq, set_global_limit, param_get_uint,
> @@ -1701,8 +1704,17 @@ EXPORT_SYMBOL_GPL(fuse_init_fs_context_submount);
>
> static void fuse_init_fc_timeout(struct fuse_conn *fc, struct fuse_fs_context *ctx)
> {
> - if (ctx->req_timeout) {
> - if (check_mul_overflow(ctx->req_timeout * 60, HZ, &fc->timeout.req_timeout))
> + unsigned int timeout = ctx->req_timeout ?: fuse_default_req_timeout;
> +
> + if (fuse_max_req_timeout) {
> + if (!timeout)
> + timeout = fuse_max_req_timeout;
Hmm, so 'max' might be used as 'min' as well. Isn't that a bit confusing?
> + else
> + timeout = min(timeout, fuse_max_req_timeout)> + }
> +
> + if (timeout) {
> + if (check_mul_overflow(timeout * 60, HZ, &fc->timeout.req_timeout))
> fc->timeout.req_timeout = U32_MAX;
> timer_setup(&fc->timeout.timer, fuse_check_timeout, 0);
> mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
> diff --git a/fs/fuse/sysctl.c b/fs/fuse/sysctl.c
> index b272bb333005..e70b5269c16d 100644
> --- a/fs/fuse/sysctl.c
> +++ b/fs/fuse/sysctl.c
> @@ -13,6 +13,8 @@ static struct ctl_table_header *fuse_table_header;
> /* Bound by fuse_init_out max_pages, which is a u16 */
> static unsigned int sysctl_fuse_max_pages_limit = 65535;
>
> +static unsigned int sysctl_fuse_max_req_timeout_limit = U16_MAX;
> +
> static struct ctl_table fuse_sysctl_table[] = {
> {
> .procname = "max_pages_limit",
> @@ -23,6 +25,24 @@ static struct ctl_table fuse_sysctl_table[] = {
> .extra1 = SYSCTL_ONE,
> .extra2 = &sysctl_fuse_max_pages_limit,
> },
> + {
> + .procname = "default_request_timeout",
> + .data = &fuse_default_req_timeout,
> + .maxlen = sizeof(fuse_default_req_timeout),
> + .mode = 0644,
> + .proc_handler = proc_douintvec_minmax,
> + .extra1 = SYSCTL_ZERO,
There is slight whitespace issue here - spaces instead of tabs.
> + .extra2 = &sysctl_fuse_max_req_timeout_limit,
> + },
> + {
> + .procname = "max_request_timeout",
> + .data = &fuse_max_req_timeout,
> + .maxlen = sizeof(fuse_max_req_timeout),
> + .mode = 0644,
> + .proc_handler = proc_douintvec_minmax,
> + .extra1 = SYSCTL_ZERO,
And here.
> + .extra2 = &sysctl_fuse_max_req_timeout_limit,
> + },
> };
>
> int fuse_sysctl_register(void)
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v8 1/3] fs_parser: add fsparam_u16 helper
2024-10-11 19:13 ` [PATCH v8 1/3] fs_parser: add fsparam_u16 helper Joanne Koong
@ 2024-10-29 20:58 ` Bernd Schubert
2024-10-30 16:28 ` Joanne Koong
0 siblings, 1 reply; 12+ messages in thread
From: Bernd Schubert @ 2024-10-29 20:58 UTC (permalink / raw)
To: Joanne Koong, miklos, linux-fsdevel
Cc: josef, jefflexu, laoar.shao, kernel-team
On 10/11/24 21:13, Joanne Koong wrote:
> Add a fsparam helper for unsigned 16 bit values.
>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
> fs/fs_parser.c | 14 ++++++++++++++
> include/linux/fs_parser.h | 9 ++++++---
> 2 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/fs/fs_parser.c b/fs/fs_parser.c
> index 24727ec34e5a..0e06f9618c89 100644
> --- a/fs/fs_parser.c
> +++ b/fs/fs_parser.c
> @@ -210,6 +210,20 @@ int fs_param_is_bool(struct p_log *log, const struct fs_parameter_spec *p,
> }
> EXPORT_SYMBOL(fs_param_is_bool);
>
> +int fs_param_is_u16(struct p_log *log, const struct fs_parameter_spec *p,
> + struct fs_parameter *param, struct fs_parse_result *result)
> +{
> + int base = (unsigned long)p->data;
> + if (param->type != fs_value_is_string)
> + return fs_param_bad_value(log, param);
> + if (!*param->string && (p->flags & fs_param_can_be_empty))
> + return 0;
> + if (kstrtou16(param->string, base, &result->uint_16) < 0)
> + return fs_param_bad_value(log, param);
> + return 0;
> +}
> +EXPORT_SYMBOL(fs_param_is_u16);
> +
> int fs_param_is_u32(struct p_log *log, const struct fs_parameter_spec *p,
> struct fs_parameter *param, struct fs_parse_result *result)
> {
> diff --git a/include/linux/fs_parser.h b/include/linux/fs_parser.h
> index 6cf713a7e6c6..1c940756300c 100644
> --- a/include/linux/fs_parser.h
> +++ b/include/linux/fs_parser.h
> @@ -26,9 +26,10 @@ typedef int fs_param_type(struct p_log *,
> /*
> * The type of parameter expected.
> */
> -fs_param_type fs_param_is_bool, fs_param_is_u32, fs_param_is_s32, fs_param_is_u64,
> - fs_param_is_enum, fs_param_is_string, fs_param_is_blob, fs_param_is_blockdev,
> - fs_param_is_path, fs_param_is_fd, fs_param_is_uid, fs_param_is_gid;
> +fs_param_type fs_param_is_bool, fs_param_is_u16, fs_param_is_u32, fs_param_is_s32,
> + fs_param_is_u64, fs_param_is_enum, fs_param_is_string, fs_param_is_blob,
> + fs_param_is_blockdev, fs_param_is_path, fs_param_is_fd, fs_param_is_uid,
> + fs_param_is_gid;
>
> /*
> * Specification of the type of value a parameter wants.
> @@ -55,6 +56,7 @@ struct fs_parse_result {
> union {
> bool boolean; /* For spec_bool */
> int int_32; /* For spec_s32/spec_enum */
> + u16 uint_16; /* For spec_u16 *
> unsigned int uint_32; /* For spec_u32{,_octal,_hex}/spec_enum */
Given fs_param_is_u16() has "int base = (unsigned long)p->data;",
shouldn't the comment also mention ",_octal,_hex}/spec_enum"?
Or should the function get slightly modified to always use a base of 10?
Or 0 with auto detection as for fs_param_is_u64()?
> u64 uint_64; /* For spec_u64 */
> kuid_t uid;
> @@ -119,6 +121,7 @@ static inline bool fs_validate_description(const char *name,
> #define fsparam_flag_no(NAME, OPT) \
> __fsparam(NULL, NAME, OPT, fs_param_neg_with_no, NULL)
> #define fsparam_bool(NAME, OPT) __fsparam(fs_param_is_bool, NAME, OPT, 0, NULL)
> +#define fsparam_u16(NAME, OPT) __fsparam(fs_param_is_u16, NAME, OPT, 0, NULL)
> #define fsparam_u32(NAME, OPT) __fsparam(fs_param_is_u32, NAME, OPT, 0, NULL)
> #define fsparam_u32oct(NAME, OPT) \
> __fsparam(fs_param_is_u32, NAME, OPT, 0, (void *)8)
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v8 1/3] fs_parser: add fsparam_u16 helper
2024-10-29 20:58 ` Bernd Schubert
@ 2024-10-30 16:28 ` Joanne Koong
0 siblings, 0 replies; 12+ messages in thread
From: Joanne Koong @ 2024-10-30 16:28 UTC (permalink / raw)
To: Bernd Schubert
Cc: miklos, linux-fsdevel, josef, jefflexu, laoar.shao, kernel-team
On Tue, Oct 29, 2024 at 1:58 PM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
>
>
>
> On 10/11/24 21:13, Joanne Koong wrote:
> > Add a fsparam helper for unsigned 16 bit values.
> >
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> > fs/fs_parser.c | 14 ++++++++++++++
> > include/linux/fs_parser.h | 9 ++++++---
> > 2 files changed, 20 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/fs_parser.c b/fs/fs_parser.c
> > index 24727ec34e5a..0e06f9618c89 100644
> > --- a/fs/fs_parser.c
> > +++ b/fs/fs_parser.c
> > @@ -210,6 +210,20 @@ int fs_param_is_bool(struct p_log *log, const struct fs_parameter_spec *p,
> > }
> > EXPORT_SYMBOL(fs_param_is_bool);
> >
> > +int fs_param_is_u16(struct p_log *log, const struct fs_parameter_spec *p,
> > + struct fs_parameter *param, struct fs_parse_result *result)
> > +{
> > + int base = (unsigned long)p->data;
> > + if (param->type != fs_value_is_string)
> > + return fs_param_bad_value(log, param);
> > + if (!*param->string && (p->flags & fs_param_can_be_empty))
> > + return 0;
> > + if (kstrtou16(param->string, base, &result->uint_16) < 0)
> > + return fs_param_bad_value(log, param);
> > + return 0;
> > +}
> > +EXPORT_SYMBOL(fs_param_is_u16);
> > +
> > int fs_param_is_u32(struct p_log *log, const struct fs_parameter_spec *p,
> > struct fs_parameter *param, struct fs_parse_result *result)
> > {
> > diff --git a/include/linux/fs_parser.h b/include/linux/fs_parser.h
> > index 6cf713a7e6c6..1c940756300c 100644
> > --- a/include/linux/fs_parser.h
> > +++ b/include/linux/fs_parser.h
> > @@ -26,9 +26,10 @@ typedef int fs_param_type(struct p_log *,
> > /*
> > * The type of parameter expected.
> > */
> > -fs_param_type fs_param_is_bool, fs_param_is_u32, fs_param_is_s32, fs_param_is_u64,
> > - fs_param_is_enum, fs_param_is_string, fs_param_is_blob, fs_param_is_blockdev,
> > - fs_param_is_path, fs_param_is_fd, fs_param_is_uid, fs_param_is_gid;
> > +fs_param_type fs_param_is_bool, fs_param_is_u16, fs_param_is_u32, fs_param_is_s32,
> > + fs_param_is_u64, fs_param_is_enum, fs_param_is_string, fs_param_is_blob,
> > + fs_param_is_blockdev, fs_param_is_path, fs_param_is_fd, fs_param_is_uid,
> > + fs_param_is_gid;
> >
> > /*
> > * Specification of the type of value a parameter wants.
> > @@ -55,6 +56,7 @@ struct fs_parse_result {
> > union {
> > bool boolean; /* For spec_bool */
> > int int_32; /* For spec_s32/spec_enum */
> > + u16 uint_16; /* For spec_u16 *
> > unsigned int uint_32; /* For spec_u32{,_octal,_hex}/spec_enum */
>
> Given fs_param_is_u16() has "int base = (unsigned long)p->data;",
> shouldn't the comment also mention ",_octal,_hex}/spec_enum"?
> Or should the function get slightly modified to always use a base of 10?
> Or 0 with auto detection as for fs_param_is_u64()?
Thanks for taking a look at this patchset, Bernd!
I'm not sure what the correct answer here is either with whether this
should follow fs_param_is_u32() or fs_param_is_u64(). I leaned towards
fs_param_is_u32() because that seemed more permissive in supporting
octal/hex.
I see in the commit history (commit 328de5287b10a) that Al Viro added
both of these. After the fuse parts of this patchset gets feedback
from Miklos, I'll cc Al on the next iteration and hopefully that
should give us some more clarity.
Thanks,
Joanne
>
> > u64 uint_64; /* For spec_u64 */
> > kuid_t uid;
> > @@ -119,6 +121,7 @@ static inline bool fs_validate_description(const char *name,
> > #define fsparam_flag_no(NAME, OPT) \
> > __fsparam(NULL, NAME, OPT, fs_param_neg_with_no, NULL)
> > #define fsparam_bool(NAME, OPT) __fsparam(fs_param_is_bool, NAME, OPT, 0, NULL)
> > +#define fsparam_u16(NAME, OPT) __fsparam(fs_param_is_u16, NAME, OPT, 0, NULL)
> > #define fsparam_u32(NAME, OPT) __fsparam(fs_param_is_u32, NAME, OPT, 0, NULL)
> > #define fsparam_u32oct(NAME, OPT) \
> > __fsparam(fs_param_is_u32, NAME, OPT, 0, (void *)8)
>
>
> Reviewed-by: Bernd Schubert <bschubert@ddn.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v8 2/3] fuse: add optional kernel-enforced timeout for requests
2024-10-29 19:17 ` Bernd Schubert
@ 2024-10-30 17:21 ` Joanne Koong
2024-10-30 17:50 ` Bernd Schubert
0 siblings, 1 reply; 12+ messages in thread
From: Joanne Koong @ 2024-10-30 17:21 UTC (permalink / raw)
To: Bernd Schubert
Cc: miklos, linux-fsdevel, josef, jefflexu, laoar.shao, kernel-team
On Tue, Oct 29, 2024 at 12:17 PM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
>
>
>
> On 10/11/24 21:13, Joanne Koong wrote:
> > There are situations where fuse servers can become unresponsive or
> > stuck, for example if the server is deadlocked. Currently, there's no
> > good way to detect if a server is stuck and needs to be killed manually.
> >
> > This commit adds an option for enforcing a timeout (in minutes) for
> > requests where if the timeout elapses without the server responding to
> > the request, the connection will be automatically aborted.
> >
> > Please note that these timeouts are not 100% precise. The request may
> > take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the requested max
> > timeout due to how it's internally implemented.
> >
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> > fs/fuse/dev.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
> > fs/fuse/fuse_i.h | 21 +++++++++++++
> > fs/fuse/inode.c | 21 +++++++++++++
> > 3 files changed, 122 insertions(+)
> >
> > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> > index 1f64ae6d7a69..054bfa2a26ed 100644
> > --- a/fs/fuse/dev.c
> > +++ b/fs/fuse/dev.c
> > @@ -45,6 +45,82 @@ static struct fuse_dev *fuse_get_dev(struct file *file)
> > return READ_ONCE(file->private_data);
> > }
> >
> > +static bool request_expired(struct fuse_conn *fc, struct fuse_req *req)
> > +{
> > + return jiffies > req->create_time + fc->timeout.req_timeout;
> > +}
> > +
> > +/*
> > + * Check if any requests aren't being completed by the specified request
> > + * timeout. To do so, we:
> > + * - check the fiq pending list
> > + * - check the bg queue
> > + * - check the fpq io and processing lists
> > + *
> > + * To make this fast, we only check against the head request on each list since
> > + * these are generally queued in order of creation time (eg newer requests get
> > + * queued to the tail). We might miss a few edge cases (eg requests transitioning
> > + * between lists, re-sent requests at the head of the pending list having a
> > + * later creation time than other requests on that list, etc.) but that is fine
> > + * since if the request never gets fulfilled, it will eventually be caught.
> > + */
> > +void fuse_check_timeout(struct timer_list *timer)
> > +{
> > + struct fuse_conn *fc = container_of(timer, struct fuse_conn, timeout.timer);
> > + struct fuse_iqueue *fiq = &fc->iq;
> > + struct fuse_req *req;
> > + struct fuse_dev *fud;
> > + struct fuse_pqueue *fpq;
> > + bool expired = false;
> > + int i;
> > +
> > + spin_lock(&fiq->lock);
> > + req = list_first_entry_or_null(&fiq->pending, struct fuse_req, list);
> > + if (req)
> > + expired = request_expired(fc, req);
> > + spin_unlock(&fiq->lock);
> > + if (expired)
> > + goto abort_conn;
> > +
> > + spin_lock(&fc->bg_lock);
> > + req = list_first_entry_or_null(&fc->bg_queue, struct fuse_req, list);
> > + if (req)
> > + expired = request_expired(fc, req);
> > + spin_unlock(&fc->bg_lock);
> > + if (expired)
> > + goto abort_conn;
> > +
> > + spin_lock(&fc->lock);
> > + if (!fc->connected) {
> > + spin_unlock(&fc->lock);
> > + return;
> > + }
> > + list_for_each_entry(fud, &fc->devices, entry) {
> > + fpq = &fud->pq;
> > + spin_lock(&fpq->lock);
> > + req = list_first_entry_or_null(&fpq->io, struct fuse_req, list);
> > + if (req && request_expired(fc, req))
> > + goto fpq_abort;
> > +
> > + for (i = 0; i < FUSE_PQ_HASH_SIZE; i++) {
> > + req = list_first_entry_or_null(&fpq->processing[i], struct fuse_req, list);
> > + if (req && request_expired(fc, req))
> > + goto fpq_abort;
> > + }
> > + spin_unlock(&fpq->lock);
> > + }
> > + spin_unlock(&fc->lock);
>
> I really don't have a strong opinion on that - I wonder if it wouldn't
> be better for this part to have an extra timeout list per fud or pq as
> previously. That would slightly increases memory usage and overhead per
> request as a second list is needed, but would reduce these 1/min cpu
> spikes as only one list per fud would need to be checked. But then, it
> would be easy to change that later, if timeout checks turn out to be a
> problem.
>
Thanks for the review.
On v7 [1] which used an extra timeout list, the feedback was
"One thing I worry about is adding more roadblocks on the way to making
request queuing more scalable.
Currently there's fc->num_waiting that's touched on all requests and
bg_queue/bg_lock that are touched on background requests. We should
be trying to fix these bottlenecks instead of adding more.
Can't we use the existing lists to scan requests?
It's more complex, obviously, but at least it doesn't introduce yet
another per-fc list to worry about."
[1] https://lore.kernel.org/linux-fsdevel/CAJfpegs9A7iBbZpPMF-WuR48Ho_=z_ZWfjrLQG2ob0k6NbcaUg@mail.gmail.com/
>
> > +
> > + mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
> > + return;
> > +
> > +fpq_abort:
> > + spin_unlock(&fpq->lock);
> > + spin_unlock(&fc->lock);
> > +abort_conn:
> > + fuse_abort_conn(fc);
> > +}
> > +
> > static void fuse_request_init(struct fuse_mount *fm, struct fuse_req *req)
> > {
> > INIT_LIST_HEAD(&req->list);
> > @@ -53,6 +129,7 @@ static void fuse_request_init(struct fuse_mount *fm, struct fuse_req *req)
> > refcount_set(&req->count, 1);
> > __set_bit(FR_PENDING, &req->flags);
> > req->fm = fm;
> > + req->create_time = jiffies;
> > }
> >
> > static struct fuse_req *fuse_request_alloc(struct fuse_mount *fm, gfp_t flags)
> > @@ -2296,6 +2373,9 @@ void fuse_abort_conn(struct fuse_conn *fc)
> > spin_unlock(&fc->lock);
> >
> > end_requests(&to_end);
> > +
> > + if (fc->timeout.req_timeout)
> > + timer_delete(&fc->timeout.timer);
> > } else {
> > spin_unlock(&fc->lock);
> > }
> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > index 7ff00bae4a84..ef4558c2c44e 100644
> > --- a/fs/fuse/fuse_i.h
> > +++ b/fs/fuse/fuse_i.h
> > @@ -435,6 +435,9 @@ struct fuse_req {
> >
> > /** fuse_mount this request belongs to */
> > struct fuse_mount *fm;
> > +
> > + /** When (in jiffies) the request was created */
> > + unsigned long create_time;
> > };
> >
> > struct fuse_iqueue;
> > @@ -525,6 +528,16 @@ struct fuse_pqueue {
> > struct list_head io;
> > };
> >
> > +/* Frequency (in seconds) of request timeout checks, if opted into */
> > +#define FUSE_TIMEOUT_TIMER_FREQ 60 * HZ
> > +
> > +struct fuse_timeout {
> > + struct timer_list timer;
> > +
> > + /* Request timeout (in jiffies). 0 = no timeout */
> > + unsigned long req_timeout;
> > +};
> > +
> > /**
> > * Fuse device instance
> > */
> > @@ -571,6 +584,8 @@ struct fuse_fs_context {
> > enum fuse_dax_mode dax_mode;
> > unsigned int max_read;
> > unsigned int blksize;
> > + /* Request timeout (in minutes). 0 = no timeout (infinite wait) */
> > + unsigned int req_timeout;
> > const char *subtype;
> >
> > /* DAX device, may be NULL */
> > @@ -914,6 +929,9 @@ struct fuse_conn {
> > /** IDR for backing files ids */
> > struct idr backing_files_map;
> > #endif
> > +
> > + /** Only used if the connection enforces request timeouts */
> > + struct fuse_timeout timeout;
> > };
> >
> > /*
> > @@ -1175,6 +1193,9 @@ void fuse_request_end(struct fuse_req *req);
> > void fuse_abort_conn(struct fuse_conn *fc);
> > void fuse_wait_aborted(struct fuse_conn *fc);
> >
> > +/* Check if any requests timed out */
> > +void fuse_check_timeout(struct timer_list *timer);
> > +
> > /**
> > * Invalidate inode attributes
> > */
> > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > index f1779ff3f8d1..a78aac76b942 100644
> > --- a/fs/fuse/inode.c
> > +++ b/fs/fuse/inode.c
> > @@ -735,6 +735,7 @@ enum {
> > OPT_ALLOW_OTHER,
> > OPT_MAX_READ,
> > OPT_BLKSIZE,
> > + OPT_REQUEST_TIMEOUT,
> > OPT_ERR
> > };
> >
> > @@ -749,6 +750,7 @@ static const struct fs_parameter_spec fuse_fs_parameters[] = {
> > fsparam_u32 ("max_read", OPT_MAX_READ),
> > fsparam_u32 ("blksize", OPT_BLKSIZE),
> > fsparam_string ("subtype", OPT_SUBTYPE),
> > + fsparam_u16 ("request_timeout", OPT_REQUEST_TIMEOUT),
> > {}
> > };
> >
> > @@ -844,6 +846,10 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
> > ctx->blksize = result.uint_32;
> > break;
> >
> > + case OPT_REQUEST_TIMEOUT:
> > + ctx->req_timeout = result.uint_16;
> > + break;
> > +
> > default:
> > return -EINVAL;
> > }
> > @@ -973,6 +979,8 @@ void fuse_conn_put(struct fuse_conn *fc)
> >
> > if (IS_ENABLED(CONFIG_FUSE_DAX))
> > fuse_dax_conn_free(fc);
> > + if (fc->timeout.req_timeout)
> > + timer_shutdown_sync(&fc->timeout.timer);
> > if (fiq->ops->release)
> > fiq->ops->release(fiq);
> > put_pid_ns(fc->pid_ns);
> > @@ -1691,6 +1699,18 @@ int fuse_init_fs_context_submount(struct fs_context *fsc)
> > }
> > EXPORT_SYMBOL_GPL(fuse_init_fs_context_submount);
> >
> > +static void fuse_init_fc_timeout(struct fuse_conn *fc, struct fuse_fs_context *ctx)
> > +{
> > + if (ctx->req_timeout) {
> > + if (check_mul_overflow(ctx->req_timeout * 60, HZ, &fc->timeout.req_timeout))
> > + fc->timeout.req_timeout = U32_MAX;
>
>
> ULONG_MAX?
Nice, I'll change this to ULONG_MAX. We only run into this overflow on
32-bit systems (and only if the kernel has configured HZ to greater
than 1092) so U32_MAX is the same as ULONG_MAX, but ULONG_MAX looks
nicer since "fc->timeout.req_timeout" is an unsigned long.
Thanks,
Joanne
>
> > + timer_setup(&fc->timeout.timer, fuse_check_timeout, 0);
> > + mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
> > + } else {
> > + fc->timeout.req_timeout = 0;
> > + }
> > +}
> > +
> > int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
> > {
> > struct fuse_dev *fud = NULL;
> > @@ -1753,6 +1773,7 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
> > fc->destroy = ctx->destroy;
> > fc->no_control = ctx->no_control;
> > fc->no_force_umount = ctx->no_force_umount;
> > + fuse_init_fc_timeout(fc, ctx);
> >
> > err = -ENOMEM;
> > root = fuse_get_root_inode(sb, ctx->rootmode);
>
>
> Reviewed-by: Bernd Schubert <bschubert@ddn.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v8 3/3] fuse: add default_request_timeout and max_request_timeout sysctls
2024-10-29 19:40 ` Bernd Schubert
@ 2024-10-30 17:32 ` Joanne Koong
0 siblings, 0 replies; 12+ messages in thread
From: Joanne Koong @ 2024-10-30 17:32 UTC (permalink / raw)
To: Bernd Schubert
Cc: miklos, linux-fsdevel, josef, jefflexu, laoar.shao, kernel-team
On Tue, Oct 29, 2024 at 12:40 PM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
>
> On 10/11/24 21:13, Joanne Koong wrote:
> > Introduce two new sysctls, "default_request_timeout" and
> > "max_request_timeout". These control how long (in minutes) a server can
> > take to reply to a request. If the server does not reply by the timeout,
> > then the connection will be aborted.
> >
> > "default_request_timeout" sets the default timeout if no timeout is
> > specified by the fuse server on mount. 0 (default) indicates no default
> > timeout should be enforced. If the server did specify a timeout, then
> > default_request_timeout will be ignored.
> >
> > "max_request_timeout" sets the max amount of time the server may take to
> > reply to a request. 0 (default) indicates no maximum timeout. If
> > max_request_timeout is set and the fuse server attempts to set a
> > timeout greater than max_request_timeout, the system will use
> > max_request_timeout as the timeout. Similarly, if default_request_timeout
> > is greater than max_request_timeout, the system will use
> > max_request_timeout as the timeout. If the server does not request a
> > timeout and default_request_timeout is set to 0 but max_request_timeout
> > is set, then the timeout will be max_request_timeout.
> >
> > Please note that these timeouts are not 100% precise. The request may
> > take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the set max timeout
> > due to how it's internally implemented.
> >
> > $ sysctl -a | grep fuse.default_request_timeout
> > fs.fuse.default_request_timeout = 0
> >
> > $ echo 65536 | sudo tee /proc/sys/fs/fuse/default_request_timeout
> > tee: /proc/sys/fs/fuse/default_request_timeout: Invalid argument
> >
> > $ echo 65535 | sudo tee /proc/sys/fs/fuse/default_request_timeout
> > 65535
> >
> > $ sysctl -a | grep fuse.default_request_timeout
> > fs.fuse.default_request_timeout = 65535
> >
> > $ echo 0 | sudo tee /proc/sys/fs/fuse/default_request_timeout
> > 0
> >
> > $ sysctl -a | grep fuse.default_request_timeout
> > fs.fuse.default_request_timeout = 0
> >
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> > Documentation/admin-guide/sysctl/fs.rst | 27 +++++++++++++++++++++++++
> > fs/fuse/fuse_i.h | 10 +++++++++
> > fs/fuse/inode.c | 16 +++++++++++++--
> > fs/fuse/sysctl.c | 20 ++++++++++++++++++
> > 4 files changed, 71 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
> > index fa25d7e718b3..790a34291467 100644
> > --- a/Documentation/admin-guide/sysctl/fs.rst
> > +++ b/Documentation/admin-guide/sysctl/fs.rst
> > @@ -342,3 +342,30 @@ filesystems:
> > ``/proc/sys/fs/fuse/max_pages_limit`` is a read/write file for
> > setting/getting the maximum number of pages that can be used for servicing
> > requests in FUSE.
> > +
> > +``/proc/sys/fs/fuse/default_request_timeout`` is a read/write file for
> > +setting/getting the default timeout (in minutes) for a fuse server to
> > +reply to a kernel-issued request in the event where the server did not
> > +specify a timeout at mount. If the server set a timeout,
> > +then default_request_timeout will be ignored. The default
> > +"default_request_timeout" is set to 0. 0 indicates a no-op (eg
> > +requests will not have a default request timeout set if no timeout was
> > +specified by the server).
> > +
> > +``/proc/sys/fs/fuse/max_request_timeout`` is a read/write file for
> > +setting/getting the maximum timeout (in minutes) for a fuse server to
> > +reply to a kernel-issued request. A value greater than 0 automatically opts
> > +the server into a timeout that will be at most "max_request_timeout", even if
> > +the server did not specify a timeout and default_request_timeout is set to 0.
> > +If max_request_timeout is greater than 0 and the server set a timeout greater
> > +than max_request_timeout or default_request_timeout is set to a value greater
> > +than max_request_timeout, the system will use max_request_timeout as the
> > +timeout. 0 indicates a no-op (eg requests will not have an upper bound on the
> > +timeout and if the server did not request a timeout and default_request_timeout
> > +was not set, there will be no timeout).
> > +
> > +Please note that for the timeout options, if the server does not respond to
> > +the request by the time the timeout elapses, then the connection to the fuse
> > +server will be aborted. Please also note that the timeouts are not 100%
> > +precise (eg you may set 10 minutes but the timeout may kick in after 11
> > +minutes).
> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > index ef4558c2c44e..28d9230f4fcb 100644
> > --- a/fs/fuse/fuse_i.h
> > +++ b/fs/fuse/fuse_i.h
> > @@ -46,6 +46,16 @@
> >
> > /** Maximum of max_pages received in init_out */
> > extern unsigned int fuse_max_pages_limit;
> > +/*
> > + * Default timeout (in minutes) for the server to reply to a request
> > + * before the connection is aborted, if no timeout was specified on mount.
> > + */
> > +extern unsigned int fuse_default_req_timeout;
> > +/*
> > + * Max timeout (in minutes) for the server to reply to a request before
> > + * the connection is aborted.
> > + */
> > +extern unsigned int fuse_max_req_timeout;
> >
> > /** List of active connections */
> > extern struct list_head fuse_conn_list;
> > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > index a78aac76b942..d97dde59eac3 100644
> > --- a/fs/fuse/inode.c
> > +++ b/fs/fuse/inode.c
> > @@ -36,6 +36,9 @@ DEFINE_MUTEX(fuse_mutex);
> > static int set_global_limit(const char *val, const struct kernel_param *kp);
> >
> > unsigned int fuse_max_pages_limit = 256;
> > +/* default is no timeout */
> > +unsigned int fuse_default_req_timeout = 0;
> > +unsigned int fuse_max_req_timeout = 0;
> >
> > unsigned max_user_bgreq;
> > module_param_call(max_user_bgreq, set_global_limit, param_get_uint,
> > @@ -1701,8 +1704,17 @@ EXPORT_SYMBOL_GPL(fuse_init_fs_context_submount);
> >
> > static void fuse_init_fc_timeout(struct fuse_conn *fc, struct fuse_fs_context *ctx)
> > {
> > - if (ctx->req_timeout) {
> > - if (check_mul_overflow(ctx->req_timeout * 60, HZ, &fc->timeout.req_timeout))
> > + unsigned int timeout = ctx->req_timeout ?: fuse_default_req_timeout;
> > +
> > + if (fuse_max_req_timeout) {
> > + if (!timeout)
> > + timeout = fuse_max_req_timeout;
>
> Hmm, so 'max' might be used as 'min' as well. Isn't that a bit confusing?
Interesting. Yeah, i don't think there's a good name that encapsulates
the behavior of this (eg sets a min timeout if no timeout is set but
sets the max timeout if a timeout is set). in my head, I think of it
more as "max timeout that is automatically enforced if there's no timeout".
Thanks,
Joanne
>
>
> > + else
> > + timeout = min(timeout, fuse_max_req_timeout)> + }
> > +
> > + if (timeout) {
> > + if (check_mul_overflow(timeout * 60, HZ, &fc->timeout.req_timeout))
> > fc->timeout.req_timeout = U32_MAX;
> > timer_setup(&fc->timeout.timer, fuse_check_timeout, 0);
> > mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
> > diff --git a/fs/fuse/sysctl.c b/fs/fuse/sysctl.c
> > index b272bb333005..e70b5269c16d 100644
> > --- a/fs/fuse/sysctl.c
> > +++ b/fs/fuse/sysctl.c
> > @@ -13,6 +13,8 @@ static struct ctl_table_header *fuse_table_header;
> > /* Bound by fuse_init_out max_pages, which is a u16 */
> > static unsigned int sysctl_fuse_max_pages_limit = 65535;
> >
> > +static unsigned int sysctl_fuse_max_req_timeout_limit = U16_MAX;
> > +
> > static struct ctl_table fuse_sysctl_table[] = {
> > {
> > .procname = "max_pages_limit",
> > @@ -23,6 +25,24 @@ static struct ctl_table fuse_sysctl_table[] = {
> > .extra1 = SYSCTL_ONE,
> > .extra2 = &sysctl_fuse_max_pages_limit,
> > },
> > + {
> > + .procname = "default_request_timeout",
> > + .data = &fuse_default_req_timeout,
> > + .maxlen = sizeof(fuse_default_req_timeout),
> > + .mode = 0644,
> > + .proc_handler = proc_douintvec_minmax,
> > + .extra1 = SYSCTL_ZERO,
>
> There is slight whitespace issue here - spaces instead of tabs.
Thanks for noting - I'll fix this and the one below.
>
>
> > + .extra2 = &sysctl_fuse_max_req_timeout_limit,
> > + },
> > + {
> > + .procname = "max_request_timeout",
> > + .data = &fuse_max_req_timeout,
> > + .maxlen = sizeof(fuse_max_req_timeout),
> > + .mode = 0644,
> > + .proc_handler = proc_douintvec_minmax,
> > + .extra1 = SYSCTL_ZERO,
>
> And here.
>
> > + .extra2 = &sysctl_fuse_max_req_timeout_limit,
> > + },
> > };
> >
> > int fuse_sysctl_register(void)
>
> Reviewed-by: Bernd Schubert <bschubert@ddn.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v8 2/3] fuse: add optional kernel-enforced timeout for requests
2024-10-30 17:21 ` Joanne Koong
@ 2024-10-30 17:50 ` Bernd Schubert
0 siblings, 0 replies; 12+ messages in thread
From: Bernd Schubert @ 2024-10-30 17:50 UTC (permalink / raw)
To: Joanne Koong
Cc: miklos, linux-fsdevel, josef, jefflexu, laoar.shao, kernel-team
On 10/30/24 18:21, Joanne Koong wrote:
> On Tue, Oct 29, 2024 at 12:17 PM Bernd Schubert
> <bernd.schubert@fastmail.fm> wrote:
>>
>>
>>
>> On 10/11/24 21:13, Joanne Koong wrote:
>>> There are situations where fuse servers can become unresponsive or
>>> stuck, for example if the server is deadlocked. Currently, there's no
>>> good way to detect if a server is stuck and needs to be killed manually.
>>>
>>> This commit adds an option for enforcing a timeout (in minutes) for
>>> requests where if the timeout elapses without the server responding to
>>> the request, the connection will be automatically aborted.
>>>
>>> Please note that these timeouts are not 100% precise. The request may
>>> take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the requested max
>>> timeout due to how it's internally implemented.
>>>
>>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
>>> ---
>>> fs/fuse/dev.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
>>> fs/fuse/fuse_i.h | 21 +++++++++++++
>>> fs/fuse/inode.c | 21 +++++++++++++
>>> 3 files changed, 122 insertions(+)
>>>
>>> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
>>> index 1f64ae6d7a69..054bfa2a26ed 100644
>>> --- a/fs/fuse/dev.c
>>> +++ b/fs/fuse/dev.c
>>> @@ -45,6 +45,82 @@ static struct fuse_dev *fuse_get_dev(struct file *file)
>>> return READ_ONCE(file->private_data);
>>> }
>>>
>>> +static bool request_expired(struct fuse_conn *fc, struct fuse_req *req)
>>> +{
>>> + return jiffies > req->create_time + fc->timeout.req_timeout;
>>> +}
>>> +
>>> +/*
>>> + * Check if any requests aren't being completed by the specified request
>>> + * timeout. To do so, we:
>>> + * - check the fiq pending list
>>> + * - check the bg queue
>>> + * - check the fpq io and processing lists
>>> + *
>>> + * To make this fast, we only check against the head request on each list since
>>> + * these are generally queued in order of creation time (eg newer requests get
>>> + * queued to the tail). We might miss a few edge cases (eg requests transitioning
>>> + * between lists, re-sent requests at the head of the pending list having a
>>> + * later creation time than other requests on that list, etc.) but that is fine
>>> + * since if the request never gets fulfilled, it will eventually be caught.
>>> + */
>>> +void fuse_check_timeout(struct timer_list *timer)
>>> +{
>>> + struct fuse_conn *fc = container_of(timer, struct fuse_conn, timeout.timer);
>>> + struct fuse_iqueue *fiq = &fc->iq;
>>> + struct fuse_req *req;
>>> + struct fuse_dev *fud;
>>> + struct fuse_pqueue *fpq;
>>> + bool expired = false;
>>> + int i;
>>> +
>>> + spin_lock(&fiq->lock);
>>> + req = list_first_entry_or_null(&fiq->pending, struct fuse_req, list);
>>> + if (req)
>>> + expired = request_expired(fc, req);
>>> + spin_unlock(&fiq->lock);
>>> + if (expired)
>>> + goto abort_conn;
>>> +
>>> + spin_lock(&fc->bg_lock);
>>> + req = list_first_entry_or_null(&fc->bg_queue, struct fuse_req, list);
>>> + if (req)
>>> + expired = request_expired(fc, req);
>>> + spin_unlock(&fc->bg_lock);
>>> + if (expired)
>>> + goto abort_conn;
>>> +
>>> + spin_lock(&fc->lock);
>>> + if (!fc->connected) {
>>> + spin_unlock(&fc->lock);
>>> + return;
>>> + }
>>> + list_for_each_entry(fud, &fc->devices, entry) {
>>> + fpq = &fud->pq;
>>> + spin_lock(&fpq->lock);
>>> + req = list_first_entry_or_null(&fpq->io, struct fuse_req, list);
>>> + if (req && request_expired(fc, req))
>>> + goto fpq_abort;
>>> +
>>> + for (i = 0; i < FUSE_PQ_HASH_SIZE; i++) {
>>> + req = list_first_entry_or_null(&fpq->processing[i], struct fuse_req, list);
>>> + if (req && request_expired(fc, req))
>>> + goto fpq_abort;
>>> + }
>>> + spin_unlock(&fpq->lock);
>>> + }
>>> + spin_unlock(&fc->lock);
>>
>> I really don't have a strong opinion on that - I wonder if it wouldn't
>> be better for this part to have an extra timeout list per fud or pq as
>> previously. That would slightly increases memory usage and overhead per
>> request as a second list is needed, but would reduce these 1/min cpu
>> spikes as only one list per fud would need to be checked. But then, it
>> would be easy to change that later, if timeout checks turn out to be a
>> problem.
>>
>
> Thanks for the review.
>
> On v7 [1] which used an extra timeout list, the feedback was
>
> "One thing I worry about is adding more roadblocks on the way to making
> request queuing more scalable.
>
> Currently there's fc->num_waiting that's touched on all requests and
> bg_queue/bg_lock that are touched on background requests. We should
> be trying to fix these bottlenecks instead of adding more.
>
> Can't we use the existing lists to scan requests?
>
> It's more complex, obviously, but at least it doesn't introduce yet
> another per-fc list to worry about."
>
>
> [1] https://lore.kernel.org/linux-fsdevel/CAJfpegs9A7iBbZpPMF-WuR48Ho_=z_ZWfjrLQG2ob0k6NbcaUg@mail.gmail.com/
Yeah I know, but I'm just a bit scared about the cpu spikes this gives
on a recent systems (like 96 cores AMDs we have in the lab) and when
device clone is activated (and Miklos had also asked to use the same
hash lists to find requests with fuse-uring).
The previous discussion was mainly about avoiding a new global lock,
i.e. I wonder if can't use a mix - avoid the new list and its lock,
but still avoid scanning through [FUSE_PQ_HASH_SIZE] lists per
cloned-device/core.
I'm also not opposed against the current version and optimize it later.
Thanks,
Bernd
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-10-30 17:50 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-11 19:13 [PATCH v8 0/3] fuse: add kernel-enforced request timeout option Joanne Koong
2024-10-11 19:13 ` [PATCH v8 1/3] fs_parser: add fsparam_u16 helper Joanne Koong
2024-10-29 20:58 ` Bernd Schubert
2024-10-30 16:28 ` Joanne Koong
2024-10-11 19:13 ` [PATCH v8 2/3] fuse: add optional kernel-enforced timeout for requests Joanne Koong
2024-10-29 19:17 ` Bernd Schubert
2024-10-30 17:21 ` Joanne Koong
2024-10-30 17:50 ` Bernd Schubert
2024-10-11 19:13 ` [PATCH v8 3/3] fuse: add default_request_timeout and max_request_timeout sysctls Joanne Koong
2024-10-29 19:40 ` Bernd Schubert
2024-10-30 17:32 ` Joanne Koong
2024-10-24 16:19 ` [PATCH v8 0/3] fuse: add kernel-enforced request timeout option Joanne Koong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).