* Freezer: Don't count threads waiting for frozen filesystems.
@ 2008-10-24 22:07 Nigel Cunningham
2008-10-26 20:01 ` Rafael J. Wysocki
` (2 more replies)
0 siblings, 3 replies; 59+ messages in thread
From: Nigel Cunningham @ 2008-10-24 22:07 UTC (permalink / raw)
To: Linux-PM
Hi.
While working on freezing fuse filesystems, I found that if a filesystem
is frozen when we try to freeze processes, freezing can fail because
threads are waiting in vfs_check_frozen for the filesystem to be thawed.
We should thus not count such threads.
The check will be safe if a filesystem is thawed while we're freezing
processes because filesystem thaws are only invoked from userspace. Any
waiting processes will be woken and frozen prior to us completing the
freezing of userspace (the caller invoking the filesystem thaw will be
freezing) or - in the worst case - together with kernel threads.
Signed-off-by: Nigel Cunningham <nigel@tuxonice.net>
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a6a625b..c9b055d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -8,6 +8,7 @@
#include <linux/limits.h>
#include <linux/ioctl.h>
+#include <linux/freezer.h>
/*
* It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -1187,8 +1190,11 @@ enum {
SB_FREEZE_TRANS = 2,
};
-#define vfs_check_frozen(sb, level) \
- wait_event((sb)->s_wait_unfrozen, ((sb)->s_frozen < (level)))
+#define vfs_check_frozen(sb, level) do { \
+ freezer_do_not_count(); \
+ wait_event((sb)->s_wait_unfrozen, ((sb)->s_frozen < (level))); \
+ freezer_count(); \
+} while (0)
#define get_fs_excl() atomic_inc(¤t->fs_excl)
#define put_fs_excl() atomic_dec(¤t->fs_excl)
^ permalink raw reply related [flat|nested] 59+ messages in thread* Re: Freezer: Don't count threads waiting for frozen filesystems. 2008-10-24 22:07 Freezer: Don't count threads waiting for frozen filesystems Nigel Cunningham @ 2008-10-26 20:01 ` Rafael J. Wysocki [not found] ` <200810262101.08822.rjw@sisk.pl> 2008-10-28 17:49 ` Pavel Machek 2 siblings, 0 replies; 59+ messages in thread From: Rafael J. Wysocki @ 2008-10-26 20:01 UTC (permalink / raw) To: Nigel Cunningham, Miklos Szeredi; +Cc: linux-pm, LKML On Saturday, 25 of October 2008, Nigel Cunningham wrote: > Hi. Hi Nigel, > While working on freezing fuse filesystems, I found that if a filesystem > is frozen when we try to freeze processes, freezing can fail because > threads are waiting in vfs_check_frozen for the filesystem to be thawed. > We should thus not count such threads. > > The check will be safe if a filesystem is thawed while we're freezing > processes because filesystem thaws are only invoked from userspace. Any > waiting processes will be woken and frozen prior to us completing the > freezing of userspace (the caller invoking the filesystem thaw will be > freezing) or - in the worst case - together with kernel threads. While your description above seems completely correct to me and I have no objections to the patch, I'd prefer it if someone having more experience with the VFS looked at it. Miklos, can you have a look, please? Thanks, Rafael > Signed-off-by: Nigel Cunningham <nigel@tuxonice.net> > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index a6a625b..c9b055d 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -8,6 +8,7 @@ > > #include <linux/limits.h> > #include <linux/ioctl.h> > +#include <linux/freezer.h> > > /* > * It's silly to have NR_OPEN bigger than NR_FILE, but you can change > @@ -1187,8 +1190,11 @@ enum { > SB_FREEZE_TRANS = 2, > }; > > -#define vfs_check_frozen(sb, level) \ > - wait_event((sb)->s_wait_unfrozen, ((sb)->s_frozen < (level))) > +#define vfs_check_frozen(sb, level) do { \ > + freezer_do_not_count(); \ > + wait_event((sb)->s_wait_unfrozen, ((sb)->s_frozen < (level))); \ > + freezer_count(); \ > +} while (0) > > #define get_fs_excl() atomic_inc(¤t->fs_excl) > #define put_fs_excl() atomic_dec(¤t->fs_excl) > > ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <200810262101.08822.rjw@sisk.pl>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <200810262101.08822.rjw@sisk.pl> @ 2008-10-27 11:12 ` Miklos Szeredi [not found] ` <E1KuQ1J-0002Vw-9Q@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-27 11:12 UTC (permalink / raw) To: rjw; +Cc: linux-kernel, ncunningham, linux-pm, miklos On Sun, 26 Oct 2008, Rafael J. Wysocki wrote: > On Saturday, 25 of October 2008, Nigel Cunningham wrote: > > While working on freezing fuse filesystems, I found that if a filesystem > > is frozen when we try to freeze processes, freezing can fail because > > threads are waiting in vfs_check_frozen for the filesystem to be thawed. > > We should thus not count such threads. > > > > The check will be safe if a filesystem is thawed while we're freezing > > processes because filesystem thaws are only invoked from userspace. Any > > waiting processes will be woken and frozen prior to us completing the > > freezing of userspace (the caller invoking the filesystem thaw will be > > freezing) or - in the worst case - together with kernel threads. The description is missing some details: why is the filesystem frozen before suspend? AFAICS this can happen when DM calls bdev_freeze() on the device before the task freezing begins. Is this the case? Also, while the patch might solve some of the symptoms of the fuse vs. process freezer interaction, it will not fully fix that problem. As such it's just a hack to hide the problem, making it less likely to appear. So I'm not at all conviced about this patch. Thanks, Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KuQ1J-0002Vw-9Q@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KuQ1J-0002Vw-9Q@pomaz-ex.szeredi.hu> @ 2008-10-27 11:20 ` Nigel Cunningham [not found] ` <1225106427.26724.5.camel@nigel-laptop> 1 sibling, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-27 11:20 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi Miklos. On Mon, 2008-10-27 at 12:12 +0100, Miklos Szeredi wrote: > On Sun, 26 Oct 2008, Rafael J. Wysocki wrote: > > On Saturday, 25 of October 2008, Nigel Cunningham wrote: > > > While working on freezing fuse filesystems, I found that if a filesystem > > > is frozen when we try to freeze processes, freezing can fail because > > > threads are waiting in vfs_check_frozen for the filesystem to be thawed. > > > We should thus not count such threads. > > > > > > The check will be safe if a filesystem is thawed while we're freezing > > > processes because filesystem thaws are only invoked from userspace. Any > > > waiting processes will be woken and frozen prior to us completing the > > > freezing of userspace (the caller invoking the filesystem thaw will be > > > freezing) or - in the worst case - together with kernel threads. > > The description is missing some details: why is the filesystem frozen > before suspend? AFAICS this can happen when DM calls bdev_freeze() on > the device before the task freezing begins. Is this the case? It doesn't matter why a process is sitting in that wait_event call. What does matter is that one can be there. In the case where I saw it, I was working on fuse freezing. I don't remember the details, as it's a year since I made this patch, but I don't think I wasn't using fuse or DM. > Also, while the patch might solve some of the symptoms of the fuse > vs. process freezer interaction, it will not fully fix that problem. > As such it's just a hack to hide the problem, making it less likely to > appear. No, it's part of the solution. I haven't posted the full fuse freezing patch because I thought this could be profitably merged without the rest of the patch. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225106427.26724.5.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225106427.26724.5.camel@nigel-laptop> @ 2008-10-27 11:37 ` Rafael J. Wysocki 2008-10-27 11:37 ` Miklos Szeredi [not found] ` <200810271237.40049.rjw@sisk.pl> 2 siblings, 0 replies; 59+ messages in thread From: Rafael J. Wysocki @ 2008-10-27 11:37 UTC (permalink / raw) To: Nigel Cunningham; +Cc: linux-pm, linux-kernel, Miklos Szeredi On Monday, 27 of October 2008, Nigel Cunningham wrote: > Hi Miklos. > > On Mon, 2008-10-27 at 12:12 +0100, Miklos Szeredi wrote: > > On Sun, 26 Oct 2008, Rafael J. Wysocki wrote: > > > On Saturday, 25 of October 2008, Nigel Cunningham wrote: > > > > While working on freezing fuse filesystems, I found that if a filesystem > > > > is frozen when we try to freeze processes, freezing can fail because > > > > threads are waiting in vfs_check_frozen for the filesystem to be thawed. > > > > We should thus not count such threads. > > > > > > > > The check will be safe if a filesystem is thawed while we're freezing > > > > processes because filesystem thaws are only invoked from userspace. Any > > > > waiting processes will be woken and frozen prior to us completing the > > > > freezing of userspace (the caller invoking the filesystem thaw will be > > > > freezing) or - in the worst case - together with kernel threads. > > > > The description is missing some details: why is the filesystem frozen > > before suspend? AFAICS this can happen when DM calls bdev_freeze() on > > the device before the task freezing begins. Is this the case? > > It doesn't matter why a process is sitting in that wait_event call. What > does matter is that one can be there. In the case where I saw it, I was > working on fuse freezing. I don't remember the details, as it's a year > since I made this patch, but I don't think I wasn't using fuse or DM. > > > Also, while the patch might solve some of the symptoms of the fuse > > vs. process freezer interaction, it will not fully fix that problem. > > As such it's just a hack to hide the problem, making it less likely to > > appear. > > No, it's part of the solution. I haven't posted the full fuse freezing > patch because I thought this could be profitably merged without the rest > of the patch. Well, I guess it's better if you post the entire thing so that we can see what the role of the $subject patch is in it, even if this patch finally gets merged separately. Thanks, Rafael ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225106427.26724.5.camel@nigel-laptop> 2008-10-27 11:37 ` Rafael J. Wysocki @ 2008-10-27 11:37 ` Miklos Szeredi [not found] ` <200810271237.40049.rjw@sisk.pl> 2 siblings, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-27 11:37 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Mon, 27 Oct 2008, Nigel Cunningham wrote: > It doesn't matter why a process is sitting in that wait_event call. What > does matter is that one can be there. Right, but it could also be sleeping in any other wait_event(), mutex_lock(), etc, resulting in the same issue. > No, it's part of the solution. I haven't posted the full fuse freezing > patch because I thought this could be profitably merged without the rest > of the patch. Please post the full patch, it'd be a lot better to see the big picture instead of just a small part of it. Thanks, Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <200810271237.40049.rjw@sisk.pl>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <200810271237.40049.rjw@sisk.pl> @ 2008-10-27 11:40 ` Nigel Cunningham [not found] ` <1225107607.26724.9.camel@nigel-laptop> 1 sibling, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-27 11:40 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: linux-pm, linux-kernel, Miklos Szeredi Hi. On Mon, 2008-10-27 at 12:37 +0100, Rafael J. Wysocki wrote: > On Monday, 27 of October 2008, Nigel Cunningham wrote: > > Hi Miklos. > > > > On Mon, 2008-10-27 at 12:12 +0100, Miklos Szeredi wrote: > > > On Sun, 26 Oct 2008, Rafael J. Wysocki wrote: > > > > On Saturday, 25 of October 2008, Nigel Cunningham wrote: > > > > > While working on freezing fuse filesystems, I found that if a filesystem > > > > > is frozen when we try to freeze processes, freezing can fail because > > > > > threads are waiting in vfs_check_frozen for the filesystem to be thawed. > > > > > We should thus not count such threads. > > > > > > > > > > The check will be safe if a filesystem is thawed while we're freezing > > > > > processes because filesystem thaws are only invoked from userspace. Any > > > > > waiting processes will be woken and frozen prior to us completing the > > > > > freezing of userspace (the caller invoking the filesystem thaw will be > > > > > freezing) or - in the worst case - together with kernel threads. > > > > > > The description is missing some details: why is the filesystem frozen > > > before suspend? AFAICS this can happen when DM calls bdev_freeze() on > > > the device before the task freezing begins. Is this the case? > > > > It doesn't matter why a process is sitting in that wait_event call. What > > does matter is that one can be there. In the case where I saw it, I was > > working on fuse freezing. I don't remember the details, as it's a year > > since I made this patch, but I don't think I wasn't using fuse or DM. > > > > > Also, while the patch might solve some of the symptoms of the fuse > > > vs. process freezer interaction, it will not fully fix that problem. > > > As such it's just a hack to hide the problem, making it less likely to > > > appear. > > > > No, it's part of the solution. I haven't posted the full fuse freezing > > patch because I thought this could be profitably merged without the rest > > of the patch. > > Well, I guess it's better if you post the entire thing so that we can see > what the role of the $subject patch is in it, even if this patch finally gets > merged separately. Ah.. that makes me see how vfs_check_frozen was getting triggered... (fs/namei.c, below). Regards, Nigel fs/buffer.c | 87 ++++++++++++++++++++++++++++++++++++++++++++ fs/fuse/control.c | 1 fs/fuse/dev.c | 7 +++ fs/fuse/dir.c | 35 +++++++++++++++-- fs/fuse/file.c | 14 +++++++ fs/fuse/fuse.h | 13 ++++++ fs/fuse/inode.c | 4 +- fs/namei.c | 2 + include/linux/buffer_head.h | 5 ++ include/linux/freezer.h | 15 +++++++ include/linux/fs.h | 10 ++++- kernel/power/process.c | 48 +++++++++++++++++++++--- 12 files changed, 227 insertions(+), 14 deletions(-) diff -ruNp 700-BRANCH-fuse-freezing.patch-old/fs/buffer.c 700-BRANCH-fuse-freezing.patch-new/fs/buffer.c --- 700-BRANCH-fuse-freezing.patch-old/fs/buffer.c 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/fs/buffer.c 2008-10-21 13:16:08.000000000 +1100 @@ -247,6 +247,93 @@ void thaw_bdev(struct block_device *bdev } EXPORT_SYMBOL(thaw_bdev); +#if 0 +#define FS_PRINTK(fmt, args...) printk(fmt, ## args) +#else +#define FS_PRINTK(fmt, args...) +#endif + +/* #define DEBUG_FS_FREEZING */ + +/** + * freeze_filesystems - lock all filesystems and force them into a consistent + * state + * @which: What combination of fuse & non-fuse to freeze. + */ +void freeze_filesystems(int which) +{ + struct super_block *sb; + + lockdep_off(); + + /* + * Freeze in reverse order so filesystems dependant upon others are + * frozen in the right order (eg. loopback on ext3). + */ + list_for_each_entry_reverse(sb, &super_blocks, s_list) { + FS_PRINTK(KERN_INFO "Considering %s.%s: (root %p, bdev %x)", + sb->s_type->name ? sb->s_type->name : "?", + sb->s_subtype ? sb->s_subtype : "", sb->s_root, + sb->s_bdev ? sb->s_bdev->bd_dev : 0); + + if (sb->s_type->fs_flags & FS_IS_FUSE && + sb->s_frozen == SB_UNFROZEN && + which & FS_FREEZER_FUSE) { + sb->s_frozen = SB_FREEZE_TRANS; + sb->s_flags |= MS_FROZEN; + FS_PRINTK("Fuse filesystem done.\n"); + continue; + } + + if (!sb->s_root || !sb->s_bdev || + (sb->s_frozen == SB_FREEZE_TRANS) || + (sb->s_flags & MS_RDONLY) || + (sb->s_flags & MS_FROZEN) || + !(which & FS_FREEZER_NORMAL)) { + FS_PRINTK(KERN_INFO "Nope.\n"); + continue; + } + + FS_PRINTK(KERN_INFO "Freezing %x... ", sb->s_bdev->bd_dev); + freeze_bdev(sb->s_bdev); + sb->s_flags |= MS_FROZEN; + FS_PRINTK(KERN_INFO "Done.\n"); + } + + lockdep_on(); +} + +/** + * thaw_filesystems - unlock all filesystems + * @which: What combination of fuse & non-fuse to thaw. + */ +void thaw_filesystems(int which) +{ + struct super_block *sb; + + lockdep_off(); + + list_for_each_entry(sb, &super_blocks, s_list) { + if (!(sb->s_flags & MS_FROZEN)) + continue; + + if (sb->s_type->fs_flags & FS_IS_FUSE) { + if (!(which & FS_FREEZER_FUSE)) + continue; + + sb->s_frozen = SB_UNFROZEN; + } else { + if (!(which & FS_FREEZER_NORMAL)) + continue; + + thaw_bdev(sb->s_bdev, sb); + } + sb->s_flags &= ~MS_FROZEN; + } + + lockdep_on(); +} + /* * Various filesystems appear to want __find_get_block to be non-blocking. * But it's the page lock which protects the buffers. To get around this, diff -ruNp 700-BRANCH-fuse-freezing.patch-old/fs/fuse/control.c 700-BRANCH-fuse-freezing.patch-new/fs/fuse/control.c --- 700-BRANCH-fuse-freezing.patch-old/fs/fuse/control.c 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/fs/fuse/control.c 2008-10-21 13:12:40.000000000 +1100 @@ -207,6 +207,7 @@ static void fuse_ctl_kill_sb(struct supe static struct file_system_type fuse_ctl_fs_type = { .owner = THIS_MODULE, .name = "fusectl", + .fs_flags = FS_IS_FUSE, .get_sb = fuse_ctl_get_sb, .kill_sb = fuse_ctl_kill_sb, }; diff -ruNp 700-BRANCH-fuse-freezing.patch-old/fs/fuse/dev.c 700-BRANCH-fuse-freezing.patch-new/fs/fuse/dev.c --- 700-BRANCH-fuse-freezing.patch-old/fs/fuse/dev.c 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/fs/fuse/dev.c 2008-10-21 13:12:40.000000000 +1100 @@ -7,6 +7,7 @@ */ #include "fuse_i.h" +#include "fuse.h" #include <linux/init.h> #include <linux/module.h> @@ -16,6 +17,7 @@ #include <linux/pagemap.h> #include <linux/file.h> #include <linux/slab.h> +#include <linux/freezer.h> MODULE_ALIAS_MISCDEV(FUSE_MINOR); @@ -743,6 +745,8 @@ static ssize_t fuse_dev_read(struct kioc if (!fc) return -EPERM; + FUSE_MIGHT_FREEZE(file->f_mapping->host->i_sb, "fuse_dev_read"); + restart: spin_lock(&fc->lock); err = -EAGAIN; @@ -869,6 +873,9 @@ static ssize_t fuse_dev_write(struct kio if (!fc) return -EPERM; + FUSE_MIGHT_FREEZE(iocb->ki_filp->f_mapping->host->i_sb, + "fuse_dev_write"); + fuse_copy_init(&cs, fc, 0, NULL, iov, nr_segs); if (nbytes < sizeof(struct fuse_out_header)) return -EINVAL; diff -ruNp 700-BRANCH-fuse-freezing.patch-old/fs/fuse/dir.c 700-BRANCH-fuse-freezing.patch-new/fs/fuse/dir.c --- 700-BRANCH-fuse-freezing.patch-old/fs/fuse/dir.c 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/fs/fuse/dir.c 2008-10-21 13:12:40.000000000 +1100 @@ -7,12 +7,14 @@ */ #include "fuse_i.h" +#include "fuse.h" #include <linux/pagemap.h> #include <linux/file.h> #include <linux/gfp.h> #include <linux/sched.h> #include <linux/namei.h> +#include <linux/freezer.h> #if BITS_PER_LONG >= 64 static inline void fuse_dentry_settime(struct dentry *entry, u64 time) @@ -174,6 +176,9 @@ static int fuse_dentry_revalidate(struct return 0; fc = get_fuse_conn(inode); + + FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_dentry_revalidate"); + req = fuse_get_req(fc); if (IS_ERR(req)) return 0; @@ -273,6 +278,8 @@ int fuse_lookup_name(struct super_block if (IS_ERR(req)) goto out; + FUSE_MIGHT_FREEZE(sb, "fuse_lookup"); + forget_req = fuse_get_req(fc); err = PTR_ERR(forget_req); if (IS_ERR(forget_req)) { @@ -402,6 +409,8 @@ static int fuse_create_open(struct inode if (IS_ERR(forget_req)) return PTR_ERR(forget_req); + FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_create_open"); + req = fuse_get_req(fc); err = PTR_ERR(req); if (IS_ERR(req)) @@ -488,6 +497,8 @@ static int create_new_entry(struct fuse_ int err; struct fuse_req *forget_req; + FUSE_MIGHT_FREEZE(dir->i_sb, "create_new_entry"); + forget_req = fuse_get_req(fc); if (IS_ERR(forget_req)) { fuse_put_request(fc, req); @@ -585,7 +596,11 @@ static int fuse_mkdir(struct inode *dir, { struct fuse_mkdir_in inarg; struct fuse_conn *fc = get_fuse_conn(dir); - struct fuse_req *req = fuse_get_req(fc); + struct fuse_req *req; + + FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_mkdir"); + + req = fuse_get_req(fc); if (IS_ERR(req)) return PTR_ERR(req); @@ -605,7 +620,11 @@ static int fuse_symlink(struct inode *di { struct fuse_conn *fc = get_fuse_conn(dir); unsigned len = strlen(link) + 1; - struct fuse_req *req = fuse_get_req(fc); + struct fuse_req *req; + + FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_symlink"); + + req = fuse_get_req(fc); if (IS_ERR(req)) return PTR_ERR(req); @@ -622,7 +641,11 @@ static int fuse_unlink(struct inode *dir { int err; struct fuse_conn *fc = get_fuse_conn(dir); - struct fuse_req *req = fuse_get_req(fc); + struct fuse_req *req; + + FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_unlink"); + + req = fuse_get_req(fc); if (IS_ERR(req)) return PTR_ERR(req); @@ -653,7 +676,11 @@ static int fuse_rmdir(struct inode *dir, { int err; struct fuse_conn *fc = get_fuse_conn(dir); - struct fuse_req *req = fuse_get_req(fc); + struct fuse_req *req; + + FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_rmdir"); + + req = fuse_get_req(fc); if (IS_ERR(req)) return PTR_ERR(req); diff -ruNp 700-BRANCH-fuse-freezing.patch-old/fs/fuse/file.c 700-BRANCH-fuse-freezing.patch-new/fs/fuse/file.c --- 700-BRANCH-fuse-freezing.patch-old/fs/fuse/file.c 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/fs/fuse/file.c 2008-10-21 13:12:40.000000000 +1100 @@ -7,11 +7,13 @@ */ #include "fuse_i.h" +#include "fuse.h" #include <linux/pagemap.h> #include <linux/slab.h> #include <linux/kernel.h> #include <linux/sched.h> +#include <linux/freezer.h> static const struct file_operations fuse_direct_io_file_operations; @@ -23,6 +25,8 @@ static int fuse_send_open(struct inode * struct fuse_req *req; int err; + FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_send_open"); + req = fuse_get_req(fc); if (IS_ERR(req)) return PTR_ERR(req); @@ -674,6 +678,8 @@ static int fuse_buffered_write(struct fi if (is_bad_inode(inode)) return -EIO; + FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_commit_write"); + /* * Make sure writepages on the same page are not mixed up with * plain writes. @@ -962,6 +968,8 @@ static ssize_t fuse_direct_io(struct fil if (is_bad_inode(inode)) return -EIO; + FUSE_MIGHT_FREEZE(file->f_mapping->host->i_sb, "fuse_direct_io"); + req = fuse_get_req(fc); if (IS_ERR(req)) return PTR_ERR(req); @@ -1315,6 +1323,8 @@ static int fuse_getlk(struct file *file, struct fuse_lk_out outarg; int err; + FUSE_MIGHT_FREEZE(file->f_mapping->host->i_sb, "fuse_getlk"); + req = fuse_get_req(fc); if (IS_ERR(req)) return PTR_ERR(req); @@ -1350,6 +1360,8 @@ static int fuse_setlk(struct file *file, if (fl->fl_flags & FL_CLOSE) return 0; + FUSE_MIGHT_FREEZE(file->f_mapping->host->i_sb, "fuse_setlk"); + req = fuse_get_req(fc); if (IS_ERR(req)) return PTR_ERR(req); @@ -1416,6 +1428,8 @@ static sector_t fuse_bmap(struct address if (!inode->i_sb->s_bdev || fc->no_bmap) return 0; + FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_bmap"); + req = fuse_get_req(fc); if (IS_ERR(req)) return 0; diff -ruNp 700-BRANCH-fuse-freezing.patch-old/fs/fuse/fuse.h 700-BRANCH-fuse-freezing.patch-new/fs/fuse/fuse.h --- 700-BRANCH-fuse-freezing.patch-old/fs/fuse/fuse.h 1970-01-01 10:00:00.000000000 +1000 +++ 700-BRANCH-fuse-freezing.patch-new/fs/fuse/fuse.h 2008-10-21 13:12:40.000000000 +1100 @@ -0,0 +1,13 @@ +#define FUSE_MIGHT_FREEZE(superblock, desc) \ +do { \ + int printed = 0; \ + while (superblock->s_frozen != SB_UNFROZEN) { \ + if (!printed) { \ + printk(KERN_INFO "%d frozen in " desc ".\n", \ + current->pid); \ + printed = 1; \ + } \ + try_to_freeze(); \ + yield(); \ + } \ +} while (0) diff -ruNp 700-BRANCH-fuse-freezing.patch-old/fs/fuse/inode.c 700-BRANCH-fuse-freezing.patch-new/fs/fuse/inode.c --- 700-BRANCH-fuse-freezing.patch-old/fs/fuse/inode.c 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/fs/fuse/inode.c 2008-10-21 13:12:40.000000000 +1100 @@ -914,7 +914,7 @@ static int fuse_get_sb(struct file_syste static struct file_system_type fuse_fs_type = { .owner = THIS_MODULE, .name = "fuse", - .fs_flags = FS_HAS_SUBTYPE, + .fs_flags = FS_HAS_SUBTYPE | FS_IS_FUSE, .get_sb = fuse_get_sb, .kill_sb = kill_anon_super, }; @@ -933,7 +933,7 @@ static struct file_system_type fuseblk_f .name = "fuseblk", .get_sb = fuse_get_sb_blk, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE, + .fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE | FS_IS_FUSE, }; static inline int register_fuseblk(void) diff -ruNp 700-BRANCH-fuse-freezing.patch-old/fs/namei.c 700-BRANCH-fuse-freezing.patch-new/fs/namei.c --- 700-BRANCH-fuse-freezing.patch-old/fs/namei.c 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/fs/namei.c 2008-10-21 13:12:40.000000000 +1100 @@ -2223,6 +2223,8 @@ int vfs_unlink(struct inode *dir, struct if (!dir->i_op || !dir->i_op->unlink) return -EPERM; + vfs_check_frozen(dir->i_sb, SB_FREEZE_WRITE); + DQUOT_INIT(dir); mutex_lock(&dentry->d_inode->i_mutex); diff -ruNp 700-BRANCH-fuse-freezing.patch-old/include/linux/buffer_head.h 700-BRANCH-fuse-freezing.patch-new/include/linux/buffer_head.h --- 700-BRANCH-fuse-freezing.patch-old/include/linux/buffer_head.h 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/include/linux/buffer_head.h 2008-10-21 13:12:40.000000000 +1100 @@ -171,6 +171,11 @@ wait_queue_head_t *bh_waitq_head(struct int fsync_bdev(struct block_device *); struct super_block *freeze_bdev(struct block_device *); void thaw_bdev(struct block_device *, struct super_block *); +#define FS_FREEZER_FUSE 1 +#define FS_FREEZER_NORMAL 2 +#define FS_FREEZER_ALL (FS_FREEZER_FUSE | FS_FREEZER_NORMAL) +void freeze_filesystems(int which); +void thaw_filesystems(int which); int fsync_super(struct super_block *); int fsync_no_super(struct block_device *); struct buffer_head *__find_get_block(struct block_device *bdev, sector_t block, diff -ruNp 700-BRANCH-fuse-freezing.patch-old/include/linux/freezer.h 700-BRANCH-fuse-freezing.patch-new/include/linux/freezer.h --- 700-BRANCH-fuse-freezing.patch-old/include/linux/freezer.h 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/include/linux/freezer.h 2008-10-21 13:16:08.000000000 +1100 @@ -136,6 +136,19 @@ static inline void set_freezable_with_si current->flags &= ~(PF_NOFREEZE | PF_FREEZER_NOSIG); } +extern int freezer_state; +#define FREEZER_OFF 0 +#define FREEZER_FILESYSTEMS_FROZEN 1 +#define FREEZER_USERSPACE_FROZEN 2 +#define FREEZER_FULLY_ON 3 + +static inline int freezer_is_on(void) +{ + return freezer_state == FREEZER_FULLY_ON; +} + +extern void thaw_kernel_threads(void); + /* * Freezer-friendly wrappers around wait_event_interruptible() and * wait_event_interruptible_timeout(), originally defined in <linux/wait.h> @@ -178,6 +191,8 @@ static inline int freeze_processes(void) static inline void thaw_processes(void) {} static inline int try_to_freeze(void) { return 0; } +static inline int freezer_is_on(void) { return 0; } +static inline void thaw_kernel_threads(void) { } static inline void freezer_do_not_count(void) {} static inline void freezer_count(void) {} diff -ruNp 700-BRANCH-fuse-freezing.patch-old/include/linux/fs.h 700-BRANCH-fuse-freezing.patch-new/include/linux/fs.h --- 700-BRANCH-fuse-freezing.patch-old/include/linux/fs.h 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/include/linux/fs.h 2008-10-21 13:12:40.000000000 +1100 @@ -8,6 +8,7 @@ #include <linux/limits.h> #include <linux/ioctl.h> +#include <linux/freezer.h> /* * It's silly to have NR_OPEN bigger than NR_FILE, but you can change @@ -96,6 +97,7 @@ extern int dir_notify_enable; #define FS_REQUIRES_DEV 1 #define FS_BINARY_MOUNTDATA 2 #define FS_HAS_SUBTYPE 4 +#define FS_IS_FUSE 8 /* Fuse filesystem - bdev freeze these too */ #define FS_REVAL_DOT 16384 /* Check the paths ".", ".." for staleness */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() * during rename() internally. @@ -128,6 +130,7 @@ extern int dir_notify_enable; #define MS_RELATIME (1<<21) /* Update atime relative to mtime/ctime. */ #define MS_KERNMOUNT (1<<22) /* this is a kern_mount call */ #define MS_I_VERSION (1<<23) /* Update inode I_version field */ +#define MS_FROZEN (1<<24) /* Frozen by freeze_filesystems() */ #define MS_ACTIVE (1<<30) #define MS_NOUSER (1<<31) @@ -1141,8 +1144,11 @@ enum { SB_FREEZE_TRANS = 2, }; -#define vfs_check_frozen(sb, level) \ - wait_event((sb)->s_wait_unfrozen, ((sb)->s_frozen < (level))) +#define vfs_check_frozen(sb, level) do { \ + freezer_do_not_count(); \ + wait_event((sb)->s_wait_unfrozen, ((sb)->s_frozen < (level))); \ + freezer_count(); \ +} while (0) #define get_fs_excl() atomic_inc(¤t->fs_excl) #define put_fs_excl() atomic_dec(¤t->fs_excl) diff -ruNp 700-BRANCH-fuse-freezing.patch-old/kernel/power/process.c 700-BRANCH-fuse-freezing.patch-new/kernel/power/process.c --- 700-BRANCH-fuse-freezing.patch-old/kernel/power/process.c 2008-10-19 04:57:22.000000000 +1100 +++ 700-BRANCH-fuse-freezing.patch-new/kernel/power/process.c 2008-10-21 13:16:08.000000000 +1100 @@ -13,6 +13,9 @@ #include <linux/module.h> #include <linux/syscalls.h> #include <linux/freezer.h> +#include <linux/buffer_head.h> + +int freezer_state; /* * Timeout for stopping processes @@ -201,7 +204,8 @@ static int try_to_freeze_tasks(bool sig_ do_each_thread(g, p) { task_lock(p); if (freezing(p) && !freezer_should_skip(p)) - printk(KERN_ERR " %s\n", p->comm); + printk(KERN_ERR " %s (%d) failed to freeze.\n", + p->comm, p->pid); cancel_freezing(p); task_unlock(p); } while_each_thread(g, p); @@ -221,17 +225,25 @@ int freeze_processes(void) { int error; - printk("Freezing user space processes ... "); + printk(KERN_INFO "Stopping fuse filesystems.\n"); + freeze_filesystems(FS_FREEZER_FUSE); + freezer_state = FREEZER_FILESYSTEMS_FROZEN; + printk(KERN_INFO "Freezing user space processes ... "); error = try_to_freeze_tasks(true); if (error) goto Exit; - printk("done.\n"); + printk(KERN_INFO "done.\n"); - printk("Freezing remaining freezable tasks ... "); + sys_sync(); + printk(KERN_INFO "Stopping normal filesystems.\n"); + freeze_filesystems(FS_FREEZER_NORMAL); + freezer_state = FREEZER_USERSPACE_FROZEN; + printk(KERN_INFO "Freezing remaining freezable tasks ... "); error = try_to_freeze_tasks(false); if (error) goto Exit; printk("done."); + freezer_state = FREEZER_FULLY_ON; Exit: BUG_ON(in_atomic()); printk("\n"); @@ -257,11 +269,35 @@ static void thaw_tasks(bool nosig_only) void thaw_processes(void) { - printk("Restarting tasks ... "); - thaw_tasks(true); + int old_state = freezer_state; + + if (old_state == FREEZER_OFF) + return; + + /* + * Change state beforehand because thawed tasks might submit I/O + * immediately. + */ + freezer_state = FREEZER_OFF; + + printk(KERN_INFO "Restarting all filesystems ...\n"); + thaw_filesystems(FS_FREEZER_ALL); + + printk(KERN_INFO "Restarting tasks ... "); + + if (old_state == FREEZER_FULLY_ON) + thaw_tasks(true); thaw_tasks(false); schedule(); printk("done.\n"); } EXPORT_SYMBOL(refrigerator); + +void thaw_kernel_threads(void) +{ + freezer_state = FREEZER_USERSPACE_FROZEN; + printk(KERN_INFO "Restarting normal filesystems.\n"); + thaw_filesystems(FS_FREEZER_NORMAL); + thaw_tasks(true); +} ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225107607.26724.9.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225107607.26724.9.camel@nigel-laptop> @ 2008-10-27 12:38 ` Miklos Szeredi [not found] ` <E1KuRN4-0002gS-Hq@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-27 12:38 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Mon, 27 Oct 2008, Nigel Cunningham wrote: > Hi. > > On Mon, 2008-10-27 at 12:37 +0100, Rafael J. Wysocki wrote: > > Well, I guess it's better if you post the entire thing so that we can see > > what the role of the $subject patch is in it, even if this patch finally gets > > merged separately. > > Ah.. that makes me see how vfs_check_frozen was getting triggered... > (fs/namei.c, below). Nigel, thanks for the patch, it makes thinks a lot clearer. >From a quick look through the patch it seems to solve a bunch of cases where new fuse requests during the freezing could cause problems. But it doesn't do anything with requests that are already under way when the freezing starts, which would still result in all the same problems. Take this scenario: 1) process A does rename("/mnt/fuse/a", "/mnt/fuse/b") 2) request goes to process B serving the fuse filesystem 3) filesystems are frozen, no new fuse requests 4) processes are frozen, let's say B first, then A 5) freezing A will fail, since it's still waiting for the request to finish Several solutions have been posted, none of which really solve the problem: a) Let's tag fuse server processes and freeze them later. This is basically impossible, because many processes could be interoperating and there's no way to know which is depending on which (example: sshfs uses ssh for communication, which possibly relies on openvpn process for packet transmission). b) While waiting for replies to fuse request, allow process to freeze. Does not fully solve the problem, as VFS might be holding locks, and other processes waiting for those locks will not be freezable. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KuRN4-0002gS-Hq@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KuRN4-0002gS-Hq@pomaz-ex.szeredi.hu> @ 2008-10-27 20:59 ` Nigel Cunningham [not found] ` <1225141168.26724.23.camel@nigel-laptop> 1 sibling, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-27 20:59 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi. On Mon, 2008-10-27 at 13:38 +0100, Miklos Szeredi wrote: > On Mon, 27 Oct 2008, Nigel Cunningham wrote: > > Hi. > > > > On Mon, 2008-10-27 at 12:37 +0100, Rafael J. Wysocki wrote: > > > Well, I guess it's better if you post the entire thing so that we can see > > > what the role of the $subject patch is in it, even if this patch finally gets > > > merged separately. > > > > Ah.. that makes me see how vfs_check_frozen was getting triggered... > > (fs/namei.c, below). > > Nigel, thanks for the patch, it makes thinks a lot clearer. > > >From a quick look through the patch it seems to solve a bunch of cases > where new fuse requests during the freezing could cause problems. But > it doesn't do anything with requests that are already under way when > the freezing starts, which would still result in all the same > problems. > > Take this scenario: > > 1) process A does rename("/mnt/fuse/a", "/mnt/fuse/b") > 2) request goes to process B serving the fuse filesystem > 3) filesystems are frozen, no new fuse requests > 4) processes are frozen, let's say B first, then A > 5) freezing A will fail, since it's still waiting for the request to finish I'll have a look at the code again. I deliberately didn't stop existing requests, but perhaps that's the wrong behaviour. In the above scenario, process B won't see process A's new request until post-resume if the filesystem is already frozen, so steps 4 & 5 don't happen. Process B will also always be frozen before process A if process A is userspace (most likely in the above scenario) or was mounted after process B. (I've had this patch distributed as is for almost a year, with no problems at all, so I believe I'm right here). > Several solutions have been posted, none of which really solve the problem: > > a) Let's tag fuse server processes and freeze them later. This is > basically impossible, because many processes could be interoperating > and there's no way to know which is depending on which (example: > sshfs uses ssh for communication, which possibly relies on openvpn > process for packet transmission). > > b) While waiting for replies to fuse request, allow process to > freeze. Does not fully solve the problem, as VFS might be holding > locks, and other processes waiting for those locks will not be > freezable. I agree that these solutions won't work. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225141168.26724.23.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225141168.26724.23.camel@nigel-laptop> @ 2008-10-27 21:09 ` Miklos Szeredi [not found] ` <E1KuZKt-0003jl-2N@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-27 21:09 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Tue, 28 Oct 2008, Nigel Cunningham wrote: > Hi. > > On Mon, 2008-10-27 at 13:38 +0100, Miklos Szeredi wrote: > > On Mon, 27 Oct 2008, Nigel Cunningham wrote: > > > Hi. > > > > > > On Mon, 2008-10-27 at 12:37 +0100, Rafael J. Wysocki wrote: > > > > Well, I guess it's better if you post the entire thing so that we can see > > > > what the role of the $subject patch is in it, even if this patch finally gets > > > > merged separately. > > > > > > Ah.. that makes me see how vfs_check_frozen was getting triggered... > > > (fs/namei.c, below). > > > > Nigel, thanks for the patch, it makes thinks a lot clearer. > > > > >From a quick look through the patch it seems to solve a bunch of cases > > where new fuse requests during the freezing could cause problems. But > > it doesn't do anything with requests that are already under way when > > the freezing starts, which would still result in all the same > > problems. > > > > Take this scenario: > > > > 1) process A does rename("/mnt/fuse/a", "/mnt/fuse/b") > > 2) request goes to process B serving the fuse filesystem > > 3) filesystems are frozen, no new fuse requests > > 4) processes are frozen, let's say B first, then A > > 5) freezing A will fail, since it's still waiting for the request to finish > > I'll have a look at the code again. I deliberately didn't stop existing > requests, but perhaps that's the wrong behaviour. > > In the above scenario, process B won't see process A's new request until > post-resume if the filesystem is already frozen, so steps 4 & 5 don't > happen. No, I mean the filesystem is only frozen at 3 not before, so B is already processing the request by the time the filesystem gets frozen. > Process B will also always be frozen before process A if process > A is userspace (most likely in the above scenario) or was mounted after > process B. (I've had this patch distributed as is for almost a year, > with no problems at all, so I believe I'm right here). Both A and B are userspace processes. The order of freezing userspace processes is basically random, there's no way to make sure that processes doing work on behalf of a fuse filesystem (B) are frozen after processes accessing the fuse filesystem (A). Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KuZKt-0003jl-2N@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KuZKt-0003jl-2N@pomaz-ex.szeredi.hu> @ 2008-10-27 22:13 ` Nigel Cunningham [not found] ` <1225145607.26724.54.camel@nigel-laptop> 1 sibling, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-27 22:13 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi. On Mon, 2008-10-27 at 22:09 +0100, Miklos Szeredi wrote: > On Tue, 28 Oct 2008, Nigel Cunningham wrote: > > Hi. > > > > On Mon, 2008-10-27 at 13:38 +0100, Miklos Szeredi wrote: > > > On Mon, 27 Oct 2008, Nigel Cunningham wrote: > > > > Hi. > > > > > > > > On Mon, 2008-10-27 at 12:37 +0100, Rafael J. Wysocki wrote: > > > > > Well, I guess it's better if you post the entire thing so that we can see > > > > > what the role of the $subject patch is in it, even if this patch finally gets > > > > > merged separately. > > > > > > > > Ah.. that makes me see how vfs_check_frozen was getting triggered... > > > > (fs/namei.c, below). > > > > > > Nigel, thanks for the patch, it makes thinks a lot clearer. > > > > > > >From a quick look through the patch it seems to solve a bunch of cases > > > where new fuse requests during the freezing could cause problems. But > > > it doesn't do anything with requests that are already under way when > > > the freezing starts, which would still result in all the same > > > problems. > > > > > > Take this scenario: > > > > > > 1) process A does rename("/mnt/fuse/a", "/mnt/fuse/b") > > > 2) request goes to process B serving the fuse filesystem > > > 3) filesystems are frozen, no new fuse requests > > > 4) processes are frozen, let's say B first, then A > > > 5) freezing A will fail, since it's still waiting for the request to finish > > > > I'll have a look at the code again. I deliberately didn't stop existing > > requests, but perhaps that's the wrong behaviour. > > > > In the above scenario, process B won't see process A's new request until > > post-resume if the filesystem is already frozen, so steps 4 & 5 don't > > happen. > > No, I mean the filesystem is only frozen at 3 not before, so B is > already processing the request by the time the filesystem gets frozen. > > > Process B will also always be frozen before process A if process > > A is userspace (most likely in the above scenario) or was mounted after > > process B. (I've had this patch distributed as is for almost a year, > > with no problems at all, so I believe I'm right here). > > Both A and B are userspace processes. The order of freezing userspace > processes is basically random, there's no way to make sure that > processes doing work on behalf of a fuse filesystem (B) are frozen > after processes accessing the fuse filesystem (A). Sorry. You're right - I was confusing things in what I said, but I do have a (unconfused) solution: The answer is to freeze the fuse filesystems first, stopping new requests (freezing the processes making them) before they are passed on to userspace and allowing existing requests to complete or freeze. Once that is done, the userspace fuse processes will be idle, at least as far as fuse activity is concerned. After fuse activity is stopped, userspace can be frozen without worrying about what processes are fuse and what aren't. This is what my patch implements so far. To deal with requests that are already in progress, I'd suggest three possibilities, in the order I think they should be preferred. 1) Use wait_event_freezeable[_timeout] in fuse code. Probably preferable to #2, but I thought of it later :) 2) Use freezer_do_not_count as part of FUSE_MIGHT_FREEZE (resetting before exiting the callers, of course). If the request doesn't complete during the freezing period, it must be because the userspace thread was already frozen. If the request does complete, we're counted again during the normal freezing of userspace that follows the freezing of filesystems. 3) Adding a means to check whether processes being frozen are fuse requests. The code could then wait for fc->num_waiting - (say) fc->num_frozen == 0. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225145607.26724.54.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225145607.26724.54.camel@nigel-laptop> @ 2008-10-28 20:25 ` Miklos Szeredi [not found] ` <E1Kuv87-00064p-Lk@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-28 20:25 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Tue, 28 Oct 2008, Nigel Cunningham wrote: > The answer is to freeze the fuse filesystems first, stopping new > requests (freezing the processes making them) before they are passed on > to userspace and allowing existing requests to complete or freeze. Once > that is done, the userspace fuse processes will be idle, at least as far > as fuse activity is concerned. After fuse activity is stopped, userspace > can be frozen without worrying about what processes are fuse and what > aren't. This is what my patch implements so far. OK. > To deal with requests that are already in progress, I'd suggest three > possibilities, in the order I think they should be preferred. > > 1) Use wait_event_freezeable[_timeout] in fuse code. Probably preferable > to #2, but I thought of it later :) > > 2) Use freezer_do_not_count as part of FUSE_MIGHT_FREEZE (resetting > before exiting the callers, of course). If the request doesn't complete > during the freezing period, it must be because the userspace thread was > already frozen. If the request does complete, we're counted again during > the normal freezing of userspace that follows the freezing of > filesystems. > > 3) Adding a means to check whether processes being frozen are fuse > requests. The code could then wait for fc->num_waiting - (say) > fc->num_frozen == 0. Yup, these fix the freezing of tasks which have outstanding fuse requests. However it does not fix the freezing of tasks which are waiting for VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. This is the tricky part... Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1Kuv87-00064p-Lk@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1Kuv87-00064p-Lk@pomaz-ex.szeredi.hu> @ 2008-10-28 21:29 ` Nigel Cunningham [not found] ` <1225229399.9661.13.camel@nigel-laptop> ` (2 subsequent siblings) 3 siblings, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-28 21:29 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi. On Tue, 2008-10-28 at 21:25 +0100, Miklos Szeredi wrote: > > To deal with requests that are already in progress, I'd suggest three > > possibilities, in the order I think they should be preferred. > > > > 1) Use wait_event_freezeable[_timeout] in fuse code. Probably preferable > > to #2, but I thought of it later :) > > > > 2) Use freezer_do_not_count as part of FUSE_MIGHT_FREEZE (resetting > > before exiting the callers, of course). If the request doesn't complete > > during the freezing period, it must be because the userspace thread was > > already frozen. If the request does complete, we're counted again during > > the normal freezing of userspace that follows the freezing of > > filesystems. > > > > 3) Adding a means to check whether processes being frozen are fuse > > requests. The code could then wait for fc->num_waiting - (say) > > fc->num_frozen == 0. > > Yup, these fix the freezing of tasks which have outstanding fuse > requests. > > However it does not fix the freezing of tasks which are waiting for > VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. > This is the tricky part... Convert them all to wait_event_freezeable. If you know the locks they're waiting on definitely aren't going to be free until post-resume and we're going to be trying to freeze them while they're waiting, that would be the right behaviour. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225229399.9661.13.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225229399.9661.13.camel@nigel-laptop> @ 2008-10-28 21:51 ` Miklos Szeredi [not found] ` <E1KuwTn-0006EQ-8J@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-28 21:51 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > However it does not fix the freezing of tasks which are waiting for > > VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. > > This is the tricky part... > > Convert them all to wait_event_freezeable. You mean convert mutexes to event queues? Not a very good idea. I fear we are going down the same path as the last time. I still don't think rewriting the VFS is the right solution to the freezing problem. But hey, if you want, sumbit a patch or an RFD and lets see what others think. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KuwTn-0006EQ-8J@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KuwTn-0006EQ-8J@pomaz-ex.szeredi.hu> @ 2008-10-28 22:00 ` Rafael J. Wysocki [not found] ` <200810282300.31164.rjw@sisk.pl> 2008-10-28 22:03 ` Nigel Cunningham 2 siblings, 0 replies; 59+ messages in thread From: Rafael J. Wysocki @ 2008-10-28 22:00 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-pm, linux-kernel On Tuesday, 28 of October 2008, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > However it does not fix the freezing of tasks which are waiting for > > > VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. > > > This is the tricky part... > > > > Convert them all to wait_event_freezeable. > > You mean convert mutexes to event queues? Not a very good idea. > > I fear we are going down the same path as the last time. I still > don't think rewriting the VFS is the right solution to the freezing > problem. But hey, if you want, sumbit a patch or an RFD and lets see > what others think. So, what solution would you prefer? Rafael ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <200810282300.31164.rjw@sisk.pl>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <200810282300.31164.rjw@sisk.pl> @ 2008-10-28 22:02 ` Miklos Szeredi [not found] ` <E1Kuwe3-0006Gc-LF@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-28 22:02 UTC (permalink / raw) To: rjw; +Cc: linux-kernel, ncunningham, linux-pm, miklos On Tue, 28 Oct 2008, Rafael J. Wysocki wrote: > On Tuesday, 28 of October 2008, Miklos Szeredi wrote: > > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > > However it does not fix the freezing of tasks which are waiting for > > > > VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. > > > > This is the tricky part... > > > > > > Convert them all to wait_event_freezeable. > > > > You mean convert mutexes to event queues? Not a very good idea. > > > > I fear we are going down the same path as the last time. I still > > don't think rewriting the VFS is the right solution to the freezing > > problem. But hey, if you want, sumbit a patch or an RFD and lets see > > what others think. > > So, what solution would you prefer? I would prefer a freezer-less solution. Suspend to ram doesn't need the freezer, and with the kexec approach hibernate could be done without it also. I don't think adding hacks to the VFS to work around the issues with the freezer is the right way to solve this. But this is just my personal opinion, the VFS maintainers may think otherwise. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1Kuwe3-0006Gc-LF@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1Kuwe3-0006Gc-LF@pomaz-ex.szeredi.hu> @ 2008-10-28 22:21 ` Rafael J. Wysocki [not found] ` <200810282321.59041.rjw@sisk.pl> 1 sibling, 0 replies; 59+ messages in thread From: Rafael J. Wysocki @ 2008-10-28 22:21 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-pm, linux-kernel On Tuesday, 28 of October 2008, Miklos Szeredi wrote: > On Tue, 28 Oct 2008, Rafael J. Wysocki wrote: > > On Tuesday, 28 of October 2008, Miklos Szeredi wrote: > > > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > > > However it does not fix the freezing of tasks which are waiting for > > > > > VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. > > > > > This is the tricky part... > > > > > > > > Convert them all to wait_event_freezeable. > > > > > > You mean convert mutexes to event queues? Not a very good idea. > > > > > > I fear we are going down the same path as the last time. I still > > > don't think rewriting the VFS is the right solution to the freezing > > > problem. But hey, if you want, sumbit a patch or an RFD and lets see > > > what others think. > > > > So, what solution would you prefer? > > I would prefer a freezer-less solution. Suspend to ram doesn't need > the freezer, Well, yes it does. And it will in forseeable future, AFAICS. > and with the kexec approach hibernate could be done without it also. There are problems with that too, although they aren't directly related to filesystems. > I don't think adding hacks to the VFS to work around the issues with > the freezer is the right way to solve this. But this is just my > personal opinion, the VFS maintainers may think otherwise. Well, my personal opinion is that we need filesystems to support suspend, this way or another and the sooner it happens, the better. Still, I'm rather not going to make that happen myself. :-) Apparently, Nigel is willing to work in this direction and we can use this as an opportunity to learn what exactly is necessary for this purpose and _then_ decide if this is reasonable or not instead of dismissing it upfront. Thanks, Rafael ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <200810282321.59041.rjw@sisk.pl>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <200810282321.59041.rjw@sisk.pl> @ 2008-10-28 23:21 ` Miklos Szeredi [not found] ` <E1Kuxsg-0006PR-Tr@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-28 23:21 UTC (permalink / raw) To: rjw; +Cc: linux-kernel, ncunningham, linux-pm, miklos On Tue, 28 Oct 2008, Rafael J. Wysocki wrote: > On Tuesday, 28 of October 2008, Miklos Szeredi wrote: > > I would prefer a freezer-less solution. Suspend to ram doesn't need > > the freezer, > > Well, yes it does. And it will in forseeable future, AFAICS. Umm, OK. Last I remember everybody agreed that there's absolutely no reason why processes need to be frozen, and the only important thing is that drivers are not twiddling the hardware during suspend, and this can usually easily be solved on the subsystem level. > > I don't think adding hacks to the VFS to work around the issues with > > the freezer is the right way to solve this. But this is just my > > personal opinion, the VFS maintainers may think otherwise. > > Well, my personal opinion is that we need filesystems to support suspend, > this way or another and the sooner it happens, the better. Still, I'm rather > not going to make that happen myself. :-) > > Apparently, Nigel is willing to work in this direction and we can use this as > an opportunity to learn what exactly is necessary for this purpose and _then_ > decide if this is reasonable or not instead of dismissing it upfront. I haven't dismissed it, just voiced my opinion that I think it's the wrong direction. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1Kuxsg-0006PR-Tr@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1Kuxsg-0006PR-Tr@pomaz-ex.szeredi.hu> @ 2008-10-28 23:59 ` Rafael J. Wysocki [not found] ` <200810290059.38411.rjw@sisk.pl> 1 sibling, 0 replies; 59+ messages in thread From: Rafael J. Wysocki @ 2008-10-28 23:59 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-pm, linux-kernel On Wednesday, 29 of October 2008, Miklos Szeredi wrote: > On Tue, 28 Oct 2008, Rafael J. Wysocki wrote: > > On Tuesday, 28 of October 2008, Miklos Szeredi wrote: > > > I would prefer a freezer-less solution. Suspend to ram doesn't need > > > the freezer, > > > > Well, yes it does. And it will in forseeable future, AFAICS. > > Umm, OK. Last I remember everybody agreed that there's absolutely no > reason why processes need to be frozen, and the only important thing > is that drivers are not twiddling the hardware during suspend, and > this can usually easily be solved on the subsystem level. Well, this turned out not to be the case in the meantime. In fact to handle that without the freezer we'd have to synchronize every driver's suspend/resume callbacks with every possible way in which applications can access the device for regular I/O (for example for PCI devices this means any I/O other than configuration space accesses). While this is possible in theory, I don't see this happening any time soon, especially that we're going to keep the bar for accepting new drivers relatively low. Thanks, Rafael ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <200810290059.38411.rjw@sisk.pl>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <200810290059.38411.rjw@sisk.pl> @ 2008-10-29 8:10 ` Miklos Szeredi 0 siblings, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-29 8:10 UTC (permalink / raw) To: rjw; +Cc: linux-kernel, ncunningham, linux-pm, miklos On Wed, 29 Oct 2008, Rafael J. Wysocki wrote: > On Wednesday, 29 of October 2008, Miklos Szeredi wrote: > > On Tue, 28 Oct 2008, Rafael J. Wysocki wrote: > > > On Tuesday, 28 of October 2008, Miklos Szeredi wrote: > > > > I would prefer a freezer-less solution. Suspend to ram doesn't need > > > > the freezer, > > > > > > Well, yes it does. And it will in forseeable future, AFAICS. > > > > Umm, OK. Last I remember everybody agreed that there's absolutely no > > reason why processes need to be frozen, and the only important thing > > is that drivers are not twiddling the hardware during suspend, and > > this can usually easily be solved on the subsystem level. > > Well, this turned out not to be the case in the meantime. > > In fact to handle that without the freezer we'd have to synchronize > every driver's suspend/resume callbacks with every possible way in > which applications can access the device for regular I/O (for > example for PCI devices this means any I/O other than configuration > space accesses). Not all callbacks. I don't know what the current model is but AFAIR it should be something like this: 1) call drivers to prepare for suspend (allocate space, etc) 2) stop all driver activity (plug queues, disable interrupts, etc) 3) call drivers to actually save state and power down 4) suspend The part we are concerned is stopping driver activity. It could be done with a mutex, or it could be done by freezing tasks. Adding a mutex or other mechanism is the one I most like, but it's probably the biggest work, so lets look at how to fix the freezing: Currently the criteria for freezing is that userspace task has to exit kernelspace, and kernel task has to hit a specific "freeze point". This causes problems where we want to freeze tasks which are "stuck" inside filesystems or other non-driver parts of the kernel. We can fix this two ways: a) mark additional places to freeze for userspace tasks as well. This is the direction Nigel seems to be taking. b) or instead we could allow freezing anywhere in uninterruptible sleep, _except_ where explicity marked. Which is easier? I don't know. But I very storgly feel that marking un-freezable places instead of marking freezable places is a much cleaner solution. It only affects parts of the kernel which have something to do with suspend, instead of affecting parts of the kernel which have absolutely nothing to do with suspend. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KuwTn-0006EQ-8J@pomaz-ex.szeredi.hu> 2008-10-28 22:00 ` Rafael J. Wysocki [not found] ` <200810282300.31164.rjw@sisk.pl> @ 2008-10-28 22:03 ` Nigel Cunningham 2 siblings, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-28 22:03 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi. On Tue, 2008-10-28 at 22:51 +0100, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > However it does not fix the freezing of tasks which are waiting for > > > VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. > > > This is the tricky part... > > > > Convert them all to wait_event_freezeable. > > You mean convert mutexes to event queues? Not a very good idea. No, I don't. Sorry. I was thinking you were referring to the wait_event calls in the fuse request code. My bad. Perhaps it's best to go back to my original position: hold new requests and allow existing ones to complete. I'll look again at previous messages to see why you thought that was problematic. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1Kuv87-00064p-Lk@pomaz-ex.szeredi.hu> 2008-10-28 21:29 ` Nigel Cunningham [not found] ` <1225229399.9661.13.camel@nigel-laptop> @ 2008-10-28 23:04 ` Nigel Cunningham [not found] ` <1225235054.9661.37.camel@nigel-laptop> 3 siblings, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-28 23:04 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi again. On Tue, 2008-10-28 at 21:25 +0100, Miklos Szeredi wrote: > On Tue, 28 Oct 2008, Nigel Cunningham wrote: > > The answer is to freeze the fuse filesystems first, stopping new > > requests (freezing the processes making them) before they are passed on > > to userspace and allowing existing requests to complete or freeze. Once > > that is done, the userspace fuse processes will be idle, at least as far > > as fuse activity is concerned. After fuse activity is stopped, userspace > > can be frozen without worrying about what processes are fuse and what > > aren't. This is what my patch implements so far. > > OK. > > > To deal with requests that are already in progress, I'd suggest three > > possibilities, in the order I think they should be preferred. > > > > 1) Use wait_event_freezeable[_timeout] in fuse code. Probably preferable > > to #2, but I thought of it later :) > > > > 2) Use freezer_do_not_count as part of FUSE_MIGHT_FREEZE (resetting > > before exiting the callers, of course). If the request doesn't complete > > during the freezing period, it must be because the userspace thread was > > already frozen. If the request does complete, we're counted again during > > the normal freezing of userspace that follows the freezing of > > filesystems. > > > > 3) Adding a means to check whether processes being frozen are fuse > > requests. The code could then wait for fc->num_waiting - (say) > > fc->num_frozen == 0. > > Yup, these fix the freezing of tasks which have outstanding fuse > requests. > > However it does not fix the freezing of tasks which are waiting for > VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. > This is the tricky part... Okay. Looking back on our conversation brought me back to this message, which I think needs another reply. If we take the strategy of holding new requests and allowing existing ones to complete, then am I right in thinking that we only need to worry about where inode->i_mutex is taken in fs/fuse/file.c (I don't see it taken in other fs/fuse/*.c files). If that's correct, dealing with that issue looks relatively straight forward: we need some more FUSE_MIGHT_FREEZE calls for those functions, and something done to the vfs_check_frozen call - I'm a bit confused by that - inode->i_sb will refer to the fuse filesystem, won't it? Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225235054.9661.37.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225235054.9661.37.camel@nigel-laptop> @ 2008-10-28 23:12 ` Miklos Szeredi [not found] ` <E1Kuxk4-0006NT-Ie@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-28 23:12 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > However it does not fix the freezing of tasks which are waiting for > > VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. > > This is the tricky part... > > Okay. Looking back on our conversation brought me back to this message, > which I think needs another reply. > > If we take the strategy of holding new requests and allowing existing > ones to complete, then am I right in thinking that we only need to worry > about where inode->i_mutex is taken in fs/fuse/file.c (I don't see it > taken in other fs/fuse/*.c files). Nope, i_mutex is usually taken by the VFS not the filesystem. That means that the filesystem is called with the mutex already held. Try "grep i_mutex fs/*.c". There's also sb->s_vfs_rename_mutex, for all the gory details see Documentation/filesystems/Locking. So it's not just having to fix fuse, it's having to "fix" the VFS as well. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1Kuxk4-0006NT-Ie@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1Kuxk4-0006NT-Ie@pomaz-ex.szeredi.hu> @ 2008-10-28 23:17 ` Nigel Cunningham [not found] ` <1225235860.9661.42.camel@nigel-laptop> 1 sibling, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-28 23:17 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi. On Wed, 2008-10-29 at 00:12 +0100, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > However it does not fix the freezing of tasks which are waiting for > > > VFS locks (i.e. inode->i_mutex) held by the outstanding fuse requests. > > > This is the tricky part... > > > > Okay. Looking back on our conversation brought me back to this message, > > which I think needs another reply. > > > > If we take the strategy of holding new requests and allowing existing > > ones to complete, then am I right in thinking that we only need to worry > > about where inode->i_mutex is taken in fs/fuse/file.c (I don't see it > > taken in other fs/fuse/*.c files). > > Nope, i_mutex is usually taken by the VFS not the filesystem. That > means that the filesystem is called with the mutex already held. Try > "grep i_mutex fs/*.c". There's also sb->s_vfs_rename_mutex, for all > the gory details see Documentation/filesystems/Locking. > > So it's not just having to fix fuse, it's having to "fix" the VFS as > well. Remember, though, that we're only freezing fuse at the moment, and strictly one filesystem at a time. We can thus happily wait for the i_mutex taken by some other process to be released. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225235860.9661.42.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225235860.9661.42.camel@nigel-laptop> @ 2008-10-28 23:24 ` Miklos Szeredi [not found] ` <E1KuxvH-0006Q0-Dy@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-28 23:24 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Wed, 29 Oct 2008, Nigel Cunningham wrote: > Remember, though, that we're only freezing fuse at the moment, and > strictly one filesystem at a time. We can thus happily wait for the > i_mutex taken by some other process to be released. Not going to work: you need to wait for all requests to be finished, but those might depend on some other fuse filesystem which has already been frozen. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KuxvH-0006Q0-Dy@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KuxvH-0006Q0-Dy@pomaz-ex.szeredi.hu> @ 2008-10-28 23:41 ` Nigel Cunningham [not found] ` <1225237298.9661.55.camel@nigel-laptop> 1 sibling, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-28 23:41 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi. On Wed, 2008-10-29 at 00:24 +0100, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > Remember, though, that we're only freezing fuse at the moment, and > > strictly one filesystem at a time. We can thus happily wait for the > > i_mutex taken by some other process to be released. > > Not going to work: you need to wait for all requests to be finished, > but those might depend on some other fuse filesystem which has already > been frozen. Okay. In that case, am I right in thinking that the request waiting on the frozen filesystem will be stuck in request_wait_answer, and the userspace process that was trying to satisfy the request will be stuck in the FUSE_MIGHT_FREEZE call that was invoked for the frozen filesystem? Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225237298.9661.55.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225237298.9661.55.camel@nigel-laptop> @ 2008-10-28 23:45 ` Miklos Szeredi [not found] ` <E1KuyFq-0006Sc-PC@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-28 23:45 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Wed, 29 Oct 2008, Nigel Cunningham wrote: > Hi. > > On Wed, 2008-10-29 at 00:24 +0100, Miklos Szeredi wrote: > > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > Remember, though, that we're only freezing fuse at the moment, and > > > strictly one filesystem at a time. We can thus happily wait for the > > > i_mutex taken by some other process to be released. > > > > Not going to work: you need to wait for all requests to be finished, > > but those might depend on some other fuse filesystem which has already > > been frozen. > > Okay. In that case, am I right in thinking that the request waiting on > the frozen filesystem will be stuck in request_wait_answer, Yes. > and the > userspace process that was trying to satisfy the request will be stuck > in the FUSE_MIGHT_FREEZE call that was invoked for the frozen > filesystem? No, it already passed that, before the filesystem got frozen. But it doesn't matter, in either case i_mutex will already have been taken by the VFS and it won't be released until the request completely finishes. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KuyFq-0006Sc-PC@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KuyFq-0006Sc-PC@pomaz-ex.szeredi.hu> @ 2008-10-28 23:50 ` Miklos Szeredi 2008-10-28 23:54 ` Nigel Cunningham [not found] ` <E1KuyKw-0006UP-EK@pomaz-ex.szeredi.hu> 2 siblings, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-28 23:50 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Wed, 29 Oct 2008, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > Hi. > > > > On Wed, 2008-10-29 at 00:24 +0100, Miklos Szeredi wrote: > > > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > > Remember, though, that we're only freezing fuse at the moment, and > > > > strictly one filesystem at a time. We can thus happily wait for the > > > > i_mutex taken by some other process to be released. > > > > > > Not going to work: you need to wait for all requests to be finished, > > > but those might depend on some other fuse filesystem which has already > > > been frozen. > > > > Okay. In that case, am I right in thinking that the request waiting on > > the frozen filesystem will be stuck in request_wait_answer, > > Yes. > > > and the > > userspace process that was trying to satisfy the request will be stuck > > in the FUSE_MIGHT_FREEZE call that was invoked for the frozen > > filesystem? Sorry, I misunderstood this. Yes you're right, in the case of one fuse filesystem relying on another to complete the request the already frozen one will be stuck in FUSE_MIGHT_FREEZE(). How does that help? Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KuyFq-0006Sc-PC@pomaz-ex.szeredi.hu> 2008-10-28 23:50 ` Miklos Szeredi @ 2008-10-28 23:54 ` Nigel Cunningham [not found] ` <E1KuyKw-0006UP-EK@pomaz-ex.szeredi.hu> 2 siblings, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-28 23:54 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi Miklos. On Wed, 2008-10-29 at 00:45 +0100, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > Hi. > > > > On Wed, 2008-10-29 at 00:24 +0100, Miklos Szeredi wrote: > > > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > > Remember, though, that we're only freezing fuse at the moment, and > > > > strictly one filesystem at a time. We can thus happily wait for the > > > > i_mutex taken by some other process to be released. > > > > > > Not going to work: you need to wait for all requests to be finished, > > > but those might depend on some other fuse filesystem which has already > > > been frozen. > > > > Okay. In that case, am I right in thinking that the request waiting on > > the frozen filesystem will be stuck in request_wait_answer, > > Yes. > > > and the > > userspace process that was trying to satisfy the request will be stuck > > in the FUSE_MIGHT_FREEZE call that was invoked for the frozen > > filesystem? > > No, it already passed that, before the filesystem got frozen. But it > doesn't matter, in either case i_mutex will already have been taken by > the VFS and it won't be released until the request completely > finishes. I think we're making different assumptions. I'm assuming that one of those solutions we already discussed is implemented, such that we don't start freezing a new filesystem until all the requests for the current filesystem are dealt with. Perhaps I should come up with a new version of the patch that implements that. Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KuyKw-0006UP-EK@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KuyKw-0006UP-EK@pomaz-ex.szeredi.hu> @ 2008-10-28 23:58 ` Nigel Cunningham 0 siblings, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-28 23:58 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi. On Wed, 2008-10-29 at 00:50 +0100, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Miklos Szeredi wrote: > > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > Hi. > > > > > > On Wed, 2008-10-29 at 00:24 +0100, Miklos Szeredi wrote: > > > > On Wed, 29 Oct 2008, Nigel Cunningham wrote: > > > > > Remember, though, that we're only freezing fuse at the moment, and > > > > > strictly one filesystem at a time. We can thus happily wait for the > > > > > i_mutex taken by some other process to be released. > > > > > > > > Not going to work: you need to wait for all requests to be finished, > > > > but those might depend on some other fuse filesystem which has already > > > > been frozen. > > > > > > Okay. In that case, am I right in thinking that the request waiting on > > > the frozen filesystem will be stuck in request_wait_answer, > > > > Yes. > > > > > and the > > > userspace process that was trying to satisfy the request will be stuck > > > in the FUSE_MIGHT_FREEZE call that was invoked for the frozen > > > filesystem? > > Sorry, I misunderstood this. Yes you're right, in the case of one > fuse filesystem relying on another to complete the request the already > frozen one will be stuck in FUSE_MIGHT_FREEZE(). > > How does that help? Well, my next question was going to be: can we find a way to know that the userspace process we're waiting on was frozen? If we can know that, then perhaps we can apply that knowledge in this thread to avoid a freezing failure. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. 2008-10-24 22:07 Freezer: Don't count threads waiting for frozen filesystems Nigel Cunningham 2008-10-26 20:01 ` Rafael J. Wysocki [not found] ` <200810262101.08822.rjw@sisk.pl> @ 2008-10-28 17:49 ` Pavel Machek 2 siblings, 0 replies; 59+ messages in thread From: Pavel Machek @ 2008-10-28 17:49 UTC (permalink / raw) To: Nigel Cunningham; +Cc: Linux-PM Hi! > While working on freezing fuse filesystems, I found that if a filesystem > is frozen when we try to freeze processes, freezing can fail because > threads are waiting in vfs_check_frozen for the filesystem to be thawed. > We should thus not count such threads. > > The check will be safe if a filesystem is thawed while we're freezing > processes because filesystem thaws are only invoked from userspace. Any > waiting processes will be woken and frozen prior to us completing the > freezing of userspace (the caller invoking the filesystem thaw will be > freezing) or - in the worst case - together with kernel threads. In the worst case we have a problem. Userspace and kernel threads *need* to be frozen separately... (Otherwise its a race, and if you freeze kjournald first, user process becomes unfreezeable and freezing fails.) Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1Kv688-0007Kp-SN@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <E1Kv688-0007Kp-SN@pomaz-ex.szeredi.hu> @ 2008-10-29 13:51 ` Alan Stern 0 siblings, 0 replies; 59+ messages in thread From: Alan Stern @ 2008-10-29 13:51 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-kernel, linux-pm On Wed, 29 Oct 2008, Miklos Szeredi wrote: > Not all callbacks. I don't know what the current model is but AFAIR > it should be something like this: > > 1) call drivers to prepare for suspend (allocate space, etc) > 2) stop all driver activity (plug queues, disable interrupts, etc) > 3) call drivers to actually save state and power down > 4) suspend > > The part we are concerned is stopping driver activity. It could be > done with a mutex, or it could be done by freezing tasks. Adding a > mutex or other mechanism is the one I most like, but it's probably the > biggest work, so lets look at how to fix the freezing: Not only is adding a mutex the biggest amount of work, it has has the largest impact. Every I/O pathway would have to acquire the appropriate mutex. That's a significant additional load on the system. > Currently the criteria for freezing is that userspace task has to exit > kernelspace, and kernel task has to hit a specific "freeze point". > This causes problems where we want to freeze tasks which are "stuck" > inside filesystems or other non-driver parts of the kernel. We can > fix this two ways: > > a) mark additional places to freeze for userspace tasks as > well. This is the direction Nigel seems to be taking. > > b) or instead we could allow freezing anywhere in uninterruptible > sleep, _except_ where explicity marked. > > Which is easier? I don't know. But I very storgly feel that marking > un-freezable places instead of marking freezable places is a much > cleaner solution. It only affects parts of the kernel which have > something to do with suspend, instead of affecting parts of the kernel > which have absolutely nothing to do with suspend. The problem with unrestricted freezing shows up when you freeze tasks that hold a mutex or other sort of lock. If this mutex is needed later on for suspending a device then the suspend will hang, because a frozen task can't release any mutexes. I suppose you could try to categorize mutexes as "freezable" and "non-freezable". Ones used by device drives would generally be non-freezable, whereas others (such as those used by VFS) would be freezable. Still, it would be pretty difficult. Among other things, it would be necessary to verify that a task holding a non-freezable mutex never tries to acquire a freezable mutex. Alan Stern ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <Pine.LNX.4.44L0.0810290944240.2515-100000@iolanthe.rowland.org>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <Pine.LNX.4.44L0.0810290944240.2515-100000@iolanthe.rowland.org> @ 2008-10-29 14:50 ` Miklos Szeredi 0 siblings, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-29 14:50 UTC (permalink / raw) To: stern; +Cc: ncunningham, linux-pm, linux-kernel, miklos On Wed, 29 Oct 2008, Alan Stern wrote: > On Wed, 29 Oct 2008, Miklos Szeredi wrote: > > > Not all callbacks. I don't know what the current model is but AFAIR > > it should be something like this: > > > > 1) call drivers to prepare for suspend (allocate space, etc) > > 2) stop all driver activity (plug queues, disable interrupts, etc) > > 3) call drivers to actually save state and power down > > 4) suspend > > > > The part we are concerned is stopping driver activity. It could be > > done with a mutex, or it could be done by freezing tasks. Adding a > > mutex or other mechanism is the one I most like, but it's probably the > > biggest work, so lets look at how to fix the freezing: > > Not only is adding a mutex the biggest amount of work, it has has the > largest impact. Every I/O pathway would have to acquire the > appropriate mutex. That's a significant additional load on the system. Actually I was thinking of an rw-semaphore, not a mutex. But yeah that still has scalability problems. But it could be done with custom locking primitives, optimized for this case: suspend_disable(); /* driver stuff */ suspend_enable(); > The problem with unrestricted freezing shows up when you freeze tasks > that hold a mutex or other sort of lock. If this mutex is needed later > on for suspending a device then the suspend will hang, because a frozen > task can't release any mutexes. I did a random sampling of ->suspend() callbacks, and they don't seem to be taking mutexes. Does that happen at all? Did anybody ever try modifying the freezer for suspend (not hibernate), so that it allows tasks not in running state to freeze? If not, I think that's an experiment worth doing. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KvCN9-00081i-KF@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <E1KvCN9-00081i-KF@pomaz-ex.szeredi.hu> @ 2008-10-29 15:28 ` Alan Stern 0 siblings, 0 replies; 59+ messages in thread From: Alan Stern @ 2008-10-29 15:28 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-kernel, linux-pm On Wed, 29 Oct 2008, Miklos Szeredi wrote: > Actually I was thinking of an rw-semaphore, not a mutex. But yeah > that still has scalability problems. But it could be done with custom > locking primitives, optimized for this case: > > suspend_disable(); > /* driver stuff */ > suspend_enable(); Yes, it could be done. And the overhead could be minimized by using per-CPU variables. It would still be an awful lot of work, and easy to get wrong. > > The problem with unrestricted freezing shows up when you freeze tasks > > that hold a mutex or other sort of lock. If this mutex is needed later > > on for suspending a device then the suspend will hang, because a frozen > > task can't release any mutexes. > > I did a random sampling of ->suspend() callbacks, and they don't seem > to be taking mutexes. Does that happen at all? It does, particularly among drivers that do runtime PM, which is becoming more and more important. Besides, suspend has to synchronize with I/O somehow. Right now that is handled by making suspend wait until no tasks are doing I/O (because they are all frozen). If you allow tasks to be frozen at more or less arbitrary times, while holding arbitrary locks, then you may end up freezing a task that's in the middle of I/O. That should certainly block the suspend (not to mention messing up the I/O operation). > Did anybody ever try modifying the freezer for suspend (not > hibernate), so that it allows tasks not in running state to freeze? > If not, I think that's an experiment worth doing. What happens if the reason the task isn't running is because it's waiting for I/O to complete? I just don't think this can be made to work. Alan Stern ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <Pine.LNX.4.44L0.0810291122220.2189-100000@iolanthe.rowland.org>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <Pine.LNX.4.44L0.0810291122220.2189-100000@iolanthe.rowland.org> @ 2008-10-29 15:50 ` Miklos Szeredi 2008-10-29 16:10 ` Miklos Szeredi 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-29 15:50 UTC (permalink / raw) To: stern; +Cc: ncunningham, linux-pm, linux-kernel, miklos On Wed, 29 Oct 2008, Alan Stern wrote: > On Wed, 29 Oct 2008, Miklos Szeredi wrote: > > I did a random sampling of ->suspend() callbacks, and they don't seem > > to be taking mutexes. Does that happen at all? > > It does, particularly among drivers that do runtime PM, which is > becoming more and more important. > > Besides, suspend has to synchronize with I/O somehow. Right now that > is handled by making suspend wait until no tasks are doing I/O (because > they are all frozen). What about async I/O? > If you allow tasks to be frozen at more or less > arbitrary times, while holding arbitrary locks, then you may end up > freezing a task that's in the middle of I/O. That should certainly > block the suspend (not to mention messing up the I/O operation). What is the middle of I/O? Depending the type of I/O it could be messed up regardless of whether tasks happen to be in userspace or not (e.g. printing). And some types of I/O are already mostly decoupled from userspace (file I/O, networking), so the userspace freezing shoudln't make any difference. > > Did anybody ever try modifying the freezer for suspend (not > > hibernate), so that it allows tasks not in running state to freeze? > > If not, I think that's an experiment worth doing. > > What happens if the reason the task isn't running is because it's > waiting for I/O to complete? I just don't think this can be made to > work. Don't know. I've never written a driver, and I'm not familiar with runtime PM, etc. So I can't come up with a detailed design for solving the freezer issues there. But I do think that the solution does not lie in "fixing" the VFS. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <Pine.LNX.4.44L0.0810291122220.2189-100000@iolanthe.rowland.org> 2008-10-29 15:50 ` Miklos Szeredi @ 2008-10-29 16:10 ` Miklos Szeredi 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-29 16:10 UTC (permalink / raw) To: stern; +Cc: ncunningham, linux-pm, linux-kernel, miklos On Wed, 29 Oct 2008, Alan Stern wrote: > On Wed, 29 Oct 2008, Miklos Szeredi wrote: > > > Actually I was thinking of an rw-semaphore, not a mutex. But yeah > > that still has scalability problems. But it could be done with custom > > locking primitives, optimized for this case: > > > > suspend_disable(); > > /* driver stuff */ > > suspend_enable(); > > Yes, it could be done. And the overhead could be minimized by using > per-CPU variables. It would still be an awful lot of work, and easy to > get wrong. OK, getting back to this, as it seems to be the only way that we agree is doable. How about this, a) identify syscalls that may make drivers do I/O: - read - write - ioctl ??? b) add the suspend_disable/enable() primitives to these syscalls c) push primitives inside the implementation c) is slightly tricky, but could be done for example by setting a flag on open: FMODE_NO_SUSPEND_DISABLE (better name required), saying that implementation is responsible for getting the suspend disable magic right. For starters this flag could be set for all non-device opens (maybe all non-char-dev opens?), solving the fuse vs. freezer issues without any complicated trickery. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KvDJp-00088f-2b@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <E1KvDJp-00088f-2b@pomaz-ex.szeredi.hu> @ 2008-10-29 16:17 ` Alan Stern 0 siblings, 0 replies; 59+ messages in thread From: Alan Stern @ 2008-10-29 16:17 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-kernel, linux-pm On Wed, 29 Oct 2008, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Alan Stern wrote: > > On Wed, 29 Oct 2008, Miklos Szeredi wrote: > > > I did a random sampling of ->suspend() callbacks, and they don't seem > > > to be taking mutexes. Does that happen at all? > > > > It does, particularly among drivers that do runtime PM, which is > > becoming more and more important. > > > > Besides, suspend has to synchronize with I/O somehow. Right now that > > is handled by making suspend wait until no tasks are doing I/O (because > > they are all frozen). > > What about async I/O? I don't know. Maybe it slipped through the cracks. It wouldn't be surprising to find that async I/O can get messed up during a suspend. > > If you allow tasks to be frozen at more or less > > arbitrary times, while holding arbitrary locks, then you may end up > > freezing a task that's in the middle of I/O. That should certainly > > block the suspend (not to mention messing up the I/O operation). > > What is the middle of I/O? Depending the type of I/O it could be > messed up regardless of whether tasks happen to be in userspace or not > (e.g. printing). I'll wriggle out of this by saying that "in the middle of I/O" means the task has gotten a device into a state from which it can't successfully be suspended. An example would be if the system had sent the device part of a command. Even if you did manage to suspend the device and resume it later, you wouldn't expect it to work right when it received the second half of the command. In general, drivers don't leave devices in this kind of intermediate state for long. A driver might sleep at such times, but it wouldn't return to userspace. > And some types of I/O are already mostly decoupled from userspace > (file I/O, networking), so the userspace freezing shoudln't make any > difference. True. We're mostly talking about character device I/O. There are a lot of character device drivers in the kernel. :-) > > > Did anybody ever try modifying the freezer for suspend (not > > > hibernate), so that it allows tasks not in running state to freeze? > > > If not, I think that's an experiment worth doing. > > > > What happens if the reason the task isn't running is because it's > > waiting for I/O to complete? I just don't think this can be made to > > work. > > Don't know. I've never written a driver, and I'm not familiar with > runtime PM, etc. So I can't come up with a detailed design for > solving the freezer issues there. > > But I do think that the solution does not lie in "fixing" the VFS. I tend to agree. The problems posed by fuse are very difficult. It is not at all obvious how to attack them. Alan Stern ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KvDcp-0008B7-SR@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <E1KvDcp-0008B7-SR@pomaz-ex.szeredi.hu> @ 2008-10-29 20:37 ` Alan Stern [not found] ` <Pine.LNX.4.44L0.0810291627040.3986-100000@iolanthe.rowland.org> 1 sibling, 0 replies; 59+ messages in thread From: Alan Stern @ 2008-10-29 20:37 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-kernel, linux-pm On Wed, 29 Oct 2008, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Alan Stern wrote: > > On Wed, 29 Oct 2008, Miklos Szeredi wrote: > > > > > Actually I was thinking of an rw-semaphore, not a mutex. But yeah > > > that still has scalability problems. But it could be done with custom > > > locking primitives, optimized for this case: > > > > > > suspend_disable(); > > > /* driver stuff */ > > > suspend_enable(); > > > > Yes, it could be done. And the overhead could be minimized by using > > per-CPU variables. It would still be an awful lot of work, and easy to > > get wrong. > > OK, getting back to this, as it seems to be the only way that we agree > is doable. > > How about this, > > a) identify syscalls that may make drivers do I/O: > > - read > - write > - ioctl > ??? > > b) add the suspend_disable/enable() primitives to these syscalls > > c) push primitives inside the implementation I discussed this last summer with Rafael. It's a lot harder than it looks, for all sorts of reasons. For example, what about user tasks that have access to memory-mapped I/O regions? > c) is slightly tricky, but could be done for example by setting a flag > on open: FMODE_NO_SUSPEND_DISABLE (better name required), saying that > implementation is responsible for getting the suspend disable magic > right. > > For starters this flag could be set for all non-device opens (maybe all > non-char-dev opens?), solving the fuse vs. freezer issues without any > complicated trickery. I don't know. There are other interfaces too, like sysfs attributes, that would have to be handled specially. On the whole, the freezer seems much, much simpler. Regarding fuse, something like Nigel's scheme for preventing new requests and then waiting for old requests to complete might work out. Especially if you combine it with a strategy for making the freezer back and retry after a delay when something goes wrong. Alan Stern ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <Pine.LNX.4.44L0.0810291627040.3986-100000@iolanthe.rowland.org>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <Pine.LNX.4.44L0.0810291627040.3986-100000@iolanthe.rowland.org> @ 2008-10-29 21:11 ` Rafael J. Wysocki [not found] ` <200810292211.49021.rjw@sisk.pl> ` (2 subsequent siblings) 3 siblings, 0 replies; 59+ messages in thread From: Rafael J. Wysocki @ 2008-10-29 21:11 UTC (permalink / raw) To: Alan Stern; +Cc: ncunningham, linux-pm, linux-kernel, Miklos Szeredi On Wednesday, 29 of October 2008, Alan Stern wrote: > On Wed, 29 Oct 2008, Miklos Szeredi wrote: > > > On Wed, 29 Oct 2008, Alan Stern wrote: > > > On Wed, 29 Oct 2008, Miklos Szeredi wrote: > > > > > > > Actually I was thinking of an rw-semaphore, not a mutex. But yeah > > > > that still has scalability problems. But it could be done with custom > > > > locking primitives, optimized for this case: > > > > > > > > suspend_disable(); > > > > /* driver stuff */ > > > > suspend_enable(); > > > > > > Yes, it could be done. And the overhead could be minimized by using > > > per-CPU variables. It would still be an awful lot of work, and easy to > > > get wrong. > > > > OK, getting back to this, as it seems to be the only way that we agree > > is doable. > > > > How about this, > > > > a) identify syscalls that may make drivers do I/O: > > > > - read > > - write > > - ioctl > > ??? > > > > b) add the suspend_disable/enable() primitives to these syscalls > > > > c) push primitives inside the implementation > > I discussed this last summer with Rafael. It's a lot harder than it > looks, for all sorts of reasons. For example, what about user tasks > that have access to memory-mapped I/O regions? > > > c) is slightly tricky, but could be done for example by setting a flag > > on open: FMODE_NO_SUSPEND_DISABLE (better name required), saying that > > implementation is responsible for getting the suspend disable magic > > right. > > > > For starters this flag could be set for all non-device opens (maybe all > > non-char-dev opens?), solving the fuse vs. freezer issues without any > > complicated trickery. > > I don't know. There are other interfaces too, like sysfs attributes, > that would have to be handled specially. On the whole, the freezer > seems much, much simpler. > > Regarding fuse, something like Nigel's scheme for preventing new > requests and then waiting for old requests to complete might work out. > Especially if you combine it with a strategy for making the freezer > back and retry after a delay when something goes wrong. Yeah. The current design of the freezer is rather simplistic and I'm not really sure it's the best one possible. Perhaps we can redesign the freezer to work differently and handle the cases like fuse. I didn't have the time to think about that yet, though. Thanks, Rafael ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <200810292211.49021.rjw@sisk.pl>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <200810292211.49021.rjw@sisk.pl> @ 2008-10-29 21:45 ` Nigel Cunningham [not found] ` <1225316715.18786.1.camel@nigel-laptop> 1 sibling, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-29 21:45 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: linux-pm, linux-kernel, Miklos Szeredi Hi. On Wed, 2008-10-29 at 22:11 +0100, Rafael J. Wysocki wrote: > The current design of the freezer is rather simplistic and I'm not really sure > it's the best one possible. Perhaps we can redesign the freezer to work > differently and handle the cases like fuse. Why redo what I've already done? In the full patch, you have the basis of what you're talking about. I haven't seen a failure to freeze fuse or anything else in a year of use. Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225316715.18786.1.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225316715.18786.1.camel@nigel-laptop> @ 2008-10-29 22:07 ` Rafael J. Wysocki 2008-10-29 23:48 ` Miklos Szeredi ` (2 subsequent siblings) 3 siblings, 0 replies; 59+ messages in thread From: Rafael J. Wysocki @ 2008-10-29 22:07 UTC (permalink / raw) To: Nigel Cunningham; +Cc: linux-pm, linux-kernel, Miklos Szeredi On Wednesday, 29 of October 2008, Nigel Cunningham wrote: > Hi. > > On Wed, 2008-10-29 at 22:11 +0100, Rafael J. Wysocki wrote: > > The current design of the freezer is rather simplistic and I'm not really sure > > it's the best one possible. Perhaps we can redesign the freezer to work > > differently and handle the cases like fuse. > > Why redo what I've already done? In the full patch, you have the basis > of what you're talking about. I haven't seen a failure to freeze fuse or > anything else in a year of use. Still, Miklos noticed some problems with it. I'm not talking about doing things on top of the current signal-based freezing mechanism, but rather about moving the freezer a bit closer towards the scheduler. Maybe this is the way to go. I don't know. In any case, the freezing of user space seems to be much simpler than modifying all drivers to add suspend synchronization. Thanks, Rafael ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225316715.18786.1.camel@nigel-laptop> 2008-10-29 22:07 ` Rafael J. Wysocki @ 2008-10-29 23:48 ` Miklos Szeredi [not found] ` <200810292307.26945.rjw@sisk.pl> [not found] ` <E1KvKm8-0000cP-6f@pomaz-ex.szeredi.hu> 3 siblings, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-29 23:48 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Thu, 30 Oct 2008, Nigel Cunningham wrote: > Hi. > > On Wed, 2008-10-29 at 22:11 +0100, Rafael J. Wysocki wrote: > > The current design of the freezer is rather simplistic and I'm not really sure > > it's the best one possible. Perhaps we can redesign the freezer to work > > differently and handle the cases like fuse. > > Why redo what I've already done? In the full patch, you have the basis > of what you're talking about. I haven't seen a failure to freeze fuse or > anything else in a year of use. Well yeah, your patch handles the straightforward cases. But it doesn't help with the more tricky cases, where one fuse filesystem is using another, and as those may become more widespread, this approach will fail. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <200810292307.26945.rjw@sisk.pl>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <200810292307.26945.rjw@sisk.pl> @ 2008-10-29 23:53 ` Miklos Szeredi [not found] ` <E1KvKql-0000dC-3M@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-29 23:53 UTC (permalink / raw) To: rjw; +Cc: ncunningham, miklos, linux-pm, linux-kernel On Wed, 29 Oct 2008, Rafael J. Wysocki wrote: > In any case, the freezing of user space seems to be much simpler > than modifying all drivers to add suspend synchronization. As I mentioned, we don't _have_ to modify all drivers. It could start out by just modifying syscalls that result in calls to drivers. That would be a 10-liner patch. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KvKql-0000dC-3M@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KvKql-0000dC-3M@pomaz-ex.szeredi.hu> @ 2008-11-09 13:44 ` Pavel Machek 0 siblings, 0 replies; 59+ messages in thread From: Pavel Machek @ 2008-11-09 13:44 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-pm, linux-kernel On Thu 2008-10-30 00:53:19, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Rafael J. Wysocki wrote: > > In any case, the freezing of user space seems to be much simpler > > than modifying all drivers to add suspend synchronization. > > As I mentioned, we don't _have_ to modify all drivers. It could start > out by just modifying syscalls that result in calls to drivers. That > would be a 10-liner patch. If you can solve current freezer vs. fuse in 10 lines, that would be useful... -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KvKm8-0000cP-6f@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KvKm8-0000cP-6f@pomaz-ex.szeredi.hu> @ 2008-10-30 13:04 ` Nigel Cunningham 0 siblings, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-30 13:04 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi. On Thu, 2008-10-30 at 00:48 +0100, Miklos Szeredi wrote: > On Thu, 30 Oct 2008, Nigel Cunningham wrote: > > Hi. > > > > On Wed, 2008-10-29 at 22:11 +0100, Rafael J. Wysocki wrote: > > > The current design of the freezer is rather simplistic and I'm not really sure > > > it's the best one possible. Perhaps we can redesign the freezer to work > > > differently and handle the cases like fuse. > > > > Why redo what I've already done? In the full patch, you have the basis > > of what you're talking about. I haven't seen a failure to freeze fuse or > > anything else in a year of use. > > Well yeah, your patch handles the straightforward cases. But it > doesn't help with the more tricky cases, where one fuse filesystem is > using another, and as those may become more widespread, this approach > will fail. At the moment, yes. But it's not impossible for us to modify the patch to handle that as well. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <Pine.LNX.4.44L0.0810291627040.3986-100000@iolanthe.rowland.org> 2008-10-29 21:11 ` Rafael J. Wysocki [not found] ` <200810292211.49021.rjw@sisk.pl> @ 2008-10-29 23:43 ` Miklos Szeredi 2008-10-29 23:56 ` Matthew Garrett 3 siblings, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-29 23:43 UTC (permalink / raw) To: stern; +Cc: ncunningham, linux-pm, linux-kernel, miklos On Wed, 29 Oct 2008, Alan Stern wrote: > I discussed this last summer with Rafael. It's a lot harder than it > looks, for all sorts of reasons. For example, what about user tasks > that have access to memory-mapped I/O regions? What about them? Freezing doesn't seem to help with that. > > c) is slightly tricky, but could be done for example by setting a flag > > on open: FMODE_NO_SUSPEND_DISABLE (better name required), saying that > > implementation is responsible for getting the suspend disable magic > > right. > > > > For starters this flag could be set for all non-device opens (maybe all > > non-char-dev opens?), solving the fuse vs. freezer issues without any > > complicated trickery. > > I don't know. There are other interfaces too, like sysfs attributes, > that would have to be handled specially. On the whole, the freezer > seems much, much simpler. OK, then non-device files on "regular" filesystems. > Regarding fuse, something like Nigel's scheme for preventing new > requests and then waiting for old requests to complete might work out. > Especially if you combine it with a strategy for making the freezer > back and retry after a delay when something goes wrong. I don't think it will work out, because to be able to do this some ordering between freezing the filesystems must be done. But this is basically impossible, for all the same reasons it's impossible to order the freezing of userspace tasks. Also this is not just a fuse issue: we have userspace network devices, we have userspace USB drivers, etc, affected by this problem. If there _is_ a solution with the freezer that does solve all of this, I haven't yet heard it. But yeah, in the end the simpler solution should win. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <Pine.LNX.4.44L0.0810291627040.3986-100000@iolanthe.rowland.org> ` (2 preceding siblings ...) 2008-10-29 23:43 ` Miklos Szeredi @ 2008-10-29 23:56 ` Matthew Garrett 3 siblings, 0 replies; 59+ messages in thread From: Matthew Garrett @ 2008-10-29 23:56 UTC (permalink / raw) To: Alan Stern; +Cc: ncunningham, linux-pm, linux-kernel, Miklos Szeredi On Wed, Oct 29, 2008 at 04:37:30PM -0400, Alan Stern wrote: > I discussed this last summer with Rafael. It's a lot harder than it > looks, for all sorts of reasons. For example, what about user tasks > that have access to memory-mapped I/O regions? The only interesting cases of this I know of are X (which ought to stop doing so in the near future) and certain sound operations (which are going via a well-defined API anyway and need to handle devices going away at random, so not a problem). There may be some other niche cases, but really - if you're mmapping hardware then it's generally because you haven't written a proper kernel driver. Do that instead. Runtime power management's already going to make you wildly unhappy. -- Matthew Garrett | mjg59@srcf.ucam.org ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KvKgv-0000bf-RS@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <E1KvKgv-0000bf-RS@pomaz-ex.szeredi.hu> @ 2008-10-30 13:54 ` Alan Stern 0 siblings, 0 replies; 59+ messages in thread From: Alan Stern @ 2008-10-30 13:54 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-kernel, linux-pm On Thu, 30 Oct 2008, Miklos Szeredi wrote: > On Wed, 29 Oct 2008, Alan Stern wrote: > > I discussed this last summer with Rafael. It's a lot harder than it > > looks, for all sorts of reasons. For example, what about user tasks > > that have access to memory-mapped I/O regions? > > What about them? Freezing doesn't seem to help with that. Sure it does. A frozen process can't touch a memory-mapped I/O region, whereas a non-frozen process can. Although as Matthew pointed out, this is rapidly becoming a moot point. > > > c) is slightly tricky, but could be done for example by setting a flag > > > on open: FMODE_NO_SUSPEND_DISABLE (better name required), saying that > > > implementation is responsible for getting the suspend disable magic > > > right. > > > > > > For starters this flag could be set for all non-device opens (maybe all > > > non-char-dev opens?), solving the fuse vs. freezer issues without any > > > complicated trickery. > > > > I don't know. There are other interfaces too, like sysfs attributes, > > that would have to be handled specially. On the whole, the freezer > > seems much, much simpler. > > OK, then non-device files on "regular" filesystems. Would you like to write a first-pass patch? I don't think it will work. > > Regarding fuse, something like Nigel's scheme for preventing new > > requests and then waiting for old requests to complete might work out. > > Especially if you combine it with a strategy for making the freezer > > back and retry after a delay when something goes wrong. > > I don't think it will work out, because to be able to do this some > ordering between freezing the filesystems must be done. But this is > basically impossible, for all the same reasons it's impossible to > order the freezing of userspace tasks. > > Also this is not just a fuse issue: we have userspace network devices, > we have userspace USB drivers, etc, affected by this problem. > > If there _is_ a solution with the freezer that does solve all of this, > I haven't yet heard it. But yeah, in the end the simpler solution > should win. To do this properly, it is necessary to categorize sleeping states. Some should count as frozen and others shouldn't. So for instance, all the mutexes and wait queues involved in VFS would count as frozen; a process waiting on one of them can simply be kept from waking up until the suspend is over. Others (those involved in device I/O) wouldn't count as frozen; a task waiting on one of them would have to wake up and move forward before the system could suspend. Doing that seems like a lot of work, just as modifying every driver does. Changing a few kernel entry points is simpler, but I'm pretty sure it won't work. For instance, tasks can block arbitrarily long on read calls (waiting for data to arrive); you can't allow such things to prevent the system from suspending. Alan Stern ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225371842.8576.6.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <1225371842.8576.6.camel@nigel-laptop> @ 2008-10-30 13:56 ` Alan Stern 2008-10-30 21:44 ` Nigel Cunningham [not found] ` <1225403060.6574.10.camel@nigel-laptop> 0 siblings, 2 replies; 59+ messages in thread From: Alan Stern @ 2008-10-30 13:56 UTC (permalink / raw) To: Nigel Cunningham; +Cc: linux-pm, linux-kernel, Miklos Szeredi On Fri, 31 Oct 2008, Nigel Cunningham wrote: > > Well yeah, your patch handles the straightforward cases. But it > > doesn't help with the more tricky cases, where one fuse filesystem is > > using another, and as those may become more widespread, this approach > > will fail. > > At the moment, yes. But it's not impossible for us to modify the patch > to handle that as well. It depends on what you mean. The most direct reading of your statement is simply wrong: It is _theoretically_ impossible to find the correct order for freezing filesystems. To do so would be equivalent to solving the halting problem. Alan Stern ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. 2008-10-30 13:56 ` Alan Stern @ 2008-10-30 21:44 ` Nigel Cunningham [not found] ` <1225403060.6574.10.camel@nigel-laptop> 1 sibling, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-30 21:44 UTC (permalink / raw) To: Alan Stern; +Cc: linux-pm, linux-kernel, Miklos Szeredi Hi Alan. On Thu, 2008-10-30 at 09:56 -0400, Alan Stern wrote: > On Fri, 31 Oct 2008, Nigel Cunningham wrote: > > > > Well yeah, your patch handles the straightforward cases. But it > > > doesn't help with the more tricky cases, where one fuse filesystem is > > > using another, and as those may become more widespread, this approach > > > will fail. > > > > At the moment, yes. But it's not impossible for us to modify the patch > > to handle that as well. > > It depends on what you mean. The most direct reading of your statement > is simply wrong: It is _theoretically_ impossible to find the correct > order for freezing filesystems. To do so would be equivalent to > solving the halting problem. I'm not sure that's true. You see, I'm thinking of this as not that different to the problem of unmounting filesystems. There, too, we need to unmount in a particular order, and let transactions on each filesystem stop cleanly before we can unmount them. Even if there are differences, perhaps looking at how we handle unmounting will help with handling freezing. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225403060.6574.10.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225403060.6574.10.camel@nigel-laptop> @ 2008-10-31 8:49 ` Miklos Szeredi [not found] ` <E1Kvpgo-00045l-GU@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-31 8:49 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Fri, 31 Oct 2008, Nigel Cunningham wrote: > I'm not sure that's true. You see, I'm thinking of this as not that > different to the problem of unmounting filesystems. There, too, we need > to unmount in a particular order, and let transactions on each > filesystem stop cleanly before we can unmount them. Even if there are > differences, perhaps looking at how we handle unmounting will help with > handling freezing. There's nothing magic about umount, it just uses a refcount on the fs. But umount changes the namespace, that's the big difference. For example if a process is accessing path P which has a component inside the mount, it _will_ get different results before and after the umount. This is not acceptable for freezing. For freezing to work with such a refcounting scheme, we'd have to count _future_ uses of the fs as well, not just current ones, which is obviously impossible. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1Kvpgo-00045l-GU@pomaz-ex.szeredi.hu>]
[parent not found: <1225444253.6574.21.camel@nigel-laptop>]
[parent not found: <E1Kvq72-00049W-O1@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1Kvq72-00049W-O1@pomaz-ex.szeredi.hu> @ 2008-10-31 11:28 ` Nigel Cunningham [not found] ` <1225452484.6574.35.camel@nigel-laptop> 1 sibling, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-31 11:28 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi again. On Fri, 2008-10-31 at 10:16 +0100, Miklos Szeredi wrote: > On Fri, 31 Oct 2008, Nigel Cunningham wrote: > > Hi. > > > > On Fri, 2008-10-31 at 09:49 +0100, Miklos Szeredi wrote: > > > On Fri, 31 Oct 2008, Nigel Cunningham wrote: > > > > I'm not sure that's true. You see, I'm thinking of this as not that > > > > different to the problem of unmounting filesystems. There, too, we need > > > > to unmount in a particular order, and let transactions on each > > > > filesystem stop cleanly before we can unmount them. Even if there are > > > > differences, perhaps looking at how we handle unmounting will help with > > > > handling freezing. > > > > > > There's nothing magic about umount, it just uses a refcount on the fs. > > > > > > But umount changes the namespace, that's the big difference. For > > > example if a process is accessing path P which has a component inside > > > the mount, it _will_ get different results before and after the > > > umount. This is not acceptable for freezing. > > > > > > For freezing to work with such a refcounting scheme, we'd have to > > > count _future_ uses of the fs as well, not just current ones, which is > > > obviously impossible. > > > > I must be missing something. If you're freezing future users of the > > filesystem before they can start anything new, doesn't that deal with > > this problem? > > How do you determine which are the future users? You don't. You just use FUSE_MIGHT_FREEZE to stop them before they take locks that might cause problems. So my suggestion is: 1) Stop new requests at FUSE_MIGHT_FREEZE 2) Handle existing requests by using freezer_do_not_count in request_send and request_send_nowait before the spin_lock and freezer_count after the spin_unlock. With #2, we don't need to care about whether the request is completed before freezing completes or post-resume. If the userspace process tries to use an already frozen fuse filesystem and gets frozen, that's okay because we'll sit waiting for an answer, not be counted by the freezer and so not contribute to any failure to freeze processes. If the userspace process completes its work, we'll either get caught at the freezer_count (if we've already been told to freeze) or be gotten later, after exiting the fuse code. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <1225452484.6574.35.camel@nigel-laptop>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <1225452484.6574.35.camel@nigel-laptop> @ 2008-10-31 12:44 ` Miklos Szeredi [not found] ` <E1KvtMn-0004YD-00@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-31 12:44 UTC (permalink / raw) To: ncunningham; +Cc: linux-pm, linux-kernel, miklos On Fri, 31 Oct 2008, Nigel Cunningham wrote: > You don't. You just use FUSE_MIGHT_FREEZE to stop them before they take > locks that might cause problems. So my suggestion is: Before? FUSE_MIGHT_FREEZE is called _after_ i_mutex is taken by the VFS. > 1) Stop new requests at FUSE_MIGHT_FREEZE > 2) Handle existing requests by using freezer_do_not_count in > request_send and request_send_nowait before the spin_lock and > freezer_count after the spin_unlock. > > With #2, we don't need to care about whether the request is completed > before freezing completes or post-resume. > > If the userspace process tries to use an already frozen fuse filesystem > and gets frozen, that's okay because we'll sit waiting for an answer, > not be counted by the freezer and so not contribute to any failure to > freeze processes. > > If the userspace process completes its work, we'll either get caught at > the freezer_count (if we've already been told to freeze) or be gotten > later, after exiting the fuse code. Yes, this is the variant of categorizing sleeps to freezing and non-freezing. The problem is, you need to do that with all mutex_lock(&inode->i_mutex) instances as well. Try grepping for that in fs/*.c! It _is_ possible, it's just not practical. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KvtMn-0004YD-00@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KvtMn-0004YD-00@pomaz-ex.szeredi.hu> @ 2008-10-31 21:11 ` Nigel Cunningham 0 siblings, 0 replies; 59+ messages in thread From: Nigel Cunningham @ 2008-10-31 21:11 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-pm, linux-kernel Hi. On Fri, 2008-10-31 at 13:44 +0100, Miklos Szeredi wrote: > On Fri, 31 Oct 2008, Nigel Cunningham wrote: > > You don't. You just use FUSE_MIGHT_FREEZE to stop them before they take > > locks that might cause problems. So my suggestion is: > > Before? FUSE_MIGHT_FREEZE is called _after_ i_mutex is taken by the > VFS. Ok. So I'll shift it backwards in the calling chain a bit. > > 1) Stop new requests at FUSE_MIGHT_FREEZE > > 2) Handle existing requests by using freezer_do_not_count in > > request_send and request_send_nowait before the spin_lock and > > freezer_count after the spin_unlock. > > > > With #2, we don't need to care about whether the request is completed > > before freezing completes or post-resume. > > > > If the userspace process tries to use an already frozen fuse filesystem > > and gets frozen, that's okay because we'll sit waiting for an answer, > > not be counted by the freezer and so not contribute to any failure to > > freeze processes. > > > > If the userspace process completes its work, we'll either get caught at > > the freezer_count (if we've already been told to freeze) or be gotten > > later, after exiting the fuse code. > > Yes, this is the variant of categorizing sleeps to freezing and > non-freezing. The problem is, you need to do that with all > mutex_lock(&inode->i_mutex) instances as well. Try grepping for that > in fs/*.c! > > It _is_ possible, it's just not practical. Let's get practical. You know fuse a lot better than me. Can you give me the most complicated, ugly configuration you can think of, and I'll work on making it freeze first time, every time. Regards, Nigel ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <Pine.LNX.4.44L0.0810300943001.2401-100000@iolanthe.rowland.org>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <Pine.LNX.4.44L0.0810300943001.2401-100000@iolanthe.rowland.org> @ 2008-10-30 14:39 ` Miklos Szeredi [not found] ` <E1KvYgR-0002Cx-AZ@pomaz-ex.szeredi.hu> 1 sibling, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-30 14:39 UTC (permalink / raw) To: stern; +Cc: ncunningham, linux-pm, linux-kernel, miklos On Thu, 30 Oct 2008, Alan Stern wrote: > On Thu, 30 Oct 2008, Miklos Szeredi wrote: > > > On Wed, 29 Oct 2008, Alan Stern wrote: > > > I discussed this last summer with Rafael. It's a lot harder than it > > > looks, for all sorts of reasons. For example, what about user tasks > > > that have access to memory-mapped I/O regions? > > > > What about them? Freezing doesn't seem to help with that. > > Sure it does. A frozen process can't touch a memory-mapped I/O region, > whereas a non-frozen process can. But it can be in the middle of I/O by your definition. > > > I don't know. There are other interfaces too, like sysfs attributes, > > > that would have to be handled specially. On the whole, the freezer > > > seems much, much simpler. > > > > OK, then non-device files on "regular" filesystems. > > Would you like to write a first-pass patch? I don't think it will > work. If somebody doesn't beat me to it, I'll do that (first implemented with a global rw-sem). > Doing that seems like a lot of work, just as modifying every driver > does. Changing a few kernel entry points is simpler, but I'm pretty > sure it won't work. For instance, tasks can block arbitrarily long on > read calls (waiting for data to arrive); you can't allow such things to > prevent the system from suspending. But we already do: either a) it's in interruptible sleep (I/O on sockets, pipes, etc), and freezing simply interrupts it, or b) it's in uninterruptible sleep and suspend will wait it out (or time out). In the new scheme we could retain that part of the freezer: interrupt all tasks which are inside the critical region and wait for them to exit the critical region. To put it in another way: it's still the freezer, it does all the same things as the old freezer, except that the condition for freezing is not that the task is out of the kernel, rather that it's out of the disable_supend - enable_suspend region. As such it's not a big change to the whole suspend system, and so there shouldn't be anything big going wrong there. Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <E1KvYgR-0002Cx-AZ@pomaz-ex.szeredi.hu>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KvYgR-0002Cx-AZ@pomaz-ex.szeredi.hu> @ 2008-10-30 17:07 ` Alan Stern 2008-10-30 20:17 ` Rafael J. Wysocki 2008-11-15 16:58 ` Pavel Machek 2 siblings, 0 replies; 59+ messages in thread From: Alan Stern @ 2008-10-30 17:07 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-kernel, linux-pm On Thu, 30 Oct 2008, Miklos Szeredi wrote: > On Thu, 30 Oct 2008, Alan Stern wrote: > > On Thu, 30 Oct 2008, Miklos Szeredi wrote: > > > > > On Wed, 29 Oct 2008, Alan Stern wrote: > > > > I discussed this last summer with Rafael. It's a lot harder than it > > > > looks, for all sorts of reasons. For example, what about user tasks > > > > that have access to memory-mapped I/O regions? > > > > > > What about them? Freezing doesn't seem to help with that. > > > > Sure it does. A frozen process can't touch a memory-mapped I/O region, > > whereas a non-frozen process can. > > But it can be in the middle of I/O by your definition. True. Yet another problem... > > Would you like to write a first-pass patch? I don't think it will > > work. > > If somebody doesn't beat me to it, I'll do that (first implemented > with a global rw-sem). Converting it to per-CPU counters later on should be fairly easy. > > Doing that seems like a lot of work, just as modifying every driver > > does. Changing a few kernel entry points is simpler, but I'm pretty > > sure it won't work. For instance, tasks can block arbitrarily long on > > read calls (waiting for data to arrive); you can't allow such things to > > prevent the system from suspending. > > But we already do: either > > a) it's in interruptible sleep (I/O on sockets, pipes, etc), and > freezing simply interrupts it, or > > b) it's in uninterruptible sleep and suspend will wait it out (or > time out). > > In the new scheme we could retain that part of the freezer: interrupt > all tasks which are inside the critical region and wait for them to > exit the critical region. > > To put it in another way: it's still the freezer, it does all the same > things as the old freezer, except that the condition for freezing is > not that the task is out of the kernel, rather that it's out of the > disable_supend - enable_suspend region. As such it's not a big change > to the whole suspend system, and so there shouldn't be anything big > going wrong there. Okay. Don't forget things like ioctl for sockets -- they often involve doing I/O directly to the network interface device. What happens to a task accessing a non-regular file on a fuse filesystem? :-) Alan Stern ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KvYgR-0002Cx-AZ@pomaz-ex.szeredi.hu> 2008-10-30 17:07 ` Alan Stern @ 2008-10-30 20:17 ` Rafael J. Wysocki 2008-11-15 16:58 ` Pavel Machek 2 siblings, 0 replies; 59+ messages in thread From: Rafael J. Wysocki @ 2008-10-30 20:17 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-kernel, linux-pm On Thursday, 30 of October 2008, Miklos Szeredi wrote: > On Thu, 30 Oct 2008, Alan Stern wrote: > > On Thu, 30 Oct 2008, Miklos Szeredi wrote: > > > > > On Wed, 29 Oct 2008, Alan Stern wrote: > > > > I discussed this last summer with Rafael. It's a lot harder than it > > > > looks, for all sorts of reasons. For example, what about user tasks > > > > that have access to memory-mapped I/O regions? > > > > > > What about them? Freezing doesn't seem to help with that. > > > > Sure it does. A frozen process can't touch a memory-mapped I/O region, > > whereas a non-frozen process can. > > But it can be in the middle of I/O by your definition. > > > > > I don't know. There are other interfaces too, like sysfs attributes, > > > > that would have to be handled specially. On the whole, the freezer > > > > seems much, much simpler. > > > > > > OK, then non-device files on "regular" filesystems. > > > > Would you like to write a first-pass patch? I don't think it will > > work. > > If somebody doesn't beat me to it, I'll do that (first implemented > with a global rw-sem). > > > Doing that seems like a lot of work, just as modifying every driver > > does. Changing a few kernel entry points is simpler, but I'm pretty > > sure it won't work. For instance, tasks can block arbitrarily long on > > read calls (waiting for data to arrive); you can't allow such things to > > prevent the system from suspending. > > But we already do: either > > a) it's in interruptible sleep (I/O on sockets, pipes, etc), and > freezing simply interrupts it, or > > b) it's in uninterruptible sleep and suspend will wait it out (or > time out). > > In the new scheme we could retain that part of the freezer: interrupt > all tasks which are inside the critical region and wait for them to > exit the critical region. > > To put it in another way: it's still the freezer, it does all the same > things as the old freezer, except that the condition for freezing is > not that the task is out of the kernel, rather that it's out of the > disable_supend - enable_suspend region. As such it's not a big change > to the whole suspend system, and so there shouldn't be anything big > going wrong there. I like this idea. I was thinking about using task flags to mark processes as "not freezable at the moment", which would make the freezer to wait for such tasks (currently a task may only be always freezable or not freezable at all, which is not very flexible). Unfortunately, I didn't have the time to implement this. Thanks, Rafael ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] ` <E1KvYgR-0002Cx-AZ@pomaz-ex.szeredi.hu> 2008-10-30 17:07 ` Alan Stern 2008-10-30 20:17 ` Rafael J. Wysocki @ 2008-11-15 16:58 ` Pavel Machek 2 siblings, 0 replies; 59+ messages in thread From: Pavel Machek @ 2008-11-15 16:58 UTC (permalink / raw) To: Miklos Szeredi; +Cc: ncunningham, linux-kernel, linux-pm Hi! > > > > I don't know. There are other interfaces too, like sysfs attributes, > > > > that would have to be handled specially. On the whole, the freezer > > > > seems much, much simpler. > > > > > > OK, then non-device files on "regular" filesystems. > > > > Would you like to write a first-pass patch? I don't think it will > > work. > > If somebody doesn't beat me to it, I'll do that (first implemented > with a global rw-sem). Cool! l > > Doing that seems like a lot of work, just as modifying every driver > > does. Changing a few kernel entry points is simpler, but I'm pretty > > sure it won't work. For instance, tasks can block arbitrarily long on > > read calls (waiting for data to arrive); you can't allow such things to > > prevent the system from suspending. > > But we already do: either > > a) it's in interruptible sleep (I/O on sockets, pipes, etc), and > freezing simply interrupts it, or > > b) it's in uninterruptible sleep and suspend will wait it out (or > time out). > > In the new scheme we could retain that part of the freezer: interrupt > all tasks which are inside the critical region and wait for them to > exit the critical region. > > To put it in another way: it's still the freezer, it does all the same > things as the old freezer, except that the condition for freezing is > not that the task is out of the kernel, rather that it's out of the > disable_supend - enable_suspend region. As such it's not a big change > to the whole suspend system, and so there shouldn't be anything big > going wrong there. Disadvantage is it will add overhead to regular syscalls, at least initialy. That's why I implemented freezer initialy... Of course, suspend is more important than it was back then, and disadvantages of freezer are now well known, so maybe a little overhead in exchange of cleaner design is worth it...? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 59+ messages in thread
[parent not found: <Pine.LNX.4.44L0.0810301301540.2513-100000@iolanthe.rowland.org>]
* Re: Freezer: Don't count threads waiting for frozen filesystems. [not found] <Pine.LNX.4.44L0.0810301301540.2513-100000@iolanthe.rowland.org> @ 2008-10-30 17:43 ` Miklos Szeredi 0 siblings, 0 replies; 59+ messages in thread From: Miklos Szeredi @ 2008-10-30 17:43 UTC (permalink / raw) To: stern; +Cc: ncunningham, linux-pm, linux-kernel, miklos On Thu, 30 Oct 2008, Alan Stern wrote: > Okay. Don't forget things like ioctl for sockets -- they often involve > doing I/O directly to the network interface device. Yeah, ioctls should probably just always be protected (at least initially), regardless of what type of file they are done on. > What happens to a task accessing a non-regular file on a fuse > filesystem? :-) The same as on any other filesystem, i.e. the fs is only involved as far as calling init_special_inode(), the rest is taken care of by the VFS. Tejun Heo recently posted patches to fuse which enable emulating a char device from userspace. That is another matter, obviously we'd want to keep the "allow suspend during I/O" property of fuse in that case, even though there's a char device involved (but no hardware, at least not on that level). Miklos ^ permalink raw reply [flat|nested] 59+ messages in thread
end of thread, other threads:[~2008-11-15 16:58 UTC | newest]
Thread overview: 59+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-10-24 22:07 Freezer: Don't count threads waiting for frozen filesystems Nigel Cunningham
2008-10-26 20:01 ` Rafael J. Wysocki
[not found] ` <200810262101.08822.rjw@sisk.pl>
2008-10-27 11:12 ` Miklos Szeredi
[not found] ` <E1KuQ1J-0002Vw-9Q@pomaz-ex.szeredi.hu>
2008-10-27 11:20 ` Nigel Cunningham
[not found] ` <1225106427.26724.5.camel@nigel-laptop>
2008-10-27 11:37 ` Rafael J. Wysocki
2008-10-27 11:37 ` Miklos Szeredi
[not found] ` <200810271237.40049.rjw@sisk.pl>
2008-10-27 11:40 ` Nigel Cunningham
[not found] ` <1225107607.26724.9.camel@nigel-laptop>
2008-10-27 12:38 ` Miklos Szeredi
[not found] ` <E1KuRN4-0002gS-Hq@pomaz-ex.szeredi.hu>
2008-10-27 20:59 ` Nigel Cunningham
[not found] ` <1225141168.26724.23.camel@nigel-laptop>
2008-10-27 21:09 ` Miklos Szeredi
[not found] ` <E1KuZKt-0003jl-2N@pomaz-ex.szeredi.hu>
2008-10-27 22:13 ` Nigel Cunningham
[not found] ` <1225145607.26724.54.camel@nigel-laptop>
2008-10-28 20:25 ` Miklos Szeredi
[not found] ` <E1Kuv87-00064p-Lk@pomaz-ex.szeredi.hu>
2008-10-28 21:29 ` Nigel Cunningham
[not found] ` <1225229399.9661.13.camel@nigel-laptop>
2008-10-28 21:51 ` Miklos Szeredi
[not found] ` <E1KuwTn-0006EQ-8J@pomaz-ex.szeredi.hu>
2008-10-28 22:00 ` Rafael J. Wysocki
[not found] ` <200810282300.31164.rjw@sisk.pl>
2008-10-28 22:02 ` Miklos Szeredi
[not found] ` <E1Kuwe3-0006Gc-LF@pomaz-ex.szeredi.hu>
2008-10-28 22:21 ` Rafael J. Wysocki
[not found] ` <200810282321.59041.rjw@sisk.pl>
2008-10-28 23:21 ` Miklos Szeredi
[not found] ` <E1Kuxsg-0006PR-Tr@pomaz-ex.szeredi.hu>
2008-10-28 23:59 ` Rafael J. Wysocki
[not found] ` <200810290059.38411.rjw@sisk.pl>
2008-10-29 8:10 ` Miklos Szeredi
2008-10-28 22:03 ` Nigel Cunningham
2008-10-28 23:04 ` Nigel Cunningham
[not found] ` <1225235054.9661.37.camel@nigel-laptop>
2008-10-28 23:12 ` Miklos Szeredi
[not found] ` <E1Kuxk4-0006NT-Ie@pomaz-ex.szeredi.hu>
2008-10-28 23:17 ` Nigel Cunningham
[not found] ` <1225235860.9661.42.camel@nigel-laptop>
2008-10-28 23:24 ` Miklos Szeredi
[not found] ` <E1KuxvH-0006Q0-Dy@pomaz-ex.szeredi.hu>
2008-10-28 23:41 ` Nigel Cunningham
[not found] ` <1225237298.9661.55.camel@nigel-laptop>
2008-10-28 23:45 ` Miklos Szeredi
[not found] ` <E1KuyFq-0006Sc-PC@pomaz-ex.szeredi.hu>
2008-10-28 23:50 ` Miklos Szeredi
2008-10-28 23:54 ` Nigel Cunningham
[not found] ` <E1KuyKw-0006UP-EK@pomaz-ex.szeredi.hu>
2008-10-28 23:58 ` Nigel Cunningham
2008-10-28 17:49 ` Pavel Machek
[not found] <E1Kv688-0007Kp-SN@pomaz-ex.szeredi.hu>
2008-10-29 13:51 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0810290944240.2515-100000@iolanthe.rowland.org>
2008-10-29 14:50 ` Miklos Szeredi
[not found] <E1KvCN9-00081i-KF@pomaz-ex.szeredi.hu>
2008-10-29 15:28 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0810291122220.2189-100000@iolanthe.rowland.org>
2008-10-29 15:50 ` Miklos Szeredi
2008-10-29 16:10 ` Miklos Szeredi
[not found] <E1KvDJp-00088f-2b@pomaz-ex.szeredi.hu>
2008-10-29 16:17 ` Alan Stern
[not found] <E1KvDcp-0008B7-SR@pomaz-ex.szeredi.hu>
2008-10-29 20:37 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0810291627040.3986-100000@iolanthe.rowland.org>
2008-10-29 21:11 ` Rafael J. Wysocki
[not found] ` <200810292211.49021.rjw@sisk.pl>
2008-10-29 21:45 ` Nigel Cunningham
[not found] ` <1225316715.18786.1.camel@nigel-laptop>
2008-10-29 22:07 ` Rafael J. Wysocki
2008-10-29 23:48 ` Miklos Szeredi
[not found] ` <200810292307.26945.rjw@sisk.pl>
2008-10-29 23:53 ` Miklos Szeredi
[not found] ` <E1KvKql-0000dC-3M@pomaz-ex.szeredi.hu>
2008-11-09 13:44 ` Pavel Machek
[not found] ` <E1KvKm8-0000cP-6f@pomaz-ex.szeredi.hu>
2008-10-30 13:04 ` Nigel Cunningham
2008-10-29 23:43 ` Miklos Szeredi
2008-10-29 23:56 ` Matthew Garrett
[not found] <E1KvKgv-0000bf-RS@pomaz-ex.szeredi.hu>
2008-10-30 13:54 ` Alan Stern
[not found] <1225371842.8576.6.camel@nigel-laptop>
2008-10-30 13:56 ` Alan Stern
2008-10-30 21:44 ` Nigel Cunningham
[not found] ` <1225403060.6574.10.camel@nigel-laptop>
2008-10-31 8:49 ` Miklos Szeredi
[not found] ` <E1Kvpgo-00045l-GU@pomaz-ex.szeredi.hu>
[not found] ` <1225444253.6574.21.camel@nigel-laptop>
[not found] ` <E1Kvq72-00049W-O1@pomaz-ex.szeredi.hu>
2008-10-31 11:28 ` Nigel Cunningham
[not found] ` <1225452484.6574.35.camel@nigel-laptop>
2008-10-31 12:44 ` Miklos Szeredi
[not found] ` <E1KvtMn-0004YD-00@pomaz-ex.szeredi.hu>
2008-10-31 21:11 ` Nigel Cunningham
[not found] <Pine.LNX.4.44L0.0810300943001.2401-100000@iolanthe.rowland.org>
2008-10-30 14:39 ` Miklos Szeredi
[not found] ` <E1KvYgR-0002Cx-AZ@pomaz-ex.szeredi.hu>
2008-10-30 17:07 ` Alan Stern
2008-10-30 20:17 ` Rafael J. Wysocki
2008-11-15 16:58 ` Pavel Machek
[not found] <Pine.LNX.4.44L0.0810301301540.2513-100000@iolanthe.rowland.org>
2008-10-30 17:43 ` Miklos Szeredi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox