* [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
@ 2015-01-19 7:42 Qu Wenruo
2015-01-19 14:06 ` David Sterba
` (2 more replies)
0 siblings, 3 replies; 25+ messages in thread
From: Qu Wenruo @ 2015-01-19 7:42 UTC (permalink / raw)
To: linux-btrfs; +Cc: David Sterba
Commit 6b5fe46dfa52 (btrfs: do commit in sync_fs if there are pending
changes) will call btrfs_start_transaction() in sync_fs(), to handle
some operations needed to be done in next transaction.
However this can cause deadlock if the filesystem is frozen, with the
following sys_r+w output:
[ 143.255932] Call Trace:
[ 143.255936] [<ffffffff816c0e09>] schedule+0x29/0x70
[ 143.255939] [<ffffffff811cb7f3>] __sb_start_write+0xb3/0x100
[ 143.255971] [<ffffffffa040ec06>] start_transaction+0x2e6/0x5a0
[btrfs]
[ 143.255992] [<ffffffffa040f1eb>] btrfs_start_transaction+0x1b/0x20
[btrfs]
[ 143.256003] [<ffffffffa03dc0ba>] btrfs_sync_fs+0xca/0xd0 [btrfs]
[ 143.256007] [<ffffffff811f7be0>] sync_fs_one_sb+0x20/0x30
[ 143.256011] [<ffffffff811cbd01>] iterate_supers+0xe1/0xf0
[ 143.256014] [<ffffffff811f7d75>] sys_sync+0x55/0x90
[ 143.256017] [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
[ 143.256111] Call Trace:
[ 143.256114] [<ffffffff816c0e09>] schedule+0x29/0x70
[ 143.256119] [<ffffffff816c3405>] rwsem_down_write_failed+0x1c5/0x2d0
[ 143.256123] [<ffffffff8133f013>] call_rwsem_down_write_failed+0x13/0x20
[ 143.256131] [<ffffffff811caae8>] thaw_super+0x28/0xc0
[ 143.256135] [<ffffffff811db3e5>] do_vfs_ioctl+0x3f5/0x540
[ 143.256187] [<ffffffff811db5c1>] SyS_ioctl+0x91/0xb0
[ 143.256213] [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
The reason is like the following:
(Holding s_umount)
VFS sync_fs staff:
|- btrfs_sync_fs()
|- btrfs_start_transaction()
|- sb_start_intwrite()
(Waiting thaw_fs to unfreeze)
VFS thaw_fs staff:
thaw_fs()
(Waiting sync_fs to release
s_umount)
So deadlock happens.
This can be easily triggered by fstest/generic/068 with inode_cache
mount option.
The fix is to check if the fs is frozen, if the fs is frozen, just
return and waiting for the next transaction.
Cc: David Sterba <dsterba@suse.cz>
Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
fs/btrfs/super.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 60f7cbe..1d9f1e6 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
*/
if (fs_info->pending_changes == 0)
return 0;
+ /*
+ * Test if the fs is frozen, or start_trasaction
+ * will deadlock on itself.
+ */
+ if (__sb_start_write(sb, SB_FREEZE_FS, false))
+ __sb_end_write(sb, SB_FREEZE_FS);
+ else
+ return 0;
trans = btrfs_start_transaction(root, 0);
} else {
return PTR_ERR(trans);
--
2.2.2
^ permalink raw reply related [flat|nested] 25+ messages in thread* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-19 7:42 [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock Qu Wenruo @ 2015-01-19 14:06 ` David Sterba 2015-01-20 2:51 ` Qu Wenruo 2015-01-20 0:19 ` Miao Xie 2015-01-20 17:13 ` David Sterba 2 siblings, 1 reply; 25+ messages in thread From: David Sterba @ 2015-01-19 14:06 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs, David Sterba On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: > The fix is to check if the fs is frozen, if the fs is frozen, just > return and waiting for the next transaction. > > --- a/fs/btrfs/super.c > +++ b/fs/btrfs/super.c > @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) > */ > if (fs_info->pending_changes == 0) > return 0; > + /* > + * Test if the fs is frozen, or start_trasaction > + * will deadlock on itself. > + */ > + if (__sb_start_write(sb, SB_FREEZE_FS, false)) > + __sb_end_write(sb, SB_FREEZE_FS); > + else > + return 0; I'm not sure this is the right fix. We should use either mnt_want_write_file or sb_start_write around the start/commit functions. The fs may be frozen already, but we also have to catch transition to that state, or RO remount. Also, returning 0 is not right, the ioctl actually skipped the expected work. > trans = btrfs_start_transaction(root, 0); > } else { > return PTR_ERR(trans); ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-19 14:06 ` David Sterba @ 2015-01-20 2:51 ` Qu Wenruo 2015-01-20 2:53 ` Qu Wenruo 0 siblings, 1 reply; 25+ messages in thread From: Qu Wenruo @ 2015-01-20 2:51 UTC (permalink / raw) To: dsterba, linux-btrfs, Miao Xie -------- Original Message -------- Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. From: David Sterba <dsterba@suse.cz> To: Qu Wenruo <quwenruo@cn.fujitsu.com> Date: 2015年01月19日 22:06 > On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >> The fix is to check if the fs is frozen, if the fs is frozen, just >> return and waiting for the next transaction. >> >> --- a/fs/btrfs/super.c >> +++ b/fs/btrfs/super.c >> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >> */ >> if (fs_info->pending_changes == 0) >> return 0; >> + /* >> + * Test if the fs is frozen, or start_trasaction >> + * will deadlock on itself. >> + */ >> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >> + __sb_end_write(sb, SB_FREEZE_FS); >> + else >> + return 0; > I'm not sure this is the right fix. We should use either > mnt_want_write_file or sb_start_write around the start/commit functions. > The fs may be frozen already, but we also have to catch transition to > that state, or RO remount. But the deadlock between s_umount and frozen level is a larger problem... Even Miao mentioned that we can start a transaction in btrfs_freeze(), but there is still possibility that we try to change the feature of the frozen btrfs and do sync, again the deadlock will happen. Although handling in btrfs_freeze() is also needed, but can't resolve all the problem. IMHO the fix is still needed, or at least as a workaround until we find a real root solution for it (If nobody want to revert the patchset) BTW, what about put the pending changes to a workqueue? If we don't start transaction under s_umount context like sync_fs() Thanks, Qu > > Also, returning 0 is not right, the ioctl actually skipped the expected > work. > >> trans = btrfs_start_transaction(root, 0); >> } else { >> return PTR_ERR(trans); ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-20 2:51 ` Qu Wenruo @ 2015-01-20 2:53 ` Qu Wenruo 2015-01-20 3:06 ` Miao Xie 0 siblings, 1 reply; 25+ messages in thread From: Qu Wenruo @ 2015-01-20 2:53 UTC (permalink / raw) To: dsterba, linux-btrfs, miaoxie@huawei.com >> Miao Xie Add CC to Miao Xie <miaoxie@huawei.com> -------- Original Message -------- Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. From: Qu Wenruo <quwenruo@cn.fujitsu.com> To: dsterba@suse.cz, linux-btrfs@vger.kernel.org, Miao Xie <miaox@cn.fujitsu.com> Date: 2015年01月20日 10:51 > > -------- Original Message -------- > Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on > frozen fs to avoid deadlock. > From: David Sterba <dsterba@suse.cz> > To: Qu Wenruo <quwenruo@cn.fujitsu.com> > Date: 2015年01月19日 22:06 >> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>> The fix is to check if the fs is frozen, if the fs is frozen, just >>> return and waiting for the next transaction. >>> >>> --- a/fs/btrfs/super.c >>> +++ b/fs/btrfs/super.c >>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int >>> wait) >>> */ >>> if (fs_info->pending_changes == 0) >>> return 0; >>> + /* >>> + * Test if the fs is frozen, or start_trasaction >>> + * will deadlock on itself. >>> + */ >>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>> + __sb_end_write(sb, SB_FREEZE_FS); >>> + else >>> + return 0; >> I'm not sure this is the right fix. We should use either >> mnt_want_write_file or sb_start_write around the start/commit functions. >> The fs may be frozen already, but we also have to catch transition to >> that state, or RO remount. > But the deadlock between s_umount and frozen level is a larger problem... > > Even Miao mentioned that we can start a transaction in btrfs_freeze(), > but there is still possibility that > we try to change the feature of the frozen btrfs and do sync, again > the deadlock will happen. > Although handling in btrfs_freeze() is also needed, but can't resolve > all the problem. > > IMHO the fix is still needed, or at least as a workaround until we > find a real root solution for it > (If nobody want to revert the patchset) > > BTW, what about put the pending changes to a workqueue? If we don't > start transaction under > s_umount context like sync_fs() > > Thanks, > Qu >> >> Also, returning 0 is not right, the ioctl actually skipped the expected >> work. >> >>> trans = btrfs_start_transaction(root, 0); >>> } else { >>> return PTR_ERR(trans); > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-20 2:53 ` Qu Wenruo @ 2015-01-20 3:06 ` Miao Xie 2015-01-20 3:17 ` Qu Wenruo 0 siblings, 1 reply; 25+ messages in thread From: Miao Xie @ 2015-01-20 3:06 UTC (permalink / raw) To: Qu Wenruo, dsterba, linux-btrfs On Tue, 20 Jan 2015 10:53:05 +0800, Qu Wenruo wrote: > Add CC to Miao Xie <miaoxie@huawei.com> > > -------- Original Message -------- > Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to > avoid deadlock. > From: Qu Wenruo <quwenruo@cn.fujitsu.com> > To: dsterba@suse.cz, linux-btrfs@vger.kernel.org, Miao Xie <miaox@cn.fujitsu.com> > Date: 2015年01月20日 10:51 >> >> -------- Original Message -------- >> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs >> to avoid deadlock. >> From: David Sterba <dsterba@suse.cz> >> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >> Date: 2015年01月19日 22:06 >>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>>> The fix is to check if the fs is frozen, if the fs is frozen, just >>>> return and waiting for the next transaction. >>>> >>>> --- a/fs/btrfs/super.c >>>> +++ b/fs/btrfs/super.c >>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >>>> */ >>>> if (fs_info->pending_changes == 0) >>>> return 0; >>>> + /* >>>> + * Test if the fs is frozen, or start_trasaction >>>> + * will deadlock on itself. >>>> + */ >>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>> + else >>>> + return 0; >>> I'm not sure this is the right fix. We should use either >>> mnt_want_write_file or sb_start_write around the start/commit functions. >>> The fs may be frozen already, but we also have to catch transition to >>> that state, or RO remount. >> But the deadlock between s_umount and frozen level is a larger problem... >> >> Even Miao mentioned that we can start a transaction in btrfs_freeze(), but >> there is still possibility that >> we try to change the feature of the frozen btrfs and do sync, again the >> deadlock will happen. >> Although handling in btrfs_freeze() is also needed, but can't resolve all the >> problem. >> >> IMHO the fix is still needed, or at least as a workaround until we find a real >> root solution for it >> (If nobody want to revert the patchset) >> >> BTW, what about put the pending changes to a workqueue? If we don't start >> transaction under >> s_umount context like sync_fs() I don't like this fix. I think we should deal with those pending changes when we freeze a filesystem. or we break the rule of fs freeze. Thanks Miao >> >> Thanks, >> Qu >>> >>> Also, returning 0 is not right, the ioctl actually skipped the expected >>> work. >>> >>>> trans = btrfs_start_transaction(root, 0); >>>> } else { >>>> return PTR_ERR(trans); >> > > . > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-20 3:06 ` Miao Xie @ 2015-01-20 3:17 ` Qu Wenruo 2015-01-20 8:16 ` Miao Xie 0 siblings, 1 reply; 25+ messages in thread From: Qu Wenruo @ 2015-01-20 3:17 UTC (permalink / raw) To: Miao Xie, dsterba, linux-btrfs -------- Original Message -------- Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. From: Miao Xie <miaoxie@huawei.com> To: Qu Wenruo <quwenruo@cn.fujitsu.com>, <dsterba@suse.cz>, <linux-btrfs@vger.kernel.org> Date: 2015年01月20日 11:06 > On Tue, 20 Jan 2015 10:53:05 +0800, Qu Wenruo wrote: >> Add CC to Miao Xie <miaoxie@huawei.com> >> >> -------- Original Message -------- >> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to >> avoid deadlock. >> From: Qu Wenruo <quwenruo@cn.fujitsu.com> >> To: dsterba@suse.cz, linux-btrfs@vger.kernel.org, Miao Xie <miaox@cn.fujitsu.com> >> Date: 2015年01月20日 10:51 >>> -------- Original Message -------- >>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs >>> to avoid deadlock. >>> From: David Sterba <dsterba@suse.cz> >>> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >>> Date: 2015年01月19日 22:06 >>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>>>> The fix is to check if the fs is frozen, if the fs is frozen, just >>>>> return and waiting for the next transaction. >>>>> >>>>> --- a/fs/btrfs/super.c >>>>> +++ b/fs/btrfs/super.c >>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >>>>> */ >>>>> if (fs_info->pending_changes == 0) >>>>> return 0; >>>>> + /* >>>>> + * Test if the fs is frozen, or start_trasaction >>>>> + * will deadlock on itself. >>>>> + */ >>>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>>> + else >>>>> + return 0; >>>> I'm not sure this is the right fix. We should use either >>>> mnt_want_write_file or sb_start_write around the start/commit functions. >>>> The fs may be frozen already, but we also have to catch transition to >>>> that state, or RO remount. >>> But the deadlock between s_umount and frozen level is a larger problem... >>> >>> Even Miao mentioned that we can start a transaction in btrfs_freeze(), but >>> there is still possibility that >>> we try to change the feature of the frozen btrfs and do sync, again the >>> deadlock will happen. >>> Although handling in btrfs_freeze() is also needed, but can't resolve all the >>> problem. >>> >>> IMHO the fix is still needed, or at least as a workaround until we find a real >>> root solution for it >>> (If nobody want to revert the patchset) >>> >>> BTW, what about put the pending changes to a workqueue? If we don't start >>> transaction under >>> s_umount context like sync_fs() > I don't like this fix. > I think we should deal with those pending changes when we freeze a filesystem. > or we break the rule of fs freeze. I am afraid handling it in btrfs_freeze() won't help. Case like freeze() -> change_feature -> sync() -> unfreeze() will still deadlock in sync(). Even cleared the pending changes in freeze(), it can still be set through sysfs interface even the fs is frozen. And in fact, if we put the things like attach/create a transaction into a workqueue, we will not break the freeze rule. Since if the fs is frozen, there is no running transaction and we need to create a new one, that will call sb_start_intwrite(), which will sleep until the fs is unfreeze. Thanks, Qu > > Thanks > Miao > >>> Thanks, >>> Qu >>>> Also, returning 0 is not right, the ioctl actually skipped the expected >>>> work. >>>> >>>>> trans = btrfs_start_transaction(root, 0); >>>>> } else { >>>>> return PTR_ERR(trans); >> . >> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-20 3:17 ` Qu Wenruo @ 2015-01-20 8:16 ` Miao Xie 0 siblings, 0 replies; 25+ messages in thread From: Miao Xie @ 2015-01-20 8:16 UTC (permalink / raw) To: Qu Wenruo, dsterba, linux-btrfs On Tue, 20 Jan 2015 11:17:07 +0800, Qu Wenruo wrote: >>>>>> --- a/fs/btrfs/super.c >>>>>> +++ b/fs/btrfs/super.c >>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >>>>>> */ >>>>>> if (fs_info->pending_changes == 0) >>>>>> return 0; >>>>>> + /* >>>>>> + * Test if the fs is frozen, or start_trasaction >>>>>> + * will deadlock on itself. >>>>>> + */ >>>>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>>>> + else >>>>>> + return 0; >>>>> I'm not sure this is the right fix. We should use either >>>>> mnt_want_write_file or sb_start_write around the start/commit functions. >>>>> The fs may be frozen already, but we also have to catch transition to >>>>> that state, or RO remount. >>>> But the deadlock between s_umount and frozen level is a larger problem... >>>> >>>> Even Miao mentioned that we can start a transaction in btrfs_freeze(), but >>>> there is still possibility that >>>> we try to change the feature of the frozen btrfs and do sync, again the >>>> deadlock will happen. >>>> Although handling in btrfs_freeze() is also needed, but can't resolve all the >>>> problem. >>>> >>>> IMHO the fix is still needed, or at least as a workaround until we find a real >>>> root solution for it >>>> (If nobody want to revert the patchset) >>>> >>>> BTW, what about put the pending changes to a workqueue? If we don't start >>>> transaction under >>>> s_umount context like sync_fs() >> I don't like this fix. >> I think we should deal with those pending changes when we freeze a filesystem. >> or we break the rule of fs freeze. > I am afraid handling it in btrfs_freeze() won't help. > Case like freeze() -> change_feature -> sync() -> unfreeze() will still deadlock > in sync(). We should not change feature after the fs is freezed. > Even cleared the pending changes in freeze(), it can still be set through sysfs > interface even the fs is frozen. > > And in fact, if we put the things like attach/create a transaction into a > workqueue, we will not break > the freeze rule. > Since if the fs is frozen, there is no running transaction and we need to create > a new one, > that will call sb_start_intwrite(), which will sleep until the fs is unfreeze. I read the pending change code just now, and I found the pending change is just used for changing the mount option now, so I think as a work-around fix we needn't start a new transaction to handle the pending flags which are set after the current transaction is committed, because the data on the disk is integrated. Thanks Miao > > Thanks, > Qu >> >> Thanks >> Miao >> >>>> Thanks, >>>> Qu >>>>> Also, returning 0 is not right, the ioctl actually skipped the expected >>>>> work. >>>>> >>>>>> trans = btrfs_start_transaction(root, 0); >>>>>> } else { >>>>>> return PTR_ERR(trans); >>> . >>> > > . > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-19 7:42 [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock Qu Wenruo 2015-01-19 14:06 ` David Sterba @ 2015-01-20 0:19 ` Miao Xie 2015-01-20 0:26 ` Qu Wenruo 2015-01-20 17:13 ` David Sterba 2 siblings, 1 reply; 25+ messages in thread From: Miao Xie @ 2015-01-20 0:19 UTC (permalink / raw) To: Qu Wenruo, linux-btrfs; +Cc: David Sterba On Mon, 19 Jan 2015 15:42:41 +0800, Qu Wenruo wrote: > Commit 6b5fe46dfa52 (btrfs: do commit in sync_fs if there are pending > changes) will call btrfs_start_transaction() in sync_fs(), to handle > some operations needed to be done in next transaction. > > However this can cause deadlock if the filesystem is frozen, with the > following sys_r+w output: > [ 143.255932] Call Trace: > [ 143.255936] [<ffffffff816c0e09>] schedule+0x29/0x70 > [ 143.255939] [<ffffffff811cb7f3>] __sb_start_write+0xb3/0x100 > [ 143.255971] [<ffffffffa040ec06>] start_transaction+0x2e6/0x5a0 > [btrfs] > [ 143.255992] [<ffffffffa040f1eb>] btrfs_start_transaction+0x1b/0x20 > [btrfs] > [ 143.256003] [<ffffffffa03dc0ba>] btrfs_sync_fs+0xca/0xd0 [btrfs] > [ 143.256007] [<ffffffff811f7be0>] sync_fs_one_sb+0x20/0x30 > [ 143.256011] [<ffffffff811cbd01>] iterate_supers+0xe1/0xf0 > [ 143.256014] [<ffffffff811f7d75>] sys_sync+0x55/0x90 > [ 143.256017] [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17 > [ 143.256111] Call Trace: > [ 143.256114] [<ffffffff816c0e09>] schedule+0x29/0x70 > [ 143.256119] [<ffffffff816c3405>] rwsem_down_write_failed+0x1c5/0x2d0 > [ 143.256123] [<ffffffff8133f013>] call_rwsem_down_write_failed+0x13/0x20 > [ 143.256131] [<ffffffff811caae8>] thaw_super+0x28/0xc0 > [ 143.256135] [<ffffffff811db3e5>] do_vfs_ioctl+0x3f5/0x540 > [ 143.256187] [<ffffffff811db5c1>] SyS_ioctl+0x91/0xb0 > [ 143.256213] [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17 > > The reason is like the following: > (Holding s_umount) > VFS sync_fs staff: > |- btrfs_sync_fs() > |- btrfs_start_transaction() > |- sb_start_intwrite() > (Waiting thaw_fs to unfreeze) > VFS thaw_fs staff: > thaw_fs() > (Waiting sync_fs to release > s_umount) > > So deadlock happens. > This can be easily triggered by fstest/generic/068 with inode_cache > mount option. > > The fix is to check if the fs is frozen, if the fs is frozen, just > return and waiting for the next transaction. > > Cc: David Sterba <dsterba@suse.cz> > Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com> > Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> > --- > fs/btrfs/super.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c > index 60f7cbe..1d9f1e6 100644 > --- a/fs/btrfs/super.c > +++ b/fs/btrfs/super.c > @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) > */ > if (fs_info->pending_changes == 0) > return 0; I think the problem is here -- why ->pending_changes is not 0 when the filesystem is frozen? so I think the reason of this problem is btrfs_freeze forget to deal with the pending changes, and the correct fix is to correct the behavior of btrfs_freeze(). Thanks Miao > + /* > + * Test if the fs is frozen, or start_trasaction > + * will deadlock on itself. > + */ > + if (__sb_start_write(sb, SB_FREEZE_FS, false)) > + __sb_end_write(sb, SB_FREEZE_FS); > + else > + return 0; > trans = btrfs_start_transaction(root, 0); > } else { > return PTR_ERR(trans); > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-20 0:19 ` Miao Xie @ 2015-01-20 0:26 ` Qu Wenruo 0 siblings, 0 replies; 25+ messages in thread From: Qu Wenruo @ 2015-01-20 0:26 UTC (permalink / raw) To: Miao Xie, linux-btrfs; +Cc: David Sterba -------- Original Message -------- Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. From: Miao Xie <miaoxie@huawei.com> To: Qu Wenruo <quwenruo@cn.fujitsu.com>, <linux-btrfs@vger.kernel.org> Date: 2015年01月20日 08:19 > On Mon, 19 Jan 2015 15:42:41 +0800, Qu Wenruo wrote: >> Commit 6b5fe46dfa52 (btrfs: do commit in sync_fs if there are pending >> changes) will call btrfs_start_transaction() in sync_fs(), to handle >> some operations needed to be done in next transaction. >> >> However this can cause deadlock if the filesystem is frozen, with the >> following sys_r+w output: >> [ 143.255932] Call Trace: >> [ 143.255936] [<ffffffff816c0e09>] schedule+0x29/0x70 >> [ 143.255939] [<ffffffff811cb7f3>] __sb_start_write+0xb3/0x100 >> [ 143.255971] [<ffffffffa040ec06>] start_transaction+0x2e6/0x5a0 >> [btrfs] >> [ 143.255992] [<ffffffffa040f1eb>] btrfs_start_transaction+0x1b/0x20 >> [btrfs] >> [ 143.256003] [<ffffffffa03dc0ba>] btrfs_sync_fs+0xca/0xd0 [btrfs] >> [ 143.256007] [<ffffffff811f7be0>] sync_fs_one_sb+0x20/0x30 >> [ 143.256011] [<ffffffff811cbd01>] iterate_supers+0xe1/0xf0 >> [ 143.256014] [<ffffffff811f7d75>] sys_sync+0x55/0x90 >> [ 143.256017] [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17 >> [ 143.256111] Call Trace: >> [ 143.256114] [<ffffffff816c0e09>] schedule+0x29/0x70 >> [ 143.256119] [<ffffffff816c3405>] rwsem_down_write_failed+0x1c5/0x2d0 >> [ 143.256123] [<ffffffff8133f013>] call_rwsem_down_write_failed+0x13/0x20 >> [ 143.256131] [<ffffffff811caae8>] thaw_super+0x28/0xc0 >> [ 143.256135] [<ffffffff811db3e5>] do_vfs_ioctl+0x3f5/0x540 >> [ 143.256187] [<ffffffff811db5c1>] SyS_ioctl+0x91/0xb0 >> [ 143.256213] [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17 >> >> The reason is like the following: >> (Holding s_umount) >> VFS sync_fs staff: >> |- btrfs_sync_fs() >> |- btrfs_start_transaction() >> |- sb_start_intwrite() >> (Waiting thaw_fs to unfreeze) >> VFS thaw_fs staff: >> thaw_fs() >> (Waiting sync_fs to release >> s_umount) >> >> So deadlock happens. >> This can be easily triggered by fstest/generic/068 with inode_cache >> mount option. >> >> The fix is to check if the fs is frozen, if the fs is frozen, just >> return and waiting for the next transaction. >> >> Cc: David Sterba <dsterba@suse.cz> >> Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com> >> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> >> --- >> fs/btrfs/super.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c >> index 60f7cbe..1d9f1e6 100644 >> --- a/fs/btrfs/super.c >> +++ b/fs/btrfs/super.c >> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >> */ >> if (fs_info->pending_changes == 0) >> return 0; > > I think the problem is here -- why ->pending_changes is not 0 when the > filesystem is frozen? This happens when already no transaction is running but some one set inode_cache or things needs pending. And the freeze follows. > so I think the reason of this problem is btrfs_freeze > forget to deal with the pending changes, and the correct fix is to correct > the behavior of btrfs_freeze(). Great! Thanks for pointing this! Starting a transaction in btrfs_freeze() seems to be the silver bullet for such case. Thanks Qu > > Thanks > Miao > >> + /* >> + * Test if the fs is frozen, or start_trasaction >> + * will deadlock on itself. >> + */ >> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >> + __sb_end_write(sb, SB_FREEZE_FS); >> + else >> + return 0; >> trans = btrfs_start_transaction(root, 0); >> } else { >> return PTR_ERR(trans); >> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-19 7:42 [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock Qu Wenruo 2015-01-19 14:06 ` David Sterba 2015-01-20 0:19 ` Miao Xie @ 2015-01-20 17:13 ` David Sterba 2015-01-21 0:58 ` Qu Wenruo 2 siblings, 1 reply; 25+ messages in thread From: David Sterba @ 2015-01-20 17:13 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs, David Sterba, miaoxie, clm On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: > --- a/fs/btrfs/super.c > +++ b/fs/btrfs/super.c > @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) > */ > if (fs_info->pending_changes == 0) > return 0; > + /* > + * Test if the fs is frozen, or start_trasaction > + * will deadlock on itself. > + */ > + if (__sb_start_write(sb, SB_FREEZE_FS, false)) > + __sb_end_write(sb, SB_FREEZE_FS); > + else > + return 0; The more I look into that the more I think that the first fix is the right one. Has been pointed out in this thread, it is ok to skip processing the pending changes if the filesystem is frozen. The pending changes have to flushed from sync (by design), we cannot use mnt_want_write or the sb_start* protections that. The btrfs_freeze callback can safely do the last commit, that's under s_umount held by vfs::freeze_super. Then any other new transaction would block. Any other call to btrfs_sync_fs will not find any active transaction and with this patch will not start one. Sounds safe to me. I think the right level to check is SB_FREEZE_WRITE though, to stop any potential writes as soon as possible and when the s_umount lock is still held in vfs::freeze_super. I'll collect the relevant patches and will send it for review. > trans = btrfs_start_transaction(root, 0); > } else { > return PTR_ERR(trans); ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-20 17:13 ` David Sterba @ 2015-01-21 0:58 ` Qu Wenruo 2015-01-21 1:05 ` Chris Mason 0 siblings, 1 reply; 25+ messages in thread From: Qu Wenruo @ 2015-01-21 0:58 UTC (permalink / raw) To: dsterba, linux-btrfs, miaoxie, clm -------- Original Message -------- Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. From: David Sterba <dsterba@suse.cz> To: Qu Wenruo <quwenruo@cn.fujitsu.com> Date: 2015年01月21日 01:13 > On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >> --- a/fs/btrfs/super.c >> +++ b/fs/btrfs/super.c >> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >> */ >> if (fs_info->pending_changes == 0) >> return 0; >> + /* >> + * Test if the fs is frozen, or start_trasaction >> + * will deadlock on itself. >> + */ >> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >> + __sb_end_write(sb, SB_FREEZE_FS); >> + else >> + return 0; > The more I look into that the more I think that the first fix is the > right one. > > Has been pointed out in this thread, it is ok to skip processing the > pending changes if the filesystem is frozen. That's good, for me, either this patch or the patch 2~5 in the patchset will solve the sync_fs() problem on frozen fs. Just different timing to start the new transaction. But the patchset one has the problem, which needs to deal with the sysfs interface changes, or sync_fs() will still cause deadlock. So I tried to revert the sysfs related patches, but it seems overkilled, needing extra btrfs_start_transaction* things. As you already picked this one, I'm completely OK with this. > > The pending changes have to flushed from sync (by design), we cannot use > mnt_want_write or the sb_start* protections that. > > The btrfs_freeze callback can safely do the last commit, that's under > s_umount held by vfs::freeze_super. Then any other new transaction would > block. Any other call to btrfs_sync_fs will not find any active > transaction and with this patch will not start one. Sounds safe to me. > > I think the right level to check is SB_FREEZE_WRITE though, to stop any > potential writes as soon as possible and when the s_umount lock is still > held in vfs::freeze_super. SB_FREEZE_WRITE seems good for me. But I didn't catch the difference between SB_FREEZE_FS(WRITE/PAGEFAULT/COMPLETE), since freeze() conflicts with sync_fs(), when we comes to btrfs_sync_fs(), the fs is either totally frozen or unfrozen and frozen level won't change during the protection of s_umount. Although SB_FREEZE_WRITE seems better in its meaning and makes it more readable. > > I'll collect the relevant patches and will send it for review. Thanks for collecting them and sending them out. Thanks, Qu > > >> trans = btrfs_start_transaction(root, 0); >> } else { >> return PTR_ERR(trans); ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 0:58 ` Qu Wenruo @ 2015-01-21 1:05 ` Chris Mason 2015-01-21 1:09 ` Qu Wenruo 0 siblings, 1 reply; 25+ messages in thread From: Chris Mason @ 2015-01-21 1:05 UTC (permalink / raw) To: Qu Wenruo; +Cc: dsterba, linux-btrfs, miaoxie On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: > > -------- Original Message -------- > Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on > frozen fs to avoid deadlock. > From: David Sterba <dsterba@suse.cz> > To: Qu Wenruo <quwenruo@cn.fujitsu.com> > Date: 2015年01月21日 01:13 >> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>> --- a/fs/btrfs/super.c >>> +++ b/fs/btrfs/super.c >>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, >>> int wait) >>> */ >>> if (fs_info->pending_changes == 0) >>> return 0; >>> + /* >>> + * Test if the fs is frozen, or start_trasaction >>> + * will deadlock on itself. >>> + */ >>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>> + __sb_end_write(sb, SB_FREEZE_FS); >>> + else >>> + return 0; But what if someone freezes the FS after __sb_end_write() and before btrfs_start_transaction()? I don't see what keeps new freezers from coming in. -chris ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 1:05 ` Chris Mason @ 2015-01-21 1:09 ` Qu Wenruo 2015-01-21 1:10 ` Chris Mason 0 siblings, 1 reply; 25+ messages in thread From: Qu Wenruo @ 2015-01-21 1:09 UTC (permalink / raw) To: Chris Mason; +Cc: dsterba, linux-btrfs, miaoxie -------- Original Message -------- Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. From: Chris Mason <clm@fb.com> To: Qu Wenruo <quwenruo@cn.fujitsu.com> Date: 2015年01月21日 09:05 > > > On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> > wrote: >> >> -------- Original Message -------- >> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on >> frozen fs to avoid deadlock. >> From: David Sterba <dsterba@suse.cz> >> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >> Date: 2015年01月21日 01:13 >>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>>> --- a/fs/btrfs/super.c >>>> +++ b/fs/btrfs/super.c >>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, >>>> int wait) >>>> */ >>>> if (fs_info->pending_changes == 0) >>>> return 0; >>>> + /* >>>> + * Test if the fs is frozen, or start_trasaction >>>> + * will deadlock on itself. >>>> + */ >>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>> + else >>>> + return 0; > > But what if someone freezes the FS after __sb_end_write() and before > btrfs_start_transaction()? I don't see what keeps new freezers from > coming in. > > -chris Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so freeze will not happen during sync. Thanks, Qu > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 1:09 ` Qu Wenruo @ 2015-01-21 1:10 ` Chris Mason 2015-01-21 3:10 ` Miao Xie 0 siblings, 1 reply; 25+ messages in thread From: Chris Mason @ 2015-01-21 1:10 UTC (permalink / raw) To: Qu Wenruo; +Cc: dsterba, linux-btrfs, miaoxie On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: > > -------- Original Message -------- > Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on > frozen fs to avoid deadlock. > From: Chris Mason <clm@fb.com> > To: Qu Wenruo <quwenruo@cn.fujitsu.com> > Date: 2015年01月21日 09:05 >> >> >> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> >> wrote: >>> >>> -------- Original Message -------- >>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on >>> frozen fs to avoid deadlock. >>> From: David Sterba <dsterba@suse.cz> >>> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >>> Date: 2015年01月21日 01:13 >>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>>>> --- a/fs/btrfs/super.c >>>>> +++ b/fs/btrfs/super.c >>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, >>>>> int wait) >>>>> */ >>>>> if (fs_info->pending_changes == 0) >>>>> return 0; >>>>> + /* >>>>> + * Test if the fs is frozen, or start_trasaction >>>>> + * will deadlock on itself. >>>>> + */ >>>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>>> + else >>>>> + return 0; >> >> But what if someone freezes the FS after __sb_end_write() and before >> btrfs_start_transaction()? I don't see what keeps new freezers >> from coming in. >> >> -chris > Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount > mutex, so freeze will not happen > during sync. You're right. I was worried about the sync ioctl, but the mutex won't be held there to deadlock against. We'll be fine. -chris ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 1:10 ` Chris Mason @ 2015-01-21 3:10 ` Miao Xie 2015-01-21 3:15 ` Qu Wenruo 0 siblings, 1 reply; 25+ messages in thread From: Miao Xie @ 2015-01-21 3:10 UTC (permalink / raw) To: Chris Mason, Qu Wenruo; +Cc: dsterba, linux-btrfs On Tue, 20 Jan 2015 20:10:56 -0500, Chris Mason wrote: > On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >> >> -------- Original Message -------- >> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs >> to avoid deadlock. >> From: Chris Mason <clm@fb.com> >> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >> Date: 2015年01月21日 09:05 >>> >>> >>> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>>> >>>> -------- Original Message -------- >>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen >>>> fs to avoid deadlock. >>>> From: David Sterba <dsterba@suse.cz> >>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >>>> Date: 2015年01月21日 01:13 >>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>>>>> --- a/fs/btrfs/super.c >>>>>> +++ b/fs/btrfs/super.c >>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >>>>>> */ >>>>>> if (fs_info->pending_changes == 0) >>>>>> return 0; >>>>>> + /* >>>>>> + * Test if the fs is frozen, or start_trasaction >>>>>> + * will deadlock on itself. >>>>>> + */ >>>>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>>>> + else >>>>>> + return 0; >>> >>> But what if someone freezes the FS after __sb_end_write() and before >>> btrfs_start_transaction()? I don't see what keeps new freezers from coming in. >>> >>> -chris >> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so >> freeze will not happen >> during sync. > > You're right. I was worried about the sync ioctl, but the mutex won't be held > there to deadlock against. We'll be fine. There is another problem which is introduced by pending change. That is we will start and commmit a transaction by changing pending mount option after we set the fs to be R/O. I think it is better that we don't start a new transaction for pending changes which are set after the transaction is committed, just make them be handled by the next transaction, the reason is: - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs) - Data on the disk is right and integrated Thanks Miao ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 3:10 ` Miao Xie @ 2015-01-21 3:15 ` Qu Wenruo 2015-01-21 3:26 ` Miao Xie 0 siblings, 1 reply; 25+ messages in thread From: Qu Wenruo @ 2015-01-21 3:15 UTC (permalink / raw) To: Miao Xie, Chris Mason; +Cc: dsterba, linux-btrfs -------- Original Message -------- Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. From: Miao Xie <miaoxie@huawei.com> To: Chris Mason <clm@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com> Date: 2015年01月21日 11:10 > On Tue, 20 Jan 2015 20:10:56 -0500, Chris Mason wrote: >> On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>> -------- Original Message -------- >>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs >>> to avoid deadlock. >>> From: Chris Mason <clm@fb.com> >>> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >>> Date: 2015年01月21日 09:05 >>>> >>>> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>>>> -------- Original Message -------- >>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen >>>>> fs to avoid deadlock. >>>>> From: David Sterba <dsterba@suse.cz> >>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >>>>> Date: 2015年01月21日 01:13 >>>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>>>>>> --- a/fs/btrfs/super.c >>>>>>> +++ b/fs/btrfs/super.c >>>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >>>>>>> */ >>>>>>> if (fs_info->pending_changes == 0) >>>>>>> return 0; >>>>>>> + /* >>>>>>> + * Test if the fs is frozen, or start_trasaction >>>>>>> + * will deadlock on itself. >>>>>>> + */ >>>>>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>>>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>>>>> + else >>>>>>> + return 0; >>>> But what if someone freezes the FS after __sb_end_write() and before >>>> btrfs_start_transaction()? I don't see what keeps new freezers from coming in. >>>> >>>> -chris >>> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so >>> freeze will not happen >>> during sync. >> You're right. I was worried about the sync ioctl, but the mutex won't be held >> there to deadlock against. We'll be fine. > There is another problem which is introduced by pending change. That is we will > start and commmit a transaction by changing pending mount option after we set > the fs to be R/O. Oh, I missed this problem. > > I think it is better that we don't start a new transaction for pending changes > which are set after the transaction is committed, just make them be handled by > the next transaction, This will cause another problem, nobody can ensure there will be next transaction and the change may never to written into disk. For example, if we change the features/label through sysfs, and then umount the fs, since there is no write, there is no running transaction and if we don't start a new transaction, it won't be flushed to disk. Thanks, Qu > the reason is: > - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs) > - Data on the disk is right and integrated > > > Thanks > Miao ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 3:15 ` Qu Wenruo @ 2015-01-21 3:26 ` Miao Xie 2015-01-21 3:53 ` Qu Wenruo 0 siblings, 1 reply; 25+ messages in thread From: Miao Xie @ 2015-01-21 3:26 UTC (permalink / raw) To: Qu Wenruo, Chris Mason; +Cc: dsterba, linux-btrfs On Wed, 21 Jan 2015 11:15:41 +0800, Qu Wenruo wrote: > > -------- Original Message -------- > Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to > avoid deadlock. > From: Miao Xie <miaoxie@huawei.com> > To: Chris Mason <clm@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com> > Date: 2015年01月21日 11:10 >> On Tue, 20 Jan 2015 20:10:56 -0500, Chris Mason wrote: >>> On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>>> -------- Original Message -------- >>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs >>>> to avoid deadlock. >>>> From: Chris Mason <clm@fb.com> >>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >>>> Date: 2015年01月21日 09:05 >>>>> >>>>> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>>>>> -------- Original Message -------- >>>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen >>>>>> fs to avoid deadlock. >>>>>> From: David Sterba <dsterba@suse.cz> >>>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >>>>>> Date: 2015年01月21日 01:13 >>>>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>>>>>>> --- a/fs/btrfs/super.c >>>>>>>> +++ b/fs/btrfs/super.c >>>>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >>>>>>>> */ >>>>>>>> if (fs_info->pending_changes == 0) >>>>>>>> return 0; >>>>>>>> + /* >>>>>>>> + * Test if the fs is frozen, or start_trasaction >>>>>>>> + * will deadlock on itself. >>>>>>>> + */ >>>>>>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>>>>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>>>>>> + else >>>>>>>> + return 0; >>>>> But what if someone freezes the FS after __sb_end_write() and before >>>>> btrfs_start_transaction()? I don't see what keeps new freezers from >>>>> coming in. >>>>> >>>>> -chris >>>> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so >>>> freeze will not happen >>>> during sync. >>> You're right. I was worried about the sync ioctl, but the mutex won't be held >>> there to deadlock against. We'll be fine. >> There is another problem which is introduced by pending change. That is we will >> start and commmit a transaction by changing pending mount option after we set >> the fs to be R/O. > Oh, I missed this problem. >> >> I think it is better that we don't start a new transaction for pending changes >> which are set after the transaction is committed, just make them be handled by >> the next transaction, > This will cause another problem, nobody can ensure there will be next > transaction and the change may > never to written into disk. First, the pending changes is mount option, that is in-memory data. Second, the same problem would happen after you freeze fs. > > For example, if we change the features/label through sysfs, and then umount the fs, It is different from pending change. If you want to change features/label, you should get write permission and make sure the fs is not be freezed because those are on-disk data. So the problem doesn't exist, or there is a bug. Thanks Miao > since there is no write, there is no running transaction and if we don't start a > new transaction, > it won't be flushed to disk. > > Thanks, > Qu >> the reason is: >> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs) >> - Data on the disk is right and integrated >> >> >> Thanks >> Miao > > . > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 3:26 ` Miao Xie @ 2015-01-21 3:53 ` Qu Wenruo 2015-01-21 7:04 ` Miao Xie 0 siblings, 1 reply; 25+ messages in thread From: Qu Wenruo @ 2015-01-21 3:53 UTC (permalink / raw) To: Miao Xie, Chris Mason; +Cc: dsterba, linux-btrfs -------- Original Message -------- Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. From: Miao Xie <miaoxie@huawei.com> To: Qu Wenruo <quwenruo@cn.fujitsu.com>, Chris Mason <clm@fb.com> Date: 2015年01月21日 11:26 > On Wed, 21 Jan 2015 11:15:41 +0800, Qu Wenruo wrote: >> -------- Original Message -------- >> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to >> avoid deadlock. >> From: Miao Xie <miaoxie@huawei.com> >> To: Chris Mason <clm@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com> >> Date: 2015年01月21日 11:10 >>> On Tue, 20 Jan 2015 20:10:56 -0500, Chris Mason wrote: >>>> On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>>>> -------- Original Message -------- >>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs >>>>> to avoid deadlock. >>>>> From: Chris Mason <clm@fb.com> >>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >>>>> Date: 2015年01月21日 09:05 >>>>>> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>>>>>> -------- Original Message -------- >>>>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen >>>>>>> fs to avoid deadlock. >>>>>>> From: David Sterba <dsterba@suse.cz> >>>>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com> >>>>>>> Date: 2015年01月21日 01:13 >>>>>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote: >>>>>>>>> --- a/fs/btrfs/super.c >>>>>>>>> +++ b/fs/btrfs/super.c >>>>>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait) >>>>>>>>> */ >>>>>>>>> if (fs_info->pending_changes == 0) >>>>>>>>> return 0; >>>>>>>>> + /* >>>>>>>>> + * Test if the fs is frozen, or start_trasaction >>>>>>>>> + * will deadlock on itself. >>>>>>>>> + */ >>>>>>>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>>>>>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>>>>>>> + else >>>>>>>>> + return 0; >>>>>> But what if someone freezes the FS after __sb_end_write() and before >>>>>> btrfs_start_transaction()? I don't see what keeps new freezers from >>>>>> coming in. >>>>>> >>>>>> -chris >>>>> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so >>>>> freeze will not happen >>>>> during sync. >>>> You're right. I was worried about the sync ioctl, but the mutex won't be held >>>> there to deadlock against. We'll be fine. >>> There is another problem which is introduced by pending change. That is we will >>> start and commmit a transaction by changing pending mount option after we set >>> the fs to be R/O. >> Oh, I missed this problem. >>> I think it is better that we don't start a new transaction for pending changes >>> which are set after the transaction is committed, just make them be handled by >>> the next transaction, >> This will cause another problem, nobody can ensure there will be next >> transaction and the change may >> never to written into disk. > First, the pending changes is mount option, that is in-memory data. > Second, the same problem would happen after you freeze fs. Pending changes are *not* only mount options. Feature change and label change are also pending changes if using sysfs. Normal ioctl label changing is not affected. For freeze, it's not the same problem since the fs will be unfreeze sooner or later and transaction will be initiated. > >> For example, if we change the features/label through sysfs, and then umount the fs, > It is different from pending change. No, now features/label changing using sysfs both use pending changes to do the commit. See BTRFS_PENDING_COMMIT bit. So freeze -> change features/label -> sync will still cause the deadlock in the same way, and you can try it yourself. Thanks, Qu > If you want to change features/label, you should get write permission and make > sure the fs is not be freezed because those are on-disk data. So the problem > doesn't exist, or there is a bug. > > Thanks > Miao > >> since there is no write, there is no running transaction and if we don't start a >> new transaction, >> it won't be flushed to disk. >> >> Thanks, >> Qu >>> the reason is: >>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs) >>> - Data on the disk is right and integrated >>> >>> >>> Thanks >>> Miao >> . >> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 3:53 ` Qu Wenruo @ 2015-01-21 7:04 ` Miao Xie 2015-01-21 7:47 ` Qu Wenruo 2015-01-23 16:59 ` David Sterba 0 siblings, 2 replies; 25+ messages in thread From: Miao Xie @ 2015-01-21 7:04 UTC (permalink / raw) To: Qu Wenruo, Chris Mason; +Cc: dsterba, linux-btrfs On Wed, 21 Jan 2015 11:53:34 +0800, Qu Wenruo wrote: >>>>>>>>>> + /* >>>>>>>>>> + * Test if the fs is frozen, or start_trasaction >>>>>>>>>> + * will deadlock on itself. >>>>>>>>>> + */ >>>>>>>>>> + if (__sb_start_write(sb, SB_FREEZE_FS, false)) >>>>>>>>>> + __sb_end_write(sb, SB_FREEZE_FS); >>>>>>>>>> + else >>>>>>>>>> + return 0; >>>>>>> But what if someone freezes the FS after __sb_end_write() and before >>>>>>> btrfs_start_transaction()? I don't see what keeps new freezers from >>>>>>> coming in. >>>>>>> >>>>>>> -chris >>>>>> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so >>>>>> freeze will not happen >>>>>> during sync. >>>>> You're right. I was worried about the sync ioctl, but the mutex won't be held >>>>> there to deadlock against. We'll be fine. >>>> There is another problem which is introduced by pending change. That is we will >>>> start and commmit a transaction by changing pending mount option after we set >>>> the fs to be R/O. >>> Oh, I missed this problem. >>>> I think it is better that we don't start a new transaction for pending changes >>>> which are set after the transaction is committed, just make them be handled by >>>> the next transaction, >>> This will cause another problem, nobody can ensure there will be next >>> transaction and the change may >>> never to written into disk. >> First, the pending changes is mount option, that is in-memory data. >> Second, the same problem would happen after you freeze fs. > Pending changes are *not* only mount options. Feature change and label change > are also pending changes if using sysfs. My miss, I don't notice feature and label change by sysfs. But the implementation of feature and label change by sysfs is wrong, we can not change them without write permission. > Normal ioctl label changing is not affected. > > For freeze, it's not the same problem since the fs will be unfreeze sooner or > later and transaction will be initiated. You can not assume the operations of the users, they might freeze the fs and then shutdown the machine. >> >>> For example, if we change the features/label through sysfs, and then umount >>> the fs, >> It is different from pending change. > No, now features/label changing using sysfs both use pending changes to do the > commit. > See BTRFS_PENDING_COMMIT bit. > So freeze -> change features/label -> sync will still cause the deadlock in the > same way, > and you can try it yourself. As I said above, the implementation of sysfs feature and label change is wrong, it is better to separate them from the pending mount option change, make the sysfs feature and label change be done in the context of transaction after getting the write permission. If so, we needn't do anything special when sync the fs. In short, changing the sysfs feature and label change implementation and removing the unnecessary btrfs_start_transaction in sync_fs can fix the deadlock. Thanks Miao > > Thanks, > Qu > >> If you want to change features/label, you should get write permission and make >> sure the fs is not be freezed because those are on-disk data. So the problem >> doesn't exist, or there is a bug. >> >> Thanks >> Miao >> >>> since there is no write, there is no running transaction and if we don't start a >>> new transaction, >>> it won't be flushed to disk. >>> >>> Thanks, >>> Qu >>>> the reason is: >>>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs) >>>> - Data on the disk is right and integrated >>>> >>>> >>>> Thanks >>>> Miao >>> . >>> > > . > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 7:04 ` Miao Xie @ 2015-01-21 7:47 ` Qu Wenruo 2015-01-21 8:46 ` Miao Xie 2015-01-23 17:39 ` David Sterba 2015-01-23 16:59 ` David Sterba 1 sibling, 2 replies; 25+ messages in thread From: Qu Wenruo @ 2015-01-21 7:47 UTC (permalink / raw) To: Miao Xie, Chris Mason, dsterba; +Cc: linux-btrfs -------- Original Message -------- Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. From: Miao Xie <miaoxie@huawei.com> To: Qu Wenruo <quwenruo@cn.fujitsu.com>, Chris Mason <clm@fb.com> Date: 2015年01月21日 15:04 > On Wed, 21 Jan 2015 11:53:34 +0800, Qu Wenruo wrote: >>>> [snipped] >>>> This will cause another problem, nobody can ensure there will be next >>>> transaction and the change may >>>> never to written into disk. >>> First, the pending changes is mount option, that is in-memory data. >>> Second, the same problem would happen after you freeze fs. >> Pending changes are *not* only mount options. Feature change and label change >> are also pending changes if using sysfs. > My miss, I don't notice feature and label change by sysfs. > > But the implementation of feature and label change by sysfs is wrong, we can > not change them without write permission. > >> Normal ioctl label changing is not affected. >> >> For freeze, it's not the same problem since the fs will be unfreeze sooner or >> later and transaction will be initiated. > You can not assume the operations of the users, they might freeze the fs and > then shutdown the machine. > >>>> For example, if we change the features/label through sysfs, and then umount >>>> the fs, >>> It is different from pending change. >> No, now features/label changing using sysfs both use pending changes to do the >> commit. >> See BTRFS_PENDING_COMMIT bit. >> So freeze -> change features/label -> sync will still cause the deadlock in the >> same way, >> and you can try it yourself. > As I said above, the implementation of sysfs feature and label change is wrong, > it is better to separate them from the pending mount option change, make the > sysfs feature and label change be done in the context of transaction after > getting the write permission. If so, we needn't do anything special when sync > the fs. > > In short, changing the sysfs feature and label change implementation and > removing the unnecessary btrfs_start_transaction in sync_fs can fix the > deadlock. Your method will only fix the deadlock, but will introduce the risk like pending inode_cache will never be written to disk as I already explained. (If still using the fs_info->pending_changes mechanism) To ensure pending changes written to disk sync_fs() should start a transaction if needed, or there will be chance that no transaction can handle it. But I don't see the necessity to pending current work(inode_cache, feature/label changes) to next transaction. To David: I'm a little curious about why inode_cache needs to be delayed to next transaction. In btrfs_remount() we have s_umount mutex, and we synced the whole filesystem already, so there should be no running transaction and we can just set any mount option into fs_info. Or even in worst case, there is a racing window, we can still start a transaction and do the commit, a little overhead in such minor case won't impact the overall performance. For sysfs change, I prefer attach or start transaction method, and for mount option change, and such sysfs tuning is also minor case for a filesystem. What do you think about reverting the whole patchset and rework the sysfs interface? Thanks, Qu > > Thanks > Miao > >> Thanks, >> Qu >> >>> If you want to change features/label, you should get write permission and make >>> sure the fs is not be freezed because those are on-disk data. So the problem >>> doesn't exist, or there is a bug. >>> >>> Thanks >>> Miao >>> >>>> since there is no write, there is no running transaction and if we don't start a >>>> new transaction, >>>> it won't be flushed to disk. >>>> >>>> Thanks, >>>> Qu >>>>> the reason is: >>>>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs) >>>>> - Data on the disk is right and integrated >>>>> >>>>> >>>>> Thanks >>>>> Miao >>>> . >>>> >> . >> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 7:47 ` Qu Wenruo @ 2015-01-21 8:46 ` Miao Xie 2015-01-23 17:39 ` David Sterba 1 sibling, 0 replies; 25+ messages in thread From: Miao Xie @ 2015-01-21 8:46 UTC (permalink / raw) To: Qu Wenruo, Chris Mason, dsterba; +Cc: linux-btrfs On Wed, 21 Jan 2015 15:47:54 +0800, Qu Wenruo wrote: >> On Wed, 21 Jan 2015 11:53:34 +0800, Qu Wenruo wrote: >>>>> [snipped] >>>>> This will cause another problem, nobody can ensure there will be next >>>>> transaction and the change may >>>>> never to written into disk. >>>> First, the pending changes is mount option, that is in-memory data. >>>> Second, the same problem would happen after you freeze fs. >>> Pending changes are *not* only mount options. Feature change and label change >>> are also pending changes if using sysfs. >> My miss, I don't notice feature and label change by sysfs. >> >> But the implementation of feature and label change by sysfs is wrong, we can >> not change them without write permission. >> >>> Normal ioctl label changing is not affected. >>> >>> For freeze, it's not the same problem since the fs will be unfreeze sooner or >>> later and transaction will be initiated. >> You can not assume the operations of the users, they might freeze the fs and >> then shutdown the machine. >> >>>>> For example, if we change the features/label through sysfs, and then umount >>>>> the fs, >>>> It is different from pending change. >>> No, now features/label changing using sysfs both use pending changes to do the >>> commit. >>> See BTRFS_PENDING_COMMIT bit. >>> So freeze -> change features/label -> sync will still cause the deadlock in the >>> same way, >>> and you can try it yourself. >> As I said above, the implementation of sysfs feature and label change is wrong, >> it is better to separate them from the pending mount option change, make the >> sysfs feature and label change be done in the context of transaction after >> getting the write permission. If so, we needn't do anything special when sync >> the fs. >> >> In short, changing the sysfs feature and label change implementation and >> removing the unnecessary btrfs_start_transaction in sync_fs can fix the >> deadlock. > Your method will only fix the deadlock, but will introduce the risk like pending > inode_cache will never > be written to disk as I already explained. (If still using the > fs_info->pending_changes mechanism) > To ensure pending changes written to disk sync_fs() should start a transaction > if needed, > or there will be chance that no transaction can handle it. We are sure that writting down the inode cache need transaction. But INODE_CACHE is not a forcible flag. Sometimes though you set it, it is very likely that the inode cache files are not created and the data is not written down because the fs might still be reading inode usage information, and this operation might span several transactions. So I think what you worried is not a problem. Thanks Miao > > But I don't see the necessity to pending current work(inode_cache, feature/label > changes) to next transaction. > > To David: > I'm a little curious about why inode_cache needs to be delayed to next transaction. > In btrfs_remount() we have s_umount mutex, and we synced the whole filesystem > already, > so there should be no running transaction and we can just set any mount option > into fs_info. > > Or even in worst case, there is a racing window, we can still start a > transaction and do the commit, > a little overhead in such minor case won't impact the overall performance. > > For sysfs change, I prefer attach or start transaction method, and for mount > option change, and > such sysfs tuning is also minor case for a filesystem. > > What do you think about reverting the whole patchset and rework the sysfs > interface? > > Thanks, > Qu >> >> Thanks >> Miao >> >>> Thanks, >>> Qu >>> >>>> If you want to change features/label, you should get write permission and make >>>> sure the fs is not be freezed because those are on-disk data. So the problem >>>> doesn't exist, or there is a bug. >>>> >>>> Thanks >>>> Miao >>>> >>>>> since there is no write, there is no running transaction and if we don't >>>>> start a >>>>> new transaction, >>>>> it won't be flushed to disk. >>>>> >>>>> Thanks, >>>>> Qu >>>>>> the reason is: >>>>>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs) >>>>>> - Data on the disk is right and integrated >>>>>> >>>>>> >>>>>> Thanks >>>>>> Miao >>>>> . >>>>> >>> . >>> > > . > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 7:47 ` Qu Wenruo 2015-01-21 8:46 ` Miao Xie @ 2015-01-23 17:39 ` David Sterba 2015-01-23 18:21 ` Chris Mason 1 sibling, 1 reply; 25+ messages in thread From: David Sterba @ 2015-01-23 17:39 UTC (permalink / raw) To: Qu Wenruo; +Cc: Miao Xie, Chris Mason, dsterba, linux-btrfs On Wed, Jan 21, 2015 at 03:47:54PM +0800, Qu Wenruo wrote: > To David: > I'm a little curious about why inode_cache needs to be delayed to next > transaction. > In btrfs_remount() we have s_umount mutex, and we synced the whole > filesystem already, > so there should be no running transaction and we can just set any mount > option into fs_info. See our discussion under the noinode_cache option: http://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg30075.html http://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg30414.html > What do you think about reverting the whole patchset and rework the > sysfs interface? IMO reverting should be the last option, we have a minimal fix to the sync deadlock and you've proposed the per-trasaction mount options to replace the pending inode_change. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-23 17:39 ` David Sterba @ 2015-01-23 18:21 ` Chris Mason 0 siblings, 0 replies; 25+ messages in thread From: Chris Mason @ 2015-01-23 18:21 UTC (permalink / raw) To: dsterba; +Cc: Qu Wenruo, Miao Xie, dsterba, linux-btrfs On Fri, Jan 23, 2015 at 12:39 PM, David Sterba <dsterba@suse.cz> wrote: > On Wed, Jan 21, 2015 at 03:47:54PM +0800, Qu Wenruo wrote: >> To David: >> I'm a little curious about why inode_cache needs to be delayed to >> next >> transaction. >> In btrfs_remount() we have s_umount mutex, and we synced the whole >> filesystem already, >> so there should be no running transaction and we can just set any >> mount >> option into fs_info. > > See our discussion under the noinode_cache option: > > https://urldefense.proofpoint.com/v1/url?u=http://www.mail-archive.com/linux-btrfs%2540vger.kernel.org/msg30075.html&k=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0A&r=6%2FL0lzzDhu0Y1hL9xm%2BQyA%3D%3D%0A&m=sv%2BL93W9i7vNsbS3ozpylY3o%2F3wpA4TZTQTtFh3mUXg%3D%0A&s=2d678af317413a7452f047aa9ed07bc7e5424d4bae831ac15fae5f23a2acd080 > https://urldefense.proofpoint.com/v1/url?u=http://www.mail-archive.com/linux-btrfs%2540vger.kernel.org/msg30414.html&k=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0A&r=6%2FL0lzzDhu0Y1hL9xm%2BQyA%3D%3D%0A&m=sv%2BL93W9i7vNsbS3ozpylY3o%2F3wpA4TZTQTtFh3mUXg%3D%0A&s=2fab711b3d70ab27c008694249bc62596f37e41af84dfc21077629930b4fe854 > >> What do you think about reverting the whole patchset and rework the >> sysfs interface? > > IMO reverting should be the last option, we have a minimal fix to the > sync deadlock and you've proposed the per-trasaction mount options to > replace the pending inode_change. I agree, I'd rather build on top of what we have than use reverts at this point. -chris ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-21 7:04 ` Miao Xie 2015-01-21 7:47 ` Qu Wenruo @ 2015-01-23 16:59 ` David Sterba 2015-01-26 0:31 ` Miao Xie 1 sibling, 1 reply; 25+ messages in thread From: David Sterba @ 2015-01-23 16:59 UTC (permalink / raw) To: Miao Xie; +Cc: Qu Wenruo, Chris Mason, dsterba, linux-btrfs On Wed, Jan 21, 2015 at 03:04:02PM +0800, Miao Xie wrote: > > Pending changes are *not* only mount options. Feature change and label change > > are also pending changes if using sysfs. > > My miss, I don't notice feature and label change by sysfs. > > But the implementation of feature and label change by sysfs is wrong, we can > not change them without write permission. Label change does not happen if the fs is readonly. If the filesystem is RW and label is changed through sysfs, then remount to RO will sync the filesystem and the new label will be saved. The sysfs features write handler is missing that protection though, I'll send a patch. > > For freeze, it's not the same problem since the fs will be unfreeze sooner or > > later and transaction will be initiated. > > You can not assume the operations of the users, they might freeze the fs and > then shutdown the machine. The semantics of freezing should make the on-device image consistent, but still keep some changes in memory. > >>> For example, if we change the features/label through sysfs, and then umount > >>> the fs, > >> It is different from pending change. > > No, now features/label changing using sysfs both use pending changes to do the > > commit. > > See BTRFS_PENDING_COMMIT bit. > > So freeze -> change features/label -> sync will still cause the deadlock in the > > same way, > > and you can try it yourself. > > As I said above, the implementation of sysfs feature and label change is wrong, > it is better to separate them from the pending mount option change, make the > sysfs feature and label change be done in the context of transaction after > getting the write permission. If so, we needn't do anything special when sync > the fs. That would mean to drop the write support of sysfs files that change global filesystem state (label and features right now). This would leave only the ioctl way to do that. I'd like to keep the sysfs write support though for ease of use from scripts and languages not ioctl-friendly. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. 2015-01-23 16:59 ` David Sterba @ 2015-01-26 0:31 ` Miao Xie 0 siblings, 0 replies; 25+ messages in thread From: Miao Xie @ 2015-01-26 0:31 UTC (permalink / raw) To: dsterba, Qu Wenruo, Chris Mason, linux-btrfs On Fri, 23 Jan 2015 17:59:49 +0100, David Sterba wrote: > On Wed, Jan 21, 2015 at 03:04:02PM +0800, Miao Xie wrote: >>> Pending changes are *not* only mount options. Feature change and label change >>> are also pending changes if using sysfs. >> >> My miss, I don't notice feature and label change by sysfs. >> >> But the implementation of feature and label change by sysfs is wrong, we can >> not change them without write permission. > > Label change does not happen if the fs is readonly. If the filesystem is > RW and label is changed through sysfs, then remount to RO will sync the > filesystem and the new label will be saved. > > The sysfs features write handler is missing that protection though, I'll > send a patch. First, the R/O protection is so cheap, there is a race between R/O remount and label/feature change, please consider the following case: Remount R/O task Label/Attr Change Task Check R/O remount ro R/O change Label/feature Second, it forgets to handle the freezing event. > >>> For freeze, it's not the same problem since the fs will be unfreeze sooner or >>> later and transaction will be initiated. >> >> You can not assume the operations of the users, they might freeze the fs and >> then shutdown the machine. > > The semantics of freezing should make the on-device image consistent, > but still keep some changes in memory. > >>>>> For example, if we change the features/label through sysfs, and then umount >>>>> the fs, >>>> It is different from pending change. >>> No, now features/label changing using sysfs both use pending changes to do the >>> commit. >>> See BTRFS_PENDING_COMMIT bit. >>> So freeze -> change features/label -> sync will still cause the deadlock in the >>> same way, >>> and you can try it yourself. >> >> As I said above, the implementation of sysfs feature and label change is wrong, >> it is better to separate them from the pending mount option change, make the >> sysfs feature and label change be done in the context of transaction after >> getting the write permission. If so, we needn't do anything special when sync >> the fs. > > That would mean to drop the write support of sysfs files that change > global filesystem state (label and features right now). This would leave > only the ioctl way to do that. I'd like to keep the sysfs write support > though for ease of use from scripts and languages not ioctl-friendly. > . not drop the write support of sysfs, just fix the bug and make it change the label and features under the writable context. Thanks Miao ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2015-01-26 0:37 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-01-19 7:42 [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock Qu Wenruo 2015-01-19 14:06 ` David Sterba 2015-01-20 2:51 ` Qu Wenruo 2015-01-20 2:53 ` Qu Wenruo 2015-01-20 3:06 ` Miao Xie 2015-01-20 3:17 ` Qu Wenruo 2015-01-20 8:16 ` Miao Xie 2015-01-20 0:19 ` Miao Xie 2015-01-20 0:26 ` Qu Wenruo 2015-01-20 17:13 ` David Sterba 2015-01-21 0:58 ` Qu Wenruo 2015-01-21 1:05 ` Chris Mason 2015-01-21 1:09 ` Qu Wenruo 2015-01-21 1:10 ` Chris Mason 2015-01-21 3:10 ` Miao Xie 2015-01-21 3:15 ` Qu Wenruo 2015-01-21 3:26 ` Miao Xie 2015-01-21 3:53 ` Qu Wenruo 2015-01-21 7:04 ` Miao Xie 2015-01-21 7:47 ` Qu Wenruo 2015-01-21 8:46 ` Miao Xie 2015-01-23 17:39 ` David Sterba 2015-01-23 18:21 ` Chris Mason 2015-01-23 16:59 ` David Sterba 2015-01-26 0:31 ` Miao Xie
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).