* [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze
@ 2012-08-14 5:01 liub.liubo
2012-08-14 12:59 ` Marco Stornelli
0 siblings, 1 reply; 6+ messages in thread
From: liub.liubo @ 2012-08-14 5:01 UTC (permalink / raw)
To: linux-btrfs; +Cc: linux-fsdevel
From: Liu Bo <bo.li.liu@oracle.com>
I found this while testing xfstests 068, the story is
t1 t2
sys_sync thaw_super
iterate_supers
down_read(sb->s_umount) down_write(sb->s_umount) --->wait for t1
sync_fs (with wait mode)
start_transaction
sb_start_intwrite --------------------> wait for t2 to set s_writers.frozen to SB_UNFROZEN
In this patch, I add an helper sb_start_intwrite_trylock() and use it before we
start_transaction in sync_fs() with wait mode so that we won't hit the deadlock.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
fs/btrfs/super.c | 15 +++++++++++++++
include/linux/fs.h | 5 +++++
2 files changed, 20 insertions(+), 0 deletions(-)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f2eb24c..1e04b41 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -847,6 +847,21 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
return 0;
}
+ /*
+ * sys_sync can cause an ABBA deadlock with freeze/thaw
+ * o freeze_super() grabs s_umount lock and set sb to SB_FREEZE_FS.
+ * o thaw_super() grabs s_umount lock and set sb to SB_UNFROZEN.
+ * o iterate_supers() grabs s_umount lock, and sync fs, during which
+ * we need to do sb_start_intwrite() in starting a
+ * new transaction.
+ * so iterate_supers() will wait for thaw_super() to reset sb's frozen
+ * state, while thaw_super() will wait for iterate_supers() to drop the
+ * s_umount lock. This is an ABBA deadlock.
+ */
+ if (!sb_start_intwrite_trylock(sb))
+ return 0;
+ sb_end_intwrite(sb);
+
btrfs_wait_ordered_extents(root, 0, 0);
trans = btrfs_start_transaction(root, 0);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index aa11047..8a3efd0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1700,6 +1700,11 @@ static inline void sb_start_intwrite(struct super_block *sb)
__sb_start_write(sb, SB_FREEZE_FS, true);
}
+static inline int sb_start_intwrite_trylock(struct super_block *sb)
+{
+ return __sb_start_write(sb, SB_FREEZE_FS, false);
+}
+
extern bool inode_owner_or_capable(const struct inode *inode);
--
1.7.7.6
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze
2012-08-14 5:01 [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze liub.liubo
@ 2012-08-14 12:59 ` Marco Stornelli
2012-08-14 13:53 ` Liu Bo
0 siblings, 1 reply; 6+ messages in thread
From: Marco Stornelli @ 2012-08-14 12:59 UTC (permalink / raw)
To: liub.liubo; +Cc: linux-btrfs, linux-fsdevel
Il 14/08/2012 07:01, liub.liubo@gmail.com ha scritto:
> From: Liu Bo <bo.li.liu@oracle.com>
>
> I found this while testing xfstests 068, the story is
>
> t1 t2
> sys_sync thaw_super
> iterate_supers
> down_read(sb->s_umount) down_write(sb->s_umount) --->wait for t1
> sync_fs (with wait mode)
> start_transaction
> sb_start_intwrite --------------------> wait for t2 to set s_writers.frozen to SB_UNFROZEN
>
> In this patch, I add an helper sb_start_intwrite_trylock() and use it before we
> start_transaction in sync_fs() with wait mode so that we won't hit the deadlock.
>
IMHO, we should avoid to call the sync operation on a frozen fs. The
freeze operation, indeed, already include a sync operation. According to
man page, no other operation should modify the fs after the freeze. So
for me the modification is inside sync_filesystem (and sync_one_sb).
Marco
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze
2012-08-14 12:59 ` Marco Stornelli
@ 2012-08-14 13:53 ` Liu Bo
2012-08-14 14:12 ` Marco Stornelli
0 siblings, 1 reply; 6+ messages in thread
From: Liu Bo @ 2012-08-14 13:53 UTC (permalink / raw)
To: Marco Stornelli; +Cc: liub.liubo, linux-btrfs, linux-fsdevel
On 08/14/2012 08:59 PM, Marco Stornelli wrote:
> Il 14/08/2012 07:01, liub.liubo@gmail.com ha scritto:
>> From: Liu Bo <bo.li.liu@oracle.com>
>>
>> I found this while testing xfstests 068, the story is
>>
>> t1 t2
>> sys_sync thaw_super
>> iterate_supers
>> down_read(sb->s_umount) down_write(sb->s_umount) --->wait for t1
>> sync_fs (with wait mode)
>> start_transaction
>> sb_start_intwrite --------------------> wait for t2 to set s_writers.frozen to SB_UNFROZEN
>>
>> In this patch, I add an helper sb_start_intwrite_trylock() and use it before we
>> start_transaction in sync_fs() with wait mode so that we won't hit the deadlock.
>>
>
> IMHO, we should avoid to call the sync operation on a frozen fs. The freeze operation, indeed, already include a sync operation.
> According to man page, no other operation should modify the fs after the freeze.
> So for me the modification is inside sync_filesystem (and sync_one_sb).
Do you mean that we should add the trylock check in sync_filesystem?
But it seems to be useless because we already run into down_read(sb->s_umount) before starting sync_one_sb().
thanks,
liubo
>
> Marco
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze
2012-08-14 13:53 ` Liu Bo
@ 2012-08-14 14:12 ` Marco Stornelli
2012-08-15 12:51 ` Liu Bo
0 siblings, 1 reply; 6+ messages in thread
From: Marco Stornelli @ 2012-08-14 14:12 UTC (permalink / raw)
To: Liu Bo; +Cc: liub.liubo, linux-btrfs, linux-fsdevel
Il 14/08/2012 15:53, Liu Bo ha scritto:
> On 08/14/2012 08:59 PM, Marco Stornelli wrote:
>> Il 14/08/2012 07:01, liub.liubo@gmail.com ha scritto:
>>> From: Liu Bo <bo.li.liu@oracle.com>
>>>
>>> I found this while testing xfstests 068, the story is
>>>
>>> t1 t2
>>> sys_sync thaw_super
>>> iterate_supers
>>> down_read(sb->s_umount) down_write(sb->s_umount) --->wait for t1
>>> sync_fs (with wait mode)
>>> start_transaction
>>> sb_start_intwrite --------------------> wait for t2 to set s_writers.frozen to SB_UNFROZEN
>>>
>>> In this patch, I add an helper sb_start_intwrite_trylock() and use it before we
>>> start_transaction in sync_fs() with wait mode so that we won't hit the deadlock.
>>>
>>
>> IMHO, we should avoid to call the sync operation on a frozen fs. The freeze operation, indeed, already include a sync operation.
>> According to man page, no other operation should modify the fs after the freeze.
>> So for me the modification is inside sync_filesystem (and sync_one_sb).
>
> Do you mean that we should add the trylock check in sync_filesystem?
>
> But it seems to be useless because we already run into down_read(sb->s_umount) before starting sync_one_sb().
>
> thanks,
> liubo
>
I meant that we should check if there are in a "complete" freeze state
(according to the "states" of a freeze transaction) and simply skip the
sync operation.
Marco
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze
2012-08-14 14:12 ` Marco Stornelli
@ 2012-08-15 12:51 ` Liu Bo
2012-08-15 13:12 ` Jan Kara
0 siblings, 1 reply; 6+ messages in thread
From: Liu Bo @ 2012-08-15 12:51 UTC (permalink / raw)
To: Marco Stornelli; +Cc: liub.liubo, linux-btrfs, linux-fsdevel, Jan Kara
(CCed Jan, the author of freeze code)
On 08/14/2012 10:12 PM, Marco Stornelli wrote:
> Il 14/08/2012 15:53, Liu Bo ha scritto:
>> On 08/14/2012 08:59 PM, Marco Stornelli wrote:
>>> Il 14/08/2012 07:01, liub.liubo@gmail.com ha scritto:
>>>> From: Liu Bo <bo.li.liu@oracle.com>
>>>>
>>>> I found this while testing xfstests 068, the story is
>>>>
>>>> t1 t2
>>>> sys_sync thaw_super
>>>> iterate_supers
>>>> down_read(sb->s_umount) down_write(sb->s_umount) --->wait for t1
>>>> sync_fs (with wait mode)
>>>> start_transaction
>>>> sb_start_intwrite --------------------> wait for t2 to set s_writers.frozen to SB_UNFROZEN
>>>>
>>>> In this patch, I add an helper sb_start_intwrite_trylock() and use it before we
>>>> start_transaction in sync_fs() with wait mode so that we won't hit the deadlock.
>>>>
>>>
>>> IMHO, we should avoid to call the sync operation on a frozen fs. The freeze operation, indeed, already include a sync operation.
>>> According to man page, no other operation should modify the fs after the freeze.
>>> So for me the modification is inside sync_filesystem (and sync_one_sb).
>>
>> Do you mean that we should add the trylock check in sync_filesystem?
>>
>> But it seems to be useless because we already run into down_read(sb->s_umount) before starting sync_one_sb().
>>
>> thanks,
>> liubo
>>
>
> I meant that we should check if there are in a "complete" freeze state (according to the "states" of a freeze transaction) and simply skip the sync operation.
>
I'm ok with it.
What do you think about it, Jan? Any comments?
thanks,
liubo
> Marco
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze
2012-08-15 12:51 ` Liu Bo
@ 2012-08-15 13:12 ` Jan Kara
0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2012-08-15 13:12 UTC (permalink / raw)
To: Liu Bo; +Cc: Marco Stornelli, liub.liubo, linux-btrfs, linux-fsdevel, Jan Kara
On Wed 15-08-12 20:51:17, Liu Bo wrote:
> (CCed Jan, the author of freeze code)
> On 08/14/2012 10:12 PM, Marco Stornelli wrote:
> > Il 14/08/2012 15:53, Liu Bo ha scritto:
> >> On 08/14/2012 08:59 PM, Marco Stornelli wrote:
> >>> Il 14/08/2012 07:01, liub.liubo@gmail.com ha scritto:
> >>>> From: Liu Bo <bo.li.liu@oracle.com>
> >>>>
> >>>> I found this while testing xfstests 068, the story is
> >>>>
> >>>> t1 t2
> >>>> sys_sync thaw_super
> >>>> iterate_supers
> >>>> down_read(sb->s_umount) down_write(sb->s_umount) --->wait for t1
> >>>> sync_fs (with wait mode)
> >>>> start_transaction
> >>>> sb_start_intwrite --------------------> wait for t2 to set s_writers.frozen to SB_UNFROZEN
> >>>>
> >>>> In this patch, I add an helper sb_start_intwrite_trylock() and use it before we
> >>>> start_transaction in sync_fs() with wait mode so that we won't hit the deadlock.
> >>>>
> >>>
> >>> IMHO, we should avoid to call the sync operation on a frozen fs. The freeze operation, indeed, already include a sync operation.
> >>> According to man page, no other operation should modify the fs after the freeze.
> >>> So for me the modification is inside sync_filesystem (and sync_one_sb).
> >>
> >> Do you mean that we should add the trylock check in sync_filesystem?
> >>
> >> But it seems to be useless because we already run into down_read(sb->s_umount) before starting sync_one_sb().
> >>
> >> thanks,
> >> liubo
> >>
> >
> > I meant that we should check if there are in a "complete" freeze state (according to the "states" of a freeze transaction) and simply skip the sync operation.
> >
>
> I'm ok with it.
>
> What do you think about it, Jan? Any comments?
Hum, so what I don't exactly understand is why does btrfs start a
transaction in ->sync_fs(). The idea of freeze code is that when filesystem
is frozen, there is no data / metadata to write and thus we avoid deadlocks
arising from trying to write anything with s_umount held.
So checking whether filesystem is frozen and avoiding transaction start in
that case in btrfs_sync_fs() will work but looks hacky. Rather the check
should be for the thing which is the reason for transaction start-end pair
in btrfs_sync_fs(). My naive guess would be we should check whether there
is any transaction running...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-08-15 13:12 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-14 5:01 [PATCH RFC] Btrfs: fix deadlock between sys_sync and freeze liub.liubo
2012-08-14 12:59 ` Marco Stornelli
2012-08-14 13:53 ` Liu Bo
2012-08-14 14:12 ` Marco Stornelli
2012-08-15 12:51 ` Liu Bo
2012-08-15 13:12 ` Jan Kara
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).