linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Miao Xie <miaoxie@huawei.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, Chris Mason <clm@fb.com>,
	<dsterba@suse.cz>
Cc: <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
Date: Wed, 21 Jan 2015 16:46:59 +0800	[thread overview]
Message-ID: <54BF6783.4000603@huawei.com> (raw)
In-Reply-To: <54BF59AA.9070408@cn.fujitsu.com>

On Wed, 21 Jan 2015 15:47:54 +0800, Qu Wenruo wrote:
>> On Wed, 21 Jan 2015 11:53:34 +0800, Qu Wenruo wrote:
>>>>> [snipped]
>>>>> This will cause another problem, nobody can ensure there will be next
>>>>> transaction and the change may
>>>>> never to written into disk.
>>>> First, the pending changes is mount option, that is in-memory data.
>>>> Second, the same problem would happen after you freeze fs.
>>> Pending changes are *not* only mount options. Feature change and label change
>>> are also pending changes if using sysfs.
>> My miss, I don't notice feature and label change by sysfs.
>>
>> But the implementation of feature and label change by sysfs is wrong, we can
>> not change them without write permission.
>>
>>> Normal ioctl label changing is not affected.
>>>
>>> For freeze, it's not the same problem since the fs will be unfreeze sooner or
>>> later and transaction will be initiated.
>> You can not assume the operations of the users, they might freeze the fs and
>> then shutdown the machine.
>>
>>>>> For example, if we change the features/label through sysfs, and then umount
>>>>> the fs,
>>>> It is different from pending change.
>>> No, now features/label changing using sysfs both use pending changes to do the
>>> commit.
>>> See BTRFS_PENDING_COMMIT bit.
>>> So freeze -> change features/label -> sync will still cause the deadlock in the
>>> same way,
>>> and you can try it yourself.
>> As I said above, the implementation of sysfs feature and label change is wrong,
>> it is better to separate them from the pending mount option change, make the
>> sysfs feature and label change be done in the context of transaction after
>> getting the write permission. If so, we needn't do anything special when sync
>> the fs.
>>
>> In short, changing the sysfs feature and label change implementation and
>> removing the unnecessary btrfs_start_transaction in sync_fs can fix the
>> deadlock.
> Your method will only fix the deadlock, but will introduce the risk like pending
> inode_cache will never
> be written to disk as I already explained. (If still using the
> fs_info->pending_changes mechanism)
> To ensure pending changes written to disk sync_fs() should start a transaction
> if needed,
> or there will be chance that no transaction can handle it.

We are sure that writting down the inode cache need transaction. But INODE_CACHE
is not a forcible flag. Sometimes though you set it, it is very likely that the
inode cache files are not created and the data is not written down because the
fs might still be reading inode usage information, and this operation might span
several transactions. So I think what you worried is not a problem.

Thanks
Miao

> 
> But I don't see the necessity to pending current work(inode_cache, feature/label
> changes) to next transaction.
> 
> To David:
> I'm a little curious about why inode_cache needs to be delayed to next transaction.
> In btrfs_remount() we have s_umount mutex, and we synced the whole filesystem
> already,
> so there should be no running transaction and we can just set any mount option
> into fs_info.
> 
> Or even in worst case, there is a racing window, we can still start a
> transaction and do the commit,
> a little overhead in such minor case won't impact the overall performance.
> 
> For sysfs change, I prefer attach or start transaction method, and for mount
> option change, and
> such sysfs tuning is also minor case for a filesystem.
> 
> What do you think about reverting the whole patchset and rework the sysfs
> interface?
> 
> Thanks,
> Qu
>>
>> Thanks
>> Miao
>>
>>> Thanks,
>>> Qu
>>>
>>>> If you want to change features/label,  you should get write permission and make
>>>> sure the fs is not be freezed because those are on-disk data. So the problem
>>>> doesn't exist, or there is a bug.
>>>>
>>>> Thanks
>>>> Miao
>>>>
>>>>> since there is no write, there is no running transaction and if we don't
>>>>> start a
>>>>> new transaction,
>>>>> it won't be flushed to disk.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>> the reason is:
>>>>>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs)
>>>>>> - Data on the disk is right and integrated
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Miao
>>>>> .
>>>>>
>>> .
>>>
> 
> .
> 


  reply	other threads:[~2015-01-21  8:47 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-19  7:42 [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock Qu Wenruo
2015-01-19 14:06 ` David Sterba
2015-01-20  2:51   ` Qu Wenruo
2015-01-20  2:53     ` Qu Wenruo
2015-01-20  3:06       ` Miao Xie
2015-01-20  3:17         ` Qu Wenruo
2015-01-20  8:16           ` Miao Xie
2015-01-20  0:19 ` Miao Xie
2015-01-20  0:26   ` Qu Wenruo
2015-01-20 17:13 ` David Sterba
2015-01-21  0:58   ` Qu Wenruo
2015-01-21  1:05     ` Chris Mason
2015-01-21  1:09       ` Qu Wenruo
2015-01-21  1:10         ` Chris Mason
2015-01-21  3:10           ` Miao Xie
2015-01-21  3:15             ` Qu Wenruo
2015-01-21  3:26               ` Miao Xie
2015-01-21  3:53                 ` Qu Wenruo
2015-01-21  7:04                   ` Miao Xie
2015-01-21  7:47                     ` Qu Wenruo
2015-01-21  8:46                       ` Miao Xie [this message]
2015-01-23 17:39                       ` David Sterba
2015-01-23 18:21                         ` Chris Mason
2015-01-23 16:59                     ` David Sterba
2015-01-26  0:31                       ` Miao Xie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54BF6783.4000603@huawei.com \
    --to=miaoxie@huawei.com \
    --cc=clm@fb.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).