From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Miao Xie <miaoxie@huawei.com>, <linux-btrfs@vger.kernel.org>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get vfsmount from a given sb.
Date: Fri, 30 Jan 2015 11:30:14 +0800 [thread overview]
Message-ID: <54CAFAC6.9030705@cn.fujitsu.com> (raw)
In-Reply-To: <54CAF8E6.8030100@huawei.com>
-------- Original Message --------
Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function
to get vfsmount from a given sb.
From: Miao Xie <miaoxie@huawei.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, <linux-btrfs@vger.kernel.org>
Date: 2015年01月30日 11:22
> On Fri, 30 Jan 2015 10:02:26 +0800, Qu Wenruo wrote:
>> -------- Original Message --------
>> Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get
>> vfsmount from a given sb.
>> From: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> To: Miao Xie <miaoxie@huawei.com>, linux-btrfs@vger.kernel.org
>> Date: 2015年01月30日 09:44
>>> -------- Original Message --------
>>> Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get
>>> vfsmount from a given sb.
>>> From: Miao Xie <miaoxie@huawei.com>
>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>, <linux-btrfs@vger.kernel.org>
>>> Date: 2015年01月30日 08:52
>>>> On Thu, 29 Jan 2015 10:24:39 +0800, Qu Wenruo wrote:
>>>>> There are sysfs interfaces in some fs, only btrfs yet, which will modify
>>>>> on-disk data.
>>>>> Unlike normal file operation routine we can use mnt_want_write_file() to
>>>>> protect the operation, change through sysfs won't to be binded to any file
>>>>> in the filesystem.
>>>>> So we can only extract the first vfsmount of a superblock and pass it to
>>>>> mnt_want_write() to do the protection.
>>>> This method is wrong, becasue one fs may be mounted on the multi places
>>>> at the same time, someone is R/O, someone is R/W, you may get a R/O and
>>>> fail to get the write permission.
>>> This shouldn't happen. If someone is ro, the whole fs should be ro, right?
>>> You can mount a device which is already mounted as rw to other point as ro,
>>> and remount a mount point to ro will also cause all other mount point to ro.
>>>
>>> So I didn't see the problem here.
>>>> I think you do label/feature change by sysfs interface by the following way
>>>>
>>>> btrfs_sysfs_change_XXXX()
>>>> {
>>>> /* Use trylock to avoid the race with umount */
>>>> if(!mutex_trylock(&sb->s_umount))
>>>> return -EBUSY;
>>>>
>>>> check R/O and FREEZE
>>>>
>>>> mutex_unlock(&sb->s_umount);
>>>> }
>>> This looks better since it not introduce changes to VFS.
>>>
>>> Thanks,
>>> Qu
>> Oh, wait a second, this one leads to the old problem and old solution.
>>
>> If we hold s_umount mutex, we must do freeze check and can't start transaction
>> since it will deadlock.
>>
>> And for freeze check, we must use sb_try_start_intwrite() to hold the freeze
>> lock and then add a new
>> btrfs_start_transaction_freeze() which will not call sb_start_write()...
>>
>> Oh this seems so similar, v2 or v3 version RFC patch?
>> So still goes to the old method?
> No. Just check R/O and RREEZE, if failed, go out. if the check pass,
> we start_transaction. Because we do it in s_umount lock, no one can
> change fs to R/O or FREEZE.
>
> Maybe the above description is not so clear, explain it again.
>
> btrfs_sysfs_change_XXXX()
> {
> /* Use trylock to avoid the race with umount */
> if(!mutex_trylock(&sb->s_umount))
> return -EBUSY;
>
> if (fs is R/O or FREEZED) {
> mutex_unlock(&sb->s_umount);
> return -EACCES;
> }
>
> btrfs_start_transaction()
> change label/feature
> btrfs_commit_transaction()
>
> mutex_unlock(&sb->s_umount);
> }
I prefer the sb_want_write() method, since it doesn't even need to hold
the s_umount mutex.
Thanks,
Qu
> Thanks
> Miao
>
>> Thanks,
>> Qu
>>>> Thanks
>>>> Miao
>>>>
>>>>> Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
>>>>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>>> ---
>>>>> fs/namespace.c | 25 +++++++++++++++++++++++++
>>>>> include/linux/mount.h | 1 +
>>>>> 2 files changed, 26 insertions(+)
>>>>>
>>>>> diff --git a/fs/namespace.c b/fs/namespace.c
>>>>> index cd1e968..5a16a62 100644
>>>>> --- a/fs/namespace.c
>>>>> +++ b/fs/namespace.c
>>>>> @@ -1105,6 +1105,31 @@ struct vfsmount *mntget(struct vfsmount *mnt)
>>>>> }
>>>>> EXPORT_SYMBOL(mntget);
>>>>> +/*
>>>>> + * get a vfsmount from a given sb
>>>>> + *
>>>>> + * This is especially used for case where change fs' sysfs interface
>>>>> + * will lead to a write, e.g. Change label through sysfs in btrfs.
>>>>> + * So vfs can get a vfsmount and then use mnt_want_write() to protect.
>>>>> + */
>>>>> +struct vfsmount *get_vfsmount_sb(struct super_block *sb)
>>>>> +{
>>>>> + struct vfsmount *ret_vfs = NULL;
>>>>> + struct mount *mnt;
>>>>> + int ret = 0;
>>>>> +
>>>>> + lock_mount_hash();
>>>>> + if (list_empty(&sb->s_mounts))
>>>>> + goto out;
>>>>> + mnt = list_entry(sb->s_mounts.next, struct mount, mnt_instance);
>>>>> + ret_vfs = &mnt->mnt;
>>>>> + ret_vfs = mntget(ret_vfs);
>>>>> +out:
>>>>> + unlock_mount_hash();
>>>>> + return ret_vfs;
>>>>> +}
>>>>> +EXPORT_SYMBOL(get_vfsmount_sb);
>>>>> +
>>>>> struct vfsmount *mnt_clone_internal(struct path *path)
>>>>> {
>>>>> struct mount *p;
>>>>> diff --git a/include/linux/mount.h b/include/linux/mount.h
>>>>> index c2c561d..cf1b0f5 100644
>>>>> --- a/include/linux/mount.h
>>>>> +++ b/include/linux/mount.h
>>>>> @@ -79,6 +79,7 @@ extern void mnt_drop_write_file(struct file *file);
>>>>> extern void mntput(struct vfsmount *mnt);
>>>>> extern struct vfsmount *mntget(struct vfsmount *mnt);
>>>>> extern struct vfsmount *mnt_clone_internal(struct path *path);
>>>>> +extern struct vfsmount *get_vfsmount_sb(struct super_block *sb);
>>>>> extern int __mnt_is_readonly(struct vfsmount *mnt);
>>>>> struct path;
>>>>>
>> .
>>
WARNING: multiple messages have this Message-ID (diff)
From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Miao Xie <miaoxie@huawei.com>, <linux-btrfs@vger.kernel.org>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get vfsmount from a given sb.
Date: Fri, 30 Jan 2015 11:30:14 +0800 [thread overview]
Message-ID: <54CAFAC6.9030705@cn.fujitsu.com> (raw)
In-Reply-To: <54CAF8E6.8030100@huawei.com>
-------- Original Message --------
Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function
to get vfsmount from a given sb.
From: Miao Xie <miaoxie@huawei.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, <linux-btrfs@vger.kernel.org>
Date: 2015年01月30日 11:22
> On Fri, 30 Jan 2015 10:02:26 +0800, Qu Wenruo wrote:
>> -------- Original Message --------
>> Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get
>> vfsmount from a given sb.
>> From: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> To: Miao Xie <miaoxie@huawei.com>, linux-btrfs@vger.kernel.org
>> Date: 2015年01月30日 09:44
>>> -------- Original Message --------
>>> Subject: Re: [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get
>>> vfsmount from a given sb.
>>> From: Miao Xie <miaoxie@huawei.com>
>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>, <linux-btrfs@vger.kernel.org>
>>> Date: 2015年01月30日 08:52
>>>> On Thu, 29 Jan 2015 10:24:39 +0800, Qu Wenruo wrote:
>>>>> There are sysfs interfaces in some fs, only btrfs yet, which will modify
>>>>> on-disk data.
>>>>> Unlike normal file operation routine we can use mnt_want_write_file() to
>>>>> protect the operation, change through sysfs won't to be binded to any file
>>>>> in the filesystem.
>>>>> So we can only extract the first vfsmount of a superblock and pass it to
>>>>> mnt_want_write() to do the protection.
>>>> This method is wrong, becasue one fs may be mounted on the multi places
>>>> at the same time, someone is R/O, someone is R/W, you may get a R/O and
>>>> fail to get the write permission.
>>> This shouldn't happen. If someone is ro, the whole fs should be ro, right?
>>> You can mount a device which is already mounted as rw to other point as ro,
>>> and remount a mount point to ro will also cause all other mount point to ro.
>>>
>>> So I didn't see the problem here.
>>>> I think you do label/feature change by sysfs interface by the following way
>>>>
>>>> btrfs_sysfs_change_XXXX()
>>>> {
>>>> /* Use trylock to avoid the race with umount */
>>>> if(!mutex_trylock(&sb->s_umount))
>>>> return -EBUSY;
>>>>
>>>> check R/O and FREEZE
>>>>
>>>> mutex_unlock(&sb->s_umount);
>>>> }
>>> This looks better since it not introduce changes to VFS.
>>>
>>> Thanks,
>>> Qu
>> Oh, wait a second, this one leads to the old problem and old solution.
>>
>> If we hold s_umount mutex, we must do freeze check and can't start transaction
>> since it will deadlock.
>>
>> And for freeze check, we must use sb_try_start_intwrite() to hold the freeze
>> lock and then add a new
>> btrfs_start_transaction_freeze() which will not call sb_start_write()...
>>
>> Oh this seems so similar, v2 or v3 version RFC patch?
>> So still goes to the old method?
> No. Just check R/O and RREEZE, if failed, go out. if the check pass,
> we start_transaction. Because we do it in s_umount lock, no one can
> change fs to R/O or FREEZE.
>
> Maybe the above description is not so clear, explain it again.
>
> btrfs_sysfs_change_XXXX()
> {
> /* Use trylock to avoid the race with umount */
> if(!mutex_trylock(&sb->s_umount))
> return -EBUSY;
>
> if (fs is R/O or FREEZED) {
> mutex_unlock(&sb->s_umount);
> return -EACCES;
> }
>
> btrfs_start_transaction()
> change label/feature
> btrfs_commit_transaction()
>
> mutex_unlock(&sb->s_umount);
> }
I prefer the sb_want_write() method, since it doesn't even need to hold
the s_umount mutex.
Thanks,
Qu
> Thanks
> Miao
>
>> Thanks,
>> Qu
>>>> Thanks
>>>> Miao
>>>>
>>>>> Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
>>>>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>>> ---
>>>>> fs/namespace.c | 25 +++++++++++++++++++++++++
>>>>> include/linux/mount.h | 1 +
>>>>> 2 files changed, 26 insertions(+)
>>>>>
>>>>> diff --git a/fs/namespace.c b/fs/namespace.c
>>>>> index cd1e968..5a16a62 100644
>>>>> --- a/fs/namespace.c
>>>>> +++ b/fs/namespace.c
>>>>> @@ -1105,6 +1105,31 @@ struct vfsmount *mntget(struct vfsmount *mnt)
>>>>> }
>>>>> EXPORT_SYMBOL(mntget);
>>>>> +/*
>>>>> + * get a vfsmount from a given sb
>>>>> + *
>>>>> + * This is especially used for case where change fs' sysfs interface
>>>>> + * will lead to a write, e.g. Change label through sysfs in btrfs.
>>>>> + * So vfs can get a vfsmount and then use mnt_want_write() to protect.
>>>>> + */
>>>>> +struct vfsmount *get_vfsmount_sb(struct super_block *sb)
>>>>> +{
>>>>> + struct vfsmount *ret_vfs = NULL;
>>>>> + struct mount *mnt;
>>>>> + int ret = 0;
>>>>> +
>>>>> + lock_mount_hash();
>>>>> + if (list_empty(&sb->s_mounts))
>>>>> + goto out;
>>>>> + mnt = list_entry(sb->s_mounts.next, struct mount, mnt_instance);
>>>>> + ret_vfs = &mnt->mnt;
>>>>> + ret_vfs = mntget(ret_vfs);
>>>>> +out:
>>>>> + unlock_mount_hash();
>>>>> + return ret_vfs;
>>>>> +}
>>>>> +EXPORT_SYMBOL(get_vfsmount_sb);
>>>>> +
>>>>> struct vfsmount *mnt_clone_internal(struct path *path)
>>>>> {
>>>>> struct mount *p;
>>>>> diff --git a/include/linux/mount.h b/include/linux/mount.h
>>>>> index c2c561d..cf1b0f5 100644
>>>>> --- a/include/linux/mount.h
>>>>> +++ b/include/linux/mount.h
>>>>> @@ -79,6 +79,7 @@ extern void mnt_drop_write_file(struct file *file);
>>>>> extern void mntput(struct vfsmount *mnt);
>>>>> extern struct vfsmount *mntget(struct vfsmount *mnt);
>>>>> extern struct vfsmount *mnt_clone_internal(struct path *path);
>>>>> +extern struct vfsmount *get_vfsmount_sb(struct super_block *sb);
>>>>> extern int __mnt_is_readonly(struct vfsmount *mnt);
>>>>> struct path;
>>>>>
>> .
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2015-01-30 3:30 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-29 2:24 [PATCH RESEND v4 0/8] Fix freeze/sysfs deadlock in better method Qu Wenruo
2015-01-29 2:24 ` [PATCH RESEND v4 1/8] Revert "btrfs: add support for processing pending changes" related commits Qu Wenruo
2015-01-29 2:24 ` [PATCH RESEND v4 2/8] btrfs: Make btrfs_parse_options() parse mount option in a atomic way Qu Wenruo
2015-01-30 0:31 ` Miao Xie
2015-01-30 1:20 ` Qu Wenruo
2015-01-30 1:29 ` Miao Xie
2015-01-30 1:33 ` Qu Wenruo
2015-01-30 2:06 ` Miao Xie
2015-01-30 2:51 ` Qu Wenruo
2015-01-30 3:21 ` Miao Xie
2015-01-30 3:25 ` Qu Wenruo
2015-01-29 2:24 ` [PATCH RESEND v4 3/8] btrfs: Introduce per-transaction mount_opt to keep mount option consistent during transaction Qu Wenruo
2015-01-29 2:24 ` [PATCH RESEND v4 4/8] btrfs: Use btrfs_test_trans_opt() to handle SPACE_CACHE if it's under transaction protect Qu Wenruo
2015-01-29 2:24 ` [PATCH RESEND v4 5/8] btrfs: Use btrfs_test_trans_opt() to handle INODE_CACHE " Qu Wenruo
2015-01-29 2:24 ` [PATCH RESEND v4 6/8] vfs: Add get_vfsmount_sb() function to get vfsmount from a given sb Qu Wenruo
2015-01-29 12:37 ` David Sterba
2015-01-29 15:23 ` Al Viro
2015-01-30 1:11 ` Qu Wenruo
2015-01-30 1:11 ` Qu Wenruo
2015-01-30 2:09 ` Al Viro
2015-01-30 2:20 ` Qu Wenruo
2015-01-30 2:20 ` Qu Wenruo
2015-01-30 0:52 ` Miao Xie
2015-01-30 1:44 ` Qu Wenruo
2015-01-30 1:44 ` Qu Wenruo
2015-01-30 2:02 ` Qu Wenruo
2015-01-30 2:02 ` Qu Wenruo
2015-01-30 3:22 ` Miao Xie
2015-01-30 3:22 ` Miao Xie
2015-01-30 3:30 ` Qu Wenruo [this message]
2015-01-30 3:30 ` Qu Wenruo
2015-01-30 2:14 ` Al Viro
2015-01-30 4:14 ` Miao Xie
2015-01-30 4:37 ` Al Viro
2015-01-30 5:34 ` Miao Xie
2015-01-30 6:15 ` Al Viro
2015-01-30 5:30 ` Qu Wenruo
2015-01-30 5:30 ` Qu Wenruo
2015-01-29 2:24 ` [PATCH RESEND v4 7/8] btrfs: Use mnt_want_write() to protect label change Qu Wenruo
2015-01-29 2:24 ` [PATCH RESEND v4 8/8] btrfs: Use mnt_want_write() to protect sysfs feature change Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54CAFAC6.9030705@cn.fujitsu.com \
--to=quwenruo@cn.fujitsu.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=miaoxie@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.