linux-btrfs.vger.kernel.org archive mirror
* [PATCH 2nd try v4] btrfs: drop uuid_mutex in btrfs_free_extra_devids()
@ 2018-05-28 14:43 Anand Jain
  2018-05-28 15:40 ` David Sterba
  0 siblings, 1 reply; 6+ messages in thread
From: Anand Jain @ 2018-05-28 14:43 UTC (permalink / raw)
  To: linux-btrfs

btrfs_free_extra_devids() is called only in the mount context; it
traverses fs_devices::devices and frees any orphan devices in the given
%fs_devices. As the search for orphan devices is limited to
fs_devices::devices, we don't need the global uuid_mutex.

There can't be any mount-point based ioctl threads in this context, as
the mount thread has not yet returned. But there can be a btrfs-control
based scan ioctl thread, which calls device_list_add().

Here in the mount thread, fs_devices::opened is incremented well before
btrfs_free_extra_devids() is called, and in the scan context an
fs_devices that is already opened can neither be freed nor allocated by
device_list_add().

But let's say you change the device path and call the scan again; the
scan would then update the new device path, and this operation could
race against the btrfs_free_extra_devids() thread, which might be in
the process of freeing the same device. So synchronize them by using
the device_list_mutex.
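
Schematically, the race looks like this (a simplified sketch, not the
exact code; helper names approximate the current volumes.c):

	/* scan thread: btrfs-control ioctl -> device_list_add() */
	mutex_lock(&fs_devices->device_list_mutex);
	name = rcu_string_strdup(path, GFP_NOFS);	/* the changed path */
	rcu_string_free(device->name);
	rcu_assign_pointer(device->name, name);
	mutex_unlock(&fs_devices->device_list_mutex);

	/* mount thread: btrfs_free_extra_devids() */
	mutex_lock(&fs_devices->device_list_mutex);	/* taken by this patch */
	list_del_init(&device->dev_list);
	fs_devices->num_devices--;
	free_device(device);			/* also frees device->name */
	mutex_unlock(&fs_devices->device_list_mutex);

Without a common lock, the path update can run while the same device
(and its name) is being freed.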

This scenario is a very narrow corner case; in practice, scan and mount
are serialized by usage anyway, so the race is very difficult to hit
unless it is instrumented.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
(I didn't see this email in the mailing list, so trying again).
v3->v4: As we traverse through the seed devices, fs_devices gets updated to
	the child seed fs_devices, so make sure the same fs_devices pointer is
	used for mutex_unlock() as was used for mutex_lock().
v2->v3: Update change log.
	(Currently device_list_add() is very lean on its device_list_mutex usage;
	a cleanup and fix are WIP. Given how unlikely the above race is in
	practice, this patch is good to merge).
v1->v2: replace uuid_mutex with device_list_mutex instead of deleting it.
	change log updated.
 fs/btrfs/volumes.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b6757b53c297..f03719221fca 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -924,8 +924,9 @@ void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices, int step)
 {
 	struct btrfs_device *device, *next;
 	struct btrfs_device *latest_dev = NULL;
+	struct btrfs_fs_devices *parent_fs_devices = fs_devices;
 
-	mutex_lock(&uuid_mutex);
+	mutex_lock(&parent_fs_devices->device_list_mutex);
 again:
 	/* This is the initialized path, it is safe to release the devices. */
 	list_for_each_entry_safe(device, next, &fs_devices->devices, dev_list) {
@@ -979,8 +980,7 @@ void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices, int step)
 	}
 
 	fs_devices->latest_bdev = latest_dev->bdev;
-
-	mutex_unlock(&uuid_mutex);
+	mutex_unlock(&parent_fs_devices->device_list_mutex);
 }
 
 static void free_device_rcu(struct rcu_head *head)
-- 
2.7.0



* Re: [PATCH 2nd try v4] btrfs: drop uuid_mutex in btrfs_free_extra_devids()
  2018-05-28 14:43 [PATCH 2nd try v4] btrfs: drop uuid_mutex in btrfs_free_extra_devids() Anand Jain
@ 2018-05-28 15:40 ` David Sterba
  2018-05-28 22:57   ` Anand Jain
  0 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2018-05-28 15:40 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Mon, May 28, 2018 at 10:43:29PM +0800, Anand Jain wrote:
> btrfs_free_extra_devids() is called only in the mount context; it
> traverses fs_devices::devices and frees any orphan devices in the given
> %fs_devices. As the search for orphan devices is limited to
> fs_devices::devices, we don't need the global uuid_mutex.
> 
> There can't be any mount-point based ioctl threads in this context, as
> the mount thread has not yet returned. But there can be a btrfs-control
> based scan ioctl thread, which calls device_list_add().
> 
> Here in the mount thread, fs_devices::opened is incremented well before
> btrfs_free_extra_devids() is called, and in the scan context an
> fs_devices that is already opened can neither be freed nor allocated by
> device_list_add().
> 
> But let's say you change the device path and call the scan again; the
> scan would then update the new device path, and this operation could
> race against the btrfs_free_extra_devids() thread, which might be in
> the process of freeing the same device. So synchronize them by using
> the device_list_mutex.
> 
> This scenario is a very narrow corner case; in practice, scan and mount
> are serialized by usage anyway, so the race is very difficult to hit
> unless it is instrumented.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> (I didn't see this email in the mailing list, so trying again).
> v3->v4: As we traverse through the seed devices, fs_devices gets updated to
> 	the child seed fs_devices, so make sure the same fs_devices pointer is
> 	used for mutex_unlock() as was used for mutex_lock().

Well, now that I see the change, shouldn't we always hold the
device_list_mutex of the fs_devices that's being processed? I.e. each
time it's switched, the previous one is unlocked and the new one locked.


* Re: [PATCH 2nd try v4] btrfs: drop uuid_mutex in btrfs_free_extra_devids()
  2018-05-28 15:40 ` David Sterba
@ 2018-05-28 22:57   ` Anand Jain
  2018-05-29  5:40     ` Anand Jain
  0 siblings, 1 reply; 6+ messages in thread
From: Anand Jain @ 2018-05-28 22:57 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 05/28/2018 11:40 PM, David Sterba wrote:
> On Mon, May 28, 2018 at 10:43:29PM +0800, Anand Jain wrote:
>> btrfs_free_extra_devids() is called only in the mount context; it
>> traverses fs_devices::devices and frees any orphan devices in the given
>> %fs_devices. As the search for orphan devices is limited to
>> fs_devices::devices, we don't need the global uuid_mutex.
>>
>> There can't be any mount-point based ioctl threads in this context, as
>> the mount thread has not yet returned. But there can be a btrfs-control
>> based scan ioctl thread, which calls device_list_add().
>>
>> Here in the mount thread, fs_devices::opened is incremented well before
>> btrfs_free_extra_devids() is called, and in the scan context an
>> fs_devices that is already opened can neither be freed nor allocated by
>> device_list_add().
>>
>> But let's say you change the device path and call the scan again; the
>> scan would then update the new device path, and this operation could
>> race against the btrfs_free_extra_devids() thread, which might be in
>> the process of freeing the same device. So synchronize them by using
>> the device_list_mutex.
>>
>> This scenario is a very narrow corner case; in practice, scan and mount
>> are serialized by usage anyway, so the race is very difficult to hit
>> unless it is instrumented.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> (I didn't see this email in the mailing list, so trying again).
>> v3->v4: As we traverse through the seed devices, fs_devices gets updated to
>> 	the child seed fs_devices, so make sure the same fs_devices pointer is
>> 	used for mutex_unlock() as was used for mutex_lock().
> 
> Well, now that I see the change, shouldn't we always hold the
> device_list_mutex of the fs_devices that's being processed? I.e. each
> time it's switched, the previous one is unlocked and the new one locked.

  No, David. That's because we organize seed devices under their parent
  fs_devices ((fs_devices::seed)::seed, and so on), and they are a local
  cloned copy of the original seed fs_devices. So the parent's
  fs_devices::device_list_mutex lock will suffice.
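
  Roughly, the relevant fields are (a trimmed-down sketch, not the full
  definition):

	struct btrfs_fs_devices {
		struct mutex device_list_mutex;	/* protects ::devices */
		struct list_head devices;
		struct btrfs_fs_devices *seed;	/* cloned seed fs_devices */
		/* ... */
	};

	/* mounted (parent) fs_devices -> ::seed -> ::seed -> ... */

  The cloned seed fs_devices are reachable only by walking ::seed from
  the mounted fs_devices, so holding the parent's device_list_mutex is
  assumed to gate the whole walk.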

Thanks, Anand


* Re: [PATCH 2nd try v4] btrfs: drop uuid_mutex in btrfs_free_extra_devids()
  2018-05-28 22:57   ` Anand Jain
@ 2018-05-29  5:40     ` Anand Jain
  2018-05-29 11:19       ` David Sterba
  0 siblings, 1 reply; 6+ messages in thread
From: Anand Jain @ 2018-05-29  5:40 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs



On 05/29/2018 06:57 AM, Anand Jain wrote:
> 
> 
> On 05/28/2018 11:40 PM, David Sterba wrote:
>> On Mon, May 28, 2018 at 10:43:29PM +0800, Anand Jain wrote:
>>> btrfs_free_extra_devids() is called only in the mount context; it
>>> traverses fs_devices::devices and frees any orphan devices in the given
>>> %fs_devices. As the search for orphan devices is limited to
>>> fs_devices::devices, we don't need the global uuid_mutex.
>>>
>>> There can't be any mount-point based ioctl threads in this context, as
>>> the mount thread has not yet returned. But there can be a btrfs-control
>>> based scan ioctl thread, which calls device_list_add().
>>>
>>> Here in the mount thread, fs_devices::opened is incremented well before
>>> btrfs_free_extra_devids() is called, and in the scan context an
>>> fs_devices that is already opened can neither be freed nor allocated by
>>> device_list_add().
>>>
>>> But let's say you change the device path and call the scan again; the
>>> scan would then update the new device path, and this operation could
>>> race against the btrfs_free_extra_devids() thread, which might be in
>>> the process of freeing the same device. So synchronize them by using
>>> the device_list_mutex.
>>>
>>> This scenario is a very narrow corner case; in practice, scan and mount
>>> are serialized by usage anyway, so the race is very difficult to hit
>>> unless it is instrumented.
>>>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>> ---
>>> (I didn't see this email in the mailing list, so trying again).
>>> v3->v4: As we traverse through the seed devices, fs_devices gets updated to
>>>     the child seed fs_devices, so make sure the same fs_devices pointer is
>>>     used for mutex_unlock() as was used for mutex_lock().
>>
>> Well, now that I see the change, shouldn't we always hold the
>> device_list_mutex of the fs_devices that's being processed? I.e. each
>> time it's switched, the previous one is unlocked and the new one locked.
> 
>   No, David. That's because we organize seed devices under their parent
>   fs_devices ((fs_devices::seed)::seed, and so on), and they are a local
>   cloned copy of the original seed fs_devices. So the parent's
>   fs_devices::device_list_mutex lock will suffice.

  On second thought, though theoretically you are right, practically
  there isn't any use case which would benefit from the intricate
  locking you suggested above.

  I am following this method:
  By holding the parent fs_devices lock (which is also the mounted
  fs_devices lock) we implicitly lock its dependent cloned seed
  fs_devices, as to reach a cloned seed device we first need to traverse
  from the parent fs_devices.

Thanks, Anand


> Thanks, Anand
> 

* Re: [PATCH 2nd try v4] btrfs: drop uuid_mutex in btrfs_free_extra_devids()
  2018-05-29  5:40     ` Anand Jain
@ 2018-05-29 11:19       ` David Sterba
  2018-07-16 14:32         ` Anand Jain
  0 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2018-05-29 11:19 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs

On Tue, May 29, 2018 at 01:40:34PM +0800, Anand Jain wrote:
> >>> v3->v4: As we traverse through the seed devices, fs_devices gets updated
> >>>     to the child seed fs_devices, so make sure the same fs_devices pointer
> >>>     is used for mutex_unlock() as was used for mutex_lock().
> >>
> >> Well, now that I see the change, shouldn't we always hold the
> >> device_list_mutex of the fs_devices that's being processed? I.e. each
> >> time it's switched, the previous one is unlocked and the new one locked.
> > 
> >   No, David. That's because we organize seed devices under their parent
> >   fs_devices ((fs_devices::seed)::seed, and so on), and they are a local
> >   cloned copy of the original seed fs_devices. So the parent's
> >   fs_devices::device_list_mutex lock will suffice.
> 
>   On second thought, though theoretically you are right, practically
>   there isn't any use case which would benefit from the intricate
>   locking you suggested above.

I don't think it's intricate or complex, just lock the fs_devices that
can potentially be modified in the following loop.

Schematically:

function called with some fs_devices

	lock fs_devices->device_list_mutex
again:
	foreach device in fs_devices
		if ...
			fs_devices counters get changed after device
			deletion
	endfor

	if (fs_devices->seed)
		unlock fs_devices->device_list_mutex
		fs_devices = fs_devices->seed
		lock fs_devices->device_list_mutex      <-- lock the new one
		goto again
	endif

	unlock fs_devices->device_list_mutex            <-- correctly unlock
endfunc
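
A rough C equivalent (just the locking skeleton; the device pruning and
latest_bdev parts are elided):

	void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices, int step)
	{
		struct btrfs_device *device, *next;

		mutex_lock(&fs_devices->device_list_mutex);
	again:
		list_for_each_entry_safe(device, next, &fs_devices->devices, dev_list) {
			/* ... drop the extra devices, adjust the counters ... */
		}

		if (fs_devices->seed) {
			struct btrfs_fs_devices *seed = fs_devices->seed;

			mutex_unlock(&fs_devices->device_list_mutex);
			fs_devices = seed;
			/* lock the new one */
			mutex_lock(&fs_devices->device_list_mutex);
			goto again;
		}

		mutex_unlock(&fs_devices->device_list_mutex);
	}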

>   I am following this method:
>   By holding the parent fs_devices lock (which is also the mounted
>   fs_devices lock) we implicitly lock its dependent cloned seed
>   fs_devices, as to reach a cloned seed device we first need to
>   traverse from the parent fs_devices.

Locking the parent fs_devices might be the right scheme, but if any
child fs_devices pointer gets passed to a random function that will
change it, then the locking will not protect it.

This might need a deeper audit of how the seeding device is done, and
maybe some helpers that e.g. lock the whole chain so all of them are
protected.
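
For example, something like this (a hypothetical sketch, helper names
invented; lockdep nesting annotations omitted for brevity):

	/* Lock every fs_devices in the seed chain, parent first. */
	static void btrfs_lock_fs_devices_chain(struct btrfs_fs_devices *fs_devices)
	{
		for (; fs_devices; fs_devices = fs_devices->seed)
			mutex_lock(&fs_devices->device_list_mutex);
	}

	static void btrfs_unlock_fs_devices_chain(struct btrfs_fs_devices *fs_devices)
	{
		for (; fs_devices; fs_devices = fs_devices->seed)
			mutex_unlock(&fs_devices->device_list_mutex);
	}

Callers would take the whole chain before any traversal that can switch
fs_devices, instead of relying on the parent lock alone.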


* Re: [PATCH 2nd try v4] btrfs: drop uuid_mutex in btrfs_free_extra_devids()
  2018-05-29 11:19       ` David Sterba
@ 2018-07-16 14:32         ` Anand Jain
  0 siblings, 0 replies; 6+ messages in thread
From: Anand Jain @ 2018-07-16 14:32 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 05/29/2018 07:19 PM, David Sterba wrote:
> On Tue, May 29, 2018 at 01:40:34PM +0800, Anand Jain wrote:
>>>>> v3->v4: As we traverse through the seed devices, fs_devices gets updated
>>>>>      to the child seed fs_devices, so make sure the same fs_devices pointer
>>>>>      is used for mutex_unlock() as was used for mutex_lock().
>>>>
>>>> Well, now that I see the change, shouldn't we always hold the
>>>> device_list_mutex of the fs_devices that's being processed? I.e. each
>>>> time it's switched, the previous one is unlocked and the new one locked.
>>>
>>>    No, David. That's because we organize seed devices under their parent
>>>    fs_devices ((fs_devices::seed)::seed, and so on), and they are a local
>>>    cloned copy of the original seed fs_devices. So the parent's
>>>    fs_devices::device_list_mutex lock will suffice.
>>
>>    On second thought, though theoretically you are right, practically
>>    there isn't any use case which would benefit from the intricate
>>    locking you suggested above.
> 
> I don't think it's intricate or complex, just lock the fs_devices that
> can potentially be modified in the following loop.
> 
> Schematically:
> 
> function called with some fs_devices
> 
> 	lock fs_devices->device_list_mutex
> again:
> 	foreach device in fs_devices
> 		if ...
> 			fs_devices counters get changed after device
> 			deletion
> 	endfor
> 
> 	if (fs_devices->seed)
> 		unlock fs_devices->device_list_mutex
> 		fs_devices = fs_devices->seed
> 		lock fs_devices->device_list_mutex      <-- lock the new one
> 		goto again
> 	endif
> 
> 	unlock fs_devices->device_list_mutex            <-- correctly unlock
> endfunc
> 
>>    I am following this method:
>>    By holding the parent fs_devices lock (which is also the mounted
>>    fs_devices lock) we implicitly lock its dependent cloned seed
>>    fs_devices, as to reach a cloned seed device we first need to
>>    traverse from the parent fs_devices.
> 
> Locking the parent fs_devices might be the right scheme, but if any
> child fs_devices pointer gets passed to a random function that will
> change it, then the locking will not protect it.

If it's not holding the parent fs_devices lock then it would be a bug,
and if it's already holding the parent lock then the child lock isn't
necessary at all.

I have also sent
   [DOC] BTRFS Volume operations, Device Lists and Locks all in one page
to give a holistic view.

The approach of taking the per-fs_devices lock as fs_devices::seed is
traversed is not too convincing to me; it isn't necessary.

Thanks, Anand


> This might need a deeper audit of how the seeding device is done, and
> maybe some helpers that e.g. lock the whole chain so all of them are
> protected.
