public inbox for linux-btrfs@vger.kernel.org
From: Anand Jain <anand.jain@oracle.com>
To: Johannes Thumshirn <jthumshirn@suse.de>,
	dsterba@suse.cz, David Sterba <dsterba@suse.com>,
	Qu Wenruo <wqu@suse.com>,
	Linux BTRFS Mailinglist <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device()
Date: Thu, 14 Nov 2019 18:56:54 +0800
Message-ID: <9fb09a95-ec34-0a45-8f4b-97a6467a2c81@oracle.com>
In-Reply-To: <4a86d0f6-94cb-24a7-05d1-5297673ac248@suse.de>

On 14/11/19 4:48 PM, Johannes Thumshirn wrote:
> On 13/11/2019 15:58, David Sterba wrote:
>> On Wed, Nov 13, 2019 at 11:27:23AM +0100, Johannes Thumshirn wrote:
>>> In btrfs_close_one_device() we're allocating a new device and if this
>>> fails we BUG().
>>>
>>> Move the allocation to the top of the function and return an error in case
>>> it failed.
>>>
>>> The BUG_ON() is temporarily moved to close_fs_devices(), the caller of
>>> btrfs_close_one_device() as further work is pending to untangle this.
>>>
>>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>>> ---
>>>   fs/btrfs/volumes.c | 27 +++++++++++++++++++++------
>>>   1 file changed, 21 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 5ee26e7fca32..0a2a73907563 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -1061,12 +1061,17 @@ static void btrfs_close_bdev(struct btrfs_device *device)
>>>   	blkdev_put(device->bdev, device->mode);
>>>   }
>>>   
>>> -static void btrfs_close_one_device(struct btrfs_device *device)
>>> +static int btrfs_close_one_device(struct btrfs_device *device)
>>>   {
>>>   	struct btrfs_fs_devices *fs_devices = device->fs_devices;
>>>   	struct btrfs_device *new_device;
>>>   	struct rcu_string *name;
>>>   
>>> +	new_device = btrfs_alloc_device(NULL, &device->devid,
>>> +					device->uuid);
>>> +	if (IS_ERR(new_device))
>>> +		goto err_close_device;
>>> +
>>>   	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
>>>   	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
>>>   		list_del_init(&device->dev_alloc_list);
>>> @@ -1080,10 +1085,6 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>>   	if (device->bdev)
>>>   		fs_devices->open_devices--;
>>>   
>>> -	new_device = btrfs_alloc_device(NULL, &device->devid,
>>> -					device->uuid);
>>> -	BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
>>> -
>>>   	/* Safe because we are under uuid_mutex */
>>>   	if (device->name) {
>>>   		name = rcu_string_strdup(device->name->str, GFP_NOFS);
>>> @@ -1096,18 +1097,32 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>>   
>>>   	synchronize_rcu();
>>>   	btrfs_free_device(device);
>>> +
>>> +	return 0;
>>> +
>>> +err_close_device:
>>> +	btrfs_close_bdev(device);
>>> +	if (device->bdev) {
>>> +		fs_devices->open_devices--;
>>> +		btrfs_sysfs_rm_device_link(fs_devices, device);
>>> +		device->bdev = NULL;
>>> +	}
>>
>> I don't understand this part: the 'device' pointer comes from the
>> argument, so it is the device we want to delete from the list, and for
>> that all the state bit tests, bdev close, list replace rcu and
>> synchronize_rcu should happen -- in case we have a newly allocated
>> new_device.
>>
>> What I don't understand is how the short version after the label
>> err_close_device: is correct. The device is still left in the list, but
>> with a NULL bdev, while rw_devices and missing_devices are untouched.
>>
>> That closing a device needs to allocate memory for a new device instead
>> of reinitializing it is stupid, but with the simplified device closing
>> I'm not sure the state is well defined.
> 
> As we couldn't allocate memory to remove the device from the list, we
> have to keep it in the list (technically even leaking some memory here).
> 
> What we definitely need to do is clear the ->bdev pointer, otherwise
> we'll trip over a NULL pointer in open_fs_devices().
> 
> open_fs_devices() will traverse the list and call
> btrfs_open_one_device(). This will fail as device->bdev is (still) set,
> thus latest_dev is NULL, and then 'fs_devices->latest_bdev =
> latest_dev->bdev;' will blow up.
> 
> If you have a better solution I'm all ears. This is what I came up with
> to tackle the problem of half initialized devices.
> 
> One thing we could do though is call btrfs_free_stale_devices() in the
> error case.
> 
> Byte,
> 	Johannes
> 
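
The failure Johannes describes above can be modelled outside the kernel.
What follows is a minimal userspace sketch, not the btrfs code: struct
model_device, model_open_one_device() and model_open_devices() are
hypothetical stand-ins, and both the rule that an open fails while bdev
is still set and the generation comparison are assumptions taken from
the description in this thread rather than from volumes.c. It shows why
the caller can end up with a NULL latest device and has to check for it
before dereferencing.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct btrfs_device; not the real layout. */
struct model_device {
	int bdev_set;             /* nonzero models a stale, non-NULL device->bdev */
	unsigned long generation; /* assumed tie-breaker for picking the latest device */
};

/*
 * Models btrfs_open_one_device() as described in the thread: the open
 * fails when the device still carries a bdev from an earlier open.
 */
static int model_open_one_device(struct model_device *dev)
{
	if (dev->bdev_set)
		return -1;
	dev->bdev_set = 1;
	return 0;
}

/*
 * Models the traversal in open_fs_devices(): returns the device with the
 * highest generation among those that opened, or NULL when none opened.
 * The caller must handle NULL instead of reading latest->... blindly.
 */
static struct model_device *model_open_devices(struct model_device *devs,
					       size_t n)
{
	struct model_device *latest = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (model_open_one_device(&devs[i]))
			continue; /* skipped, 'latest' is left untouched */
		if (!latest || devs[i].generation > latest->generation)
			latest = &devs[i];
	}
	return latest;
}
```

In this model a list holding only a device with a stale bdev yields
NULL, which is the case where an unconditional latest_dev->bdev would
blow up. Clearing bdev in the error path, as the patch does, lets the
next open succeed.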

Johannes,

   Thanks for attempting to fix this.

   I wrote comments about this unoptimized code here [1].

   [1] ML email thread 'invalid opcode in close_fs_devices':
   https://groups.google.com/forum/#!msg/syzkaller-bugs/eSgcqygYaXE/6wuz-0jMCwAJ

   You may want to review it.

   Yes, David is right to question why a closed device still remains in
   the dev_alloc_list after the close in this patch.

Thanks, Anand
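
The accounting concern raised in the review (a device left in the list
while rw_devices and missing_devices stay untouched) can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions: struct sketch_fs_devices, struct sketch_device and
sketch_close_in_place() are hypothetical names, the boolean fields stand
in for the dev_state bits and list membership, and the special case for
BTRFS_DEV_REPLACE_DEVID is left out. It also ignores RCU readers, which
the list replace and synchronize_rcu in the real function account for,
so it is not a proposal for volumes.c. It shows a close that allocates
nothing and therefore has no error path that could skip the counters.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the counters in struct btrfs_fs_devices. */
struct sketch_fs_devices {
	int open_devices;
	int rw_devices;
	int missing_devices;
};

/* Hypothetical stand-in for struct btrfs_device. */
struct sketch_device {
	unsigned long devid; /* identity, preserved across the close */
	bool has_bdev;       /* models device->bdev != NULL */
	bool writeable;      /* models BTRFS_DEV_STATE_WRITEABLE */
	bool missing;        /* models BTRFS_DEV_STATE_MISSING */
	bool on_alloc_list;  /* models membership in dev_alloc_list */
};

/*
 * Close a device by reinitializing it in place. Every counter that the
 * device contributed to is decremented before its state is reset, so the
 * totals and the per-device state cannot drift apart.
 */
static void sketch_close_in_place(struct sketch_fs_devices *fs,
				  struct sketch_device *dev)
{
	if (dev->writeable) {
		dev->on_alloc_list = false;
		fs->rw_devices--;
	}
	if (dev->missing)
		fs->missing_devices--;
	if (dev->has_bdev)
		fs->open_devices--;

	/* Reset everything except the identity (devid). */
	dev->has_bdev = false;
	dev->writeable = false;
	dev->missing = false;
}
```

Closing a writeable, open device this way leaves all three counters one
lower or unchanged as appropriate, and the device is off the allocation
list with no bdev, which is the well-defined end state the review asks
about.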

Thread overview: 24+ messages
2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 1/7] btrfs: decrement number of open devices after closing the device not before Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device() Johannes Thumshirn
2019-11-13 14:58   ` David Sterba
2019-11-14  8:48     ` Johannes Thumshirn
2019-11-14 10:56       ` Anand Jain [this message]
2019-11-14 12:03         ` Johannes Thumshirn
2019-11-14 13:02         ` Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 3/7] btrfs: handle allocation failure in strdup Johannes Thumshirn
2019-11-14 11:00   ` Anand Jain
2019-11-15  9:39     ` David Sterba
2019-11-15 21:11   ` Nikolay Borisov
2019-11-13 10:27 ` [PATCH v2 4/7] btrfs: handle error return of close_fs_devices() Johannes Thumshirn
2019-11-13 15:00   ` David Sterba
2019-11-14  8:15     ` Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 5/7] btrfs: remove final BUG_ON() in close_fs_devices() Johannes Thumshirn
2019-11-13 15:02   ` David Sterba
2019-11-14  9:01     ` Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 6/7] btrfs: change btrfs_fs_devices::seeding to bool Johannes Thumshirn
2019-11-14 11:04   ` Anand Jain
2019-11-13 10:27 ` [PATCH v2 7/7] btrfs: change btrfs_fs_devices::rotating to bool Johannes Thumshirn
2019-11-14 11:05   ` Anand Jain
2019-11-13 11:56 ` [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Qu Wenruo
2019-11-13 15:05 ` David Sterba
