virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] virtio-blk: Don't free ida when disk is in use
@ 2012-12-19 19:40 Alexander Graf
  2012-12-20  4:15 ` Rusty Russell
  2012-12-20 10:54 ` Michael S. Tsirkin
  0 siblings, 2 replies; 13+ messages in thread
From: Alexander Graf @ 2012-12-19 19:40 UTC (permalink / raw)
  To: virtualization; +Cc: kvm, mst

When a file system is mounted on a virtio-blk disk, we then remove it
and then reattach it, the reattached disk gets the same disk name and
ids as the hot removed one.

This leads to very nasty effects - mostly rendering the newly attached
device completely unusable.

Trying what happens when I do the same thing with a USB device, I saw
that the sd node simply doesn't get free'd when a device gets forcefully
removed.

Imitate the same behavior for vd devices. This way broken vd devices
simply are never free'd and newly attached ones keep working just fine.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 drivers/block/virtio_blk.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 0bdde8f..07a18e2 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -889,6 +889,7 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
 {
 	struct virtio_blk *vblk = vdev->priv;
 	int index = vblk->index;
+	int refc;
 
 	/* Prevent config work handler from accessing the device. */
 	mutex_lock(&vblk->config_lock);
@@ -903,11 +904,15 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
 
 	flush_work(&vblk->config_work);
 
+	refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
 	put_disk(vblk->disk);
 	mempool_destroy(vblk->pool);
 	vdev->config->del_vqs(vdev);
 	kfree(vblk);
-	ida_simple_remove(&vd_index_ida, index);
+
+	/* Only free device id if we don't have any users */
+	if (refc == 1)
+		ida_simple_remove(&vd_index_ida, index);
 }
 
 #ifdef CONFIG_PM
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-19 19:40 [PATCH] virtio-blk: Don't free ida when disk is in use Alexander Graf
@ 2012-12-20  4:15 ` Rusty Russell
  2012-12-20  8:46   ` Asias He
  2012-12-20 10:54 ` Michael S. Tsirkin
  1 sibling, 1 reply; 13+ messages in thread
From: Rusty Russell @ 2012-12-20  4:15 UTC (permalink / raw)
  To: Alexander Graf, virtualization; +Cc: kvm, mst

Alexander Graf <agraf@suse.de> writes:

> When a file system is mounted on a virtio-blk disk, we then remove it
> and then reattach it, the reattached disk gets the same disk name and
> ids as the hot removed one.
>
> This leads to very nasty effects - mostly rendering the newly attached
> device completely unusable.
>
> Trying what happens when I do the same thing with a USB device, I saw
> that the sd node simply doesn't get free'd when a device gets forcefully
> removed.
>
> Imitate the same behavior for vd devices. This way broken vd devices
> simply are never free'd and newly attached ones keep working just fine.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>

I think deserves a CC:stable, no?

I've put it in my pending queue for *next* merge window for now...

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-20  4:15 ` Rusty Russell
@ 2012-12-20  8:46   ` Asias He
  2012-12-20  9:41     ` Alexander Graf
  0 siblings, 1 reply; 13+ messages in thread
From: Asias He @ 2012-12-20  8:46 UTC (permalink / raw)
  To: Rusty Russell; +Cc: mst, kvm, virtualization

On 12/20/2012 12:15 PM, Rusty Russell wrote:
> Alexander Graf <agraf@suse.de> writes:
> 
>> When a file system is mounted on a virtio-blk disk, we then remove it
>> and then reattach it, the reattached disk gets the same disk name and
>> ids as the hot removed one.
>>
>> This leads to very nasty effects - mostly rendering the newly attached
>> device completely unusable.
>>
>> Trying what happens when I do the same thing with a USB device, I saw
>> that the sd node simply doesn't get free'd when a device gets forcefully
>> removed.
>>
>> Imitate the same behavior for vd devices. This way broken vd devices
>> simply are never free'd and newly attached ones keep working just fine.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
> 
> I think deserves a CC:stable, no?
> 
> I've put it in my pending queue for *next* merge window for now...

Thanks for looking into this, Alexander!

I also noticed this problem. The problem is that, if you hot-unplug a
mounted or opened disk, the disk is in opened state. Next time, when you
hotplug the same disk. The kernel thought it was opened already. The
driver will use the wrong gendisk data structure in bdev.

blkdev_open
   blkdev_get
      __blkdev_get
         if (!bdev->bd_openers) {    <-- Here, bd_disk not got updated
                                         still points to old one
           bdev->bd_disk = disk;
           bdev->bd_queue = disk->queue;
           ...

I tried something like this:

@@ -854,6 +862,19 @@ static int __devinit virtblk_probe(struct
virtio_device *vdev)
                blk_queue_io_opt(q, blk_size * opt_io_size);

        add_disk(vblk->disk);
+
+       for (i = 0; i < 1 << PART_BITS; i++) {
+               bdev = bdget_disk(vblk->disk, i);
+               if (bdev) {
+                       bdev->bd_disk = vblk->disk;
+                       bdev->bd_queue = q;
+                       bdput(bdev);
+               }
+       }


1) Before:
---> hot-plug
[   35.730183] virtio_blk: virtblk_probe: vblk=ffff880078b0e000,
disk=ffff880078d54c00, q=ffff88007f88b3d8
[   35.735352] virtio_blk:    virtblk_ioctl: vblk=ffff880078b0e000,
disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8
---> hot-unplug

---> hot-plug
[   83.570480] virtio_blk: virtblk_probe: vblk=ffff880078b0e000,
disk=ffff880078d55800, q=ffff88007f88bb40
[   83.575614] virtio_blk:   virtblk_ioctl: vblk=ffff880078b0e000,
disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8

The disk points to old one ffff880078d54c00. The queue also points to
old one ffff88007f88b3d8.

2) After:
---> hot-plug
[   68.035063] virtio_blk: virtblk_probe: vblk=ffff880079b20000,
disk=ffff88007f9ebc00, q=ffff8800784d8ed0
[   68.041140] virtio_blk:    virtblk_ioctl: vblk=ffff880079b20000,
disk=ffff88007f9ebc00, bdev=ffff88007ab2c000, q=ffff8800784d8ed0
---> hot-unplug

---> hot-plug
[   86.317706] virtio_blk: virtblk_probe: vblk=ffff880079b20000,
disk=ffff88007f9eb000, q=ffff8800784d9638
[   86.322535] virtio_blk:   virtblk_ioctl: vblk=ffff880079b20000,
disk=ffff88007f9eb000, bdev=ffff88007ab2c000, q=ffff8800784d9638

The disk and queue are updated correctly. The attached disk works and
still uses the old name.

-- 
Asias

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-20  8:46   ` Asias He
@ 2012-12-20  9:41     ` Alexander Graf
  2012-12-21  1:48       ` Asias He
  2013-01-02  5:09       ` Rusty Russell
  0 siblings, 2 replies; 13+ messages in thread
From: Alexander Graf @ 2012-12-20  9:41 UTC (permalink / raw)
  To: Asias He
  Cc: mst@redhat.com, kvm@vger.kernel.org,
	virtualization@lists.linux-foundation.org



On 20.12.2012, at 09:46, Asias He <asias@redhat.com> wrote:

> On 12/20/2012 12:15 PM, Rusty Russell wrote:
>> Alexander Graf <agraf@suse.de> writes:
>> 
>>> When a file system is mounted on a virtio-blk disk, we then remove it
>>> and then reattach it, the reattached disk gets the same disk name and
>>> ids as the hot removed one.
>>> 
>>> This leads to very nasty effects - mostly rendering the newly attached
>>> device completely unusable.
>>> 
>>> Trying what happens when I do the same thing with a USB device, I saw
>>> that the sd node simply doesn't get free'd when a device gets forcefully
>>> removed.
>>> 
>>> Imitate the same behavior for vd devices. This way broken vd devices
>>> simply are never free'd and newly attached ones keep working just fine.
>>> 
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> 
>> I think deserves a CC:stable, no?

I wanted to make sure people can comment on it first. But yes, it should go into stable when accepted.

>> 
>> I've put it in my pending queue for *next* merge window for now...
> 
> Thanks for looking into this, Alexander!
> 
> I also noticed this problem. The problem is that, if you hot-unplug a
> mounted or opened disk, the disk is in opened state. Next time, when you
> hotplug the same disk. The kernel thought it was opened already. The
> driver will use the wrong gendisk data structure in bdev.

Exactly :)

> 
> blkdev_open
>   blkdev_get
>      __blkdev_get
>         if (!bdev->bd_openers) {    <-- Here, bd_disk not got updated
>                                         still points to old one
>           bdev->bd_disk = disk;
>           bdev->bd_queue = disk->queue;
>           ...
> 
> I tried something like this:
> 
> @@ -854,6 +862,19 @@ static int __devinit virtblk_probe(struct
> virtio_device *vdev)
>                blk_queue_io_opt(q, blk_size * opt_io_size);
> 
>        add_disk(vblk->disk);
> +
> +       for (i = 0; i < 1 << PART_BITS; i++) {
> +               bdev = bdget_disk(vblk->disk, i);
> +               if (bdev) {
> +                       bdev->bd_disk = vblk->disk;
> +                       bdev->bd_queue = q;
> +                       bdput(bdev);

Does this stack? What if the device was mounted twice? Or mounted and dd'ed at the same time? Then your refcount here is wrong.

Also this means that you're free'ing a bdev even though there's still a reference held to it. That means whoever holds it could later use that pointer and access bogus memory.


Alex

> +               }
> +       }
> 
> 
> 1) Before:
> ---> hot-plug
> [   35.730183] virtio_blk: virtblk_probe: vblk=ffff880078b0e000,
> disk=ffff880078d54c00, q=ffff88007f88b3d8
> [   35.735352] virtio_blk:    virtblk_ioctl: vblk=ffff880078b0e000,
> disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8
> ---> hot-unplug
> 
> ---> hot-plug
> [   83.570480] virtio_blk: virtblk_probe: vblk=ffff880078b0e000,
> disk=ffff880078d55800, q=ffff88007f88bb40
> [   83.575614] virtio_blk:   virtblk_ioctl: vblk=ffff880078b0e000,
> disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8
> 
> The disk points to old one ffff880078d54c00. The queue also points to
> old one ffff88007f88b3d8.
> 
> 2) After:
> ---> hot-plug
> [   68.035063] virtio_blk: virtblk_probe: vblk=ffff880079b20000,
> disk=ffff88007f9ebc00, q=ffff8800784d8ed0
> [   68.041140] virtio_blk:    virtblk_ioctl: vblk=ffff880079b20000,
> disk=ffff88007f9ebc00, bdev=ffff88007ab2c000, q=ffff8800784d8ed0
> ---> hot-unplug
> 
> ---> hot-plug
> [   86.317706] virtio_blk: virtblk_probe: vblk=ffff880079b20000,
> disk=ffff88007f9eb000, q=ffff8800784d9638
> [   86.322535] virtio_blk:   virtblk_ioctl: vblk=ffff880079b20000,
> disk=ffff88007f9eb000, bdev=ffff88007ab2c000, q=ffff8800784d9638
> 
> The disk and queue are updated correctly. The attached disk works and
> still uses the old name.
> 
> -- 
> Asias

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-19 19:40 [PATCH] virtio-blk: Don't free ida when disk is in use Alexander Graf
  2012-12-20  4:15 ` Rusty Russell
@ 2012-12-20 10:54 ` Michael S. Tsirkin
  2012-12-20 11:27   ` Alexander Graf
                     ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: Michael S. Tsirkin @ 2012-12-20 10:54 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, virtualization

On Wed, Dec 19, 2012 at 08:40:15PM +0100, Alexander Graf wrote:
> When a file system is mounted on a virtio-blk disk, we then remove it
> and then reattach it, the reattached disk gets the same disk name and
> ids as the hot removed one.
> 
> This leads to very nasty effects - mostly rendering the newly attached
> device completely unusable.
> 
> Trying what happens when I do the same thing with a USB device, I saw
> that the sd node simply doesn't get free'd when a device gets forcefully
> removed.
> 
> Imitate the same behavior for vd devices. This way broken vd devices
> simply are never free'd and newly attached ones keep working just fine.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  drivers/block/virtio_blk.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 0bdde8f..07a18e2 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -889,6 +889,7 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>  {
>  	struct virtio_blk *vblk = vdev->priv;
>  	int index = vblk->index;
> +	int refc;
>  
>  	/* Prevent config work handler from accessing the device. */
>  	mutex_lock(&vblk->config_lock);
> @@ -903,11 +904,15 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>  
>  	flush_work(&vblk->config_work);
>  
> +	refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
>  	put_disk(vblk->disk);
>  	mempool_destroy(vblk->pool);
>  	vdev->config->del_vqs(vdev);
>  	kfree(vblk);
> -	ida_simple_remove(&vd_index_ida, index);
> +
> +	/* Only free device id if we don't have any users */
> +	if (refc == 1)
> +		ida_simple_remove(&vd_index_ida, index);
>  }
>  
>  #ifdef CONFIG_PM

Network devices take the approach of retrying every second.
Donnu if it makes sense here.


> -- 
> 1.7.12.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-20 10:54 ` Michael S. Tsirkin
@ 2012-12-20 11:27   ` Alexander Graf
  2012-12-21  1:57   ` Asias He
       [not found]   ` <62D6A704-CB88-4A8C-A5F3-6BD3C267895F@suse.de>
  2 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2012-12-20 11:27 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, virtualization


On 20.12.2012, at 11:54, Michael S. Tsirkin wrote:

> On Wed, Dec 19, 2012 at 08:40:15PM +0100, Alexander Graf wrote:
>> When a file system is mounted on a virtio-blk disk, we then remove it
>> and then reattach it, the reattached disk gets the same disk name and
>> ids as the hot removed one.
>> 
>> This leads to very nasty effects - mostly rendering the newly attached
>> device completely unusable.
>> 
>> Trying what happens when I do the same thing with a USB device, I saw
>> that the sd node simply doesn't get free'd when a device gets forcefully
>> removed.
>> 
>> Imitate the same behavior for vd devices. This way broken vd devices
>> simply are never free'd and newly attached ones keep working just fine.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> drivers/block/virtio_blk.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>> index 0bdde8f..07a18e2 100644
>> --- a/drivers/block/virtio_blk.c
>> +++ b/drivers/block/virtio_blk.c
>> @@ -889,6 +889,7 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>> {
>> 	struct virtio_blk *vblk = vdev->priv;
>> 	int index = vblk->index;
>> +	int refc;
>> 
>> 	/* Prevent config work handler from accessing the device. */
>> 	mutex_lock(&vblk->config_lock);
>> @@ -903,11 +904,15 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>> 
>> 	flush_work(&vblk->config_work);
>> 
>> +	refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
>> 	put_disk(vblk->disk);
>> 	mempool_destroy(vblk->pool);
>> 	vdev->config->del_vqs(vdev);
>> 	kfree(vblk);
>> -	ida_simple_remove(&vd_index_ida, index);
>> +
>> +	/* Only free device id if we don't have any users */
>> +	if (refc == 1)
>> +		ida_simple_remove(&vd_index_ida, index);
>> }
>> 
>> #ifdef CONFIG_PM
> 
> Network devices take the approach of retrying every second.
> Donnu if it makes sense here.

I would rather think the 100% right approach would be a recursive unrolling of all users bottom to top. Force unmount. Force close all fd's. I'm not sure why that doesn't happen today, but it doesn't :).


Alex

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
       [not found]   ` <62D6A704-CB88-4A8C-A5F3-6BD3C267895F@suse.de>
@ 2012-12-20 11:38     ` Michael S. Tsirkin
  2012-12-20 11:47       ` Alexander Graf
  2012-12-21  1:58     ` Asias He
  1 sibling, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2012-12-20 11:38 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, virtualization

On Thu, Dec 20, 2012 at 12:27:53PM +0100, Alexander Graf wrote:
> 
> On 20.12.2012, at 11:54, Michael S. Tsirkin wrote:
> 
> > On Wed, Dec 19, 2012 at 08:40:15PM +0100, Alexander Graf wrote:
> >> When a file system is mounted on a virtio-blk disk, we then remove it
> >> and then reattach it, the reattached disk gets the same disk name and
> >> ids as the hot removed one.
> >> 
> >> This leads to very nasty effects - mostly rendering the newly attached
> >> device completely unusable.
> >> 
> >> Trying what happens when I do the same thing with a USB device, I saw
> >> that the sd node simply doesn't get free'd when a device gets forcefully
> >> removed.
> >> 
> >> Imitate the same behavior for vd devices. This way broken vd devices
> >> simply are never free'd and newly attached ones keep working just fine.
> >> 
> >> Signed-off-by: Alexander Graf <agraf@suse.de>
> >> ---
> >> drivers/block/virtio_blk.c | 7 ++++++-
> >> 1 file changed, 6 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> >> index 0bdde8f..07a18e2 100644
> >> --- a/drivers/block/virtio_blk.c
> >> +++ b/drivers/block/virtio_blk.c
> >> @@ -889,6 +889,7 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
> >> {
> >> 	struct virtio_blk *vblk = vdev->priv;
> >> 	int index = vblk->index;
> >> +	int refc;
> >> 
> >> 	/* Prevent config work handler from accessing the device. */
> >> 	mutex_lock(&vblk->config_lock);
> >> @@ -903,11 +904,15 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
> >> 
> >> 	flush_work(&vblk->config_work);
> >> 
> >> +	refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
> >> 	put_disk(vblk->disk);
> >> 	mempool_destroy(vblk->pool);
> >> 	vdev->config->del_vqs(vdev);
> >> 	kfree(vblk);
> >> -	ida_simple_remove(&vd_index_ida, index);
> >> +
> >> +	/* Only free device id if we don't have any users */
> >> +	if (refc == 1)
> >> +		ida_simple_remove(&vd_index_ida, index);
> >> }
> >> 
> >> #ifdef CONFIG_PM
> > 
> > Network devices take the approach of retrying every second.
> > Donnu if it makes sense here.
> 
> I would rather think the 100% right approach would be a recursive unrolling of all users bottom to top. Force unmount. Force close all fd's.
> I'm not sure why that doesn't happen today, but it doesn't :).
> 
> 
> Alex

Especially force close fd's won't be easy. A simpler alternative
would be preventing unplug by taking some reference count.

-- 
MST

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-20 11:38     ` Michael S. Tsirkin
@ 2012-12-20 11:47       ` Alexander Graf
  2012-12-21  2:02         ` Asias He
  0 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2012-12-20 11:47 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, virtualization


On 20.12.2012, at 12:38, Michael S. Tsirkin wrote:

> On Thu, Dec 20, 2012 at 12:27:53PM +0100, Alexander Graf wrote:
>> 
>> On 20.12.2012, at 11:54, Michael S. Tsirkin wrote:
>> 
>>> On Wed, Dec 19, 2012 at 08:40:15PM +0100, Alexander Graf wrote:
>>>> When a file system is mounted on a virtio-blk disk, we then remove it
>>>> and then reattach it, the reattached disk gets the same disk name and
>>>> ids as the hot removed one.
>>>> 
>>>> This leads to very nasty effects - mostly rendering the newly attached
>>>> device completely unusable.
>>>> 
>>>> Trying what happens when I do the same thing with a USB device, I saw
>>>> that the sd node simply doesn't get free'd when a device gets forcefully
>>>> removed.
>>>> 
>>>> Imitate the same behavior for vd devices. This way broken vd devices
>>>> simply are never free'd and newly attached ones keep working just fine.
>>>> 
>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>> ---
>>>> drivers/block/virtio_blk.c | 7 ++++++-
>>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>> 
>>>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>>>> index 0bdde8f..07a18e2 100644
>>>> --- a/drivers/block/virtio_blk.c
>>>> +++ b/drivers/block/virtio_blk.c
>>>> @@ -889,6 +889,7 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>>>> {
>>>> 	struct virtio_blk *vblk = vdev->priv;
>>>> 	int index = vblk->index;
>>>> +	int refc;
>>>> 
>>>> 	/* Prevent config work handler from accessing the device. */
>>>> 	mutex_lock(&vblk->config_lock);
>>>> @@ -903,11 +904,15 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>>>> 
>>>> 	flush_work(&vblk->config_work);
>>>> 
>>>> +	refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
>>>> 	put_disk(vblk->disk);
>>>> 	mempool_destroy(vblk->pool);
>>>> 	vdev->config->del_vqs(vdev);
>>>> 	kfree(vblk);
>>>> -	ida_simple_remove(&vd_index_ida, index);
>>>> +
>>>> +	/* Only free device id if we don't have any users */
>>>> +	if (refc == 1)
>>>> +		ida_simple_remove(&vd_index_ida, index);
>>>> }
>>>> 
>>>> #ifdef CONFIG_PM
>>> 
>>> Network devices take the approach of retrying every second.
>>> Donnu if it makes sense here.
>> 
>> I would rather think the 100% right approach would be a recursive unrolling of all users bottom to top. Force unmount. Force close all fd's.
>> I'm not sure why that doesn't happen today, but it doesn't :).
>> 
>> 
>> Alex
> 
> Especially force close fd's won't be easy. A simpler alternative
> would be preventing unplug by taking some reference count.

That's what happens when you do

<add>
mount /mnt
ls /mnt
<remove>

-- here remove failed for me --

But when you do

<add>
mount /mnt
<remove>

then remove works and we get the above problem.

I have to admit that I'm really not a big fan of preventing unplug though. In some cases, preventing simply doesn't work, for example when you unplug a USB stick or when your iscsi remote just vanishes.


Alex

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-20  9:41     ` Alexander Graf
@ 2012-12-21  1:48       ` Asias He
  2013-01-02  5:09       ` Rusty Russell
  1 sibling, 0 replies; 13+ messages in thread
From: Asias He @ 2012-12-21  1:48 UTC (permalink / raw)
  To: Alexander Graf
  Cc: mst@redhat.com, kvm@vger.kernel.org,
	virtualization@lists.linux-foundation.org

On 12/20/2012 05:41 PM, Alexander Graf wrote:
> 
> 
> On 20.12.2012, at 09:46, Asias He <asias@redhat.com> wrote:
> 
>> On 12/20/2012 12:15 PM, Rusty Russell wrote:
>>> Alexander Graf <agraf@suse.de> writes:
>>>
>>>> When a file system is mounted on a virtio-blk disk, we then remove it
>>>> and then reattach it, the reattached disk gets the same disk name and
>>>> ids as the hot removed one.
>>>>
>>>> This leads to very nasty effects - mostly rendering the newly attached
>>>> device completely unusable.
>>>>
>>>> Trying what happens when I do the same thing with a USB device, I saw
>>>> that the sd node simply doesn't get free'd when a device gets forcefully
>>>> removed.
>>>>
>>>> Imitate the same behavior for vd devices. This way broken vd devices
>>>> simply are never free'd and newly attached ones keep working just fine.
>>>>
>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>
>>> I think deserves a CC:stable, no?
> 
> I wanted to make sure people can comment on it first. But yes, it should go into stable when accepted.
> 
>>>
>>> I've put it in my pending queue for *next* merge window for now...
>>
>> Thanks for looking into this, Alexander!
>>
>> I also noticed this problem. The problem is that, if you hot-unplug a
>> mounted or opened disk, the disk is in opened state. Next time, when you
>> hotplug the same disk. The kernel thought it was opened already. The
>> driver will use the wrong gendisk data structure in bdev.
> 
> Exactly :)
> 
>>
>> blkdev_open
>>   blkdev_get
>>      __blkdev_get
>>         if (!bdev->bd_openers) {    <-- Here, bd_disk not got updated
>>                                         still points to old one
>>           bdev->bd_disk = disk;
>>           bdev->bd_queue = disk->queue;
>>           ...
>>
>> I tried something like this:
>>
>> @@ -854,6 +862,19 @@ static int __devinit virtblk_probe(struct
>> virtio_device *vdev)
>>                blk_queue_io_opt(q, blk_size * opt_io_size);
>>
>>        add_disk(vblk->disk);
>> +
>> +       for (i = 0; i < 1 << PART_BITS; i++) {
>> +               bdev = bdget_disk(vblk->disk, i);
>> +               if (bdev) {
>> +                       bdev->bd_disk = vblk->disk;
>> +                       bdev->bd_queue = q;
>> +                       bdput(bdev);
> 
> Does this stack? What if the device was mounted twice? Or mounted and dd'ed at the same time? Then your refcount here is wrong.

This is in the probe path. And there is no extra refcount here.
bdev->bd_openers stays what it was before the remove.

> Also this means that you're free'ing a bdev even though there's still a reference held to it. That means whoever holds it could later use that pointer and access bogus memory.

The bdev is not freed.

> Alex
> 
>> +               }
>> +       }
>>
>>
>> 1) Before:
>> ---> hot-plug
>> [   35.730183] virtio_blk: virtblk_probe: vblk=ffff880078b0e000,
>> disk=ffff880078d54c00, q=ffff88007f88b3d8
>> [   35.735352] virtio_blk:    virtblk_ioctl: vblk=ffff880078b0e000,
>> disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8
>> ---> hot-unplug
>>
>> ---> hot-plug
>> [   83.570480] virtio_blk: virtblk_probe: vblk=ffff880078b0e000,
>> disk=ffff880078d55800, q=ffff88007f88bb40
>> [   83.575614] virtio_blk:   virtblk_ioctl: vblk=ffff880078b0e000,
>> disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8
>>
>> The disk points to old one ffff880078d54c00. The queue also points to
>> old one ffff88007f88b3d8.
>>
>> 2) After:
>> ---> hot-plug
>> [   68.035063] virtio_blk: virtblk_probe: vblk=ffff880079b20000,
>> disk=ffff88007f9ebc00, q=ffff8800784d8ed0
>> [   68.041140] virtio_blk:    virtblk_ioctl: vblk=ffff880079b20000,
>> disk=ffff88007f9ebc00, bdev=ffff88007ab2c000, q=ffff8800784d8ed0
>> ---> hot-unplug
>>
>> ---> hot-plug
>> [   86.317706] virtio_blk: virtblk_probe: vblk=ffff880079b20000,
>> disk=ffff88007f9eb000, q=ffff8800784d9638
>> [   86.322535] virtio_blk:   virtblk_ioctl: vblk=ffff880079b20000,
>> disk=ffff88007f9eb000, bdev=ffff88007ab2c000, q=ffff8800784d9638
>>
>> The disk and queue are updated correctly. The attached disk works and
>> still uses the old name.
>>
>> -- 
>> Asias


-- 
Asias

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-20 10:54 ` Michael S. Tsirkin
  2012-12-20 11:27   ` Alexander Graf
@ 2012-12-21  1:57   ` Asias He
       [not found]   ` <62D6A704-CB88-4A8C-A5F3-6BD3C267895F@suse.de>
  2 siblings, 0 replies; 13+ messages in thread
From: Asias He @ 2012-12-21  1:57 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, virtualization

On 12/20/2012 06:54 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 19, 2012 at 08:40:15PM +0100, Alexander Graf wrote:
>> When a file system is mounted on a virtio-blk disk, we then remove it
>> and then reattach it, the reattached disk gets the same disk name and
>> ids as the hot removed one.
>>
>> This leads to very nasty effects - mostly rendering the newly attached
>> device completely unusable.
>>
>> Trying what happens when I do the same thing with a USB device, I saw
>> that the sd node simply doesn't get free'd when a device gets forcefully
>> removed.
>>
>> Imitate the same behavior for vd devices. This way broken vd devices
>> simply are never free'd and newly attached ones keep working just fine.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>>  drivers/block/virtio_blk.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>> index 0bdde8f..07a18e2 100644
>> --- a/drivers/block/virtio_blk.c
>> +++ b/drivers/block/virtio_blk.c
>> @@ -889,6 +889,7 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>>  {
>>  	struct virtio_blk *vblk = vdev->priv;
>>  	int index = vblk->index;
>> +	int refc;
>>  
>>  	/* Prevent config work handler from accessing the device. */
>>  	mutex_lock(&vblk->config_lock);
>> @@ -903,11 +904,15 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>>  
>>  	flush_work(&vblk->config_work);
>>  
>> +	refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
>>  	put_disk(vblk->disk);
>>  	mempool_destroy(vblk->pool);
>>  	vdev->config->del_vqs(vdev);
>>  	kfree(vblk);
>> -	ida_simple_remove(&vd_index_ida, index);
>> +
>> +	/* Only free device id if we don't have any users */
>> +	if (refc == 1)
>> +		ida_simple_remove(&vd_index_ida, index);
>>  }
>>  
>>  #ifdef CONFIG_PM
> 
> Network devices take the approach of retrying every second.
> Donnu if it makes sense here.


virtnet_remove()? I am not seeing the code there.

We are too late in the remove path in virtblk_remove(). No easy way to
come back to the state before the remove. So, giving user warning (tell
him to close the device) and retrying might be a choice.


>> -- 
>> 1.7.12.4
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
> 


-- 
Asias

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
       [not found]   ` <62D6A704-CB88-4A8C-A5F3-6BD3C267895F@suse.de>
  2012-12-20 11:38     ` Michael S. Tsirkin
@ 2012-12-21  1:58     ` Asias He
  1 sibling, 0 replies; 13+ messages in thread
From: Asias He @ 2012-12-21  1:58 UTC (permalink / raw)
  To: Alexander Graf; +Cc: virtualization, kvm, Michael S. Tsirkin

On 12/20/2012 07:27 PM, Alexander Graf wrote:
> 
> On 20.12.2012, at 11:54, Michael S. Tsirkin wrote:
> 
>> On Wed, Dec 19, 2012 at 08:40:15PM +0100, Alexander Graf wrote:
>>> When a file system is mounted on a virtio-blk disk, we then remove it
>>> and then reattach it, the reattached disk gets the same disk name and
>>> ids as the hot removed one.
>>>
>>> This leads to very nasty effects - mostly rendering the newly attached
>>> device completely unusable.
>>>
>>> Trying what happens when I do the same thing with a USB device, I saw
>>> that the sd node simply doesn't get free'd when a device gets forcefully
>>> removed.
>>>
>>> Imitate the same behavior for vd devices. This way broken vd devices
>>> simply are never free'd and newly attached ones keep working just fine.
>>>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>> ---
>>> drivers/block/virtio_blk.c | 7 ++++++-
>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>>> index 0bdde8f..07a18e2 100644
>>> --- a/drivers/block/virtio_blk.c
>>> +++ b/drivers/block/virtio_blk.c
>>> @@ -889,6 +889,7 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>>> {
>>> 	struct virtio_blk *vblk = vdev->priv;
>>> 	int index = vblk->index;
>>> +	int refc;
>>>
>>> 	/* Prevent config work handler from accessing the device. */
>>> 	mutex_lock(&vblk->config_lock);
>>> @@ -903,11 +904,15 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>>>
>>> 	flush_work(&vblk->config_work);
>>>
>>> +	refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
>>> 	put_disk(vblk->disk);
>>> 	mempool_destroy(vblk->pool);
>>> 	vdev->config->del_vqs(vdev);
>>> 	kfree(vblk);
>>> -	ida_simple_remove(&vd_index_ida, index);
>>> +
>>> +	/* Only free device id if we don't have any users */
>>> +	if (refc == 1)
>>> +		ida_simple_remove(&vd_index_ida, index);
>>> }
>>>
>>> #ifdef CONFIG_PM
>>
>> Network devices take the approach of retrying every second.
>> Donnu if it makes sense here.
> 
> I would rather think the 100% right approach would be a recursive unrolling of all users bottom to top. Force unmount. Force close all fd's. I'm not sure why that doesn't happen today, but it doesn't :).

Yes ;-)

> 
> Alex
> 
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
> 


-- 
Asias

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-20 11:47       ` Alexander Graf
@ 2012-12-21  2:02         ` Asias He
  0 siblings, 0 replies; 13+ messages in thread
From: Asias He @ 2012-12-21  2:02 UTC (permalink / raw)
  To: Alexander Graf; +Cc: virtualization, kvm, Michael S. Tsirkin

On 12/20/2012 07:47 PM, Alexander Graf wrote:
> 
> On 20.12.2012, at 12:38, Michael S. Tsirkin wrote:
> 
>> On Thu, Dec 20, 2012 at 12:27:53PM +0100, Alexander Graf wrote:
>>>
>>> On 20.12.2012, at 11:54, Michael S. Tsirkin wrote:
>>>
>>>> On Wed, Dec 19, 2012 at 08:40:15PM +0100, Alexander Graf wrote:
>>>>> When a file system is mounted on a virtio-blk disk, we then remove it
>>>>> and then reattach it, the reattached disk gets the same disk name and
>>>>> ids as the hot removed one.
>>>>>
>>>>> This leads to very nasty effects - mostly rendering the newly attached
>>>>> device completely unusable.
>>>>>
>>>>> Trying what happens when I do the same thing with a USB device, I saw
>>>>> that the sd node simply doesn't get free'd when a device gets forcefully
>>>>> removed.
>>>>>
>>>>> Imitate the same behavior for vd devices. This way broken vd devices
>>>>> simply are never free'd and newly attached ones keep working just fine.
>>>>>
>>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>>> ---
>>>>> drivers/block/virtio_blk.c | 7 ++++++-
>>>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>>>>> index 0bdde8f..07a18e2 100644
>>>>> --- a/drivers/block/virtio_blk.c
>>>>> +++ b/drivers/block/virtio_blk.c
>>>>> @@ -889,6 +889,7 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>>>>> {
>>>>> 	struct virtio_blk *vblk = vdev->priv;
>>>>> 	int index = vblk->index;
>>>>> +	int refc;
>>>>>
>>>>> 	/* Prevent config work handler from accessing the device. */
>>>>> 	mutex_lock(&vblk->config_lock);
>>>>> @@ -903,11 +904,15 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>>>>>
>>>>> 	flush_work(&vblk->config_work);
>>>>>
>>>>> +	refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
>>>>> 	put_disk(vblk->disk);
>>>>> 	mempool_destroy(vblk->pool);
>>>>> 	vdev->config->del_vqs(vdev);
>>>>> 	kfree(vblk);
>>>>> -	ida_simple_remove(&vd_index_ida, index);
>>>>> +
>>>>> +	/* Only free device id if we don't have any users */
>>>>> +	if (refc == 1)
>>>>> +		ida_simple_remove(&vd_index_ida, index);
>>>>> }
>>>>>
>>>>> #ifdef CONFIG_PM
>>>>
>>>> Network devices take the approach of retrying every second.
>>>> Donnu if it makes sense here.
>>>
>>> I would rather think the 100% right approach would be a recursive unrolling of all users bottom to top. Force unmount. Force close all fd's.
>>> I'm not sure why that doesn't happen today, but it doesn't :).
>>>
>>>
>>> Alex
>>
>> Especially force close fd's won't be easy. A simpler alternative
>> would be preventing unplug by taking some reference count.
> 
> That's what happens when you do
> 
> <add>
> mount /mnt
> ls /mnt
> <remove>
> 
> -- here remove failed for me --

What kernel version are you using? Any debug messages available.

> 
> But when you do
> 
> <add>
> mount /mnt
> <remove>
> 
> then remove works and we get the above problem.
> 
> I have to admit that I'm really not a big fan of preventing unplug though. In some cases, preventing simply doesn't work, for example when you unplug a USB stick or when your iscsi remote just vanishes.
> Alex
> 
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
> 


-- 
Asias

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio-blk: Don't free ida when disk is in use
  2012-12-20  9:41     ` Alexander Graf
  2012-12-21  1:48       ` Asias He
@ 2013-01-02  5:09       ` Rusty Russell
  1 sibling, 0 replies; 13+ messages in thread
From: Rusty Russell @ 2013-01-02  5:09 UTC (permalink / raw)
  To: Alexander Graf, Asias He
  Cc: mst@redhat.com, kvm@vger.kernel.org,
	virtualization@lists.linux-foundation.org

Alexander Graf <agraf@suse.de> writes:

> On 20.12.2012, at 09:46, Asias He <asias@redhat.com> wrote:
>
>> On 12/20/2012 12:15 PM, Rusty Russell wrote:
>>> Alexander Graf <agraf@suse.de> writes:
>>> 
>>>> When a file system is mounted on a virtio-blk disk, we then remove it
>>>> and then reattach it, the reattached disk gets the same disk name and
>>>> ids as the hot removed one.
>>>> 
>>>> This leads to very nasty effects - mostly rendering the newly attached
>>>> device completely unusable.
>>>> 
>>>> Trying what happens when I do the same thing with a USB device, I saw
>>>> that the sd node simply doesn't get free'd when a device gets forcefully
>>>> removed.
>>>> 
>>>> Imitate the same behavior for vd devices. This way broken vd devices
>>>> simply are never free'd and newly attached ones keep working just fine.
>>>> 
>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>> 
>>> I think deserves a CC:stable, no?
>
> I wanted to make sure people can comment on it first. But yes, it should go into stable when accepted.

Queued in my fixes branch, will go to Linus soon.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2013-01-02  5:09 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-19 19:40 [PATCH] virtio-blk: Don't free ida when disk is in use Alexander Graf
2012-12-20  4:15 ` Rusty Russell
2012-12-20  8:46   ` Asias He
2012-12-20  9:41     ` Alexander Graf
2012-12-21  1:48       ` Asias He
2013-01-02  5:09       ` Rusty Russell
2012-12-20 10:54 ` Michael S. Tsirkin
2012-12-20 11:27   ` Alexander Graf
2012-12-21  1:57   ` Asias He
     [not found]   ` <62D6A704-CB88-4A8C-A5F3-6BD3C267895F@suse.de>
2012-12-20 11:38     ` Michael S. Tsirkin
2012-12-20 11:47       ` Alexander Graf
2012-12-21  2:02         ` Asias He
2012-12-21  1:58     ` Asias He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).