From: "J. Roeleveld"
Subject: Re: [PATCH]: Support dynamic resizing of vbds
Date: Tue, 16 Mar 2010 23:04:44 +0100
Message-ID: <201003162304.44999.joost@antarean.org>
References: <4B96456B0200003000080E91@sinclair.provo.novell.com>
 <201003162224.02105.joost@antarean.org>
 <201003162227.12625.joost@antarean.org>
Mime-Version: 1.0
Content-Type: Text/Plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
In-Reply-To: <201003162227.12625.joost@antarean.org>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Tuesday 16 March 2010 22:27:12 J. Roeleveld wrote:
> On Tuesday 16 March 2010 22:24:02 J. Roeleveld wrote:
> > On Tuesday 16 March 2010 03:50:18 Ky Srinivasan wrote:
> > > Thanks. Looking forward to your feedback.
> > >
> > > K. Y
> >
> > Ok, I finally got time to test it.
> > I have not seen any major crashes, but my domU and its filesystem did
> > end up in an unusable state.
> >
> > I also noticed that the change entries in the logs did not show up
> > until I "touched" the drive, e.g. with "ls ".
> >
> > When trying to do an online resize, "resize2fs" refused, saying the
> > filesystem was already using the full space:
> > --
> > storage ~ # resize2fs /dev/sdb1
> > resize2fs 1.41.9 (22-Aug-2009)
> > The filesystem is already 104857600 blocks long. Nothing to do!
> > --
> >
> > This was then 'resolved' by an umount/mount of the filesystem:
> > --
> > storage ~ # umount /data/homes/
> > storage ~ # mount /data/homes/
> > storage ~ # resize2fs /dev/sdb1
> > resize2fs 1.41.9 (22-Aug-2009)
> > Filesystem at /dev/sdb1 is mounted on /data/homes; on-line resizing required
> > old desc_blocks = 25, new_desc_blocks = 29
> > Performing an on-line resize of /dev/sdb1 to 117964800 (4k) blocks.
> > --
> >
> > These actions were taken in the domU.
> >
> > The patch informs the domU about the new size, but the new size is not
> > cascaded to all the levels.
> >
> > I am not familiar enough with the kernel internals to point to where
> > the missing part is.
> >
> > My ideal situation would allow the following to work without additional
> > steps:
> >
> > dom0: lvresize -L+10G /dev/vg/foo
> > domU: resize2fs /dev/sdb1
> >
> > (with "/dev/vg/foo" exported to the domU as "/dev/sdb1")
> >
> > Right now, I need to do the following:
> >
> > dom0: lvresize -L+10G /dev/vg/foo
> > domU: ls /mnt/sdb1
> > domU: umount /mnt/sdb1
> > domU: mount /mnt/sdb1
> > domU: resize2fs /dev/sdb1
> >
> > During the 2nd attempt, trying to umount the filesystem after
> > increasing it again led to the domU hanging at 100% I/O wait.
> > The logs themselves do not, however, show any useful information.
> >
> > I waited for about 30 minutes and saw no change in this situation.
> >
> > I am afraid that for now I will revert to not having this patch
> > applied and use the 'current' method of increasing the filesystem
> > sizes.
> >
> > Please let me know if there is any further testing I can help with.
> >
> > --
> > Joost Roeleveld
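Regarding the note above that the new size "is not cascaded to all the
levels": a quick way to see what size the domU kernel itself currently
reports for the device is the BLKGETSIZE64 ioctl ("blockdev --getsize64
/dev/sdb1" does the same from the shell). A minimal sketch in C; the
device path is only an example:

--
/* blksize.c: print the size the kernel reports for a block device.
 * Sketch only; build with "gcc -o blksize blksize.c" and run as
 * "./blksize /dev/sdb1" (example path, adjust to your setup). */
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* BLKGETSIZE64 */

int main(int argc, char **argv)
{
        const char *dev = (argc > 1) ? argv[1] : "/dev/sdb1";
        uint64_t size = 0;
        int fd = open(dev, O_RDONLY);

        if (fd < 0) {
                perror("open");
                return 1;
        }
        if (ioctl(fd, BLKGETSIZE64, &size) < 0) {
                perror("ioctl(BLKGETSIZE64)");
                close(fd);
                return 1;
        }
        printf("%s: %llu bytes\n", dev, (unsigned long long)size);
        close(fd);
        return 0;
}
--

If this still prints the old size after the lvresize in dom0, the resize
never reached the domU kernel at all; if it prints the new size while
resize2fs still refuses, the block layer has it and it is presumably the
opened device's cached size that stays stale until the umount/mount cycle.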
> Update,
>
> After killing the domU, I saw the following in my dom0:
>
> --
> VBD Resize: new size 943718400
> VBD Resize: new size 1048576000
> INFO: task blkback.3.sdb1:21647 blocked for more than 480 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> blkback.3.sdb D 0000000000000002 0 21647 2
> ffff88002b54f550 0000000000000246 ffff88002ef08000 0000000600000000
> ffff88002f9bf800 ffffffff806952c0 ffffffff806952c0 ffff88002eeee550
> ffff88002b54f778 000000002eee6800 0000000000000000 ffff88002b54f778
> Call Trace:
> [] printk+0x4e/0x58
> [] __down_read+0x101/0x119
> [] xenbus_transaction_start+0x15/0x62
> [] vbd_resize+0x50/0x120
> [] blkif_schedule+0x7e/0x4ae
> [] blkif_schedule+0x0/0x4ae
> [] kthread+0x47/0x73
> [] child_rip+0xa/0x20
> [] kthread+0x0/0x73
> [] child_rip+0x0/0x20
> --
>
> (This was the output of "dmesg".)
>
> The ID of the domU was "3" and the device of the filesystem in the domU
> is "sdb1", i.e. this matches the above error message.
>
> --
> Joost

Update 2:

Trying to start the domU after killing it failed, as the device was still
listed as "in-use".

I hope all this helps; if you require any specific information, I am happy
to provide it as long as the information is physically on the server.

Please note, I had to reboot the physical machine to get the domU back up
again, and the LV has been recreated as I no longer trusted the actual
filesystem. (I would rather start 'fresh' before loading data into it than
find out there is an issue once it is filled.)

--
Joost
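P.S. For whoever picks up the debugging: as I read the trace, the backend
kthread (blkif_schedule) calls vbd_resize, which blocks inside
xenbus_transaction_start on the read side of a semaphore (the __down_read
frame). A rough paraphrase of that path below; only the function names come
from the stack trace, the signature and surrounding steps are my
assumptions, not the actual patch source:

--
/* Paraphrase of the blocked code path, reconstructed from the stack
 * trace above. Only vbd_resize(), xenbus_transaction_start() and
 * blkif_schedule() are taken from the trace; everything else is a
 * guess. Kernel-side code, not buildable standalone. */
#include <xen/xenbus.h>   /* xenbus_transaction_start() */

static void vbd_resize(blkif_t *blkif)   /* signature assumed */
{
        struct xenbus_transaction xbt;

        /* ... detect the new size of the underlying device ... */

        /*
         * The trace shows the kthread stuck right here:
         * xenbus_transaction_start() takes the read side of a
         * semaphore (__down_read in the trace). If the write side is
         * held and never released, this call waits forever and takes
         * the whole blkback thread with it.
         */
        if (xenbus_transaction_start(&xbt))
                return;   /* error handling elided */

        /* ... write the new size to the backend's xenstore entries
         * and end the transaction ... */
}
--

If that reading is correct, it would also explain why the domU could not
be restarted afterwards: the stuck backend thread never releases the
device, so it stays listed as "in-use" until a reboot.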