* [PATCH 0/4] detect online disk resize
From: Andrew Patterson @ 2008-05-05 23:04 UTC
To: linux-scsi; +Cc: linux-kernel, viro, axboe, andmike

This patch series handles online disk resizes that the kernel does not
fully recognize through the existing revalidate_disk routines. An
online resize can occur when growing or shrinking a Fibre Channel LUN,
or perhaps when adding a disk to an existing RAID volume.

The kernel currently recognizes a device size change when
revalidate_disk() is called; however, the block layer does not use the
new size while the device has any current openers. So, for example, if
LVM has a volume open on the device, you will generally not see the
size change until after a reboot. We fix this problem by creating a
wrapper to be used with lower-level revalidate_disk routines. This
wrapper first calls the lower-level driver's revalidate_disk routine.
It then compares the gendisk capacity to the block device's inode
size. If there is a difference, we adjust the block device's size. If
the size has changed, we then flush the disk for safety.

This patch series only modifies the sd driver to use these changes, as
that is all that I currently have to test with. Device drivers like
cciss and DAC960 should probably use it as well.

Diff stats:

 drivers/scsi/sd.c  |    4 +--
 fs/block_dev.c     |   76 +++++++++++++++++++++++++++++++++++++++++++++++++---
 include/linux/fs.h |    1 +
 3 files changed, 74 insertions(+), 7 deletions(-)

Commits:

- Added flush_disk to factor out common buffer cache flushing code.
- Wrapper for lower-level revalidate_disk routines.
- Adjust block device size after an online resize of a disk.
- SCSI sd driver calls revalidate_disk wrapper.

-- 
Andrew Patterson
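The flow described in the cover letter (call the driver's call-back, compare capacities, adjust, flush) can be sketched as a small userspace C model. This is only an illustration under stated assumptions: every `model_*` name, the 512-byte sector size, and the flush counter are hypothetical stand-ins, not the kernel structures or the code from the later patches in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SECTOR_SIZE 512u  /* assumed sector size for the model */

/* Hypothetical stand-in for a gendisk; not the kernel structure. */
struct model_disk {
    uint64_t capacity_sectors;               /* capacity reported by the driver */
    int (*revalidate)(struct model_disk *);  /* lower-level call-back, may be NULL */
};

/* Hypothetical stand-in for a block device with openers. */
struct model_bdev {
    struct model_disk *disk;
    uint64_t size_bytes;   /* plays the role of the block device inode size */
    int flush_count;       /* number of times the cache was flushed */
};

/* Stand-in for flush_disk(): only counts invocations. */
static void model_flush(struct model_bdev *bdev)
{
    bdev->flush_count++;
}

/*
 * Wrapper: run the driver's call-back first, then bring the block
 * device size in line with the disk capacity, flushing only when the
 * size actually changed.
 */
static int model_revalidate_disk(struct model_bdev *bdev)
{
    int ret = 0;
    uint64_t disk_bytes;

    assert(bdev != NULL && bdev->disk != NULL);

    if (bdev->disk->revalidate)
        ret = bdev->disk->revalidate(bdev->disk);

    disk_bytes = bdev->disk->capacity_sectors * MODEL_SECTOR_SIZE;
    if (disk_bytes != bdev->size_bytes) {
        bdev->size_bytes = disk_bytes;
        model_flush(bdev);
    }
    return ret;
}

/* Example call-back that "discovers" a LUN grown to 4096 sectors. */
static int model_grow_to_4096(struct model_disk *disk)
{
    disk->capacity_sectors = 4096;
    return 0;
}
```

In this model, calling `model_revalidate_disk()` on a device whose disk uses `model_grow_to_4096` updates `size_bytes` and flushes once; a second call finds no difference and leaves the cache alone.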
* [PATCH 1/2] Added flush_disk to factor out common buffer cache flushing code.
From: Andrew Patterson @ 2008-05-05 23:04 UTC
To: linux-scsi; +Cc: linux-kernel, viro, axboe, andmike, Andrew Patterson

Added flush_disk to factor out common buffer cache flushing code.

We need to be able to flush the buffer cache for more than just when a
disk is changed, so we factor out common cache flush code in
check_disk_change() to an internal flush_disk() routine.  This routine
will then be used for both disk changes and disk resizes (in a later
patch).

Include the disk name in the text indicating that there are busy
inodes on the device and increase the KERN severity of the message.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
---

 fs/block_dev.c |   33 ++++++++++++++++++++++++++++-----
 1 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 7d822fa..fcd0398 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -867,6 +867,33 @@ struct block_device *open_by_devnum(dev_t dev, unsigned mode)
 
 EXPORT_SYMBOL(open_by_devnum);
 
+
+/**
+ * flush_disk - invalidates all buffer-cache entries on a disk
+ *
+ * @bdev:      struct block device to be flushed
+ *
+ * Invalidates all buffer-cache entries on a disk. It should be called
+ * when a disk has been changed -- either by a media change or online
+ * resize.
+ */
+static void flush_disk(struct block_device *bdev)
+{
+	if (__invalidate_device(bdev)) {
+		char name[BDEVNAME_SIZE] = "";
+
+		if (bdev->bd_disk)
+			disk_name(bdev->bd_disk, 0, name);
+		printk(KERN_WARNING "VFS: busy inodes on changed media %s\n",
+		       name);
+	}
+
+	if (!bdev->bd_disk)
+		return;
+	if (bdev->bd_disk->minors > 1)
+		bdev->bd_invalidated = 1;
+}
+
 /*
  * This routine checks whether a removable media has been changed,
  * and invalidates all buffer-cache-entries in that case. This
@@ -886,13 +913,9 @@ int check_disk_change(struct block_device *bdev)
 	if (!bdops->media_changed(bdev->bd_disk))
 		return 0;
 
-	if (__invalidate_device(bdev))
-		printk("VFS: busy inodes on changed media.\n");
-
+	flush_disk(bdev);
 	if (bdops->revalidate_disk)
 		bdops->revalidate_disk(bdev->bd_disk);
-	if (bdev->bd_disk->minors > 1)
-		bdev->bd_invalidated = 1;
 	return 1;
 }
 
* Re: [PATCH 1/2] Added flush_disk to factor out common buffer cache flushing code.
From: Christoph Hellwig @ 2008-05-06  8:44 UTC
To: Andrew Patterson; +Cc: linux-scsi, linux-kernel, viro, axboe, andmike

On Mon, May 05, 2008 at 05:04:19PM -0600, Andrew Patterson wrote:
> Added flush_disk to factor out common buffer cache flushing code.
>
> We need to be able to flush the buffer cache for more than just when a
> disk is changed, so we factor out common cache flush code in
> check_disk_change() to an internal flush_disk() routine.  This routine
> will then be used for both disk changes and disk resizes (in a later
> patch).
>
> Include the disk name in the text indicating that there are busy
> inodes on the device and increase the KERN severity of the message.

This doesn't make much sense to me.  When a disk has grown there's no
point in invalidating any buffers, and when it has shrunk it's too late
already.  Also, I suspect modern filesystems might be really allergic to
this kind of under-the-hood action.  That is, if they use the bdev
mapping at all, something that at least xfs, and I think btrfs as well,
don't do.
* Re: [PATCH 1/2] Added flush_disk to factor out common buffer cache flushing code.
From: James Bottomley @ 2008-05-07 17:59 UTC
To: Christoph Hellwig
Cc: Andrew Patterson, linux-scsi, linux-kernel, viro, axboe, andmike

On Tue, 2008-05-06 at 04:44 -0400, Christoph Hellwig wrote:
> On Mon, May 05, 2008 at 05:04:19PM -0600, Andrew Patterson wrote:
> > Added flush_disk to factor out common buffer cache flushing code.
> >
> > We need to be able to flush the buffer cache for more than just when a
> > disk is changed, so we factor out common cache flush code in
> > check_disk_change() to an internal flush_disk() routine.  This routine
> > will then be used for both disk changes and disk resizes (in a later
> > patch).
> >
> > Include the disk name in the text indicating that there are busy
> > inodes on the device and increase the KERN severity of the message.
>
> This doesn't make much sense to me.  When a disk has grown there's no
> point in invalidating any buffers, and when it has shrunk it's too late
> already.  Also, I suspect modern filesystems might be really allergic to
> this kind of under-the-hood action.  That is, if they use the bdev
> mapping at all, something that at least xfs, and I think btrfs as well,
> don't do.

I agree on the grown disk case.  For the shrunk disk, we need at least
to invalidate the sectors that no longer physically exist.

The two use cases for shrinking I can see are

     1. planned: the fs is already shrunk to within the new boundaries
        and all data is relocated, so invalidate is fine (any dirty
        buffers that might exist in the shrunk region are there only
        because they were relocated but not yet written to their
        original location).
     2. unplanned: In this case, the fs is probably toast, so whether
        we invalidate or not isn't going to make a whole lot of
        difference; it's still going to try to read or write from
        sectors beyond the new size and get I/O errors.

Unfortunately, we don't seem to have a partial invalidation function for
the page cache and filesystem, so should we have one?

James
* Re: [PATCH 1/2] Added flush_disk to factor out common buffer cache flushing code.
From: Andrew Patterson @ 2008-05-07 18:08 UTC
To: James Bottomley
Cc: Christoph Hellwig, linux-scsi, linux-kernel, viro, axboe, andmike

On Wed, 2008-05-07 at 12:59 -0500, James Bottomley wrote:
> On Tue, 2008-05-06 at 04:44 -0400, Christoph Hellwig wrote:
> > On Mon, May 05, 2008 at 05:04:19PM -0600, Andrew Patterson wrote:
> > > Added flush_disk to factor out common buffer cache flushing code.
> > >
> > > We need to be able to flush the buffer cache for more than just when a
> > > disk is changed, so we factor out common cache flush code in
> > > check_disk_change() to an internal flush_disk() routine.  This routine
> > > will then be used for both disk changes and disk resizes (in a later
> > > patch).
> > >
> > > Include the disk name in the text indicating that there are busy
> > > inodes on the device and increase the KERN severity of the message.
> >
> > This doesn't make much sense to me.  When a disk has grown there's no
> > point in invalidating any buffers, and when it has shrunk it's too late
> > already.  Also, I suspect modern filesystems might be really allergic to
> > this kind of under-the-hood action.  That is, if they use the bdev
> > mapping at all, something that at least xfs, and I think btrfs as well,
> > don't do.
>
> I agree on the grown disk case.  For the shrunk disk, we need at least
> to invalidate the sectors that no longer physically exist.
>
> The two use cases for shrinking I can see are
>
>      1. planned: the fs is already shrunk to within the new boundaries
>         and all data is relocated, so invalidate is fine (any dirty
>         buffers that might exist in the shrunk region are there only
>         because they were relocated but not yet written to their
>         original location).

So why do we need to invalidate here if everything is fine?

>      2. unplanned: In this case, the fs is probably toast, so whether
>         we invalidate or not isn't going to make a whole lot of
>         difference; it's still going to try to read or write from
>         sectors beyond the new size and get I/O errors.
>

Invalidating here might be useful in that errors are reported earlier.

> Unfortunately, we don't seem to have a partial invalidation function for
> the page cache and filesystem, so should we have one?
>

I have been having problems with my email, hence the missing 2 patches.
I'll resend the whole series and add the flush_disk() call in
revalidate_disk() as a separate patch, so that the flush code can be
optionally applied.

> James
>

-- 
Andrew Patterson
* Re: [PATCH 1/2] Added flush_disk to factor out common buffer cache flushing code.
From: James Bottomley @ 2008-05-07 18:21 UTC
To: Andrew Patterson
Cc: Christoph Hellwig, linux-scsi, linux-kernel, viro, axboe, andmike

On Wed, 2008-05-07 at 18:08 +0000, Andrew Patterson wrote:
> On Wed, 2008-05-07 at 12:59 -0500, James Bottomley wrote:
> > On Tue, 2008-05-06 at 04:44 -0400, Christoph Hellwig wrote:
> > > On Mon, May 05, 2008 at 05:04:19PM -0600, Andrew Patterson wrote:
> > > > Added flush_disk to factor out common buffer cache flushing code.
> > > >
> > > > We need to be able to flush the buffer cache for more than just when a
> > > > disk is changed, so we factor out common cache flush code in
> > > > check_disk_change() to an internal flush_disk() routine.  This routine
> > > > will then be used for both disk changes and disk resizes (in a later
> > > > patch).
> > > >
> > > > Include the disk name in the text indicating that there are busy
> > > > inodes on the device and increase the KERN severity of the message.
> > >
> > > This doesn't make much sense to me.  When a disk has grown there's no
> > > point in invalidating any buffers, and when it has shrunk it's too late
> > > already.  Also, I suspect modern filesystems might be really allergic to
> > > this kind of under-the-hood action.  That is, if they use the bdev
> > > mapping at all, something that at least xfs, and I think btrfs as well,
> > > don't do.
> >
> > I agree on the grown disk case.  For the shrunk disk, we need at least
> > to invalidate the sectors that no longer physically exist.
> >
> > The two use cases for shrinking I can see are
> >
> >      1. planned: the fs is already shrunk to within the new boundaries
> >         and all data is relocated, so invalidate is fine (any dirty
> >         buffers that might exist in the shrunk region are there only
> >         because they were relocated but not yet written to their
> >         original location).
>
> So why do we need to invalidate here if everything is fine?

We need to get rid of stray pages.  Obviously, dirty ones that would
cause write errors at some point need to be killed.  The dangerous ones
are read-only pages that can hang around for a long time.  The (perhaps
unlikely) scenario where they bite is if the disk is shrunk and then
expanded.  It's one of those annoying scenarios where most people don't
care (the expanded space is empty, so what does it matter if we get
stray data?), while security people jump up and down and scream about
data leaking.

> >      2. unplanned: In this case, the fs is probably toast, so whether
> >         we invalidate or not isn't going to make a whole lot of
> >         difference; it's still going to try to read or write from
> >         sectors beyond the new size and get I/O errors.
> >
>
> Invalidating here might be useful in that errors are reported earlier.

Yes ... force the filesystem to have errors immediately, before it sees
them on writeback or read-ahead or something delayed.

> > Unfortunately, we don't seem to have a partial invalidation function for
> > the page cache and filesystem, so should we have one?
> >
>
> I have been having problems with my email, hence the missing 2 patches.
> I'll resend the whole series and add the flush_disk() call in
> revalidate_disk() as a separate patch, so that the flush code can be
> optionally applied.

James
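The partial invalidation raised in this subthread can be illustrated with a toy userspace cache in C. It is a sketch under assumptions, not a proposal for the kernel interface: the `model_*` names, the fixed eight-slot cache, and the one-byte "page" are all hypothetical. The point it shows is that dropping every cached page at or beyond the new device end removes both the dirty pages that would fail on writeback and the clean pages that could otherwise resurface as stale data if the disk is later expanded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CACHE_SLOTS 8  /* arbitrary size for the toy cache */

/* Hypothetical cached page: one entry per sector, purely illustrative. */
struct model_page {
    bool     valid;
    bool     dirty;
    uint64_t sector;
    char     data;   /* one byte stands in for the page contents */
};

struct model_cache {
    struct model_page slot[MODEL_CACHE_SLOTS];
};

/* Insert a page into the first free slot; returns false if the cache is full. */
static bool model_cache_add(struct model_cache *c, uint64_t sector,
                            char data, bool dirty)
{
    for (size_t i = 0; i < MODEL_CACHE_SLOTS; i++) {
        if (!c->slot[i].valid) {
            c->slot[i] = (struct model_page){ true, dirty, sector, data };
            return true;
        }
    }
    return false;
}

/* Look up a cached page; returns NULL on a miss. */
static const struct model_page *model_cache_find(const struct model_cache *c,
                                                 uint64_t sector)
{
    for (size_t i = 0; i < MODEL_CACHE_SLOTS; i++)
        if (c->slot[i].valid && c->slot[i].sector == sector)
            return &c->slot[i];
    return NULL;
}

/*
 * Partial invalidation: drop every page, clean or dirty, at or beyond
 * the new end of the device.  Returns the number of pages dropped.
 */
static size_t model_invalidate_beyond(struct model_cache *c,
                                      uint64_t new_size_sectors)
{
    size_t dropped = 0;

    assert(c != NULL);
    for (size_t i = 0; i < MODEL_CACHE_SLOTS; i++) {
        if (c->slot[i].valid && c->slot[i].sector >= new_size_sectors) {
            c->slot[i].valid = false;
            dropped++;
        }
    }
    return dropped;
}
```

After a shrink to 512 sectors, a call such as `model_invalidate_beyond(&cache, 512)` leaves pages below sector 512 untouched, so a later expansion cannot serve old contents for the removed range from the cache.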
* [PATCH 2/2] Wrapper for lower-level revalidate_disk routines.
From: Andrew Patterson @ 2008-05-05 23:04 UTC
To: linux-scsi; +Cc: linux-kernel, viro, axboe, andmike, Andrew Patterson

Wrapper for lower-level revalidate_disk routines.

This is a wrapper for the lower-level revalidate_disk call-backs such
as sd_revalidate_disk(). It allows us to perform pre and post
operations when calling them.

We will use this wrapper in a later patch to adjust block device sizes
after an online resize (a _post_ operation).

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
---

 fs/block_dev.c     |   21 +++++++++++++++++++++
 include/linux/fs.h |    1 +
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index fcd0398..b510451 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -894,6 +894,27 @@ static void flush_disk(struct block_device *bdev)
 		bdev->bd_invalidated = 1;
 }
 
+/**
+ * revalidate_disk - wrapper for lower-level driver's revalidate_disk
+ *                   call-back
+ *
+ * @disk: struct gendisk to be revalidated
+ *
+ * This routine is a wrapper for lower-level driver's revalidate_disk
+ * call-backs.  It is used to do common pre and post operations needed
+ * for all revalidate_disk operations.
+ */
+int revalidate_disk(struct gendisk *disk)
+{
+	int ret = 0;
+
+	if (disk->fops->revalidate_disk)
+		ret = disk->fops->revalidate_disk(disk);
+
+	return ret;
+}
+EXPORT_SYMBOL(revalidate_disk);
+
 /*
  * This routine checks whether a removable media has been changed,
  * and invalidates all buffer-cache-entries in that case. This
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b84b848..278172f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1665,6 +1665,7 @@ extern int fs_may_remount_ro(struct super_block *);
  */
 #define bio_data_dir(bio)	((bio)->bi_rw & 1)
 
+extern int revalidate_disk(struct gendisk *);
 extern int check_disk_change(struct block_device *);
 extern int __invalidate_device(struct block_device *);
 extern int invalidate_partition(struct gendisk *, int);