* [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix
@ 2012-08-22 1:33 Hugh Dickins
2012-08-22 9:12 ` Richard W.M. Jones
2012-08-22 11:52 ` Richard W.M. Jones
0 siblings, 2 replies; 10+ messages in thread
From: Hugh Dickins @ 2012-08-22 1:33 UTC (permalink / raw)
To: Jeff Moyer
Cc: Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich,
Richard W.M. Jones, Josh Boyer, linux-kernel
Jeff,
Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"),
already gone into 3.* stable, is not good. Could you and your testers
please give this alternative a try - I think it should work, and have
started it on a few days' memory load on 3.5, but not tried your case.
But, see final paragraph of comment below, I'm worried whether
all __getblk() users are actually ready for your new NULL case.
Thanks,
Hugh
[PATCH] block: replace __getblk_slow misfix by grow_dev_page fix
Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow")
is not good: a successful call to grow_buffers() cannot guarantee
that the page won't be reclaimed before the immediate next call to
__find_get_block(), which is why there was always a loop there.
Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595:
inode #19278: block 664: comm cc1: unable to read itable block" on console,
which pointed to this commit.
I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs
sometimes fails from a missing header file, under memory pressure on
ppc G5. I've never seen this on x86, and I've never seen it on 3.5-rc7
itself, despite that commit being in there: bisection pointed to an
irrelevant pinctrl merge, but hard to tell when failure takes between
18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2).
(I've since found such __ext4_get_inode_loc errors in /var/log/messages
from previous weeks: why the message never appeared on console until
yesterday morning is a mystery for another day.)
Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus
a checkpatch nitfix). Avoid the infinite loop beyond end of device
by instead checking buffer_mapped(bh) in grow_dev_page() (I presume
that's more efficient than a repeated call to blkdev_max_block()),
and returning -ENXIO to __getblk_slow() in that case.
And remove akpm's ten-year-old "__getblk() cannot fail ... weird"
comment, but that is worrying: are all users of __getblk() really
now prepared for a NULL bh beyond end of device, or will some oops??
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # 3.0 3.2 3.4 3.5
---
fs/buffer.c | 43 +++++++++++++++++++++----------------------
1 file changed, 21 insertions(+), 22 deletions(-)
--- 3.6-rc2/fs/buffer.c 2012-08-04 09:19:20.644022328 -0700
+++ linux/fs/buffer.c 2012-08-21 08:49:32.333908590 -0700
@@ -950,6 +950,7 @@ grow_dev_page(struct block_device *bdev,
struct inode *inode = bdev->bd_inode;
struct page *page;
struct buffer_head *bh;
+ int err = 0;
page = find_or_create_page(inode->i_mapping, index,
(mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS)|__GFP_MOVABLE);
@@ -962,7 +963,11 @@ grow_dev_page(struct block_device *bdev,
bh = page_buffers(page);
if (bh->b_size == size) {
init_page_buffers(page, bdev, block, size);
- return page;
+ if (buffer_mapped(bh))
+ return page;
+
+ err = -ENXIO;
+ goto failed;
}
if (!try_to_free_buffers(page))
goto failed;
@@ -984,12 +989,14 @@ grow_dev_page(struct block_device *bdev,
link_dev_buffers(page, bh);
init_page_buffers(page, bdev, block, size);
spin_unlock(&inode->i_mapping->private_lock);
- return page;
+ if (buffer_mapped(bh))
+ return page;
+ err = -ENXIO;
failed:
unlock_page(page);
page_cache_release(page);
- return NULL;
+ return ERR_PTR(err);
}
/*
@@ -1026,8 +1033,8 @@ grow_buffers(struct block_device *bdev,
block = index << sizebits;
/* Create a page with the proper size buffers.. */
page = grow_dev_page(bdev, block, index, size);
- if (!page)
- return 0;
+ if (IS_ERR_OR_NULL(page))
+ return PTR_RET(page);
unlock_page(page);
page_cache_release(page);
return 1;
@@ -1036,9 +1043,6 @@ grow_buffers(struct block_device *bdev,
static struct buffer_head *
__getblk_slow(struct block_device *bdev, sector_t block, int size)
{
- int ret;
- struct buffer_head *bh;
-
/* Size must be multiple of hard sectorsize */
if (unlikely(size & (bdev_logical_block_size(bdev)-1) ||
(size < 512 || size > PAGE_SIZE))) {
@@ -1051,21 +1055,20 @@ __getblk_slow(struct block_device *bdev,
return NULL;
}
-retry:
- bh = __find_get_block(bdev, block, size);
- if (bh)
- return bh;
+ for (;;) {
+ struct buffer_head *bh;
+ int ret;
- ret = grow_buffers(bdev, block, size);
- if (ret == 0) {
- free_more_memory();
- goto retry;
- } else if (ret > 0) {
bh = __find_get_block(bdev, block, size);
if (bh)
return bh;
+
+ ret = grow_buffers(bdev, block, size);
+ if (ret < 0)
+ return NULL;
+ if (ret == 0)
+ free_more_memory();
}
- return NULL;
}
/*
@@ -1321,10 +1324,6 @@ EXPORT_SYMBOL(__find_get_block);
* which corresponds to the passed block_device, block and size. The
* returned buffer has its reference count incremented.
*
- * __getblk() cannot fail - it just keeps trying. If you pass it an
- * illegal block number, __getblk() will happily return a buffer_head
- * which represents the non-existent block. Very weird.
- *
* __getblk() will lock up the machine if grow_dev_page's try_to_free_buffers()
* attempt is failing. FIXME, perhaps?
*/
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-22 1:33 [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix Hugh Dickins @ 2012-08-22 9:12 ` Richard W.M. Jones 2012-08-22 11:52 ` Richard W.M. Jones 1 sibling, 0 replies; 10+ messages in thread From: Richard W.M. Jones @ 2012-08-22 9:12 UTC (permalink / raw) To: Hugh Dickins Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Tue, Aug 21, 2012 at 06:33:45PM -0700, Hugh Dickins wrote: > Jeff, > > Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"), > already gone into 3.* stable, is not good. Could you and your testers > please give this alternative a try - I think it should work, and have > started it on a few days' memory load on 3.5, but not tried your case. I'll try your patch later today, but the test is very simple: - Create a 1024 byte empty (all zeroes) partition and then try to mount it. Failure was that the mount command would go into an infinite loop using 100% CPU. Success is that it gives an error. More about it here including a very simple libguestfs script to reproduce it: https://bugzilla.redhat.com/show_bug.cgi?id=835019#c0 and a virtual disk image that also demonstrates the problem without needing anything except a VM running the test kernel: https://bugzilla.redhat.com/show_bug.cgi?id=835019#c1 Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones New in Fedora 11: Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 70 libraries supprt'd http://fedoraproject.org/wiki/MinGW http://www.annexia.org/fedora_mingw ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-22 1:33 [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix Hugh Dickins 2012-08-22 9:12 ` Richard W.M. Jones @ 2012-08-22 11:52 ` Richard W.M. Jones 2012-08-23 4:56 ` Hugh Dickins 1 sibling, 1 reply; 10+ messages in thread From: Richard W.M. Jones @ 2012-08-22 11:52 UTC (permalink / raw) To: Hugh Dickins Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Tue, Aug 21, 2012 at 06:33:45PM -0700, Hugh Dickins wrote: > Jeff, > > Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"), > already gone into 3.* stable, is not good. Could you and your testers > please give this alternative a try - I think it should work, and have > started it on a few days' memory load on 3.5, but not tried your case. I have tested your patch and it does NOT fix the problem in http://bugzilla.redhat.com/835019 Kernel 3.6.0 + your patch => mount goes into a loop when mounting a small empty partition. Please do NOT apply this as it will cause a regression! Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-22 11:52 ` Richard W.M. Jones @ 2012-08-23 4:56 ` Hugh Dickins 2012-08-23 10:20 ` Jens Axboe ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Hugh Dickins @ 2012-08-23 4:56 UTC (permalink / raw) To: Richard W.M. Jones Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Wed, 22 Aug 2012, Richard W.M. Jones wrote: > On Tue, Aug 21, 2012 at 06:33:45PM -0700, Hugh Dickins wrote: > > Jeff, > > > > Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"), > > already gone into 3.* stable, is not good. Could you and your testers > > please give this alternative a try - I think it should work, and have > > started it on a few days' memory load on 3.5, but not tried your case. > > I have tested your patch and it does NOT fix the problem in > http://bugzilla.redhat.com/835019 > Kernel 3.6.0 + your patch => mount goes into a loop when mounting > a small empty partition. Please do NOT apply this as it will > cause a regression! That was all very helpful information that you provided, thank you. Sorry, I had missed how "block" is massaged as it's passed down a level. The patch below fixes it for me, though it does have to change more than I'd been hoping in such a fix. Perhaps I am just being silly to resist repeating the call to blkdev_max_block(). Does this patch work for you? Thanks, Hugh [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") is not good: a successful call to grow_buffers() cannot guarantee that the page won't be reclaimed before the immediate next call to __find_get_block(), which is why there was always a loop there. Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595: inode #19278: block 664: comm cc1: unable to read itable block" on console, which pointed to this commit. I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs sometimes fails from a missing header file, under memory pressure on ppc G5. I've never seen this on x86, and I've never seen it on 3.5-rc7 itself, despite that commit being in there: bisection pointed to an irrelevant pinctrl merge, but hard to tell when failure takes between 18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2). (I've since found such __ext4_get_inode_loc errors in /var/log/messages from previous weeks: why the message never appeared on console until yesterday morning is a mystery for another day.) Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus a checkpatch nitfix). Simplify the interface between grow_buffers() and grow_dev_page(), and avoid the infinite loop beyond end of device by instead checking init_page_buffers()'s end_block there (I presume that's more efficient than a repeated call to blkdev_max_block()), returning -ENXIO to __getblk_slow() in that case. And remove akpm's ten-year-old "__getblk() cannot fail ... weird" comment, but that is worrying: are all users of __getblk() really now prepared for a NULL bh beyond end of device, or will some oops?? Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org # 3.0 3.2 3.4 3.5 --- fs/buffer.c | 66 ++++++++++++++++++++++---------------------------- 1 file changed, 30 insertions(+), 36 deletions(-) --- 3.6-rc3/fs/buffer.c 2012-08-04 09:19:20.644022328 -0700 +++ linux/fs/buffer.c 2012-08-22 17:11:57.143827063 -0700 @@ -914,7 +914,7 @@ link_dev_buffers(struct page *page, stru /* * Initialise the state of a blockdev page's buffers. */ -static void +static sector_t init_page_buffers(struct page *page, struct block_device *bdev, sector_t block, int size) { @@ -936,33 +936,41 @@ init_page_buffers(struct page *page, str block++; bh = bh->b_this_page; } while (bh != head); + + /* + * Caller needs to validate requested block against end of device. + */ + return end_block; } /* * Create the page-cache page that contains the requested block. * - * This is user purely for blockdev mappings. + * This is used purely for blockdev mappings. */ -static struct page * +static int grow_dev_page(struct block_device *bdev, sector_t block, - pgoff_t index, int size) + pgoff_t index, int size, int sizebits) { struct inode *inode = bdev->bd_inode; struct page *page; struct buffer_head *bh; + sector_t end_block; + int ret = 0; /* Will call free_more_memory() */ page = find_or_create_page(inode->i_mapping, index, (mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS)|__GFP_MOVABLE); if (!page) - return NULL; + return ret; BUG_ON(!PageLocked(page)); if (page_has_buffers(page)) { bh = page_buffers(page); if (bh->b_size == size) { - init_page_buffers(page, bdev, block, size); - return page; + end_block = init_page_buffers(page, bdev, + index << sizebits, size); + goto done; } if (!try_to_free_buffers(page)) goto failed; @@ -982,14 +990,14 @@ grow_dev_page(struct block_device *bdev, */ spin_lock(&inode->i_mapping->private_lock); link_dev_buffers(page, bh); - init_page_buffers(page, bdev, block, size); + end_block = init_page_buffers(page, bdev, index << sizebits, size); spin_unlock(&inode->i_mapping->private_lock); - return page; - +done: + ret = (block < end_block) ? 1 : -ENXIO; failed: unlock_page(page); page_cache_release(page); - return NULL; + return ret; } /* @@ -999,7 +1007,6 @@ failed: static int grow_buffers(struct block_device *bdev, sector_t block, int size) { - struct page *page; pgoff_t index; int sizebits; @@ -1023,22 +1030,14 @@ grow_buffers(struct block_device *bdev, bdevname(bdev, b)); return -EIO; } - block = index << sizebits; + /* Create a page with the proper size buffers.. */ - page = grow_dev_page(bdev, block, index, size); - if (!page) - return 0; - unlock_page(page); - page_cache_release(page); - return 1; + return grow_dev_page(bdev, block, index, size, sizebits); } static struct buffer_head * __getblk_slow(struct block_device *bdev, sector_t block, int size) { - int ret; - struct buffer_head *bh; - /* Size must be multiple of hard sectorsize */ if (unlikely(size & (bdev_logical_block_size(bdev)-1) || (size < 512 || size > PAGE_SIZE))) { @@ -1051,21 +1050,20 @@ __getblk_slow(struct block_device *bdev, return NULL; } -retry: - bh = __find_get_block(bdev, block, size); - if (bh) - return bh; + for (;;) { + struct buffer_head *bh; + int ret; - ret = grow_buffers(bdev, block, size); - if (ret == 0) { - free_more_memory(); - goto retry; - } else if (ret > 0) { bh = __find_get_block(bdev, block, size); if (bh) return bh; + + ret = grow_buffers(bdev, block, size); + if (ret < 0) + return NULL; + if (ret == 0) + free_more_memory(); } - return NULL; } /* @@ -1321,10 +1319,6 @@ EXPORT_SYMBOL(__find_get_block); * which corresponds to the passed block_device, block and size. The * returned buffer has its reference count incremented. * - * __getblk() cannot fail - it just keeps trying. If you pass it an - * illegal block number, __getblk() will happily return a buffer_head - * which represents the non-existent block. Very weird. - * * __getblk() will lock up the machine if grow_dev_page's try_to_free_buffers() * attempt is failing. FIXME, perhaps? */ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 4:56 ` Hugh Dickins @ 2012-08-23 10:20 ` Jens Axboe 2012-08-23 20:40 ` Jeff Moyer 2012-08-28 11:14 ` Richard W.M. Jones 2 siblings, 0 replies; 10+ messages in thread From: Jens Axboe @ 2012-08-23 10:20 UTC (permalink / raw) To: Hugh Dickins Cc: Richard W.M. Jones, Jeff Moyer, Andrew Morton, Linus Torvalds, Torsten Hilbrich, Josh Boyer, linux-kernel@vger.kernel.org On 08/23/2012 06:56 AM, Hugh Dickins wrote: > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix > > Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") > is not good: a successful call to grow_buffers() cannot guarantee > that the page won't be reclaimed before the immediate next call to > __find_get_block(), which is why there was always a loop there. > > Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595: > inode #19278: block 664: comm cc1: unable to read itable block" on console, > which pointed to this commit. > > I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs > sometimes fails from a missing header file, under memory pressure on > ppc G5. I've never seen this on x86, and I've never seen it on 3.5-rc7 > itself, despite that commit being in there: bisection pointed to an > irrelevant pinctrl merge, but hard to tell when failure takes between > 18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2). > > (I've since found such __ext4_get_inode_loc errors in /var/log/messages > from previous weeks: why the message never appeared on console until > yesterday morning is a mystery for another day.) > > Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus > a checkpatch nitfix). Simplify the interface between grow_buffers() > and grow_dev_page(), and avoid the infinite loop beyond end of device > by instead checking init_page_buffers()'s end_block there (I presume > that's more efficient than a repeated call to blkdev_max_block()), > returning -ENXIO to __getblk_slow() in that case. > > And remove akpm's ten-year-old "__getblk() cannot fail ... weird" > comment, but that is worrying: are all users of __getblk() really > now prepared for a NULL bh beyond end of device, or will some oops?? Hugh, I tentatively applied this one, awaiting some test feedback before pushing it upstream this cycle. -- Jens Axboe ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 4:56 ` Hugh Dickins 2012-08-23 10:20 ` Jens Axboe @ 2012-08-23 20:40 ` Jeff Moyer 2012-08-23 22:46 ` Hugh Dickins 2012-08-28 11:14 ` Richard W.M. Jones 2 siblings, 1 reply; 10+ messages in thread From: Jeff Moyer @ 2012-08-23 20:40 UTC (permalink / raw) To: Hugh Dickins Cc: Richard W.M. Jones, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel Hugh Dickins <hughd@google.com> writes: > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix > > Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") > is not good: a successful call to grow_buffers() cannot guarantee > that the page won't be reclaimed before the immediate next call to > __find_get_block(), which is why there was always a loop there. [snip] > Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus > a checkpatch nitfix). Simplify the interface between grow_buffers() > and grow_dev_page(), and avoid the infinite loop beyond end of device > by instead checking init_page_buffers()'s end_block there (I presume > that's more efficient than a repeated call to blkdev_max_block()), > returning -ENXIO to __getblk_slow() in that case. > > And remove akpm's ten-year-old "__getblk() cannot fail ... weird" > comment, but that is worrying: are all users of __getblk() really > now prepared for a NULL bh beyond end of device, or will some oops?? Hi, Hugh, Thanks for digging into this. I had wondered whether that loop was required for other purposes, and you've verified that it was. I mostly like your fix. Returning end_block from init_page_buffers is a little strange, but I understand not wanting to redo the call to blkdev_max_block. I went ahead to re-tested with the reproducer that I had, and your patch works fine. Thanks again for tracking this down, and sorry I wasn't more diligent to begin with. Reviewed-and-Tested-by: Jeff Moyer <jmoyer@redhat.com> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 20:40 ` Jeff Moyer @ 2012-08-23 22:46 ` Hugh Dickins 2012-08-24 13:13 ` Jeff Moyer 0 siblings, 1 reply; 10+ messages in thread From: Hugh Dickins @ 2012-08-23 22:46 UTC (permalink / raw) To: Jeff Moyer Cc: Richard W.M. Jones, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Thu, 23 Aug 2012, Jeff Moyer wrote: > Hugh Dickins <hughd@google.com> writes: > > > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix > > > > Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") > > is not good: a successful call to grow_buffers() cannot guarantee > > that the page won't be reclaimed before the immediate next call to > > __find_get_block(), which is why there was always a loop there. > [snip] > > Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus > > a checkpatch nitfix). Simplify the interface between grow_buffers() > > and grow_dev_page(), and avoid the infinite loop beyond end of device > > by instead checking init_page_buffers()'s end_block there (I presume > > that's more efficient than a repeated call to blkdev_max_block()), > > returning -ENXIO to __getblk_slow() in that case. > > > > And remove akpm's ten-year-old "__getblk() cannot fail ... weird" > > comment, but that is worrying: are all users of __getblk() really > > now prepared for a NULL bh beyond end of device, or will some oops?? > > Hi, Hugh, > > Thanks for digging into this. I had wondered whether that loop was > required for other purposes, and you've verified that it was. I mostly > like your fix. Returning end_block from init_page_buffers is a little > strange, It is a little strange, I agree. I wrestled a lot with the way block and end_block were known at opposite ends of the callstack, and needed to be brought together for the check. If you think it would be better to move the blkdev_max_block() lookup up a level into grow_dev_page(), then pass end_block down to init_page_buffers(), that could work as well; though we'd then still need to report "failure" from init_page_buffers(). I was inclined to leave it precisely where you had sited it, with that comment on passing it back up; but could change it around if preferred. > but I understand not wanting to redo the call to blkdev_max_block. > > I went ahead to re-tested with the reproducer that I had, and your patch > works fine. Thanks again for tracking this down, and sorry I wasn't > more diligent to begin with. > > Reviewed-and-Tested-by: Jeff Moyer <jmoyer@redhat.com> Great, thanks a lot. It (but equally its incorrect earlier version) have been running fine at home under my swapping load. I'm confident that it fixes the issues I had been seeing, although, strictly speaking, it'll take another two or three days to reach bisection confidence level. Hugh ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 22:46 ` Hugh Dickins @ 2012-08-24 13:13 ` Jeff Moyer 0 siblings, 0 replies; 10+ messages in thread From: Jeff Moyer @ 2012-08-24 13:13 UTC (permalink / raw) To: Hugh Dickins Cc: Richard W.M. Jones, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel Hugh Dickins <hughd@google.com> writes: > On Thu, 23 Aug 2012, Jeff Moyer wrote: >> Hugh Dickins <hughd@google.com> writes: >> >> > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix >> > >> > Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") >> > is not good: a successful call to grow_buffers() cannot guarantee >> > that the page won't be reclaimed before the immediate next call to >> > __find_get_block(), which is why there was always a loop there. >> [snip] >> > Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus >> > a checkpatch nitfix). Simplify the interface between grow_buffers() >> > and grow_dev_page(), and avoid the infinite loop beyond end of device >> > by instead checking init_page_buffers()'s end_block there (I presume >> > that's more efficient than a repeated call to blkdev_max_block()), >> > returning -ENXIO to __getblk_slow() in that case. >> > >> > And remove akpm's ten-year-old "__getblk() cannot fail ... weird" >> > comment, but that is worrying: are all users of __getblk() really >> > now prepared for a NULL bh beyond end of device, or will some oops?? >> >> Hi, Hugh, >> >> Thanks for digging into this. I had wondered whether that loop was >> required for other purposes, and you've verified that it was. I mostly >> like your fix. Returning end_block from init_page_buffers is a little >> strange, > > It is a little strange, I agree. I wrestled a lot with the way block > and end_block were known at opposite ends of the callstack, and needed > to be brought together for the check. > > If you think it would be better to move the blkdev_max_block() > lookup up a level into grow_dev_page(), then pass end_block down to > init_page_buffers(), that could work as well; though we'd then still > need to report "failure" from init_page_buffers(). I have no strong opinion; I can live with it as-is. > Great, thanks a lot. It (but equally its incorrect earlier version) > have been running fine at home under my swapping load. I'm confident > that it fixes the issues I had been seeing, although, strictly speaking, > it'll take another two or three days to reach bisection confidence level. Well, I bought your analysis, so I'm at least ready to see this get pushed. ;-) Cheers, Jeff ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 4:56 ` Hugh Dickins 2012-08-23 10:20 ` Jens Axboe 2012-08-23 20:40 ` Jeff Moyer @ 2012-08-28 11:14 ` Richard W.M. Jones 2012-08-28 14:35 ` Hugh Dickins 2 siblings, 1 reply; 10+ messages in thread From: Richard W.M. Jones @ 2012-08-28 11:14 UTC (permalink / raw) To: Hugh Dickins Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Wed, Aug 22, 2012 at 09:56:12PM -0700, Hugh Dickins wrote: > On Wed, 22 Aug 2012, Richard W.M. Jones wrote: > > On Tue, Aug 21, 2012 at 06:33:45PM -0700, Hugh Dickins wrote: > > > Jeff, > > > > > > Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"), > > > already gone into 3.* stable, is not good. Could you and your testers > > > please give this alternative a try - I think it should work, and have > > > started it on a few days' memory load on 3.5, but not tried your case. > > > > I have tested your patch and it does NOT fix the problem in > > http://bugzilla.redhat.com/835019 > > Kernel 3.6.0 + your patch => mount goes into a loop when mounting > > a small empty partition. Please do NOT apply this as it will > > cause a regression! > > That was all very helpful information that you provided, thank you. > Sorry, I had missed how "block" is massaged as it's passed down a level. > > The patch below fixes it for me, though it does have to change more than > I'd been hoping in such a fix. Perhaps I am just being silly to resist > repeating the call to blkdev_max_block(). Does this patch work for you? > > Thanks, > Hugh > > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix I noticed this (second version) went upstream already. Nevertheless I tested it today and it doesn't cause a regression of RHBZ#835019. Thanks, Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-28 11:14 ` Richard W.M. Jones @ 2012-08-28 14:35 ` Hugh Dickins 0 siblings, 0 replies; 10+ messages in thread From: Hugh Dickins @ 2012-08-28 14:35 UTC (permalink / raw) To: Richard W.M. Jones Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Tue, 28 Aug 2012, Richard W.M. Jones wrote: > On Wed, Aug 22, 2012 at 09:56:12PM -0700, Hugh Dickins wrote: > > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix > > I noticed this (second version) went upstream already. Nevertheless I > tested it today and it doesn't cause a regression of RHBZ#835019. Thank you for testing and confirming, Rich. Once Jeff had tested it, we felt safe to go ahead while you were off. Hugh ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2012-08-28 14:36 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-08-22 1:33 [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix Hugh Dickins 2012-08-22 9:12 ` Richard W.M. Jones 2012-08-22 11:52 ` Richard W.M. Jones 2012-08-23 4:56 ` Hugh Dickins 2012-08-23 10:20 ` Jens Axboe 2012-08-23 20:40 ` Jeff Moyer 2012-08-23 22:46 ` Hugh Dickins 2012-08-24 13:13 ` Jeff Moyer 2012-08-28 11:14 ` Richard W.M. Jones 2012-08-28 14:35 ` Hugh Dickins
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.