From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg KH <gregkh@linuxfoundation.org>,
Hugh Dickins <hughd@google.com>, Jens Axboe <axboe@kernel.dk>
Subject: [ 27/46] block: replace __getblk_slow misfix by grow_dev_page fix
Date: Wed, 12 Sep 2012 16:39:17 -0700 [thread overview]
Message-ID: <20120912233820.511734466@linuxfoundation.org> (raw)
In-Reply-To: <20120912233817.662663809@linuxfoundation.org>
From: Greg KH <gregkh@linuxfoundation.org>
3.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hugh Dickins <hughd@google.com>
commit 676ce6d5ca3098339c028d44fe0427d1566a4d2d upstream.
Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow")
is not good: a successful call to grow_buffers() cannot guarantee
that the page won't be reclaimed before the immediate next call to
__find_get_block(), which is why there was always a loop there.
Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595:
inode #19278: block 664: comm cc1: unable to read itable block" on console,
which pointed to this commit.
I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs
sometimes fails from a missing header file, under memory pressure on
ppc G5. I've never seen this on x86, and I've never seen it on 3.5-rc7
itself, despite that commit being in there: bisection pointed to an
irrelevant pinctrl merge, but hard to tell when failure takes between
18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2).
(I've since found such __ext4_get_inode_loc errors in /var/log/messages
from previous weeks: why the message never appeared on console until
yesterday morning is a mystery for another day.)
Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus
a checkpatch nitfix). Simplify the interface between grow_buffers()
and grow_dev_page(), and avoid the infinite loop beyond end of device
by instead checking init_page_buffers()'s end_block there (I presume
that's more efficient than a repeated call to blkdev_max_block()),
returning -ENXIO to __getblk_slow() in that case.
And remove akpm's ten-year-old "__getblk() cannot fail ... weird"
comment, but that is worrying: are all users of __getblk() really
now prepared for a NULL bh beyond end of device, or will some oops??
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/buffer.c | 66 +++++++++++++++++++++++++++---------------------------------
1 file changed, 30 insertions(+), 36 deletions(-)
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -961,7 +961,7 @@ link_dev_buffers(struct page *page, stru
/*
* Initialise the state of a blockdev page's buffers.
*/
-static void
+static sector_t
init_page_buffers(struct page *page, struct block_device *bdev,
sector_t block, int size)
{
@@ -983,33 +983,41 @@ init_page_buffers(struct page *page, str
block++;
bh = bh->b_this_page;
} while (bh != head);
+
+ /*
+ * Caller needs to validate requested block against end of device.
+ */
+ return end_block;
}
/*
* Create the page-cache page that contains the requested block.
*
- * This is user purely for blockdev mappings.
+ * This is used purely for blockdev mappings.
*/
-static struct page *
+static int
grow_dev_page(struct block_device *bdev, sector_t block,
- pgoff_t index, int size)
+ pgoff_t index, int size, int sizebits)
{
struct inode *inode = bdev->bd_inode;
struct page *page;
struct buffer_head *bh;
+ sector_t end_block;
+ int ret = 0; /* Will call free_more_memory() */
page = find_or_create_page(inode->i_mapping, index,
(mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS)|__GFP_MOVABLE);
if (!page)
- return NULL;
+ return ret;
BUG_ON(!PageLocked(page));
if (page_has_buffers(page)) {
bh = page_buffers(page);
if (bh->b_size == size) {
- init_page_buffers(page, bdev, block, size);
- return page;
+ end_block = init_page_buffers(page, bdev,
+ index << sizebits, size);
+ goto done;
}
if (!try_to_free_buffers(page))
goto failed;
@@ -1029,15 +1037,15 @@ grow_dev_page(struct block_device *bdev,
*/
spin_lock(&inode->i_mapping->private_lock);
link_dev_buffers(page, bh);
- init_page_buffers(page, bdev, block, size);
+ end_block = init_page_buffers(page, bdev, index << sizebits, size);
spin_unlock(&inode->i_mapping->private_lock);
- return page;
+done:
+ ret = (block < end_block) ? 1 : -ENXIO;
failed:
- BUG();
unlock_page(page);
page_cache_release(page);
- return NULL;
+ return ret;
}
/*
@@ -1047,7 +1055,6 @@ failed:
static int
grow_buffers(struct block_device *bdev, sector_t block, int size)
{
- struct page *page;
pgoff_t index;
int sizebits;
@@ -1071,22 +1078,14 @@ grow_buffers(struct block_device *bdev,
bdevname(bdev, b));
return -EIO;
}
- block = index << sizebits;
+
/* Create a page with the proper size buffers.. */
- page = grow_dev_page(bdev, block, index, size);
- if (!page)
- return 0;
- unlock_page(page);
- page_cache_release(page);
- return 1;
+ return grow_dev_page(bdev, block, index, size, sizebits);
}
static struct buffer_head *
__getblk_slow(struct block_device *bdev, sector_t block, int size)
{
- int ret;
- struct buffer_head *bh;
-
/* Size must be multiple of hard sectorsize */
if (unlikely(size & (bdev_logical_block_size(bdev)-1) ||
(size < 512 || size > PAGE_SIZE))) {
@@ -1099,21 +1098,20 @@ __getblk_slow(struct block_device *bdev,
return NULL;
}
-retry:
- bh = __find_get_block(bdev, block, size);
- if (bh)
- return bh;
+ for (;;) {
+ struct buffer_head *bh;
+ int ret;
- ret = grow_buffers(bdev, block, size);
- if (ret == 0) {
- free_more_memory();
- goto retry;
- } else if (ret > 0) {
bh = __find_get_block(bdev, block, size);
if (bh)
return bh;
+
+ ret = grow_buffers(bdev, block, size);
+ if (ret < 0)
+ return NULL;
+ if (ret == 0)
+ free_more_memory();
}
- return NULL;
}
/*
@@ -1369,10 +1367,6 @@ EXPORT_SYMBOL(__find_get_block);
* which corresponds to the passed block_device, block and size. The
* returned buffer has its reference count incremented.
*
- * __getblk() cannot fail - it just keeps trying. If you pass it an
- * illegal block number, __getblk() will happily return a buffer_head
- * which represents the non-existent block. Very weird.
- *
* __getblk() will lock up the machine if grow_dev_page's try_to_free_buffers()
* attempt is failing. FIXME, perhaps?
*/
next prev parent reply other threads:[~2012-09-12 23:47 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-09-12 23:38 [ 00/46] 3.0.43-stable review Greg Kroah-Hartman
2012-09-12 23:38 ` [ 01/46] USB: vt6656: remove __devinit* from the struct usb_device_id table Greg Kroah-Hartman
2012-09-12 23:38 ` [ 02/46] USB: emi62: " Greg Kroah-Hartman
2012-09-12 23:38 ` [ 03/46] ALSA: hda - fix Copyright debug message Greg Kroah-Hartman
2012-09-12 23:38 ` [ 04/46] ARM: 7487/1: mm: avoid setting nG bit for user mappings that arent present Greg Kroah-Hartman
2012-09-12 23:38 ` [ 05/46] ARM: 7488/1: mm: use 5 bits for swapfile type encoding Greg Kroah-Hartman
2012-09-12 23:38 ` [ 06/46] ARM: 7489/1: errata: fix workaround for erratum #720789 on UP systems Greg Kroah-Hartman
2012-09-12 23:38 ` [ 07/46] ARM: S3C24XX: Fix s3c2410_dma_enqueue parameters Greg Kroah-Hartman
2012-09-12 23:38 ` [ 08/46] ARM: imx: select CPU_FREQ_TABLE when needed Greg Kroah-Hartman
2012-09-12 23:38 ` [ 09/46] ASoC: wm9712: Fix microphone source selection Greg Kroah-Hartman
2012-09-12 23:39 ` [ 10/46] vfs: missed source of ->f_pos races Greg Kroah-Hartman
2012-09-12 23:39 ` [ 11/46] vfs: canonicalize create mode in build_open_flags() Greg Kroah-Hartman
2012-09-12 23:39 ` [ 12/46] alpha: Dont export SOCK_NONBLOCK to user space Greg Kroah-Hartman
2012-09-12 23:39 ` [ 13/46] USB: winbond: remove __devinit* from the struct usb_device_id table Greg Kroah-Hartman
2012-09-12 23:39 ` [ 14/46] mm: hugetlbfs: correctly populate shared pmd Greg Kroah-Hartman
2012-09-12 23:39 ` [ 15/46] NFSv3: Ensure that do_proc_get_root() reports errors correctly Greg Kroah-Hartman
2012-09-12 23:39 ` [ 16/46] NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done Greg Kroah-Hartman
2012-09-16 16:33 ` Ben Hutchings
2012-09-16 16:37 ` Greg Kroah-Hartman
2012-09-17 13:05 ` Myklebust, Trond
2012-09-19 9:49 ` Boaz Harrosh
2012-09-12 23:39 ` [ 17/46] NFS: Alias the nfs module to nfs4 Greg Kroah-Hartman
2012-09-12 23:39 ` [ 18/46] audit: dont free_chunk() after fsnotify_add_mark() Greg Kroah-Hartman
2012-09-12 23:39 ` [ 19/46] audit: fix refcounting in audit-tree Greg Kroah-Hartman
2012-09-12 23:39 ` [ 20/46] svcrpc: fix BUG() in svc_tcp_clear_pages Greg Kroah-Hartman
2012-09-12 23:39 ` [ 21/46] svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping Greg Kroah-Hartman
2012-09-12 23:39 ` [ 22/46] svcrpc: sends on closed socket should stop immediately Greg Kroah-Hartman
2012-09-12 23:39 ` [ 23/46] cciss: fix incorrect scsi status reporting Greg Kroah-Hartman
2012-09-12 23:39 ` [ 24/46] ACPI: export symbol acpi_get_table_with_size Greg Kroah-Hartman
2012-09-15 0:22 ` Ben Hutchings
2012-09-15 3:13 ` Greg Kroah-Hartman
2012-09-12 23:39 ` [ 25/46] ath9k: fix decrypt_error initialization in ath_rx_tasklet() Greg Kroah-Hartman
2012-09-12 23:39 ` [ 26/46] PCI: EHCI: Fix crash during hibernation on ASUS computers Greg Kroah-Hartman
2012-09-12 23:39 ` Greg Kroah-Hartman [this message]
2012-09-12 23:39 ` [ 28/46] USB: spca506: remove __devinit* from the struct usb_device_id table Greg Kroah-Hartman
2012-09-12 23:39 ` [ 29/46] USB: p54usb: " Greg Kroah-Hartman
2012-09-12 23:39 ` [ 30/46] USB: rtl8187: " Greg Kroah-Hartman
2012-09-12 23:39 ` [ 31/46] USB: smsusb: " Greg Kroah-Hartman
2012-09-12 23:39 ` [ 32/46] USB: CDC ACM: Fix NULL pointer dereference Greg Kroah-Hartman
2012-09-12 23:39 ` [ 33/46] powerpc: Fix DSCR inheritance in copy_thread() Greg Kroah-Hartman
2012-09-12 23:39 ` [ 34/46] powerpc: Restore correct DSCR in context switch Greg Kroah-Hartman
2012-09-12 23:39 ` [ 35/46] Remove user-triggerable BUG from mpol_to_str Greg Kroah-Hartman
2012-09-12 23:39 ` [ 36/46] SCSI: megaraid_sas: Move poll_aen_lock initializer Greg Kroah-Hartman
2012-09-12 23:39 ` [ 37/46] SCSI: mpt2sas: Fix for Driver oops, when loading driver with max_queue_depth command line option to a very small value Greg Kroah-Hartman
2012-09-12 23:39 ` [ 38/46] SCSI: Fix Device not ready issue on mpt2sas Greg Kroah-Hartman
2012-09-12 23:39 ` [ 39/46] udf: Fix data corruption for files in ICB Greg Kroah-Hartman
2012-09-12 23:39 ` [ 40/46] ext3: Fix fdatasync() for files with only i_size changes Greg Kroah-Hartman
2012-09-12 23:39 ` [ 41/46] fuse: fix retrieve length Greg Kroah-Hartman
2012-09-12 23:39 ` [ 42/46] Input: i8042 - add Gigabyte T1005 series netbooks to noloop table Greg Kroah-Hartman
2012-09-12 23:39 ` [ 43/46] drm/vmwgfx: add MODULE_DEVICE_TABLE so vmwgfx loads at boot Greg Kroah-Hartman
2012-09-12 23:39 ` [ 44/46] PARISC: Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts Greg Kroah-Hartman
2012-09-12 23:39 ` [ 45/46] dccp: check ccid before dereferencing Greg Kroah-Hartman
2012-09-12 23:39 ` [ 46/46] hwmon: (asus_atk0110) Add quirk for Asus M5A78L Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120912233820.511734466@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=axboe@kernel.dk \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox