All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg KH <gregkh@linuxfoundation.org>,
	Hugh Dickins <hughd@google.com>, Jens Axboe <axboe@kernel.dk>
Subject: [ 47/85] block: replace __getblk_slow misfix by grow_dev_page fix
Date: Wed, 12 Sep 2012 16:36:30 -0700	[thread overview]
Message-ID: <20120912233546.673296113@linuxfoundation.org> (raw)
In-Reply-To: <20120912233541.747973832@linuxfoundation.org>

From: Greg KH <gregkh@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hugh Dickins <hughd@google.com>

commit 676ce6d5ca3098339c028d44fe0427d1566a4d2d upstream.

Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow")
is not good: a successful call to grow_buffers() cannot guarantee
that the page won't be reclaimed before the immediate next call to
__find_get_block(), which is why there was always a loop there.

Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595:
inode #19278: block 664: comm cc1: unable to read itable block" on console,
which pointed to this commit.

I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs
sometimes fails from a missing header file, under memory pressure on
ppc G5.  I've never seen this on x86, and I've never seen it on 3.5-rc7
itself, despite that commit being in there: bisection pointed to an
irrelevant pinctrl merge, but hard to tell when failure takes between
18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2).

(I've since found such __ext4_get_inode_loc errors in /var/log/messages
from previous weeks: why the message never appeared on console until
yesterday morning is a mystery for another day.)

Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus
a checkpatch nitfix).  Simplify the interface between grow_buffers()
and grow_dev_page(), and avoid the infinite loop beyond end of device
by instead checking init_page_buffers()'s end_block there (I presume
that's more efficient than a repeated call to blkdev_max_block()),
returning -ENXIO to __getblk_slow() in that case.

And remove akpm's ten-year-old "__getblk() cannot fail ... weird"
comment, but that is worrying: are all users of __getblk() really
now prepared for a NULL bh beyond end of device, or will some oops??

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/buffer.c |   66 +++++++++++++++++++++++++++---------------------------------
 1 file changed, 30 insertions(+), 36 deletions(-)

--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -914,7 +914,7 @@ link_dev_buffers(struct page *page, stru
 /*
  * Initialise the state of a blockdev page's buffers.
  */ 
-static void
+static sector_t
 init_page_buffers(struct page *page, struct block_device *bdev,
 			sector_t block, int size)
 {
@@ -936,33 +936,41 @@ init_page_buffers(struct page *page, str
 		block++;
 		bh = bh->b_this_page;
 	} while (bh != head);
+
+	/*
+	 * Caller needs to validate requested block against end of device.
+	 */
+	return end_block;
 }
 
 /*
  * Create the page-cache page that contains the requested block.
  *
- * This is user purely for blockdev mappings.
+ * This is used purely for blockdev mappings.
  */
-static struct page *
+static int
 grow_dev_page(struct block_device *bdev, sector_t block,
-		pgoff_t index, int size)
+		pgoff_t index, int size, int sizebits)
 {
 	struct inode *inode = bdev->bd_inode;
 	struct page *page;
 	struct buffer_head *bh;
+	sector_t end_block;
+	int ret = 0;		/* Will call free_more_memory() */
 
 	page = find_or_create_page(inode->i_mapping, index,
 		(mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS)|__GFP_MOVABLE);
 	if (!page)
-		return NULL;
+		return ret;
 
 	BUG_ON(!PageLocked(page));
 
 	if (page_has_buffers(page)) {
 		bh = page_buffers(page);
 		if (bh->b_size == size) {
-			init_page_buffers(page, bdev, block, size);
-			return page;
+			end_block = init_page_buffers(page, bdev,
+						index << sizebits, size);
+			goto done;
 		}
 		if (!try_to_free_buffers(page))
 			goto failed;
@@ -982,14 +990,14 @@ grow_dev_page(struct block_device *bdev,
 	 */
 	spin_lock(&inode->i_mapping->private_lock);
 	link_dev_buffers(page, bh);
-	init_page_buffers(page, bdev, block, size);
+	end_block = init_page_buffers(page, bdev, index << sizebits, size);
 	spin_unlock(&inode->i_mapping->private_lock);
-	return page;
-
+done:
+	ret = (block < end_block) ? 1 : -ENXIO;
 failed:
 	unlock_page(page);
 	page_cache_release(page);
-	return NULL;
+	return ret;
 }
 
 /*
@@ -999,7 +1007,6 @@ failed:
 static int
 grow_buffers(struct block_device *bdev, sector_t block, int size)
 {
-	struct page *page;
 	pgoff_t index;
 	int sizebits;
 
@@ -1023,22 +1030,14 @@ grow_buffers(struct block_device *bdev,
 			bdevname(bdev, b));
 		return -EIO;
 	}
-	block = index << sizebits;
+
 	/* Create a page with the proper size buffers.. */
-	page = grow_dev_page(bdev, block, index, size);
-	if (!page)
-		return 0;
-	unlock_page(page);
-	page_cache_release(page);
-	return 1;
+	return grow_dev_page(bdev, block, index, size, sizebits);
 }
 
 static struct buffer_head *
 __getblk_slow(struct block_device *bdev, sector_t block, int size)
 {
-	int ret;
-	struct buffer_head *bh;
-
 	/* Size must be multiple of hard sectorsize */
 	if (unlikely(size & (bdev_logical_block_size(bdev)-1) ||
 			(size < 512 || size > PAGE_SIZE))) {
@@ -1051,21 +1050,20 @@ __getblk_slow(struct block_device *bdev,
 		return NULL;
 	}
 
-retry:
-	bh = __find_get_block(bdev, block, size);
-	if (bh)
-		return bh;
+	for (;;) {
+		struct buffer_head *bh;
+		int ret;
 
-	ret = grow_buffers(bdev, block, size);
-	if (ret == 0) {
-		free_more_memory();
-		goto retry;
-	} else if (ret > 0) {
 		bh = __find_get_block(bdev, block, size);
 		if (bh)
 			return bh;
+
+		ret = grow_buffers(bdev, block, size);
+		if (ret < 0)
+			return NULL;
+		if (ret == 0)
+			free_more_memory();
 	}
-	return NULL;
 }
 
 /*
@@ -1321,10 +1319,6 @@ EXPORT_SYMBOL(__find_get_block);
  * which corresponds to the passed block_device, block and size. The
  * returned buffer has its reference count incremented.
  *
- * __getblk() cannot fail - it just keeps trying.  If you pass it an
- * illegal block number, __getblk() will happily return a buffer_head
- * which represents the non-existent block.  Very weird.
- *
  * __getblk() will lock up the machine if grow_dev_page's try_to_free_buffers()
  * attempt is failing.  FIXME, perhaps?
  */



  parent reply	other threads:[~2012-09-13  0:07 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-12 23:35 [ 00/85] 3.4.11-stable review Greg Kroah-Hartman
2012-09-12 23:35 ` [ 01/85] USB: vt6656: remove __devinit* from the struct usb_device_id table Greg Kroah-Hartman
2012-09-12 23:35 ` [ 02/85] USB: emi62: " Greg Kroah-Hartman
2012-09-12 23:35 ` [ 03/85] ALSA: hda - fix Copyright debug message Greg Kroah-Hartman
2012-09-12 23:35 ` [ 04/85] ARM: 7483/1: vfp: only advertise VFPv4 in hwcaps if CONFIG_VFPv3 is enabled Greg Kroah-Hartman
2012-09-12 23:35 ` [ 05/85] ARM: 7487/1: mm: avoid setting nG bit for user mappings that arent present Greg Kroah-Hartman
2012-09-12 23:35 ` [ 06/85] ARM: 7488/1: mm: use 5 bits for swapfile type encoding Greg Kroah-Hartman
2012-09-12 23:35 ` [ 07/85] ARM: 7489/1: errata: fix workaround for erratum #720789 on UP systems Greg Kroah-Hartman
2012-09-12 23:35 ` [ 08/85] ARM: OMAP2+: Fix dmtimer set source clock failure Greg Kroah-Hartman
2012-09-12 23:35 ` [ 09/85] ARM: S3C24XX: Add missing DMACH_DT_PROP Greg Kroah-Hartman
2012-09-12 23:35 ` [ 10/85] ARM: S3C24XX: Fix s3c2410_dma_enqueue parameters Greg Kroah-Hartman
2012-09-12 23:35 ` [ 11/85] Revert dma: imx-dma: Fix kernel crash due to missing clock conversion Greg Kroah-Hartman
2012-09-12 23:35 ` [ 12/85] xen/setup: Fix one-off error when adding for-balloon PFNs to the P2M Greg Kroah-Hartman
2012-09-12 23:35 ` [ 13/85] ARM: imx6: spin the cpu until hardware takes it down Greg Kroah-Hartman
2012-09-12 23:35 ` [ 14/85] ARM: imx: select CPU_FREQ_TABLE when needed Greg Kroah-Hartman
2012-09-12 23:35 ` [ 15/85] ASoC: wm9712: Fix microphone source selection Greg Kroah-Hartman
2012-09-12 23:35 ` [ 16/85] ASoC: omap-mcbsp: Fix 6pin mux configuration Greg Kroah-Hartman
2012-09-12 23:36 ` [ 17/85] vfs: missed source of ->f_pos races Greg Kroah-Hartman
2012-09-12 23:36 ` [ 18/85] vfs: canonicalize create mode in build_open_flags() Greg Kroah-Hartman
2012-09-12 23:36 ` [ 19/85] alpha: fix fpu.h usage in userspace Greg Kroah-Hartman
2012-09-12 23:36 ` [ 20/85] alpha: Dont export SOCK_NONBLOCK to user space Greg Kroah-Hartman
2012-09-12 23:36 ` [ 21/85] USB: winbond: remove __devinit* from the struct usb_device_id table Greg Kroah-Hartman
2012-09-12 23:36 ` [ 22/85] mm: hugetlbfs: correctly populate shared pmd Greg Kroah-Hartman
2012-09-12 23:36 ` [ 23/85] ALSA: hda - dont create dysfunctional mixer controls for ca0132 Greg Kroah-Hartman
2012-09-12 23:36 ` [ 24/85] target: fix NULL pointer dereference bug alloc_page() fails to get memory Greg Kroah-Hartman
2012-09-12 23:36 ` [ 25/85] NFSv3: Ensure that do_proc_get_root() reports errors correctly Greg Kroah-Hartman
2012-09-12 23:36 ` [ 26/85] pnfs: defer release of pages in layoutget Greg Kroah-Hartman
2012-09-12 23:36 ` [ 27/85] NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done Greg Kroah-Hartman
2012-09-12 23:36 ` [ 28/85] NFS: Clear key construction data if the idmap upcall fails Greg Kroah-Hartman
2012-09-12 23:36 ` [ 29/85] NFS: return -ENOKEY when the upcall fails to map the name Greg Kroah-Hartman
2012-09-12 23:36 ` [ 30/85] UBIFS: fix complaints about too small debug buffer size Greg Kroah-Hartman
2012-09-12 23:36 ` [ 31/85] Bluetooth: Fix using NULL inquiry entry Greg Kroah-Hartman
2012-09-12 23:36 ` [ 32/85] Bluetooth: Fix using a NULL inquiry cache entry Greg Kroah-Hartman
2012-09-12 23:36 ` [ 33/85] Bluetooth: Set name_state to unknown when entry name is empty Greg Kroah-Hartman
2012-09-12 23:36 ` [ 34/85] Bluetooth: Fix legacy pairing with some devices Greg Kroah-Hartman
2012-09-12 23:36 ` [ 35/85] NFS: Alias the nfs module to nfs4 Greg Kroah-Hartman
2012-09-12 23:36 ` [ 36/85] audit: dont free_chunk() after fsnotify_add_mark() Greg Kroah-Hartman
2012-09-12 23:36 ` [ 37/85] audit: fix refcounting in audit-tree Greg Kroah-Hartman
2012-09-12 23:36 ` [ 38/85] drm: stop vmgfx driver explosion Greg Kroah-Hartman
2012-09-12 23:36 ` [ 39/85] Revert "drm/radeon: fix bo creation retry path" Greg Kroah-Hartman
2012-09-12 23:36 ` [ 40/85] svcrpc: fix BUG() in svc_tcp_clear_pages Greg Kroah-Hartman
2012-09-12 23:36 ` [ 41/85] svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping Greg Kroah-Hartman
2012-09-12 23:36 ` [ 42/85] svcrpc: sends on closed socket should stop immediately Greg Kroah-Hartman
2012-09-12 23:36 ` [ 43/85] cciss: fix incorrect scsi status reporting Greg Kroah-Hartman
2012-09-12 23:36 ` [ 44/85] ACPI: export symbol acpi_get_table_with_size Greg Kroah-Hartman
2012-09-15  2:19   ` Ben Hutchings
2012-09-12 23:36 ` [ 45/85] ath9k: fix decrypt_error initialization in ath_rx_tasklet() Greg Kroah-Hartman
2012-09-12 23:36 ` [ 46/85] PCI: EHCI: Fix crash during hibernation on ASUS computers Greg Kroah-Hartman
2012-09-12 23:36 ` Greg Kroah-Hartman [this message]
2012-09-12 23:36 ` [ 48/85] sched,cgroup: Fix up task_groups list Greg Kroah-Hartman
2012-09-12 23:36 ` [ 49/85] sched: fix divide by zero at {thread_group,task}_times Greg Kroah-Hartman
2012-09-12 23:36 ` [ 50/85] [media] uvcvideo: Reset the bytesused field when recycling an erroneous buffer Greg Kroah-Hartman
2012-09-12 23:36 ` [ 51/85] rapidio/tsi721: fix inbound doorbell interrupt handling Greg Kroah-Hartman
2012-09-12 23:36 ` [ 52/85] rapidio/tsi721: fix unused variable compiler warning Greg Kroah-Hartman
2012-09-12 23:36 ` [ 53/85] regulator: twl-regulator: fix up VINTANA1/VINTANA2 Greg Kroah-Hartman
2012-09-12 23:36 ` [ 54/85] x32: Use compat shims for {g,s}etsockopt Greg Kroah-Hartman
2012-09-12 23:36 ` [ 55/85] USB: spca506: remove __devinit* from the struct usb_device_id table Greg Kroah-Hartman
2012-09-12 23:36 ` [ 56/85] USB: jl2005bcd: " Greg Kroah-Hartman
2012-09-12 23:36 ` [ 57/85] USB: p54usb: " Greg Kroah-Hartman
2012-09-12 23:36 ` [ 58/85] USB: rtl8187: " Greg Kroah-Hartman
2012-09-12 23:36 ` [ 59/85] USB: smsusb: " Greg Kroah-Hartman
2012-09-12 23:36 ` [ 60/85] USB: CDC ACM: Fix NULL pointer dereference Greg Kroah-Hartman
2012-09-12 23:36 ` [ 61/85] powerpc: Update DSCR on all CPUs when writing sysfs dscr_default Greg Kroah-Hartman
2012-09-12 23:36 ` [ 62/85] powerpc: Keep thread.dscr and thread.dscr_inherit in sync Greg Kroah-Hartman
2012-09-12 23:36 ` [ 63/85] powerpc: Fix DSCR inheritance in copy_thread() Greg Kroah-Hartman
2012-09-12 23:36 ` [ 64/85] powerpc: Restore correct DSCR in context switch Greg Kroah-Hartman
2012-09-12 23:36 ` [ 65/85] powerpc: Make sure IPI handlers see data written by IPI senders Greg Kroah-Hartman
2012-09-12 23:36 ` [ 66/85] Remove user-triggerable BUG from mpol_to_str Greg Kroah-Hartman
2012-09-12 23:36 ` [ 67/85] Fix order of arguments to compat_put_time[spec|val] Greg Kroah-Hartman
2012-09-12 23:36 ` [ 68/85] SCSI: megaraid_sas: Move poll_aen_lock initializer Greg Kroah-Hartman
2012-09-12 23:36 ` [ 69/85] SCSI: scsi_lib: fix scsi_io_completions SG_IO error propagation Greg Kroah-Hartman
2012-09-12 23:36 ` [ 70/85] SCSI: mpt2sas: Fix for Driver oops, when loading driver with max_queue_depth command line option to a very small value Greg Kroah-Hartman
2012-09-12 23:36 ` [ 71/85] SCSI: Fix Device not ready issue on mpt2sas Greg Kroah-Hartman
2012-09-12 23:36 ` [ 72/85] udf: Fix data corruption for files in ICB Greg Kroah-Hartman
2012-09-12 23:36 ` [ 73/85] ext3: Fix fdatasync() for files with only i_size changes Greg Kroah-Hartman
2012-09-12 23:36 ` [ 74/85] fuse: fix retrieve length Greg Kroah-Hartman
2012-09-12 23:36 ` [ 75/85] i2c-designware: Fix build error if CONFIG_I2C_DESIGNWARE_PLATFORM=y && CONFIG_I2C_DESIGNWARE_PCI=y Greg Kroah-Hartman
2012-09-12 23:36 ` [ 76/85] i2c-i801: Add Device IDs for Intel Lynx Point-LP PCH Greg Kroah-Hartman
2012-09-12 23:37 ` [ 77/85] HID: add NOGET quirk for Eaton Ellipse MAX UPS Greg Kroah-Hartman
2012-09-12 23:37 ` [ 78/85] Input: i8042 - add Gigabyte T1005 series netbooks to noloop table Greg Kroah-Hartman
2012-09-12 23:37 ` [ 79/85] drm/vmwgfx: add MODULE_DEVICE_TABLE so vmwgfx loads at boot Greg Kroah-Hartman
2012-09-12 23:37 ` [ 80/85] PARISC: Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts Greg Kroah-Hartman
2012-09-12 23:37 ` [ 81/85] xen: Use correct masking in xen_swiotlb_alloc_coherent Greg Kroah-Hartman
2012-09-12 23:37 ` [ 82/85] xen/pciback: Fix proper FLR steps Greg Kroah-Hartman
2012-09-12 23:37 ` [ 83/85] x86, microcode, AMD: Fix broken ucode patch size check Greg Kroah-Hartman
2012-09-12 23:37 ` [ 84/85] dccp: check ccid before dereferencing Greg Kroah-Hartman
2012-09-12 23:37 ` [ 85/85] hwmon: (asus_atk0110) Add quirk for Asus M5A78L Greg Kroah-Hartman
2012-09-12 23:46 ` [ 00/85] 3.4.11-stable review Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120912233546.673296113@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=axboe@kernel.dk \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.