stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Sebastian Roesner <sroesner-kernelorg@roesner-online.de>,
	Eric Wheeler <bcache@lists.ewheeler.net>,
	Shaohua Li <shli@fb.com>,
	Kent Overstreet <kent.overstreet@gmail.com>,
	Ming Lei <ming.lei@canonical.com>, Jens Axboe <axboe@fb.com>
Subject: [PATCH 4.7 14/59] block: make sure a big bio is split into at most 256 bvecs
Date: Mon, 12 Sep 2016 17:29:34 +0200	[thread overview]
Message-ID: <20160912152129.362452618@linuxfoundation.org> (raw)
In-Reply-To: <20160912152128.765864031@linuxfoundation.org>

4.7-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ming Lei <ming.lei@canonical.com>

commit 4d70dca4eadf2f95abe389116ac02b8439c2d16c upstream.

After arbitrary bio size was introduced, the incoming bio may
be very big. We have to split the bio into small bios so that
each holds at most BIO_MAX_PAGES bvecs for safety reason, such
as bio_clone().

This patch fixes the following kernel crash:

> [  172.660142] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
> [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
> [  172.660399] Oops: 0000 [#1] SMP
> [...]
> [  172.664780] Call Trace:
> [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
> [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
> [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
> [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
> [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
> [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
> [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]

The issue can be reproduced by the following steps:
	- create one raid1 over two virtio-blk
	- build bcache device over the above raid1 and another cache device
	and bucket size is set as 2Mbytes
	- set cache mode as writeback
	- run random write over ext4 on the bcache device

Fixes: 54efd50(block: make generic_make_request handle arbitrarily sized bios)
Reported-by: Sebastian Roesner <sroesner-kernelorg@roesner-online.de>
Reported-by: Eric Wheeler <bcache@lists.ewheeler.net>
Cc: Shaohua Li <shli@fb.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 block/blk-merge.c |   22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -94,9 +94,31 @@ static struct bio *blk_bio_segment_split
 	bool do_split = true;
 	struct bio *new = NULL;
 	const unsigned max_sectors = get_max_io_size(q, bio);
+	unsigned bvecs = 0;
 
 	bio_for_each_segment(bv, bio, iter) {
 		/*
+		 * With arbitrary bio size, the incoming bio may be very
+		 * big. We have to split the bio into small bios so that
+		 * each holds at most BIO_MAX_PAGES bvecs because
+		 * bio_clone() can fail to allocate big bvecs.
+		 *
+		 * It should have been better to apply the limit per
+		 * request queue in which bio_clone() is involved,
+		 * instead of globally. The biggest blocker is the
+		 * bio_clone() in bio bounce.
+		 *
+		 * If bio is splitted by this reason, we should have
+		 * allowed to continue bios merging, but don't do
+		 * that now for making the change simple.
+		 *
+		 * TODO: deal with bio bounce's bio_clone() gracefully
+		 * and convert the global limit into per-queue limit.
+		 */
+		if (bvecs++ >= BIO_MAX_PAGES)
+			goto split;
+
+		/*
 		 * If the queue doesn't support SG gaps and adding this
 		 * offset would create a gap, disallow it.
 		 */



  parent reply	other threads:[~2016-09-12 15:32 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20160912153038uscas1p166159f601d12fb1e6621d90d3ef5d311@uscas1p1.samsung.com>
     [not found] ` <20160912152128.765864031@linuxfoundation.org>
2016-09-12 15:29   ` [PATCH 4.7 01/59] Revert "floppy: refactor open() flags handling" Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 02/59] apparmor: fix refcount race when finding a child profile Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 03/59] kernel: Add noaudit variant of ns_capable() Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 04/59] net: Use ns_capable_noaudit() when determining net sysctl permissions Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 05/59] fs: Check for invalid i_uid in may_follow_link() Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 06/59] cred: Reject inodes with invalid ids in set_create_file_as() Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 07/59] ext4: validate that metadata blocks do not overlap superblock Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 08/59] ext4: fix xattr shifting when expanding inodes Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 09/59] ext4: fix xattr shifting when expanding inodes part 2 Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 10/59] ext4: properly align shifted xattrs when expanding inodes Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 11/59] ext4: avoid deadlock when expanding inode size Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 13/59] block: Fix race triggered by blk_set_queue_dying() Greg Kroah-Hartman
2016-09-12 15:29   ` Greg Kroah-Hartman [this message]
2016-09-12 15:29   ` [PATCH 4.7 15/59] cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 16/59] cdc-acm: added sanity checking for probe() Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 19/59] drm/atomic: Dont potentially reset color_mgmt_changed on successive property updates Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 20/59] drm: Reject page_flip for !DRIVER_MODESET Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 21/59] drm/msm: fix use of copy_from_user() while holding spinlock Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 22/59] drm/vc4: Use drm_free_large() on handles to match its allocation Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 23/59] drm/vc4: Fix overflow mem unreferencing when the binner runs dry Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 24/59] drm/vc4: Fix oops when userspace hands in a bad BO Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 25/59] ASoC: atmel_ssc_dai: Dont unconditionally reset SSC on stream startup Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 26/59] xfs: fix superblock inprogress check Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 27/59] timekeeping: Cap array access in timekeeping_debug Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 28/59] timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 30/59] ovl: proper cleanup of workdir Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 31/59] ovl: dont copy up opaqueness Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 32/59] ovl: remove posix_acl_default from workdir Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 33/59] ovl: listxattr: use strnlen() Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 34/59] ovl: fix workdir creation Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 35/59] mei: me: disable driver on SPT SPS firmware Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 36/59] ubifs: Fix xattr generic handler usage Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 38/59] bdev: fix NULL pointer dereference Greg Kroah-Hartman
2016-09-12 15:29   ` [PATCH 4.7 39/59] bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 40/59] irqchip/mips-gic: Cleanup chip and handler setup Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 41/59] irqchip/mips-gic: Implement activate op for device domain Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 42/59] vhost/scsi: fix reuse of &vq->iov[out] in response Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 43/59] x86/apic: Do not init irq remapping if ioapic is disabled Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 44/59] xprtrdma: Create common scatterlist fields in rpcrdma_mw Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 46/59] fscrypto: add authorization check for setting encryption policy Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 47/59] fscrypto: only allow setting encryption policy on directories Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 48/59] ALSA: usb-audio: Add sample rate inquiry quirk for B850V3 CP2114 Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 49/59] ALSA: firewire-tascam: accessing to user space outside spinlock Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 50/59] ALSA: fireworks: " Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 51/59] ALSA: rawmidi: Fix possible deadlock with virmidi registration Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 52/59] ALSA: hda - Add headset mic quirk for Dell Inspiron 5468 Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 53/59] ALSA: hda - Enable subwoofer on Dell Inspiron 7559 Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 54/59] ALSA: timer: fix NULL pointer dereference in read()/ioctl() race Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 55/59] ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 56/59] ALSA: timer: fix NULL pointer dereference on memory allocation failure Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 57/59] ALSA: timer: Fix zero-division by continue of uninitialized instance Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 58/59] scsi: fix upper bounds check of sense key in scsi_sense_key_string() Greg Kroah-Hartman
2016-09-12 15:30   ` [PATCH 4.7 59/59] cpufreq: dt: Add terminate entry for of_device_id tables Greg Kroah-Hartman
2016-09-13  3:07   ` [PATCH 4.7 00/59] 4.7.4-stable review Guenter Roeck
2016-09-13  8:57     ` Greg Kroah-Hartman
     [not found]   ` <57d7cc1f.0d8d1c0a.df2d3.a693@mx.google.com>
2016-09-13 12:12     ` Greg Kroah-Hartman
2016-09-13 17:40       ` Kevin Hilman
2016-09-13 18:32   ` Shuah Khan
2016-09-15  6:19     ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160912152129.362452618@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=axboe@fb.com \
    --cc=bcache@lists.ewheeler.net \
    --cc=kent.overstreet@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@canonical.com \
    --cc=shli@fb.com \
    --cc=sroesner-kernelorg@roesner-online.de \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).