public inbox for linux-block@vger.kernel.org
* [PATCH 0/2] Bcache fixes for 4.16
@ 2018-02-27 17:49 Michael Lyle
  2018-02-27 17:49 ` [PATCH 1/2] bcache: correct flash only vols (check all uuids) Michael Lyle
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Michael Lyle @ 2018-02-27 17:49 UTC
  To: linux-bcache, linux-block; +Cc: axboe

Hi Jens,

Please pick up these two critical fixes to bcache by Tang Junhui.
They're both one-liners and have been reviewed and tested.

The first corrects a regression when flash-only volumes are present
that was introduced in 4.16-RC1.  The second adjusts bio refcount
and completion behavior to work with md RAID5 backing.

Thanks,

Mike


* [PATCH 1/2] bcache: correct flash only vols (check all uuids)
  2018-02-27 17:49 [PATCH 0/2] Bcache fixes for 4.16 Michael Lyle
@ 2018-02-27 17:49 ` Michael Lyle
  2018-02-27 17:49 ` [PATCH 2/2] bcache: fix kcrashes with fio in RAID5 backend dev Michael Lyle
  2018-02-27 17:54 ` [PATCH 0/2] Bcache fixes for 4.16 Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Michael Lyle @ 2018-02-27 17:49 UTC
  To: linux-bcache, linux-block; +Cc: axboe, Coly Li

From: Coly Li <colyli@suse.de>

Commit 2831231d4c3f ("bcache: reduce cache_set devices iteration by
devices_max_used") adds c->devices_max_used to reduce iteration over
c->uuids elements. This value is updated in bcache_device_attach().

But for a flash-only volume, bcache_device_attach() has not yet been
called when flash_devs_run() runs, so c->devices_max_used has not been
updated. As a result, the flash-only volume is never run by
flash_devs_run().

This patch fixes the issue by iterating over all c->uuids elements in
flash_devs_run(). c->devices_max_used is then updated properly when
bcache_device_attach() gets called.

[mlyle: commit subject edited for character limit]

Fixes: 2831231d4c3f ("bcache: reduce cache_set devices iteration by devices_max_used")
Reported-by: Tang Junhui <tang.junhui@zte.com.cn>
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
---
 drivers/md/bcache/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
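
The loop-bound problem described in the commit message can be reproduced
in a small userspace model. This is only a sketch: the struct and
function names below (uuid_entry_model, cache_set_model, run_flash_devs,
demo_started) are hypothetical stand-ins, not the kernel definitions, and
the assumption is that devices_max_used is still 0 when the flash-only
volumes are started.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct uuid_entry; not the kernel layout. */
struct uuid_entry_model {
	bool flash_only;	/* models UUID_FLASH_ONLY(u) */
	bool running;		/* set once the volume has been started */
};

/* Hypothetical stand-in for the relevant fields of struct cache_set. */
struct cache_set_model {
	struct uuid_entry_model *uuids;
	size_t nr_uuids;		/* total slots in the uuid array */
	size_t devices_max_used;	/* advanced only when a device attaches */
};

/* Walk the first `limit` uuid slots and start each flash-only volume. */
static size_t run_flash_devs(struct cache_set_model *c, size_t limit)
{
	size_t started = 0;

	assert(limit <= c->nr_uuids);
	for (struct uuid_entry_model *u = c->uuids; u < c->uuids + limit; u++) {
		if (u->flash_only) {
			u->running = true;
			started++;
		}
	}
	return started;
}

/*
 * One flash-only volume sits in slot 2 and nothing has been attached yet,
 * so devices_max_used is still 0 (assumption of this model).
 */
static size_t demo_started(bool use_nr_uuids)
{
	struct uuid_entry_model slots[4] = {
		{ false, false }, { false, false }, { true, false }, { false, false },
	};
	struct cache_set_model c = {
		.uuids = slots,
		.nr_uuids = 4,
		.devices_max_used = 0,
	};

	return run_flash_devs(&c, use_nr_uuids ? c.nr_uuids : c.devices_max_used);
}
```

In this model, demo_started(false) uses the old bound and starts no
volumes, while demo_started(true) uses the nr_uuids bound and starts the
one flash-only volume.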

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 312895788036..4d1d8dfb2d2a 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1274,7 +1274,7 @@ static int flash_devs_run(struct cache_set *c)
 	struct uuid_entry *u;
 
 	for (u = c->uuids;
-	     u < c->uuids + c->devices_max_used && !ret;
+	     u < c->uuids + c->nr_uuids && !ret;
 	     u++)
 		if (UUID_FLASH_ONLY(u))
 			ret = flash_dev_run(c, u);
-- 
2.14.1


* [PATCH 2/2] bcache: fix kcrashes with fio in RAID5 backend dev
  2018-02-27 17:49 [PATCH 0/2] Bcache fixes for 4.16 Michael Lyle
  2018-02-27 17:49 ` [PATCH 1/2] bcache: correct flash only vols (check all uuids) Michael Lyle
@ 2018-02-27 17:49 ` Michael Lyle
  2018-02-27 17:54 ` [PATCH 0/2] Bcache fixes for 4.16 Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Michael Lyle @ 2018-02-27 17:49 UTC
  To: linux-bcache, linux-block; +Cc: axboe, Tang Junhui

From: Tang Junhui <tang.junhui@zte.com.cn>

The kernel crashes when running fio on a bcache device with a RAID5
backing device. The call trace is below:
[  440.012034] kernel BUG at block/blk-ioc.c:146!
[  440.012696] invalid opcode: 0000 [#1] SMP NOPTI
[  440.026537] CPU: 2 PID: 2205 Comm: md127_raid5 Not tainted 4.15.0 #8
[  440.027441] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 07/16/2015
[  440.028615] RIP: 0010:put_io_context+0x8b/0x90
[  440.029246] RSP: 0018:ffffa8c882b43af8 EFLAGS: 00010246
[  440.029990] RAX: 0000000000000000 RBX: ffffa8c88294fca0 RCX: 00000000000f4240
[  440.031006] RDX: 0000000000000004 RSI: 0000000000000286 RDI: ffffa8c88294fca0
[  440.032030] RBP: ffffa8c882b43b10 R08: 0000000000000003 R09: ffff949cb80c1700
[  440.033206] R10: 0000000000000104 R11: 000000000000b71c R12: 0000000000001000
[  440.034222] R13: 0000000000000000 R14: ffff949cad84db70 R15: ffff949cb11bd1e0
[  440.035239] FS:  0000000000000000(0000) GS:ffff949cba280000(0000) knlGS:0000000000000000
[  440.060190] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.084967] CR2: 00007ff0493ef000 CR3: 00000002f1e0a002 CR4: 00000000001606e0
[  440.110498] Call Trace:
[  440.135443]  bio_disassociate_task+0x1b/0x60
[  440.160355]  bio_free+0x1b/0x60
[  440.184666]  bio_put+0x23/0x30
[  440.208272]  search_free+0x23/0x40 [bcache]
[  440.231448]  cached_dev_write_complete+0x31/0x70 [bcache]
[  440.254468]  closure_put+0xb6/0xd0 [bcache]
[  440.277087]  request_endio+0x30/0x40 [bcache]
[  440.298703]  bio_endio+0xa1/0x120
[  440.319644]  handle_stripe+0x418/0x2270 [raid456]
[  440.340614]  ? load_balance+0x17b/0x9c0
[  440.360506]  handle_active_stripes.isra.58+0x387/0x5a0 [raid456]
[  440.380675]  ? __release_stripe+0x15/0x20 [raid456]
[  440.400132]  raid5d+0x3ed/0x5d0 [raid456]
[  440.419193]  ? schedule+0x36/0x80
[  440.437932]  ? schedule_timeout+0x1d2/0x2f0
[  440.456136]  md_thread+0x122/0x150
[  440.473687]  ? wait_woken+0x80/0x80
[  440.491411]  kthread+0x102/0x140
[  440.508636]  ? find_pers+0x70/0x70
[  440.524927]  ? kthread_associate_blkcg+0xa0/0xa0
[  440.541791]  ret_from_fork+0x35/0x40
[  440.558020] Code: c2 48 00 5b 41 5c 41 5d 5d c3 48 89 c6 4c 89 e7 e8 bb c2 48 00 48 8b 3d bc 36 4b 01 48 89 de e8 7c f7 e0 ff 5b 41 5c 41 5d 5d c3 <0f> 0b 0f 1f 00 0f 1f 44 00 00 55 48 8d 47 b8 48 89 e5 41 57 41
[  440.610020] RIP: put_io_context+0x8b/0x90 RSP: ffffa8c882b43af8
[  440.628575] ---[ end trace a1fd79d85643a73e ]--

Every crash happened when a bypass IO came in. In that scenario,
s->iop.bio points to s->orig_bio. search_free() finishes s->orig_bio by
calling bio_complete(), after which s->iop.bio is no longer valid, so the
kernel crashes when bio_put() is called on it. This may be the upper
layer's fault, since a bio should not be freed before we call bio_put(),
but it is safer to call bio_put() first and only then call bio_complete()
to notify the upper layer that the bio has ended.

This patch moves bio_complete() after bio_put() to avoid the kernel crash.

[mlyle: fixed commit subject for character limits]

Reported-by: Matthias Ferdinand <bcache@mfedv.net>
Tested-by: Matthias Ferdinand <bcache@mfedv.net>
Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
---
 drivers/md/bcache/request.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
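
The ordering issue described in the commit message can be illustrated
with a small userspace model. This is a sketch under stated assumptions,
not the kernel code: bio_model, model_bio_put, model_bio_complete and the
other names are hypothetical, and the model simply assumes that
completing the original bio lets the upper layer release it, as the
commit message suspects. Nothing is actually freed here; a flag records
the release so the bad ordering can be detected safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio; only what the model needs. */
struct bio_model {
	int refs;	/* references still held (assumed starting value) */
	bool freed;	/* set once the upper layer has released the bio */
};

/* Models bio_put(): returns false if the bio was already released. */
static bool model_bio_put(struct bio_model *bio)
{
	if (bio->freed)
		return false;	/* models the put-after-release case */
	assert(bio->refs > 0);
	bio->refs--;
	return true;
}

/*
 * Models bio_complete(): the upper layer is notified and, in this model,
 * releases the bio immediately (an assumption, see the lead-in).
 */
static void model_bio_complete(struct bio_model *orig_bio)
{
	orig_bio->freed = true;
}

/* Old ordering: complete first, then drop our reference. */
static bool search_free_old_order(struct bio_model *orig_bio,
				  struct bio_model *iop_bio)
{
	bool ok = true;

	model_bio_complete(orig_bio);
	if (iop_bio)
		ok = model_bio_put(iop_bio);
	return ok;
}

/* New ordering: drop our reference first, then complete. */
static bool search_free_new_order(struct bio_model *orig_bio,
				  struct bio_model *iop_bio)
{
	bool ok = true;

	if (iop_bio)
		ok = model_bio_put(iop_bio);
	model_bio_complete(orig_bio);
	return ok;
}

/* Bypass case: iop.bio and orig_bio are the same object. */
static bool demo_bypass(bool new_order)
{
	struct bio_model bio = { .refs = 2, .freed = false };

	return new_order ? search_free_new_order(&bio, &bio)
			 : search_free_old_order(&bio, &bio);
}
```

In this model, demo_bypass(false) reports a put on an already released
bio, while demo_bypass(true) drops the reference before the upper layer
is notified and reports no problem.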

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 1a46b41dac70..6422846b546e 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -659,11 +659,11 @@ static void do_bio_hook(struct search *s, struct bio *orig_bio)
 static void search_free(struct closure *cl)
 {
 	struct search *s = container_of(cl, struct search, cl);
-	bio_complete(s);
 
 	if (s->iop.bio)
 		bio_put(s->iop.bio);
 
+	bio_complete(s);
 	closure_debug_destroy(cl);
 	mempool_free(s, s->d->c->search);
 }
-- 
2.14.1


* Re: [PATCH 0/2] Bcache fixes for 4.16
  2018-02-27 17:49 [PATCH 0/2] Bcache fixes for 4.16 Michael Lyle
  2018-02-27 17:49 ` [PATCH 1/2] bcache: correct flash only vols (check all uuids) Michael Lyle
  2018-02-27 17:49 ` [PATCH 2/2] bcache: fix kcrashes with fio in RAID5 backend dev Michael Lyle
@ 2018-02-27 17:54 ` Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2018-02-27 17:54 UTC
  To: Michael Lyle, linux-bcache, linux-block

On 2/27/18 10:49 AM, Michael Lyle wrote:
> Hi Jens,
> 
> Please pick up these two critical fixes to bcache by Tang Junhui.
> They're both one-liners and have been reviewed and tested.
> 
> The first corrects a regression when flash-only volumes are present
> that was introduced in 4.16-RC1.  The second adjusts bio refcount
> and completion behavior to work with md RAID5 backing.

Looks fine, applied for 4.16, thanks Mike.

-- 
Jens Axboe

