From: Bart Van Assche <Bart.VanAssche@wdc.com>
To: "tj@kernel.org" <tj@kernel.org>, "axboe@kernel.dk" <axboe@kernel.dk>
Cc: "kernel-team@fb.com" <kernel-team@fb.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"osandov@fb.com" <osandov@fb.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"oleg@redhat.com" <oleg@redhat.com>, "hch@lst.de" <hch@lst.de>
Subject: Re: [PATCHSET v2] blk-mq: reimplement timeout handling
Date: Wed, 20 Dec 2017 23:41:02 +0000
Message-ID: <1513813261.2603.36.camel@wdc.com>
In-Reply-To: <20171212190134.535941-1-tj@kernel.org>

On Tue, 2017-12-12 at 11:01 -0800, Tejun Heo wrote:
> Currently, blk-mq timeout path synchronizes against the usual
> issue/completion path using a complex scheme involving atomic
> bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
> rules.  Unfortunately, it contains quite a few holes.

Hello Tejun,

An attempt to run SCSI I/O with this patch series applied resulted in
the following:

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: scsi_times_out+0x1c/0x2d0
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 437 Comm: kworker/1:1H Tainted: G        W        4.15.0-rc4-dbg+ #1
Hardware name: Dell Inc. PowerEdge R720/0VWT90, BIOS 2.5.4 01/22/2016
Workqueue: kblockd blk_mq_timeout_work
RIP: 0010:scsi_times_out+0x1c/0x2d0
RSP: 0018:ffffc90007ef3d58 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880878eab000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880878eab000
RBP: ffff880878eab1a0 R08: ffffffffffffffff R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000004
R13: 0000000000000000 R14: ffff88085e4a5ce8 R15: ffff880878e9f848
FS:  0000000000000000(0000) GS:ffff88093f600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000001c0f002 CR4: 00000000000606e0
Call Trace:
 blk_mq_terminate_expired+0x36/0x70
 bt_iter+0x43/0x50
 blk_mq_queue_tag_busy_iter+0xee/0x200
 blk_mq_timeout_work+0x186/0x2e0
 process_one_work+0x221/0x6e0
 worker_thread+0x3a/0x390
 kthread+0x11c/0x140
 ret_from_fork+0x24/0x30
RIP: scsi_times_out+0x1c/0x2d0 RSP: ffffc90007ef3d58
CR2: 0000000000000000

(gdb) list *(scsi_times_out+0x1c)
0xffffffff8147adbc is in scsi_times_out (drivers/scsi/scsi_error.c:285).
280      */
281     enum blk_eh_timer_return scsi_times_out(struct request *req)
282     {
283             struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req);
284             enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
285             struct Scsi_Host *host = scmd->device->host;
286
287             trace_scsi_dispatch_cmd_timeout(scmd);
288             scsi_log_completion(scmd, TIMEOUT_ERROR);
289

(gdb) disas /s scsi_times_out
[ ... ]
283             struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req);
284             enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
285             struct Scsi_Host *host = scmd->device->host;
   0xffffffff8147adb2 <+18>:    mov    0x1d8(%rdi),%rax
   0xffffffff8147adb9 <+25>:    mov    %rdi,%rbx
   0xffffffff8147adbc <+28>:    mov    (%rax),%r13
   0xffffffff8147adbf <+31>:    nopl   0x0(%rax,%rax,1)

Bart.
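
A note on reading the trace above, sketched in C. The first load,
mov 0x1d8(%rdi),%rax, fetches scmd->device relative to req; the second,
mov (%rax),%r13, reads device->host at offset 0. With RAX and CR2 both
zero in the register dump, this is consistent with scmd->device being
NULL when the timeout handler ran. The structures below are hypothetical
stand-ins that keep only the fields involved, and host_load_address(),
symbol_base() and scmd_host_or_null() are illustrative helpers, not
kernel APIs and not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct Scsi_Host {
	int host_no;
};

struct scsi_device {
	struct Scsi_Host *host;	/* offset 0, matching "mov (%rax),%r13" */
};

struct scsi_cmnd {
	struct scsi_device *device;
};

/* Address that evaluating scmd->device->host would read from. */
static uint64_t host_load_address(const struct scsi_cmnd *scmd)
{
	assert(scmd != NULL);
	return (uint64_t)(uintptr_t)scmd->device +
	       offsetof(struct scsi_device, host);
}

/* Convert "address of symbol+offset" back to the symbol's start address. */
static uint64_t symbol_base(uint64_t addr, uint64_t offset)
{
	return addr - offset;
}

/* Illustrative NULL-tolerant lookup; returns NULL instead of faulting. */
static struct Scsi_Host *scmd_host_or_null(const struct scsi_cmnd *scmd)
{
	if (!scmd || !scmd->device)
		return NULL;
	return scmd->device->host;
}
```

For example, symbol_base(0xffffffff8147adbcULL, 0x1c) and
symbol_base(0xffffffff8147adb2ULL, 18) give the same start address, so
the faulting RIP (scsi_times_out+0x1c) is the <+28> instruction in the
disassembly. With device set to NULL, host_load_address() yields 0,
which matches the CR2 value in the oops.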
