* [mst@redhat.com: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"]
@ 2023-06-22 6:15 Michael S. Tsirkin
2023-06-22 6:37 ` Greg KH
0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 6:15 UTC (permalink / raw)
To: stable
----- Forwarded message from "Michael S. Tsirkin" <mst@redhat.com> -----
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Fri, 9 Jun 2023 03:27:28 -0400
To: linux-kernel@vger.kernel.org
Cc: kernel test robot <lkp@intel.com>, Suwan Kim <suwan.kim027@gmail.com>, "Roberts, Martin" <martin.roberts@intel.com>, Jason Wang
<jasowang@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Xuan Zhuo
<xuanzhuo@linux.alibaba.com>, Jens Axboe <axboe@kernel.dk>, virtualization@lists.linux-foundation.org,
linux-block@vger.kernel.org
Subject: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"
Message-ID: <336455b4f630f329380a8f53ee8cad3868764d5c.1686295549.git.mst@redhat.com>
This reverts commit 07b679f70d73483930e8d3c293942416d9cd5c13.
This change appears to have broken things...
We now see applications hanging during disk accesses.
e.g.
multi-port virtio-blk device running in h/w (FPGA)
Host running a simple 'fio' test.
[global]
thread=1
direct=1
ioengine=libaio
norandommap=1
group_reporting=1
bs=4K
rw=read
iodepth=128
runtime=1
numjobs=4
time_based
[job0]
filename=/dev/vda
[job1]
filename=/dev/vdb
[job2]
filename=/dev/vdc
...
[job15]
filename=/dev/vdp
i.e. 16 disks; 4 queues per disk; simple burst of 4KB reads
This is repeatedly run in a loop.
After a few, normally <10 seconds, fio hangs.
With 64 queues (16 disks), failure occurs within a few seconds; with 8 queues (2 disks) it may take ~hour before hanging.
Last message:
fio-3.19
Starting 8 threads
Jobs: 1 (f=1): [_(7),R(1)][68.3%][eta 03h:11m:06s]
I think this means at the end of the run 1 queue was left incomplete.
'diskstats' (run while fio is hung) shows no outstanding transactions.
e.g.
$ cat /proc/diskstats
...
252 0 vda 1843140071 0 14745120568 712568645 0 0 0 0 0 3117947 712568645 0 0 0 0 0 0
252 16 vdb 1816291511 0 14530332088 704905623 0 0 0 0 0 3117711 704905623 0 0 0 0 0 0
...
Other stats (in the h/w, and added to the virtio-blk driver ([a]virtio_queue_rq(), [b]virtblk_handle_req(), [c]virtblk_request_done()) all agree, and show every request had a completion, and that virtblk_request_done() never gets called.
e.g.
PF= 0 vq=0 1 2 3
[a]request_count - 839416590 813148916 105586179 84988123
[b]completion1_count - 839416590 813148916 105586179 84988123
[c]completion2_count - 0 0 0 0
PF= 1 vq=0 1 2 3
[a]request_count - 823335887 812516140 104582672 75856549
[b]completion1_count - 823335887 812516140 104582672 75856549
[c]completion2_count - 0 0 0 0
i.e. the issue is after the virtio-blk driver.
This change was introduced in kernel 6.3.0.
I am seeing this using 6.3.3.
If I run with an earlier kernel (5.15), it does not occur.
If I make a simple patch to the 6.3.3 virtio-blk driver, to skip the blk_mq_add_to_batch()call, it does not fail.
e.g.
kernel 5.15 - this is OK
virtio_blk.c,virtblk_done() [irq handler]
if (likely(!blk_should_fake_timeout(req->q))) {
blk_mq_complete_request(req);
}
kernel 6.3.3 - this fails
virtio_blk.c,virtblk_handle_req() [irq handler]
if (likely(!blk_should_fake_timeout(req->q))) {
if (!blk_mq_complete_request_remote(req)) {
if (!blk_mq_add_to_batch(req, iob, virtblk_vbr_status(vbr), virtblk_complete_batch)) {
virtblk_request_done(req); //this never gets called... so blk_mq_add_to_batch() must always succeed
}
}
}
If I do, kernel 6.3.3 - this is OK
virtio_blk.c,virtblk_handle_req() [irq handler]
if (likely(!blk_should_fake_timeout(req->q))) {
if (!blk_mq_complete_request_remote(req)) {
virtblk_request_done(req); //force this here...
if (!blk_mq_add_to_batch(req, iob, virtblk_vbr_status(vbr), virtblk_complete_batch)) {
virtblk_request_done(req); //this never gets called... so blk_mq_add_to_batch() must always succeed
}
}
}
Perhaps you might like to fix/test/revert this change...
Martin
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202306090826.C1fZmdMe-lkp@intel.com/
Cc: Suwan Kim <suwan.kim027@gmail.com>
Reported-by: "Roberts, Martin" <martin.roberts@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
Since v1:
fix build error
Still completely untested as I'm traveling.
Martin, Suwan, could you please test and report?
Suwan if you have a better revert in mind pls post and
I will be happy to drop this.
Thanks!
drivers/block/virtio_blk.c | 82 +++++++++++++++++---------------------
1 file changed, 37 insertions(+), 45 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 2b918e28acaa..b47358da92a2 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -348,63 +348,33 @@ static inline void virtblk_request_done(struct request *req)
blk_mq_end_request(req, status);
}
-static void virtblk_complete_batch(struct io_comp_batch *iob)
-{
- struct request *req;
-
- rq_list_for_each(&iob->req_list, req) {
- virtblk_unmap_data(req, blk_mq_rq_to_pdu(req));
- virtblk_cleanup_cmd(req);
- }
- blk_mq_end_request_batch(iob);
-}
-
-static int virtblk_handle_req(struct virtio_blk_vq *vq,
- struct io_comp_batch *iob)
-{
- struct virtblk_req *vbr;
- int req_done = 0;
- unsigned int len;
-
- while ((vbr = virtqueue_get_buf(vq->vq, &len)) != NULL) {
- struct request *req = blk_mq_rq_from_pdu(vbr);
-
- if (likely(!blk_should_fake_timeout(req->q)) &&
- !blk_mq_complete_request_remote(req) &&
- !blk_mq_add_to_batch(req, iob, virtblk_vbr_status(vbr),
- virtblk_complete_batch))
- virtblk_request_done(req);
- req_done++;
- }
-
- return req_done;
-}
-
static void virtblk_done(struct virtqueue *vq)
{
struct virtio_blk *vblk = vq->vdev->priv;
- struct virtio_blk_vq *vblk_vq = &vblk->vqs[vq->index];
- int req_done = 0;
+ bool req_done = false;
+ int qid = vq->index;
+ struct virtblk_req *vbr;
unsigned long flags;
- DEFINE_IO_COMP_BATCH(iob);
+ unsigned int len;
- spin_lock_irqsave(&vblk_vq->lock, flags);
+ spin_lock_irqsave(&vblk->vqs[qid].lock, flags);
do {
virtqueue_disable_cb(vq);
- req_done += virtblk_handle_req(vblk_vq, &iob);
+ while ((vbr = virtqueue_get_buf(vblk->vqs[qid].vq, &len)) != NULL) {
+ struct request *req = blk_mq_rq_from_pdu(vbr);
+ if (likely(!blk_should_fake_timeout(req->q)))
+ blk_mq_complete_request(req);
+ req_done = true;
+ }
if (unlikely(virtqueue_is_broken(vq)))
break;
} while (!virtqueue_enable_cb(vq));
- if (req_done) {
- if (!rq_list_empty(iob.req_list))
- iob.complete(&iob);
-
- /* In case queue is stopped waiting for more buffers. */
+ /* In case queue is stopped waiting for more buffers. */
+ if (req_done)
blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
- }
- spin_unlock_irqrestore(&vblk_vq->lock, flags);
+ spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
}
static void virtio_commit_rqs(struct blk_mq_hw_ctx *hctx)
@@ -1283,15 +1253,37 @@ static void virtblk_map_queues(struct blk_mq_tag_set *set)
}
}
+static void virtblk_complete_batch(struct io_comp_batch *iob)
+{
+ struct request *req;
+
+ rq_list_for_each(&iob->req_list, req) {
+ virtblk_unmap_data(req, blk_mq_rq_to_pdu(req));
+ virtblk_cleanup_cmd(req);
+ }
+ blk_mq_end_request_batch(iob);
+}
+
static int virtblk_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
{
struct virtio_blk *vblk = hctx->queue->queuedata;
struct virtio_blk_vq *vq = get_virtio_blk_vq(hctx);
+ struct virtblk_req *vbr;
unsigned long flags;
+ unsigned int len;
int found = 0;
spin_lock_irqsave(&vq->lock, flags);
- found = virtblk_handle_req(vq, iob);
+
+ while ((vbr = virtqueue_get_buf(vq->vq, &len)) != NULL) {
+ struct request *req = blk_mq_rq_from_pdu(vbr);
+
+ found++;
+ if (!blk_mq_complete_request_remote(req) &&
+ !blk_mq_add_to_batch(req, iob, virtblk_vbr_status(vbr),
+ virtblk_complete_batch))
+ virtblk_request_done(req);
+ }
if (found)
blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
--
MST
----- End forwarded message -----
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [mst@redhat.com: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"]
2023-06-22 6:15 [mst@redhat.com: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"] Michael S. Tsirkin
@ 2023-06-22 6:37 ` Greg KH
2023-06-22 6:39 ` Michael S. Tsirkin
0 siblings, 1 reply; 6+ messages in thread
From: Greg KH @ 2023-06-22 6:37 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: stable
On Thu, Jun 22, 2023 at 02:15:45AM -0400, Michael S. Tsirkin wrote:
> ----- Forwarded message from "Michael S. Tsirkin" <mst@redhat.com> -----
>
> From: "Michael S. Tsirkin" <mst@redhat.com>
> Date: Fri, 9 Jun 2023 03:27:28 -0400
> To: linux-kernel@vger.kernel.org
> Cc: kernel test robot <lkp@intel.com>, Suwan Kim <suwan.kim027@gmail.com>, "Roberts, Martin" <martin.roberts@intel.com>, Jason Wang
> <jasowang@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Xuan Zhuo
> <xuanzhuo@linux.alibaba.com>, Jens Axboe <axboe@kernel.dk>, virtualization@lists.linux-foundation.org,
> linux-block@vger.kernel.org
> Subject: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"
> Message-ID: <336455b4f630f329380a8f53ee8cad3868764d5c.1686295549.git.mst@redhat.com>
>
> This reverts commit 07b679f70d73483930e8d3c293942416d9cd5c13.
What commit id is this in Linus's tree?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [mst@redhat.com: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"]
2023-06-22 6:37 ` Greg KH
@ 2023-06-22 6:39 ` Michael S. Tsirkin
2023-06-22 6:42 ` Greg KH
0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 6:39 UTC (permalink / raw)
To: Greg KH; +Cc: stable
On Thu, Jun 22, 2023 at 08:37:36AM +0200, Greg KH wrote:
> On Thu, Jun 22, 2023 at 02:15:45AM -0400, Michael S. Tsirkin wrote:
> > ----- Forwarded message from "Michael S. Tsirkin" <mst@redhat.com> -----
> >
> > From: "Michael S. Tsirkin" <mst@redhat.com>
> > Date: Fri, 9 Jun 2023 03:27:28 -0400
> > To: linux-kernel@vger.kernel.org
> > Cc: kernel test robot <lkp@intel.com>, Suwan Kim <suwan.kim027@gmail.com>, "Roberts, Martin" <martin.roberts@intel.com>, Jason Wang
> > <jasowang@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Xuan Zhuo
> > <xuanzhuo@linux.alibaba.com>, Jens Axboe <axboe@kernel.dk>, virtualization@lists.linux-foundation.org,
> > linux-block@vger.kernel.org
> > Subject: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"
> > Message-ID: <336455b4f630f329380a8f53ee8cad3868764d5c.1686295549.git.mst@redhat.com>
> >
> > This reverts commit 07b679f70d73483930e8d3c293942416d9cd5c13.
>
> What commit id is this in Linus's tree?
>
> thanks,
>
> greg k-h
afd384f0dbea2229fd11159efb86a5b41051c4a9
--
MST
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [mst@redhat.com: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"]
2023-06-22 6:39 ` Michael S. Tsirkin
@ 2023-06-22 6:42 ` Greg KH
2023-06-22 6:43 ` Michael S. Tsirkin
0 siblings, 1 reply; 6+ messages in thread
From: Greg KH @ 2023-06-22 6:42 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: stable
On Thu, Jun 22, 2023 at 02:39:15AM -0400, Michael S. Tsirkin wrote:
> On Thu, Jun 22, 2023 at 08:37:36AM +0200, Greg KH wrote:
> > On Thu, Jun 22, 2023 at 02:15:45AM -0400, Michael S. Tsirkin wrote:
> > > ----- Forwarded message from "Michael S. Tsirkin" <mst@redhat.com> -----
> > >
> > > From: "Michael S. Tsirkin" <mst@redhat.com>
> > > Date: Fri, 9 Jun 2023 03:27:28 -0400
> > > To: linux-kernel@vger.kernel.org
> > > Cc: kernel test robot <lkp@intel.com>, Suwan Kim <suwan.kim027@gmail.com>, "Roberts, Martin" <martin.roberts@intel.com>, Jason Wang
> > > <jasowang@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Xuan Zhuo
> > > <xuanzhuo@linux.alibaba.com>, Jens Axboe <axboe@kernel.dk>, virtualization@lists.linux-foundation.org,
> > > linux-block@vger.kernel.org
> > > Subject: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"
> > > Message-ID: <336455b4f630f329380a8f53ee8cad3868764d5c.1686295549.git.mst@redhat.com>
> > >
> > > This reverts commit 07b679f70d73483930e8d3c293942416d9cd5c13.
> >
> > What commit id is this in Linus's tree?
> >
> > thanks,
> >
> > greg k-h
>
> afd384f0dbea2229fd11159efb86a5b41051c4a9
Great, and what tree(s) do you want it applied to, just 6.3.y?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [mst@redhat.com: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"]
2023-06-22 6:42 ` Greg KH
@ 2023-06-22 6:43 ` Michael S. Tsirkin
2023-06-22 7:32 ` Greg KH
0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2023-06-22 6:43 UTC (permalink / raw)
To: Greg KH; +Cc: stable
On Thu, Jun 22, 2023 at 08:42:39AM +0200, Greg KH wrote:
> On Thu, Jun 22, 2023 at 02:39:15AM -0400, Michael S. Tsirkin wrote:
> > On Thu, Jun 22, 2023 at 08:37:36AM +0200, Greg KH wrote:
> > > On Thu, Jun 22, 2023 at 02:15:45AM -0400, Michael S. Tsirkin wrote:
> > > > ----- Forwarded message from "Michael S. Tsirkin" <mst@redhat.com> -----
> > > >
> > > > From: "Michael S. Tsirkin" <mst@redhat.com>
> > > > Date: Fri, 9 Jun 2023 03:27:28 -0400
> > > > To: linux-kernel@vger.kernel.org
> > > > Cc: kernel test robot <lkp@intel.com>, Suwan Kim <suwan.kim027@gmail.com>, "Roberts, Martin" <martin.roberts@intel.com>, Jason Wang
> > > > <jasowang@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Xuan Zhuo
> > > > <xuanzhuo@linux.alibaba.com>, Jens Axboe <axboe@kernel.dk>, virtualization@lists.linux-foundation.org,
> > > > linux-block@vger.kernel.org
> > > > Subject: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"
> > > > Message-ID: <336455b4f630f329380a8f53ee8cad3868764d5c.1686295549.git.mst@redhat.com>
> > > >
> > > > This reverts commit 07b679f70d73483930e8d3c293942416d9cd5c13.
> > >
> > > What commit id is this in Linus's tree?
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > afd384f0dbea2229fd11159efb86a5b41051c4a9
>
> Great, and what tree(s) do you want it applied to, just 6.3.y?
>
> thanks,
>
> greg k-h
That's the only one that has the commit we revert so yes. Thanks!
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [mst@redhat.com: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"]
2023-06-22 6:43 ` Michael S. Tsirkin
@ 2023-06-22 7:32 ` Greg KH
0 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2023-06-22 7:32 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: stable
On Thu, Jun 22, 2023 at 02:43:41AM -0400, Michael S. Tsirkin wrote:
> On Thu, Jun 22, 2023 at 08:42:39AM +0200, Greg KH wrote:
> > On Thu, Jun 22, 2023 at 02:39:15AM -0400, Michael S. Tsirkin wrote:
> > > On Thu, Jun 22, 2023 at 08:37:36AM +0200, Greg KH wrote:
> > > > On Thu, Jun 22, 2023 at 02:15:45AM -0400, Michael S. Tsirkin wrote:
> > > > > ----- Forwarded message from "Michael S. Tsirkin" <mst@redhat.com> -----
> > > > >
> > > > > From: "Michael S. Tsirkin" <mst@redhat.com>
> > > > > Date: Fri, 9 Jun 2023 03:27:28 -0400
> > > > > To: linux-kernel@vger.kernel.org
> > > > > Cc: kernel test robot <lkp@intel.com>, Suwan Kim <suwan.kim027@gmail.com>, "Roberts, Martin" <martin.roberts@intel.com>, Jason Wang
> > > > > <jasowang@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Xuan Zhuo
> > > > > <xuanzhuo@linux.alibaba.com>, Jens Axboe <axboe@kernel.dk>, virtualization@lists.linux-foundation.org,
> > > > > linux-block@vger.kernel.org
> > > > > Subject: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"
> > > > > Message-ID: <336455b4f630f329380a8f53ee8cad3868764d5c.1686295549.git.mst@redhat.com>
> > > > >
> > > > > This reverts commit 07b679f70d73483930e8d3c293942416d9cd5c13.
> > > >
> > > > What commit id is this in Linus's tree?
> > > >
> > > > thanks,
> > > >
> > > > greg k-h
> > >
> > > afd384f0dbea2229fd11159efb86a5b41051c4a9
> >
> > Great, and what tree(s) do you want it applied to, just 6.3.y?
> >
> > thanks,
> >
> > greg k-h
>
>
> That's the only one that has the commit we revert so yes. Thanks!
Great, now queued up, thanks.
greg k-h
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-06-22 7:32 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-22 6:15 [mst@redhat.com: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"] Michael S. Tsirkin
2023-06-22 6:37 ` Greg KH
2023-06-22 6:39 ` Michael S. Tsirkin
2023-06-22 6:42 ` Greg KH
2023-06-22 6:43 ` Michael S. Tsirkin
2023-06-22 7:32 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).