From: "Michael S. Tsirkin" <mst@redhat.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Gurtovoy <mgurtovoy@nvidia.com>,
stefanha@redhat.com, virtualization@lists.linux.dev,
axboe@kernel.dk, kvm@vger.kernel.org,
linux-block@vger.kernel.org, oren@nvidia.com
Subject: Re: [PATCH v2] virtio_blk: implement init_hctx MQ operation
Date: Thu, 12 Sep 2024 02:57:19 -0400 [thread overview]
Message-ID: <20240912025124-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <fb28ea61-4e94-498e-9caa-c8b7786d437a@samsung.com>
On Thu, Sep 12, 2024 at 08:46:15AM +0200, Marek Szyprowski wrote:
> Dear All,
>
> On 08.08.2024 00:41, Max Gurtovoy wrote:
> > Set the driver data of the hardware context (hctx) to point directly to
> > the virtio block queue. This cleanup improves code readability and
> > reduces the number of dereferences in the fast path.
> >
> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > ---
> > drivers/block/virtio_blk.c | 42 ++++++++++++++++++++------------------
> > 1 file changed, 22 insertions(+), 20 deletions(-)
>
> This patch landed in recent linux-next as commit 8d04556131c1
> ("virtio_blk: implement init_hctx MQ operation"). In my tests I found
> that it introduces a regression in system suspend/resume operation. From
> time to time system crashes during suspend/resume cycle. Reverting this
> patch on top of next-20240911 fixes this problem.
>
> I've even managed to catch a kernel panic log of this problem on QEMU's
> ARM64 'virt' machine:
>
> root@target:~# time rtcwake -s10 -mmem
> rtcwake: wakeup from "mem" using /dev/rtc0 at Thu Sep 12 07:11:52 2024
> Unable to handle kernel NULL pointer dereference at virtual address
> 0000000000000090
> Mem abort info:
> ESR = 0x0000000096000046
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x06: level 2 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
> CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=0000000046bbb000
> ...
> Internal error: Oops: 0000000096000046 [#1] PREEMPT SMP
> Modules linked in: bluetooth ecdh_generic ecc rfkill ipv6
> CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0H Not tainted 6.11.0-rc6+ #9024
> Hardware name: linux,dummy-virt (DT)
> Workqueue: kblockd blk_mq_requeue_work
> pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : virtqueue_add_split+0x458/0x63c
> lr : virtqueue_add_split+0x1d0/0x63c
> ...
> Call trace:
> virtqueue_add_split+0x458/0x63c
> virtqueue_add_sgs+0xc4/0xec
> virtblk_add_req+0x8c/0xf4
> virtio_queue_rq+0x6c/0x1bc
> blk_mq_dispatch_rq_list+0x21c/0x714
> __blk_mq_sched_dispatch_requests+0xb4/0x58c
> blk_mq_sched_dispatch_requests+0x30/0x6c
> blk_mq_run_hw_queue+0x14c/0x40c
> blk_mq_run_hw_queues+0x64/0x124
> blk_mq_requeue_work+0x188/0x1bc
> process_one_work+0x20c/0x608
> worker_thread+0x238/0x370
> kthread+0x124/0x128
> ret_from_fork+0x10/0x20
> Code: f9404282 79401c21 b9004a81 f94047e1 (f8206841)
> ---[ end trace 0000000000000000 ]---
> note: kworker/0:0H[9] exited with irqs disabled
> note: kworker/0:0H[9] exited with preempt_count 1
>
OK I'll drop from next for now, pls try to debug
and repost.
> > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > index 2351f411fa46..35a7a586f6f5 100644
> > --- a/drivers/block/virtio_blk.c
> > +++ b/drivers/block/virtio_blk.c
> > @@ -129,14 +129,6 @@ static inline blk_status_t virtblk_result(u8 status)
> > }
> > }
> >
> > -static inline struct virtio_blk_vq *get_virtio_blk_vq(struct blk_mq_hw_ctx *hctx)
> > -{
> > - struct virtio_blk *vblk = hctx->queue->queuedata;
> > - struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num];
> > -
> > - return vq;
> > -}
> > -
> > static int virtblk_add_req(struct virtqueue *vq, struct virtblk_req *vbr)
> > {
> > struct scatterlist out_hdr, in_hdr, *sgs[3];
> > @@ -377,8 +369,7 @@ static void virtblk_done(struct virtqueue *vq)
> >
> > static void virtio_commit_rqs(struct blk_mq_hw_ctx *hctx)
> > {
> > - struct virtio_blk *vblk = hctx->queue->queuedata;
> > - struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num];
> > + struct virtio_blk_vq *vq = hctx->driver_data;
> > bool kick;
> >
> > spin_lock_irq(&vq->lock);
> > @@ -428,10 +419,10 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
> > const struct blk_mq_queue_data *bd)
> > {
> > struct virtio_blk *vblk = hctx->queue->queuedata;
> > + struct virtio_blk_vq *vq = hctx->driver_data;
> > struct request *req = bd->rq;
> > struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
> > unsigned long flags;
> > - int qid = hctx->queue_num;
> > bool notify = false;
> > blk_status_t status;
> > int err;
> > @@ -440,26 +431,26 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
> > if (unlikely(status))
> > return status;
> >
> > - spin_lock_irqsave(&vblk->vqs[qid].lock, flags);
> > - err = virtblk_add_req(vblk->vqs[qid].vq, vbr);
> > + spin_lock_irqsave(&vq->lock, flags);
> > + err = virtblk_add_req(vq->vq, vbr);
> > if (err) {
> > - virtqueue_kick(vblk->vqs[qid].vq);
> > + virtqueue_kick(vq->vq);
> > /* Don't stop the queue if -ENOMEM: we may have failed to
> > * bounce the buffer due to global resource outage.
> > */
> > if (err == -ENOSPC)
> > blk_mq_stop_hw_queue(hctx);
> > - spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
> > + spin_unlock_irqrestore(&vq->lock, flags);
> > virtblk_unmap_data(req, vbr);
> > return virtblk_fail_to_queue(req, err);
> > }
> >
> > - if (bd->last && virtqueue_kick_prepare(vblk->vqs[qid].vq))
> > + if (bd->last && virtqueue_kick_prepare(vq->vq))
> > notify = true;
> > - spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
> > + spin_unlock_irqrestore(&vq->lock, flags);
> >
> > if (notify)
> > - virtqueue_notify(vblk->vqs[qid].vq);
> > + virtqueue_notify(vq->vq);
> > return BLK_STS_OK;
> > }
> >
> > @@ -504,7 +495,7 @@ static void virtio_queue_rqs(struct request **rqlist)
> > struct request *requeue_list = NULL;
> >
> > rq_list_for_each_safe(rqlist, req, next) {
> > - struct virtio_blk_vq *vq = get_virtio_blk_vq(req->mq_hctx);
> > + struct virtio_blk_vq *vq = req->mq_hctx->driver_data;
> > bool kick;
> >
> > if (!virtblk_prep_rq_batch(req)) {
> > @@ -1164,6 +1155,16 @@ static const struct attribute_group *virtblk_attr_groups[] = {
> > NULL,
> > };
> >
> > +static int virtblk_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
> > + unsigned int hctx_idx)
> > +{
> > + struct virtio_blk *vblk = data;
> > + struct virtio_blk_vq *vq = &vblk->vqs[hctx_idx];
> > +
> > + hctx->driver_data = vq;
> > + return 0;
> > +}
> > +
> > static void virtblk_map_queues(struct blk_mq_tag_set *set)
> > {
> > struct virtio_blk *vblk = set->driver_data;
> > @@ -1205,7 +1206,7 @@ static void virtblk_complete_batch(struct io_comp_batch *iob)
> > static int virtblk_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
> > {
> > struct virtio_blk *vblk = hctx->queue->queuedata;
> > - struct virtio_blk_vq *vq = get_virtio_blk_vq(hctx);
> > + struct virtio_blk_vq *vq = hctx->driver_data;
> > struct virtblk_req *vbr;
> > unsigned long flags;
> > unsigned int len;
> > @@ -1236,6 +1237,7 @@ static const struct blk_mq_ops virtio_mq_ops = {
> > .queue_rqs = virtio_queue_rqs,
> > .commit_rqs = virtio_commit_rqs,
> > .complete = virtblk_request_done,
> > + .init_hctx = virtblk_init_hctx,
> > .map_queues = virtblk_map_queues,
> > .poll = virtblk_poll,
> > };
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
next prev parent reply other threads:[~2024-09-12 6:57 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20240912064617eucas1p1c3191629f76e04111d4b39b15fea350a@eucas1p1.samsung.com>
2024-08-07 22:41 ` [PATCH v2] virtio_blk: implement init_hctx MQ operation Max Gurtovoy
2024-08-08 6:53 ` John Garry
2024-08-08 13:37 ` Jens Axboe
2024-08-12 11:15 ` Christoph Hellwig
2024-09-12 6:46 ` Marek Szyprowski
2024-09-12 6:57 ` Michael S. Tsirkin [this message]
2024-09-16 22:06 ` Max Gurtovoy
2024-09-17 14:09 ` Marek Szyprowski
2024-09-22 22:47 ` Max Gurtovoy
2024-09-24 8:06 ` Francesco Lavra
2024-09-24 8:11 ` Max Gurtovoy
2024-09-24 10:26 ` Marek Szyprowski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240912025124-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=axboe@kernel.dk \
--cc=kvm@vger.kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=m.szyprowski@samsung.com \
--cc=mgurtovoy@nvidia.com \
--cc=oren@nvidia.com \
--cc=stefanha@redhat.com \
--cc=virtualization@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.