All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: josef@toxicpanda.com, axboe@kernel.dk,
	linux-block@vger.kernel.org, nbd@other.debian.org,
	eblake@redhat.com, ming.lei@redhat.com
Subject: Re: Kernel NBD client waits on wrong cookie, aborts connection
Date: Tue, 15 Oct 2024 20:11:02 +0800	[thread overview]
Message-ID: <Zw5b1mwk3aG01NTg@fedora> (raw)
In-Reply-To: <CAFj5m9LXwcH7vc2Fk_i+VhfUA+tevzhciJzKc1am49y_5jgC2Q@mail.gmail.com>

On Tue, Oct 15, 2024 at 08:01:43PM +0800, Ming Lei wrote:
> On Tue, Oct 15, 2024 at 6:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> >
> > Hi all,
> >
> > the other day I was running some benchmarks to compare different QEMU
> > block exports, and one of the scenarios I was interested in was
> > exporting NBD from qemu-storage-daemon over a unix socket and attaching
> > it as a block device using the kernel NBD client. I would then run a VM
> > on top of it and fio inside of it.
> >
> > Unfortunately, I couldn't get any numbers because the connection always
> > aborted with messages like "Double reply on req ..." or "Unexpected
> > reply ..." in the host kernel log.
> >
> > Yesterday I found some time to have a closer look why this is happening,
> > and I think I have a rough understanding of what's going on now. Look at
> > these trace events:
> >
> >         qemu-img-51025   [005] ..... 19503.285423: nbd_header_sent: nbd transport event: request 000000002df03708, handle 0x0000150c0000005a
> > [...]
> >         qemu-img-51025   [008] ..... 19503.285500: nbd_payload_sent: nbd transport event: request 000000002df03708, handle 0x0000150c0000005d
> > [...]
> >    kworker/u49:1-47350   [004] ..... 19503.285514: nbd_header_received: nbd transport event: request 00000000b79e7443, handle 0x0000150c0000005a
> >
> > This is the same request, but the handle has changed between
> > nbd_header_sent and nbd_payload_sent! I think this means that we hit one
> > of the cases where the request is requeued, and then the next time it
> > is executed with a different blk-mq tag, which is something the nbd
> > driver doesn't seem to expect.
> >
> > Of course, since the cookie is transmitted in the header, the server
> > replies with the original handle that contains the tag from the first
> > call, while the kernel is only waiting for a handle with the new tag and
> > is confused by the server response.
> >
> > I'm not sure yet which of the following options should be considered the
> > real problem here, so I'm only describing the situation without trying
> > to provide a patch:
> >
> > 1. Is it that blk-mq should always re-run the request with the same tag?
> >    I don't expect so, though in practice I was surprised to see that it
> >    happens quite often after nbd requeues a request that it actually
> >    does end up with the same cookie again.
> 
> No.
> 
> request->tag will change, but we may take ->internal_tag(sched) or
> ->tag(none), which won't change.
> 
> I guess was_interrupted() in nbd_send_cmd() is triggered, then the payload
> is sent with a different tag.
> 
> I will try to cook one patch soon.

Please try the following patch:


diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2cafcf11ee8b..e3eb31c3ee75 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -682,3 +682,16 @@ u32 blk_mq_unique_tag(struct request *rq)
 		(rq->tag & BLK_MQ_UNIQUE_TAG_MASK);
 }
 EXPORT_SYMBOL(blk_mq_unique_tag);
+
+/*
+ * Same with blk_mq_unique_tag, but one persistent tag is included in
+ * the request lifetime.
+ */
+u32 blk_mq_unique_static_tag(struct request *rq)
+{
+	u32 tag = rq->q->elevator ? rq->internal_tag : rq->tag;
+
+	return (rq->mq_hctx->queue_num << BLK_MQ_UNIQUE_TAG_BITS) |
+		(tag & BLK_MQ_UNIQUE_TAG_MASK);
+}
+EXPORT_SYMBOL(blk_mq_unique_static_tag);
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index b852050d8a96..cc522a2cb9fb 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -201,7 +201,7 @@ static void nbd_requeue_cmd(struct nbd_cmd *cmd)
 static u64 nbd_cmd_handle(struct nbd_cmd *cmd)
 {
 	struct request *req = blk_mq_rq_from_pdu(cmd);
-	u32 tag = blk_mq_unique_tag(req);
+	u32 tag = blk_mq_unique_static_tag(req);
 	u64 cookie = cmd->cmd_cookie;
 
 	return (cookie << NBD_COOKIE_BITS) | tag;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 4fecf46ef681..d6266759d62d 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -793,6 +793,7 @@ enum {
 };
 
 u32 blk_mq_unique_tag(struct request *rq);
+u32 blk_mq_unique_static_tag(struct request *rq);
 
 static inline u16 blk_mq_unique_tag_to_hwq(u32 unique_tag)
 {

-- 
Ming


  reply	other threads:[~2024-10-15 12:11 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-15 10:21 Kernel NBD client waits on wrong cookie, aborts connection Kevin Wolf
2024-10-15 12:01 ` Ming Lei
2024-10-15 12:11   ` Ming Lei [this message]
2024-10-15 12:15     ` Ming Lei
2024-10-15 12:59       ` Ming Lei
2024-10-15 16:06         ` Kevin Wolf
2024-10-16  2:20           ` Ming Lei
2024-10-16 17:00             ` Kevin Wolf
2024-10-17  1:16               ` Ming Lei
2024-10-17  2:52                 ` Ming Lei
2024-10-18 14:40             ` Leon Schuermann
2024-10-21 17:54             ` UNSUBSCRIBE Simon Fernandez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zw5b1mwk3aG01NTg@fedora \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=eblake@redhat.com \
    --cc=josef@toxicpanda.com \
    --cc=kwolf@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=nbd@other.debian.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.