All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chaitanya Kulkarni <chaitanyak@nvidia.com>
To: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>, Keith Busch <kbusch@kernel.org>,
	Sagi Grimberg <sagi@grimberg.me>,
	Pavel Begunkov <asml.silence@gmail.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"virtualization@lists.linux.dev" <virtualization@lists.linux.dev>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"io-uring@vger.kernel.org" <io-uring@vger.kernel.org>
Subject: Re: don't reorder requests passed to ->queue_rqs
Date: Wed, 13 Nov 2024 20:36:27 +0000	[thread overview]
Message-ID: <eb2faaba-c261-48de-8316-c8e34fdb516c@nvidia.com> (raw)
In-Reply-To: <20241113152050.157179-1-hch@lst.de>

On 11/13/24 07:20, Christoph Hellwig wrote:
> Hi Jens,
>
> currently blk-mq reorders requests when adding them to the plug because
> the request list can't do efficient tail appends.  When the plug is
> directly issued using ->queue_rqs that means reordered requests are
> passed to the driver, which can lead to very bad I/O patterns when
> not corrected, especially on rotational devices (e.g. NVMe HDD) or
> when using zone append.
>
> This series first adds two easily backportable workarounds to reverse
> the reording in the virtio_blk and nvme-pci ->queue_rq implementations
> similar to what the non-queue_rqs path does, and then adds a rq_list
> type that allows for efficient tail insertions and uses that to fix
> the reordering for real and then does the same for I/O completions as
> well.

Looks good to me. I ran the quick performance numbers [1].

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

-ck

fio randread iouring workload :-

IOPS :-
-------
nvme-orig:           Average IOPS: 72,690
nvme-new-no-reorder: Average IOPS: 72,580

BW :-
-------
nvme-orig:           Average BW: 283.9 MiB/s
nvme-new-no-reorder: Average BW: 283.4 MiB/s

IOPS/BW :-

nvme-orig-10.fio:  read: IOPS=72.9k, BW=285MiB/s 
(299MB/s)(16.7GiB/60004msec)
nvme-orig-1.fio:  read: IOPS=72.7k, BW=284MiB/s (298MB/s)(16.6GiB/60004msec)
nvme-orig-2.fio:  read: IOPS=73.0k, BW=285MiB/s (299MB/s)(16.7GiB/60004msec)
nvme-orig-3.fio:  read: IOPS=73.3k, BW=286MiB/s (300MB/s)(16.8GiB/60003msec)
nvme-orig-4.fio:  read: IOPS=72.5k, BW=283MiB/s (297MB/s)(16.6GiB/60003msec)
nvme-orig-5.fio:  read: IOPS=72.4k, BW=283MiB/s (297MB/s)(16.6GiB/60004msec)
nvme-orig-6.fio:  read: IOPS=72.9k, BW=285MiB/s (299MB/s)(16.7GiB/60003msec)
nvme-orig-7.fio:  read: IOPS=72.3k, BW=282MiB/s (296MB/s)(16.5GiB/60004msec)
nvme-orig-8.fio:  read: IOPS=72.4k, BW=283MiB/s (296MB/s)(16.6GiB/60003msec)
nvme-orig-9.fio:  read: IOPS=72.5k, BW=283MiB/s (297MB/s)(16.6GiB/60004msec)
nvme (nvme-6.13) #
nvme (nvme-6.13) # grep BW nvme-new-no-reorder-*fio
nvme-new-no-reorder-10.fio:  read: IOPS=72.5k, BW=283MiB/s 
(297MB/s)(16.6GiB/60004msec)
nvme-new-no-reorder-1.fio:  read: IOPS=72.5k, BW=283MiB/s 
(297MB/s)(16.6GiB/60004msec)
nvme-new-no-reorder-2.fio:  read: IOPS=72.5k, BW=283MiB/s 
(297MB/s)(16.6GiB/60003msec)
nvme-new-no-reorder-3.fio:  read: IOPS=71.7k, BW=280MiB/s 
(294MB/s)(16.4GiB/60005msec)
nvme-new-no-reorder-4.fio:  read: IOPS=72.5k, BW=283MiB/s 
(297MB/s)(16.6GiB/60004msec)
nvme-new-no-reorder-5.fio:  read: IOPS=72.6k, BW=284MiB/s 
(298MB/s)(16.6GiB/60003msec)
nvme-new-no-reorder-6.fio:  read: IOPS=73.3k, BW=286MiB/s 
(300MB/s)(16.8GiB/60003msec)
nvme-new-no-reorder-7.fio:  read: IOPS=72.8k, BW=284MiB/s 
(298MB/s)(16.7GiB/60003msec)
nvme-new-no-reorder-8.fio:  read: IOPS=73.2k, BW=286MiB/s 
(300MB/s)(16.7GiB/60004msec)
nvme-new-no-reorder-9.fio:  read: IOPS=72.2k, BW=282MiB/s 
(296MB/s)(16.5GiB/60005msec)



  parent reply	other threads:[~2024-11-13 20:36 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-13 15:20 don't reorder requests passed to ->queue_rqs Christoph Hellwig
2024-11-13 15:20 ` [PATCH 1/6] nvme-pci: reverse request order in nvme_queue_rqs Christoph Hellwig
2024-11-13 19:10   ` Keith Busch
2024-11-13 15:20 ` [PATCH 2/6] virtio_blk: reverse request order in virtio_queue_rqs Christoph Hellwig
2024-11-13 19:03   ` Keith Busch
2024-11-13 19:05     ` Jens Axboe
2024-11-13 23:25   ` Michael S. Tsirkin
2024-11-13 15:20 ` [PATCH 3/6] block: remove rq_list_move Christoph Hellwig
2024-11-13 15:20 ` [PATCH 4/6] block: add a rq_list type Christoph Hellwig
2024-11-14 20:11   ` Nathan Chancellor
2024-11-15  9:10     ` Christoph Hellwig
2024-11-15 12:49     ` Jens Axboe
2024-11-15 19:38       ` Jens Axboe
2024-11-16  0:56         ` Nathan Chancellor
2024-11-13 15:20 ` [PATCH 5/6] block: don't reorder requests in blk_add_rq_to_plug Christoph Hellwig
2024-11-13 15:20 ` [PATCH 6/6] block: don't reorder requests in blk_mq_add_to_batch Christoph Hellwig
2024-11-13 18:33 ` don't reorder requests passed to ->queue_rqs Bart Van Assche
2024-11-13 18:39   ` Jens Axboe
2024-11-13 18:46 ` Jens Axboe
2024-11-13 20:36 ` Chaitanya Kulkarni [this message]
2024-11-13 20:51   ` Jens Axboe
2024-11-13 22:23     ` Chaitanya Kulkarni
2024-11-13 22:27       ` Jens Axboe
2024-11-14  4:16     ` Christoph Hellwig
2024-11-14 15:14       ` Jens Axboe
2024-11-18 15:23         ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=eb2faaba-c261-48de-8316-c8e34fdb516c@nvidia.com \
    --to=chaitanyak@nvidia.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=io-uring@vger.kernel.org \
    --cc=jasowang@redhat.com \
    --cc=kbusch@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=mst@redhat.com \
    --cc=sagi@grimberg.me \
    --cc=virtualization@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.