From: Mike Snitzer <snitzer@redhat.com>
To: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"dm-devel@redhat.com" <dm-devel@redhat.com>,
Vijayendra Suman <vijayendra.suman@oracle.com>,
Ming Lei <ming.lei@redhat.com>
Subject: Re: [PATCH 1/3] block: fix blk_rq_get_max_sectors() to flow more carefully
Date: Mon, 14 Sep 2020 11:03:52 -0400 [thread overview]
Message-ID: <20200914150352.GC14410@redhat.com> (raw)
In-Reply-To: <CY4PR04MB375160D4EFBA9BE0957AC7EDE7230@CY4PR04MB3751.namprd04.prod.outlook.com>
On Sun, Sep 13 2020 at 8:46pm -0400,
Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
> On 2020/09/12 6:53, Mike Snitzer wrote:
> > blk_queue_get_max_sectors() has been trained for REQ_OP_WRITE_SAME and
> > REQ_OP_WRITE_ZEROES yet blk_rq_get_max_sectors() didn't call it for
> > those operations.
> >
> > Also, there is no need to avoid blk_max_size_offset() if
> > 'chunk_sectors' isn't set because it falls back to 'max_sectors'.
> >
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> > include/linux/blkdev.h | 19 +++++++++++++------
> > 1 file changed, 13 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> > index bb5636cc17b9..453a3d735d66 100644
> > --- a/include/linux/blkdev.h
> > +++ b/include/linux/blkdev.h
> > @@ -1070,17 +1070,24 @@ static inline unsigned int blk_rq_get_max_sectors(struct request *rq,
> > sector_t offset)
> > {
> > struct request_queue *q = rq->q;
> > + int op;
> > + unsigned int max_sectors;
> >
> > if (blk_rq_is_passthrough(rq))
> > return q->limits.max_hw_sectors;
> >
> > - if (!q->limits.chunk_sectors ||
> > - req_op(rq) == REQ_OP_DISCARD ||
> > - req_op(rq) == REQ_OP_SECURE_ERASE)
> > - return blk_queue_get_max_sectors(q, req_op(rq));
> > + op = req_op(rq);
> > + max_sectors = blk_queue_get_max_sectors(q, op);
> >
> > - return min(blk_max_size_offset(q, offset),
> > - blk_queue_get_max_sectors(q, req_op(rq)));
> > + switch (op) {
> > + case REQ_OP_DISCARD:
> > + case REQ_OP_SECURE_ERASE:
> > + case REQ_OP_WRITE_SAME:
> > + case REQ_OP_WRITE_ZEROES:
> > + return max_sectors;
> > + }
>
> Doesn't this break md devices ? (I think does use chunk_sectors for stride size,
> no ?)
>
> As mentioned in my reply to Ming's email, this will allow these commands to
> potentially cross over zone boundaries on zoned block devices, which would be an
> immediate command failure.
Depending on the implementation it is beneficial to get a large
discard (one not constrained by chunk_sectors, e.g. dm-stripe.c's
optimization for handling large discards and issuing N discards, one per
stripe). Same could apply for other commands.
Like all devices, zoned devices should impose command specific limits in
the queue_limits (and not lean on chunk_sectors to do a
one-size-fits-all).
But that aside, yes I agree I didn't pay close enough attention to the
implications of deferring the splitting of these commands until they
were issued to underlying storage. This chunk_sectors early splitting
override is a bit of a mess... not quite following the logic given we
were supposed to be waiting to split bios as late as possible.
Mike
WARNING: multiple messages have this Message-ID (diff)
From: Mike Snitzer <snitzer@redhat.com>
To: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>,
Vijayendra Suman <vijayendra.suman@oracle.com>,
"dm-devel@redhat.com" <dm-devel@redhat.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Subject: Re: [PATCH 1/3] block: fix blk_rq_get_max_sectors() to flow more carefully
Date: Mon, 14 Sep 2020 11:03:52 -0400 [thread overview]
Message-ID: <20200914150352.GC14410@redhat.com> (raw)
In-Reply-To: <CY4PR04MB375160D4EFBA9BE0957AC7EDE7230@CY4PR04MB3751.namprd04.prod.outlook.com>
On Sun, Sep 13 2020 at 8:46pm -0400,
Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
> On 2020/09/12 6:53, Mike Snitzer wrote:
> > blk_queue_get_max_sectors() has been trained for REQ_OP_WRITE_SAME and
> > REQ_OP_WRITE_ZEROES yet blk_rq_get_max_sectors() didn't call it for
> > those operations.
> >
> > Also, there is no need to avoid blk_max_size_offset() if
> > 'chunk_sectors' isn't set because it falls back to 'max_sectors'.
> >
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> > include/linux/blkdev.h | 19 +++++++++++++------
> > 1 file changed, 13 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> > index bb5636cc17b9..453a3d735d66 100644
> > --- a/include/linux/blkdev.h
> > +++ b/include/linux/blkdev.h
> > @@ -1070,17 +1070,24 @@ static inline unsigned int blk_rq_get_max_sectors(struct request *rq,
> > sector_t offset)
> > {
> > struct request_queue *q = rq->q;
> > + int op;
> > + unsigned int max_sectors;
> >
> > if (blk_rq_is_passthrough(rq))
> > return q->limits.max_hw_sectors;
> >
> > - if (!q->limits.chunk_sectors ||
> > - req_op(rq) == REQ_OP_DISCARD ||
> > - req_op(rq) == REQ_OP_SECURE_ERASE)
> > - return blk_queue_get_max_sectors(q, req_op(rq));
> > + op = req_op(rq);
> > + max_sectors = blk_queue_get_max_sectors(q, op);
> >
> > - return min(blk_max_size_offset(q, offset),
> > - blk_queue_get_max_sectors(q, req_op(rq)));
> > + switch (op) {
> > + case REQ_OP_DISCARD:
> > + case REQ_OP_SECURE_ERASE:
> > + case REQ_OP_WRITE_SAME:
> > + case REQ_OP_WRITE_ZEROES:
> > + return max_sectors;
> > + }
>
> Doesn't this break md devices ? (I think does use chunk_sectors for stride size,
> no ?)
>
> As mentioned in my reply to Ming's email, this will allow these commands to
> potentially cross over zone boundaries on zoned block devices, which would be an
> immediate command failure.
Depending on the implementation it is beneficial to get a large
discard (one not constrained by chunk_sectors, e.g. dm-stripe.c's
optimization for handling large discards and issuing N discards, one per
stripe). Same could apply for other commands.
Like all devices, zoned devices should impose command specific limits in
the queue_limits (and not lean on chunk_sectors to do a
one-size-fits-all).
But that aside, yes I agree I didn't pay close enough attention to the
implications of deferring the splitting of these commands until they
were issued to underlying storage. This chunk_sectors early splitting
override is a bit of a mess... not quite following the logic given we
were supposed to be waiting to split bios as late as possible.
Mike
next prev parent reply other threads:[~2020-09-14 15:03 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <529c2394-1b58-b9d8-d462-1f3de1b78ac8@oracle.com>
2020-09-10 14:24 ` Revert "dm: always call blk_queue_split() in dm_process_bio()" Mike Snitzer
2020-09-10 14:24 ` Mike Snitzer
2020-09-10 19:29 ` Vijayendra Suman
2020-09-15 1:33 ` Mike Snitzer
2020-09-15 17:03 ` Mike Snitzer
2020-09-16 14:56 ` Vijayendra Suman
2020-09-11 12:20 ` Ming Lei
2020-09-11 16:13 ` Mike Snitzer
2020-09-11 16:13 ` Mike Snitzer
2020-09-11 21:53 ` [PATCH 0/3] block: a few chunk_sectors fixes/improvements Mike Snitzer
2020-09-11 21:53 ` [PATCH 1/3] block: fix blk_rq_get_max_sectors() to flow more carefully Mike Snitzer
2020-09-12 13:52 ` Ming Lei
2020-09-14 0:43 ` Damien Le Moal
2020-09-14 14:52 ` Mike Snitzer
2020-09-14 23:28 ` Damien Le Moal
2020-09-15 2:03 ` Ming Lei
2020-09-15 2:15 ` Damien Le Moal
2020-09-14 14:49 ` Mike Snitzer
2020-09-14 14:49 ` Mike Snitzer
2020-09-15 1:50 ` Ming Lei
2020-09-14 0:46 ` Damien Le Moal
2020-09-14 15:03 ` Mike Snitzer [this message]
2020-09-14 15:03 ` Mike Snitzer
2020-09-15 1:09 ` Damien Le Moal
2020-09-15 4:21 ` Damien Le Moal
2020-09-15 8:01 ` Ming Lei
2020-09-15 8:01 ` Ming Lei
2020-09-11 21:53 ` [PATCH 2/3] block: use lcm_not_zero() when stacking chunk_sectors Mike Snitzer
2020-09-12 13:58 ` Ming Lei
2020-09-12 13:58 ` Ming Lei
2020-09-11 21:53 ` [PATCH 3/3] block: allow 'chunk_sectors' to be non-power-of-2 Mike Snitzer
2020-09-12 14:06 ` Ming Lei
2020-09-12 14:06 ` Ming Lei
2020-09-14 2:43 ` Keith Busch
2020-09-14 0:55 ` Damien Le Moal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200914150352.GC14410@redhat.com \
--to=snitzer@redhat.com \
--cc=Damien.LeMoal@wdc.com \
--cc=dm-devel@redhat.com \
--cc=linux-block@vger.kernel.org \
--cc=ming.lei@redhat.com \
--cc=vijayendra.suman@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.