All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: JeffleXu <jefflexu@linux.alibaba.com>
Cc: axboe@kernel.dk, xiaoguang.wang@linux.alibaba.com,
	linux-block@vger.kernel.org, joseph.qi@linux.alibaba.com,
	dm-devel@redhat.com, haoxu@linux.alibaba.com,
	io-uring@vger.kernel.org
Subject: Re: [dm-devel] [RFC 0/3] Add support of iopoll for dm device
Date: Fri, 6 Nov 2020 12:45:26 -0500	[thread overview]
Message-ID: <20201106174526.GA13292@redhat.com> (raw)
In-Reply-To: <2c5dab21-8125-fcdf-078e-00a158c57f43@linux.alibaba.com>

On Thu, Nov 05 2020 at  9:51pm -0500,
JeffleXu <jefflexu@linux.alibaba.com> wrote:

> 
> On 11/4/20 11:08 PM, Mike Snitzer wrote:
> >>I'm doubted if this should be implemented in block layer like:
> >>
> >>```
> >>
> >>struct bio {
> >>
> >>     ...
> >>
> >>     struct list_head  cookies;
> >>
> >>};
> >>
> >>```
> >>
> >>After all it's only used by bio-based queue, or more specifically
> >>only dm device currently.
> >I do think this line of work really should be handled in block core
> >because I cannot see any reason why MD or bcache or whatever bio-based
> >device wouldn't want the ability to better support io_uring (with IO
> >poll).
> >
> >>Another design I can come up with is to maintain a global data
> >>structure for the very beginning
> >>original bio. Currently the blocking point is that now one original
> >>bio to the dm device (@bio of dm_submit()) can correspond to multiple
> >>dm_io and thus we have nowhere to place the @cookies list.
> >Yes, and that will always be the case.  We need the design to handle an
> >arbitrary sprawl of splitting from a given bio.  The graph of bios
> >resulting from that fan-out needs to be walked at various levels -- be
> >it the top-level original bio's submit_bio() returned cookie or some
> >intermediate point in the chain of bios.
> >
> >The problem is the lifetime of the data structure created for a given
> >split bio versus layering boundaries (that come from block core's
> >simplistic recursion via bio using submit_bio).
> >
> >>Now we have to maintain one data structure for every original bio,
> >>something like
> >>
> >>```
> >>
> >>struct dm_poll_instance {
> >>
> >>     ...
> >>
> >>     struct list_head cookies;
> >>
> >>};
> >>
> >>```
> >I do think we need a hybrid where at the point of recursion we're able
> >to make the associated data structure available across the recursion
> >boundary so that modeling the association in a chain of split bios is
> >possible. (e.g. struct dm_poll_data or dm_poll_instance as you named it,
> >_but_ that struct definition would live in block core, but would be part
> >of per-bio-data; so 'struct blk_poll_data' is more logical name when
> >elevated to block core).
> >
> >It _might_ be worthwhile to see if a new BIO_ flag could be added to
> >allow augmenting the bio_split + bio_chain pattern to also track this
> >additional case of carrying additional data per-bio while creating
> >bio-chains.  I may not be clear yet, said differently: augmenting
> >bio_chain to not only chain bios, but to _also_ thread/chain together
> >per-bio-data that lives within those chained bios.  SO you have the
> >chain of bios _and_ the chain of potentially opaque void * that happens
> >to point to a list head for a list of 'struct blk_poll_data'.
> >
> >Does that make sense?
> 
> 
> I'm doubted if it really makes sense to maintain a *complete* bio
> chain across the recursive
> 
> call boundary.
> 
> 
> Considering the following device stack:
> 
> ```
> 
>                                   dm0
> 
>         dm1                   dm2           dm3
> 
> nvme0  nvme1          ....               ....
> 
> ```
> 
> 
> We have the following bio graph (please let me know if it's wrong or
> the image can't display)
> 
> 
> For example, for dm1 there are three bios containing valid cookie in
> the end, that is
> 
> bio 9/10/11. We only need to insert the corresponding blk_poll_data
> (containing
> 
> request_queue, cookie, etc) of these three bios into the very
> beginning original
> 
> bio (that is bio0). Of course we can track all the sub-bios down
> through the device
> 
> stack, layer by layer, e.g.,
> 
> - get bio 1/2/3 from bio 0
> 
> - get bio 4 from bio 3
> 
> - finally get bio 9 from bio 4
> 
> But I'm doubted if it's overkill to just implement the iopoll.
> 
> 
> Another issue still unclear is that, if we should implement the
> iopoll in a recursive way.
> 
> In a recursive way, to poll dm 0, we should poll all its
> sub-devices, that is, bio 4/5/7/8.
> 
> Oppositely we can insert only the bottom bio (bio 9/10/11 which have
> valid cookie) at
> 
> the very beginning (at submit_bio() phase), and to poll dm 0, we
> only need to poll bio 9/10/11.

I feel we need the ability to walk the entire graph and call down to
next level.  The lowest level would be what you call a "valid cookie"
that blk-mq returned.  But the preceding cookies need to be made valid
and used when walking the graph (from highest to lowest) and calling
down to the underlying layers.

> 
> 
> To implement this non-recursive design, we can add a field in struct bio
> 
> ```
> 
> struct bio {
> 
>     ...
> 
>     struct bio *orig;
> 
> }
> 
> ```
> 
> @orig points to the original bio inputted into submit_bio(), that is, bio 0.
> 
> 
> @orig field is transmitted through bio splitting.
> 
> ```
> 
> bio_split()
> 
>     split->orig = bio->orig ? : bio
> 
> 
> dm.c: __clone_and_map_data_bio
> 
>     clone->orig = bio->orig ? : bio
> 
> ```
> 
> 
> Finally bio 9/10/11 can be inserted into bio 0.
> 
> ```
> 
> blk-mq.c: blk_mq_submit_bio
> 
>     if (bio->orig)
> 
>         init blk_poll_data and insert it into bio->orig's @cookies list
> 
> ```

If you feel that is doable: certainly give it a shot.

But it is not clear to me how you intend to translate from cookie passed
in to ->blk_poll hook (returned from submit_bio) to the saved off
bio->orig.

What is your cookie strategy in this non-recursive implementation?  What
will you be returning?  Where will you be storing it?

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


WARNING: multiple messages have this Message-ID (diff)
From: Mike Snitzer <snitzer@redhat.com>
To: JeffleXu <jefflexu@linux.alibaba.com>
Cc: axboe@kernel.dk, xiaoguang.wang@linux.alibaba.com,
	linux-block@vger.kernel.org, joseph.qi@linux.alibaba.com,
	dm-devel@redhat.com, haoxu@linux.alibaba.com,
	io-uring@vger.kernel.org
Subject: Re: [RFC 0/3] Add support of iopoll for dm device
Date: Fri, 6 Nov 2020 12:45:26 -0500	[thread overview]
Message-ID: <20201106174526.GA13292@redhat.com> (raw)
In-Reply-To: <2c5dab21-8125-fcdf-078e-00a158c57f43@linux.alibaba.com>

On Thu, Nov 05 2020 at  9:51pm -0500,
JeffleXu <jefflexu@linux.alibaba.com> wrote:

> 
> On 11/4/20 11:08 PM, Mike Snitzer wrote:
> >>I'm doubted if this should be implemented in block layer like:
> >>
> >>```
> >>
> >>struct bio {
> >>
> >>     ...
> >>
> >>     struct list_head  cookies;
> >>
> >>};
> >>
> >>```
> >>
> >>After all it's only used by bio-based queue, or more specifically
> >>only dm device currently.
> >I do think this line of work really should be handled in block core
> >because I cannot see any reason why MD or bcache or whatever bio-based
> >device wouldn't want the ability to better support io_uring (with IO
> >poll).
> >
> >>Another design I can come up with is to maintain a global data
> >>structure for the very beginning
> >>original bio. Currently the blocking point is that now one original
> >>bio to the dm device (@bio of dm_submit()) can correspond to multiple
> >>dm_io and thus we have nowhere to place the @cookies list.
> >Yes, and that will always be the case.  We need the design to handle an
> >arbitrary sprawl of splitting from a given bio.  The graph of bios
> >resulting from that fan-out needs to be walked at various levels -- be
> >it the top-level original bio's submit_bio() returned cookie or some
> >intermediate point in the chain of bios.
> >
> >The problem is the lifetime of the data structure created for a given
> >split bio versus layering boundaries (that come from block core's
> >simplistic recursion via bio using submit_bio).
> >
> >>Now we have to maintain one data structure for every original bio,
> >>something like
> >>
> >>```
> >>
> >>struct dm_poll_instance {
> >>
> >>     ...
> >>
> >>     struct list_head cookies;
> >>
> >>};
> >>
> >>```
> >I do think we need a hybrid where at the point of recursion we're able
> >to make the associated data structure available across the recursion
> >boundary so that modeling the association in a chain of split bios is
> >possible. (e.g. struct dm_poll_data or dm_poll_instance as you named it,
> >_but_ that struct definition would live in block core, but would be part
> >of per-bio-data; so 'struct blk_poll_data' is more logical name when
> >elevated to block core).
> >
> >It _might_ be worthwhile to see if a new BIO_ flag could be added to
> >allow augmenting the bio_split + bio_chain pattern to also track this
> >additional case of carrying additional data per-bio while creating
> >bio-chains.  I may not be clear yet, said differently: augmenting
> >bio_chain to not only chain bios, but to _also_ thread/chain together
> >per-bio-data that lives within those chained bios.  SO you have the
> >chain of bios _and_ the chain of potentially opaque void * that happens
> >to point to a list head for a list of 'struct blk_poll_data'.
> >
> >Does that make sense?
> 
> 
> I'm doubted if it really makes sense to maintain a *complete* bio
> chain across the recursive
> 
> call boundary.
> 
> 
> Considering the following device stack:
> 
> ```
> 
>                                   dm0
> 
>         dm1                   dm2           dm3
> 
> nvme0  nvme1          ....               ....
> 
> ```
> 
> 
> We have the following bio graph (please let me know if it's wrong or
> the image can't display)
> 
> 
> For example, for dm1 there are three bios containing valid cookie in
> the end, that is
> 
> bio 9/10/11. We only need to insert the corresponding blk_poll_data
> (containing
> 
> request_queue, cookie, etc) of these three bios into the very
> beginning original
> 
> bio (that is bio0). Of course we can track all the sub-bios down
> through the device
> 
> stack, layer by layer, e.g.,
> 
> - get bio 1/2/3 from bio 0
> 
> - get bio 4 from bio 3
> 
> - finally get bio 9 from bio 4
> 
> But I'm doubted if it's overkill to just implement the iopoll.
> 
> 
> Another issue still unclear is that, if we should implement the
> iopoll in a recursive way.
> 
> In a recursive way, to poll dm 0, we should poll all its
> sub-devices, that is, bio 4/5/7/8.
> 
> Oppositely we can insert only the bottom bio (bio 9/10/11 which have
> valid cookie) at
> 
> the very beginning (at submit_bio() phase), and to poll dm 0, we
> only need to poll bio 9/10/11.

I feel we need the ability to walk the entire graph and call down to
next level.  The lowest level would be what you call a "valid cookie"
that blk-mq returned.  But the preceding cookies need to be made valid
and used when walking the graph (from highest to lowest) and calling
down to the underlying layers.

> 
> 
> To implement this non-recursive design, we can add a field in struct bio
> 
> ```
> 
> struct bio {
> 
>     ...
> 
>     struct bio *orig;
> 
> }
> 
> ```
> 
> @orig points to the original bio inputted into submit_bio(), that is, bio 0.
> 
> 
> @orig field is transmitted through bio splitting.
> 
> ```
> 
> bio_split()
> 
>     split->orig = bio->orig ? : bio
> 
> 
> dm.c: __clone_and_map_data_bio
> 
>     clone->orig = bio->orig ? : bio
> 
> ```
> 
> 
> Finally bio 9/10/11 can be inserted into bio 0.
> 
> ```
> 
> blk-mq.c: blk_mq_submit_bio
> 
>     if (bio->orig)
> 
>         init blk_poll_data and insert it into bio->orig's @cookies list
> 
> ```

If you feel that is doable: certainly give it a shot.

But it is not clear to me how you intend to translate from cookie passed
in to ->blk_poll hook (returned from submit_bio) to the saved off
bio->orig.

What is your cookie strategy in this non-recursive implementation?  What
will you be returning?  Where will you be storing it?

Mike


  parent reply	other threads:[~2020-11-06 18:46 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-20  6:54 [dm-devel] [RFC 0/3] Add support of iopoll for dm device Jeffle Xu
2020-10-20  6:54 ` Jeffle Xu
2020-10-20  6:54 ` [dm-devel] [RFC 1/3] block/mq: add iterator for polling hw queues Jeffle Xu
2020-10-20  6:54   ` Jeffle Xu
2020-10-20  6:54 ` [dm-devel] [RFC 2/3] block: add back ->poll_fn in request queue Jeffle Xu
2020-10-20  6:54   ` Jeffle Xu
2020-10-20  6:54 ` [dm-devel] [RFC 3/3] dm: add support for IO polling Jeffle Xu
2020-10-20  6:54   ` Jeffle Xu
2020-10-21 20:49   ` [dm-devel] " Mike Snitzer
2020-10-21 20:49     ` Mike Snitzer
2020-10-22  3:14   ` [dm-devel] " Ming Lei
2020-10-22  3:14     ` Ming Lei
2020-10-21 20:39 ` [dm-devel] [RFC 0/3] Add support of iopoll for dm device Mike Snitzer
2020-10-21 20:39   ` Mike Snitzer
2020-10-22  5:28   ` [dm-devel] " JeffleXu
2020-10-22  5:28     ` JeffleXu
2020-10-26 18:53     ` [dm-devel] " Mike Snitzer
2020-10-26 18:53       ` Mike Snitzer
2020-11-02  3:14       ` [dm-devel] " JeffleXu
2020-11-02  3:14         ` JeffleXu
2020-11-02 15:28         ` [dm-devel] " Mike Snitzer
2020-11-02 15:28           ` Mike Snitzer
2020-11-03  8:59           ` [dm-devel] " JeffleXu
2020-11-03  8:59             ` JeffleXu
2020-11-04  6:47           ` [dm-devel] " JeffleXu
2020-11-04  6:47             ` JeffleXu
2020-11-04 15:08             ` [dm-devel] " Mike Snitzer
2020-11-04 15:08               ` Mike Snitzer
2020-11-06  2:51               ` [dm-devel] " JeffleXu
2020-11-06  2:51                 ` JeffleXu
2020-11-06  2:57                 ` JeffleXu
2020-11-06 17:45                 ` Mike Snitzer [this message]
2020-11-06 17:45                   ` Mike Snitzer
2020-11-08  1:09                   ` [dm-devel] " JeffleXu
2020-11-08  1:09                     ` JeffleXu
2020-11-09 18:15                     ` Mike Snitzer
2020-11-09 18:15                       ` Mike Snitzer
2020-11-10  1:43                       ` [dm-devel] " JeffleXu
2020-11-10  1:43                         ` JeffleXu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201106174526.GA13292@redhat.com \
    --to=snitzer@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@redhat.com \
    --cc=haoxu@linux.alibaba.com \
    --cc=io-uring@vger.kernel.org \
    --cc=jefflexu@linux.alibaba.com \
    --cc=joseph.qi@linux.alibaba.com \
    --cc=linux-block@vger.kernel.org \
    --cc=xiaoguang.wang@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.