qemu-devel.nongnu.org archive mirror
From: Kevin Wolf <kwolf@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-block@nongnu.org, berto@igalia.com, mreitz@redhat.com,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 3/8] quorum: Implement .bdrv_co_readv/writev
Date: Tue, 22 Nov 2016 12:32:51 +0100
Message-ID: <20161122113251.GB5615@noname.redhat.com>
In-Reply-To: <d5c86792-374e-9631-d1df-3b9eb73ba81d@redhat.com>

On 21.11.2016 at 18:58, Eric Blake wrote:
> On 11/21/2016 11:31 AM, Kevin Wolf wrote:
> > This converts the quorum block driver from implementing callback-based
> > interfaces for read/write to coroutine-based ones. This is the first
> > step that will allow us further simplification of the code.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/quorum.c | 192 ++++++++++++++++++++++++++++++++++-----------------------
> >  1 file changed, 115 insertions(+), 77 deletions(-)
> > 
> 
> > @@ -174,14 +162,14 @@ static bool quorum_64bits_compare(QuorumVoteValue *a, QuorumVoteValue *b)
> >  static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
> >                                     QEMUIOVector *qiov,
> >                                     uint64_t sector_num,
> > -                                   int nb_sectors,
> > -                                   BlockCompletionFunc *cb,
> > -                                   void *opaque)
> > +                                   int nb_sectors)
> >  {
> >      BDRVQuorumState *s = bs->opaque;
> > -    QuorumAIOCB *acb = qemu_aio_get(&quorum_aiocb_info, bs, cb, opaque);
> > +    QuorumAIOCB *acb = g_new(QuorumAIOCB, 1);
> 
> Worth using g_new0() here...
> 
> >      int i;
> >  
> > +    acb->co = qemu_coroutine_self();
> > +    acb->bs = bs;
> >      acb->sector_num = sector_num;
> >      acb->nb_sectors = nb_sectors;
> >      acb->qiov = qiov;
> > @@ -191,6 +179,7 @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
> >      acb->rewrite_count = 0;
> >      acb->votes.compare = quorum_sha256_compare;
> >      QLIST_INIT(&acb->votes.vote_list);
> > +    acb->has_completed = false;
> >      acb->is_read = false;
> >      acb->vote_ret = 0;
> 
> ...to eliminate 0-assignments here? Not a show-stopper to leave it
> as-is, though.

Not in this patch anyway. I could add a cleanup patch at the end of the
series or as a follow-up, though. As you probably know by now, my style
of writing this in new code would use a compound literal:

    QuorumAIOCB *acb = g_new(QuorumAIOCB, 1);
    *acb = (QuorumAIOCB) {
        ...
    };
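
To make the compound-literal style concrete, here is a minimal standalone
sketch. It is not the quorum code: DemoAIOCB and demo_aio_get() are
hypothetical names, the struct is a cut-down stand-in for QuorumAIOCB, and
plain malloc() replaces g_new() so that the example builds without GLib.
Members that are not named in the initializer are zero-initialized, which is
what makes the explicit 0/false assignments unnecessary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for QuorumAIOCB (not the real struct). */
typedef struct DemoAIOCB {
    uint64_t sector_num;
    int nb_sectors;
    int rewrite_count;
    bool has_completed;
    bool is_read;
    int vote_ret;
} DemoAIOCB;

/* Allocate a request and initialize every member in one assignment. */
static DemoAIOCB *demo_aio_get(uint64_t sector_num, int nb_sectors)
{
    /* Assumption: malloc() + assert() stands in for g_new(), which aborts
     * on allocation failure instead of returning NULL. */
    DemoAIOCB *acb = malloc(sizeof(*acb));
    assert(acb != NULL);

    *acb = (DemoAIOCB) {
        .sector_num = sector_num,
        .nb_sectors = nb_sectors,
        /* rewrite_count, has_completed, is_read and vote_ret are not
         * named here, so they are zero-initialized. */
    };
    return acb;
}
```

Compared with g_new0() followed by individual assignments, this keeps the
non-zero fields in one place and leaves nothing uninitialized if a member is
added to the struct later.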

> > -static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb);
> > +static int read_fifo_child(QuorumAIOCB *acb);
> >  
> >  static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
> >  {
> > @@ -272,14 +261,14 @@ static void quorum_report_bad_acb(QuorumChildRequest *sacb, int ret)
> >      QuorumAIOCB *acb = sacb->parent;
> >      QuorumOpType type = acb->is_read ? QUORUM_OP_TYPE_READ : QUORUM_OP_TYPE_WRITE;
> >      quorum_report_bad(type, acb->sector_num, acb->nb_sectors,
> > -                      sacb->aiocb->bs->node_name, ret);
> > +                      sacb->bs->node_name, ret);
> >  }
> >  
> > -static void quorum_fifo_aio_cb(void *opaque, int ret)
> > +static int quorum_fifo_aio_cb(void *opaque, int ret)
> >  {
> >      QuorumChildRequest *sacb = opaque;
> >      QuorumAIOCB *acb = sacb->parent;
> > -    BDRVQuorumState *s = acb->common.bs->opaque;
> > +    BDRVQuorumState *s = acb->bs->opaque;
> >  
> >      assert(acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO);
> >  
> > @@ -288,8 +277,7 @@ static void quorum_fifo_aio_cb(void *opaque, int ret)
> >  
> >          /* We try to read next child in FIFO order if we fail to read */
> >          if (acb->children_read < s->num_children) {
> > -            read_fifo_child(acb);
> > -            return;
> > +            return read_fifo_child(acb);
> >          }
> 
> Question unrelated to this patch: in FIFO mode, are we doing work
> sequentially or in parallel?  That is, does the quorum code kick off all
> children simultaneously, then wait until the first child answers with
> success (and abort all remaining children) or failure (at which point
> moving to the second child may already have an answer)?  Or does it only
> kick off the first child, wait for a response, and not start the second
> child until after the first child fails?

It's the latter. This is quite easy to see in the new model (at the
end of this patch series) because in FIFO mode, reads don't spawn
coroutines, but just have a loop of bdrv_co_preadv() calls.
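
The sequential behaviour described above can be sketched outside of QEMU
roughly as follows. This is only an illustration under stated assumptions:
DemoChildRead, demo_read_fifo() and the two demo children are hypothetical,
and a plain function pointer stands in for a blocking bdrv_co_preadv() call
on one child. The point is that child n+1 is not touched until child n has
returned an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical child read: 0 on success, negative errno on failure. */
typedef int (*DemoChildRead)(void *buf, size_t len);

/* Demo child that always fails with -EIO. */
static int demo_child_fail(void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return -EIO;
}

/* Demo child that succeeds and fills the buffer with 0xab. */
static int demo_child_ok(void *buf, size_t len)
{
    memset(buf, 0xab, len);
    return 0;
}

/*
 * Try the children strictly in order and stop at the first success.
 * *children_read receives the number of children that were tried.
 * Returns the result of the last child tried.
 */
static int demo_read_fifo(DemoChildRead *children, int num_children,
                          void *buf, size_t len, int *children_read)
{
    int ret = -EIO;

    assert(num_children > 0);

    *children_read = 0;
    while (*children_read < num_children) {
        /* Only one request is in flight at any time. */
        ret = children[(*children_read)++](buf, len);
        if (ret >= 0) {
            break;
        }
    }
    return ret;
}
```

With the children { demo_child_fail, demo_child_ok, demo_child_fail }, the
loop returns 0 after trying two children and never starts the third. The
cost is latency: a slow failure on the first child delays the whole read.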

> I guess one way has more
> potentially wasted work (and a stress test of our ability to cancel work
> on secondary children), while the other has higher latencies, so maybe
> it is something that a future quorum patch may want to make configurable?

Our ability to cancel work barely exists, so I'm not too sure whether
the other way would really be worth implementing.

> >  
> > -static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb)
> > +static int read_fifo_child(QuorumAIOCB *acb)
> >  {
> > -    BDRVQuorumState *s = acb->common.bs->opaque;
> > +    BDRVQuorumState *s = acb->bs->opaque;
> >      int n = acb->children_read++;
> > +    int ret;
> >  
> > -    acb->qcrs[n].aiocb = bdrv_aio_readv(s->children[n], acb->sector_num,
> > -                                        acb->qiov, acb->nb_sectors,
> > -                                        quorum_fifo_aio_cb, &acb->qcrs[n]);
> > +    acb->qcrs[n].bs = s->children[n]->bs;
> > +    ret = bdrv_co_preadv(s->children[n], acb->sector_num * BDRV_SECTOR_SIZE,
> > +                         acb->nb_sectors * BDRV_SECTOR_SIZE, acb->qiov, 0);
> > +    ret = quorum_fifo_aio_cb(&acb->qcrs[n], ret);
> 
> somewhat answering myself - it looks like the current fifo approach is
> high-latency rather than parallel, in that at most one child is being
> run at a time.

Yes, you can see it in this patch already, even if it's even clearer at
the end of the series.

Kevin
