qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Wen Congyang <wency@cn.fujitsu.com>,
	qemu devel <qemu-devel@nongnu.org>, Fam Zheng <famz@redhat.com>,
	Max Reitz <mreitz@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Alberto Garcia <berto@igalia.com>,
	zhanghailiang <zhang.zhanghailiang@huawei.com>,
	qemu block <qemu-block@nongnu.org>,
	Jiang Yunhong <yunhong.jiang@intel.com>,
	Dong Eddie <eddie.dong@intel.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Michael R. Hines" <mrhines@linux.vnet.ibm.com>,
	Gonglei <arei.gonglei@huawei.com>,
	Yang Hongyang <yanghy@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH 06/16] quorum: allow ignoring child errors
Date: Wed, 2 Sep 2015 10:30:26 -0600	[thread overview]
Message-ID: <55E72422.30301@redhat.com> (raw)
In-Reply-To: <1441183880-26993-7-git-send-email-wency@cn.fujitsu.com>

[-- Attachment #1: Type: text/plain, Size: 4191 bytes --]

On 09/02/2015 02:51 AM, Wen Congyang wrote:
> If the child is not ready, read/write/getlength/flush will
> return -errno. It is not critical error, and can be ignored:
> 1. read/write:
>    Just not report the error event.

What happens if all the children report an error?  Or is the threshold
at play here?

For example, if you have a threshold of 3/5, then I'm assuming that if
up to two children return an errno, then it is okay to ignore; but if
three or more return an errno, you haven't met threshold, so the I/O
must fail.

Are you ignoring ALL errors (including things like EACCES), or just EIO
errors?


> 2. getlength:
>    just ignore it. If all children's getlength return -errno,
>    and be ignored, return -EIO.
> 3. flush:
>    Just ignore it. If all children's getlength return -errno,

s/getlength/flush/

>    and be ignored, return 0.

Yuck - claiming success when all of the children fail feels dangerous.

> 
> Usage: children.x.ignore-errors=true
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Cc: Alberto Garcia <berto@igalia.com>
> ---
>  block/quorum.c       | 94 ++++++++++++++++++++++++++++++++++++++++++++++++----

Interface review only:

> +++ b/qapi/block-core.json
> @@ -1411,6 +1411,8 @@
>  # @allow-write-backing-file: #optional whether the backing file is opened in
>  #                            read-write mode. It is only for backing file
>  #                            (Since 2.5 default: false)
> +# @ignore-errors: #options whether the child's I/O error should be ignored.

s/options/optional/
s/error/errors/

> +#                 it is only for quorum's child.(Since 2.5 default: false)

Space after '.' in English sentences.

The fact that you are documenting that this option can only be specified
for quorum children makes me wonder if it belongs better as an option in
BlockdevOptionsQuorum rather than BlockdevOptionsBase.

Semantically, it sounds like you are trying to allow for a per-child
decision of whether this particular child's errors matter to the overall
quorum.  So, if we have a 3/5 quorum, we can decide that for children A,
B, C, and D, errors cannot be ignored, but for child E, errors are not a
problem.

As written, you are tying the semantics to each child BDS, and requiring
special code to double-check that the property is only ever set if the
BDS is used as the child of a quorum.  Furthermore, if the property is
set, you are then changing what the child does in response to various
operations.

What if you instead create a list property in the quorum parent?  Maybe
along the lines of:

# @child-errors-okay: #optional an array of named-node children where
errors will be ignored (Since 2.5, default empty)

{ 'struct': 'BlockdevOptionsQuorum',
  'data': { '*blkverify': 'bool',
            'children': [ 'BlockdevRef' ],
            'vote-threshold': 'int',
            '*rewrite-corrupted': 'bool',
            '*read-pattern': 'QuorumReadPattern',
            '*child-errors-okay': ['str'] } }

The above example of a 3/5 quorum, where only child E can ignore errors,
would then be represented as:

{ "children": [ "A", "B", "C", "D", "E" ], 'vote-threshold':3,
'child-errors-okay': [ "E" ] }

The code to ignore the errors is then done in the quorum itself (the BDS
for E does not have to care about a special ignore-errors property, but
just always returns the error as usual; and then the quorum is deciding
how to handle the error), and you are not polluting the BDS state for
something that is quorum-specific, because it is now the quorum itself
that tracks the special casing.

Finally, why can't hot-plug/unplug of quorum members work?  If you are
going to always ignore errors from a particular child, then why is that
child even part of the quorum?  Isn't a better design to just not add
the child to the quorum until it is ready and won't be reporting errors?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

  reply	other threads:[~2015-09-02 16:30 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-02  8:51 [Qemu-devel] [PATCH 00/16] Block replication for continuous checkpoints Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 01/16] introduce a new API to enable/disable attach device model Wen Congyang
2015-09-02 15:37   ` Eric Blake
2015-09-07  1:27     ` Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 02/16] introduce a new API to check if blk is attached Wen Congyang
2015-09-02 15:40   ` Eric Blake
2015-09-02  8:51 ` [Qemu-devel] [PATCH 03/16] allow writing to the backing file Wen Congyang
2015-09-02 16:06   ` Eric Blake
2015-09-09  9:19     ` Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 04/16] block: Allow references for backing files Wen Congyang
2015-09-02 18:50   ` Eric Blake
2015-09-09  8:51     ` Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 05/16] introduce a new API qemu_opts_absorb_qdict_by_index() Wen Congyang
2015-09-02 19:01   ` Eric Blake
2015-09-07  2:18     ` Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 06/16] quorum: allow ignoring child errors Wen Congyang
2015-09-02 16:30   ` Eric Blake [this message]
2015-09-07  3:40     ` Wen Congyang
2015-09-07 16:56     ` Dr. David Alan Gilbert
2015-09-08  0:46       ` Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 07/16] Backup: clear all bitmap when doing block checkpoint Wen Congyang
2015-09-02 14:10   ` Jeff Cody
2015-09-02  8:51 ` [Qemu-devel] [PATCH 08/16] block: make bdrv_put_ref_bh_schedule() as a public API Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 09/16] Allow creating backup jobs when opening BDS Wen Congyang
2015-09-02 14:12   ` Jeff Cody
2015-09-02  8:51 ` [Qemu-devel] [PATCH 10/16] docs: block replication's description Wen Congyang
2015-09-02 20:41   ` Eric Blake
2015-09-09  8:22     ` Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 11/16] Add new block driver interfaces to control block replication Wen Congyang
2015-09-02 16:33   ` Eric Blake
2015-09-09  9:24     ` Wen Congyang
2015-09-25  6:14     ` Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 12/16] skip nbd_target when starting " Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 13/16] quorum: implement block driver interfaces for " Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 14/16] Implement new driver " Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 15/16] support replication driver in blockdev-add Wen Congyang
2015-09-02 16:36   ` Eric Blake
2015-09-09  8:27     ` Wen Congyang
2015-09-02  8:51 ` [Qemu-devel] [PATCH 16/16] Add a new API to start/stop replication, do checkpoint to all BDSes Wen Congyang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55E72422.30301@redhat.com \
    --to=eblake@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=berto@igalia.com \
    --cc=dgilbert@redhat.com \
    --cc=eddie.dong@intel.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=mrhines@linux.vnet.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=wency@cn.fujitsu.com \
    --cc=yanghy@cn.fujitsu.com \
    --cc=yunhong.jiang@intel.com \
    --cc=zhang.zhanghailiang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).