From: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
To: Max Reitz <mreitz@redhat.com>, Alberto Garcia <berto@igalia.com>,
Wen Congyang <wency@cn.fujitsu.com>,
qemu devel <qemu-devel@nongnu.org>, Kevin Wolf <kwolf@redhat.com>,
Eric Blake <eblake@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 1/1] quorum: Change vote rules for 64 bits hash
Date: Mon, 22 Feb 2016 11:17:44 +0800
Message-ID: <56CA7DD8.9000403@cn.fujitsu.com>
In-Reply-To: <56C877F3.3070401@redhat.com>

On 02/20/2016 10:28 PM, Max Reitz wrote:
> On 19.02.2016 12:24, Alberto Garcia wrote:
>> On Fri 19 Feb 2016 09:26:53 AM CET, Wen Congyang <wency@cn.fujitsu.com> wrote:
>>
>>>>> If quorum has two children (A, B), and A flushes successfully but B's
>>>>> flush fails, we MUST choose A as the winner rather than just picking
>>>>> either of them. Otherwise the guest's filesystem will become
>>>>> read-only with the following errors:
>>>>>
>>>>> end_request: I/O error, dev vda, sector 11159960
>>>>> Aborting journal on device vda3-8
>>>>> EXT4-fs error (device vda3): ext4_journal_start_sb:327: Detected abort journal
>>>>> EXT4-fs (vda3): Remounting filesystem read-only
>>>>
>>>> Hi Xie,
>>>>
>>>> Let's see if I'm getting this right:
>>>>
>>>> - When Quorum flushes to disk, there's a vote among the return values of
>>>> the flush operations of its members, and the one that wins is the one
>>>> that Quorum returns.
>>>>
>>>> - If there's a tie then Quorum chooses the first result from the list of
>>>> winners.
>>>>
>>>> - With your patch you want to give priority to the vote with result == 0
>>>> if there's any, so Quorum would return 0 (and succeed).
>>>>
>>>> This seems to me like an ad-hoc fix for a particular use case. What
>>>> if you have 3 members and two of them fail with the same error code?
>>>> Would you still return 0 or the error code from the other two?
>>>
>>> For example:
>>> children.0 returns 0
>>> children.1 returns -EIO
>>> children.2 returns -EPIPE
>>>
>>> In this case, quorum returns -EPIPE now (without this patch).
>>>
>>> For example:
>>> children.0 returns -EPIPE
>>> children.1 returns -EIO
>>> children.2 returns 0
>>> In this case, quorum returns 0 now.
>>
>> My question is: what's the rationale for returning 0 in case a) but not
>> in case b)?
>>
>> a)
>> children.0 returns -EPIPE
>> children.1 returns -EIO
>> children.2 returns 0
>>
>> b)
>> children.0 returns -EIO
>> children.1 returns -EIO
>> children.2 returns 0
>>
>> In both cases you have one successful flush and two errors. You want to
>> return always 0 in case a) and always -EIO in case b). But the only
>> difference is that in case b) the errors happen to be the same, so why
>> does that matter?
>>
>> That said, I'm not very convinced of the current logic of the Quorum
>> flush code either, so it's not even a problem with your patch... it
>> seems to me that the code should follow the same logic as in the
>> read/write case: if the number of correct flushes >= threshold then
>> return 0, else select the most common error code.
>
> I'm not convinced of the logic either, which is why I waited for you to
> respond to this patch. :-)
>
> Intuitively, I'd expect Quorum to return an error if flushing failed for
> any of the children, because, well, flushing failed. I somehow feel like
> flushing is different from a read or write operation and therefore
> ignoring the threshold would be fine here. However, maybe my intuition
> is just off.
>
> Anyway, regardless of that, if we do take the threshold into account, we
> should not use the exact error value for voting but just whether an
> error occurred or not. If all but one of the children fail to flush (all for
> different reasons), I find it totally wrong to return success. We should
> then just return -EIO or something.
>
Hi Berto & Max,

Thanks for your comments. I'd like to summarize here. For the flush case:
1) If a child's flush succeeds (result >= 0), set result = 0; otherwise
   (result < 0), set result = -EIO. Then invoke quorum_count_vote.
2) If the number of successful flushes >= threshold, mark the successful
   flushes as the winner directly.
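
To make the two rules above concrete, here is a rough Python sketch. It is only an illustration, not the block/quorum.c code: the names normalize_flush_result and flush_vote are made up for this example, and returning -EIO when the successful flushes stay below the threshold is an assumption, since that case is not spelled out above.

```python
import errno
from typing import Sequence


def normalize_flush_result(result: int) -> int:
    """Rule 1: collapse any success to 0 and any error to -EIO."""
    return 0 if result >= 0 else -errno.EIO


def flush_vote(results: Sequence[int], threshold: int) -> int:
    """Return the value a quorum flush would report under the summarized rules.

    `results` holds one flush return value per child (hypothetical input;
    the real driver collects these through its own vote structures).
    """
    normalized = [normalize_flush_result(r) for r in results]
    successes = normalized.count(0)

    # Rule 2: enough successful flushes win directly.
    if successes >= threshold:
        return 0

    # Assumption: below the threshold the only remaining vote value is -EIO.
    return -errno.EIO


if __name__ == "__main__":
    # Two children, A succeeds and B fails, threshold 1: success wins.
    print(flush_vote([0, -errno.EIO], threshold=1))
    # Berto's cases a) and b) with threshold 2: both now give the same answer.
    print(flush_vote([-errno.EPIPE, -errno.EIO, 0], threshold=2))
    print(flush_vote([-errno.EIO, -errno.EIO, 0], threshold=2))
```

Because every error is folded into -EIO before voting, Berto's cases a) and b) no longer differ, and a single successful child cannot win unless the threshold allows it.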

I will fix this in the next version.

Thanks
-Xie
> Max
>