From: Anthony Liguori <aliguori@linux.vnet.ibm.com>
To: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Cc: ohmura.kei@lab.ntt.co.jp, mtosatti@redhat.com,
	kvm@vger.kernel.org, dlaor@redhat.com, qemu-devel@nongnu.org,
	yoshikawa.takuya@oss.ntt.co.jp, avi@redhat.com
Subject: Re: [Qemu-devel] [RFC PATCH 00/20] Kemari for KVM v0.1
Date: Fri, 23 Apr 2010 08:20:21 -0500
Message-ID: <4BD19E95.7030906@linux.vnet.ibm.com>
In-Reply-To: <4BD0FD8F.5060108@lab.ntt.co.jp>

On 04/22/2010 08:53 PM, Yoshiaki Tamura wrote:
> Anthony Liguori wrote:
>> On 04/22/2010 08:16 AM, Yoshiaki Tamura wrote:
>>> 2010/4/22 Dor Laor<dlaor@redhat.com>:
>>>> On 04/22/2010 01:35 PM, Yoshiaki Tamura wrote:
>>>>> Dor Laor wrote:
>>>>>> On 04/21/2010 08:57 AM, Yoshiaki Tamura wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> We have been implementing the prototype of Kemari for KVM, and we're
>>>>>>> sending this message to share what we have now and TODO lists.
>>>>>>> Hopefully, we would like to get early feedback to keep us in the
>>>>>>> right direction. Although advanced approaches in the TODO lists are
>>>>>>> fascinating, we would like to run this project step by step while
>>>>>>> absorbing comments from the community. The current code is based on
>>>>>>> qemu-kvm.git 2b644fd0e737407133c88054ba498e772ce01f27.
>>>>>>>
>>>>>>> For those who are new to Kemari for KVM, please take a look at the
>>>>>>> following RFC which we posted last year.
>>>>>>>
>>>>>>> http://www.mail-archive.com/kvm@vger.kernel.org/msg25022.html
>>>>>>>
>>>>>>> The transmission/transaction protocol, and most of the control logic
>>>>>>> is implemented in QEMU. However, we needed a hack in KVM to prevent
>>>>>>> rip from proceeding before synchronizing VMs. It may also need some
>>>>>>> plumbing in the kernel side to guarantee replayability of certain
>>>>>>> events and instructions, integrate the RAS capabilities of newer x86
>>>>>>> hardware with the HA stack, as well as for optimization purposes,
>>>>>>> for example.
>>>>>> [snip]
>>>>>>
>>>>>>> The rest of this message describes TODO lists grouped by each topic.
>>>>>>>
>>>>>>> === event tapping ===
>>>>>>>
>>>>>>> Event tapping is the core component of Kemari, and it decides on
>>>>>>> which event the primary should synchronize with the secondary. The
>>>>>>> basic assumption here is that outgoing I/O operations are idempotent,
>>>>>>> which is usually true for disk I/O and reliable network protocols
>>>>>>> such as TCP.
>>>>>> IMO any type of network event should be stalled too. What if the VM
>>>>>> runs a non-TCP protocol and the packet that the master node sent
>>>>>> reached some remote client, and the master failed before the sync to
>>>>>> the slave?
>>>>> In the current implementation, it is actually stalling any type of
>>>>> network traffic that goes through virtio-net.
>>>>>
>>>>> However, if the application was using unreliable protocols, it should
>>>>> have its own recovery mechanism, or it should be completely stateless.
>>>> Why do you treat TCP differently? You can damage the entire VM this
>>>> way - think of a DHCP request that was dropped at the moment you
>>>> switched between the master and the slave?
>>> I'm not trying to say that we should treat tcp differently, but just
>>> it's severe.
>>> In case of dhcp request, the client would have a chance to retry after
>>> failover, correct?
>>> BTW, in current implementation,
>>
>> I'm slightly confused about the current implementation vs. my
>> recollection of the original paper with Xen. I had thought that all disk
>> and network I/O was buffered in such a way that at each checkpoint, the
>> I/O operations would be released in a burst. Otherwise, you would have
>> to synchronize after every I/O operation which is what it seems the
>> current implementation does.
>
> Yes, you're almost right.
> It's synchronizing before QEMU starts emulating I/O at each device model.
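
To make the two models being compared concrete, here is a toy sketch in
Python. It is not Kemari or QEMU code: the function names, the "sync" and
"emit" log entries, and the idea of closing an epoch after a fixed number
of operations (rather than on a timer) are all assumptions made purely for
illustration.

```python
def sync_per_io(io_ops):
    """Synchronize with the secondary before emulating each I/O operation."""
    log = []
    for op in io_ops:
        log.append("sync")          # checkpoint first ...
        log.append(f"emit {op}")    # ... then let the device model run
    return log


def buffer_until_checkpoint(io_ops, epoch_size):
    """Buffer outgoing I/O and release it in a burst at each checkpoint.

    An epoch closes after `epoch_size` operations (an illustrative
    assumption; a real system would more likely use a timer).
    """
    log = []
    pending = []
    for op in io_ops:
        pending.append(op)
        if len(pending) == epoch_size:
            log.append("sync")
            log.extend(f"emit {p}" for p in pending)
            pending.clear()
    if pending:                      # flush the final partial epoch
        log.append("sync")
        log.extend(f"emit {p}" for p in pending)
    return log


if __name__ == "__main__":
    ops = ["disk-write-1", "net-tx-1", "net-tx-2", "disk-write-2"]
    print(sync_per_io(ops).count("sync"))
    print(buffer_until_checkpoint(ops, 2).count("sync"))
```

In this model the per-I/O variant pays one synchronization per operation,
while the buffered variant pays one per epoch at the cost of delaying
output until the checkpoint completes.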

Suppose NodeA is the master and NodeB is the slave. If NodeA sends a
network packet, you'll checkpoint before the packet is actually sent. If
a failure then occurs before the next checkpoint, won't both NodeA and
NodeB end up sending the same packet, so the receiver sees a duplicate?
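
A toy model of the scenario, in Python, may help pin down the question. It
assumes (this is my reading, not confirmed behaviour) that the checkpoint
captures the packet as still pending and that the slave replays pending
I/O when it takes over. The function name and the "pending_tx" field are
hypothetical.

```python
def run_failover(primary_sent_before_crash):
    """Return the packets seen on the wire as (sender, packet) tuples."""
    wire = []

    # NodeA checkpoints *before* the packet is emitted, so the state
    # replicated to NodeB still lists the packet as pending.
    checkpoint = {"pending_tx": ["pkt-1"]}

    # NodeA may or may not get the packet out before it fails.
    if primary_sent_before_crash:
        wire.append(("NodeA", "pkt-1"))

    # NodeA fails before the next checkpoint. NodeB resumes from the last
    # checkpoint and replays whatever was pending at that point.
    for pkt in checkpoint["pending_tx"]:
        wire.append(("NodeB", pkt))

    return wire


if __name__ == "__main__":
    print(run_failover(primary_sent_before_crash=True))
    print(run_failover(primary_sent_before_crash=False))
```

Under those assumptions, the packet appears twice whenever NodeA manages to
transmit before failing, which is harmless only if the receiver treats the
packet as idempotent.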

Regards,

Anthony Liguori
