From: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
To: ya su <suya94335@gmail.com>
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, avi@redhat.com,
anthony@codemonkey.ws, aliguori@us.ibm.com, mtosatti@redhat.com,
dlaor@redhat.com, mst@redhat.com, kwolf@redhat.com,
pbonzini@redhat.com, quintela@redhat.com, ananth@in.ibm.com,
psuriset@linux.vnet.ibm.com, vatsa@linux.vnet.ibm.com,
stefanha@linux.vnet.ibm.com, blauwirbel@gmail.com,
ohmura.kei@lab.ntt.co.jp
Subject: Re: [PATCH 09/18] Introduce event-tap.
Date: Wed, 09 Mar 2011 15:26:50 +0900
Message-ID: <4D771DAA.1000801@lab.ntt.co.jp>
In-Reply-To: <AANLkTimpbEVhu1-ijA9h2__tNJtXQVJZcoXzN0Fdrqpj@mail.gmail.com>
ya su wrote:
> Yoshi:
>
> I think event-tap is a great idea; it removes the reading from disk,
> which will improve FT efficiency much as you plan in the later
> series.
>
> one question: IO read/write may dirty RAM, but it is difficult to
> distinguish those pages from other dirty pages, e.g. ones caused by
> running software. Does that mean you need to change all the emulated
> device implementations? Actually I think ram_save_live will not send
> too much RAM dirtied by IO read/write, but if event-tap can record IO
> read/write and replay it on the other side, does that mean we don't
> need to call qemu_savevm_state_full in FT transactions?
I'm not expecting to remove qemu_savevm_state_full from the transaction; just
to reduce the number of pages to be transferred as a result.
Thanks,
Yoshi
>
> Green.
>
>
> 2011/3/9 Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>:
>> ya su wrote:
>>>
>>> 2011/3/8 Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>:
>>>>
>>>> ya su wrote:
>>>>>
>>>>> Yoshiaki:
>>>>>
>>>>> event-tap records block and IO write events and replays them on
>>>>> the other side, so block_save_live is useless during the latter FT
>>>>> phase, right? If so, I think it needs to handle the following code in
>>>>> the block_save_live function:
>>>>
>>>> Actually no. It just replays the last events. We do have patches
>>>> that enable block replication without using block live migration,
>>>> like the way you described above. In that case, we disable block
>>>> live migration when we go into FT mode. We're thinking of proposing
>>>> them after this series gets settled.
>>>
>>> so event-tap's objective is to initiate an FT transaction, to start
>>> the sync of ram/block/device states? If so, it need not change the
>>> normal bdrv_aio_writev/bdrv_aio_flush process, and on the other side
>>> it need not invoke bdrv_aio_writev either, right?
>>
>> Mostly yes, but because event-tap queues requests from block/net, it
>> needs to flush the queued requests after the transaction on the primary
>> side. On the secondary, it currently doesn't have to invoke
>> bdrv_aio_writev, as you mentioned, but that will change soon to enable
>> block replication with event-tap.
>>
>>>
>>>>
>>>>>
>>>>> if (stage == 1) {
>>>>> init_blk_migration(mon, f);
>>>>>
>>>>> /* start track dirty blocks */
>>>>> set_dirty_tracking(1);
>>>>> }
>>>>> --------------------------------------
>>>>> the following code will send blocks to the other side, but this will
>>>>> also be done by event-tap replay. I think it should be placed in stage 3,
>>>>> before the assert line. (This may affect the stage 2 rate limit,
>>>>> so it could be placed in stage 2 instead, though that looks ugly.) Another
>>>>> choice is to avoid the invocation of block_save_live entirely, right?
>>>>> ---------------------------------------
>>>>> flush_blks(f);
>>>>>
>>>>> if (qemu_file_has_error(f)) {
>>>>> blk_mig_cleanup(mon);
>>>>> return 0;
>>>>> }
>>>>>
>>>>> blk_mig_reset_dirty_cursor();
>>>>> ----------------------------------------
>>>>> if (stage == 2) {
>>>>>
>>>>>
>>>>> another question: since you event-tap IO writes (I think IO reads
>>>>> should also be event-tapped, as a read may cause IO chip state to
>>>>> change), you then need not invoke qemu_savevm_state_full in
>>>>> qemu_savevm_trans_complete, right? thanks.
>>>>
>>>> It's not necessary to tap IO reads, but you can if you like. We also
>>>> have experimental patches for this to reduce the RAM to be
>>>> transferred. But I don't understand why we wouldn't have to invoke
>>>> qemu_savevm_state_full, although I think we may reduce the number of
>>>> RAM pages by replaying IO reads on the secondary.
>>>>
>>>
>>> I first thought the objective of the IO-write event-tap was to
>>> reproduce the same device state on the other side, though I doubted
>>> this, so I thought IO reads should also be recorded and replayed.
>>> Since event-tap only initiates an FT transaction, and the sync of
>>> states still depends on qemu_savevm_state_live/full, I understand the
>>> design now, thanks.
>>>
>>> but I don't understand why the IO-write event-tap can reduce the
>>> transferred RAM as you mentioned; the amount of RAM only depends on
>>> dirty pages, and IO writes don't change the normal process, unlike
>>> block writes, right?
>>
>> The point is, if we can assure that an IO read retrieves the same data
>> on both sides, then instead of dirtying the RAM by the read, which
>> would mean we have to transfer it in the transaction, we just replay
>> the operation and get the same data on the other side. Anyway, that's
>> just a plan :)
>>
>> Thanks,
>>
>> Yoshi
>>
>>>
>>>> Thanks,
>>>>
>>>> Yoshi
>>>>
>>>>>
>>>>>
>>>>> Green.
>>>>>
>>>>>
>>>>>
>>>>> 2011/2/24 Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>:
>>>>>>
>>>>>> event-tap controls when to start an FT transaction, and provides
>>>>>> proxy functions to be called from net/block devices. During an FT
>>>>>> transaction, it queues up net/block requests and flushes them when
>>>>>> the transaction completes.
>>>>>>
>>>>>> Signed-off-by: Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>
>>>>>> Signed-off-by: OHMURA Kei<ohmura.kei@lab.ntt.co.jp>
>>>>>> ---
>>>>>> Makefile.target | 1 +
>>>>>> event-tap.c | 940 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>> event-tap.h | 44 +++
>>>>>> qemu-tool.c | 28 ++
>>>>>> trace-events | 10 +
>>>>>> 5 files changed, 1023 insertions(+), 0 deletions(-)
>>>>>> create mode 100644 event-tap.c
>>>>>> create mode 100644 event-tap.h
>>>>>>
>>>>>> diff --git a/Makefile.target b/Makefile.target
>>>>>> index 220589e..da57efe 100644
>>>>>> --- a/Makefile.target
>>>>>> +++ b/Makefile.target
>>>>>> @@ -199,6 +199,7 @@ obj-y += rwhandler.o
>>>>>> obj-$(CONFIG_KVM) += kvm.o kvm-all.o
>>>>>> obj-$(CONFIG_NO_KVM) += kvm-stub.o
>>>>>> LIBS+=-lz
>>>>>> +obj-y += event-tap.o
>>>>>>
>>>>>> QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
>>>>>> QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
>>>>>> diff --git a/event-tap.c b/event-tap.c
>>>>>> new file mode 100644
>>>>>> index 0000000..95c147a
>>>>>> --- /dev/null
>>>>>> +++ b/event-tap.c
>>>>>> @@ -0,0 +1,940 @@
>>>>>> +/*
>>>>>> + * Event Tap functions for QEMU
>>>>>> + *
>>>>>> + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
>>>>>> + *
>>>>>> + * This work is licensed under the terms of the GNU GPL, version 2. See
>>>>>> + * the COPYING file in the top-level directory.
>>>>>> + */
>>>>>> +
>>>>>> +#include "qemu-common.h"
>>>>>> +#include "qemu-error.h"
>>>>>> +#include "block.h"
>>>>>> +#include "block_int.h"
>>>>>> +#include "ioport.h"
>>>>>> +#include "osdep.h"
>>>>>> +#include "sysemu.h"
>>>>>> +#include "hw/hw.h"
>>>>>> +#include "net.h"
>>>>>> +#include "event-tap.h"
>>>>>> +#include "trace.h"
>>>>>> +
>>>>>> +enum EVENT_TAP_STATE {
>>>>>> + EVENT_TAP_OFF,
>>>>>> + EVENT_TAP_ON,
>>>>>> + EVENT_TAP_SUSPEND,
>>>>>> + EVENT_TAP_FLUSH,
>>>>>> + EVENT_TAP_LOAD,
>>>>>> + EVENT_TAP_REPLAY,
>>>>>> +};
>>>>>> +
>>>>>> +static enum EVENT_TAP_STATE event_tap_state = EVENT_TAP_OFF;
>>>>>> +
>>>>>> +typedef struct EventTapIOport {
>>>>>> + uint32_t address;
>>>>>> + uint32_t data;
>>>>>> + int index;
>>>>>> +} EventTapIOport;
>>>>>> +
>>>>>> +#define MMIO_BUF_SIZE 8
>>>>>> +
>>>>>> +typedef struct EventTapMMIO {
>>>>>> + uint64_t address;
>>>>>> + uint8_t buf[MMIO_BUF_SIZE];
>>>>>> + int len;
>>>>>> +} EventTapMMIO;
>>>>>> +
>>>>>> +typedef struct EventTapNetReq {
>>>>>> + char *device_name;
>>>>>> + int iovcnt;
>>>>>> + int vlan_id;
>>>>>> + bool vlan_needed;
>>>>>> + bool async;
>>>>>> + struct iovec *iov;
>>>>>> + NetPacketSent *sent_cb;
>>>>>> +} EventTapNetReq;
>>>>>> +
>>>>>> +#define MAX_BLOCK_REQUEST 32
>>>>>> +
>>>>>> +typedef struct EventTapAIOCB EventTapAIOCB;
>>>>>> +
>>>>>> +typedef struct EventTapBlkReq {
>>>>>> + char *device_name;
>>>>>> + int num_reqs;
>>>>>> + int num_cbs;
>>>>>> + bool is_flush;
>>>>>> + BlockRequest reqs[MAX_BLOCK_REQUEST];
>>>>>> + EventTapAIOCB *acb[MAX_BLOCK_REQUEST];
>>>>>> +} EventTapBlkReq;
>>>>>> +
>>>>>> +#define EVENT_TAP_IOPORT (1 << 0)
>>>>>> +#define EVENT_TAP_MMIO (1 << 1)
>>>>>> +#define EVENT_TAP_NET (1 << 2)
>>>>>> +#define EVENT_TAP_BLK (1 << 3)
>>>>>> +
>>>>>> +#define EVENT_TAP_TYPE_MASK (EVENT_TAP_NET - 1)
>>>>>> +
>>>>>> +typedef struct EventTapLog {
>>>>>> + int mode;
>>>>>> + union {
>>>>>> + EventTapIOport ioport;
>>>>>> + EventTapMMIO mmio;
>>>>>> + };
>>>>>> + union {
>>>>>> + EventTapNetReq net_req;
>>>>>> + EventTapBlkReq blk_req;
>>>>>> + };
>>>>>> + QTAILQ_ENTRY(EventTapLog) node;
>>>>>> +} EventTapLog;
>>>>>> +
>>>>>> +struct EventTapAIOCB {
>>>>>> + BlockDriverAIOCB common;
>>>>>> + BlockDriverAIOCB *acb;
>>>>>> + bool is_canceled;
>>>>>> +};
>>>>>> +
>>>>>> +static EventTapLog *last_event_tap;
>>>>>> +
>>>>>> +static QTAILQ_HEAD(, EventTapLog) event_list;
>>>>>> +static QTAILQ_HEAD(, EventTapLog) event_pool;
>>>>>> +
>>>>>> +static int (*event_tap_cb)(void);
>>>>>> +static QEMUBH *event_tap_bh;
>>>>>> +static VMChangeStateEntry *vmstate;
>>>>>> +
>>>>>> +static void event_tap_bh_cb(void *p)
>>>>>> +{
>>>>>> + if (event_tap_cb) {
>>>>>> + event_tap_cb();
>>>>>> + }
>>>>>> +
>>>>>> + qemu_bh_delete(event_tap_bh);
>>>>>> + event_tap_bh = NULL;
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_schedule_bh(void)
>>>>>> +{
>>>>>> + trace_event_tap_ignore_bh(!!event_tap_bh);
>>>>>> +
>>>>>> + /* if bh is already set, we ignore it for now */
>>>>>> + if (event_tap_bh) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + event_tap_bh = qemu_bh_new(event_tap_bh_cb, NULL);
>>>>>> + qemu_bh_schedule(event_tap_bh);
>>>>>> +
>>>>>> + return;
>>>>>> +}
>>>>>> +
>>>>>> +static void *event_tap_alloc_log(void)
>>>>>> +{
>>>>>> + EventTapLog *log;
>>>>>> +
>>>>>> + if (QTAILQ_EMPTY(&event_pool)) {
>>>>>> + log = qemu_mallocz(sizeof(EventTapLog));
>>>>>> + } else {
>>>>>> + log = QTAILQ_FIRST(&event_pool);
>>>>>> + QTAILQ_REMOVE(&event_pool, log, node);
>>>>>> + }
>>>>>> +
>>>>>> + return log;
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_free_net_req(EventTapNetReq *net_req);
>>>>>> +static void event_tap_free_blk_req(EventTapBlkReq *blk_req);
>>>>>> +
>>>>>> +static void event_tap_free_log(EventTapLog *log)
>>>>>> +{
>>>>>> + int mode = log->mode & ~EVENT_TAP_TYPE_MASK;
>>>>>> +
>>>>>> + if (mode == EVENT_TAP_NET) {
>>>>>> + event_tap_free_net_req(&log->net_req);
>>>>>> + } else if (mode == EVENT_TAP_BLK) {
>>>>>> + event_tap_free_blk_req(&log->blk_req);
>>>>>> + }
>>>>>> +
>>>>>> + log->mode = 0;
>>>>>> +
>>>>>> + /* return the log to event_pool */
>>>>>> + QTAILQ_INSERT_HEAD(&event_pool, log, node);
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_free_pool(void)
>>>>>> +{
>>>>>> + EventTapLog *log, *next;
>>>>>> +
>>>>>> + QTAILQ_FOREACH_SAFE(log, &event_pool, node, next) {
>>>>>> + QTAILQ_REMOVE(&event_pool, log, node);
>>>>>> + qemu_free(log);
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_free_net_req(EventTapNetReq *net_req)
>>>>>> +{
>>>>>> + int i;
>>>>>> +
>>>>>> + if (!net_req->async) {
>>>>>> + for (i = 0; i < net_req->iovcnt; i++) {
>>>>>> + qemu_free(net_req->iov[i].iov_base);
>>>>>> + }
>>>>>> + qemu_free(net_req->iov);
>>>>>> + } else if (event_tap_state >= EVENT_TAP_LOAD) {
>>>>>> + qemu_free(net_req->iov);
>>>>>> + }
>>>>>> +
>>>>>> + qemu_free(net_req->device_name);
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_alloc_net_req(EventTapNetReq *net_req,
>>>>>> + VLANClientState *vc,
>>>>>> + const struct iovec *iov, int iovcnt,
>>>>>> + NetPacketSent *sent_cb, bool async)
>>>>>> +{
>>>>>> + int i;
>>>>>> +
>>>>>> + net_req->iovcnt = iovcnt;
>>>>>> + net_req->async = async;
>>>>>> + net_req->device_name = qemu_strdup(vc->name);
>>>>>> + net_req->sent_cb = sent_cb;
>>>>>> +
>>>>>> + if (vc->vlan) {
>>>>>> + net_req->vlan_needed = 1;
>>>>>> + net_req->vlan_id = vc->vlan->id;
>>>>>> + } else {
>>>>>> + net_req->vlan_needed = 0;
>>>>>> + }
>>>>>> +
>>>>>> + if (async) {
>>>>>> + net_req->iov = (struct iovec *)iov;
>>>>>> + } else {
>>>>>> + net_req->iov = qemu_malloc(sizeof(struct iovec) * iovcnt);
>>>>>> + for (i = 0; i < iovcnt; i++) {
>>>>>> + net_req->iov[i].iov_base = qemu_malloc(iov[i].iov_len);
>>>>>> + memcpy(net_req->iov[i].iov_base, iov[i].iov_base, iov[i].iov_len);
>>>>>> + net_req->iov[i].iov_len = iov[i].iov_len;
>>>>>> + }
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_packet(VLANClientState *vc, const struct iovec *iov,
>>>>>> + int iovcnt, NetPacketSent *sent_cb, bool async)
>>>>>> +{
>>>>>> + int empty;
>>>>>> + EventTapLog *log = last_event_tap;
>>>>>> +
>>>>>> + if (!log) {
>>>>>> + trace_event_tap_no_event();
>>>>>> + log = event_tap_alloc_log();
>>>>>> + }
>>>>>> +
>>>>>> + if (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>> + trace_event_tap_already_used(log->mode & ~EVENT_TAP_TYPE_MASK);
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + log->mode |= EVENT_TAP_NET;
>>>>>> + event_tap_alloc_net_req(&log->net_req, vc, iov, iovcnt, sent_cb, async);
>>>>>> +
>>>>>> + empty = QTAILQ_EMPTY(&event_list);
>>>>>> + QTAILQ_INSERT_TAIL(&event_list, log, node);
>>>>>> + last_event_tap = NULL;
>>>>>> +
>>>>>> + if (empty) {
>>>>>> + event_tap_schedule_bh();
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_send_packet(VLANClientState *vc, const uint8_t *buf, int size)
>>>>>> +{
>>>>>> + struct iovec iov;
>>>>>> +
>>>>>> + assert(event_tap_state == EVENT_TAP_ON);
>>>>>> +
>>>>>> + iov.iov_base = (uint8_t *)buf;
>>>>>> + iov.iov_len = size;
>>>>>> + event_tap_packet(vc, &iov, 1, NULL, 0);
>>>>>> +
>>>>>> + return;
>>>>>> +}
>>>>>> +
>>>>>> +ssize_t event_tap_sendv_packet_async(VLANClientState *vc,
>>>>>> + const struct iovec *iov,
>>>>>> + int iovcnt, NetPacketSent *sent_cb)
>>>>>> +{
>>>>>> + assert(event_tap_state == EVENT_TAP_ON);
>>>>>> + event_tap_packet(vc, iov, iovcnt, sent_cb, 1);
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_net_flush(EventTapNetReq *net_req)
>>>>>> +{
>>>>>> + VLANClientState *vc;
>>>>>> + ssize_t len;
>>>>>> +
>>>>>> + if (net_req->vlan_needed) {
>>>>>> + vc = qemu_find_vlan_client_by_name(NULL, net_req->vlan_id,
>>>>>> + net_req->device_name);
>>>>>> + } else {
>>>>>> + vc = qemu_find_netdev(net_req->device_name);
>>>>>> + }
>>>>>> +
>>>>>> + if (net_req->async) {
>>>>>> + len = qemu_sendv_packet_async(vc, net_req->iov, net_req->iovcnt,
>>>>>> + net_req->sent_cb);
>>>>>> + if (len) {
>>>>>> + net_req->sent_cb(vc, len);
>>>>>> + } else {
>>>>>> + /* packets are queued in the net layer */
>>>>>> + trace_event_tap_append_packet();
>>>>>> + }
>>>>>> + } else {
>>>>>> + qemu_send_packet(vc, net_req->iov[0].iov_base,
>>>>>> + net_req->iov[0].iov_len);
>>>>>> + }
>>>>>> +
>>>>>> + /* force flush to avoid request inversion */
>>>>>> + qemu_aio_flush();
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_net_save(QEMUFile *f, EventTapNetReq *net_req)
>>>>>> +{
>>>>>> + ram_addr_t page_addr;
>>>>>> + int i, len;
>>>>>> +
>>>>>> + len = strlen(net_req->device_name);
>>>>>> + qemu_put_byte(f, len);
>>>>>> + qemu_put_buffer(f, (uint8_t *)net_req->device_name, len);
>>>>>> + qemu_put_byte(f, net_req->vlan_id);
>>>>>> + qemu_put_byte(f, net_req->vlan_needed);
>>>>>> + qemu_put_byte(f, net_req->async);
>>>>>> + qemu_put_be32(f, net_req->iovcnt);
>>>>>> +
>>>>>> + for (i = 0; i < net_req->iovcnt; i++) {
>>>>>> + qemu_put_be64(f, net_req->iov[i].iov_len);
>>>>>> + if (net_req->async) {
>>>>>> + page_addr = qemu_ram_addr_from_host_nofail(net_req->iov[i].iov_base);
>>>>>> + qemu_put_be64(f, page_addr);
>>>>>> + } else {
>>>>>> + qemu_put_buffer(f, (uint8_t *)net_req->iov[i].iov_base,
>>>>>> + net_req->iov[i].iov_len);
>>>>>> + }
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_net_load(QEMUFile *f, EventTapNetReq *net_req)
>>>>>> +{
>>>>>> + ram_addr_t page_addr;
>>>>>> + int i, len;
>>>>>> +
>>>>>> + len = qemu_get_byte(f);
>>>>>> + net_req->device_name = qemu_malloc(len + 1);
>>>>>> + qemu_get_buffer(f, (uint8_t *)net_req->device_name, len);
>>>>>> + net_req->device_name[len] = '\0';
>>>>>> + net_req->vlan_id = qemu_get_byte(f);
>>>>>> + net_req->vlan_needed = qemu_get_byte(f);
>>>>>> + net_req->async = qemu_get_byte(f);
>>>>>> + net_req->iovcnt = qemu_get_be32(f);
>>>>>> + net_req->iov = qemu_malloc(sizeof(struct iovec) * net_req->iovcnt);
>>>>>> +
>>>>>> + for (i = 0; i < net_req->iovcnt; i++) {
>>>>>> + net_req->iov[i].iov_len = qemu_get_be64(f);
>>>>>> + if (net_req->async) {
>>>>>> + page_addr = qemu_get_be64(f);
>>>>>> + net_req->iov[i].iov_base = qemu_get_ram_ptr(page_addr);
>>>>>> + } else {
>>>>>> + net_req->iov[i].iov_base = qemu_malloc(net_req->iov[i].iov_len);
>>>>>> + qemu_get_buffer(f, (uint8_t *)net_req->iov[i].iov_base,
>>>>>> + net_req->iov[i].iov_len);
>>>>>> + }
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_free_blk_req(EventTapBlkReq *blk_req)
>>>>>> +{
>>>>>> + int i;
>>>>>> +
>>>>>> + if (event_tap_state >= EVENT_TAP_LOAD && !blk_req->is_flush) {
>>>>>> + for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>> + qemu_iovec_destroy(blk_req->reqs[i].qiov);
>>>>>> + qemu_free(blk_req->reqs[i].qiov);
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + qemu_free(blk_req->device_name);
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_blk_cb(void *opaque, int ret)
>>>>>> +{
>>>>>> + EventTapLog *log = container_of(opaque, EventTapLog, blk_req);
>>>>>> + EventTapBlkReq *blk_req = opaque;
>>>>>> + int i;
>>>>>> +
>>>>>> + blk_req->num_cbs--;
>>>>>> +
>>>>>> + /* all outstanding requests are flushed */
>>>>>> + if (blk_req->num_cbs == 0) {
>>>>>> + for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>> + EventTapAIOCB *eacb = blk_req->acb[i];
>>>>>> + eacb->common.cb(eacb->common.opaque, ret);
>>>>>> + qemu_aio_release(eacb);
>>>>>> + }
>>>>>> +
>>>>>> + event_tap_free_log(log);
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_bdrv_aio_cancel(BlockDriverAIOCB *acb)
>>>>>> +{
>>>>>> + EventTapAIOCB *eacb = container_of(acb, EventTapAIOCB, common);
>>>>>> +
>>>>>> + /* check if already passed to block layer */
>>>>>> + if (eacb->acb) {
>>>>>> + bdrv_aio_cancel(eacb->acb);
>>>>>> + } else {
>>>>>> + eacb->is_canceled = 1;
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static AIOPool event_tap_aio_pool = {
>>>>>> + .aiocb_size = sizeof(EventTapAIOCB),
>>>>>> + .cancel = event_tap_bdrv_aio_cancel,
>>>>>> +};
>>>>>> +
>>>>>> +static void event_tap_alloc_blk_req(EventTapBlkReq *blk_req,
>>>>>> + BlockDriverState *bs, BlockRequest *reqs,
>>>>>> + int num_reqs, void *opaque, bool is_flush)
>>>>>> +{
>>>>>> + int i;
>>>>>> +
>>>>>> + blk_req->num_reqs = num_reqs;
>>>>>> + blk_req->num_cbs = num_reqs;
>>>>>> + blk_req->device_name = qemu_strdup(bs->device_name);
>>>>>> + blk_req->is_flush = is_flush;
>>>>>> +
>>>>>> + for (i = 0; i < num_reqs; i++) {
>>>>>> + blk_req->reqs[i].sector = reqs[i].sector;
>>>>>> + blk_req->reqs[i].nb_sectors = reqs[i].nb_sectors;
>>>>>> + blk_req->reqs[i].qiov = reqs[i].qiov;
>>>>>> + blk_req->reqs[i].cb = event_tap_blk_cb;
>>>>>> + blk_req->reqs[i].opaque = opaque;
>>>>>> +
>>>>>> + blk_req->acb[i] = qemu_aio_get(&event_tap_aio_pool, bs,
>>>>>> + reqs[i].cb, reqs[i].opaque);
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static EventTapBlkReq *event_tap_bdrv(BlockDriverState *bs, BlockRequest *reqs,
>>>>>> + int num_reqs, bool is_flush)
>>>>>> +{
>>>>>> + EventTapLog *log = last_event_tap;
>>>>>> + int empty;
>>>>>> +
>>>>>> + if (!log) {
>>>>>> + trace_event_tap_no_event();
>>>>>> + log = event_tap_alloc_log();
>>>>>> + }
>>>>>> +
>>>>>> + if (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>> + trace_event_tap_already_used(log->mode & ~EVENT_TAP_TYPE_MASK);
>>>>>> + return NULL;
>>>>>> + }
>>>>>> +
>>>>>> + log->mode |= EVENT_TAP_BLK;
>>>>>> + event_tap_alloc_blk_req(&log->blk_req, bs, reqs,
>>>>>> + num_reqs, &log->blk_req, is_flush);
>>>>>> +
>>>>>> + empty = QTAILQ_EMPTY(&event_list);
>>>>>> + QTAILQ_INSERT_TAIL(&event_list, log, node);
>>>>>> + last_event_tap = NULL;
>>>>>> +
>>>>>> + if (empty) {
>>>>>> + event_tap_schedule_bh();
>>>>>> + }
>>>>>> +
>>>>>> + return &log->blk_req;
>>>>>> +}
>>>>>> +
>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_writev(BlockDriverState *bs,
>>>>>> + int64_t sector_num,
>>>>>> + QEMUIOVector *iov,
>>>>>> + int nb_sectors,
>>>>>> + BlockDriverCompletionFunc *cb,
>>>>>> + void *opaque)
>>>>>> +{
>>>>>> + BlockRequest req;
>>>>>> + EventTapBlkReq *ereq;
>>>>>> +
>>>>>> + assert(event_tap_state == EVENT_TAP_ON);
>>>>>> +
>>>>>> + req.sector = sector_num;
>>>>>> + req.nb_sectors = nb_sectors;
>>>>>> + req.qiov = iov;
>>>>>> + req.cb = cb;
>>>>>> + req.opaque = opaque;
>>>>>> + ereq = event_tap_bdrv(bs, &req, 1, 0);
>>>>>> +
>>>>>> + return &ereq->acb[0]->common;
>>>>>> +}
>>>>>> +
>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_flush(BlockDriverState *bs,
>>>>>> + BlockDriverCompletionFunc *cb,
>>>>>> + void *opaque)
>>>>>> +{
>>>>>> + BlockRequest req;
>>>>>> + EventTapBlkReq *ereq;
>>>>>> +
>>>>>> + assert(event_tap_state == EVENT_TAP_ON);
>>>>>> +
>>>>>> + memset(&req, 0, sizeof(req));
>>>>>> + req.cb = cb;
>>>>>> + req.opaque = opaque;
>>>>>> + ereq = event_tap_bdrv(bs, &req, 1, 1);
>>>>>> +
>>>>>> + return &ereq->acb[0]->common;
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_bdrv_flush(void)
>>>>>> +{
>>>>>> + qemu_bh_cancel(event_tap_bh);
>>>>>> +
>>>>>> + while (!QTAILQ_EMPTY(&event_list)) {
>>>>>> + event_tap_cb();
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_blk_flush(EventTapBlkReq *blk_req)
>>>>>> +{
>>>>>> + int i, ret;
>>>>>> +
>>>>>> + for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>> + BlockRequest *req = &blk_req->reqs[i];
>>>>>> + EventTapAIOCB *eacb = blk_req->acb[i];
>>>>>> + BlockDriverAIOCB *acb = &eacb->common;
>>>>>> +
>>>>>> + /* don't flush if canceled */
>>>>>> + if (eacb->is_canceled) {
>>>>>> + continue;
>>>>>> + }
>>>>>> +
>>>>>> + /* receiver needs to restore bs from device name */
>>>>>> + if (!acb->bs) {
>>>>>> + acb->bs = bdrv_find(blk_req->device_name);
>>>>>> + }
>>>>>> +
>>>>>> + if (blk_req->is_flush) {
>>>>>> + eacb->acb = bdrv_aio_flush(acb->bs, req->cb, req->opaque);
>>>>>> + if (!eacb->acb) {
>>>>>> + req->cb(req->opaque, -EIO);
>>>>>> + }
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + eacb->acb = bdrv_aio_writev(acb->bs, req->sector, req->qiov,
>>>>>> + req->nb_sectors, req->cb, req->opaque);
>>>>>> + if (!eacb->acb) {
>>>>>> + req->cb(req->opaque, -EIO);
>>>>>> + }
>>>>>> +
>>>>>> + /* force flush to avoid request inversion */
>>>>>> + qemu_aio_flush();
>>>>>> + ret = bdrv_flush(acb->bs);
>>>>>> + if (ret < 0) {
>>>>>> + error_report("flushing blk_req to %s failed", blk_req->device_name);
>>>>>> + }
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_blk_save(QEMUFile *f, EventTapBlkReq *blk_req)
>>>>>> +{
>>>>>> + ram_addr_t page_addr;
>>>>>> + int i, j, len;
>>>>>> +
>>>>>> + len = strlen(blk_req->device_name);
>>>>>> + qemu_put_byte(f, len);
>>>>>> + qemu_put_buffer(f, (uint8_t *)blk_req->device_name, len);
>>>>>> + qemu_put_byte(f, blk_req->num_reqs);
>>>>>> + qemu_put_byte(f, blk_req->is_flush);
>>>>>> +
>>>>>> + if (blk_req->is_flush) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>> + BlockRequest *req = &blk_req->reqs[i];
>>>>>> + EventTapAIOCB *eacb = blk_req->acb[i];
>>>>>> + /* don't save canceled requests */
>>>>>> + if (eacb->is_canceled) {
>>>>>> + continue;
>>>>>> + }
>>>>>> + qemu_put_be64(f, req->sector);
>>>>>> + qemu_put_be32(f, req->nb_sectors);
>>>>>> + qemu_put_be32(f, req->qiov->niov);
>>>>>> +
>>>>>> + for (j = 0; j < req->qiov->niov; j++) {
>>>>>> + page_addr = qemu_ram_addr_from_host_nofail(req->qiov->iov[j].iov_base);
>>>>>> + qemu_put_be64(f, page_addr);
>>>>>> + qemu_put_be64(f, req->qiov->iov[j].iov_len);
>>>>>> + }
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_blk_load(QEMUFile *f, EventTapBlkReq *blk_req)
>>>>>> +{
>>>>>> + BlockRequest *req;
>>>>>> + ram_addr_t page_addr;
>>>>>> + int i, j, len, niov;
>>>>>> +
>>>>>> + len = qemu_get_byte(f);
>>>>>> + blk_req->device_name = qemu_malloc(len + 1);
>>>>>> + qemu_get_buffer(f, (uint8_t *)blk_req->device_name, len);
>>>>>> + blk_req->device_name[len] = '\0';
>>>>>> + blk_req->num_reqs = qemu_get_byte(f);
>>>>>> + blk_req->is_flush = qemu_get_byte(f);
>>>>>> +
>>>>>> + if (blk_req->is_flush) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>> + req = &blk_req->reqs[i];
>>>>>> + req->sector = qemu_get_be64(f);
>>>>>> + req->nb_sectors = qemu_get_be32(f);
>>>>>> + req->qiov = qemu_mallocz(sizeof(QEMUIOVector));
>>>>>> + niov = qemu_get_be32(f);
>>>>>> + qemu_iovec_init(req->qiov, niov);
>>>>>> +
>>>>>> + for (j = 0; j < niov; j++) {
>>>>>> + void *iov_base;
>>>>>> + size_t iov_len;
>>>>>> + page_addr = qemu_get_be64(f);
>>>>>> + iov_base = qemu_get_ram_ptr(page_addr);
>>>>>> + iov_len = qemu_get_be64(f);
>>>>>> + qemu_iovec_add(req->qiov, iov_base, iov_len);
>>>>>> + }
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_ioport(int index, uint32_t address, uint32_t data)
>>>>>> +{
>>>>>> + if (event_tap_state != EVENT_TAP_ON) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + if (!last_event_tap) {
>>>>>> + last_event_tap = event_tap_alloc_log();
>>>>>> + }
>>>>>> +
>>>>>> + last_event_tap->mode = EVENT_TAP_IOPORT;
>>>>>> + last_event_tap->ioport.index = index;
>>>>>> + last_event_tap->ioport.address = address;
>>>>>> + last_event_tap->ioport.data = data;
>>>>>> +}
>>>>>> +
>>>>>> +static inline void event_tap_ioport_save(QEMUFile *f, EventTapIOport *ioport)
>>>>>> +{
>>>>>> + qemu_put_be32(f, ioport->index);
>>>>>> + qemu_put_be32(f, ioport->address);
>>>>>> + qemu_put_byte(f, ioport->data);
>>>>>> +}
>>>>>> +
>>>>>> +static inline void event_tap_ioport_load(QEMUFile *f,
>>>>>> + EventTapIOport *ioport)
>>>>>> +{
>>>>>> + ioport->index = qemu_get_be32(f);
>>>>>> + ioport->address = qemu_get_be32(f);
>>>>>> + ioport->data = qemu_get_byte(f);
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_mmio(uint64_t address, uint8_t *buf, int len)
>>>>>> +{
>>>>>> + if (event_tap_state != EVENT_TAP_ON || len > MMIO_BUF_SIZE) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + if (!last_event_tap) {
>>>>>> + last_event_tap = event_tap_alloc_log();
>>>>>> + }
>>>>>> +
>>>>>> + last_event_tap->mode = EVENT_TAP_MMIO;
>>>>>> + last_event_tap->mmio.address = address;
>>>>>> + last_event_tap->mmio.len = len;
>>>>>> + memcpy(last_event_tap->mmio.buf, buf, len);
>>>>>> +}
>>>>>> +
>>>>>> +static inline void event_tap_mmio_save(QEMUFile *f, EventTapMMIO *mmio)
>>>>>> +{
>>>>>> + qemu_put_be64(f, mmio->address);
>>>>>> + qemu_put_byte(f, mmio->len);
>>>>>> + qemu_put_buffer(f, mmio->buf, mmio->len);
>>>>>> +}
>>>>>> +
>>>>>> +static inline void event_tap_mmio_load(QEMUFile *f, EventTapMMIO *mmio)
>>>>>> +{
>>>>>> + mmio->address = qemu_get_be64(f);
>>>>>> + mmio->len = qemu_get_byte(f);
>>>>>> + qemu_get_buffer(f, mmio->buf, mmio->len);
>>>>>> +}
>>>>>> +
>>>>>> +int event_tap_register(int (*cb)(void))
>>>>>> +{
>>>>>> + if (event_tap_state != EVENT_TAP_OFF) {
>>>>>> + error_report("event-tap is already on");
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> +
>>>>>> + if (!cb || event_tap_cb) {
>>>>>> + error_report("can't set event_tap_cb");
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> +
>>>>>> + event_tap_cb = cb;
>>>>>> + event_tap_state = EVENT_TAP_ON;
>>>>>> +
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_unregister(void)
>>>>>> +{
>>>>>> + if (event_tap_state == EVENT_TAP_OFF) {
>>>>>> + error_report("event-tap is already off");
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + qemu_del_vm_change_state_handler(vmstate);
>>>>>> +
>>>>>> + event_tap_flush();
>>>>>> + event_tap_free_pool();
>>>>>> +
>>>>>> + event_tap_state = EVENT_TAP_OFF;
>>>>>> + event_tap_cb = NULL;
>>>>>> +}
>>>>>> +
>>>>>> +int event_tap_is_on(void)
>>>>>> +{
>>>>>> + return (event_tap_state == EVENT_TAP_ON);
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_suspend(void *opaque, int running, int reason)
>>>>>> +{
>>>>>> + event_tap_state = running ? EVENT_TAP_ON : EVENT_TAP_SUSPEND;
>>>>>> +}
>>>>>> +
>>>>>> +/* returns 1 if the queue gets empty */
>>>>>> +int event_tap_flush_one(void)
>>>>>> +{
>>>>>> + EventTapLog *log;
>>>>>> + int ret;
>>>>>> +
>>>>>> + if (QTAILQ_EMPTY(&event_list)) {
>>>>>> + return 1;
>>>>>> + }
>>>>>> +
>>>>>> + event_tap_state = EVENT_TAP_FLUSH;
>>>>>> +
>>>>>> + log = QTAILQ_FIRST(&event_list);
>>>>>> + QTAILQ_REMOVE(&event_list, log, node);
>>>>>> + switch (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>> + case EVENT_TAP_NET:
>>>>>> + event_tap_net_flush(&log->net_req);
>>>>>> + event_tap_free_log(log);
>>>>>> + break;
>>>>>> + case EVENT_TAP_BLK:
>>>>>> + event_tap_blk_flush(&log->blk_req);
>>>>>> + break;
>>>>>> + default:
>>>>>> + error_report("Unknown state %d", log->mode);
>>>>>> + event_tap_free_log(log);
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> +
>>>>>> + ret = QTAILQ_EMPTY(&event_list);
>>>>>> + event_tap_state = ret ? EVENT_TAP_ON : EVENT_TAP_FLUSH;
>>>>>> +
>>>>>> + return ret;
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_flush(void)
>>>>>> +{
>>>>>> + int ret;
>>>>>> +
>>>>>> + do {
>>>>>> + ret = event_tap_flush_one();
>>>>>> + } while (ret == 0);
>>>>>> +
>>>>>> + if (ret < 0) {
>>>>>> + error_report("error flushing event-tap requests");
>>>>>> + abort();
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_replay(void *opaque, int running, int reason)
>>>>>> +{
>>>>>> + EventTapLog *log, *next;
>>>>>> +
>>>>>> + if (!running) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + assert(event_tap_state == EVENT_TAP_LOAD);
>>>>>> +
>>>>>> + event_tap_state = EVENT_TAP_REPLAY;
>>>>>> +
>>>>>> + QTAILQ_FOREACH(log, &event_list, node) {
>>>>>> + if ((log->mode & ~EVENT_TAP_TYPE_MASK) == EVENT_TAP_NET) {
>>>>>> + EventTapNetReq *net_req = &log->net_req;
>>>>>> + if (!net_req->async) {
>>>>>> + event_tap_net_flush(net_req);
>>>>>> + continue;
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + switch (log->mode & EVENT_TAP_TYPE_MASK) {
>>>>>> + case EVENT_TAP_IOPORT:
>>>>>> + switch (log->ioport.index) {
>>>>>> + case 0:
>>>>>> + cpu_outb(log->ioport.address, log->ioport.data);
>>>>>> + break;
>>>>>> + case 1:
>>>>>> + cpu_outw(log->ioport.address, log->ioport.data);
>>>>>> + break;
>>>>>> + case 2:
>>>>>> + cpu_outl(log->ioport.address, log->ioport.data);
>>>>>> + break;
>>>>>> + }
>>>>>> + break;
>>>>>> + case EVENT_TAP_MMIO:
>>>>>> + cpu_physical_memory_rw(log->mmio.address,
>>>>>> + log->mmio.buf,
>>>>>> + log->mmio.len, 1);
>>>>>> + break;
>>>>>> + case 0:
>>>>>> + trace_event_tap_replay_no_event();
>>>>>> + break;
>>>>>> + default:
>>>>>> + error_report("Unknown state %d", log->mode);
>>>>>> + QTAILQ_REMOVE(&event_list, log, node);
>>>>>> + event_tap_free_log(log);
>>>>>> + return;
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + /* remove event logs from queue */
>>>>>> + QTAILQ_FOREACH_SAFE(log, &event_list, node, next) {
>>>>>> + QTAILQ_REMOVE(&event_list, log, node);
>>>>>> + event_tap_free_log(log);
>>>>>> + }
>>>>>> +
>>>>>> + event_tap_state = EVENT_TAP_OFF;
>>>>>> + qemu_del_vm_change_state_handler(vmstate);
>>>>>> +}
>>>>>> +
>>>>>> +static void event_tap_save(QEMUFile *f, void *opaque)
>>>>>> +{
>>>>>> + EventTapLog *log;
>>>>>> +
>>>>>> + QTAILQ_FOREACH(log, &event_list, node) {
>>>>>> + qemu_put_byte(f, log->mode);
>>>>>> +
>>>>>> + switch (log->mode & EVENT_TAP_TYPE_MASK) {
>>>>>> + case EVENT_TAP_IOPORT:
>>>>>> + event_tap_ioport_save(f, &log->ioport);
>>>>>> + break;
>>>>>> + case EVENT_TAP_MMIO:
>>>>>> + event_tap_mmio_save(f, &log->mmio);
>>>>>> + break;
>>>>>> + case 0:
>>>>>> + trace_event_tap_save_no_event();
>>>>>> + break;
>>>>>> + default:
>>>>>> + error_report("Unknown state %d", log->mode);
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> + switch (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>> + case EVENT_TAP_NET:
>>>>>> + event_tap_net_save(f, &log->net_req);
>>>>>> + break;
>>>>>> + case EVENT_TAP_BLK:
>>>>>> + event_tap_blk_save(f, &log->blk_req);
>>>>>> + break;
>>>>>> + default:
>>>>>> + error_report("Unknown state %d", log->mode);
>>>>>> + return;
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + qemu_put_byte(f, 0); /* EOF */
>>>>>> +}
>>>>>> +
>>>>>> +static int event_tap_load(QEMUFile *f, void *opaque, int version_id)
>>>>>> +{
>>>>>> + EventTapLog *log, *next;
>>>>>> + int mode;
>>>>>> +
>>>>>> + event_tap_state = EVENT_TAP_LOAD;
>>>>>> +
>>>>>> + QTAILQ_FOREACH_SAFE(log, &event_list, node, next) {
>>>>>> + QTAILQ_REMOVE(&event_list, log, node);
>>>>>> + event_tap_free_log(log);
>>>>>> + }
>>>>>> +
>>>>>> + /* loop until EOF */
>>>>>> + while ((mode = qemu_get_byte(f)) != 0) {
>>>>>> + EventTapLog *log = event_tap_alloc_log();
>>>>>> +
>>>>>> + log->mode = mode;
>>>>>> + switch (log->mode & EVENT_TAP_TYPE_MASK) {
>>>>>> + case EVENT_TAP_IOPORT:
>>>>>> + event_tap_ioport_load(f, &log->ioport);
>>>>>> + break;
>>>>>> + case EVENT_TAP_MMIO:
>>>>>> + event_tap_mmio_load(f, &log->mmio);
>>>>>> + break;
>>>>>> + case 0:
>>>>>> + trace_event_tap_load_no_event();
>>>>>> + break;
>>>>>> + default:
>>>>>> + error_report("Unknown state %d", log->mode);
>>>>>> + event_tap_free_log(log);
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> +
>>>>>> + switch (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>> + case EVENT_TAP_NET:
>>>>>> + event_tap_net_load(f, &log->net_req);
>>>>>> + break;
>>>>>> + case EVENT_TAP_BLK:
>>>>>> + event_tap_blk_load(f, &log->blk_req);
>>>>>> + break;
>>>>>> + default:
>>>>>> + error_report("Unknown state %d", log->mode);
>>>>>> + event_tap_free_log(log);
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> +
>>>>>> + QTAILQ_INSERT_TAIL(&event_list, log, node);
>>>>>> + }
>>>>>> +
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_schedule_replay(void)
>>>>>> +{
>>>>>> + vmstate = qemu_add_vm_change_state_handler(event_tap_replay, NULL);
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_schedule_suspend(void)
>>>>>> +{
>>>>>> + vmstate = qemu_add_vm_change_state_handler(event_tap_suspend, NULL);
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_init(void)
>>>>>> +{
>>>>>> + QTAILQ_INIT(&event_list);
>>>>>> + QTAILQ_INIT(&event_pool);
>>>>>> + register_savevm(NULL, "event-tap", 0, 1,
>>>>>> + event_tap_save, event_tap_load, &last_event_tap);
>>>>>> +}
>>>>>> diff --git a/event-tap.h b/event-tap.h
>>>>>> new file mode 100644
>>>>>> index 0000000..ab677f8
>>>>>> --- /dev/null
>>>>>> +++ b/event-tap.h
>>>>>> @@ -0,0 +1,44 @@
>>>>>> +/*
>>>>>> + * Event Tap functions for QEMU
>>>>>> + *
>>>>>> + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
>>>>>> + *
>>>>>> + * This work is licensed under the terms of the GNU GPL, version 2. See
>>>>>> + * the COPYING file in the top-level directory.
>>>>>> + */
>>>>>> +
>>>>>> +#ifndef EVENT_TAP_H
>>>>>> +#define EVENT_TAP_H
>>>>>> +
>>>>>> +#include "qemu-common.h"
>>>>>> +#include "net.h"
>>>>>> +#include "block.h"
>>>>>> +
>>>>>> +int event_tap_register(int (*cb)(void));
>>>>>> +void event_tap_unregister(void);
>>>>>> +int event_tap_is_on(void);
>>>>>> +void event_tap_schedule_suspend(void);
>>>>>> +void event_tap_ioport(int index, uint32_t address, uint32_t data);
>>>>>> +void event_tap_mmio(uint64_t address, uint8_t *buf, int len);
>>>>>> +void event_tap_init(void);
>>>>>> +void event_tap_flush(void);
>>>>>> +int event_tap_flush_one(void);
>>>>>> +void event_tap_schedule_replay(void);
>>>>>> +
>>>>>> +void event_tap_send_packet(VLANClientState *vc, const uint8_t *buf, int size);
>>>>>> +ssize_t event_tap_sendv_packet_async(VLANClientState *vc,
>>>>>> + const struct iovec *iov,
>>>>>> + int iovcnt, NetPacketSent *sent_cb);
>>>>>> +
>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_writev(BlockDriverState *bs,
>>>>>> + int64_t sector_num,
>>>>>> + QEMUIOVector *iov,
>>>>>> + int nb_sectors,
>>>>>> + BlockDriverCompletionFunc *cb,
>>>>>> + void *opaque);
>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_flush(BlockDriverState *bs,
>>>>>> + BlockDriverCompletionFunc *cb,
>>>>>> + void *opaque);
>>>>>> +void event_tap_bdrv_flush(void);
>>>>>> +
>>>>>> +#endif
>>>>>> diff --git a/qemu-tool.c b/qemu-tool.c
>>>>>> index 392e1c9..3f71215 100644
>>>>>> --- a/qemu-tool.c
>>>>>> +++ b/qemu-tool.c
>>>>>> @@ -16,6 +16,7 @@
>>>>>> #include "qemu-timer.h"
>>>>>> #include "qemu-log.h"
>>>>>> #include "sysemu.h"
>>>>>> +#include "event-tap.h"
>>>>>>
>>>>>> #include <sys/time.h>
>>>>>>
>>>>>> @@ -111,3 +112,30 @@ int qemu_set_fd_handler2(int fd,
>>>>>> {
>>>>>> return 0;
>>>>>> }
>>>>>> +
>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_writev(BlockDriverState *bs,
>>>>>> + int64_t sector_num,
>>>>>> + QEMUIOVector *iov,
>>>>>> + int nb_sectors,
>>>>>> + BlockDriverCompletionFunc *cb,
>>>>>> + void *opaque)
>>>>>> +{
>>>>>> + return NULL;
>>>>>> +}
>>>>>> +
>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_flush(BlockDriverState *bs,
>>>>>> + BlockDriverCompletionFunc *cb,
>>>>>> + void *opaque)
>>>>>> +{
>>>>>> + return NULL;
>>>>>> +}
>>>>>> +
>>>>>> +void event_tap_bdrv_flush(void)
>>>>>> +{
>>>>>> +}
>>>>>> +
>>>>>> +int event_tap_is_on(void)
>>>>>> +{
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> diff --git a/trace-events b/trace-events
>>>>>> index 50ac840..1af3895 100644
>>>>>> --- a/trace-events
>>>>>> +++ b/trace-events
>>>>>> @@ -269,3 +269,13 @@ disable ft_trans_freeze_input(void) "backend not ready, freezing input"
>>>>>> disable ft_trans_put_ready(void) "file is ready to put"
>>>>>> disable ft_trans_get_ready(void) "file is ready to get"
>>>>>> disable ft_trans_cb(void *cb) "callback %p"
>>>>>> +
>>>>>> +# event-tap.c
>>>>>> +disable event_tap_ignore_bh(int bh) "event_tap_bh is already scheduled %d"
>>>>>> +disable event_tap_net_cb(char *s, ssize_t len) "%s: %zd bytes packet was sended"
>>>>>> +disable event_tap_no_event(void) "no last_event_tap"
>>>>>> +disable event_tap_already_used(int mode) "last_event_tap already used %d"
>>>>>> +disable event_tap_append_packet(void) "This packet is appended"
>>>>>> +disable event_tap_replay_no_event(void) "No event to replay"
>>>>>> +disable event_tap_save_no_event(void) "No event to save"
>>>>>> +disable event_tap_load_no_event(void) "No event to load"
>>>>>> --
>>>>>> 1.7.1.2
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>