From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43775) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZfNHR-0002tZ-Mp for qemu-devel@nongnu.org; Fri, 25 Sep 2015 03:18:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZfNHN-0004fY-I5 for qemu-devel@nongnu.org; Fri, 25 Sep 2015 03:18:53 -0400 Received: from [59.151.112.132] (port=6397 helo=heian.cn.fujitsu.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZfNHM-0004eS-A3 for qemu-devel@nongnu.org; Fri, 25 Sep 2015 03:18:49 -0400 Message-ID: <5604F54F.1000509@cn.fujitsu.com> Date: Fri, 25 Sep 2015 15:18:39 +0800 From: Yang Hongyang MIME-Version: 1.0 References: <1442405768-23019-1-git-send-email-yanghy@cn.fujitsu.com> <1442405768-23019-10-git-send-email-yanghy@cn.fujitsu.com> <87si64qiyr.fsf@blackfin.pond.sub.org> In-Reply-To: <87si64qiyr.fsf@blackfin.pond.sub.org> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v11 09/12] netfilter: add a netbuffer filter List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Markus Armbruster Cc: thuth@redhat.com, zhang.zhanghailiang@huawei.com, lizhijian@cn.fujitsu.com, jasowang@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com On 09/24/2015 05:12 PM, Markus Armbruster wrote: > Yang Hongyang writes: > >> This filter is to buffer/release packets, this feature can be used >> when using MicroCheckpointing, or other Remus like VM FT solutions, you > > What's "Remus"? Remus is an opensource VM FT solution: paper: http://http//www.cs.ubc.ca/~andy/papers/remus-nsdi-final.pdf First implemented on Xen: http://wiki.xen.org/wiki/Remus MicroCheckpointing in QEMU is another Remus implementation. > >> can also use it to simulate the network delay. >> It has an interval option, if supplied, this filter will release >> packets by interval. > > Suggest "will delay packets by that time interval." > > Is interval really optional? It supposed to be optional. When the buffer filter has a user: http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg00363.html In this patchset, the zero interval check is removed. packets are released on demand through filter_buffer_release_all() api call. > >> >> Usage: >> -netdev tap,id=bn0 >> -object filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000 >> >> NOTE: >> the scale of interval is microsecond. > > Perhaps "interval is in microseconds". Better, thanks. > >> >> Signed-off-by: Yang Hongyang >> --- >> v11: add a fixme comment from Jason >> v10: use NetQueue flush api to flush packets >> sent_cb can not be called when we already return size >> v9: adjustment due to the qapi change >> v7: use QTAILQ_FOREACH_SAFE() when flush packets >> v6: move the interval check earlier and some comment adjust >> v5: remove dummy sent_cb >> change interval type from int64 to uint32 >> check interval!=0 when initialise >> rename FILTERBUFFERState to FilterBufferState >> v4: remove bh >> pass the packet to next filter instead of receiver >> v3: check packet's sender and sender->peer when flush it >> --- >> net/Makefile.objs | 1 + >> net/filter-buffer.c | 170 ++++++++++++++++++++++++++++++++++++++++++++++++++++ >> qemu-options.hx | 18 ++++++ >> vl.c | 7 ++- >> 4 files changed, 195 insertions(+), 1 deletion(-) >> create mode 100644 net/filter-buffer.c >> >> diff --git a/net/Makefile.objs b/net/Makefile.objs >> index 914aec0..5fa2f97 100644 >> --- a/net/Makefile.objs >> +++ b/net/Makefile.objs >> @@ -14,3 +14,4 @@ common-obj-$(CONFIG_SLIRP) += slirp.o >> common-obj-$(CONFIG_VDE) += vde.o >> common-obj-$(CONFIG_NETMAP) += netmap.o >> common-obj-y += filter.o >> +common-obj-y += filter-buffer.o >> diff --git a/net/filter-buffer.c b/net/filter-buffer.c >> new file mode 100644 >> index 0000000..ef94e91 >> --- /dev/null >> +++ b/net/filter-buffer.c >> @@ -0,0 +1,170 @@ >> +/* >> + * Copyright (c) 2015 FUJITSU LIMITED >> + * Author: Yang Hongyang >> + * >> + * This work is licensed under the terms of the GNU GPL, version 2 or >> + * later. See the COPYING file in the top-level directory. >> + */ >> + >> +#include "net/filter.h" >> +#include "net/queue.h" >> +#include "qemu-common.h" >> +#include "qemu/timer.h" >> +#include "qemu/iov.h" >> +#include "qapi/qmp/qerror.h" >> +#include "qapi-visit.h" >> +#include "qom/object.h" >> + >> +#define TYPE_FILTER_BUFFER "filter-buffer" >> + >> +#define FILTER_BUFFER(obj) \ >> + OBJECT_CHECK(FilterBufferState, (obj), TYPE_FILTER_BUFFER) >> + >> +struct FilterBufferState { >> + NetFilterState parent_obj; >> + >> + NetQueue *incoming_queue; >> + uint32_t interval; >> + QEMUTimer release_timer; >> +}; >> +typedef struct FilterBufferState FilterBufferState; > > Again, not splitting the declaration is more concise. > >> + >> +static void filter_buffer_flush(NetFilterState *nf) >> +{ >> + FilterBufferState *s = FILTER_BUFFER(nf); >> + >> + if (!qemu_net_queue_flush(s->incoming_queue)) { >> + /* Unable to empty the queue, purge remaining packets */ >> + qemu_net_queue_purge(s->incoming_queue, nf->netdev); >> + } >> +} > > This either flushes or purges incoming_queue, where "purge" means > dropping packets. Correct? I think it is correct. Dropping packets is allowed, even on real hardware, packets may lose. > >> + >> +static void filter_buffer_release_timer(void *opaque) >> +{ >> + NetFilterState *nf = opaque; >> + FilterBufferState *s = FILTER_BUFFER(nf); > > Style nit: blank line between declarations and statements, please. > >> + filter_buffer_flush(nf); > > Is purging correct here? > >> + timer_mod(&s->release_timer, >> + qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval); > > Timer rearmed to fire again in s->interval microseconds. Yes. > >> +} >> + >> +/* filter APIs */ >> +static ssize_t filter_buffer_receive_iov(NetFilterState *nf, >> + NetClientState *sender, >> + unsigned flags, >> + const struct iovec *iov, >> + int iovcnt, >> + NetPacketSent *sent_cb) >> +{ >> + FilterBufferState *s = FILTER_BUFFER(nf); >> + >> + /* >> + * we return size when buffer a packet, the sender will take it as >> + * a already sent packet, so sent_cb should not be called later > > Humor me: when a comment has multiple sentences, start each one with a > capital letter, and end it with punctuation. Ok. > >> + * FIXME: even if guest can't receive packet for some reasons. Filter >> + * can still accept packet until its internal queue is full. >> + */ > > I'm not sure I understand the comment. This is taken from Jason's comments, may be he can have a better explain? > >> + qemu_net_queue_append_iov(s->incoming_queue, sender, flags, >> + iov, iovcnt, NULL); >> + return iov_size(iov, iovcnt); >> +} >> + >> +static void filter_buffer_cleanup(NetFilterState *nf) >> +{ >> + FilterBufferState *s = FILTER_BUFFER(nf); >> + >> + if (s->interval) { >> + timer_del(&s->release_timer); >> + } >> + >> + /* flush packets */ >> + if (s->incoming_queue) { >> + filter_buffer_flush(nf); > > I guess purging is correct here. > >> + g_free(s->incoming_queue); >> + } >> +} >> + >> +static void filter_buffer_setup(NetFilterState *nf, Error **errp) >> +{ >> + FilterBufferState *s = FILTER_BUFFER(nf); >> + >> + /* >> + * this check should be dropped when there're VM FT solutions like MC >> + * or COLO use this filter to release packets on demand. >> + */ > > If you end a sentence with a period, you get to start it with a capital > letter :) Ok, thanks. > > "there're" is odd. Perhaps something like "We may want to accept zero > interval when VM FT solutions like MC or COLO use this filter to release > packets on demand." > >> + if (!s->interval) { >> + error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "interval", >> + "a non-zero interval"); > > How can this happen? Doesn't filter_buffer_set_interval() catch zero > intervals already? When user do not supply an interval parameter, filter_buffer_set_interval() won't be called, and the interval is 0. When we have actual user like I mentioned above, we should allow user to omit the interval option. > >> + return; >> + } >> + >> + s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf); >> + if (s->interval) { > > Condition cannot be false. Same in filter_buffer_cleanup(). > >> + timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL, >> + filter_buffer_release_timer, nf); >> + timer_mod(&s->release_timer, >> + qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval); > > Timer armed to fire in s->interval microseconds. > > Unless I misunderstand something, this doesn't actually delay each > packet by s->interval microseconds, it batches packet delivery: all > packets arriving in a given interval are delayed until the end of the > interval. Correct? Correct. > >> + } >> +} >> + >> +static void filter_buffer_class_init(ObjectClass *oc, void *data) >> +{ >> + NetFilterClass *nfc = NETFILTER_CLASS(oc); >> + >> + nfc->setup = filter_buffer_setup; >> + nfc->cleanup = filter_buffer_cleanup; >> + nfc->receive_iov = filter_buffer_receive_iov; >> +} >> + >> +static void filter_buffer_get_interval(Object *obj, Visitor *v, void *opaque, >> + const char *name, Error **errp) >> +{ >> + FilterBufferState *s = FILTER_BUFFER(obj); >> + uint32_t value = s->interval; >> + >> + visit_type_uint32(v, &value, name, errp); >> +} >> + >> +static void filter_buffer_set_interval(Object *obj, Visitor *v, void *opaque, >> + const char *name, Error **errp) >> +{ >> + FilterBufferState *s = FILTER_BUFFER(obj); >> + Error *local_err = NULL; >> + uint32_t value; >> + >> + visit_type_uint32(v, &value, name, &local_err); >> + if (local_err) { >> + goto out; >> + } >> + if (!value) { >> + error_setg(&local_err, "Property '%s.%s' doesn't take value '%" >> + PRIu32 "'", object_get_typename(obj), name, value); > > What does it take? That's what the user wants to know... Perhaps > "Property '%s.%s' requires a positive value". Maybe better, thanks. > >> + goto out; >> + } >> + s->interval = value; >> + >> +out: >> + error_propagate(errp, local_err); >> +} >> + >> +static void filter_buffer_init(Object *obj) >> +{ >> + object_property_add(obj, "interval", "int", >> + filter_buffer_get_interval, >> + filter_buffer_set_interval, NULL, NULL, NULL); >> +} >> + >> +static const TypeInfo filter_buffer_info = { >> + .name = TYPE_FILTER_BUFFER, >> + .parent = TYPE_NETFILTER, >> + .class_init = filter_buffer_class_init, >> + .instance_init = filter_buffer_init, >> + .instance_size = sizeof(FilterBufferState), >> +}; >> + >> +static void register_types(void) >> +{ >> + type_register_static(&filter_buffer_info); >> +} >> + >> +type_init(register_types); >> diff --git a/qemu-options.hx b/qemu-options.hx >> index 7e147b8..b09f97f 100644 >> --- a/qemu-options.hx >> +++ b/qemu-options.hx >> @@ -3642,6 +3642,24 @@ in PEM format, in filenames @var{ca-cert.pem}, @var{ca-crl.pem} (optional), >> @var{server-cert.pem} (only servers), @var{server-key.pem} (only servers), >> @var{client-cert.pem} (only clients), and @var{client-key.pem} (only clients). >> >> +@item -object filter-buffer,id=@var{id},netdev=@var{netdevid}[,chain=@var{all|in|out}][,interval=@var{t}] >> + >> +Buffer network packets on netdev @var{netdevid}. >> +If interval @var{t} provided, will release packets by interval. >> +Interval scale: microsecond. >> + >> +If interval @var{t} not provided, you have to make sure the packets can be > > Are you sure omitting t works? filter_buffer_set_interval() rejects > zero... When we omit t, filter_buffer_set_interval() won't be called. > >> +released, either by manually remove this filter or call the release buffer API, >> +otherwise, the packets will be buffered forever. Use with caution. > > Please wrap your lines a bit earlier. > >> + >> +chain @var{all|in|out} is an option that can be applied to any netfilter, default is @option{all}. >> + >> +@option{all} means this filter will receive packets both sent to/from the netdev >> + >> +@option{in} means this filter will receive packets sent to the netdev >> + >> +@option{out} means this filter will receive packets sent from the netdev >> + > > I'd describe this like "filter is inserted in the receive / transmit > queue". thanks. > >> @end table >> >> ETEXI >> diff --git a/vl.c b/vl.c >> index ec589e2..3cf89d5 100644 >> --- a/vl.c >> +++ b/vl.c >> @@ -2794,7 +2794,12 @@ static bool object_create_initial(const char *type) >> if (g_str_equal(type, "rng-egd")) { >> return false; >> } >> - /* TODO: return false for concrete netfilters */ >> + >> + /* return false for concrete netfilters */ > > I find this comment useless, please drop it :) Ok, thanks. > >> + if (g_str_equal(type, "filter-buffer")) { >> + return false; >> + } >> + >> return true; >> } > . > -- Thanks, Yang.