From: Yang Hongyang <yanghy@cn.fujitsu.com>
To: Jason Wang <jasowang@redhat.com>, qemu-devel@nongnu.org
Cc: thuth@redhat.com, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH] RFC/net: Add a net filter
Date: Mon, 27 Jul 2015 15:00:55 +0800
Message-ID: <55B5D727.8010806@cn.fujitsu.com>
In-Reply-To: <55B5D214.5030506@redhat.com>
On 07/27/2015 02:39 PM, Jason Wang wrote:
>
>
> On 07/27/2015 01:27 PM, Yang Hongyang wrote:
>> On 07/23/2015 01:59 PM, Jason Wang wrote:
>>>
>>>
>>> On 07/22/2015 06:55 PM, Yang Hongyang wrote:
>>>> This patch adds a net filter between the network backend and NIC devices.
>>>> All packets will pass through this filter.
>>>> TODO:
>>>> multiqueue support.
>>>> plugin support.
>>>>
>>>>               +--------------+       +-------------+
>>>> +----------+  |    filter    |       |frontend(NIC)|
>>>> | real     |  |              |       |             |
>>>> | network  <--+backend       <-------+             |
>>>> | backend  |  |         peer +-------> peer        |
>>>> +----------+  +--------------+       +-------------+
>>>>
>>>> Usage:
>>>> -netdev tap,id=bn0 # you can use whatever backend as needed
>>>> -netdev filter,id=f0,backend=bn0,plugin=dump
>>>> -device e1000,netdev=f0
>>>>
>>>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>>
>>> Hi:
>>>
>>> Several questions:
>>>
>>> - Looks like we can do more than filter, so maybe a name like
>>> traffic control or similar would be more suitable?
>>
>> The filter is just a transparent proxy of a backend if no filter plugin
>> is inserted; it simply passes all packets through. Capturing all traffic
>> is the purpose of the filter. As long as we have an entry point that
>> captures all packets, we can do more, and that is what a filter plugin
>> will do. There are some use cases I can think of:
>> - dump: by using the filter, we can dump either output or input packets.
>> - buffer: buffer/release packets. This feature can be used for
>>   micro-checkpointing or other Remus-like VM FT solutions. You can
>>   also supply an interval to a buffer plugin, which will release
>>   packets at that interval.
>
> This sounds like traffic shaping.
>
>> There may be other use cases based on this special backend.
>>
>>> - What are the advantages of introducing a new type of netdev? As far
>>> as I can see, just replacing the dump function in Thomas' series with a
>>> configurable function pointer would do the trick? (Probably with some
>>> monitor commands.) And then you wouldn't even need to deal with the
>>> vnet header and offload stuff?
>>
>> I think the dump function focuses on every netdev: it adds a
>> dump_enabled flag to NetClientState and dumps the packet when the
>> netdev's receive is called. This filter focuses more on packets
>> between backend and frontend; it is a kind of injection into the
>> network packet flow. So the semantics are different, I think.
>
> Yes, their functions are different. But the packet paths are similar:
> both require packets to pass through them before reaching the
> peers. So simply passing the packets to the filter function before
> calling nc->info->receive{_raw}() in qemu_deliver_packet() would also work?

I don't think this will work for the buffer case. To make buffering work
this way, we would have to modify the generic netdev layer code to check
the return value of the filter function call. It is also less extensible
than abstracting the filter as a netdev, which lets us flexibly
add/remove/change filter plugins on the fly.
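
To make the return-value point concrete, here is a rough, self-contained
sketch of what a hook in the generic delivery path would have to do for
the buffer case. This is not QEMU code: every name in it (FilterVerdict,
BufferFilter, deliver_packet, buffer_release, count_bytes) is made up for
illustration, and the fixed-size queue is only an assumption to keep the
example short.

```c
/* Illustrative sketch only; none of these names are QEMU APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define PKT_MAX   2048   /* assumed maximum packet size */
#define QUEUE_MAX 16     /* assumed queue depth */

typedef enum { FILTER_PASS, FILTER_STOLEN } FilterVerdict;

typedef struct Packet {
    uint8_t data[PKT_MAX];
    size_t size;
} Packet;

typedef struct BufferFilter {
    Packet queue[QUEUE_MAX];
    size_t len;
    int buffering;       /* non-zero: hold packets until released */
} BufferFilter;

typedef ssize_t (*ReceiveFn)(void *opaque, const uint8_t *data, size_t size);

/* The hook: copy the packet and tell the caller it was taken. */
static FilterVerdict buffer_filter(BufferFilter *f, const uint8_t *data,
                                   size_t size)
{
    if (!f->buffering) {
        return FILTER_PASS;
    }
    assert(f->len < QUEUE_MAX && size <= PKT_MAX);
    memcpy(f->queue[f->len].data, data, size);
    f->queue[f->len].size = size;
    f->len++;
    return FILTER_STOLEN;
}

/* Stand-in for the generic delivery path: it has to inspect the verdict. */
static ssize_t deliver_packet(BufferFilter *f, ReceiveFn receive,
                              void *opaque, const uint8_t *data, size_t size)
{
    if (f && buffer_filter(f, data, size) == FILTER_STOLEN) {
        return (ssize_t)size;   /* report as consumed; receiver is skipped */
    }
    return receive(opaque, data, size);
}

/* Flush everything that was held back; returns the number of packets. */
static size_t buffer_release(BufferFilter *f, ReceiveFn receive, void *opaque)
{
    size_t i, n = f->len;

    for (i = 0; i < n; i++) {
        receive(opaque, f->queue[i].data, f->queue[i].size);
    }
    f->len = 0;
    return n;
}

/* Example receiver: adds up the bytes it has been handed. */
static ssize_t count_bytes(void *opaque, const uint8_t *data, size_t size)
{
    (void)data;
    *(size_t *)opaque += size;
    return (ssize_t)size;
}
```

With buffering enabled, deliver_packet() reports the packet as consumed
while count_bytes() sees nothing until buffer_release() is called. The
branch on the verdict is the kind of change the generic layer would need
to carry for every filter type, which is what the netdev abstraction
avoids.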
>
> This seems to save a lot of unnecessary stuff, e.g. netdev, vnet header
> or offload.
>
>>
>>> - I'm not sure about the value of doing this, especially considering
>>> that the host (Linux) has a much more functional and powerful traffic
>>> control system.
>>>
>>> Thanks.
>>>
>>>
>>>> ---
>>>> include/net/net.h | 3 +
>>>> net/Makefile.objs | 1 +
>>>> net/clients.h | 3 +
>>>> net/filter.c | 200
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> net/net.c | 6 +-
>>>> qapi-schema.json | 23 ++++++-
>>>> 6 files changed, 233 insertions(+), 3 deletions(-)
>>>> create mode 100644 net/filter.c
>>>>
>>>> diff --git a/include/net/net.h b/include/net/net.h
>>>> index 6a6cbef..250f365 100644
>>>> --- a/include/net/net.h
>>>> +++ b/include/net/net.h
>>>> @@ -45,6 +45,8 @@ typedef void (NetPoll)(NetClientState *, bool
>>>> enable);
>>>> typedef int (NetCanReceive)(NetClientState *);
>>>> typedef ssize_t (NetReceive)(NetClientState *, const uint8_t *,
>>>> size_t);
>>>> typedef ssize_t (NetReceiveIOV)(NetClientState *, const struct
>>>> iovec *, int);
>>>> +typedef ssize_t (NetReceiveFilter)(NetClientState *, NetClientState *,
>>>> + unsigned, const uint8_t *, size_t);
>>>> typedef void (NetCleanup) (NetClientState *);
>>>> typedef void (LinkStatusChanged)(NetClientState *);
>>>> typedef void (NetClientDestructor)(NetClientState *);
>>>> @@ -64,6 +66,7 @@ typedef struct NetClientInfo {
>>>> NetReceive *receive;
>>>> NetReceive *receive_raw;
>>>> NetReceiveIOV *receive_iov;
>>>> + NetReceiveFilter *receive_filter;
>>>> NetCanReceive *can_receive;
>>>> NetCleanup *cleanup;
>>>> LinkStatusChanged *link_status_changed;
>>>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>>>> index ec19cb3..914aec0 100644
>>>> --- a/net/Makefile.objs
>>>> +++ b/net/Makefile.objs
>>>> @@ -13,3 +13,4 @@ common-obj-$(CONFIG_HAIKU) += tap-haiku.o
>>>> common-obj-$(CONFIG_SLIRP) += slirp.o
>>>> common-obj-$(CONFIG_VDE) += vde.o
>>>> common-obj-$(CONFIG_NETMAP) += netmap.o
>>>> +common-obj-y += filter.o
>>>> diff --git a/net/clients.h b/net/clients.h
>>>> index d47530e..bcfb34b 100644
>>>> --- a/net/clients.h
>>>> +++ b/net/clients.h
>>>> @@ -62,4 +62,7 @@ int net_init_netmap(const NetClientOptions *opts,
>>>> const char *name,
>>>> int net_init_vhost_user(const NetClientOptions *opts, const char
>>>> *name,
>>>> NetClientState *peer, Error **errp);
>>>>
>>>> +int net_init_filter(const NetClientOptions *opts, const char *name,
>>>> + NetClientState *peer, Error **errp);
>>>> +
>>>> #endif /* QEMU_NET_CLIENTS_H */
>>>> diff --git a/net/filter.c b/net/filter.c
>>>> new file mode 100644
>>>> index 0000000..006c64a
>>>> --- /dev/null
>>>> +++ b/net/filter.c
>>>> @@ -0,0 +1,200 @@
>>>> +/*
>>>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service
>>>> (COLO)
>>>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>>>> + *
>>>> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
>>>> + * Copyright (c) 2015 FUJITSU LIMITED
>>>> + * Copyright (c) 2015 Intel Corporation
>>>> + *
>>>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>>>> + * later. See the COPYING file in the top-level directory.
>>>> + */
>>>> +
>>>> +#include "net/net.h"
>>>> +#include "clients.h"
>>>> +#include "qemu-common.h"
>>>> +#include "qemu/error-report.h"
>>>> +
>>>> +typedef struct FILTERState {
>>>> + NetClientState nc;
>>>> + NetClientState *backend;
>>>> +} FILTERState;
>>>> +
>>>> +static ssize_t filter_receive(NetClientState *nc, NetClientState
>>>> *sender,
>>>> + unsigned flags, const uint8_t *data,
>>>> size_t size)
>>>> +{
>>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>>> + NetClientState *issued_nc = NULL;
>>>> + ssize_t ret;
>>>> +
>>>> + if (sender->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>>>> + /* packet received from NIC */
>>>> + printf("packet received from NIC!!!\n");
>>>> + issued_nc = s->backend;
>>>> + } else {
>>>> + /* packet received from backend */
>>>> + printf("packet received from backend!!!\n");
>>>> + issued_nc = nc->peer;
>>>> + }
>>>> +
>>>> + if (flags & QEMU_NET_PACKET_FLAG_RAW &&
>>>> issued_nc->info->receive_raw) {
>>>> + ret = issued_nc->info->receive_raw(issued_nc, data, size);
>>>> + } else {
>>>> + ret = issued_nc->info->receive(issued_nc, data, size);
>>>> + }
>>>> +
>>>> + return ret;
>>>> +}
>>>> +
>>>> +static void filter_cleanup(NetClientState *nc)
>>>> +{
>>>> + return;
>>>> +}
>>>> +
>>>> +static bool filter_has_ufo(NetClientState *nc)
>>>> +{
>>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>>> + NetClientState *backend = s->backend;
>>>> +
>>>> + if (!backend->info->has_ufo) {
>>>> + return false;
>>>> + }
>>>> +
>>>> + return backend->info->has_ufo(backend);
>>>> +}
>>>> +
>>>> +static bool filter_has_vnet_hdr(NetClientState *nc)
>>>> +{
>>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>>> + NetClientState *backend = s->backend;
>>>> +
>>>> + if (!backend->info->has_vnet_hdr) {
>>>> + return false;
>>>> + }
>>>> +
>>>> + return backend->info->has_vnet_hdr(backend);
>>>> +}
>>>> +
>>>> +static bool filter_has_vnet_hdr_len(NetClientState *nc, int len)
>>>> +{
>>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>>> + NetClientState *backend = s->backend;
>>>> +
>>>> + if (!backend->info->has_vnet_hdr_len) {
>>>> + return false;
>>>> + }
>>>> +
>>>> + return backend->info->has_vnet_hdr_len(backend, len);
>>>> +}
>>>> +
>>>> +static void filter_using_vnet_hdr(NetClientState *nc, bool
>>>> using_vnet_hdr)
>>>> +{
>>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>>> + NetClientState *backend = s->backend;
>>>> +
>>>> + if (!backend->info->using_vnet_hdr) {
>>>> + return;
>>>> + }
>>>> +
>>>> + backend->info->using_vnet_hdr(backend, using_vnet_hdr);
>>>> +}
>>>> +
>>>> +static void filter_set_offload(NetClientState *nc, int csum, int tso4,
>>>> + int tso6, int ecn, int ufo)
>>>> +{
>>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>>> + NetClientState *backend = s->backend;
>>>> +
>>>> + if (!backend->info->set_offload) {
>>>> + return;
>>>> + }
>>>> +
>>>> + backend->info->set_offload(backend, csum, tso4, tso6, ecn, ufo);
>>>> +}
>>>> +
>>>> +static void filter_set_vnet_hdr_len(NetClientState *nc, int len)
>>>> +{
>>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>>> + NetClientState *backend = s->backend;
>>>> +
>>>> + if (!backend->info->set_vnet_hdr_len) {
>>>> + return;
>>>> + }
>>>> +
>>>> + backend->info->set_vnet_hdr_len(backend, len);
>>>> +}
>>>> +
>>>> +static int filter_set_vnet_le(NetClientState *nc, bool is_le)
>>>> +{
>>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>>> + NetClientState *backend = s->backend;
>>>> +
>>>> + if (!backend->info->set_vnet_le) {
>>>> + return -ENOSYS;
>>>> + }
>>>> +
>>>> + return backend->info->set_vnet_le(backend, is_le);
>>>> +}
>>>> +
>>>> +static int filter_set_vnet_be(NetClientState *nc, bool is_be)
>>>> +{
>>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>>> + NetClientState *backend = s->backend;
>>>> +
>>>> + if (!backend->info->set_vnet_be) {
>>>> + return -ENOSYS;
>>>> + }
>>>> +
>>>> + return backend->info->set_vnet_be(backend, is_be);
>>>> +}
>>>> +
>>>> +static NetClientInfo net_filter_info = {
>>>> + .type = NET_CLIENT_OPTIONS_KIND_FILTER,
>>>> + .size = sizeof(FILTERState),
>>>> + .receive_filter = filter_receive,
>>>> + .cleanup = filter_cleanup,
>>>> + .has_ufo = filter_has_ufo,
>>>> + .has_vnet_hdr = filter_has_vnet_hdr,
>>>> + .has_vnet_hdr_len = filter_has_vnet_hdr_len,
>>>> + .using_vnet_hdr = filter_using_vnet_hdr,
>>>> + .set_offload = filter_set_offload,
>>>> + .set_vnet_hdr_len = filter_set_vnet_hdr_len,
>>>> + .set_vnet_le = filter_set_vnet_le,
>>>> + .set_vnet_be = filter_set_vnet_be,
>>>> +};
>>>> +
>>>> +int net_init_filter(const NetClientOptions *opts, const char *name,
>>>> + NetClientState *peer, Error **errp)
>>>> +{
>>>> + NetClientState *nc;
>>>> + FILTERState *s;
>>>> + const NetdevFilterOptions *filter;
>>>> + char *backend_id = NULL;
>>>> + /* char *plugin = NULL; */
>>>> +
>>>> + assert(opts->kind == NET_CLIENT_OPTIONS_KIND_FILTER);
>>>> + filter = opts->filter;
>>>> + assert(filter->has_backend);
>>>> +
>>>> + backend_id = filter->backend;
>>>> + /* plugin = filter->has_plugin ? filter->plugin : NULL; */
>>>> +
>>>> + nc = qemu_new_net_client(&net_filter_info, peer, "filter", name);
>>>> + /*
>>>> + * TODO: Both backend and frontend packets will use this queue, we
>>>> + * double this queue's maxlen
>>>> + */
>>>> + s = DO_UPCAST(FILTERState, nc, nc);
>>>> + s->backend = qemu_find_netdev(backend_id);
>>>> + if (!s->backend) {
>>>> + error_setg(errp, "invalid backend name specified");
>>>> + return -1;
>>>> + }
>>>> +
>>>> + s->backend->peer = nc;
>>>> + /*
>>>> + * TODO:
>>>> + * init filter plugin
>>>> + */
>>>> + return 0;
>>>> +}
>>>> diff --git a/net/net.c b/net/net.c
>>>> index 28a5597..466c6ff 100644
>>>> --- a/net/net.c
>>>> +++ b/net/net.c
>>>> @@ -57,6 +57,7 @@ const char *host_net_devices[] = {
>>>> "tap",
>>>> "socket",
>>>> "dump",
>>>> + "filter",
>>>> #ifdef CONFIG_NET_BRIDGE
>>>> "bridge",
>>>> #endif
>>>> @@ -571,7 +572,9 @@ ssize_t qemu_deliver_packet(NetClientState *sender,
>>>> return 0;
>>>> }
>>>>
>>>> - if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) {
>>>> + if (nc->info->receive_filter) {
>>>> + ret = nc->info->receive_filter(nc, sender, flags, data, size);
>>>> + } else if (flags & QEMU_NET_PACKET_FLAG_RAW &&
>>>> nc->info->receive_raw) {
>>>> ret = nc->info->receive_raw(nc, data, size);
>>>> } else {
>>>> ret = nc->info->receive(nc, data, size);
>>>> @@ -886,6 +889,7 @@ static int (* const
>>>> net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
>>>> const char *name,
>>>> NetClientState *peer, Error **errp) = {
>>>> [NET_CLIENT_OPTIONS_KIND_NIC] = net_init_nic,
>>>> + [NET_CLIENT_OPTIONS_KIND_FILTER] = net_init_filter,
>>>> #ifdef CONFIG_SLIRP
>>>> [NET_CLIENT_OPTIONS_KIND_USER] = net_init_slirp,
>>>> #endif
>>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>>> index a0a45f7..3329973 100644
>>>> --- a/qapi-schema.json
>>>> +++ b/qapi-schema.json
>>>> @@ -2063,7 +2063,7 @@
>>>> # Add a network backend.
>>>> #
>>>> # @type: the type of network backend. Current valid values are
>>>> 'user', 'tap',
>>>> -# 'vde', 'socket', 'dump' and 'bridge'
>>>> +# 'vde', 'socket', 'dump' , 'bridge' and 'filter'
>>>> #
>>>> # @id: the name of the new network backend
>>>> #
>>>> @@ -2474,6 +2474,24 @@
>>>> '*vhostforce': 'bool' } }
>>>>
>>>> ##
>>>> +# @NetdevFilterOptions
>>>> +#
>>>> +# A net filter between network backend and NIC device
>>>> +#
>>>> +# @plugin: #optional a plugin represent a set of filter rules,
>>>> +# by default, if no plugin is supplied, the net filter
>>>> will do
>>>> +# nothing but pass all packets to network backend.
>>>> +#
>>>> +# @backend: the network backend.
>>>> +#
>>>> +# Since 2.5
>>>> +##
>>>> +{ 'struct': 'NetdevFilterOptions',
>>>> + 'data': {
>>>> + '*plugin': 'str',
>>>> + '*backend': 'str' } }
>>>> +
>>>> +##
>>>> # @NetClientOptions
>>>> #
>>>> # A discriminated record of network device traits.
>>>> @@ -2496,7 +2514,8 @@
>>>> 'bridge': 'NetdevBridgeOptions',
>>>> 'hubport': 'NetdevHubPortOptions',
>>>> 'netmap': 'NetdevNetmapOptions',
>>>> - 'vhost-user': 'NetdevVhostUserOptions' } }
>>>> + 'vhost-user': 'NetdevVhostUserOptions',
>>>> + 'filter': 'NetdevFilterOptions'} }
>>>>
>>>> ##
>>>> # @NetLegacy
>>>
>>> .
>>>
>>
>
> .
>
--
Thanks,
Yang.