From: Yang Hongyang <yanghy@cn.fujitsu.com>
To: Jason Wang <jasowang@redhat.com>, qemu-devel@nongnu.org
Cc: thuth@redhat.com, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH] RFC/net: Add a net filter
Date: Mon, 27 Jul 2015 14:02:50 +0800
Message-ID: <55B5C98A.6090508@cn.fujitsu.com>
In-Reply-To: <55B5C14F.5030808@cn.fujitsu.com>
On 07/27/2015 01:27 PM, Yang Hongyang wrote:
> On 07/23/2015 01:59 PM, Jason Wang wrote:
>>
>>
>> On 07/22/2015 06:55 PM, Yang Hongyang wrote:
>>> This patch adds a net filter between the network backend and NIC devices.
>>> All packets will pass through this filter.
>>> TODO:
>>> multiqueue support.
>>> plugin support.
>>>
>>>               +--------------+       +-------------+
>>> +----------+  |    filter    |       |frontend(NIC)|
>>> | real     |  |              |       |             |
>>> | network  <--+backend       <-------+             |
>>> | backend  |  |         peer +-------> peer        |
>>> +----------+  +--------------+       +-------------+
>>>
>>> Usage:
>>> -netdev tap,id=bn0 # you can use whatever backend as needed
>>> -netdev filter,id=f0,backend=bn0,plugin=dump
>>> -device e1000,netdev=f0
>>>
>>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>
>> Hi:
>>
>> Several questions:
>>
>> - It looks like we can do more than filter, so maybe something like
>> traffic control or similar is more suitable?
>
> The filter is just a transparent proxy for a backend if no filter plugin
> is inserted; it simply passes all packets through. The purpose of the
> filter is to capture all traffic. As long as we have an entry point that
> captures all packets, we can do more, and that is what a filter plugin
> will do. Some use cases I can think of:
> - dump: using the filter, we can dump either output or input packets.
> - buffer: buffer and release packets. This can be used with
>   micro-checkpointing or other Remus-like VM FT solutions. You can
>   also supply an interval to a buffer plugin, which will then release
>   packets at that interval.
> There may be other use cases based on this special backend.
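
To make the buffer use case above a bit more concrete, here is a minimal,
self-contained sketch of what a buffer plugin's queue could look like. It
is not part of the RFC patch: BufPacket, BufferPlugin, buffer_enqueue(),
buffer_release() and count_bytes() are hypothetical names, and the
periodic timer that would call buffer_release() at each interval is
assumed to live elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; none of this is in the RFC patch. */
typedef struct BufPacket {
    struct BufPacket *next;
    size_t size;
    uint8_t data[];             /* packet bytes, copied at enqueue time */
} BufPacket;

typedef struct BufferPlugin {
    BufPacket *head, *tail;     /* FIFO of held packets */
    size_t queued;              /* number of packets currently held */
} BufferPlugin;

typedef void (*ReleaseFn)(const uint8_t *data, size_t size, void *opaque);

/* Hold a copy of the packet instead of forwarding it.
 * Returns the number of bytes accepted, or -1 if allocation fails. */
static long buffer_enqueue(BufferPlugin *b, const uint8_t *data, size_t size)
{
    BufPacket *p;

    assert(b && data);
    p = malloc(sizeof(*p) + size);
    if (!p) {
        return -1;
    }
    p->next = NULL;
    p->size = size;
    memcpy(p->data, data, size);
    if (b->tail) {
        b->tail->next = p;
    } else {
        b->head = p;
    }
    b->tail = p;
    b->queued++;
    return (long)size;
}

/* Hand every held packet to deliver() in arrival order and free it.
 * Returns the number of packets released. */
static size_t buffer_release(BufferPlugin *b, ReleaseFn deliver, void *opaque)
{
    size_t n = 0;

    assert(b && deliver);
    while (b->head) {
        BufPacket *p = b->head;
        b->head = p->next;
        deliver(p->data, p->size, opaque);
        free(p);
        n++;
    }
    b->tail = NULL;
    b->queued = 0;
    return n;
}

/* Example delivery callback: adds each released packet's size to a counter. */
static void count_bytes(const uint8_t *data, size_t size, void *opaque)
{
    (void)data;
    *(size_t *)opaque += size;
}
```

In a real plugin the delivery callback would presumably pass the packet on
to the backend or the NIC rather than count bytes, and a checkpoint-based
FT solution would call the release step when a checkpoint completes
instead of on a fixed interval.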
>
>> - What are the advantages of introducing a new type of netdev?
You can treat the filter as a full-featured network backend. By implementing
it as a new type of netdev, we can reuse the existing netdev design and as
much of the existing code as possible.
>> As far as I
>> can see, just replacing the dump function in Thomas' series with a
>> configurable function pointer would do the trick? (Probably with some
>> monitor commands.) And then you won't even need to deal with the vnet
>> header and offload stuff?
>
> I think the dump function applies to every netdev: it adds a dump_enabled
> flag to NetClientState and dumps the packet when the netdev's receive
> function is called. This filter focuses more on packets between the
> backend and the frontend; it is a kind of injection into the network
> packet flow. So I think the semantics are different.
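
For comparison, the per-netdev hook approach discussed above might look
roughly like the sketch below. This is only an illustration under my own
assumptions: NetDev, PacketHook, netdev_receive() and count_hook() are
hypothetical stand-ins, not the code from Thomas' series or from this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a netdev with an optional hook. */
typedef struct NetDev NetDev;
typedef void (*PacketHook)(NetDev *nd, const uint8_t *data, size_t size,
                           void *opaque);

struct NetDev {
    const char *name;
    PacketHook hook;        /* configurable; NULL means no hook installed */
    void *hook_opaque;
    size_t rx_bytes;        /* bytes this netdev has received so far */
};

/* The hook runs whenever this netdev's receive path is entered. It only
 * observes the packet; delivery continues exactly as it would without it. */
static size_t netdev_receive(NetDev *nd, const uint8_t *data, size_t size)
{
    assert(nd && data);
    if (nd->hook) {
        nd->hook(nd, data, size, nd->hook_opaque);
    }
    nd->rx_bytes += size;
    return size;
}

/* Example hook: counts how many packets it has seen. */
static void count_hook(NetDev *nd, const uint8_t *data, size_t size,
                       void *opaque)
{
    (void)nd;
    (void)data;
    (void)size;
    ++*(unsigned *)opaque;
}
```

In this sketch the hook is attached to one netdev and sees only what that
netdev receives, and because it returns nothing it cannot hold or redirect
a packet. The filter in this patch instead sits between the backend and
the NIC and decides where each packet goes next.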
>
>> - I'm not sure about the value of doing this, especially considering that
>> the host (Linux) has a much more functional and powerful traffic control
>> system.
>>
>> Thanks.
>>
>>
>>> ---
>>> include/net/net.h | 3 +
>>> net/Makefile.objs | 1 +
>>> net/clients.h | 3 +
>>> net/filter.c | 200 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> net/net.c | 6 +-
>>> qapi-schema.json | 23 ++++++-
>>> 6 files changed, 233 insertions(+), 3 deletions(-)
>>> create mode 100644 net/filter.c
>>>
>>> diff --git a/include/net/net.h b/include/net/net.h
>>> index 6a6cbef..250f365 100644
>>> --- a/include/net/net.h
>>> +++ b/include/net/net.h
>>> @@ -45,6 +45,8 @@ typedef void (NetPoll)(NetClientState *, bool enable);
>>> typedef int (NetCanReceive)(NetClientState *);
>>> typedef ssize_t (NetReceive)(NetClientState *, const uint8_t *, size_t);
>>> typedef ssize_t (NetReceiveIOV)(NetClientState *, const struct iovec *, int);
>>> +typedef ssize_t (NetReceiveFilter)(NetClientState *, NetClientState *,
>>> + unsigned, const uint8_t *, size_t);
>>> typedef void (NetCleanup) (NetClientState *);
>>> typedef void (LinkStatusChanged)(NetClientState *);
>>> typedef void (NetClientDestructor)(NetClientState *);
>>> @@ -64,6 +66,7 @@ typedef struct NetClientInfo {
>>> NetReceive *receive;
>>> NetReceive *receive_raw;
>>> NetReceiveIOV *receive_iov;
>>> + NetReceiveFilter *receive_filter;
>>> NetCanReceive *can_receive;
>>> NetCleanup *cleanup;
>>> LinkStatusChanged *link_status_changed;
>>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>>> index ec19cb3..914aec0 100644
>>> --- a/net/Makefile.objs
>>> +++ b/net/Makefile.objs
>>> @@ -13,3 +13,4 @@ common-obj-$(CONFIG_HAIKU) += tap-haiku.o
>>> common-obj-$(CONFIG_SLIRP) += slirp.o
>>> common-obj-$(CONFIG_VDE) += vde.o
>>> common-obj-$(CONFIG_NETMAP) += netmap.o
>>> +common-obj-y += filter.o
>>> diff --git a/net/clients.h b/net/clients.h
>>> index d47530e..bcfb34b 100644
>>> --- a/net/clients.h
>>> +++ b/net/clients.h
>>> @@ -62,4 +62,7 @@ int net_init_netmap(const NetClientOptions *opts, const char *name,
>>> int net_init_vhost_user(const NetClientOptions *opts, const char *name,
>>> NetClientState *peer, Error **errp);
>>>
>>> +int net_init_filter(const NetClientOptions *opts, const char *name,
>>> + NetClientState *peer, Error **errp);
>>> +
>>> #endif /* QEMU_NET_CLIENTS_H */
>>> diff --git a/net/filter.c b/net/filter.c
>>> new file mode 100644
>>> index 0000000..006c64a
>>> --- /dev/null
>>> +++ b/net/filter.c
>>> @@ -0,0 +1,200 @@
>>> +/*
>>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
>>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>>> + *
>>> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
>>> + * Copyright (c) 2015 FUJITSU LIMITED
>>> + * Copyright (c) 2015 Intel Corporation
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>>> + * later. See the COPYING file in the top-level directory.
>>> + */
>>> +
>>> +#include "net/net.h"
>>> +#include "clients.h"
>>> +#include "qemu-common.h"
>>> +#include "qemu/error-report.h"
>>> +
>>> +typedef struct FILTERState {
>>> + NetClientState nc;
>>> + NetClientState *backend;
>>> +} FILTERState;
>>> +
>>> +static ssize_t filter_receive(NetClientState *nc, NetClientState *sender,
>>> + unsigned flags, const uint8_t *data, size_t size)
>>> +{
>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> + NetClientState *issued_nc = NULL;
>>> + ssize_t ret;
>>> +
>>> + if (sender->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>>> + /* packet received from NIC */
>>> + printf("packet received from NIC!!!\n");
>>> + issued_nc = s->backend;
>>> + } else {
>>> + /* packet received from backend */
>>> + printf("packet received from backend!!!\n");
>>> + issued_nc = nc->peer;
>>> + }
>>> +
>>> + if (flags & QEMU_NET_PACKET_FLAG_RAW && issued_nc->info->receive_raw) {
>>> + ret = issued_nc->info->receive_raw(issued_nc, data, size);
>>> + } else {
>>> + ret = issued_nc->info->receive(issued_nc, data, size);
>>> + }
>>> +
>>> + return ret;
>>> +}
>>> +
>>> +static void filter_cleanup(NetClientState *nc)
>>> +{
>>> + return;
>>> +}
>>> +
>>> +static bool filter_has_ufo(NetClientState *nc)
>>> +{
>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> + NetClientState *backend = s->backend;
>>> +
>>> + if (!backend->info->has_ufo) {
>>> + return false;
>>> + }
>>> +
>>> + return backend->info->has_ufo(backend);
>>> +}
>>> +
>>> +static bool filter_has_vnet_hdr(NetClientState *nc)
>>> +{
>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> + NetClientState *backend = s->backend;
>>> +
>>> + if (!backend->info->has_vnet_hdr) {
>>> + return false;
>>> + }
>>> +
>>> + return backend->info->has_vnet_hdr(backend);
>>> +}
>>> +
>>> +static bool filter_has_vnet_hdr_len(NetClientState *nc, int len)
>>> +{
>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> + NetClientState *backend = s->backend;
>>> +
>>> + if (!backend->info->has_vnet_hdr_len) {
>>> + return false;
>>> + }
>>> +
>>> + return backend->info->has_vnet_hdr_len(backend, len);
>>> +}
>>> +
>>> +static void filter_using_vnet_hdr(NetClientState *nc, bool using_vnet_hdr)
>>> +{
>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> + NetClientState *backend = s->backend;
>>> +
>>> + if (!backend->info->using_vnet_hdr) {
>>> + return;
>>> + }
>>> +
>>> + backend->info->using_vnet_hdr(backend, using_vnet_hdr);
>>> +}
>>> +
>>> +static void filter_set_offload(NetClientState *nc, int csum, int tso4,
>>> + int tso6, int ecn, int ufo)
>>> +{
>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> + NetClientState *backend = s->backend;
>>> +
>>> + if (!backend->info->set_offload) {
>>> + return;
>>> + }
>>> +
>>> + backend->info->set_offload(backend, csum, tso4, tso6, ecn, ufo);
>>> +}
>>> +
>>> +static void filter_set_vnet_hdr_len(NetClientState *nc, int len)
>>> +{
>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> + NetClientState *backend = s->backend;
>>> +
>>> + if (!backend->info->set_vnet_hdr_len) {
>>> + return;
>>> + }
>>> +
>>> + backend->info->set_vnet_hdr_len(backend, len);
>>> +}
>>> +
>>> +static int filter_set_vnet_le(NetClientState *nc, bool is_le)
>>> +{
>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> + NetClientState *backend = s->backend;
>>> +
>>> + if (!backend->info->set_vnet_le) {
>>> + return -ENOSYS;
>>> + }
>>> +
>>> + return backend->info->set_vnet_le(backend, is_le);
>>> +}
>>> +
>>> +static int filter_set_vnet_be(NetClientState *nc, bool is_be)
>>> +{
>>> + FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> + NetClientState *backend = s->backend;
>>> +
>>> + if (!backend->info->set_vnet_be) {
>>> + return -ENOSYS;
>>> + }
>>> +
>>> + return backend->info->set_vnet_be(backend, is_be);
>>> +}
>>> +
>>> +static NetClientInfo net_filter_info = {
>>> + .type = NET_CLIENT_OPTIONS_KIND_FILTER,
>>> + .size = sizeof(FILTERState),
>>> + .receive_filter = filter_receive,
>>> + .cleanup = filter_cleanup,
>>> + .has_ufo = filter_has_ufo,
>>> + .has_vnet_hdr = filter_has_vnet_hdr,
>>> + .has_vnet_hdr_len = filter_has_vnet_hdr_len,
>>> + .using_vnet_hdr = filter_using_vnet_hdr,
>>> + .set_offload = filter_set_offload,
>>> + .set_vnet_hdr_len = filter_set_vnet_hdr_len,
>>> + .set_vnet_le = filter_set_vnet_le,
>>> + .set_vnet_be = filter_set_vnet_be,
>>> +};
>>> +
>>> +int net_init_filter(const NetClientOptions *opts, const char *name,
>>> + NetClientState *peer, Error **errp)
>>> +{
>>> + NetClientState *nc;
>>> + FILTERState *s;
>>> + const NetdevFilterOptions *filter;
>>> + char *backend_id = NULL;
>>> + /* char *plugin = NULL; */
>>> +
>>> + assert(opts->kind == NET_CLIENT_OPTIONS_KIND_FILTER);
>>> + filter = opts->filter;
>>> + assert(filter->has_backend);
>>> +
>>> + backend_id = filter->backend;
>>> + /* plugin = filter->has_plugin ? filter->plugin : NULL; */
>>> +
>>> + nc = qemu_new_net_client(&net_filter_info, peer, "filter", name);
>>> + /*
>>> + * TODO: Both backend and frontend packets will use this queue, we
>>> + * double this queue's maxlen
>>> + */
>>> + s = DO_UPCAST(FILTERState, nc, nc);
>>> + s->backend = qemu_find_netdev(backend_id);
>>> + if (!s->backend) {
>>> + error_setg(errp, "invalid backend name specified");
>>> + return -1;
>>> + }
>>> +
>>> + s->backend->peer = nc;
>>> + /*
>>> + * TODO:
>>> + * init filter plugin
>>> + */
>>> + return 0;
>>> +}
>>> diff --git a/net/net.c b/net/net.c
>>> index 28a5597..466c6ff 100644
>>> --- a/net/net.c
>>> +++ b/net/net.c
>>> @@ -57,6 +57,7 @@ const char *host_net_devices[] = {
>>> "tap",
>>> "socket",
>>> "dump",
>>> + "filter",
>>> #ifdef CONFIG_NET_BRIDGE
>>> "bridge",
>>> #endif
>>> @@ -571,7 +572,9 @@ ssize_t qemu_deliver_packet(NetClientState *sender,
>>> return 0;
>>> }
>>>
>>> - if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) {
>>> + if (nc->info->receive_filter) {
>>> + ret = nc->info->receive_filter(nc, sender, flags, data, size);
>>> + } else if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) {
>>> ret = nc->info->receive_raw(nc, data, size);
>>> } else {
>>> ret = nc->info->receive(nc, data, size);
>>> @@ -886,6 +889,7 @@ static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
>>> const char *name,
>>> NetClientState *peer, Error **errp) = {
>>> [NET_CLIENT_OPTIONS_KIND_NIC] = net_init_nic,
>>> + [NET_CLIENT_OPTIONS_KIND_FILTER] = net_init_filter,
>>> #ifdef CONFIG_SLIRP
>>> [NET_CLIENT_OPTIONS_KIND_USER] = net_init_slirp,
>>> #endif
>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>> index a0a45f7..3329973 100644
>>> --- a/qapi-schema.json
>>> +++ b/qapi-schema.json
>>> @@ -2063,7 +2063,7 @@
>>> # Add a network backend.
>>> #
>>> # @type: the type of network backend. Current valid values are 'user', 'tap',
>>> -# 'vde', 'socket', 'dump' and 'bridge'
>>> +# 'vde', 'socket', 'dump' , 'bridge' and 'filter'
>>> #
>>> # @id: the name of the new network backend
>>> #
>>> @@ -2474,6 +2474,24 @@
>>> '*vhostforce': 'bool' } }
>>>
>>> ##
>>> +# @NetdevFilterOptions
>>> +#
>>> +# A net filter between network backend and NIC device
>>> +#
>>> +# @plugin: #optional a plugin represent a set of filter rules,
>>> +# by default, if no plugin is supplied, the net filter will do
>>> +# nothing but pass all packets to network backend.
>>> +#
>>> +# @backend: the network backend.
>>> +#
>>> +# Since 2.5
>>> +##
>>> +{ 'struct': 'NetdevFilterOptions',
>>> + 'data': {
>>> + '*plugin': 'str',
>>> + '*backend': 'str' } }
>>> +
>>> +##
>>> # @NetClientOptions
>>> #
>>> # A discriminated record of network device traits.
>>> @@ -2496,7 +2514,8 @@
>>> 'bridge': 'NetdevBridgeOptions',
>>> 'hubport': 'NetdevHubPortOptions',
>>> 'netmap': 'NetdevNetmapOptions',
>>> - 'vhost-user': 'NetdevVhostUserOptions' } }
>>> + 'vhost-user': 'NetdevVhostUserOptions',
>>> + 'filter': 'NetdevFilterOptions'} }
>>>
>>> ##
>>> # @NetLegacy
>>
>> .
>>
>
--
Thanks,
Yang.
Thread overview (33+ messages):
2015-07-13 7:39 [Qemu-devel] [PATCH v2 0/5] For QEMU 2.5: Network traffic dumping for -netdev devices Thomas Huth
2015-07-13 7:39 ` [Qemu-devel] [PATCH v2 1/5] net/dump: Add support for receive_iov function Thomas Huth
2015-07-13 7:39 ` [Qemu-devel] [PATCH v2 2/5] net/dump: Move DumpState into NetClientState Thomas Huth
2015-07-13 7:39 ` [Qemu-devel] [PATCH v2 3/5] net/dump: Rework net-dump init functions Thomas Huth
2015-07-13 7:39 ` [Qemu-devel] [PATCH v2 4/5] net/dump: Add dump option for netdev devices Thomas Huth
2015-07-13 7:39 ` [Qemu-devel] [PATCH v2 5/5] qemu options: Add information about dumpfile to help text Thomas Huth
2015-07-22 6:35 ` [Qemu-devel] [PATCH v2 0/5] For QEMU 2.5: Network traffic dumping for -netdev devices Jason Wang
2015-07-22 10:52 ` Yang Hongyang
2015-07-22 10:55 ` [Qemu-devel] [PATCH] RFC/net: Add a net filter Yang Hongyang
2015-07-22 11:06 ` Daniel P. Berrange
2015-07-22 15:16 ` Yang Hongyang
2015-07-22 13:05 ` Thomas Huth
2015-07-22 15:06 ` Yang Hongyang
2015-07-22 13:26 ` Stefan Hajnoczi
2015-07-22 14:57 ` Yang Hongyang
2015-07-23 11:57 ` Stefan Hajnoczi
2015-07-23 5:59 ` Jason Wang
2015-07-27 5:27 ` Yang Hongyang
2015-07-27 6:02 ` Yang Hongyang [this message]
2015-07-27 6:39 ` Jason Wang
2015-07-27 7:00 ` Yang Hongyang
2015-07-27 7:31 ` Jason Wang
2015-07-27 7:45 ` Yang Hongyang
2015-07-27 8:01 ` Jason Wang
2015-07-27 8:39 ` Yang Hongyang
2015-07-27 9:16 ` Jason Wang
2015-07-27 10:03 ` Yang Hongyang
2015-07-28 3:28 ` Jason Wang
2015-07-28 4:00 ` Yang Hongyang
2015-07-28 8:52 ` Yang Hongyang
2015-07-28 9:19 ` Yang Hongyang
2015-07-28 9:30 ` Jason Wang
2015-07-28 9:41 ` Yang Hongyang