qemu-devel.nongnu.org archive mirror
From: Yang Hongyang <yanghy@cn.fujitsu.com>
To: Jason Wang <jasowang@redhat.com>, qemu-devel@nongnu.org
Cc: thuth@redhat.com, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH] RFC/net: Add a net filter
Date: Mon, 27 Jul 2015 14:02:50 +0800
Message-ID: <55B5C98A.6090508@cn.fujitsu.com>
In-Reply-To: <55B5C14F.5030808@cn.fujitsu.com>



On 07/27/2015 01:27 PM, Yang Hongyang wrote:
> On 07/23/2015 01:59 PM, Jason Wang wrote:
>>
>>
>> On 07/22/2015 06:55 PM, Yang Hongyang wrote:
>>> This patch adds a net filter between the network backend and NIC devices.
>>> All packets pass through this filter.
>>> TODO:
>>>   multiqueue support.
>>>   plugin support.
>>>
>>>                +--------------+       +-------------+
>>> +----------+  |    filter    |       |frontend(NIC)|
>>> | real     |  |              |       |             |
>>> | network  <--+backend       <-------+             |
>>> | backend  |  |         peer +-------> peer        |
>>> +----------+  +--------------+       +-------------+
>>>
>>> Usage:
>>> -netdev tap,id=bn0  # you can use whatever backend as needed
>>> -netdev filter,id=f0,backend=bn0,plugin=dump
>>> -device e1000,netdev=f0
>>>
>>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>
>> Hi:
>>
>> Several questions:
>>
>> - It looks like we can do more than filtering, so maybe something like
>> traffic control would be more suitable?
>
> The filter is just a transparent proxy of a backend if no filter plugin
> is inserted; it simply passes all packets through. The purpose of the
> filter is to capture all traffic. Once we have an entry point that sees
> every packet, we can do more, and that is what a filter plugin will do.
> Some use cases I can think of:
> - dump: using the filter, we can dump either output or input packets.
> - buffer: buffer and release packets. This can be used with
>           micro-checkpointing or other Remus-like VM FT solutions. You can
>           also supply an interval to a buffer plugin, which will then
>           release packets at that interval.
> There may be other use cases based on this special backend.
>
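
To make the buffer use case above a bit more concrete, here is a minimal
sketch in plain C. It is not QEMU code: PacketBuffer, buffer_hold(),
buffer_release() and buffer_tick() are hypothetical names invented for this
illustration, and time is passed in explicitly instead of using a real
timer. It only shows the idea of holding packets and releasing them either
on demand or once a configured interval has elapsed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SLOTS 8   /* hypothetical capacity, chosen for the example */
#define PKT_MAX   64  /* hypothetical maximum packet size */

/* Callback that hands one released packet to its destination. */
typedef void (*DeliverFn)(const unsigned char *pkt, size_t len);

typedef struct {
    unsigned char data[BUF_SLOTS][PKT_MAX];
    size_t len[BUF_SLOTS];
    size_t count;              /* packets currently held */
    unsigned interval_ms;      /* 0 means release only on explicit request */
    unsigned last_release_ms;  /* time of the last interval-driven release */
} PacketBuffer;

static void buffer_init(PacketBuffer *b, unsigned interval_ms)
{
    memset(b, 0, sizeof(*b));
    b->interval_ms = interval_ms;
}

/* Hold a packet; returns 0 on success, -1 if the buffer is full or the
 * packet is too large. */
static int buffer_hold(PacketBuffer *b, const unsigned char *pkt, size_t len)
{
    assert(pkt != NULL);
    if (b->count == BUF_SLOTS || len > PKT_MAX) {
        return -1;
    }
    memcpy(b->data[b->count], pkt, len);
    b->len[b->count] = len;
    b->count++;
    return 0;
}

/* Release every held packet in arrival order; returns how many were
 * released. A NULL deliver callback just drops the packets. */
static size_t buffer_release(PacketBuffer *b, DeliverFn deliver)
{
    size_t n = b->count;
    for (size_t i = 0; i < n; i++) {
        if (deliver) {
            deliver(b->data[i], b->len[i]);
        }
    }
    b->count = 0;
    return n;
}

/* Interval-driven release: flush only when interval_ms has elapsed since
 * the previous interval-driven release. */
static size_t buffer_tick(PacketBuffer *b, unsigned now_ms, DeliverFn deliver)
{
    if (b->interval_ms == 0 || now_ms - b->last_release_ms < b->interval_ms) {
        return 0;
    }
    b->last_release_ms = now_ms;
    return buffer_release(b, deliver);
}
```

A checkpoint-style user would call buffer_release() directly once a
checkpoint completes, while the interval variant would call buffer_tick()
from a periodic timer.
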
>> - What are the advantages of introducing a new type of netdev?

You can treat the filter as a full-featured network backend. By implementing
it as a new type of netdev, we can reuse the existing netdev design and as
much existing code as possible.
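
As a rough illustration of the "transparent proxy" idea, the sketch below
models the wiring from the diagram in the patch description in plain C. It
does not use the real QEMU types: Client, Filter, filter_attach() and
filter_forward() are hypothetical stand-ins made up for this example. With
no plugin present, a packet from the NIC goes to the real backend and a
packet from the backend goes to the NIC, unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's net client types; not the real API. */
typedef enum { CLIENT_NIC, CLIENT_BACKEND, CLIENT_FILTER } ClientKind;

typedef struct Client Client;
struct Client {
    ClientKind kind;
    Client *peer;           /* the client this one is wired to */
    size_t bytes_received;  /* payload bytes delivered to this client */
};

typedef struct {
    Client nc;        /* the filter's own client; nc.peer is the NIC */
    Client *backend;  /* the real backend hidden behind the filter */
} Filter;

/* Wire NIC <-> filter <-> backend: both ends see the filter as their peer. */
static void filter_attach(Filter *f, Client *nic, Client *backend)
{
    f->nc.kind = CLIENT_FILTER;
    f->nc.peer = nic;
    f->nc.bytes_received = 0;
    f->backend = backend;
    nic->peer = &f->nc;
    backend->peer = &f->nc;
}

/* Pass a packet through unchanged. Traffic sent by the NIC goes to the
 * backend; traffic sent by the backend goes to the NIC. Returns the
 * client that received the packet. */
static Client *filter_forward(Filter *f, const Client *sender, size_t size)
{
    assert(sender == f->nc.peer || sender == f->backend);
    Client *target = (sender->kind == CLIENT_NIC) ? f->backend : f->nc.peer;
    target->bytes_received += size;
    return target;
}
```

A plugin would sit inside filter_forward(), between choosing the target and
delivering to it, which is where dumping or buffering could take place.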

>> As far as I
>> can see, just replacing the dump function in Thomas' series with a
>> configurable function pointer will do the trick? (Probably with some
>> monitor commands.) And then you won't even need to deal with vnet hdr
>> and offload stuff?
>
> I think the dump function works per netdev: it adds a dump_enabled flag to
> NetClientState and dumps the packet when the netdev's receive is called.
> This filter focuses on packets flowing between backend and frontend;
> it is a kind of injection into the network packet flow.
> So I think the semantics are different.
>
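
For comparison, the per-netdev function-pointer approach suggested above
could look roughly like the following sketch. This is an assumption about
the shape of such a hook, not code from Thomas' series: NetClient,
PacketHook, count_bytes_hook() and client_receive() are hypothetical names.
The hook is attached to one client and runs whenever that client receives,
rather than sitting between a backend and its frontend.

```c
#include <assert.h>
#include <stddef.h>

typedef struct NetClient NetClient;

/* Hypothetical hook type: called for every packet a client receives. */
typedef void (*PacketHook)(NetClient *nc, const unsigned char *buf,
                           size_t len);

struct NetClient {
    PacketHook hook;      /* NULL when no hook is installed */
    size_t hooked_bytes;  /* bookkeeping used by the example hook */
    size_t delivered;     /* packets handed on to the device */
};

/* Example hook: just counts the bytes it has seen. */
static void count_bytes_hook(NetClient *nc, const unsigned char *buf,
                             size_t len)
{
    (void)buf;
    nc->hooked_bytes += len;
}

/* Receive path with an optional, configurable hook in front of it. */
static size_t client_receive(NetClient *nc, const unsigned char *buf,
                             size_t len)
{
    assert(nc != NULL);
    if (nc->hook) {
        nc->hook(nc, buf, len);
    }
    nc->delivered++;
    return len;
}
```
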
>> - I'm not sure of the value of doing this, especially considering that the
>> host (Linux) has a much more functional and powerful traffic control system.
>>
>> Thanks.
>>
>>
>>> ---
>>>   include/net/net.h |   3 +
>>>   net/Makefile.objs |   1 +
>>>   net/clients.h     |   3 +
>>>   net/filter.c      | 200 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>   net/net.c         |   6 +-
>>>   qapi-schema.json  |  23 ++++++-
>>>   6 files changed, 233 insertions(+), 3 deletions(-)
>>>   create mode 100644 net/filter.c
>>>
>>> diff --git a/include/net/net.h b/include/net/net.h
>>> index 6a6cbef..250f365 100644
>>> --- a/include/net/net.h
>>> +++ b/include/net/net.h
>>> @@ -45,6 +45,8 @@ typedef void (NetPoll)(NetClientState *, bool enable);
>>>   typedef int (NetCanReceive)(NetClientState *);
>>>   typedef ssize_t (NetReceive)(NetClientState *, const uint8_t *, size_t);
>>>   typedef ssize_t (NetReceiveIOV)(NetClientState *, const struct iovec *, int);
>>> +typedef ssize_t (NetReceiveFilter)(NetClientState *, NetClientState *,
>>> +                                   unsigned, const uint8_t *, size_t);
>>>   typedef void (NetCleanup) (NetClientState *);
>>>   typedef void (LinkStatusChanged)(NetClientState *);
>>>   typedef void (NetClientDestructor)(NetClientState *);
>>> @@ -64,6 +66,7 @@ typedef struct NetClientInfo {
>>>       NetReceive *receive;
>>>       NetReceive *receive_raw;
>>>       NetReceiveIOV *receive_iov;
>>> +    NetReceiveFilter *receive_filter;
>>>       NetCanReceive *can_receive;
>>>       NetCleanup *cleanup;
>>>       LinkStatusChanged *link_status_changed;
>>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>>> index ec19cb3..914aec0 100644
>>> --- a/net/Makefile.objs
>>> +++ b/net/Makefile.objs
>>> @@ -13,3 +13,4 @@ common-obj-$(CONFIG_HAIKU) += tap-haiku.o
>>>   common-obj-$(CONFIG_SLIRP) += slirp.o
>>>   common-obj-$(CONFIG_VDE) += vde.o
>>>   common-obj-$(CONFIG_NETMAP) += netmap.o
>>> +common-obj-y += filter.o
>>> diff --git a/net/clients.h b/net/clients.h
>>> index d47530e..bcfb34b 100644
>>> --- a/net/clients.h
>>> +++ b/net/clients.h
>>> @@ -62,4 +62,7 @@ int net_init_netmap(const NetClientOptions *opts, const char *name,
>>>   int net_init_vhost_user(const NetClientOptions *opts, const char *name,
>>>                           NetClientState *peer, Error **errp);
>>>
>>> +int net_init_filter(const NetClientOptions *opts, const char *name,
>>> +                    NetClientState *peer, Error **errp);
>>> +
>>>   #endif /* QEMU_NET_CLIENTS_H */
>>> diff --git a/net/filter.c b/net/filter.c
>>> new file mode 100644
>>> index 0000000..006c64a
>>> --- /dev/null
>>> +++ b/net/filter.c
>>> @@ -0,0 +1,200 @@
>>> +/*
>>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
>>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>>> + *
>>> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
>>> + * Copyright (c) 2015 FUJITSU LIMITED
>>> + * Copyright (c) 2015 Intel Corporation
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>>> + * later.  See the COPYING file in the top-level directory.
>>> + */
>>> +
>>> +#include "net/net.h"
>>> +#include "clients.h"
>>> +#include "qemu-common.h"
>>> +#include "qemu/error-report.h"
>>> +
>>> +typedef struct FILTERState {
>>> +    NetClientState nc;
>>> +    NetClientState *backend;
>>> +} FILTERState;
>>> +
>>> +static ssize_t filter_receive(NetClientState *nc, NetClientState *sender,
>>> +                              unsigned flags, const uint8_t *data, size_t size)
>>> +{
>>> +    FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> +    NetClientState *issued_nc = NULL;
>>> +    ssize_t ret;
>>> +
>>> +    if (sender->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>>> +        /* packet received from NIC */
>>> +        printf("packet received from NIC!!!\n");
>>> +        issued_nc = s->backend;
>>> +    } else {
>>> +        /* packet received from backend */
>>> +        printf("packet received from backend!!!\n");
>>> +        issued_nc = nc->peer;
>>> +    }
>>> +
>>> +    if (flags & QEMU_NET_PACKET_FLAG_RAW && issued_nc->info->receive_raw) {
>>> +        ret = issued_nc->info->receive_raw(issued_nc, data, size);
>>> +    } else {
>>> +        ret = issued_nc->info->receive(issued_nc, data, size);
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>> +
>>> +static void filter_cleanup(NetClientState *nc)
>>> +{
>>> +    return;
>>> +}
>>> +
>>> +static bool filter_has_ufo(NetClientState *nc)
>>> +{
>>> +    FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> +    NetClientState *backend = s->backend;
>>> +
>>> +    if (!backend->info->has_ufo) {
>>> +        return false;
>>> +    }
>>> +
>>> +    return backend->info->has_ufo(backend);
>>> +}
>>> +
>>> +static bool filter_has_vnet_hdr(NetClientState *nc)
>>> +{
>>> +    FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> +    NetClientState *backend = s->backend;
>>> +
>>> +    if (!backend->info->has_vnet_hdr) {
>>> +        return false;
>>> +    }
>>> +
>>> +    return backend->info->has_vnet_hdr(backend);
>>> +}
>>> +
>>> +static bool filter_has_vnet_hdr_len(NetClientState *nc, int len)
>>> +{
>>> +    FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> +    NetClientState *backend = s->backend;
>>> +
>>> +    if (!backend->info->has_vnet_hdr_len) {
>>> +        return false;
>>> +    }
>>> +
>>> +    return backend->info->has_vnet_hdr_len(backend, len);
>>> +}
>>> +
>>> +static void filter_using_vnet_hdr(NetClientState *nc, bool using_vnet_hdr)
>>> +{
>>> +    FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> +    NetClientState *backend = s->backend;
>>> +
>>> +    if (!backend->info->using_vnet_hdr) {
>>> +        return;
>>> +    }
>>> +
>>> +    backend->info->using_vnet_hdr(backend, using_vnet_hdr);
>>> +}
>>> +
>>> +static void filter_set_offload(NetClientState *nc, int csum, int tso4,
>>> +                               int tso6, int ecn, int ufo)
>>> +{
>>> +    FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> +    NetClientState *backend = s->backend;
>>> +
>>> +    if (!backend->info->set_offload) {
>>> +        return;
>>> +    }
>>> +
>>> +    backend->info->set_offload(backend, csum, tso4, tso6, ecn, ufo);
>>> +}
>>> +
>>> +static void filter_set_vnet_hdr_len(NetClientState *nc, int len)
>>> +{
>>> +    FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> +    NetClientState *backend = s->backend;
>>> +
>>> +    if (!backend->info->set_vnet_hdr_len) {
>>> +        return;
>>> +    }
>>> +
>>> +    backend->info->set_vnet_hdr_len(backend, len);
>>> +}
>>> +
>>> +static int filter_set_vnet_le(NetClientState *nc, bool is_le)
>>> +{
>>> +    FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> +    NetClientState *backend = s->backend;
>>> +
>>> +    if (!backend->info->set_vnet_le) {
>>> +        return -ENOSYS;
>>> +    }
>>> +
>>> +    return backend->info->set_vnet_le(backend, is_le);
>>> +}
>>> +
>>> +static int filter_set_vnet_be(NetClientState *nc, bool is_be)
>>> +{
>>> +    FILTERState *s = DO_UPCAST(FILTERState, nc, nc);
>>> +    NetClientState *backend = s->backend;
>>> +
>>> +    if (!backend->info->set_vnet_be) {
>>> +        return -ENOSYS;
>>> +    }
>>> +
>>> +    return backend->info->set_vnet_be(backend, is_be);
>>> +}
>>> +
>>> +static NetClientInfo net_filter_info = {
>>> +    .type = NET_CLIENT_OPTIONS_KIND_FILTER,
>>> +    .size = sizeof(FILTERState),
>>> +    .receive_filter = filter_receive,
>>> +    .cleanup = filter_cleanup,
>>> +    .has_ufo = filter_has_ufo,
>>> +    .has_vnet_hdr = filter_has_vnet_hdr,
>>> +    .has_vnet_hdr_len = filter_has_vnet_hdr_len,
>>> +    .using_vnet_hdr = filter_using_vnet_hdr,
>>> +    .set_offload = filter_set_offload,
>>> +    .set_vnet_hdr_len = filter_set_vnet_hdr_len,
>>> +    .set_vnet_le = filter_set_vnet_le,
>>> +    .set_vnet_be = filter_set_vnet_be,
>>> +};
>>> +
>>> +int net_init_filter(const NetClientOptions *opts, const char *name,
>>> +                    NetClientState *peer, Error **errp)
>>> +{
>>> +    NetClientState *nc;
>>> +    FILTERState *s;
>>> +    const NetdevFilterOptions *filter;
>>> +    char *backend_id = NULL;
>>> +    /* char *plugin = NULL; */
>>> +
>>> +    assert(opts->kind == NET_CLIENT_OPTIONS_KIND_FILTER);
>>> +    filter = opts->filter;
>>> +    assert(filter->has_backend);
>>> +
>>> +    backend_id = filter->backend;
>>> +    /* plugin = filter->has_plugin ? filter->plugin : NULL; */
>>> +
>>> +    nc = qemu_new_net_client(&net_filter_info, peer, "filter", name);
>>> +    /*
>>> +     * TODO: Both backend and frontend packets will use this queue, we
>>> +     * double this queue's maxlen
>>> +     */
>>> +    s = DO_UPCAST(FILTERState, nc, nc);
>>> +    s->backend = qemu_find_netdev(backend_id);
>>> +    if (!s->backend) {
>>> +        error_setg(errp, "invalid backend name specified");
>>> +        return -1;
>>> +    }
>>> +
>>> +    s->backend->peer = nc;
>>> +    /*
>>> +     * TODO:
>>> +     *   init filter plugin
>>> +     */
>>> +    return 0;
>>> +}
>>> diff --git a/net/net.c b/net/net.c
>>> index 28a5597..466c6ff 100644
>>> --- a/net/net.c
>>> +++ b/net/net.c
>>> @@ -57,6 +57,7 @@ const char *host_net_devices[] = {
>>>       "tap",
>>>       "socket",
>>>       "dump",
>>> +    "filter",
>>>   #ifdef CONFIG_NET_BRIDGE
>>>       "bridge",
>>>   #endif
>>> @@ -571,7 +572,9 @@ ssize_t qemu_deliver_packet(NetClientState *sender,
>>>           return 0;
>>>       }
>>>
>>> -    if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) {
>>> +    if (nc->info->receive_filter) {
>>> +        ret = nc->info->receive_filter(nc, sender, flags, data, size);
>>> +    } else if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) {
>>>           ret = nc->info->receive_raw(nc, data, size);
>>>       } else {
>>>           ret = nc->info->receive(nc, data, size);
>>> @@ -886,6 +889,7 @@ static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
>>>       const char *name,
>>>       NetClientState *peer, Error **errp) = {
>>>           [NET_CLIENT_OPTIONS_KIND_NIC]       = net_init_nic,
>>> +        [NET_CLIENT_OPTIONS_KIND_FILTER]    = net_init_filter,
>>>   #ifdef CONFIG_SLIRP
>>>           [NET_CLIENT_OPTIONS_KIND_USER]      = net_init_slirp,
>>>   #endif
>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>> index a0a45f7..3329973 100644
>>> --- a/qapi-schema.json
>>> +++ b/qapi-schema.json
>>> @@ -2063,7 +2063,7 @@
>>>   # Add a network backend.
>>>   #
>>>   # @type: the type of network backend.  Current valid values are 'user', 'tap',
>>> -#        'vde', 'socket', 'dump' and 'bridge'
>>> +#        'vde', 'socket', 'dump' , 'bridge' and 'filter'
>>>   #
>>>   # @id: the name of the new network backend
>>>   #
>>> @@ -2474,6 +2474,24 @@
>>>       '*vhostforce':    'bool' } }
>>>
>>>   ##
>>> +# @NetdevFilterOptions
>>> +#
>>> +# A net filter between network backend and NIC device
>>> +#
>>> +# @plugin: #optional a plugin represent a set of filter rules,
>>> +#          by default, if no plugin is supplied, the net filter will do
>>> +#          nothing but pass all packets to network backend.
>>> +#
>>> +# @backend: the network backend.
>>> +#
>>> +# Since 2.5
>>> +##
>>> +{ 'struct': 'NetdevFilterOptions',
>>> +  'data': {
>>> +    '*plugin':        'str',
>>> +    '*backend':       'str' } }
>>> +
>>> +##
>>>   # @NetClientOptions
>>>   #
>>>   # A discriminated record of network device traits.
>>> @@ -2496,7 +2514,8 @@
>>>       'bridge':   'NetdevBridgeOptions',
>>>       'hubport':  'NetdevHubPortOptions',
>>>       'netmap':   'NetdevNetmapOptions',
>>> -    'vhost-user': 'NetdevVhostUserOptions' } }
>>> +    'vhost-user': 'NetdevVhostUserOptions',
>>> +    'filter':   'NetdevFilterOptions'} }
>>>
>>>   ##
>>>   # @NetLegacy
>>
>> .
>>
>

-- 
Thanks,
Yang.
