qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Anton Ivanov <anton.ivanov@kot-begemot.co.uk>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Anton Ivanov <antivano@cisco.com>,
	pbonzini@redhat.com, qemu-devel@nongnu.org, afaerber@suse.de
Subject: Re: [Qemu-devel] [PATCH] New feature - RFC3931 L2TPv3 network transport using static Ethernet over L2TPv3 tunnels
Date: Thu, 06 Mar 2014 10:28:31 +0000	[thread overview]
Message-ID: <53184DCF.4030609@kot-begemot.co.uk> (raw)
In-Reply-To: <20140306094428.GB23172@stefanha-thinkpad.redhat.com>

On 06/03/14 09:44, Stefan Hajnoczi wrote:
> On Wed, Mar 05, 2014 at 02:12:20PM +0000, anton.ivanov@kot-begemot.co.uk wrote:
>
> Please don't put "New feature - " in the commit message.  "net: " or
> "l2tpv3: " would be a good prefix (see git-log(1) for other examples).
> The idea behind using a prefix is that you can immediately determine
> which area of code is affected by a commit.  It also helps with grepping
> the commit history although there are more precise ways to do that.
>
> Anyway, "New feature" isn't a useful prefix.

Understood.

>
>> From: Anton Ivanov <antivano@cisco.com>
>>
>> This transport allows qemu to communicate with host if host
>> supports L2TPv3, communicate directly VM to VM (similar to
>> current socket transport) and VM to other device - f.e. VM to
>> a router.
>>
>> Supported
>>      * Raw IPv4, Raw IPv6, UDPv4 and UDPv6 as per RFC
>>      * No cookie, 32 bit cookie or 64 bit cookie
>>      * Counter (as per RFC)
>>      * No counter (draft-mkonstan-keyed-ipv6-tunnel-00)
>> Unsupported
>>      * Workarounds for implementation with broken counter handling
>>
>> Signed-off-by: Anton Ivanov <antivano@cisco.com>
>> ---
>>   net/Makefile.objs |    1 +
>>   net/clients.h     |    2 +
>>   net/l2tpv3.c      |  554 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/net.c         |    3 +
>>   qapi-schema.json  |   56 +++++-
>>   5 files changed, 615 insertions(+), 1 deletion(-)
>>   create mode 100644 net/l2tpv3.c
> Please add documentation to qemu-options.hx.  You can follow the other
> -netdev documentation examples in that file.
>
> Also please add example Linux commands for setting up L2TPv3 to the
> commit description.  This will help anyone wishing to try out the code.
> Something along the lines of:
>
>    # ip l2tp add tunnel tunnel_id 1 peer_tunnel_id 1 udp_sport 5000 \
>      udp_dport 5001 encap udp local 127.0.0.1 remote 127.0.0.1
>    # ip l2tp add session tunnel_id 1 session_id 1 peer_session_id 1
>    # ip addr add 192.168.2.1/32 peer 192.168.2.2/32 dev l2tpeth0
>    # ifconfig l2tpeth0 up
>    # qemu -netdev l2tpv3,src=127.0.0.1,dst=127.0.0.1,\
>                   srcport=5001,dstport=5000,udp=on,...
>
> (incomplete but should be close)

Understood, will do as well as basic router commands.

>
>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>> index 4854a14..160214e 100644
>> --- a/net/Makefile.objs
>> +++ b/net/Makefile.objs
>> @@ -2,6 +2,7 @@ common-obj-y = net.o queue.o checksum.o util.o hub.o
>>   common-obj-y += socket.o
>>   common-obj-y += dump.o
>>   common-obj-y += eth.o
>> +common-obj-$(CONFIG_LINUX) += l2tpv3.o
>>   common-obj-$(CONFIG_POSIX) += tap.o
>>   common-obj-$(CONFIG_LINUX) += tap-linux.o
>>   common-obj-$(CONFIG_WIN32) += tap-win32.o
>> diff --git a/net/clients.h b/net/clients.h
>> index 7793294..bbf177c 100644
>> --- a/net/clients.h
>> +++ b/net/clients.h
>> @@ -47,6 +47,8 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
>>   int net_init_bridge(const NetClientOptions *opts, const char *name,
>>                       NetClientState *peer);
>>   
>> +int net_init_l2tpv3(const NetClientOptions *opts, const char *name,
>> +                    NetClientState *peer);
>>   #ifdef CONFIG_VDE
>>   int net_init_vde(const NetClientOptions *opts, const char *name,
>>                    NetClientState *peer);
>> diff --git a/net/l2tpv3.c b/net/l2tpv3.c
>> new file mode 100644
>> index 0000000..3348b5d
>> --- /dev/null
>> +++ b/net/l2tpv3.c
> Please run patches through scripts/checkpatch.pl before submitting them.
> See here for how to set up git to automatically check your code as you
> commit:
> http://blog.vmsplice.net/2011/03/how-to-automatically-run-checkpatchpl.html

Eric already asked me to do that, next submission will be processed this 
way.

>
>> @@ -0,0 +1,554 @@
>> +/*
>> + * QEMU System Emulator
>> + *
>> + * Copyright (c) 2003-2008 Fabrice Bellard
>> + * Copyright (c) 2012-2014 Cisco Systems
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>> + * of this software and associated documentation files (the "Software"), to deal
>> + * in the Software without restriction, including without limitation the rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>> + * THE SOFTWARE.
>> + */
>> +
>> +#include <linux/ip.h>
>> +#include <netdb.h>
>> +#include "config-host.h"
>> +#include "net/net.h"
>> +#include "clients.h"
>> +#include "monitor/monitor.h"
>> +#include "qemu-common.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu/option.h"
>> +#include "qemu/sockets.h"
>> +#include "qemu/iov.h"
>> +#include "qemu/main-loop.h"
>> +
>> +
>> +/* The buffer size needs to be investigated for optimum numbers and
>> + * optimum means of paging in on different systems. This size is
>> + * chosen to be sufficient to accommodate one packet with some headers
>> + */
>> +
>> +#define BUFFER_ALIGN sysconf(_SC_PAGESIZE)
>> +#define BUFFER_SIZE 2048
>> +#define IOVSIZE 2
>> +#define MAX_L2TPV3_MSGCNT 32
> Please avoid trailing whitespace, git prints a warning about it when
> applying patches.  There are occurrences throughout this file.  You can
> remove it in vi with :%s/\s*$//g
>
>> +#define MAX_L2TPV3_IOVCNT (MAX_L2TPV3_MSGCNT * IOVSIZE)
>> +
>> +#ifndef IPPROTO_L2TP
>> +#define IPPROTO_L2TP 0x73
>> +#endif
> Not needed, see below for details.
>
>> +typedef struct NetL2TPV3State {
>> +    NetClientState nc;
>> +    int fd;
>  From here...
>
>> +    int state;
>> +    unsigned int index;
>> +    unsigned int packet_len;
> ...to here should be dropped.  These fields are not used and were copied
> from another netdev implementation.
>
> (Mentioned in a previous review.)
>
>> +
>> +    /*
>> +     *	these are used for xmit - that happens packet a time
>> +     *	and for first sign of life packet (easier to parse that once)
>> +     */
>> +
>> +    uint8_t *header_buf;
>> +    struct iovec *vec;
>> +
>> +    /*
>> +     * these are used for receive - try to "eat" up to 32 packets at a time
>> +     */
>> +
>> +    struct mmsghdr *msgvec;
>> +
>> +    /*
>> +     * peer address
>> +     */
>> +
>> +    struct sockaddr_storage *dgram_dst;
>> +    uint32_t dst_size;
>> +
>> +    /*
>> +     * L2TPv3 parameters
>> +     */
>> +
>> +    uint64_t rx_cookie;
>> +    uint64_t tx_cookie;
>> +    uint32_t rx_session;
>> +    uint32_t tx_session;
>> +    uint32_t header_size;
>> +    uint32_t counter;
>> +
>> +    /*
>> +     * Precomputed offsets
>> +     */
>> +
>> +    uint32_t offset;
>> +    uint32_t cookie_offset;
>> +    uint32_t counter_offset;
>> +    uint32_t session_offset;
>> +
>> +    /* Flags */
>> +
>> +    bool ipv6;
>> +    bool udp;
>> +    bool has_counter;
>> +    bool cookie;
>> +    bool cookie_is_64;
>> +
>> +} NetL2TPV3State;
>> +
>> +typedef struct NetL2TPV3ListenState {
>> +    NetClientState nc;
>> +    char *model;
>> +    char *name;
>> +    int fd;
>> +} NetL2TPV3ListenState;
> NetL2TPV3ListenState is not used.  Please remove.
>
> (Mentioned in a previous review.)

OK.

>
>> +
>> +static int l2tpv3_form_header(NetL2TPV3State *s)
> I suggest making this void since there is no error return and callers
> don't check.

OK.

>
>> +{
>> +    uint32_t *header;
>> +    uint32_t *session;
>> +    uint64_t *cookie64;
>> +    uint32_t *cookie32;
>> +    uint32_t *counter;
>> +
>> +    if (s->udp) {
>> +	header = (uint32_t *) s->header_buf;
>> +	stl_be_p(header, 0x30000);
> I guess the 3 is the protocol version number.  A comment about magic
> number would help.

No. It signifies a data packet. Agree - a constant will be helpful.

>
>> +    }
> QEMU coding style is 4-space indentation.  Please reindent the code.

I thought I did, I will re-run the source through a few perl one liners 
to make sure it is all indented correctly.

>
>> +    session = (uint32_t *) (s->header_buf + s->session_offset);
>> +    stl_be_p(session, s->tx_session);
>> +
>> +    if (s->cookie) {
>> +	if (s->cookie_is_64) {
>> +	    cookie64 = (uint64_t *)(s->header_buf + s->cookie_offset);
>> +	    stq_be_p(cookie64, s->tx_cookie);
>> +	} else {
>> +	    cookie32 = (uint32_t *) (s->header_buf + s->cookie_offset);
>> +	    stl_be_p(cookie32, s->tx_cookie);
>> +	}
>> +    }
>> +
>> +    if (s->has_counter) {
>> +	counter = (uint32_t *)(s->header_buf + s->counter_offset);
>> +	stl_be_p(counter, ++s->counter);
>> +    }
>> +    return 0;
>> +}
>> +
>> +static ssize_t net_l2tpv3_receive_dgram_iov(NetClientState *nc,
>> +				    const struct iovec *iov,
>> +				    int iovcnt)
>> +{
>> +    NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +
>> +    struct msghdr message;
>> +    int ret;
>> +
>> +    if (iovcnt > MAX_L2TPV3_IOVCNT - 1) {
>> +	error_report("iovec too long %d > %d, change l2tpv3.h\n", iovcnt, MAX_L2TPV3_IOVCNT);
> error_report() messages should not include a newline.

Ok. Understood.

>
>> +	return -1;
>> +    }
>> +    l2tpv3_form_header(s);
>> +    memcpy(s->vec + 1, iov, iovcnt * sizeof(struct iovec));
>> +    s->vec->iov_base = s->header_buf;
>> +    s->vec->iov_len = s->offset;
>> +    message.msg_name = s->dgram_dst;
>> +    message.msg_namelen = s->dst_size;
>> +    message.msg_iov = (struct iovec *) s->vec;
> No need to cast here.
>
>> +    message.msg_iovlen = iovcnt + 1;
>> +    message.msg_control = NULL;
>> +    message.msg_controllen = 0;
>> +    message.msg_flags = 0;
>> +    do {
>> +	ret = sendmsg(s->fd, &message, 0);
>> +    } while ((ret == -1) && (errno == EINTR));
>> +    if (ret > 0) {
>> +	ret -= s->offset;
>> +    } else if (ret == 0) {
>> +	ret = iov_size(iov, iovcnt);
>> +    }
>> +    return ret;
> If the socket buffer (in the host kernel) is full this will drop
> packets.  The other netdev implementations return 0 and wait until the
> file descriptor becomes writable again (see net/tap.c or net/socket.c).
>
> Please either implement the same approach or explain why you disagree.

Will implement the correct approach (it did not quite work with one of 
our orginal features which we removed - listen-only mode). Probably me 
returning incorrect error code.

>
> (Mentioned in previous review.)
>
>> +}
>> +
>> +static ssize_t net_l2tpv3_receive_dgram(NetClientState *nc,
>> +					const uint8_t *buf,
>> +					size_t size)
>> +{
>> +    NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +
>> +    struct iovec * vec;
>> +    struct msghdr message;
>> +    ssize_t ret = 0;
>> +
>> +    l2tpv3_form_header(s);
>> +    vec = s->vec;
>> +    vec->iov_base = s->header_buf;
>> +    vec->iov_len = s->offset;
>> +    vec++;
>> +    vec->iov_base = (void *) buf;
> Cast is unnecessary.  In C any pointer casts to void* and vice versa
> without warnings.

Bad habits die hard :)

>
>> +    vec->iov_len = size;
>> +    message.msg_name = s->dgram_dst;
>> +    message.msg_namelen = s->dst_size;
>> +    message.msg_iov = (struct iovec *) s->vec;
> Cast is unnecessary.
>
>> +    message.msg_iovlen = 2;
>> +    message.msg_control = NULL;
>> +    message.msg_controllen = 0;
>> +    message.msg_flags = 0;
>> +    do {
>> +	ret = sendmsg(s->fd, &message, 0);
>> +    } while ((ret == -1) && (errno == EINTR));
>> +    if (ret > 0) {
>> +	ret -= s->offset;
>> +    } else if (ret == 0) {
>> +	ret = size;
>> +    }
>> +    return ret;
>> +}
>> +
>> +static int l2tpv3_verify_header(NetL2TPV3State *s,
>> +				    uint8_t *buf)
>> +{
>> +
>> +    uint64_t *cookie64;
>> +    uint32_t *cookie32;
>> +    uint32_t *session;
>> +
>> +    if ((!s->udp) && (!s->ipv6)){
>> +	buf += sizeof(struct iphdr) /* fix for ipv4 raw */;
>> +    }
>> +    if (s->cookie) {
>> +	if (s->cookie_is_64) {
>> +	    /* 64 bit cookie */
>> +	    cookie64 = (uint64_t *)(buf + s->cookie_offset);
>> +	    if (ldq_be_p(cookie64) != s->rx_cookie) {
>> +		error_report("unknown cookie id\n");
>> +		return -1;
>> +	    }
>> +	} else {
>> +	    cookie32 = (uint32_t *)(buf + s->cookie_offset);
>> +	    if (ldl_be_p(cookie32) != * (uint32_t *) &s->rx_cookie) {
> Ouch, the rx_cookie cast assumes little-endian host!  See below for a
> cleaner solution.
>
>> +		error_report("unknown cookie id\n");
>> +		return -1 ;
>> +	    }
>> +	}
> The casting and code duplication is not necessary:
>
> uint64_t cookie;
> if (s->cookie_is_64) {
>      cookie = ldq_be_p(buf + s->cookie_offset);
> } else {
>      cookie = ldl_be_p(buf + s->cookie_offset);
> }
> if (cookie != s->rx_cookie) {
>      error_report("unknown cookie id\n");
>      return -1;
> }

Ok, I read the source of ldq and ldl. I probably misunderstood the finer 
points for non-x86.

>
>> +    }
>> +    session = (uint32_t *) (buf + s->session_offset);
>> +    if (ldl_be_p(session) != s->rx_session) {
>> +	error_report("session mismatch\n");
>> +	return -1;
>> +    }	
>> +    return 0;
>> +}
>> +
>> +static void net_l2tpv3_send(void *opaque)
>> +{
>> +    NetL2TPV3State *s = opaque;
>> +
>> +    int i, count, offset;
>> +    struct mmsghdr * msgvec;
>> +    struct iovec * vec;
>> +
>> +    msgvec = s->msgvec;
>> +    offset = s->offset;
>> +    if ((!s->udp) && (!s->ipv6)){
>> +	offset +=   sizeof(struct iphdr);
>> +    }
>> +    count = recvmmsg(s->fd, msgvec, MAX_L2TPV3_MSGCNT, MSG_DONTWAIT, NULL);
>> +    for (i=0;i<count;i++) {
> QEMU coding style uses spaces:
> for (i = 0; i < count; i++) {

Understood will re-format.

>
>> +	if (msgvec->msg_len > 0) {
>> +	    vec = msgvec->msg_hdr.msg_iov;
>> +	    if (
>> +		(msgvec->msg_len > offset) &&
>> +		(l2tpv3_verify_header(s, vec->iov_base) == 0)
>> +		) {
> if ((msgvec->msg_len > offset) &&
>      (l2tcpv3_verify_header(s, vec->iov_base) == 0))
>
>> +		vec++;
>> +		qemu_send_packet(&s->nc, vec->iov_base, msgvec->msg_len - offset);
>> +	    } else {
>> +		error_report("l2tpv3 header verification failed\n");
>> +		vec++;
> vec isn't used after this so the increment can be removed.

Correct. It was there because originally the code was resetting 
iovec_len. However, we found that it is not modified by recvmmsg so it 
is unnecessary.
>
>> +	    }
>> +	}
>> +	msgvec++;
>> +    }
>> +}
>> +
>> +static void destroy_vector(struct mmsghdr * msgvec, int count, int iovcount)
>> +{
>> +    int i, j;
>> +    struct iovec * iov;
>> +    struct mmsghdr * cleanup = msgvec;
>> +    if (cleanup) {
>> +	for (i=0;i<count;i++) {
>> +	    if (cleanup->msg_hdr.msg_iov) {
>> +		iov = cleanup->msg_hdr.msg_iov;
>> +		for (j=0;j<iovcount;j++) {
>> +		    if (iov->iov_base) {
>> +			g_free(iov->iov_base);
>> +		    }
> g_free(NULL) is a nop so you don't need to check iov->iov_base before
> calling g_free().

Artefact of porting code from "regular" malloc/free, apologies, will 
take into account.

>
>> +		    iov++;
>> +		}
>> +		g_free(cleanup->msg_hdr.msg_iov);
>> +	    }
>> +	    cleanup++;
>> +	}
>> +	g_free(msgvec);
>> +    }
>> +}
>> +
>> +static struct mmsghdr * build_l2tpv3_vector(NetL2TPV3State *s, int count)
>> +{
>> +    int i;
>> +    struct iovec * iov;
>> +    struct mmsghdr * msgvec, *result;
>> +
>> +    msgvec = g_malloc(sizeof(struct mmsghdr) * count);
>> +    result = msgvec;
>> +    for (i=0;i < count ;i++) {
>> +	msgvec->msg_hdr.msg_name = NULL;
>> +	msgvec->msg_hdr.msg_namelen = 0;
>> +	iov =  g_malloc(sizeof(struct iovec) * IOVSIZE);
>> +	msgvec->msg_hdr.msg_iov = iov;
>> +	if ((!s->udp) && (!s->ipv6)){
>> +	    /* fix for ipv4 raw */;
>> +	    iov->iov_base = g_malloc(s->offset + sizeof(struct iphdr));
>> +	    iov->iov_len = s->offset + sizeof (struct iphdr);
>> +	} else {
>> +	    iov->iov_base = g_malloc(s->offset);
>> +	    iov->iov_len = s->offset;
>> +	}
>> +	iov++ ;
>> +	iov->iov_base = qemu_memalign(BUFFER_ALIGN,BUFFER_SIZE);
>> +	iov->iov_len = BUFFER_SIZE;
>> +	msgvec->msg_hdr.msg_iovlen = 2;
>> +	msgvec->msg_hdr.msg_control = NULL;
>> +	msgvec->msg_hdr.msg_controllen = 0;
>> +	msgvec->msg_hdr.msg_flags = 0;
>> +	msgvec++;
>> +    }
>> +    return result;
>> +}
>> +
>> +static void net_l2tpv3_cleanup(NetClientState *nc)
>> +{
>> +    NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +    qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
>> +    close(s->fd);
>> +    destroy_vector(s->msgvec, MAX_L2TPV3_MSGCNT, IOVSIZE);
>> +    g_free(s->header_buf);
>> +    g_free(s->dgram_dst);
>> +}
>> +
>> +static NetClientInfo net_l2tpv3_info = {
>> +    .type = NET_CLIENT_OPTIONS_KIND_L2TPV3,
>> +    .size = sizeof(NetL2TPV3State),
>> +    .receive = net_l2tpv3_receive_dgram,
>> +    .receive_iov = net_l2tpv3_receive_dgram_iov,
>> +    .cleanup = net_l2tpv3_cleanup,
>> +};
>> +
>> +int net_init_l2tpv3(const NetClientOptions *opts,
>> +                    const char *name,
>> +                    NetClientState *peer)
>> +{
>> +
>> +
>> +    const NetdevL2TPv3Options * l2tpv3;
>> +    NetL2TPV3State *s;
>> +    NetClientState *nc;
>> +    int fd, gairet;
>> +    struct addrinfo hints;
>> +    struct addrinfo * result = NULL;
>> +    char * srcport, * dstport;
>> +
>> +    nc = qemu_new_net_client(&net_l2tpv3_info, peer, "l2tpv3", name);
>> +
>> +    s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +
>> +    assert(opts->kind == NET_CLIENT_OPTIONS_KIND_L2TPV3);
>> +    l2tpv3 = opts->l2tpv3;
>> +
>> +    if (l2tpv3->has_ipv6 && l2tpv3->ipv6) {
>> +	s->ipv6 = l2tpv3->ipv6;
>> +    } else {
>> +	s->ipv6 = false;
>> +    }
>> +
>> +    if (l2tpv3->has_rxcookie || l2tpv3->has_txcookie) {
>> +	if (l2tpv3->has_rxcookie && l2tpv3->has_txcookie) {
>> +	    s->cookie = true;
>> +	} else {
>> +	    return -1;
> This leaks nc.  I suggest adding a goto err where nc is freed.  Other
> error return paths should also free resources and can use this label.

OK.

>
>> +	}
>> +    } else {
>> +	s->cookie = false;
>> +    }
>> +
>> +    if (l2tpv3->has_cookie64 || l2tpv3->cookie64) {
>> +	s->cookie_is_64  = true;
>> +    } else {
>> +	s->cookie_is_64  = false;
>> +    }
>> +
>> +    if (l2tpv3->has_udp && l2tpv3->udp) {
>> +	s->udp = true;
>> +	if (!(l2tpv3->has_srcport && l2tpv3->has_dstport)) {
>> +	    error_report("l2tpv3_open : need both src and dst port for udp\n");
>> +	    return -1;
> Leaks nc.
>
>> +	} else {
>> +	    srcport = l2tpv3->srcport;
>> +	    dstport = l2tpv3->dstport;
>> +	}
>> +    } else {
>> +	s->udp = false;
>> +	srcport = NULL;
>> +	dstport = NULL;
>> +    }
>> +
>> +
>> +    s->offset = 4;
>> +    s->session_offset = 0;
>> +    s->cookie_offset = 4;
>> +    s->counter_offset = 4;
>> +
>> +    s->tx_session = l2tpv3->txsession;
>> +    if (l2tpv3->has_rxsession) {
>> +	s->rx_session = l2tpv3->rxsession;
>> +    } else {
>> +	s->rx_session = s->tx_session;
>> +    }
>> +
>> +    if (s->cookie) {
>> +	s->rx_cookie = l2tpv3->rxcookie;
>> +	s->tx_cookie = l2tpv3->txcookie;
>> +	if (s->cookie_is_64 == true) {
>> +	    /* 64 bit cookie */
>> +	    s->offset += 8;
>> +	    s->counter_offset += 8;
>> +	} else {
>> +	    /* 32 bit cookie */
>> +	    s->offset += 4;
>> +	    s->counter_offset +=4;
>> +	}
>> +    }
>> +
>> +    memset(&hints, 0, sizeof(hints));
>> +
>> +    if (s->ipv6) {
>> +	hints.ai_family = AF_INET6;
>> +    } else {
>> +	hints.ai_family = AF_INET;
>> +    }
>> +    if (s->udp) {
>> +	hints.ai_socktype = SOCK_DGRAM;
>> +	hints.ai_protocol = 0;
>> +	s->offset += 4;
>> +	s->counter_offset += 4;
>> +	s->session_offset += 4;
>> +	s->cookie_offset += 4;
>> +    } else {
>> +	hints.ai_socktype = SOCK_RAW;
>> +	hints.ai_protocol = IPPROTO_L2TP;
>> +    }
>> +
>> +    gairet= getaddrinfo(l2tpv3->src, srcport, &hints, &result);
>> +
>> +    if ((gairet !=0) || (result == NULL)) {
>> +	error_report("l2tpv3_open : could not resolve src, errno = %s\n", gai_strerror(gairet));
> Leaks nc.
>
>> +	return -1;
>> +    }
>> +
>> +    if ((fd = socket(result->ai_family, result->ai_socktype, result->ai_protocol)) == -1) {
>> +	fd = -errno;
>> +	error_report("l2tpv3_open : socket creation failed, errno = %d\n", -fd);
>> +	freeaddrinfo(result);
> Leaks nc.
>
>> +	return fd;
>> +    }
>> +    if (bind(fd, (struct sockaddr *) result->ai_addr, result->ai_addrlen)) {
>> +	error_report("l2tpv3_open :  could not bind socket err=%i\n", errno);
>> +	close(fd);
> Leaks nc and result.
>
>> +	return -1;
>> +    }
>> +
>> +    freeaddrinfo(result);
>> +
>> +    memset(&hints, 0, sizeof(hints));
>> +
>> +    if (s->ipv6) {
>> +	hints.ai_family = AF_INET6;
>> +    } else {
>> +	hints.ai_family = AF_INET;
>> +    }
>> +    if (s->udp) {
>> +	hints.ai_socktype = SOCK_DGRAM;
>> +	hints.ai_protocol = 0;
>> +    } else {
>> +	hints.ai_socktype = SOCK_RAW;
>> +	hints.ai_protocol = IPPROTO_L2TP;
> Hang on, this is bogus.  This is a *userspace* L2TP implementation!
>
> We don't want a kernel L2TP driver to handle this socket.  Luckily this
> never happens anyway since net/l2tp/l2tp_ip.c only registers its socket
> type for <AF_INET, SOCK_DGRAM, IPPROTO_L2TP> and <AF_INET6, SOCK_DGRAM,
> IPPROTO_L2TP>.
>
> When we create this socket with <AF_INET, SOCK_RAW, IPPROTO_L2TP> what
> really happens is that the kernel falls back to the IPv4 raw socket
> driver due to a wildcard match.
>
> In other words, we shouldn't use IPPROTO_L2TP.  Just use 0.

OK. Thanks. Will fix, reread the kernel code to make sure we never clash 
with it and retest.

>
>> +    }
>> +
>> +    gairet= getaddrinfo(l2tpv3->dst, dstport, &hints, &result);
>> +    if ((gairet !=0) || (result == NULL)) {
>> +	error_report("l2tpv3_open : could not resolve dst, error = %s\n", gai_strerror(gairet));
>> +	return -1;
> More leaks.
>
>> +    }
>> +
>> +    s->dgram_dst = g_malloc(sizeof(struct sockaddr_storage));
>> +    memset(s->dgram_dst, '\0' , sizeof(struct sockaddr_storage));
>> +    memcpy(s->dgram_dst, result->ai_addr, result->ai_addrlen);
> Why is this field heap-allocated?  It doesn't need to be a pointer and
> then you can avoid the g_malloc()/g_free().

It does not need to be. Habits.

>
>> +    s->dst_size = result->ai_addrlen;
>> +
>> +    freeaddrinfo(result);
>> +
>> +    if (l2tpv3->has_counter && l2tpv3->counter) {
>> +	s->has_counter = true;
>> +	s->offset += 4;
>> +    } else {
>> +	s->has_counter = false;
>> +    }
>> +
>> +    if (l2tpv3->has_offset) {
>> +	/* extra offset */
>> +	s->offset += l2tpv3->offset;
>> +    }
>> +
>> +    s->msgvec = build_l2tpv3_vector(s, MAX_L2TPV3_MSGCNT);
>> +    s->vec = g_malloc(sizeof(struct iovec) * MAX_L2TPV3_IOVCNT);
> It's up to you if you want to bother heap-allocating this.  Seems
> simpler to embed it in the struct:
> struct iovec vec[MAX_L2TPV3_IOVCNT];
>
>> +    if ((!s->udp) && (!s->ipv6)){
>> +	s->header_buf = g_malloc(s->offset + sizeof (struct iphdr));
>> +    } else {
>> +	s->header_buf = g_malloc(s->offset);
>> +    }
>> +
>> +    qemu_set_nonblock(fd);
>> +
>> +    if (fd < 0)
>> +	return -1;
> QEMU coding style always uses curlies:
> if (fd < 0) {
>      return -1;
> }
>
> Anyway, this check is redundant because we already checked when creating
> the socket.
>
>> +
>> +    s->fd = fd;
>> +    s->counter = 0;
>> +
>> +    qemu_set_fd_handler(s->fd, net_l2tpv3_send, NULL, s);
>> +
>> +    if (!s) {
>> +	error_report("l2tpv3_open : failed to set fd handler\n");
>> +	return -1;
>> +    }
> Bogus check.  s is never NULL here.
>
>> +    snprintf(s->nc.info_str, sizeof(s->nc.info_str),
>> +             "l2tpv3: connected");
>> +    return 0;
>> +}
>> +
>> diff --git a/net/net.c b/net/net.c
>> index 0a88e68..749d34c 100644
>> --- a/net/net.c
>> +++ b/net/net.c
>> @@ -731,6 +731,9 @@ static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
>>           [NET_CLIENT_OPTIONS_KIND_BRIDGE]    = net_init_bridge,
>>   #endif
>>           [NET_CLIENT_OPTIONS_KIND_HUBPORT]   = net_init_hubport,
>> +#ifdef CONFIG_LINUX
>> +        [NET_CLIENT_OPTIONS_KIND_L2TPV3]    = net_init_l2tpv3,
>> +#endif
>>   };
>>   
>>   
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 83fa485..7d9f69f 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -2940,6 +2940,57 @@
>>       '*localaddr': 'str',
>>       '*udp':       'str' } }
>>   
>> +# @NetdevL2TPv3Options
>> +#
>> +# Connect the VLAN to Ethernet over L2TPv3 Static tunnel
>> +#
>> +# @src :source address
>> +#
>> +# @dst :destination address
>> +#
>> +# @srcport :#optional source port - mandatory for udp, optional for ip
>> +#
>> +# @dstport :#optional destination port - mandatory for udp, optional for ip
>> +#
>> +# @ipv6 :#optional - force the use of ipv6
>> +#
>> +# @udp :#optional - use the udp version of l2tpv3 encapsulation
>> +#
>> +# @cookie64:#optional - use 64 bit coookies
>> +#
>> +# @counter :#optional have sequence counter
>> +#
>> +# @txcookie :#optional 32 or 64 bit transmit cookie
>> +#
>> +# @rxcookie :#optional 32 or 64 bit receive cookie
>> +#
>> +# @txsession : 32 bit transmit session
>> +#
>> +# @rxsession : 32 bit receive session - if not specified set to the same value as transmit
>> +#
>> +# @optional : additional offset - allows the insertion of additional application-specific data before the packet payload
>> +#
>> +#
>> +# Since 1.2
>> +##
>> +##
>> +{ 'type': 'NetdevL2TPv3Options',
>> +  'data': {
>> +    'src':	  'str',
>> +    'dst':	  'str',
>> +    '*srcport':	  'str',
>> +    '*dstport':	  'str',
>> +    '*ipv6':	  'bool',
>> +    '*udp':	  'bool',
>> +    '*cookie64':  'bool',
>> +    '*counter':   'bool',
>> +    '*txcookie':  'uint64',
>> +    '*rxcookie':  'uint64',
>> +    'txsession': 'uint32',
>> +    '*rxsession': 'uint32',
>> +    '*offset':  'uint32' } }
>> +
>> +##
>>   ##
>>   # @NetdevVdeOptions
>>   #
>> @@ -3014,13 +3065,16 @@
>>   # A discriminated record of network device traits.
>>   #
>>   # Since 1.2
>> -##
>> +#
>> +# Added in 2.0 - l2tpv3
>> +#
>>   { 'union': 'NetClientOptions',
>>     'data': {
>>       'none':     'NetdevNoneOptions',
>>       'nic':      'NetLegacyNicOptions',
>>       'user':     'NetdevUserOptions',
>>       'tap':      'NetdevTapOptions',
>> +    'l2tpv3':   'NetdevL2TPv3Options',
>>       'socket':   'NetdevSocketOptions',
>>       'vde':      'NetdevVdeOptions',
>>       'dump':     'NetdevDumpOptions',
>> -- 
>> 1.7.10.4
>>

  reply	other threads:[~2014-03-06 10:28 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-05 14:12 [Qemu-devel] [PATCH] New feature - RFC3931 L2TPv3 network transport using static Ethernet over L2TPv3 tunnels anton.ivanov
2014-03-05 21:22 ` Eric Blake
2014-03-05 23:00   ` Anton Ivanov
2014-03-06  9:44 ` Stefan Hajnoczi
2014-03-06 10:28   ` Anton Ivanov [this message]
2014-03-09 17:06   ` Anton Ivanov
2014-03-10  8:35     ` Stefan Hajnoczi
2014-03-10  8:49       ` Anton Ivanov
2014-03-10 18:04         ` Stefan Hajnoczi
2014-03-10 19:14           ` Anton Ivanov (antivano)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53184DCF.4030609@kot-begemot.co.uk \
    --to=anton.ivanov@kot-begemot.co.uk \
    --cc=afaerber@suse.de \
    --cc=antivano@cisco.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).