From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43585)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1WVhBk-0007ly-P5 for qemu-devel@nongnu.org;
	Thu, 03 Apr 2014 08:56:19 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1WVhBe-0002l8-63 for qemu-devel@nongnu.org;
	Thu, 03 Apr 2014 08:56:12 -0400
Received: from alln-iport-3.cisco.com ([173.37.142.90]:42805)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1WVhBd-0002kn-J6 for qemu-devel@nongnu.org;
	Thu, 03 Apr 2014 08:56:06 -0400
From: "Anton Ivanov (antivano)"
Date: Thu, 3 Apr 2014 12:56:01 +0000
Message-ID: <533D5A5D.3070704@cisco.com>
References: <1396276759-189034-1-git-send-email-anton.ivanov@kot-begemot.co.uk>
	<20140403122039.GA4672@irqsave.net>
In-Reply-To: <20140403122039.GA4672@irqsave.net>
Content-Language: en-US
Content-Type: text/plain; charset="iso-8859-1"
Content-ID: <3365D76634A51D4EBB587ECEE82155B4@emea.cisco.com>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH v7] net: L2TPv3 transport
To: Benoît Canet
Cc: "pbonzini@redhat.com", "stefanha@redhat.com",
	"anton.ivanov@kot-begemot.co.uk", "afaerber@suse.de",
	"qemu-devel@nongnu.org"

On 03/04/14 13:20, Benoît Canet wrote:
> The Monday 31 Mar 2014 à 15:39:19 (+0100), anton.ivanov@kot-begemot.co.uk wrote :
>> From: Anton Ivanov
>>
>> This transport allows to connect a QEMU nic to a static Ethernet
>> over L2TPv3 tunnel. The transport supports all options present
>> in the Linux kernel implementation. It allows QEMU to connect
>> to any Linux host running kernel 3.3+, most routers and network
>> devices as well as other QEMU instances.
>>
>> Signed-off-by: Anton Ivanov
>> ---
> Hi,
>
> On what branch does this patch apply ?
>
> I failed to apply it on master.

v1.7.0

A.

>
> Best regards
>
> Benoît
>
>> Addressed in this release:
>>
>> 1. Back to qemu send_packet instead of direct delivery from the
>> recvmmsg ring. The zero copy off driver rx ring will be reintroduced
>> in a later patch
>>
>> 2. Fixed mismerge of header size handling from our tree
>>
>> 3. Fixed formatting
>>
>>
>>  net/Makefile.objs |    1 +
>>  net/clients.h     |    2 +
>>  net/l2tpv3.c      |  745 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  net/net.c         |    3 +
>>  qapi-schema.json  |   60 +++++
>>  qemu-options.hx   |   82 ++++++
>>  6 files changed, 893 insertions(+)
>>  create mode 100644 net/l2tpv3.c
>>
>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>> index 4854a14..160214e 100644
>> --- a/net/Makefile.objs
>> +++ b/net/Makefile.objs
>> @@ -2,6 +2,7 @@ common-obj-y = net.o queue.o checksum.o util.o hub.o
>>  common-obj-y += socket.o
>>  common-obj-y += dump.o
>>  common-obj-y += eth.o
>> +common-obj-$(CONFIG_LINUX) += l2tpv3.o
>>  common-obj-$(CONFIG_POSIX) += tap.o
>>  common-obj-$(CONFIG_LINUX) += tap-linux.o
>>  common-obj-$(CONFIG_WIN32) += tap-win32.o
>> diff --git a/net/clients.h b/net/clients.h
>> index 7793294..bbf177c 100644
>> --- a/net/clients.h
>> +++ b/net/clients.h
>> @@ -47,6 +47,8 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
>>  int net_init_bridge(const NetClientOptions *opts, const char *name,
>>                      NetClientState *peer);
>>
>> +int net_init_l2tpv3(const NetClientOptions *opts, const char *name,
>> +                    NetClientState *peer);
>>  #ifdef CONFIG_VDE
>>  int net_init_vde(const NetClientOptions *opts, const char *name,
>>                   NetClientState *peer);
>> diff --git a/net/l2tpv3.c b/net/l2tpv3.c
>> new file mode 100644
>> index 0000000..4439ab7
>> --- /dev/null
>> +++ b/net/l2tpv3.c
>> @@ -0,0 +1,745 @@
>> +/*
>> + * QEMU System Emulator
>> + *
>> + * Copyright (c) 2003-2008 Fabrice Bellard
>> + * Copyright (c) 2012-2014 Cisco Systems
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>> + * of this software and associated documentation files (the "Software"), to deal
>> + * in the Software without restriction, including without limitation the rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>> + * THE SOFTWARE.
>> + */
>> +
>> +#include
>> +#include
>> +#include "config-host.h"
>> +#include "net/net.h"
>> +#include "clients.h"
>> +#include "monitor/monitor.h"
>> +#include "qemu-common.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu/option.h"
>> +#include "qemu/sockets.h"
>> +#include "qemu/iov.h"
>> +#include "qemu/main-loop.h"
>> +
>> +
>> +/* The buffer size needs to be investigated for optimum numbers and
>> + * optimum means of paging in on different systems. This size is
>> + * chosen to be sufficient to accommodate one packet with some headers
>> + */
>> +
>> +#define BUFFER_ALIGN sysconf(_SC_PAGESIZE)
>> +#define BUFFER_SIZE 2048
>> +#define IOVSIZE 2
>> +#define MAX_L2TPV3_MSGCNT 64
>> +#define MAX_L2TPV3_IOVCNT (MAX_L2TPV3_MSGCNT * IOVSIZE)
>> +
>> +/* Header set to 0x30000 signifies a data packet */
>> +
>> +#define L2TPV3_DATA_PACKET 0x30000
>> +
>> +/* IANA-assigned IP protocol ID for L2TPv3 */
>> +
>> +#ifndef IPPROTO_L2TP
>> +#define IPPROTO_L2TP 0x73
>> +#endif
>> +
>> +typedef struct NetL2TPV3State {
>> +    NetClientState nc;
>> +    int fd;
>> +
>> +    /*
>> +     * these are used for xmit - that happens packet a time
>> +     * and for first sign of life packet (easier to parse that once)
>> +     */
>> +
>> +    uint8_t *header_buf;
>> +    struct iovec *vec;
>> +
>> +    /*
>> +     * these are used for receive - try to "eat" up to MAX_L2TPV3_MSGCNT packets at a time
>> +     */
>> +
>> +    struct mmsghdr *msgvec;
>> +
>> +    /*
>> +     * peer address
>> +     */
>> +
>> +    struct sockaddr_storage *dgram_dst;
>> +    uint32_t dst_size;
>> +
>> +    /*
>> +     * L2TPv3 parameters
>> +     */
>> +
>> +    uint64_t rx_cookie;
>> +    uint64_t tx_cookie;
>> +    uint32_t rx_session;
>> +    uint32_t tx_session;
>> +    uint32_t header_size;
>> +    uint32_t counter;
>> +
>> +    /*
>> +     * DOS avoidance in error handling
>> +     */
>> +
>> +    bool header_mismatch;
>> +
>> +    /*
>> +     * Ring buffer handling
>> +     */
>> +
>> +    int queue_head;
>> +    int queue_tail;
>> +    int queue_depth;
>> +
>> +    /*
>> +     * Precomputed offsets
>> +     */
>> +
>> +    uint32_t offset;
>> +    uint32_t cookie_offset;
>> +    uint32_t counter_offset;
>> +    uint32_t session_offset;
>> +
>> +    /* Poll Control */
>> +
>> +    bool read_poll;
>> +    bool write_poll;
>> +
>> +    /* Flags */
>> +
>> +    bool ipv6;
>> +    bool udp;
>> +    bool has_counter;
>> +    bool pin_counter;
>> +    bool cookie;
>> +    bool cookie_is_64;
>> +
>> +} NetL2TPV3State;
>> +
>> +static int l2tpv3_can_send(void *opaque);
>> +static void net_l2tpv3_send(void *opaque);
>> +static void l2tpv3_writable(void *opaque);
>> +
>> +static void l2tpv3_update_fd_handler(NetL2TPV3State *s)
>> +{
>> +    qemu_set_fd_handler2(s->fd,
>> +                         s->read_poll ? l2tpv3_can_send : NULL,
>> +                         s->read_poll ? net_l2tpv3_send : NULL,
>> +                         s->write_poll ? l2tpv3_writable : NULL,
>> +                         s);
>> +}
>> +
>> +static void l2tpv3_read_poll(NetL2TPV3State *s, bool enable)
>> +{
>> +    if (s->read_poll != enable) {
>> +        s->read_poll = enable;
>> +        l2tpv3_update_fd_handler(s);
>> +    }
>> +}
>> +
>> +static void l2tpv3_write_poll(NetL2TPV3State *s, bool enable)
>> +{
>> +    if (s->write_poll != enable) {
>> +        s->write_poll = enable;
>> +        l2tpv3_update_fd_handler(s);
>> +    }
>> +}
>> +
>> +static void l2tpv3_writable(void *opaque)
>> +{
>> +    NetL2TPV3State *s = opaque;
>> +    l2tpv3_write_poll(s, false);
>> +    qemu_flush_queued_packets(&s->nc);
>> +}
>> +
>> +static int l2tpv3_can_send(void *opaque)
>> +{
>> +    NetL2TPV3State *s = opaque;
>> +
>> +    return qemu_can_send_packet(&s->nc);
>> +}
>> +
>> +static void l2tpv3_send_completed(NetClientState *nc, ssize_t len)
>> +{
>> +    NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +    l2tpv3_read_poll(s, true);
>> +}
>> +
>> +static void l2tpv3_poll(NetClientState *nc, bool enable)
>> +{
>> +    NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +    l2tpv3_write_poll(s, enable);
>> +    l2tpv3_read_poll(s, enable);
>> +}
>> +
>> +static void l2tpv3_form_header(NetL2TPV3State *s)
>> +{
>> +    uint32_t *counter;
>> +
>> +    if (s->udp) {
>> +        stl_be_p((uint32_t *) s->header_buf, L2TPV3_DATA_PACKET);
>> +    }
>> +    stl_be_p(
>> +            (uint32_t *) (s->header_buf + s->session_offset),
>> +            s->tx_session
>> +        );
>> +    if (s->cookie) {
>> +        if (s->cookie_is_64) {
>> +            stq_be_p(
>> +                (uint64_t *)(s->header_buf + s->cookie_offset),
>> +                s->tx_cookie
>> +            );
>> +        } else {
>> +            stl_be_p(
>> +                (uint32_t *) (s->header_buf + s->cookie_offset),
>> +                s->tx_cookie
>> +            );
>> +        }
>> +    }
>> +    if (s->has_counter) {
>> +        counter = (uint32_t *)(s->header_buf + s->counter_offset);
>> +        if (s->pin_counter) {
>> +            *counter = 0;
>> +        } else {
>> +            stl_be_p(counter, ++s->counter);
>> +        }
>> +    }
>> +}
>> +
>> +static ssize_t net_l2tpv3_receive_dgram_iov(NetClientState *nc,
>> +                                            const struct iovec *iov,
>> +                                            int iovcnt)
>> +{
>> +    NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +
>> +    struct msghdr message;
>> +    int ret;
>> +
>> +    if (iovcnt > MAX_L2TPV3_IOVCNT - 1) {
>> +        error_report(
>> +            "iovec too long %d > %d, change l2tpv3.h",
>> +            iovcnt, MAX_L2TPV3_IOVCNT
>> +        );
>> +        return -1;
>> +    }
>> +    l2tpv3_form_header(s);
>> +    memcpy(s->vec + 1, iov, iovcnt * sizeof(struct iovec));
>> +    s->vec->iov_base = s->header_buf;
>> +    s->vec->iov_len = s->offset;
>> +    message.msg_name = s->dgram_dst;
>> +    message.msg_namelen = s->dst_size;
>> +    message.msg_iov = s->vec;
>> +    message.msg_iovlen = iovcnt + 1;
>> +    message.msg_control = NULL;
>> +    message.msg_controllen = 0;
>> +    message.msg_flags = 0;
>> +    do {
>> +        ret = sendmsg(s->fd, &message, 0);
>> +    } while ((ret == -1) && (errno == EINTR));
>> +    if (ret > 0) {
>> +        ret -= s->offset;
>> +    } else if (ret == 0) {
>> +        /* belt and braces - should not occur on DGRAM
>> +         * we should get an error and never a 0 send
>> +         */
>> +        ret = iov_size(iov, iovcnt);
>> +    } else {
>> +        /* signal upper layer that socket buffer is full */
>> +        ret = -errno;
>> +        if (ret == -EAGAIN || ret == -ENOBUFS) {
>> +            l2tpv3_write_poll(s, true);
>> +            ret = 0;
>> +        }
>> +    }
>> +    return ret;
>> +}
>> +
>> +static ssize_t net_l2tpv3_receive_dgram(NetClientState *nc,
>> +                                        const uint8_t *buf,
>> +                                        size_t size)
>> +{
>> +    NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +
>> +    struct iovec *vec;
>> +    struct msghdr message;
>> +    ssize_t ret = 0;
>> +
>> +    l2tpv3_form_header(s);
>> +    vec = s->vec;
>> +    vec->iov_base = s->header_buf;
>> +    vec->iov_len = s->offset;
>> +    vec++;
>> +    vec->iov_base = (void *) buf;
>> +    vec->iov_len = size;
>> +    message.msg_name = s->dgram_dst;
>> +    message.msg_namelen = s->dst_size;
>> +    message.msg_iov = s->vec;
>> +    message.msg_iovlen = 2;
>> +    message.msg_control = NULL;
>> +    message.msg_controllen = 0;
>> +    message.msg_flags = 0;
>> +    do {
>> +        ret = sendmsg(s->fd, &message, 0);
>> +    } while ((ret == -1) && (errno == EINTR));
>> +    if (ret > 0) {
>> +        ret -= s->offset;
>> +    } else if (ret == 0) {
>> +        /* belt and braces - should not occur on DGRAM
>> +         * we should get an error and never a 0 send
>> +         */
>> +        ret = size;
>> +    } else {
>> +        ret = -errno;
>> +        if (ret == -EAGAIN || ret == -ENOBUFS) {
>> +            /* signal upper layer that socket buffer is full */
>> +            l2tpv3_write_poll(s, true);
>> +            ret = 0;
>> +        }
>> +    }
>> +    return ret;
>> +}
>> +
>> +static int l2tpv3_verify_header(NetL2TPV3State *s, uint8_t *buf)
>> +{
>> +
>> +    uint32_t *session;
>> +    uint64_t cookie;
>> +
>> +    if ((!s->udp) && (!s->ipv6)) {
>> +        buf += sizeof(struct iphdr) /* fix for ipv4 raw */;
>> +    }
>> +
>> +    /* we do not do a strict check for "data" packets as per
>> +     * the RFC spec because the pure IP spec does not have
>> +     * that anyway.
>> +     */
>> +
>> +    if (s->cookie) {
>> +        if (s->cookie_is_64) {
>> +            cookie = ldq_be_p(buf + s->cookie_offset);
>> +        } else {
>> +            cookie = ldl_be_p(buf + s->cookie_offset);
>> +        }
>> +        if (cookie != s->rx_cookie) {
>> +            if (!s->header_mismatch) {
>> +                error_report("unknown cookie id");
>> +            }
>> +            return -1;
>> +        }
>> +    }
>> +    session = (uint32_t *) (buf + s->session_offset);
>> +    if (ldl_be_p(session) != s->rx_session) {
>> +        if (!s->header_mismatch) {
>> +            error_report("session mismatch");
>> +        }
>> +        return -1;
>> +    }
>> +    return 0;
>> +}
>> +
>> +static void net_l2tpv3_process_queue(NetL2TPV3State *s)
>> +{
>> +    int size = 0;
>> +    struct iovec *vec;
>> +    bool bad_read;
>> +    int data_size;
>> +    struct mmsghdr *msgvec;
>> +
>> +    /* go into ring mode only if there is a "pending" tail */
>> +    if (s->queue_depth > 0) {
>> +        do {
>> +            msgvec = s->msgvec + s->queue_tail;
>> +            if (msgvec->msg_len > 0) {
>> +                data_size = msgvec->msg_len - s->header_size;
>> +                vec = msgvec->msg_hdr.msg_iov;
>> +                if ((data_size > 0) &&
>> +                    (l2tpv3_verify_header(s, vec->iov_base) == 0)) {
>> +                    vec++;
>> +                    /* Use the legacy delivery for now, we will
>> +                     * switch to using our own ring as a queueing mechanism
>> +                     * at a later date
>> +                     */
>> +                    size = qemu_send_packet_async(
>> +                            &s->nc,
>> +                            vec->iov_base,
>> +                            data_size,
>> +                            l2tpv3_send_completed
>> +                        );
>> +                    bad_read = false;
>> +                } else {
>> +                    bad_read = true;
>> +                    if (!s->header_mismatch) {
>> +                        /* report error only once */
>> +                        error_report("l2tpv3 header verification failed");
>> +                        s->header_mismatch = true;
>> +                    }
>> +                }
>> +            } else {
>> +                bad_read = true;
>> +            }
>> +            if ((bad_read) || (size > 0)) {
>> +                s->queue_tail = (s->queue_tail + 1) % MAX_L2TPV3_MSGCNT;
>> +                s->queue_depth--;
>> +            }
>> +        } while (
>> +                (s->queue_depth > 0) &&
>> +                 qemu_can_send_packet(&s->nc) &&
>> +                ((size > 0) || bad_read)
>> +            );
>> +    }
>> +}
>> +
>> +static void net_l2tpv3_send(void *opaque)
>> +{
>> +    NetL2TPV3State *s = opaque;
>> +    int target_count, count;
>> +    struct mmsghdr *msgvec;
>> +
>> +    /* go into ring mode only if there is a "pending" tail */
>> +
>> +    if (s->queue_depth) {
>> +
>> +        /* The ring buffer we use has variable intake
>> +         * count of how much we can read varies - adjust accordingly
>> +         */
>> +
>> +        target_count = MAX_L2TPV3_MSGCNT - s->queue_depth;
>> +
>> +        /* Ensure we do not overrun the ring when we have
>> +         * a lot of enqueued packets
>> +         */
>> +
>> +        if (s->queue_head + target_count > MAX_L2TPV3_MSGCNT) {
>> +            target_count = MAX_L2TPV3_MSGCNT - s->queue_head;
>> +        }
>> +    } else {
>> +
>> +        /* we do not have any pending packets - we can use
>> +         * the whole message vector linearly instead of using
>> +         * it as a ring
>> +         */
>> +
>> +        s->queue_head = 0;
>> +        s->queue_tail = 0;
>> +        target_count = MAX_L2TPV3_MSGCNT;
>> +    }
>> +
>> +    msgvec = s->msgvec + s->queue_head;
>> +    if (target_count > 0) {
>> +        do {
>> +            count = recvmmsg(
>> +                s->fd,
>> +                msgvec,
>> +                target_count, MSG_DONTWAIT, NULL);
>> +        } while ((count == -1) && (errno == EINTR));
>> +        if (count < 0) {
>> +            /* Recv error - we still need to flush packets here,
>> +             * (re)set queue head to current position
>> +             */
>> +            count = 0;
>> +        }
>> +        s->queue_head = (s->queue_head + count) % MAX_L2TPV3_MSGCNT;
>> +        s->queue_depth += count;
>> +    }
>> +    net_l2tpv3_process_queue(s);
>> +}
>> +
>> +static void destroy_vector(struct mmsghdr *msgvec, int count, int iovcount)
>> +{
>> +    int i, j;
>> +    struct iovec *iov;
>> +    struct mmsghdr *cleanup = msgvec;
>> +    if (cleanup) {
>> +        for (i = 0; i < count; i++) {
>> +            if (cleanup->msg_hdr.msg_iov) {
>> +                iov = cleanup->msg_hdr.msg_iov;
>> +                for (j = 0; j < iovcount; j++) {
>> +                    g_free(iov->iov_base);
>> +                    iov++;
>> +                }
>> +                g_free(cleanup->msg_hdr.msg_iov);
>> +            }
>> +            cleanup++;
>> +        }
>> +        g_free(msgvec);
>> +    }
>> +}
>> +
>> +static struct mmsghdr *build_l2tpv3_vector(NetL2TPV3State *s, int count)
>> +{
>> +    int i;
>> +    struct iovec *iov;
>> +    struct mmsghdr *msgvec, *result;
>> +
>> +    msgvec = g_malloc(sizeof(struct mmsghdr) * count);
>> +    result = msgvec;
>> +    for (i = 0; i < count ; i++) {
>> +        msgvec->msg_hdr.msg_name = NULL;
>> +        msgvec->msg_hdr.msg_namelen = 0;
>> +        iov = g_malloc(sizeof(struct iovec) * IOVSIZE);
>> +        msgvec->msg_hdr.msg_iov = iov;
>> +        iov->iov_base = g_malloc(s->header_size);
>> +        iov->iov_len = s->header_size;
>> +        iov++ ;
>> +        iov->iov_base = qemu_memalign(BUFFER_ALIGN, BUFFER_SIZE);
>> +        iov->iov_len = BUFFER_SIZE;
>> +        msgvec->msg_hdr.msg_iovlen = 2;
>> +        msgvec->msg_hdr.msg_control = NULL;
>> +        msgvec->msg_hdr.msg_controllen = 0;
>> +        msgvec->msg_hdr.msg_flags = 0;
>> +        msgvec++;
>> +    }
>> +    return result;
>> +}
>> +
>> +static void net_l2tpv3_cleanup(NetClientState *nc)
>> +{
>> +    NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +    qemu_purge_queued_packets(nc);
>> +    l2tpv3_read_poll(s, false);
>> +    l2tpv3_write_poll(s, false);
>> +    close(s->fd);
>> +    destroy_vector(s->msgvec, MAX_L2TPV3_MSGCNT, IOVSIZE);
>> +    g_free(s->header_buf);
>> +    g_free(s->dgram_dst);
>> +}
>> +
>> +static NetClientInfo net_l2tpv3_info = {
>> +    .type = NET_CLIENT_OPTIONS_KIND_L2TPV3,
>> +    .size = sizeof(NetL2TPV3State),
>> +    .receive = net_l2tpv3_receive_dgram,
>> +    .receive_iov = net_l2tpv3_receive_dgram_iov,
>> +    .poll = l2tpv3_poll,
>> +    .cleanup = net_l2tpv3_cleanup,
>> +};
>> +
>> +int net_init_l2tpv3(const NetClientOptions *opts,
>> +                    const char *name,
>> +                    NetClientState *peer)
>> +{
>> +
>> +
>> +    const NetdevL2TPv3Options *l2tpv3;
>> +    NetL2TPV3State *s;
>> +    NetClientState *nc;
>> +    int fd = -1, gairet;
>> +    struct addrinfo hints;
>> +    struct addrinfo *result = NULL;
>> +    char *srcport, *dstport;
>> +
>> +    nc = qemu_new_net_client(&net_l2tpv3_info, peer, "l2tpv3", name);
>> +
>> +    s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +
>> +    s->queue_head = 0;
>> +    s->queue_tail = 0;
>> +    s->header_mismatch = false;
>> +
>> +    assert(opts->kind == NET_CLIENT_OPTIONS_KIND_L2TPV3);
>> +    l2tpv3 = opts->l2tpv3;
>> +
>> +    if (l2tpv3->has_ipv6 && l2tpv3->ipv6) {
>> +        s->ipv6 = l2tpv3->ipv6;
>> +    } else {
>> +        s->ipv6 = false;
>> +    }
>> +
>> +    if (l2tpv3->has_rxcookie || l2tpv3->has_txcookie) {
>> +        if (l2tpv3->has_rxcookie && l2tpv3->has_txcookie) {
>> +            s->cookie = true;
>> +        } else {
>> +            goto outerr;
>> +        }
>> +    } else {
>> +        s->cookie = false;
>> +    }
>> +
>> +    if (l2tpv3->has_cookie64 && l2tpv3->cookie64) {
>> +        s->cookie_is_64 = true;
>> +    } else {
>> +        s->cookie_is_64 = false;
>> +    }
>> +
>> +    if (l2tpv3->has_udp && l2tpv3->udp) {
>> +        s->udp = true;
>> +        if (!(l2tpv3->has_srcport && l2tpv3->has_dstport)) {
>> +            error_report("l2tpv3_open : need both src and dst port for udp");
>> +            goto outerr;
>> +        } else {
>> +            srcport = l2tpv3->srcport;
>> +            dstport = l2tpv3->dstport;
>> +        }
>> +    } else {
>> +        s->udp = false;
>> +        srcport = NULL;
>> +        dstport = NULL;
>> +    }
>> +
>> +
>> +    s->offset = 4;
>> +    s->session_offset = 0;
>> +    s->cookie_offset = 4;
>> +    s->counter_offset = 4;
>> +
>> +    s->tx_session = l2tpv3->txsession;
>> +    if (l2tpv3->has_rxsession) {
>> +        s->rx_session = l2tpv3->rxsession;
>> +    } else {
>> +        s->rx_session = s->tx_session;
>> +    }
>> +
>> +    if (s->cookie) {
>> +        s->rx_cookie = l2tpv3->rxcookie;
>> +        s->tx_cookie = l2tpv3->txcookie;
>> +        if (s->cookie_is_64 == true) {
>> +            /* 64 bit cookie */
>> +            s->offset += 8;
>> +            s->counter_offset += 8;
>> +        } else {
>> +            /* 32 bit cookie */
>> +            s->offset += 4;
>> +            s->counter_offset += 4;
>> +        }
>> +    }
>> +
>> +    memset(&hints, 0, sizeof(hints));
>> +
>> +    if (s->ipv6) {
>> +        hints.ai_family = AF_INET6;
>> +    } else {
>> +        hints.ai_family = AF_INET;
>> +    }
>> +    if (s->udp) {
>> +        hints.ai_socktype = SOCK_DGRAM;
>> +        hints.ai_protocol = 0;
>> +        s->offset += 4;
>> +        s->counter_offset += 4;
>> +        s->session_offset += 4;
>> +        s->cookie_offset += 4;
>> +    } else {
>> +        hints.ai_socktype = SOCK_RAW;
>> +        hints.ai_protocol = IPPROTO_L2TP;
>> +    }
>> +
>> +    gairet = getaddrinfo(l2tpv3->src, srcport, &hints, &result);
>> +
>> +    if ((gairet != 0) || (result == NULL)) {
>> +        error_report(
>> +            "l2tpv3_open : could not resolve src, errno = %s",
>> +            gai_strerror(gairet)
>> +        );
>> +        goto outerr;
>> +    }
>> +    fd = socket(result->ai_family, result->ai_socktype, result->ai_protocol);
>> +    if (fd == -1) {
>> +        fd = -errno;
>> +        error_report("l2tpv3_open : socket creation failed, errno = %d", -fd);
>> +        freeaddrinfo(result);
>> +        goto outerr;
>> +    }
>> +    if (bind(fd, (struct sockaddr *) result->ai_addr, result->ai_addrlen)) {
>> +        error_report("l2tpv3_open :  could not bind socket err=%i", errno);
>> +        goto outerr;
>> +    }
>> +
>> +    freeaddrinfo(result);
>> +
>> +    memset(&hints, 0, sizeof(hints));
>> +
>> +    if (s->ipv6) {
>> +        hints.ai_family = AF_INET6;
>> +    } else {
>> +        hints.ai_family = AF_INET;
>> +    }
>> +    if (s->udp) {
>> +        hints.ai_socktype = SOCK_DGRAM;
>> +        hints.ai_protocol = 0;
>> +    } else {
>> +        hints.ai_socktype = SOCK_RAW;
>> +        hints.ai_protocol = IPPROTO_L2TP;
>> +    }
>> +
>> +    gairet = getaddrinfo(l2tpv3->dst, dstport, &hints, &result);
>> +    if ((gairet != 0) || (result == NULL)) {
>> +        error_report(
>> +            "l2tpv3_open : could not resolve dst, error = %s",
>> +            gai_strerror(gairet)
>> +        );
>> +        goto outerr;
>> +    }
>> +
>> +    s->dgram_dst = g_malloc(sizeof(struct sockaddr_storage));
>> +    memset(s->dgram_dst, '\0' , sizeof(struct sockaddr_storage));
>> +    memcpy(s->dgram_dst, result->ai_addr, result->ai_addrlen);
>> +    s->dst_size = result->ai_addrlen;
>> +
>> +    freeaddrinfo(result);
>> +
>> +    if (l2tpv3->has_counter && l2tpv3->counter) {
>> +        s->has_counter = true;
>> +        s->offset += 4;
>> +    } else {
>> +        s->has_counter = false;
>> +    }
>> +
>> +    if (l2tpv3->has_pincounter && l2tpv3->pincounter) {
>> +        s->has_counter = true;  /* pin counter implies that there is counter */
>> +        s->pin_counter = true;
>> +    } else {
>> +        s->pin_counter = false;
>> +    }
>> +
>> +    if (l2tpv3->has_offset) {
>> +        /* extra offset */
>> +        s->offset += l2tpv3->offset;
>> +    }
>> +
>> +    if ((s->ipv6) || (s->udp)) {
>> +        s->header_size = s->offset;
>> +    } else {
>> +        s->header_size = s->offset + sizeof(struct iphdr);
>> +    }
>> +
>> +    s->msgvec = build_l2tpv3_vector(s, MAX_L2TPV3_MSGCNT);
>> +    s->vec = g_malloc(sizeof(struct iovec) * MAX_L2TPV3_IOVCNT);
>> +    s->header_buf = g_malloc(s->header_size);
>> +
>> +    qemu_set_nonblock(fd);
>> +
>> +    s->fd = fd;
>> +    s->counter = 0;
>> +
>> +    l2tpv3_read_poll(s, true);
>> +
>> +    if (!s) {
>> +        error_report("l2tpv3_open : failed to set fd handler");
>> +        goto outerr;
>> +    }
>> +    snprintf(s->nc.info_str, sizeof(s->nc.info_str),
>> +             "l2tpv3: connected");
>> +    return 0;
>> +outerr:
>> +    qemu_del_net_client(nc);
>> +    if (fd > 0) {
>> +        close(fd);
>> +    }
>> +    return -1;
>> +}
>> +
>> diff --git a/net/net.c b/net/net.c
>> index 0a88e68..749d34c 100644
>> --- a/net/net.c
>> +++ b/net/net.c
>> @@ -731,6 +731,9 @@ static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
>>          [NET_CLIENT_OPTIONS_KIND_BRIDGE]    = net_init_bridge,
>>  #endif
>>          [NET_CLIENT_OPTIONS_KIND_HUBPORT]   = net_init_hubport,
>> +#ifdef CONFIG_LINUX
>> +        [NET_CLIENT_OPTIONS_KIND_L2TPV3]    = net_init_l2tpv3,
>> +#endif
>>  };
>>
>>
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 83fa485..aefc478 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -2941,6 +2941,62 @@
>>      '*udp':       'str' } }
>>
>>  ##
>> +# @NetdevL2TPv3Options
>> +#
>> +# Connect the VLAN to Ethernet over L2TPv3 Static tunnel
>> +#
>> +# @src: source address
>> +#
>> +# @dst: destination address
>> +#
>> +# @srcport: #optional source port - mandatory for udp, optional for ip
>> +#
>> +# @dstport: #optional destination port - mandatory for udp, optional for ip
>> +#
>> +# @ipv6: #optional - force the use of ipv6
>> +#
>> +# @udp: #optional - use the udp version of l2tpv3 encapsulation
>> +#
>> +# @cookie64: #optional - use 64 bit cookies
>> +#
>> +# @counter: #optional have sequence counter
>> +#
>> +# @pincounter: #optional pin sequence counter to zero -
>> +#              workaround for buggy implementations or
>> +#              networks with packet reorder
>> +#
>> +# @txcookie: #optional 32 or 64 bit transmit cookie
>> +#
>> +# @rxcookie: #optional 32 or 64 bit receive cookie
>> +#
>> +# @txsession: 32 bit transmit session
>> +#
>> +# @rxsession: #optional 32 bit receive session - if not specified
>> +#             set to the same value as transmit
>> +#
>> +# @offset: #optional additional offset - allows the insertion of
>> +#          additional application-specific data before the packet payload
>> +#
>> +# Since 2.1
>> +##
>> +{ 'type': 'NetdevL2TPv3Options',
>> +  'data': {
>> +    'src':          'str',
>> +    'dst':          'str',
>> +    '*srcport':     'str',
>> +    '*dstport':     'str',
>> +    '*ipv6':        'bool',
>> +    '*udp':         'bool',
>> +    '*cookie64':    'bool',
>> +    '*counter':     'bool',
>> +    '*pincounter':  'bool',
>> +    '*txcookie':    'uint64',
>> +    '*rxcookie':    'uint64',
>> +    'txsession':    'uint32',
>> +    '*rxsession':   'uint32',
>> +    '*offset':      'uint32' } }
>> +
>> +##
>>  # @NetdevVdeOptions
>>  #
>>  # Connect the VLAN to a vde switch running on the host.
>> @@ -3014,6 +3070,9 @@
>>  # A discriminated record of network device traits.
>>  #
>>  # Since 1.2
>> +#
>> +# 'l2tpv3' - since 2.1
>> +#
>>  ##
>>  { 'union': 'NetClientOptions',
>>    'data': {
>> @@ -3021,6 +3080,7 @@
>>      'nic':      'NetLegacyNicOptions',
>>      'user':     'NetdevUserOptions',
>>      'tap':      'NetdevTapOptions',
>> +    'l2tpv3':   'NetdevL2TPv3Options',
>>      'socket':   'NetdevSocketOptions',
>>      'vde':      'NetdevVdeOptions',
>>      'dump':     'NetdevDumpOptions',
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 8b94264..e1caf6f 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -1395,6 +1395,29 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
>>      "                (default=" DEFAULT_BRIDGE_INTERFACE ") using the program 'helper'\n"
>>      "                (default=" DEFAULT_BRIDGE_HELPER ")\n"
>>  #endif
>> +#ifdef __linux__
>> +    "-net l2tpv3[,vlan=n][,name=str],src=srcaddr,dst=dstaddr[,srcport=srcport][,dstport=dstport],txsession=txsession[,rxsession=rxsession][,ipv6=on/off][,udp=on/off][,cookie64=on/off][,counter][,pincounter][,txcookie=txcookie][,rxcookie=rxcookie][,offset=offset]\n"
>> +    "                connect the VLAN to an Ethernet over L2TPv3 pseudowire\n"
>> +    "                Linux kernel 3.3+ as well as most routers can talk\n"
>> +    "                L2TPv3. This transport allows to connect a VM to a VM,\n"
>> +    "                VM to a router and even VM to Host. It is a nearly-universal\n"
>> +    "                standard (RFC3931). Note - this implementation uses static\n"
>> +    "                pre-configured tunnels (same as the Linux kernel).\n"
>> +    "                use 'src=' to specify source address\n"
>> +    "                use 'dst=' to specify destination address\n"
>> +    "                use 'udp=on' to specify udp encapsulation\n"
>> +    "                use 'srcport=' to specify source udp port\n"
>> +    "                use 'dstport=' to specify destination udp port\n"
>> +    "                use 'ipv6=on' to force v6\n"
>> +    "                L2TPv3 uses cookies to prevent misconfiguration as\n"
>> +    "                well as a weak security measure\n"
>> +    "                use 'rxcookie=0x012345678' to specify a rxcookie\n"
>> +    "                use 'txcookie=0x012345678' to specify a txcookie\n"
>> +    "                use 'cookie64=on' to set cookie size to 64 bit, otherwise 32\n"
>> +    "                use 'counter=off' to force a 'cut-down' L2TPv3 with no counter\n"
>> +    "                use 'pincounter=on' to work around broken counter handling in peer\n"
>> +    "                use 'offset=X' to add an extra offset between header and data\n"
>> +#endif
>>      "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
>>      "                connect the vlan 'n' to another VLAN using a socket connection\n"
>>      "-net socket[,vlan=n][,name=str][,fd=h][,mcast=maddr:port[,localaddr=addr]]\n"
>> @@ -1730,6 +1753,65 @@ qemu-system-i386 linux.img \
>>                   -net socket,mcast=239.192.168.1:1102,localaddr=1.2.3.4
>>  @end example
>>
>> +@item -netdev l2tpv3,id=@var{id},src=@var{srcaddr},dst=@var{dstaddr}[,srcport=@var{srcport}][,dstport=@var{dstport}],txsession=@var{txsession}[,rxsession=@var{rxsession}][,ipv6][,udp][,cookie64][,counter][,pincounter][,txcookie=@var{txcookie}][,rxcookie=@var{rxcookie}][,offset=@var{offset}]
>> +@item -net l2tpv3[,vlan=@var{n}][,name=@var{name}],src=@var{srcaddr},dst=@var{dstaddr}[,srcport=@var{srcport}][,dstport=@var{dstport}],txsession=@var{txsession}[,rxsession=@var{rxsession}][,ipv6][,udp][,cookie64][,counter][,pincounter][,txcookie=@var{txcookie}][,rxcookie=@var{rxcookie}][,offset=@var{offset}]
>> +Connect VLAN @var{n} to L2TPv3 pseudowire. L2TPv3 (RFC3931) is a popular
>> +protocol to transport Ethernet (and other Layer 2) data frames between
>> +two systems. It is present in routers, firewalls and the Linux kernel
>> +(from version 3.3 onwards).
>> +
>> +This transport allows a VM to communicate with another VM, router or firewall directly.
>> +
>> +@item src=@var{srcaddr}
>> +    source address (mandatory)
>> +@item dst=@var{dstaddr}
>> +    destination address (mandatory)
>> +@item udp
>> +    select udp encapsulation (default is ip).
>> +@item srcport=@var{srcport}
>> +    source udp port.
>> +@item dstport=@var{dstport}
>> +    destination udp port.
>> +@item ipv6
>> +    force v6, otherwise defaults to v4.
>> +@item rxcookie=@var{rxcookie}
>> +@item txcookie=@var{txcookie}
>> +    Cookies are a weak form of security in the l2tpv3 specification.
>> +Their function is mostly to prevent misconfiguration. By default they are 32
>> +bit.
>> +@item cookie64
>> +    Set cookie size to 64 bit instead of the default 32
>> +@item counter=off
>> +    Force a 'cut-down' L2TPv3 with no counter as in
>> +draft-mkonstan-l2tpext-keyed-ipv6-tunnel-00
>> +@item pincounter=on
>> +    Work around broken counter handling in peer. This may also help on
>> +networks which have packet reorder.
>> +@item offset=@var{offset}
>> +    Add an extra offset between header and data
>> +
>> +For example, to attach a VM running on host 4.3.2.1 via L2TPv3 to the bridge br-lan
>> +on the remote Linux host 1.2.3.4:
>> +@example
>> +# Setup tunnel on linux host using udp as encapsulation
>> +# on 1.2.3.4
>> +ip l2tp add tunnel remote 4.3.2.1 local 1.2.3.4 tunnel_id 1 peer_tunnel_id 1 \
>> +    encap udp udp_sport 16384 udp_dport 16384
>> +ip l2tp add session tunnel_id 1 name vmtunnel0 session_id \
>> +    0xFFFFFFFF peer_session_id 0xFFFFFFFF
>> +ifconfig vmtunnel0 mtu 1500
>> +ifconfig vmtunnel0 up
>> +brctl addif br-lan vmtunnel0
>> +
>> +
>> +# on 4.3.2.1
>> +# launch QEMU instance - if your network has reorder or is very lossy add ,pincounter
>> +
>> +qemu-system-i386 linux.img -net nic -net l2tpv3,src=4.3.2.1,dst=1.2.3.4,udp,srcport=16384,dstport=16384,rxsession=0xffffffff,txsession=0xffffffff,counter
>> +
>> +
>> +@end example
>> +
>>  @item -netdev vde,id=@var{id}[,sock=@var{socketpath}][,port=@var{n}][,group=@var{groupname}][,mode=@var{octalmode}]
>>  @item -net vde[,vlan=@var{n}][,name=@var{name}][,sock=@var{socketpath}] [,port=@var{n}][,group=@var{groupname}][,mode=@var{octalmode}]
>>  Connect VLAN @var{n} to PORT @var{n} of a vde switch running on host and
>> --
>> 1.7.10.4
>>
>>
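
For anyone following the offset arithmetic in net_init_l2tpv3() above, here
is a minimal standalone sketch of how the header layout falls out of the
options. It is only an illustration: the struct and function names
(l2tpv3_layout_opts, l2tpv3_layout, l2tpv3_compute_layout) are hypothetical
and do not appear in the patch, and it assumes that a counter, when enabled,
always occupies 4 bytes of header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option set; mirrors the flags kept in NetL2TPV3State. */
struct l2tpv3_layout_opts {
    bool udp;          /* UDP encapsulation: 4-byte flags word comes first */
    bool cookie;       /* a cookie follows the session id */
    bool cookie_is_64; /* cookie is 8 bytes instead of 4 */
    bool has_counter;  /* 4-byte sequence counter follows the cookie */
    uint32_t extra;    /* user-supplied gap before the payload */
};

/* Hypothetical result; all values are byte offsets into the header. */
struct l2tpv3_layout {
    uint32_t session_offset;
    uint32_t cookie_offset;
    uint32_t counter_offset;
    uint32_t offset;   /* total header length, where the payload starts */
};

static struct l2tpv3_layout
l2tpv3_compute_layout(const struct l2tpv3_layout_opts *o)
{
    /* Raw IP baseline: session id at 0, everything else right after it. */
    struct l2tpv3_layout l = { 0, 4, 4, 4 };

    assert(o != NULL);

    if (o->cookie) {
        uint32_t len = o->cookie_is_64 ? 8 : 4;
        l.offset += len;
        l.counter_offset += len;
    }
    if (o->udp) {
        /* The flags word shifts every field by 4 bytes. */
        l.offset += 4;
        l.session_offset += 4;
        l.cookie_offset += 4;
        l.counter_offset += 4;
    }
    if (o->has_counter) {
        l.offset += 4;
    }
    l.offset += o->extra;
    return l;
}
```

With udp, a 32 bit cookie and a counter this gives session at byte 4, cookie
at 8, counter at 12 and payload at 16. For raw IPv4 the receive side
additionally skips sizeof(struct iphdr), as the patch does in
l2tpv3_verify_header().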