All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Anton Ivanov (antivano)" <antivano@cisco.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "pbonzini@redhat.com" <pbonzini@redhat.com>,
	"anton.ivanov@kot-begemot.co.uk" <anton.ivanov@kot-begemot.co.uk>,
	"afaerber@suse.de" <afaerber@suse.de>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH v3] net: L2TPv3 transport
Date: Tue, 18 Mar 2014 11:03:30 +0000	[thread overview]
Message-ID: <53282801.9090509@cisco.com> (raw)
In-Reply-To: <20140318104404.GA31452@stefanha-thinkpad.redhat.com>

On 18/03/14 10:44, Stefan Hajnoczi wrote:
> On Tue, Mar 11, 2014 at 12:45:30PM +0000, anton.ivanov@kot-begemot.co.uk wrote:
>> +/* This protocol number is passed getaddrinfo(), and not
>> + * used directly. If you give gettaddrinfo() what one is
>> + * supposed to give - INET, RAW, 0, the result is not
>> + * set correctly.
>> + * Setting the args to INET, RAW, L2TPv3 is the "lowest pain"
>> + * workaround in this case as it allows common raw and udp
>> + * setup.
>> + */
>> +
>> +#ifndef IPPROTO_L2TP
>> +#define IPPROTO_L2TP 0x73
>> +#endif
> The comment is incorrect.  Using IPPROTO_L2TP is not a workaround, it's
> the expected way to implement an IP-level protocol in userspace.
>
> getaddrinfo() returns protocol IPPROTO_L2TP unchanged.  Then we call
> socket(AF_INET, SOCK_RAW, IPPROTO_L2TP).
>
> INET, RAW, L2TPv3 tells the kernel that we wish to receive IP packets
> when the protocol field is equal to IPPROTO_L2TP.  If we didn't tell the
> kernel which IP protocol to listen for then the socket would not receive
> packets (IPPROTO_RAW is send-only).
>
> I suggest changing the comment simply to:
> /* IANA-assigned IP protocol ID for L2TPv3 */

OK :)

>
>> +static ssize_t net_l2tpv3_receive_dgram_iov(NetClientState *nc,
>> +                    const struct iovec *iov,
>> +                    int iovcnt)
>> +{
>> +    NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
>> +
>> +    struct msghdr message;
>> +    int ret;
>> +
>> +    if (iovcnt > MAX_L2TPV3_IOVCNT - 1) {
>> +        error_report(
>> +            "iovec too long %d > %d, change l2tpv3.h",
>> +            iovcnt, MAX_L2TPV3_IOVCNT
>> +        );
>> +        return -1;
>> +    }
>> +    l2tpv3_form_header(s);
>> +    memcpy(s->vec + 1, iov, iovcnt * sizeof(struct iovec));
>> +    s->vec->iov_base = s->header_buf;
>> +    s->vec->iov_len = s->offset;
>> +    message.msg_name = s->dgram_dst;
>> +    message.msg_namelen = s->dst_size;
>> +    message.msg_iov = s->vec;
>> +    message.msg_iovlen = iovcnt + 1;
>> +    message.msg_control = NULL;
>> +    message.msg_controllen = 0;
>> +    message.msg_flags = 0;
>> +    do {
>> +        ret = sendmsg(s->fd, &message, 0);
>> +    } while ((ret == -1) && (errno == EINTR));
>> +    if (ret > 0) {
>> +        ret -= s->offset;
>> +    } else if (ret == 0) {
>> +        /* belt and braces - should not occur on DGRAM
>> +        * we should get an error and never a 0 send
>> +        */
>> +        ret = iov_size(iov, iovcnt);
>> +    } else {
>> +        /* signal upper layer that socket buffer is full */
>> +        ret = -errno;
>> +        if (ret == -EAGAIN || ret == -ENOBUFS) {
>> +            ret = 0;
>> +        }
>> +    }
>> +    return ret;
>> +}
> Returning 0 means the peer (the emulated NIC) will stop sending packets
> until further notice.  Since qemu_flush_queued_packets() is not invoked
> anywhere in this source file this will result in deadlock!
>
> What needs to happen is:
> 1. We begin monitoring s->fd to see when it becomes writable again.
> 2. When it becomes writable we call qemu_flush_queued_packets() so the
>    peer can resuming transmission.
>
> See net/tap.c or other backends for examples of how this is done.

OK. Understood. Will add it.

I will also add it raw/packet mmap which is almost ready for submission.

>
>> +static int l2tpv3_verify_header(NetL2TPV3State *s, uint8_t *buf)
>> +{
>> +
>> +    uint32_t *session;
>> +    uint64_t cookie;
>> +
>> +    if ((!s->udp) && (!s->ipv6)) {
>> +        buf += sizeof(struct iphdr) /* fix for ipv4 raw */;
>> +    }
>> +    if (s->cookie) {
>> +        if (s->cookie_is_64) {
>> +            cookie = ldq_be_p(buf + s->cookie_offset);
>> +        } else {
>> +            cookie = ldl_be_p(buf + s->cookie_offset);
>> +        }
>> +        if (cookie != s->rx_cookie) {
>> +            error_report("unknown cookie id\n");
>> +            return -1;
>> +        }
>> +    }
>> +    session = (uint32_t *) (buf + s->session_offset);
>> +    if (ldl_be_p(session) != s->rx_session) {
>> +        error_report("session mismatch");
>> +        return -1;
>> +    }
>> +    return 0;
>> +}
>> +
>> +static void net_l2tpv3_send(void *opaque)
>> +{
>> +    NetL2TPV3State *s = opaque;
>> +
>> +    int i, count, offset;
>> +    struct mmsghdr *msgvec;
>> +    struct iovec *vec;
>> +
>> +    msgvec = s->msgvec;
>> +    offset = s->offset;
>> +    if ((!s->udp) && (!s->ipv6)) {
>> +        offset +=   sizeof(struct iphdr);
>> +    }
>> +    count = recvmmsg(s->fd, msgvec, MAX_L2TPV3_MSGCNT, MSG_DONTWAIT, NULL);
>> +    for (i = 0; i < count; i++) {
>> +        if (msgvec->msg_len > 0) {
>> +            vec = msgvec->msg_hdr.msg_iov;
>> +            if ((msgvec->msg_len > offset) &&
>> +                (l2tpv3_verify_header(s, vec->iov_base) == 0)) {
>> +                vec++;
>> +                qemu_send_packet(&s->nc, vec->iov_base,
>> +                                        msgvec->msg_len - offset);
>> +            } else {
>> +                error_report("l2tpv3 header verification failed");
> This and the error_report() ins l2tpv3_verify_header() could be used as
> a Denial of Service: a malicious host may be able to send invalid
> packets to fill up the host's disk with error messages.
>
> These are good candidates for rate-limiting or even once-only error
> messages.  The error_report() API currently does not have a way to do
> that.
>
> I don't insist on this but it's something worth changing in the future.

Agreed - we just increment the dropped count it in the UML counterpart
(I removed this for that exact reason there).

>
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 8b94264..ce4d99d 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -1395,19 +1395,28 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
>>      "                (default=" DEFAULT_BRIDGE_INTERFACE ") using the program 'helper'\n"
>>      "                (default=" DEFAULT_BRIDGE_HELPER ")\n"
>>  #endif
>> -    "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
>> -    "                connect the vlan 'n' to another VLAN using a socket connection\n"
>> -    "-net socket[,vlan=n][,name=str][,fd=h][,mcast=maddr:port[,localaddr=addr]]\n"
>> -    "                connect the vlan 'n' to multicast maddr and port\n"
>> -    "                use 'localaddr=addr' to specify the host address to send packets from\n"
>> -    "-net socket[,vlan=n][,name=str][,fd=h][,udp=host:port][,localaddr=host:port]\n"
>> -    "                connect the vlan 'n' to another VLAN using an UDP tunnel\n"
>> -#ifdef CONFIG_VDE
>> -    "-net vde[,vlan=n][,name=str][,sock=socketpath][,port=n][,group=groupname][,mode=octalmode]\n"
>> -    "                connect the vlan 'n' to port 'n' of a vde switch running\n"
>> -    "                on host and listening for incoming connections on 'socketpath'.\n"
>> -    "                Use group 'groupname' and mode 'octalmode' to change default\n"
>> -    "                ownership and permissions for communication port.\n"
> Please leave socket and vde.  They are not being deprecated yet.

Apologies, did not have the intention of killing them :)

A.

      reply	other threads:[~2014-03-18 11:03 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-11 12:45 [Qemu-devel] [PATCH v3] net: L2TPv3 transport anton.ivanov
2014-03-18 10:44 ` Stefan Hajnoczi
2014-03-18 11:03   ` Anton Ivanov (antivano) [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53282801.9090509@cisco.com \
    --to=antivano@cisco.com \
    --cc=afaerber@suse.de \
    --cc=anton.ivanov@kot-begemot.co.uk \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.