All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wei Xu <wexu@redhat.com>
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [ RFC Patch v4 2/3] virtio-net rsc: support coalescing ipv4 tcp traffic
Date: Fri, 8 Apr 2016 17:15:02 +0800	[thread overview]
Message-ID: <57077696.1020701@redhat.com> (raw)
In-Reply-To: <57076C49.8000905@redhat.com>



On 2016年04月08日 16:31, Jason Wang wrote:
>
> On 04/08/2016 03:47 PM, Wei Xu wrote:
>>
>> On 2016年04月05日 10:47, Jason Wang wrote:
>>> On 04/04/2016 03:25 AM, wexu@redhat.com wrote:
>>>> From: Wei Xu <wexu@redhat.com>
>>>>
>>>> All the data packets in a tcp connection will be cached to a big buffer
>>>> in every receive interval, and will be sent out via a timer, the
>>>> 'virtio_net_rsc_timeout' controls the interval, the value will
>>>> influent the
>>>> performance and response of tcp connection extremely, 50000(50us) is a
>>>> experience value to gain a performance improvement, since the whql test
>>>> sends packets every 100us, so '300000(300us)' can pass the test case,
>>>> this is also the default value, it's gonna to be tunable.
>>>>
>>>> The timer will only be triggered if the packets pool is not empty,
>>>> and it'll drain off all the cached packets
>>>>
>>>> 'NetRscChain' is used to save the segments of different protocols in a
>>>> VirtIONet device.
>>>>
>>>> The main handler of TCP includes TCP window update, duplicated ACK
>>>> check
>>>> and the real data coalescing if the new segment passed sanity check
>>>> and is identified as an 'wanted' one.
>>>>
>>>> An 'wanted' segment means:
>>>> 1. Segment is within current window and the sequence is the expected
>>>> one.
>>>> 2. ACK of the segment is in the valid window.
>>>> 3. If the ACK in the segment is a duplicated one, then it must less
>>>> than 2,
>>>>      this is to notify upper layer TCP starting retransmission due to
>>>> the spec.
>>>>
>>>> Sanity check includes:
>>>> 1. Incorrect version in IP header
>>>> 2. IP options & IP fragment
>>>> 3. Not a TCP packets
>>>> 4. Sanity size check to prevent buffer overflow attack.
>>>>
>>>> There maybe more cases should be considered such as ip
>>>> identification other
>>>> flags, while it broke the test because windows set it to the same
>>>> even it's
>>>> not a fragment.
>>>>
>>>> Normally it includes 2 typical ways to handle a TCP control flag,
>>>> 'bypass'
>>>> and 'finalize', 'bypass' means should be sent out directly, and
>>>> 'finalize'
>>>> means the packets should also be bypassed, and this should be done
>>>> after searching for the same connection packets in the pool and sending
>>>> all of them out, this is to avoid out of data.
>>>>
>>>> All the 'SYN' packets will be bypassed since this always begin a new'
>>>> connection, other flags such 'FIN/RST' will trigger a finalization,
>>>> because
>>>> this normally happens upon a connection is going to be closed, an
>>>> 'URG' packet
>>>> also finalize current coalescing unit.
>>>>
>>>> Statistics can be used to monitor the basic coalescing status, the
>>>> 'out of order'
>>>> and 'out of window' means how many retransmitting packets, thus
>>>> describe the
>>>> performance intuitively.
>>>>
>>>> Signed-off-by: Wei Xu <wexu@redhat.com>
>>>> ---
>>>>    hw/net/virtio-net.c            | 480
>>>> ++++++++++++++++++++++++++++++++++++++++-
>>>>    include/hw/virtio/virtio-net.h |   1 +
>>>>    include/hw/virtio/virtio.h     |  72 +++++++
>>>>    3 files changed, 552 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>> index bd91a4b..81e8e71 100644
>>>> --- a/hw/net/virtio-net.c
>>>> +++ b/hw/net/virtio-net.c
>>>> @@ -15,10 +15,12 @@
>>>>    #include "qemu/iov.h"
>>>>    #include "hw/virtio/virtio.h"
>>>>    #include "net/net.h"
>>>> +#include "net/eth.h"
>>>>    #include "net/checksum.h"
>>>>    #include "net/tap.h"
>>>>    #include "qemu/error-report.h"
>>>>    #include "qemu/timer.h"
>>>> +#include "qemu/sockets.h"
>>>>    #include "hw/virtio/virtio-net.h"
>>>>    #include "net/vhost_net.h"
>>>>    #include "hw/virtio/virtio-bus.h"
>>>> @@ -38,6 +40,24 @@
>>>>    #define endof(container, field) \
>>>>        (offsetof(container, field) + sizeof(((container *)0)->field))
>>>>    +#define VIRTIO_NET_IP4_ADDR_SIZE   8        /* ipv4 saddr + daddr */
>>>> +#define VIRTIO_NET_TCP_PORT_SIZE   4        /* sport + dport */
>>>> +
>>>> +/* IPv4 max payload, 16 bits in the header */
>>>> +#define VIRTIO_NET_MAX_IP4_PAYLOAD (65535 - sizeof(struct ip_header))
>>>> +#define VIRTIO_NET_MAX_TCP_PAYLOAD 65535
>>>> +
>>>> +/* header lenght value in ip header without option */
>>> typo here.
>> Thanks.
>>>> +#define VIRTIO_NET_IP4_HEADER_LENGTH 5
>>>> +
>>>> +/* Purge coalesced packets timer interval */
>>>> +#define VIRTIO_NET_RSC_INTERVAL  300000
>>>> +
>>>> +/* This value affects the performance a lot, and should be tuned
>>>> carefully,
>>>> +   '300000'(300us) is the recommended value to pass the WHQL test,
>>>> '50000' can
>>>> +   gain 2x netperf throughput with tso/gso/gro 'off'. */
>>>> +static uint32_t virtio_net_rsc_timeout = VIRTIO_NET_RSC_INTERVAL;
>>> Like we've discussed in previous versions, need we add another property
>>> for this?
>> Do you know how to make this a tunable parameter to guest? can this
>> parameter be set via control queue?
> It's possible I think.
>
>
> [...]
>
>>>> +
>>>> +static void virtio_net_rsc_purge(void *opq)
>>>> +{
>>>> +    NetRscChain *chain = (NetRscChain *)opq;
>>>> +    NetRscSeg *seg, *rn;
>>>> +
>>>> +    QTAILQ_FOREACH_SAFE(seg, &chain->buffers, next, rn) {
>>>> +        if (virtio_net_rsc_drain_seg(chain, seg) == 0) {
>>>> +            chain->stat.purge_failed++;
>>>> +            continue;
>>> Is it better to break here, consider we fail to do the receive?
>> Actually this fails only when receive fails according to the test, but
>> should drain all the segments here,
>> if break here, then will have to free all the buffers, is it possible
>> a successful receiving followed by a failed one?
> I'd say it's possible, e.g guest just finish rx buffer refills.
>
>> if that does happens, i preferred give it a shot other than free it
>> directly.
> Ok.
>
> [...]
>
>>>> +
>>>> +static void virtio_net_rsc_cache_buf(NetRscChain *chain,
>>>> NetClientState *nc,
>>>> +                                     const uint8_t *buf, size_t size)
>>>> +{
>>>> +    uint16_t hdr_len;
>>>> +    NetRscSeg *seg;
>>>> +
>>>> +    hdr_len = ((VirtIONet *)(chain->n))->guest_hdr_len;
>>>> +    seg = g_malloc(sizeof(NetRscSeg));
>>>> +    seg->buf = g_malloc(hdr_len + sizeof(struct eth_header)\
>>>> +                   + sizeof(struct ip6_header) +
>>>> VIRTIO_NET_MAX_TCP_PAYLOAD);
>>> Why ip6_header here?
>> ipv6 packets can carry 65535(2^16) bytes pure data payload, that means
>> the header is not included.
>> but ipv4 is included, i choose the maximum here.
> Let's add a comment here, since the patch's title is to implement ipv4
> coalescing.
ok.
> [...]
>
>>>> +
>>>> +static int32_t virtio_net_rsc_coalesce4(NetRscChain *chain,
>>>> NetRscSeg *seg,
>>>> +                        const uint8_t *buf, size_t size, NetRscUnit
>>>> *unit)
>>> indentation is wrong here.
>> Should it be aligned from parameter begin?
> Looks like we'd better do this.
ok.
>
>> i was going to save an extra line parameter.
> [...]
>

  reply	other threads:[~2016-04-08  9:14 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-03 19:25 [Qemu-devel] [ RFC Patch v4 0/3] Support Receive-Segment-Offload(RSC) for WHQL wexu
2016-04-03 19:25 ` [Qemu-devel] [ RFC Patch v4 1/3] virtio-net rsc: add a new host offload(rsc) feature bit wexu
2016-04-05  2:05   ` Jason Wang
2016-04-05  8:17     ` Michael S. Tsirkin
2016-04-10 15:23       ` Wei Xu
2016-04-03 19:25 ` [Qemu-devel] [ RFC Patch v4 2/3] virtio-net rsc: support coalescing ipv4 tcp traffic wexu
2016-04-05  2:47   ` Jason Wang
2016-04-08  7:47     ` Wei Xu
2016-04-08  8:31       ` Jason Wang
2016-04-08  9:15         ` Wei Xu [this message]
2016-04-12  8:23   ` Michael S. Tsirkin
2016-04-03 19:25 ` [Qemu-devel] [ RFC Patch v4 3/3] virtio-net rsc: support coalescing ipv6 " wexu
2016-04-05  2:50   ` Jason Wang
2016-04-08  7:06     ` Wei Xu
2016-04-08  7:27       ` Jason Wang
2016-04-08  7:51         ` Wei Xu
2016-04-05  2:52 ` [Qemu-devel] [ RFC Patch v4 0/3] Support Receive-Segment-Offload(RSC) for WHQL Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57077696.1020701@redhat.com \
    --to=wexu@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.