From: Tom Herbert <tom@herbertland.com>
To: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Florian Westphal <fw@strlen.de>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
Jesper Dangaard Brouer <jbrouer@redhat.com>
Subject: Re: Initial thoughts on TXDP
Date: Thu, 1 Dec 2016 15:46:40 -0800
Message-ID: <CALx6S350eCYS63dGiR+X+nVvqF_uGJop9Z_m7SmZf9QXr-rrfg@mail.gmail.com>
In-Reply-To: <859a0c99-f427-1db8-d260-1297777792fb@stressinduktion.org>
On Thu, Dec 1, 2016 at 2:47 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> Side note:
>
> On 01.12.2016 20:51, Tom Herbert wrote:
>>> > E.g. "mini-skb": Even if we assume that this provides a speedup
>>> > (where does that come from? should make no difference if a 32 or
>>> > 320 byte buffer gets allocated).
>>> >
>> It's the zero'ing of three cache lines. I believe we talked about that
>> at netdev.
>
> Jesper and me played with that again very recently:
>
> https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_memset.c#L590
>
> In micro-benchmarks we saw a pretty good speed up not using the rep
> stosb generated by gcc builtin but plain movq's. Probably the cost model
> for __builtin_memset in gcc is wrong?
>
> When Jesper is free we wanted to benchmark this and maybe come up with an
> arch-specific way of clearing if it turns out to really improve throughput.
>
> SIMD instructions seem even faster but the kernel_fpu_begin/end() kill
> all the benefits.
>
One nice direction of XDP is that it forces drivers to defer
allocating (and hence zero'ing) skbs. In the receive path I think we
can exploit this property deeper into the stack. The only time we
_really_ need to allocate an skbuff is when we need to put the packet onto a
queue. All the other use cases are really just to pass a structure
containing a packet from function to function. For that purpose we
should be able to just pass a much smaller structure in a stack
argument and only allocate an skbuff when we need to enqueue. In cases
where we don't ever queue a packet we might never need to allocate any
skbuff-- this includes pure ACKs and packets that end up being dropped.
But even more than that, if a received packet generates a TX packet
(like a SYN causes a SYN-ACK) then we might even be able to just
recycle the received packet and avoid needing any skbuff allocation on
transmit (XDP_TX already does this in a limited context)-- this could
be a win to handle SYN attacks for instance. Also, since we don't
queue on the socket buffer for UDP it's conceivable we could avoid
skbuffs in an expedited UDP TX path.
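
The deferred-allocation idea above can be sketched in userspace C. This is
only an illustration under stated assumptions, not kernel code: struct
pkt_ref, struct fat_buf, struct pkt_queue, pkt_enqueue() and pkt_receive()
are all hypothetical names, and the one-byte 'A'/'D' tag is a stand-in for
real protocol parsing. The point is that the small descriptor travels by
value and the large, zeroed buffer is allocated only on the enqueue path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical lightweight descriptor, passed by value between functions. */
struct pkt_ref {
	void *data;	/* packet bytes, e.g. inside a driver receive page */
	size_t len;	/* number of valid bytes at data */
};

/*
 * Hypothetical stand-in for an skbuff. The cb[] padding makes the struct
 * span roughly three 64-byte cache lines on a typical 64-bit build, so
 * zeroing it has a visible cost.
 */
struct fat_buf {
	struct fat_buf *next;
	void *data;
	size_t len;
	char cb[160];
};

struct pkt_queue {
	struct fat_buf *head, *tail;
	unsigned int allocs;	/* counts how many fat_bufs were allocated */
};

/* The only place where the large, zeroed structure is allocated. */
static int pkt_enqueue(struct pkt_queue *q, struct pkt_ref pkt)
{
	struct fat_buf *b;

	assert(q != NULL);
	b = calloc(1, sizeof(*b));
	if (!b)
		return -1;
	b->data = pkt.data;
	b->len = pkt.len;
	if (q->tail)
		q->tail->next = b;
	else
		q->head = b;
	q->tail = b;
	q->allocs++;
	return 0;
}

enum verdict { PKT_DROP, PKT_CONSUMED, PKT_QUEUED };

/*
 * Toy receive path. The first byte is a made-up type tag:
 * 'A' = pure ACK (handled in place), 'D' = data (must be queued).
 * Anything else is dropped. Only the 'D' case allocates.
 */
static enum verdict pkt_receive(struct pkt_queue *q, struct pkt_ref pkt)
{
	const unsigned char *p = pkt.data;

	if (!p || pkt.len == 0)
		return PKT_DROP;
	if (p[0] == 'A')
		return PKT_CONSUMED;
	if (p[0] == 'D')
		return pkt_enqueue(q, pkt) == 0 ? PKT_QUEUED : PKT_DROP;
	return PKT_DROP;
}
```

Feeding pkt_receive() an 'A' packet leaves the allocs counter untouched,
while a 'D' packet increments it by one, which is the property being
argued for.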

Currently, nearly the whole stack depends on packets always being
passed in skbuffs, however __skb_flow_dissect is an interesting
exception as it can handle packets passed in either an skbuff or by
just a void *-- so we know that this "dual mode" is at least possible.
Trying to retrain the whole stack to be able to handle both skbuffs
and raw pages is probably untenable at this point, but selectively
augmenting some critical performance functions for dual mode (ip_rcv,
tcp_rcv, udp_rcv functions for instance) might work.
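
A minimal userspace sketch of what "dual mode" could look like, loosely
modeled on the way __skb_flow_dissect falls back to the skbuff when no raw
data pointer is passed. Everything here is an assumption made for
illustration: struct toy_skb and dual_ip_proto() are hypothetical, and the
function only reads the IPv4 protocol byte rather than doing real
dissection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for an skbuff. */
struct toy_skb {
	const uint8_t *data;
	size_t len;
};

/*
 * Dual-mode helper: the caller passes either a toy_skb (data == NULL)
 * or a raw pointer plus length (skb may then be NULL).
 * Returns the IPv4 protocol field, or -1 if the input is unusable.
 */
static int dual_ip_proto(const struct toy_skb *skb,
			 const uint8_t *data, size_t hlen)
{
	if (!data) {
		/* No raw pointer given: fall back to the buffer object. */
		if (!skb)
			return -1;
		data = skb->data;
		hlen = skb->len;
	}
	if (!data || hlen < 20)		/* minimum IPv4 header is 20 bytes */
		return -1;
	if ((data[0] >> 4) != 4)	/* version nibble must be 4 */
		return -1;
	return data[9];			/* protocol field is at offset 9 */
}
```

Both call styles, dual_ip_proto(&skb, NULL, 0) and
dual_ip_proto(NULL, hdr, len), reach the same parsing code, so a caller
that never built a buffer object can still use the function.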

Thanks,
Tom
> Bye,
> Hannes
>