netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Florian Westphal <fw@strlen.de>,
	Network Development <netdev@vger.kernel.org>
Subject: Re: [flamebait] xdp, well meaning but pointless
Date: Sat, 3 Dec 2016 11:48:22 -0800	[thread overview]
Message-ID: <58432186.20006@gmail.com> (raw)
In-Reply-To: <CAF=yD-+Wx9Pkt55-dUU6dYM7P==kQoKE4YXKoFSq1yi7rY5rDA@mail.gmail.com>

On 16-12-03 08:19 AM, Willem de Bruijn wrote:
> On Fri, Dec 2, 2016 at 12:22 PM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
>>
>> On Thu, 1 Dec 2016 10:11:08 +0100 Florian Westphal <fw@strlen.de> wrote:
>>
>>> In light of DPDKs existence it make a lot more sense to me to provide
>>> a). a faster mmap based interface (possibly AF_PACKET based) that allows
>>> to map nic directly into userspace, detaching tx/rx queue from kernel.
>>>
>>> John Fastabend sent something like this last year as a proof of
>>> concept, iirc it was rejected because register space got exposed directly
>>> to userspace.  I think we should re-consider merging netmap
>>> (or something conceptually close to its design).
>>
>> I'm actually working in this direction, of zero-copy RX mapping packets
>> into userspace.  This work is mostly related to page_pool, and I only
>> plan to use XDP as a filter for selecting packets going to userspace,
>> as this choice need to be taken very early.
>>
>> My design is here:
>>  https://prototype-kernel.readthedocs.io/en/latest/vm/page_pool/design/memory_model_nic.html
>>
>> This is mostly about changing the memory model in the drivers, to allow
>> for safely mapping pages to userspace.  (An efficient queue mechanism is
>> not covered).
> 
> Virtio virtqueues are used in various other locations in the stack.
> With separate memory pools and send + completion descriptor rings,
> signal moderation, careful avoidance of cacheline bouncing, etc. these
> seem like a good opportunity for a TPACKET_V4 format.
> 

FWIW. After we rejected exposing the register space to user space due to
valid security issues we fell back to using VFIO which works nicely for
mapping virtual functions into userspace and VMs. The main  drawback is
user space has to manage the VF but that is mostly a solved problem at
this point. Deployment concerns aside.

There was a TPACKET_V4 version we had a prototype of that passed
buffers down to the hardware to use with the dma engine. This gives
zero-copy but same as VFs requires the hardware to do all the steering
of traffic and any expected policy in front of the application. Due to
requiring user space to kick hardware and vice versa though it was
somewhat slower so I didn't finish it up. The kick was implemented as a
syscall iirc. I can maybe look at it a bit more next week and see if its
worth reviving now in this context.

I don't think any of this requires page pools though. Or rather tpacket
and vhost/virtio already know how to do page pools is perhaps the other
way to look at it.

One idea I've been playing around with is a vhost backend using
tpacketv{3|4} so we don't require socket manipulation.

Thanks,
John

  reply	other threads:[~2016-12-03 19:48 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-01  9:11 [flamebait] xdp, well meaning but pointless Florian Westphal
2016-12-01 13:42 ` Hannes Frederic Sowa
2016-12-01 14:58 ` Thomas Graf
2016-12-01 15:52   ` Hannes Frederic Sowa
2016-12-01 16:28     ` Thomas Graf
2016-12-01 20:44       ` Hannes Frederic Sowa
2016-12-01 21:12         ` Tom Herbert
2016-12-01 21:27           ` Hannes Frederic Sowa
2016-12-01 21:51             ` Tom Herbert
2016-12-02 10:24               ` Jesper Dangaard Brouer
2016-12-02 11:54                 ` Hannes Frederic Sowa
2016-12-02 16:59                   ` Tom Herbert
2016-12-02 18:12                     ` Hannes Frederic Sowa
2016-12-02 19:56                       ` Stephen Hemminger
2016-12-02 20:19                         ` Tom Herbert
2016-12-02 18:39             ` bpf bounded loops. Was: [flamebait] xdp Alexei Starovoitov
2016-12-02 19:25               ` Hannes Frederic Sowa
2016-12-02 19:42                 ` John Fastabend
2016-12-02 19:50                   ` Hannes Frederic Sowa
2016-12-03  0:20                   ` Alexei Starovoitov
2016-12-03  9:11                     ` Sargun Dhillon
2016-12-02 19:42                 ` Hannes Frederic Sowa
2016-12-02 23:34                   ` Alexei Starovoitov
2016-12-04 16:05                     ` [flamebait] xdp Was: " Hannes Frederic Sowa
2016-12-06  3:05                       ` Alexei Starovoitov
2016-12-06  5:08                         ` Tom Herbert
2016-12-06  6:04                           ` Alexei Starovoitov
2016-12-05 16:40                 ` Edward Cree
2016-12-05 16:50                   ` Hannes Frederic Sowa
2016-12-05 16:54                     ` Edward Cree
2016-12-06 11:35                       ` Hannes Frederic Sowa
2016-12-01 16:06   ` [flamebait] xdp, well meaning but pointless Florian Westphal
2016-12-01 16:19   ` David Miller
2016-12-01 16:51     ` Florian Westphal
2016-12-01 17:20     ` Hannes Frederic Sowa
     [not found] ` <CALx6S35R_ZStV=DbD-7Gf_y5xXqQq113_6m5p-p0GQfv46v0Ow@mail.gmail.com>
2016-12-01 18:02   ` Tom Herbert
2016-12-02 17:22 ` Jesper Dangaard Brouer
2016-12-03 16:19   ` Willem de Bruijn
2016-12-03 19:48     ` John Fastabend [this message]
2016-12-05 11:04       ` Jesper Dangaard Brouer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58432186.20006@gmail.com \
    --to=john.fastabend@gmail.com \
    --cc=brouer@redhat.com \
    --cc=fw@strlen.de \
    --cc=netdev@vger.kernel.org \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).