From: "Michael S. Tsirkin" <mst@redhat.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: daniel@iogearbox.net, eric.dumazet@gmail.com, kubakici@wp.pl,
shm@cumulusnetworks.com, davem@davemloft.net,
alexei.starovoitov@gmail.com, netdev@vger.kernel.org,
bblanco@plumgrid.com, john.r.fastabend@intel.com,
brouer@redhat.com, tgraf@suug.ch
Subject: Re: [net-next PATCH v2 3/5] virtio_net: Add XDP support
Date: Mon, 28 Nov 2016 06:07:18 +0200 [thread overview]
Message-ID: <20161128060446-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <583BAAD9.4080408@gmail.com>
On Sun, Nov 27, 2016 at 07:56:09PM -0800, John Fastabend wrote:
> On 16-11-27 07:36 PM, Michael S. Tsirkin wrote:
> > On Fri, Nov 25, 2016 at 01:24:03PM -0800, John Fastabend wrote:
> >> On 16-11-22 06:58 AM, Michael S. Tsirkin wrote:
> >>> On Tue, Nov 22, 2016 at 12:27:03AM -0800, John Fastabend wrote:
> >>>> On 16-11-21 03:20 PM, Michael S. Tsirkin wrote:
> >>>>> On Sat, Nov 19, 2016 at 06:50:33PM -0800, John Fastabend wrote:
> >>>>>> From: Shrijeet Mukherjee <shrijeet@gmail.com>
> >>>>>>
> >>>>>> This adds XDP support to virtio_net. Some requirements must be
> >>>>>> met for XDP to be enabled depending on the mode. First it will
> >>>>>> only be supported with LRO disabled so that data is not pushed
> >>>>>> across multiple buffers. The MTU must be less than a page size
> >>>>>> to avoid having to handle XDP across multiple pages.
> >>>>>>
> >>>>>> If mergeable receive is enabled this first series only supports
> >>>>>> the case where header and data are in the same buf which we can
> >>>>>> check when a packet is received by looking at num_buf. If the
> >>>>>> num_buf is greater than 1 and a XDP program is loaded the packet
> >>>>>> is dropped and a warning is thrown. When any_header_sg is set this
> >>>>>> does not happen and both header and data is put in a single buffer
> >>>>>> as expected so we check this when XDP programs are loaded. Note I
> >>>>>> have only tested this with Linux vhost backend.
> >>>>>>
> >>>>>> If big packets mode is enabled and MTU/LRO conditions above are
> >>>>>> met then XDP is allowed.
> >>>>>>
> >>>>>> A follow on patch can be generated to solve the mergeable receive
> >>>>>> case with num_bufs equal to 2. Buffers greater than two may not
> >>>>>> be handled has easily.
> >>>>>
> >>>>>
> >>>>> I would very much prefer support for other layouts without drops
> >>>>> before merging this.
> >>>>> header by itself can certainly be handled by skipping it.
> >>>>> People wanted to use that e.g. for zero copy.
> >>>>
> >>>> OK fair enough I'll do this now rather than push it out.
> >>>>
> >>
> >> Hi Michael,
> >>
> >> The header skip logic however complicates the xmit handling a fair
> >> amount. Specifically when we release the buffers after xmit then
> >> both the hdr and data portions need to be released which requires
> >> some tracking.
> >
> > I thought you disable all checksum offloads so why not discard the
> > header immediately?
>
> Well in the "normal" case where the header is part of the same buffer
> we keep it to use the same space for the header on the TX path.
>
> If we discard it in the header split case we have to push the header
> somewhere else. In the skb case the cb[] region is used it looks like.
> In our case I guess free space at the end of the page could be used.
You don't have to put start of page in a buffer, you
can put an offset there. Will result in some waste in the
common case, but it's just several bytes so likely not a big deal.
> My thinking is if we handle the general case of more than one buffer
> being used with a copy we can handle the case above using the same
> logic and no need to handle it as a special case. It seems to be an odd
> case that doesn't really exist anyways. At least not in qemu/Linux. I
> have not tested anything else.
OK
> >
> >> Is the header split logic actually in use somewhere today? It looks
> >> like its not being used in Linux case. And zero copy RX is currently as
> >> best I can tell not supported anywhere so I would prefer not to
> >> complicate the XDP path at the moment with a possible future feature.
> >
> > Well it's part of the documented interface so we never
> > know who implemented it. Normally if we want to make
> > restrictions we would do the reverse and add a feature.
> >
> > We can do this easily, but I'd like to first look into
> > just handling all possible inputs as the spec asks us to.
> > I'm a bit too busy with other stuff next week but will
> > look into this a week after that if you don't beat me to it.
> >
>
> Well I've almost got it working now with some logic to copy everything
> into a single page if we hit this case so should be OK but slow. I'll
> finish testing this and send it out hopefully in the next few days.
>
> >>>>>
> >>>>> Anything else can be handled by copying the packet.
> >>
> >> Any idea how to test this? At the moment I have some code to linearize
> >> the data in all cases with more than a single buffer. But wasn't clear
> >> to me which features I could negotiate with vhost/qemu to get more than
> >> a single buffer in the receive path.
> >>
> >> Thanks,
> >> John
> >
> > ATM you need to hack qemu. Here's a hack to make header completely
> > separate.
> >
>
> Perfect! hacking qemu for testing is no problem this helps a lot thanks
> and saves me time trying to figure out how to get qemu to do this.
Pls note I didn't try this at all, so might not work, but should
give you the idea.
> >
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index b68c69d..4866144 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -1164,6 +1164,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
> > offset = n->host_hdr_len;
> > total += n->guest_hdr_len;
> > guest_offset = n->guest_hdr_len;
> > + continue;
> > } else {
> > guest_offset = 0;
> > }
> >
> >
> >
> > here's one that should cap the 1st s/g to 100 bytes:
> >
> >
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index b68c69d..7943004 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -1164,6 +1164,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
> > offset = n->host_hdr_len;
> > total += n->guest_hdr_len;
> > guest_offset = n->guest_hdr_len;
> > + sg.iov_len = MIN(sg.iov_len, 100);
> > } else {
> > guest_offset = 0;
> > }
> >
next prev parent reply other threads:[~2016-11-28 4:07 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-11-20 2:49 [net-next PATCH v2 0/5] XDP for virtio_net John Fastabend
2016-11-20 2:49 ` [net-next PATCH v2 1/5] net: virtio dynamically disable/enable LRO John Fastabend
2016-11-21 23:23 ` Michael S. Tsirkin
2016-11-22 8:16 ` John Fastabend
2016-11-20 2:50 ` [net-next PATCH v2 2/5] net: xdp: add invalid buffer warning John Fastabend
2016-11-20 2:50 ` [net-next PATCH v2 3/5] virtio_net: Add XDP support John Fastabend
2016-11-21 23:20 ` Michael S. Tsirkin
2016-11-22 8:27 ` John Fastabend
2016-11-22 14:58 ` Michael S. Tsirkin
2016-11-25 21:24 ` John Fastabend
2016-11-28 3:36 ` Michael S. Tsirkin
2016-11-28 3:56 ` John Fastabend
2016-11-28 4:07 ` Michael S. Tsirkin [this message]
2016-11-28 23:26 ` John Fastabend
2016-11-20 2:51 ` [net-next PATCH v2 4/5] virtio_net: add dedicated XDP transmit queues John Fastabend
2016-11-21 11:45 ` Daniel Borkmann
2016-11-21 15:56 ` John Fastabend
2016-11-21 23:13 ` Michael S. Tsirkin
2016-11-22 8:17 ` John Fastabend
2016-11-22 14:59 ` Michael S. Tsirkin
2016-11-20 2:51 ` [net-next PATCH v2 5/5] virtio_net: add XDP_TX support John Fastabend
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161128060446-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=bblanco@plumgrid.com \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=john.fastabend@gmail.com \
--cc=john.r.fastabend@intel.com \
--cc=kubakici@wp.pl \
--cc=netdev@vger.kernel.org \
--cc=shm@cumulusnetworks.com \
--cc=tgraf@suug.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).