From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Karlsson, Magnus" <magnus.karlsson@intel.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
"Björn Töpel" <bjorn.topel@gmail.com>,
"Duyck, Alexander H" <alexander.h.duyck@intel.com>,
"alexander.duyck@gmail.com" <alexander.duyck@gmail.com>,
"john.fastabend@gmail.com" <john.fastabend@gmail.com>,
"ast@fb.com" <ast@fb.com>,
"willemdebruijn.kernel@gmail.com"
<willemdebruijn.kernel@gmail.com>,
"daniel@iogearbox.net" <daniel@iogearbox.net>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"michael.lundkvist@ericsson.com" <michael.lundkvist@ericsson.com>,
"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
"Singhai, Anjali" <anjali.singhai@intel.com>,
"Zhang, Qi Z" <qi.z.zhang@intel.com>,
"ravineet.singh@ericsson.com" <ravineet.singh@ericsson.com>,
brouer@redhat.com
Subject: Re: [RFC PATCH v2 03/14] xsk: add umem fill queue support and mmap
Date: Thu, 12 Apr 2018 10:54:31 +0200
Message-ID: <20180412105431.41bd8d0a@redhat.com>
In-Reply-To: <AFED4FBCE79F3548A8F74434195ACE39588D1AA4@IRSMSX107.ger.corp.intel.com>
On Thu, 12 Apr 2018 07:38:25 +0000
"Karlsson, Magnus" <magnus.karlsson@intel.com> wrote:
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:mst@redhat.com]
> > Sent: Thursday, April 12, 2018 4:16 AM
> > To: Björn Töpel <bjorn.topel@gmail.com>
> > Cc: Karlsson, Magnus <magnus.karlsson@intel.com>; Duyck, Alexander H
> > <alexander.h.duyck@intel.com>; alexander.duyck@gmail.com;
> > john.fastabend@gmail.com; ast@fb.com; brouer@redhat.com;
> > willemdebruijn.kernel@gmail.com; daniel@iogearbox.net;
> > netdev@vger.kernel.org; michael.lundkvist@ericsson.com; Brandeburg,
> > Jesse <jesse.brandeburg@intel.com>; Singhai, Anjali
> > <anjali.singhai@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> > ravineet.singh@ericsson.com
> > Subject: Re: [RFC PATCH v2 03/14] xsk: add umem fill queue support and
> > mmap
> >
> > On Tue, Mar 27, 2018 at 06:59:08PM +0200, Björn Töpel wrote:
> > > @@ -30,4 +31,18 @@ struct xdp_umem_reg {
> > > __u32 frame_headroom; /* Frame head room */ };
> > >
> > > +/* Pgoff for mmaping the rings */
> > > +#define XDP_UMEM_PGOFF_FILL_QUEUE 0x100000000
> > > +
> > > +struct xdp_queue {
> > > + __u32 head_idx __attribute__((aligned(64)));
> > > + __u32 tail_idx __attribute__((aligned(64))); };
> > > +
> > > +/* Used for the fill and completion queues for buffers */ struct
> > > +xdp_umem_queue {
> > > + struct xdp_queue ptrs;
> > > + __u32 desc[0] __attribute__((aligned(64))); };
> > > +
> > > #endif /* _LINUX_IF_XDP_H */
> >
> > So IIUC it's a head/tail ring of 32 bit descriptors.
> >
> > In my experience (from implementing ptr_ring) this implies that head/tail
> > cache lines bounce a lot between CPUs. Caching will help some. You are also
> > forced to use barriers to check validity which is slow on some architectures.
> >
> > If instead you can use a special descriptor value (e.g. 0) as a valid signal,
> > things work much better:
> >
> > - you read descriptor atomically, if it's not 0 it's fine
> > - same with write - write 0 to pass it to the other side
> > - there is a data dependency so no need for barriers (except on dec alpha)
> > - no need for power of 2 limitations, you can make it any size you like
> > - easy to resize too
> >
> > architecture (if not implementation) would be shared with ptr_ring so some
> > of the optimization ideas like batched updates could be lifted from there.
> >
> > When I was building ptr_ring, any head/tail design underperformed storing
> > valid flag with data itself. YMMV.
I fully agree with MST here. This is also my experience. I even
dropped my own Array-based Lock-Free (ALF) queue implementation[1] in
favor of ptr_ring. (In ALF I try to amortize this cost by bulking, but
this causes the queue to become non-wait-free.)

[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/include/linux/alf_queue.h
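
To make the valid-flag design concrete, here is a minimal userspace
sketch of a single-producer/single-consumer ring in which the descriptor
value 0 marks an empty slot. It is not ptr_ring and not the proposed xsk
queue; the names flag_ring, ring_produce and ring_consume and the size
RING_SIZE are made up for illustration. It also uses C11 acquire/release
atomics for portability instead of relying on the data dependency MST
mentions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 6 /* hypothetical size; need not be a power of two */

struct flag_ring {
	_Atomic uint32_t slot[RING_SIZE]; /* 0 means "slot is empty" */
	/* Private positions, one cache line each; neither is read by the peer. */
	_Alignas(64) unsigned int prod_pos;
	_Alignas(64) unsigned int cons_pos;
};

/* Producer side: returns false when the ring is full. */
static bool ring_produce(struct flag_ring *r, uint32_t desc)
{
	assert(desc != 0); /* 0 is reserved as the "empty" marker */

	/* A non-zero slot has not been consumed yet, so the ring is full. */
	if (atomic_load_explicit(&r->slot[r->prod_pos],
				 memory_order_acquire) != 0)
		return false;

	/* Writing the descriptor is what publishes it to the consumer. */
	atomic_store_explicit(&r->slot[r->prod_pos], desc,
			      memory_order_release);
	if (++r->prod_pos == RING_SIZE)
		r->prod_pos = 0;
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_consume(struct flag_ring *r, uint32_t *desc)
{
	uint32_t v = atomic_load_explicit(&r->slot[r->cons_pos],
					  memory_order_acquire);

	if (v == 0)
		return false;
	*desc = v;

	/* Writing 0 hands the slot back to the producer. */
	atomic_store_explicit(&r->slot[r->cons_pos], 0,
			      memory_order_release);
	if (++r->cons_pos == RING_SIZE)
		r->cons_pos = 0;
	return true;
}
```

Each side keeps its own position on its own cache line, so as long as
the ring is neither full nor empty the only shared cache lines are the
ones holding the slots themselves.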
> I think you are definitely right in that there are ways in which
> we can improve performance here. That said, the current queue
> performs slightly better than the previous one we had that was
> more or less a copy of one of your first virtio 1.1 proposals
> from little over a year ago. It had bidirectional queues and a
> valid flag in the descriptor itself. The reason we abandoned this
> was not poor performance (it was good), but a need to go to
> unidirectional queues. Maybe I should have only changed that
> aspect and kept the valid flag.
>
> Anyway, I will take a look at ptr_ring and run some experiments
> along the lines of what you propose to get some
> numbers. Considering your experience with these kind of
> structures, you are likely right. I just need to convince
> myself :-).
When benchmarking, be careful that you don't measure the "wrong"
queue situation. This kind of "overload" benchmarking will likely
create a situation where the queue is always full (which hopefully
isn't a production use-case). In the almost/always full queue
situation, using the element values to sync on (as MST proposes)
will still cause the cache-line bouncing that we want to avoid.

MST explains and has addressed this situation for ptr_ring in:
 commit fb9de9704775 ("ptr_ring: batch ring zeroing")
 https://git.kernel.org/torvalds/c/fb9de9704775
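
The batching idea from that commit can be sketched roughly as follows.
This is a simplified userspace illustration and not the kernel code;
batch_ring, batch_produce, batch_consume, RING_SIZE and BATCH are
hypothetical names and values. The consumer delays writing the 0 markers
until it has consumed BATCH entries (or reached the end of the ring) and
then zeroes them in reverse order, so a producer waiting on a full ring
keeps seeing its slot as occupied until the whole batch is released.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8 /* hypothetical ring size */
#define BATCH     4 /* hypothetical number of entries released at once */

struct batch_ring {
	_Atomic uint32_t slot[RING_SIZE]; /* 0 means "slot is empty" */
	_Alignas(64) int prod_pos;        /* producer-private */
	_Alignas(64) int cons_head;       /* next slot to read */
	int cons_tail;                    /* oldest consumed slot not yet zeroed */
};

/* Producer side: returns false when the slot is still occupied. */
static bool batch_produce(struct batch_ring *r, uint32_t desc)
{
	assert(desc != 0); /* 0 is reserved as the "empty" marker */

	if (atomic_load_explicit(&r->slot[r->prod_pos],
				 memory_order_acquire) != 0)
		return false;
	atomic_store_explicit(&r->slot[r->prod_pos], desc,
			      memory_order_release);
	if (++r->prod_pos == RING_SIZE)
		r->prod_pos = 0;
	return true;
}

/* Consumer side: reads one entry, releases slots only in batches. */
static bool batch_consume(struct batch_ring *r, uint32_t *desc)
{
	int head = r->cons_head;
	int next = head + 1;
	uint32_t v = atomic_load_explicit(&r->slot[head],
					  memory_order_acquire);

	if (v == 0)
		return false;
	*desc = v;

	if (next - r->cons_tail >= BATCH || next >= RING_SIZE) {
		/* Zero newest first, so the oldest slot (the one a blocked
		 * producer is polling) is released last. */
		for (int i = head; i >= r->cons_tail; i--)
			atomic_store_explicit(&r->slot[i], 0,
					      memory_order_release);
		r->cons_tail = next;
	}
	if (next >= RING_SIZE) {
		next = 0;
		r->cons_tail = 0;
	}
	r->cons_head = next;
	return true;
}
```

With a full ring, the producer in this sketch stays blocked while the
consumer reads the first BATCH - 1 entries and only gets space back when
the batch is zeroed, which keeps the two sides off the same cache line
for most of the time.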
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer