From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: linux-kernel@vger.kernel.org,
John Fastabend <john.fastabend@gmail.com>,
netdev@vger.kernel.org, David Miller <davem@davemloft.net>
Subject: Re: [RFC PATCH v2] ptr_ring: linked list fallback
Date: Wed, 28 Feb 2018 06:09:45 +0200
Message-ID: <20180228060845-mutt-send-email-mst@kernel.org>
In-Reply-To: <b73c7c1e-4a63-45c0-cef5-0ec8f1195eca@redhat.com>
On Wed, Feb 28, 2018 at 11:28:57AM +0800, Jason Wang wrote:
>
>
> On 2018-02-28 01:12, Michael S. Tsirkin wrote:
> > On Tue, Feb 27, 2018 at 10:29:26AM +0800, Jason Wang wrote:
> > >
> > > On 2018-02-27 04:34, Michael S. Tsirkin wrote:
> > > > On Mon, Feb 26, 2018 at 11:15:42AM +0800, Jason Wang wrote:
> > > > > On 2018-02-26 09:17, Michael S. Tsirkin wrote:
> > > > > > So pointer rings work fine, but they have a problem: make them too small
> > > > > > and not enough entries fit. Make them too large and you start flushing
> > > > > > your cache and running out of memory.
> > > > > >
> > > > > > This is a new idea of mine: a ring backed by a linked list. Once you run
> > > > > > out of ring entries, instead of a drop you fall back on a list with a
> > > > > > common lock.
> > > > > >
> > > > > > Should work well for the case where the ring is typically sized
> > > > > > > correctly, but will help address the fact that some users try to set e.g.
> > > > > > tx queue length to 1000000.
> > > > > >
> > > > > > In other words, the idea is that if a user sets a really huge TX queue
> > > > > > length, we allocate a ptr_ring which is smaller, and use the backup
> > > > > > linked list when necessary to provide the requested TX queue length
> > > > > > legitimately.
> > > > > >
> > > > > > > My hope is that this will move us closer to a point where e.g. fq_codel
> > > > > > > can use ptr rings without locking at all. The API is still very rough,
> > > > > > > and I really need to take a hard look at lock nesting.
> > > > > >
> > > > > > Compiled only, sending for early feedback/flames.
> > > > > >
> > > > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > ---
> > > > > >
> > > > > > changes from v1:
> > > > > > - added clarifications by DaveM in the commit log
> > > > > > - build fixes
> > > > > >
> > > > > > include/linux/ptr_ring.h | 64 +++++++++++++++++++++++++++++++++++++++++++++---
> > > > > > 1 file changed, 61 insertions(+), 3 deletions(-)
> > > > > >
> > > > > > diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
> > > > > > index d72b2e7..8aa8882 100644
> > > > > > --- a/include/linux/ptr_ring.h
> > > > > > +++ b/include/linux/ptr_ring.h
> > > > > > @@ -31,11 +31,18 @@
> > > > > > #include <asm/errno.h>
> > > > > > #endif
> > > > > > +/* entries must start with the following structure */
> > > > > > +struct plist {
> > > > > > > +	struct plist *next;
> > > > > > > +	struct plist *last; /* only valid in the 1st entry */
> > > > > > +};
> > > > > > So I wonder whether it would be better to do this in e.g. the skb_array
> > > > > > implementation. Then it can use its own prev/next fields.
> > > > XDP uses ptr ring directly, doesn't it?
> > > >
> > > Well, I believe the main user for this is qdisc, which uses skb_array. And we
> > > cannot use what is implemented in this patch directly for sk_buff without
> > > some changes to the data structure.
> > Why not? skb has next and prev pointers as its first two fields:
> >
> > struct sk_buff {
> > 	union {
> > 		struct {
> > 			/* These two members must be first. */
> > 			struct sk_buff *next;
> > 			struct sk_buff *prev;
> > 	...
> > }
> >
> > so it's just a question of casting to struct plist.
>
> Well, then the casting can only be done in skb_array implementation?
Why not?
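To make the cast concrete, here is a minimal userspace sketch, not code from the patch. struct fake_skb is a hypothetical stand-in for sk_buff that keeps only the two leading pointers, and the helper names fake_skb_to_plist() and plist_to_fake_skb() are made up for illustration. The static_assert lines spell out the layout assumption the cast relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the struct plist proposed in the patch. */
struct plist {
	struct plist *next;
	struct plist *last; /* only valid in the 1st entry */
};

/* Hypothetical stand-in for sk_buff: only the two leading pointers matter. */
struct fake_skb {
	struct fake_skb *next;
	struct fake_skb *prev;
	int len; /* placeholder payload, not the real sk_buff layout */
};

/* The cast is only sound while the leading pointers line up with plist. */
static_assert(offsetof(struct fake_skb, next) == offsetof(struct plist, next),
	      "next must overlay plist.next");
static_assert(offsetof(struct fake_skb, prev) == offsetof(struct plist, last),
	      "prev must overlay plist.last");

/* Hypothetical helpers an skb_array-style wrapper could use. */
static inline struct plist *fake_skb_to_plist(struct fake_skb *skb)
{
	return (struct plist *)skb;
}

static inline struct fake_skb *plist_to_fake_skb(struct plist *p)
{
	return (struct fake_skb *)p;
}
```

The kernel is built with -fno-strict-aliasing, so accessing the same memory through both types is tolerated there; in ordinary userspace C, overlaying the two through a union is the safer way to express the same thing.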
> >
> > Or we can add plist to a union:
> >
> >
> > struct sk_buff {
> > 	union {
> > 		struct {
> > 			/* These two members must be first. */
> > 			struct sk_buff *next;
> > 			struct sk_buff *prev;
> > 			union {
> > 				struct net_device *dev;
> > 				/* Some protocols might use this space to store information,
> > 				 * while device pointer would be NULL.
> > 				 * UDP receive path is one user.
> > 				 */
> > 				unsigned long dev_scratch;
> > 			};
> > 		};
> > 		struct rb_node rbnode; /* used in netem & tcp stack */
> > +		struct plist plist; /* For use with ptr_ring */
> > 	};
> >
>
> This looks ok.
>
> >
> > > For XDP, we need to embed plist in struct xdp_buff too,
> > Right - that's pretty straightforward, isn't it?
>
> Yes, but it's not clear to me this is really needed for XDP, considering the
> lock contention it brings.
>
> Thanks
The contention is only when the ring overflows into the list though.
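A rough single-threaded sketch of that point, not the patch itself: the names fb_ring, fb_produce(), fb_consume() and RING_SIZE are hypothetical, and lock_count merely counts where a real version would take the common list lock. Sending producers to the list while it is non-empty is an ordering choice made here to keep FIFO order; the patch may handle this differently.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 4 /* hypothetical; tiny so overflow is easy to trigger */

struct plist {
	struct plist *next;
	struct plist *last; /* only valid in the 1st entry */
};

struct fb_ring {
	struct plist *queue[RING_SIZE]; /* NULL slot means empty */
	int producer;
	int consumer;
	struct plist *overflow;   /* head of fallback list, NULL if empty */
	unsigned long lock_count; /* stands in for taking the list lock */
};

/* Slow path: append to the fallback list (lock would be held here). */
static void fb_list_add(struct fb_ring *r, struct plist *e)
{
	r->lock_count++;
	e->next = NULL;
	if (!r->overflow) {
		e->last = e;
		r->overflow = e;
	} else {
		r->overflow->last->next = e;
		r->overflow->last = e;
	}
}

/* Slow path: pop the list head; caller checked the list is non-empty. */
static struct plist *fb_list_del(struct fb_ring *r)
{
	struct plist *e = r->overflow;

	r->lock_count++;
	if (e->next)
		e->next->last = e->last; /* new head inherits the tail */
	r->overflow = e->next;
	return e;
}

static void fb_produce(struct fb_ring *r, struct plist *e)
{
	/* Fast path: list empty and the producer slot is free. */
	if (!r->overflow && !r->queue[r->producer]) {
		r->queue[r->producer] = e;
		r->producer = (r->producer + 1) % RING_SIZE;
		return;
	}
	fb_list_add(r, e);
}

static struct plist *fb_consume(struct fb_ring *r)
{
	struct plist *e = r->queue[r->consumer];

	if (e) { /* fast path: no lock needed */
		r->queue[r->consumer] = NULL;
		r->consumer = (r->consumer + 1) % RING_SIZE;
		return e;
	}
	return r->overflow ? fb_list_del(r) : NULL;
}
```

As long as no more than RING_SIZE entries are outstanding, lock_count never moves; it only increases once entries spill into the list and while that list is being drained.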
> > > so it looks to me
> > > that the better approach is to have separate functions for ptr_ring and
> > > skb_array.
> > >
> > > Thanks