netdev.vger.kernel.org archive mirror
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Tom Herbert <tom@herbertland.com>
Cc: Daniel Borkmann <borkmann@iogearbox.net>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	brouer@redhat.com
Subject: Re: [RFC net-next PATCH 4/5] net: new XDP feature for reading HW rxhash from drivers
Date: Mon, 22 May 2017 08:39:35 +0200
Message-ID: <20170522083935.4d82174f@redhat.com>
In-Reply-To: <CALx6S34-4kQLQLkU9QRjyfZ808f_hXV4N+bh+XG8iu1BtQOUSA@mail.gmail.com>

On Sun, 21 May 2017 15:10:29 -0700
Tom Herbert <tom@herbertland.com> wrote:

> On Sun, May 21, 2017 at 9:04 AM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> > On Sat, 20 May 2017 09:16:09 -0700
> > Tom Herbert <tom@herbertland.com> wrote:
> >  
> >> > +/* XDP rxhash have an associated type, which is related to the RSS
> >> > + * (Receive Side Scaling) standard, but NIC HW have different mapping
> >> > + * and support. Thus, create mapping that is interesting for XDP.  XDP
> >> > + * would primarly want insight into L3 and L4 protocol info.
> >> > + *
> >> > + * TODO: Likely need to get extended with "L3_IPV6_EX" due RSS standard
> >> > + *
> >> > + * The HASH_TYPE will be returned from bpf helper as the top 32-bit of
> >> > + * the 64-bit rxhash (internally type stored in xdp_buff->flags).
> >> > + */
> >> > +#define XDP_HASH(x)            ((x) & ((1ULL << 32)-1))
> >> > +#define XDP_HASH_TYPE(x)       ((x) >> 32)
> >> > +
> >> > +#define XDP_HASH_TYPE_L3_SHIFT 0
> >> > +#define XDP_HASH_TYPE_L3_BITS  3
> >> > +#define XDP_HASH_TYPE_L3_MASK  ((1ULL << XDP_HASH_TYPE_L3_BITS)-1)
> >> > +#define XDP_HASH_TYPE_L3(x)    ((x) & XDP_HASH_TYPE_L3_MASK)
> >> > +enum {
> >> > +       XDP_HASH_TYPE_L3_IPV4 = 1,
> >> > +       XDP_HASH_TYPE_L3_IPV6,
> >> > +};
> >> > +
> >> > +#define XDP_HASH_TYPE_L4_SHIFT XDP_HASH_TYPE_L3_BITS
> >> > +#define XDP_HASH_TYPE_L4_BITS  5
> >> > +#define XDP_HASH_TYPE_L4_MASK                                          \
> >> > +       (((1ULL << XDP_HASH_TYPE_L4_BITS)-1) << XDP_HASH_TYPE_L4_SHIFT)
> >> > +#define XDP_HASH_TYPE_L4(x)    ((x) & XDP_HASH_TYPE_L4_MASK)
> >> > +enum {
> >> > +       _XDP_HASH_TYPE_L4_TCP = 1,
> >> > +       _XDP_HASH_TYPE_L4_UDP,
> >> > +};
> >> > +#define XDP_HASH_TYPE_L4_TCP (_XDP_HASH_TYPE_L4_TCP << XDP_HASH_TYPE_L4_SHIFT)
> >> > +#define XDP_HASH_TYPE_L4_UDP (_XDP_HASH_TYPE_L4_UDP << XDP_HASH_TYPE_L4_SHIFT)
> >> > +  
> >> Hi Jesper,
> >>
> >> Why do we need these indicators for protocol specific hash? It seems
> >> like L4 and L3 is useful differentiation and protocol agnostic (I'm
> >> still holding out hope that SCTP will be deployed some day ;-) )  
> >
> > I'm not sure I understood the question fully, but let me try to answer
> > anyway.  To me it seems obvious that you would want to know the
> > protocol/L4 type, as this helps avoid hash collisions between UDP and
> > TCP flows.  I can easily imagine someone constructing an UDP packet
> > that could hash collide with a given TCP flow.
> >
> > And yes, i40 support matching SCTP, and we will create a
> > XDP_HASH_TYPE_L4_SCTP when adding XDP rxhash support for that driver.
> >  
> But where would this information be used? We don't save it in skbuff,
> don't use it in RPS, RFS. RSS doesn't use it for packet steering so
> the hash collision problem already exists at the device level. If
> there is a collision problem between two protocols then maybe hash
> should be over 5-tuple instead...

One use-case (which I heard from a customer) was that they had
(web-)servers that didn't serve any UDP traffic, thus they simply
block/drop all incoming UDP on the service NIC (as an ACL in the
switch). (The servers' own DNS lookups and NTP go through the
management NIC to internal DNS/NTP servers).

Another use-case: Inside an XDP/bpf program it can be used for
splitting protocol processing into different tail calls, before even
touching packet-data.  I can imagine the bpf TCP handling code is
larger, thus an optimization is to have a separate tail call for the
UDP protocol handling.  One could also transfer/queue all TCP traffic
to other CPU(s) like RPS, just without touching packet memory.
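
To make the second use-case concrete, here is a rough userspace sketch of
dispatching on the hash type alone.  The XDP_HASH* macros are copied from
the quoted RFC patch (they are not a merged kernel API).  The names
make_rxhash(), pick_handler(), pick_cpu() and the handler enum are
hypothetical and only stand in for the bpf helper return value, a
tail-call map index, and an RPS-like CPU choice.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as proposed in the quoted RFC patch. */
#define XDP_HASH(x)            ((x) & ((1ULL << 32)-1))
#define XDP_HASH_TYPE(x)       ((x) >> 32)

#define XDP_HASH_TYPE_L3_BITS  3
enum {
	XDP_HASH_TYPE_L3_IPV4 = 1,
	XDP_HASH_TYPE_L3_IPV6,
};

#define XDP_HASH_TYPE_L4_SHIFT XDP_HASH_TYPE_L3_BITS
#define XDP_HASH_TYPE_L4_BITS  5
#define XDP_HASH_TYPE_L4_MASK \
	(((1ULL << XDP_HASH_TYPE_L4_BITS)-1) << XDP_HASH_TYPE_L4_SHIFT)
#define XDP_HASH_TYPE_L4(x)    ((x) & XDP_HASH_TYPE_L4_MASK)
enum {
	_XDP_HASH_TYPE_L4_TCP = 1,
	_XDP_HASH_TYPE_L4_UDP,
};
#define XDP_HASH_TYPE_L4_TCP (_XDP_HASH_TYPE_L4_TCP << XDP_HASH_TYPE_L4_SHIFT)
#define XDP_HASH_TYPE_L4_UDP (_XDP_HASH_TYPE_L4_UDP << XDP_HASH_TYPE_L4_SHIFT)

/* Hypothetical handler slots, standing in for tail-call map indexes. */
enum handler { HANDLER_OTHER = 0, HANDLER_TCP, HANDLER_UDP };

/* Hypothetical: pack the type into the top 32 bits and the hash into
 * the low 32 bits, as the patch comment describes the helper return.
 */
static uint64_t make_rxhash(uint32_t type, uint32_t hash)
{
	return ((uint64_t)type << 32) | hash;
}

/* Choose a protocol handler from the hash type only; no packet data
 * is read here.
 */
static enum handler pick_handler(uint64_t rxhash)
{
	uint64_t type = XDP_HASH_TYPE(rxhash);

	switch (XDP_HASH_TYPE_L4(type)) {
	case XDP_HASH_TYPE_L4_TCP:
		return HANDLER_TCP;
	case XDP_HASH_TYPE_L4_UDP:
		return HANDLER_UDP;
	default:
		return HANDLER_OTHER;
	}
}

/* RPS-like CPU choice from the 32-bit hash value (simple modulo). */
static unsigned int pick_cpu(uint64_t rxhash, unsigned int ncpus)
{
	assert(ncpus > 0);
	return (unsigned int)(XDP_HASH(rxhash) % ncpus);
}
```

In a real bpf program pick_handler() would feed a tail call and
pick_cpu() some redirect mechanism; both are left out here as they are
outside what this patch defines.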


This info is saved in the skb, but due to space constraints, it is
reduced to a single bit, namely skb->l4_hash, which is set if and only
if some RSS-proto/XDP_HASH_TYPE_L4_* bit was set.  And the network
stack does use and react to this.
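
As a rough illustration of that reduction, a userspace sketch follows.
The struct sketch_skb and sketch_set_hash() are hypothetical stand-ins
(the kernel itself records this via skb_set_hash() with
PKT_HASH_TYPE_L4); the mask macros are again taken from the quoted RFC
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Type layout as proposed in the quoted RFC patch. */
#define XDP_HASH_TYPE_L3_BITS  3
enum {
	XDP_HASH_TYPE_L3_IPV4 = 1,
	XDP_HASH_TYPE_L3_IPV6,
};
#define XDP_HASH_TYPE_L4_SHIFT XDP_HASH_TYPE_L3_BITS
#define XDP_HASH_TYPE_L4_BITS  5
#define XDP_HASH_TYPE_L4_MASK \
	(((1ULL << XDP_HASH_TYPE_L4_BITS)-1) << XDP_HASH_TYPE_L4_SHIFT)
#define XDP_HASH_TYPE_L4(x)    ((x) & XDP_HASH_TYPE_L4_MASK)
enum {
	_XDP_HASH_TYPE_L4_TCP = 1,
	_XDP_HASH_TYPE_L4_UDP,
};
#define XDP_HASH_TYPE_L4_TCP (_XDP_HASH_TYPE_L4_TCP << XDP_HASH_TYPE_L4_SHIFT)
#define XDP_HASH_TYPE_L4_UDP (_XDP_HASH_TYPE_L4_UDP << XDP_HASH_TYPE_L4_SHIFT)

/* Hypothetical stand-in for the two skb fields involved. */
struct sketch_skb {
	uint32_t hash;
	bool l4_hash;
};

/* Store the hash and collapse the whole type down to one bit:
 * l4_hash is true only when some L4 type bit is present.
 */
static void sketch_set_hash(struct sketch_skb *skb, uint32_t hash,
			    uint64_t xdp_type)
{
	assert(skb != NULL);
	skb->hash = hash;
	skb->l4_hash = XDP_HASH_TYPE_L4(xdp_type) != 0;
}
```

Note that TCP and UDP both end up as l4_hash = true, so the protocol
distinction is lost at this point.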

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Thread overview: 31+ messages
2017-05-18 15:41 [RFC net-next PATCH 0/5] XDP driver feature API and handling change to xdp_buff Jesper Dangaard Brouer
2017-05-18 15:41 ` [RFC net-next PATCH 1/5] samples/bpf: xdp_tx_iptunnel make use of map_data[] Jesper Dangaard Brouer
2017-05-19 15:45   ` Daniel Borkmann
2017-05-18 15:41 ` [RFC net-next PATCH 2/5] mlx5: fix bug reading rss_hash_type from CQE Jesper Dangaard Brouer
2017-05-19 15:47   ` Daniel Borkmann
2017-05-19 23:38   ` David Miller
2017-05-22 18:27     ` Jesper Dangaard Brouer
2017-05-18 15:41 ` [RFC net-next PATCH 3/5] net: introduce XDP driver features interface Jesper Dangaard Brouer
2017-05-19 17:13   ` Daniel Borkmann
2017-05-19 23:37     ` David Miller
2017-05-20  7:53     ` Jesper Dangaard Brouer
2017-05-21  0:58       ` Daniel Borkmann
2017-05-22 14:49         ` Jesper Dangaard Brouer
2017-05-22 17:07           ` Daniel Borkmann
2017-05-30  9:58             ` Jesper Dangaard Brouer
2017-05-18 15:41 ` [RFC net-next PATCH 4/5] net: new XDP feature for reading HW rxhash from drivers Jesper Dangaard Brouer
2017-05-19 11:47   ` Jesper Dangaard Brouer
2017-05-20  3:07   ` Alexei Starovoitov
2017-05-20  3:21     ` Jakub Kicinski
2017-05-20  3:34       ` Alexei Starovoitov
2017-05-20  4:13         ` Jakub Kicinski
2017-05-21 15:55     ` Jesper Dangaard Brouer
2017-05-22  3:21       ` Alexei Starovoitov
2017-05-22  4:12         ` John Fastabend
2017-05-20 16:16   ` Tom Herbert
2017-05-21 16:04     ` Jesper Dangaard Brouer
2017-05-21 22:10       ` Tom Herbert
2017-05-22  6:39         ` Jesper Dangaard Brouer [this message]
2017-05-22 20:42           ` Jesper Dangaard Brouer
2017-05-22 21:32             ` Tom Herbert
2017-05-18 15:41 ` [RFC net-next PATCH 5/5] mlx5: add XDP rxhash feature for driver mlx5 Jesper Dangaard Brouer
