From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Tom Herbert <tom@herbertland.com>
Cc: Daniel Borkmann <borkmann@iogearbox.net>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
brouer@redhat.com
Subject: Re: [RFC net-next PATCH 4/5] net: new XDP feature for reading HW rxhash from drivers
Date: Mon, 22 May 2017 08:39:35 +0200 [thread overview]
Message-ID: <20170522083935.4d82174f@redhat.com> (raw)
In-Reply-To: <CALx6S34-4kQLQLkU9QRjyfZ808f_hXV4N+bh+XG8iu1BtQOUSA@mail.gmail.com>
On Sun, 21 May 2017 15:10:29 -0700
Tom Herbert <tom@herbertland.com> wrote:
> On Sun, May 21, 2017 at 9:04 AM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> > On Sat, 20 May 2017 09:16:09 -0700
> > Tom Herbert <tom@herbertland.com> wrote:
> >
> >> > +/* XDP rxhash have an associated type, which is related to the RSS
> >> > + * (Receive Side Scaling) standard, but NIC HW have different mapping
> >> > + * and support. Thus, create mapping that is interesting for XDP. XDP
> >> > + * would primarly want insight into L3 and L4 protocol info.
> >> > + *
> >> > + * TODO: Likely need to get extended with "L3_IPV6_EX" due RSS standard
> >> > + *
> >> > + * The HASH_TYPE will be returned from bpf helper as the top 32-bit of
> >> > + * the 64-bit rxhash (internally type stored in xdp_buff->flags).
> >> > + */
> >> > +#define XDP_HASH(x) ((x) & ((1ULL << 32)-1))
> >> > +#define XDP_HASH_TYPE(x) ((x) >> 32)
> >> > +
> >> > +#define XDP_HASH_TYPE_L3_SHIFT 0
> >> > +#define XDP_HASH_TYPE_L3_BITS 3
> >> > +#define XDP_HASH_TYPE_L3_MASK ((1ULL << XDP_HASH_TYPE_L3_BITS)-1)
> >> > +#define XDP_HASH_TYPE_L3(x) ((x) & XDP_HASH_TYPE_L3_MASK)
> >> > +enum {
> >> > + XDP_HASH_TYPE_L3_IPV4 = 1,
> >> > + XDP_HASH_TYPE_L3_IPV6,
> >> > +};
> >> > +
> >> > +#define XDP_HASH_TYPE_L4_SHIFT XDP_HASH_TYPE_L3_BITS
> >> > +#define XDP_HASH_TYPE_L4_BITS 5
> >> > +#define XDP_HASH_TYPE_L4_MASK \
> >> > + (((1ULL << XDP_HASH_TYPE_L4_BITS)-1) << XDP_HASH_TYPE_L4_SHIFT)
> >> > +#define XDP_HASH_TYPE_L4(x) ((x) & XDP_HASH_TYPE_L4_MASK)
> >> > +enum {
> >> > + _XDP_HASH_TYPE_L4_TCP = 1,
> >> > + _XDP_HASH_TYPE_L4_UDP,
> >> > +};
> >> > +#define XDP_HASH_TYPE_L4_TCP (_XDP_HASH_TYPE_L4_TCP << XDP_HASH_TYPE_L4_SHIFT)
> >> > +#define XDP_HASH_TYPE_L4_UDP (_XDP_HASH_TYPE_L4_UDP << XDP_HASH_TYPE_L4_SHIFT)
> >> > +
> >> Hi Jesper,
> >>
> >> Why do we need these indicators for protocol specific hash? It seems
> >> like L4 and L3 is useful differentiation and protocol agnostic (I'm
> >> still holding out hope that SCTP will be deployed some day ;-) )
> >
> > I'm not sure I understood the question fully, but let me try to answer
> > anyway. To me it seems obvious that you would want to know the
> > protocol/L4 type, as this helps avoid hash collisions between UDP and
> > TCP flows. I can easily imagine someone constructing an UDP packet
> > that could hash collide with a given TCP flow.
> >
> > And yes, i40 support matching SCTP, and we will create a
> > XDP_HASH_TYPE_L4_SCTP when adding XDP rxhash support for that driver.
> >
> But where would this information be used? We don't save it in skbuff,
> don't use it in RPS, RFS. RSS doesn't use it for packet steering so
> the hash collision problem already exists at the device level. If
> there is a collision problem between two protocols then maybe hash
> should be over 5-tuple instead...
One use-case (I heard at a customer) was that they had (web-)servers
that didn't serve any UDP traffic, thus they simply block/drop all
incoming UDP on the service NIC (as an ACL in the switch). (The servers
own DNS lookups and NTP goes through the management NIC to internal
DNS/NTP servers).
Another use-case: Inside an XDP/bpf program is can be used for
splitting protocol processing, into different tail calls, before even
touching packet-data. I can imagine the bpf TCP handling code is
larger, thus an optimization is to have a separate tail call for the
UDP protocol handling. One could also transfer/queue all TCP traffic
to other CPU(s) like RPS, just without touching packet memory.
This info is saved in the skb, but due to space constrains, it is
reduced to a single bit, namely skb->l4_hash, iif some
RSS-proto/XDP_HASH_TYPE_L4_* bit was set. And the network stack do use
and react on this.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
next prev parent reply other threads:[~2017-05-22 6:39 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-05-18 15:41 [RFC net-next PATCH 0/5] XDP driver feature API and handling change to xdp_buff Jesper Dangaard Brouer
2017-05-18 15:41 ` [RFC net-next PATCH 1/5] samples/bpf: xdp_tx_iptunnel make use of map_data[] Jesper Dangaard Brouer
2017-05-19 15:45 ` Daniel Borkmann
2017-05-18 15:41 ` [RFC net-next PATCH 2/5] mlx5: fix bug reading rss_hash_type from CQE Jesper Dangaard Brouer
2017-05-19 15:47 ` Daniel Borkmann
2017-05-19 23:38 ` David Miller
2017-05-22 18:27 ` Jesper Dangaard Brouer
2017-05-18 15:41 ` [RFC net-next PATCH 3/5] net: introduce XDP driver features interface Jesper Dangaard Brouer
2017-05-19 17:13 ` Daniel Borkmann
2017-05-19 23:37 ` David Miller
2017-05-20 7:53 ` Jesper Dangaard Brouer
2017-05-21 0:58 ` Daniel Borkmann
2017-05-22 14:49 ` Jesper Dangaard Brouer
2017-05-22 17:07 ` Daniel Borkmann
2017-05-30 9:58 ` Jesper Dangaard Brouer
2017-05-18 15:41 ` [RFC net-next PATCH 4/5] net: new XDP feature for reading HW rxhash from drivers Jesper Dangaard Brouer
2017-05-19 11:47 ` Jesper Dangaard Brouer
2017-05-20 3:07 ` Alexei Starovoitov
2017-05-20 3:21 ` Jakub Kicinski
2017-05-20 3:34 ` Alexei Starovoitov
2017-05-20 4:13 ` Jakub Kicinski
2017-05-21 15:55 ` Jesper Dangaard Brouer
2017-05-22 3:21 ` Alexei Starovoitov
2017-05-22 4:12 ` John Fastabend
2017-05-20 16:16 ` Tom Herbert
2017-05-21 16:04 ` Jesper Dangaard Brouer
2017-05-21 22:10 ` Tom Herbert
2017-05-22 6:39 ` Jesper Dangaard Brouer [this message]
2017-05-22 20:42 ` Jesper Dangaard Brouer
2017-05-22 21:32 ` Tom Herbert
2017-05-18 15:41 ` [RFC net-next PATCH 5/5] mlx5: add XDP rxhash feature for driver mlx5 Jesper Dangaard Brouer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170522083935.4d82174f@redhat.com \
--to=brouer@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=borkmann@iogearbox.net \
--cc=netdev@vger.kernel.org \
--cc=tom@herbertland.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).