From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: brouer@redhat.com, davem@davemloft.net,
alexei.starovoitov@gmail.com, john.fastabend@gmail.com,
peter.waskiewicz.jr@intel.com, jakub.kicinski@netronome.com,
netdev@vger.kernel.org, Andy Gospodarek <andy@greyhouse.net>
Subject: Re: [PATCH net-next 2/6] bpf: add meta pointer for direct access
Date: Tue, 26 Sep 2017 21:13:42 +0200
Message-ID: <20170926211342.0c8e72b0@redhat.com>
In-Reply-To: <458f9c13ab58abb1a15627906d03c33c42b02a7c.1506297988.git.daniel@iogearbox.net>
On Mon, 25 Sep 2017 02:25:51 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:
> This work enables generic transfer of metadata from XDP into skb. The
> basic idea is that we can make use of the fact that the resulting skb
> must be linear and already comes with a larger headroom for supporting
> bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
> on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
> for adjusting a new pointer called xdp->data_meta. Thus, the packet has
> a flexible and programmable room for meta data, followed by the actual
> packet data. struct xdp_buff is therefore laid out that we first point
> to data_hard_start, then data_meta directly prepended to data followed
> by data_end marking the end of packet. bpf_xdp_adjust_head() takes into
> account whether we have meta data already prepended and if so, memmove()s
> this along with the given offset provided there's enough room.
>
> [...] The scratch space at the head
> of the packet can be multiple of 4 byte up to 32 byte large. Drivers not
> yet supporting xdp->data_meta can simply be set up with xdp->data_meta
> as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
> such that the subsequent match against xdp->data for later access is
> guaranteed to fail.

So, xdp->data_meta is placed just before the point where the packet
data (xdp->data) starts.

I'm currently implementing a cpumap type that transfers raw XDP frames
to another CPU, where the SKB is then allocated. (It actually works
extremely well.)

To transfer the info I need, I'm currently using xdp->data_hard_start
(the top/start of the xdp page), which should be compatible with your
approach, right?

The info I need:

struct xdp_pkt {
	void *data;
	u16 len;
	u16 headroom;
	struct net_device *dev_rx;
};
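
To spell out what "compatible" would mean here: the xdp_pkt stored at
data_hard_start has to end before the metadata area begins. Below is a
minimal userspace sketch of that check, not kernel code. The names
xdp_buff_model, xdp_pkt_model and xdp_pkt_fits_before_meta() are made
up for illustration, and treating data_meta > data as "no metadata" is
an assumption based on the quoted description of drivers that do not
support xdp->data_meta yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct xdp_buff (illustration only). */
struct xdp_buff_model {
	uint8_t *data_hard_start;
	uint8_t *data_meta;
	uint8_t *data;
	uint8_t *data_end;
};

/* Same fields as xdp_pkt in this mail; dev_rx is an opaque pointer here. */
struct xdp_pkt_model {
	void *data;
	uint16_t len;
	uint16_t headroom;
	void *dev_rx;
};

/*
 * Return true when an xdp_pkt written at data_hard_start would not
 * overlap the metadata area [data_meta, data).
 */
static bool xdp_pkt_fits_before_meta(const struct xdp_buff_model *xdp)
{
	/*
	 * Assumption: data_meta > data means the driver has no metadata
	 * support, so the first used byte is simply xdp->data.
	 */
	const uint8_t *first_used =
		xdp->data_meta > xdp->data ? xdp->data : xdp->data_meta;
	size_t room;

	assert(xdp->data_hard_start <= first_used);
	room = (size_t)(first_used - xdp->data_hard_start);

	return room >= sizeof(struct xdp_pkt_model);
}
```

With 256 bytes of headroom and the maximum 32 bytes of metadata there
is plenty of room left for the struct. The check only fails when the
headroom has already been shrunk to almost nothing.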

When I enqueue the xdp packet I do the following:

int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp,
		    struct net_device *dev_rx)
{
	struct xdp_pkt *xdp_pkt;
	int headroom;

	/* Convert xdp_buff to xdp_pkt */
	headroom = xdp->data - xdp->data_hard_start;
	if (headroom < sizeof(*xdp_pkt))
		return -EOVERFLOW;
	xdp_pkt = xdp->data_hard_start;
	xdp_pkt->data = xdp->data;
	xdp_pkt->len = xdp->data_end - xdp->data;
	xdp_pkt->headroom = headroom - sizeof(*xdp_pkt);

	/* Info needed when constructing SKB on remote CPU */
	xdp_pkt->dev_rx = dev_rx;

	bq_enqueue(rcpu, xdp_pkt);
	return 0;
}

On the remote CPU, when dequeueing the packet, I'm doing the following.
As you can see, I'm still lacking some metadata that would be nice to
transfer as well. Could I use your infrastructure for that?

static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu,
					 struct xdp_pkt *xdp_pkt)
{
	unsigned int truesize;
	void *pkt_data_start;
	struct sk_buff *skb;

	/* TODO: rcpu could provide truesize, it's static per RX-ring */
	truesize = 2048;

	// pkt_data_start = xdp_pkt + sizeof(*xdp_pkt);
	pkt_data_start = xdp_pkt->data - xdp_pkt->headroom;

	/* Need to adjust "truesize" for skb_shared_info to get proper
	 * placed, to take into account that xdp_pkt is using part of
	 * headroom
	 */
	skb = build_skb(pkt_data_start, truesize - sizeof(*xdp_pkt));
	if (!skb)
		return NULL;

	skb_reserve(skb, xdp_pkt->headroom);
	__skb_put(skb, xdp_pkt->len);

	// skb_record_rx_queue(skb, rx_ring->queue_index);
	skb->protocol = eth_type_trans(skb, xdp_pkt->dev_rx);

	// How much does csum matter?
	// skb->ip_summed = CHECKSUM_UNNECESSARY; // Try to fake it...

	// Does setting skb_set_hash() matter?
	// __skb_set_hash(skb, 42, true, false); // Say it is software
	// __skb_set_hash(skb, 42, false, true); // Say it is hardware

	// Do we lack setting rx_queue... it doesn't seem to matter
	// skb_record_rx_queue(skb, 0);

	return skb;
}
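
The pointer arithmetic shared by the two functions above can be
checked in isolation. What follows is only a userspace model under
stated assumptions, not the cpumap code itself: model_enqueue() and
model_skb_head() are hypothetical names, dev_rx is left NULL, and the
struct is copied with memcpy() instead of being written in place so
that the sketch stays plain standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same fields as xdp_pkt in this mail; dev_rx is an opaque pointer here. */
struct xdp_pkt_model {
	void *data;
	uint16_t len;
	uint16_t headroom;
	void *dev_rx;
};

/*
 * Enqueue side: store the packet info at the start of the buffer.
 * Returns 0 on success, -1 when the headroom cannot hold the struct
 * (the code in this mail returns -EOVERFLOW in that case).
 */
static int model_enqueue(uint8_t *hard_start, uint8_t *data, uint8_t *data_end)
{
	struct xdp_pkt_model pkt = { 0 };
	size_t headroom;

	assert(data >= hard_start && data_end >= data);
	headroom = (size_t)(data - hard_start);
	if (headroom < sizeof(pkt))
		return -1;

	pkt.data = data;
	pkt.len = (uint16_t)(data_end - data);
	/* Headroom that remains once the struct itself is accounted for */
	pkt.headroom = (uint16_t)(headroom - sizeof(pkt));
	pkt.dev_rx = NULL;

	memcpy(hard_start, &pkt, sizeof(pkt));
	return 0;
}

/*
 * Dequeue side: compute the address that would be handed to
 * build_skb(), i.e. the first byte after the stored struct.
 */
static uint8_t *model_skb_head(const uint8_t *queued)
{
	struct xdp_pkt_model pkt;

	memcpy(&pkt, queued, sizeof(pkt));
	return (uint8_t *)pkt.data - pkt.headroom;
}
```

With data at hard_start + 256, model_skb_head() returns
hard_start + sizeof(struct xdp_pkt_model), and adding the stored
headroom to that address lands back on data. That is the property the
skb_reserve() call depends on.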

(I'll send out some patches soonish, hopefully tomorrow, to show in
more detail what I'm doing.)

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer