Re: [PATCH net-next 2/6] bpf: add meta pointer for direct access

From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: brouer@redhat.com, davem@davemloft.net,
	alexei.starovoitov@gmail.com, john.fastabend@gmail.com,
	peter.waskiewicz.jr@intel.com, jakub.kicinski@netronome.com,
	netdev@vger.kernel.org, Andy Gospodarek <andy@greyhouse.net>
Subject: Re: [PATCH net-next 2/6] bpf: add meta pointer for direct access
Date: Tue, 26 Sep 2017 21:13:42 +0200
Message-ID: <20170926211342.0c8e72b0@redhat.com>
In-Reply-To: <458f9c13ab58abb1a15627906d03c33c42b02a7c.1506297988.git.daniel@iogearbox.net>

On Mon, 25 Sep 2017 02:25:51 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:

> This work enables generic transfer of metadata from XDP into skb. The
> basic idea is that we can make use of the fact that the resulting skb
> must be linear and already comes with a larger headroom for supporting
> bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
> on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
> for adjusting a new pointer called xdp->data_meta. Thus, the packet has
> a flexible and programmable room for meta data, followed by the actual
> packet data. struct xdp_buff is therefore laid out that we first point
> to data_hard_start, then data_meta directly prepended to data followed
> by data_end marking the end of packet. bpf_xdp_adjust_head() takes into
> account whether we have meta data already prepended and if so, memmove()s
> this along with the given offset provided there's enough room.
> 
> [...] The scratch space at the head
> of the packet can be multiple of 4 byte up to 32 byte large. Drivers not
> yet supporting xdp->data_meta can simply be set up with xdp->data_meta
> as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
> such that the subsequent match against xdp->data for later access is
> guaranteed to fail.

So, xdp->data_meta is placed just before the start of the packet data
at xdp->data.
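
To make the layout from the quoted description concrete, here is a
minimal userspace sketch.  It is only a model: struct xdp_model,
model_adjust_meta() and the META_* return values are hypothetical names
invented for illustration, not the kernel's definitions.  It assumes
only what the quoted text states: the metadata sits directly before
data, its size is a multiple of 4 bytes and at most 32 bytes, and a
driver without support sets data_meta to data + 1 so that the helper
bails out.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the four pointers described in the quoted text. */
struct xdp_model {
	unsigned char *data_hard_start;	/* start of the buffer */
	unsigned char *data_meta;	/* start of metadata, <= data if supported */
	unsigned char *data;		/* start of packet data */
	unsigned char *data_end;	/* end of packet data */
};

/* Hypothetical return values; the kernel helper uses its own error codes. */
enum {
	META_OK = 0,
	META_UNSUPPORTED = -1,
	META_OUT_OF_RANGE = -2,
	META_BAD_SIZE = -3,
};

/* Model of adjusting data_meta; a negative offset grows the metadata area. */
static int model_adjust_meta(struct xdp_model *x, int offset)
{
	ptrdiff_t data_off = x->data - x->data_hard_start;
	ptrdiff_t meta_off = (x->data_meta - x->data_hard_start) + offset;
	ptrdiff_t metalen;

	/* data_meta == data + 1 is the "driver has no support" marker. */
	if (x->data_meta > x->data)
		return META_UNSUPPORTED;
	/* Metadata must stay inside [data_hard_start, data]. */
	if (meta_off < 0 || meta_off > data_off)
		return META_OUT_OF_RANGE;
	/* Size must be a multiple of 4 bytes and at most 32 bytes. */
	metalen = data_off - meta_off;
	if ((metalen % 4) != 0 || metalen > 32)
		return META_BAD_SIZE;

	x->data_meta = x->data_hard_start + meta_off;
	return META_OK;
}

/* Small usage example exercising each case. */
static void model_meta_selfcheck(void)
{
	unsigned char buf[256];
	struct xdp_model x = { buf, buf + 64, buf + 64, buf + 128 };

	assert(model_adjust_meta(&x, -8) == META_OK);		/* 8 bytes */
	assert(x.data - x.data_meta == 8);
	assert(model_adjust_meta(&x, -2) == META_BAD_SIZE);	/* 10 bytes */
	assert(model_adjust_meta(&x, -32) == META_BAD_SIZE);	/* 40 bytes */
	x.data_meta = x.data + 1;				/* unsupported */
	assert(model_adjust_meta(&x, -4) == META_UNSUPPORTED);
}
```

The offsets are computed as integers relative to data_hard_start so the
sketch never forms a pointer outside the buffer.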

I'm currently implementing a cpumap type that transfers raw XDP frames
to another CPU, where the SKB is then allocated.  (It actually works
extremely well.)

To transfer the info I need, I'm currently using the area at
xdp->data_hard_start (the top/start of the xdp page), which should be
compatible with your approach, right?

The info I need:

 struct xdp_pkt {
	void *data;
	u16 len;
	u16 headroom;
	struct net_device *dev_rx;
 };

When I enqueue the xdp packet I do the following:

 int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp,
	struct net_device *dev_rx)
 {
	struct xdp_pkt *xdp_pkt;
	int headroom;

	/* Convert xdp_buff to xdp_pkt */
	headroom = xdp->data - xdp->data_hard_start;
	if (headroom < sizeof(*xdp_pkt))
		return -EOVERFLOW;
	xdp_pkt = xdp->data_hard_start;
	xdp_pkt->data = xdp->data;
	xdp_pkt->len  = xdp->data_end - xdp->data;
	xdp_pkt->headroom = headroom - sizeof(*xdp_pkt);

	/* Info needed when constructing SKB on remote CPU */
	xdp_pkt->dev_rx = dev_rx;

	bq_enqueue(rcpu, xdp_pkt);
	return 0;
 }
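
The headroom check in cpu_map_enqueue() measures the distance from
xdp->data_hard_start to xdp->data.  Once metadata is prepended, the
bytes between data_meta and data are also in use, so the compatibility
question comes down to whether the struct at data_hard_start and the
metadata can overlap.  The following userspace sketch shows one way to
express that check.  It is an assumption about how the two could be
reconciled, not something settled in this thread, and struct xdp_model,
struct xdp_pkt_model and both helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the xdp_buff pointers. */
struct xdp_model {
	unsigned char *data_hard_start;
	unsigned char *data_meta;
	unsigned char *data;
	unsigned char *data_end;
};

/* Stand-in for struct xdp_pkt; dev_rx is an opaque pointer here. */
struct xdp_pkt_model {
	void *data;
	unsigned short len;
	unsigned short headroom;
	void *dev_rx;
};

/* First byte in use after data_hard_start: metadata if present, else data. */
static unsigned char *model_first_used(const struct xdp_model *x)
{
	/* data_meta > data marks a driver without metadata support. */
	return (x->data_meta <= x->data) ? x->data_meta : x->data;
}

/* Returns 1 if the info struct fits below both metadata and packet data. */
static int model_pkt_info_fits(const struct xdp_model *x)
{
	size_t room = (size_t)(model_first_used(x) - x->data_hard_start);

	return room >= sizeof(struct xdp_pkt_model);
}

/* Usage example: same packet, with and without 32 bytes of metadata. */
static void model_fits_selfcheck(void)
{
	unsigned char buf[256];
	const size_t need = sizeof(struct xdp_pkt_model);
	struct xdp_model x;

	x.data_hard_start = buf;
	x.data = buf + need + 28;
	x.data_end = x.data + 64;

	x.data_meta = x.data;		/* no metadata: room is need + 28 */
	assert(model_pkt_info_fits(&x));
	x.data_meta = x.data - 32;	/* 32 bytes of metadata: room is need - 4 */
	assert(!model_pkt_info_fits(&x));
	x.data_meta = x.data + 1;	/* unsupported marker: fall back to data */
	assert(model_pkt_info_fits(&x));
}
```

In the middle case a check based on data alone would still pass, which
is the situation the sketch is meant to highlight.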

On the remote CPU, when dequeueing the packet, I do the following.  As
you can see, I'm still lacking some metadata that would be nice to
transfer as well.  Could I use your infrastructure for that?

 static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu,
					  struct xdp_pkt *xdp_pkt)
 {
	unsigned int truesize;
	void *pkt_data_start;
	struct sk_buff *skb;

	/* TODO: rcpu could provide truesize, it's static per RX-ring */
	truesize = 2048;

	/* Same address as (void *)xdp_pkt + sizeof(*xdp_pkt) */
	pkt_data_start = xdp_pkt->data - xdp_pkt->headroom;

	/* Need to adjust "truesize" so that skb_shared_info gets placed
	 * properly, taking into account that xdp_pkt uses part of the
	 * headroom
	 */
	skb = build_skb(pkt_data_start, truesize - sizeof(*xdp_pkt));
	if (!skb)
		return NULL;

	skb_reserve(skb, xdp_pkt->headroom);
	__skb_put(skb, xdp_pkt->len);

	// skb_record_rx_queue(skb, rx_ring->queue_index);
	skb->protocol = eth_type_trans(skb, xdp_pkt->dev_rx);

	// How much does csum matter?
	// skb->ip_summed = CHECKSUM_UNNECESSARY; // Try to fake it...

	// Does setting skb_set_hash() matter?
	// __skb_set_hash(skb, 42, true, false); // Say it is software
	// __skb_set_hash(skb, 42, false, true); // Say it is hardware

	// Do we lack setting rx_queue?  It doesn't seem to matter.
	// skb_record_rx_queue(skb, 0);

	return skb;
 }
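
The pointer arithmetic behind pkt_data_start can be checked in
userspace.  The sketch below mirrors the bookkeeping done at enqueue
time on a plain heap buffer and confirms that data minus the stored
headroom lands exactly sizeof(struct) bytes past the start of the
buffer.  All names here (struct xdp_pkt_model, model_store(),
model_pkt_data_start()) are hypothetical stand-ins, and dev_rx is
reduced to an opaque pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct xdp_pkt. */
struct xdp_pkt_model {
	void *data;
	unsigned short len;
	unsigned short headroom;
	void *dev_rx;
};

/* Store the info struct at the buffer start; NULL if headroom is too small. */
static struct xdp_pkt_model *model_store(unsigned char *hard_start,
					 unsigned char *data,
					 unsigned char *data_end)
{
	size_t headroom = (size_t)(data - hard_start);
	struct xdp_pkt_model *pkt;

	if (headroom < sizeof(*pkt))
		return NULL;
	pkt = (struct xdp_pkt_model *)hard_start;
	pkt->data = data;
	pkt->len = (unsigned short)(data_end - data);
	/* Remaining headroom, not counting the struct itself. */
	pkt->headroom = (unsigned short)(headroom - sizeof(*pkt));
	pkt->dev_rx = NULL;
	return pkt;
}

/* Start of the area handed on for SKB construction. */
static unsigned char *model_pkt_data_start(const struct xdp_pkt_model *pkt)
{
	return (unsigned char *)pkt->data - pkt->headroom;
}

/* Usage example on a 2048-byte buffer with 256 bytes of headroom. */
static void model_start_selfcheck(void)
{
	unsigned char *buf = malloc(2048);
	struct xdp_pkt_model *pkt;

	assert(buf != NULL);
	pkt = model_store(buf, buf + 256, buf + 256 + 60);
	assert(pkt != NULL);
	/* The area starts right behind the struct, in bytes. */
	assert(model_pkt_data_start(pkt) == buf + sizeof(*pkt));
	/* Skipping the stored headroom leads back to the packet data. */
	assert(model_pkt_data_start(pkt) + pkt->headroom == buf + 256);
	/* Without a byte cast, pkt + n advances n whole structs instead. */
	assert((unsigned char *)(pkt + sizeof(*pkt)) ==
	       buf + sizeof(*pkt) * sizeof(*pkt));
	free(buf);
}
```

The last assertion shows why an uncast xdp_pkt + sizeof(*xdp_pkt) would
not give the same address: arithmetic on a struct pointer scales by the
size of the struct.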

(I'll send out some patches soonish, hopefully tomorrow, to show in
more detail what I'm doing.)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
