From: Alexei Starovoitov <ast@plumgrid.com>
To: David Miller <davem@davemloft.net>
Cc: daniel@iogearbox.net, tgraf@suug.ch, jiri@resnulli.us,
jhs@mojatatu.com, netdev@vger.kernel.org
Subject: Re: [PATCH v3 net-next 2/2] tc: add 'needs_l2' flag to ingress qdisc
Date: Wed, 08 Apr 2015 22:20:47 -0700 [thread overview]
Message-ID: <55260C2F.608@plumgrid.com> (raw)
In-Reply-To: <5525F48F.5030108@plumgrid.com>
On 4/8/15 8:39 PM, Alexei Starovoitov wrote:
> On 4/8/15 8:14 PM, David Miller wrote:
>> From: Alexei Starovoitov <ast@plumgrid.com>
>> Date: Wed, 08 Apr 2015 20:05:13 -0700
>>
>>> I'm sure there is a way to propagate the offset into the programs.
>>> It's not about efficiency of programs, but about consistency.
>>> Programs should know nothing about kernel. Sending network offset
>>> into them is exposing this very specific kernel behavior.
>>
>> It can be performed by the data access helpers the JIT'd programs
>> have to invoke anyways.
>
> hmm, not sure what you mean.
> Let's take specific line from sockex1_kern.c:
> int index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
> this C code is compiled into
> R0 = LD_ABS_B 23
> (instruction with fixed offset)
> which is being interpreted as:
> skb_header_pointer(skb, 23, 1, buffer);
> and similar by JITs which are using doing
> r0 = *(char *)(skb->data + 23)
> in this case.
>
> Are you proposing to change semantics of LD_ABS instruction to use
> skb->head + skb->mac_header instead of skb->data in interpreter
> and in all JITs?
> Performance wise it will be ok, since JITs can cache that pointer.
> But that will be huge and very risky change.
> I'm not sure yet whether all programs will keep working afterwards.
> Is it really worth taking so much risk vs push/pull of L2?
> If you say, let's take the risk, sure, I can try hacking all the bits
> and see what the cost.
have to correct myself.
we cannot change the meaning of ld_abs, since for dgram sockets
offset is actually not pointing to L2.
af_packet is doing:
if (sk->sk_type != SOCK_DGRAM)
skb_push(skb, skb->data - skb_mac_header(skb));
res = run_filter(skb, sk, snaplen);
so not everything assumes L2. raw socket taps, ppp, team, cls_bpf
certainly want to see L2.
I still hate to see ingress and egress cls/act programs to be different.
af_packet also does:
skb = skb_share_check(skb, GFP_ATOMIC);
so Dave, would you consider early_ingress_l2() hook that doesn't need
to do skb_share_check ?
If not, the only option I have is to introduce a set of helpers
that can read from packet where offset is always starts at L2.
Then we can say that cls/act_bpf should never use ld_abs/ld_ind
instructions if they want to run on both ingress and egress and
should use these new read helpers instead.
next prev parent reply other threads:[~2015-04-09 5:20 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-08 23:26 [PATCH v3 net-next 1/2] net: introduce skb_postpush_rcsum() helper Alexei Starovoitov
2015-04-08 23:26 ` [PATCH v3 net-next 2/2] tc: add 'needs_l2' flag to ingress qdisc Alexei Starovoitov
2015-04-09 2:44 ` David Miller
2015-04-09 3:05 ` Alexei Starovoitov
2015-04-09 3:14 ` David Miller
2015-04-09 3:39 ` Alexei Starovoitov
2015-04-09 5:20 ` Alexei Starovoitov [this message]
2015-04-09 5:25 ` David Miller
2015-04-09 15:15 ` Daniel Borkmann
2015-04-09 15:36 ` Eric Dumazet
2015-04-09 17:20 ` Alexei Starovoitov
2015-04-09 10:49 ` Jamal Hadi Salim
2015-04-09 11:02 ` Jamal Hadi Salim
2015-04-09 15:38 ` Daniel Borkmann
2015-04-10 11:49 ` Jamal Hadi Salim
2015-04-09 17:03 ` Alexei Starovoitov
2015-04-10 12:48 ` Jamal Hadi Salim
2015-04-10 22:35 ` Alexei Starovoitov
2015-04-13 14:28 ` Jamal Hadi Salim
2015-04-13 16:13 ` Alexei Starovoitov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55260C2F.608@plumgrid.com \
--to=ast@plumgrid.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=netdev@vger.kernel.org \
--cc=tgraf@suug.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.