All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jamal Hadi Salim <jhs@mojatatu.com>
To: Alexei Starovoitov <ast@plumgrid.com>,
	"David S. Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	Jiri Pirko <jiri@resnulli.us>,
	netdev@vger.kernel.org
Subject: Re: [PATCH v2 net-next 2/2] tc: make ingress and egress qdiscs consistent
Date: Wed, 08 Apr 2015 07:43:15 -0400	[thread overview]
Message-ID: <55251453.2070503@mojatatu.com> (raw)
In-Reply-To: <1428455025-5945-2-git-send-email-ast@plumgrid.com>

Hi Alexei,

Sorry havent had the opportunity to rejoin the discussion on the other
emails (I plan to still). Let me get my head out of the sand
and respond to this one because it is a new thread.

This change is a little intrusive.
Summary: my view is a NACK for this change. It will break existing
assumptions and adds unnecessary costs.
Fix bpf_xxx. If cls_rsvp doesnt work, then it needs fixing too[1].

Longer version (to help move forward):

1) Some netdevs show up at the ingress with Link layer headers and
others dont. The assumption of doing blind pull/push is not going
to work. mirred and a few other users care where their packets
are being seen and in my opinion it is their job to handle those
cases. Besides that i am uncomfortable adding this new per-packet cost.
Why are you not able to achieve your goal using the indicators
of where the packet showed up at? Refer to the AT bit definitions
and how they are used elsewhere. Context: There are three paths the
packet sources from:
ingress, egress and packets coming from the stack (when it is
neither from ingress or egress). One of your changes assumed "if
we are not at egress" to mean "we are at ingress". It missed
the case where packets came from the stack.
The good news is that we have consistent behavior depending
where we are seeing packets, so it is easy to special case things
as mirred does.

2) Actions should be self contained if they are to be used in
conjunction with other actions in a policy. They do some small
thing and do it well. Adding semantics of checksum knowledge
in an action whose role is merely to send link level packets out
a netdev seems a little of an overkill. To be honest i am conflicted;
on one side i see the value being not much different than a netdev
that does csum recomputation - otoh i am seeing it as unnecessary
computation that only few preceding actions need. If it can be made
optional and policy defined with reasonable datapath cost (eg a single
if check when there is no interest) then i think it will be more
palatable.

3) The role of mirred is to send link layer packets. It cares about
making sure that the packets are in the proper link layer format
before sending them on the "wire". So the push is there merely to work
around cases where some netdevs have their link layers stripped off.
Anything else that cares about offsets should take of them as well.
IOW, all things you mentioned as not working need to be fixed.

cheers,
jamal

[1] I looked quickly at tests i ran against rsvp and i dont see any that
test at ingress - so i think you are right, it needs fixing (dont
have time right now, be my guest and i will test).

On 04/07/15 21:03, Alexei Starovoitov wrote:
> TC classifers and actions attached to ingress and egress qdiscs see
> inconsistent skb->data. For ingress L2 header is already pulled, whereas
> for egress it's present. Make them consistent by pushing L2 before calling
> into classifiers/actions and pulling L2 back afterwards.
>
> The following cls/act assume L2 and were broken on ingress before this commit:
> cls_bpf, act_bpf, cls_rsvp, cls_cgroup, act_csum, act_nat, all of ematch.
>
> All other in-tree cls/act use skb_network_offset() accessors for skb data
> and work regardless whether L2 was pulled or not.
>
> Since L2 is now present on ingress update act_mirred (the only action that
> was aware of L2 differences) and fix two bugs in it:
> - it was forwarding packets with L2 present to tunnel devices if used
>    with egress qdisc
> - it wasn't updating skb->csum with ingress qdisc
> Also rename 'ok_push' to 'needs_l2' to make it more readable.
>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
>

  parent reply	other threads:[~2015-04-08 11:43 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-08  1:03 [PATCH v2 net-next 1/2] net: introduce skb_postpush_rcsum() helper Alexei Starovoitov
2015-04-08  1:03 ` [PATCH v2 net-next 2/2] tc: make ingress and egress qdiscs consistent Alexei Starovoitov
2015-04-08  2:35   ` David Miller
2015-04-08  3:22     ` Alexei Starovoitov
2015-04-08  4:48       ` Alexei Starovoitov
2015-04-08  8:36         ` Daniel Borkmann
2015-04-08  9:05           ` Jiri Pirko
2015-04-08 10:54             ` Daniel Borkmann
2015-04-08 11:14               ` Daniel Borkmann
2015-04-08 11:47                 ` Jamal Hadi Salim
2015-04-08 12:31                   ` Daniel Borkmann
2015-04-08 12:58                     ` Jamal Hadi Salim
2015-04-08 13:14                       ` Thomas Graf
2015-04-08 13:27                         ` Daniel Borkmann
2015-04-08 13:34                           ` Jiri Pirko
2015-04-08 13:41                             ` Daniel Borkmann
2015-04-08 13:47                               ` Thomas Graf
2015-04-08 13:52                               ` Jamal Hadi Salim
2015-04-08 14:53                                 ` Daniel Borkmann
2015-04-08 16:26                                 ` Alexei Starovoitov
2015-04-08 16:32                                   ` David Miller
2015-04-08 16:44                                     ` Alexei Starovoitov
2015-04-08 16:54                                       ` Daniel Borkmann
2015-04-08 11:43   ` Jamal Hadi Salim [this message]
2015-04-08  7:25 ` [PATCH v2 net-next 1/2] net: introduce skb_postpush_rcsum() helper Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55251453.2070503@mojatatu.com \
    --to=jhs@mojatatu.com \
    --cc=ast@plumgrid.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=jiri@resnulli.us \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.