netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH nf-next 0/8] netfilter: untangle bridge and bridge netfilter
Date: Mon, 16 Mar 2015 14:01:10 +0100	[thread overview]
Message-ID: <20150316130110.GA1524@breakpoint.cc> (raw)
In-Reply-To: <20150316123854.GA5564@salvia>

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Sat, Mar 14, 2015 at 12:13:25PM +0100, Florian Westphal wrote:
> > I think that currently it doesn't matter what solution we'll pick in the
> > end, I'd first like to send all my refactoring patches.
> > 
> > With those patches, it reduces the nf_bridge_info content that we need
> > to the physin and physout devices, plus a two-bit state in skb.
> > 
> > Then, if you still think that ->cb is too fragile I'd be happy to add
> > the lookup table method.  For the normal case of bridge forwarding, it
> > shouldn't be needed though, perhaps we could also use a hybrid approach,
> > cb-based fastpath for forwarding, lookup-table slowpath for dnat and
> > local delivery cases.
> 
> I think the use of skb->cb introduces another dependency between
> br_netfilter and the core, which is what we should avoid.  I still
> have concerns regarding future extensibility issues that this may
> introduce:
> 
> 1) If we want to shrink the skb->cb in size, br_netfilter would limit
>    this shrink since we'll have to keep enough room for our internal
>    private information, even if most people will not need this.

Given the size of tcp and gro this shouldn't be an issue.
We'd still fit into 40 bytes cb, and, if that would be a problem we
could still go with (more complex) external storage scheme.

> 2) If inet or qdisc private info size needs to be increased for
>    whatever reason, we may run out of space to keep the br_netfilter data
>    after it. And we cannot ask David to allow us to allocate more bytes
>    for skb->cb to resolve this.

Sure, absolutely.  skb->cb increase is out of question.

I don't think its a problem though, qdisc private info could even be
reduced to 16 (2 pointers on 64bit which I think is a reasonable size to
provide for qdiscs).  Infiniband has similar hack.

And tcp cb has to embed  inet/inet6 as well.

We wouldn't even need to pull of this stunt actually, we could just zap
the skb instead of reinjection... But that would induce e.g. a syn
retransmit which I want to avoid.

> I think we have to find some permanent solution to isolate this
> br_netfilter code as much as possible. Then, focus on providing a
> replacement including the minimal set of features for what most users
> are using br_netfilter, I would say:
> 
> * nfqueue support
> * transparent proxying
> * some simple way to perform stateless NAT (mangle mac and IP address
>   to place the packet in the right interface).

Please not stateless (IP) nat, it usually bites users due to lack
of helpers...  I've no problem with mac mangling.

> so we can point them to nf_tables to progressively migrate to a better
> environment, the overall goal to me is to deprecate this br_netfilter
> thing.

Yes, but it will be a long time before we can remove this thing.

> I think we can use the per-cpu cache + global hashtable to look up for
> nf_bridge data (using skbuff as index). The forwarded traffic will
> find its nf_bridge info most of the time in the cache without having
> to follow the slow lookup path.

I'm still not convinced we even need lookups in the forward case since
the skb can never leave br_netfilter context.

And we only need to store the physin/out ports, nothing else, so its a
bit sad that we need to add such lookup stuff in the first place :-/

I agree that it would still be better than the skb->nf_bridge
indirection, though.

> > One other solution would be to use skb->cb but not store bridge device
> > pointers but the ifindexes instead (we currently don't hold any
> > device refcounts either so the current method isn't 100% safe either since
> > such device can be removed ...).
> 
> Good catch, I think it would be great to start by fixing this problem
> first.

I don't see how this is easily fixable, sorry -- it would mean we'd have
to dev_hold right after allocation of nf_bridge_info struct...

We can store ifindexes of physin/physout and re-lookup, sure.

I'll start on the central storage infrastructure.
Do you want me to base it on the current patches i have or on plain
nf-next?

Thanks.

  reply	other threads:[~2015-03-16 13:01 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-04 23:52 [PATCH nf-next 0/8] netfilter: untangle bridge and bridge netfilter Florian Westphal
2015-03-04 23:52 ` [PATCH nf-next 1/8] bridge: move mac header copying into br_netfilter Florian Westphal
2015-03-04 23:52 ` [PATCH nf-next 2/8] netfilter: bridge: move nf_bridge_update_protocol to where its used Florian Westphal
2015-03-04 23:52 ` [PATCH nf-next 3/8] netfilter: brige: move DNAT helper " Florian Westphal
2015-03-04 23:52 ` [PATCH nf-next 4/8] netfilter: bridge: refactor conditional in br_nf_dev_queue_xmit Florian Westphal
2015-03-04 23:52 ` [PATCH nf-next 5/8] net: untangle ip_fragment and bridge netfilter Florian Westphal
2015-03-04 23:52 ` [PATCH nf-next 6/8] netfilter: bridge: query conntrack about skb dnat Florian Westphal
2015-03-04 23:52 ` [PATCH nf-next 7/8] netfilter: bridge: don't use nf_bridge_info data to store mac header Florian Westphal
2015-03-04 23:52 ` [PATCH nf-next 8/8] netfilter: bridge: rename nf_bridge_info->data to dnat_orig_mac Florian Westphal
2015-03-09 13:02 ` [PATCH nf-next 0/8] netfilter: untangle bridge and bridge netfilter Pablo Neira Ayuso
2015-03-09 13:13   ` Florian Westphal
2015-03-09 16:47     ` Pablo Neira Ayuso
2015-03-09 17:16     ` David Miller
2015-03-09 17:35       ` Florian Westphal
2015-03-09 19:20         ` David Miller
2015-03-09 13:59   ` Florian Westphal
2015-03-14  9:00     ` Pablo Neira Ayuso
2015-03-14 11:13       ` Florian Westphal
2015-03-16 12:38         ` Pablo Neira Ayuso
2015-03-16 13:01           ` Florian Westphal [this message]
2015-03-16 13:47             ` Pablo Neira Ayuso
2015-03-16 13:41           ` Florian Westphal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150316130110.GA1524@breakpoint.cc \
    --to=fw@strlen.de \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).