From mboxrd@z Thu Jan 1 00:00:00 1970
From: Patrick McHardy
Subject: Re: [PATCH 1/1] netfilter: nat: work around shared nfct struct in bridge case
Date: Wed, 31 Aug 2011 12:05:31 +0200
Message-ID: <4E5E076B.5040602@trash.net>
References: <1314710938-25342-1-git-send-email-fw@strlen.de>
 <4E5CE92B.2010006@trash.net>
 <20110830140013.GF7548@Chamillionaire.breakpoint.cc>
 <4E5CEEBC.8090305@trash.net>
 <20110830152755.GG7548@Chamillionaire.breakpoint.cc>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Cc: netfilter-devel@vger.kernel.org
To: Florian Westphal
Return-path:
Received: from stinky.trash.net ([213.144.137.162]:42955 "EHLO stinky.trash.net"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752036Ab1HaKFd
 (ORCPT ); Wed, 31 Aug 2011 06:05:33 -0400
In-Reply-To: <20110830152755.GG7548@Chamillionaire.breakpoint.cc>
Sender: netfilter-devel-owner@vger.kernel.org
List-ID:

On 30.08.2011 17:27, Florian Westphal wrote:
> Patrick McHardy wrote:
>> Yes, when using your patch, otherwise (when handling this case in
>> nf_nat_setup_info() we might invoke it multiple times simultaneously
>> though.
>>
>>> In case nf_ct_ext_add() we already return NF_ACCEPT, so I think this
>>> part is OK.
>>>
>>>> I also fear this is not
>>>> going to be the only problem caused by breaking the "unconfirmed means
>>>> non-shared nfct" assumption.
>>>
>>> Agreed. Perhaps we can solve the module dependeny issue of the "unshare"
>>> approach. In fact, if invalid state for the clones would be acceptable
>>> then the dependency should go away; AFAICS nf_conntrack_untracked is the
>>> only nf-related symbol required by br_netfilter.o not in netfilter/core.c.
>>
>> I don't think the clones should have invalid state, even untracked is
>> very questionable since all packets should have NAT applied to them in
>> the same way, connmarks might be used etc.
>
> Right, but this is probably only going to be fixable in a "try to do the
> best without crashing", because even without userspace queueing
> there are cases where this is not deterministic:
>
> -m physdev --physdev-out eth1 -j SNAT ...
> -m physdev --physdev-out eth2 -j SNAT ...
>
> ... will match whatever bridge port the packet will be sent out on
> first.

Yes, but setting up the rules properly is the responsibility of the
user. Usually you'd just have a regular NAT rule, in which case you
normally want flooded packets to be treated similarly.

> Also, before 87557c18ac36241b596984589a0889c5c4bf916c
> forward ran after pass_frame_up() in which case post_routing is
> not involved.
>
> I am afraid we might first need to find out what should happen in
> the "delivered locally and forwarded" case before we can figure
> out what a sane fix might look like.

I don't really see the problem; the user has to set up his rules
properly.

>> We probably need to restore the above mentioned assumption somehow. One
>> way would be to serialize reinjection of packets belonging to
>> unconfirmed conntracks in nf_reinject or the queueing modules. Conntrack
>> related stuff doesn't really belong there, but it seems like the easiest
>> and safest fix to me.
>
> Only serializing reinject may not be enough, since some packets might not be
> queued (e.g. when queueing only in forward, or only when dealing with
> a particular bridge port); in which case we'd still race.

True, that case has also always been broken. I don't see a way to
properly fix this right now; I need to think about it some more.
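
The non-determinism with the two physdev rules quoted above can be
illustrated with a toy model. This is only a sketch in Python, not kernel
code: the Conntrack class, the RULES table, the SNAT addresses (the
original rules elide them with "...") and the function names are all
hypothetical. It assumes that every clone of a flooded packet shares one
unconfirmed conntrack entry and that the NAT mapping is set up only once
per connection, so whichever clone reaches postrouting first decides the
mapping for all of them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Conntrack:
    """Toy stand-in for an unconfirmed conntrack entry shared by all clones."""
    snat_to: Optional[str] = None


# Hypothetical rule table: physdev-out port -> SNAT address.
# The addresses are made up (documentation range); the thread elides them.
RULES = {"eth1": "192.0.2.1", "eth2": "192.0.2.2"}


def postrouting(ct: Conntrack, out_port: str) -> Optional[str]:
    """Apply NAT for one clone leaving via out_port.

    The mapping is chosen only if none exists yet, so the first clone
    to arrive fixes it for every later clone of the same packet.
    """
    if ct.snat_to is None:
        ct.snat_to = RULES.get(out_port)
    return ct.snat_to


def flood(ports: List[str]) -> List[Optional[str]]:
    """Flood one packet to the given ports, in order, sharing one conntrack."""
    ct = Conntrack()
    return [postrouting(ct, port) for port in ports]


if __name__ == "__main__":
    # Same rules, same packet; only the order of the bridge ports differs.
    print(flood(["eth1", "eth2"]))
    print(flood(["eth2", "eth1"]))
```

In this model the result depends only on the order in which the bridge
ports are visited, which is the ordering dependence the quoted rules
expose. A single regular NAT rule that does not match on the output port
gives every clone the same mapping regardless of order.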