From mboxrd@z Thu Jan 1 00:00:00 1970 From: Patrick McHardy Subject: Re: [PATCH 6/6] netfilter: do not omit re-route check on NF_QUEUE verdict Date: Thu, 20 Jan 2011 08:53:47 +0100 Message-ID: <4D37EA0B.1080706@trash.net> References: <1295183947-12786-1-git-send-email-fw@strlen.de> <1295183947-12786-7-git-send-email-fw@strlen.de> <4D35AE71.6010301@trash.net> <20110119231435.GA956@Chamillionaire.breakpoint.cc> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit Cc: netfilter-devel@vger.kernel.org To: Florian Westphal Return-path: Received: from stinky.trash.net ([213.144.137.162]:62735 "EHLO stinky.trash.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754899Ab1ATHxt (ORCPT ); Thu, 20 Jan 2011 02:53:49 -0500 In-Reply-To: <20110119231435.GA956@Chamillionaire.breakpoint.cc> Sender: netfilter-devel-owner@vger.kernel.org List-ID: On 20.01.2011 00:14, Florian Westphal wrote: > Patrick McHardy wrote: >> On 16.01.2011 14:19, Florian Westphal wrote: >>> ret != NF_QUEUE only works in the "--queue-num 0" case; for >>> queues > 0 the test should be '(ret & NF_VERDICT_MASK) != NF_QUEUE'. >>> >>> However, NF_QUEUE no longer DROPs the skb unconditionally if queueing >>> fails (due to NF_VERDICT_FLAG_QUEUE_BYPASS verdict flag), so the >>> re-route test should also be performed if this flag is set in the >>> verdict. >>> >>> The full test would then look something like >>> >>> && ((ret & NF_VERDICT_MASK) == NF_QUEUE && (ret & NF_VERDICT_FLAG_QUEUE_BYPASS)) >>> >>> This is rather ugly, so just remove the NF_QUEUE test altogether. >>> >>> The only effect is that we might perform an unnecessary route lookup >>> in the NF_QUEUE case. >> >> Alternatively we could have nf_queue.c perform the rerouting when >> a packet is marked for queue bypass, just as it already does when >> reinjecting a packet. mangle just needs to check for NF_ACCEPT, >> since that is the only verdict we can return from the table that >> doesn't cause the packet to be dropped or queued. > > Hrm. I was looking into this, but the current logic appears to be subtly > broken, or at least it is inconsistent. > > Here is what I did: > > ip addr add 192.168.7.20/24 dev eth0 > ip addr add 192.168.8.20/24 dev eth1 > ip route add default via 192.168.7.1 > ip route add default via 192.168.8.1 table 2 > ip rule fwmark 2 lookup 2 > > iptables -t mangle -A OUTPUT -d 10.1.1.1 -j MARK --set-mark 2 > > When I run 'ping -c 1 10.1.1.1' i see the icmp packet on eth1. > > But, when I add > iptables -t mangle -A OUTPUT -d 10.1.1.1 -j NFQUEUE > > I see the packet leaving via eth0 (i am running the nfqueue test program > from libnetfilter_queue on id 0). > > When I change the rule to -j NFQUEUE --queue-id 1, it leaves via eth1 > %-) > > Here is what I think is happening: > > In the --queue-id 0 case, ipt_mangle_out() skips rerouting, because > it tests for ret != NF_QUEUE. > > But nf_queue does not re-route either: > __nf_queue calls afinfo->saveroute(), and, because the test program > does not modify the nfmark any further, afinfo->reroute does not do > anything. > > In the --queue-id 1 case, ipt_mangle_out() does re-routing because > the return value is (1 << 16|NF_QUEUE) and the nfmark is different > than it was before OUTPUT was called. > > My think it would be best to just remove the 'ret != NF_QUEUE' test > in ipt_mangle_out so we always re-route when the nfmark was modified. > > Thoughts? Indeed, that seems inconsistent and I agree with your explanation and suggested fix. I'll apply your patch once the current batch has been merged, thanks.