From mboxrd@z Thu Jan 1 00:00:00 1970
From: Simon Horman
Subject: Re: [PATCH] ipvs: fix oops on NAT reply in br_nf context
Date: Tue, 10 Jul 2012 17:51:45 +0900
Message-ID: <20120710085145.GA10014@verge.net.au>
References: <1341656770.8543.3.camel@chief-river-32>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Julian Anastasov, Massimo Cetra, Eric Dumazet, "David S. Miller", netdev@vger.kernel.org
To: Lin Ming
Return-path:
Received: from kirsty.vergenet.net ([202.4.237.240]:49545 "EHLO kirsty.vergenet.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754393Ab2GJIvw (ORCPT ); Tue, 10 Jul 2012 04:51:52 -0400
Content-Disposition: inline
In-Reply-To: <1341656770.8543.3.camel@chief-river-32>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Sat, Jul 07, 2012 at 06:26:10PM +0800, Lin Ming wrote:
> IPVS should not reset skb->nf_bridge in FORWARD hook
> by calling nf_reset for NAT replies. It triggers oops in
> br_nf_forward_finish.
>
> [ 579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> [ 579.781669] IP: [] br_nf_forward_finish+0x58/0x112
> [ 579.781792] PGD 218f9067 PUD 0
> [ 579.781865] Oops: 0000 [#1] SMP
> [ 579.781945] CPU 0
> [ 579.781983] Modules linked in:
> [ 579.782047]
> [ 579.782080]
> [ 579.782114] Pid: 4644, comm: qemu Tainted: G W 3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard /30E8
> [ 579.782300] RIP: 0010:[] [] br_nf_forward_finish+0x58/0x112
> [ 579.782455] RSP: 0018:ffff88007b003a98 EFLAGS: 00010287
> [ 579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
> [ 579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
> [ 579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
> [ 579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
> [ 579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
> [ 579.783177] FS: 0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
> [ 579.783306] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
> [ 579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
> [ 579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
> [ 579.783919] Stack:
> [ 579.783959] ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
> [ 579.784110] ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
> [ 579.784260] ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
> [ 579.784477] Call Trace:
> [ 579.784523]
> [ 579.784562]
> [ 579.784603] [] br_nf_forward_ip+0x275/0x2c8
> [ 579.784707] [] nf_iterate+0x47/0x7d
> [ 579.784797] [] ? br_dev_queue_push_xmit+0xae/0xae
> [ 579.784906] [] nf_hook_slow+0x6d/0x102
> [ 579.784995] [] ? br_dev_queue_push_xmit+0xae/0xae
> [ 579.785175] [] ? _raw_write_unlock_bh+0x19/0x1b
> [ 579.785179] [] __br_forward+0x97/0xa2
> [ 579.785179] [] br_handle_frame_finish+0x1a6/0x257
> [ 579.785179] [] br_nf_pre_routing_finish+0x26d/0x2cb
> [ 579.785179] [] br_nf_pre_routing+0x55d/0x5c1
> [ 579.785179] [] nf_iterate+0x47/0x7d
> [ 579.785179] [] ? br_handle_local_finish+0x44/0x44
> [ 579.785179] [] nf_hook_slow+0x6d/0x102
> [ 579.785179] [] ? br_handle_local_finish+0x44/0x44
> [ 579.785179] [] ? sky2_poll+0xb35/0xb54
> [ 579.785179] [] br_handle_frame+0x213/0x229
> [ 579.785179] [] ? br_handle_frame_finish+0x257/0x257
> [ 579.785179] [] __netif_receive_skb+0x2b4/0x3f1
> [ 579.785179] [] process_backlog+0x99/0x1e2
> [ 579.785179] [] net_rx_action+0xdf/0x242
> [ 579.785179] [] __do_softirq+0xc1/0x1e0
> [ 579.785179] [] ? trace_hardirqs_off_thunk+0x3a/0x6c
> [ 579.785179] [] call_softirq+0x1c/0x30
>
> The steps to reproduce are as follows:
>
> 1. On Host1, set up bridge br0(192.168.1.106)
> 2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
> 3. Start IPVS service on Host1
>    ipvsadm -A -t 192.168.1.106:80 -s rr
>    ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
> 4. Run apache benchmark on Host2(192.168.1.101)
>    ab -n 1000 http://192.168.1.106/
>
> ip_vs_reply4
>     ip_vs_out
>         handle_response
>             ip_vs_notrack
>                 nf_reset()
>                 {
>                     skb->nf_bridge = NULL;
>                 }
>
> Actually, IPVS wants in this case just to replace nfct
> with untracked version. So replace the nf_reset(skb) call
> in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.
>
> Signed-off-by: Lin Ming
> Signed-off-by: Julian Anastasov

Actually, I'll queue up this version for 3.5 rather than the previous
one as it has a better title.

As per my previous comment (repeated here for reference), it seems to me
that this problem has been present since 2.6.37 and thus is stable material.
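
For anyone following the thread, here is a rough user-space sketch of why the change matters. It is not the kernel code: every mock_* type and function, plus notrack_old() and notrack_new(), is a hypothetical stand-in invented for illustration, and the reference counting is simplified to plain integers. It only models the behaviour described in the quoted patch: a full reset also clears the bridge state that the bridge FORWARD path still needs, while dropping just the conntrack reference leaves it alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel structures. */
struct mock_nfct {
	int refcount;
};

struct mock_bridge_info {
	int refcount;
	int mask;
};

struct mock_skb {
	struct mock_nfct *nfct;
	struct mock_bridge_info *nf_bridge;
};

static void mock_nfct_get(struct mock_nfct *ct)
{
	if (ct)
		ct->refcount++;
}

static void mock_nfct_put(struct mock_nfct *ct)
{
	if (ct)
		ct->refcount--;
}

/* Models a full reset: conntrack AND bridge state are both dropped. */
static void mock_nf_reset(struct mock_skb *skb)
{
	mock_nfct_put(skb->nfct);
	skb->nfct = NULL;
	if (skb->nf_bridge)
		skb->nf_bridge->refcount--;
	skb->nf_bridge = NULL;
}

/* Old behaviour (assumed shape): full reset, then attach the untracked entry. */
static void notrack_old(struct mock_skb *skb, struct mock_nfct *untracked)
{
	mock_nf_reset(skb);
	skb->nfct = untracked;
	mock_nfct_get(skb->nfct);
}

/* Patched behaviour (assumed shape): release only the conntrack reference. */
static void notrack_new(struct mock_skb *skb, struct mock_nfct *untracked)
{
	mock_nfct_put(skb->nfct);
	skb->nfct = untracked;
	mock_nfct_get(skb->nfct);
}

/*
 * Stand-in for a bridge hook that relies on nf_bridge. It returns -1
 * at the point where the real code would dereference a NULL pointer.
 */
static int mock_bridge_forward_finish(const struct mock_skb *skb)
{
	if (skb->nf_bridge == NULL)
		return -1;
	return skb->nf_bridge->mask;
}
```

With an skb that carries both a conntrack entry and bridge state, notrack_old() leaves nf_bridge NULL, so the mock bridge hook reports failure. notrack_new() swaps only the conntrack pointer, and the hook still sees its bridge state.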