All of lore.kernel.org
 help / color / mirror / Atom feed
From: Smart Weblications GmbH - Florian Wiessner  <f.wiessner@smart-weblications.de>
To: Julian Anastasov <ja@ssi.bg>,
	Steffen Klassert <steffen.klassert@secunet.com>
Cc: netdev@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	stable@vger.kernel.org
Subject: Re: 3.12.33 - BUG xfrm_selector_match+0x25/0x2f6
Date: Fri, 05 Dec 2014 03:23:54 +0100	[thread overview]
Message-ID: <5481173A.9060308@smart-weblications.de> (raw)
In-Reply-To: <alpine.LFD.2.11.1412042338370.4841@ja.home.ssi.bg>

Hi,

Am 05.12.2014 00:15, schrieb Julian Anastasov:
> 
> 	Hello,
> 
> On Thu, 4 Dec 2014, Steffen Klassert wrote:
> 
>>> [16623.096721] Call Trace:
>>> [16623.096744]  <IRQ>
>>> [16623.096749]  [<ffffffff81547a7c>] ? xfrm_sk_policy_lookup+0x44/0x9b
>>> [16623.096802]  [<ffffffff81547ef7>] ? xfrm_lookup+0x91/0x446
>>> [16623.096832]  [<ffffffff81541316>] ? ip_route_me_harder+0x150/0x1b0
>>> [16623.096865]  [<ffffffffa01b6457>] ? ip_vs_route_me_harder+0x86/0x91 [ip_vs]
>>> [16623.096899]  [<ffffffffa01b797a>] ? ip_vs_out+0x2d3/0x5bc [ip_vs]
>>> [16623.096930]  [<ffffffff81501420>] ? ip_rcv_finish+0x2b8/0x2b8
>>
>> I really wonder why the xfrm_sk_policy_lookup codepath is taken here.
>> It looks like this is the processing of an inbound ipv4 packet that
>> is going to be rerouted to the output path by ipvs, so this packet
>> should not have socket context at all.
> 
> 	In above trace looks like IPVS-NAT is used between
> local client and some real server. IPVS handles this skb
> at LOCAL_IN and calls ip_vs_route_me_harder(). If we have
> skb->sk at LOCAL_IN, my first thought is about early demux.
> 
> 	If I remember correctly, looking at commit f5a41847acc535e2
> ("ipvs: move ip_route_me_harder for ICMP") that introduced
> this rerouting (2.6.37), it was needed because at that time TCP
> used rt_src from received skb to select daddr in ip_send_reply().
> As packets to server are DNAT-ed and packets to client are
> SNAT-ed we used rerouting to fill rt_src with correct IP
> after SNAT.
> 
> 	Now when routing cache is removed in 3.6 and
> tcp_v4_send_reset() is changed to provide ip_hdr(skb)->saddr
> instead of rt_src it should be safe to remove this rerouting,
> it is enough that ip_hdr(skb)->saddr was updated on IPVS-SNAT at
> LOCAL_IN. In fact, rt_src was removed early in 3.0 with
> commit 0a5ebb8000c5362 ("ipv4: Pass explicit daddr arg to 
> ip_send_reply().").
> 
> 	This is only to explain above stack. Not sure
> if problem is related somehow to early demux but such
> commits look interesting:
> 
> - commit 6b8dbcf2c44fd7a ("bridge: netfilter: orphan skb before invoking 
> ip netfilter hooks")
> 
> 	Also, it would be good to know which 3.x kernel between
> 3.13 and 3.17 fixes the problem, it will narrow the search.
> 


i tried with 3.12.33 without any XFRM and now got this one (which is reproducable):

[  233.956012] BUG: unable to handle kernel NULL pointer dereference at 00000000
                                   00000014
[  233.956218] IP: [<ffffffffa013a470>] nf_ct_seqadj_set+0x60/0x90 [nf_conntrack
                                   ]
[  233.956371] PGD 0
[  233.956493] Oops: 0000 [#1] SMP
[  233.956680] Modules linked in: netconsole xt_nat xt_multiport veth iptable_ma
                                   ngle xt_mark nf_conntrack_netlink nfnetlink
ip_vs_rr ipt_MASQUERADE iptable_nat
nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_tcpudp iptable_filter
                                    ip_tables cpufreq_ondemand cpufreq_powersave
cpufreq_conservative cpufreq_users                                    pace
ocfs2_stack_o2cb ocfs2_dlm bridge stp llc bonding fuse nf_conntrack_ftp 802
                               1q openvswitch gre vxlan xt_conntrack x_tables
ocfs2_dlmfs dlm sctp ocfs2 ocfs2_                                    nodemanager
ocfs2_stackglue configfs rbd kvm_intel kvm coretemp ip_vs_ftp ip_vs
                        nf_nat nf_conntrack psmouse i2c_i801 serio_raw lpc_ich
mfd_core evdev btrfs lzo_                                    decompress lzo_compress
[  233.960221] CPU: 2 PID: 29996 Comm: vsftpd Not tainted 3.12.33 #4
[  233.960298] Hardware name: Supermicro X9SCI/X9SCA/X9SCI/X9SCA, BIOS 1.1a 09/2
                                   8/2011
[  233.960395] task: ffff88075e87a2c0 ti: ffff8806a7444000 task.ti: ffff8806a744
                                   4000
[  233.960486] RIP: 0010:[<ffffffffa013a470>]  [<ffffffffa013a470>] nf_ct_seqadj
                                   _set+0x60/0x90 [nf_conntrack]
[  233.960632] RSP: 0018:ffff88083fc83998  EFLAGS: 00010206
[  233.960709] RAX: 000000000000000c RBX: ffff8806cab452cc RCX: 0000000000000003
[  233.960791] RDX: 0000000000000029 RSI: 0000000000000003 RDI: ffff8806cab452cc
[  233.960875] RBP: 00000000ee38035a R08: ffff8807e2b1edc0 R09: ffff88083fc839a8
[  233.960957] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
[  233.961041] R13: 0000000000000000 R14: 0000000000000003 R15: ffff8806a75a50bc
[  233.961124] FS:  00007ff22daec700(0000) GS:ffff88083fc80000(0000) knlGS:00000
                                   00000000000
[  233.961226] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  233.961303] CR2: 0000000000000014 CR3: 00000006b3259000 CR4: 00000000000407e0
[  233.961384] Stack:
[  233.961460]  ffff880815612b60 0000000000000012 0000000000000014 ffff8806cab45
                                   2c8
[  233.961776]  ffff8806a75a5001 ffffffffa014f681 0000000000000000 ffffffff00000
                                   045
[  233.962095]  ffff880800000048 0000001b00000003 ffff88083fc83a70 ffff880815612
                                   b60
[  233.962411] Call Trace:
[  233.962482]  <IRQ>
[  233.962538]  [<ffffffffa014f681>] ? __nf_nat_mangle_tcp_packet+0x109/0x120 [n
                                   f_nat]
[  233.962762]  [<ffffffffa017749e>] ? ip_vs_ftp_out.part.8+0x2b2/0x338 [ip_vs_f
                                   tp]
[  233.962866]  [<ffffffff814cb8c0>] ? __domain_mapping+0x25d/0x2a3
[  233.962949]  [<ffffffff8154140c>] ? fib_table_lookup+0xe4/0x255
[  233.963032]  [<ffffffffa015f858>] ? ip_vs_app_pkt_out+0x105/0x18b [ip_vs]
[  233.963110]  [<ffffffffa0162ffc>] ? tcp_snat_handler+0x6b/0x320 [ip_vs]
[  233.963189]  [<ffffffffa0155d3d>] ? ip_vs_conn_out_get_proto+0x1c/0x25 [ip_vs
                                   ]
[  233.963284]  [<ffffffffa0158937>] ? ip_vs_out+0x290/0x5bc [ip_vs]
[  233.963362]  [<ffffffff8150f544>] ? ip_frag_mem+0x2a/0x2a
[  233.963442]  [<ffffffff81508e1f>] ? nf_iterate+0x42/0x80
[  233.963519]  [<ffffffff81508ec6>] ? nf_hook_slow+0x69/0xff
[  233.963595]  [<ffffffff8150f544>] ? ip_frag_mem+0x2a/0x2a
[  233.963667]  [<ffffffff8150f8ae>] ? ip_forward+0x22d/0x2cf
[  233.963744]  [<ffffffff814e57ce>] ? __netif_receive_skb_core+0x5f0/0x66c
[  233.963826]  [<ffffffff814e59df>] ? process_backlog+0x13e/0x13e
[  233.963911]  [<ffffffffa0455e09>] ? br_handle_frame_finish+0x382/0x382 [bridg
                                   e]
[  233.964008]  [<ffffffff814e5a2b>] ? netif_receive_skb+0x4c/0x7d
[  233.964090]  [<ffffffffa0455d95>] ? br_handle_frame_finish+0x30e/0x382 [bridg
                                   e]
[  233.964186]  [<ffffffffa0455fda>] ? br_handle_frame+0x1d1/0x217 [bridge]
[  233.964267]  [<ffffffff814e567d>] ? __netif_receive_skb_core+0x49f/0x66c
[  233.964350]  [<ffffffff814e592b>] ? process_backlog+0x8a/0x13e
[  233.964429]  [<ffffffff814e5c31>] ? net_rx_action+0xa2/0x1c0
[  233.964508]  [<ffffffff81047e2e>] ? __do_softirq+0xf6/0x24f
[  233.964588]  [<ffffffff8106cbfd>] ? account_system_time+0x10f/0x169
[  233.964669]  [<ffffffff815ad7dc>] ? call_softirq+0x1c/0x30
[  233.964743]  <EOI>
[  233.964801]  [<ffffffff8100464d>] ? do_softirq+0x2c/0x5f
[  233.965013]  [<ffffffff81047ca1>] ? local_bh_enable+0x67/0x85
[  233.965088]  [<ffffffff81511689>] ? ip_finish_output+0x2c9/0x322
[  233.965165]  [<ffffffff8151240a>] ? ip_queue_xmit+0x2b7/0x2f0
[  233.965239]  [<ffffffff81524772>] ? tcp_transmit_skb+0x6ef/0x755
[  233.965316]  [<ffffffff815250e8>] ? tcp_write_xmit+0x886/0x9cb
[  233.965391]  [<ffffffff8152527a>] ? __tcp_push_pending_frames+0x24/0x7e
[  233.965473]  [<ffffffff8151a33c>] ? tcp_sendmsg+0xa4c/0xbfc
[  233.965550]  [<ffffffff814d3477>] ? sock_aio_write+0xe3/0xfd
[  233.965631]  [<ffffffff81122f4d>] ? do_sync_write+0x59/0x79
[  233.965709]  [<ffffffff811239e3>] ? vfs_write+0xc4/0x182
[  233.965786]  [<ffffffff81123daf>] ? SyS_write+0x45/0x7c
[  233.965864]  [<ffffffff815ac35b>] ? tracesys+0xdd/0xe2
[  233.965940] Code: 68 14 4d 01 c5 45 85 e4 74 46 f0 80 4f 78 40 48 8d 5f 04 48
                                    89 df e8 00 12 47 e1 31 c0 41 83 fe 02 0f 97
c0 48 6b c0 0c 4c 01 e8 <8b> 70 08                                     39 70 04
74 08 89 ea 0f ca 39 10 79 0d 89 70 04 44 01
[  233.969602] RIP  [<ffffffffa013a470>] nf_ct_seqadj_set+0x60/0x90 [nf_conntrac
                                   k]
[  233.969746]  RSP <ffff88083fc83998>
[  233.969816] CR2: 0000000000000014
[  233.969919] ---[ end trace c6faf7aa989b11c2 ]---
[  233.969999] Kernel panic - not syncing: Fatal exception in interrupt
[  233.970081] Rebooting in 10 seconds..
[  244.029931] ACPI MEMORY or I/O RESET_REG.


node01:/ocfs2/usr/src/linux-3.12.33/scripts# ./decodecode < /tmp/oops-ipvsftp.txt
[ 233.965940] Code: 68 14 4d 01 c5 45 85 e4 74 46 f0 80 4f 78 40 48 8d 5f 04 48
89 df e8 00 12 47 e1 31 c0 41 83 fe 02 0f 97 c0 48 6b c0 0c 4c 01 e8 <8b> 70 08
39 70 04 74 08 89 ea 0f ca 39 10 79 0d 89 70 04 44 01
All code
========
   0:   68 14 4d 01 c5          pushq  $0xffffffffc5014d14
   5:   45 85 e4                test   %r12d,%r12d
   8:   74 46                   je     0x50
   a:   f0 80 4f 78 40          lock orb $0x40,0x78(%rdi)
   f:   48 8d 5f 04             lea    0x4(%rdi),%rbx
  13:   48 89 df                mov    %rbx,%rdi
  16:   e8 00 12 47 e1          callq  0xffffffffe147121b
  1b:   31 c0                   xor    %eax,%eax
  1d:   41 83 fe 02             cmp    $0x2,%r14d
  21:   0f 97 c0                seta   %al
  24:   48 6b c0 0c             imul   $0xc,%rax,%rax
  28:   4c 01 e8                add    %r13,%rax
  2b:*  8b 70 08                mov    0x8(%rax),%esi           <-- trapping
instruction
  2e:   39 70 04                cmp    %esi,0x4(%rax)
  31:   74 08                   je     0x3b
  33:   89 ea                   mov    %ebp,%edx
  35:   0f ca                   bswap  %edx
  37:   39 10                   cmp    %edx,(%rax)
  39:   79 0d                   jns    0x48
  3b:   89 70 04                mov    %esi,0x4(%rax)
  3e:   44                      rex.R
  3f:   01                      .byte 0x1

Code starting with the faulting instruction
===========================================
   0:   8b 70 08                mov    0x8(%rax),%esi
   3:   39 70 04                cmp    %esi,0x4(%rax)
   6:   74 08                   je     0x10
   8:   89 ea                   mov    %ebp,%edx
   a:   0f ca                   bswap  %edx
   c:   39 10                   cmp    %edx,(%rax)
   e:   79 0d                   jns    0x1d
  10:   89 70 04                mov    %esi,0x4(%rax)
  13:   44                      rex.R
  14:   01                      .byte 0x1


setup is like this:


#virtual=<myVIP>:21
#       real=10.10.1.20:21 masq
#       real=10.10.1.21:21 masq
#       real=10.10.1.22:21 masq
#       real=10.10.1.23:21 masq
#       persistent=600
#       service=ftp
#       scheduler=rr
#       protocol=tcp
#       checktype=connect

( i remarked it to prevent fruther crashes...)

when ip_vs_ftp is loaded and someone trying to make a ftp connection, the system
panics instantly.

10.10.1.20 - 10.10.1.23 are lxc-containers using veth connected to the bridge
running on 4 different nodes. The node running ldirector/ipvsadm has also one of
those containers running (don't know if that matters)

brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.00259052bbf4       no              bond0
                                                        vethMKELUc
                                                        vethXdWGqf
                                                        vethgJMmEb
                                                        vethmKNqFc


I disabled the ftp server lxc container on the node doing ip_vs, so that the
endpoint of the connection is not on the same node and tried again but with the
same result.

Unfortunatelly i cannot test with newer kernels than 3.12, because ocfs2 is
somehow broken in >= 3.14


-- 

Mit freundlichen Grüßen,

Florian Wiessner

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Sitz der Gesellschaft: Naila
Geschäftsführer: Florian Wiessner
HRB-Nr.: HRB 3840 Amtsgericht Hof
*aus dem dt. Festnetz, ggf. abweichende Preise aus dem Mobilfunknetz

WARNING: multiple messages have this Message-ID (diff)
From: Smart Weblications GmbH - Florian Wiessner <f.wiessner@smart-weblications.de>
To: Julian Anastasov <ja@ssi.bg>,
	Steffen Klassert <steffen.klassert@secunet.com>
Cc: netdev@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	stable@vger.kernel.org
Subject: Re: 3.12.33 - BUG xfrm_selector_match+0x25/0x2f6
Date: Fri, 05 Dec 2014 03:23:54 +0100	[thread overview]
Message-ID: <5481173A.9060308@smart-weblications.de> (raw)
In-Reply-To: <alpine.LFD.2.11.1412042338370.4841@ja.home.ssi.bg>

Hi,

Am 05.12.2014 00:15, schrieb Julian Anastasov:
> 
> 	Hello,
> 
> On Thu, 4 Dec 2014, Steffen Klassert wrote:
> 
>>> [16623.096721] Call Trace:
>>> [16623.096744]  <IRQ>
>>> [16623.096749]  [<ffffffff81547a7c>] ? xfrm_sk_policy_lookup+0x44/0x9b
>>> [16623.096802]  [<ffffffff81547ef7>] ? xfrm_lookup+0x91/0x446
>>> [16623.096832]  [<ffffffff81541316>] ? ip_route_me_harder+0x150/0x1b0
>>> [16623.096865]  [<ffffffffa01b6457>] ? ip_vs_route_me_harder+0x86/0x91 [ip_vs]
>>> [16623.096899]  [<ffffffffa01b797a>] ? ip_vs_out+0x2d3/0x5bc [ip_vs]
>>> [16623.096930]  [<ffffffff81501420>] ? ip_rcv_finish+0x2b8/0x2b8
>>
>> I really wonder why the xfrm_sk_policy_lookup codepath is taken here.
>> It looks like this is the processing of an inbound ipv4 packet that
>> is going to be rerouted to the output path by ipvs, so this packet
>> should not have socket context at all.
> 
> 	In above trace looks like IPVS-NAT is used between
> local client and some real server. IPVS handles this skb
> at LOCAL_IN and calls ip_vs_route_me_harder(). If we have
> skb->sk at LOCAL_IN, my first thought is about early demux.
> 
> 	If I remember correctly, looking at commit f5a41847acc535e2
> ("ipvs: move ip_route_me_harder for ICMP") that introduced
> this rerouting (2.6.37), it was needed because at that time TCP
> used rt_src from received skb to select daddr in ip_send_reply().
> As packets to server are DNAT-ed and packets to client are
> SNAT-ed we used rerouting to fill rt_src with correct IP
> after SNAT.
> 
> 	Now when routing cache is removed in 3.6 and
> tcp_v4_send_reset() is changed to provide ip_hdr(skb)->saddr
> instead of rt_src it should be safe to remove this rerouting,
> it is enough that ip_hdr(skb)->saddr was updated on IPVS-SNAT at
> LOCAL_IN. In fact, rt_src was removed early in 3.0 with
> commit 0a5ebb8000c5362 ("ipv4: Pass explicit daddr arg to 
> ip_send_reply().").
> 
> 	This is only to explain above stack. Not sure
> if problem is related somehow to early demux but such
> commits look interesting:
> 
> - commit 6b8dbcf2c44fd7a ("bridge: netfilter: orphan skb before invoking 
> ip netfilter hooks")
> 
> 	Also, it would be good to know which 3.x kernel between
> 3.13 and 3.17 fixes the problem, it will narrow the search.
> 


i tried with 3.12.33 without any XFRM and now got this one (which is reproducable):

[  233.956012] BUG: unable to handle kernel NULL pointer dereference at 00000000
                                   00000014
[  233.956218] IP: [<ffffffffa013a470>] nf_ct_seqadj_set+0x60/0x90 [nf_conntrack
                                   ]
[  233.956371] PGD 0
[  233.956493] Oops: 0000 [#1] SMP
[  233.956680] Modules linked in: netconsole xt_nat xt_multiport veth iptable_ma
                                   ngle xt_mark nf_conntrack_netlink nfnetlink
ip_vs_rr ipt_MASQUERADE iptable_nat
nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_tcpudp iptable_filter
                                    ip_tables cpufreq_ondemand cpufreq_powersave
cpufreq_conservative cpufreq_users                                    pace
ocfs2_stack_o2cb ocfs2_dlm bridge stp llc bonding fuse nf_conntrack_ftp 802
                               1q openvswitch gre vxlan xt_conntrack x_tables
ocfs2_dlmfs dlm sctp ocfs2 ocfs2_                                    nodemanager
ocfs2_stackglue configfs rbd kvm_intel kvm coretemp ip_vs_ftp ip_vs
                        nf_nat nf_conntrack psmouse i2c_i801 serio_raw lpc_ich
mfd_core evdev btrfs lzo_                                    decompress lzo_compress
[  233.960221] CPU: 2 PID: 29996 Comm: vsftpd Not tainted 3.12.33 #4
[  233.960298] Hardware name: Supermicro X9SCI/X9SCA/X9SCI/X9SCA, BIOS 1.1a 09/2
                                   8/2011
[  233.960395] task: ffff88075e87a2c0 ti: ffff8806a7444000 task.ti: ffff8806a744
                                   4000
[  233.960486] RIP: 0010:[<ffffffffa013a470>]  [<ffffffffa013a470>] nf_ct_seqadj
                                   _set+0x60/0x90 [nf_conntrack]
[  233.960632] RSP: 0018:ffff88083fc83998  EFLAGS: 00010206
[  233.960709] RAX: 000000000000000c RBX: ffff8806cab452cc RCX: 0000000000000003
[  233.960791] RDX: 0000000000000029 RSI: 0000000000000003 RDI: ffff8806cab452cc
[  233.960875] RBP: 00000000ee38035a R08: ffff8807e2b1edc0 R09: ffff88083fc839a8
[  233.960957] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
[  233.961041] R13: 0000000000000000 R14: 0000000000000003 R15: ffff8806a75a50bc
[  233.961124] FS:  00007ff22daec700(0000) GS:ffff88083fc80000(0000) knlGS:00000
                                   00000000000
[  233.961226] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  233.961303] CR2: 0000000000000014 CR3: 00000006b3259000 CR4: 00000000000407e0
[  233.961384] Stack:
[  233.961460]  ffff880815612b60 0000000000000012 0000000000000014 ffff8806cab45
                                   2c8
[  233.961776]  ffff8806a75a5001 ffffffffa014f681 0000000000000000 ffffffff00000
                                   045
[  233.962095]  ffff880800000048 0000001b00000003 ffff88083fc83a70 ffff880815612
                                   b60
[  233.962411] Call Trace:
[  233.962482]  <IRQ>
[  233.962538]  [<ffffffffa014f681>] ? __nf_nat_mangle_tcp_packet+0x109/0x120 [n
                                   f_nat]
[  233.962762]  [<ffffffffa017749e>] ? ip_vs_ftp_out.part.8+0x2b2/0x338 [ip_vs_f
                                   tp]
[  233.962866]  [<ffffffff814cb8c0>] ? __domain_mapping+0x25d/0x2a3
[  233.962949]  [<ffffffff8154140c>] ? fib_table_lookup+0xe4/0x255
[  233.963032]  [<ffffffffa015f858>] ? ip_vs_app_pkt_out+0x105/0x18b [ip_vs]
[  233.963110]  [<ffffffffa0162ffc>] ? tcp_snat_handler+0x6b/0x320 [ip_vs]
[  233.963189]  [<ffffffffa0155d3d>] ? ip_vs_conn_out_get_proto+0x1c/0x25 [ip_vs
                                   ]
[  233.963284]  [<ffffffffa0158937>] ? ip_vs_out+0x290/0x5bc [ip_vs]
[  233.963362]  [<ffffffff8150f544>] ? ip_frag_mem+0x2a/0x2a
[  233.963442]  [<ffffffff81508e1f>] ? nf_iterate+0x42/0x80
[  233.963519]  [<ffffffff81508ec6>] ? nf_hook_slow+0x69/0xff
[  233.963595]  [<ffffffff8150f544>] ? ip_frag_mem+0x2a/0x2a
[  233.963667]  [<ffffffff8150f8ae>] ? ip_forward+0x22d/0x2cf
[  233.963744]  [<ffffffff814e57ce>] ? __netif_receive_skb_core+0x5f0/0x66c
[  233.963826]  [<ffffffff814e59df>] ? process_backlog+0x13e/0x13e
[  233.963911]  [<ffffffffa0455e09>] ? br_handle_frame_finish+0x382/0x382 [bridg
                                   e]
[  233.964008]  [<ffffffff814e5a2b>] ? netif_receive_skb+0x4c/0x7d
[  233.964090]  [<ffffffffa0455d95>] ? br_handle_frame_finish+0x30e/0x382 [bridg
                                   e]
[  233.964186]  [<ffffffffa0455fda>] ? br_handle_frame+0x1d1/0x217 [bridge]
[  233.964267]  [<ffffffff814e567d>] ? __netif_receive_skb_core+0x49f/0x66c
[  233.964350]  [<ffffffff814e592b>] ? process_backlog+0x8a/0x13e
[  233.964429]  [<ffffffff814e5c31>] ? net_rx_action+0xa2/0x1c0
[  233.964508]  [<ffffffff81047e2e>] ? __do_softirq+0xf6/0x24f
[  233.964588]  [<ffffffff8106cbfd>] ? account_system_time+0x10f/0x169
[  233.964669]  [<ffffffff815ad7dc>] ? call_softirq+0x1c/0x30
[  233.964743]  <EOI>
[  233.964801]  [<ffffffff8100464d>] ? do_softirq+0x2c/0x5f
[  233.965013]  [<ffffffff81047ca1>] ? local_bh_enable+0x67/0x85
[  233.965088]  [<ffffffff81511689>] ? ip_finish_output+0x2c9/0x322
[  233.965165]  [<ffffffff8151240a>] ? ip_queue_xmit+0x2b7/0x2f0
[  233.965239]  [<ffffffff81524772>] ? tcp_transmit_skb+0x6ef/0x755
[  233.965316]  [<ffffffff815250e8>] ? tcp_write_xmit+0x886/0x9cb
[  233.965391]  [<ffffffff8152527a>] ? __tcp_push_pending_frames+0x24/0x7e
[  233.965473]  [<ffffffff8151a33c>] ? tcp_sendmsg+0xa4c/0xbfc
[  233.965550]  [<ffffffff814d3477>] ? sock_aio_write+0xe3/0xfd
[  233.965631]  [<ffffffff81122f4d>] ? do_sync_write+0x59/0x79
[  233.965709]  [<ffffffff811239e3>] ? vfs_write+0xc4/0x182
[  233.965786]  [<ffffffff81123daf>] ? SyS_write+0x45/0x7c
[  233.965864]  [<ffffffff815ac35b>] ? tracesys+0xdd/0xe2
[  233.965940] Code: 68 14 4d 01 c5 45 85 e4 74 46 f0 80 4f 78 40 48 8d 5f 04 48
                                    89 df e8 00 12 47 e1 31 c0 41 83 fe 02 0f 97
c0 48 6b c0 0c 4c 01 e8 <8b> 70 08                                     39 70 04
74 08 89 ea 0f ca 39 10 79 0d 89 70 04 44 01
[  233.969602] RIP  [<ffffffffa013a470>] nf_ct_seqadj_set+0x60/0x90 [nf_conntrac
                                   k]
[  233.969746]  RSP <ffff88083fc83998>
[  233.969816] CR2: 0000000000000014
[  233.969919] ---[ end trace c6faf7aa989b11c2 ]---
[  233.969999] Kernel panic - not syncing: Fatal exception in interrupt
[  233.970081] Rebooting in 10 seconds..
[  244.029931] ACPI MEMORY or I/O RESET_REG.


node01:/ocfs2/usr/src/linux-3.12.33/scripts# ./decodecode < /tmp/oops-ipvsftp.txt
[ 233.965940] Code: 68 14 4d 01 c5 45 85 e4 74 46 f0 80 4f 78 40 48 8d 5f 04 48
89 df e8 00 12 47 e1 31 c0 41 83 fe 02 0f 97 c0 48 6b c0 0c 4c 01 e8 <8b> 70 08
39 70 04 74 08 89 ea 0f ca 39 10 79 0d 89 70 04 44 01
All code
========
   0:   68 14 4d 01 c5          pushq  $0xffffffffc5014d14
   5:   45 85 e4                test   %r12d,%r12d
   8:   74 46                   je     0x50
   a:   f0 80 4f 78 40          lock orb $0x40,0x78(%rdi)
   f:   48 8d 5f 04             lea    0x4(%rdi),%rbx
  13:   48 89 df                mov    %rbx,%rdi
  16:   e8 00 12 47 e1          callq  0xffffffffe147121b
  1b:   31 c0                   xor    %eax,%eax
  1d:   41 83 fe 02             cmp    $0x2,%r14d
  21:   0f 97 c0                seta   %al
  24:   48 6b c0 0c             imul   $0xc,%rax,%rax
  28:   4c 01 e8                add    %r13,%rax
  2b:*  8b 70 08                mov    0x8(%rax),%esi           <-- trapping
instruction
  2e:   39 70 04                cmp    %esi,0x4(%rax)
  31:   74 08                   je     0x3b
  33:   89 ea                   mov    %ebp,%edx
  35:   0f ca                   bswap  %edx
  37:   39 10                   cmp    %edx,(%rax)
  39:   79 0d                   jns    0x48
  3b:   89 70 04                mov    %esi,0x4(%rax)
  3e:   44                      rex.R
  3f:   01                      .byte 0x1

Code starting with the faulting instruction
===========================================
   0:   8b 70 08                mov    0x8(%rax),%esi
   3:   39 70 04                cmp    %esi,0x4(%rax)
   6:   74 08                   je     0x10
   8:   89 ea                   mov    %ebp,%edx
   a:   0f ca                   bswap  %edx
   c:   39 10                   cmp    %edx,(%rax)
   e:   79 0d                   jns    0x1d
  10:   89 70 04                mov    %esi,0x4(%rax)
  13:   44                      rex.R
  14:   01                      .byte 0x1


setup is like this:


#virtual=<myVIP>:21
#       real=10.10.1.20:21 masq
#       real=10.10.1.21:21 masq
#       real=10.10.1.22:21 masq
#       real=10.10.1.23:21 masq
#       persistent=600
#       service=ftp
#       scheduler=rr
#       protocol=tcp
#       checktype=connect

( i remarked it to prevent fruther crashes...)

when ip_vs_ftp is loaded and someone trying to make a ftp connection, the system
panics instantly.

10.10.1.20 - 10.10.1.23 are lxc-containers using veth connected to the bridge
running on 4 different nodes. The node running ldirector/ipvsadm has also one of
those containers running (don't know if that matters)

brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.00259052bbf4       no              bond0
                                                        vethMKELUc
                                                        vethXdWGqf
                                                        vethgJMmEb
                                                        vethmKNqFc


I disabled the ftp server lxc container on the node doing ip_vs, so that the
endpoint of the connection is not on the same node and tried again but with the
same result.

Unfortunatelly i cannot test with newer kernels than 3.12, because ocfs2 is
somehow broken in >= 3.14


-- 

Mit freundlichen Gr��en,

Florian Wiessner

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Sitz der Gesellschaft: Naila
Gesch�ftsf�hrer: Florian Wiessner
HRB-Nr.: HRB 3840 Amtsgericht Hof
*aus dem dt. Festnetz, ggf. abweichende Preise aus dem Mobilfunknetz

  reply	other threads:[~2014-12-05  2:23 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-03 14:55 3.12.33 - BUG xfrm_selector_match+0x25/0x2f6 Smart Weblications GmbH - Florian Wiessner
2014-12-03 14:55 ` Smart Weblications GmbH - Florian Wiessner
2014-12-04  7:56 ` Steffen Klassert
2014-12-04 16:36   ` Smart Weblications GmbH - Florian Wiessner
2014-12-04 16:36     ` Smart Weblications GmbH - Florian Wiessner
2014-12-05 10:43     ` Steffen Klassert
2014-12-04 23:15   ` Julian Anastasov
2014-12-05  2:23     ` Smart Weblications GmbH - Florian Wiessner [this message]
2014-12-05  2:23       ` Smart Weblications GmbH - Florian Wiessner
2014-12-05  9:55       ` Julian Anastasov
2014-12-05 13:55         ` Smart Weblications GmbH - Florian Wiessner
2014-12-05 13:55           ` Smart Weblications GmbH - Florian Wiessner
2014-12-05 21:32           ` Julian Anastasov
2014-12-07 22:04             ` Smart Weblications GmbH - Florian Wiessner
2014-12-07 18:27           ` Julian Anastasov
2014-12-08 11:19             ` Smart Weblications GmbH - Florian Wiessner
2014-12-08 11:19               ` Smart Weblications GmbH - Florian Wiessner
2014-12-08 20:40               ` Julian Anastasov
2014-12-09 10:23                 ` Smart Weblications GmbH - Florian Wiessner
2014-12-09 10:23                   ` Smart Weblications GmbH - Florian Wiessner
2014-12-10 21:41                   ` Julian Anastasov
2014-12-11 14:04                     ` Smart Weblications GmbH - Florian Wiessner
2014-12-11 14:04                       ` Smart Weblications GmbH - Florian Wiessner
2014-12-13 20:19                       ` Julian Anastasov
2015-01-06 12:56                         ` Jiri Slaby
2015-01-06 20:46                           ` Julian Anastasov
2014-12-05 10:53     ` Steffen Klassert
2014-12-04  9:44 ` Jiri Slaby
2014-12-04 16:40   ` Smart Weblications GmbH - Florian Wiessner
2014-12-04 16:40     ` Smart Weblications GmbH - Florian Wiessner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5481173A.9060308@smart-weblications.de \
    --to=f.wiessner@smart-weblications.de \
    --cc=ja@ssi.bg \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=steffen.klassert@secunet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.