From: Chris Ruehl <chris.ruehl@gtsys.com.hk>
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, steffen.klassert@secunet.com
Subject: Re: ipv6: oops in datagram.c line 260
Date: Wed, 24 Dec 2014 21:42:12 +0800	[thread overview]
Message-ID: <549AC2B4.8070203@gtsys.com.hk> (raw)
In-Reply-To: <5487DD65.60800@gtsys.com.hk>

On Wednesday, December 10, 2014 01:43 PM, Chris Ruehl wrote:
> Hi all,
>
> We are running a Dell server that crashes frequently with vanilla 3.14.25
> (screenshot taken from the Dell crash video capture).
>
> Capture viewed here: http://www.gtsys.com.hk/~chris/datagram_c_line260.png
>
> The capture sadly doesn't show the full trace, so we lack information.
> The first line I can see in the crash video from the iDRAC is: tcp_transmit_skb+0x461
>
> RIP [<ffffffff815da587>] ipv6_local_error+0x17/0x140
>
> The NULL pointer dereference happens here:
>   Type "apropos word" to search for commands related to "word"...
> Reading symbols from net/ipv6/datagram.o...done.
> (gdb) list *(ipv6_local_error+0x17)
> 0xae7 is in ipv6_local_error (net/ipv6/datagram.c:260).
> 255        struct ipv6_pinfo *np = inet6_sk(sk);
> 256        struct sock_exterr_skb *serr;
> 257        struct ipv6hdr *iph;
> 258        struct sk_buff *skb;
> 259
> 260        if (!np->recverr)
> 261            return;
> 262
> 263        skb = alloc_skb(sizeof(struct ipv6hdr), GFP_ATOMIC);
> 264        if (!skb)
> (gdb) quit
>
>
> We are running 6in4 with an IPsec tunnel on the v6 side. I found a pull request
> from Steffen Klassert here:
>      http://article.gmane.org/gmane.linux.network/281469
>
> Which might be relevant to this problem.
>
> For the time being I have added a
>
>          if (np == NULL){
>                  LIMIT_NETDEBUG(KERN_DEBUG "ipv6_pinfo is NULL\n");
>                  return;
>          }
>
> as a workaround to stop the server from crashing.
>
>
> With kind regards
> Chris
>

Caught it!

I updated the kernel to 3.14.27, added a WARN_ON() to the function, and caught
the oops after 5 days.

As mentioned, we are running IPv6 in IPv4 (6in4) with a couple of IPsec tunnels
on the v6 side.

Code change:
void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info)
{
         struct ipv6_pinfo *np = inet6_sk(sk);
         struct sock_exterr_skb *serr;
         struct ipv6hdr *iph;
         struct sk_buff *skb;

         if (np == NULL) {
                 LIMIT_NETDEBUG(KERN_CRIT "ipv6_pinfo is NULL\n");
                 WARN_ON(1);
                 return;
         }
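
To make the guard easier to reason about outside the kernel, here is a minimal
user-space sketch. It is only a model: the model_* types and functions are
hypothetical stand-ins, not the kernel definitions, and it assumes that
inet6_sk() simply hands back a per-socket IPv6 pointer that can be NULL when
the socket has no IPv6 state (for example a plain IPv4 socket, which is my
guess for this trace, not something I have confirmed).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct ipv6_pinfo; only the field used here. */
struct model_ipv6_pinfo {
    int recverr;                        /* models np->recverr */
};

/* Hypothetical stand-in for struct sock. */
struct model_sock {
    struct model_ipv6_pinfo *pinet6;    /* NULL if the socket has no IPv6 state */
};

/* Models inet6_sk(): returns the IPv6 part of the socket, or NULL. */
static struct model_ipv6_pinfo *model_inet6_sk(const struct model_sock *sk)
{
    return sk->pinet6;
}

/*
 * Models the guarded entry of ipv6_local_error().
 * Returns  1 if an error would be queued to the socket,
 *          0 if the socket did not ask for errors (recverr unset),
 *         -1 if the NULL guard fired (where the patched kernel warns).
 */
static int model_ipv6_local_error(const struct model_sock *sk)
{
    struct model_ipv6_pinfo *np = model_inet6_sk(sk);

    if (np == NULL) {
        /* Without this check, np->recverr below would dereference NULL. */
        fprintf(stderr, "ipv6_pinfo is NULL\n");
        return -1;
    }

    if (!np->recverr)
        return 0;

    return 1;
}
```

Calling model_ipv6_local_error() with a socket whose pinet6 is NULL returns -1
instead of crashing, which is all the workaround does. It hides the symptom and
does not explain why the IPv6 error path is reached with such a socket.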



[447604.244357] ipv6_pinfo is NULL
[447604.273733] ------------[ cut here ]------------
[447604.303628] WARNING: CPU: 7 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[447604.366173] Modules linked in: ipmi_si vhost_net vhost macvtap macvlan 
xt_policy authenc esp6 xfrm4_mode_tunnel xfrm6_mode_tunnel mpt3sas mpt2sas 
raid_class scsi_transport_sas mptctl mptbase ipt_MASQUERADE iptable_nat 
nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack 
ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp ipmi_devintf dell_rbu 
ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables 
x_tables xfrm_user xfrm4_tunnel ipcomp xfrm_ipcomp esp4 ah4 deflate ctr 
twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 
twofish_common camellia_generic camellia_aesni_avx_x86_64 camellia_x86_64 
serpent_avx_x86_64 serpent_sse2_x86_64 xts serpent_generic blowfish_generic 
blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common 
des_generic cmac xcbc rmd160 crypto_null af_key xfrm_algo sit ip_tunnel tunnel4 
bridge stp llc xfs libcrc32c intel_rapl x86_pkg_temp_thermal intel_powerclamp 
coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul gpio_ich 
ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul joydev glue_helper 
ablk_helper cryptd dcdbas shpchp wmi mei_me mei acpi_power_meter lpc_ich dummy 
lp parport hid_generic tg3 usbhid hid ahci megaraid_sas ptp libahci pps_core 
[last unloaded: ipmi_si]
[447605.087999] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.14.27 #11
[447605.139687] Hardware name: Dell Inc. PowerEdge R420/0CN7CM, BIOS 2.3.3 07/10/2014
[447605.242931]  0000000000000009 ffff8806172e3b48 ffffffff815ffd58 0000000000000000
[447605.349130]  ffff8806172e3b80 ffffffff81043c23 ffff8800a16322e8 ffff880037daa1c0
[447605.459659]  ffff88000b026800 0000000000000000 ffff880037daa4b8 ffff8806172e3b90
[447605.576385] Call Trace:
[447605.634243]  <IRQ>  [<ffffffff815ffd58>] dump_stack+0x45/0x56
[447605.692870]  [<ffffffff81043c23>] warn_slowpath_common+0x73/0x90
[447605.751097]  [<ffffffff81043cf5>] warn_slowpath_null+0x15/0x20
[447605.808000]  [<ffffffff815da6db>] ipv6_local_error+0x16b/0x1a0
[447605.863821]  [<ffffffff815e29d0>] xfrm6_local_error+0x60/0x90
[447605.918493]  [<ffffffff8150b485>] ? skb_dequeue+0x15/0x70
[447605.971871]  [<ffffffff815a6cc1>] xfrm_local_error+0x51/0x70
[447606.024218]  [<ffffffff8159ca15>] xfrm4_extract_output+0x75/0xb0
[447606.075630]  [<ffffffff815a6c5a>] xfrm_inner_extract_output+0x6a/0x80
[447606.126055]  [<ffffffff815e27a2>] xfrm6_prepare_output+0x12/0x60
[447606.175310]  [<ffffffff815a6ed0>] xfrm_output_resume+0x1f0/0x370
[447606.223406]  [<ffffffff8151a486>] ? skb_checksum_help+0x76/0x190
[447606.270572]  [<ffffffff815a709b>] xfrm_output+0x3b/0xf0
[447606.316454]  [<ffffffff815e2ae0>] ? xfrm6_extract_output+0xe0/0xe0
[447606.361803]  [<ffffffff815e2af7>] xfrm6_output_finish+0x17/0x20
[447606.406053]  [<ffffffff8159cad6>] xfrm4_output+0x46/0x80
[447606.448694]  [<ffffffff81550a80>] ip_local_out+0x20/0x30
[447606.489952]  [<ffffffff81550dd5>] ip_queue_xmit+0x135/0x3c0
[447606.530017]  [<ffffffff815672e1>] tcp_transmit_skb+0x461/0x8c0
[447606.569362]  [<ffffffff8156786e>] tcp_write_xmit+0x12e/0xb20
[447606.607876]  [<ffffffff815669ff>] ? tcp_current_mss+0x4f/0x70
[447606.645723]  [<ffffffff8156b320>] ? tcp_write_timer_handler+0x1b0/0x1b0
[447606.682837]  [<ffffffff81569487>] tcp_send_loss_probe+0x37/0x1f0
[447606.719000]  [<ffffffff8156b320>] ? tcp_write_timer_handler+0x1b0/0x1b0
[447606.754537]  [<ffffffff8156b1bb>] tcp_write_timer_handler+0x4b/0x1b0
[447606.789266]  [<ffffffff8156b320>] ? tcp_write_timer_handler+0x1b0/0x1b0
[447606.823242]  [<ffffffff8156b378>] tcp_write_timer+0x58/0x60
[447606.856047]  [<ffffffff8104e848>] call_timer_fn.isra.32+0x18/0x80
[447606.888029]  [<ffffffff8104ea1a>] run_timer_softirq+0x16a/0x200
[447606.920224]  [<ffffffff81047efc>] __do_softirq+0xec/0x250
[447606.951850]  [<ffffffff810482f5>] irq_exit+0xf5/0x100
[447606.982665]  [<ffffffff8102bc6f>] smp_apic_timer_interrupt+0x3f/0x50
[447607.014382]  [<ffffffff8160d98a>] apic_timer_interrupt+0x6a/0x70
[447607.046175]  <EOI>  [<ffffffff8104f336>] ? get_next_timer_interrupt+0x1d6/0x250
[447607.111311]  [<ffffffff814d45a7>] ? cpuidle_enter_state+0x47/0xc0
[447607.145850]  [<ffffffff814d45a3>] ? cpuidle_enter_state+0x43/0xc0
[447607.179625]  [<ffffffff814d46b6>] cpuidle_idle_call+0x96/0x130
[447607.213531]  [<ffffffff8100b909>] arch_cpu_idle+0x9/0x20
[447607.247052]  [<ffffffff810925ba>] cpu_startup_entry+0xda/0x1d0
[447607.280775]  [<ffffffff81029d22>] start_secondary+0x212/0x2c0
[447607.314555] ---[ end trace 6ff3826b6e4fdf67 ]---
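
For anyone reading the trace: each entry has the form symbol+offset/size, so
ipv6_local_error+0x16b/0x1a0 means 0x16b bytes into a function that is 0x1a0
bytes long. The symbol+offset part is what can be given to gdb, as in
"list *(ipv6_local_error+0x17)". A minimal sketch of splitting such an entry
apart follows; parse_trace_entry() and struct trace_entry are hypothetical
names made up for this illustration, not part of any kernel tool.

```c
#include <stdio.h>

/* One decoded "symbol+0xoffset/0xsize" entry from a call trace. */
struct trace_entry {
    char symbol[64];        /* function name, e.g. "ipv6_local_error" */
    unsigned long offset;   /* bytes from the start of the function */
    unsigned long size;     /* total size of the function in bytes */
};

/*
 * Parses text such as "ipv6_local_error+0x16b/0x1a0".
 * Returns 0 on success, -1 if the text does not have that shape.
 * %lx accepts an optional "0x" prefix, as strtoul() does with base 16.
 */
static int parse_trace_entry(const char *text, struct trace_entry *out)
{
    int fields = sscanf(text, "%63[^+]+%lx/%lx",
                        out->symbol, &out->offset, &out->size);

    return fields == 3 ? 0 : -1;
}
```

The offset must be smaller than the size for the entry to make sense, which is
a cheap sanity check when copying addresses by hand from a screenshot.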


Could someone take a closer look at this problem?

Regards
Chris
