From: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
To: Eric Dumazet <edumazet@google.com>
Cc: "Holger Hoffstätte" <holger.hoffstaette@googlemail.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	LKML <linux-kernel@vger.kernel.org>,
	stable@vger.kernel.org
Subject: Re: Soft lockup issue in Linux 4.1.9
Date: Thu, 8 Oct 2015 18:56:52 +0200
Message-ID: <1444322507@msgid.manchmal.in-ulm.de>
In-Reply-To: <CANn89iKvTUKZTiWQYbjXLr4NvYKDgd_mKYVzZHiSqk0e0OfGxA@mail.gmail.com>

Eric Dumazet wrote...

[ commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af ]

> It definitely should help !
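
As I read the referenced fix, the idea is to skip the synchronous
cancel once the timer's handler may already be running, because the
caller might be that very handler. A rough user-space paraphrase of
the guard - the names are mine and this is not the kernel code:

/* Sketch of the guard idea (my reading of the fix, not the actual
 * kernel diff): only wait synchronously for a timer while it is
 * still pending; once the handler has started, the caller might
 * *be* the handler and must not wait for itself. */
#include <stdbool.h>
#include <stdio.h>

struct mock_timer {
	bool pending;	/* queued, handler not yet entered */
	bool running;	/* handler currently executing */
};

static bool cancel_sync_guarded(struct mock_timer *t)
{
	if (!t->pending)
		return false;	/* handler may be us: bail out, don't spin */
	t->pending = false;	/* dequeue before the handler can run */
	return true;
}

int main(void)
{
	/* Timer still queued: safe to cancel synchronously. */
	struct mock_timer queued = { .pending = true, .running = false };
	printf("queued: cancelled=%d\n", cancel_sync_guarded(&queued));

	/* Handler already running (possibly our own caller): return
	 * instead of spinning, which is what avoids the lockup. */
	struct mock_timer firing = { .pending = false, .running = true };
	printf("firing: cancelled=%d\n", cancel_sync_guarded(&firing));
	return 0;
}

With the guard, a call from inside the handler returns immediately
instead of waiting forever.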

Yesterday I experienced issues somewhat similar to this, but I'm
not entirely sure:

Four out of five systems running 4.1.9 stopped working: no reaction
to network, keyboard, or serial console. In one case, the stack
trace below made it to the loghost.

Two things are quite different, though. First, the systems had a
reasonable uptime of about a week.

And second, the scary part: all incidents happened within a rather
short time span of three minutes at most, beginning after 16:41:28 UTC
and before 16:41:54 UTC. At first I assumed a brownout - until I
realized the systems faded away at slightly different times, and one
is at a different location, while other systems running different
kernel versions continued to operate at both sites.

So, I'd be glad for answers to the following:

- Is this the same issue, or should I be even more afraid?
- What might be the reason for this temporal coincidence? I have no
  plausible idea.

Confused,
    Christoph


 INFO: rcu_sched self-detected stall on CPU { 3}  (t=6000 jiffies g=8932806 c=8932805 q=58491)
 rcu_sched kthread starved for 5999 jiffies!
 Task dump for CPU 3:
 swapper/3       R  running task        0     0      1 0x00000008
  ffffffff81e396c0 ffff88042dcc3b20 ffffffff810807da 0000000000000003
  ffffffff81e396c0 ffff88042dcc3b40 ffffffff81083b78 ffff88042dcc3b80
  0000000000000003 ffff88042dcc3b70 ffffffff810a945c ffff88042dcd5740
 Call Trace:
  <IRQ>  [<ffffffff810807da>] sched_show_task+0xaa/0x110
  [<ffffffff81083b78>] dump_cpu_task+0x38/0x40
  [<ffffffff810a945c>] rcu_dump_cpu_stacks+0x8c/0xc0
  [<ffffffff810abf31>] rcu_check_callbacks+0x3b1/0x680
  [<ffffffff810e7bb7>] ? acct_account_cputime+0x17/0x20
  [<ffffffff8108484e>] ? account_system_time+0x8e/0x180
  [<ffffffff810ae4d3>] update_process_times+0x33/0x60
  [<ffffffff810bcae0>] tick_sched_handle.isra.14+0x30/0x40
  [<ffffffff810bcbd3>] tick_sched_timer+0x43/0x80
  [<ffffffff810aea2a>] __run_hrtimer.isra.32+0x4a/0xd0
  [<ffffffff810af225>] hrtimer_interrupt+0xd5/0x1f0
  [<ffffffff81034d84>] local_apic_timer_interrupt+0x34/0x60
  [<ffffffff8103516c>] smp_apic_timer_interrupt+0x3c/0x60
  [<ffffffff8190db7b>] apic_timer_interrupt+0x6b/0x70
  [<ffffffff8190c8a9>] ? _raw_spin_unlock_irqrestore+0x9/0x10
  [<ffffffff810ade58>] try_to_del_timer_sync+0x48/0x60
  [<ffffffff810adeb2>] ? del_timer_sync+0x42/0x60
  [<ffffffff810adeba>] del_timer_sync+0x4a/0x60
  [<ffffffff8178b7da>] inet_csk_reqsk_queue_drop+0x7a/0x1f0
  [<ffffffff8178ba7f>] reqsk_timer_handler+0x12f/0x290
  [<ffffffff8178b950>] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
  [<ffffffff810ad9e6>] call_timer_fn.isra.26+0x26/0x80
  [<ffffffff810ae1ae>] run_timer_softirq+0x18e/0x220
  [<ffffffff81060b1a>] __do_softirq+0xda/0x1f0
  [<ffffffff81060e16>] irq_exit+0x76/0xa0
  [<ffffffff81035175>] smp_apic_timer_interrupt+0x45/0x60
  [<ffffffff8190db7b>] apic_timer_interrupt+0x6b/0x70
  <EOI>  [<ffffffff810844be>] ? sched_clock_cpu+0x9e/0xb0
  [<ffffffff8100bc15>] ? amd_e400_idle+0x35/0xd0
  [<ffffffff8100bc13>] ? amd_e400_idle+0x33/0xd0
  [<ffffffff8100c42a>] arch_cpu_idle+0xa/0x10
  [<ffffffff810929e3>] cpu_startup_entry+0x2c3/0x330
  [<ffffffff810332dc>] start_secondary+0x17c/0x1a0
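
The middle of the trace looks like exactly the pattern the commit
above addresses: reqsk_timer_handler() ends up in del_timer_sync()
via inet_csk_reqsk_queue_drop(), i.e. a timer handler synchronously
cancelling what appears to be its own timer, which then spins
forever. A minimal user-space analogue of that shape - illustrative
only, all names mine, not kernel code:

/* Self-deadlock analogue: a "handler" synchronously cancels itself. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_bool handler_running;

/* Like a synchronous cancel in spirit: spin until the handler is no
 * longer running before returning. */
static void cancel_sync(void)
{
	while (atomic_load(&handler_running))
		;	/* never exits when the caller IS the handler */
}

static void *handler(void *arg)
{
	(void)arg;
	atomic_store(&handler_running, true);
	cancel_sync();	/* BUG: synchronous cancel from our own handler */
	atomic_store(&handler_running, false);	/* never reached */
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, handler, NULL);
	sleep(2);
	/* The handler is still spinning inside cancel_sync(). */
	printf("handler still running: %d\n",
	       (int)atomic_load(&handler_running));
	return 0;	/* exiting main() tears down the spinning thread */
}

Build with -pthread and run it: one core stays pinned while the
"handler" never returns - roughly what the stalled CPU 3 above is
doing in softirq context.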

