public inbox for linux-kernel@vger.kernel.org
From: Tejun Heo <tj@kernel.org>
To: Cyberman Wu <cypher.w@gmail.com>
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>
Subject: Re: Is this a kernel bug?
Date: Thu, 8 Nov 2012 17:11:58 -0800
Message-ID: <20121109011158.GE9672@htj.dyndns.org>
In-Reply-To: <CADCVGbR-39diiMA-C2cEn315r916HMprt6Lx4s8qN8jHC_e8FQ@mail.gmail.com>

Hello,

On Fri, Nov 09, 2012 at 08:53:49AM +0800, Cyberman Wu wrote:
> A lot of these messages on many CPUs:

What I'm really curious about is the *first* exception.

Is the following the first one?  Some lines (why the stackdump is
happening) are missing at the top.

>  Pid: 906, comm:         kworker/16:1, CPU: 16
...
>  pc : 0xfffffff7002fc488 ex1: 1     faultnum: 17
> 
> Starting stack dump of tid 906, pid 906 (kworker/16:1) on cpu 16 at cycle 416925425702833
>   frame 0: 0xfffffff7002fc488 worker_enter_idle+0x1c8/0x2e8 (sp 0xfffffe00f9fbfe78)
>   frame 1: 0xfffffff7002750c8 worker_thread+0x4c8/0x898 (sp 0xfffffe00f9fbfea0)
>   frame 2: 0xfffffff7000f0530 kthread+0xe0/0xe8 (sp 0xfffffe00f9fbff80)
>   frame 3: 0xfffffff7000bab38 start_kernel_thread+0x18/0x20 (sp

Is it triggering one of the BUG_ON()s in worker_enter_idle()?  Can you
map the pc to the source line number using addr2line?
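
For reference, a minimal sketch of that lookup, using the pc and offset
from the dump above.  The image path "vmlinux" and the tool name are
assumptions (a cross toolchain would ship its own prefixed addr2line),
and the helper names are made up for illustration.  The image must be
the exact build, with debug info, that produced the dump.

```python
import subprocess
from typing import List

PC = 0xfffffff7002fc488   # "pc" from the register dump
OFFSET = 0x1c8            # from "worker_enter_idle+0x1c8/0x2e8"


def symbol_base(pc: int, offset: int) -> int:
    """Start address of the function, derived from pc and its +offset."""
    return pc - offset


def addr2line_cmd(vmlinux: str, pc: int, tool: str = "addr2line") -> List[str]:
    """Build the addr2line command line without running it.

    -e selects the ELF image, -f prints the function name,
    -i also lists inlined callers.
    """
    return [tool, "-e", vmlinux, "-f", "-i", hex(pc)]


def resolve(vmlinux: str, pc: int, tool: str = "addr2line") -> str:
    """Run addr2line and return its text output (function and file:line)."""
    result = subprocess.run(addr2line_cmd(vmlinux, pc, tool),
                            check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    print(hex(symbol_base(PC, OFFSET)))
    print(" ".join(addr2line_cmd("vmlinux", PC)))
```

Calling resolve("vmlinux", PC) performs the actual lookup; the main
block only prints the derived function start and the command, so it
runs without a kernel image present.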

> The first exception is platform specific and should be a hardware error:
> fffffff7002fc480:       180906cfc0128d82        { addi r2, sp, 40 ; addi r31, sp, 32 }
> fffffff7002fc488:       87b886ca04218d95        { addi r21, sp, 24 ; addi r20, sp, 16 ; ld lr, r2 }
> When 'ld lr, r2' executed, r2 should have been sp+40, but its value was 2.
> I've analyzed the execution snapshot and:
> 1. r2 should be 2 before 'addi r2, sp, 40' is executed.
> 2. r0's value was sp+40 when the exception occurred, but it shouldn't
>    have that value following the execution flow in that function.
> So it seems that when 'addi r2, sp, 40' should have executed, what
> really executed was 'addi r0, sp, 40'. Maybe the instruction was loaded
> with a bit flipped because of a memory error, a cache error, or a CPU
> problem? I'm not sure, since it never occurred again.
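
The single-bit hypothesis above is at least arithmetically consistent:
register numbers 2 and 0 differ in exactly one bit.  A small sketch of
that check follows.  It assumes the destination register is encoded as
a plain binary register number in a 6-bit field; the field width and
the helper names are assumptions for illustration, not taken from the
ISA manual.

```python
from typing import List

INTENDED_DEST = 2   # r2, as in 'addi r2, sp, 40'
OBSERVED_DEST = 0   # r0, which ended up holding sp+40
FIELD_WIDTH = 6     # assumed width of the register field, in bits


def flipped_bits(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ (Hamming distance)."""
    return bin(a ^ b).count("1")


def single_bit_neighbours(reg: int, width: int = FIELD_WIDTH) -> List[int]:
    """Register numbers reachable from reg by flipping exactly one bit."""
    return [reg ^ (1 << bit) for bit in range(width)]


if __name__ == "__main__":
    print(flipped_bits(INTENDED_DEST, OBSERVED_DEST))
    print(OBSERVED_DEST in single_bit_neighbours(INTENDED_DEST))
```

A distance of 1 only shows that one flipped bit would be enough to turn
r2 into r0; it says nothing about where the flip happened (memory,
cache, or the core itself).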

So, the first exception wasn't a software bug?

> What I thought might be a kernel bug is the second exception. I
> simulated it by generating an exception in a kworker, and it occurred
> again. Then I checked the code and

After a fatal exception in kernel space, nothing is guaranteed to
work.  It's usually in the realm of "if it limps along, great;
otherwise, too bad", so it isn't really a bug.  There are only so many
things you can do after a program segfaults after all.  That said, it
might be a good idea to clear PF_WQ_WORKER from do_exit() so that at
least we can avoid oops from irq context after a work item messes up.
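
To illustrate the idea only, here is a toy userspace model, not kernel
code.  The Task class, the dictionary standing in for worker state, and
the function bodies are all invented for the sketch, and the flag value
is illustrative.  The assumption being modelled is that the scheduler
calls into workqueue code for any task that has PF_WQ_WORKER set, so a
dead worker that keeps the flag can still lead the wakeup path into
stale worker state.

```python
from typing import Optional

PF_WQ_WORKER = 0x20  # illustrative flag bit for this model


class Task:
    def __init__(self, name: str, flags: int = 0,
                 worker: Optional[dict] = None) -> None:
        self.name = name
        self.flags = flags
        self.worker = worker  # stands in for per-worker workqueue state


def scheduler_hook(task: Task) -> Optional[str]:
    """Models the scheduler's workqueue callback on sleep or wakeup."""
    if task.flags & PF_WQ_WORKER:
        # Touches worker state; raises TypeError here if it is gone,
        # which stands in for an oops in this model.
        return task.worker["pool"]
    return None


def do_exit(task: Task, clear_worker_flag: bool) -> None:
    """Models a worker dying after a fatal exception in a work item."""
    task.worker = None  # worker state can no longer be trusted
    if clear_worker_flag:
        task.flags &= ~PF_WQ_WORKER  # the scheduler now skips the hook


if __name__ == "__main__":
    t = Task("kworker/16:1", PF_WQ_WORKER, {"pool": "cpu16"})
    do_exit(t, clear_worker_flag=True)
    print(scheduler_hook(t))
```

With clear_worker_flag=False the same call to scheduler_hook() raises
instead of returning None, which is the failure the flag clearing is
meant to avoid.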

Thanks.

-- 
tejun

Thread overview: 6+ messages
2012-11-03  8:03 Is this a kernel bug? Cyberman Wu
2012-11-07 16:28 ` Tejun Heo
2012-11-09  0:53   ` Cyberman Wu
2012-11-09  1:11     ` Tejun Heo [this message]
2012-11-12  2:42       ` Cyberman Wu
2012-11-09  2:07     ` Andrew Morton
