From: CAI Qian <caiqian@redhat.com>
To: stable@vger.kernel.org
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: oom caused disk corruption on 3.7.1
Date: Wed, 30 Jan 2013 01:57:05 -0500 (EST)
Message-ID: <561898288.11015388.1359529025256.JavaMail.root@redhat.com>
In-Reply-To: <1022938540.1925160.1357725053304.JavaMail.root@redhat.com>



----- Original Message -----
> From: "CAI Qian" <caiqian@redhat.com>
> To: "linux-mm" <linux-mm@kvack.org>
> Cc: stable@vger.kernel.org, "linux-kernel" <linux-kernel@vger.kernel.org>
> Sent: Wednesday, January 9, 2013 5:50:53 PM
> Subject: oom caused disk corruption on 3.7.1
> 
> While doing OOM testing on a POWER7 system with swap enabled,
> v3.7.1 hit the panic below. Without a swap device,
> it runs fine. v3.0 has the same problem.
Oddly, with these options turned on,
CONFIG_PCIEPORTBUS=y
CONFIG_PCIEAER=y

the test runs fine apart from hung-task warnings like the one below,
which is better than a panic.
INFO: task (tmpfiles):5456 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
(tmpfiles)      D 00003fff877fb508     0  5456      1 0x00000080
Call Trace:
[c00000001cf76a30] [c0000000010a2180] jiffies+0x0/0x80 (unreliable)
[c00000001cf76c00] [c000000000014960] .__switch_to+0x110/0x240
[c00000001cf76cb0] [c0000000006b1cc0] .__schedule+0x3c0/0x8b0
[c00000001cf76f30] [c0000000006affb4] .schedule_timeout+0x1e4/0x2d0
[c00000001cf77030] [c0000000006b23fc] .wait_for_common+0x18c/0x200
[c00000001cf77110] [c0000000002863a8] .xfs_buf_iowait+0x88/0x150
[c00000001cf771a0] [c000000000286700] .xfs_buf_read_map+0xd0/0x170
[c00000001cf77240] [c0000000002f4074] .xfs_trans_read_buf_map+0x204/0x570
[c00000001cf77300] [c0000000002c5940] .xfs_da_read_buf+0x100/0x250
[c00000001cf773f0] [c0000000002c7098] .xfs_da_node_lookup_int+0xc8/0x440
[c00000001cf774c0] [c0000000002d0c60] .xfs_dir2_node_lookup+0x70/0x1d0
[c00000001cf77570] [c0000000002c8fe4] .xfs_dir_lookup+0x214/0x230
[c00000001cf776a0] [c00000000029f068] .xfs_lookup+0xb8/0x1a0
[c00000001cf77760] [c000000000293f50] .xfs_vn_lookup+0x60/0xd0
[c00000001cf77800] [c0000000001db454] .lookup_real+0x44/0xa0
[c00000001cf77890] [c0000000001e16e8] .do_last+0xad8/0xe00
[c00000001cf779c0] [c0000000001e1afc] .path_openat+0xec/0x5f0
[c00000001cf77ae0] [c0000000001e2450] .do_filp_open+0x40/0xb0
[c00000001cf77c10] [c0000000001d6308] .open_exec+0x48/0x170
[c00000001cf77cc0] [c0000000001d7ae0] .do_execve_common.isra.19+0x240/0x4e0
[c00000001cf77da0] [c0000000001d8100] .SyS_execve+0x50/0x90
[c00000001cf77e30] [c0000000000097d4] syscall_exit+0x0/0x94
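
A minimal sketch for checking whether a given kernel config has both of the options named above set to =y before rerunning the test. The function name enabled_options and the sample text are hypothetical; on a real system the text would be read from the kernel's config file, whose location depends on the distribution.

```python
import re

# The two options named in this message.
REQUIRED = ("CONFIG_PCIEPORTBUS", "CONFIG_PCIEAER")

def enabled_options(config_text, names=REQUIRED):
    """Return {name: bool}, True only when the option is set to =y."""
    found = {}
    for line in config_text.splitlines():
        # Lines such as "# CONFIG_FOO is not set" do not match and are skipped.
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if m:
            found[m.group(1)] = m.group(2)
    return {name: found.get(name) == "y" for name in names}

# Hypothetical config fragment, for illustration only.
sample = """\
CONFIG_PCIEPORTBUS=y
# CONFIG_PCIEAER is not set
"""
print(enabled_options(sample))
```

An option built as a module (=m) or left unset is reported as False, since the message only describes the =y case.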
> 
> Test case is here,
> http://tinyurl.com/bzzmrb8
> 
> ...
> [  763.781571] Write-error on swap-device (253:0:7545984)
> [  763.781573] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781574] Write-error on swap-device (253:0:7546240)
> [  763.781576] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781577] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [  763.781578] Write-error on swap-device (253:0:7546496)
> [  763.781579] Call Trace:
> [  763.781580] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781590] [c0000002eac83870] [c000000000015884] .show_stack+0x74/0x1b0 (unreliable)
> [  763.781595] [c0000002eac83920] [c000000000721d28] .panic+0xe4/0x264
> [  763.781598] [c0000002eac839c0] [c0000000000886e4] .do_exit+0x954/0x960
> [  763.781601] [c0000002eac83ac0] [c0000000000889d4] .do_group_exit+0x54/0xf0
> [  763.781604] [c0000002eac83b50] [c00000000009be28] .get_signal_to_deliver+0x1f8/0x730
> [  763.781606] [c0000002eac83c60] [c000000000017924] .do_signal+0x54/0x320
> [  763.781608] [c0000002eac83da0] [c000000000017d74] .do_notify_resume+0xb4/0xd0
> [  763.781611] [c0000002eac83e30] [c000000000009e1c] .ret_from_except_lite+0x48/0x4c
> [  763.781612] Write-error on swap-device (253:0:7546752)
> [  763.781613] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781615] Write-error on swap-device (253:0:7547008)
> [  763.781616] Sending IPI to other CPUs
> [  763.781616] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781618] Write-error on swap-device (253:0:7547392)
> [  763.781619] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781620] Write-error on swap-device (253:0:7547648)
> [  763.781622] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781623] Write-error on swap-device (253:0:7547904)
> [  763.781625] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781627] Write-error on swap-device (253:0:7548160)
> [  763.781628] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781630] Write-error on swap-device (253:0:7548416)
> [  763.781631] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781632] Write-error on swap-device (253:0:7548672)
> [  763.781634] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781635] Write-error on swap-device (253:0:7548928)
> [  773.781972] ERROR: 1 cpu(s) not responding
> 
>       KERNEL: /boot/vmlinux-3.7.1+
>     DUMPFILE: /var/crash/127.0.0.1-2013.01.09-19:12:02/vmcore
>         CPUS: 28
>         DATE: Tue Jan  8 23:11:35 2013
>       UPTIME: 00:12:43
> LOAD AVERAGE: 5.88, 4.82, 2.51
>        TASKS: 278
>      RELEASE: 3.7.1+
>      VERSION: #0 SMP Tue Jan 8 06:59:49 EST 2013
>      MACHINE: ppc64  (3550 Mhz)
>       MEMORY: 12 GB
>        PANIC: "Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b"
>          PID: 1
>      COMMAND: "systemd"
>         TASK: c0000002eac00000  [THREAD_INFO: c0000002eac80000]
>          CPU: 18
>        STATE: TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE|TASK_TRACED (PANIC)
> 
> crash> bt
> PID: 1      TASK: c0000002eac00000  CPU: 18  COMMAND: "systemd"
> 
>  R0:  c000000000721d34    R1:  c0000002eac83920    R2:  c000000001157098
>  R3:  c0000002eac83790    R4:  c0000002eac00000    R5:  0000000000000070
>  R6:  0000000000000000    R7:  c0000002fff584a0    R8:  0000000000000000
>  R9:  c0000002e7909000    R10: 0000000000000001    R11: 6578636570745f6c
>  R12: 0000000022004884    R13: c000000007f23f00    R14: 0000000000040006
>  R15: 00000000279b056c    R16: c0000002eac83ea0    R17: c000000001398ab8
>  R18: c0000002eac00000    R19: c0000002eac00000    R20: 00000000003c0000
>  R21: c0000002eac00a14    R22: c0000000011b2080    R23: c0000002eac83a30
>  R24: c000000018d90000    R25: 0000000000000140    R26: 0000000000106001
>  R27: c0000002eac83790    R28: c0000000013ba848    R29: 0000000000000000
>  R30: c0000000010d4d18    R31: c00000000101e4b0
>  NIP: c000000000721d34    MSR: 8000000000009032    OR3: c0000002eac83920
>  CTR: 0000000000000000    LR:  c000000000721d34    XER: 0000000000000001
>  CCR: 0000000022004882    MQ:  3030303030303030    DAR: 0000000000000000
>  DSISR: c000000018d90000     Syscall Result: 0000000000000140
>  NIP [c000000000721d34] .panic
> 
>  #0 [c0000002eac83920] .panic at c000000000721d34
>  #1 [c0000002eac839c0] .do_exit at c0000000000886e4
>  #2 [c0000002eac83ac0] .do_group_exit at c0000000000889d4
>  #3 [c0000002eac83b50] .get_signal_to_deliver at c00000000009be28
>  #4 [c0000002eac83c60] .do_signal at c000000000017924
>  #5 [c0000002eac83da0] .do_notify_resume at c000000000017d74
>  #6 [c0000002eac83e30] .ret_from_except_lite at c000000000009e1c
> 
> CAI Qian
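
For triaging a console log like the one quoted above, a minimal sketch that counts the "Write-error on swap-device" lines and extracts the panic reason. The function name summarize is hypothetical, and reading the third number in the parenthesised triple as a sector is an assumption, not something this thread states.

```python
import re

# Matches e.g. "Write-error on swap-device (253:0:7545984)".
# Assumption: the third field is a sector number.
SWAP_ERR = re.compile(r"Write-error on swap-device \((\d+):(\d+):(\d+)\)")
PANIC = "Kernel panic - not syncing:"

def summarize(log_text):
    """Count swap write errors and capture the first panic reason, if any."""
    sectors = []
    panic = None
    for line in log_text.splitlines():
        m = SWAP_ERR.search(line)
        if m:
            sectors.append(int(m.group(3)))
        elif PANIC in line and panic is None:
            panic = line.split(PANIC, 1)[1].strip()
    return {
        "swap_write_errors": len(sectors),
        "first_sector": sectors[0] if sectors else None,
        "last_sector": sectors[-1] if sectors else None,
        "panic": panic,
    }

# A few lines copied from the quoted log.
sample = """\
> [  763.781571] Write-error on swap-device (253:0:7545984)
> [  763.781573] sd 0:0:1:0: rejecting I/O to offline device
> [  763.781574] Write-error on swap-device (253:0:7546240)
> [  763.781577] Kernel panic - not syncing: Attempted to kill init!
"""
print(summarize(sample))
```

The regex uses search rather than match, so the "> " quote prefix and the timestamp in front of each message do not need to be stripped first.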

