linux-ide.vger.kernel.org archive mirror
From: Andrius Narbutas <andrius.narbutas@gmail.com>
To: linux-ide@vger.kernel.org
Subject: Crash with Z77 chipset
Date: Mon, 17 Dec 2012 19:07:15 +0200
Message-ID: <50CF5143.5020207@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3634 bytes --]

Hello,
(this is probably a rather long mail, but I will try to describe what I
did and tried)
I am using an ASRock Z77 Pro3 motherboard with the Z77 chipset, 4x SATA
WDC WD1002FAEX-00Z3A0 drives, and Debian Linux (basic installation, no X
or other services).
Problem: any intense disk I/O causes the system to crash. The easiest
way (for me) to reproduce the problem (100% so far) is to run mkfs.ext2
/dev/sdb3 (any filesystem will work; the same goes for `dd if=/dev/zero
of=/dev/sdb bs=1M`, just a bit slower). Before the crash, inode creation
slows down for ~10 seconds, then stops completely (and the system
crashes immediately).
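
A minimal sketch of a less destructive way to generate the same kind of
sustained write load, assuming a scratch directory on the affected disk
is available. The function name `write_load`, its arguments and the
output file name are illustrative and not part of the original report.
Writing to a file instead of the raw device is a substitution and may
not stress the controller in exactly the same way. It also assumes a dd
that understands `conv=fsync` (GNU coreutils or busybox).

```shell
#!/bin/sh
# write_load DIR CHUNKS [CHUNK_MB]
# Appends CHUNKS chunks of CHUNK_MB MiB of zeros to DIR/write_load.tmp
# (hypothetical file name). Each chunk is flushed to disk before the
# progress line is printed, so the last line seen on the console shows
# roughly how far the write got before a hang.
write_load() {
    dir=$1
    chunks=$2
    chunk_mb=${3:-64}
    out="$dir/write_load.tmp"

    # Start from an empty file.
    : > "$out" || return 1

    i=1
    while [ "$i" -le "$chunks" ]; do
        # conv=fsync makes dd sync its output before it exits.
        dd if=/dev/zero bs=1048576 count="$chunk_mb" conv=fsync \
            >> "$out" 2>/dev/null || return 1
        echo "chunk $i/$chunks written"
        i=$((i + 1))
    done
}
```

For example, `write_load /mnt/scratch 100 64` would write about 6.4 GiB
in 64 MiB steps (the mount point is a placeholder); the file is left in
place and has to be removed by hand afterwards.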
What I tried:
  - first I noticed that the system crashes with the default Debian
kernel (2.6.32-5-amd64). This is the only kernel that writes anything to
the message log; it crashes when writing inodes at count ~3250/7464. It
writes info to /var/log/messages and the console, and the system becomes
unresponsive (kernel.panic from sysctl does not reboot the system, same
goes for the software watchdog - you need to "manually" reboot the
system)
  - I compiled the current stable kernel (3.6.10) with
CONFIG_DETECT_HUNG_TASK=y and CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y and
re-tested. The system hangs when writing inodes at ~3450/7464, with no
info on screen or in syslog. The system can be rebooted with `echo b >
/proc/sysrq-trigger` on another console; the console is responsive, but
any disk access will hang it. Sometimes (rarely) the system becomes
unresponsive and reboots after a timeout
  - I compiled today's git kernel with the same parameters. It hangs at
~6400/7464 (note - it gets much further than the previous versions), but
completely - it does not reboot itself, does not respond to ping, only
a power-off helps. Nothing in syslog; a photo of the screen will be
attached with the logs in the next post (the screen can't be scrolled
up/down - so no info on what happened earlier)
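
The panic and hung-task behaviour mentioned above depends on a few
runtime settings. As a rough sketch (the function name
`show_hang_settings` is made up for illustration), the current values
can be read from /proc/sys/kernel. The hung_task_* entries are only
present on kernels built with CONFIG_DETECT_HUNG_TASK=y, so the sketch
reports missing entries instead of failing.

```shell
#!/bin/sh
# show_hang_settings
# Prints the current value of the sysctls that control panic reboot,
# hung-task detection and SysRq, one "kernel.NAME = VALUE" line each.
# Read-only: nothing is changed on the system.
show_hang_settings() {
    for name in panic hung_task_timeout_secs hung_task_panic sysrq; do
        f="/proc/sys/kernel/$name"
        if [ -r "$f" ]; then
            echo "kernel.$name = $(cat "$f")"
        else
            # Entry missing, e.g. hung-task detection not compiled in.
            echo "kernel.$name = (not available)"
        fi
    done
}
```

A kernel.panic value of 0 means the kernel does not reboot by itself
after a panic; a positive value is the delay in seconds before the
reboot.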

Observations:
  - the system can stay "alive" and working with low disk activity for a
long time (at least more than a week). But some disk I/O is enough to
crash it (for example, copying a bzip'ed kernel image from one place to
another is enough to trigger the crash)
  - disk type does not matter. I tried attaching a Hitachi
HDS722020ALA330 disk instead of the WD - same result (I would say it
crashed even earlier, but I didn't measure exactly)
  - SATA cables have been replaced, and the system can run the prime95
torture test for several hours - so I would say that RAM/CPU isn't the
problem here
  - it can be crashed with activity on any disk. I tried RAID10 with LVM
on top - crash; I disassembled the md array and tested with disk
activity on _all_ disks separately - activity on any disk can crash the
system
  - tested all the "quick" solutions I could find on the internet,
including kernel/module params "acpi=off noapic", "libata.noacpi=1",
"libata.force=1.5Gbps", and some other voodoo magic like disabling the
write cache or disabling NCQ - no difference (I probably tested
something more, like 'norst'; I have already forgotten)
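
When trying many parameter combinations it is easy to lose track of
which ones were actually in effect for a given boot. A small sketch for
checking this against /proc/cmdline follows; the helper name
`has_boot_param` and its optional second argument (an alternative file
to read, useful for testing) are assumptions for illustration only.

```shell
#!/bin/sh
# has_boot_param PARAM [FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline.
has_boot_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # One parameter per line, then an exact fixed-string line match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# report_boot_params PARAM...
# Prints for each PARAM whether it is active on the running kernel.
report_boot_params() {
    for p in "$@"; do
        if has_boot_param "$p"; then
            echo "$p: active"
        else
            echo "$p: not set"
        fi
    done
}
```

For the parameters listed above this could be run as
`report_boot_params acpi=off noapic libata.noacpi=1 libata.force=1.5Gbps`.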
Attached are zip'ed logs - one from the 2.6 kernel (with trace), another
from today's git kernel (entire log from boot to crash; the next line in
the log starts again with rsyslog...).
Also, screen images from the "dead" system (nothing in the logs, and I
can't scroll up):
  - today's git kernel: http://i49.tinypic.com/js0xl2.jpg
  - 3.6.10 on shutdown (crashed): http://i47.tinypic.com/2exv4fr.jpg

Because this problem is easily reproducible, I can try to get as much
information as I can, if you ask. A minor problem - I do not have
physical access to the system, so if tests should be done with the
latest kernel (which hangs completely and needs physical access for a
restart), I can run tests only during the day, when others can access
and reboot the system.

Thanks.


[-- Attachment #2: logs.zip --]
[-- Type: application/zip, Size: 11714 bytes --]


Thread overview: 4+ messages
2012-12-17 17:07 Andrius Narbutas [this message]
2012-12-18  3:41 ` Crash with Z77 chipset Robert Hancock
2012-12-18  8:51   ` Andrius Narbutas
2012-12-19  3:36     ` Robert Hancock
