public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* SB600 AHCI: Hard Disk Corruption
@ 2008-05-25 12:10 Patrick
  2008-05-25 12:16 ` Patrick
  2008-05-25 17:38 ` Pavel Machek
  0 siblings, 2 replies; 25+ messages in thread
From: Patrick @ 2008-05-25 12:10 UTC (permalink / raw)
  To: linux-kernel

Hello (Tejun Heo *)

I've got an annoying problem with my athlon 64bit, 4gb ram, asus m2a-vm
(->SB600 AHCI controller), SAMSUNG HD501LJ SATA Disk. I'm using kernel
2.6.26-rc3. Everything works fine, expect for standby/suspend/hibernate.
Standby freezes, hibernate, I acually haven't tested lately cause I
want suspend to ram to work first.

"echo mem > /sys/power/state; vbetool post;" (on text console)
successfully suspends the system and it resumes as well, BUT: After
resuming, things quickly turn bad: "file not fonund", kernel reports
ext2 errors on root (lvm) partition. After a (hard) reboot the root
fileystem won't even be recognized again by mount and e2fschk can harldy
recover it (thousands of inodes go to lost+found, have to restore
backups to make the system work again). This happend even when the
partition was mounted _readonly_ and it happens to ALL partitions
mounted during suspend. ** I'm testing now by appending break=init to
the kernel command line, getting to a busybox on the initramfs, and then
unmounting "root" before suspending. From there i can dmesg to see
what's happening (though the dmesg buffer is quiet small...can i
increase that in proc somewhere?). I'd be willing to test and send
whatever logs you need to get this fixed.

Some additional infos: Upgrading from 2.6.24, I hoped the
AHCI_HFLAG_NO_MSI in drivers/ata/ahci.c might solve the issue - no luck.
All the other sb600 workarounds: obviousley no luck as well.
irqpoll: slightly different behaviour when unloading sd_mod and ahci
modules before suspending:
without irqpoll, the disk ([sda]) doesn't show up again after "modprobe
ahci; modprobe sd_mod" and I get "ata5.00: failed to IDENTIFY [...]
err_mask=0x80" "failed to restore some devices [...]" errors
with irqpoll, disk shows up again and no errors, but "there is different
data" on each read (head -c10000) from /dev/sda. Though the disk is not
changed, after rebooting it contains the original data. I just wonder
how the data is "created" - it seems to be disk content from different
locations (not beginning) on the disk - if i "dd if=/dev/sda
of=/dev/null", i hear the disk reading data....

Well - I hope you might be able to make some sense of that and tell me
what logs and dumps exactly you need to fix it...

Greets - Patrick



* I read many threads in which Tejun provided patches for the SB600 AHCI
Controller which seems to be seriously broken - if only i knew that in
advance... Maybe he can fix this issue as well - last ressort. Otherwise
I'll burn that mobo!

** After my firs install and configuring the system for a day, trying
out suspend to ram smashed it with no backups, since then i didn't learn
my lesson and smashed it again 2-3 times, this time with backups at hand
though, ...




^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2008-09-02  9:22 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-25 12:10 SB600 AHCI: Hard Disk Corruption Patrick
2008-05-25 12:16 ` Patrick
2008-05-25 17:38 ` Pavel Machek
2008-05-25 20:08   ` Patrick
2008-05-25 20:39     ` >3G => iommu => suspend problems -- was " Pavel Machek
2008-05-25 21:10       ` Pavel Machek
2008-05-26 15:31         ` Patrick
2008-05-27 11:22           ` Pavel Machek
2008-05-29 18:44             ` Patrick
2008-05-29 18:51               ` Patrick
2008-05-29 21:05               ` Patrick
2008-06-03 22:33             ` Rafael J. Wysocki
2008-06-06 13:20               ` Pavel Machek
2008-06-08 22:36                 ` Rafael J. Wysocki
     [not found]                   ` <20080609124630.GA28799@elte.hu>
2008-06-09 22:10                     ` [PATCH] x86 GART: Add resume handling (was: Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption) Rafael J. Wysocki
2008-06-10 10:03                       ` Rafael J. Wysocki
2008-06-12  9:34                         ` Ingo Molnar
2008-06-11 11:43                   ` >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption Patrick
2008-06-11 14:38                     ` Rafael J. Wysocki
2008-06-11 15:04                       ` Andi Kleen
2008-07-03 17:35                         ` Patrick
2008-08-07  8:17                           ` Pavel Machek
2008-08-08 22:40                             ` Patrick
2008-09-02  8:05                               ` Pavel Machek
2008-05-27 10:23         ` Pavel Machek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox