* [Bug 16081] Data loss after crash during heavy I/O
2010-05-31 15:19 [Bug 16081] New: Data loss after crash during heavy I/O bugzilla-daemon
@ 2010-05-31 15:21 ` bugzilla-daemon
2010-05-31 15:22 ` bugzilla-daemon
` (20 subsequent siblings)
21 siblings, 0 replies; 23+ messages in thread
From: bugzilla-daemon @ 2010-05-31 15:21 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #1 from lkolbe@techfak.uni-bielefeld.de 2010-05-31 15:21:24 ---
Created an attachment (id=26591)
--> (https://bugzilla.kernel.org/attachment.cgi?id=26591)
lsscsi of host
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.
^ permalink raw reply [flat|nested] 23+ messages in thread
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-05-31 15:22 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #2 from lkolbe@techfak.uni-bielefeld.de 2010-05-31 15:22:26 ---
Created an attachment (id=26592)
--> (https://bugzilla.kernel.org/attachment.cgi?id=26592)
lspci -vvv of host
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-05-31 15:24 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #3 from lkolbe@techfak.uni-bielefeld.de 2010-05-31 15:24:17 ---
One thing I forgot: Using Supermicro's current BIOS 1.2b, the machine exhibits
machine check exceptions that Linux thinks are the hardware's fault. With their
BIOS 1.1b, they do not happen.
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-05-31 15:28 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
lkolbe@techfak.uni-bielefeld.de changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |sfrey@techfak.uni-bielefeld.de
Severity|normal |high
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-01 16:38 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
Eric Sandeen <sandeen@redhat.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |sandeen@redhat.com
--- Comment #4 from Eric Sandeen <sandeen@redhat.com> 2010-06-01 16:38:04 ---
Getting the whole original oops would be great (since it seems like you can
reproduce it)...
Did e2fsck find anything? (e2fsck -f?)
Were filesystem barriers left on, and does the storage support them?
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 12:02 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #5 from lkolbe@techfak.uni-bielefeld.de 2010-06-02 12:02:18 ---
e2fsck segfaulted after about an hour on the first volume and had gazillions of
questions for the second.
I don't know about barriers, mount says:
/dev/mapper/data-badp2 on /var/bacula/diskpool/fs2 type ext4 (rw,nosuid,nodev)
Whether the Adaptec 52445 supports barriers, I really don't know.
The serial console is now finally working, so if this happens again we'll get a
full stacktrace. It will most likely take a few days of backup runs to trigger,
though. Thanks for looking into this!
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 12:10 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #6 from Eric Sandeen <sandeen@redhat.com> 2010-06-02 12:10:00 ---
If a barrier failed outright you'd see a note in dmesg / logs shortly after the
mount, FWIW. Anyway you didn't explicitly disable it. Barrier support in
dm/lvm is rather new, as well. Just a thought...
Capturing a core from the segfaulted e2fsck would help fix -that- bug ... and
attaching the output of the fscks might yield a clue as to what is damaged.
-Eric
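As a rough illustration of the "you didn't explicitly disable it" point, a mount line such as the one quoted in comment #5 can be checked for the ext4 options that switch barriers off (nobarrier or barrier=0). This is only a sketch: the helper name barrier_explicitly_disabled is made up for this example, and the absence of those options does not prove that the storage stack actually honours barriers.

```python
import re


def barrier_explicitly_disabled(mount_line: str) -> bool:
    """Return True if the option list of a `mount` output line
    contains an option that turns ext4 barriers off."""
    # The option list is the trailing "(opt1,opt2,...)" part of the line.
    match = re.search(r"\(([^)]*)\)\s*$", mount_line)
    if match is None:
        raise ValueError("no option list found in mount line")
    options = [opt.strip() for opt in match.group(1).split(",")]
    return any(opt in ("nobarrier", "barrier=0") for opt in options)


# Mount line as reported in comment #5.
line = ("/dev/mapper/data-badp2 on /var/bacula/diskpool/fs2 "
        "type ext4 (rw,nosuid,nodev)")
print(barrier_explicitly_disabled(line))  # False: nothing disables barriers here
```

A False result only says that barriers were not turned off at mount time; whether dm/lvm and the controller pass them through is a separate question, which is why the dmesg check matters.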
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 15:57 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
lkolbe@techfak.uni-bielefeld.de changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #26590|0 |1
is obsolete| |
--- Comment #7 from lkolbe@techfak.uni-bielefeld.de 2010-06-02 15:57:20 ---
Created an attachment (id=26618)
--> (https://bugzilla.kernel.org/attachment.cgi?id=26618)
oops after writing 1TB
This happened rather sooner than expected, after writing about 1TB of data to two
ext4 filesystems at approx. 150MB/sec.
Hopefully this trace means something?
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 16:44 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #8 from lkolbe@techfak.uni-bielefeld.de 2010-06-02 16:44:04 ---
Created an attachment (id=26619)
--> (https://bugzilla.kernel.org/attachment.cgi?id=26619)
root-filesystem borkage
Hm. After a reset, both 9TB filesystems were normal (journal replayed). But a
few minutes after the boot, we got really strange errors (see attachment) and
could only resurrect the root filesystem with a live CD and its fsck, as grub
wouldn't detect a filesystem anymore. fsck fixed it, though (broken superblock
and some minor fixes). The system boots as I write this, and I'll continue the
same backup tests but this time without barriers on both filesystems.
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 16:44 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
lkolbe@techfak.uni-bielefeld.de changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #26619|application/octet-stream |text/plain
mime type| |
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 17:53 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #9 from Eric Sandeen <sandeen@redhat.com> 2010-06-02 17:53:52 ---
(In reply to comment #8)
> ... The system boots as I write this, and I'll continue the
> same backup-tests but this time without barriers on both filesystems.
no... turning barriers -off- certainly won't help anything.
Whenever I see bad metadata corruption post-crash-and-reset I worry about
missing barriers. My mention of them was only to see whether they are properly
in use, as they should be on any storage w/ a volatile write cache.
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 18:06 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #10 from Eric Sandeen <sandeen@redhat.com> 2010-06-02 18:06:54 ---
(In reply to comment #8)
> Created an attachment (id=26619)
--> (https://bugzilla.kernel.org/attachment.cgi?id=26619) [details]
> root-filesystem borkage
How confident are you in your storage?
> [ 765.812082] attempt to access beyond end of device
> [ 765.812088] dm-6: rw=256, want=18808645176, limit=8388608
the "want" value (in sectors) is ~9T.
The limit is oddly (?) 2^23 - 8388608, that many sectors comes out to exactly
4T.
IOW, now your block device appears to be much smaller than your filesystem....
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 18:24 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #11 from Eric Sandeen <sandeen@redhat.com> 2010-06-02 18:24:29 ---
As for the trace (attachment #26618) it looks like we've found a page w/o
buffers.
There seems to be rather a lot going wrong with this machine, I'm having a hard
time getting a feel for what the root cause might be...
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 18:28 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #12 from Eric Sandeen <sandeen@redhat.com> 2010-06-02 18:28:32 ---
I don't know if it's at all possible, but testing on block devices and
filesystems just smaller than 8T would be an interesting datapoint, if that
yields success... we really should be perfectly safe at 9T but this is looking
like maybe a write has wrapped somewhere and corrupted things.
A resident dm expert also requested the output of "dmsetup table" for the
machine that yielded the "access beyond end of device" message.
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 21:57 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #13 from lkolbe@techfak.uni-bielefeld.de 2010-06-02 21:56:56 ---
Funny thing is, dm-6 is the root-filesystem, and it's 4GB big. It lives on a VG
consisting of one 100GB RAID-50 over 24 disks. Some relevant data:
shepherd:~# lvm lvs -a -o+devices
LV          VG     Attr   LSize   Origin Snap%  Move Log Copy%  Convert Devices
badp1       data   -wi-ao   9.00T                                       /dev/sdb(25600)
badp2       data   -wi-ao   9.00T                                       /dev/sdb(2384896)
baspool     data   -wi-ao   1.00T                                       /dev/sdb(4769792)
bawork      data   -wi-ao 100.00G                                       /dev/sdb(0)
db1_srv     data   -wi-ao 100.00G                                       /dev/sdb(4744192)
dir1_bawork data   -wi-ao 100.00G                                       /dev/sdb(5031936)
db1_log     system -wi-ao   4.00G                                       /dev/sda1(7168)
db1_root    system -wi-ao   4.00G                                       /dev/sda1(6144)
db1_swap    system -wi-ao   4.00G                                       /dev/sda1(8192)
dir1_log    system -wi-ao   4.00G                                       /dev/sda1(4096)
dir1_root   system -wi-ao   4.00G                                       /dev/sda1(3072)
dir1_swap   system -wi-ao   4.00G                                       /dev/sda1(5120)
log         system -wi-ao   4.00G                                       /dev/sda1(1024)
root        system -wi-ao   4.00G                                       /dev/sda1(0)
swap        system -wi-ao   4.00G                                       /dev/sda1(2048)
The requested dmsetup table:
shepherd:~# dmsetup table
data-dir1_bawork: 0 209715200 linear 8:16 41221620096
system-db1_log: 0 8388608 linear 8:1 58720640
system-db1_swap: 0 8388608 linear 8:1 67109248
system-db1_root: 0 8388608 linear 8:1 50332032
data-bawork: 0 209715200 linear 8:16 384
data-db1_srv: 0 209715200 linear 8:16 38864421248
data-baspool: 0 2147483648 linear 8:16 39074136448
system-dir1_swap: 0 8388608 linear 8:1 41943424
system-dir1_root: 0 8388608 linear 8:1 25166208
data-badp2: 0 19327352832 linear 8:16 19537068416
data-badp1: 0 19327352832 linear 8:16 209715584
system-swap: 0 8388608 linear 8:1 16777600
system-root: 0 8388608 linear 8:1 384
system-dir1_log: 0 8388608 linear 8:1 33554816
system-log: 0 8388608 linear 8:1 8388992
Adaptec version numbers are: BIOS, Firmware, Boot flash: 17899
aacraid driver: 2461 (the version shipped with 2.6.32)
I have (yet) no reason not to trust our storage - it's one 100GB RAID-50 and
one ~19TB RAID-50 on 24 Hitachi HDE721010SLA330 with firmware ST6OA3AA, if that
means anything to anyone.
Since the last crash bacula has written 3.2TiB to data-badp1 and it's still
running (when all backups are done, it should have written ~12TiB). We'll see
if it survives tomorrow.
If it crashes again, I'll try 8TiB-Filesystems.
Thanks for taking your time!
Lukas
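The dmsetup table output above can be sanity-checked with a short script. The sketch below (Python, with made-up helper names parse_linear and find_overlaps) assumes each line has the form "name: start length linear major:minor offset" and that sectors are 512 bytes; it reports the size of each volume and whether any two linear mappings on the same underlying device overlap.

```python
SECTOR_BYTES = 512  # assumption: dmsetup counts 512-byte sectors

DMSETUP_TABLE = """\
data-dir1_bawork: 0 209715200 linear 8:16 41221620096
system-db1_log: 0 8388608 linear 8:1 58720640
system-db1_swap: 0 8388608 linear 8:1 67109248
system-db1_root: 0 8388608 linear 8:1 50332032
data-bawork: 0 209715200 linear 8:16 384
data-db1_srv: 0 209715200 linear 8:16 38864421248
data-baspool: 0 2147483648 linear 8:16 39074136448
system-dir1_swap: 0 8388608 linear 8:1 41943424
system-dir1_root: 0 8388608 linear 8:1 25166208
data-badp2: 0 19327352832 linear 8:16 19537068416
data-badp1: 0 19327352832 linear 8:16 209715584
system-swap: 0 8388608 linear 8:1 16777600
system-root: 0 8388608 linear 8:1 384
system-dir1_log: 0 8388608 linear 8:1 33554816
system-log: 0 8388608 linear 8:1 8388992
"""


def parse_linear(table):
    """Return (device, offset, length, name) for every linear target."""
    segments = []
    for line in table.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[3] != "linear":
            continue  # skip blank lines and non-linear targets
        name = fields[0].rstrip(":")
        length, device, offset = int(fields[2]), fields[4], int(fields[5])
        segments.append((device, offset, length, name))
    return segments


def find_overlaps(segments):
    """Return pairs of volume names whose extents overlap on one device."""
    overlaps = []
    ordered = sorted(segments)  # by device, then by start offset
    for prev, cur in zip(ordered, ordered[1:]):
        prev_end = prev[1] + prev[2]
        if prev[0] == cur[0] and cur[1] < prev_end:
            overlaps.append((prev[3], cur[3]))
    return overlaps


segments = parse_linear(DMSETUP_TABLE)
for device, offset, length, name in sorted(segments):
    gib = length * SECTOR_BYTES / 2**30
    print(f"{name:18} on {device:5} offset {offset:>12} size {gib:8.1f} GiB")
print("overlapping mappings:", find_overlaps(segments))
```

For the table as posted, this finds no overlapping segments; system-root comes out at 4 GiB and data-badp1 at 9 TiB, matching the lvs output.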
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 22:07 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #14 from Eric Sandeen <sandeen@redhat.com> 2010-06-02 22:07:52 ---
(In reply to comment #13)
> Funny thing is, dm-6 is the root-filesystem, and it's 4GB big.
whoops you're right I missed a unit there :( The reported limit was indeed 4G
not 4T. Still, why was it trying to read a block out at 9T ... ? Hmmm.
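The unit mix-up is easy to reproduce by hand. A minimal sketch, assuming the block layer reports "want" and "limit" in 512-byte sectors (the function names are illustrative only):

```python
SECTOR_BYTES = 512  # assumption: block-layer sectors are 512 bytes


def sectors_to_bytes(sectors):
    """Convert a sector count to bytes."""
    return sectors * SECTOR_BYTES


def human(n_bytes):
    """Format a byte count using binary (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(n_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


want = 18808645176   # from "want=" in the kernel message
limit = 8388608      # from "limit=" in the kernel message (2**23 sectors)

print("limit:", human(sectors_to_bytes(limit)))  # 4.00 GiB
print("want: ", human(sectors_to_bytes(want)))   # 8.76 TiB
```

So the limit is the 4 GiB root volume, while the requested sector lies roughly 8.76 TiB into the device, in line with the "~9T" estimate.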
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-02 22:09 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #15 from Eric Sandeen <sandeen@redhat.com> 2010-06-02 22:09:55 ---
Just realized all the root fs errors were on ext3, too - not ext4. This gives
me more reason to be worried about things outside the filesystem itself, I'm
afraid.
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-03 6:02 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #16 from lkolbe@techfak.uni-bielefeld.de 2010-06-03 06:02:28 ---
Thanks, I'll do another round of memtest then. Do you have any idea what else
to look for/what to test?
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-03 14:19 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #17 from Eric Sandeen <sandeen@redhat.com> 2010-06-03 14:19:42 ---
I'd just review the storage configuration as well, I guess, though not sure of
any specifics to look for.
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2010-06-05 14:32 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
--- Comment #18 from lkolbe@techfak.uni-bielefeld.de 2010-06-05 14:32:46 ---
Thanks, though. After working flawlessly for more than 13TiB, we hit another
crash today: a colleague ran 'lsscsi', after which all commands quit with
'Bus error' for a while and then the machine hung with no messages on the serial
line. Before that, cat /proc/interrupts still worked and showed a massive ERR count:
shepherd:/etc# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
[...]
THR: 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 Machine check exceptions
MCP: 28 28 28 28 Machine check polls
ERR: 37567046
MIS: 0
I suppose this means it's not Linux's fault?
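For anyone wanting to watch these counters over time, a small parser for the labelled rows of /proc/interrupts might look like the sketch below. It works on pasted text (here the excerpt above, with the elided rows left out) rather than reading the live file, and parse_counters is a hypothetical helper, not part of any tool. Going by the kernel's /proc documentation, ERR should count errors detected on the IO-APIC bus and MIS misrouted interrupts, but treat that reading as an assumption.

```python
SAMPLE = """\
CPU0 CPU1 CPU2 CPU3
THR: 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 Machine check exceptions
MCP: 28 28 28 28 Machine check polls
ERR: 37567046
MIS: 0
"""


def parse_counters(text):
    """Map each row label (e.g. 'ERR') to its list of integer counts."""
    counters = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header row such as "CPU0 CPU1 ..." has no label
        counts = []
        for field in rest.split():
            if not field.isdigit():
                break  # the description text starts here
            counts.append(int(field))
        counters[label.strip()] = counts
    return counters


counters = parse_counters(SAMPLE)
print("ERR total:", sum(counters["ERR"]))
print("MCE total:", sum(counters["MCE"]))
print("MCP total:", sum(counters["MCP"]))
```

Run against two snapshots taken some minutes apart, the difference in the ERR total would show whether the count is still climbing or was accumulated in one burst.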
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2011-02-28 1:23 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
Theodore Tso <tytso@mit.edu> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |tytso@mit.edu
Kernel Version|2.6.32.12 (Debian-Version |2.6.32.12 (Debian)
|2.6.32-12) |
* [Bug 16081] Data loss after crash during heavy I/O
From: bugzilla-daemon @ 2011-02-28 1:24 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=16081
Theodore Tso <tytso@mit.edu> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |UNREPRODUCIBLE
--- Comment #19 from Theodore Tso <tytso@mit.edu> 2011-02-28 01:24:43 ---
Closing this bug as it looks pretty clear it was caused by hardware problems.