Re: [Bugme-new] [Bug 19312] New: bad_page crash when writing to OCZ Agility2 120G
From: Andrew Morton @ 2010-09-29 19:46 UTC
To: linux-mm; +Cc: bugzilla-daemon, bugme-daemon, b7.10110111
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Wed, 29 Sep 2010 19:03:55 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=19312
>
> Summary: bad_page crash when writing to OCZ Agility2 120G
> Product: Drivers
> Version: 2.5
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> AssignedTo: drivers_other@kernel-bugs.osdl.org
> ReportedBy: b7.10110111@gmail.com
> Regression: No
>
>
> Created an attachment (id=31932)
> --> (https://bugzilla.kernel.org/attachment.cgi?id=31932)
> dmesg output after the Oops
>
> I recently installed an OCZ Agility2 120G SSD. When I tried to make a FS on it, I
> got a crash. After further investigation I found that the crash appears only
> when PAE is enabled. I booted into a minimal shell (init=/bin/bash) and
> was able to reproduce the crash. This is how I reproduce the bug:
>
> dd if=/dev/zero of=/dev/sda bs=256M
>
> After some gigs of data (up to some tens of gigs) are written, I get an Oops as
> in the attached dmesg log. Some time after the Oops, the system locks up
> (neither NumLock nor Alt+SysRq seems to work).
>
> I tried plugging the SSD into another SATA port and swapping it with HDDs, but
> the bug still persists. I tried replacing my nvidia card with an s3virge, to no
> avail. I also tried the mem=1024M kernel cmdline to see if it's because of
> higher memory PCI access, but the bug persists, though it appeared later than
> before. Also, the bug sometimes doesn't appear on the first write pass, but
> does on the second or third.
> Ah, yes, the bug still happened after upgrading to the 2.6.35.5 kernel.
> There's no such problem with any of the HDDs. I suspect this may be related to
> the high speed of the SSD, which might create some race condition, but I'm not
> sure.
>
A repeatable crash in __block_write_full_page() in 2.6.34 and 2.6.35.
Does anyone have time to take a look? scripts/decodecode says
All code
========
   0:   89 5c 24 28             mov    %ebx,0x28(%rsp)
   4:   eb 1f                   jmp    0x25
   6:   77 06                   ja     0xe
   8:   3b 74 24 20             cmp    0x20(%rsp),%esi
   c:   76 1d                   jbe    0x2b
   e:   f0 80 23 fd             lock andb $0xfd,(%rbx)
  12:   f0 80 0b 01             lock orb $0x1,(%rbx)
  16:   8b 5b 04                mov    0x4(%rbx),%ebx
  19:   39 5c 24 28             cmp    %ebx,0x28(%rsp)
  1d:   74 70                   je     0x8f
  1f:   83 c6 01                add    $0x1,%esi
  22:   83 d7 00                adc    $0x0,%edi
  25:   3b 7c 24 24             cmp    0x24(%rsp),%edi
  29:   73 db                   jae    0x6
  2b:*  8b 03                   mov    (%rbx),%eax      <-- trapping instruction
  2d:   a8 20                   test   $0x20,%al
  2f:   74 05                   je     0x36
  31:   f6 c4 02                test   $0x2,%ah
  34:   74 e0                   je     0x16
  36:   a8 02                   test   $0x2,%al
  38:   90                      nop
  39:   74 db                   je     0x16
  3b:   8b 44 24 2c             mov    0x2c(%rsp),%eax
  3f:   3b                      .byte 0x3b

Code starting with the faulting instruction
===========================================
   0:   8b 03                   mov    (%rbx),%eax
   2:   a8 20                   test   $0x20,%al
   4:   74 05                   je     0xb
   6:   f6 c4 02                test   $0x2,%ah
   9:   74 e0                   je     0xffffffffffffffeb
   b:   a8 02                   test   $0x2,%al
   d:   90                      nop
   e:   74 db                   je     0xffffffffffffffeb
  10:   8b 44 24 2c             mov    0x2c(%rsp),%eax
  14:   3b                      .byte 0x3b
but my attention span ran out. I _think_ the bh ring got corrupted.
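
For anyone picking this up, here is a rough userspace sketch of what "the bh
ring got corrupted" would mean. It rests on one assumption, which is my reading
of the disassembly and has not been confirmed against the source: that the
load from 0x4(%rbx) followed by the compare and je is the loop advancing
along a page's circular list of buffer heads until it gets back to the head.
The struct and function names below are hypothetical stand-ins, not kernel
code, and the check only catches a NULL link or a ring that never closes. A
wild pointer would still fault, as it apparently did here.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for struct buffer_head. Only the two fields the
 * disassembled loop appears to touch are modelled: a state word at
 * offset 0 and the ring pointer right after it.
 */
struct bh_sketch {
	unsigned long state;          /* analogous to b_state */
	struct bh_sketch *this_page;  /* analogous to b_this_page */
};

/*
 * Walk the ring starting at head. Returns the number of buffers if the
 * ring closes back on head within max_bufs steps. Returns -1 if a NULL
 * link is found or the ring does not return to head, which is the kind
 * of damage that would send the kernel loop off into bad memory.
 */
int bh_ring_length(const struct bh_sketch *head, int max_bufs)
{
	const struct bh_sketch *bh = head;
	int n = 0;

	if (head == NULL)
		return -1;

	do {
		if (++n > max_bufs)
			return -1;	/* ring never came back to head */
		bh = bh->this_page;
		if (bh == NULL)
			return -1;	/* broken link */
	} while (bh != head);

	return n;
}
```

With four buffers linked in a circle, bh_ring_length(head, 4) returns 4.
Setting any this_page pointer to NULL, or pointing the last buffer at
something other than head, makes it return -1.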