linux-ext4.vger.kernel.org archive mirror
* [Bug 63981] New: Bad: Buffer I/O errors make disk unusable
@ 2013-10-28 19:38 bugzilla-daemon
  2013-10-28 22:35 ` [Bug 63981] " bugzilla-daemon
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: bugzilla-daemon @ 2013-10-28 19:38 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=63981

            Bug ID: 63981
           Summary: Bad: Buffer I/O errors make disk unusable
           Product: File System
           Version: 2.5
    Kernel Version: 3.12.0-rc6
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: scalg1@bfh.ch
        Regression: No

Created attachment 112591
  --> https://bugzilla.kernel.org/attachment.cgi?id=112591&action=edit
dmesg of the errors causing the problem

While I am using my laptop, the SSD suddenly becomes unusable. The disk is
remounted read-only, and the only way to get it working again is to reboot.
During the reboot, the file system check fixes the errors and I can use the
laptop for some hours, after which the problem appears again.

This problem is difficult to reproduce because there are no precise steps
that trigger the I/O errors shown in the attached dmesg.

I had the same problem with kernels 3.11.5 and 3.10.6. I use a
Sony VAIO Pro (Sony Corporation SVP1321C5E/VAIO, BIOS R1040V7 09/09/2013).

============================

Information about my system:

bash-4.2# cat /proc/scsi/scsi 
Attached devices:
Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: SAMSUNG MZNTD256 Rev: DXT2
  Type:   Direct-Access                    ANSI  SCSI revision: 05

========================================

/etc/fstab

/dev/sda1        /                ext4        defaults         1   1
/dev/sda2        /home/      ext4        defaults         1   2
/dev/sda3        /media/hd1       ext4        defaults         1   2
#/dev/cdrom      /mnt/cdrom       auto        noauto,owner,ro,comment=x-gvfs-show 0   0
/dev/fd0         /mnt/floppy      auto        noauto,owner     0   0
devpts           /dev/pts         devpts      gid=5,mode=620   0   0
proc             /proc            proc        defaults         0   0
tmpfs            /dev/shm         tmpfs       defaults         0   0
tmpfs            /tmp             tmpfs       defaults,noatime,nodiratime,mode=1777 0   0
tmpfs            /var/spool       tmpfs       defaults,noatime,nodiratime,mode=1777 0   0
tmpfs            /var/tmp         tmpfs       defaults,noatime,nodiratime,mode=1777 0   0

/proc/version

Linux version 3.12.0-rc6 (root@darkstar) (gcc version 4.8.1 (GCC) ) #1 SMP Sun Oct 27 19:02:16 CET 2013


Attached you will find the relevant part of dmesg.

Thanks for your help.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

* [Bug 63981] Bad: Buffer I/O errors make disk unusable
  2013-10-28 19:38 [Bug 63981] New: Bad: Buffer I/O errors make disk unusable bugzilla-daemon
@ 2013-10-28 22:35 ` bugzilla-daemon
  2013-10-28 23:01 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2013-10-28 22:35 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=63981

Theodore Tso <tytso@mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |tytso@mit.edu
         Resolution|---                         |INVALID

--- Comment #1 from Theodore Tso <tytso@mit.edu> ---
From the errors listed in the dmesg, looks like it is a hardware problem with
the SSD, not an ext4 bug.

I'd suggest doing a full backup of your disk while you still can, and try
replacing the SSD....

* [Bug 63981] Bad: Buffer I/O errors make disk unusable
  2013-10-28 19:38 [Bug 63981] New: Bad: Buffer I/O errors make disk unusable bugzilla-daemon
  2013-10-28 22:35 ` [Bug 63981] " bugzilla-daemon
@ 2013-10-28 23:01 ` bugzilla-daemon
  2013-10-28 23:20 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2013-10-28 23:01 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=63981

--- Comment #2 from Giuseppe Scalzi <scalg1@bfh.ch> ---
(In reply to Theodore Tso from comment #1)
> From the errors listed in the dmesg, looks like it is a hardware problem
> with the SSD, not an ext4 bug.
> 
> I'd suggest doing a full backup of your disk while you still can, and try
> replacing the SSD....

That's strange, because I bought the laptop two weeks ago; for one week I
used Windows and everything worked fine. I have had this problem since the
first day after installing Linux.

Is it possible to check if there are some hardware errors from smartctl?

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG MZNTD256HAGL-00000
Serial Number:    S15ZNYAD730814
LU WWN Device Id: 5 002538 5000648f8
Firmware Version: DXT2300Q
User Capacity:    256,060,514,304 bytes [256 GB]
Sector Size:      512 bytes logical/physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4c
Local Time is:    Mon Oct 28 23:47:51 2013 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (53956) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  40) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       118
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       143
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       1
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   061   030   000    Old_age   Always       -       39
195 Hardware_ECC_Recovered  0x001a   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
235 Unknown_Attribute       0x0012   099   099   000    Old_age   Always       -       52
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       851312137

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  255        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
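
As a rough way to read the attribute table above, here is a minimal Python
sketch that parses rows in the smartctl attribute layout and flags counters
that usually point at media trouble. The ERROR_COUNTERS set and the helper
names are illustrative assumptions, not a smartmontools classification, and
the sample holds only a few rows copied from the output above. Zero counters
do not by themselves rule out a problem between the host controller and the
drive.

```python
import re

# Attribute names treated as error indicators in this sketch. The selection
# is an assumption made for illustration only.
ERROR_COUNTERS = {
    "Reallocated_Sector_Ct",
    "Used_Rsvd_Blk_Cnt_Tot",
    "Program_Fail_Cnt_Total",
    "Erase_Fail_Count_Total",
    "Runtime_Bad_Block",
    "Reported_Uncorrect",
    "UDMA_CRC_Error_Count",
}

# ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]{4}\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(?P<raw>\d+)"
)

SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       118
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
"""


def parse_attributes(text):
    """Map attribute name -> dict of integer value/worst/thresh/raw."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[m.group("name")] = {
                key: int(m.group(key))
                for key in ("id", "value", "worst", "thresh", "raw")
            }
    return rows


def suspicious(rows):
    """Return names whose raw error count is non-zero or whose normalized
    value has dropped to (or below) a non-zero threshold."""
    flagged = []
    for name, row in rows.items():
        counter_hit = name in ERROR_COUNTERS and row["raw"] > 0
        below_thresh = row["thresh"] > 0 and row["value"] <= row["thresh"]
        if counter_hit or below_thresh:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    print(suspicious(parse_attributes(SAMPLE)))
```

For the rows quoted in this comment nothing is flagged, which matches the
"No Errors Logged" line in the SMART error log.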

* [Bug 63981] Bad: Buffer I/O errors make disk unusable
  2013-10-28 19:38 [Bug 63981] New: Bad: Buffer I/O errors make disk unusable bugzilla-daemon
  2013-10-28 22:35 ` [Bug 63981] " bugzilla-daemon
  2013-10-28 23:01 ` bugzilla-daemon
@ 2013-10-28 23:20 ` bugzilla-daemon
  2013-10-28 23:31 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2013-10-28 23:20 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=63981

--- Comment #3 from Theodore Tso <tytso@mit.edu> ---
It's possible there is some kind of compatibility issue with the SATA driver on
your Sony Vaio, but the point is that with errors like these:

[13546.661310] ata4.00: failed command: WRITE FPDMA QUEUED
[13546.661315] ata4.00: cmd 61/08:00:2f:1d:0a/00:00:00:00:00/40 tag 0 ncq 4096 out
[13546.661315]          res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[13546.661318] ata4.00: status: { DRDY }
[13546.661319] ata4.00: failed command: WRITE FPDMA QUEUED
[13546.661323] ata4.00: cmd 61/08:08:27:1d:0a/00:00:00:00:00/40 tag 1 ncq 4096 out
[13546.661323]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[13546.661325] ata4.00: status: { DRDY }

... and like these:

[13606.886264] EXT4-fs warning (device sda1): ext4_end_bio:316: I/O error writing to inode 2097393 (offset 0 size 0 starting block 82853)
[13606.886267] Buffer I/O error on device sda1, logical block 82845
[13606.886268] sd 3:0:0:0: [sda] Unhandled error code
[13606.886270] sd 3:0:0:0: [sda]  
[13606.886271] Result: hostbyte=0x04 driverbyte=0x00
[13606.886273] sd 3:0:0:0: [sda] CDB: 
[13606.886274] cdb[0]=0x2a: 2a 00 00 00 00 3f 00 00 08 00
[13606.886282] sd 3:0:0:0: [sda] Unhandled error code
[13606.886283] Buffer I/O error on device sda1, logical block 0
[13606.886285] lost page write due to I/O error on sda1
[13606.886288] sd 3:0:0:0: [sda]  
[13606.886289] Result: hostbyte=0x04 driverbyte=0x00
[13606.886293] sd 3:0:0:0: [sda] CDB: 
[13606.886294] EXT4-fs error (device sda1): ext4_journal_check_start:56: 
[13606.886294] cdb[0]=0x2a: 2a 00

... there's little that we can do at the ext4 level.  Basically, the disk
device (or the Sony Vaio's SATA chipset) is refusing to talk to Linux.
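
To make the log lines above a little less opaque, here is a minimal Python
sketch that decodes the two kinds of records quoted: the libata "cmd" taskfile
for an NCQ write and the SCSI WRITE(10) CDB. The field order in the regular
expression follows my reading of libata's error report format
(command/feature:nsect:lbal:lbam:lbah/hob_.../device) and should be treated as
an assumption; the function names are hypothetical. Both decodings show plain
small writes that simply timed out, with nothing ext4-specific about them.

```python
import re
import struct

# Assumed layout of libata's "cmd" line:
# command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
TASKFILE = re.compile(
    r"cmd (?P<cmd>[0-9a-f]{2})/(?P<feature>[0-9a-f]{2})"
    r":(?P<nsect>[0-9a-f]{2}):(?P<lbal>[0-9a-f]{2})"
    r":(?P<lbam>[0-9a-f]{2}):(?P<lbah>[0-9a-f]{2})"
    r"/(?P<hob_feature>[0-9a-f]{2}):(?P<hob_nsect>[0-9a-f]{2})"
    r":(?P<hob_lbal>[0-9a-f]{2}):(?P<hob_lbam>[0-9a-f]{2})"
    r":(?P<hob_lbah>[0-9a-f]{2})/(?P<device>[0-9a-f]{2})"
)

WRITE_FPDMA_QUEUED = 0x61
SCSI_WRITE_10 = 0x2A


def decode_ncq_write(line):
    """Decode LBA, sector count and NCQ tag from a libata 'cmd 61/...' line."""
    m = TASKFILE.search(line)
    if m is None:
        raise ValueError("no taskfile found in line")
    f = {key: int(val, 16) for key, val in m.groupdict().items()}
    if f["cmd"] != WRITE_FPDMA_QUEUED:
        raise ValueError("not a WRITE FPDMA QUEUED command")
    lba = (
        f["hob_lbah"] << 40 | f["hob_lbam"] << 32 | f["hob_lbal"] << 24
        | f["lbah"] << 16 | f["lbam"] << 8 | f["lbal"]
    )
    # For NCQ commands the feature registers carry the sector count
    # (0 means 65536) and the tag sits in bits 7:3 of the count register.
    sectors = (f["hob_feature"] << 8 | f["feature"]) or 65536
    tag = f["nsect"] >> 3
    return {"lba": lba, "sectors": sectors, "tag": tag}


def decode_write10(cdb_hex):
    """Decode (lba, blocks) from a space-separated WRITE(10) CDB hex dump."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != SCSI_WRITE_10:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    (lba,) = struct.unpack(">I", cdb[2:6])     # bytes 2..5: big-endian LBA
    (blocks,) = struct.unpack(">H", cdb[7:9])  # bytes 7..8: transfer length
    return lba, blocks


if __name__ == "__main__":
    print(decode_ncq_write("cmd 61/08:00:2f:1d:0a/00:00:00:00:00/40 tag 0 ncq 4096 out"))
    print(decode_write10("2a 00 00 00 00 3f 00 00 08 00"))
```

With the 512-byte sectors reported by smartctl, the 8 sectors decoded from
each command correspond to the 4096 bytes shown after "ncq" in the log.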

The Sony Vaio has, historically, been notorious for using Windows-specific
hardware that doesn't work well with Linux.  I don't know anything about your
specific model, but there have been enough problems in the past that I avoid
Sony laptops like the plague if I intend to use Linux on them.  It's not by
accident that most Linux kernel developers tend to use Lenovo Thinkpads...

* [Bug 63981] Bad: Buffer I/O errors make disk unusable
  2013-10-28 19:38 [Bug 63981] New: Bad: Buffer I/O errors make disk unusable bugzilla-daemon
                   ` (2 preceding siblings ...)
  2013-10-28 23:20 ` bugzilla-daemon
@ 2013-10-28 23:31 ` bugzilla-daemon
  2013-10-29  8:28 ` bugzilla-daemon
  2013-10-29 15:14 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2013-10-28 23:31 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=63981

--- Comment #4 from Theodore Tso <tytso@mit.edu> ---
BTW, I'm using a 512GB Samsung 840 PRO (2.5" SATA SSD) and a 240GB Intel 525
SSD (mSATA) on my Lenovo T430s, and they both work like a charm.

Hmm... I wasn't able to get detailed specs on your SAMSUNG MZNTD256HAGL-00000,
but upon doing some further research, it appears to be a new-fangled M.2 PCIe
interface.  So it's not a mSATA nor a 2.5" SATA interface, but Something New.

So whether this is a Linux bug or an implementation bug in this new
Samsung part (or a failure in the standardization of this new M.2 PCIe
interface), I can't say, but the most likely cause looks like a problem
with this new SSD or its new M.2 interface[1].

[1] http://en.wikipedia.org/wiki/Next_Generation_Form_Factor

* [Bug 63981] Bad: Buffer I/O errors make disk unusable
  2013-10-28 19:38 [Bug 63981] New: Bad: Buffer I/O errors make disk unusable bugzilla-daemon
                   ` (3 preceding siblings ...)
  2013-10-28 23:31 ` bugzilla-daemon
@ 2013-10-29  8:28 ` bugzilla-daemon
  2013-10-29 15:14 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2013-10-29  8:28 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=63981

--- Comment #5 from Giuseppe Scalzi <scalg1@bfh.ch> ---
(In reply to Theodore Tso from comment #4)
> BTW, I'm using a 512GB Samsung 840 PRO (2.5" SATA SSD) and an 240GB Intel
> 525 SSD (mSata) on my Lenovo T430s, and they both work like a charm.
> 
> Hmm... I wasn't able to get detailed specs on your SAMSUNG
> MZNTD256HAGL-00000, but upon doing some further research, it appears to be a
> new-fangled M.2 PCIe interface.  So it's not a mSATA nor a 2.5" SATA
> interface, but Something New.
> 
> So whether or not this is a Linux bug, or an implementation bug in this new
> Samsung part (or a failure in the standardization of this new M.2 PCIe
> interface), I can't say, but this looks like the most likely cause is a
> problem with this new SSD or its new M.2 interface[1].
> 
> [1] http://en.wikipedia.org/wiki/Next_Generation_Form_Factor

OK, thank you for your reply. I understand that this isn't a problem related to ext4.

I noticed on the archlinux wiki page for my laptop model
(https://wiki.archlinux.org/index.php/Sony_Vaio_Pro_SVP-1x21) that they suggest
using this option:

- When booting from USB, append libata.force=noncq to the kernel parameters to
avoid problems with the SSD.

Well they say "when booting from USB" but I'll try "libata.force=noncq" anyway.

We will see what happens.
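
To confirm after a reboot that the parameter really reached the kernel, a
small check of the boot command line could look like the Python sketch below.
It only assumes the documented libata.force=[ID:]VAL[,[ID:]VAL...] syntax;
the function names are hypothetical, and the check says nothing about whether
the driver honoured the option for a given port.

```python
def libata_force_options(cmdline):
    """Return (port_id, value) pairs from every libata.force= token.

    port_id is None when the entry has no 'ID:' prefix.
    """
    pairs = []
    for token in cmdline.split():
        if not token.startswith("libata.force="):
            continue
        for item in token.split("=", 1)[1].split(","):
            if not item:
                continue
            port, sep, value = item.rpartition(":")
            pairs.append((port if sep else None, value))
    return pairs


def ncq_disabled(cmdline):
    """True if any libata.force entry requests 'noncq'."""
    return any(value == "noncq" for _, value in libata_force_options(cmdline))


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    with open("/proc/cmdline") as f:
        print(ncq_disabled(f.read()))
```

The same helper also recognises the per-port form, for example
libata.force=1.00:noncq, in case disabling NCQ globally is not wanted.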

* [Bug 63981] Bad: Buffer I/O errors make disk unusable
  2013-10-28 19:38 [Bug 63981] New: Bad: Buffer I/O errors make disk unusable bugzilla-daemon
                   ` (4 preceding siblings ...)
  2013-10-29  8:28 ` bugzilla-daemon
@ 2013-10-29 15:14 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2013-10-29 15:14 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=63981

Giuseppe Scalzi <scalg1@bfh.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|ext4                        |Serial ATA
            Product|File System                 |IO/Storage
