[Bug 219300] New: ext4 corrupts data on a specific pendrive

* [Bug 219300] New: ext4 corrupts data on a specific pendrive
@ 2024-09-22 15:47 bugzilla-daemon
  2024-09-22 18:58 ` [Bug 219300] " bugzilla-daemon
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-22 15:47 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

            Bug ID: 219300
           Summary: ext4 corrupts data on a specific pendrive
           Product: File System
           Version: 2.5
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: linuxnormaluser@proton.me
        Regression: No

Hi, copying data to a specific pendrive with the ext4 file system does not work
correctly, i.e. the data is damaged after copying. My observations lead me to
believe that this is caused by some bug in the Linux kernel. Below I will list
all relevant observations.

Steps to reproduce:
1. Create an ext4 filesystem using any kernel >=5 (<5 not tested) on a specific
pendrive model. Pendrive: Intenso Speed Line, idVendor=346d, idProduct=5678,
31.5 GB/29.3GiB 
2. Copy at least a few GB of data in the form of several files to the mentioned
pendrive. E.g. at least five files of 1 GB each.
3. Compare the checksums of the files on the host and on the flash drive.
4. At least some files are inconsistent. If not, then unmount and remount the
file system or restart your computer and check the checksums again.
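Steps 3 and 4 can be scripted. The following is a minimal sketch, not the reporter's actual procedure: it hashes every file under a source directory and compares it with the copy on the pendrive. The function names and the directory layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir, copy_dir):
    """Compare each file under source_dir with its copy under copy_dir.

    Returns a sorted list of relative paths that are missing or differ.
    """
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    bad = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = copy_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            bad.append(str(rel))
    return bad
```

Calling `find_mismatches("/path/on/host", "/path/on/pendrive")` (both paths hypothetical) returns an empty list when every copy matches. Remounting before the comparison, as step 4 suggests, matters because a read made right after the copy may be served from the page cache rather than from the device.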

Counterexample:
1. Do the same as above, this time with ntfs instead of ext4.
2. All files are always consistent.

My observations:
- The problem occurs every time I copy at least a few GB of data.
- The problem occurs on various Linux operating systems (gentoo kernel 6.6.47,
6.6.38, arch kernel 5.x, arch kernel 6.10.7, ubuntu 24.04 LTS kernel
6.8.0-41-generic). So I assume that the problem has been present for a long
time and probably also in the latest version.
- I notice a difference between older kernels and version 6.10.7 (arch linux).
In the case of 6.10.7, the problem does not occur immediately, but only after
remounting the file system or restarting the computer.
- I verify the data using crc32 or sha256 checksum.
- I tested on two different machines.
- The host has been tested by memtest. There were no errors.
- The problem concerns a specific pendrive model. I have two physical pendrives
of the exact same model and both of them have this problem. Other models, even
from the same manufacturer, do not cause the problem. Models that cause the
problem: Intenso Speed Line, idVendor=346d, idProduct=5678, 31.5 GB/29.3GiB 
- The problem is not because I unmounted the device incorrectly or removed the
pendrive too quickly. 
- Below is an example of dmesg output.
- Typically, only the data gets corrupted when copied. However, sometimes the
entire file system crashes. Below is an example from dmesg.
- The problem occurs in both USB 2 and USB 3 slots.
- Corrupt data is not the same every time. I.e. by copying the data twice, I
get two different checksums on the flash drive. The number of corrupted files
also varies.
- One might assume that the problem is the poor quality of the pendrive model,
but the problem does not occur at all on ntfs. Ntfs always works fine. Both on
Windows and various Linux distributions.
- Copying to ntfs takes a short time. ext4 is over 10 times slower than ntfs
for this model.
- f2fs also corrupted the data, while exFAT did not. However, I have not
tested these file systems extensively.
- I looked for help on gentoo forum, but they were unable to help me there.
There is a discussion on this topic in the link below, but I have summarized
everything important here. https://forums.gentoo.org/viewtopic-t-1170536.html

It seems that ntfs can handle this hardware correctly, but ext4 has some
problem. 

Sample dmesg output during data corruption: 
[20904.194233] usb 2-4: new high-speed USB device number 3 using xhci_hcd
[20904.322059] usb 2-4: New USB device found, idVendor=346d, idProduct=5678,
bcdDevice= 2.00
[20904.322076] usb 2-4: New USB device strings: Mfr=1, Product=2,
SerialNumber=3
[20904.322083] usb 2-4: Product: Intenso Speed Line
[20904.322090] usb 2-4: Manufacturer: Intenso
[20904.322094] usb 2-4: SerialNumber: FC<replaced...>
[20904.323170] usb-storage 2-4:1.0: USB Mass Storage device detected
[20904.323543] scsi host6: usb-storage 2-4:1.0
[20905.374792] scsi 6:0:0:0: Direct-Access     Intenso  Speed Line       2.00
PQ: 0 ANSI: 4
[20905.375139] sd 6:0:0:0: Attached scsi generic sg1 type 0
[20905.376508] sd 6:0:0:0: [sdb] 61440000 512-byte logical blocks: (31.5
GB/29.3 GiB)
[20905.376780] sd 6:0:0:0: [sdb] Write Protect is off
[20905.376786] sd 6:0:0:0: [sdb] Mode Sense: 03 00 00 00
[20905.376921] sd 6:0:0:0: [sdb] No Caching mode page found
[20905.376924] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[20905.389018]  sdb: sdb1
[20905.389331] sd 6:0:0:0: [sdb] Attached SCSI removable disk
[20931.947073]  sdb: sdb1
[20931.969695]  sdb: sdb1
[20977.720825] EXT4-fs (sdb1): mounted filesystem
28b0a704-e5b8-4dee-aab5-316b73b481a4 r/w with ordered data mode. Quota mode:
none.
[21159.649524] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[21165.264260] usb 2-4: device descriptor read/64, error -110
[21329.633511] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[21335.248339] usb 2-4: device descriptor read/64, error -110
[21506.786497] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[21512.400341] usb 2-4: device descriptor read/64, error -110
[21858.529543] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[21864.144299] usb 2-4: device descriptor read/64, error -110
[22010.598453] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[22016.209332] usb 2-4: device descriptor read/64, error -110
[22402.785528] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[22408.400301] usb 2-4: device descriptor read/64, error -110
[22542.562424] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[22548.176319] usb 2-4: device descriptor read/64, error -110
[22658.273592] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[22663.888296] usb 2-4: device descriptor read/64, error -110
...
[23482.082529] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[23487.697333] usb 2-4: device descriptor read/64, error -110
[23776.993521] usb 2-4: reset high-speed USB device number 3 using xhci_hcd
[23782.608362] usb 2-4: device descriptor read/64, error -110

Another sample dmesg output during data corruption: 
[31547.744532] usb 4-4: new SuperSpeed USB device number 2 using xhci_hcd
[31547.757338] usb 4-4: LPM exit latency is zeroed, disabling LPM.
[31547.758379] usb 4-4: New USB device found, idVendor=346d, idProduct=5678,
bcdDevice= 2.00
[31547.758390] usb 4-4: New USB device strings: Mfr=1, Product=2,
SerialNumber=3
[31547.758394] usb 4-4: Product: Intenso Speed Line
[31547.758398] usb 4-4: Manufacturer: Intenso
[31547.758401] usb 4-4: SerialNumber: FC<replaced...>
[31547.759224] usb-storage 4-4:1.0: USB Mass Storage device detected
[31547.759634] scsi host6: usb-storage 4-4:1.0
[31548.766861] scsi 6:0:0:0: Direct-Access     Intenso  Speed Line       2.00
PQ: 0 ANSI: 4
[31548.767176] sd 6:0:0:0: Attached scsi generic sg1 type 0
[31548.768220] sd 6:0:0:0: [sdb] 61440000 512-byte logical blocks: (31.5
GB/29.3 GiB)
[31548.768375] sd 6:0:0:0: [sdb] Write Protect is off
[31548.768380] sd 6:0:0:0: [sdb] Mode Sense: 03 00 00 00
[31548.768503] sd 6:0:0:0: [sdb] No Caching mode page found
[31548.768507] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[31548.777093] Alternate GPT is invalid, using primary GPT.
[31548.777106]  sdb: sdb1
[31548.777407] sd 6:0:0:0: [sdb] Attached SCSI removable disk
[31562.696274]  sdb: sdb1
[33457.510450]  sdb: sdb1
[33457.532279]  sdb: sdb1
[33555.769208] EXT4-fs (sdb1): mounted filesystem
6660ff4e-c384-405c-be0e-86737a393344 r/w with ordered data mode. Quota mode:
none.
[33986.273861] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[33987.553302] usb 4-4: LPM exit latency is zeroed, disabling LPM.
[34132.705880] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[34133.691058] usb 4-4: LPM exit latency is zeroed, disabling LPM.
[34734.306884] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[34735.012621] usb 4-4: LPM exit latency is zeroed, disabling LPM.
[34769.121882] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[34769.838692] usb 4-4: LPM exit latency is zeroed, disabling LPM.
[35411.681919] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[35411.771220] usb 4-4: LPM exit latency is zeroed, disabling LPM.
[35447.009831] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[35447.944211] usb 4-4: LPM exit latency is zeroed, disabling LPM.

Sample console/dmesg output when the entire filesystem is corrupted:
cp: error writing '<replaced...>': Input/output error 
cp: cannot create regular file '<replaced...>': Read-only file system 
cp: cannot create regular file '<replaced...>': Read-only file system 
cp: cannot create regular file '<replaced...>': Read-only file system 

... 

[ 8202.825924] EXT4-fs (sdb1): mounted filesystem
84c42b25-807a-494f-a8de-bbb280c21d38 r/w with ordered data mode. Quota mode:
none. 
[ 8207.481253] EXT4-fs error (device sdb1): ext4_validate_block_bitmap:421:
comm ext4lazyinit: bg 176: bad block bitmap checksum 
[ 8228.651866] EXT4-fs (sdb1): unmounting filesystem
84c42b25-807a-494f-a8de-bbb280c21d38. 
[ 8237.434827] EXT4-fs (sdb1): warning: mounting fs with errors, running e2fsck
is recommended 
[ 8237.435636] EXT4-fs (sdb1): mounted filesystem
84c42b25-807a-494f-a8de-bbb280c21d38 r/w with ordered data mode. Quota mode:
none. 
[ 8238.993344] EXT4-fs error (device sdb1): ext4_validate_block_bitmap:421:
comm ext4lazyinit: bg 176: bad block bitmap checksum 

... 

[ 8557.663116] EXT4-fs (sdb1): error count since last fsck: 3 
[ 8557.663137] EXT4-fs (sdb1): initial error at time 1725382598:
ext4_validate_block_bitmap:421 
[ 8557.663148] EXT4-fs (sdb1): last error at time 1725383358:
ext4_validate_block_bitmap:421 

... 

[11843.298598] usb 2-2: reset high-speed USB device number 2 using xhci_hcd 
[11844.103802] usb 2-2: device firmware changed 
[11844.103922] usb 2-2: USB disconnect, device number 2 
[11844.111282] device offline error, dev sdb, sector 60278752 op 0x1:(WRITE)
flags 0x0 phys_seg 1 prio class 2 
[11844.111301] EXT4-fs warning (device sdb1): ext4_end_bio:343: I/O error 17
writing to inode 20 starting block 7534844) 
[11844.111331] device offline error, dev sdb, sector 60286976 op 0x1:(WRITE)
flags 0x4000 phys_seg 2 prio class 2 
[11844.111377] device offline error, dev sdb, sector 60287216 op 0x1:(WRITE)
flags 0x4000 phys_seg 2 prio class 2 
[11844.111399] device offline error, dev sdb, sector 60287456 op 0x1:(WRITE)
flags 0x4000 phys_seg 3 prio class 2 
[11844.111425] device offline error, dev sdb, sector 60287696 op 0x1:(WRITE)
flags 0x4000 phys_seg 2 prio class 2 
[11844.111434] device offline error, dev sdb, sector 60287936 op 0x1:(WRITE)
flags 0x4000 phys_seg 3 prio class 2 
[11844.111450] device offline error, dev sdb, sector 60288176 op 0x1:(WRITE)
flags 0x4000 phys_seg 2 prio class 2 
[11844.111460] device offline error, dev sdb, sector 29749456 op 0x1:(WRITE)
flags 0x9800 phys_seg 10 prio class 2 
[11844.111492] device offline error, dev sdb, sector 60288416 op 0x1:(WRITE)
flags 0x4000 phys_seg 3 prio class 2 
[11844.111504] Aborting journal on device sdb1-8. 
[11844.111509] device offline error, dev sdb, sector 60288656 op 0x1:(WRITE)
flags 0x4000 phys_seg 2 prio class 2 
[11844.111522] Buffer I/O error on dev sdb1, logical block 3702784, lost sync
page write 
[11844.111521] EXT4-fs error (device sdb1) in ext4_reserve_inode_write:5787:
Journal has aborted 
[11844.111523] EXT4-fs error (device sdb1) in ext4_reserve_inode_write:5787:
Journal has aborted 
[11844.111534] EXT4-fs error (device sdb1):
ext4_convert_unwritten_extents:4849: inode #20: comm kworker/u16:2:
mark_inode_dirty error 
[11844.111539] EXT4-fs error (device sdb1): ext4_dirty_inode:5991: inode #21:
comm cp: mark_inode_dirty error 
[11844.111543] JBD2: I/O error when updating journal superblock for sdb1-8. 
[11844.111545] EXT4-fs error (device sdb1) in
ext4_convert_unwritten_io_end_vec:4888: Journal has aborted 
[11844.111551] EXT4-fs error (device sdb1) in ext4_dirty_inode:5992: Journal
has aborted 
[11844.111553] EXT4-fs (sdb1): failed to convert unwritten extents to written
extents -- potential data loss!  (inode 20, error -30) 
[11844.111565] Buffer I/O error on device sdb1, logical block 7533568 
[11844.111574] Buffer I/O error on device sdb1, logical block 7533569 
[11844.111577] Buffer I/O error on device sdb1, logical block 7533570 
[11844.111580] Buffer I/O error on device sdb1, logical block 7533571 
[11844.111583] Buffer I/O error on device sdb1, logical block 7533572 
[11844.111586] Buffer I/O error on device sdb1, logical block 7533573 
[11844.111588] Buffer I/O error on device sdb1, logical block 7533574 
[11844.111591] Buffer I/O error on device sdb1, logical block 7533575 
[11844.111594] Buffer I/O error on device sdb1, logical block 7533576 
[11844.111596] Buffer I/O error on device sdb1, logical block 7533577 
[11844.111757] EXT4-fs error (device sdb1): ext4_journal_check_start:84: comm
kworker/u16:1: Detected aborted journal 
[11844.111788] EXT4-fs warning (device sdb1): ext4_end_bio:343: I/O error 17
writing to inode 21 starting block 7536892) 
[11844.111807] Buffer I/O error on dev sdb1, logical block 0, lost sync page
write 
[11844.111816] EXT4-fs (sdb1): I/O error while writing superblock 
[11844.111819] EXT4-fs (sdb1): Remounting filesystem read-only 
[11844.111822] EXT4-fs (sdb1): ext4_do_writepages: jbd2_start: 1024 pages, ino
13; err -30 
[11844.112653] Buffer I/O error on dev sdb1, logical block 0, lost sync page
write 
[11844.112670] EXT4-fs (sdb1): I/O error while writing superblock 
[11845.532320] usb 2-2: new high-speed USB device number 3 using xhci_hcd 
[11845.659894] usb 2-2: New USB device found, idVendor=ffff, idProduct=5678,
bcdDevice= 2.00 
[11845.659906] usb 2-2: New USB device strings: Mfr=1, Product=2,
SerialNumber=3 
[11845.659910] usb 2-2: Product: 䍆㈳㔶 
[11845.659913] usb 2-2: Manufacturer: 楆獲t档灩 
[11845.659917] usb 2-2: SerialNumber: 012345678901 
[11845.661033] usb-storage 2-2:1.0: USB Mass Storage device detected 
[11845.661414] scsi host7: usb-storage 2-2:1.0 
[11867.362604] usb 2-2: reset high-speed USB device number 3 using xhci_hcd

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
@ 2024-09-22 18:58 ` bugzilla-daemon
  2024-09-22 18:59 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-22 18:58 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

Artem S. Tashkinov (aros@gmx.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Artem S. Tashkinov (aros@gmx.com) ---
> [11844.111565] Buffer I/O error on device sdb1, logical block 7533568 
> EXT4-fs (sdb1): I/O error while writing superblock 

Typically, such errors indicate a storage failure, not a filesystem problem.

I strongly suspect your media is broken or damaged and should not be used to
store important information.

The easiest way to test it would be to use badblocks with a single pass, using
the `-w` (write-mode test) option.

The defaults for -b and -c are quite low, I'd suggest:

sudo badblocks -b 4096 -c 1000 -w -s -v /dev/sdX


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
  2024-09-22 18:58 ` [Bug 219300] " bugzilla-daemon
@ 2024-09-22 18:59 ` bugzilla-daemon
  2024-09-23  1:35 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-22 18:59 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

Artem S. Tashkinov (aros@gmx.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aros@gmx.com

--- Comment #2 from Artem S. Tashkinov (aros@gmx.com) ---
Note that this operation will destroy all your data. In your case the device
would be

`/dev/sdb`

Please triple check before running the command to avoid data loss.


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
  2024-09-22 18:58 ` [Bug 219300] " bugzilla-daemon
  2024-09-22 18:59 ` bugzilla-daemon
@ 2024-09-23  1:35 ` bugzilla-daemon
  2024-09-23  1:39 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-23  1:35 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

--- Comment #3 from nxe9 (linuxnormaluser@proton.me) ---
>Typically, such errors indicate a storage failure, not a filesystem problem.

>I strongly suspect your media is broken or damaged and should not be used to
>store important information.

How can you explain the fact that I can copy tens of GB of data to the ntfs
file system on different operating systems and no errors occur and data is
always consistent? For me, this is a sign that something is wrong with ext4
since ntfs works without any problems on the same hardware.

I've run badblocks before and there were no errors.
badblocks -w -s -o error.log /dev/sdX


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
                   ` (2 preceding siblings ...)
  2024-09-23  1:35 ` bugzilla-daemon
@ 2024-09-23  1:39 ` bugzilla-daemon
  2024-09-23  6:26 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-23  1:39 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

--- Comment #4 from nxe9 (linuxnormaluser@proton.me) ---
In short, in the case of ext4 I can generate an error very quickly. In the case
of ntfs, I was unable to generate it even once.


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
                   ` (3 preceding siblings ...)
  2024-09-23  1:39 ` bugzilla-daemon
@ 2024-09-23  6:26 ` bugzilla-daemon
  2024-09-23  7:07 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-23  6:26 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

Theodore Tso (tytso@mit.edu) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu

--- Comment #5 from Theodore Tso (tytso@mit.edu) ---
Ext4 uses a block allocation algorithm which spreads the blocks used by files
across the entire storage device in order to reduce file fragmentation.   There
are cheap thumb drives that claim to be, say, 16GB, but which only have 8GB of
flash, and they rely on the fact that some Windows file systems (FAT and NTFS)
allocate blocks starting at the low-numbered block numbers.   So if there is
a fake/scammy USB thumb drive (the kind that you buy in the back alley of
Shenzhen, or at a deep discount in the checkout line of Microcenter, or from a
really dodgy vendor on Amazon Marketplace at a price which is too good to be
true), it might work on Windows so long as you don't actually try to store that
many files on it.

In any case, the console messages are very clearly I/O errors and the LBA
sector number reported is a high-numbered address: 60278752.    Whether this is
just a failed thumbdrive, or one which is deliberately sold as a fake is
unclear, but I would suggest trying to read and write to all of the sectors of
the disk.   Fundamentally, ext4 assumes that the storage device is valid; and
if it is not valid (e.g., has I/O errors when you try to read or write to
portions of the disk), that's the storage device's problem, not ext4's.
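To make "high-numbered" concrete, here is a small worked calculation using only numbers from the dmesg output in the original report: the device size (61440000 512-byte logical blocks) and the sector of the first "device offline error" (60278752). The helper name is an assumption for illustration.

```python
SECTOR_SIZE = 512            # bytes, from "512-byte logical blocks" in dmesg
TOTAL_SECTORS = 61_440_000   # device size reported by the kernel
FAILED_SECTOR = 60_278_752   # sector of the first "device offline error"


def sector_position(sector, total_sectors=TOTAL_SECTORS, sector_size=SECTOR_SIZE):
    """Return (byte_offset, offset_in_GiB, fraction_of_device) for a sector."""
    byte_offset = sector * sector_size
    return byte_offset, byte_offset / 2**30, sector / total_sectors


offset, gib, fraction = sector_position(FAILED_SECTOR)
print(f"byte offset {offset}, {gib:.2f} GiB, {fraction:.1%} of the device")
```

The failing write lands at about 28.7 GiB, roughly 98% of the way into a device that reports 29.3 GiB. On its own this does not distinguish a worn-out drive from one with less flash than advertised.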


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
                   ` (4 preceding siblings ...)
  2024-09-23  6:26 ` bugzilla-daemon
@ 2024-09-23  7:07 ` bugzilla-daemon
  2024-09-23 16:24 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-23  7:07 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

--- Comment #6 from Artem S. Tashkinov (aros@gmx.com) ---
> and so if there is a fake/scammy USB thumb drive

AliExpress has hundreds of them.

Some are even sold as "2TB" drives when in reality you'll be lucky if they
contain 16GB of disk space. Tons of reviews on YouTube as well.


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
                   ` (5 preceding siblings ...)
  2024-09-23  7:07 ` bugzilla-daemon
@ 2024-09-23 16:24 ` bugzilla-daemon
  2024-09-23 16:30 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-23 16:24 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

--- Comment #7 from nxe9 (linuxnormaluser@proton.me) ---
Thank you for your replies. My pendrive is not a Chinese fake, and I don't think
the size is wrong. At least that's what I believe. Intenso is a German company,
although the chips are probably imported from the Far East.

Back to the topic...

I don't know much about file systems, so I'm relying on you. Is it likely that
the file systems are so different that a hardware bug is visible regularly on
one file system but is impossible to reproduce on the other? Besides, the fact
is that two pendrives of the same model have the problem, and other models,
even from the same manufacturer, do not. If I could see the error on ntfs just
once, I wouldn't have a problem, but so far I haven't been able to reproduce
the error on ntfs even once. Today I tested ntfs again with f3 and as usual no
error. Apart from that I generated test data and filled the disk completely. As
usual, all fully consistent on ntfs.

Freespace on ext4 according to f3write: Free space: 28.67 GB
Freespace on ntfs according to f3write: Free space: 29.23 GB

As you can see, I can write even more data to ntfs and it will not generate
errors.

I will summarize some points:
- I/O errors in dmesg appear very rarely. During data corruption this error
usually does not appear.
- f3 tests on ext4 fail only sometimes.
- when copying my own files to ext4 I can generate data inconsistency very
quickly.
- badblocks doesn't show me any errors.
- ntfs always works great.

Therefore, I am still interested in whether one file system can actually hide
hardware defects (or is implemented in such a way that they are very difficult
to reproduce) or maybe the other file system has some rare bug that will only
become visible in the case of this hardware. For me it's not settled.


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
                   ` (6 preceding siblings ...)
  2024-09-23 16:24 ` bugzilla-daemon
@ 2024-09-23 16:30 ` bugzilla-daemon
  2024-09-23 18:53 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-23 16:30 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

Artem S. Tashkinov (aros@gmx.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|aros@gmx.com                |

--- Comment #8 from Artem S. Tashkinov (aros@gmx.com) ---
2 billion Android users use ext4 daily with zero issues.

I/O errors must not appear EVER. I repeat: a normally working mass storage
device should NEVER produce a single one of them.

In fact if I get a single IO error on any of my devices, it instantly gets
wiped and thrown in the trash.

You can tell a FS that certain blocks are bad but if you value your sanity you
should not be using such storage.

Please ask your question on either:

https://unix.stackexchange.com/questions or https://superuser.com/questions/

It does not belong here.


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
                   ` (7 preceding siblings ...)
  2024-09-23 16:30 ` bugzilla-daemon
@ 2024-09-23 18:53 ` bugzilla-daemon
  2024-09-24 16:15 ` bugzilla-daemon
  2024-09-25 16:49 ` bugzilla-daemon
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-23 18:53 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

--- Comment #9 from Theodore Tso (tytso@mit.edu) ---
It's not at all surprising that flaky hardware might have issues that are only
exposed by particular file systems.   Different file systems might have very
different I/O patterns, both spatially (what blocks get used) and
temporally (how many I/O requests are issued in parallel, and how quickly), and
in terms of I/O request types (e.g., how many if any CACHE FLUSH requests, how
many if any FORCE UNIT ACCESS, or FUA, requests).

One quick thing I'd suggest that you try is to experiment with file systems
other than ext4 and ntfs.  For example, what happens if you use xfs or btrfs or
f2fs with your test programs?    If the hardware fails with xfs or btrfs, then
that would very likely put the finger of blame on the hardware being cr*p.

The other thing that you can try is to run tests on the raw hardware.   For
example, something like this [1] to write random data to the disk, and then
verify the output.   The block device must be able to handle having random data
written at high speeds, and when you read back the data, you must get back the
same data that was written.   Unreasonable, I know, but if the storage device
fails with random writes without a file system in the mix, it's going to be
hopeless once you add a file system.

[1] https://github.com/axboe/fio/blob/master/examples/basic-verify.fio
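The write-then-verify idea can also be sketched without fio. The following is a rough illustration only, not a replacement for the fio job linked above: it writes a reproducible pseudo-random pattern sequentially and reads it back. The function names, the seed, and the block size are assumptions, and writing to a device path DESTROYS its contents.

```python
import hashlib
import os


def pattern_block(seed, index, size):
    """Deterministic pseudo-random bytes for block `index` (SHAKE-256 of seed + index)."""
    return hashlib.shake_256(seed + index.to_bytes(8, "little")).digest(size)


def write_pattern(path, total_bytes, seed=b"bug219300", block_size=1024 * 1024):
    """Overwrite the first total_bytes of `path` with the pattern. DESTROYS DATA."""
    with open(path, "r+b") as dev:
        index = 0
        written = 0
        while written < total_bytes:
            size = min(block_size, total_bytes - written)
            dev.write(pattern_block(seed, index, size))
            written += size
            index += 1
        dev.flush()
        os.fsync(dev.fileno())  # push the data out of the page cache to the device


def verify_pattern(path, total_bytes, seed=b"bug219300", block_size=1024 * 1024):
    """Read the pattern back and return the indices of blocks that do not match."""
    bad = []
    with open(path, "rb") as dev:
        index = 0
        checked = 0
        while checked < total_bytes:
            size = min(block_size, total_bytes - checked)
            if dev.read(size) != pattern_block(seed, index, size):
                bad.append(index)
            checked += size
            index += 1
    return bad
```

A read made right after the write may be served from the page cache, so the verification pass is only meaningful once the cache no longer holds the data, for example after unplugging and replugging the device. This sketch is sequential, while the text above talks about random writes at high speed, so it exercises the device less aggressively.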

I will note that large companies that buy millions of dollars of hardware,
whether it's for data center use at hyperscaler cloud companies like Amazon or
Microsoft, or for Flash devices used in mobile devices such as Samsung,
Motorola, Google Pixel devices, etc., will spend an awful lot of time
qualifying the hardware to make sure it is high quality before they buy them. 
And they do this using raw tests to the block device, since this eliminates the
excuse from the hardware company that "oh, this must be a file system bug".   
If there are failures found when using storage tests against the raw block
device, there is no place for the hardware vendor to hide.....

But in general, as Artem said, if there are any I/O failures at all, that's a
huge red flag.   That essentially *proves* that the hardware is dodgy.   You
can have dodgy hardware without I/O errors, but if there are I/O errors reading
or writing to a valid block/sector number, then by definition the hardware is
the problem.   And in your case, the errors are "USB disconnect" and "unit is
off-line".   That should never, ever happen, and if it does, then there is a
hardware problem.  It could be a cabling problem; it could be a problem with
the SCSI/SATA/NVME/USB controller, etc., but the file system folks will tell
you that if there are *any* such problems, resolve the hardware problem before
asking the file system people to debug the problem.    It's much like
asking a civil engineer why a building might have design issues when
it's built on top of quicksand.  Buildings assume that they are built on stable
ground.   If the ground is not stable, then choose a different building site or
fix the ground first.
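One way to check a kernel log for the kinds of messages described here is to count them. The sketch below is an illustration only: the patterns are taken from the log excerpts quoted in this thread and are not an exhaustive or authoritative list of hardware error messages, and the names are assumptions.

```python
import re

# Patterns copied from messages that appear in this thread's dmesg excerpts.
RED_FLAGS = {
    "usb_reset": re.compile(r"usb \S+: reset \S+ USB device"),
    "usb_disconnect": re.compile(r"USB disconnect"),
    "descriptor_error": re.compile(r"device descriptor read/\d+, error"),
    "offline_error": re.compile(r"device offline error"),
    "buffer_io_error": re.compile(r"Buffer I/O error"),
    "ext4_error": re.compile(r"EXT4-fs error"),
}


def count_red_flags(log_text):
    """Count how many log lines match each pattern in RED_FLAGS."""
    counts = {name: 0 for name in RED_FLAGS}
    for line in log_text.splitlines():
        for name, pattern in RED_FLAGS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Feeding it the saved output of `dmesg` gives a quick per-category tally. Any non-zero count for the USB or I/O categories points at the hardware path rather than the file system, in line with the argument above.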


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
                   ` (8 preceding siblings ...)
  2024-09-23 18:53 ` bugzilla-daemon
@ 2024-09-24 16:15 ` bugzilla-daemon
  2024-09-25 16:49 ` bugzilla-daemon
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-24 16:15 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

--- Comment #10 from nxe9 (linuxnormaluser@proton.me) ---
OK, thanks. You convinced me.

@Theodore Tso: Thank you for your detailed post. 

As I wrote in the first post, I tried f2fs once and it also broke the data.
This confirms your claims.

I tried the "basic-verify.fio" example. Unfortunately, this method is not very
practical, because in the case of my pendrive, the verification time is about
60 days. After 10 hours I stopped. The progress was less than one percent.
Another properly functioning pendrive would also require many days. Perhaps
this method would generate an error, but it is very cumbersome.

From the perspective of the average user, this is not a good situation, because
you can operate on hardware that is not fully functional, not be fully aware of
it and not have an easy and effective method to verify the status of your
device. True, you can also buy hardware from a more reputable manufacturer.

Unfortunately, there's nothing I can do about it. Well, the only thing I can do
is throw this equipment in the trash. Thank you again.


* [Bug 219300] ext4 corrupts data on a specific pendrive
  2024-09-22 15:47 [Bug 219300] New: ext4 corrupts data on a specific pendrive bugzilla-daemon
                   ` (9 preceding siblings ...)
  2024-09-24 16:15 ` bugzilla-daemon
@ 2024-09-25 16:49 ` bugzilla-daemon
  10 siblings, 0 replies; 12+ messages in thread
From: bugzilla-daemon @ 2024-09-25 16:49 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219300

--- Comment #11 from Theodore Tso (tytso@mit.edu) ---
From the user's perspective, it means that you should stick to well-regarded
hardware manufacturers, and look for reviews on the web for people who complain
about lost data.   Then make sure you buy from a reputable vendor, to avoid
buying fakes where the vendor claims that it comes from a well-regarded
hardware manufacturer, but it's really a fake where there is only 16GB of
flash to back a claimed 1TB drive, and the moment you write more than 16GB of
data, it starts overwriting previously written blocks.
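The overwriting behaviour can be illustrated with a toy model. This is a sketch under a stated assumption: the fake drive is modelled as wrapping block addresses around its real capacity, which is one plausible way such devices behave and not a description of any specific product. All names are hypothetical.

```python
class FakeCapacityDrive:
    """Toy model: advertises `claimed_blocks` but stores only `real_blocks`.

    Writes beyond the real capacity wrap around and overwrite earlier blocks
    (the wrap-around mapping is an assumption made for illustration).
    """

    def __init__(self, claimed_blocks, real_blocks):
        self.claimed_blocks = claimed_blocks
        self.real_blocks = real_blocks
        self._flash = {}

    def write(self, lba, data):
        if not 0 <= lba < self.claimed_blocks:
            raise ValueError("LBA out of range")
        self._flash[lba % self.real_blocks] = data

    def read(self, lba):
        if not 0 <= lba < self.claimed_blocks:
            raise ValueError("LBA out of range")
        return self._flash.get(lba % self.real_blocks, b"")


def find_overwritten_blocks(drive):
    """Tag every block with its own address, then read all blocks back."""
    for lba in range(drive.claimed_blocks):
        drive.write(lba, lba.to_bytes(8, "little"))
    return [
        lba
        for lba in range(drive.claimed_blocks)
        if drive.read(lba) != lba.to_bytes(8, "little")
    ]
```

On a model with 16 claimed and 4 real blocks, only the last four blocks read back correctly. Filling the whole device with address-tagged data and reading it back is the same idea as the full-capacity write-and-verify tests mentioned earlier in this thread.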

In general, even high quality storage from well-regarded companies (e.g.,
Samsung, WDC, etc.) is not all that expensive, especially compared to the
value of the user's time, and the value of the user's data.   So trying to save
money by purchasing the cheapest possible storage is just false economy.   In
general, if it's too good to be true.... it probably is.

Finally, if Intenso is a reputable manufacturer, you should be able to file a
warranty claim and they should be able to replace it with a new storage
device.  If they are not willing to do that.... they probably aren't a
reputable manufacturer.
