* I/O errors while writing to external Transcend XS-2000 4TB SSD
@ 2024-02-11 15:42 Martin Steigerwald
2024-02-11 16:02 ` Holger Hoffstätte
0 siblings, 1 reply; 10+ messages in thread
From: Martin Steigerwald @ 2024-02-11 15:42 UTC (permalink / raw)
To: stable, regressions, linux-usb
Hi!
This is not exactly a regression, as I am not aware of a prior working
state, but kernel documentation advises me to CC regressions list anyway¹.
I am trying to put data on an external Kingston XS-2000 4 TB SSD using
self-compiled Linux 6.7.4 kernel and encrypted BCacheFS. I do not think
BCacheFS has any part in the errors I see, but if you disagree feel free
to CC the BCacheFS mailing list as you reply.
I am using a ThinkPad T14 AMD Gen 1 with AMD Ryzen 7 PRO 4750U and 32
GiB of RAM.
I connected the SSD directly to a USB-C port on the ThinkPad. lsusb
lists it as:
Bus 007 Device 004: ID 0951:176b Kingston Technology XS2000
The SSD is detected as follows:
[20303.913644] usb 7-1: new SuperSpeed Plus Gen 2x1 USB device number 9 using xhci_hcd
[20303.926616] usb 7-1: New USB device found, idVendor=0951, idProduct=176b, bcdDevice= 1.00
[20303.926633] usb 7-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[20303.926641] usb 7-1: Product: XS2000
[20303.926647] usb 7-1: Manufacturer: Kingston
[20303.926652] usb 7-1: SerialNumber: […]
[20303.929078] scsi host0: uas
[20303.983859] scsi 0:0:0:0: Direct-Access Kingston XS2000 1000 PQ: 0 ANSI: 6
[20303.984426] sd 0:0:0:0: Attached scsi generic sg0 type 0
[20303.985197] sd 0:0:0:0: [sda] 8001573552 512-byte logical blocks: (4.10 TB/3.73 TiB)
[20303.985331] sd 0:0:0:0: [sda] Write Protect is off
[20303.985341] sd 0:0:0:0: [sda] Mode Sense: 43 00 00 00
[20303.985579] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[20303.989516] sda: sda1
[20303.989611] sd 0:0:0:0: [sda] Attached SCSI disk
BCacheFS is mounted as follows – but I suspect BCacheFS is not involved in
those errors anyway:
[20310.437864] bcachefs (sda1): mounting version 1.3: rebalance_work opts=metadata_checksum=xxhash,data_checksum=xxhash,compression=lz4
[20310.437895] bcachefs (sda1): recovering from clean shutdown, journal seq 5094
[20310.450813] bcachefs (sda1): alloc_read... done
[20310.450851] bcachefs (sda1): stripes_read... done
[20310.450855] bcachefs (sda1): snapshots_read... done
[20310.470815] bcachefs (sda1): journal_replay... done
[20310.470824] bcachefs (sda1): resume_logged_ops... done
[20310.470835] bcachefs (sda1): going read-write
While rsync'ing about 1.4 TB of data, after roughly an hour I got
errors like this:
[33963.462694] sd 0:0:0:0: [sda] tag#10 uas_zap_pending 0 uas-tag 1 inflight: CMD
[33963.462708] sd 0:0:0:0: [sda] tag#10 CDB: Write(16) 8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00
[33963.462718] sd 0:0:0:0: [sda] tag#11 uas_zap_pending 0 uas-tag 2 inflight: CMD
[33963.462725] sd 0:0:0:0: [sda] tag#11 CDB: Write(16) 8a 00 00 00 00 00 82 c1 c8 00 00 00 04 00 00 00
[33963.462733] sd 0:0:0:0: [sda] tag#15 uas_zap_pending 0 uas-tag 3 inflight: CMD
[33963.462740] sd 0:0:0:0: [sda] tag#15 CDB: Write(16) 8a 00 00 00 00 00 82 c1 d2 4c 00 00 01 2f 00 00
[33963.462748] sd 0:0:0:0: [sda] tag#12 uas_zap_pending 0 uas-tag 4 inflight: CMD
[33963.462754] sd 0:0:0:0: [sda] tag#12 CDB: Write(16) 8a 00 00 00 00 00 82 c1 d0 00 00 00 02 4c 00 00
[33963.462762] sd 0:0:0:0: [sda] tag#13 uas_zap_pending 0 uas-tag 5 inflight: CMD
[33963.462769] sd 0:0:0:0: [sda] tag#13 CDB: Write(16) 8a 00 00 00 00 00 82 c1 d4 00 00 00 00 ff 00 00
[33963.462777] sd 0:0:0:0: [sda] tag#14 uas_zap_pending 0 uas-tag 6 inflight: CMD
[33963.462783] sd 0:0:0:0: [sda] tag#14 CDB: Write(16) 8a 00 00 00 00 00 82 c1 ce 00 00 00 00 cc 00 00
[33963.576991] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 9 using xhci_hcd
[33963.590793] scsi host0: uas_eh_device_reset_handler success
[33963.592857] sd 0:0:0:0: [sda] tag#10 timing out command, waited 180s
[33963.592872] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_RESET driverbyte=DRIVER_OK cmd_age=182s
[33963.592881] sd 0:0:0:0: [sda] tag#10 CDB: Write(16) 8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00
[33963.592886] I/O error, dev sda, sector 2193734656 op 0x1:(WRITE) flags 0x104000 phys_seg 773 prio class 2
[33963.592898] bcachefs (sda1 inum 1073761281 offset 265216): data write error: I/O
[33963.592925] bcachefs (sda1 inum 1073761281 offset 467456): data write error: I/O
[33963.592933] bcachefs (sda1 inum 1073761281 offset 470016): data write error: I/O
[33963.592939] bcachefs (sda1 inum 1073761281 offset 471552): data write error: I/O
[33963.592949] bcachefs (sda1 inum 1073761281 offset 514560): data write error: I/O
[33963.592956] bcachefs (sda1 inum 1073761281 offset 517120): data write error: I/O
[33963.592963] bcachefs (sda1 inum 1073761281 offset 519168): data write error: I/O
[33963.592969] bcachefs (sda1 inum 1073761281 offset 521728): data write error: I/O
[33963.592976] bcachefs (sda1 inum 1073761281 offset 523776): data write error: I/O
[33963.592983] bcachefs (sda1 inum 1073761281 offset 526336): data write error: I/O
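The CDB bytes in the log can be cross-checked against the sector reported in the "I/O error" line. The following is a minimal sketch, assuming the standard WRITE(16) layout (opcode 0x8a, 8-byte big-endian LBA in bytes 2..9, 4-byte transfer length in bytes 10..13); the function name `decode_write16` is made up for illustration.

```python
def decode_write16(cdb_hex: str) -> tuple[int, int]:
    """Return (lba, block_count) from a SCSI WRITE(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is accepted
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length in blocks
    return lba, blocks


# CDB of the failed command tag#10 from the log above
lba, blocks = decode_write16("8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00")
print(f"LBA {lba}, {blocks} blocks, {blocks * 512 // 1024} KiB at 512-byte blocks")
```

For tag#10 the decoded LBA is 2193734656, which matches the sector in the "I/O error, dev sda" line, so the failed request is a plain 512 KiB data write.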
The rsync completed, but I did not trust the result, even though
"bcachefs fsck" told me the filesystem structure is okay.
Thus I reran rsync with option "-c" for checksumming. After a long time
with data that did match, it started to transfer a file again, which should
not happen if the data had been identical. As it ran into I/O errors
again, I stopped the rsync process.
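An independent check of the copy, separate from rsync, could look roughly like the sketch below. It is only an illustration: SHA-256 is an arbitrary choice (not what "rsync -c" uses internally), and the helper names `file_digest` and `find_mismatches` are hypothetical. Data read back shortly after writing may also be served from the page cache rather than the device, so the destination should ideally be remounted before comparing.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def find_mismatches(src: Path, dst: Path) -> list[str]:
    """Return relative paths that are missing or differ in the destination."""
    bad = []
    for s in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = s.relative_to(src)
        d = dst / rel
        if not d.is_file() or file_digest(s) != file_digest(d):
            bad.append(str(rel))
    return bad
```

Calling `find_mismatches(Path("/source"), Path("/mnt/ssd/copy"))` (both paths are placeholders) returns an empty list when every source file has an identical counterpart.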
I looked for that UAS error message and according to the article² I
found I disabled UAS as follows:
% cat /etc/modprobe.d/disable-uas.conf
# Does not work with external SSD Transcend XS2000 4TB
options usb-storage quirks=0951:176b:u
The quirk was applied when I reconnected the device after unloading
both the usb-storage and uas modules:
[ 55.871301] usb 7-1: UAS is ignored for this device, using usb-storage instead
[ 55.871310] usb-storage 7-1:1.0: USB Mass Storage device detected
[ 55.871559] usb-storage 7-1:1.0: Quirks match for vid 0951 pid 176b: 800000
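For other devices, the same kind of modprobe line can be derived from the lsusb output. This is a small sketch: the helper name `uas_ignore_quirk` is made up, and it only formats the `quirks=VID:PID:u` option exactly as used in the config file above. The "800000" in the "Quirks match" log line is the flag value the kernel reported for this `u` quirk.

```python
import re


def uas_ignore_quirk(lsusb_line: str) -> str:
    """Build a usb-storage modprobe option that makes the kernel ignore UAS."""
    m = re.search(r"ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})", lsusb_line)
    if m is None:
        raise ValueError("no 'ID vvvv:pppp' field found in lsusb line")
    vid, pid = m.group(1).lower(), m.group(2).lower()
    # 'u' is the quirk letter used in the config file shown above
    return f"options usb-storage quirks={vid}:{pid}:u"


print(uas_ignore_quirk("Bus 007 Device 004: ID 0951:176b Kingston Technology XS2000"))
```

The resulting line goes into a file under /etc/modprobe.d/, and as described above it only takes effect once the usb-storage and uas modules have been reloaded.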
I recreated the BCacheFS filesystem and tried again. This time it did
not take more than 10 minutes for the first I/O error to appear. Unlike
with UAS, it made rsync stop with an I/O error immediately. Before that
there were several USB resets. Here is the excerpt from dmesg:
[ 795.768306] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 932.976677] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 963.189438] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1000.057333] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1036.917137] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1073.782876] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1110.647786] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1117.163693] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=214s
[ 1117.163718] sd 0:0:0:0: [sda] tag#0 CDB: Write(16) 8a 00 00 00 00 00 02 72 20 00 00 00 08 00 00 00
[ 1117.163725] I/O error, dev sda, sector 41033728 op 0x1:(WRITE) flags 0x104000 phys_seg 1551 prio class 2
[ 1117.163739] bcachefs (sda1 inum 1879048481 offset 2572800): data write error: I/O
[ 1117.163763] bcachefs (sda1 inum 1879048481 offset 2576384): data write error: I/O
[ 1117.163771] bcachefs (sda1 inum 1879048481 offset 2578432): data write error: I/O
[ 1117.163779] bcachefs (sda1 inum 1879048481 offset 2580480): data write error: I/O
[ 1117.163786] bcachefs (sda1 inum 1879048481 offset 2582528): data write error: I/O
[ 1117.163794] bcachefs (sda1 inum 1879048481 offset 2584576): data write error: I/O
[ 1117.163803] bcachefs (sda1 inum 1879048481 offset 2586624): data write error: I/O
[ 1117.163811] bcachefs (sda1 inum 1879048481 offset 2588672): data write error: I/O
[ 1117.163818] bcachefs (sda1 inum 1879048481 offset 2590720): data write error: I/O
[ 1117.163824] bcachefs (sda1 inum 1879048481 offset 2592768): data write error: I/O
So even without UAS the device does not seem to like to write data on
Linux.
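The spacing of the USB resets in the excerpt above can be extracted with a few lines of Python. This is a sketch for illustration only; the function name `reset_intervals` is hypothetical and the regular expression assumes the dmesg timestamp format shown in this mail.

```python
import re

DMESG = """\
[ 795.768306] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 932.976677] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 963.189438] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1000.057333] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1036.917137] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1073.782876] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1110.647786] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
"""


def reset_intervals(dmesg_text: str) -> list[float]:
    """Return the seconds between consecutive 'usb ...: reset' lines."""
    stamps = [
        float(m.group(1))
        for m in re.finditer(r"^\[\s*(\d+\.\d+)\]\s+usb \S+: reset ", dmesg_text, re.MULTILINE)
    ]
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]


print(reset_intervals(DMESG))
```

After the first gap of about 137 s, the last four resets come almost exactly 37 s apart. Such a regular period might point to a command timing out and being retried after each reset, but that is only a guess based on the timestamps.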
Next steps may involve looking for a firmware update for the external SSD
as well as trying to obtain its SMART status. So far I did not succeed in
finding the right options for smartctl. In case there is enough evidence
that the device is defective I'd try to RMA it.
I will keep a copy of kernel log and I could do some further tests as time
permits. So let me know whether you need anything else, but for now
the mail is long enough as it is.
[1] https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
[2] How to disable USB Attached Storage (UAS)
Last edited on 4 December 2022, at 14:00
https://leo.leung.xyz/wiki/How_to_disable_USB_Attached_Storage_(UAS)
Ciao,
--
Martin
* Re: I/O errors while writing to external Transcend XS-2000 4TB SSD
2024-02-11 15:42 I/O errors while writing to external Transcend XS-2000 4TB SSD Martin Steigerwald
@ 2024-02-11 16:02 ` Holger Hoffstätte
2024-02-11 17:06 ` Martin Steigerwald
0 siblings, 1 reply; 10+ messages in thread
From: Holger Hoffstätte @ 2024-02-11 16:02 UTC (permalink / raw)
To: Martin Steigerwald, stable, regressions, linux-usb
On 2024-02-11 16:42, Martin Steigerwald wrote:
> Hi!
> I am trying to put data on an external Kingston XS-2000 4 TB SSD using
> self-compiled Linux 6.7.4 kernel and encrypted BCacheFS. I do not think
> BCacheFS has any part in the errors I see, but if you disagree feel free
> to CC the BCacheFS mailing list as you reply.
This is indeed a known bug with bcachefs on USB-connected devices.
Apply the following commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/bcachefs?id=3e44f325f6f75078cdcd44cd337f517ba3650d05
This and some other commits are already scheduled for -stable.
Holger
* Re: I/O errors while writing to external Transcend XS-2000 4TB SSD
2024-02-11 16:02 ` Holger Hoffstätte
@ 2024-02-11 17:06 ` Martin Steigerwald
2024-02-11 18:51 ` Kent Overstreet
0 siblings, 1 reply; 10+ messages in thread
From: Martin Steigerwald @ 2024-02-11 17:06 UTC (permalink / raw)
To: stable, regressions, linux-usb, Holger Hoffstätte,
linux-bcachefs
Hi Holger!
CC'ing BCacheFS mailing list.
My original mail is here:
https://lore.kernel.org/linux-usb/5264d425-fc13-6a77-2dbf-6853479051a0@applied-asynchrony.com/T/#m5ec9ecad1240edfbf41ad63c7aeeb6aa6ea38a5e
Holger Hoffstätte - 11.02.24, 17:02:29 CET:
> On 2024-02-11 16:42, Martin Steigerwald wrote:
> > Hi!
> > I am trying to put data on an external Kingston XS-2000 4 TB SSD using
> > self-compiled Linux 6.7.4 kernel and encrypted BCacheFS. I do not
> > think BCacheFS has any part in the errors I see, but if you disagree
> > feel free to CC the BCacheFS mailing list as you reply.
>
> This is indeed a known bug with bcachefs on USB-connected devices.
> Apply the following commit:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/bcachefs?id=3e44f325f6f75078cdcd44cd337f517ba3650d05
>
> This and some other commits are already scheduled for -stable.
Thanks!
Oh my. I was aware of some bug fixes coming for stable. I briefly looked
through them, but did not make the connection.
I guess I will wait for 6.7.5 and retry then.
Best,
--
Martin
* Re: I/O errors while writing to external Transcend XS-2000 4TB SSD
2024-02-11 17:06 ` Martin Steigerwald
@ 2024-02-11 18:51 ` Kent Overstreet
2024-02-12 15:52 ` Martin Steigerwald
2024-03-15 9:08 ` Martin Steigerwald
0 siblings, 2 replies; 10+ messages in thread
From: Kent Overstreet @ 2024-02-11 18:51 UTC (permalink / raw)
To: Martin Steigerwald
Cc: stable, regressions, linux-usb, Holger Hoffstätte,
linux-bcachefs
On Sun, Feb 11, 2024 at 06:06:27PM +0100, Martin Steigerwald wrote:
> Hi Holger!
>
> CC'ing BCacheFS mailing list.
>
> My original mail is here:
>
> https://lore.kernel.org/linux-usb/5264d425-fc13-6a77-2dbf-6853479051a0@applied-asynchrony.com/T/#m5ec9ecad1240edfbf41ad63c7aeeb6aa6ea38a5e
>
> Holger Hoffstätte - 11.02.24, 17:02:29 CET:
> > On 2024-02-11 16:42, Martin Steigerwald wrote:
> > > Hi!
> > > I am trying to put data on an external Kingston XS-2000 4 TB SSD using
> > > self-compiled Linux 6.7.4 kernel and encrypted BCacheFS. I do not
> > > think BCacheFS has any part in the errors I see, but if you disagree
> > > feel free to CC the BCacheFS mailing list as you reply.
> >
> > This is indeed a known bug with bcachefs on USB-connected devices.
> > Apply the following commit:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/bcachefs?id=3e44f325f6f75078cdcd44cd337f517ba3650d05
> >
> > This and some other commits are already scheduled for -stable.
>
> Thanks!
>
> Oh my. I was aware of some bug fixes coming for stable. I briefly looked
> through them, but now I did not make a connection.
>
> I will wait for 6.7.5 and retry then I bet.
That doesn't look related - the device claims to not support flush or
fua, and the bug resulted in us not sending flush/fua to devices; the main
thing people would see without that patch, on 6.8, would be an immediate
-EOPNOTSUPP on the first flush journal write.
He only got errors after an hour or so, or 10 minutes with UAS disabled;
we send flushes once a second. Sounds like a screwy device.
* Re: I/O errors while writing to external Transcend XS-2000 4TB SSD
2024-02-11 18:51 ` Kent Overstreet
@ 2024-02-12 15:52 ` Martin Steigerwald
2024-02-12 20:42 ` Kent Overstreet
2024-03-15 9:08 ` Martin Steigerwald
1 sibling, 1 reply; 10+ messages in thread
From: Martin Steigerwald @ 2024-02-12 15:52 UTC (permalink / raw)
To: Kent Overstreet
Cc: stable, regressions, linux-usb, Holger Hoffstätte,
linux-bcachefs
Kent Overstreet - 11.02.24, 19:51:32 CET:
> On Sun, Feb 11, 2024 at 06:06:27PM +0100, Martin Steigerwald wrote:
[…]
> > CC'ing BCacheFS mailing list.
> >
> > My original mail is here:
> >
> > https://lore.kernel.org/linux-usb/5264d425-fc13-6a77-2dbf-6853479051a0@applied-asynchrony.com/T/#m5ec9ecad1240edfbf41ad63c7aeeb6aa6ea38a5e
> >
> > Holger Hoffstätte - 11.02.24, 17:02:29 CET:
> > > On 2024-02-11 16:42, Martin Steigerwald wrote:
> > > > Hi!
> > > > I am trying to put data on an external Kingston XS-2000 4 TB SSD
> > > > using
> > > > self-compiled Linux 6.7.4 kernel and encrypted BCacheFS. I do not
> > > > think BCacheFS has any part in the errors I see, but if you
> > > > disagree
> > > > feel free to CC the BCacheFS mailing list as you reply.
> > >
> > > This is indeed a known bug with bcachefs on USB-connected devices.
> > > Apply the following commit:
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/bcachefs?id=3e44f325f6f75078cdcd44cd337f517ba3650d05
> > >
> > > This and some other commits are already scheduled for -stable.
> >
> > Thanks!
> >
> > Oh my. I was aware of some bug fixes coming for stable. I briefly
> > looked through them, but now I did not make a connection.
> >
> > I will wait for 6.7.5 and retry then I bet.
>
> That doesn't look related - the device claims to not support flush or
> fua, and the bug resulted in us not sending flush/fua devices; the main
> thing people would see without that patch, on 6.8, would be an immediate
> -EOPNOTSUP on the first flush journal write.
>
> He only got errors after an hour or so, or 10 minutes with UAS disabled;
> we send flushes once a second. Sounds like a screwy device.
Thanks for that explanation, Kent.
I am the one with that external Kingston XS-2000 4 TB SSD, and I
specifically did not CC the bcachefs mailing list at the beginning, as after
seeing things like
[33963.462694] sd 0:0:0:0: [sda] tag#10 uas_zap_pending 0 uas-tag 1 inflight: CMD
[33963.462708] sd 0:0:0:0: [sda] tag#10 CDB: Write(16) 8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00
[…]
[33963.592872] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_RESET driverbyte=DRIVER_OK cmd_age=182s
I thought some quirks in the device to be at fault.
However, while the Sandisk Extreme Pro 2 TB claims to support DPO and FUA, I see
Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
with other devices as well, such as external Toshiba Canvio 4 TB hard disks.
Using LUKS-encrypted BTRFS on those, I never saw any timeouts while writing
data to any of those hard disks. Also, with the write cache disabled,
any cache flush / FUA request should be a no-op anyway, shouldn't it? These
hard disks have been handling a ton of backup workloads without any issues,
but so far only with BTRFS.
I may test the Kingston XS2000 with BTRFS to see whether it makes a
difference; however, I would really like to use it with BCacheFS, and I do not
really like to use LUKS for external devices. According to the kernel log I still
don't really think those errors at the block layer were about anything
filesystem specific, but what do I know?
With UAS enabled for the Kingston XS2000 I see:
Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
This sounds about right: without support for cache flush / FUA requests,
the write cache is disabled.
With UAS disabled, using only usb-storage, however I see:
Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Which appears to be broken to me: If it cannot do cache flush / FUA it
should have write cache disabled.
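To compare what the sd driver reports under the two transports, the "Write cache:" lines can be parsed into a small structure. This is a minimal sketch; the function name `parse_sd_cache_line` and the dictionary keys are made up for illustration, and the regular expression only covers the line format quoted in this thread plus the "supports DPO and FUA" wording.

```python
import re

PATTERN = re.compile(
    r"Write cache: (enabled|disabled), read cache: (enabled|disabled), "
    r"(supports|doesn't support) DPO (?:and|or) FUA"
)


def parse_sd_cache_line(line: str) -> dict[str, bool]:
    """Turn an sd 'Write cache: ...' kernel log line into booleans."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not an sd cache report line")
    return {
        "write_cache": m.group(1) == "enabled",
        "read_cache": m.group(2) == "enabled",
        "dpo_fua": m.group(3) == "supports",
    }


uas = parse_sd_cache_line(
    "Write cache: disabled, read cache: enabled, doesn't support DPO or FUA")
bot = parse_sd_cache_line(
    "Write cache: enabled, read cache: enabled, doesn't support DPO or FUA")
# Keys whose value differs between UAS and plain usb-storage
print([key for key in uas if uas[key] != bot[key]])
```

For the two lines quoted above, only the write cache setting differs between UAS and usb-storage.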
Thus I removed the quirk to disable UAS again. It did not help anyway.
However, when I look at the output of "hdparm -I" for that Kingston XS2000,
none of this makes sense, because it blatantly advertises support for
[…]
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
[…]
* WRITE_{DMA|MULTIPLE}_FUA_EXT
[…]
It has firmware revision S9K00107. I will see whether I can get this updated
in case any update is available. How to do that is not obvious to me, as
Kingston only offers a Windows application to update the firmware.
I asked them how to do an update on Linux, but I am also prepared to go to
a friend with a Windows system to do the update.
There is no urgency in this, so let's see whether a firmware update may
fix anything. In case someone has any additional insight, feel free to add
it. Otherwise I consider it case closed unless I retest with either Linux
kernel 6.7.5 or 6.8-rc4 and/or after having made a firmware update
if available.
Maybe also some other quirks would need to be enabled for that
device? I tested it with:
% cat /etc/modprobe.d/disable-uas.conf
# Does not work with external SSD Transcend XS2000 4TB
options usb-storage quirks=0951:176b:u
but as explained that did not help, and thus I removed the UAS-disabling
quirk again.
Best,
--
Martin
* Re: I/O errors while writing to external Transcend XS-2000 4TB SSD
2024-02-12 15:52 ` Martin Steigerwald
@ 2024-02-12 20:42 ` Kent Overstreet
2024-02-15 11:09 ` Martin Steigerwald
0 siblings, 1 reply; 10+ messages in thread
From: Kent Overstreet @ 2024-02-12 20:42 UTC (permalink / raw)
To: Martin Steigerwald
Cc: stable, regressions, linux-usb, Holger Hoffstätte,
linux-bcachefs, linux-block
On Mon, Feb 12, 2024 at 04:52:09PM +0100, Martin Steigerwald wrote:
> Kent Overstreet - 11.02.24, 19:51:32 CET:
> > On Sun, Feb 11, 2024 at 06:06:27PM +0100, Martin Steigerwald wrote:
> […]
> > > CC'ing BCacheFS mailing list.
> > >
> > > My original mail is here:
> > >
> > > https://lore.kernel.org/linux-usb/5264d425-fc13-6a77-2dbf-6853479051a0@applied-asynchrony.com/T/#m5ec9ecad1240edfbf41ad63c7aeeb6aa6ea38a5e
> > >
> > > Holger Hoffstätte - 11.02.24, 17:02:29 CET:
> > > > On 2024-02-11 16:42, Martin Steigerwald wrote:
> > > > > Hi!
> > > > > I am trying to put data on an external Kingston XS-2000 4 TB SSD
> > > > > using
> > > > > self-compiled Linux 6.7.4 kernel and encrypted BCacheFS. I do not
> > > > > think BCacheFS has any part in the errors I see, but if you
> > > > > disagree
> > > > > feel free to CC the BCacheFS mailing list as you reply.
> > > >
> > > > This is indeed a known bug with bcachefs on USB-connected devices.
> > > > Apply the following commit:
> > > >
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/bcachefs?id=3e44f325f6f75078cdcd44cd337f517ba3650d05
> > > >
> > > > This and some other commits are already scheduled for -stable.
> > >
> > > Thanks!
> > >
> > > Oh my. I was aware of some bug fixes coming for stable. I briefly
> > > looked through them, but now I did not make a connection.
> > >
> > > I will wait for 6.7.5 and retry then I bet.
> >
> > That doesn't look related - the device claims to not support flush or
> > fua, and the bug resulted in us not sending flush/fua devices; the main
> > thing people would see without that patch, on 6.8, would be an immediate
> > -EOPNOTSUP on the first flush journal write.
> >
> > He only got errors after an hour or so, or 10 minutes with UAS disabled;
> > we send flushes once a second. Sounds like a screwy device.
>
> Thanks for that explanation, Kent.
>
> I am the one with that external Transcend XS 2000 4 TB SSD and I
> specifically did not CC bcachefs mailing list at the beginning as after
> seeing things like
>
> [33963.462694] sd 0:0:0:0: [sda] tag#10 uas_zap_pending 0 uas-tag 1 inflight: CMD
> [33963.462708] sd 0:0:0:0: [sda] tag#10 CDB: Write(16) 8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00
> […]
> [33963.592872] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_RESET driverbyte=DRIVER_OK cmd_age=182s
>
> I thought some quirks in the device to be at fault.
>
> However while Sandisk Extreme Pro 2 TB claims to support DPO and FUA I see
>
> Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
>
> also with other devices like external Toshiba Canvio 4 TB hard disks. Using
> LUKS encrypted BTRFS on those I never saw any timeout while writing out
> data issue with any of those hard disks. Also with disabled write cache
> any cache flush / FUA request should be a no-op anyway? These hard disks
> have been doing a ton of backup workloads without any issues, but so far
> only with BTRFS.
>
> I may test the Transcend XS2000 with BTRFS to see whether it makes a
> difference, however I really like to use it with BCacheFS and I do not really
> like to use LUKS for external devices. According to the kernel log I still
> don't really think those errors at the block layer were about anything
> filesystem specific, but what do I know?
It's definitely not unheard of for one specific filesystem to be
tickling driver/device bugs and not others.
I wonder what it would take to dump the outstanding requests on device
timeout.
* Re: I/O errors while writing to external Transcend XS-2000 4TB SSD
2024-02-12 20:42 ` Kent Overstreet
@ 2024-02-15 11:09 ` Martin Steigerwald
2024-02-15 15:19 ` Alan Stern
0 siblings, 1 reply; 10+ messages in thread
From: Martin Steigerwald @ 2024-02-15 11:09 UTC (permalink / raw)
To: Kent Overstreet
Cc: stable, regressions, linux-usb, Holger Hoffstätte,
linux-bcachefs, linux-block
Kent Overstreet - 12.02.24, 21:42:26 CET:
[thoughts about whether a cache flush / FUA request with write caches
disabled would be a no-op anyway]
> > I may test the Transcend XS2000 with BTRFS to see whether it makes a
> > difference, however I really like to use it with BCacheFS and I do not
> > really like to use LUKS for external devices. According to the kernel
> > log I still don't really think those errors at the block layer were
> > about anything filesystem specific, but what do I know?
>
> It's definitely not unheard of for one specific filesystem to be
> tickling driver/device bugs and not others.
>
> I wonder what it would take to dump the outstanding requests on device
> timeout.
I got a reply back from Kingston support.
They brought up two possible issues:
1) Copied too many files at once. I am not going to accept that one. An
external 4 TB SSD should handle writing 1.4 TB in about 215000 files,
coming from a slower Toshiba Canvio Basics external HD, just fine. About
90000 of the files were larger files like sound and video files or
installation archives. The rest is from a Linux system backup, so smaller
files. I will likely move those elsewhere before I try again, as I do not
need them on flash anyway. However, if the number of files or the amount of
data matters, I could never know how much data I could write safely in one
go. That is not acceptable to me.
2) Power management related to the USB port, because I am using a laptop. The
Linux kernel may have decided to put the USB port the SSD was
connected to into some kind of sleep state. However, it was a constant
rsync-based copy workload. Yes, the kernel buffers data, and the reads from
the Toshiba HD should be quite a bit slower than the writes the Kingston SSD
could handle. I saw no more than 80-90 MiB/s coming from the hard
disk. However, I doubt this led to pauses in write activity of more
than 30 seconds. Still, it could be a thing.
Regarding further testing, I am unsure whether to first test with BTRFS on
top of LUKS – I do not like to store clear text data on the SSD – or with
BCacheFS plus the fixes that are in 6.7.5 or 6.8-rc4, just in case the flush
handling fixes still have an influence on the issue at hand.
First I will look into which USB power management options
may be in place and how to tell Linux to keep the USB port the SSD is
connected to powered at all times.
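The runtime power management state of a USB device is exposed in sysfs, so a first look could be sketched as follows. This is only an illustration: the helper name `usb_power_settings` is made up, and the device name "7-1" is taken from the kernel log earlier in this thread. The attributes read here (`power/control`, `power/autosuspend_delay_ms`, `power/runtime_status`) are standard sysfs files for USB devices.

```python
from pathlib import Path


def usb_power_settings(device: str,
                       sysfs_root: Path = Path("/sys/bus/usb/devices")) -> dict[str, str]:
    """Read the runtime power management attributes of one USB device."""
    power_dir = sysfs_root / device / "power"
    settings = {}
    for name in ("control", "autosuspend_delay_ms", "runtime_status"):
        attr = power_dir / name
        try:
            settings[name] = attr.read_text().strip()
        except OSError:
            # Attribute missing or not readable on this system: skip it
            continue
    return settings


if __name__ == "__main__":
    # "7-1" is the bus-port name from the log lines "usb 7-1: ..."
    print(usb_power_settings("7-1"))
```

A `control` value of "auto" means the kernel may autosuspend the device, while "on" keeps it active; writing "on" to that file as root disables autosuspend for the device until it is unplugged.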
Let's see how this story unfolds. At least I am in no hurry about it.
Best,
--
Martin
* Re: I/O errors while writing to external Transcend XS-2000 4TB SSD
2024-02-15 11:09 ` Martin Steigerwald
@ 2024-02-15 15:19 ` Alan Stern
2024-02-15 15:36 ` Martin Steigerwald
0 siblings, 1 reply; 10+ messages in thread
From: Alan Stern @ 2024-02-15 15:19 UTC (permalink / raw)
To: Martin Steigerwald
Cc: Kent Overstreet, stable, regressions, linux-usb,
Holger Hoffstätte, linux-bcachefs, linux-block
On Thu, Feb 15, 2024 at 12:09:20PM +0100, Martin Steigerwald wrote:
> Kent Overstreet - 12.02.24, 21:42:26 CET:
>
> [thoughts about whether a cache flush / FUA request with write caches
> disabled would be a no-op anyway]
>
> > > I may test the Transcend XS2000 with BTRFS to see whether it makes a
> > > difference, however I really like to use it with BCacheFS and I do not
> > > really like to use LUKS for external devices. According to the kernel
> > > log I still don't really think those errors at the block layer were
> > > about anything filesystem specific, but what do I know?
> >
> > It's definitely not unheard of for one specific filesystem to be
> > tickling driver/device bugs and not others.
> >
> > I wonder what it would take to dump the outstanding requests on device
> > timeout.
>
> I got some reply back from Transcend support.
>
> They brought up two possible issues:
>
> 1) Copied to many files at once. I am not going to accept that one. An
> external 4 TB SSD should handle writing 1,4 TB in about 215000 files,
> coming from a slower Toshiba Canvio Basics external HD, just fine. About
> 90000 files was larger files like sound and video files or installation
> archives. The rest is from a Linux system backup, so smaller files. I
> likely move those elsewhere before I try again as I do not need these on
> flash anyway. However if the amount of files or data matters I could never
> know what amount of data I could write safely in one go. That is not
> acceptable to me.
>
> 2) Power management related to USB port. Cause I am using a laptop. It may
> have been that the Linux kernel decided to put the USB port the SSD was
> connected to into some kind of sleep state. However it was a constant
> rsync based copy workload. Yes, the kernel buffers data and the reads from
> Toshiba HD should be quite a bit slower than the Transcend SSD could
> handle the writes. I saw now more than 80-90 MiB/s coming from the hard
> disk. However I would doubt this lead to pauses of write activity of more
> than 30 seconds. Still it could be a thing.
>
> Regarding further testing I am unsure whether to first test with BTRFS on
> top of LUKS – I do not like to store clear text data on the SSD – or with
> BCacheFS plus fixes which are 6.7.5 or 6.8-rc4 in just in the case the flush
> handling fixes would still have an influence on the issue at hand.
>
> First I will have a look on how to see what USB power management options
> may be in place and how to tell Linux to keep the USB port the SSD is
> connected to at all times.
>
> Let's see how this story unfolds. At least I am in no hurry about it.
This may not be an issue of power management but rather one of
insufficient power. A laptop may not provide enough power through its
USB ports for the Kingston SSD to work properly under load.
You can test this by connecting a powered USB-3 hub between the laptop
and the drive.
Alan Stern
* Re: I/O errors while writing to external Transcend XS-2000 4TB SSD
2024-02-15 15:19 ` Alan Stern
@ 2024-02-15 15:36 ` Martin Steigerwald
0 siblings, 0 replies; 10+ messages in thread
From: Martin Steigerwald @ 2024-02-15 15:36 UTC (permalink / raw)
To: Alan Stern
Cc: Kent Overstreet, stable, regressions, linux-usb,
Holger Hoffstätte, linux-bcachefs, linux-block
Alan Stern - 15.02.24, 16:19:54 CET:
> > First I will have a look on how to see what USB power management
> > options may be in place and how to tell Linux to keep the USB port
> > the SSD is connected to at all times.
> >
> > Let's see how this story unfolds. At least I am in no hurry about it.
>
> This may not be an issue of power management but rather one of
> insufficient power. A laptop may not provide enough power through its
> USB ports for the Transcend SSD to work properly under load.
>
> You can test this by connecting a powered UBS-3 hub between the laptop
> and the drive.
Interesting idea. Maybe the Kingston XS-2000 4TB needs more power than
the Sandisk Extreme Pro 2TB.
Not sure whether I have one with USB-C at hand here, because my regular USB
hub only has USB-A connectors. I need to look for one with enough USB-A and
USB-C connectors, as I use a USB hub as a replacement for a docking station.
But I do have at least one optionally powered hub with USB-C at another
place. It does not have many ports, but for the task ahead one USB-C port
is sufficient.
I will try this as well. Thanks.
Best,
--
Martin
* Re: I/O errors while writing to external Transcend XS-2000 4TB SSD
2024-02-11 18:51 ` Kent Overstreet
2024-02-12 15:52 ` Martin Steigerwald
@ 2024-03-15 9:08 ` Martin Steigerwald
1 sibling, 0 replies; 10+ messages in thread
From: Martin Steigerwald @ 2024-03-15 9:08 UTC (permalink / raw)
To: Kent Overstreet
Cc: stable, regressions, linux-usb, Holger Hoffstätte,
linux-bcachefs
Hi!
Kent Overstreet - 11.02.24, 19:51:32 CET:
> He only got errors after an hour or so, or 10 minutes with UAS disabled;
> we send flushes once a second. Sounds like a screwy device.
Kingston support intends to replace the XS-2000 4 TB SSD via RMA with a
variant that has a newer firmware version, in case they have it available,
while they work on a newer firmware version for the device variant the error
happened on.
So it appears the device has a bug. I will keep you posted once I either
receive that other variant or a firmware upgrade for the existing one.
I am happy with Kingston support so far. It takes quite a while, but they
are taking the issue seriously instead of telling me to use Windows instead
of Linux or something like that :) - as I have read on other occasions
with hardware from other suppliers. Thanks!
Best,
--
Martin