From: Warren Chartier <icebalm@icebalm.com>
To: linux-nvme <linux-nvme@lists.infradead.org>
Subject: PROBLEM: XPG Gammix S70 Blade PCIe Gen 4 NVMe drive unusable in Linux
Date: Sun, 6 Feb 2022 16:59:56 -0500 (EST)
Message-ID: <880105512.126.1644184796757.JavaMail.zimbra@icebalm.com>
In-Reply-To: <1897892278.92.1644177774073.JavaMail.zimbra@icebalm.com>

Summary: XPG Gammix S70 Blade PCIe Gen 4 NVMe drive unusable in Linux 

Full Description: 
The XPG Gammix S70 Blade 1TB PCIe Gen 4.0 NVMe drive is detected by the Linux kernel. However, when block operations are performed on it, these errors are generated:

[ 3.958786] nvme 0000:0e:00.0: invalid VPD tag 0xff (size 65535) at offset 7 
[ 71.726420] nvme nvme1: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff 
[ 71.793517] block nvme1n1: no usable path - requeuing I/O 
[ 71.793523] block nvme1n1: no usable path - requeuing I/O 
[ 71.793525] block nvme1n1: no usable path - requeuing I/O 
[ 71.793527] block nvme1n1: no usable path - requeuing I/O 
[ 71.793528] block nvme1n1: no usable path - requeuing I/O 
[ 71.816389] nvme 0000:0e:00.0: can't change power state from D3cold to D0 (config space inaccessible) 
[ 71.816527] nvme nvme1: Removing after probe failure status: -19 
[ 71.856406] block nvme1n1: no available path - failing I/O 
[ 71.856425] block nvme1n1: no available path - failing I/O 
[ 71.856429] block nvme1n1: no available path - failing I/O 
[ 71.856432] block nvme1n1: no available path - failing I/O 
[ 71.856435] block nvme1n1: no available path - failing I/O 
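
The CSTS=0xffffffff and PCI_STATUS=0xffff values above are all-ones reads, which typically means the device has stopped responding on the bus rather than reporting a real status. As a minimal sketch for anyone checking a saved dmesg log for the same symptom, the following Python assumes the exact message format shown above; the function name find_all_ones_resets is hypothetical and not part of any kernel tooling.

```python
import re

# Matches the "controller is down" message format shown in the log above.
# Assumption: other kernel versions may word this message differently.
CTRL_DOWN = re.compile(
    r"nvme (?P<ctrl>nvme\d+): controller is down; will reset: "
    r"CSTS=0x(?P<csts>[0-9a-fA-F]+), PCI_STATUS=0x(?P<pci>[0-9a-fA-F]+)"
)


def find_all_ones_resets(log_text):
    """Return controller names whose CSTS and PCI_STATUS both read as all ones."""
    hits = []
    for line in log_text.splitlines():
        m = CTRL_DOWN.search(line)
        if m is None:
            continue
        csts = int(m.group("csts"), 16)
        pci_status = int(m.group("pci"), 16)
        # CSTS is a 32-bit register, PCI_STATUS a 16-bit one.
        if csts == 0xFFFFFFFF and pci_status == 0xFFFF:
            hits.append(m.group("ctrl"))
    return hits


sample = (
    "[ 71.726420] nvme nvme1: controller is down; will reset: "
    "CSTS=0xffffffff, PCI_STATUS=0xffff"
)
print(find_all_ones_resets(sample))
```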

Some block operations seem to succeed, since the kernel is at least able to read the partition table from the drive:
[ 0.672616] nvme nvme1: pci function 0000:0e:00.0 
[ 0.680459] nvme nvme1: 32/0/0 default/read/poll queues 
[ 0.682463] nvme1n1: p1 p2 p3 p4 

However, any kind of user operation, such as running a partition editor or attempting to mount a filesystem, causes the errors above, and the drive becomes unusable.

This drive works perfectly fine in Windows 10 on the same system. The drive also works fine in a PlayStation 5.

Keywords: nvme kernel 

Kernel version: Linux version 5.16.5-arch1-1 (linux@archlinux) (gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 2.36.1) #1 SMP PREEMPT Tue, 01 Feb 2022 21:42:50 +0000 

Software: 
GNU C 11.1.0 
GNU Make 4.3 
Binutils 2.36.1 
Util-linux 2.37.3 
Mount 2.37.3 
Module-init-tools 29 
E2fsprogs 1.46.5 
Xfsprogs 5.14.2 
PPP 2.4.9 
Bison 3.8.2 
Flex 2.6.4 
Linux C++ Library 6.0.29 
Linux C Library 2.33 
Dynamic linker (ldd) 2.33 
Procps 3.3.17 
Kbd 2.4.0 
Console-tools 2.4.0 
Sh-utils 9.0 
Udev 250 
Modules Loaded acpi_cpufreq aesni_intel af_alg algif_hash algif_skcipher be2net blake2b_generic bluetooth bnep bpf_preload bridge btbcm btintel btrfs btrtl btusb ccp cdrom cfg80211 cmac crc16 crc32c_generic crc32c_intel crc32_pclmul crct10dif_pclmul cryptd crypto_simd crypto_user dca dm_mod ecdh_generic edac_mce_amd ext4 fat fuse ghash_clmulni_intel hfs hfsplus i2c_piix4 igb intel_rapl_common intel_rapl_msr ip6table_filter ip6_tables iptable_filter ip_tables irqbypass iwlmvm iwlwifi jbd2 jfs joydev k10temp kvm kvm_amd libarc4 libcrc32c llc mac80211 mac_hid mbcache mc minix mousedev msdos mxm_wmi nls_iso8859_1 nvidia nvidia_drm nvidia_modeset nvidia_uvm pcspkr pinctrl_amd raid6_pq rapl rfcomm rfkill rng_core sg snd snd_hda_codec snd_hda_codec_hdmi snd_hda_core snd_hda_intel snd_hrtimer snd_hwdep snd_intel_dspcfg snd_intel_sdw_acpi snd_pcm snd_rawmidi snd_seq snd_seq_device snd_seq_dummy snd_timer snd_usb_audio snd_usbmidi_lib soundcore sp5100_tco stp ufs usbhid uvcvideo vfat vfio vfio_iommu_type1 vfio_pci vfio_pci_core vfio_virqfd videobuf2_common videobuf2_memops videobuf2_v4l2 videobuf2_vmalloc videodev wmi wmi_bmof xfs xhci_pci xhci_pci_renesas xor x_tables 

Processor Information: 
processor : 0 
vendor_id : AuthenticAMD 
cpu family : 23 
model : 113 
model name : AMD Ryzen 7 3800X 8-Core Processor 
stepping : 0 
microcode : 0x8701021 
cpu MHz : 2200.000 
cache size : 512 KB 
physical id : 0 
siblings : 16 
core id : 0 
cpu cores : 8 
apicid : 0 
initial apicid : 0 
fpu : yes 
fpu_exception : yes 
cpuid level : 16 
wp : yes 
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sme sev sev_es 
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass 
bogomips : 7803.32 
TLB size : 3072 4K pages 
clflush size : 64 
cache_alignment : 64 
address sizes : 43 bits physical, 48 bits virtual 
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14] 

lspci -vvv for the offending NVMe drive after cold boot before trying to access it: 
0e:00.0 Non-Volatile memory controller: ADATA Technology Co., Ltd. Device 5236 (rev 01) (prog-if 02 [NVM Express]) 
Subsystem: Device 1dbe:5236 
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ 
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort+ <TAbort- <MAbort- >SERR- <PERR- INTx- 
Latency: 0, Cache Line Size: 64 bytes 
Interrupt: pin A routed to IRQ 45 
NUMA node: 0 
IOMMU group: 29 
Region 0: Memory at fce30000 (64-bit, non-prefetchable) [size=16K] 
Region 4: Memory at fce20000 (64-bit, non-prefetchable) [size=64K] 
Expansion ROM at fce00000 [disabled] [size=128K] 
Capabilities: [40] Power Management version 3 
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) 
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- 
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ 
Address: 0000000000000000 Data: 0000 
Masking: 00000000 Pending: 00000000 
Capabilities: [70] Express (v2) Endpoint, MSI 00 
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited 
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W 
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+ 
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset- 
MaxPayload 512 bytes, MaxReadReq 512 bytes 
DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend- 
LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us 
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+ 
LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+ 
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- 
LnkSta: Speed 16GT/s (ok), Width x4 (ok) 
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- 
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+ 
10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix- 
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- 
FRS- TPHComp- ExtTPHComp- 
AtomicOpsCap: 32bit- 64bit- 128bitCAS- 
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled, 
AtomicOpsCtl: ReqEn- 
LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS- 
LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis- 
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- 
Compliance De-emphasis: -6dB 
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+ 
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest- 
Retimer- 2Retimers- CrosslinkRes: Upstream Port 
Capabilities: [b0] MSI-X: Enable+ Count=66 Masked- 
Vector table: BAR=0 offset=00002000 
PBA: BAR=0 offset=00003000 
Capabilities: [d0] Vital Product Data 
Product Name: ABCD 
End 
Capabilities: [100 v2] Advanced Error Reporting 
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- 
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- 
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- 
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- 
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ 
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn- 
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- 
HeaderLog: 00000000 00000000 00000000 00000000 
Capabilities: [148 v1] Alternative Routing-ID Interpretation (ARI) 
ARICap: MFVC- ACS-, Next Function: 0 
ARICtl: MFVC- ACS-, Function Group: 0 
Capabilities: [158 v1] Secondary PCI Express 
LnkCtl3: LnkEquIntrruptEn- PerformEqu- 
LaneErrStat: 0 
Capabilities: [178 v1] Physical Layer 16.0 GT/s <?> 
Capabilities: [19c v1] Lane Margining at the Receiver <?> 
Capabilities: [1b4 v1] Single Root I/O Virtualization (SR-IOV) 
IOVCap: Migration-, Interrupt Message Number: 000 
IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy- 
IOVSta: Migration- 
Initial VFs: 32, Total VFs: 32, Number of VFs: 0, Function Dependency Link: 00 
VF offset: 256, stride: 256, Device ID: 5208 
Supported Page Size: 00000553, System Page Size: 00000001 
Region 0: Memory at 00000000fce34000 (64-bit, non-prefetchable) 
VF Migration: offset: 00000000, BIR: 0 
Capabilities: [1f4 v1] Latency Tolerance Reporting 
Max snoop latency: 1048576ns 
Max no snoop latency: 1048576ns 
Capabilities: [1fc v1] L1 PM Substates 
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ 
PortCommonModeRestoreTime=10us PortTPowerOnTime=10us 
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1- 
T_CommonMode=0us LTR1.2_Threshold=32768ns 
L1SubCtl2: T_PwrOn=10us 
Capabilities: [20c v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?> 
Capabilities: [244 v1] Data Link Feature <?> 
Kernel driver in use: nvme 

After trying to access it and receiving errors: 
0e:00.0 Non-Volatile memory controller: ADATA Technology Co., Ltd. Device 5236 (rev ff) (prog-if ff) 
!!! Unknown header type 7f 
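
The "rev ff", "prog-if ff" and "header type 7f" values above are what one would expect if every byte of the device's PCI configuration space reads back as 0xff (lspci masks the top bit of the header type, so 0xff shows as 7f). A minimal Python sketch of that decoding follows, using the standard configuration header offsets; the function name summarize_config_header is hypothetical, and the input here is synthetic rather than captured from the drive.

```python
def summarize_config_header(cfg: bytes) -> dict:
    """Decode a few fields from the first 16 bytes of PCI config space."""
    if len(cfg) < 0x10:
        raise ValueError("need at least 16 bytes of config space")
    return {
        "vendor_id": int.from_bytes(cfg[0x00:0x02], "little"),
        "device_id": int.from_bytes(cfg[0x02:0x04], "little"),
        "revision": cfg[0x08],
        "prog_if": cfg[0x09],
        # Bit 7 of the header type byte is the multi-function flag,
        # so only the low 7 bits give the header layout.
        "header_type": cfg[0x0E] & 0x7F,
        "all_ones": all(b == 0xFF for b in cfg[:0x10]),
    }


# Synthetic input: what a device that no longer responds would return.
dead = bytes([0xFF]) * 64
info = summarize_config_header(dead)
print(f"rev {info['revision']:02x}, prog-if {info['prog_if']:02x}, "
      f"header type {info['header_type']:02x}, all ones: {info['all_ones']}")
```

On a live system the bytes could be read from the sysfs config file for the device, for example /sys/bus/pci/devices/0000:0e:00.0/config for the address in this report.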


