* [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 6:36 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 6:36 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 2088 bytes --]
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 6480 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 9:12 Wu, WenzhongX
0 siblings, 0 replies; 21+ messages in thread
From: Wu, WenzhongX @ 2017-11-14 9:12 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 3748 bytes --]
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 11260 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 13:45 Luse, Paul E
0 siblings, 0 replies; 21+ messages in thread
From: Luse, Paul E @ 2017-11-14 13:45 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 4655 bytes --]
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 13015 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 17:42 Verkamp, Daniel
0 siblings, 0 replies; 21+ messages in thread
From: Verkamp, Daniel @ 2017-11-14 17:42 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 5551 bytes --]
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 15518 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 18:09 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 18:09 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 4374 bytes --]
Thanks Wu. I thought it could be hardware problem with NVMe device. I do not understand why the nvme probe worked in perf application.
I agree with your investigation that it could be a hardware problem, am going to get a different SSD device and try the test with the latest SPDK code. How do I know intel version is P3700?
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 1:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 12678 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 18:39 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 18:39 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 5769 bytes --]
Hi Paul,
Thanks for your comments. Please see my response inline.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 5:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass?
>> As you have seen from the log, we know it fails in initialization of SSD device.
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
I have 2 SSD's and one of them fails in initialization, in case of other device, looks it has hardware problem and the SPDK driver does not see it. I am going to get another SSD and try the same test and will post you the results after testing it. It may take a while as I have to contact my Lab personnel. I have not tested on other models.
Yes, the test on perf always passes and the hello_world failed every time. I tried at least 5 times.
I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
>> Do you think it would be good to try to increase the timeout and check if initialization succeeds? Any help would be appreciated.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 15563 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 18:42 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 18:42 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 6051 bytes --]
Thanks Daniel for your comments. I think the SSD devices are of same model looking at lspci output.
Let me confirm it with my lab administrator.
I will add the debug code and give you the output.
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 9:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 16983 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 19:43 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 19:43 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 6633 bytes --]
Hi Daniel,
I added the debug flag you mentioned in the program and tested it. There is not much of any log which is helpful for the probe of the device.
Please ignore the printf from func pci_probe_all_drivers, it was printf added by me.
I have attached the file.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 10:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel for your comments. I think the SSD devices are of same model looking at lspci output.
Let me confirm it with my lab administrator.
I will add the debug code and give you the output.
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 9:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 18377 bytes --]
[-- Attachment #3: hello_world_dbg_SSD_probe_1.txt --]
[-- Type: text/plain, Size: 11527 bytes --]
sudo ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 32 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
pci_probe_all_drivers: no driver found for this device.
pci_probe_all_drivers: no driver found for this device.
...
Initializing NVMe Controllers
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
..
EAL: PCI device 0000:ad:00.0 on NUMA socket 1
EAL: probe driver: 144d:a802 spdk_nvme
rte_pci_map_device: get map resources kdrv 3.
rte_pci_map_device: get uio resources ret 0.
rte_pci_probe_one_driver:call probe for driver.
Attaching to 0000:ad:00.0
pci_probe_all_drivers: no driver found for this device.
EAL: PCI device 0000:ae:00.0 on NUMA socket 1
EAL: probe driver: 144d:a802 spdk_nvme
rte_pci_map_device: get map resources kdrv 3.
rte_pci_map_device: get uio resources ret 0.
rte_pci_probe_one_driver:call probe for driver.
Attaching to 0000:ae:00.0
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
rte_pci_probe_one_driver: pci not match for driver.
pci_probe_all_drivers: cannot use the device, continue.
pci_probe_all_drivers: no driver found for this device.
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 19:49 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 19:49 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 7410 bytes --]
Hi Daniel,
There is another test which I did to debug the device where identify command has failed . I ran the following test to see if kernel driver is able to bind it. Does this test give any information?
sudo ./scripts/setup.sh reset
0000:ad:00.0 (144d a802): uio_pci_generic -> nvme
0000:ae:00.0 (144d a802): uio_pci_generic -> nvme
sreeni.busam(a)sjca5radevct13:~/spdk_v1/spdk-17.10$
sreeni.busam(a)sjca5radevct13:~/spdk_v1/spdk-17.10$ sudo ./scripts/setup.sh
0000:ad:00.0 (144d a802): nvme -> uio_pci_generic
From: Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 11:44 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: RE: Helloworld program fails during device initialization.
Hi Daniel,
I added the debug flag you mentioned in the program and tested it. There is not much of any log which is helpful for the probe of the device.
Please ignore the printf from func pci_probe_all_drivers, it was printf added by me.
I have attached the file.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 10:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel for your comments. I think the SSD devices are of same model looking at lspci output.
Let me confirm it with my lab administrator.
I will add the debug code and give you the output.
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 9:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 20323 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 19:55 Luse, Paul E
0 siblings, 0 replies; 21+ messages in thread
From: Luse, Paul E @ 2017-11-14 19:55 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 10338 bytes --]
Sreeni,
Are you sure you got those statements added (also need to add #include "spdk/log.h" as I didn't see some expected output in your text. Here's what it should look like:
peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 72 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:06:00.0
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init (no timeout)
nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 20000 ms)
nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 20000 ms)
nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size 131072
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9a000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk 256 blocks
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:06:00.0
Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1 namespaces.
Namespace ID: 1 size: 400GB
Initialization complete.
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
Hello world!
nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout = 10000 ms
nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in 1808 milliseconds
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 12:44 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Daniel,
I added the debug flag you mentioned in the program and tested it. There is not much of any log which is helpful for the probe of the device.
Please ignore the printf from func pci_probe_all_drivers, it was printf added by me.
I have attached the file.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 10:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel for your comments. I think the SSD devices are of same model looking at lspci output.
Let me confirm it with my lab administrator.
I will add the debug code and give you the output.
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 9:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 25901 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 20:05 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 20:05 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 10727 bytes --]
Yes Paul. I added the header file log.h.
Let me compile hello_world and library again and try it.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 11:56 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Sreeni,
Are you sure you got those statements added (also need to add #include "spdk/log.h" as I didn't see some expected output in your text. Here's what it should look like:
peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 72 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:06:00.0
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init (no timeout)
nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 20000 ms)
nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 20000 ms)
nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size 131072
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9a000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk 256 blocks
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:06:00.0
Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1 namespaces.
Namespace ID: 1 size: 400GB
Initialization complete.
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
Hello world!
nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout = 10000 ms
nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in 1808 milliseconds
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 12:44 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Daniel,
I added the debug flag you mentioned in the program and tested it. There is not much of any log which is helpful for the probe of the device.
Please ignore the printf from func pci_probe_all_drivers, it was printf added by me.
I have attached the file.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 10:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel for your comments. I think the SSD devices are of same model looking at lspci output.
Let me confirm it with my lab administrator.
I will add the debug code and give you the output.
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 9:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 27039 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 20:06 Verkamp, Daniel
0 siblings, 0 replies; 21+ messages in thread
From: Verkamp, Daniel @ 2017-11-14 20:06 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 10807 bytes --]
Sorry, forgot to mention that you also need to compile with debug enabled (make CONFIG_DEBUG=y or ./configure --enable-debug); otherwise, the trace flag call doesn't do anything.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 12:56 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Sreeni,
Are you sure you got those statements added (also need to add #include "spdk/log.h" as I didn't see some expected output in your text. Here's what it should look like:
peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 72 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:06:00.0
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init (no timeout)
nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 20000 ms)
nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 20000 ms)
nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size 131072
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9a000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk 256 blocks
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:06:00.0
Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1 namespaces.
Namespace ID: 1 size: 400GB
Initialization complete.
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
Hello world!
nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout = 10000 ms
nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in 1808 milliseconds
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 12:44 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Daniel,
I added the debug flag you mentioned in the program and tested it. There is not much of any log which is helpful for the probe of the device.
Please ignore the printf from func pci_probe_all_drivers, it was printf added by me.
I have attached the file.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 10:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel for your comments. I think the SSD devices are of same model looking at lspci output.
Let me confirm it with my lab administrator.
I will add the debug code and give you the output.
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 9:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 26991 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 20:14 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 20:14 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 13451 bytes --]
Thanks Daniel and Paul for helping debug it.
This is great, I know how to debug SPDK code :)
Here is the debug log.
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1380:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 0 - waiting for reset to complete
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 1 (timeout 30000 ms)
nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 30000 ms)
nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 30000 ms)
nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x100ff4c000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x3fb0d4c000
nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x100ff4b000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x3fb0d4b000
nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown complete
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 12:06 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Sorry, forgot to mention that you also need to compile with debug enabled (make CONFIG_DEBUG=y or ./configure --enable-debug); otherwise, the trace flag call doesn't do anything.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 12:56 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Sreeni,
Are you sure you got those statements added (also need to add #include "spdk/log.h" as I didn't see some expected output in your text. Here's what it should look like:
peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 72 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:06:00.0
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init (no timeout)
nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 20000 ms)
nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 20000 ms)
nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size 131072
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9a000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk 256 blocks
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:06:00.0
Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1 namespaces.
Namespace ID: 1 size: 400GB
Initialization complete.
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
Hello world!
nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout = 10000 ms
nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in 1808 milliseconds
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 12:44 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Daniel,
I added the debug flag you mentioned in the program and tested it. There is not much of any log which is helpful for the probe of the device.
Please ignore the printf from func pci_probe_all_drivers, it was printf added by me.
I have attached the file.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 10:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel for your comments. I think the SSD devices are of same model looking at lspci output.
Let me confirm it with my lab administrator.
I will add the debug code and give you the output.
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 9:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 32558 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 21:07 Luse, Paul E
0 siblings, 0 replies; 21+ messages in thread
From: Luse, Paul E @ 2017-11-14 21:07 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 15188 bytes --]
So in your output the highlighted lines below show one extra state that your SSD goes through during init vs the SSD I have, looks like NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1 is the state which is also state 1 in the enum which matches the error listed further below in red. Having not spent much time debugging this driver's init I can't say for sure if this (the additional output/steps between the extra state and the timeout) is expected or not.
You can wait for Daniel or someone w/more experience to reply or if you want to experiment you can either step through the hello_world code with gdb putting a break point on nvme_ctrlr_process_init() with a good drive then with this one and get an idea of the correct flow vs what you are seeing you can maybe mes around with the timeout being set in the condition below which is seen on your drive but not on mine. Up to you, my suggestion is just a shot in the dark.
if (csts.bits.rdy == 0) {
SPDK_DEBUGLOG(SPDK_TRACE_NVME, "CC.EN = 1 && CSTS.RDY = 0 - waiting for reset to complete\n");
nvme_ctrlr_set_state(ctrlr, NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1, ready_timeout_in_ms);
return 0;
}
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 1:14 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel and Paul for helping debug it.
This is great, I know how to debug SPDK code :)
Here is the debug log.
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1380:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 0 - waiting for reset to complete
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 1 (timeout 30000 ms)
nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 30000 ms)
nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 30000 ms)
nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x100ff4c000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x3fb0d4c000
nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x100ff4b000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x3fb0d4b000
nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown complete
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 12:06 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Sorry, forgot to mention that you also need to compile with debug enabled (make CONFIG_DEBUG=y or ./configure --enable-debug); otherwise, the trace flag call doesn't do anything.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 12:56 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Sreeni,
Are you sure you got those statements added (also need to add #include "spdk/log.h" as I didn't see some expected output in your text. Here's what it should look like:
peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 72 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:06:00.0
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init (no timeout)
nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 20000 ms)
nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 20000 ms)
nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size 131072
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9a000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk 256 blocks
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:06:00.0
Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1 namespaces.
Namespace ID: 1 size: 400GB
Initialization complete.
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
Hello world!
nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout = 10000 ms
nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in 1808 milliseconds
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 12:44 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Daniel,
I added the debug flag you mentioned in the program and tested it. There is not much of any log which is helpful for the probe of the device.
Please ignore the printf from func pci_probe_all_drivers, it was printf added by me.
I have attached the file.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 10:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel for your comments. I think the SSD devices are of same model looking at lspci output.
Let me confirm it with my lab administrator.
I will add the debug code and give you the output.
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 9:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 37134 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 21:38 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 21:38 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 15663 bytes --]
Thanks Paul. I am going to wait for Daniel or other experts comments.
This is not a blocking issue for me as perf works for me. I appreciate your time in debugging the problem.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 1:08 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
So in your output the highlighted lines below show one extra state that your SSD goes through during init vs the SSD I have, looks like NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1 is the state which is also state 1 in the enum which matches the error listed further below in red. Having not spent much time debugging this driver's init I can't say for sure if this (the additional output/steps between the extra state and the timeout) is expected or not.
You can wait for Daniel or someone w/more experience to reply or if you want to experiment you can either step through the hello_world code with gdb putting a break point on nvme_ctrlr_process_init() with a good drive then with this one and get an idea of the correct flow vs what you are seeing you can maybe mes around with the timeout being set in the condition below which is seen on your drive but not on mine. Up to you, my suggestion is just a shot in the dark.
if (csts.bits.rdy == 0) {
SPDK_DEBUGLOG(SPDK_TRACE_NVME, "CC.EN = 1 && CSTS.RDY = 0 - waiting for reset to complete\n");
nvme_ctrlr_set_state(ctrlr, NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1, ready_timeout_in_ms);
return 0;
}
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 1:14 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel and Paul for helping debug it.
This is great, I know how to debug SPDK code :)
Here is the debug log.
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1380:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 0 - waiting for reset to complete
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 1 (timeout 30000 ms)
nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 30000 ms)
nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 30000 ms)
nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x100ff4c000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x3fb0d4c000
nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x100ff4b000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x3fb0d4b000
nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown complete
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 12:06 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Sorry, forgot to mention that you also need to compile with debug enabled (make CONFIG_DEBUG=y or ./configure --enable-debug); otherwise, the trace flag call doesn't do anything.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 12:56 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Sreeni,
Are you sure you got those statements added (also need to add #include "spdk/log.h" as I didn't see some expected output in your text. Here's what it should look like:
peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 72 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:06:00.0
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init (no timeout)
nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 20000 ms)
nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 20000 ms)
nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size 131072
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9a000 len:4096
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk 256 blocks
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff9b000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:06:00.0
Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1 namespaces.
Namespace ID: 1 size: 400GB
Initialization complete.
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe73000 len:512
nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
Hello world!
nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout = 10000 ms
nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in 1808 milliseconds
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 12:44 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Daniel,
I added the debug flag you mentioned in the program and tested it. There is not much of any log which is helpful for the probe of the device.
Please ignore the printf from func pci_probe_all_drivers, it was printf added by me.
I have attached the file.
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 10:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Thanks Daniel for your comments. I think the SSD devices are of same model looking at lspci output.
Let me confirm it with my lab administrator.
I will add the debug code and give you the output.
Sreeni
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
Sent: Tuesday, November 14, 2017 9:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks to me like the device at 0000:ad:00.0 worked in both tests, and the device at 0000:ae:00.0 failed during identify and wasn't used in perf due to the -r argument specifying only 0000:ad:00.0.
Are these two devices the same model/firmware of NVMe controller, or is there some difference between them?
It would also be useful to see the debug output from the NVMe library for the failing controller; you can enable this in hello_world.c by adding these two lines at the beginning of main():
spdk_log_set_print_level(SPDK_LOG_DEBUG);
spdk_log_set_trace_flag("all");
Thanks,
-- Daniel
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 6:46 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Well, maybe not. Could very well be an issue in the driver that's only exposed via this hardware especially given that hello_world failed and perf did not.
Sreeni, do you have any other data points that you can share? Like, for example, have you tried multiple SSDs of that same model or any other model? Does hello consistently fail and perf consistently pass? I haven't done a ton of debug with either but with a little more info, if someone else doesn't jump in with the answer, I can pretty easily help compare the init sequence in a few of these and maybe start to narrow down what's different.
Thx
Paul
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
Sent: Tuesday, November 14, 2017 2:13 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Sreeni,
Verify with newest spdk source code with intel P3700 ,run helloword everything is ok:
root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world# ./hello_world
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 36 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:81:00.0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:84:00.0
EAL: PCI device 0000:85:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
Attaching to 0000:85:00.0
Attached to 0000:85:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:81:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Attached to 0000:84:00.0
Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1 namespaces.
Namespace ID: 1 size: 800GB
Initialization complete.
Hello world!
Hello world!
Hello world!
Seem it's your hardware cause this issues.
Thanks
Wenzhong
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 2:36 PM
To: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
Subject: [SPDK] Helloworld program fails during device initialization.
I was testing the helloworld program on ubuntu with NVMe SSD and the probe() is failing. Please let me know if this is a known problem.
Let me know if there is workaround.
The perf program did not have any problem.
I have attached the output for both tests.
lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
uname -a
Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
1. helloworld output:
---------------------------
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 1
nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5 seconds
spdk_nvme_probe() failed
2. perf output:
sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
---------------
Starting DPDK 17.08.0 initialization...
[ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
Initialization complete. Launching workers.
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MB/s Average min max
SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50 53.80 72.59 60.20 243.82
========================================================
Total : 13773.50 53.80 72.59 60.20 243.82
Thanks for your help,
Sreeni
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 38285 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 21:44 Walker, Benjamin
0 siblings, 0 replies; 21+ messages in thread
From: Walker, Benjamin @ 2017-11-14 21:44 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 17382 bytes --]
SPDK performs parallel initialization of all devices for performance reasons, so
the log below is both of the initialization paths of your devices muxed
together. The part Paul highlighted is for your non-functioning device, while
all of the other log messages are for your functioning device, I think. It would
be easy to tell if you unplugged the working one and only left the one that
wasn't working - can you try that and send the new log to confirm?
When SPDK first reads the registers for your non-functioning SSD to figure out
what state it is in, it sees that the CC.EN bit is set to 1, but the CSTS.RDY
bit is 0. That means the device is performing some internal operation - i.e.
it's already in the process of enabling itself. It is invalid for SPDK to write
any registers until CSTS.RDY flips to 1 in this case. However, it never does and
the code just times out. Unless I'm interpreting something incorrectly, I
believe your device to be out of spec (probably because that particular device
is not functioning, not because all models of that type are out of spec).
Most commonly, problems like this are resolved by power cycling the device. If
you haven't tried rebooting, definitely give that a shot. Did you say that the
device does work with the kernel? When you load the kernel driver can you see
/dev/sdX and dd to it without errors?
Thanks,
Ben
On Tue, 2017-11-14 at 21:07 +0000, Luse, Paul E wrote:
> So in your output the highlighted lines below show one extra state that your
> SSD goes through during init vs the SSD I have, looks like
> NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1 is the state which is also state 1
> in the enum which matches the error listed further below in red. Having not
> spent much time debugging this driver’s init I can’t say for sure if this (the
> additional output/steps between the extra state and the timeout) is expected
> or not.
>
> You can wait for Daniel or someone w/more experience to reply or if you want
> to experiment you can either step through the hello_world code with gdb
> putting a break point on nvme_ctrlr_process_init() with a good drive then with
> this one and get an idea of the correct flow vs what you are seeing you can
> maybe mes around with the timeout being set in the condition below which is
> seen on your drive but not on mine. Up to you, my suggestion is just a shot
> in the dark.
>
> if (csts.bits.rdy == 0) {
>
> SPDK_DEBUGLOG(SPDK_TRACE_NVME, "CC.EN = 1 && CSTS.RDY = 0 - waiting for reset
> to complete\n");
>
> nvme_ctrlr_set_state(ctrlr, NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1,
> ready_timeout_in_ms);
> return 0;
> }
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa)
> Busam (Stellus)
> Sent: Tuesday, November 14, 2017 1:14 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel and Paul for helping debug it.
> This is great, I know how to debug SPDK code J
> Here is the debug log.
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and
> wait for CSTS.RDY = 0 (timeout 30000 ms)
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1380:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 0 -
> waiting for reset to complete
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and
> wait for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to enable
> controller by writing CC.EN = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait for
> CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 -
> controller is ready
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4c000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x3fb0d4c000
> nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4b000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x3fb0d4b000
> nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
> nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready (no
> timeout)
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out
> in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5
> seconds
> spdk_nvme_probe() failed
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown complete
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
> Sent: Tuesday, November 14, 2017 12:06 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sorry, forgot to mention that you also need to compile with debug enabled
> (make CONFIG_DEBUG=y or ./configure --enable-debug); otherwise, the trace flag
> call doesn’t do anything.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
> Sent: Tuesday, November 14, 2017 12:56 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sreeni,
>
> Are you sure you got those statements added (also need to add #include
> “spdk/log.h” as I didn’t see some expected output in your text. Here’s what
> it should look like:
>
> peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 72 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:06:00.0 on NUMA socket 0
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:06:00.0
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init (no
> timeout)
> nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to disable and
> wait for CSTS.RDY = 0 (timeout 20000 ms)
> nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to enable
> controller by writing CC.EN = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait for
> CSTS.RDY = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 -
> controller is ready
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
> nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size 131072
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9a000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
> nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk 256
> blocks
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
> nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready (no
> timeout)
> Attached to 0000:06:00.0
> Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1
> namespaces.
> Namespace ID: 1 size: 400GB
> Initialization complete.
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> Hello world!
> nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout = 10000 ms
> nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in 1808
> milliseconds
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa)
> Busam (Stellus)
> Sent: Tuesday, November 14, 2017 12:44 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Daniel,
>
> I added the debug flag you mentioned in the program and tested it. There is
> not much of any log which is helpful for the probe of the device.
> Please ignore the printf from func pci_probe_all_drivers, it was printf added
> by me.
> I have attached the file.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa)
> Busam (Stellus)
> Sent: Tuesday, November 14, 2017 10:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel for your comments. I think the SSD devices are of same model
> looking at lspci output.
> Let me confirm it with my lab administrator.
> I will add the debug code and give you the output.
>
> Sreeni
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp, Daniel
> Sent: Tuesday, November 14, 2017 9:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> It looks to me like the device at 0000:ad:00.0 worked in both tests, and the
> device at 0000:ae:00.0 failed during identify and wasn’t used in perf due to
> the -r argument specifying only 0000:ad:00.0.
>
> Are these two devices the same model/firmware of NVMe controller, or is there
> some difference between them?
>
> It would also be useful to see the debug output from the NVMe library for the
> failing controller; you can enable this in hello_world.c by adding these two
> lines at the beginning of main():
>
> spdk_log_set_print_level(SPDK_LOG_DEBUG);
> spdk_log_set_trace_flag(“all”);
>
> Thanks,
> -- Daniel
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
> Sent: Tuesday, November 14, 2017 6:46 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Well, maybe not. Could very well be an issue in the driver that’s only
> exposed via this hardware especially given that hello_world failed and perf
> did not.
>
> Sreeni, do you have any other data points that you can share? Like, for
> example, have you tried multiple SSDs of that same model or any other model?
> Does hello consistently fail and perf consistently pass? I haven’t done a ton
> of debug with either but with a little more info, if someone else doesn’t jump
> in with the answer, I can pretty easily help compare the init sequence in a
> few of these and maybe start to narrow down what’s different.
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu, WenzhongX
> Sent: Tuesday, November 14, 2017 2:13 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Sreeni,
>
> Verify with newest spdk source code with intel P3700 ,run helloword everything
> is ok:
> root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world#
> ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 36 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:81:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:81:00.0
> EAL: PCI device 0000:84:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:84:00.0
> EAL: PCI device 0000:85:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:85:00.0
> Attached to 0000:85:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:81:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:84:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Initialization complete.
> Hello world!
> Hello world!
> Hello world!
>
> Seem it’s your hardware cause this issues.
>
> Thanks
> Wenzhong
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa)
> Busam (Stellus)
> Sent: Tuesday, November 14, 2017 2:36 PM
> To: spdk(a)lists.01.org
> Subject: [SPDK] Helloworld program fails during device initialization.
>
> I was testing the helloworld program on ubuntu with NVMe SSD and the probe()
> is failing. Please let me know if this is a known problem.
> Let me know if there is workaround.
> The perf program did not have any problem.
> I have attached the output for both tests.
> lspci| grep Non
> ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD
> Controller (rev 01)
> ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD
> Controller (rev 01)
> uname -a
> Linux sjca5radevct13 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10
> 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> 1. helloworld output:
> ---------------------------
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization timed out
> in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 5
> seconds
> spdk_nvme_probe() failed
>
> 2. perf output:
> sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
> ---------------
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
> EAL: Detected 32 lcore(s)
> EAL: No free hugepages reported in hugepages-1048576kB
> Attached to NVMe Controller at 0000:ad:00.0 [144d:a802]
> Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
> Initialization complete. Launching workers.
> Starting thread on core 0
> ========================================================
>
>
> Latency(us)
> Device Information : IOPS
> MB/s Average min max
> SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50
> 53.80 72.59 60.20 243.82
> ========================================================
> Total : 13773.50
> 53.80 72.59 60.20 243.82
>
> Thanks for your help,
> Sreeni
> _______________________________________________
> SPDK mailing list
> SPDK(a)lists.01.org
> https://lists.01.org/mailman/listinfo/spdk
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3274 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 21:55 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 21:55 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 18500 bytes --]
Yes, I can do it Ben. Let me repeat the test and send you the log.
The second device is not working. For your 2nd question, please see comment inline.
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
NVMe SSD Controller (rev 01) - hello_world fails
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd - Looks like identify fails per Daniel comment.
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Walker, Benjamin
Sent: Tuesday, November 14, 2017 1:44 PM
To: spdk(a)lists.01.org
Subject: Re: [SPDK] Helloworld program fails during device initialization.
SPDK performs parallel initialization of all devices for performance reasons, so the log below is both of the initialization paths of your devices muxed together. The part Paul highlighted is for your non-functioning device, while all of the other log messages are for your functioning device, I think. It would be easy to tell if you unplugged the working one and only left the one that wasn't working - can you try that and send the new log to confirm?
When SPDK first reads the registers for your non-functioning SSD to figure out what state it is in, it sees that the CC.EN bit is set to 1, but the CSTS.RDY bit is 0. That means the device is performing some internal operation - i.e.
it's already in the process of enabling itself. It is invalid for SPDK to write any registers until CSTS.RDY flips to 1 in this case. However, it never does and the code just times out. Unless I'm interpreting something incorrectly, I believe your device to be out of spec (probably because that particular device is not functioning, not because all models of that type are out of spec).
Most commonly, problems like this are resolved by power cycling the device. If you haven't tried rebooting, definitely give that a shot. Did you say that the device does work with the kernel? When you load the kernel driver can you see /dev/sdX and dd to it without errors?
>> I just the ran the script in spdk. The kernel driver binds to the device which fails to be identified in SPDK. I have not done any tests with it, so I do not know whether the failing device works with kernel. I tried rebooting the system, it did not make any difference, let me try removing and inserting it physically.
scripts/setup.sh reset
Thanks,
Ben
On Tue, 2017-11-14 at 21:07 +0000, Luse, Paul E wrote:
> So in your output the highlighted lines below show one extra state
> that your SSD goes through during init vs the SSD I have, looks like
> NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1 is the state which is also
> state 1 in the enum which matches the error listed further below in
> red. Having not spent much time debugging this driver’s init I can’t
> say for sure if this (the additional output/steps between the extra
> state and the timeout) is expected or not.
>
> You can wait for Daniel or someone w/more experience to reply or if
> you want to experiment you can either step through the hello_world
> code with gdb putting a break point on nvme_ctrlr_process_init() with
> a good drive then with this one and get an idea of the correct flow vs
> what you are seeing you can maybe mes around with the timeout being
> set in the condition below which is seen on your drive but not on
> mine. Up to you, my suggestion is just a shot in the dark.
>
> if (csts.bits.rdy == 0) {
>
> SPDK_DEBUGLOG(SPDK_TRACE_NVME, "CC.EN = 1 && CSTS.RDY = 0 - waiting
> for reset to complete\n");
>
> nvme_ctrlr_set_state(ctrlr, NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1,
> ready_timeout_in_ms);
> return 0;
> }
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 1:14 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel and Paul for helping debug it.
> This is great, I know how to debug SPDK code J Here is the debug log.
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1380:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 0 - waiting for reset to complete
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4c000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4c000
> nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4b000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4b000
> nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown complete
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 12:06 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sorry, forgot to mention that you also need to compile with debug
> enabled (make CONFIG_DEBUG=y or ./configure --enable-debug);
> otherwise, the trace flag call doesn’t do anything.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 12:56 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sreeni,
>
> Are you sure you got those statements added (also need to add #include
> “spdk/log.h” as I didn’t see some expected output in your text.
> Here’s what it should look like:
>
> peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 72 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:06:00.0 on NUMA socket 0
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:06:00.0
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init
> (no
> timeout)
> nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
> nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size
> 131072
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9a000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
> nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk
> 256 blocks
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:06:00.0
> Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1
> namespaces.
> Namespace ID: 1 size: 400GB
> Initialization complete.
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> Hello world!
> nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout =
> 10000 ms
> nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in
> 1808 milliseconds
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 12:44 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Daniel,
>
> I added the debug flag you mentioned in the program and tested it.
> There is not much of any log which is helpful for the probe of the device.
> Please ignore the printf from func pci_probe_all_drivers, it was
> printf added by me.
> I have attached the file.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 10:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel for your comments. I think the SSD devices are of same
> model looking at lspci output.
> Let me confirm it with my lab administrator.
> I will add the debug code and give you the output.
>
> Sreeni
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 9:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> It looks to me like the device at 0000:ad:00.0 worked in both tests,
> and the device at 0000:ae:00.0 failed during identify and wasn’t used
> in perf due to the -r argument specifying only 0000:ad:00.0.
>
> Are these two devices the same model/firmware of NVMe controller, or
> is there some difference between them?
>
> It would also be useful to see the debug output from the NVMe library
> for the failing controller; you can enable this in hello_world.c by
> adding these two lines at the beginning of main():
>
> spdk_log_set_print_level(SPDK_LOG_DEBUG);
> spdk_log_set_trace_flag(“all”);
>
> Thanks,
> -- Daniel
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 6:46 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Well, maybe not. Could very well be an issue in the driver that’s
> only exposed via this hardware especially given that hello_world
> failed and perf did not.
>
> Sreeni, do you have any other data points that you can share? Like,
> for example, have you tried multiple SSDs of that same model or any other model?
> Does hello consistently fail and perf consistently pass? I haven’t
> done a ton of debug with either but with a little more info, if
> someone else doesn’t jump in with the answer, I can pretty easily help
> compare the init sequence in a few of these and maybe start to narrow down what’s different.
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu,
> WenzhongX
> Sent: Tuesday, November 14, 2017 2:13 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Sreeni,
>
> Verify with newest spdk source code with intel P3700 ,run helloword
> everything is ok:
> root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world#
> ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 36 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:81:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:81:00.0
> EAL: PCI device 0000:84:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:84:00.0
> EAL: PCI device 0000:85:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:85:00.0
> Attached to 0000:85:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:81:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:84:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Initialization complete.
> Hello world!
> Hello world!
> Hello world!
>
> Seem it’s your hardware cause this issues.
>
> Thanks
> Wenzhong
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 2:36 PM
> To: spdk(a)lists.01.org
> Subject: [SPDK] Helloworld program fails during device initialization.
>
> I was testing the helloworld program on ubuntu with NVMe SSD and the
> probe() is failing. Please let me know if this is a known problem.
> Let me know if there is workaround.
> The perf program did not have any problem.
> I have attached the output for both tests.
> lspci| grep Non
> ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01)
> ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01) uname -a Linux sjca5radevct13
> 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10
> 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 1. helloworld output:
> ---------------------------
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
>
> 2. perf output:
> sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
> ---------------
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
> EAL: Detected 32 lcore(s)
> EAL: No free hugepages reported in hugepages-1048576kB Attached to
> NVMe Controller at 0000:ad:00.0 [144d:a802]
> Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
> Initialization complete. Launching workers.
> Starting thread on core 0
> ========================================================
>
>
> Latency(us)
> Device Information : IOPS
> MB/s Average min max
> SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50
> 53.80 72.59 60.20 243.82
> ========================================================
> Total : 13773.50
> 53.80 72.59 60.20 243.82
>
> Thanks for your help,
> Sreeni
> _______________________________________________
> SPDK mailing list
> SPDK(a)lists.01.org
> https://lists.01.org/mailman/listinfo/spdk
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 22:25 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 22:25 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 19824 bytes --]
Hi Ben,
Here is the devices in my system. It looks like the failing device does not get identified in Linux kernel. I have not got a chance to run the test with the working device removing the faulty device, will update you on that test.
ls -l /dev/nvme*
crw------- 1 root root 244, 0 Nov 14 14:14 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Nov 14 14:19 /dev/nvme0n1
sreeni.busam(a)sjca5radevct13:~/sreeni_spdk1/spdk$
sreeni.busam(a)sjca5radevct13:~/sreeni_spdk1/spdk$ lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
The dd works on the device which has address "ad".
sudo dd if=/dev/zero of=/dev/nvme0n1 bs=64K count=16K
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.51357 s, 709 MB/s
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 1:56 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Yes, I can do it Ben. Let me repeat the test and send you the log.
The second device is not working. For your 2nd question, please see comment inline.
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01) - hello_world fails
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd - Looks like identify fails per Daniel comment.
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Walker, Benjamin
Sent: Tuesday, November 14, 2017 1:44 PM
To: spdk(a)lists.01.org
Subject: Re: [SPDK] Helloworld program fails during device initialization.
SPDK performs parallel initialization of all devices for performance reasons, so the log below is both of the initialization paths of your devices muxed together. The part Paul highlighted is for your non-functioning device, while all of the other log messages are for your functioning device, I think. It would be easy to tell if you unplugged the working one and only left the one that wasn't working - can you try that and send the new log to confirm?
When SPDK first reads the registers for your non-functioning SSD to figure out what state it is in, it sees that the CC.EN bit is set to 1, but the CSTS.RDY bit is 0. That means the device is performing some internal operation - i.e.
it's already in the process of enabling itself. It is invalid for SPDK to write any registers until CSTS.RDY flips to 1 in this case. However, it never does and the code just times out. Unless I'm interpreting something incorrectly, I believe your device to be out of spec (probably because that particular device is not functioning, not because all models of that type are out of spec).
Most commonly, problems like this are resolved by power cycling the device. If you haven't tried rebooting, definitely give that a shot. Did you say that the device does work with the kernel? When you load the kernel driver can you see /dev/sdX and dd to it without errors?
>> I just the ran the script in spdk. The kernel driver binds to the device which fails to be identified in SPDK. I have not done any tests with it, so I do not know whether the failing device works with kernel. I tried rebooting the system, it did not make any difference, let me try removing and inserting it physically.
scripts/setup.sh reset
Thanks,
Ben
On Tue, 2017-11-14 at 21:07 +0000, Luse, Paul E wrote:
> So in your output the highlighted lines below show one extra state
> that your SSD goes through during init vs the SSD I have, looks like
> NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1 is the state which is also
> state 1 in the enum which matches the error listed further below in
> red. Having not spent much time debugging this driver’s init I can’t
> say for sure if this (the additional output/steps between the extra
> state and the timeout) is expected or not.
>
> You can wait for Daniel or someone w/more experience to reply or if
> you want to experiment you can either step through the hello_world
> code with gdb putting a break point on nvme_ctrlr_process_init() with
> a good drive then with this one and get an idea of the correct flow vs
> what you are seeing you can maybe mes around with the timeout being
> set in the condition below which is seen on your drive but not on
> mine. Up to you, my suggestion is just a shot in the dark.
>
> if (csts.bits.rdy == 0) {
>
> SPDK_DEBUGLOG(SPDK_TRACE_NVME, "CC.EN = 1 && CSTS.RDY = 0 - waiting
> for reset to complete\n");
>
> nvme_ctrlr_set_state(ctrlr, NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1,
> ready_timeout_in_ms);
> return 0;
> }
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 1:14 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel and Paul for helping debug it.
> This is great, I know how to debug SPDK code J Here is the debug log.
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1380:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 0 - waiting for reset to complete
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4c000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4c000
> nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4b000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4b000
> nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown complete
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 12:06 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sorry, forgot to mention that you also need to compile with debug
> enabled (make CONFIG_DEBUG=y or ./configure --enable-debug);
> otherwise, the trace flag call doesn’t do anything.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 12:56 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sreeni,
>
> Are you sure you got those statements added (also need to add #include
> “spdk/log.h” as I didn’t see some expected output in your text.
> Here’s what it should look like:
>
> peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 72 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:06:00.0 on NUMA socket 0
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:06:00.0
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init
> (no
> timeout)
> nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
> nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size
> 131072
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9a000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
> nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk
> 256 blocks
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:06:00.0
> Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1
> namespaces.
> Namespace ID: 1 size: 400GB
> Initialization complete.
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> Hello world!
> nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout =
> 10000 ms
> nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in
> 1808 milliseconds
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 12:44 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Daniel,
>
> I added the debug flag you mentioned in the program and tested it.
> There is not much of any log which is helpful for the probe of the device.
> Please ignore the printf from func pci_probe_all_drivers, it was
> printf added by me.
> I have attached the file.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 10:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel for your comments. I think the SSD devices are of same
> model looking at lspci output.
> Let me confirm it with my lab administrator.
> I will add the debug code and give you the output.
>
> Sreeni
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 9:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> It looks to me like the device at 0000:ad:00.0 worked in both tests,
> and the device at 0000:ae:00.0 failed during identify and wasn’t used
> in perf due to the -r argument specifying only 0000:ad:00.0.
>
> Are these two devices the same model/firmware of NVMe controller, or
> is there some difference between them?
>
> It would also be useful to see the debug output from the NVMe library
> for the failing controller; you can enable this in hello_world.c by
> adding these two lines at the beginning of main():
>
> spdk_log_set_print_level(SPDK_LOG_DEBUG);
> spdk_log_set_trace_flag(“all”);
>
> Thanks,
> -- Daniel
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 6:46 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Well, maybe not. Could very well be an issue in the driver that’s
> only exposed via this hardware especially given that hello_world
> failed and perf did not.
>
> Sreeni, do you have any other data points that you can share? Like,
> for example, have you tried multiple SSDs of that same model or any other model?
> Does hello consistently fail and perf consistently pass? I haven’t
> done a ton of debug with either but with a little more info, if
> someone else doesn’t jump in with the answer, I can pretty easily help
> compare the init sequence in a few of these and maybe start to narrow down what’s different.
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu,
> WenzhongX
> Sent: Tuesday, November 14, 2017 2:13 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Sreeni,
>
> Verify with newest spdk source code with intel P3700 ,run helloword
> everything is ok:
> root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world#
> ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 36 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:81:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:81:00.0
> EAL: PCI device 0000:84:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:84:00.0
> EAL: PCI device 0000:85:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:85:00.0
> Attached to 0000:85:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:81:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:84:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Initialization complete.
> Hello world!
> Hello world!
> Hello world!
>
> Seem it’s your hardware cause this issues.
>
> Thanks
> Wenzhong
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 2:36 PM
> To: spdk(a)lists.01.org
> Subject: [SPDK] Helloworld program fails during device initialization.
>
> I was testing the helloworld program on ubuntu with NVMe SSD and the
> probe() is failing. Please let me know if this is a known problem.
> Let me know if there is workaround.
> The perf program did not have any problem.
> I have attached the output for both tests.
> lspci| grep Non
> ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01)
> ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01) uname -a Linux sjca5radevct13
> 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10
> 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 1. helloworld output:
> ---------------------------
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
>
> 2. perf output:
> sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
> ---------------
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
> EAL: Detected 32 lcore(s)
> EAL: No free hugepages reported in hugepages-1048576kB Attached to
> NVMe Controller at 0000:ad:00.0 [144d:a802]
> Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
> Initialization complete. Launching workers.
> Starting thread on core 0
> ========================================================
>
>
> Latency(us)
> Device Information : IOPS
> MB/s Average min max
> SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50
> 53.80 72.59 60.20 243.82
> ========================================================
> Total : 13773.50
> 53.80 72.59 60.20 243.82
>
> Thanks for your help,
> Sreeni
> _______________________________________________
> SPDK mailing list
> SPDK(a)lists.01.org
> https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 22:46 Luse, Paul E
0 siblings, 0 replies; 21+ messages in thread
From: Luse, Paul E @ 2017-11-14 22:46 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 20313 bytes --]
Great, looking forward to hearing how it goes
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 3:26 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Ben,
Here is the devices in my system. It looks like the failing device does not get identified in Linux kernel. I have not got a chance to run the test with the working device removing the faulty device, will update you on that test.
ls -l /dev/nvme*
crw------- 1 root root 244, 0 Nov 14 14:14 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Nov 14 14:19 /dev/nvme0n1 sreeni.busam(a)sjca5radevct13:~/sreeni_spdk1/spdk$
sreeni.busam(a)sjca5radevct13:~/sreeni_spdk1/spdk$ lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01) The dd works on the device which has address "ad".
sudo dd if=/dev/zero of=/dev/nvme0n1 bs=64K count=16K
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.51357 s, 709 MB/s
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 1:56 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Yes, I can do it Ben. Let me repeat the test and send you the log.
The second device is not working. For your 2nd question, please see comment inline.
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01) - hello_world fails
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd - Looks like identify fails per Daniel comment.
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Walker, Benjamin
Sent: Tuesday, November 14, 2017 1:44 PM
To: spdk(a)lists.01.org
Subject: Re: [SPDK] Helloworld program fails during device initialization.
SPDK performs parallel initialization of all devices for performance reasons, so the log below is both of the initialization paths of your devices muxed together. The part Paul highlighted is for your non-functioning device, while all of the other log messages are for your functioning device, I think. It would be easy to tell if you unplugged the working one and only left the one that wasn't working - can you try that and send the new log to confirm?
When SPDK first reads the registers for your non-functioning SSD to figure out what state it is in, it sees that the CC.EN bit is set to 1, but the CSTS.RDY bit is 0. That means the device is performing some internal operation - i.e.
it's already in the process of enabling itself. It is invalid for SPDK to write any registers until CSTS.RDY flips to 1 in this case. However, it never does and the code just times out. Unless I'm interpreting something incorrectly, I believe your device to be out of spec (probably because that particular device is not functioning, not because all models of that type are out of spec).
Most commonly, problems like this are resolved by power cycling the device. If you haven't tried rebooting, definitely give that a shot. Did you say that the device does work with the kernel? When you load the kernel driver can you see /dev/sdX and dd to it without errors?
>> I just the ran the script in spdk. The kernel driver binds to the device which fails to be identified in SPDK. I have not done any tests with it, so I do not know whether the failing device works with kernel. I tried rebooting the system, it did not make any difference, let me try removing and inserting it physically.
scripts/setup.sh reset
Thanks,
Ben
On Tue, 2017-11-14 at 21:07 +0000, Luse, Paul E wrote:
> So in your output the highlighted lines below show one extra state
> that your SSD goes through during init vs the SSD I have, looks like
> NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1 is the state which is also
> state 1 in the enum which matches the error listed further below in
> red. Having not spent much time debugging this driver’s init I can’t
> say for sure if this (the additional output/steps between the extra
> state and the timeout) is expected or not.
>
> You can wait for Daniel or someone w/more experience to reply or if
> you want to experiment you can either step through the hello_world
> code with gdb putting a break point on nvme_ctrlr_process_init() with
> a good drive then with this one and get an idea of the correct flow vs
> what you are seeing you can maybe mes around with the timeout being
> set in the condition below which is seen on your drive but not on
> mine. Up to you, my suggestion is just a shot in the dark.
>
> if (csts.bits.rdy == 0) {
>
> SPDK_DEBUGLOG(SPDK_TRACE_NVME, "CC.EN = 1 && CSTS.RDY = 0 - waiting
> for reset to complete\n");
>
> nvme_ctrlr_set_state(ctrlr, NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1,
> ready_timeout_in_ms);
> return 0;
> }
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 1:14 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel and Paul for helping debug it.
> This is great, I know how to debug SPDK code J Here is the debug log.
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1380:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 0 - waiting for reset to complete
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4c000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4c000
> nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4b000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4b000
> nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown complete
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 12:06 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sorry, forgot to mention that you also need to compile with debug
> enabled (make CONFIG_DEBUG=y or ./configure --enable-debug);
> otherwise, the trace flag call doesn’t do anything.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 12:56 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sreeni,
>
> Are you sure you got those statements added (also need to add #include
> “spdk/log.h” as I didn’t see some expected output in your text.
> Here’s what it should look like:
>
> peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 72 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:06:00.0 on NUMA socket 0
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:06:00.0
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init
> (no
> timeout)
> nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
> nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size
> 131072
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9a000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
> nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk
> 256 blocks
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:06:00.0
> Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1
> namespaces.
> Namespace ID: 1 size: 400GB
> Initialization complete.
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> Hello world!
> nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout =
> 10000 ms
> nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in
> 1808 milliseconds
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 12:44 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Daniel,
>
> I added the debug flag you mentioned in the program and tested it.
> There is not much of any log which is helpful for the probe of the device.
> Please ignore the printf from func pci_probe_all_drivers, it was
> printf added by me.
> I have attached the file.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 10:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel for your comments. I think the SSD devices are of same
> model looking at lspci output.
> Let me confirm it with my lab administrator.
> I will add the debug code and give you the output.
>
> Sreeni
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 9:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> It looks to me like the device at 0000:ad:00.0 worked in both tests,
> and the device at 0000:ae:00.0 failed during identify and wasn’t used
> in perf due to the -r argument specifying only 0000:ad:00.0.
>
> Are these two devices the same model/firmware of NVMe controller, or
> is there some difference between them?
>
> It would also be useful to see the debug output from the NVMe library
> for the failing controller; you can enable this in hello_world.c by
> adding these two lines at the beginning of main():
>
> spdk_log_set_print_level(SPDK_LOG_DEBUG);
> spdk_log_set_trace_flag(“all”);
>
> Thanks,
> -- Daniel
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 6:46 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Well, maybe not. Could very well be an issue in the driver that’s
> only exposed via this hardware especially given that hello_world
> failed and perf did not.
>
> Sreeni, do you have any other data points that you can share? Like,
> for example, have you tried multiple SSDs of that same model or any other model?
> Does hello consistently fail and perf consistently pass? I haven’t
> done a ton of debug with either but with a little more info, if
> someone else doesn’t jump in with the answer, I can pretty easily help
> compare the init sequence in a few of these and maybe start to narrow down what’s different.
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu,
> WenzhongX
> Sent: Tuesday, November 14, 2017 2:13 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Sreeni,
>
> Verify with newest spdk source code with intel P3700 ,run helloword
> everything is ok:
> root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world#
> ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 36 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:81:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:81:00.0
> EAL: PCI device 0000:84:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:84:00.0
> EAL: PCI device 0000:85:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:85:00.0
> Attached to 0000:85:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:81:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:84:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Initialization complete.
> Hello world!
> Hello world!
> Hello world!
>
> Seem it’s your hardware cause this issues.
>
> Thanks
> Wenzhong
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 2:36 PM
> To: spdk(a)lists.01.org
> Subject: [SPDK] Helloworld program fails during device initialization.
>
> I was testing the helloworld program on ubuntu with NVMe SSD and the
> probe() is failing. Please let me know if this is a known problem.
> Let me know if there is workaround.
> The perf program did not have any problem.
> I have attached the output for both tests.
> lspci| grep Non
> ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01)
> ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01) uname -a Linux sjca5radevct13
> 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10
> 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 1. helloworld output:
> ---------------------------
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
>
> 2. perf output:
> sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
> ---------------
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
> EAL: Detected 32 lcore(s)
> EAL: No free hugepages reported in hugepages-1048576kB Attached to
> NVMe Controller at 0000:ad:00.0 [144d:a802]
> Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
> Initialization complete. Launching workers.
> Starting thread on core 0
> ========================================================
>
>
> Latency(us)
> Device Information : IOPS
> MB/s Average min max
> SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50
> 53.80 72.59 60.20 243.82
> ========================================================
> Total : 13773.50
> 53.80 72.59 60.20 243.82
>
> Thanks for your help,
> Sreeni
> _______________________________________________
> SPDK mailing list
> SPDK(a)lists.01.org
> https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-14 23:48 Sreeni Busam
0 siblings, 0 replies; 21+ messages in thread
From: Sreeni Busam @ 2017-11-14 23:48 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 25446 bytes --]
It looks like Ben's investigation is correct. It looks like physically removing and inserting the "ad" device fixed the device.
lspci| grep Non
89:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a804 [- new device]
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01) - [device was reset]
The initialization has completed successfully for both devices.
I do not see any errors. The program is working.
Thank you Ben, Daniel and Paul for spending time on this issue.
Test log.
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 30000 ms)
nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 30000 ms)
nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 20000 ms)
nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 20000 ms)
nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff4c000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0f4c000
nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff4b000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0f4b000
nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff4b000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0f4b000
nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_ctrlr.c: 717:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size 262144
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff4a000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0f4a000
nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout)
Attached to 0000:89:00.0
Using controller SAMSUNG MZQLW3T8HMLP (S338NXAH300063 ) with 1 namespaces.
Namespace ID: 1 size: 17GB
Initialization complete.
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe23000 len:512
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0e23000
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe23000 len:512
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0e23000
Hello world!
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe23000 len:512
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0e23000
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe23000 len:512
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0e23000
Hello world!
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 2:47 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Great, looking forward to hearing how it goes
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 3:26 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Ben,
Here is the devices in my system. It looks like the failing device does not get identified in Linux kernel. I have not got a chance to run the test with the working device removing the faulty device, will update you on that test.
ls -l /dev/nvme*
crw------- 1 root root 244, 0 Nov 14 14:14 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Nov 14 14:19 /dev/nvme0n1 sreeni.busam(a)sjca5radevct13:~/sreeni_spdk1/spdk$
sreeni.busam(a)sjca5radevct13:~/sreeni_spdk1/spdk$ lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01) The dd works on the device which has address "ad".
sudo dd if=/dev/zero of=/dev/nvme0n1 bs=64K count=16K
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.51357 s, 709 MB/s
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 1:56 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Yes, I can do it Ben. Let me repeat the test and send you the log.
The second device is not working. For your 2nd question, please see comment inline.
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01) - hello_world fails
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd - Looks like identify fails per Daniel comment.
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Walker, Benjamin
Sent: Tuesday, November 14, 2017 1:44 PM
To: spdk(a)lists.01.org
Subject: Re: [SPDK] Helloworld program fails during device initialization.
SPDK performs parallel initialization of all devices for performance reasons, so the log below is both of the initialization paths of your devices muxed together. The part Paul highlighted is for your non-functioning device, while all of the other log messages are for your functioning device, I think. It would be easy to tell if you unplugged the working one and only left the one that wasn't working - can you try that and send the new log to confirm?
When SPDK first reads the registers for your non-functioning SSD to figure out what state it is in, it sees that the CC.EN bit is set to 1, but the CSTS.RDY bit is 0. That means the device is performing some internal operation - i.e.
it's already in the process of enabling itself. It is invalid for SPDK to write any registers until CSTS.RDY flips to 1 in this case. However, it never does and the code just times out. Unless I'm interpreting something incorrectly, I believe your device to be out of spec (probably because that particular device is not functioning, not because all models of that type are out of spec).
Most commonly, problems like this are resolved by power cycling the device. If you haven't tried rebooting, definitely give that a shot. Did you say that the device does work with the kernel? When you load the kernel driver can you see /dev/sdX and dd to it without errors?
>> I just the ran the script in spdk. The kernel driver binds to the device which fails to be identified in SPDK. I have not done any tests with it, so I do not know whether the failing device works with kernel. I tried rebooting the system, it did not make any difference, let me try removing and inserting it physically.
scripts/setup.sh reset
Thanks,
Ben
On Tue, 2017-11-14 at 21:07 +0000, Luse, Paul E wrote:
> So in your output the highlighted lines below show one extra state
> that your SSD goes through during init vs the SSD I have, looks like
> NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1 is the state which is also
> state 1 in the enum which matches the error listed further below in
> red. Having not spent much time debugging this driver’s init I can’t
> say for sure if this (the additional output/steps between the extra
> state and the timeout) is expected or not.
>
> You can wait for Daniel or someone w/more experience to reply or if
> you want to experiment you can either step through the hello_world
> code with gdb putting a break point on nvme_ctrlr_process_init() with
> a good drive then with this one and get an idea of the correct flow vs
> what you are seeing you can maybe mes around with the timeout being
> set in the condition below which is seen on your drive but not on
> mine. Up to you, my suggestion is just a shot in the dark.
>
> if (csts.bits.rdy == 0) {
>
> SPDK_DEBUGLOG(SPDK_TRACE_NVME, "CC.EN = 1 && CSTS.RDY = 0 - waiting
> for reset to complete\n");
>
> nvme_ctrlr_set_state(ctrlr, NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1,
> ready_timeout_in_ms);
> return 0;
> }
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 1:14 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel and Paul for helping debug it.
> This is great, I know how to debug SPDK code J Here is the debug log.
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1380:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 0 - waiting for reset to complete
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4c000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4c000
> nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4b000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4b000
> nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown complete
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 12:06 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sorry, forgot to mention that you also need to compile with debug
> enabled (make CONFIG_DEBUG=y or ./configure --enable-debug);
> otherwise, the trace flag call doesn’t do anything.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 12:56 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sreeni,
>
> Are you sure you got those statements added (also need to add #include
> “spdk/log.h” as I didn’t see some expected output in your text.
> Here’s what it should look like:
>
> peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 72 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:06:00.0 on NUMA socket 0
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:06:00.0
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init
> (no
> timeout)
> nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
> nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size
> 131072
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9a000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
> nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk
> 256 blocks
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:06:00.0
> Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1
> namespaces.
> Namespace ID: 1 size: 400GB
> Initialization complete.
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> Hello world!
> nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout =
> 10000 ms
> nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in
> 1808 milliseconds
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 12:44 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Daniel,
>
> I added the debug flag you mentioned in the program and tested it.
> There is not much of any log which is helpful for the probe of the device.
> Please ignore the printf from func pci_probe_all_drivers, it was
> printf added by me.
> I have attached the file.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 10:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel for your comments. I think the SSD devices are of same
> model looking at lspci output.
> Let me confirm it with my lab administrator.
> I will add the debug code and give you the output.
>
> Sreeni
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 9:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> It looks to me like the device at 0000:ad:00.0 worked in both tests,
> and the device at 0000:ae:00.0 failed during identify and wasn’t used
> in perf due to the -r argument specifying only 0000:ad:00.0.
>
> Are these two devices the same model/firmware of NVMe controller, or
> is there some difference between them?
>
> It would also be useful to see the debug output from the NVMe library
> for the failing controller; you can enable this in hello_world.c by
> adding these two lines at the beginning of main():
>
> spdk_log_set_print_level(SPDK_LOG_DEBUG);
> spdk_log_set_trace_flag(“all”);
>
> Thanks,
> -- Daniel
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 6:46 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Well, maybe not. Could very well be an issue in the driver that’s
> only exposed via this hardware especially given that hello_world
> failed and perf did not.
>
> Sreeni, do you have any other data points that you can share? Like,
> for example, have you tried multiple SSDs of that same model or any other model?
> Does hello consistently fail and perf consistently pass? I haven’t
> done a ton of debug with either but with a little more info, if
> someone else doesn’t jump in with the answer, I can pretty easily help
> compare the init sequence in a few of these and maybe start to narrow down what’s different.
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu,
> WenzhongX
> Sent: Tuesday, November 14, 2017 2:13 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Sreeni,
>
> Verify with newest spdk source code with intel P3700 ,run helloword
> everything is ok:
> root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world#
> ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 36 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:81:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:81:00.0
> EAL: PCI device 0000:84:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:84:00.0
> EAL: PCI device 0000:85:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:85:00.0
> Attached to 0000:85:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:81:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:84:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Initialization complete.
> Hello world!
> Hello world!
> Hello world!
>
> Seem it’s your hardware cause this issues.
>
> Thanks
> Wenzhong
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 2:36 PM
> To: spdk(a)lists.01.org
> Subject: [SPDK] Helloworld program fails during device initialization.
>
> I was testing the helloworld program on ubuntu with NVMe SSD and the
> probe() is failing. Please let me know if this is a known problem.
> Let me know if there is workaround.
> The perf program did not have any problem.
> I have attached the output for both tests.
> lspci| grep Non
> ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01)
> ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01) uname -a Linux sjca5radevct13
> 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10
> 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 1. helloworld output:
> ---------------------------
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
>
> 2. perf output:
> sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
> ---------------
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
> EAL: Detected 32 lcore(s)
> EAL: No free hugepages reported in hugepages-1048576kB Attached to
> NVMe Controller at 0000:ad:00.0 [144d:a802]
> Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
> Initialization complete. Launching workers.
> Starting thread on core 0
> ========================================================
>
>
> Latency(us)
> Device Information : IOPS
> MB/s Average min max
> SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50
> 53.80 72.59 60.20 243.82
> ========================================================
> Total : 13773.50
> 53.80 72.59 60.20 243.82
>
> Thanks for your help,
> Sreeni
> _______________________________________________
> SPDK mailing list
> SPDK(a)lists.01.org
> https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [SPDK] Helloworld program fails during device initialization.
@ 2017-11-15 0:50 Luse, Paul E
0 siblings, 0 replies; 21+ messages in thread
From: Luse, Paul E @ 2017-11-15 0:50 UTC (permalink / raw)
To: spdk
[-- Attachment #1: Type: text/plain, Size: 25982 bytes --]
Nice! "Uhhh, try unplugging it and then plugging it back in"... I'm gonna remember that one :)
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 4:49 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
It looks like Ben's investigation is correct. It looks like physically removing and inserting the "ad" device fixed the device.
lspci| grep Non
89:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a804 [- new device]
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01) - [device was reset] The initialization has completed successfully for both devices.
I do not see any errors. The program is working.
Thank you Ben, Daniel and Paul for spending time on this issue.
Test log.
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 30000 ms)
nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 30000 ms)
nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 && CSTS.RDY = 0
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to enable controller by writing CC.EN = 1 (timeout 20000 ms)
nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait for CSTS.RDY = 1 (timeout 20000 ms)
nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff4c000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0f4c000
nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff4b000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0f4b000
nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout) Attached to 0000:ad:00.0
Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1 namespaces.
Namespace ID: 1 size: 480GB
nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 && CSTS.RDY = 1 - controller is ready
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff4b000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0f4b000
nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size 2072576
nvme_ctrlr.c: 717:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size 262144
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103ff4a000 len:4096
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0f4a000
nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*: Controller KAS is 0 - not enabling Keep Alive
nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host identifier
nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify host ID - not sending Set Features - Host ID
nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready (no timeout) Attached to 0000:89:00.0
Using controller SAMSUNG MZQLW3T8HMLP (S338NXAH300063 ) with 1 namespaces.
Namespace ID: 1 size: 17GB
Initialization complete.
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe23000 len:512
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0e23000
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe23000 len:512
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0e23000 Hello world!
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe23000 len:512
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0e23000
nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0 virt_addr:0x103fe23000 len:512
nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x1fc0e23000 Hello world!
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul E
Sent: Tuesday, November 14, 2017 2:47 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Great, looking forward to hearing how it goes
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 3:26 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Hi Ben,
Here is the devices in my system. It looks like the failing device does not get identified in Linux kernel. I have not got a chance to run the test with the working device removing the faulty device, will update you on that test.
ls -l /dev/nvme*
crw------- 1 root root 244, 0 Nov 14 14:14 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Nov 14 14:19 /dev/nvme0n1 sreeni.busam(a)sjca5radevct13:~/sreeni_spdk1/spdk$
sreeni.busam(a)sjca5radevct13:~/sreeni_spdk1/spdk$ lspci| grep Non
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01) The dd works on the device which has address "ad".
sudo dd if=/dev/zero of=/dev/nvme0n1 bs=64K count=16K
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.51357 s, 709 MB/s
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni (Sreenivasa) Busam (Stellus)
Sent: Tuesday, November 14, 2017 1:56 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Helloworld program fails during device initialization.
Yes, I can do it Ben. Let me repeat the test and send you the log.
The second device is not working. For your 2nd question, please see comment inline.
ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01) - hello_world fails
ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd - Looks like identify fails per Daniel comment.
-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Walker, Benjamin
Sent: Tuesday, November 14, 2017 1:44 PM
To: spdk(a)lists.01.org
Subject: Re: [SPDK] Helloworld program fails during device initialization.
SPDK performs parallel initialization of all devices for performance reasons, so the log below is both of the initialization paths of your devices muxed together. The part Paul highlighted is for your non-functioning device, while all of the other log messages are for your functioning device, I think. It would be easy to tell if you unplugged the working one and only left the one that wasn't working - can you try that and send the new log to confirm?
When SPDK first reads the registers for your non-functioning SSD to figure out what state it is in, it sees that the CC.EN bit is set to 1, but the CSTS.RDY bit is 0. That means the device is performing some internal operation - i.e.
it's already in the process of enabling itself. It is invalid for SPDK to write any registers until CSTS.RDY flips to 1 in this case. However, it never does and the code just times out. Unless I'm interpreting something incorrectly, I believe your device to be out of spec (probably because that particular device is not functioning, not because all models of that type are out of spec).
Most commonly, problems like this are resolved by power cycling the device. If you haven't tried rebooting, definitely give that a shot. Did you say that the device does work with the kernel? When you load the kernel driver can you see /dev/sdX and dd to it without errors?
>> I just the ran the script in spdk. The kernel driver binds to the device which fails to be identified in SPDK. I have not done any tests with it, so I do not know whether the failing device works with kernel. I tried rebooting the system, it did not make any difference, let me try removing and inserting it physically.
scripts/setup.sh reset
Thanks,
Ben
On Tue, 2017-11-14 at 21:07 +0000, Luse, Paul E wrote:
> So in your output the highlighted lines below show one extra state
> that your SSD goes through during init vs the SSD I have, looks like
> NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1 is the state which is also
> state 1 in the enum which matches the error listed further below in
> red. Having not spent much time debugging this driver’s init I can’t
> say for sure if this (the additional output/steps between the extra
> state and the timeout) is expected or not.
>
> You can wait for Daniel or someone w/more experience to reply or if
> you want to experiment you can either step through the hello_world
> code with gdb putting a break point on nvme_ctrlr_process_init() with
> a good drive then with this one and get an idea of the correct flow vs
> what you are seeing you can maybe mes around with the timeout being
> set in the condition below which is seen on your drive but not on
> mine. Up to you, my suggestion is just a shot in the dark.
>
> if (csts.bits.rdy == 0) {
>
> SPDK_DEBUGLOG(SPDK_TRACE_NVME, "CC.EN = 1 && CSTS.RDY = 0 - waiting
> for reset to complete\n");
>
> nvme_ctrlr_set_state(ctrlr, NVME_CTRLR_STATE_DISABLE_WAIT_FOR_READY_1,
> ready_timeout_in_ms);
> return 0;
> }
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 1:14 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel and Paul for helping debug it.
> This is great, I know how to debug SPDK code J Here is the debug log.
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1386:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 30000 ms)
> nvme_ctrlr.c:1372:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1380:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 0 - waiting for reset to complete
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1432:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1444:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 617:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 30000 ms)
> nvme_ctrlr.c:1451:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4c000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4c000
> nvme_ctrlr.c: 713:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_pcie.c:1521:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x100ff4b000 len:4096
> nvme_pcie.c:1548:nvme_pcie_prp_list_append: *INFO*: prp1 =
> 0x3fb0d4b000
> nvme_ctrlr.c: 813:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 883:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 899:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 613:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown complete
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 12:06 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sorry, forgot to mention that you also need to compile with debug
> enabled (make CONFIG_DEBUG=y or ./configure --enable-debug);
> otherwise, the trace flag call doesn’t do anything.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 12:56 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Sreeni,
>
> Are you sure you got those statements added (also need to add #include
> “spdk/log.h” as I didn’t see some expected output in your text.
> Here’s what it should look like:
>
> peluse(a)pels2-64:~/spdk/examples/nvme/hello_world$ sudo ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 72 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:06:00.0 on NUMA socket 0
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:06:00.0
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to init
> (no
> timeout)
> nvme_ctrlr.c:1382:nvme_ctrlr_process_init: *INFO*: CC.EN = 1
> nvme_ctrlr.c:1396:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> disable and wait for CSTS.RDY = 0 (timeout 20000 ms)
> nvme_ctrlr.c:1442:nvme_ctrlr_process_init: *INFO*: CC.EN = 0 &&
> CSTS.RDY = 0
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to
> enable controller by writing CC.EN = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1454:nvme_ctrlr_process_init: *INFO*: Setting CC.EN = 1
> nvme_ctrlr.c: 627:nvme_ctrlr_set_state: *INFO*: setting state to wait
> for CSTS.RDY = 1 (timeout 20000 ms)
> nvme_ctrlr.c:1461:nvme_ctrlr_process_init: *INFO*: CC.EN = 1 &&
> CSTS.RDY = 1 - controller is ready
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 723:nvme_ctrlr_identify: *INFO*: transport max_xfer_size
> 2072576
> nvme_ctrlr.c: 727:nvme_ctrlr_identify: *INFO*: MDTS max_xfer_size
> 131072
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9a000 len:4096
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179a000
> nvme_ns.c: 98:nvme_ns_identify_update: *INFO*: ns 1 stripe size quirk
> 256 blocks
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103ff9b000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x45179b000
> nvme_ctrlr.c: 823:nvme_ctrlr_set_keep_alive_timeout: *INFO*:
> Controller KAS is
> 0 - not enabling Keep Alive
> nvme_ctrlr.c: 893:nvme_ctrlr_set_host_id: *INFO*: Using 64-bit host
> identifier
> nvme_ctrlr.c: 909:nvme_ctrlr_set_host_id: *INFO*: User did not specify
> host ID
> - not sending Set Features - Host ID
> nvme_ctrlr.c: 623:nvme_ctrlr_set_state: *INFO*: setting state to ready
> (no
> timeout)
> Attached to 0000:06:00.0
> Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1
> namespaces.
> Namespace ID: 1 size: 400GB
> Initialization complete.
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> nvme_pcie.c:1541:nvme_pcie_prp_list_append: *INFO*: prp_index:0
> virt_addr:0x103fe73000 len:512
> nvme_pcie.c:1568:nvme_pcie_prp_list_append: *INFO*: prp1 = 0x451673000
> Hello world!
> nvme_ctrlr.c: 512:nvme_ctrlr_shutdown: *INFO*: RTD3E = 0 us
> nvme_ctrlr.c: 515:nvme_ctrlr_shutdown: *INFO*: shutdown timeout =
> 10000 ms
> nvme_ctrlr.c: 525:nvme_ctrlr_shutdown: *INFO*: shutdown complete in
> 1808 milliseconds
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 12:44 PM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Daniel,
>
> I added the debug flag you mentioned in the program and tested it.
> There is not much of any log which is helpful for the probe of the device.
> Please ignore the printf from func pci_probe_all_drivers, it was
> printf added by me.
> I have attached the file.
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 10:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Thanks Daniel for your comments. I think the SSD devices are of same
> model looking at lspci output.
> Let me confirm it with my lab administrator.
> I will add the debug code and give you the output.
>
> Sreeni
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Verkamp,
> Daniel
> Sent: Tuesday, November 14, 2017 9:43 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> It looks to me like the device at 0000:ad:00.0 worked in both tests,
> and the device at 0000:ae:00.0 failed during identify and wasn’t used
> in perf due to the -r argument specifying only 0000:ad:00.0.
>
> Are these two devices the same model/firmware of NVMe controller, or
> is there some difference between them?
>
> It would also be useful to see the debug output from the NVMe library
> for the failing controller; you can enable this in hello_world.c by
> adding these two lines at the beginning of main():
>
> spdk_log_set_print_level(SPDK_LOG_DEBUG);
> spdk_log_set_trace_flag(“all”);
>
> Thanks,
> -- Daniel
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Luse, Paul
> E
> Sent: Tuesday, November 14, 2017 6:46 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Well, maybe not. Could very well be an issue in the driver that’s
> only exposed via this hardware especially given that hello_world
> failed and perf did not.
>
> Sreeni, do you have any other data points that you can share? Like,
> for example, have you tried multiple SSDs of that same model or any other model?
> Does hello consistently fail and perf consistently pass? I haven’t
> done a ton of debug with either but with a little more info, if
> someone else doesn’t jump in with the answer, I can pretty easily help
> compare the init sequence in a few of these and maybe start to narrow down what’s different.
>
> Thx
> Paul
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Wu,
> WenzhongX
> Sent: Tuesday, November 14, 2017 2:13 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: Re: [SPDK] Helloworld program fails during device initialization.
>
> Hi Sreeni,
>
> Verify with newest spdk source code with intel P3700 ,run helloword
> everything is ok:
> root(a)waikiki5-target:~/workspace/spdk1102/examples/nvme/hello_world#
> ./hello_world
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-
> virtaddr=0x1000000000 --proc-type=auto ]
> EAL: Detected 36 lcore(s)
> EAL: Auto-detected process type: PRIMARY
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> Initializing NVMe Controllers
> EAL: PCI device 0000:81:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:81:00.0
> EAL: PCI device 0000:84:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:84:00.0
> EAL: PCI device 0000:85:00.0 on NUMA socket 1
> EAL: probe driver: 8086:953 spdk_nvme
> Attaching to 0000:85:00.0
> Attached to 0000:85:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT6401011B800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:81:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT5373002D800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Attached to 0000:84:00.0
> Using controller INTEL SSDPEDMD800G4 (PHFT53640003800CGN ) with 1
> namespaces.
> Namespace ID: 1 size: 800GB
> Initialization complete.
> Hello world!
> Hello world!
> Hello world!
>
> Seem it’s your hardware cause this issues.
>
> Thanks
> Wenzhong
>
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sreeni
> (Sreenivasa) Busam (Stellus)
> Sent: Tuesday, November 14, 2017 2:36 PM
> To: spdk(a)lists.01.org
> Subject: [SPDK] Helloworld program fails during device initialization.
>
> I was testing the helloworld program on ubuntu with NVMe SSD and the
> probe() is failing. Please let me know if this is a known problem.
> Let me know if there is workaround.
> The perf program did not have any problem.
> I have attached the output for both tests.
> lspci| grep Non
> ad:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01)
> ae:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> NVMe SSD Controller (rev 01) uname -a Linux sjca5radevct13
> 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10
> 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 1. helloworld output:
> ---------------------------
> Attached to 0000:ad:00.0
> Using controller SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with 1
> namespaces.
> Namespace ID: 1 size: 480GB
> nvme_ctrlr.c:1475:nvme_ctrlr_process_init: *ERROR*: Initialization
> timed out in state 1
> nvme_ctrlr.c: 523:nvme_ctrlr_shutdown: *ERROR*: did not shutdown
> within 5 seconds
> spdk_nvme_probe() failed
>
> 2. perf output:
> sudo ./perf -q 1 -s 4096 -t 10 -w read -r "trtype:PCIe traddr:0000:ad:00.0"
> ---------------
> Starting DPDK 17.08.0 initialization...
> [ DPDK EAL parameters: perf -c 0x1 --file-prefix=spdk_pid7647 ]
> EAL: Detected 32 lcore(s)
> EAL: No free hugepages reported in hugepages-1048576kB Attached to
> NVMe Controller at 0000:ad:00.0 [144d:a802]
> Associating SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) with lcore 0
> Initialization complete. Launching workers.
> Starting thread on core 0
> ========================================================
>
>
> Latency(us)
> Device Information : IOPS
> MB/s Average min max
> SAMSUNG MZQWV480HCGM (S1TRNYAFA00148 ) from core 0: 13773.50
> 53.80 72.59 60.20 243.82
> ========================================================
> Total : 13773.50
> 53.80 72.59 60.20 243.82
>
> Thanks for your help,
> Sreeni
> _______________________________________________
> SPDK mailing list
> SPDK(a)lists.01.org
> https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2017-11-15 0:50 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-14 21:44 [SPDK] Helloworld program fails during device initialization Walker, Benjamin
-- strict thread matches above, loose matches on Subject: below --
2017-11-15 0:50 Luse, Paul E
2017-11-14 23:48 Sreeni Busam
2017-11-14 22:46 Luse, Paul E
2017-11-14 22:25 Sreeni Busam
2017-11-14 21:55 Sreeni Busam
2017-11-14 21:38 Sreeni Busam
2017-11-14 21:07 Luse, Paul E
2017-11-14 20:14 Sreeni Busam
2017-11-14 20:06 Verkamp, Daniel
2017-11-14 20:05 Sreeni Busam
2017-11-14 19:55 Luse, Paul E
2017-11-14 19:49 Sreeni Busam
2017-11-14 19:43 Sreeni Busam
2017-11-14 18:42 Sreeni Busam
2017-11-14 18:39 Sreeni Busam
2017-11-14 18:09 Sreeni Busam
2017-11-14 17:42 Verkamp, Daniel
2017-11-14 13:45 Luse, Paul E
2017-11-14 9:12 Wu, WenzhongX
2017-11-14 6:36 Sreeni Busam
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.