Number of data and admin queues in use

* Number of data and admin queues in use
@ 2025-07-15  1:58 Thomas Glanzmann
  2025-07-15 14:39 ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Glanzmann @ 2025-07-15  1:58 UTC (permalink / raw)
  To: linux-nvme

Hello,
I have a Linux system hooked up over two dedicated links to a NetApp using
NVMe/TCP. I would like to find out how many data and admin queues there are and
what their queue depth is. How can I find that out?

So far, I found out:

(live) [~] nvme netapp ontapdevices /dev/nvme0n1
/dev/nvme0n1, Vserver svm1, Subsystem svm1_subsystem_553, Namespace Path rx3082_1, NSID 1, UUID 7f6be93b-60cb-11f0-866f-d039ead647e8, 1.10TB
(live) [~] nvme list-subsys /dev/nvme0n1
nvme-subsys0 - NQN=nqn.1992-08.com.netapp:sn.e0a0273a60b711f09deed039ead647e8:subsystem.svm1_subsystem_553
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:20f011e6-9ab8-584f-abb0-a260d2d685c4
\
 +- nvme0 tcp traddr=192.168.0.2,trsvcid=4420,src_addr=192.168.0.100 live optimized
 +- nvme1 tcp traddr=192.168.1.2,trsvcid=4420,src_addr=192.168.1.100 live optimized
(live) [~] nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1            824nlJYbYm5ZAAAAAAAB NetApp ONTAP Controller                  0x1          8.62  GB /   1.10  TB      4 KiB +  0 B   9.16.1
(live) [~] sudo nvme get-feature /dev/nvme0n1 --feature-id=7 -H
get-feature:0x07 (Number of Queues), Current value:0x00010001
        Number of IO Completion Queues Allocated (NCQA): 2
        Number of IO Submission Queues Allocated (NSQA): 2
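
The raw "Current value" printed by get-feature can also be decoded by hand. In the Number of Queues feature (Feature ID 0x07), bits 15:0 carry NSQA and bits 31:16 carry NCQA, and both are 0's based. The following is a minimal sketch of that arithmetic; the function name decode_nr_queues is hypothetical and not part of nvme-cli.

```shell
# Decode the 32-bit "Number of Queues" (Feature ID 0x07) value.
# Bits 31:16 hold NCQA and bits 15:0 hold NSQA; both are 0's based,
# so one is added to get the actual number of queues.
# decode_nr_queues is a hypothetical helper name, not an nvme-cli command.
decode_nr_queues() {
    value=$(( $1 ))
    ncqa=$(( ((value >> 16) & 0xffff) + 1 ))
    nsqa=$(( (value & 0xffff) + 1 ))
    echo "NCQA=$ncqa NSQA=$nsqa"
}
```

For example, `decode_nr_queues 0x00010001` corresponds to the two completion and two submission queues shown in the -H output above.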

I also have some local NVMe drives for which I would like to find out the same:

(infra) [~] sudo nvme get-feature /dev/nvme0n1 --feature-id=7 -H
get-feature:0x07 (Number of Queues), Current value:0x007f007f
        Number of IO Completion Queues Allocated (NCQA): 128
        Number of IO Submission Queues Allocated (NSQA): 128
(infra) [~] nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme1n1          /dev/ng1n1            50026B7685E14353     KINGSTON SKC3000D2048G                   1           2.05  TB /   2.05  TB    512   B +  0 B   EIFK31.6
/dev/nvme0n1          /dev/ng0n1            50026B7685E1439E     KINGSTON SKC3000D2048G                   1           2.05  TB /   2.05  TB    512   B +  0 B   EIFK31.6
(infra) [~] nvme list-subsys /dev/nvme0n1
nvme-subsys0 - NQN=nqn.2020-04.com.kingston:nvme:nvm-subsystem-sn-50026B7685E1439E
\
 +- nvme0 pcie 0000:01:00.0 live

I once heard that the Linux kernel allocates one queue per processor (core or
hyperthread). I can see that using /proc/interrupts, but only on physical
devices, not on NVMe/TCP systems.

Cheers,
        Thomas



* Re: Number of data and admin queues in use
  2025-07-15  1:58 Number of data and admin queues in use Thomas Glanzmann
@ 2025-07-15 14:39 ` Keith Busch
  2025-07-15 16:38   ` Thomas Glanzmann
  2025-07-15 17:22   ` Chaitanya Kulkarni
  0 siblings, 2 replies; 5+ messages in thread
From: Keith Busch @ 2025-07-15 14:39 UTC (permalink / raw)
  To: Thomas Glanzmann; +Cc: linux-nvme

On Tue, Jul 15, 2025 at 03:58:01AM +0200, Thomas Glanzmann wrote:
> I once heard that the Linux kernel allocates one queue per processor (core or
> hyperthread). I can see that using /proc/interrupts, but only on physical
> devices, not on NVMe/TCP systems.

For PCI, the driver automatically handles the queue and interrupt setup,
and cpu assignment.

For TCP (and all fabrics transports), you have to specify how many
connections you want to make ("nr_io_queues=X") when you're setting up
your initial fabrics connection.
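
As an illustration of passing the queue count at connect time, the sketch below only prints an nvme-cli connect command instead of running it. The long options --nr-io-queues and --queue-size are nvme-cli connect options; the helper name print_connect_cmd, the fixed port 4420, and all argument values are assumptions for illustration.

```shell
# Print (do not execute) an nvme-cli connect command that requests a given
# number of I/O queues and a given queue size.
# print_connect_cmd is a hypothetical helper; address, NQN, and counts are
# placeholders supplied by the caller. Port 4420 is assumed.
print_connect_cmd() {
    traddr=$1
    subnqn=$2
    nr_io_queues=$3
    queue_size=$4
    echo "nvme connect --transport=tcp --traddr=$traddr --trsvcid=4420" \
         "--nqn=$subnqn --nr-io-queues=$nr_io_queues --queue-size=$queue_size"
}
```

The target can still grant fewer queues or a smaller queue size than requested, so the result should be checked after connecting.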

If you want to see what you've ended up with, you can consult the
namespaces' sysfs entries:

How many IO queues are there:

  # ls -1 /sys/block/nvme0n1/mq/ | wc -l
  64

How large is each IO queue:

  # cat /sys/block/nvme0n1/queue/nr_requests
  1023
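
The two sysfs lookups above can be combined into one small helper. This is only a sketch: the function name blk_queue_info is hypothetical, and the optional second argument (an alternative sysfs root) exists purely so the function can be tried against a scratch directory.

```shell
# Report the number of blk-mq hardware queues and nr_requests for a block
# device. $1 is the device name (for example nvme0n1). $2 optionally
# overrides the sysfs root; that parameter is an assumption added for testing.
# blk_queue_info is a hypothetical helper name.
blk_queue_info() {
    dev=$1
    sysfs=${2:-/sys}
    # One directory per hardware queue lives under mq/.
    nr_queues=$(ls -1 "$sysfs/block/$dev/mq/" | wc -l | tr -d ' ')
    depth=$(cat "$sysfs/block/$dev/queue/nr_requests")
    echo "$dev: $nr_queues I/O queues, nr_requests=$depth"
}
```

On a multipath setup the per-path devices (named like nvme0c0n1) can be passed in the same way.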


* Re: Number of data and admin queues in use
  2025-07-15 14:39 ` Keith Busch
@ 2025-07-15 16:38   ` Thomas Glanzmann
  2025-07-15 17:22   ` Chaitanya Kulkarni
  1 sibling, 0 replies; 5+ messages in thread
From: Thomas Glanzmann @ 2025-07-15 16:38 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme

Hello Keith,

* Keith Busch <kbusch@kernel.org> [2025-07-15 16:39]:
> For PCI, the driver automatically handles the queue and interrupt setup,
> and cpu assignment.

> For TCP (and all fabrics transports), you have to specify how many
> connections you want to make ("nr_io_queues=X") when you're setting up
> your initial fabrics connection.

> If you want to see what you've ended up with, you can consult the
> namespaces' sysfs entries:

> How many IO queues are there:

>   # ls -1 /sys/block/nvme0n1/mq/ | wc -l
>   64

> How large is each IO queue:

>   # cat /sys/block/nvme0n1/queue/nr_requests
>   1023

Thank you for taking the time to answer me. I had been looking for an answer to
this for several years. I had stumbled on nr_requests before, but
/sys/block/nvme0n1/mq/ was new to me. The maximum that the NetApp appears to
support is:

na2501::*> vserver nvme show-host-priority
Node                  Protocol  Priority I/O Queue Count I/O Queue Depth
--------------------- --------- -------- --------------- ---------------
na2501-01             fc-nvme
                                regular                4              32
                                high                   6              32
                      nvme-tcp
                                regular                2             128
                                high                   4             128
na2501-02             fc-nvme
                                regular                4              32
                                high                   6              32
                      nvme-tcp
                                regular                2             128
                                high                   4             128

(live) [~] ls -1 /sys/block/nvme0c0n1/mq/ | wc -l
4
(live) [~] ls -1 /sys/block/nvme0c1n1/mq/ | wc -l
4
(live) [~] cat /sys/block/nvme0c0n1/queue/nr_requests
127

With ext4 and fio I get:

fio --ioengine=libaio --refill_buffers --filesize=4G --ramp_time=2s --numjobs=40 --direct=1 --verify=0 --randrepeat=0 --group_reporting  --directory /mnt --name=4khqd --blocksize=4k --iodepth=50 --readwrite=write
  write: IOPS=159k, BW=620MiB/s (651MB/s)(159GiB/261872msec); 0 zone resets
fio --ioengine=libaio --refill_buffers --filesize=4G --ramp_time=2s --numjobs=40 --direct=1 --verify=0 --randrepeat=0 --group_reporting  --directory /mnt --name=4khqd --blocksize=4k --iodepth=50 --readwrite=read
  read: IOPS=449k, BW=1752MiB/s (1838MB/s)(157GiB/91645msec)

fio --ioengine=libaio --refill_buffers --filesize=4G --ramp_time=2s --numjobs=40 --direct=1 --verify=0 --randrepeat=0 --group_reporting  --directory /mnt --name=1mhqd --blocksize=1m --iodepth=50 --readwrite=write
  write: IOPS=1965, BW=1970MiB/s (2066MB/s)(157GiB/81434msec); 0 zone resets
fio --ioengine=libaio --refill_buffers --filesize=4G --ramp_time=2s --numjobs=40 --direct=1 --verify=0 --randrepeat=0 --group_reporting  --directory /mnt --name=1mhqd --blocksize=1m --iodepth=50 --readwrite=read
  read: IOPS=4034, BW=4044MiB/s (4241MB/s)(153GiB/38682msec)
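
As a sanity check on the fio figures, the bandwidth should be roughly IOPS times block size. The sketch below does that multiplication in integer arithmetic; the function name iops_to_mib_per_s is hypothetical, and because fio prints rounded IOPS the result only approximates the reported bandwidth.

```shell
# Approximate bandwidth in MiB/s from an IOPS figure and a block size in
# bytes. Integer arithmetic, so the result is truncated.
# iops_to_mib_per_s is a hypothetical helper name.
iops_to_mib_per_s() {
    echo $(( $1 * $2 / 1048576 ))
}
```

For the 4k write run, `iops_to_mib_per_s 159000 4096` gives 621, which is in line with the 620MiB/s that fio reports.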

Using 'iostat -xm 2', I can see that it utilizes the queue depth by watching aqu-sz.

Cheers,
	Thomas



* Re: Number of data and admin queues in use
  2025-07-15 14:39 ` Keith Busch
  2025-07-15 16:38   ` Thomas Glanzmann
@ 2025-07-15 17:22   ` Chaitanya Kulkarni
  2025-07-15 18:05     ` Thomas Glanzmann
  1 sibling, 1 reply; 5+ messages in thread
From: Chaitanya Kulkarni @ 2025-07-15 17:22 UTC (permalink / raw)
  To: Keith Busch, Thomas Glanzmann; +Cc: linux-nvme@lists.infradead.org

On 7/15/25 07:39, Keith Busch wrote:
> On Tue, Jul 15, 2025 at 03:58:01AM +0200, Thomas Glanzmann wrote:
>> I once heard that the Linux kernel allocates one queue per processor (core or
>> hyperthread). I can see that using /proc/interrupts, but only on physical
>> devices, not on NVMe/TCP systems.
> For PCI, the driver automatically handles the queue and interrupt setup,
> and cpu assignment.
>
> For TCP (and all fabrics transports), you have to specify how many
> connections you want to make ("nr_io_queues=X") when you're setting up
> your initial fabrics connection.
>
> If you want to see what you've ended up with, you can consult the
> namespaces' sysfs entries:
>
> How many IO queues are there:
>
>    # ls -1 /sys/block/nvme0n1/mq/ | wc -l
>    64
>
> How large is each IO queue:
>
>    # cat /sys/block/nvme0n1/queue/nr_requests
>    1023
>

For the block layer queue allocation, which happens when you issue the connect
command from host to target (this creates the controller and populates the
block and char devices), see below [1].

From what I can see, you are getting the number of queues for both the TCP and
PCIe NVMe controllers; what is your question?

Another way to dig into controller-side fields such as the queue depth is to
read the CAP register; see this from the spec:

Figure 36: Offset 0h: CAP – Controller Capabilities :-

"Maximum Queue Entries Supported (MQES): This field indicates the maximum
individual queue size that the controller supports. For NVMe over PCIe
implementations, this value applies to the I/O Submission Queues and I/O
Completion Queues that the host creates. For NVMe over Fabrics implementations,
this value applies to only the I/O Submission Queues that the host creates.
This is a 0’s based value. The minimum value is 1h, indicating two entries."
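
To turn a raw CAP value into a queue size, take bits 15:0 (MQES) and add one, since the field is 0's based. This is a minimal sketch of that step; the function name cap_max_queue_entries is hypothetical, and the CAP value is assumed to have been obtained separately, for example from the register dump of nvme show-regs.

```shell
# Extract MQES (bits 15:0 of the 64-bit CAP register) and convert the
# 0's based value into a number of queue entries.
# cap_max_queue_entries is a hypothetical helper name; $1 is the raw CAP
# value in hex or decimal.
cap_max_queue_entries() {
    cap=$(( $1 ))
    echo $(( (cap & 0xffff) + 1 ))
}
```

An MQES of 1h, the minimum the spec allows, comes out as two entries.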

-ck

[1]

For fabrics transports (TCP), the number of queues is calculated using
nvmf_nr_io_queues() to make sure we don't create more read/default
queues than CPUs available; the same check also applies to write
and poll queues.
nvme_set_queue_count() adjusts the queue count based on controller
capabilities, which can also clamp the queue count.

nvmf_set_io_queues() sets the queue count for each queue type (read,
default, poll), then nvmf_map_queues() maps them into the blk-mq
structure so that each of default/read/poll gets attached to a
blk_mq context.

On my machine I have 48 CPUs, so when I create a TCP target I get :-

[ 1196.058440] nvme nvme1: creating 48 I/O queues.
[ 1196.062370] nvme nvme1: mapped 48/0/0 default/read/poll queues.

You should be able to see this in the debug messages that come from the
respective queue allocation helpers, which also include the controller
device name "nvme1" :-

nvme_tcp_alloc_io_queues()
nvmf_map_queues()
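
The "creating N I/O queues" messages can be pulled out of the kernel log with a small filter. The sketch below reads log text on stdin and prints the controller name and queue count per line; the function name io_queues_from_log is hypothetical, and the pattern assumes the message format shown above.

```shell
# Read kernel log text on stdin and print "<controller> <count>" for each
# "creating N I/O queues" message. Other lines are ignored.
# io_queues_from_log is a hypothetical helper name; the pattern assumes
# the log format quoted in this thread.
io_queues_from_log() {
    sed -n 's/.*nvme \(nvme[0-9]*\): creating \([0-9]*\) I\/O queues.*/\1 \2/p'
}
```

Typical usage would be `dmesg | io_queues_from_log`, which may require root depending on the system's dmesg restrictions.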

Hope this helps.


* Re: Number of data and admin queues in use
  2025-07-15 17:22   ` Chaitanya Kulkarni
@ 2025-07-15 18:05     ` Thomas Glanzmann
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Glanzmann @ 2025-07-15 18:05 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: Keith Busch, linux-nvme@lists.infradead.org

Hello Chaitanya,

> From what I can see you are getting number of queues for both tcp and
> pcie NVMe controller, what is your question?

My question was how to see the number and size of NVMe IO queues but
Keith already answered that. I just thanked him and added some stats
from the NetApp.

> Another way to dig into controller-side fields such as the queue depth is to
> read the CAP register; see this from the spec:

> Figure 36: Offset 0h: CAP – Controller Capabilities :-

> "Maximum Queue Entries Supported (MQES): This field indicates the
> maximum individual queue size that the controller supports. For NVMe
> over PCIe implementations, this value applies to the I/O Submission
> Queues and I/O Completion Queues that the host creates. For NVMe over
> Fabrics implementations, this value applies to only the I/O Submission
> Queues that the host creates. This is a 0’s based value. The minimum
> value is 1h, indicating two entries."

> -ck

> [1]

> For fabrics transports (TCP), the number of queues is calculated using
> nvmf_nr_io_queues() to make sure we don't create more read/default
> queues than CPUs available; the same check also applies to write
> and poll queues.
> nvme_set_queue_count() adjusts the queue count based on controller
> capabilities, which can also clamp the queue count.

> nvmf_set_io_queues() sets the queue count for each queue type (read,
> default, poll), then nvmf_map_queues() maps them into the blk-mq
> structure so that each of default/read/poll gets attached to a
> blk_mq context.

> On my machine I have 48 CPUs, so when I create a TCP target I get :-

> [ 1196.058440] nvme nvme1: creating 48 I/O queues.
> [ 1196.062370] nvme nvme1: mapped 48/0/0 default/read/poll queues.

> You should be able to see this in the debug messages that come from the
> respective queue allocation helpers, which also include the controller
> device name "nvme1" :-

> nvme_tcp_alloc_io_queues()
> nvmf_map_queues()

Tomorrow I'll set up an NVMe/TCP target on Linux and do some benchmarking. I'll
also hook up the NetApp to 64 Gbit/s FC and do some benchmarking with FC and
FC/NVMe.
Thank you for the additional insight. I had never paid attention to this, but
I have now:

[ 3730.402432] nvme nvme0: queue_size 128 > ctrl sqsize 32, clamping down
[ 3795.115084] subsysnqn nqn.1992-08.com.netapp:sn.e0a0273a60b711f09deed039ead647e8:subsystem.svm1_subsystem_553 iopolicy changed from numa to queue-depth
[ 3795.154560] nvme nvme0: creating 2 I/O queues.
[ 3795.156535] nvme nvme0: mapped 2/0/0 default/read/poll queues.
[ 3801.004641] nvme nvme1: creating 2 I/O queues.
[ 3801.006541] nvme nvme1: mapped 2/0/0 default/read/poll queues.

Then I bumped the number of queues and the queue size on the NetApp and got:

[98114.846603] nvme nvme0: queue_size 128 > ctrl sqsize 32, clamping down
[98727.596158] subsysnqn nqn.1992-08.com.netapp:sn.e0a0273a60b711f09deed039ead647e8:subsystem.svm1_subsystem_553 iopolicy changed from numa to queue-depth
[98727.635617] nvme nvme0: creating 4 I/O queues.
[98727.638218] nvme nvme0: mapped 4/0/0 default/read/poll queues.
[98741.459565] nvme nvme1: creating 4 I/O queues.
[98741.462227] nvme nvme1: mapped 4/0/0 default/read/poll queues.

Cheers,
	Thomas

