public inbox for linux-block@vger.kernel.org
* Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0
       [not found]         ` <20180309172445.GC14765@localhost.localdomain>
@ 2018-03-12  9:09           ` Ming Lei
  2018-10-08  5:05             ` nvme-pci: number of queues off by one Prasun Ratn
  0 siblings, 1 reply; 6+ messages in thread
From: Ming Lei @ 2018-03-12  9:09 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, sagi, linux-kernel, linux-nvme, axboe,
	Jianchao Wang, linux-block, Thomas Gleixner

On Fri, Mar 09, 2018 at 10:24:45AM -0700, Keith Busch wrote:
> On Thu, Mar 08, 2018 at 08:42:20AM +0100, Christoph Hellwig wrote:
> > 
> > So I suspect we'll need to go with a patch like this, just with a way
> > better changelog.
> 
> I have to agree this is required for that use case. I'll run some
> quick tests and propose an alternate changelog.
> 
> Longer term, the current way we're including offline present cpus either
> (a) has the driver allocate resources it can't use or (b) spreads the
> ones it can use thinner than they need to be. Why don't we rerun the
> irq spread under a hot cpu notifier for only online CPUs?

4b855ad371 ("blk-mq: Create hctx for each present CPU") removed the
handling of mapping changes via the hot cpu notifier. Not only was the
code cleaned up, it also fixed a very complicated queue dependency issue:

- loop/dm-rq queue depends on underlying queue
- for NVMe, IO queue depends on admin queue

If freezing queues can be avoided in the CPU notifier, it should be
fine to rerun the spread there; otherwise it needs to be avoided.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 6+ messages in thread

* nvme-pci: number of queues off by one
  2018-03-12  9:09           ` [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0 Ming Lei
@ 2018-10-08  5:05             ` Prasun Ratn
  2018-10-08  5:59               ` Dongli Zhang
  0 siblings, 1 reply; 6+ messages in thread
From: Prasun Ratn @ 2018-10-08  5:05 UTC (permalink / raw)
  To: ming.lei
  Cc: keith.busch, hch, sagi, linux-nvme, axboe, jianchao.w.wang,
	linux-block, tglx

Hi

I have an NVMe SSD that has 8 hw queues and on older kernels I see all
8 show up. However on a recent kernel (I tried 4.18), I only see 7. Is
this a known issue?

$ uname -r
4.14.1-1.el7.elrepo.x86_64

$ ls /sys/block/nvme*n1/mq/*/cpu_list
/sys/block/nvme0n1/mq/0/cpu_list
/sys/block/nvme0n1/mq/1/cpu_list
/sys/block/nvme0n1/mq/2/cpu_list
/sys/block/nvme0n1/mq/3/cpu_list
/sys/block/nvme0n1/mq/4/cpu_list
/sys/block/nvme0n1/mq/5/cpu_list
/sys/block/nvme0n1/mq/6/cpu_list
/sys/block/nvme0n1/mq/7/cpu_list
/sys/block/nvme1n1/mq/0/cpu_list
/sys/block/nvme1n1/mq/1/cpu_list
/sys/block/nvme1n1/mq/2/cpu_list
/sys/block/nvme1n1/mq/3/cpu_list
/sys/block/nvme1n1/mq/4/cpu_list
/sys/block/nvme1n1/mq/5/cpu_list
/sys/block/nvme1n1/mq/6/cpu_list
/sys/block/nvme1n1/mq/7/cpu_list
/sys/block/nvme2n1/mq/0/cpu_list
/sys/block/nvme2n1/mq/1/cpu_list
/sys/block/nvme2n1/mq/2/cpu_list
/sys/block/nvme2n1/mq/3/cpu_list
/sys/block/nvme2n1/mq/4/cpu_list
/sys/block/nvme2n1/mq/5/cpu_list
/sys/block/nvme2n1/mq/6/cpu_list
/sys/block/nvme2n1/mq/7/cpu_list
/sys/block/nvme3n1/mq/0/cpu_list
/sys/block/nvme3n1/mq/1/cpu_list
/sys/block/nvme3n1/mq/2/cpu_list
/sys/block/nvme3n1/mq/3/cpu_list
/sys/block/nvme3n1/mq/4/cpu_list
/sys/block/nvme3n1/mq/5/cpu_list
/sys/block/nvme3n1/mq/6/cpu_list
/sys/block/nvme3n1/mq/7/cpu_list


$ uname -r
4.18.10-1.el7.elrepo.x86_64

$ ls /sys/block/nvme*n1/mq/*/cpu_list
/sys/block/nvme0n1/mq/0/cpu_list
/sys/block/nvme0n1/mq/1/cpu_list
/sys/block/nvme0n1/mq/2/cpu_list
/sys/block/nvme0n1/mq/3/cpu_list
/sys/block/nvme0n1/mq/4/cpu_list
/sys/block/nvme0n1/mq/5/cpu_list
/sys/block/nvme0n1/mq/6/cpu_list
/sys/block/nvme1n1/mq/0/cpu_list
/sys/block/nvme1n1/mq/1/cpu_list
/sys/block/nvme1n1/mq/2/cpu_list
/sys/block/nvme1n1/mq/3/cpu_list
/sys/block/nvme1n1/mq/4/cpu_list
/sys/block/nvme1n1/mq/5/cpu_list
/sys/block/nvme1n1/mq/6/cpu_list
/sys/block/nvme2n1/mq/0/cpu_list
/sys/block/nvme2n1/mq/1/cpu_list
/sys/block/nvme2n1/mq/2/cpu_list
/sys/block/nvme2n1/mq/3/cpu_list
/sys/block/nvme2n1/mq/4/cpu_list
/sys/block/nvme2n1/mq/5/cpu_list
/sys/block/nvme2n1/mq/6/cpu_list
/sys/block/nvme3n1/mq/0/cpu_list
/sys/block/nvme3n1/mq/1/cpu_list
/sys/block/nvme3n1/mq/2/cpu_list
/sys/block/nvme3n1/mq/3/cpu_list
/sys/block/nvme3n1/mq/4/cpu_list
/sys/block/nvme3n1/mq/5/cpu_list
/sys/block/nvme3n1/mq/6/cpu_list


* Re: nvme-pci: number of queues off by one
  2018-10-08  5:05             ` nvme-pci: number of queues off by one Prasun Ratn
@ 2018-10-08  5:59               ` Dongli Zhang
  2018-10-08  6:58                 ` Dongli Zhang
  2018-10-08 10:19                 ` Ming Lei
  0 siblings, 2 replies; 6+ messages in thread
From: Dongli Zhang @ 2018-10-08  5:59 UTC (permalink / raw)
  To: Prasun Ratn, ming.lei
  Cc: keith.busch, hch, sagi, linux-nvme, axboe, jianchao.w.wang,
	linux-block, tglx

I can reproduce with qemu:

# ls /sys/block/nvme*n1/mq/*/cpu_list
/sys/block/nvme0n1/mq/0/cpu_list
/sys/block/nvme0n1/mq/1/cpu_list
/sys/block/nvme0n1/mq/2/cpu_list
/sys/block/nvme0n1/mq/3/cpu_list
/sys/block/nvme0n1/mq/4/cpu_list
/sys/block/nvme0n1/mq/5/cpu_list
/sys/block/nvme0n1/mq/6/cpu_list

Here is the qemu cmdline emulating an 8-queue nvme device while the VM has 12 cpus:

# qemu-system-x86_64 -m 4096 -smp 12 \
	-kernel /path-to-kernel/linux-4.18.10/arch/x86_64/boot/bzImage \
	-hda /path-to-img/ubuntu1804.qcow2  \
	-append "root=/dev/sda1 init=/sbin/init text" -enable-kvm \
	-net nic -net user,hostfwd=tcp::5022-:22 \
	-device nvme,drive=nvme1,serial=deadbeaf1,num_queues=8 \
	-drive file=/path-to-img/nvme.disk,if=none,id=nvme1

Dongli Zhang


On 10/08/2018 01:05 PM, Prasun Ratn wrote:
> Hi
> 
> I have an NVMe SSD that has 8 hw queues and on older kernels I see all
> 8 show up. However on a recent kernel (I tried 4.18), I only see 7. Is
> this a known issue?
> 
> [...]


* Re: nvme-pci: number of queues off by one
  2018-10-08  5:59               ` Dongli Zhang
@ 2018-10-08  6:58                 ` Dongli Zhang
  2018-10-08 14:54                   ` Keith Busch
  2018-10-08 10:19                 ` Ming Lei
  1 sibling, 1 reply; 6+ messages in thread
From: Dongli Zhang @ 2018-10-08  6:58 UTC (permalink / raw)
  To: Prasun Ratn, ming.lei, jianchao.w.wang
  Cc: keith.busch, hch, sagi, linux-nvme, axboe, linux-block, tglx

I got the same result when emulating nvme with qemu: the VM has 12 cpus, while
the num_queues of the nvme device is 8.

# uname -r
4.14.1
# ll /sys/block/nvme*n1/mq/*/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:30 /sys/block/nvme0n1/mq/0/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:30 /sys/block/nvme0n1/mq/1/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:30 /sys/block/nvme0n1/mq/2/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:30 /sys/block/nvme0n1/mq/3/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:30 /sys/block/nvme0n1/mq/4/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:30 /sys/block/nvme0n1/mq/5/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:30 /sys/block/nvme0n1/mq/6/cpu_list


# uname -r
4.18.10
# ll /sys/block/nvme*n1/mq/*/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:34 /sys/block/nvme0n1/mq/0/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:34 /sys/block/nvme0n1/mq/1/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:34 /sys/block/nvme0n1/mq/2/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:34 /sys/block/nvme0n1/mq/3/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:34 /sys/block/nvme0n1/mq/4/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:34 /sys/block/nvme0n1/mq/5/cpu_list
-r--r--r-- 1 root root 4096 Oct  8 14:34 /sys/block/nvme0n1/mq/6/cpu_list

From the qemu source code below, when n->num_queues is 8, the
NVME_NUMBER_OF_QUEUES handler returns 0x60006:

 719 static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 720 {
 721     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
 722     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
 723
 724     switch (dw10) {
 725     case NVME_VOLATILE_WRITE_CACHE:
 726         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
 727         break;
 728     case NVME_NUMBER_OF_QUEUES:
 729         trace_nvme_setfeat_numq((dw11 & 0xFFFF) + 1,
 730                                 ((dw11 >> 16) & 0xFFFF) + 1,
 731                                 n->num_queues - 1, n->num_queues - 1);
 732         req->cqe.result =
 733             cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
----> returns 0x60006 when num_queues is 8.


Finally, nr_io_queues is set to 6+1=7 in nvme_set_queue_count() in the VM kernel.

I am not sure how to interpret this in NVMe terms.

Dongli Zhang

On 10/08/2018 01:59 PM, Dongli Zhang wrote:
> I can reproduce with qemu:
> 
> [...]


* Re: nvme-pci: number of queues off by one
  2018-10-08  5:59               ` Dongli Zhang
  2018-10-08  6:58                 ` Dongli Zhang
@ 2018-10-08 10:19                 ` Ming Lei
  1 sibling, 0 replies; 6+ messages in thread
From: Ming Lei @ 2018-10-08 10:19 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: Prasun Ratn, axboe, sagi, linux-nvme, keith.busch, linux-block,
	jianchao.w.wang, tglx, hch

On Mon, Oct 08, 2018 at 01:59:05PM +0800, Dongli Zhang wrote:
> I can reproduce with qemu:
> 
> # ls /sys/block/nvme*n1/mq/*/cpu_list
> [...]
> 
> Here is the qemu cmdline emulating 8-queue nvme while the VM has 12 cpu:
> 
> # qemu-system-x86_64 -m 4096 -smp 12 \
> 	-kernel /path-to-kernel/linux-4.18.10/arch/x86_64/boot/bzImage \
> 	-hda /path-to-img/ubuntu1804.qcow2  \
> 	-append "root=/dev/sda1 init=/sbin/init text" -enable-kvm \
> 	-net nic -net user,hostfwd=tcp::5022-:22 \
> 	-device nvme,drive=nvme1,serial=deadbeaf1,num_queues=8 \
> 	-drive file=/path-to-img/nvme.disk,if=none,id=nvme1

This 'issue' can be reproduced on v4.14 too.

Thanks,
Ming


* Re: nvme-pci: number of queues off by one
  2018-10-08  6:58                 ` Dongli Zhang
@ 2018-10-08 14:54                   ` Keith Busch
  0 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2018-10-08 14:54 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: Prasun Ratn, ming.lei, jianchao.w.wang, hch, sagi, linux-nvme,
	axboe, linux-block, tglx

On Mon, Oct 08, 2018 at 02:58:21PM +0800, Dongli Zhang wrote:
> I got the same result when emulating nvme with qemu: the VM has 12 cpu, while
> the num_queues of nvme is 8.
> 
> [...]
> 
> From the qemu source code below, when n->num_queues is 8, the
> NVME_NUMBER_OF_QUEUES handler returns 0x60006:
> 
>  719 static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  720 {
>  721     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>  722     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>  723
>  724     switch (dw10) {
>  725     case NVME_VOLATILE_WRITE_CACHE:
>  726         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
>  727         break;
>  728     case NVME_NUMBER_OF_QUEUES:
>  729         trace_nvme_setfeat_numq((dw11 & 0xFFFF) + 1,
>  730                                 ((dw11 >> 16) & 0xFFFF) + 1,
>  731                                 n->num_queues - 1, n->num_queues - 1);
>  732         req->cqe.result =
>  733             cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
> ----> returns 0x60006 when num_queues is 8.
> 
> 
> Finally, nr_io_queues is set to 6+1=7 in nvme_set_queue_count() in VM kernel.
> 
> I am not sure how to interpret this in NVMe terms.
> 
> Dongli Zhang
> 
> On 10/08/2018 01:59 PM, Dongli Zhang wrote:
> > I can reproduce with qemu:
> > 
> > # ls /sys/block/nvme*n1/mq/*/cpu_list
> > [...]
> > 
> > Here is the qemu cmdline emulating 8-queue nvme while the VM has 12 cpu:
> > 
> > # qemu-system-x86_64 -m 4096 -smp 12 \
> > 	-kernel /path-to-kernel/linux-4.18.10/arch/x86_64/boot/bzImage \
> > 	-hda /path-to-img/ubuntu1804.qcow2  \
> > 	-append "root=/dev/sda1 init=/sbin/init text" -enable-kvm \
> > 	-net nic -net user,hostfwd=tcp::5022-:22 \
> > 	-device nvme,drive=nvme1,serial=deadbeaf1,num_queues=8 \
> > 	-drive file=/path-to-img/nvme.disk,if=none,id=nvme1
> > 
> > Dongli Zhang

Qemu counts one of those queues as the admin queue.

> > On 10/08/2018 01:05 PM, Prasun Ratn wrote:
> >> Hi
> >>
> >> I have an NVMe SSD that has 8 hw queues and on older kernels I see all
> >> 8 show up. However on a recent kernel (I tried 4.18), I only see 7. Is
> >> this a known issue?

That probably means you only have 8 MSI-X vectors, one of which is
reserved for the admin queue. We used to share an I/O vector with the
admin queue; however, some people figured out that you can break your
controller that way with the Linux irq spread.


end of thread, other threads:[~2018-10-08 22:07 UTC | newest]

Thread overview: 6+ messages
-- links below jump to the message on this page --
     [not found] <1519832921-13915-1-git-send-email-jianchao.w.wang@oracle.com>
     [not found] ` <20180228164726.GB16536@lst.de>
     [not found]   ` <20180301150329.GB6795@ming.t460p>
     [not found]     ` <20180301161042.GA14799@localhost.localdomain>
     [not found]       ` <20180308074220.GC15748@lst.de>
     [not found]         ` <20180309172445.GC14765@localhost.localdomain>
2018-03-12  9:09           ` [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0 Ming Lei
2018-10-08  5:05             ` nvme-pci: number of queues off by one Prasun Ratn
2018-10-08  5:59               ` Dongli Zhang
2018-10-08  6:58                 ` Dongli Zhang
2018-10-08 14:54                   ` Keith Busch
2018-10-08 10:19                 ` Ming Lei
