From: Max Gurtovoy <maxg@mellanox.com>
To: Weiping Zhang <zwp10758@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>,
sagi@grimberg.me, Weiping Zhang <zhangweiping@didiglobal.com>,
linux-nvme@lists.infradead.org,
Christoph Hellwig <hch@infradead.org>,
Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH] nvme: align io queue count with allocted nvme_queue in nvme_probe
Date: Tue, 14 Apr 2020 15:59:12 +0300 [thread overview]
Message-ID: <df0fa3a0-7764-bc1f-711d-d264fc4f444f@mellanox.com> (raw)
In-Reply-To: <CAA70yB7bkeSwQVjJ5rjGS3HxZtkraUY1FX9ZMHXH8FO6a-8omw@mail.gmail.com>
On 4/13/2020 3:00 PM, Weiping Zhang wrote:
> On Mon, Apr 13, 2020 at 5:37 PM Max Gurtovoy <maxg@mellanox.com> wrote:
>>
>> On 4/13/2020 4:01 AM, Weiping Zhang wrote:
>>> On Sun, Apr 12, 2020 at 8:38 PM Max Gurtovoy <maxg@mellanox.com> wrote:
>>> Hi Max,
>>>
>>>> hi,
>>>>
>>>> how about the following minor update:
>>>>
>>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>>> index 4e79e41..46ab28b 100644
>>>> --- a/drivers/nvme/host/pci.c
>>>> +++ b/drivers/nvme/host/pci.c
>>>> @@ -89,6 +89,7 @@
>>>>   */
>>>>  struct nvme_dev {
>>>>  	struct nvme_queue *queues;
>>>> +	int nr_allocated_queue;
>>>>  	struct blk_mq_tag_set tagset;
>>>>  	struct blk_mq_tag_set admin_tagset;
>>>>  	u32 __iomem *dbs;
>>>> @@ -209,15 +210,15 @@ struct nvme_iod {
>>>>  	struct scatterlist *sg;
>>>>  };
>>>>
>>>> -static unsigned int max_io_queues(void)
>>>> +static unsigned int nr_dev_io_queues(struct nvme_dev *dev)
>>>>  {
>>>> -	return num_possible_cpus() + write_queues + poll_queues;
>>>> +	return dev->nr_allocated_queue - 1;
>>>>  }
>>>>
>>>>  static unsigned int max_queue_count(void)
>>>>  {
>>>>  	/* IO queues + admin queue */
>>>> -	return 1 + max_io_queues();
>>>> +	return 1 + num_possible_cpus() + write_queues + poll_queues;
>>>>  }
>>>>
>>>>  static inline unsigned int nvme_dbbuf_size(u32 stride)
>>>> @@ -2073,7 +2074,7 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
>>>>  	int result, nr_io_queues;
>>>>  	unsigned long size;
>>>>
>>>> -	nr_io_queues = max_io_queues();
>>>> +	nr_io_queues = nr_dev_io_queues(dev);
>>>>
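Note the hunks above don't show where nr_allocated_queue is set; the idea is
that nvme_probe() simply records the count it already passes to kcalloc_node().
Just a sketch (only nr_allocated_queue and max_queue_count() come from the
diff, the rest is written from memory, not the exact upstream code):

	unsigned int nr_queues = max_queue_count();	/* admin + IO queues */

	dev->queues = kcalloc_node(nr_queues, sizeof(struct nvme_queue),
				   GFP_KERNEL, dev_to_node(&pdev->dev));
	if (!dev->queues)
		goto free;

	dev->nr_allocated_queue = nr_queues;	/* new field from the hunk above */
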
>>> It may cause a problem when the user decreases the queue count for one
>>> of the tag set maps. For example, with 128 IO queues in total and 96 CPUs
>>> in the system:
>>> insmod nvme write_queues=32
>>> nvme_probe will allocate 129 queues (128 IO + 1 admin), so nr_allocated_queue = 129;
>>> then the user decreases the queue count:
>>> echo 2 > /sys/module/nvme/parameters/write_queues
>>> echo 1 > /sys/block/nvme0n1/device/reset_controller
>>> nvme_setup_io_queues should use
>>> 96 (num_possible_cpus) + 2 (write_queues), not 129 (nr_allocated_queue).
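In other words, the reset path would need to honour the current parameters but
clamp them to what nvme_probe() allocated; roughly (a sketch only, assuming the
nr_allocated_queue field from the diff above):

	/* Sketch: use the (possibly updated) module parameters, but never ask
	 * for more IO queues than nvme_probe() allocated nvme_queue slots for.
	 */
	nr_io_queues = min_t(unsigned int,
			     num_possible_cpus() + write_queues + poll_queues,
			     dev->nr_allocated_queue - 1);
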
>> Any change that you try to make (increase/decrease) will not take effect.
>>
>> If you want it to take effect, you need to make nvme_probe run again.
>>
>> I don't see value in a change that only keeps the code from crashing but
>> does not really affect the queue count.
>>
>> write_queues and poll_queues shouldn't be writable, IMO.
>>
> I think we can keep it writable; the use case is to set up as many IO
> queues as possible when loading the nvme module, and then change the
> queue count for each tag set map dynamically.
We can keep it writable, but I prefer not to change the controller's
initial queue count after a reset_controller operation.
So we can keep per-device dev->write_queues and dev->poll_queues counts
for that (see the sketch below).
You can use the writable params in case you aim to hotplug a new device
and want it to probe with fewer/more queues.
IMO this feature should somehow be configured using nvme-cli, as we do
with fabrics controllers, where we never change these values after the
initial connection.

Keith/Christoph,

what is the right approach in your opinion?
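A rough sketch of what keeping per-device copies could look like (the field
names below are illustrative, not from an existing patch): snapshot the module
parameters once in nvme_probe() and have nvme_setup_io_queues() read the
snapshot instead of the globals.

	/* Sketch: per-device copies taken at probe time, so a later write to
	 * /sys/module/nvme/parameters/{write,poll}_queues only affects newly
	 * probed controllers, not resets of existing ones.
	 */
	dev->nr_write_queues = write_queues;
	dev->nr_poll_queues = poll_queues;
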
>
>> Since nvme_dbbuf_dma_alloc/nvme_dbbuf_dma_free also call
>> max_queue_count(), which uses the writable module params,
>>
>> we can save these values locally or make the params read-only.
> It's not safe to use max_queue_count() for these two functions,
> and there is also a problem in nvme_dbbuf_dma_alloc:
>
> static int nvme_dbbuf_dma_alloc(struct nvme_dev *dev)
> {
> 	unsigned int mem_size = nvme_dbbuf_size(dev->db_stride);
>
> 	if (dev->dbbuf_dbs)
> 		return 0;
>
> it is not aware of whether the queue count has changed.
>
> But it can be fixed by:
> 	unsigned int mem_size = dev->nr_allocated_queue * 8 * dev->db_stride;
>
> Thanks
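
For the nvme_dbbuf sizing above, the suggested fix boils down to sizing the
doorbell buffers by the allocated queue count rather than by the current module
parameters, so that alloc and free always agree. A sketch, assuming the
nr_allocated_queue field and that the helper is changed to take the device:

static unsigned int nvme_dbbuf_size(struct nvme_dev *dev)
{
	/* Sketch: one SQ-tail and one CQ-head doorbell entry (2 * 4 bytes)
	 * per allocated queue, scaled by the controller's doorbell stride.
	 */
	return dev->nr_allocated_queue * 8 * dev->db_stride;
}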
Thread overview: 10+ messages
2020-04-10 9:57 [PATCH] nvme: align io queue count with allocted nvme_queue in nvme_probe Weiping Zhang
2020-04-12 12:38 ` Max Gurtovoy
2020-04-13 1:01 ` Weiping Zhang
2020-04-13 9:37 ` Max Gurtovoy
2020-04-13 12:00 ` Weiping Zhang
2020-04-14 12:59 ` Max Gurtovoy [this message]
2020-04-22 8:37 ` Christoph Hellwig
2020-04-22 9:24 ` weiping zhang
2020-04-22 16:57 ` Christoph Hellwig
2020-04-23 7:59 ` Weiping Zhang