From: John Meneghini <jmeneghi@redhat.com>
To: Keith Busch <kbusch@kernel.org>
Cc: hch@lst.de, sagi@grimberg.me, emilne@redhat.com,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
jrani@purestorage.com, randyj@purestorage.com, hare@kernel.org
Subject: Re: [PATCH v3 1/1] nvme: multipath: Implemented new iopolicy "queue-depth"
Date: Tue, 21 May 2024 10:20:52 -0400 [thread overview]
Message-ID: <19fa255b-a29c-42a9-b9aa-48422e6000b3@redhat.com> (raw)
In-Reply-To: <Zku3fBuauZQX6bEO@kbusch-mbp>
On 5/20/24 16:50, Keith Busch wrote:
>> static LIST_HEAD(nvme_subsystems);
>> -static DEFINE_MUTEX(nvme_subsystems_lock);
>> +DEFINE_MUTEX(nvme_subsystems_lock);
> This seems odd. Why is this lock protecting both the global
> nvme_subsystems list, and also subsystem controllers? IOW, why isn't the
> subsys->ctrls list protected by the more fine grained 'subsys->lock'
> instead of this global lock?
>
>> @@ -43,7 +46,7 @@ static int nvme_get_iopolicy(char *buf, const struct kernel_param *kp)
>> module_param_call(iopolicy, nvme_set_iopolicy, nvme_get_iopolicy,
>> &iopolicy, 0644);
>> MODULE_PARM_DESC(iopolicy,
>> - "Default multipath I/O policy; 'numa' (default) or 'round-robin'");
>> + "Default multipath I/O policy; 'numa' (default) , 'round-robin' or 'queue-depth'");
>
> Unnecessary space before the ','.
I'll fix this.
>> + if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD) {
>> + atomic_inc(&ns->ctrl->nr_active);
>> + nvme_req(rq)->flags |= NVME_MPATH_CNT_ACTIVE;
>> + }
>> +
>> if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq))
>> return;
>>
>> @@ -140,8 +148,12 @@ void nvme_mpath_end_request(struct request *rq)
>> {
>> struct nvme_ns *ns = rq->q->queuedata;
>>
>> + if ((nvme_req(rq)->flags & NVME_MPATH_CNT_ACTIVE))
>> + atomic_dec_if_positive(&ns->ctrl->nr_active);
>
> You can just do a atomic_dec() since your new flag has this tied to to
> the atomic_inc().
No, I don't think that would be correct because, for a number of reasons, this counter could go below zero. e.g. there is
nothing to prevent the IO policy from changing between nvme_mpath_start_request and nvme_mpath_end_request, and there are code
paths like the reset/cancel code path which could affect this counter.
Also, this code path has been extensively tested. Ewan has played with all kinds of different counting schemes, which didn't
work. I'm happy to make non-functional changes, but a functional change like this would require completely retesting and
re-verifying thing.
I'd like to keep this.
>> +static struct nvme_ns *nvme_queue_depth_path(struct nvme_ns_head *head)
>> +{
>> + struct nvme_ns *best_opt = NULL, *best_nonopt = NULL, *ns;
>> + unsigned int min_depth_opt = UINT_MAX, min_depth_nonopt = UINT_MAX;
>> + unsigned int depth;
>> +
>> + list_for_each_entry_rcu(ns, &head->list, siblings) {
>> + if (nvme_path_is_disabled(ns))
>> + continue;
>> +
>> + depth = atomic_read(&ns->ctrl->nr_active);
>> +
>> + switch (ns->ana_state) {
>> + case NVME_ANA_OPTIMIZED:
>> + if (depth < min_depth_opt) {
>> + min_depth_opt = depth;
>> + best_opt = ns;
>> + }
>> + break;
>> +
>> + case NVME_ANA_NONOPTIMIZED:
>> + if (depth < min_depth_nonopt) {
>> + min_depth_nonopt = depth;
>> + best_nonopt = ns;
>> + }
>> + break;
>> + default:
>> + break;
>> + }
>
> Could we break out of this loop early if "min_depth_opt == 0"? We can't
> find a better path that that, so no need to read the rest of the paths.
I can do that... since we are keeping the atomic_dec_if_positive() above. ;-)
>> +void nvme_subsys_iopolicy_update(struct nvme_subsystem *subsys, int iopolicy)
>> +{
>> + struct nvme_ctrl *ctrl;
>> + int old_iopolicy = READ_ONCE(subsys->iopolicy);
>> +
> Let's add a check here:
>
> if (old_iopolicy == iopolicy)
> return;
Actually, this is feature, not a bug. I'd like to keep the ability to reset the nr_active counters. It is an invaluable tool
that I use during testing. See my comments to Hannes.
>> @@ -935,6 +940,7 @@ void nvme_mpath_clear_ctrl_paths(struct nvme_ctrl *ctrl);
>> void nvme_mpath_shutdown_disk(struct nvme_ns_head *head);
>> void nvme_mpath_start_request(struct request *rq);
>> void nvme_mpath_end_request(struct request *rq);
>> +void nvme_subsys_iopolicy_update(struct nvme_subsystem *subsys, int iopolicy);
>
> This funciton isn't used outside multipath.c, so it should be static.
I'll fix this.
/John
next prev parent reply other threads:[~2024-05-21 14:21 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-20 20:20 [PATCH v3 0/1] nvme: queue-depth multipath iopolicy John Meneghini
2024-05-20 20:20 ` [PATCH v3 1/1] nvme: multipath: Implemented new iopolicy "queue-depth" John Meneghini
2024-05-20 20:50 ` Keith Busch
2024-05-21 14:20 ` John Meneghini [this message]
2024-05-21 6:46 ` Hannes Reinecke
2024-05-21 13:58 ` John Meneghini
2024-05-21 14:10 ` Keith Busch
2024-05-21 14:23 ` Hannes Reinecke
2024-05-21 16:35 ` Caleb Sander
2024-05-21 8:48 ` Nilay Shroff
2024-05-21 9:45 ` Sagi Grimberg
2024-05-21 10:07 ` Nilay Shroff
2024-05-21 10:11 ` Sagi Grimberg
2024-05-21 10:15 ` Sagi Grimberg
2024-05-21 10:16 ` Sagi Grimberg
2024-05-21 14:44 ` John Meneghini
2024-05-22 10:48 ` Nilay Shroff
2024-05-22 10:52 ` Sagi Grimberg
2024-05-22 13:12 ` John Meneghini
2024-05-21 10:22 ` Nilay Shroff
2024-05-21 13:05 ` Keith Busch
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=19fa255b-a29c-42a9-b9aa-48422e6000b3@redhat.com \
--to=jmeneghi@redhat.com \
--cc=emilne@redhat.com \
--cc=hare@kernel.org \
--cc=hch@lst.de \
--cc=jrani@purestorage.com \
--cc=kbusch@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=randyj@purestorage.com \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox