* [PATCH v3] nvme_core: scan namespaces asynchronously
@ 2024-07-15 20:34 Stuart Hayes
2024-07-15 22:28 ` Sagi Grimberg
2024-07-16 6:37 ` Hannes Reinecke
0 siblings, 2 replies; 4+ messages in thread
From: Stuart Hayes @ 2024-07-15 20:34 UTC (permalink / raw)
To: linux-kernel, Keith Busch, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, linux-nvme
Cc: Hannes Reinecke, Martin Wilck, Ayush Siddarath, Stuart Hayes
Use async function calls to make namespace scanning happen in parallel.
Without the patch, NVMe namespaces are scanned serially, so it can take
a long time for all of a controller's namespaces to become available,
especially over a slower (TCP) transport with a large number of
namespaces.
It is not uncommon for nvme-of storage servers to expose large numbers
(hundreds or thousands) of namespaces.
The time it took for all namespaces to show up after connecting (via
TCP) to a controller with 1002 namespaces was measured on one system:
network latency    without patch    with patch
      0                  6s              1s
     50ms              210s             10s
    100ms              417s             18s
Measurements taken on another system show the effect of the patch on the
time nvme_scan_work() took to complete, when connecting to a Linux
nvme-of target with varying numbers of namespaces, on a network with
400us of latency.
namespaces    without patch    with patch
      1             16ms           14ms
      2             24ms           16ms
      4             49ms           22ms
      8            101ms           33ms
     16            207ms           56ms
    100             1.4s           0.6s
   1000            12.9s           2.0s
On the same system, connecting to a local PCIe NVMe drive (a Samsung
PM1733) instead of a network target:
namespaces    without patch    with patch
      1             13ms           12ms
      2             41ms           13ms
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
---
changes from V2:
* make a separate function nvme_scan_ns_async() that calls
nvme_scan_ns(), instead of modifying nvme_scan_ns()
* only scan asynchronously from nvme_scan_ns_list(), not from
nvme_scan_ns_sequential()
* provide more timing data in the commit message
changes from V1:
* remove module param to enable/disable async scanning
* add scan time measurements to commit message
drivers/nvme/host/core.c | 48 +++++++++++++++++++++++++++++++++-------
1 file changed, 40 insertions(+), 8 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 782090ce0bc1..dbf05cfea063 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4,6 +4,7 @@
* Copyright (c) 2011-2014, Intel Corporation.
*/
+#include <linux/async.h>
#include <linux/blkdev.h>
#include <linux/blk-mq.h>
#include <linux/blk-integrity.h>
@@ -3952,6 +3953,30 @@ static void nvme_scan_ns(struct nvme_ctrl *ctrl, unsigned nsid)
}
}
+/*
+ * struct async_scan_info - keeps track of controller & NSIDs to scan
+ * @ctrl: Controller on which namespaces are being scanned
+ * @next_idx: Index of next NSID to scan in ns_list
+ * @ns_list: Pointer to list of NSIDs to scan
+ */
+struct async_scan_info {
+ struct nvme_ctrl *ctrl;
+ atomic_t next_idx;
+ __le32 *ns_list;
+};
+
+static void nvme_scan_ns_async(void *data, async_cookie_t cookie)
+{
+ struct async_scan_info *scan_info = data;
+ int idx;
+ u32 nsid;
+
+ idx = (u32)atomic_fetch_add(1, &scan_info->next_idx);
+ nsid = le32_to_cpu(scan_info->ns_list[idx]);
+
+ nvme_scan_ns(scan_info->ctrl, nsid);
+}
+
static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
unsigned nsid)
{
@@ -3975,12 +4000,14 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
static int nvme_scan_ns_list(struct nvme_ctrl *ctrl)
{
const int nr_entries = NVME_IDENTIFY_DATA_SIZE / sizeof(__le32);
- __le32 *ns_list;
+ struct async_scan_info scan_info;
u32 prev = 0;
int ret = 0, i;
+ ASYNC_DOMAIN(domain);
- ns_list = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
- if (!ns_list)
+ scan_info.ctrl = ctrl;
+ scan_info.ns_list = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
+ if (!scan_info.ns_list)
return -ENOMEM;
for (;;) {
@@ -3990,28 +4017,33 @@ static int nvme_scan_ns_list(struct nvme_ctrl *ctrl)
.identify.nsid = cpu_to_le32(prev),
};
- ret = nvme_submit_sync_cmd(ctrl->admin_q, &cmd, ns_list,
- NVME_IDENTIFY_DATA_SIZE);
+ ret = nvme_submit_sync_cmd(ctrl->admin_q, &cmd,
+ scan_info.ns_list,
+ NVME_IDENTIFY_DATA_SIZE);
if (ret) {
dev_warn(ctrl->device,
"Identify NS List failed (status=0x%x)\n", ret);
goto free;
}
+ atomic_set(&scan_info.next_idx, 0);
for (i = 0; i < nr_entries; i++) {
- u32 nsid = le32_to_cpu(ns_list[i]);
+ u32 nsid = le32_to_cpu(scan_info.ns_list[i]);
if (!nsid) /* end of the list? */
goto out;
- nvme_scan_ns(ctrl, nsid);
+ async_schedule_domain(nvme_scan_ns_async, &scan_info,
+ &domain);
while (++prev < nsid)
nvme_ns_remove_by_nsid(ctrl, prev);
}
+ async_synchronize_full_domain(&domain);
}
out:
nvme_remove_invalid_namespaces(ctrl, prev);
free:
- kfree(ns_list);
+ async_synchronize_full_domain(&domain);
+ kfree(scan_info.ns_list);
return ret;
}
--
2.39.3
* Re: [PATCH v3] nvme_core: scan namespaces asynchronously
2024-07-15 20:34 [PATCH v3] nvme_core: scan namespaces asynchronously Stuart Hayes
@ 2024-07-15 22:28 ` Sagi Grimberg
2024-07-16 1:23 ` stuart hayes
2024-07-16 6:37 ` Hannes Reinecke
1 sibling, 1 reply; 4+ messages in thread
From: Sagi Grimberg @ 2024-07-15 22:28 UTC (permalink / raw)
To: Stuart Hayes, linux-kernel, Keith Busch, Jens Axboe,
Christoph Hellwig, linux-nvme
Cc: Hannes Reinecke, Martin Wilck, Ayush Siddarath
On 15/07/2024 23:34, Stuart Hayes wrote:
> Use async function calls to make namespace scanning happen in parallel.
>
> Without the patch, NVME namespaces are scanned serially, so it can take
> a long time for all of a controller's namespaces to become available,
> especially with a slower (TCP) interface with large number of
> namespaces.
>
> It is not uncommon to have large numbers (hundreds or thousands) of
> namespaces on nvme-of with storage servers.
>
> The time it took for all namespaces to show up after connecting (via
> TCP) to a controller with 1002 namespaces was measured on one system:
>
> network latency without patch with patch
> 0 6s 1s
> 50ms 210s 10s
> 100ms 417s 18s
>
> Measurements taken on another system show the effect of the patch on the
> time nvme_scan_work() took to complete, when connecting to a linux
> nvme-of target with varying numbers of namespaces, on a network of
> 400us.
>
> namespaces without patch with patch
> 1 16ms 14ms
> 2 24ms 16ms
> 4 49ms 22ms
> 8 101ms 33ms
> 16 207ms 56ms
> 100 1.4s 0.6s
> 1000 12.9s 2.0s
Not sure how common the 1000 namespaces use-case is, but the dozens of
namespaces case seems compelling enough.
>
> On the same system, connecting to a local PCIe NVMe drive (a Samsung
> PM1733) instead of a network target:
>
> namespaces without patch with patch
> 1 13ms 12ms
> 2 41ms 13ms
>
> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> ---
> changes from V2:
> * make a separate function nvme_scan_ns_async() that calls
> nvme_scan_ns(), instead of modifying nvme_scan_ns()
> * only scan asynchronously from nvme_scan_ns_list(), not from
> nvme_scan_ns_sequential()
> * provide more timing data in the commit message
>
> changes from V1:
> * remove module param to enable/disable async scanning
> * add scan time measurements to commit message
>
>
> drivers/nvme/host/core.c | 48 +++++++++++++++++++++++++++++++++-------
> 1 file changed, 40 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 782090ce0bc1..dbf05cfea063 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4,6 +4,7 @@
> * Copyright (c) 2011-2014, Intel Corporation.
> */
>
> +#include <linux/async.h>
> #include <linux/blkdev.h>
> #include <linux/blk-mq.h>
> #include <linux/blk-integrity.h>
> @@ -3952,6 +3953,30 @@ static void nvme_scan_ns(struct nvme_ctrl *ctrl, unsigned nsid)
> }
> }
>
> +/*
> + * struct async_scan_info - keeps track of controller & NSIDs to scan
> + * @ctrl: Controller on which namespaces are being scanned
> + * @next_idx: Index of next NSID to scan in ns_list
> + * @ns_list: Pointer to list of NSIDs to scan
> + */
> +struct async_scan_info {
> + struct nvme_ctrl *ctrl;
> + atomic_t next_idx;
next_nsid ?
> + __le32 *ns_list;
> +};
> +
> +static void nvme_scan_ns_async(void *data, async_cookie_t cookie)
> +{
> + struct async_scan_info *scan_info = data;
> + int idx;
> + u32 nsid;
> +
> + idx = (u32)atomic_fetch_add(1, &scan_info->next_idx);
> + nsid = le32_to_cpu(scan_info->ns_list[idx]);
> +
> + nvme_scan_ns(scan_info->ctrl, nsid);
> +}
> +
> static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
> unsigned nsid)
> {
> @@ -3975,12 +4000,14 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
> static int nvme_scan_ns_list(struct nvme_ctrl *ctrl)
> {
> const int nr_entries = NVME_IDENTIFY_DATA_SIZE / sizeof(__le32);
> - __le32 *ns_list;
> + struct async_scan_info scan_info;
What initializes next_idx?
> u32 prev = 0;
> int ret = 0, i;
> + ASYNC_DOMAIN(domain);
>
> - ns_list = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
> - if (!ns_list)
> + scan_info.ctrl = ctrl;
> + scan_info.ns_list = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
> + if (!scan_info.ns_list)
> return -ENOMEM;
I think you can leave the local variable ns_list as is, and just assign
it to scan_info after; it's common practice to allocate to a local
pointer and use it to init a struct member.
Plus it will make the patch diff simpler.
>
> for (;;) {
> @@ -3990,28 +4017,33 @@ static int nvme_scan_ns_list(struct nvme_ctrl *ctrl)
> .identify.nsid = cpu_to_le32(prev),
> };
>
> - ret = nvme_submit_sync_cmd(ctrl->admin_q, &cmd, ns_list,
> - NVME_IDENTIFY_DATA_SIZE);
> + ret = nvme_submit_sync_cmd(ctrl->admin_q, &cmd,
> + scan_info.ns_list,
> + NVME_IDENTIFY_DATA_SIZE);
> if (ret) {
> dev_warn(ctrl->device,
> "Identify NS List failed (status=0x%x)\n", ret);
> goto free;
> }
>
> + atomic_set(&scan_info.next_idx, 0);
> for (i = 0; i < nr_entries; i++) {
> - u32 nsid = le32_to_cpu(ns_list[i]);
> + u32 nsid = le32_to_cpu(scan_info.ns_list[i]);
>
> if (!nsid) /* end of the list? */
> goto out;
> - nvme_scan_ns(ctrl, nsid);
> + async_schedule_domain(nvme_scan_ns_async, &scan_info,
> + &domain);
> while (++prev < nsid)
> nvme_ns_remove_by_nsid(ctrl, prev);
> }
> + async_synchronize_full_domain(&domain);
> }
> out:
> nvme_remove_invalid_namespaces(ctrl, prev);
> free:
> - kfree(ns_list);
> + async_synchronize_full_domain(&domain);
> + kfree(scan_info.ns_list);
> return ret;
> }
>
* Re: [PATCH v3] nvme_core: scan namespaces asynchronously
2024-07-15 22:28 ` Sagi Grimberg
@ 2024-07-16 1:23 ` stuart hayes
0 siblings, 0 replies; 4+ messages in thread
From: stuart hayes @ 2024-07-16 1:23 UTC (permalink / raw)
To: Sagi Grimberg, linux-kernel, Keith Busch, Jens Axboe,
Christoph Hellwig, linux-nvme
Cc: Hannes Reinecke, Martin Wilck, Ayush Siddarath
On 7/15/2024 5:28 PM, Sagi Grimberg wrote:
>
>
> On 15/07/2024 23:34, Stuart Hayes wrote:
>> Use async function calls to make namespace scanning happen in parallel.
>>
>> Without the patch, NVME namespaces are scanned serially, so it can take
>> a long time for all of a controller's namespaces to become available,
>> especially with a slower (TCP) interface with large number of
>> namespaces.
>>
>> It is not uncommon to have large numbers (hundreds or thousands) of
>> namespaces on nvme-of with storage servers.
>>
>> The time it took for all namespaces to show up after connecting (via
>> TCP) to a controller with 1002 namespaces was measured on one system:
>>
>> network latency without patch with patch
>> 0 6s 1s
>> 50ms 210s 10s
>> 100ms 417s 18s
>>
>> Measurements taken on another system show the effect of the patch on the
>> time nvme_scan_work() took to complete, when connecting to a linux
>> nvme-of target with varying numbers of namespaces, on a network of
>> 400us.
>>
>> namespaces without patch with patch
>> 1 16ms 14ms
>> 2 24ms 16ms
>> 4 49ms 22ms
>> 8 101ms 33ms
>> 16 207ms 56ms
>> 100 1.4s 0.6s
>> 1000 12.9s 2.0s
>
> Not sure how common is the 1000 namespaces use-case, but the dozens of namespaces
> seems compelling enough.
>
>>
>> On the same system, connecting to a local PCIe NVMe drive (a Samsung
>> PM1733) instead of a network target:
>>
>> namespaces without patch with patch
>> 1 13ms 12ms
>> 2 41ms 13ms
>>
>> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
>> ---
>> changes from V2:
>> * make a separate function nvme_scan_ns_async() that calls
>> nvme_scan_ns(), instead of modifying nvme_scan_ns()
>> * only scan asynchronously from nvme_scan_ns_list(), not from
>> nvme_scan_ns_sequential()
>> * provide more timing data in the commit message
>>
>> changes from V1:
>> * remove module param to enable/disable async scanning
>> * add scan time measurements to commit message
>>
>>
>> drivers/nvme/host/core.c | 48 +++++++++++++++++++++++++++++++++-------
>> 1 file changed, 40 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 782090ce0bc1..dbf05cfea063 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -4,6 +4,7 @@
>> * Copyright (c) 2011-2014, Intel Corporation.
>> */
>> +#include <linux/async.h>
>> #include <linux/blkdev.h>
>> #include <linux/blk-mq.h>
>> #include <linux/blk-integrity.h>
>> @@ -3952,6 +3953,30 @@ static void nvme_scan_ns(struct nvme_ctrl *ctrl, unsigned nsid)
>> }
>> }
>> +/*
>> + * struct async_scan_info - keeps track of controller & NSIDs to scan
>> + * @ctrl: Controller on which namespaces are being scanned
>> + * @next_idx: Index of next NSID to scan in ns_list
>> + * @ns_list: Pointer to list of NSIDs to scan
>> + */
>> +struct async_scan_info {
>> + struct nvme_ctrl *ctrl;
>> + atomic_t next_idx;
>
> next_nsid ?
>
OK!
>> + __le32 *ns_list;
>> +};
>> +
>> +static void nvme_scan_ns_async(void *data, async_cookie_t cookie)
>> +{
>> + struct async_scan_info *scan_info = data;
>> + int idx;
>> + u32 nsid;
>> +
>> + idx = (u32)atomic_fetch_add(1, &scan_info->next_idx);
>> + nsid = le32_to_cpu(scan_info->ns_list[idx]);
>> +
>> + nvme_scan_ns(scan_info->ctrl, nsid);
>> +}
>> +
>> static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
>> unsigned nsid)
>> {
>> @@ -3975,12 +4000,14 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
>> static int nvme_scan_ns_list(struct nvme_ctrl *ctrl)
>> {
>> const int nr_entries = NVME_IDENTIFY_DATA_SIZE / sizeof(__le32);
>> - __le32 *ns_list;
>> + struct async_scan_info scan_info;
>
> What initializes next_idx?
See below--there's an atomic_set(). It is inside the outer "for" loop because there can
be multiple lists that have to be scanned, and it has to reset to 0 each time.
>
>> u32 prev = 0;
>> int ret = 0, i;
>> + ASYNC_DOMAIN(domain);
>> - ns_list = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
>> - if (!ns_list)
>> + scan_info.ctrl = ctrl;
>> + scan_info.ns_list = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
>> + if (!scan_info.ns_list)
>> return -ENOMEM;
>
> I think you can leave the local variable ns_list as is, and just assign it to scan_info
> after, its common practice to allocate to a local pointer and use it to init a struct member.
>
> Plus it will make the patch diff simpler.
>
No problem, I agree. I think someone suggested the opposite last time I submitted this. :)
>> for (;;) {
>> @@ -3990,28 +4017,33 @@ static int nvme_scan_ns_list(struct nvme_ctrl *ctrl)
>> .identify.nsid = cpu_to_le32(prev),
>> };
>> - ret = nvme_submit_sync_cmd(ctrl->admin_q, &cmd, ns_list,
>> - NVME_IDENTIFY_DATA_SIZE);
>> + ret = nvme_submit_sync_cmd(ctrl->admin_q, &cmd,
>> + scan_info.ns_list,
>> + NVME_IDENTIFY_DATA_SIZE);
>> if (ret) {
>> dev_warn(ctrl->device,
>> "Identify NS List failed (status=0x%x)\n", ret);
>> goto free;
>> }
>> + atomic_set(&scan_info.next_idx, 0);
This atomic_set is what initializes next_idx.
>> for (i = 0; i < nr_entries; i++) {
>> - u32 nsid = le32_to_cpu(ns_list[i]);
>> + u32 nsid = le32_to_cpu(scan_info.ns_list[i]);
>> if (!nsid) /* end of the list? */
>> goto out;
>> - nvme_scan_ns(ctrl, nsid);
>> + async_schedule_domain(nvme_scan_ns_async, &scan_info,
>> + &domain);
>> while (++prev < nsid)
>> nvme_ns_remove_by_nsid(ctrl, prev);
>> }
>> + async_synchronize_full_domain(&domain);
>> }
>> out:
>> nvme_remove_invalid_namespaces(ctrl, prev);
>> free:
>> - kfree(ns_list);
>> + async_synchronize_full_domain(&domain);
>> + kfree(scan_info.ns_list);
>> return ret;
>> }
>
Thank you for the feedback!
* Re: [PATCH v3] nvme_core: scan namespaces asynchronously
2024-07-15 20:34 [PATCH v3] nvme_core: scan namespaces asynchronously Stuart Hayes
2024-07-15 22:28 ` Sagi Grimberg
@ 2024-07-16 6:37 ` Hannes Reinecke
1 sibling, 0 replies; 4+ messages in thread
From: Hannes Reinecke @ 2024-07-16 6:37 UTC (permalink / raw)
To: Stuart Hayes, linux-kernel, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-nvme
Cc: Martin Wilck, Ayush Siddarath
On 7/15/24 22:34, Stuart Hayes wrote:
> Use async function calls to make namespace scanning happen in parallel.
>
> Without the patch, NVME namespaces are scanned serially, so it can take
> a long time for all of a controller's namespaces to become available,
> especially with a slower (TCP) interface with large number of
> namespaces.
>
> It is not uncommon to have large numbers (hundreds or thousands) of
> namespaces on nvme-of with storage servers.
>
> The time it took for all namespaces to show up after connecting (via
> TCP) to a controller with 1002 namespaces was measured on one system:
>
> network latency without patch with patch
> 0 6s 1s
> 50ms 210s 10s
> 100ms 417s 18s
>
> Measurements taken on another system show the effect of the patch on the
> time nvme_scan_work() took to complete, when connecting to a linux
> nvme-of target with varying numbers of namespaces, on a network of
> 400us.
>
> namespaces without patch with patch
> 1 16ms 14ms
> 2 24ms 16ms
> 4 49ms 22ms
> 8 101ms 33ms
> 16 207ms 56ms
> 100 1.4s 0.6s
> 1000 12.9s 2.0s
>
> On the same system, connecting to a local PCIe NVMe drive (a Samsung
> PM1733) instead of a network target:
>
> namespaces without patch with patch
> 1 13ms 12ms
> 2 41ms 13ms
>
> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
> ---
> changes from V2:
> * make a separate function nvme_scan_ns_async() that calls
> nvme_scan_ns(), instead of modifying nvme_scan_ns()
> * only scan asynchronously from nvme_scan_ns_list(), not from
> nvme_scan_ns_sequential()
> * provide more timing data in the commit message
>
> changes from V1:
> * remove module param to enable/disable async scanning
> * add scan time measurements to commit message
>
>
> drivers/nvme/host/core.c | 48 +++++++++++++++++++++++++++++++++-------
> 1 file changed, 40 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 782090ce0bc1..dbf05cfea063 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4,6 +4,7 @@
> * Copyright (c) 2011-2014, Intel Corporation.
> */
>
> +#include <linux/async.h>
> #include <linux/blkdev.h>
> #include <linux/blk-mq.h>
> #include <linux/blk-integrity.h>
> @@ -3952,6 +3953,30 @@ static void nvme_scan_ns(struct nvme_ctrl *ctrl, unsigned nsid)
> }
> }
>
> +/*
> + * struct async_scan_info - keeps track of controller & NSIDs to scan
> + * @ctrl: Controller on which namespaces are being scanned
> + * @next_idx: Index of next NSID to scan in ns_list
> + * @ns_list: Pointer to list of NSIDs to scan
> + */
> +struct async_scan_info {
> + struct nvme_ctrl *ctrl;
> + atomic_t next_idx;
> + __le32 *ns_list;
> +};
> +
> +static void nvme_scan_ns_async(void *data, async_cookie_t cookie)
> +{
> + struct async_scan_info *scan_info = data;
> + int idx;
> + u32 nsid;
> +
> + idx = (u32)atomic_fetch_add(1, &scan_info->next_idx);
> + nsid = le32_to_cpu(scan_info->ns_list[idx]);
> +
> + nvme_scan_ns(scan_info->ctrl, nsid);
> +}
> +
> static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
> unsigned nsid)
> {
> @@ -3975,12 +4000,14 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
> static int nvme_scan_ns_list(struct nvme_ctrl *ctrl)
> {
> const int nr_entries = NVME_IDENTIFY_DATA_SIZE / sizeof(__le32);
> - __le32 *ns_list;
> + struct async_scan_info scan_info;
> u32 prev = 0;
> int ret = 0, i;
> + ASYNC_DOMAIN(domain);
>
> - ns_list = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
> - if (!ns_list)
> + scan_info.ctrl = ctrl;
> + scan_info.ns_list = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
> + if (!scan_info.ns_list)
> return -ENOMEM;
>
> for (;;) {
> @@ -3990,28 +4017,33 @@ static int nvme_scan_ns_list(struct nvme_ctrl *ctrl)
> .identify.nsid = cpu_to_le32(prev),
> };
>
> - ret = nvme_submit_sync_cmd(ctrl->admin_q, &cmd, ns_list,
> - NVME_IDENTIFY_DATA_SIZE);
> + ret = nvme_submit_sync_cmd(ctrl->admin_q, &cmd,
> + scan_info.ns_list,
> + NVME_IDENTIFY_DATA_SIZE);
> if (ret) {
> dev_warn(ctrl->device,
> "Identify NS List failed (status=0x%x)\n", ret);
> goto free;
> }
>
> + atomic_set(&scan_info.next_idx, 0);
> for (i = 0; i < nr_entries; i++) {
> - u32 nsid = le32_to_cpu(ns_list[i]);
> + u32 nsid = le32_to_cpu(scan_info.ns_list[i]);
>
> if (!nsid) /* end of the list? */
> goto out;
> - nvme_scan_ns(ctrl, nsid);
> + async_schedule_domain(nvme_scan_ns_async, &scan_info,
> + &domain);
> while (++prev < nsid)
> nvme_ns_remove_by_nsid(ctrl, prev);
> }
> + async_synchronize_full_domain(&domain);
Let me see if I get this right ...
You allocate 'scan_info' on the stack, so every call to
'async_schedule_domain()' in the loop will be using the same
scan_info context, right?
So each instance of nvme_scan_ns_async() will be using
whichever value is in 'next_idx', right?
Effectively making 'nvme_scan_ns_async()' completely free-floating,
spawning 'nr_entries' instances, and letting each instance pick whichever
nsid is (at the time of execution) the next one.
If that's the case then I would welcome some comments in the code, as
this is somewhat non-obvious. And it also spells out clearly why the
atomic 'next_idx' value is absolutely crucial to that mechanism, and
we don't want anyone getting wrong ideas by 'optimizing' that away.
Otherwise:
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich