* [PATCH 1/1] NVMe I/O queue depth change to module parameter
@ 2014-07-15 17:36 Mundu
2014-07-15 18:28 ` Matthew Wilcox
0 siblings, 1 reply; 5+ messages in thread
From: Mundu @ 2014-07-15 17:36 UTC (permalink / raw)
Signed-off-by: Mundu <mundu2510 at gmail.com>
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 28aec2d..e8e88d6 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -44,12 +44,15 @@
#include <trace/events/block.h>
-#define NVME_Q_DEPTH 1024
#define SQ_SIZE(depth) (depth * sizeof(struct nvme_command))
#define CQ_SIZE(depth) (depth * sizeof(struct nvme_completion))
#define ADMIN_TIMEOUT (admin_timeout * HZ)
#define IOD_TIMEOUT (retry_time * HZ)
+static unsigned short int nvme_q_depth = 1024;
+module_param(nvme_q_depth, ushort, 0);
+MODULE_PARM_DESC(nvme_q_depth, "queue depth in number of entries for I/O queues");
+
static unsigned char admin_timeout = 60;
module_param(admin_timeout, byte, 0644);
MODULE_PARM_DESC(admin_timeout, "timeout in seconds for admin commands");
@@ -2373,7 +2376,7 @@ static int nvme_dev_map(struct nvme_dev *dev)
goto unmap;
}
cap = readq(&dev->bar->cap);
- dev->q_depth = min_t(int, NVME_CAP_MQES(cap) + 1, NVME_Q_DEPTH);
+ dev->q_depth = min_t(int, NVME_CAP_MQES(cap) + 1, nvme_q_depth);
dev->db_stride = 1 << NVME_CAP_STRIDE(cap);
dev->dbs = ((void __iomem *)dev->bar) + 4096;
--
1.9.1
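
The effect of the one-line change in nvme_dev_map() can be sketched in plain userspace C. This is an illustration under stated assumptions, not driver code: effective_q_depth() and Q_DEPTH_MIN are hypothetical names, and the lower-bound check is an extra guard that the patch itself does not contain. NVME_CAP_MQES() is written here as the low 16 bits of the CAP register, a 0's based field.

```c
#include <stdint.h>

/* CAP.MQES: bits 15:0 of the controller CAP register, 0's based
 * (a field value of N means the controller supports N + 1 entries). */
#define NVME_CAP_MQES(cap) ((cap) & 0xffff)

/* Hypothetical constant, not part of the patch.
 * Assumption: an NVMe queue needs at least two entries. */
#define Q_DEPTH_MIN 2

/* Hypothetical helper: the I/O queue depth that results from clamping
 * the requested module parameter to what the controller advertises. */
static int effective_q_depth(uint64_t cap, unsigned short requested)
{
	int hw_max = (int)NVME_CAP_MQES(cap) + 1;	/* 1..65536 */
	int depth = requested < hw_max ? requested : hw_max;

	/* Extra guard (an assumption, not in the patch): reject 0 and 1. */
	if (depth < Q_DEPTH_MIN)
		depth = Q_DEPTH_MIN;
	return depth;
}
```

Two details follow from the types in the patch. An unsigned short tops out at 65535, one short of the 65536 entries that a 16-bit, 0's based MQES field can describe. And with a permission value of 0 the parameter gets no sysfs entry, so it can only be set when the module is loaded.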
* [PATCH 1/1] NVMe I/O queue depth change to module parameter
2014-07-15 17:36 [PATCH 1/1] NVMe I/O queue depth change to module parameter Mundu
@ 2014-07-15 18:28 ` Matthew Wilcox
2014-07-16 5:37 ` mundu agarwal
[not found] ` <CACuVHZ=vSQ1341CUziDZJjv5DkGzw-9K7WOda59yfpT3WOzG2Q@mail.gmail.com>
0 siblings, 2 replies; 5+ messages in thread
From: Matthew Wilcox @ 2014-07-15 18:28 UTC (permalink / raw)
On Tue, Jul 15, 2014 at 11:06:38PM +0530, Mundu wrote:
> Signed-off-by: Mundu <mundu2510 at gmail.com>
why?
* [PATCH 1/1] NVMe I/O queue depth change to module parameter
2014-07-15 18:28 ` Matthew Wilcox
@ 2014-07-16 5:37 ` mundu agarwal
[not found] ` <CACuVHZ=vSQ1341CUziDZJjv5DkGzw-9K7WOda59yfpT3WOzG2Q@mail.gmail.com>
1 sibling, 0 replies; 5+ messages in thread
From: mundu agarwal @ 2014-07-16 5:37 UTC (permalink / raw)
Willy,
In one of our server test environments, the user is unable to raise the
I/O queue depth above 1024. The controller supports a much higher number,
but the driver still limits it to 1024.
Is there a particular reason for keeping the limit at 1024?
Regards,
Mundu
On Tue, Jul 15, 2014 at 11:58 PM, Matthew Wilcox <willy@linux.intel.com> wrote:
>
> On Tue, Jul 15, 2014 at 11:06:38PM +0530, Mundu wrote:
> > Signed-off-by: Mundu <mundu2510 at gmail.com>
>
> why?
[parent not found: <CACuVHZ=vSQ1341CUziDZJjv5DkGzw-9K7WOda59yfpT3WOzG2Q@mail.gmail.com>]
* [PATCH 1/1] NVMe I/O queue depth change to module parameter
[not found] ` <CACuVHZ=vSQ1341CUziDZJjv5DkGzw-9K7WOda59yfpT3WOzG2Q@mail.gmail.com>
@ 2014-07-16 13:27 ` Matthew Wilcox
2014-07-20 14:15 ` mundu agarwal
0 siblings, 1 reply; 5+ messages in thread
From: Matthew Wilcox @ 2014-07-16 13:27 UTC (permalink / raw)
On Wed, Jul 16, 2014 at 11:00:31AM +0530, mundu agarwal wrote:
> Willy,
>
> In one of our server test environments, the user is unable to raise the
> I/O queue depth above 1024. The controller supports a much higher number,
> but the driver still limits it to 1024.
> Is there a particular reason for keeping the limit at 1024?
That's the kind of comment you need to write in the changelog description.
Now, the reason I limited a queue to 1024 entries was that this was
sufficient to saturate a PCIe bus with typical flash latencies.
If the PCIe bus is x8 gen3, we have 8GB/s of bandwidth available.
Assuming that I/Os are on average 4k and it's a 50/50 read/write split,
we need to service 4 million IOPS to saturate the bus (I haven't heard
of anyone producing a 4 million IOPS device, but let's assume someone's
trying to).
Assuming the controller takes about 100us to service any individual
request, servicing 4 million I/Os serially would take 400 seconds, so
we need to have at least 400 I/Os with the device at all times in order
to hit our goal of saturating the PCIe bus.
So with 1024 I/Os on any given queue, we're a factor of 2.5 above that
goal, *per queue*. So increasing the maximum queue depth any further
isn't going to help us achieve our goal of saturating the PCIe bus.
Indeed, it's only going to upset some of the other timeouts; we've already
had reports that I/Os will start to time out if you saturate all of the
queues as the controllers can't complete the I/Os fast enough.
So what's your motivation for needing a deeper queue?
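
The arithmetic in this message is an application of Little's law (I/Os in flight = completion rate x latency), and it can be checked with a few lines of C. This is only a sketch: the function names are hypothetical, the factor of two reads the 4 million IOPS figure as roughly 2 million in each direction of a full-duplex link, and the queue count and device IOPS used in the drain-time example below are made-up numbers, not measurements.

```c
#include <stdint.h>

/* IOPS needed to fill a link that carries bytes_per_sec in each
 * direction with I/Os of io_bytes, assuming the 50/50 read/write
 * mix keeps both directions busy. */
static uint64_t iops_to_saturate(uint64_t bytes_per_sec, uint64_t io_bytes)
{
	return 2 * (bytes_per_sec / io_bytes);
}

/* Little's law: I/Os that must be in flight to sustain iops when
 * each I/O takes latency_us microseconds (rounded up). */
static uint64_t inflight_needed(uint64_t iops, uint64_t latency_us)
{
	return (iops * latency_us + 999999) / 1000000;
}

/* Milliseconds the device needs to complete everything outstanding
 * when nr_queues queues are each filled to depth (rounded down). */
static uint64_t drain_time_ms(uint64_t nr_queues, uint64_t depth,
			      uint64_t device_iops)
{
	return nr_queues * depth * 1000 / device_iops;
}
```

With the rounded figure, inflight_needed(4000000, 100) gives the 400 I/Os quoted above, and 1024 / 400 is the factor of about 2.5. Using 8 * 10^9 bytes/s and 4096-byte I/Os, iops_to_saturate() gives 3906250, slightly under 4 million. For the timeout concern, a hypothetical device with 32 full queues of 1024 entries that completes 100000 IOPS needs drain_time_ms(32, 1024, 100000), about 327 ms, to get through the backlog, and that time grows linearly with queue depth.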
* [PATCH 1/1] NVMe I/O queue depth change to module parameter
2014-07-16 13:27 ` Matthew Wilcox
@ 2014-07-20 14:15 ` mundu agarwal
0 siblings, 0 replies; 5+ messages in thread
From: mundu agarwal @ 2014-07-20 14:15 UTC (permalink / raw)
Thanks, Willy, for the detailed explanation.
Since the hardware (device/controller) is still at the evaluation stage,
device parameters such as queue depth and best/worst-case command
processing time have to be decided based on the computing environment
(slow machines such as desktops and laptops, fast machines such as
servers). Finding the optimal configuration for queue depth, command
timeout and other device parameters requires changing them frequently.
For example, in a slow or optimally configured environment, these
parameters (specifically queue depth) may not need to be as high as
1024, even though the device advertises more than that, as you
explained in detail.
Regards,
Mundu
On Wed, Jul 16, 2014 at 6:57 PM, Matthew Wilcox <willy@linux.intel.com> wrote:
> On Wed, Jul 16, 2014 at 11:00:31AM +0530, mundu agarwal wrote:
>> Willy,
>>
>> In one of our server test environments, the user is unable to raise the
>> I/O queue depth above 1024. The controller supports a much higher number,
>> but the driver still limits it to 1024.
>> Is there a particular reason for keeping the limit at 1024?
>
> That's the kind of comment you need to write in the changelog description.
>
> Now, the reason I limited a queue to 1024 entries was that this was
> sufficient to saturate a PCIe bus with typical flash latencies.
>
> If the PCIe bus is x8 gen3, we have 8GB/s of bandwidth available.
> Assuming that I/Os are on average 4k and it's a 50/50 read/write split,
> we need to service 4 million IOPS to saturate the bus (I haven't heard
> of anyone producing a 4 million IOPS device, but let's assume someone's
> trying to).
>
> Assuming the controller takes about 100us to service any individual
> request, servicing 4 million I/Os serially would take 400 seconds, so
> we need to have at least 400 I/Os with the device at all times in order
> to hit our goal of saturating the PCIe bus.
>
> So with 1024 I/Os on any given queue, we're a factor of 2.5 above that
> goal, *per queue*. So increasing the maximum queue depth any further
> isn't going to help us achieve our goal of saturating the PCIe bus.
> Indeed, it's only going to upset some of the other timeouts; we've already
> had reports that I/Os will start to time out if you saturate all of the
> queues as the controllers can't complete the I/Os fast enough.
>
> So what's your motivation for needing a deeper queue?