From: Krishnamraju Eraparaju <krishna2@chelsio.com>
To: Max Gurtovoy <maxg@mellanox.com>
Cc: sagi@grimberg.me, Chaitanya.Kulkarni@wdc.com, bharat@chelsio.com,
nirranjan@chelsio.com, linux-nvme@lists.infradead.org,
jgg@mellanox.com, kbusch@kernel.org, hch@lst.de
Subject: Re: [PATCH 3/3] nvmet-rdma: allocate RW ctxs according to mdts
Date: Thu, 5 Mar 2020 15:28:52 +0530 [thread overview]
Message-ID: <20200305095847.GA12902@chelsio.com> (raw)
In-Reply-To: <5bef57b6-aade-f074-c1e1-71a1cd93acce@mellanox.com>
On Thursday, March 03/05/20, 2020 at 00:19:01 +0200, Max Gurtovoy wrote:
>
> On 3/4/2020 9:18 PM, Krishnamraju Eraparaju wrote:
> >Hi Max Gurtovoy,
> >
> >I just tested this patch series; the issue is not occurring with these
> >patches.
> >
> >I have a couple of questions:
> >- Say both host & target have a max_fr_pages size of 128 pages. Then
> >the number of MRs allocated at the target will be twice the
> >send_queue_size, as NVMET_RDMA_MAX_MDTS corresponds to 256 pages.
> >
> >So, in this case, as the host can never request an IO larger than
> >128 pages, half of the MRs allocated at the target will always be
> >left unused.
> >
> >If this is true, will this be a concern in the future when the
> >NVMET_RDMA_MAX_MDTS limit is increased, but the max_fr_pages
> >size of some devices remains at 128 pages?
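
A rough sketch of the arithmetic behind this concern (the queue depth of 128
and the DIV_ROUND_UP-style rounding below are illustrative assumptions, not
values taken from the patch):

/* over_alloc.c - illustrate the idle-MR concern above; not kernel code */
#include <stdio.h>

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

int main(void)
{
	unsigned int max_fr_pages = 128;    /* device limit in the example */
	unsigned int mdts_pages = 256;      /* 1 << NVMET_RDMA_MAX_MDTS */
	unsigned int send_queue_size = 128; /* assumed queue depth */

	/* MRs the target reserves per IO once the factor is applied */
	unsigned int factor = DIV_ROUND_UP(mdts_pages, max_fr_pages); /* 2 */
	unsigned int mr_pool = send_queue_size * factor;              /* 256 */

	/* A host that is itself limited to 128 pages per IO needs at most
	 * one MR per request, so only send_queue_size MRs are ever used. */
	printf("factor=%u pool=%u used<=%u\n",
	       factor, mr_pool, send_queue_size);
	return 0;
}
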
>
> For this I suggested a configfs entry, so a user would be able to
> configure the target mdts for QoS and/or to save resources.
>
> Currently this suggestion has not been accepted, but let's re-think it
> in the future (I think adding some configfs entries for saving
> resources, such as q_depth, mdts, num_queues, etc., might be helpful
> for some users).
>
> On the other hand, I didn't limit the mdts even for devices with a
> small max_fr_pages on the target side, so the target will be able
> to work with hosts that can send "big" IOs (using multiple MRs on the
> target side).
>
> I think this is the right approach - better to support capable devices
> and sometimes allocate more than the host requires.
>
> The target acts as a subsystem controller and exposes its mdts,
> exactly as a PCI controller exposes it. Sometimes it's bigger than the
> max_io_size we actually need, and that's fine :)
>
> >
> >
> >- Also, would just passing the optimal mdts (derived from
> >max_fr_pages) to the host during controller identification fix this
> >issue properly (instead of increasing max_rdma_ctxs by the factor)? I
> >think the target wouldn't require multiple MRs in that case, as the
> >host's blk max_segments would be tuned to the target's mdts.
> >
> >Please correct me if I'm wrong.
>
> The Linux host max_io_size is also set to 1MB (if the device is
> capable of it), so you actually won't need multiple MRs per IO.
>
> I don't know what an "optimal" mdts is, since some users would like to
> send 1MB IOs and not split them into 4 requests of 256KB on the host side.
>
> And since we use the RW API, we always need the factor, because it
> might be limited by the API one day (today the limit is 256 pages in
> the RW API).
>
> From your question, I understand that your device can support up to
> 512K IOs, but I think it is a good idea not to limit hosts that
> use other devices with a target that uses your devices.
Thanks for the clarification!
Tested-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
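
For the record, here is my rough reading of the factor computation discussed
above; the min-with-256-pages cap just mirrors the RW API limit you mention,
and the helper name and numbers are illustrative, not kernel code:

/* factor_sketch.c - illustrative only, not the in-kernel rdma_rw_mr_factor() */
#include <stdio.h>

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#define RW_API_PAGE_CAP		256	/* "today the limit is 256 pages" */

static unsigned int mr_factor(unsigned int max_fr_pages, unsigned int mdts_pages)
{
	/* pages one MR can cover: device limit, further capped by the RW API */
	unsigned int mr_pages = max_fr_pages < RW_API_PAGE_CAP ?
				max_fr_pages : RW_API_PAGE_CAP;

	return DIV_ROUND_UP(mdts_pages, mr_pages);
}

int main(void)
{
	/* a 128-page device serving a 256-page (1MB) mdts needs 2 MRs per IO */
	printf("128-page device: factor=%u\n", mr_factor(128, 256));
	/* a device with a larger page list still hits the RW API cap: 1 MR */
	printf("1024-page device: factor=%u\n", mr_factor(1024, 256));
	return 0;
}
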
>
> >
> >Thanks,
> >Krishna.
> >On Wednesday, March 03/04/20, 2020 at 17:39:35 +0200, Max Gurtovoy wrote:
> >>The current nvmet-rdma code allocates the MR pool budget based on queue
> >>size, assuming both host and target use the same "max_pages_per_mr"
> >>count. After limiting the mdts value for RDMA controllers, we know the
> >>maximum number of MRs per IO operation. Thus, make sure the MR pool
> >>will be sufficient for the required IO depth and IO size.
> >>
> >>That is, say the host's SQ size is 100; then the MR pool budget
> >>currently allocated at the target will also be 100 MRs. But 100 IO
> >>WRITE requests with 256 sg_count (IO size above 1MB) require 200 MRs
> >>when the target's "max_pages_per_mr" is 128.
> >>
> >>Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
> >>Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> >>---
> >> drivers/nvme/target/rdma.c | 6 ++++--
> >> 1 file changed, 4 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
> >>index 5ba76d2..a6c9d11 100644
> >>--- a/drivers/nvme/target/rdma.c
> >>+++ b/drivers/nvme/target/rdma.c
> >>@@ -976,7 +976,7 @@ static int nvmet_rdma_create_queue_ib(struct nvmet_rdma_queue *queue)
> >> {
> >> struct ib_qp_init_attr qp_attr;
> >> struct nvmet_rdma_device *ndev = queue->dev;
> >>- int comp_vector, nr_cqe, ret, i;
> >>+ int comp_vector, nr_cqe, ret, i, factor;
> >> /*
> >> * Spread the io queues across completion vectors,
> >>@@ -1009,7 +1009,9 @@ static int nvmet_rdma_create_queue_ib(struct nvmet_rdma_queue *queue)
> >> qp_attr.qp_type = IB_QPT_RC;
> >> /* +1 for drain */
> >> qp_attr.cap.max_send_wr = queue->send_queue_size + 1;
> >>- qp_attr.cap.max_rdma_ctxs = queue->send_queue_size;
> >>+ factor = rdma_rw_mr_factor(ndev->device, queue->cm_id->port_num,
> >>+ 1 << NVMET_RDMA_MAX_MDTS);
> >>+ qp_attr.cap.max_rdma_ctxs = queue->send_queue_size * factor;
> >> qp_attr.cap.max_send_sge = max(ndev->device->attrs.max_sge_rd,
> >> ndev->device->attrs.max_send_sge);
> >>--
> >>1.8.3.1
> >>
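
Checking the commit message example against the new formula (assuming the
factor rounds up pages-per-IO by "max_pages_per_mr", as DIV_ROUND_UP would):
factor = DIV_ROUND_UP(256, 128) = 2, so max_rdma_ctxs = 100 * 2 = 200, which
matches the 200 MRs the example says are needed; the old code would have
reserved only send_queue_size = 100 RW contexts.
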
_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
Thread overview: 17+ messages
2020-03-04 15:39 [PATCH 1/3] nvmet: Add mdts setting op for controllers Max Gurtovoy
2020-03-04 15:39 ` [PATCH 2/3] nvmet-rdma: Implement set_mdts controller op Max Gurtovoy
2020-03-04 16:01 ` Christoph Hellwig
2020-03-04 16:15 ` Max Gurtovoy
2020-03-04 16:18 ` Christoph Hellwig
2020-03-04 16:26 ` Max Gurtovoy
2020-03-04 16:30 ` Christoph Hellwig
2020-03-04 16:32 ` Christoph Hellwig
2020-03-04 15:39 ` [PATCH 3/3] nvmet-rdma: allocate RW ctxs according to mdts Max Gurtovoy
2020-03-04 16:32 ` Christoph Hellwig
2020-03-04 19:18 ` Krishnamraju Eraparaju
2020-03-04 22:19 ` Max Gurtovoy
2020-03-05 9:58 ` Krishnamraju Eraparaju [this message]
2020-03-04 16:31 ` [PATCH 1/3] nvmet: Add mdts setting op for controllers Christoph Hellwig
2020-03-04 16:36 ` Bart Van Assche
2020-03-04 16:48 ` Max Gurtovoy
-- strict thread matches above, loose matches on Subject: below --
2020-03-08 10:55 [PATCH V3 1/3] nvmet: Add get_mdts " Max Gurtovoy
2020-03-08 10:55 ` [PATCH 3/3] nvmet-rdma: allocate RW ctxs according to mdts Max Gurtovoy