All of lore.kernel.org
 help / color / mirror / Atom feed
From: Krishnamraju Eraparaju <krishna2@chelsio.com>
To: Max Gurtovoy <maxg@mellanox.com>
Cc: sagi@grimberg.me, Chaitanya.Kulkarni@wdc.com, bharat@chelsio.com,
	nirranjan@chelsio.com, linux-nvme@lists.infradead.org,
	jgg@mellanox.com, kbusch@kernel.org, hch@lst.de
Subject: Re: [PATCH 3/3] nvmet-rdma: allocate RW ctxs according to mdts
Date: Thu, 5 Mar 2020 15:28:52 +0530	[thread overview]
Message-ID: <20200305095847.GA12902@chelsio.com> (raw)
In-Reply-To: <5bef57b6-aade-f074-c1e1-71a1cd93acce@mellanox.com>

On Thursday, March 03/05/20, 2020 at 00:19:01 +0200, Max Gurtovoy wrote:
> 
> On 3/4/2020 9:18 PM, Krishnamraju Eraparaju wrote:
> >Hi Max Gurtovoy,
> >
> >I just tested this patch series, the issue is not occuring with these
> >patches.
> >
> >Have couple of questons:
> >- Say both host & target has max_fr_pages size of 128 pages, then
> >the number of MRs allocated at target will be twice the size of
> >send_queue_size, as NVMET_RDMA_MAX_MDTS is set to 256 pages.
> >
> >so, in this case, as host can never request an IO of size greater
> >than 128 pages, half of the MRs allocated at target will always
> >left unused.
> >
> >If this is true, will this be a concern in future when
> >NVMET_RDMA_MAX_MDTS limit is increased, but max_fr_pages
> >size of few devices remained at 128 pages?
> 
> for this I suggested a configfs entry so a user would be able to
> configure the target mdts as a QoS and/or to save resources.
> 
> Currently this suggestion is not accepted but let's re-think about
> it in the future (I think adding some configfs entries for saving
> resources such as q_depth, mdts, num_queues, etc might be helpful
> for some users).
> 
> On the other hand, I didn't limit the mdts even for devices with
> small amount of max_fr_pages in the target side so it will be able
> to work with host the can send "big" IOs (with multiple MRs in the
> target side).
> 
> I think this is the right approach - better support capable devices
> and sometimes allocate more than required from host.
> 
> The target acts as a subsystem controller and expose it's mdts,
> exactly as the pci ctrl expose it. Sometimes it's bigger than the
> max_io_size we actually need and it's fine :)
> 
> >
> >
> >- Also, will just passing the optimal mdts(derived based on
> >max_fr_pages) to host during ctrl identification fixes this issue
> >properly(instead of increasing the max_rdma_ctxs with factor)? I think
> >the target doesn't require multiple MRs in this case as host's blk
> >max_segments got tuned with target's mdts.
> >
> >Please correct me if I'm wrong.
> 
> Linux host max_io_size is also set to 1MB (if the device is capable
> for it) so you actually won't be needing multiple MRs per IO.
> 
> I don't know what's optimal_mdts since some users would like to send
> 1MB IOs and not split it to 4 requests of 256KB in the host side.
> 
> And since we use RW api we always need the factor because it might
> be limited by the API one day (today the limit is 256 pages in RW
> api).
> 
> From your question, I understand that your device can support upto
> 512K IOs but I think it will be good idea not to limit hosts that
> use other devices with target that uses your devices.

Thanks for the clarification!

Tested-by: Krishnamraju Eraparaju <krishna2@chelsio.com>

> 
> >
> >Thanks,
> >Krishna.
> >On Wednesday, March 03/04/20, 2020 at 17:39:35 +0200, Max Gurtovoy wrote:
> >>Current nvmet-rdma code allocates MR pool budget based on queue size,
> >>assuming both host and target use the same "max_pages_per_mr" count.
> >>After limiting the mdts value for RDMA controllers, we know the factor
> >>of maximum MR's per IO operation. Thus, make sure MR pool will be
> >>sufficient for the required IO depth and IO size.
> >>
> >>That is, say host's SQ size is 100, then the MR pool budget allocated
> >>currently at target will also be 100 MRs. But 100 IO WRITE Requests
> >>with 256 sg_count(IO size above 1MB) require 200 MRs when target's
> >>"max_pages_per_mr" is 128.
> >>
> >>Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
> >>Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> >>---
> >>  drivers/nvme/target/rdma.c | 6 ++++--
> >>  1 file changed, 4 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
> >>index 5ba76d2..a6c9d11 100644
> >>--- a/drivers/nvme/target/rdma.c
> >>+++ b/drivers/nvme/target/rdma.c
> >>@@ -976,7 +976,7 @@ static int nvmet_rdma_create_queue_ib(struct nvmet_rdma_queue *queue)
> >>  {
> >>  	struct ib_qp_init_attr qp_attr;
> >>  	struct nvmet_rdma_device *ndev = queue->dev;
> >>-	int comp_vector, nr_cqe, ret, i;
> >>+	int comp_vector, nr_cqe, ret, i, factor;
> >>  	/*
> >>  	 * Spread the io queues across completion vectors,
> >>@@ -1009,7 +1009,9 @@ static int nvmet_rdma_create_queue_ib(struct nvmet_rdma_queue *queue)
> >>  	qp_attr.qp_type = IB_QPT_RC;
> >>  	/* +1 for drain */
> >>  	qp_attr.cap.max_send_wr = queue->send_queue_size + 1;
> >>-	qp_attr.cap.max_rdma_ctxs = queue->send_queue_size;
> >>+	factor = rdma_rw_mr_factor(ndev->device, queue->cm_id->port_num,
> >>+				   1 << NVMET_RDMA_MAX_MDTS);
> >>+	qp_attr.cap.max_rdma_ctxs = queue->send_queue_size * factor;
> >>  	qp_attr.cap.max_send_sge = max(ndev->device->attrs.max_sge_rd,
> >>  					ndev->device->attrs.max_send_sge);
> >>-- 
> >>1.8.3.1
> >>
> >_______________________________________________
> >linux-nvme mailing list
> >linux-nvme@lists.infradead.org
> >https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.infradead.org%2Fmailman%2Flistinfo%2Flinux-nvme&amp;data=02%7C01%7Cmaxg%40mellanox.com%7C9d945a2bb54543630a1e08d7c070ee88%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637189463598411223&amp;sdata=xBgbsudv9jqJ0mSOYW37zLFvRbxSQ2cyzyFmWCVMSVQ%3D&amp;reserved=0

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

  reply	other threads:[~2020-03-05  9:59 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-04 15:39 [PATCH 1/3] nvmet: Add mdts setting op for controllers Max Gurtovoy
2020-03-04 15:39 ` [PATCH 2/3] nvmet-rdma: Implement set_mdts controller op Max Gurtovoy
2020-03-04 16:01   ` Christoph Hellwig
2020-03-04 16:15     ` Max Gurtovoy
2020-03-04 16:18       ` Christoph Hellwig
2020-03-04 16:26         ` Max Gurtovoy
2020-03-04 16:30           ` Christoph Hellwig
2020-03-04 16:32   ` Christoph Hellwig
2020-03-04 15:39 ` [PATCH 3/3] nvmet-rdma: allocate RW ctxs according to mdts Max Gurtovoy
2020-03-04 16:32   ` Christoph Hellwig
2020-03-04 19:18   ` Krishnamraju Eraparaju
2020-03-04 22:19     ` Max Gurtovoy
2020-03-05  9:58       ` Krishnamraju Eraparaju [this message]
2020-03-04 16:31 ` [PATCH 1/3] nvmet: Add mdts setting op for controllers Christoph Hellwig
2020-03-04 16:36 ` Bart Van Assche
2020-03-04 16:48   ` Max Gurtovoy
  -- strict thread matches above, loose matches on Subject: below --
2020-03-08 10:55 [PATCH V3 1/3] nvmet: Add get_mdts " Max Gurtovoy
2020-03-08 10:55 ` [PATCH 3/3] nvmet-rdma: allocate RW ctxs according to mdts Max Gurtovoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200305095847.GA12902@chelsio.com \
    --to=krishna2@chelsio.com \
    --cc=Chaitanya.Kulkarni@wdc.com \
    --cc=bharat@chelsio.com \
    --cc=hch@lst.de \
    --cc=jgg@mellanox.com \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=maxg@mellanox.com \
    --cc=nirranjan@chelsio.com \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.