From mboxrd@z Thu Jan  1 00:00:00 1970
From: james_p_freyensee@linux.intel.com (J Freyensee)
Date: Thu, 11 Aug 2016 09:35:29 -0700
Subject: [PATCH RFC 2/4] nvme-rdma: fix sqsize/hsqsize/hrqsize per spec
In-Reply-To:
References: <1470888438-823-1-git-send-email-james_p_freyensee@linux.intel.com>
 <1470888438-823-3-git-send-email-james_p_freyensee@linux.intel.com>
Message-ID: <1470933329.2796.3.camel@linux.intel.com>

On Thu, 2016-08-11 at 10:03 +0300, Sagi Grimberg wrote:
> 
> On 11/08/16 07:07, Jay Freyensee wrote:
> > 
> > Per the NVMe-over-Fabrics 1.0 spec, sqsize is represented as
> > a 0-based value.
> > 
> > Also per spec, the RDMA binding values shall be set
> > to sqsize, which makes hsqsize a 0-based value as well.
> > 
> > Also per spec, though not stated very clearly, hrqsize is +1
> > of hsqsize.
> > 
> > Thus, the sqsize during NVMf connect() is now:
> > 
> > [root@fedora23-fabrics-host1 for-48]# dmesg
> > [  318.720645] nvme_fabrics: nvmf_connect_admin_queue(): sqsize for admin queue: 31
> > [  318.720884] nvme nvme0: creating 16 I/O queues.
> > [  318.810114] nvme_fabrics: nvmf_connect_io_queue(): sqsize for i/o queue: 127
> > 
> > Reported-by: Daniel Verkamp
> > Signed-off-by: Jay Freyensee
> > ---
> >  drivers/nvme/host/rdma.c | 19 ++++++++++++++++---
> >  1 file changed, 16 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> > index 3e3ce2b..8be64f1 100644
> > --- a/drivers/nvme/host/rdma.c
> > +++ b/drivers/nvme/host/rdma.c
> > @@ -1284,8 +1284,21 @@ static int nvme_rdma_route_resolved(struct nvme_rdma_queue *queue)
> > 
> >  	priv.recfmt = cpu_to_le16(NVME_RDMA_CM_FMT_1_0);
> >  	priv.qid = cpu_to_le16(nvme_rdma_queue_idx(queue));
> > -	priv.hrqsize = cpu_to_le16(queue->queue_size);
> > -	priv.hsqsize = cpu_to_le16(queue->queue_size);
> > +
> > +	/*
> > +	 * On one end, the fabrics spec is pretty clear that
> > +	 * hsqsize variables shall be set to the value of sqsize,
> > +	 * which is a 0-based number. What is confusing is the value for
> > +	 * hrqsize.  After clarification from NVMe spec committee member,
> > +	 * the minimum value of hrqsize is hsqsize+1.
> > +	 */
> > +	if (priv.qid == 0) {
> > +		priv.hsqsize = cpu_to_le16(queue->ctrl->ctrl.admin_sqsize);
> > +		priv.hrqsize = cpu_to_le16(queue->ctrl->ctrl.admin_sqsize+1);
> > +	} else {
> > +		priv.hsqsize = cpu_to_le16(queue->ctrl->ctrl.sqsize);
> > +		priv.hrqsize = cpu_to_le16(queue->ctrl->ctrl.sqsize+1);
> > +	}
> 
> Huh? (scratch...) using priv.hrqsize = priv.hsqsize+1 is pointless.

It may be pointless, but Dave said that is the current interpretation of
the NVMe-over-Fabrics spec (which I don't really understand either).

> 
> We expose to the block layer X and we send to the target X-1 and
> the target does X+1 (looks goofy, but ok). We also size our RDMA
> send/recv according to X, so why on earth would we want to tell the
> target we have a recv queue of size X+1?

Could be the reason I see KATO timeouts and then the kernel crashing...
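
A minimal, self-contained sketch of the size accounting being argued about
above (an illustration, not the kernel code: the struct name rdma_cm_priv,
the depth values, and the printf output are stand-ins). It shows a 1-based
host queue depth, the 0-based sqsize reported in the dmesg output quoted in
the patch (31 and 127), and the hsqsize/hrqsize values the patch's
interpretation would place in the RDMA CM private data:

/*
 * Standalone sketch (hypothetical, not drivers/nvme/host/rdma.c):
 * 1-based host queue depth vs. the 0-based sqsize sent in Connect
 * (matching the 31 and 127 in the dmesg above), plus the hsqsize and
 * hrqsize values under the patch's "hrqsize = hsqsize + 1" reading.
 */
#include <stdint.h>
#include <stdio.h>

struct rdma_cm_priv {			/* stand-in for the CM private data */
	uint16_t hsqsize;
	uint16_t hrqsize;
};

int main(void)
{
	uint16_t depths[] = { 32, 128 };	/* 1-based admin and I/O queue depths */
	struct rdma_cm_priv priv;
	int i;

	for (i = 0; i < 2; i++) {
		uint16_t sqsize = depths[i] - 1;	/* 0-based sqsize sent in Connect */

		priv.hsqsize = sqsize;			/* patch: hsqsize mirrors sqsize */
		priv.hrqsize = sqsize + 1;		/* patch: hrqsize = hsqsize + 1 */

		printf("depth=%u sqsize=%u hsqsize=%u hrqsize=%u\n",
		       depths[i], sqsize, priv.hsqsize, priv.hrqsize);
	}
	return 0;
}

Sagi's objection in the thread is that the host's actual RDMA send/recv
queues are sized to the 1-based depth, so advertising a receive queue of
depth + 1 claims one more receive slot than the host really posts.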