From: Long Li <longli-Lp/cVzEoVyZiJJESP9tAQJZ3qXmFLfmx@public.gmane.org>
To: Steve French <sfrench-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>,
linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
samba-technical-w/Ol4Ecudpl8XjKLYN78aQ@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
Tom Talpey <ttalpey-0li6OtcxBFHby3iVrkZq2A@public.gmane.org>,
Matthew Wilcox <mawilcox-0li6OtcxBFHby3iVrkZq2A@public.gmane.org>
Cc: Long Li <longli-0li6OtcxBFHby3iVrkZq2A@public.gmane.org>
Subject: [Patch v4 21/22] CIFS: SMBD: Upper layer performs SMB read via RDMA write through memory registration
Date: Sun, 1 Oct 2017 19:30:29 -0700
Message-ID: <20171002023030.3582-22-longli@exchange.microsoft.com>
In-Reply-To: <20171002023030.3582-1-longli-Lp/cVzEoVyZiJJESP9tAQJZ3qXmFLfmx@public.gmane.org>
From: Long Li <longli-0li6OtcxBFHby3iVrkZq2A@public.gmane.org>

If the I/O size is at least rdma_readwrite_threshold, use RDMA write for
SMB read by specifying channel SMB2_CHANNEL_RDMA_V1 or
SMB2_CHANNEL_RDMA_V1_INVALIDATE in the SMB packet, depending on the SMB
dialect in use. Append an smbd_buffer_descriptor_v1 to the end of the SMB
packet and fill in ReadChannelInfoOffset and ReadChannelInfoLength to
indicate that this SMB read uses RDMA write.
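
As a rough illustration of the selection logic described above, here is a
user-space sketch, not the kernel code. pick_read_channel() and
append_descriptor_v1() are hypothetical helpers. The channel and dialect
values are the ones I understand MS-SMB2 to define, and the 16-byte
little-endian layout (64-bit offset, 32-bit token, 32-bit length) is assumed
to mirror smbd_buffer_descriptor_v1. The threshold comparison uses >=, as the
diff in this patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Channel values for the SMB2 READ request (assumed from MS-SMB2). */
enum {
	SMB2_CHANNEL_NONE               = 0x00000000,
	SMB2_CHANNEL_RDMA_V1            = 0x00000001,
	SMB2_CHANNEL_RDMA_V1_INVALIDATE = 0x00000002,
};

#define SMB30_PROT_ID 0x0300	/* SMB 3.0 dialect revision */
#define DESC_V1_SIZE  16	/* offset (8) + token (4) + length (4) */

/* Hypothetical helper: which channel would a read of `bytes` use? */
static uint32_t pick_read_channel(size_t bytes, size_t threshold,
				  uint16_t dialect)
{
	if (bytes < threshold)
		return SMB2_CHANNEL_NONE;	/* small I/O: no RDMA write */
	/* SMB 3.0 gets plain RDMA_V1; other dialects ask for invalidation. */
	return dialect == SMB30_PROT_ID ? SMB2_CHANNEL_RDMA_V1
					: SMB2_CHANNEL_RDMA_V1_INVALIDATE;
}

static void put_le32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

static void put_le64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Hypothetical helper: append a v1 buffer descriptor at byte `len` of `pkt`
 * and return the new packet length.
 */
static size_t append_descriptor_v1(uint8_t *pkt, size_t len, size_t cap,
				   uint64_t offset, uint32_t token,
				   uint32_t length)
{
	assert(len + DESC_V1_SIZE <= cap);
	put_le64(pkt + len, offset);
	put_le32(pkt + len + 8, token);
	put_le32(pkt + len + 12, length);
	return len + DESC_V1_SIZE;
}
```

The patch itself adds sizeof(*v1) - 1 to the total length because the request
structure already accounts for a one-byte Buffer placeholder; the sketch
sidesteps that by appending at an explicit offset.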

There is no need to read the incoming payload from the transport. By the
time the SMB read response comes back, the data has already been transferred
and placed in the pages by the RDMA hardware.
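
To make the point about skipping the transport concrete, here is a simplified
user-space model of the page-filling loop. It is only a sketch: struct
read_ctx, read_page_from_transport() and fill_pages() are hypothetical
stand-ins, the 4096-byte page size is an assumption, and the iov_iter path of
the real functions is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SZ 4096u	/* assumed page size for this sketch */

struct read_ctx {
	bool   has_mr;		/* an RDMA memory region is registered */
	size_t transport_reads;	/* times we had to read from the transport */
};

/* Stand-in for a socket read: pretend all n bytes arrived. */
static long read_page_from_transport(struct read_ctx *ctx, size_t n)
{
	ctx->transport_reads++;
	return (long)n;
}

/*
 * Account for `len` bytes spread over up to `nr_pages` pages. Returns the
 * number of bytes accounted for and stores the size of the last page used
 * in *tailsz.
 */
static size_t fill_pages(struct read_ctx *ctx, size_t len, size_t nr_pages,
			 size_t *tailsz)
{
	size_t got = 0;

	assert(ctx != NULL && tailsz != NULL);
	*tailsz = PAGE_SZ;
	for (size_t i = 0; i < nr_pages && len > 0; i++) {
		size_t n = len < PAGE_SZ ? len : PAGE_SZ;
		long result;

		if (ctx->has_mr)
			result = (long)n;	/* already placed by RDMA */
		else
			result = read_page_from_transport(ctx, n);
		if (result < 0)
			break;
		got += (size_t)result;
		len -= n;
		*tailsz = n;
	}
	return got;
}
```

With has_mr set, a 10000-byte read over three pages is fully accounted for
without a single transport read; with has_mr clear, the same call reads from
the transport once per page.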

When the SMB read finishes, deregister the memory region if RDMA write was
used for this SMB read. smbd_deregister_mr may need to do local invalidation
and sleep if server remote invalidation is not used.

There are situations where the MID may not be created on I/O failure; in
that case the memory region is deregistered when the read data context is
released.

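
The two deregistration paths described above can be modelled in user space as
follows. This is a sketch under stated assumptions: struct mr_pool,
register_mr(), deregister_mr(), on_read_complete() and on_release() are
hypothetical names, and a simple counter stands in for the limited supply of
memory regions. The point it tries to show is that clearing the pointer after
deregistration makes the second path a no-op.

```c
#include <assert.h>
#include <stddef.h>

struct mr_pool { int available; };	/* limited supply of MRs */
struct mr { struct mr_pool *pool; };
struct read_data { struct mr *mr; };	/* stand-in for the read context */

/* Take one MR from the pool, or return NULL if none is left. */
static struct mr *register_mr(struct mr_pool *pool, struct mr *slot)
{
	if (pool->available == 0)
		return NULL;
	pool->available--;
	slot->pool = pool;
	return slot;
}

/* Give the MR back to the pool. */
static void deregister_mr(struct mr *mr)
{
	assert(mr->pool != NULL);
	mr->pool->available++;
}

/* Deregister at most once: the pointer is cleared after the first call. */
static void release_mr_once(struct read_data *rd)
{
	if (rd->mr) {
		deregister_mr(rd->mr);
		rd->mr = NULL;
	}
}

/* Normal path: the read completed and its callback ran. */
static void on_read_complete(struct read_data *rd)
{
	release_mr_once(rd);
}

/* Fallback path: the read context is released, possibly with no callback. */
static void on_release(struct read_data *rd)
{
	release_mr_once(rd);
}
```

On the normal path both functions run and the MR is returned exactly once; on
the failure path only on_release() runs and the MR is still returned.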
Signed-off-by: Long Li <longli-0li6OtcxBFHby3iVrkZq2A@public.gmane.org>
---
fs/cifs/cifsglob.h | 1 +
fs/cifs/file.c | 10 ++++++++++
fs/cifs/smb2pdu.c | 43 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 54 insertions(+)
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index f851b50..30b99a5 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1152,6 +1152,7 @@ struct cifs_readdata {
struct cifs_readdata *rdata,
struct iov_iter *iter);
struct kvec iov[2];
+ struct smbd_mr *mr;
unsigned int pagesz;
unsigned int tailsz;
unsigned int credits;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 0786f19..8396f1e 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -42,6 +42,7 @@
#include "cifs_debug.h"
#include "cifs_fs_sb.h"
#include "fscache.h"
+#include "smbdirect.h"
static inline int cifs_convert_flags(unsigned int flags)
@@ -2909,6 +2910,11 @@ cifs_readdata_release(struct kref *refcount)
struct cifs_readdata *rdata = container_of(refcount,
struct cifs_readdata, refcount);
+ if (rdata->mr) {
+ smbd_deregister_mr(rdata->mr);
+ rdata->mr = NULL;
+ }
+
if (rdata->cfile)
cifsFileInfo_put(rdata->cfile);
@@ -3037,6 +3043,8 @@ uncached_fill_pages(struct TCP_Server_Info *server,
}
if (iter)
result = copy_page_from_iter(page, 0, n, iter);
+ else if (rdata->mr)
+ result = n;
else
result = cifs_read_page_from_socket(server, page, n);
if (result < 0)
@@ -3606,6 +3614,8 @@ readpages_fill_pages(struct TCP_Server_Info *server,
if (iter)
result = copy_page_from_iter(page, 0, n, iter);
+ else if (rdata->mr)
+ result = n;
else
result = cifs_read_page_from_socket(server, page, n);
if (result < 0)
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 7053db9..31dcee0 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -2380,6 +2380,39 @@ smb2_new_read_req(void **buf, unsigned int *total_len,
req->Length = cpu_to_le32(io_parms->length);
req->Offset = cpu_to_le64(io_parms->offset);
+ /*
+ * If we want to do an RDMA write, fill in and append
+ * smbd_buffer_descriptor_v1 to the end of the read request
+ */
+ if (server->rdma && rdata &&
+ rdata->bytes >= server->smbd_conn->rdma_readwrite_threshold) {
+
+ struct smbd_buffer_descriptor_v1 *v1;
+ bool need_invalidate =
+ io_parms->tcon->ses->server->dialect == SMB30_PROT_ID;
+
+ rdata->mr = smbd_register_mr(
+ server->smbd_conn, rdata->pages,
+ rdata->nr_pages, rdata->tailsz,
+ true, need_invalidate);
+ if (!rdata->mr)
+ return -ENOBUFS;
+
+ req->Channel = SMB2_CHANNEL_RDMA_V1_INVALIDATE;
+ if (need_invalidate)
+ req->Channel = SMB2_CHANNEL_RDMA_V1;
+ req->ReadChannelInfoOffset =
+ offsetof(struct smb2_read_plain_req, Buffer);
+ req->ReadChannelInfoLength =
+ sizeof(struct smbd_buffer_descriptor_v1);
+ v1 = (struct smbd_buffer_descriptor_v1 *) &req->Buffer[0];
+ v1->offset = rdata->mr->mr->iova;
+ v1->token = rdata->mr->mr->rkey;
+ v1->length = rdata->mr->mr->length;
+
+ *total_len += sizeof(*v1) - 1;
+ }
+
if (request_type & CHAINED_REQUEST) {
if (!(request_type & END_OF_CHAIN)) {
/* next 8-byte aligned request */
@@ -2459,6 +2492,16 @@ smb2_readv_callback(struct mid_q_entry *mid)
rdata->result = -EIO;
}
+ /*
+ * If this rdata has a memory region registered, the MR can be freed.
+ * The MR needs to be freed as soon as I/O finishes to prevent deadlock,
+ * because MRs are limited in number and are needed for future I/Os.
+ */
+ if (rdata->mr) {
+ smbd_deregister_mr(rdata->mr);
+ rdata->mr = NULL;
+ }
+
if (rdata->result)
cifs_stats_fail_inc(tcon, SMB2_READ_HE);
--
2.7.4