linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: trondmy@kernel.org
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH] NFS: enable nconnect for RDMA
Date: Sun, 3 Mar 2024 13:35:06 -0500	[thread overview]
Message-ID: <ZeTC2h59TXUTuCh_@manet.1015granger.net> (raw)
In-Reply-To: <20240228213523.35819-1-trondmy@kernel.org>

On Wed, Feb 28, 2024 at 04:35:23PM -0500, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
> 
> It appears that in certain cases, RDMA capable transports can benefit
> from the ability to establish multiple connections to increase their
> throughput. This patch therefore enables the use of the "nconnect" mount
> option for those use cases.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

No objection to this patch.

You don't mention here if you have root-caused the throughput issue.
One thing I've noticed is that contention for the transport's
queue_lock is holding back the RPC/RDMA Receive completion handler,
which is single-threaded per transport.

A way to mitigate this would be to replace the recv_queue
R-B tree with a data structure that can perform a lookup while
holding only the RCU read lock. I have experimented with using an
xarray for the recv_queue, and saw improvement.

The workload on that data structure is different for TCP versus
RDMA, though: on RDMA, the number of RPCs in flight is significantly
smaller. For tens of thousands of RPCs in flight, an xarray might
not scale well. The newer Maple tree or rose bush hash, or maybe a
more classic data structure like rhashtable, might handle this
workload better.

It's also worth considering deleting each RPC from the recv_queue
in a less performance-sensitive context, for example, xprt_release,
rather than in xprt_complete_rqst.


> ---
>  fs/nfs/nfs3client.c | 1 +
>  fs/nfs/nfs4client.c | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/fs/nfs/nfs3client.c b/fs/nfs/nfs3client.c
> index 674c012868b1..b0c8a39c2bbd 100644
> --- a/fs/nfs/nfs3client.c
> +++ b/fs/nfs/nfs3client.c
> @@ -111,6 +111,7 @@ struct nfs_client *nfs3_set_ds_client(struct nfs_server *mds_srv,
>  	cl_init.hostname = buf;
>  
>  	switch (ds_proto) {
> +	case XPRT_TRANSPORT_RDMA:
>  	case XPRT_TRANSPORT_TCP:
>  	case XPRT_TRANSPORT_TCP_TLS:
>  		if (mds_clp->cl_nconnect > 1)
> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> index 11e3a285594c..84573df5cf5a 100644
> --- a/fs/nfs/nfs4client.c
> +++ b/fs/nfs/nfs4client.c
> @@ -924,6 +924,7 @@ static int nfs4_set_client(struct nfs_server *server,
>  	else
>  		cl_init.max_connect = max_connect;
>  	switch (proto) {
> +	case XPRT_TRANSPORT_RDMA:
>  	case XPRT_TRANSPORT_TCP:
>  	case XPRT_TRANSPORT_TCP_TLS:
>  		cl_init.nconnect = nconnect;
> @@ -1000,6 +1001,7 @@ struct nfs_client *nfs4_set_ds_client(struct nfs_server *mds_srv,
>  	cl_init.hostname = buf;
>  
>  	switch (ds_proto) {
> +	case XPRT_TRANSPORT_RDMA:
>  	case XPRT_TRANSPORT_TCP:
>  	case XPRT_TRANSPORT_TCP_TLS:
>  		if (mds_clp->cl_nconnect > 1) {
> -- 
> 2.44.0
> 
> 

-- 
Chuck Lever

  reply	other threads:[~2024-03-03 18:35 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-28 21:35 [PATCH] NFS: enable nconnect for RDMA trondmy
2024-03-03 18:35 ` Chuck Lever [this message]
2024-03-04 19:01   ` Olga Kornievskaia
2024-03-04 19:32     ` Chuck Lever III
2024-03-04 20:25       ` Olga Kornievskaia
2024-03-04 23:08       ` Trond Myklebust
2024-03-05 14:02         ` Chuck Lever

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZeTC2h59TXUTuCh_@manet.1015granger.net \
    --to=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).