public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] For 2.6.35
@ 2010-04-21 17:42 Chuck Lever
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:42 UTC (permalink / raw)
  To: linux-nfs

I'd like these patches to be considered for 2.6.35.

The patches posted last week that add kobjects for the NLM host and
NSM handle caches are also candidates for 2.6.35.  However, I'd like
to have a clear plan for how statd will use this new API before they
are committed upstream, to avoid kernel-user space API churn down
the road.

The first patch in the present series is the previously posted fix for
NFSv4 rsize and wsize.  If this patch hasn't already been accepted for
2.6.34 and prior kernels, please consider it for 2.6.35.

The final four patches in this series are meant to improve the
precision of RPC RTTs used for RTT estimation and reported via
/proc/mountstats.

---

Chuck Lever (9):
      SUNRPC: Replace jiffies-based metrics with ktime-based metrics
      ktime: introduce ktime_to_ms()
      SUNRPC: RPC metrics and RTT estimator should use same RTT value
      NFS: Calldata for nfs4_renew_done()
      NFS: Squelch compiler warning in nfs_add_server_stats()
      NFS: Clean up fscache_uniq mount option
      NFS: Squelch compiler warning
      SUNRPC: Trivial cleanups in include/linux/sunrpc/xdr.h
      NFS: rsize and wsize settings ignored on v4 mounts


 fs/nfs/client.c                |    2 ++
 fs/nfs/fscache.c               |    3 ++-
 fs/nfs/iostat.h                |    6 +++---
 fs/nfs/nfs4proc.c              |   26 ++++++++++++++++++++------
 fs/nfs/super.c                 |   18 +++++++++---------
 include/linux/ktime.h          |   10 ++++++++--
 include/linux/sunrpc/metrics.h |    7 ++++---
 include/linux/sunrpc/sched.h   |    5 +++--
 include/linux/sunrpc/xdr.h     |    8 +++++---
 include/linux/sunrpc/xprt.h    |    4 ++--
 net/sunrpc/sched.c             |    2 +-
 net/sunrpc/stats.c             |   27 +++++++++------------------
 net/sunrpc/xprt.c              |   19 ++++++++-----------
 net/sunrpc/xprtsock.c          |    1 -
 14 files changed, 76 insertions(+), 62 deletions(-)

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/9] NFS: rsize and wsize settings ignored on v4 mounts
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2010-04-21 17:43   ` Chuck Lever
  2010-04-21 17:43   ` [PATCH 2/9] SUNRPC: Trivial cleanups in include/linux/sunrpc/xdr.h Chuck Lever
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:43 UTC (permalink / raw)
  To: linux-nfs

NFSv4 mounts ignore the rsize and wsize mount options, and always use
the default transfer size for both.  This seems to be because all
NFSv4 mounts are now cloned, and the cloning logic doesn't copy the
rsize and wsize settings from the parent nfs_server.

I tested Fedora's 2.6.32.11-99 and it seems to have this problem as
well, so I'm guessing that .33, .32, and perhaps older kernels have
this issue as well.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Stable <stable@kernel.org>
---

 fs/nfs/client.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index a8766c4..acc9c49 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -966,6 +966,8 @@ out_error:
 static void nfs_server_copy_userdata(struct nfs_server *target, struct nfs_server *source)
 {
 	target->flags = source->flags;
+	target->rsize = source->rsize;
+	target->wsize = source->wsize;
 	target->acregmin = source->acregmin;
 	target->acregmax = source->acregmax;
 	target->acdirmin = source->acdirmin;


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/9] SUNRPC: Trivial cleanups in include/linux/sunrpc/xdr.h
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  2010-04-21 17:43   ` [PATCH 1/9] NFS: rsize and wsize settings ignored on v4 mounts Chuck Lever
@ 2010-04-21 17:43   ` Chuck Lever
  2010-04-21 17:43   ` [PATCH 3/9] NFS: Squelch compiler warning Chuck Lever
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:43 UTC (permalink / raw)
  To: linux-nfs

Clean up: Update the documenting comment, and fix some minor white
space issues.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 include/linux/sunrpc/xdr.h |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index f5cc089..35cf2e8 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -1,7 +1,10 @@
 /*
- * include/linux/sunrpc/xdr.h
+ * XDR standard data types and function declarations
  *
  * Copyright (C) 1995-1997 Olaf Kirch <okir-pn4DOG8n3UYbFoVRYvo4fw@public.gmane.org>
+ *
+ * Based on:
+ *   RFC 4506 "XDR: External Data Representation Standard", May 2006
  */
 
 #ifndef _SUNRPC_XDR_H_
@@ -62,7 +65,6 @@ struct xdr_buf {
 
 	unsigned int	buflen,		/* Total length of storage buffer */
 			len;		/* Length of XDR encoded message */
-
 };
 
 /*
@@ -178,7 +180,7 @@ struct xdr_array2_desc {
 };
 
 extern int xdr_decode_array2(struct xdr_buf *buf, unsigned int base,
-                             struct xdr_array2_desc *desc);
+			     struct xdr_array2_desc *desc);
 extern int xdr_encode_array2(struct xdr_buf *buf, unsigned int base,
 			     struct xdr_array2_desc *desc);
 


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/9] NFS: Squelch compiler warning
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  2010-04-21 17:43   ` [PATCH 1/9] NFS: rsize and wsize settings ignored on v4 mounts Chuck Lever
  2010-04-21 17:43   ` [PATCH 2/9] SUNRPC: Trivial cleanups in include/linux/sunrpc/xdr.h Chuck Lever
@ 2010-04-21 17:43   ` Chuck Lever
  2010-04-21 17:43   ` [PATCH 4/9] NFS: Clean up fscache_uniq mount option Chuck Lever
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:43 UTC (permalink / raw)
  To: linux-nfs

Seen with -Wextra:

/home/cel/linux/fs/nfs/fscache.c: In function =E2=80=98__nfs_readpages_=
from_fscache=E2=80=99:
/home/cel/linux/fs/nfs/fscache.c:479: warning: comparison between signe=
d and unsigned integer expressions

The comparison implicitly converts "int" to "unsigned", making it
safe.  But there's no need for the implicit type conversions here, and
the dfprintk() already uses a "%u" formatter for "npages."  Better to
reduce confusion.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 fs/nfs/fscache.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index a6b16ed..ce153a6 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -467,7 +467,8 @@ int __nfs_readpages_from_fscache(struct nfs_open_co=
ntext *ctx,
 				 struct list_head *pages,
 				 unsigned *nr_pages)
 {
-	int ret, npages =3D *nr_pages;
+	unsigned npages =3D *nr_pages;
+	int ret;
=20
 	dfprintk(FSCACHE, "NFS: nfs_getpages_from_fscache (0x%p/%u/0x%p)\n",
 		 NFS_I(inode)->fscache, npages, inode);


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/9] NFS: Clean up fscache_uniq mount option
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
                     ` (2 preceding siblings ...)
  2010-04-21 17:43   ` [PATCH 3/9] NFS: Squelch compiler warning Chuck Lever
@ 2010-04-21 17:43   ` Chuck Lever
  2010-04-21 17:43   ` [PATCH 5/9] NFS: Squelch compiler warning in nfs_add_server_stats() Chuck Lever
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:43 UTC (permalink / raw)
  To: linux-nfs

Clean up: fscache_uniq takes a string, so it should be included
with the other string mount option definitions, by convention.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 fs/nfs/super.c |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index e016372..67a4f09 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -141,7 +141,6 @@ static const match_table_t nfs_mount_option_tokens = {
 	{ Opt_resvport, "resvport" },
 	{ Opt_noresvport, "noresvport" },
 	{ Opt_fscache, "fsc" },
-	{ Opt_fscache_uniq, "fsc=%s" },
 	{ Opt_nofscache, "nofsc" },
 
 	{ Opt_port, "port=%s" },
@@ -171,6 +170,7 @@ static const match_table_t nfs_mount_option_tokens = {
 	{ Opt_mountaddr, "mountaddr=%s" },
 
 	{ Opt_lookupcache, "lookupcache=%s" },
+	{ Opt_fscache_uniq, "fsc=%s" },
 
 	{ Opt_err, NULL }
 };
@@ -1046,14 +1046,6 @@ static int nfs_parse_mount_options(char *raw,
 			kfree(mnt->fscache_uniq);
 			mnt->fscache_uniq = NULL;
 			break;
-		case Opt_fscache_uniq:
-			string = match_strdup(args);
-			if (!string)
-				goto out_nomem;
-			kfree(mnt->fscache_uniq);
-			mnt->fscache_uniq = string;
-			mnt->options |= NFS_OPTION_FSCACHE;
-			break;
 
 		/*
 		 * options that take numeric values
@@ -1384,6 +1376,14 @@ static int nfs_parse_mount_options(char *raw,
 					return 0;
 			};
 			break;
+		case Opt_fscache_uniq:
+			string = match_strdup(args);
+			if (string == NULL)
+				goto out_nomem;
+			kfree(mnt->fscache_uniq);
+			mnt->fscache_uniq = string;
+			mnt->options |= NFS_OPTION_FSCACHE;
+			break;
 
 		/*
 		 * Special options


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/9] NFS: Squelch compiler warning in nfs_add_server_stats()
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
                     ` (3 preceding siblings ...)
  2010-04-21 17:43   ` [PATCH 4/9] NFS: Clean up fscache_uniq mount option Chuck Lever
@ 2010-04-21 17:43   ` Chuck Lever
  2010-04-21 17:43   ` [PATCH 6/9] NFS: Calldata for nfs4_renew_done() Chuck Lever
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:43 UTC (permalink / raw)
  To: linux-nfs

Clean up:

fs/nfs/iostat.h: In function =E2=80=98nfs_add_server_stats=E2=80=99:
fs/nfs/iostat.h:41: warning: comparison between signed and unsigned int=
eger expressions
fs/nfs/iostat.h:41: warning: comparison between signed and unsigned int=
eger expressions
fs/nfs/iostat.h:41: warning: comparison between signed and unsigned int=
eger expressions
fs/nfs/iostat.h:41: warning: comparison between signed and unsigned int=
eger expressions

Commit fce22848 replaced the open-coded per-cpu logic in several
functions in fs/nfs/iostat.h with a single invocation of
this_cpu_ptr().  This macro assumes its second argument is signed,
not unsigned.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 fs/nfs/iostat.h |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/iostat.h b/fs/nfs/iostat.h
index 1d8d5c8..c583248 100644
--- a/fs/nfs/iostat.h
+++ b/fs/nfs/iostat.h
@@ -36,14 +36,14 @@ static inline void nfs_inc_stats(const struct inode=
 *inode,
=20
 static inline void nfs_add_server_stats(const struct nfs_server *serve=
r,
 					enum nfs_stat_bytecounters stat,
-					unsigned long addend)
+					long addend)
 {
 	this_cpu_add(server->io_stats->bytes[stat], addend);
 }
=20
 static inline void nfs_add_stats(const struct inode *inode,
 				 enum nfs_stat_bytecounters stat,
-				 unsigned long addend)
+				 long addend)
 {
 	nfs_add_server_stats(NFS_SERVER(inode), stat, addend);
 }
@@ -51,7 +51,7 @@ static inline void nfs_add_stats(const struct inode *=
inode,
 #ifdef CONFIG_NFS_FSCACHE
 static inline void nfs_add_fscache_stats(struct inode *inode,
 					 enum nfs_stat_fscachecounters stat,
-					 unsigned long addend)
+					 long addend)
 {
 	this_cpu_add(NFS_SERVER(inode)->io_stats->fscache[stat], addend);
 }


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/9] NFS: Calldata for nfs4_renew_done()
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
                     ` (4 preceding siblings ...)
  2010-04-21 17:43   ` [PATCH 5/9] NFS: Squelch compiler warning in nfs_add_server_stats() Chuck Lever
@ 2010-04-21 17:43   ` Chuck Lever
  2010-04-21 17:44   ` [PATCH 7/9] SUNRPC: RPC metrics and RTT estimator should use same RTT value Chuck Lever
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:43 UTC (permalink / raw)
  To: linux-nfs

I'm about to change task->tk_start from a jiffies value to a ktime_t
value in order to make RPC RTT reporting more precise.

Recently (commit dc96aef9) nfs4_renew_done() started to reference
task->tk_start so that a jiffies value no longer had to be passed
from nfs4_proc_async_renew().  This allowed the calldata to point to
an nfs_client instead.

Changing task->tk_start to a ktime_t value makes it effectively
useless for renew timestamps, so we need to restore the pre-dc96aef9
logic that provided a jiffies "start" timestamp to nfs4_renew_done().

Both an nfs_client pointer and a timestamp need to be passed to
nfs4_renew_done(), so create a new nfs_renewdata structure that
contains both, resembling what is already done for delegreturn,
lock, and unlock.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 fs/nfs/nfs4proc.c |   26 ++++++++++++++++++++------
 1 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6380670..9a16431 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3146,23 +3146,31 @@ static void nfs4_proc_commit_setup(struct nfs_write_data *data, struct rpc_messa
 	msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_COMMIT];
 }
 
+struct nfs4_renewdata {
+	struct nfs_client	*client;
+	unsigned long		timestamp;
+};
+
 /*
  * nfs4_proc_async_renew(): This is not one of the nfs_rpc_ops; it is a special
  * standalone procedure for queueing an asynchronous RENEW.
  */
-static void nfs4_renew_release(void *data)
+static void nfs4_renew_release(void *calldata)
 {
-	struct nfs_client *clp = data;
+	struct nfs4_renewdata *data = calldata;
+	struct nfs_client *clp = data->client;
 
 	if (atomic_read(&clp->cl_count) > 1)
 		nfs4_schedule_state_renewal(clp);
 	nfs_put_client(clp);
+	kfree(data);
 }
 
-static void nfs4_renew_done(struct rpc_task *task, void *data)
+static void nfs4_renew_done(struct rpc_task *task, void *calldata)
 {
-	struct nfs_client *clp = data;
-	unsigned long timestamp = task->tk_start;
+	struct nfs4_renewdata *data = calldata;
+	struct nfs_client *clp = data->client;
+	unsigned long timestamp = data->timestamp;
 
 	if (task->tk_status < 0) {
 		/* Unless we're shutting down, schedule state recovery! */
@@ -3188,11 +3196,17 @@ int nfs4_proc_async_renew(struct nfs_client *clp, struct rpc_cred *cred)
 		.rpc_argp	= clp,
 		.rpc_cred	= cred,
 	};
+	struct nfs4_renewdata *data;
 
 	if (!atomic_inc_not_zero(&clp->cl_count))
 		return -EIO;
+	data = kmalloc(sizeof(*data), GFP_KERNEL);
+	if (data == NULL)
+		return -ENOMEM;
+	data->client = clp;
+	data->timestamp = jiffies;
 	return rpc_call_async(clp->cl_rpcclient, &msg, RPC_TASK_SOFT,
-			&nfs4_renew_ops, clp);
+			&nfs4_renew_ops, (void *)data);
 }
 
 int nfs4_proc_renew(struct nfs_client *clp, struct rpc_cred *cred)


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 7/9] SUNRPC: RPC metrics and RTT estimator should use same RTT value
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
                     ` (5 preceding siblings ...)
  2010-04-21 17:43   ` [PATCH 6/9] NFS: Calldata for nfs4_renew_done() Chuck Lever
@ 2010-04-21 17:44   ` Chuck Lever
  2010-04-21 17:44   ` [PATCH 8/9] ktime: introduce ktime_to_ms() Chuck Lever
  2010-04-21 17:44   ` [PATCH 9/9] SUNRPC: Replace jiffies-based metrics with ktime-based metrics Chuck Lever
  8 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:44 UTC (permalink / raw)
  To: linux-nfs

Compute an RPC request's RTT once, and use that value both for reporting
RPC metrics, and for adjusting the RTT context used by the RPC client's RTT
estimator algorithm.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 include/linux/sunrpc/xprt.h |    1 -
 net/sunrpc/xprt.c           |   13 ++++---------
 net/sunrpc/xprtsock.c       |    1 -
 3 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 6f9457a..94bfee0 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -294,7 +294,6 @@ void			xprt_set_retrans_timeout_rtt(struct rpc_task *task);
 void			xprt_wake_pending_tasks(struct rpc_xprt *xprt, int status);
 void			xprt_wait_for_buffer_space(struct rpc_task *task, rpc_action action);
 void			xprt_write_space(struct rpc_xprt *xprt);
-void			xprt_update_rtt(struct rpc_task *task);
 void			xprt_adjust_cwnd(struct rpc_task *task, int result);
 struct rpc_rqst *	xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
 void			xprt_complete_rqst(struct rpc_task *task, int copied);
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 42f09ad..5559ec4 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -771,12 +771,7 @@ struct rpc_rqst *xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid)
 }
 EXPORT_SYMBOL_GPL(xprt_lookup_rqst);
 
-/**
- * xprt_update_rtt - update an RPC client's RTT state after receiving a reply
- * @task: RPC request that recently completed
- *
- */
-void xprt_update_rtt(struct rpc_task *task)
+static void xprt_update_rtt(struct rpc_task *task)
 {
 	struct rpc_rqst *req = task->tk_rqstp;
 	struct rpc_rtt *rtt = task->tk_client->cl_rtt;
@@ -784,12 +779,10 @@ void xprt_update_rtt(struct rpc_task *task)
 
 	if (timer) {
 		if (req->rq_ntrans == 1)
-			rpc_update_rtt(rtt, timer,
-					(long)jiffies - req->rq_xtime);
+			rpc_update_rtt(rtt, timer, task->tk_rtt);
 		rpc_set_timeo(rtt, timer, req->rq_ntrans - 1);
 	}
 }
-EXPORT_SYMBOL_GPL(xprt_update_rtt);
 
 /**
  * xprt_complete_rqst - called when reply processing is complete
@@ -808,6 +801,8 @@ void xprt_complete_rqst(struct rpc_task *task, int copied)
 
 	xprt->stat.recvs++;
 	task->tk_rtt = (long)jiffies - req->rq_xtime;
+	if (xprt->ops->timer != NULL)
+		xprt_update_rtt(task);
 
 	list_del_init(&req->rq_list);
 	req->rq_private_buf.len = copied;
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 9847c30..38203c8 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -858,7 +858,6 @@ static void xs_udp_data_ready(struct sock *sk, int len)
 	dst_confirm(skb_dst(skb));
 
 	xprt_adjust_cwnd(task, copied);
-	xprt_update_rtt(task);
 	xprt_complete_rqst(task, copied);
 
  out_unlock:


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 8/9] ktime: introduce ktime_to_ms()
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
                     ` (6 preceding siblings ...)
  2010-04-21 17:44   ` [PATCH 7/9] SUNRPC: RPC metrics and RTT estimator should use same RTT value Chuck Lever
@ 2010-04-21 17:44   ` Chuck Lever
  2010-04-21 17:44   ` [PATCH 9/9] SUNRPC: Replace jiffies-based metrics with ktime-based metrics Chuck Lever
  8 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:44 UTC (permalink / raw)
  To: linux-nfs

To report ktime statistics to user space in milliseconds, a new helper
is required.

When considering how to do this conversion, I didn't immediately see
why the extra step of converting ktime to a timeval was needed.  To
make that more clear, introduce a couple of large comments.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 include/linux/ktime.h |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/ktime.h b/include/linux/ktime.h
index ce59832..e1ceaa9 100644
--- a/include/linux/ktime.h
+++ b/include/linux/ktime.h
@@ -130,7 +130,7 @@ static inline ktime_t timeval_to_ktime(struct timeval tv)
 /* Convert ktime_t to nanoseconds - NOP in the scalar storage format: */
 #define ktime_to_ns(kt)			((kt).tv64)
 
-#else
+#else	/* !((BITS_PER_LONG == 64) || defined(CONFIG_KTIME_SCALAR)) */
 
 /*
  * Helper macros/inlines to get the ktime_t math right in the timespec
@@ -275,7 +275,7 @@ static inline s64 ktime_to_ns(const ktime_t kt)
 	return (s64) kt.tv.sec * NSEC_PER_SEC + kt.tv.nsec;
 }
 
-#endif
+#endif	/* !((BITS_PER_LONG == 64) || defined(CONFIG_KTIME_SCALAR)) */
 
 /**
  * ktime_equal - Compares two ktime_t variables to see if they are equal
@@ -295,6 +295,12 @@ static inline s64 ktime_to_us(const ktime_t kt)
 	return (s64) tv.tv_sec * USEC_PER_SEC + tv.tv_usec;
 }
 
+static inline s64 ktime_to_ms(const ktime_t kt)
+{
+	struct timeval tv = ktime_to_timeval(kt);
+	return (s64) tv.tv_sec * MSEC_PER_SEC + tv.tv_usec / USEC_PER_MSEC;
+}
+
 static inline s64 ktime_us_delta(const ktime_t later, const ktime_t earlier)
 {
        return ktime_to_us(ktime_sub(later, earlier));


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 9/9] SUNRPC: Replace jiffies-based metrics with ktime-based metrics
       [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
                     ` (7 preceding siblings ...)
  2010-04-21 17:44   ` [PATCH 8/9] ktime: introduce ktime_to_ms() Chuck Lever
@ 2010-04-21 17:44   ` Chuck Lever
  8 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-04-21 17:44 UTC (permalink / raw)
  To: linux-nfs

Currently RPC performance metrics that tabulate elapsed time use
jiffies time values.  This is problematic on systems that use slow
jiffies (for instance 100HZ systems built for paravirtualized
environments).  It is also a problem for computing precise latency
statistics for advanced network transports, such as InfiniBand,
that can have round-trip latencies significanly faster than a single
clock tick.

For the RPC client, adopt the high resolution time stamp mechanism
already used by the network layer and blktrace: ktime.

We use ktime format time stamps for all internal computations, and
convert to milliseconds for presentation.  As a result, we need only
addition operations in the performance critical paths; multiply/divide
is required only for presentation.

We could report RTT metrics in microseconds.  In fact the mountstats
format is versioned to accomodate exactly this kind of interface
improvement.

For now, however, we'll stay with millisecond precision for
presentation to maintain backwards compatibility with the handful of
currently deployed user space tools.  At a later point, we'll move to
an API such as BDI_STATS where a finer timestamp precision can be
reported.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 include/linux/sunrpc/metrics.h |    7 ++++---
 include/linux/sunrpc/sched.h   |    5 +++--
 include/linux/sunrpc/xprt.h    |    3 ++-
 net/sunrpc/sched.c             |    2 +-
 net/sunrpc/stats.c             |   27 +++++++++------------------
 net/sunrpc/xprt.c              |    8 +++++---
 6 files changed, 24 insertions(+), 28 deletions(-)

diff --git a/include/linux/sunrpc/metrics.h b/include/linux/sunrpc/metrics.h
index 77f78e5..b6edbc0 100644
--- a/include/linux/sunrpc/metrics.h
+++ b/include/linux/sunrpc/metrics.h
@@ -26,6 +26,7 @@
 #define _LINUX_SUNRPC_METRICS_H
 
 #include <linux/seq_file.h>
+#include <linux/ktime.h>
 
 #define RPC_IOSTATS_VERS	"1.0"
 
@@ -58,9 +59,9 @@ struct rpc_iostats {
 	 * and the total time the request spent from init to release
 	 * are measured.
 	 */
-	unsigned long long	om_queue,	/* jiffies queued for xmit */
-				om_rtt,		/* jiffies for RPC RTT */
-				om_execute;	/* jiffies for RPC execution */
+	ktime_t			om_queue,	/* queued for xmit */
+				om_rtt,		/* RPC RTT */
+				om_execute;	/* RPC execution */
 } ____cacheline_aligned;
 
 struct rpc_task;
diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h
index 7bc7fd5..76720d2 100644
--- a/include/linux/sunrpc/sched.h
+++ b/include/linux/sunrpc/sched.h
@@ -10,6 +10,7 @@
 #define _LINUX_SUNRPC_SCHED_H_
 
 #include <linux/timer.h>
+#include <linux/ktime.h>
 #include <linux/sunrpc/types.h>
 #include <linux/spinlock.h>
 #include <linux/wait.h>
@@ -80,8 +81,8 @@ struct rpc_task {
 
 	unsigned short		tk_timeouts;	/* maj timeouts */
 	size_t			tk_bytes_sent;	/* total bytes sent */
-	unsigned long		tk_start;	/* RPC task init timestamp */
-	long			tk_rtt;		/* round-trip time (jiffies) */
+	ktime_t			tk_start,	/* RPC task init timestamp */
+				tk_rtt;		/* round-trip time */
 
 	pid_t			tk_owner;	/* Process id for batching tasks */
 	unsigned char		tk_priority : 2;/* Task priority */
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 94bfee0..f54aee0 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -13,6 +13,7 @@
 #include <linux/socket.h>
 #include <linux/in.h>
 #include <linux/kref.h>
+#include <linux/ktime.h>
 #include <linux/sunrpc/sched.h>
 #include <linux/sunrpc/xdr.h>
 #include <linux/sunrpc/msg_prot.h>
@@ -94,7 +95,7 @@ struct rpc_rqst {
 	 */
 	u32			rq_bytes_sent;	/* Bytes we have sent */
 
-	unsigned long		rq_xtime;	/* when transmitted */
+	ktime_t			rq_xtime;	/* transmit time stamp */
 	int			rq_ntrans;
 
 #if defined(CONFIG_NFS_V4_1)
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index aae6907..4a36a57 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -834,7 +834,7 @@ static void rpc_init_task(struct rpc_task *task, const struct rpc_task_setup *ta
 	}
 
 	/* starting timestamp */
-	task->tk_start = jiffies;
+	task->tk_start = ktime_get();
 
 	dprintk("RPC:       new task initialized, procpid %u\n",
 				task_pid_nr(current));
diff --git a/net/sunrpc/stats.c b/net/sunrpc/stats.c
index 5785d20..aacd95f 100644
--- a/net/sunrpc/stats.c
+++ b/net/sunrpc/stats.c
@@ -144,7 +144,7 @@ void rpc_count_iostats(struct rpc_task *task)
 	struct rpc_rqst *req = task->tk_rqstp;
 	struct rpc_iostats *stats;
 	struct rpc_iostats *op_metrics;
-	long rtt, execute, queue;
+	ktime_t delta;
 
 	if (!task->tk_client || !task->tk_client->cl_metrics || !req)
 		return;
@@ -159,20 +159,13 @@ void rpc_count_iostats(struct rpc_task *task)
 	op_metrics->om_bytes_sent += task->tk_bytes_sent;
 	op_metrics->om_bytes_recv += req->rq_reply_bytes_recvd;
 
-	queue = (long)req->rq_xtime - task->tk_start;
-	if (queue < 0)
-		queue = -queue;
-	op_metrics->om_queue += queue;
+	delta = ktime_sub(req->rq_xtime, task->tk_start);
+	op_metrics->om_queue = ktime_add(op_metrics->om_queue, delta);
 
-	rtt = task->tk_rtt;
-	if (rtt < 0)
-		rtt = -rtt;
-	op_metrics->om_rtt += rtt;
+	op_metrics->om_rtt = ktime_add(op_metrics->om_rtt, task->tk_rtt);
 
-	execute = (long)jiffies - task->tk_start;
-	if (execute < 0)
-		execute = -execute;
-	op_metrics->om_execute += execute;
+	delta = ktime_sub(ktime_get(), task->tk_start);
+	op_metrics->om_execute = ktime_add(op_metrics->om_execute, delta);
 }
 
 static void _print_name(struct seq_file *seq, unsigned int op,
@@ -186,8 +179,6 @@ static void _print_name(struct seq_file *seq, unsigned int op,
 		seq_printf(seq, "\t%12u: ", op);
 }
 
-#define MILLISECS_PER_JIFFY	(1000 / HZ)
-
 void rpc_print_iostats(struct seq_file *seq, struct rpc_clnt *clnt)
 {
 	struct rpc_iostats *stats = clnt->cl_metrics;
@@ -214,9 +205,9 @@ void rpc_print_iostats(struct seq_file *seq, struct rpc_clnt *clnt)
 				metrics->om_timeouts,
 				metrics->om_bytes_sent,
 				metrics->om_bytes_recv,
-				metrics->om_queue * MILLISECS_PER_JIFFY,
-				metrics->om_rtt * MILLISECS_PER_JIFFY,
-				metrics->om_execute * MILLISECS_PER_JIFFY);
+				ktime_to_ms(metrics->om_queue),
+				ktime_to_ms(metrics->om_rtt),
+				ktime_to_ms(metrics->om_execute));
 	}
 }
 EXPORT_SYMBOL_GPL(rpc_print_iostats);
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 5559ec4..444ca36 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -43,6 +43,7 @@
 #include <linux/interrupt.h>
 #include <linux/workqueue.h>
 #include <linux/net.h>
+#include <linux/ktime.h>
 
 #include <linux/sunrpc/clnt.h>
 #include <linux/sunrpc/metrics.h>
@@ -776,10 +777,11 @@ static void xprt_update_rtt(struct rpc_task *task)
 	struct rpc_rqst *req = task->tk_rqstp;
 	struct rpc_rtt *rtt = task->tk_client->cl_rtt;
 	unsigned timer = task->tk_msg.rpc_proc->p_timer;
+	long m = usecs_to_jiffies(ktime_to_us(task->tk_rtt));
 
 	if (timer) {
 		if (req->rq_ntrans == 1)
-			rpc_update_rtt(rtt, timer, task->tk_rtt);
+			rpc_update_rtt(rtt, timer, m);
 		rpc_set_timeo(rtt, timer, req->rq_ntrans - 1);
 	}
 }
@@ -800,7 +802,7 @@ void xprt_complete_rqst(struct rpc_task *task, int copied)
 			task->tk_pid, ntohl(req->rq_xid), copied);
 
 	xprt->stat.recvs++;
-	task->tk_rtt = (long)jiffies - req->rq_xtime;
+	task->tk_rtt = ktime_sub(ktime_get(), req->rq_xtime);
 	if (xprt->ops->timer != NULL)
 		xprt_update_rtt(task);
 
@@ -901,7 +903,7 @@ void xprt_transmit(struct rpc_task *task)
 		return;
 
 	req->rq_connect_cookie = xprt->connect_cookie;
-	req->rq_xtime = jiffies;
+	req->rq_xtime = ktime_get();
 	status = xprt->ops->send_request(task);
 	if (status != 0) {
 		task->tk_status = status;


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-04-21 17:50 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-21 17:42 [PATCH 0/9] For 2.6.35 Chuck Lever
     [not found] ` <20100421173628.2037.54635.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-04-21 17:43   ` [PATCH 1/9] NFS: rsize and wsize settings ignored on v4 mounts Chuck Lever
2010-04-21 17:43   ` [PATCH 2/9] SUNRPC: Trivial cleanups in include/linux/sunrpc/xdr.h Chuck Lever
2010-04-21 17:43   ` [PATCH 3/9] NFS: Squelch compiler warning Chuck Lever
2010-04-21 17:43   ` [PATCH 4/9] NFS: Clean up fscache_uniq mount option Chuck Lever
2010-04-21 17:43   ` [PATCH 5/9] NFS: Squelch compiler warning in nfs_add_server_stats() Chuck Lever
2010-04-21 17:43   ` [PATCH 6/9] NFS: Calldata for nfs4_renew_done() Chuck Lever
2010-04-21 17:44   ` [PATCH 7/9] SUNRPC: RPC metrics and RTT estimator should use same RTT value Chuck Lever
2010-04-21 17:44   ` [PATCH 8/9] ktime: introduce ktime_to_ms() Chuck Lever
2010-04-21 17:44   ` [PATCH 9/9] SUNRPC: Replace jiffies-based metrics with ktime-based metrics Chuck Lever

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox