* [RFC][PATCH 0/5] NFS: trace points added to mounting path
@ 2009-01-16 16:22 Steve Dickson
2009-01-16 16:30 ` [PATCH 3/5] NFS: Adding trace points to nfs/client.c Steve Dickson
` (3 more replies)
0 siblings, 4 replies; 58+ messages in thread
From: Steve Dickson @ 2009-01-16 16:22 UTC (permalink / raw)
To: Linux NFSv4 mailing list, Linux NFS Mailing list; +Cc: SystemTAP
Hello,
Very recently, patches were added to the mainline kernel that
enable the use of trace points. This patch series takes
advantage of those patches by introducing trace points
into the mounting path of NFS mounts. It's hoped these
trace points can be used by system administrators to
identify why NFS mounts fail or hang in
production kernels.
IMHO, one general problem with today's "canned" NFS debugging is that
it becomes very verbose very quickly: "I got here" and "I got there"
style statements. They help trace the code, but they rarely
show what the actual problem was. So what I've tried to do is
"define the error paths" by putting a trace point at every error exit,
in the hope of showing where and why things broke.
So the ultimate goal would be to replace all the dprintks with trace
points while still being able to enable them through the rpcdebug command
(although we might want to think about splitting that command into three
different commands: nfsdebug, nfsddebug, and rpcdebug). Since trace points
have very little overhead, a set of trace points could be enabled in
production with little or no effect on functionality or performance.
Another advantage of trace points is the type and amount of
information that can be retrieved. With these trace points, I'm
passing in the error code as well as the data structure[s] associated
with that error. This provides the "canned" information that IT people
would use (via the rpcdebug command, which would turn on a group of
trace points) as well as the more detailed information that kernel
developers can use (via systemtap scripts, which would turn on
individual trace points).
Patch summary:
* fs/nfs/client.c
* fs/nfs/getroot.c
* fs/nfs/super.c
The base files where the trace points were added.
* include/trace/nfs.h
* kernel/trace/Makefile
* kernel/trace/trace_nfs.c
The overhead of adding the trace points and then converting them
into trace markers.
* samples/nfs/nfs_mount.stp
The systemtap script used to access the trace markers. I probably
should have documented the file better, but the first three
functions in the file show how structures are pulled from the
kernel. The rest are probes used to activate the trace markers.
Comments... Acceptance??
steved.
^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH 3/5] NFS: Adding trace points to nfs/client.c
  2009-01-16 16:22 [RFC][PATCH 0/5] NFS: trace points added to mounting path Steve Dickson
@ 2009-01-16 16:30 ` Steve Dickson
  2009-01-16 16:32   ` [PATCH 4/5] NFS: Convert trace points to trace markers Steve Dickson
  ` (2 subsequent siblings)
  3 siblings, 0 replies; 58+ messages in thread
From: Steve Dickson @ 2009-01-16 16:30 UTC (permalink / raw)
To: Linux NFSv4 mailing list, Linux NFS Mailing list; +Cc: SystemTAP

Added trace points to a number of client routines used in the mount
process. If a routine is called only by the mounting process, the
trace point name starts with 'trace_nfs_mount'; otherwise the routine
name is used as the prefix.

Signed-off-by: Steve Dickson <steved@redhat.com>

--- linux/fs/nfs/client.c.orig	2009-01-14 15:54:19.000000000 -0500
+++ linux/fs/nfs/client.c	2009-01-15 12:40:08.000000000 -0500
@@ -37,6 +37,7 @@
 #include <linux/in6.h>
 #include <net/ipv6.h>
 #include <linux/nfs_xdr.h>
+#include <trace/nfs.h>
 
 #include <asm/system.h>
 
@@ -46,6 +47,19 @@
 #include "iostat.h"
 #include "internal.h"
 
+DEFINE_TRACE(nfs_create_rpc_client);
+DEFINE_TRACE(nfs_create_rpc_client_proto);
+DEFINE_TRACE(nfs_start_lockd);
+DEFINE_TRACE(nfs_init_server_rpcclient);
+DEFINE_TRACE(nfs_init_server_rpcclient_clone);
+DEFINE_TRACE(nfs_init_server_rpcclient_auth);
+DEFINE_TRACE(nfs_mount_init_clnt);
+DEFINE_TRACE(nfs_mount_init_srv);
+DEFINE_TRACE(nfs_probe_fsinfo);
+DEFINE_TRACE(nfs_probe_fsinfo_setcaps);
+DEFINE_TRACE(nfs_probe_fsinfo_fsinfo);
+DEFINE_TRACE(nfs_create_server);
+
 #define NFSDBG_FACILITY		NFSDBG_CLIENT
 
 static DEFINE_SPINLOCK(nfs_client_lock);
@@ -507,6 +521,8 @@ static int nfs_create_rpc_client(struct
 	if (noresvport)
 		args.flags |= RPC_CLNT_CREATE_NONPRIVPORT;
 
+	trace_nfs_create_rpc_client(clp, timeparms, flavor, args.flags);
+
 	if (!IS_ERR(clp->cl_rpcclient))
 		return 0;
 
@@ -514,6 +530,7 @@ static int nfs_create_rpc_client(struct
 	if (IS_ERR(clnt)) {
 		dprintk("%s: cannot create RPC client. Error = %ld\n",
 				__func__, PTR_ERR(clnt));
+		trace_nfs_create_rpc_client_proto(clp, PTR_ERR(clnt));
 		return PTR_ERR(clnt);
 	}
 
@@ -554,9 +571,10 @@ static int nfs_start_lockd(struct nfs_se
 		return 0;
 
 	host = nlmclnt_init(&nlm_init);
-	if (IS_ERR(host))
+	if (IS_ERR(host)) {
+		trace_nfs_start_lockd(server, PTR_ERR(host));
 		return PTR_ERR(host);
-
+	}
 	server->nlm_host = host;
 	server->destroy = nfs_destroy_server;
 	return 0;
@@ -601,9 +619,12 @@ static int nfs_init_server_rpcclient(str
 {
 	struct nfs_client *clp = server->nfs_client;
 
+	trace_nfs_init_server_rpcclient(server, timeo, pseudoflavour);
+
 	server->client = rpc_clone_client(clp->cl_rpcclient);
 	if (IS_ERR(server->client)) {
 		dprintk("%s: couldn't create rpc_client!\n", __func__);
+		trace_nfs_init_server_rpcclient_clone(server, PTR_ERR(server->client));
 		return PTR_ERR(server->client);
 	}
 
@@ -618,6 +639,7 @@ static int nfs_init_server_rpcclient(str
 		auth = rpcauth_create(pseudoflavour, server->client);
 		if (IS_ERR(auth)) {
 			dprintk("%s: couldn't create credcache!\n", __func__);
+			trace_nfs_init_server_rpcclient_auth(server, PTR_ERR(auth));
 			return PTR_ERR(auth);
 		}
 	}
@@ -657,6 +679,7 @@ static int nfs_init_client(struct nfs_cl
 error:
 	nfs_mark_client_ready(clp, error);
 	dprintk("<-- nfs_init_client() = xerror %d\n", error);
+	trace_nfs_mount_init_clnt(clp, error);
 	return error;
 }
 
@@ -688,6 +711,7 @@ static int nfs_init_server(struct nfs_se
 	clp = nfs_get_client(&cl_init);
 	if (IS_ERR(clp)) {
 		dprintk("<-- nfs_init_server() = error %ld\n", PTR_ERR(clp));
+		trace_nfs_mount_init_srv(server, PTR_ERR(clp));
 		return PTR_ERR(clp);
 	}
 
@@ -807,19 +831,23 @@ static int nfs_probe_fsinfo(struct nfs_s
 	int error;
 
 	dprintk("--> nfs_probe_fsinfo()\n");
+	trace_nfs_probe_fsinfo(server, mntfh, fattr);
 
 	if (clp->rpc_ops->set_capabilities != NULL) {
 		error = clp->rpc_ops->set_capabilities(server, mntfh);
-		if (error < 0)
+		if (error < 0) {
+			trace_nfs_probe_fsinfo_setcaps(server, error);
 			goto out_error;
+		}
 	}
 
 	fsinfo.fattr = fattr;
 	nfs_fattr_init(fattr);
 	error = clp->rpc_ops->fsinfo(server, mntfh, &fsinfo);
-	if (error < 0)
+	if (error < 0) {
+		trace_nfs_probe_fsinfo_fsinfo(server, error);
 		goto out_error;
-
+	}
 	nfs_server_set_fsinfo(server, &fsinfo);
 	error = bdi_init(&server->backing_dev_info);
 	if (error)
@@ -956,6 +984,7 @@ struct nfs_server *nfs_create_server(con
 	if (!(fattr.valid & NFS_ATTR_FATTR)) {
 		error = server->nfs_client->rpc_ops->getattr(server, mntfh, &fattr);
 		if (error < 0) {
+			trace_nfs_create_server(server, mntfh, error);
 			dprintk("nfs_create_server: getattr error = %d\n", -error);
 			goto error;
 		}
* [PATCH 4/5] NFS: Convert trace points to trace markers
  2009-01-16 16:22 [RFC][PATCH 0/5] NFS: trace points added to mounting path Steve Dickson
  2009-01-16 16:30 ` [PATCH 3/5] NFS: Adding trace points to nfs/client.c Steve Dickson
@ 2009-01-16 16:32 ` Steve Dickson
  2009-01-16 16:33   ` [PATCH 5/5] NFS: Systemtap script Steve Dickson
  [not found]       ` <4970B451.4080201-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
  3 siblings, 0 replies; 58+ messages in thread
From: Steve Dickson @ 2009-01-16 16:32 UTC (permalink / raw)
To: Linux NFSv4 mailing list, Linux NFS Mailing list; +Cc: SystemTAP

This is the magic that converts the trace points into trace markers
so systemtap scripts can read the data. There is a good chance these
interfaces could change, but that should not affect the trace points
defined in the code.

Signed-off-by: Steve Dickson <steved@redhat.com>

--- linux/kernel/trace/Makefile.orig	2009-01-14 15:54:20.000000000 -0500
+++ linux/kernel/trace/Makefile	2009-01-15 12:23:11.000000000 -0500
@@ -18,7 +18,7 @@ endif
 
 obj-$(CONFIG_FUNCTION_TRACER) += libftrace.o
 obj-$(CONFIG_RING_BUFFER) += ring_buffer.o
-obj-$(CONFIG_TRACING) += trace.o
+obj-$(CONFIG_TRACING) += trace.o trace_nfs.o
 obj-$(CONFIG_CONTEXT_SWITCH_TRACER) += trace_sched_switch.o
 obj-$(CONFIG_SYSPROF_TRACER) += trace_sysprof.o
 obj-$(CONFIG_FUNCTION_TRACER) += trace_functions.o
--- /dev/null	2009-01-08 09:11:53.943261863 -0500
+++ linux/kernel/trace/trace_nfs.c	2009-01-15 12:23:11.000000000 -0500
@@ -0,0 +1,135 @@
+/*
+ * kernel/trace/trace_nfs.c
+ *
+ * NFS tracepoint probes.
+ */
+
+#include <linux/autoconf.h>
+#include <linux/module.h>
+#include <linux/marker.h>
+
+#include <linux/sunrpc/sched.h>
+#include <linux/sunrpc/xdr.h>
+#include <linux/sunrpc/xprt.h>
+
+#include <trace/nfs.h>
+
+/*
+ * NFS mount probes
+ */
+void probe_nfs_mount(struct file_system_type *fs_type,
+		int flags, const char *dev_name, void *raw_data,
+		struct vfsmount *mnt)
+{
+	trace_mark(nfs_mount, "%p %x %p %p %p",
+		fs_type, flags, dev_name, raw_data, mnt);
+}
+void probe_nfs_mount_data_null(struct nfs_mount_data *data, int err)
+{
+	trace_mark(nfs_mount_data_null, "%p %d", data, err);
+}
+void probe_nfs_mount_data_invalvers(struct nfs_mount_data *data, int err)
+{
+	trace_mark(nfs_mount_data_invalvers, "%p %d", data, err);
+}
+void probe_nfs_mount_data_invalsec(struct nfs_mount_data *data, int err)
+{
+	trace_mark(nfs_mount_data_invalsec, "%p %d", data, err);
+}
+void probe_nfs_mount_data_nomem(struct nfs_mount_data *data, int err)
+{
+	trace_mark(nfs_mount_data_nomem, "%p %d", data, err);
+}
+void probe_nfs_mount_data_noaddr(struct nfs_mount_data *data, int err)
+{
+	trace_mark(nfs_mount_data_noaddr, "%p %d", data, err);
+}
+void probe_nfs_mount_data_badfh(struct nfs_mount_data *data, int err)
+{
+	trace_mark(nfs_mount_data_badfh, "%p %d", data, err);
+}
+void probe_nfs_mount_init_clnt(struct nfs_client *clp, int error)
+{
+	trace_mark(nfs_mount_init_clnt, "%p %d", clp, error);
+}
+void probe_nfs_mount_get_root(struct super_block *sb,
+		struct nfs_server *server, struct nfs_fh *mntfh)
+{
+	trace_mark(nfs_mount_get_root, "%p %p %p", sb, server, mntfh);
+}
+void probe_nfs_mount_get_root_fhget(struct super_block *sb,
+		struct nfs_server *server, int error)
+{
+	trace_mark(nfs_mount_get_root_fhget, "%p %p %d", sb, server, error);
+}
+void probe_nfs_mount_get_root_dummy_root(struct super_block *sb,
+		struct nfs_server *server, int error)
+{
+	trace_mark(nfs_mount_get_root_dummy_root, "%p %p %d", sb, server, error);
+}
+void probe_nfs_mount_get_root_alias(struct super_block *sb,
+		struct nfs_server *server, int error)
+{
+	trace_mark(nfs_mount_get_root_alias, "%p %p %d", sb, server, error);
+}
+
+void probe_nfs_create_rpc_client(struct nfs_client *clp,
+		const struct rpc_timeout *timeo, rpc_authflavor_t flavor,
+		int flags)
+{
+	trace_mark(nfs_create_rpc_client, "%p %p %d %d",
+		clp, timeo, flavor, flags);
+}
+void probe_nfs_create_rpc_client_proto(struct nfs_client *clp, int error)
+{
+	trace_mark(nfs_create_rpc_client_proto, "%p %d", clp, error);
+}
+void probe_nfs_start_lockd(struct nfs_server *server, int error)
+{
+	trace_mark(nfs_start_lockd, "%p %d", server, error);
+}
+void probe_nfs_init_server_rpcclient(struct nfs_server *server,
+		const struct rpc_timeout *timeo,
+		rpc_authflavor_t flavor)
+{
+	trace_mark(nfs_init_server_rpcclient, "%p %p %d", server, timeo, flavor);
+}
+void probe_nfs_init_server_rpcclient_clone(
+		struct nfs_server *server, int error)
+{
+	trace_mark(nfs_init_server_rpcclient_clone, "%p %d", server, error);
+}
+void probe_nfs_init_server_rpcclient_auth(
+		struct nfs_server *server, int error)
+{
+	trace_mark(nfs_init_server_rpcclient_auth, "%p %d", server, error);
+}
+void probe_nfs_mount_init_srv(
+		struct nfs_server *server, int error)
+{
+	trace_mark(nfs_mount_init_srv, "%p %d", server, error);
+}
+
+void probe_nfs_probe_fsinfo(struct nfs_server *server,
+		struct nfs_fh *mntfh, struct nfs_fattr *fattr)
+{
+	trace_mark(nfs_probe_fsinfo, "%p %p %p", server, mntfh, fattr);
+}
+void probe_nfs_probe_fsinfo_setcaps(struct nfs_server *server, int error)
+{
+	trace_mark(nfs_probe_fsinfo_setcaps, "%p %d", server, error);
+}
+void probe_nfs_probe_fsinfo_fsinfo(struct nfs_server *server, int error)
+{
+	trace_mark(nfs_probe_fsinfo_fsinfo, "%p %d", server, error);
+}
+void probe_nfs_create_server(struct nfs_server *server,
+		struct nfs_fh *mntfh, int error)
+{
+	trace_mark(nfs_create_server, "%p %p %d", server, mntfh, error);
+}
+void probe_nfs_mount_sget(struct nfs_server *server, int error)
+{
+	trace_mark(nfs_mount_sget, "%p %d", server, error);
+}
--- /dev/null	2009-01-08 09:11:53.943261863 -0500
+++ linux/include/trace/nfs.h	2009-01-15 12:30:45.000000000 -0500
@@ -0,0 +1,99 @@
+#ifndef _TRACE_NFS_H
+#define _TRACE_NFS_H
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+
+#include <linux/buffer_head.h>
+#include <linux/tracepoint.h>
+
+#include <linux/nfs_mount.h>
+#include <linux/nfs4.h>
+#include <linux/nfs_xdr.h>
+#include <linux/nfs_fs_sb.h>
+#include <linux/sunrpc/auth.h>
+
+#include <linux/posix_acl.h>
+
+/*
+ * NFS mounting trace points
+ */
+DECLARE_TRACE(nfs_mount,
+	TPPROTO(struct file_system_type *fs_type, int flags,
+		const char *dev_name, void *raw_data, struct vfsmount *mnt),
+	TPARGS(fs_type, flags, dev_name, raw_data, mnt));
+
+DECLARE_TRACE(nfs_mount_data_null,
+	TPPROTO(struct nfs_mount_data *data, int err), TPARGS(data, err));
+DECLARE_TRACE(nfs_mount_data_invalvers,
+	TPPROTO(struct nfs_mount_data *data, int err), TPARGS(data, err));
+DECLARE_TRACE(nfs_mount_data_invalsec,
+	TPPROTO(struct nfs_mount_data *data, int err), TPARGS(data, err));
+DECLARE_TRACE(nfs_mount_data_nomem,
+	TPPROTO(struct nfs_mount_data *data, int err), TPARGS(data, err));
+DECLARE_TRACE(nfs_mount_data_noaddr,
+	TPPROTO(struct nfs_mount_data *data, int err), TPARGS(data, err));
+DECLARE_TRACE(nfs_mount_data_badfh,
+	TPPROTO(struct nfs_mount_data *data, int err), TPARGS(data, err));
+
+DECLARE_TRACE(nfs_mount_get_root,
+	TPPROTO(struct super_block *sb, struct nfs_server *server,
+		struct nfs_fh *mntfh),
+	TPARGS(sb, server, mntfh));
+DECLARE_TRACE(nfs_mount_getroot,
+	TPPROTO(struct super_block *sb, struct nfs_server *server,
+		int error),
+	TPARGS(sb, server, error));
+DECLARE_TRACE(nfs_mount_get_root_fhget,
+	TPPROTO(struct super_block *sb, struct nfs_server *server,
+		int error),
+	TPARGS(sb, server, error));
+DECLARE_TRACE(nfs_mount_get_root_alias,
+	TPPROTO(struct super_block *sb, struct nfs_server *server,
+		int error),
+	TPARGS(sb, server, error));
+DECLARE_TRACE(nfs_mount_get_root_dummy_root,
+	TPPROTO(struct super_block *sb, struct nfs_server *server,
		int error),
+	TPARGS(sb, server, error));
+
+DECLARE_TRACE(nfs_create_rpc_client,
+	TPPROTO(struct nfs_client *clp, const struct rpc_timeout *timeo,
+		rpc_authflavor_t flavor, int flags),
+	TPARGS(clp, timeo, flavor, flags));
+
+DECLARE_TRACE(nfs_create_rpc_client_proto,
+	TPPROTO(struct nfs_client *clp, int error),
+	TPARGS(clp, error));
+//DECLARE_TRACE(nfs_create_rpc_client_client, TPPROTO(int error), TPARGS(error));
+DECLARE_TRACE(nfs_mount_init_clnt,
+	TPPROTO(struct nfs_client *clp, int error), TPARGS(clp, error));
+DECLARE_TRACE(nfs_mount_init_srv,
+	TPPROTO(struct nfs_server *server, int error), TPARGS(server, error));
+
+DECLARE_TRACE(nfs_mount_sget,
+	TPPROTO(struct nfs_server *server, int error), TPARGS(server, error));
+
+DECLARE_TRACE(nfs_init_server_rpcclient,
+	TPPROTO(struct nfs_server *server, const struct rpc_timeout *timeo,
+		rpc_authflavor_t flavor),
+	TPARGS(server, timeo, flavor));
+DECLARE_TRACE(nfs_init_server_rpcclient_clone,
+	TPPROTO(struct nfs_server *server, int error), TPARGS(server, error));
+DECLARE_TRACE(nfs_init_server_rpcclient_auth,
+	TPPROTO(struct nfs_server *server, int error), TPARGS(server, error));
+
+DECLARE_TRACE(nfs_start_lockd,
+	TPPROTO(struct nfs_server *server, int error), TPARGS(server, error));
+
+DECLARE_TRACE(nfs_probe_fsinfo,
+	TPPROTO(struct nfs_server *server, struct nfs_fh *mntfh,
+		struct nfs_fattr *fattr),
+	TPARGS(server, mntfh, fattr));
+DECLARE_TRACE(nfs_probe_fsinfo_fsinfo,
+	TPPROTO(struct nfs_server *server, int error), TPARGS(server, error));
+DECLARE_TRACE(nfs_probe_fsinfo_setcaps,
+	TPPROTO(struct nfs_server *server, int error), TPARGS(server, error));
+
+DECLARE_TRACE(nfs_create_server,
+	TPPROTO(struct nfs_server *server, struct nfs_fh *mntfh, int error),
+	TPARGS(server, mntfh, error));
+#endif /* _TRACE_NFS_H */
* [PATCH 5/5] NFS: Systemtap script
  2009-01-16 16:22 [RFC][PATCH 0/5] NFS: trace points added to mounting path Steve Dickson
  2009-01-16 16:30 ` [PATCH 3/5] NFS: Adding trace points to nfs/client.c Steve Dickson
  2009-01-16 16:32 ` [PATCH 4/5] NFS: Convert trace points to trace markers Steve Dickson
@ 2009-01-16 16:33 ` Steve Dickson
  [not found]   ` <4970B451.4080201-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
  3 siblings, 0 replies; 58+ messages in thread
From: Steve Dickson @ 2009-01-16 16:33 UTC (permalink / raw)
To: Linux NFSv4 mailing list, Linux NFS Mailing list; +Cc: SystemTAP

The systemtap script used to pull and parse the information
from the kernel.

Signed-off-by: Steve Dickson <steved@redhat.com>

--- /dev/null	2009-01-08 09:11:53.943261863 -0500
+++ linux/samples/nfs/nfs_mount.stp	2009-01-15 12:23:11.000000000 -0500
@@ -0,0 +1,160 @@
+%{
+#include <linux/mount.h>
+#include <linux/nfs_mount.h>
+%}
+
+function _fstype_name:string (_fstype:long) %{
+	struct file_system_type *fstype;
+	char *name;
+
+	fstype = (struct file_system_type *)(long)kread(&(THIS->_fstype));
+	name = (char *)(long)kread(&fstype->name);
+
+	snprintf(THIS->__retvalue, MAXSTRINGLEN, "name %s flags 0x%x",
+		name, fstype->fs_flags);
+
+	CATCH_DEREF_FAULT();
+%}
+function _vfsmnt_dump:string (_vfsmnt:long) %{
+	struct vfsmount *vfsmnt;
+	char *dev;
+
+	vfsmnt = (struct vfsmount *)(long)kread(&(THIS->_vfsmnt));
+	dev = (char *)(long)kread(&vfsmnt->mnt_devname);
+
+	snprintf(THIS->__retvalue, MAXSTRINGLEN, "dev %s flags=0x%x",
+		vfsmnt->mnt_devname, vfsmnt->mnt_flags);
+
+	CATCH_DEREF_FAULT();
+%}
+function _nfsmnt_dump:string (_nfsmnt:long) %{
+	struct nfs_mount_data *data;
+	unsigned char *bytes;
+
+	data = (struct nfs_mount_data *)(long)kread(&(THIS->_nfsmnt));
+	bytes = (unsigned char *)&data->addr.sin_addr.s_addr;
+
+	snprintf(THIS->__retvalue, MAXSTRINGLEN,
+		"vers %d flags 0x%x flavor %d hostname %s(%d.%d.%d.%d)",
+		data->version, data->flags, data->pseudoflavor,
+		data->hostname, bytes[0], bytes[1], bytes[2], bytes[3]);
+
+	CATCH_DEREF_FAULT();
+%}
+
+probe kernel.mark("nfs_mount") {
+	printf("nfs_mount:entry: fstype (%s) flags %x dev %s\n",
+		_fstype_name($arg1), $arg2, kernel_string($arg3));
+	printf("\tdata: %s\n\tmnt: %s\n",
+		_nfsmnt_dump($arg4), _vfsmnt_dump($arg5));
+}
+probe kernel.mark("nfs_mount_data_null") {
+	printf("nfs_mount: missing mount data: errno %d\n", $arg1);
+}
+probe kernel.mark("nfs_mount_data_badvers") {
+	if ($arg1 <= 0) {
+		printf("nfs_mount: invalid mount version: vers %d <= 0\n", $arg1);
+	} else {
+		printf("nfs_mount: invalid mount version: vers %d > %d\n",
+			$arg1, $arg2);
+	}
+}
+probe kernel.mark("nfs_mount_data_invalvers") {
+	if ($arg1 == 3) {
+		printf("nfs_mount: mount structure version %d does not support NFSv3\n", $arg1);
+	} else {
+		printf("nfs_mount: mount structure version %d does not support strong security\n", $arg1);
+	}
+}
+probe kernel.mark("nfs_mount_data_noaddr") {
+	printf("nfs_mount: invalid server IP address\n");
+}
+probe kernel.mark("nfs_mount_data_badsize") {
+	printf("nfs_mount: invalid root filehandle: fhsize %d > maxsize %d\n",
+		$arg1, $arg2);
+}
+probe kernel.mark("nfs_mount_get_root") {
+	printf("nfs_get_root: sb %p server %p mntfh %p\n", $arg1, $arg2, $arg3);
+}
+probe kernel.mark("nfs_mount_getroot_fhget1") {
+	printf("nfs_get_root: !s_root: nfs_fhget failed: errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_mount_getroot_alloc1") {
+	printf("nfs_get_root: !s_root: d_alloc_root failed: errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_mount_getroot_fhget2") {
+	printf("nfs_get_root: nfs_fhget failed: errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_mount_getroot_alloc2") {
+	printf("nfs_get_root: d_alloc_root failed: errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_mount_init_clnt") {
+	printf("nfs_init_client: nfs_create_rpc_client failed: errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_create_rpc_client") {
+	printf("nfs_create_rpc_client: clp %p proto %d timeo %d retrans %d flavor %d",
+		$arg1, $arg2, $arg3, $arg4, $arg5);
+}
+probe kernel.mark("nfs_create_rpc_client_proto") {
+	printf("nfs_create_rpc_client: xprt_create_proto failed: errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_create_rpc_client_client") {
+	printf("nfs_create_rpc_client: rpc_create_client failed: errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_mount_init_srv") {
+	printf("nfs_init_server: nfs_get_client failed: errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_mount_sget") {
+	printf("nfs_get_sb: sget failed: errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+
+probe kernel.mark("nfs_init_server_rpcclient") {
+	printf("nfs_init_server_rpcclient: server %p flavor %d\n",
+		$arg1, $arg2);
+}
+probe kernel.mark("nfs_init_server_rpcclient_clone") {
+	printf("nfs_init_server_rpcclient: rpc_clone_client failed errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_init_server_rpcclient_auth") {
+	printf("nfs_init_server_rpcclient: rpcauth_create failed errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_start_lockd") {
+	// struct nfs_server *server = $arg1;
+
+	printf("nfs_start_lockd: lockd_up_proto failed errno %d (%s)\n",
+		$arg2, errno_str($arg2));
+}
+probe kernel.mark("nfs_probe_fsinfo") {
+	printf("nfs_probe_fsinfo: server %p mntfh %p fattr %p\n",
+		$arg1, $arg2, $arg3);
+}
+probe kernel.mark("nfs_probe_fsinfo_setcaps") {
+	printf("nfs_probe_fsinfo: set_capabilities failed errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_probe_fsinfo_fsinfo") {
+	printf("nfs_probe_fsinfo: fsinfo failed errno %d (%s)\n",
+		$arg1, errno_str($arg1));
+}
+probe kernel.mark("nfs_create_server") {
+	// struct nfs_server *server = $arg1;
+	// struct nfs_fh *mntfh = $arg2;
+
+	printf("nfs_create_server: getattr failed errno %d (%s)\n",
+		$arg3, errno_str($arg3));
+}
+probe begin { log("starting nfs_mount trace") }
+probe end { log("ending nfs_mount trace") }
[parent not found: <4970B451.4080201-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>]
* [PATCH 1/5] NFS: Adding trace points to fs/nfs/getroot.c
  [not found] ` <4970B451.4080201-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
@ 2009-01-16 16:25 ` Steve Dickson
  2009-01-16 16:28   ` [PATCH 2/5] NFS: Adding trace points to fs/nfs/super.c Steve Dickson
  ` (3 subsequent siblings)
  4 siblings, 0 replies; 58+ messages in thread
From: Steve Dickson @ 2009-01-16 16:25 UTC (permalink / raw)
To: Linux NFSv4 mailing list, Linux NFS Mailing list; +Cc: SystemTAP

Added trace points to nfs_get_root() that record when
nfs_get_root() is called and whether it fails.

Signed-off-by: Steve Dickson <steved@redhat.com>

--- linux/fs/nfs/getroot.c.orig	2008-12-24 18:26:37.000000000 -0500
+++ linux/fs/nfs/getroot.c	2009-01-15 12:42:16.000000000 -0500
@@ -32,6 +32,7 @@
 #include <linux/namei.h>
 #include <linux/mnt_namespace.h>
 #include <linux/security.h>
+#include <trace/nfs.h>
 
 #include <asm/system.h>
 #include <asm/uaccess.h>
@@ -40,6 +41,12 @@
 #include "delegation.h"
 #include "internal.h"
 
+DEFINE_TRACE(nfs_mount_get_root);
+DEFINE_TRACE(nfs_mount_getroot);
+DEFINE_TRACE(nfs_mount_get_root_fhget);
+DEFINE_TRACE(nfs_mount_get_root_dummy_root);
+DEFINE_TRACE(nfs_mount_get_root_alias);
+
 #define NFSDBG_FACILITY		NFSDBG_CLIENT
 
 /*
@@ -84,25 +91,30 @@ struct dentry *nfs_get_root(struct super
 	struct inode *inode;
 	int error;
 
+	trace_nfs_mount_get_root(sb, server, mntfh);
+
 	/* get the actual root for this mount */
 	fsinfo.fattr = &fattr;
 
 	error = server->nfs_client->rpc_ops->getroot(server, mntfh, &fsinfo);
 	if (error < 0) {
 		dprintk("nfs_get_root: getattr error = %d\n", -error);
+		trace_nfs_mount_getroot(sb, server, error);
 		return ERR_PTR(error);
 	}
 
 	inode = nfs_fhget(sb, mntfh, fsinfo.fattr);
 	if (IS_ERR(inode)) {
 		dprintk("nfs_get_root: get root inode failed\n");
+		trace_nfs_mount_get_root_fhget(sb, server, error);
 		return ERR_CAST(inode);
 	}
 
 	error = nfs_superblock_set_dummy_root(sb, inode);
-	if (error != 0)
+	if (error != 0) {
+		trace_nfs_mount_get_root_dummy_root(sb, server, error);
 		return ERR_PTR(error);
-
+	}
 	/* root dentries normally start off anonymous and get spliced in later
	 * if the dentry tree reaches them; however if the dentry already
	 * exists, we'll pick it up at this point and use it as the root
@@ -110,6 +122,7 @@ struct dentry *nfs_get_root(struct super
 	mntroot = d_obtain_alias(inode);
 	if (IS_ERR(mntroot)) {
 		dprintk("nfs_get_root: get root dentry failed\n");
+		trace_nfs_mount_get_root_alias(sb, server, PTR_ERR(mntroot));
 		return mntroot;
 	}
* [PATCH 2/5] NFS: Adding trace points to fs/nfs/super.c
  [not found] ` <4970B451.4080201-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
  2009-01-16 16:25 ` [PATCH 1/5] NFS: Adding trace points to fs/nfs/getroot.c Steve Dickson
@ 2009-01-16 16:28 ` Steve Dickson
  2009-01-16 18:52   ` [RFC][PATCH 0/5] NFS: trace points added to mounting path Chuck Lever
  ` (2 subsequent siblings)
  4 siblings, 0 replies; 58+ messages in thread
From: Steve Dickson @ 2009-01-16 16:28 UTC (permalink / raw)
To: Linux NFSv4 mailing list, Linux NFS Mailing list; +Cc: SystemTAP

Added trace points to nfs_validate_mount_data() and nfs_get_sb().
Since nfs_validate_mount_data() is only called from nfs_get_sb(),
trace points were only added to the error paths.

Signed-off-by: Steve Dickson <steved@redhat.com>

--- linux/fs/nfs/super.c.orig	2009-01-14 15:54:19.000000000 -0500
+++ linux/fs/nfs/super.c	2009-01-15 12:37:57.000000000 -0500
@@ -51,6 +51,7 @@
 #include <linux/nfs_xdr.h>
 #include <linux/magic.h>
 #include <linux/parser.h>
+#include <trace/nfs.h>
 
 #include <asm/system.h>
 #include <asm/uaccess.h>
@@ -61,6 +62,15 @@
 #include "iostat.h"
 #include "internal.h"
 
+DEFINE_TRACE(nfs_mount_data_null);
+DEFINE_TRACE(nfs_mount_data_invalvers);
+DEFINE_TRACE(nfs_mount_data_invalsec);
+DEFINE_TRACE(nfs_mount_data_nomem);
+DEFINE_TRACE(nfs_mount_data_noaddr);
+DEFINE_TRACE(nfs_mount_data_badfh);
+DEFINE_TRACE(nfs_mount);
+DEFINE_TRACE(nfs_mount_sget);
+
 #define NFSDBG_FACILITY		NFSDBG_VFS
 
 enum {
@@ -1692,15 +1702,18 @@ static int nfs_validate_mount_data(void
 
 out_no_data:
 	dfprintk(MOUNT, "NFS: mount program didn't pass any mount data\n");
+	trace_nfs_mount_data_null(data, EINVAL);
 	return -EINVAL;
 
 out_no_v3:
 	dfprintk(MOUNT, "NFS: nfs_mount_data version %d does not support v3\n",
		 data->version);
+	trace_nfs_mount_data_invalvers(data, EINVAL);
 	return -EINVAL;
 
 out_no_sec:
 	dfprintk(MOUNT, "NFS: nfs_mount_data version supports only AUTH_SYS\n");
+	trace_nfs_mount_data_invalsec(data, EINVAL);
 	return -EINVAL;
 
 #ifndef CONFIG_NFS_V3
@@ -1711,14 +1724,17 @@ out_v3_not_compiled:
 
 out_nomem:
 	dfprintk(MOUNT, "NFS: not enough memory to handle mount options\n");
+	trace_nfs_mount_data_nomem(data, EINVAL);
 	return -ENOMEM;
 
 out_no_address:
 	dfprintk(MOUNT, "NFS: mount program didn't pass remote address\n");
+	trace_nfs_mount_data_noaddr(data, EINVAL);
 	return -EINVAL;
 
 out_invalid_fh:
 	dfprintk(MOUNT, "NFS: invalid root filehandle\n");
+	trace_nfs_mount_data_badfh(data, EINVAL);
 	return -EINVAL;
 }
 
@@ -1999,6 +2015,8 @@ static int nfs_get_sb(struct file_system
 
 	security_init_mnt_opts(&data->lsm_opts);
 
+	trace_nfs_mount(fs_type, flags, dev_name, data, mnt);
+
 	/* Validate the mount data */
 	error = nfs_validate_mount_data(raw_data, data, mntfh, dev_name);
 	if (error < 0)
@@ -2019,6 +2037,7 @@ static int nfs_get_sb(struct file_system
 	s = sget(fs_type, compare_super, nfs_set_super, &sb_mntdata);
 	if (IS_ERR(s)) {
 		error = PTR_ERR(s);
+		trace_nfs_mount_sget(server, error);
 		goto out_err_nosb;
 	}
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  [not found] ` <4970B451.4080201-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
  2009-01-16 16:25 ` [PATCH 1/5] NFS: Adding trace points to fs/nfs/getroot.c Steve Dickson
  2009-01-16 16:28 ` [PATCH 2/5] NFS: Adding trace points to fs/nfs/super.c Steve Dickson
@ 2009-01-16 18:52 ` Chuck Lever
  2009-01-21 17:13   ` Steve Dickson
  2009-01-16 23:44 ` Greg Banks
  2009-01-18 16:40 ` Christoph Hellwig
  4 siblings, 1 reply; 58+ messages in thread
From: Chuck Lever @ 2009-01-16 18:52 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP

On Jan 16, 2009, at 11:22 AM, Steve Dickson wrote:
> Hello,
>
> Very recently patches were added to the mainline kernel that
> enabled the use of trace points. This patch series takes
> advantage of those patch by introducing trace points
> to the mounting path of NFS mounts. Its hoped these
> trace points can be used by system administrators to
> identify why NFS mounts are failing or hang in
> production kernels.
>
> IMHO, one general problem with today's "canned" NFS debugging is it
> becomes very verbose very quickly.... "I get here" and "I get there"
> type of debugging statements. Although they help trace the code but
> very rarely shows/defines what the actual problem was. So what I've
> try to do is "define the error paths" by putting a trace point at
> every error exit in hopes to define where and why things broke.
>
> So the ultimate goal would be to replace all the dprintks with trace
> points but still be able to enable them through the rpcdebug command
> (although we might want to think about splitting the command out into
> three different commands nfsdebug, nfsddebug, rpcdebug). Since trace
> points have very little overhead, a set of trace points could be
> enable in production with have little or no effect on functionality
> or performance.
>
> Another advantage with trace points is the type and amount of
> information that can be retrieved. With these trace points, I'm
> passing in the error code as well as the data structure[s] associated
> with that error. This allows the "canned" information that IT people
> would used (via the rpcdebug command which would turn on a group of
> trace points) as well as more detailed information that kernel
> developers can used (via systemtap scripts which would turn on
> individual trace points).
>
> Patch summary:
>
> * fs/nfs/client.c
> * fs/nfs/getroot.c
> * fs/nfs/super.c
> The based files where traces where added.
>
> * include/trace/nfs.h
> * kernel/trace/Makefile
> * kernel/trace/nfs-trace.c
> The overhead of added the trace points and then converting them
> into trace marks.
>
> * samples/nfs/nfs_mount.stp
> The systemtap script used to access the trace marks. I probably
> should have documented the file better, but the first three
> functions in the file are how structures are pulled from the
> kernel. The rest are probes used to active the trace markers.
>
> Comments... Acceptance??

I'm all for improving the observability of the NFS client.

But I don't (yet) see the advantage of adding this complexity in the
mount path. Maybe the more complex and asynchronous parts of the NFS
client, like the cached read and write paths, are more suitable to
this type of tool.

Why can't we simply improve the information content of the dprintks?
Can you give a few real examples of problems that these new trace
points can identify that better dprintks wouldn't be able to address?

Generally, what kind of problems do admins face that the dprintks
don't handle today, and what are the alternatives to addressing those
issues? Do admins who run enterprise kernels actually use SystemTap,
or do they fall back on network traces and other tried and true
troubleshooting methodologies?

If we think the mount path needs such instrumentation, consider
updating fs/nfs/mount_clnt.c and net/sunrpc/rpcb_clnt.c as well.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-16 18:52 ` [RFC][PATCH 0/5] NFS: trace points added to mounting path Chuck Lever @ 2009-01-21 17:13 ` Steve Dickson [not found] ` <497757D1.7090908-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 58+ messages in thread From: Steve Dickson @ 2009-01-21 17:13 UTC (permalink / raw) To: Chuck Lever; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Sorry for the delayed response... That darn flux capacitor broke again! ;-) Chuck Lever wrote: > > I'm all for improving the observability of the NFS client. Well, in theory, trace points will also touch the server and all of the rpc code... > > But I don't (yet) see the advantage of adding this complexity in the > mount path. Maybe the more complex and asynchronous parts of the NFS > client, like the cached read and write paths, are more suitable to this > type of tool. Well, the complexity is, at this point, due to how the trace points are tied to and used by SystemTap. I'm hopeful this complexity will die down as time goes on... > > Why can't we simply improve the information content of the dprintks? The theory is trace points can be turned on, in production kernels, with little or no performance issues... > Can you give a few real examples of problems that these new trace points > can identify that better dprintks wouldn't be able to address? They can supply more information that can be used by both a kernel guy and an IT guy.... Meaning they can supply detailed structure information that a kernel guy would need as well as supplying the simple error code that an IT guy would be interested in. > Generally, what kind of problems do admins face that the dprintks don't > handle today, and what are the alternatives to addressing those issues? Not being an admin guy, I really don't have an answer for this... but I can say since trace points are not as much of a drag on the system as printks are,
with timing issues, using trace points would be a big advantage over printks. > > Do admins who run enterprise kernels actually use SystemTap, or do they > fall back on network traces and other tried and true troubleshooting > methodologies? Currently, to run systemtap, one needs kernel debug info and kernel developer info installed on the system. Most production systems don't install those types of packages.... But with trace points those types of packages will no longer be needed, so I could definitely see admins using systemtap once it's available... Look at DTrace... people are using that now that it's available and fairly stable. > > If we think the mount path needs such instrumentation, consider updating > fs/nfs/mount_clnt.c and net/sunrpc/rpcb_clnt.c as well. > I was just following what was currently being debugged when 'rpcinfo -m nfs -s mount' was set... maybe I missed something... I'll take a look... steved. ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <497757D1.7090908-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <497757D1.7090908-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org> @ 2009-01-21 18:01 ` Chuck Lever 2009-01-21 19:29 ` Trond Myklebust ` (2 more replies) 0 siblings, 3 replies; 58+ messages in thread From: Chuck Lever @ 2009-01-21 18:01 UTC (permalink / raw) To: Steve Dickson; +Cc: Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP Hey Steve- On Jan 21, 2009, at 12:13 PM, Steve Dickson wrote: > Sorry for the delayed response... That darn flux capacitor broke > again! ;-) > > > Chuck Lever wrote: >> >> I'm all for improving the observability of the NFS client. > Well, in theory, trace points will also touch the server and all > of the rpc code... > >> >> But I don't (yet) see the advantage of adding this complexity in the >> mount path. Maybe the more complex and asynchronous parts of the NFS >> client, like the cached read and write paths, are more suitable to >> this >> type of tool. > Well, the complexity is, at this point, due to how the trace points > are tied to and used by SystemTap. I'm hopeful this complexity > will die down as time goes on... I understand that your proposed mount path changes were attempting to provide a simple example of using trace points that could be applied to the NFS client and server in general. However I'm interested mostly in improving how the mount path specifically reports problems. I'm not convinced that trace points (or our current dprintk, for that matter) are a useful approach to solving NFS mount issues, in specific. But that introduces the general question of whether trace points, dprintk, network tracing, or something else is the most appropriate tool to address the most common troubleshooting problems in any particular area of the NFS client or server. I'd also like some clarity on what our problem statement is here. What problems are we trying to address? >> Why can't we simply improve the information content of the dprintks?
> The theory is trace point can be turned on, in production kernels, > with > little or no performance issues... mount isn't a performance path, which is one reason I think trace points might be overkill for this case. >> Can you give a few real examples of problems that these new trace >> points >> can identify that better dprintks wouldn't be able to address? > They can supply more information that can be used by both a kernel > guy and an IT guy.... Meaning they can supply detailed structure > information > that a kernel guy would need as well as supplying the simple error > code > that an IT guy would be interested. My point is, does that flexibility really help some poor admin who is trying to diagnose a mount problem? Is it going to reduce the number of calls to your support desk? I'd like to see an example of a real mount problem or two that dprintk isn't adequate for, but a trace point could have helped. In other words, can we get some use cases for dprintk and trace points for mount problems in specific? I think that would help us understand the trade-offs a little better. Some general use cases for trace points might also widen our dialog about where they are appropriate to use. I'm not at all arguing against using trace points in general, but I would like to see some thinking about whether they are the most appropriate tool for each of the many troubleshooting jobs we have. >> Generally, what kind of problems do admins face that the dprintks >> don't >> handle today, and what are the alternatives to addressing those >> issues? > Not being an admin guy, I really don't have an answer for this... but > I can say since trace point are not so much of a drag on the system as > printks are.. with in timing issues using trace point would be a big > advantage > over printks I like the idea of not depending on the system log, and that's appropriate for performance hot paths and asynchronous paths where timing can be an issue. 
That's one reason why I created the NFS and RPC performance metrics facility. But mount is not a performance path, and is synchronous, more or less. In addition, mount encounters problems much more frequently than the read or write path, because mount depends a lot on what options are selected and the network environment it's running in. It's the first thing to try contacting the server, as well, so it "shakes out" a lot of problems before a read or write is even done. So something like dprintk or trace points or a network trace that have some set-up overhead might be less appropriate for mount than, say, beefing up the error reporting framework in the mount path, just as an example. >> Do admins who run enterprise kernels actually use SystemTap, or do >> they >> fall back on network traces and other tried and true troubleshooting >> methodologies? > Currently, to run systemtap, one needs kernel debug info and kernel > developer > info installed on the system. Most production systems don't install > those types > of packages.... But with trace points those types of packages will no > longer be > needed, so I could definitely see admins using systemtap once it's > available... > Look at DTrace... people are using that now that it's available and > fairly stable. > >> If we think the mount path needs such instrumentation, consider >> updating >> fs/nfs/mount_clnt.c and net/sunrpc/rpcb_clnt.c as well. >> > I was just following what was currently being debugged when > 'rpcinfo -m nfs -s mount' was set... `rpcdebug -m nfs -s mount` also enables the dprintks in fs/nfs/mount_clnt.c, at least. As with most dprintk infrastructure in NFS, it's really aimed at developers and not end users or admins. The rpcbind client is also an integral part of the mount process, so I suggested that too. -- Chuck Lever chuck[dot]lever[at]oracle[dot]com ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 18:01 ` Chuck Lever @ 2009-01-21 19:29 ` Trond Myklebust 2009-01-21 19:58 ` Steve Dickson 2009-01-21 19:37 ` Steve Dickson 2009-01-21 21:26 ` Greg Banks 2 siblings, 1 reply; 58+ messages in thread From: Trond Myklebust @ 2009-01-21 19:29 UTC (permalink / raw) To: Chuck Lever Cc: Steve Dickson, Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP On Wed, 2009-01-21 at 13:01 -0500, Chuck Lever wrote: > `rpcdebug -m nfs -s mount` also enables the dprintks in fs/nfs/ > mount_clnt.c, at least. As with most dprintk infrastructure in NFS, > it's really aimed at developers and not end users or admins. The > rpcbind client is also an integral part of the mount process, so I > suggested that too. This would be my main gripe with suggestions that we convert all the existing dprintks. As Chuck says, they are pretty much a hodgepodge of messages designed to help kernel developers to debug the NFS and RPC code. If you want something dtrace-like to allow administrators to run scripts to monitor the health of their cluster and troubleshoot performance problems, then you really want to start afresh. That really needs to be designed as a long-term API, and should ideally represent the desired functionality in a manner that is more or less independent of the underlying code (something that is clearly not the case for the current mess of dprintks). Otherwise, scripts will have to be rewritten every time we make some minor tweak or change to the code (i.e. for every kernel release). Cheers Trond ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 19:29 ` Trond Myklebust @ 2009-01-21 19:58 ` Steve Dickson 2009-01-21 20:23 ` Trond Myklebust 0 siblings, 1 reply; 58+ messages in thread From: Steve Dickson @ 2009-01-21 19:58 UTC (permalink / raw) To: Trond Myklebust Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Hey, Trond Myklebust wrote: > On Wed, 2009-01-21 at 13:01 -0500, Chuck Lever wrote: >> `rpcdebug -m nfs -s mount` also enables the dprintks in fs/nfs/ >> mount_clnt.c, at least. As with most dprintk infrastructure in NFS, >> it's really aimed at developers and not end users or admins. The >> rpcbind client is also an integral part of the mount process, so I >> suggested that too. > > This would be my main gripe with suggestions that we convert all the > existing dprintks. As Chuck says, they are pretty much a hodgepodge of > messages designed to help kernel developers to debug the NFS and RPC > code. Well, as I see it, this is our chance to clean it up... > > If you want something dtrace-like to allow administrators to run scripts > to monitor the health of their cluster and troubleshoot performance > problems, then you really want to start afresh. That really needs to be > designed as a long-term API, and should ideally represent the desired > functionality in a manner that is more or less independent of the > underlying code (something that is clearly not the case for the current > mess of dprintks). I'm not sure how the trace points could be independent of the underlying code, but I do agree a well-designed API would be optimal.... But before we go off designing something I think we need to decide what the end game is. Do we want trace points: 1) at all 2) for debugging 3) for performance 4) 2 and 3 Once we get the above nailed down then we can decide how to go... Also, Greg and Jason Baron (from Red Hat) are off working on improving the dprintks that currently exist...
I would suspect we would want to also tie in with that to see if it would be applicable... > Otherwise, scripts will have to be rewritten every > time we make some minor tweak or change to the code (i.e. for every > kernel release). No matter how well we design this, I'm sure there will always be a need for tweaks in the user level scripts... but we can always leave that up to the nfs-utils maintainer.... (Doh!) 8-) steved. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 19:58 ` Steve Dickson @ 2009-01-21 20:23 ` Trond Myklebust 2009-01-22 13:07 ` Steve Dickson 0 siblings, 1 reply; 58+ messages in thread From: Trond Myklebust @ 2009-01-21 20:23 UTC (permalink / raw) To: Steve Dickson; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP On Wed, 2009-01-21 at 14:58 -0500, Steve Dickson wrote: > Do we want trace points: > 1) at all > 2) for debugging > 3) for performance > 4) 2 and 3 > > Once we get the above nailed down then we can decide how to go... I think it might be a good idea to flesh out a bit what you mean by "debugging" here. Since you mentioned it in conjunction with the two words "administrators" and "scripts", I assume that you are not talking about kernel code debugging? Cheers Trond ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 20:23 ` Trond Myklebust @ 2009-01-22 13:07 ` Steve Dickson [not found] ` <49786F9F.7030400-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 58+ messages in thread From: Steve Dickson @ 2009-01-22 13:07 UTC (permalink / raw) To: Trond Myklebust Cc: Chuck Lever, Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP Trond Myklebust wrote: > On Wed, 2009-01-21 at 14:58 -0500, Steve Dickson wrote: >> Do we want trace points: >> 1) at all >> 2) for debugging >> 3) for performance >> 4) 2 and 3 >> >> Once we get the above nailed down then we can decide how to go... > > I think it might be a good idea to flesh out a bit what you mean by > "debugging" here. Since you mentioned it in conjunction with the two > words "administrators" and "scripts", I assume that you are not talking > about kernel code debugging? I'm talking about debugging for both admins and kernel people... With trace points and systemtap you can do both. steved. ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <49786F9F.7030400-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <49786F9F.7030400-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org> @ 2009-01-22 15:30 ` Trond Myklebust 2009-01-22 15:49 ` Steve Dickson 0 siblings, 1 reply; 58+ messages in thread From: Trond Myklebust @ 2009-01-22 15:30 UTC (permalink / raw) To: Steve Dickson Cc: Chuck Lever, Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP On Thu, 2009-01-22 at 08:07 -0500, Steve Dickson wrote: > > I think it might be a good idea to flesh out a bit what you mean by > > "debugging" here. Since you mentioned it in conjunction with the two > > words "administrators" and "scripts", I assume that you are not talking > > about kernel code debugging? > I'm talking debugging for both admins and kernel people... > > With trace points and systemtap you can do both. Yes, but we still need to figure out details of what each type of user is expecting/wants. I suspect that when we get down to cases, we will find that a lot of the tracepoints that administrators find useful will want to be put in the VFS rather than in the filesystems themselves. If you have tracepoints in sys_stat() and sys_fstat(), why would you also need a tracepoint in nfs_getattr()? AFAICS, that would just make scripting ugly... Cheers Trond ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-22 15:30 ` Trond Myklebust @ 2009-01-22 15:49 ` Steve Dickson 2009-01-22 17:47 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 58+ messages in thread From: Steve Dickson @ 2009-01-22 15:49 UTC (permalink / raw) To: Trond Myklebust Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Trond Myklebust wrote: > On Thu, 2009-01-22 at 08:07 -0500, Steve Dickson wrote: >>> I think it might be a good idea to flesh out a bit what you mean by >>> "debugging" here. Since you mentioned it in conjunction with the two >>> words "administrators" and "scripts", I assume that you are not talking >>> about kernel code debugging? >> I'm talking about debugging for both admins and kernel people... >> >> With trace points and systemtap you can do both. > > Yes, but we still need to figure out details of what each type of user > is expecting/wants. Will that be possible? Until we get something in their hands, how will they even know what they want or need? > > I suspect that when we get down to cases, we will find that a lot of the > tracepoints that administrators find useful will want to be put in the > VFS rather than in the filesystems themselves. If you have tracepoints > in sys_stat() and sys_fstat(), why would you also need a tracepoint in > nfs_getattr()? AFAICS, that would just make scripting ugly... I believe there is an effort to do some type of system call tracing as we speak, and that effort should be followed to ensure there is no duplication of effort... But in the end, I would think more would be better than less... since each point can be explicitly enabled and disabled, a little duplication may not be all that bad.. steved. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-22 15:49 ` Steve Dickson @ 2009-01-22 17:47 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 58+ messages in thread From: Arnaldo Carvalho de Melo @ 2009-01-22 17:47 UTC (permalink / raw) To: Steve Dickson Cc: Trond Myklebust, Chuck Lever, Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP On Thu, Jan 22, 2009 at 10:49:43AM -0500, Steve Dickson wrote: > > > Trond Myklebust wrote: > > On Thu, 2009-01-22 at 08:07 -0500, Steve Dickson wrote: > >>> I think it might be a good idea to flesh out a bit what you mean by > >>> "debugging" here. Since you mentioned it in conjunction with the two > >>> words "administrators" and "scripts", I assume that you are not talking > >>> about kernel code debugging? > >> I'm talking about debugging for both admins and kernel people... > >> > >> With trace points and systemtap you can do both. > > > > Yes, but we still need to figure out details of what each type of user > > is expecting/wants. > Will that be possible? Until we get something in their hands, how will > they even know what they want or need? > > > > > I suspect that when we get down to cases, we will find that a lot of the > > tracepoints that administrators find useful will want to be put in the > > VFS rather than in the filesystems themselves. If you have tracepoints > > in sys_stat() and sys_fstat(), why would you also need a tracepoint in > > nfs_getattr()? AFAICS, that would just make scripting ugly... > I believe there is an effort to do some type of system call tracing > as we speak, and that effort should be followed to ensure there is > no duplication of effort... But in the end, I would think more would be > better than less... since each point can be explicitly enabled and disabled, > a little duplication may not be all that bad..
I believe that the use-case scenario here is blktrace: it was done before tracepoints were in the kernel, back when developers thought nothing of requiring a userspace binary of whatever size/complexity to grok whatever fast/optimum way was used to relay data, so that some userspace tool could intelligently process it. Now we're trying to keep the existing developer assists and augment them with other, standardized ways coming from other subsystems, looking at how to get the best out of every trace-me-harder approach previously endeavoured. tracepoints, ftrace, etc, are an excellent opportunity for people to go from stone-age print-msg-that-could-be-interesting-to-some-class-of-observer to something OS-provided and highly optimized. <expletives deleted> Erm, what is it that you think should be covered, from a tracing perspective, by everybody in this community? - Arnaldo ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 18:01 ` Chuck Lever 2009-01-21 19:29 ` Trond Myklebust @ 2009-01-21 19:37 ` Steve Dickson 2009-01-21 20:19 ` Chuck Lever 2009-01-21 21:26 ` Greg Banks 2 siblings, 1 reply; 58+ messages in thread From: Steve Dickson @ 2009-01-21 19:37 UTC (permalink / raw) To: Chuck Lever; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Chuck Lever wrote: > Hey Steve- > > On Jan 21, 2009, at 12:13 PM, Steve Dickson wrote: >> Sorry for the delayed response... That darn flux capacitor broke >> again! ;-) >> >> >> Chuck Lever wrote: >>> >>> I'm all for improving the observability of the NFS client. >> Well, in theory, trace points will also touch the server and all >> of the rpc code... >> >>> >>> But I don't (yet) see the advantage of adding this complexity in the >>> mount path. Maybe the more complex and asynchronous parts of the NFS >>> client, like the cached read and write paths, are more suitable to this >>> type of tool. >> Well, the complexity is, at this point, due to how the trace points >> are tied to and used by SystemTap. I'm hopeful this complexity >> will die down as time goes on... > > I understand that your proposed mount path changes were attempting to > provide a simple example of using trace points that could be applied to > the NFS client and server in general. Very true... It's definitely just a template... If/when we agree to a format of the template, I would like to simply clone it through the rest of the code. > However I'm interested mostly in improving how the mount path in > specific reports problems. I'm not convinced that trace points (or our > current dprintk, for that matter) are a useful approach to solving NFS > mount issues, in specific.
> > But that introduces the general question of whether trace points, > dprintk, network tracing, or something else is the most appropriate tool > to address the most common troubleshooting problems in any particular > area of the NFS client or server. I'd also like some clarity on what > our problem statement is here. What problems are we trying to address? The problem I'm trying to address is allowing admins to debug (or decipher) NFS problems on production systems in a very non-intrusive way. Meaning having no ill effects on performance or stability when the trace points are enabled. > >>> Why can't we simply improve the information content of the dprintks? >> The theory is trace points can be turned on, in production kernels, with >> little or no performance issues... > > mount isn't a performance path, which is one reason I think trace points > might be overkill for this case. Maybe so, but again, it was one of the easier paths to convert. Would it be more palatable if I converted the I/O paths? > >>> Can you give a few real examples of problems that these new trace points >>> can identify that better dprintks wouldn't be able to address? >> They can supply more information that can be used by both a kernel >> guy and an IT guy.... Meaning they can supply detailed structure >> information >> that a kernel guy would need as well as supplying the simple error code >> that an IT guy would be interested in. > > My point is, does that flexibility really help some poor admin who is > trying to diagnose a mount problem? Is it going to reduce the number of > calls to your support desk? I think so... Once admins learn what is available and how to use it, they will be able to file better, more concise bug reports. So maybe there may not be a decrease in calls, but each caller (potentially) will supply the support desk with better information. > > I'd like to see an example of a real mount problem or two that dprintk > isn't adequate for, but a trace point could have helped.
In other > words, can we get some use cases for dprintk and trace points for mount > problems in specific? I think that would help us understand the > trade-offs a little better. In the mount path that might be a bit difficult... but with trace points you would be able to look at the entire super block or entire server and client structures, something you can't do with static/canned printks... > > Some general use cases for trace points might also widen our dialog > about where they are appropriate to use. I'm not at all arguing against > using trace points in general, but I would like to see some thinking > about whether they are the most appropriate tool for each of the many > troubleshooting jobs we have. I/O paths jump into my head... since trace points are much less of a performance killer than printks, the I/O path might be an appropriate use... > >>> Generally, what kind of problems do admins face that the dprintks don't >>> handle today, and what are the alternatives to addressing those issues? >> Not being an admin guy, I really don't have an answer for this... but >> I can say since trace points are not as much of a drag on the system as >> printks are.. with timing issues, using trace points would be a big >> advantage >> over printks > > I like the idea of not depending on the system log, and that's > appropriate for performance hot paths and asynchronous paths where > timing can be an issue. That's one reason why I created the NFS and RPC > performance metrics facility. Which is totally being underutilized... IMHO... I can see a combination of using both.... Using the metrics to identify a problem and then using trace points to solve the problem... > > But mount is not a performance path, and is synchronous, more or less. > In addition, mount encounters problems much more frequently than the > read or write path, because mount depends a lot on what options are > selected and the network environment it's running in.
It's the first > thing to try contacting the server, as well, so it "shakes out" a lot of > problems before a read or write is even done. > > So something like dprintk or trace points or a network trace that have > some set-up overhead might be less appropriate for mount than, say, > beefing up the error reporting framework in the mount path, just as an > example. Trace points by far have much, much less overhead than printks... that's one of their major advantages... > >>> Do admins who run enterprise kernels actually use SystemTap, or do they >>> fall back on network traces and other tried and true troubleshooting >>> methodologies? >> Currently, to run systemtap, one needs kernel debug info and kernel >> developer >> info installed on the system. Most production systems don't install >> those types >> of packages.... But with trace points those types of packages will no >> longer be >> needed, so I could definitely see admins using systemtap once it's >> available... >> Look at DTrace... people are using that now that it's available and >> fairly stable. >> >>> If we think the mount path needs such instrumentation, consider updating >>> fs/nfs/mount_clnt.c and net/sunrpc/rpcb_clnt.c as well. >>> >> I was just following what was currently being debugged when >> 'rpcinfo -m nfs -s mount' was set... > > `rpcdebug -m nfs -s mount` also enables the dprintks in > fs/nfs/mount_clnt.c, at least. As with most dprintk infrastructure in > NFS, it's really aimed at developers and not end users or admins. The > rpcbind client is also an integral part of the mount process, so I > suggested that too. > ACK... steved. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 19:37 ` Steve Dickson @ 2009-01-21 20:19 ` Chuck Lever 2009-01-21 22:36 ` Greg Banks 2009-01-22 13:55 ` Steve Dickson 0 siblings, 2 replies; 58+ messages in thread From: Chuck Lever @ 2009-01-21 20:19 UTC (permalink / raw) To: Steve Dickson; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP On Jan 21, 2009, at 2:37 PM, Steve Dickson wrote: > Chuck Lever wrote: >> Hey Steve- >> >> I'd like to see an example of a real mount problem or two that >> dprintk >> isn't adequate for, but a trace point could have helped. In other >> words, can we get some use cases for dprintk and trace points for >> mount >> problems in specific? I think that would help us understand the >> trade-offs a little better. > In the mount path that might be a bit difficult... but with trace > points you would be able to look at the entire super block or entire > server and client structures, something you can't do with static/canned > printks... I've never ever seen an NFS mount problem that required an admin to provide information from a superblock. That seems like a lot of implementation detail that would be meaningless to admins and support desk folks. This is why I think we need to have some real world customer examples of mount problems (or read performance problems, or whatever) that we want to be able to diagnose in enterprise distributions. I'm not saying this to throw up a road block... I think we really need to understand the problem before designing the solution, and so let's start with some practical examples. Again, I'm not saying trace points are bad or wrong, just that they may not be appropriate for a particular code path and the type of problems that arise during specific NFS operations. I'm not criticizing your particular sample code. I'm asking "Before we add trace points everywhere, are trace points strategically the right debugging tool in every case?"
Basically we have to know well in advance what kind of information will be needed at each trace point. Who can predict? If you have to solder in trace points in advance, in some ways that doesn't seem any more flexible than a dprintk. What you've demonstrated is another good general tool for debugging, but you haven't convinced me that this is the right tool for, say, the mount path, or ACL support, and so on. >> But mount is not a performance path, and is synchronous, more or >> less. >> In addition, mount encounters problems much more frequently than the >> read or write path, because mount depends a lot on what options are >> selected and the network environment it's running in. It's the first >> thing to try contacting the server, as well, so it "shakes out" a >> lot of >> problems before a read or write is even done. >> >> So something like dprintk or trace points or a network trace that >> have >> some set-up overhead might be less appropriate for mount than, say, >> beefing up the error reporting framework in the mount path, just as >> an >> example. > Trace points by far have much, much less overhead than printks... that's > one of their major advantages... Yeah, but that doesn't matter in some cases, like mount, or asynchronous file deletes, or .... so we have to look at some of the other issues with using them when deciding if they are the right tool for the job. I think we need to visit this issue on a case-by-case basis. Sometimes dprintk is appropriate. Sometimes printk(KERN_ERR). Sometimes a performance metric. Having specific troubleshooting in mind when we design this is critical, otherwise we are going to add a lot of cruft for no real benefit. That's an advantage of something like SystemTap. You can specify whatever is needed for a specific problem, and you don't need to recompile the kernel to do it. Enterprise distributions can provide specific scripts for their code base, which doesn't change much.
Upstream is free to make whatever drastic modifications to the code base without worrying about breaking a kernel-user space API. Trond has always maintained that dprintk() is best for developers, but probably inappropriate for field debugging, and I think that may also apply to trace points. So I'm not against adding trace points where appropriate, but I'm doubtful that they will be helpful outside of kernel development; ie I wonder if they will specifically help customers of enterprise distributions. -- Chuck Lever chuck[dot]lever[at]oracle[dot]com ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 20:19 ` Chuck Lever @ 2009-01-21 22:36 ` Greg Banks [not found] ` <4977A385.8000406-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> 2009-01-22 13:55 ` Steve Dickson 1 sibling, 1 reply; 58+ messages in thread From: Greg Banks @ 2009-01-21 22:36 UTC (permalink / raw) To: Chuck Lever; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Chuck Lever wrote: > > > I think we need to visit this issue on a case-by-case basis. > Sometimes dprintk is appropriate. Sometimes printk(KERN_ERR). > Sometimes a performance metric. Well said. > Trond has always maintained that dprintk() is best for developers, but > probably inappropriate for field debugging, It's not a perfect tool but it beats nothing at all. > and I think that may also > apply to trace points. It depends on whether distros can be convinced to enable it by default, and install by default any necessary userspace infrastructure. The most important thing for field debugging is Just Knowing that you have all the bits necessary to perform useful debugging without having to find some RPM that matches the kernel that the machine is actually running now, and not the one that was present when the machine was installed. -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <4977A385.8000406-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> @ 2009-01-21 22:47 ` Arnaldo Carvalho de Melo 2009-01-21 22:57 ` Trond Myklebust 2009-01-21 22:56 ` Trond Myklebust 2009-01-21 22:56 ` J. Bruce Fields 2 siblings, 1 reply; 58+ messages in thread From: Arnaldo Carvalho de Melo @ 2009-01-21 22:47 UTC (permalink / raw) To: Greg Banks Cc: Chuck Lever, Steve Dickson, Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Em Thu, Jan 22, 2009 at 09:36:53AM +1100, Greg Banks escreveu: > Chuck Lever wrote: > > > > > > I think we need to visit this issue on a case-by-case basis. > > Sometimes dprintk is appropriate. Sometimes printk(KERN_ERR). > > Sometimes a performance metric. > Well said. > > > Trond has always maintained that dprintk() is best for developers, but > > probably inappropriate for field debugging, > It's not a perfect tool but it beats nothing at all. > > and I think that may also > > apply to trace points. > It depends on whether distros can be convinced to enable it by default, > and install by default any necessary userspace infrastructure. The > most important thing for field debugging is Just Knowing that you have > all the bits necessary to perform useful debugging without having to > find some RPM that matches the kernel that the machine is actually > running now, and not the one that was present when the machine was > installed. Exactly, that is why an ftrace plugin makes sense: only when selected using echo "nfs" > /debug/tracing/current_tracer will it activate the tracepoints and provide output via /debug/tracing/trace or /debug/tracing/trace_pipe, possibly combined with other ftrace plugins such as the stacktrace, blktrace, etc. I.e. no need at all for any matching userspace tool, near-zero impact when not activated, and, if done right, useful for both developers and admins. 
Again, an example can be found in the blktrace ftrace plugin[1], that instead of adding a requirement will eventually drop an existing, well established one (blktrace(8)). - Arnaldo [1] http://lkml.org/lkml/2009/1/20/190 ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 22:47 ` Arnaldo Carvalho de Melo @ 2009-01-21 22:57 ` Trond Myklebust 2009-01-21 23:06 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 58+ messages in thread From: Trond Myklebust @ 2009-01-21 22:57 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Linux NFS Mailing list, SystemTAP, Linux NFSv4 mailing list, Greg Banks On Wed, 2009-01-21 at 20:47 -0200, Arnaldo Carvalho de Melo wrote: > Em Thu, Jan 22, 2009 at 09:36:53AM +1100, Greg Banks escreveu: > > Chuck Lever wrote: > > > > > > > > > I think we need to visit this issue on a case-by-case basis. > > > Sometimes dprintk is appropriate. Sometimes printk(KERN_ERR). > > > Sometimes a performance metric. > > Well said. > > > > > Trond has always maintained that dprintk() is best for developers, but > > > probably inappropriate for field debugging, > > It's not a perfect tool but it beats nothing at all. > > > and I think that may also > > > apply to trace points. > > It depends on whether distros can be convinced to enable it by default, > > and install by default any necessary userspace infrastructure. The > > most important thing for field debugging is Just Knowing that you have > > all the bits necessary to perform useful debugging without having to > > find some RPM that matches the kernel that the machine is actually > > running now, and not the one that was present when the machine was > > installed. > > Exactly, that is why an ftrace plugin, that only when selected using > echo "nfs" > /debug/tracing/current_tracer will activate the tracepoints > and provide output via /debug/tracing/trace or /deb/tracing/trace_pipe, > possibly combined with other ftrace plugins such as the stacktrace, > blktrace, etc. > > I.e. no need at all for any matching userspace tool, near zero impact > when not activated, useful, if done right, for both developers and for > admins. 
> > Again, an example can be found in the blktrace ftrace plugin[1], that > instead of adding a requirement will eventually drop an existing, well > established one (blktrace(8)). I must be missing something. Exactly what functionality does this then give us that we don't have already with the existing RPC/NFS dprintk() scheme? Trond ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 22:57 ` Trond Myklebust @ 2009-01-21 23:06 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 58+ messages in thread From: Arnaldo Carvalho de Melo @ 2009-01-21 23:06 UTC (permalink / raw) To: Trond Myklebust Cc: Arnaldo Carvalho de Melo, Greg Banks, Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Em Wed, Jan 21, 2009 at 05:57:46PM -0500, Trond Myklebust escreveu: > On Wed, 2009-01-21 at 20:47 -0200, Arnaldo Carvalho de Melo wrote: > > Em Thu, Jan 22, 2009 at 09:36:53AM +1100, Greg Banks escreveu: > > > Chuck Lever wrote: > > > > > > > > > > > > I think we need to visit this issue on a case-by-case basis. > > > > Sometimes dprintk is appropriate. Sometimes printk(KERN_ERR). > > > > Sometimes a performance metric. > > > Well said. > > > > > > > Trond has always maintained that dprintk() is best for developers, but > > > > probably inappropriate for field debugging, > > > It's not a perfect tool but it beats nothing at all. > > > > and I think that may also > > > > apply to trace points. > > > It depends on whether distros can be convinced to enable it by default, > > > and install by default any necessary userspace infrastructure. The > > > most important thing for field debugging is Just Knowing that you have > > > all the bits necessary to perform useful debugging without having to > > > find some RPM that matches the kernel that the machine is actually > > > running now, and not the one that was present when the machine was > > > installed. > > > > Exactly, that is why an ftrace plugin, that only when selected using > > echo "nfs" > /debug/tracing/current_tracer will activate the tracepoints > > and provide output via /debug/tracing/trace or /deb/tracing/trace_pipe, > > possibly combined with other ftrace plugins such as the stacktrace, > > blktrace, etc. > > > > I.e. 
no need at all for any matching userspace tool, near zero impact > > when not activated, useful, if done right, for both developers and for > > admins. > > > > Again, an example can be found in the blktrace ftrace plugin[1], that > > instead of adding a requirement will eventually drop an existing, well > > established one (blktrace(8)). > > I must be missing something. Exactly what functionality does this then > give us that we don't have already with the existing RPC/NFS dprintk() > scheme? Filtering by CPU, the possibility of printing stack traces when the tracepoints are hit, and combination with other tracers to try to mix events and correlate problems: a request may be taking too long because some problems are happening on a lower-layer subsystem that has, in turn, tracepoints exposed through another ftrace plugin. But then I haven't looked too closely at the places that are being proposed for conversion to tracepoints in NFS land, will do. - Arnaldo ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <4977A385.8000406-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> 2009-01-21 22:47 ` Arnaldo Carvalho de Melo @ 2009-01-21 22:56 ` Trond Myklebust 2009-01-21 23:11 ` Greg Banks 2009-01-21 22:56 ` J. Bruce Fields 2 siblings, 1 reply; 58+ messages in thread From: Trond Myklebust @ 2009-01-21 22:56 UTC (permalink / raw) To: Greg Banks Cc: Chuck Lever, Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP On Thu, 2009-01-22 at 09:36 +1100, Greg Banks wrote: > Chuck Lever wrote: > > > > > > I think we need to visit this issue on a case-by-case basis. > > Sometimes dprintk is appropriate. Sometimes printk(KERN_ERR). > > Sometimes a performance metric. > Well said. > > > Trond has always maintained that dprintk() is best for developers, but > > probably inappropriate for field debugging, > It's not a perfect tool but it beats nothing at all. > > and I think that may also > > apply to trace points. > It depends on whether distros can be convinced to enable it by default, > and install by default any necessary userspace infrastructure. The > most important thing for field debugging is Just Knowing that you have > all the bits necessary to perform useful debugging without having to > find some RPM that matches the kernel that the machine is actually > running now, and not the one that was present when the machine was > installed. Which is precisely why dprintk() is such a bad choice as a basis for a set of trace points: every new patch and bugfix that the distro applies will result in a reshuffling of the trace points as code is cleaned up and moved around or removed entirely. Cheers Trond ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 22:56 ` Trond Myklebust @ 2009-01-21 23:11 ` Greg Banks 2009-01-21 23:47 ` Trond Myklebust 0 siblings, 1 reply; 58+ messages in thread From: Greg Banks @ 2009-01-21 23:11 UTC (permalink / raw) To: Trond Myklebust Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Trond Myklebust wrote: > On Thu, 2009-01-22 at 09:36 +1100, Greg Banks wrote: > >> Chuck Lever wrote: >> >>> >>> >> It depends on whether distros can be convinced to enable it by default, >> and install by default any necessary userspace infrastructure. The >> most important thing for field debugging is Just Knowing that you have >> all the bits necessary to perform useful debugging without having to >> find some RPM that matches the kernel that the machine is actually >> running now, and not the one that was present when the machine was >> installed. >> > > Which is precisely why dprintk() is such a bad choice as a basis for a > set of trace points: every new patch and bugfix that the distro applies > will result in a reshuffling of the trace points as code is cleaned up > and moved around or removed entirely. > Yes, if the filename and line number were the only information going out. The dprintk() format is usually enough (ignoring the patchy quality of the current dprintk set) to give a developer enough clue about which dprintk is which. Or am I missing something? -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 23:11 ` Greg Banks @ 2009-01-21 23:47 ` Trond Myklebust 2009-01-22 0:53 ` Frank Ch. Eigler 2009-01-22 2:04 ` Greg Banks 0 siblings, 2 replies; 58+ messages in thread From: Trond Myklebust @ 2009-01-21 23:47 UTC (permalink / raw) To: Greg Banks; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP On Thu, 2009-01-22 at 10:11 +1100, Greg Banks wrote: > Trond Myklebust wrote: > > On Thu, 2009-01-22 at 09:36 +1100, Greg Banks wrote: > > > >> Chuck Lever wrote: > >> > >>> > >>> > >> It depends on whether distros can be convinced to enable it by default, > >> and install by default any necessary userspace infrastructure. The > >> most important thing for field debugging is Just Knowing that you have > >> all the bits necessary to perform useful debugging without having to > >> find some RPM that matches the kernel that the machine is actually > >> running now, and not the one that was present when the machine was > >> installed. > >> > > > > Which is precisely why dprintk() is such a bad choice as a basis for a > > set of trace points: every new patch and bugfix that the distro applies > > will result in a reshuffling of the trace points as code is cleaned up > > and moved around or removed entirely. > > > Yes, if the filename and line number were the only information going > out. The dprintk() format is usually enough (ignoring the patchy > quality of the current dprintk set) to give a developer enough clue > about which dprintk is which. Or am I missing something? The current dprintk() set was never designed to be anything other than a logging tool with a very coarse filter (the bitmask in /proc/sys/sunrpc/*_debug). It was designed to be human-readable only (no fixed format). 
As I understand it, you are not only proposing to make that filter extremely fine (individually addressable trace points), but also to enable the application of scripting tools like systemtap and LTTng in order to provide bespoke debugging of your customer problems. Have I misunderstood you, or is that correct? The question then is how is this going to work out in an environment where the individually addressable trace points/dprintk()s pop in and out of existence at the whim of a patch, and where the output format is similarly volatile? IOW: I'm referring to the difference between an interface that was designed purely to be interpreted by humans, and one that is designed from scratch to be interpreted by scripts. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 23:47 ` Trond Myklebust @ 2009-01-22 0:53 ` Frank Ch. Eigler 2009-01-22 2:04 ` Greg Banks 1 sibling, 0 replies; 58+ messages in thread From: Frank Ch. Eigler @ 2009-01-22 0:53 UTC (permalink / raw) To: Trond Myklebust; +Cc: Greg Banks, nfsv4, Linux NFS Mailing list, SystemTAP Trond Myklebust <trond.myklebust@fys.uio.no> writes: > [...] As I understand it, you are not only proposing to make that > filter extremely fine (individually addressable trace points), but > also to enable the application of scripting tools like systemtap and > LTTng in order to provide bespoke debugging of your customer > problems. Have I misunderstood you, or is that correct? Perhaps. > The question then is how is this going to work out in an environment > where the individually addressable trace points/dprintk()s pop in > and out of existence at the whim of a patch, and where the output > format is similarly volatile? It would work no worse than what there is now. For environments where the code is not subject to that much patching, it could be piggybacked-upon for more analysis. > IOW: I'm referring to the difference between an interface that was > designed purely to be interpreted by humans, and one that is > designed from scratch to be interpreted by scripts. It need not be a disjunction. As more formal machine-oriented interfaces come into existence, the same tools can shift focus to them. Depending on the tool, the shift may be nearly invisible to a naive end user. - FChE ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 23:47 ` Trond Myklebust 2009-01-22 0:53 ` Frank Ch. Eigler @ 2009-01-22 2:04 ` Greg Banks [not found] ` <4977D431.1020906-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> 1 sibling, 1 reply; 58+ messages in thread From: Greg Banks @ 2009-01-22 2:04 UTC (permalink / raw) To: Trond Myklebust Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Trond Myklebust wrote: > On Thu, 2009-01-22 at 10:11 +1100, Greg Banks wrote: > >> Trond Myklebust wrote: >> >>> On Thu, 2009-01-22 at 09:36 +1100, Greg Banks wrote: >>> >>> >>>> Chuck Lever wrote: >>>> >>>> >>>>> >>>>> >>>> It depends on whether distros can be convinced to enable it by default, >>>> and install by default any necessary userspace infrastructure. The >>>> most important thing for field debugging is Just Knowing that you have >>>> all the bits necessary to perform useful debugging without having to >>>> find some RPM that matches the kernel that the machine is actually >>>> running now, and not the one that was present when the machine was >>>> installed. >>>> >>>> >>> Which is precisely why dprintk() is such a bad choice as a basis for a >>> set of trace points: every new patch and bugfix that the distro applies >>> will result in a reshuffling of the trace points as code is cleaned up >>> and moved around or removed entirely. >>> >>> >> Yes, if the filename and line number were the only information going >> out. The dprintk() format is usually enough (ignoring the patchy >> quality of the current dprintk set) to give a developer enough clue >> about which dprintk is which. Or am I missing something? >> > > The current dprintk() set was never designed to be anything other than a > logging tool with a very coarse filter (the bitmask > in /proc/sys/sunrpc/*_debug). It was designed to be human-readable only > (no fixed format). 
> > As I understand it, you are not only proposing to make that filter > extremely fine (individually addressable trace points), but also to > enable the application of scripting tools like systemtap and LTTng in > order to provide bespoke debugging of your customer problems. Have I > misunderstood you, or is that correct? > These are two separate proposals between which we're trying to find some commonality. In my proposal, the dprintk()s remain designed primarily for humans (support staff or kernel developers) to read in conjunction with the correct source code, but control is made fine-grain to make the mechanism more controllable. This can be done regardless of whether trace points are involved and regardless of whether we attempt to support scripts. Changing dprintk() to add a trace point is just a way to get some trace points with strictly minimum changes to callsites. Replacing dprintk()s with new trace points has more or less the same result but means more futzing with callsites. > The question then is how is this going to work out in an environment > where the individually addressable trace points/dprintk()s pop in and > out of existence at the whim of a patch, and where the output format is > similarly volatile? > IOW: I'm referring to the difference between an interface that was > designed purely to be interpreted by humans, and one that is designed > from scratch to be interpreted by scripts. > > The maintenance problem of correlating any kind of instrumentation point in kernel code with scripts living out in userspace exists regardless of how you choose to implement the instrumentation. -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <4977D431.1020906-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> @ 2009-01-22 15:27 ` Steve Dickson [not found] ` <49789073.1080200-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 58+ messages in thread From: Steve Dickson @ 2009-01-22 15:27 UTC (permalink / raw) To: Greg Banks Cc: Trond Myklebust, Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Greg Banks wrote: > > These are two separate proposals between which we're trying to find some > commonality. I agree.. a common effort does make the most sense... imho... > > In my proposal, the dprintk()s remain designed primarily for humans > (support staff or kernel developers) to read in conjunction with the > correct source code, but control is made fine-grain to make the > mechanism more controllable. This can be done regardless of whether > trace points are involved and regardless of whether we attempt to > support scripts. I would think the "fine-grain control mechanism" could also manage trace points as well, true? > > Changing dprintk() to add a trace point is just a way to get some trace > points with strictly minimum changes to callsites. I'm not sure this would be a good idea since, as Trond or Chuck pointed out, there is really no rhyme or reason to where the dprintks live today... > > Replacing dprintk()s with new trace points has more or less the same > result but means more futzing with callsites. Well, not if the replacements are well thought out and designed, since there does not have to be a 1-to-1 replacement... > > The maintenance problem of correlating any kind of instrumentation point > in kernel code with scripts living out in userspace exists regardless of > how you choose to implement the instrumentation. > +1 steved. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <49789073.1080200-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org> @ 2009-01-22 22:43 ` Greg Banks 0 siblings, 0 replies; 58+ messages in thread From: Greg Banks @ 2009-01-22 22:43 UTC (permalink / raw) To: Steve Dickson; +Cc: Linux NFS Mailing list, SystemTAP, Linux NFSv4 mailing list Steve Dickson wrote: > Greg Banks wrote: > >> In my proposal, the dprintk()s remain designed primarily for humans >> (support staff or kernel developers) to read in conjunction with the >> correct source code, but control is made fine-grain to make the >> mechanism more controllable. This can be done regardless of whether >> trace points are involved and regardless of whether we attempt to >> support scripts. >> > I would think it the "fine-grain control mechanism" could also manage > trace points as well, true? > Yes, and it would have the same value. When I get back from LCA next week I want to look at how one would fit that control mechanism into trace points. > >> Changing dprintk() to add a trace point is just a way to get some trace >> points with strictly minimum changes to callsites. >> > I'm not sure this would be a good idea since, as Trond or Chuck pointed out, > there is really not rhyme or reason on where the dprintks live today... > Indeed, but it would be a starting point. > >> Replacing dprintk()s with new trace points has more or less the same >> result but means more futzing with callsites. >> > Well not if the replacements are well thought out and designed, since > there does not have 1-to-1 replacement... > I would hope that there would be *more* tracepoints than dprintks. -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <4977A385.8000406-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> 2009-01-21 22:47 ` Arnaldo Carvalho de Melo 2009-01-21 22:56 ` Trond Myklebust @ 2009-01-21 22:56 ` J. Bruce Fields 2009-01-21 23:05 ` Muntz, Daniel ` (2 more replies) 2 siblings, 3 replies; 58+ messages in thread From: J. Bruce Fields @ 2009-01-21 22:56 UTC (permalink / raw) To: Greg Banks Cc: Chuck Lever, Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP On Thu, Jan 22, 2009 at 09:36:53AM +1100, Greg Banks wrote: > Chuck Lever wrote: > > > > > > I think we need to visit this issue on a case-by-case basis. > > Sometimes dprintk is appropriate. Sometimes printk(KERN_ERR). > > Sometimes a performance metric. > Well said. > > > Trond has always maintained that dprintk() is best for developers, but > > probably inappropriate for field debugging, > It's not a perfect tool but it beats nothing at all. > > and I think that may also > > apply to trace points. > It depends on whether distros can be convinced to enable it by default, > and install by default any necessary userspace infrastructure. The > most important thing for field debugging is Just Knowing that you have > all the bits necessary to perform useful debugging without having to > find some RPM that matches the kernel that the machine is actually > running now, and not the one that was present when the machine was > installed. On the mount case specifically: How far are we from the idea of a mount program that can identify most problems itself? I know its error reporting has gotten better.... I suppose the main feedback mount gets right now is an error code from the mount system call, and that may be too narrow an interface to cover most problems. Is there some way we can give mount a real interface it can use to find out this stuff instead of just dumping more strings into the logs? 
My main obstacle to judging a solution is that I don't have in mind a good list of (say) the top 10 problems that can cause the first mount to fail. Hm: - dns lookup of the server fails - server isn't reachable - server isn't running nfs - requested path isn't known to server or isn't exported - export is there, but requires more security - user doesn't have gss credentials - file permissions on the export are wrong ... --b. ^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 22:56 ` J. Bruce Fields @ 2009-01-21 23:05 ` Muntz, Daniel 2009-01-22 15:59 ` Steve Dickson 2009-01-23 18:17 ` Chuck Lever 2 siblings, 0 replies; 58+ messages in thread From: Muntz, Daniel @ 2009-01-21 23:05 UTC (permalink / raw) To: J. Bruce Fields, Greg Banks Cc: Chuck Lever, Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP My favorite: when you try a Kerberos mount and one of the kernel modules required for this isn't loaded, mount gets ENOMEM and says "out of memory" -Dan -----Original Message----- From: J. Bruce Fields [mailto:bfields@fieldses.org] Sent: Wednesday, January 21, 2009 2:56 PM To: Greg Banks Cc: Chuck Lever; Linux NFS Mailing list; Linux NFSv4 mailing list; SystemTAP Subject: Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path On Thu, Jan 22, 2009 at 09:36:53AM +1100, Greg Banks wrote: > Chuck Lever wrote: > > > > > > I think we need to visit this issue on a case-by-case basis. > > Sometimes dprintk is appropriate. Sometimes printk(KERN_ERR). > > Sometimes a performance metric. > Well said. > > > Trond has always maintained that dprintk() is best for developers, > > but probably inappropriate for field debugging, > It's not a perfect tool but it beats nothing at all. > > and I think that may also > > apply to trace points. > It depends on whether distros can be convinced to enable it by default, > and install by default any necessary userspace infrastructure. The > most important thing for field debugging is Just Knowing that you have > all the bits necessary to perform useful debugging without having to > find some RPM that matches the kernel that the machine is actually > running now, and not the one that was present when the machine was > installed. On the mount case specifically: How far are we from the idea of a mount program that can identify most problems itself? I know its error reporting has gotten better.... 
I suppose the main feedback mount gets right now is an error code from the mount system call, and that may be too narrow an interface to cover most problems. Is there some way we can give mount a real interface it can use to find out this stuff instead of just dumping more strings into the logs? My main obstacle to judging a solution is that I don't have in mind a good list of (say) the top 10 problems that can cause the first mount to fail. Hm: - dns lookup of the server fails - server isn't reachable - server isn't running nfs - requested path isn't known to server or isn't exported - export is there, but requires more security - user doesn't have gss credentials - file permissions on the export are wrong ... --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 22:56 ` J. Bruce Fields 2009-01-21 23:05 ` Muntz, Daniel @ 2009-01-22 15:59 ` Steve Dickson 2009-01-22 16:45 ` J. Bruce Fields 2009-01-23 18:17 ` Chuck Lever 2 siblings, 1 reply; 58+ messages in thread From: Steve Dickson @ 2009-01-22 15:59 UTC (permalink / raw) To: J. Bruce Fields Cc: Greg Banks, Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP J. Bruce Fields wrote: > > On the mount case specifically: How far are we from the idea of a mount > program that can identify most problems itself? I know its error > reporting has gotten better.... > > I suppose the main feedback mount gets right now is an error code from > the mount system call, and that may be too narrow an interface to cover > most problems. Is there some way we can give mount a real interface it > can use to find out this stuff instead of just dumping more strings into > the logs? Interesting.... Store something like a reason code (similar to what they have in Kerberos) somewhere in the proc file system? It seems to me this is a common problem among network file systems... steved. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-22 15:59 ` Steve Dickson @ 2009-01-22 16:45 ` J. Bruce Fields 2009-01-22 22:54 ` Greg Banks 0 siblings, 1 reply; 58+ messages in thread From: J. Bruce Fields @ 2009-01-22 16:45 UTC (permalink / raw) To: Steve Dickson Cc: Linux NFS Mailing list, SystemTAP, Linux NFSv4 mailing list, Greg Banks On Thu, Jan 22, 2009 at 10:59:49AM -0500, Steve Dickson wrote: > J. Bruce Fields wrote: > > > > On the mount case specifically: How far are we from the idea of a mount > > program that can identify most problems itself? I know its error > > reporting has gotten better.... > > > > I suppose the main feedback mount gets right now is an error code from > > the mount system call, and that may be too narrow an interface to cover > > most problems. Is there some way we can give mount a real interface it > > can use to find out this stuff instead of just dumping more strings into > > the logs? > Interesting.... Store something like a reason code (similar to what they have > in he Kerberos) Maybe. > in somewhere in the proc file system? But then I don't know how you'd associate it with a particular mount attempt. --b. > Its seems to me this is a common problem among network file systems... ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-22 16:45 ` J. Bruce Fields @ 2009-01-22 22:54 ` Greg Banks [not found] ` <4978F91C.4090208-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> 0 siblings, 1 reply; 58+ messages in thread From: Greg Banks @ 2009-01-22 22:54 UTC (permalink / raw) To: J. Bruce Fields Cc: Steve Dickson, Linux NFS Mailing list, SystemTAP, Linux NFSv4 mailing list J. Bruce Fields wrote: > On Thu, Jan 22, 2009 at 10:59:49AM -0500, Steve Dickson wrote: > >> J. Bruce Fields wrote: >> >> >> Interesting.... Store something like a reason code (similar to what they have >> in he Kerberos) >> > > Maybe. > > >> in somewhere in the proc file system? >> > > But then I don't know how you'd associate it with a particular mount > attempt. > > > You could do something truly awful like add a new code in the unused bits of the errno value returned from mount. It would confuse an unmodified userspace, but reporting "Unknown Error" isn't much less useful than "I/O Error". -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <4978F91C.4090208-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <4978F91C.4090208-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> @ 2009-01-23 18:09 ` J. Bruce Fields 2009-01-23 22:18 ` Greg Banks 0 siblings, 1 reply; 58+ messages in thread From: J. Bruce Fields @ 2009-01-23 18:09 UTC (permalink / raw) To: Greg Banks Cc: Steve Dickson, Linux NFS Mailing list, SystemTAP, Linux NFSv4 mailing list On Fri, Jan 23, 2009 at 09:54:20AM +1100, Greg Banks wrote: > J. Bruce Fields wrote: > > On Thu, Jan 22, 2009 at 10:59:49AM -0500, Steve Dickson wrote: > > > >> J. Bruce Fields wrote: > >> > >> > >> Interesting.... Store something like a reason code (similar to what they have > >> in he Kerberos) > >> > > > > Maybe. > > > > > >> in somewhere in the proc file system? > >> > > > > But then I don't know how you'd associate it with a particular mount > > attempt. > > > > > > > You could do something truly awful like add a new code in the unused > bits of the errno value returned from mount. It would confuse an > unmodified userspace, but reporting "Unknown Error" isn't much less > useful than "I/O Error". There must be cases where the existing error gives us some information, but not as much as we'd like, so the "Unknown Error" could be a step backwards for unmodified userspace. An extra mount option (normally hidden from the user) could be used to tell the kernel that the application doing the mount was capable of handling the new error codes. Ugh. --b. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-23 18:09 ` J. Bruce Fields @ 2009-01-23 22:18 ` Greg Banks 0 siblings, 0 replies; 58+ messages in thread From: Greg Banks @ 2009-01-23 22:18 UTC (permalink / raw) To: J. Bruce Fields Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP J. Bruce Fields wrote: > On Fri, Jan 23, 2009 at 09:54:20AM +1100, Greg Banks wrote: > >> J. Bruce Fields wrote: >> >>> On Thu, Jan 22, 2009 at 10:59:49AM -0500, Steve Dickson wrote: >>> >>> >>>> J. Bruce Fields wrote: >>>> >>>> >>>> Interesting.... Store something like a reason code (similar to what they have >>>> in he Kerberos) >>>> >>>> >>> Maybe. >>> >>> >>> >>>> in somewhere in the proc file system? >>>> >>>> >>> But then I don't know how you'd associate it with a particular mount >>> attempt. >>> >>> >>> >>> >> You could do something truly awful like add a new code in the unused >> bits of the errno value returned from mount. It would confuse an >> unmodified userspace, but reporting "Unknown Error" isn't much less >> useful than "I/O Error". >> > > There must be cases where the existing error gives us some information, > but not as much as we'd like, so the "Unknown Error" could be a step > backwards for unmodified userspace. An extra mount option (normally > hidden from the user) could be used to tell the kernel that the > application doing the mount was capable of handling the new error codes. > Ugh. > > > Ugh is right. On further thought, you could use the extra mount option to enable a behaviour where the kernel would format into the mount options buffer a string describing the reasons for the mount failure. If the option weren't present the kernel could emit that same message via printk(). -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 22:56 ` J. Bruce Fields 2009-01-21 23:05 ` Muntz, Daniel 2009-01-22 15:59 ` Steve Dickson @ 2009-01-23 18:17 ` Chuck Lever 2 siblings, 0 replies; 58+ messages in thread From: Chuck Lever @ 2009-01-23 18:17 UTC (permalink / raw) To: J. Bruce Fields Cc: Linux NFS Mailing list, SystemTAP, Linux NFSv4 mailing list, Greg Banks On Jan 21, 2009, at Jan 21, 2009, 5:56 PM, J. Bruce Fields wrote: > On Thu, Jan 22, 2009 at 09:36:53AM +1100, Greg Banks wrote: >> Chuck Lever wrote: >>> >>> >>> I think we need to visit this issue on a case-by-case basis. >>> Sometimes dprintk is appropriate. Sometimes printk(KERN_ERR). >>> Sometimes a performance metric. >> Well said. >> >>> Trond has always maintained that dprintk() is best for developers, >>> but >>> probably inappropriate for field debugging, >> It's not a perfect tool but it beats nothing at all. >>> and I think that may also >>> apply to trace points. >> It depends on whether distros can be convinced to enable it by >> default, >> and install by default any necessary userspace infrastructure. The >> most important thing for field debugging is Just Knowing that you >> have >> all the bits necessary to perform useful debugging without having to >> find some RPM that matches the kernel that the machine is actually >> running now, and not the one that was present when the machine was >> installed. > > On the mount case specifically: How far are we from the idea of a > mount > program that can identify most problems itself? I know its error > reporting has gotten better.... > I suppose the main feedback mount gets right now is an error code from > the mount system call, and that may be too narrow an interface to > cover > most problems. Is there some way we can give mount a real interface > it > can use to find out this stuff instead of just dumping more strings > into > the logs? 
A main reason it does this rather than generate error messages on the
terminal is that mount has to run in "background" environments.  Mounts
done at boot time do not have a controlling terminal.  A bg mount can
drop into the background, and thus it loses its controlling terminal.
Automounter doesn't have a controlling terminal to begin with.

My feeling is that, as mount is a system tool, it should report its
problems in the system log.  If there's a controlling terminal, report
it there too.  But by and large it is a tool that is run most often
without direct human intervention or monitoring.

In addition there are a lot of cases it can (and does) handle by itself.
Renegotiating mount option settings is one of these things.  It's a
narrow interface, but I'm not sure yet it's entirely inadequate.

> My main obstacle to judging a solution is that I don't have in mind a
> good list of (say) the top 10 problems that can cause the first mount
> to fail.  Hm:
>
> - dns lookup of the server fails
> - server isn't reachable
> - server isn't running nfs
> - requested path isn't known to server or isn't exported
> - export is there, but requires more security
> - user doesn't have gss credentials
> - file permissions on the export are wrong
> ...

- tcp_wrappers or iptables blocking access
- network routing problems
- v2/v3 server not running rpcbind or lockd

This is exactly why I want to start with some real world examples.
Without examples we are much more likely to design something that isn't
useful to anyone.  History (or e-mail archives) gives us a lot of
information about what might be common problems.

I think we handle some of these cases reasonably well today, though they
could probably stand some polish; others, like security configuration,
are still a little new and kind of a low priority (for good or bad
reasons) and so it is still a bit confusing.
But really, if mount can report a clear error message and suggest a course of corrective action, I don't think a dprintk or trace point or SystemTap will be of any greater help. -- Chuck Lever chuck[dot]lever[at]oracle[dot]com ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  2009-01-21 20:19 ` Chuck Lever
  2009-01-21 22:36   ` Greg Banks
@ 2009-01-22 13:55 ` Steve Dickson
  2009-01-22 22:31   ` Greg Banks
  1 sibling, 1 reply; 58+ messages in thread
From: Steve Dickson @ 2009-01-22 13:55 UTC (permalink / raw)
To: Chuck Lever; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP

Chuck Lever wrote:
> On Jan 21, 2009, at 2:37 PM, Steve Dickson wrote:
>> Chuck Lever wrote:
>>> Hey Steve-
>>>
>>> I'd like to see an example of a real mount problem or two that dprintk
>>> isn't adequate for, but a trace point could have helped.  In other
>>> words, can we get some use cases for dprintk and trace points for mount
>>> problems in specific?  I think that would help us understand the
>>> trade-offs a little better.
>> In the mount path that might be a bit difficult... but with trace
>> points you would be able to look at the entire super block or entire
>> server and client structures, something you can't do with static/canned
>> printks...
>
> I've never ever seen an NFS mount problem that required an admin to
> provide information from a superblock.  That seems like a lot of
> implementation detail that would be meaningless to admins and support
> desk folks.
True... but my point is that with trace points and systemtap scripts
one has access to BOTH highly technical data (for the developer)
and simple error codes (for the admins).... Unlike with printks...

> This is why I think we need to have some real world customer examples of
> mount problems (or read performance problems, or whatever) that we want
> to be able to diagnose in enterprise distributions.  I'm not saying this
> to throw up a road block... I think we really need to understand the
> problem before designing the solution, and so let's start with some
> practical examples.
I'm not sure this is an attainable goal....
I see it as: we put in a well-designed infrastructure (something I think
Trond is suggesting) and then let the consumers of the infrastructure
tell us what is needed... I believe there are enterprise people that
know *exactly* what they are looking for... ;-)

> Again, I'm not saying trace points are bad or wrong, just that they may
> not be appropriate for a particular code path and the type of problems
> that arise during specific NFS operations.  I'm not criticizing your
> particular sample code.  I'm asking "Before we add trace points
> everywhere, are trace points strategically the right debugging tool in
> every case?"
Good point... but the fact that trace points carry very little overhead
makes it kinda hard to see why they would not be the right tool... But
again I do see your point...

> Basically we have to know well in advance what kind of information will
> be needed at each trace point.  Who can predict?  If you have to solder
> in trace points in advance, in some ways that doesn't seem any more
> flexible than a dprintk.  What you've demonstrated is another good
> general tool for debugging, but you haven't convinced me that this is
> the right tool for, say, the mount path, or ACL support, and so on.
No worries.. I'll keep trying! ;-)

To your point, I know for a fact there are customers asking for trace
points in particular areas of the code (not the NFS code atm).  So,
again, I think we should take the "build it and they will come"
approach... Meaning, give people something to work with and they will
let us know what they need...

> I think we need to visit this issue on a case-by-case basis.  Sometimes
> dprintk is appropriate.  Sometimes printk(KERN_ERR).  Sometimes a
> performance metric.  Having specific troubleshooting in mind when we
> design this is critical, otherwise we are going to add a lot of kruft
> for no real benefit.
I can agree with this...

> That's an advantage of something like SystemTap.
> You can specify
> whatever is needed for a specific problem, and you don't need to
> recompile the kernel to do it.  Enterprise distributions can provide
> specific scripts for their code base, which doesn't change much.
> Upstream is free to make whatever drastic modifications to the code base
> without worrying about breaking a kernel-user space API.
>
> Trond has always maintained that dprintk() is best for developers, but
> probably inappropriate for field debugging, and I think that may also
> apply to trace points.  So I'm not against adding trace points where
> appropriate, but I'm doubtful that they will be helpful outside of
> kernel development; ie I wonder if they will specifically help customers
> of enterprise distributions.

Time will tell... I think once customers see how useful and powerful
trace points can be, they'll become addicted.... fairly quickly....

steved.

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-22 13:55 ` Steve Dickson @ 2009-01-22 22:31 ` Greg Banks 0 siblings, 0 replies; 58+ messages in thread From: Greg Banks @ 2009-01-22 22:31 UTC (permalink / raw) To: Steve Dickson; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Steve Dickson wrote: > > True... but my point is with trace points and systemtap scripts > one has access to BOTH highly technical data (for the developer) > and simple error codes (for the admins).... Unlike with printks... > Yes, there's a lot more power in trace points. -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-21 18:01 ` Chuck Lever 2009-01-21 19:29 ` Trond Myklebust 2009-01-21 19:37 ` Steve Dickson @ 2009-01-21 21:26 ` Greg Banks 2009-01-22 15:19 ` Steve Dickson 2 siblings, 1 reply; 58+ messages in thread From: Greg Banks @ 2009-01-21 21:26 UTC (permalink / raw) To: Chuck Lever; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Chuck Lever wrote: >>> Why can't we simply improve the information content of the dprintks? >>> >> The theory is trace point can be turned on, in production kernels, >> with >> little or no performance issues... >> > > mount isn't a performance path, Perhaps not on the client, but when you have >6000 clients mounting simultaneously then mount is most definitely a performance path on the server :-) > which is one reason I think trace > points might be overkill for this case. > I think both dprintks and trace points are the wrong approach for client-side mount problems. What you really want there is good and useful diagnostic information going unconditionally via printk(). Mount problems happen frequently enough, and are often not the client's fault but the server's or a firewall's, that system admins need to be able to work out what went wrong in retrospect by looking in syslog. But just because Steve chose an unfortunate example doesn't invalidate his point. There are plenty of gnarly logic paths in the NFS client and server which need better runtime diagnostics. On the server, anything involving an upcall to userspace . On the client, silly rename or attribute caching. > >> Not being an admin guy, I really don't have an answer for this... but >> I can say since trace point are not so much of a drag on the system as >> printks are.. with in timing issues using trace point would be a big >> advantage >> over printks >> > > Well that argument works both ways. 
Several times now I've seen problems where a significant part of the debugging process has involved noticing correlations between timing of dprintks and syslog messages from other subsystems, like IPoIB or TCP. That's harder to do if the debug statements and printks go through separate mechanisms to userspace. -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  2009-01-21 21:26 ` Greg Banks
@ 2009-01-22 15:19 ` Steve Dickson
  2009-01-23 18:28   ` Chuck Lever
  0 siblings, 1 reply; 58+ messages in thread
From: Steve Dickson @ 2009-01-22 15:19 UTC (permalink / raw)
To: Greg Banks; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP

Greg Banks wrote:
> I think both dprintks and trace points are the wrong approach for
> client-side mount problems.  What you really want there is good and
> useful diagnostic information going unconditionally via printk().  Mount
> problems happen frequently enough, and are often not the client's fault
> but the server's or a firewall's, that system admins need to be able to
> work out what went wrong in retrospect by looking in syslog.
>
> But just because Steve chose an unfortunate example doesn't invalidate
> his point.  There are plenty of gnarly logic paths in the NFS client and
> server which need better runtime diagnostics.  On the server, anything
> involving an upcall to userspace.  On the client, silly rename or
> attribute caching.
It appears I did pick an "unfortunate example"... since I was really
trying to introduce trace points to see how they could be used...
Maybe picking the I/O path would have been better...

>>> Not being an admin guy, I really don't have an answer for this... but
>>> I can say since trace point are not so much of a drag on the system as
>>> printks are.. with in timing issues using trace point would be a big
>>> advantage over printks
>>
> Well that argument works both ways.  Several times now I've seen
> problems where a significant part of the debugging process has involved
> noticing correlations between timing of dprintks and syslog messages
> from other subsystems, like IPoIB or TCP.  That's harder to do if the
> debug statements and printks go through separate mechanisms to userspace.
Yes... I have seen this a number of times and places... :-(

steved.
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-22 15:19 ` Steve Dickson @ 2009-01-23 18:28 ` Chuck Lever 2009-01-23 22:21 ` Greg Banks 0 siblings, 1 reply; 58+ messages in thread From: Chuck Lever @ 2009-01-23 18:28 UTC (permalink / raw) To: Steve Dickson Cc: Linux NFS Mailing list, SystemTAP, Linux NFSv4 mailing list, Greg Banks On Jan 22, 2009, at Jan 22, 2009, 10:19 AM, Steve Dickson wrote: > Greg Banks wrote: >> I think both dprintks and trace points are the wrong approach for >> client-side mount problems. What you really want there is good and >> useful diagnostic information going unconditionally via printk(). >> Mount >> problems happen frequently enough, and are often not the client's >> fault >> but the server's or a firewall's, that system admins need to be >> able to >> work out what went wrong in retrospect by looking in syslog. >> >> But just because Steve chose an unfortunate example doesn't >> invalidate >> his point. There are plenty of gnarly logic paths in the NFS >> client and >> server which need better runtime diagnostics. On the server, >> anything >> involving an upcall to userspace . On the client, silly rename or >> attribute caching. > It appears I did pick an "unfortunate example"... since I was really > trying to introduce trace points to see how they could be used... > Maybe picking the I/O path would have been better... Choosing mount was reasonable, as it's simple. The discussion we are having about what tool is right for the job would have probably been less interesting if you had stuck with the I/O path. The big picture though, is what do we need to do to make it easier to troubleshoot and solve problems. That is a much bigger question than how we report errors. -- Chuck Lever chuck[dot]lever[at]oracle[dot]com ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path 2009-01-23 18:28 ` Chuck Lever @ 2009-01-23 22:21 ` Greg Banks 0 siblings, 0 replies; 58+ messages in thread From: Greg Banks @ 2009-01-23 22:21 UTC (permalink / raw) To: Chuck Lever Cc: Steve Dickson, Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP Chuck Lever wrote: > > The big picture though, is what do we need to do to make it easier to > troubleshoot and solve problems. That is a much bigger question than > how we report errors. Indeed. -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <4970B451.4080201-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org> ` (2 preceding siblings ...) 2009-01-16 18:52 ` [RFC][PATCH 0/5] NFS: trace points added to mounting path Chuck Lever @ 2009-01-16 23:44 ` Greg Banks [not found] ` <49711BDF.3010605-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> 2009-01-18 16:40 ` Christoph Hellwig 4 siblings, 1 reply; 58+ messages in thread From: Greg Banks @ 2009-01-16 23:44 UTC (permalink / raw) To: Steve Dickson; +Cc: Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP Steve Dickson wrote: > So the ultimate goal would be to replace all the dprintks with trace points > but still be able to enable them through the rpcdebug command I have a patch which changes the definition of the dprintk() macro (but *not* dprintk() callsites) to allow enabling and disabling individual dprintk() statements through a /proc/ interface. Would you be interested in that? -- Greg Banks, P.Engineer, SGI Australian Software Group. the brightly coloured sporks of revolution. I don't speak for SGI. ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <49711BDF.3010605-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <49711BDF.3010605-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> @ 2009-01-17 16:15 ` Frank Ch. Eigler [not found] ` <4972A8F5.7070806@opengridcomputing.com> [not found] ` <y0mmydpucww.fsf-vo4H8ooykKW2oG+2xah3EoGKTjYczspe@public.gmane.org> 2009-01-19 14:27 ` Jeff Moyer 1 sibling, 2 replies; 58+ messages in thread From: Frank Ch. Eigler @ 2009-01-17 16:15 UTC (permalink / raw) To: Greg Banks Cc: Steve Dickson, Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf-XMD5yJDbdMReXY1tMh2IBg@public.gmane.org> writes: > I have a patch which changes the definition of the dprintk() macro > (but *not* dprintk() callsites) to allow enabling and disabling > individual dprintk() statements through a /proc/ interface. Would > you be interested in that? It would make more sense to me to turn dprintk's into trace_marks, then use http://lkml.org/lkml/2008/12/30/297 to control transmission of the data to ftrace. - FChE ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <4972A8F5.7070806@opengridcomputing.com>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  [not found] ` <4972A8F5.7070806@opengridcomputing.com>
@ 2009-01-18 17:47 ` Frank Ch. Eigler
  0 siblings, 0 replies; 58+ messages in thread
From: Frank Ch. Eigler @ 2009-01-18 17:47 UTC (permalink / raw)
To: Tom Tucker
Cc: Greg Banks, Steve Dickson, Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP

Hi -

On Sat, Jan 17, 2009 at 09:58:45PM -0600, Tom Tucker wrote:
> [...]
> >> I have a patch which changes the definition of the dprintk() macro
> >> (but *not* dprintk() callsites) to allow enabling and disabling
> >> individual dprintk() statements through a /proc/ interface.  Would
> >> you be interested in that?
> >
> > It would make more sense to me to turn dprintk's into trace_marks, then
> > use http://lkml.org/lkml/2008/12/30/297 to control transmission of the
> > data to ftrace.
> [...]
> That said, and sorry for my ignorance on trace markers, but:
>
> - Could you describe how we would define "classes" of trace markers.  I
> certainly don't want to have to turn on and off each call-site
> individually.  How would these classes be different than adding more
> bits to the current rpc_debug mechanism?

You're well prepared to reuse the classes you already have - see below.

> - From Steve's patches, it's not obvious to me how we would convert
> dprintk to trace markers without visiting every single call site.  Can
> the current macros be munged to use the trace marker interfaces without
> losing debug information?

The minimal possibility is to just do something like this:

#define dprintk(format...) trace_mark(nfs_message, format)

Or for the dfprintk that includes the facility symbol:

#define dfprintk(facility, format...) trace_mark(nfs_##facility, format)

The result would be to have one marker family for each {RPC,NFS,...}DBG_*
type - or any other "facility" symbol you invent on the spot.  All
members of each family can be enabled by attaching a marker handler to
the "nfs_FACILITY" name.
With Lai's markers->ftrace proposed patch, this would be done from user-space by something like: echo -n 'nfs_FACILITY' > /debugfs/tracing/tracing_markers (You can take away the nfs_ prefix if you like.) > - What is the overhead of an "inactive" trace marker in data size > and execution time relative to a dprintk? It should be similar. The global {nfs,rpc,...}_debug numbers could go away and just rely on the marker's built-in on/off control API. > - What is the overhead of an "active" trace marker in data size and > execution time relative to a dprintk? If the markers end up being channeled to the ftrace buffers, it should be significantly lighter-weight than sending them to printk. If some other marker consumer connects also or instead, it depends on what that does. - FChE ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <y0mmydpucww.fsf-vo4H8ooykKW2oG+2xah3EoGKTjYczspe@public.gmane.org>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path [not found] ` <y0mmydpucww.fsf-vo4H8ooykKW2oG+2xah3EoGKTjYczspe@public.gmane.org> @ 2009-01-18 23:12 ` Greg Banks [not found] ` <4973B777.2000102-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> 0 siblings, 1 reply; 58+ messages in thread From: Greg Banks @ 2009-01-18 23:12 UTC (permalink / raw) To: Frank Ch. Eigler Cc: Steve Dickson, NFS list, SystemTAP, Linux NFSv4 mailing list Frank Ch. Eigler wrote: > Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf-XMD5yJDbdMReXY1tMh2IBg@public.gmane.org> writes: > > >> I have a patch which changes the definition of the dprintk() macro >> (but *not* dprintk() callsites) to allow enabling and disabling >> individual dprintk() statements through a /proc/ interface. Would >> you be interested in that? >> > > It would make more sense to me to turn dprintk's into trace_marks, Umm, ok. Sorry to be so ignorant but where would I find the doc that tells me about adding trace marks ? > then > use http://lkml.org/lkml/2008/12/30/297 to control transmission of the > data to ftrace. > > The control interface seems a little primitive. It seems like you can only activate and deactivate single printks ? I don't see a way to e.g. activate every trace make in a particular function, or in a particular .c file. I thought both of these were useful things to do, so I implemented them. Below is an extract from the doc that accompanies the sgi-dprintk module. The dprintk module has even more useful features: * Simple query language allows turning on and off dprintks by matching any combination of: - source filename - function name - line number (including ranges of line numbers) - module name - format string * Provides a /proc/dprintk which can be read to display the complete list of all dprintk()s known, to help guide you * The module is optional. The NFS dprintk()s still work with the /proc/sys/sunrpc/ bitmasks. The dprintk module can be loaded or unloaded at any time. 
* In addition to enabling the print, two other behaviours can be enabled: - printing a kernel stack trace - crashing the kernel, so that a dump can be taken ... Viewing dprintk() Behaviour =========================== You can view the currently configured behaviour of all the dprintk()s in loaded modules by reading /proc/dprintk. For example: nullarbor:~ # cat /proc/dprintk # filename:lineno [module]function flags format /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:323 [svcxprt_rdma]svc_rdma_cleanup - "SVCRDMA\040Module\040Removed,\040deregister\040RPC\040RDMA\040transport\012" /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:341 [svcxprt_rdma]svc_rdma_init - "\011max_inline\040\040\040\040\040\040\040:\040%d\012" /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:340 [svcxprt_rdma]svc_rdma_init - "\011sq_depth\040\040\040\040\040\040\040\040\040:\040%d\012" /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:338 [svcxprt_rdma]svc_rdma_init - "\011max_requests\040\040\040\040\040:\040%d\012" ... Command Language Reference ========================== At the lexical level, a command comprises a sequence of words separated by whitespace characters. Note that newlines are treated as word separators and do *not* end a command or allow multiple commands to be done together. So these are all equivalent: nullarbor:~ # echo -c 'file svcsock.c line 1603 +p' > /proc/dprintk nullarbor:~ # echo -c ' file svcsock.c line 1603 +p ' > /proc/dprintk nullarbor:~ # echo -c 'file svcsock.c\nline 1603 +p' > /proc/dprintk nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' > /proc/dprintk Commands are bounded by a write() system call. 
If you want to do multiple commands you need to do a separate "echo" for each, like: nullarbor:~ # echo 'file svcsock.c line 1603 +p' > /proc/dprintk ;\ > echo 'file svcsock.c line 1563 +p' > /proc/dprintk or even like: nullarbor:~ # ( > echo 'file svcsock.c line 1603 +p' ;\ > echo 'file svcsock.c line 1563 +p' ;\ > ) > /proc/dprintk At the syntactical level, a command comprises a sequence of match specifications, followed by a flags change specification. command ::= match-spec* flags-spec The match-spec's are used to choose a subset of the known dprintk() callsites to which to apply the flags-spec. Think of them as a query with implicit ANDs between each pair. Note that an empty list of match-specs is possible, but is not very useful because it will not match any dprintk() callsites. A match specification comprises a keyword, which controls the attribute of the callsite to be compared, and a value to compare against. Possible keywords are: match-spec ::= 'func' string | 'file' string | 'module' string | 'format' string | 'line' line-range line-range ::= lineno | '-'lineno | lineno'-' | lineno'-'lineno // Note: line-range cannot contain space, e.g. // "1-30" is valid range but "1 - 30" is not. lineno ::= unsigned-int The meanings of each keyword are: func The given string is compared against the function name of each callsite. Example: func svc_tcp_accept file The given string is compared against either the full pathname or the basename of the source file of each callsite. Examples: file svcsock.c file /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svcsock.c module The given string is compared against the module name of each callsite. The module name is the string as seen in "lsmod", i.e. without the directory or the .ko suffix and with '-' changed to '_'. Examples: module sunrpc module nfsd format The given string is searched for in the dprintk() format string. Note that the string does not need to match the entire format, only some part. 
    Whitespace and other special characters can be escaped using C
    octal character escape \ooo notation, e.g. the space character is
    \040.  Examples:

    format svcrdma:         // many of the NFS/RDMA server dprintks
    format readahead        // some dprintks in the readahead cache
    format nfsd:\040SETATTR // how to match a format with whitespace

line
    The given line number or range of line numbers is compared against
    the line number of each dprintk() callsite.  A single line number
    matches the callsite line number exactly.  A range of line numbers
    matches any callsite between the first and last line number
    inclusive.  An empty first number means the first line in the
    file, an empty last number means the last line in the file.
    Examples:

    line 1603       // exactly line 1603
    line 1600-1605  // the six lines from line 1600 to line 1605
    line -1605      // the 1605 lines from line 1 to line 1605
    line 1600-      // all lines from line 1600 to the end of the file

The flags specification comprises a change operation followed by one
or more flag characters.  The change operation is one of the
characters:

-   remove the given flags
+   add the given flags
=   set the flags to the given flags

The flags are:

p   Causes a printk() message to be emitted to dmesg, i.e. the obvious
    semantic of a dprintk().

s   Causes a kernel stack trace to be emitted to dmesg.  The printk()
    is emitted first, even if the 'p' flag is not specified.

c   Causes the kernel to panic using the kernel BUG() macro.  This
    will cause the machine to drop into KDB or take a kernel crash
    dump, according to how the machine has been configured.  The
    printk() is emitted first, even if the 'p' flag is not specified.

Note the regexp ^[-+=][scp]+$ matches a flags specification.  Note
also that there is no convenient syntax to remove all the flags at
once; you need to use "-psc".
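The three change operations reduce to simple bit arithmetic over a per-callsite flags value.  The following userspace sketch illustrates the semantics; the enum names and representation are assumptions for illustration, not taken from the sgi-dprintk patch itself:

```c
#include <assert.h>

/* Hypothetical per-callsite flag bits, mirroring the 'p', 's' and 'c'
 * flag characters described above. */
enum {
    DPRINTK_FLAG_PRINT = 1 << 0,  /* 'p': emit the printk()        */
    DPRINTK_FLAG_STACK = 1 << 1,  /* 's': emit a stack trace       */
    DPRINTK_FLAG_CRASH = 1 << 2,  /* 'c': BUG() after printing     */
};

/* Apply a flags-spec (the '-', '+' or '=' change operation plus the
 * given flag bits) to the current flags of one matched callsite. */
static unsigned apply_flags_spec(unsigned cur, char op, unsigned mask)
{
    switch (op) {
    case '+': return cur | mask;   /* add the given flags           */
    case '-': return cur & ~mask;  /* remove the given flags        */
    case '=': return mask;         /* set the flags to exactly mask */
    default:  return cur;          /* unknown op: leave unchanged   */
    }
}
```

Under this model "-psc" is `apply_flags_spec(cur, '-', PRINT|STACK|CRASH)`, which explains why there is no shorter syntax for clearing everything.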
Examples
========

// enable the message at line 1603 of file svcsock.c
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' > /proc/dprintk

// enable all the messages in file svcsock.c
nullarbor:~ # echo -n 'file svcsock.c +p' > /proc/dprintk

// enable all the messages in the NFS server module
nullarbor:~ # echo -n 'module nfsd +p' > /proc/dprintk

// enable all 12 messages in the function svc_process()
nullarbor:~ # echo -n 'func svc_process +p' > /proc/dprintk

// disable all 12 messages in the function svc_process()
nullarbor:~ # echo -n 'func svc_process -p' > /proc/dprintk

// print a stack trace on every upcall to rpc.mountd or rpc.idmapd
nullarbor:~ # echo -n 'format Want\040update,\040refage +s' > /proc/dprintk

// cause a kernel crash dump when an RPC call to an
// unknown RPC program number is received
nullarbor:~ # echo -n 'format unknown\040program +c' > /proc/dprintk

--
Greg Banks, P.Engineer, SGI Australian Software Group.
the brightly coloured sporks of revolution.
I don't speak for SGI.

^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <4973B777.2000102-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  [not found] ` <4973B777.2000102-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
@ 2009-01-19 15:41 ` Frank Ch. Eigler
  2009-01-19 23:13 ` Greg Banks
  0 siblings, 1 reply; 58+ messages in thread
From: Frank Ch. Eigler @ 2009-01-19 15:41 UTC (permalink / raw)
To: Greg Banks; +Cc: Steve Dickson, NFS list, SystemTAP, Linux NFSv4 mailing list

Hi -

On Mon, Jan 19, 2009 at 10:12:55AM +1100, Greg Banks wrote:
> [...]
> > It would make more sense to me to turn dprintk's into trace_marks,
> Umm, ok.  Sorry to be so ignorant but where would I find the doc that
> tells me about adding trace marks ?

Documentation/markers.txt

> The control interface seems a little primitive.  It seems like you
> can only activate and deactivate single printks ?

Or single classes (identical names) per activate/deactivate call.

> I don't see a way to e.g. activate every trace mark in a particular
> function, or in a particular .c file.  I thought both of these were
> useful things to do, so I implemented them.  Below is an extract
> from the doc that accompanies the sgi-dprintk module. [...]

Very clever!  I am not an insider enough to carry much weight in this
regard, but it would make most sense to me if the sgi-dprintk widget
were adapted so that:

- it used trace_mark() as the low-level hook mechanism at the call
  sites
- it extended the trace_mark structures to track
  __FILE__/__LINE__/__FUNCTION__ values
- it extended the marker activation API to be able to specify queries
  based upon those __FILE__/etc. fields
- as an extension to the proposed marker->ftrace patch, extend the
  option parser the same way as your /proc/dprintk does

The benefits might not be huge, but:

- it would reuse existing functionality (markers for the hooks, other
  ftrace widgets for extra info to gather at each event - like the
  stack traces)
- it could generalize the handling of these dprintks, by not just
  being tied to message printing, but becoming of possible use for
  statistics/health monitoring

- FChE
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  2009-01-19 15:41 ` Frank Ch. Eigler
@ 2009-01-19 23:13 ` Greg Banks
  0 siblings, 0 replies; 58+ messages in thread
From: Greg Banks @ 2009-01-19 23:13 UTC (permalink / raw)
To: Frank Ch. Eigler; +Cc: NFS list, Linux NFSv4 mailing list, SystemTAP

Frank Ch. Eigler wrote:
> Hi -
>
> On Mon, Jan 19, 2009 at 10:12:55AM +1100, Greg Banks wrote:
>
>> [...]
>>
>>> It would make more sense to me to turn dprintk's into trace_marks,
>>>
>> Umm, ok.  Sorry to be so ignorant but where would I find the doc that
>> tells me about adding trace marks ?
>>
>
> Documentation/markers.txt
>
Thanks.

>> The control interface seems a little primitive.  It seems like you
>> can only activate and deactivate single printks ?
>>
>
> Or single classes (identical names) per activate/deactivate call.
>
The problem with this "classes" approach -- which sounds to my ignorant
ears like the "facilities" feature of nfs/sunrpc dprintks -- is
predicting which combination of classes to define so that they're
useful in the field years later with unknown bugs and unknown load
patterns.  I found that querying by function name was more useful in
practice.

> [...]
> - it could generalize the handling of these dprintks, by not just
>   being tied to message printing, but becoming of possible use for
>   statistics/health monitoring
>
That sounds potentially useful.  I was considering extending my dprintk
patch to add a flag to optionally bump a counter when the dprintk()
line is executed, but doing it via trace markers could be more useful
if less performant.  Let me do some reading on trace markers and get
back to this.

--
Greg Banks, P.Engineer, SGI Australian Software Group.
the brightly coloured sporks of revolution.
I don't speak for SGI.
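Greg's counter idea above — an extra flag that bumps a per-callsite hit counter whenever the dprintk() line executes — could be modelled like this.  This is purely an illustrative userspace sketch; the struct, the 'n' counting flag and its bit position are all hypothetical:

```c
#include <assert.h>

#define DPRINTK_FLAG_PRINT (1u << 0)  /* the existing 'p' bit       */
#define DPRINTK_FLAG_COUNT (1u << 3)  /* assumed bit for an 'n'     */
                                      /* counting flag (made up)    */

/* Hypothetical per-callsite record extended with a hit counter. */
struct dprintk_site {
    unsigned flags;        /* p/s/c bits plus the counting bit */
    unsigned long hits;    /* times this callsite has fired    */
};

/* Would be called at each dprintk() callsite; bumps the counter when
 * counting is enabled, and returns nonzero if the message should
 * actually be printed. */
static int dprintk_hit(struct dprintk_site *site)
{
    if (site->flags & DPRINTK_FLAG_COUNT)
        site->hits++;
    return (site->flags & DPRINTK_FLAG_PRINT) != 0;
}
```

The point of the design is that counting and printing are independent: a site can be counted for statistics/health monitoring without flooding dmesg.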
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  [not found] ` <49711BDF.3010605-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
  2009-01-17 16:15 ` Frank Ch. Eigler
@ 2009-01-19 14:27 ` Jeff Moyer
  [not found] ` <x49ab9ntlpp.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
  1 sibling, 1 reply; 58+ messages in thread
From: Jeff Moyer @ 2009-01-19 14:27 UTC (permalink / raw)
To: Greg Banks
Cc: Steve Dickson, Linux NFSv4 mailing list, Linux NFS Mailing list,
    SystemTAP

Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> writes:

> Steve Dickson wrote:
>> So the ultimate goal would be to replace all the dprintks with trace
>> points but still be able to enable them through the rpcdebug command
> I have a patch which changes the definition of the dprintk() macro (but
> *not* dprintk() callsites) to allow enabling and disabling individual
> dprintk() statements through a /proc/ interface.  Would you be
> interested in that?

That sounds like duplicated work.  How does it differ from Jason
Baron's dynamic printk patches (which I believe are now upstream)?

Cheers,
Jeff
[parent not found: <x49ab9ntlpp.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  [not found] ` <x49ab9ntlpp.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
@ 2009-01-19 19:49 ` Jason Baron
  2009-01-19 22:58 ` Greg Banks
  2009-01-21 10:13 ` K.Prasad
  1 sibling, 1 reply; 58+ messages in thread
From: Jason Baron @ 2009-01-19 19:49 UTC (permalink / raw)
To: Jeff Moyer
Cc: Greg Banks, Steve Dickson, Linux NFSv4 mailing list,
    Linux NFS Mailing list, SystemTAP

On Mon, Jan 19, 2009 at 09:27:30AM -0500, Jeff Moyer wrote:
> Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> writes:
>
>> Steve Dickson wrote:
>>> So the ultimate goal would be to replace all the dprintks with trace
>>> points but still be able to enable them through the rpcdebug command
>> I have a patch which changes the definition of the dprintk() macro (but
>> *not* dprintk() callsites) to allow enabling and disabling individual
>> dprintk() statements through a /proc/ interface.  Would you be
>> interested in that?
>
> That sounds like duplicated work.  How does it differ from Jason Baron's
> dynamic printk patches (which I believe are now upstream)?
>
> Cheers,
> Jeff

Indeed.  I've implemented a solution in a very similar problem space
which is now upstream, see:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=346e15beb5343c2eb8216d820f2ed8f150822b08;hp=33376c1c043c05077b4ac79c33804266f6c45e49

One of the core fundamental differences that I see is that 'dprintk'
checks a global variable per dprintk line.  Whereas, 'dynamic printk'
assigns each module a set of bits in a *single* global variable.  The
idea was that if you have thousands of these debug lines throughout
the kernel, I wanted to have a small footprint.

The per-dprintk granularity could be implemented on top of the
per-module approach that I've taken.  That is, each dprintk statement
could activate the module that it's associated with when it's
activated.  Then, a further per-line variable could be checked.

We should probably move this discussion to lkml, since this probably
should involve a wider audience.  Perhaps you can re-post your
patchset there?

thanks,

-Jason
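Jason's two-level scheme — one cheap per-module bit in a single global variable, with an optional per-line variable layered on top — can be sketched in userspace C as follows.  All names here are hypothetical models for the sake of the discussion, not taken from the upstream dynamic printk patch:

```c
#include <assert.h>

/* One global bitmask; each module owns one bit (the 'dynamic printk'
 * layout described above).  Userspace model only. */
static unsigned long dynamic_printk_enabled;

struct dbg_line {
    unsigned module_bit;   /* which bit of the global mask to test */
    int line_enabled;      /* finer-grained per-line switch        */
};

/* Two-level test: the cheap single-global-variable check first, and
 * only then the per-line variable, as suggested in the thread. */
static int dbg_line_active(const struct dbg_line *l)
{
    if (!(dynamic_printk_enabled & (1ul << l->module_bit)))
        return 0;
    return l->line_enabled;
}
```

The footprint argument is visible here: the common disabled case costs a single test of one global word, regardless of how many debug lines the kernel contains.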
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  2009-01-19 19:49 ` Jason Baron
@ 2009-01-19 22:58 ` Greg Banks
  0 siblings, 0 replies; 58+ messages in thread
From: Greg Banks @ 2009-01-19 22:58 UTC (permalink / raw)
To: Jason Baron
Cc: Jeff Moyer, Steve Dickson, Linux NFSv4 mailing list,
    Linux NFS Mailing list, SystemTAP

Jason Baron wrote:
> On Mon, Jan 19, 2009 at 09:27:30AM -0500, Jeff Moyer wrote:
>
>> Greg Banks <gnb@melbourne.sgi.com> writes:
>>
>>> Steve Dickson wrote:
>>>
>>>> So the ultimate goal would be to replace all the dprintks with trace
>>>> points but still be able to enable them through the rpcdebug command
>>>>
>>> I have a patch which changes the definition of the dprintk() macro (but
>>> *not* dprintk() callsites) to allow enabling and disabling individual
>>> dprintk() statements through a /proc/ interface.  Would you be
>>> interested in that?
>>>
>> That sounds like duplicated work.  How does it differ from Jason Baron's
>> dynamic printk patches (which I believe are now upstream)?
>>
>> Cheers,
>> Jeff
>>
>
> indeed.  I've implemented a solution in a very similar problem space
> which is now upstream, see:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=346e15beb5343c2eb8216d820f2ed8f150822b08;hp=33376c1c043c05077b4ac79c33804266f6c45e49
>
Meh, I've been spending too much time in insular SLES10 land.

> One of the core fundamental differences that I see is that 'dprintk'
> checks a global variable per dprintk line.

Yes, this is a key design feature.  The problem I was addressing was
debugging NFS/RDMA.  That transport had at the time no way to do any
kind of network capture, and dprintks were the *only* debugging
mechanism.  So the code is absolutely riddled with dprintks, some
enormous and some in key performance paths.

This means that enabling dprintks on a per-module basis would
a) overflow the dprintk buffer in a few milliseconds and b) perturb
the time behaviour of the code sufficiently to mask the problem you
were trying to diagnose.  Later it became apparent that this would
also be very useful for field support folks.

> Whereas, 'dynamic printk' assigns each module a set of bits in a
> *single* global variable.

This seems to be more or less equivalent to the mechanism that
nfs/sunrpc use today, i.e. quite coarse grained.

> The idea was that if you have thousands of these debug lines
> throughout the kernel, I wanted to have a small footprint.
>
Indeed.  For me, debuggability and supportability are far more
important.

> The per-dprintk granularity could be implemented on top of the
> per-module approach that i've taken.  That is, each dprintk statement
> could activate the module that its associated with when its activated.
> Then, a further per-line variable could be checked.
>
Yes.  Or you could make the per-line variables the only state kept and
do the equivalent of

echo "module sunrpc +p" > /proc/dprintk

when the sysadmin does

echo "set enabled=1 <module_name>" > dynamic_printk/modules

i.e. run a query over the dprintk records and mark all the ones that
match the module.

> We should probably move this discussion to lkml, since this probably
> should involve a wider audience.  Perhaps, you can re-post your
> patchset there?
>
Ok, will do.

--
Greg Banks, P.Engineer, SGI Australian Software Group.
the brightly coloured sporks of revolution.
I don't speak for SGI.
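Greg's alternative — keeping the per-line flags as the only state and implementing module-level enablement as a query over the callsite records — might look like this in outline.  This is a userspace sketch; the record layout and function names are assumptions, not code from either patch:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical table of dprintk() callsite records.  Per-line flags
 * are the only state; a command like "module sunrpc +p" is applied by
 * querying the table and marking every matching record. */
struct dprintk_record {
    const char *module;
    const char *function;
    unsigned flags;        /* 'p' is bit 0 in this sketch */
};

/* Apply "+p" to every record whose module matches; returns how many
 * callsites actually changed state. */
static int enable_module(struct dprintk_record *tab, int n,
                         const char *module)
{
    int i, changed = 0;

    for (i = 0; i < n; i++) {
        if (strcmp(tab[i].module, module) == 0 &&
            !(tab[i].flags & 1u)) {
            tab[i].flags |= 1u;
            changed++;
        }
    }
    return changed;
}
```

The same loop generalizes to the other match keywords (func, file, line range) simply by changing the predicate, which is why the query approach subsumes the coarser per-module switch.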
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  [not found] ` <x49ab9ntlpp.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
  2009-01-19 19:49 ` Jason Baron
@ 2009-01-21 10:13 ` K.Prasad
  2009-01-21 16:39 ` Steve Dickson
  1 sibling, 1 reply; 58+ messages in thread
From: K.Prasad @ 2009-01-21 10:13 UTC (permalink / raw)
To: Jeff Moyer, Steve Dickson
Cc: Greg Banks, Linux NFSv4 mailing list, Linux NFS Mailing list,
    SystemTAP, Christoph Hellwig, Maneesh Soni

On Mon, Jan 19, 2009 at 09:27:30AM -0500, Jeff Moyer wrote:
> Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> writes:
>
>> Steve Dickson wrote:
>>> So the ultimate goal would be to replace all the dprintks with trace
>>> points but still be able to enable them through the rpcdebug command
>> I have a patch which changes the definition of the dprintk() macro (but
>> *not* dprintk() callsites) to allow enabling and disabling individual
>> dprintk() statements through a /proc/ interface.  Would you be
>> interested in that?
>
> That sounds like duplicated work.  How does it differ from Jason Baron's
> dynamic printk patches (which I believe are now upstream)?
>
> Cheers,
> Jeff

Introducing/converting to one of the accepted methods of static
instrumentation - like tracepoints - would help more users (whether
in-kernel or otherwise) harness them.

Steve,
Would it help to convert the systemtap script (nfs_mount.stp) in
patch 5 into a kernel module (perhaps in the samples/ directory) to
bring in an in-kernel user of these tracepoints?

Thanks,
K.Prasad
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  2009-01-21 10:13 ` K.Prasad
@ 2009-01-21 16:39 ` Steve Dickson
  [not found] ` <49774FDC.5090307-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 58+ messages in thread
From: Steve Dickson @ 2009-01-21 16:39 UTC (permalink / raw)
To: prasad; +Cc: Linux NFS Mailing list, Linux NFSv4 mailing list, SystemTAP

K.Prasad wrote:
> On Mon, Jan 19, 2009 at 09:27:30AM -0500, Jeff Moyer wrote:
>> Greg Banks <gnb@melbourne.sgi.com> writes:
>>
>>> Steve Dickson wrote:
>>>> So the ultimate goal would be to replace all the dprintks with trace
>>>> points but still be able to enable them through the rpcdebug command
>>> I have a patch which changes the definition of the dprintk() macro (but
>>> *not* dprintk() callsites) to allow enabling and disabling individual
>>> dprintk() statements through a /proc/ interface.  Would you be
>>> interested in that?
>> That sounds like duplicated work.  How does it differ from Jason Baron's
>> dynamic printk patches (which I believe are now upstream)?
>>
>> Cheers,
>> Jeff
>
> Introducing/converting to one of the accepted methods of static
> instrumentation - like tracepoints - would help more users (whether
> in-kernel or otherwise) harness them.
>
> Steve,
> Would it help to convert the systemtap script (nfs_mount.stp) in
> patch 5 into a kernel module (perhaps in the samples/ directory) to
> bring in an in-kernel user of these tracepoints?

Well, nfs_mount.stp was just an example of how to pull the information
from the kernel... I just wanted to complete the loop... but as
Christoph pointed out it probably shouldn't have been included in the
posting.

I'm not sure moving the nfs_mount.stp script into the kernel would
make sense.  One of the advantages of trace points and systemtap
scripts (depending on what is passed up) is that they allow users to
define exactly what they need to see.

For example, a kernel guy might be interested in a particular bit in a
flag field which would be meaningless to an IT guy.  On the other
hand, the IT guy would be interested in the error code.  One trace
point could supply all the information, but different systemtap
scripts would be needed to show the desired information.

My point being, if we move things down into the kernel, I think we
would lose this type of flexibility...

steved.
[parent not found: <49774FDC.5090307-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  [not found] ` <49774FDC.5090307-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
@ 2009-01-21 17:04 ` Arnaldo Carvalho de Melo
  2009-01-21 19:59 ` Steve Dickson
  [not found] ` <20090121170401.GD4394-f8uhVLnGfZaxAyOMLChx1axOck334EZe@public.gmane.org>
  0 siblings, 2 replies; 58+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-01-21 17:04 UTC (permalink / raw)
To: Steve Dickson
Cc: prasad, Linux NFSv4 mailing list, Linux NFS Mailing list,
    SystemTAP

Em Wed, Jan 21, 2009 at 11:39:56AM -0500, Steve Dickson escreveu:
> K.Prasad wrote:
>> On Mon, Jan 19, 2009 at 09:27:30AM -0500, Jeff Moyer wrote:
>>> Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org> writes:
>>>
>>>> Steve Dickson wrote:
>>>>> So the ultimate goal would be to replace all the dprintks with trace
>>>>> points but still be able to enable them through the rpcdebug command
>>>> I have a patch which changes the definition of the dprintk() macro (but
>>>> *not* dprintk() callsites) to allow enabling and disabling individual
>>>> dprintk() statements through a /proc/ interface.  Would you be
>>>> interested in that?
>>> That sounds like duplicated work.  How does it differ from Jason Baron's
>>> dynamic printk patches (which I believe are now upstream)?
>>>
>>> Cheers,
>>> Jeff
>>
>> Introducing/converting one of the accepted methods of static
>> instrumentation - like tracepoints would help more users (whether
>> in-kernel or otherwise) harness them.
>>
>> Steve,
>> Would it help convert the systemtap script (nfs_mount.stp) in
>> Patch - 5 into a kernel module (perhaps in samples/ directory) to bring
>> an in-kernel user of these tracepoints?
>
> Well nfs_mount.stp was just an example of how to pull the information
> from the kernel.. I just wanted to complete the loop... but as
> Christoph pointed out it probably shouldn't have been included in the
> posting.
>
> I'm not sure moving the nfs_mount.stp script into the kernel
> would make sense.  One of the advantages of trace points and systemtap
> scripts (depending on what is passed up) is that they allow users to
> define exactly what they need to see.
>
> For example, a kernel guy might be interested in a particular bit in a
> flag field which would be meaningless to an IT guy.  On the other hand,
> the IT guy would be interested in the error code.  One trace point could
> supply all the information, but different systemtap scripts would be
> needed to show the desired information.
>
> My point being, if we move things down into the kernel, I think we
> would lose this type of flexibility...

I suggest you provide an ftrace plugin, just like I'm doing with
blktrace, see:

http://lkml.org/lkml/2009/1/20/190

- Arnaldo
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  2009-01-21 17:04 ` Arnaldo Carvalho de Melo
@ 2009-01-21 19:59 ` Steve Dickson
  [not found] ` <20090121170401.GD4394-f8uhVLnGfZaxAyOMLChx1axOck334EZe@public.gmane.org>
  1 sibling, 0 replies; 58+ messages in thread
From: Steve Dickson @ 2009-01-21 19:59 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Linux NFSv4 mailing list, Linux NFS Mailing list, prasad,
    SystemTAP

Arnaldo Carvalho de Melo wrote:
> Em Wed, Jan 21, 2009 at 11:39:56AM -0500, Steve Dickson escreveu:
>> K.Prasad wrote:
>>> On Mon, Jan 19, 2009 at 09:27:30AM -0500, Jeff Moyer wrote:
>>>> Greg Banks <gnb@melbourne.sgi.com> writes:
>>>>
>>>>> Steve Dickson wrote:
>>>>>> So the ultimate goal would be to replace all the dprintks with trace
>>>>>> points but still be able to enable them through the rpcdebug command
>>>>> I have a patch which changes the definition of the dprintk() macro (but
>>>>> *not* dprintk() callsites) to allow enabling and disabling individual
>>>>> dprintk() statements through a /proc/ interface.  Would you be
>>>>> interested in that?
>>>> That sounds like duplicated work.  How does it differ from Jason Baron's
>>>> dynamic printk patches (which I believe are now upstream)?
>>>>
>>>> Cheers,
>>>> Jeff
>>> Introducing/converting one of the accepted methods of static
>>> instrumentation - like tracepoints would help more users (whether
>>> in-kernel or otherwise) harness them.
>>>
>>> Steve,
>>> Would it help convert the systemtap script (nfs_mount.stp) in
>>> Patch - 5 into a kernel module (perhaps in samples/ directory) to bring
>>> an in-kernel user of these tracepoints?
>> Well nfs_mount.stp was just an example of how to pull the information
>> from the kernel.. I just wanted to complete the loop... but as
>> Christoph pointed out it probably shouldn't have been included in the
>> posting.
>>
>> I'm not sure moving the nfs_mount.stp script into the kernel
>> would make sense.  One of the advantages of trace points and systemtap
>> scripts (depending on what is passed up) is that they allow users to
>> define exactly what they need to see.
>>
>> For example, a kernel guy might be interested in a particular bit in a
>> flag field which would be meaningless to an IT guy.  On the other hand,
>> the IT guy would be interested in the error code.  One trace point could
>> supply all the information, but different systemtap scripts would be
>> needed to show the desired information.
>>
>> My point being, if we move things down into the kernel, I think we
>> would lose this type of flexibility...
>
> I suggest you provide an ftrace plugin, just like I'm doing with
> blktrace, see:
>
> http://lkml.org/lkml/2009/1/20/190

I agree it's something I need to look into...

steved.
[parent not found: <20090121170401.GD4394-f8uhVLnGfZaxAyOMLChx1axOck334EZe@public.gmane.org>]
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  [not found] ` <20090121170401.GD4394-f8uhVLnGfZaxAyOMLChx1axOck334EZe@public.gmane.org>
@ 2009-01-21 20:39 ` Christoph Hellwig
  0 siblings, 0 replies; 58+ messages in thread
From: Christoph Hellwig @ 2009-01-21 20:39 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Steve Dickson, prasad, Linux NFSv4 mailing list,
    Linux NFS Mailing list, SystemTAP

On Wed, Jan 21, 2009 at 03:04:01PM -0200, Arnaldo Carvalho de Melo wrote:
> I suggest you provide an ftrace plugin, just like I'm doing with
> blktrace, see:

*nod*  ftrace for now; long-term we also want some LTT integration
(hopefully we'll have some form of a generic shim so that one data
source can provide trace information for both ftrace and LTT)
* Re: [RFC][PATCH 0/5] NFS: trace points added to mounting path
  [not found] ` <4970B451.4080201-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
  ` (3 preceding siblings ...)
  2009-01-16 23:44 ` Greg Banks
@ 2009-01-18 16:40 ` Christoph Hellwig
  4 siblings, 0 replies; 58+ messages in thread
From: Christoph Hellwig @ 2009-01-18 16:40 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFSv4 mailing list, Linux NFS Mailing list, SystemTAP

On Fri, Jan 16, 2009 at 11:22:41AM -0500, Steve Dickson wrote:
> Comments... Acceptance??

Please add support to actually make use of these without a big pile of
out-of-tree crap.
end of thread, other threads:[~2009-01-23 22:22 UTC | newest]
Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-01-16 16:22 [RFC][PATCH 0/5] NFS: trace points added to mounting path Steve Dickson
2009-01-16 16:30 ` [PATCH 3/5] NFS: Adding trace points to nfs/client.c Steve Dickson
2009-01-16 16:32 ` [PATCH 4/5] NFS: Convert trace points to trace markers Steve Dickson
2009-01-16 16:33 ` [PATCH 5/5] NFS: Systemtap script Steve Dickson
[not found] ` <4970B451.4080201-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
2009-01-16 16:25 ` [PATCH 1/5] NFS: Adding trace points to fs/nfs/getroot.c Steve Dickson
2009-01-16 16:28 ` [PATCH 2/5] NFS: Adding trace points to fs/nfs/super.c Steve Dickson
2009-01-16 18:52 ` [RFC][PATCH 0/5] NFS: trace points added to mounting path Chuck Lever
2009-01-21 17:13 ` Steve Dickson
[not found] ` <497757D1.7090908-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
2009-01-21 18:01 ` Chuck Lever
2009-01-21 19:29 ` Trond Myklebust
2009-01-21 19:58 ` Steve Dickson
2009-01-21 20:23 ` Trond Myklebust
2009-01-22 13:07 ` Steve Dickson
[not found] ` <49786F9F.7030400-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
2009-01-22 15:30 ` Trond Myklebust
2009-01-22 15:49 ` Steve Dickson
2009-01-22 17:47 ` Arnaldo Carvalho de Melo
2009-01-21 19:37 ` Steve Dickson
2009-01-21 20:19 ` Chuck Lever
2009-01-21 22:36 ` Greg Banks
[not found] ` <4977A385.8000406-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2009-01-21 22:47 ` Arnaldo Carvalho de Melo
2009-01-21 22:57 ` Trond Myklebust
2009-01-21 23:06 ` Arnaldo Carvalho de Melo
2009-01-21 22:56 ` Trond Myklebust
2009-01-21 23:11 ` Greg Banks
2009-01-21 23:47 ` Trond Myklebust
2009-01-22 0:53 ` Frank Ch. Eigler
2009-01-22 2:04 ` Greg Banks
[not found] ` <4977D431.1020906-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2009-01-22 15:27 ` Steve Dickson
[not found] ` <49789073.1080200-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
2009-01-22 22:43 ` Greg Banks
2009-01-21 22:56 ` J. Bruce Fields
2009-01-21 23:05 ` Muntz, Daniel
2009-01-22 15:59 ` Steve Dickson
2009-01-22 16:45 ` J. Bruce Fields
2009-01-22 22:54 ` Greg Banks
[not found] ` <4978F91C.4090208-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2009-01-23 18:09 ` J. Bruce Fields
2009-01-23 22:18 ` Greg Banks
2009-01-23 18:17 ` Chuck Lever
2009-01-22 13:55 ` Steve Dickson
2009-01-22 22:31 ` Greg Banks
2009-01-21 21:26 ` Greg Banks
2009-01-22 15:19 ` Steve Dickson
2009-01-23 18:28 ` Chuck Lever
2009-01-23 22:21 ` Greg Banks
2009-01-16 23:44 ` Greg Banks
[not found] ` <49711BDF.3010605-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2009-01-17 16:15 ` Frank Ch. Eigler
[not found] ` <4972A8F5.7070806@opengridcomputing.com>
2009-01-18 17:47 ` Frank Ch. Eigler
[not found] ` <y0mmydpucww.fsf-vo4H8ooykKW2oG+2xah3EoGKTjYczspe@public.gmane.org>
2009-01-18 23:12 ` Greg Banks
[not found] ` <4973B777.2000102-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2009-01-19 15:41 ` Frank Ch. Eigler
2009-01-19 23:13 ` Greg Banks
2009-01-19 14:27 ` Jeff Moyer
[not found] ` <x49ab9ntlpp.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2009-01-19 19:49 ` Jason Baron
2009-01-19 22:58 ` Greg Banks
2009-01-21 10:13 ` K.Prasad
2009-01-21 16:39 ` Steve Dickson
[not found] ` <49774FDC.5090307-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
2009-01-21 17:04 ` Arnaldo Carvalho de Melo
2009-01-21 19:59 ` Steve Dickson
[not found] ` <20090121170401.GD4394-f8uhVLnGfZaxAyOMLChx1axOck334EZe@public.gmane.org>
2009-01-21 20:39 ` Christoph Hellwig
2009-01-18 16:40 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox