netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Net device error logging, revised
  2003-08-25 21:31 [PATCH 1/4] " Jim Keniston
@ 2003-09-15 23:08 ` Jim Keniston
  0 siblings, 0 replies; 3+ messages in thread
From: Jim Keniston @ 2003-09-15 23:08 UTC (permalink / raw)
  To: LKML, netdev, Jeff Garzik, Feldman, Scott, Larry Kessler, Greg KH,
	Randy Dunlap, Alan Cox, Andrew Morton, jkenisto
  Cc: David Brownell, Stephen Hemminger

[-- Attachment #1: Type: text/plain, Size: 2255 bytes --]

This patch extends the concept of Linux 2.6's dev_* logging macros to
support network devices.  This is a modification of a patch posted
last month, and addresses the issues raised since then, namely:

1. To minimize stack usage, the msg[] buffer in __netdev_printk() has been
made static and protected by a spinlock.  (The spinlock shouldn't be a big
performance hit because we're about to serialize on printk's lock anyway.)

2. It is no longer possible to omit the msglevel arg.  For example,
        netdev_err(dev,, "NIC fried!\n");
no longer works.  This must be expressed as
        netdev_err(dev, ALL, "NIC fried!\n");
or (see #3 below) something like
        netdev_fatal(dev, HW, "NIC fried!\n");

3. A new macro, netdev_fatal, is included.  Given the call
        netdev_fatal(dev, HW, "NIC fried!\n");
the indicated message is always logged: the msglevel arg (HW, in this
case) is NOT consulted.  In fact, the msglevel arg to netdev_fatal
is ignored in this implementation.  (As previously discussed, in some
future implementation, the msglevel could be logged to help indicate
the circumstances under which the event was logged.)

4. It was suggested that the netdev_* macros should support message
filtering via simple message levels -- e.g.,
        if (dev->msg_enable > 5) printk(KERN_INFO "Received a packet.\n");
-- in addition to (or instead of) via the NETIF_MSG_* bit masks.  But
Jeff Garzik reiterated his desire to standardize on NETIF_MSG_*, so
I'm leaving things unchanged in that respect.

5. It was suggested that netdev_dbg is not flexible enough to handle all
debugging situations.  This is probably true.  Because of the special
nature of debugging messages, I'd expect the developer to use other
approaches in debug code if netdev_dbg doesn't fill the bill.  But the
netdev_dbg approach appears to be reasonably useful.  (For comparison,
there are currently 188 calls to dev_dbg in Linux drivers.)  No change
here.

6. Certain comments seemed to imply that you should be able to
suppress all messages (even those with a msglevel of ALL) by setting the
msg_enable field to 0.  I chose not to support this, because it seemed
counterintuitive and inconsistent with existing practice.

Jim Keniston
IBM Linux Technology Center

[-- Attachment #2: netdev-2.6.0-test5.patch --]
[-- Type: text/plain, Size: 5765 bytes --]

diff -Naur linux.org/include/linux/netdevice.h linux.netdev.patched/include/linux/netdevice.h
--- linux.org/include/linux/netdevice.h	Mon Sep 15 15:58:06 2003
+++ linux.netdev.patched/include/linux/netdevice.h	Mon Sep 15 15:58:06 2003
@@ -467,6 +467,9 @@
 	struct divert_blk	*divert;
 #endif /* CONFIG_NET_DIVERT */
 
+	/* NETIF_MSG_* flags to control the types of events we log */
+	int msg_enable;
+
 	/* class/net/name entry */
 	struct class_device	class_dev;
 };
@@ -741,6 +744,7 @@
 	NETIF_MSG_PKTDATA	= 0x1000,
 	NETIF_MSG_HW		= 0x2000,
 	NETIF_MSG_WOL		= 0x4000,
+	NETIF_MSG_ALL		= -1,		/* always log message */
 };
 
 #define netif_msg_drv(p)	((p)->msg_enable & NETIF_MSG_DRV)
@@ -899,6 +903,40 @@
 extern void		dev_clear_fastroute(struct net_device *dev);
 #endif
 
+/* debugging and troubleshooting/diagnostic helpers. */
+/**
+ * netdev_printk() - Log message with interface name, gated by message level
+ * @sevlevel: severity level -- e.g., KERN_INFO
+ * @netdev: net_device pointer
+ * @msglevel: a standard message-level flag with the NETIF_MSG_ prefix removed.
+ *	Unless msglevel is NETIF_MSG_ALL, log the message only if that flag
+ *	is set in netdev->msg_enable.
+ * @format: as with printk
+ * @args: as with printk
+ */
+extern int __netdev_printk(const char *sevlevel,
+	const struct net_device *netdev, int msglevel, const char *format, ...);
+#define netdev_printk(sevlevel, netdev, msglevel, format, arg...)	\
+	__netdev_printk(sevlevel , netdev , NETIF_MSG_##msglevel ,	\
+	format , ## arg)
+
+#ifdef DEBUG
+#define netdev_dbg(netdev, msglevel, format, arg...)		\
+	netdev_printk(KERN_DEBUG , netdev , msglevel , format , ## arg)
+#else
+#define netdev_dbg(netdev, msglevel, format, arg...) do {} while (0)
+#endif
+
+#define netdev_err(netdev, msglevel, format, arg...)		\
+	netdev_printk(KERN_ERR , netdev , msglevel , format , ## arg)
+#define netdev_info(netdev, msglevel, format, arg...)		\
+	netdev_printk(KERN_INFO , netdev , msglevel , format , ## arg)
+#define netdev_warn(netdev, msglevel, format, arg...)		\
+	netdev_printk(KERN_WARNING , netdev , msglevel , format , ## arg)
+
+/* report fatal error unconditionally; msglevel ignored for now */
+#define netdev_fatal(netdev, msglevel, format, arg...)		\
+	netdev_printk(KERN_ERR , netdev , ALL , format , ## arg)
 
 #endif /* __KERNEL__ */
 
diff -Naur linux.org/net/core/dev.c linux.netdev.patched/net/core/dev.c
--- linux.org/net/core/dev.c	Mon Sep 15 15:58:07 2003
+++ linux.netdev.patched/net/core/dev.c	Mon Sep 15 15:58:07 2003
@@ -3035,3 +3035,75 @@
 }
 
 subsys_initcall(net_dev_init);
+
+static spinlock_t netdev_printk_lock = SPIN_LOCK_UNLOCKED;
+/**
+ * __netdev_printk() - Log message with interface name, gated by message level
+ * @sevlevel: severity level -- e.g., KERN_INFO
+ * @netdev: net_device pointer
+ * @msglevel: a standard message-level flag such as NETIF_MSG_PROBE.
+ *	Unless msglevel is NETIF_MSG_ALL, log the message only if
+ *	that flag is set in netdev->msg_enable.
+ * @format: as with printk
+ * @args: as with printk
+ *
+ * Does the work for the netdev_printk macro.
+ * For a lot of network drivers, the probe function looks like
+ *	...
+ *	netdev = alloc_netdev(...);	// or alloc_etherdev(...)
+ *	SET_NETDEV_DEV(netdev, dev);
+ *	...
+ *	register_netdev(netdev);
+ *	...
+ * netdev_printk and its wrappers (e.g., netdev_err) can be used as
+ * soon as you have a valid net_device pointer -- e.g., from alloc_netdev,
+ * alloc_etherdev, or init_etherdev.  (Before that, use dev_printk and
+ * its wrappers to report device errors.)  It's common for an interface to
+ * have a name like "eth%d" until the device is successfully configured,
+ * and the call to register_netdev changes it to a "real" name like "eth0".
+ *
+ * If the interface's reg_state is NETREG_REGISTERED, we assume that it has
+ * been successfully set up in sysfs, and we prepend only the interface name
+ * to the message -- e.g., "eth0: NIC Link is Down".  The interface
+ * name can be used to find eth0's driver, bus ID, etc. in sysfs.
+ *
+ * For any other value of reg_state, we prepend the driver name and bus ID
+ * as well as the (possibly incomplete) interface name -- e.g.,
+ * "eth%d (e100 0000:00:03.0): Failed to map PCI address..."
+ *
+ * Probe functions that alloc and register in one step (via init_etherdev),
+ * or otherwise register the device before the probe completes successfully,
+ * may need to take other steps to ensure that the failing device is clearly
+ * identified.
+ */
+int __netdev_printk(const char *sevlevel, const struct net_device *netdev,
+	int msglevel, const char *format, ...)
+{
+	if (!netdev || !format) {
+		return -EINVAL;
+	}
+	if (msglevel == NETIF_MSG_ALL || (netdev->msg_enable & msglevel)) {
+		static char msg[512];	/* protected by netdev_printk_lock */
+		unsigned long flags;
+		va_list args;
+		struct device *dev = netdev->class_dev.dev;
+		
+		spin_lock_irqsave(&netdev_printk_lock, flags);
+		va_start(args, format);
+		vsnprintf(msg, 512, format, args);
+		va_end(args);
+
+		if (!sevlevel) {
+			sevlevel = "";
+		}
+
+		if (netdev->reg_state == NETREG_REGISTERED || !dev) {
+			printk("%s%s: %s", sevlevel, netdev->name, msg);
+		} else {
+			printk("%s%s (%s %s): %s", sevlevel, netdev->name,
+				dev->driver->name, dev->bus_id, msg);
+		}
+		spin_unlock_irqrestore(&netdev_printk_lock, flags);
+	}
+	return 0;
+}
diff -Naur linux.org/net/netsyms.c linux.netdev.patched/net/netsyms.c
--- linux.org/net/netsyms.c	Mon Sep 15 15:58:07 2003
+++ linux.netdev.patched/net/netsyms.c	Mon Sep 15 15:58:07 2003
@@ -210,6 +210,7 @@
 EXPORT_SYMBOL(net_ratelimit);
 EXPORT_SYMBOL(net_random);
 EXPORT_SYMBOL(net_srandom);
+EXPORT_SYMBOL(__netdev_printk);
 
 /* Needed by smbfs.o */
 EXPORT_SYMBOL(__scm_destroy);

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [PATCH] Net device error logging, revised
@ 2003-09-17  2:06 Feldman, Scott
  2003-09-17 17:55 ` Jim Keniston
  0 siblings, 1 reply; 3+ messages in thread
From: Feldman, Scott @ 2003-09-17  2:06 UTC (permalink / raw)
  To: Jim Keniston, LKML, netdev, Jeff Garzik, Larry Kessler, Greg KH,
	Randy Dunlap, Alan Cox, Andrew Morton
  Cc: David Brownell, Stephen Hemminger

> 3. A new macro, netdev_fatal, is included.  Given the call
>         netdev_fatal(dev, HW, "NIC fried!\n");
> the indicated message is always logged: the msglevel arg (HW, in this
> case) is NOT consulted.  In fact, the msglevel arg to netdev_fatal
> is ignored in this implementation.  (As previously discussed, in some
> future implementation, the msglevel could be logged to help indicate
> the circumstances under which the event was logged.)

I couldn't find the previous discussion on netdev_fatal, so sorry if
this has already been worked out.  It uses KERN_ERR; did you mean
something stronger?  If not, why not just use netdev_err(dev, ALL,
"...")?  What is the situation in the driver where we'd want to use
_fatal?  How do I know when to use _fatal and when to use _err?

-scott

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Net device error logging, revised
  2003-09-17  2:06 [PATCH] Net device error logging, revised Feldman, Scott
@ 2003-09-17 17:55 ` Jim Keniston
  0 siblings, 0 replies; 3+ messages in thread
From: Jim Keniston @ 2003-09-17 17:55 UTC (permalink / raw)
  To: Feldman, Scott
  Cc: LKML, netdev, Jeff Garzik, Larry Kessler, Greg KH, Randy Dunlap,
	Alan Cox, Andrew Morton, David Brownell, Stephen Hemminger

"Feldman, Scott" wrote:
> 
> > 3. A new macro, netdev_fatal, is included.  Given the call
> >         netdev_fatal(dev, HW, "NIC fried!\n");
> > the indicated message is always logged: the msglevel arg (HW, in this
> > case) is NOT consulted.  In fact, the msglevel arg to netdev_fatal
> > is ignored in this implementation.  (As previously discussed, in some
> > future implementation, the msglevel could be logged to help indicate
> > the circumstances under which the event was logged.)
> 
> I couldn't find the previous discussion on netdev_fatal, so sorry if
> this has already been worked out.  It uses KERN_ERR; did you mean
> something stronger?  If not, why not just use netdev_err(dev, ALL,
> "...")?  What is the situation in the driver where we'd want to use
> _fatal?  How do I know when to use _fatal and when to use _err?
> 
> -scott

Good question.  There was a discussion thread where somebody pared down
the cc list (no LKML, no netdev, an apparently no Scott).  I saved most
of the messages, and can send them along to you if you want.

Anyway, in support of passing the NETIF_MSG_* msglevel as an arg to the 
netdev_* macros, I said that it...
> also opens the door for logging more clearly what part of the driver (PROBE,
> TX, RX, etc.) the message comes from.

David Brownell replied:
> This is a different issue.  Why wouldn't it be enough
> to have unique messages?

I replied:
> First, I don't want acceptance of the netdev_* idea to hinge on this
> minor issue.  Everything below is blue(r) sky...
>
> That said, it could be useful for an error-analysis utility to know that
> the message is (say) a WARNING categorized as an RX_ERR, especially if the
> utility hasn't been configured to recognized that specific message string.
> That's probably info that wouldn't come to you via printk (e.g., since
> syslog discards the severity level), but could come via another logging
> or monitoring system that plugs in (along with printk) via netdev_*.
>
> One problem with this idea is that, as I read Becker's document
> (URL above), "fatal" errors should always be logged, and therefore
> not tagged as PROBE, TX_ERR, etc.  If we wanted to pursue this idea,
> we could implement
>         netdev_fatal(dev, PROBE, "...")
> which logs the message unconditionally (KERN_ERR) and categorizes it as
> a PROBE error.

Jeff Garzik replied:
> printk'ing probe errors is definitely something I want to encourage.
> Probe errors are the most visible indication of what went wrong, if the
> driver fails to load.  These are the messages that help developers
> figure out the problem.  So I like the direction you suggest, in "If we
> wanted to pursue[...]"

In summary, the idea for netdev_fatal() is based on the idea that the msglevel
arg can not only be used for verbosity control, but can be logged as yet
another useful piece of info.  For printk-based logging, it could easily be
added to the message prefix:
	eth2: PROBE: Invalid Ethernet address
If you log this message using
	netdev_err(dev, PROBE, "Invalid Ethernet address\n")
it will be logged only if the PROBE flag is set in the msg_enable bitmap.
If you log it using
	netdev_fatal(dev, PROBE, "Invalid Ethernet address\n")
it will always be logged.

As mentioned above, I don't think the netdev_fatal idea is central to the
whole netdev_printk idea.

By the way, regarding using KERN_ERR vs. something stronger: It's pretty
unusual for a driver to log anything as KERN_CRIT or worse -- presumably
due to a desire not to overreact if the nth of n redundant devices happens
to fail.  So unless we can agree on a definition of KERN_CRIT (e.g., it
applies to any failure that makes a device unavailable pending human
intervention), I'd prefer to steer clear of that.

Jim

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2003-09-17 17:55 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-09-17  2:06 [PATCH] Net device error logging, revised Feldman, Scott
2003-09-17 17:55 ` Jim Keniston
  -- strict thread matches above, loose matches on Subject: below --
2003-08-25 21:31 [PATCH 1/4] " Jim Keniston
2003-09-15 23:08 ` [PATCH] " Jim Keniston

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).