Netdev List

* [PATCH v3 1/3] ptp: Added a brand new class driver for ptp clocks.
From: Richard Cochran @ 2010-05-14 16:45 UTC
  To: netdev; +Cc: linuxppc-dev, devicetree-discuss
In-Reply-To: <cover.1273855016.git.richard.cochran@omicron.at>

This patch adds an infrastructure for hardware clocks that implement
IEEE 1588, the Precision Time Protocol (PTP). A class driver offers a
registration method to particular hardware clock drivers. Each clock is
exposed to user space as a character device with ioctls that allow tuning
of the PTP clock.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
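Note, not part of the commit: the external time stamp FIFO in
drivers/ptp/ptp_clock.c tracks its contents with a head and a tail
index and always leaves one slot unused, so that a full queue can be
told apart from an empty one. The stand-alone user space sketch below
reproduces only that index arithmetic. The names demo_queue,
DEMO_SLOTS, demo_cnt, demo_free, demo_push, demo_pop and
demo_selfcheck are hypothetical, the capacity of 8 is chosen for
illustration (the patch uses PTP_MAX_TIMESTAMPS = 128), and rejecting
a push onto a full queue is an assumption of this sketch rather than
a statement about the driver's policy.

```c
#include <assert.h>

#define DEMO_SLOTS 8 /* hypothetical; the patch uses PTP_MAX_TIMESTAMPS */

struct demo_queue {
	int buf[DEMO_SLOTS];
	int head; /* next slot to read */
	int tail; /* next slot to write */
};

/* Number of queued entries, same arithmetic as queue_cnt(). */
int demo_cnt(const struct demo_queue *q)
{
	int cnt = q->tail - q->head;
	return cnt < 0 ? DEMO_SLOTS + cnt : cnt;
}

/* Free entries; one slot always stays unused, as in queue_free(). */
int demo_free(const struct demo_queue *q)
{
	return DEMO_SLOTS - demo_cnt(q) - 1;
}

/* Assumption of this sketch: a push onto a full queue is rejected. */
int demo_push(struct demo_queue *q, int val)
{
	if (!demo_free(q))
		return -1;
	q->buf[q->tail] = val;
	q->tail = (q->tail + 1) % DEMO_SLOTS;
	return 0;
}

/* Removes the oldest entry, or returns -1 if the queue is empty. */
int demo_pop(struct demo_queue *q, int *val)
{
	if (!demo_cnt(q))
		return -1;
	*val = q->buf[q->head];
	q->head = (q->head + 1) % DEMO_SLOTS;
	return 0;
}

/* Exercises the helpers; returns the number of rejected pushes. */
int demo_selfcheck(void)
{
	struct demo_queue q = { .head = 0, .tail = 0 };
	int i, val, rejected = 0;

	/* Only DEMO_SLOTS - 1 entries fit; the last push is rejected. */
	for (i = 0; i < DEMO_SLOTS; i++)
		if (demo_push(&q, i))
			rejected++;
	assert(demo_cnt(&q) == DEMO_SLOTS - 1);
	assert(demo_free(&q) == 0);

	/* Drain three entries, then refill so that tail wraps around. */
	for (i = 0; i < 3; i++) {
		assert(demo_pop(&q, &val) == 0);
		assert(val == i);
	}
	for (i = 0; i < 3; i++)
		assert(demo_push(&q, 100 + i) == 0);
	assert(q.tail < q.head);
	assert(demo_cnt(&q) == DEMO_SLOTS - 1);

	return rejected;
}
```

Calling demo_selfcheck() from a small main() runs the checks. With
eight slots, seven entries fit and the eighth push is rejected, which
is the cost of using head == tail to mean "empty".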
 Documentation/ptp/ptp.txt        |   95 +++++++
 Documentation/ptp/testptp.c      |  245 ++++++++++++++++++
 Documentation/ptp/testptp.mk     |   33 +++
 drivers/Kconfig                  |    2 +
 drivers/Makefile                 |    1 +
 drivers/ptp/Kconfig              |   26 ++
 drivers/ptp/Makefile             |    5 +
 drivers/ptp/ptp_clock.c          |  512 ++++++++++++++++++++++++++++++++++++++
 include/linux/Kbuild             |    1 +
 include/linux/ptp_clock.h        |   79 ++++++
 include/linux/ptp_clock_kernel.h |  137 ++++++++++
 11 files changed, 1136 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ptp/ptp.txt
 create mode 100644 Documentation/ptp/testptp.c
 create mode 100644 Documentation/ptp/testptp.mk
 create mode 100644 drivers/ptp/Kconfig
 create mode 100644 drivers/ptp/Makefile
 create mode 100644 drivers/ptp/ptp_clock.c
 create mode 100644 include/linux/ptp_clock.h
 create mode 100644 include/linux/ptp_clock_kernel.h

diff --git a/Documentation/ptp/ptp.txt b/Documentation/ptp/ptp.txt
new file mode 100644
index 0000000..46858b3
--- /dev/null
+++ b/Documentation/ptp/ptp.txt
@@ -0,0 +1,95 @@
+
+* PTP infrastructure for Linux
+
+  This patch set introduces support for IEEE 1588 PTP clocks in
+  Linux. Together with the SO_TIMESTAMPING socket options, this
+  presents a standardized method for developing PTP user space
+  programs, synchronizing Linux with external clocks, and using the
+  ancillary features of PTP hardware clocks.
+
+  A new class driver exports a kernel interface for specific clock
+  drivers and a user space interface. The infrastructure supports a
+  complete set of PTP functionality.
+
+  + Basic clock operations
+    - Set time
+    - Get time
+    - Shift the clock by a given offset atomically
+    - Adjust clock frequency
+
+  + Ancillary clock features
+    - One shot or periodic alarms, with signal delivery to user program
+    - Time stamp external events
+    - Periodic output signals configurable from user space
+    - Synchronization of the Linux system time via the PPS subsystem
+
+** PTP kernel API
+
+   A PTP clock driver registers itself with the class driver. The
+   class driver handles all of the dealings with user space. The
+   author of a clock driver need only implement the details of
+   programming the clock hardware. The clock driver notifies the class
+   driver of asynchronous events (alarms and external time stamps) via
+   a simple message passing interface.
+
+   The class driver supports multiple PTP clock drivers. In normal use
+   cases, only one PTP clock is needed. However, for testing and
+   development, it can be useful to have more than one clock in a
+   single system, in order to allow performance comparisons.
+
+** PTP user space API
+
+   The class driver creates a character device for each registered PTP
+   clock. User space programs may control the clock using standardized
+   ioctls. A program may query, enable, configure, and disable the
+   ancillary clock features. User space can receive time stamped
+   events via blocking read() and poll(). One shot and periodic
+   signals may be configured via an ioctl API with semantics similar
+   to the POSIX timer_settime() system call.
+
+   As a real life example, the following two patches for ptpd version
+   1.0.0 demonstrate how the API works.
+
+   https://sourceforge.net/tracker/?func=detail&aid=2992845&group_id=139814&atid=744634
+
+   https://sourceforge.net/tracker/?func=detail&aid=2992847&group_id=139814&atid=744634
+
+** Writing clock drivers
+
+   Clock drivers include include/linux/ptp_clock_kernel.h and register
+   themselves by presenting a 'struct ptp_clock_info' to the
+   registration method. Clock drivers must implement all of the
+   functions in the interface. If a clock does not offer a particular
+   ancillary feature, then the driver should just return -EOPNOTSUPP
+   from those functions.
+
+   Drivers must ensure that all of the methods in the interface are
+   reentrant. Since most hardware implementations treat the time value
+   as a 64 bit integer accessed as two 32 bit registers, drivers
+   should use spin_lock_irqsave/spin_unlock_irqrestore to protect
+   against concurrent access. This locking cannot be accomplished in
+   the class driver, since the lock may also be needed by the clock
+   driver's interrupt service routine.
+
+** Supported hardware
+
+   + Standard Linux system timer
+     - No special PTP features
+     - For use with software time stamping
+
+   + Freescale eTSEC gianfar
+     - 2 Time stamp external triggers, programmable polarity (opt. interrupt)
+     - 2 Alarm registers (optional interrupt)
+     - 3 Periodic signals (optional interrupt)
+
+   + National DP83640
+     - 6 GPIOs programmable as inputs or outputs
+     - 6 GPIOs with dedicated functions (LED/JTAG/clock) can also be
+       used as general inputs or outputs
+     - GPIO inputs can time stamp external triggers
+     - GPIO outputs can produce periodic signals
+     - 1 interrupt pin
+
+   + Intel IXP465
+     - Auxiliary Slave/Master Mode Snapshot (optional interrupt)
+     - Target Time (optional interrupt)
diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c
new file mode 100644
index 0000000..ed2ceea
--- /dev/null
+++ b/Documentation/ptp/testptp.c
@@ -0,0 +1,245 @@
+/*
+ * PTP 1588 clock support - User space test program
+ *
+ * Copyright (C) 2010 OMICRON electronics GmbH
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+#include <errno.h>
+#include <fcntl.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <time.h>
+#include <unistd.h>
+
+#include <linux/ptp_clock.h>
+
+static void handle_alarm(int s)
+{
+	printf("received signal %d \n", s);
+}
+
+static int install_handler(int signum, void (*handler)(int))
+{
+	struct sigaction action;
+	sigset_t mask;
+
+	/* Unblock the signal. */
+	sigemptyset(&mask);
+	sigaddset(&mask, signum);
+	sigprocmask(SIG_UNBLOCK, &mask, NULL);
+
+	/* Install the signal handler. */
+	action.sa_handler = handler;
+	action.sa_flags = 0;
+	sigemptyset(&action.sa_mask);
+	sigaction(signum, &action, NULL);
+
+	return 0;
+}
+
+static void usage(char *progname)
+{
+	fprintf(stderr,
+		"usage: %s [options] \n"
+		" -a val     request a one-shot alarm after 'val' seconds \n"
+		" -A val     request a periodic alarm every 'val' seconds \n"
+		" -c         query the ptp clock's capabilities \n"
+		" -d name    device to open \n"
+		" -e val     read 'val' external time stamp events \n"
+		" -f val     adjust the ptp clock frequency by 'val' PPB \n"
+		" -g         get the ptp clock time \n"
+		" -h         prints this message \n"
+		" -s         set the ptp clock time from the system time \n"
+		" -t val     shift the ptp clock time by 'val' seconds \n"
+		" -v         query the ptp clock api version \n"
+		,progname);
+}
+
+int main(int argc, char *argv[])
+{
+	struct ptp_clock_caps caps;
+	struct ptp_clock_timer timer;
+	struct ptp_extts_event event;
+	struct ptp_clock_request request;
+	struct timespec ts;
+	char *progname;
+	int c, cnt, fd, val=0;
+
+	char *device = "/dev/ptp_clock_0";
+	int adjfreq=0x7fffffff;
+	int adjtime=0;
+	int capabilities=0;
+	int extts=0;
+	int gettime=0;
+	int oneshot=0;
+	int periodic=0;
+	int settime=0;
+	int version=0;
+
+	progname = strrchr(argv[0],'/');
+	progname = progname ? 1+progname : argv[0];
+	while (EOF != (c = getopt(argc, argv, "a:A:cd:e:f:ghst:v"))) {
+		switch (c) {
+		case 'a': oneshot = atoi(optarg); break;
+		case 'A': periodic = atoi(optarg); break;
+		case 'c': capabilities = 1; break;
+		case 'd': device = optarg; break;
+		case 'e': extts = atoi(optarg); break;
+		case 'f': adjfreq = atoi(optarg); break;
+		case 'g': gettime = 1; break;
+		case 's': settime = 1; break;
+		case 't': adjtime = atoi(optarg); break;
+		case 'v': version = 1; break;
+		case 'h': usage(progname); return 0;
+		case '?': usage(progname); return -1;
+		default:  usage(progname); return -1;
+		}
+	}
+
+	fd = open(device, O_RDWR);
+	if (fd < 0) {
+		fprintf(stderr, "cannot open %s: %s\n", device, strerror(errno));
+		return -1;
+	}
+
+	if (version) {
+		if (ioctl(fd, PTP_CLOCK_APIVERS, &val)) {
+			perror("PTP_CLOCK_APIVERS");
+		} else {
+			printf("version = 0x%08x \n",val);
+		}
+	}
+
+	if (capabilities) {
+		if (ioctl(fd, PTP_CLOCK_GETCAPS, &caps)) {
+			perror("PTP_CLOCK_GETCAPS");
+		} else {
+			printf("capabilities: \n"
+			       "  %d maximum frequency adjustment (PPB) \n"
+			       "  %d programmable alarms \n"
+			       "  %d external time stamp channels \n"
+			       "  %d programmable periodic signals \n"
+			       "  %d pulse per second \n",
+			       caps.max_adj,
+			       caps.n_alarm,
+			       caps.n_ext_ts,
+			       caps.n_per_out,
+			       caps.pps);
+		}
+	}
+
+	if (0x7fffffff != adjfreq) {
+		if (ioctl(fd, PTP_CLOCK_ADJFREQ, adjfreq)) {
+			perror("PTP_CLOCK_ADJFREQ");
+		} else {
+			puts("frequency adjustment okay");
+		}
+	}
+
+	if (adjtime) {
+		ts.tv_sec = adjtime;
+		ts.tv_nsec = 0;
+		if (ioctl(fd, PTP_CLOCK_ADJTIME, &ts)) {
+			perror("PTP_CLOCK_ADJTIME");
+		} else {
+			puts("time shift okay");
+		}
+	}
+
+	if (gettime) {
+		if (ioctl(fd, PTP_CLOCK_GETTIME, &ts)) {
+			perror("PTP_CLOCK_GETTIME");
+		} else {
+			printf("clock time: %ld.%09ld or %s",
+			       ts.tv_sec, ts.tv_nsec, ctime(&ts.tv_sec));
+		}
+	}
+
+	if (settime) {
+		clock_gettime(CLOCK_REALTIME, &ts);
+		if (ioctl(fd, PTP_CLOCK_SETTIME, &ts)) {
+			perror("PTP_CLOCK_SETTIME");
+		} else {
+			puts("set time okay");
+		}
+	}
+
+	if (extts) {
+		memset(&request, 0, sizeof(request));
+		request.type = PTP_REQUEST_EXTTS;
+		request.index = 0;
+		request.flags = PTP_ENABLE_FEATURE;
+		if (ioctl(fd, PTP_FEATURE_REQUEST, &request)) {
+			perror("PTP_FEATURE_REQUEST");
+			extts = 0;
+		} else {
+			puts("external time stamp request okay");
+		}
+		for (; extts; extts--) {
+			cnt = read(fd, &event, sizeof(event));
+			if (cnt != sizeof(event)) {
+				perror("read");
+				break;
+			}
+			printf("event index %d at %ld.%09ld \n", event.index,
+			       event.ts.tv_sec, event.ts.tv_nsec);
+		}
+		/* Disable the feature again. */
+		request.flags = 0;
+		if (ioctl(fd, PTP_FEATURE_REQUEST, &request)) {
+			perror("PTP_FEATURE_REQUEST");
+		}
+	}
+
+	if (oneshot) {
+		install_handler(SIGALRM, handle_alarm);
+		memset(&timer, 0, sizeof(timer));
+		timer.signum = SIGALRM;
+		timer.tsp.it_value.tv_sec = oneshot;
+		if (ioctl(fd, PTP_CLOCK_SETTIMER, &timer)) {
+			perror("PTP_CLOCK_SETTIMER");
+		} else {
+			puts("set timer okay");
+		}
+		pause();
+	}
+
+	if (periodic) {
+		install_handler(SIGALRM, handle_alarm);
+		memset(&timer, 0, sizeof(timer));
+		timer.signum = SIGALRM;
+		timer.tsp.it_value.tv_sec = periodic;
+		timer.tsp.it_interval.tv_sec = periodic;
+		if (ioctl(fd, PTP_CLOCK_SETTIMER, &timer)) {
+			perror("PTP_CLOCK_SETTIMER");
+		} else {
+			puts("set timer okay");
+		}
+		while (1) {
+			pause();
+		}
+	}
+
+	close(fd);
+	return 0;
+}
diff --git a/Documentation/ptp/testptp.mk b/Documentation/ptp/testptp.mk
new file mode 100644
index 0000000..4ef2d97
--- /dev/null
+++ b/Documentation/ptp/testptp.mk
@@ -0,0 +1,33 @@
+# PTP 1588 clock support - User space test program
+#
+# Copyright (C) 2010 OMICRON electronics GmbH
+#
+#  This program is free software; you can redistribute it and/or modify
+#  it under the terms of the GNU General Public License as published by
+#  the Free Software Foundation; either version 2 of the License, or
+#  (at your option) any later version.
+#
+#  This program is distributed in the hope that it will be useful,
+#  but WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#  GNU General Public License for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with this program; if not, write to the Free Software
+#  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+CC        = $(CROSS_COMPILE)gcc
+INC       = -I$(KBUILD_OUTPUT)/usr/include
+CFLAGS    = -Wall $(INC)
+LDLIBS    = -lrt
+PROGS     = testptp
+
+all: $(PROGS)
+
+testptp: testptp.o
+
+clean:
+	rm -f testptp.o
+
+distclean: clean
+	rm -f $(PROGS)
diff --git a/drivers/Kconfig b/drivers/Kconfig
index a2b902f..774fbd7 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -52,6 +52,8 @@ source "drivers/spi/Kconfig"
 
 source "drivers/pps/Kconfig"
 
+source "drivers/ptp/Kconfig"
+
 source "drivers/gpio/Kconfig"
 
 source "drivers/w1/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index f42a030..84228bc 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -75,6 +75,7 @@ obj-$(CONFIG_I2O)		+= message/
 obj-$(CONFIG_RTC_LIB)		+= rtc/
 obj-y				+= i2c/ media/
 obj-$(CONFIG_PPS)		+= pps/
+obj-$(CONFIG_PTP_1588_CLOCK)	+= ptp/
 obj-$(CONFIG_W1)		+= w1/
 obj-$(CONFIG_POWER_SUPPLY)	+= power/
 obj-$(CONFIG_HWMON)		+= hwmon/
diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
new file mode 100644
index 0000000..c80a25b
--- /dev/null
+++ b/drivers/ptp/Kconfig
@@ -0,0 +1,26 @@
+#
+# PTP clock support configuration
+#
+
+menu "PTP clock support"
+
+config PTP_1588_CLOCK
+	tristate "PTP clock support"
+	depends on EXPERIMENTAL
+	help
+	  The IEEE 1588 standard defines a method to precisely
+	  synchronize distributed clocks over Ethernet networks. The
+	  standard defines a Precision Time Protocol (PTP), which can
+	  be used to achieve synchronization within a few dozen
+	  microseconds. In addition, with the help of special hardware
+	  time stamping units, it can be possible to achieve
+	  synchronization to within a few hundred nanoseconds.
+
+	  This driver adds support for PTP clocks as character
+	  devices. If you want to use a PTP clock, then you should
+	  also enable at least one clock driver.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called ptp_clock.
+
+endmenu
diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile
new file mode 100644
index 0000000..b86695c
--- /dev/null
+++ b/drivers/ptp/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for PTP 1588 clock support.
+#
+
+obj-$(CONFIG_PTP_1588_CLOCK)		+= ptp_clock.o
diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c
new file mode 100644
index 0000000..1d94d31
--- /dev/null
+++ b/drivers/ptp/ptp_clock.c
@@ -0,0 +1,512 @@
+/*
+ * PTP 1588 clock support
+ *
+ * Partially adapted from the Linux PPS driver.
+ *
+ * Copyright (C) 2010 OMICRON electronics GmbH
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+#include <linux/bitops.h>
+#include <linux/cdev.h>
+#include <linux/device.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include <linux/ptp_clock_kernel.h>
+#include <linux/ptp_clock.h>
+
+#define PTP_MAX_ALARMS 4
+#define PTP_MAX_CLOCKS BITS_PER_LONG
+#define PTP_MAX_TIMESTAMPS 128
+
+struct alarm {
+	struct pid *pid;
+	int sig;
+};
+
+struct timestamp_event_queue {
+	struct ptp_extts_event buf[PTP_MAX_TIMESTAMPS];
+	int head;
+	int tail;
+	int overflow;
+};
+
+struct ptp_clock {
+	struct list_head list;
+	struct cdev cdev;
+	struct device *dev;
+	struct ptp_clock_info *info;
+	dev_t devid;
+	int index; /* index into clocks.map, also the minor number */
+
+	struct alarm alarm[PTP_MAX_ALARMS];
+	struct mutex alarm_mux; /* one process at a time setting an alarm */
+
+	struct timestamp_event_queue tsevq; /* simple fifo for time stamps */
+	struct mutex tsevq_mux; /* one process at a time reading the fifo */
+	wait_queue_head_t tsev_wq;
+};
+
+/* private globals */
+
+static const struct file_operations ptp_fops;
+static dev_t ptp_devt;
+static struct class *ptp_class;
+
+static struct {
+	struct list_head list;
+	DECLARE_BITMAP(map, PTP_MAX_CLOCKS);
+} clocks;
+static DEFINE_SPINLOCK(clocks_lock); /* protects 'clocks' */
+
+/* time stamp event queue operations */
+
+static inline int queue_cnt(struct timestamp_event_queue *q)
+{
+	int cnt = q->tail - q->head;
+	return cnt < 0 ? PTP_MAX_TIMESTAMPS + cnt : cnt;
+}
+
+static inline int queue_free(struct timestamp_event_queue *q)
+{
+	return PTP_MAX_TIMESTAMPS - queue_cnt(q) - 1;
+}
+
+static void enqueue_external_timestamp(struct timestamp_event_queue *queue,
+				       struct ptp_clock_event *src)
+{
+	struct ptp_extts_event *dst;
+	u32 remainder;
+
+	dst = &queue->buf[ queue->tail ];
+
+	dst->index = src->index;
+	dst->ts.tv_sec = div_u64_rem(src->timestamp, 1000000000, &remainder);
+	dst->ts.tv_nsec = remainder;
+
+	if (!queue_free(queue))
+		queue->overflow++;
+	else
+		queue->tail = (queue->tail + 1) % PTP_MAX_TIMESTAMPS;
+}
+
+/* public interface */
+
+struct ptp_clock* ptp_clock_register(struct ptp_clock_info *info)
+{
+	struct ptp_clock *ptp;
+	int err = 0, index, major = MAJOR(ptp_devt);
+	unsigned long flags;
+
+	if (info->n_alarm > PTP_MAX_ALARMS)
+		return ERR_PTR(-EINVAL);
+
+	/* Find a free clock slot and reserve it. */
+	err = -EBUSY;
+	spin_lock_irqsave(&clocks_lock, flags);
+	index = find_first_zero_bit(clocks.map, PTP_MAX_CLOCKS);
+	if (index < PTP_MAX_CLOCKS) {
+		set_bit(index, clocks.map);
+		spin_unlock_irqrestore(&clocks_lock, flags);
+	} else {
+		spin_unlock_irqrestore(&clocks_lock, flags);
+		goto no_clock;
+	}
+
+	/* Initialize a clock structure. */
+	err = -ENOMEM;
+	ptp = kzalloc(sizeof(struct ptp_clock), GFP_KERNEL);
+	if (ptp == NULL)
+		goto no_memory;
+
+	ptp->info = info;
+	ptp->devid = MKDEV(major, index);
+	ptp->index = index;
+	mutex_init(&ptp->alarm_mux);
+	mutex_init(&ptp->tsevq_mux);
+	init_waitqueue_head(&ptp->tsev_wq);
+
+	/* Create a new device in our class. */
+	ptp->dev = device_create(ptp_class, NULL, ptp->devid, ptp,
+				 "ptp_clock_%d", ptp->index);
+	if (IS_ERR(ptp->dev))
+		goto no_device;
+
+	dev_set_drvdata(ptp->dev, ptp);
+
+	/* Register a character device. */
+	cdev_init(&ptp->cdev, &ptp_fops);
+	ptp->cdev.owner = info->owner;
+	err = cdev_add(&ptp->cdev, ptp->devid, 1);
+	if (err)
+		goto no_cdev;
+
+	/* Clock is ready, add it into the list. */
+	spin_lock_irqsave(&clocks_lock, flags);
+	list_add(&ptp->list, &clocks.list);
+	spin_unlock_irqrestore(&clocks_lock, flags);
+
+	return ptp;
+
+no_cdev:
+	device_destroy(ptp_class, ptp->devid);
+no_device:
+	mutex_destroy(&ptp->alarm_mux);
+	mutex_destroy(&ptp->tsevq_mux);
+	kfree(ptp);
+no_memory:
+	spin_lock_irqsave(&clocks_lock, flags);
+	clear_bit(index, clocks.map);
+	spin_unlock_irqrestore(&clocks_lock, flags);
+no_clock:
+	return ERR_PTR(err);
+}
+EXPORT_SYMBOL(ptp_clock_register);
+
+int ptp_clock_unregister(struct ptp_clock *ptp)
+{
+	unsigned long flags;
+
+	/* Release the clock's resources. */
+	cdev_del(&ptp->cdev);
+	device_destroy(ptp_class, ptp->devid);
+	mutex_destroy(&ptp->alarm_mux);
+	mutex_destroy(&ptp->tsevq_mux);
+
+	/* Remove the clock from the list. */
+	spin_lock_irqsave(&clocks_lock, flags);
+	list_del(&ptp->list);
+	clear_bit(ptp->index, clocks.map);
+	spin_unlock_irqrestore(&clocks_lock, flags);
+
+	kfree(ptp);
+
+	return 0;
+}
+EXPORT_SYMBOL(ptp_clock_unregister);
+
+void ptp_clock_event(struct ptp_clock *ptp, struct ptp_clock_event *event)
+{
+	switch (event->type) {
+
+	case PTP_CLOCK_ALARM:
+		kill_pid(ptp->alarm[ event->index ].pid,
+			 ptp->alarm[ event->index ].sig, 1);
+		break;
+
+	case PTP_CLOCK_EXTTS:
+		enqueue_external_timestamp(&ptp->tsevq, event);
+		wake_up_interruptible(&ptp->tsev_wq);
+		break;
+
+	case PTP_CLOCK_PPS:
+		break;
+	}
+}
+EXPORT_SYMBOL(ptp_clock_event);
+
+/* character device operations */
+
+static int ptp_ioctl(struct inode *node, struct file *fp,
+		      unsigned int cmd, unsigned long arg)
+{
+	struct ptp_clock_caps caps;
+	struct ptp_clock_request req;
+	struct ptp_clock_timer timer;
+	struct ptp_clock *ptp = fp->private_data;
+	struct ptp_clock_info *ops = ptp->info;
+	void *priv = ops->priv;
+	struct timespec ts;
+	int flags, index;
+	int err = 0;
+
+	switch (cmd) {
+
+	case PTP_CLOCK_APIVERS:
+		err = put_user(PTP_CLOCK_VERSION, (u32 __user*)arg);
+		break;
+
+	case PTP_CLOCK_ADJFREQ:
+		if (!capable(CAP_SYS_TIME))
+			return -EPERM;
+		err = ops->adjfreq(priv, arg);
+		break;
+
+	case PTP_CLOCK_ADJTIME:
+		if (!capable(CAP_SYS_TIME))
+			return -EPERM;
+		if (copy_from_user(&ts, (void __user*)arg, sizeof(ts)))
+			err = -EFAULT;
+		else
+			err = ops->adjtime(priv, &ts);
+		break;
+
+	case PTP_CLOCK_GETTIME:
+		err = ops->gettime(priv, &ts);
+		if (err)
+			break;
+		err = copy_to_user((void __user*)arg, &ts, sizeof(ts)) ? -EFAULT : 0;
+		break;
+
+	case PTP_CLOCK_SETTIME:
+		if (!capable(CAP_SYS_TIME))
+			return -EPERM;
+		if (copy_from_user(&ts, (void __user*)arg, sizeof(ts)))
+			err = -EFAULT;
+		else
+			err = ops->settime(priv, &ts);
+		break;
+
+	case PTP_CLOCK_GETCAPS:
+		memset(&caps, 0, sizeof(caps));
+		caps.max_adj = ptp->info->max_adj;
+		caps.n_alarm = ptp->info->n_alarm;
+		caps.n_ext_ts = ptp->info->n_ext_ts;
+		caps.n_per_out = ptp->info->n_per_out;
+		caps.pps = ptp->info->pps;
+		err = copy_to_user((void __user*)arg, &caps, sizeof(caps)) ? -EFAULT : 0;
+		break;
+
+	case PTP_CLOCK_GETTIMER:
+		if (copy_from_user(&timer, (void __user*)arg, sizeof(timer))) {
+			err = -EFAULT;
+			break;
+		}
+		index = timer.alarm_index;
+		if (index < 0 || index >= ptp->info->n_alarm) {
+			err = -EINVAL;
+			break;
+		}
+		err = ops->gettimer(priv, index, &timer.tsp);
+		if (err)
+			break;
+		err = copy_to_user((void __user*)arg, &timer, sizeof(timer)) ? -EFAULT : 0;
+		break;
+
+	case PTP_CLOCK_SETTIMER:
+		if (copy_from_user(&timer, (void __user*)arg, sizeof(timer))) {
+			err = -EFAULT;
+			break;
+		}
+		index = timer.alarm_index;
+		if (index < 0 || index >= ptp->info->n_alarm) {
+			err = -EINVAL;
+			break;
+		}
+		if (!valid_signal(timer.signum))
+			return -EINVAL;
+		flags = timer.flags;
+		if (flags && flags != TIMER_ABSTIME) {
+			err = -EINVAL;
+			break;
+		}
+		if (mutex_lock_interruptible(&ptp->alarm_mux))
+			return -ERESTARTSYS;
+
+		if (ptp->alarm[index].pid)
+			put_pid(ptp->alarm[index].pid);
+
+		ptp->alarm[index].pid = get_pid(task_pid(current));
+		ptp->alarm[index].sig = timer.signum;
+		err = ops->settimer(priv, index, flags, &timer.tsp);
+
+		mutex_unlock(&ptp->alarm_mux);
+		break;
+
+	case PTP_FEATURE_REQUEST:
+		if (copy_from_user(&req, (void __user*)arg, sizeof(req))) {
+			err = -EFAULT;
+			break;
+		}
+		switch (req.type) {
+		case PTP_REQUEST_EXTTS:
+		case PTP_REQUEST_PEROUT:
+			break;
+		case PTP_REQUEST_PPS:
+			if (!capable(CAP_SYS_TIME))
+				return -EPERM;
+			break;
+		default:
+			err = -EINVAL;
+			break;
+		}
+		if (err)
+			break;
+		err = ops->enable(priv, &req,
+				  req.flags & PTP_ENABLE_FEATURE ? 1 : 0);
+		break;
+
+	default:
+		err = -ENOTTY;
+		break;
+	}
+	return err;
+}
+
+static int ptp_open(struct inode *inode, struct file *fp)
+{
+	struct ptp_clock *ptp;
+	ptp = container_of(inode->i_cdev, struct ptp_clock, cdev);
+
+	fp->private_data = ptp;
+
+	return 0;
+}
+
+static unsigned int ptp_poll(struct file *fp, poll_table *wait)
+{
+	struct ptp_clock *ptp = fp->private_data;
+
+	poll_wait(fp, &ptp->tsev_wq, wait);
+
+	return queue_cnt(&ptp->tsevq) ? POLLIN : 0;
+}
+
+static ssize_t ptp_read(struct file *fp, char __user *buf,
+			size_t cnt, loff_t *off)
+{
+	struct ptp_clock *ptp = fp->private_data;
+	struct timestamp_event_queue *queue = &ptp->tsevq;
+	struct ptp_extts_event *event;
+	size_t qcnt;
+
+	if (mutex_lock_interruptible(&ptp->tsevq_mux))
+		return -ERESTARTSYS;
+
+	cnt = cnt / sizeof(struct ptp_extts_event);
+
+	if (wait_event_interruptible(ptp->tsev_wq,
+				     (qcnt = queue_cnt(&ptp->tsevq)))) {
+		mutex_unlock(&ptp->tsevq_mux);
+		return -ERESTARTSYS;
+	}
+
+	cnt = min(cnt, qcnt);
+	cnt = min_t(size_t, cnt, PTP_MAX_TIMESTAMPS - queue->head);
+
+	event = &queue->buf[ queue->head ];
+
+	if (copy_to_user(buf, event, cnt * sizeof(struct ptp_extts_event))) {
+		mutex_unlock(&ptp->tsevq_mux);
+		return -EFAULT;
+	}
+	queue->head = (queue->head + cnt) % PTP_MAX_TIMESTAMPS;
+
+	mutex_unlock(&ptp->tsevq_mux);
+
+	return cnt * sizeof(struct ptp_extts_event);
+}
+
+static int ptp_release(struct inode *inode, struct file *fp)
+{
+	struct ptp_clock *ptp;
+	struct itimerspec ts = {{0,0},{0,0}};
+	int i;
+
+	ptp = container_of(inode->i_cdev, struct ptp_clock, cdev);
+
+	for (i = 0; i < ptp->info->n_alarm; i++) {
+		if (ptp->alarm[i].pid) {
+			ptp->info->settimer(ptp->info->priv, i, 0, &ts);
+			put_pid(ptp->alarm[i].pid);
+			ptp->alarm[i].pid = NULL;
+		}
+	}
+	return 0;
+}
+
+static const struct file_operations ptp_fops = {
+	.owner		= THIS_MODULE,
+	.ioctl		= ptp_ioctl,
+	.open		= ptp_open,
+	.poll		= ptp_poll,
+	.read		= ptp_read,
+	.release	= ptp_release,
+};
+
+/* sysfs */
+
+static ssize_t ptp_show_status(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct ptp_clock *ptp = dev_get_drvdata(dev);
+	return sprintf(buf,
+		       "maximum adjustment:  %d\n"
+		       "programmable alarms: %d\n"
+		       "external timestamps: %d\n"
+		       "periodic outputs:    %d\n"
+		       "has pps:             %d\n"
+		       "device index:        %d\n"
+		       ,ptp->info->max_adj
+		       ,ptp->info->n_alarm
+		       ,ptp->info->n_ext_ts
+		       ,ptp->info->n_per_out
+		       ,ptp->info->pps
+		       ,ptp->index);
+}
+
+static struct device_attribute ptp_attrs[] = {
+	__ATTR(capabilities, S_IRUGO, ptp_show_status, NULL),
+	__ATTR_NULL,
+};
+
+/* module operations */
+
+static void __exit ptp_exit(void)
+{
+	class_destroy(ptp_class);
+	unregister_chrdev_region(ptp_devt, PTP_MAX_CLOCKS);
+}
+
+static int __init ptp_init(void)
+{
+	int err;
+
+	INIT_LIST_HEAD(&clocks.list);
+
+	ptp_class = class_create(THIS_MODULE, "ptp");
+	if (IS_ERR(ptp_class)) {
+		printk(KERN_ERR "ptp: failed to allocate class\n");
+		return PTR_ERR(ptp_class);
+	}
+	ptp_class->dev_attrs = ptp_attrs;
+
+	err = alloc_chrdev_region(&ptp_devt, 0, PTP_MAX_CLOCKS, "ptp");
+	if (err < 0) {
+		printk(KERN_ERR "ptp: failed to allocate char device region\n");
+		goto no_region;
+	}
+
+	pr_info("PTP clock support registered\n");
+	return 0;
+
+no_region:
+	class_destroy(ptp_class);
+	return err;
+}
+
+subsys_initcall(ptp_init);
+module_exit(ptp_exit);
+
+MODULE_AUTHOR("Richard Cochran <richard.cochran@omicron.at>");
+MODULE_DESCRIPTION("PTP clocks support");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index 2fc8e14..9959fe4 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -140,6 +140,7 @@ header-y += pkt_sched.h
 header-y += posix_types.h
 header-y += ppdev.h
 header-y += prctl.h
+header-y += ptp_clock.h
 header-y += qnxtypes.h
 header-y += qnx4_fs.h
 header-y += radeonfb.h
diff --git a/include/linux/ptp_clock.h b/include/linux/ptp_clock.h
new file mode 100644
index 0000000..5a509c5
--- /dev/null
+++ b/include/linux/ptp_clock.h
@@ -0,0 +1,79 @@
+/*
+ * PTP 1588 clock support - user space interface
+ *
+ * Copyright (C) 2010 OMICRON electronics GmbH
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#ifndef _PTP_CLOCK_H_
+#define _PTP_CLOCK_H_
+
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+#define PTP_ENABLE_FEATURE (1<<0)
+#define PTP_RISING_EDGE    (1<<1)
+#define PTP_FALLING_EDGE   (1<<2)
+
+enum ptp_request_types {
+	PTP_REQUEST_EXTTS,
+	PTP_REQUEST_PEROUT,
+	PTP_REQUEST_PPS,
+};
+
+struct ptp_clock_caps {
+	__s32 max_adj; /* Maximum frequency adjustment, parts per billion. */
+	int n_alarm;   /* Number of programmable alarms. */
+	int n_ext_ts;  /* Number of external time stamp channels. */
+	int n_per_out; /* Number of programmable periodic signals. */
+	int pps;       /* Whether the clock supports a PPS callback. */
+};
+
+struct ptp_clock_timer {
+	int alarm_index;       /* Which alarm to query or configure. */
+	int signum;            /* Requested signal. */
+	int flags;             /* Zero or TIMER_ABSTIME, see TIMER_SETTIME(2) */
+	struct itimerspec tsp; /* See TIMER_SETTIME(2) */
+};
+
+struct ptp_clock_request {
+	int type;  /* One of the ptp_request_types enumeration values. */
+	int index; /* Which channel to configure. */
+	struct timespec ts; /* For periodic signals, the desired period. */
+	int flags; /* Bit field for PTP_ENABLE_FEATURE or other flags. */
+};
+
+struct ptp_extts_event {
+	int index;
+	struct timespec ts;
+};
+
+#define PTP_CLOCK_VERSION 0x00000001
+
+#define PTP_CLK_MAGIC '='
+
+#define PTP_CLOCK_APIVERS _IOR (PTP_CLK_MAGIC, 1, __u32)
+#define PTP_CLOCK_ADJFREQ _IO  (PTP_CLK_MAGIC, 2)
+#define PTP_CLOCK_ADJTIME _IOW (PTP_CLK_MAGIC, 3, struct timespec)
+#define PTP_CLOCK_GETTIME _IOR (PTP_CLK_MAGIC, 4, struct timespec)
+#define PTP_CLOCK_SETTIME _IOW (PTP_CLK_MAGIC, 5, struct timespec)
+
+#define PTP_CLOCK_GETCAPS   _IOR  (PTP_CLK_MAGIC, 6, struct ptp_clock_caps)
+#define PTP_CLOCK_GETTIMER  _IOWR (PTP_CLK_MAGIC, 7, struct ptp_clock_timer)
+#define PTP_CLOCK_SETTIMER  _IOW  (PTP_CLK_MAGIC, 8, struct ptp_clock_timer)
+#define PTP_FEATURE_REQUEST _IOW  (PTP_CLK_MAGIC, 9, struct ptp_clock_request)
+
+#endif
diff --git a/include/linux/ptp_clock_kernel.h b/include/linux/ptp_clock_kernel.h
new file mode 100644
index 0000000..b1fb2a7
--- /dev/null
+++ b/include/linux/ptp_clock_kernel.h
@@ -0,0 +1,137 @@
+/*
+ * PTP 1588 clock support
+ *
+ * Copyright (C) 2010 OMICRON electronics GmbH
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#ifndef _PTP_CLOCK_KERNEL_H_
+#define _PTP_CLOCK_KERNEL_H_
+
+#include <linux/ptp_clock.h>
+
+/**
+ * struct ptp_clock_info - describes a PTP hardware clock
+ *
+ * @owner:     The clock driver should set to THIS_MODULE.
+ * @name:      A short name to identify the clock.
+ * @max_adj:   The maximum possible frequency adjustment, in parts per billion.
+ * @n_alarm:   The number of programmable alarms.
+ * @n_ext_ts:  The number of external time stamp channels.
+ * @n_per_out: The number of programmable periodic signals.
+ * @pps:       Indicates whether the clock supports a PPS callback.
+ * @priv:      Passed to the clock operations, for the driver's private use.
+ *
+ * clock operations
+ *
+ * @adjfreq:  Adjusts the frequency of the hardware clock.
+ *            parameter delta: Desired period change in parts per billion.
+ *
+ * @adjtime:  Shifts the time of the hardware clock.
+ *            parameter ts: Desired change in seconds and nanoseconds.
+ *
+ * @gettime:  Reads the current time from the hardware clock.
+ *            parameter ts: Holds the result.
+ *
+ * @settime:  Set the current time on the hardware clock.
+ *            parameter ts: Time value to set.
+ *
+ * @gettimer: Reads the time remaining from the given timer.
+ *            parameter index: Which alarm to query.
+ *            parameter ts: Holds the result.
+ *
+ * @settimer: Arms the given timer for periodic or one shot operation.
+ *            parameter index: Which alarm to set.
+ *            parameter abs: TIMER_ABSTIME, or zero for relative timer.
+ *            parameter ts: Alarm time and period to set.
+ *
+ * @enable:   Request driver to enable or disable an ancillary feature.
+ *            parameter request: Desired resource to enable or disable.
+ *            parameter on: Caller passes one to enable or zero to disable.
+ *
+ * The callbacks must all return zero on success, non-zero otherwise.
+ */
+
+struct ptp_clock_info {
+	struct module *owner;
+	char name[16];
+	s32 max_adj;
+	int n_alarm;
+	int n_ext_ts;
+	int n_per_out;
+	int pps;
+	void *priv;
+	int (*adjfreq)(void *priv, s32 delta);
+	int (*adjtime)(void *priv, struct timespec *ts);
+	int (*gettime)(void *priv, struct timespec *ts);
+	int (*settime)(void *priv, struct timespec *ts);
+	int (*gettimer)(void *priv, int index, struct itimerspec *ts);
+	int (*settimer)(void *priv, int index, int abs, struct itimerspec *ts);
+	int (*enable)(void *priv, struct ptp_clock_request *request, int on);
+};
+
+struct ptp_clock;
+
+/**
+ * ptp_clock_register() - register a PTP hardware clock driver
+ *
+ * @info:  Structure describing the new clock.
+ */
+
+extern struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info);
+
+/**
+ * ptp_clock_unregister() - unregister a PTP hardware clock driver
+ *
+ * @ptp:  The clock to remove from service.
+ */
+
+extern int ptp_clock_unregister(struct ptp_clock *ptp);
+
+
+enum ptp_clock_events {
+	PTP_CLOCK_ALARM,
+	PTP_CLOCK_EXTTS,
+	PTP_CLOCK_PPS,
+};
+
+/**
+ * struct ptp_clock_event - describes a PTP hardware clock event
+ *
+ * @type:  One of the ptp_clock_events enumeration values.
+ * @index: Identifies the source of the event.
+ * @timestamp: When the event occurred.
+ */
+
+struct ptp_clock_event {
+	int type;
+	int index;
+	u64 timestamp;
+};
+
+/**
+ * ptp_clock_event() - notify the PTP layer about an event
+ *
+ * This function should only be called from interrupt context.
+ *
+ * @ptp:    The clock obtained from ptp_clock_register().
+ * @event:  Message structure describing the event.
+ */
+
+extern void ptp_clock_event(struct ptp_clock *ptp,
+			    struct ptp_clock_event *event);
+
+#endif
-- 
1.6.3.3
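
For readers following the callback descriptions in ptp_clock_kernel.h
above, here is a minimal user-space sketch of what an adjtime/gettime
pair could look like. It is not taken from any of the drivers in this
series: struct sketch_clock, sketch_adjtime() and sketch_gettime() are
hypothetical names, and the clock state lives in memory rather than in
hardware registers. It only illustrates the documented contract: shift
the time by a seconds/nanoseconds delta and return zero on success.

```c
#include <assert.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Simulated clock state; a real driver would keep this in hardware. */
struct sketch_clock {
	struct timespec now;
};

/* Same shape as the adjtime callback in struct ptp_clock_info. */
static int sketch_adjtime(void *priv, struct timespec *ts)
{
	struct sketch_clock *clk = priv;

	assert(clk != NULL && ts != NULL);

	clk->now.tv_sec += ts->tv_sec;
	clk->now.tv_nsec += ts->tv_nsec;

	/* Bring tv_nsec back into the range [0, 1e9). */
	while (clk->now.tv_nsec >= SKETCH_NSEC_PER_SEC) {
		clk->now.tv_sec++;
		clk->now.tv_nsec -= SKETCH_NSEC_PER_SEC;
	}
	while (clk->now.tv_nsec < 0) {
		clk->now.tv_sec--;
		clk->now.tv_nsec += SKETCH_NSEC_PER_SEC;
	}

	return 0; /* zero on success, as the header requires */
}

/* Same shape as the gettime callback: ts holds the result. */
static int sketch_gettime(void *priv, struct timespec *ts)
{
	struct sketch_clock *clk = priv;

	assert(clk != NULL && ts != NULL);
	*ts = clk->now;
	return 0;
}
```

A driver would presumably pass its own state through the priv pointer
in the same way; the normalization loops are just one simple way to
keep the nanoseconds field in range after a negative adjustment.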



* [PATCH v3 0/3] ptp: IEEE 1588 clock support
From: Richard Cochran @ 2010-05-14 16:44 UTC (permalink / raw)
  To: netdev; +Cc: linuxppc-dev, devicetree-discuss

Now and again there has been talk on this list of adding PTP support
into Linux. One part of the picture is already in place, the
SO_TIMESTAMPING API for hardware time stamping. This patch set offers
the missing second part needed for complete IEEE 1588 support.

The only feature still to be implemented is the hook into the PPS
subsystem, to synchronize the Linux clock to the PTP clock.

Enjoy,
Richard

* Patch ChangeLog
** v3
*** general
   - Added documentation on writing clock drivers.
   - Added the ioctls for the ancillary clock features.
   - Changed wrong subsys_initcall() to module_init() in clock drivers.
   - Removed the (too coarse) character device mutex.
   - Setting the clock now requires CAP_SYS_TIME.
*** gianfar
   - Added alarm feature.
   - Added device tree node binding description.
   - Added fine grain locking of the clock registers.
   - Added the external time stamp feature.
   - Added white space for better style.
   - Converted base+offset to structure pointers for register access.
   - When removing the driver, we now disable all PTP functions.

** v2
   - Changed clock list from a static array into a dynamic list. Also,
     use a bitmap to manage the clock's minor numbers.
   - Replaced character device semaphore with a mutex.
   - Drop .ko from module names in Kbuild help.
   - Replace deprecated unifdef-y with header-y for user space header file.
   - Added links to both of the ptpd patches on sourceforge.
   - Gianfar driver now gets parameters from device tree.
   - Added API documentation to Documentation/ptp/ptp.txt


Richard Cochran (3):
  ptp: Added a brand new class driver for ptp clocks.
  ptp: Added a clock that uses the Linux system time.
  ptp: Added a clock that uses the eTSEC found on the MPC85xx.

 Documentation/powerpc/dts-bindings/fsl/tsec.txt |   56 +++
 Documentation/ptp/ptp.txt                       |   95 ++++
 Documentation/ptp/testptp.c                     |  245 +++++++++++
 Documentation/ptp/testptp.mk                    |   33 ++
 arch/powerpc/boot/dts/mpc8313erdb.dts           |   14 +
 arch/powerpc/boot/dts/p2020ds.dts               |   14 +
 arch/powerpc/boot/dts/p2020rdb.dts              |   14 +
 drivers/Kconfig                                 |    2 +
 drivers/Makefile                                |    1 +
 drivers/net/Makefile                            |    1 +
 drivers/net/gianfar_ptp.c                       |  521 +++++++++++++++++++++++
 drivers/net/gianfar_ptp_reg.h                   |  113 +++++
 drivers/ptp/Kconfig                             |   51 +++
 drivers/ptp/Makefile                            |    6 +
 drivers/ptp/ptp_clock.c                         |  512 ++++++++++++++++++++++
 drivers/ptp/ptp_linux.c                         |  136 ++++++
 include/linux/Kbuild                            |    1 +
 include/linux/ptp_clock.h                       |   79 ++++
 include/linux/ptp_clock_kernel.h                |  137 ++++++
 kernel/time/ntp.c                               |    2 +
 20 files changed, 2033 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ptp/ptp.txt
 create mode 100644 Documentation/ptp/testptp.c
 create mode 100644 Documentation/ptp/testptp.mk
 create mode 100644 drivers/net/gianfar_ptp.c
 create mode 100644 drivers/net/gianfar_ptp_reg.h
 create mode 100644 drivers/ptp/Kconfig
 create mode 100644 drivers/ptp/Makefile
 create mode 100644 drivers/ptp/ptp_clock.c
 create mode 100644 drivers/ptp/ptp_linux.c
 create mode 100644 include/linux/ptp_clock.h
 create mode 100644 include/linux/ptp_clock_kernel.h



* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Patrick McHardy @ 2010-05-14 16:42 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Scott Feldman, davem, netdev, chrisw
In-Reply-To: <201005141412.01578.arnd@arndb.de>

Arnd Bergmann wrote:
> On Friday 14 May 2010, Patrick McHardy wrote:
>>> +static int rtnl_vf_port_fill_nest(struct sk_buff *skb, struct net_device *dev,
>>> +				  int vf)
>>> +{
>>> +	struct nlattr *data;
>>> +	int err;
>>> +
>>> +	data = nla_nest_start(skb, IFLA_VF_PORT);
>> We usually use a top-level attribute to encapsulate lists of identical
>> attributes. The other iflink attributes may only occur once and are
>> usually parsed using nla_parse_nested(), which will parse all
>> IFLA_VF_PORT attributes, but only return the last one.
>>
>> Something like:
>>
>> iflink message:
>> ...
>> [IFLA_VF_PORTS]
>>   [IFLA_VF_PORT]
>>     [IFLA_VF_PORT_*], ...
>>   [IFLA_VF_PORT]
>>     [IFLA_VF_PORT_*], ...
>>   ...
> 
> Ah, I was wondering about this already. Does this mean that IFLA_VFINFO
> does this incorrectly as well?

Yes.

>>>  static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>>>  			    int type, u32 pid, u32 seq, u32 change,
>>>  			    unsigned int flags)
>>> @@ -747,17 +819,23 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>>>  		goto nla_put_failure;
>>>  	copy_rtnl_link_stats64(nla_data(attr), stats);
>>>  
>>> +	if (dev->dev.parent)
>>> +		NLA_PUT_U32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent));
>> Just wondering, is the only case where dev.parent is non-NULL
>> really when virtual ports are present?
> 
> No, but if parent is NULL, we must not call dev_num_vf(). The way that enic
> needs the attributes, they can be either for the VF of dev->dev.parent (the
> PCI PF), or for the PF itself, even if it does not have VFs, in which case
> it would be interesting to have IFLA_NUM_VF = 0 in the output.

I see. I was mainly wondering about completely different types of
devices.

> Maybe a better structure would be to separate the two cases, also allowing
> a port profile to be associated with both the PF and with each of its VFs?
> 
> Something like this:
> 
> [IFLA_NUM_VF]
> [IFLA_VF_PORTS]
>   [IFLA_VF_PORT]
>     [IFLA_VF_PORT_*], ...
>   [IFLA_VF_PORT]
>     [IFLA_VF_PORT_*], ...
> [IFLA_PORT_SELF]
>   [IFLA_VF_PORT_*], ...

That would also be fine.


* RE: loosing IPMI-card by loading netconsole
From: Ronciak, John @ 2010-05-14 16:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Henning Fehrmann, Kirsher, Jeffrey T, Brandeburg, Jesse,
	Allan, Bruce W, Waskiewicz Jr, Peter P, netdev@vger.kernel.org,
	Matt Mackall, Carsten Aulbert
In-Reply-To: <4BED79EB.1000204@kernel.org>


> -----Original Message-----
> From: Tejun Heo [mailto:tj@kernel.org]
> Sent: Friday, May 14, 2010 9:27 AM
> To: Ronciak, John
> Cc: Henning Fehrmann; Kirsher, Jeffrey T; Brandeburg, Jesse; Allan,
> Bruce W; Waskiewicz Jr, Peter P; netdev@vger.kernel.org; Matt Mackall;
> Carsten Aulbert
> Subject: Re: loosing IPMI-card by loading netconsole
> 
> Hello, John.
> 
> As Henning seems offline, I'll try to fill in.
> 
> On 05/14/2010 04:51 PM, Ronciak, John wrote:
> > Sorry to hear about the problem you are having Henning.  What do you
> > mean when you say "it disappears"?
> 
> It stops responding to IPMI requests.
> 
> > Can both eth0 and eth1 ping (or be pinged)?  Do all the networking
> > devices still show up in the system when you do an 'lspci'?
> 
> Yeah, everything other than IPMI works just fine.
> 
> > What happens if you down and then up the interface you are having
> > problems with?  Does 'rmmod' do the same thing as your removal
> method?
> 
> Haven't tried these but well I think rmmoding should achieve about the
> same thing.
> 
> > Is there anything in the system logs saying anything about the
> > interfaces?
> 
> Nope.
> 
> > We have not had reports of this so this is a bit unusual.  Please let
> us know.
> >
> > Does this happen on other systems as well or just one particular
> system?
> 
> Yeah, it happens on at least several hundred machines, so not an
> isolated hardware issue at all.
> 
> To sum up.
> 
> On 2.6.27.39, netconsole + IPMI works fine.  On 2.6.32.7, as soon as
> netconsole is loaded, IPMI stops working.  Unloading netconsole doesn't
> revive IPMI but detaching the driver from the controller does.
> In both cases, usual networking works fine.
> 
> Thanks.
> 
> --
> Tejun
Thanks Tejun.

Since basic networking seems to be operational, could it be that with the newer kernel the tunnels you had set up no longer exist or have been disabled?  And when this interface is removed and set up again, does that fix things?

Cheers,
John



* SR-IOV PCI quirk for 82599?
From: Fischer, Anna @ 2010-05-14 16:26 UTC (permalink / raw)
  To: netdev@vger.kernel.org, e1000-devel@lists.sourceforge.net

There is a PCI quirk for the 82576 controller that programs the PCI BARs to use Flash memory if the BIOS has not allocated resources for the SR-IOV VF BARs.

Is there a similar quirk for the 82599, or can even the same one be used for that device?

Thanks,
Anna 


* Re: loosing IPMI-card by loading netconsole
From: Tejun Heo @ 2010-05-14 16:27 UTC (permalink / raw)
  To: Ronciak, John
  Cc: Henning Fehrmann, Kirsher, Jeffrey T, Brandeburg, Jesse,
	Allan, Bruce W, Waskiewicz Jr, Peter P, netdev@vger.kernel.org,
	Matt Mackall, Carsten Aulbert
In-Reply-To: <DDC57477F5D6F845A0DDCB99D3C4812D0CA63BB9E4@orsmsx510.amr.corp.intel.com>

Hello, John.

As Henning seems offline, I'll try to fill in.

On 05/14/2010 04:51 PM, Ronciak, John wrote:
> Sorry to hear about the problem you are having Henning.  What do you
> mean when you say "it disappears"?

It stops responding to IPMI requests.

> Can both eth0 and eth1 ping (or be pinged)?  Do all the networking
> devices still show up in the system when you do an 'lspci'?

Yeah, everything other than IPMI works just fine.

> What happens if you down and then up the interface you are having
> problems with?  Does 'rmmod' do the same thing as your removal
> method?

Haven't tried these but well I think rmmoding should achieve about the
same thing.

> Is there anything in the system logs saying anything about the
> interfaces?

Nope.

> We have not had reports of this so this is a bit unusual.  Please let us know.
> 
> Does this happen on other systems as well or just one particular system?

Yeah, it happens on at least several hundred machines, so not an
isolated hardware issue at all.

To sum up.

On 2.6.27.39, netconsole + IPMI works fine.  On 2.6.32.7, as soon as
netconsole is loaded, IPMI stops working.  Unloading netconsole
doesn't revive IPMI but detaching the driver from the controller does.
In both cases, usual networking works fine.

Thanks.

-- 
tejun


* Re: [RFC] NF: IP tables idletimer target implementation
From: Patrick McHardy @ 2010-05-14 16:03 UTC (permalink / raw)
  To: Luciano Coelho; +Cc: netdev, Timo Teras, Netfilter Development Mailinglist
In-Reply-To: <1273841458-10443-1-git-send-email-luciano.coelho@nokia.com>

Please CC netfilter-devel on future submissions.

Luciano Coelho wrote:
> It adds a file to the sysfs for each interface that is brought up.  The file
> contains the time remaining before the event is triggered.  This file can
> also be used to set the timer manually.

What is this used for? It doesn't seem too smart to poll manually
if you get an event anyways, and the timeout can already be set
per rule.

> diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
> index 1833bdb..91fba9a 100644
> --- a/net/ipv4/netfilter/Kconfig
> +++ b/net/ipv4/netfilter/Kconfig
> @@ -204,6 +204,23 @@ config IP_NF_TARGET_REDIRECT
>  
>  	  To compile it as a module, choose M here.  If unsure, say N.
>  
> +config IP_NF_TARGET_IDLETIMER

This should be a x_tables target, there's nothing IPv4-specific
about it.

> diff --git a/net/ipv4/netfilter/ipt_IDLETIMER.c b/net/ipv4/netfilter/ipt_IDLETIMER.c
> new file mode 100644
> index 0000000..2c5b465
> --- /dev/null
> +++ b/net/ipv4/netfilter/ipt_IDLETIMER.c

> +
> +#ifdef CONFIG_IP_NF_TARGET_IDLETIMER_DEBUG
> +#define DEBUGP(format, args...) printk(KERN_DEBUG \
> +				       "ipt_IDLETIMER:%s:" format "\n", \
> +				       __func__ , ## args)
> +#else
> +#define DEBUGP(format, args...)
> +#endif

Please use pr_debug and get rid of the config option.

> +
> +/*
> + * Internal timer management.
> + */
> +static ssize_t utimer_attr_show(struct device *dev,
> +				struct device_attribute *attr, char *buf);
> +static ssize_t utimer_attr_store(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf, size_t count);
> +
> +struct utimer_t {
> +	char name[IFNAMSIZ];
> +	struct list_head entry;
> +	struct timer_list timer;
> +	struct work_struct work;
> +	struct net *net;
> +};
> +
> +static LIST_HEAD(active_utimer_head);
> +static DEFINE_SPINLOCK(list_lock);
> +static DEVICE_ATTR(idletimer, 0644, utimer_attr_show, utimer_attr_store);
> +
> +static void utimer_delete(struct utimer_t *timer)
> +{
> +	DEBUGP("Deleting timer '%s'\n", timer->name);
> +
> +	list_del(&timer->entry);
> +	del_timer_sync(&timer->timer);
> +	put_net(timer->net);
> +	kfree(timer);
> +}
> +
> +static void utimer_work(struct work_struct *work)
> +{
> +	struct utimer_t *timer = container_of(work, struct utimer_t, work);
> +	struct net_device *netdev = NULL;

Unnecessary initialization.

> +
> +	netdev = dev_get_by_name(timer->net, timer->name);
> +
> +	if (netdev != NULL) {
> +		sysfs_notify(&netdev->dev.kobj, NULL,
> +			     "idletimer");
> +		dev_put(netdev);
> +	}
> +}
> +
> +static void utimer_expired(unsigned long data)
> +{
> +	struct utimer_t *timer = (struct utimer_t *) data;
> +
> +	DEBUGP("Timer '%s' expired\n", timer->name);
> +
> +	spin_lock_bh(&list_lock);
> +	utimer_delete(timer);
> +	spin_unlock_bh(&list_lock);
> +
> +	schedule_work(&timer->work);

Use after free, utimer_delete() frees the timer.
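
One possible way out, sketched in user space with hypothetical names
(struct utimer_sketch and the two functions below are stand-ins, not
the patch's code): let the expiry handler only unlink the entry, and
let the deferred work, as the last user of the object, do the free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct utimer_t; not the patch's definition. */
struct utimer_sketch {
	char name[16];
};

/*
 * Stands in for the deferred work handler. It is the last user of the
 * object, so it reads what it needs and only then frees it.
 */
static size_t utimer_work_sketch(struct utimer_sketch *timer)
{
	size_t len;

	assert(timer != NULL);
	len = strlen(timer->name); /* still valid: nothing freed yet */

	free(timer);
	return len;
}

/*
 * Stands in for the expiry handler: it would unlink the entry from the
 * list here, but it leaves the memory alive for the work handler.
 */
static size_t utimer_expired_sketch(struct utimer_sketch *timer)
{
	/* In the kernel this would be schedule_work(&timer->work). */
	return utimer_work_sketch(timer);
}
```

In the real code the work item is embedded in the timer structure, so
simply swapping the order of utimer_delete() and schedule_work() would
likely not be enough either; the memory has to stay valid until the
work has run.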

> +}
> +
> +static struct utimer_t *utimer_create(const char *name,
> +				      struct net *net)
> +{
> +	struct utimer_t *timer;
> +
> +	timer = kmalloc(sizeof(struct utimer_t), GFP_ATOMIC);
> +	if (timer == NULL)
> +		return NULL;
> +
> +	list_add(&timer->entry, &active_utimer_head);
> +	strlcpy(timer->name, name, sizeof(timer->name));
> +	timer->net = get_net(net);

How does this handle namespace exit?

> +
> +	init_timer(&timer->timer);
> +	timer->timer.function = utimer_expired;
> +	timer->timer.data = (unsigned long) timer;

setup_timer()

> +
> +	INIT_WORK(&timer->work, utimer_work);
> +
> +	DEBUGP("Created timer '%s'\n", timer->name);
> +
> +	return timer;
> +}
> +
> +static struct utimer_t *__utimer_find(const char *name, const struct net *net)
> +{
> +	struct utimer_t *entry;
> +
> +	list_for_each_entry(entry, &active_utimer_head, entry) {
> +		if (!strcmp(name, entry->name) && net == entry->net)
> +			return entry;
> +	}
> +
> +	return NULL;
> +}
> +
> +static void utimer_modify(const char *name,
> +			  struct net *net,
> +			  unsigned long expires)
> +{
> +	struct utimer_t *timer;
> +
> +	DEBUGP("Modifying timer '%s'\n", name);
> +	spin_lock_bh(&list_lock);
> +	timer = __utimer_find(name, net);

So you're scanning the list up to twice per packet? That seems
highly suboptimal, why not create the timer when the rule is
created and only update the timeout? You could use the interfaces
specified in struct ipt_ip.

> +	if (timer == NULL)
> +		timer = utimer_create(name, net);
> +	mod_timer(&timer->timer, expires);
> +	spin_unlock_bh(&list_lock);
> +}
> +
> +static ssize_t utimer_attr_show(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	struct utimer_t *timer;
> +	struct net_device *netdev = to_net_dev(dev);
> +	unsigned long expires = 0;
> +
> +	spin_lock_bh(&list_lock);
> +	timer = __utimer_find(netdev->name, dev_net(netdev));
> +	if (timer)
> +		expires = timer->timer.expires;
> +	spin_unlock_bh(&list_lock);
> +
> +	if (expires)
> +		return sprintf(buf, "%lu\n", (expires-jiffies) / HZ);
> +
> +	return sprintf(buf, "0\n");
> +}
> +
> +static ssize_t utimer_attr_store(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf, size_t count)
> +{
> +	int expires;
> +	struct net_device *netdev = to_net_dev(dev);
> +
> +	if (sscanf(buf, "%d", &expires) == 1) {
> +		if (expires > 0)

Using %u seems better.
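
Roughly like this, as a user-space sketch (parse_timeout_sketch() is a
hypothetical helper, not something in the patch). Note that the C
library sscanf() still accepts a leading minus sign for %u and wraps
the value, so this only shows the type change, not full validation.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a timeout in seconds from a sysfs-style buffer.
 * Returns 1 and stores the value on success, 0 if the input is not a
 * number or is zero.
 */
static int parse_timeout_sketch(const char *buf, unsigned int *seconds)
{
	unsigned int val;

	assert(buf != NULL && seconds != NULL);

	if (sscanf(buf, "%u", &val) != 1)
		return 0;
	if (val == 0)
		return 0;

	*seconds = val;
	return 1;
}
```

With an unsigned variable the "expires > 0" test reduces to a check
for zero, and the cast to unsigned long before the multiplication by
HZ no longer converts from a signed type.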

> +			utimer_modify(netdev->name,
> +				      dev_net(netdev),
> +				      jiffies+HZ*(unsigned long)expires);
> +	}
> +
> +	return count;
> +}
> +
> +static int utimer_notifier_call(struct notifier_block *this,
> +				unsigned long event, void *ptr)
> +{
> +	struct net_device *netdev = ptr;
> +	int ret;
> +
> +	switch (event) {
> +	case NETDEV_UP:
> +		DEBUGP("NETDEV_UP: %s\n", netdev->name);
> +		ret = device_create_file(&netdev->dev,
> +					 &dev_attr_idletimer);
> +		WARN_ON(ret);
> +
> +		break;
> +	case NETDEV_DOWN:
> +		DEBUGP("NETDEV_DOWN: %s\n", netdev->name);
> +		device_remove_file(&netdev->dev,
> +				   &dev_attr_idletimer);
> +		break;
> +	}
> +
> +	return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block utimer_notifier_block = {
> +	.notifier_call	= utimer_notifier_call,
> +};
> +
> +
> +static int utimer_init(void)
> +{
> +	return register_netdevice_notifier(&utimer_notifier_block);
> +}
> +
> +static void utimer_fini(void)
> +{
> +	struct utimer_t *entry, *next;
> +	struct net_device *dev;
> +	struct net *net;
> +
> +	list_for_each_entry_safe(entry, next, &active_utimer_head, entry)
> +		utimer_delete(entry);
> +
> +	rtnl_lock();

deadlock? unregister_netdevice_notifier() already takes the RTNL.

> +	unregister_netdevice_notifier(&utimer_notifier_block);
> +	for_each_net(net) {
> +		for_each_netdev(net, dev) {
> +			utimer_notifier_call(&utimer_notifier_block,
> +					     NETDEV_DOWN, dev);
> +		}
> +	}
> +	rtnl_unlock();
> +}
> +
> +/*
> + * The actual iptables plugin.
> + */
> +static unsigned int ipt_idletimer_target(struct sk_buff *skb,
> +					 const struct xt_action_param *par)
> +{
> +	const struct ipt_idletimer_info *target = par->targinfo;
> +	unsigned long expires;
> +
> +	expires = jiffies + HZ*target->timeout;
> +
> +	if (par->in != NULL)
> +		utimer_modify(par->in->name,
> +			      dev_net(par->in),
> +			      expires);
> +
> +	if (par->out != NULL)
> +		utimer_modify(par->out->name,
> +			      dev_net(par->out),
> +			      expires);
> +
> +	return XT_CONTINUE;
> +}
> +
> +static int ipt_idletimer_checkentry(const struct xt_tgchk_param *par)
> +{
> +	const struct ipt_idletimer_info *info = par->targinfo;
> +
> +	if (info->timeout == 0) {
> +		DEBUGP("timeout value is zero\n");
> +		return false;
> +	}
> +
> +	return true;

The return convention in the current net-next tree is 0 for
no error or an errno code otherwise.

> +}


* Fw: [Bug 15974] New: kernel panic when squid in bridge mode
From: Stephen Hemminger @ 2010-05-14 15:24 UTC (permalink / raw)
  To: Bart De Schuymer, Patrick McHardy; +Cc: netdev



Begin forwarded message:

Date: Fri, 14 May 2010 08:52:07 GMT
From: bugzilla-daemon@bugzilla.kernel.org
To: shemminger@linux-foundation.org
Subject: [Bug 15974] New: kernel panic when squid in bridge mode


https://bugzilla.kernel.org/show_bug.cgi?id=15974

           Summary: kernel panic when squid in bridge mode
           Product: Networking
           Version: 2.5
    Kernel Version: 2.6.30.5
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: IPV4
        AssignedTo: shemminger@linux-foundation.org
        ReportedBy: senthilkumaar2021@gmail.com
        Regression: No


Hi, we are using squid tproxy in bridge mode. The kernel version is
2.6.30.5. About once every 10-15 hours we get a kernel panic message on
the screen. We are passing 100 Mbps of traffic through the bridge. The
following iptables and ebtables rules are used for squid:

iptables -t mangle -N DIVERT
iptables -t mangle -A DIVERT -j MARK --set-mark 1
iptables -t mangle -A DIVERT -j ACCEPT

iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY --tproxy-mark
0x1/0x1 --on-port 3129

ebtables -t broute -A BROUTING -i $CLIENT_IFACE -p ipv4 --ip-proto tcp
--ip-dport 80 -j redirect --redirect-target DROP

ebtables -t broute -A BROUTING -i $INET_IFACE -p ipv4 --ip-proto tcp --ip-sport
80 -j redirect --redirect-target DROP 


We also got a kernel panic with kernel 2.6.28.5.

The error is:

[<ffffffffa03933c2>] ? nf_nat_fn+0x138/0x14e [iptable_nat]
[<ffffffffa0393585>] ? nf_nat_in+0x2f/0x6e [iptable_nat]
[<ffffffffa027edaa>] ? br_nf_pre_routing_finish+0x0/0x2c4 [bridge]
[<ffffffffa027edfa>] br_nf_pre_routing_finish+0x50/0x2c4 [bridge]
[<ffffffffa027edaa>] ? br_nf_pre_routing_finish+0x0/0x2c4 [bridge]
[<ffffffff81339a50>] ? nf_hook_slow+0x68/0xc8
[<ffffffffa027edaa>] ? br_nf_pre_routing_finish+0x0/0x2c4 [bridge]
[<ffffffffa027f616>] br_nf_pre_routing+0x5a8/0x5c7 [bridge]
[<ffffffff813399ab>] nf_iterate+0x48/0x85
[<ffffffffa027a931>] ? br_handle_frame_finish+0x0/0x154 [bridge]
[<ffffffff81339a50>] nf_hook_slow+0x68/0xc8
[<ffffffffa027a931>] ? br_handle_frame_finish+0x0/0x154 [bridge]
[<ffffffffa027ac36>] br_handle_frame+0x1b1/0x1db [bridge]
[<ffffffff8131d54b>] netif_receive_skb+0x316/0x434
[<ffffffff8131dbfb>] napi_gro_receive+0x6e/0x83
[<ffffffffa0125bfe>] e1000_receive_skb+0x5c/0x65 [e1000e]
[<ffffffffa0125de8>] e1000_clean_rx_irq+0x1e1/0x28f [e1000e]
[<ffffffffa012730e>] e1000_clean+0x99/0x24a [e1000e]
[<ffffffff813bcfc5>] ? _spin_unlock_irqrestore+0x2c/0x43
[<ffffffff8131ba62>] net_rx_action+0xb8/0x1b4
[<ffffffff8104ed43>] __do_softirq+0x99/0x152
[<ffffffff8101284c>] call_softirq+0x1c/0x30
[<ffffffff81013a02>] do_softirq+0x52/0xb9
[<ffffffff8104e969>] irq_exit+0x53/0x8d
[<ffffffff81013d1a>] do_IRQ+0x135/0x157
[<ffffffff81011f93>] ret_from_intr+0x0/0x2e
<EOI> [<ffffffff81017e20>] ? mwait_idle+0x9e/0xc7
[<ffffffff81017e17>] ? mwait_idle+0x95/0xc7
[<ffffffff813bfd20>] ? atomic_notifier_call_chain+0x13/0x15
[<ffffffff810102f4>] ? enter_idle+0x27/0x29


Please help me in fixing the issue

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


-- 


* Re: [PATCH 0/6] sky2: update
From: Stephen Hemminger @ 2010-05-14 15:19 UTC (permalink / raw)
  To: David Miller; +Cc: mikem, netdev
In-Reply-To: <20100514.031501.57481177.davem@davemloft.net>

On Fri, 14 May 2010 03:15:01 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Stephen Hemminger <shemminger@vyatta.com>
> Date: Thu, 13 May 2010 09:12:47 -0700
> 
> > Bunch of patches from Mike, with some additional comments.
> 
> All applied to net-next-2.6, thanks.

The first one needs to go to net-2.6 because it fixes a regression:
Current code will lose multicast addresses when the automatic
recovery from stuck chip happens. Auto recovery happens a lot
under load on some configurations.

-- 


* RE: loosing IPMI-card by loading netconsole
From: Ronciak, John @ 2010-05-14 14:51 UTC (permalink / raw)
  To: Henning Fehrmann, Kirsher, Jeffrey T
  Cc: Brandeburg, Jesse, Allan, Bruce W, Waskiewicz Jr, Peter P,
	netdev@vger.kernel.org, Matt Mackall, Carsten Aulbert, Tejun Heo
In-Reply-To: <20100514134544.GA26674@gretchen.aei.mpg.de>

Sorry to hear about the problem you are having Henning.  What do you mean when you say "it disappears"?  Can both eth0 and eth1 ping (or be pinged)?  Do all the networking devices still show up in the system when you do an 'lspci'?  What happens if you down and then up the interface you are having problems with?  Does 'rmmod' do the same thing as your removal method?  Is there anything in the system logs saying anything about the interfaces?

We have not had reports of this so this is a bit unusual.  Please let us know.

Does this happen on other systems as well or just one particular system?


Cheers,
John


> -----Original Message-----
> From: Henning Fehrmann [mailto:henning.fehrmann@aei.mpg.de]
> Sent: Friday, May 14, 2010 6:46 AM
> To: Kirsher, Jeffrey T
> Cc: Brandeburg, Jesse; Allan, Bruce W; Waskiewicz Jr, Peter P; Ronciak,
> John; netdev@vger.kernel.org; Matt Mackall; Carsten Aulbert; Tejun Heo
> Subject: loosing IPMI-card by loading netconsole
> 
> Hello,
> 
> We have SuperMicro PDSM 2+ boards together with 82573E and 82573L Intel
> NICs.
> Additionally, we have IPMI cards which are tunneled via the 82573E NIC.
> 
> We are using the e1000e driver for NICs together with the netconsole
> driver.
> (netconsole netconsole=4444@client_IP/eth1,514@server_IP/server_MAC).
> Netconsole is using the NIC which is NOT used by the IPMI card.
> 
> Usually we are able to access the IPMI card remotely with ipmitools.
> 
> Having the kernel 2.6.27.39 installed everything worked together, the
> NICs, remotely accessing the IPMI cards and netconsole.
> 
> The driver version of e1000e is 0.3.3.3-k6.
> 
> Using a more recent kernel: 2.6.32.7 we lost the ability of remotely
> accessing the IPMI card when the netconsole driver is loaded.
> The IPMI card is accessible before the netconsole driver is loaded and
> disappears once we use netconsole. Even unloading netconsole does not
> help then.
> 
> We get the IPMI card back when 'removing' eth0:
> echo 1 > /sys/devices/pci0000:00/0000:00:1c.4/0000:0d:00.0/remove
> 
> 0d:00.0 is eth0.
> 
> 
> The version of the e1000e driver is 1.0.2-k2.
> 
> I compiled and loaded a later version of this driver (1.1.19) without
> solving this problem.
> 
> 
> For eth0 (82573E) we have a firmware version 0.15-4 installed and for
> eth1 we use the firmware version 0.5-7. But this is the same for both
> kernel versions.
> 
> Do you have an idea?
> 
> Thank you and cheers,
> Henning



* [PATCH] gianfar: Remove legacy PM callbacks
From: Anton Vorontsov @ 2010-05-14 14:27 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linuxppc-dev

These callbacks were needed because dev_pm_ops support for OF
platform devices was in the powerpc tree, and the patch that
added dev_pm_ops for gianfar driver was in the netdev tree. Now
that netdev and powerpc trees have merged into Linus' tree, we
can remove the legacy hooks.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
---
 drivers/net/gianfar.c |   14 --------------
 1 files changed, 0 insertions(+), 14 deletions(-)

diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 5d3763f..fb23f04 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -1288,21 +1288,9 @@ static struct dev_pm_ops gfar_pm_ops = {
 
 #define GFAR_PM_OPS (&gfar_pm_ops)
 
-static int gfar_legacy_suspend(struct of_device *ofdev, pm_message_t state)
-{
-	return gfar_suspend(&ofdev->dev);
-}
-
-static int gfar_legacy_resume(struct of_device *ofdev)
-{
-	return gfar_resume(&ofdev->dev);
-}
-
 #else
 
 #define GFAR_PM_OPS NULL
-#define gfar_legacy_suspend NULL
-#define gfar_legacy_resume NULL
 
 #endif
 
@@ -3055,8 +3043,6 @@ static struct of_platform_driver gfar_driver = {
 
 	.probe = gfar_probe,
 	.remove = gfar_remove,
-	.suspend = gfar_legacy_suspend,
-	.resume = gfar_legacy_resume,
 	.driver.pm = GFAR_PM_OPS,
 };
 
-- 
1.7.0.5


* [PATCH] fsl_pq_mdio: Fix mdiobus allocation handling
From: Anton Vorontsov @ 2010-05-14 14:27 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linuxppc-dev

The driver could return success code even if mdiobus_alloc() failed.
This patch fixes the issue.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
---
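A minimal userspace sketch of the error path this patch corrects, not the
driver code itself. The names demo_probe(), demo_bus_alloc() and the
fail_alloc switch are hypothetical and exist only for illustration. The point
is that leaving err uninitialized forces each failing branch to set a code
before jumping to the shared cleanup label, where the old "int err = 0" let
an allocation failure return success.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's private data and the MDIO bus. */
struct demo_priv { int unused; };
struct demo_bus { int unused; };

/* Stand-in allocator; fail_alloc simulates an allocation failure. */
static struct demo_bus *demo_bus_alloc(int fail_alloc)
{
	return fail_alloc ? NULL : calloc(1, sizeof(struct demo_bus));
}

/* Returns 0 on success or a negative errno value (kernel convention). */
static int demo_probe(int fail_alloc)
{
	struct demo_priv *priv;
	struct demo_bus *bus;
	int err;	/* no initializer: every failing branch must set it */

	priv = calloc(1, sizeof(*priv));
	if (!priv)
		return -ENOMEM;

	bus = demo_bus_alloc(fail_alloc);
	if (!bus) {
		err = -ENOMEM;	/* the assignment the old code was missing */
		goto err_free_priv;
	}

	/* A real driver would fill in and register the bus here. */
	free(bus);
	free(priv);
	return 0;

err_free_priv:
	free(priv);
	return err;
}
```

Calling demo_probe(1) exercises the failing branch and returns -ENOMEM, while
demo_probe(0) returns 0.
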
 drivers/net/fsl_pq_mdio.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/fsl_pq_mdio.c b/drivers/net/fsl_pq_mdio.c
index 3acac5f..ff028f5 100644
--- a/drivers/net/fsl_pq_mdio.c
+++ b/drivers/net/fsl_pq_mdio.c
@@ -277,15 +277,17 @@ static int fsl_pq_mdio_probe(struct of_device *ofdev,
 	int tbiaddr = -1;
 	const u32 *addrp;
 	u64 addr = 0, size = 0;
-	int err = 0;
+	int err;
 
 	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
 	if (!priv)
 		return -ENOMEM;
 
 	new_bus = mdiobus_alloc();
-	if (NULL == new_bus)
+	if (!new_bus) {
+		err = -ENOMEM;
 		goto err_free_priv;
+	}
 
 	new_bus->name = "Freescale PowerQUICC MII Bus",
 	new_bus->read = &fsl_pq_mdio_read,
-- 
1.7.0.5


^ permalink raw reply related

* losing IPMI-card by loading netconsole
From: Henning Fehrmann @ 2010-05-14 13:45 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Jesse Brandeburg, Bruce Allan, PJ Waskiewicz, John Ronciak,
	netdev, Matt Mackall, Carsten Aulbert, Tejun Heo

Hello,

We have SuperMicro PDSM 2+ boards together with 82573E and 82573L Intel NICs.
Additionally, we have IPMI cards which are tunneled via the 82573E NIC.

We are using the e1000e driver for NICs together with the netconsole driver.
(netconsole netconsole=4444@client_IP/eth1,514@server_IP/server_MAC). Netconsole
is using the NIC which is NOT used by the IPMI card.

Usually we are able to access the IPMI card remotely with ipmitools.

With kernel 2.6.27.39 installed, everything worked together: the NICs, remote
access to the IPMI cards, and netconsole.

The driver version of e1000e is 0.3.3.3-k6. 

With a more recent kernel, 2.6.32.7, we lose the ability to access the IPMI card remotely
when the netconsole driver is loaded.
The IPMI card is accessible before the netconsole driver is loaded and disappears
once we use netconsole. Even unloading netconsole does not bring it back.

We get the IPMI card back when 'removing' eth0:
echo 1 > /sys/devices/pci0000:00/0000:00:1c.4/0000:0d:00.0/remove

0d:00.0 is eth0.

The version of the e1000e driver is 1.0.2-k2.

I compiled and loaded a later version of this driver (1.1.19) without solving this problem.

For eth0 (82573E) we have a firmware version 0.15-4 installed and for eth1 we use the firmware 
version 0.5-7. But this is the same for both kernel versions. 

Do you have an idea? 

Thank you and cheers,
Henning

^ permalink raw reply

* [RFC] NF: IP tables idletimer target implementation
From: Luciano Coelho @ 2010-05-14 12:50 UTC (permalink / raw)
  To: netdev; +Cc: Timo Teras

This patch implements an idletimer IP tables target that can be used to
identify when interfaces have been idle for a certain period of time.

It adds a file to sysfs for each interface that is brought up.  The file
contains the time remaining before the event is triggered.  This file can
also be used to set the timer manually.

The default timeout should be set when the IP table rule is defined with the
--timeout parameter set.

This implementation was originally done by Timo Teras and a few other people
who have sent patches with updates and fixes.  It lived for a while in
the linux-omap tree, but was removed when linux-omap was aligned with
upstream.  The patch has now been forward-ported, which includes a few
changes related to net namespaces, x_tables etc.

While this is not the best approach for interface idle time monitoring, it is
non-intrusive and fits well in the existing architecture without any major
changes to the networking subsystem.

Cc: Timo Teras <timo.teras@iki.fi>
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
---
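A minimal userspace sketch of how the attribute described above might be
read, assuming it shows up as a file named "idletimer" in the interface's
sysfs directory (for example /sys/class/net/eth0/idletimer; the exact path is
an assumption, not something this patch states). The helper name
idletimer_read_seconds() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read the remaining idle time, in seconds, from an idletimer attribute
 * file.  Returns -1 if the file cannot be opened or does not start with
 * a decimal number.
 */
static long idletimer_read_seconds(const char *path)
{
	FILE *f = fopen(path, "r");
	unsigned long secs = 0;
	int matched;

	if (!f)
		return -1;

	matched = fscanf(f, "%lu", &secs);
	fclose(f);

	return matched == 1 ? (long)secs : -1;
}
```

A value of 0 means that no timer is currently pending for the interface,
matching the "0\n" that utimer_attr_show() prints in that case.
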
 include/linux/netfilter_ipv4/ipt_IDLETIMER.h |   22 ++
 net/ipv4/netfilter/Kconfig                   |   17 ++
 net/ipv4/netfilter/Makefile                  |    1 +
 net/ipv4/netfilter/ipt_IDLETIMER.c           |  320 ++++++++++++++++++++++++++
 4 files changed, 360 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/netfilter_ipv4/ipt_IDLETIMER.h
 create mode 100644 net/ipv4/netfilter/ipt_IDLETIMER.c

diff --git a/include/linux/netfilter_ipv4/ipt_IDLETIMER.h b/include/linux/netfilter_ipv4/ipt_IDLETIMER.h
new file mode 100644
index 0000000..89993e2
--- /dev/null
+++ b/include/linux/netfilter_ipv4/ipt_IDLETIMER.h
@@ -0,0 +1,22 @@
+/*
+ * linux/include/linux/netfilter_ipv4/ipt_IDLETIMER.h
+ *
+ * Header file for IP tables timer target module.
+ *
+ * Copyright (C) 2004 Nokia Corporation
+ * Written by Timo Teräs <ext-timo.teras@nokia.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _IPT_TIMER_H
+#define _IPT_TIMER_H
+
+struct ipt_idletimer_info {
+	unsigned int timeout;
+};
+
+#endif
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 1833bdb..91fba9a 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -204,6 +204,23 @@ config IP_NF_TARGET_REDIRECT
 
 	  To compile it as a module, choose M here.  If unsure, say N.
 
+config IP_NF_TARGET_IDLETIMER
+	tristate  "IDLETIMER target support"
+	depends on IP_NF_IPTABLES
+	help
+	  This option adds a `IDLETIMER' target. Each matching packet resets
+	  the timer associated with input and/or output interfaces. Timer
+	  expiry causes kobject uevent. Idle timer can be read via sysfs.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
+config IP_NF_TARGET_IDLETIMER_DEBUG
+	bool "IDLETIMER target debugging"
+	help
+	  Say Y here if you want to get debugging information when using the
+	  IDLETIMER target.  If unsure, say N.
+
+
 config NF_NAT_SNMP_BASIC
 	tristate "Basic SNMP-ALG support"
 	depends on NF_NAT
diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index 4811159..60bdaf1 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -60,6 +60,7 @@ obj-$(CONFIG_IP_NF_TARGET_MASQUERADE) += ipt_MASQUERADE.o
 obj-$(CONFIG_IP_NF_TARGET_NETMAP) += ipt_NETMAP.o
 obj-$(CONFIG_IP_NF_TARGET_REDIRECT) += ipt_REDIRECT.o
 obj-$(CONFIG_IP_NF_TARGET_REJECT) += ipt_REJECT.o
+obj-$(CONFIG_IP_NF_TARGET_IDLETIMER) += ipt_IDLETIMER.o
 obj-$(CONFIG_IP_NF_TARGET_ULOG) += ipt_ULOG.o
 
 # generic ARP tables
diff --git a/net/ipv4/netfilter/ipt_IDLETIMER.c b/net/ipv4/netfilter/ipt_IDLETIMER.c
new file mode 100644
index 0000000..2c5b465
--- /dev/null
+++ b/net/ipv4/netfilter/ipt_IDLETIMER.c
@@ -0,0 +1,320 @@
+/*
+ * linux/net/ipv4/netfilter/ipt_IDLETIMER.c
+ *
+ * Netfilter module to trigger a timer when packet matches.
+ * After timer expires a kevent will be sent.
+ *
+ * Copyright (C) 2004, 2010 Nokia Corporation
+ * Written by Timo Teras <ext-timo.teras@nokia.com>
+ *
+ * Contact: Luciano Coelho <luciano.coelho@nokia.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
+ * 02110-1301 USA
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/timer.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/notifier.h>
+#include <linux/netfilter.h>
+#include <linux/rtnetlink.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ipt_IDLETIMER.h>
+#include <linux/kobject.h>
+#include <linux/workqueue.h>
+
+#ifdef CONFIG_IP_NF_TARGET_IDLETIMER_DEBUG
+#define DEBUGP(format, args...) printk(KERN_DEBUG \
+				       "ipt_IDLETIMER:%s:" format "\n", \
+				       __func__ , ## args)
+#else
+#define DEBUGP(format, args...)
+#endif
+
+/*
+ * Internal timer management.
+ */
+static ssize_t utimer_attr_show(struct device *dev,
+				struct device_attribute *attr, char *buf);
+static ssize_t utimer_attr_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t count);
+
+struct utimer_t {
+	char name[IFNAMSIZ];
+	struct list_head entry;
+	struct timer_list timer;
+	struct work_struct work;
+	struct net *net;
+};
+
+static LIST_HEAD(active_utimer_head);
+static DEFINE_SPINLOCK(list_lock);
+static DEVICE_ATTR(idletimer, 0644, utimer_attr_show, utimer_attr_store);
+
+static void utimer_delete(struct utimer_t *timer)
+{
+	DEBUGP("Deleting timer '%s'\n", timer->name);
+
+	list_del(&timer->entry);
+	del_timer_sync(&timer->timer);
+	put_net(timer->net);
+	kfree(timer);
+}
+
+static void utimer_work(struct work_struct *work)
+{
+	struct utimer_t *timer = container_of(work, struct utimer_t, work);
+	struct net_device *netdev = NULL;
+
+	netdev = dev_get_by_name(timer->net, timer->name);
+
+	if (netdev != NULL) {
+		sysfs_notify(&netdev->dev.kobj, NULL,
+			     "idletimer");
+		dev_put(netdev);
+	}
+}
+
+static void utimer_expired(unsigned long data)
+{
+	struct utimer_t *timer = (struct utimer_t *) data;
+
+	DEBUGP("Timer '%s' expired\n", timer->name);
+
+	spin_lock_bh(&list_lock);
+	utimer_delete(timer);
+	spin_unlock_bh(&list_lock);
+
+	schedule_work(&timer->work);
+}
+
+static struct utimer_t *utimer_create(const char *name,
+				      struct net *net)
+{
+	struct utimer_t *timer;
+
+	timer = kmalloc(sizeof(struct utimer_t), GFP_ATOMIC);
+	if (timer == NULL)
+		return NULL;
+
+	list_add(&timer->entry, &active_utimer_head);
+	strlcpy(timer->name, name, sizeof(timer->name));
+	timer->net = get_net(net);
+
+	init_timer(&timer->timer);
+	timer->timer.function = utimer_expired;
+	timer->timer.data = (unsigned long) timer;
+
+	INIT_WORK(&timer->work, utimer_work);
+
+	DEBUGP("Created timer '%s'\n", timer->name);
+
+	return timer;
+}
+
+static struct utimer_t *__utimer_find(const char *name, const struct net *net)
+{
+	struct utimer_t *entry;
+
+	list_for_each_entry(entry, &active_utimer_head, entry) {
+		if (!strcmp(name, entry->name) && net == entry->net)
+			return entry;
+	}
+
+	return NULL;
+}
+
+static void utimer_modify(const char *name,
+			  struct net *net,
+			  unsigned long expires)
+{
+	struct utimer_t *timer;
+
+	DEBUGP("Modifying timer '%s'\n", name);
+	spin_lock_bh(&list_lock);
+	timer = __utimer_find(name, net);
+	if (timer == NULL)
+		timer = utimer_create(name, net);
+	mod_timer(&timer->timer, expires);
+	spin_unlock_bh(&list_lock);
+}
+
+static ssize_t utimer_attr_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct utimer_t *timer;
+	struct net_device *netdev = to_net_dev(dev);
+	unsigned long expires = 0;
+
+	spin_lock_bh(&list_lock);
+	timer = __utimer_find(netdev->name, dev_net(netdev));
+	if (timer)
+		expires = timer->timer.expires;
+	spin_unlock_bh(&list_lock);
+
+	if (expires)
+		return sprintf(buf, "%lu\n", (expires-jiffies) / HZ);
+
+	return sprintf(buf, "0\n");
+}
+
+static ssize_t utimer_attr_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t count)
+{
+	int expires;
+	struct net_device *netdev = to_net_dev(dev);
+
+	if (sscanf(buf, "%d", &expires) == 1) {
+		if (expires > 0)
+			utimer_modify(netdev->name,
+				      dev_net(netdev),
+				      jiffies+HZ*(unsigned long)expires);
+	}
+
+	return count;
+}
+
+static int utimer_notifier_call(struct notifier_block *this,
+				unsigned long event, void *ptr)
+{
+	struct net_device *netdev = ptr;
+	int ret;
+
+	switch (event) {
+	case NETDEV_UP:
+		DEBUGP("NETDEV_UP: %s\n", netdev->name);
+		ret = device_create_file(&netdev->dev,
+					 &dev_attr_idletimer);
+		WARN_ON(ret);
+
+		break;
+	case NETDEV_DOWN:
+		DEBUGP("NETDEV_DOWN: %s\n", netdev->name);
+		device_remove_file(&netdev->dev,
+				   &dev_attr_idletimer);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block utimer_notifier_block = {
+	.notifier_call	= utimer_notifier_call,
+};
+
+
+static int utimer_init(void)
+{
+	return register_netdevice_notifier(&utimer_notifier_block);
+}
+
+static void utimer_fini(void)
+{
+	struct utimer_t *entry, *next;
+	struct net_device *dev;
+	struct net *net;
+
+	list_for_each_entry_safe(entry, next, &active_utimer_head, entry)
+		utimer_delete(entry);
+
+	rtnl_lock();
+	unregister_netdevice_notifier(&utimer_notifier_block);
+	for_each_net(net) {
+		for_each_netdev(net, dev) {
+			utimer_notifier_call(&utimer_notifier_block,
+					     NETDEV_DOWN, dev);
+		}
+	}
+	rtnl_unlock();
+}
+
+/*
+ * The actual iptables plugin.
+ */
+static unsigned int ipt_idletimer_target(struct sk_buff *skb,
+					 const struct xt_action_param *par)
+{
+	const struct ipt_idletimer_info *target = par->targinfo;
+	unsigned long expires;
+
+	expires = jiffies + HZ*target->timeout;
+
+	if (par->in != NULL)
+		utimer_modify(par->in->name,
+			      dev_net(par->in),
+			      expires);
+
+	if (par->out != NULL)
+		utimer_modify(par->out->name,
+			      dev_net(par->out),
+			      expires);
+
+	return XT_CONTINUE;
+}
+
+static int ipt_idletimer_checkentry(const struct xt_tgchk_param *par)
+{
+	const struct ipt_idletimer_info *info = par->targinfo;
+
+	if (info->timeout == 0) {
+		DEBUGP("timeout value is zero\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static struct xt_target ipt_idletimer = {
+	.name		= "IDLETIMER",
+	.family		= NFPROTO_IPV4,
+	.target		= ipt_idletimer_target,
+	.targetsize     = sizeof(struct ipt_idletimer_info),
+	.checkentry	= ipt_idletimer_checkentry,
+	.me		= THIS_MODULE,
+};
+
+static int __init init(void)
+{
+	int ret;
+
+	ret = utimer_init();
+	if (ret)
+		return ret;
+
+	ret =  xt_register_target(&ipt_idletimer);
+	if (ret < 0) {
+		utimer_fini();
+		return ret;
+	}
+
+	return 0;
+}
+
+static void __exit fini(void)
+{
+	xt_unregister_target(&ipt_idletimer);
+	utimer_fini();
+}
+
+module_init(init);
+module_exit(fini);
+
+MODULE_AUTHOR("Timo Teras <ext-timo.teras@nokia.com>");
+MODULE_DESCRIPTION("iptables idletimer target module");
+MODULE_LICENSE("GPL");
-- 
1.6.3.3


^ permalink raw reply related

* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Arnd Bergmann @ 2010-05-14 12:12 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Scott Feldman, davem, netdev, chrisw
In-Reply-To: <4BED2CD8.4020209@trash.net>

On Friday 14 May 2010, Patrick McHardy wrote:
> Scott Feldman wrote:
> > --- a/net/core/rtnetlink.c
> > +++ b/net/core/rtnetlink.c
> > @@ -653,6 +653,26 @@ static inline int rtnl_vfinfo_size(const struct net_device *dev)
> >  		return 0;
> >  }
> >  
> > +static size_t rtnl_vf_port_size(const struct net_device *dev)
> > +{
> > +	size_t vf_port_size = nla_total_size(sizeof(struct nlattr))
> > +						     /* VF_PORT_VF */
> > +		+ nla_total_size(VF_PORT_PROFILE_MAX)/* VF_PORT_PROFILE */
> > +		+ nla_total_size(sizeof(struct ifla_vf_port_vsi))
> > +						     /* VF_PORT_VSI_TYPE */
> > +		+ nla_total_size(VF_PORT_UUID_MAX)   /* VF_PORT_VSI_INSTANCE */
> > +		+ nla_total_size(VF_PORT_UUID_MAX)   /* VF_PORT_HOST_UUID */
> > +		+ nla_total_size(1)		     /* VF_PORT_VDP_REQUEST */
> 
> Do messages generated by the kernel really contain a request?

Yes, the request field of the VDP message shows the status (e.g. associated or
disassociated).

> > +static int rtnl_vf_port_fill_nest(struct sk_buff *skb, struct net_device *dev,
> > +				  int vf)
> > +{
> > +	struct nlattr *data;
> > +	int err;
> > +
> > +	data = nla_nest_start(skb, IFLA_VF_PORT);
> 
> We usually use a top-level attribute to encapsulate lists of identical
> attributes. The other iflink attributes may only occur once and are
> usually parsed using nla_parse_nested(), which will parse all
> IFLA_VF_PORT attributes, but only return the last one.
> 
> Something like:
> 
> iflink message:
> ...
> [IFLA_VF_PORTS]
>   [IFLA_VF_PORT]
>     [IFLA_VF_PORT_*], ...
>   [IFLA_VF_PORT]
>     [IFLA_VF_PORT_*], ...
>   ...

Ah, I was wondering about this already. Does this mean that IFLA_VFINFO
does this incorrectly as well?

> >  static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
> >  			    int type, u32 pid, u32 seq, u32 change,
> >  			    unsigned int flags)
> > @@ -747,17 +819,23 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
> >  		goto nla_put_failure;
> >  	copy_rtnl_link_stats64(nla_data(attr), stats);
> >  
> > +	if (dev->dev.parent)
> > +		NLA_PUT_U32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent));
> 
> Just wondering, is the only case where dev.parent is non-NULL
> really when virtual ports are present?

No, but if parent is NULL, we must not call dev_num_vf(). The way that enic
needs the attributes, they can be either for the VF of dev->dev.parent (the
PCI PF), or for the PF itself, even if it does not have VFs, in which case
it would be interesting to have IFLA_NUM_VF = 0 in the output.

Maybe a better structure would be to separate the two cases, also allowing
a port profile to be associated with both the PF and with each of its VFs?

Something like this:

[IFLA_NUM_VF]
[IFLA_VF_PORTS]
  [IFLA_VF_PORT]
    [IFLA_VF_PORT_*], ...
  [IFLA_VF_PORT]
    [IFLA_VF_PORT_*], ...
[IFLA_PORT_SELF]
  [IFLA_VF_PORT_*], ...

	Arnd

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2010-05-14 11:06 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


One small last minute fix from the VHOST folks to deal with some
memory barrier issues.

Please pull, thanks a lot!

The following changes since commit cea0d767c29669bf89f86e4aee46ef462d2ebae8:
  Linus Torvalds (1):
        Merge branch 'hwmon-for-linus' of git://git.kernel.org/.../jdelvare/staging

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

David S. Miller (1):
      Merge branch 'net-2.6' of git://git.kernel.org/.../mst/vhost

Michael S. Tsirkin (1):
      vhost: fix barrier pairing

 drivers/vhost/vhost.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

^ permalink raw reply

* Re: [GIT PULL] last minute vhost-net fix
From: David Miller @ 2010-05-14 11:04 UTC (permalink / raw)
  To: mst; +Cc: kvm, virtualization, netdev, linux-kernel
In-Reply-To: <20100513084433.GA23082@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Thu, 13 May 2010 11:44:34 +0300

> David, if it's not too late, please pull the following
> last minute fix into 2.6.34.

Pulled, thanks.

^ permalink raw reply

* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Patrick McHardy @ 2010-05-14 10:58 UTC (permalink / raw)
  To: Scott Feldman; +Cc: davem, netdev, chrisw, arnd
In-Reply-To: <20100514013526.1816.45104.stgit@savbu-pc100.cisco.com>

Scott Feldman wrote:
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -653,6 +653,26 @@ static inline int rtnl_vfinfo_size(const struct net_device *dev)
>  		return 0;
>  }
>  
> +static size_t rtnl_vf_port_size(const struct net_device *dev)
> +{
> +	size_t vf_port_size = nla_total_size(sizeof(struct nlattr))
> +						     /* VF_PORT_VF */
> +		+ nla_total_size(VF_PORT_PROFILE_MAX)/* VF_PORT_PROFILE */
> +		+ nla_total_size(sizeof(struct ifla_vf_port_vsi))
> +						     /* VF_PORT_VSI_TYPE */
> +		+ nla_total_size(VF_PORT_UUID_MAX)   /* VF_PORT_VSI_INSTANCE */
> +		+ nla_total_size(VF_PORT_UUID_MAX)   /* VF_PORT_HOST_UUID */
> +		+ nla_total_size(1)		     /* VF_PORT_VDP_REQUEST */

Do messages generated by the kernel really contain a request?

> +		+ nla_total_size(2);		     /* VF_PORT_VDP_RESPONSE */
> +
> +	if (!dev->netdev_ops->ndo_get_vf_port || !dev->dev.parent)
> +		return 0;
> +	if (dev_num_vf(dev->dev.parent))
> +		return vf_port_size * dev_num_vf(dev->dev.parent);
> +	else
> +		return vf_port_size;
> +}
> +


> +static int rtnl_vf_port_fill_nest(struct sk_buff *skb, struct net_device *dev,
> +				  int vf)
> +{
> +	struct nlattr *data;
> +	int err;
> +
> +	data = nla_nest_start(skb, IFLA_VF_PORT);

We usually use a top-level attribute to encapsulate lists of identical
attributes. The other iflink attributes may only occur once and are
usually parsed using nla_parse_nested(), which will parse all
IFLA_VF_PORT attributes, but only return the last one.

Something like:

iflink message:
...
[IFLA_VF_PORTS]
  [IFLA_VF_PORT]
    [IFLA_VF_PORT_*], ...
  [IFLA_VF_PORT]
    [IFLA_VF_PORT_*], ...
  ...
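
A minimal userspace sketch of the layout above, using a simplified
type-length-value model rather than the kernel's netlink helpers. All names
here (struct attr, struct msgbuf, nest_start(), nest_end(), put_u32(),
build_vf_ports(), count_children() and the DEMO_* type numbers) are
hypothetical. It only illustrates why a list container lets a receiver walk
every per-VF nest instead of keeping one slot per attribute type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified attribute header: len covers header plus payload, unpadded. */
struct attr {
	uint16_t len;
	uint16_t type;
};

#define ATTR_ALIGN(n)	(((n) + 3u) & ~3u)
#define ATTR_HDRLEN	((unsigned int)sizeof(struct attr))

/* Hypothetical type numbers, for illustration only. */
enum { DEMO_VF_PORTS = 1, DEMO_VF_PORT = 2, DEMO_VF_PORT_VF = 3 };

/* No bounds checking: callers must keep the message under 256 bytes. */
struct msgbuf {
	unsigned char data[256];
	unsigned int used;
};

/* Open a nest: write a header now, patch its length in nest_end(). */
static unsigned int nest_start(struct msgbuf *m, uint16_t type)
{
	struct attr a = { .len = 0, .type = type };
	unsigned int off = m->used;

	memcpy(m->data + off, &a, sizeof(a));
	m->used += ATTR_HDRLEN;
	return off;
}

/* Close a nest: its length is everything written since nest_start(). */
static void nest_end(struct msgbuf *m, unsigned int off)
{
	struct attr a;

	memcpy(&a, m->data + off, sizeof(a));
	a.len = (uint16_t)(m->used - off);
	memcpy(m->data + off, &a, sizeof(a));
}

static void put_u32(struct msgbuf *m, uint16_t type, uint32_t val)
{
	struct attr a = { .len = (uint16_t)(ATTR_HDRLEN + sizeof(val)),
			  .type = type };

	memcpy(m->data + m->used, &a, sizeof(a));
	memcpy(m->data + m->used + ATTR_HDRLEN, &val, sizeof(val));
	m->used += ATTR_ALIGN(a.len);
}

/* Build [VF_PORTS] { [VF_PORT] { VF=0 }, [VF_PORT] { VF=1 }, ... }. */
static void build_vf_ports(struct msgbuf *m, uint32_t num_vfs)
{
	unsigned int ports = nest_start(m, DEMO_VF_PORTS);
	uint32_t vf;

	for (vf = 0; vf < num_vfs; vf++) {
		unsigned int port = nest_start(m, DEMO_VF_PORT);

		put_u32(m, DEMO_VF_PORT_VF, vf);
		nest_end(m, port);
	}
	nest_end(m, ports);
}

/* Walk the direct children of the nest at off and count one type. */
static unsigned int count_children(const struct msgbuf *m, unsigned int off,
				   uint16_t type)
{
	struct attr outer, a = { 0, 0 };
	unsigned int pos, end, n = 0;

	memcpy(&outer, m->data + off, sizeof(outer));
	end = off + outer.len;
	for (pos = off + ATTR_HDRLEN; pos + ATTR_HDRLEN <= end;
	     pos += ATTR_ALIGN(a.len)) {
		memcpy(&a, m->data + pos, sizeof(a));
		if (a.len < ATTR_HDRLEN)
			break;	/* malformed attribute: stop walking */
		if (a.type == type)
			n++;
	}
	return n;
}
```

After build_vf_ports(&m, 3), count_children(&m, 0, DEMO_VF_PORT) visits all
three per-VF nests, which is the iteration the receiving side needs in order
to apply configuration for every VF in one message.
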


> +	if (!data)
> +		return -EMSGSIZE;
> +
> +	if (vf != VF_PORT_VF_NOT_USED)
> +		nla_put_u32(skb, IFLA_VF_PORT_VF, vf);

This should be checking for errors or use NLA_PUT_U32.

> +
> +	err = dev->netdev_ops->ndo_get_vf_port(dev, vf, skb);
> +	if (err) {
> +		nla_nest_cancel(skb, data);
> +		return err;
> +	}
> +
> +	nla_nest_end(skb, data);
> +
> +	return 0;
> +}
> +

>  static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>  			    int type, u32 pid, u32 seq, u32 change,
>  			    unsigned int flags)
> @@ -747,17 +819,23 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>  		goto nla_put_failure;
>  	copy_rtnl_link_stats64(nla_data(attr), stats);
>  
> +	if (dev->dev.parent)
> +		NLA_PUT_U32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent));

Just wondering, is the only case where dev.parent is non-NULL
really when virtual ports are present?

> +
>  	if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent) {
>  		int i;
>  		struct ifla_vf_info ivi;
>  
> -		NLA_PUT_U32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent));
>  		for (i = 0; i < dev_num_vf(dev->dev.parent); i++) {
>  			if (dev->netdev_ops->ndo_get_vf_config(dev, i, &ivi))
>  				break;
>  			NLA_PUT(skb, IFLA_VFINFO, sizeof(ivi), &ivi);
>  		}
>  	}
> +
> +	if (rtnl_vf_port_fill(skb, dev))
> +		goto nla_put_failure;
> +
>  	if (dev->rtnl_link_ops) {
>  		if (rtnl_link_fill(skb, dev) < 0)
>  			goto nla_put_failure;
> @@ -824,6 +902,7 @@ const struct nla_policy ifla_policy[IFLA_MAX+1] = {
>  				    .len = sizeof(struct ifla_vf_vlan) },
>  	[IFLA_VF_TX_RATE]	= { .type = NLA_BINARY,
>  				    .len = sizeof(struct ifla_vf_tx_rate) },
> +	[IFLA_VF_PORT]		= { .type = NLA_NESTED },
>  };
>  EXPORT_SYMBOL(ifla_policy);
>  
> @@ -832,6 +911,20 @@ static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
>  	[IFLA_INFO_DATA]	= { .type = NLA_NESTED },
>  };
>  
> +static const struct nla_policy ifla_vf_port_policy[IFLA_VF_PORT_MAX+1] = {
> +	[IFLA_VF_PORT_VF]	    = { .type = NLA_U32 },
> +	[IFLA_VF_PORT_PROFILE]	    = { .type = NLA_STRING,
> +					.len = VF_PORT_PROFILE_MAX },
> +	[IFLA_VF_PORT_VSI_TYPE]     = { .type = NLA_BINARY,
> +					.len = sizeof(struct ifla_vf_port_vsi)},
> +	[IFLA_VF_PORT_INSTANCE_UUID]= { .type = NLA_BINARY,
> +					.len = VF_PORT_UUID_MAX },
> +	[IFLA_VF_PORT_HOST_UUID]    = { .type = NLA_STRING,
> +					.len = VF_PORT_UUID_MAX },
> +	[IFLA_VF_PORT_REQUEST]	    = { .type = NLA_U8, },
> +	[IFLA_VF_PORT_RESPONSE]	    = { .type = NLA_U16, },
> +};
> +
>  struct net *rtnl_link_get_net(struct net *src_net, struct nlattr *tb[])
>  {
>  	struct net *net;
> @@ -1028,6 +1121,27 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
>  	}
>  	err = 0;
>  
> +	if (tb[IFLA_VF_PORT]) {
> +		struct nlattr *vf_port[IFLA_VF_PORT_MAX+1];
> +		int vf = VF_PORT_VF_NOT_USED;
> +
> +		err = nla_parse_nested(vf_port, IFLA_VF_PORT_MAX,
> +			tb[IFLA_VF_PORT], ifla_vf_port_policy);
> +		if (err < 0)
> +			goto errout;
> +
> +		if (vf_port[IFLA_VF_PORT_VF])
> +			vf = nla_get_u32(vf_port[IFLA_VF_PORT_VF]);
> +
> +		err = -EOPNOTSUPP;
> +		if (ops->ndo_set_vf_port)
> +			err = ops->ndo_set_vf_port(dev, vf, vf_port);
> +		if (err < 0)
> +			goto errout;
> +		modified = 1;
> +	}
> +	err = 0;
> +
>  errout:
>  	if (err < 0 && modified && net_ratelimit())
>  		printk(KERN_WARNING "A link change request failed with "
> 


^ permalink raw reply

* Re: [net-next-2.6 V6 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Patrick McHardy @ 2010-05-14 10:47 UTC (permalink / raw)
  To: Scott Feldman; +Cc: davem, netdev, chrisw, arnd
In-Reply-To: <C811BD6D.312C4%scofeldm@cisco.com>

Scott Feldman wrote:
> On 5/13/10 1:40 PM, "Patrick McHardy" <kaber@trash.net> wrote:
> 
>>> +  if (vf_port[IFLA_VF_PORT_VF])
>>> +   vf = nla_get_u32(vf_port[IFLA_VF_PORT_VF]);
>>> +  err = -EOPNOTSUPP;
>>> +  if (ops->ndo_set_vf_port)
>>> +   err = ops->ndo_set_vf_port(dev, vf, vf_port);
>> This appears to be addressing a single VF to issue commands.
>> I already explained this during the last set of VF patches,
>> messages are supposed to be symmetrical, since you're dumping
>> state for all existing VFs, you also need to accept configuration
>> for multiple VFs. Basically, the kernel must be able to receive
>> a message it created during a dump and fully recreate the state.
> 
> This was modeled same as existing IFLA_VF_ cmd where single VF is addressed
> on set, but all VFs for PF are dumped on get.

Yes, that one should have been done differently as well,
unfortunately my comments were ignored. So far rtnetlink
had two properties that are now broken:

- messages sent by the kernel could be sent back to the
  kernel to re-create an object in the same state

- the same parsing functions could be used in userspace for
  messages sent by the kernel and netlink error messages,
  which contain the original userspace message

I know at least one program I've written a few years ago which
relies on the second property. Anyways, this is easily fixable
by encapsulating all top-level VF attributes in a list and
invoking the ndo_set_vf_port() callback for each VF configuration.

^ permalink raw reply

* Re: [PATCH] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Patrick McHardy @ 2010-05-14 10:16 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Jason Gunthorpe, netfilter-devel, netdev
In-Reply-To: <alpine.LSU.2.01.1005140849500.28602@obet.zrqbmnf.qr>

Jan Engelhardt wrote:
> On Friday 2010-05-14 02:38, Jason Gunthorpe wrote:
> 
>> At least the XEN net front driver always produces non linear skbs,
>> so the SIP module does nothing at all when used with that NIC.
>>
>> Copy the hacky technique for accessing SKB data from the ftp conntrack,
>> better than nothing..
>>
>> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
>>
>> +/* This is slow, but it's simple. --RR */
>> +static char *sip_buffer;
>> +static DEFINE_SPINLOCK(nf_sip_lock);
>> +
> 
> skb_linearize seems simpler. (What about the cost?)

Yeah, we have to use skb_linearize(). The SIP NAT helper might mangle
the packet and alter its size, at which point we'd have to make a new
copy of the data area to get the offsets right.

^ permalink raw reply

* Re: [PATCH 0/9]qlcnic: cleanup
From: David Miller @ 2010-05-14 10:15 UTC (permalink / raw)
  To: amit.salecha; +Cc: netdev, ameen.rahman
In-Reply-To: <1273756070-7205-1-git-send-email-amit.salecha@qlogic.com>

From: Amit Kumar Salecha <amit.salecha@qlogic.com>
Date: Thu, 13 May 2010 06:07:41 -0700

> Hi
>   Series of 9 patches to clean up unused code and to support quiescent
>   mode.

All applied, thanks.

^ permalink raw reply

* Re: [PATCH 0/6] sky2: update
From: David Miller @ 2010-05-14 10:15 UTC (permalink / raw)
  To: shemminger; +Cc: mikem, netdev
In-Reply-To: <20100513161247.833356588@vyatta.com>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Thu, 13 May 2010 09:12:47 -0700

> Bunch of patches from Mike, with some additional comments.

All applied to net-next-2.6, thanks.

^ permalink raw reply

* Re: [net-next-2.6 PATCH 3/3] ixgb and e1000: Use new function for copybreak tests
From: David Miller @ 2010-05-14 10:14 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, joe
In-Reply-To: <20100514012615.30457.37881.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 13 May 2010 18:26:17 -0700

> From: Joe Perches <joe@perches.com>
> 
> There appears to be an off-by-1 defect in the maximum packet size
> copied when copybreak is specified in these modules.
> 
> The copybreak module params are specified as:
> "Maximum size of packet that is copied to a new buffer on receive"
> 
> The tests are changed from "< copybreak" to "<= copybreak"
> and moved into new static functions for readability.
> 
> Signed-off-by: Joe Perches <joe@perches.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.
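
A minimal sketch of the boundary case described in the quoted commit message,
not the ixgb or e1000 code itself. The helper names copies_old() and
copies_new() are hypothetical. With the documented meaning "maximum size of
packet that is copied", a packet whose length equals copybreak should still
take the copy path, which only the "<=" form does.

```c
#include <assert.h>
#include <stdbool.h>

/* Old test: a packet of exactly copybreak bytes is not copied. */
static bool copies_old(unsigned int len, unsigned int copybreak)
{
	return len < copybreak;
}

/* New test: copybreak is the largest length that is still copied. */
static bool copies_new(unsigned int len, unsigned int copybreak)
{
	return len <= copybreak;
}
```

The two helpers differ only when len equals copybreak; for every other length
they agree.
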

^ permalink raw reply

* Re: [net-next-2.6 PATCH 2/3] e1000: cleanup unused parameters
From: David Miller @ 2010-05-14 10:14 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, jesse.brandeburg
In-Reply-To: <20100514012554.30457.66528.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 13 May 2010 18:25:56 -0700

> From: Jesse Brandeburg <jesse.brandeburg@intel.com>
> 
> During the cleanup pass after the removal of e1000e hardware from e1000 some
> parameters were missed.  Remove them because it is just dead code.
> 
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply

* Re: [net-next-2.6 PATCH 1/3] e1000: fix WARN_ON with mac-vlan
From: David Miller @ 2010-05-14 10:14 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, jpirko, jesse.brandeburg
In-Reply-To: <20100514012425.30457.23799.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 13 May 2010 18:25:33 -0700

> From: Jesse Brandeburg <jesse.brandeburg@intel.com>
> 
> When adding more than 14 mac-vlan adapters on e1000 the driver
> would fire a WARN_ON when adding the 15th.  The WARN_ON in this
> case is completely un-necessary, as the code below the WARN_ON is
> directly handling the value the WARN_ON triggered on.
> 
> CC: Jiri Pirko <jpirko@redhat.com>
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply
