Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH v5] can: Convert to runtime_pm
From: Marc Kleine-Budde @ 2015-01-13 11:08 UTC (permalink / raw)
  To: Sören Brinkmann, Kedareswara rao Appana
  Cc: wg, michal.simek, grant.likely, robh+dt, linux-can, netdev,
	linux-arm-kernel, linux-kernel, devicetree,
	Kedareswara rao Appana
In-Reply-To: <3a3437c5c8ff48d9a45fee7e81fa8dca@BY2FFO11FD058.protection.gbl>

[-- Attachment #1: Type: text/plain, Size: 3138 bytes --]

On 01/12/2015 07:45 PM, Sören Brinkmann wrote:
> On Mon, 2015-01-12 at 08:34PM +0530, Kedareswara rao Appana wrote:
>> Instead of enabling/disabling clocks at several locations in the driver,
>> Use the runtime_pm framework. This consolidates the actions for runtime PM
>> In the appropriate callbacks and makes the driver more readable and mantainable.
>>
>> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
>> Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
>> ---
>> Changes for v5:
>>  - Updated with the review comments.
>>    Updated the remove fuction to use runtime_pm.
>> Chnages for v4:
>>  - Updated with the review comments.
>> Changes for v3:
>>   - Converted the driver to use runtime_pm.
>> Changes for v2:
>>   - Removed the struct platform_device* from suspend/resume
>>     as suggest by Lothar.
>>
>>  drivers/net/can/xilinx_can.c |  157 ++++++++++++++++++++++++++++-------------
>>  1 files changed, 107 insertions(+), 50 deletions(-)
> [..]
>> +static int __maybe_unused xcan_runtime_resume(struct device *dev)
>>  {
>> -	struct platform_device *pdev = dev_get_drvdata(dev);
>> -	struct net_device *ndev = platform_get_drvdata(pdev);
>> +	struct net_device *ndev = dev_get_drvdata(dev);
>>  	struct xcan_priv *priv = netdev_priv(ndev);
>>  	int ret;
>> +	u32 isr, status;
>>  
>>  	ret = clk_enable(priv->bus_clk);
>>  	if (ret) {
>> @@ -1014,15 +1030,28 @@ static int __maybe_unused xcan_resume(struct device *dev)
>>  	ret = clk_enable(priv->can_clk);
>>  	if (ret) {
>>  		dev_err(dev, "Cannot enable clock.\n");
>> -		clk_disable_unprepare(priv->bus_clk);
>> +		clk_disable(priv->bus_clk);
> [...]
>> @@ -1173,12 +1219,23 @@ static int xcan_remove(struct platform_device *pdev)
>>  {
>>  	struct net_device *ndev = platform_get_drvdata(pdev);
>>  	struct xcan_priv *priv = netdev_priv(ndev);
>> +	int ret;
>> +
>> +	ret = pm_runtime_get_sync(&pdev->dev);
>> +	if (ret < 0) {
>> +		netdev_err(ndev, "%s: pm_runtime_get failed(%d)\n",
>> +				__func__, ret);
>> +		return ret;
>> +	}
>>  
>>  	if (set_reset_mode(ndev) < 0)
>>  		netdev_err(ndev, "mode resetting failed!\n");
>>  
>>  	unregister_candev(ndev);
>> +	pm_runtime_disable(&pdev->dev);
>>  	netif_napi_del(&priv->napi);
>> +	clk_disable_unprepare(priv->bus_clk);
>> +	clk_disable_unprepare(priv->can_clk);
> 
> Shouldn't pretty much all these occurrences of clk_disable/enable
> disappear? This should all be handled by the runtime_pm framework now.

We have:
- clk_prepare_enable() in probe
- clk_disable_unprepare() in remove
- clk_enable() in runtime_resume
- clk_disable() in runtime_suspend

Which is, as far as I understand the right way to do it. Maybe
Kedareswara can post the clock debug output again with this patch
iteration. Have I missed something?

regards,
Marc
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* [RFC 1/2] misc: uidstat: Add uid stat driver to collect network statistics.
From: Kiran Raparthy @ 2015-01-13 11:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mike Chan, Arnd Bergmann, Greg Kroah-Hartman, David S. Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, netdev, Android Kernel Team, John Stultz,
	Sumit Semwal, JP Abgrall, Arve Hj�nnev�g,
	Kiran Raparthy

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 9395 bytes --]

From: Mike Chan <mike@android.com>

misc: uidstat: Add uid stat driver to collect network statistics.

To analyze application's network statistics, we need a mechanism to export
the UID based statistics to userspace so that userspace tools can use the
exported numbers and generate the report against the UID.

This patch allows the user to explore the UID based network statistics
exported to /proc/uid_stat.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
cc: Mike Chan <mike@android.com>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Android Kernel Team <kernel-team@android.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: JP Abgrall <jpa@google.com>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Mike Chan <mike@android.com>
[Kiran: Added context to commit message.
included build fixes from JP Abgrall and Arve.
Used single_release instead of seq_release.
internally continued to use uid_t but used covnersion
from kuid_t where ever necessary]
Signed-off-by: Kiran Raparthy <kiran.kumar@linaro.org>
---
 drivers/misc/Kconfig     |   7 +++
 drivers/misc/Makefile    |   1 +
 drivers/misc/uid_stat.c  | 157 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/uid_stat.h |  33 ++++++++++
 net/ipv4/tcp.c           |  10 +++
 5 files changed, 208 insertions(+)
 create mode 100644 drivers/misc/uid_stat.c
 create mode 100644 include/linux/uid_stat.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 006242c..1c79b94 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -402,6 +402,13 @@ config TI_DAC7512
 	  This driver can also be built as a module. If so, the module
 	  will be called ti_dac7512.
 
+config UID_STAT
+	bool "UID based statistics tracking"
+	default n
+	help
+	  This option allows to enable UID based network statistics tracking.
+	  statistics are exported to /proc/uid_stat.
+
 config VMWARE_BALLOON
 	tristate "VMware Balloon Driver"
 	depends on X86 && HYPERVISOR_GUEST
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 7d5c4cd..3754347 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -34,6 +34,7 @@ obj-$(CONFIG_ISL29020)		+= isl29020.o
 obj-$(CONFIG_SENSORS_TSL2550)	+= tsl2550.o
 obj-$(CONFIG_DS1682)		+= ds1682.o
 obj-$(CONFIG_TI_DAC7512)	+= ti_dac7512.o
+obj-$(CONFIG_UID_STAT)		+= uid_stat.o
 obj-$(CONFIG_C2PORT)		+= c2port/
 obj-$(CONFIG_HMC6352)		+= hmc6352.o
 obj-y				+= eeprom/
diff --git a/drivers/misc/uid_stat.c b/drivers/misc/uid_stat.c
new file mode 100644
index 0000000..aaaa406
--- /dev/null
+++ b/drivers/misc/uid_stat.c
@@ -0,0 +1,157 @@
+/* drivers/misc/uid_stat.c
+ *
+ * Copyright (C) 2008 - 2009 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/atomic.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/stat.h>
+#include <linux/uid_stat.h>
+
+static DEFINE_SPINLOCK(uid_lock);
+static LIST_HEAD(uid_list);
+static struct proc_dir_entry *parent;
+
+struct uid_stat {
+	struct list_head link;
+	uid_t uid;
+	atomic_t tcp_rcv;
+	atomic_t tcp_snd;
+};
+
+static struct uid_stat *find_uid_stat(uid_t uid)
+{
+	struct uid_stat *entry;
+
+	list_for_each_entry(entry, &uid_list, link) {
+		if (entry->uid == uid)
+			return entry;
+	}
+	return NULL;
+}
+
+static int uid_stat_atomic_int_show(struct seq_file *m, void *v)
+{
+	unsigned int bytes;
+	atomic_t *counter = m->private;
+
+	bytes = (unsigned int) (atomic_read(counter) + INT_MIN);
+	return seq_printf(m, "%u\n", bytes);
+}
+
+static int uid_stat_read_atomic_int_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, uid_stat_atomic_int_show, PDE_DATA(inode));
+}
+
+static const struct file_operations uid_stat_read_atomic_int_fops = {
+	.open		= uid_stat_read_atomic_int_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+/* Create a new entry for tracking the specified uid. */
+static struct uid_stat *create_stat(uid_t uid)
+{
+	struct uid_stat *new_uid;
+	/* Create the uid stat struct and append it to the list. */
+	new_uid = kmalloc(sizeof(struct uid_stat), GFP_ATOMIC);
+	if (!new_uid)
+		return NULL;
+
+	new_uid->uid = uid;
+	/* Counters start at INT_MIN, so we can track 4GB of network traffic. */
+	atomic_set(&new_uid->tcp_rcv, INT_MIN);
+	atomic_set(&new_uid->tcp_snd, INT_MIN);
+
+	list_add_tail(&new_uid->link, &uid_list);
+	return new_uid;
+}
+
+static void create_stat_proc(struct uid_stat *new_uid)
+{
+	char uid_s[32];
+	struct proc_dir_entry *entry;
+
+	sprintf(uid_s, "%d", new_uid->uid);
+	entry = proc_mkdir(uid_s, parent);
+
+	/* Keep reference to uid_stat so we know what uid to read stats from. */
+	proc_create_data("tcp_snd", S_IRUGO, entry,
+			 &uid_stat_read_atomic_int_fops, &new_uid->tcp_snd);
+
+	proc_create_data("tcp_rcv", S_IRUGO, entry,
+			 &uid_stat_read_atomic_int_fops, &new_uid->tcp_rcv);
+}
+
+static struct uid_stat *find_or_create_uid_stat(kuid_t uid)
+{
+	struct uid_stat *entry;
+	unsigned long flags;
+	uid_t proc_uid;
+
+	proc_uid = from_kuid(&init_user_ns, uid);
+	spin_lock_irqsave(&uid_lock, flags);
+	entry = find_uid_stat(proc_uid);
+	if (entry) {
+		spin_unlock_irqrestore(&uid_lock, flags);
+		return entry;
+	}
+	entry = create_stat(proc_uid);
+	spin_unlock_irqrestore(&uid_lock, flags);
+	if (entry)
+		create_stat_proc(entry);
+	return entry;
+}
+
+int uid_stat_tcp_snd(kuid_t uid, int size)
+{
+	struct uid_stat *entry;
+
+	entry = find_or_create_uid_stat(uid);
+	if (!entry)
+		return -1;
+	atomic_add(size, &entry->tcp_snd);
+	return 0;
+}
+
+int uid_stat_tcp_rcv(kuid_t uid, int size)
+{
+	struct uid_stat *entry;
+
+	entry = find_or_create_uid_stat(uid);
+	if (!entry)
+		return -1;
+	atomic_add(size, &entry->tcp_rcv);
+	return 0;
+}
+
+static int __init uid_stat_init(void)
+{
+	parent = proc_mkdir("uid_stat", NULL);
+	if (!parent) {
+		pr_err("uid_stat: failed to create proc entry\n");
+		return -1;
+	}
+	return 0;
+}
+
+device_initcall(uid_stat_init);
diff --git a/include/linux/uid_stat.h b/include/linux/uid_stat.h
new file mode 100644
index 0000000..68823d3
--- /dev/null
+++ b/include/linux/uid_stat.h
@@ -0,0 +1,33 @@
+/* include/linux/uid_stat.h
+ *
+ * Copyright (C) 2008-2009 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef __uid_stat_h
+#define __uid_stat_h
+
+/* Contains definitions for resource tracking per uid. */
+
+#ifdef CONFIG_UID_STAT
+int uid_stat_tcp_snd(kuid_t uid, int size);
+int uid_stat_tcp_rcv(kuid_t uid, int size);
+#else
+static inline int uid_stat_tcp_snd(kuid_t uid, int size)
+{
+}
+static inline int uid_stat_tcp_rcv(kuid_t uid, int size)
+{
+}
+#endif
+
+#endif /* _LINUX_UID_STAT_H */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3075723..00eb156 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -268,6 +268,7 @@
 #include <linux/crypto.h>
 #include <linux/time.h>
 #include <linux/slab.h>
+#include <linux/uid_stat.h>
 
 #include <net/icmp.h>
 #include <net/inet_common.h>
@@ -1280,6 +1281,9 @@ out:
 		tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
 out_nopush:
 	release_sock(sk);
+
+	if (copied + copied_syn)
+		uid_stat_tcp_snd(current_uid(), copied + copied_syn);
 	return copied + copied_syn;
 
 do_fault:
@@ -1551,6 +1555,7 @@ int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
 	if (copied > 0) {
 		tcp_recv_skb(sk, seq, &offset);
 		tcp_cleanup_rbuf(sk, copied);
+		uid_stat_tcp_rcv(current_uid(), copied);
 	}
 	return copied;
 }
@@ -1879,6 +1884,9 @@ skip_copy:
 	tcp_cleanup_rbuf(sk, copied);
 
 	release_sock(sk);
+
+	if (copied > 0)
+		uid_stat_tcp_rcv(current_uid(), copied);
 	return copied;
 
 out:
@@ -1887,6 +1895,8 @@ out:
 
 recv_urg:
 	err = tcp_recv_urg(sk, msg, len, flags);
+	if (err > 0)
+		uid_stat_tcp_rcv(current_uid(), err);
 	goto out;
 
 recv_sndq:
-- 
1.8.2.1

^ permalink raw reply related

* [RFC 2/2] net: activity_stats: Add statistics for network transmission activity
From: Kiran Raparthy @ 2015-01-13 11:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mike Chan, Arnd Bergmann, Greg Kroah-Hartman, David S. Miller,
	netdev, Android Kernel Team, John Stultz, Sumit Semwal,
	Arve Hj�nnev�g, Kiran Raparthy
In-Reply-To: <1421147642-28360-1-git-send-email-kiran.kumar@linaro.org>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 7811 bytes --]

From: Mike Chan <mike@android.com>

net: activity_stats: Add statistics for network transmission activity

When enabled, tracks the frequency of network transmissions
(inbound and outbound) and buckets them accordingly.
Buckets are determined by time between network activity.

Each bucket represents the number of network transmisions that were
N sec or longer apart. Where N is defined as 1 << bucket index.

This network pattern tracking is particularly useful for wireless
networks (ie: 3G) where batching network activity closely together
is more power efficient than far apart.

New file: /proc/net/stat/activity

output:

Min Bucket(sec) Count
              1 7
              2 0
              4 1
              8 0
             16 0
             32 2
             64 1
            128 0

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Android Kernel Team <kernel-team@android.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Mike Chan <mike@android.com>
[Kiran: Added context to commit message.
included build fix from Arve,continued to use uid_t
but used covnersion from kuid_t where ever necessary]
Signed-off-by: Kiran Raparthy <kiran.kumar@linaro.org>
---
 drivers/misc/uid_stat.c      |   3 ++
 include/net/activity_stats.h |  25 ++++++++++
 net/Kconfig                  |   8 +++
 net/Makefile                 |   1 +
 net/activity_stats.c         | 116 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 153 insertions(+)
 create mode 100644 include/net/activity_stats.h
 create mode 100644 net/activity_stats.c

diff --git a/drivers/misc/uid_stat.c b/drivers/misc/uid_stat.c
index aaaa406..63065e2 100644
--- a/drivers/misc/uid_stat.c
+++ b/drivers/misc/uid_stat.c
@@ -24,6 +24,7 @@
 #include <linux/spinlock.h>
 #include <linux/stat.h>
 #include <linux/uid_stat.h>
+#include <net/activity_stats.h>
 
 static DEFINE_SPINLOCK(uid_lock);
 static LIST_HEAD(uid_list);
@@ -126,6 +127,7 @@ int uid_stat_tcp_snd(kuid_t uid, int size)
 {
 	struct uid_stat *entry;
 
+	activity_stats_update();
 	entry = find_or_create_uid_stat(uid);
 	if (!entry)
 		return -1;
@@ -137,6 +139,7 @@ int uid_stat_tcp_rcv(kuid_t uid, int size)
 {
 	struct uid_stat *entry;
 
+	activity_stats_update();
 	entry = find_or_create_uid_stat(uid);
 	if (!entry)
 		return -1;
diff --git a/include/net/activity_stats.h b/include/net/activity_stats.h
new file mode 100644
index 0000000..10e4c15
--- /dev/null
+++ b/include/net/activity_stats.h
@@ -0,0 +1,25 @@
+/*
+ * Copyright (C) 2010 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Author: Mike Chan (mike@android.com)
+ */
+
+#ifndef __activity_stats_h
+#define __activity_stats_h
+
+#ifdef CONFIG_NET_ACTIVITY_STATS
+void activity_stats_update(void);
+#else
+#define activity_stats_update(void) {}
+#endif
+
+#endif /* _NET_ACTIVITY_STATS_H */
diff --git a/net/Kconfig b/net/Kconfig
index ff9ffc1..3159361 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -83,6 +83,14 @@ source "net/netlabel/Kconfig"
 
 endif # if INET
 
+config NET_ACTIVITY_STATS
+	bool "Network activity statistics tracking"
+	default y
+	help
+	 Network activity statistics are useful for tracking wireless
+	 modem activity on 2G, 3G, 4G wireless networks. Counts number of
+	 transmissions and groups them in specified time buckets.
+
 config NETWORK_SECMARK
 	bool "Security Marking"
 	help
diff --git a/net/Makefile b/net/Makefile
index 95fc694..d81a969 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -73,6 +73,7 @@ obj-$(CONFIG_OPENVSWITCH)	+= openvswitch/
 obj-$(CONFIG_VSOCKETS)	+= vmw_vsock/
 obj-$(CONFIG_NET_MPLS_GSO)	+= mpls/
 obj-$(CONFIG_HSR)		+= hsr/
+obj-$(CONFIG_NET_ACTIVITY_STATS)		+= activity_stats.o
 ifneq ($(CONFIG_NET_SWITCHDEV),)
 obj-y				+= switchdev/
 endif
diff --git a/net/activity_stats.c b/net/activity_stats.c
new file mode 100644
index 0000000..cf0a09d
--- /dev/null
+++ b/net/activity_stats.c
@@ -0,0 +1,116 @@
+/* net/activity_stats.c
+ *
+ * Copyright (C) 2010 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Author: Mike Chan (mike@android.com)
+ */
+
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/suspend.h>
+#include <net/net_namespace.h>
+
+/* Track transmission rates in buckets (power of 2).
+ * 1,2,4,8...512 seconds.
+ *
+ * Buckets represent the count of network transmissions at least
+ * N seconds apart, where N is 1 << bucket index.
+ */
+#define BUCKET_MAX 10
+
+/* Track network activity frequency */
+static unsigned long activity_stats[BUCKET_MAX];
+static ktime_t last_transmit;
+static ktime_t suspend_time;
+static DEFINE_SPINLOCK(activity_lock);
+
+void activity_stats_update(void)
+{
+	int i;
+	unsigned long flags;
+	ktime_t now;
+	s64 delta;
+
+	spin_lock_irqsave(&activity_lock, flags);
+	now = ktime_get();
+	delta = ktime_to_ns(ktime_sub(now, last_transmit));
+
+	for (i = BUCKET_MAX - 1; i >= 0; i--) {
+		/* Check if the time delta between network activity is within
+		 * the minimum bucket range.
+		 */
+		if (delta < (1000000000ULL << i))
+			continue;
+
+		activity_stats[i]++;
+		last_transmit = now;
+		break;
+	}
+	spin_unlock_irqrestore(&activity_lock, flags);
+}
+
+static int activity_stats_show(struct seq_file *m, void *v)
+{
+	int i;
+	int ret;
+
+	seq_printf(m, "Min Bucket(sec) Count\n");
+
+	for (i = 0; i < BUCKET_MAX; i++) {
+		ret = seq_printf(m, "%15d %lu\n", 1 << i, activity_stats[i]);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int activity_stats_notifier(struct notifier_block *nb,
+				   unsigned long event, void *dummy)
+{
+	switch (event) {
+	case PM_SUSPEND_PREPARE:
+		suspend_time = ktime_get_real();
+		break;
+
+	case PM_POST_SUSPEND:
+		suspend_time = ktime_sub(ktime_get_real(), suspend_time);
+		last_transmit = ktime_sub(last_transmit, suspend_time);
+	}
+
+	return 0;
+}
+
+static int activity_stats_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, activity_stats_show, PDE_DATA(inode));
+}
+
+static const struct file_operations activity_stats_fops = {
+	.open		= activity_stats_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
+
+static struct notifier_block activity_stats_notifier_block = {
+	.notifier_call = activity_stats_notifier,
+};
+
+static int  __init activity_stats_init(void)
+{
+	proc_create("activity", S_IRUGO,
+		    init_net.proc_net_stat, &activity_stats_fops);
+	return register_pm_notifier(&activity_stats_notifier_block);
+}
+
+subsys_initcall(activity_stats_init);
-- 
1.8.2.1

^ permalink raw reply related

* Re: [PATCH net-next] rhashtable: Lower/upper bucket may map to same lock while shrinking
From: 'Thomas Graf' @ 2015-01-13 11:25 UTC (permalink / raw)
  To: David Laight
  Cc: davem@davemloft.net, Fengguang Wu, LKP,
	linux-kernel@vger.kernel.org, netfilter-devel@vger.kernel.org,
	coreteam@netfilter.org, netdev@vger.kernel.org
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D1CAC6593@AcuExch.aculab.com>

On 01/13/15 at 09:49am, David Laight wrote:
> From: Thomas Graf
> > Each per bucket lock covers a configurable number of buckets. While
> > shrinking, two buckets in the old table contain entries for a single
> > bucket in the new table. We need to lock down both while linking.
> > Check if they are protected by different locks to avoid a recursive
> > lock.
> 
> Thought, could the shrunk table use the same locks as the lower half
> of the old table?

No. A new bucket table and thus a new set of locks is allocated when the
table is shrunk or grown. We only have check for overlapping locks
when holding multiple locks for the same table at the same time.

> I also wonder whether shrinking hash tables is ever actually worth
> the effort. Most likely they'll need to grow again very quickly.

Specifying a .shrink_decision function is optional so every rhashtable
user can decide whether it wants shrinking or not. Need for it was
expressed in the past threads.

Also, the case of multiple buckets mapping to the same lock is also
present in the expanding logic so removing the shrinking logic would
not remove the need for these types of checks.

> >  		spin_lock_bh(old_bucket_lock1);
> > -		spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> > -		spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> > +
> > +		/* Depending on the lock per buckets mapping, the bucket in
> > +		 * the lower and upper region may map to the same lock.
> > +		 */
> > +		if (old_bucket_lock1 != old_bucket_lock2) {
> > +			spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> > +			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> > +		} else {
> > +			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
> > +		}
> 
> Acquiring 3 locks of much the same type looks like a locking hierarchy
> violation just waiting to happen.

I'm not claiming it's extremely pretty, lockless lookup with deferred
resizing doesn't come for free ;-) If you have a suggestion on how to
implement this differently I'm all ears. That said, it's well isolated
and the user of rhashtable does not have to deal with it. All code paths
which take multiple locks are mutually exclusive to each other (ht->mutex).

^ permalink raw reply

* Re: [PATCH net-next] rhashtable: unnecessary to use delayed work
From: Thomas Graf @ 2015-01-13 11:26 UTC (permalink / raw)
  To: Ying Xue; +Cc: davem, netdev
In-Reply-To: <54B4EA06.8010507@windriver.com>

On 01/13/15 at 05:48pm, Ying Xue wrote:
> On 01/13/2015 05:35 PM, Thomas Graf wrote:
> > If you agree we can explain this shortly in the commit message and add:
> > Fixes: 97defe1 ("rhashtable: Per bucket locks & deferred expansion/shrinking")
> 
> OK, I will deliver the next version.
> 
> By the way, I think we should check the following condition before call
> cancel_work_sync(), otherwise, we may cancel an uninitialized work.
> 
> (ht->p.grow_decision || ht->p.shrink_decision)
> 
> What do you think?

+1

^ permalink raw reply

* Re: [PATCH 2/6] vxlan: Group Policy extension
From: Thomas Graf @ 2015-01-13 11:32 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David Miller, Jesse Gross, Stephen Hemminger, Pravin B Shelar,
	Alexei Starovoitov, Linux Netdev List, dev@openvswitch.org
In-Reply-To: <CA+mtBx8-rMYSkj2KKXBq60KeF=V=iCG2K0HgVxO5JMVK1yhwCg@mail.gmail.com>

On 01/12/15 at 06:28pm, Tom Herbert wrote:
> On Mon, Jan 12, 2015 at 5:03 PM, Thomas Graf <tgraf@suug.ch> wrote:
> >>
> >> Creating a level of indirection for extensions seems overly
> >> complicated to me. Why not just define IFLA_VXLAN_GBP as just another
> >> enum above?
> >
> > I think it's cleaner to group them in a nested attribute.
> > It clearly separates the optional extensions from the base
> > attributes. RCO, GPE, GBP can all live in there.
> 
> This is inconsistent with similar things in GRE and GUE. For instance,
> GRE keyid is set as its own attribute. It just seems like this adding
> more code to the driver than is necessary for the functionality
> needed.

The major difference here is that we have to consider backwards
compatibility specifically for VXLAN. Your initial feedback on GPE
actually led me to how I implemented GBP.

I think the axioms we want to establish are as follows:
 1. Extensions need to be explicitly enabled by the user. A previously
    dropped frame should only be processed if the user explitly asks
    for it.
 2. As a consequence: only share a VLXAN UDP port if the enabled
    extensions match (vxlan_sock_add), e.g. user A might want RCO
    but user B might be unaware. They cannot share the same UDP port.

The 2nd lead me to introduce the 'exts' member to vxlan_sock so we can
compare it in vxlan_find_sock() and only share a UDP port if the
enabled extensions match.

Your patch currently implements (1) but not (2).

^ permalink raw reply

* [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2015-01-13
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene

This series contains updates to i40e and i40evf.

Mitch provides a fix for i40e to move the call to pci_disable_sriov() so
that it is called earlier to ensure that the PF driver won't free VF
resources before the VF remove routine can complete.  Also cleans up
redundant and duplicate code in the i40evf.  Refactors the i40evf
shutdown code and let the watchdog take care of shutting things down.
Fix a possible memory leak, if we are using VLANs and the communication
with the PF fail during shutdown.  On some versions of the firmware, the
VF admin send queue may become stalled.  In this case, the easiest
solution is to place another descriptor on the queue and the firmware
will then process both requests.

Greg adds a warning when the NPAR enabled partitions detected a link speed
less than 10 Gpbs.

Vasu removes redundant VN2VN MAC address which were already added by
the FCoE stack.

Shannon adds code to find how many partitions there are per port and
what is the current partition_id when in NPAR mode.  In multifunction
mode, make sure we only allow SR/IOV on the master PF of a port and
only allow partition 1 to set WoL, speed and flow control.

Kamil adds code to read the PBA block from shadow RAM and returns
the part number in a string format.

Catherine provides a fix to check if link state and link speed has
changed before exiting link event

The following are changes since commit d2c60b1350c9a3eb7ed407c18f50306762365646:
  Merge branch 'tuntap_queues'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master

Catherine Sullivan (1):
  i40e: Don't exit link event early if link speed has changed

Greg Rose (1):
  i40e: Add warning for NPAR partitions with link speed less than 10Gbps

Kamil Krawczyk (1):
  i40e: Adding function for reading PBA String

Mitch A Williams (8):
  i40e: disable IOV before freeing resources
  i40evf: remove redundant code
  i40evf: Remove some scary log messages
  i40evf: refactor shutdown code
  i40evf: remove leftover VLAN filters
  i40evf: don't fire traffic IRQs when the interface is down
  i40evf: enable interrupt 0 appropriately
  i40evf: kick a stalled admin queue

Shannon Nelson (3):
  i40e/i40evf: find partition_id in npar mode
  i40e: limit WoL and link settings to partition 1
  i40e: limit sriov to partition 1 of NPAR configurations

Vasu Dev (1):
  i40e: remove VN2VN related mac filters

 drivers/net/ethernet/intel/i40e/i40e_common.c      | 125 +++++++++++++++++++++
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c     |  43 ++++++-
 drivers/net/ethernet/intel/i40e/i40e_fcoe.c        |   2 -
 drivers/net/ethernet/intel/i40e/i40e_main.c        |  15 ++-
 drivers/net/ethernet/intel/i40e/i40e_prototype.h   |   5 +
 drivers/net/ethernet/intel/i40e/i40e_type.h        |   9 +-
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |  11 +-
 drivers/net/ethernet/intel/i40evf/i40e_type.h      |   7 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    |  94 ++++++++--------
 .../net/ethernet/intel/i40evf/i40evf_virtchnl.c    |   6 +-
 10 files changed, 256 insertions(+), 61 deletions(-)

-- 
1.9.3

^ permalink raw reply

* [net-next 02/15] i40evf: remove redundant code
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch A Williams <mitch.a.williams@intel.com>

These functions are redundant and duplicate functionality found in
i40evf_free_all_[tx|rx]_resources.

Change-ID: Ia199908926d7a1a4b8247f75f89b5da24c9b149c
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 27 -------------------------
 1 file changed, 27 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index cabaf59..ee0db59 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -947,30 +947,6 @@ static int i40evf_up_complete(struct i40evf_adapter *adapter)
 }
 
 /**
- * i40evf_clean_all_rx_rings - Free Rx Buffers for all queues
- * @adapter: board private structure
- **/
-static void i40evf_clean_all_rx_rings(struct i40evf_adapter *adapter)
-{
-	int i;
-
-	for (i = 0; i < adapter->num_active_queues; i++)
-		i40evf_clean_rx_ring(adapter->rx_rings[i]);
-}
-
-/**
- * i40evf_clean_all_tx_rings - Free Tx Buffers for all queues
- * @adapter: board private structure
- **/
-static void i40evf_clean_all_tx_rings(struct i40evf_adapter *adapter)
-{
-	int i;
-
-	for (i = 0; i < adapter->num_active_queues; i++)
-		i40evf_clean_tx_ring(adapter->tx_rings[i]);
-}
-
-/**
  * i40e_down - Shutdown the connection processing
  * @adapter: board private structure
  **/
@@ -1008,9 +984,6 @@ void i40evf_down(struct i40evf_adapter *adapter)
 	i40evf_napi_disable_all(adapter);
 
 	netif_carrier_off(netdev);
-
-	i40evf_clean_all_tx_rings(adapter);
-	i40evf_clean_all_rx_rings(adapter);
 }
 
 /**
-- 
1.9.3

^ permalink raw reply related

* [net-next 03/15] i40evf: Remove some scary log messages
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch A Williams <mitch.a.williams@intel.com>

These messages may be triggered during normal init of the driver if the
PF or FW take a long time to respond. There's nothing really wrong, so
don't freak people out logging messages.

If the communication channel really is dead, then we'll retry a few
times and give up. This will log a different more scary message that
should cause consternation. This allows the user to more easily detect a
genuine failure.

Change-ID: I6e2b758d4234a3a09c1015c82c8f2442a697cbdb
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c     | 4 ----
 drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c | 3 ---
 2 files changed, 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index ee0db59..f8f1d26 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2026,10 +2026,7 @@ static void i40evf_init_task(struct work_struct *work)
 		/* aq msg sent, awaiting reply */
 		err = i40evf_verify_api_ver(adapter);
 		if (err) {
-			dev_info(&pdev->dev, "Unable to verify API version (%d), retrying\n",
-				 err);
 			if (err == I40E_ERR_ADMIN_QUEUE_NO_WORK) {
-				dev_info(&pdev->dev, "Resending request\n");
 				err = i40evf_send_api_ver(adapter);
 			}
 			goto err;
@@ -2054,7 +2051,6 @@ static void i40evf_init_task(struct work_struct *work)
 		}
 		err = i40evf_get_vf_config(adapter);
 		if (err == I40E_ERR_ADMIN_QUEUE_NO_WORK) {
-			dev_info(&pdev->dev, "Resending VF config request\n");
 			err = i40evf_send_vf_config_msg(adapter);
 			goto err;
 		}
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index 5fde5a7..3aeb633 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -715,9 +715,6 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
 		}
 		return;
 	}
-	if (v_opcode != adapter->current_op)
-		dev_info(&adapter->pdev->dev, "Pending op is %d, received %d\n",
-			 adapter->current_op, v_opcode);
 	if (v_retval) {
 		dev_err(&adapter->pdev->dev, "%s: PF returned error %d to our request %d\n",
 			__func__, v_retval, v_opcode);
-- 
1.9.3

^ permalink raw reply related

* [net-next 01/15] i40e: disable IOV before freeing resources
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch A Williams <mitch.a.williams@intel.com>

If VF drivers are loaded in the host OS, the call to pci_disable_sriov()
will cause these drivers' remove routines to be called. If the PF driver
has already freed VF resources before this happens, then the VF remove
routine can't properly communicate with the PF driver causing all sorts
of mayhem and error messages and hurt feelings.

To fix this, we move the call to pci_disable_sriov() up to the top of
the function and let it complete before freeing any VF resources.

Change-ID: I397c3997a00f6408e32b7735273911e499600236
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 5bae895..044019b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -791,10 +791,18 @@ void i40e_free_vfs(struct i40e_pf *pf)
 	if (!pf->vf)
 		return;
 
+	/* Disable IOV before freeing resources. This lets any VF drivers
+	 * running in the host get themselves cleaned up before we yank
+	 * the carpet out from underneath their feet.
+	 */
+	if (!pci_vfs_assigned(pf->pdev))
+		pci_disable_sriov(pf->pdev);
+
+	msleep(20); /* let any messages in transit get finished up */
+
 	/* Disable interrupt 0 so we don't try to handle the VFLR. */
 	i40e_irq_dynamic_disable_icr0(pf);
 
-	mdelay(10); /* let any messages in transit get finished up */
 	/* free up vf resources */
 	tmp = pf->num_alloc_vfs;
 	pf->num_alloc_vfs = 0;
@@ -813,7 +821,6 @@ void i40e_free_vfs(struct i40e_pf *pf)
 	 * before this function ever gets called.
 	 */
 	if (!pci_vfs_assigned(pf->pdev)) {
-		pci_disable_sriov(pf->pdev);
 		/* Acknowledge VFLR for all VFS. Without this, VFs will fail to
 		 * work correctly when SR-IOV gets re-enabled.
 		 */
-- 
1.9.3

^ permalink raw reply related

* [net-next 05/15] i40evf: remove leftover VLAN filters
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch A Williams <mitch.a.williams@intel.com>

If we're using VLANs and communications with the PF fail during
shutdown, we will leak memory because not all of the VLAN filters will
be removed. To eliminate this possibility, go through the list again
right before the module is removed and delete any leftover entries.

Change-ID: Id3b5315c47ca0a61ae123a96ff345d010bc41aed
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 5a2ed90..a21fb88 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2463,6 +2463,10 @@ static void i40evf_remove(struct pci_dev *pdev)
 		list_del(&f->list);
 		kfree(f);
 	}
+	list_for_each_entry_safe(f, ftmp, &adapter->vlan_filter_list, list) {
+		list_del(&f->list);
+		kfree(f);
+	}
 
 	free_netdev(netdev);
 
-- 
1.9.3

^ permalink raw reply related

* [net-next 04/15] i40evf: refactor shutdown code
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch A Williams <mitch.a.williams@intel.com>

If the VF driver is running in the host, the shutdown code is completely
broken. We cannot wait in our down routine for the PF to respond to our
requests, as its admin queue task will never run while we hold the lock.

Instead, we schedule operations, then let the watchdog take care of
shutting things down. If the driver is being removed, then wait in the
remove routine until the watchdog is done before continuing.

Change-ID: I93a58d17389e8d6b58f21e430b56ed7b4590b2c5
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 29 ++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index f8f1d26..5a2ed90 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -958,6 +958,12 @@ void i40evf_down(struct i40evf_adapter *adapter)
 	if (adapter->state == __I40EVF_DOWN)
 		return;
 
+	while (test_and_set_bit(__I40EVF_IN_CRITICAL_TASK,
+				&adapter->crit_section))
+		usleep_range(500, 1000);
+
+	i40evf_irq_disable(adapter);
+
 	/* remove all MAC filters */
 	list_for_each_entry(f, &adapter->mac_filter_list, list) {
 		f->remove = true;
@@ -968,22 +974,27 @@ void i40evf_down(struct i40evf_adapter *adapter)
 	}
 	if (!(adapter->flags & I40EVF_FLAG_PF_COMMS_FAILED) &&
 	    adapter->state != __I40EVF_RESETTING) {
-		adapter->aq_required |= I40EVF_FLAG_AQ_DEL_MAC_FILTER;
+		/* cancel any current operation */
+		adapter->current_op = I40E_VIRTCHNL_OP_UNKNOWN;
+		adapter->aq_pending = 0;
+		/* Schedule operations to close down the HW. Don't wait
+		 * here for this to complete. The watchdog is still running
+		 * and it will take care of this.
+		 */
+		adapter->aq_required = I40EVF_FLAG_AQ_DEL_MAC_FILTER;
 		adapter->aq_required |= I40EVF_FLAG_AQ_DEL_VLAN_FILTER;
-		/* disable receives */
 		adapter->aq_required |= I40EVF_FLAG_AQ_DISABLE_QUEUES;
-		mod_timer_pending(&adapter->watchdog_timer, jiffies + 1);
-		msleep(20);
 	}
 	netif_tx_disable(netdev);
 
 	netif_tx_stop_all_queues(netdev);
 
-	i40evf_irq_disable(adapter);
-
 	i40evf_napi_disable_all(adapter);
 
+	msleep(20);
+
 	netif_carrier_off(netdev);
+	clear_bit(__I40EVF_IN_CRITICAL_TASK, &adapter->crit_section);
 }
 
 /**
@@ -2409,6 +2420,7 @@ static void i40evf_remove(struct pci_dev *pdev)
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
 	struct i40evf_mac_filter *f, *ftmp;
 	struct i40e_hw *hw = &adapter->hw;
+	int count = 50;
 
 	cancel_delayed_work_sync(&adapter->init_task);
 	cancel_work_sync(&adapter->reset_task);
@@ -2417,6 +2429,11 @@ static void i40evf_remove(struct pci_dev *pdev)
 		unregister_netdev(netdev);
 		adapter->netdev_registered = false;
 	}
+	while (count-- && adapter->aq_required)
+		msleep(50);
+
+	if (count < 0)
+		dev_err(&pdev->dev, "Timed out waiting for PF driver.\n");
 	adapter->state = __I40EVF_REMOVE;
 
 	if (adapter->msix_entries) {
-- 
1.9.3

^ permalink raw reply related

* [net-next 07/15] i40evf: enable interrupt 0 appropriately
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch A Williams <mitch.a.williams@intel.com>

Don't enable vector 0 in the ISR, just schedule the adminq task and let
it enable the vector. This prevents the task from being called
reentrantly. Make sure that the vector is enabled on all exit paths of
the adminq task, including error exits.

Change-ID: I53f3d14f91ed7a9e90291ea41c681122a5eca5b5
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index d55c4a9..6d5a0e6 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -313,10 +313,6 @@ static irqreturn_t i40evf_msix_aq(int irq, void *data)
 	val = val | I40E_PFINT_DYN_CTL0_CLEARPBA_MASK;
 	wr32(hw, I40E_VFINT_DYN_CTL01, val);
 
-	/* re-enable interrupt causes */
-	wr32(hw, I40E_VFINT_ICR0_ENA1, ena_mask);
-	wr32(hw, I40E_VFINT_DYN_CTL01, I40E_VFINT_DYN_CTL01_INTENA_MASK);
-
 	/* schedule work on the private workqueue */
 	schedule_work(&adapter->adminq_task);
 
@@ -1620,12 +1616,12 @@ static void i40evf_adminq_task(struct work_struct *work)
 	u16 pending;
 
 	if (adapter->flags & I40EVF_FLAG_PF_COMMS_FAILED)
-		return;
+		goto out;
 
 	event.buf_len = I40EVF_MAX_AQ_BUF_SIZE;
 	event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
 	if (!event.msg_buf)
-		return;
+		goto out;
 
 	v_msg = (struct i40e_virtchnl_msg *)&event.desc;
 	do {
@@ -1675,10 +1671,10 @@ static void i40evf_adminq_task(struct work_struct *work)
 	if (oldval != val)
 		wr32(hw, hw->aq.asq.len, val);
 
+	kfree(event.msg_buf);
+out:
 	/* re-enable Admin queue interrupt cause */
 	i40evf_misc_irq_enable(adapter);
-
-	kfree(event.msg_buf);
 }
 
 /**
-- 
1.9.3

^ permalink raw reply related

* [net-next 06/15] i40evf: don't fire traffic IRQs when the interface is down
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch A Williams <mitch.a.williams@intel.com>

There is always a possibility that MSI-X interrupts can get lost. To
keep this problem from stalling the driver, we fire all of our MSI-X
vectors during the watchdog routine. However, we should not fire the
traffic vectors when the interface is closed. In this case, just fire
vector 0, which is used for admin queue events.

As a result, we do not enable the interrupt cause for vector 0. This
can cause the admin queue handler to be called reentrantly, which
causes a scary "critical section violation" message to be logged,
even though no real damage is done.

Change-ID: Ic43a5184708ab2cb9a23fca7dedd808a46717795
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index a21fb88..d55c4a9 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1385,11 +1385,14 @@ static void i40evf_watchdog_task(struct work_struct *work)
 
 	if (adapter->state == __I40EVF_RUNNING)
 		i40evf_request_stats(adapter);
-
-	i40evf_irq_enable(adapter, true);
-	i40evf_fire_sw_int(adapter, 0xFF);
-
 watchdog_done:
+	if (adapter->state == __I40EVF_RUNNING) {
+		i40evf_irq_enable_queues(adapter, ~0);
+		i40evf_fire_sw_int(adapter, 0xFF);
+	} else {
+		i40evf_fire_sw_int(adapter, 0x1);
+	}
+
 	clear_bit(__I40EVF_IN_CRITICAL_TASK, &adapter->crit_section);
 restart_watchdog:
 	if (adapter->state == __I40EVF_REMOVE)
-- 
1.9.3

^ permalink raw reply related

* [net-next 09/15] i40e: Add warning for NPAR partitions with link speed less than 10Gbps
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Greg Rose, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Greg Rose <gregory.v.rose@intel.com>

NPAR enabled partitions should warn the user when detected link speed is
less than 10Gpbs.

Change-ID: I7728bb8ce279bf0f4f755d78d7071074a4eb5f69
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index a5f2660..7c14973 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -4557,6 +4557,15 @@ static void i40e_print_link_message(struct i40e_vsi *vsi, bool isup)
 		return;
 	}
 
+	/* Warn user if link speed on NPAR enabled partition is not at
+	 * least 10GB
+	 */
+	if (vsi->back->hw.func_caps.npar_enable &&
+	    (vsi->back->hw.phy.link_info.link_speed == I40E_LINK_SPEED_1GB ||
+	     vsi->back->hw.phy.link_info.link_speed == I40E_LINK_SPEED_100MB))
+		netdev_warn(vsi->netdev,
+			    "The partition detected link speed that is less than 10Gbps\n");
+
 	switch (vsi->back->hw.phy.link_info.link_speed) {
 	case I40E_LINK_SPEED_40GB:
 		strlcpy(speed, "40 Gbps", SPEED_SIZE);
-- 
1.9.3

^ permalink raw reply related

* [net-next 08/15] i40evf: kick a stalled admin queue
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch A Williams <mitch.a.williams@intel.com>

On some versions of the firmware, the VF admin send queue may become
stalled. In this case, the easiest solution is to just place another
descriptor on the queue; the firmware will then process both requests.

The early init code already accounts for this, but the runtime code does
not. In the watchdog task, check for the stall condition, and if it's
found, send our API version to the PF. When the PF replies, just ignore
the reply.

Change-ID: I380d78185a4f284d649c44d263e648afc9b4d50c
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c     | 7 ++++++-
 drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c | 3 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 6d5a0e6..c663182 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1336,8 +1336,13 @@ static void i40evf_watchdog_task(struct work_struct *work)
 	/* Process admin queue tasks. After init, everything gets done
 	 * here so we don't race on the admin queue.
 	 */
-	if (adapter->aq_pending)
+	if (adapter->aq_pending) {
+		if (!i40evf_asq_done(hw)) {
+			dev_dbg(&adapter->pdev->dev, "Admin queue timeout\n");
+			i40evf_send_api_ver(adapter);
+		}
 		goto watchdog_done;
+	}
 
 	if (adapter->aq_required & I40EVF_FLAG_AQ_MAP_VECTORS) {
 		i40evf_map_queues(adapter);
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index 3aeb633..3f0c85e 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -720,6 +720,9 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
 			__func__, v_retval, v_opcode);
 	}
 	switch (v_opcode) {
+	case I40E_VIRTCHNL_OP_VERSION:
+		/* no action, but also not an error */
+		break;
 	case I40E_VIRTCHNL_OP_GET_STATS: {
 		struct i40e_eth_stats *stats =
 			(struct i40e_eth_stats *)msg;
-- 
1.9.3

^ permalink raw reply related

* [net-next 11/15] i40e/i40evf: find partition_id in npar mode
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Shannon Nelson, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Shannon Nelson <shannon.nelson@intel.com>

When in NPAR mode the driver instance might be controlling the base
partition or one of the other "fake" PFs.  There are some things that
can only be done by the base partition, aka partition_id 1.  This code
does a bit of work to find how many partitions are there per port and
what is the current partition_id.

Change-ID: Iba427f020a1983d02147d86f121b3627e20ee21d
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_common.c    | 66 ++++++++++++++++++++++++
 drivers/net/ethernet/intel/i40e/i40e_prototype.h |  3 ++
 drivers/net/ethernet/intel/i40e/i40e_type.h      |  7 ++-
 drivers/net/ethernet/intel/i40evf/i40e_type.h    |  7 ++-
 4 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 3d741ee..b16fc03 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -2035,6 +2035,43 @@ i40e_status i40e_aq_send_msg_to_vf(struct i40e_hw *hw, u16 vfid,
 }
 
 /**
+ * i40e_aq_debug_read_register
+ * @hw: pointer to the hw struct
+ * @reg_addr: register address
+ * @reg_val: register value
+ * @cmd_details: pointer to command details structure or NULL
+ *
+ * Read the register using the admin queue commands
+ **/
+i40e_status i40e_aq_debug_read_register(struct i40e_hw *hw,
+				u32  reg_addr, u64 *reg_val,
+				struct i40e_asq_cmd_details *cmd_details)
+{
+	struct i40e_aq_desc desc;
+	struct i40e_aqc_debug_reg_read_write *cmd_resp =
+		(struct i40e_aqc_debug_reg_read_write *)&desc.params.raw;
+	i40e_status status;
+
+	if (reg_val == NULL)
+		return I40E_ERR_PARAM;
+
+	i40e_fill_default_direct_cmd_desc(&desc,
+					  i40e_aqc_opc_debug_read_reg);
+
+	cmd_resp->address = cpu_to_le32(reg_addr);
+
+	status = i40e_asq_send_command(hw, &desc, NULL, 0, cmd_details);
+
+	if (!status) {
+		*reg_val = ((u64)cmd_resp->value_high << 32) |
+			    (u64)cmd_resp->value_low;
+		*reg_val = le64_to_cpu(*reg_val);
+	}
+
+	return status;
+}
+
+/**
  * i40e_aq_debug_write_register
  * @hw: pointer to the hw struct
  * @reg_addr: register address
@@ -2292,6 +2329,7 @@ static void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
 				     enum i40e_admin_queue_opc list_type_opc)
 {
 	struct i40e_aqc_list_capabilities_element_resp *cap;
+	u32 valid_functions, num_functions;
 	u32 number, logical_id, phys_id;
 	struct i40e_hw_capabilities *p;
 	u32 i = 0;
@@ -2427,6 +2465,34 @@ static void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
 	if (p->npar_enable || p->mfp_mode_1)
 		p->fcoe = false;
 
+	/* count the enabled ports (aka the "not disabled" ports) */
+	hw->num_ports = 0;
+	for (i = 0; i < 4; i++) {
+		u32 port_cfg_reg = I40E_PRTGEN_CNF + (4 * i);
+		u64 port_cfg = 0;
+
+		/* use AQ read to get the physical register offset instead
+		 * of the port relative offset
+		 */
+		i40e_aq_debug_read_register(hw, port_cfg_reg, &port_cfg, NULL);
+		if (!(port_cfg & I40E_PRTGEN_CNF_PORT_DIS_MASK))
+			hw->num_ports++;
+	}
+
+	valid_functions = p->valid_functions;
+	num_functions = 0;
+	while (valid_functions) {
+		if (valid_functions & 1)
+			num_functions++;
+		valid_functions >>= 1;
+	}
+
+	/* partition id is 1-based, and functions are evenly spread
+	 * across the ports as partitions
+	 */
+	hw->partition_id = (hw->pf_id / hw->num_ports) + 1;
+	hw->num_partitions = num_functions / hw->num_ports;
+
 	/* additional HW specific goodies that might
 	 * someday be HW version specific
 	 */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_prototype.h b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
index 2fb4306..d1c7d63 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_prototype.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
@@ -71,6 +71,9 @@ i40e_status i40e_aq_get_firmware_version(struct i40e_hw *hw,
 i40e_status i40e_aq_debug_write_register(struct i40e_hw *hw,
 					u32 reg_addr, u64 reg_val,
 					struct i40e_asq_cmd_details *cmd_details);
+i40e_status i40e_aq_debug_read_register(struct i40e_hw *hw,
+				u32  reg_addr, u64 *reg_val,
+				struct i40e_asq_cmd_details *cmd_details);
 i40e_status i40e_aq_set_phy_debug(struct i40e_hw *hw, u8 cmd_flags,
 				struct i40e_asq_cmd_details *cmd_details);
 i40e_status i40e_aq_set_default_vsi(struct i40e_hw *hw, u16 vsi_id,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index c1f2eb9..611de3e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -431,7 +431,7 @@ struct i40e_hw {
 	u8 __iomem *hw_addr;
 	void *back;
 
-	/* function pointer structs */
+	/* subsystem structs */
 	struct i40e_phy_info phy;
 	struct i40e_mac_info mac;
 	struct i40e_bus_info bus;
@@ -458,6 +458,11 @@ struct i40e_hw {
 	u8  pf_id;
 	u16 main_vsi_seid;
 
+	/* for multi-function MACs */
+	u16 partition_id;
+	u16 num_partitions;
+	u16 num_ports;
+
 	/* Closest numa node to the device */
 	u16 numa_node;
 
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_type.h b/drivers/net/ethernet/intel/i40evf/i40e_type.h
index 68aec11..d1c2b5a 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_type.h
@@ -425,7 +425,7 @@ struct i40e_hw {
 	u8 __iomem *hw_addr;
 	void *back;
 
-	/* function pointer structs */
+	/* subsystem structs */
 	struct i40e_phy_info phy;
 	struct i40e_mac_info mac;
 	struct i40e_bus_info bus;
@@ -452,6 +452,11 @@ struct i40e_hw {
 	u8  pf_id;
 	u16 main_vsi_seid;
 
+	/* for multi-function MACs */
+	u16 partition_id;
+	u16 num_partitions;
+	u16 num_ports;
+
 	/* Closest numa node to the device */
 	u16 numa_node;
 
-- 
1.9.3

^ permalink raw reply related

* [net-next 12/15] i40e: Adding function for reading PBA String
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Kamil Krawczyk, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Kamil Krawczyk <kamil.krawczyk@intel.com>

Function will read PBA Block from Shadow RAM and return it in a string format.

Change-ID: I4ee7059f6e21bd0eba38687da15e772e0b4ab36e
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_common.c    | 59 ++++++++++++++++++++++++
 drivers/net/ethernet/intel/i40e/i40e_prototype.h |  2 +
 drivers/net/ethernet/intel/i40e/i40e_type.h      |  2 +
 3 files changed, 63 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index b16fc03..4f4d9d1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -742,6 +742,65 @@ i40e_status i40e_get_san_mac_addr(struct i40e_hw *hw, u8 *mac_addr)
 #endif
 
 /**
+ *  i40e_read_pba_string - Reads part number string from EEPROM
+ *  @hw: pointer to hardware structure
+ *  @pba_num: stores the part number string from the EEPROM
+ *  @pba_num_size: part number string buffer length
+ *
+ *  Reads the part number string from the EEPROM.
+ **/
+i40e_status i40e_read_pba_string(struct i40e_hw *hw, u8 *pba_num,
+				 u32 pba_num_size)
+{
+	i40e_status status = 0;
+	u16 pba_word = 0;
+	u16 pba_size = 0;
+	u16 pba_ptr = 0;
+	u16 i = 0;
+
+	status = i40e_read_nvm_word(hw, I40E_SR_PBA_FLAGS, &pba_word);
+	if (status || (pba_word != 0xFAFA)) {
+		hw_dbg(hw, "Failed to read PBA flags or flag is invalid.\n");
+		return status;
+	}
+
+	status = i40e_read_nvm_word(hw, I40E_SR_PBA_BLOCK_PTR, &pba_ptr);
+	if (status) {
+		hw_dbg(hw, "Failed to read PBA Block pointer.\n");
+		return status;
+	}
+
+	status = i40e_read_nvm_word(hw, pba_ptr, &pba_size);
+	if (status) {
+		hw_dbg(hw, "Failed to read PBA Block size.\n");
+		return status;
+	}
+
+	/* Subtract one to get PBA word count (PBA Size word is included in
+	 * total size)
+	 */
+	pba_size--;
+	if (pba_num_size < (((u32)pba_size * 2) + 1)) {
+		hw_dbg(hw, "Buffer to small for PBA data.\n");
+		return I40E_ERR_PARAM;
+	}
+
+	for (i = 0; i < pba_size; i++) {
+		status = i40e_read_nvm_word(hw, (pba_ptr + 1) + i, &pba_word);
+		if (status) {
+			hw_dbg(hw, "Failed to read PBA Block word %d.\n", i);
+			return status;
+		}
+
+		pba_num[(i * 2)] = (pba_word >> 8) & 0xFF;
+		pba_num[(i * 2) + 1] = pba_word & 0xFF;
+	}
+	pba_num[(pba_size * 2)] = '\0';
+
+	return status;
+}
+
+/**
  * i40e_get_media_type - Gets media type
  * @hw: pointer to the hardware structure
  **/
diff --git a/drivers/net/ethernet/intel/i40e/i40e_prototype.h b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
index d1c7d63..68e852a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_prototype.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
@@ -248,6 +248,8 @@ void i40e_clear_pxe_mode(struct i40e_hw *hw);
 bool i40e_get_link_status(struct i40e_hw *hw);
 i40e_status i40e_get_mac_addr(struct i40e_hw *hw, u8 *mac_addr);
 i40e_status i40e_get_port_mac_addr(struct i40e_hw *hw, u8 *mac_addr);
+i40e_status i40e_read_pba_string(struct i40e_hw *hw, u8 *pba_num,
+				 u32 pba_num_size);
 i40e_status i40e_validate_mac_addr(u8 *mac_addr);
 void i40e_pre_tx_queue_cfg(struct i40e_hw *hw, u32 queue, bool enable);
 #ifdef I40E_FCOE
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index 611de3e..ff121fe 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -1140,6 +1140,8 @@ struct i40e_hw_port_stats {
 /* Checksum and Shadow RAM pointers */
 #define I40E_SR_NVM_CONTROL_WORD		0x00
 #define I40E_SR_EMP_MODULE_PTR			0x0F
+#define I40E_SR_PBA_FLAGS			0x15
+#define I40E_SR_PBA_BLOCK_PTR			0x16
 #define I40E_SR_NVM_IMAGE_VERSION		0x18
 #define I40E_SR_NVM_WAKE_ON_LAN			0x19
 #define I40E_SR_ALTERNATE_SAN_MAC_ADDRESS_PTR	0x27
-- 
1.9.3

^ permalink raw reply related

* [net-next 10/15] i40e: remove VN2VN related mac filters
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Vasu Dev, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Vasu Dev <vasu.dev@intel.com>

These mac address already added by FCoE stack above netdev,
therefore adding them here is redundant.

Change-ID: Ia5b59f426f57efd20f8945f7c6cc5d741fbe06e5
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_fcoe.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_fcoe.c b/drivers/net/ethernet/intel/i40e/i40e_fcoe.c
index a8b8bd9..2cd841b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_fcoe.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_fcoe.c
@@ -1515,8 +1515,6 @@ void i40e_fcoe_config_netdev(struct net_device *netdev, struct i40e_vsi *vsi)
 	i40e_add_filter(vsi, (u8[6]) FC_FCOE_FLOGI_MAC, 0, false, false);
 	i40e_add_filter(vsi, FIP_ALL_FCOE_MACS, 0, false, false);
 	i40e_add_filter(vsi, FIP_ALL_ENODE_MACS, 0, false, false);
-	i40e_add_filter(vsi, FIP_ALL_VN2VN_MACS, 0, false, false);
-	i40e_add_filter(vsi, FIP_ALL_P2P_MACS, 0, false, false);
 
 	/* use san mac */
 	ether_addr_copy(netdev->dev_addr, hw->mac.san_addr);
-- 
1.9.3

^ permalink raw reply related

* [net-next 13/15] i40e: limit WoL and link settings to partition 1
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Shannon Nelson, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Shannon Nelson <shannon.nelson@intel.com>

When in multi-function mode, e.g. Dell's NPAR, only partition 1
of each MAC is allowed to set WoL, speed, and flow control.

Change-ID: I87a9debc7479361c55a71f0120294ea319f23588
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 43 +++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 951e876..b8230dc 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -219,6 +219,16 @@ static const char i40e_gstrings_test[][ETH_GSTRING_LEN] = {
 #define I40E_TEST_LEN (sizeof(i40e_gstrings_test) / ETH_GSTRING_LEN)
 
 /**
+ * i40e_partition_setting_complaint - generic complaint for MFP restriction
+ * @pf: the PF struct
+ **/
+static void i40e_partition_setting_complaint(struct i40e_pf *pf)
+{
+	dev_info(&pf->pdev->dev,
+		 "The link settings are allowed to be changed only from the first partition of a given port. Please switch to the first partition in order to change the setting.\n");
+}
+
+/**
  * i40e_get_settings - Get Link Speed and Duplex settings
  * @netdev: network interface device structure
  * @ecmd: ethtool command
@@ -485,6 +495,14 @@ static int i40e_set_settings(struct net_device *netdev,
 	u8 autoneg;
 	u32 advertise;
 
+	/* Changing port settings is not supported if this isn't the
+	 * port's controlling PF
+	 */
+	if (hw->partition_id != 1) {
+		i40e_partition_setting_complaint(pf);
+		return -EOPNOTSUPP;
+	}
+
 	if (vsi != pf->vsi[pf->lan_vsi])
 		return -EOPNOTSUPP;
 
@@ -687,6 +705,14 @@ static int i40e_set_pauseparam(struct net_device *netdev,
 	u8 aq_failures;
 	int err = 0;
 
+	/* Changing the port's flow control is not supported if this isn't the
+	 * port's controlling PF
+	 */
+	if (hw->partition_id != 1) {
+		i40e_partition_setting_complaint(pf);
+		return -EOPNOTSUPP;
+	}
+
 	if (vsi != pf->vsi[pf->lan_vsi])
 		return -EOPNOTSUPP;
 
@@ -1503,7 +1529,7 @@ static void i40e_get_wol(struct net_device *netdev,
 
 	/* NVM bit on means WoL disabled for the port */
 	i40e_read_nvm_word(hw, I40E_SR_NVM_WAKE_ON_LAN, &wol_nvm_bits);
-	if ((1 << hw->port) & wol_nvm_bits) {
+	if ((1 << hw->port) & wol_nvm_bits || hw->partition_id != 1) {
 		wol->supported = 0;
 		wol->wolopts = 0;
 	} else {
@@ -1512,13 +1538,28 @@ static void i40e_get_wol(struct net_device *netdev,
 	}
 }
 
+/**
+ * i40e_set_wol - set the WakeOnLAN configuration
+ * @netdev: the netdev in question
+ * @wol: the ethtool WoL setting data
+ **/
 static int i40e_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
 {
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_pf *pf = np->vsi->back;
+	struct i40e_vsi *vsi = np->vsi;
 	struct i40e_hw *hw = &pf->hw;
 	u16 wol_nvm_bits;
 
+	/* WoL not supported if this isn't the controlling PF on the port */
+	if (hw->partition_id != 1) {
+		i40e_partition_setting_complaint(pf);
+		return -EOPNOTSUPP;
+	}
+
+	if (vsi != pf->vsi[pf->lan_vsi])
+		return -EOPNOTSUPP;
+
 	/* NVM bit on means WoL disabled for the port */
 	i40e_read_nvm_word(hw, I40E_SR_NVM_WAKE_ON_LAN, &wol_nvm_bits);
 	if (((1 << hw->port) & wol_nvm_bits))
-- 
1.9.3

^ permalink raw reply related

* [net-next 14/15] i40e: Don't exit link event early if link speed has changed
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Catherine Sullivan, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Catherine Sullivan <catherine.sullivan@intel.com>

Previously we were only checking if the link up state had changed,
and if it hadn't exiting the link event routine early. We should
also check if speed has changed, and if it has, stay and finish
processing the link event.

Change-ID: I9c8e0991b3f0279108a7858898c3c5ce0a9856b8
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 7c14973..80430b0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5503,14 +5503,18 @@ static void i40e_link_event(struct i40e_pf *pf)
 {
 	bool new_link, old_link;
 	struct i40e_vsi *vsi = pf->vsi[pf->lan_vsi];
+	u8 new_link_speed, old_link_speed;
 
 	/* set this to force the get_link_status call to refresh state */
 	pf->hw.phy.get_link_info = true;
 
 	old_link = (pf->hw.phy.link_info_old.link_info & I40E_AQ_LINK_UP);
 	new_link = i40e_get_link_status(&pf->hw);
+	old_link_speed = pf->hw.phy.link_info_old.link_speed;
+	new_link_speed = pf->hw.phy.link_info.link_speed;
 
 	if (new_link == old_link &&
+	    new_link_speed == old_link_speed &&
 	    (test_bit(__I40E_DOWN, &vsi->state) ||
 	     new_link == netif_carrier_ok(vsi->netdev)))
 		return;
-- 
1.9.3

^ permalink raw reply related

* [net-next 15/15] i40e: limit sriov to partition 1 of NPAR configurations
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
  To: davem; +Cc: Shannon Nelson, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Shannon Nelson <shannon.nelson@intel.com>

Make sure we only allow SR/IOV on the master PF of a port in multifunction
mode.  This should be the case anyway based on the num_vfs configured in
the NVM, but this will help make sure there's no question.  If we're not
in multifunction mode the partition_id will always be 1.

Change-ID: I8b2592366fe6782f15301bde2ebd1d4da240109d
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 80430b0..f3b036d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -7319,7 +7319,7 @@ static int i40e_sw_init(struct i40e_pf *pf)
 
 #endif /* I40E_FCOE */
 #ifdef CONFIG_PCI_IOV
-	if (pf->hw.func_caps.num_vfs) {
+	if (pf->hw.func_caps.num_vfs && pf->hw.partition_id == 1) {
 		pf->num_vf_qps = I40E_DEFAULT_QUEUES_PER_VF;
 		pf->flags |= I40E_FLAG_SRIOV_ENABLED;
 		pf->num_req_vfs = min_t(int,
-- 
1.9.3

^ permalink raw reply related

* Re: [PATCH net-next v2 2/2] vxlan: Remote checksum offload
From: Thomas Graf @ 2015-01-13 11:44 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <20150113012626.GD20387@casper.infradead.org>

On 01/13/15 at 01:26am, Thomas Graf wrote:
> On 01/12/15 at 05:00pm, Tom Herbert wrote:
> > +	if ((flags & VXLAN_HF_RCO) && (vs->flags & VXLAN_F_REMCSUM_RX)) {
> > +		vxh = vxlan_remcsum(skb, vxh, sizeof(struct vxlanhdr), vni);
> > +		if (!vxh)
> > +			goto drop;
> > +
> > +		flags &= ~VXLAN_HF_RCO;
> > +		vni &= VXLAN_VID_MASK;
> > +	}
> 
> Nice.
> 
> Would you mind basing this on top off the extension framework being put
> in place by GBP? I think that all VXLAN extensions should be exposed as
> such in a universal way to user space.

Doing so would also fix the missing "don't share UDP port on extension
mismatch" functionality as explained in "[PATCH 2/6] vxlan: Group Policy
extension".

^ permalink raw reply

* Re: why are IPv6 addresses removed on link down
From: YOSHIFUJI Hideaki @ 2015-01-13 11:58 UTC (permalink / raw)
  To: Hannes Frederic Sowa, Stephen Hemminger
  Cc: hideaki.yoshifuji, David Ahern, netdev@vger.kernel.org
In-Reply-To: <1421145346.13626.12.camel@redhat.com>

Hi,

Hannes Frederic Sowa wrote:
> On Mo, 2015-01-12 at 23:10 -0800, Stephen Hemminger wrote:
>> On Mon, 12 Jan 2015 22:06:44 -0700
>> David Ahern <dsahern@gmail.com> wrote:
>>
>>> We noticed that IPv6 addresses are removed on a link down. e.g.,
>>>     ip link set dev eth1
>>>
>>>
>>> Looking at the code it appears to be this code path in addrconf.c:
>>>
>>>           case NETDEV_DOWN:
>>>           case NETDEV_UNREGISTER:
>>>                   /*
>>>                    *      Remove all addresses from this interface.
>>>                    */
>>>                   addrconf_ifdown(dev, event != NETDEV_DOWN);
>>>                   break;
>>>
>>> IPv4 addresses are NOT removed on a link down. Is there a particular
>>> reason IPv6 addresses are?
>>>
>>> Thanks,
>>> David
>>
>> See RFC's which describes how IPv6 does Duplicate Address Detection.
>> Address is not valid when link is down, since DAD is not possible.
>
> It should be no problem if the kernel would reacquire them on ifup and
> do proper DAD. We simply must not use them while the interface is dead
> (also making sure they don't get used for loopback routing).
>
> The problem the IPv6 addresses get removed is much more a historical
> artifact nowadays, I think. It is part of user space API and scripts
> deal with that already.

We might have another "detached" state which essintially drops
outgoing packets while link is down.  Just after recovering link,
we could start receiving packet from the link and perform optimistic
DAD. And then, after it succeeds, we may start sending packets.

Since "detached" state is like the state just before completing
Optimistic DAD, it is not so difficult to implement this extended
behavior, I guess.

-- 
Hideaki Yoshifuji <hideaki.yoshifuji@miraclelinux.com>
Technical Division, MIRACLE LINUX CORPORATION

^ permalink raw reply

* Re: [RFC PATCH v2 2/2] net: ixgbe: implement af_packet direct queue mappings
From: Hannes Frederic Sowa @ 2015-01-13 12:05 UTC (permalink / raw)
  To: John Fastabend
  Cc: netdev, danny.zhou, nhorman, dborkman, john.ronciak, brouer
In-Reply-To: <20150113043542.29985.15658.stgit@nitbit.x32>

On Mo, 2015-01-12 at 20:35 -0800, John Fastabend wrote:
> +static int
> +ixgbe_ndo_qpair_page_map(struct vm_area_struct *vma, struct net_device *dev)
> +{
> +	struct ixgbe_adapter *adapter = netdev_priv(dev);
> +	phys_addr_t phy_addr = pci_resource_start(adapter->pdev, 0);
> +	unsigned long pfn_rx = (phy_addr + RX_DESC_ADDR_OFFSET) >> PAGE_SHIFT;
> +	unsigned long pfn_tx = (phy_addr + TX_DESC_ADDR_OFFSET) >> PAGE_SHIFT;
> +	unsigned long dummy_page_phy;
> +	pgprot_t pre_vm_page_prot;
> +	unsigned long start;
> +	unsigned int i;
> +	int err;
> +
> +	if (!dummy_page_buf) {
> +		dummy_page_buf = kzalloc(PAGE_SIZE_4K, GFP_KERNEL);
> +		if (!dummy_page_buf)
> +			return -ENOMEM;
> +
> +		for (i = 0; i < PAGE_SIZE_4K / sizeof(unsigned int); i++)
> +			dummy_page_buf[i] = 0xdeadbeef;
> +	}
> +
> +	dummy_page_phy = virt_to_phys(dummy_page_buf);
> +	pre_vm_page_prot = vma->vm_page_prot;
> +	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +
> +	/* assume the vm_start is 4K aligned address */
> +	for (start = vma->vm_start;
> +	     start < vma->vm_end;
> +	     start += PAGE_SIZE_4K) {
> +		if (start == vma->vm_start + RX_DESC_ADDR_OFFSET) {
> +			err = remap_pfn_range(vma, start, pfn_rx, PAGE_SIZE_4K,
> +					      vma->vm_page_prot);
> +			if (err)
> +				return -EAGAIN;
> +		} else if (start == vma->vm_start + TX_DESC_ADDR_OFFSET) {
> +			err = remap_pfn_range(vma, start, pfn_tx, PAGE_SIZE_4K,
> +					      vma->vm_page_prot);
> +			if (err)
> +				return -EAGAIN;
> +		} else {
> +			unsigned long addr = dummy_page_phy > PAGE_SHIFT;

I guess you have forgotten to delete this line?

> +
> +			err = remap_pfn_range(vma, start, addr, PAGE_SIZE_4K,
> +					      pre_vm_page_prot);
> +			if (err)
> +				return -EAGAIN;
> +		}
> +	}
> +	return 0;
> +}

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox