Netdev List

Netdev List
 help / color / mirror / Atom feed

* [RFC PATCH 07/10] ixgbe: Add function for setting XPS queue mapping
From: Alexander Duyck @ 2012-06-30  0:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, jeffrey.t.kirsher, edumazet, bhutchings, therbert,
	alexander.duyck
In-Reply-To: <20120630000652.29939.11108.stgit@gitlad.jf.intel.com>

This change adds support for ixgbe to configure the XPS queue mapping on
load.  The result of this change is that on open we will now be resetting
the number of Tx queues, and then setting the default configuration for XPS
based on if ATR is enabled or disabled.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  |    3 +--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   18 ++++++++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index 72386fb..a43dae0 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -797,8 +797,7 @@ static int ixgbe_alloc_q_vector(struct ixgbe_adapter *adapter,
 	/* setup affinity mask and node */
 	if (cpu != -1)
 		cpumask_set_cpu(cpu, &q_vector->affinity_mask);
-	else
-		cpumask_copy(&q_vector->affinity_mask, cpu_online_mask);
+
 	q_vector->numa_node = node;
 
 	/* initialize CPU for DCA */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index dedb412..06641ea 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -4848,6 +4848,22 @@ static int ixgbe_change_mtu(struct net_device *netdev, int new_mtu)
 	return 0;
 }
 
+static void ixgbe_set_xps_mapping(struct net_device *netdev)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(netdev);
+	struct ixgbe_q_vector *q_vector;
+	u16 i;
+
+	for (i = 0; i < adapter->num_tx_queues; i++) {
+		q_vector = adapter->tx_ring[i]->q_vector;
+
+		if (!q_vector)
+			continue;
+
+		netif_set_xps_queue(netdev, &q_vector->affinity_mask, i);
+	}
+}
+
 /**
  * ixgbe_open - Called when a network interface is made active
  * @netdev: network interface device structure
@@ -4894,6 +4910,8 @@ static int ixgbe_open(struct net_device *netdev)
 	if (err)
 		goto err_set_queues;
 
+	/* update the Tx mapping */
+	ixgbe_set_xps_mapping(netdev);
 
 	err = netif_set_real_num_rx_queues(netdev,
 					   adapter->num_rx_pools > 1 ? 1 :

^ permalink raw reply related

* [RFC PATCH 06/10] ixgbe: Define FCoE and Flow director limits much sooner to allow for changes
From: Alexander Duyck @ 2012-06-30  0:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, jeffrey.t.kirsher, edumazet, bhutchings, therbert,
	alexander.duyck
In-Reply-To: <20120630000652.29939.11108.stgit@gitlad.jf.intel.com>

Instead of adjusting the FCoE and Flow director limits based on the number
of CPUs we can define them much sooner.  This allows the user to come
through later and adjust them once we have updated the code to support the
set_channels ethtool operation.

I am still allowing for FCoE and RSS queues to be separated if the number
queues is less than the number of CPUs.  This essentially treats the two
groupings like they are two separate traffic classes.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  |    7 +------
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   12 ++++++++----
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index 24acd53..72386fb 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -386,7 +386,6 @@ static bool ixgbe_set_dcb_sriov_queues(struct ixgbe_adapter *adapter)
 		fcoe = &adapter->ring_feature[RING_F_FCOE];
 
 		/* limit ourselves based on feature limits */
-		fcoe_i = min_t(u16, fcoe_i, num_online_cpus());
 		fcoe_i = min_t(u16, fcoe_i, fcoe->limit);
 
 		if (fcoe_i) {
@@ -562,9 +561,6 @@ static bool ixgbe_set_sriov_queues(struct ixgbe_adapter *adapter)
 		fcoe_i = min_t(u16, fcoe_i, fcoe->limit);
 
 		if (vmdq_i > 1 && fcoe_i) {
-			/* reserve no more than number of CPUs */
-			fcoe_i = min_t(u16, fcoe_i, num_online_cpus());
-
 			/* alloc queues for FCoE separately */
 			fcoe->indices = fcoe_i;
 			fcoe->offset = vmdq_i * rss_i;
@@ -623,8 +619,7 @@ static bool ixgbe_set_rss_queues(struct ixgbe_adapter *adapter)
 	if (rss_i > 1 && adapter->atr_sample_rate) {
 		f = &adapter->ring_feature[RING_F_FDIR];
 
-		f->indices = min_t(u16, num_online_cpus(), f->limit);
-		rss_i = max_t(u16, rss_i, f->indices);
+		rss_i = f->indices = f->limit;
 
 		if (!(adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE))
 			adapter->flags |= IXGBE_FLAG_FDIR_HASH_CAPABLE;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 5217b6d..dedb412 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -4433,7 +4433,7 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
-	unsigned int rss;
+	unsigned int rss, fdir;
 #ifdef CONFIG_IXGBE_DCB
 	int j;
 	struct tc_configuration *tc;
@@ -4466,8 +4466,8 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
 			adapter->flags2 |= IXGBE_FLAG2_TEMP_SENSOR_CAPABLE;
 		/* Flow Director hash filters enabled */
 		adapter->atr_sample_rate = 20;
-		adapter->ring_feature[RING_F_FDIR].limit =
-							 IXGBE_MAX_FDIR_INDICES;
+		fdir = min_t(int, IXGBE_MAX_FDIR_INDICES, num_online_cpus());
+		adapter->ring_feature[RING_F_FDIR].limit = fdir;
 		adapter->fdir_pballoc = IXGBE_FDIR_PBALLOC_64K;
 #ifdef IXGBE_FCOE
 		adapter->flags |= IXGBE_FLAG_FCOE_CAPABLE;
@@ -7324,13 +7324,17 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 
 #ifdef IXGBE_FCOE
 	if (adapter->flags & IXGBE_FLAG_FCOE_CAPABLE) {
+		unsigned int fcoe_l;
+
 		if (hw->mac.ops.get_device_caps) {
 			hw->mac.ops.get_device_caps(hw, &device_caps);
 			if (device_caps & IXGBE_DEVICE_CAPS_FCOE_OFFLOADS)
 				adapter->flags &= ~IXGBE_FLAG_FCOE_CAPABLE;
 		}
 
-		adapter->ring_feature[RING_F_FCOE].limit = IXGBE_FCRETA_SIZE;
+
+		fcoe_l = min_t(int, IXGBE_FCRETA_SIZE, num_online_cpus());
+		adapter->ring_feature[RING_F_FCOE].limit = fcoe_l;
 
 		netdev->features |= NETIF_F_FSO |
 				    NETIF_F_FCOE_CRC;

^ permalink raw reply related

* [RFC PATCH 05/10] net: Add support for XPS without SYSFS being defined
From: Alexander Duyck @ 2012-06-30  0:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, jeffrey.t.kirsher, edumazet, bhutchings, therbert,
	alexander.duyck
In-Reply-To: <20120630000652.29939.11108.stgit@gitlad.jf.intel.com>

This patch makes it so that we can support transmit packet steering without
sysfs needing to be enabled.  The reason for making this change is to make
it so that a driver can make use of the XPS even while the sysfs portion of
the interface is not present.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 include/linux/netdevice.h |    1 -
 net/Kconfig               |    2 +-
 net/core/dev.c            |   26 ++++++++++++++++++++------
 net/core/net-sysfs.c      |   14 --------------
 4 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e9e74b7..db27be2 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2073,7 +2073,6 @@ static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
 }
 
 #ifdef CONFIG_XPS
-extern void netif_reset_xps_queue(struct net_device *dev, u16 index);
 extern int netif_set_xps_queue(struct net_device *dev, struct cpumask *mask,
 			       u16 index);
 #else
diff --git a/net/Kconfig b/net/Kconfig
index 245831b..fcc5657 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -230,7 +230,7 @@ config RFS_ACCEL
 
 config XPS
 	boolean
-	depends on SMP && SYSFS && USE_GENERIC_SMP_HELPERS
+	depends on SMP && USE_GENERIC_SMP_HELPERS
 	default y
 
 config NETPRIO_CGROUP
diff --git a/net/core/dev.c b/net/core/dev.c
index 5f0550b..894faf1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1758,10 +1758,10 @@ static struct xps_map *remove_xps_queue(struct xps_dev_maps *dev_maps,
 	return map;
 }
 
-void netif_reset_xps_queue(struct net_device *dev, u16 index)
+static void netif_reset_xps_queues_gt(struct net_device *dev, u16 index)
 {
 	struct xps_dev_maps *dev_maps;
-	int cpu;
+	int cpu, i;
 	bool active = false;
 
 	mutex_lock(&xps_map_mutex);
@@ -1771,7 +1771,11 @@ void netif_reset_xps_queue(struct net_device *dev, u16 index)
 		goto out_no_maps;
 
 	for_each_possible_cpu(cpu) {
-		if (remove_xps_queue(dev_maps, cpu, index))
+		for (i = index; i < dev->num_tx_queues; i++) {
+			if (!remove_xps_queue(dev_maps, cpu, i))
+				break;
+		}
+		if (i == dev->num_tx_queues)
 			active = true;
 	}
 
@@ -1780,8 +1784,10 @@ void netif_reset_xps_queue(struct net_device *dev, u16 index)
 		kfree_rcu(dev_maps, rcu);
 	}
 
-	netdev_queue_numa_node_write(netdev_get_tx_queue(dev, index),
-				     NUMA_NO_NODE);
+	for (i = index; i < dev->num_tx_queues; i++)
+		netdev_queue_numa_node_write(netdev_get_tx_queue(dev, i),
+					     NUMA_NO_NODE);
+
 out_no_maps:
 	mutex_unlock(&xps_map_mutex);
 }
@@ -1967,8 +1973,12 @@ int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
 		if (dev->num_tc)
 			netif_setup_tc(dev, txq);
 
-		if (txq < dev->real_num_tx_queues)
+		if (txq < dev->real_num_tx_queues) {
 			qdisc_reset_all_tx_gt(dev, txq);
+#ifdef CONFIG_XPS
+			netif_reset_xps_queues_gt(dev, txq);
+#endif
+		}
 	}
 
 	dev->real_num_tx_queues = txq;
@@ -5460,6 +5470,10 @@ static void rollback_registered_many(struct list_head *head)
 
 		/* Remove entries from kobject tree */
 		netdev_unregister_kobject(dev);
+#ifdef CONFIG_XPS
+		/* Remove XPS queueing entries */
+		netif_reset_xps_queues_gt(dev, 0);
+#endif
 	}
 
 	/* Process any work delayed until the end of the batch */
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 092d338..42bb496 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -963,16 +963,6 @@ static ssize_t show_xps_map(struct netdev_queue *queue,
 	return len;
 }
 
-static void xps_queue_release(struct netdev_queue *queue)
-{
-	struct net_device *dev = queue->dev;
-	unsigned long index;
-
-	index = get_netdev_queue_index(queue);
-
-	netif_reset_xps_queue(dev, index);
-}
-
 static ssize_t store_xps_map(struct netdev_queue *queue,
 		      struct netdev_queue_attribute *attribute,
 		      const char *buf, size_t len)
@@ -1019,10 +1009,6 @@ static void netdev_queue_release(struct kobject *kobj)
 {
 	struct netdev_queue *queue = to_netdev_queue(kobj);
 
-#ifdef CONFIG_XPS
-	xps_queue_release(queue);
-#endif
-
 	memset(kobj, 0, sizeof(*kobj));
 	dev_put(queue->dev);
 }

^ permalink raw reply related

* [RFC PATCH 04/10] net: Rewrite netif_set_xps_queues to address several issues
From: Alexander Duyck @ 2012-06-30  0:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, jeffrey.t.kirsher, edumazet, bhutchings, therbert,
	alexander.duyck
In-Reply-To: <20120630000652.29939.11108.stgit@gitlad.jf.intel.com>

This change is meant to address several issues I found within the
netif_set_xps_queues function.

If the allocation of one of the maps to be assigned to new_dev_maps failed
we could end up with the device map in an inconsistent state since we had
already worked through a number of CPUs and removed or added the queue.  To
address that I split the process into several steps.  The first of which is
just the allocation of updated maps for CPUs that will need larger maps to
store the queue.  By doing this we can fail gracefully without actually
altering the contents of the current device map.

The second issue I found was the fact that we were always allocating a new
device map even if we were not adding any queues.  I have updated the code
so that we only allocate a new device map if we are adding queues,
otherwise if we are not adding any queues to CPUs we just skip to the
removal process.

The last change I made was to reuse the code from remove_xps_queue to remove
the queue from the CPU.  By making this change we can be consistent in how
we go about adding and removing the queues from the CPUs.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 net/core/dev.c |  183 ++++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 117 insertions(+), 66 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8e259d4..5f0550b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1786,107 +1786,158 @@ out_no_maps:
 	mutex_unlock(&xps_map_mutex);
 }
 
+static struct xps_map *expand_xps_map(struct xps_map *map,
+				      int cpu, u16 index)
+{
+	struct xps_map *new_map;
+	int alloc_len = XPS_MIN_MAP_ALLOC;
+	int i, pos;
+
+	for (pos = 0; map && pos < map->len; pos++) {
+		if (map->queues[pos] != index)
+			continue;
+		return map;
+	}
+
+	/* Need to add queue to this CPU's existing map */
+	if (map) {
+		if (pos < map->alloc_len)
+			return map;
+
+		alloc_len = map->alloc_len * 2;
+	}
+
+	/* Need to allocate new map to store queue on this CPU's map */
+	new_map = kzalloc_node(XPS_MAP_SIZE(alloc_len), GFP_KERNEL,
+			       cpu_to_node(cpu));
+	if (!new_map)
+		return NULL;
+
+	for (i = 0; i < pos; i++)
+		new_map->queues[i] = map->queues[i];
+	new_map->alloc_len = alloc_len;
+	new_map->len = pos;
+
+	return new_map;
+}
+
 int netif_set_xps_queue(struct net_device *dev, struct cpumask *mask, u16 index)
 {
-	int i, cpu, pos, map_len, alloc_len, need_set;
+	struct xps_dev_maps *dev_maps, *new_dev_maps = NULL;
 	struct xps_map *map, *new_map;
-	struct xps_dev_maps *dev_maps, *new_dev_maps;
-	int nonempty = 0;
-	int numa_node_id = -2;
 	int maps_sz = max_t(unsigned int, XPS_DEV_MAPS_SIZE, L1_CACHE_BYTES);
-
-	new_dev_maps = kzalloc(maps_sz, GFP_KERNEL);
-	if (!new_dev_maps)
-		return -ENOMEM;
+	int cpu, numa_node_id = -2;
+	bool active = false;
 
 	mutex_lock(&xps_map_mutex);
 
 	dev_maps = xmap_dereference(dev->xps_maps);
 
+	/* allocate memory for queue storage */
+	for_each_online_cpu(cpu) {
+		if (!cpumask_test_cpu(cpu, mask))
+			continue;
+
+		if (!new_dev_maps)
+			new_dev_maps = kzalloc(maps_sz, GFP_KERNEL);
+		if (!new_dev_maps)
+			return -ENOMEM;
+
+		map = dev_maps ? xmap_dereference(dev_maps->cpu_map[cpu]) :
+				 NULL;
+
+		map = expand_xps_map(map, cpu, index);
+		if (!map)
+			goto error;
+
+		RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], map);
+	}
+
+	if (!new_dev_maps)
+		goto out_no_new_maps;
+
 	for_each_possible_cpu(cpu) {
-		map = dev_maps ?
-			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
-		new_map = map;
-		if (map) {
-			for (pos = 0; pos < map->len; pos++)
-				if (map->queues[pos] == index)
-					break;
-			map_len = map->len;
-			alloc_len = map->alloc_len;
-		} else
-			pos = map_len = alloc_len = 0;
+		if (cpumask_test_cpu(cpu, mask) && cpu_online(cpu)) {
+			/* add queue to CPU maps */
+			int pos = 0;
 
-		need_set = cpumask_test_cpu(cpu, mask) && cpu_online(cpu);
+			map = xmap_dereference(new_dev_maps->cpu_map[cpu]);
+			while ((pos < map->len) && (map->queues[pos] != index))
+				pos++;
+
+			if (pos == map->len)
+				map->queues[map->len++] = index;
 #ifdef CONFIG_NUMA
-		if (need_set) {
 			if (numa_node_id == -2)
 				numa_node_id = cpu_to_node(cpu);
 			else if (numa_node_id != cpu_to_node(cpu))
 				numa_node_id = -1;
-		}
 #endif
-		if (need_set && pos >= map_len) {
-			/* Need to add queue to this CPU's map */
-			if (map_len >= alloc_len) {
-				alloc_len = alloc_len ?
-				    2 * alloc_len : XPS_MIN_MAP_ALLOC;
-				new_map = kzalloc_node(XPS_MAP_SIZE(alloc_len),
-						       GFP_KERNEL,
-						       cpu_to_node(cpu));
-				if (!new_map)
-					goto error;
-				new_map->alloc_len = alloc_len;
-				for (i = 0; i < map_len; i++)
-					new_map->queues[i] = map->queues[i];
-				new_map->len = map_len;
-			}
-			new_map->queues[new_map->len++] = index;
-		} else if (!need_set && pos < map_len) {
-			/* Need to remove queue from this CPU's map */
-			if (map_len > 1)
-				new_map->queues[pos] =
-				    new_map->queues[--new_map->len];
-			else
-				new_map = NULL;
+		} else if (dev_maps) {
+			/* fill in the new device map from the old device map */
+			map = xmap_dereference(dev_maps->cpu_map[cpu]);
+			RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], map);
 		}
-		RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], new_map);
+
 	}
 
+	rcu_assign_pointer(dev->xps_maps, new_dev_maps);
+
 	/* Cleanup old maps */
-	for_each_possible_cpu(cpu) {
-		map = dev_maps ?
-			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
-		if (map && xmap_dereference(new_dev_maps->cpu_map[cpu]) != map)
-			kfree_rcu(map, rcu);
-		if (new_dev_maps->cpu_map[cpu])
-			nonempty = 1;
-	}
+	if (dev_maps) {
+		for_each_possible_cpu(cpu) {
+			new_map = xmap_dereference(new_dev_maps->cpu_map[cpu]);
+			map = xmap_dereference(dev_maps->cpu_map[cpu]);
+			if (map && map != new_map)
+				kfree_rcu(map, rcu);
+		}
 
-	if (nonempty) {
-		rcu_assign_pointer(dev->xps_maps, new_dev_maps);
-	} else {
-		kfree(new_dev_maps);
-		RCU_INIT_POINTER(dev->xps_maps, NULL);
+		kfree_rcu(dev_maps, rcu);
 	}
 
-	if (dev_maps)
-		kfree_rcu(dev_maps, rcu);
+	dev_maps = new_dev_maps;
+	active = true;
 
+out_no_new_maps:
+	/* update Tx queue numa node */
 	netdev_queue_numa_node_write(netdev_get_tx_queue(dev, index),
 				     (numa_node_id >= 0) ? numa_node_id :
 				     NUMA_NO_NODE);
 
+	if (!dev_maps)
+		goto out_no_maps;
+
+	/* removes queue from unused CPUs */
+	for_each_possible_cpu(cpu) {
+		if (cpumask_test_cpu(cpu, mask) && cpu_online(cpu))
+			continue;
+
+		if (remove_xps_queue(dev_maps, cpu, index))
+			active = true;
+	}
+
+	/* free map if not active */
+	if (!active) {
+		RCU_INIT_POINTER(dev->xps_maps, NULL);
+		kfree_rcu(dev_maps, rcu);
+	}
+
+out_no_maps:
 	mutex_unlock(&xps_map_mutex);
 
 	return 0;
 error:
+	/* remove any maps that we added */
+	for_each_possible_cpu(cpu) {
+		new_map = xmap_dereference(new_dev_maps->cpu_map[cpu]);
+		map = dev_maps ? xmap_dereference(dev_maps->cpu_map[cpu]) :
+				 NULL;
+		if (new_map && new_map != map)
+			kfree(new_map);
+	}
+
 	mutex_unlock(&xps_map_mutex);
 
-	if (new_dev_maps)
-		for_each_possible_cpu(i)
-			kfree(rcu_dereference_protected(
-				new_dev_maps->cpu_map[i],
-				1));
 	kfree(new_dev_maps);
 	return -ENOMEM;
 }

^ permalink raw reply related

* [RFC PATCH 03/10] net: Rewrite netif_reset_xps_queue to allow for better code reuse
From: Alexander Duyck @ 2012-06-30  0:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, jeffrey.t.kirsher, edumazet, bhutchings, therbert,
	alexander.duyck
In-Reply-To: <20120630000652.29939.11108.stgit@gitlad.jf.intel.com>

This patch does a minor refactor on netif_reset_xps_queue to address a few
items I noticed.

First is the fact that we are doing removal of queues in both
netif_reset_xps_queue and netif_set_xps_queue.  Since there is no need to
have the code in two places I am pushing it out into a separate function
and will come back in another patch and reuse the code in
netif_set_xps_queue.

The second item this change addresses is the fact that the Tx queues were
not getting their numa_node value cleared as a part of the XPS queue reset.
This patch resolves that by resetting the numa_node value if the dev_maps
value is set.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 net/core/dev.c |   56 +++++++++++++++++++++++++++++++++-----------------------
 1 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 4c0981b..8e259d4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1733,45 +1733,55 @@ static DEFINE_MUTEX(xps_map_mutex);
 #define xmap_dereference(P)		\
 	rcu_dereference_protected((P), lockdep_is_held(&xps_map_mutex))
 
-void netif_reset_xps_queue(struct net_device *dev, u16 index)
+static struct xps_map *remove_xps_queue(struct xps_dev_maps *dev_maps,
+					int cpu, u16 index)
 {
-	struct xps_dev_maps *dev_maps;
-	struct xps_map *map;
-	int i, pos, nonempty = 0;
-
-	mutex_lock(&xps_map_mutex);
-	dev_maps = xmap_dereference(dev->xps_maps);
-
-	if (!dev_maps)
-		goto out_no_maps;
+	struct xps_map *map = NULL;
+	int pos;
 
-	for_each_possible_cpu(i) {
-		map = xmap_dereference(dev_maps->cpu_map[i]);
-		if (!map)
-			continue;
-
-		for (pos = 0; pos < map->len; pos++)
-			if (map->queues[pos] == index)
-				break;
+	if (dev_maps)
+		map = xmap_dereference(dev_maps->cpu_map[cpu]);
 
-		if (pos < map->len) {
+	for (pos = 0; map && pos < map->len; pos++) {
+		if (map->queues[pos] == index) {
 			if (map->len > 1) {
 				map->queues[pos] = map->queues[--map->len];
 			} else {
-				RCU_INIT_POINTER(dev_maps->cpu_map[i], NULL);
+				RCU_INIT_POINTER(dev_maps->cpu_map[cpu], NULL);
 				kfree_rcu(map, rcu);
 				map = NULL;
 			}
+			break;
 		}
-		if (map)
-			nonempty = 1;
 	}
 
-	if (!nonempty) {
+	return map;
+}
+
+void netif_reset_xps_queue(struct net_device *dev, u16 index)
+{
+	struct xps_dev_maps *dev_maps;
+	int cpu;
+	bool active = false;
+
+	mutex_lock(&xps_map_mutex);
+	dev_maps = xmap_dereference(dev->xps_maps);
+
+	if (!dev_maps)
+		goto out_no_maps;
+
+	for_each_possible_cpu(cpu) {
+		if (remove_xps_queue(dev_maps, cpu, index))
+			active = true;
+	}
+
+	if (!active) {
 		RCU_INIT_POINTER(dev->xps_maps, NULL);
 		kfree_rcu(dev_maps, rcu);
 	}
 
+	netdev_queue_numa_node_write(netdev_get_tx_queue(dev, index),
+				     NUMA_NO_NODE);
 out_no_maps:
 	mutex_unlock(&xps_map_mutex);
 }

^ permalink raw reply related

* [RFC PATCH 02/10] net: Add functions netif_reset_xps_queue and netif_set_xps_queue
From: Alexander Duyck @ 2012-06-30  0:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, jeffrey.t.kirsher, edumazet, bhutchings, therbert,
	alexander.duyck
In-Reply-To: <20120630000652.29939.11108.stgit@gitlad.jf.intel.com>

This patch adds two functions, netif_reset_xps_queue and
netif_set_xps_queue.  The main idea behind these to functions is to provide
a mechanism through which drivers can update their defaults in regards to
XPS.

Currently no such mechanism exists and as a result we cannot use XPS for
things such as ATR which would require a basic configuration to start in
which the Tx queues are mapped to CPUs via a 1:1 mapping.  With this change
I am making it possible for drivers such as ixgbe to be able to use the XPS
feature by controlling the default configuration.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 include/linux/netdevice.h |   13 ++++
 net/core/dev.c            |  155 +++++++++++++++++++++++++++++++++++++++++++++
 net/core/net-sysfs.c      |  148 +------------------------------------------
 3 files changed, 173 insertions(+), 143 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3329d70..e9e74b7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2072,6 +2072,19 @@ static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
 		__netif_schedule(txq->qdisc);
 }
 
+#ifdef CONFIG_XPS
+extern void netif_reset_xps_queue(struct net_device *dev, u16 index);
+extern int netif_set_xps_queue(struct net_device *dev, struct cpumask *mask,
+			       u16 index);
+#else
+static inline int netif_set_xps_queue(struct net_device *dev,
+				      struct cpumask *mask,
+				      u16 index)
+{
+	return 0;
+}
+#endif
+
 /*
  * Returns a Tx hash for the given packet when dev->real_num_tx_queues is used
  * as a distribution range limit for the returned value.
diff --git a/net/core/dev.c b/net/core/dev.c
index b31a9ff..4c0981b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1728,6 +1728,161 @@ static void netif_setup_tc(struct net_device *dev, unsigned int txq)
 	}
 }
 
+#ifdef CONFIG_XPS
+static DEFINE_MUTEX(xps_map_mutex);
+#define xmap_dereference(P)		\
+	rcu_dereference_protected((P), lockdep_is_held(&xps_map_mutex))
+
+void netif_reset_xps_queue(struct net_device *dev, u16 index)
+{
+	struct xps_dev_maps *dev_maps;
+	struct xps_map *map;
+	int i, pos, nonempty = 0;
+
+	mutex_lock(&xps_map_mutex);
+	dev_maps = xmap_dereference(dev->xps_maps);
+
+	if (!dev_maps)
+		goto out_no_maps;
+
+	for_each_possible_cpu(i) {
+		map = xmap_dereference(dev_maps->cpu_map[i]);
+		if (!map)
+			continue;
+
+		for (pos = 0; pos < map->len; pos++)
+			if (map->queues[pos] == index)
+				break;
+
+		if (pos < map->len) {
+			if (map->len > 1) {
+				map->queues[pos] = map->queues[--map->len];
+			} else {
+				RCU_INIT_POINTER(dev_maps->cpu_map[i], NULL);
+				kfree_rcu(map, rcu);
+				map = NULL;
+			}
+		}
+		if (map)
+			nonempty = 1;
+	}
+
+	if (!nonempty) {
+		RCU_INIT_POINTER(dev->xps_maps, NULL);
+		kfree_rcu(dev_maps, rcu);
+	}
+
+out_no_maps:
+	mutex_unlock(&xps_map_mutex);
+}
+
+int netif_set_xps_queue(struct net_device *dev, struct cpumask *mask, u16 index)
+{
+	int i, cpu, pos, map_len, alloc_len, need_set;
+	struct xps_map *map, *new_map;
+	struct xps_dev_maps *dev_maps, *new_dev_maps;
+	int nonempty = 0;
+	int numa_node_id = -2;
+	int maps_sz = max_t(unsigned int, XPS_DEV_MAPS_SIZE, L1_CACHE_BYTES);
+
+	new_dev_maps = kzalloc(maps_sz, GFP_KERNEL);
+	if (!new_dev_maps)
+		return -ENOMEM;
+
+	mutex_lock(&xps_map_mutex);
+
+	dev_maps = xmap_dereference(dev->xps_maps);
+
+	for_each_possible_cpu(cpu) {
+		map = dev_maps ?
+			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
+		new_map = map;
+		if (map) {
+			for (pos = 0; pos < map->len; pos++)
+				if (map->queues[pos] == index)
+					break;
+			map_len = map->len;
+			alloc_len = map->alloc_len;
+		} else
+			pos = map_len = alloc_len = 0;
+
+		need_set = cpumask_test_cpu(cpu, mask) && cpu_online(cpu);
+#ifdef CONFIG_NUMA
+		if (need_set) {
+			if (numa_node_id == -2)
+				numa_node_id = cpu_to_node(cpu);
+			else if (numa_node_id != cpu_to_node(cpu))
+				numa_node_id = -1;
+		}
+#endif
+		if (need_set && pos >= map_len) {
+			/* Need to add queue to this CPU's map */
+			if (map_len >= alloc_len) {
+				alloc_len = alloc_len ?
+				    2 * alloc_len : XPS_MIN_MAP_ALLOC;
+				new_map = kzalloc_node(XPS_MAP_SIZE(alloc_len),
+						       GFP_KERNEL,
+						       cpu_to_node(cpu));
+				if (!new_map)
+					goto error;
+				new_map->alloc_len = alloc_len;
+				for (i = 0; i < map_len; i++)
+					new_map->queues[i] = map->queues[i];
+				new_map->len = map_len;
+			}
+			new_map->queues[new_map->len++] = index;
+		} else if (!need_set && pos < map_len) {
+			/* Need to remove queue from this CPU's map */
+			if (map_len > 1)
+				new_map->queues[pos] =
+				    new_map->queues[--new_map->len];
+			else
+				new_map = NULL;
+		}
+		RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], new_map);
+	}
+
+	/* Cleanup old maps */
+	for_each_possible_cpu(cpu) {
+		map = dev_maps ?
+			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
+		if (map && xmap_dereference(new_dev_maps->cpu_map[cpu]) != map)
+			kfree_rcu(map, rcu);
+		if (new_dev_maps->cpu_map[cpu])
+			nonempty = 1;
+	}
+
+	if (nonempty) {
+		rcu_assign_pointer(dev->xps_maps, new_dev_maps);
+	} else {
+		kfree(new_dev_maps);
+		RCU_INIT_POINTER(dev->xps_maps, NULL);
+	}
+
+	if (dev_maps)
+		kfree_rcu(dev_maps, rcu);
+
+	netdev_queue_numa_node_write(netdev_get_tx_queue(dev, index),
+				     (numa_node_id >= 0) ? numa_node_id :
+				     NUMA_NO_NODE);
+
+	mutex_unlock(&xps_map_mutex);
+
+	return 0;
+error:
+	mutex_unlock(&xps_map_mutex);
+
+	if (new_dev_maps)
+		for_each_possible_cpu(i)
+			kfree(rcu_dereference_protected(
+				new_dev_maps->cpu_map[i],
+				1));
+	kfree(new_dev_maps);
+	return -ENOMEM;
+}
+EXPORT_SYMBOL(netif_set_xps_queue);
+
+#endif
 /*
  * Routine to help set real_num_tx_queues. To avoid skbs mapped to queues
  * greater then real_num_tx_queues stale skbs on the qdisc must be flushed.
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 7260717..092d338 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -963,54 +963,14 @@ static ssize_t show_xps_map(struct netdev_queue *queue,
 	return len;
 }
 
-static DEFINE_MUTEX(xps_map_mutex);
-#define xmap_dereference(P)		\
-	rcu_dereference_protected((P), lockdep_is_held(&xps_map_mutex))
-
 static void xps_queue_release(struct netdev_queue *queue)
 {
 	struct net_device *dev = queue->dev;
-	struct xps_dev_maps *dev_maps;
-	struct xps_map *map;
 	unsigned long index;
-	int i, pos, nonempty = 0;
 
 	index = get_netdev_queue_index(queue);
 
-	mutex_lock(&xps_map_mutex);
-	dev_maps = xmap_dereference(dev->xps_maps);
-
-	if (dev_maps) {
-		for_each_possible_cpu(i) {
-			map = xmap_dereference(dev_maps->cpu_map[i]);
-			if (!map)
-				continue;
-
-			for (pos = 0; pos < map->len; pos++)
-				if (map->queues[pos] == index)
-					break;
-
-			if (pos < map->len) {
-				if (map->len > 1)
-					map->queues[pos] =
-					    map->queues[--map->len];
-				else {
-					RCU_INIT_POINTER(dev_maps->cpu_map[i],
-					    NULL);
-					kfree_rcu(map, rcu);
-					map = NULL;
-				}
-			}
-			if (map)
-				nonempty = 1;
-		}
-
-		if (!nonempty) {
-			RCU_INIT_POINTER(dev->xps_maps, NULL);
-			kfree_rcu(dev_maps, rcu);
-		}
-	}
-	mutex_unlock(&xps_map_mutex);
+	netif_reset_xps_queue(dev, index);
 }
 
 static ssize_t store_xps_map(struct netdev_queue *queue,
@@ -1018,13 +978,9 @@ static ssize_t store_xps_map(struct netdev_queue *queue,
 		      const char *buf, size_t len)
 {
 	struct net_device *dev = queue->dev;
-	cpumask_var_t mask;
-	int err, i, cpu, pos, map_len, alloc_len, need_set;
 	unsigned long index;
-	struct xps_map *map, *new_map;
-	struct xps_dev_maps *dev_maps, *new_dev_maps;
-	int nonempty = 0;
-	int numa_node_id = -2;
+	cpumask_var_t mask;
+	int err;
 
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
@@ -1040,105 +996,11 @@ static ssize_t store_xps_map(struct netdev_queue *queue,
 		return err;
 	}
 
-	new_dev_maps = kzalloc(max_t(unsigned int,
-	    XPS_DEV_MAPS_SIZE, L1_CACHE_BYTES), GFP_KERNEL);
-	if (!new_dev_maps) {
-		free_cpumask_var(mask);
-		return -ENOMEM;
-	}
-
-	mutex_lock(&xps_map_mutex);
-
-	dev_maps = xmap_dereference(dev->xps_maps);
-
-	for_each_possible_cpu(cpu) {
-		map = dev_maps ?
-			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
-		new_map = map;
-		if (map) {
-			for (pos = 0; pos < map->len; pos++)
-				if (map->queues[pos] == index)
-					break;
-			map_len = map->len;
-			alloc_len = map->alloc_len;
-		} else
-			pos = map_len = alloc_len = 0;
-
-		need_set = cpumask_test_cpu(cpu, mask) && cpu_online(cpu);
-#ifdef CONFIG_NUMA
-		if (need_set) {
-			if (numa_node_id == -2)
-				numa_node_id = cpu_to_node(cpu);
-			else if (numa_node_id != cpu_to_node(cpu))
-				numa_node_id = -1;
-		}
-#endif
-		if (need_set && pos >= map_len) {
-			/* Need to add queue to this CPU's map */
-			if (map_len >= alloc_len) {
-				alloc_len = alloc_len ?
-				    2 * alloc_len : XPS_MIN_MAP_ALLOC;
-				new_map = kzalloc_node(XPS_MAP_SIZE(alloc_len),
-						       GFP_KERNEL,
-						       cpu_to_node(cpu));
-				if (!new_map)
-					goto error;
-				new_map->alloc_len = alloc_len;
-				for (i = 0; i < map_len; i++)
-					new_map->queues[i] = map->queues[i];
-				new_map->len = map_len;
-			}
-			new_map->queues[new_map->len++] = index;
-		} else if (!need_set && pos < map_len) {
-			/* Need to remove queue from this CPU's map */
-			if (map_len > 1)
-				new_map->queues[pos] =
-				    new_map->queues[--new_map->len];
-			else
-				new_map = NULL;
-		}
-		RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], new_map);
-	}
-
-	/* Cleanup old maps */
-	for_each_possible_cpu(cpu) {
-		map = dev_maps ?
-			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
-		if (map && xmap_dereference(new_dev_maps->cpu_map[cpu]) != map)
-			kfree_rcu(map, rcu);
-		if (new_dev_maps->cpu_map[cpu])
-			nonempty = 1;
-	}
-
-	if (nonempty) {
-		rcu_assign_pointer(dev->xps_maps, new_dev_maps);
-	} else {
-		kfree(new_dev_maps);
-		RCU_INIT_POINTER(dev->xps_maps, NULL);
-	}
-
-	if (dev_maps)
-		kfree_rcu(dev_maps, rcu);
-
-	netdev_queue_numa_node_write(queue, (numa_node_id >= 0) ? numa_node_id :
-					    NUMA_NO_NODE);
-
-	mutex_unlock(&xps_map_mutex);
+	err = netif_set_xps_queue(dev, mask, index);
 
 	free_cpumask_var(mask);
-	return len;
 
-error:
-	mutex_unlock(&xps_map_mutex);
-
-	if (new_dev_maps)
-		for_each_possible_cpu(i)
-			kfree(rcu_dereference_protected(
-				new_dev_maps->cpu_map[i],
-				1));
-	kfree(new_dev_maps);
-	free_cpumask_var(mask);
-	return -ENOMEM;
+	return err ? : len;
 }
 
 static struct netdev_queue_attribute xps_cpus_attribute =

^ permalink raw reply related

* [RFC PATCH 01/10] net: Split core bits of dev_pick_tx into __dev_pick_tx
From: Alexander Duyck @ 2012-06-30  0:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, jeffrey.t.kirsher, edumazet, bhutchings, therbert,
	alexander.duyck
In-Reply-To: <20120630000652.29939.11108.stgit@gitlad.jf.intel.com>

This change splits the core bits of dev_pick_tx into a separate function.
The main idea behind this is to make this code accessible to select queue
functions when they decide to process the standard path instead of their
own custom path in their select queue routine.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 include/linux/netdevice.h |    3 +++
 net/core/dev.c            |   51 ++++++++++++++++++++++++++-------------------
 2 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2c2ecea..3329d70 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2082,6 +2082,9 @@ static inline u16 skb_tx_hash(const struct net_device *dev,
 	return __skb_tx_hash(dev, skb, dev->real_num_tx_queues);
 }
 
+extern int __dev_pick_tx(const struct net_device *dev,
+			 const struct sk_buff *skb);
+
 /**
  *	netif_is_multiqueue - test if device has multiple transmit queues
  *	@dev: network device
diff --git a/net/core/dev.c b/net/core/dev.c
index 57c4f9b..b31a9ff 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2301,7 +2301,8 @@ static inline u16 dev_cap_txqueue(struct net_device *dev, u16 queue_index)
 	return queue_index;
 }
 
-static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
+static inline int get_xps_queue(const struct net_device *dev,
+				const struct sk_buff *skb)
 {
 #ifdef CONFIG_XPS
 	struct xps_dev_maps *dev_maps;
@@ -2339,11 +2340,37 @@ static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
 #endif
 }
 
+int __dev_pick_tx(const struct net_device *dev, const struct sk_buff *skb)
+{
+	struct sock *sk = skb->sk;
+	int queue_index = sk_tx_queue_get(sk);
+
+	if (queue_index < 0 || skb->ooo_okay ||
+	    queue_index >= dev->real_num_tx_queues) {
+		int old_index = queue_index;
+
+		queue_index = get_xps_queue(dev, skb);
+		if (queue_index < 0)
+			queue_index = skb_tx_hash(dev, skb);
+
+		if (queue_index != old_index && sk) {
+			struct dst_entry *dst =
+			    rcu_dereference_check(sk->sk_dst_cache, 1);
+
+			if (dst && skb_dst(skb) == dst)
+				sk_tx_queue_set(sk, queue_index);
+		}
+	}
+
+	return queue_index;
+}
+EXPORT_SYMBOL(__dev_pick_tx);
+
 static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 					struct sk_buff *skb)
 {
-	int queue_index;
 	const struct net_device_ops *ops = dev->netdev_ops;
+	int queue_index;
 
 	if (dev->real_num_tx_queues == 1)
 		queue_index = 0;
@@ -2351,25 +2378,7 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 		queue_index = ops->ndo_select_queue(dev, skb);
 		queue_index = dev_cap_txqueue(dev, queue_index);
 	} else {
-		struct sock *sk = skb->sk;
-		queue_index = sk_tx_queue_get(sk);
-
-		if (queue_index < 0 || skb->ooo_okay ||
-		    queue_index >= dev->real_num_tx_queues) {
-			int old_index = queue_index;
-
-			queue_index = get_xps_queue(dev, skb);
-			if (queue_index < 0)
-				queue_index = skb_tx_hash(dev, skb);
-
-			if (queue_index != old_index && sk) {
-				struct dst_entry *dst =
-				    rcu_dereference_check(sk->sk_dst_cache, 1);
-
-				if (dst && skb_dst(skb) == dst)
-					sk_tx_queue_set(sk, queue_index);
-			}
-		}
+		queue_index = __dev_pick_tx(dev, skb);
 	}
 
 	skb_set_queue_mapping(skb, queue_index);

^ permalink raw reply related

* [RFC PATCH 00/10] Make XPS usable within ixgbe
From: Alexander Duyck @ 2012-06-30  0:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, jeffrey.t.kirsher, edumazet, bhutchings, therbert,
	alexander.duyck

The following patch series makes it so that the ixgbe driver can support
ATR even when the number of queues is less than the number of CPUs.  To do
this I have updated the kernel to support letting drivers set their own XPS
configuration.  To do this it was necessary to move the code out of the
sysfs specific code and into the dev specific regions.

I am still working out a few issues such as the fact that with routing I
only ever seem to be able to get the first queue that is mapped to the CPU
when XPS is enabled.

Also I am looking for input on if it is acceptable to only let the
set_channels/get_channels calls report/set the number of queues per traffic
class as I implemented the code this way to avoid any significant conflicts
between the DCB traffic classes code and these functions.

---

Alexander Duyck (10):
      ixgbe: Add support for set_channels ethtool operation
      ixgbe: Add support for displaying the number of Tx/Rx channels
      ixgbe: Update ixgbe driver to use __dev_pick_tx in ixgbe_select_queue
      ixgbe: Add function for setting XPS queue mapping
      ixgbe: Define FCoE and Flow director limits much sooner to allow for changes
      net: Add support for XPS without SYSFS being defined
      net: Rewrite netif_set_xps_queues to address several issues
      net: Rewrite netif_reset_xps_queue to allow for better code reuse
      net: Add functions netif_reset_xps_queue and netif_set_xps_queue
      net: Split core bits of dev_pick_tx into __dev_pick_tx

 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |  112 +++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c     |   10 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |   48 +++-
 include/linux/netdevice.h                        |   15 +
 net/Kconfig                                      |    2 
 net/core/dev.c                                   |  283 ++++++++++++++++++++--
 net/core/net-sysfs.c                             |  160 ------------
 7 files changed, 428 insertions(+), 202 deletions(-)

-- 
Thanks,

Alex

^ permalink raw reply

* Re: AF_BUS socket address family
From: Benjamin LaHaise @ 2012-06-30  0:13 UTC (permalink / raw)
  To: Vincent Sanders; +Cc: David Miller, netdev, linux-kernel
In-Reply-To: <20120629234230.GA11480@kyllikki.org>

On Sat, Jun 30, 2012 at 12:42:30AM +0100, Vincent Sanders wrote:
> The current users are suffering from the issues outlined in my
> introductory mail all the time. These issues are caused by emulating an
> IPC system over AF_UNIX in userspace.

Nothing in your introductory statements indicate how your requirements 
can't be met through a hybrid socket + shared memory solution.  The IPC 
facilities of the kernel are already quite rich, and sufficient for 
building many kinds of complex systems.  What's so different about DBus' 
requirements?

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply

* Re: AF_BUS socket address family
From: Vincent Sanders @ 2012-06-30  0:09 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel
In-Reply-To: <20120629.165023.1605284574408858612.davem@davemloft.net>

On Fri, Jun 29, 2012 at 04:50:23PM -0700, David Miller wrote:
> From: Vincent Sanders <vincent.sanders@collabora.co.uk>
> Date: Sat, 30 Jun 2012 00:42:30 +0100
> 
> > Basically you are indicating you would be completely opposed to any
> > mechanism involving D-Bus IPC and the kernel? 
> 
> I would not oppose existing mechanisms, which I do not believe is
> impossible to use in your scenerio.
> 

You keep saying that yet have offered no concrete way to achive the
semantics we require. To pass fd and credentials currently *requires*
the use of AF_UNIX does it not? And D-Bus already emulates its IPC
over AF_UNIX because of that.

> What you really don't get is that packet drops and event losses are
> absolutely fundamental.

not within an IPC surely? there cannot be packet drops within AF_BUS
we simply do not do it. The rrecive queues are checked for capability
of reciving the message before it is delivered to them all or none.

> 
> As long as receivers lack infinite receive queue this will always be
> the case.

Indeed, I would not question that.

> 
> Multicast operates in non-reliable transports only so that one stuck
> or malfunctioning receiver doesn't screw things over for everyone nor
> unduly brudon the sender.
> 

We have addressed this within AF_BUS by the reciver and bus master
being told if all recepients cannot receive the message (and therefor
it cannot be sent). 

The policy decision of how to handle this situation is therefore
handled by the userspace clients on a protocol level. D-Bus *already*
has to handle this situation, its just currently done over AF_UNIX
sockets so once it occours the problem is harder to rectify as the
ordering constraint is broken (which causes even more issues).

I am afraid it is rather late here and I may not be able to continue
this conversation untill the morning, I apologise if this is
inconveniant, but I must sleep.

-- 
Regards Vincent

^ permalink raw reply

* Re: AF_BUS socket address family
From: David Miller @ 2012-06-29 23:50 UTC (permalink / raw)
  To: vincent.sanders; +Cc: netdev, linux-kernel
In-Reply-To: <20120629234230.GA11480@kyllikki.org>

From: Vincent Sanders <vincent.sanders@collabora.co.uk>
Date: Sat, 30 Jun 2012 00:42:30 +0100

> Basically you are indicating you would be completely opposed to any
> mechanism involving D-Bus IPC and the kernel? 

I would not oppose existing mechanisms, which I do not believe is
impossible to use in your scenerio.

What you really don't get is that packet drops and event losses are
absolutely fundamental.

As long as receivers lack infinite receive queue this will always be
the case.

Multicast operates in non-reliable transports only so that one stuck
or malfunctioning receiver doesn't screw things over for everyone nor
unduly brudon the sender.

^ permalink raw reply

* Re: [PATCH 0/2] [net-next] Netlink updates
From: David Miller @ 2012-06-29 23:42 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <1340986522-3442-1-git-send-email-pablo@netfilter.org>

From: pablo@netfilter.org
Date: Fri, 29 Jun 2012 18:15:20 +0200

> If you like them, please manually apply. I wanted to know if you are
> happy with these before pushing them into my tree, as they include
> netlink changes.

Looks great, both applied.

The interface can probably be extended further, if you notice all
callers pass in THIS_MODULE.  So you can have an inline wrapper
"netlink_kernel_create()" and passes the THIS_MODULE value down
into "__netlink_kernel_create()"

^ permalink raw reply

* Re: AF_BUS socket address family
From: Vincent Sanders @ 2012-06-29 23:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel
In-Reply-To: <20120629.161821.948325645333976311.davem@davemloft.net>

On Fri, Jun 29, 2012 at 04:18:21PM -0700, David Miller wrote:
> From: Vincent Sanders <vincent.sanders@collabora.co.uk>
> Date: Sat, 30 Jun 2012 00:12:37 +0100
> 
> > I had hoped you would have at least read the opening list where I
> > outlined the underlying features which explain why none of the
> > existing IPC match the requirements.
> 
> I had hoped that you had read the part we told you last time where
> we explained why multicast and "reliable delivery" are fundamentally
> incompatible attributes.
> 

I do not beleive we indicated reliable delivery, mearly ordered and
idempotent. eitehr everyone gets the message in the same order or
noone gets it.

> We are not creating a full address family in the kernel which exists
> for one, and only one, specific and difficult user.

Basically you are indicating you would be completely opposed to any
mechanism involving D-Bus IPC and the kernel? 

Is there were a way to convince you that this is of real value to a
great many of the users of Linux systems in use today. I can assert
with some confidence that there are many, many more users of D-Bus IPC
than there are for several of the other address families that are
present within the kernel already. 

The current users are suffering from the issues outlined in my
introductory mail all the time. These issues are caused by emulating an
IPC system over AF_UNIX in userspace.

All we are trying to do is make things better for our users, is there
a way to do that which will satisfy you technically and them? Honestly
I am just looking for a viable solution here.

-- 
Regards Vincent

^ permalink raw reply

* Re: [PATCH 0/5] netfilter fixes for 3.5-rc4
From: David Miller @ 2012-06-29 23:37 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <1340984255-738-1-git-send-email-pablo@netfilter.org>

From: pablo@netfilter.org
Date: Fri, 29 Jun 2012 17:37:30 +0200

> * One refcount leak fix in IPVS IPv6 support from Eric Dumazet.
> 
> * One fix for interface comparison in ipset hash-netiface sets
>   from Florian Westphal.
> 
> * One fix for a missing rcu_read_unlock in nfnetlink from
>   Tomasz Bursztyka.
> 
> * One fix for a kernel crash if IPSET_CMD_NONE is set to ipset via
>   nfnetlink, again from Tomasz Bursztyka.
> 
> You can pull these changes from:
> 
> git://1984.lsi.us.es/nf master

Pulled, thanks Pablo.

^ permalink raw reply

* Re: [PATCH v5] sctp: be more restrictive in transport selection on bundled sacks
From: David Miller @ 2012-06-29 23:34 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, vyasevich, linux-sctp
In-Reply-To: <1341000929-22933-1-git-send-email-nhorman@tuxdriver.com>

From: Neil Horman <nhorman@tuxdriver.com>
Date: Fri, 29 Jun 2012 16:15:29 -0400

> +	/* Has this transport moved the ctsn since we last sacked */
> +	__u32 sack_generation;
> +
 ...
> +		__u32	sack_generation;

These are __u32 but they only take on the value '1' or '0'.  Please
use bool and give it a more reasonable name, a name that describes
how it is really a predicate.

> -		struct sctp_association *asoc;
>  		struct timer_list *timer;
> -		asoc = pkt->transport->asoc;
> +		struct sctp_association *asoc = pkt->transport->asoc;
> +

Please leave asoc where it was, on the first line.  We encourage
listing local variables such that the longest lines come first,
then gradually shorter and short lines.

> +	/*
> +	 * Once we have a sack generated, check to see what our sack
> +	 * generation is, if its 0, reset the transports to 0, and reset

Please format:

  /* Like
   * this.
   */

Thanks.

^ permalink raw reply

* Re: pull request: wireless-next 2012-06-29
From: David Miller @ 2012-06-29 23:30 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev
In-Reply-To: <20120629172836.GB30528@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Fri, 29 Jun 2012 13:28:37 -0400

> Here is another batch of updates intended for 3.6.  This includes a
> number of pulls, including ones from the mac80211, iwlwifi, ath6kl, and
> wl12xx trees.  I also pulled from the wireless tree to avoid potential
> build conflicts.  There are a number of other patches applied directly,
> including a number for the Broadcom drivers and the mwifiex driver.
> 
> The updates cover the usual variety of new hardware support and feature
> enhancements.  It's all good work, but there aren't any big headliners.
> This does resolve a net-next/wireless-next merge conflict reported
> by Stephen.

Pulled, thanks John.

^ permalink raw reply

* Re: AF_BUS socket address family
From: Vincent Sanders @ 2012-06-29 23:22 UTC (permalink / raw)
  To: Casey Schaufler; +Cc: netdev, linux-kernel, David S. Miller
In-Reply-To: <4FEDF7B6.3020107@schaufler-ca.com>

On Fri, Jun 29, 2012 at 11:45:10AM -0700, Casey Schaufler wrote:
> On 6/29/2012 9:45 AM, Vincent Sanders wrote:

<snip>

> >
> > A socket created using BUS_PROTO_DBUS indicates that the messages
> > passed will be in the D-Bus format. The userspace libraries have been
> > updated to use this transport with an updated D-Bus daemon [2] as a bus
> > master.
> 
> Why don't you go whole hog and put all of D-Bus into the kernel?
> 

That would be ridiculously excessive. This work represents what we
feel is the minimum required functionlity for the underlying IPC
mechanism. 

The minimal filtering performed by the netfilter module is what is
required to enforce security as used in existing deployments and no more.

<snip>

> >
> > The tools for testing these assertions are available [3] and
> > consistently show a doubling in throughput and better than halving of
> > latency.
> 
> Please cross-post Patches 04/15 and 05/15 to the linux-security-module list.
> Please cross-post Patch 05/15 to the selinux list.
> 
> Where is the analogous patch for the Smack LSM?

we have not tested or built this with the Smack LSM, I would, of
course, be pleased to accept a patch to add this functionality if you
are knowladgeable in this area.

<snip>

^ permalink raw reply

* Re: AF_BUS socket address family
From: David Miller @ 2012-06-29 23:18 UTC (permalink / raw)
  To: vincent.sanders; +Cc: netdev, linux-kernel
In-Reply-To: <20120629231236.GA28593@mail.collabora.co.uk>

From: Vincent Sanders <vincent.sanders@collabora.co.uk>
Date: Sat, 30 Jun 2012 00:12:37 +0100

> I had hoped you would have at least read the opening list where I
> outlined the underlying features which explain why none of the
> existing IPC match the requirements.

I had hoped that you had read the part we told you last time where
we explained why multicast and "reliable delivery" are fundamentally
incompatible attributes.

We are not creating a full address family in the kernel which exists
for one, and only one, specific and difficult user.

^ permalink raw reply

* Re: [PATCH net-next 2/6] qlge: Stand-up card should not report supporting wol.
From: Francois Romieu @ 2012-06-29 23:02 UTC (permalink / raw)
  To: Jitendra Kalsaria; +Cc: David Miller, netdev, Dept-NX Linux NIC Driver
In-Reply-To: <5E4F49720D0BAD499EE1F01232234BA877435B23DA@AVEXMB1.qlogic.org>

(linux-zupk.site removed)

Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> :
[...]
> - broken indent
> - netif_err(... "cleared successfully" ...)
> [JK] I don't think so, %s will be replace with "cleared successfully" or with "clear failed" depend upon status value.

I would not netif_err when everything is ok.

[...]
> If I read things correctly, WAKE_MAGIC can not be cleared and it makes no
> difference on the status code. It should be fixed.
> 
> [JK] You right, here we are not clearing WAKE_MAGIC but we are making sure that wol is disable when no option is set.

Imho wol should be disabled as well when neither WAKE_MAGIC nor any option
is set at all.

-- 
Ueimor

^ permalink raw reply

* Re: AF_BUS socket address family
From: Vincent Sanders @ 2012-06-29 23:12 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel
In-Reply-To: <20120629.153656.1141845894730637434.davem@davemloft.net>

On Fri, Jun 29, 2012 at 03:36:56PM -0700, David Miller wrote:
> 
> There is no extensive text describing why using IPv4 for this cannot
> be done.  I can almost bet that nobody really, honestly, tried.
> 

I can assure you that the team has tried no fewer than six differing
approaches, including using IP and attempting to bend several of the
existing address families.

> Basically this means all of our feedback from the last time we had
> discussions on kernel IPC for DBUS are being completely ignored.

Absolutely not, we listened hard and did extensive research, please do
not ascribe thoughtlessness to our actions. Certainly I would not
presume to waste your time and present something which has not been
thoroughly considered.

I had hoped you would have at least read the opening list where I
outlined the underlying features which explain why none of the
existing IPC match the requirements.

Firstly it is intended is an interprocess mechanism and not to rely on
a configured IP system, indeed one of its primary usages is to
provide mechanism for various tools to set up IP networking.

Leaving that aside the requirements for multicast, strict ordering, fd
passing and credential passing are simply not available in any other
single transport. It was made plain to us that AF_UNIX would not be
expanded to encompass multicast so we are left with adding AF_BUS.

If we are wrong I hope you will explain to me how we can achieve fd and
credential passing to multicast groups within existing protocols.

> 
> Therefore, I will completely ignore this patch submission.
> 

I do hope you will reconsider, or at least educate us appropriately. 
I understand you are a busy maintainer and appreciate your time in this matter.

Best regards
--
Vincent Sanders <vincent.sanders@collabora.co.uk>

^ permalink raw reply

* Re: [PATCH] net: Update netdev_alloc_frag to work more efficiently with TCP and GRO
From: Alexander Duyck @ 2012-06-29 23:04 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Alexander Duyck, netdev, davem, jeffrey.t.kirsher
In-Reply-To: <1340368430.4604.10280.camel@edumazet-glaptop>

On 06/22/2012 05:33 AM, Eric Dumazet wrote:
> Here is the patch I tested here.
>
> Using 32768 bytes allocations is actually nice for MTU=9000 traffic,
> since we can fit 3 frames per 32KB instead of only 2 frames (using
> kmalloc-16384 slab))
>
> Also, I prefill page->_count with a high bias value, to avoid the
> get_page() we did for each allocated frag.
>
> In my profiles, the get_page() cost was dominant, because of false
> sharing with skb consumers (as they might run on different cpus)
>
> This way, when 32768 bytes are filled, we perform a single
> atomic_sub_return() and can recycle the page if we find we are the last
> user (this is what you did in your patch, when testing page->_count
> being 1)
>
> Note : If I used max(PAGE_SIZE, 32678) for MAX_NETDEV_FRAGSIZE,
> gcc was not able to optimise get_order(MAX_NETDEV_FRAGSIZE), strange...
>
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 5b21522..d31efa2 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -296,9 +296,18 @@ EXPORT_SYMBOL(build_skb);
>  struct netdev_alloc_cache {
>  	struct page *page;
>  	unsigned int offset;
> +	unsigned int pagecnt_bias;
>  };
>  static DEFINE_PER_CPU(struct netdev_alloc_cache, netdev_alloc_cache);
>  
> +#if PAGE_SIZE > 32768
> +#define MAX_NETDEV_FRAGSIZE	PAGE_SIZE
> +#else
> +#define MAX_NETDEV_FRAGSIZE	32768
> +#endif
> +
> +#define NETDEV_PAGECNT_BIAS	(MAX_NETDEV_FRAGSIZE /		\
> +				 SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
>  /**
>   * netdev_alloc_frag - allocate a page fragment
>   * @fragsz: fragment size
> @@ -316,18 +325,25 @@ void *netdev_alloc_frag(unsigned int fragsz)
>  	nc = &__get_cpu_var(netdev_alloc_cache);
>  	if (unlikely(!nc->page)) {
>  refill:
> -		nc->page = alloc_page(GFP_ATOMIC | __GFP_COLD);
> +		nc->page = alloc_pages(GFP_ATOMIC | __GFP_COLD | __GFP_COMP,
> +				       get_order(MAX_NETDEV_FRAGSIZE));
> +		if (unlikely(!nc->page))
> +			goto end;
> +recycle:
> +		atomic_set(&nc->page->_count, NETDEV_PAGECNT_BIAS);
> +		nc->pagecnt_bias = NETDEV_PAGECNT_BIAS;
>  		nc->offset = 0;
>  	}
> -	if (likely(nc->page)) {
> -		if (nc->offset + fragsz > PAGE_SIZE) {
> -			put_page(nc->page);
> -			goto refill;
> -		}
> -		data = page_address(nc->page) + nc->offset;
> -		nc->offset += fragsz;
> -		get_page(nc->page);
> +	if (nc->offset + fragsz > MAX_NETDEV_FRAGSIZE) {
> +		if (!atomic_sub_return(nc->pagecnt_bias,
> +				       &nc->page->_count))
> +			goto recycle;
> +		goto refill;
>  	}
> +	data = page_address(nc->page) + nc->offset;
> +	nc->offset += fragsz;
> +	nc->pagecnt_bias--; /* avoid get_page()/get_page() false sharing */
> +end:
>  	local_irq_restore(flags);
>  	return data;
>  }
> @@ -353,7 +369,7 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
>  	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD) +
>  			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>  
> -	if (fragsz <= PAGE_SIZE && !(gfp_mask & __GFP_WAIT)) {
> +	if (fragsz <= MAX_NETDEV_FRAGSIZE && !(gfp_mask & __GFP_WAIT)) {
>  		void *data = netdev_alloc_frag(fragsz);
>  
>  		if (likely(data)) {
>
>
I was wondering if there were any plans to clean this patch up and
submit it to net-next?  If not, I can probably work on that since this
addressed the concerns I had in my original patch.

Thanks,

Alex

^ permalink raw reply

* Source-specific multicast socket options documentation
From: Radek Pazdera @ 2012-06-29 22:52 UTC (permalink / raw)
  To: mtk.manpages; +Cc: linux-man, netdev

Hi,

I found out, that some source-specific multicast socket options are
missing in the IPv4 manpage (man 7 ip) and I'm trying to write some
documentation for them.

This is what I have now (without the formatting). I'd like to ask,
if you could spare a minute and have just a quick look over it
to make sure there are not some errors (either technical or language -
I'm not native speaker), before I post the patch.

IP_ADD_SOURCE_MEMBERSHIP (since Linux 2.5.68)
    Join a multicast group and allow receiving data only from a specific
    source. Argument is an ip_mreq_source structure.

        struct ip_mreq_source {
            struct in_addr imr_multiaddr;  /* IP multicast group
                                              address */
            struct in_addr imr_interface;  /* IP address of local
                                              interface */
            struct in_addr imr_sourceaddr; /* IP address of
                                              multicast source */
        };

    The ip_mreq_source structure is similar to ip_mreqn structure.
    imr_multiaddr contains the address of the multicast group
    the application wants to join or leave. imr_address is the
    address of the local interface which the system will join
    the multicast group with. imr_sourceaddr is the address of
    multicast source the application wants to receive data
    from.

    Application can use this option multiple times to receive
    data from more than one source.


IP_DROP_SOURCE_MEMBERHSIP (since Linux 2.5.68)
    Leave a source-specific group, i.e. stop receiving data from
    a given multicast group, coming from a given source.

    If the application has subscribed to multiple sources within
    the same group, data from the remaining sources will still
    be delivered. To stop receiving data from all sources at once
    use IP_LEAVE_GROUP.

    Argument is an ip_mreq_source structure as described at
    IP_ADD_SOURCE_MEMBERSHIP.


IP_BLOCK_SOURCE (since Linux 2.5.68)
    Block receiving multicast data from a specific source in a given
    group. This is valid only after the application have subscribed
    to the multicast group using either IP_ADD_MEMBERSHIP or
    IP_ADD_SOURCE_MEMBERSHIP.

    Argument is an ip_mreq_source structure as described at
    IP_ADD_SOURCE_MEMBERSHIP.


IP_UNBLOCK_SOURCE (since Linux 2.5.68)
    Unblock previously blocked multicast source. Returns EADDRNOTAVAIL
    when given source was not being blocked.

    Argument is an ip_mreq_source structure as described at
    IP_ADD_SOURCE_MEMBERSHIP.


IP_MSFILTER (since Linux 2.5.68)
    This option provides access to the advanced full-state filtering API.
    Argument is an ip_msfilter structure.

        struct ip_msfilter {
            struct in_addr imsf_multiaddr; /* IP multicast group
                                              address */
            struct in_addr imsf_interface; /* IP address of local
                                              interface */
            uint32_t       imsf_fmode;     /* Filter-mode */

            uint32_t       imsf_numsrc;    /* Number of sources in
                                              the following array */
            struct in_addr imsf_slist[1];  /* Array of source
                                              addresses */
        };

    There are two macros, MCAST_INCLUDE and MCAST_EXCLUDE, which can be
    used to specify the filtering mode. Additionaly, IP_MSFILTER_SIZE(n)
    macro exists to determine how much memory is needed to store
    ip_msfilter structure with n sources in the source list.

    For full description of multicast source filtering refer to RFC 3376.

My sources for these information were:
    * http://tools.ietf.org/html/rfc3678
    * http://tools.ietf.org/html/rfc3376
    * code (http://lxr.free-electrons.com/source/net/ipv4/ip_sockglue.c)

Thank you!

Radek Pazdera :)

^ permalink raw reply

* RE: [PATCH net-next 4/6] qlge: Fixed double pci free upon tx_ring->q allocation failure.
From: Jitendra Kalsaria @ 2012-06-29 22:48 UTC (permalink / raw)
  To: Francois Romieu
  Cc: David Miller, netdev, Ron Mercer, Dept-NX Linux NIC Driver
In-Reply-To: <20120629213755.GC19152@electric-eye.fr.zoreil.com>

Francois,

See comments inline.

Jiten

-----Original Message-----
From: Francois Romieu [mailto:romieu@fr.zoreil.com] 
Sent: Friday, June 29, 2012 2:38 PM
To: Jitendra Kalsaria
Cc: David Miller; netdev; Ron@linux-zupk.site; Dept-NX Linux NIC Driver
Subject: Re: [PATCH net-next 4/6] qlge: Fixed double pci free upon tx_ring->q allocation failure.

Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> :
[...]
> diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_main.c b/drivers/net/ethernet/qlogic/qlge/qlge_main.c
> index cdbc860..aa514c5 100644
> --- a/drivers/net/ethernet/qlogic/qlge/qlge_main.c
> +++ b/drivers/net/ethernet/qlogic/qlge/qlge_main.c
> @@ -2701,11 +2701,9 @@ static int ql_alloc_tx_resources(struct ql_adapter *qdev,
>  	    pci_alloc_consistent(qdev->pdev, tx_ring->wq_size,
>  				 &tx_ring->wq_base_dma);
>  
> -	if ((tx_ring->wq_base == NULL) ||
> -	    tx_ring->wq_base_dma & WQ_ADDR_ALIGN) {
> -		netif_err(qdev, ifup, qdev->ndev, "tx_ring alloc failed.\n");
> -		return -ENOMEM;
> -	}
> +	if ((tx_ring->wq_base == NULL) || tx_ring->wq_base_dma & WQ_ADDR_ALIGN)
> +		goto err;
> +
>  	tx_ring->q =
>  	    kmalloc(tx_ring->wq_len * sizeof(struct tx_ring_desc), GFP_KERNEL);
>  	if (tx_ring->q == NULL)
> @@ -2713,8 +2711,12 @@ static int ql_alloc_tx_resources(struct ql_adapter *qdev,
>  
>  	return 0;
>  err:
> -	pci_free_consistent(qdev->pdev, tx_ring->wq_size,
> +	if (tx_ring->wq_base) {
> +		pci_free_consistent(qdev->pdev, tx_ring->wq_size,
>  			    tx_ring->wq_base, tx_ring->wq_base_dma);
> +		tx_ring->wq_base = NULL;
> +	}

You do not need a test: use a second label + goto.

Nit: You can replace 'if ((tx_ring->wq_base == NULL)' with
'if (!tx_ring->wq_base' and add a local variable for qdev->pdev.

You may consider replacing pci_alloc_consistent with dma_alloc_coherent
in a future patch.

[JK] Will, definitely take care of this in future patch.

-- 
Ueimor

^ permalink raw reply

* RE: [PATCH net-next 2/6] qlge: Stand-up card should not report supporting wol.
From: Jitendra Kalsaria @ 2012-06-29 22:40 UTC (permalink / raw)
  To: Francois Romieu
  Cc: David Miller, netdev, Ron@linux-zupk.site,
	Dept-NX Linux NIC Driver
In-Reply-To: <20120629213747.GB19152@electric-eye.fr.zoreil.com>

Francois,

See comments inline.

Jiten

-----Original Message-----
From: Francois Romieu [mailto:romieu@fr.zoreil.com] 
Sent: Friday, June 29, 2012 2:38 PM
To: Jitendra Kalsaria
Cc: David Miller; netdev; Ron@linux-zupk.site; Dept-NX Linux NIC Driver
Subject: Re: [PATCH net-next 2/6] qlge: Stand-up card should not report supporting wol.

Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> :
> From: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
> 
> Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
> ---
>  drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c |   43 ++++++++++++++--------
>  1 files changed, 27 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c b/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
> index 8e2c2a7..81672f5 100644
> --- a/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
> +++ b/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
> @@ -388,10 +388,14 @@ static void ql_get_drvinfo(struct net_device *ndev,
>  static void ql_get_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
>  {
>  	struct ql_adapter *qdev = netdev_priv(ndev);
> -	/* What we support. */
> -	wol->supported = WAKE_MAGIC;
> -	/* What we've currently got set. */
> -	wol->wolopts = qdev->wol;
> +
> +	if (qdev->pdev->subsystem_device == 0x0068 ||
> +			qdev->pdev->subsystem_device == 0x0180) {
> +		/* What we support. */
> +		wol->supported = WAKE_MAGIC;
> +		/* What we've currently got set. */
> +		wol->wolopts = qdev->wol;
> +	}

	unsigned short ssys_dev = qdev->pdev->subsystem_device;

	if (ssys_dev == 0x0068 || ssys_dev == 0x0180) {
		wol->supported = WAKE_MAGIC;
		wol->wolopts = qdev->wol;
	}
[JK] Will do this.
>  }
>  
>  static int ql_set_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
> @@ -399,19 +403,26 @@ static int ql_set_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
>  	struct ql_adapter *qdev = netdev_priv(ndev);
>  	int status;
>  
> -	if (wol->wolopts & ~WAKE_MAGIC)
> -		return -EINVAL;
> -	qdev->wol = wol->wolopts;
> -
> -	netif_info(qdev, drv, qdev->ndev, "Set wol option 0x%x\n", qdev->wol);
> -	if (!qdev->wol) {
> -		u32 wol = 0;
> -		status = ql_mb_wol_mode(qdev, wol);
> -		netif_err(qdev, drv, qdev->ndev, "WOL %s (wol code 0x%x)\n",
> -			  status == 0 ? "cleared successfully" : "clear failed",
> -			  wol);
> +	if (qdev->pdev->subsystem_device == 0x0068 ||
> +			qdev->pdev->subsystem_device == 0x0180) {

See above.

> +		if (wol->wolopts & ~WAKE_MAGIC)
> +			return -EINVAL;
> +		qdev->wol = wol->wolopts;
> +
> +		netif_info(qdev, drv, qdev->ndev,
> +				"Set wol option 0x%x\n", qdev->wol);

qdev->ndev == ndev, right ?
[JK] Yes
> +		if (!qdev->wol) {
> +			u32 wol = 0;
> +			status = ql_mb_wol_mode(qdev, wol);

Missing empty line.

status should be declared here.
[JK] Will fix this
> +			netif_err(qdev, drv, qdev->ndev,
> +			"WOL %s (wol code 0x%x)\n",
> +			status == 0 ? "cleared successfully" : "clear failed",
> +			wol);

- broken indent
- netif_err(... "cleared successfully" ...)
[JK] I don't think so, %s will be replace with "cleared successfully" or with "clear failed" depend upon status value.
> +		}
> +	} else {
> +		netif_info(qdev, drv, qdev->ndev,
> +				"WOL is not supported on stand-up card\n");
>  	}
> -
>  	return 0;

If I read things correctly, WAKE_MAGIC can not be cleared and it makes no
difference on the status code. It should be fixed.

[JK] You right, here we are not clearing WAKE_MAGIC but we are making sure that wol is disable when no option is set.
-- 
Ueimor

^ permalink raw reply

* Re: AF_BUS socket address family
From: David Miller @ 2012-06-29 22:36 UTC (permalink / raw)
  To: vincent.sanders; +Cc: netdev, linux-kernel
In-Reply-To: <1340988354-26981-1-git-send-email-vincent.sanders@collabora.co.uk>

There is no extensive text describing why using IPv4 for this cannot
be done.  I can almost bet that nobody really, honestly, tried.

Basically this means all of our feedback from the last time we had
discussions on kernel IPC for DBUS are being completely ignored.

Therefore, I will completely ignore this patch submission.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox