* [RFC PATCH net-next-2.6 1/2] net: XPS: Allow driver to provide a default mapping of CPUs to TX queues
From: Ben Hutchings @ 2011-02-18 16:15 UTC (permalink / raw)
To: Tom Herbert; +Cc: netdev, linux-net-drivers
As with RFS acceleration, drivers may use the irq_cpu_rmap facility to
provide a mapping from each CPU to the 'nearest' TX queue.
Define CONFIG_PACKET_STEERING and CONFIG_NET_IRQ_CPU_RMAP so that
drivers can make all cpu_rmap initialisation conditional on the
latter.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
include/linux/netdevice.h | 10 +++++++---
net/Kconfig | 17 ++++++++++++-----
net/core/dev.c | 41 +++++++++++++++++++++++------------------
3 files changed, 42 insertions(+), 26 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c7d7074..c5cba16 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1080,14 +1080,13 @@ struct net_device {
/* Number of RX queues currently active in device */
unsigned int real_num_rx_queues;
-
-#ifdef CONFIG_RFS_ACCEL
+#endif
+#ifdef CONFIG_NET_IRQ_CPU_RMAP
/* CPU reverse-mapping for RX completion interrupts, indexed
* by RX queue number. Assigned by driver. This must only be
* set if the ndo_rx_flow_steer operation is defined. */
struct cpu_rmap *rx_cpu_rmap;
#endif
-#endif
rx_handler_func_t __rcu *rx_handler;
void __rcu *rx_handler_data;
@@ -1114,6 +1113,11 @@ struct net_device {
#ifdef CONFIG_XPS
struct xps_dev_maps __rcu *xps_maps;
#endif
+#ifdef CONFIG_NET_IRQ_CPU_RMAP
+ /* CPU reverse-mapping for TX completion interrupts. Assigned
+ * by driver. */
+ struct cpu_rmap *tx_cpu_rmap;
+#endif
/* These may be needed for future network-power-down code. */
diff --git a/net/Kconfig b/net/Kconfig
index 79cabf1..fa0a093 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -216,21 +216,28 @@ source "net/dcb/Kconfig"
source "net/dns_resolver/Kconfig"
source "net/batman-adv/Kconfig"
-config RPS
+config PACKET_STEERING
boolean
depends on SMP && SYSFS && USE_GENERIC_SMP_HELPERS
+ select RPS
+ select XPS
default y
-config RFS_ACCEL
+config NET_IRQ_CPU_RMAP
boolean
- depends on RPS && GENERIC_HARDIRQS
+ depends on PACKET_STEERING && GENERIC_HARDIRQS
select CPU_RMAP
+ select RFS_ACCEL
default y
+config RPS
+ boolean
+
+config RFS_ACCEL
+ boolean
+
config XPS
boolean
- depends on SMP && SYSFS && USE_GENERIC_SMP_HELPERS
- default y
menu "Network testing"
diff --git a/net/core/dev.c b/net/core/dev.c
index 54aaca6..dd38b88 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2257,27 +2257,32 @@ static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
rcu_read_lock();
dev_maps = rcu_dereference(dev->xps_maps);
- if (dev_maps) {
+ if (dev_maps)
map = rcu_dereference(
- dev_maps->cpu_map[raw_smp_processor_id()]);
- if (map) {
- if (map->len == 1)
- queue_index = map->queues[0];
- else {
- u32 hash;
- if (skb->sk && skb->sk->sk_hash)
- hash = skb->sk->sk_hash;
- else
- hash = (__force u16) skb->protocol ^
- skb->rxhash;
- hash = jhash_1word(hash, hashrnd);
- queue_index = map->queues[
- ((u64)hash * map->len) >> 32];
- }
- if (unlikely(queue_index >= dev->real_num_tx_queues))
- queue_index = -1;
+ dev_maps->cpu_map[raw_smp_processor_id()]);
+ if (map) {
+ if (map->len == 1)
+ queue_index = map->queues[0];
+ else {
+ u32 hash;
+ if (skb->sk && skb->sk->sk_hash)
+ hash = skb->sk->sk_hash;
+ else
+ hash = (__force u16) skb->protocol ^
+ skb->rxhash;
+ hash = jhash_1word(hash, hashrnd);
+ queue_index = map->queues[
+ ((u64)hash * map->len) >> 32];
}
+ if (unlikely(queue_index >= dev->real_num_tx_queues))
+ queue_index = -1;
+ }
+#ifdef CONFIG_NET_IRQ_CPU_RMAP
+ else if (dev->tx_cpu_rmap) {
+ queue_index = cpu_rmap_lookup_index(dev->tx_cpu_rmap,
+ raw_smp_processor_id());
}
+#endif
rcu_read_unlock();
return queue_index;
--
1.7.3.4
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* [RFC PATCH net-next-2.6 2/2] sfc: Add CPU queue mapping for XPS
From: Ben Hutchings @ 2011-02-18 16:15 UTC (permalink / raw)
To: Tom Herbert; +Cc: netdev, linux-net-drivers
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
drivers/net/sfc/efx.c | 59 ++++++++++++++++++++++++++++++++++---------------
1 files changed, 41 insertions(+), 18 deletions(-)
diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 35b7bc5..6d698c3 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1179,9 +1179,11 @@ static int efx_wanted_channels(void)
}
static int
-efx_init_rx_cpu_rmap(struct efx_nic *efx, struct msix_entry *xentries)
+efx_init_cpu_rmaps(struct efx_nic *efx, struct msix_entry *xentries)
{
-#ifdef CONFIG_RFS_ACCEL
+#ifndef CONFIG_NET_IRQ_CPU_RMAP
+ return 0;
+#else
int i, rc;
efx->net_dev->rx_cpu_rmap = alloc_irq_cpu_rmap(efx->n_rx_channels);
@@ -1190,14 +1192,38 @@ efx_init_rx_cpu_rmap(struct efx_nic *efx, struct msix_entry *xentries)
for (i = 0; i < efx->n_rx_channels; i++) {
rc = irq_cpu_rmap_add(efx->net_dev->rx_cpu_rmap,
xentries[i].vector);
- if (rc) {
- free_irq_cpu_rmap(efx->net_dev->rx_cpu_rmap);
- efx->net_dev->rx_cpu_rmap = NULL;
- return rc;
+ if (rc)
+ goto fail_rx;
+ }
+
+ if (efx->tx_channel_offset == 0) {
+ efx->net_dev->tx_cpu_rmap = efx->net_dev->rx_cpu_rmap;
+ } else {
+ efx->net_dev->tx_cpu_rmap =
+ alloc_irq_cpu_rmap(efx->n_tx_channels);
+ if (!efx->net_dev->tx_cpu_rmap) {
+ rc = -ENOMEM;
+ goto fail_rx;
+ }
+ for (i = 0; i < efx->n_tx_channels; i++) {
+ rc = irq_cpu_rmap_add(
+ efx->net_dev->tx_cpu_rmap,
+ xentries[efx->tx_channel_offset + i].vector);
+ if (rc)
+ goto fail_tx;
}
}
-#endif
+
return 0;
+
+fail_tx:
+ free_irq_cpu_rmap(efx->net_dev->tx_cpu_rmap);
+ efx->net_dev->tx_cpu_rmap = NULL;
+fail_rx:
+ free_irq_cpu_rmap(efx->net_dev->rx_cpu_rmap);
+ efx->net_dev->rx_cpu_rmap = NULL;
+ return rc;
+#endif
}
/* Probe the number and type of interrupts we are able to obtain, and
@@ -1238,14 +1264,15 @@ static int efx_probe_interrupts(struct efx_nic *efx)
if (separate_tx_channels) {
efx->n_tx_channels =
max(efx->n_channels / 2, 1U);
+ efx->tx_channel_offset =
+ efx->n_channels - efx->n_tx_channels;
efx->n_rx_channels =
- max(efx->n_channels -
- efx->n_tx_channels, 1U);
+ max(efx->tx_channel_offset, 1U);
} else {
efx->n_tx_channels = efx->n_channels;
efx->n_rx_channels = efx->n_channels;
}
- rc = efx_init_rx_cpu_rmap(efx, xentries);
+ rc = efx_init_cpu_rmaps(efx, xentries);
if (rc) {
pci_disable_msix(efx->pci_dev);
return rc;
@@ -1301,12 +1328,6 @@ static void efx_remove_interrupts(struct efx_nic *efx)
efx->legacy_irq = 0;
}
-static void efx_set_channels(struct efx_nic *efx)
-{
- efx->tx_channel_offset =
- separate_tx_channels ? efx->n_channels - efx->n_tx_channels : 0;
-}
-
static int efx_probe_nic(struct efx_nic *efx)
{
size_t i;
@@ -1330,7 +1351,6 @@ static int efx_probe_nic(struct efx_nic *efx)
for (i = 0; i < ARRAY_SIZE(efx->rx_indir_table); i++)
efx->rx_indir_table[i] = i % efx->n_rx_channels;
- efx_set_channels(efx);
netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels);
netif_set_real_num_rx_queues(efx->net_dev, efx->n_rx_channels);
@@ -2315,7 +2335,10 @@ static void efx_fini_struct(struct efx_nic *efx)
*/
static void efx_pci_remove_main(struct efx_nic *efx)
{
-#ifdef CONFIG_RFS_ACCEL
+#ifdef CONFIG_NET_IRQ_CPU_RMAP
+ if (efx->net_dev->tx_cpu_rmap != efx->net_dev->rx_cpu_rmap)
+ free_irq_cpu_rmap(efx->net_dev->tx_cpu_rmap);
+ efx->net_dev->tx_cpu_rmap = NULL;
free_irq_cpu_rmap(efx->net_dev->rx_cpu_rmap);
efx->net_dev->rx_cpu_rmap = NULL;
#endif
--
1.7.3.4
* Re: [RFC PATCH net-next-2.6 0/2] Automatic XPS mapping
From: Tom Herbert @ 2011-07-19 17:07 UTC (permalink / raw)
To: Ben Hutchings; +Cc: netdev, sf-linux-drivers
Hi Ben,
I've finally gotten around to looking at how XPS interacts with the HW
priority queues...
On Fri, Feb 18, 2011 at 8:13 AM, Ben Hutchings
<bhutchings@solarflare.com> wrote:
> In the same way that we maintain a mapping of CPUs to RX queues for RFS
> acceleration based on current IRQ affinity and the CPU topology, we can
> maintain a mapping of CPUs to TX queues for queue selection in XPS. (In
> fact this may be the same mapping.)
>
Any more progress on this? It seems like a good way to provide a usable
default configuration for XPS.
> Questions:
> - Does this make a real difference to performance?
XPS seems to make a difference when configured correctly :-)
> (I've only barely tested this.)
> - Should there be a way to disable it?
> - Should the automatic mapping be made visible?
Yes, it would be nice for this to be readable in the tx-<n> directory for
the queue. Same thing for the rmap in RFS acceleration.
> (This applies to RFS acceleration too.)
> - Should different mappings be allowed for different traffic classes,
> in case they have separate sets of TX interrupts with different
> affinity?
> (This applies to manual XPS configuration too.)
>
Yes. Looking at XPS and the HW traffic class support, I realized that
they don't seem to play together at all. If XPS is enabled, we don't
do the skb_tx_hash which is where the priority is taken into account.
It's probably worse than that: AFAICT XPS could be configured so that
packets are inadvertently sent on arbitrary priority queues.
I think the correct approach is to first choose a set of queues by
priority, and then among those queues perform XPS. Probably requires
some new configuration to make this hierarchy visible.
Tom
> Ben Hutchings (2):
> net: XPS: Allow driver to provide a default mapping of CPUs to TX
> queues
> sfc: Add CPU queue mapping for XPS
>
> drivers/net/sfc/efx.c | 59 +++++++++++++++++++++++++++++++-------------
> include/linux/netdevice.h | 10 +++++--
> net/Kconfig | 17 +++++++++----
> net/core/dev.c | 41 +++++++++++++++++-------------
> 4 files changed, 83 insertions(+), 44 deletions(-)