[RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping


* [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping
@ 2010-01-28 17:38 Ben Hutchings
  2010-01-28 18:00 ` Stephen Hemminger
  2010-01-28 18:09 ` Krishna Kumar2
  0 siblings, 2 replies; 9+ messages in thread
From: Ben Hutchings @ 2010-01-28 17:38 UTC (permalink / raw)
  To: netdev; +Cc: linux-net-drivers

We think it's worth matching up RX and TX queue selection for a socket
when the queues share an interrupt.  Currently the default TX queue
selection is unlikely to match RX queue selection.  TX queue selection
can be overridden by the driver to match RX queue selection, but at the
expense of caching.  We found that without caching the cost of
recalculating the hash in software for each packet outweighed the
benefit of good queue selection.

This adds two simple helper functions: sk_may_set_tx_queue() reports
whether the queue mapping should be cacheable for a socket, and
netif_sk_tx_queue_set() sets it after validating the queue index.  This
will allow drivers to evaluate expensive queue selection functions
(such as Toeplitz hash functions) only once for connected sockets, and
to avoid doing so for other skbs.
---
 include/linux/netdevice.h |    3 +++
 include/net/sock.h        |   11 +++++++++++
 net/core/dev.c            |   19 ++++++++++++++++++-
 3 files changed, 32 insertions(+), 1 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a3fccc8..5354765 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1510,6 +1510,9 @@ static inline int netif_is_multiqueue(const struct net_device *dev)
 	return (dev->num_tx_queues > 1);
 }
 
+extern void netif_sk_tx_queue_set(struct net_device *dev, struct sock *sk,
+				  u16 queue_index);
+
 /* Use this variant when it is known for sure that it
  * is executing from hardware interrupt context or with hardware interrupts
  * disabled.
diff --git a/include/net/sock.h b/include/net/sock.h
index 3f1a480..176fdff 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1105,6 +1105,17 @@ static inline void sock_put(struct sock *sk)
 extern int sk_receive_skb(struct sock *sk, struct sk_buff *skb,
 			  const int nested);
 
+/**
+ *	sk_may_set_tx_queue - is socket's TX queue mapping cacheable?
+ *	@sk: socket from skb to be transmitted; may be %NULL
+ *
+ * Report whether a queue mapping should be set for this socket.
+ */
+static inline bool sk_may_set_tx_queue(const struct sock *sk)
+{
+	return sk && sk->sk_dst_cache;
+}
+
 static inline void sk_tx_queue_set(struct sock *sk, int tx_queue)
 {
 	sk->sk_tx_queue_mapping = tx_queue;
diff --git a/net/core/dev.c b/net/core/dev.c
index be9924f..09ac29e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1931,7 +1931,7 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 			if (dev->real_num_tx_queues > 1)
 				queue_index = skb_tx_hash(dev, skb);
 
-			if (sk && sk->sk_dst_cache)
+			if (sk_may_set_tx_queue(sk))
 				sk_tx_queue_set(sk, queue_index);
 		}
 	}
@@ -1940,6 +1940,23 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 	return netdev_get_tx_queue(dev, queue_index);
 }
 
+/**
+ *	netif_sk_tx_queue_set - set socket's TX queue mapping
+ *	@dev: network device used for transmission
+ *	@sk: socket from skb to be transmitted
+ *	@queue_index: chosen queue index
+ *
+ * Set the TX queue index to be used for future skbs in the same flow
+ * rather than calling the device's ndo_select_queue function.
+ * sk_may_set_tx_queue(@sk) must be true.
+ */
+void netif_sk_tx_queue_set(struct net_device *dev, struct sock *sk,
+			   u16 queue_index)
+{
+	sk_tx_queue_set(sk, dev_cap_txqueue(dev, queue_index));
+}
+EXPORT_SYMBOL(netif_sk_tx_queue_set);
+
 static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 				 struct net_device *dev,
 				 struct netdev_queue *txq)
-- 
1.5.5

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping
  2010-01-28 17:38 [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping Ben Hutchings
@ 2010-01-28 18:00 ` Stephen Hemminger
  2010-01-28 18:34   ` Ben Hutchings
  2010-01-28 18:09 ` Krishna Kumar2
  1 sibling, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2010-01-28 18:00 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, linux-net-drivers

On Thu, 28 Jan 2010 17:38:37 +0000
Ben Hutchings <bhutchings@solarflare.com> wrote:

> We think it's worth matching up RX and TX queue selection for a socket
> when the queues share an interrupt.  Currently the default TX queue
> selection is unlikely to match RX queue selection.  TX queue selection
> can be overridden by the driver to match RX queue selection, but at the
> expense of caching.  We found that without caching the cost of
> recalculating the hash in software for each packet outweighed the
> benefit of good queue selection.

Will this work with RPS and device driver hashing?
I am thinking of the problem where the hardware hash function is
Toeplitz (per NDIS spec), and the software hash function is Jhash
(what kernel uses because it is cheaper).

Therefore the transmit hash and receiver hash will be different.


-- 


* Re: [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping
  2010-01-28 17:38 [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping Ben Hutchings
  2010-01-28 18:00 ` Stephen Hemminger
@ 2010-01-28 18:09 ` Krishna Kumar2
  2010-01-28 18:10   ` Ben Hutchings
  1 sibling, 1 reply; 9+ messages in thread
From: Krishna Kumar2 @ 2010-01-28 18:09 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: linux-net-drivers, netdev

> Ben Hutchings <bhutchings@solarflare.com>
>
> We think it's worth matching up RX and TX queue selection for a socket
> when the queues share an interrupt.  Currently the default TX queue
> selection is unlikely to match RX queue selection.  TX queue selection
> can be overridden by the driver to match RX queue selection, but at the
> expense of caching.  We found that without caching the cost of
> recalculating the hash in software for each packet outweighed the
> benefit of good queue selection.
>
> This adds two simple helper functions: sk_may_set_tx_queue() reports
> whether the queue mapping should be cacheable for a socket, and
> netif_sk_tx_queue_set() sets it after validating the queue index.  This
> will allow drivers to evaluate expensive queue selection functions
> (such as Toeplitz hash functions) only once for connected sockets, and
> to avoid doing so for other skbs.
> ---
>  include/linux/netdevice.h |    3 +++
>  include/net/sock.h        |   11 +++++++++++
>  net/core/dev.c            |   19 ++++++++++++++++++-
>  3 files changed, 32 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index a3fccc8..5354765 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1510,6 +1510,9 @@ static inline int netif_is_multiqueue(const struct net_device *dev)
>     return (dev->num_tx_queues > 1);
>  }
>
> +extern void netif_sk_tx_queue_set(struct net_device *dev, struct sock *sk,
> +              u16 queue_index);
> +
>  /* Use this variant when it is known for sure that it
>   * is executing from hardware interrupt context or with hardware interrupts
>   * disabled.
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 3f1a480..176fdff 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1105,6 +1105,17 @@ static inline void sock_put(struct sock *sk)
>  extern int sk_receive_skb(struct sock *sk, struct sk_buff *skb,
>             const int nested);
>
> +/**
> + *   sk_may_set_tx_queue - is socket's TX queue mapping cacheable?
> + *   @sk: socket from skb to be transmitted; may be %NULL
> + *
> + * Report whether a queue mapping should be set for this socket.
> + */
> +static inline bool sk_may_set_tx_queue(const struct sock *sk)
> +{
> +   return sk && sk->sk_dst_cache;
> +}
> +
>  static inline void sk_tx_queue_set(struct sock *sk, int tx_queue)
>  {
>     sk->sk_tx_queue_mapping = tx_queue;
> diff --git a/net/core/dev.c b/net/core/dev.c
> index be9924f..09ac29e 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1931,7 +1931,7 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
>           if (dev->real_num_tx_queues > 1)
>              queue_index = skb_tx_hash(dev, skb);
>
> -         if (sk && sk->sk_dst_cache)
> +         if (sk_may_set_tx_queue(sk))
>              sk_tx_queue_set(sk, queue_index);
>        }
>     }
> @@ -1940,6 +1940,23 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
>     return netdev_get_tx_queue(dev, queue_index);
>  }
>
> +/**
> + *   netif_sk_tx_queue_set - set socket's TX queue mapping
> + *   @dev: network device used for transmission
> + *   @sk: socket from skb to be transmitted
> + *   @queue_index: chosen queue index
> + *
> + * Set the TX queue index to be used for future skbs in the same flow
> + * rather than calling the device's ndo_select_queue function.
> + * sk_may_set_tx_queue(@sk) must be true.
> + */
> +void netif_sk_tx_queue_set(struct net_device *dev, struct sock *sk,
> +            u16 queue_index)
> +{
> +   sk_tx_queue_set(sk, dev_cap_txqueue(dev, queue_index));
> +}
> +EXPORT_SYMBOL(netif_sk_tx_queue_set);
> +
>  static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
>               struct net_device *dev,
>               struct netdev_queue *txq)

I am confused about this - isn't the txq# and rxq# already
matched due to the check for skb_rx_queue_recorded?
sk_dst_cache is also not set for these routing/forwarding
workloads; it can be relied on only for locally connected
xmits.

Other than that, I saw netif_sk_tx_queue_set is not called.
And dev_pick_tx has already capped automatically, you probably
don't need another here?

thanks,

- KK



* Re: [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping
  2010-01-28 18:09 ` Krishna Kumar2
@ 2010-01-28 18:10   ` Ben Hutchings
  2010-01-28 18:41     ` Krishna Kumar2
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Hutchings @ 2010-01-28 18:10 UTC (permalink / raw)
  To: Krishna Kumar2; +Cc: linux-net-drivers, netdev

On Thu, 2010-01-28 at 23:39 +0530, Krishna Kumar2 wrote:
> > Ben Hutchings <bhutchings@solarflare.com>
> >
> > We think it's worth matching up RX and TX queue selection for a socket
> > when the queues share an interrupt.  Currently the default TX queue
> > selection is unlikely to match RX queue selection.  TX queue selection
> > can be overridden by the driver to match RX queue selection, but at the
> > expense of caching.  We found that without caching the cost of
> > recalculating the hash in software for each packet outweighed the
> > benefit of good queue selection.
> >
> > This adds two simple helper functions: sk_may_set_tx_queue() reports
> > whether the queue mapping should be cacheable for a socket, and
> > netif_sk_tx_queue_set() sets it after validating the queue index.  This
> > will allow drivers to evaluate expensive queue selection functions
> > (such as Toeplitz hash functions) only once for connected sockets, and
> > to avoid doing so for other skbs.
[...]
> I am confused about this - isn't the txq# and rxq# already
> matched due to the check for skb_rx_queue_recorded?

When forwarding, yes.

> sk_dst_cache is also not set for these routing/forwarding
> workloads; it can be relied on only for locally connected
> xmits.

Correct, that's exactly what it's for.

> Other than that, I saw netif_sk_tx_queue_set is not called.
> And dev_pick_tx has already capped automatically, you probably
> don't need another here?

Only the return value of ndo_select_queue() is capped; the cached value
is assumed to be valid.
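
The capping step can be modelled in userspace as follows.  This is only
a sketch: model_txq_dev and model_cap_txqueue() are hypothetical
stand-ins for struct net_device and dev_cap_txqueue(), and it assumes
that an out-of-range index falls back to queue 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the one net_device field used here. */
struct model_txq_dev {
	uint16_t real_num_tx_queues;
};

/*
 * Model of the capping step (assumption: an index at or beyond
 * real_num_tx_queues is replaced by queue 0).  A value cached in the
 * socket is never passed through this check again, which is why the
 * setter has to apply it before storing the index.
 */
static uint16_t model_cap_txqueue(const struct model_txq_dev *dev,
				  uint16_t queue_index)
{
	if (queue_index >= dev->real_num_tx_queues)
		return 0;
	return queue_index;
}
```

With four real TX queues, index 3 passes through unchanged and index 4
is replaced by 0.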

Ben.




* Re: [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping
  2010-01-28 18:00 ` Stephen Hemminger
@ 2010-01-28 18:34   ` Ben Hutchings
  2010-02-04  1:47     ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Hutchings @ 2010-01-28 18:34 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, linux-net-drivers

On Thu, 2010-01-28 at 10:00 -0800, Stephen Hemminger wrote:
> On Thu, 28 Jan 2010 17:38:37 +0000
> Ben Hutchings <bhutchings@solarflare.com> wrote:
> 
> > We think it's worth matching up RX and TX queue selection for a socket
> > when the queues share an interrupt.  Currently the default TX queue
> > selection is unlikely to match RX queue selection.  TX queue selection
> > can be overridden by the driver to match RX queue selection, but at the
> > expense of caching.  We found that without caching the cost of
> > recalculating the hash in software for each packet outweighed the
> > benefit of good queue selection.
> 
> Will this work with RPS and device driver hashing?

If the device only has a single TX queue this can't possibly help it.

In the case where a device has multiple RX and TX queues, I think it is
preferable that contention for TX queues is likely to be among cores
that are close together, not spread over the whole system.  I assume
that RPS will normally be configured to spread traffic from each
hardware RX queue to a group of cores that are close together (same NUMA
node, maybe sharing some caches).  In that case it would also be
desirable to limit contention for a TX queue to within these same groups
of cores, so TX queue selection ought to match hardware RX queue
selection.  So I don't think this should conflict with RPS.

> I am thinking of the problem where the hardware hash function is
> Toeplitz (per NDIS spec), and the software hash function is Jhash
> (what kernel uses because it is cheaper).
> 
> Therefore the transmit hash and receiver hash will be different.

The intent here is to make it worthwhile to calculate that expensive
hash on the TX side.
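
For a sense of why the software cost matters, here is a minimal
bit-serial Toeplitz hash in userspace C.  It is a sketch, not code from
any driver: toeplitz_hash() is a hypothetical name, and the caller is
assumed to supply a key at least len + 4 bytes long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-serial Toeplitz hash.  For every set bit of the input, XOR in
 * the 32-bit window of the key that starts at the same bit offset.
 * The key must be at least len + 4 bytes long.
 */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *data,
			      size_t len)
{
	/* The window starts out holding key bits 0..31. */
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8) | (uint32_t)key[3];
	uint32_t hash = 0;

	for (size_t i = 0; i < len; i++) {
		for (int bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window left by one key bit. */
			window = (window << 1) | ((key[i + 4] >> bit) & 1u);
		}
	}
	return hash;
}
```

The inner loop runs once per input bit, so a 12-byte IPv4 address and
port tuple costs 96 iterations in this form.  Table-driven variants
trade memory for speed, but the work is still per packet unless the
result is cached.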

Ben.



* Re: [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping
  2010-01-28 18:41     ` Krishna Kumar2
@ 2010-01-28 18:40       ` Ben Hutchings
  2010-01-28 19:04         ` Krishna Kumar2
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Hutchings @ 2010-01-28 18:40 UTC (permalink / raw)
  To: Krishna Kumar2; +Cc: linux-net-drivers, netdev

On Fri, 2010-01-29 at 00:11 +0530, Krishna Kumar2 wrote:
> > Ben Hutchings <bhutchings@solarflare.com>
> >
> > On Thu, 2010-01-28 at 23:39 +0530, Krishna Kumar2 wrote:
[...]
> > > Other than that, I saw netif_sk_tx_queue_set is not called.
> > > And dev_pick_tx has already capped automatically, you probably
> > > don't need another here?
> >
> > Only the return value of ndo_select_queue() is capped; the cached value
> > is assumed to be valid.
> 
> +void netif_sk_tx_queue_set(struct net_device *dev, struct sock *sk,
> +                             u16 queue_index)
> +{
> +     sk_tx_queue_set(sk, dev_cap_txqueue(dev, queue_index));
> +}
> 
> I guess I didn't understand this then, who calls this function?
[...]

The driver's ndo_select_queue() implementation calls it before
returning, if and only if sk_may_set_tx_queue() is true and its
selection is dependent only on the flow id.

As an example, ixgbe's selection function:

        static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb)
        {
        	struct ixgbe_adapter *adapter = netdev_priv(dev);
        	int txq = smp_processor_id();
        
        	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
        		return txq;
        
        #ifdef IXGBE_FCOE
        	if ((adapter->flags & IXGBE_FLAG_FCOE_ENABLED) &&
        	    (skb->protocol == htons(ETH_P_FCOE))) {
        		txq &= (adapter->ring_feature[RING_F_FCOE].indices - 1);
        		txq += adapter->ring_feature[RING_F_FCOE].mask;
        		return txq;
        	}
        #endif
        	if (adapter->flags & IXGBE_FLAG_DCB_ENABLED)
        		return (skb->vlan_tci & IXGBE_TX_FLAGS_VLAN_PRIO_MASK) >> 13;
        
        	return skb_tx_hash(dev, skb);
        }
        
would not call netif_sk_tx_queue_set() in the first two cases, but could
do so in the last two cases if sk_may_set_tx_queue(skb->sk) is true.
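
The calling pattern can be sketched as a small userspace model.  Every
model_* name below is a hypothetical stand-in for the kernel structure
or helper it resembles, and the modulo "hash" merely marks where an
expensive flow hash would run.  real_num_tx_queues is assumed to be
non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_sock {
	bool has_dst_cache;	/* stands in for sk->sk_dst_cache != NULL */
	int tx_queue_mapping;	/* -1 means no queue recorded */
};

struct model_dev {
	uint16_t real_num_tx_queues;	/* assumed non-zero */
	unsigned int expensive_calls;	/* how often the costly hash ran */
};

static bool model_sk_may_set_tx_queue(const struct model_sock *sk)
{
	return sk && sk->has_dst_cache;
}

/* Cap the index before caching; the cached value is trusted later. */
static void model_netif_sk_tx_queue_set(struct model_dev *dev,
					struct model_sock *sk,
					uint16_t queue_index)
{
	if (queue_index >= dev->real_num_tx_queues)
		queue_index = 0;
	sk->tx_queue_mapping = queue_index;
}

/* Driver hook: the choice depends only on the flow id, so cache it. */
static uint16_t model_select_queue(struct model_dev *dev,
				   struct model_sock *sk, uint32_t flow_id)
{
	uint16_t txq;

	dev->expensive_calls++;
	txq = (uint16_t)(flow_id % dev->real_num_tx_queues);
	if (model_sk_may_set_tx_queue(sk))
		model_netif_sk_tx_queue_set(dev, sk, txq);
	return txq;
}

/* Core path: use the cached mapping if present, else ask the driver. */
static uint16_t model_pick_tx(struct model_dev *dev, struct model_sock *sk,
			      uint32_t flow_id)
{
	if (sk && sk->tx_queue_mapping >= 0)
		return (uint16_t)sk->tx_queue_mapping;
	return model_select_queue(dev, sk, flow_id);
}
```

In this model a connected socket pays for the hash once, while a socket
without a cached route (or an skb with no socket) pays on every packet.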

Ben.




* Re: [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping
  2010-01-28 18:10   ` Ben Hutchings
@ 2010-01-28 18:41     ` Krishna Kumar2
  2010-01-28 18:40       ` Ben Hutchings
  0 siblings, 1 reply; 9+ messages in thread
From: Krishna Kumar2 @ 2010-01-28 18:41 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: linux-net-drivers, netdev

> Ben Hutchings <bhutchings@solarflare.com>
>
> On Thu, 2010-01-28 at 23:39 +0530, Krishna Kumar2 wrote:
> > > Ben Hutchings <bhutchings@solarflare.com>
> > >
> > > > We think it's worth matching up RX and TX queue selection for a socket
> > > > when the queues share an interrupt.  Currently the default TX queue
> > > > selection is unlikely to match RX queue selection.  TX queue selection
> > > > can be overridden by the driver to match RX queue selection, but at the
> > > > expense of caching.  We found that without caching the cost of
> > > > recalculating the hash in software for each packet outweighed the
> > > > benefit of good queue selection.
> > > >
> > > > This adds two simple helper functions: sk_may_set_tx_queue() reports
> > > > whether the queue mapping should be cacheable for a socket, and
> > > > netif_sk_tx_queue_set() sets it after validating the queue index.  This
> > > > will allow drivers to evaluate expensive queue selection functions
> > > > (such as Toeplitz hash functions) only once for connected sockets, and
> > > > to avoid doing so for other skbs.
> [...]
> > I am confused about this - isn't the txq# and rxq# already
> > matched due to the check for skb_rx_queue_recorded?
>
> When forwarding, yes.
>
> > > sk_dst_cache is also not set for these routing/forwarding
> > > workloads; it can be relied on only for locally connected
> > > xmits.
>
> Correct, that's exactly what it's for.
>
> > Other than that, I saw netif_sk_tx_queue_set is not called.
> > And dev_pick_tx has already capped automatically, you probably
> > don't need another here?
>
> Only the return value of ndo_select_queue() is capped; the cached value
> is assumed to be valid.

+void netif_sk_tx_queue_set(struct net_device *dev, struct sock *sk,
+                             u16 queue_index)
+{
+     sk_tx_queue_set(sk, dev_cap_txqueue(dev, queue_index));
+}

I guess I didn't understand this then, who calls this function? Is
it after getting the txq from the driver? In which case, are you
planning to do something like:

static struct netdev_queue *dev_pick_tx(struct net_device *dev,
                                        struct sk_buff *skb)
{
      u16 queue_index;
      struct sock *sk = skb->sk;

      if (sk_tx_queue_recorded(sk)) {
            queue_index = sk_tx_queue_get(sk);
      } else {
            const struct net_device_ops *ops = dev->netdev_ops;

            if (ops->ndo_select_queue) {
                  queue_index = ops->ndo_select_queue(dev, skb);
                  queue_index = dev_cap_txqueue(dev, queue_index);
            } else {
                  queue_index = 0;
                  if (dev->real_num_tx_queues > 1)
                        queue_index = skb_tx_hash(dev, skb); /* already capped */
            }

            if (sk_may_set_tx_queue(sk))
                  sk_tx_queue_set(sk, queue_index);
      }
      skb_set_queue_mapping(skb, queue_index);
      return netdev_get_tx_queue(dev, queue_index);
}

Thanks,

- KK



* Re: [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping
  2010-01-28 18:40       ` Ben Hutchings
@ 2010-01-28 19:04         ` Krishna Kumar2
  0 siblings, 0 replies; 9+ messages in thread
From: Krishna Kumar2 @ 2010-01-28 19:04 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: linux-net-drivers, netdev

> Ben Hutchings <bhutchings@solarflare.com>
>
> Re: [RFC] [PATCH] net: Add support for ndo_select_queue() functions
> to cache the queue mapping
>
> On Fri, 2010-01-29 at 00:11 +0530, Krishna Kumar2 wrote:
> > > Ben Hutchings <bhutchings@solarflare.com>
> > >
> > > On Thu, 2010-01-28 at 23:39 +0530, Krishna Kumar2 wrote:
> [...]
> > > > Other than that, I saw netif_sk_tx_queue_set is not called.
> > > > And dev_pick_tx has already capped automatically, you probably
> > > > don't need another here?
> > >
> > > > Only the return value of ndo_select_queue() is capped; the cached
> > > > value is assumed to be valid.
> >
> > +void netif_sk_tx_queue_set(struct net_device *dev, struct sock *sk,
> > +                             u16 queue_index)
> > +{
> > +     sk_tx_queue_set(sk, dev_cap_txqueue(dev, queue_index));
> > +}
> >
> > I guess I didn't understand this then, who calls this function?
> [...]
>
> The driver's ndo_select_queue() implementation calls it before
> returning, if and only if sk_may_set_tx_queue() is true and its
> selection is dependent only on the flow id.
>
> As an example, ixgbe's selection function:
>
>         static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb)
>         {
>            struct ixgbe_adapter *adapter = netdev_priv(dev);
>            int txq = smp_processor_id();
>
>            if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
>               return txq;
>
>         #ifdef IXGBE_FCOE
>            if ((adapter->flags & IXGBE_FLAG_FCOE_ENABLED) &&
>                (skb->protocol == htons(ETH_P_FCOE))) {
>               txq &= (adapter->ring_feature[RING_F_FCOE].indices - 1);
>               txq += adapter->ring_feature[RING_F_FCOE].mask;
>               return txq;
>            }
>         #endif
>            if (adapter->flags & IXGBE_FLAG_DCB_ENABLED)
>               return (skb->vlan_tci & IXGBE_TX_FLAGS_VLAN_PRIO_MASK) >> 13;
>
>            return skb_tx_hash(dev, skb);
>         }
>
> would not call netif_sk_tx_queue_set() in the first two cases, but could
> do so in the last two cases if sk_may_set_tx_queue(skb->sk) is true.

That's a good optimization - saves calls to driver and hash
calculation for every skb (for devices that have select_txq
but find that they can cache after the first time).

Thanks,

- KK



* Re: [RFC] [PATCH] net: Add support for ndo_select_queue() functions to cache the queue mapping
  2010-01-28 18:34   ` Ben Hutchings
@ 2010-02-04  1:47     ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2010-02-04  1:47 UTC (permalink / raw)
  To: bhutchings; +Cc: shemminger, netdev, linux-net-drivers

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 28 Jan 2010 18:34:34 +0000

> On Thu, 2010-01-28 at 10:00 -0800, Stephen Hemminger wrote:
>> Therefore the transmit hash and receiver hash will be different.
> 
> The intent here is to make it worthwhile to calculate that expensive
> hash on the TX side.

I would rather something like this be implemented in a way
which is device hash agnostic.

And I believe it is possible to do this.

Anyways, why can't you just propagate the RX queue used by the flow to
the socket TX hash?  We already store the RX queue in the SKB on
receive, the socket packet receive handling can simply propagate that
to the TX queue selector.

This would avoid having to do anything special in the driver's select
queue method.  In any event, overriding the queue selection method,
while not absolutely forbidden, is very much discouraged.  And if
this new scheme of yours creates another reason drivers will start
to override it, that's another negative in my book.

But most importantly, the last thing we want to do is calculate
something insane like Toeplitz, even if it's just once per connection.
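
The RX-queue propagation idea above can be sketched as a userspace
model.  The rxprop_* names are hypothetical, the "queue + 1" encoding
of a recorded RX queue is modelled on how the skb stores it, and
folding the RX index into the TX range with a modulo is an assumption
of this sketch rather than a proposed policy.

```c
#include <assert.h>
#include <stdint.h>

struct rxprop_skb {
	uint16_t queue_mapping;	/* 0 = no RX queue recorded, else queue + 1 */
};

struct rxprop_sock {
	int tx_queue_mapping;	/* -1 means no queue recorded */
};

/* Driver RX path: remember which hardware queue delivered the skb. */
static void rxprop_record_rx_queue(struct rxprop_skb *skb, uint16_t rx_queue)
{
	skb->queue_mapping = (uint16_t)(rx_queue + 1);
}

/* Socket receive path: copy the recorded RX queue into the TX mapping. */
static void rxprop_sock_rx(struct rxprop_sock *sk,
			   const struct rxprop_skb *skb,
			   uint16_t num_tx_queues)
{
	if (skb->queue_mapping == 0 || num_tx_queues == 0)
		return;
	sk->tx_queue_mapping = (skb->queue_mapping - 1) % num_tx_queues;
}

/* TX path: no hash at all once the socket has seen a received skb. */
static uint16_t rxprop_pick_tx(const struct rxprop_sock *sk, uint16_t fallback)
{
	if (sk->tx_queue_mapping >= 0)
		return (uint16_t)sk->tx_queue_mapping;
	return fallback;
}
```

Nothing here depends on which hash the device used to pick the RX
queue, so the scheme stays agnostic of the device hash.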
