netdev.vger.kernel.org archive mirror
* [PATCH net-next] packet: respect devices with LLTX flag in direct xmit
@ 2014-03-27 15:38 Daniel Borkmann
  2014-03-27 16:01 ` Eric Dumazet
  2014-03-28 20:50 ` David Miller
  0 siblings, 2 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-03-27 15:38 UTC
  To: davem; +Cc: netdev, Jesper Dangaard Brouer

Quite often it can be useful to test with dummy or similar
devices as a blackhole sink for skbs. Such devices are only
equipped with a single txq, but marked as NETIF_F_LLTX as
they do not require locking their internal queues on xmit
(or implement locking themselves). Therefore, rather use
HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected.

trafgen mmap/TX_RING example against dummy device with config
foo: { fill(0xff, 64) } results in the following performance
improvements for such scenarios on an ordinary Core i7/2.80GHz:

Before:

 Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):

   160,975,944,159 instructions:k            #    0.55  insns per cycle          ( +-  0.09% )
   293,319,390,278 cycles:k                  #    0.000 GHz                      ( +-  0.35% )
       192,501,104 branch-misses:k                                               ( +-  1.63% )
               831 context-switches:k                                            ( +-  9.18% )
                 7 cpu-migrations:k                                              ( +-  7.40% )
            69,382 cache-misses:k            #    0.010 % of all cache refs      ( +-  2.18% )
       671,552,021 cache-references:k                                            ( +-  1.29% )

      22.856401569 seconds time elapsed                                          ( +-  0.33% )

After:

 Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):

   133,788,739,692 instructions:k            #    0.92  insns per cycle          ( +-  0.06% )
   145,853,213,256 cycles:k                  #    0.000 GHz                      ( +-  0.17% )
        59,867,100 branch-misses:k                                               ( +-  4.72% )
               384 context-switches:k                                            ( +-  3.76% )
                 6 cpu-migrations:k                                              ( +-  6.28% )
            70,304 cache-misses:k            #    0.077 % of all cache refs      ( +-  1.73% )
        90,879,408 cache-references:k                                            ( +-  1.35% )

      11.719372413 seconds time elapsed                                          ( +-  0.24% )

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/packet/af_packet.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 097a354..01039d2 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -243,40 +243,40 @@ static int packet_direct_xmit(struct sk_buff *skb)
 	const struct net_device_ops *ops = dev->netdev_ops;
 	netdev_features_t features;
 	struct netdev_queue *txq;
+	int ret = NETDEV_TX_BUSY;
 	u16 queue_map;
-	int ret;
 
 	if (unlikely(!netif_running(dev) ||
-		     !netif_carrier_ok(dev))) {
-		kfree_skb(skb);
-		return NET_XMIT_DROP;
-	}
+		     !netif_carrier_ok(dev)))
+		goto drop;
 
 	features = netif_skb_features(skb);
 	if (skb_needs_linearize(skb, features) &&
-	    __skb_linearize(skb)) {
-		kfree_skb(skb);
-		return NET_XMIT_DROP;
-	}
+	    __skb_linearize(skb))
+		goto drop;
 
 	queue_map = skb_get_queue_mapping(skb);
 	txq = netdev_get_tx_queue(dev, queue_map);
 
-	__netif_tx_lock_bh(txq);
-	if (unlikely(netif_xmit_frozen_or_stopped(txq))) {
-		ret = NETDEV_TX_BUSY;
-		kfree_skb(skb);
-		goto out;
+	local_bh_disable();
+
+	HARD_TX_LOCK(dev, txq, smp_processor_id());
+	if (!netif_xmit_frozen_or_stopped(txq)) {
+		ret = ops->ndo_start_xmit(skb, dev);
+		if (ret == NETDEV_TX_OK)
+			txq_trans_update(txq);
 	}
+	HARD_TX_UNLOCK(dev, txq);
 
-	ret = ops->ndo_start_xmit(skb, dev);
-	if (likely(dev_xmit_complete(ret)))
-		txq_trans_update(txq);
-	else
+	local_bh_enable();
+
+	if (!dev_xmit_complete(ret))
 		kfree_skb(skb);
-out:
-	__netif_tx_unlock_bh(txq);
+
 	return ret;
+drop:
+	kfree_skb(skb);
+	return NET_XMIT_DROP;
 }
 
 static struct net_device *packet_cached_dev_get(struct packet_sock *po)
-- 
1.7.11.7
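
For readers following along, here is a minimal userspace sketch of the locking behaviour the commit message describes: HARD_TX_LOCK only takes the per-queue xmit lock when the device does not advertise NETIF_F_LLTX. All model_* and MODEL_* names are hypothetical stand-ins, not kernel API, and the "lock" is just an owner field plus a counter so that the effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for NETIF_F_LLTX; the real flag lives in the kernel. */
#define MODEL_F_LLTX (1u << 0)

struct model_txq {
	int lock_owner;                  /* CPU holding the xmit lock, -1 if free */
	unsigned long lock_acquisitions; /* how often the lock was actually taken */
};

struct model_dev {
	unsigned int features;
	struct model_txq txq;
};

static void model_dev_init(struct model_dev *dev, unsigned int features)
{
	assert(dev != NULL);
	dev->features = features;
	dev->txq.lock_owner = -1;
	dev->txq.lock_acquisitions = 0;
}

/* Models HARD_TX_LOCK(): skip the lock entirely for LLTX devices. */
static void model_hard_tx_lock(struct model_dev *dev, int cpu)
{
	assert(dev != NULL);
	if (!(dev->features & MODEL_F_LLTX)) {
		assert(dev->txq.lock_owner == -1);
		dev->txq.lock_owner = cpu;
		dev->txq.lock_acquisitions++;
	}
}

/* Models HARD_TX_UNLOCK(): nothing to release for LLTX devices. */
static void model_hard_tx_unlock(struct model_dev *dev)
{
	assert(dev != NULL);
	if (!(dev->features & MODEL_F_LLTX))
		dev->txq.lock_owner = -1;
}

/* "Transmit" n packets and report how many lock acquisitions that cost. */
static unsigned long model_xmit_burst(struct model_dev *dev, int cpu,
				      unsigned long n)
{
	for (unsigned long i = 0; i < n; i++) {
		model_hard_tx_lock(dev, cpu);
		/* ops->ndo_start_xmit(skb, dev) would run here */
		model_hard_tx_unlock(dev);
	}
	return dev->txq.lock_acquisitions;
}
```

In this model a device initialised with MODEL_F_LLTX never touches the lock, while a device without the flag pays one acquisition per packet. That is the overhead the patch avoids for dummy-like devices.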


* Re: [PATCH net-next] packet: respect devices with LLTX flag in direct xmit
  2014-03-27 15:38 [PATCH net-next] packet: respect devices with LLTX flag in direct xmit Daniel Borkmann
@ 2014-03-27 16:01 ` Eric Dumazet
  2014-03-27 16:04   ` Eric Dumazet
  2014-03-28 20:50 ` David Miller
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2014-03-27 16:01 UTC
  To: Daniel Borkmann; +Cc: davem, netdev, Jesper Dangaard Brouer

On Thu, 2014-03-27 at 16:38 +0100, Daniel Borkmann wrote:
> Quite often it can be useful to test with dummy or similar
> devices as a blackhole sink for skbs. Such devices are only
> equipped with a single txq, but marked as NETIF_F_LLTX as
> they do not require locking their internal queues on xmit
> (or implement locking themselves). Therefore, rather use
> HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected.
> 
> trafgen mmap/TX_RING example against dummy device with config
> foo: { fill(0xff, 64) } results in the following performance
> improvements for such scenarios on an ordinary Core i7/2.80GHz:
> 
> Before:
> 
>  Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):
> 
>    160,975,944,159 instructions:k            #    0.55  insns per cycle          ( +-  0.09% )
>    293,319,390,278 cycles:k                  #    0.000 GHz                      ( +-  0.35% )
>        192,501,104 branch-misses:k                                               ( +-  1.63% )
>                831 context-switches:k                                            ( +-  9.18% )
>                  7 cpu-migrations:k                                              ( +-  7.40% )
>             69,382 cache-misses:k            #    0.010 % of all cache refs      ( +-  2.18% )
>        671,552,021 cache-references:k                                            ( +-  1.29% )
> 
>       22.856401569 seconds time elapsed                                          ( +-  0.33% )
> 
> After:
> 
>  Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):
> 
>    133,788,739,692 instructions:k            #    0.92  insns per cycle          ( +-  0.06% )
>    145,853,213,256 cycles:k                  #    0.000 GHz                      ( +-  0.17% )
>         59,867,100 branch-misses:k                                               ( +-  4.72% )
>                384 context-switches:k                                            ( +-  3.76% )
>                  6 cpu-migrations:k                                              ( +-  6.28% )
>             70,304 cache-misses:k            #    0.077 % of all cache refs      ( +-  1.73% )
>         90,879,408 cache-references:k                                            ( +-  1.35% )
> 
>       11.719372413 seconds time elapsed                                          ( +-  0.24% )
> 
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>  net/packet/af_packet.c | 40 ++++++++++++++++++++--------------------
>  1 file changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 097a354..01039d2 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -243,40 +243,40 @@ static int packet_direct_xmit(struct sk_buff *skb)
>  	const struct net_device_ops *ops = dev->netdev_ops;
>  	netdev_features_t features;
>  	struct netdev_queue *txq;
> +	int ret = NETDEV_TX_BUSY;
>  	u16 queue_map;
> -	int ret;
>  
>  	if (unlikely(!netif_running(dev) ||
> -		     !netif_carrier_ok(dev))) {
> -		kfree_skb(skb);
> -		return NET_XMIT_DROP;
> -	}
> +		     !netif_carrier_ok(dev)))
> +		goto drop;
>  
>  	features = netif_skb_features(skb);
>  	if (skb_needs_linearize(skb, features) &&
> -	    __skb_linearize(skb)) {
> -		kfree_skb(skb);
> -		return NET_XMIT_DROP;
> -	}
> +	    __skb_linearize(skb))
> +		goto drop;
>  
>  	queue_map = skb_get_queue_mapping(skb);
>  	txq = netdev_get_tx_queue(dev, queue_map);
>  
> -	__netif_tx_lock_bh(txq);
> -	if (unlikely(netif_xmit_frozen_or_stopped(txq))) {
> -		ret = NETDEV_TX_BUSY;
> -		kfree_skb(skb);
> -		goto out;
> +	local_bh_disable();
> +
> +	HARD_TX_LOCK(dev, txq, smp_processor_id());
> +	if (!netif_xmit_frozen_or_stopped(txq)) {
> +		ret = ops->ndo_start_xmit(skb, dev);
> +		if (ret == NETDEV_TX_OK)
> +			txq_trans_update(txq);
>  	}
> +	HARD_TX_UNLOCK(dev, txq);
>  
> -	ret = ops->ndo_start_xmit(skb, dev);
> -	if (likely(dev_xmit_complete(ret)))
> -		txq_trans_update(txq);

I think this is problematic.

If you have concurrent traffic going through txq, and a flood
going through packet_direct_xmit(), the timeout will trigger on txq
because we no longer update txq->trans_start

So I think you should force the update

	txq->trans_start = jiffies;


> -	else
> +	local_bh_enable();
> +
> +	if (!dev_xmit_complete(ret))
>  		kfree_skb(skb);
> -out:
> -	__netif_tx_unlock_bh(txq);
> +
>  	return ret;
> +drop:
> +	kfree_skb(skb);


Yep, another spot we likely need to 

	atomic_long_inc(&dev->tx_dropped); 

;)

> +	return NET_XMIT_DROP;
>  }
>  


* Re: [PATCH net-next] packet: respect devices with LLTX flag in direct xmit
  2014-03-27 16:01 ` Eric Dumazet
@ 2014-03-27 16:04   ` Eric Dumazet
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2014-03-27 16:04 UTC
  To: Daniel Borkmann; +Cc: davem, netdev, Jesper Dangaard Brouer

On Thu, 2014-03-27 at 09:01 -0700, Eric Dumazet wrote:

> > +	HARD_TX_LOCK(dev, txq, smp_processor_id());
> > +	if (!netif_xmit_frozen_or_stopped(txq)) {
> > +		ret = ops->ndo_start_xmit(skb, dev);
> > +		if (ret == NETDEV_TX_OK)
> > +			txq_trans_update(txq);


> I think this is problematic.
> 
> If you have concurrent traffic going through txq, and a flood
> going through packet_direct_xmit(), the timeout will trigger on txq
> because we no longer update txq->trans_start
> 
> So I think you should force the update
> 
> 	txq->trans_start = jiffies;

Oh well, scratch that, I missed that you did the update.
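
As background for the concern raised and then withdrawn above, here is a rough userspace sketch of why txq->trans_start matters: a TX watchdog of this kind treats a stopped queue as hung once the current time is past trans_start plus a timeout. The model_* names are hypothetical, and the check is a simplified assumption about the watchdog logic, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Wraparound-safe "a is after b", the same idea as the kernel's time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

struct model_txq {
	bool stopped;              /* queue currently stopped */
	unsigned long trans_start; /* time of the last successful transmit */
};

/* A stopped queue counts as timed out once 'now' is past trans_start + timeo. */
static bool model_watchdog_expired(const struct model_txq *q,
				   unsigned long now, unsigned long timeo)
{
	assert(q != NULL);
	return q->stopped && model_time_after(now, q->trans_start + timeo);
}

/* Models the effect of updating trans_start after a successful xmit. */
static void model_trans_update(struct model_txq *q, unsigned long now)
{
	assert(q != NULL);
	q->trans_start = now;
}
```

If a transmit path never refreshed trans_start, the model would report a timeout on a stopped queue even while packets were still being sent. The patch as posted keeps the txq_trans_update() call on NETDEV_TX_OK.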


* Re: [PATCH net-next] packet: respect devices with LLTX flag in direct xmit
  2014-03-27 15:38 [PATCH net-next] packet: respect devices with LLTX flag in direct xmit Daniel Borkmann
  2014-03-27 16:01 ` Eric Dumazet
@ 2014-03-28 20:50 ` David Miller
  2014-03-28 21:13   ` Daniel Borkmann
  1 sibling, 1 reply; 7+ messages in thread
From: David Miller @ 2014-03-28 20:50 UTC
  To: dborkman; +Cc: netdev, brouer

From: Daniel Borkmann <dborkman@redhat.com>
Date: Thu, 27 Mar 2014 16:38:30 +0100

> Quite often it can be useful to test with dummy or similar
> devices as a blackhole sink for skbs. Such devices are only
> equipped with a single txq, but marked as NETIF_F_LLTX as
> they do not require locking their internal queues on xmit
> (or implement locking themselves). Therefore, rather use
> HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected.
> 
> trafgen mmap/TX_RING example against dummy device with config
> foo: { fill(0xff, 64) } results in the following performance
> improvements for such scenarios on an ordinary Core i7/2.80GHz:
 ...
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Cc: Jesper Dangaard Brouer <brouer@redhat.com>

Applied, nice work.

The only remaining issue is that Eric seems to suggest that a tx
dropped counter bump should be added.  Please submit that as a
follow-on if my reading of his feedback is accurate.


* Re: [PATCH net-next] packet: respect devices with LLTX flag in direct xmit
  2014-03-28 20:50 ` David Miller
@ 2014-03-28 21:13   ` Daniel Borkmann
  2014-03-29 15:02     ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2014-03-28 21:13 UTC
  To: David Miller; +Cc: netdev, brouer

On 03/28/2014 09:50 PM, David Miller wrote:
> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Thu, 27 Mar 2014 16:38:30 +0100
>
>> Quite often it can be useful to test with dummy or similar
>> devices as a blackhole sink for skbs. Such devices are only
>> equipped with a single txq, but marked as NETIF_F_LLTX as
>> they do not require locking their internal queues on xmit
>> (or implement locking themselves). Therefore, rather use
>> HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected.
>>
>> trafgen mmap/TX_RING example against dummy device with config
>> foo: { fill(0xff, 64) } results in the following performance
>> improvements for such scenarios on an ordinary Core i7/2.80GHz:
>   ...
>> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
>> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
>
> Applied, nice work.
>
> The only remaining issue is that Eric seems to suggest that a tx
> dropped counter bump should be added.  Please submit that as a
> follow-on if my reading of his feedback is accurate.

Yes, will do next week. I think at time of submission we didn't
have that yet. :)


* Re: [PATCH net-next] packet: respect devices with LLTX flag in direct xmit
  2014-03-28 21:13   ` Daniel Borkmann
@ 2014-03-29 15:02     ` Eric Dumazet
  2014-03-30 21:52       ` Daniel Borkmann
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2014-03-29 15:02 UTC
  To: Daniel Borkmann; +Cc: David Miller, netdev, brouer

On Fri, 2014-03-28 at 22:13 +0100, Daniel Borkmann wrote:
> On 03/28/2014 09:50 PM, David Miller wrote:

> > The only remaining issue is that Eric seems to suggest that a tx
> > dropped counter bump should be added.  Please submit that as a
> > follow-on if my reading of his feedback is accurate.
> 
> Yes, will do next week. I think at time of submission we didn't
> have that yet. :)

BTW, how does direct xmit perform with a BQL-enabled driver?

dummy is nice, but not really representative of a real device ;)

netif_xmit_frozen_or_stopped() tests (dev_queue->state &
QUEUE_STATE_ANY_XOFF_OR_FROZEN), so it will not fill the TX ring.

Since you have no queue, I would rather use a test without the
BQL bit (1 << __QUEUE_STATE_STACK_XOFF), so that you can absorb bursts
into the TX ring ...

 include/linux/netdevice.h |   21 ++++++++++++++++-----
 net/packet/af_packet.c    |    2 +-
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 159c7e7945f8..ed71a2cc1f66 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -526,11 +526,15 @@ enum netdev_queue_state_t {
 	__QUEUE_STATE_DRV_XOFF,
 	__QUEUE_STATE_STACK_XOFF,
 	__QUEUE_STATE_FROZEN,
-#define QUEUE_STATE_ANY_XOFF ((1 << __QUEUE_STATE_DRV_XOFF)		| \
-			      (1 << __QUEUE_STATE_STACK_XOFF))
-#define QUEUE_STATE_ANY_XOFF_OR_FROZEN (QUEUE_STATE_ANY_XOFF		| \
-					(1 << __QUEUE_STATE_FROZEN))
 };
+#define QUEUE_STATE_DRV_XOFF	(1 << __QUEUE_STATE_DRV_XOFF)
+#define QUEUE_STATE_STACK_XOFF	(1 << __QUEUE_STATE_STACK_XOFF)
+#define QUEUE_STATE_FROZEN	(1 << __QUEUE_STATE_FROZEN)
+
+#define QUEUE_STATE_ANY_XOFF	(QUEUE_STATE_DRV_XOFF | QUEUE_STATE_STACK_XOFF)
+#define QUEUE_STATE_ANY_XOFF_OR_FROZEN (QUEUE_STATE_ANY_XOFF | \
+					QUEUE_STATE_FROZEN)
+
 /*
  * __QUEUE_STATE_DRV_XOFF is used by drivers to stop the transmit queue.  The
  * netif_tx_* functions below are used to manipulate this flag.  The
@@ -2262,11 +2266,18 @@ static inline bool netif_xmit_stopped(const struct netdev_queue *dev_queue)
 	return dev_queue->state & QUEUE_STATE_ANY_XOFF;
 }
 
-static inline bool netif_xmit_frozen_or_stopped(const struct netdev_queue *dev_queue)
+static inline bool
+netif_xmit_frozen_or_stopped(const struct netdev_queue *dev_queue)
 {
 	return dev_queue->state & QUEUE_STATE_ANY_XOFF_OR_FROZEN;
 }
 
+static inline bool
+netif_xmit_frozen_or_drv_stopped(const struct netdev_queue *dev_queue)
+{
+	return dev_queue->state & (QUEUE_STATE_FROZEN | QUEUE_STATE_DRV_XOFF);
+}
+
 static inline void netdev_tx_sent_queue(struct netdev_queue *dev_queue,
 					unsigned int bytes)
 {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 01039d2b1695..c8d50690b619 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -261,7 +261,7 @@ static int packet_direct_xmit(struct sk_buff *skb)
 	local_bh_disable();
 
 	HARD_TX_LOCK(dev, txq, smp_processor_id());
-	if (!netif_xmit_frozen_or_stopped(txq)) {
+	if (!netif_xmit_frozen_or_drv_stopped(txq)) {
 		ret = ops->ndo_start_xmit(skb, dev);
 		if (ret == NETDEV_TX_OK)
 			txq_trans_update(txq);
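
To illustrate the difference between the two predicates in the diff above, here is a small userspace sketch of the queue-state bitmask. The bit order follows the enum shown in the hunk; the model_* and MODEL_* names are hypothetical, and the struct is a simplified stand-in for struct netdev_queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit positions, in the order the enum in the hunk lists them. */
enum model_queue_state_bit {
	MODEL_STATE_DRV_XOFF,   /* driver stopped the queue (TX ring full) */
	MODEL_STATE_STACK_XOFF, /* stack stopped the queue (e.g. BQL limit) */
	MODEL_STATE_FROZEN,
};

#define MODEL_DRV_XOFF   (1UL << MODEL_STATE_DRV_XOFF)
#define MODEL_STACK_XOFF (1UL << MODEL_STATE_STACK_XOFF)
#define MODEL_FROZEN     (1UL << MODEL_STATE_FROZEN)

#define MODEL_ANY_XOFF           (MODEL_DRV_XOFF | MODEL_STACK_XOFF)
#define MODEL_ANY_XOFF_OR_FROZEN (MODEL_ANY_XOFF | MODEL_FROZEN)

struct model_queue {
	unsigned long state;
};

/* Models netif_xmit_frozen_or_stopped(): any XOFF bit or FROZEN blocks xmit. */
static bool model_frozen_or_stopped(const struct model_queue *q)
{
	assert(q != NULL);
	return (q->state & MODEL_ANY_XOFF_OR_FROZEN) != 0;
}

/* Models the proposed netif_xmit_frozen_or_drv_stopped(): ignores STACK_XOFF. */
static bool model_frozen_or_drv_stopped(const struct model_queue *q)
{
	assert(q != NULL);
	return (q->state & (MODEL_FROZEN | MODEL_DRV_XOFF)) != 0;
}
```

With only MODEL_STACK_XOFF set, the first predicate reports the queue as blocked while the second still allows transmission. That is the case where direct xmit could keep filling the TX ring after the BQL limit has been reached.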


* Re: [PATCH net-next] packet: respect devices with LLTX flag in direct xmit
  2014-03-29 15:02     ` Eric Dumazet
@ 2014-03-30 21:52       ` Daniel Borkmann
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-03-30 21:52 UTC
  To: Eric Dumazet; +Cc: David Miller, netdev, brouer

On 03/29/2014 04:02 PM, Eric Dumazet wrote:
> On Fri, 2014-03-28 at 22:13 +0100, Daniel Borkmann wrote:
>> On 03/28/2014 09:50 PM, David Miller wrote:
>
>>> The only remaining issue is that Eric seems to suggest that a tx
>>> dropped counter bump should be added.  Please submit that as a
>>> follow-on if my reading of his feedback is accurate.
>>
>> Yes, will do next week. I think at time of submission we didn't
>> have that yet. :)
>
> BTW, how does direct xmit perform with a BQL-enabled driver?
>
> dummy is nice, but not really representative of a real device ;)
>
> netif_xmit_frozen_or_stopped() tests (dev_queue->state &
> QUEUE_STATE_ANY_XOFF_OR_FROZEN), so it will not fill the TX ring.
>
> Since you have no queue, I would rather use a test without the
> BQL bit (1 << __QUEUE_STATE_STACK_XOFF), so that you can absorb bursts
> into the TX ring ...

Thanks Eric, I'll look into it next Tuesday [I was traveling the
whole weekend and will be on Monday as well].

