* [PATCH net] xen-netback: Fix slot estimation
@ 2014-06-03 13:32 Zoltan Kiss
2014-06-03 13:37 ` Paul Durrant
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Zoltan Kiss @ 2014-06-03 13:32 UTC (permalink / raw)
To: xen-devel, ian.campbell, wei.liu2, paul.durrant, linux
Cc: netdev, david.vrabel, davem, Zoltan Kiss
A recent commit (a02eb4 "xen-netback: worse-case estimate in xenvif_rx_action is
underestimating") capped the slot estimate at MAX_SKB_FRAGS, but that triggers
the BUG_ON a few lines further down, as the packet then consumes more slots than
estimated.
This patch removes that cap. If the frontend doesn't provide enough slots, the
skb is put back at the top of the queue and rx_last_skb_slots is capped; if the
next try also fails, the packet is dropped.
Capping rx_last_skb_slots is needed because if the frontend never gives enough
slots, the ring would stall.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
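In outline, the behaviour described above looks roughly like this (a condensed
sketch for orientation only, reusing the names from the diff below; the hunks
that follow are the actual change):

	if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
		/* Second failed attempt for an skb needing more than
		 * MAX_SKB_FRAGS slots: give up and drop it.
		 */
		if (max_slots_needed > MAX_SKB_FRAGS &&
		    vif->rx_last_skb_slots == MAX_SKB_FRAGS) {
			kfree_skb(skb);
			vif->rx_last_skb_slots = 0;
			continue;
		}
		/* First failure: requeue at the head and wait, but cap the
		 * recorded slot count so rx_work_todo does not spin if the
		 * guest never posts enough slots.
		 */
		skb_queue_head(&vif->rx_queue, skb);
		need_to_notify = true;
		vif->rx_last_skb_slots =
			max_slots_needed > MAX_SKB_FRAGS ?
			MAX_SKB_FRAGS : max_slots_needed;
		break;
	} else
		vif->rx_last_skb_slots = 0;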
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index da85ffb..7164157 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -600,13 +600,6 @@ static void xenvif_rx_action(struct xenvif *vif)
PAGE_SIZE);
}
- /* To avoid the estimate becoming too pessimal for some
- * frontends that limit posted rx requests, cap the estimate
- * at MAX_SKB_FRAGS.
- */
- if (max_slots_needed > MAX_SKB_FRAGS)
- max_slots_needed = MAX_SKB_FRAGS;
-
/* We may need one more slot for GSO metadata */
if (skb_is_gso(skb) &&
(skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4 ||
@@ -615,9 +608,27 @@ static void xenvif_rx_action(struct xenvif *vif)
/* If the skb may not fit then bail out now */
if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
+ /* If the skb needs more than MAX_SKB_FRAGS slots, it
+ * can happen that the frontend never gives us enough.
+ * To avoid spinning on that packet, first we put it back
+ * to the top of the queue, but if the next try fails,
+ * we drop it.
+ */
+ if (max_slots_needed > MAX_SKB_FRAGS &&
+ vif->rx_last_skb_slots == MAX_SKB_FRAGS) {
+ kfree_skb(skb);
+ vif->rx_last_skb_slots = 0;
+ continue;
+ }
skb_queue_head(&vif->rx_queue, skb);
need_to_notify = true;
- vif->rx_last_skb_slots = max_slots_needed;
+ /* Cap this, otherwise if the guest never gives us
+ * enough slots, rx_work_todo will spin
+ */
+ vif->rx_last_skb_slots =
+ max_slots_needed > MAX_SKB_FRAGS ?
+ MAX_SKB_FRAGS :
+ max_slots_needed;
break;
} else
vif->rx_last_skb_slots = 0;
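As an aside, the cap at the end of the second hunk could equivalently be written
with the kernel's min_t() helper (a sketch only; it assumes max_slots_needed is
a plain int in this function):

	vif->rx_last_skb_slots = min_t(int, max_slots_needed, MAX_SKB_FRAGS);

The ternary in the patch has the same effect; min_t() is just more compact.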
^ permalink raw reply related [flat|nested] 8+ messages in thread
* RE: [PATCH net] xen-netback: Fix slot estimation
2014-06-03 13:32 [PATCH net] xen-netback: Fix slot estimation Zoltan Kiss
@ 2014-06-03 13:37 ` Paul Durrant
2014-06-03 14:04 ` Zoltan Kiss
2014-06-03 13:52 ` David Laight
2014-06-05 22:02 ` David Miller
2 siblings, 1 reply; 8+ messages in thread
From: Paul Durrant @ 2014-06-03 13:37 UTC (permalink / raw)
To: Zoltan Kiss, xen-devel@lists.xenproject.org, Ian Campbell,
Wei Liu, linux@eikelenboom.it
Cc: netdev@vger.kernel.org, David Vrabel, davem@davemloft.net
> -----Original Message-----
> From: Zoltan Kiss
> Sent: 03 June 2014 14:32
> To: xen-devel@lists.xenproject.org; Ian Campbell; Wei Liu; Paul Durrant;
> linux@eikelenboom.it
> Cc: netdev@vger.kernel.org; David Vrabel; davem@davemloft.net; Zoltan
> Kiss
> Subject: [PATCH net] xen-netback: Fix slot estimation
>
> A recent commit (a02eb4 "xen-netback: worse-case estimate in
> xenvif_rx_action is
> underestimating") capped the slot estimation to MAX_SKB_FRAGS, but that
> triggers
> the next BUG_ON a few lines down, as the packet consumes more slots than
> estimated.
> This patch remove that cap, and if the frontend doesn't provide enough slot,
> put back the skb to the top of the queue and caps rx_last_skb_slots. When
> the
> next try also fails, it drops the packet.
> Capping rx_last_skb_slots is needed because if the frontend never gives
> enough
> slots, the ring gets stalled.
>
> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> Cc: Paul Durrant <paul.durrant@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
>
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> netback/netback.c
> index da85ffb..7164157 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -600,13 +600,6 @@ static void xenvif_rx_action(struct xenvif *vif)
> PAGE_SIZE);
> }
>
> - /* To avoid the estimate becoming too pessimal for some
> - * frontends that limit posted rx requests, cap the estimate
> - * at MAX_SKB_FRAGS.
> - */
> - if (max_slots_needed > MAX_SKB_FRAGS)
> - max_slots_needed = MAX_SKB_FRAGS;
> -
> /* We may need one more slot for GSO metadata */
> if (skb_is_gso(skb) &&
> (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4 ||
> @@ -615,9 +608,27 @@ static void xenvif_rx_action(struct xenvif *vif)
>
> /* If the skb may not fit then bail out now */
> if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
> + /* If the skb needs more than MAX_SKB_FRAGS slots, it
> + * can happen that the frontend never gives us enough.
> + * To avoid spining on that packet, first we put it back
> + * to the top of the queue, but if the next try fail,
> + * we drop it.
> + */
> + if (max_slots_needed > MAX_SKB_FRAGS &&
> + vif->rx_last_skb_slots == MAX_SKB_FRAGS) {
Isn't it sufficient to say:
if (vif->rx_last_skb_slots != 0)
here? We should not ordinarily wake before the requisite number of slots is available.
Paul
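For illustration, the simplification suggested above would reduce the bail-out
path to something like this (a sketch only; it relies on the premise, stated
above, that the thread is not woken until the previously recorded number of
slots is available):

	if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
		/* A non-zero rx_last_skb_slots means we already requeued
		 * this skb once and have been woken again, yet it still
		 * does not fit: drop it.
		 */
		if (vif->rx_last_skb_slots != 0) {
			kfree_skb(skb);
			vif->rx_last_skb_slots = 0;
			continue;
		}
		/* First failure: requeue and cap, exactly as in the patch. */
		skb_queue_head(&vif->rx_queue, skb);
		need_to_notify = true;
		vif->rx_last_skb_slots =
			max_slots_needed > MAX_SKB_FRAGS ?
			MAX_SKB_FRAGS : max_slots_needed;
		break;
	} else
		vif->rx_last_skb_slots = 0;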
> + kfree_skb(skb);
> + vif->rx_last_skb_slots = 0;
> + continue;
> + }
> skb_queue_head(&vif->rx_queue, skb);
> need_to_notify = true;
> - vif->rx_last_skb_slots = max_slots_needed;
> + /* Cap this otherwise if the guest never gives us
> + * enough slot, rx_work_todo will spin
> + */
> + vif->rx_last_skb_slots =
> + max_slots_needed > MAX_SKB_FRAGS ?
> + MAX_SKB_FRAGS :
> + max_slots_needed;
> break;
> } else
> vif->rx_last_skb_slots = 0;
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH net] xen-netback: Fix slot estimation
2014-06-03 13:32 [PATCH net] xen-netback: Fix slot estimation Zoltan Kiss
2014-06-03 13:37 ` Paul Durrant
@ 2014-06-03 13:52 ` David Laight
2014-06-03 20:24 ` Zoltan Kiss
2014-06-05 22:02 ` David Miller
2 siblings, 1 reply; 8+ messages in thread
From: David Laight @ 2014-06-03 13:52 UTC (permalink / raw)
To: 'Zoltan Kiss', xen-devel@lists.xenproject.org,
ian.campbell@citrix.com, wei.liu2@citrix.com,
paul.durrant@citrix.com, linux@eikelenboom.it
Cc: netdev@vger.kernel.org, david.vrabel@citrix.com,
davem@davemloft.net
From: netdev-owner@vger.kernel.org
> A recent commit (a02eb4 "xen-netback: worse-case estimate in xenvif_rx_action is
> underestimating") capped the slot estimation to MAX_SKB_FRAGS, but that triggers
> the next BUG_ON a few lines down, as the packet consumes more slots than
> estimated.
> This patch remove that cap, and if the frontend doesn't provide enough slot,
> put back the skb to the top of the queue and caps rx_last_skb_slots. When the
> next try also fails, it drops the packet.
> Capping rx_last_skb_slots is needed because if the frontend never gives enough
> slots, the ring gets stalled.
>
> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> Cc: Paul Durrant <paul.durrant@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
>
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index da85ffb..7164157 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -600,13 +600,6 @@ static void xenvif_rx_action(struct xenvif *vif)
> PAGE_SIZE);
> }
>
> - /* To avoid the estimate becoming too pessimal for some
> - * frontends that limit posted rx requests, cap the estimate
> - * at MAX_SKB_FRAGS.
> - */
> - if (max_slots_needed > MAX_SKB_FRAGS)
> - max_slots_needed = MAX_SKB_FRAGS;
> -
> /* We may need one more slot for GSO metadata */
> if (skb_is_gso(skb) &&
> (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4 ||
> @@ -615,9 +608,27 @@ static void xenvif_rx_action(struct xenvif *vif)
>
> /* If the skb may not fit then bail out now */
> if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
> + /* If the skb needs more than MAX_SKB_FRAGS slots, it
> + * can happen that the frontend never gives us enough.
> + * To avoid spining on that packet, first we put it back
> + * to the top of the queue, but if the next try fail,
> + * we drop it.
> + */
> + if (max_slots_needed > MAX_SKB_FRAGS &&
> + vif->rx_last_skb_slots == MAX_SKB_FRAGS) {
> + kfree_skb(skb);
> + vif->rx_last_skb_slots = 0;
> + continue;
> + }
A silent discard here doesn't seem right at all.
While it stops the kernel crashing, or the entire interface locking
up, it is likely to leave one connection 'stuck' - a TCP retransmission
is likely to include the same fragments.
From a user's point of view this is almost as bad.
David
> skb_queue_head(&vif->rx_queue, skb);
> need_to_notify = true;
> - vif->rx_last_skb_slots = max_slots_needed;
> + /* Cap this otherwise if the guest never gives us
> + * enough slot, rx_work_todo will spin
> + */
> + vif->rx_last_skb_slots =
> + max_slots_needed > MAX_SKB_FRAGS ?
> + MAX_SKB_FRAGS :
> + max_slots_needed;
> break;
> } else
> vif->rx_last_skb_slots = 0;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] xen-netback: Fix slot estimation
2014-06-03 13:37 ` Paul Durrant
@ 2014-06-03 14:04 ` Zoltan Kiss
0 siblings, 0 replies; 8+ messages in thread
From: Zoltan Kiss @ 2014-06-03 14:04 UTC (permalink / raw)
To: Paul Durrant, xen-devel@lists.xenproject.org, Ian Campbell,
Wei Liu, linux@eikelenboom.it
Cc: netdev@vger.kernel.org, David Vrabel, davem@davemloft.net
On 03/06/14 14:37, Paul Durrant wrote:
>> -----Original Message-----
>> From: Zoltan Kiss
>> Sent: 03 June 2014 14:32
>> To: xen-devel@lists.xenproject.org; Ian Campbell; Wei Liu; Paul Durrant;
>> linux@eikelenboom.it
>> Cc: netdev@vger.kernel.org; David Vrabel; davem@davemloft.net; Zoltan
>> Kiss
>> Subject: [PATCH net] xen-netback: Fix slot estimation
>>
>> A recent commit (a02eb4 "xen-netback: worse-case estimate in
>> xenvif_rx_action is
>> underestimating") capped the slot estimation to MAX_SKB_FRAGS, but that
>> triggers
>> the next BUG_ON a few lines down, as the packet consumes more slots than
>> estimated.
>> This patch remove that cap, and if the frontend doesn't provide enough slot,
>> put back the skb to the top of the queue and caps rx_last_skb_slots. When
>> the
>> next try also fails, it drops the packet.
>> Capping rx_last_skb_slots is needed because if the frontend never gives
>> enough
>> slots, the ring gets stalled.
>>
>> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
>> Cc: Paul Durrant <paul.durrant@citrix.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Ian Campbell <ian.campbell@citrix.com>
>> Cc: David Vrabel <david.vrabel@citrix.com>
>>
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
>> netback/netback.c
>> index da85ffb..7164157 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -600,13 +600,6 @@ static void xenvif_rx_action(struct xenvif *vif)
>> PAGE_SIZE);
>> }
>>
>> - /* To avoid the estimate becoming too pessimal for some
>> - * frontends that limit posted rx requests, cap the estimate
>> - * at MAX_SKB_FRAGS.
>> - */
>> - if (max_slots_needed > MAX_SKB_FRAGS)
>> - max_slots_needed = MAX_SKB_FRAGS;
>> -
>> /* We may need one more slot for GSO metadata */
>> if (skb_is_gso(skb) &&
>> (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4 ||
>> @@ -615,9 +608,27 @@ static void xenvif_rx_action(struct xenvif *vif)
>>
>> /* If the skb may not fit then bail out now */
>> if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
>> + /* If the skb needs more than MAX_SKB_FRAGS slots, it
>> + * can happen that the frontend never gives us enough.
>> + * To avoid spining on that packet, first we put it back
>> + * to the top of the queue, but if the next try fail,
>> + * we drop it.
>> + */
>> + if (max_slots_needed > MAX_SKB_FRAGS &&
>> + vif->rx_last_skb_slots == MAX_SKB_FRAGS) {
>
> Isn't it sufficient to say:
>
> if (vif->rx_last_skb_slots != 0)
>
> here? We should not ordinarily wake before the requisite number of slots is available.
Yep, that would be enough
>
> Paul
>
>> + kfree_skb(skb);
>> + vif->rx_last_skb_slots = 0;
>> + continue;
>> + }
>> skb_queue_head(&vif->rx_queue, skb);
>> need_to_notify = true;
>> - vif->rx_last_skb_slots = max_slots_needed;
>> + /* Cap this otherwise if the guest never gives us
>> + * enough slot, rx_work_todo will spin
>> + */
>> + vif->rx_last_skb_slots =
>> + max_slots_needed > MAX_SKB_FRAGS ?
>> + MAX_SKB_FRAGS :
>> + max_slots_needed;
>> break;
>> } else
>> vif->rx_last_skb_slots = 0;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] xen-netback: Fix slot estimation
2014-06-03 13:52 ` David Laight
@ 2014-06-03 20:24 ` Zoltan Kiss
0 siblings, 0 replies; 8+ messages in thread
From: Zoltan Kiss @ 2014-06-03 20:24 UTC (permalink / raw)
To: David Laight, xen-devel@lists.xenproject.org,
ian.campbell@citrix.com, wei.liu2@citrix.com,
paul.durrant@citrix.com, linux@eikelenboom.it
Cc: netdev@vger.kernel.org, david.vrabel@citrix.com,
davem@davemloft.net
On 03/06/14 14:52, David Laight wrote:
> From: netdev-owner@vger.kernel.org
>> @@ -615,9 +608,27 @@ static void xenvif_rx_action(struct xenvif *vif)
>>
>> /* If the skb may not fit then bail out now */
>> if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
>> + /* If the skb needs more than MAX_SKB_FRAGS slots, it
>> + * can happen that the frontend never gives us enough.
>> + * To avoid spining on that packet, first we put it back
>> + * to the top of the queue, but if the next try fail,
>> + * we drop it.
>> + */
>> + if (max_slots_needed > MAX_SKB_FRAGS &&
>> + vif->rx_last_skb_slots == MAX_SKB_FRAGS) {
>> + kfree_skb(skb);
>> + vif->rx_last_skb_slots = 0;
>> + continue;
>> + }
>
> A silent discard here doesn't seem right at all.
> While it stops the kernel crashing, or the entire interface locking
> up; it is likely to leave one connection 'stuck' - a TCP retransmission
> is likely to include the same fragments.
> From a user point of view this as almost as bad.
Yes, we have been aware of this problem for a while. However, I have an idea
for solving it in a way that doesn't lose performance and still lets these
packets pass through. See my patch called "Fix handling of skbs requiring too
many slots".
Zoli
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] xen-netback: Fix slot estimation
2014-06-03 13:32 [PATCH net] xen-netback: Fix slot estimation Zoltan Kiss
2014-06-03 13:37 ` Paul Durrant
2014-06-03 13:52 ` David Laight
@ 2014-06-05 22:02 ` David Miller
2014-06-06 10:20 ` Zoltan Kiss
2 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2014-06-05 22:02 UTC (permalink / raw)
To: zoltan.kiss
Cc: xen-devel, ian.campbell, wei.liu2, paul.durrant, linux, netdev,
david.vrabel
From: Zoltan Kiss <zoltan.kiss@citrix.com>
Date: Tue, 3 Jun 2014 14:32:16 +0100
> A recent commit (a02eb4 "xen-netback: worse-case estimate in xenvif_rx_action is
> underestimating") capped the slot estimation to MAX_SKB_FRAGS, but that triggers
> the next BUG_ON a few lines down, as the packet consumes more slots than
> estimated.
> This patch remove that cap, and if the frontend doesn't provide enough slot,
> put back the skb to the top of the queue and caps rx_last_skb_slots. When the
> next try also fails, it drops the packet.
> Capping rx_last_skb_slots is needed because if the frontend never gives enough
> slots, the ring gets stalled.
>
> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Like David Laight, I do not like this patch at all.
Yes, a crash or a triggered BUG_ON is bad, but fixing it by deadlocking
TCP connections (a silent failure) is not an improvement.
I'm not applying this, sorry.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] xen-netback: Fix slot estimation
2014-06-05 22:02 ` David Miller
@ 2014-06-06 10:20 ` Zoltan Kiss
2014-06-06 20:06 ` David Miller
0 siblings, 1 reply; 8+ messages in thread
From: Zoltan Kiss @ 2014-06-06 10:20 UTC (permalink / raw)
To: David Miller
Cc: xen-devel, ian.campbell, wei.liu2, paul.durrant, linux, netdev,
david.vrabel
On 05/06/14 23:02, David Miller wrote:
> From: Zoltan Kiss<zoltan.kiss@citrix.com>
> Date: Tue, 3 Jun 2014 14:32:16 +0100
>
>> A recent commit (a02eb4 "xen-netback: worse-case estimate in xenvif_rx_action is
>> underestimating") capped the slot estimation to MAX_SKB_FRAGS, but that triggers
>> the next BUG_ON a few lines down, as the packet consumes more slots than
>> estimated.
>> This patch remove that cap, and if the frontend doesn't provide enough slot,
>> put back the skb to the top of the queue and caps rx_last_skb_slots. When the
>> next try also fails, it drops the packet.
>> Capping rx_last_skb_slots is needed because if the frontend never gives enough
>> slots, the ring gets stalled.
>>
>> Signed-off-by: Zoltan Kiss<zoltan.kiss@citrix.com>
> Like David Laight, I do not like this patch at all.
>
> Yes a crash or BUG_ON triggered is bad, but fixing it by deadlocking
> TCP connections (a silent failure) is not an improvement.
>
> I'm not applying this, sorry.
>
Hi,
I haven't written it explicitly, but my other patch "xen-netback: Fix
handling of skbs requiring too many slots" supersedes this one.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net] xen-netback: Fix slot estimation
2014-06-06 10:20 ` Zoltan Kiss
@ 2014-06-06 20:06 ` David Miller
0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2014-06-06 20:06 UTC (permalink / raw)
To: zoltan.kiss
Cc: xen-devel, ian.campbell, wei.liu2, paul.durrant, linux, netdev,
david.vrabel
From: Zoltan Kiss <zoltan.kiss@schaman.hu>
Date: Fri, 06 Jun 2014 11:20:45 +0100
> I haven't wrote it explicitly, but my other patch "xen-netback: Fix
> handling of skbs requiring too many slots" supersedes this one.
Right, I figured this out after reading these threads over again.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2014-06-06 20:06 UTC | newest]
Thread overview: 8+ messages
2014-06-03 13:32 [PATCH net] xen-netback: Fix slot estimation Zoltan Kiss
2014-06-03 13:37 ` Paul Durrant
2014-06-03 14:04 ` Zoltan Kiss
2014-06-03 13:52 ` David Laight
2014-06-03 20:24 ` Zoltan Kiss
2014-06-05 22:02 ` David Miller
2014-06-06 10:20 ` Zoltan Kiss
2014-06-06 20:06 ` David Miller