[PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available
@ 2014-03-20 19:32 Zoltan Kiss
  2014-03-20 19:32 ` [PATCH net-next] xen-netback: Follow-up patch for grant mapping series Zoltan Kiss
  2014-03-21  9:36 ` [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available Ian Campbell
  0 siblings, 2 replies; 8+ messages in thread
From: Zoltan Kiss @ 2014-03-20 19:32 UTC (permalink / raw)
  To: ian.campbell, wei.liu2, xen-devel
  Cc: paul.durrant, netdev, linux-kernel, jonathan.davies, Zoltan Kiss

Since the early days TX stops if there isn't enough free pending slots to
consume a maximum sized (slot-wise) packet. Probably the reason for that is to
avoid the case when we don't have enough free pending slot in the ring to finish
the packet. But if we make sure that the pending ring has the same size as the
shared ring, that shouldn't really happen. The frontend can only post packets
which fit the to the free space of the shared ring. If it doesn't, the frontend
has to stop, as it can only increase the req_prod when the whole packet fits
onto the ring.
This patch avoid using this checking, makes sure the 2 ring has the same size,
and remove a checking from the callback. As now we don't stop the NAPI instance
on this condition, we don't have to wake it up if we free pending slots up.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
---
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index bef37be..a800a8e 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -81,7 +81,7 @@ struct xenvif_rx_meta {
 
 #define MAX_BUFFER_OFFSET PAGE_SIZE
 
-#define MAX_PENDING_REQS 256
+#define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
 
 /* It's possible for an skb to have a maximal number of frags
  * but still be less than MAX_BUFFER_OFFSET in size. Thus the
@@ -253,12 +253,6 @@ static inline pending_ring_idx_t nr_pending_reqs(struct xenvif *vif)
 		vif->pending_prod + vif->pending_cons;
 }
 
-static inline bool xenvif_tx_pending_slots_available(struct xenvif *vif)
-{
-	return nr_pending_reqs(vif) + XEN_NETBK_LEGACY_SLOTS_MAX
-		< MAX_PENDING_REQS;
-}
-
 /* Callback from stack when TX packet can be released */
 void xenvif_zerocopy_callback(struct ubuf_info *ubuf, bool zerocopy_success);
 
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 83a71ac..7bb7734 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -88,8 +88,7 @@ static int xenvif_poll(struct napi_struct *napi, int budget)
 		local_irq_save(flags);
 
 		RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, more_to_do);
-		if (!(more_to_do &&
-		      xenvif_tx_pending_slots_available(vif)))
+		if (!more_to_do)
 			__napi_complete(napi);
 
 		local_irq_restore(flags);
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index bc94320..5df8d63 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1175,8 +1175,7 @@ static unsigned xenvif_tx_build_gops(struct xenvif *vif, int budget)
 	struct sk_buff *skb;
 	int ret;
 
-	while (xenvif_tx_pending_slots_available(vif) &&
-	       (skb_queue_len(&vif->tx_queue) < budget)) {
+	while (skb_queue_len(&vif->tx_queue) < budget) {
 		struct xen_netif_tx_request txreq;
 		struct xen_netif_tx_request txfrags[XEN_NETBK_LEGACY_SLOTS_MAX];
 		struct xen_netif_extra_info extras[XEN_NETIF_EXTRA_TYPE_MAX-1];
@@ -1516,13 +1515,6 @@ void xenvif_zerocopy_callback(struct ubuf_info *ubuf, bool zerocopy_success)
 	wake_up(&vif->dealloc_wq);
 	spin_unlock_irqrestore(&vif->callback_lock, flags);
 
-	if (RING_HAS_UNCONSUMED_REQUESTS(&vif->tx) &&
-	    xenvif_tx_pending_slots_available(vif)) {
-		local_bh_disable();
-		napi_schedule(&vif->napi);
-		local_bh_enable();
-	}
-
 	if (likely(zerocopy_success))
 		vif->tx_zerocopy_success++;
 	else
@@ -1714,8 +1706,7 @@ static inline int rx_work_todo(struct xenvif *vif)
 static inline int tx_work_todo(struct xenvif *vif)
 {
 
-	if (likely(RING_HAS_UNCONSUMED_REQUESTS(&vif->tx)) &&
-	    xenvif_tx_pending_slots_available(vif))
+	if (likely(RING_HAS_UNCONSUMED_REQUESTS(&vif->tx)))
 		return 1;
 
 	return 0;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next] xen-netback: Follow-up patch for grant mapping series
  2014-03-20 19:32 [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available Zoltan Kiss
@ 2014-03-20 19:32 ` Zoltan Kiss
  2014-03-21  9:41   ` Ian Campbell
  2014-03-21 18:13   ` David Miller
  2014-03-21  9:36 ` [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available Ian Campbell
  1 sibling, 2 replies; 8+ messages in thread
From: Zoltan Kiss @ 2014-03-20 19:32 UTC (permalink / raw)
  To: ian.campbell, wei.liu2, xen-devel
  Cc: paul.durrant, netdev, linux-kernel, jonathan.davies, Zoltan Kiss

Ian made some late comments about the grant mapping series, I incorporated the
outcomes into this patch. Additional comments, refactoring etc.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
---
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 6837bfc..fa19538 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -576,15 +576,15 @@ void xenvif_disconnect(struct xenvif *vif)
 void xenvif_free(struct xenvif *vif)
 {
 	int i, unmap_timeout = 0;
-	/* Here we want to avoid timeout messages if an skb can be legitimatly
-	 * stucked somewhere else. Realisticly this could be an another vif's
+	/* Here we want to avoid timeout messages if an skb can be legitimately
+	 * stuck somewhere else. Realistically this could be an another vif's
 	 * internal or QDisc queue. That another vif also has this
 	 * rx_drain_timeout_msecs timeout, but the timer only ditches the
 	 * internal queue. After that, the QDisc queue can put in worst case
 	 * XEN_NETIF_RX_RING_SIZE / MAX_SKB_FRAGS skbs into that another vif's
 	 * internal queue, so we need several rounds of such timeouts until we
 	 * can be sure that no another vif should have skb's from us. We are
-	 * not sending more skb's, so newly stucked packets are not interesting
+	 * not sending more skb's, so newly stuck packets are not interesting
 	 * for us here.
 	 */
 	unsigned int worst_case_skb_lifetime = (rx_drain_timeout_msecs/1000) *
@@ -599,6 +599,13 @@ void xenvif_free(struct xenvif *vif)
 				netdev_err(vif->dev,
 					   "Page still granted! Index: %x\n",
 					   i);
+			/* If there are still unmapped pages, reset the loop to
+			 * start mchecking again. We shouldn't exit here until
+			 * dealloc thread and NAPI instance release all the
+			 * pages. If a kernel bug cause the skbs stall
+			 * somewhere, the interface couldn't be brought down
+			 * properly.
+			 */
 			i = -1;
 		}
 	}
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 1c336f6..0e46184b 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -56,7 +56,7 @@ bool separate_tx_rx_irq = 1;
 module_param(separate_tx_rx_irq, bool, 0644);
 
 /* When guest ring is filled up, qdisc queues the packets for us, but we have
- * to timeout them, otherwise other guests' packets can get stucked there
+ * to timeout them, otherwise other guests' packets can get stuck there
  */
 unsigned int rx_drain_timeout_msecs = 10000;
 module_param(rx_drain_timeout_msecs, uint, 0444);
@@ -99,6 +99,9 @@ static inline unsigned long idx_to_kaddr(struct xenvif *vif,
 	return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx));
 }
 
+#define callback_param(vif, pending_idx) \
+	(vif->pending_tx_info[pending_idx].callback_struct)
+
 /* Find the containing VIF's structure from a pointer in pending_tx_info array
  */
 static inline struct xenvif* ubuf_to_vif(struct ubuf_info *ubuf)
@@ -1025,12 +1028,12 @@ static void xenvif_fill_frags(struct xenvif *vif, struct sk_buff *skb)
 		/* If this is not the first frag, chain it to the previous*/
 		if (unlikely(prev_pending_idx == INVALID_PENDING_IDX))
 			skb_shinfo(skb)->destructor_arg =
-				&vif->pending_tx_info[pending_idx].callback_struct;
+				&callback_param(vif, pending_idx);
 		else if (likely(pending_idx != prev_pending_idx))
-			vif->pending_tx_info[prev_pending_idx].callback_struct.ctx =
-				&(vif->pending_tx_info[pending_idx].callback_struct);
+			callback_param(vif, prev_pending_idx).ctx =
+				&callback_param(vif, pending_idx);
 
-		vif->pending_tx_info[pending_idx].callback_struct.ctx = NULL;
+		callback_param(vif, pending_idx).ctx = NULL;
 		prev_pending_idx = pending_idx;
 
 		txp = &vif->pending_tx_info[pending_idx].req;
@@ -1400,13 +1403,13 @@ static int xenvif_tx_submit(struct xenvif *vif)
 		memcpy(skb->data,
 		       (void *)(idx_to_kaddr(vif, pending_idx)|txp->offset),
 		       data_len);
-		vif->pending_tx_info[pending_idx].callback_struct.ctx = NULL;
+		callback_param(vif, pending_idx).ctx = NULL;
 		if (data_len < txp->size) {
 			/* Append the packet payload as a fragment. */
 			txp->offset += data_len;
 			txp->size -= data_len;
 			skb_shinfo(skb)->destructor_arg =
-				&vif->pending_tx_info[pending_idx].callback_struct;
+				&callback_param(vif, pending_idx);
 		} else {
 			/* Schedule a response immediately. */
 			xenvif_idx_unmap(vif, pending_idx);
@@ -1550,7 +1553,6 @@ static inline void xenvif_tx_dealloc_action(struct xenvif *vif)
 					    idx_to_kaddr(vif, pending_idx),
 					    GNTMAP_host_map,
 					    vif->grant_tx_handle[pending_idx]);
-			/* Btw. already unmapped? */
 			xenvif_grant_handle_reset(vif, pending_idx);
 			++gop;
 		}
@@ -1683,12 +1685,20 @@ void xenvif_idx_unmap(struct xenvif *vif, u16 pending_idx)
 			    idx_to_kaddr(vif, pending_idx),
 			    GNTMAP_host_map,
 			    vif->grant_tx_handle[pending_idx]);
-	/* Btw. already unmapped? */
 	xenvif_grant_handle_reset(vif, pending_idx);
 
 	ret = gnttab_unmap_refs(&tx_unmap_op, NULL,
 				&vif->mmap_pages[pending_idx], 1);
-	BUG_ON(ret);
+	if (ret) {
+		netdev_err(vif->dev,
+			   "Unmap fail: ret: %d pending_idx: %d host_addr: %llx handle: %x status: %d\n",
+			   ret,
+			   pending_idx,
+			   tx_unmap_op.host_addr,
+			   tx_unmap_op.handle,
+			   tx_unmap_op.status);
+		BUG();
+	}
 
 	xenvif_idx_release(vif, pending_idx, XEN_NETIF_RSP_OKAY);
 }

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] xen-netback: Follow-up patch for grant mapping series
  2014-03-20 19:32 ` [PATCH net-next] xen-netback: Follow-up patch for grant mapping series Zoltan Kiss
@ 2014-03-21  9:41   ` Ian Campbell
  2014-03-21 19:31     ` Zoltan Kiss
  2014-03-21 18:13   ` David Miller
  1 sibling, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2014-03-21  9:41 UTC (permalink / raw)
  To: Zoltan Kiss
  Cc: wei.liu2, xen-devel, paul.durrant, netdev, linux-kernel,
	jonathan.davies

On Thu, 2014-03-20 at 19:32 +0000, Zoltan Kiss wrote:
> Ian made some late comments about the grant mapping series, I incorporated the
> outcomes into this patch. Additional comments, refactoring etc.

I don't know how davem feels about merging these kinds of thing into one
patch but from my point of view the least thing to do is to enumerate
the cleanups -- so that people can verify that the patch does infact do
exactly what it says it does, no more, no less.

> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> ---
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index 6837bfc..fa19538 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -576,15 +576,15 @@ void xenvif_disconnect(struct xenvif *vif)
>  void xenvif_free(struct xenvif *vif)
>  {
>  	int i, unmap_timeout = 0;
> -	/* Here we want to avoid timeout messages if an skb can be legitimatly
> -	 * stucked somewhere else. Realisticly this could be an another vif's
> +	/* Here we want to avoid timeout messages if an skb can be legitimately
> +	 * stuck somewhere else. Realistically this could be an another vif's
>  	 * internal or QDisc queue. That another vif also has this
>  	 * rx_drain_timeout_msecs timeout, but the timer only ditches the
>  	 * internal queue. After that, the QDisc queue can put in worst case
>  	 * XEN_NETIF_RX_RING_SIZE / MAX_SKB_FRAGS skbs into that another vif's
>  	 * internal queue, so we need several rounds of such timeouts until we
>  	 * can be sure that no another vif should have skb's from us. We are
> -	 * not sending more skb's, so newly stucked packets are not interesting
> +	 * not sending more skb's, so newly stuck packets are not interesting
>  	 * for us here.
>  	 */
>  	unsigned int worst_case_skb_lifetime = (rx_drain_timeout_msecs/1000) *
> @@ -599,6 +599,13 @@ void xenvif_free(struct xenvif *vif)
>  				netdev_err(vif->dev,
>  					   "Page still granted! Index: %x\n",
>  					   i);
> +			/* If there are still unmapped pages, reset the loop to
> +			 * start mchecking again. We shouldn't exit here until

"checking"

> +			 * dealloc thread and NAPI instance release all the
> +			 * pages. If a kernel bug cause the skbs stall

"causes the skbs to stall"

> +			 * somewhere, the interface couldn't be brought down
> +			 * properly.

I'm not sure what you are saying here exactly. Perhaps you means
s/couldn't/cannot/?

> +			 */
>  			i = -1;
>  		}
>  	}
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 1c336f6..0e46184b 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -56,7 +56,7 @@ bool separate_tx_rx_irq = 1;
>  module_param(separate_tx_rx_irq, bool, 0644);
>  
>  /* When guest ring is filled up, qdisc queues the packets for us, but we have
> - * to timeout them, otherwise other guests' packets can get stucked there
> + * to timeout them, otherwise other guests' packets can get stuck there
>   */
>  unsigned int rx_drain_timeout_msecs = 10000;
>  module_param(rx_drain_timeout_msecs, uint, 0444);
> @@ -99,6 +99,9 @@ static inline unsigned long idx_to_kaddr(struct xenvif *vif,
>  	return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx));
>  }
>  
> +#define callback_param(vif, pending_idx) \
> +	(vif->pending_tx_info[pending_idx].callback_struct)
> +
>  /* Find the containing VIF's structure from a pointer in pending_tx_info array
>   */
>  static inline struct xenvif* ubuf_to_vif(struct ubuf_info *ubuf)
> @@ -1025,12 +1028,12 @@ static void xenvif_fill_frags(struct xenvif *vif, struct sk_buff *skb)
>  		/* If this is not the first frag, chain it to the previous*/
>  		if (unlikely(prev_pending_idx == INVALID_PENDING_IDX))
>  			skb_shinfo(skb)->destructor_arg =
> -				&vif->pending_tx_info[pending_idx].callback_struct;
> +				&callback_param(vif, pending_idx);
>  		else if (likely(pending_idx != prev_pending_idx))
> -			vif->pending_tx_info[prev_pending_idx].callback_struct.ctx =
> -				&(vif->pending_tx_info[pending_idx].callback_struct);
> +			callback_param(vif, prev_pending_idx).ctx =
> +				&callback_param(vif, pending_idx);
>  
> -		vif->pending_tx_info[pending_idx].callback_struct.ctx = NULL;
> +		callback_param(vif, pending_idx).ctx = NULL;
>  		prev_pending_idx = pending_idx;
>  
>  		txp = &vif->pending_tx_info[pending_idx].req;
> @@ -1400,13 +1403,13 @@ static int xenvif_tx_submit(struct xenvif *vif)
>  		memcpy(skb->data,
>  		       (void *)(idx_to_kaddr(vif, pending_idx)|txp->offset),
>  		       data_len);
> -		vif->pending_tx_info[pending_idx].callback_struct.ctx = NULL;
> +		callback_param(vif, pending_idx).ctx = NULL;
>  		if (data_len < txp->size) {
>  			/* Append the packet payload as a fragment. */
>  			txp->offset += data_len;
>  			txp->size -= data_len;
>  			skb_shinfo(skb)->destructor_arg =
> -				&vif->pending_tx_info[pending_idx].callback_struct;
> +				&callback_param(vif, pending_idx);
>  		} else {
>  			/* Schedule a response immediately. */
>  			xenvif_idx_unmap(vif, pending_idx);
> @@ -1550,7 +1553,6 @@ static inline void xenvif_tx_dealloc_action(struct xenvif *vif)
>  					    idx_to_kaddr(vif, pending_idx),
>  					    GNTMAP_host_map,
>  					    vif->grant_tx_handle[pending_idx]);
> -			/* Btw. already unmapped? */
>  			xenvif_grant_handle_reset(vif, pending_idx);
>  			++gop;
>  		}
> @@ -1683,12 +1685,20 @@ void xenvif_idx_unmap(struct xenvif *vif, u16 pending_idx)
>  			    idx_to_kaddr(vif, pending_idx),
>  			    GNTMAP_host_map,
>  			    vif->grant_tx_handle[pending_idx]);
> -	/* Btw. already unmapped? */
>  	xenvif_grant_handle_reset(vif, pending_idx);
>  
>  	ret = gnttab_unmap_refs(&tx_unmap_op, NULL,
>  				&vif->mmap_pages[pending_idx], 1);
> -	BUG_ON(ret);
> +	if (ret) {
> +		netdev_err(vif->dev,
> +			   "Unmap fail: ret: %d pending_idx: %d host_addr: %llx handle: %x status: %d\n",
> +			   ret,
> +			   pending_idx,
> +			   tx_unmap_op.host_addr,
> +			   tx_unmap_op.handle,
> +			   tx_unmap_op.status);
> +		BUG();
> +	}
>  
>  	xenvif_idx_release(vif, pending_idx, XEN_NETIF_RSP_OKAY);
>  }

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] xen-netback: Follow-up patch for grant mapping series
  2014-03-21  9:41   ` Ian Campbell
@ 2014-03-21 19:31     ` Zoltan Kiss
  2014-03-24 10:34       ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Zoltan Kiss @ 2014-03-21 19:31 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, xen-devel, paul.durrant, netdev, linux-kernel,
	jonathan.davies

On 21/03/14 09:41, Ian Campbell wrote:
> On Thu, 2014-03-20 at 19:32 +0000, Zoltan Kiss wrote:
>> Ian made some late comments about the grant mapping series, I incorporated the
>> outcomes into this patch. Additional comments, refactoring etc.
>
> I don't know how davem feels about merging these kinds of thing into one
> patch but from my point of view the least thing to do is to enumerate
> the cleanups -- so that people can verify that the patch does infact do
> exactly what it says it does, no more, no less.
Ok, I'll make a detailed list in the commit message about what are the 
changes, and fix up the typos you mentioned below.
>
>> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
>> ---
>> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
>> index 6837bfc..fa19538 100644
>> --- a/drivers/net/xen-netback/interface.c
>> +++ b/drivers/net/xen-netback/interface.c
>> @@ -576,15 +576,15 @@ void xenvif_disconnect(struct xenvif *vif)
>>   void xenvif_free(struct xenvif *vif)
>>   {
>>   	int i, unmap_timeout = 0;
>> -	/* Here we want to avoid timeout messages if an skb can be legitimatly
>> -	 * stucked somewhere else. Realisticly this could be an another vif's
>> +	/* Here we want to avoid timeout messages if an skb can be legitimately
>> +	 * stuck somewhere else. Realistically this could be an another vif's
>>   	 * internal or QDisc queue. That another vif also has this
>>   	 * rx_drain_timeout_msecs timeout, but the timer only ditches the
>>   	 * internal queue. After that, the QDisc queue can put in worst case
>>   	 * XEN_NETIF_RX_RING_SIZE / MAX_SKB_FRAGS skbs into that another vif's
>>   	 * internal queue, so we need several rounds of such timeouts until we
>>   	 * can be sure that no another vif should have skb's from us. We are
>> -	 * not sending more skb's, so newly stucked packets are not interesting
>> +	 * not sending more skb's, so newly stuck packets are not interesting
>>   	 * for us here.
>>   	 */
>>   	unsigned int worst_case_skb_lifetime = (rx_drain_timeout_msecs/1000) *
>> @@ -599,6 +599,13 @@ void xenvif_free(struct xenvif *vif)
>>   				netdev_err(vif->dev,
>>   					   "Page still granted! Index: %x\n",
>>   					   i);
>> +			/* If there are still unmapped pages, reset the loop to
>> +			 * start mchecking again. We shouldn't exit here until
>
> "checking"
>
>> +			 * dealloc thread and NAPI instance release all the
>> +			 * pages. If a kernel bug cause the skbs stall
>
> "causes the skbs to stall"
>
>> +			 * somewhere, the interface couldn't be brought down
>> +			 * properly.
>
> I'm not sure what you are saying here exactly. Perhaps you means
> s/couldn't/cannot/?
>
>> +			 */
>>   			i = -1;
>>   		}
>>   	}
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 1c336f6..0e46184b 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -56,7 +56,7 @@ bool separate_tx_rx_irq = 1;
>>   module_param(separate_tx_rx_irq, bool, 0644);
>>
>>   /* When guest ring is filled up, qdisc queues the packets for us, but we have
>> - * to timeout them, otherwise other guests' packets can get stucked there
>> + * to timeout them, otherwise other guests' packets can get stuck there
>>    */
>>   unsigned int rx_drain_timeout_msecs = 10000;
>>   module_param(rx_drain_timeout_msecs, uint, 0444);
>> @@ -99,6 +99,9 @@ static inline unsigned long idx_to_kaddr(struct xenvif *vif,
>>   	return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx));
>>   }
>>
>> +#define callback_param(vif, pending_idx) \
>> +	(vif->pending_tx_info[pending_idx].callback_struct)
>> +
>>   /* Find the containing VIF's structure from a pointer in pending_tx_info array
>>    */
>>   static inline struct xenvif* ubuf_to_vif(struct ubuf_info *ubuf)
>> @@ -1025,12 +1028,12 @@ static void xenvif_fill_frags(struct xenvif *vif, struct sk_buff *skb)
>>   		/* If this is not the first frag, chain it to the previous*/
>>   		if (unlikely(prev_pending_idx == INVALID_PENDING_IDX))
>>   			skb_shinfo(skb)->destructor_arg =
>> -				&vif->pending_tx_info[pending_idx].callback_struct;
>> +				&callback_param(vif, pending_idx);
>>   		else if (likely(pending_idx != prev_pending_idx))
>> -			vif->pending_tx_info[prev_pending_idx].callback_struct.ctx =
>> -				&(vif->pending_tx_info[pending_idx].callback_struct);
>> +			callback_param(vif, prev_pending_idx).ctx =
>> +				&callback_param(vif, pending_idx);
>>
>> -		vif->pending_tx_info[pending_idx].callback_struct.ctx = NULL;
>> +		callback_param(vif, pending_idx).ctx = NULL;
>>   		prev_pending_idx = pending_idx;
>>
>>   		txp = &vif->pending_tx_info[pending_idx].req;
>> @@ -1400,13 +1403,13 @@ static int xenvif_tx_submit(struct xenvif *vif)
>>   		memcpy(skb->data,
>>   		       (void *)(idx_to_kaddr(vif, pending_idx)|txp->offset),
>>   		       data_len);
>> -		vif->pending_tx_info[pending_idx].callback_struct.ctx = NULL;
>> +		callback_param(vif, pending_idx).ctx = NULL;
>>   		if (data_len < txp->size) {
>>   			/* Append the packet payload as a fragment. */
>>   			txp->offset += data_len;
>>   			txp->size -= data_len;
>>   			skb_shinfo(skb)->destructor_arg =
>> -				&vif->pending_tx_info[pending_idx].callback_struct;
>> +				&callback_param(vif, pending_idx);
>>   		} else {
>>   			/* Schedule a response immediately. */
>>   			xenvif_idx_unmap(vif, pending_idx);
>> @@ -1550,7 +1553,6 @@ static inline void xenvif_tx_dealloc_action(struct xenvif *vif)
>>   					    idx_to_kaddr(vif, pending_idx),
>>   					    GNTMAP_host_map,
>>   					    vif->grant_tx_handle[pending_idx]);
>> -			/* Btw. already unmapped? */
>>   			xenvif_grant_handle_reset(vif, pending_idx);
>>   			++gop;
>>   		}
>> @@ -1683,12 +1685,20 @@ void xenvif_idx_unmap(struct xenvif *vif, u16 pending_idx)
>>   			    idx_to_kaddr(vif, pending_idx),
>>   			    GNTMAP_host_map,
>>   			    vif->grant_tx_handle[pending_idx]);
>> -	/* Btw. already unmapped? */
>>   	xenvif_grant_handle_reset(vif, pending_idx);
>>
>>   	ret = gnttab_unmap_refs(&tx_unmap_op, NULL,
>>   				&vif->mmap_pages[pending_idx], 1);
>> -	BUG_ON(ret);
>> +	if (ret) {
>> +		netdev_err(vif->dev,
>> +			   "Unmap fail: ret: %d pending_idx: %d host_addr: %llx handle: %x status: %d\n",
>> +			   ret,
>> +			   pending_idx,
>> +			   tx_unmap_op.host_addr,
>> +			   tx_unmap_op.handle,
>> +			   tx_unmap_op.status);
>> +		BUG();
>> +	}
>>
>>   	xenvif_idx_release(vif, pending_idx, XEN_NETIF_RSP_OKAY);
>>   }
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] xen-netback: Follow-up patch for grant mapping series
  2014-03-21 19:31     ` Zoltan Kiss
@ 2014-03-24 10:34       ` Ian Campbell
  0 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2014-03-24 10:34 UTC (permalink / raw)
  To: Zoltan Kiss
  Cc: wei.liu2, xen-devel, paul.durrant, netdev, linux-kernel,
	jonathan.davies

On Fri, 2014-03-21 at 19:31 +0000, Zoltan Kiss wrote:
> On 21/03/14 09:41, Ian Campbell wrote:
> > On Thu, 2014-03-20 at 19:32 +0000, Zoltan Kiss wrote:
> >> Ian made some late comments about the grant mapping series, I incorporated the
> >> outcomes into this patch. Additional comments, refactoring etc.
> >
> > I don't know how davem feels about merging these kinds of thing into one
> > patch but from my point of view the least thing to do is to enumerate
> > the cleanups -- so that people can verify that the patch does infact do
> > exactly what it says it does, no more, no less.
> Ok, I'll make a detailed list in the commit message about what are the 
> changes, and fix up the typos you mentioned below.

Thanks. Also note Davem's request to split out any real changes from the
pure non-functional cleanups. 

Ian.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] xen-netback: Follow-up patch for grant mapping series
  2014-03-20 19:32 ` [PATCH net-next] xen-netback: Follow-up patch for grant mapping series Zoltan Kiss
  2014-03-21  9:41   ` Ian Campbell
@ 2014-03-21 18:13   ` David Miller
  1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2014-03-21 18:13 UTC (permalink / raw)
  To: zoltan.kiss
  Cc: ian.campbell, wei.liu2, xen-devel, paul.durrant, netdev,
	linux-kernel, jonathan.davies

From: Zoltan Kiss <zoltan.kiss@citrix.com>
Date: Thu, 20 Mar 2014 19:32:41 +0000

> Ian made some late comments about the grant mapping series, I incorporated the
> outcomes into this patch. Additional comments, refactoring etc.
> 
> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>

I would really prefer that pure non-functional cleanup changes happen
separately from real changes.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available
  2014-03-20 19:32 [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available Zoltan Kiss
  2014-03-20 19:32 ` [PATCH net-next] xen-netback: Follow-up patch for grant mapping series Zoltan Kiss
@ 2014-03-21  9:36 ` Ian Campbell
  2014-03-21 19:16   ` Zoltan Kiss
  1 sibling, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2014-03-21  9:36 UTC (permalink / raw)
  To: Zoltan Kiss
  Cc: wei.liu2, xen-devel, paul.durrant, netdev, linux-kernel,
	jonathan.davies

On Thu, 2014-03-20 at 19:32 +0000, Zoltan Kiss wrote:
> Since the early days TX stops if there isn't enough free pending slots to
> consume a maximum sized (slot-wise) packet. Probably the reason for that is to
> avoid the case when we don't have enough free pending slot in the ring to finish
> the packet. But if we make sure that the pending ring has the same size as the
> shared ring, that shouldn't really happen. The frontend can only post packets
> which fit the to the free space of the shared ring. If it doesn't, the frontend
> has to stop, as it can only increase the req_prod when the whole packet fits
> onto the ring.

My only real concern here is that by removing these checks we are
introducing a way for a malicious or buggy guest to trigger misbehaviour
in the backend, leading to e.g. a DoS.

Should we need to some sanity checks which shutdown the ring if
something like this occurs? i.e. if we come to consume a packet and
there is insufficient space on the pending ring we kill the vif.

What's the invariant we are relying on here, is it:
    req_prod >= req_cons >= pending_prod >= pending_cons >= rsp_prod >= rsp_cons
?

> This patch avoid using this checking, makes sure the 2 ring has the same size,
> and remove a checking from the callback. As now we don't stop the NAPI instance
> on this condition, we don't have to wake it up if we free pending slots up.
> 
> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> ---
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index bef37be..a800a8e 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -81,7 +81,7 @@ struct xenvif_rx_meta {
>  
>  #define MAX_BUFFER_OFFSET PAGE_SIZE
>  
> -#define MAX_PENDING_REQS 256
> +#define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE

XEN_NETIF_TX_RING_SIZE is already == 256, right? (Just want to make sure
this is semantically no change).
 
>  /* It's possible for an skb to have a maximal number of frags
>   * but still be less than MAX_BUFFER_OFFSET in size. Thus the
> @@ -253,12 +253,6 @@ static inline pending_ring_idx_t nr_pending_reqs(struct xenvif *vif)
>  		vif->pending_prod + vif->pending_cons;
>  }
>  
> -static inline bool xenvif_tx_pending_slots_available(struct xenvif *vif)
> -{
> -	return nr_pending_reqs(vif) + XEN_NETBK_LEGACY_SLOTS_MAX
> -		< MAX_PENDING_REQS;
> -}
> -
>  /* Callback from stack when TX packet can be released */
>  void xenvif_zerocopy_callback(struct ubuf_info *ubuf, bool zerocopy_success);
>  
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index 83a71ac..7bb7734 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -88,8 +88,7 @@ static int xenvif_poll(struct napi_struct *napi, int budget)
>  		local_irq_save(flags);
>  
>  		RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, more_to_do);
> -		if (!(more_to_do &&
> -		      xenvif_tx_pending_slots_available(vif)))
> +		if (!more_to_do)
>  			__napi_complete(napi);
>  
>  		local_irq_restore(flags);
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index bc94320..5df8d63 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1175,8 +1175,7 @@ static unsigned xenvif_tx_build_gops(struct xenvif *vif, int budget)
>  	struct sk_buff *skb;
>  	int ret;
>  
> -	while (xenvif_tx_pending_slots_available(vif) &&
> -	       (skb_queue_len(&vif->tx_queue) < budget)) {
> +	while (skb_queue_len(&vif->tx_queue) < budget) {
>  		struct xen_netif_tx_request txreq;
>  		struct xen_netif_tx_request txfrags[XEN_NETBK_LEGACY_SLOTS_MAX];
>  		struct xen_netif_extra_info extras[XEN_NETIF_EXTRA_TYPE_MAX-1];
> @@ -1516,13 +1515,6 @@ void xenvif_zerocopy_callback(struct ubuf_info *ubuf, bool zerocopy_success)
>  	wake_up(&vif->dealloc_wq);
>  	spin_unlock_irqrestore(&vif->callback_lock, flags);
>  
> -	if (RING_HAS_UNCONSUMED_REQUESTS(&vif->tx) &&
> -	    xenvif_tx_pending_slots_available(vif)) {
> -		local_bh_disable();
> -		napi_schedule(&vif->napi);
> -		local_bh_enable();
> -	}
> -
>  	if (likely(zerocopy_success))
>  		vif->tx_zerocopy_success++;
>  	else
> @@ -1714,8 +1706,7 @@ static inline int rx_work_todo(struct xenvif *vif)
>  static inline int tx_work_todo(struct xenvif *vif)
>  {
>  
> -	if (likely(RING_HAS_UNCONSUMED_REQUESTS(&vif->tx)) &&
> -	    xenvif_tx_pending_slots_available(vif))
> +	if (likely(RING_HAS_UNCONSUMED_REQUESTS(&vif->tx)))
>  		return 1;
>  
>  	return 0;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available
  2014-03-21  9:36 ` [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available Ian Campbell
@ 2014-03-21 19:16   ` Zoltan Kiss
  0 siblings, 0 replies; 8+ messages in thread
From: Zoltan Kiss @ 2014-03-21 19:16 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, xen-devel, paul.durrant, netdev, linux-kernel,
	jonathan.davies

On 21/03/14 09:36, Ian Campbell wrote:
> On Thu, 2014-03-20 at 19:32 +0000, Zoltan Kiss wrote:
>> Since the early days TX stops if there isn't enough free pending slots to
>> consume a maximum sized (slot-wise) packet. Probably the reason for that is to
>> avoid the case when we don't have enough free pending slot in the ring to finish
>> the packet. But if we make sure that the pending ring has the same size as the
>> shared ring, that shouldn't really happen. The frontend can only post packets
>> which fit the to the free space of the shared ring. If it doesn't, the frontend
>> has to stop, as it can only increase the req_prod when the whole packet fits
>> onto the ring.
>
> My only real concern here is that by removing these checks we are
> introducing a way for a malicious or buggy guest to trigger misbehaviour
> in the backend, leading to e.g. a DoS.
>
> Should we need to some sanity checks which shutdown the ring if
> something like this occurs? i.e. if we come to consume a packet and
> there is insufficient space on the pending ring we kill the vif.
The backend doesn't see what the guest does with the responses, and 
that's OK, it's the guest's problem, after netback increased 
rsp_prod_pvt it doesn't really care. But as soon as the guest start 
placing new requests after rsp_prod_pvt, or just increasing req_prod so 
req_prod-rsp_prod_pvt > XEN_NETIF_TX_RING_SIZE, it becomes an issue.
So far this xenvif_tx_pending_slots_available indirectly saved us from 
consuming requests overwriting still pending requests, but the guest 
could still overwrote our responses. But again, that's still the guests 
problem, we had the original request saved in the pending ring data. If 
the guest went too far, build_gops killed the vif when req_prod-req_cons 
 > XEN_NETIF_TX_RING_SIZE
Indirect above means it only happened because the pending ring had the 
same size, and the used slots has a 1-to-1 mapping for slots between 
rsp_prod_pvt and req_cons. So xenvif_tx_pending_slots_available also 
means (req_cons - rsp_prod_pvt) + XEN_NETBK_LEGACY_SLOTS_MAX < 
XEN_NETIF_TX_RING_SIZE (does this look familiar? :)
But consuming overrunning requests after rsp_prod_pvt is a problem:
- NAPI instance races with dealloc thread over the slots. The first 
reads them as requests, the second writes them as responses
- the NAPI instance overwrites used pending slots as well, so skb frag 
release go wrong etc.
But the current RING_HAS_UNCONSUMED_REQUESTS fortunately saves us here, 
let me explain it through an example:
rsp_prod_pvt = 0
req_cons = 253
req_prod = 258

Therefore:
req = 5
rsp = 3

So in xenvif_tx_build_gops work_to_do will be 3, and in 
xenvif_count_requests we bail out when we see that the packet actually 
exceeds that.
So in the end, we are safe here, but we shouldn't change that macro I 
suggested to refactor :)

Zoli


>
> What's the invariant we are relying on here, is it:
>      req_prod >= req_cons >= pending_prod >= pending_cons >= rsp_prod >= rsp_cons
> ?
>
>> This patch avoid using this checking, makes sure the 2 ring has the same size,
>> and remove a checking from the callback. As now we don't stop the NAPI instance
>> on this condition, we don't have to wake it up if we free pending slots up.
>>
>> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
>> ---
>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index bef37be..a800a8e 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -81,7 +81,7 @@ struct xenvif_rx_meta {
>>
>>   #define MAX_BUFFER_OFFSET PAGE_SIZE
>>
>> -#define MAX_PENDING_REQS 256
>> +#define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
>
> XEN_NETIF_TX_RING_SIZE is already == 256, right? (Just want to make sure
> this is semantically no change).
Yes, it is __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-03-24 10:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-20 19:32 [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available Zoltan Kiss
2014-03-20 19:32 ` [PATCH net-next] xen-netback: Follow-up patch for grant mapping series Zoltan Kiss
2014-03-21  9:41   ` Ian Campbell
2014-03-21 19:31     ` Zoltan Kiss
2014-03-24 10:34       ` Ian Campbell
2014-03-21 18:13   ` David Miller
2014-03-21  9:36 ` [PATCH net-next] xen-netback: Stop using xenvif_tx_pending_slots_available Ian Campbell
2014-03-21 19:16   ` Zoltan Kiss

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).