netdev.vger.kernel.org archive mirror
* Re: [PATCH 0/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-26 22:18 [PATCH 0/1] net/hyperv: Add flow control based on hi/low watermark Haiyang Zhang
@ 2012-03-26 22:12 ` David Miller
  2012-03-26 22:18 ` [PATCH 1/1] " Haiyang Zhang
  1 sibling, 0 replies; 11+ messages in thread
From: David Miller @ 2012-03-26 22:12 UTC (permalink / raw)
  To: haiyangz; +Cc: netdev, kys, olaf, linux-kernel, devel

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: Mon, 26 Mar 2012 15:18:23 -0700

> This patch is targeting 'net-next' tree.

The merge window is still open, and therefore the net-next tree is
not open yet.


* [PATCH 0/1] net/hyperv: Add flow control based on hi/low watermark
@ 2012-03-26 22:18 Haiyang Zhang
  2012-03-26 22:12 ` David Miller
  2012-03-26 22:18 ` [PATCH 1/1] " Haiyang Zhang
  0 siblings, 2 replies; 11+ messages in thread
From: Haiyang Zhang @ 2012-03-26 22:18 UTC (permalink / raw)
  To: davem, netdev; +Cc: devel, haiyangz, olaf, linux-kernel

This patch is targeting 'net-next' tree.

Thanks to Stephen Hemminger <shemminger@vyatta.com> for his suggestions.


Haiyang Zhang (1):
  net/hyperv: Add flow control based on hi/low watermark

 drivers/hv/ring_buffer.c        |   15 +++++++++++++++
 drivers/net/hyperv/hyperv_net.h |    3 +++
 drivers/net/hyperv/netvsc.c     |   23 +++++++++++++++++++----
 drivers/net/hyperv/netvsc_drv.c |   16 +++++++++++++++-
 include/linux/hyperv.h          |    3 +++
 5 files changed, 55 insertions(+), 5 deletions(-)

-- 
1.7.4.1


* [PATCH 1/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-26 22:18 [PATCH 0/1] net/hyperv: Add flow control based on hi/low watermark Haiyang Zhang
  2012-03-26 22:12 ` David Miller
@ 2012-03-26 22:18 ` Haiyang Zhang
  2012-03-26 23:10   ` Greg KH
  1 sibling, 1 reply; 11+ messages in thread
From: Haiyang Zhang @ 2012-03-26 22:18 UTC (permalink / raw)
  To: davem, netdev; +Cc: haiyangz, kys, olaf, linux-kernel, devel

In the existing code, we only stop queue when the ringbuffer is full,
so the current packet has to be dropped or retried from upper layer.

This patch stops the tx queue when available ringbuffer is below
the low watermark. So the ringbuffer still has small amount of space
available for the current packet. This will reduce the overhead of
retries on sending.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/ring_buffer.c        |   15 +++++++++++++++
 drivers/net/hyperv/hyperv_net.h |    3 +++
 drivers/net/hyperv/netvsc.c     |   23 +++++++++++++++++++----
 drivers/net/hyperv/netvsc_drv.c |   16 +++++++++++++++-
 include/linux/hyperv.h          |    3 +++
 5 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 8af25a0..8cc3f63 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -23,6 +23,7 @@
  */
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/hyperv.h>
@@ -160,6 +161,20 @@ hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info)
 }
 
 /*
+ * Get the percentage of available bytes to write in the ring.
+ * The return value is in range from 0 to 100.
+ */
+u32 hv_ringbuf_avail_percent(struct hv_ring_buffer_info *ring_info)
+{
+	u32 avail_read, avail_write;
+
+	hv_get_ringbuffer_availbytes(ring_info, &avail_read, &avail_write);
+
+	return avail_write * 100 / hv_get_ring_buffersize(ring_info);
+}
+EXPORT_SYMBOL(hv_ringbuf_avail_percent);
+
+/*
  *
  * hv_get_ring_bufferindices()
  *
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index c358245..cd234cd 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -470,6 +470,9 @@ struct nvsp_message {
 
 #define NETVSC_PACKET_SIZE                      2048
 
+extern uint ring_avail_percent_hiwater;
+extern uint ring_avail_percent_lowater;
+
 /* Per netvsc channel-specific */
 struct netvsc_device {
 	struct hv_device *dev;
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index d025c83..fbf4f18 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -455,6 +455,8 @@ static void netvsc_send_completion(struct hv_device *device,
 		complete(&net_device->channel_init_wait);
 	} else if (nvsp_packet->hdr.msg_type ==
 		   NVSP_MSG1_TYPE_SEND_RNDIS_PKT_COMPLETE) {
+		int num_outstanding_sends;
+
 		/* Get the send context */
 		nvsc_packet = (struct hv_netvsc_packet *)(unsigned long)
 			packet->trans_id;
@@ -463,10 +465,14 @@ static void netvsc_send_completion(struct hv_device *device,
 		nvsc_packet->completion.send.send_completion(
 			nvsc_packet->completion.send.send_completion_ctx);
 
-		atomic_dec(&net_device->num_outstanding_sends);
+		num_outstanding_sends =
+			atomic_dec_return(&net_device->num_outstanding_sends);
 
-		if (netif_queue_stopped(ndev) && !net_device->start_remove)
-			netif_wake_queue(ndev);
+		if (netif_queue_stopped(ndev) && !net_device->start_remove &&
+			(hv_ringbuf_avail_percent(&device->channel->outbound)
+			> ring_avail_percent_hiwater ||
+			num_outstanding_sends < 1))
+				netif_wake_queue(ndev);
 	} else {
 		netdev_err(ndev, "Unknown send completion packet type- "
 			   "%d received!!\n", nvsp_packet->hdr.msg_type);
@@ -519,10 +525,19 @@ int netvsc_send(struct hv_device *device,
 
 	if (ret == 0) {
 		atomic_inc(&net_device->num_outstanding_sends);
+		if (hv_ringbuf_avail_percent(&device->channel->outbound) <
+			ring_avail_percent_lowater) {
+			netif_stop_queue(ndev);
+			if (atomic_read(&net_device->
+				num_outstanding_sends) < 1)
+				netif_wake_queue(ndev);
+		}
 	} else if (ret == -EAGAIN) {
 		netif_stop_queue(ndev);
-		if (atomic_read(&net_device->num_outstanding_sends) < 1)
+		if (atomic_read(&net_device->num_outstanding_sends) < 1) {
 			netif_wake_queue(ndev);
+			ret = -ENOSPC;
+		}
 	} else {
 		netdev_err(ndev, "Unable to send packet %p ret %d\n",
 			   packet, ret);
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index dd29478..f13887c 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -51,6 +51,16 @@ static int ring_size = 128;
 module_param(ring_size, int, S_IRUGO);
 MODULE_PARM_DESC(ring_size, "Ring buffer size (# of pages)");
 
+uint ring_avail_percent_hiwater = 20;
+module_param(ring_avail_percent_hiwater, uint, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(ring_avail_percent_hiwater,
+	"Ring buffer available percentiles to wake up xmit queue");
+
+uint ring_avail_percent_lowater = 10;
+module_param(ring_avail_percent_lowater, uint, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(ring_avail_percent_lowater,
+	"Ring buffer available percentiles to stop xmit queue");
+
 struct set_multicast_work {
 	struct work_struct work;
 	struct net_device *net;
@@ -224,9 +234,13 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
 		net->stats.tx_packets++;
 	} else {
 		kfree(packet);
+		if (ret != -EAGAIN) {
+			dev_kfree_skb_any(skb);
+			net->stats.tx_dropped++;
+		}
 	}
 
-	return ret ? NETDEV_TX_BUSY : NETDEV_TX_OK;
+	return (ret == -EAGAIN) ? NETDEV_TX_BUSY : NETDEV_TX_OK;
 }
 
 /*
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 5852545..e8e4c31 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -274,6 +274,9 @@ struct hv_ring_buffer_debug_info {
 	u32 bytes_avail_towrite;
 };
 
+extern u32 hv_ringbuf_avail_percent(struct hv_ring_buffer_info *ring_info);
+
+
 /*
  * We use the same version numbering for all Hyper-V modules.
  *
-- 
1.7.4.1


* Re: [PATCH 1/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-26 22:18 ` [PATCH 1/1] " Haiyang Zhang
@ 2012-03-26 23:10   ` Greg KH
  2012-03-26 23:12     ` David Miller
  2012-03-27  1:17     ` Haiyang Zhang
  0 siblings, 2 replies; 11+ messages in thread
From: Greg KH @ 2012-03-26 23:10 UTC (permalink / raw)
  To: Haiyang Zhang; +Cc: davem, netdev, devel, olaf, linux-kernel

On Mon, Mar 26, 2012 at 03:18:24PM -0700, Haiyang Zhang wrote:
> In the existing code, we only stop queue when the ringbuffer is full,
> so the current packet has to be dropped or retried from upper layer.
> 
> This patch stops the tx queue when available ringbuffer is below
> the low watermark. So the ringbuffer still has small amount of space
> available for the current packet. This will reduce the overhead of
> retries on sending.
> 
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
> ---
>  drivers/hv/ring_buffer.c        |   15 +++++++++++++++
>  drivers/net/hyperv/hyperv_net.h |    3 +++
>  drivers/net/hyperv/netvsc.c     |   23 +++++++++++++++++++----
>  drivers/net/hyperv/netvsc_drv.c |   16 +++++++++++++++-
>  include/linux/hyperv.h          |    3 +++
>  5 files changed, 55 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> index 8af25a0..8cc3f63 100644
> --- a/drivers/hv/ring_buffer.c
> +++ b/drivers/hv/ring_buffer.c
> @@ -23,6 +23,7 @@
>   */
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>  
> +#include <linux/module.h>
>  #include <linux/kernel.h>
>  #include <linux/mm.h>
>  #include <linux/hyperv.h>
> @@ -160,6 +161,20 @@ hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info)
>  }
>  
>  /*
> + * Get the percentage of available bytes to write in the ring.
> + * The return value is in range from 0 to 100.
> + */
> +u32 hv_ringbuf_avail_percent(struct hv_ring_buffer_info *ring_info)
> +{
> +	u32 avail_read, avail_write;
> +
> +	hv_get_ringbuffer_availbytes(ring_info, &avail_read, &avail_write);
> +
> +	return avail_write * 100 / hv_get_ring_buffersize(ring_info);
> +}
> +EXPORT_SYMBOL(hv_ringbuf_avail_percent);

That makes no sense, what happens 1 second later to the buffer?  You
can't expect this result to be valid anymore, right?

> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -51,6 +51,16 @@ static int ring_size = 128;
>  module_param(ring_size, int, S_IRUGO);
>  MODULE_PARM_DESC(ring_size, "Ring buffer size (# of pages)");
>  
> +uint ring_avail_percent_hiwater = 20;
> +module_param(ring_avail_percent_hiwater, uint, S_IRUGO | S_IWUSR);
> +MODULE_PARM_DESC(ring_avail_percent_hiwater,
> +	"Ring buffer available percentiles to wake up xmit queue");
> +
> +uint ring_avail_percent_lowater = 10;
> +module_param(ring_avail_percent_lowater, uint, S_IRUGO | S_IWUSR);
> +MODULE_PARM_DESC(ring_avail_percent_lowater,
> +	"Ring buffer available percentiles to stop xmit queue");

Eeek, no, how in the world is someone supposed to know to set these
things?

Please make this work "automatically", otherwise no one will ever get it
right.  Don't make it tunable as a way out of making a decision on how
the driver should work.

Ick.

David, please do NOT apply this as-is.

greg k-h


* Re: [PATCH 1/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-26 23:10   ` Greg KH
@ 2012-03-26 23:12     ` David Miller
  2012-03-28 18:05       ` Ben Hutchings
  2012-03-27  1:17     ` Haiyang Zhang
  1 sibling, 1 reply; 11+ messages in thread
From: David Miller @ 2012-03-26 23:12 UTC (permalink / raw)
  To: gregkh; +Cc: haiyangz, netdev, devel, olaf, linux-kernel

From: Greg KH <gregkh@linuxfoundation.org>
Date: Mon, 26 Mar 2012 16:10:17 -0700

> David, please do NOT apply this as-is.

BTW, ethtool had controls exactly for stuff like this.


* RE: [PATCH 1/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-26 23:10   ` Greg KH
  2012-03-26 23:12     ` David Miller
@ 2012-03-27  1:17     ` Haiyang Zhang
  1 sibling, 0 replies; 11+ messages in thread
From: Haiyang Zhang @ 2012-03-27  1:17 UTC (permalink / raw)
  To: Greg KH
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	devel@linuxdriverproject.org, olaf@aepfle.de,
	linux-kernel@vger.kernel.org



> -----Original Message-----
> From: Greg KH [mailto:gregkh@linuxfoundation.org]
> Sent: Monday, March 26, 2012 7:10 PM
> To: Haiyang Zhang
> Cc: davem@davemloft.net; netdev@vger.kernel.org;
> devel@linuxdriverproject.org; olaf@aepfle.de; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/1] net/hyperv: Add flow control based on hi/low
> watermark
> 
> On Mon, Mar 26, 2012 at 03:18:24PM -0700, Haiyang Zhang wrote:
> > In the existing code, we only stop queue when the ringbuffer is full,
> > so the current packet has to be dropped or retried from upper layer.
> >
> > This patch stops the tx queue when available ringbuffer is below the
> > low watermark. So the ringbuffer still has small amount of space
> > available for the current packet. This will reduce the overhead of
> > retries on sending.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
> > ---
> >  drivers/hv/ring_buffer.c        |   15 +++++++++++++++
> >  drivers/net/hyperv/hyperv_net.h |    3 +++
> >  drivers/net/hyperv/netvsc.c     |   23 +++++++++++++++++++----
> >  drivers/net/hyperv/netvsc_drv.c |   16 +++++++++++++++-
> >  include/linux/hyperv.h          |    3 +++
> >  5 files changed, 55 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> > index 8af25a0..8cc3f63 100644
> > --- a/drivers/hv/ring_buffer.c
> > +++ b/drivers/hv/ring_buffer.c
> > @@ -23,6 +23,7 @@
> >   */
> >  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >
> > +#include <linux/module.h>
> >  #include <linux/kernel.h>
> >  #include <linux/mm.h>
> >  #include <linux/hyperv.h>
> > @@ -160,6 +161,20 @@ hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info)
> >  }
> >
> >  /*
> > + * Get the percentage of available bytes to write in the ring.
> > + * The return value is in range from 0 to 100.
> > + */
> > +u32 hv_ringbuf_avail_percent(struct hv_ring_buffer_info *ring_info)
> > +{
> > +	u32 avail_read, avail_write;
> > +
> > +	hv_get_ringbuffer_availbytes(ring_info, &avail_read, &avail_write);
> > +
> > +	return avail_write * 100 / hv_get_ring_buffersize(ring_info);
> > +}
> > +EXPORT_SYMBOL(hv_ringbuf_avail_percent);
> 
> That makes no sense, what happens 1 second later to the buffer?  You can't
> expect this result to be valid anymore, right?

The flow control does not need to be very precise; a rough estimate of how
full the buffer is will do. We only want to keep a small amount of buffer
available, so that outgoing packets don't need to be retried.
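
The stop/wake behaviour described here can be sketched in plain C outside
the kernel. This is only an illustration under stated assumptions:
struct tx_model and the model_* functions are made-up names, not netvsc
code; the thresholds are the 20/10 defaults from the patch; and the extra
wake-up taken when num_outstanding_sends drops below 1 is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_AVAIL_PERCENT_HIWATER 20	/* wake the queue above this */
#define RING_AVAIL_PERCENT_LOWATER 10	/* stop the queue below this */

/* Hypothetical stand-in for driver state; not the real netvsc structures. */
struct tx_model {
	uint32_t ring_size;	/* total bytes in the ring, assumed non-zero */
	uint32_t avail_write;	/* bytes currently free for writing */
	bool queue_stopped;
};

static uint32_t avail_percent(const struct tx_model *m)
{
	/* 64-bit intermediate so avail_write * 100 cannot wrap */
	return (uint32_t)((uint64_t)m->avail_write * 100 / m->ring_size);
}

/* A packet of len bytes has just been placed in the ring. */
static void model_sent(struct tx_model *m, uint32_t len)
{
	m->avail_write -= len;
	if (avail_percent(m) < RING_AVAIL_PERCENT_LOWATER)
		m->queue_stopped = true;
}

/* The other end has consumed len bytes from the ring. */
static void model_completed(struct tx_model *m, uint32_t len)
{
	m->avail_write += len;
	if (m->queue_stopped &&
	    avail_percent(m) > RING_AVAIL_PERCENT_HIWATER)
		m->queue_stopped = false;
}
```

With a 1000-byte ring, the queue stops once free space falls to 9% and
stays stopped at 15%; it only wakes again once free space exceeds 20%.
The gap between the two thresholds is what keeps the queue from flapping
on every completion.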

> > --- a/drivers/net/hyperv/netvsc_drv.c
> > +++ b/drivers/net/hyperv/netvsc_drv.c
> > @@ -51,6 +51,16 @@ static int ring_size = 128;
> >  module_param(ring_size, int, S_IRUGO);
> >  MODULE_PARM_DESC(ring_size, "Ring buffer size (# of pages)");
> >
> > +uint ring_avail_percent_hiwater = 20;
> > +module_param(ring_avail_percent_hiwater, uint, S_IRUGO | S_IWUSR);
> > +MODULE_PARM_DESC(ring_avail_percent_hiwater,
> > +	"Ring buffer available percentiles to wake up xmit queue");
> > +
> > +uint ring_avail_percent_lowater = 10;
> > +module_param(ring_avail_percent_lowater, uint, S_IRUGO | S_IWUSR);
> > +MODULE_PARM_DESC(ring_avail_percent_lowater,
> > +	"Ring buffer available percentiles to stop xmit queue");
> 
> Eeek, no, how in the world is someone supposed to know to set these things?
> 
> Please make this work "automatically", otherwise no one will ever get it right.
> Don't make it tunable as a way out of making a decision on how the driver
> should work.

I will remove the tunables. The default values here work fine in our tests.

Thanks,
- Haiyang


* Re: [PATCH 1/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-27 22:23 ` [PATCH 1/1] " Haiyang Zhang
@ 2012-03-27 22:20   ` Greg KH
  2012-03-27 22:25     ` Haiyang Zhang
  0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2012-03-27 22:20 UTC (permalink / raw)
  To: Haiyang Zhang; +Cc: davem, netdev, devel, olaf, linux-kernel

On Tue, Mar 27, 2012 at 03:23:07PM -0700, Haiyang Zhang wrote:
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -274,6 +274,35 @@ struct hv_ring_buffer_debug_info {
>  	u32 bytes_avail_towrite;
>  };
>  
> +/* Amount of space to write to */
> +#define BYTES_AVAIL_TO_WRITE(r, w, z) \
> +	(((w) >= (r)) ? ((z) - ((w) - (r))) : ((r) - (w)))
> +

That's a very bad #define to use in a .h file, please do not do that.

> +
> +/*
> + *
> + * hv_get_ringbuffer_availbytes()
> + *
> + * Get number of bytes available to read and to write to
> + * for the specified ring buffer
> + */
> +extern inline void
> +hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi,
> +			  u32 *read, u32 *write)

What does "extern inline" mean?

Please fix.

greg k-h
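
The review above does not spell out its reasons. One plausible reading is
that a generically named function-like macro in a shared header leaks into
every file that includes it, and that "extern inline" means different
things under GNU89 and C99 rules, whereas helpers in kernel headers are
conventionally "static inline". A minimal sketch of a typed helper that
could stand in for the macro follows; the function name is hypothetical
and is not taken from any later revision of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical typed replacement for BYTES_AVAIL_TO_WRITE(r, w, z).
 * Same arithmetic as the macro, but the arguments are evaluated once
 * and type-checked, and the name is less likely to collide.
 */
static inline uint32_t ring_bytes_avail_to_write(uint32_t read_loc,
						 uint32_t write_loc,
						 uint32_t ring_datasize)
{
	if (write_loc >= read_loc)
		return ring_datasize - (write_loc - read_loc);
	return read_loc - write_loc;
}
```

As in the macro, equal read and write indices are treated as a completely
empty ring, so the whole data size is reported as writable.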


* [PATCH 1/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-27 22:23 [PATCH 0/1] " Haiyang Zhang
@ 2012-03-27 22:23 ` Haiyang Zhang
  2012-03-27 22:20   ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: Haiyang Zhang @ 2012-03-27 22:23 UTC (permalink / raw)
  To: davem, netdev; +Cc: haiyangz, kys, olaf, linux-kernel, devel

In the existing code, we only stop queue when the ringbuffer is full,
so the current packet has to be dropped or retried from upper layer.

This patch stops the tx queue when available ringbuffer is below
the low watermark. So the ringbuffer still has small amount of space
available for the current packet. This will reduce the overhead of
retries on sending.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/ring_buffer.c        |   31 -----------------------------
 drivers/net/hyperv/netvsc.c     |   41 +++++++++++++++++++++++++++++++++++---
 drivers/net/hyperv/netvsc_drv.c |    6 ++++-
 include/linux/hyperv.h          |   29 +++++++++++++++++++++++++++
 4 files changed, 71 insertions(+), 36 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 8af25a0..7233c88 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -30,37 +30,6 @@
 #include "hyperv_vmbus.h"
 
 
-/* #defines */
-
-
-/* Amount of space to write to */
-#define BYTES_AVAIL_TO_WRITE(r, w, z) \
-	((w) >= (r)) ? ((z) - ((w) - (r))) : ((r) - (w))
-
-
-/*
- *
- * hv_get_ringbuffer_availbytes()
- *
- * Get number of bytes available to read and to write to
- * for the specified ring buffer
- */
-static inline void
-hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi,
-			  u32 *read, u32 *write)
-{
-	u32 read_loc, write_loc;
-
-	smp_read_barrier_depends();
-
-	/* Capture the read/write indices before they changed */
-	read_loc = rbi->ring_buffer->read_index;
-	write_loc = rbi->ring_buffer->write_index;
-
-	*write = BYTES_AVAIL_TO_WRITE(read_loc, write_loc, rbi->ring_datasize);
-	*read = rbi->ring_datasize - *write;
-}
-
 /*
  * hv_get_next_write_location()
  *
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index d025c83..8b91947 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -428,6 +428,24 @@ int netvsc_device_remove(struct hv_device *device)
 	return 0;
 }
 
+
+#define RING_AVAIL_PERCENT_HIWATER 20
+#define RING_AVAIL_PERCENT_LOWATER 10
+
+/*
+ * Get the percentage of available bytes to write in the ring.
+ * The return value is in range from 0 to 100.
+ */
+static inline u32 hv_ringbuf_avail_percent(
+		struct hv_ring_buffer_info *ring_info)
+{
+	u32 avail_read, avail_write;
+
+	hv_get_ringbuffer_availbytes(ring_info, &avail_read, &avail_write);
+
+	return avail_write * 100 / ring_info->ring_datasize;
+}
+
 static void netvsc_send_completion(struct hv_device *device,
 				   struct vmpacket_descriptor *packet)
 {
@@ -455,6 +473,8 @@ static void netvsc_send_completion(struct hv_device *device,
 		complete(&net_device->channel_init_wait);
 	} else if (nvsp_packet->hdr.msg_type ==
 		   NVSP_MSG1_TYPE_SEND_RNDIS_PKT_COMPLETE) {
+		int num_outstanding_sends;
+
 		/* Get the send context */
 		nvsc_packet = (struct hv_netvsc_packet *)(unsigned long)
 			packet->trans_id;
@@ -463,10 +483,14 @@ static void netvsc_send_completion(struct hv_device *device,
 		nvsc_packet->completion.send.send_completion(
 			nvsc_packet->completion.send.send_completion_ctx);
 
-		atomic_dec(&net_device->num_outstanding_sends);
+		num_outstanding_sends =
+			atomic_dec_return(&net_device->num_outstanding_sends);
 
-		if (netif_queue_stopped(ndev) && !net_device->start_remove)
-			netif_wake_queue(ndev);
+		if (netif_queue_stopped(ndev) && !net_device->start_remove &&
+			(hv_ringbuf_avail_percent(&device->channel->outbound)
+			> RING_AVAIL_PERCENT_HIWATER ||
+			num_outstanding_sends < 1))
+				netif_wake_queue(ndev);
 	} else {
 		netdev_err(ndev, "Unknown send completion packet type- "
 			   "%d received!!\n", nvsp_packet->hdr.msg_type);
@@ -519,10 +543,19 @@ int netvsc_send(struct hv_device *device,
 
 	if (ret == 0) {
 		atomic_inc(&net_device->num_outstanding_sends);
+		if (hv_ringbuf_avail_percent(&device->channel->outbound) <
+			RING_AVAIL_PERCENT_LOWATER) {
+			netif_stop_queue(ndev);
+			if (atomic_read(&net_device->
+				num_outstanding_sends) < 1)
+				netif_wake_queue(ndev);
+		}
 	} else if (ret == -EAGAIN) {
 		netif_stop_queue(ndev);
-		if (atomic_read(&net_device->num_outstanding_sends) < 1)
+		if (atomic_read(&net_device->num_outstanding_sends) < 1) {
 			netif_wake_queue(ndev);
+			ret = -ENOSPC;
+		}
 	} else {
 		netdev_err(ndev, "Unable to send packet %p ret %d\n",
 			   packet, ret);
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index dd29478..a0cc127 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -224,9 +224,13 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
 		net->stats.tx_packets++;
 	} else {
 		kfree(packet);
+		if (ret != -EAGAIN) {
+			dev_kfree_skb_any(skb);
+			net->stats.tx_dropped++;
+		}
 	}
 
-	return ret ? NETDEV_TX_BUSY : NETDEV_TX_OK;
+	return (ret == -EAGAIN) ? NETDEV_TX_BUSY : NETDEV_TX_OK;
 }
 
 /*
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 5852545..2c366f0 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -274,6 +274,35 @@ struct hv_ring_buffer_debug_info {
 	u32 bytes_avail_towrite;
 };
 
+/* Amount of space to write to */
+#define BYTES_AVAIL_TO_WRITE(r, w, z) \
+	(((w) >= (r)) ? ((z) - ((w) - (r))) : ((r) - (w)))
+
+
+/*
+ *
+ * hv_get_ringbuffer_availbytes()
+ *
+ * Get number of bytes available to read and to write to
+ * for the specified ring buffer
+ */
+extern inline void
+hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi,
+			  u32 *read, u32 *write)
+{
+	u32 read_loc, write_loc;
+
+	smp_read_barrier_depends();
+
+	/* Capture the read/write indices before they changed */
+	read_loc = rbi->ring_buffer->read_index;
+	write_loc = rbi->ring_buffer->write_index;
+
+	*write = BYTES_AVAIL_TO_WRITE(read_loc, write_loc, rbi->ring_datasize);
+	*read = rbi->ring_datasize - *write;
+}
+
+
 /*
  * We use the same version numbering for all Hyper-V modules.
  *
-- 
1.7.4.1
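
The netvsc_start_xmit() hunk above changes how send results are reported
to the stack: only -EAGAIN is returned as NETDEV_TX_BUSY, and any other
failure is dropped and counted. A small userspace sketch of that mapping
follows; enum tx_status, struct xmit_result and map_send_result() are
hypothetical names used for illustration and do not exist in the driver.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's NETDEV_TX_OK / NETDEV_TX_BUSY. */
enum tx_status { TX_OK, TX_BUSY };

struct xmit_result {
	enum tx_status status;
	bool free_skb;		/* driver frees the skb itself */
	bool count_dropped;	/* bump the tx_dropped counter */
};

static struct xmit_result map_send_result(int ret)
{
	struct xmit_result r = { TX_OK, false, false };

	if (ret == 0)
		return r;		/* packet was queued to the ring */

	if (ret == -EAGAIN) {
		r.status = TX_BUSY;	/* stack keeps the skb and retries */
		return r;
	}

	/* Any other error: drop the packet instead of asking for a retry. */
	r.free_skb = true;
	r.count_dropped = true;
	return r;
}
```

Note that netvsc_send() in the same patch rewrites -EAGAIN to -ENOSPC when
no sends are outstanding, so in this sketch that case takes the drop path
rather than asking the stack to retry.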


* RE: [PATCH 1/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-27 22:20   ` Greg KH
@ 2012-03-27 22:25     ` Haiyang Zhang
  0 siblings, 0 replies; 11+ messages in thread
From: Haiyang Zhang @ 2012-03-27 22:25 UTC (permalink / raw)
  To: Greg KH
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	devel@linuxdriverproject.org, olaf@aepfle.de,
	linux-kernel@vger.kernel.org



> -----Original Message-----
> From: Greg KH [mailto:gregkh@linuxfoundation.org]
> Sent: Tuesday, March 27, 2012 6:21 PM
> To: Haiyang Zhang
> Cc: davem@davemloft.net; netdev@vger.kernel.org;
> devel@linuxdriverproject.org; olaf@aepfle.de; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/1] net/hyperv: Add flow control based on hi/low
> watermark
> 
> On Tue, Mar 27, 2012 at 03:23:07PM -0700, Haiyang Zhang wrote:
> > --- a/include/linux/hyperv.h
> > +++ b/include/linux/hyperv.h
> > @@ -274,6 +274,35 @@ struct hv_ring_buffer_debug_info {
> >  	u32 bytes_avail_towrite;
> >  };
> >
> > +/* Amount of space to write to */
> > +#define BYTES_AVAIL_TO_WRITE(r, w, z) \
> > +	(((w) >= (r)) ? ((z) - ((w) - (r))) : ((r) - (w)))
> > +
> 
> That's a very bad #define to use in a .h file, please do not do that.
> 
> > +
> > +/*
> > + *
> > + * hv_get_ringbuffer_availbytes()
> > + *
> > + * Get number of bytes available to read and to write to
> > + * for the specified ring buffer
> > + */
> > +extern inline void
> > +hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi,
> > +			  u32 *read, u32 *write)
> 
> What does "extern inline" mean?
> 
> Please fix.

Will do.

Thanks,
- Haiyang


* Re: [PATCH 1/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-26 23:12     ` David Miller
@ 2012-03-28 18:05       ` Ben Hutchings
  2012-03-28 20:54         ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Hutchings @ 2012-03-28 18:05 UTC (permalink / raw)
  To: David Miller; +Cc: gregkh, haiyangz, netdev, devel, olaf, linux-kernel

On Mon, 2012-03-26 at 19:12 -0400, David Miller wrote:
> From: Greg KH <gregkh@linuxfoundation.org>
> Date: Mon, 26 Mar 2012 16:10:17 -0700
> 
> > David, please do NOT apply this as-is.
> 
> BTW, ethtool had controls exactly for stuff like this.

Not sure what you're thinking of...?  We have pause frame control but I
don't think that's applicable.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* Re: [PATCH 1/1] net/hyperv: Add flow control based on hi/low watermark
  2012-03-28 18:05       ` Ben Hutchings
@ 2012-03-28 20:54         ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2012-03-28 20:54 UTC (permalink / raw)
  To: bhutchings; +Cc: gregkh, haiyangz, netdev, devel, olaf, linux-kernel

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Wed, 28 Mar 2012 19:05:00 +0100

> On Mon, 2012-03-26 at 19:12 -0400, David Miller wrote:
>> From: Greg KH <gregkh@linuxfoundation.org>
>> Date: Mon, 26 Mar 2012 16:10:17 -0700
>> 
>> > David, please do NOT apply this as-is.
>> 
>> BTW, ethtool had controls exactly for stuff like this.
> 
> Not sure what you're thinking of...?  We have pause frame control but I
> don't think that's applicable.

As I understand this, this situation is really about interrupt flow
control, and for that we have the interrupt moderation ethtool
settings.

