* [PATCH] net/hyperv: Fix the stop/wake queue mechanism
@ 2011-12-02 19:56 Haiyang Zhang
2011-12-09 16:00 ` Haiyang Zhang
0 siblings, 1 reply; 4+ messages in thread
From: Haiyang Zhang @ 2011-12-02 19:56 UTC (permalink / raw)
To: haiyangz, kys, davem, gregkh, linux-kernel, netdev, devel
The ring buffer is only used to pass meta data for outbound packets. The
actual payload is accessed by DMA from the host. So the stop/wake queue
mechanism based on counting and comparing number of pages sent v.s. number
of pages in the ring buffer is wrong. Also, there is a race condition in
the stop/wake queue calls, which can stop xmit queue forever.
The new stop/wake queue mechanism is based on the actual bytes used by
outbound packets in the ring buffer. The check for number of outstanding
sends after stop queue prevents the race condition that can cause wake
queue happening earlier than stop queue.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Long Li <longli@microsoft.com>
---
drivers/net/hyperv/netvsc.c | 14 +++++++++++---
drivers/net/hyperv/netvsc_drv.c | 24 +-----------------------
2 files changed, 12 insertions(+), 26 deletions(-)
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 4a807e4..b6ac152 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -435,6 +435,9 @@ static void netvsc_send_completion(struct hv_device *device,
nvsc_packet->completion.send.send_completion_ctx);
atomic_dec(&net_device->num_outstanding_sends);
+
+ if (netif_queue_stopped(ndev))
+ netif_wake_queue(ndev);
} else {
netdev_err(ndev, "Unknown send completion packet type- "
"%d received!!\n", nvsp_packet->hdr.msg_type);
@@ -485,11 +488,16 @@ int netvsc_send(struct hv_device *device,
}
- if (ret != 0)
+ if (ret == 0) {
+ atomic_inc(&net_device->num_outstanding_sends);
+ } else if (ret == -EAGAIN) {
+ netif_stop_queue(ndev);
+ if (atomic_read(&net_device->num_outstanding_sends) < 1)
+ netif_wake_queue(ndev);
+ } else {
netdev_err(ndev, "Unable to send packet %p ret %d\n",
packet, ret);
- else
- atomic_inc(&net_device->num_outstanding_sends);
+ }
return ret;
}
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index b69c3a4..7da85eb 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -43,15 +43,10 @@
struct net_device_context {
/* point back to our device context */
struct hv_device *device_ctx;
- atomic_t avail;
struct delayed_work dwork;
};
-#define PACKET_PAGES_LOWATER 8
-/* Need this many pages to handle worst case fragmented packet */
-#define PACKET_PAGES_HIWATER (MAX_SKB_FRAGS + 2)
-
static int ring_size = 128;
module_param(ring_size, int, S_IRUGO);
MODULE_PARM_DESC(ring_size, "Ring buffer size (# of pages)");
@@ -144,18 +139,8 @@ static void netvsc_xmit_completion(void *context)
kfree(packet);
- if (skb) {
- struct net_device *net = skb->dev;
- struct net_device_context *net_device_ctx = netdev_priv(net);
- unsigned int num_pages = skb_shinfo(skb)->nr_frags + 2;
-
+ if (skb)
dev_kfree_skb_any(skb);
-
- atomic_add(num_pages, &net_device_ctx->avail);
- if (atomic_read(&net_device_ctx->avail) >=
- PACKET_PAGES_HIWATER)
- netif_wake_queue(net);
- }
}
static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
@@ -167,8 +152,6 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
/* Add 1 for skb->data and additional one for RNDIS */
num_pages = skb_shinfo(skb)->nr_frags + 1 + 1;
- if (num_pages > atomic_read(&net_device_ctx->avail))
- return NETDEV_TX_BUSY;
/* Allocate a netvsc packet based on # of frags. */
packet = kzalloc(sizeof(struct hv_netvsc_packet) +
@@ -218,10 +201,6 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
if (ret == 0) {
net->stats.tx_bytes += skb->len;
net->stats.tx_packets++;
-
- atomic_sub(num_pages, &net_device_ctx->avail);
- if (atomic_read(&net_device_ctx->avail) < PACKET_PAGES_LOWATER)
- netif_stop_queue(net);
} else {
/* we are shutting down or bus overloaded, just drop packet */
net->stats.tx_dropped++;
@@ -391,7 +370,6 @@ static int netvsc_probe(struct hv_device *dev,
net_device_ctx = netdev_priv(net);
net_device_ctx->device_ctx = dev;
- atomic_set(&net_device_ctx->avail, ring_size);
hv_set_drvdata(dev, net);
INIT_DELAYED_WORK(&net_device_ctx->dwork, netvsc_send_garp);
--
1.7.4.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* RE: [PATCH] net/hyperv: Fix the stop/wake queue mechanism
2011-12-02 19:56 [PATCH] net/hyperv: Fix the stop/wake queue mechanism Haiyang Zhang
@ 2011-12-09 16:00 ` Haiyang Zhang
2011-12-09 16:35 ` Greg KH
0 siblings, 1 reply; 4+ messages in thread
From: Haiyang Zhang @ 2011-12-09 16:00 UTC (permalink / raw)
To: KY Srinivasan, davem@davemloft.net, gregkh@suse.de,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
devel@linuxdriverproject.org
> -----Original Message-----
> From: Haiyang Zhang [mailto:haiyangz@microsoft.com]
> Sent: Friday, December 02, 2011 2:56 PM
> To: Haiyang Zhang; KY Srinivasan; davem@davemloft.net; gregkh@suse.de;
> linux-kernel@vger.kernel.org; netdev@vger.kernel.org;
> devel@linuxdriverproject.org
> Subject: [PATCH] net/hyperv: Fix the stop/wake queue mechanism
>
> The ring buffer is only used to pass meta data for outbound packets. The
> actual payload is accessed by DMA from the host. So the stop/wake queue
> mechanism based on counting and comparing number of pages sent v.s.
> number
> of pages in the ring buffer is wrong. Also, there is a race condition in
> the stop/wake queue calls, which can stop xmit queue forever.
>
> The new stop/wake queue mechanism is based on the actual bytes used by
> outbound packets in the ring buffer. The check for number of outstanding
> sends after stop queue prevents the race condition that can cause wake
> queue happening earlier than stop queue.
>
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> Reported-by: Long Li <longli@microsoft.com>
> ---
> drivers/net/hyperv/netvsc.c | 14 +++++++++++---
> drivers/net/hyperv/netvsc_drv.c | 24 +-----------------------
> 2 files changed, 12 insertions(+), 26 deletions(-)
Hi Greg,
Since the netvsc haven't been merged into Dave's tree yet after out of staging,
could you consider this patch for your tree?
Thanks,
- Haiyang
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] net/hyperv: Fix the stop/wake queue mechanism
2011-12-09 16:00 ` Haiyang Zhang
@ 2011-12-09 16:35 ` Greg KH
2011-12-09 16:49 ` Haiyang Zhang
0 siblings, 1 reply; 4+ messages in thread
From: Greg KH @ 2011-12-09 16:35 UTC (permalink / raw)
To: Haiyang Zhang
Cc: KY Srinivasan, davem@davemloft.net, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, devel@linuxdriverproject.org
On Fri, Dec 09, 2011 at 04:00:59PM +0000, Haiyang Zhang wrote:
> > -----Original Message-----
> > From: Haiyang Zhang [mailto:haiyangz@microsoft.com]
> > Sent: Friday, December 02, 2011 2:56 PM
> > To: Haiyang Zhang; KY Srinivasan; davem@davemloft.net; gregkh@suse.de;
> > linux-kernel@vger.kernel.org; netdev@vger.kernel.org;
> > devel@linuxdriverproject.org
> > Subject: [PATCH] net/hyperv: Fix the stop/wake queue mechanism
> >
> > The ring buffer is only used to pass meta data for outbound packets. The
> > actual payload is accessed by DMA from the host. So the stop/wake queue
> > mechanism based on counting and comparing number of pages sent v.s.
> > number
> > of pages in the ring buffer is wrong. Also, there is a race condition in
> > the stop/wake queue calls, which can stop xmit queue forever.
> >
> > The new stop/wake queue mechanism is based on the actual bytes used by
> > outbound packets in the ring buffer. The check for number of outstanding
> > sends after stop queue prevents the race condition that can cause wake
> > queue happening earlier than stop queue.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> > Reported-by: Long Li <longli@microsoft.com>
> > ---
> > drivers/net/hyperv/netvsc.c | 14 +++++++++++---
> > drivers/net/hyperv/netvsc_drv.c | 24 +-----------------------
> > 2 files changed, 12 insertions(+), 26 deletions(-)
>
> Hi Greg,
>
> Since the netvsc haven't been merged into Dave's tree yet after out of staging,
> could you consider this patch for your tree?
It's in my "to-apply" queue already.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [PATCH] net/hyperv: Fix the stop/wake queue mechanism
2011-12-09 16:35 ` Greg KH
@ 2011-12-09 16:49 ` Haiyang Zhang
0 siblings, 0 replies; 4+ messages in thread
From: Haiyang Zhang @ 2011-12-09 16:49 UTC (permalink / raw)
To: Greg KH
Cc: KY Srinivasan, davem@davemloft.net, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, devel@linuxdriverproject.org
> -----Original Message-----
> From: Greg KH [mailto:gregkh@suse.de]
> Sent: Friday, December 09, 2011 11:35 AM
> To: Haiyang Zhang
> Cc: KY Srinivasan; davem@davemloft.net; linux-kernel@vger.kernel.org;
> netdev@vger.kernel.org; devel@linuxdriverproject.org
> Subject: Re: [PATCH] net/hyperv: Fix the stop/wake queue mechanism
>
> On Fri, Dec 09, 2011 at 04:00:59PM +0000, Haiyang Zhang wrote:
> > > -----Original Message-----
> > > From: Haiyang Zhang [mailto:haiyangz@microsoft.com]
> > > drivers/net/hyperv/netvsc.c | 14 +++++++++++---
> > > drivers/net/hyperv/netvsc_drv.c | 24 +-----------------------
> > > 2 files changed, 12 insertions(+), 26 deletions(-)
> >
> > Hi Greg,
> >
> > Since the netvsc haven't been merged into Dave's tree yet after out of
> staging,
> > could you consider this patch for your tree?
>
> It's in my "to-apply" queue already.
Thank you!
- Haiyang
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2011-12-09 16:49 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-02 19:56 [PATCH] net/hyperv: Fix the stop/wake queue mechanism Haiyang Zhang
2011-12-09 16:00 ` Haiyang Zhang
2011-12-09 16:35 ` Greg KH
2011-12-09 16:49 ` Haiyang Zhang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).