netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected
@ 2014-07-11 13:08 Wei Liu
  2014-07-11 13:15 ` Ian Campbell
  2014-07-11 13:57 ` [Xen-devel] " David Vrabel
  0 siblings, 2 replies; 7+ messages in thread
From: Wei Liu @ 2014-07-11 13:08 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Wei Liu, Philipp Hahn, Ian Campbell

In 88a810def7 ("xen-netback: fix refcnt unbalance for 3.10"), we moved
the ref counting code from xenvif_disconnect to xenvif_free.

It can occur that frontend is disconnected while there's still SKB
stuck in netback's rx_queue in rare case. When netback thread wakes up,
it will try to write to an already unmapped ring, resulting in kernel
oops.

Moving the ref counting back to xenvif_disconnect isn't an option as it
reintroduces an old bug. Further more, writing into a dead frontend's
ring and memory is just wrong. Dropping those SKBs seems to be a good
strategy.

This patch fixes that corner case: introduce a flag to indicate whether
frontend ring is mapped. If the ring is unmapped, just drop those SKBs.

This bug only manifests in 3.10 kernel. Kernel >=3.12 doesn't have it.

Reported-by: Philipp Hahn <hahn@univention.de>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Philipp Hahn <hahn@univention.de>
Tested-by: Philipp Hahn <hahn@univention.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/common.h    |    1 +
 drivers/net/xen-netback/interface.c |    1 +
 drivers/net/xen-netback/netback.c   |   13 +++++++++++++
 3 files changed, 15 insertions(+)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index f2faa77..3418215 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -66,6 +66,7 @@ struct xenvif {
 	/* The shared rings and indexes. */
 	struct xen_netif_tx_back_ring tx;
 	struct xen_netif_rx_back_ring rx;
+	bool ring_mapped;
 
 	/* Frontend feature information. */
 	u8 can_sg:1;
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 540a796..cfdff0d 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -271,6 +271,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	vif->dev = dev;
 	INIT_LIST_HEAD(&vif->schedule_list);
 	INIT_LIST_HEAD(&vif->notify_list);
+	vif->ring_mapped = false;
 
 	vif->credit_bytes = vif->remaining_credit = ~0UL;
 	vif->credit_usec  = 0UL;
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 70b830f..aa3f0de 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -720,6 +720,16 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
 		vif = netdev_priv(skb->dev);
 		nr_frags = skb_shinfo(skb)->nr_frags;
 
+		/* In rare case that frontend is disconnected while
+		 * there's still SKBs stuck in netback internal
+		 * rx_queue, drop these SKBs.
+		 */
+		if (unlikely(!vif->ring_mapped)) {
+			dev_kfree_skb(skb);
+			xenvif_put(vif);
+			continue;
+		}
+
 		sco = (struct skb_cb_overlay *)skb->cb;
 		sco->meta_slots_used = netbk_gop_skb(skb, &npo);
 
@@ -1864,6 +1874,8 @@ static int xen_netbk_kthread(void *data)
 
 void xen_netbk_unmap_frontend_rings(struct xenvif *vif)
 {
+	vif->ring_mapped = false;
+
 	if (vif->tx.sring)
 		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif),
 					vif->tx.sring);
@@ -1899,6 +1911,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 	BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE);
 
 	vif->rx_req_cons_peek = 0;
+	vif->ring_mapped = true;
 
 	return 0;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected
  2014-07-11 13:08 [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected Wei Liu
@ 2014-07-11 13:15 ` Ian Campbell
  2014-07-11 13:28   ` Wei Liu
  2014-07-11 13:57 ` [Xen-devel] " David Vrabel
  1 sibling, 1 reply; 7+ messages in thread
From: Ian Campbell @ 2014-07-11 13:15 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, Philipp Hahn

On Fri, 2014-07-11 at 14:08 +0100, Wei Liu wrote:
> In 88a810def7 ("xen-netback: fix refcnt unbalance for 3.10"), we moved
> the ref counting code from xenvif_disconnect to xenvif_free.

Since this was a 3.10 specific backport that explains why this is also
such?

> It can occur that frontend is disconnected while there's still SKB
> stuck in netback's rx_queue in rare case. When netback thread wakes up,
> it will try to write to an already unmapped ring, resulting in kernel
> oops.
> 
> Moving the ref counting back to xenvif_disconnect isn't an option as it
> reintroduces an old bug. Further more, writing into a dead frontend's
> ring and memory is just wrong. Dropping those SKBs seems to be a good
> strategy.
> 
> This patch fixes that corner case: introduce a flag to indicate whether
> frontend ring is mapped. If the ring is unmapped, just drop those SKBs.
> 
> This bug only manifests in 3.10 kernel. Kernel >=3.12 doesn't have it.

What about 3.11?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected
  2014-07-11 13:15 ` Ian Campbell
@ 2014-07-11 13:28   ` Wei Liu
  2014-07-11 13:30     ` Ian Campbell
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Liu @ 2014-07-11 13:28 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, xen-devel, netdev, Philipp Hahn

On Fri, Jul 11, 2014 at 02:15:29PM +0100, Ian Campbell wrote:
> On Fri, 2014-07-11 at 14:08 +0100, Wei Liu wrote:
> > In 88a810def7 ("xen-netback: fix refcnt unbalance for 3.10"), we moved
> > the ref counting code from xenvif_disconnect to xenvif_free.
> 
> Since this was a 3.10 specific backport that explains why this is also
> such?
> 

Yes.

> > It can occur that frontend is disconnected while there's still SKB
> > stuck in netback's rx_queue in rare case. When netback thread wakes up,
> > it will try to write to an already unmapped ring, resulting in kernel
> > oops.
> > 
> > Moving the ref counting back to xenvif_disconnect isn't an option as it
> > reintroduces an old bug. Further more, writing into a dead frontend's
> > ring and memory is just wrong. Dropping those SKBs seems to be a good
> > strategy.
> > 
> > This patch fixes that corner case: introduce a flag to indicate whether
> > frontend ring is mapped. If the ring is unmapped, just drop those SKBs.
> > 
> > This bug only manifests in 3.10 kernel. Kernel >=3.12 doesn't have it.
> 
> What about 3.11?
> 

3.11 is long dead while 3.10 is LTS kernel. I don't think GregKH accepts
patch for 3.11 anymore.

Wei.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected
  2014-07-11 13:28   ` Wei Liu
@ 2014-07-11 13:30     ` Ian Campbell
  2014-07-11 13:43       ` Wei Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Ian Campbell @ 2014-07-11 13:30 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, Philipp Hahn

On Fri, 2014-07-11 at 14:28 +0100, Wei Liu wrote:
> On Fri, Jul 11, 2014 at 02:15:29PM +0100, Ian Campbell wrote:
> > On Fri, 2014-07-11 at 14:08 +0100, Wei Liu wrote:
> > > In 88a810def7 ("xen-netback: fix refcnt unbalance for 3.10"), we moved
> > > the ref counting code from xenvif_disconnect to xenvif_free.
> > 
> > Since this was a 3.10 specific backport that explains why this is also
> > such?
> > 
> 
> Yes.

OK.

> > > It can occur that frontend is disconnected while there's still SKB
> > > stuck in netback's rx_queue in rare case. When netback thread wakes up,
> > > it will try to write to an already unmapped ring, resulting in kernel
> > > oops.
> > > 
> > > Moving the ref counting back to xenvif_disconnect isn't an option as it
> > > reintroduces an old bug. Further more, writing into a dead frontend's
> > > ring and memory is just wrong. Dropping those SKBs seems to be a good
> > > strategy.
> > > 
> > > This patch fixes that corner case: introduce a flag to indicate whether
> > > frontend ring is mapped. If the ring is unmapped, just drop those SKBs.
> > > 
> > > This bug only manifests in 3.10 kernel. Kernel >=3.12 doesn't have it.
> > 
> > What about 3.11?
> > 
> 
> 3.11 is long dead while 3.10 is LTS kernel. I don't think GregKH accepts
> patch for 3.11 anymore.

Whether Greg would take a patch is orthogonal to whether the bug exists
in 3.11 though. Someone might want to fix their own kernel locally for
example...

Ian.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected
  2014-07-11 13:30     ` Ian Campbell
@ 2014-07-11 13:43       ` Wei Liu
  0 siblings, 0 replies; 7+ messages in thread
From: Wei Liu @ 2014-07-11 13:43 UTC (permalink / raw)
  To: Ian Campbell; +Cc: netdev, Philipp Hahn, Wei Liu, xen-devel

On Fri, Jul 11, 2014 at 02:30:51PM +0100, Ian Campbell wrote:
[...]
> > > > This bug only manifests in 3.10 kernel. Kernel >=3.12 doesn't have it.
> > > 
> > > What about 3.11?
> > > 
> > 
> > 3.11 is long dead while 3.10 is LTS kernel. I don't think GregKH accepts
> > patch for 3.11 anymore.
> 
> Whether Greg would take a patch is orthogonal to whether the bug exists
> in 3.11 though. Someone might want to fix their own kernel locally for
> example...
> 

To be clear, 3.11 doesn't have that fix which introduced this bug,
because GregKH explictly rejected that fix as 3.11 was already dead at
that time. So this patch is not applicable to 3.11 (whether 3.11 has
some other bugs is another problem).

See <20131202171958.GA661@kroah.com> and the thread it's in.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected
  2014-07-11 13:08 [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected Wei Liu
  2014-07-11 13:15 ` Ian Campbell
@ 2014-07-11 13:57 ` David Vrabel
  2014-07-11 16:38   ` Wei Liu
  1 sibling, 1 reply; 7+ messages in thread
From: David Vrabel @ 2014-07-11 13:57 UTC (permalink / raw)
  To: Wei Liu, xen-devel, netdev; +Cc: Ian Campbell, Philipp Hahn

On 11/07/14 14:08, Wei Liu wrote:
> 
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -720,6 +720,16 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  		vif = netdev_priv(skb->dev);
>  		nr_frags = skb_shinfo(skb)->nr_frags;
>  
> +		/* In rare case that frontend is disconnected while
> +		 * there's still SKBs stuck in netback internal
> +		 * rx_queue, drop these SKBs.
> +		 */
> +		if (unlikely(!vif->ring_mapped)) {
> +			dev_kfree_skb(skb);
> +			xenvif_put(vif);
> +			continue;
> +		}

Is this racy?  The ring may be unmapped after the test but before we put
responses on it.

David

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected
  2014-07-11 13:57 ` [Xen-devel] " David Vrabel
@ 2014-07-11 16:38   ` Wei Liu
  0 siblings, 0 replies; 7+ messages in thread
From: Wei Liu @ 2014-07-11 16:38 UTC (permalink / raw)
  To: David Vrabel; +Cc: Wei Liu, xen-devel, netdev, Ian Campbell, Philipp Hahn

On Fri, Jul 11, 2014 at 02:57:03PM +0100, David Vrabel wrote:
> On 11/07/14 14:08, Wei Liu wrote:
> > 
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -720,6 +720,16 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
> >  		vif = netdev_priv(skb->dev);
> >  		nr_frags = skb_shinfo(skb)->nr_frags;
> >  
> > +		/* In rare case that frontend is disconnected while
> > +		 * there's still SKBs stuck in netback internal
> > +		 * rx_queue, drop these SKBs.
> > +		 */
> > +		if (unlikely(!vif->ring_mapped)) {
> > +			dev_kfree_skb(skb);
> > +			xenvif_put(vif);
> > +			continue;
> > +		}
> 
> Is this racy?  The ring may be unmapped after the test but before we put
> responses on it.
> 

Hmm... there's still a window. Ignore this patch for now. I will need to
look further into a ref-counting solution.

Wei.

> David

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-07-11 16:38 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-11 13:08 [PATCH for stable 3.10] xen-netback: drop SKB from internal queue if frontend is disconnected Wei Liu
2014-07-11 13:15 ` Ian Campbell
2014-07-11 13:28   ` Wei Liu
2014-07-11 13:30     ` Ian Campbell
2014-07-11 13:43       ` Wei Liu
2014-07-11 13:57 ` [Xen-devel] " David Vrabel
2014-07-11 16:38   ` Wei Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).