netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: do not process device backlog during unregistration
@ 2015-06-27 16:02 Julian Anastasov
  2015-06-28 19:06 ` Eric W. Biederman
  0 siblings, 1 reply; 4+ messages in thread
From: Julian Anastasov @ 2015-06-27 16:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eric W. Biederman, Vittorio G (VittGam)

commit 381c759d9916 ("ipv4: Avoid crashing in ip_error")
fixes a problem where processed packet comes from device
with destroyed inetdev (dev->ip_ptr). This is not expected
because inetdev_destroy is called in NETDEV_UNREGISTER
phase and packets should not be processed after
dev_close_many() and synchronize_net(). But backlog
processing is deactivated later after NETDEV_UNREGISTER_FINAL
phase which allows CPUs to initiate processing for new
packets during and after the NETDEV_UNREGISTER phase.

Fix it by moving flush_backlog after the device driver
is stopped and before the synchronize_net() call. This
allows dev->ip_ptr to be always valid during packet
processing.

Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
 net/core/dev.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Tested by reverting commit 381c759d9916 and using the
test commands provided by Eric Biederman:

http://marc.info/?l=linux-netdev&m=143239231905533&w=2

Problem does not occur every time. SMP may be needed
to reproduce.

diff --git a/net/core/dev.c b/net/core/dev.c
index aa82f9a..67132b3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6027,6 +6027,7 @@ static void rollback_registered_many(struct list_head *head)
 		unlist_netdevice(dev);
 
 		dev->reg_state = NETREG_UNREGISTERING;
+		on_each_cpu(flush_backlog, dev, 1);
 	}
 
 	synchronize_net();
@@ -6650,8 +6651,6 @@ void netdev_run_todo(void)
 
 		dev->reg_state = NETREG_UNREGISTERED;
 
-		on_each_cpu(flush_backlog, dev, 1);
-
 		netdev_wait_allrefs(dev);
 
 		/* paranoia */
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net] net: do not process device backlog during unregistration
  2015-06-27 16:02 [PATCH net] net: do not process device backlog during unregistration Julian Anastasov
@ 2015-06-28 19:06 ` Eric W. Biederman
  2015-06-29 20:33   ` Julian Anastasov
  0 siblings, 1 reply; 4+ messages in thread
From: Eric W. Biederman @ 2015-06-28 19:06 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: David Miller, netdev, Vittorio G (VittGam)

Julian Anastasov <ja@ssi.bg> writes:

> commit 381c759d9916 ("ipv4: Avoid crashing in ip_error")
> fixes a problem where processed packet comes from device
> with destroyed inetdev (dev->ip_ptr). This is not expected
> because inetdev_destroy is called in NETDEV_UNREGISTER
> phase and packets should not be processed after
> dev_close_many() and synchronize_net(). But backlog
> processing is deactivated later after NETDEV_UNREGISTER_FINAL
> phase which allows CPUs to initiate processing for new
> packets during and after the NETDEV_UNREGISTER phase.
>
> Fix it by moving flush_backlog after the device driver
> is stopped and before the synchronize_net() call. This
> allows dev->ip_ptr to be always valid during packet
> processing.

Do you have any evidence that there are other places that care
besides the location in ip_error?

If not this patch should be a canidate for net-next which is currently
closed until after the merge window.

The testing is difficult and I was never able to reproduce the actual
crash.  So I do not know how much evidence should come from testing.

That said this patch actually appears wrong.  Nothing in the veth driver
that I can see will cause packets from the other side to not be
transmitted when only one side is closed.

Further even assuming that dev_close_many actually works and causes no
new packets to be processed there is the larger problem of calling
flush_backlog before syncrhonize_net.   Things happening in rcu
criticial sections (aka enqueuing packets to the backlog) are not
expected to stop until after the end of the critical section
and we are not out of the critical section until after synchronize_net.

So while this work looks interesting, I don't think this is a fix for
a serious bug, and I don't think you have yet created a correct patch.

Eric

> Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
> Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
> ---
>  net/core/dev.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> Tested by reverting commit 381c759d9916 and using the
> test commands provided by Eric Biederman:
>
> http://marc.info/?l=linux-netdev&m=143239231905533&w=2
>
> Problem does not occur every time. SMP may be needed
> to reproduce.
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index aa82f9a..67132b3 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6027,6 +6027,7 @@ static void rollback_registered_many(struct list_head *head)
>  		unlist_netdevice(dev);
>  
>  		dev->reg_state = NETREG_UNREGISTERING;
> +		on_each_cpu(flush_backlog, dev, 1);
>  	}
>  
>  	synchronize_net();
> @@ -6650,8 +6651,6 @@ void netdev_run_todo(void)
>  
>  		dev->reg_state = NETREG_UNREGISTERED;
>  
> -		on_each_cpu(flush_backlog, dev, 1);
> -
>  		netdev_wait_allrefs(dev);
>  
>  		/* paranoia */

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net] net: do not process device backlog during unregistration
  2015-06-28 19:06 ` Eric W. Biederman
@ 2015-06-29 20:33   ` Julian Anastasov
  2015-06-30  6:25     ` Julian Anastasov
  0 siblings, 1 reply; 4+ messages in thread
From: Julian Anastasov @ 2015-06-29 20:33 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: David Miller, netdev, Vittorio G (VittGam)


	Hello,

On Sun, 28 Jun 2015, Eric W. Biederman wrote:

> Julian Anastasov <ja@ssi.bg> writes:
> 
> > commit 381c759d9916 ("ipv4: Avoid crashing in ip_error")
> > fixes a problem where processed packet comes from device
> > with destroyed inetdev (dev->ip_ptr). This is not expected
> > because inetdev_destroy is called in NETDEV_UNREGISTER
> > phase and packets should not be processed after
> > dev_close_many() and synchronize_net(). But backlog
> > processing is deactivated later after NETDEV_UNREGISTER_FINAL
> > phase which allows CPUs to initiate processing for new
> > packets during and after the NETDEV_UNREGISTER phase.
> >
> > Fix it by moving flush_backlog after the device driver
> > is stopped and before the synchronize_net() call. This
> > allows dev->ip_ptr to be always valid during packet
> > processing.
> 
> Do you have any evidence that there are other places that care
> besides the location in ip_error?

	Examples like that are not many because code is full with
in_dev checks:

- fib_compute_spec_dst: BUG_ON(!in_dev);

	This may need additional fix...

	But the fact is that sd->process_queue can contain
many packets and the synchronize_net call can stop maximum one
packet of them. So, we can reach NETDEV_UNREGISTER chain
while at the same time process_backlog continues to deliver
more packets. Soon, we may start to work with freed dev->ip_ptr.
The window is very small: from seeing dev->ip_ptr != NULL
to using the in_device fields. We may need slow hardirq that
can interrupt the packet processing and to crash on such
use-after-free case.

> If not this patch should be a canidate for net-next which is currently
> closed until after the merge window.
> 
> The testing is difficult and I was never able to reproduce the actual
> crash.  So I do not know how much evidence should come from testing.

	I guess, chance for crash can be increased if
large value is used for /proc/sys/net/core/netdev_max_backlog
but it looks difficult to reproduce such races.

> That said this patch actually appears wrong.  Nothing in the veth driver
> that I can see will cause packets from the other side to not be
> transmitted when only one side is closed.

	veth looks correct because veth_dellink is called
with head!=NULL.

> Further even assuming that dev_close_many actually works and causes no
> new packets to be processed there is the larger problem of calling
> flush_backlog before syncrhonize_net.   Things happening in rcu
> criticial sections (aka enqueuing packets to the backlog) are not
> expected to stop until after the end of the critical section
> and we are not out of the critical section until after synchronize_net.

	You are right. So, below is my next attempt,
still not tested. I'm trying to avoid processing for
enqueued packets. Sadly, enqueue_to_backlog() is used
both on rx (eg. NAPI) and on loop (eg. loopback_xmit) but
we are only interested to avoid enqueue_to_backlog for
the loop case. And as the looped packet may not notice
the new device state, we have to execute smp_mb on all
other CPUs and to add check in fast path. I hope I'm
doing it correctly...

> So while this work looks interesting, I don't think this is a fix for
> a serious bug, and I don't think you have yet created a correct patch.
> 
> Eric

RFC patch:

Subject: [PATCHv2 net] net: do not process device backlog during
 unregistration

commit 381c759d9916 ("ipv4: Avoid crashing in ip_error")
fixes a problem where processed packet comes from device
with destroyed inetdev (dev->ip_ptr). This is not expected
because inetdev_destroy is called in NETDEV_UNREGISTER
phase and packets should not be processed after
dev_close_many() and synchronize_net(). But backlog
processing is deactivated later after NETDEV_UNREGISTER_FINAL
phase which allows CPUs to initiate processing for new
packets during and after the NETDEV_UNREGISTER phase.

As new packets can be enqueued during processing we have
to drop them by adding new netif_running check after
memory barrier executed on every CPU.

Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
 net/core/dev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index aa82f9a..28d76307 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3670,6 +3670,8 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
 	rcu_read_lock();
 
 another_round:
+	if (!netif_running(skb->dev))
+		goto drop;
 	skb->skb_iif = skb->dev->ifindex;
 
 	__this_cpu_inc(softnet_data.processed);
@@ -5992,6 +5994,11 @@ static void net_set_todo(struct net_device *dev)
 	dev_net(dev)->dev_unreg_count++;
 }
 
+static void force_netif_barrier_func(void *arg)
+{
+	/* smp_mb in csd_unlock */
+}
+
 static void rollback_registered_many(struct list_head *head)
 {
 	struct net_device *dev, *tmp;
@@ -6029,6 +6036,8 @@ static void rollback_registered_many(struct list_head *head)
 		dev->reg_state = NETREG_UNREGISTERING;
 	}
 
+	/* Make sure other CPUs see the new netif_running state */
+	on_each_cpu(force_netif_barrier_func, NULL, 1);
 	synchronize_net();
 
 	list_for_each_entry(dev, head, unreg_list) {
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net] net: do not process device backlog during unregistration
  2015-06-29 20:33   ` Julian Anastasov
@ 2015-06-30  6:25     ` Julian Anastasov
  0 siblings, 0 replies; 4+ messages in thread
From: Julian Anastasov @ 2015-06-30  6:25 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: David Miller, netdev, Vittorio G (VittGam)


	Hello,

On Mon, 29 Jun 2015, Julian Anastasov wrote:

> RFC patch:
> 
> Subject: [PATCHv2 net] net: do not process device backlog during
>  unregistration

	Ignore this v2, packets in backlog do not hold
dev reference, I'll prepare new version with such check
in enqueue_to_backlog. Also, it is process_backlog that
should start the RCU read section because packets may
escape the last synchronize_net... more fixes are needed.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-06-30  6:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-06-27 16:02 [PATCH net] net: do not process device backlog during unregistration Julian Anastasov
2015-06-28 19:06 ` Eric W. Biederman
2015-06-29 20:33   ` Julian Anastasov
2015-06-30  6:25     ` Julian Anastasov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).