[RFC] Handling device free after a packet is passed to the network stack

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC] Handling device free after a packet is passed to the network stack
@ 2015-03-27 19:57 subashab
  2015-03-28  2:49 ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: subashab @ 2015-03-27 19:57 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet

We have been coming across a couple of scenarios where the device is freed
and the corresponding packets which were already queued up the stack
encounter crashes when they find that contents of skb->dev are no longer
valid.

Specifically, we have observed an instance where a cpu hotplug occurs
along with the network driver module unloading. When the packets are being
queued up the stack using netif_rx_ni from dev_cpu_callback, get_rps_cpus
crashes as it encounters invalid data at skb->dev since it would have been
freed.

We would like to know if the kernel provides some mechanisms to safeguard
against such scenarios.

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] Handling device free after a packet is passed to the network stack
  2015-03-27 19:57 [RFC] Handling device free after a packet is passed to the network stack subashab
@ 2015-03-28  2:49 ` Eric Dumazet
  2015-03-30 14:19   ` Harout Hedeshian
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2015-03-28  2:49 UTC (permalink / raw)
  To: subashab; +Cc: netdev

On Fri, 2015-03-27 at 19:57 +0000, subashab@codeaurora.org wrote:
> We have been coming across a couple of scenarios where the device is freed
> and the corresponding packets which were already queued up the stack
> encounter crashes when they find that contents of skb->dev are no longer
> valid.
> 
> Specifically, we have observed an instance where a cpu hotplug occurs
> along with the network driver module unloading. When the packets are being
> queued up the stack using netif_rx_ni from dev_cpu_callback, get_rps_cpus
> crashes as it encounters invalid data at skb->dev since it would have been
> freed.
> 
> We would like to know if the kernel provides some mechanisms to safeguard
> against such scenarios.

This is supposed to be handled in flush_backlog() (net/core/dev.c)

Maybe you hit some race condition in on_each_cpu().

Really, I do not think we have tested such stress conditions ever.

cpu hotplug was very unstable 2 years ago (last time I tried it, this
was constantly failing outside of networking land)

I would try following patch, although I am not sure what would prevent
dev_cpu_callback() being called too late.

diff --git a/net/core/dev.c b/net/core/dev.c
index a0408d497dae04e7caa145f05c915b058aa2d356..fa74887adfe10709bce55ccf1e34d65c6b7a8fba 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7149,11 +7149,17 @@ static int dev_cpu_callback(struct notifier_block *nfb,
 
 	/* Process offline CPU's input_pkt_queue */
 	while ((skb = __skb_dequeue(&oldsd->process_queue))) {
-		netif_rx_ni(skb);
+		if (skb->dev->reg_state != NETREG_REGISTERED)
+			kfree_skb(skb);
+		else
+			netif_rx_ni(skb);
 		input_queue_head_incr(oldsd);
 	}
 	while ((skb = skb_dequeue(&oldsd->input_pkt_queue))) {
-		netif_rx_ni(skb);
+		if (skb->dev->reg_state != NETREG_REGISTERED)
+			kfree_skb(skb);
+		else
+			netif_rx_ni(skb);
 		input_queue_head_incr(oldsd);
 	}
 

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* RE: [RFC] Handling device free after a packet is passed to the network stack
  2015-03-28  2:49 ` Eric Dumazet
@ 2015-03-30 14:19   ` Harout Hedeshian
  2015-03-30 22:41     ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: Harout Hedeshian @ 2015-03-30 14:19 UTC (permalink / raw)
  To: 'Eric Dumazet', subashab; +Cc: netdev



> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Eric Dumazet
> Sent: Friday, March 27, 2015 8:49 PM
> To: subashab@codeaurora.org
> Cc: netdev@vger.kernel.org
> Subject: Re: [RFC] Handling device free after a packet is passed to the
> network stack
> 
> On Fri, 2015-03-27 at 19:57 +0000, subashab@codeaurora.org wrote:
> > We have been coming across a couple of scenarios where the device is
> > freed and the corresponding packets which were already queued up the
> > stack encounter crashes when they find that contents of skb->dev are
> > no longer valid.
> >
> > Specifically, we have observed an instance where a cpu hotplug occurs
> > along with the network driver module unloading. When the packets are
> > being queued up the stack using netif_rx_ni from dev_cpu_callback,
> > get_rps_cpus crashes as it encounters invalid data at skb->dev since
> > it would have been freed.
> >
> > We would like to know if the kernel provides some mechanisms to
> > safeguard against such scenarios.
> 
> This is supposed to be handled in flush_backlog() (net/core/dev.c)

It looks like this only takes care of packets in the softnet_data queues (or am I missing something). Wouldn't we run into similar issues if the packet were stuck in, for example, TCP OFO socket queue (not sure if anything tries to access skb->dev after that)? Or perhaps some other queue which may later try to dereference skb->dev. 
 

It seems to me that the only way to be absolutely sure that there is no reference on this net_device would be to have the drivers dev_hold() before earch netif_rx/netif_rx_ni/netif_receive_skb; and then later have __kfree_skb() do a dev_put().

Of course, this approach has a number of issues:
- 2 extra operations per packet
- Anything holding onto an SKB can prevent a net_device from deregistering/freeing
- requires skbs to have a valid dev before they can be freed (or we simply null check this field in __kfree_skb())

> Maybe you hit some race condition in on_each_cpu().
> 
> Really, I do not think we have tested such stress conditions ever.
> 
> cpu hotplug was very unstable 2 years ago (last time I tried it, this
> was constantly failing outside of networking land)
> 
> I would try following patch, although I am not sure what would prevent
> dev_cpu_callback() being called too late.
> 
> diff --git a/net/core/dev.c b/net/core/dev.c index
> a0408d497dae04e7caa145f05c915b058aa2d356..fa74887adfe10709bce55ccf1e34d6
> 5c6b7a8fba 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -7149,11 +7149,17 @@ static int dev_cpu_callback(struct
> notifier_block *nfb,
> 
>  	/* Process offline CPU's input_pkt_queue */
>  	while ((skb = __skb_dequeue(&oldsd->process_queue))) {
> -		netif_rx_ni(skb);
> +		if (skb->dev->reg_state != NETREG_REGISTERED)
> +			kfree_skb(skb);
> +		else
> +			netif_rx_ni(skb);
>  		input_queue_head_incr(oldsd);
>  	}
>  	while ((skb = skb_dequeue(&oldsd->input_pkt_queue))) {
> -		netif_rx_ni(skb);
> +		if (skb->dev->reg_state != NETREG_REGISTERED)
> +			kfree_skb(skb);
> +		else
> +			netif_rx_ni(skb);
>  		input_queue_head_incr(oldsd);
>  	}
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in the
> body of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] Handling device free after a packet is passed to the network stack
  2015-03-30 14:19   ` Harout Hedeshian
@ 2015-03-30 22:41     ` Eric Dumazet
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2015-03-30 22:41 UTC (permalink / raw)
  To: Harout Hedeshian; +Cc: subashab, netdev

On Mon, 2015-03-30 at 08:19 -0600, Harout Hedeshian wrote:
> 
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> > On Behalf Of Eric Dumazet
> > Sent: Friday, March 27, 2015 8:49 PM
> > To: subashab@codeaurora.org
> > Cc: netdev@vger.kernel.org
> > Subject: Re: [RFC] Handling device free after a packet is passed to the
> > network stack
> > 
> > On Fri, 2015-03-27 at 19:57 +0000, subashab@codeaurora.org wrote:
> > > We have been coming across a couple of scenarios where the device is
> > > freed and the corresponding packets which were already queued up the
> > > stack encounter crashes when they find that contents of skb->dev are
> > > no longer valid.
> > >
> > > Specifically, we have observed an instance where a cpu hotplug occurs
> > > along with the network driver module unloading. When the packets are
> > > being queued up the stack using netif_rx_ni from dev_cpu_callback,
> > > get_rps_cpus crashes as it encounters invalid data at skb->dev since
> > > it would have been freed.
> > >
> > > We would like to know if the kernel provides some mechanisms to
> > > safeguard against such scenarios.
> > 
> > This is supposed to be handled in flush_backlog() (net/core/dev.c)
> 
> It looks like this only takes care of packets in the softnet_data
> queues (or am I missing something). Wouldn't we run into similar
> issues if the packet were stuck in, for example, TCP OFO socket queue
> (not sure if anything tries to access skb->dev after that)? Or perhaps
> some other queue which may later try to dereference skb->dev. 
>  

Then these packets wont reach dev. Stack already takes care of these
packets. They are in TCP queues, not yet associated with a device.

IP/neighbour stack wont associate a dead device to a TX packet.
> 
> It seems to me that the only way to be absolutely sure that there is
> no reference on this net_device would be to have the drivers
> dev_hold() before earch netif_rx/netif_rx_ni/netif_receive_skb; and
> then later have __kfree_skb() do a dev_put().
> 
> Of course, this approach has a number of issues:
> - 2 extra operations per packet
> - Anything holding onto an SKB can prevent a net_device from
> deregistering/freeing
> - requires skbs to have a valid dev before they can be freed (or we
> simply null check this field in __kfree_skb())


This is absolutely not needed. Existing infrastructure already is
supposed to take care of not holding a RX packet when a device is
dismantled. cpu backlog queues are supposed to be flushed and packet
dropped.

The only problem that my patch is solving is the joint event of cpu
hotplug and device dismantle.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-03-30 22:41 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-03-27 19:57 [RFC] Handling device free after a packet is passed to the network stack subashab
2015-03-28  2:49 ` Eric Dumazet
2015-03-30 14:19   ` Harout Hedeshian
2015-03-30 22:41     ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).