* [RACE] net: in process_backlog
From: Changli Gao @ 2009-11-12 8:50 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David S. Miller, Patrick McHardy, netdev
Dear Stephen:
I don't think this change
http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commitdiff;h=6e583ce5242f32e925dcb198f7123256d0798370
is correct.
local_irq_enable();
break;
}
-
local_irq_enable();
- dev = skb->dev;
-
On an MP system, flush_backlog() can be called here, and after that
skb->dev will be invalid; if we access it, something unexpected may
happen.
netif_receive_skb(skb);
-
- dev_put(dev);
} while (++work < quota && jiffies == start_time);
return work;
--
Regards,
Changli Gao(xiaosuo@gmail.com)
* Re: [RACE] net: in process_backlog
From: Stephen Hemminger @ 2009-11-12 16:57 UTC (permalink / raw)
To: Changli Gao; +Cc: David S. Miller, Patrick McHardy, netdev
On Thu, 12 Nov 2009 16:50:53 +0800
Changli Gao <xiaosuo@gmail.com> wrote:
> Dear Stephen:
>
> I don't think this change
> http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commitdiff;h=6e583ce5242f32e925dcb198f7123256d0798370
> is correct.
>
> local_irq_enable();
> break;
> }
> -
> local_irq_enable();
>
> - dev = skb->dev;
> -
> On an MP system, flush_backlog() can be called here, and after that
> skb->dev will be invalid; if we access it, something unexpected may
> happen.
>
> netif_receive_skb(skb);
> -
> - dev_put(dev);
> } while (++work < quota && jiffies == start_time);
>
> return work;
There are a couple of issues here, but it is not what you thought
you saw.
The receive process is always done in soft IRQ context. The backlog queues
are per-cpu. When a device is deleted, an IPI is sent to all CPUs to
scan their backlog queues. What should protect the skb is the fact that
the network device destruction process waits for an RCU grace period,
so skb->dev points to valid data.
BUT flush_backlog() is run too late in the device destruction process.
It should be moved out of netdev_run_todo(), to right after dev_shutdown().
It would also be wise to add a check of skb->dev->reg_state in
netif_receive_skb() so that such packets are dropped.
--
* Re: [RACE] net: in process_backlog
From: Changli Gao @ 2009-11-12 23:54 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David S. Miller, Patrick McHardy, netdev
On Fri, Nov 13, 2009 at 12:57 AM, Stephen Hemminger
<shemminger@vyatta.com> wrote:
> On Thu, 12 Nov 2009 16:50:53 +0800
> Changli Gao <xiaosuo@gmail.com> wrote:
>
>
> There are a couple of issues here, but it is not what you thought
> you saw.
>
> The receive process is always done in soft IRQ context. The backlog queues
> are per-cpu. When a device is deleted, an IPI is sent to all CPUs to
> scan their backlog queues. What should protect the skb is the fact that
> the network device destruction process waits for an RCU grace period,
> so skb->dev points to valid data.
Yes, if the process waits for an RCU grace period, there will be no
race in that path. But consider another case:
1. flush_backlog().
2. dev_hold(skb->dev); netif_rx(); dev_put(skb->dev);
3. wait_for_refs();
4. free(dev);
5. netif_receive_skb(); // skb->dev no longer exists.
flush_backlog() can't prevent new skbs from being added to the backlog.
If we swap flush_backlog() and wait_for_refs(), this case will be
handled too.
>
> BUT the flush_backlog is run too late in the device destruction process.
> It should be moved out of netdev_run_todo, to right after dev_shutdown().
> Also adding a check for skb->dev->reg_state in netif_receive_skb would
> be wise to drop packets.
>
>
> --
>
--
Regards,
Changli Gao(xiaosuo@gmail.com)
* Re: [RACE] net: in process_backlog
From: Stephen Hemminger @ 2009-11-13 0:11 UTC (permalink / raw)
To: Changli Gao; +Cc: David S. Miller, Patrick McHardy, netdev
On Fri, 13 Nov 2009 07:54:14 +0800
Changli Gao <xiaosuo@gmail.com> wrote:
> On Fri, Nov 13, 2009 at 12:57 AM, Stephen Hemminger
> <shemminger@vyatta.com> wrote:
> > On Thu, 12 Nov 2009 16:50:53 +0800
> > Changli Gao <xiaosuo@gmail.com> wrote:
> >
> >
> > There are a couple of issues here, but it is not what you thought
> > you saw.
> >
> > The receive process is always done in soft IRQ context. The backlog queues
> > are per-cpu. When a device is deleted, an IPI is sent to all CPUs to
> > scan their backlog queues. What should protect the skb is the fact that
> > the network device destruction process waits for an RCU grace period,
> > so skb->dev points to valid data.
>
> Yes, if the process waits for an RCU grace period, there will be no
> race in that path. But consider another case:
> 1. flush_backlog().
After flush_backlog() there should be no more skbs with that device
in the queue; if more are added, the device driver is buggy.
> 2. dev_hold(skb->dev); netif_rx(). dev_put(skb->dev);
There is no dev_hold() in the netif_rx() path.
> 3. wait_for_refs();
> 4. free(dev);
> 5. netif_receive_skb(); // skb->dev no longer exists.
> flush_backlog() can't prevent new skbs from being added to the backlog.
> If we swap flush_backlog() and wait_for_refs(), this case will be
> handled too.
It is still up to the device driver not to add skbs to the queue once it
is stopped.
--
* Re: [RACE] net: in process_backlog
From: Changli Gao @ 2009-11-13 1:17 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David S. Miller, Patrick McHardy, netdev
On Fri, Nov 13, 2009 at 8:11 AM, Stephen Hemminger
<shemminger@vyatta.com> wrote:
> On Fri, 13 Nov 2009 07:54:14 +0800
> Changli Gao <xiaosuo@gmail.com> wrote:
>
>>
>> Yes, if the process waits for an RCU grace period, there will be no
>> race in that path. But consider another case:
>> 1. flush_backlog().
> After flush_backlog() there should be no more skbs with that device
> in the queue; if more are added, the device driver is buggy.
>
>> 2. dev_hold(skb->dev); netif_rx(). dev_put(skb->dev);
> There is no dev_hold() in the netif_rx() path.
>
If the caller of netif_rx() or netif_rx_ni() isn't the device driver
itself, it must look up and hold the dev, then call netif_rx() and put it.
>> 3. wait_for_refs();
>> 4. free(dev);
>> 5. netif_receive_skb(); // skb->dev no longer exists.
>> flush_backlog() can't prevent new skbs from being added to the backlog.
>> If we swap flush_backlog() and wait_for_refs(), this case will be
>> handled too.
>
> It is still up to the device driver not to add skbs to the queue once it
> is stopped.
>
Do you mean that before calling netif_rx() the caller should check
whether the dev is still alive? If so, why not add that check to
netif_rx() or netif_rx_ni()? And could you explain why swapping
flush_backlog() and wait_for_refs() wouldn't work?
--
Regards,
Changli Gao(xiaosuo@gmail.com)