netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: 2.6.24-rc6-mm1 - git-lblnet.patch and networking horkage
       [not found]   ` <200712311213.32515.paul.moore@hp.com>
@ 2007-12-31 20:06     ` Paul Moore
  2007-12-31 21:46       ` James Morris
  0 siblings, 1 reply; 3+ messages in thread
From: Paul Moore @ 2007-12-31 20:06 UTC (permalink / raw)
  To: James Morris; +Cc: Valdis.Kletnieks, akpm, linux-kernel, sds, netdev

On Monday 31 December 2007 12:13:32 pm Paul Moore wrote:
> On Wednesday 26 December 2007 4:52:03 pm James Morris wrote:
> > On Thu, 26 Dec 2007, Paul Moore wrote:
> > > As James said I'm away right now and computer access is limited.
> > > However, I'm stuck in the airport right now and spent some time looking
> > > at the code ... Based on what has been found so far I wonder if the
> > > problem isn't a race but a problem of skb->iif never being initialized
> > > correctly?  To my untrained eye it looks like __netdev_alloc_skb()
> > > should be setting skb->iif (like it does for skb->dev) but it currently
> > > doesn't.
> >
> > ->iif will be zeroed during skb allocation, then set during
> > netif_receive_skb().
>
> I was able to reproduce this bug this morning by running avahi as James did
> and did a little more digging.  I don't have a fix yet, but thought I would
> pass along what I've found in case this triggers a moment of clarity to
> someone out there ...
>
> The skb->iif value appears to be messed up as early as netif_receive_skb(),
> in my case it is set to 196611 (trust me, I don't have that many interfaces
> in my test machine) which causes the ->iif initialization code in
> netif_receive_skb() to be skipped because ->iif is greater than zero.  This
> particular packet is locally generated and locally consumed.
>
> Hopefully I'll have a fix later this afternoon but if someone has a bright
> idea I'd love to hear it ...

[NOTE: I added netdev to this thread to gather some input.  @netdev folks, the 
problem is that the skb->iif field contains garbage in some cases which is 
causing problems for some new SELinux network code.  The exact problem 
probably isn't too important for this discussion, what is important is that 
the skb->iif field contains a non-zero garbage value some of the time on 
incoming packets.]

I'm pretty certain this is an uninitialized value problem now and not a 
use-after-free issue.  The invalid/garbage ->iif value seems to only happen 
on packets that are generated locally and sent back into the stack for local 
consumption, e.g. loopback.  These local packets also need to have been 
cloned at some point, either on the output or input path.

The problem appears to be a skb_clone() function which does not clear the skb 
structure properly and fails to copy the ->iif value from the original skb to 
the cloned skb.  From what I can tell, there are two possible solutions to 
this problem:

 1. Clear all of the cloned skb fields in skb_clone() via memset()
 2. Copy the ->iif field in __copy_skb_header()

I don't have a good enough understanding of all the details involving skb 
memory management to know if option #1 is a Good Idea or not, but option #2 
seems much simpler and solves the problem of garbage in the ->iif field.  My 
preference is to go with option #2 but before I submit a patch does anyone 
think this is the wrong solution?

-- 
paul moore
linux security @ hp

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: 2.6.24-rc6-mm1 - git-lblnet.patch and networking horkage
  2007-12-31 20:06     ` 2.6.24-rc6-mm1 - git-lblnet.patch and networking horkage Paul Moore
@ 2007-12-31 21:46       ` James Morris
  2007-12-31 22:01         ` Paul Moore
  0 siblings, 1 reply; 3+ messages in thread
From: James Morris @ 2007-12-31 21:46 UTC (permalink / raw)
  To: Paul Moore; +Cc: Valdis.Kletnieks, akpm, linux-kernel, sds, netdev

On Mon, 31 Dec 2007, Paul Moore wrote:

> I'm pretty certain this is an uninitialized value problem now and not a 
> use-after-free issue.  The invalid/garbage ->iif value seems to only happen 
> on packets that are generated locally and sent back into the stack for local 
> consumption, e.g. loopback.  These local packets also need to have been 
> cloned at some point, either on the output or input path.

I think we need to find out exactly what's happening, first.

> The problem appears to be a skb_clone() function which does not clear the skb 
> structure properly and fails to copy the ->iif value from the original skb to 
> the cloned skb.  From what I can tell, there are two possible solutions to 
> this problem:
> 
>  1. Clear all of the cloned skb fields in skb_clone() via memset()

Sounds like it's not going to fly for performance reasons in any case.

>  2. Copy the ->iif field in __copy_skb_header()

Seems valid.


- James
-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: 2.6.24-rc6-mm1 - git-lblnet.patch and networking horkage
  2007-12-31 21:46       ` James Morris
@ 2007-12-31 22:01         ` Paul Moore
  0 siblings, 0 replies; 3+ messages in thread
From: Paul Moore @ 2007-12-31 22:01 UTC (permalink / raw)
  To: James Morris; +Cc: Valdis.Kletnieks, akpm, linux-kernel, sds, netdev

On Monday 31 December 2007 4:46:09 pm James Morris wrote:
> On Mon, 31 Dec 2007, Paul Moore wrote:
> > I'm pretty certain this is an uninitialized value problem now and not a
> > use-after-free issue.  The invalid/garbage ->iif value seems to only
> > happen on packets that are generated locally and sent back into the stack
> > for local consumption, e.g. loopback.  These local packets also need to
> > have been cloned at some point, either on the output or input path.
>
> I think we need to find out exactly what's happening, first.

The more I've looked at the code this afternoon, I'm certain this is the case.  
I've also been running a patched kernel (using option #2 from below) and all 
of the skbs coming up the stack have valid ->iif values.  Granted, I haven't 
examined the code from the avahi daemon or the tcl test cases and traced the 
entire code path through the kernel but I _am_ certain that at some point in 
that code path the packet is cloned and due to a problem in skb_clone() 
the ->iif field is not copied correctly causing the problems we have all 
seen.

How much smoke needs to be coming from the gun? :)

> > The problem appears to be a skb_clone() function which does not clear the
> > skb structure properly and fails to copy the ->iif value from the
> > original skb to the cloned skb.  From what I can tell, there are two
> > possible solutions to this problem:
> >
> >  1. Clear all of the cloned skb fields in skb_clone() via memset()
>
> Sounds like it's not going to fly for performance reasons in any case.

That was my gut feeling.  I was also a little unsure where exactly the correct 
placement should be for the memset() call.

> >  2. Copy the ->iif field in __copy_skb_header()
>
> Seems valid.

Okay, I'll stick with this approach.  I'll post a patch backed against 
net-2.6.25 tomorrow as an RFC to see if anyone on netdev has any strong 
feelings.  If no one complains, I'll add it to the lblnet git tree.

-- 
paul moore
linux security @ hp

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2007-12-31 22:01 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <3281504256.5618888@mail.hp.com>
     [not found] ` <Xine.LNX.4.64.0712270848260.19279@us.intercode.com.au>
     [not found]   ` <200712311213.32515.paul.moore@hp.com>
2007-12-31 20:06     ` 2.6.24-rc6-mm1 - git-lblnet.patch and networking horkage Paul Moore
2007-12-31 21:46       ` James Morris
2007-12-31 22:01         ` Paul Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).