* Fwd: Possible bug in net/ipv4/route.c?
@ 2010-07-02 2:49 YOSHIFUJI Hideaki
2010-07-02 5:38 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: YOSHIFUJI Hideaki @ 2010-07-02 2:49 UTC (permalink / raw)
To: netdev@vger.kernel.org, linux-kernel
Switch to netdev.
--yoshfuji
-------- Original Message --------
Subject: Possible bug in net/ipv4/route.c?
Date: Thu, 1 Jul 2010 16:00:29 -0700
From: Sol Kavy <skavy@ubicom.com>
To: <linux-kernel@vger.kernel.org>
CC: Greg Ren <gren@ubicom.com>, Guojun Jin <gjin@ubicom.com>, Murat Sezgin <msezgin@ubicom.com>, Sener Ilgen <silgen@ubicom.com>
Found Linux: 2.6.28
Arch: Ubicom32 <not yet pushed>
Project: uCLinux based Router
Test: Bit torrent Stress Test
Note: The top of Linus git net/ipv4/route.c appears to have the same issue.
The following is a patch for clearing out IP options area in an input skb during link failure processing. Without this patch, the icmp_send() can result in a call to ip_options_echo() where the common buffer area of the skb is incorrectly interpreted. Depending on the previous use of the skb->cb[], the interpreted option length values can cause stack corruption by copying more than 40 bytes to the output options.
In our case, a driver is using the skb->cb[] area to hold driver specific data. The driver is not zeroing out the area after use. I can see three basic solutions:
1) Drivers are not allowed to use the skb->cb[] area at all. Ubicom should modify the driver to use a different approach.
2) The layer using skb->cb[] should clear this area after use and before handing the skb to another layer. Ubicom should modify the driver to clear the skb->cb[] area before sending it up the line.
3) Any layer that "uses" the skb->cb[] area must clear the area before use. In which case, the proposed patch would fix the problem for the ipv4_link_failure(). I believe that this is the correct fix because I see ip_rcv() clears the skb->cb[] before using it.
Can someone confirm that this is the appropriate fix? If this is documented somewhere, please direct me to the documentation.
Please send email to sol@ubicom.com in addition to posting your response.
Thanks,
Sol Kavy/Murat Sezgin
Ubicom, Inc.
Patch:
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 125ee64..d13805f 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1606,6 +1606,14 @@ static void ipv4_link_failure(struct sk_buff *skb)
{
struct rtable *rt;
+ /*
+ * Since link failure can be called with skbs from many layers (see arp)
+ * the cb area of the skb must be cleared before use. Because the cb area
+ * can be formatted according to the caller layer's cb area format and it may cause
+ * corruptions when it is handled in a different network layer.
+ */
+ memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0);
rt = skb->rtable;
The packet is enqueud by:
do_IRQ()->do_softirq()->__do_softirq()->net_rx_action()->ubi32_eth_napi_poll()->ubi32_eth_receive()->__vlan_hwaccel_rx()->netif_receive_skb()->br_handle_frame()->nf_hook_slow()->br_nf_pre_routing_finish()->br_nfr_pre_routing_finish_bridge()->neight_resolve_output()->__neigh_event_send().
The packet is then dequeued by:
do_IRQ() -> irq_exit() -> do_softirq() -> run_timer_softirq() -> neigh_timer_handler() -> arp_error_report() -> ipv4_link_failure() -> icmp_send() -> ip_options_echo().
Because the Ubicom Ethernet driver overwrites the common buffer area, the enqueued packet contains garbage when casted as an IP options data structure. This results in ip_options_echo() miss reading the option length information and overwriting memory. By clearing the skb->cb[] before processing the icmp_send() against the packet, we ensure that ip_options_echo() does not corrupt memory.
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: Fwd: Possible bug in net/ipv4/route.c?
2010-07-02 2:49 Fwd: Possible bug in net/ipv4/route.c? YOSHIFUJI Hideaki
@ 2010-07-02 5:38 ` Eric Dumazet
2010-07-05 12:06 ` Herbert Xu
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2010-07-02 5:38 UTC (permalink / raw)
To: YOSHIFUJI Hideaki; +Cc: netdev@vger.kernel.org, linux-kernel
Le vendredi 02 juillet 2010 à 11:49 +0900, YOSHIFUJI Hideaki a écrit :
> Switch to netdev.
>
thanks ;)
> --yoshfuji
>
> -------- Original Message --------
> Subject: Possible bug in net/ipv4/route.c?
> Date: Thu, 1 Jul 2010 16:00:29 -0700
> From: Sol Kavy <skavy@ubicom.com>
> To: <linux-kernel@vger.kernel.org>
> CC: Greg Ren <gren@ubicom.com>, Guojun Jin <gjin@ubicom.com>, Murat Sezgin <msezgin@ubicom.com>, Sener Ilgen <silgen@ubicom.com>
>
> Found Linux: 2.6.28
> Arch: Ubicom32 <not yet pushed>
> Project: uCLinux based Router
> Test: Bit torrent Stress Test
>
> Note: The top of Linus git net/ipv4/route.c appears to have the same issue.
>
Please use < 72 char lines
> The following is a patch for clearing out IP options area in an input
> skb during link failure processing. Without this patch, the
> icmp_send() can result in a call to ip_options_echo() where the
> common buffer area of the skb is incorrectly interpreted. Depending on the previous use of the skb->cb[], the interpreted option length values can cause stack corruption by copying more than 40 bytes to the output options.
>
> In our case, a driver is using the skb->cb[] area to hold driver
> specific data. The driver is not zeroing out the area after use. I
> can see three basic solutions:
>
> 1) Drivers are not allowed to use the skb->cb[] area at all. Ubicom
> should modify the driver to use a different approach.
>
> 2) The layer using skb->cb[] should clear this area after use and
> before handing the skb to another layer. Ubicom should modify the
> driver to clear the skb->cb[] area before sending it up the line.
>
This is the right option. If you use one word in cb[], only your driver
knows how to clear it efficiently.
> 3) Any layer that "uses" the skb->cb[] area must clear the area before
> use. In which case, the proposed patch would fix the problem for the
> ipv4_link_failure(). I believe that this is the correct fix because I
> see ip_rcv() clears the skb->cb[] before using it.
>
No : ip_rcv clears() skb->cb when leaving ip_rcv, not entering.
skb allocation clears whole cb[], and each layer is responsible to clear
the part it eventually dirtied.
> Can someone confirm that this is the appropriate fix? If this is
> documented somewhere, please direct me to the documentation.
>
> Please send email to sol@ubicom.com in addition to posting your
> response.
>
> Thanks,
>
> Sol Kavy/Murat Sezgin
> Ubicom, Inc.
>
> Patch:
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 125ee64..d13805f 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -1606,6 +1606,14 @@ static void ipv4_link_failure(struct sk_buff *skb)
> {
> struct rtable *rt;
>
> + /*
> + * Since link failure can be called with skbs from many layers (see arp)
> + * the cb area of the skb must be cleared before use. Because the cb area
> + * can be formatted according to the caller layer's cb area format and it may cause
> + * corruptions when it is handled in a different network layer.
> + */
> + memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
> icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0);
> rt = skb->rtable;
>
> The packet is enqueud by:
> do_IRQ()->do_softirq()->__do_softirq()->net_rx_action()->ubi32_eth_napi_poll()->ubi32_eth_receive()->__vlan_hwaccel_rx()->netif_receive_skb()->br_handle_frame()->nf_hook_slow()->br_nf_pre_routing_finish()->br_nfr_pre_routing_finish_bridge()->neight_resolve_output()->__neigh_event_send().
>
> The packet is then dequeued by:
> do_IRQ() -> irq_exit() -> do_softirq() -> run_timer_softirq() -> neigh_timer_handler() -> arp_error_report() -> ipv4_link_failure() -> icmp_send() -> ip_options_echo().
>
> Because the Ubicom Ethernet driver overwrites the common buffer area, the enqueued packet contains garbage when casted as an IP options data structure. This results in ip_options_echo() miss reading the option length information and overwriting memory. By clearing the skb->cb[] before processing the icmp_send() against the packet, we ensure that ip_options_echo() does not corrupt memory.
>
>
>
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: Fwd: Possible bug in net/ipv4/route.c?
2010-07-02 5:38 ` Eric Dumazet
@ 2010-07-05 12:06 ` Herbert Xu
2010-07-05 12:59 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2010-07-05 12:06 UTC (permalink / raw)
To: Eric Dumazet; +Cc: yoshfuji, netdev, linux-kernel, Stephen Hemminger
Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> 2) The layer using skb->cb[] should clear this area after use and
>> before handing the skb to another layer. Ubicom should modify the
>> driver to clear the skb->cb[] area before sending it up the line.
>>
>
> This is the right option. If you use one word in cb[], only your driver
> knows how to clear it efficiently.
Absolutely not! No protocol stack should rely on an external skb
having a zero cb.
Please fix the bridge instead as it's doing the IP injection.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Fwd: Possible bug in net/ipv4/route.c?
2010-07-05 12:06 ` Herbert Xu
@ 2010-07-05 12:59 ` Eric Dumazet
2010-07-05 13:22 ` Herbert Xu
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2010-07-05 12:59 UTC (permalink / raw)
To: Herbert Xu; +Cc: yoshfuji, netdev, linux-kernel, Stephen Hemminger
Le lundi 05 juillet 2010 à 20:06 +0800, Herbert Xu a écrit :
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> >> 2) The layer using skb->cb[] should clear this area after use and
> >> before handing the skb to another layer. Ubicom should modify the
> >> driver to clear the skb->cb[] area before sending it up the line.
> >>
> >
> > This is the right option. If you use one word in cb[], only your driver
> > knows how to clear it efficiently.
>
> Absolutely not! No protocol stack should rely on an external skb
> having a zero cb.
>
Why do we clear full 48 bytes skb->cb[] in skb_alloc(), if no protocol
stack should rely it being zero ?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Fwd: Possible bug in net/ipv4/route.c?
2010-07-05 12:59 ` Eric Dumazet
@ 2010-07-05 13:22 ` Herbert Xu
2010-07-05 13:34 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2010-07-05 13:22 UTC (permalink / raw)
To: Eric Dumazet; +Cc: yoshfuji, netdev, linux-kernel, Stephen Hemminger
On Mon, Jul 05, 2010 at 02:59:14PM +0200, Eric Dumazet wrote:
>
> Why do we clear full 48 bytes skb->cb[] in skb_alloc(), if no protocol
> stack should rely it being zero ?
Unless a protocol is allocating the skb itself, then the fact
that skb_alloc clears skb->cb is no guarantee that the skb->cb
will be zero.
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Fwd: Possible bug in net/ipv4/route.c?
2010-07-05 13:22 ` Herbert Xu
@ 2010-07-05 13:34 ` Eric Dumazet
2010-07-05 14:42 ` Herbert Xu
2010-07-05 20:07 ` Henrique de Moraes Holschuh
0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2010-07-05 13:34 UTC (permalink / raw)
To: Herbert Xu; +Cc: yoshfuji, netdev, linux-kernel, Stephen Hemminger
Le lundi 05 juillet 2010 à 21:22 +0800, Herbert Xu a écrit :
> On Mon, Jul 05, 2010 at 02:59:14PM +0200, Eric Dumazet wrote:
> >
> > Why do we clear full 48 bytes skb->cb[] in skb_alloc(), if no protocol
> > stack should rely it being zero ?
>
> Unless a protocol is allocating the skb itself, then the fact
> that skb_alloc clears skb->cb is no guarantee that the skb->cb
> will be zero.
I see. We could :
Avoid this memset(skb->cb, 0, sizeof(skb->cb)) in fastpath.
or in debug mode, poison it to trigger errors more often.
Thanks
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Fwd: Possible bug in net/ipv4/route.c?
2010-07-05 13:34 ` Eric Dumazet
@ 2010-07-05 14:42 ` Herbert Xu
2010-07-05 20:07 ` Henrique de Moraes Holschuh
1 sibling, 0 replies; 9+ messages in thread
From: Herbert Xu @ 2010-07-05 14:42 UTC (permalink / raw)
To: Eric Dumazet; +Cc: yoshfuji, netdev, linux-kernel, Stephen Hemminger
On Mon, Jul 05, 2010 at 03:34:58PM +0200, Eric Dumazet wrote:
>
> I see. We could :
>
> Avoid this memset(skb->cb, 0, sizeof(skb->cb)) in fastpath.
Yeah I think this might work. Although skb's are used in all
sorts of esoteric places (such as netlink which may not even be
related to networking) so I can't guarantee that every alloc_skb
caller does the right thing.
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Fwd: Possible bug in net/ipv4/route.c?
2010-07-05 13:34 ` Eric Dumazet
2010-07-05 14:42 ` Herbert Xu
@ 2010-07-05 20:07 ` Henrique de Moraes Holschuh
2010-07-05 20:18 ` Eric Dumazet
1 sibling, 1 reply; 9+ messages in thread
From: Henrique de Moraes Holschuh @ 2010-07-05 20:07 UTC (permalink / raw)
To: Eric Dumazet
Cc: Herbert Xu, yoshfuji, netdev, linux-kernel, Stephen Hemminger
On Mon, 05 Jul 2010, Eric Dumazet wrote:
> Le lundi 05 juillet 2010 à 21:22 +0800, Herbert Xu a écrit :
> > On Mon, Jul 05, 2010 at 02:59:14PM +0200, Eric Dumazet wrote:
> > >
> > > Why do we clear full 48 bytes skb->cb[] in skb_alloc(), if no protocol
> > > stack should rely it being zero ?
> >
> > Unless a protocol is allocating the skb itself, then the fact
> > that skb_alloc clears skb->cb is no guarantee that the skb->cb
> > will be zero.
>
> I see. We could :
>
> Avoid this memset(skb->cb, 0, sizeof(skb->cb)) in fastpath.
Any chances of skb->cb being leaked to userspace or the network, due to
driver bugs or other such oddities?
--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Fwd: Possible bug in net/ipv4/route.c?
2010-07-05 20:07 ` Henrique de Moraes Holschuh
@ 2010-07-05 20:18 ` Eric Dumazet
0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2010-07-05 20:18 UTC (permalink / raw)
To: Henrique de Moraes Holschuh
Cc: Herbert Xu, yoshfuji, netdev, linux-kernel, Stephen Hemminger
Le lundi 05 juillet 2010 à 17:07 -0300, Henrique de Moraes Holschuh a
écrit :
> On Mon, 05 Jul 2010, Eric Dumazet wrote:
> > Le lundi 05 juillet 2010 à 21:22 +0800, Herbert Xu a écrit :
> > > On Mon, Jul 05, 2010 at 02:59:14PM +0200, Eric Dumazet wrote:
> > > >
> > > > Why do we clear full 48 bytes skb->cb[] in skb_alloc(), if no protocol
> > > > stack should rely it being zero ?
> > >
> > > Unless a protocol is allocating the skb itself, then the fact
> > > that skb_alloc clears skb->cb is no guarantee that the skb->cb
> > > will be zero.
> >
> > I see. We could :
> >
> > Avoid this memset(skb->cb, 0, sizeof(skb->cb)) in fastpath.
>
> Any chances of skb->cb being leaked to userspace or the network, due to
> driver bugs or other such oddities?
>
Not "a priori", but a bug is always possible ;)
cb[] is internal use only, should not be sent to network or user land.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2010-07-05 20:18 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-02 2:49 Fwd: Possible bug in net/ipv4/route.c? YOSHIFUJI Hideaki
2010-07-02 5:38 ` Eric Dumazet
2010-07-05 12:06 ` Herbert Xu
2010-07-05 12:59 ` Eric Dumazet
2010-07-05 13:22 ` Herbert Xu
2010-07-05 13:34 ` Eric Dumazet
2010-07-05 14:42 ` Herbert Xu
2010-07-05 20:07 ` Henrique de Moraes Holschuh
2010-07-05 20:18 ` Eric Dumazet
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).