* [RFC PATCH] af_packet: don't to defrag shared skb @ 2012-12-07 18:56 Eric Leblond 2012-12-07 19:10 ` David Miller 2012-12-07 20:31 ` David Miller 0 siblings, 2 replies; 17+ messages in thread From: Eric Leblond @ 2012-12-07 18:56 UTC (permalink / raw) To: netdev; +Cc: Eric Leblond This patch is adding a check on skb before trying to defrag the packet for the hash computation in fanout mode. The goal of this patch is to avoid an kernel crash in pskb_expand_head. It appears that under some specific condition there is a shared skb reaching the defrag code and this lead to a crash due to the following code: if (skb_shared(skb)) BUG(); I've observed this crash under the following condition: 1. a program is listening to an wifi interface (let say wlan0) 2. it is using fanout capture in flow load balancing mode 3. defrag option is on on the fanout socket 4. the interface disconnect (radio down for example) 5. the interface reconnect (radio switched up) 6. once reconnected a single packet is seen with skb->users=2 7. the kernel crash in pskb_expand_head at skbuff.c:1035 [BBB55:744364] [<ffffffff812a2761>] ? __pskb_pull_tail+0x43x0x26f [BB8S5.744395] [<ffffffff812d29Tb>] ? ip_check_defrag+ox3a/0x14a [BBB55.744422] [<ffffffffB1344459>] ? packet_rcv_fanout+ox5e/oxf9 [BBBS5.7444S0] [<ffffffffB12aaS9b>] ? __netif_receive_skb+ox444/ox4f9 [BBB55.T4447B] [<ffffffffB12aa?e1>] ? netif_receive_skb+ox6d/0x?3 [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_deliver_skb+0xbd/0xfa [mac80211] [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_h_data+0x1e0/0x21a [mac80211] [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_handlers+0x3d5/0x480 [mac80211] [BBB55.T4447B] [<ffffffffB12aa?e1>] ? __wake_up [BBB55.T4447B] [<ffffffffB12aa?e1>] ? evdev_eventr+0xc0/0xcf [evdev] Signed-off-by: Eric Leblond <eric@regit.org> --- net/packet/af_packet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index e639645..4b453f8 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -1110,7 +1110,7 @@ static int packet_rcv_fanout(struct sk_buff *skb, struct net_device *dev, switch (f->type) { case PACKET_FANOUT_HASH: default: - if (f->defrag) { + if (f->defrag && !skb_shared(skb)) { skb = ip_check_defrag(skb, IP_DEFRAG_AF_PACKET); if (!skb) return 0; -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] af_packet: don't to defrag shared skb 2012-12-07 18:56 [RFC PATCH] af_packet: don't to defrag shared skb Eric Leblond @ 2012-12-07 19:10 ` David Miller 2012-12-07 20:31 ` David Miller 1 sibling, 0 replies; 17+ messages in thread From: David Miller @ 2012-12-07 19:10 UTC (permalink / raw) To: eric; +Cc: netdev From: Eric Leblond <eric@regit.org> Date: Fri, 7 Dec 2012 19:56:01 +0100 > This patch is adding a check on skb before trying to defrag the > packet for the hash computation in fanout mode. The goal of this > patch is to avoid an kernel crash in pskb_expand_head. > It appears that under some specific condition there is a shared > skb reaching the defrag code and this lead to a crash due to the > following code: > > if (skb_shared(skb)) > BUG(); > > I've observed this crash under the following condition: > 1. a program is listening to an wifi interface (let say wlan0) > 2. it is using fanout capture in flow load balancing mode > 3. defrag option is on on the fanout socket > 4. the interface disconnect (radio down for example) > 5. the interface reconnect (radio switched up) > 6. once reconnected a single packet is seen with skb->users=2 > 7. the kernel crash in pskb_expand_head at skbuff.c:1035 ... > Signed-off-by: Eric Leblond <eric@regit.org> Thanks Eric. I'll try to figure out if we should instead change the wireless code to avoid sending shared SKBs into the input path like that. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] af_packet: don't to defrag shared skb 2012-12-07 18:56 [RFC PATCH] af_packet: don't to defrag shared skb Eric Leblond 2012-12-07 19:10 ` David Miller @ 2012-12-07 20:31 ` David Miller 2012-12-07 20:42 ` Johannes Berg ` (2 more replies) 1 sibling, 3 replies; 17+ messages in thread From: David Miller @ 2012-12-07 20:31 UTC (permalink / raw) To: eric; +Cc: netdev, linux-wireless, johannes, linville From: Eric Leblond <eric@regit.org> Date: Fri, 7 Dec 2012 19:56:01 +0100 Wireless folks, please take a look. The issue is that, under the circumstances listed below, we get SKBs in the AF_PACKET input path that are shared. Given the logic present in ieee80211_deliver_skb() I think the mac80211 code doesn't expect this either. More commentary from me below: > This patch is adding a check on skb before trying to defrag the > packet for the hash computation in fanout mode. The goal of this > patch is to avoid an kernel crash in pskb_expand_head. > It appears that under some specific condition there is a shared > skb reaching the defrag code and this lead to a crash due to the > following code: > > if (skb_shared(skb)) > BUG(); > > I've observed this crash under the following condition: > 1. a program is listening to an wifi interface (let say wlan0) > 2. it is using fanout capture in flow load balancing mode > 3. defrag option is on on the fanout socket > 4. the interface disconnect (radio down for example) > 5. the interface reconnect (radio switched up) > 6. once reconnected a single packet is seen with skb->users=2 > 7. the kernel crash in pskb_expand_head at skbuff.c:1035 > > [BBB55:744364] [<ffffffff812a2761>] ? __pskb_pull_tail+0x43x0x26f > [BB8S5.744395] [<ffffffff812d29Tb>] ? ip_check_defrag+ox3a/0x14a > [BBB55.744422] [<ffffffffB1344459>] ? packet_rcv_fanout+ox5e/oxf9 > [BBBS5.7444S0] [<ffffffffB12aaS9b>] ? __netif_receive_skb+ox444/ox4f9 > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? netif_receive_skb+ox6d/0x?3 > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_deliver_skb+0xbd/0xfa [mac80211] > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_h_data+0x1e0/0x21a [mac80211] > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_handlers+0x3d5/0x480 [mac80211] > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? __wake_up > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? evdev_eventr+0xc0/0xcf [evdev] > > Signed-off-by: Eric Leblond <eric@regit.org> So if we look at ieee80211_deliver_skb(), it has code to deal with unaligned packet headers, wherein it memoves() the data into a better aligned location. But if these SKBs really are skb_shared(), this packet data modification is illegal. I suspect that the assumptions built into this unaligned data handling code, and AF_PACKET, are correct. Meaning that we should never see skb_shared() packets here. We just have a missing skb_copy() somewhere in mac80211, Johannes can you please take a look? Thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] af_packet: don't to defrag shared skb 2012-12-07 20:31 ` David Miller @ 2012-12-07 20:42 ` Johannes Berg 2012-12-07 20:54 ` Eric Leblond 2012-12-07 21:30 ` Johannes Berg 2 siblings, 0 replies; 17+ messages in thread From: Johannes Berg @ 2012-12-07 20:42 UTC (permalink / raw) To: David Miller; +Cc: eric, netdev, linux-wireless, linville On Fri, 2012-12-07 at 15:31 -0500, David Miller wrote: > From: Eric Leblond <eric@regit.org> > Date: Fri, 7 Dec 2012 19:56:01 +0100 > > Wireless folks, please take a look. The issue is that, > under the circumstances listed below, we get SKBs in > the AF_PACKET input path that are shared. > > Given the logic present in ieee80211_deliver_skb() I think > the mac80211 code doesn't expect this either. Indeed, it would certainly not like this, I'll take a look. Eric, what's the driver you're using? I'm wondering whether paged skbs vs. all data in the header would make a difference, hence the question. johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] af_packet: don't to defrag shared skb 2012-12-07 20:31 ` David Miller 2012-12-07 20:42 ` Johannes Berg @ 2012-12-07 20:54 ` Eric Leblond 2012-12-07 21:30 ` Johannes Berg 2 siblings, 0 replies; 17+ messages in thread From: Eric Leblond @ 2012-12-07 20:54 UTC (permalink / raw) To: David Miller; +Cc: netdev, linux-wireless, johannes, linville Hi, On Fri, 2012-12-07 at 15:31 -0500, David Miller wrote: > From: Eric Leblond <eric@regit.org> > Date: Fri, 7 Dec 2012 19:56:01 +0100 > > Wireless folks, please take a look. The issue is that, > under the circumstances listed below, we get SKBs in > the AF_PACKET input path that are shared. > > Given the logic present in ieee80211_deliver_skb() I think > the mac80211 code doesn't expect this either. > > More commentary from me below: > > > This patch is adding a check on skb before trying to defrag the > > packet for the hash computation in fanout mode. The goal of this > > patch is to avoid an kernel crash in pskb_expand_head. > > It appears that under some specific condition there is a shared > > skb reaching the defrag code and this lead to a crash due to the > > following code: > > > > if (skb_shared(skb)) > > BUG(); > > > > I've observed this crash under the following condition: > > 1. a program is listening to an wifi interface (let say wlan0) > > 2. it is using fanout capture in flow load balancing mode > > 3. defrag option is on on the fanout socket > > 4. the interface disconnect (radio down for example) > > 5. the interface reconnect (radio switched up) > > 6. once reconnected a single packet is seen with skb->users=2 > > 7. the kernel crash in pskb_expand_head at skbuff.c:1035 > > > > [BBB55:744364] [<ffffffff812a2761>] ? __pskb_pull_tail+0x43x0x26f > > [BB8S5.744395] [<ffffffff812d29Tb>] ? ip_check_defrag+ox3a/0x14a > > [BBB55.744422] [<ffffffffB1344459>] ? packet_rcv_fanout+ox5e/oxf9 > > [BBBS5.7444S0] [<ffffffffB12aaS9b>] ? __netif_receive_skb+ox444/ox4f9 > > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? netif_receive_skb+ox6d/0x?3 > > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_deliver_skb+0xbd/0xfa [mac80211] > > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_h_data+0x1e0/0x21a [mac80211] > > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_handlers+0x3d5/0x480 [mac80211] > > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? __wake_up > > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? evdev_eventr+0xc0/0xcf [evdev] > > > > Signed-off-by: Eric Leblond <eric@regit.org> > > So if we look at ieee80211_deliver_skb(), it has code to deal with unaligned > packet headers, wherein it memoves() the data into a better aligned location. > > But if these SKBs really are skb_shared(), this packet data > modification is illegal. > > I suspect that the assumptions built into this unaligned data handling > code, and AF_PACKET, are correct. Meaning that we should never see > skb_shared() packets here. We just have a missing skb_copy() > somewhere in mac80211, Johannes can you please take a look? Here's some more info that may help people knowing the code. During my test, I've removed the BUG() and replaced with a printk to have a living kernel. Only one single shared skb was seen for each up event. I've also add another oops in the same code: [BBB55:744364] [<ffffffff812a2761>] ? __pskb_pull_tail+0x43x0x26f [BB8S5.744395] [<ffffffff812d29Tb>] ? ip_check_defrag+ox3a/0x14a [BBB55.744422] [<ffffffffB1344459>] ? packet_rcv_fanout+ox5e/oxf9 [BBBS5.7444S0] [<ffffffffB12aaS9b>] ? __netif_receive_skb+ox444/ox4f9 [BBB55.T4447B] [<ffffffffB12aa?e1>] ? netif_receive_skb+ox6d/0x?3 [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_deliver_skb+0xbd/0xfa [mac80211] [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_h_data+0x1e0/0x21a [mac80211] [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_handlers+0x3d5/0x480 [mac80211] [BBB55.T4447B] [<ffffffffB12aa?e1>] ? _raw_spin_lock_irqsave+0x14/0x35 [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_prepare_and_rx_handle+0x5a3/0x5db [mac80211] ... [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ttwu_dowakeup+0x2d Picture of the oops available here: http://home.regit.org/~regit/wireless-oops.jpg BR, -- Eric Leblond <eric@regit.org> Blog: https://home.regit.org/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] af_packet: don't to defrag shared skb 2012-12-07 20:31 ` David Miller 2012-12-07 20:42 ` Johannes Berg 2012-12-07 20:54 ` Eric Leblond @ 2012-12-07 21:30 ` Johannes Berg 2012-12-07 21:41 ` Johannes Berg 2012-12-07 21:46 ` [RFC PATCH] af_packet: don't to defrag shared skb Eric Leblond 2 siblings, 2 replies; 17+ messages in thread From: Johannes Berg @ 2012-12-07 21:30 UTC (permalink / raw) To: David Miller; +Cc: eric, netdev, linux-wireless, linville > Wireless folks, please take a look. The issue is that, > under the circumstances listed below, we get SKBs in > the AF_PACKET input path that are shared. Ok so I took a look, but I can't see where the wireless stack is going wrong. > Given the logic present in ieee80211_deliver_skb() I think > the mac80211 code doesn't expect this either. This is correct, but the driver should never give us a shared skb. From the other mail it seems Eric is using iwlwifi, which is definitely not creating shared SKBs. Nothing in mac80211 creates them either. > > I've observed this crash under the following condition: > > 1. a program is listening to an wifi interface (let say wlan0) > > 2. it is using fanout capture in flow load balancing mode > > 3. defrag option is on on the fanout socket How do you set this up, and what does it do? I'd like to try to reproduce this. > > 4. the interface disconnect (radio down for example) > > 5. the interface reconnect (radio switched up) > > 6. once reconnected a single packet is seen with skb->users=2 That's interesting. A single one seems odd. I might have expected two, but not one. Well, since you removed the crash ... I guess I'll have to believe that there's just one and the second one doesn't show up because we crashed before :-) > So if we look at ieee80211_deliver_skb(), it has code to deal with unaligned > packet headers, wherein it memoves() the data into a better aligned location. > > But if these SKBs really are skb_shared(), this packet data > modification is illegal. > > I suspect that the assumptions built into this unaligned data handling > code, and AF_PACKET, are correct. Meaning that we should never see > skb_shared() packets here. We just have a missing skb_copy() > somewhere in mac80211, Johannes can you please take a look? My first theory was related to multiple virtual interfaces, but Eric didn't say he was running that, but we use skb_copy() for that in ieee80211_prepare_and_rx_handle(). That's not necessarily the most efficient (another reason for drivers to use paged RX here) but clearly not causing the issue. The only other theory I can come up with right now is that the skb_get() happens in deliver_skb via __netif_receive_skb. Keeping in mind that wpa_supplicant might have another packet socket open for authentication packets, that seems like a possibility. I'll test it once I figure out how to do this "defrag" option you speak of :) johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] af_packet: don't to defrag shared skb 2012-12-07 21:30 ` Johannes Berg @ 2012-12-07 21:41 ` Johannes Berg [not found] ` <1354916502.9124.18.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org> 2012-12-07 21:46 ` [RFC PATCH] af_packet: don't to defrag shared skb Eric Leblond 1 sibling, 1 reply; 17+ messages in thread From: Johannes Berg @ 2012-12-07 21:41 UTC (permalink / raw) To: David Miller; +Cc: eric, netdev, linux-wireless, linville On Fri, 2012-12-07 at 22:30 +0100, Johannes Berg wrote: > The only other theory I can come up with right now is that the skb_get() > happens in deliver_skb via __netif_receive_skb. Keeping in mind that > wpa_supplicant might have another packet socket open for authentication > packets, that seems like a possibility. I'll test it once I figure out > how to do this "defrag" option you speak of :) Hmm now I'm venturing into the unknown (for me) and realm of speculation... wpa_supplicant opens a packet socket for ETH_P_EAPOL, which indirectly eventually calls dev_add_pack(). But if you do the same for another socket, you'll get the same again, and then deliver_skb() will deliver only a refcounted packet to the prot_hook->func(). This seems like it could very well cause the problem? johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <1354916502.9124.18.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org>]
* Re: [RFC PATCH] af_packet: don't to defrag shared skb [not found] ` <1354916502.9124.18.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org> @ 2012-12-07 22:12 ` Johannes Berg 2012-12-07 22:23 ` Johannes Berg 0 siblings, 1 reply; 17+ messages in thread From: Johannes Berg @ 2012-12-07 22:12 UTC (permalink / raw) To: David Miller Cc: eric-EVVnsjFE0OfYtjvyW6yDsg, netdev-u79uwXL29TY76Z2rM5mHXA, linux-wireless-u79uwXL29TY76Z2rM5mHXA, linville-2XuSBdqkA4R54TAoqtyWWQ, Eric Dumazet On Fri, 2012-12-07 at 22:41 +0100, Johannes Berg wrote: > wpa_supplicant opens a packet socket for ETH_P_EAPOL, which indirectly > eventually calls dev_add_pack(). But if you do the same for another > socket, you'll get the same again, and then deliver_skb() will deliver > only a refcounted packet to the prot_hook->func(). > > This seems like it could very well cause the problem? Ok I couldn't reproduce it because suricata didn't work this way, but I did try using tcpdump (without the fanout) and deliver_skb() *is* called twice for each packet as I thought. It thus seems the problem is entirely in af_packet itself. It was changed a bit by Eric Dumazet in bc416d9768 but the original code goes back to the original defrag support in 7736d33f4, as far as I can tell. aec27311c changed the code to not do skb_clone(), but it seems the skb_share_check() should be before the pskb_pull(). Well, it seems ip_check_defrag() should simply not modify the SKB before it unshares it ... like this maybe: diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index 448e685..8d5cc75 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -707,28 +707,27 @@ EXPORT_SYMBOL(ip_defrag); struct sk_buff *ip_check_defrag(struct sk_buff *skb, u32 user) { - const struct iphdr *iph; + struct iphdr iph; u32 len; if (skb->protocol != htons(ETH_P_IP)) return skb; - if (!pskb_may_pull(skb, sizeof(struct iphdr))) + if (!skb_copy_bits(skb, 0, &iph, sizeof(iph))) return skb; - iph = ip_hdr(skb); - if (iph->ihl < 5 || iph->version != 4) + if (iph.ihl < 5 || iph.version != 4) return skb; - if (!pskb_may_pull(skb, iph->ihl*4)) - return skb; - iph = ip_hdr(skb); - len = ntohs(iph->tot_len); - if (skb->len < len || len < (iph->ihl * 4)) + + len = ntohs(iph.tot_len); + if (skb->len < len || len < (iph.ihl * 4)) return skb; - if (ip_is_fragment(ip_hdr(skb))) { + if (ip_is_fragment(&iph)) { skb = skb_share_check(skb, GFP_ATOMIC); if (skb) { + if (!pskb_may_pull(skb, iph.ihl*4)) + return skb; if (pskb_trim_rcsum(skb, len)) return skb; memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] af_packet: don't to defrag shared skb 2012-12-07 22:12 ` Johannes Berg @ 2012-12-07 22:23 ` Johannes Berg [not found] ` <1354919017.9124.33.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org> 0 siblings, 1 reply; 17+ messages in thread From: Johannes Berg @ 2012-12-07 22:23 UTC (permalink / raw) To: David Miller; +Cc: eric, netdev, linux-wireless, linville, Eric Dumazet Oh I should say ... IP is like black magic to me ;-) > - if (!pskb_may_pull(skb, sizeof(struct iphdr))) > + if (!skb_copy_bits(skb, 0, &iph, sizeof(iph))) > return skb; > > - iph = ip_hdr(skb); > - if (iph->ihl < 5 || iph->version != 4) > + if (iph.ihl < 5 || iph.version != 4) > return skb; > - if (!pskb_may_pull(skb, iph->ihl*4)) > - return skb; > - iph = ip_hdr(skb); > - len = ntohs(iph->tot_len); > - if (skb->len < len || len < (iph->ihl * 4)) > + > + len = ntohs(iph.tot_len); > + if (skb->len < len || len < (iph.ihl * 4)) > return skb; > > - if (ip_is_fragment(ip_hdr(skb))) { > + if (ip_is_fragment(&iph)) { > skb = skb_share_check(skb, GFP_ATOMIC); > if (skb) { > + if (!pskb_may_pull(skb, iph.ihl*4)) > + return skb; I moved this here but I have no idea what it does. Asking if we can pull this here seems a bit pointless, and in the previous place it seemed similarly pointless since we only use the static part of the IP header and don't look at any options... So to me it seems this pskb_may_pull() could just be removed, and that most likely means I'm messing with code I don't understand :-) johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <1354919017.9124.33.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org>]
* Re: [RFC PATCH] af_packet: don't to defrag shared skb [not found] ` <1354919017.9124.33.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org> @ 2012-12-10 9:29 ` Johannes Berg 2012-12-10 9:41 ` [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing Johannes Berg 0 siblings, 1 reply; 17+ messages in thread From: Johannes Berg @ 2012-12-10 9:29 UTC (permalink / raw) To: David Miller Cc: eric-EVVnsjFE0OfYtjvyW6yDsg, netdev-u79uwXL29TY76Z2rM5mHXA, linux-wireless-u79uwXL29TY76Z2rM5mHXA, linville-2XuSBdqkA4R54TAoqtyWWQ, Eric Dumazet On Fri, 2012-12-07 at 23:23 +0100, Johannes Berg wrote: > Oh I should say ... IP is like black magic to me ;-) > > > - if (!pskb_may_pull(skb, sizeof(struct iphdr))) > > + if (!skb_copy_bits(skb, 0, &iph, sizeof(iph))) > > return skb; > > > > - iph = ip_hdr(skb); > > - if (iph->ihl < 5 || iph->version != 4) > > + if (iph.ihl < 5 || iph.version != 4) > > return skb; > > - if (!pskb_may_pull(skb, iph->ihl*4)) > > - return skb; > > - iph = ip_hdr(skb); > > - len = ntohs(iph->tot_len); > > - if (skb->len < len || len < (iph->ihl * 4)) > > + > > + len = ntohs(iph.tot_len); > > + if (skb->len < len || len < (iph.ihl * 4)) > > return skb; > > > > - if (ip_is_fragment(ip_hdr(skb))) { > > + if (ip_is_fragment(&iph)) { > > skb = skb_share_check(skb, GFP_ATOMIC); > > if (skb) { > > + if (!pskb_may_pull(skb, iph.ihl*4)) > > + return skb; > > I moved this here but I have no idea what it does. Well, ok, looking further it does seem kinda obvious -- ip_defrag() assumes the IP header is (fully?) present in the skb header, so that's what this does. Eric (Leblond), could you test this patch? johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing 2012-12-10 9:29 ` Johannes Berg @ 2012-12-10 9:41 ` Johannes Berg 2012-12-10 11:02 ` Eric Leblond 2012-12-10 18:41 ` David Miller 0 siblings, 2 replies; 17+ messages in thread From: Johannes Berg @ 2012-12-10 9:41 UTC (permalink / raw) To: David Miller; +Cc: eric, netdev, linux-wireless, linville, Eric Dumazet From: Johannes Berg <johannes.berg@intel.com> ip_check_defrag() might be called from af_packet within the RX path where shared SKBs are used, so it must not modify the input SKB before it has unshared it for defragmentation. Use skb_copy_bits() to get the IP header and only pull in everything later. The same is true for the other caller in macvlan as it is called from dev->rx_handler which can also get a shared SKB. Reported-by: Eric Leblond <eric@regit.org> Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> --- For some versions of the kernel, this code goes into af_packet.c net/ipv4/ip_fragment.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index 448e685..8d5cc75 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -707,28 +707,27 @@ EXPORT_SYMBOL(ip_defrag); struct sk_buff *ip_check_defrag(struct sk_buff *skb, u32 user) { - const struct iphdr *iph; + struct iphdr iph; u32 len; if (skb->protocol != htons(ETH_P_IP)) return skb; - if (!pskb_may_pull(skb, sizeof(struct iphdr))) + if (!skb_copy_bits(skb, 0, &iph, sizeof(iph))) return skb; - iph = ip_hdr(skb); - if (iph->ihl < 5 || iph->version != 4) + if (iph.ihl < 5 || iph.version != 4) return skb; - if (!pskb_may_pull(skb, iph->ihl*4)) - return skb; - iph = ip_hdr(skb); - len = ntohs(iph->tot_len); - if (skb->len < len || len < (iph->ihl * 4)) + + len = ntohs(iph.tot_len); + if (skb->len < len || len < (iph.ihl * 4)) return skb; - if (ip_is_fragment(ip_hdr(skb))) { + if (ip_is_fragment(&iph)) { skb = skb_share_check(skb, GFP_ATOMIC); if (skb) { + if (!pskb_may_pull(skb, iph.ihl*4)) + return skb; if (pskb_trim_rcsum(skb, len)) return skb; memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); -- 1.8.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing 2012-12-10 9:41 ` [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing Johannes Berg @ 2012-12-10 11:02 ` Eric Leblond 2012-12-10 18:41 ` David Miller 1 sibling, 0 replies; 17+ messages in thread From: Eric Leblond @ 2012-12-10 11:02 UTC (permalink / raw) To: Johannes Berg Cc: David Miller, netdev, linux-wireless, linville, Eric Dumazet Hello, On Mon, 2012-12-10 at 10:41 +0100, Johannes Berg wrote: > From: Johannes Berg <johannes.berg@intel.com> > > ip_check_defrag() might be called from af_packet within the > RX path where shared SKBs are used, so it must not modify > the input SKB before it has unshared it for defragmentation. > Use skb_copy_bits() to get the IP header and only pull in > everything later. > > The same is true for the other caller in macvlan as it is > called from dev->rx_handler which can also get a shared SKB. I've applied the patch and built a new kernel. I did not manage to get it crashed when using the two techniques (suspend to ram and down/up interface) that were working well to crash kernel without the patch. BR, > Reported-by: Eric Leblond <eric@regit.org> > Cc: stable@vger.kernel.org > Signed-off-by: Johannes Berg <johannes.berg@intel.com> > --- > For some versions of the kernel, this code goes into af_packet.c > > net/ipv4/ip_fragment.c | 19 +++++++++---------- > 1 file changed, 9 insertions(+), 10 deletions(-) > > diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c > index 448e685..8d5cc75 100644 > --- a/net/ipv4/ip_fragment.c > +++ b/net/ipv4/ip_fragment.c > @@ -707,28 +707,27 @@ EXPORT_SYMBOL(ip_defrag); > > struct sk_buff *ip_check_defrag(struct sk_buff *skb, u32 user) > { > - const struct iphdr *iph; > + struct iphdr iph; > u32 len; > > if (skb->protocol != htons(ETH_P_IP)) > return skb; > > - if (!pskb_may_pull(skb, sizeof(struct iphdr))) > + if (!skb_copy_bits(skb, 0, &iph, sizeof(iph))) > return skb; > > - iph = ip_hdr(skb); > - if (iph->ihl < 5 || iph->version != 4) > + if (iph.ihl < 5 || iph.version != 4) > return skb; > - if (!pskb_may_pull(skb, iph->ihl*4)) > - return skb; > - iph = ip_hdr(skb); > - len = ntohs(iph->tot_len); > - if (skb->len < len || len < (iph->ihl * 4)) > + > + len = ntohs(iph.tot_len); > + if (skb->len < len || len < (iph.ihl * 4)) > return skb; > > - if (ip_is_fragment(ip_hdr(skb))) { > + if (ip_is_fragment(&iph)) { > skb = skb_share_check(skb, GFP_ATOMIC); > if (skb) { > + if (!pskb_may_pull(skb, iph.ihl*4)) > + return skb; > if (pskb_trim_rcsum(skb, len)) > return skb; > memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); -- Eric Leblond <eric@regit.org> Blog: https://home.regit.org/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing 2012-12-10 9:41 ` [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing Johannes Berg 2012-12-10 11:02 ` Eric Leblond @ 2012-12-10 18:41 ` David Miller 2012-12-10 18:45 ` Johannes Berg 1 sibling, 1 reply; 17+ messages in thread From: David Miller @ 2012-12-10 18:41 UTC (permalink / raw) To: johannes; +Cc: eric, netdev, linux-wireless, linville, eric.dumazet From: Johannes Berg <johannes@sipsolutions.net> Date: Mon, 10 Dec 2012 10:41:06 +0100 > From: Johannes Berg <johannes.berg@intel.com> > > ip_check_defrag() might be called from af_packet within the > RX path where shared SKBs are used, so it must not modify > the input SKB before it has unshared it for defragmentation. > Use skb_copy_bits() to get the IP header and only pull in > everything later. > > The same is true for the other caller in macvlan as it is > called from dev->rx_handler which can also get a shared SKB. > > Reported-by: Eric Leblond <eric@regit.org> > Cc: stable@vger.kernel.org > Signed-off-by: Johannes Berg <johannes.berg@intel.com> > --- > For some versions of the kernel, this code goes into af_packet.c So the bug is that ip_check_defrag() has a precondition which is met properly by all callers except AF_PACKET. If this is the case, remind me why are we changing ip_check_defrag() rather than the violator of the precondition? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing 2012-12-10 18:41 ` David Miller @ 2012-12-10 18:45 ` Johannes Berg [not found] ` <1355165152.8083.4.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org> 0 siblings, 1 reply; 17+ messages in thread From: Johannes Berg @ 2012-12-10 18:45 UTC (permalink / raw) To: David Miller; +Cc: eric, netdev, linux-wireless, linville, eric.dumazet On Mon, 2012-12-10 at 13:41 -0500, David Miller wrote: > From: Johannes Berg <johannes@sipsolutions.net> > Date: Mon, 10 Dec 2012 10:41:06 +0100 > > > From: Johannes Berg <johannes.berg@intel.com> > > > > ip_check_defrag() might be called from af_packet within the > > RX path where shared SKBs are used, so it must not modify > > the input SKB before it has unshared it for defragmentation. > > Use skb_copy_bits() to get the IP header and only pull in > > everything later. > > > > The same is true for the other caller in macvlan as it is > > called from dev->rx_handler which can also get a shared SKB. > > > > Reported-by: Eric Leblond <eric@regit.org> > > Cc: stable@vger.kernel.org > > Signed-off-by: Johannes Berg <johannes.berg@intel.com> > > --- > > For some versions of the kernel, this code goes into af_packet.c > > So the bug is that ip_check_defrag() has a precondition which is met > properly by all callers except AF_PACKET. > > If this is the case, remind me why are we changing ip_check_defrag() > rather than the violator of the precondition? I don't think this is the case. If you're referring to my note about af_packet: the kernels where this goes into af_packet.c are the kernels that don't even have ip_check_defrag() because macvlan didn't exist/didn't have ip defrag support and af_packet had this code there -- see commit bc416d9768a. If you're not referring to my note about af_packet: both callers (there are only two) of ip_check_defrag() have this bug as far as I can tell because they're both in the part of the RX path where shared SKBs might happen. johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <1355165152.8083.4.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org>]
* Re: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing [not found] ` <1355165152.8083.4.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org> @ 2012-12-10 18:50 ` David Miller 0 siblings, 0 replies; 17+ messages in thread From: David Miller @ 2012-12-10 18:50 UTC (permalink / raw) To: johannes-cdvu00un1VgdHxzADdlk8Q Cc: eric-EVVnsjFE0OfYtjvyW6yDsg, netdev-u79uwXL29TY76Z2rM5mHXA, linux-wireless-u79uwXL29TY76Z2rM5mHXA, linville-2XuSBdqkA4R54TAoqtyWWQ, eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w From: Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org> Date: Mon, 10 Dec 2012 19:45:52 +0100 > On Mon, 2012-12-10 at 13:41 -0500, David Miller wrote: >> So the bug is that ip_check_defrag() has a precondition which is met >> properly by all callers except AF_PACKET. >> >> If this is the case, remind me why are we changing ip_check_defrag() >> rather than the violator of the precondition? > > I don't think this is the case. > > If you're referring to my note about af_packet: the kernels where this > goes into af_packet.c are the kernels that don't even have > ip_check_defrag() because macvlan didn't exist/didn't have ip defrag > support and af_packet had this code there -- see commit bc416d9768a. > > If you're not referring to my note about af_packet: both callers (there > are only two) of ip_check_defrag() have this bug as far as I can tell > because they're both in the part of the RX path where shared SKBs might > happen. You're right, I misinterpreted what's happening here. My misunderstanding was that this was a situation where normal IPV4 input processing makes sure the SKB is unshared, and we had special code paths that didn't make sure that was the case. Rather, here, we have a special entrypoint for macvlan and AF_PACKET which is supposed to take care of such issues since it is known to execute in a different kind of environment. I'm pretty sure I'll apply this, after I check a few more things, thanks Johannes! -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] af_packet: don't to defrag shared skb 2012-12-07 21:30 ` Johannes Berg 2012-12-07 21:41 ` Johannes Berg @ 2012-12-07 21:46 ` Eric Leblond 2012-12-07 21:56 ` Johannes Berg 1 sibling, 1 reply; 17+ messages in thread From: Eric Leblond @ 2012-12-07 21:46 UTC (permalink / raw) To: Johannes Berg; +Cc: David Miller, netdev, linux-wireless, linville Hi, On Fri, 2012-12-07 at 22:30 +0100, Johannes Berg wrote: > > > Wireless folks, please take a look. The issue is that, > > under the circumstances listed below, we get SKBs in > > the AF_PACKET input path that are shared. > > Ok so I took a look, but I can't see where the wireless stack is going > wrong. > > > Given the logic present in ieee80211_deliver_skb() I think > > the mac80211 code doesn't expect this either. > > This is correct, but the driver should never give us a shared skb. From > the other mail it seems Eric is using iwlwifi, which is definitely not > creating shared SKBs. Nothing in mac80211 creates them either. > > > > I've observed this crash under the following condition: > > > 1. a program is listening to an wifi interface (let say wlan0) > > > 2. it is using fanout capture in flow load balancing mode > > > 3. defrag option is on on the fanout socket > > How do you set this up, and what does it do? I'd like to try to > reproduce this. > > > 4. the interface disconnect (radio down for example) > > > 5. the interface reconnect (radio switched up) > > > 6. once reconnected a single packet is seen with skb->users=2 > > That's interesting. A single one seems odd. I might have expected two, > but not one. Well, since you removed the crash ... I guess I'll have to > believe that there's just one and the second one doesn't show up because > we crashed before :-) It was the case with initial code but I've suppressed the BUG() call and replaced it with a return ;) > > So if we look at ieee80211_deliver_skb(), it has code to deal with unaligned > > packet headers, wherein it memoves() the data into a better aligned location. > > > > But if these SKBs really are skb_shared(), this packet data > > modification is illegal. > > > > I suspect that the assumptions built into this unaligned data handling > > code, and AF_PACKET, are correct. Meaning that we should never see > > skb_shared() packets here. We just have a missing skb_copy() > > somewhere in mac80211, Johannes can you please take a look? > > My first theory was related to multiple virtual interfaces, but Eric > didn't say he was running that, but we use skb_copy() for that in > ieee80211_prepare_and_rx_handle(). That's not necessarily the most > efficient (another reason for drivers to use paged RX here) but clearly > not causing the issue. > > The only other theory I can come up with right now is that the skb_get() > happens in deliver_skb via __netif_receive_skb. Keeping in mind that > wpa_supplicant might have another packet socket open for authentication > packets, that seems like a possibility. I'll test it once I figure out > how to do this "defrag" option you speak of :) I've no simple code available to test it. I've add the problem when running suricata. Maybe you could use it. It is packaged in most distribution now. To enable packet fanout. Modify default /etc/suricata/suricata.yaml to have something like: af-packet: - interface: wlan0 # Number of receive threads (>1 will enable experimental flow pinned # runmode) threads: 3 Start it with: suricata --af-packet=wlan0 Then get wlan0 interface down and up. After a few seconds, the crash will occur. It is a bit complicated for a simple test case. I can cook a little example code if you want. BR, -- Eric Leblond <eric@regit.org> Blog: https://home.regit.org/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] af_packet: don't to defrag shared skb 2012-12-07 21:46 ` [RFC PATCH] af_packet: don't to defrag shared skb Eric Leblond @ 2012-12-07 21:56 ` Johannes Berg 0 siblings, 0 replies; 17+ messages in thread From: Johannes Berg @ 2012-12-07 21:56 UTC (permalink / raw) To: Eric Leblond Cc: David Miller, netdev-u79uwXL29TY76Z2rM5mHXA, linux-wireless-u79uwXL29TY76Z2rM5mHXA, linville-2XuSBdqkA4R54TAoqtyWWQ HI, > > That's interesting. A single one seems odd. I might have expected two, > > but not one. Well, since you removed the crash ... I guess I'll have to > > believe that there's just one and the second one doesn't show up because > > we crashed before :-) > > It was the case with initial code but I've suppressed the BUG() call and > replaced it with a return ;) Right. Well, does it actually still work then? I wonder if then you don't see a second packet because the first one doesn't make it through ... I'm thinking that this is just an internal af_packet problem with multiple listeners. > I've no simple code available to test it. I've add the problem when > running suricata. Maybe you could use it. It is packaged in most > distribution now. > To enable packet fanout. Modify default /etc/suricata/suricata.yaml to > have something like: > af-packet: > - interface: wlan0 > # Number of receive threads (>1 will enable experimental flow pinned > # runmode) > threads: 3 That'll should do, thanks. johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2012-12-10 18:50 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-12-07 18:56 [RFC PATCH] af_packet: don't to defrag shared skb Eric Leblond 2012-12-07 19:10 ` David Miller 2012-12-07 20:31 ` David Miller 2012-12-07 20:42 ` Johannes Berg 2012-12-07 20:54 ` Eric Leblond 2012-12-07 21:30 ` Johannes Berg 2012-12-07 21:41 ` Johannes Berg [not found] ` <1354916502.9124.18.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org> 2012-12-07 22:12 ` Johannes Berg 2012-12-07 22:23 ` Johannes Berg [not found] ` <1354919017.9124.33.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org> 2012-12-10 9:29 ` Johannes Berg 2012-12-10 9:41 ` [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing Johannes Berg 2012-12-10 11:02 ` Eric Leblond 2012-12-10 18:41 ` David Miller 2012-12-10 18:45 ` Johannes Berg [not found] ` <1355165152.8083.4.camel-8Nb76shvtaUJvtFkdXX2HixXY32XiHfO@public.gmane.org> 2012-12-10 18:50 ` David Miller 2012-12-07 21:46 ` [RFC PATCH] af_packet: don't to defrag shared skb Eric Leblond 2012-12-07 21:56 ` Johannes Berg
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).