netfilter-devel.vger.kernel.org archive mirror
* [BUG] kernel crash in br_netfilter
@ 2016-02-29 12:33 Zefir Kurtisi
  2016-03-07 17:43 ` Zefir Kurtisi
  0 siblings, 1 reply; 4+ messages in thread
From: Zefir Kurtisi @ 2016-02-29 12:33 UTC (permalink / raw)
  To: OpenWrt Development List, netfilter-devel; +Cc: Florian Westphal

[-- Attachment #1: Type: text/plain, Size: 1347 bytes --]

I've been fighting a kernel bug that is producing random crashes around network /
skb_layer for a long time and was able to isolate it (or one of its components) to
the br_netfilter module.

I am reproducing the bug on PowerPC (TL-WDR4900v1.3) and MIPS (DB120, ar71xx)
based systems. Florian Westphal did not see it on kvm/x86, so it is unclear whether
it requires a physical system or is CPU specific. The bug is present in the latest
OpenWrt (tested HEAD is 03b15ae9) and also happens with firmware built 2+ years
ago, so it is not a recent regression but something that has been there for a long time.


Reproducing the crash
1. build the firmware for the system to test
   * use default configuration
   * ensure to select CONFIG_BRIDGE_NETFILTER in kernel_menuconfig
2. boot the device and access it over serial
3. ensure br-lan bridge has at least two active ports
   * tested with ath9k + Ethernet (gianfar and ag71xx)
   * if not enabled, enable radio0 and ensure wlan0 is in bridge
4. run: sysctl -w net.bridge.bridge-nf-call-iptables=1
5. from your host, continuously ping the device over Ethernet
6. run: ifconfig br-lan down

The next ingress packet causes a fatal crash.

Trace logs for MIPS and PPC are attached and point to __nf_conntrack_confirm.


Let me know if I could provide more information to further isolate the problem.


Thanks,
Zefir



[-- Attachment #2: tracelog-MIPS.txt --]
[-- Type: text/plain, Size: 2344 bytes --]

[  191.321163] br-lan: port 1(eth0.1) entered disabled state
[  192.646656] CPU 0 Unable to handle kernel paging request at virtual address 00200200, epc == 87000670, ra == 870018f4
[  192.657446] Oops[#1]:
[  192.659761] CPU: 0 PID: 0 Comm: swapper Not tainted 4.1.16 #1
[  192.665593] task: 803ce958 ti: 803c8000 task.ti: 803c8000
[  192.671069] $ 0   : 00000000 00000000 80000001 00200200
[  192.676410] $ 4   : 86c0fa20 00000001 00000000 a44465b9
[  192.681742] $ 8   : 86c0fa78 86c0fa78 00000000 00000000
[  192.687075] $12   : 115f0002 00000000 00000000 c0a80114
[  192.692408] $16   : 86c0fa20 000006cc 000007b6 803e5af0
[  192.697742] $20   : 000006cc 00000004 803e5af0 00000000
[  192.703082] $24   : 00000000 871367d4                  
[  192.708416] $28   : 803c8000 803c9a28 86c0fa60 870018f4
[  192.713750] Hi    : 000007b6
[  192.716670] Lo    : b5a74800
[  192.719628] epc   : 87000670 nf_conntrack_find_get+0x68/0x88 [nf_conntrack]
[  192.726698] ra    : 870018f4 __nf_conntrack_confirm+0xc0/0x364 [nf_conntrack]
[  192.733927] Status: 1100fc03 KERNEL EXL IE 
[  192.738196] Cause : 8080000c
[  192.741117] BadVA : 00200200
[  192.744040] PrId  : 0001974c (MIPS 74Kc)
[  192.748015] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv6 nf_conntrack_ipv4 mac80211 ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nan
[  192.816284] Process swapper (pid: 0, threadinfo=803c8000, task=803ce958, tls=00000000)
[  192.824311] Stack : 87342240 87135744 00000001 02000000 803c9aac 803c9aec 803cabac 87342240
          00000001 00000004 00000003 ffffff62 00000000 8026efb0 86c0fa20 87342240
          00000000 8731f000 86c18100 87137058 00000000 87342240 803c9aec 87342240
          803cab24 fffffffb 00000001 8026f090 8734ca80 87865b7c 00000000 00000008
          00000000 87137058 803caba4 87342240 00000001 8731f000 87342240 8731f05c
          ...
[  192.860643] Call Trace:
[  192.863133] [<87000670>] nf_conntrack_find_get+0x68/0x88 [nf_conntrack]
[  192.869850] 
[  192.871356] 
Code: 00020336  8c820008  30450001 <14a00002> ac620000  ac430004  3c020020  24420200  ac82000c 
[  192.881512] ---[ end trace 1e716eb17e40af8b ]---
[  192.888247] Kernel panic - not syncing: Fatal exception in interrupt
[  192.895654] Rebooting in 3 seconds..


[-- Attachment #3: tracelog-PowerPC.txt --]
[-- Type: text/plain, Size: 3378 bytes --]

[   69.834129] br0: port 3(eth1) entered disabled state
[   69.835427] br0: port 1(wlan0) entered disabled state
[   77.493530] Unable to handle kernel paging request for data at address 0x00200200
[   77.495415] Faulting instruction address: 0xd32ce874
[   77.496669] Oops: Kernel access of bad area, sig: 11 [#1]
[   77.498027] DT50
[   77.498493] Modules linked in: ath9k ath9k_common iptable_nat ath9k_hw ath nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkh
[   77.522830] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.23 #10
[   77.524323] task: c035b300 ti: cffe6000 task.ti: c0370000
[   77.525684] NIP: d32ce874 LR: d32cffec CTR: d35b13c4
[   77.526936] REGS: cffe7c60 TRAP: 0300   Not tainted  (3.18.23)
[   77.528403] MSR: 00029000 <CE,EE,ME>  CR: 42002082  XER: 20000000
[   77.529951] DEAR: 00200200 ESR: 00800000 
GPR00: c762b218 cffe7d10 c035b300 c762b1c0 8db8d32d d72044d0 00000000 00000000 
GPR08: 00000001 80000001 00200200 332f4b8b 22002082 10025420 00200000 c7654080 
GPR16: c7504d80 c7b7e540 cfba4678 c7b4a000 000086dd 00000000 80000000 00000002 
GPR24: c762b200 00000e25 000002b1 c0366fc8 00000225 000006b1 00000000 c762b1c0 
[   77.538144] NIP [d32ce874] 0xd32ce874
[   77.539075] LR [d32cffec] __nf_conntrack_confirm+0x2c8/0x34c [nf_conntrack]
[   77.540826] Call Trace:
[   77.541449] [cffe7d10] [d35b9760] nf_nat_ipv4_fn+0x13c/0x200 [nf_nat_ipv4] (unreliable)
[   77.543478] [cffe7d40] [c022640c] nf_iterate+0x70/0xc0
[   77.544778] [cffe7d80] [c02264d8] nf_hook_slow+0x7c/0x124
[   77.546143] [cffe7dc0] [c022c914] ip_local_deliver+0x98/0xbc
[   77.547578] [cffe7dd0] [c01eee60] __netif_receive_skb_core+0x668/0x79c
[   77.549228] [cffe7e30] [c01f0d84] netif_receive_skb_internal+0x60/0x84
[   77.550884] [cffe7e50] [c0285784] br_handle_frame+0x21c/0x32c
[   77.552337] [cffe7e70] [c01eece0] __netif_receive_skb_core+0x4e8/0x79c
[   77.553985] [cffe7ed0] [c01f0d84] netif_receive_skb_internal+0x60/0x84
[   77.555638] [cffe7ef0] [d328c764] gfar_clean_rx_ring+0x39c/0x2b7c [gianfar_driver]
[   77.557551] [cffe7f40] [d328c9cc] gfar_clean_rx_ring+0x604/0x2b7c [gianfar_driver]
[   77.559462] [cffe7f60] [c01f100c] net_rx_action+0x74/0x188
[   77.560857] [cffe7f90] [c0024508] __do_softirq+0xa8/0x1a8
[   77.562222] [cffe7fe0] [c00247f4] irq_exit+0x4c/0x64
[   77.563479] [cffe7ff0] [c000c198] call_do_irq+0x24/0x3c
[   77.564802] [c0371e80] [c0004280] do_IRQ+0x74/0xb0
[   77.566016] [c0371ea0] [c000d7a8] ret_from_except+0x0/0x18
[   77.567408] --- interrupt: 501 at arch_cpu_idle+0x24/0x60
[   77.567408]     LR = arch_cpu_idle+0x24/0x60
[   77.569849] [c0371f60] [c004ddbc] rcu_idle_enter+0x80/0xa8 (unreliable)
[   77.571529] [c0371f70] [c0042e88] cpu_startup_entry+0xec/0x218
[   77.573005] [c0371fb0] [c0337980] start_kernel+0x304/0x318
[   77.574390] [c0371ff0] [c0000394] set_ivor+0x120/0x15c
[   77.575685] Instruction dump:
[   77.576439] 5484703e 7d445050 7d434a78 554ac03e 7c6a1850 4e800020 8143000c 7d490034 
[   77.578424] 5529d97e 0f090000 81230008 71280001 <912a0000> 40820008 91490004 3d200020 
[   77.580453] ---[ end trace d093fabfbc25455c ]---
[   77.582973] 
[   78.573305] Kernel panic - not syncing: Fatal exception in interrupt
[   78.883261] mtdoops: ready 215, 216 (no erase)
[   78.884382] Rebooting in 3 seconds..



[-- Attachment #4: Type: text/plain, Size: 172 bytes --]

_______________________________________________
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel


* Re: [BUG] kernel crash in br_netfilter
  2016-02-29 12:33 [BUG] kernel crash in br_netfilter Zefir Kurtisi
@ 2016-03-07 17:43 ` Zefir Kurtisi
  2016-03-08 11:06   ` Florian Westphal
  0 siblings, 1 reply; 4+ messages in thread
From: Zefir Kurtisi @ 2016-03-07 17:43 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: OpenWrt Development List, netfilter-devel, Florian Westphal,
	Felix Fietkau

On 02/29/2016 01:33 PM, Zefir Kurtisi wrote:
> I've been fighting a kernel bug that is producing random crashes around network /
> skb_layer for a long time and was able to isolate it (or one of its components) to
> the br_netfilter module.
> 
> I am reproducing the bug on PowerPC (TL-WDR4900v1.3) and MIPS (DB120, ar71xx)
> based systems. Florian Westphal did not see it on kvm/x86, so it is unclear whether
> it requires a physical system or is CPU specific. The bug is present in the latest
> OpenWrt (tested HEAD is 03b15ae9) and also happens with firmware built 2+ years
> ago, so it is not a recent regression but something that has been there for a long time.
> 
> 
> Reproducing the crash
> 1. build the firmware for the system to test
>    * use default configuration
>    * ensure to select CONFIG_BRIDGE_NETFILTER in kernel_menuconfig
> 2. boot the device and access it over serial
> 3. ensure br-lan bridge has at least two active ports
>    * tested with ath9k + Ethernet (gianfar and ag71xx)
>    * if not enabled, enable radio0 and ensure wlan0 is in bridge
> 4. run: sysctl -w net.bridge.bridge-nf-call-iptables=1
> 5. from your host, continuously ping the device over Ethernet
> 6. run: ifconfig br-lan down
> 
> The next ingress packet causes a fatal crash.
> 
> Trace logs for MIPS and PPC are attached and point to __nf_conntrack_confirm.
> 
> 
> Let me know if I could provide more information to further isolate the problem.
> 
> 
I made progress on this issue. After wondering why the netfilter folks were
unable to reproduce it, it finally turned out that the problematic code is private
to OpenWrt, in
target/linux/generic/patches-X/120-bridge_allow_receiption_on_disabled_port.patch

This causes reproducible kernel crashes under the conditions given before. In
essence, it leads to a double-free (de-reference of a poisoned list pointer) or
use-after-destruction. For more details please check the manually collected
execution trace below. The TL;DR version is this: an ingress packet to the
Ethernet port of a disabled bridge
1. gets passed to br_handle_frame()
2. enters the BR_STATE_DISABLED case in the mentioned patch
3. gets passed to the related NF_HOOK
   a) in br_nf_pre_routing() a conntrack context ct is created
   b) that in the same nf_iterate() is destroyed in br_nf_pre_routing_finish()
4. in the br_pass_frame_up() following the NF_HOOK
   a) ipv4_confirm() runs __nf_conntrack_confirm(ct) with invalid ct
   b) which attempts to nf_ct_del_from_dying_or_unconfirmed_list(ct)
   c) and with that de-references and writes to LIST_POISON2 in pprev
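
The double-delete in step 4 can be illustrated with a small user-space model.
This is only a sketch under stated assumptions: it uses a simplified singly
linked list with a pprev back-pointer rather than the kernel's nulls list, all
model_* names are hypothetical, and the poison constant is simply the faulting
address reported in the attached traces (BadVA / DEAR 00200200).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Poison value taken from the oops logs (BadVA / DEAR = 00200200). */
#define MODEL_LIST_POISON2 ((struct model_node **)(uintptr_t)0x00200200u)

struct model_node {
    struct model_node *next;
    struct model_node **pprev;  /* address of the pointer that points at us */
};

struct model_head {
    struct model_node *first;
};

/* Insert n at the front of the list. */
static void model_add_head(struct model_head *h, struct model_node *n)
{
    assert(h != NULL && n != NULL);
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/*
 * Unlink n and poison its pprev. Returns 0 on success and -1 if n was
 * already unlinked. Without the poison check, a second unlink would run
 * "*pprev = next" with pprev == 0x00200200, which is the kind of store
 * that faults in the traces.
 */
static int model_del_checked(struct model_node *n)
{
    assert(n != NULL);
    if (n->pprev == MODEL_LIST_POISON2)
        return -1;              /* double delete detected, nothing written */
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->pprev = MODEL_LIST_POISON2;
    return 0;
}
```

The check in model_del_checked() exists only to make the failure observable in
user space; it is not a proposed fix, since the real problem is that the
conntrack entry is used after it has been destroyed.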

This is reproducible with different kernels (tested: 3.18 and 4.4) on both PPC
and MIPS systems (it should basically happen on any platform).

My hot-fix to prevent the crash is to pass the skb directly to
br_handle_local_finish() instead of passing it to NF_HOOK. But since I have
insufficient insight into what is going on there, this is fighting the symptoms
rather than solving the root cause. Maybe it is even better to drop patch 120
(not tested yet)?



Cheers,
Zefir

---

br_handle_frame() with p->state = BR_STATE_DISABLED {
  NF_HOOK(br_handle_local_finish) {
    br_nf_pre_routing() {
      NF_HOOK(br_nf_pre_routing_finish) {
        ipv4_conntrack_in() {
          nf_conntrack_in()
            init_conntrack()
              ct = __nf_conntrack_alloc()
        }
        br_nf_pre_routing_finish() [=okfn] {
          NF_HOOK(br_handle_frame_finish) {
            br_handle_frame_finish() { frees skb and returns 0 }
            nf_conntrack_destroy(ct) {
              destroy_conntrack(ct) {
                nf_ct_del_from_dying_or_unconfirmed_list(ct) {
                  ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode.pprev = LIST_POISON2
                }
                nf_conntrack_free(ct)
              }
            }
          }
        }
      }
      return NF_STOLEN
    }
    /* nf_iterate() returns NF_STOLEN, but nf_hook_slow() does not
       handle NF_STOLEN and returns 0 */
  }
  br_pass_frame_up() {
    ipv4_confirm() {
      __nf_conntrack_confirm(ct) {
        /* this one attempts to write to LIST_POISON2 and causes the oops */
        nf_ct_del_from_dying_or_unconfirmed_list(ct)
      }
    }
  }
}





* Re: [BUG] kernel crash in br_netfilter
  2016-03-07 17:43 ` Zefir Kurtisi
@ 2016-03-08 11:06   ` Florian Westphal
  2016-03-08 11:55     ` Felix Fietkau
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2016-03-08 11:06 UTC (permalink / raw)
  To: Zefir Kurtisi
  Cc: Stephen Hemminger, OpenWrt Development List, netfilter-devel,
	Florian Westphal, Felix Fietkau

Zefir Kurtisi <zefir.kurtisi@neratec.com> wrote:
> > Reproducing the crash
> > 1. build the firmware for the system to test
> >    * use default configuration
> >    * ensure to select CONFIG_BRIDGE_NETFILTER in kernel_menuconfig
> > 2. boot the device and access it over serial
> > 3. ensure br-lan bridge has at least two active ports
> >    * tested with ath9k + Ethernet (gianfar and ag71xx)
> >    * if not enabled, enable radio0 and ensure wlan0 is in bridge
> > 4. run: sysctl -w net.bridge.bridge-nf-call-iptables=1
> > 5. from your host, continuously ping the device over Ethernet
> > 6. run: ifconfig br-lan down
> > 
> > The next ingress packet causes a fatal crash.
> > 
> > Trace logs for MIPS and PPC are attached and point to __nf_conntrack_confirm.
> > 
> > 
> > Let me know if I could provide more information to further isolate the problem.
> > 
> > 
> I made progress on this issue. After wondering why the netfilter folks were
> unable to reproduce it, it finally turned out that the problematic code is private
> to OpenWrt, in
> target/linux/generic/patches-X/120-bridge_allow_receiption_on_disabled_port.patch

Yes, the patch is wrong.  As you discovered, the
br_netfilter/call-iptables infrastructure will free the skb, so all code
after NF_HOOK in this patch results in use-after-free.

It seems the quick fix (although that is not correct either) is to use
NF_BR_LOCAL_IN instead, so that we bypass the call-iptables infrastructure.

> 1. gets passed to br_handle_frame()
> 2. enters the BR_STATE_DISABLED case in the mentioned patch
> 3. gets passed to the related NF_HOOK
>    a) in br_nf_pre_routing() a conntrack context ct is created
>    b) that in the same nf_iterate() is destroyed in br_nf_pre_routing_finish()
> 4. in the br_pass_frame_up() following the NF_HOOK
>    a) ipv4_confirm() runs __nf_conntrack_confirm(ct) with invalid ct
>    b) which attempts to nf_ct_del_from_dying_or_unconfirmed_list(ct)
>    c) and with that de-references and writes to LIST_POISON2 in pprev

Yes, once NF_HOOK returns, the skb is in an undefined state.

This snippet (from mainline):

                /* Deliver packet to local host only */
                if (NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_IN,
                            dev_net(skb->dev), NULL, skb, skb->dev, NULL,
                            br_handle_local_finish)) {
                        return RX_HANDLER_CONSUMED; /* consumed by filter */
                } else {
                        *pskb = skb;
                        return RX_HANDLER_PASS; /* continue processing */
                }

... is also dubious.  It only works because no module in the current
upstream kernel registers a destructive hook in NF_BR_LOCAL_IN.

In fact, it looks like we would get a crash here as well once we gain the
ability to NFQUEUE in the nftables bridge family.
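
The ownership problem described above can be sketched in user space. This is
a simplified model, not kernel code: all model_* names are hypothetical, the
"freed" flag stands in for a real free so the effect stays observable, and the
hook is assumed to return 0 both when the continuation ran and when the packet
was stolen, as noted in the execution trace earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

enum model_verdict { MODEL_ACCEPT, MODEL_STOLEN };

struct model_pkt {
    int freed;      /* set when a hook consumed (freed) the packet */
    int delivered;  /* set when the packet was passed up the stack */
};

typedef int (*model_okfn)(struct model_pkt *pkt);

/* Returns 0 when okfn ran successfully AND when the packet was stolen. */
static int model_hook(struct model_pkt *pkt, enum model_verdict v,
                      model_okfn okfn)
{
    assert(pkt != NULL && okfn != NULL);
    if (v == MODEL_STOLEN) {
        pkt->freed = 1;         /* stands in for the hook freeing the skb */
        return 0;
    }
    return okfn(pkt);
}

static int model_okfn_noop(struct model_pkt *pkt)
{
    (void)pkt;
    return 0;
}

static int model_okfn_deliver(struct model_pkt *pkt)
{
    pkt->delivered = 1;
    return 0;
}

/*
 * Fragile pattern: keeps using pkt after the hook returned 0.
 * Returns -1 where the kernel equivalent would touch a freed packet.
 */
static int model_rx_fragile(struct model_pkt *pkt, enum model_verdict v)
{
    if (model_hook(pkt, v, model_okfn_noop) != 0)
        return 0;
    if (pkt->freed)             /* in the kernel, even this read is invalid */
        return -1;
    pkt->delivered = 1;
    return 0;
}

/* Safer pattern: all work happens in okfn, pkt is never touched afterwards. */
static int model_rx_in_okfn(struct model_pkt *pkt, enum model_verdict v)
{
    model_hook(pkt, v, model_okfn_deliver);
    return 0;
}
```

In this model the caller cannot tell "accepted" from "stolen" by the return
value alone, which is why the second variant moves the delivery into the
continuation and treats the packet as gone once the hook has been called.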

> My hot-fix to prevent the crash is to pass the skb directly to
> br_handle_local_finish() instead of passing it to NF_HOOK. But since I have
> insufficient insight into what is going on there, this is fighting the symptoms
> rather than solving the root cause. Maybe it is even better to drop patch 120
> (not tested yet)?

Sorry, I don't know why this patch was not merged upstream, and I do not know
why it's in OpenWrt.


* Re: [BUG] kernel crash in br_netfilter
  2016-03-08 11:06   ` Florian Westphal
@ 2016-03-08 11:55     ` Felix Fietkau
  0 siblings, 0 replies; 4+ messages in thread
From: Felix Fietkau @ 2016-03-08 11:55 UTC (permalink / raw)
  To: Florian Westphal, Zefir Kurtisi
  Cc: Stephen Hemminger, OpenWrt Development List, netfilter-devel

On 2016-03-08 12:06, Florian Westphal wrote:
>> My hot-fix to prevent the crash is to pass the skb directly to
>> br_handle_local_finish() instead of passing it to NF_HOOK. But since I have
>> insufficient insight into what is going on there, this is fighting the symptoms
>> rather than solving the root cause. Maybe it is even better to drop patch 120
>> (not tested yet)?
> 
> Sorry, I don't know why this patch was not merged upstream, and I do not know
> why it's in OpenWrt.
This patch exists because it is otherwise impossible to bridge a client
mode (4addr) WLAN interface when encryption is enabled.

wpa_supplicant needs to receive EAP packets before it will change the
operstate to allow the bridge and the rest of the network stack to do
their thing.
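
For illustration, the EAP frames in question are carried as EAPOL, which uses
EtherType 0x888E. The sketch below shows how such a frame could be recognised
in a raw Ethernet buffer. It is an assumption-laden example, not the logic of
the OpenWrt patch: the model_* names are hypothetical and the frame is assumed
to be untagged (no VLAN header in front of the EtherType).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ETH_HLEN  14      /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define MODEL_ETH_P_PAE 0x888E  /* EtherType of EAPOL (IEEE 802.1X) */

/* Returns 1 if the untagged Ethernet frame carries EAPOL, 0 otherwise. */
static int model_is_eapol(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < MODEL_ETH_HLEN)
        return 0;               /* too short to hold an Ethernet header */
    /* The EtherType is stored big-endian at byte offsets 12 and 13. */
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return ethertype == MODEL_ETH_P_PAE;
}
```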

This used to work a while back, and I think it got broken by this
commit:

commit 576eb62598f10c8c7fd75703fe89010cdcfff596
Author: stephen hemminger <shemminger@vyatta.com>
Date:   Fri Dec 28 18:15:22 2012 +0000

 bridge: respect RFC2863 operational state

 The bridge link detection should follow the operational state
 of the lower device, rather than the carrier bit. This allows devices
 like tunnels that are controlled by userspace control plane to work
 with bridge STP link management.

 Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
 Reviewed-by: Flavio Leitner <fbl@redhat.com>
 Signed-off-by: David S. Miller <davem@davemloft.net>

Back then I proposed a patch for upstream inclusion, got some feedback,
Stephen sent me this patch and I fixed it up a bit and re-submitted it.
I think it got lost somewhere in the process and after that I lost track
and didn't get around to re-submitting it.

So we kept the patch in OpenWrt because as far as I know, the regression
still exists in current kernels.

- Felix
