* [BUG] MVPP2 driver exploding in presence of a tap interface
       [not found] <6355174d-4ab6-595d-17db-311bce607aef@arm.com>
@ 2018-10-30 10:50 ` Antoine Tenart
  2018-10-30 12:16   ` Marc Zyngier
  0 siblings, 1 reply; 8+ messages in thread
From: Antoine Tenart @ 2018-10-30 10:50 UTC (permalink / raw)
  To: linux-arm-kernel

Marc,

On Mon, Oct 29, 2018 at 03:05:53PM +0000, Marc Zyngier wrote:
>
> This is a follow-up on the conversation Thomas and I had last week at
> ELC, with me ranting at the sorry state of the MVPP2 driver.

> Triggering this is dead simple:
> - Add a macvtap to one of the MVPP2 interfaces
> - Bring it online
> - Watch the kernel exploding and memory being corrupted
>
> You don't even need anything listening on the tap interface, just its
> simple existence triggers it. I use a similar setup on a large variety
> of machines, and this box is the only one that catches fire. Removing
> the macvtap interface makes it (more) reliable.
>
> Given that I cannot reproduce this issue on any other ARM (32 or 64bit)
> platform, including other Marvell stuff, I can only conclude that the
> MVPP2 driver is responsible for this.
>
> Example crash and .config below (4.19 vanilla, as linux/master dies in
> new and wonderful ways on this box). I'm looking forward to testing any
> idea you may have.

I used a 4.19 vanilla kernel, with both your configuration and mine,
on 2 different Macchiatobins, but was unable to trigger the issue:

# ip link set eth0 up
# ip link add link eth0 name macvtap0 type macvtap
# ip link set macvtap0 up

I can even configure the eth0/macvtap0 interfaces, and use them
generating or receiving tcp/udp/icmp traffic.

(I also made other tests using macvtap and tap interfaces).

How much memory do you have on the board? What version of ATF are you
using? Version of U-Boot?

Antoine

--
Antoine Ténart, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [BUG] MVPP2 driver exploding in presence of a tap interface
  2018-10-30 10:50 ` [BUG] MVPP2 driver exploding in presence of a tap interface Antoine Tenart
@ 2018-10-30 12:16   ` Marc Zyngier
  2018-10-30 12:37     ` Marcin Wojtas
  0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2018-10-30 12:16 UTC (permalink / raw)
  To: linux-arm-kernel

Antoine,

On 30/10/18 10:50, Antoine Tenart wrote:
> Marc,
>
> On Mon, Oct 29, 2018 at 03:05:53PM +0000, Marc Zyngier wrote:
>>
>> This is a follow-up on the conversation Thomas and I had last week at
>> ELC, with me ranting at the sorry state of the MVPP2 driver.
>
>> Triggering this is dead simple:
>> - Add a macvtap to one of the MVPP2 interfaces
>> - Bring it online
>> - Watch the kernel exploding and memory being corrupted
>>
>> You don't even need anything listening on the tap interface, just its
>> simple existence triggers it. I use a similar setup on a large variety
>> of machines, and this box is the only one that catches fire. Removing
>> the macvtap interface makes it (more) reliable.
>>
>> Given that I cannot reproduce this issue on any other ARM (32 or 64bit)
>> platform, including other Marvell stuff, I can only conclude that the
>> MVPP2 driver is responsible for this.
>>
>> Example crash and .config below (4.19 vanilla, as linux/master dies in
>> new and wonderful ways on this box). I'm looking forward to testing any
>> idea you may have.
>
> I used a 4.19 vanilla kernel, with both your configuration and mine,
> on 2 different Macchiatobins, but was unable to trigger the issue:
>
> # ip link set eth0 up
> # ip link add link eth0 name macvtap0 type macvtap
> # ip link set macvtap0 up
>
> I can even configure the eth0/macvtap0 interfaces, and use them
> generating or receiving tcp/udp/icmp traffic.
>
> (I also made other tests using macvtap and tap interfaces).
>
> How much memory do you have on the board? What version of ATF are you
> using? Version of U-Boot?

4GB of RAM. As for the version numbers, see below. I don't use u-boot,
but UEFI (EDK-II v2.60). The problem can be reproduced on two different
machines, with the same configuration (and firmwares dating from a
similar era):

Starting CP-0 IOROM 1.07
Booting from SD 0 (0x29)
Found valid image at boot postion 0x002
lNOTICE: Starting binary extension
NOTICE: Gathering DRAM information
mv_ddr: mv_ddr-armada-17.06.1-g47f4c8b (Jun 2 2017 - 17:07:23)
mv_ddr: completed successfully
NOTICE: Booting Trusted Firmware
NOTICE: BL1: v1.3(release):armada-17.06.2:297d68f
NOTICE: BL1: Built : 17:07:27, Jun 2 2017
NOTICE: BL1: Booting BL2
lNOTICE: BL2: v1.3(release):armada-17.06.2:297d68f
NOTICE: BL2: Built : 17:07:28, Jun 2 2017
NOTICE: BL1: Booting BL31
lNOTICE: BL31: v1.3(release):armada-17.06.2:297d68f
NOTICE: BL31: Built : 17:07:30, Jun 2 2017
lUEFI firmware (version MARVELL_EFI built at 17:12:21 on Jun 2 2017)

Armada 8040 MachiatoBin Platform Init

Comphy0-0: PCIE0 5 Gbps
Comphy0-1: PCIE0 5 Gbps
Comphy0-2: PCIE0 5 Gbps
Comphy0-3: PCIE0 5 Gbps
Comphy0-4: SFI 10.31 Gbps
Comphy0-5: SATA1 5 Gbps

Comphy1-0: SGMII1 1.25 Gbps
Comphy1-1: SATA2 5 Gbps
Comphy1-2: USB3_HOST0 5 Gbps
Comphy1-3: SATA3 5 Gbps
Comphy1-4: SFI 10.31 Gbps
Comphy1-5: SGMII2 3.125 Gbps

UTMI PHY 0 initialized to USB Host0
UTMI PHY 1 initialized to USB Host1
UTMI PHY 0 initialized to USB Host0
RTC: Initialize controller 1
Skip I2c chip 0
Succesfully installed protocol interfaces
ramdisk:blckio install. Status=Success

With the latest mainline, and after fixing that other irq affinity
bug (see patch posted yesterday), I only need to bring the interface
up, without doing anything else:

# ip link set eth0 up
[ 155.507877] mvpp2 f2000000.ethernet eth0: PHY [f212a600.mdio-mii:00] driver [mv88x3310]
[ 155.526732] mvpp2 f2000000.ethernet eth0: configuring for phy/10gbase-kr link mode
[ 157.592581] mvpp2 f2000000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[ 158.339396] BUG: Bad page state in process swapper/0 pfn:e6804
[ 158.345345] page:ffff7e00039a0100 count:0 mapcount:0 mapping:ffff8000e7bf3b00 index:0xffff8000e6804c00
[ 158.354696] flags: 0xfffc00000000200(slab)
[ 158.358815] raw: 0fffc00000000200 ffff7e00039cff80 0000000400000004 ffff8000e7bf3b00
[ 158.366594] raw: ffff8000e6804c00 000000008010000f 00000000ffffffff 0000000000000000
[ 158.374371] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
[ 158.380840] bad because of flags: 0x200(slab)
[ 158.385216] Modules linked in:
[ 158.388288] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-09420-g34ae82ac683c #278
[ 158.396148] Hardware name: Marvell 8040 MACCHIATOBin (DT)
[ 158.401567] Call trace:
[ 158.404031] dump_backtrace+0x0/0x148
[ 158.407708] show_stack+0x14/0x20
[ 158.411036] dump_stack+0x90/0xb4
[ 158.414365] bad_page+0x104/0x130
[ 158.417692] free_pages_check_bad+0x9c/0xa8
[ 158.421892] __free_pages_ok+0x1b0/0x450
[ 158.425829] page_frag_free+0x8c/0xa8
[ 158.429505] skb_free_head+0x18/0x30
[ 158.433093] skb_release_data+0x130/0x160
[ 158.437117] skb_release_all+0x24/0x30
[ 158.440881] consume_skb+0x2c/0x58
[ 158.444296] arp_process.constprop.4+0x200/0x6f0
[ 158.448931] arp_rcv+0xf4/0x128
[ 158.452084] __netif_receive_skb_one_core+0x54/0x78
[ 158.456981] __netif_receive_skb+0x14/0x60
[ 158.461094] netif_receive_skb_internal+0x40/0x138
[ 158.465903] napi_gro_receive+0x64/0xc8
[ 158.469754] mvpp2_poll+0x3f4/0x810
[ 158.473255] net_rx_action+0x104/0x2c0
[ 158.477018] __do_softirq+0x11c/0x234
[ 158.480695] irq_exit+0xb8/0xc8
[ 158.483848] __handle_domain_irq+0x64/0xb8
[ 158.487959] gic_handle_irq+0x50/0xa0
[ 158.491634] el1_irq+0xb0/0x128
[ 158.494786] arch_cpu_idle+0x10/0x18
[ 158.498375] do_idle+0x208/0x280
[ 158.501615] cpu_startup_entry+0x20/0x28
[ 158.505553] rest_init+0xd4/0xe0
[ 158.508793] arch_call_rest_init+0xc/0x14
[ 158.512818] start_kernel+0x3d8/0x400
[ 158.516497] Disabling lock debugging due to kernel taint
[ 159.461058] BUG: Bad page state in process swapper/0 pfn:e681d
[ 159.467013] page:ffff7e00039a0740 count:0 mapcount:0 mapping:ffff8000ef43fb00 index:0x0
[ 159.475051] flags: 0xfffc00000000200(slab)
[ 159.479170] raw: 0fffc00000000200 dead000000000100 dead000000000200 ffff8000ef43fb00
[ 159.486947] raw: 0000000000000000 00000000001e001e 00000000ffffffff 0000000000000000
[ 159.494721] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
[ 159.501189] bad because of flags: 0x200(slab)
[ 159.505566] Modules linked in:
[ 159.508636] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B 4.19.0-09420-g34ae82ac683c #278
[ 159.517892] Hardware name: Marvell 8040 MACCHIATOBin (DT)
[ 159.523311] Call trace:
[ 159.525775] dump_backtrace+0x0/0x148
[ 159.529451] show_stack+0x14/0x20
[ 159.532779] dump_stack+0x90/0xb4
[ 159.536106] bad_page+0x104/0x130
[ 159.539433] free_pages_check_bad+0x9c/0xa8
[ 159.543633] __free_pages_ok+0x1b0/0x450
[ 159.547570] page_frag_free+0x8c/0xa8
[ 159.551247] skb_free_head+0x18/0x30
[ 159.554836] skb_release_data+0x130/0x160
[ 159.558860] skb_release_all+0x24/0x30
[ 159.562623] kfree_skb+0x2c/0x58
[ 159.565864] __udp4_lib_rcv+0x850/0x948
[ 159.569713] udp_rcv+0x1c/0x28
[ 159.572779] ip_local_deliver_finish+0x100/0x248
[ 159.577414] ip_local_deliver+0x60/0x110
[ 159.581350] ip_rcv_finish+0x38/0x50
[ 159.584938] ip_rcv+0x50/0xd8
[ 159.587918] __netif_receive_skb_one_core+0x54/0x78
[ 159.592815] __netif_receive_skb+0x14/0x60
[ 159.596928] netif_receive_skb_internal+0x40/0x138
[ 159.601738] napi_gro_receive+0x64/0xc8
[ 159.605589] mvpp2_poll+0x3f4/0x810
[ 159.609090] net_rx_action+0x104/0x2c0
[ 159.612853] __do_softirq+0x11c/0x234
[ 159.616530] irq_exit+0xb8/0xc8
[ 159.619683] __handle_domain_irq+0x64/0xb8
[ 159.623794] gic_handle_irq+0x50/0xa0
[ 159.627470] el1_irq+0xb0/0x128
[ 159.630622] arch_cpu_idle+0x10/0x18
[ 159.634211] do_idle+0x208/0x280
[ 159.637451] cpu_startup_entry+0x24/0x28
[ 159.641388] rest_init+0xd4/0xe0
[ 159.644630] arch_call_rest_init+0xc/0x14
[ 159.648655] start_kernel+0x3d8/0x400

Bizarrely, eth1 and eth2 do not crash this way. I have no way to test
eth3 (no transceiver).

Thanks,

M.
--
Jazz is not dead. It just smells funny...
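Both reports above have the same shape: a page still flagged as slab is freed through page_frag_free, in the receive path under mvpp2_poll. A minimal Python sketch for pulling that summary out of a saved dmesg capture follows. The function name `summarize_bad_pages` and the regular expressions are assumptions written against the log lines quoted in this message only, not a general kernel log parser.

```python
import re

# Patterns match the exact wording of the log lines quoted in this thread.
BAD_PAGE_RE = re.compile(
    r"BUG: Bad page state in process (\S+)\s+pfn:([0-9a-f]+)")
BAD_FLAGS_RE = re.compile(r"bad because of flags: 0x[0-9a-f]+\((\w+)\)")
# A call-trace frame looks like "] symbol+0xOFF/0xSIZE".
FRAME_RE = re.compile(r"\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_bad_pages(log_text):
    """Return one dict per 'Bad page state' report found in log_text."""
    reports = []
    for line in log_text.splitlines():
        match = BAD_PAGE_RE.search(line)
        if match:
            reports.append({
                "process": match.group(1),
                "pfn": int(match.group(2), 16),
                "flags": None,
                "frames": [],
            })
            continue
        if not reports:
            continue  # ignore everything before the first report
        match = BAD_FLAGS_RE.search(line)
        if match:
            reports[-1]["flags"] = match.group(1)
            continue
        match = FRAME_RE.search(line)
        if match:
            reports[-1]["frames"].append(match.group(1))
    return reports


if __name__ == "__main__":
    sample = (
        "[ 158.339396] BUG: Bad page state in process swapper/0 pfn:e6804\n"
        "[ 158.380840] bad because of flags: 0x200(slab)\n"
        "[ 158.425829] page_frag_free+0x8c/0xa8\n"
        "[ 158.469754] mvpp2_poll+0x3f4/0x810\n"
    )
    for report in summarize_bad_pages(sample):
        print(hex(report["pfn"]), report["flags"], report["frames"])
```

Filtering the returned reports on `"mvpp2_poll" in report["frames"]` is one way to separate driver-side receive-path reports from unrelated ones in a longer capture.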
* [BUG] MVPP2 driver exploding in presence of a tap interface
  2018-10-30 12:16   ` Marc Zyngier
@ 2018-10-30 12:37     ` Marcin Wojtas
  2018-10-30 12:59       ` Marc Zyngier
  2018-10-30 13:00       ` Thomas Petazzoni
  0 siblings, 2 replies; 8+ messages in thread
From: Marcin Wojtas @ 2018-10-30 12:37 UTC (permalink / raw)
  To: linux-arm-kernel

[Resend in UTF-8]

Hi Marc,

You use _really_ archaic firmware, the bug you see is 99% caused by a
bug already fixed a long time ago (cleanup all PP2 BM pools correctly
during exit boot services).

Please grab the latest release:
https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/wiki/files/flash-image-18.09.4.bin
and let me know if you observe any further issues with a vanilla kernel.

Best regards,
Marcin
* [BUG] MVPP2 driver exploding in presence of a tap interface
  2018-10-30 12:37     ` Marcin Wojtas
@ 2018-10-30 12:59       ` Marc Zyngier
  2018-10-30 13:00       ` Thomas Petazzoni
  1 sibling, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2018-10-30 12:59 UTC (permalink / raw)
  To: linux-arm-kernel

Marcin,

On 30/10/18 12:37, Marcin Wojtas wrote:
> [Resend in UTF-8]
>
> Hi Marc,
>
> You use _really_ archaic firmware, the bug you see is 99% caused by a

Please let me fix this for you: s/_really_ archaic/released/

> bug already fixed a long time ago (cleanup all PP2 BM pools correctly
> during exit boot services).

How long ago? Why didn't you say so when I reported the bug to you and
Antoine back in January?

Also, why isn't that "clean-up" taken care of by the Linux driver?
Exiting boot services itself doesn't seem to cause the issue; it is
setting the interface up that causes it.

> Please grab the latest release:
> https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/wiki/files/flash-image-18.09.4.bin
> and let me know if you observe any further issues with a vanilla kernel.

What does this image contain?

Thanks,

M.
--
Jazz is not dead. It just smells funny...
* [BUG] MVPP2 driver exploding in presence of a tap interface
  2018-10-30 12:37     ` Marcin Wojtas
  2018-10-30 12:59       ` Marc Zyngier
@ 2018-10-30 13:00       ` Thomas Petazzoni
  2018-10-30 14:55         ` Marc Zyngier
  1 sibling, 1 reply; 8+ messages in thread
From: Thomas Petazzoni @ 2018-10-30 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

Hello Marcin,

Thanks for the feedback.

On Tue, 30 Oct 2018 13:37:37 +0100, Marcin Wojtas wrote:

> You use _really_ archaic firmware, the bug you see is 99% caused by a
> bug already fixed a long time ago (cleanup all PP2 BM pools correctly
> during exit boot services). Please grab the latest release:
> https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/wiki/files/flash-image-18.09.4.bin
> and let me know if you observe any further issues with a vanilla kernel.

Even if this was a bug in the UEFI firmware, shouldn't the kernel be
independent from that, by doing a proper reset/reinit of the HW?

I.e., isn't the firmware fix papering over a bug that should be fixed in
the Linux mvpp2 driver anyway?

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
* [BUG] MVPP2 driver exploding in presence of a tap interface
  2018-10-30 13:00       ` Thomas Petazzoni
@ 2018-10-30 14:55         ` Marc Zyngier
  2018-10-30 15:10           ` Thomas Petazzoni
  0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2018-10-30 14:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 30/10/18 13:00, Thomas Petazzoni wrote:
> Hello Marcin,
>
> Thanks for the feedback.
>
> On Tue, 30 Oct 2018 13:37:37 +0100, Marcin Wojtas wrote:
>
>> You use _really_ archaic firmware, the bug you see is 99% caused by a
>> bug already fixed a long time ago (cleanup all PP2 BM pools correctly
>> during exit boot services). Please grab the latest release:
>> https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/wiki/files/flash-image-18.09.4.bin
>> and let me know if you observe any further issues with a vanilla kernel.
>
> Even if this was a bug in the UEFI firmware, shouldn't the kernel be
> independent from that, by doing a proper reset/reinit of the HW?
>
> I.e., isn't the firmware fix papering over a bug that should be fixed in
> the Linux mvpp2 driver anyway?

Absolutely. Leaving this unpatched in the kernel, with a 100% chance of
memory corruption, is just mad.

I'm pretty sure there should be a way to sanely reset the interface
before it starts repainting the memory. And if there is none, we must
find a way to tell the user that the machine is a death trap. Really.

	M.

PS: updating the FW to the version provided by Marcin indeed makes
things much more reliable. Thanks for that.
--
Jazz is not dead. It just smells funny...
* [BUG] MVPP2 driver exploding in presence of a tap interface
  2018-10-30 14:55         ` Marc Zyngier
@ 2018-10-30 15:10           ` Thomas Petazzoni
  2018-10-30 15:22             ` Marc Zyngier
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Petazzoni @ 2018-10-30 15:10 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Tue, 30 Oct 2018 14:55:01 +0000, Marc Zyngier wrote:

> > I.e., isn't the firmware fix papering over a bug that should be fixed in
> > the Linux mvpp2 driver anyway?
>
> Absolutely. Leaving this unpatched in the kernel, with a 100% chance of
> memory corruption, is just mad.
>
> I'm pretty sure there should be a way to sanely reset the interface
> before it starts repainting the memory.

I agree here. Do you still have an image of that old firmware version,
so that we can try to reproduce, and see if we can come up with a way
to reset the BM on boot up that would avoid this issue?

Thanks,

Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
* [BUG] MVPP2 driver exploding in presence of a tap interface
  2018-10-30 15:10           ` Thomas Petazzoni
@ 2018-10-30 15:22             ` Marc Zyngier
  0 siblings, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2018-10-30 15:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 30/10/18 15:10, Thomas Petazzoni wrote:
> Hello,
>
> On Tue, 30 Oct 2018 14:55:01 +0000, Marc Zyngier wrote:
>
>>> I.e., isn't the firmware fix papering over a bug that should be fixed in
>>> the Linux mvpp2 driver anyway?
>>
>> Absolutely. Leaving this unpatched in the kernel, with a 100% chance of
>> memory corruption, is just mad.
>>
>> I'm pretty sure there should be a way to sanely reset the interface
>> before it starts repainting the memory.
>
> I agree here. Do you still have an image of that old firmware version,
> so that we can try to reproduce, and see if we can come up with a way
> to reset the BM on boot up that would avoid this issue?

Yup. I still have both the original build tree as well as the sdcard, so
you should be able to trigger it on demand. I'll email you the stuff
separately, unless you want another delivery method.

Thanks,

M.
--
Jazz is not dead. It just smells funny...