From: Philippe Gerum <rpm@xenomai.org>
To: Ronny Meeus <ronny.meeus@domain.hid>
Cc: adeos-main@gna.org
Subject: Re: [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
Date: Mon, 04 Jul 2011 10:20:38 +0200
Message-ID: <1309767638.2157.7.camel@domain.hid>
In-Reply-To: <CAMJ=MEcn1ZemK7XkW5F9MpJ9pWihm7UM-t-7HZP7LPD7QDLmcA@mail.gmail.com>
On Mon, 2011-07-04 at 10:06 +0200, Ronny Meeus wrote:
> On Sat, Jul 2, 2011 at 11:33 PM, Ronny Meeus <ronny.meeus@domain.hid> wrote:
> > Hello
> >
> > we use a Freescale P4040 (powerpc) based board running Linux+Xenomai.
> > I copy-paste here some information I found in the bootlog:
> >
> > [ 0.000000] Using P4080 DS machine description
> > [ 0.000000] Memory CAM mapping: 256/256/256 Mb, residual: 1248Mb
> > [ 0.000000] Linux version 2.6.35.7-hg98224f47aa52-dirty
> > (xxxxx@domain.hid) (gcc version 4.4.6 (Buildroot 2011.05-hg98224f47aa52)
> > ) #1 SMP Fri Jul 1 08:42:30 CEST 2011
> >
> > [ 0.000000] clocksource: timebase mult[6aaaf09] shift[22] registered
> > [ 0.000000] I-pipe 2.12-01: pipeline enabled.
> > [ 0.000000] Console: colour dummy device 80x25
> > [ 0.181150] pid_max: default: 32768 minimum: 301
> >
> > [ 2.093842] I-pipe: Domain Xenomai registered.
> > [ 2.146016] Xenomai: hal/powerpc started.
> > [ 2.193904] Xenomai: scheduling class idle registered.
> > [ 2.255328] Xenomai: scheduling class rt registered.
> > [ 2.319092] Xenomai: real-time nucleus v2.5.5 (Ghosts) loaded.
> > [ 2.388207] Xenomai: starting native API services.
> > [ 2.445249] Xenomai: starting pSOS+ services.
> > [ 2.497478] highmem bounce pool size: 64 pages
> > [ 2.550932] fuse init (API version 7.14)
> >
> > Although the P4040 has 4 cores, we are currently using only 1 core.
> > This is specified in the device tree we are using.
> > The kernel runs SMP enabled.
> >
> > I start 2 test applications on this board.
> > The first application is sending raw Ethernet packets on a link that
> > is put in loop. The result is that all packets we send are received
> > (unmodified) back on the same interface.
> > The second application is listening on the same Ethernet interface
> > also via a raw Ethernet socket.
> > Both applications are plain Linux applications, so no Xenomai code is used.
> >
> > One side effect of using raw Ethernet sockets is that all packets sent
> > on one socket will also be received by all other raw Ethernet sockets.
> > This means that the listening application will receive each packet 2
> > times: once while sending and a second time when it is received via
> > the loop. (A side question: can the behavior be disabled somehow? We
> > basically do not want to receive all packets we send ...)
> >
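On the side question: one user-space way to deal with this is to drop
those frames in the receiver. On Linux, the link-layer address returned
with each frame on a packet socket carries a packet type, and frames that
originate from the local host are tagged PACKET_OUTGOING. Below is a
minimal Python sketch of such a filter, not the code of the test
applications. The names `is_own_transmit` and `receive_loop` are
hypothetical, the interface name is whatever port is looped, and the
receiving part assumes Linux and CAP_NET_RAW.

```python
import socket

# PACKET_OUTGOING as defined in <linux/if_packet.h>; the fallback value
# covers platforms where Python's socket module does not export it.
PACKET_OUTGOING = getattr(socket, "PACKET_OUTGOING", 4)
ETH_P_ALL = 0x0003  # receive every protocol


def is_own_transmit(addr):
    """Return True if a recvfrom() address marks a locally sent frame.

    For AF_PACKET sockets, addr is (ifname, proto, pkttype, hatype, hwaddr).
    """
    return addr[2] == PACKET_OUTGOING


def receive_loop(ifname, max_frames):
    """Hypothetical receiver: collect frames, skipping our own transmissions."""
    frames = []
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                       socket.htons(ETH_P_ALL)) as sock:
        sock.bind((ifname, 0))  # protocol 0 keeps the one given at creation
        while len(frames) < max_frames:
            data, addr = sock.recvfrom(65535)
            if is_own_transmit(addr):
                continue  # copy of a frame sent from this host
            frames.append(data)
    return frames
```

This only hides the duplicates from the application. The kernel still
delivers every outgoing frame to the socket, so the receive path seen in
the traces is exercised all the same.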
> > After a very short time (sending something like 30000 packets), both
> > applications block completely and 60 seconds later an indication is
> > displayed on the console that the kernel is locked.
> >
> > [ 805.307213] BUG: soft lockup - CPU#0 stuck for 61s! [send_eth_socket:1907]
> > [ 805.389519] Modules linked in: reboot_helper dpll_si53xx crave ndps_a_cpld
> > [ 805.471880] NIP: c000cc4c LR: 00000000 CTR: 00000000
> > [ 805.531274] REGS: c1f87040 TRAP: 0000 Not tainted
> > (2.6.35.7-hg98224f47aa52-dirty)
> > [ 805.623992] MSR: 00029002 <EE,ME,CE> CR: 00000000 XER: 00000000
> > [ 805.696972] TASK = ec7116d0[1907] 'send_eth_socket' THREAD: ec6aa000 CPU: 0
> > [ 805.778248] GPR00: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [ 805.878359] GPR08: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [ 805.978452] GPR16: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [ 806.078571] GPR24: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [ 806.180773] NIP [c000cc4c] udelay+0x24/0x30
> > [ 806.230782] LR [00000000] (null)
> > [ 806.269334] Call Trace:
> > [ 806.298521] [efff3b50] [c00071b4] show_stack+0x78/0x18c (unreliable)
> > [ 806.374600] [efff3b90] [c00078c4] show_regs+0x200/0x2ec
> > [ 806.437125] [efff3bc0] [c00658d4] softlockup_tick+0x1dc/0x23c
> > [ 806.505897] [efff3bf0] [c003cc50] run_local_timers+0x1c/0x2c
> > [ 806.573626] [efff3c00] [c003cca4] update_process_times+0x44/0x80
> > [ 806.645528] [efff3c20] [c0059bc4] tick_sched_timer+0xd0/0x128
> > [ 806.714307] [efff3c50] [c004d8f0] __run_hrtimer+0x68/0x14c
> > [ 806.779958] [efff3c70] [c004efa4] hrtimer_interrupt+0x1d8/0x41c
> > [ 806.850812] [efff3cf0] [c000d8d8] timer_interrupt+0x1b4/0x238
> > [ 806.919586] [efff3d10] [c0009ac4] __ipipe_do_timer+0x44/0x54
> > [ 806.987315] [efff3d20] [c006d448] __ipipe_sync_stage+0x1d0/0x27c
> > [ 807.059212] [efff3d60] [c0009728] __ipipe_grab_timer+0x104/0x12c
> > [ 807.131112] [efff3d70] [c00129e0] __ipipe_ret_from_except+0x0/0xc
> > [ 807.204063] --- Exception: 901 at _raw_spin_lock+0x30/0x3c
> > [ 807.204068] LR = tpacket_rcv+0x264/0x570
> > [ 807.320754] [efff3e30] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [ 807.397875] [efff3e80] [c02c43b0] __netif_receive_skb+0x2b4/0x2f0
> > [ 807.470811] [efff3eb0] [c02c4fa0] netif_receive_skb+0x98/0xac
> > [ 807.539583] [efff3ee0] [c0292838] ingress_rx_default_dqrr+0x428/0x4b4
> > [ 807.616693] [efff3f10] [c02ac2a8] qman_poll_dqrr+0x1e0/0x284
> > [ 807.684426] [efff3f50] [c0294088] dpaa_eth_poll+0x34/0xd0
> > [ 807.749031] [efff3f70] [c02c5280] net_rx_action+0xc0/0x1e8
> > [ 807.814683] [efff3fa0] [c0035ab0] __do_softirq+0x138/0x210
> > [ 807.880333] [efff3ff0] [c00115e8] call_do_softirq+0x14/0x24
> > [ 807.947022] [ec6abab0] [c000480c] do_softirq+0xb4/0xec
> > [ 808.008503] --- Exception: ec6abbb0 at 0xec6abb70
> > [ 808.008507] LR = 0xec4e6c50
> > [ 808.102274] [ec6abad0] [c00357cc] irq_exit+0x60/0xb8 (unreliable)
> > [ 808.175227] [ec6abae0] [c0009b5c] __ipipe_do_IRQ+0x88/0xc0
> > [ 808.240872] [ec6abb00] [c006d468] __ipipe_sync_stage+0x1f0/0x27c
> > [ 808.312771] [ec6abb40] [c00095f4] __ipipe_handle_irq+0x1b8/0x1e8
> > [ 808.384669] [ec6abb70] [c00098dc] __ipipe_grab_irq+0x18c/0x1bc
> > [ 808.454482] [ec6abba0] [c00129e0] __ipipe_ret_from_except+0x0/0xc
> > [ 808.527425] --- Exception: 501 at _raw_spin_lock+0x14/0x3c
> > [ 808.527430] LR = tpacket_rcv+0x264/0x570
> > [ 808.644114] [ec6abc60] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [ 808.721232] [ec6abcb0] [c02c6238] dev_hard_start_xmit+0x164/0x414
> > [ 808.794171] [ec6abcf0] [c0325b94] packet_sendmsg+0x8c0/0x984
> > [ 808.861901] [ec6abd70] [c02b32f0] sock_sendmsg+0x90/0xb4
> > [ 808.925465] [ec6abe40] [c02b3ea8] sys_sendto+0xd0/0x114
> > [ 808.987988] [ec6abf10] [c02b522c] sys_socketcall+0x148/0x210
> > [ 809.055718] [ec6abf40] [c0011d0c] ret_from_syscall+0x0/0x3c
> > [ 809.122407] --- Exception: c01 at 0x48051f00
> > [ 809.122411] LR = 0x4808e030
> > [ 809.210966] Instruction dump:
> > [ 809.246401] 7d204850 7f891840 419cfff0 7c421378 4e800020 3d20c04c
> > 800967e0 7c0301d6
> > [ 809.339215] 7d2c42a6 48000008 7c210b78 <7d6c42a6> <7d695850>
> > 7f8b0040 419cfff0 7c421378
> > [ 874.025894] BUG: soft lockup - CPU#0 stuck for 61s! [send_eth_socket:1907]
> > [ 874.108198] Modules linked in: reboot_helper dpll_si53xx crave ndps_a_cpld
> > [ 874.190551] NIP: c000cc48 LR: 00000000 CTR: 00000000
> > [ 874.249937] REGS: c1f87040 TRAP: 0000 Not tainted
> > (2.6.35.7-hg98224f47aa52-dirty)
> > [ 874.342658] MSR: 00029002 <EE,ME,CE> CR: 00000000 XER: 00000000
> > [ 874.415638] TASK = ec7116d0[1907] 'send_eth_socket' THREAD: ec6aa000 CPU: 0
> > [ 874.496907] GPR00: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [ 874.597018] GPR08: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [ 874.697124] GPR16: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [ 874.797235] GPR24: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [ 874.899421] NIP [c000cc40] udelay+0x18/0x30
> > [ 874.949434] LR [00000000] (null)
> > [ 874.987986] Call Trace:
> > [ 875.017170] [efff3b50] [c00071b4] show_stack+0x78/0x18c (unreliable)
> > [ 875.093240] [efff3b90] [c00078c4] show_regs+0x200/0x2ec
> > [ 875.155763] [efff3bc0] [c00658d4] softlockup_tick+0x1dc/0x23c
> > [ 875.224534] [efff3bf0] [c003cc50] run_local_timers+0x1c/0x2c
> > [ 875.292265] [efff3c00] [c003cca4] update_process_times+0x44/0x80
> > [ 875.364164] [efff3c20] [c0059bc4] tick_sched_timer+0xd0/0x128
> > [ 875.432936] [efff3c50] [c004d8f0] __run_hrtimer+0x68/0x14c
> > [ 875.498584] [efff3c70] [c004efa4] hrtimer_interrupt+0x1d8/0x41c
> > [ 875.569437] [efff3cf0] [c000d8d8] timer_interrupt+0x1b4/0x238
> > [ 875.638211] [efff3d10] [c0009ac4] __ipipe_do_timer+0x44/0x54
> > [ 875.705941] [efff3d20] [c006d448] __ipipe_sync_stage+0x1d0/0x27c
> > [ 875.777839] [efff3d60] [c0009728] __ipipe_grab_timer+0x104/0x12c
> > [ 875.849736] [efff3d70] [c00129e0] __ipipe_ret_from_except+0x0/0xc
> > [ 875.922680] --- Exception: 901 at _raw_spin_lock+0x30/0x3c
> > [ 875.922684] LR = tpacket_rcv+0x264/0x570
> > [ 876.039367] [efff3e30] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [ 876.116479] [efff3e80] [c02c43b0] __netif_receive_skb+0x2b4/0x2f0
> > [ 876.189418] [efff3eb0] [c02c4fa0] netif_receive_skb+0x98/0xac
> > [ 876.258189] [efff3ee0] [c0292838] ingress_rx_default_dqrr+0x428/0x4b4
> > [ 876.335297] [efff3f10] [c02ac2a8] qman_poll_dqrr+0x1e0/0x284
> > [ 876.403025] [efff3f50] [c0294088] dpaa_eth_poll+0x34/0xd0
> > [ 876.467632] [efff3f70] [c02c5280] net_rx_action+0xc0/0x1e8
> > [ 876.533280] [efff3fa0] [c0035ab0] __do_softirq+0x138/0x210
> > [ 876.598926] [efff3ff0] [c00115e8] call_do_softirq+0x14/0x24
> > [ 876.665618] [ec6abab0] [c000480c] do_softirq+0xb4/0xec
> > [ 876.727097] --- Exception: ec6abbb0 at 0xec6abb70
> > [ 876.727101] LR = 0xec4e6c50
> > [ 876.820868] [ec6abad0] [c00357cc] irq_exit+0x60/0xb8 (unreliable)
> > [ 876.893814] [ec6abae0] [c0009b5c] __ipipe_do_IRQ+0x88/0xc0
> > [ 876.959459] [ec6abb00] [c006d468] __ipipe_sync_stage+0x1f0/0x27c
> > [ 877.031358] [ec6abb40] [c00095f4] __ipipe_handle_irq+0x1b8/0x1e8
> > [ 877.103256] [ec6abb70] [c00098dc] __ipipe_grab_irq+0x18c/0x1bc
> > [ 877.173069] [ec6abba0] [c00129e0] __ipipe_ret_from_except+0x0/0xc
> > [ 877.246012] --- Exception: 501 at _raw_spin_lock+0x14/0x3c
> > [ 877.246017] LR = tpacket_rcv+0x264/0x570
> > [ 877.362701] [ec6abc60] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [ 877.439819] [ec6abcb0] [c02c6238] dev_hard_start_xmit+0x164/0x414
> > [ 877.512758] [ec6abcf0] [c0325b94] packet_sendmsg+0x8c0/0x984
> > [ 877.580487] [ec6abd70] [c02b32f0] sock_sendmsg+0x90/0xb4
> > [ 877.644052] [ec6abe40] [c02b3ea8] sys_sendto+0xd0/0x114
> > [ 877.706575] [ec6abf10] [c02b522c] sys_socketcall+0x148/0x210
> > [ 877.774306] [ec6abf40] [c0011d0c] ret_from_syscall+0x0/0x3c
> > [ 877.840994] --- Exception: c01 at 0x48051f00
> > [ 877.840998] LR = 0x4808e030
> > [ 877.929553] Instruction dump:
> > [ 877.964988] 419cfff0 7c421378 4e800020 3d20c04c 800967e0 7c0301d6
> > 7d2c42a6 48000008
> > [ 878.057802] 7c210b78 7d6c42a6 7d695850 7f8b0040 419cfff0 7c421378
> > 4e800020 3d20c04a
> >
> > I do not completely understand this dump, but it looks like both the
> > receive direction (running in the context of a softirq) and my
> > transmitting application are blocked on the spinlock used in the
> > tpacket_rcv function:
> >
> > [ 876.039367] [efff3e30] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [ 876.116479] [efff3e80] [c02c43b0] __netif_receive_skb+0x2b4/0x2f0
> > [ 876.189418] [efff3eb0] [c02c4fa0] netif_receive_skb+0x98/0xac
> > [ 876.258189] [efff3ee0] [c0292838] ingress_rx_default_dqrr+0x428/0x4b4
> > [ 876.335297] [efff3f10] [c02ac2a8] qman_poll_dqrr+0x1e0/0x284
> > [ 876.403025] [efff3f50] [c0294088] dpaa_eth_poll+0x34/0xd0
> > [ 876.467632] [efff3f70] [c02c5280] net_rx_action+0xc0/0x1e8
> > [ 876.533280] [efff3fa0] [c0035ab0] __do_softirq+0x138/0x210
> > [ 876.598926] [efff3ff0] [c00115e8] call_do_softirq+0x14/0x24
> > [ 876.665618] [ec6abab0] [c000480c] do_softirq+0xb4/0xec
> >
> > and
> >
> > [ 877.362701] [ec6abc60] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [ 877.439819] [ec6abcb0] [c02c6238] dev_hard_start_xmit+0x164/0x414
> > [ 877.512758] [ec6abcf0] [c0325b94] packet_sendmsg+0x8c0/0x984
> > [ 877.580487] [ec6abd70] [c02b32f0] sock_sendmsg+0x90/0xb4
> > [ 877.644052] [ec6abe40] [c02b3ea8] sys_sendto+0xd0/0x114
> > [ 877.706575] [ec6abf10] [c02b522c] sys_socketcall+0x148/0x210
> > [ 877.774306] [ec6abf40] [c0011d0c] ret_from_syscall+0x0/0x3c
> >
> > Is my analysis correct?
> > If yes, can this have anything to do with the IPIPE mechanism we are
> > using (maybe a known issue?).
> >
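One possible reading of the two traces is that the send path was
interrupted at the spinlock in tpacket_rcv, and that the softirq run on
return from that interrupt then entered tpacket_rcv again and tried to
take the same lock on the same CPU. The toy model below only illustrates
that pattern in user space; it is an assumption about the trace, not a
diagnosis. `send_path` and `nested_receive` are hypothetical names, and a
non-blocking acquire stands in for a spinlock that would otherwise spin
forever.

```python
import threading


def nested_receive(lock):
    """Stand-in for the softirq receive path: try to take the same lock."""
    # A kernel spinlock would spin here; acquire(blocking=False) reports
    # the failure instead of hanging the example.
    got = lock.acquire(blocking=False)
    if got:
        lock.release()
    return got


def send_path(lock, interrupted):
    """Stand-in for the transmit path that takes the lock in tpacket_rcv.

    If 'interrupted' is True, the receive path runs while the lock is
    still held, as it would when a softirq preempts the holder on one CPU.
    """
    with lock:
        if interrupted:
            return nested_receive(lock)
    # Lock released: the receive path can now take it.
    return nested_receive(lock)


if __name__ == "__main__":
    print("nested while held:", send_path(threading.Lock(), True))
    print("after release:    ", send_path(threading.Lock(), False))
```

With a single CPU, the holder cannot run again until the nested context
returns, so a nested context that spins on the lock never makes progress.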
> > Any help would be much appreciated.
> >
> > Thanks,
> > Ronny
> >
>
> Hello
>
> I did a new test, this time with an older kernel (Linux version
> 2.6.34.6): the same tests were executed, but on a pure Linux
> build (no IPIPE included). The issue cannot be reproduced in
> this environment. My test builds keep running indefinitely.
>
> My next steps are:
> - Running the same test on 2.6.35.7 without IPIPE. This environment is
> currently building.
> - Include only IPIPE and no Xenomai and redo the test.
>
Could you try 2.6.36-ipipe as well, in case 2.6.35.7 without the pipeline
does not exhibit the issue? A number of changes went into the IRQ replay
code during this time frame, and 2.6.35 was in a state of flux in this
respect.
> Best regards
> Ronny
>
> _______________________________________________
> Adeos-main mailing list
> Adeos-main@domain.hid
> https://mail.gna.org/listinfo/adeos-main
--
Philippe.
Thread overview: 9+ messages
2011-07-02 21:33 [Adeos-main] Kernel blocked during send/receive raw ethernet packets Ronny Meeus
2011-07-04 8:06 ` Ronny Meeus
2011-07-04 8:20 ` Philippe Gerum [this message]
2011-07-04 11:42 ` Gilles Chanteperdrix
2011-07-04 20:04 ` Ronny Meeus
2011-07-04 20:09 ` Philippe Gerum
2011-07-04 20:13 ` Philippe Gerum
2011-07-05 8:45 ` Ronny Meeus
2011-07-05 20:11 ` Ronny Meeus