linux-serial.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Excessive logging causing rcu_sched_state stall (wait_for_xmitr 8250.c)
@ 2012-05-16  0:02 helpmelearn
  0 siblings, 0 replies; 3+ messages in thread
From: helpmelearn @ 2012-05-16  0:02 UTC (permalink / raw)
  To: linux-serial


Hi,
It's been more than a week since I am stuck with this. I am relatively a
newbie venturing into linux
kernel, so please be gentle.

I have a non-tained 2.6.35.14 kernel that needs to output a lot using the
serial console. To reproduce I 
simply run klogd -2 -c 7 and this causes rcu_sched_state to detect stall
with a trace such as pasted 
below. 

Please note that I had enabled CONFIG_PROVE_RCU, CONFIG_RCU_TRACE and
related flags as well as 
bumped RCU_STALL_RAT_DELAY to 20 while debugging. This makes RCU wait 15020
jiffies until it 
complains. Otherwise, it complains in 10002 jiffies (with default config).

The trace points to the wait_for_xmitr() loop in 8250.c (from another run):

(gdb) list *(wait_for_xmitr+0x56)
0x5ae is in wait_for_xmitr (drivers/serial/8250.c:1863).
1858    {
1859            unsigned int status, tmout = 10000;
1860
1861            /* Wait up to 10ms for the character(s) to be sent. */
1862            do {
1863                    status = serial_in(up, UART_LSR);
1864
1865                    up->lsr_saved_flags |= status & LSR_SAVE_FLAGS;
1866
1867                    if (--tmout == 0)
(gdb)

AFAICT, this piece of code is run with interrupts disabled. So is the issue
here simply that too much 
output is causing wait_for_xmitr() loop to run for longer than
rcu_sched_state() is happy with?

In rcu/stallwarn.txt, Paul notes that rcu_sched_state() will get unhappy
should the CPU loop with 
interrupts disabled.

IF this is indeed the case, what would be a potential fix for this? Any
help/guidance much appreciated. 
Thank you!

[  346.664471] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 5}
(detected by 0, t=15020 jiffies)
[  346.664477] sending NMI to all CPUs:
[  346.666532] NMI backtrace for cpu 5
[  346.666534] CPU 5
[  346.666535] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler
iscsi_scst crc32c_intel iptable_filter xt_NOTRACK iptable_nat nf_nat
nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_raw ip_tables x_tables
bonding qla2x00tgt qla2xxx scsi_transport_fc scst_vdisk scst vtms softdog
coretemp hwmon dm_mod video backlight output serio_raw rtc_cmos rtc_core
rtc_lib button ixgbe i2c_i801 i2c_core uhci_hcd pcspkr
[  346.666559]
[  346.666562] Pid: 0, comm: swapper Not tainted 2.6.35-14EIsmp
X8DTH/X8DTH-i/6/iF/6F
[  346.666565] RIP: 0010:[<ffffffff812939b4>]  [<ffffffff812939b4>]
io_serial_in+0x10/0x15
[  346.666571] RSP: 0018:ffff880144223a70  EFLAGS: 00000002
[  346.666574] RAX: 0000000000000900 RBX: ffffffff82a5c8c0 RCX:
0000000000000000
[  346.666576] RDX: 00000000000003fd RSI: 0000000000000005 RDI:
ffffffff82a5c8c0
[  346.666579] RBP: ffff880144223a70 R08: 0000000098a41ff0 R09:
ffffffff81293831
[  346.666581] R10: ffffffff81d4db2b R11: 0000000000000000 R12:
0000000000000000
[  346.666583] R13: 00000000000025af R14: 0000000000000020 R15:
000000000000003e
[  346.666586] FS:  0000000000000000(0000) GS:ffff880144220000(0000)
knlGS:0000000000000000
[  346.666589] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  346.666591] CR2: 00007fff5f13d464 CR3: 0000000074531000 CR4:
00000000000006e0
[  346.666594] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  346.666596] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[  346.666599] Process swapper (pid: 0, threadinfo ffff88013fb44000, task
ffff88023fb1e000)
[  346.666600] Stack:
[  346.666602]  ffff880144223aa0 ffffffff81291b6d ffffffff82a5c8c0
000000000000000a
[  346.666605] <0> ffffffff81291bd6 ffffffff82a5c8c0 ffff880144223ac0
ffffffff81291bf5
[  346.666608] <0> ffffffff81d4dc59 000000000000003e ffff880144223b00
ffffffff8128e79b
[  346.666612] Call Trace:
[  346.666614]  <IRQ>
[  346.666617]  [<ffffffff81291b6d>] wait_for_xmitr+0x38/0xa1
[  346.666621]  [<ffffffff81291bd6>] ? serial8250_console_putchar+0x0/0x34
[  346.666624]  [<ffffffff81291bf5>] serial8250_console_putchar+0x1f/0x34
[  346.666627]  [<ffffffff8128e79b>] uart_console_write+0x41/0x52
[  346.666631]  [<ffffffff8129388c>] serial8250_console_write+0xb2/0x116
[  346.666635]  [<ffffffff8103e4dd>] __call_console_drivers+0x67/0x79
[  346.666638]  [<ffffffff8103e54a>] _call_console_drivers+0x5b/0x5f
[  346.666641]  [<ffffffff8103e937>] release_console_sem+0x12a/0x1c5
[  346.666644]  [<ffffffff8103efaf>] vprintk+0x373/0x3a3
[  346.666648]  [<ffffffff8103f046>] ? printk+0x67/0x69
[  346.666651]  [<ffffffff8103f046>] printk+0x67/0x69
[  346.666655]  [<ffffffff81214645>] ? __delay+0xa/0xc
[  346.666659]  [<ffffffff8105b3ef>] ? up+0xf/0x39
[  346.666663]  [<ffffffff8106597e>] ? trace_hardirqs_on+0xd/0xf
[  346.666692]  [<ffffffffa011b2d2>] ? debug_print_prefix+0x135/0x146 [scst]
[  346.666706]  [<ffffffffa00f6ad3>] scst_process_redirect_cmd+0x97/0x323
[scst]
[  346.666720]  [<ffffffffa00f8807>] scst_cmd_done_local+0x227/0x280 [scst]
[  346.666726]  [<ffffffffa0186d61>] blockio_endio+0xcf/0xee [scst_vdisk]
[  346.666733]  [<ffffffffa0085324>] ? end_io_acct+0x0/0x147 [dm_mod]
[  346.666736]  [<ffffffff811026a7>] bio_endio+0x26/0x28
[  346.666742]  [<ffffffffa00855c5>] dec_pending+0x15a/0x163 [dm_mod]
[  346.666748]  [<ffffffffa008576a>] clone_endio+0xa1/0xae [dm_mod]
[  346.666752]  [<ffffffff811026a7>] bio_endio+0x26/0x28
[  346.666759]  [<ffffffffa00bff51>] vtms_softirq_done+0xeb/0x107 [vtms]
[  346.666764]  [<ffffffff8120127f>] blk_done_softirq+0x6d/0x7e
[  346.666767]  [<ffffffff810439fc>] __do_softirq+0xe9/0x1c7
[  346.666771]  [<ffffffff8100384c>] call_softirq+0x1c/0x34
[  346.666774]  [<ffffffff81004e15>] do_softirq+0x38/0x83
[  346.666777]  [<ffffffff81043911>] irq_exit+0x45/0x47
[  346.666780]  [<ffffffff81017c43>]
smp_call_function_single_interrupt+0x2f/0x31
[  346.666784]  [<ffffffff810034b3>]
call_function_single_interrupt+0x13/0x20
[  346.666786]  <EOI>

-- 
View this message in context: http://old.nabble.com/Excessive-logging-causing-rcu_sched_state-stall-%28wait_for_xmitr-8250.c%29-tp33854911p33854911.html
Sent from the linux-serial mailing list archive at Nabble.com.


^ permalink raw reply	[flat|nested] 3+ messages in thread
* Excessive logging causing rcu_sched_state stall (wait_for_xmitr 8250.c)
@ 2012-05-16  0:41 joe shmoe
  2012-05-16 15:12 ` Greg KH
  0 siblings, 1 reply; 3+ messages in thread
From: joe shmoe @ 2012-05-16  0:41 UTC (permalink / raw)
  To: linux-serial

Hi, 
It's been more than a week since I am stuck with this. I am relatively a newbie venturing into linux 
kernel, so please be gentle. 

I have a non-tained 2.6.35.14 kernel that needs to output a lot using the serial console. To reproduce I 
simply run klogd -2 -c 7 and this causes rcu_sched_state to detect stall with a trace such as pasted 
below. 


Please note that I had enabled CONFIG_PROVE_RCU, CONFIG_RCU_TRACE and related flags as well as 
bumped RCU_STALL_RAT_DELAY to 20 while debugging. This makes RCU wait 15020 jiffies until it 
complains. Otherwise, it complains in 10002 jiffies (with default config). 

The trace points to the wait_for_xmitr() loop in 8250.c (from another run): 

(gdb) list *(wait_for_xmitr+0x56) 
0x5ae is in wait_for_xmitr (drivers/serial/8250.c:1863). 
1858    { 
1859            unsigned int status, tmout = 10000; 
1860 
1861            /* Wait up to 10ms for the character(s) to be sent. */ 
1862            do { 
1863                    status = serial_in(up, UART_LSR); 
1864 
1865                    up->lsr_saved_flags |= status & LSR_SAVE_FLAGS; 
1866 
1867                    if (--tmout == 0) 
(gdb) 

AFAICT, this piece of code is run with interrupts disabled. So is the issue here simply that too much 
output is causing wait_for_xmitr() loop to run for longer than rcu_sched_state() is happy with? 

In rcu/stallwarn.txt, Paul notes that rcu_sched_state() will get unhappy should the CPU loop with 
interrupts disabled. 

IF this is indeed the case, what would be a potential fix for this? Any help/guidance much appreciated. 
Thank you! 

[  346.664471] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 5} (detected by 0, t=15020 jiffies) 
[  346.664477] sending NMI to all CPUs: 
[  346.666532] NMI backtrace for cpu 5 
[  346.666534] CPU 5 
[  346.666535] Modules linked in: ipmi_devintf ipmi_si 
ipmi_msghandler iscsi_scst crc32c_intel iptable_filter xt_NOTRACK 
iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 
iptable_raw ip_tables x_tables bonding qla2x00tgt qla2xxx 
scsi_transport_fc scst_vdisk scst vtms softdog coretemp hwmon dm_mod 
video backlight output serio_raw rtc_cmos rtc_core rtc_lib button ixgbe 
i2c_i801 i2c_core uhci_hcd pcspkr 
[  346.666559] 
[  346.666562] Pid: 0, comm: swapper Not tainted 2.6.35-14EIsmp X8DTH/X8DTH-i/6/iF/6F 
[  346.666565] RIP: 0010:[<ffffffff812939b4>]  [<ffffffff812939b4>] io_serial_in+0x10/0x15 
[  346.666571] RSP: 0018:ffff880144223a70  EFLAGS: 00000002 
[  346.666574] RAX: 0000000000000900 RBX: ffffffff82a5c8c0 RCX: 0000000000000000 
[  346.666576] RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff82a5c8c0 
[  346.666579] RBP: ffff880144223a70 R08: 0000000098a41ff0 R09: ffffffff81293831 
[  346.666581] R10: ffffffff81d4db2b R11: 0000000000000000 R12: 0000000000000000 
[  346.666583] R13: 00000000000025af R14: 0000000000000020 R15: 000000000000003e 
[  346.666586] FS:  0000000000000000(0000) GS:ffff880144220000(0000) knlGS:0000000000000000 
[  346.666589] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b 
[  346.666591] CR2: 00007fff5f13d464 CR3: 0000000074531000 CR4: 00000000000006e0 
[  346.666594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 
[  346.666596] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 
[  346.666599] Process swapper (pid: 0, threadinfo ffff88013fb44000, task ffff88023fb1e000) 
[  346.666600] Stack: 
[  346.666602]  ffff880144223aa0 ffffffff81291b6d ffffffff82a5c8c0 000000000000000a 
[  346.666605] <0> ffffffff81291bd6 ffffffff82a5c8c0 ffff880144223ac0 ffffffff81291bf5 
[  346.666608] <0> ffffffff81d4dc59 000000000000003e ffff880144223b00 ffffffff8128e79b 
[  346.666612] Call Trace: 
[  346.666614]  <IRQ> 
[  346.666617]  [<ffffffff81291b6d>] wait_for_xmitr+0x38/0xa1 
[  346.666621]  [<ffffffff81291bd6>] ? serial8250_console_putchar+0x0/0x34 
[  346.666624]  [<ffffffff81291bf5>] serial8250_console_putchar+0x1f/0x34 
[  346.666627]  [<ffffffff8128e79b>] uart_console_write+0x41/0x52 
[  346.666631]  [<ffffffff8129388c>] serial8250_console_write+0xb2/0x116 
[  346.666635]  [<ffffffff8103e4dd>] __call_console_drivers+0x67/0x79 
[  346.666638]  [<ffffffff8103e54a>] _call_console_drivers+0x5b/0x5f 
[  346.666641]  [<ffffffff8103e937>] release_console_sem+0x12a/0x1c5 
[  346.666644]  [<ffffffff8103efaf>] vprintk+0x373/0x3a3 
[  346.666648]  [<ffffffff8103f046>] ? printk+0x67/0x69 
[  346.666651]  [<ffffffff8103f046>] printk+0x67/0x69 
[  346.666655]  [<ffffffff81214645>] ? __delay+0xa/0xc 
[  346.666659]  [<ffffffff8105b3ef>] ? up+0xf/0x39 
[  346.666663]  [<ffffffff8106597e>] ? trace_hardirqs_on+0xd/0xf 
[  346.666692]  [<ffffffffa011b2d2>] ? debug_print_prefix+0x135/0x146 [scst] 
[  346.666706]  [<ffffffffa00f6ad3>] scst_process_redirect_cmd+0x97/0x323 [scst] 
[  346.666720]  [<ffffffffa00f8807>] scst_cmd_done_local+0x227/0x280 [scst] 
[  346.666726]  [<ffffffffa0186d61>] blockio_endio+0xcf/0xee [scst_vdisk] 
[  346.666733]  [<ffffffffa0085324>] ? end_io_acct+0x0/0x147 [dm_mod] 
[  346.666736]  [<ffffffff811026a7>] bio_endio+0x26/0x28 
[  346.666742]  [<ffffffffa00855c5>] dec_pending+0x15a/0x163 [dm_mod] 
[  346.666748]  [<ffffffffa008576a>] clone_endio+0xa1/0xae [dm_mod] 
[  346.666752]  [<ffffffff811026a7>] bio_endio+0x26/0x28 
[  346.666759]  [<ffffffffa00bff51>] vtms_softirq_done+0xeb/0x107 [vtms] 
[  346.666764]  [<ffffffff8120127f>] blk_done_softirq+0x6d/0x7e 
[  346.666767]  [<ffffffff810439fc>] __do_softirq+0xe9/0x1c7 
[  346.666771]  [<ffffffff8100384c>] call_softirq+0x1c/0x34 
[  346.666774]  [<ffffffff81004e15>] do_softirq+0x38/0x83 
[  346.666777]  [<ffffffff81043911>] irq_exit+0x45/0x47 
[  346.666780]  [<ffffffff81017c43>] smp_call_function_single_interrupt+0x2f/0x31 
[  346.666784]  [<ffffffff810034b3>] call_function_single_interrupt+0x13/0x20 
[  346.666786]  <EOI> 
--
To unsubscribe from this list: send the line "unsubscribe linux-serial" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-05-16 15:12 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-16  0:02 Excessive logging causing rcu_sched_state stall (wait_for_xmitr 8250.c) helpmelearn
  -- strict thread matches above, loose matches on Subject: below --
2012-05-16  0:41 joe shmoe
2012-05-16 15:12 ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).