From: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Anant Nitya <kernel@prachanda.info>,
linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>
Subject: Re: [BUG] local_softirq_pending storm
Date: Tue, 22 May 2007 21:59:42 +0200 [thread overview]
Message-ID: <46534BAE.2010000@googlemail.com> (raw)
In-Reply-To: <6bffcb0e0705221203s1ba21ed3j641e91036859db7d@mail.gmail.com>
Michal Piotrowski napisał(a):
> Hi Thomas,
>
> On 20/05/07, Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Sun, 2007-05-20 at 02:53 +0530, Anant Nitya wrote:
>> > > 1 == TASK_INTERRUPTIBLE, so we know that ksoftirqd was not woken
>> up. At
>> > > least it is not a scheduler problem.
>> > >
>> > > I work out a more complex debug patch and pester you to test once I'm
>> > > done.
>> > No problem :)
>>
>> You asked for it :)
>>
>> Please patch 2.6.22-rc2 with
>>
>> http://tglx.de/projects/hrtimers/2.6.22-rc2/patch-2.6.22-rc2-hrt2.patch
>> and
>> http://www.tglx.de/private/tglx/ht-debug/tracer.diff
>>
>> Compile it with the config
>>
>> http://www.tglx.de/private/tglx/ht-debug/config.debug
>>
>> You should find something like:
>>
>> ( swapper-0 |#0): new 67173 us user-latency.
>>
>> along with the familiar "NOHZ ......" message in your log file.
>>
>> Once that happened please do:
>>
>> $ cat /proc/latency_trace >trace.txt
>>
>> compress it and send it to me along with the full dmesg output or put
>> both up to some place, where I can download it.
>>
>> Michal,
>>
>> IIRC you encountered the same P4/HT related wreckage. Can you do the
>> same ?
>
> Good news - I can't reproduce this bug. It's time to remove
> "Subject : 2.6.21-git4 BUG: soft lockup detected on CPU#1!
> References : http://lkml.org/lkml/2007/5/2/511
> Submitter : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
> Handled-By : Thomas Gleixner <tglx@linutronix.de>
> Status : problem is being debugged"
> from KR.
>
> Bad news - I hit a bug in 2.6.22-rc2-hrt3. Bug symptoms:
> - X hangs (keyboard, mouse, sound etc.)
> - only Magic SysRq works
>
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-rc2-hrt3/hrt-config
>
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-rc2-hrt3/hrt-console.log
>
>
> Now I'm trying to apply hrtimers debug patch ontop 2.6.22-rc2-hrt3
> http://lkml.org/lkml/2007/3/23/106
>
It _almost_ works
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-rc2-hrt3/hrtimers_debug.patch
Hmmm..
[ 135.206505] SysRq : Show Pending Timers
[ 135.210476] Timer List Version: v0.3
[ 135.214102] HRTIMER_MAX_CLOCK_BASES: 2
[ 135.217899] now at 112955954742 nsecs
[ 135.221651]
[ 135.221662] cpu: 0
[ 135.225246] clock 0:
[ 135.227588] .index: 0
[ 135.230545] .resolution: 1 nsecs
[ 135.234010] .get_time: ktime_get_real
[ 135.238151] .offset: 1179862680097808665 nsecs
[ 135.243207] active timers:
[ 135.245992] clock 1:
[ 135.248316] .index: 1
[ 135.251246] .resolution: 1 nsecs
[ 135.254722] .get_time: ktime_get
[ 135.258473] .offset: 0 nsecs
[ 135.261923] active timers:
[ 135.264683] #0: hardirq_stack, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
[ 135.273913] # expires at 112952000000 nsecs [in 18446744073705596874 nsecs]
[ 135.281194] #1: hardirq_stack, hrtimer_wakeup, S:01, futex_wait, automount/2293
[ 135.289057] # expires at 113902202172 nsecs [in 946247430 nsecs]
[ 135.295233] #2: hardirq_stack, hrtimer_wakeup, S:01, do_nanosleep, crond/2373
[ 135.302973] # expires at 121028637388 nsecs [in 8072682646 nsecs]
[ 135.309279] #3: hardirq_stack, it_real_fn, S:01, do_setitimer, syslogd/2171
[ 135.316701] # expires at 121465401717 nsecs [in 8509446975 nsecs]
[ 135.323035] .expires_next : 112952000000 nsecs
[ 135.327921] .exp_prev : 112951000000 nsecs
[ 135.332786] last expires_next stacktrace:
[ 135.336944] update_cpu_base_expires_next
[ 135.341188] hrtimer_interrupt
[ 135.344430] smp_apic_timer_interrupt
[ 135.348310] apic_timer_interrupt
[ 135.351845] __mcount
[ 135.354327] mcount
[ 135.356627] permission
[ 135.359298] vfs_permission
[ 135.362305] __link_path_walk
[ 135.365497] link_path_walk
[ 135.368486] path_walk
[ 135.371044] do_path_lookup
[ 135.374086] __user_walk_fd
[ 135.377122] vfs_stat_fd
[ 135.379905] vfs_stat
[ 135.382385] sys_stat64
[ 135.385065] syscall_call
[ 135.387909] <3>BUG: sleeping function called from invalid context at kernel/mutex.c:86
[ 135.396175] in_atomic():1, irqs_disabled():1
[ 135.400541] 2 locks held by readahead/2442:
[ 135.404783] #0: (&serio->lock){++..}, at: [<c02dbf17>] serio_interrupt+0x22/0x79
[ 135.412623] #1: (sysrq_key_table_lock){+...}, at: [<c027a4af>] __handle_sysrq+0x1b/0x10c
[ 135.421208] irq event stamp: 3249062
[ 135.424856] hardirqs last enabled at (3249061): [<c0181b67>] kmem_cache_free+0x97/0xa1
[ 135.433050] hardirqs last disabled at (3249062): [<c0104bd4>] common_interrupt+0x24/0x34
[ 135.441350] softirqs last enabled at (3248980): [<c012a418>] __do_softirq+0xe8/0xef
[ 135.449261] softirqs last disabled at (3248975): [<c0106df7>] do_softirq+0x69/0xd6
[ 135.457049] [<c010526c>] show_trace_log_lvl+0x35/0x54
[ 135.462312] [<c0106027>] show_trace+0x2c/0x2e
[ 135.466876] [<c01060d0>] dump_stack+0x29/0x2b
[ 135.471423] [<c011de0c>] __might_sleep+0xe0/0xe2
[ 135.476221] [<c0360257>] mutex_lock+0x22/0x2e
[ 135.480767] [<c014eb96>] lookup_module_symbol_name+0x1a/0xc4
[ 135.486739] [<c014f514>] lookup_symbol_name+0x62/0x69
[ 135.492021] [<c013eb45>] print_name_offset+0x29/0x81
[ 135.497200] [<c013f4c5>] timer_list_show+0x657/0xc55
[ 135.502368] [<c013fae2>] sysrq_timer_list_show+0x1f/0x21
[ 135.507859] [<c027a5fd>] sysrq_handle_show_timers+0xd/0xf
[ 135.513451] [<c027a525>] __handle_sysrq+0x91/0x10c
[ 135.518507] [<c027a79e>] handle_sysrq+0x37/0x39
[ 135.523296] [<c0274b00>] kbd_event+0x335/0x58d
[ 135.527921] [<c02df806>] input_event+0x43e/0x460
[ 135.532718] [<c02e3766>] atkbd_interrupt+0x467/0x58d
[ 135.537887] [<c02dbf37>] serio_interrupt+0x42/0x79
[ 135.542865] [<c02dceed>] i8042_interrupt+0x1ff/0x210
[ 135.548130] [<c0160963>] handle_IRQ_event+0x24/0x54
[ 135.553187] [<c0161b2a>] handle_edge_irq+0xd7/0x11f
[ 135.558253] [<c0106f46>] do_IRQ+0xe2/0x10c
[ 135.562544] [<c0104bde>] common_interrupt+0x2e/0x34
[ 135.567675] [<c0116134>] mcount+0x14/0x18
[ 135.571858] [<c019b9ab>] mntput_no_expire+0xd/0x89
[ 135.576845] [<c018cd6a>] path_release+0x2f/0x33
[ 135.581645] [<c01893f4>] vfs_stat_fd+0x4c/0x55
[ 135.586276] [<c01894c9>] vfs_stat+0x25/0x27
[ 135.590660] [<c01894e9>] sys_stat64+0x1e/0x37
[ 135.595215] [<c01041f5>] syscall_call+0x7/0xb
[ 135.599752] =======================
[ 135.603365] ---------------------------
[ 135.607229] | preempt count: 00010000 ]
[ 135.611146] | 0-level deep critical section nesting:
[ 135.616202] ----------------------------------------
[ 135.621258]
[ 135.795930] ( readahead-2442 |#0): new 112512428 us user-latency.
[ 135.802591] stopped custom tracer.
[ 135.806058] BUG: at kernel/mutex.c:132 __mutex_lock_common()
[ 135.811824] [<c010526c>] show_trace_log_lvl+0x35/0x54
[ 135.817035] [<c0106027>] show_trace+0x2c/0x2e
[ 135.821557] [<c01060d0>] dump_stack+0x29/0x2b
[ 135.826078] [<c035fff2>] __mutex_lock_slowpath+0x64/0x2a7
[ 135.831714] [<c036025e>] mutex_lock+0x29/0x2e
[ 135.836277] [<c014eb96>] lookup_module_symbol_name+0x1a/0xc4
[ 135.842121] [<c014f514>] lookup_symbol_name+0x62/0x69
[ 135.847342] [<c013eb45>] print_name_offset+0x29/0x81
[ 135.852478] [<c013f4c5>] timer_list_show+0x657/0xc55
[ 135.857611] [<c013fae2>] sysrq_timer_list_show+0x1f/0x21
[ 135.863128] [<c027a5fd>] sysrq_handle_show_timers+0xd/0xf
[ 135.868744] [<c027a525>] __handle_sysrq+0x91/0x10c
[ 135.873689] [<c027a79e>] handle_sysrq+0x37/0x39
[ 135.878383] [<c0274b00>] kbd_event+0x335/0x58d
[ 135.882991] [<c02df806>] input_event+0x43e/0x460
[ 135.887806] [<c02e3766>] atkbd_interrupt+0x467/0x58d
[ 135.892948] [<c02dbf37>] serio_interrupt+0x42/0x79
[ 135.897918] [<c02dceed>] i8042_interrupt+0x1ff/0x210
[ 135.903053] [<c0160963>] handle_IRQ_event+0x24/0x54
[ 135.908120] [<c0161b2a>] handle_edge_irq+0xd7/0x11f
[ 135.913176] [<c0106f46>] do_IRQ+0xe2/0x10c
[ 135.917449] [<c0104bde>] common_interrupt+0x2e/0x34
[ 135.922502] [<c0116134>] mcount+0x14/0x18
[ 135.926696] [<c019b9ab>] mntput_no_expire+0xd/0x89
[ 135.931657] [<c018cd6a>] path_release+0x2f/0x33
[ 135.936378] [<c01893f4>] vfs_stat_fd+0x4c/0x55
[ 135.941001] [<c01894c9>] vfs_stat+0x25/0x27
[ 135.945376] [<c01894e9>] sys_stat64+0x1e/0x37
[ 135.949931] [<c01041f5>] syscall_call+0x7/0xb
[ 135.954467] =======================
[ 135.958091] ---------------------------
[ 135.962015] | preempt count: 00010000 ]
[ 135.965905] | 0-level deep critical section nesting:
[ 135.970927] ----------------------------------------
[ 135.975931]
[ 135.977468] <ffffffff>
[ 135.979865]
[ 135.981396] .hres_active : 1
[ 135.984757] .nr_events : 91663
[ 135.988465] .nohz_mode : 2
[ 135.991820] .idle_tick : 112686000000 nsecs
[ 135.996651] .tick_stopped : 0
[ 136.000016] .idle_jiffies : 4294779982
[ 136.004190] .idle_calls : 30792
[ 136.007882] .idle_sleeps : 3020
[ 136.011486] .idle_entrytime : 112932690671 nsecs
[ 136.016335] .idle_sleeptime : 18007304053 nsecs
[ 136.021063] .last_jiffies : 4294780229
[ 136.025220] .next_jiffies : 4294780556
[ 136.029352] .idle_expires : 112855000000 nsecs
[ 136.034167] jiffies: 4294780248
[ 136.037331]
[ 136.037336] cpu: 1
[ 136.040884] clock 0:
[ 136.043182] .index: 0
[ 136.046087] .resolution: 1 nsecs
[ 136.049535] .get_time: ktime_get_real
[ 136.053739] .offset: 1179862680097808665 nsecs
[ 136.058751] active timers:
[ 136.061501] clock 1:
[ 136.063877] .index: 1
[ 136.066781] .resolution: 1 nsecs
[ 136.070229] .get_time: ktime_get
[ 136.073922] .offset: 0 nsecs
[ 136.077465] active timers:
[ 136.080293] #0: hardirq_stack, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
[ 136.089309] # expires at 113828250000 nsecs [in 872295258 nsecs]
[ 136.095654] #1: hardirq_stack, hrtimer_wakeup, S:01, futex_wait, automount/2294
[ 136.103356] # expires at 113902196736 nsecs [in 946241994 nsecs]
[ 136.109485] #2: hardirq_stack, hrtimer_wakeup, S:01, do_nanosleep, smartd/2640
[ 136.117101] # expires at 1897330250564 nsecs [in 1784374295822 nsecs]
[ 136.123723] .expires_next : 113871250000 nsecs
[ 136.128726] .exp_prev : 113875250000 nsecs
[ 136.133558] last expires_next stacktrace:
[ 136.137662] update_cpu_base_expires_next
[ 136.141838] hrtimer_interrupt
[ 136.145071] smp_apic_timer_interrupt
[ 136.148935] apic_timer_interrupt
[ 136.152419] cpu_idle
[ 136.154890] start_secondary
[ 136.157967] <00000000>
[ 136.160639] <ffffffff>
[ 136.163267]
[ 136.164789] .hres_active : 1
[ 136.168143] .nr_events : 85123
[ 136.171981] .nohz_mode : 2
[ 136.175344] .idle_tick : 112682250000 nsecs
[ 136.180281] .tick_stopped : 0
[ 136.183737] .idle_jiffies : 4294779978
[ 136.187974] .idle_calls : 44643
[ 136.191708] .idle_sleeps : 1384
[ 136.195424] .idle_entrytime : 113942257992 nsecs
[ 136.200421] .idle_sleeptime : 18596870874 nsecs
[ 136.205158] .last_jiffies : 4294780248
[ 136.209316] .next_jiffies : 4294780261
[ 136.213595] .idle_expires : 112778000000 nsecs
[ 136.218451] jiffies: 4294780248
[ 136.221650]
[ 136.223163]
[ 136.223168] Tick Device: mode: 1
[ 136.228383] Clock Event Device: pit
[ 136.231928] max_delta_ns: 27461866
[ 136.235636] min_delta_ns: 12571
[ 136.239172] mult: 5124677
[ 136.242769] shift: 32
[ 136.245965] mode: 3
[ 136.249095] next_event: 9223372036854775807 nsecs
[ 136.254282] set_next_event: pit_next_event
[ 136.258613] set_mode: init_pit_timer
[ 136.262908] event_handler: tick_handle_oneshot_broadcast
[ 136.268510] tick_broadcast_mask: 00000000
[ 136.272583] tick_broadcast_oneshot_mask: 00000000
[ 136.277310]
[ 136.278840]
[ 136.278845] Tick Device: mode: 1
[ 136.284086] Clock Event Device: lapic
[ 136.287803] max_delta_ns: 671411430
[ 136.291615] min_delta_ns: 1200
[ 136.294977] mult: 53661274
[ 136.298668] shift: 32
[ 136.301902] mode: 3
[ 136.304971] next_event: 112952000000 nsecs
[ 136.309638] set_next_event: lapic_next_event
[ 136.314177] set_mode: lapic_timer_setup
[ 136.318680] event_handler: hrtimer_interrupt
[ 136.323236]
[ 136.323241] Tick Device: mode: 1
[ 136.328404] Clock Event Device: lapic
[ 136.332191] max_delta_ns: 671411430
[ 136.335977] min_delta_ns: 1200
[ 136.339367] mult: 53661274
[ 136.343075] shift: 32
[ 136.346268] mode: 3
[ 136.349375] next_event: 114098250000 nsecs
[ 136.353914] set_next_event: lapic_next_event
[ 136.358357] set_mode: lapic_timer_setup
[ 136.362973] event_handler: hrtimer_interrupt
Regards,
Michal
--
Michal K. K. Piotrowski
Kernel Monkeys
(http://kernel.wikidot.com/start)
next prev parent reply other threads:[~2007-05-22 20:00 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-05-09 14:12 [BUG] local_softirq_pending storm Anant Nitya
2007-05-09 16:31 ` Thomas Gleixner
2007-05-10 11:53 ` Anant Nitya
2007-05-10 21:58 ` Thomas Gleixner
2007-05-17 6:41 ` Anant Nitya
2007-05-18 13:01 ` Thomas Gleixner
2007-05-19 4:30 ` Anant Nitya
2007-05-19 9:55 ` Anant Nitya
2007-05-19 19:11 ` Thomas Gleixner
2007-05-19 21:23 ` Anant Nitya
2007-05-20 21:43 ` Thomas Gleixner
2007-05-21 6:22 ` Anant Nitya
2007-05-21 9:01 ` Thomas Gleixner
2007-05-21 10:02 ` Anant Nitya
2007-05-21 18:45 ` Anant Nitya
2007-05-22 19:03 ` Michal Piotrowski
2007-05-22 19:59 ` Michal Piotrowski [this message]
2007-05-22 20:10 ` Ingo Molnar
2007-05-22 20:23 ` Michal Piotrowski
2007-05-23 17:00 ` Chuck Ebbert
2007-05-23 18:38 ` Chuck Ebbert
2007-05-23 23:24 ` Anant Nitya
2007-05-23 23:24 ` Anant Nitya
2007-05-24 7:45 ` Ingo Molnar
2007-05-20 10:18 ` Heiko Carstens
2007-05-20 13:52 ` Thomas Gleixner
2007-05-20 18:36 ` Heiko Carstens
[not found] ` <20070520191800.GA14225@osiris.ibm.com>
[not found] ` <1179689389.6570.1.camel@chaos>
[not found] ` <20070521200428.GA9855@osiris.ibm.com>
2007-05-23 20:54 ` Mikulas Patocka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=46534BAE.2010000@googlemail.com \
--to=michal.k.k.piotrowski@gmail.com \
--cc=kernel@prachanda.info \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.