public inbox for linux-kernel@vger.kernel.org
* Processes spinning forever, apparently in lock_timer_base()?
@ 2007-08-01 22:39 Chuck Ebbert
From: Chuck Ebbert @ 2007-08-01 22:39 UTC
  To: linux-kernel; +Cc: Thomas Gleixner

Looks like the same problem with spinlock unfairness we've seen
elsewhere: it seems to be looping here? Or is everyone stuck
just waiting for writeout?

lock_timer_base():
        for (;;) {
                tvec_base_t *prelock_base = timer->base;
                base = tbase_get_base(prelock_base);
                if (likely(base != NULL)) {
                        spin_lock_irqsave(&base->lock, *flags);
                        if (likely(prelock_base == timer->base))
                                return base;
                        /* The timer has migrated to another CPU */
                        spin_unlock_irqrestore(&base->lock, *flags);
                }
                cpu_relax();
        }

The problem goes away completely if filesystems are mounted
*without* noatime. It has happened in 2.6.20 through 2.6.22.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=249563

Part of sysrq-t listing:

mysqld        D 000017c0  2196 23162   1562
       e383fcb8 00000082 61650954 000017c0 e383fc9c 00000000 c0407208 e383f000 
       a12b0434 00004d1d c6ed2c00 c6ed2d9c c200fa80 00000000 c0724640 f6c60540 
       c4ff3c70 00000508 00000286 c042ffcb e383fcc8 00014926 00000000 00000286 
Call Trace:
 [<c0407208>] do_IRQ+0xbd/0xd1
 [<c042ffcb>] lock_timer_base+0x19/0x35
 [<c04300df>] __mod_timer+0x9a/0xa4
 [<c060bb55>] schedule_timeout+0x70/0x8f
 [<c042fd37>] process_timeout+0x0/0x5
 [<c060bb50>] schedule_timeout+0x6b/0x8f
 [<c060b67c>] io_schedule_timeout+0x39/0x5d
 [<c0465eea>] congestion_wait+0x50/0x64
 [<c0438539>] autoremove_wake_function+0x0/0x35
 [<c04620e2>] balance_dirty_pages_ratelimited_nr+0x148/0x193
 [<c045e7fd>] generic_file_buffered_write+0x4c7/0x5d3


named         D 000017c0  2024  1454      1
       f722acb0 00000082 6165ed96 000017c0 c1523e80 c16f0c00 c16f20e0 f722a000 
       a12be87d 00004d1d f768ac00 f768ad9c c200fa80 00000000 00000000 f75bda80 
       c0407208 00000508 00000286 c042ffcb f722acc0 00020207 00000000 00000286 
Call Trace:
 [<c0407208>] do_IRQ+0xbd/0xd1
 [<c042ffcb>] lock_timer_base+0x19/0x35
 [<c04300df>] __mod_timer+0x9a/0xa4
 [<c060bb55>] schedule_timeout+0x70/0x8f
 [<c042fd37>] process_timeout+0x0/0x5
 [<c060bb50>] schedule_timeout+0x6b/0x8f
 [<c060b67c>] io_schedule_timeout+0x39/0x5d
 [<c0465eea>] congestion_wait+0x50/0x64
 [<c0438539>] autoremove_wake_function+0x0/0x35
 [<c04620e2>] balance_dirty_pages_ratelimited_nr+0x148/0x193
 [<c045e7fd>] generic_file_buffered_write+0x4c7/0x5d3


mysqld        D 000017c0  2196 23456   1562
       e9293cb8 00000082 616692ed 000017c0 e9293c9c 00000000 e9293cc8 e9293000 
       a12c8dd0 00004d1d c3d5ac00 c3d5ad9c c200fa80 00000000 c0724640 f6c60540 
       e9293d10 c07e1f00 00000286 c042ffcb e9293cc8 0002b57f 00000000 00000286 
Call Trace:
 [<c042ffcb>] lock_timer_base+0x19/0x35
 [<c04300df>] __mod_timer+0x9a/0xa4
 [<c060bb55>] schedule_timeout+0x70/0x8f
 [<c042fd37>] process_timeout+0x0/0x5
 [<c060bb50>] schedule_timeout+0x6b/0x8f
 [<c060b67c>] io_schedule_timeout+0x39/0x5d
 [<c0465eea>] congestion_wait+0x50/0x64
 [<c0438539>] autoremove_wake_function+0x0/0x35
 [<c04620e2>] balance_dirty_pages_ratelimited_nr+0x148/0x193
 [<c045e7fd>] generic_file_buffered_write+0x4c7/0x5d3

* Re: Processes spinning forever, apparently in lock_timer_base()?
@ 2007-08-03 20:14 Oleg Nesterov
From: Oleg Nesterov @ 2007-08-03 20:14 UTC
  To: Chuck Ebbert; +Cc: Andrew Morton, richard kennedy, linux-kernel

Chuck Ebbert wrote:
>
> Looks like the same problem with spinlock unfairness we've seen
> elsewhere: it seems to be looping here? Or is everyone stuck
> just waiting for writeout?
> 
> lock_timer_base():
>         for (;;) {
>                 tvec_base_t *prelock_base = timer->base;
>                 base = tbase_get_base(prelock_base);
>                 if (likely(base != NULL)) {
>                         spin_lock_irqsave(&base->lock, *flags);
>                         if (likely(prelock_base == timer->base))
>                                 return base;
>                         /* The timer has migrated to another CPU */
>                         spin_unlock_irqrestore(&base->lock, *flags);
>                 }
>                 cpu_relax();
>         }

I don't think there is an unfairness problem. We loop only if
timer->base changes between reading it and taking the lock. In other
words, there is no "lock + unlock + lock(same_lock)" sequence here;
each iteration takes a different lock.
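
To make the retry condition concrete, here is a minimal userspace sketch of
the same lock-then-revalidate pattern. It is not the kernel code: the names
base_sketch, timer_sketch, base_lock(), base_unlock(),
lock_timer_base_sketch() and migrate_timer_sketch() are all hypothetical, a
C11 atomic_flag stands in for the spinlock, and interrupt state is ignored.
The retries counter is an addition for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU timer base. */
struct base_sketch {
	atomic_flag lock;                     /* set while the lock is held */
};

/* Hypothetical stand-in for the timer; base names its current owner. */
struct timer_sketch {
	_Atomic(struct base_sketch *) base;
};

static void base_lock(struct base_sketch *b)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		;
}

static void base_unlock(struct base_sketch *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/*
 * Lock whichever base owns the timer and return it with the lock held.
 * A retry happens only if the owner changed between reading t->base and
 * acquiring that base's lock, so a retry never re-takes the same lock.
 */
static struct base_sketch *lock_timer_base_sketch(struct timer_sketch *t,
						  unsigned long *retries)
{
	for (;;) {
		struct base_sketch *prelock = atomic_load(&t->base);

		if (prelock != NULL) {
			base_lock(prelock);
			if (prelock == atomic_load(&t->base))
				return prelock;   /* still the owner */
			/* The timer moved to another base: drop the stale lock. */
			base_unlock(prelock);
		}
		if (retries != NULL)
			(*retries)++;
	}
}

/* Move the timer to new_base while holding the old base's lock. */
static void migrate_timer_sketch(struct timer_sketch *t,
				 struct base_sketch *new_base)
{
	assert(new_base != NULL);
	struct base_sketch *old = lock_timer_base_sketch(t, NULL);

	atomic_store(&t->base, new_base);
	base_unlock(old);
}
```

In this model, a timer that no other thread can reach never takes the retry
path: the first iteration either returns with the lock held or blocks inside
base_lock().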

And:

>  [<c0407208>] do_IRQ+0xbd/0xd1
>  [<c042ffcb>] lock_timer_base+0x19/0x35
>  [<c04300df>] __mod_timer+0x9a/0xa4
>  [<c060bb55>] schedule_timeout+0x70/0x8f
>
> ...
>
>  [<c0407208>] do_IRQ+0xbd/0xd1
>  [<c042ffcb>] lock_timer_base+0x19/0x35
>  [<c04300df>] __mod_timer+0x9a/0xa4
>  [<c060bb55>] schedule_timeout+0x70/0x8f
>
> ...
>
>  [<c042ffcb>] lock_timer_base+0x19/0x35
>  [<c04300df>] __mod_timer+0x9a/0xa4
>  [<c060bb55>] schedule_timeout+0x70/0x8f

All traces start from schedule_timeout()->__mod_timer(). This timer
is local to the caller's stack; nobody else can see it, so its ->base
can't be NULL and can't change.

This means that lock_timer_base() can't be looping; it must be failing
to take tvec_t_base_s.lock instead. But in that case the trace should
look different. Strange.

Oleg.

