From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>,
Jarek Poplawski <jarkao2@o2.pl>,
Miklos Szeredi <miklos@szeredi.hu>,
chris@atlee.ca, linux-kernel@vger.kernel.org, tglx@linutronix.de,
akpm@linux-foundation.org
Subject: Re: [BUG] long freezes on thinkpad t60
Date: Fri, 22 Jun 2007 10:17:02 +0200
Message-ID: <20070622081702.GA14746@elte.hu>
In-Reply-To: <20070621201624.GD22303@elte.hu>
* Ingo Molnar <mingo@elte.hu> wrote:
> the freezes that Miklos was seeing were hardirq contexts blocking in
> task_rq_lock() - that is done with interrupts disabled. (Miklos i
> think also tried !NOHZ kernels and older kernels, with a similar
> result.)
>
> plus on the ptrace side, the wait_task_inactive() code had most of its
> overhead in the atomic op, so if any timer IRQ hit _that_ core, it was
> likely while we were still holding the runqueue lock!
>
> i think the only thing that eventually got Miklos' laptop out of the
> wedge were timer irqs hitting the ptrace CPU in exactly those
> instructions where it was not holding the runqueue lock. (or perhaps
> an asynchronous SMM event delaying it for a long time)
even considering that the 'LOCK'-ed instruction was the heaviest in the
busy-loop, the numbers still just don't add up to 'tens of seconds of
lockups', so there must be something else happening too.
So here's an addition to the existing theories: the Core2Duo is a
4-issue CPU architecture. Now, why does this matter? It matters to the
timing of the delivery of interrupts. For example, on a 3-issue
architecture, the instruction level profile of well-cached workloads
often looks like this:
c05a3b71:   710   89 d6                 mov    %edx,%esi
c05a3b73:     0   8b 55 c0              mov    0xffffffc0(%ebp),%edx
c05a3b76:     0   89 c3                 mov    %eax,%ebx
c05a3b78:   775   8b 82 e8 00 00 00     mov    0xe8(%edx),%eax
c05a3b7e:     0   8b 48 18              mov    0x18(%eax),%ecx
c05a3b81:     0   8b 45 c8              mov    0xffffffc8(%ebp),%eax
c05a3b84:   792   89 1c 24              mov    %ebx,(%esp)
c05a3b87:     0   89 74 24 04           mov    %esi,0x4(%esp)
c05a3b8b:     0   ff d1                 call   *%ecx
c05a3b8d:     0   8b 4d c8              mov    0xffffffc8(%ebp),%ecx
c05a3b90:   925   8b 41 6c              mov    0x6c(%ecx),%eax
c05a3b93:     0   39 41 10              cmp    %eax,0x10(%ecx)
c05a3b96:     0   0f 85 a8 01 00 00     jne    c05a3d44 <schedule+0x2a4>
c05a3b9c:   949   89 da                 mov    %ebx,%edx
c05a3b9e:     0   89 f1                 mov    %esi,%ecx
c05a3ba0:     0   8b 45 c8              mov    0xffffffc8(%ebp),%eax
the second column is the number of times the profiling interrupt has hit
that particular instruction.
Note the many zero entries - this means that for instructions that are
well-cached, the issue order _prevents_ interrupts from _ever_ hitting
inside a bundle of micro-ops that the decoder will issue! The above
workload was a plain lat_ctx, so nothing special, and interrupts and DMA
traffic were coming and going. Still the bundling of instructions was
very strong.
There's no guarantee of 'instruction bundling': a cache miss can still
stall the pipeline and allow an interrupt to hit any instruction [where
interrupt delivery is valid], but on a well-cached workload like the
above, even a 3-issue architecture can effectively 'merge' instructions
with each other, and can make them essentially 'atomic' as far as
external interrupts go.
[ also note another interesting thing in the profile above: the
CALL *%ecx was likely BTB-optimized and hence we have a 'bundling'
effect that is even larger than 3 instructions. ]
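As a rough way to read the profile above, the sketch below groups the hit counts into 'bundles': each instruction with a nonzero count starts a new group, and the zero-count instructions after it are folded into that group. This grouping rule and the helper name count_bundles() are assumptions made for illustration, not something the profiler or the CPU defines.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: walk a per-instruction hit profile and count
 * 'bundles'. A bundle is one instruction with a nonzero hit count plus
 * all zero-hit instructions that directly follow it. Zero-hit entries
 * before the first nonzero entry are ignored.
 *
 * Returns the number of bundles; if 'longest' is non-NULL, stores the
 * length (in instructions) of the longest bundle there.
 */
size_t count_bundles(const unsigned *hits, size_t n, size_t *longest)
{
	size_t bundles = 0, len = 0, max = 0;

	for (size_t i = 0; i < n; i++) {
		if (hits[i] != 0) {
			/* a sampled instruction closes the previous bundle */
			if (len > max)
				max = len;
			bundles++;
			len = 1;
		} else if (bundles > 0) {
			/* never sampled: fold into the current bundle */
			len++;
		}
	}
	if (len > max)
		max = len;
	if (longest)
		*longest = max;
	return bundles;
}
```

Fed the 16 hit counts from the listing above, this reports 5 bundles, the longest being 4 instructions long (the one that contains the CALL *%ecx).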
i think that is what might have happened on Miklos's laptop too: the
'movb' of the spin_unlock() done by wait_task_inactive() got
'bundled' together with the first LOCK instruction that took it again,
making it very unlikely for a timer interrupt to ever hit that small
window in wait_task_inactive(). The cpu_relax()'s "REP; NOP" was likely
a simple NOP, because the Core2Duo is not an SMT platform.
to check this theory, adding 3 NOPs to the critical section should make
the lockups a lot less prominent too. (While NOPs are not actually
'issued', they do take up decoder bandwidth, so they hopefully are able
to break up any 'bundle' of instructions.)
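For readers who want to see the shape of the loop in isolation, here is a user-space sketch of the same pattern: take a lock, sample a flag, drop the lock, relax, pad with three NOPs, retry. It is only an analogue under stated assumptions: rq_lock, relax() and poll_until_inactive() are made-up names, the lock is a C11 atomic_flag rather than the kernel's runqueue spinlock, interrupts are not disabled, and the inline asm assumes a GCC-compatible compiler. It cannot reproduce the freeze; it only shows where the padding sits relative to the unlock and the next lock attempt.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* stand-in for the runqueue lock (assumption, not the kernel's type) */
static atomic_flag rq_lock = ATOMIC_FLAG_INIT;

/* rough equivalent of cpu_relax(): "rep; nop" on x86, barrier elsewhere */
static inline void relax(void)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ volatile("rep; nop" ::: "memory");
#else
	__asm__ volatile("" ::: "memory");
#endif
}

static void lock(void)
{
	/* the LOCK-prefixed atomic op on x86 */
	while (atomic_flag_test_and_set_explicit(&rq_lock, memory_order_acquire))
		relax();
}

static void unlock(void)
{
	atomic_flag_clear_explicit(&rq_lock, memory_order_release);
}

/*
 * Poll '*running' under the lock until it reads false or until
 * 'max_polls' samples were taken. Returns the number of samples.
 */
unsigned long poll_until_inactive(atomic_bool *running, unsigned long max_polls)
{
	unsigned long polls = 0;

	for (;;) {
		lock();
		bool busy = atomic_load(running);
		unlock();
		polls++;

		if (!busy || polls >= max_polls)
			return polls;

		relax();
#if defined(__i386__) || defined(__x86_64__)
		/* padding between the unlock and the next lock attempt */
		__asm__ volatile("nop; nop; nop");
#endif
	}
}
```

With the flag already false the function returns after a single sample; with the flag stuck at true it spins until max_polls is reached, going through the unlock / relax / NOP sequence on every iteration.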
Miklos, if you've got some time to test this - could you revert the
fa490cfd15d7 commit and apply the patch below - does it have any impact
on the lockups you were experiencing?
Ingo
---
kernel/sched.c | 1 +
1 file changed, 1 insertion(+)
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -1131,6 +1131,7 @@ repeat:
 		preempted = !task_running(rq, p);
 		task_rq_unlock(rq, &flags);
 		cpu_relax();
+		asm volatile ("nop; nop; nop;");
 		if (preempted)
 			yield();
 		goto repeat;