All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>,
	Jarek Poplawski <jarkao2@o2.pl>,
	Miklos Szeredi <miklos@szeredi.hu>,
	chris@atlee.ca, linux-kernel@vger.kernel.org, tglx@linutronix.de,
	akpm@linux-foundation.org
Subject: Re: [BUG] long freezes on thinkpad t60
Date: Fri, 22 Jun 2007 10:17:02 +0200	[thread overview]
Message-ID: <20070622081702.GA14746@elte.hu> (raw)
In-Reply-To: <20070621201624.GD22303@elte.hu>


* Ingo Molnar <mingo@elte.hu> wrote:

> the freezes that Miklos was seeing were hardirq contexts blocking in 
> task_rq_lock() - that is done with interrupts disabled. (Miklos i 
> think also tried !NOHZ kernels and older kernels, with a similar 
> result.)
> 
> plus on the ptrace side, the wait_task_inactive() code had most of its 
> overhead in the atomic op, so if any timer IRQ hit _that_ core, it was 
> likely while we were still holding the runqueue lock!
> 
> i think the only thing that eventually got Miklos' laptop out of the 
> wedge were timer irqs hitting the ptrace CPU in exactly those 
> instructions where it was not holding the runqueue lock. (or perhaps 
> an asynchronous SMM event delaying it for a long time)

even considering that the 'LOCK'-ed intruction was the heaviest in the 
busy-loop, the numbers still just dont add up to 'tens of seconds of 
lockups', so there must be something else happening too.

So here's an addition to the existing theories: the Core2Duo is a 
4-issue CPU architecture. Now, why does this matter? It matters to the 
timing of the delivery of interrupts. For example, on a 3-issue 
architecture, the instruction level profile of well-cached workloads 
often looks like this:

c05a3b71:      710      89 d6                   mov    %edx,%esi
c05a3b73:        0      8b 55 c0                mov    0xffffffc0(%ebp),%edx
c05a3b76:        0      89 c3                   mov    %eax,%ebx
c05a3b78:      775      8b 82 e8 00 00 00       mov    0xe8(%edx),%eax
c05a3b7e:        0      8b 48 18                mov    0x18(%eax),%ecx
c05a3b81:        0      8b 45 c8                mov    0xffffffc8(%ebp),%eax
c05a3b84:      792      89 1c 24                mov    %ebx,(%esp)
c05a3b87:        0      89 74 24 04             mov    %esi,0x4(%esp)
c05a3b8b:        0      ff d1                   call   *%ecx
c05a3b8d:        0      8b 4d c8                mov    0xffffffc8(%ebp),%ecx
c05a3b90:      925      8b 41 6c                mov    0x6c(%ecx),%eax
c05a3b93:        0      39 41 10                cmp    %eax,0x10(%ecx)
c05a3b96:        0      0f 85 a8 01 00 00       jne    c05a3d44 <schedule+0x2a4>
c05a3b9c:      949      89 da                   mov    %ebx,%edx
c05a3b9e:        0      89 f1                   mov    %esi,%ecx
c05a3ba0:        0      8b 45 c8                mov    0xffffffc8(%ebp),%eax

the second column is the number of times the profiling interrupt has hit 
that particular instruction.

Note the many zero entries - this means that for instructions that are 
well-cached, the issue order _prevents_ interrupts from _ever_ hitting 
to within a bundle of micro-ops that the decoder will issue! The above 
workload was a plain lat_ctx, so nothing special, and interrupts and DMA 
traffic were coming and going. Still the bundling of instructions was 
very strong.

There's no guarantee of 'instruction bundling': a cachemiss can still 
stall the pipeline and allow an interrupt to hit any instruction [where 
interrupt delivery is valid], but on a well-cached workload like the 
above, even a 3-issue architecture can effectively 'merge' instructions 
to each other, and can make them essentially 'atomic' as far as external 
interrupts go.

[ also note another interesting thing in the profile above: the
  CALL *%ecx was likely BTB-optimized and hence we have a 'bundling' 
  effect that is even larger than 3 instructions. ]

i think that is what might have happened on Miklos's laptop too: the 
'movb' of the spin_unlock() done by the wait_task_inactive() got 
'bundled' together with the first LOCK instruction that took it again, 
making it very unlikely for a timer interrupt to ever hit that small 
window in wait_task_inactive(). The cpu_relax()'s "REP; NOP" was likely 
a simple NOP, because the Core2Duo is not an SMT platform.

to check this theory, adding 3 NOPs to the critical section should make 
the lockups a lot less prominent too. (While NOPs are not actually 
'issued', they do take up decoder bandwidth, so they hopefully are able 
to break up any 'bundle' of instructions.)

Miklos, if you've got some time to test this - could you revert the 
fa490cfd15d7 commit and apply the patch below - does it have any impact 
on the lockups you were experiencing?

	Ingo

---
 kernel/sched.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -1131,6 +1131,7 @@ repeat:
 		preempted = !task_running(rq, p);
 		task_rq_unlock(rq, &flags);
 		cpu_relax();
+		asm volatile ("nop; nop; nop;");
 		if (preempted)
 			yield();
 		goto repeat;

  reply	other threads:[~2007-06-22  8:17 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-24 12:04 [BUG] long freezes on thinkpad t60 Miklos Szeredi
2007-05-24 12:54 ` Ingo Molnar
2007-05-24 14:03   ` Miklos Szeredi
2007-05-24 14:10     ` Ingo Molnar
2007-05-24 14:28       ` Miklos Szeredi
2007-05-24 14:42         ` Ingo Molnar
2007-05-24 14:44         ` Ingo Molnar
2007-05-24 17:09           ` Miklos Szeredi
2007-05-24 21:01             ` Ingo Molnar
2007-05-25  9:51               ` Miklos Szeredi
2007-06-14 16:04                 ` Miklos Szeredi
2007-06-15 21:25                   ` Chuck Ebbert
2007-06-16 10:37                   ` Ingo Molnar
2007-06-17 21:46                     ` Miklos Szeredi
2007-06-18  6:43                       ` Ingo Molnar
2007-06-18  7:24                         ` Miklos Szeredi
2007-06-18  8:12                           ` Ingo Molnar
2007-06-18  8:20                             ` Andrew Morton
2007-06-19  4:22                               ` Ravikiran G Thirumalai
2007-06-18  8:25                             ` Miklos Szeredi
2007-06-18  8:31                               ` Ingo Molnar
2007-06-18  8:34                                 ` Miklos Szeredi
2007-06-18  9:18                                   ` Ingo Molnar
2007-06-18  9:38                                     ` Ingo Molnar
2007-06-18  9:44                                       ` Ingo Molnar
2007-06-18 10:18                                         ` Miklos Szeredi
2007-06-18 12:36                                           ` Ingo Molnar
2007-06-18 13:10                                             ` Miklos Szeredi
2007-06-18 16:34                             ` Linus Torvalds
2007-06-18 17:41                               ` Miklos Szeredi
2007-06-18 17:48                                 ` Linus Torvalds
2007-06-18 18:02                                   ` Ingo Molnar
2007-06-18 18:00                               ` Ingo Molnar
2007-06-18 18:25                                 ` Linus Torvalds
2007-06-20  9:36                               ` Jarek Poplawski
2007-06-20 17:34                                 ` Linus Torvalds
2007-06-21  7:30                                   ` Ingo Molnar
2007-06-21 15:50                                     ` Linus Torvalds
2007-06-21 16:08                                       ` Ingo Molnar
2007-06-21 16:32                                         ` Linus Torvalds
2007-06-21 16:44                                         ` Chuck Ebbert
2007-06-21 17:31                                           ` Linus Torvalds
2007-06-21 18:29                                             ` Eric Dumazet
2007-06-21 18:44                                               ` Linus Torvalds
2007-06-21 19:35                                                 ` Linus Torvalds
2007-06-21 20:09                                                   ` Ingo Molnar
2007-06-21 20:14                                                     ` Linus Torvalds
2007-06-21 20:30                                                       ` Ingo Molnar
2007-06-21 20:48                                                         ` Linus Torvalds
2007-06-21 21:06                                                           ` Ingo Molnar
2007-06-21 20:42                                                       ` [patch] spinlock debug: make looping nicer Ingo Molnar
2007-06-21 20:58                                                         ` Linus Torvalds
2007-06-21 21:15                                                           ` Ingo Molnar
2007-06-22  7:00                                                             ` Jarek Poplawski
2007-06-21 20:36                                                   ` [BUG] long freezes on thinkpad t60 Eric Dumazet
2007-06-21 19:56                                                 ` Ingo Molnar
2007-06-21 20:10                                                   ` Linus Torvalds
2007-06-21 20:23                                                     ` Ingo Molnar
2007-06-21 20:12                                                 ` Ingo Molnar
2007-06-26  8:42                                                 ` Nick Piggin
2007-06-26 10:56                                                   ` Jarek Poplawski
2007-06-26 17:23                                                   ` Linus Torvalds
2007-06-27  5:23                                                     ` Nick Piggin
2007-06-27  6:04                                                       ` Linus Torvalds
2007-06-27  6:20                                                         ` Nick Piggin
2007-06-27 19:47                                                         ` Linus Torvalds
2007-06-27 20:10                                                           ` Ingo Molnar
2007-06-27 20:17                                                           ` Davide Libenzi
2007-06-27 22:11                                                             ` Linus Torvalds
2007-06-27 23:30                                                               ` Davide Libenzi
2007-06-28  0:46                                                                 ` Linus Torvalds
2007-06-28  3:03                                                                   ` Davide Libenzi
2007-07-02  7:06                                                           ` Nick Piggin
2007-06-21 20:16                                             ` Ingo Molnar
2007-06-22  8:17                                               ` Ingo Molnar [this message]
2007-06-23 10:36                                                 ` Miklos Szeredi
2007-06-23 16:39                                                   ` Linus Torvalds
2007-06-25  6:45                                                   ` Jarek Poplawski
2007-06-21 20:18                                             ` Ingo Molnar
2007-06-21 20:36                                               ` Linus Torvalds
2007-06-21  7:38                                   ` Jarek Poplawski
2007-06-21  8:39                                     ` Ingo Molnar
2007-06-21 11:09                                       ` Jarek Poplawski
2007-06-21 16:01                                     ` Linus Torvalds
2007-06-22 10:38                                       ` Jarek Poplawski
2007-05-24 22:08 ` Henrique de Moraes Holschuh
2007-05-24 22:13   ` Kok, Auke
2007-05-25  6:58     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070622081702.GA14746@elte.hu \
    --to=mingo@elte.hu \
    --cc=akpm@linux-foundation.org \
    --cc=cebbert@redhat.com \
    --cc=chris@atlee.ca \
    --cc=jarkao2@o2.pl \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.