public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Lee Revell <rlrevell@joe-job.com>,
	Andrea Arcangeli <andrea@suse.de>,
	Manfred Spraul <manfred@colorfullife.com>,
	Zwane Mwaikambo <zwane@arm.linux.org.uk>,
	George Anzinger <george@mvista.com>,
	dipankar@in.ibm.com, ganzinger@mvista.com,
	lkml <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@osdl.org>, Andi Kleen <ak@suse.de>
Subject: Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
Date: Thu, 16 Dec 2004 15:51:59 +0100	[thread overview]
Message-ID: <20041216145159.GA3204@elte.hu> (raw)
In-Reply-To: <Pine.LNX.4.58.0412151756550.3279@ppc970.osdl.org>


* Linus Torvalds <torvalds@osdl.org> wrote:

> The irq window should actually be open every alternate instruction, I
> think. Although it's not actually architected, and I thought that
> there was some errata for some CPU about this..

i have generated an instruction-granularity profile of kernel code
executing the following sequence, driven by the NMI watchdog interrupt:

 asm ("cli; cli; sti; cli; sti; cli; sti; cli; sti; cli; sti; ");
 asm ("cli; cli; sti; cli; sti; cli; sti; cli; sti; cli; sti; ");
 asm ("cli; cli; sti; cli; sti; cli; sti; cli; sti; cli; sti; ");

the first CLI is done twice, to prove that the NMI profiling works and
that the kernel can be interrupted in those places. Then i called this
kernel code in a loop. Here's the result:

c0125ee9:     1529 	fa                   	cli    
                 ^---------------------------------- # of profiler hits
c0125eea:      507 	fb                   	sti    
c0125eeb:        0 	fa                   	cli    
c0125eec:     3719 	fb                   	sti    
c0125eed:        0 	fa                   	cli    
c0125eee:     1579 	fb                   	sti    
c0125eef:        0 	fa                   	cli    
c0125ef0:     3317 	fb                   	sti    
c0125ef1:        0 	fa                   	cli    
c0125ef2:     3030 	fb                   	sti    
c0125ef3:        0 	fa                   	cli    
c0125ef4:     2497 	fa                   	cli    
c0125ef5:     1055 	fb                   	sti    
c0125ef6:        0 	fa                   	cli    
c0125ef7:     4674 	fb                   	sti    
c0125ef8:        0 	fa                   	cli    
c0125ef9:     3827 	fb                   	sti    
c0125efa:        0 	fa                   	cli    
c0125efb:     1622 	fb                   	sti    
c0125efc:        0 	fa                   	cli    
c0125efd:     3155 	fb                   	sti    
c0125efe:        0 	fa                   	cli    
c0125eff:     1273 	fa                   	cli    
c0125f00:      512 	fb                   	sti    
c0125f01:        0 	fa                   	cli    
c0125f02:     1312 	fb                   	sti    
c0125f03:        0 	fa                   	cli    
c0125f04:     1426 	fb                   	sti    
c0125f05:        0 	fa                   	cli    
c0125f06:     1507 	fb                   	sti    
c0125f07:        0 	fa                   	cli    
c0125f08:     2720 	fb                   	sti    
c0125f09:        0 	fa                   	cli    
c0125f0a:     2469 	fa                   	cli    
c0125f0b:      787 	fb                   	sti    
c0125f0c:        0 	fa                   	cli    
c0125f0d:     2085 	fb                   	sti    
c0125f0e:        0 	fa                   	cli    

the 'cli' is always a 'black hole' to the NMI, while the second of two
consecutive cli's are not.

i also played a bit with the %ss instructions, and combined them with
the cli/sti instructions and other instructions in various ways, and
with a bit of experimenting found the following, somewhat surprising
results:

c0125f33:     1016 	66 8c d0             	mov    %ss,%ax
c0125f36:     6626 	8e d0                	mov    %eax,%ss
c0125f38:    34715 	8e d0                	mov    %eax,%ss
c0125f3a:    14682 	8e d0                	mov    %eax,%ss
c0125f3c:     4521 	8e d0                	mov    %eax,%ss
c0125f3e:     7564 	8e d0                	mov    %eax,%ss
c0125f40:     3861 	66 8e d0             	mov    %ax,%ss
c0125f43:        0 	66 8c d1             	mov    %ss,%cx
c0125f46:     1061 	66 8c da             	mov    %ds,%dx
c0125f49:     7660 	8e d1                	mov    %ecx,%ss
c0125f4b:    11322 	17                   	pop    %ss
c0125f4c:        0 	fb                   	sti    
c0125f4d:     8935 	8e d1                	mov    %ecx,%ss
c0125f4f:        0 	fa                   	cli    
c0125f50:     2198 	66 8c d1             	mov    %ss,%cx
c0125f53:      735 	66 8c da             	mov    %ds,%dx
c0125f56:        0 	8e da                	mov    %edx,%ds
c0125f58:     6400 	8e d0                	mov    %eax,%ss
c0125f5a:     3062 	8e d0                	mov    %eax,%ss
c0125f5c:     3552 	8e d0                	mov    %eax,%ss
c0125f5e:     4818 	8e d0                	mov    %eax,%ss
c0125f60:        0 	fb                   	sti    
c0125f61:        0 	66 8c da             	mov    %ds,%dx
c0125f64:    17788 	8e d0                	mov    %eax,%ss
c0125f66:    64694 	8e d0                	mov    %eax,%ss
c0125f68:    12837 	8e d0                	mov    %eax,%ss
c0125f6a:     9859 	8e d0                	mov    %eax,%ss
c0125f6c:        0 	fb                   	sti    
c0125f6d:    74506 	8e d0                	mov    %eax,%ss
c0125f6f:        0 	fb                   	sti    
c0125f70:     8589 	fa                   	cli    
c0125f71:    10248 	8e d0                	mov    %eax,%ss
c0125f73:     3825 	8e d0                	mov    %eax,%ss
c0125f75:     4903 	8e d0                	mov    %eax,%ss
c0125f77:    71134 	8e d0                	mov    %eax,%ss
c0125f79:        0 	fb                   	sti    
c0125f7a:        0 	fa                   	cli    
c0125f7b:     7461 	8e d0                	mov    %eax,%ss
c0125f7d:        0 	66 8c d0             	mov    %ss,%ax
c0125f80:    39387 	8e d0                	mov    %eax,%ss
c0125f82:        0 	fa                   	cli    
c0125f83:    41484 	8e d0                	mov    %eax,%ss
c0125f85:        0 	fa                   	cli    
c0125f86:     4490 	8e d0                	mov    %eax,%ss
c0125f88:        0 	fa                   	cli    
c0125f89:     6024 	8e d0                	mov    %eax,%ss
c0125f8b:    15454 	8e d0                	mov    %eax,%ss
c0125f8d:        0 	fb                   	sti    
c0125f8e:        0 	fb                   	sti    
c0125f8f:   115104 	fb                   	sti    
c0125f90:    39061 	fb                   	sti    

it shows a number of interesting effects:

- "mov %eax,%ss" followed by the _same_ instruction cancels the 
  black-hole. This i suspect is done to prevent the lockup in vm86
  mode.

- an %ss black-hole instruction followed by 'sti' cancels sti's
  black-hole. This is unlikely to occur in real kernel code, but we
  might want to add a 'nop' in front of safe_halt()'s sti, to make sure
  the black-hole takes effect.

- in one case a two-instruction blackhole was created - but this might 
  be some prefetch effect.

i played around with the instructions a bit to manufacture combinations
that enlengthen the black-hole but failed :) This was on an Athlon64.

	Ingo

  reply	other threads:[~2004-12-16 14:53 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-12-09 23:59 RCU question George Anzinger
2004-12-10  4:31 ` Dipankar Sarma
2004-12-10 19:42   ` George Anzinger
2004-12-10 20:40     ` Dipankar Sarma
2004-12-10 20:45       ` Lee Revell
2004-12-10 21:02         ` George Anzinger
2004-12-10 22:58           ` Zwane Mwaikambo
2004-12-11  2:22             ` George Anzinger
2004-12-11  2:45               ` Zwane Mwaikambo
2004-12-11  3:29                 ` George Anzinger
2004-12-11 14:52                   ` Zwane Mwaikambo
2004-12-11 16:32                     ` Manfred Spraul
2004-12-11 16:52                       ` George Anzinger
2004-12-12  2:53                         ` Zwane Mwaikambo
2004-12-12  8:59                           ` Manfred Spraul
2004-12-12  9:37                             ` Andrea Arcangeli
2004-12-12 10:22                               ` Manfred Spraul
2004-12-12 12:15                                 ` Andrea Arcangeli
2004-12-14 21:40                                   ` Lee Revell
2004-12-14 22:23                                     ` [patch, 2.6.10-rc3] safe_hlt() & NMIs Ingo Molnar
2004-12-14 22:47                                       ` Ingo Molnar
2004-12-14 23:09                                         ` Linus Torvalds
2004-12-15  8:52                                           ` Ingo Molnar
2004-12-15 15:44                                             ` Linus Torvalds
2004-12-15 16:35                                               ` Ingo Molnar
2004-12-16  0:37                                           ` Alan Cox
2004-12-16  1:58                                             ` Linus Torvalds
2004-12-16 14:51                                               ` Ingo Molnar [this message]
2004-12-16 15:08                                                 ` Maciej W. Rozycki
2004-12-16 15:11                                                   ` Ingo Molnar
2004-12-16 15:42                                                     ` Maciej W. Rozycki
2004-12-16 15:54                                                 ` Linus Torvalds
2004-12-16  2:10                                             ` Zwane Mwaikambo
2004-12-16 13:26                                               ` Alan Cox
2004-12-14 23:41                                         ` Andrea Arcangeli
2004-12-14 23:00                                       ` Linus Torvalds
2004-12-15  5:04                                         ` Andi Kleen
2004-12-15  6:27                                       ` Avi Kivity
2004-12-15  8:51                                         ` Ingo Molnar
2004-12-12 16:51                                 ` RCU question George Anzinger
2004-12-12 22:40                                   ` Manfred Spraul
2004-12-13  5:22                                     ` George Anzinger
2004-12-12 16:26                             ` Zwane Mwaikambo
  -- strict thread matches above, loose matches on Subject: below --
2004-12-17 23:35 [patch, 2.6.10-rc3] safe_hlt() & NMIs Chuck Ebbert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20041216145159.GA3204@elte.hu \
    --to=mingo@elte.hu \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=andrea@suse.de \
    --cc=dipankar@in.ibm.com \
    --cc=ganzinger@mvista.com \
    --cc=george@mvista.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manfred@colorfullife.com \
    --cc=rlrevell@joe-job.com \
    --cc=torvalds@osdl.org \
    --cc=zwane@arm.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox