From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Lee Revell <rlrevell@joe-job.com>,
	Andrea Arcangeli <andrea@suse.de>,
	Manfred Spraul <manfred@colorfullife.com>,
	Zwane Mwaikambo <zwane@arm.linux.org.uk>,
	George Anzinger <george@mvista.com>,
	dipankar@in.ibm.com, ganzinger@mvista.com,
	lkml <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@osdl.org>, Andi Kleen <ak@suse.de>
Subject: Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
Date: Thu, 16 Dec 2004 15:51:59 +0100
Message-ID: <20041216145159.GA3204@elte.hu>
In-Reply-To: <Pine.LNX.4.58.0412151756550.3279@ppc970.osdl.org>


* Linus Torvalds <torvalds@osdl.org> wrote:

> The irq window should actually be open every alternate instruction, I
> think. Although it's not actually architected, and I thought that
> there was some errata for some CPU about this..

i have generated an instruction-granularity profile of kernel code
executing the following sequence, driven by the NMI watchdog interrupt:

 asm ("cli; cli; sti; cli; sti; cli; sti; cli; sti; cli; sti; ");
 asm ("cli; cli; sti; cli; sti; cli; sti; cli; sti; cli; sti; ");
 asm ("cli; cli; sti; cli; sti; cli; sti; cli; sti; cli; sti; ");
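
For readers unfamiliar with this kind of profile: each NMI records the instruction pointer it interrupted, and the per-address counts form the histogram shown in this message. The profiling code itself is not part of the message, so the following is only a minimal sketch in C of the bookkeeping, with hypothetical names (PROF_BASE, PROF_SIZE, prof_hits, prof_sample); the base address and size are simply taken from the first listing:

```c
/* Hypothetical per-address hit counter, not the code used for the measurement. */
#define PROF_BASE 0xc0125ee9UL	/* first address of the first listing */
#define PROF_SIZE 0x26UL	/* bytes covered: c0125ee9..c0125f0e */

static unsigned long prof_hits[PROF_SIZE];

/*
 * Would be called once per NMI with the interrupted instruction pointer.
 * Returns 1 if the sample fell inside the profiled range, 0 otherwise.
 */
static int prof_sample(unsigned long eip)
{
	/* unsigned subtraction wraps to a huge value if eip < PROF_BASE */
	unsigned long off = eip - PROF_BASE;

	if (off >= PROF_SIZE)
		return 0;
	prof_hits[off]++;
	return 1;
}
```

A count of zero at an address then means that no NMI was ever delivered with the instruction pointer at that instruction.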

the first CLI is done twice, to prove that the NMI profiling works and
that the kernel can be interrupted in those places. Then i called this
kernel code in a loop. Here's the result:

c0125ee9:     1529 	fa                   	cli    
                 ^---------------------------------- # of profiler hits
c0125eea:      507 	fb                   	sti    
c0125eeb:        0 	fa                   	cli    
c0125eec:     3719 	fb                   	sti    
c0125eed:        0 	fa                   	cli    
c0125eee:     1579 	fb                   	sti    
c0125eef:        0 	fa                   	cli    
c0125ef0:     3317 	fb                   	sti    
c0125ef1:        0 	fa                   	cli    
c0125ef2:     3030 	fb                   	sti    
c0125ef3:        0 	fa                   	cli    
c0125ef4:     2497 	fa                   	cli    
c0125ef5:     1055 	fb                   	sti    
c0125ef6:        0 	fa                   	cli    
c0125ef7:     4674 	fb                   	sti    
c0125ef8:        0 	fa                   	cli    
c0125ef9:     3827 	fb                   	sti    
c0125efa:        0 	fa                   	cli    
c0125efb:     1622 	fb                   	sti    
c0125efc:        0 	fa                   	cli    
c0125efd:     3155 	fb                   	sti    
c0125efe:        0 	fa                   	cli    
c0125eff:     1273 	fa                   	cli    
c0125f00:      512 	fb                   	sti    
c0125f01:        0 	fa                   	cli    
c0125f02:     1312 	fb                   	sti    
c0125f03:        0 	fa                   	cli    
c0125f04:     1426 	fb                   	sti    
c0125f05:        0 	fa                   	cli    
c0125f06:     1507 	fb                   	sti    
c0125f07:        0 	fa                   	cli    
c0125f08:     2720 	fb                   	sti    
c0125f09:        0 	fa                   	cli    
c0125f0a:     2469 	fa                   	cli    
c0125f0b:      787 	fb                   	sti    
c0125f0c:        0 	fa                   	cli    
c0125f0d:     2085 	fb                   	sti    
c0125f0e:        0 	fa                   	cli    

the 'cli' right after an 'sti' is always a 'black hole' to the NMI (zero
hits), while the second of two consecutive cli's is not.
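
The zero/non-zero pattern in the listing above is what a one-instruction interrupt shadow after 'sti' would produce. A minimal sketch in C, assuming a toy model in which an 'sti' executed with interrupts disabled hides exactly the next instruction from the NMI. The enum, struct and function names are hypothetical, the hit counts are the first twelve rows of the listing, and whether NMIs honour the shadow at all is specific to the CPU measured here:

```c
/* Toy model only: it does not describe how any CPU implements the shadow. */
enum insn { CLI, STI };

struct row {
	enum insn op;
	unsigned int hits;	/* profiler hits copied from the listing */
};

/* rows c0125ee9..c0125ef4 of the first listing */
static const struct row profile[] = {
	{ CLI, 1529 }, { STI,  507 }, { CLI, 0 }, { STI, 3719 },
	{ CLI,    0 }, { STI, 1579 }, { CLI, 0 }, { STI, 3317 },
	{ CLI,    0 }, { STI, 3030 }, { CLI, 0 }, { CLI, 2497 },
};

/*
 * Walk the rows in execution order and count those where the model's
 * prediction ("hidden from the NMI" or not) disagrees with the data.
 */
static int shadow_model_mismatches(const struct row *p, int n)
{
	int if_flag = 1;	/* assumed start state; the first cli clears it */
	int shadow = 0;		/* previous instruction opened a shadow */
	int mismatches = 0;
	int i;

	for (i = 0; i < n; i++) {
		int observed_hole = (p[i].hits == 0);

		if (shadow != observed_hole)
			mismatches++;

		if (p[i].op == STI) {
			/* sti shadows the next instruction only if IF was clear */
			shadow = !if_flag;
			if_flag = 1;
		} else {
			shadow = 0;
			if_flag = 0;
		}
	}
	return mismatches;
}
```

For these twelve rows shadow_model_mismatches(profile, 12) returns 0: every zero-hit 'cli' directly follows an 'sti', and the 'cli' that follows another 'cli' is sampled normally.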

i also played a bit with the %ss instructions, and combined them with
the cli/sti instructions and other instructions in various ways, and
with a bit of experimenting found the following, somewhat surprising
results:

c0125f33:     1016 	66 8c d0             	mov    %ss,%ax
c0125f36:     6626 	8e d0                	mov    %eax,%ss
c0125f38:    34715 	8e d0                	mov    %eax,%ss
c0125f3a:    14682 	8e d0                	mov    %eax,%ss
c0125f3c:     4521 	8e d0                	mov    %eax,%ss
c0125f3e:     7564 	8e d0                	mov    %eax,%ss
c0125f40:     3861 	66 8e d0             	mov    %ax,%ss
c0125f43:        0 	66 8c d1             	mov    %ss,%cx
c0125f46:     1061 	66 8c da             	mov    %ds,%dx
c0125f49:     7660 	8e d1                	mov    %ecx,%ss
c0125f4b:    11322 	17                   	pop    %ss
c0125f4c:        0 	fb                   	sti    
c0125f4d:     8935 	8e d1                	mov    %ecx,%ss
c0125f4f:        0 	fa                   	cli    
c0125f50:     2198 	66 8c d1             	mov    %ss,%cx
c0125f53:      735 	66 8c da             	mov    %ds,%dx
c0125f56:        0 	8e da                	mov    %edx,%ds
c0125f58:     6400 	8e d0                	mov    %eax,%ss
c0125f5a:     3062 	8e d0                	mov    %eax,%ss
c0125f5c:     3552 	8e d0                	mov    %eax,%ss
c0125f5e:     4818 	8e d0                	mov    %eax,%ss
c0125f60:        0 	fb                   	sti    
c0125f61:        0 	66 8c da             	mov    %ds,%dx
c0125f64:    17788 	8e d0                	mov    %eax,%ss
c0125f66:    64694 	8e d0                	mov    %eax,%ss
c0125f68:    12837 	8e d0                	mov    %eax,%ss
c0125f6a:     9859 	8e d0                	mov    %eax,%ss
c0125f6c:        0 	fb                   	sti    
c0125f6d:    74506 	8e d0                	mov    %eax,%ss
c0125f6f:        0 	fb                   	sti    
c0125f70:     8589 	fa                   	cli    
c0125f71:    10248 	8e d0                	mov    %eax,%ss
c0125f73:     3825 	8e d0                	mov    %eax,%ss
c0125f75:     4903 	8e d0                	mov    %eax,%ss
c0125f77:    71134 	8e d0                	mov    %eax,%ss
c0125f79:        0 	fb                   	sti    
c0125f7a:        0 	fa                   	cli    
c0125f7b:     7461 	8e d0                	mov    %eax,%ss
c0125f7d:        0 	66 8c d0             	mov    %ss,%ax
c0125f80:    39387 	8e d0                	mov    %eax,%ss
c0125f82:        0 	fa                   	cli    
c0125f83:    41484 	8e d0                	mov    %eax,%ss
c0125f85:        0 	fa                   	cli    
c0125f86:     4490 	8e d0                	mov    %eax,%ss
c0125f88:        0 	fa                   	cli    
c0125f89:     6024 	8e d0                	mov    %eax,%ss
c0125f8b:    15454 	8e d0                	mov    %eax,%ss
c0125f8d:        0 	fb                   	sti    
c0125f8e:        0 	fb                   	sti    
c0125f8f:   115104 	fb                   	sti    
c0125f90:    39061 	fb                   	sti    

it shows a number of interesting effects:

- "mov %eax,%ss" followed by the _same_ instruction cancels the 
  black-hole. This i suspect is done to prevent the lockup in vm86
  mode.

- an %ss black-hole instruction followed by 'sti' cancels sti's
  black-hole. This is unlikely to occur in real kernel code, but we
  might want to add a 'nop' in front of safe_halt()'s sti, to make sure
  the black-hole takes effect.

- in one case a two-instruction black-hole was created - but this might 
  be some prefetch effect.

i played around with the instructions a bit to manufacture combinations
that lengthen the black-hole but failed :) This was on an Athlon64.

	Ingo
