public inbox for linux-kernel@vger.kernel.org
* Disabling x86 System Management Mode
@ 2007-04-16 10:47 John
  2007-04-16 11:31 ` John
  2007-04-16 22:12 ` Andi Kleen
  0 siblings, 2 replies; 12+ messages in thread
From: John @ 2007-04-16 10:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux.kernel

Hello everyone,

According to Wikipedia:
http://en.wikipedia.org/wiki/Non-Maskable_Interrupt
http://en.wikipedia.org/wiki/System_Management_Mode

"SMM is an operating mode of the Intel 386SL and later microprocessor in 
which all normal execution (including the operating system) is 
suspended, and special separate software (usually firmware or a 
hardware-assisted debugger) is executed in high-privilege mode.

Operations in SMM take CPU time away from the OS, since the CPU state 
must be stored to memory (SMRAM) and any write back caches must be 
flushed. This can destroy real-time behavior and cause clock ticks to 
get lost."

AFAIU, even a hard real-time OS is "defenseless" against SMIs that
kick the CPU into SMM.

I'm planning on writing a few lines of code to gather indirect evidence
that the CPU is periodically entering SMM.

I was considering a kernel module along the lines of...

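for (i=0; i < N; ++i)
{
   unsigned cycles;
   set_current_state(TASK_INTERRUPTIBLE); /* otherwise schedule_timeout returns at once */
   schedule_timeout(1); /* sleep at least one jiffy */
   local_irq_disable();
   cycles = foo();
   /* update stats with cycles */
   local_irq_enable();
}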

foo is a loop full of NOPs.
It takes up only a few lines in L1 cache.
The conditional jump is easy to predict.
There is a serializing instruction before and after.
As a result, foo's latency should be very consistent.
foo returns the number of cycles it ran. On the systems I have to work 
with, foo typically runs in ~800 microseconds.
I suppose disabling interrupts that long is bound to break something 
somewhere?

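.globl foo
foo:
   push %ebx            # cpuid clobbers %ebx (callee-saved)
   push %esi            # %esi will hold the starting timestamp
   cpuid                # serialize before reading the TSC
   rdtsc                # %edx:%eax = TSC; only the low 32 bits are used
   mov MM, %ecx         # loop count
   mov %eax, %esi
.align 16
.L1:
   nop
   nop
   ... (lots of nops)
   dec %ecx
   jnz .L1
.L2:
   cpuid                # serialize again before the second read
   rdtsc
   sub %esi, %eax       # return value: elapsed cycles (low 32 bits)
   pop %esi
   pop %ebx
   ret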

If foo returns something out of the ordinary, even though interrupts 
were disabled, then it must have been interrupted by a non-maskable 
interrupt, probably a system management interrupt.
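
A minimal user-space sketch of that "out of the ordinary" test, assuming the cycle counts returned by foo have already been collected into an array. The names min_cycles, count_outliers and the slack parameter are hypothetical and not part of the module; the smallest sample is simply assumed to represent an undisturbed run.

```c
#include <stddef.h>

/* Smallest sample; assumed here to be the "undisturbed" baseline. */
static unsigned min_cycles(const unsigned *cycles, size_t n)
{
    size_t i;
    unsigned min = n ? cycles[0] : 0;

    for (i = 1; i < n; ++i)
        if (cycles[i] < min)
            min = cycles[i];
    return min;
}

/* Count samples that exceed the baseline by more than slack cycles. */
static size_t count_outliers(const unsigned *cycles, size_t n,
                             unsigned baseline, unsigned slack)
{
    size_t i, outliers = 0;

    for (i = 0; i < n; ++i)
        if (cycles[i] > baseline && cycles[i] - baseline > slack)
            ++outliers;
    return outliers;
}
```

A non-zero outlier count with interrupts disabled would only be indirect evidence; it does not say what stole the cycles.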

What do you think about this approach?
I'm open to comments and suggestions.

Regards.


* Re: Disabling x86 System Management Mode
  2007-04-16 10:47 Disabling x86 System Management Mode John
@ 2007-04-16 11:31 ` John
  2007-04-16 15:12   ` Lee Revell
  2007-04-16 22:12 ` Andi Kleen
  1 sibling, 1 reply; 12+ messages in thread
From: John @ 2007-04-16 11:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux.kernel

[-- Attachment #1: Type: text/plain, Size: 1140 bytes --]

John wrote:

> On the systems I have to work with [...]

I didn't specify what they were.

CPU: Intel Pentium 3

# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 11
model name      : Intel(R) Pentium(R) III CPU - S         1266MHz
stepping        : 4
cpu MHz         : 1266.700
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca 
cmov pat pse36 mmx fxsr sse
bogomips        : 2533.40
clflush size    : 32

Motherboard: Adlink EBC-2000T
http://www.adlinktech.com/PD/web/PD_detail.php?pid=213

Chipset: VIA Pro133T
http://www.via.com.tw/en/products/chipsets/legacy/pro133/
VT82C694T north bridge + VT82C686B south bridge

AFAIU, the south bridge can be a source of SMIs.
Can the north bridge also be a source of SMIs?

What I/O ports do I need to write to, and what values should I write to 
these ports, in order to prevent the VT82C686B from sending SMIs?

Regards.

[-- Attachment #2: adlink.lspci --]
[-- Type: text/plain, Size: 9863 bytes --]

00:00.0 Host bridge: VIA Technologies, Inc. VT82C693A/694x [Apollo PRO133x] (rev c4)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Latency: 8
        Region 0: Memory at e0000000 (32-bit, prefetchable) [size=64M]
        Capabilities: [a0] AGP version 2.0
                Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW- AGP3- Rate=x1,x2
                Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=<none>
        Capabilities: [c0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP] (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Latency: 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 0000f000-00000fff
        Memory behind bridge: fff00000-000fffff
        Prefetchable memory behind bridge: fff00000-000fffff
        BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
        Capabilities: [80] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
        Subsystem: VIA Technologies, Inc. VT82C686/A PCI to ISA Bridge
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Capabilities: [c0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06) (prog-if 8a [Master SecP PriP])
        Subsystem: VIA Technologies, Inc. VT82C586/B/VT82C686/A/B/VT8233/A/C/VT8235 PIPC Bus Master IDE
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32
        Region 0: [virtual] I/O ports at 01f0 [size=8]
        Region 1: [virtual] I/O ports at 03f4
        Region 2: [virtual] I/O ports at 0170 [size=8]
        Region 3: [virtual] I/O ports at 0374
        Region 4: I/O ports at e400 [size=16]
        Capabilities: [c0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1a) (prog-if 00 [UHCI])
        Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, cache line size 08
        Interrupt: pin D routed to IRQ 5
        Region 4: I/O ports at d000 [size=32]
        Capabilities: [80] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
        Subsystem: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin ? routed to IRQ 9
        Capabilities: [68] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:08.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)
        Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min, 14000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 10
        Region 0: Memory at e6401000 (32-bit, non-prefetchable) [size=4K]
        Region 1: I/O ports at d400 [size=64]
        Region 2: Memory at e6200000 (32-bit, non-prefetchable) [size=1M]
        Expansion ROM at 20000000 [disabled] [size=1M]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:09.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)
        Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min, 14000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at e6400000 (32-bit, non-prefetchable) [size=4K]
        Region 1: I/O ports at d800 [size=64]
        Region 2: Memory at e6000000 (32-bit, non-prefetchable) [size=1M]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:0a.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)
        Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min, 14000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 12
        Region 0: Memory at e6403000 (32-bit, non-prefetchable) [size=4K]
        Region 1: I/O ports at dc00 [size=64]
        Region 2: Memory at e6100000 (32-bit, non-prefetchable) [size=1M]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:0b.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA])
        Subsystem: ATI Technologies Inc Rage XL
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min), cache line size 08
        Interrupt: pin A routed to IRQ 5
        Region 0: Memory at e5000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: I/O ports at e000 [size=256]
        Region 2: Memory at e6402000 (32-bit, non-prefetchable) [size=4K]
        Expansion ROM at 20100000 [disabled] [size=128K]
        Capabilities: [5c] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0d.0 PCI bridge: Intel Corporation 21152 PCI-to-PCI Bridge (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, cache line size 08
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
        I/O behind bridge: 0000c000-0000cfff
        Memory behind bridge: e6300000-e63fffff
        Prefetchable memory behind bridge: 00000000fff00000-0000000000000000
        BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
                Bridge: PM- B3+

02:0f.0 Multimedia controller: PLX Technology, Inc. 9056 PCI I/O Accelerator
        Subsystem: PHILIPS Business Electronics B.V.: Unknown device d128
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (4000ns min, 6500ns max), cache line size 08
        Interrupt: pin A routed to IRQ 5
        Region 0: Memory at e6300000 (32-bit, non-prefetchable) [size=512]
        Region 1: I/O ports at c000 [size=256]
        Region 2: Memory at e6301000 (32-bit, non-prefetchable) [size=512]
        Capabilities: [40] Power Management version 1
                Flags: PMEClk- DSI- D1- D2+ AuxCurrent=0mA PME(D0-,D1+,D2-,D3hot+,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [48] #06 [0000]
        Capabilities: [4c] Vital Product Data


* Re: Disabling x86 System Management Mode
  2007-04-16 11:31 ` John
@ 2007-04-16 15:12   ` Lee Revell
  0 siblings, 0 replies; 12+ messages in thread
From: Lee Revell @ 2007-04-16 15:12 UTC (permalink / raw)
  To: John; +Cc: linux-kernel

On 4/16/07, John <linux.kernel@free.fr> wrote:
> Chipset: VIA Pro133T
> http://www.via.com.tw/en/products/chipsets/legacy/pro133/
> VT82C694T north bridge + VT82C686B south bridge
>
> AFAIU, the south bridge can be a source of SMIs.
> Can the north bridge also be a source of SMIs?
>
> What I/O ports do I need to write to, and what values should I write to
> these ports, in order to prevent the VT82C686B from sending SMIs?
>

Check out some of the links in this thread:

http://www.embeddedrelated.com/usenet/embedded/show/50333-1.php

Lee


* Re: Disabling x86 System Management Mode
  2007-04-16 10:47 Disabling x86 System Management Mode John
  2007-04-16 11:31 ` John
@ 2007-04-16 22:12 ` Andi Kleen
  2007-04-17 16:49   ` John Sigler
  2007-04-18 11:41   ` John Sigler
  1 sibling, 2 replies; 12+ messages in thread
From: Andi Kleen @ 2007-04-16 22:12 UTC (permalink / raw)
  To: John; +Cc: linux-kernel

John <linux.kernel@free.fr> writes:

Please use a full real name for posting.

> AFAIU, even a hard real-time OS is "defenseless" against SMIs that
> kick the CPU into SMM.

There are usually chipset specific bits that can be set to disable SMMs.
See the datasheet if you can get them. Unfortunately most chipset vendors
don't give out data sheets easily.

> .globl foo
> foo:
>    push %ebx
>    push %esi
>    cpuid
>    rdtsc

At least some SMM implementations restore the old TSC value. Sad but true.
Besides RDTSC can be speculated around on some CPUs which also adds errors.

-Andi


* Re: Disabling x86 System Management Mode
  2007-04-16 22:12 ` Andi Kleen
@ 2007-04-17 16:49   ` John Sigler
  2007-04-17 16:57     ` Andi Kleen
                       ` (2 more replies)
  2007-04-18 11:41   ` John Sigler
  1 sibling, 3 replies; 12+ messages in thread
From: John Sigler @ 2007-04-17 16:49 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, linux.kernel

[-- Attachment #1: Type: text/plain, Size: 3740 bytes --]

Andi Kleen wrote:

> Please use a full real name for posting.

OK.

> John Sigler wrote:
> 
>> AFAIU, even a hard real-time OS is "defenseless" against SMIs that
>> kick the CPU into SMM.
> 
> There are usually chipset specific bits that can be set to disable SMMs.
> See the datasheet if you can get them. Unfortunately most chipset vendors
> don't give out data sheets easily.

I've asked the manufacturer to send me the data sheets.
We'll see how they react.

>> .globl foo
>> foo:
>>    push %ebx
>>    push %esi
>>    cpuid
>>    rdtsc
> 
> At least some SMM implementations restore the old TSC value. Sad but true.

Why would they do that?

How would you detect periodic SMM on such a system?

> Besides RDTSC can be speculated around on some CPUs which also adds errors.

I don't understand this sentence. Could you clarify?

I've attached my kernel module (hello.c)

# : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko; 
cat /proc/interrupts; rmmod houba
            CPU0
   0:     519083    XT-PIC-XT        timer
   2:          0    XT-PIC-XT        cascade
   9:          0    XT-PIC-XT        acpi
  10:       9786    XT-PIC-XT        eth0
  11:          5    XT-PIC-XT        eth1
  12:          5    XT-PIC-XT        eth2
  14:      16920    XT-PIC-XT        ide0
NMI:          0
ERR:          0
0.00user 0.00system 5:00.00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+111minor)pagefaults 0swaps
            CPU0
   0:     549094    XT-PIC-XT        timer
   2:          0    XT-PIC-XT        cascade
   9:          0    XT-PIC-XT        acpi
  10:       9791    XT-PIC-XT        eth0
  11:          5    XT-PIC-XT        eth1
  12:          5    XT-PIC-XT        eth2
  14:      16970    XT-PIC-XT        ide0
NMI:          0
ERR:          0

(HZ=100)
30011 timer interrupts
5 eth0 interrupts
50 ide0 interrupts

# cat /var/log/kern.log
Apr 17 18:22:27 SEND kernel: INIT
Apr 17 18:27:27 SEND kernel: 2350080 29995
Apr 17 18:27:27 SEND kernel: 2369792 1
Apr 17 18:27:27 SEND kernel: 2440192 1
Apr 17 18:27:27 SEND kernel: 2441216 1
Apr 17 18:27:27 SEND kernel: 2583296 1
Apr 17 18:27:27 SEND kernel: 2852096 1
Apr 17 18:27:27 SEND kernel: EXIT

First column is the cycle count rounded down to a multiple of 256.
(1266.7 MHz CPU)
Second column is the occurrence count.
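
For illustration, the binning can be sketched in user space as follows. This is a minimal sketch mirroring the shift by 8 used in the attached module; the names bucket_index, bucket_floor and cycles_to_us are hypothetical, and the conversion assumes the clock stays at the reported 1266.7 MHz.

```c
#define BUCKET_SHIFT 8  /* 2^8 = 256 cycles per bucket, as in the module */

/* Bucket index for a raw cycle count (the index the module uses into lat[]). */
static unsigned bucket_index(unsigned cycles)
{
    return cycles >> BUCKET_SHIFT;
}

/* Lower bound of a bucket, i.e. the value printed in the first column. */
static unsigned bucket_floor(unsigned index)
{
    return index << BUCKET_SHIFT;
}

/* Cycles to microseconds for a clock given in MHz (assumed constant). */
static double cycles_to_us(unsigned cycles, double mhz)
{
    return cycles / mhz;
}
```

Every count from 2350080 to 2350335 lands in the same bucket, so the histogram cannot distinguish variations smaller than 256 cycles.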

In the second experiment, I added the IRQ disable/enable around foo.

# : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko; 
cat /proc/interrupts; rmmod houba
            CPU0
   0:     583666    XT-PIC-XT        timer
   2:          0    XT-PIC-XT        cascade
   9:          0    XT-PIC-XT        acpi
  10:      10084    XT-PIC-XT        eth0
  11:          5    XT-PIC-XT        eth1
  12:          5    XT-PIC-XT        eth2
  14:      17012    XT-PIC-XT        ide0
NMI:          0
ERR:          0
0.00user 0.01system 5:00.00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+110minor)pagefaults 0swaps
            CPU0
   0:     613677    XT-PIC-XT        timer
   2:          0    XT-PIC-XT        cascade
   9:          0    XT-PIC-XT        acpi
  10:      10089    XT-PIC-XT        eth0
  11:          5    XT-PIC-XT        eth1
  12:          5    XT-PIC-XT        eth2
  14:      17070    XT-PIC-XT        ide0
NMI:          0
ERR:          0

30011 timer interrupts
5 eth0 interrupts
58 ide0 interrupts

# cat /var/log/kern.log
Apr 17 18:33:12 SEND kernel: INIT
Apr 17 18:38:12 SEND kernel: 2350080 30000
Apr 17 18:38:12 SEND kernel: EXIT

In this experiment, all the calls to foo had a latency between
2350080 and 2350335 cycles (1.85528 to 1.85548 ms at 1266.7 MHz).

I don't think I can conclude anything based on this experiment.

I'll let it run all night long.

Regards.

[-- Attachment #2: hello.c --]
[-- Type: text/plain, Size: 938 bytes --]

#include <linux/init.h>
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/slab.h>

MODULE_LICENSE("GPL");

#define N 30000
#define MAX 20000

extern unsigned foo(void);

static int hello_init(void)
{
  int i;
  int *lat;
  printk(KERN_ALERT "INIT\n");
  lat = kmalloc(MAX * sizeof *lat, GFP_KERNEL);
  if (lat == NULL) return -ENOMEM;
  for (i=0; i < MAX; ++i) lat[i] = 0;
  for (i=0; i < 5; ++i) foo();
  for (i=0; i < N; ++i)
  {
    unsigned count, res;
    set_current_state(TASK_INTERRUPTIBLE);
    schedule_timeout(1);
    foo();
    local_irq_disable();
    count = foo();
    local_irq_enable();
    res = count >> 8;
    if (res < MAX) ++lat[res]; else printk(KERN_ALERT "OUT OF RANGE\n");
  }

  for (i=0; i < MAX; ++i)
    if (lat[i] != 0) printk(KERN_ALERT "%d %d\n", i<<8, lat[i]);

  return 0;
}

static void hello_exit(void)
{
  printk(KERN_ALERT "EXIT\n");
}

module_init(hello_init);
module_exit(hello_exit);


* Re: Disabling x86 System Management Mode
  2007-04-17 16:49   ` John Sigler
@ 2007-04-17 16:57     ` Andi Kleen
  2007-04-17 21:32       ` John Sigler
  2007-04-17 17:01     ` John Sigler
  2007-04-18  8:09     ` John Sigler
  2 siblings, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2007-04-17 16:57 UTC (permalink / raw)
  To: John Sigler; +Cc: Andi Kleen, linux-kernel

On Tue, Apr 17, 2007 at 06:49:09PM +0200, John Sigler wrote:
> >>.globl foo
> >>foo:
> >>   push %ebx
> >>   push %esi
> >>   cpuid
> >>   rdtsc
> >
> >At least some SMM implementations restore the old TSC value. Sad but true.
> 
> Why would they do that?

I asked the same question.  But it has been observed.

> How would you detect periodic SMM on such a system?

It's not a design goal of SMM to be detectable so the BIOS 
writers and hardware designers don't care if you can.

You could probably try to measure using an external clock or the
LAPIC clock.  Or check the chipset bits.

> 
> >Besides RDTSC can be speculated around on some CPUs which also adds errors.
> 
> I don't understand this sentence. Could you clarify?

Modern x86 CPUs execute code out of order and in parallel. The reordering
window can be quite large and the CPU can execute code speculatively. 
This can add large errors to RDTSC when the instruction is not executed
where you think it is. One way around this is to synchronize it -- 
using CPUID -- but that also adds latency and makes the measurement
less precise.
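
A minimal user-space sketch of the CPUID-then-RDTSC pattern, assuming GCC or Clang on x86 (the `__cpuid` macro from `<cpuid.h>` and the `__rdtsc` intrinsic from `<x86intrin.h>`). The names tsc_serialized and tsc_read_overhead are hypothetical. The non-x86 branch is only a placeholder so the sketch compiles elsewhere; it does not read a cycle counter.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>      /* __cpuid() macro */
#include <x86intrin.h>  /* __rdtsc() intrinsic */

/* Read the TSC after a serializing CPUID, so that earlier
 * instructions have completed before the counter is sampled. */
static uint64_t tsc_serialized(void)
{
    unsigned eax, ebx, ecx, edx;

    __cpuid(0, eax, ebx, ecx, edx);  /* leaf 0; results are discarded */
    (void)eax; (void)ebx; (void)ecx; (void)edx;
    return __rdtsc();
}
#else
#include <time.h>

/* Placeholder for non-x86 builds: processor time, not a cycle counter. */
static uint64_t tsc_serialized(void)
{
    return (uint64_t)clock();
}
#endif

/* Cost of the measurement itself: two back-to-back serialized reads. */
static uint64_t tsc_read_overhead(void)
{
    uint64_t start = tsc_serialized();

    return tsc_serialized() - start;
}
```

The overhead figure is the latency that the serialization adds to every measurement; whether it is stable from call to call is something to check on the target machine rather than assume.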

-Andi



* Re: Disabling x86 System Management Mode
  2007-04-17 16:49   ` John Sigler
  2007-04-17 16:57     ` Andi Kleen
@ 2007-04-17 17:01     ` John Sigler
  2007-04-18  8:09     ` John Sigler
  2 siblings, 0 replies; 12+ messages in thread
From: John Sigler @ 2007-04-17 17:01 UTC (permalink / raw)
  Cc: linux-kernel

John Sigler wrote:

> static int hello_init(void)
> {
>   int i;
>   int *lat;
>   printk(KERN_ALERT "INIT\n");
>   lat = kmalloc(MAX * sizeof *lat, GFP_KERNEL);
>   if (lat == NULL) return -ENOMEM;
>   for (i=0; i < MAX; ++i) lat[i] = 0;
>   for (i=0; i < 5; ++i) foo();
>   for (i=0; i < N; ++i)
>   {
>     unsigned count, res;
>     set_current_state(TASK_INTERRUPTIBLE);
>     schedule_timeout(1);
>     foo();
>     local_irq_disable();
>     count = foo();
>     local_irq_enable();
>     res = count >> 8;
>     if (res < MAX) ++lat[res]; else printk(KERN_ALERT "OUT OF RANGE\n");
>   }
> 
>   for (i=0; i < MAX; ++i)
>     if (lat[i] != 0) printk(KERN_ALERT "%d %d\n", i<<8, lat[i]);

kfree(lat);

>   return 0;
> }



* Re: Disabling x86 System Management Mode
  2007-04-17 16:57     ` Andi Kleen
@ 2007-04-17 21:32       ` John Sigler
  0 siblings, 0 replies; 12+ messages in thread
From: John Sigler @ 2007-04-17 21:32 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, linux.kernel

Andi Kleen wrote:

> Modern x86 CPUs execute code out of order and in parallel.

I am aware of the (apparent) non-deterministic nature of
superscalar out-of-order speculative execution.

> The reordering window can be quite large and the CPU can execute code
> speculatively. This can add large errors to RDTSC when the 
> instruction is not executed where you think it is. One way around 
> this is to synchronize it -- using CPUID --

Your terminology is surprising. CPUID is commonly referred to as a
/serializing/ instruction.

> but that also adds latency and makes the measurement less precise.

Why would adding known latency make the measurement less precise?

What I needed was a block of code with consistent latency, whatever
the actual number.

Regards.


* Re: Disabling x86 System Management Mode
  2007-04-17 16:49   ` John Sigler
  2007-04-17 16:57     ` Andi Kleen
  2007-04-17 17:01     ` John Sigler
@ 2007-04-18  8:09     ` John Sigler
  2 siblings, 0 replies; 12+ messages in thread
From: John Sigler @ 2007-04-18  8:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen, linux.kernel

John Sigler wrote:

> # : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko; 
> cat /proc/interrupts; rmmod houba
>            CPU0
>   0:     519083    XT-PIC-XT        timer
>   2:          0    XT-PIC-XT        cascade
>   9:          0    XT-PIC-XT        acpi
>  10:       9786    XT-PIC-XT        eth0
>  11:          5    XT-PIC-XT        eth1
>  12:          5    XT-PIC-XT        eth2
>  14:      16920    XT-PIC-XT        ide0
> NMI:          0
> ERR:          0
> 0.00user 0.00system 5:00.00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+111minor)pagefaults 0swaps
>            CPU0
>   0:     549094    XT-PIC-XT        timer
>   2:          0    XT-PIC-XT        cascade
>   9:          0    XT-PIC-XT        acpi
>  10:       9791    XT-PIC-XT        eth0
>  11:          5    XT-PIC-XT        eth1
>  12:          5    XT-PIC-XT        eth2
>  14:      16970    XT-PIC-XT        ide0
> NMI:          0
> ERR:          0
> 
> (HZ=100)
> 30011 timer interrupts
> 5 eth0 interrupts
> 50 ide0 interrupts
> 
> # cat /var/log/kern.log
> Apr 17 18:22:27 SEND kernel: INIT
> Apr 17 18:27:27 SEND kernel: 2350080 29995
> Apr 17 18:27:27 SEND kernel: 2369792 1
> Apr 17 18:27:27 SEND kernel: 2440192 1
> Apr 17 18:27:27 SEND kernel: 2441216 1
> Apr 17 18:27:27 SEND kernel: 2583296 1
> Apr 17 18:27:27 SEND kernel: 2852096 1
> Apr 17 18:27:27 SEND kernel: EXIT
> 
> First column is the cycle count rounded down to a multiple of 256.
> (1266.7 MHz CPU)
> Second column is the occurrence count.
> 
> In the second experiment, I added the IRQ disable/enable around foo.
> 
> # : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko; 
> cat /proc/interrupts; rmmod houba
>            CPU0
>   0:     583666    XT-PIC-XT        timer
>   2:          0    XT-PIC-XT        cascade
>   9:          0    XT-PIC-XT        acpi
>  10:      10084    XT-PIC-XT        eth0
>  11:          5    XT-PIC-XT        eth1
>  12:          5    XT-PIC-XT        eth2
>  14:      17012    XT-PIC-XT        ide0
> NMI:          0
> ERR:          0
> 0.00user 0.01system 5:00.00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+110minor)pagefaults 0swaps
>            CPU0
>   0:     613677    XT-PIC-XT        timer
>   2:          0    XT-PIC-XT        cascade
>   9:          0    XT-PIC-XT        acpi
>  10:      10089    XT-PIC-XT        eth0
>  11:          5    XT-PIC-XT        eth1
>  12:          5    XT-PIC-XT        eth2
>  14:      17070    XT-PIC-XT        ide0
> NMI:          0
> ERR:          0
> 
> 30011 timer interrupts
> 5 eth0 interrupts
> 58 ide0 interrupts
> 
> # cat /var/log/kern.log
> Apr 17 18:33:12 SEND kernel: INIT
> Apr 17 18:38:12 SEND kernel: 2350080 30000
> Apr 17 18:38:12 SEND kernel: EXIT
> 
> In this experiment, all the calls to foo had a latency between
> 2350080 and 2350335 cycles (1.85528 to 1.85548 ms at 1266.7 MHz).
> 
> I don't think I can conclude anything based on this experiment.
> 
> I'll let it run all night long.

With the IRQ disable/enable around foo:

# : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko; 
cat /proc/interrupts; rmmod houba
            CPU0
   0:      14659    XT-PIC-XT        timer
   2:          0    XT-PIC-XT        cascade
   9:          0    XT-PIC-XT        acpi
  10:        144    XT-PIC-XT        eth0
  11:          5    XT-PIC-XT        eth1
  12:          5    XT-PIC-XT        eth2
  14:      21526    XT-PIC-XT        ide0
NMI:          0
ERR:          0
0.00user 0.01system 12:30:00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+110minor)pagefaults 0swaps
            CPU0
   0:    4516164    XT-PIC-XT        timer
   2:          0    XT-PIC-XT        cascade
   9:          0    XT-PIC-XT        acpi
  10:        414    XT-PIC-XT        eth0
  11:          5    XT-PIC-XT        eth1
  12:          5    XT-PIC-XT        eth2
  14:      21620    XT-PIC-XT        ide0
NMI:          0
ERR:          0

4501505 timer interrupts (I had set N to 4500000)
270 eth0 interrupts
94 ide0 interrupts
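
The bookkeeping behind these three figures can be sketched as follows. This is a minimal sketch with the hypothetical names irq_delta and expected_ticks; it assumes HZ=100 as stated for the earlier runs and the 12:30:00 elapsed time reported above, i.e. 45000 seconds.

```c
/* Difference between two /proc/interrupts snapshots for one IRQ line. */
static unsigned long irq_delta(unsigned long before, unsigned long after)
{
    return after - before;
}

/* Timer ticks expected over a wall-clock interval at a given HZ. */
static unsigned long expected_ticks(unsigned long hz, unsigned long seconds)
{
    return hz * seconds;
}
```

With these inputs, expected_ticks(100, 45000) gives 4500000, the same as N, while irq_delta(14659, 4516164) gives the 4501505 timer interrupts actually counted.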

# cat /var/log/kern.log
Apr 17 19:03:26 SEND kernel: INIT
Apr 18 07:33:26 SEND kernel: 2350080 4500000
Apr 18 07:33:26 SEND kernel: EXIT

I'd say the CPU does not appear to enter SMM on this system.
(Unless the SMI handler restores the TSC as suggested by Andi.)

I need to refine my detection code.


* Re: Disabling x86 System Management Mode
  2007-04-16 22:12 ` Andi Kleen
  2007-04-17 16:49   ` John Sigler
@ 2007-04-18 11:41   ` John Sigler
  2007-04-18 14:06     ` Andrew Shewmaker
  1 sibling, 1 reply; 12+ messages in thread
From: John Sigler @ 2007-04-18 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux.kernel

Andi Kleen wrote:

> There are usually chipset specific bits that can be set to disable
> SMMs. See the datasheet if you can get them. Unfortunately most
> chipset vendors don't give out data sheets easily.

I managed to find the south bridge data sheet.

http://linux.kernel.free.fr/VT82C686B.pdf

(I'm still looking for the north bridge data sheet.)

Could someone point me in the right direction?

Regards.


* Re: Disabling x86 System Management Mode
  2007-04-18 11:41   ` John Sigler
@ 2007-04-18 14:06     ` Andrew Shewmaker
  2007-04-18 14:39       ` John Sigler
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Shewmaker @ 2007-04-18 14:06 UTC (permalink / raw)
  To: John Sigler; +Cc: linux-kernel

On 4/18/07, John Sigler <linux.kernel@free.fr> wrote:
> Andi Kleen wrote:
>
> > There are usually chipset specific bits that can be set to disable
> > SMMs. See the datasheet if you can get them. Unfortunately most
> > chipset vendors don't give out data sheets easily.
>
> I managed to find the south bridge data sheet.
>
> http://linux.kernel.free.fr/VT82C686B.pdf
>
> (I'm still looking for the north bridge data sheet.)
>
> Could someone point me in the right direction?

The LinuxBIOS folks had a thread on SMM recently.  You don't have to
worry about SMM if you can use their stuff.

http://www.linuxbios.org/pipermail/linuxbios/2006-December/017861.html

same thread, but continued

http://www.linuxbios.org/pipermail/linuxbios/2007-January/017905.html

-- 
Andrew Shewmaker


* Re: Disabling x86 System Management Mode
  2007-04-18 14:06     ` Andrew Shewmaker
@ 2007-04-18 14:39       ` John Sigler
  0 siblings, 0 replies; 12+ messages in thread
From: John Sigler @ 2007-04-18 14:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Shewmaker

Andrew Shewmaker wrote:

> John Sigler wrote:
>
>> Andi Kleen wrote:
>>
>>> There are usually chipset specific bits that can be set to disable
>>> SMMs. See the datasheet if you can get them. Unfortunately most
>>> chipset vendors don't give out data sheets easily.
>>
>> I managed to find the south bridge data sheet.
>>
>> http://linux.kernel.free.fr/VT82C686B.pdf
>>
>> (I'm still looking for the north bridge data sheet.)
>>
>> Could someone point me in the right direction?
> 
> The LinuxBIOS folks had a thread on SMM recently.  You don't have to
> worry about SMM if you can use their stuff.

Unfortunately, LinuxBIOS is not a viable option at this point.
I have to live with this proprietary BIOS.

Phoenix AwardBIOS v6.00PG
EBC-2000 REV: A1.3
02/12/2004-694X-686-6A6LJXA9C-00

Regards.

